Commit 4ec397da authored by linushof

Review of probability theoretic definitions

parent 1354223e
...@@ -51,63 +51,53 @@ $$\begin{equation} ...@@ -51,63 +51,53 @@ $$\begin{equation}
\omega_i = \{\omega_1, ..., \omega_n\} \in \Omega \omega_i = \{\omega_1, ..., \omega_n\} \in \Omega
\end{equation}$$ \end{equation}$$
containing a finite set of possible outcomes, gains and/or losses respectively. containing a finite set of possible outcomes, monetary gains and/or losses respectively.
$\mathscr{F}$ is a set of subsets of $\Omega$, i.e., the event space $\mathscr{F}$ is the set of all possible subsets of $\Omega$:
$$\begin{equation} $$\begin{equation}
A_i = \{A_1, ..., A_n\} \in \mathscr{F} = \mathscr{P}(\Omega) A_i = \{A_1, ..., A_n\} \in \mathscr{F} = \mathscr{P}(\Omega)
\; . \; .
\end{equation}$$ \end{equation}$$
$\mathscr{P}(\Omega)$ denotes the power set of $\Omega$.
$P$ is a probability mass function $P$ is a probability mass function
$$\begin{equation} $$\begin{equation}
P: \mathscr{F} \mapsto [0,1] P: \mathscr{F} \mapsto [0,1]
\end{equation}$$ \end{equation}$$
that assigns each $\omega_i \in \Omega$ a probability of $0 \leq p_i \leq 1$ with $P(\Omega) = 1$ [cf. @kolmogorovFoundationsTheoryProbability1950, pp. 2-3]. that assigns each outcome a probability of $0 \leq p(\omega_i) \leq 1$ with $P(\Omega) = 1$ [cf. @kolmogorovFoundationsTheoryProbability1950, pp. 2-3].
In such a choice paradigm, agents are asked to evaluate the prospects and build a preference for either one of them. In such a choice paradigm, agents are asked to evaluate the prospects and build a preference for either one of them.
It is common to make a rather crude distinction between two variants of this evaluation process [cf. @hertwigDescriptionexperienceGapRisky2009]. It is common to make a rather crude distinction between two variants of this evaluation process [cf. @hertwigDescriptionexperienceGapRisky2009].
For decisions from description (DfD), agents are provided a full symbolic description of the triples $(\Omega, \mathscr{F}, P)_j$, where j denotes a prospect. For decisions from description (DfD), agents are provided a full symbolic description of the prospects.
For decisions from experience [DfE; e.g., @hertwigDecisionsExperienceEffect2004], the probability triples are not described but must be explored by the means of sampling. For decisions from experience [DfE; e.g., @hertwigDecisionsExperienceEffect2004], the prospects are not described but must be explored by means of sampling.
To provide a formal definition of sampling in risky or uncertain choice, we make use of the mathematical concept of a random variable. To provide a formal definition of sampling in risky choice, we make use of the mathematical concept of a random variable and start by referring to a prospect as *"risky"* in the case where all $p(\omega_{i}) \neq 1$.
Thus, if for each Here, risky describes the fact that if agents would choose a prospect and any of its outcomes in $\Omega$ must occur, none of these outcomes will occur with certainty.
It is acceptable to speak of the occurrence of $\omega_{i}$ as a realization of a random variable $X$ defined on a prospect iff the following conditions A and B are met:
$$\begin{equation}
\omega_{i} \in \Omega: p(\omega_{i}) \neq 1
\; ,
\end{equation}$$
we refer to the respective prospect as *"risky"*, where risky describes the fact that if agents would choose the prospect and any of the outcomes $\omega_{i}$ must occur, none of these outcomes will occur with certainty but according to the probability measure $P$.
It is acceptable to speak of the occurrence of $\omega_{i}$ as the realization of a random variable iff the following conditions A and B are met:
A) The random variable $X$ is defined as the function A) $X$ is the measurable function
$$\begin{equation} $$\begin{equation}
X: (\Omega, \mathscr{F}) \mapsto (\Omega', \mathscr{F'}) X: (\Omega, \mathscr{F}) \mapsto (\Omega', \mathscr{F'})
\; , \; ,
\end{equation}$$ \end{equation}$$
where the image $\Omega'$ is the set of possible values $X$ can take and $\mathscr{F'}$ is a set of subsets of $\Omega'$. where $\Omega'$ is a set of real numbered values $X$ can take and $\mathscr{F'}$ is a set of subsets of $\Omega'$.
I.e., $X$ maps any event $A_i \in \mathscr{F}$ to a subset $A'_i \in \mathscr{F'}$: I.e., $\Omega$ maps into $\Omega'$ such that correspondingly each subset $A'_i \in \mathscr{F'}$ has a pre-image
$$\begin{equation} $$\begin{equation}
A'_i \in \mathscr{F'} \Rightarrow X^{-1}A'_i \in \mathscr{F} X^{-1}A'_i \in \mathscr{F} \; ,
\end{equation}$$ \end{equation}$$
[@kolmogorovFoundationsTheoryProbability1950, p. 21]. which is the set $\{\omega_i \in \Omega: X(\omega_i) \in A'_i\}$ [@kolmogorovFoundationsTheoryProbability1950, p. 21].
B) The image $X: \Omega \mapsto \Omega'$ must be such that $\omega_i \in \Omega = x_i \in \Omega'$. B) The mapping is such that $X(\omega_i) = \omega_i$.
Given conditions A and B, we denote any realization of a random variable defined on the triple $(\Omega, \mathscr{F}, P)$ as a *"single sample"* of the respective prospect and any systematic approach to generate a sequence of single samples from multiple prospects as a sampling strategy [see also @hillsInformationSearchDecisions2010]. Given conditions A and B, we denote any occurrence of $\omega_i$ as a *single sample*, or realization, of a random variable defined on a prospect and any systematic approach to generate, in discrete time, a sequence of single samples that originate from multiple prospects as a *sampling strategy* [see also @hillsInformationSearchDecisions2010].
Because for a sufficiently large number of single samples *n* from a given prospect, i.e., $\lim_{n \to \infty}$, the relative frequencies of $\omega_{i}$ approximate their probabilities in $p_i \in P$ [@bernoulliArsConjectandiOpus1713], sampling in principle allows to explore a prospect's probability space.
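The definition of a single sample and the relative-frequency argument can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: the prospect (`outcomes`, `probabilities`) and the helper names `single_sample` and `relative_frequencies` are hypothetical and chosen only for this example.

```python
import random
from collections import Counter

# Hypothetical prospect (assumption): two monetary outcomes with probabilities.
outcomes = [-1.0, 2.0]
probabilities = [0.3, 0.7]


def single_sample(rng):
    """Draw one realization X(omega_i) = omega_i from the prospect."""
    return rng.choices(outcomes, weights=probabilities, k=1)[0]


def relative_frequencies(n, seed=0):
    """Relative frequency of each outcome after n single samples."""
    rng = random.Random(seed)
    counts = Counter(single_sample(rng) for _ in range(n))
    return {omega: counts[omega] / n for omega in outcomes}


# With many single samples, the relative frequencies approach p(omega_i).
freqs = relative_frequencies(100_000)
print(freqs)
```

Under these assumptions, the printed relative frequencies should lie close to the stated probabilities, which is the sense in which sampling allows agents to explore a prospect.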
## A Stochastical Sampling Model for DfE ## A Stochastical Sampling Model for DfE
Consider a choice between $1,\, ...,\, j,\,...,\, n$ prospects, where $j \leq n \geq 2$. Consider a choice between $1, ...,j,..., n$ prospects.
To construct a rough stochastic sampling model (hereafter SSM) of the random process underlying DfE, it is assumed that agents base their decisions on the information provided by the prospects, which is in principle fully described by their probability triples. To construct a rough stochastic sampling model (hereafter SSM) of the random process underlying DfE, it is assumed that agents base their decisions on the information provided by the prospects, which is in principle fully described by their probability triples.
Thus, a decision variable Thus, a decision variable
...@@ -156,29 +146,22 @@ S = ...@@ -156,29 +146,22 @@ S =
\left\{ \left\{
\frac{\frac{1}{N_X} \sum\limits_{i=1}^{N_X} x_i} \frac{\frac{1}{N_X} \sum\limits_{i=1}^{N_X} x_i}
{\frac{1}{N_Y} \sum\limits_{j=1}^{N_Y} y_j} {\frac{1}{N_Y} \sum\limits_{j=1}^{N_Y} y_j}
\right\}^{\mathbb{N}} \right\}
= =
\left\{ \left\{
\frac{\overline{X}_{N_X}} {\overline{Y}_{N_Y}} \frac{\overline{X}_{N_X}} {\overline{Y}_{N_Y}}
\right\}^{\mathbb{N}} \right\}
\; , \; ,
\end{equation}$$ \end{equation}$$
where $\mathbb{N}$ is the number of comparisons, $x_i$ and $y_j$ are the realizations of the respective random variables, i.e., the single samples, and $N_X$ and $N_Y$ are the numbers of single samples within a comparison. where $x_i$ and $y_j$ are single samples and $N_X$ and $N_Y$ denote the numbers of single samples within a comparison.
For the elements of $S$ to be defined, however, the condition
$$\begin{equation}
P(\overline{Y} = 0) = 0 \; .
\end{equation}$$
must be fulfilled.
To indicate that the comparison of prospects on the ordinal scale is of primary interest, we define To indicate that the comparison of prospects on the ordinal scale is of primary interest, we define
$$\begin{equation} $$\begin{equation}
\mathscr{D} = \left\{\frac{\overline{X}_{N_X}}{\overline{Y}_{N_Y}} > 0, \frac{\overline{X}_{N_X}}{\overline{Y}_{N_Y}} \leq 0 \right\} \mathscr{D} = \left\{\frac{\overline{X}_{N_X}}{\overline{Y}_{N_Y}} > 0, \frac{\overline{X}_{N_X}}{\overline{Y}_{N_Y}} \leq 0 \right\}
\end{equation}$$ \end{equation}$$
as a set of subsets of $S$ and the decision variable as the measure as a set of subsets of $S$, i.e., the *event space*, and the decision variable as the measure
$$\begin{equation} $$\begin{equation}
D:= f: (S, \mathscr{D}) \mapsto S' D:= f: (S, \mathscr{D}) \mapsto S'
...@@ -203,6 +186,14 @@ D:= ...@@ -203,6 +186,14 @@ D:=
\end{cases} \end{cases}
\end{equation}$$ \end{equation}$$
It can be shown that for the case $N_X = N_Y = 1$, $D$ is a Bernoulli-distributed random variable within each comparison, so that the number $k$ of comparisons with $D = 1$ follows the binomial distribution
$$\begin{equation}
k \sim B\left( p \left( \frac{\overline{X}_{N_X = 1}}{\overline{Y}_{N_Y = 1}} > 0\right), n\right) \; ,
\end{equation}$$
where $n$ is the number of comparisons (see *Proof 1* in the Appendix).
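The case $N_X = N_Y = 1$ can be illustrated with a short simulation. The sketch below is hedged: the two prospects and the function names `exact_p` and `count_successes` are hypothetical, the prospects are assumed to be independent, and no outcome of the second prospect is $0$, so the quotient is always defined.

```python
import random

# Hypothetical, independent prospects (assumption): outcome -> probability.
prospect_x = {4.0: 0.8, -1.0: 0.2}
prospect_y = {3.0: 0.6, -2.0: 0.4}  # no zero outcome, so x / y is defined


def exact_p(px, py):
    """Sum the joint probabilities of all outcome pairs with x / y > 0."""
    return sum(p * q for x, p in px.items() for y, q in py.items() if x / y > 0)


def count_successes(px, py, n, rng):
    """Run n comparisons of one single sample each; count those with D = 1."""
    xs = rng.choices(list(px), weights=list(px.values()), k=n)
    ys = rng.choices(list(py), weights=list(py.values()), k=n)
    return sum(1 for x, y in zip(xs, ys) if x / y > 0)


rng = random.Random(1)
n = 50_000
p = exact_p(prospect_x, prospect_y)
k = count_successes(prospect_x, prospect_y, n, rng)
print(p, k / n)
```

Each comparison is one trial with success probability `p`, so the share of successes `k / n` should be close to `p` when `n` is large.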
## Predicting Choices From the SSM ## Predicting Choices From the SSM
Hills and Hertwig [-@hillsInformationSearchDecisions2010] proposed the two different sampling strategies in combination with the respective decision strategies, i.e., piecewise sampling and round-wise comparison vs. comprehensive sampling and summary comparison, as an explanation for different choice patterns in DfE. Hills and Hertwig [-@hillsInformationSearchDecisions2010] proposed the two different sampling strategies in combination with the respective decision strategies, i.e., piecewise sampling and round-wise comparison vs. comprehensive sampling and summary comparison, as an explanation for different choice patterns in DfE.
...@@ -235,8 +226,8 @@ For the summary strategy, the following prediction is obtained: ...@@ -235,8 +226,8 @@ For the summary strategy, the following prediction is obtained:
Given that Given that
$$\begin{equation} $$\begin{equation}
P\left(\lim_{N_X \to \infty} \overline{X}_{N_X} = E_X \right) = P\left(\lim_{N_X \to \infty} \overline{X}_{N_X} = E(X) \right) =
P\left(\lim_{N_Y \to \infty} \overline{Y}_{N_Y} = E_X \right) = P\left(\lim_{N_Y \to \infty} \overline{Y}_{N_Y} = E(Y) \right) =
1 \; , 1 \; ,
\end{equation}$$ \end{equation}$$
...@@ -257,7 +248,22 @@ I.e., for the summary strategy, we assume that for increasing sample sizes $N_X$ ...@@ -257,7 +248,22 @@ I.e., for the summary strategy, we assume that for increasing sample sizes $N_X$
For the round-wise strategy, the following prediction is obtained: For the round-wise strategy, the following prediction is obtained:
Given that $N_X$ and $N_Y$ are set to 1, $D$ follows the binomial distribution
$$\begin{equation}
B(k \, | \, p_X, \mathbb{N}) \; ,
\end{equation}$$
where $p$ is the probability that a single sample of prospect $X$ is larger than a single sample of prospect $Y$, $\mathbb{N}$ is the number of comparisons and $k$ is the number of times $x \in X$ is larger than $y \in Y$.
*Proof.*
For $N_X = N_Y = 1$, the sample space is
$$\begin{equation}
\left\{\frac{\overline{X}_{N_X = 1}} {\overline{Y}_{N_Y = 1}} \right\}^{\mathbb{N}} =
\left\{\frac{x_i \in \Omega'_X} {y_j \in \Omega'_Y}\right\}^{\mathbb{N}}
\end{equation}$$
# Method # Method
...@@ -859,4 +865,50 @@ cpt_curves_comprehensive %>% ...@@ -859,4 +865,50 @@ cpt_curves_comprehensive %>%
# Conclusion # Conclusion
# Appendix
Let $X_n$ and $Y_m$ be independent and discrete random variables of the sequences
$$\begin{equation}
X_1, ..., X_n, ..., X_{N_X}
\end{equation}$$
and
$$\begin{equation}
Y_1, ..., Y_m, ..., Y_{N_Y} \; .
\end{equation}$$
Then
$$\begin{equation}
P\left(\frac{\overline{X}_{N_X}}{\overline{Y}_{N_Y}} > 0 \right) =
P\left(\frac{\frac{1}{N_X}\sum\limits_{n=1}^{N_X} (X(\omega_i) = A'_X \in \mathscr{F'}_X)_n}
{\frac{1}{N_Y}\sum\limits_{m=1}^{N_Y} (Y(\omega_j) = A'_Y \in \mathscr{F'}_Y)_m} > 0 \right)
\end{equation}$$
is the probability that the quotient of the means of both sequences takes on a value larger than $0$.
Given the sample sizes $N_X = N_Y = 1$, the equation reduces to
$$\begin{equation}
P\left(\frac{X(\omega_i \in \Omega_X) = A'_X \in \mathscr{F'}_X}{Y(\omega_j \in \Omega_Y) = A'_Y \in \mathscr{F'}_Y} > 0 \right) \; ,
\end{equation}$$
which is the sum across all joint probabilities $p(\omega_i \cap \omega_j)$ for which the above inequality holds:
$$\begin{equation}
D:=
f
\left(
\frac{\overline{X}_{N_X}} {\overline{Y}_{N_Y}}
\right)
=
\begin{cases}
1 & \text{if} & \frac{\overline{X}_{N_X}}{\overline{Y}_{N_Y}} > 0 \in \mathscr{D} \\
0 & \text{else}.
\end{cases}
\end{equation}$$
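As a hedged worked example, assume two hypothetical independent prospects (the outcomes and probabilities are illustrative and not taken from the source): $X$ yields $-1$ with probability $0.3$ and $2$ with probability $0.7$, and $Y$ yields $1$ with probability $0.8$ and $-3$ with probability $0.2$. For $N_X = N_Y = 1$, the quotient is larger than $0$ exactly when both single samples have the same sign, so the sum across the corresponding joint probabilities is
$$\begin{equation}
P\left(\frac{X}{Y} > 0 \right) = p(2)\,p(1) + p(-1)\,p(-3) = 0.7 \cdot 0.8 + 0.3 \cdot 0.2 = 0.62 \; .
\end{equation}$$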
# References # References