Commit 90572158 authored by linushof

Revision of the probability theoretic definitions

parent 7969ba18
......@@ -26,13 +26,13 @@ pacman::p_load(repro,
# Note
- This document was created from the commit with the hash `r repro::current_hash()`.
This document was created from the commit with the hash `r repro::current_hash()`.
# Abstract
Synthetic choice data from so-called decisions from experience is generated by applying different strategies of sample integration to a series of choice problems between two prospects.
The synthetic data is explored for characteristic choice patterns produced by these strategies under varying structures of the environment (prospect features) and aspects of the sampling and decision behavior.
We start our argument by giving a probability theoretic account of prospects, sampling, and sample integration and derive assumptions about the choice patterns that result from different integration strategies if applied.
We start our argument by giving a probability theoretic account of prospects, sampling, and sample integration and derive assumptions about the choice patterns that result from different integration strategies, if applied.
# Summary
......@@ -44,7 +44,7 @@ Provide short summary of simulation results.
Let a prospect be a *probability space* $(\Omega, \mathscr{F}, P)$ [@kolmogorovFoundationsTheoryProbability1950; @georgiiStochasticsIntroductionProbability2008, for an accessible introduction].
$\Omega$ is the *sample space* containing a finite set of possible outcomes
$\Omega$ is the *sample space* containing an at most countable set of possible outcomes
$$\begin{equation}
\omega_i \in \Omega = \{\omega_1, ..., \omega_n\}
......@@ -53,18 +53,12 @@ $$\begin{equation}
$\mathscr{F}$ is a set of subsets of $\Omega$, i.e., the *event space*
$$\begin{equation}
A_i \in \mathscr{F} = \{A_1, ..., A_n\}
\end{equation}$$
where
$$\begin{equation}
\mathscr{F} \subset \mathscr{P}(\Omega)
A_i \in \mathscr{F} = \mathscr{P}(\Omega)
\end{equation}$$
$\mathscr{P}(\Omega)$ denotes the power set of $\Omega$.
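As a purely illustrative example (the outcomes are hypothetical and not taken from the simulation), a prospect with the two outcomes $0$ and $4$ would have the sample space and power set

$$\begin{equation}
\Omega = \{0, 4\}, \quad \mathscr{P}(\Omega) = \{\emptyset, \{0\}, \{4\}, \{0, 4\}\}
\end{equation}$$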
$P$ is a *probability mass function* (PMF) which maps the event space to the set of real numbers in $[0, 1]$
$P$ is a probability mass function that maps $\mathscr{F}$ to the set of real numbers in $[0, 1]$
$$\begin{equation}
P: \mathscr{F} \mapsto [0,1]
......@@ -76,12 +70,12 @@ by assigning each $\omega_i \in \Omega$ a probability of $0 \leq p_i \leq 1$ wit
In decision theory research, a standard paradigm is the choice between $n \geq 2$ monetary prospects (hereafter indexed with $j$), where $\omega_{ij} \in \Omega_j$ are monetary outcomes, i.e., gains and/or losses.
$P_j$ is then the probability measure that assigns each $\omega_{ij}$ the probability with which it occurs.
In such a choice paradigm, agents are asked to evaluate the prospects and build a preference for, i.e., choose, either one of them.
In such a choice paradigm, agents are asked to evaluate the prospects and build a preference for, i.e., choose either one of them.
It is common to make a rather crude distinction between two variants of this evaluation process [cf. @hertwigDescriptionExperienceGap2009].
For decisions from description (DfD), agents are provided a full symbolic description of the triples $(\Omega, \mathscr{F}, P)_j$.
For decisions from experience (DfE; e.g., @hertwigDecisionsExperienceEffect2004), the probability triples are not described but must be explored by means of *sampling*.
For decisions from experience [DfE; e.g., @hertwigDecisionsExperienceEffect2004], the probability triples are not described but must be explored by means of *sampling*.
To provide a formal definition of sampling in risky or uncertain choice, we make use of random variables, functions which build the foundation of the random processes decision theory is concerned with but which are rarely explicated.
To provide a formal definition of sampling in risky or uncertain choice, we make use of the mathematical concept of a random variable, a function that models the random processes decision theory is concerned with but which is rarely explicated.
Thus, if for each
$$\begin{equation}
......@@ -89,50 +83,80 @@ $$\begin{equation}
\end{equation}$$
we refer to the respective prospect as *"risky"*, where risky means that, if agents choose the prospect and one of the outcomes $\omega_{i}$ must occur, none of these outcomes occurs with certainty; each occurs according to the probability measure $P$.
It is acceptable to speak of the occurrence of $\omega_{i}$ as the realization of a random variable iff the latter is defined as the function
It is acceptable to speak of the occurrence of $\omega_{i}$ as the realization of a random variable iff the following conditions a. and b. are met:
(a) The random variable $X$ is defined as the function
$$\begin{equation}
X: (\Omega, \mathscr{F}) \mapsto (\Omega', \mathscr{F'})
\end{equation}$$
where $(\Omega', \mathscr{F'})$ is a measurable image of $(\Omega, \mathscr{F})$.
I.e., $X$ maps any event $A_i \in \mathscr{F}$ to a quantity $A'_i \in \mathscr{F'}$ and we denote the latter as the realization of the random variable $X$
where the image $\Omega'$ is the set of possible values $X$ can take and $\mathscr{F'}$ is a set of subsets of $\Omega'$.
I.e., $X$ maps any event $A_i \in \mathscr{F}$ to a subset $A'_i \in \mathscr{F'}$
$$\begin{equation}
A_i \in \mathscr{F}: X(A_i) \Rightarrow A'_i \in \mathscr{F'}
A'_i \in \mathscr{F'} \Rightarrow X^{-1}A'_i \in \mathscr{F}
\end{equation}$$
However, to allow $\omega_{i}$ to be a realization of a random variable defined on $(\Omega, \mathscr{F})$, we must also set
[cf. @georgiiStochasticsIntroductionProbability2008].
(b) The mapping $X: \Omega \mapsto \Omega'$ must be such that $X(\omega_i) = x_i = \omega_i$ for every $\omega_i \in \Omega$, i.e., each outcome is its own realization.
Given conditions a. and b., we denote any realization of a random variable defined on the triple $(\Omega, \mathscr{F}, P)$ as a *single sample* of the respective prospect and any systematic approach to generate a sequence of single samples from $n \geq 2$ prospects as a sampling strategy [see also @hillsInformationSearchDecisions2010].
Because, for a sufficiently large number of single samples from a given prospect, the relative frequencies of the outcomes $\omega_{i}$ approximate their probabilities $p_i$, sampling in principle allows agents to explore a prospect's probability space.
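The convergence of relative frequencies to the outcome probabilities can be illustrated with a minimal Python sketch. It is not part of the authors' R code; the function names `draw_samples` and `relative_frequencies` and the example prospect (outcome 4 with probability 0.8, otherwise 0) are assumptions made for illustration only.

```python
import random
from collections import Counter


def draw_samples(outcomes, probabilities, n_samples, seed=None):
    """Draw n_samples single samples (with replacement) from one prospect."""
    rng = random.Random(seed)
    return rng.choices(outcomes, weights=probabilities, k=n_samples)


def relative_frequencies(samples):
    """Return the relative frequency of each outcome that was observed."""
    counts = Counter(samples)
    n = len(samples)
    return {outcome: count / n for outcome, count in counts.items()}


# Hypothetical prospect: outcome 4 with p = 0.8, outcome 0 with p = 0.2.
outcomes = [4.0, 0.0]
probabilities = [0.8, 0.2]

for n in (10, 100, 10_000):
    freqs = relative_frequencies(draw_samples(outcomes, probabilities, n, seed=1))
    # Outcomes that were never drawn get a relative frequency of 0.
    print(n, {o: round(freqs.get(o, 0.0), 3) for o in outcomes})
```

With small sample sizes the printed frequencies can deviate noticeably from 0.8 and 0.2, which is the situation agents face when they sample only a few times.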
So far, we used the probability triple of a prospect and conditions a. and b. solely to provide a probability theoretic definition of a single sample.
However, since in the decision literature the (stochastic) occurrence of the raw outcomes in $\Omega$ is often treated as the event of interest, it should be justified to state that the stochastic model formulated under a. with the restriction b. is widely, although implicitly, assumed to underlie the evaluation processes of agents.
We do not contend that this model is inadequate; rather, it is empirically warranted and mathematically convenient, not least because of the measurable nature of the monetary outcomes in $\Omega$.
However, in line with the literature that deviates from utility models and their derivatives [@heOntologyDecisionModels2020, for an ontology of decision models], we propose that the above restricted model is not the only one suitable for describing the random processes agents are interested in when building a preference between risky prospects from sampling.
We can construct an alternative stochastic sampling model (hereafter SSM) underlying DfE between risky prospects by starting from the assumption that agents do not make random choices but base their decisions on the information provided by the prospects, which is readily described by their probability triples.
Thus, we may start by defining a decision variable $D$
$$\begin{equation}
\Omega = \mathscr{F} = \mathscr{F'}
D := f((\Omega, \mathscr{F}, P)_j)
\end{equation}$$
Given this restriction, we define a realization of the described random variable as a *single sample* and any systematic approach to generate a sequence of single samples from $n \geq 2$ prospects as a sampling strategy [see also @hillsInformationSearchDecisions2010].
Because for a sufficiently large number of single samples from a given prospect the relative frequencies of $\omega_{i}$ approximate their probabilities in $p_i \in P$, sampling in principle allows to explore a prospects probability space.
first without any further assumption on which information of the triple $f$ utilizes and how.
Although in principle many models for $f$ have been proposed and tested in the decision literature, in DfE we can restrict the SSM to the case where decisions are based on sequences of single samples generated from the prospect triples.
Since we have already provided a restricted stochastic model of a random variable for this generative process, we can write
So far, we used the random variable defined on a prospect's triple $(\Omega, \mathscr{F}, P)$ and the restriction $\Omega = \mathscr{F'}$ solely to provide a probability theoretic definition of a single sample.
However, since in the decision literature the stochastic occurrence of the raw outcomes in $\Omega$ is often treated as the event of interest, it should be justified to state that this restricted stochastic model is abundantly but implicitly assumed to underlay the evaluation processes of agents.
We do not contend that this model is not adequate but rather empirically warranted and mathematically convenient because of the measurable nature of monetary outcomes.
However, in line with the literature that deviates from utility models and its derivatives [@heOntologyDecisionModels2020, for an ontology of decision models], we propose that the above restricted model is not the only suitable for describing the random processes agents are interested in, when building a preference between risky prospects, from experience or sampling respectively.
$$\begin{equation}
D := f((X: (\Omega, \mathscr{F}) \mapsto (\Omega', \mathscr{F'}))_j)
\end{equation}$$
where $\Omega_j = \mathscr{F_j} = \mathscr{F_j'}$.
How to construct alternative stochastic models underlying DfE between risky prospects?
We start from the assumption that agents do not make random choices but base their decisions on the information provided by the prospects, which is readily described by their probability triples.
Thus, we may define a decision variable $D$ as
For $n$ prospects, we can write the above definition as
$$\begin{equation}
D := f((\Omega, \mathscr{F}, P)_j)
D := f(X_1, ..., X_j, ..., X_n)
\end{equation}$$
irrespective of the information $f$ utilizes and how.
Although in principle many models for $f$ were proposed and tested in the decision literature, in DfE we can restrict the model to the case where decisions are based on single samples generated from the prospect triples.
which reduces to
$$\begin{equation}
D := f(X_1, X_2)
\end{equation}$$
in the case of $n = 2$ prospects, which we now consider further.
Up to this point, we have defined the decision variable $D$ as a function of the random variables associated with the prospects' probability spaces.
Since the decision variable $D$ serves as a measure for the evidence for one prospect over the other, we want $f$ to be a measurable function that maps the comparison of $X_1$ and $X_2$ to the measure space $\mathscr{D}$.
Because $X_1$ and $X_2$ are themselves measurable, we write their sample means as a fraction
Although in principle many models for $f$ were proposed and tested in the decision literature, we can use the general (unrestricted) model of the random variable from above as a starting point but now interpret the image $(\Omega', \mathscr{F'})$ as a decision variable.
This allows us to depart from the standard model of the random process
$$\begin{equation}
f: \frac{\overline{X_1}} {\overline{X_2}} =
\frac{\frac{1}{N_1} \sum_{i=1}^{N_1} \omega_{i1}}{\frac{1}{N_2} \sum_{i=1}^{N_2} \omega_{i2}} \mapsto \mathscr{D}
\end{equation}$$
The decision variable $D$ is thus a function of the comparative measure $\frac{\overline{X_1}} {\overline{X_2}}$ of the random variables both defined on the probability spaces of their respective prospects.
We assume that the elements of $\mathscr{D}$ are the natural numbers $\{0, 1\}$, indicating that the ordinal comparison of $\overline{X_1}$ and $\overline{X_2}$ either provides evidence for a given prospect $\{1\}$ or not $\{0\}$.
Thus, $f$ itself can be defined as a random variable that maps the sample space $\Omega = \{\frac{\overline{X_1}} {\overline{X_2}}\}$ to the measurable space $\mathscr{D} = \{0, 1\}$.
However, since we are not interested in the measure $\frac{\overline{X_1}} {\overline{X_2}} \in \mathbb{R}$ itself but in the ordinal comparison of $X_1$ and $X_2$, we introduce the event space $\mathscr{F} = \{\frac{\overline{X_1}} {\overline{X_2}} > 0, \frac{\overline{X_1}} {\overline{X_2}} \leq 0\}$
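The ordinal comparison of the two sample means can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the sketch compares $\overline{X_1}$ and $\overline{X_2}$ directly instead of forming their ratio (which avoids division by zero and sign issues with losses), it codes a tie as 0, and the names `sample_prospect` and `decision_variable` as well as the two example prospects are hypothetical.

```python
import random
from statistics import mean


def sample_prospect(outcomes, probabilities, n_samples, rng):
    """Generate a sequence of single samples from one prospect."""
    return rng.choices(outcomes, weights=probabilities, k=n_samples)


def decision_variable(samples_1, samples_2):
    """Return 1 if the sample mean of prospect 1 exceeds that of prospect 2, else 0.

    Assumption: the means are compared directly rather than via their ratio,
    and a tie counts as no evidence for prospect 1.
    """
    return 1 if mean(samples_1) > mean(samples_2) else 0


rng = random.Random(7)
# Hypothetical risky prospect: 4 with p = 0.8, otherwise 0.
x1 = sample_prospect([4.0, 0.0], [0.8, 0.2], 20, rng)
# Hypothetical safe prospect: 3 with certainty.
x2 = sample_prospect([3.0], [1.0], 20, rng)

d = decision_variable(x1, x2)
print(mean(x1), mean(x2), d)
```

Because `d` depends on the sampled sequences, repeating the procedure with different seeds yields a distribution over $\{0, 1\}$, which is what makes the decision variable itself a random variable.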
use the general (unrestricted) model of the random variable from above as a starting point but now interpret the image $(\Omega', \mathscr{F'})$ as a decision variable.
$$\begin{equation}
f: (\Omega, \mathscr{F}) \mapsto (\Omega', \mathscr{D})
\end{equation}$$
# Method
......