---
title: 'Sampling Strategies in Decisions from Experience'
author: "Linus Hof, Thorsten Pachur, Veronika Zilker"
bibliography: sampling-strategies-in-dfe.bib
output:
  html_document:
    code_folding: hide
    toc: yes
    toc_float: yes
    number_sections: no
  pdf_document:
    toc: yes
csl: apa.csl
editor_options:
  markdown:
    wrap: sentence
---

```{r}
# load packages
pacman::p_load(repro,
               tidyverse,
               knitr,
               viridis)
```

# Author Note

This document was created from the commit with the hash `r repro::current_hash()`.

- Add information on how to reproduce the project.
- Add contact.

# Abstract

A probability-theoretic definition of sampling and a rough stochastic model of the random process underlying decisions from experience are proposed.
It is demonstrated how the stochastic model can be used a) to explicate assumptions about the sampling and decision strategies that agents may apply and b) to derive predictions about the resulting decision behavior in terms of functional forms and parameter values.
Synthetic choice data are simulated and modeled in cumulative prospect theory to test these predictions.

# Introduction

...

## Random Processes in Sequential Sampling

In research on decision theory, a standard paradigm is the choice between at least two (monetary) prospects.
Let a prospect be a probability space $(\Omega, \mathscr{F}, P)$.
$\Omega$ is the sample space

$$
\omega_i \in \Omega = \{\omega_1, ..., \omega_n\}
$$

containing a finite set of possible outcomes, i.e., gains and/or losses.
$\mathscr{F}$ is a set of subsets of $\Omega$, i.e., the event space

$$
A_i \in \mathscr{F} = \mathscr{P}(\Omega) \; .
$$

$\mathscr{P}(\Omega)$ denotes the power set of $\Omega$.

$P$ is a probability measure

$$
P: \mathscr{F} \mapsto [0,1]
$$

that assigns each $\omega_i \in \Omega$ a probability $0 \leq p_i \leq 1$ with $P(\Omega) = 1$ [cf. @kolmogorovFoundationsTheoryProbability1950, pp. 2-3].

In such a choice paradigm, agents are asked to evaluate the prospects and form a preference for one of them.
It is common to make a rather crude distinction between two variants of this evaluation process [cf. @hertwigDescriptionexperienceGapRisky2009].
For decisions from description (DfD), agents are provided a full symbolic description of the triples $(\Omega, \mathscr{F}, P)_j$, where $j$ denotes a prospect.
For decisions from experience [DfE; e.g., @hertwigDecisionsExperienceEffect2004], the probability triples are not described but must be explored by means of sampling.

To provide a formal definition of sampling in risky or uncertain choice, we make use of the mathematical concept of a random variable.

If, for each

$$
\omega_{i} \in \Omega: p(\omega_{i}) \neq 1 \; ,
$$

we refer to the respective prospect as *"risky"*, where risky describes the fact that if agents choose the prospect and one of the outcomes $\omega_{i}$ must occur, none of these outcomes occurs with certainty but according to the probability measure $P$.
It is acceptable to speak of the occurrence of $\omega_{i}$ as the realization of a random variable iff the following conditions A and B are met:

A) The random variable $X$ is defined as the function

$$
X: (\Omega, \mathscr{F}) \mapsto (\Omega', \mathscr{F'}) \; ,
$$

where the image $\Omega'$ is the set of possible values $X$ can take and $\mathscr{F'}$ is a set of subsets of $\Omega'$.
I.e., $X$ is measurable, such that the preimage of any $A'_i \in \mathscr{F'}$ is an event in $\mathscr{F}$:

$$
A'_i \in \mathscr{F'} \Rightarrow X^{-1}A'_i \in \mathscr{F}
$$

[@kolmogorovFoundationsTheoryProbability1950, p. 21].

B) The mapping $X: \Omega \mapsto \Omega'$ must be such that $X(\omega_i) = x_i = \omega_i$ for each $\omega_i \in \Omega$, i.e., each outcome is mapped to itself.

Given conditions A and B, we denote any realization of a random variable defined on the triple $(\Omega, \mathscr{F}, P)$ as a *"single sample"* of the respective prospect and any systematic approach to generate a sequence of single samples from multiple prospects as a sampling strategy [see also @hillsInformationSearchDecisions2010].
Because, for a sufficiently large number of single samples $n$ from a given prospect, i.e., as $n \to \infty$, the relative frequencies of $\omega_{i}$ approximate their probabilities $p_i$ [@bernoulliArsConjectandiOpus1713], sampling in principle allows agents to explore a prospect's probability space.

## A Stochastic Sampling Model for DfE

Consider a choice between $1,\, ...,\, j,\,...,\, n$ prospects, where $n \geq 2$ and $1 \leq j \leq n$.
To construct a rough stochastic sampling model (hereafter SSM) of the random process underlying DfE, it is assumed that agents base their decisions on the information provided by the prospects, which is in principle fully described by their probability triples.
Thus, a decision variable

$$
D := f((\Omega, \mathscr{F}, P)_j)
$$

is defined.
Since in DfE no symbolic descriptions of the triples are provided, the model is restricted to the case where decisions are based on sequences of single samples generated from the triples:

$$
D := f((X: (\Omega, \mathscr{F}) \mapsto (\Omega', \mathscr{F'}))_j) = f(X_1, ..., X_j, ..., X_n) \; ,
$$

where $\Omega_j = \Omega'_j$.

Note that the decision variable $D$ is defined as a function $f$ of the random variables associated with the prospects' probability spaces, where $f$ can operate on any quantitative measure, or moment, related to these random variables.
Since decision models differ in the form of $f$ and the measures the latter utilizes [see @heOntologyDecisionModels2020, for an ontology of decision models], we take the stance that these choices should be informed by psychological or other theory and empirical protocols.
After all, these choices reflect assumptions about the kind of information agents process and the way they process it, not to mention the question of whether they are capable of doing so.
In the following section, it is demonstrated how such assumptions about the processing strategies that agents may apply in DfE can be captured by the SSM.

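
The notion of a single sample and the frequency argument from the preceding section can be illustrated with a minimal sketch.
The prospect (outcomes 0 and 10 with probabilities .8 and .2), the helper names `draw_samples` and `rel_freq`, and the seed are illustrative assumptions and not part of the simulation code behind this manuscript.

```r
# Hypothetical prospect: outcomes (Omega) and their probabilities (P)
outcomes <- c(0, 10)
probs    <- c(.8, .2)

# Draw n single samples, i.e., n independent realizations of X.
# Indices are sampled so that prospects with one outcome work as well.
draw_samples <- function(n, outcomes, probs) {
  outcomes[sample.int(length(outcomes), size = n, replace = TRUE, prob = probs)]
}

# Relative frequency of each possible outcome in a sample sequence
rel_freq <- function(x, outcomes) {
  as.numeric(table(factor(x, levels = outcomes))) / length(x)
}

set.seed(1)
small <- draw_samples(10, outcomes, probs)
large <- draw_samples(1e5, outcomes, probs)

rel_freq(small, outcomes) # may deviate noticeably from probs
rel_freq(large, outcomes) # close to probs for large n
```

With few single samples, the rare outcome may not be observed at all, which is the situation the sampling strategies discussed next have to cope with.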
## Integrating sampling and decision strategies into the SSM

Hills and Hertwig [-@hillsInformationSearchDecisions2010] discussed a potential link between the sampling and decision strategies of agents in DfE, i.e., a systematic relation between the pattern according to which sequences of single samples are generated and the mechanism of integrating and evaluating these sample sequences to arrive at a decision.
Specifically, the authors suppose that frequent switching between prospects in the sampling phase translates to a round-wise decision strategy, for which the evaluation process is separated into multiple rounds of ordinal comparisons between single samples (or small chunks thereof), such that the units of the final evaluation are round wins rather than raw outcomes.
In contrast, infrequent switching is supposed to translate to a decision strategy for which only a single ordinal comparison of the summaries across all samples of the respective prospects is conducted [@hillsInformationSearchDecisions2010, see Figure 1].
The authors assume that these distinct sampling and decision strategies lead to characteristic patterns in decision behavior and may serve as an additional explanation for the many empirical protocols which indicate that DfE differ from DfD [@wulffMetaanalyticReviewTwo2018, for a meta-analytic review; but see @foxDecisionsExperienceSampling2006].

In the following, choices between two prospects are considered to integrate the assumptions about the sampling and decision strategies from above into the SSM.

Let $X$ and $Y$ be random variables related to the prospects with the probability spaces $(\Omega, \mathscr{F}, P)_X$ and $(\Omega, \mathscr{F}, P)_Y$.
By definition, the decision variable $D$ should quantify the accumulated evidence for one prospect over the other, which Hills and Hertwig [-@hillsInformationSearchDecisions2010] describe in units of won comparisons.
Hence, $f$ should map the possible outcomes of a comparison of quantitative measures related to $X$ and $Y$, hereafter the sampling space $S \subseteq \mathbb{R}$, to a measure space $S' = \{0,1\}$, indicating the possible outcomes of a single comparison:

$$
D:= f: S \mapsto S' \; .
$$

Since Hills and Hertwig [-@hillsInformationSearchDecisions2010] assume that comparisons of prospects are based on sample means, $S$ is the set

$$
S = \left\{
  \frac{\frac{1}{N_X} \sum\limits_{i=1}^{N_X} x_i}
       {\frac{1}{N_Y} \sum\limits_{j=1}^{N_Y} y_j}
  \right\}^{\mathbb{N}}
  =
  \left\{
  \frac{\overline{X}_{N_X}}
       {\overline{Y}_{N_Y}}
  \right\}^{\mathbb{N}} \; ,
$$

where $\mathbb{N}$ is the number of comparisons, $x_i$ and $y_j$ are the realizations of the respective random variables, i.e., the single samples, and $N_X$ and $N_Y$ are the numbers of single samples within a comparison.
For the elements of $S$ to be defined, however, the condition

$$
P(\overline{Y}_{N_Y} = 0) = 0
$$

must be fulfilled.

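
As a hedged illustration of the set $S$, the sketch below computes one ratio of sample means per comparison.
The two prospects, the helpers `draw` and `mean_ratio`, and the chosen numbers of comparisons and single samples are assumptions made for illustration only; the outcome of $Y$ is strictly positive so that $\overline{Y}_{N_Y} \neq 0$.

```r
# Hypothetical prospects (assumed values)
x_out <- c(1, 10); x_p <- c(.5, .5) # risky prospect X
y_out <- 5;        y_p <- 1         # safe prospect Y with a positive outcome

# Draw n single samples by sampling outcome indices
draw <- function(n, out, p) {
  out[sample.int(length(out), size = n, replace = TRUE, prob = p)]
}

# One element of S: ratio of the sample means of n_x and n_y single samples
mean_ratio <- function(n_x, n_y) {
  mean(draw(n_x, x_out, x_p)) / mean(draw(n_y, y_out, y_p))
}

set.seed(1)
s_small <- replicate(5, mean_ratio(1, 1)) # five comparisons of one single sample each
s_large <- mean_ratio(1e4, 1e4)           # one comparison of many single samples

s_small # each element is either 1/5 or 10/5
s_large # close to E(X)/E(Y) = 5.5/5
```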
To indicate that the comparison of prospects on the ordinal scale is of primary interest, we define

$$
\mathscr{D} = \left\{\frac{\overline{X}_{N_X}}{\overline{Y}_{N_Y}} > 0, \frac{\overline{X}_{N_X}}{\overline{Y}_{N_Y}} \leq 0 \right\}
$$

as a set of subsets of $S$ and the decision variable as the measure

$$
D:= f: (S, \mathscr{D}) \mapsto S'
$$

with the mapping

$$
D:=
  \left(
  \frac{\overline{X}_{N_X}}
       {\overline{Y}_{N_Y}}
  \right)
  \in S :
  f \left(
  \frac{\overline{X}_{N_X}}
       {\overline{Y}_{N_Y}}
  \right)
  =
  \begin{cases}
    1 & \text{if } \frac{\overline{X}_{N_X}}{\overline{Y}_{N_Y}} > 0 \in \mathscr{D} \\
    0 & \text{else}.
  \end{cases}
$$

## Predicting Choices From the SSM

Hills and Hertwig [-@hillsInformationSearchDecisions2010] proposed the two different sampling strategies in combination with the respective decision strategies, i.e., piecewise sampling and round-wise comparison vs.
comprehensive sampling and summary comparison, as an explanation for different choice patterns in DfE.
How does the current version of the SSM support this proposition?
Given prospects $X$ and $Y$, the sample space $S = \left\{\frac{\overline{X}_{N_X}} {\overline{Y}_{N_Y}}\right\}^{\mathbb{N}}$ can be varied by changing three parameters, i.e., the number of comparisons $\mathbb{N}$ and the sample sizes $N_X$ and $N_Y$ on which these comparisons are based.
First, only considering the pure cases formulated by the above authors, the following restrictions are placed on the parameters:

$$
\mathbb{N} =
  \begin{cases}
    1 & \text{if Summary} \\
    \geq 1 & \text{if Round-wise}
  \end{cases}
$$

and

$$
N_X \, \text{and} \, N_Y =
  \begin{cases}
    \geq 1 & \text{if Summary} \\
    1 & \text{if Round-wise}
  \end{cases}
$$

For the summary strategy, the following prediction is obtained:
Given that

$$
P\left(\lim_{N_X \to \infty} \overline{X}_{N_X} = E(X) \right) =
P\left(\lim_{N_Y \to \infty} \overline{Y}_{N_Y} = E(Y) \right) = 1 \; ,
$$

we obtain that

$$
\left(
\frac{\overline{X}_{N_X}}
     {\overline{Y}_{N_Y}}
\right)
\in S :
P\left(\lim_{N_X \to \infty} \lim_{N_Y \to \infty}
\frac{\overline{X}_{N_X}}
     {\overline{Y}_{N_Y}}
=
\frac{E(X)}
     {E(Y)}
\right) = 1 \; .
$$

I.e., for the summary strategy, we predict that for increasing sample sizes $N_X$ and $N_Y$, the prospect with the larger expected value is chosen almost surely.

For the round-wise strategy, the following prediction is obtained:

# Method

## Test set

Under each condition, i.e., each strategy-parameter combination, all gambles are played by 100 synthetic agents.
We test a set of gambles in which one of the prospects contains a safe outcome and the other two risky outcomes (*safe-risky gambles*).
To this end, 60 gambles are sampled from an initial set of 10,000.
Both outcomes and probabilities are drawn from uniform distributions, ranging from 0 to 20 for outcomes and from .01 to .99 for the probabilities of the lower risky outcomes $p_{low}$.
The probabilities of the higher risky outcomes are $1-p_{low}$.
To omit dominant prospects, safe outcomes fall between both risky outcomes.
The table below contains the test set of 60 gambles.
Sampling of gambles was stratified, randomly drawing 20 gambles each with no, an attractive, and an unattractive rare outcome.
Risky outcomes are considered *"rare"* if their probability is $p < .2$ and *"attractive"* (*"unattractive"*) if they are higher (lower) than the safe outcome.

```{r message=FALSE}
gambles <- read_csv("data/gambles/sr_subset.csv")
gambles %>% kable()
```

## Model Parameters

**Switching probability** $s$ is the probability with which agents draw the next single sample from the prospect they did not draw their most recent single sample from.
$s$ is varied from .1 to 1 in increments of .1.

The **boundary type** is either the minimum value any prospect's sample statistic must reach (absolute) or the minimum value for the difference of these statistics (relative).
Sample statistics are sums over outcomes (comprehensive strategy) and sums over wins (piecewise strategy), respectively.

For comprehensive integration, the **boundary value** $a$ is varied from 15 to 75 in increments of 15.
For piecewise integration, $a$ is varied from 1 to 5 in increments of 1.

```{r message=FALSE}
# read choice data
cols <- list(.default = col_double(),
             strategy = col_factor(),
             boundary = col_factor(),
             gamble = col_factor(),
             rare = col_factor(),
             agent = col_factor(),
             choice = col_factor())
choices <- read_csv("data/choices/choices.csv", col_types = cols)
```

In sum, 2 (strategies) x 60 (gambles) x 100 (agents) x 100 (parameter combinations) = `r nrow(choices)` choices are simulated.

# Results

Because we are not interested in deviations from normative choice due to sampling artifacts (e.g., ceiling effects produced by low boundaries), we remove trials in which only one prospect was attended.
In addition, we use relative frequencies of sampled outcomes rather than 'a priori' probabilities to compare actual against normative choice behavior.

```{r}
# remove choices where prospects were not attended
choices <- choices %>%
  filter(!(is.na(a_ev_exp) | is.na(b_ev_exp)))
```

```{r eval = FALSE}
# remove choices where not all outcomes were sampled
choices <- choices %>%
  filter(!(is.na(a_ev_exp) | is.na(b_ev_exp) | a_p1_exp == 0 | a_p2_exp == 0))
```

Removing the respective trials, we are left with `r nrow(choices)` choices.

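
The use of relative frequencies instead of 'a priori' probabilities can be sketched as follows.
The sample sequence and the names `sampled`, `p_exp`, and `ev_exp` are hypothetical; how the columns `a_ev_exp` and `b_ev_exp` were actually computed is not shown in this document, so this is only one plausible reading.

```r
# Hypothetical sequence of single samples from a risky prospect with outcomes 2 and 12
sampled <- c(2, 2, 12, 2, 2, 2, 12, 2)

# Experienced probabilities: relative frequencies of the sampled outcomes
p_exp <- table(sampled) / length(sampled)

# Experienced expected value: outcomes weighted by their experienced probabilities
ev_exp <- sum(as.numeric(names(p_exp)) * as.numeric(p_exp))
ev_exp # identical to mean(sampled)

# A prospect that was never attended has no experienced expected value
ev_unattended <- mean(numeric(0))
is.na(ev_unattended)
```

Under this reading, a prospect without any single sample yields a missing experienced expected value, which is the case an `is.na()` filter would remove.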
## Sample Size

```{r message=FALSE}
samples <- choices %>%
  group_by(strategy, s, boundary, a) %>%
  summarise(n_med = median(n_sample))
samples_piecewise <- samples %>% filter(strategy == "piecewise")
samples_comprehensive <- samples %>% filter(strategy == "comprehensive")
```

The median sample sizes generated by different parameter combinations ranged from `r min(samples_piecewise$n_med)` to `r max(samples_piecewise$n_med)` for piecewise integration and from `r min(samples_comprehensive$n_med)` to `r max(samples_comprehensive$n_med)` for comprehensive integration.

### Boundary type and boundary value (a)

As evidence is accumulated sequentially, relative boundaries and large boundary values naturally lead to larger sample sizes, irrespective of the integration strategy.

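
The relation between boundary type and sample size can be illustrated with a minimal sketch of sequential sampling under a comprehensive strategy.
It is not the simulation code used for this manuscript: the gamble, the function `sample_until`, the random starting prospect, and the rule of checking the boundary after every single sample are assumptions made for illustration.

```r
# Hypothetical safe-risky gamble (assumed values, not taken from the test set)
draw_risky <- function() sample(c(2, 12), size = 1, prob = c(.8, .2))
draw_safe  <- function() 4

# Draw single samples until a boundary is reached and return the sample size.
# s: switching probability, a: boundary value
sample_until <- function(s, a, boundary = c("absolute", "relative"), max_n = 1e4) {
  boundary <- match.arg(boundary)
  sums <- c(risky = 0, safe = 0)           # comprehensive strategy: sums over outcomes
  current <- sample(names(sums), size = 1) # assumption: random starting prospect
  n <- 0
  while (n < max_n) {
    outcome <- if (current == "risky") draw_risky() else draw_safe()
    sums[current] <- sums[current] + outcome
    n <- n + 1
    # absolute: any sum reaches a; relative: the difference of sums reaches a
    evidence <- if (boundary == "absolute") max(sums) else abs(diff(sums))
    if (evidence >= a) break
    # switch to the other prospect with probability s
    if (runif(1) < s) current <- setdiff(names(sums), current)
  }
  n
}

set.seed(1)
n_abs <- replicate(500, sample_until(s = .5, a = 30, boundary = "absolute"))
n_rel <- replicate(500, sample_until(s = .5, a = 30, boundary = "relative"))
c(absolute = median(n_abs), relative = median(n_rel))
```

Because the difference of two non-negative sums can never exceed the larger sum, a relative boundary in this sketch is reached no earlier than an absolute boundary of the same value on the same sample sequence.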

```{r message=FALSE}
group_med <- samples_piecewise %>%
  group_by(boundary, a) %>%
  summarise(group_med = median(n_med)) # to get the median across all s values

samples_piecewise %>%
  ggplot(aes(a, n_med, color = a)) +
  geom_jitter(alpha = .5, size = 2) +
  geom_point(data = group_med, aes(y = group_med), size = 3) +
  facet_wrap(~boundary) +
  scale_color_viridis() +
  labs(title = "Piecewise Integration",
       x ="a",
       y="Sample Size",
       col="a") +
  theme_minimal()
```

```{r message=FALSE}
group_med <- samples_comprehensive %>%
  group_by(boundary, a) %>%
  summarise(group_med = median(n_med)) # to get the median across all s values

samples_comprehensive %>%
  ggplot(aes(a, n_med, color = a)) +
  geom_jitter(alpha = .5, size = 2) +
  geom_point(data = group_med, aes(y = group_med), size = 3) +
  facet_wrap(~boundary) +
  scale_color_viridis() +
  labs(title = "Comprehensive Integration",
       x ="a",
       y="Sample Size",
       col="a") +
  theme_minimal()
```

### Switching probability (s)

For piecewise integration, there is an inverse relationship between switching probability and sample size.
I.e., the lower $s$, the less frequently prospects are compared and thus, boundaries are only approached with larger sample sizes.
This effect is particularly pronounced for low switching probabilities, such that the increase in sample size accelerates as switching probability decreases.

```{r message=FALSE}
group_med <- samples_piecewise %>%
  group_by(boundary, s) %>%
  summarise(group_med = median(n_med)) # to get the median across all a values

samples_piecewise %>%
  ggplot(aes(s, n_med, color = s)) +
  geom_jitter(alpha = .5, size = 2) +
  geom_point(data = group_med, aes(y = group_med), size = 3) +
  facet_wrap(~boundary) +
  scale_color_viridis() +
  labs(title = "Piecewise Integration",
       x ="s",
       y="Sample Size",
       col="s") +
  theme_minimal()
```

For comprehensive integration, boundary types differ in the effects of switching probability.
For absolute boundaries, switching probability has no apparent effect on sample size as the distance of a given prospect to its absolute boundary is not changed by switching to (and sampling from) the other prospect.
For relative boundaries, however, sample sizes increase with switching probability.


```{r message=FALSE}
group_med <- samples_comprehensive %>%
  group_by(boundary, s) %>%
  summarise(group_med = median(n_med)) # to get the median across all a values

samples_comprehensive %>%
  ggplot(aes(s, n_med, color = s)) +
  geom_jitter(alpha = .5, size = 2) +
  geom_point(data = group_med, aes(y = group_med), size = 3) +
  facet_wrap(~boundary) +
  scale_color_viridis() +
  labs(title = "Comprehensive Integration",
       x ="s",
       y = "Sample Size",
       col="s") +
  theme_minimal()
```

## Choice Behavior

Below, extending Hills and Hertwig [-@hillsInformationSearchDecisions2010], we investigate how integration strategies, gamble features, and model parameters jointly affect choice behavior in general and contribute to the underweighting of rare events in particular.
We apply two definitions of underweighting of rare events:
Considering false response rates, we define underweighting such that the rarity of an attractive (unattractive) outcome leads to choosing the safe (risky) prospect although the risky (safe) prospect has a higher expected value.

```{r message=FALSE}
fr_rates <- choices %>%
  mutate(ev_ratio_exp = round(a_ev_exp/b_ev_exp, 2),
         norm = case_when(ev_ratio_exp > 1 ~ "A",
                          ev_ratio_exp < 1 ~ "B")) %>%
  filter(!is.na(norm)) %>% # exclude trials with normative indifferent options
  group_by(strategy, s, boundary, a, rare, norm, choice) %>% # group correct and incorrect responses
  summarise(n = n()) %>% # absolute numbers
  mutate(rate = round(n/sum(n), 2), # response rates
         type = case_when(norm == "A" & choice == "B" ~ "false safe",
                          norm == "B" & choice == "A" ~ "false risky")) %>%
  filter(!is.na(type)) # remove correct responses
```

Considering the parameters of Prelec's [-@prelecProbabilityWeightingFunction1998] implementation of the weighting function [CPT; cf. @tverskyAdvancesProspectTheory1992], underweighting is reflected by decision weights estimated to be smaller than the corresponding objective probabilities.

### False Response Rates

```{r message=FALSE}
fr_rates_piecewise <- fr_rates %>% filter(strategy == "piecewise")
fr_rates_comprehensive <- fr_rates %>% filter(strategy == "comprehensive")
```

The false response rates generated by different parameter combinations ranged from `r min(fr_rates_piecewise$rate)` to `r max(fr_rates_piecewise$rate)` for piecewise integration and from `r min(fr_rates_comprehensive$rate)` to `r max(fr_rates_comprehensive$rate)` for comprehensive integration.
However, false response rates vary considerably as a function of rare events, indicating that their presence and attractiveness are major determinants of false response rates.


```{r message=FALSE}
fr_rates %>%
  group_by(strategy, boundary, rare) %>%
  summarise(min = min(rate),
            max = max(rate)) %>%
  kable()
```

The heatmaps below show the false response rates for all strategy-parameter combinations.
Consistent with our (somewhat rough) definition of underweighting, the rate of false risky responses is generally higher if the unattractive outcome of the risky prospect is rare (top panel).
Conversely, if the attractive outcome of the risky prospect is rare, the rate of false safe responses is generally higher (bottom panel).
As indicated by the larger range of false response rates, the effects of rare events are considerably larger for piecewise integration.

```{r message=FALSE}
fr_rates %>%
  filter(strategy == "piecewise", boundary == "absolute") %>%
  ggplot(aes(a, s, fill = rate)) +
  facet_grid(type ~ fct_relevel(rare, "attractive", "none", "unattractive"), switch = "y") +
  geom_tile(colour="white", size=0.25) +
  scale_x_continuous(expand=c(0,0), breaks = seq(1, 5, 1)) +
  scale_y_continuous(expand=c(0,0), breaks = seq(.1, 1, .1)) +
  scale_fill_viridis() +
  labs(title = "Piecewise Integration | Absolute Boundary",
       x = "a",
       y= "s",
       fill = "% False Responses") +
  theme_minimal()
```

```{r message=FALSE}
fr_rates %>%
  filter(strategy == "piecewise", boundary == "relative") %>%
  ggplot(aes(a, s, fill = rate)) +
  facet_grid(type ~ fct_relevel(rare, "attractive", "none", "unattractive"), switch = "y") +
  geom_tile(colour="white", size=0.25) +
  scale_x_continuous(expand=c(0,0), breaks = seq(1, 5, 1)) +
  scale_y_continuous(expand=c(0,0), breaks = seq(.1, 1, .1)) +
  scale_fill_viridis() +
  labs(title = "Piecewise Integration | Relative Boundary",
       x = "a",
       y= "s",
       fill = "% False Responses") +
  theme_minimal()
```

```{r message=FALSE}
fr_rates %>%
  filter(strategy == "comprehensive", boundary == "absolute") %>%
  ggplot(aes(a, s, fill = rate)) +
  facet_grid(type ~ fct_relevel(rare, "attractive", "none", "unattractive"), switch = "y") +
  geom_tile(colour="white", size=0.25) +
  scale_x_continuous(expand=c(0,0), breaks = seq(15, 75, 15)) +
  scale_y_continuous(expand=c(0,0), breaks = seq(.1, 1, .1)) +
  scale_fill_viridis() +
  labs(title = "Comprehensive Integration | Absolute Boundary",
       x = "a",
       y= "s",
       fill = "% False Responses") +
  theme_minimal()
```

```{r message=FALSE}
fr_rates %>%
  filter(strategy == "comprehensive", boundary == "relative") %>%
  ggplot(aes(a, s, fill = rate)) +
  facet_grid(type ~ fct_relevel(rare, "attractive", "none", "unattractive"), switch = "y") +
  geom_tile(colour="white", size=0.25) +
  scale_x_continuous(expand=c(0,0), breaks = seq(15, 75, 15)) +
  scale_y_continuous(expand=c(0,0), breaks = seq(.1, 1, .1)) +
  scale_fill_viridis() +
  labs(title = "Comprehensive Integration | Relative Boundary",
       x = "a",
       y= "s",
       fill = "% False Responses") +
  theme_minimal()
```

#### Switching Probability (s) and Boundary Value (a)

Because, for both piecewise and comprehensive integration, the differences between boundary types are minor and concern magnitude rather than qualitative pattern, the remaining analyses of false response rates are summarized across absolute and relative boundaries.


Below, the $s$ and $a$ parameters are considered as additional sources of variation in the false response pattern above and beyond the interplay of integration strategies and the rarity and attractiveness of outcomes.

```{r message=FALSE}
# false response rates as a function of s (x-axis) and a (color): piecewise integration
fr_rates %>% 
  filter(strategy == "piecewise") %>% 
  ggplot(aes(s, rate, color = a)) + 
  facet_grid(type ~ fct_relevel(rare, "attractive", "none", "unattractive"), switch = "y") +
  geom_jitter(size = 2) + 
  scale_x_continuous(breaks = seq(0, 1, .1)) +
  scale_y_continuous(breaks = seq(0, 1, .1)) +
  scale_color_viridis() + 
  labs(title = "Piecewise Integration", 
       x = "s", 
       y= "% False Responses", 
       color = "a") + 
  theme_minimal()
```

```{r message=FALSE}
# false response rates as a function of s (x-axis) and a (color): comprehensive integration
fr_rates %>% 
  filter(strategy == "comprehensive") %>% 
  ggplot(aes(s, rate, color = a)) + 
  facet_grid(type ~ fct_relevel(rare, "attractive", "none", "unattractive"), switch = "y") +
  geom_jitter(size = 2) + 
  scale_x_continuous(breaks = seq(0, 1, .1)) +
  scale_y_continuous(breaks = seq(0, 1, .1)) +
  scale_color_viridis() + 
  labs(title = "Comprehensive Integration", 
       x = "s", 
       y= "% False Responses", 
       color = "a") + 
  theme_minimal()
```

For piecewise integration, switching probability is naturally related to the size of the samples on which the round-wise comparisons of prospects are based, with low values of $s$ indicating large samples and vice versa.
Accordingly, switching probability is positively related to false response rates.
That is, the larger the switching probability, the smaller the round-wise sample size and, with it, the probability of experiencing a rare event within a given round.
Because round-wise comparisons are independent of each other and binomial distributions within a given round are skewed for small samples and small outcome probabilities [@kolmogorovFoundationsTheoryProbability1950], increasing boundary values do not reverse but rather amplify this relation.

For comprehensive integration, switching probability is negatively related to false response rates, i.e., an increase in $s$ is associated with decreasing false response rates.
This relation, however, may be the result of an artificial interaction between the $s$ and $a$ parameters.
Specifically, in the current algorithmic implementation of sampling with a comprehensive integration mechanism, decreasing switching probabilities cause comparisons of prospects based on increasingly unequal sample sizes immediately after switching prospects.
Consequently, reaching (low) boundaries is a function of switching probability and the associated sample sizes rather than of actual evidence for one prospect over the other.

### Cumulative Prospect Theory

In the following, we examine the possible relations between the parameters of the *choice-generating* sampling models and the *choice-describing* cumulative prospect theory.

For each distinct strategy-parameter combination, we ran 20 chains of 40,000 iterations each, after a warm-up period of 1,000 samples.
To reduce potential autocorrelation during the sampling process, we only kept every 20th sample (thinning).
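
As a rough check on how many posterior draws these settings retain, the arithmetic can be sketched as follows.
The sketch assumes that the 40,000 iterations are counted after the warm-up period and that thinning applies to these iterations only; the variable names are illustrative and not taken from the estimation script.

```r
# MCMC settings as reported in the text (the interpretation is an assumption, see above)
n_chains <- 20      # number of chains
n_iter   <- 40000   # iterations per chain, assumed to exclude warm-up
n_thin   <- 20      # keep every 20th sample

draws_per_chain <- n_iter / n_thin             # retained draws per chain
draws_total     <- n_chains * draws_per_chain  # retained draws across all chains

draws_per_chain
draws_total
```

Under these assumptions, each chain contributes 2,000 draws, for a total of 40,000 retained draws per strategy-parameter combination.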

```{r}
# read CPT data
cols <- list(.default = col_double(),
             strategy = col_factor(),
             boundary = col_factor(),
             parameter = col_factor())
estimates <- read_csv("data/estimates/estimates_cpt_pooled.csv", col_types = cols)
```

#### Convergence

```{r}
gel_92 <- max(estimates$Rhat) # get largest scale reduction factor (Gelman & Rubin, 1992)
```

The potential scale reduction factor $\hat{R}$ was $\leq$ `r round(gel_92, 3)` for all estimates, indicating good convergence.

#### Piecewise Integration

```{r}
# generate subset of all strategy-parameter combinations (rows) and their parameters (columns)
curves_cpt <- estimates %>% 
  select(strategy, s, boundary, a, parameter, mean) %>% 
  pivot_wider(names_from = parameter, values_from = mean)
```

##### Weighting function w(p)

We start by plotting the weighting curves for all parameter combinations under piecewise integration.

```{r}
cpt_curves_piecewise <- curves_cpt %>% 
  filter(strategy == "piecewise") %>% 
  expand_grid(p = seq(0, 1, .1)) %>% # add vector of objective probabilities
  mutate(w = round(exp(-delta*(-log(p))^gamma), 2)) # compute decision weights (cf. Prelec, 1998)

# all strategy-parameter combinations
cpt_curves_piecewise %>% 
  ggplot(aes(p, w)) + 
  geom_path(size = .5) + 
  geom_abline(intercept = 0, slope = 1, color = "red", size = 1) + 
  labs(title = "Piecewise Integration: Weighting functions", 
       x = "p", 
       y= "w(p)") + 
  theme_minimal()
```

```{r}
# faceted by boundary value a
cpt_curves_piecewise %>% 
  ggplot(aes(p, w)) + 
  geom_path() + 
  geom_abline(intercept = 0, slope = 1, color = "red", size = 1) + 
  facet_wrap(~a) + 
  labs(title = "Piecewise Integration: Weighting functions", 
       x = "p", 
       y= "w(p)", 
       color = "Switching Probability") + 
  scale_color_viridis() + 
  theme_minimal()
```

```{r}
# colored by switching probability s
cpt_curves_piecewise %>% 
  ggplot(aes(p, w, color = s)) + 
  geom_path() + 
  geom_abline(intercept = 0, slope = 1, color = "red", size = 1) + 
  labs(title = "Piecewise Integration: Weighting functions", 
       x = "p", 
       y= "w(p)", 
       color = "Switching Probability") + 
  scale_color_viridis() + 
  theme_minimal()
```

```{r}
# colored by s, faceted by a
cpt_curves_piecewise %>% 
  ggplot(aes(p, w, color = s)) + 
  geom_path() + 
  geom_abline(intercept = 0, slope = 1, color = "red", size = 1) + 
  facet_wrap(~a) + 
  labs(title = "Piecewise Integration: Weighting functions", 
       x = "p", 
       y= "w(p)", 
       color = "Switching Probability") + 
  scale_color_viridis() + 
  theme_minimal()
```

##### Value function v(x)

```{r}
cpt_curves_piecewise <- curves_cpt %>% 
  filter(strategy == "piecewise") %>% 
  expand_grid(x = seq(0, 20, 2)) %>% # add vector of objective outcomes
  mutate(v = round(x^alpha, 2)) # compute subjective values (power value function)

# all strategy-parameter combinations
cpt_curves_piecewise %>% 
  ggplot(aes(x, v)) + 
  geom_path(size = .5) + 
  geom_abline(intercept = 0, slope = 1, color = "red", size = 1) + 
  labs(title = "Piecewise Integration: Value functions", 
       x = "x", 
       y= "v(x)") + 
  theme_minimal()
```

```{r}
# colored by switching probability s
cpt_curves_piecewise %>% 
  ggplot(aes(x, v, color = s)) + 
  geom_path(size = .5) + 
  geom_abline(intercept = 0, slope = 1, color = "red", size = 1) + 
  labs(title = "Piecewise Integration: Value functions", 
       x = "x", 
       y= "v(x)") + 
  scale_color_viridis() + 
  theme_minimal()
```

```{r}
# colored by s, faceted by a
cpt_curves_piecewise %>% 
  ggplot(aes(x, v, color = s)) + 
  geom_path(size = .5) + 
  geom_abline(intercept = 0, slope = 1, color = "red", size = 1) + 
  facet_wrap(~a) + 
  labs(title = "Piecewise Integration: Value functions", 
       x = "x", 
       y= "v(x)") + 
  scale_color_viridis() + 
  theme_minimal()
```

#### Comprehensive Integration

##### Weighting function w(p)

We start by plotting the weighting curves for all parameter combinations under comprehensive integration.

```{r}
cpt_curves_comprehensive <- curves_cpt %>% 
  filter(strategy == "comprehensive") %>% 
  expand_grid(p = seq(0, 1, .1)) %>% # add vector of objective probabilities
  mutate(w = round(exp(-delta*(-log(p))^gamma), 2)) # compute decision weights (cf. Prelec, 1998)

# all strategy-parameter combinations
cpt_curves_comprehensive %>% 
  ggplot(aes(p, w)) + 
  geom_path(size = .5) + 
  geom_abline(intercept = 0, slope = 1, color = "red", size = 1) + 
  labs(title = "Comprehensive Integration: Weighting functions", 
       x = "p", 
       y= "w(p)") + 
  theme_minimal()
```

```{r}
# faceted by boundary value a
cpt_curves_comprehensive %>% 
  ggplot(aes(p, w)) + 
  geom_path(size = .5) + 
  geom_abline(intercept = 0, slope = 1, color = "red", size = 1) + 
  labs(title = "Comprehensive Integration: Weighting functions", 
       x = "p", 
       y= "w(p)") + 
  facet_wrap(~a) + 
  theme_minimal()
```

```{r}
# colored by switching probability s
cpt_curves_comprehensive %>% 
  ggplot(aes(p, w, color = s)) + 
  geom_path() + 
  geom_abline(intercept = 0, slope = 1, color = "red", size = 1) + 
  labs(title = "Comprehensive Integration: Weighting functions", 
       x = "p", 
       y= "w(p)", 
       color = "Switching Probability") + 
  scale_color_viridis() + 
  theme_minimal()
```

```{r}
# colored by s, faceted by a
cpt_curves_comprehensive %>% 
  ggplot(aes(p, w, color = s)) + 
  geom_path() + 
  geom_abline(intercept = 0, slope = 1, color = "red", size = 1) + 
  facet_wrap(~a) + 
  labs(title = "Comprehensive Integration: Weighting functions", 
       x = "p", 
       y= "w(p)", 
       color = "Switching Probability") + 
  scale_color_viridis() + 
  theme_minimal()
```

```{r}
# as above, restricted to high switching probabilities (s >= .7)
cpt_curves_comprehensive %>% 
  filter(s >= .7) %>% 
  ggplot(aes(p, w, color = s)) + 
  geom_path() + 
  geom_abline(intercept = 0, slope = 1, color = "red", size = 1) + 
  facet_wrap(~a) + 
  labs(title = "Comprehensive Integration: Weighting functions", 
       x = "p", 
       y= "w(p)", 
       color = "Switching Probability") + 
  scale_color_viridis() + 
  theme_minimal()
```

##### Value function v(x)

```{r}
cpt_curves_comprehensive <- curves_cpt %>% 
  filter(strategy == "comprehensive") %>% 
  expand_grid(x = seq(0, 20, 2)) %>% # add vector of objective outcomes
  mutate(v = round(x^alpha, 2)) # compute subjective values (power value function)

# all strategy-parameter combinations
cpt_curves_comprehensive %>% 
  ggplot(aes(x, v)) + 
  geom_path(size = .5) + 
  geom_abline(intercept = 0, slope = 1, color = "red", size = 1) + 
  labs(title = "Comprehensive Integration: Value functions", 
       x = "x", 
       y= "v(x)") + 
  theme_minimal()
```

```{r}
# faceted by boundary value a
cpt_curves_comprehensive %>% 
  ggplot(aes(x, v)) + 
  geom_path(size = .5) + 
  geom_abline(intercept = 0, slope = 1, color = "red", size = 1) + 
  facet_wrap(~a) + 
  labs(title = "Comprehensive Integration: Value functions", 
       x = "x", 
       y= "v(x)") + 
  theme_minimal()
```

```{r}
# colored by switching probability s
cpt_curves_comprehensive %>% 
  ggplot(aes(x, v, color = s)) + 
  geom_path(size = .5) + 
  geom_abline(intercept = 0, slope = 1, color = "red", size = 1) + 
  labs(title = "Comprehensive Integration: Value functions", 
       x = "x", 
       y= "v(x)") + 
  scale_color_viridis() + 
  theme_minimal()
```

```{r}
# colored by s, faceted by a
cpt_curves_comprehensive %>% 
  ggplot(aes(x, v, color = s)) + 
  geom_path(size = .5) + 
  geom_abline(intercept = 0, slope = 1, color = "red", size = 1) + 
  facet_wrap(~a) + 
  labs(title = "Comprehensive Integration: Value functions", 
       x = "x", 
       y= "v(x)") + 
  scale_color_viridis() + 
  theme_minimal()
```

# Discussion

# Conclusion

# References