---
title: 'Sampling Strategies in Decisions from Experience'
author: "Linus Hof, Thorsten Pachur, Veronika Zilker"
bibliography: sampling-strategies-in-dfe.bib
output:
  html_document:
    code_folding: hide
    toc: yes
    toc_float: yes
    number_sections: no
  pdf_document:
    toc: yes
csl: apa.csl
editor_options:
  markdown:
    wrap: sentence
---

```{r}
# load packages
pacman::p_load(repro,
               tidyverse,
               knitr,
               viridis)
```

# Author Note

This document was created from the commit with the hash `r repro::current_hash()`.

- Add information on how to reproduce the project.
- Add contact.

# Abstract

A probability theoretic definition of prospects and a rough stochastic sampling model for decisions from experience are proposed.
It is demonstrated how the model can be used a) to explicate assumptions about the sampling and decision strategies that agents may apply and b) to derive predictions about function forms and parameter values that describe the resulting decision behavior.
Synthetic choice data are simulated and modeled in cumulative prospect theory to test these predictions.

# Introduction

...


## Sampling in Decisions from Experience

In research on decision theory, a standard paradigm is the choice between at least two (monetary) prospects.
Let a prospect be a probability space $(\Omega, \mathscr{F}, P)$.
$\Omega$ is the sample space

$$
\Omega = \{\omega_1, ..., \omega_n\}
$$

containing a finite set of possible outcomes $\omega$, i.e., monetary gains and/or losses.
$\mathscr{F}$ is the set of all possible subsets of $\Omega$:

$$
\mathscr{F} = \{A_1, A_2, ...\} = \mathscr{P}(\Omega)
\; .
$$

$P$ is a probability measure

$$
P: \mathscr{F} \mapsto [0,1]
$$

that assigns each outcome $\omega$ a probability $0 < p(\omega) \leq 1$ with $P(\Omega) = 1$ [@kolmogorovFoundationsTheoryProbability1950, pp. 2-3].

In such a choice paradigm, agents are asked to evaluate the prospects and build a preference for one of them.
It is common to distinguish between two variants of this evaluation process [cf. @hertwigDescriptionexperienceGapRisky2009].
For decisions from description (DfD), agents are provided a full symbolic description of the prospects.
For decisions from experience [DfE; e.g., @hertwigDecisionsExperienceEffect2004], prospects are not described but must be explored by means of sampling.

To provide a formal definition of sampling in risky choice, we make use of the mathematical concept of a random variable and start by referring to a prospect as *"risky"* in the case where $p(\omega) \neq 1$ for all $\omega \in \Omega$.
Here, risky describes the fact that if agents choose a prospect and one of its outcomes in $\Omega$ must occur, none of these outcomes will occur with certainty.
It is acceptable to speak of the occurrence of $\omega$ as a realization of a random variable $X$ defined on a prospect iff the following conditions (1) and (2) are met:

(1) $X$ is a measurable function $$X: (\Omega, \mathscr{F}) \mapsto (\Omega', \mathscr{F'}) \; ,$$ where $\Omega'$ is a set of real-valued numbers $X$ can take and $\mathscr{F'}$ is a set of subsets of $\Omega'$. I.e., $\Omega$ maps into $\Omega'$ such that each subset $A' \in \mathscr{F'}$ has a pre-image $X^{-1}A' \in \mathscr{F}$, which is the set $\{\omega \in \Omega: X(\omega) \in A'\}$ [@kolmogorovFoundationsTheoryProbability1950, p. 21].

(2) The mapping is such that $X(\omega) = x \equiv \omega$.

In (2), $x \equiv \omega$ means that the realization of a random variable $X(\omega) = x$ is numerically equivalent to its pre-image $\omega$.
Given conditions (1) and (2), we denote any observation of $\omega$ as a *"single sample"*, or realization, of a random variable defined on a prospect and the act of generating a sequence of single samples in discrete time as *"sequential sampling"*.
Note that, since random variables defined on the same prospect are independent and identically distributed (iid), the weak law of large numbers applies to the relative frequency of occurrence of an outcome $\omega$ in a sequence of single samples originating from the same prospect [cf. @bernoulliArsConjectandiOpus1713].
Thus, long sample sequences in principle allow agents to obtain the same information about a prospect by sampling as by symbolic description.

Consider now a choice between prospects $1, ..., k$.
To construct a stochastic sampling model for DfE, we assume that agents base their decision on the information related to these prospects and define a decision variable as a function of the latter:

$$
D:= f((\Omega, \mathscr{F}, P)_1, ..., (\Omega, \mathscr{F}, P)_k)
\;.
$$

Now, since in DfE no symbolic descriptions of the prospects are provided, the model must be restricted to the case where decisions are based on sequences of single samples originating from the respective prospects:

$$
D := f(X_{i1}, ..., X_{ik})
\; ,
$$

where $i = 1, ..., N$ denotes a sequence of length $N$ of random variables that are iid.
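
The convergence of relative frequencies implied by the weak law of large numbers can be illustrated with a minimal simulation sketch.
The two-outcome prospect below (`omega`, `p`) and the sequence lengths are hypothetical values chosen for illustration only; they are not part of the test set used later.

```r
set.seed(1)

# Hypothetical prospect: outcomes and their probabilities (illustrative values)
omega <- c(0, 10)
p     <- c(0.8, 0.2)

# Sequential sampling: draw n iid single samples from the prospect
draw_samples <- function(n) sample(omega, size = n, replace = TRUE, prob = p)

# Relative frequency of the rare outcome (10) for increasing sequence lengths
seq_lengths <- c(10, 100, 10000)
rel_freq <- sapply(seq_lengths, function(n) mean(draw_samples(n) == 10))
rel_freq
```

For short sequences, the relative frequency of the rare outcome can deviate substantially from $p = .2$, whereas for long sequences it should lie close to it.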

The form of $f$ and the measures it utilizes reflect our assumptions about the kind of information agents process and the way they process it; these choices should be informed by psychological theory and empirical protocols.
Taking the case of different sampling and decision strategies previously assumed to play a role in DfE, the following section demonstrates how such assumptions can be explicated in a stochastic model that builds on the sampling approach outlined so far.

## A Stochastic Sampling Model Capturing Differences in Sampling and Decision Strategies

Hills and Hertwig [-@hillsInformationSearchDecisions2010] discussed a potential link between sampling and decision strategies in DfE.
Specifically, the authors suppose that if single samples originating from different prospects are generated in direct succession (piecewise sampling), the evaluation of prospects is based on multiple ordinal comparisons of single samples (round-wise decisions).
In contrast, if single samples originating from the same prospect are generated in direct succession (comprehensive sampling), it is supposed that the evaluation of prospects is based on a single ordinal comparison of long sequences of single samples (summary decisions) [see @hillsInformationSearchDecisions2010, Figure 1, for a graphical summary].

We now consider choices between two prospects and the assumptions of Hills and Hertwig [-@hillsInformationSearchDecisions2010] in more detail to build the respective stochastic sampling model for DfE.

Let $X$ and $Y$ be random variables defined on the prospects $(\Omega, \mathscr{F}, P)_X$ and $(\Omega, \mathscr{F}, P)_Y$.
Hills and Hertwig [-@hillsInformationSearchDecisions2010] suggest that any two sample sequences $X_i$ and $Y_i$ are compared by their means.
Let thus $C = \mathbb{R}$ be the set of all possible outcomes of such a mean comparison for given sequence lengths $N_X$ and $N_Y$ and

$$
\mathscr{C} = \left\{ \{c \in C: \overline{X}_{N_X} - \overline{Y}_{N_Y} > 0\}, \{ c \in C: \overline{X}_{N_X} - \overline{Y}_{N_Y} \leq 0\} \right\}
$$

be a set of subsets of $C$, indicating that comparisons of prospects on the ordinal (rather than on the metric) scale are of primary interest.
The outcome of such an ordinal comparison can be regarded as evidence for or against a prospect and the number of wins over a series of independent ordinal comparisons as accumulated evidence.
To integrate the concept of evidence accumulation into the current model, we let $D$ be a measurable function that maps the possible outcomes of a mean comparison in $C$ onto a measure space $C' = \{0,1\}$, with $0$ ($1$) indicating a lost (won) comparison:

$$
D(c \in C) =
\begin{cases}
1 & \text{for } \{c \in C: (\overline{X}_{N_X} - \overline{Y}_{N_Y} > 0) \in \mathscr{C} \} \\
0 & \text{else}.
\end{cases}
$$

It can be shown that for fixed sequence lengths $N_X$ and $N_Y$, a sequence $D_i = D_1, ..., D_n$ is a Bernoulli process, such that the number of won comparisons follows the binomial distribution

$$
D \sim B\left( p \left(\overline{X}_{N_X} - \overline{Y}_{N_Y} > 0\right), n\right) \; ,
$$

where $p$ is the probability of $X$ winning a single mean comparison and $n$ is the number of comparisons (see [Appendix]).
However, although $p$ can in principle be determined, it becomes intractable as the number of elements in $\Omega$ and the sequence lengths grow.

## Predicting Choice Behavior in DfE

Hills and Hertwig [-@hillsInformationSearchDecisions2010] proposed the two different sampling strategies in combination with the respective decision strategies, i.e., piecewise sampling and round-wise comparison vs. comprehensive sampling and summary comparison, as an explanation for different choice patterns in DfE.
How does the current version of the stochastic sampling model (SSM) support this proposition?
Given prospects $X$ and $Y$, the sample spaces $S = \left\{\frac{\overline{X}_{N_X}} {\overline{Y}_{N_Y}}\right\}^{\mathbb{N}}$ can be varied by changes on three parameters, i.e., the number of comparisons $\mathbb{N}$ and the sample sizes $N_X$ and $N_Y$ on which these comparisons are based.
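
Where the probability $p$ of winning a single mean comparison is impractical to derive analytically, it can be approximated by simulation.
The following is a minimal sketch under stated assumptions: the safe-risky pair (`x_out`, `x_p`, `y_out`, `y_p`), the helper functions, and the number of replications are hypothetical and chosen only so that the simulated values can be checked against a known binomial result.

```r
set.seed(2)

# Hypothetical prospects (illustrative values, not taken from the test set)
x_out <- c(0, 10); x_p <- c(0.2, 0.8)  # prospect X: risky
y_out <- 7;        y_p <- 1            # prospect Y: safe

# Mean of n single samples; indices are sampled to avoid sample()'s
# special behavior for a numeric vector of length one
mean_of_samples <- function(out, p, n) {
  idx <- sample.int(length(out), size = n, replace = TRUE, prob = p)
  mean(out[idx])
}

# One ordinal mean comparison: D = 1 if mean(X) - mean(Y) > 0, else 0
compare_once <- function(n_x, n_y) {
  diff <- mean_of_samples(x_out, x_p, n_x) - mean_of_samples(y_out, y_p, n_y)
  as.integer(diff > 0)
}

# Monte Carlo estimates of p for two sequence lengths
p_hat_1 <- mean(replicate(20000, compare_once(1, 1)))
p_hat_5 <- mean(replicate(20000, compare_once(5, 5)))

# For this simple pair, p is still tractable:
# N = 1: X wins iff the single sample is 10
# N = 5: X wins iff at least 4 of 5 samples are 10
p_exact_1 <- 0.8
p_exact_5 <- pbinom(3, size = 5, prob = 0.8, lower.tail = FALSE)

# Accumulated evidence: number of wins over n = 10 independent comparisons
wins <- sum(replicate(10, compare_once(1, 1)))
```

For prospects with more outcomes or longer sequences, the exact values would be replaced by the simulated estimates alone.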
First, only considering the pure cases formulated by the above authors, the following restrictions are placed on the parameters:

$$
\mathbb{N} =
\begin{cases}
1 & \text{if Summary} \\
\geq 1 & \text{if Round-wise}
\end{cases}
$$

and

$$
N_X \, \text{and} \, N_Y =
\begin{cases}
\geq 1 & \text{if Summary} \\
1 & \text{if Round-wise}
\end{cases}
$$

For the summary strategy, the following prediction is obtained:

Given that

$$
P\left(\lim_{N_X \to \infty} \overline{X}_{N_X} = E(X) \right) =
P\left(\lim_{N_Y \to \infty} \overline{Y}_{N_Y} = E(Y) \right) =
1 \; ,
$$

we obtain that

$$
\left( \frac{\overline{X}_{N_X}} {\overline{Y}_{N_Y}} \right) \in S :
P\left(\lim_{N_X \to \infty} \lim_{N_Y \to \infty} \frac{\overline{X}_{N_X}} {\overline{Y}_{N_Y}} = \frac{E(X)} {E(Y)} \right ) = 1 \; .
$$

I.e., for the summary strategy, this implies that for increasing sample sizes $N_X$ and $N_Y$, the prospect with the larger expected value is chosen almost surely.

For the round-wise strategy, the following prediction is obtained:

Given that $N_X$ and $N_Y$ are set to 1, $D$ follows the binomial distribution

$$
B(k \, | \, p_X, \mathbb{N}) \; ,
$$

where $p_X$ is the probability that a single sample of prospect $X$ is larger than a single sample of prospect $Y$, $\mathbb{N}$ is the number of comparisons and $k$ is the number of times $x \in X$ is larger than $y \in Y$.

*Proof.*

For $N_X = N_Y = 1$, the sample space is

$$
\left\{\frac{\overline{X}_{N_X = 1}} {\overline{Y}_{N_Y = 1}} \right\}^{\mathbb{N}} =
\left\{\frac{x_i \in \Omega'_X} {y_j \in \Omega'_Y}\right\}^{\mathbb{N}}
$$

# Method

## Test set

Under each condition, i.e., each strategy-parameter combination, all gambles are played by 100 synthetic agents.
We test a set of gambles in which one of the prospects contains a safe outcome and the other two risky outcomes (*safe-risky gambles*).
For this purpose, 60 gambles are sampled from an initial set of 10,000.
Both outcomes and probabilities are drawn from uniform distributions, ranging from 0 to 20 for outcomes and from .01 to .99 for probabilities of the lower risky outcomes $p_{low}$.
The probabilities of the higher risky outcomes are $1-p_{low}$, respectively.
To omit dominant prospects, safe outcomes fall between both risky outcomes.
The table below contains the test set of 60 gambles.
Sampling of gambles was stratified, randomly drawing an equal number of 20 gambles with no, an attractive, and an unattractive rare outcome.
Risky outcomes are considered *"rare"* if their probability is $p < .2$ and *"attractive"* (*"unattractive"*) if they are higher (lower) than the safe outcome.

```{r message=FALSE}
gambles <- read_csv("data/gambles/sr_subset.csv")
gambles %>% kable()
```

## Model Parameters

**Switching probability** $s$ is the probability with which agents draw the following single sample from the prospect they did not get their most recent single sample from.
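
As a rough illustration of the switching probability, the sketch below generates the sequence of attended prospects for a given $s$.
The function names, the prospect labels `"A"` and `"B"`, and the random choice of the first prospect are assumptions made for this example and do not necessarily correspond to the simulation code used for the reported data.

```r
set.seed(3)

# Sequence of attended prospects under switching probability s
# (assumption: the first prospect is chosen at random with equal probability)
sample_path <- function(n, s) {
  prospects <- c("A", "B")
  attended <- character(n)
  attended[1] <- sample(prospects, 1)
  for (i in seq_len(n)[-1]) {
    prev <- attended[i - 1]
    switch_now <- runif(1) < s  # switch with probability s
    attended[i] <- if (switch_now) setdiff(prospects, prev) else prev
  }
  attended
}

# Proportion of single samples that were preceded by a switch
switch_rate <- function(path) mean(path[-1] != path[-length(path)])

path_low  <- sample_path(1000, s = 0.1)  # long runs on the same prospect
path_high <- sample_path(1000, s = 1)    # strict alternation between prospects
```

With $s = 1$, agents alternate between prospects after every single sample, whereas low values of $s$ produce long runs of single samples from the same prospect.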
$s$ is varied from .1 to 1 in increments of .1.

The **boundary type** is either the minimum value any prospect's sample statistic must reach (absolute) or the minimum value for the difference of these statistics (relative).
Sample statistics are sums over outcomes (comprehensive strategy) and sums over wins (piecewise strategy), respectively.

For comprehensive integration, the **boundary value** $a$ is varied from 15 to 75 in increments of 15.
For piecewise integration, $a$ is varied from 1 to 5 in increments of 1.

```{r message=FALSE}
# read choice data
cols <- list(.default = col_double(),
             strategy = col_factor(),
             boundary = col_factor(),
             gamble = col_factor(),
             rare = col_factor(),
             agent = col_factor(),
             choice = col_factor())
choices <- read_csv("data/choices/choices.csv", col_types = cols)
```

In sum, 2 (strategies) x 60 (gambles) x 100 (agents) x 100 (parameter combinations) = `r nrow(choices)` choices are simulated.

# Results

Because we are not interested in deviations from normative choice due to sampling artifacts (e.g., ceiling effects produced by low boundaries), we remove trials in which only one prospect was attended.
In addition, we use relative frequencies of sampled outcomes rather than 'a priori' probabilities to compare actual against normative choice behavior.

```{r}
# remove choices where prospects were not attended
choices <- choices %>%
  filter(!(is.na(a_ev_exp) | is.na(b_ev_exp)))
```

```{r eval = FALSE}
# remove choices where not all outcomes were sampled
choices <- choices %>%
  filter(!(is.na(a_ev_exp) | is.na(b_ev_exp) | a_p1_exp == 0 | a_p2_exp == 0))
```

Removing the respective trials, we are left with `r nrow(choices)` choices.

## Sample Size

```{r message=FALSE}
samples <- choices %>%
  group_by(strategy, s, boundary, a) %>%
  summarise(n_med = median(n_sample))
samples_piecewise <- samples %>% filter(strategy == "piecewise")
samples_comprehensive <- samples %>% filter(strategy == "comprehensive")
```

The median sample sizes generated by different parameter combinations ranged from `r min(samples_piecewise$n_med)` to `r max(samples_piecewise$n_med)` for piecewise integration and `r min(samples_comprehensive$n_med)` to `r max(samples_comprehensive$n_med)` for comprehensive integration.

### Boundary type and boundary value (a)

As evidence is accumulated sequentially, relative boundaries and large boundary values naturally lead to larger sample sizes, irrespective of the integration strategy.
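
Why relative boundaries require more evidence than absolute boundaries of the same value can be sketched with a simplified accumulation of round-wise wins.
The win probability `p_win`, the boundary value, and the assumption that every round produces a winner (no ties) are hypothetical simplifications for this example and do not reproduce the simulation reported here.

```r
set.seed(4)

# Hypothetical probability that prospect A wins a single round-wise comparison
p_win <- 0.6

# Number of rounds until the boundary a is reached
rounds_until_boundary <- function(a, type = c("absolute", "relative")) {
  type <- match.arg(type)
  wins_a <- 0
  wins_b <- 0
  rounds <- 0
  repeat {
    rounds <- rounds + 1
    if (runif(1) < p_win) wins_a <- wins_a + 1 else wins_b <- wins_b + 1
    reached <- if (type == "absolute") {
      max(wins_a, wins_b) >= a       # one prospect's wins reach a
    } else {
      abs(wins_a - wins_b) >= a      # difference in wins reaches a
    }
    if (reached) return(rounds)
  }
}

med_abs <- median(replicate(2000, rounds_until_boundary(3, "absolute")))
med_rel <- median(replicate(2000, rounds_until_boundary(3, "relative")))
c(absolute = med_abs, relative = med_rel)
```

Because a difference of $a$ wins implies that at least one prospect has accumulated $a$ wins, the relative boundary can never be reached earlier than the absolute boundary on the same sequence of comparisons.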

```{r message=FALSE}
group_med <- samples_piecewise %>%
  group_by(boundary, a) %>%
  summarise(group_med = median(n_med)) # to get the median across all s values

samples_piecewise %>%
  ggplot(aes(a, n_med, color = a)) +
  geom_jitter(alpha = .5, size = 2) +
  geom_point(data = group_med, aes(y = group_med), size = 3) +
  facet_wrap(~boundary) +
  scale_color_viridis() +
  labs(title = "Piecewise Integration",
       x = "a",
       y = "Sample Size",
       col = "a") +
  theme_minimal()
```

```{r message=FALSE}
group_med <- samples_comprehensive %>%
  group_by(boundary, a) %>%
  summarise(group_med = median(n_med)) # to get the median across all s values

samples_comprehensive %>%
  ggplot(aes(a, n_med, color = a)) +
  geom_jitter(alpha = .5, size = 2) +
  geom_point(data = group_med, aes(y = group_med), size = 3) +
  facet_wrap(~boundary) +
  scale_color_viridis() +
  labs(title = "Comprehensive Integration",
       x = "a",
       y = "Sample Size",
       col = "a") +
  theme_minimal()
```

### Switching probability (s)

For piecewise integration, there is an inverse relationship between switching probability and sample size.
I.e., the lower $s$, the less frequently prospects are compared and thus, boundaries are only approached with larger sample sizes.
This effect is particularly pronounced for low probabilities such that the increase in sample size accelerates as switching probability decreases.

```{r message=FALSE}
group_med <- samples_piecewise %>%
  group_by(boundary, s) %>%
  summarise(group_med = median(n_med)) # to get the median across all a values

samples_piecewise %>%
  ggplot(aes(s, n_med, color = s)) +
  geom_jitter(alpha = .5, size = 2) +
  geom_point(data = group_med, aes(y = group_med), size = 3) +
  facet_wrap(~boundary) +
  scale_color_viridis() +
  labs(title = "Piecewise Integration",
       x = "s",
       y = "Sample Size",
       col = "s") +
  theme_minimal()
```

For comprehensive integration, boundary types differ in the effects of switching probability.
For absolute boundaries, switching probability has no apparent effect on sample size as the distance of a given prospect to its absolute boundary is not changed by switching to (and sampling from) the other prospect.
For relative boundaries, however, sample sizes increase with switching probability.

```{r message=FALSE}
group_med <- samples_comprehensive %>%
  group_by(boundary, s) %>%
  summarise(group_med = median(n_med)) # to get the median across all a values

samples_comprehensive %>%
  ggplot(aes(s, n_med, color = s)) +
  geom_jitter(alpha = .5, size = 2) +
  geom_point(data = group_med, aes(y = group_med), size = 3) +
  facet_wrap(~boundary) +
  scale_color_viridis() +
  labs(title = "Comprehensive Integration",
       x = "s",
       y = "Sample Size",
       col = "s") +
  theme_minimal()
```

## Choice Behavior

Below, in extension to Hills and Hertwig [-@hillsInformationSearchDecisions2010], we investigate how integration strategies, gamble features, and model parameters jointly affect choice behavior in general and contribute to the underweighting of rare events in particular.
We apply two definitions of underweighting of rare events:
Considering false response rates, we define underweighting such that the rarity of an attractive (unattractive) outcome leads agents to choose the safe (risky) prospect although the risky (safe) prospect has a higher expected value.

```{r message=FALSE}
fr_rates <- choices %>%
  mutate(ev_ratio_exp = round(a_ev_exp/b_ev_exp, 2),
         norm = case_when(ev_ratio_exp > 1 ~ "A",
                          ev_ratio_exp < 1 ~ "B")) %>%
  filter(!is.na(norm)) %>% # exclude trials with normative indifferent options
  group_by(strategy, s, boundary, a, rare, norm, choice) %>% # group correct and incorrect responses
  summarise(n = n()) %>% # absolute numbers
  mutate(rate = round(n/sum(n), 2), # response rates
         type = case_when(norm == "A" & choice == "B" ~ "false safe",
                          norm == "B" & choice == "A" ~ "false risky")) %>%
  filter(!is.na(type)) # remove correct responses
```

Considering the parameters of Prelec's [-@prelecProbabilityWeightingFunction1998] implementation of the weighting function [CPT; cf. @tverskyAdvancesProspectTheory1992], underweighting is reflected by decision weights estimated to be smaller than the corresponding objective probabilities.

### False Response Rates

```{r message=FALSE}
fr_rates_piecewise <- fr_rates %>% filter(strategy == "piecewise")
fr_rates_comprehensive <- fr_rates %>% filter(strategy == "comprehensive")
```

The false response rates generated by different parameter combinations ranged from `r min(fr_rates_piecewise$rate)` to `r max(fr_rates_piecewise$rate)` for piecewise integration and from `r min(fr_rates_comprehensive$rate)` to `r max(fr_rates_comprehensive$rate)` for comprehensive integration.
However, false response rates vary considerably as a function of rare events, indicating that their presence and attractiveness are major determinants of false response rates.

```{r message=FALSE}
fr_rates %>%
  group_by(strategy, boundary, rare) %>%
  summarise(min = min(rate),
            max = max(rate)) %>%
  kable()
```

The heatmaps below show the false response rates for all strategy-parameter combinations.
Consistent with our - somewhat rough - definition of underweighting, the rate of false risky responses is generally higher if the unattractive outcome of the risky prospect is rare (top panel).
Conversely, if the attractive outcome of the risky prospect is rare, the rate of false safe responses is generally higher (bottom panel).
As indicated by the larger range of false response rates, the effects of rare events are considerably larger for piecewise integration.

```{r message=FALSE}
fr_rates %>%
  filter(strategy == "piecewise", boundary == "absolute") %>%
  ggplot(aes(a, s, fill = rate)) +
  facet_grid(type ~ fct_relevel(rare, "attractive", "none", "unattractive"), switch = "y") +
  geom_tile(colour="white", size=0.25) +
  scale_x_continuous(expand=c(0,0), breaks = seq(1, 5, 1)) +
  scale_y_continuous(expand=c(0,0), breaks = seq(.1, 1, .1)) +
  scale_fill_viridis() +
  labs(title = "Piecewise Integration | Absolute Boundary",
       x = "a",
       y= "s",
       fill = "% False Responses") +
  theme_minimal()
```

```{r message=FALSE}
fr_rates %>%
  filter(strategy == "piecewise", boundary == "relative") %>%
  ggplot(aes(a, s, fill = rate)) +
  facet_grid(type ~ fct_relevel(rare, "attractive", "none", "unattractive"), switch = "y") +
  geom_tile(colour="white", size=0.25) +
  scale_x_continuous(expand=c(0,0), breaks = seq(1, 5, 1)) +
  scale_y_continuous(expand=c(0,0), breaks = seq(.1, 1, .1)) +
  scale_fill_viridis() +
  labs(title = "Piecewise Integration | Relative Boundary",
       x = "a",
       y= "s",
       fill = "% False Responses") +
  theme_minimal()
```

```{r message=FALSE}
fr_rates %>%
  filter(strategy == "comprehensive", boundary == "absolute") %>%
  ggplot(aes(a, s, fill = rate)) +
  facet_grid(type ~ fct_relevel(rare, "attractive", "none", "unattractive"), switch = "y") +
  geom_tile(colour="white", size=0.25) +
  scale_x_continuous(expand=c(0,0), breaks = seq(15, 75, 15)) +
  scale_y_continuous(expand=c(0,0), breaks = seq(.1, 1, .1)) +
  scale_fill_viridis() +
  labs(title = "Comprehensive Integration | Absolute Boundary",
       x = "a",
       y= "s",
       fill = "% False Responses") +
  theme_minimal()
```

```{r message=FALSE}
fr_rates %>%
  filter(strategy == "comprehensive", boundary == "relative") %>%
  ggplot(aes(a, s, fill = rate)) +
  facet_grid(type ~ fct_relevel(rare, "attractive", "none", "unattractive"), switch = "y") +
  geom_tile(colour="white", size=0.25) +
  scale_x_continuous(expand=c(0,0), breaks = seq(15, 75, 15)) +
  scale_y_continuous(expand=c(0,0), breaks = seq(.1, 1, .1)) +
  scale_fill_viridis() +
  labs(title = "Comprehensive Integration | Relative Boundary",
       x = "a",
       y= "s",
       fill = "% False Responses") +
  theme_minimal()
```

#### Switching Probability (s) and Boundary Value (a)

Because, for both piecewise and comprehensive integration, the differences between boundary types are rather minor and concern magnitude rather than qualitative pattern, the remaining analyses of false response rates are summarized across absolute and relative boundaries.

Below, the $s$ and $a$ parameters are considered as additional sources of variation in the false response pattern above and beyond the interplay of integration strategies and the rarity and attractiveness of outcomes.

```{r message=FALSE}
fr_rates %>%
  filter(strategy == "piecewise") %>%
  ggplot(aes(s, rate, color = a)) +
  facet_grid(type ~ fct_relevel(rare, "attractive", "none", "unattractive"), switch = "y") +
  geom_jitter(size = 2) +
  scale_x_continuous(breaks = seq(0, 1, .1)) +
  scale_y_continuous(breaks = seq(0, 1, .1)) +
  scale_color_viridis() +
  labs(title = "Piecewise Integration",
       x = "s",
       y= "% False Responses",
       color = "a") +
  theme_minimal()
```

```{r message=FALSE}
fr_rates %>%
  filter(strategy == "comprehensive") %>%
  ggplot(aes(s, rate, color = a)) +
  facet_grid(type ~ fct_relevel(rare, "attractive", "none", "unattractive"), switch = "y") +
  geom_jitter(size = 2) +
  scale_x_continuous(breaks = seq(0, 1, .1)) +
  scale_y_continuous(breaks = seq(0, 1, .1)) +
  scale_color_viridis() +
  labs(title = "Comprehensive Integration",
       x = "s",
       y= "% False Responses",
       color = "a") +
  theme_minimal()
```

For piecewise integration, switching probability is naturally related to the size of the samples on which the round-wise comparisons of prospects are based, with low values of $s$ indicating large samples and vice versa.
Accordingly, switching probability is positively related to false response rates.
That is, the larger the switching probability, the smaller the round-wise sample size and, in turn, the probability of experiencing a rare event within a given round.
Because round-wise comparisons are independent of each other and binomial distributions within a given round are skewed for small samples and small outcome probabilities [@kolmogorovFoundationsTheoryProbability1950], increasing boundary values do not reverse but rather amplify this relation.

For comprehensive integration, switching probability is negatively related to false response rates, i.e., an increase in $s$ is associated with decreasing false response rates.
This relation, however, may be the result of an artificial interaction between the $s$ and $a$ parameters.
Specifically, in the current algorithmic implementation of sampling with a comprehensive integration mechanism, decreasing switching probabilities cause prospects to be compared on the basis of increasingly unequal sample sizes immediately after switching prospects.
Consequently, reaching (low) boundaries is a function of switching probability and the associated sample sizes rather than of actual evidence for a given prospect over the other.

### Cumulative Prospect Theory

In the following, we examine the possible relations between the parameters of the *choice-generating* sampling models and the *choice-describing* cumulative prospect theory.

For each distinct strategy-parameter combination, we ran 20 chains of 40,000 iterations each, after a warm-up period of 1,000 samples.
To reduce potential autocorrelation during the sampling process, we only kept every 20th sample (thinning).
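
The number of retained posterior draws follows directly from these settings. The sketch below works through the arithmetic, assuming that the 40,000 iterations per chain are counted after the warm-up period; the variable names are illustrative and not taken from the estimation script.

```r
# Assumption: 40,000 post-warm-up iterations per chain (names are illustrative).
n_chains <- 20      # number of chains
n_iter   <- 40000   # iterations per chain after warm-up
n_thin   <- 20      # thinning interval: keep every 20th iteration

kept_idx       <- seq(n_thin, n_iter, by = n_thin)  # retained iterations within one chain
kept_per_chain <- length(kept_idx)                  # retained draws per chain
kept_total     <- n_chains * kept_per_chain         # retained draws across all chains
```

Under this reading, each estimate rests on 2,000 retained draws per chain, i.e., 40,000 draws in total.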

```{r}
# read CPT data
cols <- list(.default = col_double(),
             strategy = col_factor(),
             boundary = col_factor(),
             parameter = col_factor())
estimates <- read_csv("data/estimates/estimates_cpt_pooled_goldstein-einhorn-87.csv", col_types = cols)
```

#### Convergence

```{r}
gel_92 <- max(estimates$Rhat) # get largest scale reduction factor (Gelman & Rubin, 1992)
```

The potential scale reduction factor $\hat{R}$ was $\leq$ `r round(gel_92, 3)` for all estimates, indicating good convergence.

#### Piecewise Integration

```{r}
# generate subset of all strategy-parameter combinations (rows) and their parameters (columns)
curves_cpt <- estimates %>%
  select(strategy, s, boundary, a, parameter, mean) %>%
  pivot_wider(names_from = parameter, values_from = mean)
```

##### Weighting function w(p)

We start by plotting the weighting curves for all parameter combinations under piecewise integration.

```{r}
cpt_curves_piecewise <- curves_cpt %>%
  filter(strategy == "piecewise") %>%
  expand_grid(p = seq(0, 1, .1)) %>% # add vector of objective probabilities
  mutate(w = round(exp(-delta*(-log(p))^gamma), 2)) # compute decision weights (cf. Prelec, 1998)

# all strategy-parameter combinations
cpt_curves_piecewise %>%
  ggplot(aes(p, w)) +
  geom_path(size = .5) +
  geom_abline(intercept = 0, slope = 1, color = "red", size = 1) +
  labs(title = "Piecewise Integration: Weighting functions",
       x = "p",
       y= "w(p)") +
  theme_minimal()
```

```{r}
cpt_curves_piecewise %>%
  ggplot(aes(p, w)) +
  geom_path() +
  geom_abline(intercept = 0, slope = 1, color = "red", size = 1) +
  facet_wrap(~a) +
  labs(title = "Piecewise Integration: Weighting functions",
       x = "p",
       y= "w(p)",
       color = "Switching Probability") +
  scale_color_viridis() +
  theme_minimal()
```

```{r}
cpt_curves_piecewise %>%
  ggplot(aes(p, w, color = s)) +
  geom_path() +
  geom_abline(intercept = 0, slope = 1, color = "red", size = 1) +
  labs(title = "Piecewise Integration: Weighting functions",
       x = "p",
       y= "w(p)",
       color = "Switching Probability") +
  scale_color_viridis() +
  theme_minimal()
```

```{r}
cpt_curves_piecewise %>%
  ggplot(aes(p, w, color = s)) +
  geom_path() +
  geom_abline(intercept = 0, slope = 1, color = "red", size = 1) +
  facet_wrap(~a) +
  labs(title = "Piecewise Integration: Weighting functions",
       x = "p",
       y= "w(p)",
       color = "Switching Probability") +
  scale_color_viridis() +
  theme_minimal()
```

##### Value function v(x)

```{r}
cpt_curves_piecewise <- curves_cpt %>%
  filter(strategy == "piecewise") %>%
  expand_grid(x = seq(0, 20, 2)) %>% # add vector of objective outcomes
  mutate(v = round(x^alpha, 2)) # compute subjective values (power value function)

# all strategy-parameter combinations
cpt_curves_piecewise %>%
  ggplot(aes(x, v)) +
  geom_path(size = .5) +
  geom_abline(intercept = 0, slope = 1, color = "red", size = 1) +
  labs(title = "Piecewise Integration: Value functions",
       x = "x",
       y= "v(x)") +
  theme_minimal()
```

```{r}
cpt_curves_piecewise %>%
  ggplot(aes(x, v, color = s)) +
  geom_path(size = .5) +
  geom_abline(intercept = 0, slope = 1, color = "red", size = 1) +
  labs(title = "Piecewise Integration: Value functions",
       x = "x",
       y= "v(x)") +
  scale_color_viridis() +
  theme_minimal()
```

```{r}
cpt_curves_piecewise %>%
  ggplot(aes(x, v, color = s)) +
  geom_path(size = .5) +
  geom_abline(intercept = 0, slope = 1, color = "red", size = 1) +
  facet_wrap(~a) +
  labs(title = "Piecewise Integration: Value functions",
       x = "x",
       y= "v(x)") +
  scale_color_viridis() +
  theme_minimal()
```

#### Comprehensive Integration

##### Weighting function w(p)

We start by plotting the weighting curves for all parameter combinations under comprehensive integration.

```{r}
cpt_curves_comprehensive <- curves_cpt %>%
  filter(strategy == "comprehensive") %>%
  expand_grid(p = seq(0, 1, .1)) %>% # add vector of objective probabilities
  mutate(w = round(exp(-delta*(-log(p))^gamma), 2)) # compute decision weights (cf. Prelec, 1998)

# all strategy-parameter combinations
cpt_curves_comprehensive %>%
  ggplot(aes(p, w)) +
  geom_path(size = .5) +
  geom_abline(intercept = 0, slope = 1, color = "red", size = 1) +
  labs(title = "Comprehensive Integration: Weighting functions",
       x = "p",
       y= "w(p)") +
  theme_minimal()
```

```{r}
cpt_curves_comprehensive %>%
  ggplot(aes(p, w)) +
  geom_path(size = .5) +
  geom_abline(intercept = 0, slope = 1, color = "red", size = 1) +
  labs(title = "Comprehensive Integration: Weighting functions",
       x = "p",
       y= "w(p)") +
  facet_wrap(~a) +
  theme_minimal()
```

```{r}
cpt_curves_comprehensive %>%
  ggplot(aes(p, w, color = s)) +
  geom_path() +
  geom_abline(intercept = 0, slope = 1, color = "red", size = 1) +
  labs(title = "Comprehensive Integration: Weighting functions",
       x = "p",
       y= "w(p)",
       color = "Switching Probability") +
  scale_color_viridis() +
  theme_minimal()
```

```{r}
cpt_curves_comprehensive %>%
  ggplot(aes(p, w, color = s)) +
  geom_path() +
  geom_abline(intercept = 0, slope = 1, color = "red", size = 1) +
  facet_wrap(~a) +
  labs(title = "Comprehensive Integration: Weighting functions",
       x = "p",
       y= "w(p)",
       color = "Switching Probability") +
  scale_color_viridis() +
  theme_minimal()
```

```{r}
cpt_curves_comprehensive %>%
  filter(s >= .7) %>%
  ggplot(aes(p, w, color = s)) +
  geom_path() +
  geom_abline(intercept = 0, slope = 1, color = "red", size = 1) +
  facet_wrap(~a) +
  labs(title = "Comprehensive Integration: Weighting functions",
       x = "p",
       y= "w(p)",
       color = "Switching Probability") +
  scale_color_viridis() +
  theme_minimal()
```

##### Value function v(x)

```{r}
cpt_curves_comprehensive <- curves_cpt %>%
  filter(strategy == "comprehensive") %>%
  expand_grid(x = seq(0, 20, 2)) %>% # add vector of objective outcomes
  mutate(v = round(x^alpha, 2)) # compute subjective values (power value function)

# all strategy-parameter combinations
cpt_curves_comprehensive %>%
  ggplot(aes(x, v)) +
  geom_path(size = .5) +
  geom_abline(intercept = 0, slope = 1, color = "red", size = 1) +
  labs(title = "Comprehensive Integration: Value functions",
       x = "x",
       y= "v(x)") +
  theme_minimal()
```

```{r}
cpt_curves_comprehensive %>%
  ggplot(aes(x, v)) +
  geom_path(size = .5) +
  geom_abline(intercept = 0, slope = 1, color = "red", size = 1) +
  facet_wrap(~a) +
  labs(title = "Comprehensive Integration: Value functions",
       x = "x",
       y= "v(x)") +
  theme_minimal()
```

```{r}
cpt_curves_comprehensive %>%
  ggplot(aes(x, v, color = s)) +
  geom_path(size = .5) +
  geom_abline(intercept = 0, slope = 1, color = "red", size = 1) +
  labs(title = "Comprehensive Integration: Value functions",
       x = "x",
       y= "v(x)") +
  scale_color_viridis() +
  theme_minimal()
```

```{r}
cpt_curves_comprehensive %>%
  ggplot(aes(x, v, color = s)) +
  geom_path(size = .5) +
  geom_abline(intercept = 0, slope = 1, color = "red", size = 1) +
  facet_wrap(~a) +
  labs(title = "Comprehensive Integration: Value functions",
       x = "x",
       y= "v(x)") +
  scale_color_viridis() +
  theme_minimal()
```

# Discussion

# Conclusion

# Appendix

Let

$$
X_1, ..., X_{N_X}
$$

and

$$
Y_1, ..., Y_{N_Y}
$$

be sequences of independent and identically distributed (iid) random variables.
Then

$$
P\left(\frac{\overline{X}_{N_X}}{\overline{Y}_{N_Y}} > 0 \right) =
P\left(\frac{\frac{1}{N_X}\sum\limits_{n=1}^{N_X} (X(\omega_i) = A'_X \in \mathscr{F'}_X)_n}
{\frac{1}{N_Y}\sum\limits_{m=1}^{N_Y} (Y(\omega_j) = A'_Y \in \mathscr{F'}_Y)_m} > 0 \right)
$$

is the probability that the quotient of the means of both sequences takes on a value larger than $0$.
Given the sample sizes $N_X = N_Y = 1$, the equation reduces to

$$
P\left(\frac{X(\omega_i \in \Omega_X) = A'_X \in \mathscr{F'}_X}{Y(\omega_j \in \Omega_Y) = A'_Y \in \mathscr{F'}_Y} > 0 \right)
\; ,
$$

which is the sum across all joint probabilities $p(\omega_i \cap \omega_j)$ for which the above inequality holds.

$$
D:= f \left( \frac{\overline{X}_{N_X}} {\overline{Y}_{N_Y}} \right) =
\begin{cases}
1 & \text{if } \frac{\overline{X}_{N_X}}{\overline{Y}_{N_Y}} > 0 \in \mathscr{D} \\
0 & \text{else.}
\end{cases}
$$

# References