Some of the `R code` is folded but can be unfolded by clicking the `Code` buttons.

```{r}
# load packages
pacman::p_load(tidyverse,
               knitr)
```

# Prospects

Let a prospect be a *probability space* $(\Omega, P)$, where $\Omega$ is the *sample space* containing a finite set of possible outcomes $\{\omega_1, ..., \omega_n\}$ [cf. @kolmogorovFoundationsTheoryProbability1950]. $P$ is then a *probability mass function* (PMF) $P: \Omega \to [0,1]$ that assigns each outcome $\omega_i$ a probability $0 \leq p(\omega_i) \leq 1$ with $\sum_{i=1}^{n} p(\omega_i) = 1$.
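This definition can be sketched directly in `R`; the outcome and probability vectors below are illustrative values, not part of the gambles generated later.

```{r}
# a prospect as a finite sample space Omega with a PMF P (illustrative values)
outcomes <- c(0, 10)    # possible outcomes, gain range 0 to 10
probs    <- c(0.2, 0.8) # P assigns each outcome a probability

stopifnot(all(probs >= 0), all(probs <= 1), sum(probs) == 1) # PMF conditions

ev <- sum(outcomes * probs) # expected value of the prospect
ev # 8
```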

## Example

Below, each row of the table represents a choice problem with a risky prospect `A` and a safe prospect `B` (one outcome only, a "sure thing"), where each outcome falls in the gain range of 0 to 10. The variable suffixes `_p` and `_o` and the variable `ev` denote probabilities, outcomes, and expected values, respectively.

```{r}
source("./functions/fun_gambles.R") # call generate_gambles() function
```

In *decisions from experience* [DfE; @hertwigDecisionsExperienceEffect2004], where no summary description of the prospects' probability spaces is provided, agents can either first explore the prospects before arriving at a final choice (*sampling paradigm*), or explore and exploit simultaneously (*partial-* or *full-feedback paradigm*) [cf. @hertwigDescriptionExperienceGap2009]. Below, only the sampling paradigm is considered.

## Sampling strategies

In the context of gambles, a *single sample* represents an outcome obtained when randomly drawing from a prospect's sample space $\Omega$. Thus, a single sample is the realization of a discrete random variable $X$ defined on $(\Omega, P)$, which can take the value of any real-valued outcome in $\Omega$ according to $P$:

$$\begin{equation}
X: \Omega \to \mathbb{R}
\end{equation}$$

In general terms, we define a *sampling strategy* as a systematic approach to generate a sequence of single samples from a gamble's prospects as a means of exploring their probability spaces. Single samples that are generated from the same prospect reflect a sequence of realizations of random variables that are independent and identically distributed.
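Drawing single samples can be sketched with base R's `sample()`; the outcome and probability vectors are again illustrative:

```{r}
set.seed(1)
outcomes <- c(0, 10)    # illustrative outcomes of a prospect
probs    <- c(0.2, 0.8) # illustrative PMF

# five i.i.d. realizations of X, i.e., five single samples from the same prospect
single_smpls <- sample(outcomes, size = 5, replace = TRUE, prob = probs)
single_smpls
```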

## Comprehensive sampling strategy

In *comprehensive sampling* [@hillsInformationSearchDecisions2010], all single samples from one prospect are drawn in direct succession before sampling from another prospect.

### Integration and decision strategy

In comprehensive sampling, single samples of the same prospect are assumed to be integrated into an empirical outcome distribution. This can be considered a special instance of the more general case that, irrespective of the sampling strategy, realizations of the *random variables of interest* are integrated into a prospect's frequency distribution. We then know from the law of large numbers that the relative frequencies of outcomes in these distributions should approximate the probabilities of the respective random variable as the sample size increases. For comprehensive sampling, it follows that the mean of the frequency distribution associated with a prospect approximates its expected value (EV). Prospects are consequently assumed to be chosen on the basis of a mean comparison [*"summary strategy"*, @hillsInformationSearchDecisions2010].
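The convergence claim can be illustrated directly: assuming i.i.d. draws from an illustrative prospect, the mean of the empirical outcome distribution approaches the EV as the number of single samples grows.

```{r}
set.seed(42)
outcomes <- c(0, 10)
probs    <- c(0.2, 0.8)
ev       <- sum(outcomes * probs) # expected value: 8

draws     <- sample(outcomes, size = 10000, replace = TRUE, prob = probs)
cum_means <- cumsum(draws) / seq_along(draws) # mean of the frequency distribution after each draw

abs(cum_means[10000] - ev) # deviation from the EV shrinks as the sample size grows
```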

### Example

A synthetic agent applies a comprehensive sampling strategy to explore the probability spaces of the first gamble (see above) and applies the associated integration and decision strategy. For demonstrative purposes, it is assumed that the agent draws five consecutive single samples from each prospect.

The table below summarizes the simulated process. Each row represents a single sample drawn from one of the prospects. Outcomes associated with a single sample are given in columns `A` and `B`. `A_mean` and `B_mean` represent the cumulative means across outcomes. `diff` is the difference of the means. `choice` indicates which prospect is chosen on the basis of a mean comparison.

```{r}
source("./functions/fun_moving_stats.R") # call functions cumsum2() and cummean2()

fd <- tibble() # frequency distribution (start sampling in a state of ignorance)
n_smpls <- 5 # number of single samples

set.seed(345)

# draw series of single samples from prospect A
for (i in seq_len(n_smpls)) {
  single_smpl <- gambles[1, ] %>% # get gamble features
    # draw one outcome of A according to its probabilities; the column names
    # a_o1/a_o2 and a_p1/a_p2 are assumed and must match the generate_gambles() output
    summarise(A = sample(c(a_o1, a_o2), size = 1, prob = c(a_p1, a_p2))) %>%
    pull(A)
  fd <- bind_rows(fd, tibble(A = single_smpl)) # append realization to the distribution
}
```

Assuming a perfectly unnoisy sampling, integration, and decision process, the synthetic agent chooses prospect `A` over prospect `B`.

## Piecewise sampling strategy

In *piecewise sampling* [@hillsInformationSearchDecisions2010], single samples from different prospects are drawn in direct succession.

### Integration and decision strategy

In piecewise sampling, it is assumed that single samples of different prospects are compared against each other [@hillsInformationSearchDecisions2010]. Here, we define a new discrete random variable on a probability space $(\Omega, \Sigma, P)$, where $\Omega$ is the set of all possible combinations of outcomes from a gamble's different prospects, each written as a fraction. $\Sigma$ is a set of subsets of $\Omega$, i.e., the event space $\{\varsigma_1, ...,\varsigma_n\}$, and $P$ is the joint probability mass function of the gamble's prospects. The random variable maps $\Omega$ to the measurable space $E = \{0, 1\}$ as follows:

$$\begin{equation}
X(\omega_i) = \begin{cases}
0, & \text{if } \omega_i \in \varsigma_s, \text{ i.e., } \omega_i \leq 1, \\
1, & \text{if } \omega_i \in \varsigma_g, \text{ i.e., } \omega_i > 1,
\end{cases}
\end{equation}$$

where subset $\varsigma_s$ contains all fractions $\omega_i \leq 1$, i.e., the outcome of a given prospect is smaller than or equal to the outcome of the other prospect, and $\varsigma_g$ contains all $\omega_i > 1$. Since the measurable space consists of only two values $\{0, 1\}$, in piecewise sampling the frequency distribution of the random variable of interest (i.e., win vs. no win) is always a Bernoulli distribution, irrespective of the number of different outcomes of a prospect. Prospects are consequently assumed to be chosen on the basis of a comparison of the number of wins [*"round-wise strategy"*, @hillsInformationSearchDecisions2010].
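One round of piecewise sampling then yields a single Bernoulli realization; a minimal sketch with illustrative outcome values:

```{r}
# one round: one single sample from each prospect (illustrative values)
a_smpl <- 7
b_smpl <- 5

omega_i <- a_smpl / b_smpl   # combination of outcomes written as a fraction
x <- as.integer(omega_i > 1) # realization of X: 1 if in the winning subset, else 0
x # 1
```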

### Example

By alternating back and forth between prospects `A` and `B`, a synthetic agent applies a piecewise sampling strategy (and the associated integration and decision strategy) while exploring the probability spaces of the same gamble as before under comprehensive sampling. Again, five single samples are drawn from each prospect.

The table below summarizes the simulated sampling process. Each row represents a single sample drawn from one of the prospects. Outcomes associated with a single sample are given in columns `A` and `B`. After every second single sample, i.e., once an outcome from both `A` and `B` has been drawn, the prospects are compared: `diff` is the difference between the outcomes; `A_win` and `B_win` denote which of the outcomes was larger, i.e., the realizations of the random variable. `A_sum` and `B_sum` denote the cumulative number of comparisons in favor of a prospect. `choice` indicates which prospect is chosen on the basis of all comparisons.
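The round-wise bookkeeping can be sketched in base R as follows; the prospect features are illustrative placeholders rather than the actual first gamble, so the resulting choice need not match the simulated one:

```{r}
set.seed(345)
n_rounds <- 5

a_o <- c(0, 10); a_p <- c(.5, .5) # risky prospect A (illustrative features)
b_o <- 4                          # safe prospect B (illustrative "sure thing")

A <- sample(a_o, n_rounds, replace = TRUE, prob = a_p) # single samples from A
B <- rep(b_o, n_rounds)                                # single samples from B

A_win <- as.integer(A > B) # realizations of the random variable per round
B_win <- as.integer(B > A)

choice <- if (sum(A_win) > sum(B_win)) "A" else "B" # round-wise strategy
choice
```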

Assuming a perfectly unnoisy sampling, integration, and decision process, the agent chooses prospect `B` over prospect `A`. Thus, as previously demonstrated by Hills and Hertwig [-@hillsInformationSearchDecisions2010], different sampling strategies can in theory produce different choices on the basis of the same set of single samples.

These eventual variations in choice behavior are assumed to originate from differences in the random experiments that are repeatedly performed under the comprehensive and the piecewise sampling approach. These differences in the underlying random process are assumed to interact with the structure of the environment, i.e., the features of a gamble's prospects, and with other aspects of sampling and decision behavior.

# Computational Framework for Sequential Sampling Strategies

Within the scope of this work, both comprehensive and piecewise sampling are assumed to be *sequential sampling* strategies: by generating sequences of single samples from a gamble's prospects, agents sequentially accumulate information about the probability spaces and thereby form a preference for one prospect over the other.

Below, the framework of comprehensive and piecewise sampling is extended to hybrids thereof, acknowledging that people are likely to apply sampling strategies that deviate from the pure cases [cf. @hillsInformationSearchDecisions2010], and accounting for modeling issues that arise when applying them to situations in which the number of single samples is not fixed *a priori*.

## Autonomous sampling and model parameters

The above proposition of a sequential sampling process does not require sampling to be *autonomous*, i.e., to proceed without an *a priori* fixed number of single samples. However, in many experimental paradigms, and arguably in the majority of real-world situations, sampling is autonomous. For such cases, it is assumed that the termination of the sampling process and the choice are determined by reaching a boundary at which a preference for one of the prospects can be formed.

### Boundaries

Such a boundary can be defined as the minimum value that a count or another summary statistic over the sequences of realized random variables must reach. We will compare different types of boundaries (absolute vs. relative) and introduce the boundary parameter $a$ (denoting the boundary value) into the computational model of sampling strategies.
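One possible reading of the two boundary types is the following termination check; the function and its evidence arguments are assumptions for illustration, not the model code:

```{r}
# has the sampling process reached the boundary? (assumed implementation)
boundary_reached <- function(evidence_a, evidence_b, a, boundary = c("absolute", "relative")) {
  boundary <- match.arg(boundary)
  if (boundary == "absolute") {
    max(evidence_a, evidence_b) >= a  # one prospect's evidence alone reaches a
  } else {
    abs(evidence_a - evidence_b) >= a # the evidence difference reaches a
  }
}

boundary_reached(3, 1, a = 3, boundary = "absolute") # TRUE
boundary_reached(3, 1, a = 3, boundary = "relative") # FALSE
```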

### Switching probability

For the reasons described above (autonomous sampling, deviations from the pure cases, etc.), we introduce a switching probability parameter $s$, i.e., the probability with which agents draw the successive single sample from the other prospect than the one their most recent single sample came from. As $s \to 0$, perfect comprehensive sampling is approximated; as $s \to 1$, perfect piecewise sampling is approximated.

As values of $s < 1$ allow for drawing consecutive single samples from the same prospect, it has been proposed elsewhere [see Notes in @hillsInformationSearchDecisions2010] that, for piecewise sampling, round-wise comparisons between prospects can also be made on the basis of the means of multiple single samples. As a downside, this somewhat complicates the definition of the respective random variable. As an upside, however, both sampling strategies can then be considered special instances of one another.

### Noise

The representation of the outcomes sampled from the probability spaces is assumed to be stochastic. Therefore, we add Gaussian noise $\epsilon \sim \mathcal{N}(0, \sigma)$ in units of the outcomes.
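A single noisy outcome representation under this assumption can be sketched as follows (the parameter and outcome values are illustrative):

```{r}
set.seed(1)
sigma   <- .1 # noise parameter (illustrative value)
outcome <- 10 # sampled outcome (illustrative value)

noisy_outcome <- outcome + rnorm(1, mean = 0, sd = sigma) # Gaussian noise in outcome units
noisy_outcome
```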

## Simulation

Below, `code` for the computational framework of both sampling strategies is displayed, including the parameters discussed above. However, parameter values are chosen arbitrarily and must be adapted according to the particularities of the investigation.

```{r eval=FALSE, class.source = "fold-show"}
n_agents <- 1 # number of agents
gambles <- gambles # a tibble of gamble features (see above)

# parameters
parameters <- expand_grid(s = 0, # probability increment to the baseline sampling probability of p = .5
                          sigma = .1, # noise
                          boundary = c("absolute", "relative")) # boundary type

theta_c <- expand_grid(parameters, a = 10) # boundaries comprehensive (in units of outcomes)
theta_p <- expand_grid(parameters, a = 1) # boundaries piecewise (in units of wins)