Commit 76aa2ff7 authored by Simon Ciranka

latest version

parent 85ef9831
Developing\_Marbles
================
Simon
16/01/2019
- [Marble Data](#marble-data)
- [look at the subjects.](#look-at-the-subjects.)
- [Let's have a first Look at the Data](#lets-have-a-first-look-at-the-data)
- [Who Wins](#who-wins)
- [Risky Choice](#risky-choice)
- [How Good were they at Estimating?](#how-good-were-they-at-estimating)
- [Who is most Confident?](#who-is-most-confident)
- [How does the Estimation Accuracy Relate to Confidence?](#how-does-the-estimation-accuracy-relate-to-confidence)
- [Inverse U?](#inverse-u)
# YOU NEED TO USE THE LOG PRECISION AS CONFIDENCE MODEL
Marble Data
===========
In this document we have a look at the data from the Marble Task. We are supposed to have 50 participants pre, 50 post, and 50 adolescents who performed a social and a solo version of the experiment. They were asked to decide between safe and risky gamble options in order to receive bonus points. Sometimes the outcome probabilities were described to them as marbles in a jar. On other occasions the outcome probabilities had to be inferred by the participants from samples drawn from an urn. Participants always saw 9 pieces of information. Then we asked them for an estimate of the proportion of marbles in that urn, and subsequently how sure they were about this estimate. Only then did they make a choice. In the social trials we showed participants the choices of another subject who had completed a similar task before ([here is the paper](http://www.tandfonline.com/doi/full/10.1080/87565641.2016.1158265)). This made it possible to choose an advisor with a realistic choice function. Advisors were selected to make risky choices 20% more often than the subject. The raw data was pretty messy, because it was collected on two servers and every trial has its own column but does not save all information. I used [this script](RawData/TidyMarble_JointTask.R) to tidy up.
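The advisor selection rule can be sketched in a few lines. This is only an illustration, not the actual matching code: `previous_subjects` and `p_risky` are hypothetical names, and the "20% more" is read here as a shift in the proportion of risky choices.

``` r
# Illustration only: previous_subjects and p_risky are hypothetical names.
# Pick the prior participant whose risky-choice rate is closest to the
# current subject's rate plus a 20% offset.
pick_advisor <- function(subject_p_risky, previous_subjects, offset = 0.20) {
  target <- subject_p_risky + offset
  previous_subjects[which.min(abs(previous_subjects$p_risky - target)), ]
}
```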
look at the subjects.
=====================
For some reason this histogram bins the first subjects together, but this is the number of trials completed by every subject: 144, as it should be (4\*36). Generally this looks satisfying, as if I did not screw anything up when coding, saving, and making the SQL data tidy.
``` r
MarbleData<-read.csv("RawData/TidyMarble.csv")
hist(MarbleData$subject,breaks=length(unique(MarbleData$subject)))
```
![](Analyze_files/figure-markdown_github/cars-1.png)
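Since the histogram is hard to read, a quick tabulation makes the same check explicit. This is only a sketch on top of the `MarbleData` object read above; it should return zero rows if every subject has 144 trials.

``` r
library(dplyr)
# list subjects whose trial count deviates from the expected 4 * 36 = 144
MarbleData %>%
  count(subject) %>%
  filter(n != 144)
```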
And here I reformat the variables I need.
``` r
MarbleData$Agegroup=as.factor(MarbleData$Agegroup)                                      # 0 = kids, 1 = adolescents, 2 = adults
MarbleData$DFE1DFD0=as.factor(MarbleData$DFE1DFD0)                                      # 1 = decisions from experience, 0 = decisions from description
MarbleData$Social1Ind0=as.factor(MarbleData$Social1Ind0)                                # 1 = social, 0 = individual trials
MarbleData$PercentBlueEstimate=round(as.numeric(MarbleData$PercentBlueEstimate))        # participants' estimate of the proportion
MarbleData$PercentBlueShownRel=round(as.numeric(MarbleData$PercentBlueShownRel),2)      # true proportion shown
MarbleData$PercentBlueShownRel=as.factor(as.character(MarbleData$PercentBlueShownRel))  # treat the true proportion as a factor
MarbleData$HowSure=as.numeric(MarbleData$HowSure)                                       # confidence rating
```
Let's have a first Look at the Data
==================================
Who Wins
--------
That's the most interesting thing of all. WHO GETS THE MOST BONUS POINTS? Let's check.
![](Analyze_files/figure-markdown_github/unnamed-chunk-2-1.png)
lol. Adolescents. This is going to be interesting.
Risky Choice
------------
So let's start by looking at the overall proportion of risky decisions, by condition. We can see that the (preregistered) impact of advice on adolescents depends less on the uncertainty condition than we thought it would. There seems to be a linear decrease in risk taking in both conditions. In "Decisions from Experience" subjects make fewer risky choices.
![](Analyze_files/figure-markdown_github/unnamed-chunk-3-1.png)
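The plot above is produced without echoing its code, so here is only a sketch of how the proportions could be tabulated. `ChoseRisky` is a hypothetical 0/1 indicator for picking the risky option; the actual column in TidyMarble.csv may be named differently.

``` r
# Sketch: ChoseRisky is a hypothetical 0/1 column (1 = risky option chosen)
MarbleData %>%
  group_by(Agegroup, DFE1DFD0, Social1Ind0) %>%
  summarise(PropRisky = mean(ChoseRisky, na.rm = TRUE)) %>%
  arrange(DFE1DFD0, Social1Ind0)
```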
How Good were they at Estimating?
---------------------------------
In the Decisions from Experience trials we asked participants what they think the outcome distribution really was. So now let's see whether there are age differences in estimating these distributions. For this I treat the real distributions as factors.
``` r
MarbleData%>%filter(DFE1DFD0=="1"&PercentBlueEstimate!="NULL")%>%ggplot(aes(x=PercentBlueShownRel,y=PercentBlueEstimate,fill=Agegroup))+
stat_summary(geom="bar",fun.y = "mean",position="dodge")+
stat_summary(geom="pointrange",fun.y=mean,fun.ymin =function(x) mean(x)-(sd(x)/sqrt(length(x))),fun.ymax=function(x) mean(x)+(sd(x)/sqrt(length(x))),position = position_dodge(0.9))+
scale_x_discrete(name="Actual Distribution")+
scale_y_continuous(name="Participants Estimate")+
ggtitle("Estimation Accuracy")
```
![](Analyze_files/figure-markdown_github/unnamed-chunk-4-1.png)
Turns out they were accurate and captured the trends. While estimating similarly well, children and adolescents tended to be over-optimistic in their estimates. Is this a linear trend? The effect looks smaller for higher proportions of blue marbles. Participants of all ages were also optimistic for low probabilities and pessimistic for high ones.
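One way to check the linear-trend question is a simple regression of the estimate on the true proportion, interacted with age group. This is a sketch only; the shown proportion is converted back from its factor representation, and the two variables may be on different scales (the estimate looks like 0-100, the shown proportion like 0-1), so the slope's absolute size should not be over-interpreted.

``` r
# Sketch: does the estimate track the true proportion linearly,
# and does that slope differ by age group?
trend_data <- MarbleData %>%
  filter(DFE1DFD0 == "1" & PercentBlueEstimate != "NULL") %>%
  mutate(ShownNum = as.numeric(as.character(PercentBlueShownRel))) # factor back to its value

summary(lm(PercentBlueEstimate ~ ShownNum * Agegroup, data = trend_data))
```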
Who is most Confident?
----------------------
``` r
MarbleData%>%filter(DFE1DFD0=="1"&PercentBlueEstimate!="NULL")%>%
ggplot(aes(x=Agegroup,y=HowSure))+
stat_summary(geom="pointrange",fun.y=mean,fun.ymin =function(x) mean(x)-1.96*(sd(x)/sqrt(length(x))),fun.ymax=function(x) mean(x)+1.96*(sd(x)/sqrt(length(x))),position = position_dodge(0.9))+
scale_x_discrete(name="Agegroup",breaks=c(0,1,2),labels=c("Kids","Adolescents","Adults"))+
scale_y_continuous(name="Confidence Rating")+
ggtitle("Confidence Rating")
```
![](Analyze_files/figure-markdown_github/unnamed-chunk-5-1.png)
Adolescents report being the least confident in their estimates.
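A quick formal check of this impression is sketched below. It ignores the repeated-measures structure; a proper analysis would use a mixed model with subject as a random effect.

``` r
conf_data <- MarbleData %>%
  filter(DFE1DFD0 == "1" & PercentBlueEstimate != "NULL")

# simple one-way comparison of confidence across age groups
summary(lm(HowSure ~ Agegroup, data = conf_data))
# rank-based alternative without normality assumptions
kruskal.test(HowSure ~ Agegroup, data = conf_data)
```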
How does the Estimation Accuracy Relate to Confidence?
------------------------------------------------------
I compute the squared difference between the actual proportion shown and the participant's estimate as their squared error.
``` r
labels <- c(
"0" = "Kids",
"1" = "Adolescents",
"2" = "Adults"
)
MarbleData%>%filter(DFE1DFD0=="1"&PercentBlueEstimate!="NULL")%>%mutate(
SquaredError=(as.numeric(PercentBlueEstimate) - as.numeric(as.character(PercentBlueShownRel)))^2 # convert the factor back to its value rather than its level index; the two variables may be on different scales
)%>%ggplot(aes(x=SquaredError,y=as.numeric(HowSure)))+
stat_summary(geom="point",fun.y = "mean",position="dodge")+
stat_smooth(method="loess")+
stat_summary(geom="pointrange",fun.y=mean,fun.ymin =function(x) mean(x)-(sd(x)/sqrt(length(x))),fun.ymax=function(x) mean(x)+(sd(x)/sqrt(length(x))),position = position_dodge(0.9))+
facet_grid(.~Agegroup,
labeller = labeller(Agegroup = labels)
)+
scale_x_continuous(name="Squared Estimation Error")+
scale_y_continuous(name="Confidence")+
ggtitle("Squared Estimation Error and Confidence")
```
## Warning: Width not defined. Set with `position_dodge(width = ?)`
## Warning: Removed 3 rows containing missing values (geom_pointrange).
![](Analyze_files/figure-markdown_github/unnamed-chunk-6-1.png)
I can't make much of this inverted-U shape. Does this really mean that when they are very wrong, they are most confident? This bias seems to be least present in kids.
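One way to put a number on this impression is a rank correlation between the squared error and confidence within each age group. A sketch, ignoring the repeated-measures structure and the possible scale mismatch noted above:

``` r
MarbleData %>%
  filter(DFE1DFD0 == "1" & PercentBlueEstimate != "NULL") %>%
  mutate(SquaredError = (as.numeric(PercentBlueEstimate) -
                           as.numeric(as.character(PercentBlueShownRel)))^2) %>%
  group_by(Agegroup) %>%
  summarise(SpearmanRho = cor(SquaredError, HowSure,
                              method = "spearman",
                              use = "pairwise.complete.obs"))
```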
### Inverse U?
Maybe confidence has more to do with the participants' estimate per se than with their accuracy. They seem more confident the more extreme they judge the distributions to be.
``` r
labels <- c(
"0" = "Kids",
"1" = "Adolescents",
"2" = "Adults"
)
MarbleData%>%filter(DFE1DFD0=="1"&PercentBlueEstimate!="NULL")%>%mutate(
delta=abs(as.numeric(as.character(PercentBlueShownRel))-PercentBlueEstimate) # factor back to its value; delta is not used in this plot
)%>%ggplot(aes(x=PercentBlueEstimate,y=HowSure))+
stat_summary(geom="point",fun.y = "mean",position="dodge")+
stat_smooth(method="loess")+
stat_summary(geom="pointrange",fun.y=mean,fun.ymin =function(x) mean(x)-(sd(x)/sqrt(length(x))),fun.ymax=function(x) mean(x)+(sd(x)/sqrt(length(x))),position = position_dodge(0.9))+
facet_grid(.~Agegroup,
labeller = labeller(Agegroup = labels)
)+
scale_x_continuous(name="Estimate")+
scale_y_continuous(name="Confidence")
```
## Warning: Width not defined. Set with `position_dodge(width = ?)`
## Warning: Removed 15 rows containing missing values (geom_pointrange).
![](Analyze_files/figure-markdown_github/unnamed-chunk-7-1.png)
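To separate extremity from accuracy, one can correlate confidence with how far the estimate is from the midpoint. A sketch, assuming the estimate is on a 0-100 scale (use 0.5 as the midpoint if it is a proportion):

``` r
MarbleData %>%
  filter(DFE1DFD0 == "1" & PercentBlueEstimate != "NULL") %>%
  mutate(Extremity = abs(as.numeric(PercentBlueEstimate) - 50)) %>% # distance from the midpoint
  group_by(Agegroup) %>%
  summarise(SpearmanRho = cor(Extremity, HowSure,
                              method = "spearman",
                              use = "pairwise.complete.obs"))
```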
@@ -325,7 +325,7 @@ hist(TidyMarble$subject,breaks=length(unique(TidyMarble$subject)))
recollect=(Supposed %in% unique(TidyMarble$subject))
missing<-Supposed[!recollect]
-write.csv(TidyMarble,file="TidyMarble.csv")
+write.csv(TidyMarble,file="TidyMarbleUpdate.csv")
# down here I build something for Lucas so we can connect datasets.
demographics<-c("sex","age","subject")
demographicsTBL<-TidyMarble[demographics]
data {
int<lower=1> N;// Number of Subjects
int<lower=1> T;// Trials
//int<lower=1> Seq; // Sequence size
int<lower=1, upper=T> Tsubj[N];//number of trials for each subject per Group
//int<lower=1,upper=3> Groups[N];
int<lower=0, upper=1> choice[N, T];       // 1 = risky option chosen, 0 = safe option chosen
int<lower=0, upper=1> risk1Unc0[N, T];    // trial type flag: described (risk) vs. sampled (uncertainty) probabilities
int<lower=0, upper=3> condition[N, T];    // 0: solo, 1: ss, 2: mix, 3: rr
int<lower=0, upper=99> Sucess[N, T];      // number of successes observed in the sample
int<lower=0, upper=99> Fail[N, T];        // number of failures observed in the sample
//int<lower=0, upper=3> group[N, T]; // 0: solo, 1: ss, 2: mix, 3: rr
real safe_payoff[N, T];
real risky_payoff[N, T];
//real<lower=0, upper=1> ambigLevel[N, T];
real<lower=0, upper=1> p_gamble[N, T];
real<lower=0, upper=1> p_gamble_est[N, T];
}
transformed data {
}
parameters {
//Group mus.
real<lower=0, upper=2> mu_rho ;
real<lower=0> mu_tau ;
real mu_ocu_Uncertainty;
// Social Parameters
// Group Sigmas
real<lower=0>sigma_rho;
real<lower=0>sigma_tau;
real<lower=0>sigma_ocu_Uncertainty;
//individual.
real rho_p[N];
real tau_p[N];
real ocu_Uncertainty_p[N];
}
transformed parameters {
real<lower=0, upper=2> rho[N];
real<lower=0> tau[N];
real ocu_Uncertainty[N];
// A Normal(μ, σ) distribution, like other distributions in the location-scale family, can be reparameterized
// so that it is sampled from a unit normal that is scaled by σ and shifted by the location μ. Formally,
//   ξ ∼ Normal(μ_ξ, σ_ξ)
// is mathematically equivalent to
//   ξ' ∼ Normal(0, 1)          (defined in the model block)
//   ξ = μ_ξ + ξ' · σ_ξ         (defined here)
for (i in 1:N) {// subs
// Social Utility Model
rho[i] = Phi_approx(mu_rho + sigma_rho * rho_p[i]) * 2; // Phi_approx maps the unconstrained value into (0,1); multiplying by 2 bounds rho in (0,2)
// choice model
tau[i] = exp(mu_tau + sigma_tau * tau_p[i]);
ocu_Uncertainty[i] = mu_ocu_Uncertainty + sigma_ocu_Uncertainty * ocu_Uncertainty_p[i];
} // end subject loop
} // end transformed parameters
model {
// peer_ocu
// hyperpriors for all parameters; in principle I could set different hyperpriors for each
mu_rho ~ normal(0,1);
mu_tau ~ normal(0,1);
mu_ocu_Uncertainty ~ normal(0,1);
sigma_rho ~ normal(0, 0.2);
sigma_tau ~ normal(0, 0.2);
sigma_ocu_Uncertainty ~ cauchy(0, 1);
// individual parameters w/ Matt trick
// I define the distributions inside the loop because, with the nested data, there are too many dimensions for vectorizing.
for (i in 1:N) {
rho_p[i] ~ normal(0, 1.0);
tau_p[i] ~ normal(0, 1.0);
ocu_Uncertainty_p[i] ~ normal(0, 1.0);
for (t in 1:Tsubj[i]) {
real U_safe;
real U_risky;
// is it a risk trial?
U_safe = pow(safe_payoff[i, t], rho[i]);
U_risky = p_gamble[i,t] * pow(risky_payoff[i, t], rho[i]);
if (condition[i, t] == 1) { // safe-safe
U_risky = U_risky + ocu_Uncertainty[i];
}
if (condition[i, t] == 3) { // risky-risky
U_risky = U_risky + ocu_Uncertainty[i];
}
choice[i, t] ~ bernoulli_logit((tau[i])* (U_risky - U_safe));
} // end trial loop
} // end subject loop
} // end model
data {
int<lower=1> N;// Number of Subjects
int<lower=1> T;// Trials
//int<lower=1> Seq; // Sequence size
int<lower=1, upper=T> Tsubj[N];//number of trials for each subject per Group
//int<lower=1,upper=3> Groups[N];
int<lower=0, upper=1> choice[N, T];       // 1 = risky option chosen, 0 = safe option chosen
int<lower=0, upper=1> risk1Unc0[N, T];    // trial type flag: described (risk) vs. sampled (uncertainty) probabilities
int<lower=0, upper=3> condition[N, T];    // 0: solo, 1: ss, 2: mix, 3: rr
int<lower=0, upper=99> Sucess[N, T];      // number of successes observed in the sample
int<lower=0, upper=99> Fail[N, T];        // number of failures observed in the sample
//int<lower=0, upper=3> group[N, T]; // 0: solo, 1: ss, 2: mix, 3: rr
real safe_payoff[N, T];
real risky_payoff[N, T];
//real<lower=0, upper=1> ambigLevel[N, T];
real<lower=0, upper=1> p_gamble[N, T];
real<lower=0, upper=1> p_gamble_est[N, T];
}
transformed data {
}
parameters {
//Group mus.
real<lower=0, upper=2> mu_rho ;
real<lower=0> mu_tau ;
// Beta Updating Parameters
real<lower=0, upper=5> mu_alpha_add;
real<lower=0, upper=5> mu_beta_add;
// Social Parameters
real mu_ocu_Risk;
real mu_ocu_Uncertainty;
// Group Sigmas
real<lower=0>sigma_rho;
real<lower=0>sigma_tau;
real<lower=0>sigma_ocu_Risk;
real<lower=0>sigma_ocu_Uncertainty;
real<lower=0>sigma_alpha_add;
real<lower=0>sigma_beta_add;
//individual.
real rho_p[N];
real tau_p[N];
real ocu_Risk_p[N];
real ocu_Uncertainty_p[N];
real alpha_add_p[N];
real beta_add_p[N];
}
transformed parameters {
real<lower=0, upper=2> rho[N];
real<lower=0, upper=2> alpha_add[N];
real<lower=0, upper=2> beta_add[N];
real<lower=0> tau[N];
real ocu_Risk[N];
real ocu_Uncertainty[N];
// A Normal(μ, σ) distribution, like other distributions in the location-scale family, can be reparameterized
// so that it is sampled from a unit normal that is scaled by σ and shifted by the location μ. Formally,
//   ξ ∼ Normal(μ_ξ, σ_ξ)
// is mathematically equivalent to
//   ξ' ∼ Normal(0, 1)          (defined in the model block)
//   ξ = μ_ξ + ξ' · σ_ξ         (defined here)
for (i in 1:N) {// subs
// Social Utility Model
rho[i] = Phi_approx(mu_rho + sigma_rho * rho_p[i]) * 2; // Phi_approx maps the unconstrained value into (0,1); multiplying by 2 bounds rho in (0,2)
ocu_Risk[i] = mu_ocu_Risk + sigma_ocu_Risk * ocu_Risk_p[i];
ocu_Uncertainty[i] = mu_ocu_Uncertainty + sigma_ocu_Uncertainty * ocu_Uncertainty_p[i];
// updating Model
alpha_add[i]= Phi_approx(mu_alpha_add + sigma_alpha_add * alpha_add_p[i])*2;
beta_add[i]= Phi_approx(mu_beta_add + sigma_beta_add * beta_add_p[i])*2;
// choice model
tau[i] = exp(mu_tau + sigma_tau * tau_p[i]);
} // end subject loop
} // end transformed parameters
model {
// peer_ocu
// hyperpriors for all parameters; in principle I could set different hyperpriors for each
mu_rho ~ normal(0,1);
mu_tau ~ normal(0,1);
mu_ocu_Risk ~ normal(0,1);
mu_ocu_Uncertainty ~ normal(0,1);
mu_alpha_add ~ normal(0,1);
mu_beta_add ~ normal(0,1);
sigma_rho ~ normal(0, 0.2);
sigma_tau ~ normal(0, 0.2);
sigma_ocu_Risk ~ cauchy(0, 1);
sigma_ocu_Uncertainty ~ cauchy(0, 1);
sigma_alpha_add ~ normal(0,0.2);
sigma_beta_add ~ normal(0,0.2);
// individual parameters w/ Matt trick
// I define the distributions inside the loop because, with the nested data, there are too many dimensions for vectorizing.
for (i in 1:N) {
rho_p[i] ~ normal(0, 1.0);
tau_p[i] ~ normal(0, 1.0);
ocu_Risk_p[i] ~ normal(0, 1.0);
ocu_Uncertainty_p[i] ~ normal(0, 1.0);
alpha_add_p[i] ~ normal(0,1.0);
beta_add_p[i] ~ normal(0,1.0);
for (t in 1:Tsubj[i]) {
real U_safe;
real U_risky;
// is it a risk trial?
if(risk1Unc0[i, t]==0){
U_safe = pow(safe_payoff[i, t], rho[i]);
U_risky = p_gamble[i,t] * pow(risky_payoff[i, t], rho[i]);
if (condition[i, t] == 1) { // safe-safe
U_risky = U_risky + ocu_Risk[i];
}
if (condition[i, t] == 3) { // risky-risky
U_risky = U_risky + ocu_Risk[i];
}
choice[i, t] ~ bernoulli_logit((tau[i])* (U_risky - U_safe));
}//end Risk
if(risk1Unc0[i, t]==1){// is it an uncertain trial?
// //LEARNING MODEL.
// for (s in 1:Seq) {
// alpha_total=alpha_total+ Sucess[i,t,s]*alpha_add[i];// thats the weighting multiplier for gains
// beta_total=beta_total+ Fail[i,t,s]*beta_add[i];// thats the weighting multiplier for losses
// }
p_gamble_est[i,t] ~ beta(pow(Sucess[i,t],alpha_add[i]),pow(Fail[i,t],beta_add[i])); // the estimated winning probability is modelled as Beta-distributed, with shape parameters derived from the observed successes and failures
//beta_rng(alpha_total,beta_total)
U_safe = pow(safe_payoff[i, t], rho[i]);
U_risky = p_gamble_est[i,t] * pow(risky_payoff[i, t], rho[i]);
if (condition[i, t] == 1) { // safe-safe
U_risky = U_risky + ocu_Uncertainty[i];
}
if (condition[i, t] == 3) { // risky-risky
U_risky = U_risky + ocu_Uncertainty[i];
}
choice[i, t] ~ bernoulli_logit((tau[i])* (U_risky - U_safe));
}//end Uncertain
} // end trial loop
} // end subject loop
} // end model
data {
int<lower=1> N;// Number of Subjects
int<lower=1> T;// Trials
int<lower=1> G; // Different Groups
int<lower=1, upper=T> Tsubj[N,G];//number of trials for each subject per Group
int<lower=0, upper=1> choice[N, T, G];
int<lower=0, upper=3> condition[N, T, G]; // 0: solo, 1: ss, 2: mix, 3: rr
//int<lower=0, upper=3> group[N, T]; // 0: solo, 1: ss, 2: mix, 3: rr
real safe_payoff[N, T, G];
real risky_payoff[N, T, G];
real<lower=0, upper=1> p_gamble[N, T, G];
}
transformed data {
}
parameters {
//Group mus.
real<lower=0, upper=2> mu_rho[G] ;
real<lower=0> mu_tau[G] ;
// Group Sigmas
real<lower=0>sigma_rho[G];
real<lower=0>sigma_tau[G];
//individual.
real rho_p[N,G];
real tau_p[N,G];
}
transformed parameters {
real<lower=0, upper=2> rho[N,G];
real<lower=0> tau[N,G];
// A Normal(μ, σ) distribution, like other distributions in the location-scale family, can be reparameterized
// so that it is sampled from a unit normal that is scaled by σ and shifted by the location μ. Formally,
//   ξ ∼ Normal(μ_ξ, σ_ξ)
// is mathematically equivalent to
//   ξ' ∼ Normal(0, 1)          (defined in the model block)
//   ξ = μ_ξ + ξ' · σ_ξ         (defined here)
for (g in 1:G){// groups.
for (i in 1:N) {// subs