Demo Experiment Analysis, Part II
Alright, so in the last script we evaluated our main hypothesis: whether dB level, as well as individual factors, affects the proportion of times participants hear “Yanny” over “Laurel”.
Now let’s look at reaction times to evaluate our secondary hypothesis: that smaller dB ratios will yield longer RTs. To do this, we said we would use a polynomial regression. However, we are also going to check what plain linear regression has to say about this data.
First, let’s set up the script…
Set paths and load libraries
data_path = file.path('../data')
results_path = file.path('../3_results')
library(tidyverse)
Load Data
data <- read.csv(file.path(data_path,"data.csv"), row.names = NULL, colClasses=c("participant" = "factor", "sex" = "factor"))
1. Reaction Times
From our preregistration:
“We will exclude particularly slow and particularly fast responses, defined as median ± 3 * the median absolute deviation (MAD). However if this procedure eliminates more than 5% of observations, we will opt for a factor to multiply the MAD by, that allows for retention of at least 95% of the data.”
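In equation form, writing $\widetilde{RT}$ for the sample median, a trial $i$ is excluded when

$$RT_i < \widetilde{RT} - 3 \cdot \mathrm{MAD} \quad \text{or} \quad RT_i > \widetilde{RT} + 3 \cdot \mathrm{MAD}$$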
Outlier removal
Let’s find out if we have outliers based on our pre-registered criteria, quoted above. We’ll start by visualizing the distribution and summary statistics of our RT data.
Your turn: check whether we can apply our baseline criteria for outlier removal while staying within our threshold for data loss.
If the data loss is too high, follow the procedure we outlined to find an acceptable cutoff.
hist(data$RT, breaks = 50)
summary(data$RT)
too_short <- which(data$RT < median(data$RT) - 3*mad(data$RT))
too_long <- which(data$RT > median(data$RT) + 3*mad(data$RT))
# drop both tails; the guard matters because data[-integer(0), ] would drop every row
outliers <- c(too_short, too_long)
data2 <- if (length(outliers) > 0) data[-outliers, ] else data
length(data2$RT)/length(data$RT)
too_short <- which(data$RT < median(data$RT) - 6*mad(data$RT))
too_long <- which(data$RT > median(data$RT) + 6*mad(data$RT))
outliers <- c(too_short, too_long)
data3 <- if (length(outliers) > 0) data[-outliers, ] else data
length(data3$RT)/length(data$RT)
#... we tried several factors, and decided on doing a strict cut-off of the data at 95%:
keep_N <- ceiling(length(data$RT) * 0.95)
#sort(data$RT)
data_or_interm <- data %>% arrange(RT)
data_or <- data_or_interm[1:keep_N,]
# calculate the effective MAD factor implied by this cut-off
(max(data_or$RT)-median(data$RT)) / mad(data$RT)
ggplot() +
  geom_histogram(data = data, mapping = aes(RT), fill = 'lightblue', alpha = 0.8) +
  geom_histogram(data = data_or, mapping = aes(RT), fill = 'orange', alpha = 0.5)
Since a factor of 5 * the MAD cuts off a bit too much data (only ~94% retained), while a factor of 6 leaves quite a few outliers in the data (18), we decided to implement a strict cut-off by keeping only the first 95% of sorted RT values. This corresponds to a cut-off of ~5.32 * the MAD.
Summary statistics
Let’s look at some summary statistics:
summary(data_or$RT)
How do we expect the data to look, broken down by participant and dB level?
RTdata <- data_or %>%
  group_by(participant, dB_ratio) %>%
  summarise(median = median(RT),
            mad = mad(RT),
            quant = IQR(RT))
Plot
We’ll use geom_boxplot() to visualize the summary statistics of RTs for each level of our dB manipulation.
We’ll also visualize the raw data by plotting the points with geom_point().
ggplot(data_or) +
geom_boxplot(aes(dB_ratio, RT, group = dB_ratio), width = 3) +
geom_point(aes(dB_ratio, RT, color = participant)) +
scale_x_continuous(name = "High/low dB ratio") +
scale_color_brewer(palette = "Dark2") +
theme_bw()
Challenge! Use Raincloud plots to also add a “half violin” to the side of each boxplot. Then, you’ll be visualizing the raw data, summary stats, and the distribution of responses for each level of dB.
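Here is a minimal sketch of one way to start, assuming the ggdist package is installed (its stat_halfeye() draws the half violin). The bandwidth, justification, and width values below are guesses you will want to tweak:
library(ggdist)
ggplot(data_or, aes(dB_ratio, RT, group = dB_ratio)) +
  # half violin to the side of each box; .width = 0 and point_colour = NA
  # suppress ggdist's default interval and median point
  stat_halfeye(orientation = "vertical", adjust = 0.5, justification = -0.3,
               .width = 0, point_colour = NA) +
  geom_boxplot(width = 3, outlier.shape = NA) +
  geom_point(aes(color = participant), alpha = 0.5) +
  scale_x_continuous(name = "High/low dB ratio") +
  scale_color_brewer(palette = "Dark2") +
  theme_bw()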
2. Plain Linear Regression
Now, we’re going to model the effect of dB level on RT (or equivalently stim_idx; for this model it makes no difference which we use). What happens if we fit a simple linear regression?
RTmodel1 <- lm(RT ~ stim_idx, data = data_or)
summary(RTmodel1)
plot(RTmodel1)
The plot() function from base R has an interesting behavior when it comes to model objects: it generates a set of diagnostic plots to help us determine whether our model is a good fit for the data. Take a look at the plots (and look them up online): what is each of these plots telling us?
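By default these diagnostics appear one at a time; as a small convenience (our addition, not part of the original script), base R can lay out all four in a single panel:
par(mfrow = c(2, 2))  # arrange plots in a 2x2 grid
plot(RTmodel1)
par(mfrow = c(1, 1))  # reset the plotting layout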
Predicted vs. Observed Values
As before, let’s visualize the predicted values from the model against our observed values.
predicted <- data.frame(RT_pred = predict(RTmodel1, data_or), dB_ratio=data_or$dB_ratio)
ggplot(data_or, aes(dB_ratio, RT)) +
geom_pointrange(stat = "summary",
fun.min = function(z) {quantile(z,0.25)},
fun.max = function(z) {quantile(z,0.75)},
fun = median) +
geom_line(color='red',data = predicted, aes(x=dB_ratio, y=RT_pred)) +
stat_smooth(method='lm', formula = y ~ x, size = 1)
Does this look like a good fit for the model?
A plain linear regression assumes a linear relationship between the predictor and the outcome variable. But that’s not actually the relationship we observe, is it?
We predicted this too! We wrote:
“To determine whether RTs are modulated by dB ratio (for the purposes of our study, an index of stimulus ambiguity), we will perform a polynomial linear regression with RTs as outcome variable and dB ratio as predictor.”
3. Polynomial Regression
Polynomial regression requires us to state the polynomial degree that defines the relationship of the predictor to the outcome variable.
Each added degree allows another bend in the curve that we hope will capture the relationship between x and y, or dB ratio and RTs. The general form is shown below.
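A polynomial regression of degree $k$ fits:

$$y = \beta_0 + \beta_1 x + \beta_2 x^2 + \dots + \beta_k x^k + \varepsilon$$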
Need a refresher on polynomial regression? Try this.
So let’s give it a try:
In R, you can specify that a predictor should be treated as a polynomial by wrapping it in the poly() function. The second argument to poly() is the polynomial order.
Go ahead and try this out. Visualize the predicted values, and play around with different orders…
Let’s give the polynomial regression that we planned in the pre-registration a chance. Since a polynomial of order 1 is the same as linear regression, and the risk of overfitting grows quickly as we increase the order, let’s start with order = 2.
RTmodel2 <- lm(RT ~ poly(stim_idx,2), data = data_or)
summary(RTmodel2)
plot(RTmodel2)
# for plotting & calculating RMSE
predicted2 <- data.frame(RT_pred = predict(RTmodel2, data_or), dB_ratio=data_or$dB_ratio)
What’s nice about the lm() function here is that it gives us separate coefficients and their test statistics for each polynomial order. This can help us figure out which order contributes the most towards capturing the variance in the data.
Let’s quickly see what happens if we try different polynomial orders. Pay attention to the formula in each stat_smooth() layer of this ggplot() call.
ggplot(data_or, aes(dB_ratio, RT)) +
geom_pointrange(stat = "summary",
fun.min = function(z) {quantile(z,0.25)},
fun.max = function(z) {quantile(z,0.75)},
fun = median) +
#geom_line(color='red',data = predicted2, aes(x=dB_ratio, y=RT_pred)) +
stat_smooth(method='lm', formula = y ~ poly(x,1), size = 1, color = "black") +
stat_smooth(method='lm', formula = y ~ poly(x,2), size = 1, alpha = 0.6, color = "blue") +
stat_smooth(method='lm', formula = y ~ poly(x,3), size = 1, alpha = 0.6, color = "green") +
stat_smooth(method='lm', formula = y ~ poly(x,4), size = 1, alpha = 0.6, color = "orange")
As before, we’re interested in the coefficients and their confidence intervals.
coef(RTmodel2)
confint(RTmodel2, level=0.95)
How do we interpret the polynomial coefficients?
The predicted reaction time for a given dB level is given by the following equation:

$$\widehat{RT} = \beta_0 + \beta_1 \, (\text{dB level}) + \beta_2 \, (\text{dB level})^2$$
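One caveat worth flagging (our note, not from the original write-up): poly() fits orthogonal polynomials by default, so the coefficients printed by lm() are not the raw $\beta_1$ and $\beta_2$ in the equation above. If you want coefficients on the original scale, refit with raw = TRUE; the fitted values are identical either way:
# refit with raw (non-orthogonal) polynomial terms
RTmodel2_raw <- lm(RT ~ poly(stim_idx, 2, raw = TRUE), data = data_or)
coef(RTmodel2_raw)  # intercept, beta_1, beta_2 on the original scale
# same predictions, different parameterization:
all.equal(unname(fitted(RTmodel2)), unname(fitted(RTmodel2_raw)))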
Model Performance
The Root Mean Square Error (RMSE) is a useful metric for evaluating the goodness-of-fit of a linear model. Let’s break the name down, working from the end (the formula follows this list):
we take the error: the difference between each observed and predicted value,
square it,
take the mean of those squared errors,
then take the square root of that mean.
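With $y_i$ the observed values and $\hat{y}_i$ the model’s predictions:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$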
# RMSE for RTmodel2
sqrt(mean( (data_or$RT - predicted2$RT_pred)^2 ))
In addition, compare the RMSE values for the models you fit:
# RMSE for RTmodel1
sqrt(mean( (data_or$RT - predicted$RT_pred)^2 ))
Which has the lower error? They’re essentially the same. It looks like neither model does an excellent job of fitting the data, but looking at the data we can see why: there does not appear to be a clear pattern, or at least not one we can observe with our sample of 7 participants and the few trials we had per stimulus.
Model Comparison
Think back to the last tutorial: what does model comparison using a Chi-square tell you?
anova(RTmodel1, RTmodel2, test = "Chisq")
What is the interpretation of this result in plain English?
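As an aside (our addition, not part of the pre-registered plan): since these are nested lm fits, the more conventional comparison uses an F-test, which anova() performs by default for linear models:
# default F-test comparison for nested linear models
anova(RTmodel1, RTmodel2)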
Writing it up…
Values to report: R², the F value (F) with its degrees of freedom (numerator and denominator, in parentheses next to F, separated by a comma), the significance level (p), and β. Report the β and its corresponding t-test for each predictor in the regression.
For example: “A polynomial regression analysis was used to test whether dB ratio level significantly predicted participants’ reaction times in the Yanny/Laurel 2AFC task. The results of the regression indicated that the predictor “dB ratio” explained __ % of the variance (R² = __, F(__, __) = __, p __ ). It was found that __ [did not significantly predict / significantly predicted] RT (β = __, p __ ).”