data("sleepstudy")
# install.packages("lmerTest")
library(lmerTest)
# library(ggplot2)
library(knitr)
opts_template$set(figure_small = list(fig.height = 4, fig.width = 6))
kable(head(sleepstudy))
Reaction | Days | Subject |
---|---|---|
249.5600 | 0 | 308 |
258.7047 | 1 | 308 |
250.8006 | 2 | 308 |
321.4398 | 3 | 308 |
356.8519 | 4 | 308 |
414.6901 | 5 | 308 |
Let’s replicate the mixed model with an interaction term that we fitted in the first part of this tutorial:
# First let's create the additional variable Lunch
set.seed(88)
lunch <- sample(c(0,1), replace=TRUE, size=18)
sleepstudy$lunch <- factor(lunch)
head(sleepstudy)
## Reaction Days Subject lunch
## 1 249.5600 0 308 0
## 2 258.7047 1 308 0
## 3 250.8006 2 308 1
## 4 321.4398 3 308 0
## 5 356.8519 4 308 0
## 6 414.6901 5 308 1
We will have 2 variations in our model: first, we will consider Days as a categorical rather than a numerical variable. We do this because we want to be able to answer the question: does lunch affect differently day 1 from day 2/3/4 etc? In other words, is there a difference in a difference between days? Second, we will remove the fixed intercept from the model, by adding a “-1” to the formula. This addition does not change much the result but will make the successive operations a little bit more handy as it provides the complete display of all the factor levels.
# Change the data type of Days from number to factor
sleepstudy$Days <- factor(as.character(sleepstudy$Days), levels = unique(sleepstudy$Days))
levels(sleepstudy$Days)
## [1] "0" "1" "2" "3" "4" "5" "6" "7" "8" "9"
# Fit the model
m <- lmer(Reaction
~ Days * lunch - 1 +
( 1 | Subject),
data=sleepstudy)
summary(m)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method
## [lmerModLmerTest]
## Formula: Reaction ~ Days * lunch - 1 + (1 | Subject)
## Data: sleepstudy
##
## REML criterion at convergence: 1640.8
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.2245 -0.5705 0.0388 0.5205 3.5249
##
## Random effects:
## Groups Name Variance Std.Dev.
## Subject (Intercept) 1501.0 38.74
## Residual 955.2 30.91
## Number of obs: 180, groups: Subject, 18
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## Days0 255.124 12.384 47.192 20.600 <2e-16 ***
## Days1 256.157 14.506 76.838 17.658 <2e-16 ***
## Days2 274.506 12.385 47.195 22.165 <2e-16 ***
## Days3 286.584 14.506 76.838 19.756 <2e-16 ***
## Days4 290.838 12.383 47.179 23.487 <2e-16 ***
## Days5 298.809 14.504 76.816 20.602 <2e-16 ***
## Days6 305.550 12.382 47.166 24.677 <2e-16 ***
## Days7 316.437 14.518 76.942 21.797 <2e-16 ***
## Days8 343.418 12.382 47.165 27.736 <2e-16 ***
## Days9 359.820 14.517 76.938 24.786 <2e-16 ***
## lunch1 6.873 18.510 145.244 0.371 0.7109
## Days1:lunch1 8.137 24.150 145.167 0.337 0.7367
## Days2:lunch1 -48.021 26.629 146.005 -1.803 0.0734 .
## Days3:lunch1 -13.338 24.221 145.412 -0.551 0.5827
## Days4:lunch1 -16.722 25.652 144.372 -0.652 0.5155
## Days5:lunch1 10.604 24.863 146.347 0.426 0.6704
## Days6:lunch1 22.956 26.557 145.781 0.864 0.3888
## Days7:lunch1 -2.709 23.495 144.086 -0.115 0.9084
## Days8:lunch1 -37.421 26.495 145.596 -1.412 0.1600
## Days9:lunch1 -23.016 24.271 145.583 -0.948 0.3445
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
From the above output, we can see that there are main effects of individual days on reaction, no effect of lunch and various estimates of the interaction between lunch and a particular day. As in the previous tutorial, we should interpret main effects of days as effects without lunch (lunch = 0). In addition to the main effect of lunch, each day will receive an additional coefficient given by the interaction coefficient.
The summary shows also that the interactions are not significant. But if we are interested in the differences between all the days and the first day of the experiment, Day 1? How can we compare all the other days to Day 1 in particular? First, we need to compute all the combinations of the levels of the factor Day and Lunch:
group <- paste0(sleepstudy$Days, sleepstudy$lunch)
group <- aggregate(model.matrix(m) ~ group, FUN=mean)
# group
rownames(group) <- group$group
(group <- group[,-1])
## Days0 Days1 Days2 Days3 Days4 Days5 Days6 Days7 Days8 Days9
## 00 1 0 0 0 0 0 0 0 0 0
## 01 1 0 0 0 0 0 0 0 0 0
## 10 0 1 0 0 0 0 0 0 0 0
## 11 0 1 0 0 0 0 0 0 0 0
## 20 0 0 1 0 0 0 0 0 0 0
## 21 0 0 1 0 0 0 0 0 0 0
## 30 0 0 0 1 0 0 0 0 0 0
## 31 0 0 0 1 0 0 0 0 0 0
## 40 0 0 0 0 1 0 0 0 0 0
## 41 0 0 0 0 1 0 0 0 0 0
## 50 0 0 0 0 0 1 0 0 0 0
## 51 0 0 0 0 0 1 0 0 0 0
## 60 0 0 0 0 0 0 1 0 0 0
## 61 0 0 0 0 0 0 1 0 0 0
## 70 0 0 0 0 0 0 0 1 0 0
## 71 0 0 0 0 0 0 0 1 0 0
## 80 0 0 0 0 0 0 0 0 1 0
## 81 0 0 0 0 0 0 0 0 1 0
## 90 0 0 0 0 0 0 0 0 0 1
## 91 0 0 0 0 0 0 0 0 0 1
## lunch1 Days1:lunch1 Days2:lunch1 Days3:lunch1 Days4:lunch1
## 00 0 0 0 0 0
## 01 1 0 0 0 0
## 10 0 0 0 0 0
## 11 1 1 0 0 0
## 20 0 0 0 0 0
## 21 1 0 1 0 0
## 30 0 0 0 0 0
## 31 1 0 0 1 0
## 40 0 0 0 0 0
## 41 1 0 0 0 1
## 50 0 0 0 0 0
## 51 1 0 0 0 0
## 60 0 0 0 0 0
## 61 1 0 0 0 0
## 70 0 0 0 0 0
## 71 1 0 0 0 0
## 80 0 0 0 0 0
## 81 1 0 0 0 0
## 90 0 0 0 0 0
## 91 1 0 0 0 0
## Days5:lunch1 Days6:lunch1 Days7:lunch1 Days8:lunch1
## 00 0 0 0 0
## 01 0 0 0 0
## 10 0 0 0 0
## 11 0 0 0 0
## 20 0 0 0 0
## 21 0 0 0 0
## 30 0 0 0 0
## 31 0 0 0 0
## 40 0 0 0 0
## 41 0 0 0 0
## 50 0 0 0 0
## 51 1 0 0 0
## 60 0 0 0 0
## 61 0 1 0 0
## 70 0 0 0 0
## 71 0 0 1 0
## 80 0 0 0 0
## 81 0 0 0 1
## 90 0 0 0 0
## 91 0 0 0 0
## Days9:lunch1
## 00 0
## 01 0
## 10 0
## 11 0
## 20 0
## 21 0
## 30 0
## 31 0
## 40 0
## 41 0
## 50 0
## 51 0
## 60 0
## 61 0
## 70 0
## 71 0
## 80 0
## 81 0
## 90 0
## 91 1
The first column summarises the interaction between day and lunch: we have Day 0 with lunch 0 and 1, day 1 with lunch 0 and 1… and so on. The other columns contains zeros and ones that represent the specification of the factor combination (day 0 and lunch 0, for example) in the extended model matrix. Now we can specify which comparisons we want to compute. In this case, we want to compare all days with lunch with day 1 with lunch:
contrasts <- rbind(group["11",] - group["21",],
group["11",] - group["31",],
group["11",] - group["41",],
group["11",] - group["51",],
group["11",] - group["61",],
group["11",] - group["71",],
group["11",] - group["81",],
group["11",] - group["91",])
head(contrasts)
## Days0 Days1 Days2 Days3 Days4 Days5 Days6 Days7 Days8 Days9
## 11 0 1 -1 0 0 0 0 0 0 0
## 111 0 1 0 -1 0 0 0 0 0 0
## 112 0 1 0 0 -1 0 0 0 0 0
## 113 0 1 0 0 0 -1 0 0 0 0
## 114 0 1 0 0 0 0 -1 0 0 0
## 115 0 1 0 0 0 0 0 -1 0 0
## lunch1 Days1:lunch1 Days2:lunch1 Days3:lunch1 Days4:lunch1
## 11 0 1 -1 0 0
## 111 0 1 0 -1 0
## 112 0 1 0 0 -1
## 113 0 1 0 0 0
## 114 0 1 0 0 0
## 115 0 1 0 0 0
## Days5:lunch1 Days6:lunch1 Days7:lunch1 Days8:lunch1
## 11 0 0 0 0
## 111 0 0 0 0
## 112 0 0 0 0
## 113 -1 0 0 0
## 114 0 -1 0 0
## 115 0 0 -1 0
## Days9:lunch1
## 11 0
## 111 0
## 112 0
## 113 0
## 114 0
## 115 0
library(multcomp)
# Transform into Matrix
contrast.matrix <- rbind("Lunch on Day 1 versus Lunch on Day 2"=as.numeric(contrasts[1,]),
"Lunch on Day 1 versus Lunch on Day 3"=as.numeric(contrasts[2,]),
"Lunch on Day 1 versus Lunch on Day 4"=as.numeric(contrasts[3,]),
"Lunch on Day 1 versus Lunch on Day 5"=as.numeric(contrasts[4,]),
"Lunch on Day 1 versus Lunch on Day 6"=as.numeric(contrasts[5,]),
"Lunch on Day 1 versus Lunch on Day 7"=as.numeric(contrasts[6,]),
"Lunch on Day 1 versus Lunch on Day 8"=as.numeric(contrasts[7,]),
"Lunch on Day 1 versus Lunch on Day 9"=as.numeric(contrasts[8,]))
comparisons <- summary(glht(m, contrast.matrix))
comparisons
##
## Simultaneous Tests for General Linear Hypotheses
##
## Fit: lmer(formula = Reaction ~ Days * lunch - 1 + (1 | Subject), data = sleepstudy)
##
## Linear Hypotheses:
## Estimate Std. Error
## Lunch on Day 1 versus Lunch on Day 2 == 0 37.809 19.044
## Lunch on Day 1 versus Lunch on Day 3 == 0 -8.952 14.305
## Lunch on Day 1 versus Lunch on Day 4 == 0 -9.822 19.010
## Lunch on Day 1 versus Lunch on Day 5 == 0 -45.119 14.288
## Lunch on Day 1 versus Lunch on Day 6 == 0 -64.212 18.708
## Lunch on Day 1 versus Lunch on Day 7 == 0 -49.435 13.972
## Lunch on Day 1 versus Lunch on Day 8 == 0 -41.704 19.020
## Lunch on Day 1 versus Lunch on Day 9 == 0 -72.510 14.311
## z value Pr(>|z|)
## Lunch on Day 1 versus Lunch on Day 2 == 0 1.985 0.26666
## Lunch on Day 1 versus Lunch on Day 3 == 0 -0.626 0.99453
## Lunch on Day 1 versus Lunch on Day 4 == 0 -0.517 0.99851
## Lunch on Day 1 versus Lunch on Day 5 == 0 -3.158 0.01178 *
## Lunch on Day 1 versus Lunch on Day 6 == 0 -3.432 0.00460 **
## Lunch on Day 1 versus Lunch on Day 7 == 0 -3.538 0.00307 **
## Lunch on Day 1 versus Lunch on Day 8 == 0 -2.193 0.17255
## Lunch on Day 1 versus Lunch on Day 9 == 0 -5.067 < 0.001 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## (Adjusted p values reported -- single-step method)
We can see from this output that, even if the p-values of the interaction effects were not significant, the effect of lunch in different days is different. In this example, The difference in Reaction progressively grows and becomes significantly different from Day 1 starting from Day 5.
Let’s format the results of the contrasts test into a table:
pq <- comparisons$test
mtests <- data.frame(pq$coefficients, pq$sigma, pq$tstat, pq$pvalues)
error <- attr(pq$pvalues, "error")
pname <- switch(comparisons$alternativ,
less = paste("Pr(<", ifelse(comparisons$df ==0, "z", "t"), ")",
sep = ""),
greater = paste("Pr(>", ifelse(comparisons$df == 0, "z", "t"), ")",
sep = ""),
two.sided = paste("Pr(>|", ifelse(comparisons$df == 0, "z", "t"), "|)",
sep = ""))
colnames(mtests) <- c("Estimate", "Std.Error", ifelse(comparisons$df ==0,
"zvalue", "tvalue"), pname)
library(dplyr)
mtests <- mtests %>%
add_rownames("Parameters") %>%
mutate(`Pr(>|z|)`=ifelse(`Pr(>|z|)`< 0.001,
"< 0.001",
ifelse(`Pr(>|z|)` < 0.01,
"< 0.01",
ifelse(`Pr(>|z|)` < 0.05,
"< 0.05",
paste(round(`Pr(>|z|)`,
4)))))) %>%
mutate_if(is.numeric, funs(round(.,
digits=2)))
kable(mtests)
Parameters | Estimate | Std.Error | zvalue | Pr(>|z|) |
---|---|---|---|---|
Lunch on Day 1 versus Lunch on Day 2 | 37.81 | 19.04 | 1.99 | 0.2667 |
Lunch on Day 1 versus Lunch on Day 3 | -8.95 | 14.31 | -0.63 | 0.9945 |
Lunch on Day 1 versus Lunch on Day 4 | -9.82 | 19.01 | -0.52 | 0.9985 |
Lunch on Day 1 versus Lunch on Day 5 | -45.12 | 14.29 | -3.16 | < 0.05 |
Lunch on Day 1 versus Lunch on Day 6 | -64.21 | 18.71 | -3.43 | < 0.01 |
Lunch on Day 1 versus Lunch on Day 7 | -49.43 | 13.97 | -3.54 | < 0.01 |
Lunch on Day 1 versus Lunch on Day 8 | -41.70 | 19.02 | -2.19 | 0.1726 |
Lunch on Day 1 versus Lunch on Day 9 | -72.51 | 14.31 | -5.07 | < 0.001 |