This vignette provides examples of some of the hypothesis tests that
can be specified in simr
. The function doTest
can be used to apply a test to an input model, which lets you check that
the test works before running a power simulation.
Documentation for the test specification functions can be found in
the help system at ?tests
.
library(simr)
The first example comes from the help page for glmer
.
The data frame cbpp
contains data on contagious bovine
pleuropneumonia. An observation variable is added to allow for
overdispersion. Note that the response is specified using
cbind
— simr
expects a binomial model to be in
this form.
$obs <- 1:nrow(cbpp)
cbpp<- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd) + (1|obs), data=cbpp,
gm1 family=binomial)
summary(gm1)$coef
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.500290 0.2967076 -5.056460 4.271108e-07
## period2 -1.226498 0.4802941 -2.553640 1.066033e-02
## period3 -1.328831 0.4938985 -2.690495 7.134610e-03
## period4 -1.866248 0.5935804 -3.144052 1.666258e-03
Note that period
is a categorical variable with 4
levels, which enters the model as 3 dummy variables. To test all 3 dummy
variables simultaneously, you can use a likelihood ratio test.
doTest(gm1, fixed("period", "lr"))
## boundary (singular) fit: see help('isSingular')
## p-value for predictor 'period': 0.003150249
## --------------------
## Test: Likelihood ratio
If you were (for some reason) especially interested in the
significance for the dummy variable period2
you could use a
z-test. This test uses the value Pr(>|z|)
reported in
the summary above.
doTest(gm1, fixed("period2", "z"))
## p-value for predictor 'period2': 0.01066033
## --------------------
## Test: z-test
## Effect size for period2 is -1.2
Suppose your model also has a continuous predictor. You can use
fixed
to choose which fixed effect to apply tests to.
<- glmer(cbind(incidence, size - incidence) ~ period + size + (1 | herd), data=cbpp,
gm2 family=binomial)
doTest(gm2, fixed("size", "z"))
## p-value for predictor 'size': 0.8633872
## --------------------
## Test: z-test
## Effect size for size is 0.0045
Once you have chosen your tests, you can run a power analysis by
replacing doTest
with powerSim
. Don’t forget
to specify an appropriate effect size.
fixef(gm2)["size"] <- 0.05
powerSim(gm2, fixed("size", "z"), nsim=50)
## Power for predictor 'size', (95% confidence interval):
## 58.00% (43.21, 71.81)
##
## Test: z-test
## Effect size for size is 0.050
##
## Based on 50 simulations, (0 warnings, 0 errors)
## alpha = 0.05, nrow = 112
##
## Time elapsed: 0 h 0 m 8 s
As your models become more complex, it can be easier to explicitly
specify your null hypothesis using the compare
functions.
This example uses the cake
dataset.
<- lmer(angle ~ recipe * temp + (1|recipe:replicate), data=cake, REML=FALSE) fm1
Main effects should not be tested when they appear in an interaction
term. Using the fcompare
function, we can specify a
comparison with a simpler model (without having to re-type the random
effects specification).
doTest(fm1, fcompare(~ recipe + temp))
## p-value for model comparison: 0.9586321
## --------------------
## Test: Likelihood ratio
## Comparison to ~recipe + temp + [re]
This also works for polynomial terms:
<- lmer(angle ~ recipe + poly(temp, 2) + (1|recipe:replicate), data=cake, REML=FALSE)
fm2 summary(fm2)$coef
## Estimate Std. Error t value
## (Intercept) 33.122222 1.677940 19.7398074
## recipeB -1.477778 2.372966 -0.6227555
## recipeC -1.522222 2.372966 -0.6414850
## poly(temp, 2)1 44.347549 4.537142 9.7743351
## poly(temp, 2)2 -2.586135 4.537142 -0.5699921
doTest(fm2, fcompare(~ recipe + temp))
## p-value for model comparison: 0.5688225
## --------------------
## Test: Likelihood ratio
## Comparison to ~recipe + temp + [re]
We can do similar things with the budworm
data in the
pbkrtest
package.
data(budworm, package="pbkrtest")
<- glm(cbind(ndead, ntotal-ndead) ~ dose*sex, family="binomial", data=budworm)
bw1 summary(bw1)$coef
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.7157796 0.32233098 -5.3230368 1.020491e-07
## dose 0.1156765 0.02378865 4.8626745 1.158102e-06
## sexmale -0.2119351 0.51523286 -0.4113384 6.808244e-01
## dose:sexmale 0.1815579 0.06691613 2.7132151 6.663383e-03
Of course we don’t want to retype the cbind
boilerplate:
doTest(bw1, compare(. ~ dose + sex))
## p-value for model comparison: 0.001741539
## --------------------
## Test: Likelihood ratio
## Comparison to . ~ dose + sex
Since dose
is continous and sex
is binary
we could also use a Z-test on the single interaction term.
doTest(bw1, fixed("dose:sexmale", "z"))
## p-value for predictor 'dose:sexmale': 0.006663383
## --------------------
## Test: z-test
The random
function gives you access to tests from the
RLRsim
package. No additional arguments are needed for a
single random effect.
<- lmer(Yield ~ 1|Batch, data=Dyestuff)
re1 doTest(re1, random())
## p-value for a single random effect: 0.0037
## --------------------
## Test: Exact restricted LRT (package RLRsim)
Where the model has multiple random effects, compare
can
be used to test alternative specifications.
<- lmer(Reaction ~ Days + (Days | Subject), data=sleepstudy) fm1
doTest(fm1, compare( ~ Days + (1 | Subject)))
## p-value for model comparison: 7.072413e-10
## --------------------
## Test: Likelihood ratio
## Comparison to ~Days + (1 | Subject)
The LRT is fast but only approximate. In fact, when testing random
effects, the test is conservative1 because the null hypothesis is at a
boundary of the parameter space. This means that you will underestimate
power if you use the LRT. For more accurate results you can use
compare
with a parametric bootstrap test from the
pbkrtest
package. These can be quite slow, so you may want
to use the LRT to exploring designs, and then double check with the
parametric bootstrap.
doTest(fm1, compare( ~ Days + (1 | Subject), "pb"))
Note that the shortcut rcompare
saves you from retyping
the fixed effect specification.
doTest(fm1, rcompare( ~ (1 | Subject), "pb"))
During a simulation study, some iterations may fail due to some sort
of error. When this happens, simr
treats that iteration as
a failed (i.e. not significant) test. In the following example there are
50 simulations, with 14 successes, 34 failures, and 2 non-results. The
power is calculated as 14/50, i.e. 28%:
<- glm(formula = cbind(z, 10 - z) ~ x + g, family = binomial, data = simdata)
binFit
<- glm(formula = z ~ x + g, family = poisson, data = simdata)
poisSim coef(poisSim)[1:2] <- c(1, -0.05)
powerSim(binFit, sim=poisSim, nsim=50, seed=1)
## Power for predictor 'x', (95% confidence interval):
## 28.00% (16.23, 42.49)
##
## Test: z-test
##
## Based on 50 simulations, (0 warnings, 2 errors)
## alpha = 0.05, nrow = 30
##
## Time elapsed: 0 h 0 m 0 s
Rather than interrupting part-way through an analysis,
simr
traps and logs errors and warnings. You can access
these logs using $warnings
and $errors
. If you
didn’t assign your analysis to a variable, you can recover it with the
lastResult
function.
<- lastResult()
ps $errors ps
## stage index message
## 1 Fitting 14 Value 1.04545 out of range (0, 1)
## 2 Fitting 33 Value 1.13636 out of range (0, 1)
See, e.g., Pinheiro, J.C., Bates D.M. (2000) Mixed-Effects Models in S and S-PLUS, Springer, New York (p84).↩︎