simr.Rmd


# Beyond Superpower I: Mixed Models with `simr`

While the ANOVA is a flexible tool, there are many hypotheses that cannot be tested with an ANOVA. One of the most common situations is the introduction of "random" effects (i.e., mixed models), or the distribution of the dependent variable may not be "normal" (e.g., it may be a binary, 0 or 1, response). For these cases the `simr` R package can be very useful [@Green2016]. This is a flexible power analysis tool that can be used for linear mixed models *and* for generalized linear mixed models. Unfortunately, the documentation for this package, in the way of informative vignettes for new users, is sorely lacking. Therefore, we have added some basic information on how to perform power analyses from scratch using this package.

## How does `simr` work?

The `simr` package is quite elegant in its simplicity. Essentially, it builds a model using the `makeLmer` or `makeGlmer` functions and then simulates data (using `lme4` under-the-hood) to estimate the power of the model given the specified parameters. As a user all you have to do is specify the data and the variance components. So, the tricky part is that a user *must* have a dataset for the `simr` package to work with. However, this can be randomly generated data.

For example, from the `simr` vignettes, we can imagine a study where a researcher has three levels for a random effect. For a basic scientist this could be different Petri dishes of cells/tissue, or, for a education researcher, this could be different schools. We then have a variable, "x", and we want to see our power to detect this effect on outcome "y". The data we need for `simr` could be generated with the following code.

```{r warning=FALSE, message=FALSE}
library(simr)
# Some covariate that we have measured at 10 different levels
x <- rep(1:10)
# Three random grouping levels 
# (e.g., Petri dishes)
g <- c('a', 'b', 'c')

# Combine into 1 dataset
X <- expand.grid(x=x, g=g)
```


Now we can define our model a little bit further by assigned the fixed and random effects. For this model we need to provide 2 slope coefficients, and the random intercept variance.

```{r}
# fixed intercept and slope for the model
b <- c(2, -0.1) 
# random intercept *variance*
V1 <- 0.5 

# residual variance
s <- 1 
```

We then need to build a mixed model using the `makeLmer` function. In this case we define a new outcome variable (`y`) then assign a fixed effect for `x`, and provide the random effect structure as random intercept only (`(1|g)`). The other model information we created above is also passed onto this function. We can print the model information to verify that the details are correct.

```{r}
model1 <- makeLmer(y ~ x + (1|g),
                   fixef=b, 
                   VarCorr=V1, 
                   sigma=s, 
                   data=X)
print(model1)
```

Now we can pass this information onto the `powerSim` function for a power analysis. Please note that the total number of observations and data structure (3 levels for the random effect) will be the same in these simulations.

```{r warning=FALSE, message=FALSE}
sim1 = powerSim(model1, 
                nsim=20, 
                progress=FALSE)
```

```{r}
sim1
```


## To Be Continued...