$$\newcommand{\bv}[1]{\boldsymbol{\mathbf{#1}}}$$
Describe the importance of having sufficient sample size for scientific research
Describe conceptually the steps for sample size planning: precision analysis and power analysis
Perform power analysis for MLM using the PowerUpR application and the simr package
Understand the effect of uncertainty in parameter values and explore alternative approaches for sample size planning
Assume true effect is \(\gamma_{01} = 0.10\)
Let's say
Add the 0 line, the 0.1 line, and the cutoff lines
Write down your model equations
List out all parameters in the model
Determine if you want to achieve a desired level of
a. Power, or
b. Precision
Level-1 $$Y_{ij} = \beta_{0j} + \beta_{1j} X\_\text{cmc}_{ij} + e_{ij}$$ $$e_{ij} \sim N(0, \sigma^2)$$ Level-2 $$ \begin{aligned} \beta_{0j} & = \gamma_{00} + \gamma_{01} W_j + u_{0j} \\ \beta_{1j} & = \gamma_{10} + \gamma_{11} W_j + u_{1j} \\ \begin{bmatrix} u_{0j} \\ u_{1j} \end{bmatrix} & \sim N\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \tau^2_0 & \\ \tau_{01} & \tau^2_{1} \end{bmatrix} \right) \end{aligned} $$
Fixed effects: \(\gamma_{00}\), \(\gamma_{01}\), \(\gamma_{10}\), \(\gamma_{11}\)
Random-effect variances and covariance: \(\tau^2_{0}\), \(\tau^2_{1}\), \(\tau_{01}\) (plus the level-1 error variance \(\sigma^2\))
Number of clusters: \(J\)
Cluster size: \(n\)
In the previous graph, when \(N = 20\), the sample estimate is likely to be anywhere between -0.4 and 0.6
$$SE \propto \frac{1}{\sqrt{N}}$$
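As a quick illustration of this proportionality (a worked consequence of the relation above, not an additional result), quadrupling the sample size halves the standard error:
$$\frac{SE_{4N}}{SE_{N}} = \frac{1/\sqrt{4N}}{1/\sqrt{N}} = \frac{1}{2}$$
So halving the width of the plausible range for a sample estimate takes roughly four times as many observations.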
One goal of sample size planning is to have a sufficient sample size to get precise (low SE) sample estimates of an effect
\(J\) = Number of clusters; \(n\) = Cluster size
Assuming \(\tau_{01} = 0\)
$$\begin{aligned} \mathit{SE}(\gamma_{01}) & = \sqrt{\frac{1}{S^2_W}\left(\frac{\tau^2_0}{J} + \frac{\sigma^2}{Jn}\right)} \\ \mathit{SE}(\gamma_{10}) & = \sqrt{\frac{\tau^2_1}{J} + \frac{\sigma^2}{JnS^2_X}} \\ \mathit{SE}(\gamma_{11}) & = \sqrt{\frac{1}{S^2_W}\left(\frac{\tau^2_1}{J} + \frac{\sigma^2}{JnS^2_X}\right)} \end{aligned}$$
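A minimal sketch of the three formulas above in Python, assuming \(\tau_{01} = 0\) as stated. The function and argument names (`se_gamma01`, `tau2_0`, `s2_w`, and so on) are made up for illustration and do not come from any package, and the input values are hypothetical.

```python
from math import sqrt


def se_gamma01(J, n, tau2_0, sigma2, s2_w):
    """SE of the level-2 main effect (gamma_01)."""
    return sqrt((tau2_0 / J + sigma2 / (J * n)) / s2_w)


def se_gamma10(J, n, tau2_1, sigma2, s2_x):
    """SE of the level-1 slope (gamma_10)."""
    return sqrt(tau2_1 / J + sigma2 / (J * n * s2_x))


def se_gamma11(J, n, tau2_1, sigma2, s2_w, s2_x):
    """SE of the cross-level interaction (gamma_11)."""
    return sqrt((tau2_1 / J + sigma2 / (J * n * s2_x)) / s2_w)


# Hypothetical inputs, chosen only to show how the SE scales with J
base = se_gamma01(J=20, n=10, tau2_0=0.3, sigma2=0.7, s2_w=0.25)
quad = se_gamma01(J=80, n=10, tau2_0=0.3, sigma2=0.7, s2_w=0.25)
print(base, quad)  # quadrupling J halves the SE
```

Because every term under the square root is divided by \(J\), all three standard errors shrink at the rate \(1/\sqrt{J}\), whereas \(n\) only affects the \(\sigma^2\) term.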
Group-based therapy for eating disorder (cluster-randomized trial)
Intervention at group level
10 participants per group
Outcome standardized (i.e., SD = \(\sqrt{\tau^2_0 + \sigma^2} = 1\))
ICC = .3 (i.e., \(\tau^2_0 = .3\))
Goal: estimate \(J\) such that \(\mathit{SE}(\gamma_{01}) \leq .1\)
When the predictor is binary (e.g., treatment-control) and half of the groups are in each condition, \(S^2_W = 0.25\)
E.g., if \(J = 30\) $$\mathit{SE}(\gamma_{01}) = \sqrt{\frac{1}{S^2_W}\left(\frac{\tau^2_0}{J} + \frac{\sigma^2}{Jn}\right)} = \sqrt{\frac{1}{0.25}\left(\frac{0.3}{30} + \frac{0.7}{(30)(10)}\right)} = 0.2221111$$
Keep trying, and you'll find ...
When \(J\) = 148, \(\mathit{SE}(\gamma_{01}) = 0.1\)
So you'll need 148 groups (74 treatment, 74 control)
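Rather than trying values of \(J\) one at a time, the SE formula can be solved for \(J\) directly. The sketch below does that for the example above; `clusters_for_target_se` is a hypothetical helper name, and exact fractions are used only to avoid floating-point noise around whole numbers.

```python
from fractions import Fraction
from math import ceil


def clusters_for_target_se(target_se, n, tau2_0, sigma2, s2_w):
    """Smallest J with SE(gamma_01) <= target_se."""
    # SE^2 = (tau2_0 + sigma2 / n) / (s2_w * J)
    # => J = (tau2_0 + sigma2 / n) / (s2_w * SE^2)
    j_exact = (tau2_0 + sigma2 / n) / (s2_w * target_se ** 2)
    return ceil(j_exact)


# Values from the eating-disorder example: ICC = .3, n = 10, S^2_W = 0.25
J = clusters_for_target_se(
    target_se=Fraction(1, 10),
    n=10,
    tau2_0=Fraction(3, 10),
    sigma2=Fraction(7, 10),
    s2_w=Fraction(1, 4),
)
print(J)  # 148
```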
Two-tailed test, \(\alpha = .05\)
\(H_0: \gamma_{01} = 0\)
Critical region: \(\hat \gamma_{01} \leq -0.45\) or \(\hat \gamma_{01} \geq 0.45\)
\(H_1: \gamma_{01} = 0.3\)
Power[1] \(\approx P(\hat \gamma_{01} \leq -0.45) + P(\hat \gamma_{01} \geq 0.45) = 0.2465731\)
[1] In practice, we need to incorporate the sampling variability of the standard error as well, so this power calculation is only a rough approximation.
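One way to write the two tail probabilities explicitly, under a normal approximation for \(\hat \gamma_{01}\). The symbols \(c\) (the critical value, 0.45 here), \(\gamma_{01}^{*}\) (the effect assumed under \(H_1\)), and \(\Phi\) (the standard normal CDF) are notation introduced for this sketch:
$$\text{Power} \approx \Phi\left(\frac{-c - \gamma_{01}^{*}}{\mathit{SE}(\gamma_{01})}\right) + 1 - \Phi\left(\frac{c - \gamma_{01}^{*}}{\mathit{SE}(\gamma_{01})}\right)$$
The first term is close to zero when the true effect is positive, so nearly all of the power comes from the upper tail.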
Two-tailed test, \(\alpha = .05\)
\(H_0: \gamma_{01} = 0\)
Critical region: \(\hat \gamma_{01} \leq -0.2\) or \(\hat \gamma_{01} \geq 0.2\)
\(H_1: \gamma_{01} = 0.3\)
Power \(\approx P(\hat \gamma_{01} \leq -0.2) + P(\hat \gamma_{01} \geq 0.2) = 0.8461551\)
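The two power values can be roughly reproduced with a normal approximation. This sketch assumes that the ±0.45 critical region goes with \(\mathit{SE} = 0.2221111\) (\(J = 30\)) and the ±0.2 region with \(\mathit{SE} = 0.1\) (\(J = 148\)) from the precision example; that pairing is an assumption. Because it uses the rounded cutoffs and a normal reference distribution, its results will not match the quoted values exactly.

```python
from statistics import NormalDist


def approx_power(effect, se, crit):
    """P(estimate <= -crit) + P(estimate >= crit) when estimate ~ N(effect, se^2)."""
    sampling_dist = NormalDist(mu=effect, sigma=se)
    return sampling_dist.cdf(-crit) + (1 - sampling_dist.cdf(crit))


# ASSUMPTION: SEs paired with the critical regions as described in the lead-in
power_small = approx_power(effect=0.3, se=0.2221111, crit=0.45)  # J = 30
power_large = approx_power(effect=0.3, se=0.1, crit=0.2)         # J = 148
print(round(power_small, 3), round(power_large, 3))
```

The larger design shrinks both the standard error and the cutoff, so the assumed effect of 0.3 falls well inside the rejection region far more often.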
Stand-alone programs
R packages
simr
Spreadsheet/Webapp
See more discussion in Arend & Schäfer (2019)
Simulate a large number (e.g., \(R\) = 1,000) of data sets based on given effect size, ICC, etc.
Fit an MLM to each simulated data set
Power \(\approx\) Proportion of times \(p < \alpha\)
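The three steps above can be sketched in plain Python for the balanced cluster-randomized example (ICC = .3, \(n = 10\), effect = 0.3). To stay short, this sketch swaps in a simpler analysis: instead of fitting an MLM, it draws cluster means directly and tests the difference between arms with a normal-reference test, which is only a stand-in for the model-based test. A real analysis would fit the multilevel model to each data set. `simulate_power` and its arguments are hypothetical names.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, variance


def simulate_power(J, n, effect, tau2_0, sigma2, alpha=0.05, reps=1000, seed=1):
    """Monte Carlo power for a balanced two-arm cluster-randomized design."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # A cluster mean varies around its arm mean with variance tau2_0 + sigma2 / n
    sd_cluster_mean = sqrt(tau2_0 + sigma2 / n)
    half = J // 2
    rejections = 0
    for _ in range(reps):
        # Step 1: simulate one data set (summarized as cluster means)
        control = [rng.gauss(0.0, sd_cluster_mean) for _ in range(half)]
        treated = [rng.gauss(effect, sd_cluster_mean) for _ in range(half)]
        # Step 2: estimate the effect and its SE (stand-in for fitting an MLM)
        diff = mean(treated) - mean(control)
        se = sqrt(variance(treated) / half + variance(control) / half)
        # Step 3: count a rejection when |z| exceeds the cutoff (p < alpha)
        if abs(diff / se) > z_crit:
            rejections += 1
    return rejections / reps


power = simulate_power(J=148, n=10, effect=0.3, tau2_0=0.3, sigma2=0.7)
print(power)
```

The estimate carries Monte Carlo error, so a larger `reps` gives a more stable proportion at the cost of run time.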
simr
In the PowerUpR demo, to calculate the number of clusters \(J\) needed to achieve 80% power, we determined
g2, r21, r22 = 0, as we did not include any covariates
p = .5, for a balanced design (half treatment, half control)
However, we need to guess the values of
The more uncertainty about a parameter value we ignore, the more power we lose in our study (red curve)
Uncertainty in both effect size and ICC can further reduce our power
The more uncertainty we have, the more samples we need to achieve 80% power
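To see how much the required \(J\) depends on the guessed inputs, the sketch below uses a normal-approximation formula for a standardized outcome (total variance 1) in a two-arm cluster-randomized design. `clusters_for_power` is a hypothetical helper, not a PowerUpR function, and dedicated software may give somewhat different numbers.

```python
from math import ceil
from statistics import NormalDist


def clusters_for_power(effect, icc, n, p=0.5, alpha=0.05, power=0.80):
    """Approximate J for a two-tailed test of a standardized treatment effect."""
    z = NormalDist()
    multiplier = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    # SE^2 = unit_var / J, with total outcome variance fixed at 1
    unit_var = (icc + (1 - icc) / n) / (p * (1 - p))
    # Need effect / SE >= multiplier, i.e. J >= multiplier^2 * unit_var / effect^2
    return ceil(multiplier ** 2 * unit_var / effect ** 2)


# Grid of hypothetical guesses around effect = .3 and ICC = .3
for effect in (0.2, 0.3, 0.4):
    for icc in (0.2, 0.3, 0.4):
        print(effect, icc, clusters_for_power(effect, icc, n=10))
```

Since the effect size enters as a squared term in the denominator, a modest overestimate of the effect can leave a study with far fewer clusters than it needs.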
Incorporates uncertainty for sample size planning
Instead of plugging in a point value of a guess, we can specify how much uncertainty we have (e.g., standard error of \(\gamma_{01}\) from a previous study)
$$\delta \sim N(.3, .1) \quad \rho \sim \text{Beta}(a, b)$$
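A rough sketch of averaging power over these distributions rather than plugging in point guesses. Several things here are assumptions: the .1 in \(N(.3, .1)\) is read as a standard deviation, \(a = 3\) and \(b = 7\) are hypothetical values (giving a mean ICC of .3), power is computed with a normal approximation, and the function names are made up.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

STD_NORMAL = NormalDist()


def approx_power(effect, icc, J, n, p=0.5, alpha=0.05):
    """Normal-approximation power for a standardized cluster-level effect."""
    se = sqrt((icc + (1 - icc) / n) / (p * (1 - p) * J))
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    lower_tail = STD_NORMAL.cdf(-z_crit - effect / se)
    upper_tail = 1 - STD_NORMAL.cdf(z_crit - effect / se)
    return lower_tail + upper_tail


def expected_power(J, n, draws=5000, seed=1):
    """Average power over random draws of the effect size and ICC."""
    rng = random.Random(seed)
    powers = []
    for _ in range(draws):
        effect = rng.gauss(0.3, 0.1)  # ASSUMPTION: .1 treated as an SD
        icc = rng.betavariate(3, 7)   # ASSUMPTION: hypothetical a = 3, b = 7
        powers.append(approx_power(effect, icc, J, n))
    return mean(powers)


point = approx_power(0.3, 0.3, J=148, n=10)  # power at the point guesses
averaged = expected_power(J=148, n=10)       # power averaged over uncertainty
print(round(point, 3), round(averaged, 3))
```

Comparing the two printed numbers shows how much of the nominal power depends on the point guesses being exactly right.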
Increasing \(J\) usually leads to higher power than increasing \(n\)
Balanced designs generally have higher power than unbalanced designs
Larger sample size required for testing level-2 predictors
Testing an interaction requires a much larger sample size
Doubling \(J\) is better than doubling \(n\)
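The comparison between doubling \(J\) and doubling \(n\) can be checked with the \(\mathit{SE}(\gamma_{01})\) formula. The sketch below reuses the values from the ICC = .3 example; `se_gamma01` is a hypothetical helper name.

```python
from math import sqrt


def se_gamma01(J, n, tau2_0=0.3, sigma2=0.7, s2_w=0.25):
    """SE of a cluster-level effect; defaults follow the ICC = .3 example."""
    return sqrt((tau2_0 / J + sigma2 / (J * n)) / s2_w)


baseline = se_gamma01(J=30, n=10)
double_J = se_gamma01(J=60, n=10)  # twice as many clusters
double_n = se_gamma01(J=30, n=20)  # twice as many people per cluster
print(round(baseline, 3), round(double_J, 3), round(double_n, 3))
```

Doubling \(n\) only shrinks the \(\sigma^2/(Jn)\) term, while doubling \(J\) shrinks both terms, which is why the between-cluster variance \(\tau^2_0\) limits what larger clusters can buy.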