Describe the importance of having sufficient sample size for scientific research
Describe conceptually the steps for sample size planning: precision analysis and power analysis
Perform power analysis for MLM using the PowerUpR application and the simr package
Understand the effect of uncertainty in parameter values and explore alternative approaches for sample size planning
Assume true effect is γ01=0.10
Let's say
Add the 0 line, the 0.1 line, and the cutoff lines
Write down your model equations
List out all parameters in the model
Determine if you want to achieve a desired level of
a. Power, or
b. Precision
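Level 1:
$$Y_{ij} = \beta_{0j} + \beta_{1j} \text{X\_cmc}_{ij} + e_{ij}, \quad e_{ij} \sim N(0, \sigma)$$
Level 2:
$$\begin{aligned}
\beta_{0j} &= \gamma_{00} + \gamma_{01} W_j + u_{0j} \\
\beta_{1j} &= \gamma_{10} + \gamma_{11} W_j + u_{1j} \\
\begin{bmatrix} u_{0j} \\ u_{1j} \end{bmatrix} &\sim N\left(\begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \tau_0^2 & \\ \tau_{01} & \tau_1^2 \end{bmatrix}\right)
\end{aligned}$$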
Fixed effects: γ00, γ01, γ10, γ11
Random effects: $\tau_0^2$, $\tau_1^2$, $\tau_{01}$
Number of clusters: J
Cluster size: n
In the previous graph, when N=20, the sample estimate is likely to be anywhere between -0.4 and 0.6
$$SE \propto \frac{1}{\sqrt{N}}$$
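To make the inverse-square-root relation concrete, here is a minimal sketch. The function name `standard_error` and the value `sd = 1.0` are illustrative assumptions, not values taken from the graph.

```python
import math

def standard_error(sd, n):
    """Standard error of a sample mean: sd / sqrt(n)."""
    return sd / math.sqrt(n)

sd = 1.0  # hypothetical population SD, for illustration only
for n in (20, 80, 320):
    # Each fourfold increase in N cuts the SE in half
    print(n, round(standard_error(sd, n), 4))
```

The practical consequence is that halving the SE requires four times the sample size.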
One goal of sample size planning is to
Have sufficient sample size to get precise (low SE) sample estimates of an effect
J = Number of clusters; n = Cluster size
Assuming τ01=0
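$$SE(\gamma_{01}) = \sqrt{\frac{1}{S_W^2}\left(\frac{\tau_0^2}{J} + \frac{\sigma^2}{Jn}\right)}$$
$$SE(\gamma_{10}) = \sqrt{\frac{\tau_1^2}{J} + \frac{\sigma^2}{Jn S_X^2}}$$
$$SE(\gamma_{11}) = \sqrt{\frac{1}{S_W^2}\left(\frac{\tau_1^2}{J} + \frac{\sigma^2}{Jn S_X^2}\right)}$$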
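The three standard error formulas can be written as small functions. This is a sketch, assuming that S²_W is the variance of the cluster-level predictor W and S²_X is the variance of the cluster-mean-centered predictor X. The function names and all input values below are hypothetical and only for illustration.

```python
import math

def se_gamma01(tau0_sq, sigma_sq, J, n, s_w_sq):
    """SE of the cluster-level predictor effect (gamma_01), assuming tau_01 = 0."""
    return math.sqrt((tau0_sq / J + sigma_sq / (J * n)) / s_w_sq)

def se_gamma10(tau1_sq, sigma_sq, J, n, s_x_sq):
    """SE of the average within-cluster slope (gamma_10)."""
    return math.sqrt(tau1_sq / J + sigma_sq / (J * n * s_x_sq))

def se_gamma11(tau1_sq, sigma_sq, J, n, s_x_sq, s_w_sq):
    """SE of the cross-level interaction (gamma_11)."""
    return math.sqrt((tau1_sq / J + sigma_sq / (J * n * s_x_sq)) / s_w_sq)

# Hypothetical inputs (not taken from the slides)
J, n = 50, 20
tau0_sq, tau1_sq, sigma_sq = 0.25, 0.10, 0.75
s_x_sq, s_w_sq = 1.0, 0.25

print(se_gamma01(tau0_sq, sigma_sq, J, n, s_w_sq))
print(se_gamma10(tau1_sq, sigma_sq, J, n, s_x_sq))
print(se_gamma11(tau1_sq, sigma_sq, J, n, s_x_sq, s_w_sq))
```

In every formula J divides all terms, while n divides only the level-1 variance term.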
Group-based therapy for eating disorder (cluster-randomized trial)
Intervention at group level
10 participants per group
Outcome standardized (i.e., SD = $\sqrt{\tau_0^2 + \sigma^2} = 1$)
ICC = .3 (i.e., $\tau_0^2 = .3$)
Goal: estimate J such that $SE(\gamma_{01}) \le .1$
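When the predictor is binary (e.g., treatment-control) and half of the groups are in each condition, $S_W^2 = 0.25$
E.g., if J = 30,
$$SE(\gamma_{01}) = \sqrt{\frac{1}{S_W^2}\left(\frac{\tau_0^2}{J} + \frac{\sigma^2}{Jn}\right)} = \sqrt{\frac{1}{0.25}\left(\frac{0.3}{30} + \frac{0.7}{(30)(10)}\right)} = 0.2221111$$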
Keep trying, and you'll find ...
When J = 148, SE(γ01)=0.1
So you'll need 148 groups (74 treatment, 74 control)
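The trial-and-error search can be automated. Below is a minimal sketch using the values from this example (τ²₀ = .3, σ² = .7, n = 10, S²_W = 0.25). The function names, the small numerical tolerance, and the choice to step through even J only (to keep the two arms balanced) are assumptions made for illustration.

```python
import math

def se_gamma01(J, n=10, tau0_sq=0.3, sigma_sq=0.7, s_w_sq=0.25):
    """SE of the treatment effect for J clusters of size n."""
    return math.sqrt((tau0_sq / J + sigma_sq / (J * n)) / s_w_sq)

def min_clusters(target_se, tol=1e-9, max_J=100_000, **design):
    """Smallest even J whose SE is at or below target_se."""
    for J in range(2, max_J + 1, 2):  # even J keeps the two arms balanced
        # tol guards against floating-point error when SE equals the target
        if se_gamma01(J, **design) <= target_se + tol:
            return J
    raise ValueError("target SE not reached within max_J clusters")

print(round(se_gamma01(30), 4))  # 0.2221
print(min_clusters(0.1))         # 148
```

Because J divides both variance terms, the same answer follows in closed form: J = (τ²₀ + σ²/n) / (S²_W × SE²) = 0.37 / 0.0025 = 148.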
Two-tailed test, α=.05
$H_0: \gamma_{01} = 0$
Critical region: $\hat\gamma_{01} \le -0.45$ or $\hat\gamma_{01} \ge 0.45$
$H_1: \gamma_{01} = 0.3$
Power[1] $\approx P(\hat\gamma_{01} \le -0.45) + P(\hat\gamma_{01} \ge 0.45) = 0.2465731$
[1] In practice, we need to incorporate the sampling variability of the standard error as well, so this power calculation is only a rough approximation.
Two-tailed test, α=.05
$H_0: \gamma_{01} = 0$
Critical region: $\hat\gamma_{01} \le -0.2$ or $\hat\gamma_{01} \ge 0.2$
$H_1: \gamma_{01} = 0.3$
Power $\approx P(\hat\gamma_{01} \le -0.2) + P(\hat\gamma_{01} \ge 0.2) = 0.8461551$
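The two power values can be approximated by treating the estimate as normally distributed around the true effect under H1. This is a rough sketch. It assumes the first scenario corresponds to SE ≈ 0.2221 (J = 30) and the second to SE = 0.1 (J = 148), which the slides do not state explicitly, and it uses the rounded critical values shown. The results will therefore be close to, but not identical to, the reported figures.

```python
from statistics import NormalDist

def approx_power(true_effect, se, critical_value):
    """P(estimate falls in the two-sided critical region) under H1."""
    sampling_dist = NormalDist(mu=true_effect, sigma=se)
    lower_tail = sampling_dist.cdf(-critical_value)
    upper_tail = 1 - sampling_dist.cdf(critical_value)
    return lower_tail + upper_tail

# Assumed SEs: taken from the J = 30 and J = 148 precision examples
power_small_j = approx_power(true_effect=0.3, se=0.2221, critical_value=0.45)
power_large_j = approx_power(true_effect=0.3, se=0.1, critical_value=0.2)

print(power_small_j)
print(power_large_j)
```

Nearly all of the power comes from the upper tail; the lower tail contributes almost nothing when the true effect is positive.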
Stand-alone programs
R packages
simr
Spreadsheet/Webapp
See more discussion in Arend & Schäfer (2019)
Simulate a large number (e.g., R = 1,000) of data sets based on a given effect size, ICC, etc.
Fit an MLM to each simulated data set
Power ≈ Proportion of times p < α
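The three simulation steps can be sketched in plain Python for the cluster-randomized example (γ01 = 0.3, τ²₀ = .3, σ² = .7, n = 10). Note the substitution: instead of fitting an MLM, this sketch tests the difference in cluster means with a large-sample z-test, which is only a stand-in that behaves similarly for balanced designs with many clusters. All function names, the seed, and the default J = 148 are assumptions made for illustration.

```python
import random
from statistics import NormalDist, variance

def simulate_once(rng, J, n, gamma01, tau0_sq, sigma_sq):
    """Simulate one balanced trial; return (effect estimate, its SE)."""
    cluster_means = {0: [], 1: []}
    for j in range(J):
        w = j % 2                                # alternate control / treatment
        u0 = rng.gauss(0, tau0_sq ** 0.5)        # cluster random intercept
        y = [gamma01 * w + u0 + rng.gauss(0, sigma_sq ** 0.5) for _ in range(n)]
        cluster_means[w].append(sum(y) / n)
    ctrl, trt = cluster_means[0], cluster_means[1]
    est = sum(trt) / len(trt) - sum(ctrl) / len(ctrl)
    se = (variance(trt) / len(trt) + variance(ctrl) / len(ctrl)) ** 0.5
    return est, se

def simulated_power(R=1000, J=148, n=10, gamma01=0.3,
                    tau0_sq=0.3, sigma_sq=0.7, alpha=0.05, seed=1):
    """Proportion of R simulated data sets in which H0 is rejected."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # normal, not t, critical value
    rejections = 0
    for _ in range(R):
        est, se = simulate_once(rng, J, n, gamma01, tau0_sq, sigma_sq)
        rejections += abs(est / se) > z_crit
    return rejections / R

power = simulated_power()
print(power)
```

Setting `gamma01=0` turns the same function into a check of the Type I error rate, which should land near α.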
simr
In the PowerUpR demo, to calculate the number of clusters J needed to achieve 80% power, we determined
g2, r21, r22 = 0, as we did not include any covariates
p = .5, for a balanced design (half treatment, half control)
However, we need to guess the values of
The more uncertainty about a parameter value we have but ignore, the more power we lose in our study (red curve)
Uncertainty in both effect size and ICC can further reduce our power
The more uncertainty we have, the more samples we need to achieve 80% power
Incorporates uncertainty for sample size planning
Instead of plugging in a point value of a guess, we can specify how much uncertainty we have (e.g., standard error of γ01 from a previous study)
$$\delta \sim N(.3, .1)$$
$$\rho \sim \text{Beta}(a, b)$$
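One way to see the effect of these priors is to average power over random draws of δ and ρ, giving an expected power. The sketch below makes several assumptions: the .1 in N(.3, .1) is read as a standard deviation, a = 6 and b = 14 are hypothetical values chosen so that the Beta mean is .3, the design is J = 148 and n = 10 with a standardized outcome, and power uses a normal approximation. The function names are illustrative.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()

def power_crt(delta, rho, J=148, n=10, p=0.5, alpha=0.05):
    """Normal-approximation power for a two-level cluster-randomized trial."""
    # Standardized outcome: tau0^2 = rho, sigma^2 = 1 - rho, S_W^2 = p(1 - p)
    se = ((rho + (1 - rho) / n) / (J * p * (1 - p))) ** 0.5
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return STD_NORMAL.cdf(delta / se - z_crit) + STD_NORMAL.cdf(-delta / se - z_crit)

def expected_power(draws=5000, a=6, b=14, seed=1, **design):
    """Average power over delta ~ N(.3, SD .1) and rho ~ Beta(a, b)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        delta = rng.gauss(0.3, 0.1)    # assumption: .1 is an SD
        rho = rng.betavariate(a, b)    # hypothetical a, b with mean .3
        total += power_crt(delta, rho, **design)
    return total / draws

plug_in = power_crt(delta=0.3, rho=0.3)
averaged = expected_power()
print(plug_in, averaged)
```

Under these assumptions the averaged power falls below the plug-in power, so a J chosen from point guesses alone tends to be too small.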
Increasing J usually leads to higher power than increasing n
Balanced designs generally have higher power than unbalanced designs
Larger sample size required for testing level-2 predictors
Testing an interaction requires a much larger sample size
Doubling J is better than doubling n
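A quick numerical check of the J versus n comparison, using the eating-disorder example (τ²₀ = .3, σ² = .7, S²_W = 0.25, true effect 0.3) and a normal approximation for power. The baseline of J = 30 and n = 10 and the function names are assumptions made for illustration.

```python
import math
from statistics import NormalDist

def se_gamma01(J, n, tau0_sq=0.3, sigma_sq=0.7, s_w_sq=0.25):
    """SE of the cluster-level treatment effect."""
    return math.sqrt((tau0_sq / J + sigma_sq / (J * n)) / s_w_sq)

def approx_power(effect, se, alpha=0.05):
    """Two-sided power under a normal approximation."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    return z.cdf(effect / se - z_crit) + z.cdf(-effect / se - z_crit)

designs = {
    "baseline (J=30, n=10)": (30, 10),
    "double n (J=30, n=20)": (30, 20),
    "double J (J=60, n=10)": (60, 10),
}
results = {}
for label, (J, n) in designs.items():
    se = se_gamma01(J, n)
    results[label] = (se, approx_power(0.3, se))
    print(label, round(se, 3), round(results[label][1], 3))
```

Doubling n shrinks only the σ²/(Jn) term, whereas doubling J shrinks both terms, which is why the gain from adding clusters is larger whenever the ICC is not close to zero.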