Statistical Power Resources: STEPP Center

1. Resources for Estimating ICCs and/or R²s in Education

1.1 Math and/or Reading/ELA Outcomes

Bloom, H. S., Richburg-Hayes, L., & Black, A. R. (2007). Using covariates to improve precision for studies that randomize schools to evaluate educational interventions. Educational Evaluation and Policy Analysis, 29(1), 30-59. https://doi.org/10.3102/0162373707299550

Fahle, E. M., & Reardon, S. F. (2018). How much do test scores vary among school districts? New estimates using population data, 2009–2015. Educational Researcher, 47(4), 221-234. https://doi.org/10.3102/0013189X18759524

Hedberg, E. C. (2016). Academic and behavioral design parameters for cluster randomized trials in kindergarten: An analysis of the Early Childhood Longitudinal Study 2011 Kindergarten Cohort (ECLS-K 2011). Evaluation Review, 40(4), 279-313. https://doi.org/10.1177/0193841X16655657

Hedberg, E. C., & Hedges, L. V. (2014). Reference values of within-district intraclass correlations of academic achievement by district characteristics: Results from a meta-analysis of district-specific values. Evaluation Review, 38(6), 546-582. https://doi.org/10.1177/0193841X14554212

Hedges, L. V., & Hedberg, E. C. (2007). Intraclass correlation values for planning group-randomized trials in education. Educational Evaluation and Policy Analysis, 29(1), 60-87. https://doi.org/10.3102/0162373707299706

Hedges, L. V., & Hedberg, E. C. (2013). Intraclass correlations and covariate outcome correlations for planning two- and three-level cluster randomized experiments in education. Evaluation Review, 37(6), 445-489. https://doi.org/10.1177/0193841X14529126

Jacob, R., Zhu, P., & Bloom, H. S. (2010). New empirical evidence for the design of group randomized trials in education. Journal of Research on Educational Effectiveness, 3(2), 157-198. https://doi.org/10.1080/19345741003592428

Schochet, P. Z. (2008). Statistical power for random assignment evaluations of education programs. Journal of Educational and Behavioral Statistics, 33(1), 62-87. https://doi.org/10.3102/1076998607302714

Xu, Z., & Nichols, A. (2010). New estimates of design parameters for cluster randomization studies: Findings from North Carolina and Florida. CALDER Working Paper No. 43. https://files.eric.ed.gov/fulltext/ED510553.pdf

Zhu, P., Jacob, R., Bloom, H., & Xu, Z. (2012). Designing and analyzing studies that randomize schools to estimate intervention effects on student academic outcomes without classroom-level information. Educational Evaluation and Policy Analysis, 34(1), 45-68. https://doi.org/10.3102/0162373711423786

Online Intraclass Correlation Database: http://stateva.ci.northwestern.edu/

Germany Data

Stallasch, S. E., Lüdtke, O., Artelt, C., & Brunner, M. (2021). Multilevel design parameters to plan cluster-randomized intervention studies on student achievement in elementary and secondary school. Journal of Research on Educational Effectiveness, 14(1), 172-206. https://doi.org/10.1080/19345747.2020.1823539

Sub-Saharan Africa Data

Kelcey, B., & Shen, Z. (2016). Multilevel design of school effectiveness studies in sub-Saharan Africa. School Effectiveness and School Improvement, 27(4), 492-510. https://doi.org/10.1080/09243453.2016.1168855

Multi-country Data (Programme for International Student Assessment (PISA))

Brunner, M., Keller, U., Wenger, M., Fischbach, A., & Lüdtke, O. (2018). Between-school variation in students' achievement, motivation, affect, and learning strategies: Results from 81 countries for planning group-randomized trials in education. Journal of Research on Educational Effectiveness, 11(3), 452-478. https://doi.org/10.1080/19345747.2017.1375584

1.2 Science Outcomes

Spybrook, J., Westine, C., & Taylor, J. (2016). Design parameters for impact research in science education: A multi-state analysis. AERA Open, 2(1), 1-15. https://doi.org/10.1177/2332858415625975

Westine, C., Spybrook, J., & Taylor, J. (2013). An empirical investigation of variance design parameters for planning cluster randomized trials of science achievement. Evaluation Review, 37(6), 490-519. https://doi.org/10.1177/0193841X14531584

Germany Data

Stallasch, S. E., Lüdtke, O., Artelt, C., & Brunner, M. (2021). Multilevel design parameters to plan cluster-randomized intervention studies on student achievement in elementary and secondary school. Journal of Research on Educational Effectiveness, 14(1), 172-206. https://doi.org/10.1080/19345747.2020.1823539

Multi-country Data (Programme for International Student Assessment (PISA))

Brunner, M., Keller, U., Wenger, M., Fischbach, A., & Lüdtke, O. (2018). Between-school variation in students' achievement, motivation, affect, and learning strategies: Results from 81 countries for planning group-randomized trials in education. Journal of Research on Educational Effectiveness, 11(3), 452-478. https://doi.org/10.1080/19345747.2017.1375584

1.3 Social-Emotional/Cognitive/Behavioral Outcomes

Dong, N., Reinke, W. M., Herman, K. C., Bradshaw, C. P., & Murray, D. W. (2016). Meaningful effect sizes, intraclass correlations, and proportions of variance explained by covariates for planning two- and three-level cluster randomized trials of social and behavioral outcomes. Evaluation Review, 40(4), 334-377. https://doi.org/10.1177/0193841X16671283

Hedberg, E. C. (2016). Academic and behavioral design parameters for cluster randomized trials in kindergarten: An analysis of the Early Childhood Longitudinal Study 2011 Kindergarten Cohort (ECLS-K 2011). Evaluation Review, 40(4), 279-313. https://doi.org/10.1177/0193841X16655657

Jacob, R., Zhu, P., & Bloom, H. S. (2010). New empirical evidence for the design of group randomized trials in education. Journal of Research on Educational Effectiveness, 3(2), 157-198. https://doi.org/10.1080/19345741003592428

Germany Data

Stallasch, S. E., Lüdtke, O., Artelt, C., & Brunner, M. (2021). Multilevel design parameters to plan cluster-randomized intervention studies on student achievement in elementary and secondary school. Journal of Research on Educational Effectiveness, 14(1), 172-206. https://doi.org/10.1080/19345747.2020.1823539

Multi-country Data (Programme for International Student Assessment (PISA))

Brunner, M., Keller, U., Wenger, M., Fischbach, A., & Lüdtke, O. (2018). Between-school variation in students' achievement, motivation, affect, and learning strategies: Results from 81 countries for planning group-randomized trials in education. Journal of Research on Educational Effectiveness, 11(3), 452-478. https://doi.org/10.1080/19345747.2017.1375584

1.4 Other Outcomes (nutritional outcomes)

Juras, R. (2016). Estimates of intraclass correlation coefficients and other design parameters for studies of school-based nutritional interventions. Evaluation Review, 40(4), 314-333. https://doi.org/10.1177/0193841X16675223

1.5 Teacher Outcomes

Kelcey, B., & Phelps, G. (2013). Considerations for designing group randomized trials of professional development with teacher knowledge outcomes. Educational Evaluation and Policy Analysis, 35(3), 370-390. https://doi.org/10.3102/0162373713482766

Kelcey, B., & Phelps, G. (2013). Strategies for improving power in school-randomized studies of professional development. Evaluation Review, 37(6), 520-554. https://doi.org/10.1177/0193841X14528906

Kelcey, B., Spybrook, J., Phelps, G., Jones, N., & Zhang, J. (2017). Designing large-scale multisite and cluster-randomized studies of professional development. The Journal of Experimental Education, 85(3), 389-410. https://doi.org/10.1080/00220973.2016.1220911

Westine, C. D., Unlu, F., Taylor, J., Spybrook, J., Zhang, Q., & Anderson, B. (2020). Design parameter values for impact evaluations of science and mathematics interventions involving teacher outcomes. Journal of Research on Educational Effectiveness, 13(4), 816-839. https://doi.org/10.1080/19345747.2020.1821849

2. Resources for Estimating Effect Sizes in Education

2.1 Magnitude of Effect Sizes

2.1.1 Math and/or Reading/ELA Outcomes

Hedberg, E. C. (2016). Academic and behavioral design parameters for cluster randomized trials in kindergarten: An analysis of the Early Childhood Longitudinal Study 2011 Kindergarten Cohort (ECLS-K 2011). Evaluation Review, 40(4), 279-313. https://doi.org/10.1177/0193841X16655657

Zhu, P., Jacob, R., Bloom, H., & Xu, Z. (2012). Designing and analyzing studies that randomize schools to estimate intervention effects on student academic outcomes without classroom-level information. Educational Evaluation and Policy Analysis, 34(1), 45-68. https://doi.org/10.3102/0162373711423786

2.1.2 Science Outcomes

Taylor, J. A., Kowalski, S. M., Polanin, J. R., Askinas, K., Stuhlsatz, M. A., Wilson, C. D., ... & Wilson, S. J. (2018). Investigating science education effect sizes: Implications for power analyses and programmatic decisions. AERA Open, 4(3), 1-19. https://doi.org/10.1177/2332858418791991

2.1.3 Social-Emotional/Cognitive/Behavioral Outcomes

Dong, N., Reinke, W. M., Herman, K. C., Bradshaw, C. P., & Murray, D. W. (2016). Meaningful effect sizes, intraclass correlations, and proportions of variance explained by covariates for planning two- and three-level cluster randomized trials of social and behavioral outcomes. Evaluation Review, 40(4), 334-377. https://doi.org/10.1177/0193841X16671283

Hedberg, E. C. (2016). Academic and behavioral design parameters for cluster randomized trials in kindergarten: An analysis of the Early Childhood Longitudinal Study 2011 Kindergarten Cohort (ECLS-K 2011). Evaluation Review, 40(4), 279-313. https://doi.org/10.1177/0193841X16655657

2.2 Benchmarks for Effect Sizes

Brandon, P. R., Harrison, G. M., & Lawton, B. E. (2013). SAS code for calculating intraclass correlation coefficients and effect size benchmarks for site-randomized education experiments. American Journal of Evaluation, 34(1), 85-90. https://doi.org/10.1177/1098214012466453

Hill, C. J., Bloom, H. S., Black, A. R., & Lipsey, M. W. (2008). Empirical benchmarks for interpreting effect sizes in research. Child Development Perspectives, 2(3), 172-177. https://doi.org/10.1111/j.1750-8606.2008.00061.x

3. Resources for Estimating Effect Size Variability in Education

3.1 Math and/or Reading/ELA Outcomes

Weiss, M. J., Bloom, H. S., Verbitsky-Savitz, N., Gupta, H., Vigil, A. E., & Cullinan, D. N. (2017). How much do the effects of education and training programs vary across sites? Evidence from past multisite randomized trials. Journal of Research on Educational Effectiveness, 10(4), 843-876. https://doi.org/10.1080/19345747.2017.1300719

3.2 Other Outcomes (post-secondary outcomes and labor/workforce outcomes)

Weiss, M. J., Bloom, H. S., Verbitsky-Savitz, N., Gupta, H., Vigil, A. E., & Cullinan, D. N. (2017). How much do the effects of education and training programs vary across sites? Evidence from past multisite randomized trials. Journal of Research on Educational Effectiveness, 10(4), 843-876. https://doi.org/10.1080/19345747.2017.1300719

4. Resources for Post-Secondary Studies

4.1 Design Parameters for Planning Studies

Somers, M.-A., Weiss, M. J., & Hill, C. (2022). Design parameters for planning the sample size of individual-level randomized controlled trials in community colleges. Evaluation Review. Advance online publication. https://doi.org/10.1177/0193841X221121236
