Readings and Presentations

Instruction Day 1

Session 1: Specify Models, Identify Outcomes, and Craft Questions

Instructor: Laura Peck

Peck, Laura R. (2020). Experimental Evaluation Design for Program Improvement, Chapter 2. Thousand Oaks, CA: SAGE Publishing.

Kekahio, W., Cicchinelli, L., Lawton, B., & Brandon, P. R. (2014). Logic models: A tool for effective program planning, collaboration, and monitoring. National Center for Education Evaluation and Regional Assistance. Retrieved from https://www2.ed.gov/about/offices/list/oese/oss/technicalassistance/easnlogicmodelstoolmonitoring.pdf

Suggested Reading:
Additional (IES/REL Pacific) resources on logic models:

Resources on Theory of Change:

Ratan, S. K., Anand, T., & Ratan, J. (2019). Formulation of research question–Stepwise approach. Journal of Indian Association of Pediatric Surgeons, 24(1), 15.


Session 2: Describe and Operationalize Outcomes

Instructor: Laura Peck

Litwok, Daniel, Douglas Walton, Laura R. Peck, and Eleanor Harvill. (2018). Health Profession Opportunity Grants (HPOG) Impact Study’s Three-Year Follow-Up Analysis Plan, Section 2 through 2.2.1 (pp. 17-24). OPRE Report #2018-124, Washington, DC: Office of Planning, Research, and Evaluation, Administration for Children and Families, U.S. Department of Health and Human Services.

Suggested Reading:
Illustration: How does the U.S. measure hunger?


Session 3: Treatment Fidelity

Instructor: Carolyn Hill

Weiss, M. J., Bloom, H. S., & Brock, T. (2014). A conceptual framework for studying the sources of variation in program effects. Journal of Policy Analysis and Management, 33(3), 778-808.

Domitrovich, C. E., Bradshaw, C. P., Poduska, J. M., Hoagwood, K., Buckley, J. A., Olin, S., ... & Ialongo, N. S. (2008). Maximizing the implementation quality of evidence-based preventive interventions in schools: A conceptual framework. Advances in school mental health promotion, 1(3), 6-28.

Hamilton, G., & Scrivener, S. (2018). Measuring Treatment Contrast in Randomized Controlled Trials.


Instruction Day 2

Sessions 5–7: Basic experimental design for education studies

Instructor:  Spyros Konstantopoulos

Hedges, L.V. & Hedberg, E.C. (2007). Intraclass correlations for planning group randomized experiments in education. Educational Evaluation and Policy Analysis, 29, 60–87.

Kirk, R. E. (2013). Experimental design: Procedures for the behavioral sciences (4th ed.). SAGE Publications, Inc. Chapter 8: Randomized Block Designs. Chapter 11: Hierarchical Designs.

Konstantopoulos, S. (2011). Optimal Sampling of Units in Three-Level Cluster Randomized Designs: An ANCOVA Framework. Educational and Psychological Measurement, 71(5), 798-813.

Konstantopoulos, S. (2012). The Impact of Covariates on Statistical Power in Cluster Randomized Designs: Which Level Matters More? Multivariate Behavioral Research, 47(3), 392-420.

Konstantopoulos, S. (2013). Optimal Design in Three-Level Block Randomized Designs With Two Levels of Nesting: An ANOVA Framework With Random Effects. Educational and Psychological Measurement, 73(5), 784-802.

Raudenbush, S.W. (1993). Hierarchical linear models and experimental design. In L. K. Edwards (Ed.) Applied analysis of variance in behavioral science (pp. 459–496). New York: Marcel Dekker, Inc.

Rhoads, C.H. (2011). The implications of “contamination” for experimental design in education research. Journal of Educational and Behavioral Statistics, 36(1), 76–104.

Xu, Z., and Nichols, A. (2010). New estimates of design parameters for clustered randomization studies. Center for Analysis of Longitudinal Data in Education Research, Working Paper 43.


Instruction Day 3

Sessions 8–9: Analysis lab

Instructor:  Beth Tipton

Peugh, J. and Enders, C. (2005). Using the SPSS Mixed Procedure to fit Cross-Sectional and Longitudinal Multilevel Models. Educational and Psychological Measurement, 65, 717–741.

Singer, J.D. (1998). Using SAS PROC MIXED to fit multilevel models, hierarchical models, and individual growth models. Journal of Educational and Behavioral Statistics, 23, 323–355.

Suggested Reading:

Bloom, H.S. (2005). Randomizing groups to evaluate place-based programs. In Howard S. Bloom (ed.) Learning More from Social Experiments: Evolving Analytic Approaches, pp. 115–172. New York: Russell Sage Foundation. (Also listed for Sessions 5–7).

Raudenbush, S. (1997). Statistical analysis and optimal design for cluster randomized trials. Psychological Methods, 2(2), 173–185.

Shek, D. and Ma, C. (2011). Longitudinal data analysis using linear mixed models in SPSS: Concepts, procedures and illustrations. The Scientific World Journal, 42–76.


Session 10: External Validity

Instructor:  Beth Tipton

Tipton, E. & Hartman, E. (2021). Generalizability and Transportability. Chapter in Handbook of Multivariate Matching and Weighting (Edited by Stuart, E., Rosenbaum, P., Small, D., & Zubizarreta, J.).

Tipton, E., & Olsen, R. B. (2022) Enhancing the Generalizability of Impact Studies in Education. (NCEE 2022-003). Washington, DC: U.S. Department of Education, Institute of Education Sciences, National Center for Education Evaluation and Regional Assistance. Retrieved from


Instruction Day 4

Sessions 11 – 12: Sample size and statistical power

Instructor:  Larry V. Hedges

Hedges, L. V. & Hedberg, E. C. (2013). Intraclass correlations and covariate outcome correlations for planning 2 and 3 level cluster randomized experiments in education. Evaluation Review, 37, 13-57.

Hedberg, E. C. & Hedges, L. V. (2014). Reference values of within-district intraclass correlations of academic achievement by district characteristics: Results from a meta-analysis of district-specific data. Evaluation Review, 38, 546-582.

Spybrook, J., Hedges, L. V., & Borenstein, M. (2014). Understanding statistical power in cluster randomized trials: Challenges posed by differences in notation and terminology. Journal of Research on Educational Effectiveness, 7, 384-406.

Hedges, L. V. & Borenstein, M. (2014). Constrained optimal design in three and four level experiments. Journal of Educational and Behavioral Statistics, 39, 257-281.


Session 13: Statistical power analysis Lab

Instructor:  Jessaca Spybrook

Spybrook, J., Bloom, H., Congdon, R., Hill, C., Martinez, A., Raudenbush, S., & To, A. (2011). Optimal design plus empirical evidence: Documentation for the “Optimal Design” software. William T. Grant Foundation.

Download the PowerUP! power analysis programs from:

PowerUP!, PowerUP!-Moderator, PowerUP!-Mediator


Dong, N., & Maynard, R.A. (2013). PowerUP!: A tool for calculating minimum detectable effect sizes and minimum required sample sizes for experimental and quasi-experimental design studies. Journal of Research on Educational Effectiveness, 6(1), 24-67.

Additional Readings will be provided with the session materials.


Instruction Day 5

Session 14: Statistical power analysis Lab (continued)

Instructor:  Jessaca Spybrook


Sessions 15 – 16: School Recruitment

Instructor: Kylie Flynn

Coburn, C.E., Penuel, W.R., & Geil, K.E. (2013). Research-Practice Partnerships: A Strategy for Leveraging Research for Educational Improvement in School Districts. William T. Grant Foundation, New York, NY.

Roschelle, J., Feng, M., Gallagher, H., Murphy, R., Harris, C., Kamdar, D., & Trinidad, G. (2014). Recruiting Participants for Large-Scale Random Assignment Experiments in School Settings. Menlo Park, CA: SRI International.

Suggested Reading:

Tipton, E., & Matlen, B. J. (2019). Improved generalizability through improved recruitment: Lessons learned from a large-scale randomized trial. American Journal of Evaluation, 40(3), 414-430.


Instruction Day 6

Session 17: Growth Modeling

Instructor: Chris Rhoads

Hedeker, D. (2004). An introduction to growth modeling. In D. Kaplan (Ed.), Quantitative Methodology for the Social Sciences. Thousand Oaks, CA: Sage Publications.

Suggested Reading: 

Bryk, A.S. & Raudenbush, S.W. (1988). Toward a more appropriate conception of school effects. American Journal of Education, 97, 65–108. 

Burchinal, M. & Appelbaum, M.I. (1991). Estimating individual developmental functions: Methods and their assumptions. Child Development, 62, 23–43.

Raudenbush, S. W. (2001). Toward a coherent framework for comparing trajectories of individual change. In L. Collins and A. Sayer (Eds.), Best Methods for Studying Change (pp. 33–64). Washington, DC: The American Psychological Association.


Sessions 18 – 19: Moderator Analysis

Instructor: Spyros Konstantopoulos

Baron, R. & Kenny, D.A. (1986). The moderator-mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51(6), 1173–1182. (Also listed for Sessions 1 & 20).

Rubin, D.B. (1977). Assignment to treatment group on the basis of a covariate. Journal of Educational Statistics, 2(1), 1–26.

Suggested Reading:

Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models. Thousand Oaks, CA: Sage.


Instruction Day 7

Session 20: Mediation Models

Instructor:  Ben Kelcey

Baron, R. & Kenny, D.A. (1986). The moderator-mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51(6), 1173–1182.

Krull, J.L., and MacKinnon, D.P. (2001). Multilevel modeling of individual and group level mediated effects. Multivariate Behavioral Research, 36(2): 249–77.

MacKinnon, D. P., & Fairchild, A. J. (2009). Current directions in mediation analysis. Current Directions in Psychological Science, 18(1), 16–20.

Suggested Reading:

Bauer, D.J., Preacher, K.J., and Gil, K.M. (2006). Conceptualizing and testing random indirect effects and moderated mediation in multilevel models: New procedures and recommendations. Psychological Methods, 11(2): 142–63.

Keele, L. (2015). Causal Mediation Analysis: Warning! Assumptions Ahead. American Journal of Evaluation, 36(4), 500–513.

Kelcey, B., Dong, N., Spybrook, J., & Cox, K. (2017). Statistical Power for Causally-defined Indirect Effects in Group-randomized Trials with Individual-level Mediators. Journal of Educational and Behavioral Statistics, 42(5), 499-530.

Kelcey, B., Bai, F. & Xie, Y. (2020). Statistical Power in Partially Nested Designs Probing Multilevel Mediation. Psychotherapy Research Journal, 30(8), 1061-1074.

Pituch, K. A., & Stapleton, L. M. (2012). Distinguishing between cross- and cluster-level mediation processes in the cluster randomized trial. Sociological Methods & Research, 41, 630–670.

Zhang, Z., Zyphur, M., & Preacher, K. (2009). Testing multilevel mediation using hierarchical linear models: Problems and solutions. Organizational Research Methods, 12, 695–719.


Session 21: Mediation Models (continued)

Instructor:  Ben Kelcey


Session 22: Reporting Research

Instructor: Larry Hedges

Campbell, M. K. et al. (2012). CONSORT 2010 statement: Extension to cluster randomized trials. British Medical Journal, 345, e5661.

CONSORT Extension for Cluster Trials Checklist