This article proposes a strategy for accelerating the learning of struggling students that uses proven reading and mathematics programs to ensure their success. Based on Response to Intervention (RTI), the proposed policy, Response to Proven Intervention (RTPI), uses proven whole-school or whole-class programs as Tier 1, proven one-to-small group tutoring programs as Tier 2, and proven one-to-one tutoring as Tier 3. The criteria for “proven” are the “strong” and “moderate” evidence levels specified in the Every Student Succeeds Act (ESSA). This article lists proven reading and mathematics programs for each tier, and explains how evidence of effectiveness within an RTI framework could become central to improving outcomes for struggling learners.
Evidence-based reform in education refers to policies that enable or encourage the use of programs and practices proven to be effective in rigorous research. This article discusses the increasing role of evidence in educational policy, rapid growth in availability of proven approaches, and development of reviews of research to summarize the evidence. A highlight of evidence-based reform was the 2015 passage of the Every Student Succeeds Act (ESSA), which defines strong, moderate, and promising levels of evidence for educational programs and ties certain federal funding to use of proven approaches. To illustrate how coordinated use of proven approaches could substantially improve educational outcomes, the article proposes use of proven programs to populate each of Tiers 1, 2, and 3 in response to intervention (RTI) policies. This article is adapted from an address for the E.L. Thorndike Award for Distinguished Psychological Contributions to Education, August 7, 2018.
Slavin, R. E. (2020). How evidence-based reform will transform research and practice in education. Educational Psychologist, 55 (1), 21-31. DOI: 10.1080/00461520.2019.1611432.
Education policies should support the use of programs and practices with strong evidence of effectiveness. The Every Student Succeeds Act (ESSA) contains evidence standards and incentives to use programs that meet them. This provides a great opportunity for evidence to play a stronger role in decisions about education programs and practices. However, for evidence-based reform to prevail, three conditions must exist: many practical programs with solid evidence; trusted and user-friendly reviews of research; and more education policies that provide incentives for use of proven programs. The article discusses recent progress in each of these areas and notes difficulties in each. It makes a case that if these difficulties can be effectively addressed, evidence-based reform may begin to make a meaningful difference in education outcomes at the national level.
There has long been interest in using summertime to provide supplemental education to students who need it. But are summer programs effective? This review includes 19 randomized studies of the effects of summer intervention programs on reading and mathematics, selected using rigorous quality criteria. In reading, there were two types of summer programs: summer school and summer book reading approaches. In mathematics, there was only summer school. The mean effect of summer school programs on reading achievement was positive (mean ES = +0.23), but there were no positive effects, on average, of summer book reading programs (ES = 0.00). In mathematics, a positive mean effect was also found for summer school programs (ES = +0.17). However, the positive-appearing means for summer schools were not statistically significant in a metaregression, and depended on just two reading studies and one mathematics study with very large impacts. These successful interventions focused on well-defined objectives with intensive teaching.
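Throughout these reviews, an effect size (ES) is a standardized mean difference: the treatment-control gap in mean achievement expressed in standard deviation units. Below is a minimal sketch of the computation, using made-up posttest scores rather than any data from the review:

```python
import numpy as np

# Hypothetical posttest scores; not data from the review.
summer_school = np.array([52.0, 58, 61, 55, 63, 60, 57, 59])
control       = np.array([50.0, 54, 49, 56, 53, 51, 55, 52])

# Pooled standard deviation across the two groups.
n_t, n_c = len(summer_school), len(control)
sd_pooled = np.sqrt(
    ((n_t - 1) * summer_school.var(ddof=1) + (n_c - 1) * control.var(ddof=1))
    / (n_t + n_c - 2)
)

# Standardized mean difference: treated-group advantage in SD units.
es = (summer_school.mean() - control.mean()) / sd_pooled
print(f"ES = {es:+.2f}")  # e.g., +0.23 means about a quarter-SD advantage
```

A metaregression such as the one mentioned above then treats these study-level effect sizes as outcomes and regresses them on study characteristics, weighting each study by its precision.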
Success for All (SFA) is a comprehensive whole-school approach designed to help high-poverty elementary schools increase the reading success of their students. It is designed to ensure success in grades K-2 and then build on this success in later grades. SFA combines instruction emphasizing phonics and cooperative learning, one-to-small group tutoring for students who need it in the primary grades, frequent assessment and regrouping, parent involvement, distributed leadership, and extensive training and coaching. Over a 33-year period, SFA has been extensively evaluated, mostly by researchers unconnected to the program. This quantitative synthesis reviews the findings of these evaluations. Seventeen U.S. studies meeting rigorous inclusion standards had a mean effect size of +0.24 (p < .05) on independent measures. Effects were largest for low achievers (ES = +0.54, p < .01). Although outcomes vary across studies, mean impacts support the effectiveness of Success for All for the reading success of disadvantaged students.
Slavin, R.E., Lake, C., Chambers, B., Cheung, A., & Davis, S. (2009). Effective reading programs for the elementary grades: A best-evidence synthesis. Review of Educational Research, 79 (4), 1391-1466.
As evidence-based reform becomes increasingly important in educational policy, it is becoming essential to understand how research design might contribute to reported effect sizes in experiments evaluating educational programs. The purpose of this article is to examine how methodological features such as types of publication, sample sizes, and research designs affect effect sizes in experiments. A total of 645 studies from 12 recent reviews of evaluations of reading, mathematics, and science programs were studied. The findings suggest that effect sizes are roughly twice as large for published articles, small-scale trials, and experimenter-made measures as for unpublished documents, large-scale studies, and independent measures, respectively. In addition, effect sizes are significantly higher in quasi-experiments than in randomized experiments. Explanations for the effects of methodological features on effect sizes are discussed, as are implications for evidence-based policy.
Cheung, A., & Slavin, R. (2016). How methodological features affect effect sizes in education. Educational Researcher, 45 (5), 283-292.
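One way to picture this kind of moderator analysis is a weighted regression of study-level effect sizes on indicators for methodological features. The sketch below is an illustration under assumptions, not the article's actual 645-study analysis; the records and the use of sample-size weights are hypothetical:

```python
import pandas as pd
import statsmodels.api as sm

# Hypothetical study-level records; illustrative only.
studies = pd.DataFrame({
    "es":        [0.45, 0.12, 0.38, 0.08, 0.52, 0.10],  # reported effect sizes
    "n":         [60, 1200, 95, 2400, 70, 1800],        # total students per study
    "published": [1, 0, 1, 0, 1, 0],                    # 1 = journal article
    "quasi":     [1, 0, 0, 0, 1, 0],                    # 1 = quasi-experiment
})

# Regress ES on methodological features, weighting by sample size
# as a crude proxy for study precision.
X = sm.add_constant(studies[["published", "quasi"]])
fit = sm.WLS(studies["es"], X, weights=studies["n"]).fit()
print(fit.params)  # positive coefficients indicate features associated with inflated ES
```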
Rigorous evidence of program effectiveness has become increasingly important with the 2015 passage of the Every Student Succeeds Act (ESSA). One question that has not yet been fully explored is whether program evaluations carried out or commissioned by developers produce larger effect sizes than evaluations conducted by independent third parties. Using study data from the What Works Clearinghouse, we find evidence of a “developer effect,” where program evaluations carried out or commissioned by developers produced average effect sizes that were substantially larger than those identified in evaluations conducted by independent parties. We explore potential reasons for the existence of a “developer effect” and provide evidence that interventions evaluated by developers were not simply more effective than those evaluated by independent parties. We conclude by discussing plausible explanations for this phenomenon as well as providing suggestions for researchers to mitigate potential bias in evaluations moving forward.
Wolf, R., Morrison, J.M., Inns, A., Slavin, R. E., & Risman, K. (2020). Average effect sizes in developer-commissioned and independent evaluations. Journal of Research on Educational Effectiveness. DOI: 10.1080/19345747.2020.1726537
Large-scale randomized studies provide the best means of evaluating practical, replicable approaches to improving educational outcomes. This article discusses the advantages, problems, and pitfalls of these evaluations, focusing on alternative methods of randomization, recruitment, ensuring high-quality implementation, dealing with attrition, and data analysis. It also discusses means of increasing the chances that large randomized experiments will find positive effects, and interpreting effect sizes.
Program effectiveness reviews in education seek to provide educators with scientifically valid and useful summaries of evidence on achievement effects of various interventions. Different reviewers have different policies on measures of content taught in the experimental group but not the control group, called here treatment-inherent measures. These are contrasted with treatment-independent measures of content emphasized equally in experimental and control groups. The What Works Clearinghouse (WWC) averages effect sizes from such measures with those from treatment-independent measures, while the Best Evidence Encyclopedia (BEE) excludes treatment-inherent measures. This article contrasts effect sizes from treatment-inherent and treatment-independent measures in WWC reading and math reviews to explore the degree to which these measures produce different estimates. In all comparisons, treatment-inherent measures produce much larger positive effect sizes than treatment-independent measures. Based on these findings, it is suggested that program effectiveness reviews exclude treatment-inherent measures, or at least report them separately.
Slavin, R.E., & Madden, N.A. (2011). Measures inherent to treatments in program effectiveness reviews. Journal of Research on Educational Effectiveness, 4 (4), 370-380.
Research in fields other than education has found that studies with small sample sizes tend to have larger effect sizes than those with large samples. This article examines the relationship between sample size and effect size in education. It analyzes data from 185 studies of elementary and secondary mathematics programs that met the standards of the Best Evidence Encyclopedia. As predicted, there was a significant negative correlation between sample size and effect size. The differences in effect sizes between small and large experiments were much greater than those between randomized and matched experiments. Explanations for the effects of sample size on effect size are discussed.
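The core analysis here reduces to correlating study sample size with reported effect size. A minimal sketch on hypothetical values; log-transforming sample size is an assumption made in this sketch, since sample sizes are heavily right-skewed:

```python
import numpy as np
from scipy.stats import pearsonr

# Hypothetical (sample size, effect size) pairs; illustrative only.
n  = np.array([40, 75, 150, 400, 900, 2500, 6000])
es = np.array([0.55, 0.40, 0.30, 0.18, 0.12, 0.08, 0.05])

# Pearson correlation of log sample size with effect size.
r, p = pearsonr(np.log(n), es)
print(f"r = {r:+.2f}, p = {p:.3f}")  # a negative r matches the reported pattern
```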
This systematic review of research on early childhood programs seeks to identify effective approaches capable of improving literacy and language outcomes for preschoolers. It applies consistent standards to determine the strength of evidence supporting a variety of approaches, which fell into two main categories: balanced approaches, which include phonemic awareness, phonics, and other skills along with child-initiated activities, and developmental-constructivist approaches that focus on child-initiated activities with little direct teaching of early literacy skills. Inclusion criteria included use of randomized or matched control groups, evidence of initial equality, a minimum study duration of 12 weeks, and valid measures of literacy and language. Thirty-two studies evaluating 22 programs found that early childhood programs with a balance of skill-focused and child-initiated activities had significant evidence of positive literacy and language outcomes at the end of preschool and on kindergarten follow-up measures. Effects were smaller and not statistically significant for developmental-constructivist programs.
Chambers, B., Cheung, A., & Slavin, R. (2016). Literacy and language outcomes of balanced and developmental-constructivist approaches to early childhood education: A systematic review. Educational Research Review, 18, 88-111.
Which comprehensive school reform programs have been proven to help elementary and secondary students achieve? To find out, this review summarizes evidence on comprehensive school reform (CSR) models in elementary and secondary schools. Comprehensive school reform models are programs used schoolwide to improve student achievement. They typically include the following elements:
Innovative approaches to instruction and curriculum used in many subjects throughout the school
Extensive, ongoing professional development, and coaches or facilitators in the building to help manage the reform process
Measurable goals and benchmarks for student achievement
Emphasis on parent and community involvement
CSR models are developed and supported by national organizations, mostly nonprofits, that provide professional development, materials, and support to networks of schools.
This article reports a systematic review of research on science programs in grades 6-12. Twenty-one studies met inclusion criteria including use of randomized or matched assignment to conditions, measures that assess content emphasized equally in experimental and control groups, and a duration of at least 12 weeks. Programs fell into four categories. Instructional process programs (ES=+0.24) and technology programs (ES=+0.47) had positive sample-size weighted mean effect sizes, while use of science kits (ES=+0.05) and innovative textbooks (ES=+0.10) had much lower effects. Outcomes support the use of programs with a strong focus on professional development, technology, and support for teaching, rather than materials-focused innovations.
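The category means above are sample-size weighted: each study's effect size counts in proportion to the number of students in it, so the weighted mean is sum(n × ES) / sum(n). A minimal sketch with hypothetical study records in place of the review's data:

```python
import pandas as pd

# Hypothetical study-level results; illustrative, not the review's data.
studies = pd.DataFrame({
    "category": ["process", "process", "technology", "kits", "textbooks"],
    "es":       [0.30, 0.20, 0.47, 0.05, 0.10],   # per-study effect sizes
    "n":        [200, 400, 150, 600, 500],        # per-study sample sizes
})

# Sample-size weighted mean ES per category: sum(n * ES) / sum(n).
totals = (
    studies.assign(n_es=studies["n"] * studies["es"])
           .groupby("category")[["n_es", "n"]]
           .sum()
)
print(totals["n_es"] / totals["n"])
```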
Which science programs have been proven to help elementary students to succeed? To find out, this review summarizes evidence on three types of programs designed to improve the science achievement of students in grades K–6:
Inquiry-oriented programs without science kits, such as Increasing Conceptual Challenge, Science IDEAS, and Collaborative Concept Mapping. These programs help teachers learn and use generic processes, such as cooperative learning, concept development, and science-reading integration, in their daily science teaching.
Inquiry-oriented programs with science kits, such as Insights, FOSS, STC, SCALE, and Teaching SMART. The theory of action in science kit programs is that implementing hands-on activities helps to build deep learning about the scientific process and core concepts of elementary science.
Technology programs, such as BrainPOP, The Voyage of the Mimi, and web-based labs. Technologies utilized in these approaches include computer-assisted instruction and class-focused technology (such as video and interactive whiteboard technologies).
The evidence from studies that met the review’s inclusion criteria supports a view that improving outcomes in elementary science depends on improving teachers’ skills in presenting lessons, engaging and motivating students, and integrating science and reading. Technology applications that help teachers teach more compelling lessons and that use video to reinforce lessons also have promise.
Slavin, R. E., Lake, C., Hanley, P., & Thurston, A. (2014). Experimental evaluations of elementary science programs: A best-evidence synthesis. Journal of Research in Science Teaching, 51 (7), 870-901.