Reading / Beginning

Review Methods

An exhaustive search considered more than 2,000 published and unpublished articles. The review included the studies that met the following criteria:

  • Schools or classrooms using each program had to be compared to randomly assigned or well-matched control groups.
  • Study duration had to be at least 12 weeks.
  • Outcome measures had to be assessments of the reading content being taught in all classes. Almost all are standardized tests or state assessments.
  • The review placed particular emphasis on studies in which schools, teachers, or students were assigned at random to experimental or control groups.

Program Ratings Basis

Programs were rated according to the overall strength of the evidence supporting their effects on reading achievement. “Effect size” (ES) is the proportion of a standard deviation by which a treatment group exceeds a control group. Mean effect sizes were computed by weighting each study's effect size by its sample size. The categories are as follows:

Strong Evidence of Effectiveness: At least two studies, one of which is a large randomized or randomized quasi-experimental study, or multiple smaller studies, with a sample size-weighted effect size of at least +0.20, and a collective sample size across all studies of 500 students or 20 classes.

Moderate Evidence of Effectiveness: At least one randomized or two matched studies of any qualifying design, with a collective sample size of 250 students or 10 classes, and a weighted mean effect size of at least +0.20.

Limited Evidence of Effectiveness: Strong Evidence of Modest Effects: Studies meet the criteria for ‘moderate evidence of effectiveness’ except that the weighted mean effect size is +0.10 to +0.19.

Limited Evidence of Effectiveness: Weak Evidence with Notable Effect: Studies have a weighted mean effect size of at least +0.20, but do not qualify for ‘moderate evidence of effectiveness’ due to insufficient numbers of studies or small sample sizes.

Insufficient Evidence of Effectiveness: Qualifying studies do not meet the criteria for “limited evidence of effectiveness.”

No Qualifying Studies: No studies meet inclusion standards.
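
The effect size definition and the rating categories above can be read as a small decision procedure. The Python sketch below is one illustrative reading, not the review's own procedure. All names in it (`effect_size`, `weighted_mean_effect_size`, `Evidence`, `rate`) are hypothetical. It assumes the caller supplies the standard deviation, since the text does not say which one is used. It assumes effect sizes are already rounded to two decimals. It also takes the design requirement for the strong category as a caller-supplied flag, because that wording leaves room for interpretation.

```python
from dataclasses import dataclass


def effect_size(treatment_mean, control_mean, sd):
    """Difference between group means in standard deviation units.

    Assumption: `sd` is whichever standard deviation the analyst has chosen;
    the source text does not specify it.
    """
    return (treatment_mean - control_mean) / sd


def weighted_mean_effect_size(studies):
    """Sample-size-weighted mean of (effect_size, sample_size) pairs."""
    total_n = sum(n for _, n in studies)
    if total_n == 0:
        raise ValueError("no students in the supplied studies")
    return sum(es * n for es, n in studies) / total_n


@dataclass
class Evidence:
    n_randomized: int     # qualifying randomized studies
    n_matched: int        # qualifying matched studies
    strong_design: bool   # caller's judgment of the strong-category design clause
    total_students: int   # collective sample size in students
    total_classes: int    # collective sample size in classes
    weighted_es: float    # weighted mean effect size, rounded to two decimals


def rate(e):
    """Map a program's evidence summary to one of the category names."""
    n_studies = e.n_randomized + e.n_matched
    if n_studies == 0:
        return "No Qualifying Studies"

    strong_size = e.total_students >= 500 or e.total_classes >= 20
    moderate_size = e.total_students >= 250 or e.total_classes >= 10
    moderate_design = e.n_randomized >= 1 or e.n_matched >= 2

    if (n_studies >= 2 and e.strong_design and strong_size
            and e.weighted_es >= 0.20):
        return "Strong Evidence of Effectiveness"
    if moderate_design and moderate_size:
        if e.weighted_es >= 0.20:
            return "Moderate Evidence of Effectiveness"
        if e.weighted_es >= 0.10:
            return ("Limited Evidence of Effectiveness: "
                    "Strong Evidence of Modest Effects")
    if e.weighted_es >= 0.20:
        return ("Limited Evidence of Effectiveness: "
                "Weak Evidence with Notable Effect")
    return "Insufficient Evidence of Effectiveness"
```

As a made-up illustration, two studies with effect sizes of +0.30 (100 students) and +0.10 (300 students) have a weighted mean of about +0.15, because the larger study counts three times as much. With one of the two studies randomized, the combined 400 students clear the 250-student threshold, so under this reading `rate` would return the "Strong Evidence of Modest Effects" category.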

Center for Data-Driven Reform in Education, Johns Hopkins University School of Education