
Evaluation Summary and Metrics: "Building Resilient Education Systems: Evidence from Large-Scale Randomized Trials in Five Countries"

Published by The Unjournal on Dec 02, 2024

Abstract

We organized two evaluations of the paper: "Building Resilient Education Systems: Evidence from Large-Scale Randomized Trials in Five Countries"[1]. To read these evaluations, please see the links below.

Evaluations

1. Anonymous evaluation 1

2. Anonymous evaluation 2

Overall ratings

We asked evaluators to provide overall assessments as well as ratings for a range of specific criteria.

I. Overall assessment (see footnote [1])

II. Journal rank tier, normative rating (0-5): On a ‘scale of journals’, what ‘quality of journal’ should this be published in? [2] Note: 0 = lowest/none, 5 = highest/best.

|  | Overall assessment (0-100) | Journal rank tier, normative rating (0-5) |
|---|---|---|
| Evaluator 1 | 90 | 4.3 |
| Evaluator 2 | 83 | 3.9 |

See “Metrics” below for a more detailed breakdown of the evaluators’ ratings across several categories. To see these ratings in the context of all Unjournal ratings, with some analysis, see our data presentation here.[3]

See here for the current full evaluator guidelines, including further explanation of the requested ratings.

Evaluation summaries

Anonymous evaluator 1

The paper studies an important question [surrounding the] effectiveness of low-cost education programs which could reduce learning losses faced by primary [school] students due to shocks or emergencies in five developing countries.

Anonymous evaluator 2

Provides important evidence on [the] effectiveness of a low-tech intervention to stem learning losses from school closures (likely to increase in coming years) by scaling it up vertically & across diverse geographies.

Note that [the] results only apply to numeracy skills and to temporary and abrupt school closures. Future work should see if it is effective for other skills & circumstances (displaced people, girls), and if contextualizing content/methods improves outcomes.

Metrics

Ratings

See here for details on the categories below, and the guidance given to evaluators.

Both evaluators are anonymous. Bracketed numbers are footnote markers; an asterisk (*) marks the 90% confidence intervals, given on the same 0-100 scale as the ratings.

| Rating category | Evaluator 1: Rating (0-100) | Evaluator 1: 90% CI (0-100)* | Comments | Evaluator 2: Rating (0-100) | Evaluator 2: 90% CI (0-100)* |
|---|---|---|---|---|---|
| Overall assessment [4] | 90 | (85, 95) | [5] | 83 | (73, 93) |
| Claims, strength, characterization of evidence [6] | 80 | (72, 88) |  | — | — |
| Advancing knowledge and practice [7] | 95 | (90, 100) | [8] | 88 | (78, 98) |
| Methods: Justification, reasonableness, validity, robustness [9] | 85 | (80, 90) | [10] | 78 | (68, 88) |
| Logic & communication [11] | 84 | (75, 90) | [12] | 79 | (71, 87) |
| Open, collaborative, replicable [13] | 75 [14] | (60, 90) | [15] | 62 | (52, 72) |
| Real-world relevance [16], [17] | 95 | (90, 100) | [18] | 90 | (85, 95) |
| Relevance to global priorities [19], [20] | 95 | (90, 100) | [21] | 90 | (85, 95) |

Journal ranking tiers

See here for more details on these tiers.

Both evaluators are anonymous. Bracketed numbers are footnote markers.

| Judgment | Evaluator 1: Ranking tier (0-5) | Evaluator 1: 90% CI | Comments | Evaluator 2: Ranking tier (0-5) | Evaluator 2: 90% CI |
|---|---|---|---|---|---|
| On a ‘scale of journals’, what ‘quality of journal’ should this be published in? | 4.3 | (3.7, 4.8) | [22] | 3.9 | (3.5, 4.4) |
| What ‘quality journal’ do you expect this work will be published in? | 4.5 | (4.0, 4.8) | [23] | 4.5 | (4.0, 5.0) |


We summarize these as:

  • 0.0: Marginally respectable/Little to no value

  • 1.0: OK/Somewhat valuable

  • 2.0: Marginal B-journal/Decent field journal

  • 3.0: Top B-journal/Strong field journal

  • 4.0: Marginal A-Journal/Top field journal

  • 5.0: A-journal/Top journal

Evaluation manager’s discussion

We sought two evaluations of this well-regarded paper, which has since received the 2024 ADB-IEA Innovative Policy Research Award. The paper provides evidence on remote instruction during the COVID-19 pandemic. The authors conducted several well-powered RCTs in India, Kenya, Nepal, the Philippines, and Uganda, and find that delivering tutorials over the phone can stem learning loss. The authors also provide valuable evidence on the settings in which remote teaching/learning may be most effective; they find no major difference between NGO-provided tutorials and government-provided ones (if anything, the latter do better, assuaging fears about scalability within government systems).[24] (The ‘null effect’ is not tight: the point estimate for the government-provided tutorials is substantially larger than for the NGO-provided ones (0.315 SDs vs. 0.263 SDs), with standard errors of about 0.05 SDs for each, implying fairly wide confidence intervals.)[25]

The two evaluators were asked specifically to comment on the design of the RCTs, the claims, and the policy considerations arising from the paper.

Evaluator 1 notes that more detail is needed on the design of the phone tutorials to fully understand the impacts they might have, and to judge whether their efficacy might be undermined by shared phone ownership or similar constraints under which low-income households in developing countries typically operate. They also suggested a closer look at channels (explanations for the results obtained) related to health (given the pandemic) and to greater attention from caregivers. The authors would do well to consider these.

The second evaluator provides a more detailed and careful look at the claims as well as the policy implications. They make specific suggestions for the authors to provide more detail on the SMS-only intervention (again, likely helpful for scaling or translating this evidence to other contexts), as well as on some contextual factors related to the delivery of the intervention. They also raise several issues related to scaling this intervention.

A key consideration here: if many individuals in a household share a phone, this could impede efficient delivery. If that was the case for a majority of households in this trial, the estimates may understate the impact such an intervention would have in a context where everyone had (or was provided with) their own phone.

Overall, the paper makes a valuable contribution to our understanding of how to scale education interventions in emergency settings, especially in low-resource contexts.
