
Evaluation Summary and Metrics: "Replicability & Generalisability: A Guide to CEA discounts", Unjournal Applied Stream


Published on Jun 10, 2024

Abstract

We organized two evaluations of the paper "Replicability & Generalisability: A Guide to CEA discounts". This paper was evaluated as part of our “applied and policy stream”, described here, and evaluators were asked to use the Google Doc template provided here, specifically for this stream. To read these evaluations, please see the links below. COI considerations: The author of this paper, Rosie Bettle, is a member of The Unjournal’s Management Team (discussed further below).

Note: Applied and Policy Stream

This paper was evaluated as part of our “applied and policy stream”, described here. The ratings should not be directly compared to those in our main academic stream.

Evaluations

1. Anonymous evaluator

2. Max Meier

Overall ratings

We asked evaluators to provide overall assessments as well as ratings for a range of specific criteria.


Overall assessment: We asked them to rank this paper “heuristically” as a percentile “relative to applied and policy research you have read aiming at a similar audience, and with similar goals.” We requested they “consider all aspects of quality, credibility, importance to future impactful applied research, and practical relevance and usefulness.”

Overall assessment (0-100)

| Evaluator | Rating |
|---|---|
| Anonymous | 25 |
| Max Meier | 40 |

See “Metrics” below for a more detailed breakdown of the evaluators’ ratings across several categories. To see these ratings in the context of all Unjournal ratings, with some analysis, see our data presentation here.

Caveat: This stream is in its early stages. These ratings may not yet be reliable; e.g., one evaluator noted

“I am not sure what to compare this to, because I am unclear about scope. So please don’t take these ratings seriously.”

Evaluation summaries

Anonymous evaluator

Pro:

  • Raises important points and brings them to wider attention in simple language

  • Useful for considering individual RCTs 

Con:

  • Not clear enough about intended use cases and framing. Writing should be clearer, shorter, more purposeful.

  • Guidelines need more clarity and precision before they can be genuinely used. I think it is best to reframe this as a research note, rather than a ready-to-use ‘guideline’.

  • Unclear whether this is applicable to considering multiple studies and doing meta-analysis.

Max Meier

The proposal makes an important practical contribution to the question of how to evaluate effect size estimates in RCTs. I also think overall the evaluation steps are plausible and well justified and will lead to a big improvement in comparison to using an unadjusted effect size. However, I am unsure whether they will lead to an improvement over simpler adjustment rules (e.g., dividing the effect size by 2), and I see serious potential problems when applying this process in practice, especially related to the treatment of uncertainty.

Metrics

Ratings

See here for details on the categories below, and the guidance given to evaluators.

| Rating category | Anonymous: Rating (0-100) | Anonymous: 90% CI (0-100)* | Max Meier: Rating (0-100) | Max Meier: 90% CI (0-100)* |
|---|---|---|---|---|
| Overall assessment | 25 | (0, 50) | 40 | (10, 60) |
| Advancing knowledge and practice | 30 | (20, 80) | 50 | (20, 70) |
| Methods: Justification, reasonableness, validity, robustness | N/A | N/A | 10 | (0, 40) |
| Logic & communication | 30 | (10, 70) | 30 | (10, 50) |
| Open, collaborative, replicable | 10 | (0, 50) | N/A | N/A |
| Relevance to global priorities, usefulness for practitioners | 70 | (50, 95) | 50 | (10, 80) |

Evaluation manager’s discussion; process notes

COI issues

As noted above, the author of this paper, Dr. Rosie Bettle, is a member of The Unjournal’s Management Team. To minimize conflict-of-interest issues, the author was kept at arm's length: she was not part of our deliberations over whether to prioritize this paper, and we handled the evaluations outside our normal interface so that she could not see the process.

Why we chose this paper

I (David Reinstein) first came across this paper in an earlier forum where the author was requesting general feedback (this was before Dr. Bettle joined our team). Joel Tan of CEARCH later independently recommended we consider this work. I think it has strong potential to impact cost-effectiveness analyses and funding. Founders Pledge, GiveWell, and other organizations use fairly ad hoc corrections and 'discounts' for external validity, publication bias/winner's curse, etc. Dr. Bettle aims to provide guidance towards using a more rigorous framework for this. I think it's interesting, very decision-relevant, and underprovided. Other researchers are doing careful work on the deep, technical issues, and some (e.g., Noah Haber) are trying to bring this to applied contexts. But my impression is that this work is not 'sticking' and actually being adopted in practice by relevant organizations like GiveWell. These specific and less technical guidelines may help.

This was not an academically targeted paper, so we evaluated it under our "applied and policy stream".

Evaluation process

The author was eager to get feedback on this paper, and shared with us a list of specific questions. These questions are embedded in the Google Doc here, along with some of the second evaluator’s specific responses.
