
Evaluation Summary and Metrics: "Do Celebrity Endorsements Matter? A Twitter Experiment Promoting Vaccination In Indonesia"

Published on Aug 25, 2023

Abstract

This summarizes the evaluations of the paper: “Do Celebrity Endorsements Matter? A Twitter Experiment Promoting Vaccination In Indonesia” (2019). To read the evaluations, please see the links below. See the “Unjournal Process” section for notes on the versions of the paper evaluated.

Evaluations

1. Anonymous evaluator

2. Evaluation by Anirudh Tagat

Overall ratings

1 Mar 2024 note: The description of the “overall assessment” and “journal rank tier” below reflects our current practice. However, these evaluators were part of our pilot, and received slightly different instructions. See here to learn how this changed, and to see the earlier instructions.


We asked evaluators to provide overall assessments, in addition to ratings for a range of specific criteria.

I. Overall assessment: We asked them to rank this paper “heuristically” as a percentile “relative to all serious research in the same area that you have encountered in the last three years.” We requested they “consider all aspects of quality, credibility, importance to knowledge production, and importance to practice.”

II. Journal rank tier, normative rating (0-5): “On a ‘scale of journals’, what ‘quality of journal’ should this be published in? (See ranking tiers discussed here.)” Note: 0 = lowest/none, 5 = highest/best.

  • Anonymous evaluator: overall assessment (0-100): 62; journal rank tier, normative rating (0-5): 3

  • Anirudh Tagat: overall assessment (0-100): 85; journal rank tier, normative rating (0-5): 4

See “Metrics” below for a more detailed breakdown of the evaluators’ ratings across several categories. To see these ratings in the context of all Unjournal ratings, with some analysis, see our data presentation here.

See here for the current full evaluator guidelines, including further explanation of the requested ratings.

Evaluation summaries

The summaries below were generated by ChatGPT.

Anonymous evaluator (ChatGPT summary)

Strengths:

  • Innovative approach to dissecting the impact of celebrity endorsements on spreading public health messages.

  • Significant contribution to filling the research gap on the efficacy of celebrity endorsements in social media campaigns.

Critiques and suggestions:

  • Concerns about the measurement of retweet/like behavior and its potential confounds, such as visibility differences across experimental conditions.

  • Questions the direct applicability of social media behavior metrics (likes, retweets) to actual vaccination rates and behaviors in the real world.

  • Calls for clarity on experimental setups, particularly how celebrity authorship versus endorsement was distinguished and its effect on user engagement.

  • Advocates for the public availability of the study's data and code to ensure transparency, reproducibility, and further analysis by the research community.

Anirudh Tagat (ChatGPT summary)

Strengths:

  • Novel and rigorous experimental design focusing on celebrity endorsements in health communication within Indonesia.

  • Significant methodological contribution in disentangling the endorsement effect from the reach effect using randomized variation.

  • Important findings on the impact of celebrity-originated messages on public engagement and the counterintuitive effect of credible sourcing on message engagement.

  • Evidence suggesting real-world impacts of social media campaigns on vaccine-related knowledge and beliefs.

Critiques and suggestions:

  • Concerns about causal inference, particularly regarding the impact of messages on offline knowledge and beliefs, suggesting these findings reflect correlation rather than causation.

  • Methodological suggestions include accounting for the dynamic nature of tweet engagement using timestamps and considering the role of emotions in message virality.

  • Questions the uniformity of the study's randomization across all Twitter users due to the platform's algorithm, potentially affecting the study's generalizability.

  • Recommends supplementing self-report data with secondary data on immunization rates to strengthen claims about the campaign's effectiveness.

Metrics (all evaluators)

Ratings

See here for details on the categories below, and the guidance given to evaluators.

Evaluator 1: Anonymous

Ratings are 0-100; confidence is on a five-dot scale (low to high). Additional comments were optional.

  • Overall assessment: 62 (confidence: 3 dots). Comment: “I think this is a topic which really needs empirical research, and is also difficult to test empirically; bumped up a little bit because of this.”

  • Advancing knowledge and practice: 55 (confidence: 3 dots). Comment: “I think this paper advances our knowledge and tackles a real gap in the field, but is also far off from being implemented into policy (many uncertainties remaining, unclear generalisability).”

  • Methods: justification, reasonableness, validity, robustness: 55 (confidence: 2 dots). Comment: “I am unsure whether the potential methodological problems I spotted are real problems; I may change my judgement based on the authors’ response.”

  • Logic & communication: 70 (confidence: 3 dots).

  • Open, collaborative, replicable: 45 (confidence: 2 dots). Comment: “Could change this view if the code/data is available somewhere and I’ve missed it.”

  • Engaging with real-world, impact quantification; practice, realism, and relevance: 55 (confidence: 3 dots).

  • Relevance to global priorities: 70 (confidence: 3 dots).

Evaluator 2: Tagat

Ratings are 0-100, each with a 90% credible interval (low, high) satisfying 0 ≤ low ≤ rating ≤ high ≤ 100 (e.g., with a rating of 50, a CI might be (42, 61)); confidence is 0-5 (low to high).

  • Overall assessment: 85; 90% CI: (78, 90); confidence: 4

  • Advancing knowledge and practice: 90; 90% CI: (88, 92); confidence: 4

  • Methods: justification, reasonableness, validity, robustness: 80; 90% CI: (74, 83); confidence: 3

  • Logic & communication: 85; 90% CI: (80, 89); confidence: 4

  • Open, collaborative, replicable: 80; 90% CI: (70, 81); confidence: 3

  • Engaging with real-world, impact quantification; practice, realism, and relevance: 100; 90% CI: (91, 100); confidence: 4

  • Relevance to global priorities: 100; 90% CI: (89, 100); confidence: 5

Journal ranking tiers

See here for details on these tiers.

Ratings are 0-5 (low to high); confidence is 0-5 (low = 0, high = 5).

  • What ‘quality journal’ do you expect this work will be published in? Evaluator 1 (Anonymous): 3, confidence 2. Evaluator 2 (Tagat): 4, confidence 5.

  • On a ‘scale of journals’, what tier journal should this be published in? Evaluator 1 (Anonymous): 3, confidence 2. Evaluator 2 (Tagat): 5, confidence 5.

Unjournal Process / Evaluation Manager’s Notes

Note on versions:

Each evaluator considered the most recent version of the working paper that was available at the time of their evaluation.

This paper was selected as part of our (NBER) direct evaluation track.

Why we chose this paper

This work seems important methodologically and practically, both for understanding the effect of social media (and perhaps ‘polarization’ as well) and for health and other interventions involving debiasing and education (e.g., Development Media International). 

How we chose the evaluators

We sought expertise in

  • Empirical (econometric-style) analysis with peer-effects/networks, direct and indirect effects, causal inference

  • Field experiments (on social media), social media data (esp. Twitter)

  • Vaccine adoption, global health, Indonesian context

Evaluation process

  • We shared this document with evaluators, suggesting some ways in which the paper might be considered in more detail.

  • This process took over 8 months — far longer than we expected or targeted. Delays occurred because:

    • We had difficulty commissioning qualified evaluators.

    • One highly qualified evaluator agreed to the assignment but was not able to find the time to complete it.

    • With a third evaluator (between the two mentioned above) we had a communication error: the evaluator considered a much earlier version of the paper (the 2019 NBER version), so we are not posting that evaluation.

  • Because of these delays we asked Anirudh Tagat, a member of our Management Team, to write the second (final) evaluation. We do not see any obvious conflicts of interest here: Anirudh did not select this paper for evaluation, did not reach out to evaluators, and had no strong connection to the authors. Anirudh will exempt himself from consideration for the ‘most informative evaluation’ prize (or will recuse himself from its adjudication).

  • As per The Unjournal’s policy, the paper’s authors were invited and given two weeks to provide a public response to these evaluations before we posted them. They did not provide a response, but they are invited to do so in the future (and if they do, we will post and link it here).
