Anonymous evaluation of "Do Celebrity Endorsements Matter? A Twitter Experiment Promoting Vaccination In Indonesia"
We asked evaluators to give some overall assessments, in addition to ratings for a range of particular criteria. See the evaluation summary “metrics” for a more detailed breakdown of this. See these ratings in the context of all Unjournal ratings, with some analysis in our data presentation here.
| | Rating | Confidence level (0-5) |
| Overall assessment | 62/100 | 3 |
| Journal rank tier, normative rating | 3/5 | 3 |
Overall assessment: We asked them to rank this paper “heuristically” as a percentile “relative to all serious research in the same area that you have encountered in the last three years.” We requested they “consider all aspects of quality, credibility, importance to knowledge production, and importance to practice.”
Journal rank tier, normative rating (0-5): “On a ‘scale of journals’, what ‘quality of journal’ should this be published in? (See ranking tiers discussed here)” Note: 0 = lowest/none, 5 = highest/best.
See here for the full evaluator guidelines, including further explanation of the requested ratings.
1 Mar 2024 note: The description of the “overall assessment” and “journal rank tier” above reflects our current practice. However, these evaluators were part of our pilot, and received slightly different instructions. See here to learn how this changed, and to see the earlier instructions.
Note from David Reinstein, Evaluation Manager:
This evaluator considered the February 2022 (Stanford) working paper version titled “Designing Effective Celebrity Messaging: Results From a Nationwide Twitter Experiment Promoting Vaccination in Indonesia”.
In this paper, the authors conducted a nationwide Twitter experiment in Indonesia (pre-covid; this experiment took place in 2015-2016). The aim of the experiment was to better understand which aspects of social media campaigns are important for disseminating a message; in particular, the role of celebrity endorsement (including celebrity authorship specifically) and of referencing credible sources. I think this is a highly important and relevant topic for global priorities research, that addresses a gap in the literature (since there is significant disagreement as to whether celebrity endorsements work or not.)
The paper’s key findings were that (1) celebrity messages (regardless of their social network position, i.e. the number of people who could see the tweet) are more likely to be liked and retweeted, (2) most of this effect comes from celebrity authorship, rather than merely passing on a message, (3) citing external credible sources (perhaps surprisingly!) decreases retweets, and (4) people who were exposed to more vaccination messaging appeared to have more accurate immunisation knowledge, and reported more vaccination among friends and neighbours.
In general, I think that the methods used were high quality, although I point to a few areas for additional clarification below. In particular, I am unsure about whether there are potential confounds in measuring the retweet/liking behavior (for example, due to users being able to see different comments and numbers of retweets across experimental conditions, which might drive different patterns of retweeting/liking). Regarding the offline behavior, I am happy this is included, since this data is of key importance for practical application. However, I am curious whether the people who were more likely to see the celebrity tweets might be better-informed in general. It seems possible to me that this might explain some of the results seen here, although this could also reflect a misunderstanding on my part (i.e. I am not sure that exposure to tweets was truly random; see below). I also note that while the evidence is suggestive of a link between mass media campaigns and increased vaccination rates, this is not yet clear (since the data here only speaks to the subjects’ awareness of others’ vaccination status, which the authors themselves also point out). I think that further research into how these campaigns translate into real-world behavior is critically needed, since it is unclear whether retweets/likes are a reasonable proxy for ‘increase in vaccination rates’ (which should be the real measure of a campaign’s success).
I point to some additional data that I would like to see. For this paper specifically, I would like to see some more detail about the celebrities who were included: should we expect that their Twitter audiences are pro-vaccination or against vaccination in general? I think we might see very diverse effects according to the celebrities’ Twitter audience (i.e. if a Twitter celebrity with an anti-vaxxer following tweeted a pro-vaccination message, I might expect fewer likes and retweets but perhaps more vaccinations from people who would otherwise never have gotten vaccinated). Understanding these effects seems to be a key step prior to real-world implementation.
I think that the statistical methods used are appropriate, and I commend the authors for pre-registering their study. However, I would love to see the code and data used for this study openly available (e.g. anonymised on DRYAD). I would also have liked to see more detail in the pre-registration analysis plan.
Overall I think this is a good paper, examining a topic of key importance. I think this study approaches a key question that is very difficult to examine experimentally, and its methods (although not perfect, which I think would be impossible) are fairly rigorous. I hope that further research is conducted in this area, so that we understand enough to put some of these principles into practice. I do note some potential long-term implications of celebrity endorsement (beyond the scope of this study) that we should attend to prior to real-world application, and I emphasise that understanding the psychological mechanisms underlying these effects may be key to ensuring effective implementation and to understanding generalisability.
Methods summary
The experiment used two main interventions to examine online behavior. In the first intervention, the authors tested (1) the effect of knowing that the tweet originated from a celebrity, and (2) the effect of knowing that the celebrity authored the tweet.
For (1), the authors took advantage of the fact that Twitter’s timeline shows the identity of only two people within tweets; the originator who wrote the tweet, and the person who you follow who directly passed it to you (i.e. who retweeted it in your network). Here, the message would either originate directly from a celebrity (and be retweeted by the followers of the celebrity), or be authored by a Joe/ Jane (and be retweeted by a celebrity, before being retweeted by the celebrity’s followers). The authors examined the behavior of the ‘followers-of-followers’ of the celebrities, because in the latter case these individuals would not be aware that the celebrity retweeted the message- they would only see the name of the regular user who wrote the message, and the person who they follow that directly passed it to them (the celebrity follower). This meant that it was possible for the authors to isolate the effect of knowing that the celebrity originated the message (versus a regular Twitter user), while holding the network position constant (since celebrities will generally have larger audiences than non-celebrities, but the celebrities have ‘anonymously retweeted’ the message from the perspective of the followers-of-followers, in the latter case). The authors found that tweets originating from a celebrity had a 72% increase in the retweet/ like rate, relative to those that originated from a regular Twitter user.
One problem, which the authors spot, is that the behavior of the celebrity followers (F1s), i.e. their decision to retweet or not, may be endogenous: the specific F1s who decide to retweet may depend upon whether the celebrity authored the message or not. Although the equations that the authors used controlled for the log number of F1s who retweeted (and therefore the number of followers-of-followers (F2s) who could potentially retweet it), the composition of F1s who retweet (e.g. whether they are especially influential) could still be endogenous and also affect F2’s behavior.
I think the authors’ methods to account for this are reasonable. They used a subset of their Joes/Janes (who were also F1s, i.e. direct followers of a celebrity), and had some of them randomly retweet the celebrity tweets/retweets, creating exogenous F1s. They then analysed the experiment using the subsample with exogenous F1s. Given that the results were similar to those observed in the full experiment (point estimates were actually a bit higher; I note that the p values are a bit lower, but I assume this is due to lower sample size), the authors concluded that this effect was not creating a bias in their results. Ideally, I would have liked to have seen more information about the authors’ plan to deal with this within the pre-registered analysis plan.
I am less sure about the authors’ response to the second potential confound: the fact that the retweets show the number of times that the original tweet has been liked and retweeted (which may vary by condition, since the treatment affects the retweet count). I am unsure of the extent of this effect (i.e. whether this created a difference in retweets across treatments that was considerably > 15; if so, I would remain unsure about whether this confound was contributing to the overall effect). Apologies if this is in the paper somewhere and I missed it.
An additional factor that I think could be somewhat endogenous—and does not appear to be mentioned in the paper—is the comments beneath the post. Based on my understanding of Twitter, I think F2 would be able to see these under the tweets and retweets. I wonder if these comments (from F1s) are different (both in terms of content and volume) based upon whether the celebrity authored or retweeted the tweet, and if this affects F2’s behavior differentially across condition. I would be really interested to see some analysis of this.
For (2), the authors then used the same experimental variation, but examined the celebrity followers; who had visibility into both the tweets that were directly authored by the celebrities, and the tweets that were written by a regular user but retweeted by a celebrity. The authors found that tweets that were authored by the celebrities were 200% more likely to be liked/ retweeted than those where the celebrity retweeted another Twitter user’s tweet, implying that 79% of the celebrity endorsement effect comes from authorship specifically.
I do not see any problems with this method; this is based on F1’s behavior (rather than F2s), so there are not the same concerns about whether F1’s behavior is endogenous.
In the second intervention, the authors tested the effect of including credible sources. To do this, they randomised whether a source was included in the tweet, and examined the behaviour of the celebrity followers. The authors found that citing a public health authority reduces the retweet and liking rate by 26.3%.
I think these methods (looking at F1’s behavior) are appropriate.
Importantly, the authors also measure offline behavior. As far as I am aware, this is the first real-world data examining behavior change as a result of a celebrity mass media campaign, and I congratulate the authors for collecting this data, which I think is very important for real-world application. These offline effects were measured by randomising the celebrities into two phases. A survey of the celebrity followers was conducted between these phases, and the between-celebrity randomisation was used to estimate the impact of the Twitter campaign on offline beliefs and behaviors (i.e. based on the number of celebrities that a user follows who were randomised to tweet before the phone survey, versus after).
One potential problem I see (which could be a misunderstanding on my part) is that I think that ‘Exposure to Tweets’ is a function of both the number of celebrities that the individual follows at baseline (of the celebrities who were involved in the campaign), as well as whether the celebrities they follow happen to be randomised to tweet before or after the survey. Presumably, the individuals who follow more of the campaign’s celebrities (and were therefore more likely to see the campaign's tweets before the phone survey) exhibit different characteristics than the average Twitter user (i.e. this method may present confounds).
These individuals who follow a lot of the campaign’s celebrities may follow more people in general, and perhaps be better-informed in general than their counterparts. These individuals might also spend more time on Twitter in general than the average user. Assuming that the celebrities who took part in this campaign are more health-conscious/ accurate than the average celebrity (since they volunteered to take part in this campaign), these users might follow more accurate health sources in general than the average Twitter user, be more interested in health, or be people who are especially motivated to seek out immunisation information/ talk about health-related topics.
I am therefore unsure if this explanation could contribute to the effect that the authors observe (that people with higher exposure to tweets were more likely to have heard of the Twitter hashtag, to have improved knowledge of immunisation facts, etc).
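The concern in the preceding paragraphs can be illustrated with a purely hypothetical simulation (invented numbers, not the authors' data, code, or specification). It assumes a latent `engagement` trait that drives both how many campaign celebrities a user follows and their baseline knowledge, and it sets the true effect of exposure to zero. Under these assumptions, a naive regression of knowledge on exposure picks up a spurious positive slope, while conditioning on the total number of campaign celebrities followed (`n_followed`) removes it, because only the before/after split is randomised. Whether the paper's estimates already condition on this count is something the sketch does not speak to.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_celebs = 20_000, 30  # hypothetical sizes

# Each celebrity is randomised to tweet before (1) or after (0) the survey.
phase1 = rng.permutation(np.repeat([1, 0], n_celebs // 2))

# ASSUMPTION: a latent engagement trait raises the chance of following
# any given campaign celebrity.
engagement = rng.normal(size=n_users)
p_follow = 1.0 / (1.0 + np.exp(-(engagement - 1.5)))
follows = rng.random((n_users, n_celebs)) < p_follow[:, None]

n_followed = follows.sum(axis=1)            # fixed at baseline, not randomised
exposure = (follows * phase1).sum(axis=1)   # followed celebrities tweeting pre-survey

# ASSUMPTION: knowledge depends on engagement only, so the true causal
# effect of exposure on knowledge is exactly zero in this simulation.
knowledge = 0.5 * engagement + rng.normal(size=n_users)


def ols_slope(y, X):
    """Return the OLS coefficient on the first regressor in X (after an intercept)."""
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta[1]


naive = ols_slope(knowledge, exposure)
adjusted = ols_slope(knowledge, np.column_stack([exposure, n_followed]))

print(f"naive slope on exposure:    {naive:.3f}")
print(f"slope given n_followed:     {adjusted:.3f}")
```

In this toy setup the adjusted slope is close to zero because, given `n_followed`, the number of followed celebrities that happen to fall in the first phase is determined by the randomisation alone.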
I also note that although we have information about people’s reports of others’ vaccination status, we can’t be sure whether ‘reports of increased vaccinations by friends/neighbours/etc’ actually correspond to an increase in vaccination. It could also be the case that people are simply more likely to report their vaccination status to the people who have a higher exposure to the campaign’s tweets (i.e. because they realise that person is more pro-vaccination than the average person), or that the person is more likely to talk about or remember other people’s vaccination status. However, the authors do point this out, i.e. by asking ‘does the campaign appear to change immunization behavior as reported by our survey respondents?’
I would be curious if more survey questions were asked and not published (can’t spot this information on the pre-registration or appendix.)
I also note that the subjects were recruited via ads (I assume that there wasn’t another viable option). I do think that people who are likely to sign up for a study based on a Twitter advertisement are likely to be more heavily engaged with Twitter than the average user–perhaps these people are more likely to be responsive to informative tweets than the average (less engaged) Twitter user.
I think that the Poisson regression models used to estimate the effect of endorsement and authorship, and the logistic regression model used to understand the offline effects, are appropriate. I think that the statistical methods used in the paper are of high-quality, although I note that I am not especially familiar with some components of the authors’ methods (e.g. double post-LASSO)—if other reviewers have more expertise here, I would defer to their views.
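For readers less familiar with Poisson models, the following minimal sketch shows the standard mapping between a coefficient on a treatment dummy and a percentage change in the expected count under a log link: a coefficient β implies a change of (exp(β) − 1) × 100%. The percentages used are those quoted in this evaluation; treating them as derived from Poisson coefficients in exactly this way is an assumption, and the function names are illustrative only.

```python
import math


def pct_change_to_coef(pct_change):
    """Log-link coefficient implied by a percentage change in the expected count."""
    return math.log1p(pct_change / 100.0)


def coef_to_pct_change(beta):
    """Percentage change in the expected count implied by a log-link coefficient."""
    return math.expm1(beta) * 100.0


# Effect sizes quoted in the evaluation text (signs: increase > 0, decrease < 0).
quoted_effects = {
    "celebrity origin, retweet/like rate": 72.0,
    "credible source cited, retweet/like rate": -26.3,
}

implied_coefs = {label: pct_change_to_coef(pct) for label, pct in quoted_effects.items()}

for label, beta in implied_coefs.items():
    print(f"{label}: beta = {beta:.3f}, back to {coef_to_pct_change(beta):.1f}%")
```

Under this mapping, a 72% increase corresponds to a coefficient of roughly 0.54, and a reduction corresponds to a negative coefficient; the two directions are not symmetric in percentage terms, which is worth keeping in mind when comparing increases and decreases.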
One point of clarification (regarding the dummy variable used for the different types of messages included); am I correct that this only related to content (fact/ importance/ misconception correction?) If so, I wasn’t sure why this had a dummy variable, since the message content factor was already randomised. This may be a misunderstanding on my part.
I commend the authors for pre-registering their study on the AEA RCT registry. However, the analysis plan (here) is blank. I think this is an oversight since it left me unable to assess whether there were deviations from the original analysis plan (e.g. if lots more survey questions were asked and results not reported).
The data and code used to analyse this study don’t appear to be publicly available. If possible, I think that having data (e.g. anonymised and placed on DRYAD) and code available would be of significant benefit.
I think the topic is of high importance, and is very relevant to global priorities (although I think that more research is needed until this paper can be action-guiding in practice; see below.)
Understanding the role of celebrity endorsement in driving up immunisation is a gap in the literature, where there is significant disagreement. If celebrity endorsement does work to drive up vaccination rates, this could be a key method used to improve vaccination rates in people who might not have gotten vaccinated otherwise. Given the reach of social media, these interventions have the potential to be hugely impactful if they do have an effect upon their audience.
I would love to see more information about the celebrities who volunteered for this campaign;
A key question I have for this study regards the celebrities (and their followers) who volunteered to be a part of this campaign. Thinking about the US, vaccination hesitancy became split along political lines; I would expect that celebrities with lots of Democratic-leaning followers would get a lot of retweets/likes of a pro-vaccination tweet, but probably fewer ‘counterfactual’ vaccinated people as a result. I.e. I would expect that the people who liked these tweets were always pretty likely to get vaccinated, regardless of the tweets from that celebrity. Are there likely to be similar effects going on within the Indonesian data? I am not sure if any of the campaign’s celebrities had Twitter audiences who were especially likely to be against vaccination, or who were likely to be especially pro-vaccination. I’d love to see some microdata about how people from different audiences (based on characteristics of the celebrity’s typical Twitter following) respond to these tweets; I suspect some celebrities have a far higher impact (in terms of driving up the vaccination rate) than others.
Understanding ‘why this works’;
As a general point (outside the scope of this paper), I think that understanding why celebrity endorsements work (from a psychological perspective) will be key to ensuring their best use, and to understanding the generalisability of the findings from this paper. For example, is it that some people trust celebrities more than official healthcare organisations? If so, it might be best to target celebrities that are highly trusted in vaccine-hesitant communities. On the other hand, perhaps people want to be similar to high-prestige celebrities? If so, it might be best to target celebrities that people in vaccine-hesitant communities admire or want to be like. Without understanding these mechanisms, policy-makers are liable to make implementation errors that reduce effectiveness. I think that this mechanistic understanding is key to understanding whether these results will generalise to other populations.
Similarly, to what degree should we expect social media behavior to follow through to real-world behavior? Is it the case that people retweet certain celebrities to generate a particular image on social media, or because they are more altruistically motivated to pass on accurate information? Understanding these factors may help researchers to design their campaigns appropriately (i.e. to drive up vaccination rates; note that the amount of retweeting/liking may be an inaccurate proxy for a media campaign’s success).
Regarding the finding that citation of credible sources decreases retweets;
I have a lot of uncertainty here, and highlight the need for more research. I suspect this is an area where (as above) understanding ‘what is the mechanism underlying this effect’ may be important.
Is it that including these citations makes a tweet less readable and look more boring? Is it that people on social media mistrust medical organisations in general? Is it that including these citations makes it clearer that the tweet is not in the celebrity’s ‘own words’ (so it looks less personal)? The best solution depends on which factor is driving this effect.
I am also unclear if ‘number of retweets’ is a good proxy here for the result that a policy maker would be most interested in- ‘number of additional vaccinations’. It doesn’t seem impossible to me that linking to a credible source might generate fewer retweets (e.g. if it looks more academic/ boring) but result in more vaccinations (e.g. it might be more boring, but it’s also more trustworthy- and that’s the factor that matters more for real-world behavior).
Finally, there might be longterm effects to encouraging celebrities to cite (or not cite) credible medical sources. Perhaps encouraging celebrities to cite these sources increases trust in these organisations in the long run, even if it results in fewer retweets (I have no idea, but just to highlight that there may be longterm effects here beyond the number of retweets).
Although this is outside the scope of this paper, I think it is worth considering whether there are potential long-term deleterious effects of healthcare organisations partnering with celebrities.
I think a cost-benefit analysis might well still say that the effect is likely to be very beneficial overall. But one risk is that the celebrity is later photographed going against the advice they gave (e.g. not wearing a mask, breaking social isolation rules etc); perhaps this could result in a loss of trust in the healthcare organisation in general.
I am not sure whether (over a long-term timeframe) this kind of partnership might encourage people to rely on celebrities for healthcare information, rather than going to credible sources- which could obviously have negative effects.
On the other hand, perhaps people who update on healthcare information from celebrities would never rely upon the credible sources anyway.
How long have you been in this field?
Managing editor’s summary (to anonymize): Evaluator has been considering mass-media work only recently. They previously worked for 5-10 years as a researcher in psychology.
How many proposals and papers have you evaluated?
Around 15 peer-reviewed papers, in psychology and [ME: a natural science field].
Evaluation of "Do Celebrity Endorsements Matter? A Twitter Experiment Promoting Vaccination In Indonesia" by Anirudh Tagat for the Unjournal