Description
Evaluation of "Willful Ignorance and Moral Behavior" for The Unjournal.
We are very grateful to Romain Espinosa and Joshua Tasoff for their thorough reading of our paper and their constructive comments, which will be helpful in preparing a revised version. Below we briefly respond to their comments (and include the original wording for the reader’s convenience).
1. Perhaps my biggest concerns are about the field results. I believe that there is an effect in the time window, but I do not understand the choice of this window. Why not use the full sample window? Won’t this have greater power?
a. Why not use a panel regression or a difference-in-difference estimation where the observation is whether a specific meal is meat or not? This is the approach used in Jalil, Tasoff, and Vargas-Bustamante (2020, 2023).[1][2]
b. The analysis appears underpowered to estimate any sort of dynamics. How many meals are in the analysis window? How many meals are in the entire data set? It would help to have this reported in Table A.14.
i. If there are on average 3 meals in the two weeks prior to the intervention, that implies about 1.5 meals/week. If there are about 28 weeks in the sample with 261 subjects that implies about 10,000 meals in the whole sample. However, if you’re considering a window five weeks long, that implies a sample of about 2,000 meals. I think it is difficult to estimate dynamics from a sample so small.
1. Indeed, Figs 3 and 4 show that the 95% confidence intervals are quite wide, and the immediate post-treatment mean is within the confidence intervals of the mean a few weeks later.
2. Jalil et al. (2020)[1] came to the incorrect conclusion from their larger sample (~50,000 meals) that the treatment effects attenuated over time. But they discovered in their 2023 follow-up that there was no attenuation when they expanded the window over three years (~100,000 meals).
ii. What would happen if, rather than using a moving average, which counts the same observations multiple times, you estimated treatment effects for specific post-windows: week 1, week 2, week 3, or month 1 vs. month 2?
c. That said, I still think there is value in the field study. I am surprised that there is any significant effect from such a short 5-minute intervention. I think this shows robustness as well as showing that the effects persist outside of the lab. It is truly impressive that such a quick intervention can last for weeks. That is my takeaway; I am less credulous about the claims on the dynamics.
Authors’ response: Thank you for sharing these thoughts. We explain our choice in Section 4.3 but will make sure to do that more carefully in our next revision of the paper.
The main reason for estimating effects in a moving time window is that individuals do not purchase food in the canteen daily, but only occasionally (on average, participants eat about 1.5 meals per week). Estimating effects in a 7-day moving time window ensures that the sample composition stays comparable across the periods considered. Furthermore, the semester breaks and the COVID pandemic limit the post-observation period for many participants (see also Appendix Figure A.14 for the number of observations in the time windows). This is why we limit the observation period to at most three weeks after the experiment.
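As a rough illustration of this moving-window logic, the sketch below uses simulated data with hypothetical column names; it reports raw treated-control differences, whereas the paper’s actual estimator additionally applies inverse probability weights.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n_subjects, n_days = 261, 21                 # post-period of roughly three weeks
subject = rng.integers(0, n_subjects, 1200)  # ~1.5 canteen meals per subject and week
meals = pd.DataFrame({
    "subject": subject,
    "treated": subject % 2,                  # treatment status is fixed within subject
    "day": rng.integers(1, n_days + 1, 1200),
    "is_meat": rng.integers(0, 2, 1200),
})

# Raw treated-control difference in the meat share for 7-day windows centered on each day
for mid in range(4, n_days - 2):
    window = meals[meals["day"].between(mid - 3, mid + 3)]
    diff = (window.loc[window["treated"] == 1, "is_meat"].mean()
            - window.loc[window["treated"] == 0, "is_meat"].mean())
    print(f"midpoint day {mid:2d}: {len(window):4d} meals, difference {diff:+.3f}")
```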
We agree that we could also use a difference-in-differences estimator with a specific ad hoc pre- and post-period (with the limitation that some individuals would not be observed in both periods). We decided to focus on the IPW estimator applied to a specific (pre- or) post-period because we wanted to use the same methodology for our analysis of the lab and field data.
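For comparison, a minimal sketch of the meal-level difference-in-differences specification the comment refers to, again on simulated data with hypothetical column names (this is not the specification used in the paper), could look as follows:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
subject = rng.integers(0, 261, 2000)
day = rng.integers(-14, 22, 2000)            # two pre-weeks and three post-weeks
meals = pd.DataFrame({
    "subject": subject,
    "treated": subject % 2,
    "post": (day > 0).astype(int),
    "week": day // 7,
    "is_meat": rng.integers(0, 2, 2000),
})

# Two-way fixed effects DiD: subject and calendar-week dummies absorb the main effects
did = smf.ols("is_meat ~ treated:post + C(subject) + C(week)", data=meals).fit(
    cov_type="cluster", cov_kwds={"groups": meals["subject"]}
)
print(did.params["treated:post"])            # difference-in-differences estimate
```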
We also share the view that week-specific treatment effects are informative (as suggested in comment b.ii). They are shown in our graph, but indirectly (e.g., the treatment effect for the first week is depicted as the treatment effect for the time window with day 3 as its midpoint).
In general, we agree that the treatment effects in the field are less precisely estimated than those in the laboratory, which is a practical limitation of our study. Nonetheless, we believe they are valuable because they show that the intervention affects real-world behavior outside of the laboratory and suggest that treatment effects fade over time.
The battery of belief, emotion, and preference questions is a valuable addition to the paper. I often wonder how much interventions such as this shift behavior through the belief channel vs. more of an emotional/attentional channel. I think this paper presents very nice evidence on the latter (PANAS [Positive and Negative Affect Schedule] and distaste of violence). This stands in contrast to the other studies in this area, which seem to focus on information content that is less emotionally charged: Schwitzgebel, Cokelet, and Singer (2020, 2021) and Jalil, Tasoff, and Vargas-Bustamante (2020, 2023).[1][2]
Authors’ response: Thank you for specifying that you consider this a valuable contribution. We will make sure to clarify this contribution of our paper in our next revision.
As far as design goes, I am not familiar with the IPW [inverse probability weighting] estimator, but I understand that the assignment is random conditional on WTP (but unbalanced). I wonder how well this method performs compared to alternative designs in experimental economics in which assignment is random for a large fraction of subjects and […] a small fraction (5 or 10% usually) are placed in an “incentive compatible group” where they get assignment based on choices and chance. Usually, this IC group is thrown out. I wonder which method is more efficient. Or perhaps the obvious best approach is to use the above method (random assignment) and use the IPW on the IC group only?
Authors’ response: Our choice to pursue the multiple price list approach with the IPW estimator in our study is based on two main rationales. First, we cannot force subjects to watch the video for ethical reasons, which rules out full random assignment. Our multiple price list approach can be seen as an encouragement design, where prices are selected such that most individuals (do not) acquire the information if prices are low (high) enough. Second, we want to ensure that participants are sufficiently encouraged to carefully think about each of the 11 decisions in the multiple price list (MPL). If we only implemented the MPL for a small subgroup of individuals (e.g., 5%), the probability of a specific decision [being] consequential would be very low (e.g., less than 0.5%).
How the efficiency of our method compares to the one described in the comment will depend on several factors, including the size of the IC group to be thrown out, the predictive power of the WTP in explaining behavior, and the extent of imbalance with respect to WTP. In our design, we increase power by randomly selecting the maximum prices with a higher probability (each with 27.5%, instead of 5% for the intermediate prices). This brings our design closer to full random assignment, while respecting incentive compatibility and ethical concerns.
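To make this concrete, the sketch below derives the assignment probabilities and IPW weights implied by such a price-draw distribution. The 11 decisions and the 27.5%/5% draw probabilities follow the text; the price values and the decision rule are hypothetical placeholders.

```python
import numpy as np

prices = np.linspace(-8, 8, 11)   # hypothetical relative prices (EUR), symmetric around 0
draw_prob = np.full(11, 0.05)     # intermediate prices are drawn with 5% probability each
draw_prob[[0, -1]] = 0.275        # the two extreme prices are drawn with 27.5% each
assert np.isclose(draw_prob.sum(), 1.0)

def propensity(wtp):
    """P(subject ends up watching the animal video), assuming they pick it
    whenever its relative price (opportunity cost) does not exceed their WTP."""
    return draw_prob[prices <= wtp].sum()

def ipw_weight(wtp, watched):
    """Inverse probability weight implied by the design; only defined when 0 < p < 1."""
    p = propensity(wtp)
    return 1.0 / p if watched else 1.0 / (1.0 - p)

print(propensity(3.0), ipw_weight(3.0, watched=True))

# Arithmetic behind the response above: if only 5% of subjects faced a real draw,
# a given one of the 11 decisions would be consequential with probability
print(0.05 / 11)                  # ~0.45%, i.e., less than 0.5%
```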
My first major comment concerns one of the core elements of the paper, namely the willingness-to-pay (WTP) for watching the video. At first, I did not understand that the prices were relative, i.e., that they were opportunity costs of watching the video. The authors write in the body of the manuscript that they are ‘relative prices’, but what they mean by that became clear only after looking at the instructions.
While this is not a concern for the validity of the paper, I have two behavioral concerns about the wording here. First, I think that it is important to underline in the main body of the manuscript that the prices are opportunity costs. I suspect participants would give different answers if they had to pay EUR 8 to avoid watching the video about animal farming, compared to the current situation where they would not gain EUR 8 if they decided to watch the alternative video. Actively paying money to avoid watching a video is a more active behavior. It relates to status quo bias and loss aversion. The initial endowment would also matter here. The experimental design captures information avoidance as a more passive phenomenon than what readers could understand from the current wording.
Second, I am not sure that what the authors measure is a WTP. Here, the authors offer participants some amount of money if they agree to watch a video. It is much more like compensating people for doing a task than asking them to actively pay to watch the video. Thus, I feel that this is closer to a willingness-to-accept (WTA) than a WTP. I think that this is of particular importance in the case of information acquisition, where there are different behaviors at stake: actively looking for information, passively accepting the information, actively avoiding information (and possibly passively avoiding information, if this makes sense).
Again, these questions do not affect the results of the paper, but I see them as important to understand what is measured and what we learn from it.
Authors’ response: Thank you. We will make sure to mention that prices are opportunity costs in our next revision of the manuscript.
We agree that implementing different relative prices by letting participants pay money instead of forgoing bonuses might trigger behavioral mechanisms that scale our WTP measure up or down. Importantly, such a rescaling would not affect our categorization of information avoiders and seekers since the corresponding choice in the multiple price list is based on a relative price of 0 and thus does not involve any bonuses (or payments if relative prices were implemented with the alternative approach).
In our setting, the multiple price list requires an active choice between two videos/tasks (i.e., there is no default) and implements the relative prices symmetrically (as many choices with a bonus for the animal video as choices with a bonus for the central bank video). This, we believe, should mitigate any status quo or endowment effect.
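As a toy illustration of the categorization point above (price values are hypothetical; only the symmetry of the list and the zero-price decision follow the text):

```python
relative_prices = [-8, -6, -4, -2, -1, 0, 1, 2, 4, 6, 8]   # hypothetical values, symmetric around 0

def classify(choice_at_zero_price):
    """The avoider/seeker label uses only the decision at a relative price of 0,
    i.e., the row in which neither video carries a bonus."""
    return "seeker" if choice_at_zero_price == "animal video" else "avoider"

# A framing that rescales the money-relevant rows (e.g., payments instead of forgone
# bonuses) may shift the switch point, but it leaves this zero-price row untouched.
print(classify("animal video"), classify("central bank video"))
```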
My second major comment relates to the beliefs and belief updating. First of all, we can note that the belief items are not as precisely elicited as the behaviors of the participants (which is fine given that the focus is on behaviors). However, I would suggest being more careful when discussing the beliefs given this. First, the beliefs are stated by the participants. The authors cite one of my [papers] (Espinosa & Stoop, 2021).[3] As we show in this work, there are significant differences between incentivized and non-incentivized reported beliefs in the case of meat consumption. If people somehow engage in motivated reasoning after watching the video to limit cognitive dissonance, if the video makes the belief question more salient, or if, on the contrary, cognitive dissonance becomes impossible after watching the video, the authors might misestimate the treatment effect on beliefs.
Second, I also think that the belief items are relatively vague, such that it is unclear what conclusions we can draw from them. Note that I did not find the precise wording of the belief questions in the paper or in the PAP (https://www.socialscienceregistry.org/trials/5015). I think that only evaluating beliefs on a 1-to-5 Likert scale about the ‘pigs’ living conditions’ is not the most efficient design if we really care about understanding what cognitive process happens here and what participants learn from the video. These questions seem to reflect a relatively general evaluation of the pigs’ welfare rather than accurately assessing knowledge of their living conditions. While I do not see this point as a threat to the validity of the paper, I would have appreciated it if the authors had mentioned this issue and recognized the possible limits in terms of what we can learn from the results.
Third, I did not understand from the manuscript the actual design regarding the beliefs. On page 18, in the first paragraph, the authors discuss the difference in beliefs between information seekers and information avoiders. They say that the average belief deteriorated with the video and that the difference in belief updating between the two groups is not statistically significant. However, the authors said on page 13 that they asked the belief and preference questions before watching the video. So, how did the authors evaluate the change in beliefs? Was this question asked twice (within-subject) or conditional on exposure (between-subject)?
Assuming that the authors evaluated the beliefs twice, I might have some concerns here. One issue is that most of the participants on this question are distributed at the highest level of the Likert scale (about 70% of participants report the maximum value, judging from Figure A5). When assessing a difference between treatment groups or a treatment effect, ceiling effects are important, as they can lead to considerably underestimating the difference. I would suggest using a Tobit model here to take this issue into account. (I assume that there is no issue with combining it with inverse probability weighting.) Another related issue concerns the difference in beliefs. The authors write that the difference in beliefs is 0.15 for information avoiders and 0.20 for information seekers (page 18). However, note that the difference is non-negligible (it is about 33% larger). The lack of significance for the difference does not mean that there is no difference (well-known motto: absence of evidence is not evidence of absence). This is particularly true in the case of underpowered tests. And as I mentioned above, this is likely to be the case here because of the ceiling effects. If we look at Table A9, we see that the average beliefs are 4.69 for information avoiders and 4.59 for information seekers. It seems that information seekers have more room to update their beliefs on this Likert scale than information avoiders (because of the ceiling effect).
Overall, I would suggest being more careful about the conclusions the authors draw about the beliefs (e.g., ‘Hence, differences in baseline beliefs or in belief updating do not explain why some individuals engage in willful ignorance while others seek information.’).
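For readers who want to see what the suggested combination of a Tobit model with inverse probability weighting could look like, here is a minimal maximum-likelihood sketch with right-censoring at the top of the Likert scale; it is an illustration with hypothetical variable names and simulated data, not an estimator used in the paper.

```python
import numpy as np
from scipy import optimize, stats

def weighted_tobit(y, X, weights, upper=5.0):
    """Right-censored Tobit, y = min(X @ beta + eps, upper) with eps ~ N(0, sigma^2),
    fitted by weighted maximum likelihood (weights could be the IPW weights)."""
    X = np.column_stack([np.ones(len(y)), np.asarray(X)])        # add an intercept
    y, weights = np.asarray(y, float), np.asarray(weights, float)
    censored = y >= upper

    def negloglik(params):
        beta, sigma = params[:-1], np.exp(params[-1])            # sigma > 0 via log parameter
        xb = X @ beta
        ll = np.where(
            censored,
            stats.norm.logsf((upper - xb) / sigma),               # P(latent belief >= ceiling)
            stats.norm.logpdf((y - xb) / sigma) - np.log(sigma),  # density of interior responses
        )
        return -(weights * ll).sum()

    res = optimize.minimize(negloglik, np.zeros(X.shape[1] + 1), method="BFGS")
    return res.x                                                  # [intercept, slopes..., log sigma]

# Toy call on simulated Likert-type data with a ceiling at 5
rng = np.random.default_rng(2)
seeker = rng.integers(0, 2, 300)
belief = np.minimum(4.3 + 0.3 * seeker + rng.normal(0, 0.6, 300), 5.0)
print(weighted_tobit(belief, seeker.reshape(-1, 1), weights=np.ones(300)))
```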
Authors’ response: Thank you for this comment. We agree with the suggestion of being more careful when drawing conclusions about the role of beliefs in (not) explaining the effect heterogeneity. The belief elicitation was kept simple (five-point Likert scale) to limit the cognitive load for subjects when facing several questions throughout the experiment. As pointed out, this approach limits the scope for identifying changes in beliefs, also because 70 percent of subjects already select the most negative belief category before watching the video. Nevertheless, we find significant belief updating on average and conditional on being an avoider (seeker).
Thank you also for pointing out the lack of clarity in how we measure belief updating. Individuals who watch the video on intensive farming report their beliefs a second time, which allows us to identify belief updating within subjects.
My third major comment is about what drives the change in behavior and what the authors measure. I think that the authors test the overall effect of the video, which could (in theory) be decomposed into two effects: a pure informative effect (people learn about the state of the world) and an emotional/affective effect of the video (people feel negative affects when exposed to the video). Of course, it is very difficult to decompose the overall effect into these two sub-effects. It might also be that such a decomposition is not relevant because the cognitive and affective processes are interconnected.
However, the authors provide evidence supporting the idea that avoiders and seekers show different affects associated with the video (see Section 4.2 and Table A13). Given that (i) avoiders and seekers have similar priors, (ii) they have similar posteriors, and (iii) they have different affective/emotional reactions, it seems to me that the former are negative-affect avoiders more than information avoiders. So, are the authors evaluating willful ignorance here in the sense that people do not want to receive information that will change their beliefs, or are they measuring emotional protection, i.e., avoiding exposure to information they already know but which they expect to generate negative emotions?
To be fully transparent: I think that this question is beyond the current knowledge and discussions in economics and I think that, while the paper might not be able to address it, it might open the path to new research on the topic.
Authors’ response: We share the view that it is interesting to understand whether and to what extent willful ignorance is driven by avoiding new information and/or avoiding an emotional response. In our setting, we observe significant belief updating in response to the video. We also detect a strong emotional response, which is consistent with the treatment effect heterogeneity between avoiders and seekers. These results are consistent with recent models of information acquisition/avoidance where receiving information might affect not only beliefs but also the attention devoted to certain beliefs (see, e.g., Golman et al., 2022), and thereby the emotional response these beliefs may trigger. Yet, whether or not the emotional response arises as a consequence of belief updating goes beyond the scope of the paper.
First, the authors state several times in the paper that the canteens ‘typically serve meat from intensive farming’. While it might be correct, it would have been useful to have some numbers here. Ideally, I would have preferred to have the participants’ beliefs on this. In fact, in my second work cited by the authors,[4] we find that participants who watched such a video are likely to blame the intensive farming industry, but a large share of them then say that it is not representative of the animal industry in their country. So, an obvious protective mechanism to maintain one’s meat consumption is to say that it does not concern one’s own consumption. While it is not a concern for the paper (it would imply that the authors underestimate the treatment effect), I think it is an important point.
Authors’ response: Thank you for this suggestion. Unfortunately, we did not elicit those beliefs. In future revisions of our manuscript we will provide numbers on the extent of intensive farming in meat production. In Germany, for instance, only about 1% of pigs are raised under husbandry conditions certified by the EU eco label, while 96% of pigs are raised in barns on slatted floors (https://www.bmel-statistik.de/landwirtschaft/tierhaltung/schweinehaltung, last access: 30.7.2024), a common practice in intensive farming.
Second, on a side note, I would like to stress that increasing the costs of information avoidance (discussed at the end of the paper) could also lead to other behavioral issues. For instance, the authors mention the strategy of activists to display information on product packages or approach pedestrians on the street. Increasing the level of coercion could lead to backlash effects such as reactance, where people feel restricted in their choice and could start criticizing the activists rather than considering the information from the video. This effect could be even stronger for information avoiders. Thus, strategies affecting information acquisition costs, as discussed in Section 5, might induce other, possibly backfiring, behaviors.
Authors’ response: Thank you, we agree. This point is particularly relevant for policy making. To the extent that offering a bonus for acquiring information is already seen as pushing people in a certain direction, such backfiring effects are partially included in our estimates. Yet, other forms of increasing the level of coercion may trigger larger backlash effects.
[1]Jalil, A. J., Tasoff, J., & Bustamante, A. V. (2020). Eating to save the planet: Evidence from a randomized controlled trial using individual-level food purchase data. Food Policy, 95, 101950. https://doi.org/10.1016/j.foodpol.2020.101950
[2]Jalil, A. J., Tasoff, J., & Bustamante, A. V. (2023). Low-cost climate-change informational intervention reduces meat consumption among students for 3 years. Nature Food, 4(3), 218–222. https://doi.org/10.1038/s43016-023-00712-1
[3]Espinosa, R., & Stoop, J. (2021). Do people really want to be informed? Ex-ante evaluations of information-campaign effectiveness. Experimental Economics, 24(4), 1131–1155. https://doi.org/10.1007/s10683-020-09692-6
[4]Espinosa, R., Borau, S., & Treich, N. (2024). Impact of NGOs’ Undercover Videos on Citizens’ Emotions and Pro-Social Behaviors. SSRN Electronic Journal. https://doi.org/10.2139/ssrn.4778679