Abstract
We organized two evaluations of the paper “Meaningfully reducing consumption of meat and animal products is an unsolved problem: A meta-analysis” [1]. Both evaluators offer some praise (of the importance of the topic and the transparent reporting), but they are critical on the whole, while offering substantive suggestions for improvement and emphasizing the need for a more systematic approach. Their concerns include the transparency, design logic, and robustness of the paper’s methods, particularly its search strategy and inclusion criteria, outcome selection, and handling of missing data. Jané highlights guesswork and approximation in effect size coding, the failure to account for imputation variance, and the neglect of key sources of study bias (such as selective outcome reporting). To read these evaluations, please see the links below.
Evaluations
1. Evaluator 1
2. Matthew B. Jané
Author response
Overall ratings
We asked evaluators to provide overall assessments as well as ratings for a range of specific criteria.
I. Overall assessment (See footnote)
II. Journal rank tier, normative rating (0-5): On a ‘scale of journals’, what ‘quality of journal’ should this be published in? Note: 0= lowest/none, 5= highest/best.
| | Overall assessment (0-100) | Journal rank tier, normative rating (0-5) |
|---|---|---|
| Evaluator 1 | 75 | 3.5 |
| Matthew B. Jané | 39 | 1.1 |
See “Metrics” below for a more detailed breakdown of the evaluators’ ratings across several categories. To see these ratings in the context of all Unjournal ratings, with some analysis, see our data presentation here.
See here for the current full evaluator guidelines, including further explanation of the requested ratings.
Evaluation summaries
Anonymous evaluator 1
This is a strong statistical analysis of the literature on meat and animal product consumption. It shows weaker evidence than previous reviews, purportedly because of a stricter set of inclusion criteria focused on RCTs. However, the authors have not followed standard methods for systematic reviews, basing their analysis on previous meta-analyses with which they were familiar. They searched Google Scholar, which is not in fact a bibliographic database, in place of traditional databases such as Scopus, Web of Science Core Collection, and CAB Abstracts. I feel there is a strong likelihood that studies have been missed. The authors also fail to conduct important checks of consistency in screening, data extraction, and appraisal of risk of bias. Their risk-of-bias assessment also seems less robust than standard approaches in evidence synthesis using peer-reviewed tools.
Matthew B. Jané
Strengths: The study is highly transparent, with fully reproducible code and open data. The analytic pipeline is well-documented and is an excellent example of open science.
Limitations: However, major methodological issues undermine the study's validity. These include improper missing-data handling, the unnecessary exclusion of small studies, extensive guesswork in effect size coding, the lack of a serious risk-of-bias assessment, and the exclusion of all but one outcome per study. Overall, the transparency is strong, but the underlying analytic quality is limited.
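To make the imputation-variance point concrete: when missing effect sizes are filled in by imputation, the pooled standard error should reflect both within-imputation and between-imputation uncertainty. Below is a minimal Python sketch of Rubin's rules using hypothetical numbers (not data from the paper); treating imputed values as if they were observed drops the between-imputation term and understates the standard error.

```python
import numpy as np

def pool_rubins_rules(estimates, variances):
    """Combine estimates from m imputed datasets via Rubin's rules.

    estimates: point estimates, one per imputed dataset
    variances: squared standard errors, one per imputed dataset
    Returns the pooled estimate and its total variance.
    """
    q = np.asarray(estimates, dtype=float)
    u = np.asarray(variances, dtype=float)
    m = len(q)
    q_bar = q.mean()            # pooled point estimate
    w = u.mean()                # within-imputation variance
    b = q.var(ddof=1)           # between-imputation variance
    t = w + (1 + 1 / m) * b     # total variance (Rubin, 1987)
    return q_bar, t

# Hypothetical effect-size estimates from m = 5 imputations:
est, var = pool_rubins_rules([0.12, 0.08, 0.15, 0.10, 0.11],
                             [0.004, 0.004, 0.005, 0.004, 0.004])
print(f"pooled d = {est:.3f}, SE = {var ** 0.5:.3f}")
```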
Metrics
Ratings
See here for details on the categories below, and the guidance given to evaluators.
| Rating category | Evaluator 1 (Anonymous): Rating (0-100) | 90% CI (0-100) | Comments (Evaluator 1) | Evaluator 2 (Matthew B. Jané): Rating (0-100) | 90% CI (0-100) |
|---|---|---|---|---|---|
| Overall assessment | 75 | (60, 80) | Generally probably good | 39 | (14, 62) |
| Claims, strength, characterization of evidence | 60 | (50, 70) | Methodological problems lower [i.e., reduce] strength of conclusions | 59 | (30, 85) |
| Advancing knowledge and practice | 60 | (50, 70) | Interesting conclusions but unreliable given methodological flaws | 50 | (5, 95) |
| Methods: justification, reasonableness, validity, robustness | 40 | (30, 50) | Poor methodology beyond statistics | 25 | (10, 43) |
| Logic & communication | 80 | (70, 90) | Excellently written, very clear, logical | 40 | (19, 65) |
| Open, collaborative, replicable | 70 | (60, 80) | Strong, but needs a lot more data in line with evidence synthesis standards | 91 | (86, 100) |
| Real-world relevance | 90 | (80, 100) | This is vital stuff, just do it right; the robust methods are there for your protection | 88 | (45, 93) |
| Relevance to global priorities | 90 | (80, 100) | | 88 | (45, 93) |
Journal ranking tiers
See here for more details on these tiers.
| Judgment | Evaluator 1 (Anonymous): Ranking tier (0-5) | 90% CI | Evaluator 2 (Matthew B. Jané): Ranking tier (0-5) | 90% CI |
|---|---|---|---|---|
| On a ‘scale of journals’, what ‘quality of journal’ should this be published in? | 3.5 | (3.0, 4.0) | 1.1 | (0.4, 1.9) |
| What ‘quality journal’ do you expect this work will be published in? | 4.0 | (3.5, 4.5) | 3.2 | (2.3, 4.1) |
We summarize these tiers as:
- 0.0: Marginally respectable/Little to no value
- 1.0: OK/Somewhat valuable
- 2.0: Marginal B-journal/Decent field journal
- 3.0: Top B-journal/Strong field journal
- 4.0: Marginal A-journal/Top field journal
- 5.0: A-journal/Top journal
Claim identification and assessment (summary)
For the full discussions, see the corresponding sections in each linked evaluation.
| Evaluator | Main research claim | Belief in claim | Suggested robustness checks | Important ‘implication’, policy, credibility |
|---|---|---|---|---|
| Evaluator 1 (Anonymous) | Meat and animal product reduction interventions do not seem to be as effective as previous evidence syntheses suggested. | It is hard to say given the methods used: maybe they missed some important studies, maybe they still included some that were not robust. | Conduct searches in Scopus, WoSCC, and CAB Abstracts to see what search results overlapped and what might have been missed. Provide cross-checking tests of screening, data extraction, and critical appraisal. Provide more details on the risk-of-bias appraisal. Include a PRISMA/ROSES checklist. | Policy/funding implication: impossible to say with the level of reliability I have. It looks great, but I don’t really trust your methods. |
| Evaluator 2 (Matthew B. Jané) | “We conclude that while existing approaches do not provide a proven remedy to MAP consumption, designs and measurement strategies have generally been improving over time, and many promising interventions await rigorous evaluation.” | I agree with the claim because this is all I have read on this topic, and the evidence of effectiveness (or ineffectiveness) does not appear very compelling. | Obtain all relevant outcomes and studies (not filtered by sample size < 25). Work with a librarian on a supplementary systematic search to ensure good coverage. Use proper missing-data methods for non-significant and unreported effects. Add a rigorous risk-of-bias assessment that addresses glaring issues. | N/A |
Evaluation managers’ discussion
This discussion was written mostly by Tabaré Capitan, with contributions from David Reinstein where indicated.
This paper received two critical and methodologically informed evaluations from researchers with substantial experience in quantitative research synthesis. The evaluators identified a range of concerns regarding the transparency, design logic, and robustness of the paper’s methods—particularly in relation to its search strategy, outcome selection, and handling of missing data. Their critiques reflect a broader tension within the field: while meta-analysis is often treated as a gold standard for evidence aggregation, it remains highly sensitive to subjective decisions at multiple stages.
Both evaluations highlight issues with transparency in search strategies, subjective coding decisions, and the treatment of missing or ambiguous data. Importantly, the authors themselves acknowledge many of these concerns, including the resource constraints that shaped the final design. Across the evaluations and the author response, there is broad agreement on a central point: that a high degree of researcher judgment was involved throughout the study. Again, this may reflect an important feature of synthesis work beyond the evaluated paper—namely, that even quantitative syntheses often rest on assumptions and decisions that are not easily separable from the analysts' own interpretive frameworks. These shared acknowledgements may suggest that the field currently faces limits in its ability to produce findings with the kind of objectivity and replicability expected in other domains of empirical science.
We also note that one of the evaluators chose to forgo anonymity in submitting their review. While The Unjournal permits reviewers to remain anonymous, we support transparency when reviewers feel comfortable sharing their identity. In this case, the decision may be seen as an act of professional integrity and confidence in open scientific debate, particularly given the potential reputational asymmetries involved in critiquing work by more senior scholars.
David Reinstein: I think I’m more optimistic than Tabaré about the potential of meta-analysis. I’m deeply convinced that there are large gains from trying to systematically combine evidence across papers, and even (carefully) across approaches and outcomes. Yes, there are deep methodological differences over the best approaches. But I believe that appropriate meta-analysis will yield more reliable understanding than ad hoc approaches like ‘picking a single best study’ or ‘giving one’s intuitive impressions based on reading’. Meta-analysis could be made more reliable through robustness checking, estimating bounds on the pooled effect under a wide set of reasonable analytic choices, and providing data and dashboards for multiverse analysis, replication, and extensions.
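As a rough illustration of what such robustness checking might look like, here is a minimal Python sketch (hypothetical study-level data and analytic choices, not the paper's actual pipeline) that re-runs a DerSimonian-Laird random-effects pooling under a small grid of reasonable inclusion rules, a toy version of a multiverse analysis:

```python
import numpy as np
from itertools import product

def dl_random_effects(y, v):
    """DerSimonian-Laird random-effects pooled estimate.

    y: per-study effect sizes; v: their sampling variances.
    Returns (pooled estimate, standard error, tau^2).
    """
    y, v = np.asarray(y, float), np.asarray(v, float)
    w = 1 / v
    mu_fe = (w * y).sum() / w.sum()                # fixed-effect mean
    q_stat = (w * (y - mu_fe) ** 2).sum()          # Cochran's Q
    c = w.sum() - (w ** 2).sum() / w.sum()
    tau2 = max(0.0, (q_stat - (len(y) - 1)) / c)   # between-study variance
    w_re = 1 / (v + tau2)
    mu = (w_re * y).sum() / w_re.sum()
    return mu, (1 / w_re.sum()) ** 0.5, tau2

# Hypothetical studies: (effect, variance, sample size, risk of bias).
studies = [(0.30, 0.020, 40, "high"), (0.05, 0.010, 200, "low"),
           (0.12, 0.015, 90, "low"), (0.45, 0.050, 25, "high"),
           (-0.02, 0.008, 350, "low")]

# A toy "multiverse": vary two inclusion rules and watch the estimate move.
for min_n, drop_high_rob in product([0, 50], [False, True]):
    kept = [(y, v) for y, v, n, rob in studies
            if n >= min_n and not (drop_high_rob and rob == "high")]
    mu, se, tau2 = dl_random_effects(*zip(*kept))
    print(f"min_n={min_n:>3}, drop high-RoB={drop_high_rob}: "
          f"k={len(kept)}, mu={mu:+.3f} (SE {se:.3f})")
```

In a full multiverse analysis, the grid would cover many more defensible choices (effect size metric, tau-squared estimator, outcome selection rules), and reporting the resulting distribution of estimates rather than a single headline number.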
I believe a key obstacle to this careful, patient, open work is the current system of academic incentives and tools, in which traditional journal publication is treated as a career outcome and an ‘end state’. The author’s response, “But at some point, you declare a paper ‘done’ and submit it”, exemplifies this challenge.
The Unjournal aims to build and facilitate a better system. The author’s response offers some thoughts on how others might follow this work up: “If anyone reading this wants to run any further extensions, and you have any questions about how to go about it, my email is in the paper and I’d be glad to hear from you” ... “please conduct the analysis, we’d love to read it”. We’d like to help facilitate and encourage such work.
Why We Chose This Paper
This paper was selected in part due to the high level of attention it has received in communities focused on animal welfare and effective altruism.
David Reinstein: See especially this post on the EA Forum. EA organizations have largely moved away from the sorts of persuasion-focused behavioral interventions covered in this paper, in favor of other approaches to improving farmed animal welfare (e.g., see Open Philanthropy’s grants in this area here).
It addresses a consequential and policy-relevant question—how interventions may reduce meat consumption—through a structured synthesis of existing literature. Although meta-analyses are sometimes treated as endpoints in a research chain, we viewed this project as an opportunity to evaluate the strength of the synthesis itself, particularly as such studies often play an outsized role in shaping public discourse and policy. The work also aligns with The Unjournal’s mission to support rigorous and transparent research on important social topics. We hoped the evaluation process would clarify the strengths and limitations of this particular synthesis while also contributing to methodological reflection in the broader field.
David Reinstein: Meta-analysis seems particularly challenging in this context, even conceptually. It is not clear to me what the “average effect” across studies and interventions means here, or what it is useful for. I am more familiar with study-specific random effects in meta-analyses where each study fundamentally considers the same intervention and the same outcome. Jané requests a focus on “Prediction intervals [to] communicate the range of likely effects in future studies.” The author’s response notes, “the paper advances a distinct vision of what meta-analysis is for, and how to conduct it, that bears further motivation and explanation. I’ll start to do that here, but I think it calls for a separate methods note/paper.” I think this merits further discussion.
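For readers less familiar with the distinction Jané is drawing: under the standard random-effects model, each study has its own true effect, and a prediction interval describes where the true effect of a new study is likely to fall, which is typically much wider than the confidence interval for the average effect when heterogeneity is large. A standard formulation (following Higgins, Thompson, and Spiegelhalter, 2009) is:

```latex
% Random-effects model: study i's observed effect y_i varies around its
% own true effect theta_i, which varies across studies with variance tau^2.
y_i = \theta_i + \varepsilon_i, \qquad
\theta_i \sim N(\mu, \tau^2), \qquad
\varepsilon_i \sim N(0, v_i)

% Approximate 95% prediction interval for the true effect in a new study
% (k = number of studies); wider than the CI for \hat{\mu} when \tau^2 > 0:
\hat{\mu} \pm t^{0.975}_{k-2}\,\sqrt{\hat{\tau}^2 + \widehat{SE}(\hat{\mu})^2}
```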
Evaluation Process
Two evaluators were selected through our standard process, with a focus on research experience, methodological competence, and domain relevance. We prioritized methodological expertise over domain expertise (in strategies to reduce meat consumption). Each evaluator considered both the paper’s methodological innovations and its departures from standard practice. The feedback was shared with the authors, who then provided a comprehensive public response. The author’s reply elaborates on their rationale for several non-standard choices, emphasizes the challenges of conducting such work without substantial resources, and highlights the considerable effort invested in transparency and reproducibility.
The exchange between authors and evaluators is collegial but pointed, and may serve as a useful case study of both the potential and the limits of current synthesis practices.
Finally, we note that it was difficult to find evaluators with the right expertise to evaluate this work — particularly at the intersection of this topic (animal product consumption and behavioral interventions) and these methods. We ended up focusing on methodological expertise (although E1 has some adjacent links to the topic). Indeed, this may be a relatively neglected field, and building and funding more expertise may be highly valuable.
References
1. Green, S. A., Smith, A. B., & Mathur, M. B. (2025). Meaningfully reducing consumption of meat and animal products is an unsolved problem: A meta-analysis. *Appetite*, 216, 108233. https://doi.org/10.1016/j.appet.2025.108233