Evaluation Summary and Metrics: "Long Term Cost-Effectiveness of Resilient Foods for Global Catastrophes Compared to Artificial General Intelligence Safety"
Evaluation 3 of "Long term cost-effectiveness of resilient foods for global catastrophes compared to artificial general intelligence safety"
This is an evaluation of Denkenberger et al. (2022).
See HERE for a more detailed breakdown of the evaluators’ ratings and predictions.
I am a political scientist specializing in science policy (i.e., how expertise and knowledge production influence the policymaking process and vice versa), with a focus on “decision making under conditions of uncertainty,” R&D prioritization, and the governance of systemic and catastrophic risk. With respect to the various categories of expertise highlighted by the authors, I can reasonably be considered a “policy analyst.”
Potential conflict of interest/source of bias: one of the authors (Dr. Anders Sandberg) is a friend and former colleague. He was a member of my PhD dissertation committee.
A quick further note on the potential conflict of interest/bias of the authors (three of the four are associated with ALLFED, which, as the authors note, could stand to benefit financially from the main implication of their analysis - that significant funding be allocated to resilient food research in the short term). In my opinion, this type of “self-advocacy” is commonplace and, to some extent, unavoidable. Interest and curiosity in a particular topic (and by extension, expertise) motivate deep analysis of that topic. It is unlikely that this kind of deep analysis (which may or may not yield these sorts of “self-confirming” conclusions/recommendations) would ever be carried out by individuals who are not experts on - and often financially implicated in - the topic. I think their flagging of the potential conflict of interest at the end of the paper is sufficient - and exercises like this Unjournal review further increase transparency and invite critical examinations of their findings and “positionality.”
I am unqualified to provide a meaningful evaluation of several of the issues “flagged” by the authors and editorial team, including: the integration of the sub-models, sensitivity analysis, and alternative approaches to the structure of their Monte Carlo analysis. Therefore, I will focus on several other dimensions of the paper.
This paper has two core goals: (1) to explore the value and limitations of relative long-term cost-effectiveness analysis as a prioritization tool for disaster risk mitigation measures in order to improve decision making and (2) to use this prioritization tool to determine whether resilient foods are more cost-effective than AGI safety (which would make resilient food the highest-priority area of GCR/X-risk mitigation research). As I am not qualified to directly weigh in on the extent to which the authors achieved either goal, I will reflect on the “worthiness” of these goals within the broader context of work going on in the fields of X-risk/GCR, long-termism, science policy, and public policy - and the extent to which the authors’ findings are effectively communicated to these audiences.
Within this broader context, I believe that these are indeed worthy (and urgent) objectives. The effective prioritization of scarce resources to the myriad potential R&D projects that could (1) reduce key uncertainties, (2) improve political decision-making, and (3) provide solutions that decrease the impact and/or likelihood of civilization-ending risk events is a massive and urgent research challenge. Governments and granting agencies are desperate for rigorous, evidence-based guidance on how to allocate finite funding across candidate projects. Such prioritization is impeded by uncertainty about the potential benefits of various R&D activities (partially resulting from uncertainty about the likelihood and magnitude of the risk event itself - but also from uncertainty about the potential uncertainty-reducing and harm/likelihood-reducing “power” of the R&D). Therefore, the authors’ cost-effectiveness model, which attempts to decrease uncertainty about the potential uncertainty-reducing and harm/likelihood-reducing “power” of resilient food R&D and compare it to R&D on AGI safety, is an important contribution. It combines and applies a number of existing analytical tools in a novel way and proposes a tool for quantifying the relative value of (deeply uncertain) R&D projects competing for scarce resources.
Overall, the authors are cautious and vigilant in qualifying their claims - which is essential when conducting analysis that relies on the quasi-quantitative aggregation of the (inter)subjective beliefs of experts and combines several models (each with its own assumptions).
I largely agree with the authors’ dismissal of theoretical/epistemic uncertainty (not that they dismiss its importance or relevance - simply that they believe there is essentially nothing that can be done about it in their analysis). Their suggestion that “results should be interpreted in an epistemically reserved manner” (essentially a plea for intellectual humility) should be a footnote in every scholarly publication - particularly those addressing the far future, X-risk, and value estimations of R&D.
However, the authors could have bolstered this section of the paper by identifying some potential sources of epistemic uncertainty and suggesting some pathways for further research that might reduce it. I recognize that they are both referring to acknowledged epistemic uncertainties - which may or may not be reducible - as well as unknown epistemic uncertainties (i.e., ignorance - or what they refer to as “cluelessness”). It would have been useful to see a brief discussion of some of these acknowledged epistemic uncertainties (e.g., the impact of resilient foods on public health, immunology, and disease resistance) to emphasize that some epistemic uncertainty could be reduced by exactly the kind of resilient food R&D they are advocating for.
When effectively communicating uncertainties associated with research findings to multiple audiences, there is a fundamental tradeoff between the rigour demanded by other experts and the digestibility/usability demanded by decision makers and lay audiences. For example, this tradeoff has been well-documented in the literature on the IPCC’s uncertainty communication framework (e.g., Janzwood & Millar 2020). What fellow-modelers/analysts want/need is usually different from what policymakers want/need. The way that model outputs are communicated in this article (e.g. 84% confidence that the 100 millionth dollar is more cost-effective) leans towards rigour and away from digestibility/usability. A typical policymaker who is unfamiliar with the modeling tools used in this analysis may assume that an 84% probability value was derived from historical frequencies/trials in some sort of experiment - or that it simply reflects an intersubjective assessment of the evidence by the authors of the article. Since the actual story for how this value was calculated is rather complex (it emerges from a model derived from the aggregation of the outputs of two sub-models, which both aggregate various types of expert opinions and other forms of data) - it might be more useful to communicate the final output qualitatively.
This strategy has been used by the IPCC with varying levels of success. These qualitative uncertainty terms can align with probability intervals. For example, 80-90% confidence could be communicated as “high confidence” or “very confident,” and >90% could be communicated as “extremely confident.” There are all sorts of interpretation issues associated with qualitative uncertainty scales - and some scales are certainly more effective than others (again, see Janzwood & Millar 2020) - but it is often useful to communicate findings in two “parallel tracks”: one for experts and one for a more lay/policy-focused audience.
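To make the "parallel tracks" idea concrete, here is a minimal sketch in Python of translating a model's probabilistic output into a qualitative label. The function name and the exact thresholds are hypothetical - they loosely follow the bands mentioned above (80-90% as "very confident", >90% as "extremely confident") rather than any scale the authors or the IPCC formally endorse, and a real scale would need to be calibrated and tested with its intended audience:

```python
def qualitative_confidence(p: float) -> str:
    """Map a probability in [0, 1] to an illustrative qualitative label.

    Thresholds are hypothetical, loosely based on the bands sketched
    in the text; they are not an official IPCC or author-endorsed scale.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if p > 0.90:
        return "extremely confident"
    if p >= 0.80:
        return "very confident"
    if p >= 0.66:
        return "moderately confident"
    if p >= 0.33:
        return "about as likely as not"
    return "unlikely"

# The paper's headline output - 84% confidence that the 100 millionth
# dollar to resilient foods is more cost-effective - would read as:
print(qualitative_confidence(0.84))
```

On this illustrative scale, the 84% figure would be communicated to policymakers as "very confident," while the underlying Monte Carlo distribution remains available for the expert track.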
Recognizing the hard constraints of word counts - and that a broader discussion of global priorities and resource allocation was likely “out of scope” - this article could be strengthened (or perhaps simply expanded upon in future work) by such a discussion. The critical piece of context is the scarcity of resources and attention within the institutions making funding decisions about civilization-saving R&D (governments, granting organizations, private foundations, etc.). There are two dimensions worth discussing here. First, R&D activities addressing risks that are generally considered low-probability/high-impact with relatively long timelines (although I don’t think the collapse of global agriculture would qualify as low-probability - nor is the likely timeline terribly long - but those are my priors) are competing for scarce funding/attention against R&D activities addressing lower-impact risks believed to be shorter-term and more probable (e.g., climate change, the next pandemic, etc.). I think most risk analysts - even hardcore “long-termists” - would agree that an ideal “R&D funding portfolio” should be somewhat diversified across these categories of risk. It is important to acknowledge the complexity associated with resource allocation - not just between X-risks but between X-risks and other risks.
Second, there is the issue of resource scarcity itself. On the one hand, there are many “high value” candidate R&D projects addressing various risks that societies can invest in - but only a finite amount of funding and attention to allocate between them. So, these organizations must make triage decisions based on some criteria. On the other hand, there are also many “low” or even “negative value” R&D activities being funded by these organizations - in addition to other poor investments - that provide little social benefit or actively increase the likelihood/magnitude of various risks. I believe it is important in these sorts of discussions about R&D prioritization and resource scarcity to point out that the resource pool need not be this shallow - and to identify some of the most egregious funding inefficiencies (e.g., around fossil fuel infrastructure expansion). It should go without saying - but ideally, we could properly resource both resilient food and AGI safety research.