Summary, metrics and ratings, and Manager's comments on Evaluation of “Artificial Intelligence and Economic Growth” by Aghion et al.
Philippe Aghion, Benjamin F. Jones, and Charles I. Jones, “Artificial Intelligence and Economic Growth,” in The Economics of Artificial Intelligence: An Agenda, ed. Agrawal, Gans, and Goldfarb (2019). Originally published as NBER Working Paper 23928.
We organized two evaluations of this paper, by (1) Seth Benzell and (2) Philip Trammell. The authors also responded. To read the evaluations and the response, click the links at the bottom.
We are grateful to the authors of this paper for agreeing to participate in and engage with The Unjournal’s evaluation, and for following through. (Although this is an NBER working paper, it was selected before we began the “Unjournal Direct” track.)
In our current phase, The Unjournal is mainly targeting empirical papers (and papers with quantitative simulations, impact evaluations, direct policy recommendations, etc.). In contrast, this paper would probably be considered ‘applied macroeconomic/growth theory’. Nonetheless, we saw this work as particularly important and influential for the reasons mentioned here: it considers tradeoffs between positive and negative consequences of AI, it appears in ‘economics of effective altruism and longtermism’ syllabi, and it has nearly 500 citations.
We are also grateful for the extremely diligent work of the evaluators. My impression (from my own experience, from discussions, and given the incentives we have in place) is that referees and colleagues rarely actually read and check the math and proofs in their peers’ papers. Here, Philip Trammell did so and spotted an error in the proof of one of the central results of the paper (the ‘singularity’ in Example 3). Thankfully, he was able to communicate with the authors and work out a corrected proof of the same result (see “Growth given Cobb-Douglas Automation” at philiptrammell.com, currently linked here).
The authors have acknowledged this error (and a few smaller bugs), confirmed the revised proof, and link a marked-up version on their page. This is ‘self-correcting research’, and it’s great!
Even though the same result was preserved, I believe this provides a valuable service.
Readers of the paper who saw the incorrect proof (particularly students) might be deeply confused. They might think: ‘Can I trust this paper’s other statements?’ ‘Am I deeply misunderstanding something here? Am I not suited for this work?’ Personally, this happened to me a lot in graduate school; at least some of the time it may have been because of errors and typos in the papers themselves.
I suspect many math-driven papers also contain flaws that are never spotted, and these flaws may sometimes affect the substantive results (unlike in the present case).
Again, I’m grateful to the present authors for being willing to put their work through this public checking, and acknowledging and correcting the errors. I now have more confidence that the paper’s results are valid, and that the authors have confidence in their work. This makes their research output more credible overall, and it sets a great example for the field.
Evaluators were asked to follow the general guidelines available here. For this paper, we did not give specific suggestions on ‘which aspects to evaluate’.
In addition to written evaluations (similar to journal peer review), we ask evaluators to provide quantitative metrics on several aspects of each article. These are put together below.
Each aspect below was rated on a 0-100 scale, with a 90% credible interval (CI)*:

- Advancing knowledge and practice
- Methods: justification, reasonableness, validity, robustness
- Logic & communication
- Open, collaborative, replicable*
- Relevance to global priorities
*Evaluation Manager (David Reinstein): Evaluator 2 wrote “80?” in his rating here; see comment column footnote.
Journal ranking tiers were rated on a 0-5 scale (low to high), with a 90% CI*:

- What ‘quality journal’ do you expect this work will be published in? (0 = lowest/none, 5 = highest/best)
- On a ‘scale of journals’, what ‘quality of journal’ should this be published in?
**Evaluation Manager’s note (David Reinstein): Evaluator 2 indicated “Medium-high” confidence. I am interpreting this as a confidence level of 4 on our scale.