Abstract
We organized two evaluations of the paper: "How Effective Is (More) Money? Randomizing Unconditional Cash Transfer Amounts in the US"[1]. To read these evaluations, please see the links below.
Evaluations
1. Anonymous evaluation 1
2. Anonymous evaluation 2
Overall ratings
We asked evaluators to provide overall assessments as well as ratings for a range of specific criteria.
I. Overall assessment (See footnote)
II. Journal rank tier, normative rating (0-5): On a ‘scale of journals’, what ‘quality of journal’ should this be published in? Note: 0= lowest/none, 5= highest/best.
| | Overall assessment (0-100) | Journal rank tier, normative rating (0-5) |
|---|---|---|
| Evaluator 1 | 80 | 4.2 |
| Evaluator 2 | 85 | 4.5 |
See “Metrics” below for a more detailed breakdown of the evaluators’ ratings across several categories. To see these ratings in the context of all Unjournal ratings, with some analysis, see our data presentation here.
See here for the current full evaluator guidelines, including further explanation of the requested ratings.
Evaluation summaries
Anonymous evaluator 1
The paper presents and interprets the results of a well-powered one-time unconditional cash transfer experiment for low-income people in the US. The cash significantly increased recipient spending, but decreased survey measures of self-assessed wellbeing relative to control. Participant attrition limits the strength of the conclusions we can draw from this high-quality study. Overall, the study modestly shifted my views against one-time transfers and toward other policy tools for reducing poverty and inequality in wealthy societies.
Anonymous evaluator 2
Overall, I find this paper competently executed. I like the inductive approach to theorizing after the findings, the multiple tests for mechanisms, the extensive robustness checks, and the use of transaction-level data to check what participants used the money for. The paper has external validity limitations, and it could be framed better given that the unconditional cash transfers were distributed during the Covid pandemic. [I suggest] several further tests to probe the main mechanism and improve the mediation analysis.
Metrics
Ratings
See here for details on the categories below, and the guidance given to evaluators.
| Rating category | Evaluator 1 (Anonymous): Rating (0-100) | Evaluator 1: 90% CI (0-100)* | Evaluator 2 (Anonymous): Rating (0-100) | Evaluator 2: 90% CI (0-100)* |
|---|---|---|---|---|
| Overall assessment | 80 | (65, 95) | 85 | (80, 90) |
| Advancing knowledge and practice | 70 | (50, 90) | 75 | (70, 80) |
| Methods: Justification, reasonableness, validity, robustness | 90 | (85, 95) | 95 | (90, 100) |
| Logic & communication | 90 | (85, 95) | 95 | (90, 100) |
| Open, collaborative, replicable | 90 | (85, 95) | 85 | (80, 90) |
| Real-world relevance | 80 | (70, 89) | 75 | (70, 80) |
| Relevance to global priorities | 80 | (70, 89) | 75 | (70, 80) |
Journal ranking tiers
See here for more details on these tiers.
| Judgment | Evaluator 1 (Anonymous): Ranking tier (0-5) | Evaluator 1: 90% CI | Evaluator 2 (Anonymous): Ranking tier (0-5) | Evaluator 2: 90% CI |
|---|---|---|---|---|
| On a ‘scale of journals’, what ‘quality of journal’ should this be published in? | 4.2 | (3.5, 4.9) | 4.5 | (4.0, 5.0) |
| What ‘quality journal’ do you expect this work will be published in? | 4.2 | (3.5, 4.9) | 4.0 | (3.5, 4.5) |
We summarize these tiers as:
- 0.0: Marginally respectable/Little to no value
- 1.0: OK/Somewhat valuable
- 2.0: Marginal B-journal/Decent field journal
- 3.0: Top B-journal/Strong field journal
- 4.0: Marginal A-journal/Top field journal
- 5.0: A-journal/Top journal
Evaluation manager’s discussion (Robert Kubinec)
As the evaluation manager for "How Effective Is (More) Money? Randomizing Unconditional Cash Transfer Amounts in the US", I first want to emphasize how remarkable this piece of research is. The authors used enormous treatment sizes by the standards of the field (transfers up to $2,000) and collected a vast array of measurements on their subjects. As a result, they can investigate and rule out a number of mechanisms. In these situations, we might wish as political economists that the results conform to our theoretical priors. In this case, they do not, which is not uncommon with experiments. The growing prominence of field experiments has shown that we are often over-confident in our ability to predict the results of policies based on political economy theories. My hope is that this paper receives the attention it deserves, so that we can build better theories and design better interventions.
However, before I turn to the evaluations, I wanted to provide my own bottom-line assessment from reading the paper and our evaluations of it. Cash transfers are one of the development policies du jour, and are increasingly being tested in developed countries, beyond the developing countries where they largely originated. They also rest on a strong grounding in economic theory, particularly the idea that individuals can optimize spending according to their own preferences better than a social planner with limited information can. So does this paper, with its largely negative or null effects on important outcomes like well-being and health, suggest that we should abandon this type of policy?
I echo the evaluators' comments in suggesting that we should not, at this point, abandon testing these types of policies. Of course, this conclusion runs the risk of protecting beloved theories from falsification. If negative results from an RCT do not stop us from implementing a cash transfer, then what was the point of an RCT? The reason why we should not abandon cash transfers, at least at this stage, is that the RCT, as robust as it was, can only answer certain questions. It answers these questions very well, and we need to think hard about those lessons. But a priori, the RCT did not have a wide enough scope to show definitively that cash transfers will not affect important outcomes like health or financial independence.
The RCT's limitation in terms of time frame is natural, since studies cannot be run forever, but it does raise questions. As evaluator 1 noted, the effects of cash transfers are likely to manifest themselves over the long term. As rigorous as this RCT was, the authors only recorded outcomes up to 15 weeks after treatment. The causal chain from cash transfers to outcomes like well-being or health could be much longer, potentially stretching over years as lifestyles and habits adjust. For example, an increase in income could make it easier for a subject to qualify for education or find a better job, but that is unlikely to occur within 15 weeks. In addition, as big as this cash incentive was, it is still small relative to people's incomes in a wealthy country like the United States. The federal poverty level for a single person is about $15,000, so the maximum payment offered was about 13% of such a person's annual income. A good amount, but not one that was likely to cure all the problems a person might face.
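The back-of-the-envelope comparison above can be checked directly; the dollar figures below are the approximate ones cited in this discussion, not exact program parameters.

```python
# Approximate figures from the discussion above, not exact program data.
max_transfer = 2_000    # largest one-time transfer in the experiment (USD)
poverty_level = 15_000  # approximate US federal poverty level, single person (USD)

# Transfer as a share of annual poverty-line income (~13%).
share = max_transfer / poverty_level
print(f"Maximum transfer as share of annual poverty-line income: {share:.0%}")
```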
Second, as evaluator 2 noted, there is an issue with so-called external validity. This RCT was an “opt-in” program in which people had to apply to receive the funds, so the estimated effect is what is known as the “treatment effect on the treated.” There are lots of reasons why someone would apply for these funds, and those same factors might predispose them to see fewer benefits. Outside of the experimental context, such as in a “real-world” government policy agenda, these cash transfers would presumably be available to a much broader slice of the population. As the authors of the paper note (p. 6), people's expectations that the payment could solve a lot of problems, which could have been heightened by the fact that they sought it out, could also explain the negative effects on reported well-being. The fact that the outcomes are relatively short-term further raises the risk that disappointed expectations drove the results.
It would be difficult, of course, to run the RCT with a more general sample, akin to giving lottery payouts to people who did not participate in the lottery. But, as social scientists, we have to be attuned to these difficult-to-address contextual factors when interpreting evidence. At the same time, our theories should explain what happened with this RCT, and at present we do not have perfect answers. The authors' work in their paper is commendable and illuminating, but arguably we need to run more tests with different populations – along with analyzing observational data – to better understand why and how people reacted the way they did.
My second bottom line is that this RCT does provide evidence that cash transfers are trickier to understand than they at first appear. A universal basic income, for example, might not mitigate as many harms as some expect. Furthermore, the RCT does demonstrate that subjective well-being and other important outcomes probably will not change significantly in the short term following a reasonably sized payout. That much is clear.
However, we stand to benefit by designing better policy programs if we pay attention to the RCT's wealth of information about its subjects' financial activities, perceptions, and even relationships. We also know that the transfers do not seem to have resulted in significant harm, which is an important policy consideration, i.e., the “do no harm” criterion. Given these uncertainties, and the fact that no one appeared to be harmed by the design (besides a possible short-term increase in anxiety), there is ample reason to continue testing these types of policies for poverty alleviation. In the end, learning is the goal, and this RCT did a marvelous job of advancing the field; it will hopefully place in a top journal.
Notes on Conflicts of Interest
One of the authors of this paper (Julian Jamison) is a member of our Field Specialist team. However, he was not part of the decision to prioritize this paper, and we managed this evaluation outside of our normal interface, so he has not seen or taken part in this process.