
Evaluation 2 of "Universal Basic Income: Short-Term Results from a Long-Term Experiment in Kenya"

Evaluation of "Universal Basic Income: Short-Term Results from a Long-Term Experiment in Kenya" for The Unjournal. Evaluator: Anonymous

Published on Jun 11, 2024

Abstract

[Strengths]: + Rigorous large-scale RCT design + Simple yet effective statistical approach + (Mostly) outstanding statistical analysis/interpretation & clear communication + Insightful analysis of cash-transfer effects in development context
[Limitations]: - Limited relevance for generalized UBI in macro equilibrium - Potential biases in self-reported data, which are insufficiently discussed - Additional socio-economic context would enhance understanding - Some statistical results and interpretations from the paper remain unclear

Summary Measures

We asked evaluators to give some overall assessments, in addition to ratings across a range of criteria. See the evaluation summary “metrics” for a more detailed breakdown of this. See these ratings in the context of all Unjournal ratings, with some analysis, in our data presentation here.1

Measure | Rating | 90% Credible Interval
Overall assessment | 85/100 | 65 - 95
Journal rank tier, normative rating | 4.0/5 | 2.5 - 5.0

Overall assessment: We asked evaluators to rank this paper “heuristically” as a percentile “relative to all serious research in the same area that you have encountered in the last three years.” We requested they “consider all aspects of quality, credibility, importance to knowledge production, and importance to practice.”

Journal rank tier, normative rating (0-5): “On a ‘scale of journals’, what ‘quality of journal’ should this be published in? (See ranking tiers discussed here)” Note: 0 = lowest/none, 5 = highest/best.

See here for the full evaluator guidelines, including further explanation of the requested ratings.

Written report2

A. On this evaluation

My review is informed by a background in economics and policy, UBI, and the fundamentals of statistics and econometrics. An ideal complement could be a separate evaluation focusing more specifically on how the paper fits into the current development economics research landscape.

Given how the paper is presented, I evaluate it with regard to three implicit goals […]:

(i) informing us about economic effects from UBI in general and in a development context in particular,

(ii) informing us about [the] effects [of] and the ideal design of cash transfer programs, and

(iii) advancing [the state of the art] of RCT execution & evaluation.

According to the abstract, the paper indeed examines “What would be the consequences of a long-term commitment to provide everyone enough money to meet their basic needs?”, and the preregistration registry states the goal to also inform UBI debates in wealthy countries, including as to whether economic security motivates their citizens to work more or less.

The various limitations highlighted with regard to the generalizability of the study results to potential UBI settings more broadly should not be misinterpreted as a fundamental critique of the core analysis in the paper itself; some of the limitations could be difficult to overcome within a cash-transfer RCT in a development context.

B. Subjective summary of the research

Context

  • Interest in universal basic income (UBI) has grown over the past decades, with growing inequality within most countries worldwide and looming automation plausibly contributing to the interest in the topic.

  • Given this interest, it seems natural to design a new RCT on cash transfers so as to elicit insights on the viability of a UBI and, as the authors argue, with a focus on (long-term) effects on economic development and (market) income, given the difficulty of continuously financing a UBI in the developing world.

  • GiveDirectly financed the study.

Study

  • RCT on poor villages in Kenya, to observe economic responses from cash transfer schemes of three types (“arms”): (i) one-off lump-sum, (ii) monthly for two years, and (iii) monthly for 12 years. Each arm delivered a similar cumulative income transfer within the first two years, “sufficient to meet basic needs”.

  • Early-stage evaluation two years into the program, finding a multitude of economic effects from the cash-transfers. There was “substantial economic expansion—more enterprises, higher revenues, costs, and net revenues—and structural shifts, with the expansion concentrated in the non-agricultural sector. Labor supply did not change overall but shifted out of wage employment and towards self-employment”. The expansion was concentrated “particularly in retail—indeed much of the economic story appears to have been the expansion of supply chains to meet increased local demand for goods manufactured elsewhere”. The value of land increased a lot, and “household well-being improved on some common measures (e.g. food consumption, depression) but not others (e.g. children’s anthropometrics and schooling)”.

  • A few differences between the arms’ effects may surprise. The authors design a model that can theoretically explain these.

  • The authors strike a rather positive tone with regard to implications for UBI; I understand this is notably due to the absence of a negative effect on total working hours.

C. Strengths

  • Well-conducted large-scale randomized controlled trial (RCT) on relevant questions of increasing importance and interest.

  • Scale and spatial setup of the RCT informed by careful examination of the expected statistical power of the analysis and of the expected spillover effects between different groups.

  • Simple, yet effective statistical approach, well explained.

  • In many instances, exceptionally sober statistical interpretation of results: Regularly discussing the confidence interval (CI) ranges as opposed to only statistical significance (or absence thereof); focusing not just on statistical significance but also on economic significance of results.

  • Preregistration of study along with to-be-analyzed core variables.

  • Effective three-arm setup with different cash administration durations, enabling deeper insights into the subtle psychological and financial mechanisms at play, and into short-term and potentially long-term economic effects.

  • Treatment subjects are individual members of the households in the treatment villages. This seems a useful innovation relative to the more usual approach of treating households, whose composition can change over the study period.

  • Assesses a broad range of relevant outcomes, with outcome variables including, for example, psychological, financial/economic, and health results.

  • All in all, careful implementation from start to finish means not only reliable results are gathered, but the study can also be an inspiration for future work in similar and related domains.

D. Main claims and assessment

I address six major claim sets.

1. Claims around economic expansion, structural shifts, and labor supply: Communities receiving UBI showed significant economic expansion, characterized by increased business enterprises, revenues, and shifts from wage employment to self-employment, particularly in the non-agricultural sectors. Total labor supply did not decrease, indicating that UBI does not promote "laziness" but reallocates labor towards more entrepreneurial activities.

These effects are factually observed and rather robust (see Table 4). One open question is how far they may generalize. Insofar as the study will be seen as supporting the idea that “UBI works” in [the] most general sense, multiple limitations are noteworthy:

1.a) The scheme did not first fiscally collect revenues from the local, treated population, for redistribution in form of an unconditional cash transfer, as is usually thought of for a UBI scheme. Instead, the money distributed was ‘manna from heaven’ for the treated population, i.e., an inflow of external cash (from GiveDirectly). A series of macroeconomic equilibrium effects expectable of a more standard UBI can therefore not be observed.

1.b) Baseline wealth and size of transfer: While substantial for the concerned households, with roughly 0.30 USD/day per person in the household,3 the transfers received seem small even with PPP-adjustment (0.76 USD PPP per person in the household), when compared to the requirements of a materially more or less decent life, even in the villages. The households were quite poor in the baseline (85% experienced hunger, according to Table D.1). It can therefore not be known to which degree the observation that the “UBI” does not weaken labor incentives generalizes to other contexts. Arguably, it would seem natural that less poor populations, receiving a higher UBI, might more readily shift, after their broader basic needs [were] met, to unpaid activities, reducing their wage labor supply.

1.c) Desirability of shifts: A related question that the paper does not discuss in much detail is how ‘positive’ the observed economic changes really are. Naturally, the transfers have a first-order effect of allowing the households to consume more, including, for example, more proteins. The broader “economic expansion” and “structural shifts” away from agriculture have been overwhelmingly concentrated in the retail sector. It therefore seems that, first and foremost, more goods produced elsewhere have been consumed. This may contrast with hopes that much of the new resources would be invested [in] local production of new types of goods and services. Changes in the latter direction would have partly occurred according to point estimates, though they were not statistically significant, and seem to have overall been mostly quite small relative to the observed retail expansion (Table E.9).

2. Claims from comparison between transfer types: One of the most noteworthy differences between the three arms was that the lump-sum & long-term arms had a much larger economic expansion than the short-term arm, while the latter instead had a larger expansion in consumption. For mental health, the main difference was between the lump-sum and the two regular payment arms: mental health improved significantly less in the lump-sum arm than in the other arms.

These conclusions seem statistically robust (Tables 2 and 9). The question about the underlying reasons is interesting. Without presenting it as the only potential explanation, the authors show how the smaller economic expansion in the short-term arm can be obtained in a model with savings and borrowing difficulties where “investment projects require both a large up-front capital outlay and ongoing flow investment to turn a profit”. This is a relatively complex mechanism. I wonder whether a much simpler, psychological explanation could be behind the results: Receiving a larger one-off lump sum might spur one to think seriously about how to invest it productively; receiving regular payments for the longer term might also make one wonder how best to change one’s life. Receiving, instead, the limited extra income regularly for a known shorter period of two years might do less of either of these two things, and may therefore mainly lead to increased consumption in the short-term arm.

3. Claims about psychological and consumption effects: UBI increased general well-being, reduced depression, and improved nutrition through greater food variety and protein consumption. It also increased education spending across the three arms. Impacts on health spending were mixed across transfer schemes.

These effects are readily observed in the data (Table 5). They seem mostly unsurprising, with the main question being to which degree some of the more unexpected differences across arms (e.g. health) might be due to statistical noise. After all, with the large number of parameters estimated in the study, we’d expect potentially several […] to become statistically significant, potentially even with the wrong sign, merely due to statistical coincidence.
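The multiple-comparisons concern can be made concrete with a minimal sketch. The number of estimated coefficients (`n_tests = 100`) and the 5% threshold are illustrative assumptions, not figures from the paper, and the calculation assumes independent tests with no true effects, which is only a rough approximation because outcomes within one study are correlated.

```python
def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Expected count of nominally significant estimates if every true effect is zero."""
    return n_tests * alpha


def prob_at_least_one(n_tests: int, alpha: float) -> float:
    """Probability of one or more false positives, assuming independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_tests: int, alpha: float) -> float:
    """Per-test threshold that keeps the family-wise error rate at or below alpha."""
    return alpha / n_tests


if __name__ == "__main__":
    n_tests = 100  # hypothetical number of coefficients, NOT taken from the paper
    alpha = 0.05   # conventional two-sided significance level (assumption)
    print("expected false positives:", expected_false_positives(n_tests, alpha))
    print("P(at least one):", round(prob_at_least_one(n_tests, alpha), 4))
    print("Bonferroni per-test threshold:", bonferroni_threshold(n_tests, alpha))
```

With a two-sided test and no true effect, a false positive is equally likely to fall on either side of zero, which is why some significant estimates with an unexpected sign would not be surprising on their own.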

4. Claims about price impacts: “Overall we do not reject the null that consumer prices [in nearby markets] were unaffected, in which case the revenue increases discussed above come entirely from increased sales, albeit with fairly wide confidence intervals”.

On one hand, Table E.1 confirms that many price effect estimates are insignificant, and the authors’ explanation as to the reason why the estimates may have a large confidence interval seems compelling.4 On the other hand, among the “Study Shares” estimates in Table E.1,5 almost all the nine “Study Shares” effect estimates (three arms times three goods sets: Overall, Agricultural, and Non-Agricultural items) have a positive sign, three of them significant, and most with at least around one standard deviation in the positive direction, and only one with an economically meaningful negative price effect point estimate (albeit only half a standard deviation in size). The table provides less statistical information than some of the other tables (e.g. no p-values for joint tests), so I could not follow in detail how exactly the authors concluded [that there were] no significant price effects in the data; overall, from the data provided, it was not what I expected. In conclusion, from the table data, it should be noted at least that economically highly positive price effect point estimates were found, some of them individually significant and others often roughly one standard deviation large.
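One way to gauge how unusual the pattern of signs is would be a simple sign test, sketched below. Reading “almost all” as 8 of the 9 estimates is an assumption made purely for illustration, not a count taken from Table E.1. The test also treats the nine estimates as independent, which they are not (the arms share a control group and the goods sets overlap), so the resulting probability overstates the strength of the evidence.

```python
from math import comb


def sign_test_upper_tail(n_positive: int, n_total: int) -> float:
    """P(X >= n_positive) for X ~ Binomial(n_total, 0.5).

    Under the null of no price effect, each estimate is assumed equally
    likely to be positive or negative, independently of the others.
    """
    favourable = sum(comb(n_total, k) for k in range(n_positive, n_total + 1))
    return favourable / 2 ** n_total


if __name__ == "__main__":
    n_total = 9     # three arms times three goods sets, as described above
    n_positive = 8  # hypothetical reading of "almost all"; not from the table
    print("one-sided sign-test probability:", sign_test_upper_tail(n_positive, n_total))
```

A joint test reported by the authors, using the full covariance of the estimates, would be the more appropriate tool; the sketch only shows why the sign pattern invites the question.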

Two further points with regard to the price effect seem worth keeping in mind:

1. Should the price effects indeed be economically significant, one could imagine them to be mainly a temporary phenomenon. Many of the goods surveyed (sandals, fertilizer, soap, …) could arguably be supplied to the market with constant or even increasing returns to scale, once supply-chain and retail facilities adjust to the new demand level.

2. For a macroeconomically more significant UBI (nationwide or even larger regions), demand-induced price increases could, (i) in a similar ‘manna from heaven’-setup, be expected to be substantial rather independently of whether […] major price effects were found in the present study. [This is because] demand effects for many of the inter-regionally produced goods may only start to play a role once quantities demanded increase on a country-wide basis. And (ii) in a more typical UBI scheme, where regional taxes could be used to finance the regional UBI, one could potentially expect there to be limited first-order impact on average prices (as the taxes to be paid neutralize the UBI payments received by households on average). […] Again [this would be] relatively independent of whether the prices in the present study turn out to be meaningfully positive or not.
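The contrast between the two financing setups in point 2 can be sketched with a toy calculation. The incomes, the UBI amount, and the flat proportional tax are all hypothetical assumptions chosen for illustration; nothing here comes from the paper.

```python
def net_transfers_tax_financed(incomes, ubi):
    """Net transfer per household when a flat income tax exactly funds the UBI."""
    mean_income = sum(incomes) / len(incomes)
    tax_rate = ubi / mean_income  # budget balance: tax_rate * mean_income == ubi
    return [ubi - tax_rate * income for income in incomes]


def net_transfers_external(incomes, ubi):
    """Net transfer per household when the UBI is funded from outside the region."""
    return [ubi for _ in incomes]


if __name__ == "__main__":
    incomes = [10.0, 20.0, 30.0, 60.0]  # hypothetical household incomes
    ubi = 6.0                           # hypothetical transfer per household
    taxed = net_transfers_tax_financed(incomes, ubi)
    external = net_transfers_external(incomes, ubi)
    print("tax-financed, mean net transfer:", sum(taxed) / len(taxed))
    print("externally financed, mean net transfer:", sum(external) / len(external))
```

In the tax-financed case the average net transfer is zero, so aggregate purchasing power in the region is unchanged to first order and only its distribution shifts; in the externally financed case aggregate demand rises by the full transfer.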

The authors refer to a forthcoming study apparently showing a tiny price effect of cash transfers […] I cannot meaningfully comment on [this] without reading that study in detail; the results could crucially depend on time and geographic dimensions of the treatment and its measured effects.

5. Claims on land value: Land values in treatment villages increased strongly, without correspondingly large investments in the enhancement of the value of the land.

This claim appears to be well-supported by results from different surveys, see Table 6. The interpretation [makes intuitive sense]: The improved living conditions and/or the households’ increased purchasing power (and [their] search for ways to store wealth combined with [an] inelastic supply of land) may have contributed to the land appreciation […]

6. Claims on alcohol: There is evidence that alcohol problems have decreased.

This is factually true, although the evidence can be called into question by potential biases in answers to particularly sensitive questions and by harder, albeit statistically less significant, data pointing towards a different interpretation:

The alcohol data is reported in Table E.10. Alcohol is perceived as a large problem (baseline mean 3.37 on a scale of 1-5 of perception of alcohol as a problem within the community). Alcohol as a problem in the treated villages is reported to have decreased significantly, most notably with a very substantial reduction of reported daily drinkers. The point estimates on individual self-reports of drinking (columns 1 and 2) are all statistically insignificant and seem arbitrary in their direction.6

The authors do not discuss the last data column, about alcohol sales. Coefficient point estimates point towards a very strong increase in sales for each of the three UBI arms, two being significant at the 10% level, with net revenues more than doubling on average across the three treatment arms. The unit of the values is not indicated. On the one hand, this evidence is statistically weaker than the villagers’ reports about alcohol in their villages. On the other hand, alcohol is a particularly sensitive topic as the authors point out, so without further evidence, it would certainly seem plausible that villager reports about alcohol problems in their village are not perfectly objective. This may be particularly true if very poor respondents know they are surveyed to assess the effects on their village of a program that provides for their livelihood.

It would therefore have been interesting to see whether the authors have alternative explanations for the increased revenue figures, so that one could judge whether the reservations expressed here about the overall conclusions on alcohol can be safely dismissed.

E. Further important limitations

Caveats not addressed as part of the Main claims assessment above.

Bias potential: ‘Program goodwill’ bias, survey completion rates, alcohol results

  • Large parts of the study rely on self-reported data. This raises the question [of the degree to which] participants’ answers may be biased by goodwill towards the program […] [This could be] due both to positive views towards [the program] and, potentially, to a strategic hope, however faint, that positive answers could increase the probability of receiving future benefits from the program or from similar programs. There are two observations in the data presented that I interpret as supporting this hypothesis:

    • The follow-up survey response rate (Table D.2) is significantly and substantially higher in each of the three arms than in the control, with roughly one-third fewer non-respondents in the arms than in the control.7 It would not seem a priori implausible that if the program motivates people to participate in the survey, they may also have some motivation to report positively about its effects. In a similar vein, estimated treatment effects could be affected if control households, informed about the treatment in nearby villages and knowing they were left out, answered surveys with biases of some sort. On the one hand, this is speculative; the authors do not discuss the topic. On the other hand, at least intuitively, it would seem rather natural for such types of biases to arise in the given setting.

    • Alcohol (Table E.10): Villagers report a reduction of alcohol consumption problems in their villages (Columns 3-4). At the same time, point estimates indicate large increases in sales in all three arms (Column 5), although these are significant only at the 10% level in two of the three arms (see comment on alcohol above). Although statistical noise obviously cannot be excluded, a preference for reporting a positive picture of the program’s effects could readily explain these observations, particularly in this domain, where alcohol is known to be a large problem.

Other NGOs

  • The authors report a significant effect on the number of other NGOs active in the treated villages, although it is not obvious (to me) how the corresponding data table is consistent with the authors’ exact text.8

  • It would seem very useful to understand more about the activities of these other NGOs, and how the GD cash-transfer program affects their presence, to get a better understanding of how/whether study results may be driven by location-specific interaction effects that cannot be expected to generalize to different settings.

Information of participants/framing

  • The early-stage effects of a promised 12-year intervention may depend a lot on how much confidence participants have in the long-term nature of the project. This in turn depends on how the trial setup is framed/communicated. There is no information on such implementation details.

F. Minor limitations

  • The authors remove outliers (top percentile) from the data without making clear how this threshold was chosen.

  • Original datasets and code used do not seem to be made accessible.

G. Final note on UBI

Extending and summarizing reasons for limited generalizability of insights on the economic viability of typical UBI concepts, mostly due to natural limitations of an exogenously financed cash-transfer RCT:

  1. Anticipated limited program duration (even of the longest arm).

  2. Limited lifetime exposure to the scheme (people not having grown up/socialized under a UBI scheme).

  3. Lack of public finance effects (requirement to absorb large domestic resources for UBI financing).

  4. Lack of macroeconomic scale (supply) effects when UBI is administered on a national or global scale.

  5. Further, with regard to generalizability to wealthier regions/higher UBIs9: here [this is] a specific development context, with cash amounts not nearly sufficient to ensure what one might arguably consider a materially comfortable, healthy, and safe life.

  6. Finally, with regard to a potential future with UBI and jobs automated away by AI: the present study does not feature a scarcity of meaningful remunerated jobs, nor, e.g., the associated potential challenges to traditional sources of self-worth.

Evaluator details

  1. How long have you been in this field?

    • In the broader fields of economic policy, statistics, etc.: 10 years.

  2. How many proposals and papers have you evaluated?

    • 20
