Evaluation 1 of “Mental Health Therapy as a Core Strategy for Increasing Human Capital: Evidence from Ghana” (Barker et al.) (anonymous evaluator); the paper was published in AER: Insights (2022) as "Cognitive Behavioral..."
Overall assessment
Answer: 75
Confidence: 4 out of 5
Quality scale rating
“On a ‘scale of journals’, what ‘quality of journal’ should this be published in? (Note: 0 = lowest/none, 5 = highest/best)”
Answer: 4
Confidence: 4 out of 5
See here for a more detailed breakdown of the evaluators’ ratings and predictions.
Summary: This paper uses a field experiment to explore the impact of a 12-week CBT program among poor households in rural Ghana. The authors find that the CBT program increases mental and physical well-being, as well as cognitive and socioemotional skills and downstream economic outcomes. There are no heterogeneous treatment effects by baseline mental health. However, a measure of vulnerability to mental distress does predict the impact of CBT on mental and physical health, but not on cognitive and socioemotional skills or downstream economic outcomes. The authors argue that CBT operates via two channels: It alleviates vulnerability to mental distress for those most at-risk, and it also generates greater cognitive bandwidth across the population.
Major comments: This paper addresses an important topic in development economics with clear policy implications; if effective, CBT programs could improve not just mental well-being but also, as this paper suggests, key downstream economic outcomes at a relatively low cost. There is relatively little large-scale, well-identified work on the impacts of CBT in low-income, developing country contexts, and as such additional evidence is quite valuable. The paper is also clearly written and a pleasure to read.
My more substantial comments primarily regard my desire to better understand both the context and results. I understand (and appreciate!) the concise writing, but at times the paper felt a bit barebones. In particular:
A. Nods to attrition, balance, and other “standard” discussions in experimental work were missing from the main text. As far as I could tell, attrition was not mentioned once, yet it seems likely that not all participants completed all 12 weeks of the CBT program; similarly, I would expect that some participants did not respond to the follow-up survey. This feels quite important to report to readers, at least in a footnote. As for balance, I found the rich discussion of the randomization procedure in Appendix A quite compelling and worth at least some reference in the main text, in part simply to state clearly the variables for which balance was (or was not) ultimately achieved. [Evaluation manager’s note: the authors do consider and report on attrition in the AER: Insights published version.]
B. I would have appreciated access to survey materials, either via the appendix or the authors’ websites. It is possible to access the CBT program guide online, but from what I can tell this does not include the questions that make up the survey and lab outcomes included in the analysis. This is important for several reasons. Three that come to mind in particular:
I was a bit puzzled as to the selection of mental health outcomes. Three question types were included in an index (presumably these were the only three relevant questions asked in the survey?), but only one of the question types (K10) is included in many of the key analyses. The K10 score also happens to be the outcome with the most statistically precise and positive treatment effect but with quite variable scores over time, which makes an understanding of the decision to focus on this outcome especially important. Access to the survey materials would help readers to understand this selection decision, as well as the wording of the particular questions that made up the index.
Access to the survey would also allow readers to better understand how experimenter demand may affect self-reports, which was not discussed in the paper. Were participants aware that the survey questions were related to their participation in the CBT program, for instance, or was this link obfuscated? This is important for the interpretation of study findings, and perhaps for understanding the difference between these results and the Haushofer et al findings (see more on this below).
More generally, access would be appreciated for ease of replication and/or scale-up efforts.
C. On page 4 when discussing the hypothesis that the poor are particularly vulnerable to mental health difficulties, it would be helpful to point to empirical data supporting this claim. Similarly, when discussing the observed baseline rates of mental health in this sample, it would be helpful to see how the data compare with other samples. There is a note on page 8 comparing the BRFSS averages in the US to the K10 scores collected in Ghana. Given that the authors also collect at least one BRFSS question in the Ghana sample, I would encourage them to report this here for the sake of comparison to the US sample.
My preference would have been to put a bit less weight in the introduction and discussion on the particular mechanisms underlying how CBT operates. The data suggest that CBT has some impact on downstream outcomes (cognitive, socioemotional, and economic) independent of its effect on mental and physical wellbeing, largely because a measure of vulnerability to mental distress predicts treatment effects only for mental and physical wellbeing. On this basis, the authors claim that “CBT directly improves bandwidth, increasing cognitive and socioemotional skills and hence economic outcomes.” However – especially given that there are no heterogeneous effects by predicted bandwidth – I would encourage the authors to present the scarcity channel as one candidate mechanism, rather than suggesting that this channel has been mechanistically isolated. My sense is that the note in the conclusion that “the poor can generally benefit from CBT whether they have mental health problems or not” nicely captures the policy implications of these findings, and I would encourage this set of results to be framed more broadly in this light.
The authors appropriately cite a Haushofer et al (2020) paper which finds no impact of a 5-week CBT training program at a 12-month follow-up. While it is of course no critique of this paper that different results were observed here, the difference does raise questions about how the respective studies ought to inform our posterior. To aid in this endeavor, I would like to see a somewhat more complete discussion of the two sets of results. For instance, it seems worth mentioning why a 12-month follow-up wasn’t collected here to best compare results; was it because the other treatments in this trial had begun to be implemented, and so the randomization was no longer clean? The authors also suggest that, in light of the Haushofer et al results, there may be fade-out in the longer run (Haushofer et al might also have observed effects had they looked at shorter-run outcomes). But if that is the case, it significantly affects how we interpret the value of the CBT program, and so a longer-run follow-up in this context would be very valuable. Finally, another related paper that seems worth including in this broader discussion is Bhat et al (2023).
While this is not an actionable suggestion, it seems worth noting that a pre-analysis plan would in my view have been quite useful for this project. For instance, it would have allowed the authors to clarify why the analysis focuses on the particular K10 outcome (Was it a question of power [but then why not rely on the full index]? Was it necessary to restrict the outcomes for a secondary analysis?). It would have also provided a space to register decisions such as pooling the control group households in CBT treatment communities with the control group households in control communities.
I found the degree of churn in the K10 mental wellbeing scores over time to be a bit surprising. It would be helpful to see the correlations between K10 and the other mental wellbeing measures. Was there a similar degree of churn over time across these measures?
It seems as though there is a lot of rich data here, and perhaps even more that could be gleaned. I was interested in a few questions around spillovers that might be discussed further, for instance:
a. Both adults in control households are included in the analysis, but only the treated member of the treated household is included. This presents a nice opportunity to look at spillovers for spouses of treated participants: what do the data tell us?
b. The lack of spillover effects in general is rather interesting, and I would have liked to see more on this.
The randomization procedures described in Appendix A are very neat – a great way to ensure proper balance. I’d suggest including some of this discussion in the main text, at least in a footnote.
More table notes would be appreciated, e.g. for Table 1.
There is a typo in footnote 17; “results” is not spelled correctly.
How long have you been in this field?
[About 10 years … Editors: removed some mildly-identifying content here]
How many proposals and papers have you evaluated?
20-30 formal referee reports
Evaluation 2 of “Mental Health Therapy as a Core Strategy for Increasing Human Capital: Evidence from Ghana”