RCT as I say, not as I do

Randomized Controlled Trials (RCTs) are the gold standard in policy evaluation.

Say you’re investigating a third world development policy, like building schools, or installing water pumps, or distributing anti-malarial bednets. A random sample of the villages in an area is selected to receive the policy. The other villages form the control group, and receive no special treatment. Metrics on various desiderata are recorded for each village, like income, lifespan and school attendance. By comparing these outcomes between villages with and without the intervention, we can judge whether it made a statistically significant difference.

RCTs give us strong evidence of a causal link between the intervention and the result – because the villages were assigned at random, there should be no other systematic differences between the treatment and control villages, so we have good grounds for thinking the differences in outcome were due to the intervention.
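To make the comparison concrete, here is a minimal sketch in Python of random assignment followed by a difference-in-means estimate and a permutation test. Everything in it is made up for illustration: the 200 villages, the outcome scale, the built-in effect of 5 points, and the function names are assumptions, not data or methods from any real trial.

```python
import random
import statistics


def assign_treatment(villages, rng):
    """Randomly pick half of the villages for treatment; the rest are controls."""
    shuffled = list(villages)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def difference_in_means(outcomes, treated, control):
    """Mean outcome of treated villages minus mean outcome of control villages."""
    treated_mean = statistics.mean(outcomes[v] for v in treated)
    control_mean = statistics.mean(outcomes[v] for v in control)
    return treated_mean - control_mean


def permutation_p_value(outcomes, treated, control, rng, n_permutations=2000):
    """Share of random relabellings whose gap is at least as large as the observed one."""
    observed = abs(difference_in_means(outcomes, treated, control))
    villages = list(outcomes)
    n_treated = len(treated)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(villages)
        gap = difference_in_means(outcomes, villages[:n_treated], villages[n_treated:])
        if abs(gap) >= observed:
            extreme += 1
    return extreme / n_permutations


# Hypothetical simulation: 200 villages, with a true effect of 5 points built in.
rng = random.Random(0)
villages = list(range(200))
treated, control = assign_treatment(villages, rng)
treated_set = set(treated)
outcomes = {
    v: rng.gauss(60, 10) + (5.0 if v in treated_set else 0.0)
    for v in villages
}

estimate = difference_in_means(outcomes, treated, control)
p_value = permutation_p_value(outcomes, treated, control, rng)
print(f"estimated effect: {estimate:.2f}, permutation p-value: {p_value:.3f}")
```

The permutation test asks how often a gap this large would show up if the treatment labels were meaningless. It is one simple choice among several; a real evaluation would more likely use a regression or a t-test with standard errors clustered appropriately.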

This is a marked improvement over typical methods of evaluation. One such method is simply to not investigate results at all, because it seems obvious that the intervention is beneficial. But people’s intuitions are not very good at judging which interventions work. When Michael Kremer and Rachel Glennerster did a series of education RCTs in Kenya, all their best ideas turned out to be totally ineffective – plausible ideas like providing textbooks or teachers to schools had little impact. The one thing that did make a difference – deworming the children of intestinal worms – was not something you’d necessarily have expected to have the biggest impact on education. Our intuitions are not magic – there’s no clear reason to expect us to have evolved good intuitions about the effectiveness of developmental policies.

A common alternative is to give everyone the intervention, and see if outcomes improve. This doesn’t work either – outcomes might have improved for other reasons. Or, if outcomes deteriorated, maybe they would have been even worse without the intervention. Without RCTs, it’s very difficult to tell. Another alternative to RCTs is to compare outcomes for villages which had schools in the first place to those which didn’t, before you intervene at all, and see if the former have better outcomes. But then you can’t tell if there was a third factor that causes both schools and outcomes – maybe the richer villages could afford to build more schools.
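The third-factor problem can be shown with a toy example. In the sketch below, the data are entirely hypothetical and constructed so that schools have no effect at all: wealth alone determines both whether a village has a school and how well it does. A naive comparison still finds a large gap.

```python
import statistics

# Hypothetical villages: wealth drives both school-building and the outcome.
# By construction, having a school has zero causal effect on the outcome.
villages = [
    {"wealth": w, "has_school": w >= 5, "outcome": 10.0 * w}
    for w in range(1, 9)
]


def naive_gap(villages):
    """Mean outcome of villages with schools minus mean outcome of those without."""
    with_school = [v["outcome"] for v in villages if v["has_school"]]
    without_school = [v["outcome"] for v in villages if not v["has_school"]]
    return statistics.mean(with_school) - statistics.mean(without_school)


print(naive_gap(villages))  # a large gap, even though schools do nothing here
```

Random assignment breaks the link between wealth and treatment, which is why the same comparison becomes informative in an RCT.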

The other main use of RCTs is in pharmaceuticals – companies that develop a new drug have to go through years of testing where they randomly assign the drug to some patients but not others, so we can be reasonably confident that the drug both achieves its aims and doesn’t cause harmful side effects.

One of the major criticisms of RCTs is that they are unfair, because you’re denying the benefits of the intervention to those in the control group. You could have given vaccinations to everyone, but instead you only gave them to half the people, thereby depriving the second half of the benefits. That’s horrible, so you should give everyone the treatment instead. This is a reasonably intelligent discussion of the issue.

But this is probably a mistake. Leaving aside the issue that it’s more expensive to give everyone the treatment than a subset (though RCTs do cost money to run), it’s a very static analysis. Perhaps in the short term giving everyone the best we have might produce the best expected results. But in the long term, we need to experiment to learn more about what works best. It is far better to apply the scientific method now and invest in knowledge that will be useful later than to cease progress on the issue.

Indeed, without doing so we could have little confidence that our actions were actually doing any good at all! Many interventions received huge amounts of funding, only for us to realize, years later, that they weren’t really achieving much. For example, for a while PlayPumps – children’s roundabouts that pumped drinking water – were all the rage, and millions of dollars were raised, before people realized that they were expensive and inefficient. Worse, they didn’t even work as roundabouts, as the energy taken out of the system to pump the water meant they were no fun to play with.

Another excellent example of the importance of RCTs is Diacidem. Founded in 1965 by Lyndon Diacidem, it now spends $415 million a year, largely funded by the US government, on a variety of healthcare projects in the third world, where it deliberately targets the very poorest people. Given that total US foreign aid spending on healthcare is around $1,318 million, this is a very substantial program.

Diacidem have done RCTs. They did one with 3,958 people from 1974 to 1982, where they randomly treated some people but not others. The long time horizon and large sample size make this an especially good study.

Unfortunately, they failed to find any improvement on nearly all of the metrics they used, and as they used a 5% significance level, you’d expect one to appear significant just by chance.

“for the average participant, any true differences would be clinically and socially negligible… for the five general health measures, we could detect no significant positive effect… among participants who were judged to be at elevated risk [the intervention] has no detectable effect.”

Even for those with low income and initial ill health, surely the easiest to help, they didn’t find any improvements in physical functioning, mental health, or their other metrics.
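The point about chance findings at a 5% threshold is simple arithmetic. The sketch below assumes the metrics are independent and that the intervention truly does nothing; the figure of 20 metrics is a hypothetical round number, not the number used in the study.

```python
def expected_false_positives(n_metrics, alpha=0.05):
    """Expected number of metrics that look significant when nothing works."""
    return n_metrics * alpha


def prob_at_least_one_false_positive(n_metrics, alpha=0.05):
    """Chance that at least one of n independent null metrics crosses the threshold."""
    return 1 - (1 - alpha) ** n_metrics


n_metrics = 20  # hypothetical; the study's actual number of metrics is not given here
print(expected_false_positives(n_metrics))                    # about 1
print(round(prob_at_least_one_false_positive(n_metrics), 3))  # about 0.64
```

So with a couple of dozen metrics, an isolated significant result is roughly what you would expect to see even from an intervention with no effect.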

They did a second study in 2008, with 12,229 people, and the results were similar. People in the treatment groups got diagnosed and treated a lot more, but their actual health outcomes didn’t seem to improve at all. Perhaps most damningly,

“We did not detect a significant difference in the quality of life related to physical health or in self-reported levels of pain or happiness.”

Given that these two studies gave such negative results, you would expect there to be a lot more research on the effectiveness of Diacidem – if not simply closing it down. When there are highly cost-effective charities that can save lives with more funding, it is wrong to waste money on charities that don’t seem to really achieve anything instead. But there seems to be no will at all to do any further study. People like to feel like they’re doing good, and don’t like to have their charity criticized. Diacidem is politically popular, so it’s probably here to stay.

Sound bad?

Unfortunately, things are far worse than that. Diacidem does not actually cost $415 million a year – in 2012, they spent over $415 billion, over 300 times as much as the US spends on healthcare aid. It wasn’t founded by Lyndon Diacidem, but by Lyndon Johnson (among others). Nor does it target the very poorest people in the third world – it targets people who are much better off than the average person in the third world.

The RCTs mentioned above are the RAND healthcare experiment and the Oregon healthcare experiment, with some good discussion here and here.

Oh, and it’s not actually called Diacidem – it’s called Medicaid.