Randomized controlled trials — or RCTs — are often held up as the “gold standard of research.” I like to think of RCTs as being more the “gold standard of RCTs.”
I’ve written before about the technical limitations of RCTs, but today I want to focus on the analytical limitations. Randomized controlled trials are extremely good at what they do. They produce unbiased estimates of the average treatment effect, which is good to know. Unfortunately, it’s the only thing they can produce — and merely measuring averages doesn’t make us well-informed enough to make practical decisions in many cases. Let me explain.
Who is Average?
The average treatment effect (often referred to as the ATE) is just what it sounds like: the average effect of the treatment across an entire population. What’s wrong with that? Nothing, if that’s what you need to know. But it’s not the typical treatment effect. It’s also not the likely treatment effect for any given individual.
Let’s look at an example.
Say we’re trying to determine whether the program we run to help people start their own businesses is increasing the incomes of people in our community. To conduct an RCT, we randomly assign 10 people to be in the program and 10 to be excluded. Everyone in both groups starts with a prior income of $10.
We conduct our very scientific RCT. The result is a very unbiased ATE. It shows the program increased the average income by $5. Because we used a randomized controlled trial, we are almost certain that our program caused this average change. Great! But using an RCT, we cannot know if the program was good for each individual in the group.
In the group of 10 people included in the program:
- One became very successful and increased their income by $90
- Five saw their incomes unchanged
- Four lost money on their new business and decreased their income by $10
In the group of 10 people not included in the program:
- One person saw moderate success and increased their income by $10
- Two people had some success and increased their incomes by $5
- Five saw their incomes unchanged
- Two lost money and decreased their incomes by $10
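The arithmetic behind the $5 figure can be checked with a short sketch. The income changes are the ones listed above; the variable names are illustrative assumptions, not part of any particular analysis tool.

```python
from statistics import mean

# Income changes in dollars for each person, taken from the two lists above.
program_changes = [90] + [0] * 5 + [-10] * 4
control_changes = [10] + [5] * 2 + [0] * 5 + [-10] * 2

STARTING_INCOME = 10  # everyone starts with a prior income of $10

# Average change within each group.
program_mean_change = mean(program_changes)
control_mean_change = mean(control_changes)

# The ATE estimate is the difference between the two group averages.
ate = program_mean_change - control_mean_change

# Average ending income in each group.
program_mean_income = STARTING_INCOME + program_mean_change
control_mean_income = STARTING_INCOME + control_mean_change

print(f"Average change, program group: ${program_mean_change}")
print(f"Average change, control group: ${control_mean_change}")
print(f"Estimated ATE: ${ate}")
print(f"Average ending income: ${program_mean_income} vs ${control_mean_income}")
```

With these numbers, the program group gains $5 on average and the control group gains nothing, so the estimated ATE is $5.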
Measuring Averages Can Mislead
Based on the numbers above, our RCT shows our program had an average treatment effect of $5. That’s a full 50% improvement in average income ($15 versus $10) when compared to those not in the program. It looks like our program was a huge success.
Not so fast there.
All of those numbers are correct. On average, the group of people who were included in the program ended up with higher incomes than those who weren’t included. But when we look at individual incomes, the story looks quite different:
- The program worked really well for one person
- It made no difference for half the people
- And it worked out quite poorly for the remaining four
Now look at the individuals who didn’t participate in the program. Three of those ten ended up better off, compared with just one of the ten who were in the program. So how great was the program for people really?
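The individual-level view can be tallied in the same way. This is a minimal sketch using the income changes from the example; the helper name `summarize` is made up for illustration.

```python
from collections import Counter
from statistics import mean, median

# Income changes in dollars for each person, from the example above.
program_changes = [90] + [0] * 5 + [-10] * 4
control_changes = [10] + [5] * 2 + [0] * 5 + [-10] * 2

def summarize(changes):
    """Count who ended up better off, unchanged, or worse off."""
    counts = Counter(
        "better" if c > 0 else "worse" if c < 0 else "unchanged"
        for c in changes
    )
    return {
        "better": counts["better"],
        "unchanged": counts["unchanged"],
        "worse": counts["worse"],
        "mean_change": mean(changes),
        "median_change": median(changes),
    }

program_summary = summarize(program_changes)
control_summary = summarize(control_changes)

print("Program:", program_summary)
print("Control:", control_summary)
```

In this example the median change is $0 in both groups, even though the means differ by $5. The whole gap in the averages comes from one person's $90 gain.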
So was the RCT wrong?
It depends on what the question was. If our question was how helpful the program was to the group on average, the RCT provided an unbiased, accurate answer. If the question was how useful our program was to individuals, the RCT didn’t answer it at all. And in the social sector, the question we’re trying to answer is often about individuals.
Determining whether or not your question can be effectively answered by a randomized controlled trial can be challenging. If you’d like help analyzing your data in a way that can support practical decision-making, we can help. The experts at Datassist will find, collect, and analyze the data you need to help tell your story. Get in touch with us today.