Continuing the blog mini-series I’ve been doing over the last couple weeks about ensuring that we use data correctly, this week’s post is about the prosecutor’s fallacy. But while my other posts pulled different examples from all over — some serious and some silly — this one will focus on a single instance.
The Sally Clark story is, quite possibly, the most famous example of the prosecutor’s fallacy. It’s also a chilling warning about how important it is to make sure we understand what our data is saying.
Who Was Sally Clark?
If you haven’t heard the Sally Clark story before, be prepared. It’s a grim tale. Sally Clark was a successful lawyer, happily married, and a new mother when things first started to go wrong for her. She discovered her 2-month-old son Christopher in his crib, not breathing. An ambulance was called but attempts to resuscitate him failed. His death was blamed on Sudden Infant Death Syndrome (SIDS).
Around a year later, Sally and her husband Steve were hit with another tragedy. Their second son, Harry, was also found dead at around 2 months old. An ambulance was called, but the boy could not be saved. This time, the coroner determined there was something suspicious about the boy’s death, which in turn prompted a deeper investigation into his brother’s death one year prior.
Sally Clark was subsequently charged with the murder of both her children and convicted based on statistical evidence that was later determined to be flawed. Sally served more than three years of a life sentence before her convictions were overturned. She died of alcohol poisoning a few years later, apparently having never recovered from the tragic events.
The Stats That Ruined Everything
There were a few problems with the math that was used in court. Today, we’re going to focus on the most fundamental one: the question that the expert witness’s number actually answered. That number set the stage for the prosecutor’s fallacy.
Pediatrician Sir Roy Meadow testified as an expert witness at Sally Clark’s trial. His argument was that if the probability of one child in a healthy, high-income household succumbing to SIDS were 1 in 8,500, then the probability of losing two children to the same phenomenon was about 1 in 73,000,000 (roughly 8,500 squared). [Side note: there are lots of problems with this calculation, but that’s for another post]. With the odds stacked against her, Sally Clark was found guilty.
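To make the arithmetic concrete, here is a minimal Python sketch of the squaring step, using the rounded 1 in 8,500 figure from this post. Multiplying the two probabilities treats the deaths as independent events, which is an assumption built into the calculation. The rounded input gives about 1 in 72 million; the 73 million quoted at trial presumably came from a less rounded per-child figure, which is my assumption rather than something this post establishes.

```python
from fractions import Fraction

# Per-child SIDS probability as quoted in this post (a rounded figure).
p_one = Fraction(1, 8500)

# Squaring treats the two deaths as independent events,
# which is an assumption of the courtroom calculation.
p_two = p_one * p_one

print(f"1 in {p_two.denominator:,}")  # 1 in 72,250,000
```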
However, this is the probability of the evidence given innocence: how likely two such deaths would be if the mother did nothing wrong. That’s the wrong question. It is not the probability that matters when determining innocence, and a tiny number here does not mean a tiny chance of innocence.
The relevant question is the opposite. Instead of examining the probability of the evidence given innocence, we should look at the probability of innocence given the evidence. In other words, what are the chances that these babies died from unexplained natural causes vs the chances these babies died from murder?
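For readers who like to see the two quantities side by side, Bayes’ theorem is what connects them. In the sketch below, E stands for the evidence (two infant deaths), I for innocence, and G for guilt. These symbols are my own shorthand, not notation used at the trial.

```latex
P(I \mid E) = \frac{P(E \mid I)\,P(I)}{P(E \mid I)\,P(I) + P(E \mid G)\,P(G)}
```

The 1 in 73,000,000 figure plays the role of P(E | I), a single term on the right-hand side. What a court needs is the left-hand side, which also depends on how likely the evidence is under guilt.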
And therein lies the prosecutor’s fallacy. Sally Clark’s conviction rested on how unlikely it was that two children in the same household would both die of SIDS (probability of the evidence given innocence).
What it should have looked at was the likelihood that Sally Clark could be innocent under the circumstances (probability of innocence given evidence). There was a two in three chance these babies died from SIDS or other unexplained natural causes.
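Here is a rough Python sketch of that comparison. It assumes only two possible explanations (both deaths natural, or both deaths murder) and weighs them against each other. The function name and the double-murder probability are hypothetical: I picked the second number purely so the result lands on the two in three mentioned above, not because it is a real estimate.

```python
def posterior_innocence(p_double_natural: float, p_double_murder: float) -> float:
    """Probability of innocence given that two infant deaths occurred.

    Assumes the only explanations are 'both natural' and 'both murder'.
    """
    return p_double_natural / (p_double_natural + p_double_murder)


# HYPOTHETICAL inputs: both explanations are extremely rare to begin with.
p_double_natural = 1 / 73_000_000   # the figure quoted at trial
p_double_murder = 1 / 146_000_000   # made up: half as likely as the above

p_innocent = posterior_innocence(p_double_natural, p_double_murder)
print(f"P(innocent | two deaths) = {p_innocent:.3f}")  # 0.667
```

The point of the sketch is that a 1 in 73,000,000 event can still be the more likely explanation when the alternative is rarer still.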
Prosecutor’s Fallacy Isn’t Just for Prosecutors
The careful use of conditional probability is necessary for anyone who works with data. People who work in courts aren’t the only ones who can fall victim to the prosecutor’s fallacy. Here are some other examples of cases where asking the right question is critical.
Death and Violence Rates
The following two questions look very similar, but will get you very different results:
- Given a suicide death, what percentage used a certain method?
- Given a suicide attempt using a certain method, what percentage were fatal?
Ecology and Climate Change
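The same thing applies here. Suppose you are studying swamps that have died and the characteristics that might predict that outcome:
- Given that a swamp has died, which predictors did it show?
- Given that a swamp shows those predictors, how likely is it to die?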
Health and Medicine
The chances of having a heart attack if you have high cholesterol are not the same as the chances that a heart attack victim has high cholesterol.
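A small Python sketch with made-up counts shows how far apart those two numbers can be. Every figure below is hypothetical and chosen only to illustrate the asymmetry; none of it is real medical data.

```python
# HYPOTHETICAL counts for an imagined population of 10,000 adults.
high_cholesterol = 3_000   # people with high cholesterol
heart_attacks = 200        # people who had a heart attack
both = 150                 # people in both groups

# Same numerator, different denominators: two different questions.
p_attack_given_chol = both / high_cholesterol   # P(heart attack | high cholesterol)
p_chol_given_attack = both / heart_attacks      # P(high cholesterol | heart attack)

print(f"P(heart attack | high cholesterol) = {p_attack_given_chol:.2f}")  # 0.05
print(f"P(high cholesterol | heart attack) = {p_chol_given_attack:.2f}")  # 0.75
```

In this invented population, most heart attack victims have high cholesterol, yet most people with high cholesterol never have a heart attack.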
The prosecutor’s fallacy is a dangerous data trap. Asking (and answering) the wrong question can have life-changing impacts if we’re not careful. If you’d like to make sure you’re not falling victim to the prosecutor’s fallacy, we can help. Get in touch with Datassist today.