“I keep hearing about Bayesian methods for social sector data analysis. What the heck are they?”
The word “Bayesian” crops up again and again when we talk about data analysis in the social sector. We see it in stories about nonprofits using Big Data. It appears frequently in data journalism. (Nate Silver of FiveThirtyEight.com is a big fan of Bayesian methods.) References to Bayesian methods even show up on popular TV shows. (OK, maybe just the one.)
But What Are Bayesian Methods?
Bayesian analysis is based on Bayes’ theorem, which describes the probability of an event based on prior knowledge of conditions that might be related to the event.
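If it helps to see the theorem as arithmetic, here is a minimal Python sketch. The function name `bayes_posterior` and every number in it are invented for illustration (they do not come from a real study). It imagines a reading screener: 10% of students struggle with reading, the screener flags 80% of those students, and it also wrongly flags 20% of everyone else.

```python
def bayes_posterior(prior, p_evidence_if_true, p_evidence_if_false):
    """Return P(hypothesis | evidence) using Bayes' theorem."""
    # Total probability of seeing the evidence, whether or not the hypothesis is true
    p_evidence = p_evidence_if_true * prior + p_evidence_if_false * (1 - prior)
    return p_evidence_if_true * prior / p_evidence


# Hypothetical numbers, chosen only to make the arithmetic easy to follow
posterior = bayes_posterior(
    prior=0.10,                # share of students who struggle with reading
    p_evidence_if_true=0.80,   # screener flags a struggling reader
    p_evidence_if_false=0.20,  # screener flags a student who is doing fine
)
print(f"P(struggling | flagged) = {posterior:.3f}")  # prints 0.308
```

Under these made-up numbers, a flagged student has only about a 31% chance of actually struggling, because the prior (10%) was low to begin with.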
It’s been a pretty big deal in medical research, biology, physics, and other sciences for some time now. Corporate prediction algorithms also often rely on Bayesian analysis. Netflix and Amazon use Bayesian methods to predict which shows or products you’ll enjoy.
So how does it work?
In their most basic form, Bayesian methods incorporate beliefs and knowledge from prior research and experience into our current findings.
Traditional data analysis takes data as it is and uses algorithms and models to calculate results and generate evidence. In contrast, Bayesian methods combine data with information we have already learned about similar data and then use algorithms and models to calculate results and generate evidence. This special Bayesian component — the information we already learned about similar data — is called “the prior.”
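One simple way to watch a prior at work is a Beta-Binomial update, sketched below in Python. This is only one of many possible models, the helper names `update_beta` and `beta_mean` are hypothetical, and the counts are made up.

```python
def update_beta(prior_a, prior_b, successes, failures):
    """Combine a Beta(prior_a, prior_b) prior with new success/failure counts."""
    return prior_a + successes, prior_b + failures


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Hypothetical prior: in earlier, similar programs, 30 of 50 children improved
prior_a, prior_b = 30, 20
# Hypothetical new data: in our small pilot, 18 of 20 children improved
successes, failures = 18, 2

post_a, post_b = update_beta(prior_a, prior_b, successes, failures)

print(f"data alone:      {successes / (successes + failures):.3f}")  # 0.900
print(f"prior alone:     {beta_mean(prior_a, prior_b):.3f}")          # 0.600
print(f"prior plus data: {beta_mean(post_a, post_b):.3f}")            # 0.686
```

The combined estimate lands between what the prior suggested and what the new data shows. The more new data we collect, the less the prior matters.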
Implementing Bayesian Methods
Sometimes Bayesian methods are better explained by example.
Say we want to see if providing free hot breakfast to children will help improve reading skills. We have literacy scores over time for 1,000 children who received hot breakfast through the program and for 1,000 children who did not. (Determination of which children would receive hot breakfast was, for the purposes of our example, completely random.)
In traditional data analysis, we’d use the literacy scores to build a model that could test whether the two sets of scores differed statistically. From that, we’d decide if the breakfast program had a measurable impact.
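As a rough sketch of what that traditional test could look like, the Python below compares two group means with a Welch-style test using a normal approximation. This is one common choice, not necessarily the model an analyst would pick. The scores are simulated, the 2-point gap and spread of 10 are assumptions, and `compare_groups` is a hypothetical helper name.

```python
import math
import random
from statistics import NormalDist, mean, variance

random.seed(42)  # fixed seed so the simulated data is reproducible

# Simulated (not real) literacy gains for 1,000 children per group.
# The averages (12 vs 10) and the spread (10) are assumptions for illustration.
breakfast = [random.gauss(12, 10) for _ in range(1000)]
no_breakfast = [random.gauss(10, 10) for _ in range(1000)]


def compare_groups(a, b):
    """Return (difference in means, standard error, two-sided p-value)."""
    diff = mean(a) - mean(b)
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    z = diff / se
    # With 1,000 children per group, a normal approximation is close enough
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return diff, se, p_value


diff, se, p_value = compare_groups(breakfast, no_breakfast)
print(f"difference = {diff:.2f}, standard error = {se:.2f}, p-value = {p_value:.4f}")
```

Notice that nothing here uses what anyone already knew about breakfast programs. The result depends on this one dataset only.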
If we applied Bayesian methods to our analysis, our method would look a little different. We would:
- Examine a similar program in another city in our country
- Incorporate results from that program into our model, including:
  - Amount of change
  - Standard deviation from estimates
  - Other information they discovered
This increases the amount of context our model has to work with. That, in turn, improves the quality of the estimates our model can produce. Instead of spitting out a traditional p-value (which is very hard to interpret in a useful way), our Bayesian model will give us a probability that the program is having a meaningful effect on reading skills.
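The steps above can be sketched in a few lines of Python using a normal prior and a normal estimate from our own data (a standard conjugate update). Treat this as a simplified illustration under stated assumptions: the other city's amount of change (1.5 points), its standard deviation (1.0), our own estimate (2.0) and standard error (0.5), the 1-point threshold for "meaningful", and the helper name `combine_normal` are all hypothetical.

```python
import math
from statistics import NormalDist


def combine_normal(prior_mean, prior_sd, data_mean, data_se):
    """Normal prior + normal estimate -> normal posterior (mean, sd)."""
    prior_precision = 1 / prior_sd**2
    data_precision = 1 / data_se**2
    post_var = 1 / (prior_precision + data_precision)
    # Posterior mean is a precision-weighted average of prior and data
    post_mean = post_var * (prior_precision * prior_mean + data_precision * data_mean)
    return post_mean, math.sqrt(post_var)


# Hypothetical prior, taken from the similar program in another city
prior_mean, prior_sd = 1.5, 1.0
# Hypothetical result from our own 2,000 children
data_mean, data_se = 2.0, 0.5

post_mean, post_sd = combine_normal(prior_mean, prior_sd, data_mean, data_se)
posterior = NormalDist(post_mean, post_sd)

p_positive = 1 - posterior.cdf(0.0)      # probability the effect is above zero
p_at_least_one = 1 - posterior.cdf(1.0)  # probability of at least a 1-point gain

print(f"posterior effect: {post_mean:.2f} (sd {post_sd:.2f})")
print(f"P(effect > 0) = {p_positive:.4f}")
print(f"P(effect > 1) = {p_at_least_one:.4f}")
```

The output is a direct statement such as "there is an X% chance the program improves scores by at least one point", which is usually easier to explain to a board or funder than a p-value.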
If you’re still not totally clear on how Bayesian methods work (or just want to see some more examples), there are many great visual examples of Bayesian analytics on the web. My personal favourites include:
Julia Galef’s A Visual Guide to Bayesian Thinking – Julia uses her own personal experiences to demonstrate how Bayesian methods can help us determine probability. (She also answers the question of how to identify an extroverted mathematician.)
Conditional Probability with Bayes’ Theorem on the Khan Academy site – Brit Cruise uses an incredibly simple coin-flipping experiment to illustrate Bayes’ theorem.
Count Bayesie’s Bayes’ Theorem with Lego – A Lego-based illustration of Bayesian principles from what is “probably a probability blog”.
Still need help? If you’d like to learn more about data analysis for the social sector or want some hands-on assistance, our team is at your service. Get in touch with us at Datassist today.