As I’ve told my monthly Resource List readers, there’s a lot to learn about statistics. And there’s lots of controversy about how to use specific methodologies. Many statistical methods are simply not intuitive, and the growing practice of holding up two numbers and talking about their relationship as if it’s meaningful can lead to incorrect ideas, misuse of funding, a confused worldview, and much worse. Many of us are — unintentionally — lying with our data.
Basic Guidelines to Keep Your Data Honest
My years of working on the ground with non-profits and journalists have taught me a few basic guidelines for dealing with data that will help you start the journey on the right foot.
1. Always look at your data.
Summary statistics are never enough if you want to avoid lying with your data. All four of the datasets in the famous “Anscombe’s Quartet” have nearly identical averages, standard deviations, and correlations. Yet they reflect very different realities.
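Here is a minimal sketch of that point, using the published Anscombe values and only Python's standard library. The helper name `pearson_r` and the `summary` dictionary are my own labels for illustration. The four rows of summary numbers come out practically the same, while the printed pairs underneath clearly do not.

```python
from math import sqrt
from statistics import mean, stdev

# Anscombe's Quartet: datasets I-III share the same x values.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson_r(xs, ys):
    """Pearson correlation coefficient, written out by hand."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# (mean of y, standard deviation of y, correlation of x and y) per dataset
summary = {
    name: (mean(ys), stdev(ys), pearson_r(xs, ys))
    for name, (xs, ys) in quartet.items()
}

for name, (xs, ys) in quartet.items():
    avg, sd, r = summary[name]
    print(f"{name}: mean={avg:.2f} sd={sd:.2f} r={r:.2f}")
    # The "look at your data" step: print the points, sorted by x.
    print("   ", sorted(zip(xs, ys)))
```

In practice a scatter plot does this job far better than a printout. The point is only that the raw values have to be inspected somehow before the summary is trusted.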
2. Figure out what type of data you have.
Then use tools that are appropriate for your data. Continuous numbers, ordinal numbers, Likert scales, nominal numbers: these all need different tools. An “average” is great for a continuous number. An “average” is meaningless for most nominal numbers, where the number is only a label for a category.
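A small sketch of matching the summary to the data type, assuming made-up survey values (the variable names and numbers below are hypothetical, not from any real dataset). A mean suits the continuous ages, a median suits the ordered Likert answers, and only the mode makes sense for the region codes.

```python
from statistics import mean, median_low, mode

# Hypothetical survey data, invented for illustration only.
ages = [23, 35, 41, 29, 52]           # continuous
likert = [1, 2, 2, 4, 5, 5, 5]        # ordinal: 1 = strongly disagree ... 5 = strongly agree
region_codes = [1, 1, 3, 2, 1, 3]     # nominal: 1 = North, 2 = South, 3 = East
region_names = {1: "North", 2: "South", 3: "East"}

mean_age = mean(ages)                  # meaningful: ages are real quantities
median_answer = median_low(likert)     # middle answer, always an actual scale value
most_common_region = region_names[mode(region_codes)]

# This runs without complaint, which is exactly the problem:
# there is no region "1.83".
meaningless_average = mean(region_codes)

print("Mean age:", mean_age)
print("Median Likert answer:", median_answer)
print("Most common region:", most_common_region)
print("Average region code (meaningless):", round(meaningless_average, 2))
```

The software will happily average anything stored as a number, so the check on data type has to come from the analyst.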
3. Put your data in context.
One of the most common ways to misrepresent statistics is to take the data out of context. It’s vital to compare apples to apples. Denominators matter. Reporting a change in a rate as a percent change is not a good idea, because a small starting value makes a small shift look enormous. Instead, report the change in percentage points. Or per capita.
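To make the arithmetic concrete, here is a short sketch with invented numbers (the rates, incident counts, and populations are all hypothetical). The same shift from 4% to 6% can be reported as "2 points" or as "up 50%", and raw counts can point the opposite way from per capita rates.

```python
# Hypothetical rates, in percent, for two survey years.
rate_before = 4.0
rate_after = 6.0

point_change = rate_after - rate_before                          # percentage points
percent_change = (rate_after - rate_before) / rate_before * 100  # relative change

print(f"Change: {point_change} percentage points")
print(f"Same change as a percent change: {percent_change}%")

# Hypothetical incident counts and populations for two cities.
cities = {
    "City A": {"incidents": 500, "population": 100_000},
    "City B": {"incidents": 200, "population": 20_000},
}


def per_thousand(incidents, population):
    """Incidents per 1,000 residents."""
    return incidents * 1000 / population


rates = {name: per_thousand(**c) for name, c in cities.items()}
for name, value in rates.items():
    print(f"{name}: {cities[name]['incidents']} incidents, {value} per 1,000 residents")
```

City A has more incidents in total, but City B has twice as many per resident. Which city looks worse depends entirely on the denominator.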
4. Avoid patterns of association between only two variables.
Comparing two variables is almost never enough context. An apparent association between two variables is often not real. Relationships can appear:
- Because they’re really happening
- By chance
- By bias (such as the ecological fallacy)
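The "by chance" case is easy to demonstrate with a rough simulation. The sketch below assumes pairs of completely unrelated random variables with only ten observations each. The seed, the sample sizes, and the 0.632 cutoff (roughly the correlation that counts as "significant" at the 5% level with ten points) are my own choices for illustration.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient, written out by hand."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


random.seed(0)       # fixed seed so the run is repeatable
N_PAIRS = 1000       # number of unrelated variable pairs to try
N_POINTS = 10        # observations per variable
THRESHOLD = 0.632    # approximate 5% significance cutoff for 10 points

strong = 0
for _ in range(N_PAIRS):
    xs = [random.gauss(0, 1) for _ in range(N_POINTS)]
    ys = [random.gauss(0, 1) for _ in range(N_POINTS)]  # independent of xs
    if abs(pearson_r(xs, ys)) > THRESHOLD:
        strong += 1

print(f"{strong} of {N_PAIRS} unrelated pairs look strongly correlated")
```

Around one pair in twenty should clear the cutoff even though nothing connects the variables. Anyone who compares enough pairs of numbers will find a "relationship" eventually.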
The way to avoid associations that can result in lying data is by including other important variables and providing specifics about the data. (Alberto Cairo offers further thoughts on this here.)
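Here is one sketch of what including another important variable can do, using invented counts for two hypothetical programs, A and B. The third variable here is case difficulty. Compared on overall success alone, B looks far better. Once the cases are split by difficulty, A comes out ahead in both groups, a reversal usually called Simpson's paradox.

```python
# Hypothetical (successes, participants) for two programs, split by
# how difficult each case was. None of these numbers are real.
results = {
    "easy cases": {"A": (90, 100), "B": (800, 1000)},
    "hard cases": {"A": (300, 1000), "B": (20, 100)},
}


def rate(successes, total):
    """Success rate in percent."""
    return 100 * successes / total


def overall(program):
    """Success rate for a program, ignoring case difficulty."""
    wins = sum(results[group][program][0] for group in results)
    total = sum(results[group][program][1] for group in results)
    return rate(wins, total)


overall_a, overall_b = overall("A"), overall("B")
print(f"Two variables only:  A={overall_a:.1f}%  B={overall_b:.1f}%")

# Same data, with the third variable included.
by_group = {
    group: {program: rate(*counts) for program, counts in programs.items()}
    for group, programs in results.items()
}
for group, rates in by_group.items():
    print(f"{group}:  A={rates['A']:.1f}%  B={rates['B']:.1f}%")
```

Program A mostly took on hard cases and Program B mostly took on easy ones, so the overall comparison was measuring the case mix more than the programs.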
Resources to Help Avoid Lying With Data
For next steps and some great book and video learning resources, we recommend:
- Cathy O’Neil’s free e-book on thinking carefully about your data.
- The upcoming online course Learning to Love Statistics
- A good piece by Jonathan Stray on accurately drawing conclusions from data
(None of our suggestions or recommendations here are ever paid or commission-based. This is just stuff we like and use and think you might like too!)
Datassist provides real-world answers to unique data-based questions. Turning insights from data analytics into competitive advantage requires a cultural shift toward evidence-based decision-making. We can help your organization get there.