Random sampling, contrary to popular belief, is not simply gathering information from the first few people in a group that you happen to encounter. There’s actually a great deal of science that goes into making something truly random, as strange as it sounds.
But how can it be random if I have to plan that carefully?
The mistake that many people make is thinking the word “random” means selecting survey respondents without any planning or forethought. The “random” aspect of random sampling actually means that every member of the population has an equal likelihood of being chosen to answer your survey.
Say, for example, you want to know how many people in your city see a need for more public parking. To collect a random sample of opinions, you go out and start asking the question. But where do you go? If you go to a big-box shopping center, most of the people you encounter will likely have driven there, and drivers are more likely to see a need for somewhere to put their cars. If you stood at the bus station, the responses you’d get would likely be much different. Ditto if you went and asked around at a seniors’ home.
By choosing one specific location to gather information, you’ve eliminated the possibility of your sample being random. If you’re trying to gauge the way the whole city feels about an issue, but you only ask people in one area, not everyone has an equal likelihood of being asked. That is not random sampling.
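To make “equal likelihood” concrete, here is a minimal sketch in Python. It assumes you have a complete list of residents to draw from (a sampling frame), which is a big assumption in practice; the `residents` list and the `simple_random_sample` helper are made up for illustration.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n members without replacement; every member of `frame`
    has the same chance of being picked."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(frame, n)

# Hypothetical sampling frame: one ID per resident of a city of 100,000.
residents = list(range(100_000))

# Pick 50 residents to survey, wherever in the city they happen to be.
chosen = simple_random_sample(residents, 50, seed=42)
print(len(chosen), "residents selected")
```

The point of the sketch is that the selection is driven by the list, not by where you happen to be standing.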
Does Size Really Matter?
A while back, I wrote a post on why larger sample sizes aren’t necessarily better. The critical thing to remember is that the size of your population of interest usually has little effect on the size of the sample you need.
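One way to see this is with a standard sample size formula for estimating a proportion (Cochran’s formula with a finite population correction). The sketch below is only an illustration: the 5% margin of error, the 95% confidence level (z = 1.96), and the worst-case proportion of 0.5 are assumptions chosen for the example, and the formula itself assumes a simple random sample.

```python
import math

def sample_size(population, margin=0.05, z=1.96, p=0.5):
    """Approximate sample size needed to estimate a proportion.

    margin: desired margin of error (assumed 5% here)
    z:      z-score for the confidence level (1.96 is roughly 95%)
    p:      expected proportion (0.5 is the most conservative choice)
    """
    # Sample size for an effectively infinite population.
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    # Finite population correction shrinks it slightly for smaller populations.
    return math.ceil(n0 / (1 + (n0 - 1) / population))

sizes = {n: sample_size(n) for n in (10_000, 100_000, 1_000_000)}
for population, needed in sizes.items():
    print(f"{population:>9,} residents -> about {needed} respondents")
```

Under these assumptions, making the population a hundred times larger changes the required sample size only slightly.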
It’s entirely possible to have a small random sample that will tell you much more than a larger sample of convenience. Using the example I started with above, which do you think would provide more representative results?
Let’s assume your town has a population of 100,000 people. Would you get better results by:
- Speaking to 500 people in a single area?
- Asking 50 people in various locations around the city?
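The answer, obviously, is the second option. The 500 people in a single area are likely to share the same bias, so adding more of them mostly repeats that bias. How the data is collected matters much more than how much data is collected.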
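If you’d like to test that intuition, here is a rough simulation in Python. Everything in it is invented for illustration: the share of drivers (60%), how likely drivers and non-drivers are to want more parking (80% and 30%), and the idea that only drivers turn up at the shopping center. It also treats the second option as a true simple random sample of the whole city, which is a stronger condition than just visiting several locations.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

def make_resident():
    """Return (is_driver, wants_more_parking) for one made-up resident."""
    is_driver = rng.random() < 0.6
    wants = rng.random() < (0.8 if is_driver else 0.3)
    return is_driver, wants

# Hypothetical city of 100,000 people.
city = [make_resident() for _ in range(100_000)]
true_share = sum(wants for _, wants in city) / len(city)

# Toy assumption: everyone at the shopping center is a driver.
shoppers = [resident for resident in city if resident[0]]

def share(sample):
    """Fraction of a sample that wants more parking."""
    return sum(wants for _, wants in sample) / len(sample)

def mean_abs_error(pool, n, trials=200):
    """Average distance between the sample estimate and the true share."""
    errors = (abs(share(rng.sample(pool, n)) - true_share) for _ in range(trials))
    return sum(errors) / trials

convenience_err = mean_abs_error(shoppers, 500)  # 500 people, one location
random_err = mean_abs_error(city, 50)            # 50 people, whole city

print(f"True share wanting parking: {true_share:.2f}")
print(f"Average error, 500 shoppers: {convenience_err:.3f}")
print(f"Average error, 50 random residents: {random_err:.3f}")
```

In this toy setup, the bigger sample stays off target no matter how many shoppers you add, because the error comes from who was asked rather than how many.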
Still confused? That’s fair. Random sampling is often misunderstood. The team at the Pew Research Center has created a really cool tutorial video that helps explain it better.
Is Random Sampling Right for You?
There are pros and cons to different types of sampling. Random sampling is widely regarded as the best way to avoid introducing bias into your results. (Nobody likes bias — it can distort your findings and lead you to inaccurate conclusions.)
A chart comparing the various sampling methods can help show how they differ and why one might be more (or less) suited to what you’re doing.
Need Expert Help with Random Sampling?
Random sampling is not an easy concept to understand. If you’d like some more comprehensive instruction on random sampling and using statistics to tell your story, you’re in luck. Starting September 18, I will be leading a course with the Knight Center on Crafting Data Stories. (This course will be a little different from the MOOC I ran earlier this year with Alberto Cairo, so make sure you check it out early to avoid disappointment.)
Need more hands-on help? No problem. The team at Datassist is standing by to help with all your data collection, analysis, and visualization needs. Drop us a line to discuss your project.