This post was originally written for the blog over at We All Count, a project for equity in data science. We’re working to demystify and democratize data and demonstrate how we can make it more equitable for everyone.
Most of us know that careful data collection is important to quality data analysis. Asking how your data was collected and where it comes from is critical, especially when the results of your analysis are going to affect people’s lives. But did you know that who collects your data can also affect it? I’m not talking about the hidden agendas of corporations. I mean the actual person who asks the questions and records the answers. It’s called the interviewer effect, and it’s a pretty big deal.
What is the Interviewer Effect?
Much of the data we use about people each day comes from social surveys. These surveys involve respondents answering questions about attitudes, opinions, daily experiences, and life histories. The information we glean from this type of survey is valuable. Social sector organizations frequently use this type of data for a variety of purposes:
- Determining if social programs and projects are working as intended
- Setting policy and making decisions
- Influencing human lives
In some settings, survey administrators put a significant amount of effort into ensuring the neutrality of survey results. They use methods (like RCTs) intended to ensure respondents are representative of the entire population. However, in my experience, they give little consideration to the influence of the people asking the questions. Could the interviewer effect impact your results?
Who is Asking Can Change the Answer
Obviously, we need to know whether the interviewer effect is real. Our team used a large, well-designed survey to examine how the people asking survey questions might affect the answers. And we’ve uncovered good evidence to support the existence of the interviewer effect.
If we want equitable evaluations, we must take the interviewer effect into account when designing and analyzing surveys, and when deciding which survey data to use.
We All Count is in the process of conducting a detailed study employing a robust methodology to establish the strength of different types of interviewer effects. We are comparing:
- Society types
- Enumerator types
- Data collection methods
Our initial analysis has already revealed some crucial insights. Take a look at this:
When asking the question, “Does the respondent reject wife-beating?” we found some interesting patterns in the results. We tried the following variations:
1. Male respondent, male enumerator, both urban
2. Male respondent, male enumerator, both rural
3. Female respondent, female enumerator, both urban
4. Female respondent, female enumerator, both rural
5. Male respondent, female enumerator, both urban
6. Male respondent, female enumerator, both rural
7. Female respondent, male enumerator, both urban
8. Female respondent, male enumerator, both rural
Respondents in the second variation (male respondent and male enumerator, both rural) were most likely to say violence against women is OK. Respondents in the third variation (female respondent and female enumerator, both urban) were least likely to say it was OK.
Using mixed-effects models that account for urban and rural differences, as well as variation within and between countries, it looks like women speaking to female enumerators are 7 to 12% more likely to reject violence against women than women speaking to male enumerators.
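The comparison behind that estimate can be illustrated with a toy calculation. The dataset below is entirely hypothetical, and a real analysis would fit a mixed-effects logistic model rather than raw cell rates, but the basic quantity being estimated (the gap in rejection rates between female-enumerator and male-enumerator interviews within each setting) looks like this:

```python
# Hypothetical records: (respondent gender, enumerator gender, setting, rejects wife-beating)
responses = [
    ("F", "F", "urban", 1), ("F", "F", "urban", 1), ("F", "F", "urban", 0),
    ("F", "M", "urban", 1), ("F", "M", "urban", 0), ("F", "M", "urban", 0),
    ("F", "F", "rural", 1), ("F", "F", "rural", 0),
    ("F", "M", "rural", 0), ("F", "M", "rural", 0),
]

def rejection_rate(records, enumerator_gender, setting):
    """Share of female respondents in this cell who reject wife-beating."""
    cell = [r[3] for r in records
            if r[0] == "F" and r[1] == enumerator_gender and r[2] == setting]
    return sum(cell) / len(cell) if cell else None

for setting in ("urban", "rural"):
    gap = (rejection_rate(responses, "F", setting)
           - rejection_rate(responses, "M", setting))
    print(f"{setting}: female-enumerator rate minus male-enumerator rate = {gap:+.2f}")
```

A mixed-effects model does the same comparison while pooling information across countries and settings instead of computing each cell separately.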
So What Can We Do?
Recognizing that the interviewer effect exists is one thing. But what can we do about it?
There are a couple of steps you can take to minimize the impact of the interviewer effect on survey results.
- Check who was asking the questions. What are the enumerators’ demographics? Were they women or men? How does their income relate to respondents’ income? Are there differences in ethnicity, race, social status, or education? (A recent EU survey on violence against women used only female interviewers to eliminate that variable. This is pretty rare.)
- Check how interviewers were assigned to administer surveys. Were they randomly assigned to survey respondents? (This is not generally the case.)
- Adjust for the interviewer effect in your analysis. This is unnecessary if interviewers were randomly assigned to respondents. If not, the next best situation is to balance enumerators by characteristics between treatment and control groups. If you’ve already collected the data and it isn’t balanced, you’ll need to use a statistical method to estimate the effects of the interviewer on your questions and adjust for that in models. (You can test for effects by dividing data into subsections and using a machine learning algorithm. You can also use matching techniques like propensity score matching. Or you can build mixed-effects models with an interaction effect for the interviewer characteristics.)
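As a sketch of the matching idea in the last step, here is exact matching on observed covariates: compare outcomes only within groups of respondents who share the same characteristics, then average the within-group differences. This is a simpler relative of propensity score matching; all records and category names below are hypothetical, and a real analysis would use propensity scores or a mixed-effects model when cells are sparse:

```python
from collections import defaultdict

# Hypothetical records: (covariate cell, enumerator type, outcome 0/1)
records = [
    (("urban", "low_income"), "female_enum", 1),
    (("urban", "low_income"), "male_enum", 0),
    (("urban", "high_income"), "female_enum", 1),
    (("urban", "high_income"), "male_enum", 1),
    (("rural", "low_income"), "female_enum", 1),
    (("rural", "low_income"), "male_enum", 0),
]

def matched_effect(records):
    """Average within-cell outcome difference (female minus male enumerator)."""
    cells = defaultdict(lambda: {"female_enum": [], "male_enum": []})
    for cell, enum_type, outcome in records:
        cells[cell][enum_type].append(outcome)
    diffs = []
    for groups in cells.values():
        f, m = groups["female_enum"], groups["male_enum"]
        if f and m:  # only compare cells where both enumerator types appear
            diffs.append(sum(f) / len(f) - sum(m) / len(m))
    return sum(diffs) / len(diffs)

print(matched_effect(records))  # adjusted estimate of the enumerator effect
```

Cells containing only one enumerator type are dropped, which is the same trade-off propensity score matching makes when it discards respondents without a comparable match.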
One way or another, the person asking survey questions has a significant impact on the answers. To ensure equitable evaluations, we must adjust for this — either in the project design before collection or statistically afterward.
Need help? Want to learn more about the interviewer effect? Contact the experts at Datassist now.