This summer, Datassist has taken our audience through surveying with the goal of helping both novice and experienced surveyors find new ways — and often simpler or more effective methods — of working through the survey process, and today I’ll continue that journey with seven easy steps for sane data entry.
(On the Road to a Great Survey shows where we’ve been. I encourage those who haven’t already to review our summer survey series blog posts to prepare for survey projects.)
Manual Entry = Slow and Error-Prone
Many of my clients’ datasets are collected manually by a field worker administering the survey to each respondent. These pages of data are then brought in from the field and given to a data entry person, whose job it is to transcribe each survey sheet into some type of spreadsheet — an arduous task that is prone to errors, even among consummate professionals.
To simplify the task of those entering data and those will work with that data once entered in the spreadsheet, we’ve created a system for sane data entry using simple templates.
These templates control what types of data can be entered into each cell and they can also limit the data entered to a specific set of entries. For example:
- Each cell can be set to either receive only numeric data or only text data
- More specifically, each cell can be set to receive only certain letters (such as T or F for a true/false question) or (0 or 1 for a yes/no question)
Using this system, you avoid ending up with data that is impossible. A true or false question will never be answered with a 3. For age variables, you can set the cell to only accept whole numbers between 0 and 100 – so you’ll never get a respondent who 212 years old.
This system for sane data entry:
- Saves data collectors time because they don’t have to go back to get correct information
- Saves data entry people time because they can move more quickly through the cells, knowing that they have this extra technological support
- Saves data analysts time because the data they get is in better shape than data that has simply been entered into a blank spreadsheet
So how do you get this miracle solution? Here you go…
Seven Pain-Free Steps to Building a Template for Sane Data Entry
Step 1. Create a spreadsheet with your variable names as columns.
Create an Excel Spreadsheet in which the first sheet contains one column header for each survey question that has to be entered. For this example I’m using Name, ID, Sex, Income and whether or not the respondent owns any animals. Then, allow one row in the template for each respondent.
Step 2. Select the first column you will format for data entry.
Select the first column that should only accept certain types of data. For example, here I have selected the ID column. Be sure to select the entire column by clicking on the letter of the column (“B” in this example), rather than simply highlighting some of the cells in the column. Then go to the “Data” menu and choose “Validation”. The Data Validation dialog box will appear.
Step 3. Specify the type of data you want the column to accept or allow.
Now that you’re inside the Data Validation dialog box, select from the drop-down menu what type of data you’d like to allow the cells in this column to accept. You will get to specify in the next step which specific values within that data type. For the data in our “ID” variable, we want the data entry people to be able to enter only a whole number between 1 and 1000, since these are all the ID numbers of the people in our group of respondents.
Step 4. Specify the specific values of the data the cell will allow.
Now that we have selected the “Whole number” option, additional choices appear. We can, if we want, specify a specific list of whole numbers that the cells will accept. We can set it to accept numbers “between” or “higher than” or lower than” etc.
I’m setting at whole numbers between 1 and 1000.
Step 5. Repeat steps 2 through 4 with each of the columns that you want to control. (It is fine to select many columns at once that will accept the same types of data.)
In the column for the variable “Sex” I chose the “List” option rather than the “Whole Number” option. Then I can enter a list of anything I choose. For this example, I type M,F into the “Source” field. Anything I type into this “Source” space will be allowed to be entered into the cells in this column. Separate each item in the list with a comma.
When I try to enter a letter in the ID column I get an error message, which is a sign that my cells are protected. This is good.
Want More Data Insights and Advice?
At Datassist, our goal is to provide real-world answers to unique data-based questions and help our partners turn data into stories and pictures people talk about.
If you’d like more data tools and tutorials, take time right now to:
- Sign up for monthly data science and visualization resources.
- Check out this recently published book that features Datassist work,
- Download this Datassist Road Map to a Great Survey
- Take a look at some of the work we have done.
- Friend us on Facebook for a personal connection with what we do.
- Follow us on Twitter for the latest news.
- Get inspired with us on Pinterest.