It has been a recurrent sentiment in my corner of Twitter that surveys are useless for predicting what people will do — or why — because one can't trust people to tell the truth. Jason Oke, then a planner at Leo Burnett, blogged about it in 2007. Faris Yakob threw a bomb of a blog post in 2010. Richard Shotton wrote about it in 2017.
I happily collect my data using observations and experiments. I love digging through search queries (come back another time for a story about that). I've tried just about every implicit method there is. But often, just asking people questions works pretty well, too.
To be sure, there are plenty of reasons to be skeptical about survey results. I wouldn't base my life's decisions on the results from a survey that asks questions this way:
I'd also disregard the data from any survey that looked like this to its respondents:
But let's not throw away a perfectly sharp tool just because other people keep grabbing it by the wrong end.
The concerns about respondents' biases are as legitimate as they are well known. How to ask questions in a way that produces reliable data is literally a science. It has its own experiments, discoveries, and textbooks. (I can recommend three: The Psychology of Survey Response and Asking Questions are more theoretical, and The Complete Guide to Writing Questionnaires is my desk reference.)
Look at the description of Asking Questions:
Drawing on classic and modern research from cognitive psychology, social psychology, and survey methodology, this book examines the psychological roots of survey data, how survey responses are formulated, and how seemingly unimportant features of the survey can affect the answers obtained. Topics include the comprehension of survey questions, the recall of relevant facts and beliefs, estimation and inferential processes people use to answer survey questions, the sources of the apparent instability of public opinion, the difficulties in getting responses into the required format, and distortions introduced into surveys by deliberate misreporting.
Philip Graves, "a consumer behaviour consultant, author and speaker," takes a dim view of market research surveys in his 2013 book Consumerology. (Faris in 2010 and Richard in 2017 both mention this book in their posts.) Among other things, Graves writes that "attempts to use market research as a forecasting tool are notoriously unreliable, and yet the practice continues."
He then uses political polling as an example of an unreliable forecasting tool. He does not elaborate beyond this one paragraph (p.178).
I'm glad he wrote this.
First, horse race polls ask exactly the forward-looking "what will you do" kind of question that people, presumably, should not be able to answer in any meaningful way. Here's how these questions usually look:
If the presidential election were being held TODAY, would you vote for
- the Republican ticket of Mitt Romney and Paul Ryan
- the Democratic ticket of Barack Obama and Joe Biden
- the Libertarian Party ticket headed by Gary Johnson
- the Green Party ticket headed by Jill Stein
- other candidate
- don’t know
Second, in election polling, there's nowhere to hide. The data and the forecasts are out there, and so, eventually, are the actual results.
And so, every two and four years, we all get a chance to gauge how good surveys are at forecasting people's future decisions.
Here's a track record of polls in the US presidential elections between 1968 and 2012. FiveThirtyEight explains: "On average, the polls have been off by 2 percentage points, whether because the race moved in the final days or because the polls were simply wrong."
Overall, you can expect 81% of polls to pick the winner correctly.
The closer to Election Day a poll is conducted, the more accurate it tends to be.
"The chart shows how much the polling average at each point of the election cycle has differed from the final result. Each gray line represents a presidential election since 1980. The bright green line represents the average difference." (NYTimes, June 2016)
What about the 2016 polls? The final national polls were not far from the actual vote shares.
"Given the sample sizes and underlying margins of error in these polls, most of these polls were not that far from the actual result. In only two cases was any bias in the poll statistically significant. The Los Angeles Times/USC poll, which had Trump with a national lead throughout the campaign, and the NBC News/Survey Monkey poll, which overestimated Clinton’s share of the vote." (The Washington Post, December 2016)
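To make the "margin of error" point concrete, here is a minimal sketch of the standard 95% margin-of-error calculation for a simple random sample. The numbers below are illustrative, not taken from any specific 2016 poll:

```python
import math

def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """95% margin of error, in percentage points, for a candidate
    polling at share p with a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n) * 100

# A hypothetical 1,000-person poll showing a candidate at 48%:
moe = margin_of_error(0.48, 1000)
print(f"+/- {moe:.1f} points")  # roughly +/- 3.1 points
```

With a typical national sample of about a thousand respondents, each candidate's share carries roughly a three-point margin of error — which is why a final poll a point or two off the actual vote share counts as a normal result, not a failure.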
So why, then, was Trump's win such a surprise to everyone?
"There is a fast-building meme that Donald Trump’s surprising win on Tuesday reflected a failure of the polls. This is wrong. The story of 2016 is not one of poll failure. It is a story of interpretive failure and a media environment that made it almost taboo to even suggest that Donald Trump had a real chance to win the election." (RealClearPolitics, November 2016)
In an experiment conducted by The Upshot, four teams of analysts looked at the same polling data from Florida.
"The pollsters made different decisions in adjusting the sample and identifying likely voters. The result was four different electorates, and four different results." In other words, a failure to interpret the data correctly.
(Here's a primer on how pollsters select likely voters.)
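The Upshot experiment is easy to reproduce in miniature: the same raw responses produce different toplines depending on which likely-voter screen you apply. This is a toy sketch — the respondents, thresholds, and screens are invented, not any pollster's actual model:

```python
# Hypothetical respondents:
# (vote_choice, self-reported likelihood to vote 0-10, voted in last election)
respondents = [
    ("A", 10, True), ("A", 9, True), ("A", 5, False),
    ("B", 10, True), ("B", 7, False), ("B", 6, True), ("B", 4, False),
]

def topline(voters):
    """Percent share for each candidate among the screened voters."""
    a = sum(1 for v in voters if v[0] == "A")
    b = sum(1 for v in voters if v[0] == "B")
    total = a + b
    return round(100 * a / total), round(100 * b / total)

# Screen 1: anyone at least somewhat likely to vote (stated likelihood >= 5)
screen1 = [r for r in respondents if r[1] >= 5]
# Screen 2: high stated likelihood AND a past-vote record
screen2 = [r for r in respondents if r[1] >= 7 and r[2]]

print(topline(screen1))  # (50, 50) — a tied race
print(topline(screen2))  # (67, 33) — a blowout, from the same raw data
```

Same answers, two defensible screens, two very different "electorates" — which is the interpretive judgment call the four Florida teams made differently.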
Nate Silver's list of what went wrong:
- a pervasive groupthink among media elites
- an unhealthy obsession with the insider’s view of politics
- a lack of analytical rigor
- a failure to appreciate uncertainty
- a sluggishness to self-correct when new evidence contradicts pre-existing beliefs
- a narrow viewpoint that lacks perspective from the longer arc of American history.
In other words, when a survey doesn't work, you must be holding it wrong.