On election day, according to the majority of pollsters, Hillary Clinton seemed to have it in the bag. Nate Silver put Clinton's chances of winning at 71%. A Clinton win was also predicted by 61 other national polls, including Fox, Reuters, Economist/YouGov, USA TODAY/Suffolk, and NBC News/SM. Indeed, as the votes started coming in on election night, the shock among newscasters was palpable as their precious data proved so wrong.
So, how did beloved data whiz Nate Silver and 61 other national polls mess up so “bigly”? Did the polls happen too early, before the impact of the additional Clinton email scandal hit? Did voter turnout play a part? Were the samples flawed? Or was it something else – something that perhaps speaks to deeper gaps in our understanding of human behavior and flaws in our overall approach to polling?
Public vs. Private Personas
One theory in circulation is that many of those polled who said they were not going to vote for Trump were not telling the truth. Indeed, polling firm Morning Consult reported that voters were more likely to say they wanted Trump in the White House in an anonymous online survey than in a live telephone interview. Meanwhile, Arie Kapteyn, director of the University of Southern California (USC) Dornsife Center for Economic and Social Research, which jointly runs the Los Angeles Times/University of Southern California tracking poll (one of the few polls that did predict Trump as the winner), told reporters, “There’s some suggestion that Clinton supporters are more likely to say they’re a Clinton supporter than Trump supporters are to say they’re a Trump supporter.”
This theory that there was a discrepancy between what Trump supporters shared publicly and what they did privately is also supported by our Outbrain study, authored by Roy Sassons, Outbrain’s Chief Data Technologist, and Ram Meshulam, Outbrain’s Organic Recommendations Team Lead. The study revealed that a significant gap exists between what we read privately and what we share publicly with friends.
Sassons and Meshulam’s research drew on two billion views of several hundred thousand articles from the world’s largest content and news websites. These views were observed over three time periods between 2015 and 2016. Online content was split into categories using a natural language algorithm, and each category was rated according to the ratio between the number of readers who read an article about a certain topic and the number of readers who shared the article on Facebook.
What this data ultimately revealed is that categories that received many views and few shares did not reflect well on the personality of those who consumed them. In fact, most of them could be considered superficial, sexual, or blunt.
On the other hand, categories that received many shares and few views were much more flattering. They tended to be deeper and more intellectual.
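To make the rating concrete, here is a minimal sketch of how a views-to-shares ratio per category could be computed. It is not the code used in the Outbrain study: the function name, the category labels, and every number below are made up for illustration, and the sketch assumes each article has already been assigned a category.

```python
from collections import defaultdict


def view_to_share_ratios(articles):
    """Return {category: total_views / total_shares}, highest ratio first.

    `articles` is an iterable of (category, views, shares) tuples.
    This layout is an assumption made for the sketch.
    """
    views = defaultdict(int)
    shares = defaultdict(int)
    for category, n_views, n_shares in articles:
        views[category] += n_views
        shares[category] += n_shares

    ratios = {
        category: views[category] / shares[category]
        for category in views
        if shares[category] > 0  # skip never-shared categories to avoid dividing by zero
    }
    # A high ratio means "read a lot, shared rarely"; a low ratio means the opposite.
    return dict(sorted(ratios.items(), key=lambda item: item[1], reverse=True))


# Hypothetical figures, for illustration only
sample_articles = [
    ("celebrity gossip", 90_000, 300),
    ("celebrity gossip", 60_000, 200),
    ("science", 10_000, 500),
    ("science", 5_000, 250),
]

ratios = view_to_share_ratios(sample_articles)
for category, ratio in ratios.items():
    print(f"{category}: {ratio:.0f} views per share")
```

In this toy data set, the gossip category lands at the "privately read" end of the scale and the science category at the "publicly shared" end, which mirrors the pattern described above.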
And Outbrain’s data also revealed that this wasn’t isolated to a certain country or region – it was a global pattern. (Which might also explain why polls were so wrong regarding the controversial Brexit vote in the United Kingdom.)
Take, for example, specific content like “Game of Thrones.” When Sassons and Meshulam did a deeper dive on this content, they found that the articles shared the most on Facebook focused on profound analysis and storyline prediction. Additionally, articles with a feminist spin that focused on strong female characters were also extensively shared.
Meanwhile, although the articles people read the most focused on the show’s violent and sexual scenes, readers rarely shared these stories on their Facebook feeds.
The Silent Trump Supporters
Call it what you will: the “Game of Thrones Effect,” the “Trump Effect,” the “Brexit Effect”; it all adds up to the same thing. When it comes down to it, what we tend to share and present to the world is ultimately very different from who we are privately. This is something that the majority of election polls seemed to neglect in their analysis.
Further, if you couple this with our current social network-dominated society, in which the ordinary reader has tremendous power, sometimes equivalent to that of a magazine editor, you can also see why so many in the media world – Nate Silver, the pollsters, the bewildered newscasters reading the results on TV – may have felt like the rug was pulled out from under them.
In the age of social media, we decide what content we see based on our surroundings. The people connected to us in our network are exposed to the content we share and that, in turn, dictates what is exposed to others in their network. Since that decision is not always made based on the level of interest we really have in that specific content, the current “world map of interests” that everyone sees becomes distorted. This, in turn, eventually feeds the algorithms that recommend more content to us.
A recommendation system, be it human or automatic, must take this into consideration. It should find creative ways to fix these biases, or balance these representations with rich data from a source that reflects readers’ true preferences. Otherwise, expect more polling misses ahead.