Want More Accurate Polls? Maybe Ask Twitter
In a Public Policy Polling survey, quite a few Texans say they’ll vote for Harambe for president in November. If you haven’t looked at the Internet in a while, Harambe was a gorilla fatally shot by a zookeeper after a toddler fell into his pen, but he’s more than that. He’s a meme, and his candidacy in Texas represents the voice of the Internet insinuating its way into polling. It’s silly, but it’s actually a sign of positive change.
Traditional polling methods aren’t working the way they used to. Upstart analytics firms like Civis and conventional pollsters like PPP, Ipsos, and Pew Research Center have all been hunting for new, more data-centric ways to uncover the will of the whole public, rather than just the tiny slice willing to answer a random call on their landline. The trending solution is to incorporate data mined from the Internet, especially from social media. It’s a crucial, overdue shift. Even though the Internet is a cesspool of trolls, it’s also where millions of Americans go to express opinions that pollsters might not even think to ask about.
How It Works
People have tweeted about Donald Trump over 22 million times since the Republican National Convention. Data-wise, that’s an analyst’s dream. “Twitter data is relatively easy to get, and quantity has a quality all of its own,” says Seth Redmore, CMO at sentiment analysis company Lexalytics, which has conducted political polls for The Boston Globe.
To put the data to use, analysts pull together all of the relevant mentions (names and handles are the obvious ones, but hashtags and memes can get parsed, too) using boolean search queries, which are basically just keyword searches that use the operators “or,” “and,” and “not” to refine results. Then they filter for sentiment using highly accurate natural language processing algorithms. “Machines are much better at doing sentiment analysis than humans now, especially on a large scale,” says Apoorv Agarwal, a computer scientist at Columbia University. That’s why there are whole companies devoted to mining social media text for clues to rising trends, often for marketing and stock research.
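For a rough sense of what those two steps look like in practice, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the sample posts, the `matches` and `sentiment` helpers, and the tiny word lists are all hypothetical. In particular, the word-counting score is a deliberately crude stand-in for the trained language models that real sentiment analysis firms use, not a description of their methods.

```python
import re

# Hypothetical sample posts; a real pipeline would pull these from a social media feed.
POSTS = [
    "I love Trump's speech at the #RNC",
    "Trump is a terrible choice",
    "Harambe for president!",
    "Great weather today",
    "This Trump card game is great",
]

# Toy word lists (an assumption for illustration, not a real sentiment lexicon).
POSITIVE = {"love", "great", "good"}
NEGATIVE = {"terrible", "hate", "bad"}


def tokens(text):
    """Lowercase a post and split it into a set of words, hashtags, and handles."""
    return set(re.findall(r"[#@]?\w+", text.lower()))


def matches(text, any_of=(), all_of=(), none_of=()):
    """Boolean keyword search: OR over any_of, AND over all_of, NOT over none_of."""
    words = tokens(text)
    if any_of and not any(term in words for term in any_of):
        return False
    if not all(term in words for term in all_of):
        return False
    if any(term in words for term in none_of):
        return False
    return True


def sentiment(text):
    """Crude score: positive words minus negative words found in the post."""
    words = tokens(text)
    return len(words & POSITIVE) - len(words & NEGATIVE)


# Query: (trump OR #rnc) AND NOT card
selected = [p for p in POSTS if matches(p, any_of=("trump", "#rnc"), none_of=("card",))]
scores = [sentiment(p) for p in selected]

for post, score in zip(selected, scores):
    print(score, post)
```

The “not” term is what keeps the card-game post out of the results even though it mentions the candidate’s name, which is the kind of refinement boolean queries are used for. A word list like this one cannot handle sarcasm, negation, or memes, which is part of why analysts lean on more sophisticated language processing.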