There is no doubt that social networks are increasingly present in electoral campaigns, and the 2012 American Presidential election is no exception. Voices from the Blogs (VfB), the social-media observatory of Università degli Studi di Milano, has been tracking since the end of September the sentiment towards the two Presidential candidates for the online edition of Corriere della Sera, one of the most widely read Italian newspapers. VfB has already successfully analyzed other national elections, such as the French Presidential one.
It may seem a bit risky for VfB to play this game against competitors like the Twitter Political Index (or Twindex) by Topsy and Twitter itself, but it is an interesting match, because our Sentimeter index is computed in a different way. Indeed, as we will show below, Twindex, the “official” barometer of Twitter sentiment for the 2012 Presidential election, and the Sentimeter index often disagree, both with each other and with traditional electoral surveys. It is therefore useful to clarify how they differ and which insights each conveys: a conditio sine qua non for getting the most out of the two. So, let’s dig in.
For simplicity, let us denote the Twindex by “T” and VfB’s Sentimeter index by “S”. For the T index and its specifications, our discussion relies on the information available on the official Twindex page. For the sentiment-analysis techniques behind VfB, we refer to the scientific literature and, in particular, to Hopkins and King’s method.
Apples and oranges
T: For each of the two candidates, the T index is the proportion of positive tweets out of the total number of positive and negative tweets about that candidate.
S: We constructed an index based on the propensity to vote (i.e., to express a positive sentiment) for each candidate, using four categories: Obama, Romney, Others, Uncertain. Only explicit statements in favor of a candidate are counted as valid. Even after removing “Others” and “Uncertain” from S, the two indices are therefore not equivalent, because S does not count a tweet opposing one candidate as a vote (i.e., a positive sentiment) in favor of the other. For example, a tweet saying “do not vote for Romney” does not necessarily imply that its author will then vote for Obama.
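To make the apples-and-oranges point concrete, here is a minimal sketch in Python of how the two constructions differ. The function names and the toy data are invented for illustration: T is a positive-versus-negative share per candidate, while S is a share over four propensity categories in which opposition to one candidate counts for nobody.

```python
from collections import Counter

def t_index(labels, candidate):
    """Topsy-style share: positive tweets about a candidate over
    positive + negative tweets about that same candidate."""
    pos = sum(1 for c, s in labels if c == candidate and s == "positive")
    neg = sum(1 for c, s in labels if c == candidate and s == "negative")
    return pos / (pos + neg) if pos + neg else None

def s_index(votes):
    """VfB-style propensity: share of each of the four categories
    (only explicit statements in favor of a candidate count)."""
    counts = Counter(votes)
    total = sum(counts.values())
    return {cat: counts[cat] / total
            for cat in ("Obama", "Romney", "Others", "Uncertain")}

# Toy data: (candidate the tweet is about, its polarity) for T,
# and the hand-coded voting propensity of each tweet for S.
t_labels = [("Romney", "negative"), ("Obama", "positive"),
            ("Obama", "negative"), ("Romney", "positive")]
s_votes = ["Obama", "Uncertain", "Obama", "Romney"]

print(t_index(t_labels, "Obama"))  # 0.5
print(s_index(s_votes))  # Obama 0.5, Romney 0.25, Others 0.0, Uncertain 0.25
```

Note that in the S data an anti-Romney tweet would be coded as “Uncertain” or “Others” unless it explicitly endorses Obama.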
Input data and calculation method
Every day, the volume of Twitter posts generated all over the world is around 400 million.
T: The T index is built from a sample of daily tweets generated on American soil, about 2 million tweets a week in total. The analysis is updated at 8 pm ET. The published figure is a 3-day moving average.
S: For each analysis, the index is built from a sample of tweets generated in the 50 states, about 1 million posts a day in total. The overall aggregate shown in the online USA Elections special is a weighted average of the propensities to vote, with weights proportional to the posts collected in each state. The analysis is updated at 7 am and 7 pm, Italian local time. Each figure is a 7-day moving average.
So T is built to react faster than S, while S is designed to capture the trend in the medium term.
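The effect of the two smoothing windows can be sketched as follows. The series and the function are illustrative, not the actual pipelines; the point is simply that a 3-day trailing average absorbs a shift in opinion faster than a 7-day one.

```python
def moving_average(series, window):
    """Trailing moving average; shorter windows react faster to shifts."""
    out = []
    for i in range(len(series)):
        lo = max(0, i - window + 1)
        chunk = series[lo:i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

# Invented daily sentiment values with a sudden jump on day 5.
daily = [50, 50, 50, 50, 60, 60, 60]

m3 = moving_average(daily, 3)  # T-style smoothing
m7 = moving_average(daily, 7)  # S-style smoothing
print(m3)  # reaches 60.0 by the last day
print(m7)  # still below 55 on the last day
```

Two days after the jump the 3-day average has fully caught up, while the 7-day average is still pulled down by the earlier values: the medium-term trend by design.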
Method of textual analysis
T: To determine positive and negative feelings, Topsy uses techniques based on complex semantic rules that can catch even humorous phrases and nuances of language, relying on ontological dictionaries containing several thousand words.
S: The S index uses no ontological dictionaries or predefined semantic rules. A statistically significant subset of the roughly 1 million tweets downloaded (about 800–1,000 per analysis) is read and coded manually. Starting from this hand-coded set, Hopkins and King’s algorithm extrapolates the patterns that emerge naturally from it to estimate the propensity toward the candidates in the entire data set.
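As a rough illustration of the idea behind Hopkins and King’s method: rather than classifying each tweet, it estimates the category proportions directly, by solving P(pattern) = P(pattern | category) · π for the proportions π, where the pattern-given-category profiles come from the hand-coded subset. The numbers below are invented and the setup is drastically simplified (the real algorithm works with many word-stem patterns and proper constrained estimation); this sketch only shows the linear-system core.

```python
import numpy as np

# Word-stem-pattern profiles estimated from the hand-coded subset:
# rows = patterns, columns = categories (Obama, Romney, Others,
# Uncertain). Each column is a distribution over patterns. Invented.
P_w_given_c = np.array([
    [0.40, 0.10, 0.20, 0.30],
    [0.10, 0.50, 0.20, 0.20],
    [0.30, 0.20, 0.40, 0.10],
    [0.20, 0.20, 0.20, 0.40],
])

# Pretend category proportions in the full corpus, and the pattern
# distribution they would generate (in practice P_w is observed).
true_pi = np.array([0.45, 0.40, 0.05, 0.10])
P_w = P_w_given_c @ true_pi

# Solve P(w) = P(w|c) @ pi for the category proportions pi, then
# clip to non-negative values and renormalize onto the simplex.
pi, *_ = np.linalg.lstsq(P_w_given_c, P_w, rcond=None)
pi = np.clip(pi, 0, None)
pi /= pi.sum()
print(pi.round(2))  # recovers the invented proportions
```

The key property, which the toy example preserves, is that no individual tweet is ever assigned a label: only the aggregate proportions are estimated.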
The best coder is the human
The difference in approach is substantive: reading the tweets directly allows one to follow more closely and accurately not only the nuances and double meanings in the text, but also changes in the natural language used by the tweeters. Think of the term “bayonet”, which emerged as a hot term after the last presidential debate.
The focus of investigation
Reading a portion of the text, as VfB does, also allows one to focus the analysis on different targets. For example, within hours of the third television debate, VfB conducted two parallel analyses. The first concerned Americans’ judgment of the two candidates’ performance in the debate; here the current president, Obama, emerged as the clear winner (60.3% of positive judgments versus 39.7% for Romney). Still, this judgment per se does not necessarily affect the propensity to vote: “Come on! Romney stand up and show who you are” is likely the tweet of a pro-Romney voter who judges his or her candidate’s performance as unsatisfactory. In fact, the S index for the same day showed Romney at 43.2% versus 40.1% for Obama.
What about the two indexes and the standard polls?
Although both T and S measure only the sentiment towards the candidates in the American Presidential race on Twitter, rather than any voting intention, it remains an interesting exercise to compare them with more traditional electoral surveys. For this we rely on the index that RealClearPolitics publishes each day by averaging all published national electoral surveys. According to RealClearPolitics, over the last three weeks (8 October – 28 October) Romney has led for 17 days, with an overall average lead of 0.6 points. According to T, Obama has always led, except twice (the 13th and the 28th of October), with a quite comfortable overall average lead of 7.7 points. According to S, on the contrary, Romney and Obama have split the lead almost equally (10 days for Romney, 11 for Obama), while over the entire period Romney leads on average by 1.2 points. The correlations between the three measures, computed on the predicted difference in vote shares (i.e., vote share of Romney minus vote share of Obama), are also radically different: −.38 for RealClearPolitics vs. T, +.37 for RealClearPolitics vs. S, while the correlation between T and S over the last 21 days (−.08) was not significant.
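The correlation figures above are plain Pearson correlations between daily Romney-minus-Obama margin series. A sketch with invented margins (not the real data) shows the computation; the sign tells whether two trackers move together day by day:

```python
import statistics

def pearson(x, y):
    """Pearson correlation between two equal-length daily series."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Invented daily Romney-minus-Obama margins for three trackers:
# a poll average, a T-like series, and an S-like series.
rcp = [0.5, 0.8, 0.3, 1.0, 0.6, 0.9, 0.2]
t   = [-8.0, -7.5, -8.2, -9.0, -7.0, -8.5, -6.8]
s   = [1.0, 1.5, 0.8, 1.8, 1.1, 1.6, 0.5]

print(round(pearson(rcp, t), 2))  # negative: they move in opposition
print(round(pearson(rcp, s), 2))  # positive: they move together
```

Note that the correlation compares daily movements, not levels: a tracker can be far off in level (as T is here) yet still correlate, positively or negatively, with the polls.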
So, which one is able to predict USA elections better?
Probably neither of the two. Indeed, as Topsy remarks, “the result is analogous to describing the general tone of political discussion overheard in a café on a given day”; still, it is intriguing to observe these differences.