Twitter-based trading strategies (Yep, you read that right!)

To many, this might sound like the latest fad, and we all know how quickly the financial sector gets hooked on fads. However, before we start debating, we should determine exactly what we mean by a Twitter-based trading strategy. In my opinion, because Twitter has so many possible uses, the term actually refers to several practices:

  • The Twitter Alarm (short-term trends): In my view, this should be a no-brainer for traders. Say you're trading Apple stock and news breaks about a strike at one of its screen suppliers' factories. You want to get that information as quickly as possible. Well, more and more, news is likely to break on Twitter first. Hence, you need an alarm that sounds when thousands of tweeters suddenly mention Apple. Dismissing this strategy because Twitter "is not trustworthy" is dumb in this case, because such a large burst of tweets about Apple is likely to be based on some sort of news. What if it's a retweeted false rumor? It might be. And you might want to wait for the AFP report before selling. But that's where personal judgement comes in: Twitter is a tool. You are the user.
  • Twitter sentiment indexes (long-term trends): To be clear, by long term I don't mean 5-year trends but rather 3-month trends (quarters). I've talked about this technique before. It consists of using Twitter to model tweeters' sentiment over a period of time and using that model to predict the direction in which larger financial indexes will move. There are two premises here: first, that one can determine sentiment from the words used in tweets; and second, that this sentiment affects major financial indexes or stock prices indirectly, by steering the traditional macroeconomic equilibrium. This is the strategy I'd like to discuss here.
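
To make the "Twitter Alarm" idea concrete, here is a minimal sketch in Python. It assumes tweets arrive as (timestamp in seconds, text) pairs in chronological order; the class name `MentionAlarm`, its parameters and the sample tweets are all hypothetical, invented for illustration, and no real Twitter API is involved. It simply counts how many recent tweets mention a keyword and reports when that count reaches a threshold.

```python
from collections import deque


class MentionAlarm:
    """Hypothetical alarm: fires when enough recent tweets mention a keyword."""

    def __init__(self, keyword, threshold, window_seconds):
        self.keyword = keyword.lower()
        self.threshold = threshold            # mentions needed to sound the alarm
        self.window_seconds = window_seconds  # how far back "recent" goes
        self._timestamps = deque()            # times of recent matching tweets

    def observe(self, timestamp, text):
        """Record one tweet; return True if the alarm should sound.

        Assumes timestamps never decrease from one call to the next.
        """
        # Naive substring match: "apple" would also match "pineapple".
        if self.keyword in text.lower():
            self._timestamps.append(timestamp)
        # Forget mentions that have fallen out of the time window.
        while self._timestamps and timestamp - self._timestamps[0] > self.window_seconds:
            self._timestamps.popleft()
        return len(self._timestamps) >= self.threshold


# Made-up stream of (seconds, text) pairs, for illustration only.
alarm = MentionAlarm("apple", threshold=3, window_seconds=60)
stream = [
    (0, "Apple supplier on strike?"),
    (10, "What a nice lunch"),
    (20, "Hearing about a strike at an apple screen factory"),
    (30, "Strike confirmed at Apple screen supplier"),
]
fired = [alarm.observe(t, text) for t, text in stream]
print(fired)
```

The alarm only says that something is being talked about. Whether it is news or a retweeted rumor is still left to the trader's judgement.
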
In the following points, I'll go through everything that strikes me as fishy about Twitter-sentiment-based trading strategies:
  1. Semantics-related problems: I'm starting with the biggest issue, which is currently puzzling NLP (Natural Language Processing) specialists. It sits halfway between linguistics and programming: a machine cannot process sarcasm, and when you ask it to track "oil" on Twitter, it'll also pick up the tweets about the "oil" Justin Bieber dropped on his shirt that afternoon (the Bieber distortion). Beyond these obvious problems, however, lies a more profound one: what is sentiment, and how are you processing it? Because if you're looking for an index that indicates bad mood when it goes down and good mood when it goes up, I've got bad news for you: human sentiment is not binary. If you're trying to build a social-emotional predictor of the S&P 500, building it the same way the S&P is built, as a single number that goes up or down, is wrong. The now famous "Twitter predicts the stock market" study tries to circumvent the problem but fails, in my opinion. Sentiment is not an up or a down that will eventually determine the mood of the financial market.
  2. Getting quality out of quantity: This refers to the first premise above (one can determine sentiment from the words used in tweets). What this premise implies is that the words used by tweeters actually indicate sentiment. General sentiment. The mood of a sample of the population. I can conceive of a sentiment index about a brand, a concept or a person, but a general sentiment that would in turn tip us off about which way the S&P will go is too all-encompassing, and too reductive of human emotions, for me to fathom. In other words, I don't think general sentiment means anything.
  3. Not stand-alone: At this stage, you might be thinking "Yeah, I'll use the Twitter alarm, but this sentiment thing is fishy". You're right. Still, as long as it's not used as a stand-alone solution, I believe sentiment can be a great add-on to a trading strategy. You can use free tools such as Twittrading, which has the right approach since it offers a specific index for each of several stocks.
  4. People will game the system: But then doubts arise. What if it becomes too mainstream? What if people realize that funds and traders are actually sonar-scanning their tweets for insight? Won't they just start writing misleading rubbish to fool the machines? You bet they will. But no one said the semantics game was a risk-free one. Machines are already making mistakes. Here's one example of a tweet the Twittrading algorithm used to determine the Apple sentiment index (Yep):
  5. How representative: What you should consider is that this is sentiment based on tweeters' tweets. These are individuals who are rather accustomed to technology, more or less "wired". That is why you're more likely to get an accurate Apple sentiment index than a "Saint Gobain" or "Berkshire Hathaway" sentiment index. Tweeters are simply more interested in the former than in the latter. And even if you do find semantic data about the latter, John Glasgow points out on Quora: "Retail trading accounts for roughly 11% of all trading volume, so individual investors have a small weight in overall stock prices."
  6. How predictive: This is about the second premise cited above (sentiment affects major financial indexes or stock prices indirectly, by steering the traditional macroeconomic equilibrium). Again, John Glasgow on Quora narrows it down to a question: "If Coca Cola has a lot of negative tweets, does that actually mean their sales will decrease?". I think this is a question worth pondering.
  7. It's working for movies! It's true. Look at Fflick (well, now it's Google's property ... too) and its accurate predictions of upcoming box-office hits based on tweets. But I see the stock market as much more complex than the movie market. There, tweets reveal future viewership and hence future revenue ("Harry Potter 8's last trailer looks awesome!" usually means "I really want to see this movie"), whereas tweets don't reveal any intent to buy a stock.
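
The first premise (words in tweets reveal sentiment) and the semantic problems from point 1 can be illustrated with a deliberately naive sketch in Python. Everything here is an assumption made for illustration: the tiny word lists, the function names `tweet_score` and `sentiment_index`, and the three sample tweets are invented, and this is not how Twittrading or any real tool computes its index. The sketch scores each tweet as positive words minus negative words and averages the scores of the tweets mentioning a keyword.

```python
import re

# Hypothetical toy lexicon, invented for illustration only.
POSITIVE = {"great", "love", "awesome"}
NEGATIVE = {"bad", "hate", "awful"}


def tweet_score(text):
    """Number of positive words minus number of negative words in one tweet."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def sentiment_index(tweets, keyword):
    """Average score of the tweets that mention the keyword (0.0 if none do)."""
    matching = [t for t in tweets if keyword.lower() in t.lower()]
    if not matching:
        return 0.0
    return sum(tweet_score(t) for t in matching) / len(matching)


# Made-up tweets, for illustration only.
tweets = [
    "Oil prices look awful this week",             # relevant and negative
    "Great, oil is up again. Just great.",         # sarcastic, yet scored as positive
    "Love this shirt, shame about the oil stain",  # unrelated to the commodity, still counted
]
index = sentiment_index(tweets, "oil")
print(index)
```

The index comes out positive even though the only sincere tweet about the commodity is negative: the sarcastic tweet and the shirt tweet both push it up. That is the sarcasm problem and the Bieber distortion in miniature.
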
But whatever we write and say, it's already being done! Pluga Al Fund is a fund that uses blogs to gauge sentiment, and Thomson Reuters and Dow Jones are using Lexalytics (a social media analysis company) to bring a new kind of insight to their clients. The New York Times states that "according to Aite Group, a financial services consulting company, about 35 percent of quantitative trading firms are exploring whether to use unstructured data feeds".
Why is this happening? What happened to the good old Benjamin Graham techniques? Well, they're still here. Only, since our lives are moving into the digital sphere, we should start "listening" to that sphere more attentively than ever.