Creating a sentiment signal for publicly traded firms by using IBM Watson on Bluemix
In today's world, we are overwhelmed with statistics about the amount of unstructured information that is created and published on the Internet daily. For example, on any given day there can be hundreds of traditional media articles and hundreds of thousands of tweets about the coffee giant, Starbucks. To extract sentiment information related to Starbucks from this overwhelming volume of data, we look to IBM Watson to calculate meaningful insight and summarize it so we can understand the trend of the data at a single glance. To create this sentiment signal, we use Bluemix as our platform as a service to run our system, and we integrate several APIs, including the Watson Natural Language Understanding API. In addition to Starbucks, we targeted 15 firms across three different sectors. After we demonstrate how to create and browse a sentiment signal for a given firm, we show two examples of how you can use this information in equity research.
Anatomy of the sentiment signal
Twitter has been a go-to source of public sentiment posted on the Internet since its explosive burst onto the scene at the 2007 SXSWi conference. One of the reasons it is so attractive to people who study the opinions of Internet denizens is its fantastic API, which lets programmers systematically download and process information from the site. Historical data is also available, which means you can build signals that extend back in time and look at prior performance. For our project, we gathered our tweets by using the GNIP PowerTrack syntax made available through the IBM Insights for Twitter Bluemix service. For each firm of interest in our study, we created a PowerTrack rule to maximize the relevance of the tweets to that firm. A syntax reference for PowerTrack rules is available on the GNIP website. PowerTrack rules are powerful in that they catch tweets about firms of interest in near real time, creating a signal that continuously updates itself automatically.
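As a purely illustrative sketch (not one of our actual rules), a PowerTrack rule for Starbucks might combine the company name, cashtag, and handle while filtering to English and excluding retweets; the exact operators available depend on your PowerTrack version, so check the GNIP syntax reference:

```
("Starbucks" OR $SBUX OR @Starbucks) lang:en -is:retweet
```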
To mine the web media content, we used a traditional news crawler that was seeded with a white list of trusted news sources. After each article was crawled, relevant metadata about the article was extracted, including:
- Publish date
The web-crawled data is encapsulated by an API call. We can query for explicit examples of firm mentions to display on the Bluemix-hosted user interface, or query for the aggregate number of firm mentions that are considered positive (sentiment score greater than zero), neutral (sentiment score equal to zero), or negative (sentiment score less than zero).
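The bucketing behind the aggregate query is simple to sketch. Below is a minimal, hypothetical Python helper (the function names are ours, not part of the actual API) that maps a sentiment score to a category and counts mentions per category as described above:

```python
def sentiment_bucket(score):
    """Map a numeric sentiment score to the three categories used by the
    aggregate query: positive (> 0), neutral (== 0), negative (< 0)."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

def aggregate_mentions(scores):
    """Count firm mentions per category, as the aggregate query returns."""
    counts = {"positive": 0, "neutral": 0, "negative": 0}
    for s in scores:
        counts[sentiment_bucket(s)] += 1
    return counts
```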
As you might imagine, the web content store is very large, with over 80 million articles in the database.
Augmenting tweets and web content with Watson
As the tweets pour in through the PowerTrack rules and we crawl the web, we augment the text with Watson Sentiment Analysis, which is available through the Watson Natural Language Understanding service. The data is classified into three categories: positive, neutral, and negative. A signal with a daily frequency is created by aggregating the number of positive, neutral, and negative mentions on a specific day, t, and is defined by the equation s(t) = (np(t) − nn(t)) / (np(t) + nn(t) + nm(t)), where:
- np(t) = Number of positive mentions on a specific day t
- nn(t) = Number of negative mentions on a specific day t
- nm(t) = Number of neutral mentions on a specific day t
This calculation happens at a fixed time every day, and the new daily aggregation numbers are available on the user interface immediately. The data is continuously being collected and is available for analysis for select clients. As you can see from the equation, the signal is normalized, and therefore results in a distinctly different signal than one obtained by analyzing Twitter volume for the firms of interest.
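The daily aggregation can be sketched in a few lines of Python, assuming the normalization s(t) = (np(t) − nn(t)) / (np(t) + nn(t) + nm(t)) implied by the description above (net positive mentions over total mentions):

```python
def daily_signal(n_pos, n_neg, n_neu):
    """Normalized daily sentiment signal: net positive mentions divided by
    total mentions. Returns 0.0 on days with no mentions to avoid dividing
    by zero."""
    total = n_pos + n_neg + n_neu
    if total == 0:
        return 0.0
    return (n_pos - n_neg) / total
```

Because the signal is a ratio, a firm with ten times the tweet volume of another can still produce the same sentiment value, which is what distinguishes it from a raw volume signal.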
Firms of interest (FOI)
We used 16 firms in this project, broken into two groups. The ten group AA firms have Twitter data from 01 January 2017 to 01 March 2017, and the six group A firms have Twitter data from 01 December 2015 to 01 March 2017. The stock tickers for the firms are:
- Buffalo Wild Wings: BWLD
- Chipotle Mexican Grill: CMG
- Domino's: DPZ
- Fitbit: FIT
- GoPro: GPRO
- Garmin: GRMN
- HP Inc: HPQ
- McDonald's: MCD
- Panera Bread: PNRA
- Starbucks: SBUX
- Dunkin' Donuts: DNKN
- Tableau Software: DATA
- Imperva: IMPV
- Splunk: SPLK
- Salesforce: CRM
- Apple: AAPL
Visualizing the sentiment data
Our team built a user interface (UI), hosted on Bluemix, to browse the history and current trends of the sentiment signal for all 16 firms in the three sectors. The main landing page has a sector overview, and every firm has a detail page that contains examples of the mentions within the time frame of the search parameters, such as this detail page for Chipotle Mexican Grill.
Using the UI, anyone who is interested in the sentiment of a firm can see the trend at a glance, and then drill down into specific mentions of the firm to analyze any sudden changes in sentiment or volume.
In addition to the ability to browse the data, the UI enables the export of the sentiment data for each firm, aggregated daily, for further analysis. For example, you can correlate the signal with past metrics of interest (like revenue) and use it as a predictive indicator for the next quarter's revenue growth. You can also detect events in the sentiment signal that correlate with high price volatility in the near-term market.
Quarterly revenue predictions
For each firm on the AA or A list, we found the start and end dates of its fiscal quarter. Using these dates, we integrated the sentiment score across the entire quarter. When earnings were announced after the end of the quarter, we compared the year-over-year change in quarterly revenue and found that it correlated with the Watson sentiment score calculation.
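The quarterly integration step can be sketched as follows; the function and the date-keyed signal mapping are our own illustrative assumptions, not the production code:

```python
from datetime import date, timedelta

def quarterly_sentiment(signal, start, end):
    """Integrate (sum) the daily sentiment signal over a fiscal quarter.

    `signal` maps a date to that day's normalized sentiment value;
    days with no data contribute nothing to the total."""
    total = 0.0
    day = start
    while day <= end:
        total += signal.get(day, 0.0)
        day += timedelta(days=1)
    return total
```

The resulting per-quarter scalar is what we compared against the year-over-year revenue change for each firm.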
While we can see some outliers that contradict the correlation, for most firms in the study the Watson sentiment score is a predictor of year-over-year quarterly revenue growth. Integrated with other features in a machine learning model, it can be a powerful tool for data scientists.
Event detection for stock price volatility
The sentiment signal can also be monitored for anomalies, or events, which correlate to equity price volatility. There are many ways to perform anomaly detection in signal analysis, the simplest of which is to define a threshold and declare an event every time the signal crosses the predefined boundary. For the purposes of detecting events in the sentiment signal, we found this to be a poor fit: the firms of interest have distinct characteristics that make the signal very different from firm to firm. The following figure shows an example of two firms, Chipotle (CMG) and Domino's (DPZ). Domino's has a much higher day-to-day variance but remains mostly centered around zero. Chipotle starts with a high sentiment value at the beginning of the time series, drops in late 2015 during its food safety issues, and never quite recovers to the level of January 2015. A threshold cut could suit Domino's, but it would flag nearly all of Chipotle's history as an event, right up until the actual food safety event occurs.
To define an event for our study, we examine the first derivative of the sentiment signal and check whether its magnitude exceeds a threshold, τ: |s(t) − s(t − 1)| > τ.
In other words, we look at the change from yesterday's sentiment to today's sentiment and call it an event if the change was significantly abnormal. Of the 5,918 unique data points in our study, 202 days, or 3.4%, resulted in an event detection. The following image displays Chipotle and Domino's with the events that are detected by this algorithm.
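A minimal sketch of this event detector in Python; the value of tau here is an illustrative assumption and in practice would need tuning, given how different the signal looks from firm to firm:

```python
def detect_events(signal, tau):
    """Flag day t as an event when the first difference of the sentiment
    signal exceeds the threshold: |s(t) - s(t-1)| > tau.

    `signal` is the daily sentiment series as a list indexed by day;
    returns the list of day indices flagged as events."""
    return [t for t in range(1, len(signal))
            if abs(signal[t] - signal[t - 1]) > tau]
```

Because the detector looks at day-over-day change rather than absolute level, a firm like Chipotle that sits at a persistently high (or low) sentiment level does not generate spurious events.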
Event today, volatility tomorrow
An event is determined by the change from yesterday's sentiment signal to today's sentiment signal. But what is most interesting is what that means for the stock price tomorrow.
Let the indices of all events on the sentiment signal s(i) be contained in the set I, and let the price of the stock on day i be p(i). Now define the change in stock price from today to tomorrow over all events as the delta: δe(i) = |p(i + 1) − p(i)|, for i in I.
And the change in stock price on any specific day (the null hypothesis) in the study is: δa(i) = |p(i + 1) − p(i)|, for all days i in the study.
If the events are indicators of stock volatility, then the probability density function of δe is different than the null hypothesis, δa.
Using Dunkin' Donuts (DNKN) as an example, you can see in the previous image that on any specific day there is a 24% probability that the change in price is 20 to 40 cents. After an event, the probability that tomorrow's change in stock price is 20 to 40 cents doubles to 50%.
Of course, firms trade at different prices, so a 20 cent change might be significant for some firms but not for others. To compare all firms on all days in the study on the same scale, we instead use the relative change in stock price for events: re(i) = |p(i + 1) − p(i)| / p(i), for i in I.
And the relative change in stock price on any specific day (the null hypothesis) in the study is: ra(i) = |p(i + 1) − p(i)| / p(i), for all days i in the study.
The following figure shows the cumulative density functions of re and ra. The separation between the two curves, with the re curve beneath the ra curve, indicates that over our data set a bigger change in stock price is more likely after an event than under the null hypothesis. For some firms, this effect is larger than for others; if we picked only the firms that responded most strongly to events, the increased likelihood of volatility would be even more striking. A change in stock price above 0.024% is 10% more likely after an event detected by Watson than on any specific day.
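The comparison of the two distributions can be sketched as follows; both helpers are illustrative assumptions rather than the actual analysis code. Evaluating the empirical CDF of the event-day changes and of the all-day changes at the same set of points yields the two curves compared in the figure:

```python
def relative_changes(prices, days):
    """Relative next-day price change |p(i+1) - p(i)| / p(i) for each day
    index in `days`; every index must have a following day in `prices`."""
    return [abs(prices[i + 1] - prices[i]) / prices[i] for i in days]

def empirical_cdf(values, x):
    """Empirical cumulative density: the fraction of observed values <= x."""
    return sum(1 for v in values if v <= x) / len(values)
```

If `empirical_cdf(re_values, x)` sits below `empirical_cdf(ra_values, x)` across a range of x, then large relative changes are more common after events, which is the separation described above.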
Using IBM Bluemix, the Watson Natural Language Understanding API, Twitter data, and web data, we created a web interface to track the aggregated sentiment signal for each of the 16 firms in our project. The user interface allows for natural browsing and visualization of the data along with the ability to download the raw sentiment signal data. Using the sentiment signal data made available through the UI, we found two strong use cases for this data: quarterly revenue predictions and a daily stock price volatility indicator.