Analyzing tweets to predict flu epidemics

Influenza is a serious health problem. Millions of Americans get the flu each year; it sends hundreds of thousands to the hospital and kills thousands. If healthcare professionals could more accurately predict when and where flu outbreaks will peak, they could improve the timing of flu vaccination clinics, better communicate the need for vaccination and improve the distribution of antiviral medications. Many lives could potentially be saved.

The U.S. Centers for Disease Control and Prevention (CDC) does track the flu, of course, but its surveillance system is based on reports from physicians, clinics and hospitals. As a result, its data lags actual flu activity by one to two weeks, which limits its usefulness as a predictive tool.

To accelerate flu epidemic forecasting, three of my master’s students undertook a research project to develop a cognitive expert system that tapped into social media. Utilizing IBM Bluemix and Watson cognitive services, they created a working solution in just six months.

Digging into Twitter

Conceptually, my students realized that Twitter posts could help pinpoint the location and severity of flu outbreaks if they could learn how many tweeters had symptoms and where they lived. Success would depend on the system’s ability to analyze up to 500 million daily tweets in English to dig out and categorize the needed data.
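To make the counting idea concrete, here is a minimal Python sketch of the aggregation step. It is an illustration under stated assumptions, not the students' actual pipeline: the tweets, the week numbers, the state codes and the `symptom_share` helper are all invented, and the symptom label is assumed to have been assigned already by a text classifier.

```python
from collections import Counter

# Hypothetical, already-labelled tweets: (week number, U.S. state, has_symptoms).
# In a real pipeline the label would come from a text classifier and the
# location from profile or geotag data; both are simply assumed here.
tweets = [
    (3, "NY", True),
    (3, "NY", True),
    (3, "CA", False),
    (3, "CA", True),
    (4, "NY", True),
    (4, "NY", False),
]


def symptom_share(rows):
    """Return {(week, state): fraction of tweets that report symptoms}."""
    total = Counter()
    sick = Counter()
    for week, state, has_symptoms in rows:
        total[(week, state)] += 1
        if has_symptoms:
            sick[(week, state)] += 1
    return {key: sick[key] / total[key] for key in total}


shares = symptom_share(tweets)
for (week, state), share in sorted(shares.items()):
    print(f"week {week} {state}: {share:.2f}")
```

Grouping by week and location is what turns individual posts into a signal that can be compared with regional surveillance figures.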

IBM Watson Natural Language Classifier service delivered that capability. Powered by the cloud, the service’s natural language processing (NLP) can understand nuances in the content and context of everyday language. NLP extends the bounds of keyword search by comprehending the content of tweets.

How Natural Language Processing works

As an example, it makes a difference whether someone tweets that he plans to get a flu shot, or that he already has flu-like symptoms and is staying home from work. Watson can recognize the difference by understanding the sentences. The system applies such cognition across millions of tweets while correlating the analysis with data from the CDC.
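The distinction between "planning a flu shot" and "already sick" can be illustrated with a very small text classifier. The sketch below is not the Watson Natural Language Classifier and makes no claim about how that service works internally; it is a plain naive Bayes model written from scratch, trained on six invented example sentences, with hypothetical labels `symptoms` and `vaccination`.

```python
import math
from collections import Counter, defaultdict

# Invented training examples; the texts and labels are assumptions for illustration.
TRAIN = [
    ("i have a fever and a cough staying home from work", "symptoms"),
    ("sore throat chills and aches i feel awful", "symptoms"),
    ("home sick with the flu fever all night", "symptoms"),
    ("getting my flu shot tomorrow at the pharmacy", "vaccination"),
    ("plan to get vaccinated this week", "vaccination"),
    ("just booked a flu shot appointment", "vaccination"),
]


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def train(examples):
    """Count words per label for a multinomial naive Bayes model."""
    word_counts = defaultdict(Counter)  # label -> word -> count
    label_counts = Counter()            # label -> number of examples
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return word_counts, label_counts, vocab


def classify(text, model):
    """Return the label with the highest log-probability for the text."""
    word_counts, label_counts, vocab = model
    n_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / n_docs)
        n_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # ignore words never seen in training
            # Laplace (add-one) smoothing avoids zero probabilities.
            score += math.log((word_counts[label][word] + 1) / (n_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAIN)
print(classify("i plan to get a flu shot", model))
print(classify("already have flu symptoms staying home from work", model))
```

Both sentences mention the flu, so a keyword search for "flu" alone could not separate them; the model decides from the surrounding words. A production system would need far more training data and a much stronger language model than this toy.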

Combining the CDC’s rock-solid data with “fuzzy” data from Twitter produces a dependable result: near-real-time predictions of flu outbreaks before they happen. The system is so promising that we entered it in the CDC’s Predict the Influenza Season Challenge.
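One simple way to picture how a delayed official figure and an immediate social media signal can be combined is a straight-line fit: learn the relationship from past weeks where both numbers exist, then apply it to the current week, for which only tweets are available. The sketch below assumes exactly that. The post does not describe the model the system actually uses, and every number and name here (`tweet_share`, `cdc_activity`, `fit_line`) is invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Invented weekly values. The tweet signal is available immediately;
# the official figure for the latest week has not been published yet.
tweet_share = [0.010, 0.014, 0.020, 0.026, 0.031]
cdc_activity = [1.2, 1.6, 2.2, 2.8, None]

# Fit only on weeks where the official figure is already known.
known = [(t, c) for t, c in zip(tweet_share, cdc_activity) if c is not None]
slope, intercept = fit_line([t for t, _ in known], [c for _, c in known])

# Estimate the current week from the tweet signal alone.
nowcast = slope * tweet_share[-1] + intercept
print(f"estimated activity for the latest week: {nowcast:.2f}")
```

The official data anchors the scale, while the tweet signal fills the one-to-two-week reporting gap.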

Answering people’s questions

Also, we paired a body of knowledge from about 4,000 research papers on why people get the flu with the IBM Watson Engagement Advisor service, which conducts a dialogue in natural language with people who contact the system. It can answer critical questions such as “Do I have the flu?”, “What are flu symptoms?” and “Should I get vaccinated?” The more questions and responses the system handles, the smarter it gets.

What is the future of our system? We plan to add more data sources to increase accuracy. Now that Watson speaks German, we will extend the system to Germany. We may also apply it to other infectious diseases. The system would be especially welcome in developing nations where the healthcare infrastructure is weak and Twitter data could be more accurate than information from the medical establishment.

A machine that supports us

Key to our system’s value is its ability to keep users in the loop through natural language processing. We never wanted to create an isolated supermachine; we wanted a machine on our side to support us. A cognitive system that engages medical professionals and answers everyday questions can truly improve our health and our quality of life.


For more details on the flu prediction system, visit the IBM case study or view Dr. Pipa’s video below:
