CASE STUDY: National Institutes of Health

Finding cures faster by transforming big data into valuable clinical insight.

NIH is empowering scientists to tap into big data faster than ever, helping them better understand medical conditions and develop treatments and preventions.


Executive Summary

NIH is helping researchers unlock new insights from decades of data, using the IBM PureData System.

The National Institutes of Health (NIH) supports and conducts vital medical research. How could it better help scientists find rapid answers to research questions and drive treatment breakthroughs?

NIH is empowering scientists to tap into vast stores of data faster than ever and to uncover new links that help them better understand medical conditions and develop treatments and preventions. Now, researchers can run queries on large volumes of data in seconds rather than minutes. Faster, deeper insight opens up opportunities to identify diseases and develop treatments to improve public health.

The NIH Clinical Center is adding new data sources to the IBM PureData System every day. Rapid access to a wider range of data will open up even greater opportunities for researchers.



Every day, vast quantities of data are collected from patients at the National Institutes of Health (NIH) Clinical Center – the largest research hospital in America.

The insights hidden within this wealth of information could hold the key to advancing medical research and enhancing patient care, but only if researchers are able to access and analyze the data.

NIH is helping researchers unlock new insights from decades of data, using the IBM® PureData® System. Today, research teams can run queries on patient data in seconds, rather than minutes or hours, helping them conduct more thorough experiments and find more accurate answers to research questions. These insights can be used to uncover new patterns of disease and effective treatments, improving patient care and potentially saving lives.



To keep up the pace of medical research, scientists need to enhance their ability to access large amounts of data.

Information is one of the NIH’s most powerful commodities. Data can help reveal subtle relationships between symptoms and ailments, or between a patient’s genetic makeup and predisposition to certain conditions. Understanding these relationships allows scientists to develop new medical treatments and techniques that improve health outcomes. It is critical for researchers to have access to all the information they need, when they need it, so they can make these kinds of breakthroughs.

Experts at the NIH have access to clinical and research data through the Biomedical Translational Research Information System (BTRIS). BTRIS, which was introduced in 2009, serves as a collection point for active patient information and historical research data – including demographics, vital signs, lab tests and medication history – enabling scientists to pull data from multiple systems of record with a few clicks of a mouse.

As technological and scientific advances increase the scale and pace of medical research, NIH must work to provide scientists with even faster insight into ever-increasing data volumes. Research teams expect information to be available at their fingertips, and the NIH must continue to meet this need, even as the amount of data continues to grow.

A few years ago, the NIH Clinical Center realized that the volume and complexity of data threatened to outstrip the capacity of its existing systems. Although the NIH was always able to provide the data, there were instances where it could not make some types of information available to scientists in a practical way.

For example, due to the sheer size of genomic data sets, it was impossible to return data to research teams in a usable format. The NIH could not afford to let these system limitations hold back research, so the Clinical Center looked for a way to enable faster information access and reporting.



The NIH supercharged the performance of its BTRIS applications using the IBM PureData System for Analytics, which is powered by Netezza® technology.

Today, researchers can run analyses on large, complex data sets and generate reports faster than ever before. Some queries that took up to five minutes to run now take just five seconds. Because results come back so quickly, researchers can run and re-run queries and look at the data in many different ways.

As an example, one NIH researcher wanted to know the heights and weights of certain volunteers who have participated in medical research studies at NIH. This may seem like a straightforward question, but it returned results from 25,000 volunteers, which worked out to three million rows of data. In the past, researchers would have needed to run such a report over the course of a weekend; now, they can get the data in near real time.

Solution Components

  • IBM® PureData® System for Analytics, powered by Netezza® technology.



Creating new research opportunities.

Improving BTRIS performance has opened up new opportunities for NIH researchers, allowing them to access and analyze data that previously could not be delivered to them in a practical form.

While genomic data has always been available through other platforms, researchers can now, for the first time, access and report on both clinical and genomic data in BTRIS. When the NIH Clinical Center added this data, the size of the repository grew from 5 billion to 16 billion rows of data.

The ability to combine genomic data with the clinical data already held in BTRIS gives researchers access to an enormous, rich data source that opens up new research opportunities.

Data that was once difficult to access is now available for analysis, helping uncover patterns of cause and effect and indicators of disease that were previously unknown. As a result, the organization is opening up exciting opportunities for advancing research and improving patient care.

For example, researchers can take a breast cancer study, look at the genomic data, and find the locations of any abnormalities on a particular chromosome across the entire population of participants. They can then combine those results with clinical data, to see what drugs participants were taking, where they were living, and other lifestyle indicators.

By bringing all of this information into one place and analyzing it in new ways, researchers can potentially reveal hidden patterns and relationships that could hold the key to finding new treatments or improving healthcare delivery.

Better real-world outcomes.

By making it quicker and easier for researchers to find the answers they need, BTRIS is helping to make a real difference to patients’ lives. The research conducted by the NIH has been a driving force behind decades of medical advances, and the NIH continues to invest in new ways to cure disease and prevent illness. With quicker, more complete access to data, researchers can now identify new research questions and find treatments faster. Ultimately, this will make a difference in the type of care people receive, helping them live longer, healthier lives.
