Brainiacs: Applying Watson for Genomics to better understand brain tumors

This spring I was invited to a global meeting about cancer research: how tumor data should be gathered, integrated and interpreted. It brought together specialists from medicine, biology, chemistry, mathematics and computer science for an extensive multi-disciplinary exploration. On the long trans-Atlantic flight back, to distract myself, I casually picked a movie with an intriguing title from the in-flight entertainment, “Collateral Beauty.” To my great surprise, the movie touched on cancer: it portrayed the devastating effect of glioblastoma multiforme (GBM) on a victim’s hapless family. It was a gut-wrenching account.

GBM strikes indiscriminately and more frequently than one would imagine [1]. About a hundred thousand brain tumors are diagnosed each year in the US, and a quarter of these are gliomas, or tumors of the supportive tissue of the brain. Gliomas account for 75 percent of all malignant brain tumors, and nearly 50 percent of gliomas are GBMs. GBMs are usually highly malignant and grow aggressively, invading surrounding tissues. Our team in IBM Research set out to explore the potential for applying machine learning and data science to better understand and predict this disease.
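As a rough back-of-the-envelope check, the figures above can be combined to estimate how many GBMs are diagnosed in the US each year. This is only a sketch: the inputs are the rounded numbers quoted in this post, "nearly 50 percent" is treated as exactly 50 percent (so the result is an upper bound), and the variable names are ours, chosen for illustration.

```python
# Rounded figures quoted in the paragraph above (approximate, per [1]).
brain_tumors_per_year = 100_000   # US brain tumor diagnoses per year
glioma_fraction = 0.25            # "a quarter of these are gliomas"
gbm_fraction_of_gliomas = 0.50    # "nearly 50 percent", treated as an upper bound

# Chain the fractions: all brain tumors -> gliomas -> GBMs.
gliomas_per_year = brain_tumors_per_year * glioma_fraction
gbm_per_year = gliomas_per_year * gbm_fraction_of_gliomas

print(f"Estimated gliomas per year: {gliomas_per_year:,.0f}")
print(f"Estimated GBMs per year (upper bound): {gbm_per_year:,.0f}")
```

Under these assumptions, the estimate comes to roughly 12,500 GBM diagnoses a year, or about one in eight of all brain tumors.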

Researchers prepare tissue samples for whole genome sequencing at The Rockefeller University, where clinical researcher Robert Darnell, MD, PhD, led a study with the New York Genome Center and IBM to analyze complex genomic data from state-of-the-art DNA sequencing of whole genomes. The findings were published in the July 11, 2017 issue of Neurology® Genetics, an official journal of the American Academy of Neurology. (Photo Credit: Epic Creative)

Would analyzing more genes give us a more complete view of the patient? In this case, is more really ‘more’? That’s the question we investigated in our paper published in Neurology Genetics this month, one of the results of our collaborative effort with the New York Genome Center and other specialists [2]. Current commercially available genomic tests (called “assays”) target a small panel of a patient’s genes. We extended this analysis to a patient’s entire genome, as well as to other omic assays, such as proteomics (the study of proteins), that included whole genome expression (i.e., RNA, or ribonucleic acid) data. We found that this does indeed identify more variants in a patient’s genome that an oncologist can potentially target for therapy.

Next we asked: does a machine-based (algorithmic) analysis of this multi-modal, whole genome data hold a candle to a crack team of human bioinformaticians and oncologists in terms of accuracy and quality of analytics? Using the research version of Watson for Genomics available at the time, we demonstrated that it does! It cut the time for accurate genomic data interpretation from 160 expert human hours to 10 minutes, opening the door to scaling this highly specialized analysis.
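To put that time saving in perspective, here is a minimal sketch of the arithmetic. It uses only the two figures quoted above, and the variable names are ours, chosen for illustration.

```python
# Figures quoted in the paragraph above.
expert_hours = 160      # expert human time for genomic data interpretation
machine_minutes = 10    # time taken by the research version of Watson for Genomics

# Convert to a common unit before comparing.
expert_minutes = expert_hours * 60
speedup = expert_minutes / machine_minutes

print(f"Expert time: {expert_minutes:,} minutes")
print(f"Speedup: about {speedup:,.0f}x")
```

By this measure the machine analysis took about one-thousandth of the expert time, which is what makes scaling to many more patients plausible.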

Currently we are extending this work to a larger set of GBM patients and to other cancers, while continuously improving the underlying algorithms. We are also setting our sights on understanding the genomic basis of other complex phenomena, such as resistance and response to therapy and immunotherapy.



[1] American Brain Tumor Association, “Glioblastoma.”

[2] “Comparing sequencing assays and human-machine analyses in actionable genomics for glioblastoma,” Neurology Genetics, 2017.
