NYGC-led Glioblastoma Study to Test IBM's Watson's Ability to Handle Genomic Data


The New York Genome Center said this week that it will use a prototype of IBM's Watson specifically designed to handle genomic data for a clinical research study aimed at finding better treatments for glioblastoma, a particularly aggressive and malignant type of brain cancer.

The NYGC announced the glioblastoma study and its plans to use Watson for data analysis during a launch event held at its offices in New York City. At the same time, the center announced that it has selected IBM as its first technology partner. In a series of presentations, the partners explained that they will use a cloud-based version of Watson to combine genomic data with information from biomedical literature and drug databases in an effort to identify driver mutations and effective treatments that can target these variants. Currently, patients receive a standard three-month chemotherapy regimen, but the drugs used work in only a fraction of cases and at best add only a few extra months of life.

The partners plan to analyze information from about 20 recently diagnosed patients who have been selected to participate in the NYGC-led clinical research study, which is being conducted in conjunction with nine hospitals in New York state. Specifically, they'll sequence tumor and normal samples from each patient (at 80x and 40x coverage, respectively) and they'll also do RNA sequencing on the biopsied samples, Toby Bloom, NYGC's head of informatics, told BioInform after the announcement. Her team will be responsible for analyzing the sequence data and generating the list of variants that will be fed into Watson.

Once the variant data is in Watson, the system maps the mutation information to biochemical pathways and proteins in the cellular processes that are involved in cell replication, and explores the effects that these mutations have on pathways with an eye toward identifying driver mutations, Ajay Royyuru, head of the computational biology center at IBM Research, explained during his presentation on Watson's capabilities. It then searches the biomedical literature for relevant information on the pathways and driver mutations and also looks for available drugs that could target the variants in question. These steps are performed in a matter of seconds and Watson's findings are presented in the report that's returned to the pathologist or tumor board.
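The flow Royyuru described (map mutations to pathways, keep those touching cell replication, then look up drugs) can be illustrated with a toy sketch. This is not Watson's implementation, which the presentations did not detail. Every gene name, pathway label, drug name, lookup table, and the ranking rule below is a hypothetical placeholder invented for illustration, and the literature search step is left out entirely.

```python
from dataclasses import dataclass

# Hypothetical lookup tables; none of this data comes from Watson or the article.
GENE_TO_PATHWAYS = {
    "GENE_A": {"cell_cycle"},
    "GENE_B": {"cell_cycle", "growth_signaling"},
    "GENE_C": {"metabolism"},
}
# Assumed set of pathways treated as "involved in cell replication".
REPLICATION_PATHWAYS = {"cell_cycle", "growth_signaling"}
# Assumed drug database keyed by gene.
DRUGS_BY_GENE = {"GENE_B": ["drug_x"]}


@dataclass
class Candidate:
    gene: str
    pathways: set   # replication-related pathways the mutated gene maps to
    drugs: list     # drugs listed against the gene in the toy database


def rank_candidates(mutated_genes):
    """Return candidate driver genes, most pathway hits first."""
    candidates = []
    for gene in mutated_genes:
        # Step 1: map the mutated gene to pathways, keeping replication-related ones.
        hits = GENE_TO_PATHWAYS.get(gene, set()) & REPLICATION_PATHWAYS
        if not hits:
            continue  # no replication pathway affected; not a candidate in this sketch
        # Step 2: look up drugs that might target the gene.
        candidates.append(Candidate(gene, hits, DRUGS_BY_GENE.get(gene, [])))
    # Step 3: an assumed ranking rule: more pathways, then more drugs, then name.
    candidates.sort(key=lambda c: (-len(c.pathways), -len(c.drugs), c.gene))
    return candidates


report = rank_candidates(["GENE_C", "GENE_A", "GENE_B"])
for c in report:
    print(c.gene, sorted(c.pathways), c.drugs)
```

In this sketch a mutation counts as a candidate driver only if its gene maps to at least one replication-related pathway. A real system would weigh far richer evidence, including the literature.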

Watson relies on information from several standard biomedical resources — it currently has access to information from about 23 million Medline abstracts, for example — and its internal knowledgebase will be updated "incrementally" as new information is added to these resources, Royyuru said in his presentation.

Furthermore, Watson has the ability to learn from new patient scenarios and from the addition of new information from medical research, journal articles, and clinical studies. Watson is part of a "new generation of computing systems that no longer require extensive programming," John Kelly, the senior vice president and director of IBM Research, said during his presentation.

These so-called "cognitive systems" are capable of "dealing with immense amounts of data and … they learn as they experience the data and [as] they experience outcomes of the use of that data," he explained during his presentation. They're also able to cut down on the time that would normally be required to parse the biomedical literature and combine it with the patients' genomic information. These improved systems can "rapidly advance care in diseases such as brain cancer," Kelly said.

Furthermore, the system doesn't favor data from specific sources over others. Rather, it starts out with a "completely clean slate," simply presenting pathologists with all of the relevant information it has pulled in from public literature, but "as it runs long enough, it will start learning … which sources it can trust over the long term more than others," Kelly said.

Although Watson has had some experience in healthcare settings in the past, this is the first time that the system will be used in a research project that’s aimed at using genomic data to find new treatments for disease. In 2011, IBM worked with collaborators at Columbia University and the University of Maryland to figure out how best to use Watson in a clinical setting.

Then in 2012, IBM teamed up with Memorial Sloan-Kettering Cancer Center to use Watson to combine clinical, genomic, and molecular information from patients’ electronic medical records with data from biomedical literature in order to improve breast cancer diagnosis and therapy selection. This particular partnership, according to Kelly, was instrumental in preparing Watson to work in clinical settings, specifically improving the system's ability to "understand the medical language and terminology in cancer."

Since this version of Watson is implemented on the cloud, partner institutions can access the system and available data remotely. Steve Harvey, IBM’s global technology and analytics leader, told BioInform that Watson's algorithms sit on cloud infrastructure that was developed by SoftLayer Technologies, a cloud computing provider that IBM bought in June last year.

Implementing it on the cloud also allows the system to scale up as needed, so that data from thousands of patients can be analyzed rapidly. This is important for diseases such as glioblastoma, where patients' prognoses are poor. With a median survival time of between 12 and 14 months from diagnosis, "time is not your friend when you have glioblastoma," Robert Darnell, NYGC's CEO, president, and scientific director, noted in his talk.

Not only are glioblastomas different from each other, but each one has thousands of mutations, all of which "have robust literature backing them up" in terms of things like which pathways are involved and how these drive tumor development, he said. Watson has the ability to "read thousands of papers in a split second" and to process "huge amounts of data from each individual patient," enabling physicians and researchers to home in on the most important mutations and identify the best available drugs in a timely fashion.