Natural Language Processing to Play Major Role in Bringing Watson into Clinics


By Uduak Grace Thomas

Under the terms of a recently inked agreement between IBM and Nuance, Watson's deep question answering, natural language processing, and machine learning capabilities will be linked with Nuance's speech recognition and Clinical Language Understanding (CLU) solutions to help physicians more accurately diagnose and treat their patients (BI 02/11/2011).

In the months leading up to the first offerings from the collaboration, researchers at IBM and Nuance will work with collaborators at Columbia University and the University of Maryland to determine how Watson can best help in the clinical setting and to incorporate some healthcare-specific adaptations into the system, Jennifer Chu-Carroll, a member of the Watson Research Team, told BioInform.

"For the most part, the natural language analytics, the machine learning and the whole architecture are domain independent so we expect to be able [to] plug these into the medical domain," she said. However, "there [will] be some ... research and development that is specific to the medical domain that we are going to have to bring in."

She said that Watson will have access to several "dimensions of evidence" that will help the system suggest diagnoses for individual patients. These will include medical information in medical textbooks and other published literature; past case studies that, for example, describe the correlation between symptoms and diseases; patient-specific data including medical history; and experimental results and other kinds of data.

She also said that the partners "don’t know at this point" whether genomic data will be incorporated into Watson's knowledgebase.

She noted that Nuance's CLU solution, which also includes natural language-processing capabilities that are tailored to the healthcare space, will "supplement" Watson's more general abilities.

Nick van Terheyden, chief medical information officer for Nuance Healthcare, concurred and further distinguished the capabilities of the two tools.

"Watson has the capacity to consume large amounts of data in an automated fashion and to present that in a form that can be answered in a question-and-answer-type style," he explained to BioInform. "CLU is technology that takes narrative in the healthcare setting and extracts out distinct data elements but it does not link those data elements to research papers or practical activities ... the sum of the two is much greater than the two individually."

CLU is based on a natural language processing engine that includes an ontology "for structuring and capturing clinical information" and then connecting it to standardized medical vocabularies such as the Systematized Nomenclature of Medicine--Clinical Terms (SNOMED CT), ICD-9, and ICD-10, van Terheyden said.

"What all that translates to is the ability to extract out from a free-form narrative document structured, tagged clinical data, what I would term actionable information because it's now semantically interoperable, meaning that a computer can consume it without human intervention," he said. "That data is then available for use within clinical and research systems [and] within all of the tools in the healthcare setting that are driven by data that can be generated from the natural form."

An additional feature of the software is its ability to standardize data contained in medical documents using its built-in ontology. To illustrate this point, van Terheyden noted that while terminology tends to vary less in medical research papers, physicians' notes are notorious for using different language to describe the same condition. For example, one physician may write "myocardial infarction," another "MI," and a third simply "heart attack."

CLU's ontology incorporates more than 1.5 million concepts, which it uses to normalize data.

Using the example of the heart attack, the tool records 'myocardial infarction' as a term linked to a set of references that capture other ways the term could be reported, as well as the contexts in which it could occur in a clinical setting: whether it is the patient's current health complaint, a past condition, or part of the family history. Other relationships defined within the ontology include where in the body the condition occurs and whether it leads to other diseases.
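
As a rough illustration of the kind of normalization described above, the toy Python sketch below maps the three heart attack variants to one canonical concept and attaches a clinical context to each mention. It is not Nuance's CLU software or its ontology: the names `SYNONYMS`, `CONTEXT_CUES`, `Mention`, and `normalize`, the context labels, and the cue phrases are all assumptions made up for this example, and a real system would rely on far richer linguistic analysis than keyword matching.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical mini-ontology: surface forms mapped to one canonical concept.
SYNONYMS = {
    "myocardial infarction": "myocardial infarction",
    "mi": "myocardial infarction",
    "heart attack": "myocardial infarction",
}

# Hypothetical cue phrases, checked in order, that hint at the clinical
# context of a mention. The more specific cue comes first.
CONTEXT_CUES = [
    ("family history", "family_history"),
    ("history of", "past_condition"),
]


@dataclass(frozen=True)
class Mention:
    concept: str  # canonical concept name from SYNONYMS
    context: str  # current_complaint, past_condition, or family_history


def normalize(note: str) -> List[Mention]:
    """Return one Mention per known term found in each sentence of a note."""
    mentions = []
    # Naive sentence split on periods, semicolons, and newlines.
    for sentence in re.split(r"[.;\n]", note.lower()):
        # Default context when no cue phrase is present.
        context = "current_complaint"
        for cue, label in CONTEXT_CUES:
            if cue in sentence:
                context = label
                break
        for term, concept in SYNONYMS.items():
            # Word boundaries keep "mi" from matching inside other words.
            if re.search(r"\b" + re.escape(term) + r"\b", sentence):
                mentions.append(Mention(concept, context))
    return mentions


if __name__ == "__main__":
    note = "Patient presents with heart attack. Family history of MI."
    for mention in normalize(note):
        print(mention)
```

Under these assumptions, both sentences in the sample note resolve to the same concept, while the context label separates the patient's current complaint from the family history.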

"What the engine does, through a variety of steps, is to process through this information [and] understand or find those links based on the context of the information," van Terheyden said. "[CLU] essentially creates a three-dimensional model of the data that says 'based on what I understand, here are all the linkages to my ontology that allows me to say this is what this term is.'"

CLU will also add value to Nuance's speech recognition software, which is being incorporated into Watson so that physicians can simply speak to the system, since it can also mine the text based on its ontology.

Have topics you'd like to see covered in BioInform? Contact the editor at uthomas [at] genomeweb [.] com.
