U Penn Cancer Center Initiative Builds Bridge Between Clinical and Genomic Data


Few would deny the potential benefits of marrying clinical and genomic data: linking knowledge of molecular-level mechanisms with the physiological effects of human disease may be the only way to relieve the target validation bottleneck now facing the industry. However, bringing these two ends of the pipeline together poses a number of technical, logistical, and cultural challenges that have proven difficult to overcome.

Michael Liebman, director of computational biology at the Abramson Cancer Center of the University of Pennsylvania, is heading up a biomedical informatics initiative that aims to make this integration a reality. Liebman noted that when he first began a career in the pharmaceutical industry in 1988, “We started to put together a strategic plan of what we were going to do, and this is still the same strategic plan. It hasn’t changed — it’s just taken a long time to get to this level of maturity and to get the technology and the access necessary to actually do it.”

Clinical data currently inhabits many disparate legacy systems, Liebman said, and electronic medical records for patient data are still a relative rarity. The Penn project is creating a patient-based distributed database model to integrate this data, and in the process the team is adding genomic data, family history, and previous medical records to the mix. Additional linkages to tissue and sample repositories, cell lines, genotyping data, image data, and other sources of information are also planned.

Working in cooperation with Penn’s Genomics Institute, Liebman’s team at the Cancer Center is laying the initial groundwork to make sure all the participating parties in the system are on the same page, technologically. “Some divisions or departments have no structure and we’re helping them impose a structure, or we’re sharing ours so they can use it as a template,” he said. On the other hand, some divisions have been capturing data electronically for some time, but with an emphasis on scheduling or billing, rather than on data of value to researchers, posing other compatibility hurdles.

The first step in the undertaking involves “inventorying what exists,” Liebman said, “providing access at a high level through a controlled vocabulary to what is available and who needs to be contacted.” The next step gives researchers the ability to search across the distributed repositories of data, and the third step is building an object-oriented interface for a full-scale knowledge management system, he said.
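The first two steps Liebman describes can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the article does not describe the Penn system's schema, vocabulary, or code, so the `Repository` class, the `CONTROLLED_VOCABULARY` mapping, and the sample records are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical controlled vocabulary: local spellings map to one shared term.
CONTROLLED_VOCABULARY = {
    "breast ca": "breast cancer",
    "breast carcinoma": "breast cancer",
    "breast cancer": "breast cancer",
}


def normalize(term):
    """Map a local term to its shared form; unknown terms pass through lowercased."""
    key = term.strip().lower()
    return CONTROLLED_VOCABULARY.get(key, key)


@dataclass
class Repository:
    """One departmental data source (hypothetical structure)."""
    name: str
    contact: str
    records: list = field(default_factory=list)  # dicts with "patient_id" and "term"

    def search(self, term):
        wanted = normalize(term)
        return [r for r in self.records if normalize(r["term"]) == wanted]


def inventory(repositories):
    """Step 1: list what each repository holds and who needs to be contacted."""
    return {
        repo.name: {
            "contact": repo.contact,
            "terms": sorted({normalize(r["term"]) for r in repo.records}),
        }
        for repo in repositories
    }


def federated_search(repositories, term):
    """Step 2: search every repository and group the hits by patient."""
    by_patient = {}
    for repo in repositories:
        for record in repo.search(term):
            by_patient.setdefault(record["patient_id"], []).append((repo.name, record))
    return by_patient


# Made-up sample data, not drawn from any real system.
clinic = Repository("clinic", "clinic-admin", [
    {"patient_id": "P1", "term": "Breast CA"},
    {"patient_id": "P2", "term": "melanoma"},
])
genomics = Repository("genomics", "lab-manager", [
    {"patient_id": "P1", "term": "breast carcinoma"},
])

catalog = inventory([clinic, genomics])
hits = federated_search([clinic, genomics], "breast cancer")
```

In this sketch, normalizing both the query and the stored terms is what lets a single search reach records that each department labeled differently, and grouping by patient mirrors the patient-based model described above.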

The knowledge management system is based on a specialized ontology that incorporates breast cancer as “a process rather than a diagnosis,” Liebman said. The ontology, unique in its mechanistic rather than “parts list” approach to organizing knowledge, will enable researchers to query breast cancer patient data at both the clinical and molecular level.
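One way to picture a "process rather than a diagnosis" is as an ordered series of stages, each carrying both clinical and molecular observations. The sketch below is only an illustration of that idea. The `ProcessStage` and `PatientCourse` classes, the stage names, and the placeholder values are hypothetical and are not taken from the Penn ontology.

```python
from dataclasses import dataclass, field

LEVELS = ("clinical", "molecular")


@dataclass
class ProcessStage:
    """One step in a patient's disease course (hypothetical structure)."""
    name: str
    clinical: dict = field(default_factory=dict)
    molecular: dict = field(default_factory=dict)


@dataclass
class PatientCourse:
    """A disease course modeled as an ordered list of stages."""
    patient_id: str
    stages: list = field(default_factory=list)

    def query(self, level, key):
        """Return (stage name, value) pairs, in stage order, for one observation."""
        if level not in LEVELS:
            raise ValueError(f"level must be one of {LEVELS}")
        return [
            (stage.name, getattr(stage, level)[key])
            for stage in self.stages
            if key in getattr(stage, level)
        ]


# Placeholder values only; these are not real clinical or molecular findings.
course = PatientCourse("P1", [
    ProcessStage("screening", clinical={"imaging": "placeholder-a"}),
    ProcessStage("diagnosis",
                 clinical={"imaging": "placeholder-b"},
                 molecular={"marker_x": "placeholder-high"}),
    ProcessStage("treatment", molecular={"marker_x": "placeholder-low"}),
])

imaging_history = course.query("clinical", "imaging")
marker_history = course.query("molecular", "marker_x")
```

Because each stage holds both kinds of observation, the same query method can follow a clinical finding or a molecular marker along the course of the disease.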

Liebman and his five-member team have been plugging away at each of the three steps in the project “in parallel.” The full system is in its final stages of deployment, and a prototype of the object-oriented knowledge management system that forms its top layer is currently in place.

Once up and running, Liebman envisions the effort as a proof-of-principle for other types of cancers as well as other disease areas. In addition, he said, “We anticipate that more and more clinical trials for pharma will require access to and analysis of genomic information as well as clinical information, and we believe that the system design we are implementing will strongly support this capability.”

The complexity of the system will require both in-house development and off-the-shelf tools. Liebman said he is still evaluating a number of commercial products that may become part of the architecture. In some cases, he said, the project team is working with commercial groups “who may have an interest in getting into this area and see the value of having a real testbed for co-development.” While unable to discuss particular vendors under consideration, Liebman said the project is considering commercial components in the areas of electronic medical records, clinical trials management systems, expression data analysis and storage, and object-oriented technology.

The project is partly supported by a grant to the Abramson Cancer Center from Pennsylvania’s tobacco settlement, which the state has earmarked for a number of healthcare initiatives.

— BT
