AI-Driven Expression Analysis in DNA Repair Disorders IDs Sarcoma Survival Biomarker


NEW YORK – A new study suggests that DNA repair disorders that carry a high cancer predisposition and share similar biological pathways can inform the search for new cancer biomarkers and therapeutic targets.

In the study, published last month in Cell Death & Disease, researchers from the University of Copenhagen and Hong Kong-based Insilico Medicine analyzed diseases with DNA repair defects and high cancer prevalence and found that the centriole assembly-associated gene CEP135 was often downregulated in these disorders. They subsequently discovered that elevated expression of CEP135 correlated with lower survival among sarcoma patients.

They then used CEP135 expression to stratify survival rates in a group of sarcoma patients, suggesting the gene's use as a potential prognostic biomarker and shedding light on potential therapeutic targets that might improve existing therapies.
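To illustrate what stratifying survival by expression of a single gene can look like, the following is a minimal sketch that assumes a median split on expression and a Kaplan-Meier estimate for each group. The article does not say which cutoff or estimator the researchers used, and every number below is invented for illustration only.

```python
from statistics import median


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time where a death occurs.

    times  -- follow-up time per patient
    events -- 1 if the patient died at that time, 0 if censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Patients who died or were censored at t leave the risk set.
        at_risk -= leaving
    return curve


def stratify_by_expression(expression, times, events):
    """Split patients at the median expression value (an assumed cutoff)."""
    cutoff = median(expression)
    groups = {"high": ([], []), "low": ([], [])}
    for x, t, e in zip(expression, times, events):
        key = "high" if x > cutoff else "low"
        groups[key][0].append(t)
        groups[key][1].append(e)
    return {key: kaplan_meier(ts, es) for key, (ts, es) in groups.items()}


# Made-up toy cohort: expression level, months of follow-up, death indicator.
expression = [9.1, 8.7, 8.2, 7.9, 3.1, 2.8, 2.5, 2.0]
months = [5, 8, 12, 20, 30, 34, 40, 45]
died = [1, 1, 1, 0, 1, 0, 0, 0]

curves = stratify_by_expression(expression, months, died)
for group, curve in curves.items():
    print(group, [(t, round(s, 2)) for t, s in curve])
```

In this toy cohort the high-expression group's estimated survival falls much faster than the low-expression group's. A real analysis would also test whether such a separation is statistically significant, for instance with a log-rank test.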

DNA repair is frequently disrupted in cancer, and identifying the precise molecular pathways involved in damage response and repair processes has already led to new therapeutic options for cancer patients.

Reasoning that DNA repair disorders with high cancer predisposition might yield new cancer targets and insights into how they impact tumorigenesis, the group collected publicly available gene expression signatures, along with proteomic and phenomic data, from fibroblasts or induced pluripotent stem cells from patients with ataxia-telangiectasia, Nijmegen breakage syndrome, and Werner syndrome.

"We knew that they all had cancer as a phenotype," said Morten Scheibye-Knudsen, an associate professor in the University of Copenhagen's Department of Cellular and Molecular Medicine and the paper's senior author. "This led us to the idea that things that are dysregulated in these diseases could [also] be involved in cancer progression."

An analysis of gene expression signatures and disease phenotypes via Insilico Medicine's artificial intelligence (AI)-powered PandaOmics platform revealed that CEP135 expression was significantly perturbed in all three DNA repair disorders and that high CEP135 expression was also associated with poor survival rates among sarcoma patients.

Comparing transcriptomic data from sarcoma patients with low survival and high CEP135 expression against data from non-tumorous tissue samples further identified PLK1, a downstream target of CEP135, as a potential therapeutic target for sarcoma patients with high CEP135 expression.

PandaOmics uses deep learning models to identify potential therapeutic targets for a given disease by analyzing omics data in the context of prior information from publications, clinical trials, and grant applications. It extracts features from omics data, such as those derived from GWAS and users' input datasets, along with textual data, and combines the scores from each model into an overall score for the association between a molecular target and a given disease.

"There are hundreds of ways to discover targets," said Alex Zhavoronkov, Insilico's founder and CEO and an author of the paper. PandaOmics, he noted, is a "platform that allows you to work with very diverse omics datasets [and] generate target hypotheses very efficiently."

PandaOmics users can tailor their analyses by selecting the data sources and outputs that best suit their studies.

"For example, to identify novel molecular targets for stratified sarcoma patients, only omics-based scores were taken into account," Zhavoronkov said.
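The idea of combining per-model scores into an overall target-disease score, while letting the user restrict which score types count, can be sketched roughly as follows. This is not PandaOmics' actual scoring scheme, which the article does not describe. The unweighted mean, the model names, the "omics" and "text" families, the gene labels, and all values are hypothetical.

```python
def rank_targets(scores, families, selected):
    """Rank genes by the mean of their scores from the selected model families.

    scores   -- {gene: {model_name: score between 0 and 1}}
    families -- {model_name: family label, e.g. "omics" or "text"}
    selected -- set of family labels to include in the overall score
    """
    ranked = []
    for gene, per_model in scores.items():
        kept = [s for model, s in per_model.items() if families[model] in selected]
        if kept:
            # Unweighted mean is an assumption made for this sketch.
            ranked.append((gene, sum(kept) / len(kept)))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Hypothetical models grouped into hypothetical families.
families = {"expression": "omics", "network": "omics", "publications": "text"}

# Invented scores for two placeholder genes.
scores = {
    "GENE_A": {"expression": 0.9, "network": 0.7, "publications": 0.1},
    "GENE_B": {"expression": 0.4, "network": 0.5, "publications": 0.9},
}

omics_only = rank_targets(scores, families, {"omics"})
all_sources = rank_targets(scores, families, {"omics", "text"})
print("omics only: ", [gene for gene, _ in omics_only])
print("all sources:", [gene for gene, _ in all_sources])
```

With these invented numbers, the little-published GENE_A ranks first when only omics-based scores are counted but falls behind GENE_B once text-based scores are included, which shows why the choice of score types matters when the goal is to surface novel targets.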

PandaOmics works similarly to the Open Targets platform, to which Bristol Myers Squibb, EMBL-EBI, Genentech, GlaxoSmithKline, Pfizer, Sanofi, and the Wellcome Sanger Institute all contribute. Like PandaOmics, Open Targets integrates publicly available data sources to build and score target-disease associations for the systematic identification and prioritization of potential therapeutic drug targets.

According to Zhavoronkov, PandaOmics differentiates itself from Open Targets by incorporating more validated bioinformatic models for target identification — 23 in the case of PandaOmics, compared to two in Open Targets. This, Zhavoronkov said, makes PandaOmics more of an "industrial strength" platform.

Despite PandaOmics' broad range of data sources and computational models, Jiyang Yu, a computational biologist at St. Jude Children's Research Hospital, cautioned that the platform's readouts should not be relied upon without "serious validation."

"CEP135 might be interesting but [it] lacks detailed mechanism studies," he said. "There are hundreds of genes like CEP135 that can separate survival well. PLK1 is a long-known target — actually [an] essential gene — in many cancers, but has gone nowhere, because of toxicity."

Although Scheibye-Knudsen acknowledged that the association between PLK1 and sarcoma was "not so novel," he said that the study's strength lies more in its methodology, which demonstrates a viable alternate path toward target identification.

"I think [it's] quite novel, using these premature aging diseases as an entryway," he said. "The methodology is very useful in this context."

Scheibye-Knudsen also said that the next steps in this investigation are, in fact, to validate the current findings alongside drug discovery studies, in the hope of eventually launching a clinical trial.

Insilico has also been active in other research partnerships beyond cancer. Earlier this year, for instance, the company forged a partnership with Centogene for the discovery of candidate therapeutic targets for the rare disorder Niemann-Pick disease.

In addition, Insilico recently widened its customer base to include more clients within the financial industry with the launch of its inClinico platform. The platform is designed to predict clinical trial success by identifying weak points in trial design, with the hope of eventually pushing the entire field toward adopting better practices.

"The people who buy inClinico are mostly hedge funds and banks who bet on small and medium-size biotechs," Zhavoronkov said.

Insilico currently employs approximately 40 scientists. In 2020, the company spun out Deep Longevity, a company that develops measures of "biological age" through AI-based systems that track various age-related molecular, cellular, and other biomarkers.