NEW YORK (GenomeWeb) – Researchers from Tel Aviv University (TAU) and collaborators at other institutions have recently published a paper in Cell that describes a computational pipeline they developed to identify so-called synthetic lethal gene pairs in cancer genomes and shows how the tool could help scientists develop more personalized and less toxic therapies as well as repurpose existing drugs to treat the disease.
Synthetic lethality (SL), the concept at the heart of the rather aptly named Data Mining Synthetic Lethality Identification (DAISY) pipeline, which is also described in a Cell preview paper, refers to a situation in which inactivating both genes in a pair is lethal to the cell, while inactivating either gene on its own is not.
This means that "if two genes are synthetically lethal, they are highly unlikely to be inactive together in the same cell," Eytan Ruppin, who runs a lab at TAU's Blavatnik School of Computer Science and is one of the corresponding authors on the study, explained in a statement. Cells with both genes inactivated would be selected against and eliminated from the population.
On the strength of that premise, in its study the TAU-led team used its pipeline to look for "genes that were found to be inactive in some cancer samples, but were almost never found to be inactive together in the same samples."
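The idea of genes that are each inactive in some samples but almost never inactive together can be illustrated with a simple depletion test. The following is a minimal sketch, assuming a one-sided hypergeometric test on per-sample inactivation calls; the function name and the choice of test are assumptions made for illustration and are not taken from the DAISY paper.

```python
from math import comb

def co_inactivation_pvalue(inactive_a, inactive_b):
    """Lower-tail hypergeometric p-value for co-inactivation (hypothetical helper).

    inactive_a, inactive_b: equal-length sequences of booleans, one entry per
    sample, True when the gene is called inactive in that sample.
    Returns the probability of seeing this few (or fewer) samples with both
    genes inactive if the two genes were inactivated independently.
    """
    if len(inactive_a) != len(inactive_b):
        raise ValueError("both genes must be scored on the same samples")
    n_samples = len(inactive_a)
    n_a = sum(inactive_a)  # samples with gene A inactive
    n_b = sum(inactive_b)  # samples with gene B inactive
    n_both = sum(a and b for a, b in zip(inactive_a, inactive_b))
    total = comb(n_samples, n_b)
    # Sum P(X = i) for i = 0..n_both; comb() returns 0 for impossible terms.
    tail = sum(
        comb(n_a, i) * comb(n_samples - n_a, n_b - i)
        for i in range(n_both + 1)
    )
    return tail / total

# Toy data (made up): 20 samples, each gene inactive in 10, never together.
gene_a = [True] * 10 + [False] * 10
gene_b = [False] * 10 + [True] * 10
print(co_inactivation_pvalue(gene_a, gene_b))
```

A small p-value indicates that the pair is co-inactive less often than chance would predict, which is the pattern the researchers describe looking for.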
Predicting SL gene pairs is a complex problem given the sheer number of possible combinations that could be derived from genomic datasets. Earlier efforts to do so used screening technologies designed to detect SL interactions in model organisms and human cell lines, according to the paper. However, these technologies "are not sufficiently broad enough to encompass the large volume of genetic interactions that need to be surveyed across different cancer types," the researchers wrote.
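To give a rough sense of that scale, the number of unordered gene pairs grows quadratically with the number of genes. The sketch below assumes a round figure of 20,000 genes, which is an illustrative assumption and not a number taken from the paper.

```python
from math import comb

# Assumed round number of genes, for illustration only (not from the paper).
N_GENES = 20_000

def count_gene_pairs(n_genes: int) -> int:
    """Number of unordered pairs of distinct genes: n * (n - 1) / 2."""
    return comb(n_genes, 2)

print(count_gene_pairs(N_GENES))  # 199990000
```

Under that assumption there are roughly 200 million candidate pairs, before any distinction between cancer types is made.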
Other groups have developed computational methods to infer these interactions specifically in cancer studies. These methods worked by "mapping SL-interactions in yeast to their human orthologs" or "by utilizing metabolic models and evolutionary characteristics of metabolic genes," the paper explained. Both these methods are "important and significant contributions" to SL studies, but they are also limited, Ruppin told BioInform.
DAISY uses a much broader set of information, according to its developers. It uses a series of statistical inference techniques (survival of the fittest, shRNA-based functional examination, and pairwise gene co-expression) to infer candidate gene pairs directly from cancer genomic data collected from both cell lines and clinical samples. It complements genetic and chemical screens by "narrowing down the number of gene-pairs that need to be examined experimentally to detect SL and [synthetic dosage lethal] interactions in cancer," the researchers wrote. The paper shows for the first time, they claimed, "that genome-wide cancer SL networks can be used to successfully predict gene essentiality, drug response, and clinical prognosis."
Armed with somatic copy number, shRNA, and gene expression data from thousands of cancer samples, the tool looks at all possible gene pairs and checks whether each pair fulfills each of its three statistical criteria. Gene pairs that fulfill all three in a statistically significant manner are predicted to be SL pairs, the paper states. Furthermore, DAISY includes statistical criteria for identifying synthetic dosage lethality (SDL) pairs, in which the overactivity of one gene makes the cell dependent on the activity of the other. "SDL-interactions can permit the eradication of cancer cells with over-active oncogenes that are difficult to target directly (such as KRAS), by targeting the SDL-partners of such oncogenes," they explained.
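The requirement that a pair satisfy all three criteria can be sketched as a simple filter. This is a minimal illustration, assuming each criterion has already been reduced to a p-value per gene pair; the class, field names, threshold, and toy values are hypothetical, and the sketch omits the multiple-testing correction that any genome-wide screen would need.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PairEvidence:
    """Per-pair p-values for the three criteria (hypothetical structure)."""
    gene_a: str
    gene_b: str
    p_survival_of_fittest: float  # from copy number: rarely co-inactive
    p_shrna: float                # from shRNA functional examination
    p_coexpression: float         # from pairwise gene co-expression

def predict_sl_pairs(evidence, alpha=0.05):
    """Keep only pairs that pass every criterion at the assumed threshold."""
    return [
        (e.gene_a, e.gene_b)
        for e in evidence
        if max(e.p_survival_of_fittest, e.p_shrna, e.p_coexpression) < alpha
    ]

# Toy values, made up purely to show the filter.
toy_evidence = [
    PairEvidence("GENE_1", "GENE_2", 0.001, 0.01, 0.02),  # passes all three
    PairEvidence("GENE_1", "GENE_3", 0.001, 0.40, 0.02),  # fails shRNA
    PairEvidence("GENE_4", "GENE_5", 0.20, 0.01, 0.01),   # fails copy number
]
print(predict_sl_pairs(toy_evidence))
```

Taking the maximum of the three p-values is one way to express "all three must be significant"; a pair that is strong on two criteria but weak on the third is dropped.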
In the paper, the researchers reported the results of using DAISY to analyze information from nine different cancer datasets to identify both known and novel SL partners. In one test, they used it to predict the SL partners of the VHL gene, a tumor suppressor, focusing specifically on activity in renal carcinoma cells. They validated their findings using siRNA screens as well as by measuring the cells' response to nine drugs approved to treat conditions such as hypertension and depression.
In another study described in the paper, the researchers used the pipeline to generate genome-wide networks of SL and SDL pairs from their cancer datasets and then tested the interactions DAISY identified in terms of "gene essentiality, clinical prognosis, and drug efficacy." Detailed descriptions of both analyses and associated findings are provided in the paper.
Among their next steps, Ruppin and his colleagues are looking to expand the list of datasets DAISY uses to predict SL pairs. Right now, it relies mainly on copy number, gene expression, somatic mutation, and shRNA data, but there are also sequence, methylation, and proteomics data, among others, to consider, Ruppin said. "We believe that by data mining these additional kinds of data, we can significantly improve the accuracy of the networks," he told BioInform. "Then we can bring in more sophisticated and more advanced learning techniques and based on these data networks get better predictors of drug response, lethality, and prognosis."