NEW YORK (GenomeWeb) – Researchers have compiled a list of somatic genomic rearrangements found in 18 human cancers and the genes whose expression is affected by those changes.
Researchers led by Baylor College of Medicine's Chad Creighton used whole-genome sequencing and gene expression data collected by the Cancer Genome Atlas to identify genes whose expression is altered by a somatic structural variant. Though structural variants can influence gene expression, they hadn't yet been characterized on a large scale within a number of cancer types, the researchers noted.
"There is need for a systematic identification and cataloging of genes that are recurrently altered transcriptionally in cancer as a result of genomic rearrangement," they wrote in their paper in Cell Reports today.
The investigators uncovered more than 400 genes that are directly disrupted by structural variants and about 500 additional genes whose expression is influenced by nearby structural variant breakpoints. They noted that such a catalog could provide additional insight into cancer-related processes and pathways, and inform efforts to develop personalized medicine approaches.
Creighton and his colleagues analyzed whole-genome sequencing data collected from 1,493 people with 18 different types of cancer, including bladder urothelial carcinomas, breast invasive carcinomas, cervical squamous cell carcinomas, and endocervical adenocarcinomas, from the Cancer Genome Atlas cohort. RNA sequencing data was also available for 1,448 of the cases, allowing the researchers to combine the whole-genome sequencing and gene expression data.
For 13 of the cancer types, only low-pass whole-genome sequencing (WGS) data, with between 6X and 8X coverage, was available, compared with between 30X and 60X coverage for the others. The researchers argued, however, that low-pass data in combination with other data would still enable them to identify biologically meaningful associations.
In all, they identified more than 85,560 high-confidence somatic structural variations and noted widespread associations between structural variations and changes in gene expression. More than 400 genes were directly disrupted by structural variant breakpoints falling within the gene itself, while about 500 genes were affected by breakpoints located nearby, where they likely affected regulatory elements.
A number of these affected genes have been previously implicated in cancer, the researchers noted. Genes with decreased expression and in which structural variant breakpoints were found included the tumor suppressor genes PTEN, TP53, and RB1. Meanwhile, structural variant breakpoints affecting the cancer driver genes TERT, ERBB2, and CDK4 led to their increased expression.
In particular, the researchers noted an enrichment of structural variants that disrupted the organization of topologically associating domains (TADs), and that these TAD-disrupting structural variants included ones associated with the TERT locus.
Similarly, the researchers reported an enrichment of enhancer hijacking events that involved structural variants. These TAD and enhancer disruptions, the researchers noted, could account for some of the genes that are deregulated by structural variants. However, they added that multiple mechanisms are likely to be involved.
"Future work can further identify and refine the set of cancer-relevant SV-altered gene transcripts, which may involve larger sample numbers and deeper sequencing," the authors wrote.
They noted that the Pan-Cancer Analysis of Whole Genomes consortium is currently analyzing more than 2,800 cancer whole-genome sequencing samples from the Cancer Genome Atlas and the International Cancer Genome Consortium, which could be used to complement and re-evaluate the patterns observed in this study.