NEW YORK (GenomeWeb) – Mutations in regulatory regions of the genome may play a larger role in cancer than previously thought, according to researchers from Stanford University.
The researchers analyzed whole-genome sequence data of 436 individuals spanning eight cancer subtypes from the Cancer Genome Atlas, as well as data from the Encyclopedia of DNA Elements (ENCODE) Project and other regulatory annotations, to identify point mutations in regulatory regions.
Reporting their results today in Nature Genetics, the team found recurrently mutated regulatory sites across cancers, including both known mutations to the TERT promoter and a number of novel mutated regulatory sites, suggesting that such mutations may have a bigger impact on cancer than previously appreciated.
To identify the regulatory mutations, Mike Snyder's laboratory at Stanford first established an analysis workflow for the TCGA whole-genome data. They used two algorithms, MuTect and VarScan 2, to call single-nucleotide variants (SNVs) in the eight cancer subtypes.
Next, they annotated the mutation set with gene and regulatory information from the gene annotation project Gencode and RegulomeDB, a database of regulatory data that includes data on transcription factors, epigenetic marks, motifs, and DNA accessibility.
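As a rough illustration of this kind of annotation step, the sketch below flags which mutations fall inside a set of regulatory intervals and reports the fraction that do. It is not the authors' pipeline: the function names, the half-open coordinate convention, and the assumption that intervals on a chromosome do not overlap are all hypothetical choices made for the example, and real RegulomeDB or Gencode records carry far more information than a start and an end.

```python
from bisect import bisect_right


def build_index(regions):
    """Group (chrom, start, end) intervals by chromosome and sort them.

    Assumption for this sketch: intervals are half-open [start, end)
    and do not overlap within a chromosome.
    """
    index = {}
    for chrom, start, end in regions:
        index.setdefault(chrom, []).append((start, end))
    for intervals in index.values():
        intervals.sort()
    return index


def is_regulatory(index, chrom, pos):
    """Return True if pos lies inside any indexed interval on chrom."""
    intervals = index.get(chrom, [])
    # Locate the last interval whose start is <= pos.
    i = bisect_right(intervals, (pos, float("inf"))) - 1
    return i >= 0 and intervals[i][0] <= pos < intervals[i][1]


def regulatory_fraction(index, mutations):
    """Fraction of (chrom, pos) mutations that fall in an indexed interval."""
    if not mutations:
        return 0.0
    hits = sum(is_regulatory(index, chrom, pos) for chrom, pos in mutations)
    return hits / len(mutations)
```

A binary search over sorted intervals keeps each lookup logarithmic in the number of regions, which matters when millions of called mutations are checked against a large annotation set.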
Overall, they found that mutations in coding exons represented between 0.036 percent and 0.056 percent of called mutations for each cancer type, while mutations in putative regulatory regions represented between 31 percent and 39 percent of called mutations for each cancer type. The large fraction of regulatory mutations "underscores the potential for regulatory dysfunction in cancer," the authors wrote.
Somewhat surprisingly, though, when the team compared the observed rate of mutations in regulatory and non-regulatory regions with the rate simulated on the basis of chance, there was no difference. That finding contrasts with previous analyses, which showed fewer mutations in coding and regulatory regions than in intergenic and non-coding regions.
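The general idea of comparing an observed count against a chance-based simulation can be sketched as follows. This is a deliberately simplified stand-in, not the method used in the paper: it assumes every mutation lands in a regulatory region independently with a fixed probability, whereas the authors note that mutation rate varies with base-pair type and replication timing. The function names and parameters are hypothetical.

```python
import random


def simulate_null_counts(n_mutations, regulatory_fraction, n_simulations, seed=0):
    """Simulate how many of n_mutations land in regulatory sequence by chance.

    Assumption for this sketch: each mutation falls in a regulatory region
    independently with probability regulatory_fraction.
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(n_simulations):
        hits = sum(rng.random() < regulatory_fraction for _ in range(n_mutations))
        counts.append(hits)
    return counts


def empirical_p_value(observed, null_counts):
    """Two-sided empirical p-value of observed against simulated counts.

    Counts simulations at least as far from the simulated mean as the
    observed value, with a +1 correction so the result is never zero.
    """
    mean = sum(null_counts) / len(null_counts)
    observed_dev = abs(observed - mean)
    as_extreme = sum(abs(c - mean) >= observed_dev for c in null_counts)
    return (as_extreme + 1) / (len(null_counts) + 1)
```

A large p-value here would correspond to the situation the article describes, in which the observed count is indistinguishable from what chance alone would produce.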
The authors attributed this difference to a number of factors. For instance, "in cancer, damaging mutations may be more tolerated owing to dysfunction in the normal apoptotic process," they wrote. In addition, their analysis showed that differences in mutation rate between coding/noncoding and regulatory/non-regulatory regions may be due to "potential false positive mutations from mapping errors and by differences in mutation rate relating to base-pair type and replication timing."
The team identified a number of recurrently mutated genes and regulatory regions and replicated known findings of recurrent mutations in driver genes, including mutations in the coding regions of TP53, AKT1, PIK3CA, PTEN, EGFR, CDKN2A, and KRAS.
They also identified the known recurrent mutations in the TERT promoter, as well as recurrent mutations at eight new loci that lie near, and are therefore potential regulators of, known cancer genes, including GNAS, INPP4B, MAP2K2, BCL11B, NEDD4L, ANKRD11, TRPM2, and P2RY8.
In addition, they found positive selection for mutations in transcription factor binding sites. For instance, mutations in the binding sites of CEBP factors were "enriched and significant across all cancer types," the authors wrote. They also found enrichment for mutations in transcription factor binding sites that were likely to either "destroy the site or increase affinity of the site for transcription factor binding." Such mutations could either inactivate tumor suppressor genes or activate oncogenes.
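One common way to judge whether a point mutation is likely to weaken or strengthen a binding site is to score the reference and mutated sequences against a position weight matrix for the factor's motif. The sketch below illustrates that general technique only; the article does not say how the authors scored binding-site changes, and the function names, the uniform background frequency, and the matrix layout (one dictionary of base probabilities per motif position) are assumptions made for the example.

```python
import math


def log_odds_score(pwm, seq, background=0.25):
    """Sum of per-position log2(probability / background) for seq.

    pwm is a list of dicts mapping each base to a nonzero probability,
    one dict per motif position (assumed layout for this sketch).
    """
    if len(pwm) != len(seq):
        raise ValueError("sequence length must match motif length")
    return sum(math.log2(col[base] / background) for col, base in zip(pwm, seq))


def mutation_effect(pwm, ref_seq, offset, alt_base):
    """Score change caused by substituting alt_base at offset in ref_seq.

    Negative values suggest a weaker match to the motif, positive values
    a stronger one.
    """
    alt_seq = ref_seq[:offset] + alt_base + ref_seq[offset + 1:]
    return log_odds_score(pwm, alt_seq) - log_odds_score(pwm, ref_seq)
```

Under this scoring, a strongly negative change corresponds to a mutation that may destroy a site, while a positive change corresponds to one that may increase affinity.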
"Overall, we expect that many regulatory regions will prove to have important roles in cancer, and the approaches and information employed in this study thus represent a significant advance in the analysis of such regions," the authors wrote.