NEW YORK – An international team of researchers has identified 191 likely breast cancer target genes in 150 disease risk regions using a fine-mapping technique that analyzes gene expression, chromatin interaction, and functional annotations.
In a study published on Tuesday in Nature Genetics, the team noted that while genome-wide association studies (GWAS) have identified breast cancer risk variants in more than 150 genomic regions, the mechanisms underlying risk remain largely unknown. They aimed to fine-map all known breast cancer susceptibility regions using dense genotype data on more than 217,000 subjects participating in the Breast Cancer Association Consortium (BCAC) and the Consortium of Investigators of Modifiers of BRCA1/2 (CIMBA).
All samples were genotyped, and the researchers used stepwise multinomial logistic regression to identify independent association signals in each region and to define credible causal variants (CCVs) within each signal. They found genomic features significantly overlapping the CCVs.
They then analyzed these regions by combining association analysis with in silico genomic feature annotations, defining 205 independent risk-associated signals with credible causal variants in each region. In parallel, they used a Bayesian approach that combined genetic association, linkage disequilibrium, and enriched genomic features to determine variants with high posterior probabilities of being causal. They then applied their INQUISIT pipeline, which uses gene expression, chromatin interaction, and functional annotations, to prioritize genes as targets of those potentially causal variants.
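To illustrate what a posterior probability of causality means in fine-mapping, the following is a minimal Python sketch, not the researchers' actual method. It assumes exactly one causal variant per signal and takes per-variant log Bayes factors as given. The function names, the optional prior weights (standing in for upweighting variants in enriched genomic features), and the 95 percent coverage threshold are all hypothetical choices made for illustration.

```python
import math


def posterior_probabilities(log_bfs, priors=None):
    """Posterior probability that each variant is the causal one.

    Assumes a single causal variant per signal (an illustrative
    simplification). `log_bfs` are natural-log Bayes factors for
    association; `priors` are optional per-variant prior weights,
    e.g. larger for variants in enriched genomic features.
    """
    n = len(log_bfs)
    if priors is None:
        priors = [1.0 / n] * n  # flat prior when no annotation is used
    # Work in log space and subtract the maximum for numerical stability.
    log_weights = [lb + math.log(p) for lb, p in zip(log_bfs, priors)]
    top = max(log_weights)
    weights = [math.exp(x - top) for x in log_weights]
    total = sum(weights)
    return [w / total for w in weights]


def credible_set(pps, coverage=0.95):
    """Indices of the smallest set of variants whose posterior
    probabilities sum to at least `coverage` (hypothetical threshold)."""
    order = sorted(range(len(pps)), key=lambda i: pps[i], reverse=True)
    chosen, cumulative = [], 0.0
    for i in order:
        chosen.append(i)
        cumulative += pps[i]
        if cumulative >= coverage:
            break
    return chosen
```

With a flat prior, two variants with equal Bayes factors split the posterior evenly. Supplying unequal prior weights shifts the posterior toward the variant that falls in an enriched annotation.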
"Known cancer drivers, transcription factors, and genes in the developmental, apoptosis, immune system, and DNA integrity checkpoint gene ontology pathways were over-represented among the highest-confidence target genes," the authors wrote.
From the 150 genomic regions, the researchers identified 352 independent risk signals containing 13,367 CCVs. The number of signals per region ranged from one to 11, with 79 regions containing multiple signals. In 42 signals, there was only a single CCV; for these signals, the investigators believed that the simplest hypothesis was that the CCV was causal.
Using a case-only analysis for the 196 signals in which they found strong evidence for the variant to be credibly causal, they found 66 signals where the lead variant conferred a greater relative risk of developing estrogen receptor (ER)-positive tumors and 29 where the lead variant conferred a greater risk of ER-negative tumors. The remaining 101 signals showed no difference by ER status.
To increase their power to identify ER-negative signals, the researchers then performed a fixed-effects meta-analysis, combining association results from BRCA1 mutation carriers in CIMBA with the BCAC ER-negative association results. This meta-analysis identified 10 additional signals, making 206 strong-evidence signals containing 7,652 CCVs in total.
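A fixed-effects meta-analysis of this kind is commonly computed by inverse-variance weighting. The Python sketch below shows that standard calculation for one variant across several studies. It is an illustration under stated assumptions rather than the team's own code: the function name and inputs (per-study log odds ratios and their standard errors) are hypothetical.

```python
import math


def fixed_effects_meta(betas, ses):
    """Inverse-variance-weighted fixed-effects meta-analysis.

    `betas` are per-study effect estimates (e.g. log odds ratios) and
    `ses` their standard errors. Returns the pooled estimate, its
    standard error, the z-score, and a two-sided normal p-value.
    """
    weights = [1.0 / (se ** 2) for se in ses]  # precise studies count more
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided tail of N(0, 1)
    return beta, se, z, p
```

Because the pooled standard error shrinks as studies are added, a signal that falls short of significance in either dataset alone can reach it in the combined analysis, which is the gain in power described above.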
When they constructed a database of mapped genomic features in seven primary cells derived from normal breast and 19 breast cell lines using publicly available data, the researchers found significant enrichment of CCVs in open chromatin, actively transcribed genes, gene regulatory regions, and transcription factor binding sites. They also found that CCVs co-localize with variants that control local gene expression, and that transcription factors and known somatic breast cancer drivers are over-represented among prioritized target genes.
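One simple way to ask whether CCVs fall in an annotation more often than chance would predict is a one-sided hypergeometric test. The Python sketch below shows that generic test, which is not necessarily the one the researchers used. The function and parameter names are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(total_variants, annotated_variants,
                           ccv_count, ccv_annotated):
    """One-sided hypergeometric p-value for enrichment.

    Probability of seeing at least `ccv_annotated` annotated variants
    among `ccv_count` CCVs, if the CCVs were drawn at random from
    `total_variants` variants of which `annotated_variants` overlap
    the genomic feature.
    """
    denominator = comb(total_variants, ccv_count)
    upper = min(annotated_variants, ccv_count)
    numerator = sum(
        comb(annotated_variants, k)
        * comb(total_variants - annotated_variants, ccv_count - k)
        for k in range(ccv_annotated, upper + 1)
    )
    return numerator / denominator
```

A small p-value indicates that the overlap between CCVs and the feature is larger than expected under random sampling. This sketch ignores linkage disequilibrium and other correlations between variants, which a real analysis would need to account for.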
In total, they identified 191 target genes supported by strong evidence. Significantly more genes were targeted by multiple independent signals than expected by chance. Target genes included 20 that were prioritized via potential coding/splicing changes, 10 via promoter variants, and 180 via distal regulatory variants.
"We found that 23 percent of the ER-positive target genes were classified within developmental process pathways (including mammary development), 18 percent were classified in immune system pathways, and a further 17 percent were classified in nuclear receptor pathways," the authors wrote. "Of the genes targeted by ER-neutral signals, 21 percent were classified in developmental process pathways, 19 percent were classified in immune system pathways, and a further 18 percent were classified in apoptotic process pathways."
The study also revealed novel pathways, the researchers said, including tumor necrosis factor-related apoptosis-inducing ligand (TRAIL) signaling, the AP-2 transcription factors pathway, and regulation of IκB kinase/nuclear factor-κB (NF-κB) signaling, the last of which is specifically over-represented among ER-negative target genes. They also found significant over-representation of additional carcinogenesis-linked pathways, including cyclic adenosine monophosphate, NOTCH, phosphoinositide 3-kinase, RAS and WNT/β-catenin, and of receptor tyrosine kinase signaling, including fibroblast growth factor receptor, epidermal growth factor receptor and transforming growth factor-β receptor.
"These analyses provide strong evidence for more than 200 independent breast cancer risk signals, identify the plausible cancer variants, and define likely target genes for the majority of these," the authors concluded. "However, notwithstanding the enrichment of certain pathways and transcription factors, the biological basis underlying most of these signals remains poorly understood. Our analyses provide a rational basis for such future studies into the biology underlying breast cancer susceptibility."