NEW YORK – A Stanford University team has tracked down sets of linked causal variants at genetic association sites tied to gene expression in humans.
Their study, published in Science on Thursday, used a multiplexed fine-mapping method that takes linkage disequilibrium into account.
"Statistical associations between genetic variants and human traits typically do not resolve to single obvious mutations that cause the effect we care about," first and co-corresponding author Nathan Abell, a researcher at Stanford University, said in an email.
Because a given association peak may contain several suspect variants, including variants with unclear mechanisms of action, Abell explained, it can be difficult to choose individual variants for validation testing to home in on the causal ones.
With that in mind, the team tapped into multiplex strategies to systematically assess the function of causal mutation candidates across association peaks, focusing on variants influencing gene expression at loci flagged during past genome-wide association studies.
Using a targeted sequencing-based massively parallel reporter assay (MPRA) approach, the researchers searched for causal variants at expression quantitative trait loci (eQTL) associated with more than 700 eQTL-impacted "eGene" regions reported in a past study involving individuals in Utah, Northern Europe, and Western Europe.
From the variants at each eQTL peak, including a median of half a dozen lead variants found in linkage disequilibrium with one another, the team came up with a set of nearly 31,000 variants to assay using a library of more than 49,000 randomly barcoded alleles.
"For each variant, we computed the allele-independent regulatory effects of an oligo ('expression' effects) and the difference between reference and alternative allele-containing oligos ('allelic' effects)," the authors explained, noting that these analyses unearthed more than 8,500 expression and nearly 1,300 allelic effects.
Based on the MPRA predictions and follow-up analyses, the investigators teased out causal variants with ties to transcription factor activity, histone modification patterns, chromatin accessibility, and other regulatory features.
In particular, the team saw that a significant subset of the regulatory sites they identified, at least 17.7 percent, included multiple causal variants in linkage disequilibrium. Relatively weak individual variant effects and additive effects within a haplotype were also quite common.
Through a series of subsequent analyses, meanwhile, the researchers used MPRA features and other insights to home in on causal variants, including clusters of linked causal variants, that contribute to a range of human traits or conditions such as asthma, inflammatory bowel disease, and multiple sclerosis.
Abell noted that the growing collections of whole-genome sequence and other human genetic data will likely increase investigators' ability to continue uncovering common and rare variants behind the genetic association signals being identified.
"[W]e demonstrate that MPRAs provide a scalable platform with which to separate and map the regulatory activities of expression- and complex trait-associated natural genetic variants and highlight the limitations of existing approaches to variant interpretation and computational fine mapping," he and his co-authors wrote.