COLD SPRING HARBOR, NY (GenomeWeb) – Researchers from the University of California, Los Angeles, have developed a new approach to examine how many SNPs present in a region of the genome contribute to the heritability of a trait.
While genome-wide estimates of heritability can give insight into the number of causal SNPs for a trait or disease, these estimates don't provide information about how these SNPs are distributed along the genome.
Ruth Johnson, a graduate student at UCLA, and her colleagues have developed an approach dubbed Bayesian Estimation of Variants in Regions, or BEAVR, to estimate regional heritability using summary statistics from genome-wide association studies. As Johnson reported during a session at the Biology of Genomes meeting here on Wednesday, they applied this approach to study 22 traits using UK Biobank data. They found that the relationship between the number of causal SNPs in a region and heritability is generally linear, but not always. Regions that depart from that relationship could be prioritized for follow-up.
"A trait has a certain amount of heritability or your risk for disease," Johnson said in a later interview. "But then a big question is, is this spread uniformly throughout the genome? Or is this concentrated in one area?"
As Johnson described, BEAVR is a probabilistic approach that relies on both GWAS summary statistics and linkage disequilibrium data. GWAS summary statistics, she said, are now publicly available. For these analyses, Johnson said they used in-sample LD data, though that information is not always publicly available.
BEAVR then outputs an estimate of the number of "causal" SNPs in a region, where "causal" means that those SNPs influence the trait being examined.
Johnson noted, though, that BEAVR becomes less precise for traits that have low heritability.
After assessing BEAVR in simulations, she and her colleagues applied their tool to study 22 traits (including anthropometric, autoimmune-, cardiovascular-, and respiratory-related traits) from the UK Biobank, a sample of about 290,000 individuals. For this analysis, they divided the genome into six-megabase regions to estimate local polygenicity and, for each of those regions, estimate the local heritability and the expected per-SNP contribution to the variance in heritability.
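The regional bookkeeping described above can be illustrated with a minimal Python sketch. It is not BEAVR itself, which is a probabilistic model working from GWAS summary statistics and LD data. The function names, the toy SNP identifiers, and the simple ratio of regional heritability to causal-SNP count are all assumptions made for illustration; only the six-megabase region size comes from the analysis Johnson described.

```python
from collections import defaultdict

# Six megabases, the region size reported in the talk.
REGION_SIZE_BP = 6_000_000


def region_index(position_bp, region_size=REGION_SIZE_BP):
    """Return the 0-based index of the fixed-width region holding a position."""
    return position_bp // region_size


def group_snps_by_region(snps, region_size=REGION_SIZE_BP):
    """Group (chromosome, position_bp, snp_id) tuples by (chromosome, region index)."""
    regions = defaultdict(list)
    for chrom, pos, snp_id in snps:
        regions[(chrom, region_index(pos, region_size))].append(snp_id)
    return dict(regions)


def per_causal_snp_heritability(regional_h2, n_causal):
    """Average heritability per causal SNP (hypothetical summary, not BEAVR's estimator).

    Returns None when no causal SNPs are estimated for the region.
    """
    if n_causal <= 0:
        return None
    return regional_h2 / n_causal


# Toy input: made-up SNP ids and positions, for illustration only.
snps = [
    ("1", 1_500_000, "rsA"),
    ("1", 5_999_999, "rsB"),
    ("1", 6_000_000, "rsC"),
    ("2", 100, "rsD"),
]
regions = group_snps_by_region(snps)
```

In this toy input, the first two SNPs on chromosome 1 fall in the same region, while the third sits exactly on the boundary and starts the next one.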
Previous studies, Johnson said, found that, genome-wide, the total number of SNPs (not just causal ones) increases with chromosome length, and that heritability in turn scales with chromosome size. She and her colleagues sought to determine whether that link held for causal SNPs, and they also found a strong relationship between the number of causal SNPs and heritability. This, she added, suggests there is a distribution of causal SNPs, and not just a "one-hit wonder."
They then examined how this varied on a regional level to see whether it differs from what's observed genome-wide.
For instance, from the UK Biobank data, they estimated height to have a heritability of about 60 percent. When broken down by genomic regions, they found the heritability of height to be spread evenly throughout the genome. In general, Johnson said all the traits they examined followed a similar pattern, suggesting there is a distribution of causal SNPs.
But some regions varied from this pattern. These regions of excess heritability could contain high-effect SNPs. "It's a good way to narrow down where you might want to look," Johnson added.
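One simple way to picture "excess heritability" is to compare each region's heritability per causal SNP against the typical value across regions. The Python sketch below does that with a median-based rule. The rule, the factor of 2, the function name, and the toy numbers are all assumptions for illustration; the talk did not describe how such regions were identified, and this is not the authors' method.

```python
from statistics import median


def flag_excess_regions(n_causal, regional_h2, factor=2.0):
    """Return indices of regions whose heritability per causal SNP exceeds
    `factor` times the median ratio across regions (hypothetical heuristic).

    Regions with no estimated causal SNPs are skipped.
    """
    ratios = {
        i: h2 / n
        for i, (n, h2) in enumerate(zip(n_causal, regional_h2))
        if n > 0
    }
    if not ratios:
        return []
    typical = median(ratios.values())
    return [i for i, r in ratios.items() if r > factor * typical]


# Toy values: four regions follow a linear trend, the fifth does not.
n_causal = [10, 20, 30, 40, 20]
regional_h2 = [0.001, 0.002, 0.003, 0.004, 0.006]
flagged = flag_excess_regions(n_causal, regional_h2)
```

With these made-up numbers, only the last region is flagged, since it carries three times the heritability per causal SNP of the others.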
Additionally, Johnson said she was interested in next examining whether there is a correlation between the number of causal SNPs in a region and the number of genes in that region. She added that she'd also like to then fold in more functional data.
"I think this work we've done so far has ... demonstrated that there are differences between genome-wide and regional levels of polygenicity, but now I think the really cool part comes in trying to assess why that is," she said.