NEW YORK – Researchers have found that polygenic risk scores (PRS) derived from multi-ancestry genome-wide association studies (GWAS) generally outperform PRS based on single-ancestry GWAS in understudied populations.
In a paper published in Cell Genomics on Thursday, investigators from the Broad Institute and their colleagues turned to a combination of simulations and empirical studies to explore how the genetic architecture and ancestry composition of GWAS discovery cohorts affect the predictive power of PRS across diverse populations.
While research cohorts have recently started to become more diverse, most GWAS so far have used data from populations of European ancestry, which have been more readily available and in larger sample sizes. "Recently developed statistical methodologies leverage the increasing diversity of GWAS data to improve PRS portability. However, the effect of genetic architecture, ancestry composition of GWAS discovery cohorts, and PRS construction methodologies on cross-ancestry predictive accuracy remain largely unclear," the authors wrote.
For their new study, they conducted large-scale population genetic simulations. In addition, they used genomic data from BioBank Japan and the UK Biobank across traits with distinct genetic architectures and compared the performance of PRS derived from single-ancestry and multi-ancestry GWAS.
Their findings showed that PRS from multi-ancestry GWAS generally had better predictive accuracy in understudied populations than PRS from a single-ancestry GWAS, mainly of European ancestry. "The extent of improvement was influenced by factors such as sample size ratios between European ancestry GWAS and minor GWAS, genetic architecture, PRS methodology, and linkage disequilibrium reference panels," the authors wrote.
Their analysis also revealed that directly meta-analyzing GWAS datasets from diverse ancestral groups improved PRS accuracy more than linearly combining PRS. The authors noted that this finding supports the assumption that causal variants are shared between ancestries.
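For readers unfamiliar with the distinction, the two strategies can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, effect sizes, standard errors, genotype dosages, and mixing weights below are all hypothetical, and the meta-analysis shown is a generic fixed-effect, inverse-variance-weighted scheme that is assumed here only for illustration. In practice, mixing weights would typically be tuned in a validation sample, and PRS methods also account for linkage disequilibrium, which this sketch ignores.

```python
def meta_analyze(betas, ses):
    """Fixed-effect inverse-variance-weighted meta-analysis for one variant.

    betas: per-cohort effect estimates; ses: their standard errors.
    Returns the pooled effect and its standard error.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, betas)) / total
    return pooled, (1.0 / total) ** 0.5


def prs(effects, dosages):
    """Polygenic score: sum of allele dosages weighted by effect sizes."""
    return sum(b * g for b, g in zip(effects, dosages))


def combine_prs(scores, mix):
    """Linear combination of separately computed scores."""
    return sum(w * s for w, s in zip(mix, scores))


# Hypothetical summary statistics for three variants in two cohorts.
eur_betas, eur_ses = [0.10, -0.20, 0.05], [0.01, 0.02, 0.01]
eas_betas, eas_ses = [0.14, -0.10, 0.01], [0.02, 0.04, 0.02]
dosages = [2, 0, 1]  # hypothetical allele counts for one individual

# Strategy 1: pool the GWAS effect sizes first, then score once.
meta_betas = [
    meta_analyze([b1, b2], [s1, s2])[0]
    for b1, s1, b2, s2 in zip(eur_betas, eur_ses, eas_betas, eas_ses)
]
prs_meta = prs(meta_betas, dosages)

# Strategy 2: score with each GWAS separately, then mix the two scores.
# The 0.7 / 0.3 weights are placeholders, not fitted values.
prs_linear = combine_prs(
    [prs(eur_betas, dosages), prs(eas_betas, dosages)], [0.7, 0.3]
)

print(prs_meta, prs_linear)
```

The difference is where the information is pooled: the first strategy weights each variant by the precision of its estimate in each cohort, while the second applies a single weight to each cohort's entire score.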
Another finding was that leveraging GWAS in admixed populations by accounting for local ancestry improved PRS predictive performance in understudied populations even without direct access to individual genotypes of admixed populations.
Among the limitations of their study, the authors noted that their analysis focused on common variants, while population-enriched variants have lower frequencies in the overall population. "The role of such variants in polygenic prediction is worth exploring across phenotypes when there are sufficient sample sizes for different ancestral populations," they wrote.
"In summary, there is no one-size-fits-all approach for constructing PRS, as the optimal approach depends on genetic architecture, ancestry composition, statistical power, and other factors," they concluded, adding that their simulations and analyses provide guidelines for future work to generate such scores.