Global Biobank Meta-Analysis Initiative Brings Greater Diversity to Global GWAS

NEW YORK – The Global Biobank Meta-analysis Initiative (GBMI), a consortium that aims to make genome-wide association studies (GWAS) more representative of populations worldwide, made its debut Wednesday with a series of papers published in Cell Genomics.

Genetic studies have long been biased toward individuals of European descent, both in clinical trials in humans and in primary research done in cells.

A team from MIT and Massachusetts General Hospital introduced the initiative in a paper detailing its scope, goals, and methods.

Spanning 23 biobanks across four continents and representing more than 2.2 million consenting individuals from six major ancestral groups with genetic data linked to electronic health records, the consortium identified 317 known and 183 new genes associated with 14 diseases, from asthma and gout to certain cancers.

Importantly, the initiative showed that it is possible to integrate genetic association results from the various biobanks, despite biobank-specific differences in location, sample size, genotyping and phenotyping approach, sample ancestry, and recruitment strategy. This, the authors wrote, makes it possible to conduct some of the largest GWAS of certain diseases to date.
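As a rough illustration of how per-biobank association results can be pooled, the sketch below applies fixed-effect inverse-variance weighting, a standard meta-analysis technique, to one variant's effect estimates. The function name `ivw_meta` and the input numbers are hypothetical, and the sketch does not claim to reproduce the consortium's actual pipeline.

```python
import math

def ivw_meta(betas, ses):
    """Fixed-effect inverse-variance weighted meta-analysis for one variant.

    betas: per-biobank effect estimates (e.g. log odds ratios)
    ses:   per-biobank standard errors of those estimates
    Returns the pooled effect, its standard error, and the z-score.
    """
    # Each study is weighted by the inverse of its sampling variance,
    # so larger, more precise biobanks contribute more.
    weights = [1.0 / (se ** 2) for se in ses]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se, pooled_beta / pooled_se

# Hypothetical effect estimates for the same variant in three biobanks.
beta, se, z = ivw_meta([0.12, 0.08, 0.15], [0.03, 0.05, 0.10])
print(f"pooled beta={beta:.4f}, se={se:.4f}, z={z:.2f}")
```

Because the weights depend only on each study's standard error, biobanks of very different sizes can be combined without sharing individual-level data.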

During manuscript preparation, the Uganda Genome Resource also joined the GBMI, bringing the current total number of biobanks to 24, spread across five continents.

In a second study, the University of Bristol's Huiling Zhao and colleagues analyzed GBMI data to estimate the causal roles of more than a thousand proteins in eight complex diseases in both African and European populations.

Although greater efforts are underway to narrow diversity gaps, people of European descent remain heavily overrepresented in most GWAS. This restricts researchers' ability to identify specific disease-associated protein regions or protein quantitative trait loci (pQTLs), and limits opportunities to identify multi-ancestry and ancestry-specific protein-disease associations.

GBMI data enabled Zhao and colleagues to analyze 1,311 and 1,310 proteins in African and European populations, respectively, using Mendelian randomization (MR), an analytical method that treats genetic variants as instrumental variables to evaluate whether risk factors causally affect phenotypes such as disease.
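For readers unfamiliar with MR, the simplest single-variant estimator is the Wald ratio: the variant's effect on the disease divided by its effect on the exposure, here a protein level. The sketch below is a minimal illustration under that assumption. The function name `wald_ratio` and the numbers are hypothetical, and the study's actual analysis may have used other estimators.

```python
def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Single-instrument Mendelian randomization via the Wald ratio.

    beta_exposure: variant effect on the exposure (e.g. a pQTL effect on protein level)
    beta_outcome:  effect of the same variant on the outcome (e.g. disease log odds)
    se_outcome:    standard error of beta_outcome
    Returns the causal effect estimate and an approximate standard error.
    """
    if beta_exposure == 0:
        raise ValueError("The variant must be associated with the exposure.")
    estimate = beta_outcome / beta_exposure
    # First-order approximation that ignores uncertainty in beta_exposure.
    se = se_outcome / abs(beta_exposure)
    return estimate, se

# Hypothetical summary statistics for one variant.
estimate, se = wald_ratio(beta_exposure=0.5, beta_outcome=0.1, se_outcome=0.02)
print(f"causal estimate={estimate:.3f}, se={se:.3f}")
```

The estimate is only interpretable as causal if the variant influences the disease solely through the protein, which is why MR results are typically checked with additional evidence such as colocalization.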

They identified 45 protein-disease pairs in people of African ancestry and in those of European ancestry with MR and genetic colocalization evidence in both. They further uncovered two protein-disease pairs with MR evidence in both ancestries, while seven showed likely European-specific causal effects and seven showed African-specific effects. By combining these results with clinical trial evidence, the team prioritized 16 pairs for investigation in future drug trials.

A group based in Japan and led by Shinichi Namba and Takahiro Konuma of the Osaka University Graduate School of Medicine additionally introduced practical guidelines for genomics-driven drug discovery in cross-population meta-analyses using information from the GBMI.

Their guidelines utilize three techniques for in-depth, genomics-driven drug discovery that work across populations. These consist of overlap enrichment of disease risk genes with existing drug targets to identify drug repurposing opportunities, endophenotype MR with subsequent quality controls to establish causal links between proteins and disease processes, and screening negative correlations between genetically regulated disease case-control gene expression and compound-regulated gene expression profiles to identify compounds that might correct disease-related gene expression alterations.
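The third technique, screening for negative correlations, can be pictured with a toy example: a compound whose expression signature runs opposite to the disease signature is a candidate for reversing disease-related changes. The sketch below uses a plain Pearson correlation. The function names, the threshold, and the gene-level values are all hypothetical and are not taken from the study.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def screen_compounds(disease_signature, compound_signatures, threshold=-0.5):
    """Return (compound, correlation) pairs at or below the threshold.

    disease_signature:   per-gene expression changes in cases versus controls
    compound_signatures: dict mapping compound name to per-gene expression
                         changes, in the same gene order
    threshold:           hypothetical cutoff; more negative means a stronger
                         opposing signature
    """
    hits = []
    for name, signature in compound_signatures.items():
        r = pearson(disease_signature, signature)
        if r <= threshold:
            hits.append((name, r))
    # Most strongly anti-correlated compounds first.
    return sorted(hits, key=lambda hit: hit[1])

# Hypothetical signatures over four genes.
disease = [1.0, -0.5, 2.0, -1.5]
compounds = {
    "compound_a": [-1.0, 0.5, -2.0, 1.5],  # opposes the disease signature
    "compound_b": [1.0, -0.5, 2.0, -1.5],  # mirrors the disease signature
}
print(screen_compounds(disease, compounds))
```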

They applied this framework to 13 common diseases, identifying 266 drug/compound-disease pairs for possible drug repositioning in disorders ranging from certain types of blood clot to immune signaling pathways in gout.

In the fourth study in this series, researchers from the Norwegian University of Science and Technology and the University of Michigan presented findings from the Trøndelag Health Study (HUNT), a Norwegian population health study begun in 1984, which counts roughly 229,000 participants, approximately 88,000 of whom have provided genetic information.

Through a genetic discovery strategy incorporating genotyping, sequencing, and imputation-based approaches, HUNT researchers have gained insights into the mechanisms of cardiovascular, metabolic, osteoporotic, and liver-related diseases. Over the course of the study, HUNT has inspired similar longitudinal studies across more diverse populations.

In a fifth study, an international team of researchers conducted a large-scale meta-analysis of idiopathic pulmonary fibrosis (IPF), a genetic disease whose rarity complicates research. Querying 13 biobanks around the globe and incorporating 11,160 patients from six ancestry groups, the team discovered seven new IPF-related gene markers, including several involved in lung function and COVID-19 response, as well as sex-specific effects. Importantly, the researchers estimate that only one of the new markers would have been identified had the analysis been limited to people of European ancestry.

Meanwhile, Arjun Bhattacharya of the University of California, Los Angeles, and colleagues presented a pipeline for using GBMI data to conduct transcriptome-wide association studies (TWAS). By linking genetically predicted gene expression to traits, TWASs boost detection power and provide biological context to genetic associations.

The UCLA-led team laid out practical considerations for ancestry and tissue specificity, and meta-analytic strategies, as well as open challenges found at each step of the framework. Their pipeline establishes a foundation for adding transcriptomic context to biobank-linked GWAS, with the potential to accelerate genomic medicine through ancestry-specific expression models.

Finally, researchers from the Taiwan Biobank (TWB), a member of the GBMI, presented an overview of that biobank's cohort design, phenotype availability, genomic data generation, sample characteristics, genetic discoveries to date, and data access and sharing policy.

The TWB consists of more than 150,000 people largely of Han Chinese ancestry, whose genetic data researchers are linking to Taiwan's National Health Insurance database and other registries to improve genotype-phenotype relationships and enable deep, longitudinal genetic investigations.

Although biased toward male participants and currently lacking children in its cohort, the TWB provides one of the largest biobank resources for East Asian populations and promises to contribute to understanding the genetic basis of health and disease in global populations through collaborative and comparative studies with other biobanks.

"The aims of the GBMI are to increase the power to discover genetic variation associated with phenotypes for GWAS analyses, increase replication power, and determine more accurate polygenic risk scores," Laura Zahn, editor-in-chief of Cell Genomics, said in a statement. "Their work is helping to provide new insights into the underlying biology of human diseases and traits."