NEW YORK – As part of the Uganda Genome Resource (UGR) effort, a team from the UK, Uganda, Nigeria, and elsewhere has profiled genome sequence and SNP patterns in thousands of individuals from rural communities in southwestern Uganda, identifying population structures along with some of the historical movements behind them.
"[T]he UGR was designed to help develop local resources for public health and genomic research, including building research capacity, training, and collaboration across the region," the authors wrote in their study, published online today in Cell. "We envisage that data from these studies will provide a global resource for researchers, as well as facilitate genetic studies in African populations."
To develop that resource, the investigators brought together genotyping array data for 4,778 Ugandan individuals participating in the Uganda Genome-Wide Association Study (UGWAS) and low-coverage genome sequencing data for almost 2,000 participants in the Uganda 2000 Genomes (UG2G) project.
Together, the sequences revealed some 41.5 million SNPs and 4.5 million small insertions and deletions in the Ugandan genomes. Many of those variants were rare, the team found, and 9.5 million were not documented in the 1000 Genomes Project, the African Genome Variation Project, or the UK10K database. Likewise, the Genome Aggregation Database, gnomAD, was missing nearly 29 percent of the SNPs found in the new Ugandan genomes.
The team analyzed the variants alongside available data for individuals profiled previously in other parts of Africa and beyond, identifying 52 population clusters in the region of Uganda they studied, which is home to nine main ethno-linguistic groups.
The authors noted that the self-identified ethno-linguistic groups "should be considered as representing a broad construct that encompasses shared cultural heritage, ancestry, history, homeland, language, or ideology."
The complicated admixture and population structures found in Uganda appeared to reflect known shifts within the country, the researchers reported, as well as arrivals from Burundi, Tanzania, South Sudan, and the Democratic Republic of Congo, and multiple migrations from Rwanda. They noted that participants had ancestry components resembling East African Bantu, Nilo-Saharan, Afro-Asiatic, rainforest hunter-gatherer, Eurasian, and other populations, with Neanderthal sequences pointing to a back-to-Africa source for the detected Eurasian ancestry.
Along with genomes from a subsequent parent-child trio and rare variant analyses to delve deeper into East African demography and population history, the team looked at the insights that may be gained from having a more complete set of sequence variants for the UG2G participants, from allele frequency differences and sequence imputation clues to distinct disease risk patterns.
In particular, when the researchers performed a cardiometabolomic trait-focused genome-wide association study and meta-analysis involving more than 14,100 African individuals, they uncovered new loci linked to lipid, blood cell, and other traits in Africa, including associations involving variants that appear to be rare in populations from other parts of the world.
"Collectively, our findings highlight the utility of genetic resources from diverse populations in novel discovery, especially for population-specific and low-frequency association signals," they wrote. "In this context, differences in frequencies of functional alleles, allelic heterogeneity, and differences in [linkage disequilibrium] structure provide unique opportunities for discovery and resolution of causal loci and a better understanding of the genetic architecture of disease."