Additional Genome Sequences Underscore Genetic Diversity of Africa, Ancestral Migrations

NEW YORK – By sequencing the genomes of hundreds of African individuals, researchers have uncovered millions of previously unknown variants that provide insight into both human health and ancestry.

The Human Heredity and Health in Africa (H3Africa) Initiative began about a decade ago with the aim of expanding genomic capacity in Africa and funding African researchers to conduct genomic studies of interest to African populations. It also aimed to characterize genetic diversity across the continent.

As part of that effort, an international team of researchers analyzed whole-genome sequencing data on some 400 individuals from more than two dozen ethnolinguistic groups generated by ongoing H3Africa studies. As they reported on Wednesday in Nature, the researchers uncovered more than 3 million previously undescribed variants, including ones under strong selection that were involved in immunity, DNA repair, and metabolism. 

The new data additionally provided insight into ancestral admixture within and between the populations studied and suggested one possible route for the influential Bantu expansion.

"African genome variation is likely to be a better representation of variant distribution for both African diaspora and global populations, and, therefore, a full repertoire of African genomic variation could provide a better genomic reference for both medical and population genetics," Baylor College of Medicine's Neil Hanchard and his colleagues wrote in their paper. 

The researchers analyzed sequencing data from 426 individuals. A principal components analysis of the cohort separated the individuals broadly on linguistic and geographic lines, first separating individuals based on whether they were Nilo-Saharan or Afro-Asiatic speakers, as well as separating east African Niger-Congo speakers from other Niger-Congo speakers. The analysis then separated individuals along a west-south geographic line.

Within their cohort, the researchers identified 41 million single-nucleotide variants (SNVs), including 3.4 million that had not been reported previously. They estimated that the novel SNVs accounted for between 2 percent and 5 percent of the SNVs found in each population, and further noted that many of the novel SNVs were found among individuals from ethnolinguistic groups that had not previously been studied.

The novel SNVs were in genes with roles in immune-related functions as well as in non-coding regions thought to regulate traits linked to chronic kidney disease, among others. 

The researchers additionally teased out signals of selection within each population to find that, among individuals from Botswana, genes involved in metabolism were under selection, while genes involved in DNA maintenance were under selection among Gur speakers from west Africa as well as among Cameroonians. 

They further annotated their dataset using the American College of Medical Genetics and Genomics Secondary Findings gene panel and found eight individuals who had reportable variants, indicating that pathogenic variants in clinically relevant genes were rare. However, when they examined variants labeled as pathogenic by the ClinVar database, nearly everyone in the cohort had a suspected pathogenic allele, suggesting that a number of variants within ClinVar are misclassified.

Besides potential medical applications, the genomic data the researchers analyzed also provided further insight into human migration and admixture in Africa, including the Bantu expansion. For instance, principal component and identity-by-descent sharing analyses indicated that Bantu speakers from Zambia are more closely related to Bantu speakers from Uganda and Botswana than they are to other central African populations. Further, admixture tests found that Bantu speakers from Zambia are the most likely central African source population for Bantu speaker-linked ancestry in east and south Africa, and additional analyses indicated that populations from Angola were the closest central or central-west African population to Bantu speakers from Zambia.

Based on this, the researchers suggested that Zambia was most likely an intermediate site during the Bantu expansion to east and south Africa.