Arab Genome Study Improves Understanding of Migration Patterns, Haplotype Imputation

NEW YORK – Researchers in Qatar and the US have published a high-resolution map of the genetic structure of Arab and Middle Eastern populations, revealing new insights into human migration patterns in the region and helping to improve the imputation of Arab genomes.

In a paper published on Tuesday in Nature Communications, members of the Qatar Genome Programme and their colleagues presented an in-depth analysis of 6,218 genomes of predominantly healthy individuals recruited from the general population at the Qatar Biobank. The analysis showed extensive diversity as well as genetic ancestries representing the main founding Arab genealogical lineages of Qahtanite (Peninsular Arabs) and Adnanite (General Arabs and West Eurasian Arabs).

Overall, the researchers built a reference panel of 12,432 haplotypes, demonstrating improved genotype imputation for both rare and common alleles in Arabs and the wider Middle East.

Analyzing migration patterns, they found that Peninsular Arabs are the closest relatives of ancient hunter-gatherers and Neolithic farmers from the Levant. Founder Arab populations experienced multiple splitting events 12,000 to 20,000 years ago, a timing consistent with Arabia becoming more desert-like and with farming in the Levant, developments that gave rise to settler and nomadic communities. In terms of recent gene flow, the researchers also found that these ancestries contributed significantly to European, South Asian, and South American populations, likely as a result of Islamic expansion over the past 1,400 years.

They also found that Middle Eastern populations have relatively high levels of consanguinity — about 20 percent to 50 percent compared to less than 0.2 percent in western Europe and the Americas — resulting in the accumulation of long runs of homozygosity, or ROH, which can be associated with autozygosity of deleterious founder mutations. When the researchers performed ROH analysis at large scale using whole-genome sequencing data on the QGP cohort, the QGP subpopulations were shown to have the least short and medium ROH, reflecting their closest links to the early populations that migrated out of Africa. However, Peninsular Arabs, General Arabs, and Arabs of West Eurasia and Persia had high proportions of long ROH, reflecting their recent interbreeding.

Notably, the researchers characterized a cohort of 1,491 men with the ChrY J1a2b haplogroup and identified 29 unique sub-haplogroups.

Recent studies of the Y chromosome and mitochondrial DNA have given useful insights into world population migration, including the expansion of people from the Middle East. Consistent with known mtDNA maps, the researchers saw that African Arabs and South Asian Arabs in the QGP had predominantly L and M haplogroups, respectively, while the other QGP subpopulations had more diverse haplogroups, with Peninsular Arabs being the least heterogeneous.

In contrast to the observed mtDNA diversity, and in line with patrilocal practices in the region, 56.7 percent of male subjects belonged to the ChrY J1 haplogroup clade. J1 is known to be prevalent in the Levant and the Arabian Peninsula, particularly in Yemen. This haplogroup is almost universal among Peninsular Arabs (99.1 percent), abundant in General Arabs (77 percent), and modestly present in Arabs of West Eurasia and Persia (16.7 percent), South Asian Arabs (12.5 percent), and African Arabs (7.9 percent).

Overall, the researchers said, these findings highlight the migratory history of the Arabian Peninsula and Levant regions, the presence of strong barriers to intermarriage outside tribal groups, and the dominant influence on gene flow of female movement into Arabia from other geographical regions.

"A dedicated QGP imputation panel was generated to leverage this dataset, which shall complement the currently available panels by providing more accurate imputation of Arab and Middle Eastern genomes," the authors concluded. "This will enable association studies with greater scale and statistical power to detect causal variants underlying biological traits and diseases."