Analysis of NatGeo Genographic Project Data Reveals Complexity of US Population

NEW YORK – A team from the US, Taiwan, and Finland has performed a new population genomic analysis of the United States, revealing more complex ancestral patterns than those documented in the past.

The work "provides important lessons for the future," co-senior and corresponding author Alicia Martin, a researcher affiliated with the Broad Institute and Massachusetts General Hospital, said in a statement.

"[I]f we want genetic technologies to benefit everyone," she explained, "we need to rethink our current approach for genetic studies because they typically miss a huge swath of American — and more broadly human — diversity."

As they reported in the American Journal of Human Genetics on Thursday, Martin and her colleagues combined geographic and biographic information with array-based genotyping profiles for more than 32,000 participants in the non-profit National Geographic Genographic Project. Using genotype phasing, population fine structure, and other analyses, they explored the migration histories and ancestral relationships of populations in different parts of the country.

National Geographic started the project in 2005 and said last year that it had stopped selling genotyping kits but would continue research using its database, which includes about a million individuals.

"The population of the United States is shaped by centuries of migration, isolation, growth, and admixture between ancestors of global origins," Martin and her colleagues wrote, noting that the current findings "provide detailed insights into the genetic structure and demographic history of the diverse US population."

For their large-scale analyses, the researchers relied on genotyping profiles for 32,589 Genographic Project and Geno 2.0 Project participants, focusing on more than 108,000 ancestry-informative SNPs profiled with the validated custom Illumina GenoChip array, in combination with self-reported ethnicity, geographic, and ancestral data.

"Many previous studies have investigated specific population histories in the US at relatively small scales — on the order of hundreds to thousands of individuals. These studies have provided deep insights into many specific populations," the authors wrote, though they noted that recent work suggests "population structure is inaccurately captured in small samples sizes."

With help from 1000 Genomes Project data and a random forest classifier approach, the team teased out broad ancestry and genetic diversity patterns for populations living in the West, Midwest, Northeast, or South, according to US Census data. That view was refined further using population substructure visualization tools, fineSTRUCTURE analyses, haplotype clustering, identity-by-descent analyses involving 31,783 individuals, and other population genetic approaches.

Native American ancestry, in general, was most common in individuals from the American West and South, the researchers reported, while Hispanic/Latino individuals often lived in places like California, New Mexico, Texas, or Florida.

The latter group included individuals with ancestry from Native American, European, and African populations, though the precise ancestral populations and the proportions of them varied from one location to the next.

In California, for example, the Hispanic/Latino populations tended to have higher proportions of Native American ancestry, and of ancestry resembling that in Central and South American populations, than did Hispanic/Latino populations in New Mexico or Texas.

The team's population structure analyses also revealed distinct genetic ancestry patterns in African American individuals from the northern US compared to those in the south, consistent with the known differences in population mixing, segregation, and migration patterns in different parts of the country before and after the transatlantic slave trade.

"Our results show, at a finer scale, the barriers to [African American] migration and gene flow, particularly along the Appalachian Mountains," they explained. "This migration barrier overlaps with the boundary between slave states and free states, as well as the boundary between states that enacted laws enforcing racial segregation and states that forbade segregation."

The researchers saw pockets of populations with distinct proportions of Southern European, Central European, and Northwestern European ancestry in different regions, including Finnish, Acadian, Ashkenazi Jewish, and other founder populations.

They also identified dozens of genetically differentiated South Asian and East Asian population clusters at sites in the West, Northeast, and other parts of the country, along with signs of past or present inter-relatedness in some Asian and Middle Eastern populations.

"[T]he long [sum of runs of homozygosity] in South Asians may reflect endogamy related to the caste system in India," the authors speculated, "while similar patterns among the Middle Eastern and Southeast Asian clusters may be capturing consanguineous marriage practices in those regions."