NEW YORK (GenomeWeb News) – In a study appearing online today in Nature, members of the 1000 Genomes Project Consortium presented an integrated haplotype map representing the genomic variation present in more than 1,000 individuals from 14 human populations.
"This project provides the next important step towards understanding the function of the rare genetic variants we see across a wide variety of populations," study co-leader Richard Gibbs, director of Baylor College of Medicine's Human Genome Sequencing Center, said in a statement. "With this underpinning, we can go on to solve the puzzle of how this variation plays a part in human disease and health."
Using data on 1,092 individuals tested by low-coverage whole-genome sequencing, deep exome sequencing, and/or dense genotyping, the team looked at the nature and extent of the rare and common variation present in the genomes of individuals within these populations.
In addition to population-specific differences in common variant profiles, the researchers found distinct rare variant patterns within populations from different parts of the world, information that is expected to be important in interpreting future disease studies. They also encountered a surprising number of variants that are expected to affect gene function, such as non-synonymous changes, loss-of-function variants, and, in some cases, potentially damaging mutations.
"Our research has found that each apparently healthy person carries hundreds of rare variants of genes that have a significant impact on how genes work," Oxford University researcher Gil McVean, the study's corresponding author, said in a statement.
The genome, exome, and genotyping data generated for phase I of the 1000 Genomes Project made it possible to identify almost all of the variants present at frequencies as low as 1 percent in the populations studied.
Using this information, the 1000 Genomes team assembled a haplotype map that includes information on 38 million SNPs, 1.4 million small insertions and deletions, and more than 14,000 relatively large deletions.
Population-specific patterns were detected both for common variants (those found in at least 5 percent of the population) and for rarer variants, the researchers reported, with lower-frequency variants showing particularly pronounced differentiation from one geographic region to the next.
Researchers explained that such rare variants, which have generally appeared more recently, can provide new insights into the histories of the populations tested.
Low-frequency and rare variants are also expected to serve as a resource for ongoing and future disease studies. Identifying which variants appear to be in the process of being weeded out by selection, for instance, may show which alterations are potentially damaging and which are more benign.
Given the influence that local effects seem to have on the types of variants turning up in a given population, authors of the study explained that "the interpretation of rare variants in individuals with a particular disease should be within the context of the local (either geographic or ancestry-based) genetic background."
"Moreover," they added, "it argues for the value of continuing to sequence individuals from diverse populations to characterize the spectrum of human genetic variation and support disease studies across diverse groups."
To that end, members of the consortium plan to sequence at least 1,500 individuals from around a dozen previously unsampled populations for the final phase of the 1000 Genomes Project, including an estimated 15 parent-child trios that will be sequenced to higher coverage.
The researchers ultimately hope to look at far more individuals in their effort to understand human genetic variation across and within populations.
"In the future we would like to reach the scale of having a grid of individuals giving us a different genome every couple of square kilometers," McVean noted in a statement, "but there is a long way to go before we can make this a reality."