While the human reference genome is done, it's not quite complete. Researchers have had trouble accessing a number of regions of the genome, especially those near heterochromatic regions and those in sequences containing many repeats. A team led by the Broad Institute's Steven McCarroll turned to an admixture mapping approach to open up those regions, as they describe in this week's American Journal of Human Genetics.
In particular, they turned to genomes from Latino people, as many Latinos have ancestors from three different continents. "Latino populations have a relatively distinctive gift to give. Having some recent African ancestry, but just a little, can yield especially powerful information about what the structure of the human genome is in all populations," McCarroll said in a statement.
By applying their method to whole-genome sequence data from 242 Latinos, McCarroll and his colleagues were able to uncover 20 million base pairs of missing sequence.
"Despite this effort, even more sequence remains unlocalized … or missing from the current human reference genome," the investigators note.