Using a whole-genome, targeted enrichment scheme, researchers from Stanford University and elsewhere are boosting the representation of ancient human DNA in sequencing libraries — uncovering thousands of previously undetected variants that are helping them understand historical human migrations and relationships.
The whole-genome in-solution capture, or WISC, approach involves using modern genomic DNA to produce biotinylated RNA probes that can nab corresponding stretches of ancient human sequence out of solution, leaving behind sequences from microbes or other environmental contaminants, explained Meredith Carpenter, a post-doctoral researcher in Carlos Bustamante's Stanford University lab.
Carpenter presented information on the WISC strategy — along with preliminary results from an analysis of a dozen ancient human samples ranging in age from more than 500 to nearly 3,500 years old — at the annual Biology of Genomes meeting at Cold Spring Harbor Laboratory last week.
"Our sequence capture approach has allowed us to access [ancient DNA] from many specimens that were previously unsuitable for sequencing due to their low endogenous DNA contents," Carpenter and her co-authors wrote in the abstract accompanying that presentation.
Also participating in the study are researchers from Stanford University, the Natural History Museum of Denmark's Centre for GeoGenetics, and the National Institute of Archaeology at the Bulgarian Academy of Sciences.
In principle, Carpenter noted during her presentation, the WISC strategy resembles in-solution capture methods that have been applied to protein-coding portions of the genome.
For their current analyses, though, she and her colleagues were keen to target the full human genome, in the hopes of bumping up the representation of as many ancestry-informative variants in the genome as possible.
A similar WISC strategy is also expected to prove useful for forensics applications, Carpenter said. And because it does not hinge on the availability of a reference genome sequence, it should also be possible to enrich for DNA in ancient samples from other animals by making WISC probes using DNA from appropriate modern species.
In general, the WISC approach stemmed from the realization that many ancient samples contain relatively little DNA representing the specimen of interest.
In many of the samples considered so far, Carpenter noted, authentic ancient DNA from the organism of interest makes up less than 1 percent of the total DNA content of a given artifact, with most sequences coming from microbes associated with the sample, the surrounding soil, and so on.
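To put that figure in perspective, here is a minimal sketch of the arithmetic, assuming a hypothetical shotgun run of one million reads and treating the 1 percent figure as an upper bound on the endogenous fraction. The function name and the run size are illustrative assumptions, not values from the presentation.

```python
def expected_endogenous_reads(total_reads: int, endogenous_fraction: float) -> int:
    """Estimate how many reads in a shotgun run come from the specimen itself."""
    if not 0.0 <= endogenous_fraction <= 1.0:
        raise ValueError("endogenous_fraction must be between 0 and 1")
    return round(total_reads * endogenous_fraction)


# Hypothetical run size (assumption); 0.01 is the "less than 1 percent"
# upper bound quoted for many of the samples.
reads_on_target = expected_endogenous_reads(1_000_000, 0.01)
print(reads_on_target)
```

Under those assumptions, at most about 10,000 of every million reads would come from the specimen, with the remainder spent on microbial and environmental DNA.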
To get around such problems, researchers have used several strategies to enrich for and/or amplify ancient DNA sequences of interest.
In 2009, for instance, the Max Planck Institute for Evolutionary Anthropology's Svante Pääbo and colleagues described a primer extension capture, or PEC, method targeting short stretches of ancient DNA. Pääbo and his co-authors used that approach to sequence mitochondrial genomes from Neanderthals (IS 7/21/2009), an early modern human, and an ancient hominin known as the Denisovan (see GWDN 3/24/2010).
When sequencing a draft version of the whole Neanderthal nuclear genome, published in Science in 2010 (IS 5/11/2010), meanwhile, members of the same Max Planck-led team used a restriction enzyme enrichment to chop up bacterial sequences in sequencing libraries as a means of bumping up the proportion of Neanderthal DNA.
For a Neanderthal re-sequencing effort appearing in the same issue of Science, collaborators from Cold Spring Harbor Laboratory came up with an array-based hybridization capture scheme centered around 60-base probes designed to target regions of interest in the Neanderthal genome (IS 5/11/2010).
And just last year, some authors on the Neanderthal draft genome study used a single-stranded DNA amplification method developed by Max Planck Institute for Evolutionary Anthropology researcher Matthias Meyer to sequence a higher-than-draft-quality version of the Denisova genome (IS 9/4/2012).
Meyer and company published a more detailed description of that ancient DNA sequencing approach last month in Nature Protocols (IS 4/9/2013).
In an email message to In Sequence, Meyer said he has been following the development of the Stanford-led team's WISC strategy with "great interest."
"[T]his indeed looks very interesting," he said of the whole-genome capture approach, adding that he is "looking forward to reading about the details."
Meyer and Max Planck colleague Tomislav Maricic previously took a crack at using a comparable whole-genome enrichment approach (with human DNA probes) to try to sequence Neanderthal DNA from an ancient sample containing a slew of environmental contaminants, Meyer noted.
"Although the enrichment worked in principle, we were not satisfied with the results," Meyer said, "because the overwhelming majority of human sequences came from repetitive as opposed to the single-copy parts of the genome."
They abandoned that strategy at the time, opting instead for a 'classical' hybridization enrichment scheme based on probes synthesized on arrays. They also used that array-based approach to sequence a large swath of chromosome 21 from a 40,000-year-old modern human sample from a Chinese cave, described in a study published online in the Proceedings of the National Academy of Sciences this February.
For their part, Carpenter, Bustamante, and colleagues have had success using WISC to significantly increase the number of SNPs detected in genome sequence data from ancient samples, according to work presented at last week's conference.
Using genomic DNA from modern humans, the researchers transcribed biotinylated RNA probes, which were bound to streptavidin beads and used to grab corresponding ancient human sequences. The unbound fraction, containing genetic material from microbes and other environmental contaminants, was then washed away.
For the current analyses, researchers sequenced WISC-enriched libraries with Illumina MiSeq or HiSeq 2000 instruments.
The approach is expected to be compatible with other sequencing technologies, too. Even so, it's anticipated that researchers would see little benefit from sequencing WISC libraries with longer-read platforms such as Roche 454 or Pacific Biosciences RS instruments, since ancient DNA tends to be fragmented into fairly short bits already.
When they applied the WISC method to 12 ancient human specimens (samples from four Iron and Bronze Age humans who lived in Bulgaria between around 2,500 and 3,500 years ago, bone samples from seven Peruvian mummies dated at 1000 to 1500 CE, and a single Danish hair sample dated at around 1350 BCE), Carpenter and her colleagues saw significant increases in the number of informative variants found in each sample post-capture.
In post-capture reads from a Bulgarian tooth sample believed to be from around 1500 BCE, for example, the group detected nearly 9,700 SNPs — up from the fewer than 1,000 single nucleotide variants detected with comparable sequence coverage of a non-enriched library from the same sample.
That difference in variant representation is far from trivial for those retracing ancient human relationships, population patterns, and migration, Carpenter noted.
Whereas the pre-capture sequence data placed the Bulgarian sample broadly within the realm of genetic variation that represents European populations, for instance, having roughly 10 times as many SNPs at their disposal allowed researchers to perform analyses indicating that the individual clustered most closely with populations in southern Europe.
WISC enrichment bolstered the sets of informative SNPs found in the other ancient samples, too, Carpenter said. In the seven Peruvian mummies, researchers found around 1,500 SNPs pre-capture, but nearly 21,600 after applying the enrichment scheme. And post-capture Danish hair sample sequences contained almost 6,900 SNPs, up from the 841 found in pre-capture sequences from the nearly 3,400-year-old sample.
On average, the researchers tracked down roughly 50,000 SNPs for every million Illumina reads, Carpenter said. They also found that the number of ancient DNA reads tended to level off or plateau at lower overall read levels in the post-capture samples than in samples where WISC wasn't used, suggesting the method can successfully enrich for the majority of sequences sought in the original samples.
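The fold increases implied by these rounded figures can be worked out with a short calculation. The sketch below uses only the approximate counts quoted in this article; the variable and function names are illustrative, and the Bulgarian pre-capture value is an upper bound ("fewer than 1,000"), so its fold increase should be read as a minimum.

```python
# Approximate (pre-capture, post-capture) SNP counts as quoted in the article.
snp_counts = {
    "Bulgarian tooth": (1_000, 9_700),    # pre-capture is an upper bound
    "Peruvian mummies": (1_500, 21_600),
    "Danish hair": (841, 6_900),
}


def fold_increase(pre_capture: int, post_capture: int) -> float:
    """Return the ratio of post-capture to pre-capture SNP counts."""
    if pre_capture <= 0:
        raise ValueError("pre_capture must be positive")
    return post_capture / pre_capture


# Round to one decimal place, since the inputs are themselves approximate.
fold_by_sample = {
    name: round(fold_increase(pre, post), 1)
    for name, (pre, post) in snp_counts.items()
}

for name, fold in fold_by_sample.items():
    print(f"{name}: about {fold}x more SNPs post-capture")
```

On these numbers, the gains work out to at least 9.7-fold for the Bulgarian tooth, about 14.4-fold for the Peruvian mummies, and about 8.2-fold for the Danish hair sample.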
"We think this is going to substantially increase the number of samples that are available to shotgun sequence," Carpenter noted during her presentation. The team expects to publish the study in the not-too-distant future.
For his part, Meyer said he is eagerly awaiting that publication so he can delve into the details of the team's method and data and get a better sense of how WISC compares to other targeted capture approaches, both in terms of performance and price.
"In essence, I would be surprised if the method developed by Carpenter et al. could be used to sequence complete genomes in a cost-efficient manner," Meyer said, "but it may be a simple but powerful tool to generate overlapping sequence data from parts of the genomes of ancient samples for population genetic analyses."