Scientists at Harvard Medical School have developed a hybridization-based DNA capture method that uses concatenated PCR products bound to a filter to enrich genomic regions of interest for sequencing.
The researchers claim that their method, published in Nature Methods this week, is less expensive and more flexible than existing array- or solution-based targeted enrichment methods, and does not require specialized equipment. They also demonstrate that it can be used both for calling SNPs and for detecting copy-number variations.
The scientists are currently using their method in population studies for hypertrophic cardiomyopathy and high-density lipoprotein cholesterol levels, with the goal of sequencing targeted regions in several hundred subjects for each study.
The technique will likely be useful for studies that involve relatively small numbers of samples to search for causative mutations as a follow-up to genome-wide association studies, and as an alternative to PCR for sequencing small numbers of candidate genes, according to Matthew Bainbridge, a researcher at Baylor College of Medicine who was not involved in the study but has experience with capture arrays.
It could compete with offerings from several companies that sell commercial hybridization-based DNA capture or enrichment methods, among them Roche NimbleGen, Agilent Technologies, and Febit (see In Sequence 2/24/2009 and 3/17/2009, and other article in this issue).
For their study, the Harvard researchers chose two discontinuous target areas, or "subgenomes," one comprising 184 HCM targets within 54.7 kilobases, the other totaling 323 HDL targets in 60.1 kilobases. After amplifying the target sets, ligating the products into concatemers, and amplifying the concatemers again, they bound the products to nitrocellulose membrane filters, where they served as "subgenomic traps."
They then used these traps to capture DNA from two pools, each consisting of four bar-coded DNA libraries, and sequenced the captured DNA using a single flow cell lane on the Illumina Genome Analyzer.
The DNA came from three HapMap samples, four individuals with abnormal HDL cholesterol levels, and one HCM patient with a deletion and an insertion in the MYBPC3 gene. About 60 percent of reads matched the target areas, and the enrichment was at most about 40-fold, according to the paper. Within each target subgenome, captured libraries were "sufficiently complex and relatively unbiased."
The scientists analyzed the data both for SNPs and copy number variants and identified a previously unknown 11-kilobase tandem insertion in the MYBPC3 gene of one subject.
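The article does not describe how copy number was called from the sequence data. As a rough illustration of the general idea behind depth-based copy-number detection, the sketch below compares per-target read depth in a test sample with that in a reference sample after scaling each by its own total. The function names, thresholds, target names, and read counts are all hypothetical and are not taken from the paper.

```python
import math

def copy_number_ratios(test_depth, ref_depth):
    """Return the log2 depth ratio (test vs. reference) for each target.

    Both inputs map target name -> read count. Each sample is scaled by
    its own total so that a difference in sequencing yield is not
    mistaken for a copy-number change.
    """
    test_total = sum(test_depth.values())
    ref_total = sum(ref_depth.values())
    ratios = {}
    for target, ref_count in ref_depth.items():
        if ref_count == 0:
            continue  # no reference signal, so the ratio is undefined
        test_frac = test_depth.get(target, 0) / test_total
        ref_frac = ref_count / ref_total
        if test_frac == 0:
            ratios[target] = float("-inf")
        else:
            ratios[target] = math.log2(test_frac / ref_frac)
    return ratios

def flag_cnv(ratios, gain=0.4, loss=-0.6):
    """Label each target as 'gain', 'loss', or 'normal'.

    The thresholds are illustrative assumptions: a heterozygous
    duplication gives a ratio near log2(1.5) and a heterozygous
    deletion gives a ratio near log2(0.5).
    """
    flags = {}
    for target, ratio in ratios.items():
        if ratio > gain:
            flags[target] = "gain"
        elif ratio < loss:
            flags[target] = "loss"
        else:
            flags[target] = "normal"
    return flags

# Invented read counts for four hypothetical targets.
reference = {"exon1": 100, "exon2": 100, "exon3": 100, "exon4": 100}
test = {"exon1": 100, "exon2": 150, "exon3": 100, "exon4": 50}

flags = flag_cnv(copy_number_ratios(test, reference))
```

Normalizing by total depth is a simplification that works only when copy-number changes affect a small share of the targets; a real analysis would also need to account for capture bias between targets and for sampling noise.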
One of the attractive features of the new method, compared to array-based capture methods, is its low cost, according to Jon Seidman, a professor of genetics at Harvard Medical School and one of the authors of the study. Many labs "have already made the PCR products" for other analyses, he said, so instead of designing and purchasing microarrays, which can cost more than $100 each, researchers just need to put these PCR products onto nitrocellulose filters.
He added that the method could be used by any molecular biology lab because it does not require special equipment, such as instruments for handling arrays. Also, he pointed out, the capture sequences can be easily modified "in a few days," whereas arrays are more difficult to redesign.
However, if a lab does not have the PCR products in hand, designing and testing the PCR primers and generating the "subgenome" that is loaded onto the filter "is the expensive and time-consuming part," according to Baylor's Bainbridge. Yet, he said in an e-mail, the technique "is probably cheaper for a smaller number of samples because it can be done in-house."
Dan Turner, head of sequencing development at the Wellcome Trust Sanger Institute, who has worked with array-based and solution-based capture methods, agreed that generating PCR probes "can amount to a lot of work if you want to look at large regions." However, other hybridization-based methods, he estimated, "will inevitably cost more to perform than the method in this paper."
The method "will make sequence capture of smaller regions much more accessible, as it is something that can be done relatively inexpensively, and it gives the experimenter full control over the probe selection [and] design," Turner said by e-mail.
Finally, Seidman said, the method "turns out to be very good" for detecting copy-number variants, which "are obviously a big deal in medical resequencing." According to Turner, that capability has not yet been reported for other methods.
The article also states that the specificity of the method is "comparable to that of other hybridization approaches" and that both sensitivity and uniformity are "superior to that of existing methods."
According to Seidman, the latter appears to be related to the length of the DNA on the filter. "We think that the longer pieces that we put on the nitrocellulose filter actually help to have a much more uniform coverage," he said. That, in turn, reduces the amount of sequencing required to reach a certain coverage across the target.
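To see why more uniform capture reduces the sequencing needed, consider the simple model sketched below: if depth on every target scales linearly with total yield, sequencing has to be scaled up until the worst-covered target reaches the desired depth. The depth values and the function name are invented for illustration and are not from the paper.

```python
def scale_needed(depths, min_depth):
    """Return the factor by which sequencing yield must grow so that
    every target reaches min_depth.

    Assumes depth on each target scales linearly with total yield,
    which is a simplification.
    """
    lowest = min(depths)
    if lowest <= 0:
        raise ValueError("a target with zero depth cannot be rescued by more sequencing")
    # Never scale below 1: the existing data already meet the goal.
    return max(1.0, min_depth / lowest)

# Two hypothetical captures with the same mean depth (50x) across five targets.
uniform = [40, 45, 50, 55, 60]
skewed = [5, 20, 50, 75, 100]

uniform_factor = scale_needed(uniform, 30)  # every target already has 30x or more
skewed_factor = scale_needed(skewed, 30)    # the 5x target drives the requirement
```

In this toy case both captures have the same average depth, but the skewed one would need six times as much sequencing to bring its weakest target up to 30-fold coverage.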
Bainbridge said he suspects uniformity "is very target-dependent," adding that "capturing a number of small regions with a lot of probes is different than capturing a much larger number of regions with [fewer] probes."
It is also not clear from the paper, he said, how well the method will capture regions larger than 115 kilobases, and how well it works with larger numbers of samples.
The Harvard researchers are now applying the approach in population studies of both HCM and HDL cholesterol, for example to identify new sequence variants. For HCM, they eventually plan to study 400 to 500 subjects, according to Seidman, and "similar numbers for HDL."
Several other research groups that he and his colleagues are collaborating with are using the method as well, he said.