By Monica Heger
Constructing libraries for sequencing can be time-consuming, imprecise, and inefficient. But now, researchers at the Broad Institute have designed a method to automate the library construction process for 454 sequencing that greatly increases the number of samples that can be sequenced and reduces the cost per sample.
The researchers said the method will be particularly well suited to viral sequencing projects, which involve many samples but small genomes.
The Broad researchers automated the library preparation method and increased the throughput by moving the process from individual tubes to a 96-well plate.
In a paper published this month in Genome Biology, the researchers described how they used Beckman Coulter's solid-phase reversible immobilization (SPRI) technology, which uses paramagnetic beads, instead of column-based agarose gel cuts for fragment size selection, and added molecular barcodes to the samples to prevent contamination and enable pooling. Then, instead of using a fluorescent assay to quantify the library, they used qPCR, which is more sensitive and allowed them to use less starting DNA.
Using the method, one technician can prepare 96 libraries in two days, compared with six libraries in two days using the standard protocol, and the cost per sample falls 40-fold, according to the authors. The method can also be adapted for other platforms.
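To put the throughput figures in perspective, here is a minimal sketch of the arithmetic. The 96 and six libraries per two-day batch come from the article; the 960-sample project size and the function name are hypothetical, chosen only for illustration. The 40-fold cost figure is the authors' own and is not derived here.

```python
import math

# Figures quoted in the article: libraries one technician prepares per batch.
AUTOMATED_LIBRARIES_PER_BATCH = 96
MANUAL_LIBRARIES_PER_BATCH = 6
DAYS_PER_BATCH = 2


def days_for_samples(n_samples: int, libraries_per_batch: int, days_per_batch: int) -> int:
    """Technician-days needed to prepare n_samples, working in whole batches."""
    batches = math.ceil(n_samples / libraries_per_batch)
    return batches * days_per_batch


# Throughput gain implied by the quoted figures (same two-day window).
throughput_gain = AUTOMATED_LIBRARIES_PER_BATCH / MANUAL_LIBRARIES_PER_BATCH

# Hypothetical project size, not taken from the article.
n_samples = 960
automated_days = days_for_samples(n_samples, AUTOMATED_LIBRARIES_PER_BATCH, DAYS_PER_BATCH)
manual_days = days_for_samples(n_samples, MANUAL_LIBRARIES_PER_BATCH, DAYS_PER_BATCH)

print(f"Throughput gain: {throughput_gain:.0f}-fold")
print(f"{n_samples} samples: {automated_days} days automated, {manual_days} days manual")
```

Under these assumptions the throughput gain is 16-fold, which suggests the larger cost reduction reported by the authors reflects savings beyond technician time alone.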
The Broad researchers don't plan to commercialize the method, but the team detailed every step of the process in the paper so other labs can use it either in its entirety or just the relevant portions, said Niall Lennon, assistant director of the genome sequencing platform at Broad and lead author of the study.
"It removes something that's been a major bottleneck in the 454 pipeline, which is the manual preparation of sequencing libraries," said Daniel Turner, head of sequencing technology development at the Wellcome Trust Sanger Institute, who was not involved in the development of the method published by the Broad.
Lennon said that the group developed the technique because it is working on a large number of projects using 454, and the standard library prep would have been too time-consuming. "It would have taken us 10 years to complete all the samples we have funded," Lennon said.
Also, with current library preparation protocols, when researchers want to sequence small genomes, like viruses, they end up with many more reads than necessary, often covering the genome over 1,000 times. But in this method, because the samples are barcoded, they can be pooled and sequenced together, so scientists still achieve adequate coverage for each sample.
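The coverage reasoning can be illustrated with the standard estimate of mean coverage: number of reads times read length, divided by genome size. The sketch below is not from the paper. The run size, read length, and genome size are hypothetical placeholders, and it assumes pooled samples receive equal shares of the reads, which real runs only approximate.

```python
def mean_coverage(n_reads: float, read_length_bp: int, genome_size_bp: int) -> float:
    """Mean depth of coverage: total sequenced bases divided by genome size."""
    return n_reads * read_length_bp / genome_size_bp


def per_sample_coverage(n_reads: float, read_length_bp: int,
                        genome_size_bp: int, n_pooled: int) -> float:
    """Coverage per sample when a run's reads are split evenly across a pool."""
    return mean_coverage(n_reads / n_pooled, read_length_bp, genome_size_bp)


# Hypothetical values for illustration only; none come from the article.
RUN_READS = 500_000
READ_LENGTH_BP = 400
VIRAL_GENOME_BP = 10_000

single = mean_coverage(RUN_READS, READ_LENGTH_BP, VIRAL_GENOME_BP)
pooled = per_sample_coverage(RUN_READS, READ_LENGTH_BP, VIRAL_GENOME_BP, n_pooled=96)

print(f"One sample per run: {single:,.0f}x")
print(f"96 samples pooled:  {pooled:,.1f}x each")
```

With these placeholder numbers, a single small genome would be covered far more deeply than needed, while each of 96 pooled samples would still receive coverage in the hundreds.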
Pooling also enables them to use less starting DNA, added Lennon. It is possible to start with 10 nanograms of each sample in a 96-well plate, for example, because when those samples are pooled for sequencing, the result is 960 nanograms, he said.
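Lennon's example works out as follows. The 10 nanograms per sample and the 96-well plate are from the article; the function names and the 500-nanogram target in the second calculation are hypothetical and used only to show the reverse calculation.

```python
import math


def pooled_mass_ng(per_sample_ng: float, n_samples: int) -> float:
    """Total DNA mass once equal inputs from every well are combined."""
    return per_sample_ng * n_samples


def min_samples_for_target(target_ng: float, per_sample_ng: float) -> int:
    """Smallest pool size whose combined mass reaches target_ng."""
    return math.ceil(target_ng / per_sample_ng)


# Figures from the article: 10 ng per sample across a full 96-well plate.
total_ng = pooled_mass_ng(per_sample_ng=10, n_samples=96)
print(f"Pooled input: {total_ng:.0f} ng")

# Hypothetical 500 ng target, for illustration only.
print(f"Samples needed for 500 ng: {min_samples_for_target(500, 10)}")
```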
Turner added that aside from being a more efficient process, it will reduce the likelihood of contamination. "The whole process is automated, so that reduces human error. And every sample gets a molecular barcode," he said. "Once things have been barcoded, the risk of cross-contamination is eliminated, even if you mix your samples up."
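The idea that a barcode keeps a read tied to its sample after pooling can be sketched in a few lines. This is an illustration, not the Broad pipeline: the six-base barcodes, the sample names, and the exact-match rule are all assumptions, and production tools typically tolerate sequencing errors in the barcode.

```python
from collections import defaultdict

# Hypothetical barcode-to-sample table; real barcode sets differ.
BARCODES = {
    "ACGTAC": "sample_A",
    "TGCATG": "sample_B",
}


def demultiplex(reads, barcodes):
    """Assign each pooled read to a sample by its leading barcode.

    Assumes every barcode has the same length and sits at the start of the
    read. Reads with an unrecognized barcode are kept whole as 'unassigned'.
    """
    barcode_len = len(next(iter(barcodes)))
    by_sample = defaultdict(list)
    for read in reads:
        tag, insert = read[:barcode_len], read[barcode_len:]
        sample = barcodes.get(tag)
        if sample is None:
            by_sample["unassigned"].append(read)
        else:
            by_sample[sample].append(insert)
    return dict(by_sample)


# Pooled reads in arbitrary order, as they might come off a shared run.
pooled_reads = ["ACGTACGGGTTT", "TGCATGAAACCC", "ACGTACTTTT", "NNNNNNAAAA"]
assignments = demultiplex(pooled_reads, BARCODES)
for sample, inserts in assignments.items():
    print(sample, inserts)
```

Because each read carries its own tag, the order in which samples are mixed does not matter to the assignment step.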
In addition to the standard libraries of 400 to 800 base pairs, the Broad team also created a protocol for preparing paired libraries with 3-kilobase insert fragments, dubbed "jumping" libraries. The method was similar, but instead of sequencing 96 samples at once, they sequenced only 24. Because library preparation for the 3-kilobase insert fragments involves more steps before adapters are ligated, there is a greater chance of contamination. So the researchers used only 24 wells on the 96-well plate, leaving open wells between neighboring samples.
Turner said that automating the library construction process for the longer insert sizes will be especially useful because the current process is so prone to human error. "If you're trying to do this manually, it's real tricky and can go wrong at any stage. But they've automated it, and that will be really useful to a lot of people," he said.
"They've set the precedent for automated library preparation," said Poornima Parameswaran, a doctoral student in Andrew Fire's genetics and pathology lab at Stanford University. She added that a couple years ago she worked on developing a barcoding strategy to sequence multiple samples in parallel, and that the Broad protocol would have yielded libraries more quickly because it automated the entire process.
Parameswaran said she would also like to see the method adapted to work on RNA or even smaller amounts of DNA. "Miniaturization would be the next way to go — adapting the process to microfluidic devices, or using nanotechnology, so you can work with small amounts of material without losing it," she added.
Lennon said that will be one of their next steps. "We're looking at new methods of making the library from vanishingly small input material," he said.
The team is planning to study HIV in patients with low viral load. For this project they will be studying around 1,500 patients, each with fewer than 50 copies of the virus per milliliter of blood. That's "almost a billion-fold less starting material than the currently recommended amount [for sequencing]," said Lennon. "So we're really working on methods for amplification of that material and reducing requirements for starting size for that sample," he added.