By Julia Karow
Researchers in the Science for Life Laboratory at Sweden's Royal Institute of Technology have developed a simple barcoding method that allows them to sequence selected targets in thousands of samples in parallel.
The method relies on a combination of two tags, the first one added to individual samples, the other to a pool of 96 tagged samples. In a proof-of-concept study published last week in PLoS One, the scientists applied their strategy to analyze a single exon in nearly 5,000 samples with 148 tags using the 454 GS FLX sequencing platform.
According to Afshin Ahmadian, the paper's senior author, his team originally developed the method because a colleague, population geneticist Peter Savolainen, wanted to sequence a single, highly polymorphic exon in the DLA-DRB1 gene in about 5,000 samples from dogs and wolves, a project that would have been prohibitively expensive to do by cloning and Sanger sequencing.
Instead of barcoding each amplified exon individually prior to pooling and analyzing them by next-gen sequencing, the researchers first use a set of 96 position tags to barcode 96 samples at a time during a microtiter plate-based PCR reaction. They then pool the 96 amplicons from each plate and ligate a second, plate-specific tag next to the sequencing primer, followed by sequencing library preparation.
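To make the two-tag idea concrete, here is a minimal Python sketch of how a read might be assigned back to its plate and well. It is an illustration only, not the authors' pipeline: the tag sequences, the tag lengths, the read layout (plate tag, then position tag, then amplicon), and the exact-match rule are all assumptions, and the names `PLATE_TAGS`, `POSITION_TAGS`, and `decode_read` are hypothetical.

```python
from typing import Optional, Tuple

# Hypothetical tag tables. The real tag sequences are not given in the article.
PLATE_TAGS = {"ACGTAC": 1, "TGCATG": 2}            # plate-specific tag -> plate number
POSITION_TAGS = {"GATTACA": "A1", "CCGGTTA": "A2"}  # position tag -> well on the plate

# Assumed fixed tag lengths, matching the hypothetical tables above.
PLATE_TAG_LEN = 6
POSITION_TAG_LEN = 7


def decode_read(read: str) -> Optional[Tuple[int, str, str]]:
    """Return (plate, well, amplicon) for a read, or None if either tag is unknown.

    Assumes the read starts with the plate tag, immediately followed by the
    position tag, and that tags must match exactly (no error correction).
    """
    plate_tag = read[:PLATE_TAG_LEN]
    position_tag = read[PLATE_TAG_LEN:PLATE_TAG_LEN + POSITION_TAG_LEN]

    plate = PLATE_TAGS.get(plate_tag)
    well = POSITION_TAGS.get(position_tag)
    if plate is None or well is None:
        return None  # unrecognized tag: discard the read

    amplicon = read[PLATE_TAG_LEN + POSITION_TAG_LEN:]
    return plate, well, amplicon


if __name__ == "__main__":
    print(decode_read("TGCATG" + "GATTACA" + "TTTTGGGG"))
```

The pair (plate, well) identifies one sample, so two short lookups stand in for one lookup against a table with an entry per sample.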
For their study, they set out to analyze the second exon of the DLA-DRB1 gene in 4,708 dog and wolf samples, using 96 position tags in 52 PCR plates. Almost 80 percent of the PCR reactions were successful, and after pooling PCR products from each plate, they generated 52 sequencing libraries, which they sequenced on the 454 GS FLX using Titanium chemistry. In the end, they were able to genotype about 94 percent of the successfully amplified samples, or about 3,500.
"The beauty of this method is its simplicity and that you can automate it," said Ahmadian, who is an associate professor in the School of Biotechnology at KTH. Because of the plate-based format, automation is easy, eliminating pipetting errors, and for each additional plate of 96 samples, only one additional tag is needed.
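The scaling argument can be checked with a few lines of Python. This is a back-of-the-envelope sketch that assumes every plate holds 96 samples and receives exactly one plate tag of its own; the function names are hypothetical.

```python
WELLS_PER_PLATE = 96  # one position tag per well, reused on every plate


def tags_needed(plates: int) -> int:
    """Total distinct tags: 96 position tags plus one plate tag per plate."""
    return WELLS_PER_PLATE + plates


def max_samples(plates: int) -> int:
    """Largest number of samples that can be told apart with that many plates."""
    return WELLS_PER_PLATE * plates


if __name__ == "__main__":
    for plates in (1, 52, 100):
        print(plates, tags_needed(plates), max_samples(plates))
```

For the 52 plates used in the study, this gives 148 tags covering up to 4,992 sample positions, whereas tagging each sample individually would require one tag per sample.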
And unlike other barcoding schemes, such as the DNA Sudoku method developed a few years ago by researchers at Cold Spring Harbor Laboratory (IS 6/9/2009), this one requires no "experimentally complicated sample or primer mixing procedures," according to the authors.
Also, because the method uses two barcodes, it is possible to recognize, and eliminate, chimeras that form when different single-stranded PCR products hybridize after samples are pooled. About 10 percent of the sequence reads came from such chimeras, according to Ahmadian.
According to Yaniv Erlich, who developed the DNA Sudoku method, the double-tagging approach is a good way to reduce the number of barcodes needed to tag many samples. But he said the number of individual PCR reactions is still large, which might become a bottleneck if the number of samples increases from thousands to tens of thousands. "For the application that they show, it's very elegant and nice, but I'm a bit worried that it will not scale to other applications as easily," Erlich told In Sequence.
Also, he said, the double barcodes get quite long and use up valuable read length. That is not much of a problem with the long-read 454 technology, but it matters more with short-read sequencers such as the Illumina or SOLiD platforms.
Ahmadian said that the main reason they chose the 454 platform for their initial project was that they needed reads of at least 350 bases to cover the entire exon. But for amplicons shorter than 200 bases, he and his colleagues are now using the Illumina HiSeq 2000 sequencer, which can handle even more samples in a single run.
He and his colleagues are planning to use the method for other applications, for example to profile gene expression in different types of yeast in response to various compounds or environmental conditions. "For that purpose, you need a method to analyze lots and lots of different samples," he said. They are also planning to measure expression of a subset of genes in different individuals, disease states, or conditions, he said. Another possible practical application could be in HLA typing.
In principle, the method is compatible with capture technologies that target not one but a number of DNA regions in a genome, Ahmadian said. But because the cost of capture is currently high, on the order of several hundred dollars per assay, he and his colleagues are developing their own hybridization-based capture method. "If our approach works, we could reduce the cost of capture and enrichment, and then we can implement [the multiplexing] technology," he said.
He said the new method is not patent-protected and is "free for the research community to use," though no one outside his lab has applied it yet.