Swedish Researchers Devise Simple Double-Tagging Method to Sequence Thousands of Samples

By Julia Karow

Researchers in the Science for Life Laboratory at Sweden's Royal Institute of Technology have developed a simple barcoding method that allows them to sequence selected targets in thousands of samples in parallel.

The method relies on a combination of two tags, the first one added to individual samples, the other to a pool of 96 tagged samples. In a proof-of-concept study published last week in PLoS One, the scientists applied their strategy to analyze a single exon in nearly 5,000 samples with 122 tags using the 454 GS FLX sequencing platform.

According to Afshin Ahmadian, the paper's senior author, his team originally developed the method because a colleague, population geneticist Peter Savolainen, wanted to sequence a single, highly polymorphic exon of the DLA-DRB1 gene in about 5,000 samples from dogs and wolves, a project that would have been prohibitively expensive using cloning and Sanger sequencing.

Instead of barcoding each amplified exon individually prior to pooling and analyzing them by next-gen sequencing, the researchers first use a set of 96 position tags to barcode 96 samples at a time during PCR in a microtiter plate. They then pool the 96 amplicons from each plate and ligate a second, plate-specific tag, positioned next to the sequencing primer, to the pool before preparing the sequencing library.
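The two-tag lookup described above can be sketched in a few lines of Python. This is only an illustration, not the authors' pipeline: the tag sequences, the tag lengths (3 and 4 bases), the read layout (plate tag first, then position tag, then amplicon), and the names `make_tags`, `well_name`, and `decode` are all assumptions made for the example.

```python
from itertools import product

BASES = "ACGT"

def make_tags(n, length):
    """Return n distinct placeholder tags of the given length (hypothetical sequences)."""
    tags = ["".join(p) for p in product(BASES, repeat=length)]
    return tags[:n]

# Assumed tag sets: 96 position tags (one per well) and 52 plate tags (one per plate).
position_tags = make_tags(96, 4)
plate_tags = make_tags(52, 3)

def well_name(index):
    """Map a 0-based well index to a 96-well plate coordinate such as 'A1' or 'H12'."""
    row, col = divmod(index, 12)
    return f"{'ABCDEFGH'[row]}{col + 1}"

POSITION_LOOKUP = {tag: well_name(i) for i, tag in enumerate(position_tags)}
PLATE_LOOKUP = {tag: i + 1 for i, tag in enumerate(plate_tags)}

def decode(read):
    """Split a read into (plate number, well, amplicon sequence).

    Assumes the read starts with the plate tag, followed by the position tag.
    Returns None when either tag is not recognized.
    """
    plate_tag, pos_tag, amplicon = read[:3], read[3:7], read[7:]
    if plate_tag not in PLATE_LOOKUP or pos_tag not in POSITION_LOOKUP:
        return None
    return PLATE_LOOKUP[plate_tag], POSITION_LOOKUP[pos_tag], amplicon
```

A read is assigned to a sample only when both tags are recognized, so the pair (plate, well) identifies one sample among up to 52 x 96.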

For their study, they set out to analyze the second exon of the DLA-DRB1 gene in 4,708 dog and wolf samples, using 96 position tags in 52 PCR plates. Almost 80 percent of the PCR reactions were successful, and after pooling PCR products from each plate, they generated 52 sequencing libraries, which they sequenced on the 454 GS FLX using Titanium chemistry. In the end, they were able to genotype about 94 percent of the successfully amplified samples, or about 3,500.
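The figures in this paragraph can be checked with a short calculation. The sketch below treats "almost 80 percent" and "about 94 percent" as exactly 0.80 and 0.94, which is an assumption, so the results are approximate.

```python
plates = 52
wells_per_plate = 96
samples = 4708

# Maximum number of samples the 52 plates could hold.
capacity = plates * wells_per_plate

# Rounded rates taken from the article (assumed exact for this estimate).
pcr_success_rate = 0.80
genotyping_rate = 0.94

amplified = samples * pcr_success_rate      # successfully amplified samples
genotyped = amplified * genotyping_rate     # samples that could be genotyped

print(f"plate capacity: {capacity}")
print(f"amplified: about {round(amplified)}")
print(f"genotyped: about {round(genotyped)}")
```

With these rounded rates the estimate lands close to the roughly 3,500 genotyped samples the researchers report.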

"The beauty of this method is its simplicity and that you can automate it," said Ahmadian, who is an associate professor in the School of Biotechnology at KTH. Because of the plate-based format, automation is easy, eliminating pipetting errors, and for each additional plate of 96 samples, only one additional tag is needed.
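The scaling argument, one extra tag per extra plate, can be illustrated by comparing tag counts for the two-level scheme with a scheme that gives every sample its own barcode. The function names below are hypothetical, and the calculation covers only the simple case of one distinct tag per plate. The article does not say how the study's tags were allocated, so the output is not meant to reproduce the tag count reported for the study.

```python
def tags_double_tagging(plates, wells_per_plate=96):
    """Tags needed when one position-tag set is reused and each plate gets one tag."""
    return wells_per_plate + plates

def tags_single_barcode(plates, wells_per_plate=96):
    """Tags needed when every sample carries its own unique barcode."""
    return wells_per_plate * plates

for plates in (1, 10, 52):
    print(plates, tags_double_tagging(plates), tags_single_barcode(plates))
```

Under this assumption the double-tagging count grows by one per plate, while the single-barcode count grows by 96 per plate.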

And unlike other barcoding schemes, such as the DNA Sudoku method developed a few years ago by researchers at Cold Spring Harbor Laboratory (IS 6/9/2009), this one requires no "experimentally complicated sample or primer mixing procedures," according to the authors.

Also, because the method uses two barcodes, it is possible to recognize and eliminate chimeras that form when different single-stranded PCR products hybridize after samples are pooled. About 10 percent of the sequence reads came from such chimeras, according to Ahmadian.

According to Yaniv Erlich, who developed the DNA Sudoku method, the double-tagging approach is a good way to reduce the number of barcodes needed to tag many samples. But he said the number of individual PCR reactions is still large, which might become a bottleneck if the number of samples increases from thousands to tens of thousands. "For the application that they show, it's very elegant and nice, but I'm a bit worried that it will not scale to other applications as easily," Erlich told In Sequence.

He also said that the double barcodes get quite long and use up valuable read length. That is not much of a problem with the long-read 454 technology, but it matters more with short-read sequencers such as the Illumina or SOLiD platforms.

Ahmadian said that the main reason they chose the 454 platform for their initial project was that they needed reads of at least 350 bases to cover the entire exon. But for amplicons shorter than 200 bases, he and his colleagues are now using the Illumina HiSeq 2000 sequencer, which can handle even more samples in a single run.

He and his colleagues are planning to use the method for other applications, for example to profile gene expression in different types of yeast in response to various compounds or environmental conditions. "For that purpose, you need a method to analyze lots and lots of different samples," he said. They are also planning to measure expression of a subset of genes in different individuals, disease states, or conditions, he said. Another possible practical application could be in HLA typing.

In principle, the method is compatible with capture technologies that target not one but a number of DNA regions in a genome, Ahmadian said. But because the cost of capture is currently high, on the order of several hundred dollars per assay, he and his colleagues are developing their own hybridization-based capture method. "If our approach works, we could reduce the cost of capture and enrichment, and then we can implement [the multiplexing] technology," he said.

He said the new method is not patent-protected and is "free for the research community to use," though no one outside his lab has applied it yet.


Have topics you'd like to see covered in In Sequence? E-mail the editor at jkarow [at] genomeweb [.] com.
