One of the hallmarks of next-generation sequencing platforms has been their enormous output of data. But not every user is interested in obtaining millions of base pairs from a single sample. Rather, many would like to sequence specific regions of a genome for many samples in a single run.
Now a group of researchers in Denmark has developed and published a barcoding method that will allow them to sequence the same stretch of DNA in hundreds of individual samples simultaneously using the 454 Life Sciences sequencer.
In addition, 454 is currently developing its own method for multiplex analysis that will be sold as a kit by marketing partner Roche, while Illumina and Applied Biosystems are working on various ways to multiplex analysis on their respective platforms.
Eske Willerslev, a professor of evolutionary biology at the University of Copenhagen, told In Sequence that the barcoding method his group published in Public Library of Science One last week is the first to make the 454 technology “suitable for population genetic studies.”
Willerslev said that one of the limitations of the 454 system has been that users could not pool PCR products from hundreds of individuals and then assign the resulting reads back to each sample after the run. “There have been a lot of people who are interested in using this new powerful machine but are not interested in running DNA from single individuals,” he said. “They want DNA from numerous individuals.”
Although 454 Life Sciences allows researchers to partition its picotiter plate into up to 16 separate areas, and thus run 16 different samples in parallel, that is often not enough, according to Willerslev. “In many situations, one eighth or even one sixteenth of a run would be enough to do, say, [an entire] population study,” he said.
To multiplex the system further, Jonas Binladen, a PhD student in Willerslev’s lab, came up with the idea of barcoding the samples by adding dinucleotide tags to the 5’ ends of the PCR primers. “That way you will be able to distinguish what sequences came from which individual,” Willerslev said.
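The sorting step Willerslev describes can be illustrated with a minimal Python sketch. It is not the group's published software: the function name `demultiplex`, the tag-to-sample table, and the example reads are all hypothetical, and the sketch assumes that each read begins with its two-base tag and that a read with an unknown tag is simply set aside.

```python
from collections import defaultdict


def demultiplex(reads, tag_to_sample, tag_len=2):
    """Group reads by the tag found at their 5' end.

    Assumption: every read starts with a tag of length tag_len.
    Reads whose tag is not in tag_to_sample go to "unassigned".
    """
    bins = defaultdict(list)
    for read in reads:
        tag, insert = read[:tag_len], read[tag_len:]
        sample = tag_to_sample.get(tag, "unassigned")
        bins[sample].append(insert)  # store the read with the tag trimmed off
    return dict(bins)


# Hypothetical tags and reads, for illustration only.
tag_to_sample = {"AC": "sample_1", "GT": "sample_2"}
reads = ["ACTTGACCA", "GTTTGACCA", "ACTTGATCA", "CCTTGACCA"]

bins = demultiplex(reads, tag_to_sample)
for sample, inserts in sorted(bins.items()):
    print(sample, inserts)
```

A real pipeline would also need to handle sequencing errors inside the tag itself, which this sketch ignores.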
The method is straightforward and worked well for the 13 species they analyzed, he said. But the researchers found some biases in sequence representation depending on which tags they used, and have not validated their method with more samples. In theory, he estimated, 600 PCR products could be pooled and analyzed on one eighth of a picotiter plate using longer tags.
One disadvantage of the method, Willerslev said, is that it requires a different set of PCR primers for each sample, making the approach expensive.
Alternatively, he said, scientists could ligate tags to the PCR products, a method he said other research groups are working on at the moment.
Willerslev developed the method to analyze ancient DNA samples on the 454 platform (see sidebar below), but said that it would also be useful for other applications, such as large-scale environmental studies that try to survey all bacteria in a sample by studying their 16S DNA.
Meanwhile, 454 has been working on its own version of the multiplexing technology. “We see a very strong demand for a multiplexing strategy,” Mary Schramke, 454’s vice president of marketing, told In Sequence by e-mail.
She said 454 is currently working on a new kit, to be commercialized by Roche, that will “employ essentially the same approach” as Willerslev’s. 454 is testing the method “rigorously” at the moment to ensure that it will not mis-assign samples or compromise read accuracy. “We intend that this product will support use with both our standard shotgun library and our amplicon sequencing approaches,” and it will be supported by the company’s sequencing software, Schramke said.
“It is worth emphasizing that we don’t need complicated methods for tagging, as our reads on either the GS 20 or the GS FLX are of sufficient length [100 bp and 250 bp, respectively] such that tags can easily be incorporated as part of the primer design,” she wrote.
Also, other 454 users have worked on their own tagging approaches: Dutch plant breeding technology company Keygene presented a method to sequence plant PCR products at the Plant and Animal Genome conference earlier this year (see In Sequence 01/23/2007), and Feng Chen of the Department of Energy’s Joint Genome Institute presented a method at the recent Advances in Genome Biology & Technology conference for sequencing fosmid DNA from up to 30 clones in one run.
Illumina and ABI have also been working on their own solutions for sample multiplexing. As a starting point, Illumina’s Genome Analyzer uses a flowcell with eight channels, each of which can hold a different sample.
According to Omead Ostadan, Illumina’s vice president of marketing, Illumina researchers are working on “a number of different methods” for multiplexing. In the “preferred method,” researchers add a barcoded adapter of up to 10 nucleotides in the final step of sample preparation. The samples are then mixed and sequenced in two steps: first, up to 50 bases are sequenced on every template, and the products of that first reaction are removed. A second primer is then used to read the tag sequence. A 10-base tag would allow for over a million different samples to be pooled in a single run.
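The tag-length figures quoted in this article follow from simple counting: with four possible bases at each position, a tag of n bases has 4^n possible sequences. The short Python sketch below works through that arithmetic. The function names are hypothetical, and the result is a naive upper bound that assumes one tag per sample and that every possible tag is usable, which may not hold in practice given the tag-dependent biases Willerslev’s group reported.

```python
def tag_capacity(tag_len):
    """Number of distinct DNA tags of a given length (4 bases per position)."""
    return 4 ** tag_len


def min_tag_length(n_samples):
    """Shortest tag length whose capacity covers n_samples distinct tags."""
    length = 1
    while tag_capacity(length) < n_samples:
        length += 1
    return length


print(tag_capacity(2))      # dinucleotide tags: 16
print(tag_capacity(10))     # 10-base tags: 1048576
print(min_tag_length(600))  # tag length needed for 600 samples: 5
```

Under these assumptions, dinucleotide tags distinguish at most 16 samples, which is consistent with Willerslev’s remark that pooling 600 PCR products would require longer tags.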
The first release of ABI’s SOLiD system will allow users to deposit up to four different samples on each of the two slides that are analyzed per run, said Andy Watson, ABI’s senior director of market development and collaborations, in an e-mail message. The company hopes to segment the slides further in the future. “We believe that around 8 to 16 [areas] per slide is relatively easily achievable with manual pipetting,” Watson said, noting that robots could increase that number even further.
Watson said that ABI is also looking into barcoding methods, but did not know when they would be released or how they are expected to perform. “The only barcoding approaches that we are considering include error correction,” said Kevin McKernan, ABI’s senior director for scientific operations for high throughput discovery. “It is important that your barcodes be more accurate than your raw sequence so you don’t mis-identify a patient,” he added.
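ABI did not describe its error-correction scheme, so the following Python sketch is only a generic illustration of the idea, not ABI’s method. It assumes equal-length barcodes chosen so that any two differ at several positions (their Hamming distance); an observed barcode containing a few errors can then still be matched to a unique original. The barcode set and function names are hypothetical.

```python
from itertools import combinations


def hamming(a, b):
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("barcodes must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_distance(barcodes):
    """Smallest pairwise Hamming distance within a barcode set."""
    return min(hamming(a, b) for a, b in combinations(barcodes, 2))


def assign(observed, barcodes):
    """Return the matching barcode, or None if the match is unsafe.

    A set with minimum distance d can correct up to (d - 1) // 2 errors.
    """
    max_errors = (min_distance(barcodes) - 1) // 2
    best = min(barcodes, key=lambda b: hamming(observed, b))
    return best if hamming(observed, best) <= max_errors else None


# Hypothetical barcode set with a minimum pairwise distance of 3.
barcodes = ["AAAAA", "CCCAA", "AACCC", "GGGGG"]

print(min_distance(barcodes))     # 3, so one error per barcode is correctable
print(assign("CCCAT", barcodes))  # one error away from CCCAA
print(assign("TTTTT", barcodes))  # too far from every barcode: None
```

Rejecting ambiguous barcodes rather than guessing trades a few lost reads for a lower risk of the sample mix-ups McKernan warns about.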