A group of researchers from Harvard Medical School has improved the performance of George Church’s home-built polony sequencing-by-ligation technology and has used it to analyze messenger RNA in murine heart tissues.
Building the system cost significantly less than buying any of the commercially available next-gen sequencing platforms, though it does not yet match the commercial sequencers on technical performance metrics such as read length or throughput.
The researchers, who published their results in last week’s Science, set up their polony sequencing system about a year ago in the lab of Jonathan Seidman, a professor in the department of genetics at Harvard Medical School. His lab is housed in the same building as George Church’s group, and researchers from both groups collaborated on the project.
The cost for the system, which the researchers assembled from individual components, came to about $200,000, according to Seidman, but “the price is coming down very dramatically,” he said.
It includes an automated fluorescence microscope, an integrated flow chamber, and associated components for performing sequencing reactions (a complete list can be found here). Researchers in the Church lab are currently designing a second-generation instrument that will cost less than $100,000 and plan to assemble a prototype this fall.
By comparison, Illumina’s Genome Analyzer lists for $430,000, including a cluster station; 454’s GS FLX lists for $500,000; and ABI’s SOLiD analyzer will list for $525,000, which includes a compute system.
“What I see as the major advantage is the open source nature of polony sequencing,” said Jeremy Edwards, an assistant professor of molecular genetics and microbiology at the University of New Mexico, who also has a home-built polony sequencer in-house that cost him about $150,000.
In order to use the system for mRNA tag sequencing — as an alternative to serial analysis of gene expression, or SAGE, sequencing, which is more costly — Seidman and his colleagues made a number of improvements to the technology that increased its accuracy and enabled the team to multiplex samples. They call their approach polony multiplex analysis of gene expression, or PMAGE.
An important step was to increase the sequencing accuracy, Seidman said. For genome sequencing applications, he explained, accuracy is less important because “if you do many replications, that will take care of any sequencing errors.”
However, in mRNA tag sequencing, while some messenger RNA molecules are present with several thousand copies, others are only represented by one or a few copies. “If a single base sequence error occurs, then that tag is going to give us an erroneous answer, and no amount of redundancy could solve that problem,” he said.
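To make Seidman's point concrete, here is a minimal sketch of how per-base errors compound over a short tag. It assumes errors at each base are independent and uses the 14-base read length reported for this system. The per-base error rates are hypothetical values chosen only for illustration; the article does not give the system's actual error rate.

```python
# Sketch: probability that a single tag read contains at least one error,
# assuming independent per-base errors (an illustrative simplification).

TAG_LENGTH = 14  # read length in bases, as reported for this system


def prob_tag_has_error(per_base_error: float, tag_length: int = TAG_LENGTH) -> float:
    """Return the chance that at least one base in the tag is miscalled."""
    return 1.0 - (1.0 - per_base_error) ** tag_length


# Hypothetical per-base error rates, NOT measured values from the study.
for rate in (0.01, 0.001):
    print(f"per-base error {rate}: tag error probability {prob_tag_has_error(rate):.4f}")
```

For a transcript represented by a single tag, that probability is the chance its one observation is assigned to the wrong sequence, which is why added redundancy elsewhere in the library does not help.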
In order to improve accuracy, the scientists bound the beads carrying the amplified DNA directly to glass instead of embedding them in a gel matrix. This improved the ligation reaction and enhanced the image acquisition by making the focal plane more even.
It also increased the density of the beads in the flow cell, enabling the researchers to sequence about 4.7 million tags in one experiment. In addition, they decreased the background fluorescence by capping free 3’ ends with an oligonucleotide rather than using an enzyme. Furthermore, changes to the ligation protocol enabled them to sequence AT-rich tags better.
However, the researchers did not attempt to increase the read length of 14 bases. Each experiment, not including the library construction, takes “less than four working days” in practice, according to Seidman, although it could be performed in two working days in theory.
There is still room for improvement, he said. For example, after amplifying the sample DNA by emulsion PCR, beads carrying amplified DNA could be enriched so that “in principle, some day, this [flow cell] should be able to read 50 million reads per run,” Seidman said. That, he added, would enable the researchers to sequence several libraries on the same flow cell.
By comparison, 454’s Genome Sequencer generates more than 400,000 reads per run; Illumina’s Genome Analyzer can produce more than 40 million reads per run, or 5 million reads in each of eight channels; and ABI said its SOLiD system can produce up to 40 million reads per run. Current read lengths for these platforms range from 25 to 300 base pairs.
The Harvard researchers also developed a barcoding method that makes use of two different bead populations carrying different primers. Thus, they were able to analyze two mRNA tag libraries from two different tissues in one experiment.
“The theoretical advantage of that method is that if there are any sequence biases, then both tissues should suffer the same sequence bias,” Seidman said, adding that they have not found sequencing bias to be a big problem.
In their Science study, the researchers were able to detect mRNAs as rare as one transcript per three cells, making this approach more sensitive than any microarray-based technique, Seidman said. Traditional SAGE sequencing experiments could, in theory, detect these rare transcripts as well, if they analyzed enough tags, but their cost would be prohibitive, he added.
He and his colleagues estimated that their experiment, which generated high-quality sequence data for about 4.4 million tags, cost $1,770, about 7-fold lower than the $12,500 commercial cost for a typical SAGE Sanger sequencing project, which only generates about 100,000 tags. The polony sequencing cost includes labor, reagents, and equipment amortization.
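The roughly 7-fold figure compares the cost of whole experiments. Because the two experiments yield very different numbers of tags, a per-tag comparison is also informative. The short calculation below is a sketch using only the figures quoted above; the variable and function names are illustrative.

```python
# Sketch: compare experiment-level and per-tag costs from the quoted figures.

PMAGE_COST_USD = 1_770      # polony experiment: labor, reagents, amortization
PMAGE_TAGS = 4_400_000      # high-quality tags from the polony experiment
SAGE_COST_USD = 12_500      # commercial cost of a typical SAGE Sanger project
SAGE_TAGS = 100_000         # tags from a typical SAGE project


def cost_per_tag(total_cost_usd: float, tags: int) -> float:
    """Return the cost in US dollars of a single sequenced tag."""
    return total_cost_usd / tags


pmage_per_tag = cost_per_tag(PMAGE_COST_USD, PMAGE_TAGS)
sage_per_tag = cost_per_tag(SAGE_COST_USD, SAGE_TAGS)

experiment_ratio = SAGE_COST_USD / PMAGE_COST_USD   # cost per experiment
per_tag_ratio = sage_per_tag / pmage_per_tag        # cost per tag

print(f"Cost ratio per experiment: {experiment_ratio:.1f}x")
print(f"Cost ratio per tag: {per_tag_ratio:.0f}x")
```

On these numbers the per-experiment gap is about 7-fold, while the per-tag gap is several hundred-fold, since the polony run produces about 44 times as many tags.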
Edwards told In Sequence that it costs him about $200 to $300 in reagents to run a similar experiment to the one described by the Harvard researchers. His lab is also planning to sequence 30 microbial strains in parallel, at about 15X coverage, for approximately $17 per strain.
Illumina cites $3,000 per run for its instrument, or $400 for a 4-megabase bacterium at 25-30X coverage, and ABI said it would cost $3,000 to sequence a gigabase on the SOLiD platform, but those prices only include reagents.
Seidman is looking forward to comparing his polony sequencing system to one of the commercial platforms. “We haven’t had a chance to try it yet, so we don’t know where they stand,” he said.
The Harvard Medical School - Partners Healthcare Center for Genetics and Genomics is planning to acquire a commercial next-generation sequencer, he said, which he hopes his lab will be able to use. “There is always room for technical improvement, I guess,” he said. “And I will say, having a slightly longer read length would be a good thing. We are looking forward to that.”
He also plans to use the PMAGE technology to analyze various mouse models for cardiac disease in the future. “This would be ideal for detecting the difference [between them],” he said. “In each of these mouse models, there are a number of pathways that are activated. And those pathways are activated, presumably, by transcription factors and other RNAs that are expressed at low levels. [If] we want to dissect these different pathways, both temporally and spatially, we should be able to do that using this method.”