By Julia Karow
As Life Tech's Ion Torrent nears an early July launch for the Ion 316 chip for its Personal Genome Machine, the company last week released a dataset for an Escherichia coli genome on its website that it generated internally on the new chip. In addition, several early-access customers — including the Human Genome Sequencing Center at Baylor College of Medicine and the Broad Institute — have tested the 316 chip, beating Ion Torrent's own R&D teams on throughput in some runs.
The E. coli dataset, approximately 150 megabases from a single run on the 316 chip and posted in the "Torrent Dev" section of the company's "Ion Community" website, comes from the genome of the E. coli DH10B laboratory strain. According to Ion Torrent, the data contains no more than 1 error at read lengths of 100 bases, and there are 69 errors in the entire genome.
The 316 chip will represent a 10-fold increase in throughput over the 314 chip that the company currently markets for the PGM. "We made enough progress on the 316 that … we feel it's a good time to put data out so folks can look at it and analyze it in different ways," said Maneesh Jain, Ion Torrent's vice president of marketing and business development.
Some scientists have already started to pore over the data, and have posted their observations online. According to these early assessments, the dataset contains 1.69 million reads, of which more than a million are at least 100 bases long.
According to an analysis by Dan Koboldt, a researcher at Washington University's Genome Center, the average read length is about 100 bases, and the longest read is 127 bases. Koboldt noted in a blog post describing his analysis that the average base quality seems to decline along the length of the read. The substitution error rate is low, he found, while the insertion and deletion error rate is eight-fold higher, or about 0.7 percent. Most of these errors appear to be related to homopolymer runs of four or more bases.
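For readers who want to check the homopolymer observation against the posted data themselves, the following is a minimal sketch of how runs of four or more identical bases could be located in a read. It is not Koboldt's method. The function name `homopolymer_runs`, the `min_len` parameter, and the example sequence are illustrative assumptions.

```python
from itertools import groupby


def homopolymer_runs(seq, min_len=4):
    """Return (start, base, length) for every run of a single base
    that is at least min_len long. Positions are 0-based.

    Hypothetical helper for illustration; min_len=4 mirrors the
    "four or more bases" threshold mentioned in the article.
    """
    runs = []
    pos = 0
    for base, group in groupby(seq.upper()):
        length = sum(1 for _ in group)
        if length >= min_len:
            runs.append((pos, base, length))
        pos += length
    return runs


# Made-up example read, not taken from the Ion Torrent dataset.
example = "ACGTTTTTACGGGGCA"
print(homopolymer_runs(example))  # [(3, 'T', 5), (10, 'G', 4)]
```

Comparing the positions of such runs with the positions of insertion and deletion errors in aligned reads would be one way to test whether the two coincide.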
The official specs for the commercial 316 chip, which will cost $500, will be lower than what Ion's published run may suggest. The nominal output will be 100 megabases, with a "typical read length" of about 100 bases and about a million or so reads. The chip has about 6.1 million usable wells, or sensors, and over time, a greater percentage of them will produce high-quality sequence data, according to Jain, as loading efficiency and template amplification improve.
Meanwhile, the output for the 314 chip has improved, Jain said. That chip has about 1.2 million usable sensors and, according to its specifications, produces about 10 megabases of data from about 100,000 reads. However, internally, Ion Torrent is now getting good sequence data from about half the sensors, Jain said, or 600,000 reads, and some customers have reported getting 20 megabases of data from it.
Ion Torrent will continue to support the 314 chip, which costs $250, believing that it will still be useful for sequencing panels of 10 to 100 genes. "We don't think by any means that the 314 will be obsolete," Jain said. "Each chip will have interesting applications."
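As a rough illustration of what the quoted chip prices and nominal outputs imply, the sketch below computes the chip cost per megabase. It assumes the chip price is the only cost and ignores reagents, sample prep, labor, and the instrument itself, so the figures are a lower bound rather than a true cost per run. The dictionary layout and the function name are illustrative.

```python
# Chip prices and nominal outputs as quoted in the article.
CHIPS = {
    "314": {"price_usd": 250, "nominal_mb": 10},
    "316": {"price_usd": 500, "nominal_mb": 100},
}


def cost_per_megabase(price_usd, output_mb):
    """Chip cost divided by output in megabases.

    Assumption: only the chip price is counted; other run costs
    are not given in the article and are left out.
    """
    return price_usd / output_mb


for name, spec in CHIPS.items():
    cost = cost_per_megabase(spec["price_usd"], spec["nominal_mb"])
    print(f"Ion {name}: ${cost:.2f} per megabase at nominal output")
```

At nominal output this comes to $25 per megabase for the 314 chip and $5 per megabase for the 316 chip. Runs that exceed the nominal output, such as the 20-megabase 314 runs some customers have reported, would lower the figure proportionally.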
The 318 chip, which promises an output of 1 gigabase per run (IS 3/1/2011), is already in the works and scheduled for launch sometime in the fourth quarter. That chip will enable additional applications, for example human RNA-seq, ChIP-seq, and copy number studies.
Some early-access customers, meantime, have put the 316 chip through its paces. Baylor's Human Genome Sequencing Center, which has four PGM instruments, started testing the 316 chip last month, along with new protocols and smaller Ion Sphere microbeads, and is in the midst of converting from the 314 to the 316 chip, according to Donna Muzny, the center's director of operations.
In particular, a new loading protocol in which each chip is loaded twice "seems to have really smoothed out any inconsistencies in loading and performance," she said, adding that the second loading step does not add much time. Another factor determining the output per run, she said, is consistent sizing of the libraries, which she said her lab has been working on improving.
Baylor researchers, while visiting Ion Torrent, recently broke the company's internal record for the highest output on the 316 chip, adhering to certain metrics. In that run, they generated more than 2 million reads of at least 100 base pairs, or more than 287 megabases. Since then, Baylor has also had runs of more than 200 megabases at its own center, Muzny said, producing just under 2 million reads.
Ion Torrent runs an internal competition between its four R&D sites for highest output, with a trophy that has so far been traded between the sites but was handed over to Baylor last week, where it will stay until the next time the record is broken. According to Jain, the fact that a customer is doing better than the company itself is "pretty gratifying" and provides evidence that customers will have a chance to win one of Ion Torrent's "grand challenges", a Life Tech-organized competition to improve the platform's sample prep speed, data output, and data accuracy (IS 12/14/2010).
Muzny said that getting the PGM up and running in her lab has been easy and quick, noting that prior experience with any other next-gen sequencing platforms helps. But even with no experience, the PGM is "easy to get going on as an entry platform," she said. "I don't see huge obstacles, especially as the protocols are getting much more consistent."
Her lab has not yet had access to the Ion OneTouch sample-prep system, which will automate the template amplification (IS 4/26/2011), but she expects that the device will also help make sample prep more consistent.
When the Baylor center first received the PGM, it tested it on six microbial genomes with different GC content, using the 314 chip, and found that its performance was "very comparable to other platforms." The error rate at that time was about 1.2 percent. The center now generates between 10 and 20 megabases of Q17 data per run on average, she said.
To gauge the instrument's performance with mammalian DNA, Baylor researchers also sequenced a series of rat BACs but have not completed their analysis yet. Those results will also reveal whether there are any problems with homopolymer runs, she said.
In addition, they have sequenced a set of 166 pooled amplicons, derived from four cancer genes, generating more than 7 megabases of aligned Q20 data and achieving 98 percent coverage for two of the target regions. They found nine variants, all of which they verified using Sanger sequencing. According to Muzny, the center is now expanding this set and plans to use it to validate variants found by high-throughput sequencing platforms like the Illumina HiSeq or Life Tech SOLiD. Currently, the center conducts these validation studies with either 454 or Sanger sequencing. The PGM will offer a quicker turnaround, and as its capacity increases, it will also be cheaper to use, she said.
In addition, Baylor has used the PGM for capture sequencing of targets ranging from 500 kilobases to one megabase in total size, which Muzny said "worked quite well." For one specific megabase-sized design, they covered 85 percent of the targets at 20-fold coverage, using 200-fold sequencing coverage. They also found good genotype concordance for a region from a HapMap sample and are "now progressing on to real samples in the pipeline," she said.
According to Muzny, the PGM is currently "in transition into production mode" at Baylor. At the moment, she said, "it's not a workhorse," but has a niche in targeted sequencing for validation. As the instrument's capacity increases, there will be additional applications, she said, such as sequencing of medium-sized genomes, and the platform "holds great promise to be a higher-throughput instrument in the future," with a faster turnaround than current high-throughput platforms.
The Broad Institute, which like Baylor has four PGM instruments, has also switched over to the 316 chip, getting "very good yields" from it, with some runs exceeding 250 megabases, said Chad Nusbaum, co-director of the Broad's genome sequencing and analysis program. The institute has also beaten Ion Torrent's own best runs, he said.
The quality of the PGM data is "good enough," he said, with an error rate of between 1 and 2 percent, depending on how it is measured.
At the moment, the Broad is using the platform for fast-turnaround projects and to validate mutations, but not in production processes because the instrument's capacity is not yet large enough for the institute's needs. Even for targeted sequencing applications, high-throughput machines like the HiSeq are still cheaper, he said, noting that the Broad does not sequence a lot of small target sets, where the PGM might be useful.
"Our interest in Ion Torrent is not so much in where it is today but where we think it can go," Nusbaum said, adding that "we are very excited about where it might be next year." Right now, the platform is ideal for users who want to get into sequencing at a low cost and who need results fast.
Nusbaum said he is encouraged by the fact that Ion Torrent has so far kept its promise of increasing the instrument's yield by 10-fold every six months. "You can talk the talk, and at least as far as today, they have walked the walk," he said. "They did hit that milestone, so it makes me more optimistic about the next one."
He said he believes the read length could increase "easily" from the current 100 bases to 150 bases. "Beyond that, it's hard to speculate on how hard that will be."
The Ion OneTouch, he said, also looks like it will save the amount of labor that Ion Torrent says it will, although the Broad does not have the machine in house yet. "I think everybody who knows about it is surprised, based on the sketches, that it does, in fact, work very well," he said. "That was a critical step for them."
As Ion Torrent improves the PGM's output, it is racing to where Illumina's desktop sequencer, the MiSeq, promises to start off later this year. Illumina has said that MiSeq will produce up to 1.5 gigabases per run, with more than 3.4 million reads and paired-end reads up to 150 bases in length, and will start shipping in the third quarter (IS 1/18/2011).
According to Muzny, there will likely be some overlap between the applications for PGM and MiSeq. Her center, she said, has not yet considered whether to acquire a MiSeq instrument.
Nusbaum agreed that the PGM and MiSeq will both be applied for rapid development of methods, but he said it is "impossible to say" how the PGM and MiSeq will compare "because the MiSeq is not available yet." Based on models of MiSeq's cost and performance, he said, the two platforms will be similar within a two- or three-fold range, though "it could easily flip."
"We will just have to run them and know what the reagents cost, and what the yields are, and how reliable the machine is."
Have topics you'd like to see covered in In Sequence? Contact the editor at jkarow [at] genomeweb [.] com.