By Julia Karow
As other DNA sequencing firms continue developing platforms that they say will deliver long reads at low cost, 454 Life Sciences plans to make additional upgrades to its Genome Sequencer FLX platform, aiming to take market share from Sanger sequencing.
Over the next few months, the company, which is part of Roche, plans to roll out a variety of improvements for the GS FLX, including longer reads, a new library protocol, amplicon sequencing for the Titanium platform, and new assembly software.
Earlier this year at the Advances in Genome Biology and Technology conference, 454 Vice President of R&D Michael Egholm said that the company had internally reached an average read length of about 660 base pairs using its Titanium chemistry.
Now, the company, whose platform was the first second-generation sequencing technology to come to market, is "putting the finishing touches" on a longer-read capability. It plans to make the longer reads available to a set of early-access customers during the fourth quarter, followed by a general launch sometime next year, Egholm told In Sequence this month.
454 has been achieving a "modal read length," or read length peak, of between 700 and 800 bases, he said, and reads of up to 1,000 bases in length. In addition, a new base-calling method has allowed it to improve the accuracy of the reads "significantly."
"Our goal is really simple: It is to intercept Sanger sequencing with this, and we feel very comfortable with being there," Egholm said.
He said applications for the long reads include de novo sequencing, metagenomics, and possibly human resequencing, although only time will tell what researchers will find them most useful for.
Companies also working on long-read sequencing technologies include Pacific Biosciences, with its single-molecule real-time platform, and Illumina, with its Avantome technology, though Egholm said possible competitors don't offer the same quality and throughput as 454’s system.
"I think the place where we are going — millions of highly accurate 1,000-base-pair reads — is a very unique space," he said. "The speed and the accuracy and the number of reads, that combination, I don't right now see any of the would-be competitors targeting. That's not to say that it won't come, but certainly, we don't see it."
A new application 454 will be launching in the fourth quarter is amplicon sequencing using its Titanium chemistry, which will enable users to sequence amplicons up to about 500 base pairs in length. This required the firm to develop a new algorithm for base calling, Egholm said.
Long amplicon sequencing "is really important for future diagnostic development," he noted. At present, he said, 454 sees "a lot of traction" for amplicon sequencing in research into HIV and other chronic viral infections, HLA sequencing, and sequencing of genes that play a role in the immune system.
The company is also "investing heavily in making our system much less expensive, much easier to use, and much more reliable," he said.
For example, 454 will also launch a new library protocol during the fourth quarter that uses a fluorescent tag, allowing users to directly and accurately quantify their libraries. The new protocol uses about a fifth of the starting DNA required previously, he said.
The firm will also introduce a new protocol for cDNA or transcriptome sequencing that combines a "very standard" way of generating double-stranded cDNA fragments with the new library preparation protocol and results in a more even distribution of read lengths and a higher yield, according to Egholm.
Lastly, 454 plans to launch new modules for its assembly software in the fourth quarter, including a new version of the Newbler assembler and a de novo transcriptome assembler.
The latest Newbler is designed to assemble a human-sized genome de novo in half a week to a week, and has already been used by customers "with very nice results," Egholm said. Unlike previous versions of the software, which required large compute clusters, the most recent version will require only a mid-sized computer that costs around $50,000, he added. This is possible because instead of starting with 12- to 16-base-pair overlaps between reads, which created "a lot of false feeds," the software looks for 200-base-pair overlaps between reads.
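As a rough illustration of why short overlaps tend to produce spurious matches, the Python sketch below estimates how often a given overlap sequence would be expected to turn up by chance in a genome-sized stretch of DNA. It is a back-of-the-envelope model, not a description of how Newbler works: the function name is hypothetical, the genome is assumed to be about 3 billion bases of uniformly random sequence, and real genomes, which contain repeats, would behave worse for short overlaps.

```python
def expected_chance_matches(overlap_len: int, genome_size: float = 3e9) -> float:
    """Expected number of chance occurrences of one specific sequence of
    overlap_len bases in a random genome of genome_size bases.

    Assumption (not from the article): each base is independent and each of
    the four bases is equally likely, so a given sequence matches at any
    position with probability 1 / 4**overlap_len.
    """
    return genome_size / 4.0 ** overlap_len


# Compare the overlap lengths mentioned in the article.
for overlap_len in (12, 16, 200):
    print(overlap_len, expected_chance_matches(overlap_len))
```

Under these assumptions, a specific 12-base sequence would be expected to occur well over 100 times by chance in a human-sized genome and a 16-base sequence a little under once, while a 200-base sequence would be expected essentially never, which is consistent with the idea that longer overlaps leave far fewer false candidates to sort through.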