An article published two weeks ago by Pacific Biosciences disclosed details of its sequencing technology and showed that the company can collect sequence data from single DNA polymerase enzymes.
The paper serves as a proof of concept for the technology and is “really a full disclosure of how the system works,” Steve Turner, PacBio’s chief technology officer, told In Sequence two weeks ago.
The company, which raised $20 million in venture capital last month (see Short Reads in this issue), first discussed its real-time single-molecule sequencing technology publicly at the Advances in Genome Biology and Technology conference in February (see In Sequence 2/12/2008), and the paper contains much of the same data the firm showed at the time. But it also reveals previously undisclosed details relating to the surface and nucleotide chemistry.
Turner said he and his colleagues have “made tremendous progress on all fronts,” including the technology’s read length, accuracy, and degree of multiplexing, since the data for the paper was generated early this year.
In the article, which appeared online in Science two weeks ago, they showed they can sequence short, synthetic stretches of DNA.
In initial experiments, the PacBio researchers used two kinds of fluorescently labeled nucleotides to monitor the synthesis of a short single-stranded template, which allowed them to record two of the four DNA bases. In order to test how long the enzyme, a modified version of Φ29 polymerase, can keep going, they also used a short circular template that enabled them to observe several kilobases of DNA synthesis.
Eventually, the team used all four types of differently labeled nucleotides to determine the bases of a 150-nucleotide linear stretch of DNA, a sequence that occurs in nature.
The error rate of a single read was on the order of 15 to 20 percent, though the median accuracy at 15-fold coverage was 99.3 percent. Deletions, the dominant error type, resulted from nucleotide incorporation events that were too brief to be recorded. In contrast, most insertions were caused by nucleotides that bound to the enzyme initially but were released before being incorporated. Mismatches occurred when the system confused fluorescent dyes whose spectra were close to each other.
According to Turner, the accuracy of the system “has improved considerably.” For example, he showed at a conference in October that the company has obtained an error-free sequence of the 5.4-kilobase bacteriophage φX174 genome at 13-fold coverage (see In Sequence 10/14/2008).
Improvements in accuracy have come from a faster camera, brighter dyes, more effective signal collection, and modifications to the polymerase so that it holds on to nucleotides longer. PacBio’s target for the commercial launch of the instrument is that “the fold coverage needed to get to Q50 reads,” or 99.999 percent accuracy, “is going to be well under 20,” he said.
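For readers unfamiliar with the "Q" notation, the short Python sketch below shows the standard Phred relationship between a quality score and per-base accuracy, Q = -10 · log10(error probability). The function names are illustrative and do not come from PacBio or from the Science paper.

```python
import math


def phred_to_accuracy(q: float) -> float:
    """Return the per-base accuracy implied by a Phred quality score q."""
    return 1.0 - 10.0 ** (-q / 10.0)


def accuracy_to_phred(accuracy: float) -> float:
    """Return the Phred quality score for a per-base accuracy below 1."""
    return -10.0 * math.log10(1.0 - accuracy)


if __name__ == "__main__":
    # Print the accuracy implied by a few common quality thresholds.
    for q in (20, 30, 50):
        print(f"Q{q}: {phred_to_accuracy(q) * 100:.3f} percent accurate")
```

Under this convention, Q50 corresponds to one error in 100,000 bases, which matches the 99.999 percent figure Turner cited, while the 99.3 percent median accuracy reported in the paper works out to roughly Q21.5.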
Turner also pointed out that the technology, which includes a sample-prep method that generates circular templates, allows users to sequence the same strand of DNA multiple times, thus increasing the overall accuracy of the read.
“The multiple passes means that the raw read accuracy is a variable that the user can flexibly turn according to the needs of the system,” he explained. For scaffolding, for example, customers might sequence with long reads and low accuracy, while for finding rare mutations, they might increase the accuracy at the expense of read length.
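The trade-off Turner describes can be illustrated with a deliberately simplified model, which is not PacBio's consensus method. The sketch assumes that each pass over a base is wrong independently with the same probability and that the consensus is a simple majority vote across passes. Real errors, which the paper says are dominated by deletions, are unlikely to behave this neatly. The function name and parameters are hypothetical.

```python
from math import comb


def majority_error(per_pass_error: float, passes: int) -> float:
    """Probability that more than half of `passes` independent calls are wrong.

    Toy assumption: every pass errs independently with the same probability,
    and all wrong calls agree with one another (a pessimistic simplification).
    Use an odd number of passes so that ties cannot occur.
    """
    needed = passes // 2 + 1  # smallest number of wrong calls that outvotes the right ones
    return sum(
        comb(passes, k) * per_pass_error**k * (1.0 - per_pass_error) ** (passes - k)
        for k in range(needed, passes + 1)
    )


if __name__ == "__main__":
    # Show how the toy consensus error shrinks as the number of passes grows.
    for n in (1, 3, 5, 15):
        print(f"{n:2d} passes: consensus error {majority_error(0.15, n):.6f}")
```

In this toy model, each additional pair of passes lowers the consensus error, but because the polymerase can read only so many bases in total, every extra pass over a circular template shortens the stretch of unique sequence obtained.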
The maximum read length of the commercial system will initially be similar to that of conventional Sanger sequencing technology, or about 800 to 1,000 bases, company representatives have said before.
PacBio has also “considerably” increased the number of reads the system can obtain in parallel, according to Turner, though the company has not yet decided how much the commercial instrument will be able to multiplex.
“We haven’t frozen the design, so it’s premature to release what that number is, but we will be doing that soon,” Turner said. Better camera technology will improve multiplexing even further in the future, he predicted. “We think that by 2013, we will be able to provide a 1 million-plex system that simultaneously monitors a million zero-mode waveguides.”
Earlier this year, the company showed in a paper published in Optics Letters that it can monitor several thousand reaction chambers in parallel. The technology described in the Science paper used an array of about 3,000 tiny holes, or zero-mode waveguides, of which about a third held a single polymerase molecule that can produce sequence data.
The Science article also revealed for the first time the nature of PacBio’s labeled nucleotides. Unlike natural DNA synthesis, which uses nucleoside triphosphates, PacBio’s technology employs nucleoside hexaphosphates that carry a fluorescent dye on the terminal phosphate group.
Earlier this year, the company published a paper in Nucleosides, Nucleotides and Nucleic Acids showing that Φ29 DNA polymerase can efficiently synthesize DNA using labeled nucleoside pentaphosphates.
The original reason for adding a phosphate group was to replenish a negative charge that the coupling of the dye had removed, according to Turner. “We found that the addition of a fourth phosphate made the performance better,” he said, and additional phosphate groups improved it even further.
Another enhancement of its technology that the company disclosed in its most recent paper is a modification to its surface chemistry that helps to orient the polymerase and prevent its direct contact with the floor of the reaction chamber.
“It may seem like a minor change from a conceptual point of view, but from an actual operational point of view, that was one of those watershed occasions that put us forward and made it possible to present the work that we did,” Turner said.
PacBio decided to publish its technology so far ahead of its launch date — it is slated to start selling its system in the second half of 2010 — to set the stage for more ambitious sequencing projects in the future.
“We wanted to have a seminal paper that we could cite when we come out with our publication of a larger genome,” said Turner, describing a “method paper that we could cite that describes how the technique works.”
The point was to publish “early enough in the product cycle that everybody would understand that this is the proof-of-concept” that does not yet reflect the capabilities of the future commercial product, he explained. He said the company has already “moved on to larger genomes” and plans to present those results at the AGBT conference in February.
Although PacBio will focus on developing its technology for DNA sequencing, Turner and his colleagues believe it could also have applications in other areas, such as studying DNA-binding proteins, polymerase inhibitors, and methylation effects.
“We think that in the long term, this paper is certainly going to be the seminal paper for our sequencing technology, but we think it’s also ushering in an era of real-time biology,” Turner said, allowing researchers to study single molecules in close-to-natural environments. “And we think that’s a new capability that is going to have wide-ranging applications.”