Providing a taste of what third-generation sequencing might look like, Pacific Biosciences last week outlined its single-molecule real-time, zero-mode waveguide DNA-sequencing technology, and its progress to date.
The Menlo Park, Calif.-based company on Saturday gave its first public presentation of the technology in a packed auditorium at the Advances in Genome Biology and Technology meeting in Marco Island, Fla.
PacBio projects that with improvements to its enzyme biochemistry and in camera technology, it will eventually be able to generate more than 100 gigabases of sequence data per hour, or a diploid human genome at about 15-fold coverage; provide reads at least as long as Sanger sequencing; and offer run times measuring in minutes at a cost of hundreds of dollars.
Using a prototype system, PacBio researchers have already shown they can get read lengths of more than 1,500 bases and multiplex hundreds of sequencing reactions. The company predicts it will be selling its instruments to early adopters sometime in 2010 at a similar price to currently available next-generation sequencers.
PacBio presented its technology during the final session on the last day of the conference to a receptive audience. “It was startling to me, and very creative, and very exciting, if they can pull this off,” Dick McCombie, a professor at Cold Spring Harbor Laboratory, told In Sequence. “It’s very new, but certainly, if they came close to what they are stating they are trying to achieve, it would be a dramatic change in technology” that would change the way sequencing is used.
“I personally find the zero-mode waveguide technology thrilling and believe it has enormous potential,” said Harold Swerdlow, head of sequencing technology at the Wellcome Trust Sanger Institute, in an e-mail message.
The company had raised expectations about its technology talk with beach fireworks and a panel discussion on human genome sequencing earlier during the conference. “They worked pretty hard to get some buzz going at the meeting, in advance of their talk, and I think they delivered on the hype,” Chad Nusbaum, co-director of the genome sequencing and analysis program at the Broad Institute, told In Sequence. “Given where they are in their technology development cycle, I am very optimistic.”
At the heart of PacBio’s technology lies DNA polymerase, a “nanoscale sequencing enzyme” that has several favorable properties, including the ability to use a single DNA molecule as a template; the ability to yield read lengths of thousands of bases; a low error rate; and high speed.
PacBio researchers follow each base the polymerase incorporates in real time by “feeding” it with fluorescently labeled nucleotides. When the correct nucleotide enters the active site of the enzyme, the polymerase traps it for several milliseconds.
“During that time, we have the opportunity to collect fluorescent light from the system,” explained PacBio Co-founder and Chief Technology Officer Steve Turner. Incorrect nucleotides, on the other hand, diffuse in and out of the enzyme quickly.
But in order to be able to detect fluorescence from just a single nucleotide without interference from others that also float around in the system, the observation volume must be made much smaller.
Enter zero-mode waveguides, or ZMWs, which are tiny wells with metal sides and a glass bottom that are made by punching holes tens of nanometers wide in a 100-nanometer aluminum film that is deposited on glass. When a laser is shone at the wells from below, it cannot penetrate them because its wavelength is bigger than the hole. The effect is similar to how microwaves cannot exit the perforated screen of a microwave oven door.
However, some attenuated light forms an evanescent field just inside the well near its bottom, creating a tiny illuminated detection volume of 20 zeptoliters, small enough to observe a single molecule of DNA polymerase holding on to a nucleotide, but no surrounding fluorescent molecules.
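To give a feel for why a 20-zeptoliter detection volume isolates single molecules, the following minimal sketch estimates the average number of freely diffusing molecules inside that volume. The 20-zeptoliter figure comes from the presentation as reported here; the concentrations are hypothetical values chosen only for illustration, not figures PacBio has disclosed.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)

def mean_molecules_in_volume(concentration_molar: float, volume_liters: float) -> float:
    """Average number of molecules present in a volume at a given molar concentration."""
    return concentration_molar * AVOGADRO * volume_liters

ZMW_VOLUME_L = 20e-21  # 20 zeptoliters, as quoted in the article

# Hypothetical labeled-nucleotide concentrations, for illustration only.
for conc in (1e-9, 1e-6, 1e-5):
    n = mean_molecules_in_volume(conc, ZMW_VOLUME_L)
    print(f"{conc:.0e} M -> {n:.5f} molecules on average in the detection volume")
```

Under these assumed concentrations the average occupancy stays well below one molecule, which is consistent with the idea that a nucleotide held by the polymerase can be seen against a mostly empty background.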
Turner and Cornell University Professor Harold Craighead founded the company, initially called Nanofluidics, in or around 2002. After moving from Ithaca, NY, to Menlo Park, Calif., the company was renamed Pacific Biosciences.
Several years earlier, when Turner was a graduate student in Craighead’s laboratory, he and his colleagues discovered ZMWs while working on new optical-confinement techniques for single-molecule sequencing at Cornell’s Nanobiotechnology Center.
“It was somewhat of an accidental discovery,” Turner recalled. The scientific literature described apertures they were looking to use that were made entirely of glass. However, “it’s easier to punch a hole through a metal film, and we thought, ‘it will still work.’”
Some of the holes were invisible under a microscope, though, until they switched it to epifluorescence mode. “The reason we could not see them is because they were acting as zero-mode waveguides and no light could get through,” Turner explained.
Inside, the light is exponentially attenuated rather than geometrically. “We immediately realized that we had something new, so we filed a patent” and dropped the other confinement approaches they were working on. The patent was granted with all its claims, “much to the chagrin of some of my colleagues in neighboring groups, who thought that we were patenting the hole,” Turner said.
In order to observe fluorescence during DNA synthesis when a nucleotide is incorporated, but not afterwards, the Cornell scientists attached the label not to the base but to the terminal phosphate group of the nucleotide, which is lost at the end of the reaction and diffuses away.
Another advantage of this type of nucleotide is that the newly synthesized DNA remains completely natural. Around 2000, the researchers teamed up with a group at Amersham, a company that is now part of GE Healthcare, that was also working on phospholinked nucleotides, but for SNP genotyping. “Since then, they have discontinued all their work on this effort, so we actually hired in the entire synthesis group that they had,” Turner said.
One challenge is that every single nucleotide in the reaction has to be labeled in order to avoid gaps in the sequence. To that end, the scientists add an alkaline phosphatase to the reaction mix that destroys native, unlabeled nucleotides, but not labeled ones.
In order to develop the technology for sequencing, the company had to overcome several obstacles. For example, the researchers had to make sure they could immobilize a single active polymerase in the bottom of a ZMW, so they developed a surface coating that repels proteins and sticks to the metal sides but not the glass floor of the chamber. They also showed that they can approach the theoretical limit of random distribution, in which 37 percent of ZMWs are loaded with exactly one enzyme molecule.
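The 37 percent figure matches what a Poisson model of random loading would predict: if enzymes land in wells independently, the fraction of wells holding exactly one enzyme peaks at 1/e, or about 36.8 percent, when the average is one enzyme per well. The sketch below assumes that Poisson model; the function names are illustrative and do not come from PacBio.

```python
import math

def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k events under a Poisson distribution with the given mean."""
    return math.exp(-mean) * mean**k / math.factorial(k)

def fraction_single_loaded(mean_enzymes_per_well: float) -> float:
    """Fraction of wells expected to hold exactly one enzyme under random loading."""
    return poisson_pmf(1, mean_enzymes_per_well)

# Scan a few average loading levels; the single-occupancy fraction peaks at a mean of 1.
for mean in (0.25, 0.5, 1.0, 2.0, 4.0):
    print(f"mean {mean:>4} enzymes/well -> {fraction_single_loaded(mean):.3f} singly loaded")
```

Loading more enzyme does not help in this model: above an average of one per well, more wells end up with two or more polymerases, so the singly loaded fraction falls again.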
Eventually, Turner said, the researchers want to use a nanotech-based self-assembly method to place exactly one polymerase in each well, which would increase sequencing throughput, but they have decided to develop this capability later on.
They also tested the maximum length of DNA that a single polymerase in a ZMW can effectively synthesize. Using base-labeled nucleotides that get incorporated into the DNA, they found that the enzyme generated 5,000 bases of DNA on average during a 30-minute run, and in some cases more than 25,000 bases of DNA during a 2-hour run.
“This technology has the potential to reach very long read lengths, much longer than any technology that has come before it,” Turner concluded.
Finally, the researchers had to make sure the technology can detect four differently labeled types of nucleotides at the single-molecule level with high fidelity.
“Clearly, this is a challenging task” and required the company to make several improvements to the detection technology, Turner said. As a result, both excitation and detection occur through the glass bottom of the ZMW chip. A prism disperses the light according to its color before it reaches a single-photon-sensitive CCD array, a monochrome detector.
In a recent proof-of-concept experiment, the researchers used their technology to successfully sequence native DNA purified from a cell, but encountered a significant error rate.
“The accuracy level is not yet ready for prime time,” Turner admitted, but “we understand what the issues are; it’s primarily due to the kinetics of the enzyme ... and we are very confident that we are going to be able to bring that up to well above what would be commercially required” using protein evolution, he said. “The underlying raw accuracy is improving every day with our [enzyme] mutation efforts.”
Modifying the DNA polymerase is central to PacBio’s internal development. “The enzyme is really the core of our sequencing operation, so we have to be the best in the world at polymerase-directed evolution,” Turner said.
Other than by improving the polymerase, the researchers can deal with errors by sequencing a circularized template several times, a process they call “circular consensus sequencing.” As an example, they sequenced a small synthetic DNA circle more than 10 times using a strand-displacing polymerase and reached a total read length of 1,500 bases.
“What’s important to realize is that the dominant source of error in a single-molecule system are statistically uncorrelated errors [that] wash out of consensus sequence exponentially fast,” Turner said. As a result, “from an accuracy point of view, our consensus accuracy is going to be better than any other technology will produce.”
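Turner's point about uncorrelated errors washing out can be illustrated with a simple majority-vote model. The sketch below is not PacBio's consensus algorithm. It assumes each pass over a base is wrong independently with the same probability, treats every wrong call as the same wrong answer (a pessimistic simplification), and calls the consensus wrong when more than half of the passes are wrong. The 15 percent per-pass error rate is a hypothetical value; the company did not state one.

```python
from math import comb

def consensus_error(per_pass_error: float, passes: int) -> float:
    """Probability that a majority vote over an odd number of independent passes is wrong."""
    if passes < 1 or passes % 2 == 0:
        raise ValueError("use a positive, odd number of passes so the vote cannot tie")
    p = per_pass_error
    majority = passes // 2 + 1
    # Sum the binomial probabilities of every outcome in which wrong calls form a majority.
    return sum(
        comb(passes, k) * p**k * (1 - p) ** (passes - k)
        for k in range(majority, passes + 1)
    )

HYPOTHETICAL_RAW_ERROR = 0.15  # assumed per-pass error rate, for illustration only

for n in (1, 3, 5, 7, 9, 11):
    print(f"{n:>2} passes -> consensus error {consensus_error(HYPOTHETICAL_RAW_ERROR, n):.2e}")
```

In this model each additional pair of passes cuts the consensus error by a roughly constant factor, which is the exponential improvement Turner describes. Errors that are correlated between passes, such as a systematic miscall in a particular sequence context, would not wash out this way.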
Turner envisions different sequencing modes for the technology, which could use different types of polymerases and nucleotides. Examples include circular consensus sequencing for heterogeneous samples where high accuracy is required, and expression profiling, where each polymerase must sequence many short DNA tags.
Paired-end sequencing would also be possible, he said, requiring an appropriate sample-prep method.
‘Great Promise,’ but Early Days
Several experts find the technology compelling. “I think their technology holds great promise,” Elaine Mardis, co-director of the Genome Sequencing Center at Washington University and a consultant to PacBio, told In Sequence by e-mail. “I think it will compete strongly for long read technology needs, such as de novo and resequencing of genomes from microbial to human.”
However, short-read technologies, such as those from Illumina, ABI, and Helicos “will be important for ChIP, microRNA, and other types of experiments that only require a short sequence tag,” she added.
The instrument, once it has been developed, has “the potential to achieve levels of throughput and quality that we don’t see in any of the current technologies,” according to the Broad Institute’s Nusbaum. “There is also the possibility of substantially longer read length, although 454’s read lengths keep going up and up. The other thing I really like about the machine is the incredibly quick turnaround time; they are talking about runs being on the order of minutes rather than days.”
“Because the technology is based upon single molecules, I am skeptical that the instrument can achieve similar raw read accuracies to the best of the conventional and next-generation instruments available today,” the Sanger Institute’s Swerdlow said. “However, we believe that even with somewhat low accuracy, the ultra-long reads possible with this method will be a great advantage for certain applications, for example, de novo sequencing.”
Swerdlow said he does not see “how one can read out of CCD cameras any faster than other platforms, so there will still be this limit to the speeds they can attain.”
Indeed, the system’s throughput is limited by the detection technology at the moment. “Right now, our bottleneck is the bandwidth of the output amplifier on our CCD camera,” according to Turner. “When a new camera becomes available that enables us to look at a million zero-mode waveguides, we will be able to release a new instrument that has greatly improved capacity.”
That camera would need to have approximately 20 million pixels, he said, and would have to be joined with on-chip amplification that the company uses to obtain single-photon sensitivity. “This level of multiplex already exists in CCD cameras, and this [amplification technology] already exists, they just haven’t been married,” Turner said.
Another limit for the technology will be its fluorescent labels. The speed of incorporation could be increased from the current 10 bases per second to 50 bases per second, but “at that point it would be limited by the number of photons per second that you can get from small molecule organic fluorophores,” Turner said. “Beyond 50 bases, we think we need some other kind of labeling technology.”
The company has already made chips with a million ZMWs, although it can currently only interrogate 3,000 ZMWs simultaneously. Coupled with improved camera technology and enzyme speed, the system could eventually have a sequencing throughput of over 100 gigabases per hour, according to Turner.
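The figures quoted above can be combined into a rough throughput estimate. The sketch below multiplies the number of observed ZMWs by the fraction that carry a working polymerase and by the incorporation speed. How PacBio arrived at its own 100-gigabase projection was not described, so this combination, the function name, and the loading fractions are assumptions made for illustration.

```python
SECONDS_PER_HOUR = 3600

def throughput_bases_per_hour(num_zmws: int, loaded_fraction: float, bases_per_second: float) -> float:
    """Raw bases per hour if every productively loaded ZMW sequences continuously."""
    return num_zmws * loaded_fraction * bases_per_second * SECONDS_PER_HOUR

scenarios = {
    # name: (ZMWs observed, assumed fraction with one active enzyme, bases per second)
    "current, every well loaded (upper bound)": (3_000, 1.00, 10),
    "current, random loading at 37 percent": (3_000, 0.37, 10),
    "projected, random loading at 37 percent": (1_000_000, 0.37, 50),
    "projected, every well loaded": (1_000_000, 1.00, 50),
}

for name, (zmws, loaded, speed) in scenarios.items():
    gigabases = throughput_bases_per_hour(zmws, loaded, speed) / 1e9
    print(f"{name}: {gigabases:.3f} gigabases per hour")

# Loading fraction needed to reach 100 gigabases per hour with 1 million ZMWs at 50 bases/s.
required = 100e9 / throughput_bases_per_hour(1_000_000, 1.0, 50)
print(f"required loaded fraction: {required:.2f}")
```

Under these assumptions, a million ZMWs at 50 bases per second would pass 100 gigabases per hour only if more than about 56 percent of wells carried an active polymerase. That would fit with the company's interest in loading methods that beat the random-loading limit, although the article does not make that connection explicitly. The estimate also ignores time lost between runs and reads discarded for quality.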
But those will not be the specifications of the first instrument. After hitting a series of feasibility milestones last November, the company decided a few weeks ago to develop its first commercial system.
It is not yet clear what the performance specs of that first system will be, or when it will be available, but “sometime in 2010, we will definitely be selling systems to early adopters,” PacBio Chairman and CEO Hugh Martin told In Sequence.
The company’s marketing team has already talked to approximately 30 potential customers to find out about their needs, which will help determine the initial specifications.
The price of the instrument will likely be in the range of that of 454’s Genome Sequencer, Illumina’s Genome Analyzer, or ABI’s SOLiD system, which currently sell for between $400,000 and $600,000.
“I don’t see any fundamental reason why the instrument would not be in the same general range as all of the current next-gen products, except for Helicos’,” which has a list price of $1.35 million, Martin said.
Consumables will be “competitive with the cost per base of any of the other systems at the time we release,” he added, though “there may be a slight premium to get the long read lengths.”
A single run, which would probably take only a few minutes, would cost on the order of several hundreds of dollars, he said, though the system will be able to stage several experiments so operators can let it run for a longer period.
Sample prep will be “much less labor intensive” and “overall less expensive to conduct” than on current next-gen sequencers, Turner predicted. The system is capable of sequencing both linear and circular templates, and both single- and double-stranded DNA.
In the near future, PacBio will start to grow its head count from just over 100 to about 200, “which is about the size of the development team that we are going to need to get this completed,” Martin said.
In order to sustain its growth, PacBio plans to raise an additional $80 million in venture capital over the next year. Since it was founded, the company has raised approximately $71.5 million in venture capital from Kleiner Perkins Caufield & Byers, Mohr Davidow Ventures, Alloy Ventures, Maverick Capital, and others, and has received $6.6 million in funding from the National Human Genome Research Institute under its “$1,000 Genome” sequencing technology program.
PacBio is not the only company that has been developing a single-molecule real-time sequencing technology based on DNA polymerase. Another one is VisiGen Biotechnologies, based in Houston, which said last year that it wants to sequence a megabase per second, or about 4 gigabases per hour (see In Sequence 5/8/2007).
VisiGen measures interactions between an immobilized DNA polymerase carrying a donor fluorophore and a nucleotide carrying an acceptor fluorophore using Förster resonance energy transfer, or FRET. As of last fall, the company was planning to offer a sequencing service by the end of 2009, followed by sales of sequencing instruments (see In Sequence 10/23/2007).