Name: Zhimin Wang
Position: Professor, Department of Plant Science, School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai, China, since 2003
Experience and Education:
Professor, Hebei Academy of Agricultural and Forestry Sciences, Shijiazhuang, Hebei Province, China, 2001-2003
PhD in genetics and crop breeding, China Agricultural University (CAU), Beijing, 1997
(PhD project conducted at the Department of Crop Genetics, John Innes Centre, Norwich, UK, 1992-1994)
Visiting scientist, Department of Plant Science, University of Alberta, Edmonton, Canada, 1989-1990
MSc in genetics and crop breeding, CAU, 1984
BSc in genetics and crop breeding, CAU, 1982
Zhimin Wang, a professor in the School of Agriculture and Biology at Shanghai Jiao Tong University in China, is working on a sequencing approach that uses an exonuclease to cleave off single nucleotides from a DNA molecule and detects them by one of two different types of sensors.
In principle, his method is similar to the exonuclease-nanopore approach currently under development at Oxford Nanopore Technologies, but his concept differs in a number of key aspects.
Wang was scheduled to give a talk about his approach at the Advanced DNA Sequencing Technology Development Meeting of the National Human Genome Research Institute in Chapel Hill last month but was unable to obtain a visa in time. In Sequence caught up with him via e-mail and asked him to provide an overview of his work. Below is an edited version of the interview.
Can you describe your approach to exonuclease-assisted single-molecule sequencing?
There are actually two versions of our approach. In the first version, two electrophoretic microchambers are separated by a membrane with a single nanopore, so that ion exchange between the two chambers can happen only through the pore. One end of a long DNA fragment is attached to a magnetic bead, and the complex is then fixed onto a magnet at the cathode. The free end of the target DNA is subject to digestion by an appropriate exodeoxyribonuclease, or exonuclease.
The four kinds of individually released 2’-deoxyribonucleoside 5’-monophosphates, or dNMPs (dAMP, dCMP, dGMP, and dTMP), which are negatively charged in solution at neutral or higher pH, will pass through the nanopore while migrating from the cathode to the anode, driven by an applied electric field. Each kind of dNMP will leave a distinct “signature,” reflected in the magnitude of its ion current burst and its dwell time in the pore, because of its distinctive spatial configuration and molecular weight. If those dNMPs can sequentially pass through a sensitive pore, sequence information of the target DNA can be decoded from those signatures, recorded by a patch clamp amplifier connected to the electrodes.
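As a rough illustration of the decoding step described above, the following Python sketch assigns each translocation event to the dNMP whose reference signature is closest in current magnitude and dwell time. It is only a sketch under stated assumptions: the reference values, the scale factors, and the function names are hypothetical placeholders, not measurements or methods from Wang's group.

```python
from math import hypot

# Hypothetical reference signatures: (current change in pA, dwell time in ms).
# Placeholder numbers for illustration only, not measured values.
REFERENCE_SIGNATURES = {
    "dAMP": (300.0, 0.8),
    "dCMP": (1500.0, 0.5),
    "dGMP": (2600.0, 1.2),
    "dTMP": (450.0, 0.6),
}

# Assumed scale factors so that both features contribute comparably.
CURRENT_SCALE_PA = 100.0
DWELL_SCALE_MS = 0.1

BASE_LETTER = {"dAMP": "A", "dCMP": "C", "dGMP": "G", "dTMP": "T"}


def classify_event(current_pa, dwell_ms):
    """Return the dNMP whose reference signature is nearest to the event."""
    def distance(name):
        ref_current, ref_dwell = REFERENCE_SIGNATURES[name]
        return hypot((current_pa - ref_current) / CURRENT_SCALE_PA,
                     (dwell_ms - ref_dwell) / DWELL_SCALE_MS)
    return min(REFERENCE_SIGNATURES, key=distance)


def decode(events):
    """Turn an ordered list of (current_pa, dwell_ms) events into a base string."""
    return "".join(BASE_LETTER[classify_event(c, d)] for c, d in events)


if __name__ == "__main__":
    # Four made-up events, one near each placeholder signature.
    print(decode([(310, 0.8), (1480, 0.5), (2590, 1.2), (440, 0.6)]))
```

A real system would presumably calibrate the reference signatures from known standards and use a statistical model of the noise rather than a fixed nearest-neighbor rule.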
Theoretically, both single-stranded DNA and double-stranded DNA can be used as the target because exonucleases of various kinds are available. We, however, chose dsDNA and its corresponding exonucleases for simplicity and ease because ssDNA tends to form secondary structures through intra-strand base pairing, which may inhibit enzyme activity.
The second version has a similar framework, except that the nanopore connecting the chambers is replaced by a nano- or micro-channel, which harbors a sensor called a zero-mode waveguide, or an array of ZMWs, whose use in DNA sequencing with fluorescein-labeled dNTPs was first reported by Watt Webb’s group at Cornell University at the Conference on Lasers and Electro-Optics in 2002, and then in a paper in Science in 2003.
Raman spectrometry is related to near- and mid-infrared spectroscopy, which measures vibrational frequencies of various parts of a molecule. These frequencies are directly related to bond strength, mass of the bound atoms, and other factors, such as intramolecular interactions. The frequency pattern of a given molecule species is highly specific. When the cleaved dNMPs pass through the ZMW zone, each can be excited by a laser and emit a distinct Raman spectrometry signature, whose intensity can be enhanced up to four orders of magnitude by the ZMW. These signatures can be recorded by a Raman spectrometer, and converted into sequence information of the target DNA.
Similarly, RNA can also be sequenced using exoribonucleases.
What are the pros and cons of the two different types of sensors you are proposing to use — the nanopore sensor and the ZMW-enhanced Raman spectrometry sensor?
It is too early to really compare the two because the nanopore sensor has been used for about three years, and the ZMW-based sensor is a brand new design. However, based on our experience, my prediction would be that the ZMW-based sensor will be technically easier to fabricate but more expensive than the nanopore sensor. Dwell time of dNMPs in the pore, which could be another characteristic signature for identification, can be a limiting factor in nanopore sequencing systems in terms of speed, although it can be adjusted by changing experimental parameters, such as pore length (if it does not affect accuracy) and voltage. Dwell time can be easily regulated in the ZMW-based system.
Both sensors are expected to have an accuracy of over 99.99 percent and the ability to sequence native DNA and RNA and identify 5-methylcytosine. Other characteristics, such as read length and cycling rate, basically depend on the exonuclease we choose. After proof of principle, which sensor is easier to array may become an important factor in increasing the overall throughput.
What have you achieved so far in this project, and what do you still need to solve?
With the nanopores, we have detected dCMP and dGMP at high salt concentration to see whether we could observe clear current changes. After obtaining these positive results, we tried dAMP and dTMP at a lower salt concentration, close to an enzyme's working condition. Preliminary data suggest that this system is promising. There is a current gap of over 1,000 picoamps between dCMP and dGMP, and over 100 picoamps between dAMP and dTMP, implying that we are likely approaching an accuracy of 100 percent in a single run. However, more data will be needed to confirm this, for example by testing all four dNMPs under the same conditions that favor enzyme activity, and by sequencing a known stretch of DNA.
By the way, to our surprise, the signals are a current increase instead of a partial blockage, a phenomenon that has been reported by several groups using DNA fragments. The underlying mechanism will need to be elucidated.
One problem we have encountered is that the chips do not work for long, with fewer than two dozen signals collected before either the pore gets blocked or the membrane starts leaking. So we need to overcome these problems and to increase the pore’s longevity.
As to the ZMW-based sensor, we have just finished the design.
What are the greatest challenges you need to overcome in order to sequence DNA?
There are two pivotal and challenging techniques required in our system. The first is how to fabricate a highly sensitive nanopore, as it determines to what extent the current bursts from the four dNMPs can be different, that is, the system's accuracy. Previous work suggests that signatures of dNMPs are unlikely to be accurately distinguished by either native α-hemolysin or solid-state nanopores because of low signal-to-noise ratios. I have posited that a cylindrical nanopore will be a solution. However, controlling the geometry of solid-state nanopores has been a big challenge in the nanoworld. We thus attempted to use single-walled carbon nanotubes, or SWNTs, to fabricate our pore, which we designated a 2D-tunable nanopore because the diameter of the pore can be tuned between 1 and 2 nanometers, and the length from tens of nanometers to microns.
The second challenge is how to manipulate DNA at the single-molecule level, as our sensor requires exactly one DNA molecule at a time. We have developed a method to reliably attach a single DNA molecule, instead of zero or more than one, onto a bead, which can be delivered into the cis-chamber. We have thus completely circumvented the Poisson distribution.
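To show why the Poisson distribution matters here: if DNA molecules attached to beads at random, the number of molecules per bead would follow Poisson statistics, and the fraction of beads carrying exactly one molecule could never exceed about 37 percent, whatever the loading ratio. The Python sketch below works through that arithmetic. The random-loading model and the function names are illustrative assumptions, not a description of Wang's attachment method, which the interview does not detail.

```python
from math import exp, factorial


def poisson_pmf(k, mean):
    """Probability that a bead carries exactly k molecules under random loading."""
    return mean ** k * exp(-mean) / factorial(k)


def single_occupancy_fraction(mean):
    """Fraction of beads with exactly one molecule at a given mean loading."""
    return poisson_pmf(1, mean)


if __name__ == "__main__":
    # Scan a few mean loadings (molecules per bead); the peak is at a mean of 1.
    for mean in (0.1, 0.5, 1.0, 2.0):
        empty = poisson_pmf(0, mean)
        single = single_occupancy_fraction(mean)
        multiple = 1.0 - empty - single
        print(f"mean={mean:.1f}  empty={empty:.3f}  "
              f"single={single:.3f}  multiple={multiple:.3f}")
```

Under this model, lowering the loading ratio reduces multiply loaded beads but leaves most beads empty, which is the trade-off a deterministic one-molecule-per-bead method would avoid.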
For the ZMW-enhanced system, a challenge will be how to efficiently collect distinct Raman spectrometry peaks of every dNMP.
In particular, how do you make sure that the sensor doesn't miss nucleotides, and that they reach the sensor in the correct order?
For the ZMW-based sensor, we will match the cross-sectional area of the nano- or micro-channel to the effective size of the localized optical field in the single ZMW or the arrayed ZMW zone, respectively. In this case, there will be no chance for dNMPs to escape detection as long as they are not adsorbed onto the channel wall before reaching the detection zone, which we will prevent by using surfactants.
We performed molecular simulations to see whether the original order of the released dNMPs is disrupted by Brownian motion and by variations in the travel speed of different dNMPs between the point where they are released and the pore. The results show that Brownian motion is easily rectified, and that dNMPs will pass through the sensors sequentially under normal conditions; that is, the fastest dNMP will not overtake the slowest over a given distance, say 4 millimeters, at 120 millivolts and 37 degrees Celsius. If that were not the case, we would have backups.
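The ordering condition can be stated simply if Brownian motion is ignored: a faster dNMP released later must not arrive before a slower one released earlier, so the interval between releases has to exceed the spread in transit times over the travel distance. The Python sketch below checks that condition for a given set of drift velocities. It is a deliberately simplified deterministic model, not the molecular simulation Wang describes, and the velocities in the example are hypothetical placeholders.

```python
def arrival_times(bases, release_interval_s, distance_m, velocity_m_per_s):
    """Arrival time of each released dNMP, ignoring Brownian motion."""
    return [i * release_interval_s + distance_m / velocity_m_per_s[base]
            for i, base in enumerate(bases)]


def order_preserved(times):
    """True if every dNMP arrives strictly after the one released before it."""
    return all(earlier < later for earlier, later in zip(times, times[1:]))


def max_transit_spread_s(distance_m, velocity_m_per_s):
    """Largest difference in transit time between the slowest and fastest dNMP."""
    transits = [distance_m / v for v in velocity_m_per_s.values()]
    return max(transits) - min(transits)


if __name__ == "__main__":
    # Hypothetical drift velocities in m/s; real values would have to be measured.
    velocities = {"A": 2.000e-2, "C": 2.002e-2, "G": 1.999e-2, "T": 2.001e-2}
    distance = 4e-3    # 4 millimeters, the distance mentioned in the interview
    interval = 1e-3    # one release per millisecond, i.e. about 1 knt per second

    times = arrival_times("GATTACA", interval, distance, velocities)
    print("transit spread (s):", max_transit_spread_s(distance, velocities))
    print("order preserved:", order_preserved(times))
```

In this simplified picture, order is guaranteed whenever the release interval is larger than the value returned by `max_transit_spread_s`.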
How far away are you from a functioning DNA sequencer? How long do you think it will take?
This is a difficult question to answer without establishing a premise. If nothing gets seriously derailed beyond our imagination, it may take two to three years after we have proven the principle. The latter, however, will likely take another two or three years.
What kind of read lengths, speed, accuracy, multiplexing, and sequencing cost do you think this technology could achieve?
The nanopore sequencing version was designed based on four conceptual standards: high accuracy, long read length, high throughput, and low cost, coincidentally consistent with the gold standards set by the US National Human Genome Research Institute in 2004. Later, I added the ability to identify 5-methylcytosine as another important criterion.
The processivity and activity of the exonuclease are critical, as they translate into read length and cycling rate, respectively. We will first use λ exonuclease, as it has been extensively studied as a model and has shown a few favorable characteristics. Its processivity is about 18 kilonucleotides, but read lengths of over 5 meganucleotides can be expected if the enzyme is allowed to bind repeatedly. Akira Mizuno’s group at Toyohashi University of Technology in Japan, which used a dragging regime very similar to ours, reported that the digestion rate of the enzyme is about 1 kilonucleotide per second.
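The figures quoted above imply some simple arithmetic, sketched below in Python: how many successive enzyme binding events a 5-meganucleotide read would need at a processivity of about 18 kilonucleotides, and how long the digestion alone would take at about 1 kilonucleotide per second. The calculation assumes every binding event digests the full processivity length and ignores any delay while a new enzyme binds; the function names are illustrative.

```python
from math import ceil

PROCESSIVITY_NT = 18_000          # about 18 knt per binding event (from the interview)
DIGESTION_RATE_NT_PER_S = 1_000   # about 1 knt per second (from the interview)
TARGET_READ_NT = 5_000_000        # 5 Mnt read length (from the interview)


def binding_events_needed(read_nt, processivity_nt):
    """Minimum number of binding events to digest read_nt nucleotides."""
    return ceil(read_nt / processivity_nt)


def digestion_time_s(read_nt, rate_nt_per_s):
    """Time spent digesting, ignoring any rebinding delay (an assumption)."""
    return read_nt / rate_nt_per_s


if __name__ == "__main__":
    events = binding_events_needed(TARGET_READ_NT, PROCESSIVITY_NT)
    seconds = digestion_time_s(TARGET_READ_NT, DIGESTION_RATE_NT_PER_S)
    print(f"binding events: {events}")                 # 278
    print(f"digestion time: {seconds / 60:.1f} min")   # 83.3 min
```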
We anticipate an accuracy of beyond 99.99 percent per single run.
Our preliminary data show that dwell time may become a limiting factor in the nanopore system in terms of sequencing rate, which may be overcome by adjusting a few parameters, such as increasing voltage, decreasing pore length, or increasing pore diameter without sacrificing accuracy, or slowing down the digestion rate of the exonuclease, which can be compensated for by array techniques. The overall throughput will depend on how many sensors can be arrayed, and we are still waiting for a clear answer on that.
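To make the dependence on array size concrete, the Python sketch below estimates how many sensors would have to run in parallel to read a genome within a chosen time. The genome size (a round 6 billion nucleotides for a diploid human genome), the single-pass coverage, and the one-day target are illustrative assumptions; only the digestion rate of about 1 kilonucleotide per second comes from the interview.

```python
from math import ceil


def sensors_needed(genome_nt, coverage, rate_nt_per_s, run_time_s):
    """Sensors required to read genome_nt * coverage nucleotides in run_time_s."""
    total_nt = genome_nt * coverage
    return ceil(total_nt / (rate_nt_per_s * run_time_s))


def single_sensor_days(genome_nt, coverage, rate_nt_per_s):
    """Days a single sensor would need for the same job."""
    return genome_nt * coverage / rate_nt_per_s / 86_400


if __name__ == "__main__":
    GENOME_NT = 6_000_000_000   # assumed round figure for a diploid human genome
    COVERAGE = 1                # assumed single pass
    RATE = 1_000                # about 1 knt per second (from the interview)
    ONE_DAY_S = 86_400          # assumed target run time

    print(f"one sensor: {single_sensor_days(GENOME_NT, COVERAGE, RATE):.1f} days")
    print(f"sensors for a one-day run: "
          f"{sensors_needed(GENOME_NT, COVERAGE, RATE, ONE_DAY_S)}")   # 70
```

Slowing the exonuclease to ease the dwell-time limit lowers `rate_nt_per_s`, and the required number of sensors rises in proportion.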
Consumable costs for sequencing a diploid human genome will be much less than $1,000 because native DNA will be used as the target, eliminating cloning, amplifying, and labeling procedures.
How is your approach similar, and how does it differ from the exonuclease/α-hemolysin nanopore sequencing platform that Oxford Nanopore Technologies is developing? What are the advantages of your approach over theirs?
As to novelty, there is a long story. Briefly, our system was conceived in late 2003 when I came to Shanghai Jiao Tong University. While preparing my genomics course, I was inspired by the pioneering work of Carleton Stewart’s group at Los Alamos National Laboratory, the work of Daniel Branton and colleagues at Harvard University, and that of Richard Crooks’ group at the University of Texas at Austin. In 2004, my proposal, entitled "Preliminary studies on dynamics of dNMPs’ translocation through a nanopore under an applied electric field," was submitted to our university and was turned down.
I believed at that time that it would take quite a few years before a similar system could be designed. But what surprised me was an independent publication by [Oxford Nanopore Technologies founder] Hagan Bayley’s group at Oxford University in the Journal of the American Chemical Society in early 2006. We both had the idea to use an exonuclease to enlarge the spacing between adjacent bases, and to detect current change and dwell time during translocation of the released dNMPs through a nanopore. Both approaches are potentially capable of sequencing native DNA and RNA and of identifying 5-methylcytosine. Because highly processive exonucleases are available, both will have a long read length and, therefore, are suitable for both de novo sequencing and resequencing. It is thus obvious that the two systems have an extremely similar concept, if not completely the same.
Nevertheless, there are still some differences. For example, we chose dsDNA as a first target, while they are trying dsDNA and ssDNA simultaneously, according to their published reports. We want to place our target DNA at a distance from the pore, while they plan to fix the exonuclease at the entrance of the pore. The most significant differences are the types of pores and methods of pore adjustment. We use an SWNT pore and tune it by selection and microtoming, while they use an α-hemolysin pore and modify it by engineering.
First, our nanopore is made of an SWNT and should theoretically be extraordinarily stable and long-lasting. Second, an SWNT pore is atomically cylindrical and smooth, therefore maximizing the signal-to-noise ratio. A high aspect ratio of up to three orders of magnitude can be achieved through 2D-tuning, and will sufficiently increase accuracy. As mentioned above, our pore is apparently heading toward an accuracy of 100 percent, compared with a mean accuracy of 99.8 percent using the engineered α-hemolysin pore that was reported in Nature Nanotechnology last year.
In addition to de novo sequencing, the high accuracy of our system would allow us to perform single-cell-based resequencing, ChIP-seq, RNA-seq, 5-methylcytosine detection, and so on, providing a powerful tool not only for genomics, but also for epigenomics. These single-cell-based approaches would enable one to test how sequence variation, DNA methylation, gene expression, and silencing (including by small RNAs) are involved in the determination of cell fate, for example in carcinogenesis.
We predicted the ability of our pore to identify 5-methylcytosine in 2006. Three years later, Oxford Nanopore Technologies reported using their engineered α-hemolysin pore for this. Overall, we both are progressing more or less in parallel, at least in thinking and direction. Besides, our ZMW-based sensor differs from theirs even more.
Are you planning to commercialize your technology eventually, and if so, have you already partnered with a company, or are you planning to do so?
Yes, when it is possible. But the system is still in its infancy, and before we think about its commercialization, we need to prove its principle. Therefore, we have not yet partnered with any company. Of course, we will be happy to collaborate if any companies are interested in a joint venture, which would greatly enhance our progress toward real sequencing.
Have you patented your technology?
Yes, many patents are pending.
Are you applying for funding from the NHGRI $1,000 Genome program?
Yes, we have submitted two proposals.
What other plans do you have regarding single-molecule DNA sequencing?
Based on our achievements in single DNA manipulation, we are now thinking about expanding this in two directions. One is to collect and deliver single protein molecules, such as DNA polymerase and exonuclease, into a chamber. The other is to fabricate chips patterned with single biomacromolecules — DNA, RNA, and proteins.
The eventual goal is to fabricate chips with a molecule density comparable to the resolution limit of an optical system. These chips would benefit existing sequencing systems. For example, they could convert some ensemble sequencing approaches into single-molecule-based ones, and increase the throughput of third-generation sequencing-by-synthesis through patterned instead of random sample loading. We have also designed methods to slow the translocation speed of a single-stranded DNA molecule through a nanopore down to 1 nucleotide per millisecond or less, which would be a great breakthrough in sequencing by threading DNA through nanopores with a constriction zone a few angstroms long, such as the α-hemolysin and MspA pores.