Name: Jacob Schmidt
Position: Associate Professor, Department of Bioengineering, University of California, Los Angeles, since 2003
Experience and Education:
Postdoc, Cornell University and UCLA, 2000-2003
Postdoc, Rice University, 1999-2000
PhD in physics, University of Minnesota, 1999
Undergraduate degree in physics, University of Minnesota, 1991
Jacob Schmidt leads a nanobiotechnology research team at the University of California, Los Angeles, with a particular focus on developing devices such as pumps and sensors based on engineered membrane proteins. As part of this work, his team has explored the use of the nanopore protein alpha-hemolysin for DNA sequencing.
Schmidt and a graduate student in his lab, Robert Purnell, recently published a paper in ACS Nano in which they demonstrated that the alpha-hemolysin pore is able to detect single-base substitutions on strands of polyhomonucleotides, specifically polythymine, or polyT. The work built on a previous study in which Schmidt and his colleagues demonstrated that homopolymers of A, C, and T produced distinct current signals as they traversed the pore.
In the ACS Nano study, the researchers used a streptavidin "stopper" to immobilize strands of polyT in which single bases of A, C, and G were substituted at 12 different positions. Based on the current signals they obtained, it appeared that the pore was able to distinguish the base substitutions for each nucleotide. However, the measurements indicated that there were actually several sensitive regions within the pore, which made it difficult to determine the exact position of the substitution along the strand.
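As a rough illustration of that experimental design, the set of test strands can be enumerated in a few lines of Python. This is only a sketch: the strand length of 40, the window of positions 8 through 19, and the function name `substituted_strands` are assumptions made for the example, not values taken from the ACS Nano paper.

```python
def substituted_strands(length=40, positions=range(8, 20), bases="ACG", background="T"):
    """Return {(base, position): sequence} for single-base substitutions.

    Positions are 1-indexed. The defaults are illustrative assumptions,
    not the strand length or positions used in the study.
    """
    strands = {}
    for base in bases:
        for pos in positions:
            # background bases before the substitution, the substituted base,
            # then background bases filling out the rest of the strand
            seq = background * (pos - 1) + base + background * (length - pos)
            strands[(base, pos)] = seq
    return strands


library = substituted_strands()
print(len(library))  # 3 bases x 12 positions = 36 strands
print(library[("A", 12)])
```

Each entry corresponds to one immobilized strand and one current measurement, so comparing the measurements across positions for a given base is what maps out where the pore is sensitive.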
In Sequence spoke to Schmidt last week about this study and what his findings imply for nanopore sequencing in general. The following is an edited version of the conversation.
Can you describe how this study expands on your previous work, in which you were able to use a nanopore to distinguish polynucleotides of A, C, and T?
In our previous paper, we were able to tell the difference between polyA, polyC, and polyT, and we actually saw a big difference depending on what end of the DNA we put in first. So that was really interesting. But the point behind that whole experiment was that if you want to get DNA sequence information using a nanopore, you have to fulfill two different conditions: first of all, the bases have to give a different signal; and secondly, you have to be able to distinguish a single base.
The first paper sort of tackled that first point. We know that polyA, polyC, and polyT do make a different signal. What we decided to do next was just take that platform where we immobilized this poly-something strand and then put individual substitutions in there. So take a big strand of polyT, say 40 Ts, and then in position 12 let's put an A. And then put that thing in the pore, make a measurement, and see if we get something. And then do another experiment where now we're going to put the A in position 13. So just basically move that base through the pore to see if you can find a sweet spot, a region of maximum sensitivity of the nanopore.
We did that with a polymer strand of polyT, and we put A through it in 12 different positions, we put C in 12 different positions, and we put G in there in 12 different positions. And what we found was that there isn't one sweet spot in the pore. We looked at the alpha-hemolysin protein nanopore, and in that specific pore, there are a large number of positions inside that are sensitive to the nucleotide identity. From a sequencing point of view, that was sort of a bummer because that means that if we get a current measurement, we're not going to be able to tell if it's A in position 8 or if it's A at position 10. We couldn't just tell you the sequence, even for a really lame sequence of 18 Ts and then an A and then another 19 Ts.
So that's a bummer, but the thing is that all of these experiments are really difficult, and they're really only on one protein. There are a lot of other groups out there that are looking at other proteins that have a much more well-defined sensitive region that won't have these problems, and there are some other groups, too, that are looking to mutate some of these proteins so that they will bind with specific nucleotides and give you that kind of enhanced sensitivity that's nucleotide-specific.
The paper mentions the MspA protein. Are you looking at that as an alternative to the alpha-hemolysin nanopore?
Well, this was the last paper of my student who is working on that. So now what we're trying to do is regroup, because if we did work with something else, we'd want to work with a different protein, and MspA, there's a group at the University of Washington that's working with that protein (see In Sequence, 1/13/2009). Whether we look at that protein or a different protein entirely, that's something that we're still trying to figure out.
The paper describes this work as 'complementary' to a paper published earlier this year in which researchers from Oxford University were able to discriminate between all four bases in a nanopore (see In Sequence, 4/21/2009). Can you elaborate on how it complements that work, and what differences there may be in the two approaches?
There's a lot of overlap between our papers. They looked at immobilized homopolymers in the wildtype protein. They also had an engineered protein that has an amino acid substitution or two inside that will give different currents. But as far as their measurements of the wildtype protein, they immobilized strands of polyC, and we looked at strands of polyT, and that's great because now we can see what the difference is — how does that background make a difference?
It was really interesting because they scanned single bases of A through a polyC strand, and we did A, C, and G through a polyT. So you would imagine that the data might be very similar, and actually it turns out that our data is really different. So when we look at A moving through a polyT strand, the current difference that we measure has a single peak. It's a broad peak, but it's a single peak. And the Oxford work, they move A through a polyC strand, and they see something that has multiple peaks. So our data indicates that the hemolysin pore has one region of maximum sensitivity, or one sense region, for A, and the Oxford group's data suggests that the pore has a sensitive region in two different places for A.
What we speculate in our paper is that maybe what's happening is that the 'background' strand that you don't think is really doing anything — maybe what you're probing is the interaction of some of those background nucleotides with the pore. Specifically, polyC has some secondary structure, so if you just have a big long strand of polyC, it will wind itself into a little helix because the bases stack up and they hydrogen-bond with each other. Now if you imagine that you put an A in the middle of that, that's basically a defect in that, so it interrupts the secondary structure. So if the polyC is hydrogen-bonding with itself, then if you have an A in there, the Cs that are right next to the A can't hydrogen-bond with each other, so maybe they're going to look to hydrogen-bond with somebody else. So if they can hydrogen bond, say, with the sides of the pore, now there's that interaction.
PolyT doesn't have any secondary structure at all, so what we hypothesize in our paper is that when you have A moving through that background strand of C, what you also have is these free ends of polyC that can hydrogen-bond with the pore, and if there are regions of the pore that can hydrogen-bond, then maybe what you're actually seeing is not the presence of A in the pore that's giving rise to the current, but instead, you are seeing interactions of C with the pore.
We also scanned C through our polyT strand, and we see two sensitive regions, just like they see two sensitive regions with A in polyC. So that's how all that data can work together. We see one sensitive region for A and they see two sensitive regions for A. We see two sensitive regions for C, but their A is in a polyC background.
Are you working with that Oxford group at all in terms of combining this data?
We submitted this paper before we even saw theirs, so while it was in review our reviewer sent it back and said, 'This paper just came out. You really ought to expand the discussion of your paper.' So it's really new.
It seems like there's a huge jump between these findings and the kinds of capabilities that would be required to sequence a strand of real DNA through a nanopore. What steps will need to be taken before that is possible?
Right now, if you're talking about sequencing DNA with nanopores, there are two main approaches. One is the classic approach, where you have this long strand of single-stranded DNA that you're threading through a pore and you're just reading the current out. What our experiments are showing is that you're not going to be able to use hemolysin for that. That's the wrong protein pore for that specific method. If you were to use a protein that has a better sensitivity, like MspA or some other protein, or even an inorganic pore, then your task is to slow the DNA down. We slow the DNA down by putting a big stopper on it, so we slow it down to zero, and that means it's not very useful for sequencing, but that enables us to do some basic science. But for somebody who wants to do sequencing, you have to slow the DNA down, and nobody has any good ideas for this because all the methods we have to slow the DNA down also reduce your signal. So it's a signal-to-noise problem.
That's the classic version of nanopore sequencing. The Oxford group also has a startup company, Oxford Nanopore Technologies. What they're doing is coupling a nuclease on top of an engineered hemolysin pore, so basically they're feeding a DNA strand in and cutting off individual bases, measuring the individual bases, the nucleotides, and then doing that sequentially. They're able to do that. I don't know as far as sequencing, but there have been a number of papers that have shown that they can very clearly tell the difference between A, C, T, and G. But that's a different method; it's not really the classic thing where you're threading the DNA through a hole, but it's sort of the reverse of the sequencing-by-synthesis technique that a lot of other technologies use.
You mentioned that you're kind of at a crossroads now in terms of where you're going with this work. What direction would you like to move in?
My group in general is interested in ion channel technologies, and we spend a lot of time engineering the lipid bilayer that ion channels go into. So we were interested in this because I'm a physicist and it's a really interesting nanopore physics problem, and there might be a number of directions we go in the future just to do physical experiments. But mostly what we're focusing on now are things like automated platforms and high-throughput platforms for doing sensing and drug discovery in ion channels.
So it sounds like you won't be working in the nanopore sequencing area too much in the future.
Our goals have not been to actually do sequencing. It's looking like that initial direction, where you're threading the DNA through the nanopore — that just looks less and less likely, as far as the technology goes. So we've been trying to see what kind of science we can do.
One of the things we're going to be doing next is determining the noise characteristics of the DNA inside the pore, and whether we can tell what kind of interaction the DNA has with the pore from the noise signature, and that kind of thing.
This whole field has only been around for about 13 years and there have been quite a lot of advances since then, but everybody else has made a lot of advances, too. Just a few weeks ago there was the Quake genome published, and supposedly he did that for $50,000, so even if after 10, or 15, or 20 years someone can sequence with a nanopore, if everybody else's technique is down to $1,000 or $500, you sort of have to wonder: even if it works, is it going to be worth it?
Is your group funded under the National Institutes of Health's $1,000 genome program?
We applied for funding, actually, when we started this three or four years ago, and at the time we were told that using alpha-hemolysin for DNA sequencing was a dead end. So when you get a review back like that, you don't even resubmit the grant because they're rejecting your approach. But we've been able to scrape together a little bit of money here and there and cranked out some basic science.
So from what you said about these findings, it sounds like NIH was right about hemolysin.
Yeah. But it depends. If NIH is trying to fund technology development under that program, then I completely agree with that.
Aside from the protein itself, what other challenges do you see for this field? Are there detection issues or other obstacles that you've encountered in your work so far?
One of the things, if you're just thinking about the technology in general, is that if you want to compete with an optical technology like some of these other systems, those guys are imaging each of these little synthesis events and each of these little events is a pixel or two on a CCD array, and they are able to measure thousands and thousands of these things at a time. So you can see how that would scale, and that maybe in five or 10 years, when CCD arrays get better, then it's just a computing problem and a silicon problem, basically. One of the things that I'm not sure I know the answer to yet for sequencing with nanopores is, how do you scale it? Because you're measuring these picoamp ionic currents. If you were to measure a thousand of these at the same time, you would need a thousand of these amplifiers. What that is going to mean is that you're going to want each one of these amplifiers to be extremely high performance because you're not going to be able to scale to a thousand or ten thousand amplifiers as easily as you can scale the optical stuff. So it's going to be a tradeoff where you're going to demand extremely long read lengths on each of the pores to keep your amplifier number down. That's going to be a huge challenge.