This interview was conducted by Adrienne Burke, Kathleen McGowan, Mo Krochmal, Aaron Sender, and Kirell Lakhman.
NEW YORK, Oct. 14 - It's been two months to the day since Craig Venter announced plans to open a new sequencing center in Rockville, Md. Most of the staff has been hired, and the new facility at 1901 Research Boulevard--squeezed between The Institute for Genomic Research and Celera Genomics--is operational.
At a time when most academic centers are pulling back their sequencing efforts, Venter, the scientist whom scientists love to hate, has upped the ante--again. While most of his peers are trying to tweak traditional capillary-based sequencing, Venter has set his sights on the $1,000 genome and is hoping to nurture novel sequencing technologies; he's already taken at least one young company under his wing, and made contact with at least three other up-and-comers whose technology he hopes will help reach that number.
GenomeWeb caught up with Venter en masse during last week's GSAC meeting in Boston. Sitting down with a roomful of GenomeWeb reporters and editors, Venter, who turns 56 today, riffed on the center and his new research institutes; on NIH funding missteps; on de novo versus resequenced human genomes; and on the future of human genomics.
And by the way, what's with all the rich folk paying heavy cash to have their genomes sequenced? "I think we would find them to be remarkably similar to all the other genomes," Venter quipped.
GenomeWeb: People in this industry seem to agree there are two schools of gene sequencing these days. One says it's at its zenith and that we should concentrate now on making other enabling technologies faster and more affordable. The other school says, 'No, gene sequencing is just hitting its toddler years, and the focus should be on getting better sequencing technologies.'
What's your take?
Craig Venter: I'm in both schools. We clearly need better techniques for interpreting the genetic codes that we have, but I think it shows how everyone bought into all the hype in the press and everywhere else about the Human Genome Project, and the NIH centers are talking about post-genomics. Well, there never is a post-genomics. We're in the genome era and we'll be in the genome era for the rest of human history. ...
So we need a lot more genomes to do that. Genetics has shot its wad in terms of the ability to use linkage tools to find the genes associated with human traits. The best way to do that going forward is it would be nice to have 10,000 human genomes right now. With clear phenotypic correlations, with clinical records, characteristics of these individuals, and doing genotypic correlations. One [genome] is almost useless other than as a reference.
GW: Would it be fair to say that maybe certain disciplines in genomics got ahead of themselves and that more genomic mind-power should be focusing on the sequencing part of it as opposed to ...
CV: No, that's silly. We have a very big scientific community. We have the biggest scientific budget in the history of humanity and it's a question of applying that money intelligently across the scientific community. Most of it goes in an unfocused fashion. There's not really a leadership where people vote how much money people want for themselves and their projects. ...
So I think we need to change the level of funding across the board in all these areas. We don't know how to look at 30,000 let alone 300,000 different proteins simultaneously working in a group of cells. We don't understand how to interpret our own genetic code. We don't understand how to look at human variation across even the tiniest sampling of even people in this room let alone whole countries and populations.
GW: You say we need to change the focus on the kinds of research that are funded?
CV: Yes, I think NIH funds a lot of me-too science. I tell the story about the Haemophilus genome because I think it shows an important aspect of science funding. We couldn't get the Haemophilus grant funded. [National Human Genome Research Institute Director Francis] Collins and people said it won't work, it's impossible; they refused to fund it. But once we published the H genome, NIAID and other institutes came forth with money to really push forward genomics almost more than the NHGRI has. But getting that first breakthrough step funded is extremely difficult and I found an independent way to do that. Not too many scientists have managed to find ways to do that. ...
I think that's why a lot of scientists go into private industry. They at least have an avenue for some of their inventiveness and ideas to come out and have an impact.
GW: Lincoln Stein at Cold Spring Harbor said that eventually academic centers will stop performing gene sequencing and will instead focus on other parts of genomics. Meantime, that kind of research will continue at institutes like TIGR and the center you're opening up.
CV: Well, it started at TIGR long before Cold Spring Harbor, so it's a question of what the focus is, and what kinds of academic centers. TIGR is an academic center funded by grant money. The CSHL just lost its funding from the government. [Editor's note: CSHL denied in a subsequent interview with GenomeWeb that its sequencing facility has just lost government funding.] I think it's a mistake that they're shutting down these smaller labs. Because then there's no competition, no chance for multiple sources of inventiveness.
There's a lot of contributions that smaller programs can make. Big factories need to be run like big factories. They need to be very cost effective and efficient if you're going to be putting tens of millions of dollars of your public funding into them. But you need to have a research environment that drives inventiveness.
GW: On the topic of funding, can you talk a bit about where funding for your new sequencing center is going to come from? There's been all this press about these wealthy people paying [more than half a million dollars] to have their genomes sequenced. Is that going to be the main source of capital for you?
CV: I think it would be very unrealistic of me to think that the [NHGRI] would fund us to sequence 1,000 genomes if I push genomics forward even though to me that's the next logical step in this field. I don't have enough money to fund that either. We're using our foundation money to jumpstart it by paying for the facility and the equipment to get it going so we have the opportunity to do it.
But we're asking the philanthropic community to say, 'How about funding 100 genomes for patients with these diseases? Diseases that you care about or ethnogeographic groups to make sure there's sufficient diversity in the population or in some cases you yourself or your family as part of a legacy,' and everybody would have their data be part of a database that would be used for genome analysis in comparing clinical records, genotype/phenotype correlations, obviously in an anonymous fashion. They'd be part of the great scientific experiment and the great tradition of individuals helping to fund science instead of just relying on taxpayers and corporate initiatives.
It was misrepresented the first time in the press that this was the millionaires' genome project. Hopefully, it would not just be millionaires' genomes, although I think that would be an interesting study. I think we would find them to be remarkably similar to all the other genomes, but I think what we'd expect to happen is that there would be groups that support diseases that they really care about getting solved that they know are not going to get solved with the current paradigm. ...
GW: Not many people have the personality to be able to persuade people to put up the money to do this.
CV: That remains to be seen, too. Clearly some people are interested in doing this. I think there are a lot of people who could do this, but I think it's because of the position I'm in genomics, on the leading edge of it, a lot of people might come to the same conclusion 10 years from now. The difference is we've come to the conclusion now. And we're going to push it forward. We're not going to wait for that extra decade before people get around to thinking, 'Gee maybe we should try to do a lot more.'
GW: Is the idea that you would use that money ... to subsidize other kinds of projects, or is that right now what you think the cost of resequencing someone's genome will be?
CV: No, I think that's the current estimate of what it would take to resequence someone's genome with all the genes and regulatory regions, but with our new center we're trying to drive the cost down. I read in ... GenomeWeb [that] it costs $1.53 per lane of sequencing. So our goal is to just have it be 30 cents a lane. That's the goal out of the box.
GW: How are you going to do that?
CV: By using a variety of incremental new technologies, as well as low overhead and better efficiencies for doing this. What is lacking is the competitive environment to really drive the cost down. If someone sent you a $20 million check each year to sequence things no matter what level you are doing it, you're not really incented (sic) to try and do it for half the price. ...
That's why continuing sequencing in multiple centers--large, small, independent--creates a competitive environment. Just think if we could do sequencing for 30 cents a lane and some of the new technologies put out 1,000 base pairs, that's a tenfold change. That means we should be able to do 10 genomes for the cost of the public effort doing the mouse genome.
GW: Do you think these advances will create new bottlenecks?
CV: Well, every time you solve one [bottleneck] of course you create another one. The ultimate bottleneck we already know is how to interpret the genetic code. We don't know how to interpret it. That's going to take this entire century to really get good at it. ...
GW: It sounds like in your vision, resequencing is an important focus of what you're going to be doing. Can you talk about, say, in the next five years, in terms of the relatively near term applications of what we've already learned, what gets you more excited--this idea of being able to resequence human genomes, or being able to do de novo sequencing?
CV: The term 'resequencing' is actually a misnomer. We're not resequencing. We're going to do de novo sequencing from 1,000 people. Your genome has not been sequenced as far as we know. So it's not really resequencing. ... Each [genome] is a major discovery process as we start to understand the true nature of human genetic variation.
Some people just want to remeasure one single nucleotide chain, thinking that will explain biology or disease when getting complete sets of haplotypes has not even been considered possible. It's a totally different approach than the genome center at NIH is taking in just trying to get a small set of composite haplotypes as though that would explain your genetic code or mine.
It's all de novo sequencing. Unless I resequenced my genome then that would be truly resequencing.
GW: When you make that distinction between what you're saying--everything you're doing is de novo sequencing--do you mean that what you're doing is a really different technique from what other people are doing, or you're making an emphasis point?
CV: I don't think anyone else is doing anything. I don't know of any other human genome sequencing going on.
GW: Well, when people talk about resequencing, are they talking about something totally different than what you're talking about?
CV: What people were talking about last night was as they got 25 base pairs, that's de novo sequencing. What it means is they don't have to assemble that data. They're just going to overlay it on top of pre-existing data.
GW: And you're not going to be doing that?
CV: Well, we're going to be doing comparisons where you have 1,000 sets of genes--obviously you line them up, but our lines will be on the order of a kb each and they're going to be from specific PCR primers so we know where we're starting to begin with, we don't need the backbone of the rest of the data as an interpreter. But the ability to do that in the first place rests on having done the genome once.
GW: What will you do with the data that you generate? Will that become a public database?
CV: It's certainly not going to be a secret database. It's not clear that it makes any sense to put individual genome sequences in GenBank. It will certainly be a very powerful tool for us and our collaborators to use for making medical advances and interpretations. It will be probably available in some form or another. It's such a massive amount of data. It's not clear that anybody is equipped to do anything with it. ...
We haven't thought that far along other than that it's not being tied up for any commercial purposes. Nobody has ever tried to publish 1,000 human genomes before.
GW: How did you choose US Genomics' technology over the other ones we heard about last night?
CV: Well, I'm talking to all of those groups. My goal is to get the next best technology that's going to allow us to go to the next phase.
For most aspects I don't care which of those groups is the winner. I hope there are several winners to choose from actually so there is intelligent competition. I was exposed to US Genomics' technology and I was quite impressed by it. ... That gives me great hope that processing data in a different way than we've been doing is feasible. But they need another one or two orders of magnitude to get down to base-pair resolution. But that gives me and other people great excitement that there's totally different ways to do things.
I think my experience can help them as an organization in a unique phase to do some things I learned the hard way. I'm equally excited about some of the approaches I heard about last night. ...
So I think one of the fallouts from my projects that I am trying to push forward [is that] it makes it very clear to the world that there is an outlet for the technology that these teams are trying to invent. ...
GW: You feel like you could help get that across?
CV: I think just the fact that I'm voting with my time and my effort and even my foundation's money to drive it in the next direction. ... I'd much rather buy one of those machines to do 10,000 genomes a day than spend $30 million on machines that I hope I have to throw away in two or three years because there are new things that are so much better.
GW: Are there any academic labs that you have kept your eye on?
CV: Academic labs? Umm, what I hope is that the next level of innovation comes from labs that I've never heard of--a physicist who's been sitting back reading about genomics and just getting turned on and just thinking ... of how to do nanotechnology depiction of these molecules, and they'll now realize they have an opportunity to change the world. I hope there are a thousand out there.
GW: Sounds like you would have some provocative things to say to them about them not having much life left in them?
CV: It's a bad model that was set up there. Just like the computer at Celera should not be replicated, I don't think the model that was done in the public for sequencing the human genome with the justification that they had to race with me is a good model for anything.
If anything it should be undone; it certainly shouldn't continue to be fed. It needs to be opened up for competition. ... They should do so competitively, not because they can get spoon-fed by Francis Collins in a noncompetitive fashion.
I think it's a mistake that they cut out the Cold Spring Harbors and the small institutions. A lot of sequencing innovation has come out of these laboratories and if the cost-of-sequencing paradigm is really pushed, there are going to be only marginal differences between these laboratories. Nobody has calculated what the minimal size you need for the maximal efficiency is.
Making something that's already big twice as big in many cases can lead to inefficiencies and higher costs. So it would be a great exercise for some business modeler to model, with the existing technologies, what's the smallest unit you can have to have the maximum efficiency and the lowest prices. And when is too big a problem?
GW: But you'll be going big too?
CV: Physically our new facility is substantially smaller than what we built before. Some of you may have had tours there, but I don't know whether it's going to be in 10 years or 50 years but pictures of that will go in some museum and people will laugh and say, 'Look, it took this giant place to sequence the human genome and it took them nine months to do it, how silly! We do that in two seconds now.'
GW: So you guys are saying ... that this is going to be the most powerful sequencing center. So clearly you think that big is not necessarily bad.
CV: No, it's not, and the efficiencies do improve with scale. We're building it that size because of the level of science we're trying to accomplish, not because we want to be the biggest. As I said, if we could afford to do 10,000 human genomes I would do it if we thought we could afford it.
GW: You had mentioned one goal for the future is the possibility of using whole-genome shotgun for sequencing entire environments. Are there any other big new ways you see using this technology as cheaper, improved, faster sequencing becomes available?
CV: Part of what we're going to do at the [Institute for Biological Energy Alternatives] ... is a shotgun sequencing of the Sargasso Sea to see not can we get one genome sequenced but can we get the genome sequence of all these unculturable organisms. All of a sudden genomics goes from looking at that 0.1 percent of what we've cultured and measured to being the avenue for understanding the rest of the biosphere that's out there.
We're also talking to some scientific collaborators about doing an atmosphere shotgun-sequencing project. So if this was cost effective or when some of the technologies we heard about last night become viable, imagine how they will change the sciences of ecology and monitoring environments in terms of toxicity, emerging infections, biological warfare, anything in our environment. Just understanding our environment and how it's changing.
Biology could be the number one method for predicting weather in the future if we could really measure these changes in their dynamic state and understand the biological cycles of the whole planet. That requires massive computing, high-throughput sequencing. DNA is the one thing that unites all of us as a species and if we can understand what's changing dynamically we just might learn something worthwhile.