Name: Eric Green
Title: Scientific director, NHGRI, since 2002
Senior investigator and chief, Genome Technology Branch, since 1996
Director, NIH Intramural Sequencing Center, since 1997
Education: MD, PhD, Washington University in St. Louis, 1987 (PhD in cell biology)
BS in bacteriology, University of Wisconsin, Madison, 1981
Eric Green wears several hats at the National Human Genome Research Institute, including director of the National Institutes of Health’s Intramural Sequencing Center – the in-house sequencing facility for NIH researchers. Green’s work at NISC focuses on two main areas: comparative genomic sequencing, and understanding the genetic basis of human disease.
In Sequence spoke to Green last week to discuss some specific projects underway at the center and to get his thoughts on where the field of sequencing is going.
Tell me about the NIH Intramural Sequencing Center. What is its function, and what are some of the projects it is currently involved in?
Our center has two major roles. Its historic origin is to be a state-of-the-art large-scale sequencing center that would do large projects for investigators within the intramural research program at NIH. That actually is a very small part of what it does, because over the last eight years or so it has gotten involved in a number of large genomics projects of its own that have been mostly centered on major sequencing initiatives beyond the Human Genome Project in particular.
That has included things like being involved in the Mammalian Gene Collection, [a program] which was an effort to generate collections of full-length cDNA sequences in various species, including human, mouse, rat, and cow [as well as establishing similar collections for Xenopus and zebrafish].
But the biggest project we have been involved in is a multi-species comparative sequencing project. Our unique contribution has been that rather than tackling whole genomes, our efforts have been focused on targeted regions of the genome, but then going very deep in evolution to try to broker the power of comparative sequence analysis for understanding the functions of genomes.
This has led us to be sort of the major sequencing group involved in the ENCODE [Encyclopedia of DNA Elements] project, which is a project of NHGRI’s to target one percent of the human genome initially, and rigorously analyze it in various ways to try to find all the functional elements. We have been the major sequencing group that has been sequencing that one percent in multiple species.
The only other thing to add to that is that now, with increasing interest in using large-scale sequencing to answer problems in medical genetics and try to apply all this to human genetics problems, we are increasingly getting involved in medical sequencing projects — in other words, human resequencing projects.
What is NISC’s relationship to the three extramural large-scale sequencing centers at the Broad Institute, Baylor College of Medicine, and Washington University?
Since we are on the intramural side, it’s a little complicated, but the bottom line is that when they were renewed in November, we were evaluated at that time. In fact, I submitted an application that was absolutely analogous to theirs: every component that they had to submit, we had to submit. And the same review group that reviewed those other ones reviewed us as well. We were reviewed very positively, and as a result of that evaluation, we will continue to have a program just as they will for four more years.
We are very small by comparison; our budget is about $7 million a year, so we are four times smaller than the smallest of the three. Still, $7 million is a non-trivial amount of money. So we are smaller in scale, and that’s why we don’t sequence whole genomes. In recent years, we have had a very comfortable and different role among the network of sequencing centers, being sort of a mid-scale sequencing center. If you take the big three as the real huge production facilities, we are not on that scale, but as a result, there is a niche that we occupy that capitalizes on our size, on our expertise, and also on our presence within the intramural research program, which has various unique attributes, in particular those focused on clinical research activities.
What role does your center play in helping to decide which species the large centers will sequence?
Maybe what you are getting at is that, over the past five years, we have been involved in getting a jump-start on data from multiple species at a targeted level, which has helped inform decisions about what genomes we might want to sequence next. I have often referred to it as sort of a reconnaissance team. In other words, we are out exploring genomes for the very first time and trying to figure out whether it is worth investing heavily in them or not. I guess that’s sort of been our role: it’s not that we end up making the decision, but we generate the data that informs the decision.
What about new sequencing technologies at NISC? Which ones have you been evaluating or testing?
Let me first say very globally that I think some of these new sequencing technologies are extraordinarily exciting. I think there are going to be developments that are going to completely change our view of how we acquire sequence and what we do with it, and I think the landscape is a very exciting one.
With all that optimism, I would tell you that I don’t see any of these sequencing technologies as immediately heralding an end to Sanger sequencing. I think for many of the things that we need to be doing and we want to be doing in the next few years, we are going to still be doing them with the 16-capillary instruments and Sanger sequencing chemistries. So I think the idea of using the tried-and-true [technologies] but testing and pushing the envelope on the new stuff is exactly what we should be doing.
These new technologies are sufficiently raw that they require very major investments of capital, of people’s time, of testing, of research and development, and that is not something you can do on a small scale. And I will tell you that even though $7 million sounds like a lot of funding, by the time we are doing all the science we want to do, we have never felt that we are large enough to effectively really push the envelope on very new technology. I think we have been very good at importing new technologies after they have been worked over a little bit and slightly refined, and then we can help adapt them and make them more efficient. As another way of saying it, we have been very cautious, for budgetary reasons, about getting too heavily immersed.
We have generated a little bit of 454 data, not even by owning an instrument; we are now in a lot of discussions and have actually worked collaboratively with Solexa to get familiar with their technologies, but we have not imported any of their technologies yet. We basically take advantage of our excellent relationship with the other centers. So we wait to hear from them. They are pushing the envelope very early, and when they sort of signal to us, based on what they have generated, that there is an appropriate project for us to do at our center, then I think we will tip-toe in that direction.
We have yet to invest in any of the new technologies, simply because they are too expensive for us, right now, without having been proven specifically for the projects that we have slated for the next year or two. But we are excited. I am sure my story will change in two or three or six months, no doubt.
Where do you see the greatest promise for these new sequencing technologies? What are the main technical obstacles they still need to overcome?
There are a lot of issues at play. I am sure this will be a heavily emphasized topic in many sessions at the Marco Island [Advances in Genome Biology and Technology] meeting next month, but there are certain applications, for example in circumstances where you want to be able to sequence single molecules, because you cannot culture organisms, or you cannot recover the DNA in anything but single-molecule form. Clearly, metagenomics and ancient DNA analyses [are possible applications], or cases where you have very small genomes, where the assembly problem [of shorter reads] can be overcome. I think microbes are proving to be a great target for new sequencing technologies.
I don’t think the path is 100 percent clear at the moment how to do significant medical sequencing with the new technologies, because they are not yet capable of sequencing a whole genome. Therefore, we are querying genomes in a targeted way, and how you exactly then recover the genomes by PCR, by whatever other means, and use the technologies, is still being tested. I think this is what clearly many of the groups are trying to refine. So I have lots of questions. But from what I know from other people applying the new technologies, the greatest [application] is when you need lots and lots and lots of reads, but very short reads.
What about your collaboration with Aravinda Chakravarti at Johns Hopkins and Solexa on genotyping a genomic region?
There will be a talk about that next month at Marco Island by Jim Mullikin [of NHGRI’s Genome Technology Branch]. It’s an interesting comparison. In this case, this was taking large PCR products generated from patients in a human genetic study, and sequencing them either by a conventional Sanger approach, or sequencing by a Solexa approach, and seeing what the trade-offs are in each case. I think it is promising, and I think the results are quite good ones for Solexa. But in any case, it shows some of the strengths and some of the challenges still associated with this evolving technology.
Looking into the future, how do you think sequencing will improve 10 years from now?
It’s hard to say, of course. A lot of the current technologies are making great strides in getting us down in cost to sequence a hypothetical human genome, but I am looking for something even more revolutionary if we want to contemplate getting down to a $1,000 genome. It is going to take some number of years, and it is going to be something like a nanopore or some other, similar major advance, perhaps things we have not even thought about quite yet.
Is there any technology you personally favor?
No. From the talks that I hear and the exposure I get, I find it very difficult to discriminate between many of these. They all sound promising, but I also know the path towards actual implementation is quite different. I remember there was a lot of struggle just to get capillaries to work for sequencing. It sounded good at first, but still it was a number of years before all the little glitches were worked out.