At A Glance
Name: Kelvin Lee
Position: Director, Cornell Proteomics Program, since 2002
Prior Experience: Postdoc, Biology Division, California Institute of Technology, 1995-97
PhD, Chemical Engineering, California Institute of Technology, 1995
BSE, Chemical Engineering, Princeton University, 1991
How did you get interested in proteomics?
It dates back to the beginning of the 1990s, before the word proteomics was actually being used, and it was as a graduate student. My advisor in graduate school [James Bailey at Caltech], who is not a proteomics person per se, had the vision that this was going to be an important part of his laboratory. He sent me out to work with Mike Harrington, who was at Caltech at the time, to learn some of the basics of 2D gels [and to go] to some of the meetings. At that point, of course, the proteomics meetings were strictly 2D gel meetings where people got together and shared their experiences about different protocols and tricks.
How did you apply proteomics in your PhD project?
My training is in chemical engineering. One of the things that chemical engineers worry about is how to produce recombinant protein drugs. For example, there are a lot of good reasons for why you would want to eliminate the need for fetal calf serum, which is commonly used in animal cell culture. We looked at a common cell type for producing pharmaceuticals, Chinese hamster ovary cells, and at how those cells respond in the presence of different growth factors. We used proteomics to identify that a particular transcription factor, E2F, was upregulated in the presence of certain growth factors. What that meant is, when we went back and tried to engineer Chinese hamster ovary cells to constitutively overexpress E2F, we were actually able to get cells to grow without the need for any serum, or in fact any protein additives to the media.
Where did you go from there?
[After spending the last two and a half years as a graduate student at ETH in Zurich, where my advisor had moved,] I moved back to Caltech and worked with Mike Harrington some more as a postdoc. There we looked at the application of proteomics to the diagnosis of prion diseases. We were able to characterize a molecular marker called 14-3-3, which was very useful in the diagnosis of prion diseases, using a 2D gel-based proteomics approach. After two years there, I moved on to Cornell, which is where I have been ever since.
How is your lab equipped?
We have access to two mass spectrometers. One is a [Thermo Finnigan] LCQ-Deca [ion trap], the other is an [Applied Biosystems] 4700 [MALDI-TOF/TOF]. In terms of other equipment, it’s largely gel-based, with laser scanners and robotics. We have a robotic gel picker, a robotic zip tipper, a robotic digester, [and] a robotic MALDI target plate spotter.
Do you regard automation as particularly important?
It does have a big impact on the morale of the personnel in the lab. But on the other hand, if we have a really valuable sample, and we really want to ensure that we get the best quality digest, then somebody needs to do it by hand. It’s not that the robots will fail, it’s just that somebody with experience can always do a better job by hand than any of the robots.
What is your research focused on?
There are four main project areas. The first one is our continuing work in the diagnosis of brain disease, with a particular emphasis these days on Alzheimer’s, and growing emphasis on Parkinson’s disease. It’s a very straightforward [2D gel-based] proteomics strategy, just taking cerebrospinal fluid from a number of disease samples and normal samples and trying to identify molecular markers for the disease.
We have identified a couple of molecular markers that we think are interesting for Alzheimer’s disease. We have also done some work on looking at the new variant form of Creutzfeldt-Jakob disease compared to the sporadic form.
If we can develop, for example, immunoassays or some other kind of technology, which would be hopefully simpler than 2D gels, then that potentially could be used in the pre-mortem diagnosis of brain diseases, which is a particularly challenging area.
A lot of researchers, both in academia and companies, are getting into protein disease biomarkers these days. Where do you see the advantage of using 2D gels, as opposed to other approaches?
I think [2D gels and other technologies are] all important approaches; I view them as actually complementary. One of the questions that comes up is a balance of quality and quantity, or throughput and quality. You can imagine a setting in which an investigator has access to thousands of different samples of interest, and they crank through those samples in a relatively high-throughput manner. The problem is, in many cases, it’s very difficult to identify anything meaningful, because a lot of these technologies take a little bit of time to yield results, [whether] because of small changes in samples or [because] it’s a very technically challenging procedure, like 2D gels.
As an academic laboratory, we can’t afford to take a very high-throughput approach, so what we try to do is emphasize more of the quality control, and carefully choose samples that we think will be interesting for the specific questions that we are interested in.
You also work on improving recombinant protein expression?
We are trying to understand enhanced protein secretion, again with the emphasis on trying to improve cells for recombinant pharmaceutical production, both in bacterial as well as mammalian cells. One can identify super-secreting strains of E. coli, and use a proteomics approach [coupled with] a gene array-type approach to try to understand better what happens when you have mutations that allow enhanced secretion, so we can go back and try to engineer the strains to do that for us.
A lot of people say they see discrepancies between gene and protein expression.
That’s actually [related to] the third area that we have become interested in, trying to better understand translation. This is motivated by a lot of the studies that have come out showing discrepancies in some cases, and good matching in other cases. We have been trying to build a mathematical framework to understand the relationship among messenger RNA expression, protein expression, and DNA sequences [in E. coli]. Can I actually predict what the protein profile should be computationally? [We have also been doing] a series of experiments, measuring message and protein and seeing how well our model is able to predict things. We have gotten it to the stage where for simple perturbations of E. coli, we can predict about 60 percent of the genes that we measure, in terms of the direction of whether they go up or down and how much they go up and down.
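The simplest version of the validation described above is scoring how often the model's predicted direction of change for a gene matches the measured direction. A minimal sketch of that scoring, with invented fold-change values purely for illustration (the function name and the data are assumptions, not from the interview):

```python
# Hypothetical sketch: fraction of genes whose predicted and measured
# log fold changes agree in sign (both up or both down), in the spirit
# of the ~60 percent directional agreement described above.

def direction_agreement(predicted, measured):
    """Return the fraction of genes where the predicted and measured
    log fold changes have the same sign."""
    assert len(predicted) == len(measured)
    matches = sum(1 for p, m in zip(predicted, measured) if p * m > 0)
    return matches / len(predicted)

# Invented log2 fold changes for five genes after a perturbation:
predicted = [1.2, -0.4, 0.8, -1.1, 0.3]
measured = [0.9, -0.2, -0.5, -1.4, 0.6]

print(direction_agreement(predicted, measured))  # 4 of 5 agree -> 0.8
```

A real comparison would of course also score magnitude of change, as the interview notes, not direction alone.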
You are also developing microchips for protein separation?
[We are] trying to develop microscale (some people call it nanoscale) technologies for protein separations, in particular, devices which allow us to separate mixtures of proteins and transport them directly into mass spectrometers.
One [advantage, compared to nano-columns,] is that potentially they can be disposable, so carryover is not an issue. A second thing is that in theory, you can build in multiple different kinds of separations, you can do two or three or four dimensions. Also, you can do chemical reactions on the device, and we are also trying to build in some sensing. Maybe you are interested in the presence or absence of one or two specific proteins or classes of proteins, and you want to detect those as they are being sprayed into the mass spectrometer. So you can build in more functionalities than just the pure separations.
[So far] we have been able to do some separations of spinal fluid which relatively cleanly remove albumin and some other high-abundance proteins, which is a big problem in the analysis of either serum or spinal fluid. We [also] have some interesting ideas on how to do multiple dimensions of separation. There is a not-so-commonly-known problem, which is [that] at very small length scales, it’s very difficult to mix two fluids together. We have some ideas on how we can mix fluids, which would allow us to very effectively do multiple dimensions of separations.
Where is the greatest need for new technologies?
Mass spectrometry is evolving at a very rapid pace, and a lot of the vendors, and a lot of the academics who are interested in research-grade mass spectrometry, are turning their attention to proteins and peptides. I think that there is a reasonable fleet of instruments and technologies for people to choose from.
In the area of the upstream separations, I think that more needs to be done in terms of thinking about multi-dimensional liquid chromatography as a complement to 2D gels, and other kinds of approaches. There is a lot of art involved in sample preparation, and it’s not a very glorious area in which to spend a lot of energy. Every sample is a little bit different, [and sample preparation] can be very specific to your problem of interest. It hasn’t been an area where people have spent a lot of time coming up with general approaches, because there really aren’t any.
I think there is [also] a big need on the downstream end for the informatics. [For example,] the publicly available databases of sequence information [are] a wonderful resource, but there are a lot of mistakes [in them]. A lot of the gene annotations are made just based on sequence homology, [and] sometimes people rely on it too much.
One of the things that we have observed in prokaryotic organisms is that if you are given a sequenced genome, sometimes the open reading frame calls are either incomplete or incorrect. A lot of the common algorithms which are used to identify open reading frames, for example, cut off at about 50 amino acids. So what do you do if you have smaller proteins and peptides that you are interested in? If there is a way to do that kind of thing better, then that would be very useful.
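The cutoff problem described above can be seen in even a minimal ORF scan: if the scanner discards anything shorter than the conventional threshold, small proteins silently disappear. A hedged sketch (the function name and the 50-amino-acid default are illustrative assumptions; real gene finders also scan the reverse strand and use statistical models, not just length):

```python
# Hypothetical sketch: a forward-strand ORF scan with a minimum-length
# cutoff, illustrating how a ~50-amino-acid threshold drops small proteins.

START, STOPS = "ATG", {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_aa=50):
    """Return (start, end) positions of forward-strand ORFs at least
    min_aa codons long (start codon through stop codon, stop excluded
    from the length count)."""
    orfs = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(seq):
            if seq[i:i + 3] == START:
                j = i + 3
                while j + 3 <= len(seq) and seq[j:j + 3] not in STOPS:
                    j += 3
                if j + 3 <= len(seq):  # found an in-frame stop codon
                    if (j - i) // 3 >= min_aa:
                        orfs.append((i, j + 3))
                    i = j + 3  # resume scanning after the stop
                    continue
            i += 3
    return orfs

# A small ORF of 21 codons (start + 20 alanines) before the stop:
seq = "ATG" + "GCT" * 20 + "TAA"
print(find_orfs(seq, min_aa=50))  # [] -- the small protein is missed
print(find_orfs(seq, min_aa=10))  # [(0, 66)]
```

Lowering the cutoff recovers the small ORF, but at the cost of many spurious calls, which is exactly why better discrimination for short proteins would be useful.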
And then there are still issues in terms of integrating information across different levels, DNA, message, protein, metabolites, protein activities, and posttranslational modifications, and I think there needs to be better software to handle the information, as well as better tools to try to predict different levels.