At A Glance
Name: Jonathan Weissman
Position: Professor of cellular and molecular pharmacology, and of biochemistry and biophysics, University of California, San Francisco; Howard Hughes Medical Institute investigator
Scientists from the University of California, San Francisco; the California Institute for Quantitative Biomedical Research; and the Howard Hughes Medical Institute have devised a technique that pairs flow cytometry and a library of GFP-tagged yeast strains to quantitatively monitor protein levels in single yeast cells.
In a paper published in the June 15 issue of Nature, the researchers describe how they used the method to help them better understand the phenomenon of biological noise in gene expression.
The paper underscores the increasing popularity of flow cytometry as a tool for high-throughput and high-content proteomic, functional genomics, and drug-discovery assays, and is also an example of how single-cell proteomic analyses can complement — and possibly improve — microarray-based gene expression studies.
UCSF's Jonathan Weissman, one of the corresponding authors on the paper, took a few moments this week to discuss the work with CBA News.
Why is it important to understand biological noise in the context of gene expression?
Noise can be looked at as both a problem and an opportunity for a cell. When the biological community has thought about gene regulation in the past, they have thought about how a promoter turns on and off in response to, say, an environmental stress or some need. The view has been that every cell should be identical. But it's impossible for every cell to be exactly identical, because just like in every other complex system, it has noise. The reason is that many of the key molecules, like mRNA, are present at only a few copies per cell. If you had 1,000 copies of something, you could imagine going from 998 to 999 to 1,000, and it would smoothly change.
But when you have only one or two copies of an mRNA that is controlling some major part of the cell, some cells will have zero, some will have one, and some will have two, and these cells will have radically different behaviors. So if a cell doesn't suddenly want to be switching between on and off, how does it deal with this? How does it have a simple, smooth system that responds in a coherent way when you have this inherent propensity for noise?
The opportunity of noise is that it is a way for a cell to get out of a rut, in a way. If a cell wants to switch from one state to another — let's say it's in an off state for making an enzyme that metabolizes a sugar, and it suddenly wants to turn that on. It can sort of get stuck. The noise gives the cell a variability that … helps transitions that otherwise might be very slow.
Another opportunity, and this is much more hypothetical, is that noise can allow a natural diversity to occur in genetically identical organisms. Let's say you have a population of yeast where each cell is genetically identical and related. If somehow the conditions change (let's say the temperature goes up such that the average cell would be killed by it) and every cell was identical, they might all be equally susceptible to this. Noise makes sure that even though the genomes of the yeast are identical, the actual phenotype is different in individuals, which may allow a few individuals to survive. In a sense, we're very familiar with this idea of noise, or stochastic expression of genes, in identical twins. Even though they have exactly the same genome, they share some things that are strikingly similar, and other things are very different. The idea is that it's the environment and natural differences in gene expression that can promote individuality.
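The copy-number argument in this answer can be illustrated with a short calculation. The sketch below is not taken from the paper; it assumes, purely for illustration, that the number of mRNA copies per cell follows a Poisson distribution, and the function names are hypothetical. Under that assumption the relative spread (standard deviation divided by mean) is 1/sqrt(mean), so it is small at 1,000 copies and large at one copy.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k copies under an assumed Poisson model."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def coefficient_of_variation(mean: float) -> float:
    """Relative spread (std / mean); for a Poisson model, variance equals mean."""
    return 1.0 / math.sqrt(mean)


# Compare an abundant molecule with a scarce one (illustrative values only).
for mean_copies in (1000.0, 1.0):
    cv = coefficient_of_variation(mean_copies)
    print(f"mean copies = {mean_copies:g}, relative spread = {cv:.3f}")

# With an average of one copy per cell, a sizeable share of cells has 0, 1, or 2 copies.
for copies in range(3):
    print(f"P({copies} copies | mean 1) = {poisson_pmf(copies, 1.0):.3f}")
```

With an average of one copy, roughly a third of the modeled cells have none at all, which is the "some cells will have zero" situation described above.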
Does the technique that your group developed let you explore the underlying cause of biological noise?
It allows us to look at this at a single-cell level, and that's the key difference from previous approaches. You can miss things when you try to look at the average properties of cells. Think about two types of light switches. Some are just on/off switches, and others are dimmers. When the cell wants to switch from one state to another, it has both types: so-called binary switches and graded responses. If you're half on, and there's a graded response, that means that all the cells would be half on. For instance, if you're talking about resistance to temperature changes, and the cells were half on, then all the cells would be 50-percent resistant. On the other hand, if it was a binary switch, "half on" would mean that half the cells were completely on, and half of them were completely off. That means that if the temperature went up, half the cells might survive, and the other half would be completely killed. So you can see how these different types of responses, graded or binary, can be completely obscured if you are just looking at the average extent to which the cells are on. The only way we can really understand it is by looking at one cell at a time.
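The light-switch analogy can be made concrete with a toy example. The sketch below is an illustration only, not the group's analysis software: the population size, the 0-to-1 "expression level" scale, and the survival threshold are all assumptions chosen for the example. It builds one graded population and one binary population with the same average and shows that only per-cell values tell them apart.

```python
from statistics import mean, pstdev

N_CELLS = 1000  # hypothetical population size

# Graded response: every cell is "half on".
graded = [0.5] * N_CELLS
# Binary response: half the cells fully on, half fully off.
binary = [1.0] * (N_CELLS // 2) + [0.0] * (N_CELLS // 2)


def fraction_above(levels, threshold):
    """Fraction of cells whose level meets an assumed survival threshold."""
    return sum(level >= threshold for level in levels) / len(levels)


SURVIVAL_THRESHOLD = 0.75  # assumed level needed to survive the stress

for name, population in (("graded", graded), ("binary", binary)):
    print(
        f"{name}: mean = {mean(population):.2f}, "
        f"spread = {pstdev(population):.2f}, "
        f"surviving fraction = {fraction_above(population, SURVIVAL_THRESHOLD):.2f}"
    )
```

A bulk measurement would report the same mean for both populations; the spread and the surviving fraction differ, and those only show up when cells are measured one at a time.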
Is this an inherent problem with microarray analysis or proteomic analyses like mass spec?
Yes. There have been some attempts to do single-cell microarrays, but it's very technically difficult, and essentially very little has been done. There have also been some attempts to do single-cell proteomics on very large cells, but nothing on this scale.
You compared results from your flow cytometric analysis with quantitative microscopy. Could high-content single-cell imaging be used in this approach?
They're very complementary approaches. It would be very hard to beat flow on the numbers. We can count 50,000 cells in seven seconds. That would be very tough to beat. On the other hand, what you get out of microscopy is localization information, which is a very important thing. And microscopy, in a way, has traditionally been applied more to this type of approach. One of the things we were trying to point out is that flow cytometry, because of its really high-throughput nature, can be a nice complement by providing information that would be very hard to get using high-throughput microscopy.
The yeast strain library you developed — does that comprise every known yeast gene expressed with GFP?
The large majority of them. A few of them we couldn't get for technical reasons, but each strain has a different protein tagged with GFP.
You used a Becton Dickinson flow cytometer coupled with data-analysis software designed by your group. Was that out of pure necessity? Is there nothing on the market that could do this sort of analysis?
When we started, there was nothing capable of running high-throughput screens on the LSR II flow cytometer. There is now some commercial software out there. I think in some ways, what we did was to optimize software well for our particular application. And of course, our software is freely accessible, so other people that want to do this could certainly use it, or work with some of the commercial ones that are now out there.
It's freely available to non-profit researchers, but not commercial users, right? Does your group have an interest in commercializing this?
No. I do think there is a huge opportunity right now, though, because flow cytometry is a great technique for high-throughput and high information-content analysis. One of the real limiting things, however, is not so much the instrumentation, but the software, or being able to collect and manipulate the data. I would say that it's not something we're going to directly try to commercialize, but it is going to be important that people attempt to commercialize this.
Are you familiar with work being done by Larry Sklar's group at the University of New Mexico? Would the application of technology such as their rapid auto-sampling device (see CBA News, 5/26/2006) improve your method?
There is a lot of room for advancement. Sampling rate probably wasn't the rate-limiting thing for us, but it certainly is for other applications. I would say that in the future there are going to be more and more GFP-tagged cell strains for this type of application, and for analyzing the effects of drugs, for example. The single-cell aspect could be very useful there, because again, you want to understand not just what your drug does on average, but how it affects individual cells in your population.
In the paper, you tested for changes in gene expression in individual cells in response to rich versus minimal media types. Will you be monitoring cells in response to other environmental factors or to drugs?
That's certainly one of the directions we're going. For drug discovery, this could be applicable if you had GFP reporters on particular aspects of biology, such as turning on or off an oncogene or signal transduction pathway. You could use more than one fluorescent reporter, like a GFP reporting on one aspect of biology, and a red fluorescent protein reporting on another, and then add drugs and see how they affect the cells, or even combinations of drugs to simultaneously look at effects on different pathways — flow could be very useful for doing this. What it gives you over a microscopy-based assay is a very large number of cells, and the potential to be very rapid and quantitative on a single-cell level. But again, it's not going to supplant high-throughput microscopy. They would be very complementary.
You wrote that your method was able to detect nearly all proteins that were expressed in relatively large numbers and about 50 percent of proteins with medium expression levels, but had difficulty detecting low-copy-number proteins. How can this be improved?
We were just trying to be very conservative about what we were observing and were not observing. The limitation is not whether you can see a signal from these low-abundance proteins — it's much more due to the background fluorescence from the cells. There are ways of minimizing the background fluorescence that would allow us to avoid these problems considerably. There are also other variants of fluorescent proteins, like the RFPs, that fluoresce in a range where there is much less background fluorescence. I think if we changed the color of our fluorescent protein, used ones that were expressed at a little higher level, and then implemented some technical aspects that would allow us to distinguish between autofluorescence and protein fluorescence, there would be a big improvement. We're at a range where a little change in sensitivity would allow us to see a lot more proteins.
Have you started to compare data from your work with data from microarray analyses?
Yes, and that's one of the important things here. We know from the central dogma of molecular biology that DNA encodes RNA, and that makes proteins. Genomics and microarray technology have made it possible to predict all the genes in an organism with fairly high accuracy, and then look at how well they're expressed, or how much of that gene is being made. In the end, though, it's proteins that are carrying out the large majority of functions in a cell. So while microarrays have allowed us to look at mRNA levels, it's a proxy for what we really care about, which are the protein levels. It's a proxy that we know is sometimes very informative, and other times misses a lot of the regulation. The reason we've looked at mRNA is that it's been much easier to measure than protein levels. This approach allows us to measure protein levels at a precision that now starts to rival mRNA measurements, and at a dramatically improved sensitivity. More fundamentally, it also gives us an entirely new dimension on regulation, namely how individual cells differ from one another, information that is almost completely missing from microarray or mass spec approaches.
So many experiments on microarrays have been published, and the implicit assumption is that it's really telling you what's happening at the protein level. Yet we really don't know how true this is, or when it's true. This allows us to start looking at how well microarrays are really reporting on protein levels. The answer is sort of 'The glass is half full and half empty.' Measuring mRNA levels, in general, is a very good predictor of changes in protein levels. But it also misses a lot of regulation, a lot of which is biologically important. So it doesn't give you a full picture, and you have to look at protein levels in individual cells to get the full picture.