At A Glance

Name: Robert Murphy
Position: Director, Center for Bioimage Informatics; Professor, Departments of Biological Sciences and Biomedical Engineering, Carnegie Mellon University
Background: Postdoc, chemistry and human genetics, Columbia University, 1979-1983; PhD, California Institute of Technology, 1980; BA, biochemistry, Columbia University, 1974
A database detailing the location of every protein in a variety of cell types from a variety of organisms — is it a pipe dream? Bob Murphy at Carnegie Mellon University doesn’t think so, and is developing a set of techniques and analysis methods — dubbed “location proteomics” — that he hopes will eventually make such a database a reality. A paper detailing Murphy’s approach, entitled “Objective clustering of proteins based on subcellular location patterns,” will be published in a forthcoming issue of the open-access Journal of Biomedicine and Biotechnology (http://jbb.hindawi.com/forthcoming/index.html). Murphy took a few moments last week to discuss his research with Inside Bioassays.
The image-recognition technology you use strikes me as having similarities to techniques in high-content cellular imaging, and you’ve done some work in the past with Lansing Taylor [Founder of Cellomics and CEO of Cellumen].
The origin of both of our efforts in this area was a grant that a whole bunch of us were involved in 10 or 15 years ago that was the impetus behind the technologies that ended up at Cellomics and in what we’ve been doing.
Your work seems to straddle the line between proteomics and cell-based assays or high-content cell screening. Do you agree with that?
Yes, absolutely. High-content screening, as it is defined right now, is still a bit of a misnomer. When you look at what is actually calculated on any assay these days, it’s a very tiny fraction of the content of information that’s present even in a 10- or 20X image. I think that there is a lot more information content in images that are being acquired as part of drug screening and other high-throughput microscope experiments than is currently being extracted. One of the goals of our work is to facilitate extraction of more information, more routinely, by implementing systems that can learn what’s important from a particular assay without being told — that is, learn what features of a distribution, what patterns in a culture are indicative of whatever conditions have been imposed on that sample, whether that’s one or two drugs, or some expression of various genes. Also, a goal is to support movement towards higher-resolution imaging — to increase that information content even further — so that even more can be done. Rather than being involved in trying to implement a specific assay or a specific commercial product, we’re trying to work out methods that can be described for everyone to use, in order to push the field along.
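The idea of a system that learns, without being told, which image features are indicative of an imposed condition can be sketched with a standard feature-importance ranking. This is a toy illustration on synthetic data, assuming scikit-learn; it is not Murphy's actual software, and all names here are hypothetical:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in for per-cell image features: 200 cells x 10 features.
# Only feature 3 actually differs between control and treated cells.
X = rng.normal(size=(200, 10))
y = np.repeat([0, 1], 100)   # 0 = control, 1 = drug-treated
X[y == 1, 3] += 2.0          # the informative pattern change

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Rank features by learned importance: the classifier "discovers" which
# feature distinguishes the imposed conditions, without being told.
ranked = np.argsort(clf.feature_importances_)[::-1]
print(ranked[0])  # feature 3 tops the ranking
```

The same ranking idea extends to any feature set extracted from screening images; the classifier, not the assay designer, decides what is informative.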
What is different about this approach? Is it more complex extraction from images, or are you looking at more types of objects …
Part of our experience has been that there is no magic bullet around which approaches should be built: no one particular type of image feature will always be useful. But when you look at the range of things that might need to be distinguished, especially when working on changes in sub-cellular distribution that go beyond things like nuclear translocation, we’ve observed that a combination of features of very distinct types — including object-based features, which we’ve been using for years — is important. I think there have certainly been advances in recent years in the level of analysis that people typically do — there’s just room for further advancement.
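Combining distinct feature types into one descriptor might look like the following toy sketch: object-based features from labeled connected regions alongside simple intensity-style features. This assumes SciPy's ndimage module and a synthetic image; it is only an illustration of the idea, not the lab's actual feature set:

```python
import numpy as np
from scipy import ndimage

# Toy fluorescence image: two bright "objects" on a dark background.
img = np.zeros((64, 64))
img[10:14, 10:14] = 1.0
img[40:46, 30:36] = 1.0

# Object-based features: threshold, label connected regions, summarize.
mask = img > 0.5
labels, n_objects = ndimage.label(mask)
sizes = ndimage.sum(mask, labels, index=range(1, n_objects + 1))

# Simple intensity-style features of a very different type.
mean_intensity = img.mean()
edge_fraction = np.abs(np.diff(img, axis=0)).mean()

# Combine the distinct feature types into one vector for a classifier.
feature_vector = np.array([n_objects, sizes.mean(),
                           mean_intensity, edge_fraction])
print(n_objects)  # 2
```

The point is the concatenation at the end: no single family of features carries the classification, so heterogeneous types are pooled into one vector.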
Is your plan to create some sort of open architecture for people to use freely?
Yes, that’s part of it. But it’s also illustrating approaches that people may decide to implement in their own architecture, in the traditional academic manner — which is to publish papers that say, ‘Look, if this is a problem you’re trying to solve, here’s a way to solve it.’ And again, that gets implemented in software, and we’re making our software available, but that doesn’t have to mean that somebody has to use that software in order to benefit from the work.
Is your work mainly focused on software development? You mentioned the idea of higher-resolution imaging. Also, are you using existing reagents?
We don’t use any reagents that are sold for the commercial high-throughput systems. We are typically using GFP-tagged proteins in cell lines that have been isolated as part of the CD-tagging project by Jonathan Jarvik and Peter Berget. So our focus is more on what subtle differences in protein patterns can be distinguished in images of varying quality.

There are really two directions. One is putting more learning capabilities into a generic assay builder, so that you can provide a set of conditions and have the software work out the best way to assign a future sample to one of the categories, or concentrations, or effective doses, or whatever you’ve seen in the data that you gave it. That would use a combination of supervised and unsupervised learning methods: you try to decipher the structure of the patterns being seen in an unsupervised way, and then combine that with the information you have about what concentrations or conditions are being used, so you can associate the pattern changes with the specific condition. Then you can turn that around so that for a future sample, you can analyze it and say, ‘OK, the pattern shown here is equivalent to this set of conditions that you’ve shown me before.’ That means you can have assays that don’t just follow a simple transition from ‘A’ to ‘B.’

The other direction is to learn new patterns and group proteins by their patterns. That’s the focus of the work discussed in our recent article.
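The combination of unsupervised pattern discovery with supervised association to conditions could be sketched as follows. This is a minimal illustration on synthetic data, assuming scikit-learn; the condition names and feature layout are invented for the example:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)

# Synthetic per-image pattern features for three imposed conditions,
# each producing a distinct (but initially unlabeled) pattern.
centers = np.array([[0.0, 0.0], [5.0, 0.0], [0.0, 5.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(50, 2)) for c in centers])
conditions = np.repeat(["untreated", "drug_A", "drug_B"], 50)

# Unsupervised step: discover the pattern groups without using labels.
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

# Supervised step: associate each discovered cluster with the condition
# most often observed in it.
cluster_to_condition = {}
for k in range(3):
    members = conditions[km.labels_ == k]
    vals, counts = np.unique(members, return_counts=True)
    cluster_to_condition[k] = vals[np.argmax(counts)]

# A future sample can now be mapped back to a known condition.
new_sample = np.array([[5.1, -0.2]])
predicted = cluster_to_condition[km.predict(new_sample)[0]]
print(predicted)  # drug_A
```

Turning the clustering "around" in this way is what lets an assay report an arbitrary learned pattern rather than a single pre-defined A-to-B transition.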
For imaging, do you use basic microscopy?
We have two different readers, and we also use a spinning disc confocal microscope. With the readers, we’re trying to push the envelope as to what can be done with them — using higher magnification objectives, and collecting more information per sample, per well than is typical.
As I’m sure you’re aware, several of the imagers on the market use spinning disk confocal microscopy or some other variation on confocal imaging. It seems that there might be a commercial entity out there that would be interested in what you’re doing on the software side.
Yeah, but my primary interest is in doing things that can be of benefit to the whole community. And yes, there are many different kinds of readers, and hopefully, the kinds of things that we’re doing would encourage somebody that has a particular reader to try something that it’s capable of doing, but that they haven’t previously tried.
I also think that another piece of this — and I know that some of the manufacturers would like this to be part of their market, as well — is to push the use of high-throughput microscopy in cell biology research, not just in drug-development research. The kinds of work that we’ve done for classifying and clustering sub-cellular patterns will need to be done for many, many cell types, under many conditions, in many different organisms. That’s potentially an important market for these kinds of microscopes.
I just did an interview recently with Andreas Vogt from UPitt (see Inside Bioassays, 2/1/2005), who talked about high-content cellular analysis becoming more accepted in the academic community, becoming more hypothesis-driven …
I don’t think it’s just about becoming hypothesis-driven; I think the community has understood the value of discovery as a paradigm in addition to hypothesis-driven research. That’s been a sore point for a number of years, but I think it’s pretty clear now that it’s becoming very widely accepted — that it’s perfectly reasonable to set as a goal discovering the characteristics of some particular process on a genome-, proteome-, transcriptome-, or metabolome-wide basis. And certainly high-throughput microscope readers can play a role in that.
Just to be clear, these are all meant to be done with live cells?
Yes, the most powerful approaches are going to be for live cells, in that you don’t have any concerns about the effects of fixation, et cetera. That’s not to say there isn’t value in using high-throughput microscopes with fixed-cell preparations that have been stained in any number of colors, to get around the problems of trying to do multiple-color analysis in live cells with microscopes that, in most cases, basically measure one color at a time.
Is your lab also planning on applying this technology to your own specific research interest?
My long-term research interest is in helping the community understand the distribution of all biological molecules, in all cell types, under all conditions. It’s my belief that this will be critical to being able to say that we understand how cells work. It’s especially critical, and has been to a large degree underappreciated, for systems biology efforts to build in silico cells and so on. Most of those kinds of efforts, if they consider location at all, do so at a cell biology textbook level: ‘Here are ten different organelles that the protein is allowed to be in.’ We can perhaps, in some systems, model that, but I’m interested in the fact that there are far more subtle differences in the distribution of proteins in cells than can be appreciated by eye or described in the limited vocabulary of organelle names. I’d like to move the field of analysis of sub-cellular distributions into a statistically sound, objective framework.
And you’re referring to this concept as ‘location proteomics’?
Exactly. I think we’re the first to use the term, but we’re certainly not the only people who are working in the area. There are a number of exciting projects that people are doing that are going to help in this regard.