Image Database Maps Subcellular Locations of Proteins: New Dimension to Proteomics Research


Working in a field called location proteomics, scientists at Carnegie Mellon are creating a database of digital microscope images that show the subcellular location of fluorescently labeled proteins. The idea behind the database is to let scientists determine at high resolution where proteins of interest are located within cells, and to let researchers use sequence-based queries to look up where a protein was found in previous experiments.

“Currently, within the location determination field, people do fractionations of the cell and measure how much of each protein was expressed in each fraction. This is a fairly low-resolution approach because you can only do so many fractions,” said Robert Murphy, the principal investigator at Carnegie Mellon who is heading up the location database development. “The idea of this machine[-based] method is to be able to get a higher resolution.”

The resource, called the Protein Subcellular Location Image Database, is freely available to researchers through the Murphy lab home page.

Murphy’s research team started developing its location proteomics techniques in the mid-1990s by fluorescently tagging known proteins from five cellular organelles: the Golgi apparatus; the lysosome; the nucleus; the nucleolus; and the cytoskeleton. They collected digital fluorescence patterns and tested whether the images could be accurately interpreted by computers.

After they had succeeded with the five organelle proteins, Murphy and his group went on to tag proteins from all of the known organelles, and eventually to tag different proteins from within the same organelle. They found that while the human eye could not detect the difference between two labeled Golgi apparatus proteins, computers could.

“We can resolve protein patterns that people can’t distinguish,” said Murphy. “Combined with automation, that makes this a powerful tool.”

To generate more protein fluorescence data, Murphy’s group now uses a technique called CD tagging, an approach also developed at Carnegie Mellon that tags all three so-called central dogma (CD) molecules: DNA, RNA, and protein. In CD tagging, a DNA sequence encoding a fluorescent protein such as GFP is inserted into an intron of a target gene. Once the gene is transcribed and translated, researchers can determine the tagged protein’s subcellular location by computational analysis of the fluorescence microscope images.

For high-throughput data, the researchers can randomly tag genes throughout a genome without targeting a specific protein, then analyze the images and classify the proteins into subcellular locations using computer algorithms.
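As a rough illustration of what classifying proteins into subcellular locations by computer can look like, here is a minimal sketch of a nearest-centroid classifier. It is not the Murphy group's software. It assumes each image has already been reduced to a short numeric feature vector, and the labels, feature values, and function names are all hypothetical.

```python
from math import dist

# Hypothetical training data: made-up two-number feature vectors for images
# of proteins whose subcellular location is already known.
training = {
    "nucleus": [(0.90, 0.10), (0.85, 0.15)],
    "lysosome": [(0.20, 0.80), (0.25, 0.75)],
    "cytoskeleton": [(0.50, 0.50), (0.55, 0.45)],
}

def centroid(vectors):
    """Average a list of equal-length feature vectors component by component."""
    n = len(vectors)
    return tuple(sum(v[i] for v in vectors) / n for i in range(len(vectors[0])))

# One representative point per location class.
centroids = {label: centroid(vectors) for label, vectors in training.items()}

def classify(feature_vector):
    """Return the location label whose centroid is closest to the vector."""
    return min(centroids, key=lambda label: dist(feature_vector, centroids[label]))

# Classify the image of a randomly tagged, previously unlabeled protein.
print(classify((0.88, 0.12)))
```

A real system would need far richer image features and a stronger classifier, but the flow is the same: learn from proteins with known locations, then assign a location to each newly tagged protein.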

“The prospectus of this is very promising,” said Yehia Mechref, a researcher in the Department of Chemistry at Indiana University who is studying the effect of alcohol on the proteomics of rat livers using subcellular fractionation techniques. “The capability to locate every single protein would be a good added advantage in proteomics, especially in looking at proteins that you’re interested in.”

However, the location proteomics field is limited in that it only documents the presence or absence of a protein, and not its quantity, Mechref pointed out.

“In the case of diseases, you’re generally looking for changes in terms of up or down regulation, not complete disappearance,” said Mechref. “What’s happening is up or down signal strength.”

But Kuo-Chen Chou, a researcher at the Gordon Life Science Institute in San Diego who studies subcellular localizations of proteins, said that knowing the locations of proteins within cells could be useful in studying diseases.

“(Murphy’s) work is important to not only basic research, but also the pharmaceutical industry and medical practice because identifying differences in how proteins move within healthy and diseased cells is one critical way that doctors could diagnose disorders and gauge response to treatment,” said Chou.

In a review article to be published in the September issue of the Journal of Biomedical Optics, Murphy pointed out that knowing the normal subcellular distribution of a protein provides important clues to the protein’s function. For example, if a protein moves within the cell after the cell is treated with a certain drug, that may mean that the protein plays a role in signal transduction.

Murphy’s group is currently collaborating with other research groups at Carnegie Mellon, led by John Jarvik and Peter Berget, to compare the subcellular locations of proteins in normal and cancerous cells. The work has not been published yet, but Murphy said the research so far has proven that the location proteomics technology can detect differences in the locations of proteins in normal versus diseased cells.

In addition to fluorescence images, Murphy’s location proteomics software also depicts protein locations by generating cluster trees that use a branched diagram to indicate how far one protein is from another.
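A cluster tree of this kind can be sketched with simple agglomerative clustering. The example below is an illustration only, not the Murphy lab's implementation. It assumes that "how far" means dissimilarity between numeric descriptions of the location patterns, and it uses made-up feature values, placeholder protein names, and average linkage as one arbitrary choice of merge rule.

```python
from itertools import combinations
from math import dist

# Hypothetical feature vectors, one per protein (made-up values, not real data).
features = {
    "protein_A": (0.10, 0.90),
    "protein_B": (0.12, 0.88),
    "protein_C": (0.80, 0.20),
    "protein_D": (0.78, 0.25),
    "protein_E": (0.50, 0.50),
}

def leaves(tree):
    """List the protein names under a node (a name, or a pair of subtrees)."""
    if isinstance(tree, str):
        return [tree]
    left, right = tree
    return leaves(left) + leaves(right)

def average_distance(a, b, feats):
    """Average linkage: mean pairwise distance between the members of two clusters."""
    pairs = [(x, y) for x in leaves(a) for y in leaves(b)]
    return sum(dist(feats[x], feats[y]) for x, y in pairs) / len(pairs)

def build_cluster_tree(feats):
    """Merge the two closest clusters repeatedly until one nested tree remains."""
    clusters = sorted(feats)
    while len(clusters) > 1:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: average_distance(pair[0], pair[1], feats),
        )
        clusters = [c for c in clusters if c != a and c != b]
        clusters.append((a, b))
    return clusters[0]

tree = build_cluster_tree(features)
print(tree)
```

In the nested result, proteins joined at the deepest level have the most similar patterns, while proteins that only meet at the root are the furthest apart.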

“We’re trying to identify where everything goes normally, and then how that changes under different conditions,” said Murphy. “This is a different approach than what’s commonly being used. It’s much more systematic.”

Mechref said he sees Murphy’s work as being complementary to the work that is currently done, where proteins are fractionated into different subcellular layers.

“It’s a nice approach and it’s going to be a very influential approach in the future,” he said.

