Researchers at Stockholm's Royal Institute of Technology (KTH) and the Max Planck Institute for Biochemistry have characterized the transcriptome and proteome of three functionally different cell lines.
The study, which was detailed in a paper published in the December issue of Molecular Systems Biology, suggests that protein expression is more similar across cell types than previously thought, a result with implications for drug development as well as basic cell biology research, Max Planck researcher Matthias Mann – one of the paper's authors – told ProteoMonitor.
Using deep sequencing of mRNA, proteomics analysis via SILAC mass spectrometry on a Thermo Fisher Scientific LTQ-Orbitrap XL, and antibody-based confocal microscopy, the scientists compared RNA and protein expression in a bone osteosarcoma, an epidermoid squamous cell carcinoma, and a brain glioblastoma.
The study quantified roughly 5,500 proteins across the three lines, finding that 65 percent of the detected proteins had similar expression levels across all three cell lines and only 10 percent of the detected proteins were expressed in just one of the three.
Gene expression analysis likewise found considerable similarity across cell lines, with 74 percent of transcripts expressed in all three cell lines and 13 percent found in just one of the three lines. Roughly a quarter of the transcripts showed little or no change in expression levels across cells, while 40 percent showed a two-fold or greater change between two or more cell lines.
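The two-fold criterion can be illustrated with a minimal Python sketch. The function names and the expression values are hypothetical and are not taken from the paper; the sketch simply flags a transcript when the ratio between any two of its three cell-line expression levels is two or more.

```python
from itertools import combinations


def max_fold_change(levels):
    """Return the largest pairwise ratio among expression levels.

    All levels must be positive; a transcript absent from a cell line
    would need separate handling (not covered in this sketch).
    """
    if any(level <= 0 for level in levels):
        raise ValueError("expression levels must be positive")
    return max(max(a, b) / min(a, b) for a, b in combinations(levels, 2))


def changes_two_fold(levels, threshold=2.0):
    """True if any two cell lines differ by at least `threshold`-fold."""
    return max_fold_change(levels) >= threshold


# Hypothetical expression levels for one transcript in three cell lines.
stable = [10.0, 12.0, 15.0]    # largest ratio is 15 / 10
variable = [10.0, 12.0, 25.0]  # largest ratio is 25 / 10

print(changes_two_fold(stable))
print(changes_two_fold(variable))
```

Taking the largest pairwise ratio mirrors the article's wording of a change "between two or more cell lines": one divergent pair is enough to flag the transcript.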
The work offers insight into "a basic cell biology question of how different cells are different," Mann said. "Are they different because you choose to express a very different subset of genes, or are they different because you express the same genes but to different amounts? The paper [indicates] that at least for cell lines it is more the latter."
"If you had asked me two years ago, I would have been extremely surprised how many proteins are expressed everywhere," KTH's Mathias Uhlen, whose lab collaborated with Mann's on the project, told ProteoMonitor. "I've started to accept that, but I think a lot of people will still be very surprised."
This finding, Mann noted, complicates drug development, because it suggests that targeting therapies to a particular tissue or cell type may be harder than previously thought.
"It's not as simple as one might have hoped," he said. For instance, a drugmaker "might want to target the kidney and [hope that] a lot of the targets you want to go after in the kidney are only expressed in the kidney. But that doesn't seem to be the case."
Also surprising, Uhlen said, was the extent to which differences in the cell lines' proteomes were confined to low-abundance cell surface proteins.
"The cell looks very similar on the inside, but differs a lot when you look at the outside of the cell, at the surface proteins," he said. "The fingerprint of the cell is very much expressed on the surface, while a lot of the proteins on the inside are more of what we would call housekeeping."
With most of the variation in protein expression centering on low-abundance proteins, errors due to sampling bias could have been a significant issue. The researchers avoided this problem, Mann said, by using triple SILAC labeling, in which each cell line was cultured with amino acids carrying a different stable-isotope label.
"From the proteomics side that's the major innovation of this paper," he said. "Because we used triple SILAC [labeling] we can be completely sure that undersampling issues don't affect [the data]."
Triple SILAC avoids sampling bias by producing a triplet of peaks for each peptide, one for each isotope label used. Identifying one member of the triplet therefore identifies the other two as well.
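To illustrate why the three labeled forms of a peptide appear together in a spectrum, the following Python sketch computes where the peaks of a triplet would fall. It assumes a commonly used triple SILAC label set (Lys0/Arg0, Lys4/Arg6, Lys8/Arg10), which the article does not specify; the peptide sequence and m/z value are hypothetical.

```python
# Mass shifts in daltons relative to the unlabeled amino acid.
# ASSUMPTION: Lys4 = 2H4, Arg6 = 13C6, Lys8 = 13C6 15N2, Arg10 = 13C6 15N4.
# The article does not say which labels the study used.
LABEL_SHIFTS = {
    "light": {"K": 0.0, "R": 0.0},
    "medium": {"K": 4.02511, "R": 6.02013},
    "heavy": {"K": 8.01420, "R": 10.00827},
}


def triplet_mz(sequence, light_mz, charge):
    """Return the expected m/z of the light, medium, and heavy forms.

    sequence -- peptide sequence in one-letter amino acid code
    light_mz -- observed m/z of the unlabeled (light) peptide
    charge   -- charge state of the ion (positive integer)
    """
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    n_lys = sequence.count("K")
    n_arg = sequence.count("R")
    result = {}
    for name, shift in LABEL_SHIFTS.items():
        # Total label mass is divided by the charge to give the m/z offset.
        delta = (n_lys * shift["K"] + n_arg * shift["R"]) / charge
        result[name] = light_mz + delta
    return result


# Hypothetical doubly charged tryptic peptide ending in one lysine.
for label, mz in triplet_mz("PEPTIDEK", 500.0, 2).items():
    print(f"{label}: {mz:.4f}")
```

Because the offsets between the three forms are fixed by the labels, finding any one peak tells the software where to look for the other two, which is the property the article describes.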
"What is notoriously difficult in proteomics is that in one analysis you pick one low abundance peptide, and in the next another one, and you think that [represents] a difference [in expression]," he said. "But in fact it's just that you couldn't reproducibly identify them. This isn't a problem in triple SILAC labeling because we see every peptide as a triplet."
The downside of the method, Mann suggested, is the complexity of the mass spectra it generates, which makes it more difficult to delve deep into the proteome. And, in fact, he noted, the study "didn't go all out to get the highest coverage of the proteome," something he said "would have required months of measurement with current technology."
More in-depth work could reveal that the cell lines are even more similar than the Molecular Systems Biology study suggests, the scientists noted in the paper. Because the proteins found only in one cell line were generally of very low abundance, "in-depth studies are needed to rule out that some of the proteins have not been scored due to difficulties to measure quantitative levels for low-abundance gene products," they said.
In addition to delving deeper into the proteome, the researchers are interested in replicating the work in additional cell lines and in live tissues, Mann said.
"We've looked at cell lines. The next step will be to see the expression profile [of cells] in vivo," he said. "Many people are trying to compile the proteome of different cell types and tissues in different organism[s]. So this is a long-term goal of proteomics."