Senior researcher, dept. of pathophysiology
Name: Jiri Petrak
Position: Senior Researcher, department of pathophysiology, 1st Medical Faculty, Charles University, Prague
Background: Researcher, Institute of Hematology and Blood Transfusion, Prague; post-doc fellow, University of California, Berkeley; PhD in molecular biology, Charles University
Researchers in the Czech Republic have identified, quantified, and categorized proteins they said appear with disproportionate frequency in proteomics experiments done by two-dimensional electrophoresis. Such proteins include heat-shock protein 27, heat-shock protein 60, and enolase 1.
Writing in the current issue of Proteomics, the researchers “suggest that the frequent identification of these proteins must be considered in the interpretation of any 2 DE studies.”
They also discuss whether they think the occurrence of these proteins is due to biology or technical limitations of the technology.
ProteoMonitor spoke this week with Jiri Petrak, a researcher at Charles University in Prague and the corresponding author on the article, about his and his colleagues’ findings. Below is an edited version of the conversation.
What was the motivation for this article?
The reason or the motivation was that during the last couple of years, our group [has been using] proteomic approaches as a tool to study various aspects of … cell physiology. And it inevitably leads to reading a lot of related articles.
After reading several proteomic papers based on two-dimensional electrophoresis, one starts experiencing [an] unsettling sense of déjà vu. It seems that the same proteins are identified as [being] differentially expressed … basically over and over regardless of experiment or tissue.
And moreover, when you yourself identify the same proteins as differentially expressed in many different experiments and different tissues, then you have to admit that something is wrong. Basically, you see the same protein names over and over, especially translation elongation factors, annexins, peroxiredoxins, enolase, and several other proteins.
And I believe most of us who use 2 DE are very familiar with these proteins … and we decided to quantify this phenomenon.
By quantifying these proteins what were you trying to accomplish?
When I discussed this with a couple of my colleagues, they admitted to having the same feeling, but nobody was able to show this for sure … so we had to find a way to prove it.
We decided to take a big set of published papers that used two-dimensional electrophoresis and we extracted data from three recent volumes of [the journal] Proteomics and we basically compiled the identities of differentially expressed proteins identified by 2 DE in human, mouse, and rat tissues.
The resulting dataset contained about 4,700 protein identifications that were covered in 186 experiments [done] by 2 DE.
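The tallying step Petrak describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `identification_frequency` and the toy input are hypothetical, and the interview does not say how the authors actually processed their data. The sketch counts each protein once per experiment and reports the fraction of experiments in which it was listed as differentially expressed.

```python
from collections import Counter


def identification_frequency(experiments):
    """Map each protein to the fraction of experiments that report it.

    `experiments` is a list of protein-name lists, one list per experiment.
    (Hypothetical helper, not the authors' code.)
    """
    total = len(experiments)
    if total == 0:
        return {}
    counts = Counter()
    for proteins in experiments:
        # set() ensures a protein is counted once per experiment,
        # even if it was identified in several spots on the same gel.
        counts.update(set(proteins))
    return {protein: n / total for protein, n in counts.items()}


# Toy, invented input: NOT data from the article.
toy_experiments = [
    ["enolase 1", "HSP27", "peroxiredoxin 1"],
    ["enolase 1", "annexin A2"],
    ["HSP27", "enolase 1", "enolase 1"],
    ["tubulin beta"],
]
freq = identification_frequency(toy_experiments)

# Print proteins from most to least frequently reported.
for protein, fraction in sorted(freq.items(), key=lambda kv: -kv[1]):
    print(f"{protein}: reported in {fraction:.0%} of experiments")
```

Counting per experiment rather than per gel spot is a design choice made here so that one study with many spots of the same protein does not dominate the tally.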
What tissues were covered by these studies?
We included all solid tissues, all cell cultures, but we didn’t include serum, body fluids, or fractionated tissues.
Any reason why not?
The reason is that you cannot compare proteins that are differentially expressed in serum. It’s something completely different because it’s a defined set of proteins in serum. It’s an absolutely different set, dominated by albumin and other proteins, which are not generally present in normal tissues, normal cells.
If we included fractionated tissues, it would definitely lead to some bias because once you touch the tissue, once you isolate, for instance, mitochondria, the spectrum of the proteins changes.
Researchers in this field are always saying that the high-abundance proteins hide the low-abundance ones and they keep seeing the same proteins over and over. What does this article add to the discussion?
Basically, that’s the fact about two-dimensional electrophoresis, that [unless] you do any subfractionation, you only see the most abundant proteins, only the proteins that are present in, let’s say, tens of thousands of copies per cell. Still, it’s about thousands of different proteins.
But we identify a couple of the proteins. Basically, if you can load up to 1 or 2 milligrams of proteins of unfractionated tissues, then it’s obvious only the most abundant proteins can be visualized by conventional staining.
What, then, should your fellow researchers be taking away from your article if you’re saying that there are these proteins that keep recurring in study after study? Wouldn’t they be aware of that already?
What I think is important is that we show the results of two-dimensional electrophoresis experiments are predictable. No matter what you do, you are very likely to identify enolase, HSP27, HSP60, peroxiredoxins, annexins, or tubulins …
Now there could be two explanations for this observation or this phenomenon … Either there’s a technical problem or bias of the method, or, alternatively, these proteins are general sensors or regulators that really do change expression in response to many different stimuli.
And in this case, differential expression of such general sensors is not very specific and it doesn’t tell us much about the particular molecular mechanism that we intend to study. It doesn’t say anything about the biology involved.
To get some idea whether it’s a technical artifact or whether these proteins could be such general sensors, we wanted to know whether the same proteins are also very often differentially expressed at the mRNA level.
And we did something similar, a similar meta-analysis, but now with microarray data. We found that changes in mRNA expression corresponding to the most notorious proteins, namely enolase 1 and HSP27, correlate surprisingly well, at least in humans, with the frequencies calculated in our proteomic meta-analysis.
The most notorious proteins, the proteins that are most often differentially expressed, are also very often identified as being differentially expressed at the mRNA level.
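One way such a comparison could be made is a rank correlation between how often each protein is reported as differentially expressed and how often its mRNA is. The interview does not name the statistic the authors used, so the Spearman coefficient below is an assumption chosen for illustration, and the protein labels and frequencies are invented.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented frequencies for four placeholder proteins (NOT from the article).
protein_freq = {"protein A": 0.30, "protein B": 0.20, "protein C": 0.10, "protein D": 0.05}
mrna_freq = {"protein A": 0.25, "protein B": 0.30, "protein C": 0.10, "protein D": 0.02}

names = sorted(protein_freq)
rho = spearman([protein_freq[n] for n in names], [mrna_freq[n] for n in names])
print(f"Spearman rho = {rho:.2f}")
```

A rank-based measure is used in this sketch because frequencies from two different technologies need not be on the same scale; only the ordering of proteins is compared.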
And what does that indicate?
That it’s not a bias of the technique, [or a] technical artifact, but that these proteins … are really something like general sensors or general stress response proteins.
If that’s the conclusion that you come to, how do you use that information in your research?
I think that these … notoriously identified differentially expressed proteins could be used to focus our attention [on] what is really important. I don’t want to say that we should exclude all these proteins from the interpretation of results, but I think we must keep in mind that if, for instance, I find enolase and HSP27 are differentially expressed in my particular experiment, I shouldn’t build an unstable hypothesis on enolase and HSP27. I should focus on other proteins that are not among those well known to be differentially expressed in, let’s say, every third experiment.
I believe it could help us to focus our attention [on] what’s really important. In fact, that’s what we are trying to do right now. We are trying to identify some therapeutically or diagnostically exploitable proteins of leukemia. We are using our charts to narrow down the list of the differentially expressed candidate proteins. And it seems to work.
We found two molecules that can be targeted to eliminate some leukemia cells, at least in in vitro experiments.
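The narrowing-down step Petrak describes could look roughly like the sketch below. It assumes a table of reporting frequencies is already available; the function name `prioritize`, the cutoff of one third (loosely echoing his "every third experiment" remark) and all numbers are hypothetical, not taken from the article.

```python
def prioritize(candidates, frequency, max_frequency=1 / 3):
    """Keep candidates reported in fewer than `max_frequency` of experiments.

    Proteins missing from the frequency table are treated as never
    reported before (frequency 0.0) and are therefore kept.
    """
    return [p for p in candidates if frequency.get(p, 0.0) < max_frequency]


# Made-up frequency table and candidate list, for illustration only.
toy_frequency = {
    "enolase 1": 0.40,
    "HSP27": 0.35,
    "hypothetical protein X": 0.02,
}
toy_candidates = [
    "enolase 1",
    "hypothetical protein X",
    "HSP27",
    "hypothetical protein Y",
]

shortlist = prioritize(toy_candidates, toy_frequency)
print(shortlist)
```

In line with Petrak's caveat, the frequently reported proteins are only set aside for prioritization here; nothing in the sketch implies they are biologically irrelevant.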
Were you able to identify any research practices that could account for the recurrence of these proteins?
I don’t know, to be honest. There are always problems if you work with human samples, that there’s a big polymorphism, big differences between individuals that might have some influence on the outcomes of such experiments, and statistics involved, etc.
We wanted to do something similar with serum, but two-dimensional electrophoresis is not used very often to look for biomarkers in serum for several reasons. People use more sophisticated or deeper approaches like multidimensional chromatography separations, et cetera.
But these approaches are difficult to compare because methodologically they are quite different. I don’t know how to perform a similar analysis of published data for serum, but I believe we would show that [the result with] serum is similar to what we found with tissues.
[Also,] we found that in 40 percent of published experiments with human tissues and cells, at least one type of keratin was identified as being differentially expressed. And this is in striking contrast with rodent samples, where keratins were identified relatively rarely. That made us believe that some of the supposedly differentially expressed keratins identified in human samples are in fact unwanted signatures originating from the bodies of the researchers involved.
Especially problematic are all experiments with human epithelial tissues and epithelia-derived cell cultures. Is the identified keratin really differentially expressed in the tissue or did it come from my hands?
If you identify a human keratin in a mouse sample, then it is simple and obvious. And you do not present it in the results tables. So we should be very alert when a keratin molecule pops up among the differentially expressed proteins in experiments with human tissues.
Would sample preparation have any effect on this recurrence of proteins from study to study?
No, I don’t think so, because experiments that are based on two-dimensional electrophoresis usually contain only three to five, maybe six, up to 10 replicates, and it’s usually well controlled and the vast majority of experiments are performed on either laboratory animals or cell cultures where we don’t have this problem with sampling and selection of populations.
That’s more the problem of biomarker studies.
Since you submitted this article for publication, has anyone been able to come up with a strategy to overcome this issue about these few proteins constantly recurring in studies?
I think it would be difficult. Basically if these proteins are … general sensors and general responders, the only way to come through it is to ignore these proteins and focus on the other proteins that are differentially expressed because that’s the largest dilemma of experiments based on two-dimensional electrophoresis.
You end up with a list of differentially expressed proteins. Let’s say you have 20 proteins that are up-regulated, 20 proteins that are down-regulated — now what?
You end up with a list, and we shouldn’t end up like this. We should answer biological questions. The only thing you can do [is to] narrow it down to one, two, three, maybe five proteins that you think are important or specific for the question that you are studying.
Do you suspect that if you looked at mass spectrometry, you would come across the same findings that the same proteins keep recurring?
Yes, absolutely, but mass specs basically don’t do anything. There must be some preceding fractionation, either [by] chromatography or electrophoresis, or whatever. Then you would have the same problem … It’s not a problem of 2 DE only, it’s a problem of all technology, to distinguish what is specific and what is not.