NEW YORK (GenomeWeb) – Last week the Human Proteome Organization held its annual meeting in Taipei, Taiwan. Among the attendees was Swiss Federal Institute of Technology scientist Ruedi Aebersold, a leading proteomics researcher and one of the founders of HUPO's Biology/Disease-driven Human Proteome Project, an effort that tackles the challenge of mapping the human proteome from the perspective of the proteomic profiles of specific organs, diseases, and biological systems.
GenomeWeb spoke to Aebersold this week from New York to get his thoughts on the meeting and trends in proteomics research more generally. Below is an edited version of the interview.
Did you observe any particular trends at last week's meeting in terms of where HUPO's focus as an organization seems to be moving?
Well, [HUPO's] Human Proteome Project, which has the task of mapping out the proteome, is still going, and it is now in a phase where they are trying to go after proteins that should be there but have not been found. So that is going on at some level, but there is this other branch where the question is, 'How can proteomic data be used in specific clinical and biological contexts?' And this is, I think, gaining a lot of momentum. So I think there is a trend to go away from just identifying more and more proteins and towards trying to determine what these proteins can do and what they mean in a clinical and biological setting.
What implications does that have for how proteomics researchers go about their work in terms of approaches and technologies used? For instance, more focus on modifications and proteoforms, protein interactions and complexes, as opposed to just individual molecules?
I think there were some clear trends, and they are exciting directions. So, exactly what you say, to look more at proteoforms and modifications. And that is covered to some extent by top-down proteomics, which is steadily progressing, and also bottom-up proteomics where there are increasingly sophisticated ways to robustly identify modifications.
What I think is kind of a new direction is to look at the intersection or convergence or mutual dependency of different types of modifications. I think up until recently there was mostly the phosphorylation world and the ubiquitin world and the glycosylation world, and I think now it is recognized that these types of modifications and the systems that create them are interlinked, and increasingly people are starting to look in particular contexts and at the intersection or dependency of these different types of modifications.
Does the technology exist for doing that in a robust way?
I think this is still technically very much limited. I think where there has been a lot of progress reported in terms of robustness and the number of modifications is in individual modification types. Like, phosphorylation [research] has already been quite robust. There was quite a lot of discussion about glycosylation at this HUPO meeting. But right now I think most of these types of modifications are identified in isolation, and then maybe different types of modifications are measured on aliquots of the same sample.
I think that the next big step will be to see how they really intersect on the level of a particular protein. This is still difficult to do. Maybe top-down is a solution, but of course top-down is also struggling with this question. It is conceptually the way to go, but technically it still has a ways to go to characterize proteoforms that have different types of modifications attached.
Do you have a sense of the level of interest in top-down mass spec within HUPO? In the past that hasn't been as visible a part of the organization as other initiatives.
They have a very well-organized community; it is simply technically a very difficult task to go beyond where they are. I think the state of the art is [measuring] a couple thousand proteoforms, and going on will depend not just on the mass spectrometry but also on the software and also on the sample separation upfront.
You also mentioned the issue of protein complexes and basically how proteins are organized, and there is a fair amount of new work on this. [Utrecht University researcher] Albert Heck had a nice talk about native mass spectrometry, and then there is the use of crosslinking mass spectrometry to find which proteins interact with each other, and then there is the idea of inferring which proteins go with each other in terms of functional complexes from large datasets where proteomes are measured under different conditions. These are the kinds of things we are doing a lot of.
And this is largely supported by these very precise repeat measurements that can now be done very nicely. For instance, [Sciex researcher] Christie Hunter had a talk [at last week's meeting] where she, in a collaboration with us, ran [mass spec profiles of] about 250 perturbed cells as kind of a drug response study. And she did it very quickly, about one hour per run by microflow SWATH, and she had about 5,000 proteins very precisely quantified across 250 proteomes, and it took her like 10 days to generate this data. So this I think is also a noticeable trend, a lot of discussion about [data independent acquisition mass spec] and various forms of DIA. I think this is moving very quickly, and the advantage, of course, is that it is a very robust and fairly simple technique. It does not reach the number of proteins you get with fractionated samples, but it has reached 5,000 to 7,000 proteins now at a relatively fast pace.
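[As a rough check on the throughput figures quoted above, the arithmetic can be sketched as follows. This is a minimal sketch that assumes back-to-back runs of about one hour each with no overhead for sample loading, washes, or quality-control injections; the function name is hypothetical.]

```python
def acquisition_days(n_runs: int, hours_per_run: float) -> float:
    """Days of continuous instrument time for n_runs back-to-back runs.

    Assumption: no gaps between runs (no loading, wash, or QC time).
    """
    return n_runs * hours_per_run / 24.0


# Figures quoted in the interview: about 250 runs at about one hour per run.
days = acquisition_days(250, 1.0)
print(f"{days:.1f} days of instrument time")
```

[Under those assumptions, 250 one-hour runs amount to 250 hours of instrument time, which is in line with the "like 10 days" Aebersold mentions.]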
Do you see any particular trends with DIA methods development?
The basic fact is that the DIA data is noise-limited, because when you open up the [mass] window you create more noise. [In DIA experiments the mass spec simultaneously fragments all the precursor ions across large mass windows.] So I think there are various attempts to reduce the noise level. This could be done, for instance, by overlapping rolling windows, assuming that when you come across this peptide again and fragment it in a different context the noise will be different and you can then subtract it, or by going faster but through smaller windows. There are all kinds of things that are being explored. Which solution will have the biggest effect, we will see.
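[The overlapping-window idea described above can be illustrated with a small sketch of the window geometry only. The m/z range (400 to 1,200), the 25-unit window width, the half-width offset, and the function names are all illustrative assumptions, not settings from any method discussed in the interview, and the code does not perform any noise subtraction.]

```python
def make_windows(start, stop, width, offset=0.0):
    """Return contiguous (low, high) isolation windows covering start..stop.

    A non-zero offset shifts the whole scheme down by that amount, so the
    window edges fall in different places than in the unshifted scheme.
    """
    windows = []
    low = start - offset
    while low < stop:
        windows.append((low, low + width))
        low += width
    return windows


def window_for(mz, windows):
    """Return the window containing precursor m/z, or None if not covered."""
    for low, high in windows:
        if low <= mz < high:
            return (low, high)
    return None


# Two alternating acquisition cycles, the second shifted by half a window.
cycle_a = make_windows(400.0, 1200.0, 25.0)
cycle_b = make_windows(400.0, 1200.0, 25.0, offset=12.5)

mz = 510.0                    # hypothetical precursor of interest
a = window_for(mz, cycle_a)   # window isolating it in cycle A
b = window_for(mz, cycle_b)   # window isolating it in cycle B
overlap = (max(a[0], b[0]), min(a[1], b[1]))
print(a, b, overlap)
```

[In this toy scheme the same precursor is co-fragmented with a different set of neighboring ions in each cycle, and only ions in the narrower overlap region are present in both. That difference in "context" is what the noise-reduction approaches Aebersold mentions would try to exploit.]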
Do you see it converging on a single method, or many different solutions used in different contexts?
It also depends on the instrumentation. Initially SWATH was just run on the Sciex TripleTOF instruments, and now the same thing is run on [Thermo Fisher Scientific's] Q Exactive and also other instruments, and I think for different instruments there might be different optima. But I think what we clearly see is a fairly massive upward trend in the number of proteins that are credibly claimed to be identified. We started out with maybe 2,000 to 3,000 proteins on the TripleTOF 5600. Now on the [TripleTOF] 6600 with microflow [LC], which makes the whole thing very robust, Christie Hunter showed 5,000 proteins over 250 runs. The guys at Biognosys [a Swiss proteomics firm spun out of Aebersold's lab] use a Q Exactive and slightly longer gradients, and they claim they get beyond 6,000 proteins with high reproducibility. So it is definitely trending upwards. As soon as you are in the range of 6,000, 7,000, 8,000 proteins credibly and quickly measured, you have a terrific tool for biology and clinical research.