NEW YORK (GenomeWeb) – A team led by researchers at the Swiss Federal Institute of Technology (ETH) Zurich has developed a mass spec workflow for profiling proteins at the complex level.
Described in a paper published this week in Molecular Systems Biology, the method uses a database searching approach analogous to those currently used for peptide-level proteomic analyses, allowing researchers to profile the protein complexes present in a sample in much the same way they profile individual peptides and proteins, said Ruedi Aebersold, a professor at ETH Zurich and senior author on the study.
Traditionally, proteomics has focused on identifying and quantifying proteins in isolation. In real biological systems, however, proteins frequently work as parts of larger complexes, and so to more thoroughly understand their function or dysfunction, scientists have increasingly turned their attention to the study of proteins as they interact with other proteins or molecules.
Researchers have used a variety of approaches to investigate protein-protein interactions, ranging from yeast two-hybrid experiments to affinity purification-mass spectrometry to proximity labeling. Such work has typically had a discovery bent to it, with the aim being to catalogue different molecules that interact with a given protein or set of proteins.
In their MSB study, Aebersold and his colleagues framed their work somewhat differently, aiming to develop an approach to examine known protein complexes in samples of interest and measure how they change in different kinds of samples and under different conditions.
The method combined size-exclusion chromatography with SWATH mass spec and a software package called CCprofiler that allows researchers to take their mass spec data and search it against existing databases of protein complexes to identify complexes in a given sample. In an analysis of HEK293 cells, the ETH team identified 462 protein complexes composed of 2,127 proteins.
In size-exclusion chromatography, the elution profile of an analyte is determined by its size, with larger particles eluting first and the smallest ones eluting last. This allows researchers to separate proteins and protein complexes by size, with proteins in large complexes coming off the column first, smaller complexes coming off after that, and unbound proteins coming off last. They can then take these fractions and analyze them by mass spec to identify the proteins present in each fraction, the idea being that proteins present in the same fraction are potential interactors.
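The size-to-fraction relationship can be illustrated with a minimal Python sketch. It assumes an idealized column in which elution position depends only on the logarithm of apparent mass; the function name, the mass range, and the number of fractions are hypothetical values chosen for illustration and are not taken from the study.

```python
import math

def assign_fraction(mass_kda, n_fractions=10, min_kda=10.0, max_kda=10000.0):
    """Map an apparent mass (kDa) to a fraction index.

    Idealized assumption: position on the column is linear in log10(mass),
    so the largest particles land in fraction 0 and the smallest in the
    last fraction. Real columns need calibration with known standards.
    """
    clamped = min(max(mass_kda, min_kda), max_kda)
    span = math.log10(max_kda) - math.log10(min_kda)
    # 0.0 for the largest mass in range, 1.0 for the smallest.
    position = (math.log10(max_kda) - math.log10(clamped)) / span
    return min(int(position * n_fractions), n_fractions - 1)

# Hypothetical analytes: a large assembly, a small complex, a free protein.
large_complex = assign_fraction(2000.0)
small_complex = assign_fraction(300.0)
free_protein = assign_fraction(50.0)
```

Under these assumptions the large assembly receives the lowest fraction index and the unbound protein the highest, mirroring the elution order described above.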
Aebersold noted that researchers have for years used size-exclusion chromatography to study protein complexes but that the technique's usefulness has been limited by its relatively low resolution.
Each size-exclusion fraction can contain dozens to hundreds of proteins, he said, which means the chance is quite high that a given fraction will contain many proteins that eluted together despite not being in the same complex.
To address this challenge, the researchers developed the CCprofiler software, which considers co-eluting proteins in the context of existing protein-complex databases and scores how likely such proteins are to exist in a complex.
"We basically do a targeted search and say that if [for example] three proteins are known [based on previous studies] to interact, then if they co-elute precisely with certain properties, then we have evidence for this complex," Aebersold said. "So rather than trying to discover all the complexes by co-elution, which doesn't really work, we test the hypothesis that a complex that we know to exist is present based on signals in a sample."
The scoring algorithm is based on the same methods used to score targeted proteomic data, he said.
"We can use the same algorithms we use for quantitative DIA [mass spec], except we don't score transitions and ion fragments, we score peptides and proteins," he said. "So we are changing [protein complex work] from the discovery mode to a hypothesis testing mode."
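The hypothesis-testing idea can be sketched in a few lines of Python. This is not CCprofiler's scoring algorithm, which the article does not detail; it substitutes a simple stand-in, the mean pairwise Pearson correlation of protein elution profiles, and all protein names, intensities, and function names are hypothetical.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length intensity profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no co-elution information
    return cov / sqrt(var_x * var_y)

def coelution_score(profiles, subunits):
    """Mean pairwise correlation among the detected subunits of one
    database complex; None if fewer than two subunits were detected."""
    detected = [p for p in subunits if p in profiles]
    if len(detected) < 2:
        return None
    pairs = list(combinations(detected, 2))
    return sum(pearson(profiles[a], profiles[b]) for a, b in pairs) / len(pairs)

# Hypothetical protein intensities across eight size-exclusion fractions.
profiles = {
    "A": [0, 1, 8, 20, 9, 1, 0, 0],
    "B": [0, 2, 15, 41, 17, 2, 0, 0],
    "C": [0, 1, 5, 11, 5, 1, 0, 0],
    "D": [0, 0, 0, 1, 3, 12, 30, 10],
}

known_complex = ["A", "B", "C"]   # subunits listed together in a database
shuffled_decoy = ["A", "B", "D"]  # D peaks in different fractions

known_score = coelution_score(profiles, known_complex)
decoy_score = coelution_score(profiles, shuffled_decoy)
```

Here the database complex whose subunits peak in the same fractions scores close to 1, while the decoy containing a protein that elutes elsewhere scores far lower. A real workflow would also need to estimate an error rate, for example by comparing against many decoy complexes.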
In part because of the high reproducibility of DIA mass spec data, the approach is "nicely quantitative," Aebersold said, which he noted will allow researchers to do differential analyses looking at how the presence or abundance of protein complexes changes across sample types or in response to different perturbations.
He and his colleagues did not present this sort of work in the MSB paper, but he said that they have since used the approach to look at changes in protein complexes under various conditions including disease states and in response to different genetic splice forms.
"If you had, say, [two different] splice forms of the same gene, they could potentially participate in the same complex and therefore carry out basically the same function, or they could become part of [an] entirely different complex where one would therefore conclude they carried out a different biochemical function," Aebersold said. "So this is one of the things we are looking at right now."
In the study, the researchers used the CORUM, BioPlex, and STRING databases, which contain data on 1,753 protein complexes (CORUM) and on 23,744 and 383,626 protein interactions (BioPlex and STRING, respectively). Aebersold said he has no plans to develop a separate database specifically for CCprofiler analyses, noting that the existing resources "are already quite rich and keep becoming richer and richer."
Currently, he said, his team's data allow them to assess changes in a sample at the protein-complex level, both qualitatively and quantitatively, for about 500 to 600 complexes at a time.
"Of course that doesn't account for all [of a cell's protein complexes], but it allows [researchers] to get a nice overview at the level of several hundred complexes as to how the cellular proteome organizes and reorganizes."