NEW YORK (GenomeWeb) – Though mass spec-based proteomics has grown steadily more accessible since its early days, the approach is still a technically demanding one with significant challenges in areas including sample prep, sample separation, and informatics.
Also, it is mostly used for research looking at the proteome of a single, known organism. Imagine, then, the complexity of metaproteomic analyses, in which scientists aim to characterize the proteomes of a large array of organisms, many of them unknown, present in a single sample.
In presentations at this week's Association of Biomolecular Resource Facilities annual meeting in Ft. Lauderdale, Florida, Dennis Wolan, an assistant professor at the Scripps Research Institute, detailed several approaches he and his colleagues have developed to enable their metaproteomic work, including activity-based protein probes and a metaproteomic database comprising all publicly available microbial protein and peptide sequences.
Wolan came to metaproteomics less from the proteomics side than as a researcher interested in protein probe design and the development of probes for proteases, in particular, he told GenomeWeb.
"I was just trying to use [proteomics] as an application to get from point A to B, to start characterizing the proteins in the microbiome," he said. The idea, Wolan noted, was that by using probes specific to various protein activity classes, he could reduce the complexity of the microbiome's metaproteome and so achieve deeper analyses.
The problem of high-abundance proteins limiting mass spec's reach into lower-abundance regions of the proteome is well documented. It is a particular challenge for metaproteomics where, Wolan said, researchers are dealing with proteins from both the host and the microorganisms they are analyzing.
"You have such a high level of background from your host proteins, as well as housekeeping proteins from the bacteria," he said. "So what we do is, we apply these enrichment technologies with specific molecules that will covalently label proteins of like function — for instance, proteases. And that way, we can dig deeper in the proteome, using these isolation methods to specifically target [different functional classes], and then continue along the path towards identification with mass spectrometry."
Wolan and his colleagues use a custom database they have developed for matching their mass spectra to peptides. Called ComPIL, the database contains all publicly available microbial protein and peptide sequences, roughly 82 million protein records.
The enormity of this search space, however, brings its own issues. Wolan said that with existing algorithms, he and his team were unable to search the database in a reasonable amount of time.
"We have let searches go on for over a month with some of the search engines that people typically use," he said, adding that to effectively search the ComPIL database for matches from a typical mass spec metaproteomics experiment would take around four months.
Happily, Sung Kyu Park, a researcher in the lab of Wolan's Scripps colleague John Yates, had recently developed a new search algorithm, Blazmass, that enables much faster searches than existing tools.
"It was just serendipitous," Wolan said. "[Park] was making the search engine at the same time we were making the database, and he said 'This is probably one of the best case scenarios for me to test my search engine, so here you go.' And it worked beautifully."
"We are now able to take an entire shotgun mass spec dataset that has on the order of 105 scans and search it against an entire database of 8 billion peptides in less than 12 hours," he said. "We have tried almost every search engine that is publicly available and nothing has been able to output anything remotely within the amount of time that Blazmass will."
Wolan and his team are currently applying these tools to the study of ulcerative colitis in mouse models and human patient samples, with the ultimate goal of identifying potential drug targets for the condition.
But while many metagenomic and metaproteomic studies have focused on identifying the composition of different organisms present in a sample, this, Wolan said, is of less interest to him than identifying the presence and levels of different protein functional classes.
"We are not very much interested in the phylogenetic makeup of what the bacteria are," he said. "We are more interested in the amount of functionality that we identify by our probes."
This, Wolan noted, is in part because the metaproteomes of healthy individuals can vary widely. More important than what specific organisms are present, he said, is the balance of different functions taking place.
"Every individual is quite unique in their composition of the different kinds of bacteria that are there. What I have and what you have may be completely different," he said. "But regardless of the different composition of species, the host requires some types of functional niches that need to be fulfilled with some types of protein functionalities. [And these] functionalities can be supplied by a wide variety of different types of bacterial species."
And even in cases where, for instance, a metagenomic analysis identified the same bacteria as present in two different subjects, in reality, these bacteria might behave very differently.
"There are so many horizontal gene transfers going on, phage infections going on, for instance, that … they might not be even close to the same species anymore," he said. "Because they are influenced by their environment."
With ulcerative colitis, the researchers are "really interested in seeing the [microorganism] protein functionalities that would be responsible for the degradation of our mucus layer that protects our epithelial layers in our large and small intestine," Wolan said, noting that this means a focus on proteases and glycosidases.
"You can imagine if once that mucus layer gets disintegrated somehow, then there is room for infiltration by the bacteria into the host, resulting in systemic inflammation and further infiltration, and a variety of diseases," he said.
In preliminary experiments in a small set of mouse models of ulcerative colitis, the researchers have seen increases in certain protein functions in mice with the condition, he said. In humans, they are currently looking at healthy patients to better understand what levels of what functionalities are conserved across these subjects before they move to investigating colitic individuals.
"It is a very complicated thing, an iterative process with the host and microbiome feeding off each other," Wolan said. "So almost like a precipitous spiral, and it could be host genetics or it could be the microbial components, and we don't understand that entirely."