NEW YORK (GenomeWeb) – The most recent update from the Human Proteome Organization's Chromosome-Centric Human Proteome Project (C-HPP) puts the group's overall coverage at around 85 percent of the human proteome.
This represents steady progress for the initiative, which aims to characterize one representative protein for each human protein-coding gene. The number of unidentified proteins has fallen from around 6,000 in 2012 to between 3,500 and 4,500 in 2013, and to around 3,000 today.
As the researchers cross proteins off the list, however, the search becomes more challenging, as the remaining proteins likely possess characteristics that make them particularly difficult to detect. Some, for instance, may be present at very low abundance. Others might be expressed only in very specific, rare tissues or might be incompatible with conventional sample prep and mass spec workflows.
To help tackle these challenges, C-HPP researchers have devised a system, detailed in a paper published this month in the Journal of Proteome Research, that produces and analyzes full-length target proteins to help refine and optimize multiple-reaction monitoring mass spec (MRM-MS) assays for detecting them.
The approach uses an in vitro transcription/translation (IVTT) human cell-free protein expression system to express full-length proteins of interest that are as yet unidentified in actual biological samples, allowing the researchers to refine MRM-MS assays that they will then apply within the C-HPP.
While, in theory, this sort of assay refinement can be done in silico and using synthetic peptides, the hope is that working with full-length proteins will let the researchers better understand exactly how their targets behave during sample prep and mass spec analysis, Péter Horvatovich, a researcher at the University of Groningen and first author on the study, told GenomeWeb.
"In current workflows, first you check [computationally] if the peptide is proteotypic, and then you try to predict if it will fragment well and fly well in the mass spec," he said. "Then maybe you synthesize a synthetic peptide to see if it is flying well."
But while this approach lets researchers evaluate the potential of peptides as MRM targets at the peptide level, it doesn't provide information on how the target behaves at the protein level.
For instance, Horvatovich said, "It could be that when you predict the peptide and you check it with synthetic peptides they fly nicely, but when you actually digest the [intact] protein, for some reason the trypsin is not cutting where it should cut."
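The computational first step Horvatovich describes can be illustrated with a minimal Python sketch. It is not the C-HPP pipeline: the function names, the two toy sequences, and the six-residue length cutoff are hypothetical and chosen only for illustration. The sketch assumes the commonly used trypsin rule (cleave after K or R unless the next residue is P) and treats a peptide as proteotypic if it appears in no other protein sequence in the set.

```python
def tryptic_digest(sequence, min_length=6):
    """In silico trypsin digest: cut after K or R unless the next residue is P.

    Assumes complete digestion (no missed cleavages), which is exactly
    what may not hold for a real intact protein.
    """
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal remainder
    return [p for p in peptides if len(p) >= min_length]


def unique_peptides(target_id, proteome, min_length=6):
    """Return the target's tryptic peptides that occur in no other sequence."""
    candidates = tryptic_digest(proteome[target_id], min_length)
    others = [seq for pid, seq in proteome.items() if pid != target_id]
    return [p for p in candidates if not any(p in seq for seq in others)]


# Hypothetical toy sequences, not real human proteins.
TOY_PROTEOME = {
    "protA": "MKTAYIAKQRLVSEGHWPKAAGTLLDER",
    "protB": "MSSLVSEGHWPKCCDEFGHIK",
}

print(unique_peptides("protA", TOY_PROTEOME))
```

A prediction like this says only where trypsin should cut. It cannot show whether the enzyme actually cuts there in the folded or modified protein, which is the gap the full-length protein experiments are meant to close.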
Using the IVTT system, the researchers start with the intact protein, allowing them to take it through the full sample prep and mass spec process to better pinpoint where problems with their assays might arise.
As the authors wrote, it allows for "obtaining experimental data from unique peptides useful in sensitive detection of the proteins by mass spectrometry."
The IVTT approach used in the JPR paper also has advantages over, for instance, bacterial expression systems in that it contains only the cellular machinery necessary for protein production, which makes for a much simpler matrix, Horvatovich said. Another advantage, he added, is that it is based on components of mammalian cells and so is optimized for production of human proteins.
In the study, the researchers used the system to first produce and analyze 11 previously identified proteins from chromosome 10 and then 18 still unidentified proteins from chromosome 16.
However, in a development that hints at the challenges faced by the C-HPP groups in tracking down the remaining proteins, the researchers had, as of the beginning of this month, yet to detect these 18 unidentified proteins, even with optimized assays for them in hand, Horvatovich said.
The next step, he said, will be to perform more extensive sample fractionation ahead of MRM-MS analysis, which should allow the researchers to detect proteins at lower abundance.
They will also "look at the protein extraction methods," he said. "Because what you get in the sample also depends very much on what protein extraction methods you use."