HUPO Researchers Draw Closer to Goal of Characterizing Human Proteome


NEW YORK (GenomeWeb) – The Human Proteome Organization's Chromosome-Centric Human Proteome Project (C-HPP) is nearly 90 percent of the way to its goal of identifying a protein for each of the body's protein-coding genes.

As of the project's most recent count, participating researchers have identified proteins from 17,008 of the 19,467 predicted protein-coding genes, said Gil Omenn, a professor at the University of Michigan and chair of the HPP.

That is up from 16,518 confidently identified proteins in 2016, but, Omenn noted, as the group closes in on its goal, the task has become more difficult, with the remaining proteins proving particularly elusive.
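As a quick check on these figures, the short Python sketch below recomputes the coverage, the number of genes still lacking a confidently identified protein, and the gain since 2016. It uses only the three counts quoted above; the variable names are illustrative.

```python
# Counts quoted in the article.
predicted_genes = 19_467   # predicted protein-coding genes
identified_now = 17_008    # genes with a confidently identified protein
identified_2016 = 16_518   # the corresponding count in 2016

coverage_pct = round(100 * identified_now / predicted_genes, 1)
still_missing = predicted_genes - identified_now
gained_since_2016 = identified_now - identified_2016

print(f"Coverage: {coverage_pct}%")
print(f"Still missing: {still_missing}")
print(f"Gained since 2016: {gained_since_2016}")
```

The result, roughly 87 percent coverage with about 2,500 genes outstanding, matches the "nearly 90 percent" and "2,500 missing proteins" figures cited in the article.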

The C-HPP has spearheaded HUPO's effort to characterize the full human proteome, and coverage has steadily increased since the initiative was launched in 2012. Identifying proteins for all protein-coding genes will likely require more sensitive measurements or new approaches designed to detect molecules not amenable to conventional proteomic workflows, Omenn suggested. Perhaps even more important, it will require determining in which tissues and under what conditions these proteins are expressed.

He raised as an example of this latter challenge the beta defensin gene family, which produces antimicrobial peptides. The Chinese C-HPP team has been working to identify the conditions under which these peptides are expressed, trying approaches including the use of HDAC inhibitors to stimulate expression of these genes through epigenetic mechanisms, he said, "but they haven't quite been successful yet."

Olfactory receptors are perhaps the most famous example of this issue, and constitute the largest class of proteins yet to be confidently detected.

"We know from genomic work that there are about 940 olfactory receptor coding genes," Omenn said. Around half of these are pseudogenes, which aren't expected to make proteins.

Even tossing out these pseudogenes, though, leaves nearly 500 such receptors, which, he noted, "would be a big chunk of our 2,500 ['missing' proteins]." Yet, despite considerable work put into the effort, HPP researchers, as well as outside proteomic groups, have struggled to make much headway with these proteins.

Omenn cited work by William Hancock, one of the founders of the C-HPP, in which he collaborated with surgeons to obtain olfactory epithelium from the brain and upper nose with the goal of detecting olfactory receptors in this tissue — though with little success.

He noted as well that Johns Hopkins researcher Akhilesh Pandey has explored these receptors with his group but had little success detecting them.

"It is a very important subject, but it has just been totally intractable [from a proteomics perspective]," Omenn said. "So, what do we do about those? Well, we keep looking. But until we can get samples or techniques of sufficient sensitivity or enrichment, we are sort of stuck on that class."

More successful, he noted, have been efforts in tissues that proteomics researchers have studied relatively little, such as sperm and testis. Over the last year, researchers participating in the C-HPP identified around 260 previously missing proteins in these tissues.

That work followed on transcriptomic research from Mathias Uhlén, professor of microbiology at Sweden's Royal Institute of Technology, that indicated enriched transcript expression in testis, Omenn said. "We have gotten a lot of the low-hanging fruit [from this tissue], and I think there is more to be had."

The brain is another potentially fruitful source of missing proteins, he added, noting that it contains "a lot of specialized regions that haven't been analyzed yet."

He said the kidneys and bladder had likewise received relatively little study from proteomics groups and could also be a significant source of missing proteins.

In addition to identifying and obtaining sample material likely to harbor proteins of interest, different experimental approaches will also be necessary to chase down the outstanding molecules, Omenn said. Recent papers from C-HPP participants in the Journal of Proteome Research provide examples of some such approaches.

For instance, in a JPR paper published last month, a team led by researchers at the Beijing Proteome Research Center used a multi-protease approach to identify testis proteins not detectable using trypsin digestion alone. The 7,838 proteins identified in their analysis included three missing proteins, as well as several potential candidates for which further evidence is still needed.

Such an approach might be particularly useful for membrane proteins, whose structures present challenges to conventional trypsin-based workflows, Omenn said. The portions of these proteins contained within the membrane typically lack the lysines and arginines that trypsin requires as cleavage sites, he noted, while in the portions outside the membrane these amino acids are too close together to generate proteotypic peptides long enough to enable confident identification.
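The trypsin problem Omenn describes can be illustrated with a minimal in silico digest in Python. This is a sketch under stated assumptions, not the method used by any C-HPP group: it applies the commonly used rule that trypsin cleaves after lysine (K) or arginine (R) unless the next residue is proline (P), and it treats peptides of 7 to 30 residues as usable for identification, a window chosen here purely for illustration. The sequences are invented.

```python
def tryptic_digest(sequence):
    """Split a protein sequence after K or R, except when followed by P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal remainder
    return peptides


def usable_peptides(sequence, min_len=7, max_len=30):
    """Keep peptides inside an assumed, illustrative length window."""
    return [p for p in tryptic_digest(sequence) if min_len <= len(p) <= max_len]


# Hypothetical sequences, invented for illustration only.
membrane_span = "LIVAF" * 7            # 35 residues, no K or R
crowded_loop = "KRSKKTRKGRKSR"         # K and R packed close together
well_spaced = "AAAAAAAKGGGGGGGGR"      # cleavage sites spaced apart

for name, seq in [("membrane span", membrane_span),
                  ("crowded loop", crowded_loop),
                  ("well spaced", well_spaced)]:
    print(name, tryptic_digest(seq), usable_peptides(seq))
```

In this toy case the membrane-like span stays as one 35-residue peptide and the crowded loop breaks into fragments of one or two residues, so neither yields a peptide in the assumed window. A protease with different cleavage specificity would cut the same sequences at other positions, which is the rationale behind a multi-protease approach.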

Improved enrichment methods could also help with the identification of low-abundance proteins, Omenn said. He cited the example of another recent JPR study, this one led by researchers at China's National Institute of Biological Sciences, that used Bio-Rad's ProteoMiner protein enrichment reagents to identify 20 missing proteins.

The ProteoMiner kits use bead-based reagents consisting of hexapeptide ligands generated via combinatorial chemistry. When a sample is run through these beads, these ligands capture a large proportion of the low-abundance proteins present in that sample. High-abundance proteins, on the other hand, saturate their hexapeptide baits, meaning that a large proportion of these proteins remain unbound and are subsequently washed away. This results in the depletion of high-abundance analytes and the enrichment of low-abundance ones, potentially enabling the detection of proteins that have evaded identification due to their low concentrations.
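The saturation effect described above can be sketched with a toy model in Python. It is a deliberate simplification and not Bio-Rad's specification: it assumes every protein species has the same fixed binding capacity on the beads, so the amount retained is the smaller of the amount loaded and that capacity. The protein names, abundances, and capacity are hypothetical values in arbitrary units.

```python
def equalize(abundances, capacity):
    """Toy model: each species binds up to `capacity`; the excess is washed away."""
    return {name: min(amount, capacity) for name, amount in abundances.items()}


def dynamic_range(abundances):
    """Ratio of the most abundant to the least abundant species."""
    values = list(abundances.values())
    return max(values) / min(values)


# Hypothetical sample, arbitrary units.
sample = {"abundant_protein": 1_000_000, "moderate_protein": 1_000, "rare_protein": 10}
bound = equalize(sample, capacity=100)

print("Retained on beads:", bound)
print("Dynamic range before:", dynamic_range(sample))
print("Dynamic range after:", dynamic_range(bound))
```

In this example the abundant species is cut from 1,000,000 units to 100 while the rare species is retained in full, so the spread between them shrinks from 100,000-fold to 10-fold. That compression is what can bring low-concentration proteins within reach of a mass spectrometer.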

Ultimately, Omenn said, perhaps 5 to 10 percent of proteins will prove undetectable by mass spectrometry and will require other methods of detection. However, he added, ongoing improvements in mass spec technology should continue to expand the range of molecules the approach can cover.

"This has been a moving target as we have increased the sensitivity and resolution of mass spectrometers in the last decade," he said. "And we expect further progress in that regard."