Researchers with the Chinese Human Chromosome Proteome Consortium have published a combined analysis of the transcriptome, translatome, and proteome of hepatoma cells.
Presented in a paper published last month in the Journal of Proteome Research, the study provides a road map for the CHCPC's efforts within the Human Proteome Organization's Chromosome-Centric Human Proteome Project (C-HPP) to characterize the full proteome of chromosomes 1, 8, and 20. With the additional data from this study, the group has now pushed its proteomic coverage of these three chromosomes over the 60 percent mark.
In the first phase of the CHCPC's work on the C-HPP, the group's researchers identified 12,101 proteins via mass spec analysis of liver, colon, and stomach tissue as well as several related cancer cell lines.
Moving into the second stage of the work, the researchers identified the translatome – which they defined as the collection of mRNAs bound to the ribosome nascent-chain complex – as a key area of focus for their efforts to further improve proteomic coverage of the chromosomes being studied.
While the transcriptome has been widely explored as a proxy for the proteome, the CHCPC researchers believe that the translatome could perhaps provide a more accurate picture of protein expression, said Ping Xu, director of the department of genomics and proteomics at the Beijing Proteome Research Center and senior author on the JPR paper.
In a separate JPR paper also published last month, the CHCPC team put forth a demonstration of this strategy, integrating RNC-mRNA analysis of human normal bronchial epithelial (HBE) cells and human colorectal adenocarcinoma Caco-2 cells with mass spec-based proteomics data.
This analysis, Xu said, found that RNC-mRNA sequencing could quantitatively detect 12,000 to 15,000 translating genes in single cell types and provided "accurate translating evidence for most protein products with high sequence coverage." This, he noted, suggests such measurements could offer "useful information for directing subsequent verifications of 'missing' proteins."
The translatome could also prove a good place to look for single nucleotide variants and alternatively spliced transcripts, Xu said. Investigations of non-coding RNC-mRNA might also aid the discovery of new proteins.
In the JPR paper, the researchers profiled the transcriptome, translatome, and proteome of three hepatoma cell lines – Hep3B, MHCC97H, and HCCLM3 – identifying 9,918 genes common to all three levels of analysis. In total, they identified 18,246 genes at the translatomic level, 9,922, or 54.4 percent, of which they were also able to identify in their proteomic data.
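For readers who want to check the arithmetic, the short Python sketch below reproduces the 54.4 percent figure from the two gene counts reported above. It is only an illustration: the function and variable names are hypothetical, and the counts are taken from the article rather than recomputed from the study's data.

```python
# Counts as reported in the article; not recomputed from the underlying data.
TRANSLATOME_GENES = 18_246  # genes identified at the translatomic level
ALSO_IN_PROTEOME = 9_922    # of those, genes also identified in the proteomic data


def percent_covered(identified: int, total: int) -> float:
    """Return identified/total as a percentage, rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * identified / total, 1)


coverage = percent_covered(ALSO_IN_PROTEOME, TRANSLATOME_GENES)
# Genes with translating evidence but no proteomic identification.
undetected = TRANSLATOME_GENES - ALSO_IN_PROTEOME

print(f"{coverage} percent of translatome genes were also seen in the proteome")
print(f"{undetected} translatome genes lacked proteomic identification")
```

The second number, the difference between the two counts, is the set of translating genes without a corresponding protein identification in this data set.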
Upon combining this new data set with data previously generated under the project, the researchers upped their current proteomic coverage of chromosomes 1, 8, and 20 to 63.2 percent, 62.1 percent, and 60.5 percent, respectively – substantial numbers, but, the researchers noted, still well short of the 100 percent coverage the C-HPP has set as its goal.
Based on the study's findings, "protein abundance is the decisive factor for protein identification," Xu said, noting that, given this, better methods of enriching for low-abundance proteins are needed.
The researchers used one such method in the JPR paper, applying a transcription factor enrichment approach based on concatenated arrays of transcription factor response elements. Using this technique, which was developed in the lab of Jun Qin, a Baylor College of Medicine researcher and leader of the Chinese Human Proteome Project's mass spectrometry efforts, they were able to add 31 proteins, including 14 transcription factors, to their dataset.
In addition to transcription factors, similar enrichment approaches could prove helpful in increasing coverage of molecules such as membrane proteins as well as DNA- and RNA-binding proteins, Xu said.
He also noted that better workflows for analysis of hydrophobic proteins are needed. In the JPR work, the CHCPC analyzed which physicochemical properties of proteins most affected mass spec detection, finding that hydrophobicity played the biggest role in whether or not a protein was picked up by mass spec. Of hydrophobic proteins exhibiting high abundance at the mRNA level, the researchers were able to detect roughly 50 percent, Xu noted.
"The [difficulty] in identifying hydrophobic proteins in MS-based proteomics indicates that we need more technologies for their identification," he said, noting that prefractionation or enrichment could help improve detection of such proteins.
Use of multiple mass spec platforms might also help increase proteome coverage, Xu said. In the JPR paper, the researchers used both a Thermo Fisher Scientific Q Exactive and an AB Sciex TripleTOF 5600 for their analyses in order "to learn more about the MS preference for protein identification" and avoid "the premature saturation of distinct protein identifications," they wrote.
Both platforms "showed excellent performance," Xu said, but returned somewhat different IDs, suggesting that use of "complementary, different MS platforms may increase the number of identified proteins and [improve] the sequence coverage."
Antibody-based efforts like Sweden's Human Protein Atlas project will also likely play a role in complementing the C-HPP's mass spec efforts, Xu and his co-authors noted. A new release of the HPA issued this month contains antibody-based protein data for more than 80 percent of the human protein-coding genes as well as RNA expression data for more than 90 percent of these genes.