NEW YORK – Researchers at the University of Wisconsin-Madison, the Max Planck Institute of Biochemistry, and elsewhere have devised a new approach to detect protein isoforms in the human proteome using shotgun proteomics, generating a comprehensive catalog of human protein diversity.
"This resource now provides a framework to directly study allele-specific expression and address fundamental questions of how mutations impact protein expression and stability," they wrote in a paper published in Nature Biotechnology on Thursday.
One of the limitations of conventional shotgun proteomics is that it fails to distinguish different protein variants and isoforms. One reason for this is that the method relies on the protease trypsin to digest the proteins, which doesn't provide an adequate picture of the isoforms present. "Many of the peptides produced are either too short or too long to be detected using current liquid chromatography–mass spectrometry technology," the authors wrote.
Instead of using trypsin alone, they digested proteins present in six human cell lines using six different proteases. This digestion was followed by deep fractionation with liquid chromatography and tandem mass spectrometry using an Orbitrap Tribrid instrument.
With this method, the researchers identified more than 1 million peptides from 17,717 different protein-coding genes, or protein groups. The data facilitated the detection of approximately 80 percent of the sequences of all proteins within the samples — a vast increase over standard approaches.
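The reasoning behind combining proteases can be illustrated with a toy in silico digestion. The sketch below is not the authors' pipeline: the cleavage rules are simplified textbook approximations, the choice of proteases is an assumption (the article does not name the six used), and the 7-to-35-residue "detectable" window is a hypothetical stand-in for the length limits of liquid chromatography-mass spectrometry. It only shows that peptides from different enzymes can cover stretches of a protein that trypsin alone leaves too short or too long to detect.

```python
import re

# Simplified cleavage rules (assumptions for illustration only).
# Each regex matches the zero-width position where the enzyme cuts.
RULES = {
    "trypsin": r"(?<=[KR])(?!P)",        # after K or R, not before P
    "lysc": r"(?<=K)",                   # after K
    "gluc": r"(?<=E)",                   # after E
    "aspn": r"(?=D)",                    # before D
    "chymotrypsin": r"(?<=[FWY])(?!P)",  # after F, W or Y, not before P
}


def digest(seq, rule):
    """Return (start, end) spans of peptides from a complete digestion."""
    cuts = [m.start() for m in re.finditer(rule, seq)]
    bounds = sorted(set([0, len(seq)] + cuts))
    return list(zip(bounds, bounds[1:]))


def coverage(seq, proteases, min_len=7, max_len=35):
    """Fraction of residues in at least one peptide of detectable length.

    min_len and max_len are hypothetical detection limits.
    """
    covered = [False] * len(seq)
    for name in proteases:
        for start, end in digest(seq, RULES[name]):
            if min_len <= end - start <= max_len:
                for i in range(start, end):
                    covered[i] = True
    return sum(covered) / len(seq)
```

Calling `coverage(sequence, ["trypsin"])` and then `coverage(sequence, list(RULES))` on the same protein sequence gives a rough sense of the effect: because the second call takes the union of peptides from every enzyme, its result can never be lower than the trypsin-only figure.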
"The whole idea was to explore this universe of proteomics. To detect as much as we can in terms of proteins and amino acids," said co-lead author Pavel Sinitcyn, a postdoctoral fellow in the Coon laboratory at the University of Wisconsin-Madison.
The research team created an online, publicly available resource called deep-sequencing.app where scientists can query any gene and examine the corresponding peptides and protein modifications associated with that gene. "Owing to its scope, depth, and coverage, the dataset reported in this study represents a resource to drive future work on the human proteome," they wrote.
One question the study tried to answer was how many alternatively spliced mRNA transcripts are translated into proteins. Previous RNA sequencing studies of human organs had estimated that transcripts from more than 95 percent of multi-exon genes undergo alternative splicing.
"The extent to which alternative transcripts with the potential to encode different proteins are translated has been the subject of considerable debate," the authors noted.
They found that about 64 percent of frame-preserving splicing events of relatively highly expressed genes that can be detected by transcriptomics are indeed translated and present at the protein level. This is likely an underestimate, they added, since even at the highest levels of peptide coverage, probably not all spliced exon-exon junctions were covered.
"I think this knowledge tells us that, yes, these ideas about splicing — allowing the cell to have this repertoire of proteins for distinct purposes — are now validated. This is the first time we've been able to measure it and prove it," said co-senior author Joshua Coon, a professor of biomolecular chemistry at the University of Wisconsin-Madison.
Coon said these findings are the initial step towards understanding all the proteins and developing better drug targets for many diseases.