NEW YORK (GenomeWeb) – Two years after scientists first identified a large population of spliced peptides presented on human leukocyte antigen (HLA) class I molecules, researchers continue to debate how common these spliced peptides are.
The original work was published in Science by a team led by researchers then at Imperial College London, Utrecht University, and the Berlin Institute of Health.
Several recent studies from independent research groups have since followed up on their results, with two raising questions about the original findings and a third paper appearing to bolster the original results by identifying a significant population of spliced HLA peptides.
HLA class I molecules play a key role in immunity, displaying the peptide antigens that generate cytotoxic (CD8+) T cell responses to various infections or diseases. Identification and manipulation of these antigens is key to research in areas like cancer immunotherapy, where scientists are working to trigger patients' immune systems to fight their cancers by presenting cancer-specific HLA antigens.
HLA-I molecules present these antigens after they have been processed by proteasomes, which cleave proteins into smaller peptides that stimulate the CD8+ T cell response. In addition, these proteasomes can also cut and splice peptides, creating new molecules that don't match the original protein sequence.
Such peptide splicing was thought to be extremely rare. However, the 2016 Science study identified a large number of spliced peptides, with the authors suggesting that they could make up around a third of the HLA-I immunopeptidome in terms of diversity and around a quarter in terms of abundance.
Discussing the findings shortly after the publication of the study, Albert Heck, chair of Biomolecular Mass Spectrometry and Proteomics at Utrecht University and an author on the paper, said that the discovery of such widespread peptide splicing "was very shocking," and added that "to be honest, when we found it, I didn't believe it."
Last month, researchers from the Swiss Institute of Bioinformatics and the University of Lausanne published a study in Molecular & Cellular Proteomics questioning whether the peptides identified in the 2016 Science study were actually spliced peptides.
The MCP authors analyzed the data from the Science study and found that the majority of proteasome-spliced peptides (PSPs) did not pass their bioinformatic quality filters.
They then reanalyzed the HLA-peptide spectra and compared these reanalyzed de novo sequences to the UniProt database. They discarded all sequences that matched a sequence in the UniProt database, then searched the remaining unmatched sequences using two different search tools at a false-discovery rate (FDR) of 1 percent. This analysis determined that between 2 percent and 6 percent of peptides did not have a match in the standard UniProt database that was a better fit than the originally proposed match to a spliced peptide, making it plausible that these peptides were, indeed, spliced.
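As a rough illustration of what searching "at an FDR of 1 percent" means in practice, the sketch below uses the common target-decoy approach. That approach is an assumption here, since the article does not say how the MCP authors estimated their FDR, and the function name and input format are hypothetical.

```python
def score_cutoff_at_fdr(psms, max_fdr=0.01):
    """Find the lowest score cutoff whose estimated FDR stays within max_fdr.

    psms: list of (score, is_decoy) pairs, one per peptide-spectrum match.
    The FDR at a cutoff is estimated as decoys / targets among all matches
    scoring at or above that cutoff (a simplified target-decoy estimate).
    Returns None if no cutoff satisfies the limit.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = decoys = 0
    cutoff = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Remember the deepest point in the ranked list that is still acceptable.
        if targets and decoys / targets <= max_fdr:
            cutoff = score
    return cutoff
```

Matches scoring below the returned cutoff would be discarded. Real search tools refine this estimate in various ways, so the sketch conveys only the basic idea.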
Markus Müller, a researcher at the SIB and an author on the MCP study, noted that this small percentage of peptides scored as good matches to spliced forms and appeared, based on an analysis of binding motifs, to be good binders for their respective HLA molecules. However, he added that while these factors made it possible that they were spliced peptides, it was not "proof."
While previous examples of spliced HLA peptides exist, they have "not been considered a major contribution to the peptidome," said Michal Bassani-Sternberg, a University of Lausanne researcher and senior author on the MCP study.
"These were considered very rare events," she said. "There were only a few examples in the literature."
Müller said that the MCP researchers considered the 2 to 6 percent figure an upper limit on how much peptide splicing might contribute to the HLA peptidome, but added that he does not believe there is solid proof that even this smaller percentage of peptides represents truly spliced molecules.
In the 2016 Science paper, the researchers used a mass spec fragmentation method developed by Heck and his team that combines electron transfer dissociation with higher energy collision dissociation to provide nearly complete fragmentation of peptides, allowing researchers to almost sequence these molecules de novo.
This allows researchers to look at peptides, like the spliced HLA-I peptides, that don't correspond to the genomic sequences in standard reference databases.
However, Müller noted, "in mass spectrometry, the matching is always a bit ambiguous. So you could match what you think is a spliced peptide, but it's actually a post-translationally modified peptide or something of that sort."
To truly prove splicing is occurring, "you need to go peptide by peptide and validate them experimentally," Bassani-Sternberg said. She noted that one purpose of the MCP paper was to give researchers who are interested in doing this sort of validation work a list of potential spliced peptides.
Juliane Liepe, the first author on the Science study and a research group leader at the Max Planck Institute for Biophysical Chemistry, and Michele Mishto, a senior lecturer and group leader at King's College London and the senior author on the study, declined to comment on the MCP paper.
Heck said he found the MCP criticisms plausible and noted that he believed it was possible that "quite a few of our reported spliced peptides may be incorrect." He said that among the likeliest to be incorrect were reported peptides that, based on binding motifs, were not good HLA binders, as well as those that had relatively small "delta scores," a measure of the difference between the best and second-best peptide-spectrum matches.
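The delta score Heck describes can be illustrated in a few lines. This is a minimal sketch that assumes each spectrum comes with a list of candidate match scores in which higher is better. The function name is hypothetical, and actual search engines may normalize the difference in other ways.

```python
def delta_score(match_scores):
    """Gap between the best and second-best match scores for one spectrum.

    A small gap means the runner-up explanation fits the spectrum nearly as
    well as the top one, so the top assignment is less trustworthy.
    Returns None when there is no second candidate to compare against.
    """
    ranked = sorted(match_scores, reverse=True)
    if len(ranked) < 2:
        return None
    return ranked[0] - ranked[1]
```

For a spectrum with candidate scores of 10.0, 7.5, and 3.0, the delta score is 2.5.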
This month, two new studies addressed this question. The first, published in Science Immunology, found, much like the 2016 Science paper, that 28 percent of the HLA peptides the researchers detected were best explained by splicing. Additionally, the study found that trans-splicing (combining peptide segments from different antigens) appears to be as abundant as the cis-splicing identified in the 2016 study.
For this work, the researchers performed de novo mass spec sequencing on HLA peptide repertoires using Sciex TripleTOF 5600 and Thermo Fisher Scientific Orbitrap Fusion instruments. They searched these spectra against the human reference proteome using the PEAKS Studio 8.5 software, which the authors noted is intended for de novo sequencing-based peptide searches.
They then discarded all spectra that matched the reference proteome, leaving them with peptides that they reasoned were either "true linear sequences that fell below the stringent 1 percent FDR cutoff applied in the above database search," "potential cis- or trans-spliced peptides," or "untemplated peptides with no biological explanation at this stage or whose de novo sequencing was not of high enough accuracy."
The researchers then applied what they called their "hybrid finder algorithm" to assess whether the remaining peptides were likely linear, cis-spliced, or trans-spliced, with the algorithm prioritizing the likelier linear explanation over spliced explanations. Spectra for which no explanation could be found were discarded and the others were assigned a single likeliest sequence, representing either a linear or a spliced peptide. These peptides were then merged with the human reference proteome to generate a combined database that the researchers searched again, identifying peptides at a 1 percent FDR cutoff.
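The distinction between linear, cis-spliced, and trans-spliced explanations can be sketched with a toy classifier. This is not the authors' hybrid finder algorithm. It is a simplified illustration that assumes exactly two fragments per spliced peptide, ignores any constraints on fragment order or distance, and prefers linear over cis over trans explanations. The function name, the `min_fragment` parameter, and the example sequences are all hypothetical.

```python
def classify_peptide(peptide, proteome, min_fragment=2):
    """Label a peptide 'linear', 'cis', 'trans', or None (no explanation).

    proteome: dict mapping protein name -> amino acid sequence.
    """
    # A contiguous match anywhere in the proteome is the simplest explanation.
    if any(peptide in seq for seq in proteome.values()):
        return "linear"

    found_trans = False
    # Try every split point that leaves both fragments long enough.
    for i in range(min_fragment, len(peptide) - min_fragment + 1):
        left, right = peptide[:i], peptide[i:]
        left_sources = {name for name, seq in proteome.items() if left in seq}
        right_sources = {name for name, seq in proteome.items() if right in seq}
        if left_sources & right_sources:
            return "cis"  # both fragments occur in the same protein
        if left_sources and right_sources:
            found_trans = True  # fragments occur only in different proteins
    return "trans" if found_trans else None


toy_proteome = {"A": "MKTAYIAKQRQISFVK", "B": "GSHMLEDPVDAW"}
labels = [classify_peptide(p, toy_proteome)
          for p in ("AYIAKQ", "KTAYQISF", "KTAYLEDP", "WWWWWWWW")]
```

With the toy proteome, the four example peptides are labeled linear, cis, trans, and unexplained, respectively. Because short fragments match almost anywhere in a real proteome, a scheme like this produces many candidate explanations, which is one reason the spliced assignments are contested.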
In an email, Bassani-Sternberg and Müller said that, as with the 2016 Science paper, they had questions regarding the Science Immunology work, and whether the spliced peptides it identified were in fact true splice forms.
They said that the use of the PEAKS search tool in both rounds of searching was "circular and biased," given that it had already demonstrated it could not assign a linear sequence to the putative splice spectra in the first round. "The only way to detect wrong peptide-spectra matches would have been by using a different search tool in the second round," they noted.
They also wrote that many of the trans-spliced peptides were identified using the TripleTOF 5600, which they said has low accuracy compared to newer instruments.
While the Science Immunology authors determined that their proposed spliced peptides matched the HLA binding motifs, Bassani-Sternberg and Müller said this was not unexpected because the study looked at "monoallelic cells that only yield peptide MS/MS spectra with this motif." They added that because the relevant motifs occur towards the ends of the peptides, where de novo sequencing is most accurate, "it is not astonishing that even wrong sequences would reproduce these motifs."
Anthony Purcell, head of immunoproteomics at Monash University and the senior author on the study, said that while the motif matching was not perhaps conclusive, he and his colleagues "at least believe it is reassuring."
With regard to the criticism that the use of PEAKS in both rounds of searching was circular, he noted that in the paper's supplemental material, the researchers reported using additional search tools beyond PEAKS "with similar results."
Purcell added that while the TripleTOF data was generated with settings of a parent mass error tolerance of 15 ppm and a fragment mass error tolerance of 0.1 Da, those settings for the study's Orbitrap data were 10 ppm and 0.02 Da, respectively.
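To put those tolerances in concrete terms, a parts-per-million tolerance is relative to the mass being measured, while a tolerance in daltons is absolute. The sketch below converts between the two. The function names are hypothetical, and the 1,000 Da example mass is an assumption chosen as a convenient round number.

```python
def ppm_to_da(mass_da, tol_ppm):
    """Convert a relative tolerance in ppm to an absolute window in daltons."""
    return mass_da * tol_ppm / 1e6


def within_ppm(observed_da, theoretical_da, tol_ppm):
    """True if the observed mass lies within tol_ppm of the theoretical mass."""
    return abs(observed_da - theoretical_da) <= ppm_to_da(theoretical_da, tol_ppm)
```

For a 1,000 Da parent ion, 15 ppm corresponds to a window of 0.015 Da and 10 ppm to 0.010 Da, so a measurement that is off by 0.012 Da would pass the looser setting but fail the tighter one.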
"We saw no change in the proportion of spliced peptides [identified using] the nominally higher mass accuracy Orbitrap instrument versus the TripleTOF," he said.
Finally, last week, a team of University of Wisconsin-Madison researchers led by Lloyd Smith, professor of chemistry, published a study in the Journal of Proteome Research in which they presented a new software tool, called Neo-Fusion, that they said could improve identification of PSPs.
In the paper, they noted several limitations of the 2016 Science paper, including the fact that its use of a database consisting of all the theoretical cis-PSPs required long search times and might have led to an underestimation of the true FDR of their search; the fact that the search did not consider trans-splicing; and the fact that the search was limited to peptides of nine to 12 amino acids in length, which excluded longer PSPs.
The Neo-Fusion tool addresses these limitations by using separate database searches for each half of a spliced peptide and then combining the two identified halves in silico.
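The general idea of identifying two halves separately and then joining them in silico can be sketched as follows. This is not Neo-Fusion's implementation. It assumes the candidate halves have already been found by separate searches and simply keeps the joined sequences whose calculated mass agrees with the measured parent mass. The function names, the default tolerance, and the example fragments are hypothetical. The residue masses are standard monoisotopic values.

```python
# Monoisotopic residue masses in daltons for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water is added for the free N- and C-termini


def peptide_mass(sequence):
    """Monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def combine_halves(n_halves, c_halves, parent_mass, tol_ppm=10.0):
    """Join every N-terminal candidate with every C-terminal candidate and
    keep the fusions whose mass matches the parent mass within tol_ppm."""
    hits = []
    for n_half in n_halves:
        for c_half in c_halves:
            fused = n_half + c_half
            error_ppm = abs(peptide_mass(fused) - parent_mass) / parent_mass * 1e6
            if error_ppm <= tol_ppm:
                hits.append(fused)
    return hits
```

Searching for halves and pairing them afterward avoids building a database of every theoretical spliced peptide, which is the search-time problem the JPR authors raised about the 2016 study.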
Using the tool to reanalyze the Science dataset, the UW-Madison researchers found that they were unable to identify most of the cis-spliced peptides identified in the original study. They found that by expanding the search space to allow for a wider distribution of peptide lengths and additional post-translational modifications, they were able to reassign around half of the peptides initially identified as spliced to linear sequences. Similar to Bassani-Sternberg and Müller, their analysis put the upper limit of cis-spliced peptides at around 2 percent to 6 percent.
Interestingly, in light of the recent Science Immunology study, the Neo-Fusion tool identified a larger number of trans-spliced peptides. The authors noted that "these were generally more ambiguous than the cis-spliced peptides," but they added that the retention times of these trans-spliced peptides "are remarkably close to those expected for their predicted [hydrophobicity indexes]," which bolsters the case that they were correctly identified.
The frequency and significance of spliced HLA peptides thus remain a matter of contention. Purcell said that his "interpretation of the landscape is that the existence of splice peptides has become reasonably well accepted, but the fight is really over how abundant they might be."
Heck likewise noted that even the small percentage of spliced peptides cited in the MCP and JPR studies would be "much more than anticipated."
For their part, Bassani-Sternberg and Müller maintain that while their study indicates spliced peptides could comprise a small fraction of the HLA peptidome, there is not yet proof that such peptides are in fact present at these levels.
Purcell said that he and his colleagues hope to soon publish biological validation of their results that will provide additional evidence for the role of HLA peptide splicing.
"We plan shortly to submit manuscripts that look at the functional consequences of these peptides in cancer and infectious disease," he said.