NEW YORK (GenomeWeb) – Researchers from Imperial College London and Utrecht University have found that a large proportion of peptides that are presented on human leukocyte antigen (HLA) class I molecules are spliced by the proteasome, a process previously thought to be rare.
The finding, described in a paper published today in Science, suggests that spliced HLA-I-binding antigenic peptides are considerably more common than previously thought and has potential implications for fields including cancer immunotherapy and autoimmune disease, Albert Heck, chair of Biomolecular Mass Spectrometry and Proteomics at Utrecht and an author on the paper, told GenomeWeb.
The work also demonstrates the value of proteomics for detecting phenomena not apparent at the genomic level and the ability of recently developed mass spec methods to obtain nearly complete protein sequence information, Heck added.
HLA class I molecules play a key role in immunity, displaying the peptide antigens that generate cytotoxic T cell responses to various infections or diseases. Identification and manipulation of these antigens is key to research in areas like cancer immunotherapy, where scientists are working to trigger patients' immune systems to fight their cancers by presenting cancer-specific HLA antigens.
HLA-I molecules present these antigens after proteasomes have processed them, cleaving the source proteins into smaller peptides that stimulate the CD8+ T cell response. In addition to simply trimming proteins into peptides, proteasomes can also cut and splice peptides, creating new molecules that don't match the original protein sequence.
Such peptide splicing was thought to be extremely rare, Heck said. His team's work suggests, however, that peptides spliced by the proteasome make up around a third of the HLA-I immunopeptidome in terms of diversity and around a quarter in terms of abundance.
Key to identifying this repertoire of spliced peptides was the use of a fragmentation method Heck and his team developed several years ago that combines electron transfer dissociation with higher-energy collision dissociation. Called electron transfer higher-energy collision dissociation (EThcD), the approach provides much more complete fragmentation of peptides, allowing researchers to almost sequence these molecules de novo.
"In normal HCD with tryptic peptides, if you have a peptide with 10 amino acids, you typically only have six or seven amino acids covered, but together with [a comparison to] the genome sequence, this is enough to identify the peptide, even though you are not sure about these three amino acids that you didn't identify," Heck said.
The EThcD method, on the other hand, aims "to get full amino acid sequence coverage for each peptide," he said. "And if you can do that, you can almost do de novo sequencing of peptides. You almost don't need a genome database to do your search against."
This meant the researchers could identify peptides, like the spliced HLA-I-binding peptides, that didn't correspond to the genomic sequence of the original protein. "And this allowed us to look at if this proteasomal splicing is really a rare event or if it happens more often," Heck said.
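The distinction between a peptide that matches the source protein and one that could only have arisen by splicing can be illustrated with a toy check. This is a minimal sketch, not the researchers' analysis pipeline: the function name `classify_peptide` and the example sequences are hypothetical, and the check assumes that both fragments come from the same protein and ignores fragment length limits and the distance between fragments.

```python
def classify_peptide(peptide: str, protein: str) -> str:
    """Toy classification of a peptide relative to one parent protein.

    Returns 'contiguous' if the peptide appears as-is in the protein,
    'spliced' if it can be built by joining two fragments that each
    appear somewhere in the protein, and 'unexplained' otherwise.
    """
    if peptide in protein:
        return "contiguous"
    # Try every split point: both pieces must occur in the protein,
    # in any order and at any distance from each other.
    for i in range(1, len(peptide)):
        left, right = peptide[:i], peptide[i:]
        if left in protein and right in protein:
            return "spliced"
    return "unexplained"


# Hypothetical sequences, chosen only for illustration.
protein = "MKTAYIAKQRQISFVKSHFSRQ"
print(classify_peptide("AYIAK", protein))     # contiguous
print(classify_peptide("KTAYSFVK", protein))  # spliced: KTAY + SFVK
print(classify_peptide("WWWW", protein))      # unexplained
```

A search that only considers contiguous stretches of database sequences would miss the second peptide, which is why near-complete sequence information from the spectrum itself matters here.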
The discovery that such splicing was, in fact, quite common, "was very shocking," he said. "It not only means the proteasome does this a lot, it also means that no one had realized this yet. It is clinically very important to know which HLA peptides are presented [for areas like immunotherapy and vaccine research], and this is a whole new category of peptides that might be presented to cells."
"To be honest, when we found it, I didn't believe it," Heck said, noting that the researchers performed a number of validation experiments to confirm the discovery. They repeated the experiment in several different cell lines, again finding a large proportion of spliced peptides. They also synthesized the spliced and non-spliced peptides to confirm that the spectra from the experimentally observed spliced peptides matched those of the corresponding synthetic spliced peptides. They tested the proteasomes in vitro, as well, finding that they created spliced forms like those they detected in their in vivo experiments. And they repeated their analysis in cells with the relevant proteases knocked out, finding that this nearly eliminated the presence of the spliced peptides, which indicated that they were, indeed, generated by the proteases.
"With this data all together I became convinced that this proteasomal splicing is really happening quite often and should therefore be taken into consideration when you look at the presentation of HLA peptides," Heck said.
The study, he noted, has significant implications, but little in the way of answers, for research into the HLA-I immunopeptidome.
In cancer immunotherapy, for instance, "there is a lot of money and investment being put toward finding these [cancer antigen] peptides that are being presented," he said. "Those researchers want to know the rules for HLA peptide presentation. This work suggests that we know way less about peptide presentation than we thought we did."
The authors also raised the possibility that these spliced peptides could play an important role in autoimmune diseases. Spliced forms, they note, could create overlaps with human proteins that could lead to autoimmunity.
Beyond its potential significance to immunology research, the study also provides "a strong case for doing proteomics," Heck said. "Because the data we acquired could never have been traced back to genomic sequences."
He added that this sort of splicing has been observed in proteases outside the HLA-I system, indicating that it might contribute to proteomic diversity more generally.
"It is known that other proteases can do this, though it is not known how often it happens in vivo," he said, noting that it could explain why researchers have been unable to identify certain expected sequences in the proteome.
Of his lab's push toward de novo sequencing on a proteome-wide scale, Heck said that while the technology continues to advance in that direction, "it certainly is not there yet."
"It is very easy to explain where de novo goes wrong," he said. "If, for instance, a peptide has 20 amino acids and we sequence 18 of them and the last two we know are either A-G or G-A, if you don't have the fragmentation between [those two], you will never be able to say if it is A-G or G-A, and so that peptide you cannot identify de novo. You really need a cleavage between every amino acid."
Currently, he said, the EThcD method achieves between 90 and 95 percent fragmentation, compared to 70 to 80 percent for single fragmentation approaches.
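The A-G versus G-A ambiguity Heck describes, and the link to fragmentation coverage, can be sketched in a few lines. This is an illustration under simplifying assumptions rather than a de novo sequencing algorithm: it starts from a known peptide and a hypothetical set of observed cleavage positions, and treats residues between two observed cleavages as having a known composition but an unknown order. The function names and the example peptide are invented for this sketch.

```python
from itertools import permutations, product


def cleavage_coverage(peptide: str, observed: set[int]) -> float:
    """Fraction of inter-residue bonds with an observed cleavage.

    Position i refers to the bond between residue i-1 and residue i
    (0-based), so valid positions run from 1 to len(peptide) - 1.
    """
    bonds = len(peptide) - 1
    if bonds == 0:
        return 1.0
    return len({i for i in observed if 1 <= i <= bonds}) / bonds


def candidate_sequences(peptide: str, observed: set[int]) -> list[str]:
    """List every sequence consistent with the observed cleavages."""
    # Cut the peptide at each observed cleavage. Inside a block with no
    # internal cleavage, the order of residues cannot be determined.
    blocks, start = [], 0
    for i in range(1, len(peptide)):
        if i in observed:
            blocks.append(peptide[start:i])
            start = i
    blocks.append(peptide[start:])
    # Each block contributes all distinct orderings of its residues.
    options = [sorted({"".join(p) for p in permutations(b)}) for b in blocks]
    return ["".join(combo) for combo in product(*options)]


# Hypothetical six-residue peptide with the last bond (A|G) unobserved.
print(cleavage_coverage("PEPTAG", {1, 2, 3, 4}))    # 0.8
print(candidate_sequences("PEPTAG", {1, 2, 3, 4}))  # ['PEPTAG', 'PEPTGA']
print(candidate_sequences("PEPTAG", {1, 2, 3, 4, 5}))  # ['PEPTAG']
```

A single missing cleavage already doubles the number of candidates, and longer unfragmented stretches grow factorially, which is why coverage short of 100 percent leaves some peptides unresolved without a database.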
"So it is substantially better, but it still isn't 100 percent," Heck said. He added that he believed the technology would get there, but that, even then, certain peptides would remain unamenable to the approach.
"Say you want to identify 100 peptides. Maybe for 90 you will get [100 percent] sequence coverage; for nine peptides you will have three different solutions and you might be able to synthesize these solutions and find out which one is the best; and maybe for one you still won't be able to tell what it really is," he said.