Personalized Cancer Vaccines Benefit From Considering Nearby Variants in Peptides


NEW YORK (GenomeWeb) – Researchers from Washington University School of Medicine, St. Louis (WUSTL), Nationwide Children’s Hospital, and elsewhere have discovered the value of considering variants located near tumor variants of interest in designing neoantigens for use in personalized cancer immunotherapies.

In a paper published in Nature Genetics this week, the researchers showed that failing to consider proximal variants can adversely affect the binding abilities of neoantigenic peptides that are designed for use in personalized cancer vaccines. In addition, they wrote, ignoring local variations can result in selecting lesser neoantigens for vaccines.

Based on their evaluation of data from 430 tumors, with an average of 241 missense somatic variants per sample, the researchers reported a false discovery rate, meaning that incorrect neoantigens were predicted, of about 6.9 percent when they did not correct for proximal variants. They also reported a false negative rate, meaning that strongly binding neoantigens were missed, of 2.6 percent when their analysis did not account for nearby variants. Based on these calculations, they estimated that in a pool of 100 individuals, approximately 51 would get a suboptimal vaccine due to the selection of a neoantigen with an incorrect peptide sequence, 23 would receive a suboptimal vaccine due to investigators missing out on a strong-binding neoantigen, and 62 would receive a suboptimal vaccine due to at least one of these causes.
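The per-patient estimates can be approximately reproduced with a short calculation. The sketch below is not taken from the paper: it assumes each vaccine contains 10 peptides (a round number chosen for illustration) and that each peptide is affected independently at the reported per-peptide rates.

```python
def fraction_affected(per_peptide_rate: float, peptides_per_vaccine: int) -> float:
    """Probability that at least one of n independent peptides is affected."""
    return 1.0 - (1.0 - per_peptide_rate) ** peptides_per_vaccine


FDR = 0.069       # false discovery rate reported in the article
FNR = 0.026       # false negative rate reported in the article
N_PEPTIDES = 10   # ASSUMPTION: peptides per vaccine, for illustration only

wrong_sequence = fraction_affected(FDR, N_PEPTIDES)
missed_binder = fraction_affected(FNR, N_PEPTIDES)
# At least one of the two problems, treating them as independent events.
either = 1.0 - (1.0 - wrong_sequence) * (1.0 - missed_binder)

for label, p in [
    ("incorrect peptide sequence", wrong_sequence),
    ("missed strong binder", missed_binder),
    ("at least one of the two", either),
]:
    print(f"{label}: about {round(100 * p)} per 100 patients")
```

Under these assumptions the three figures round to 51, 23, and 62 per 100 patients, in line with the estimates quoted above.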

The results indicate that failing to account for nearby variation can significantly alter the binding affinity of selected peptides, said Jasreet Hundal, a doctoral candidate in human and statistical genetics at WUSTL and one of the authors on the paper. Thus, they make a good case for expanding the focus of analysis pipelines beyond narrowly looking at differences between aligned tumor reads and the reference genome. "We want to make sure that whatever we are vaccinating the patient with, that sequence is correct," she said.

Current approaches for designing personalized cancer vaccines use tumor-normal data to identify sequences for so-called neoantigens that can bind to MHC class I and class II molecules. Typically, to evaluate strong-binding neoantigens from genomic sequence data, raw reads from tumor and normal DNA libraries are aligned to the human reference genome and somatic variants of interest (SVOIs) are identified by comparing them. These SVOIs are then annotated to predict protein sequence changes and infer possible neoantigenic peptides, usually eight to 11 amino acids long, that can bind to MHC class I and II molecules.

Because the whole purpose of neoantigen prediction is to create a therapy that is personalized for a patient's tumors, failing to consider the context of the tumor mutations in a way defeats the purpose of the exercise, said Malachi Griffith, an assistant professor in WUSTL's department of medicine, assistant director of The McDonnell Genome Institute, and another author on the paper. "You might have … maybe ten peptide candidates for an individual patient's vaccine [and] it's a shame that some of those either don't actually correspond to that patient's tumor or you predicted that they would be good but you made that prediction based on incorrect information."

Existing solutions developed for neoantigen prediction include CloudNeo, developed by researchers from The Jackson Laboratory, which provides a cloud-based computational workflow. The solution was created as part of the Seven Bridges Genomics platform for the National Cancer Institute's Cancer Genomics Cloud initiative.

WUSTL has its own offering for the space called the pVACtools cancer immunotherapy suite. The suite includes the pVACseq pipeline, which uses tumor mutation and gene expression data to identify targets for personalized cancer vaccines and adoptive T cell therapies. The pipeline is described in a paper that was published in Genome Medicine in 2016. Also included in the suite are pVACfuse, for detecting neoantigens that result from gene fusions; pVACvector, a tool designed to aid in the construction of DNA-based cancer vaccines; and pVACviz, a browser for launching, managing, reviewing, and visualizing output from the tool suite.

According to Griffith, because of the complexities of performing computational analyses of genomic data in the context of cancer, developers of bioinformatics pipelines used for neoantigen prediction made some simplifying assumptions upfront. One such assumption is that since the immune system recognizes only a small peptide sequence rather than a whole protein, nearby variations in the sequence should not be too much of an issue. "We knew that it was a limitation from early stages … but we made a calculated choice that most of the time it wouldn’t matter because of the short length of the peptides that we were using," he said, but the intent was always to reexamine some of those assumptions with an eye towards optimizing the neoantigen prediction process.

At least one existing method avoids missing potentially important variation by looking at RNA sequence. That tool, Vaxrank, was developed by researchers at the Icahn School of Medicine at Mount Sinai and the Medical University of South Carolina. As explained in a preprint published in October on BioRxiv, Vaxrank determines which peptides should be used in a vaccine from tumor-specific somatic mutations, tumor RNA sequencing data, and a patient’s HLA type. These peptides can then be synthesized and combined with an adjuvant in an attempt to elicit an anti-tumor T-cell response in a patient. The solution determines mutated protein sequences by assembling variant RNA reads and ranking them based on a scoring system. It also considers surrounding non-mutated residues in prioritizing peptide candidates.

"If you look at every mutation in the cancer and think about what [it] is doing, a large fraction of the time that will lead you in the correct direction, but a small fraction of the time, you are missing a co-occurrence of mutations," said Alexander Rubinsteyn, a researcher in Mount Sinai's OpenVax laboratory and one of Vaxrank's developers. This can be "either the co-occurrence of cancer mutations with each other, which happens at a surprisingly high rate, or a co-occurrence of a somatic mutation with a germline variant in that patient." 

According to results reported in the Nature Genetics paper, of the 430 tumor samples analyzed, 380 samples, or about 88 percent, had at least one missense SVOI in phase with a proximal missense variant. In just under 94 percent of the cases, the SVOIs had a single proximal germline or somatic variant in phase, but the researchers also found cases where multiple variants were proximal to the SVOI.

Rubinsteyn echoed the WUSTL researchers' comments that methods focused solely on identifying mutations using reference-based matching can lead to incorrect peptide sequences or missed opportunities for stronger-binding neoantigens. Vaxrank sidesteps this issue by focusing on tumor gene expression. Essentially, it assumes that "you only want to make vaccines against mutations that are highly expressed [and have] a lot of RNA evidence for the mutant protein getting made," he explained. That mutated transcript will include all of the expression for the mutation of interest, as well as variants close by. "What we do is take the whole bundle of reads that support a particular mutation and we overlap them to get a consensus sequence … which implicitly then also contains any germline and somatic variants," he explained.
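The idea of overlapping reads into a consensus can be illustrated with a toy example. The sketch below is not Vaxrank's code or API: it assumes the reads are already placed at known offsets on a shared coordinate axis with contiguous coverage, and it uses a simple per-position majority vote in place of real read assembly. The function name `consensus` and the example reads are hypothetical.

```python
from collections import Counter


def consensus(reads: list[tuple[int, str]]) -> str:
    """Majority-vote consensus of reads given as (start_offset, sequence).

    ASSUMPTION: offsets are already known and coverage has no gaps;
    a real tool would assemble the reads rather than trust fixed offsets.
    """
    columns: dict[int, Counter] = {}
    for start, seq in reads:
        for i, base in enumerate(seq):
            columns.setdefault(start + i, Counter())[base] += 1
    # Pick the most frequent base at each covered position, left to right.
    return "".join(columns[pos].most_common(1)[0][0] for pos in sorted(columns))


# Hypothetical reads that all support the same mutation; the last one
# carries a single sequencing error that the vote overrules.
reads = [
    (0, "ACGTT"),
    (2, "GTTGC"),
    (4, "TGCAA"),
    (1, "CGATT"),
]
print(consensus(reads))
```

Because every read in the bundle comes from the mutant transcript, any germline or somatic variant that sits on the same molecule is carried into the consensus without being called separately, which is the implicit behavior Rubinsteyn describes.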

One possible drawback of Vaxrank's approach is that it does not offer the same level of detail about individual variants that WUSTL's method does. "If you want to figure out which other mutations were phased with a particular somatic variant, you have to do extra work to reverse engineer that from this consensus sequence," Rubinsteyn said. "We just have a consensus sequence in the RNA that we know contains a mutation and may contain others. And then to find out what others are there, we have to do extra work."

In contrast, WUSTL's method looks for phasing at the DNA level, so "they can tell you immediately that these four mutations co-occur, of which one is germline and three are somatic," he said. "It gives you some extra information for free that we don't have in our method."

Vaxrank is being used in three cancer vaccine trials at the Icahn School of Medicine that focus on developing personal vaccines for patients with advanced non-hematologic malignancies, glioblastoma, and urothelial cancer.

Meanwhile, the WUSTL researchers have encapsulated their approach in a module that they are making available as part of the pVACtools suite. The updated pipeline follows the same process that existing tools do but includes an additional step where researchers consider sequences in the region around the candidate variants to see if there are any somatic or germline polymorphisms on the same allele that could potentially affect the selected peptide sequences. "If we find that they are within the window that matters for the size of peptide that we are trying to design for the vaccine, we correct the candidate peptide sequence to personalize it to that person's genome sequence," Griffith explained.
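The correction step Griffith describes can be sketched in a few lines. This is an illustrative toy, not the pVACtools implementation: the function name, the example protein, and the variant positions are all hypothetical, and the sketch handles only missense substitutions that are already known to be in phase with the SVOI.

```python
def candidate_peptides(protein, svoi_pos, svoi_alt, proximal=(), length=9):
    """Return every peptide of `length` residues that contains the SVOI.

    Positions are 0-based indices into `protein`. `proximal` holds
    (position, alt_residue) pairs ASSUMED to be in phase with the SVOI.
    """
    residues = list(protein)
    residues[svoi_pos] = svoi_alt
    for pos, alt in proximal:
        # Only variants within length - 1 residues can share a peptide
        # with the SVOI, so only those fall inside the relevant window.
        if abs(pos - svoi_pos) < length:
            residues[pos] = alt
    mutated = "".join(residues)
    first = max(0, svoi_pos - length + 1)
    last = min(svoi_pos, len(mutated) - length)
    return [mutated[s:s + length] for s in range(first, last + 1)]


protein = "MKTAYIAKQRQISFVKSHFSRQ"   # hypothetical reference protein
svoi = (10, "L")                     # hypothetical somatic Q->L change
germline = [(12, "P")]               # hypothetical in-phase S->P variant

naive = candidate_peptides(protein, *svoi)
corrected = candidate_peptides(protein, *svoi, proximal=germline)
changed = sum(a != b for a, b in zip(naive, corrected))
print(f"{changed} of {len(naive)} candidate peptides change after correction")
```

In this toy case, seven of the nine 9-mers that span the somatic change also span the nearby variant, so a pipeline that ignored it would pass the wrong sequence to binding prediction for most of the candidates.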

As part of their next steps, the researchers are prepping updated pipelines for use in current and future clinical trials at WUSTL, Griffith said. "These have to be very carefully documented and versioned so that they don’t change as rapidly as the version of the pipeline that's used in a research context," he said. He expects that his team will be ready to deploy the patient-ready pipelines in the next three to six months.

The researchers are also exploring other avenues for improving tumor neoantigen prediction and characterization. For example, they are looking at exploring class II MHC interactions in tumor cells in greater depth and they want to find ways to incorporate data from expression levels of MHC molecules in tumors. "There's been a probably unjustified fixation on class I MHC interactions and CD8+ T-cell response and it is becoming increasingly obvious that CD4+ or class II MHC interactions are really important," Griffith said.

Furthermore, "we are not routinely looking for somatic mutations in the MHC loci of the tumors themselves," he noted. This has value, according to the results of at least one study performed last year by researchers from Memorial Sloan Kettering Cancer Center and elsewhere, which demonstrated that patients' human leukocyte antigen (HLA) genotypes influence how well they respond to immunotherapy. In that study, the researchers inferred the HLA genotypes from normal DNA. However, Griffith noted that these genotypes could have changed in the tumor sample, and this could be an important factor in selecting peptide candidates. "It will certainly be important to understand mechanisms of resistance to immunotherapies," he said.