NEW YORK (GenomeWeb) – A little more than a year after initiating a plan to compare different immunohistochemistry tests used to predict patient benefit from anti-PD-L1 immunotherapy drugs, researchers have shared the first results from Phase I of this effort, demonstrating some concordance between different assays, but also differences significant enough to prevent easy translation of the results from one test to another.
Researchers from the International Association for the Study of Lung Cancer (IASLC) — who co-led Phase I of the "Blueprint" assay comparison project with a consortium of pharmaceutical companies, diagnostic companies, and other academic associations — are now preparing for a Phase II follow-up that will validate and extend the results.
"Until we have more data collected, we recommend that clinicians stick with the recommendations that one assay goes with one specific drug," Fred Hirsch, professor of medicine and pathology at the University of Colorado Cancer Center and CEO of the IASLC, told GenomeWeb this week.
Hirsch presented the new data at the annual meeting of the American Association for Cancer Research last month.
The Blueprint project was initiated in recognition that the cancer immunotherapy field was developing rapidly, with multiple drugs being advanced in parallel alongside separate companion or complementary diagnostic tests.
This siloed development has raised a pressing clinical issue — doctors and pathologists don't want to have to order or perform multiple assays to answer a single question.
By comparing four of these assays, the pharma and diagnostics companies in the Blueprint consortium hoped to help better inform healthcare providers about the analytical variability between these tests and to take first steps towards a possible harmonization or standardization of available platforms.
At the AACR meeting, Hirsch described the group's Phase I analytical comparison, which involved four tests developed by two diagnostics companies — Dako's 22C3 and 28-8 assays and Ventana's SP263 and SP142 assays — developed alongside four PD-1/PD-L1 immune checkpoint inhibitors — Merck's Keytruda (pembrolizumab), Bristol-Myers Squibb's Opdivo (nivolumab), AstraZeneca's durvalumab, and Genentech's atezolizumab, respectively.
In the study, two Ventana pathologists and one Dako pathologist who were experts in interpreting their respective assays independently evaluated 156 IHC slides from 39 patient samples chosen to represent as much as possible the full dynamic range of the assays.
Each pathologist evaluated each case using only the clinical algorithm and positive cutoff point associated with their specific assay. For example, the 22C3 expert pathologist read all slides using the 22C3 selected cutoff of 1 percent tumor proportion score (TPS).
Researchers then compared the results of the four assays both independent of and in light of their clinical cutoff points.
According to Hirsch, a purely analytical comparison of the percentage of tumor cell staining for each assay revealed that three of the four tests were very analytically similar, while the fourth — Ventana's SP142 — diverged significantly, consistently labeling fewer tumor cells positive for PD-L1.
However, when the researchers compared the number of cases that met the clinical PD-L1 expression cutoff for each assay, only 19 cases showed agreement between all four assays in being deemed "PD-L1 positive." The others showed varying levels of agreement, largely driven by the different cutoffs used.
The researchers also calculated the agreement rate when each assay was read using its own clinically established cutoff point versus the cutoff point of one of the other assays. For example, using a 1 percent TPS cutoff with the SP263 assay rather than the test's standard 25 percent cutoff still yielded a positive result in 85 percent of the same samples.
In contrast, using a 25 percent TPS with the SP142 assay resulted in only 39 percent agreement compared to its usual cutoff point.
In essence, the results revealed that a sample considered PD-L1 positive based on one assay with one cutoff point may not be so according to another assay with another cutoff. Moreover, applying the cutoff from one assay to another can significantly change the number of samples deemed positive.
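The interaction between assay sensitivity and cutoff point can be sketched in a few lines of code. The TPS values, sample names, and assay labels below are entirely hypothetical and invented for illustration — they are not drawn from the Blueprint data — but the logic mirrors the comparison described above: a less sensitive assay read at a high cutoff calls far fewer samples positive than a more sensitive assay read at a low cutoff.

```python
# Hypothetical sketch of how per-assay cutoffs change which samples are
# called PD-L1 positive. All TPS values below are invented for
# illustration; they are not from the Blueprint study.

# Tumor proportion scores (percent of tumor cells staining) for five
# hypothetical samples, as read by two hypothetical assays. Assay B
# consistently labels fewer tumor cells positive, as SP142 did.
tps_assay_a = {"s1": 2.0, "s2": 30.0, "s3": 0.5, "s4": 60.0, "s5": 10.0}
tps_assay_b = {"s1": 0.8, "s2": 12.0, "s3": 0.2, "s4": 40.0, "s5": 3.0}

def positive_calls(tps_by_sample, cutoff_pct):
    """Return the set of samples at or above the given TPS cutoff."""
    return {s for s, tps in tps_by_sample.items() if tps >= cutoff_pct}

def agreement(calls_x, calls_y, all_samples):
    """Fraction of samples on which two sets of positive calls agree."""
    same = sum((s in calls_x) == (s in calls_y) for s in all_samples)
    return same / len(all_samples)

samples = set(tps_assay_a)

# Each assay read against its own (hypothetical) clinical cutoff:
calls_a = positive_calls(tps_assay_a, 1.0)   # low cutoff, like 22C3's 1%
calls_b = positive_calls(tps_assay_b, 25.0)  # high cutoff, like SP263's 25%

# Cross-applying the low cutoff to assay B enlarges its positive set:
calls_b_low = positive_calls(tps_assay_b, 1.0)

print(sorted(calls_a))                        # positives under A at 1%
print(sorted(calls_b))                        # positives under B at 25%
print(sorted(calls_b_low))                    # positives under B at 1%
print(agreement(calls_a, calls_b, samples))   # cross-assay agreement
```

In this toy data, the same five samples yield four positives under one assay-cutoff pair and only one under another, so a patient deemed positive by one test could be deemed negative by the other — exactly the translation problem the Blueprint authors flag.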
According to Hirsch and colleagues, these results illustrate clearly that to avoid the risk of denying a patient access to a drug that could benefit them, clinicians should still follow the recommendations of specific assay-drug pairs.
Other studies have demonstrated similar discordance between different PD-L1 antibodies and cutoff points. For example, in a retrospective analysis led by David Rimm from the Yale Cancer Center last year comparing SP142 and another PD-L1 antibody, E1L3N, investigators found that over 25 percent of patients who were above the positive threshold by one antibody were below it by the other.
However, there may yet be hope for cross-applicability of different PD-L1 assays. In another study also presented at AACR last month, researchers from AstraZeneca tested 500 tumor biopsy samples from patients with NSCLC, and looked at the results of three of the same PD-L1 tests as the Blueprint group: Ventana's SP263 assay and Dako's 22C3 and 28-8 assays.
The investigators determined that if the different cutoff points for the three tests could be properly aligned, the assays were actually highly concordant in their ability to predict patient responses. In other words, the patient population defined by the Ventana SP263 test at the 25 percent cutoff point was similar to that identified by the Dako 28-8 test at its own cutoff, and so on.
To collect more data on these open questions, Rimm, along with Ignacio Wistuba of the University of Texas MD Anderson Cancer Center, was selected to lead a National Comprehensive Cancer Network and Bristol-Myers Squibb-sponsored study.
Meanwhile, Hirsch and colleagues from the IASLC are moving forward with a second phase of the Blueprint project, which will likely focus on the same four assays compared in Phase I, but will also expand to greater numbers of samples, as well as different types and sizes of samples.
The team is also planning to look at how the different assays perform across different technology platforms, Hirsch said.
None of the studies so far comparing available PD-L1 tests have examined the different cutoff points and antibodies in the context of actual patient outcomes. Even though two assays may be discordant in who they deem PD-L1 positive, it's possible that they could still be concordant in their prediction of patient outcome, or vice versa, adding another layer of complexity to any future harmonization.