NEW YORK (GenomeWeb) – In a study published in the Journal of the American Medical Association, researchers from Flatiron Health and Foundation Medicine confirmed the previously known clinical and genomic features of non-small cell lung cancer patients using a large clinico-genomics database, and demonstrated that this resource could also be used to glean new insights about emerging disease-linked biomarkers.
The clinico-genomics database includes de-identified electronic health records from thousands of cancer patients treated at more than 200 US cancer centers and links them with results from genomic tests performed by Foundation Medicine. Led by Gaurav Singal, Foundation's chief data officer, and Vineeta Agarwala, director of product management at Flatiron, the research team searched this resource and identified 4,000 NSCLC patients who visited an oncology practice in the database from 2011 to 2018.
They reported this week in JAMA that three quarters of NSCLC patients in the database had a smoking history. EGFR mutations were observed in 17 percent of patients, 3 percent had ALK alterations, and 1 percent had ROS1 rearrangements. Among NSCLC patients with a driver mutation, those treated with a targeted treatment lived longer than those who didn't receive such therapy, with a median overall survival of 18.6 months versus 11.4 months, respectively.
Researchers also looked at biomarkers associated with response to immunotherapy agents, such as PD-L1 status and tumor mutational burden, an emerging biomarker. PD-L1 testing results were available for 1,235 patients, of whom 482, with either positive or negative results, received an anti-PD-1/PD-L1 therapy. No association was found between PD-L1 status and overall survival in patients who received anti-PD-1/PD-L1 therapy, with a median overall survival of 11.3 months versus 10.1 months for PD-L1-positive and PD-L1-negative patients, respectively.
Tumor mutational burden (TMB) was categorized as high if patients had 20 or more mutations per megabase and as low if they had fewer than 20 mutations per megabase. In the database, NSCLC patients who were smokers tended to have higher TMB levels than nonsmokers. Among patients who received anti-PD-1/PD-L1 treatment, those with TMB-high status had significantly longer overall survival than those with TMB-low status, with a median overall survival of 16.8 months versus 8.5 months. The TMB-high group also stayed on treatment longer and had a better clinical benefit rate than the TMB-low group.
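As a rough illustration of the cutoff described above, the TMB categorization can be sketched in a few lines of Python. This is only a minimal sketch based on the 20 mutations per megabase threshold reported in the article; the function and constant names are hypothetical and are not taken from the study or from any Flatiron or Foundation Medicine software.

```python
# Hypothetical names for illustration; only the cutoff value comes from the article.
TMB_HIGH_THRESHOLD = 20.0  # mutations per megabase


def categorize_tmb(mutations_per_megabase: float) -> str:
    """Label a TMB value as 'TMB-high' (>= 20 mut/Mb) or 'TMB-low' (< 20 mut/Mb)."""
    if mutations_per_megabase < 0:
        raise ValueError("TMB cannot be negative")
    if mutations_per_megabase >= TMB_HIGH_THRESHOLD:
        return "TMB-high"
    return "TMB-low"
```

Under this rule, a value of exactly 20 mutations per megabase falls into the high group, since the cutoff is inclusive.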
The study also had some limitations. For example, the authors noted that not every deceased patient's date of death was captured, which may affect the completeness of the overall survival analysis, and that the requirement for genomic testing could introduce bias.
Overall, according to Singal, the study validates the ability of the real-world clinico-genomic database to yield scientifically and clinically relevant findings and improve understanding of personalized medicine. "This represents a major milestone in our mission to leverage regulatory-grade, real-world data to advance cancer care," he said in a statement.
He envisions that the database could be useful in drug development and clinical trial design, and could eventually support clinical decision making.
Flatiron and Foundation highlighted in a statement that since its launch in November 2016, the database has grown to contain de-identified data from more than 50,000 patients, including more than 6,000 NSCLC patients. Notably, the US Food and Drug Administration has inked a research alliance with Flatiron to explore how the database could be a source of real-world evidence on the safety and efficacy of drugs.
Ethan Basch, a JAMA associate editor from the University of North Carolina, Chapel Hill, and Deborah Schrag from the Dana-Farber Cancer Institute wrote in an editorial that the study by Singal and colleagues is important because it shows that a clinico-genomics database can define a population of "real" patients and demonstrates its potential in clinical research. "The authors successfully overcame multiple problems typical in EHR-based evaluations, including identification of current stage of disease, structuring of free text radiology and pathology reports, and documentation of dates of death," they wrote. "These challenges were achieved through laborious procedures combining centralized human annotation and machine learning."
However, the laborious nature of building such a database also highlights the challenges of using real-world data and similar resources. "In an age when cancer diagnoses and treatment recommendations are frequently determined by genomic profiling, lack of availability of this information is a potential fatal limitation of most [real-world evidence] resources," Basch and Schrag wrote.