NEW YORK – Researchers at Stanford University have built a mathematical model that can predict the size of a non-small cell lung cancer (NSCLC) tumor based on the amount of circulating tumor DNA (ctDNA) shed into a patient's bloodstream.
The team believes the mathematical framework provides a theoretical estimate of the performance of mutation-based ctDNA detection assays for multiple cancers across different clinical scenarios, including routine screening and monitoring for cancer relapse.
Johannes Reiter, associate professor at the Stanford University School of Medicine, explained that he and his colleagues developed the cancer evolution and ctDNA shedding framework to establish the potential and limitations of blood-based cancer early detection assays.
However, Reiter acknowledged that he did not make much progress on the model until Bert Vogelstein's group at Johns Hopkins University published data on its CancerSeek liquid biopsy platform in 2018. His team used the study's data to understand the variance of plasma DNA concentrations across different patient populations and cancer stages.
"The construction of the CancerSeek panel, [which identified] the most frequently mutated driver gene regions, showed us that a fairly small panel can be sufficient to cover at least one driver gene mutation per patient across multiple cancer types," Reiter said. "How such a panel is then leveraged for cancer early detection also from a statistical perspective was very well explained … and helped us to think about a mathematical model."
In a study published in Science Advances earlier this month, Reiter's team reanalyzed ctDNA sequencing data and tumor volumes of 176 patients with stage I to III NSCLC. The data had initially been analyzed to measure evolutionary ctDNA dynamics and integrate genomic features for NSCLC detection.
Reiter explained that the model is based on previously published data about NSCLC tumors, including the daily rates of tumor cell birth and apoptosis and the daily rate at which ctDNA is eliminated from the patient's bloodstream. The group defined the effective shedding rate as the sum of ctDNA haploid gene equivalents shed during apoptosis, necrosis, and proliferation.
Applying the model to data from three NSCLC cohorts, which had similar ctDNA levels when analyzed with either CAPP-Seq or Natera's Signatera assay, the researchers inferred that about 0.014 percent of a cancer cell's DNA is shed into the bloodstream after it undergoes apoptosis.
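To give a feel for how a per-cell shedding fraction translates into circulating ctDNA, the sketch below balances shedding against clearance at equilibrium. It is a simplified illustration, not the study's model: only the 0.014 percent shedding fraction comes from the reported result, while the death rate, the 30-minute ctDNA half-life, the tumor cell counts, and the function name are assumptions chosen for illustration.

```python
import math

def steady_state_ctdna(tumor_cells, death_rate_per_day,
                       shed_fraction_per_death, elimination_rate_per_day):
    """Expected circulating ctDNA (haploid genome equivalents) at equilibrium.

    Inflow  = cells dying per day * fraction of DNA shed per dying cell
    Outflow = elimination rate * circulating amount
    Setting inflow equal to outflow gives the equilibrium amount.
    """
    inflow_per_day = tumor_cells * death_rate_per_day * shed_fraction_per_death
    return inflow_per_day / elimination_rate_per_day

SHED_FRACTION = 0.014 / 100          # 0.014 percent, as reported in the article
DEATH_RATE_PER_DAY = 0.1             # assumed for illustration
HALF_LIFE_DAYS = 0.5 / 24            # assumed 30-minute ctDNA half-life
ELIMINATION_RATE = math.log(2) / HALF_LIFE_DAYS  # per day

for cells in (1e8, 1e9, 1e10):       # hypothetical tumor sizes in cells
    copies = steady_state_ctdna(cells, DEATH_RATE_PER_DAY,
                                SHED_FRACTION, ELIMINATION_RATE)
    print(f"{cells:.0e} cells -> about {copies:,.0f} genome equivalents in circulation")
```

Under these assumptions the circulating amount scales linearly with the number of tumor cells, which is why a tumor has to reach a certain size before its ctDNA becomes detectable.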
To demonstrate the framework's potential utility and accuracy, Reiter's team looked at tumors with different sizes, growth rates, and cell turnover rates. While there was some unexpected variability between tumors with different growth rates, the group saw that in general the factors strongly influenced ctDNA levels.
Reiter and his colleagues then investigated how the sequencing panel size, sampling frequency, blood sample amount, sequencing error, and the number of called mutations affected the expected tumor detection size. Increasing sampling frequency and volume led to a large drop in the median tumor detection size, while decreasing sequencing error also led to strong decreases in the expected detection sizes. The median tumor detection size also fell dramatically when the team increased the sequencing panel up to about 25 called mutations.
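One way to see why blood sample volume matters is a simple Poisson sampling argument: if mutant fragments are spread evenly through the plasma, a larger draw is more likely to contain at least one of them. The sketch below is an illustration of that general idea only, not the researchers' framework. It ignores sequencing error and panel size, and the 3,000 mL plasma volume, the fragment count, and the function name are assumptions.

```python
import math

def prob_at_least_one_fragment(circulating_copies, sample_ml, plasma_ml=3000.0):
    """Chance that a plasma sample contains one or more mutant fragments.

    Assumes fragments are uniformly mixed, so the count in the sample is
    approximately Poisson with mean copies * (sample volume / total volume).
    plasma_ml = 3000 is an assumed total plasma volume.
    """
    expected_in_sample = circulating_copies * sample_ml / plasma_ml
    return 1.0 - math.exp(-expected_in_sample)

HYPOTHETICAL_COPIES = 200  # assumed number of mutant fragments in circulation

for sample_ml in (5, 15, 30):
    p = prob_at_least_one_fragment(HYPOTHETICAL_COPIES, sample_ml)
    print(f"{sample_ml} mL sample: P(at least one fragment) = {p:.2f}")
```

In this toy setting, a larger sample raises the chance of capturing a fragment from a tumor of a given size, so the same detection probability is reached at a smaller tumor.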
By using the mathematical framework and shedding rate assumptions, the researchers could predict the expected tumor detection size for early detection tests (both screening and cancer relapse) based on somatic point mutations in ctDNA.
For annual screening purposes, the model predicted expected median detection sizes of 2.0 to 2.3 cm in a group of individuals with a smoking history, which represented about a 40 percent decrease from the current median detection size of 3.5 cm using imaging-based methods.
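The "about 40 percent" figure can be checked with a quick calculation on the reported diameters. This is a minimal sketch using only the numbers quoted above; the function name is illustrative.

```python
def percent_decrease(baseline_cm, new_cm):
    """Relative reduction of new_cm compared with baseline_cm, in percent."""
    return 100.0 * (baseline_cm - new_cm) / baseline_cm

IMAGING_MEDIAN_CM = 3.5           # current median detection size with imaging
PREDICTED_RANGE_CM = (2.0, 2.3)   # model-predicted median detection sizes

for predicted in PREDICTED_RANGE_CM:
    drop = percent_decrease(IMAGING_MEDIAN_CM, predicted)
    print(f"{predicted} cm vs {IMAGING_MEDIAN_CM} cm: {drop:.0f} percent smaller")
```

The two ends of the predicted range correspond to reductions of roughly 43 and 34 percent in diameter, which brackets the "about 40 percent" quoted.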
To assess ctDNA-based relapse detection, the group computed the expected lead time to imaging-based relapse detection when applied at the same frequency of testing. For informed monthly cancer relapse testing, the model predicted a median detection size of 0.83 cm and indicated that treatment failure can be identified 140 days earlier.
Manish Kohli, a professor of oncology at the Huntsman Cancer Institute who was not involved with the study, praised the model's contextual details and ability to potentially integrate both mathematical concepts and cancer genetics.
"Up until now, we've taken candidate mutations in panels or genome-wide mutations and sequenced them," Kohli said. "[Reiter's team] has taken that to the next level by … taking into account the math and genetics, fusing two disparate entities together to project the future, which is the way to go."
Reiter readily admits that he and his colleagues dealt with multiple challenges and assumptions when looking at patient data. Noting that his team's understanding of ctDNA shedding and its variability across other tumors was limited, Reiter said that he will need to perform additional work to understand how the correlations between ctDNA levels and tumor volume can help inform shedding rate inferences in other cancers.
"We need good imaging data in order to get a good estimate of the tumor volume," Reiter said. "If you have very good imaging data and a sensitive ctDNA assay, it allows us to infer ctDNA shedding probability and then make [objective] comparisons between multiple cancer types."
Kohli pointed out that the theoretical framework may encounter several points of bias due to the lack of ctDNA standardization at the pre-analytical level. For example, using different extraction tubes (e.g. Streck, EDTA, or LBgard tubes) may lead to different downstream ctDNA values following the sequencing workflow.
"There are many other preanalytical issues that researchers grapple with, such as the length of time after blood sample collection prior to sequencing, or if there was any breakdown of ctDNA, or if germline DNA was taken at the same time, and if alignments were done properly, so that you know exactly what is [in] the tumor's DNA," Kohli added.
While the group verified its mathematical results through computer simulations, Reiter noted that it will need to validate the predicted tumor detection sizes in large clinical studies for screening and cancer relapse-based applications. The group will also need to consider other biological factors and clinical pathological variables — such as tumor histology, cancer stage, and treatment regimen — to improve the model.
However, Reiter highlighted that in principle users could ask similar questions for both early detection scenarios. For example, he said that a public health expert may decide that lung cancer screening should aim to detect tumors at a target size of 2 cm, which the expert believes will make a difference in reducing patient mortality.
"That goal could be put into our formula, and it should tell you how often to test, or how deep we need to sequence, or how large the sequencing panel needs to be, or how low the sequencing error rate needs to be," Reiter explained. "All of these parameters can be tweaked in different ways, depending on which commercial assay you are using."
"[Reiter's team has] taken a strong, first theoretical step [in that] if just by sampling blood you are able to do better than imaging-based detection months before when the tumor sizes are smaller, that's great," Kohli added. "But, it will not always tell you what kind of tumor it is, where it is coming from, or its location in the body."
Reiter and his colleagues therefore expect to collect more published sequencing data from lung cancer ctDNA cohorts to further validate the model. Aiming to expand into additional solid cancers, the group has also begun gathering sequencing data from published studies on colon and breast cancer.
While Reiter said that he and his colleagues will be able to validate their predictions and improve the model for detecting relapse in NSCLC patients, he acknowledged that validating the method for screening purposes will take longer due to the challenge of getting access to large cohort sizes.
Reiter said he is unaware of any similar ctDNA-based models for early cancer detection, but noted that his colleagues at Stanford previously developed a deterministic model to search for circulating protein biomarkers that are associated with early-stage cancer.
Reiter's team is now interested in generalizing the model to additional characteristics of ctDNA, including copy number alterations and methylation biomarkers, as well as to other blood-based biomarkers such as circulating proteins.
"With current technology, tumors need to become fairly large before we can detect them from ctDNA in a liquid biopsy," Reiter said. "Analyzing more characteristics of ctDNA or adding other biomarkers, like proteins, can of course decrease these detection sizes."