NEW YORK – A group of researchers from the Georgia Institute of Technology Integrated Cancer Research Center is drawing on artificial intelligence and mass spectrometry to develop a test to help determine which patients may have ovarian cancer.
Described in a paper published last month in Gynecologic Oncology, the test utilizes artificial intelligence to look for correlations between different patients' metabolite patterns that might indicate ovarian cancer, said John McDonald, founding director of the ICRC and a professor at the university. According to the paper, the test has 93 percent accuracy in detecting ovarian cancer.
McDonald noted that because cancer is such a complex disease, aberrations that cause cancer may occur with mutations at the DNA level, as well as changes in gene expression, variations in proteins, or other molecular factors. However, "you're not going to find any single protein or biomarker" to detect cancer, especially because there is "tremendous heterogeneity among cancer patients on the molecular level," he said.
McDonald's team, instead, decided to use artificial intelligence to look for correlations and patterns in metabolites rather than detecting a specific biomarker. The team chose metabolites because they are "at the end point of all the molecular processes that are going on in cells, so it's very close to the phenotype of the patient, and so it embodies … changes on the DNA level, changes in gene expression, on the protein level, and so forth," he said.
To develop the test, the team took serum samples from hundreds of patients, ran them through mass spectrometry to identify individual metabolites, and then used artificial intelligence to find patterns that are common among patients with ovarian cancer.
The researchers used Thermo Fisher Scientific's Q Exactive Plus Hybrid Quadrupole-Orbitrap Mass Spectrometer for their work, McDonald noted. Samples were individually processed through two different columns and analyzed using two different ionization modes, which resulted in four distinct datasets, the researchers wrote in the Gynecologic Oncology paper. Reliable features were identified using recursive feature elimination combined with repeated cross-validation and "a relative ranking of features reflective of the relative frequencies of the features after repeated [cross-validation] iterations" was assigned, they added.
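The frequency-based ranking the paper describes can be illustrated with a minimal sketch. It assumes each repeated cross-validation iteration has already produced a set of selected features; the feature names and selections below are invented placeholders, and the recursive feature elimination step itself is not implemented here.

```python
from collections import Counter


def rank_by_selection_frequency(selections):
    """Rank features by the fraction of iterations in which they were selected."""
    if not selections:
        raise ValueError("need at least one iteration of selected features")
    # Count each feature at most once per iteration.
    counts = Counter(f for selected in selections for f in set(selected))
    n_iterations = len(selections)
    # Sort by descending frequency, then by name so ties are ordered stably.
    return sorted(
        ((feature, count / n_iterations) for feature, count in counts.items()),
        key=lambda item: (-item[1], item[0]),
    )


# Hypothetical output of four cross-validation iterations (placeholder names,
# not metabolites from the study).
selections = [
    {"m_101", "m_205", "m_330"},
    {"m_101", "m_205"},
    {"m_101", "m_412"},
    {"m_101", "m_205", "m_412"},
]

ranking = rank_by_selection_frequency(selections)
for feature, frequency in ranking:
    print(f"{feature}: selected in {frequency:.0%} of iterations")
```

Features that survive elimination in most iterations rise to the top of the list, which is one plausible reading of a ranking "reflective of the relative frequencies of the features."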
"A consensus classifier was constructed by aggregating the results of five independent machine learning classifiers … to generate predictive classification models," the team noted. The probabilities assigned to specific patients by the consensus model "were utilized to create a background distribution of probabilities that a given sample was cancer or normal."
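As a rough illustration of what aggregating several classifiers can look like, the sketch below averages the cancer probabilities that five models assign to one sample. Averaging is an assumption made for this example; the article does not say how the team combines its five classifiers, and the probabilities shown are made up.

```python
from statistics import mean


def consensus_probability(per_model_probs):
    """Average the cancer probabilities reported by each classifier.

    Simple averaging is an illustrative assumption, not the aggregation
    rule confirmed by the paper.
    """
    if not per_model_probs:
        raise ValueError("need at least one classifier probability")
    if any(not 0.0 <= p <= 1.0 for p in per_model_probs):
        raise ValueError("probabilities must lie in [0, 1]")
    return mean(per_model_probs)


# Hypothetical probabilities from five classifiers for one serum sample.
sample_probs = [0.75, 0.5, 1.0, 0.75, 0.5]
consensus = consensus_probability(sample_probs)
print(f"consensus probability of cancer: {consensus:.2f}")
```

Other aggregation schemes, such as majority voting or weighting each model by its validation performance, would fit the same description.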
The predictive machine learning model the researchers created is able to determine which metabolite patterns are more often associated with cancer. The model can also assign scores to each patient, with high scores indicating a patient is more likely to have ovarian cancer, and return a report to a clinician that shows where a patient's profile lies on a distribution map of patients with and without cancer. That information could then be used by the clinician to decide whether to move forward with confirmatory testing or continue with surveillance, McDonald said.
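One way to picture a report that shows where a patient falls relative to known cases is to compare the patient's score against scores from previously analyzed samples. The sketch below does this with invented numbers; the function name, the scores, and the way the comparison is summarized are all assumptions for illustration and do not represent the team's actual report.

```python
def fraction_at_or_below(score, background):
    """Return the fraction of background scores that are <= the given score."""
    if not background:
        raise ValueError("background distribution is empty")
    return sum(1 for b in background if b <= score) / len(background)


# Hypothetical consensus scores for previously analyzed samples.
normal_scores = [0.05, 0.10, 0.20, 0.30, 0.45]
cancer_scores = [0.40, 0.60, 0.75, 0.85, 0.95]

# Hypothetical score for a new patient.
patient_score = 0.70

vs_normal = fraction_at_or_below(patient_score, normal_scores)
vs_cancer = fraction_at_or_below(patient_score, cancer_scores)

print(f"score {patient_score:.2f} is at or above {vs_normal:.0%} of normal samples")
print(f"score {patient_score:.2f} is at or above {vs_cancer:.0%} of cancer samples")
```

A score that exceeds every normal sample while sitting inside the cancer range would, under these assumptions, point a clinician toward confirmatory testing rather than continued surveillance.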
The test is "not really telling the clinician 'your patient has cancer or not,' but it's a clinical aid in helping the clinician decide 'what is the likelihood that my patient has cancer, and based upon that I will decide what to do next,'" according to McDonald.
For the foreseeable future, McDonald said he believes each patient sample will need to be analyzed using mass spectrometry due to the "significant molecular level variability that exists between cancer patients with even the same clinical diagnosis." Potentially after thousands of samples have been analyzed, the researchers may be able to identify distinct subgroups of patients that can be characterized by a few individual metabolites that could form the basis of future clinical ELISA tests specific to each subgroup, he noted, but "we are a long way from that point."
The researchers are currently partnering with other US institutions to conduct prospective studies to make sure the model is accurate for all patients, he noted. The team's current goal is to validate that the method can work in different places with data collected on any instrument, and if that validation goes well, the test can move into utility studies. The team also believes it will be able to identify women with early stages of cancer who aren't showing any clinical symptoms, he added.
While the logistics of setting up the infrastructure to make the analysis available to clinicians will be another challenge, it will be "no more problematic than what was faced by the DNA sequencing industry as the technology was incorporated into medical practice," McDonald said.
The researchers have already started prospective studies in Atlanta and have identified other undisclosed collaborators who are willing to participate at their institutions. The team is aiming for a minimum of 1,000 additional patients and is trying to recruit women with a family history of ovarian cancer to increase the odds of enrolling patients with a higher likelihood of developing the disease, he added. Right now, the model includes data from about 1,000 patients from Canada, Georgia, North Carolina, and Pennsylvania.
"The more data we get, the more accurate the predictive algorithm will become because we're capturing more of the variability in the population," he said.
If the researchers are able to recruit a large enough number of patients, McDonald said, they may be able to start subdividing by race, ethnicity, and location to get more information about ovarian cancer patterns among different populations.
The field of proteomic ovarian cancer tests has seen its share of controversy, such as the ovarian cancer assay from Emanuel Petricoin and Lance Liotta that was later called into question over concerns about the test's data. McDonald said that he believes the two "were on the right track in their view that metabolic profiles are reflective of molecular changes operating on multiple molecular levels," but that their efforts failed for "several reasons: some their fault and some the fault of the somewhat simplistic view of the causes of cancer prevalent at the time."
McDonald added that he doesn't "think we can throw out the metabolomic approach just because it failed in the past." Strides in understanding both cancer and the limitations of mass spec analyses make a proteomic test for ovarian cancer more feasible than in previous years. "The bottom line is that if the analyses are done correctly (with the proper controls), and the computer analyses are done correctly (with the proper controls), I think it would be a mistake not to consider … this approach to cancer diagnostics."
In theory, McDonald said, the approach used for this test could be taken for other types of cancer or other complex diseases, and the team has already started preliminary work on cardiovascular disease with its method.
One potential hurdle with regulatory clearance and clinical application, McDonald noted, is that there is no single biomarker or set of biomarkers that is being detected. The mass spec can identify thousands of metabolites, but only about 7 percent of those have been characterized — the vast majority of what's being detected in this test is unknown, he said. "We know it's a metabolite, but we don't know what it is or what it's doing."
Although the researchers may only know what a fraction of the metabolites they're studying actually are, "for correlations, you don't have to know, all you have to do is get something that correlates" consistently with disease, he said. However, that might be an issue for the US Food and Drug Administration, which has traditionally wanted to know what particular biomarker or set of biomarkers a test is detecting before granting regulatory approval, McDonald added.
"A lot of people have the mentality that there's going to be a single diagnostic biomarker for cancer … but I think the fact of the matter is cancer is such a complex disease and there's so many different pathways to get to the same clinical disease that there's not going to be any biomarker, or even small group of biomarkers, that'll be 100 percent accurate," he said.
This test represents "a different kind of paradigm," he said, and that's partially why the team wants to perform studies at multiple centers: Getting the same results at multiple places on different instruments will mean the assay is consistent, and that's "one way to try to gain more confidence in this approach."