NEW YORK (GenomeWeb) – A recent paper from researchers in the Jackson Laboratory for Genomic Medicine's clinical genomics group describes the JAX Clinical Knowledgebase (JAX-CKB), a structured, curated repository that connects clinically actionable tumor variants to information on phenotype, protein effect, therapeutic strategies and options, and clinical trials.
The paper, which was published in Human Genomics last month, also describes the mechanisms that the system uses to transform input sequences into standardized formats and to map them to the knowledgebase. The researchers also shared query mechanisms for pulling information related to specific molecular alterations from the database, such as a list of targeted therapies based on associated drug class or relevant open clinical trials, and showed how to use the database to explore the oncology clinical trial landscape.
JAX-CKB is part of a larger bioinformatics and curation solution called the JAX Clinical Genome Analytics (JAX-CGA), which Jackson Lab uses for identifying, annotating, and reporting cancer variants. The JAX-CKB repository underlies the JAX-CGA and houses clinically actionable data relevant to the JAX Cancer Treatment Profile, a 358-gene targeted sequencing panel that the lab developed for testing samples from solid tumors. Both the test panel and JAX-CGA are described in detail elsewhere. The repository hosts manually curated information pulled from public databases such as PubMed and ClinicalTrials.gov, and relies on standard ontologies, controlled vocabularies, and human gene and variant naming conventions, all of which support interoperability with other databases and ensure consistency in clinical reports.
The Jackson Lab team published the paper to disseminate its methods more broadly to labs that offer next-generation sequencing-based panels and might need help with the clinical interpretation components of their pipelines, Susan Mockus, manager of clinical analytics and curation at Jackson Lab and one of the authors on the paper, told GenomeWeb. She explained that the lab initially decided to develop its own internal repository because it wanted to be able to generate clinical reports in a timely fashion. "That's really how the knowledgebase began," she said. "And then we came across a lot of decisions and hurdles that we had to figure out ... and so we wanted to share those methods with the community and how we do some of these things."
One of those hurdles was ensuring consistency in the way genes and variants are named, Mockus said. "That's really important in the community because [if] we don't call all genes the same [and] we don't name variants the same even when they are ... we can't find them to compare findings from either clinical or pre-clinical studies," she said. "So that was the other point in trying to [get] the community in this space to share ideas."
They also needed a sustainable system that could easily integrate new data with existing information as it became available. For example, if a researcher finds a new missense mutation in a particular gene that turns out to be an important marker, the information can be added to the system without requiring changes to the existing knowledgebase, Mockus said. Part of this updating process is automated. For example, the researchers have a tool that searches ClinicalTrials.gov for changes in the recruitment status of trials, and they have tools that parse databases such as PubMed for efficacy evidence on new drugs or new information on unknown variants. The researchers still review the evidence before adding it to the database to ensure its quality.
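The recruitment-status check described above can be pictured as a comparison between two snapshots of trial records. The following Python sketch is purely illustrative and is not the Jackson Lab tool: the function name, the trial identifiers, and the status strings are all assumptions made for the example, and it does not query ClinicalTrials.gov.

```python
def recruitment_changes(previous, current):
    """Return {trial_id: (old_status, new_status)} for trials whose
    recruitment status differs between two snapshots.

    Both arguments are plain dicts mapping a trial ID to a status string.
    A trial that is new in `current` is reported with an old status of None.
    """
    changes = {}
    for trial_id, new_status in current.items():
        old_status = previous.get(trial_id)
        if old_status != new_status:
            changes[trial_id] = (old_status, new_status)
    return changes


# Hypothetical snapshots; the IDs and statuses are placeholders, not real trials.
previous = {
    "TRIAL-EXAMPLE-1": "Recruiting",
    "TRIAL-EXAMPLE-2": "Not yet recruiting",
}
current = {
    "TRIAL-EXAMPLE-1": "Completed",
    "TRIAL-EXAMPLE-2": "Not yet recruiting",
    "TRIAL-EXAMPLE-3": "Recruiting",
}

# Changed or newly seen trials would then go to a curator for review.
flagged = recruitment_changes(previous, current)
```

In a workflow like the one Mockus describes, the flagged entries would be a starting point for human review rather than an automatic update.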
Equally important was creating a system that could build molecular profiles capable of capturing a tumor's complexity. A molecular profile in the JAX-CKB can contain one or more genetic changes including SNPs, frameshifts, insertions and deletions, gene fusions, copy number variations, and/or changes in expression levels, according to the paper. JAX-CKB incorporates this and other pieces of information as data elements named using standards such as the Human Genome Variation Society nomenclature guidelines for genes and variants, making it possible to connect the data in different ways.
The ability to build complex molecular profiles also makes it possible to match therapies to multiple variations at the same time rather than to each variant individually, the researchers wrote. Furthermore, the system can capture complex treatment regimens that involve multiple drugs and disease pathways. "More and more we are seeing that targeted therapies don't work as mono-therapies or patients don't respond to them in a sustained fashion; so we take and dump in multiple therapies targeting different pathways to prevent that from happening," Mockus said. "The database [is] already built to capture [that] complexity." JAX-CKB currently contains 1,108 unique targeted therapies relevant to treatment approaches or clinical trials for variants related to the lab's 358-gene panel.
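One way to picture profile-level matching is to treat a molecular profile as a set of variants and a therapy entry as applicable when every variant in its profile is present in the tumor. The Python sketch below works under that assumption only; it is not the JAX-CKB schema or query logic, and the class names, gene labels, variant labels, and drug names are invented placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str    # gene symbol (placeholder values used below)
    change: str  # variant description (placeholder values used below)


@dataclass(frozen=True)
class ProfileEntry:
    variants: frozenset  # the molecular profile: one or more Variant objects
    therapies: tuple     # one or more drugs, allowing combination regimens


def match_therapies(tumor_variants, entries):
    """Return entries whose whole profile is present in the tumor,
    most specific (largest) profiles first."""
    tumor = frozenset(tumor_variants)
    matched = [e for e in entries if e.variants <= tumor]
    matched.sort(key=lambda e: len(e.variants), reverse=True)
    return matched


# Hypothetical variants and therapies, for illustration only.
a = Variant("GENE_A", "variant 1")
b = Variant("GENE_B", "variant 2")
c = Variant("GENE_C", "variant 3")

entries = [
    ProfileEntry(frozenset({a}), ("drug X",)),
    ProfileEntry(frozenset({a, b}), ("drug X", "drug Y")),
    ProfileEntry(frozenset({c}), ("drug Z",)),
]

# A tumor carrying both A and B matches the combined profile first,
# then the single-variant profile; the GENE_C entry does not match.
results = match_therapies([a, b], entries)
```

Sorting by profile size is a design choice made for the example: it surfaces the entry that accounts for the most co-occurring alterations, which is the case where a combination regimen is most likely to be recorded.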
One of the benefits of creating a bespoke repository is that Jackson Lab could tailor it to its internal testing panel and control the curation and filtering of the data. "The challenge with commercial entities selling these kinds of knowledgebases is that they have a very broad focus and they are trying to address multiple customers," Mockus explained. "We really focus on connecting clinical efficacy evidence to mutations. And so I don't want to know every paper published on that mutation and that particular drug. Just give me the best clinical data."
Besides enabling quicker and more accurate clinical reporting, the database can shed light on the state of the oncology clinical trials landscape, the researchers showed in the Human Genomics paper. That's important because "there are currently no publicly available comprehensive databases that curate information on molecular eligibility for clinical trials, [and information on] clinical trials recruiting on molecular criteria is not readily accessible through clinical trial registries, such as clinicaltrials.gov," the researchers wrote. JAX-CKB fills this gap and offers an opportunity to "evaluate strengths and weaknesses in current molecular targeting in oncology" as well as discover potential opportunities for research and development, they wrote.
"This is an area that's really a huge deficit in the precision medicine field in that the registry doesn’t have a good way to capture ... those trials that are recruiting on patients with particular mutations," Mockus said. "We spent a lot of time manually curating that data in a very well-defined regular expression system."
According to the paper, Mockus et al. found that one of the most heavily represented tumor types in clinical trials is non-small cell lung carcinoma and that Pan-VEGFR inhibitors are the most popular drug class being studied in trials, with about 378 trials involving these treatments. Their analysis also showed that clinical trial researchers now more often recruit patients with particular mutations, with variations in the ERBB2 and ESR1 genes among the top candidates, and the team was able to identify which specific mutations in these genes are of interest to clinical trial researchers.
"I think this analysis would be interesting to run over time to really assess where the field's at in certain areas and where it's going," Mockus said. It could also help investigators interested in oncology identify niche research areas in the space to take on, she added.