CHICAGO – This week, the US Food and Drug Administration announced the results of its latest PrecisionFDA challenge, the second Truth Challenge.
Sentieon won the challenge in the categories of difficult-to-map regions for Pacific Biosciences (PacBio) sequences, major histocompatibility complex (MHC) for PacBio, and MHC for sequences run on multiple technologies. The company was also in a three-way tie with Roche Sequencing Solutions and Google for best performance in all benchmark regions for multi-technology sequences.
Genomics data analytics firm Sentieon has, in many ways, built its reputation on these contests. In May, the San Jose, California-based company was named the top overall performer in PrecisionFDA's Brain Cancer Predictive Modeling and Biomarker Discovery Challenge. Sentieon now has participated in six PrecisionFDA challenges, all dealing with the accuracy of variant calling, and has won either overall or category honors in all of them.
PrecisionFDA based the second Truth Challenge on two Genome in a Bottle Consortium (GIAB) reference genomes. The program required entrants to process three FASTQ files, representing whole genomes of a mother-father-child trio, through their mapping and variant-calling pipelines to produce VCF files.
At the outset of the challenge in May, participants were given the recently updated long-read sequence of GIAB's reference sample HG002 (NA24385) pilot genome. Following the June 16 close of submissions, GIAB published updates to the genomes of the parents of HG002, known as HG003 and HG004, which the PrecisionFDA technology platform ran through each of the entrants' pipelines to judge the accuracy of the software.
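As a rough illustration of what judging variant-calling accuracy involves, the sketch below compares a set of called variants against a truth set and reports precision, recall, and F1. It is a simplified assumption, not the challenge's actual scoring method: real benchmarking tools also reconcile different representations of the same variant and restrict comparison to defined benchmark regions. The `Variant` type, the `score_calls` function, and the sample variants are all hypothetical.

```python
from typing import Dict, NamedTuple, Set


class Variant(NamedTuple):
    """Hypothetical minimal variant record (a subset of VCF columns)."""
    chrom: str
    pos: int
    ref: str
    alt: str


def score_calls(truth: Set[Variant], calls: Set[Variant]) -> Dict[str, float]:
    """Score calls against a truth set using exact-match comparison.

    Assumption: variants match only if chrom, pos, ref, and alt are identical.
    """
    tp = len(truth & calls)   # called and present in the truth set
    fp = len(calls - truth)   # called but absent from the truth set
    fn = len(truth - calls)   # in the truth set but not called
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall, "f1": f1}


# Made-up example data, for illustration only.
truth_set = {
    Variant("chr1", 1000, "A", "G"),
    Variant("chr1", 2000, "C", "T"),
    Variant("chr6", 3000, "G", "GA"),
    Variant("chr6", 4000, "TT", "T"),
}
call_set = {
    Variant("chr1", 1000, "A", "G"),
    Variant("chr1", 2000, "C", "T"),
    Variant("chr6", 3000, "G", "GA"),
    Variant("chr6", 5000, "A", "C"),  # false positive
}

result = score_calls(truth_set, call_set)
print(result)
```

In this made-up example, three of the four calls match the truth set, one call is a false positive, and one truth variant is missed.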
Brendan Gallagher, Sentieon's business development director, said that the accolades help the company establish trust in its software. "The PrecisionFDA platform has been a place to showcase our skills and build credibility," he said.
"I think it's just continued evidence of producing high-quality software tools in a world where all of our customers want to prove it to themselves as well, but it's nice to have an outside independent party [verify it]," Gallagher said of the challenges. "It lowers the hurdle for me to get a new customer to try the software and have faith in the software."
Sentieon, which was founded in 2014, produces a suite of bioinformatics secondary analysis tools for genomic data, including DNAseq and DNAscope for germline variant detection, and TNseq and TNscope for tumor-normal somatic variant detection. The products use the same methodologies as the Broad Institute's Genome Analysis Toolkit and MuTect software but are much faster and in some cases more accurate than the original tools.
In terms of number of core hours needed to process data, the company has said that DNAseq offers 10-fold faster variant calling from FASTQ files and a 20- to 50-fold increase in processing speed from BAM to variant call files over the standard BWA-GATK pipeline, without a corresponding increase in hardware requirements.
Since Sentieon's technology runs on cloud platforms, the speed saves money for customers by lowering the amount of computing time necessary to process each sample, according to Gallagher, a molecular biologist by training. "The big institutes, they'll complain about their Google [Cloud] bill, and one way to lower it is to use our software instead of the regular [GATK] software," Gallagher said.
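The link between speed and cloud spending can be sketched with simple arithmetic. The example below assumes that cost scales linearly with core hours and that hardware requirements stay the same, as the company claims. The 10-fold figure is the company's stated FASTQ-to-VCF speedup; the baseline core hours and the price per core hour are hypothetical placeholders, not published numbers.

```python
def estimated_core_hours(baseline_core_hours: float, speedup: float) -> float:
    """Core hours needed after applying a fold speedup to a baseline run."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return baseline_core_hours / speedup


def estimated_cost(core_hours: float, price_per_core_hour: float) -> float:
    """Compute cost, assuming billing is linear in core hours."""
    return core_hours * price_per_core_hour


# Hypothetical inputs, for illustration only.
BASELINE_CORE_HOURS = 100.0   # assumed cost of one sample on the standard pipeline
PRICE_PER_CORE_HOUR = 0.05    # assumed cloud price in US dollars
FASTQ_TO_VCF_SPEEDUP = 10.0   # the company's claimed fold speedup

fast_core_hours = estimated_core_hours(BASELINE_CORE_HOURS, FASTQ_TO_VCF_SPEEDUP)
baseline_cost = estimated_cost(BASELINE_CORE_HOURS, PRICE_PER_CORE_HOUR)
fast_cost = estimated_cost(fast_core_hours, PRICE_PER_CORE_HOUR)

print(f"baseline: {BASELINE_CORE_HOURS:.1f} core hours, ${baseline_cost:.2f} per sample")
print(f"faster:   {fast_core_hours:.1f} core hours, ${fast_cost:.2f} per sample")
```

Under these assumptions, the per-sample compute bill falls in proportion to the speedup; actual savings would depend on instance pricing, storage, and licensing costs not modeled here.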
When Sentieon was just getting off the ground, it downloaded GATK from the Broad. "That made my job easy," Gallagher said. "The Sentieon team painstakingly recreated the GATK algorithms but faster and without issues like thread dependency or downsampling, which causes run-to-run differences."
Gallagher, who joined the company in 2015, said that Sentieon was founded by a team of engineers who had been working together for nearly two decades to solve complex mathematical problems in several industries, including semiconductor manufacturing. Technology in those semiconductors helps power every iPhone in the world, he said.
"They wanted to apply their skills to bioinformatics and genomics," Gallagher said. He recalls that CEO Jun Ye was chatting with a colleague about the big-data technology they developed and realized it could be useful in the life sciences.
"The challenge in the life sciences versus the semiconductor chips is that a semiconductor chip is a Six Sigma protocol. If you're going to produce millions of these chips, you can measure [adherence and accuracy]," Gallagher said. Genomics, on the other hand, has more variation because it relies on bioinformatics pipelines that follow different processes and may produce different results for the same sequence.
Gallagher said that Sentieon always focuses on whatever complex math problems they can solve with computing power. The company often partners with others to extend its technology, particularly for clinical interpretation of genomics information.
Notably, Sentieon has built DNAseq, DNAscope, TNseq, and TNscope into BC Platforms' system for the integration, secure analysis, and interpretation of molecular and clinical information. Similarly, Qiagen has contracted with Sentieon to integrate DNAseq and TNseq into its CLC Genomics Workbench product as part of a wider partnership.
Earlier, Sentieon partnered with Fabric Genomics to develop structural variant and copy-number variant capabilities for genomic analysis and to build variant calling software for hereditary diseases and oncology with the aim of improving accuracy. The firm also supplies a series of genomic analysis algorithms to DNAstack, which helps to speed up and improve the accuracy of that firm's Workflows app.
Another major partner is Seven Bridges.
Gallagher said that Sentieon has other commercial partners on the clinical side that it has not disclosed. "There's a lot of companies that do clinical diagnostics that integrate our tools into their workflow," he said. "If they're doing the clinical interpretation, they might just offer ours as an option."
Gallagher said that Sentieon is taking its advances from the PrecisionFDA sample labeling challenge and the biomarker discovery challenge to develop new AI-based products for clinical trial modeling and drug discovery. Those might not be released until next year.