CHICAGO – OmniTier on Tuesday introduced CompStor Insight, a specialized appliance for tertiary analysis of next-generation sequencing data. The developer of application-specific, high-performance data products announced the product in conjunction with the opening of the virtual American Society of Human Genetics (ASHG) conference.
CompStor Insight is targeted at advanced research and clinical users who are looking to understand the relationships between multiple genomes, including users who are trying to diagnose rare diseases.
"How do you take the 4 million variants that typically come out of an NGS run and try to make sense and bring it down to a reasonable number that can be analyzed?" asked CTO Jon Coker. "That's a part of the Insight function."
The company, with twin headquarters in Milpitas, California, and near Mayo Clinic in Rochester, Minnesota, said that CompStor Insight will be generally available to pharmaceutical and research organizations starting in December.
As an appliance, Insight combines OmniTier's proprietary software with commercial, off-the-shelf computing technologies. "Anybody can get [these computing systems], but they are very specialized," with large solid-state drives (SSD), lots of dynamic random-access memory, and enough compute power to handle genomic analysis, Coker said.
"Our basic technology depends very much on SSD, and the way that works with the algorithms, we elected to make it an appliance," Coker said. "The advantages of how we use our memory technologies I think will come through in how many users can use the same systems, how that distributes the cost of doing this kind of analysis in a very nice way."
This approach facilitates easy setup, according to OmniTier sales and marketing VP Justin Cowling.
"You take it out the box, you install it into the server rack, and within 30 minutes you're out there analyzing your first genomes," Cowling said. "People don't need to have IT or software engineering or cloud engineering skills to start running pipelines. It really is designed for ease of use for the scientists and researchers."
OmniTier, which was founded in May 2015, is marketing its appliances for local deployment so terabytes of sequencing data do not have to be moved around, but Coker said there is no reason why the technology could not be installed in a public or private cloud.
The new CompStor Insight product is designed as a complement to the CompStor Novos appliance — formerly called CompStor Assembly — which performs de novo DNA assembly and reference alignment of short-read NGS data. Coupled with Novos, Insight completes an end-to-end analytics pipeline, from raw sequencer output to variant calling to annotation, the company said.
CompStor Novos currently supports Illumina, BGI, and Pacific Biosciences sequencing. Coker said that the product will be able to support Oxford Nanopore data by some point in 2021.
Like Novos, CompStor Insight is built on the company's tiered memory architecture, called MemStac. "We are, in keeping with our CompStor idea, multinode, multiuser, and use a lot of SSD technology to get the gains that we have," Coker said.
Those gains include a 7X acceleration in annotation speed over the benchmark Ensembl Variant Effect Predictor (VEP), according to internal OmniTier testing data. For the comparison with Ensembl VEP, CompStor Insight called variants against the Genome in a Bottle reference, but Coker said that the software can call against any truth set.
The company said that typical open-source and lab-developed applications for tertiary analysis of genomic data "often exhibit low level of functional automation and require long [processing] time to actionable data that does not scale with more server nodes."
Coker touted the company's deep learning — which he defined as the training of neural networks on known data so they can analyze unknown data — as a differentiator in terms of variant calling. He also highlighted Insight's front-end design, calling the product "essentially a user interface experience instead of a command line," which provides annotations over multiple nodes and for multiple users to support rapid analysis of those annotations. "I think that's our key attribute," Coker said.
Insight draws from multiple public knowledgebases, including ClinVar, the Genome Aggregation Database (gnomAD), and the Tohoku Medical Megabank Organization's (ToMMo) Japanese Multi Omics Reference Panel (jMorp). The latter attracted the attention of Tokyo-based Juntendo University's Graduate School of Medicine because the Broad Institute's Genome Analysis Toolkit (GATK) does not support the ToMMo dataset, Coker said.
Juntendo University helped OmniTier develop CompStor Insight.
"Our researchers are very busy because we need to analyze and interpret complex variants of many patients," Kazuhiro Nitta, a lecturer at the Japanese med school, said in a statement provided by OmniTier. "With the level of automation and sophisticated filtering that is supported by CompStor Insight, we estimate we could reduce the amount of time for downstream analysis of multi-genomic datasets dramatically."
The Juntendo School of Medicine already uses Novos for secondary analysis of whole-exome and whole-genome sequences.
"They're looking for a way to [address] the rare-disease problem, as well [as] doing GWAS-type studies, populational frequency studies based on their own population using Insight," Coker said.
Belgian nanoelectronics firm Imec also introduced a product this month that offers significant speed gains over GATK, though that offering, elPrep 5, is more in the realm of secondary analysis.
In August 2018, OmniTier shared some results from a joint study with researchers at the Mayo Clinic's Center for Individualized Medicine that demonstrated the efficacy of CompStor Novos and the compute cluster behind it for generating variant caller-ready de novo genome assemblies quickly and at low cost.
For the Mayo study, which used the NA12878 genome dataset, the health system and OmniTier said that they used CompStor to complete a variant caller-ready assembly at 50x coverage in less than two hours. The technology vendor claimed then that the technology can scale up to 800x coverage, which will make it possible to reliably identify new and infrequent variants from de novo assemblies.
CompStor Novos performed comparably to standard alignment-based approaches but sidestepped the reference bias that plagues these approaches, according to OmniTier. Furthermore, the assemblies generated by CompStor can be used to call all types of variants, but where the technology really shines is in identifying complex variants such as structural variants, Coker told GenomeWeb in 2018.
Mayo and OmniTier later compiled the results into a paper about CompStor Novos that has been on the bioRxiv preprint server since December 2018.
At that time, Coker in some ways compared CompStor Novos to Dragen, an Illumina-owned brand that uses field-programmable gate array (FPGA) technology in combination with proprietary software algorithms to reduce genomic data footprint and enable faster speeds.
Coker this month said that development of software on FPGAs takes longer than the approach OmniTier has chosen. The company is able to add new features to its existing software at least once a quarter, often more frequently.
OmniTier supports multiple sequencers and types of sequencing, whereas Dragen is best suited for whole-genome sequencing from Illumina instruments. One new feature in the works is the planned support for Oxford Nanopore, Coker noted.
CompStor and Dragen are similar in terms of speed. "We hold our own despite being a … small company, with respect to quality of variant calling," Coker added.