CHICAGO – With last week's release of a free "community edition" of its variant calling benchmarking software, bioinformatics startup Truwl is trying to make a name for itself among individual researchers while also promoting its commercial business to the enterprise market.
The community edition allows any user to upload a VCF file and benchmark it against the Genome in a Bottle Consortium's HG002 reference sample to see how a given set of variant calls compares to the winners of the US Food and Drug Administration's PrecisionFDA Truth Challenge version 2. Truwl's president and cofounder, Karl Sebby, said the company included winners that use Illumina, Pacific Biosciences, and multiplatform sequencing technology to make the free version useful for as many people as possible.
Sebby described the free edition as a "pared-down, limited [version] for variant benchmarking."
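For readers unfamiliar with variant benchmarking, the Python sketch below illustrates the general idea of scoring a query VCF against a truth set. It is not Truwl's implementation: the function names `parse_vcf_lines` and `benchmark` are hypothetical, and the sketch assumes naive exact matching on chromosome, position, reference allele, and alternate allele. Production benchmarking tools also reconcile differing variant representations, genotypes, and confident regions, all of which are ignored here.

```python
from typing import Dict, Iterable, Set, Tuple

# A variant is identified here by (chrom, pos, ref, alt); this is a simplification.
Variant = Tuple[str, int, str, str]


def parse_vcf_lines(lines: Iterable[str]) -> Set[Variant]:
    """Collect variants from tab-delimited VCF text lines, skipping headers."""
    variants: Set[Variant] = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and '##'/'#CHROM' header lines
        fields = line.split("\t")
        chrom, pos, ref, alts = fields[0], int(fields[1]), fields[3], fields[4]
        # Multi-allelic records list ALT alleles separated by commas.
        for alt in alts.split(","):
            variants.add((chrom, pos, ref, alt))
    return variants


def benchmark(query: Set[Variant], truth: Set[Variant]) -> Dict[str, float]:
    """Compute precision, recall, and F1 by exact set comparison."""
    tp = len(query & truth)   # calls present in both sets
    fp = len(query - truth)   # calls only in the query
    fn = len(truth - query)   # truth variants the query missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall, "f1": f1}


# Made-up example records, for illustration only.
truth_lines = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t.\tPASS\t.",
    "chr1\t200\t.\tC\tT\t.\tPASS\t.",
    "chr2\t300\t.\tG\tA\t.\tPASS\t.",
]
query_lines = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.",
    "chr1\t200\t.\tC\tT\t50\tPASS\t.",
    "chr2\t400\t.\tT\tC\t50\tPASS\t.",
]

result = benchmark(parse_vcf_lines(query_lines), parse_vcf_lines(truth_lines))
print(result)
```

In this toy case, two of the three query calls match the truth set, one is a false positive, and one truth variant is missed, so precision and recall are both two-thirds.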
Truwl is fully hosted in the cloud to make the platform as accessible as possible to authorized genomics researchers who might not have a bioinformatics background, according to Sebby.
He sees academic researchers and workflow developers as potential users of the free release, adding that Truwl is looking for some early clinical partners who might need to validate and revalidate workflows when they install or replace a piece of lab equipment.
"We imagine this has a lot of use for [quality control] in the clinic, workflow validation, and then even monitoring as you're running your standards, to make sure everything about your tests is working properly," Sebby said. "It's just a simple way to get an idea of how you're doing."
Truwl released the full version of its platform late last year without much fanfare. That release includes GIAB's HG002, HG003, and HG004 reference samples, collectively known as the "Ashkenazi trio" of parents and child of Eastern European Jewish ancestry, though it is built to be flexible.
"You can put in a query VCF and you can choose what you want to compare against," Sebby said, demonstrating the platform in a video conference with GenomeWeb. He said that the firm has loaded about 100 regions of the reference genomes onto its platform, though the technology can theoretically support any truth set.
The target market for the more comprehensive pay-per-use version is clinical labs. Truwl is strictly a research tool now, but Sebby said that the company is working on a bit of a rebranding to reach commercial genomic labs that might be part of precision medicine efforts.
Sebby said that Truwl is reworking its website so the home page is clearer that the company can support enterprise-scale benchmarking using private data. The update should appear within a few weeks.
Whitefish, Montana-based Truwl first emerged in November 2019 when it announced that it had received a six-month, $149,959 Small Business Technology Transfer grant from the US National Human Genome Research Institute to adapt its web-based genomic data analysis methods platform for computational analyses developed for the Encyclopedia of DNA Elements, or ENCODE, project.
The technology platform was called Truwl, though the company was known as XD Bio then. The legal name remains XD Bio, but the firm now does business under the Truwl name.
For the NHGRI grant, the startup collaborated with the ENCODE Data Coordinating Center at Stanford University to expand Truwl to contain a new interface with dynamic forms for specifying pipeline inputs. The partners created a back-end service to execute and monitor computational processes, and built several Google Cloud Platform compute instances that they configured with the ENCODE reproducibility framework.
Sebby called that project a proof of concept for running ENCODE pipelines. "ENCODE had done a good job of making all the data public … but people were still struggling to use some of the methods that they had developed, which was another important output of [the Truwl-Stanford] project," he said.
Nearly two years after the NHGRI grant ran out, Truwl has most ENCODE workflows on its platform, as well as some pipelines from the BioWDL project and the Broad Institute's Genome Analysis Toolkit, available on a pay-per-use basis.
Sebby and Joby Rudolph, cofounder and chief technology officer, actually incorporated XD Bio in 2016, but kept the company as a side project while they did other jobs until the NHGRI grant opportunity came along in 2019. Truwl raised a $1.7 million seed round in December 2020, led by Montana-based Two Bear Capital.
The company still has just five full-time employees, three of whom are in Whitefish, with the others working remotely. Montana is not exactly a hotbed for bioinformatics, but there is at least one other established firm there, Bozeman-based Golden Helix.
Sebby said that the decision to make part of the technology free was in some ways an homage to David Mittelman, CEO of forensic genomics firm Othram.
Mittelman, who has been a strategic advisor to Truwl, was cofounder and CSO of Arpeggi before that startup was sold to Gene by Gene in 2013. Arpeggi created a benchmarking tool called GCAT, short for Genome Comparison and Analytic Testing, and collaborated with the GIAB Consortium on benchmarking development.
GCAT eventually disappeared after Gene by Gene absorbed Arpeggi. "It was something that we think the community had missed, just being able to simply put in a variant-call file and have it do a benchmark and see how you're doing," Sebby said.
Truwl hopes to distinguish itself from other genomic benchmarking platforms by making benchmarking more accessible to individual researchers, and by automating the process. Sebby said that Truwl's benchmarking of public datasets is public, so it can be reached from a Google search, though the company also can compare private data.
Updates to such evaluations based on new medical knowledge are automated as well, so users do not have to transfer reports to spreadsheets to make comparisons, according to Sebby.
Right now, Truwl's workflows follow the Workflow Description Language standard, but the firm is adding support for another framework called Nextflow. "We want to be a universal workflow runner," Sebby said. "You should be able to access all sorts of workflows from the same platform."
Eventually, Sebby would like to support the four major workflow programming languages in bioinformatics, also including Snakemake and Common Workflow Language, but that is not an immediate concern. In the short term, Truwl will be looking to attract enterprise customers by beefing up its platform security.
Truwl has not disclosed the names of any enterprise users. "Most of our customers have been academic-use individuals," Sebby said, noting that such users may only need to benchmark a few samples at a time. "The fewer samples you need to run, the more pain it is to stand up a whole workflow-running system, so just being able to come run 10 or 100 samples has been a huge timesaver for them," he said.