NEW YORK (GenomeWeb) – Edico Genome has released a new bisulphite sequencing pipeline for its Dragen Bio-IT processor, with additional pipelines for microbiome, RNA-seq, and cancer data to follow shortly. The company is also developing a graphical user interface that will offer tools for tasks such as automating data transfer and pipelines on its processor, as well as a software development toolkit that will enable customers to install and run their own tools on the company's processors.
Edico also announced this week that it has added HudsonAlpha to its client list. The institute has added a Dragen processor to its next-generation sequencing workflow to process sequencing data from both its clinical and research arms. As part of the integration process, Edico worked with researchers at the HudsonAlpha Institute for Biotechnology to develop and install a customized data management system that enables automated data transfer between systems within the sequencing workflows and also generates reports at the close of analysis, among other tasks.
The added system was crucial for HudsonAlpha because of the scale of data that it will analyze and also because it is using the processor for both research and clinical use.
"When we leased the system from Edico, it really came as a command line interface where you could run one command, and when that one finished, you could queue up another command and have that run," Shawn Levy, director of the HudsonAlpha Genomic Services Laboratory, explained to GenomeWeb.
The data management system puts a wrapper around the processor that allows users to make better use of the processor's low compute overhead, high-speed performance, and efficient use of compute and storage. Essentially, "[it] sets it up for rapid scale production use," he said.
The customized addition to Dragen gives HudsonAlpha tools for quality control, including monitoring the health of the system and providing detailed log and error files; for scheduling sequencing runs; and for moving data off sequencing instruments for processing and on to storage, according to Levy. The system also includes tools for generating reports for both clinical and research contexts, Levy said, including lists of identified variants along with relevant summary annotations as well as standard metrics such as coverage depth and percent of bases at given q-scores.
Edico's planned GUI and tool suite will offer many of the same capabilities that the HudsonAlpha system does, enabling customers to perform tasks such as scheduling multiple workflow runs, reviewing analysis results and system performance, comparing pipelines, and managing multiple networked Dragen cards and system updates. The company is also developing an application programming interface and software development toolkit that will enable customers to run their own NGS analysis pipelines on the Dragen platform. The API will also provide access to accelerator blocks for Smith-Waterman alignment and hidden Markov models, as well as for BCL conversion, Gzip/CRAM compression, and encryption, according to the company's website.
The decision to add the aforementioned tools to Dragen is rooted in the ongoing shift from genomics for purely research uses towards more clinical use cases, according to Gavin Stone, Edico's senior director of marketing. Earlier iterations of Dragen were designed with more of a research-oriented user in mind, who would typically have sufficient expertise to run tools using the command line. While the vast majority of the company's customers remain on the research side, Edico is now seeing increased interest from more clinically-minded users who prefer more user-friendly interfaces and who need to run standardized pipelines with set parameters to ensure reproducibility across samples, Stone told GenomeWeb.
"We've worked with customers like HudsonAlpha to develop their own customized versions," he said. In the case of firms like Sequenom, which already have in-house systems in place, Edico has worked with them to connect their tools to Dragen.
The exact dates for when these tools will be rolled out to customers have yet to be announced. For now the company is continuing to test and validate these tools internally as well as gathering feedback from select customers on how these tools could benefit them, Stone said.
Meanwhile, Edico continues to build its customer base both in academia and industry. This past May, PerkinElmer purchased a Dragen processor to analyze whole-genome and whole-exome sequencing data in its CLIA-certified high-throughput sequencing facility. Edico also recently participated in a collaborative study with researchers at Harvard and Stanford Universities, where its system was used to analyze a whole human genome at 300x depth of coverage in approximately six hours.
For its part, HudsonAlpha is using Dragen and Edico's Genome pipeline to analyze data generated by its Genomic Services Laboratory as well as its Clinical Services Laboratory, which performs whole-genome sequencing for patients with genetic conditions. HudsonAlpha is also using the Dragen processor to support its Clinical Sequencing Research Consortium project — a joint initiative of the National Human Genome Research Institute and the National Cancer Institute — which uses exome sequencing to help diagnose developmental delays in children. The Dragen genome pipeline includes optimized algorithms for BCL conversion, compression, mapping, alignment, sorting, duplicate marking, haplotype variant calling, and joint genotyping.
Levy told GenomeWeb that with the Dragen system, the average time required for analysis from completing chemistry on the sequencer through to generating the VCF file is 40 minutes including data migration time. Edico claims similar numbers as benchmarks for whole-genome analysis using its processor. It reports that Dragen requires 12 minutes to convert Illumina BCL base call files to Fastq files and 28 minutes to go from Fastq files through to the final list of variant calls.
Hoping to reduce those numbers, Edico teamed up with Intel earlier this year in a partnership to pair Dragen chips with Intel processors. When combined, the Edico and Intel technologies can analyze a whole genome in approximately 20 minutes, Edico said at the time.
Dragen's speed and low compute overhead were the chief reasons that HudsonAlpha opted to lease the processor for its needs. Levy said in a statement that the institute's new Illumina HiSeq X Ten sequencing system has more than tripled the number of whole genomes sequenced to just over 15,000 per year. He told GenomeWeb that the institute expects to generate about 1,000 genomes per month.
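As a rough back-of-the-envelope check on the figures quoted above, the short Python sketch below adds up the two reported processing stages and estimates how busy a single processor would be at about 1,000 genomes per month. The 12-minute, 28-minute, and 1,000-genome figures come from the article; the assumptions that genomes are processed one at a time on a single processor and that a month has 30 days are illustrative only and were not stated by Edico or HudsonAlpha.

```python
# Figures quoted in the article.
BCL_TO_FASTQ_MIN = 12      # Illumina BCL files to Fastq files
FASTQ_TO_VCF_MIN = 28      # Fastq files to final variant calls
GENOMES_PER_MONTH = 1_000  # HudsonAlpha's expected output

# Illustrative assumption, not from the article.
DAYS_PER_MONTH = 30


def minutes_per_genome() -> int:
    """Total per-genome analysis time as the sum of the two reported stages."""
    return BCL_TO_FASTQ_MIN + FASTQ_TO_VCF_MIN


def monthly_processing_hours(genomes: int, minutes_each: int) -> float:
    """Hours of processing needed per month if genomes run one at a time."""
    return genomes * minutes_each / 60


def utilization(genomes: int, minutes_each: int, days: int) -> float:
    """Fraction of the month a single processor would spend busy."""
    return (genomes * minutes_each) / (days * 24 * 60)


if __name__ == "__main__":
    per_genome = minutes_per_genome()
    hours = monthly_processing_hours(GENOMES_PER_MONTH, per_genome)
    busy = utilization(GENOMES_PER_MONTH, per_genome, DAYS_PER_MONTH)
    print(f"Per genome: {per_genome} minutes")
    print(f"Per month:  {hours:.1f} hours of processing")
    print(f"Single-processor utilization: {busy:.1%}")
```

Under these assumptions, the two stages sum to the 40 minutes Levy cited, and 1,000 genomes per month would keep a single processor busy for most of the month, which is consistent with the emphasis on automated scheduling and data transfer.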
Prior to selecting Edico, HudsonAlpha contemplated using cloud infrastructure to handle its data analysis needs or simply beefing up its existing systems. However, Edico proved to be the more cost-effective option for sequence alignment and variant calling tasks compared to using the cloud, according to Levy. It also made more sense to use Dragen, which is tailor-made for running specific analysis tasks quickly at low compute overhead, rather than to add on to existing infrastructure. Moreover, investing in Dragen frees up existing compute capacity for use in other research activities outside of alignments and variant calls, he added.