SAN FRANCISCO — Accelrys this week launched the NGS Collection for its Pipeline Pilot platform, becoming the latest informatics company to enter the next-generation sequencing market.
The product comprises a suite of more than 150 components that users can combine into different NGS processing workflows via a drag-and-drop interface.
With the offering, Accelrys joins a number of informatics firms that have recognized the significant market opportunity for NGS analysis. However, company officials said they view the product not as a rival to similar offerings but as a complement to currently available commercial software made by firms like CLC Bio, GenomeQuest, and Geospiza.
Cliff Baron, director of biology marketing for Accelrys, said existing NGS software packages are "purpose-built applications" designed to address a particular analytical need, such as assembly or variant detection.
"Our niche is different" because Pipeline Pilot is designed to "make easily configurable workflows very simply, very quickly, and to push those out to the groups that need them," he said.
Baron, who spoke with BioInform at Cambridge Healthtech Institute's Molecular Medicine Tri-Conference here this week, said that NGS users are likely to appreciate the flexibility of the system because "the algorithmic space is changing so rapidly."
"The analysis you do today, let's say on variant profiling, is probably not going to be exactly the same as the analysis you do three months from now," he said.
As a result, the company said it expects bioinformatics teams as well as end users to prefer a system that can rapidly string together new algorithms and analytical tools — from the open-source community to third-party providers — as their sequencing demands evolve.
"We see ourselves as working within [an informatics] ecosystem," Baron said. "We would never suggest that a customer get rid of software that they're already using."
For instance, Accelrys would aim to integrate a customer's existing third-party NGS software within the company's Pipeline Pilot environment.
Baron said that the company is targeting the new collection toward research groups with one or two sequencers and a limited bioinformatics staff. These include university core labs, medical-research institutes, and pharmaceutical, biotech, and ag-bio firms.
Accelrys estimates that there are approximately 400 such facilities worldwide.
'Unexpected' Interest
The NGS Collection also offers an enhanced query capability and a repository model that allows organizations to manage how they share NGS data sets. Both functions are expected to be important for Accelrys as it shops the product to its core biopharma customer base and tries to lure new research customers.
Accelrys derives most of its life-science revenue from biopharmas, but that sector has so far "been sitting on the sidelines" when it comes to bringing NGS technologies in-house, Baron conceded.
However, "we are seeing clear indications, and very frank discussions, about how they're going to be making the capital investments over the next 12 to 18 months to bring the capability in-house," he said.
"Either that, or they have very focused partnerships with service providers in which they're still going to need to be doing secondary analysis and tertiary analysis in-house," he added.
For instance, two of the top-five pharmaceutical firms are currently beta testing the NGS Collection and have "verbal agreements" with Accelrys to license the product, said Baron. He did not name the drug makers.
In addition, he said the platform is seeing "unexpected" interest from ag-bio firms, which could use it to support strain-development and trait-optimization research.
More notably, Baron said, the NGS Collection could enable Accelrys to gain a foothold in academia, which has traditionally eschewed commercial software in favor of in-house scripting.
But while that approach may work well for relatively small amounts of data, it cannot scale with the growth of NGS data volumes.
During a presentation at the CHI conference, Baron cited a study by the National Institute of Infectious Diseases in Tokyo that characterized a drug-resistant strain of Salmonella.
The study, published last year in Antimicrobial Agents and Chemotherapy, used more than a dozen different algorithms and applications, converted terabytes of data into several different formats, and required several months of data analysis.
As an exercise, he said, Accelrys replicated the same analysis in Pipeline Pilot. It took four hours to recreate the analysis workflow and 24 hours to process the data. Baron estimated that the software could reduce the development time of a typical NGS product by between 30 percent and 70 percent.
Under the Hood
The NGS Collection for Pipeline Pilot comprises more than 150 components, including methods for de novo assembly (Velvet and MIRA3), read mapping (BWA and Bowtie), variant detection, RNA-seq, and ChIP-seq.
It also supports Pipeline Pilot's own distributed processing system as well as third-party systems such as Platform LSF, Altair PBS, and Oracle Grid Engine.
The system supports data formats for sequencers such as Illumina's GA and HiSeq, Life Technologies' SOLiD, and Roche/454's GS. Future releases of the collection will support Ion Torrent and Pacific Biosciences platforms.
Scott Markel, Accelrys' principal bioinformatics architect, said that the system supports paired and unpaired reads in both base-space and color-space, and includes converters for common formats such as SAM, BAM, GFF3, and FASTQ.
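For readers unfamiliar with what a format converter of this kind does, the following Python sketch turns FASTQ reads into FASTA. It is an illustration only and is not Accelrys code: the function names parse_fastq and fastq_to_fasta are hypothetical, and the sketch assumes plain four-line, base-space FASTQ records rather than color-space data.

```python
from typing import Iterable, Iterator, Tuple


def parse_fastq(lines: Iterable[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality) from four-line FASTQ records.

    Hypothetical helper for illustration; assumes each record is exactly
    four lines: '@' header, sequence, '+' separator, quality string.
    """
    it = iter(lines)
    for header in it:
        header = header.rstrip("\n")
        if not header:
            continue  # tolerate blank lines between records
        if not header.startswith("@"):
            raise ValueError(f"expected '@' header, got: {header!r}")
        try:
            seq = next(it).rstrip("\n")
            plus = next(it).rstrip("\n")
            qual = next(it).rstrip("\n")
        except StopIteration:
            raise ValueError("truncated FASTQ record") from None
        if not plus.startswith("+"):
            raise ValueError(f"expected '+' separator, got: {plus!r}")
        if len(seq) != len(qual):
            raise ValueError("sequence and quality lengths differ")
        yield header[1:], seq, qual


def fastq_to_fasta(lines: Iterable[str]) -> str:
    """Convert FASTQ text lines to a FASTA string, dropping quality scores."""
    return "".join(f">{rid}\n{seq}\n" for rid, seq, _ in parse_fastq(lines))


if __name__ == "__main__":
    example = ["@read1\n", "ACGTACGT\n", "+\n", "IIIIHHHH\n"]
    print(fastq_to_fasta(example), end="")
```

A production converter would also need to handle quality-score encodings, paired reads, and binary formats such as BAM, which this sketch does not attempt.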