NEW YORK (GenomeWeb News) – At the Bio-IT World Conference last week, Intel unveiled a new website, dubbed Optimized Code, that offers access to versions of several popular open-source bioinformatics analysis tools that the company has optimized to run on Intel Xeon processors, with the aim of generating results faster and more efficiently than standard iterations of these solutions.
The first set of applications that the company has released specifically for genomic analysis includes optimized versions of the Broad Institute's Genome Analysis Toolkit; Blast algorithms for nucleotide- and protein-based sequence searching; BWA-ALN, software for mapping low-divergent sequences to a reference genome; and MPI-HMMER, protein sequence analysis software. The company has also released optimized code for AMBER and NAMD, both of which are used for simulating the molecular dynamics of biomolecular systems. The Intel website provides performance numbers for each of the optimized codes as well as directions for how to access and download the tools.
In a conversation with GenomeWeb at the Bio-IT World Conference held last week in Boston, Ketan Paranjape, Intel's general manager for life sciences, said that the company has several additional tools in its genomics optimization pipeline including RNA-sequencing analysis packages, such as TopHat and Cufflinks, which it hopes to release at a later date, pending approval from the respective authors. Other planned additions to the website will include mechanisms for users to report problems with the versions of the optimized code that Intel provides, as well as to suggest open-source solutions for potential optimization, he said.
The new website is just one of several moves in the life sciences and genomics space that Intel has made in the roughly four years since it launched its life science program. Intel set up the program, Paranjape told GenomeWeb, to look for opportunities to bring its products and expertise to bear on challenges in healthcare and the life sciences, with the ultimate goal of shortening the time to results and treatment options for patients from days, weeks, and months to a single day.
Thanks to improved technologies, clinicians and researchers now have access to increasing quantities of data that are available at a fraction of the time and cost that previous technologies required, Paranjape noted. That includes sequence information pouring out of ever faster and larger next-generation sequencing instruments; clinical information, such as imaging data, all stored in electronic medical records; and data from personal devices, such as Fitbits and other wearables.
As the biomedical community moves toward making precision medicine a reality, datasets from all these sources need to be properly collected, curated, consumed, and securely stored, he said, and there will be a need for tools to organize and manage the data as it runs through the pipeline. Moreover, as genomics moves into more routine clinical use, it will be important for researchers and clinicians to keep tabs on the pipelines, software packages, and versions used to analyze and make sense of their data. There are also various hardware options to consider, such as cloud computing versus high-performance systems, as well as ways of better leveraging the hardware that researchers already have at their disposal.
From Intel's perspective, the company has a number of products in its portfolio that it believes can address these challenges including processors, solid state drive storage options, and networking capabilities. The goal for the life sciences group, according to Paranjape, was to find ways to incorporate those solutions and other in-house capabilities and resources into existing genomics pipelines.
Besides optimizing existing open-source bioinformatics code, Intel has also been involved in the development of a high-performance appliance, sold by Dell, that helps reduce genomic data analysis times. Two of these appliances are currently used at the Translational Genomics Research Institute and Cleveland Clinic for DNA and RNA sequence data analysis, among other tasks, Paranjape said.
String of partnerships
Within the past four years, Intel has inked deals with some 40 companies across the healthcare and life science spaces including partnerships with a number of computational companies that offer tools for processing and exploring genomic data. One of its more recent partners is Edico Genome, developers and marketers of the Dragen Processor, a platform-as-a-service offering that consists of a chip loaded with proprietary NGS algorithms for mapping, aligning, and calling variants.
Dragen, which just received patent protection from the United States Patent and Trademark Office last week, was recently used to speed up analysis of a whole genome sequenced at 300x depth of coverage in a collaborative study involving researchers from Stanford and Harvard Universities. According to numbers provided by Edico this week, the system completed the analysis in approximately six hours compared to about 60 hours required by a BWA/GATK pipeline.
At Bio-IT, Edico announced that it was collaborating with Intel to combine Dragen with Intel's processors to enable real-time primary and secondary NGS analysis. Pieter van Rooyen, Edico Genome's CEO, told GenomeWeb in a conversation during the meeting that the combined infrastructure will enable Dragen chips to run more efficiently and scale as needed, resulting in greater cost savings to researchers. Moreover, since Dragen chips can also be plugged directly into sequencing machines, they can help increase the efficiency of data processing on the sequencing instrument itself, he said.
Besides Edico, Intel also has an arrangement with Curoverse, a Harvard spinout that was formed to commercialize Arvados, a platform for managing, processing, and sharing genomic and biomedical data developed at Harvard Medical School. Earlier this month, Curoverse launched a public beta to test cloud and on-premise commercial products that it has created based on the platform, a program in which Intel will provide support for the on-premise pilots. The company said at the same time that it was working with Intel to launch a bundled hardware-software offering that would include the Arvados appliance installed and optimized for Intel hardware.
Another recent Intel partnership is with bioinformatics consultancy BioTeam. Working alongside the National Institutes of Health's National Institute of Allergy and Infectious Diseases, Intel, BioTeam, and EMC provided high performance computing infrastructure and bioinformatics tools for the first of three planned African Centers of Excellence. Specifically, BioTeam provided a customized version of its SlipStream appliance loaded onto one of the systems from Intel's HP ProLiant DL Server product line.
Intel also has a partnership with Dutch software company Genalice that would make Intel the company's main hardware supplier while optimizing its Genalice Map software to run on Intel hardware. Last year, Intel also partnered with Stanford University spinoff Ayasdi to optimize Ayasdi Iris, a platform for drug discovery and development projects, to run on Intel hardware and improve performance.