Pervasive Software said this week that it has developed an implementation of the Smith-Waterman algorithm that demonstrated a throughput of nearly a trillion cell updates per second — an order of magnitude greater throughput than earlier Smith-Waterman algorithm performance records.
Pervasive said that the implementation, based on its Pervasive DataRush platform, analyzed 10 million combinations of protein sequences in 81.1 seconds on an SGI Altix UV 1000 with 384 cores. The company said that it was able to achieve a sustained throughput of 986 giga cell updates per second (GCUPS), which it claims surpasses other implementations of the algorithm on a standard CPU architecture by 43 percent.
The company has not released further benchmarking details, but said that it used protein sequences ranging in length from 1,000 to 8,000 amino acids.
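As a rough consistency check on the reported figures, the short Python sketch below multiplies the stated throughput by the stated run time to estimate the total number of cell updates, then divides by the number of comparisons. The square-matrix assumption (both sequences in a comparison having the same length) is ours, since the company has not disclosed the actual length distribution.

```python
# Back-of-the-envelope check; all three inputs are the figures reported above.
GCUPS = 986                 # sustained giga cell updates per second
SECONDS = 81.1              # reported run time
COMPARISONS = 10_000_000    # reported number of sequence combinations

# Total dynamic-programming cells updated during the run.
total_cells = GCUPS * 1e9 * SECONDS

# Average number of cells per pairwise comparison.
cells_per_comparison = total_cells / COMPARISONS

# Assumption (ours): each comparison fills a square matrix, so the
# implied typical sequence length is the square root of the cell count.
implied_length = cells_per_comparison ** 0.5

print(f"total cell updates:      {total_cells:.3e}")
print(f"cells per comparison:    {cells_per_comparison:.3e}")
print(f"implied sequence length: {implied_length:.0f}")
```

Under that assumption the implied typical length comes out at roughly 2,800, which falls inside the stated 1,000 to 8,000 range, so the reported numbers are at least mutually consistent.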
Pervasive said that its implementation hit 91 GCUPS on a standard 32-core Intel machine. The company initially implemented the algorithm on a 4-core system and said it was able to scale to the 384-core system "effortlessly" and without any additional changes to the algorithm.
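To put the two reported results side by side, the following Python sketch computes throughput per core for each machine. It is indicative only: the two systems use different hardware, and the company has not published a like-for-like scaling curve, so the ratio below is not a formal parallel-efficiency measurement.

```python
# Reported results as (core count, GCUPS). The machines differ, so the
# comparison is a rough indication rather than a controlled scaling test.
runs = {
    "32-core Intel machine": (32, 91.0),
    "384-core SGI Altix UV 1000": (384, 986.0),
}

def per_core(cores, gcups):
    """Return throughput per core in GCUPS."""
    return gcups / cores

small = per_core(*runs["32-core Intel machine"])
large = per_core(*runs["384-core SGI Altix UV 1000"])

# Ratio of per-core throughput on the large system to the small one.
retained = large / small

for name, (cores, gcups) in runs.items():
    print(f"{name}: {per_core(cores, gcups):.2f} GCUPS per core")
print(f"per-core throughput retained at 384 cores: {retained:.0%}")
```

On these figures, the 384-core run retains about nine-tenths of the per-core throughput seen on the 32-core machine.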
DataRush wasn’t created specifically for the life sciences industry, Davin Potts, director of product management for Pervasive DataRush, told BioInform. He described the platform as "a general dataflow parallel computing platform" that lets users develop applications that can run on multicore processors and symmetric multiprocessing systems that have multiple central processing units.
According to the company, the tool can be used for various applications like data mining, predictive analytics, sales optimization, and marketing analytics, among others.
"It helps [users] develop applications so they can take advantage of the hardware without requiring any special background in developing multithreaded applications," Potts said. "DataRush makes it possible for people without expert knowledge to develop applications that scale very well on those types of hardware."
Alison Raffalovich, director of corporate marketing and communications, told BioInform that with the DataRush platform, the company saw "the opportunity to really tackle applications that could scale to handle very large quantities of data."
The company, which generated about $47 million in revenues for its 2010 fiscal year ended June 30, reinvests between 23 percent and 25 percent of its revenue into research and development initiatives.
DataRush grew out of an "innovation initiative" that Pervasive kicked off around four years ago "when we started thinking about the proliferation of multicore and the need for developers to be able to create applications that could fully scale on a multicore without having highly specialized knowledge of several apps," she said.
DataRush has a Java framework that allows developers to quickly create parallel data-intensive applications. It's based on the DataRush Parallel Dataflow Engine, which simplifies the parallelization process, enabling developers to build scalable applications without any knowledge of parallel programming processes, such as threading, concurrent memory access, deadlock detection, data workload partitioning, or other aspects of parallelization.
Although the company has been around for more than two decades, the life sciences market is a "newer one" for the firm, Raffalovich noted, adding that out of all the products the company offers, the DataRush platform is "probably best suited" for life science researchers, who often have to manage and analyze large quantities of data.
Ray Newmark, vice president of sales and marketing for DataRush, told BioInform that Pervasive opted to implement Smith-Waterman in its platform "because it’s a well-known algorithm that was straightforward for us to implement in DataRush."
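For readers unfamiliar with the metric, a "cell update" is one evaluation of the Smith-Waterman recurrence for a single cell of the alignment matrix. The Python sketch below is a textbook, single-threaded version that returns both the best local alignment score and the number of cells it updated. It is not Pervasive's implementation, and the simple match/mismatch scores and linear gap penalty are illustrative assumptions; protein searches typically use a substitution matrix and affine gap penalties.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best local alignment score, number of cell updates).

    Scoring values are illustrative defaults, not those used in the
    benchmark described in the article.
    """
    prev = [0] * (len(b) + 1)   # previous row of the scoring matrix
    best = 0
    cells = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Score for aligning a[i-1] with b[j-1].
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # One cell update: best of restart (0), diagonal, or a gap.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
            cells += 1
        prev = curr
    return best, cells


score, cells = smith_waterman_score("XXHEAGYY", "ZZHEAGWW")
print(score, cells)
```

Comparing two sequences of lengths m and n always costs m × n cell updates, which is why throughput for this algorithm is quoted in cell updates per second rather than comparisons per second.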
He added that the company believes that the DataRush platform "should be of interest to both end users and [systems integrators] across a range of informatics markets," including bioinformatics.
The Smith-Waterman algorithm has been implemented by commercial groups such as CLC Bio, as well as by open source developers, with analysis speeds that vary depending in part on the hardware used.
For example, CLC Bio claims that its implementation of Smith-Waterman has achieved speeds ranging from 0.05 GCUPS to 44.50 GCUPS on a single computer, depending on the type and number of processors used. For users who are working with compute clusters, CLC Bio offers a cluster add-on that enables the algorithm to run in parallel on multiple nodes for increased speed.
Other groups are using specialized hardware to accelerate Smith-Waterman. At the Supercomputing 2009 conference last year, FPGA software firm Mitrionics demonstrated a version of the algorithm that ran on Convey Computer's "hybrid-core" x86/FPGA platform at 64 giga cell updates per second [BI 11/20/2009].
Academic groups are also working on speeding up the bioinformatics workhorse. In April, a team at Singapore's Nanyang Technological University published benchmarks for an improved version of the CUDASW++ software with throughput of up to 30 GCUPS on a dual-GPU GeForce GTX 295 graphics card.
Potts said that the Pervasive team did not benchmark its DataRush implementation directly against other versions. Rather, the team compared the results to throughput numbers published by other groups like CLC Bio and found that the results varied depending on the kinds of specialized hardware the groups used to perform their tests.
Potts said that teams that used Intel or AMD processors had throughput numbers in the neighborhood of 80 GCUPS to 90 GCUPS, while groups with "hybrid hardware setups" that use GPUs "improved by a decent margin."
He also noted that groups that used FPGAs and other types of specialized hardware had numbers in the range of 500 GCUPS because the chips are "hardwired to specifically view one type of algorithm."
"When we took our implementation and put it on the SGI hardware … it scaled to take advantage of all 384 cores," Potts said, noting that the platform should continue to scale well on additional cores.
"If I had access to a system with 512 cores or maybe 1,000 cores, this implementation of Smith-Waterman would have continued scaling to take advantage of those cores and got us an even bigger number to claim," he said.
Potts said that this ability will enable DataRush to grow with changing hardware requirements.
The rapid improvement in hardware means that developers need to ask whether their algorithms are "going to continue to grow with the hardware that I get next year, or am I going to have to go back and rewrite them or re-implement parts of them?" he said, adding that DataRush offers developers an opportunity to "future-proof" their applications.
The Secret Sauce
Potts said that the key to increasing the speed of Smith-Waterman was the ability of DataRush to easily parallelize the algorithm using a "dataflow" approach.
"If I can describe my analysis in terms of step A and step B and so on, DataRush offers the application programming interface with that pattern to automatically execute those steps in parallel and to feed the data from one step to the next," he said.
He pointed out that developers often spend a lot of time making their algorithms faster when they are executed inside a CPU, GPU, or FPGA without equal emphasis on ensuring that the data is continually fed into the system.
"The secret sauce to DataRush is the use of that dataflow approach," Potts said. "The dataflow allows me to create a stream to make sure that the individual points where I am doing my calculation are very well fed."
He added that DataRush speeds up the input/output process by splitting apart the individual tasks such as reading, parsing, cleaning, and translating the data and then parallelizing them.
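The dataflow pattern Potts describes, in which separate steps run concurrently and each feeds its output to the next, can be sketched in a few lines of Python using threads and bounded queues. This is a generic illustration and does not use or reproduce the DataRush API; the helper names (`stage`, `run_pipeline`) and the toy `parse`, `clean`, and `translate` steps are hypothetical. Note also that CPython threads do not run CPU-bound code in parallel, so the sketch shows the structure of the approach rather than its performance.

```python
import io
import queue
import threading

_DONE = object()  # sentinel marking the end of the stream


def stage(func, inbox, outbox):
    """Start a thread that applies func to each item from inbox."""
    def worker():
        while True:
            item = inbox.get()
            if item is _DONE:
                outbox.put(_DONE)      # pass the end marker downstream
                return
            result = func(item)
            if result is not None:     # a step may drop a record
                outbox.put(result)
    t = threading.Thread(target=worker, daemon=True)
    t.start()
    return t


def run_pipeline(source, steps, buffer_size=64):
    """Stream items from source through steps, one thread per step."""
    queues = [queue.Queue(maxsize=buffer_size) for _ in range(len(steps) + 1)]
    threads = [stage(f, queues[i], queues[i + 1]) for i, f in enumerate(steps)]

    def feed():                        # the "reading" step
        for item in source:
            queues[0].put(item)
        queues[0].put(_DONE)
    feeder = threading.Thread(target=feed, daemon=True)
    feeder.start()

    results = []
    while True:
        item = queues[-1].get()
        if item is _DONE:
            break
        results.append(item)
    feeder.join()
    for t in threads:
        t.join()
    return results


def parse(line):
    parts = line.split()
    return (parts[0], parts[1]) if len(parts) == 2 else None


def clean(record):
    name, seq = record
    return name, seq.strip().upper()


def translate(record):
    # Encode each letter as an integer (A=0, B=1, ...).
    name, seq = record
    return name, [ord(c) - ord("A") for c in seq]


raw = io.StringIO("seq1 heag\nseq2  awghe \n\nseq3 pawheae\n")
encoded = run_pipeline(raw, [parse, clean, translate])
print(encoded)
```

Because each queue is bounded, a fast reader cannot run arbitrarily far ahead of a slow downstream step, while the downstream steps are kept supplied with work as long as input remains.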
"It is efficiency in a very important place where there has otherwise traditionally been a bottleneck," he said.
Pervasive DataRush is available through user licenses and can also be offered as a service.
"We are looking to market DataRush as a general-purpose platform that … applies to a wide range of business applications, scientific applications, and anywhere where folks have business-critical applications that need to take advantage of multicore hardware," Potts said.
Newmark said that the company would work out agreements with customers who are interested in licensing the platform with access to the Smith-Waterman algorithm.
He added that the platform can be used by independent software vendors that already have packaged software products, among others. He further noted that his company has a "core analytics library" that contains several data-mining algorithms that it has implemented on the DataRush platform.