Wu Feng is not prone to the hyperbole characteristic of the high-performance computing world. Far from it. The architect of the 240-node “Green Destiny” bladed Beowulf cluster at Los Alamos National Laboratory is downright modest when he describes the compact, low-power system: “When you buy a traditional supercomputer, what you’re buying is like a Formula One racecar, but what you’re getting with our stuff is a Toyota Camry. You can get pretty fast in a Toyota Camry, but what’s its biggest selling point? It’s reliability; it’s not the top speed.”
Green Destiny (named after the sword in the movie Crouching Tiger, Hidden Dragon) made quite a splash when it debuted last year as an environmentally friendly supercomputing alternative. The system takes up far less space and exudes much less heat than a traditional supercomputer or Linux cluster — a feature that allows it to run crash-free without the need for a specially cooled machine room. Feng, who heads up the Supercomputing in Small Spaces project at LANL, admits that the solution “doesn’t perform as well” as other options, but its reliability more than makes up for its shortcomings, especially in environments where an air-conditioned machine room is not an option. With a traditional cluster, “if you do not have that infrastructure, you’re going to have problems with reliability because with every 10 degree Celsius increase in temperature, in general the failure rate of the system doubles,” said Feng.
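Feng's rule of thumb can be put into numbers. The short Python sketch below assumes the doubling holds exactly for every 10 degrees Celsius and compounds smoothly in between, which is a simplification of what he describes as a general trend; the function name and its parameters are illustrative and do not come from the LANL project.

```python
def relative_failure_rate(temp_increase_c: float, doubling_step_c: float = 10.0) -> float:
    """Failure-rate multiplier relative to a baseline temperature.

    Assumes the rate doubles for every `doubling_step_c` degrees Celsius
    of temperature increase (the rule of thumb quoted in the article).
    """
    return 2.0 ** (temp_increase_c / doubling_step_c)


if __name__ == "__main__":
    # Compare a cooled machine room (baseline) with progressively warmer rooms.
    for increase in (0, 10, 20, 30):
        multiplier = relative_failure_rate(increase)
        print(f"+{increase} C -> {multiplier:.1f}x the baseline failure rate")
```

Under this assumption, a room running 20 degrees warmer than a cooled machine room would see roughly four times the baseline failure rate, which is the gap a low-heat design is meant to close.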
After word of the system got around, Feng said he was surprised at the level of interest he saw from the bioinformatics community — even Craig Venter has expressed interest in the approach [BioInform 10-14-02]. “What we found is that the bioinformatics applications really don’t need the bleeding-edge network speeds or processor speeds. Obviously faster is better, but if it’s at the expense of reliability, it’s not,” said Feng. After a tour of several pharmaceutical firms — where one company literally had a Linux cluster housed in a closet — Feng soon realized that the Green Destiny architecture was a perfect fit for bioinformatics and turned his attention to optimizing Blast for the system to overcome the performance hit.
The result, mpiBlast, is an open source parallel implementation of Blast that will run on multiple platforms. Feng wrote mpiBlast to segment the database and distribute the segments across cluster nodes so that each query can be searched against many segments simultaneously. In addition, mpiBlast is based on MPI (Message Passing Interface), a standard interface for passing messages between nodes. “Hopefully the future contributions to this freely available open source implementation will be such that it will be easy to incorporate changes because everybody will be using the same MPI programming language,” said Feng.
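The database-segmentation idea can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not mpiBlast's actual code: the greedy length-balanced split, the shared 3-mer score standing in for a real Blast alignment, and every function name here are hypothetical, and a plain loop stands in for the cluster nodes that would exchange results over MPI.

```python
from typing import List, Tuple

Record = Tuple[str, str]  # (sequence name, sequence)


def partition_database(records: List[Record], num_nodes: int) -> List[List[Record]]:
    """Split the database into one fragment per node, balancing total length."""
    fragments: List[List[Record]] = [[] for _ in range(num_nodes)]
    loads = [0] * num_nodes
    # Place the longest sequences first, always on the least-loaded fragment.
    for name, seq in sorted(records, key=lambda r: len(r[1]), reverse=True):
        target = loads.index(min(loads))
        fragments[target].append((name, seq))
        loads[target] += len(seq)
    return fragments


def toy_score(query: str, subject: str, k: int = 3) -> int:
    """Stand-in for an alignment score: subject k-mers that also occur in the query."""
    query_kmers = {query[i:i + k] for i in range(len(query) - k + 1)}
    return sum(1 for i in range(len(subject) - k + 1) if subject[i:i + k] in query_kmers)


def search_fragment(query: str, fragment: List[Record]) -> List[Tuple[str, int]]:
    """Work done by one node: score the query against its own fragment only."""
    return [(name, toy_score(query, seq)) for name, seq in fragment]


def distributed_search(query: str, records: List[Record], num_nodes: int,
                       top_n: int = 3) -> List[Tuple[str, int]]:
    """Search every fragment, then merge the per-node hits into one ranked list."""
    hits: List[Tuple[str, int]] = []
    for fragment in partition_database(records, num_nodes):
        # Each loop iteration stands in for an independent cluster node.
        hits.extend(search_fragment(query, fragment))
    hits.sort(key=lambda hit: (-hit[1], hit[0]))  # best score first, ties by name
    return hits[:top_n]
```

Because each fragment is searched independently, the merged ranking does not depend on how many nodes the database is split across; only the time to produce it does.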
Version 1.0.1 of mpiBlast (http://mpiblast.lanl.gov/) “has already been quite successful at two respected academic institutions in bioinformatics,” said Feng.
While Green Destiny was originally built to run cosmology simulations, galaxy formation models, and other non-life-science applications, Feng seems to have caught the bioinformatics bug. His next project, called Green Machine, is double the size of Green Destiny at 480 nodes, and Feng is considering reserving up to half of those nodes for an online mpiBlast server: “People externally can run it to their heart’s content, and those corporations that are very sensitive about giving up IP over a web server can download it and set it up on their own environment.”