The Danforth Center runs one of the world’s largest supercomputers without one of the world’s largest IT budgets
It took an international team four years to sequence the 130-million-base Arabidopsis thaliana genome. But researchers at the Donald Danforth Plant Science Center in St. Louis accomplished the far more complex task of calculating the structure of each of the weed’s proteins in just six months.
“To do this on one computer would take 300 years,” says Jeffrey Skolnick, director of computational and structural genomics at Danforth’s computing center. The facility serves 15 scientists who use algorithms to predict protein structure and function from genomic sequence.
The complicated calculations are made possible by one of the world’s largest supercomputers dedicated to plant research. Danforth’s Laboratory of Computational Genomics hosts a “kilo cluster” of 1,040 Pentium III processors (clocked at 733 or 750 megahertz) housed in 520 nodes, which together reach a peak rate of 335 gigaflops.
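As a rough sanity check on those figures, the short Python sketch below divides Skolnick's 300-year estimate across the cluster's processors. It assumes perfectly linear scaling and that the "one computer" in his estimate is comparable to a single cluster processor; neither assumption comes from the article, and the function names are illustrative only.

```python
# Figures quoted in the article.
SINGLE_MACHINE_YEARS = 300   # Skolnick's estimate for one computer
PROCESSORS = 1040            # Pentium III processors in the cluster
PEAK_GFLOPS = 335            # quoted peak rate for the whole cluster


def ideal_parallel_months(serial_years: float, processors: int) -> float:
    """Runtime in months if the work split perfectly across processors.

    This is an idealization: real jobs lose time to scheduling,
    I/O, and processors that are busy with other work.
    """
    return serial_years * 12 / processors


def per_processor_gflops(total_gflops: float, processors: int) -> float:
    """Average peak rate attributed to each processor."""
    return total_gflops / processors


months = ideal_parallel_months(SINGLE_MACHINE_YEARS, PROCESSORS)
rate = per_processor_gflops(PEAK_GFLOPS, PROCESSORS)
print(f"Ideal runtime: {months:.1f} months")
print(f"Peak per processor: {rate:.3f} gigaflops")
```

Under these assumptions the ideal runtime comes to about three and a half months, so the six months the team actually needed is in the range one would expect once overhead and shared use are taken into account.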
At a cost of $2 million, the massive system was built and implemented in six weeks by Western Scientific, a San Diego-based hardware outfit that specializes in tailor-made mega-systems.
Western Scientific’s package includes three years of support and hardware maintenance, 1.1 terabytes of Cyclone RAID (redundant array of independent disks) storage accessible via an Ignite XMP UltraSPARC NFS server, 20 gigabytes of distributed storage in each of the 520 nodes, 266.24 megabytes of L2 cache (256 kilobytes on each of the 1,040 processors), and 153.6 gigabytes of RAM. The computing lab also keeps enough spare parts onsite to quickly replace almost any component that fails.
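To put the storage figures in proportion, the sketch below totals the disk space spread across the nodes and compares it with the central RAID storage. It uses decimal units (1 terabyte = 1,000 gigabytes), and the per-node RAM figure is only an average, since the article does not say how memory is divided among the nodes.

```python
# Figures quoted in the article.
NODES = 520
DISK_PER_NODE_GB = 20    # distributed storage in each node
RAID_TB = 1.1            # central RAID storage
TOTAL_RAM_GB = 153.6     # RAM across the whole cluster

# Total disk space spread across the nodes, in decimal terabytes.
distributed_tb = NODES * DISK_PER_NODE_GB / 1000

# Average RAM per node in megabytes (an average only; the actual
# distribution of memory across nodes is not given in the article).
avg_ram_mb = TOTAL_RAM_GB * 1000 / NODES

print(f"Distributed disk: {distributed_tb:.1f} TB")
print(f"Distributed vs. RAID: {distributed_tb / RAID_TB:.1f}x")
print(f"Average RAM per node: {avg_ram_mb:.0f} MB")
```

The node disks together hold roughly nine times as much as the central array, which suggests that most working data would live on the nodes themselves.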
Of course, as Murphy’s Law would have it, “the only real crash we had was when a power supply on the RAID array went down,” Skolnick says. “It’s a central node and it’s the one thing I don’t have a duplicate of because it costs another $80,000. I can live with one crash.”
Otherwise, the supercomputer packs a lot of punch for the price. A comparable system from Cray would cost 10 times as much, and a similar one built on Compaq Alpha processors would cost four times as much. “We checked out seven vendors, including Penguin Computing, Hewlett-Packard, Silicon Graphics, Sun Microsystems, IBM, and VA Linux,” says Skolnick. Before making his purchase last spring, he says, “We looked at which one had the best price per performance for our particular code.”
Size also mattered: 400 square feet were available for the system. The Intel chips and motherboards were about a quarter of the size of some other machines, so Western Scientific was able to squeeze more machinery into the cluster’s 26 cabinets.
Another tremendous cost saving is Linux. Skolnick was able to buy a single copy of the operating system from Red Hat (version 6.2) for $30 and install it on all 520 nodes. He says open source software has “proved to be very stable in our hands. Linux is up more than 99 percent of the time.”
At first, the center struggled with network-file-server compatibility and had to transfer data to the new operating system. But after two months, the team ironed out these issues and learned how to use the new platform. The center had previously used Solaris, running on 120 Intel processors at 400 megahertz each. Prior to that, it ran Solaris on Hewlett-Packard and Silicon Graphics workstations.
Today, about 20 different home-grown applications run on the cluster, including ab initio and threading algorithms for structure prediction. Skolnick and his post-docs are also developing algorithms to automate active site identification. These software efforts cost about $750,000 a year, plus another $50,000 in maintenance.
But “the most expensive thing [seems to be] the power. It consumes enough power to heat and cool 17 houses,” Skolnick says. In addition to the 153,000-plus watts guzzled by the supercomputer, an outrageous amount (Skolnick couldn’t say exactly how much) is consumed by the 25-ton air conditioning system — twice the weight of the supercomputer — used to cool down all the machinery, especially during blazing hot St. Louis summers. The facility’s electric bills add up to $60,000 a year.
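The power figures can be cross-checked with a little arithmetic. The sketch below assumes the cluster draws its full 153,000 watts around the clock and that the $60,000 bill covers both the cluster and its cooling; both are assumptions, not statements from the article. Because the air conditioning load is unknown, the implied electricity rate is an upper bound.

```python
# Figures quoted in the article.
CLUSTER_WATTS = 153_000      # draw of the supercomputer alone
ANNUAL_BILL_USD = 60_000     # facility's yearly electric bill
HOUSES = 17                  # Skolnick's comparison

HOURS_PER_YEAR = 24 * 365    # assumes continuous operation


def annual_kwh(watts: float, hours: int = HOURS_PER_YEAR) -> float:
    """Energy used in a year at a constant draw, in kilowatt-hours."""
    return watts / 1000 * hours


cluster_kwh = annual_kwh(CLUSTER_WATTS)

# Upper bound on the price per kWh: the bill also pays for cooling,
# so the true rate must be lower than this.
max_rate_usd = ANNUAL_BILL_USD / cluster_kwh

# Draw attributed to each house in the 17-house comparison.
watts_per_house = CLUSTER_WATTS / HOUSES

print(f"Cluster energy: {cluster_kwh:,.0f} kWh per year")
print(f"Implied rate: at most ${max_rate_usd:.3f} per kWh")
print(f"Per house: {watts_per_house:,.0f} watts")
```

On these assumptions the cluster alone would use about 1.3 million kilowatt-hours a year, which puts the implied electricity price below roughly 4.5 cents per kilowatt-hour.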
But Skolnick thinks it’s worth every penny. “I don’t know how I would calculate the return on investment in the supercomputer. But without it, I couldn’t do the science I’m doing.”
Even so, Skolnick plans another ramp-up by year’s end. “My next upgrade would be a multiple-teraflop system,” he says. “I would hope to do a chip swap: throw away the Pentium chips and buy faster ones. That will cost $500,000 and last another year. Then we’ll get new machines.”
— Jackie Cohen