Sandia National Laboratories has cemented a $90 million deal with Cray to build a new massively parallel supercomputer. Dubbed “Red Storm,” the system will provide the computational muscle to support the lab’s new role as a large-scale life science player and could eventually scale up to 100 teraflops.
The agreement, originally announced in June, resurrects a project initiated by Sandia, Compaq, and Celera Genomics in January 2001 that eventually ran aground due to restructuring at both Compaq and Celera [BioInform 07-08-02].
Despite Celera’s absence from the current project, Sandia officials said the lab has plenty of life science-related work for Red Storm’s computational brawn, both within the lab itself and in collaboration with a new set of undisclosed biotech partners.
“One of the goals for Red Storm was to provide a highly effective platform for both bioinformatics and simulation in support of biotechnology,” Bill Camp, Sandia’s director of computers, computation, information, and mathematics, told BioInform last week.
Additionally, he said, Cray’s contract with the lab requires that a cost-effective commercial version of Red Storm hit the market as soon as Sandia’s version is shipped in 2004. “This is not a one-off, very high-end machine that they develop for Sandia alone,” said Camp. “As they deliver the machine to us, they will be ready to sell commercial versions.” While most bioinformatics groups won’t be able to touch the $90 million price tag for Sandia’s version, the architecture was designed to be scalable so smaller versions could remain within reach for a typical biotech firm. “In principle,” said Camp, “you could have this architecture run from a single cabinet up to hundreds of cabinets of supercomputing capability.”
For Sandia, Cray will use more than 10,000 Opteron processors from Advanced Micro Devices to build a system with a theoretical peak performance of 40 trillion calculations per second, or 40 teraops, counting two calculations per clock cycle (20 teraops counting one). The system is expected to be at least seven times more powerful than Sandia’s current Intel-based ASCI Red supercomputer. The Red Storm architecture, developed by Camp and Jim Tomkins at Sandia, “is designed to scale to hundreds of teraops,” according to Tomkins. The Cray contract contains “an option” to upgrade to 60 teraops.
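The quoted peak figures follow from simple multiplication. As a rough sanity check, here is a back-of-the-envelope sketch; the ~2 GHz clock rate and the round 10,000-processor count are assumptions for illustration (the article says only “more than 10,000” Opterons and does not give a clock speed):

```python
# Back-of-the-envelope check of Red Storm's quoted peak numbers.
# Assumed values (not stated in the article): exactly 10,000 processors
# and a ~2 GHz Opteron clock rate.

processors = 10_000      # article says "more than 10,000"
clock_hz = 2.0e9         # assumed ~2 GHz
ops_per_cycle = 2        # "two calculations per clock cycle"

peak_ops = processors * clock_hz * ops_per_cycle
print(f"{peak_ops / 1e12:.0f} teraops")                  # 40 teraops at 2 ops/cycle
print(f"{processors * clock_hz / 1e12:.0f} teraops")     # 20 teraops at 1 op/cycle
```

Under these assumptions the arithmetic reproduces both quoted figures: 40 teraops counting two operations per cycle, 20 counting one.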
New Lease on Life Science
For decades, Sandia has raised the bar for high-performance computing, driven primarily by the large-scale simulations it performs for its role in nuclear weapons stockpile stewardship. Red Storm will also find use in this area, but some of its cycles will be reserved for a relatively new focus area for the lab — biology.
Although Sandia has supported a patchwork of biology-related research projects for years, it only recently identified biotechnology as a key growth area. Biotechnology investment is currently about five percent of Sandia’s research budget, which the lab considers “a threshold level” for emerging research areas. But this amount is expected to grow. “The laboratory is committed to making biotechnology a major core competency,” said Camp. “If you look out a decade or so, it’s absolutely credible that biotechnology amounts to 25 percent of Sandia’s work,” he added, with the caveat that his estimate is likely at the high end of the scale.
Nonetheless, there’s no doubt that Sandia has big biotech plans. Current projects underway include the development of new bioanalytical tools for the study of membrane protein structure and function, research into microfluidics to speed protein sorting, improved scanning tools for microarrays, and data-mining techniques for gene expression data. In addition, Sandia was recently tapped to lead one of five projects funded by the DOE’s new Genomes to Life program. Camp said Sandia also hopes to establish new non-profit research groups in the surrounding region “that will be very much bioinformatics oriented” and eventually spin out commercial biotechnology firms.
This new direction forced the Sandia designers to take a good, hard look at the requirements for the new supercomputer system they had in mind. They couldn’t rely on their decades of simulation experience because the demands of biological computing are “quite different,” Camp said. Unlike bioinformatics, which requires rapid input and output of large amounts of data, “most scientific computing is not data bound or input/output bound. It’s take a little bit of data, compute forever, and take a little bit of data out and look at what you did.”
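Camp’s distinction is often expressed as arithmetic intensity: operations performed per byte of data moved. A minimal sketch of that contrast, using purely illustrative numbers (none of these figures come from Sandia):

```python
# Illustration of the workload contrast Camp describes, using
# arithmetic intensity = operations per byte of I/O.
# All numbers below are hypothetical, chosen only to show the gap.

def arithmetic_intensity(ops, bytes_moved):
    """Operations per byte moved; low means I/O-bound, high means compute-bound."""
    return ops / bytes_moved

# A database search (bioinformatics): only a few comparisons per byte
# streamed from disk, so intensity stays near 1 -- I/O-bound.
search = arithmetic_intensity(ops=2e12, bytes_moved=1e12)

# A physics simulation: a small input deck, then long computation --
# intensity in the millions, so compute-bound.
simulation = arithmetic_intensity(ops=1e15, bytes_moved=1e9)

print(f"search: {search:.0f} ops/byte; simulation: {simulation:.0e} ops/byte")
```

The orders-of-magnitude gap between the two ratios is why a machine sized for simulation can still be starved for bandwidth when it is asked to search.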
Luckily, one of the original players in the Red Storm project hadn’t strayed far from the scene. Marshall Peterson, who designed Celera Genomics’ IT infrastructure and was a key player in the original Sandia/Compaq/Celera Red Storm project, continued to advise Sandia under the auspices of EnSilico, the consulting firm he launched six months ago after leaving Celera.
Peterson agreed with Camp that input-output demands were the key differentiator in the new computer’s design. “In biology, a lot of what we do is search, so I/O is extremely important — a lot more so than it is for Sandia’s traditional role in stockpile stewardship,” he said.
Another key point in the design of Red Storm was its commercial feasibility, a factor that Peterson said attracted him to the project in the first place. “Sandia has always specialized in taking commodity components and building very powerful computers. So the reason when I was at Celera why we did the CRADA [cooperative research and development agreement] with them was not so much that they can build powerful computers, but that they built powerful computers that were very cost effective.”
While noting that Sandia has “broken a little bit from their roots” by going with a Cray/AMD system that he characterized as “not using commodity components so much,” Peterson said he’s pleased that the end result will be a commercial version that can scale from “as big as your desk to 6,000 square feet.”
Peterson has also been hired by Craig Venter to design the IT infrastructure for the new sequencing facility for TIGR and the TIGR Center for the Advancement of Genomics. When asked if he was considering a Red-Storm-style system at the facility, he noted that the project’s time frame is a bit too long-term for his employer’s characteristically high-speed demands. “That’s still two or three years out, and, you know, biology and Dr. Venter wait for no man.”
Calling on Peterson’s expertise is just one example of Sandia’s plan for expanding its new biology focus beyond its admittedly limited in-house capabilities. “Our strategy in biotechnology is to partner, partner, partner, because we recognize that Sandia is not a biology laboratory,” said Camp.
However, he noted, the lab is willing to share its own vast experience in high-performance computing in return. Several non-profit and for-profit biotech groups that Camp was unable to disclose are already using Sandia’s services, he said. “One set of partners is setting up very big technology centers and they’re asking for our help in finding the informatics component of that,” he said. Other biotech partners are collaborating with Sandia researchers on specific computational problems.
Camp added that Sandia is “actively hiring” researchers with multidisciplinary skills. Those with crossover talent in biology plus either computational science, physics, or engineering are especially in demand, he noted.