Just over three years ago, Incyte began installing a Linux cluster to do most of its bioinformatics work. Now over 3,000 processors strong, the compute farm is bumping up against a number of limitations, not least of them sheer square footage, according to Ken Jacobsen, senior vice president of information services. “The cluster is rather large; it takes up a lot of space,” he said. “Our offices are in Palo Alto, so floor space is a concern. It’s not cheap.”
Another drawback of the growing system is the “tremendous amount of wiring” it requires, Jacobsen said. “The network infrastructure to support a cluster is quite onerous.”
In addition to the huge Linux farm, the company’s 3,500-square-foot machine room houses around 200 Unix servers from multiple vendors. While loath to walk away from the favorable price/performance of the Beowulf, Jacobsen said that Incyte is being driven by space and networking issues to consider other options for the next phase of its computational infrastructure.
Incyte is currently evaluating SGI’s new Altix 3000 line of Linux-based servers, Jacobsen said, and results have been favorable so far. Blast benchmarks of a 1-GHz Altix prototype indicated a seven-fold speedup over the 2.2-GHz Xeon processors in the cluster: The Altix crunched 277 sequences per processor per hour, compared to 39 for the Beowulf.
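The arithmetic behind the "seven-fold" figure can be checked with a short sketch. It uses only the throughput and clock numbers quoted above; the function name and the per-clock normalization are illustrative assumptions, not part of Incyte's benchmark methodology, and clock speed alone is a crude basis for comparing different architectures.

```python
# Figures quoted in the article (sequences per processor per hour, clock in GHz).
ALTIX_RATE = 277
XEON_RATE = 39
ALTIX_GHZ = 1.0
XEON_GHZ = 2.2


def speedup(new_rate: float, old_rate: float) -> float:
    """Return how many times faster new_rate is than old_rate."""
    return new_rate / old_rate


# Raw per-processor speedup: the "7x" cited in the article.
raw_speedup = speedup(ALTIX_RATE, XEON_RATE)

# Hypothetical per-clock comparison: throughput divided by clock speed.
# This normalization is an assumption added for illustration only.
per_clock_speedup = speedup(ALTIX_RATE / ALTIX_GHZ, XEON_RATE / XEON_GHZ)

print(f"Raw speedup: {raw_speedup:.1f}x")
print(f"Per-GHz speedup: {per_clock_speedup:.1f}x")
```

The raw ratio comes out at roughly 7.1, consistent with the reported seven-fold improvement. Normalizing by clock speed makes the gap look larger still, which fits Jacobsen's point that the gain comes from the memory architecture rather than from raw clock rate.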
“To be honest, we did not expect to get as good a performance as we got,” said Jacobsen. “We expected something that was a little faster, but not 7x.” He attributed the improvement in performance to “the memory structure in which the chip is embedded.”
One limitation Incyte has experienced with commodity 32-bit processors, according to Jacobsen, “is that you’re limited by address space.” The Altix offers 64-bit addressing along with large shared memory, “so if we’re running lots of problems that use the same datasets, we’re not duplicating the memory requirements for that, whereas if you’re running independent off-the-shelf processors, you’d be duplicating memory requirements for each one.”
But despite the promise of the Altix, Jacobsen said that Incyte hasn’t yet settled on its next hardware purchase. Any new technology would replace the existing cluster piece-by-piece, he said. “Clearly there’d be too much risk right now to move all processing to something that’s realistically a brand new technology. We’d like to stay near the edge, but we can’t be on the bleeding edge.” In addition, the Altix is not yet Oracle certified. Dan Stevens, SGI’s marketing manager for life and chemical sciences, said that SGI is working on Oracle certification, but in the meantime, “We can’t use any software that’s not certified yet,” Jacobsen said. “It’s a risk issue that we’re not willing to accept.”
Surprisingly, the company’s recent move to drug discovery has not yet influenced the decision of where to go next with the technology. However, Jacobsen noted that standard Xeon processors would tend to run slower on computational chemistry problems than on bioinformatics problems, giving the IA-64 architecture, with its better floating-point performance, an edge.
Since it was an early adopter of Beowulf clusters for bioinformatics computing, Incyte’s final decision for the next phase of its computational infrastructure will be closely scrutinized by many in the industry. As Linux clusters gain in popularity — and size — users are finding that issues such as space requirements, overhead costs, and systems administration can sometimes outweigh any savings from the lower purchase price of off-the-shelf processors.
Jacobsen noted that price/performance and simplifying the system would be top priorities in whatever decision Incyte makes. The company has already cut back its sys-admin staff, “and as a consequence we’re trying to match what they can do with the number of people we have. ... They’re looking forward to some changes that will make their lives simpler.”