Companies like TimeLogic and Paracel have proven that hardware systems based on ASICs (application-specific integrated circuits) and FPGAs (field-programmable gate arrays) can speed bioinformatics research considerably. Built around chips that are specifically optimized for the algorithms running on them, these so-called “accelerators” are a core resource in many bioinformatics research groups.
But specialized hardware isn’t just a good way to speed up the Smith-Waterman algorithm. NeoGenesis Pharmaceuticals found that the technology worked just as well for meeting its rapidly expanding storage demands.
About a year ago, the storage requirements for the company’s internal discovery system, particularly for its mass spectrometry data, “went through the roof,” said Mark Chandler, director of information systems. The problem wasn’t so much the amount of data being generated as the need for ready access to it, Chandler said. “Suddenly, instead of just one particular prototype unit producing data, and that data only needing to be around for two or three days, it suddenly became several of these lines running simultaneously where we had to have live data retention for a month. And because we were generating several hundred megabytes of data a day, at that point we had to seriously look at a high-end storage solution.”
NeoGenesis already had 4 TB of storage available in two separate systems from EMC, but Chandler said he had a number of “issues” with the EMC system. “It kind of solved the problem, but it didn’t exactly solve it, and getting it up and running was an absolute nightmare,” he said. In addition, he noted, “It was going to cost us almost as much as the original unit to expand it, and that was just unacceptable.”
After looking at systems from a number of vendors, including EMC, Network Appliance, Dell, and Hitachi, Chandler said the company opted for a 2 TB SiliconServer from BlueArc, a hardware-based storage system that uses specially designed FPGAs to rapidly move data in and out of storage. “All the dirty work is done in firmware,” Chandler said. “It’s not a hacked version of Windows or some little Linux stack running on a PC inside of a case. It really is dedicated hardware that just does what it needs to do and nothing else, and because of that it doesn’t break.”
Chandler said that before buying the BlueArc system, he had been playing “swapping games” between online storage and tape for the 75 or so researchers in the NeoGenesis discovery group. Researchers who wanted two-week-old data, for example, “had to wait for us to get a tape back from our service and then restore the data, and then we had to let them know that their data could only be there for a certain number of days before we had to clear out space again.”
Now, he said, “I’m able to keep everybody’s data there, everybody’s data is live, no one is complaining about anything missing. And because of that, people are just able to work and I’m actually able to have a normal workday instead of being here all night.”
Chandler said that NeoGenesis expects to add another 2 to 4 TB of storage over the next year and a half.