The company that helps LexisNexis, Kodak EasyShare, and MySpace store massive amounts of images and other unstructured data is taking steps to move into the life science market.
Isilon Systems, a Seattle-based provider of clustered storage technology, now counts several life science customers among its clients, including the University of Washington Medical Center, the Harborview Medical Center, and the clinical proteomics group at Cedars-Sinai Medical Center's Warschaw Prostate Cancer Center, which recently adopted the company's system to store raw data from its Fourier transform mass spectrometers.
"From every drop of blood, we have the ability to generate about 50 or 60 gigabytes of data, which works out to almost a terabyte a day," Parag Mallick, director of Cedars-Sinai's clinical proteomics group, told BioInform. "That ends up being quite a bit of data to push around."
The lab turned to Isilon about a year ago after trying six other storage options that just didn't work out. "Some sophisticated systems seemed like they'd meet our needs well, but they were very complicated to use. So they might have been able to handle the stress of the system, but they weren't friendly," Mallick said.
The Isilon system, by contrast, "was designed for operating on large files," and proved much easier to install and maintain, "which is nice because then we can spend money on other scientists instead of on system administrators," Mallick said.
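As a rough check on the data volumes Mallick describes, the arithmetic below works out how many samples a day would produce about a terabyte at 50 to 60 gigabytes each. It is only a back-of-the-envelope sketch: the decimal conversion (1 TB = 1,000 GB) and the function name are assumptions, not figures from the lab.

```python
# Back-of-the-envelope check of the quoted data rate.
# Assumption: decimal units, 1 TB = 1,000 GB.
GB_PER_TB = 1000


def samples_per_day(daily_tb, gb_per_sample):
    """Number of samples needed to generate daily_tb terabytes in one day."""
    return daily_tb * GB_PER_TB / gb_per_sample


# 60 GB per sample needs the fewest samples, 50 GB per sample the most.
fewest = samples_per_day(1.0, 60)
most = samples_per_day(1.0, 50)
print(f"{fewest:.1f} to {most:.1f} samples per day")
```

Under those assumptions, a terabyte a day corresponds to roughly 17 to 20 samples, and slightly fewer for "almost" a terabyte.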
"From every drop of blood, we have the ability to generate about 50 or 60 gigabytes of data, which works out to almost a terabyte a day. That ends up being quite a bit of data to push around."
Brett Goodwin, vice president of marketing and business development at Isilon, told BioInform that the life science sector is "one of our top four or five vertical markets," and that the company is ramping up its marketing strategy to compete with vendors like EMC, IBM, HP, and Network Appliance that currently claim the largest chunk of the life science market.
According to Goodwin, however, systems from these vendors — typically network-attached storage or storage area networks — will not be able to keep up with the storage demands of the life science research market.
Goodwin said that traditional SAN or NAS systems pose two primary drawbacks for life science customers: limited single file system sizes and performance bottlenecks. On the first point, Goodwin said that a single file system in most SAN and NAS architectures is limited to the range of 2 TB to 16 TB, whereas a single file system on the Isilon clustered storage system can expand from 5 TB to 528 TB.
"That allows you to have a full, cross-sectional view of all your data that can be shared among all of your scientists, among all of your researchers," Goodwin said. This architecture is also well-suited to Blast farms and other bioinformatics applications that "need high concurrent access to a single shared pool of data," he said.
Goodwin added that read-and-write requests in NAS and SAN "are funneled through a single head or a single server, and once you hit the maximum limit of that single head or that single server, you've hit a performance bottleneck, and the only way to get more performance is to add another, separate system and then break up your job, and that adds a whole bunch of complexity."
The clustered architecture, however, enables the Isilon system to scale to a throughput of more than 7,000 megabytes per second, he said, compared with 250 to 400 megabytes per second for traditional SAN or NAS systems.
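To put those throughput figures in terms of the lab's workload, the sketch below estimates how long it would take to move one terabyte at each quoted rate. It assumes decimal units (1 TB = 1,000,000 MB) and sustained transfer at the quoted peak rates, which real systems may not achieve; the function name is illustrative.

```python
# Rough transfer-time comparison using the throughput figures Goodwin quotes.
# Assumptions: decimal units (1 TB = 1,000,000 MB) and sustained peak rates.
MB_PER_TB = 1_000_000


def transfer_minutes(terabytes, mb_per_second):
    """Minutes needed to move the given volume at a constant rate."""
    return terabytes * MB_PER_TB / mb_per_second / 60


rates = [
    ("clustered, 7,000 MB/s", 7000),
    ("SAN/NAS, 400 MB/s", 400),
    ("SAN/NAS, 250 MB/s", 250),
]
for label, rate in rates:
    print(f"{label}: {transfer_minutes(1, rate):.1f} minutes per terabyte")
```

On these assumptions, a terabyte moves in a few minutes at the clustered rate, against roughly 40 minutes to more than an hour at the traditional rates.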
The core of the company's product line is its Isilon IQ 1920, 3000, 4800, and 6000 platform nodes, which range in capacity from 1.92 TB to 6.0 TB per node. Pricing generally runs around $10,000 to $12,000 per terabyte, Goodwin said.
This week, Isilon released an upgrade for its OneFS distributed file system, which runs on all of its storage systems. OneFS 4.0 supports up to 528 TB of capacity and 7 gigabytes per second of throughput, compared with 256 TB and 3 gigabytes per second in the previous version.
Isilon also released two new products — a performance accelerator and a capacity expansion node — with the goal of offering customers greater flexibility. The IQ Accelerator is a controller that allows customers to boost performance "at a third of the cost of a storage platform node" if they don't want to add capacity, Goodwin said.
The new EX 6000 node offers the converse: additional capacity without the performance gains, which brings the cost down to around $4,000 per terabyte.
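The price gap between the two node types can be illustrated with a simple calculation. The sketch below uses the per-terabyte prices Goodwin cites; the 30 TB purchase size is a hypothetical chosen for illustration, and actual quotes would depend on configuration.

```python
# Illustrative cost comparison using the per-terabyte prices quoted above.
# Assumption: a hypothetical 30 TB purchase; real pricing varies by configuration.
def purchase_cost_usd(terabytes, usd_per_tb):
    """Total cost in US dollars for a given capacity at a flat per-TB price."""
    return terabytes * usd_per_tb


CAPACITY_TB = 30  # hypothetical purchase size

platform_low = purchase_cost_usd(CAPACITY_TB, 10_000)
platform_high = purchase_cost_usd(CAPACITY_TB, 12_000)
expansion = purchase_cost_usd(CAPACITY_TB, 4_000)

print(f"Platform nodes: ${platform_low:,} to ${platform_high:,}")
print(f"EX 6000 expansion nodes: ${expansion:,}")
```

At those prices, the same capacity costs about a third as much on expansion nodes, at the expense of the added performance.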
Cedars-Sinai has adopted both of the new products, and Mallick said that they have proven helpful in developing a "tiered" architecture for long-term storage of data that isn't used all the time, but isn't quite obsolete enough for tape. The IQ Accelerators, in particular, he said, "allow you to expand the performance of the back end without actually increasing the storage."
Mallick said that the clustered storage architecture has freed up his group to "plan on buying things kind of on an installment plan, so rather than buying 300 TB for the next year, we'll just buy it 30 TB a month as we need it, and plug it in and grow."
By Bernadette Toner