NEW YORK (GenomeWeb) – The University of California, Riverside has received about $600,000 in grant funding from the National Institutes of Health to purchase additional storage infrastructure to better support data-intensive life science research projects at the institution in the long term.
The university will use the funds to buy a highly scalable disk storage system that provides about 750 terabytes of space, although it might bump that up to about a petabyte of storage, Thomas Girke, an associate professor of bioinformatics in UC Riverside's botany and plant sciences department, told BioInform this week. It will supplement an existing computer cluster at the institution, which provides approximately 150 terabytes of storage. The current cluster has roughly 600 central processing unit cores, which are used to analyze data, and these funds will make it possible to purchase another 1,000 CPU cores for analysis, but "the main thing is the storage," Girke said.
The added storage capacity will support projects run internally by more than 160 scientists from 15 UC Riverside departments, which will account for about 80 percent of projects, as well as studies conducted by teams working in the broader UC system and in industry, which will each account for about 10 percent. To get analysis time and storage space on the system, potential users sign up for accounts and are charged annual subscription fees: $1,000 per research group for UCR researchers, $1,120 for non-UCR researchers, and $1,518 for commercial groups. That gives them access to the cluster and about 30 gigabytes of storage on the current system, Girke said. More storage is available for an additional cost. Annual pricing for 100 gigabytes of additional storage is $100 for UCR users, $157 for non-UCR users, and $213 for commercial researchers.
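As a rough illustration of how the fee schedule above adds up, the following Python sketch estimates a group's annual bill. The fee figures come from the article; the function name `annual_cost`, the group labels, and the assumption that extra storage is billed in whole 100-gigabyte blocks beyond the included 30 gigabytes are hypothetical and are not confirmed by UC Riverside.

```python
import math

# Annual fees quoted in the article, in US dollars.
SUBSCRIPTION = {"ucr": 1000, "non_ucr": 1120, "commercial": 1518}
EXTRA_PER_100GB = {"ucr": 100, "non_ucr": 157, "commercial": 213}

# Approximate storage included with a subscription on the current system.
INCLUDED_GB = 30


def annual_cost(group, storage_gb):
    """Estimate the yearly cost for one research group.

    Assumption (not stated in the article): storage beyond the included
    allowance is billed in whole 100 GB blocks, rounded up.
    """
    extra_gb = max(0, storage_gb - INCLUDED_GB)
    blocks = math.ceil(extra_gb / 100)
    return SUBSCRIPTION[group] + blocks * EXTRA_PER_100GB[group]


if __name__ == "__main__":
    for group in ("ucr", "non_ucr", "commercial"):
        print(group, annual_cost(group, 530))
```

Under these assumptions, a UCR group that needs 530 gigabytes would pay the $1,000 subscription plus five additional 100-gigabyte blocks at $100 each, or $1,500 per year.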
The new cluster will be especially useful for researchers running biological and biomedical research projects that make use of next-generation sequencing, Girke said. Currently about 75 percent of the storage and computer resource allocation on the UCR cluster goes towards supporting these projects, and that’s not likely to change, he said.
UC Riverside is currently evaluating options for the new system, which it expects to install sometime this summer. The new system will be housed in a new server room at the Institute of Integrative Genome Biology at UC Riverside.
The school does not have a specific vendor in mind; rather, it is looking for a system "where we can get as much storage footprint into the smallest form factor so the largest amount of storage disks into the smallest space in a computer rack," Girke said. The exact system will be selected via a bidding process. Girke also said that the university did consider cloud-based infrastructure but ultimately decided that it would not be an economically viable solution given the size of the datasets that would need to be moved and stored.