The North Carolina Bioinformatics Grid project is officially off the ground. MCNC, the non-profit corporation responsible for building and operating the NC BioGrid, is neck-deep in the first phase of the project: a testbed of heterogeneous hardware systems that will be linked together with software from Avaki of Cambridge, Mass.
MCNC’s Phil Emer, chief architect of the NC BioGrid, said that Avaki’s software was as close to an off-the-shelf grid solution as he could find right now.
The software decision came down to a choice between Avaki and Globus, and “Avaki’s implementation is already close to the open grid services architecture specification,” said Emer, referring to the Open Grid Services Architecture (OGSA). He added that his task is to build a practical “production grid” that researchers at more than 70 North Carolina universities, research organizations, and biotech and pharmaceutical companies will be able to use without being “engaged in the science of middleware.”
The testbed will be put in place over the summer and will be built with Linux clusters, Solaris systems, and IBM p690 Regatta servers, a “representative” mix of hardware chosen to reflect the systems most popular among the BioGrid’s intended end users, Emer said. “Several more platforms” are expected to be added to the testbed by the end of the summer. Avaki’s software will provide access to both data and computing resources across these hardware platforms, as well as security options to support a range of authentication and access controls.
Emer said Avaki’s ability to handle distributed data as well as computational power set it apart from its competitors in the commercial grid software market. The growth of distributed data “is a bigger problem from our perspective” than the computational aspects of the grid, he said.
MCNC has “just about completed deploying the infrastructure” for the testbed, Emer said. In addition, his team is developing test plans for “bounded data sets and codes to exercise the infrastructure.” Not surprisingly, BLAST will be a key component of the testing phase, and Emer envisions a variety of commercially and publicly available BLAST implementations running across the BioGrid.
“A humming testbed is imminent. Now the real work begins,” he said. Additional work will involve overcoming the sign-on obstacles inherent in a system that spans several administrative domains with different access policies. “These are very real issues now,” said Emer. “Where we want to focus is on applications that are actually useful.”
Emer said he sees grid computing as a natural evolution of distributed file systems and other technologies that had their beginnings in the 1980s. But while many of the concepts and technologies are the same, and the Avaki software will address many of the infrastructure headaches, significant challenges in data access, security, and administrative domains remain to be solved. In addition, scientific applications still need to be modified to run on the grid, but Emer doesn’t see this as too big an obstacle. “We’ve already focused on optimizing many of these programs for parallel systems. It’s not too much of a jump to talk about gridifying them,” he said.