Building upon tried-and-true grid computing technology and a five-year, $22.2 million grant from NIH's National Center for Research Resources, researchers at the University of Southern California have set out to create an über-endpoint for all national bioinformatics and biomedical data. Hosted and maintained at USC, the new Biomedical Informatics Research Network Coordinating Center is intended to be the terminus for all data siphoned through the Biomedical Informatics Research Network, a virtual community of nationally distributed networks of data resources.
Spearheading the development of the data clearinghouse is Carl Kesselman, an engineering professor at USC. Kesselman says that more than anything, the grant represents an acknowledgment from NIH that such a solution is long overdue. "The NIH has had a longstanding interest in the ability to share data across multi-site, multi-institutional projects, so this is a recognition that advances require the ability for collaboration, multi-site data collection, multi-site data analysis, the ability to utilize data beyond a single investigator or institution," says Kesselman. "This award is really about building this infrastructure, making it broadly available to the NIH research community across a range of different research areas … so that's the motivation."
Kesselman says that this undertaking also brings together many longstanding informatics and grid computing development projects. "[The grant] represents an effort to leverage what's proven to be very successful technologies and approaches in other science domains, to try to break down some of these science stove pipes that have evolved, and to enable interoperability with other large-scale science infrastructure that other agencies, both public and private, have invested in," he says. "So it's definitely a continuation and a refocusing of existing efforts."
Much of the proposed infrastructure is based on the Globus Toolkit, an open-source toolkit for building grid computing systems. "A lot of what we're doing is based on the toolkit, which was pioneered by myself, Ian Foster, and Steve Tuecke (both from the University of Chicago), and this infrastructure and tools that had been built with a variety of government and private funding over the course of the last decade," Kesselman says. "The other part of what we're doing is developing new tools and new methods that are applicable to supporting the specific science activities that show up in the biomedical research areas, so it's a combination of leveraging on previous investment and a lot of previous experience in a lot of other science domains bringing them to bear in biomedical research."
The USC team hopes that by the time the funding dries up, the project will have gained enough traction that many other institutions will be willing to step in and help support its deployment. "Five years is a long time from now, but certainly at that point, we would hope that some of the elements of this would become widely supported and just become part of the basic science infrastructure, much like the Internet is," Kesselman says. "And by then, we're moving on to newer and more exciting things that will represent state of the art and how informatics is helping to facilitate research."