The National Cancer Institute’s Center for Bioinformatics (NCICB) is putting together an ambitious project to create a nationwide bioinformatics network that it’s calling caBIG (Cancer Biomedical Informatics Grid). NCICB is now notifying the cancer centers it has selected to participate in the pilot phase of the project. Once those centers are on board, the work is expected to progress “at the speed of the Internet,” according to NCICB director Ken Buetow.
The caBIG initiative was first conceived in July, when NCI director Andrew von Eschenbach, speaking at the annual American Association for Cancer Research meeting, introduced the concept of a national network for cancer researchers to share data and tools. Since then, progress has been rapid and enthusiasm has been high, Buetow said. By early September, 49 of the 61 cancer centers that NCI supports in 31 states had submitted proposals to participate in the pilot. NCICB representatives have traveled to each of the 49 centers to assess their current informatics capabilities and needs, and are now wrapping up the evaluation process. “We’re at the point of really dotting the ‘i’s and crossing the ‘t’s,” said Buetow. “We want to move from general enthusiasm to specific implementation and specific actions.”
NCICB plans to work closely with the cancer research community to create a common, extensible informatics platform that integrates diverse data types and supports interoperable analytical tools. NCICB has already laid some of the groundwork for this effort through its caBIO (Cancer Bioinformatics Infrastructure Objects) model, which serves as the primary programming interface to a broader bioinformatics platform. This platform, called caCORE, encompasses a set of controlled vocabularies for cancer research called Enterprise Vocabulary Services (EVS) and a set of common data elements for clinical cancer research stored in the Cancer Data Standards Repository (caDSR). With this foundation in place, Buetow said it was time for NCICB to widen its net. “We recognize that all of this is going to have to be extended and expanded as we move aggressively forward, and we’re looking to this extension and expansion to be done in partnership with the caBIG community,” he said.
The More, the Merrier
The initial model for the three-year pilot project was for approximately 10 cancer centers to participate in a demonstration implementation. As the evaluation phase progressed, however, this approach was modified to include a larger number of participating centers organized as “workspace” teams to address key projects or topic areas. “Through our interviews and interactions with the individual cancer centers, we found out that there was a lot more opportunity to involve a lot more people early on,” said Buetow. “So we decided to refocus the project so that we could involve a broader collection of the community, get deployment more rapidly to the community, and to see to it that we could address acute needs more rapidly.”
Rather than focusing on the individual centers, the revised model will revolve around three so-called “domain” workspaces — clinical trial management systems, an integrative cancer research workspace, and a tissue bank and pathology tools workspace — in addition to two “cross-cutting” workspaces for basic infrastructure development in the area of vocabularies/common data elements and architecture.
NCI will fund caBIG through contract mechanisms that will directly support the cancer center participants, Buetow said. The initiative is “open source, open access, and open architecture,” so all components developed as part of caBIG will be made publicly available. Buetow said the effort isn’t limited to cancer researchers, and NCI welcomes participation from the broader biomedical research community as well as from the commercial IT and software sector. “We’re hoping that the IT community is interested in playing along,” he said. “We would hope that they would see opportunities to contribute to the activity.” NCI has already had some “interesting nibbles” from IT vendors interested in participating in the project, he added.
While enthusiasm for the project is high, Buetow acknowledged that it is “challenging on multiple fronts.” There are certainly technical challenges to knitting together a diverse and disparate set of data and tools, but Buetow noted that “there are already IT solutions to many of those in other domains; e-business and other communities have approximate solutions to many of the technical problems.” The real question, he said, is “how do we bring those technical solutions into this very culturally diverse and scientifically diverse community?”
Some potential project participants already have a jump on meeting those challenges. For example, six Pennsylvania cancer centers created the Pennsylvania Cancer Alliance (PCA) last summer to build a statewide cancer-based biomedical informatics network [BioInform 07-15-02]. Michael Liebman, director of computational biology and biomedical informatics at the Abramson Cancer Center at the University of Pennsylvania, said that transparency for the end-user has been an important consideration in the PCA project. “What you’re going to do is potentially provide a metalayer in which everyone can share things…You can’t go and ask people to suddenly change everything they’re doing, but to add on something that enables them to basically translate it into that metalayer, and that enables the institution to basically maintain its day-to-day business, and not disrupt things,” he said.
Liebman noted that the NCI network is likely to have a different focus than the PCA network. While the latter is built on clinical data, such as demographics, family history, and clinical history, “the bioinformatics component of NCI tends to have a bottom-up approach, coming from the genomics perspective, and less of an emphasis and experience on the clinical side,” he said. Nevertheless, he said a nationwide informatics network would be welcome in the cancer research community. “Anything that moves toward facilitating the interaction of institutions, and enabling a more effective transfer of data, and therefore collaboration, is to be commended, especially in such a complicated area,” he said.
The Abramson Center and four other PCA members submitted proposals to participate in the pilot project, but had not yet heard back from NCI at press time.
Further information about caBIG is available at http://cabig.nci.nih.gov.