Celera's recent decision to release its genome data into the public domain was hailed as a victory for public bioinformatics resources over commercial models, but a funding crisis at Canada's Blueprint Initiative shows that many public efforts are also on shaky ground when it comes to long-term sustainability.
In the latest example of a persistent challenge for public bioinformatics resources, the non-profit Blueprint Initiative last week laid off 33 people, around half of its staff, as it faces the expiration of its primary grant from Genome Canada on June 30.
The research program, housed at the Samuel Lunenfeld Research Institute at Toronto's Mt. Sinai Hospital, was awarded Can$29 million ($23.1 million) in 2003 to develop the BIND (Biomolecular Interaction Network Database) resource, but has been unable to secure additional funds to carry it past the grant's expiration, said Christopher Hogue, principal investigator for Blueprint.
"The grant is non-renewable, and we have not been able to find continued support from either the Canadian federal or provincial government to continue the project," Hogue said.
Hogue said that Blueprint still plans to maintain BIND, despite the funding gap. The initiative will spend the next two months "winding down" its Toronto operations and moving its curation capabilities to Blueprint Asia, its Singapore facility, which opened last summer [BioInform 08-09-04].
"We're continuing to look for a solution for the funding here in Ontario, but it hasn't come in time, [so we had] to lay off roughly half of our staff," Hogue said.
The situation highlights the tenuous financial underpinnings for most public bioinformatics resources outside those at NCBI and EBI. In many cases, developers are awarded a research grant to create a database or software tool, but find it difficult to secure follow-on funding for continued support and maintenance of the resource. BIND is the latest in a string of bioinformatics tools, including EMBOSS, the GDB, and SwissProt, that have had to scramble to find alternative sources of funding or hosts once their original grants expired.
Very often in bioinformatics, "you get funded for one cycle, a second cycle, you become a reliable resource for the community, and then all of a sudden you don't get funded for whatever reason," said Francis Ouellette, director of the University of British Columbia's bioinformatics center. "Where does one get a database funded permanently, apart from sending it to NCBI and having NCBI do it?"
Ouellette, who serves on the scientific advisory board for Blueprint but is not actively involved in its development or funding, noted that BIND will probably survive in some form. "It will temporarily or permanently go somewhere else," he said, "but from a user perspective, [the funding gap is] very unfortunate."
The BIND developers "were just starting to reach the milestones that they set out and that they got reviewed on, and they said they would obtain and did obtain, and now they have to sort of pull the plug on the whole thing," Ouellette said. "It's really bad timing."
Ouellette pointed out that the financial instability of bioinformatics resources affects the broader scientific community — not just the database or software developers that are out of work when their funding runs out. "We're publishing stuff that's relying on that data being out there," he said. "It's great that there's an alternative source for BIND, but how stable is that going to be?"
What Are Funding Agencies Doing About It?
International funding agencies have yet to establish effective mechanisms for ensuring long-term funding for bioinformatics resources, although there appears to be a growing awareness of the problem.
Eric Jakobsson, director of the NIGMS Center for Bioinformatics and Computational Biology, told BioInform via e-mail that the agency is "in the early stages of planning a workshop that will include the major players and stakeholders involved in the provision and use of bioinformatics resources to help us make rational policies for creating and sustaining databases and associated computational environments."
Noting that it was still too early to discuss the details of the workshop, Jakobsson did provide some of the rationale behind the proposed meeting. "We are seeing the need for knowledge environments that encompass higher-level and more applied data than nucleic acid and protein sequences, including data on pathways, networks, tissues, organs, and even clinical data as a source of information for research purposes," he wrote. "With the development, deployment, and use of extramurally funded databases, data management systems, and knowledge environments has emerged two new concerns — those of interoperability and sustainability."
In the case of sustainability, he noted, "If we consider a knowledge environment that the community has come to rely on, what happens when the developer and caretaker of that knowledge environment writes a bad renewal proposal? Or retires? Or dies? Or becomes incapacitated? Or decides to discontinue support of the knowledge environment and pursue other ventures? NIH needs a sound policy foundation to deal with all of these eventualities."
The National Science Foundation has also recognized the importance of this issue, and recently released a draft report from its Board on Long-Lived Digital Data Collections (available at http://www.nsf.gov/nsb/meetings/2005/LLDDC_draftreport.pdf), which recommends that NSF "develop an agency-wide umbrella strategy for supporting and advancing long-lived data collections." The board will discuss revisions to the document during May, but the NSF has not indicated when a final report is expected.
Christian Burks, president and CEO of the Ontario Genomics Institute, told BioInform that funding agencies and developers often overlook the fact that "maintaining the database is going to be just as expensive as it was to build it." Maintenance requires new software development, operational staff, hardware, and other expenses, he noted, "so everyone understanding that it's going to cost at least as much, if not more, forever — and what is the plan around that — is a good up-front realization to have."
Burks, who worked at Los Alamos National Laboratory on the early development of GenBank, said that many bioinformatics databases run into funding difficulties because of the "two-phased" nature of their development. He noted that GenBank, like many databases, started off as a research effort run by a handful of scientists. As the project matured and attracted community interest, "it was a tough transition," he said. The early developers of the database had to relinquish some control in order to ensure that it served the community. "At the end of the day we signed off … and decided that it was more important that GenBank survive than that any particular group runs it."
Burks noted that this "theme" has run through other database efforts, such as the Protein Data Bank, which also started out as a single-site research project but eventually matured into a community-run effort. Databases that are managed by a "syndicate" rather than a single lab have a better chance of surviving, he said, "because any single institute will take a smaller hit for the total cost."
Burks suggested that BIND — and other interaction databases — are currently at a similar stage in their development, which is why funding models pose a particular challenge right now.
David Eisenberg, a professor of biological chemistry at the University of California, Los Angeles, agreed. His lab runs DIP (the Database of Interacting Proteins) "on a shoestring" right now, with no direct funding. However, he said, "It's clear that with the growing volume of interaction data that it would require more stable funding, so we have applied to NIH for funding."
Eventually, he said, "When interaction analysis becomes more powerful than it is today, then it would be appropriate for NCBI or other central agencies to take it over, but today it's still a frontier field."
Eisenberg also cited the PDB as a positive model for emerging database efforts to follow. "The PDB started with one person who collected coordinates at Brookhaven National Lab … and then later grew to be an important resource for the whole community, and that's when its funding was regularized and stabilized."
But when the funding for BIND and other interaction databases may reach that point is still anyone's guess. Eisenberg said that it's "hard to predict the outcome" of BIND's situation, although he said he was aware of "some discussions" between the Blueprint team and funding agencies in Ontario. "People see it as a significant resource and a nice outcome of the funding they received previously," he said.
Some observers familiar with Blueprint's situation suggested that the project's funding woes may be partly due to the complexities of Genome Canada's funding model, which requires that all applicants secure matching funds from a local funding body or other source before applying for a grant. Blueprint was previously funded by the Ontario R&D Challenge Fund (ORDCF), but that agency is currently in "a transition period" as it merges with two other Ontario funding bodies to form a new agency called the Ontario Research Fund (ORF), according to an ORDCF spokesman.
The spokesman confirmed that ORDCF did not renew its funding for Blueprint. Meanwhile, a spokeswoman for the ORF said that the new agency will not open calls for proposals until later this spring. While declining to comment on Blueprint's situation, she did say that "there could be a little bit of a timing issue" for projects that were seeking to line up their local funds in time to file for the latest round of Genome Canada grants.
Hogue declined to lay blame for Blueprint's funding gap on any particular funding body, but he did note that "science funding in Canada is like playing musical chairs on a roller coaster."
Meanwhile, Blueprint has moved on to plans B, C, and D. Hogue said that he has spoken to Blueprint's supporters in Singapore — which include Singapore's Economic Development Board, the Genomics Institute of Singapore, and the National University of Singapore — about the possibility of expanding Blueprint's operations there.
In addition, he said, "We're happy to talk with another jurisdiction that would consider funding [the project]."
Following the layoffs last week, Blueprint announced that its Singapore facility had assumed all responsibility for an agreement between the Blueprint Initiative and the American Association for the Advancement of Science that the organizations signed last December [BioInform 12-13-04]. Blueprint Asia will ensure that molecular interaction data from the journal Science is entered into BIND ahead of publication, and that BIND identifiers are delivered for inclusion in the final publication.
"Frankly, we really didn't notice the transfer of Science paper curation from Blueprint North America to Blueprint Asia," said Bryan Ray, senior editor at Science, in a statement.
In addition, Blueprint announced last week that Blueprint Asia will begin offering "directed-curation services" in which "Blueprint curators will work with customers to identify, annotate, and cross-reference molecular interaction information from the peer-reviewed literature as it relates to a specific disease state or biological condition."
Pricing details for the service, which will enable customers to have access to the curated data prior to its release through the free version of BIND, were not provided.