Chris Hogue, the principal investigator for Canada's Blueprint Initiative, is mulling a commercial model for the Biomolecular Interaction Network Database (BIND) in the face of a funding drought that has forced the project to cease curation activities.
On Nov. 16, Hogue posted a notice on the BIND website (http://bind.ca/) informing users that Blueprint, a non-profit research program housed at the Samuel Lunenfeld Research Institute at Toronto's Mt. Sinai Hospital, had "fully expended available core funding in both Canada and Singapore," and would stop curating BIND.
In an attempt to save the project, Hogue is considering making the database available through a commercial entity, a move that carries its own risks. "Public databases are essential requirements for the future of life sciences research," Hogue wrote in the notice on the BIND website. "The question arises 'Will these be free or will they require a subscription? Should BIND/Blueprint be sustained as a public-funded open-access database and service provider?'"
Hogue told BioInform that the project, which once employed as many as 70 curators in Canada and another 15 in Singapore, is down to a handful of staff. Hogue has two full-time employees in his lab working on BIND now, and funding for their salaries will run out "early next year," he said.
In May, the Blueprint Initiative learned that its funding from Genome Canada would not be renewed because the project failed to meet a requirement to secure matching funds. At the time, the organization planned to move all of its curation capabilities to Blueprint Asia, the initiative's Singapore facility [BioInform 05-09-05].
But Hogue said last week that the initiative's grant from Singapore's economic development board also required matching funds to continue, which Blueprint was unable to secure.
Blueprint Asia's burn rate "accelerated" when the project's Canadian operations shut down, Hogue said. "We made about four or five proposals to various other funding agencies in Singapore that could have provided the match to [the economic development board], and they didn't go anywhere, so we've been forced to shut down that operation," he said.
In the meantime, Mt. Sinai has pledged to host Blueprint Initiative resources, including BIND and the SeqHound data warehouse, while Hogue explores alternative means to support the effort.
In a Bind
Hogue told BioInform that he would consider offering BIND through Unleashed Informatics, a company spun out of the Blueprint Initiative in April to commercialize a Sun cluster pre-loaded with SeqHound and other Blueprint resources [BioInform 04-25-05]. The company employs around five people, Hogue said.
"As a spin-off of Mt Sinai Hospital, Unleashed Informatics essentially has the right of first refusal for any intellectual property that was created in my laboratory, including databases and source code," he said. "So this is the last resort, and we are trying to structure something so that Unleashed can continue to provide access to BIND in an open-access manner. We have not finalized the terms yet, but that is probably what is going to happen."
Hogue said that there are "other elements of the source code that Unleashed may decide to commercialize," but added that he would like to see BIND remain in the public domain.
It's still unclear, however, whether (or how) Unleashed would be able to support continued curation for BIND. "There may have to be a substantial change to the approach we take to curation in order for it to fit the business model," he said. "We have to see how much revenue it may actually generate, [and] see how much money people will pay for a hand-curated database."
Hogue noted that there are substantial risks involved in supporting a public domain database with a commercial model. "I don't relish taking a database like BIND and trying to make a business out of it," he said. "That's why we'll try to continue to keep it operating on an open-access model. Its growth will be limited, and if somebody wants more, they're going to have to figure out a way to pump some funding back into it."
Hogue said that the commercialization question arose only after exhausting all other alternatives. "We've circled all the bases, nationally and internationally," he said. "We asked everybody, and everybody said no."
As for whether the National Center for Biotechnology Information in the US might consider hosting BIND, Hogue said, "I've had that discussion with the folks at NCBI already and they indicated that it would be an additional liability to bring in an additional 3.5 million lines of source code that they didn't write and try to figure out what to do with it. … The bottom line in every case is that it's too expensive for somebody else to pick up cold."
NCBI did not respond to BioInform's request for comment on whether it would consider hosting BIND.
As far as the option of coordinating a letter-writing campaign among the scientific community, "that didn't work for Swiss-Prot, so we're not going to do it," he said.
Hogue was referring to the Swiss-Prot protein sequence database, which was originally developed by the European Bioinformatics Institute, but was forced to go commercial via Geneva Bioinformatics in 1996 when its grant expired. The resource reverted to a public-domain model in 2002 when the NIH pledged $15 million to create the UniProt database out of Swiss-Prot, TrEMBL, and the Protein Information Resource [BioInform 10-28-02].
The commercialization of Swiss-Prot initially "made everybody upset and they all railed against the evil commercial intent," Hogue said, "and now I hear people saying, 'Well, that commercial activity saved it and allowed it to prosper again as an open access public good entity,' and maybe that's what the future holds for BIND."
Déjà Vu all Over Again
"There are no new lessons here; there are no new lessons to learn. We've seen this all before," Hogue said.
Indeed, BIND's funding troubles mirror those of Swiss-Prot and other public resources like EMBOSS, which have struggled to secure follow-on funding after their initial grants ran out.
"It's always difficult to get research money, but it is too often the case that a project is started up with public funding, and then, even if it is successful and going very well, if you just claim three years later that you want to continue a successful resource, it's very, very hard to get the appropriate funding for that," said Henning Hermjakob, team leader for proteomics services and principal investigator for the IntAct interaction database at the European Bioinformatics Institute.
However, Hermjakob said, it appears that there is growing awareness of this problem among the European funding agencies.
"We are currently in the negotiation phase for an EU contract, where the aim of the contract is to sustain successful resources rather than create something new," he said. While there is no guarantee that the EU will ultimately support this mechanism, "We very much hope that in the new EU Framework 7 there will be provisions for more robust funding of database infrastructure," he said.
In May, Eric Jakobsson, director of the NIGMS Center for Bioinformatics and Computational Biology and chair of the Biomedical Information Science and Technology Initiative Consortium at the National Institutes of Health, told BioInform that the agency was hoping to bring together the "major players and stakeholders involved in the provision and use of bioinformatics resources to help us make rational policies for creating and sustaining databases and associated computational environments."
Jakobsson has since stepped down from his post, but John Whitmarsh, acting director of the NIGMS CBCB and acting chair of BISTIC, said that NIH is "in the process of planning a meeting, which hasn't been scheduled yet, which will address this issue in part, but it will be broader than that; it will be knowledge environments for biomedical research."
Whitmarsh acknowledged that "there isn't what you would call an NIH-wide policy or even guidelines for how to do this, and that's one of the critical areas that BISTIC is addressing."
While the issue is "on everybody's mind," Whitmarsh said, "there is no simple solution."
Database Interaction Network
BIND's funding crisis comes just as the study of molecular interactions is gathering momentum and as the interaction database community has begun to coalesce.
In August, BIND and several other major interaction databases (DIP, IntAct, MINT, and MPact) formed the International Molecular Exchange (IMEx) consortium. The group's primary goal is to eliminate redundant curation efforts by exchanging data on a regular basis, much as the GenBank, EMBL, and DDBJ sequence databases do under the International Nucleotide Sequence Database Collaboration.
In addition, Blueprint had signed agreements with nine journals to assign BIND identifiers to molecular interactions in pre-publication manuscripts. This arrangement was widely viewed as a first step toward mandatory submission of interaction data in public repositories, as is currently required for sequence, structure, and gene expression data.
David Eisenberg, principal investigator for DIP at the University of California, Los Angeles, told BioInform via e-mail that he was "sorry to learn that BIND's funding is in peril. The loss of BIND will be a loss to our entire community."
DIP's own funding has been uncertain in recent years, and Eisenberg said that his group has recently applied to NIH for new funding and that the proposal was "reasonably well reviewed," although he still does not know whether the grant will come through.
"If the support were to fail, it would mean that there would be no general interaction database in the Western hemisphere," he said.
Andrew Marshall, editor of Nature Biotechnology, said the journal has not yet decided what its future course of action will be. "In the short-term, the demise of BIND means we simply won't be offering BIND coordinates to data derived from our papers any longer," he wrote in response to an e-mail query. "To my knowledge, though, no real strong alternatives to BIND exist. There are some partial, poorly curated databases but none has the level of depth and accuracy that BIND offered."
The primary question, he said, "is whether the failure of BIND will hamper research progress. It certainly doesn't help it."
Ultimately, Marshall said, "the success of a public database depends on the support of the community." In the case of BIND, he said, the Nature Publishing Group "felt that the database offered by Blueprint was supported by a sufficient portion of the protein-protein interaction community to participate in the initiative."
However, he added, "It would be interesting to find out … whether BIND's failure to find funding reflects a problem in obtaining wider acceptance for the database as the place where protein interaction data would be lodged by the research community at large."
EBI's Hermjakob said that at some point, the IMEx consortium plans to approach the journals about taking on BIND's curation duties, "but we want to first have this infrastructure established and to have a clear separation of who curates from which journal." Ultimately, he said, "we hope to get the journals to encourage submission, and once everything is stable and working, then also to get mandatory submissions."
He said that the IMEx data-exchange infrastructure is currently in its "beta phase," and that the goal is to begin exchanging new interaction data early next year. Legacy data, including the information in BIND, will not be the first priority, according to Hermjakob, but may be integrated into the rest of the IMEx resources as time and funding allow.
IntAct, for example, hasn't yet made "any specific provisions" to start incorporating BIND data, "but we have made provisions to maintain a copy of the downloadable data in case for some reason the server should become unavailable."
That, he said, "is also a reason why we set up this network of databases, so if, for whatever reason, something should go wrong in one of the resources, then the load can be taken over by one of the other resources."
- Bernadette Toner