The Blueprint Initiative has negotiated a deal to dramatically expand its BIND database, a move that could also help accelerate the broader development of pathway informatics.
The Blueprint research program, led by Chris Hogue and based at Mount Sinai Hospital in Toronto, announced at ISMB 2004 last week that it had secured additional funding to expand its activities into Asia and Europe. In addition, Blueprint said it has enlisted the support of the publishers of Science and Nature to aid the curation of molecular interaction records and ensure their deposition into the Biomolecular Interaction Network Database.
The news was one of several signs at ISMB that the pathways informatics field is quickly reaching full flower. A number of new pathway databases have launched recently, along with a range of editing and analysis tools, and representatives from several of these projects were at the ISMB meeting in Glasgow to discuss their work.
In addition, standards efforts in the field, such as BioPax and the Proteomics Standards Initiative, have progressed rapidly in recent months, and a number of sessions at the conference addressed ways to ensure that these exchange formats remain compatible as the field evolves (see below for a list of pathway informatics tools at ISMB 2004).
Representatives of the broader pathways community at the conference welcomed the news of BIND’s success, particularly in light of the journals’ involvement in the curation process.
Before the announcement, a number of attendees at the BioPathways special interest group meeting, held July 29-30, remarked that community resources like GenBank and the Protein Data Bank were slow to develop until the major journals began requiring researchers to deposit their data upon publication.
While Science and Nature do not yet require deposition in BIND, most observers viewed the journals’ new relationship with Blueprint as a step in that direction and a sign that the pathways database field is reaching critical mass.
But pathway and interaction databases are proliferating at such a rapid rate that there is some disagreement regarding the best way to make this growing body of information available to researchers.
Some are in favor of consolidation. Chris Sander of Memorial Sloan Kettering Cancer Center, for example, proposed at the BioPathways SIG a “shared community multi-site pathway resource” that would collect information from each individual resource and make it available through a unified interface.
Others favor separate, but interoperable, resources based on emerging standards like BioPax, PSI-MI, and SBML. Most of these exchange formats are relatively new — BioPax Level 1, which addresses metabolic pathways, was just released in early July, while PSI-MI (for molecular interactions) is slated for a September release. While most of the data resources are open to using these formats when they become available, adoption — and their eventual success — is not guaranteed.
Some questioned whether there is really a need for all of these resources. Hogue noted that the pressure for scientists to publish new research discourages collaborative efforts and has led to a series of “me-too” databases in the field.
But there are also arguments in favor of giving researchers some choice. Gary Bader of MSKCC observed that not only does each of the databases contain different data, but they have all been designed for different users and different use cases, “and they all have their own strengths in doing their own things.” Ramon Felciano, CTO of Ingenuity, agreed: “There’s room for lots of players in this space,” he said. “It all depends on who you’re trying to help.”
Felciano was also supportive of BIND’s newly ramped-up effort. “It validates the fundamental technical approach we’ve been pushing for the past five years,” he said. Felciano told BioInform that Ingenuity doesn’t see BIND or any of the other public pathway database efforts as a threat to its business model. Unlike commercial genomic databases, which failed to keep up with public efforts, Ingenuity has been able to build a unique resource thanks to the multi-dimensional nature of pathway data and a five-year head start, he said. “We have yet to find a group with the level of detail that is in our database,” Felciano said.
BIND currently holds more than 80,000 interactions for over 800 organisms, which were gathered by around 40 curators in Blueprint’s Toronto facility. The initiative plans to add 40 new curators, split between new facilities in Singapore and London, who will mine the literature to add another 60,000 interactions over five years.
In addition, Blueprint will work with Science and Nature to ensure that interaction data from those journals is deposited directly into BIND upon publication, and that BIND accession numbers are assigned to the interactions in the papers.
Brian Ray, senior editor of Science, said in a statement that the journal is “actively encouraging” authors of appropriate papers accepted for publication to deposit their data into BIND.
Eric Andrade, managing director of Blueprint, told BioInform that the initiative is currently in discussions with around nine journals about making BIND deposition a requirement for publication.
Blueprint is currently funded with a three-year, CDN$29 million ($22 million) grant from Genome Canada. Support for the expanded effort came from Singapore’s Economic Development Board and Sun Microsystems.
The amount of the EDB funding was not disclosed, but Stefan Unger, Sun’s business development manager for computational biology, said that Sun is providing “in-kind” matching funds in the form of hardware. Sun will reproduce in Singapore and London the IT infrastructure that it has already installed at Blueprint’s Toronto facility [BioInform 02-10-03].
Unger said Sun also aided Blueprint by helping to “broker” the initiative’s Singapore location. Blueprint has not yet secured funding for the European effort, but Andrade said the project is working with several partners, including Nature Publishing, to secure European funding.
Andrade said BIND will not be limited to protein-protein interactions, but will also include information on protein-DNA and protein-RNA interactions, as well as small-molecule and genetic interactions.
Noting the “energy barrier” that exists for researchers to deposit their data in resources like BIND, Andrade said that Blueprint began an aggressive effort about a year ago to start building out the resource.
First, he said, Blueprint began a “strategic back-filling” effort to enter in “every interaction” from prominent researchers like Marc Vidal, Charles Boone, and Anthony Pawson. Next, he said, the effort began “stalking” the journals. The Blueprint team systematically assigned BIND IDs to all the interactions in every paper as they were published, deposited the interactions in the database, and fired off letters to the author of the paper and the editor of the journal notifying them that the BIND accession number had been assigned. Soon, he said, the journals “took notice” and the discussions began.
Blueprint has released all of the BIND software, methods, and documentation under an open source license.
Blueprint was recently granted a US patent protecting the BIND data model [BioInform 06-14-04], but Hogue noted that this was purely a “defensive” IP strategy. Mt. Sinai’s researchers are expected to patent their work, but Hogue convinced the hospital’s commercialization committee that “an open format would be more successful than a commercial one,” he said.
“I won’t use the patent to extract license fees,” he added. “I wanted to make sure that no one in the commercial sector could ever shut BIND down.”
Pathway Tools at ISMB 2004
Pathway Visualization, Editing, and Analysis:
- CPath: http://www.cbio.mskcc.org/cpath/index.jsp
- BioGraphNet/BioTrawler: http://llama.med.harvard.edu/cgi/BioTrawler
- Cadlive: http://kurata21.bse.kyutech.ac.jp/cadlive/
- Cytoscape: http://www.cytoscape.org
- FluxAnalyzer: http://www.mpi-magdeburg.mpg.de/de/research/projects/1010/1014/1020/mfaeng/fluxanaly.html
- GenMAPP: http://www.genmapp.org/
- Pathway Editor (EMP Project): http://www.empproject.com
- Patika: http://www.patika.org/
- VisAnt: http://www.visant.bu.edu
Pathway and Interaction Databases:
- Amaze: http://www.amaze.ulb.ac.be/
- BIND: http://www.blueprint.org/bind/bind.php
- BioCyc: http://www.biocyc.org
- DIP: http://dip.doe-mbi.ucla.edu/
- Ingenuity Pathways Knowledge Base: http://www.ingenuity.com/
- INOH: http://www.inoh.org
- IntAct: http://www.ebi.ac.uk/intact/index.html
- KEGG: http://www.genome.ad.jp/kegg/kegg2.html
- MIPS: http://mips.gsf.de/
- Reactome: http://www.reactome.org
- SABIO: http://www.eml-research.de/
- WIT/ERGO: http://ergo.integratedgenomics.com/ERGO/
Pathway Standards Efforts:
- BioPax: http://www.biopax.org/
- CellML: http://www.cellml.org
- KGML: http://www.genome.ad.jp/kegg/xml/
- PSI: http://psidev.sourceforge.net/
- SBML: http://sbml.org/index.psp