With $9 million in funding from IBM and MDS Proteomics, Blueprint Worldwide, a new not-for-profit organization, plans to build a free, comprehensive database of protein interaction data that will contain information from over 200,000 scientific papers.
The database, called BIND, for Biomolecular Interaction Network Database, has been publicly available since its launch as a prototype late last year at the Center for Molecular Medicine and Therapeutics (CMMT), an affiliate of the University of British Columbia. MDS Proteomics and IBM established Blueprint to develop the database more rapidly and free of bureaucratic restrictions, said Francis Ouellette, a bioinformaticist at the University of British Columbia who is currently directing the initiative.
“From our sponsor’s perspective, and us, we want to make things happen right now,” said Ouellette.
While there are many publicly and privately owned databases of protein information, BIND is designed to be the primary source for researchers interested in learning how specific proteins interact with DNA, RNA, and small molecules, said Ouellette. “I would challenge any database, public or private, that says ‘we have the same information,’” he said. “They may have similar types [of information], but not the scope or format.”
BIND’s initial development occurred at the Samuel Lunenfeld Research Institute, an arm of Mount Sinai Hospital in Toronto, where Chris Hogue, now MDS Proteomics’ chief information officer, designed the database while he was an academic researcher. The database architecture is modeled after NCBI’s GenBank, Ouellette said, with an additional level of protein interaction data layered on top. Hogue and other developers of the database used the data description language ASN.1 to design the software, and IBM provided servers and data storage hardware.
Currently, BIND consists of 5,800 records, containing mostly protein interaction and signal transduction information taken from yeast studies. Since its launch, researchers at CMMT have funded their efforts primarily through Canadian government grants, but the IBM and MDS Proteomics contributions will fund the project for the near future, Ouellette said.
Ouellette said he hopes to have 40 to 50 curators and software developers working on the project by the end of the year. The organization has not yet established a permanent headquarters but is temporarily based in Vancouver. It could grow to as many as 100 employees in five years, he added.
MDS Proteomics and IBM are spearheading the project for both altruistic and commercial reasons, said Frank Gleeson, CEO of MDS Proteomics. “We do have the belief that scientific advances are accelerated by close interaction between the private and public sectors,” he said. On the other hand, because MDS Proteomics has its own version of BIND that contains proprietary protein data, any initiative that creates a common infrastructure for codifying biological data “will hasten our ability to build relationships around our own BIND database,” Gleeson added. Similarly, IBM vice president for life sciences Carol Kovac said that working with MDS Proteomics on the database “teaches us a lot about how to support biological research with IT.”
But MDS Proteomics and IBM are not the only parties with commercial interests in building databases of protein information. Among other companies, Proteome, now a division of Incyte Genomics, has developed a subscription database that contains human protein data from about 100,000 scientific papers.
Jim Garrels, Proteome’s CEO, said the public database would not make his company’s products redundant, both because his database is several years ahead of BIND and because Proteome can always add proprietary data from Incyte and other partners on top of the publicly available data.
Ouellette, for his part, acknowledged that many private companies will eagerly access BIND. “My sense is that a lot of for-profit companies will reformat [the data], and make it available by selling it,” he said. “That’s fine. That’s part of being a public database.”