By Vivien Marx
STOCKHOLM, Sweden – The University of Manchester today launched Biocatalogue, a registry of curated life science web services intended to help scientists easily connect with teams offering web-based bioinformatics resources.
The university announced the first public release of the catalog here at the joint Intelligent Systems for Molecular Biology conference and European Conference on Computational Biology.
The catalog is the product of a collaboration between Carole Goble, who heads the Manchester team, and Rodrigo Lopez and colleagues at the European Bioinformatics Institute. EBI stores the catalog on its servers and performs back-end operational tasks such as data management and monitoring, while front-end development and curation are done at Manchester.
Goble told BioInform prior to the conference that a team of 11 members curates the services in Biocatalogue, which currently holds over 1,060 services and has nearly 50 users. Developers can register their own services through the website, and services are annotated via tags, user comments, and text descriptions. All annotated services and their components, such as operations, inputs, or outputs, are searchable.
There are other registries for web services, such as BioMoby, along with many less mature ones for the life sciences and the general-purpose Seekda search engine, but some of these projects lack extensive metadata, she said.
Life sciences researchers increasingly draw on web services to access data, run compute jobs, and perform in silico experiments, and the catalog is intended to help scientists — and software programs — locate the right service for the analysis they need.
This week's launch of Biocatalogue is the "end of the beginning," Goble said, adding that the intention is to keep the catalog a perpetual beta release. "It's an evolving and feature-expanding software and open to comment and also open to listening to its user base."
She believes that the catalog, as a central repository, will help scientists, who often don't know how to find web services or how to use them once they locate them.
"If I know where to go, how do I know how to use it? Because often the metadata is very poor. What do the operations mean? What are the constraints on using that service?" Goble asked.
For example, scientists might not know if the services have been recently versioned, how reliable they are, if local mirrors exist, which metrics have been collected about these services, or how often they have been called in a given time period. Scientists will also want to know who among their peers is using a particular service.
"It's a social gathering point for crowd-sourced information about the services," Goble said. More than a registry, "it is an aggregator."
The effort is a sister project of the MyExperiment repository of scientific workflows. "We're using a lot of the same code base and technology," Goble said.
The two resources are linked architecturally and conceptually, since workflows also use web services. Biocatalogue and MyExperiment each have "programmatic interfaces to the other," and can be incorporated into Taverna, a workflow platform developed as part of an open source middleware project in the United Kingdom.