BURLINGAME, Calif.--Five proposals for biomolecular sequence analysis interface specifications were submitted for consideration at a week-long meeting of the Object Management Group's Life Sciences Research domain task force here this month. Eight bioinformatics companies and organizations responded to a request for proposals released by the task force, which was formed last year to establish standards that will allow interoperability among CORBA-based tools used in life sciences research.
The biomolecular sequence analysis specifications, which the group hopes to finalize late next summer, will be the first it adopts. But other specifications should follow quickly. After meeting here November 9-13, the group issued a request for proposals on genomic map specifications and requests for information on gene expression and entity differentiation services--methods that allow researchers to mine matching genetic information that is referenced inconsistently across databases.
David Benton, a task force cochair and SmithKline Beecham's director of hyperlinking, said the meeting, which drew close to 50 attendees from the bioinformatics vendor and end-user communities, was the group's most productive to date. In other news from the meeting, Timothy Slidel of SmithKline Beecham retired from his post as cochair. Tim Clark of Millennium Pharmaceuticals was elected to replace him.
Sequence analysis specifications were proposed by Concept 5, with Millennium Pharmaceuticals and Oxford Molecular Group; NetGenics, with Genome Informatics; Neomorphic Software; the European Bioinformatics Institute (EBI); and Molecular Applications Group. Documents are available on the task force's web site, http://www.omg.org/homepages/lsr.
Steve Chervitz, a software engineer at Neomorphic who became involved with the task force as a postdoctoral researcher at Stanford, said his team's submission incorporated more biology than the others. "We have a lot of experience working with biological data models and we've added a fair amount of that into our proposal," he explained. "We've really focused a lot on looking at the relationship between sequence object analyses and sequence features."
Other submissions focused more on analytical issues. "The Millennium, Oxford Molecular, and Concept 5 submission and the one from EBI were stronger on the analysis end of things," Chervitz continued. "Ours and NetGenics's were more on the data-modeling side--more elaborate data objects, including features and sequence objects."
Molecular Applications Group's submission, which relied on Extensible Markup Language (XML), raised debate over whether XML-based strategies should be considered. "If this is going to be a widely adopted standard it might be short-sighted to require that XML be used for this particular area," remarked Chervitz.
Following the meeting, Molecular Applications' Drew Wade clarified in an e-mail message to the group that his team remains interested in working with other participating companies to draft a revised proposal for adoption.
That cooperative spirit is expected to result in a single revised proposal that would be brought to the table at a March meeting of the task force in Philadelphia. According to Chervitz, "All of the submitters were pretty open about collaborating and merging the best features from each submission so that in March we will ideally have just one submission that will get voted on and have all these companies' names on it."
"We were primarily focusing on specific aspects of the request for proposals, knowing that others would have spent more time on other areas so we would have some complementarity and be able to offer ideas within a subset of the requirement," he continued. Now Neomorphic will start working with other submitters to create a revised joint submission that "will have some of our ideas and some of their ideas, instead of just retooling ours," he said.
"In an ideal case, when the revision deadline arrives there is just one submission," Benton agreed. "All the vendors will have gotten together and combined their best technology into one specification that they submit, so there's one submission that's viewed by most of the vendors as the best proposal they can collectively put forward at that time." He added, "When these revisions come in, I am convinced there will not be five. Whether there will be one or not depends on chemistry and social interactions and feasibility."
Michael Dickson, NetGenics's senior vice-president of product development, said the collaborative process is a way to "arrive at a kind of best-of-breed standard."
If all goes according to schedule, the biomolecular sequence analysis interface specifications should be adopted at an August OMG meeting in San Jose, Calif. Once they are adopted, participating vendors will be required to commercialize a product within one year. Explained Chervitz, "As I understand it, the point of requiring companies to implement the specifications is a way to prove that they are able to be implemented. During implementation vendors may discover some problems that require revision and it's good to flush those out as soon as possible. Having that year requirement makes the revision process go quicker."
"This is a big difference between the Object Management Group process and, say, ISO," added Benton. "ISO can create standards and put them on the shelf and there's no guarantee that there will ever be a product embodying those standards. Here, all of the submitters have already sent in a letter of intent promising to make a commercial implementation of the adopted specifications."
In addition to commercializing a new product, NetGenics has pledged to develop a public-domain CORBA toolkit, based on the adopted specification, for academic and nonprofit sector researchers. "Whichever specification is adopted as the standard, NetGenics is committed to supporting that standard in the company's commercial products, as well as in its public-domain toolkit," said Dickson. "Since the chosen interface will incorporate the input of many organizations within the Life Sciences Research domain task force, the toolkit implementation ultimately will be the product of the collaborative insights of the best minds in the industry," he added. "As such, it should be readily available to anyone who could benefit from its use."
Although participants in the specification adoption process are enthusiastic about the level of cooperation among vendors, some members of the community are conspicuously absent from it. For instance, some pharmaceutical companies that Benton characterized as "waiting on the fence to see if CORBA will be widely adopted" have yet to contribute to group discussions.
"There are a number of big pharmaceutical companies that should be involved, I think, that aren't. They would bring more weight and encouragement to the vendors," he argued. "If we don't get enough customers involved, vendors will say, well, everything must be fine, why should we make an extra effort to make our software interoperate. But that's exactly what everybody in the pharmaceutical industry wants to get away from."
Another bioinformatics community member noticeably absent from the task force is the National Center for Biotechnology Information (NCBI). "They're a major database vendor and it would be helpful to have them there," Benton continued. "EBI has made a real commitment to put CORBA interfaces onto public sequence databases and other databases they maintain, so if NCBI doesn't, that's OK, we know a place where we can get it." But, he added, "the intellectual energy and creativity NCBI could bring to the process would be great."
Overall, though, Benton said momentum is building for the development of standards for life-science research tools. "All of us need to be more cost-effective, and the way to do that is clearly to get away from single-vendor solutions, because no single vendor has all the solutions we want. We are working to get to a place where we can buy software components and mix and match and plug and play."
Unfortunately, he conceded, it will take some time "before you get to that software-component Nirvana."
The next meeting of the Life Sciences Research domain task force will be held in conjunction with the next Object Management Group technical meeting, in Arlington, Va., January 11-15.