CAMBRIDGE, Mass.--The effort to produce bioinformatics software standards took an important step forward last month at a meeting here of the Object Management Group's Life Sciences Research Domain Task Force. Although the approval process is many months from completion, two key preliminary endorsement votes were held on CORBA software standards for biological sequence analysis and genomic maps.
The European Bioinformatics Institute, Millennium Pharmaceuticals, and NetGenics jointly submitted a proposed standard for genomic maps software. Those three organizations also cooperated with Concept Five Technologies, Genome Informatics, and Neomorphic to create the biological sequence analysis document. The two standards will be the first created for life sciences research.
At the meeting, task force members and OMG's architecture board voted to recommend that the two proposals be adopted. With that vote done, a referendum involving the technical committee has begun, said David Benton, chairman of the domain task force.
After the technical committee completes its vote, the sequence analysis and genomic maps proposals will be voted on by OMG's board of directors at a meeting in Denver, March 6-10. If approval is granted, each document will be formally considered an adopted specification, and the vendors involved will have one year to implement it. Public domain implementations will also be developed. Each organization involved in the process is committed to introducing products that comply with the specification within one year of adoption, said Benton, who works for SmithKline Beecham's advanced information technology department.
The genomic maps specification aims to give researchers more flexibility by enabling maps to be treated as complex objects. For example, once the new object-oriented interface standard is in place, an investigator who needs only part of a chromosome map will be able to retrieve just those pieces instead of having to access a map database and download the entire map. "You can go to whatever level of detail is desired," Benton explained. "The specification allows representation of maps from the chromosome banding resolution down to single-nucleotide resolution and anything in between." This will be a major improvement, he added.
Philip Lijnzaad, chair of the genomic maps working group, said the goal is to have a standard for representing and supplying maps and their content. "The advantage of using CORBA rather than a flat-file format or even extensible markup language (XML) is that you can offer much richer functionality, such as distribution, querying, and iteration," said Lijnzaad, R&D officer at the European Bioinformatics Institute. "The principal benefit, if the standard becomes widely used, will of course be the easier interchange, integration, and comparison of map data. What the standard does not do is calculate maps or display them graphically."
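The idea of treating a map as an object that can be queried and iterated over, rather than downloaded whole, can be illustrated with a minimal Python sketch. It is not the OMG interface: the names GenomicMap, MapFeature, and features_in_range, along with the sample positions, are hypothetical and chosen only for illustration, and a real CORBA client would call a remote object rather than a local one.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass(frozen=True)
class MapFeature:
    """Hypothetical mapped element: a band, a marker, or a single base."""
    name: str
    start: int  # position in base pairs, inclusive
    end: int    # position in base pairs, inclusive


class GenomicMap:
    """Hypothetical local stand-in for a remote map object (not the OMG API)."""

    def __init__(self, name: str, features: List[MapFeature]) -> None:
        self.name = name
        # Keep features ordered by start position so results come back in map order.
        self._features = sorted(features, key=lambda f: f.start)

    def features_in_range(self, start: int, end: int) -> Iterator[MapFeature]:
        """Yield only the features that overlap the interval [start, end]."""
        for feature in self._features:
            if feature.end >= start and feature.start <= end:
                yield feature


# Invented sample data spanning coarse (band) to fine (single base) resolution.
chr_map = GenomicMap("chr-example", [
    MapFeature("band-A", 1, 5_000_000),
    MapFeature("marker-1", 1_200_000, 1_200_250),
    MapFeature("snp-1", 7_300_000, 7_300_000),
])

# Ask for one region instead of fetching the whole map.
region = [f.name for f in chr_map.features_in_range(1_000_000, 2_000_000)]
print(region)  # ['band-A', 'marker-1']
```

In this sketch the caller receives only the band and the marker that overlap the requested region; the single-nucleotide feature elsewhere on the map is never transferred.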
In Benton's view, the main benefit of the biomolecular sequence analysis framework is that it puts a standard interface on almost any sequence analysis algorithm, so that desktop software can check which algorithms are available from a server and then run them. "Having a uniform interface to these rather than every analysis program having its own unique or idiosyncratic interface is the big advantage," he said.
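A rough sketch of that pattern, in which a client asks a server what analyses it offers and then runs each through the same call, might look like the following. The class and method names (SequenceAnalysis, AnalysisServer, available, run) and the two toy algorithms are assumptions made for illustration; they are not taken from the proposed specification, and the "server" here is an ordinary in-process object.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class SequenceAnalysis(ABC):
    """Hypothetical uniform interface that every analysis implements."""
    name: str

    @abstractmethod
    def run(self, sequence: str) -> str:
        """Analyze a nucleotide sequence and return a text result."""


class GCContent(SequenceAnalysis):
    name = "gc_content"

    def run(self, sequence: str) -> str:
        seq = sequence.upper()
        gc = sum(1 for base in seq if base in "GC")
        # Report the G+C fraction to two decimals; empty input yields 0.00.
        return f"{gc / len(seq):.2f}" if seq else "0.00"


class ReverseComplement(SequenceAnalysis):
    name = "reverse_complement"
    _pairs = str.maketrans("ACGT", "TGCA")

    def run(self, sequence: str) -> str:
        # Complement each base, then reverse the strand.
        return sequence.upper().translate(self._pairs)[::-1]


class AnalysisServer:
    """Hypothetical in-process stand-in for a remote analysis server."""

    def __init__(self, analyses: List[SequenceAnalysis]) -> None:
        self._analyses: Dict[str, SequenceAnalysis] = {a.name: a for a in analyses}

    def available(self) -> List[str]:
        """List the names of the analyses this server offers."""
        return sorted(self._analyses)

    def run(self, name: str, sequence: str) -> str:
        """Run the named analysis through the shared interface."""
        return self._analyses[name].run(sequence)


server = AnalysisServer([GCContent(), ReverseComplement()])

# The client discovers what is available, then calls everything the same way.
results = {name: server.run(name, "ATGCGC") for name in server.available()}
print(results)  # {'gc_content': '0.67', 'reverse_complement': 'GCGCAT'}
```

The client code never needs to know how either algorithm works, which is the advantage Benton describes: adding a third analysis to the server would require no change on the desktop side.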
Scott Markel, chair of the working group on biomolecular sequence analysis, said the meeting was one of the group's best in terms of what was accomplished. The affirmative votes spurred enthusiasm among participants, who began to see their efforts pay off. Getting consent from the architecture board was important "because those are the people who understand the overall CORBA architecture the best and have final say on whether what you're proposing fits in with everything else that exists," added Markel, who represents NetGenics in the OMG.
Other standards being developed include one for macromolecular structures software and another for bibliographic query services. Requests for proposals are out and initial submissions have been received.
In addition, four other working groups are composing requests for proposals to be issued at either the January 10-14 meeting in Mesa, Ariz., or the March meeting, said Benton. Those four areas are gene expression, laboratory data interchange for clinical trials, cheminformatics for small molecule representation, and entity identification.
The gene expression group, chaired by Douglas McArthur, used the November meeting to review the responses to its request for input submitted by EBI, NetGenics, and Rosetta Inpharmatics, where he is a member of the computational biology group. With one more reply expected, the group is preparing a request for proposals to be distributed at the Mesa meeting. "We're also trying to increase participation, especially of large pharma in OMG meetings--both for attendance and reviewing the documents we are putting together," McArthur said. Some pharmaceutical companies aren't aware of the effort and others are too busy to participate, he speculated.