The Interoperable Informatics Infrastructure Consortium’s demonstration of a multi-vendor analysis pipeline at last week’s BIO 2002 may have seemed like déjà vu to some observers — the group trumpeted a very similar proof of principle at the BIO 2001 meeting last year — but according to I3C officials, the organization behind the technology is wholly different than it was a year ago, and last week’s demo is just one example of bigger and better things to come.
As a sign of the group’s mounting influence, Hewlett-Packard recently agreed to join, and a number of other organizations, including Biogen, Agilent, and the University of Manchester, are at varying stages of signing on the dotted line, according to Tim Clark, VP of informatics at Millennium Pharmaceuticals and interim chair of the I3C board of directors. “There’s a pipeline of people who are starting to come in,” he told BioInform. That is welcome news for an organization that has faced a fair amount of skepticism from the community over the past year.
“I was very cautious in the early days,” acknowledged Lionel Binns, formerly worldwide life and materials sciences manager at Compaq, and now at HP. “I didn’t want to have anything to do with it until they got more industry involvement.” Now, Binns said, HP is ready to “bring our experience in general data standards to bear on the I3C.” An as-yet-unidentified “senior member” of HP’s life sciences group will hold a seat on the board and another HP staffer will join the I3C’s technical architecture working group.
The I3C is counting on other holdouts to follow HP’s lead. “One critical thing for us is to add a couple of large pharmaceutical companies to the voting board,” said I3C board member Jeff Augen, director of business strategy for life sciences at IBM. The consortium is currently negotiating with “several large pharmas,” he said, “and I think once that happens and the board has all areas covered, people will say the I3C has the power now to start driving some real initiatives in the industry.”
A well-rounded board will be an important factor in the I3C’s bid to gain broad acceptance in the informatics community. The group ran into a bit of a public relations snafu when it put its initial board in place in February. The founding members of the I3C split into two primary groups in the early days of the consortium: the technical architecture working group and the governance committee. The governance team took responsibility for developing the I3C’s intellectual property policy and incorporating the group as a non-profit entity. One of the conditions of incorporating was declaring a board of directors on all the documentation, so the five-member governance team, made up of Clark, Augen, Sia Zadeh of Sun Microsystems, Morrie Ruffin of the Biotechnology Industry Organization, and Jill Mesirov of the Whitehead/MIT Center for Genomic Research, agreed to act as the interim board, with plans to expand as new members joined. This decision drew criticism from some I3C participants, however, who perceived this “self-election” as being out of step with the values of an open consortium.
Clark stressed that the current board is temporary and will soon be expanded to nine members. “We’re in the process of recruiting academic centers, biotechs, and technology companies to fill out the board,” he said, adding that by next March, a formal election process will be held at the I3C annual meeting.
Another crucial component in maintaining the I3C’s momentum is finding an executive director to handle the consortium’s day-to-day operations. Augen said that there are several candidates for the position and the group hopes to have a frontman in place by the end of July.
Filling this position should not only help the I3C better articulate its plans to the community but also accelerate the formalization of its policies, which are outlined on the group’s website (www.i3c.org) but haven’t yet been finalized. While lingering questions about the terms of membership may be one reason for the lack of enthusiasm witnessed in the community to date, Augen noted that he’s encountered more doubts about the effectiveness of the I3C. “It’s not so much that people are reluctant to join because they don’t know what the rules are, but people are reluctant to join until they see what I3C is really going to do … More than anything it’s a question of what the role of the I3C will be and how powerful it will be.” Essentially, he added, it’s been a classic catch-22: “You don’t have a large enough powerful group to attract members, and you don’t have enough members so you can’t be a large powerful group.”
Meanwhile, isolated from the politics of the logistical branch of the I3C, the technical architecture working group has been steadily plugging away for over a year on a set of web-services-based protocols to ease the exchange of life sciences data. The group has arrived at two recommendations so far: BSML (Biomolecular Sequence Markup Language), a way to represent genomic sequence data in XML that contains aspects of the AGAVE (Architecture for Genomic Annotation, Visualization, and Exchange) format developed by DoubleTwist, and LSID (Life Science Identifier), a scheme that attempts to assign a unique name to biological objects that are likely to have different names in different data sources.
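The idea behind LSID can be illustrated with a short sketch. The article does not give the identifier syntax, so the colon-separated URN layout below (`urn:lsid:authority:namespace:object`, with an optional revision) follows the form the LSID specification later took and may differ from the draft discussed here. The authority `example.org`, the source names, and the local record names are all hypothetical.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    """One identifier in the assumed urn:lsid:authority:namespace:object[:revision] layout."""
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str] = None

    def __str__(self) -> str:
        parts = ["urn", "lsid", self.authority, self.namespace, self.object_id]
        if self.revision is not None:
            parts.append(self.revision)
        return ":".join(parts)


def parse_lsid(text: str) -> LSID:
    """Split an identifier string into its fields, rejecting anything malformed."""
    parts = text.split(":")
    if len(parts) not in (5, 6):
        raise ValueError(f"not an LSID: {text!r}")
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID: {text!r}")
    if any(part == "" for part in parts[2:]):
        raise ValueError(f"empty field in LSID: {text!r}")
    return LSID(*parts[2:])


# Hypothetical lookup table: two data sources use different local names
# for the same biological object, and both map to a single LSID.
ALIASES = {
    ("source_a", "GENE-0042"): "urn:lsid:example.org:genes:42",
    ("source_b", "g42"): "urn:lsid:example.org:genes:42",
}


def resolve(source: str, local_name: str) -> LSID:
    """Translate a source-specific name into the shared identifier."""
    return parse_lsid(ALIASES[(source, local_name)])
```

With a table like this, `resolve("source_a", "GENE-0042")` and `resolve("source_b", "g42")` return equal values, which is the property the scheme is after: one name per object regardless of where the record came from.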
A copy of the LSID draft specification is available on the I3C’s website (www.i3c.org/workgroups/technical_architecture/index.html). The draft was submitted to over a hundred people in the bioinformatics community for comment, said Brian Gilman, group leader in the medical and population genetics department at the Whitehead and co-chair of the technical architecture working group. Comments are trickling in now, he said, and the next version of the specification will incorporate suggestions from the community. Comments posted on the website so far are generally positive, although reviewers haven’t been stingy with suggestions for improvement.
This method of developing standards differs from that of groups such as the Object Management Group, which issue requests for proposals on specifications first and then converge on a single standard. Gilman described the I3C’s methodology as an extreme-programming-based approach: The working group begins with a valid use case, collaborates on a reference model and draft specification, and then elicits suggestions from the community while simultaneously weaving them into the next version.
The I3C is counting on this development-focused approach to bolster its membership list. “Merely attending meetings isn’t going to buy you anything in the I3C. Hands-on work in the technical architecture group is how you get leverage in this thing,” said Clark. Any organization — whether it’s an academic group, a pharmaceutical company, or an instrumentation maker — whose interests lie in having direct involvement in the creation of the standard should be willing to roll up their sleeves and get to work, he said.
LSID and OmniGene, a web-services-based platform Gilman developed to integrate data at the Whitehead, were only two components of the I3C’s demonstration at BIO 2002. The overall architecture linked tools from LabBook, Incogen, Avaki, IBM, Millennium, Oracle, Sun, and TurboGenomics using the standard web services toolkit of XML, SOAP, HTTP, Enterprise JavaBeans, and a UDDI-based registry system. The resulting analysis pipeline was able to search for orthologs between the sequences of model organisms across multiple disparate databases.
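To give a rough sense of how such a pipeline hands work between vendors' tools, the sketch below builds the kind of SOAP request one tool might send to another after looking up its address in a registry. It is an illustration only and not the demonstration's actual interface: the operation name `findOrthologs`, the service namespace, the endpoint URL, and the dictionary standing in for a UDDI registry are all assumptions. Only the SOAP 1.1 envelope namespace is standard.

```python
import xml.etree.ElementTree as ET
from typing import Sequence

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "urn:example:ortholog-search"  # hypothetical service namespace

# Stand-in for a UDDI-style registry: service name -> endpoint URL (hypothetical).
REGISTRY = {
    "ortholog-search": "http://tools.example.org/soap/orthologs",
}


def lookup_endpoint(service_name: str) -> str:
    """Return the endpoint registered for a service, as a registry query would."""
    return REGISTRY[service_name]


def build_request(query_id: str, organisms: Sequence[str]) -> bytes:
    """Serialize a SOAP envelope asking for orthologs of one sequence."""
    ET.register_namespace("soapenv", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    # Hypothetical operation and parameter names.
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}findOrthologs")
    ET.SubElement(call, f"{{{SERVICE_NS}}}query").text = query_id
    for name in organisms:
        ET.SubElement(call, f"{{{SERVICE_NS}}}organism").text = name
    return ET.tostring(envelope, encoding="utf-8")


if __name__ == "__main__":
    endpoint = lookup_endpoint("ortholog-search")
    payload = build_request("urn:lsid:example.org:genes:42", ["mouse", "zebrafish"])
    # A real client would POST the payload to the endpoint over HTTP.
    print(endpoint)
    print(payload.decode("utf-8"))
```

The point of the design is that the caller needs to know only the registry and the message format, not which vendor's software answers at the endpoint.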
What’s next for the I3C? On the standards side, the technical architecture working group will continue to elicit comments from the community on the LSID draft until it arrives at a final version that will find broad acceptance. According to Gilman, “there’s enough dismay regarding multiple identifiers” in the bioinformatics community to drive adoption of a consensus specification that addresses the problem.
On the organizational side, hopes are high as well. “We’re in the process of translating a lot of enthusiasm by small and large companies into high-profile memberships,” said Clark, who added that a number of new groups are anticipated to join in the next one to two months.