A working group of bioinformatics experts organized to devise a standardized computer format for transferring and archiving microarray data has sent its initial draft to the Object Management Group, a non-profit consortium that sets independent software standards.
The draft, which the group submitted June 18, is a tagged-text XML format modified to encode information about microarrays.
"The end goal on the academic side is to facilitate the production of a GenBank equivalent for microarray data," said Paul Spellman, a group member and postdoctoral researcher at the University of California Berkeley. "But the content of microarray data is so much richer than sequence data that I don't have any illusions that it's going to be a simple format."
Additionally, said Spellman, private companies that want to sell data from microarray experiments can benefit from a standard format because it will minimize the barriers to converting the information from one database to another.
The group, which includes scientists from the University of California Berkeley, Affymetrix, Rosetta Inpharmatics, Agilent Technologies, the European Bioinformatics Institute, and other institutions, has met several times to rough out the format.
Group members started with three proposals: one from Rosetta for its Gene Expression Markup Language (GEML), another from EBI for its Microarray Markup Language (MAML), and a third from NetGenix. They have since sought to develop a unified format from them.
"We have a model and we are nearly done developing the rules," Spellman said. "I think we have a really good structure for submitting microarray data."
The group hopes to finish the standard format by the end of the month and is seeking OMG approval for it by early September.
MMJ