MGED Gains Momentum as Software Providers, Public Databases Accelerate Efforts


This year’s meeting of the Microarray Gene Expression Database Group was marked by the presence of a number of bioinformatics software vendors pledging to make their gene expression analysis products compliant with MGED standards.

The increasing participation of software developers is a sign that the efforts of the MGED group are finally beginning to pay off, said Alan Robinson, a team leader at the European Bioinformatics Institute and member of the MGED steering committee. In addition to the academic groups and pharmaceutical companies who presented posters at previous meetings, this year’s crop of over 90 posters included contributions from Lion Bioscience, Biobase, Visual Bioinformatics, GeneticXchange, Iobion Bioinformatics, Gene Logic, and InforMax. Lion, Iobion, and Silicon Genetics sponsored the meeting along with Affymetrix and Corning.

MGED III, held at the Stanford University School of Medicine in Palo Alto, Calif., March 28-31, brought together over 300 participants in a grassroots effort to develop a set of standards for microarray experiment annotation and data representation. The underlying goal of MGED is to establish public gene expression data repositories that would play a role comparable to that of GenBank for sequence data.

But the complexities of microarray experiments pose a number of challenges that make setting up public databases for gene expression data far more difficult than it is for sequence information. Because gene expression data has meaning only in the context of the experiment from which it was derived, any database of gene expression information must include as much information as possible about the array design, the samples, and other parameters involved in the corresponding experiment. This requirement is further complicated by the lack of standard measurement units for gene expression levels and of standard measures for the relative reliability of the resulting data.

In response, the MGED data annotations working group has developed a draft specification for the minimum information about a microarray experiment (MIAME) that would ensure its validity. MIAME calls for detailed information on six aspects of each microarray experiment: the experimental design, the array design, the samples, the hybridizations, the measurements, and the controls. MGED hopes that scientific journals and funding agencies will eventually require that data be MIAME-compliant and submitted to a public repository.
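
As a rough illustration of how a submission tool might check the six MIAME aspects listed above, consider the following Python sketch. It is not drawn from the MIAME draft itself: the class name, the field names, and the use of free-text strings for each section are all assumptions made for the example, and a real specification would call for far more structured detail in each section.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class MiameRecord:
    """Hypothetical container with one free-text slot per MIAME aspect."""
    experimental_design: Optional[str] = None
    array_design: Optional[str] = None
    samples: Optional[str] = None
    hybridizations: Optional[str] = None
    measurements: Optional[str] = None
    controls: Optional[str] = None


def missing_sections(record: MiameRecord) -> List[str]:
    """Return the names of sections that are absent or blank."""
    return [
        f.name
        for f in fields(record)
        if not (getattr(record, f.name) or "").strip()
    ]


# Example: a partially annotated experiment.
record = MiameRecord(experimental_design="time course", samples="yeast culture")
print(missing_sections(record))
```

A journal or repository could, in principle, run a check of this kind at submission time and reject records for which the returned list is not empty.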

Alvis Brazma, a team leader at EBI who heads up the data annotations group at MGED, said that MIAME 1.0 should be complete in around two months, at which time MGED intends to submit the document for publication, possibly in Genome Research. Brazma told BioInform that publication of the MIAME specification would provide much-needed visibility for the effort.

MIAME serves as the foundation for MGED’s working groups on data format and ontologies, which kicked off MGED III with pre-meeting workshops on March 28. Presentations by Paul Spellman of Lawrence Berkeley National Lab and Chris Stoeckert of the University of Pennsylvania addressed the activities of these groups at MGED III.

Stoeckert reported that the ontologies working group has posted a preliminary series of tables linking MIAME requirements with terms from the Microarray Markup Language (MAML) at Ontology/MIAME_MAML.htm.

Speakers also addressed the current status of the leading public repositories for gene expression data. Ugis Sarkans of the EBI reported that ArrayExpress has just been populated with its first data set, while Alex Lash of the National Center for Biotechnology Information discussed the NCBI’s plans to include an Entrez interface for the Gene Expression Omnibus. Newcomers to this area included the National Center for Genome Resources’ GeneX and a new expression database that is under development for the DNA Data Bank of Japan by Japan’s Center for Information Biology.

Future directions for MGED include focusing on data normalization, statistical significance, and other challenges in microarray experiment analysis. Robinson noted that the establishment of MIAME and MAML would be the foundation upon which a broader standardization effort could develop.

Robinson, Brazma, and the other MGED steering committee members stressed throughout the meeting that the group’s progress depends solely on the input of its members, and welcomed increased involvement from current and future participants.

