The metabolomics research community is hoping to build upon a recent string of successes in the development of bioinformatics data standards as it embarks upon its own standards-development process.
On July 18-19, the European Bioinformatics Institute is co-hosting a "MetaboMeeting" in Cambridge, UK, in an effort to coordinate several emerging standards efforts in the field. The gathering follows on the heels of a meeting of the Metabolomics Society in Tsuruoka City, Japan, June 20-23, where standards were a hot topic of discussion, according to several attendees, and precedes a metabolomics standards workshop that the National Institutes of Health is hosting in Bethesda, Md., Aug. 1-2.
These three meetings come just as the Standard Metabolic Reporting Structures working group has published an initial set of recommendations for reporting metabolomics experiments in the July issue of Nature Biotechnology [2005 Jul;23(7):833-8].
Despite the divergent appearance of these multiple efforts, participants in each of the initiatives told BioInform that they hope to coordinate their activities. "We realized early on that standards in metabolomic databases will be crucial," said Rima Kaddurah Daouk, president of the Metabolomics Society. "As a society, we are keen to lend any help we can and support all of the different efforts, including the EBI," she said.
"The format is really quite generic. It basically says that a mass spectrometer has a start, a middle, and an end, somebody ran it on a particular date, and it generated this big list of numbers. And you can annotate those numbers. So we saw no reason why that wouldn't be useful for metabolomics."
Daouk, a professor at Duke University and a co-founder of Metabolon, said that several members of the society's board of directors are "actively involved" with SMRS and other efforts. Likewise, Chris Taylor, a senior software engineer at the EBI who is coordinating the MetaboMeeting, said that several attendees from that meeting plan to attend the NIH meeting next month to compare notes.
All Together Now
The collaborative spirit emerging from these efforts springs from the recent success of community-driven standards initiatives such as the MAGE (microarray gene expression) consortium and the Human Proteome Organization's Proteomics Standards Initiative.
The development of effective standards was long considered to be a nearly impossible task in the bioinformatics community — with the failures of industry-led efforts like the OMG's life science working group and the Interoperable Informatics Infrastructure Consortium only confirming the futility of such efforts for many in the field.
But MAGE, which developed the MIAME (minimum information about a microarray experiment) reporting standard and the MAGE-ML and MAGE-OM formats for gene expression analysis, proved the nay-sayers wrong when it published the MIAME guidelines in 2002. Several journals now require researchers to submit MIAME-compliant gene expression data sets when they publish their results. In addition, most gene expression analysis software is now compliant with the MAGE formats.
HUPO's PSI initiative followed a similar formula, first developing a set of reporting requirements called MIAPE (minimum information about a proteomics experiment), and following that with an XML-based data-exchange format for mass spectra called mzData. A number of mass spec vendors have already adopted this format, according to EBI's Taylor, including Bruker, Thermo, and Applied Biosystems, while others, including Waters, "are in the middle of implementing it."
The MAGE and PSI groups have already begun coordinating their efforts to identify areas of overlap between their respective formats. That overlap may serve as the groundwork for a "generic" object model for functional genomics called the FUGE (functional genomics experiment) model [BioInform 06-06-05]. Initial work on this project made it clear that it would be necessary to bring the metabolomics community into the fold, Taylor said.
"Having built this collaboration between the transcriptomics and proteomics people, we … have been continually telling people that in the functional genomics world, metabolomics would be the other leg of this tripartite collaboration over the long term," Taylor said. "And we realized that in metabolomics it was only just starting to become an issue that people might want to share data, so we thought we'd see what we could do to help."
Daouk agreed that the metabolomics community is likely to benefit from the success of previous efforts like MAGE and PSI. "We realized that this is a very important area to really focus on early on, and to tackle and deal with early on, and basically to learn from the other omics technologies and what they have done," she said.
Taylor said that the proposed FUGE model will require input from researchers from multiple omics disciplines. The goal of FUGE is to develop common ways of capturing data, using a common ontology and common reporting requirements across several disciplines, he said, "So we really want these people to be contributing from the outset because if there are different approaches that we need to be aware of, then it would be nice to be aware of them."
It's crucial for the metabolomics community to participate in the process, Taylor said, "because they will have a different view of the biology, a different view of the way samples are generated, the way they're treated" than researchers working in proteomics or gene expression.
The work of the PSI consortium is also expected to give the metabolomics standards effort a head start. Since mass spec is used for both metabolomics and proteomics, the mzData format may work just as well in one as it does in the other with minimal modification, Taylor said.
"The format is really quite generic," he said. "It basically says that a mass spectrometer has a start, a middle, and an end, somebody ran it on a particular date, and it generated this big list of numbers. And you can annotate those numbers. So we saw no reason why that wouldn't be useful for metabolomics."
John Chakel, proteomics and metabolomics software product manager for Agilent's Integrated Biology Solutions business, said there ought to be "a lot of leverage that can be brought to this from efforts ongoing with data standards, especially related to mass spectral data in terms of what's being done for the HUPO community and the PSI group."
Chakel said that Agilent is "participating heavily" in the metabolomics community, and that he attended the Metabolomics Society meeting in Japan last month. "Once these standards are agreed to by the community, typically our position is that we'll make every effort to adapt our systems to support those standards," he said.
So far, most of the work in metabolomics standards development has been in the area of reporting requirements — the metabolomics equivalent of MIAME and MIAPE — with little work on specific data formats to date. In addition to the SMRS recommendations, there are two other proposals for researchers submitting metabolomics data for publication: MIAMET (minimum information about a metabolomics experiment) and ArMet (architecture for metabolomics).
Tom Plasterer, principal scientist for bioinformatics at BG Medicine, said that the SMRS guidelines are a "key first effort" for the field.
More Information on Emerging Metabolomics Standards Initiatives:
ArMet (Architecture for Metabolomics): a data model for plant metabolomics developed at the University of Wales, Aberystwyth, UK (http://www.armet.org/)
SMRS (Standard Metabolic Reporting Structures): a metabolic reporting standard built on the UML model with an XML implementation (http://www.smrsgroup.org/ or http://sourceforge.net/projects/smrsgroup/)
MIAMET: a checklist of information necessary to provide context for metabolomics data that is to be published. A paper describing MIAMET (Bino, R.J. et al. (2004) "Potential of metabolomics as a functional genomics tool." Trends Plant Sci. 9, 418-425) is available at http://www.public.iastate.edu/~mash/publications/TIPS04.pdf
Metabolomics Society: a nonprofit group that aims to "promote the growth and development of the field of metabolomics internationally" (http://www.metabolomicssociety.org/)
NIH Metabolomics Standards Workshop: http://www.niddk.nih.gov/fund/other/metabolomics2005/
Plasterer said that one of his group's primary tasks is mapping metabolite data to biological pathways, which requires consistency among metabolite IDs across multiple bioinformatics resources. "Everybody does their metabolomics somewhat differently, and this is one area that the SMRS group could really help. If you encourage scientists to standardize on how their data is going to be structured, all the way from samples up to statistics, then you may be able to encourage further the development of a standard ID system that exists on top of that data. With a consistent ID system your mapping of metabolites to pathways will be much more straightforward," he said.
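The mapping problem Plasterer describes can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not BG Medicine's method: the shared IDs ("MET:0001" and so on) are placeholders rather than real accessions, and the synonym and pathway tables are hypothetical stand-ins for what a standard ID system might provide.

```python
from typing import Dict, List, Optional

# Hypothetical synonym table: names as they appear in different
# resources, all resolved to one shared placeholder ID.
SHARED_IDS: Dict[str, str] = {
    "glucose": "MET:0001",
    "d-glucose": "MET:0001",
    "dextrose": "MET:0001",
    "pyruvate": "MET:0002",
    "pyruvic acid": "MET:0002",
}

# Hypothetical pathway membership, keyed by the shared ID.
PATHWAYS: Dict[str, List[str]] = {
    "MET:0001": ["glycolysis"],
    "MET:0002": ["glycolysis", "gluconeogenesis"],
}


def shared_id(name: str) -> Optional[str]:
    """Resolve a resource-specific name to the shared ID, if known."""
    return SHARED_IDS.get(name.strip().lower())


def pathways_for(name: str) -> List[str]:
    """Look up pathways by way of the shared ID rather than the raw name."""
    metabolite = shared_id(name)
    if metabolite is None:
        return []
    return PATHWAYS.get(metabolite, [])
```

With a table like this, pathways_for("Dextrose") and pathways_for("D-glucose") resolve to the same entry. Without an agreed ID, each resource's spelling would need its own pathway lookup.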
From Taylor's perspective, the technical specifics of the standards under development aren't as important as ensuring that everyone's voice is heard in the process. "Most formats that you could come up with are essentially equivalent — there might be a little bit more effort to implement them, or they might last a little bit longer or require less change over the long term, but really you can cope with these little differences and difficulties in the implementation," he said. "What you can't cope with is if somebody felt excluded from the process, and therefore is minded not to adopt."
Therefore, the primary goal for this week's MetaboMeeting will be a social one. "The aim of this meeting is first of all networking," Taylor said. "It is important that the right people are all talking to each other — or that they're at least aware of each other's existence and the kinds of things that are going on so that they can follow up on any interest they thought they might have had, or not."
- Bernadette Toner