Metabolomics may be seen as the runty younger sibling in the ’omics family, but it is starting to come into its own. As its older sisters — genomics, proteomics, and transcriptomics — experienced in years past, this field is undergoing an explosion of interest with new tools and technologies. And with those tools, of course, comes a flood of new data. Learning from the problems that cropped up in earlier large-scale biology disciplines, metabolomics scientists are trying to develop ways to generate, standardize, interpret, and apply metabolomic data.
“It is a daunting task, first to see all of [the metabolites] and then to identify all of them — to make biochemistry out of it,” says Oliver Fiehn, an associate professor who specializes in metabolomics at the University of California, Davis, Genome Center.
Indeed, says Bruce German, a fellow UC Davis scientist who’s in the food science and technology program, “We’re stumbling with some key problems.”
Drawing on hundreds of years of physiology, biochemistry, and clinical chemistry research, metabolomics is poised to be a big player in nearly every corner of biology, from the clinic to nutrition to psychiatry. But before the budding discipline can be considered field-ready, researchers will first have to iron out those details of methods, data reporting, and database building.
Churning Out Data
There is no single way to go about metabolomics research. Most scientists use NMR or mass spectrometry coupled to liquid or gas chromatography (LC-MS or GC-MS), but then they have to decide whether to track a signature associated with a particular metabolite or to dive in and measure the metabolites themselves, either at a single moment or over a period of time. Whichever way they choose, it's going to generate a lot of data.
Right now, most people tend to use mass spec-based approaches, says Fiehn. “Mass spectrometry is more sensitive and you can detect more things. For small studies, it is also very robust. The problem is it’s a little bit more temperamental in terms of absolute sensitivity over long times on an unbiased view,” he says.
But there are researchers who prefer NMR-based approaches. Clare Daykin, a lecturer at the University of Nottingham in England, predominantly works with NMR for her metabolomics work on nutrition and health. “Personally, I think it is a very strong technique. It has a lot of advantages,” says Daykin, citing among those reproducibility, selectivity, and the dynamic information offered by NMR.
Then there are the targeted platforms, specially altered to fit a particular need within metabolomics. A lipidomics platform is still GC-MS-based, but it is tailored to look primarily at lipid biochemistry information. “Instead of focusing on 1,000 or 2,000 metabolites, you start to focus only on the lipids and get a deeper understanding of perturbations of lipid biochemistry,” says Rima Kaddurah-Daouk, an adjunct associate professor at Duke University Medical School.
With an approach in mind, scientists then have to work out the study design: how will the technologies be put to work to actually generate data for analysis?
Some researchers look to associate a certain signal with a particular metabolite — in this way, metabolomics becomes a method of discovering biomarkers that can be used to better diagnose or treat diseases. “There are some people who do metabolomics … and they generate signals associated with metabolites — and those could be unassigned NMR resonances or unassigned mass fragments — and they interpret those data as patterns,” says German at UC Davis.
Other metabolomics researchers, including German, analyze the metabolites themselves. One way to do this is what Davis’ Fiehn calls the snapshot method, where the researcher takes a look at what the metabolism is doing at a particular moment. The drawback to this one-look approach, says Fiehn, is the possibility of missing how a biological system responds to perturbations.
For a more detailed picture, some scientists use a flux method where they follow how the metabolite traffic flows over time. “Not only [do] we need to identify and quantify all of these compounds, but also we want to look at the fluxes,” Fiehn says. “We want to look at the time scale and the pathways that are involved.”
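In its simplest form, a flux estimate is just a rate of concentration change over time. The toy calculation below illustrates the idea; the metabolite time course and values are invented for illustration, not drawn from any real study.

```python
# Toy flux estimate: average concentration change per unit time
# for one metabolite. Time points (minutes) and concentrations (mM)
# are invented example values.
times = [0, 10, 20, 30]
concs = [2.0, 2.6, 3.1, 3.5]

def average_flux(t, c):
    """Average rate of concentration change (mM per minute) over the series."""
    return (c[-1] - c[0]) / (t[-1] - t[0])

# A change of (3.5 - 2.0) mM over 30 minutes gives 0.05 mM/min.
print(average_flux(times, concs))
```

Real flux analysis fits pathway models to isotope-labeling data rather than taking a simple slope, but the quantity being estimated is the same kind of rate.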
No matter which way scientists structure their experiments, one thing is for certain: at the end of them, there will be a mountain of data to sift through. But before researchers can file that information neatly away in mega-databases, they first have to make sure to follow consistent ontologies so that anyone looking at the data across various studies can make sense of it.
The Road to Standards
Compared to something as stable as a strand of DNA, metabolism is a freakishly unpredictable and constantly changing process, reliant not only on biology but on what the organism consumes as well as other environmental factors. So in metabolomics, all of those circumstantial components — diet, environment, and so on — are just as critical to understanding the system as the metabolite data being generated. Metabolomics researchers, it turns out, have to be sticklers for details. To allow for integration of data from different studies, scientists must be sure that the way they refer to their metadata will be, if not identical to how someone else would refer to it, at the very least easy to interpret by any other scientist. “It’s important that you all talk the same language, that you use the same descriptors,” says Kaddurah-Daouk, president of the Metabolomics Society.
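To make the idea of shared descriptors concrete, a minimal sample-metadata record might look something like the sketch below. The field names and values here are invented for illustration; they are not taken from any official Metabolomics Standards Initiative schema.

```python
# Hypothetical metadata record for one metabolomics sample.
# Field names are invented for illustration only -- not an actual
# Metabolomics Standards Initiative schema.
sample_metadata = {
    "sample_id": "S001",
    "organism": "Homo sapiens",
    "tissue": "plasma",
    "diet": "overnight fast",           # dietary context
    "collection_time": "2007-06-01T08:30",
    "platform": "GC-MS",                # analytical technique used
}

def describe(record):
    """Render a record as 'key=value' pairs so any lab reads it the same way."""
    return "; ".join(f"{k}={v}" for k, v in record.items())

print(describe(sample_metadata))
```

The point is less the particular fields than the agreement on them: if every lab records diet, tissue, and platform under the same names, records from different studies can be compared mechanically.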
Last month, the journal Metabolomics published six reports from the Metabolomics Standards Initiative outlining the minimum reporting standards for metabolomics data in papers. These reports, shepherded by the Metabolomics Society, try to bring order and quality control to the field.
The standards initiative began in 2005, growing out of the earlier Standard Metabolic Reporting Structure effort at Imperial College in London, the Architecture for Metabolomics, and a workshop hosted by NIH and the Metabolomics Society.
Coming up with minimum standards has proven more difficult than might be expected, says Fiehn, who spearheaded the initiative. “The minimum is actually more difficult than the maximum. People would say if you don’t report body mass index, how can you say anything about anybody? People could counter-argue even the body mass doesn’t tell you a lot. The minimum, there is a lot of discussion in there. Whereas with maximum — give everything you have,” he says.
The series of standards papers provides guidelines for reporting data analysis, for describing environmental context, and for special reporting criteria covering plant biology as well as microbial and in vitro studies.
“Some of these minimum reporting standards are pretty elaborate,” Fiehn says. “The minimum reporting just ensures so far that, for example, people don’t publish metabolomics without any single metabolite named. You can’t just say I threw in a ton of stuff in mass spec and I got a lot of ions and then call this metabolomics.”
These guidelines are by no means a final pass; they’re just the beginning for standards in metabolomics. “The minimum standard required is to just get the community to start to think about the minimum tools and to use them in reporting in scientific journals,” says Kaddurah-Daouk.
The standards initiative is in fact looking for more people to improve the current guidelines. “This is something that I think will be ongoing for a very long time yet,” says Nottingham’s Daykin.
The Meta Database
Of course, the rationale for cleaning up data and making it consistent across labs is so that scientists can deposit results of a number of studies into large databases for anyone to use. “With a database, you can ask other kinds of questions all of a sudden because you have both the species and the experimental design captured,” says Fiehn. “You build up knowledge.”
One of the largest database efforts in metabolomics is from the Human Metabolome Project through Genome Alberta. Researchers at the University of Alberta and the University of Calgary announced that they had completed a draft of the Human Metabolome last January, and followed up by building a database for it. “This has been a broad survey of the metabolomics landscape,” says Ian Forsythe, project manager of the Human Metabolome Project at Genome Alberta and the University of Alberta.
Using an array of different metabolomics approaches — NMR, GC-MS, LC-MS, and LC-MS/MS — David Wishart and his colleagues at Alberta identified and quantified human metabolites from various samples to create their database. They also mined papers and books for more data. Once they identified a metabolite, they converted their data into tables of names and concentrations or relative concentrations.
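That last conversion step — turning identified metabolites into a table of names and concentrations — might be sketched as below. The compound names and values are made-up examples, not actual Human Metabolome Database entries or code.

```python
# Toy conversion of identified metabolites into a name/concentration table.
# Compound names and values are invented examples, not HMDB data.
identified = [
    ("glucose", 5.1, "mM"),
    ("lactate", 1.4, "mM"),
    ("creatinine", 0.09, "mM"),
]

def to_table(rows):
    """Format (name, concentration, unit) tuples as a fixed-width text table."""
    lines = [f"{'Metabolite':<12} {'Conc.':>7} Unit"]
    for name, conc, unit in rows:
        lines.append(f"{name:<12} {conc:>7.2f} {unit}")
    return "\n".join(lines)

print(to_table(identified))
```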
“It brings together information from many different sources like biological, chemical, mass spectrometry data, nuclear magnetic resonance data … and brings together data from all these disparate sources,” says Forsythe.
Meantime, other scientists are working to establish metabolome databases of their own. Fiehn is creating several databases, including one for metabolic compounds and their associated metadata; Kaddurah-Daouk at Duke is building a metadatabase for psychiatric disorders. Still, there is no GenBank-like repository for all metabolomics data. “There is no metabolomics database that is supported by NCBI or by whoever, which is a shame. I find this not good,” Fiehn says.
The Big Idea
Such a metabolomics database could open up new avenues for researchers. The field as a whole is expected to enable breakthroughs in diverse areas including organ transplants, multiple sclerosis, and schizophrenia. Much of its impact, though, may be felt when it teams up with other data to lead to a genuine systems biology approach. Researchers may then come close to seeing the connection between the genome and the phenotype. But before any of that occurs, some technical innovations are still needed.
For the next phase of the Human Metabolome project, Wishart and his collaborators will be taking on applied metabolomics — that is, figuring out how this information can be put to use in the clinic. According to project manager Forsythe, they have gotten inquiries from clinicians wanting to use the Human Metabolome Database to study many factors, including multiple sclerosis and organ transplantation rejection.
Partnering with other large-scale biology projects may be another way that metabolomics finds its way to the clinic or other more downstream applications. “I see great synergies between metabolomics and imaging and other ’omics, whereby collectively together, we can start to see a broader, a more comprehensive view of a disease from a systems biology approach,” says Duke’s Kaddurah-Daouk.
Of all the systems biology disciplines, metabolomics is the closest to the phenotype — what is actually seen in an organism, healthy or not. It is here, at the metabolome, says Alberta’s Wishart, that the interaction of the genome and phenotype might be seen. “If you say the gene expression would be something that could be done, the protein expression would be something that will be done or is about to be done, the metabolites are the mirror of what has been done,” adds Fiehn.
Before it can fulfill that promise, though, metabolomics will have to spend some time strengthening its foundation. Nottingham’s Daykin predicts the next few years will bring improvements to limits of detection and dynamic ranges of the different analytical platforms. Fiehn also thinks that new and better methods, databases, catalogues, and tools will be developed soon.
“If we could, in fact, say, ‘Here are the molecules. Here are the pathways. Let’s measure them accurately,’ we could begin to build qualitative databases for measuring those. Once that’s started as a process … one after another health states will be revealed, like cholesterol and heart disease, and then that’s just going to steamroll,” German says.