CHICAGO (GenomeWeb) – Computational biologists at the European Molecular Biology Laboratory and its European Bioinformatics Institute have developed a computational method for automating the analysis of multi-omics datasets.
In research described in an open-access article published this week in Molecular Systems Biology, they reported that the method, called Multi-Omics Factor Analysis, or MOFA, identified several "previously underappreciated drivers" of heterogeneity in chronic lymphocytic leukemia. MOFA has also shown promise in analyzing single-cell multi-omics data.
"MOFA infers a set of (hidden) factors that capture biological and technical sources of variability. It disentangles axes of heterogeneity that are shared across multiple modalities and those specific to individual data modalities," the paper said.
This method is easier to interpret than previous methods, the authors said.
MOFA looks for underlying sources of variation by parsing genomic, epigenomic, transcriptomic, proteomic, metabolomic, and phenomic data to "reconstruct" the factors that drive it.
"These could be continuous gradients, discrete clusters or combinations thereof. Such factors would help in establishing or explaining associations with external data such as phenotypes or clinical covariates," the researchers wrote.
"Importantly, MOFA disentangles to what extent each factor is unique to a single data modality or is manifested in multiple modalities, thereby revealing shared axes of variation between the different omics layers."
MOFA is available on GitHub in beta form as of today, though an earlier test version had been up for several months, the researchers said. Most of the users are computational biologists, though many work at university hospitals, which puts them close to clinicians. The majority are in the US and Europe.
"I'm really interested in seeing what other people get as results from their omics datasets," said co-lead author Britta Velten, a predoctoral fellow at the European Molecular Biology Laboratory in Heidelberg, Germany. "Hopefully [we will] see if it really can help to explore this data and really dig into multi-omics data."
The method was designed primarily for personalized medicine, largely because that has been a major focus of multi-omics studies to date, Velten said.
"With our method, we make it easier to explore the data and to visualize it, and then go with the results to the clinicians and discuss them with them," she explained.
In clinical practice, physicians look for a single gene marker, according to co-lead author Ricard Argelaguet, a predoctoral fellow at the European Bioinformatics Institute in Hinxton, England.
"As we started getting more and more data, we just realized that things are much more complex than that. You need to pull information from many data modalities to make a robust inference [about] treatment options, the disease type, and so on," Argelaguet said.
"We weren't understanding the underlying biology," Velten said. A multi-omics perspective facilitates that understanding.
"If you have a complex object that you want to understand, you can take different views from different angles onto that object, but only if you integrate those views onto the object can you really understand what it is," she said.
"You basically start by inferring what are the meaningful axes of a variation, what are the important factors that drive the differences between the different patients?" Argelaguet added. MOFA then looks at how much each data modality contributes to the whole picture and delivers a weighted score.
The researchers started with a specific form of leukemia mostly because EMBL had good data from a collaboration with Germany's National Center for Tumor Diseases in Heidelberg, Velten said. This offered opportunities to study the interplay between genomes, transcriptomes, variations, and responses.
"One aspect that makes this dataset especially interesting is that it's apart from more traditional omics types like genome and transcriptome. Also included are direct response of patients, moving more toward clinical responses of patients," Velten said.
MOFA can, however, analyze any type of multi-omics data, including that from animals, Argelaguet noted.
"We have applied this to the scenario of clinical practice and personalized medicine," Argelaguet said. "But it is just a general framework that people can also apply in different fields."
Argelaguet said that since the paper was submitted for review, he and his colleagues have downloaded data from The Cancer Genome Atlas and are applying MOFA to it.