Researchers at the Katholieke Universiteit Leuven in Belgium plan later this year to upgrade a software application designed to help users determine which genes in their studies should be further investigated.
When the researchers are done, the tool, to be called Endeavour 3.0, will enable users to look at promising candidate genes across organisms.
Separately, the Katholieke Universiteit researchers said they are increasingly keen to license the tool to pharmaceutical and biotechnology companies.
Endeavour is a software application designed to computationally prioritize candidate genes based on a set of training genes. The tool was first published in 2005, and the current version, Endeavour 2.4.4, became available last year. The new version, Endeavour 3.0, is expected to launch sometime this fall.
The software functions in three stages: training, scoring, and fusion. In the first stage of analysis, Endeavour scours numerous data sources (functional annotations, protein-protein interactions, regulatory information, expression data, sequence-based data, and literature-mining data) for information about the training genes, which are known to affect the biological process being studied, and builds a model of them for each data source.
In the second stage of analysis, the software uses the resulting models to score the candidate genes and rank them according to their scores.
The final stage fuses the rankings per data source into a global ranking using order statistics.
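The three stages can be illustrated with a minimal Python sketch. It is not Endeavour's actual code: the function names, the toy data, and the "model" (the mean value of the training genes in each source) are assumptions made for illustration. The article says only that fusion uses order statistics; the sketch assumes one common form, the joint cumulative probability of uniform order statistics applied to each gene's rank ratios, where a smaller value means a stronger candidate.

```python
from math import factorial

def train(source, training_genes):
    """Toy 'model' (assumption): mean value of the training genes in one data source."""
    values = [source[g] for g in training_genes if g in source]
    return sum(values) / len(values)

def score(source, model, candidates):
    """Rank candidates by closeness to the model; return rank ratios in (0, 1]."""
    present = [g for g in candidates if g in source]
    ordered = sorted(present, key=lambda g: abs(source[g] - model))
    return {g: (i + 1) / len(ordered) for i, g in enumerate(ordered)}

def q_statistic(rank_ratios):
    """Joint cumulative probability of uniform order statistics (smaller is better)."""
    r = sorted(rank_ratios)
    n = len(r)
    v = [1.0]  # v[0] = 1 starts the recursion
    for k in range(1, n + 1):
        x = r[n - k]  # r_{N-k+1} in 1-based notation
        v.append(sum((-1) ** (i - 1) * v[k - i] / factorial(i) * x ** i
                     for i in range(1, k + 1)))
    return factorial(n) * v[n]

def prioritize(sources, training_genes, candidates):
    """Train and score per data source, then fuse the rankings into one global ranking."""
    per_gene = {g: [] for g in candidates}
    for source in sources.values():
        model = train(source, training_genes)
        for gene, ratio in score(source, model, candidates).items():
            per_gene[gene].append(ratio)
    fused = {g: q_statistic(rs) for g, rs in per_gene.items() if rs}
    return sorted(fused, key=fused.get)

# Hypothetical data: one number per gene per data source.
sources = {
    "expression":  {"T1": 1.0, "T2": 1.2, "A": 1.1, "B": 3.0, "C": 0.2},
    "interaction": {"T1": 5.0, "T2": 5.4, "A": 5.1, "B": 5.5, "C": 9.0},
    "literature":  {"T1": 2.0, "T2": 4.0, "A": 3.5, "B": 7.0, "C": 3.1},
}
ranking = prioritize(sources, ["T1", "T2"], ["A", "B", "C"])
print(ranking)
```

In this toy setup every candidate appears in every source. Comparing genes that are covered by different numbers of sources would need an extra normalization step that the sketch leaves out.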
So far, Endeavour has been made available to users studying human, mouse, rat, Drosophila, and C. elegans. Version 3.0 will also include zebrafish data.
According to Yves Moreau, a professor of electrical engineering and a bioinformatician at Leuven, by the end of 2009, Endeavour should allow researchers to prioritize their candidate genes not only within one organism, but across organisms, a capability that he and his team of researchers believe will make the tool more powerful.
"For the moment, the data in Endeavour is sliced up by organism," Moreau told BioArray News this week. "If you want to study human proteins, you look at human data. If you want to study mouse proteins, you study mouse data.
"By the end of this year, it will be possible to carry data … across organisms and synergize data from multiple organisms," he added. "We expect that predictive power of Endeavour will increase when we enable this function."
Endeavour was created to help researchers involved in genomic studies, in particular gene expression studies, who are swamped by lists of candidate genes that are time-consuming to prioritize.
Drawing upon a "data warehouse" of genomic data culled from several publicly available data repositories, the software compares candidate gene lists with the training genes and ranks the candidates accordingly.
"We are taking a set of candidate genes and looking at what information is available for those genes," Moreau said. "We compare these candidate genes to, for instance, a set of known genes for [a] certain disorder to rank the candidates."
Data sources include ontologies and annotations, protein-protein interactions, cis-regulatory information, gene-expression data sets, sequence information, and text-mining data.
Moreau said that his team at Leuven updates Endeavour three times a year to refresh all the background databases, creating a local data warehouse that users can access.
While the Leuven team’s expertise has been in prioritizing genes from array-based gene expression studies, Moreau said that Endeavour is useful for other applications. For example, team member Leon-Charles Tranchevent discussed the use of Endeavour with array comparative genomic hybridization data at the Wellcome Trust Sanger Institute-hosted Genomic Disorders conference, held in Cambridge, UK, last month.
"In general, for whatever procedure that led you to the list of genes to sort out, this method would be applicable," Moreau said.
He cited a paper authored in 2007 by researchers at the Max Planck Institute for Biochemistry that used mass spectrometry to identify adipocyte proteins. Adipocytes are considered to be central players in energy metabolism and are related to obesity.
In the paper, the authors compared their mass-spec data to existing array data using Endeavour, and were able to associate a number of factors with vesicle transport in response to insulin stimulation, a function of adipocytes.
[Adachi J, et al. In-depth analysis of the adipocyte proteome by mass spectrometry and bioinformatics. Molecular & Cellular Proteomics. 2007 Jul;6(7):1257-73.]
"From the [Max Planck group’s] proteomics screen, they were able to extract a subset of genes they were interested in in the biological process they were studying," Moreau said. "This is a generic problem that many people in molecular biology are facing: you have a long list and out of that you want to extract a shorter list based on getting a subtype of genes. Endeavour is broadly applicable."
Several other papers have been published demonstrating the use of the tool. Most recently, a PLoS Genetics paper published in January discussed the creation of an Endeavour web application for investigating Drosophila.
And last year, Tranchevent was the lead author on a Nucleic Acids Research paper that described Endeavour’s various functionalities. In the paper, Tranchevent and colleagues tested Endeavour on 32 recent disease gene associations from the literature and described a number of recent independent studies that made use of Endeavour to prioritize candidate genes for obesity and Type II diabetes, cleft lip and cleft palate, and pulmonary fibrosis.
According to Moreau, he and his Katholieke Universiteit colleagues do not plan to launch Endeavour as a commercial product, but instead are increasingly looking to license it to pharma and biotech companies.
"Bioinformatics shops have a hard time making money as it is, so, at this moment in 2009, we don’t think there is a large market to support commercialization," Moreau said. "We are more interested in making this more useful in larger scale set-ups, where we would be able to support a pharma or biotech company that would be interested in these tools."
Moreau said that Leuven already has one such large pharma client using Endeavour and is interested in other, similar licensing deals. He declined to name the pharma licensee.