Researchers at Lawrence Berkeley National Laboratory have contributed another choice to a growing number of pathway analysis software tools with the launch of GenoPharm, a search engine that maps functional relationship networks among genes, drugs, and diseases.
Users first enter the gene symbols, IDs, or keywords for their query, and then choose a context such as “molecular function,” “therapeutics,” or “OMIM data.” The software displays the search results as a relationship network, with the distance between any two entities representing the strength of their relationship. Evidence connecting the genes, such as a PubMed or an OMIM reference, is displayed as a link.
But according to developer Kasian Franks, GenoPharm’s capabilities are just one example of the core technology that underlies the software — a search engine development system called the Geneva Development System.
“GenoPharm was really an example of what we could do with the Geneva Development System,” Franks told BioInform. It’s the first of several search engines specific to life science research that the LBNL team plans to create with the platform. So far, they have also used Geneva to create an enhanced version of the NCBI Entrez Gene search engine that ranks results based on their “conceptual relevancy,” Franks said.
“We took the [Entrez Gene] database of genes and rebuilt it using the Geneva Development System, so that you get results that don’t just contain your keywords, but contain the context that you’re looking for as well,” he said.
The underlying concept behind Geneva is a search technology that relies on context, rather than keywords or grammatical rules, to increase the relevancy of its results, Franks said.
“Our focus is not on speed, but relevancy and accuracy,” he said. “Google’s focus is on speed, but I wouldn’t mind myself at times sacrificing a few milliseconds to get much better results.”
Franks said that before developing the software, his team experimented with several pathway, text-mining, and information-extraction tools, but found them inadequate for reconstructing biological networks. “They were making associations based on co-occurrence for a few genes that occurred in the same abstract,” he said. “That was not sufficient. We wanted to simulate the way a human might search biological data, as opposed to coming with some rule-oriented system for networking genes.”
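For readers unfamiliar with the co-occurrence approach Franks describes, the following is a minimal Python sketch of the general idea, not the code of any tool mentioned here. The function name, the placeholder gene symbols, and the toy abstracts are all hypothetical.

```python
import re
from collections import Counter
from itertools import combinations


def cooccurrence_counts(abstracts, gene_symbols):
    """Count how many abstracts mention each pair of gene symbols together."""
    wanted = {symbol.upper() for symbol in gene_symbols}
    counts = Counter()
    for text in abstracts:
        # Split on non-alphanumeric characters so punctuation does not hide a symbol.
        tokens = set(re.findall(r"[A-Za-z0-9]+", text.upper()))
        present = sorted(wanted & tokens)
        for pair in combinations(present, 2):
            counts[pair] += 1
    return counts


# Placeholder symbols and made-up text, used only to exercise the function.
abstracts = [
    "GENEA interacts with GENEB in toy text.",
    "GENEA and GENEB appear again; GENEC too.",
    "Only GENEC here.",
]
counts = cooccurrence_counts(abstracts, ["GENEA", "GENEB", "GENEC"])
```

A pair is linked here only if both symbols appear in the same abstract, which illustrates the limitation Franks points to: genes that never share an abstract get no association at all.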
Geneva and its derivatives, like GenoPharm, mimic the way a biologist would navigate the literature, Franks said. “You might be searching for a gene that’s similar to the breast cancer gene BRCA2. You’re going to think about the context that surrounds BRCA2 — you’re not necessarily going to look for the keyword BRCA2, but you’re going to take the context that you’ve researched in your past work and use that context to search as opposed to a few keywords.”
Using this contextual approach, Franks said, GenoPharm is able to identify indirect, or inferred, relationships between two or more genes that might be linked because they share a relationship with a third gene — a feature that distinguishes the system from other pathway analysis tools that are based on curated data.
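The notion of an inferred relationship through a shared third gene can be illustrated with a small Python sketch. This is an assumption-laden illustration of the general idea only: the article does not describe how GenoPharm computes its inferences, and the function name and placeholder gene names below are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def infer_indirect_links(direct_links):
    """Map each unlinked gene pair to the set of genes both are directly linked to."""
    neighbors = defaultdict(set)
    for a, b in direct_links:
        neighbors[a].add(b)
        neighbors[b].add(a)

    inferred = defaultdict(set)
    for shared, linked in neighbors.items():
        for a, b in combinations(sorted(linked), 2):
            # Keep only pairs with no direct link; those are the inferred ones.
            if b not in neighbors.get(a, set()):
                inferred[(a, b)].add(shared)
    return dict(inferred)


# Placeholder gene names, not real biological relationships.
direct = [("GENE_A", "GENE_C"), ("GENE_B", "GENE_C"), ("GENE_A", "GENE_D")]
result = infer_indirect_links(direct)
```

In this toy input, GENE_A and GENE_B have no direct link but both connect to GENE_C, so the pair is reported as an inferred relationship with GENE_C as the supporting intermediate.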
Companies like Ingenuity and Ariadne Genomics have built pathway analysis packages “based on known data, known information. But we believe that we’re going to benefit, and others will benefit, by providing unknown connections and functional relationships between genes and drugs,” Franks said. “It’s hard to make a discovery with something that everyone already knows about.”
Ultimately, Franks said, the LBNL team wants to extend this capability to infer relationships between drugs and pathways or genes, but the data that would enable this remains “limited.” For now, GenoPharm’s drug associations are based on direct mappings from Stanford’s PharmGKB database.
A demonstration version of GenoPharm is available at http://geneva.lbl.gov/GenoPharm/genopharm.html.
Franks emphasized, however, that the LBNL team is focusing on the underlying search technology, rather than the interactive capabilities of the software. “Our focus is really not so much on providing the visualization interface, but in providing the back-end engine that enables the development of networks and connections among genes and other objects,” he said.
He said that GenoPharm’s inferred connections could be displayed along with known relationships “in various visualization tools,” such as GenMAPP or commercial pathway visualization packages. “We would like to provide them with networks and genes that are based on inference that a biologist can explore, confirm, validate, annotate, destroy, or build upon,” he said.
LBNL is looking to collaborate with a commercial partner to further develop the technology, and Franks said that the lab has already seen “some interest” from potential licensees.