NEW YORK (GenomeWeb) – Danish bioinformatics company Intomics is forging relationships with pharmaceutical clients while working to expand its product offerings.
The Copenhagen-based firm's protein-protein interaction network, known as InWeb_InBioMap, was described in a Nature Methods paper late last year. Intomics is working to commercialize that network, even as it looks ahead to future iterations of InWeb_InBioMap with analytical tools layered onto it.
"Historically, we've been doing a lot of project-as-a-service [work] for pharma companies, and we have experienced a lot of clients asking for the InWeb_InBioMap product access," Intomics co-founder and CEO Thomas Jensen said. "So we're starting to commercialize this and are signing up the first clients as we speak."
The firm, a spinout of the Technical University of Denmark, intends to spend the next six to 12 months expanding the analysis algorithms used to apply information in the network to a range of problems, he said. He noted that Intomics is engaging in analytical co-development efforts with a handful of pharmaceutical companies to tailor version 2.0 of its product: an integrated network and extended suite of analysis algorithms.
"We're collaborating with several pharmaceutical companies on figuring out the exact algorithms they will need along with the network to really help them in their discovery workflows across the omics data," Jensen said. "We would very much like to extend the analysis capabilities around the network, so that we ensure that our clients get the best and full use of the network for their R&D activities."
Intomics' algorithm expansion may well include a computational tool called NetSig, described in a Nature Methods paper this week led by Intomics co-founder and scientific advisory board member Kasper Lage, bioinformatics director at Massachusetts General Hospital.
The NetSig algorithm is designed to uncover cancer driver genes by tying together tumor sequence data with protein-protein interactions in InWeb_InBioMap, as Lage and his colleagues at MGH and the Broad Institute demonstrated in an analysis using InWeb_InBioMap combined with exome sequence data for more than 4,700 tumors from 21 cancer types.
In addition, Intomics is working toward opening an office in the Boston/Cambridge area by early next year. Jensen did not disclose the identities of Intomics' current clients, but noted that the types of applications being pursued span cancer and other disease types.
Lage, who was co-senior author on the InWeb_InBioMap paper published online last year, said his team has applied the network data "to almost any type of disease that you can imagine." Whereas the InWeb_InBioMap provides protein-protein interaction data or a "social network of genes," he explained, the new NetSig tool brings that protein interaction data together with tumor genome sequence data.
"It reconciles those two data types," Lage said, "and calculates a statistic for individual genes in the genomes — about how likely it is they are cancer genes based on their reputations and their social network, so to speak."
In their proof-of-principle study of NetSig's cancer driver gene prediction capabilities, he and his colleagues applied the algorithm to an early version of the InWeb_InBioMap network, along with exome sequence data for 4,742 tumors. In that pan-cancer set, they unearthed 62 potential new driver genes, including candidate oncogenes that they subsequently used to induce tumors in mouse models.
By focusing on a handful of the candidate drivers from that analysis, the team was subsequently able to find suspicious amplifications at play in a set of more than 200 lung adenocarcinoma tumors previously believed to lack oncogenes.
In addition, NetSig correctly classified known drivers in 60 percent of tumor subtypes when these smaller tumor subsets were analyzed independently.
"What was a little bit surprising to us — and we didn't tease this out in more detail — but it didn't seem like the predictive power was specifically correlated with the amount of samples," Lage said, explaining that the results in these tumor exome subsets hint that relatively small collections of samples may yield useful information with NetSig (though the minimum number of sequenced samples remains to be determined).
The NetSig algorithm is capable of handling a range of tumor sequence types, provided the sequence datasets result in a gene-based p-value such as that produced by the Broad's MutSig program. Likewise, NetSig can be run with several different network types. In their recent study, for example, Lage and his colleagues applied the algorithm to transcriptional network sets to check the robustness of their findings.
"We were quite worried about the knowledge contamination phenomenon: that maybe we're just good at predicting cancer genes because they're so well studied in protein interaction space," Lage explained. "One of the things that we did was to try an alternative type of network data and in this case we used transcriptional networks, because we know they can't really be affected by knowledge contamination."
Results obtained with normalized protein interaction data were comparable to those obtained with the transcriptional network input, he said, suggesting that knowledge contamination can be dealt with successfully in analyses that use protein-protein interactions as input.
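To make the general idea concrete, the following is a minimal Python sketch of scoring a gene from the gene-based p-values of its network neighbors. It is an illustration only, not the published NetSig method: the use of Fisher's method to combine p-values, the exclusion of the gene's own p-value, the function names, and the gene names and numbers are all assumptions made for this example.

```python
import math


def fisher_combined_p(p_values):
    """Combine p-values with Fisher's method (assumes independence)."""
    if not p_values:
        raise ValueError("need at least one p-value")
    k = len(p_values)
    # Fisher's statistic is X = -2 * sum(ln p); we work with X / 2.
    # Clamp tiny values so log() never sees zero.
    half_stat = -sum(math.log(max(p, 1e-300)) for p in p_values)
    # For a chi-square with 2k degrees of freedom, the upper tail is
    # exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return min(1.0, math.exp(-half_stat) * total)


def neighbor_scores(network, gene_p):
    """Score each gene from its neighbors' p-values (hypothetical scheme).

    network: dict mapping a gene to a list of interacting genes.
    gene_p:  dict mapping a gene to its gene-based mutation p-value.
    """
    scores = {}
    for gene, neighbors in network.items():
        # Assumption: the gene's own p-value is left out of its score.
        ps = [gene_p[n] for n in neighbors if n != gene and n in gene_p]
        if ps:
            scores[gene] = fisher_combined_p(ps)
    return scores


# Toy, made-up inputs: two genes, each with two scored neighbors.
toy_network = {
    "GENE_A": ["GENE_B", "GENE_C"],
    "GENE_D": ["GENE_E", "GENE_F"],
}
toy_gene_p = {"GENE_B": 0.001, "GENE_C": 0.01, "GENE_E": 0.6, "GENE_F": 0.8}

toy_scores = neighbor_scores(toy_network, toy_gene_p)
for name, score in sorted(toy_scores.items(), key=lambda kv: kv[1]):
    print(name, score)
```

In this toy input, GENE_A ranks ahead of GENE_D because its neighbors carry much smaller p-values. A real analysis would also need to account for how many interaction partners each gene has, which this sketch does not attempt.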
Along with potentially boosting the interpretation of smaller datasets, the NetSig method is expected to be applicable to everything from cancer driver detection to broader pathway analyses and the identification of mutation-depleted genes that might help predict vulnerabilities in cancer and perhaps other complex human conditions such as autism spectrum disorder.
"We're hoping that once we have more well-powered exome sequencing datasets from other diseases, that we'll be able to apply [NetSig to other conditions]," Lage said.
"To us, this is the starting point of hopefully being able to use this approach on other genetic diseases."
When they penned the original Nature Methods paper, Lage, Jensen, and colleagues noted that the InWeb_InBioMap housed more than 585,800 protein-protein interactions. It currently contains information on closer to 715,000 interactions.
"What's really quite nice about the InWeb_InBioMap framework … is that it's continuously updated. So every quarter, Intomics updates that network," Lage said, noting that "if people have their own proprietary interaction datasets, that can be built into the network completely seamlessly because the framework is automated."
Generally speaking, the strength of the underlying InWeb_InBioMap method is its ability to find novel biological mechanisms and pathways in a data-driven manner, Jensen said, adding that he believes the future of network biology-based approaches is "very bright."
The NetSig algorithm for cancer driver prediction and classification is freely available online through Lage's lab site at MGH, the Broad Institute, and Harvard Medical School.