Researchers at the Weizmann Institute of Science in Rehovot, Israel, have developed a new two-way clustering algorithm to analyze gene expression data. They hope to establish a company based on the technology, although initial attempts to interest venture capitalists have not been successful.
The algorithm uses a coupled two-way clustering method, which splits a dataset into two subsets, where one can be used to cluster the other. The researchers’ results were published in the October 24 issue of the Proceedings of the National Academy of Sciences.
Eytan Domany, a professor of physics at the institute, said that the coupled two-way clustering algorithm works well on large arrays of data, such as those from a gene chip, where much of the important biological information can be buried under mounds of other data.
“Our algorithm singles out small subsets of genes that when you use only them to look at a subset or the entire set of tissues, then you see structure that you couldn’t see before,” said Domany.
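The idea Domany describes can be illustrated with a small sketch. This is not the researchers' software: the clusterer here is a simple distance-threshold grouping swapped in for superparamagnetic clustering, the function names (`simple_cluster`, `one_coupled_pass`) are hypothetical, the data are invented toy values, and only a single pass is shown. Under those assumptions, it clusters the genes first, then clusters the samples again using only the genes in each gene cluster.

```python
from itertools import combinations
from math import dist


def simple_cluster(vectors, threshold):
    """Stand-in clusterer (an assumption, not superparamagnetic clustering):
    vectors closer than `threshold` are linked, and linked vectors share a group.
    Returns a sorted list of index lists."""
    parent = list(range(len(vectors)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(vectors)), 2):
        if dist(vectors[i], vectors[j]) < threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(vectors)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


def transpose(rows):
    return [list(col) for col in zip(*rows)]


def one_coupled_pass(data, threshold, cluster=simple_cluster):
    """data[g][s] is the expression of gene g in sample s.
    Cluster the genes, then cluster the samples using each gene cluster alone."""
    gene_clusters = cluster(data, threshold)
    result = {}
    for genes in gene_clusters:
        # Keep only the rows for this gene cluster, then view samples as vectors.
        sample_vectors = transpose([data[g] for g in genes])
        result[tuple(genes)] = cluster(sample_vectors, threshold)
    return result


# Toy data (invented): genes 0 and 1 separate samples {0, 1} from {2, 3};
# genes 2 and 3 vary on a larger scale and mask that split.
DATA = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [5, 9, 5, 9],
    [5, 9, 5, 9],
]

if __name__ == "__main__":
    print("all genes:", simple_cluster(transpose(DATA), 1.0))
    for genes, sample_groups in one_coupled_pass(DATA, 1.0).items():
        print("genes", genes, "-> sample groups", sample_groups)
```

With all four genes, no two samples in the toy data fall within the threshold of each other, so no grouping appears. Restricting the comparison to genes 0 and 1 splits the samples into two clean groups, which is the kind of hidden structure the quote refers to. The `cluster` parameter is left pluggable so that a different clustering routine could be passed in.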
The researchers used the clustering technique to analyze two previously published datasets on colon cancer and leukemia.
After clustering subsets of the data, the researchers were able to identify groups and correlations. Some of these help in understanding biological processes, while others can be used to pick out new research paths, he said.
For example, one test with the colon cancer data involved clustering the genes twice: first using both tumor and normal tissue samples, then using just the tumor tissues. Both analyses yielded two comparable gene clusters, but in the tumor tissue samples the expression levels of the two clusters were strongly correlated, indicating that colon cancer was more likely when people had both types of genes, which is known to be the case, he said.
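A toy calculation shows how a correlation between two gene clusters can appear only when the analysis is restricted to a subset of samples. The numbers below are invented for illustration and are not taken from the colon cancer dataset; the helper names (`pearson`, `cluster_profile`) and the choice of the mean as a cluster's expression level are assumptions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def cluster_profile(data, genes, samples):
    """Mean expression of a gene cluster in each of the chosen samples."""
    return [sum(data[g][s] for g in genes) / len(genes) for s in samples]


# Invented toy matrix: rows are genes, columns are samples.
# Samples 0-3 play the role of tumor tissue, samples 4-7 of normal tissue.
DATA = [
    [0, 1, 2, 3, 0, 1, 2, 3],  # gene 0 (cluster A)
    [2, 3, 4, 5, 2, 3, 4, 5],  # gene 1 (cluster A)
    [1, 3, 5, 7, 9, 7, 5, 3],  # gene 2 (cluster B)
    [3, 5, 7, 9, 7, 5, 3, 1],  # gene 3 (cluster B)
]
CLUSTER_A, CLUSTER_B = [0, 1], [2, 3]
TUMOR = [0, 1, 2, 3]
ALL_SAMPLES = list(range(8))


def cluster_correlation(samples):
    """Correlation between the two cluster profiles over the given samples."""
    return pearson(
        cluster_profile(DATA, CLUSTER_A, samples),
        cluster_profile(DATA, CLUSTER_B, samples),
    )


if __name__ == "__main__":
    print("all samples:", cluster_correlation(ALL_SAMPLES))
    print("tumor only: ", cluster_correlation(TUMOR))
```

In this constructed example the two clusters are uncorrelated across all samples but move together across the tumor samples alone, mirroring the pattern described for the colon cancer data.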
The researchers used the two-way method in conjunction with a separate clustering algorithm called superparamagnetic clustering. However, Domany said, the two-way clustering method works with any clustering algorithm.
Now that the results have been published, Domany is turning some of his attention to commercializing the clustering method, an effort that has so far proved unsuccessful.
Talks with two venture capital groups failed, he said, and discussions with an American pharmaceutical company stalled when it asked for a version of the software that it could test. Because Domany and his colleagues don’t have a user-friendly version of the algorithm, they asked the company to send data instead, but it wasn’t interested in doing that, he said. He has also talked to an Israeli software company.
If Domany is unable to found a business based on the technology, he will consider licensing it to a bioinformatics software company. His third option is to distribute it for free. The institute has filed a patent application in Israel for the two-way clustering method.
Meanwhile, Domany plans to create a website within the next two months that will enable researchers to submit data for analysis.
Despite his initial commercialization failures, Domany may have picked a good time to try to bring new clustering technology to market, according to some researchers who said new gene expression analysis approaches are sorely needed.
Owen White, director of the annotation informatics department at the Institute for Genomic Research, said that demand for gene expression clustering algorithms is great and he expects to see more results reported for clustering methods targeted at expression analysis.
“I don’t think anybody really anticipated that some statistical treatment of the [gene expression] data was going to be required to make sense of it. I think most people imagined that you’d basically be able to just look at the data” and understand it intuitively, said White. “So the hardware got delivered without good analysis tools.”
If Domany is able to turn his marketing efforts around, his analysis tool may help fix this problem.