BETHESDA, Md.--Larry Hunter, formerly director of the Machine Learning Project at the US National Institutes of Health's National Library of Medicine, has joined the National Cancer Institute as chief of the Molecular Statistics and Bioinformatics section, a newly created entity in the Biometrics Research Branch of the Cancer Therapy Evaluation Program. Hunter, who is also president of the International Society for Computational Biology and an adjunct associate professor at George Mason University, told BioInform that the branch intends to begin using statistical and machine-learning approaches to understand the molecular nature of cancer.
According to Hunter, the institute generates "an enormous amount of biological data, including sequences of oncogenes, gene expression profiles of neoplastic tissues, high-throughput screening of antitumor compounds, and allelic variation assays such as single-nucleotide polymorphisms." The new bioinformatics section will develop and apply sophisticated analytical tools to help make sense of that abundance of data, he explained.
"Our basic research goal is in developing new algorithms and frameworks for efficient and effective inference," Hunter said. Several commercial entities have now licensed a machine-learning system that Hunter developed at the National Institutes of Health. The institutes have applied for patents on that work, Hunter said, adding, "We hope to continue in that tradition."
So far, the new molecular statistics and bioinformatics group comprises three computer scientists, two statisticians, a molecular biologist, and a physician. Hunter said he is recruiting additional postdocs and will purchase a Silicon Graphics O2000 mini-supercomputer and workstations. He added that, for the most part, his group will write its own software in a combination of Lisp, Java, and C.
The group's computational tools will be used for projects in other areas significant to the institute, too. "For example," Hunter said, "we are working on induction of predictive models from the small molecule assays of tumor growth inhibition, novel clustering tools for visualizing and analyzing gene expression data, and the use of text mining techniques in Medline to automatically develop organized knowledgebases for particular literature searches." He said his group is also involved in gene selection for the institute's gene expression profiling projects, and in building mathematical models of drug treatment responses.
After 10 years with the National Library of Medicine, staying within the public sector was a conscious choice for Hunter, who said he is concerned about the consequences of the trend among bioinformatics researchers to move to the private sector. "I am committed to seeing high-quality research and training in bioinformatics remaining in the public sphere," he remarked.
Hunter added that he believes progress in biology will increasingly depend on the development of algorithms that identify significant hidden structure in massive amounts of molecular data. Conversely, he contended, challenges from biology will drive the invention of important new computational techniques.