Reel Two Prepares to Launch Two Targeted Text-Mining Products


As the field of biomedical text mining matures, researchers are finding a number of new application areas for the technology. Reel Two, a text-mining software firm that focuses on the life-sciences market, is hoping to exploit this trend by launching several new targeted applications based on its flagship Classification System technology.

The first, SureGene, addresses the “gene disambiguation” problem in the biomedical literature, according to Nicko Goncharoff, senior vice president at Reel Two. Gene names are a particular challenge for text-mining systems because there are many synonyms for the same gene, as well as many gene names that are also common English words.

In collaboration with AstraZeneca, Reel Two ran all of MedLine through its Classification System to create a pre-filtered database of abstracts in which gene names are assigned to their Entrez/LocusLink IDs. Users enter a canonical gene name or any related synonym and the system will return a ranked list of selected abstracts that are about that gene, regardless of how that gene is referred to in the article. The database is updated daily to classify genes in new articles, Goncharoff said.
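The lookup described above can be pictured with a small sketch. This is not Reel Two's implementation, which the article does not describe in detail; it only illustrates the idea of resolving any synonym to a single gene ID and then returning pre-classified abstracts in ranked order. The synonym table, gene IDs, abstract IDs, relevance scores, and the function name `search_by_gene` are all hypothetical placeholders.

```python
# Hypothetical synonym table: every known name for a gene maps to one ID.
# These names and IDs are illustrative placeholders, not real Entrez records.
SYNONYM_TO_ID = {
    "genea": "ID:1",
    "ga1": "ID:1",
    "alpha factor": "ID:1",
    "geneb": "ID:2",
    "gb": "ID:2",
}

# Hypothetical pre-classified index: gene ID -> (abstract ID, relevance score).
# In the article's description, this classification is done ahead of time.
ABSTRACT_INDEX = {
    "ID:1": [("abs-101", 0.62), ("abs-102", 0.91), ("abs-103", 0.48)],
    "ID:2": [("abs-201", 0.77)],
}


def search_by_gene(name):
    """Return abstracts for a gene, highest score first, for any known synonym."""
    # Normalize the query so that case and stray whitespace do not matter.
    gene_id = SYNONYM_TO_ID.get(name.strip().lower())
    if gene_id is None:
        return []  # unknown name: no gene to look up
    hits = ABSTRACT_INDEX.get(gene_id, [])
    # Rank by relevance score, descending.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

In this sketch, a query for "GA1" and a query for "GeneA" resolve to the same ID and so return the same ranked list, which is the behavior the article attributes to SureGene.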

A beta version of SureGene 1.0 that covers 8,300 human genes is currently available, and Goncharoff said that an upgrade, version 1.1, which will cover 38,000 genes, is expected to be available in the first quarter of 2005. Version 2.0, which will have additional features, is slated for a spring release.

Reel Two is also partnering with cheminformatics firm OpenEye Software to develop a text-mining application called SureChem that will allow users to search the literature for chemicals using either structures or chemical names. Goncharoff said that most search engines currently scan the literature using either structure or keyword, “but there were no tools to bridge the gap.” SureChem uses Reel Two’s Entity Extraction technology in combination with OpenEye’s OGHAM text-to-structure conversion package to enable searches by structure or chemical name. An online demo of SureChem 0.1 is available now, and the package will be available for installation in mid-December. Version 1.1, which will include additional features such as phrase extraction, will be ready in the first quarter of 2005.

Goncharoff said that the more customer feedback Reel Two gets, the more ideas it comes up with for tweaking its text-mining software for specific application areas. He said the company will likely merge the functionality of SureGene and SureChem so that users can extract gene and protein names associated with a chemical search. In addition, he said, the company plans to offer a version of SureGene that enables researchers to retrieve literature related to genes identified in quantitative trait loci analysis. Reel Two has also applied SureChem to extract compounds from patent databases.

— BT

