Despite a Low Profile, Bioinformatics Group at IBM Research Finds its User Base is Growing


While the business arm of IBM may have waited until October 2000 to formally launch a life sciences unit, the research branch of the company has been quietly building a comprehensive bioinformatics toolset since 1996.

Isadore Rigoutsos, who manages the bioinformatics and pattern discovery group at IBM’s computational biology center, told BioInform that the team recently released a new version of its web engines and tools.

Eschewing press releases and product announcements in favor of a word-of-mouth approach, the five-member group working out of the Watson Research Center has nonetheless been able to attract a number of users for its Teiresias pattern discovery algorithm and Bio-Dictionary collection of amino acid patterns. Rigoutsos said usage has increased by a factor of a hundred since a new user-friendly graphical user interface went online in March.

“People know about it mostly through our publications or because they search Google for pattern discovery,” Rigoutsos said. “We took that approach because there’s only five of us that have to do the research as well as the web design and maintenance, so we figured we’d let word of mouth spread and increase the number of users while we’re debugging it.”

Rigoutsos did not disclose the total number of users for IBM’s tools.

Teiresias is a two-phase combinatorial algorithm for general-purpose pattern discovery, but its speed and its ability to handle very large input datasets and arbitrarily large alphabets have made it useful for a number of computational biology tasks, including DNA tandem repeat discovery, automated protein functional and structural annotation, and gene discovery.
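
To make the idea of pattern discovery concrete, here is a minimal Python sketch. It is not the Teiresias algorithm and does not reflect IBM's code; it is a naive, brute-force enumeration written for illustration. The function name and its parameters (discover_patterns, width, literals, min_support) are hypothetical, and counting support as the number of distinct sequences containing a pattern is an assumption of this sketch.

```python
from collections import defaultdict
from itertools import combinations


def discover_patterns(sequences, width=4, literals=3, min_support=2):
    """Naively enumerate fixed-width patterns with '.' wildcards.

    Illustrative only (not Teiresias). Every pattern spans `width`
    positions, begins and ends with a literal character, and contains
    exactly `literals` literal characters in total. A pattern is kept
    if it occurs in at least `min_support` distinct input sequences.
    Assumes 2 <= literals <= width.
    """
    support = defaultdict(set)  # pattern -> indices of sequences containing it
    interior = range(1, width - 1)  # positions that may become wildcards
    for seq_id, seq in enumerate(sequences):
        for start in range(len(seq) - width + 1):
            window = seq[start:start + width]
            # Choose which interior positions stay literal; the rest become '.'
            for kept in combinations(interior, literals - 2):
                keep = {0, width - 1, *kept}
                pattern = "".join(
                    ch if i in keep else "." for i, ch in enumerate(window)
                )
                support[pattern].add(seq_id)
    return {
        pattern: sorted(ids)
        for pattern, ids in support.items()
        if len(ids) >= min_support
    }


if __name__ == "__main__":
    toy = ["ACDEF", "ACGEF", "TTTTT"]  # made-up toy sequences
    for pattern, ids in sorted(discover_patterns(toy).items()):
        print(pattern, ids)
```

On the toy input, the first two sequences share the patterns AC.E and C.EF, while the patterns found only in the third sequence fall below the support threshold. A brute-force enumeration like this grows quickly with the window width and input size, which is why an algorithm's speed on very large datasets matters for this kind of work.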

The Teiresias engine currently supports nine discovery, annotation, and analysis options. New options for DNA tandem repeat discovery, gene identification, and irredundant motif discovery are in the works.

The IBM team also released an updated version of its Bio-Dictionary tool — a collection of repeating protein sequence patterns that act as “words.” The first version of the Bio-Dictionary was released in 1999, using public databases such as SwissProt and GenPept as input. Rigoutsos said the newest release uses SwissProt/TrEMBL data from June 2000 and the team is completing a new computation using data from May 2001.

The updated Bio-Dictionary was used to annotate several complete genomes, and those annotations were released with the new interface in March. Annotations for two eukaryotic genomes, three archaeal genomes, and seven bacterial genomes are currently available.

Rigoutsos said more tools would be phased in as they are developed. He is aiming for a September release of some significant new features, including an interactive website that will let users process genome annotations using natural language text commands.

Teiresias and its associated tools are freely available for non-profit users at Tspd.html. Licenses are available for commercial use.

— BT
