In Print: Bioinformatics Tool-Related Papers of Note, December 2003


Badidi E. AnaBench: a Web/CORBA-based workbench for biomolecular sequence analysis. [BMC Bioinformatics 4:63]: Presents AnaBench, an interactive, web-based bioinformatics analysis workbench. The system employs a three-tier distributed architecture using CORBA, Java, JDBC, and JSP. Availability:

Ding Y, et al. A statistical sampling algorithm for RNA secondary structure prediction. [Nucleic Acids Research 31(24): 7280-7301]: Describes a statistical algorithm that samples from the Boltzmann ensemble of RNA secondary structures, computes the equilibrium partition functions of those structures using recent thermodynamic parameters, and generates a statistically representative sample of structures. According to the authors, the method may enable prediction of mRNA structure and target accessibility, and is “applicable to the rational design of small interfering RNAs (siRNAs), antisense oligonucleotides, and trans-cleaving ribozymes in gene knock-down studies.”

Down T. Relevance Vector Machines for classifying points and regions in biological sequences. [arXiv pre-print archive]: Describes the relevance vector machine (RVM), a machine-learning framework capable of building simple models from large sets of candidate features. Models are described for predicting transcription start sites and other features in genome sequences.

Fiser A. ModLoop: automated modeling of loops in protein structures. [Bioinformatics 19(18): 2500-2501]: Presents ModLoop, a web server for automated modeling of loops in protein structures. The server relies on the loop modeling routine in Modeller, which predicts loop conformations by satisfaction of spatial restraints, without relying on a database of known protein structures. ModLoop runs on a cluster of Linux PCs. Availability:

Hein J, et al. Recursions for statistical multiple alignment. [Proc Natl Acad Sci USA 100(25): 14960-14965]: Presents algorithms that calculate the probability of a set of sequences related by a binary tree that have evolved according to the Thorne-Kishino-Felsenstein model for a fixed set of parameters. The algorithms are based on a Markov chain generating sequences and their alignment at nodes in a tree.

Köhler J, et al. SEMEDA: ontology based semantic integration of biological databases. [Bioinformatics 19(18): 2420-2427]: Describes SEMEDA (Semantic Meta Database), which provides semantically integrated access to databases, ontologies, and controlled vocabularies. Availability:

Liao C, et al. Network component analysis: Reconstruction of regulatory signals in biological systems. [Proc Natl Acad Sci USA 100(26): 15522-15527]: Presents a method, called network component analysis, for uncovering hidden regulatory signals from outputs of networked systems, “when only a partial knowledge of the underlying network topology is available,” according to the authors. The method is applied to microarray data from Saccharomyces cerevisiae to reconstruct the activities of various transcription factors during the cell cycle.

Liu W, et al. Algorithms for large-scale genotyping microarrays. [Bioinformatics 19(18): 2397-2403]: Describes algorithms for feature extraction, classification, statistical modeling, and filtering for the analysis of SNPs using high-density oligonucleotide microarrays. Availability: data is available at, and the algorithms will be available commercially in the Affymetrix software package.

Moore J. Gene structure prediction in syntenic DNA segments. [Nucleic Acids Research 31(24): 7271-7279]: Describes a comparative gene prediction method called pattern filtering that uses synteny between two or more genomic segments to annotate genomic sequences.

Strong M, et al. Visualization and interpretation of protein networks in Mycobacterium tuberculosis based on hierarchical clustering of genome-wide functional linkage maps. [Nucleic Acids Research 31(24): 7099-7109]: Presents a method for the visualization and interpretation of genome-wide functional linkages inferred by the Rosetta Stone, Phylogenetic Profile, Operon, and Conserved Gene Neighbor computational methods.

Wang T, Stormo GD. Combining phylogenetic data with co-regulated genes to identify regulatory motifs. [Bioinformatics 19(18): 2369-2380]: Introduces PhyloCon, a new algorithm to discover regulatory motifs, which takes into account both conservation among orthologous genes and co-regulation of genes within a species. Availability:

