In Print: Bioinformatics Tool-Related Papers of Note, August 2005


Douglas S, Montelione G, Gerstein M. PubNet: a flexible system for visualizing literature derived networks. [Genome Biology 2005, 6:R80]: Presents PubNet, a web-based tool that extracts several types of relationships returned by PubMed queries and maps them into networks.
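
As a rough illustration of mapping literature records into a network, the sketch below builds a weighted co-authorship graph. It is not PubNet's code: co-authorship is assumed here as one possible relationship type, the function name coauthor_network is hypothetical, and the two author lists are taken from entries in this column rather than from a live PubMed query.

```python
from collections import Counter
from itertools import combinations


def coauthor_network(records):
    """Return a Counter mapping (author_a, author_b) pairs to the number of shared papers.

    `records` maps a paper label to its list of authors (hypothetical input format).
    """
    edges = Counter()
    for authors in records.values():
        # Sort so each undirected pair is always stored in the same order.
        for a, b in combinations(sorted(set(authors)), 2):
            edges[(a, b)] += 1
    return edges


# Author lists copied from two entries in this column.
records = {
    "PubNet": ["Douglas S", "Montelione G", "Gerstein M"],
    "Typhon": ["Flannick J", "Batzoglou S"],
}

if __name__ == "__main__":
    for (a, b), weight in sorted(coauthor_network(records).items()):
        print(f"{a} -- {b} (shared papers: {weight})")
```

Each edge weight counts the papers two authors share, so the same structure could in principle carry other relationships by swapping authors for another field of the record.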

Du Z, Lin F, Roshan UW. Reconstruction of large phylogenetic trees: A parallel approach. [Comput Biol Chem. 2005 Aug;29(4):273-80]: Presents a parallel computing model for reconstructing phylogenetic trees for very large datasets using the widely used Multiple Instruction Multiple Data (MIMD) architecture. The model adapts the recursive-DCM3 decomposition method of Roshan et al to divide datasets into smaller subproblems, and "greatly reduces the computational time of the sequential version of the program." In a case study, the parallel approach took 22 hours on four processors to outperform the best score to date, according to the authors.

Erban R, Kevrekidis I, Adalsteinsson D, Elston T. Gene regulatory networks: a coarse-grained, equation-free approach to multiscale computation. [arXiv preprint archive:]: Describes computer-assisted methods for analyzing stochastic models of gene regulatory networks. The main idea underlying this equation-free analysis is to design and run appropriately initialized short bursts of stochastic simulations; the results are then processed to estimate coarse-grained quantities of interest, such as mesoscopic transport coefficients.
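
The short-burst idea can be sketched on a toy model. The example below is not the authors' code or their gene network: it assumes a single-species birth-death process (constant production, first-order degradation) simulated with the Gillespie algorithm, and the names burst and estimate_drift and all rate values are hypothetical. Many short bursts are started from the same copy number, and the average change over the burst gives an estimate of a coarse-grained drift.

```python
import random


def burst(n0, k_prod, k_deg, t_burst, rng):
    """Run one short Gillespie burst of a birth-death process; return the final copy number."""
    n, t = n0, 0.0
    while True:
        birth = k_prod        # constant production propensity (assumed model)
        death = k_deg * n     # first-order degradation propensity (assumed model)
        total = birth + death
        if total == 0.0:
            return n
        t += rng.expovariate(total)  # waiting time to the next reaction
        if t > t_burst:
            return n
        if rng.random() * total < birth:
            n += 1
        else:
            n -= 1


def estimate_drift(n0, k_prod, k_deg, t_burst=0.1, replicas=4000, seed=0):
    """Estimate the coarse drift at n0 as (mean final count - n0) / t_burst."""
    rng = random.Random(seed)
    mean_end = sum(
        burst(n0, k_prod, k_deg, t_burst, rng) for _ in range(replicas)
    ) / replicas
    return (mean_end - n0) / t_burst


if __name__ == "__main__":
    # For this toy model the estimate should be roughly k_prod - k_deg * n0.
    print(estimate_drift(20, 10.0, 0.1))
```

For this linear model the drift is known in closed form, which makes it a convenient check; the appeal of the equation-free approach is that the same burst-and-average procedure applies when no such closed form is available.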

Fedorov A, Stombaugh J, Harr M, Yu S, Nasalean L, Shepelev V. Computer identification of snoRNA genes using a Mammalian Orthologous Intron Database. [Nucleic Acids Research 2005 33(14):4578-4583]: Describes a bioinformatics package for the computational prediction of small nucleolar RNA (snoRNA) genes in mammalian introns. The core of the approach is the Mammalian Orthologous Intron Database (MOID), which contains all known introns within the human, mouse, and rat genomes. The program searches for conserved snoRNA motifs within MOID and reports all cases in which characteristic snoRNA-like structures are present in all three orthologous introns of the human, mouse, and rat sequences.

Flannick J, Batzoglou S. Using multiple alignments to improve seeded local alignment algorithms. [Nucleic Acids Research 2005 33(14):4563-4577]: Introduces an algorithm that uses the information implicit in a multiple alignment to dynamically build an index that is weighted most heavily towards the promising regions of the multiple alignment. Typhon, a local alignment tool that incorporates the indexing algorithm, is shown to be more sensitive than algorithms that index only a sequence.

Jin L, Tang H, Fang W. Prediction of protein subcellular locations using a new measure of information discrepancy. [J Bioinform Comput Biol. 2005 Aug;3(4):915-27]: Discusses a method for predicting protein subcellular locations from sequences. According to the authors, the overall predictive accuracy of the method was higher than that obtained with support vector machines.

Lin H, Wu K, Chang J, Sung T, Hsu W. GANA-a genetic algorithm for NMR backbone resonance assignment. [Nucleic Acids Research 2005 33(14):4593-4601]: Describes a method called GANA that uses a genetic algorithm to automatically perform backbone resonance assignment in NMR data with a high degree of precision and recall. Precision is the number of correctly assigned residues divided by the number of assigned residues, and recall is the number of correctly assigned residues divided by the number of residues with known human-curated answers. GANA takes spin systems as input data and uses two data structures, candidate lists and adjacency lists, to assign the spin systems to each amino acid of a target protein.
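
The two metrics defined above can be written out directly. The sketch below is only an illustration of those definitions, not part of GANA: the function name precision_recall, the dictionary-based input format, and the example assignments are all hypothetical.

```python
def precision_recall(predicted, reference):
    """Compute precision and recall for residue assignments.

    `predicted` maps residue index -> assigned spin system (unassigned residues omitted).
    `reference` maps residue index -> human-curated answer.
    """
    correct = sum(
        1 for residue, spin in predicted.items()
        if reference.get(residue) == spin
    )
    # Precision: correctly assigned residues / assigned residues.
    precision = correct / len(predicted) if predicted else 0.0
    # Recall: correctly assigned residues / residues with a known curated answer.
    recall = correct / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical example: four curated residues, three assigned, two of them correct.
reference = {1: "ss_a", 2: "ss_b", 3: "ss_c", 4: "ss_d"}
predicted = {1: "ss_a", 2: "ss_x", 3: "ss_c"}

if __name__ == "__main__":
    p, r = precision_recall(predicted, reference)
    print(f"precision={p:.3f} recall={r:.3f}")
```

In this made-up case precision is 2/3 and recall is 2/4, which shows how a method that leaves residues unassigned can keep precision high while recall drops.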

Oliver T, Schmidt B, Nathan D, Clemens R, Maskell D. Using reconfigurable hardware to accelerate multiple sequence alignment with ClustalW. [Bioinformatics 2005 21(16):3431-3432]: Discusses a method for speeding the computation of multiple sequence alignments using ClustalW and an off-the-shelf field programmable gate array. Availability: An online server for ClustalW running on a Pentium IV 3 GHz with a Xilinx XC2V6000 FPGA PCI-board is at

Pinter R, Rokhlenko O, Yeger-Lotem E, Ziv-Ukelson M. Alignment of metabolic pathways. [Bioinformatics 2005 21(16):3401-3408]: Introduces MetaPathwayHunter, a pathway alignment tool that, given a query pathway and a collection of pathways, finds and reports all approximate occurrences of the query in the collection, ranked by similarity and statistical significance. The authors used the tool to study the similarities and differences in the metabolic networks of Escherichia coli and Saccharomyces cerevisiae, as represented in highly curated databases, and report discovering "a few intriguing relationships between pathways that provide insight into the evolution of metabolic pathways." Availability: upon request from the authors.

Plewczynski D, Jaroszewski L, Godzik A, Kloczkowski A, Rychlewski L. Molecular modeling of phosphorylation sites in proteins using a database of local structure segments. [J Mol Model (Online). 2005 Aug 11]: Presents a bioinformatics tool for molecular modeling of the local structure around phosphorylation sites in proteins. The method is based on a library of short sequence and structure motifs and uses local structure segments (LSSs) as its basic units of prediction, which "enables us to avoid the problem of non-exact local description of structures, caused by either diversity in the structural context, or uncertainties in prediction methods," the authors note. The paper describes the LSS library and a profile-profile-matching algorithm that predicts local structures of proteins from their sequence information. Availability:

Smith M, Kunin V, Goldovsky L, Enright A, Ouzounis C. MagicMatch — cross-referencing sequence identifiers across databases. [Bioinformatics 2005 21(16):3429-3430]: Presents a rapid method called MagicMatch, which maps sequence identifiers across databases. The method uses the MD5 checksum algorithm, originally designed for message integrity, to generate sequence fingerprints, and uses these fingerprints as hash strings to map sequences across databases. According to the authors, the program can cross-link "any of the major sequence databases within a few seconds on a modest desktop computer." Availability:
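
The fingerprint idea can be sketched with Python's standard hashlib module. This is a minimal illustration, not MagicMatch itself: the function names, the whitespace and case normalization step, and the tiny example databases are assumptions made for the sketch.

```python
import hashlib


def fingerprint(sequence):
    """Return the MD5 hex digest of a sequence.

    Stripping whitespace and upper-casing first is an assumption of this sketch,
    so that formatting differences do not change the fingerprint.
    """
    normalized = "".join(sequence.split()).upper()
    return hashlib.md5(normalized.encode("ascii")).hexdigest()


def cross_reference(db_a, db_b):
    """Map each identifier in db_a to the identifiers in db_b with the same sequence."""
    index = {}
    for ident, seq in db_b.items():
        index.setdefault(fingerprint(seq), []).append(ident)
    return {ident: index.get(fingerprint(seq), []) for ident, seq in db_a.items()}


# Hypothetical identifiers and sequences, for illustration only.
db_a = {"A1": "MKTAYIAK", "A2": "GGGG"}
db_b = {"B7": "mktay iak", "B9": "TTTT"}

if __name__ == "__main__":
    print(cross_reference(db_a, db_b))
```

Because equal sequences always give equal digests, matching reduces to a dictionary lookup on a fixed-length string rather than a pairwise comparison of full sequences.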

