Brunet J, et al. Metagenes and molecular pattern discovery using matrix factorization. [Proc Natl Acad Sci USA 2004 101(12): 4164-9]: Discusses the nonnegative matrix factorization (NMF) algorithm and its use in reducing the dimensionality of expression data from thousands of genes to a handful of metagenes.
Down T, Leong B, Hubbard T. A machine learning strategy to identify exonic splice enhancers in human protein-coding sequence. [arXiv pre-print archive: http://arXiv.org/abs/q-bio/0403024]: Presents a strategy for analyzing protein-coding sequence by first randomizing the codons used at each position within the coding sequence, then applying a motif-based machine-learning algorithm to compare the true and randomized sequences.
Guo J, et al. A novel method for protein secondary structure prediction using dual-layer SVM and profiles. [Proteins 2004 54(4): 738-43]: Presents a method for protein secondary structure prediction based on a dual-layer support vector machine and position-specific scoring matrices. Availability: http://www.bioinfo.tsinghua.edu.cn/pmsvm.
Liu Y, et al. Eukaryotic regulatory element conservation analysis and identification using comparative genomics. [Genome Res. 2004 14(3): 451-8]: Describes a sequence-motif-finding algorithm called CompareProspector, which biases the search toward regions conserved across species. Using human-mouse comparison, CompareProspector identified known motifs for transcription factors Mef2, Myf, Srf, and Sp1 from a set of human-muscle-specific genes. Availability: http://compareprospector.stanford.edu/.
Manduchi E, et al. RAD and the RAD Study-Annotator: an approach to collection, organization, and exchange of all relevant information for high-throughput gene expression studies. [Bioinformatics 2004 20(4): 452-459]: Describes RAD (RNA Abundance Database), a MIAME-compliant infrastructure for gene expression data management that uses ontologies and an annotated gene index to integrate genomic and transcriptomic data from multiple organisms. Availability: http://www.cbil.upenn.edu/RAD/RAD-installation.htm.
Ozkan S, Meirovitch H. Conformational search of peptides and proteins: Monte Carlo minimization with an adaptive bias method applied to the heptapeptide deltorphin. [J Comput Chem. 2004 25(4): 565-72]: Presents a conformational search technique called Monte Carlo minimization with an adaptive bias (MCMAB). It builds on Monte Carlo minimization but uses biased probabilities that depend on the structure-energy correlations, which increase as the search approaches the global energy minimum.
Pandit S, et al. SUPFAM: A database of sequence superfamilies of protein domains. [BMC Bioinformatics 2004, 5:28]: Presents SUPFAM, a database of superfamily relationships between protein domain families of either known or unknown 3D structure based on associated sequence families from Pfam and structural families from SCOP. Availability: http://pauling.mbu.iisc.ernet.in/~supfam.
Pesavento J, et al. Shotgun annotation of histone modifications: a new approach for streamlined characterization of proteins by top down mass spectrometry. [J Am Chem Soc. 2004 126(11): 3386-7]: Demonstrates a strategy for characterizing post-translational modifications by database retrieval instead of manual interpretation of data from high-resolution tandem mass spectrometry, using a database of nearly 50,000 modified histone H4 sequences.
Riva A, Kohane I. A SNP-centric database for the investigation of the human genome. [BMC Bioinformatics 2004, 5:33]: Presents SNPper, a web-based application designed to facilitate the retrieval and use of human SNPs for high-throughput research. Availability: http://snpper.chip.org/.
Romero P, Karp P. Using functional and organizational information to improve genome-wide computational prediction of transcription units on pathway-genome databases. [Bioinformatics 2004 20(5): 709-717]: Describes a method for predicting transcription units using only intergenic distance and functional classification of genes. Availability: Included in version 7.0 of the Pathway Tools software suite (http://biocyc.org/download.shtml).
Stocker G, Rieder D, Trajanoski Z. ClusterControl: a web interface for distributing and monitoring bioinformatics applications on a Linux cluster. [Bioinformatics 2004 20(5): 805-807]: Presents ClusterControl, a web interface for distributing and monitoring bioinformatics applications on Linux cluster systems. Availability: http://genome.tugraz.at/Software/ClusterControl.
Takahashi K, et al. A multi-algorithm, multi-timescale method for cell simulation. [Bioinformatics 2004 20(4): 538-546]: Describes a modular, object-oriented simulation meta-algorithm based on a discrete-event scheduler and Hermite polynomial interpolation that is available as part of the E-Cell simulation environment. Availability: http://www.e-cell.org/software.
Von Grünberg H, Kollman M. Variations in substitution rate in human and mouse genomes. [arXiv pre-print archive: http://arXiv.org/abs/q-bio/0403044]: Describes a method to quantify spatial fluctuations of the substitution rate on different length scales throughout genomes of eukaryotes.
Zacharias M. Rapid protein-ligand docking using soft modes from molecular dynamics simulations to account for protein deformability: binding of FK506 to FKBP. [Proteins 2004 54(4): 759-67]: Describes a docking method that allows relaxation of the protein conformation in precalculated soft flexible degrees of freedom during ligand-receptor docking.
Zhu X, Hood L, Ao P. Robustness, stability and efficiency of phage lambda gene regulatory network: dynamical structure analysis. [arXiv pre-print archive: http://arXiv.org/abs/q-bio/0403016]: Discusses a mathematical framework called dynamical structure analysis, and illustrates the approach by the study of stability, robustness, and efficiency of the simplest gene regulatory network of phage lambda.
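The codon-randomization step summarized in the Down, Leong, and Hubbard entry can be pictured with a minimal sketch. It assumes that "randomizing the codons" means replacing each codon with a uniformly chosen synonymous codon, so that the encoded protein is unchanged; the paper may use a different scheme (for example, one weighted by codon usage). The names `randomize_codons` and `translate` are hypothetical and do not come from the authors' software.

```python
import random
from itertools import product

# Standard genetic code, listed in TCAG order ('*' marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Group codons by the amino acid they encode.
SYNONYMS = {}
for codon, aa in CODON_TO_AA.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def translate(cds):
    """Translate an in-frame coding sequence (uppercase DNA) to protein."""
    return "".join(CODON_TO_AA[cds[i:i + 3]] for i in range(0, len(cds), 3))


def randomize_codons(cds, rng=None):
    """Hypothetical helper: swap every codon for a random synonymous one.

    Assumption: uniform choice among synonyms, which preserves the protein
    sequence while scrambling any codon-level signal.
    """
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    rng = rng or random.Random()
    return "".join(
        rng.choice(SYNONYMS[CODON_TO_AA[cds[i:i + 3]]])
        for i in range(0, len(cds), 3)
    )


if __name__ == "__main__":
    cds = "ATGGCTAAATGGTTTTAA"
    shuffled = randomize_codons(cds, random.Random(0))
    print(cds, translate(cds))
    print(shuffled, translate(shuffled))
```

Under these assumptions, a motif-based classifier trained to distinguish true sequences from such randomized ones can only exploit nucleotide-level signal, since the protein sequence is identical in both sets.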