In Print: Bioinformatics Tool-Related Papers of Note, June 2004


Boussau B, et al. Computational inference of scenarios for alpha-proteobacterial genome evolution. [Proc. Natl. Acad. Sci. USA 2004 101(26): 9722-9727]: Describes computational approaches for inferring ancestral gene sets and quantifying the flux of genes along the branches of the alpha-proteobacterial species tree.

Cheng J, et al. NetAffx Gene Ontology Mining Tool: a visual approach for microarray data analysis. [Bioinformatics 2004 20: 1462-1463]: Presents the NetAffx Gene Ontology Mining Tool, which traverses the GO graph in the context of microarray data. Availability:

D’Ascenzo M, Collmer A, Martin G. PeerGAD: a peer-review-based and community-centric web application for viewing and annotating prokaryotic genome sequences. [Nucleic Acids Research 2004 32(10):3124-3135]: Presents a web-based application for community-wide peer-reviewed annotation of prokaryotic genome sequences. The application was developed to support the annotation of the Pseudomonas syringae pv. tomato strain DC3000 genome sequence and is portable to other genome sequence annotation projects.

Diella F, et al. Phospho.ELM: A database of experimentally verified phosphorylation sites in eukaryotic proteins. [BMC Bioinformatics 2004, 5:79]: Introduces Phospho.ELM, a database of experimentally verified phosphorylation sites manually curated from the literature. Phospho.ELM version 2.0 contains 1,703 phosphorylation site instances for 556 phosphorylated proteins. Availability:

Geer L, et al. Open Mass Spectrometry Search Algorithm. [arXiv pre-print archive:]: Describes the Open Mass Spectrometry Search Algorithm (OMSSA), in which specificity is calculated by a classic probability score using an explicit model for matching experimental spectra to sequences. According to the authors, OMSSA matches more spectra than a comparable algorithm, and is designed to be faster than published algorithms in searching large MS/MS datasets.

Green M, Karp P. A Bayesian method for identifying missing enzymes in predicted metabolic pathway databases. [BMC Bioinformatics 2004, 5:76]: Discusses the PathoLogic software program, which constructs pathway/genome databases by using a genome’s annotation to predict the set of metabolic pathways present in an organism. The program uses a set of sequences encoding the required activity in other genomes to identify candidate proteins in the genome of interest, and then evaluates each candidate by using a Bayes classifier.

Hornbeck P, et al. PhosphoSite: A bioinformatics resource dedicated to physiological protein phosphorylation. [Proteomics 2004 4(6):1551-61]: Introduces PhosphoSite, a curated, web-based database of physiologic sites of protein phosphorylation in human and mouse. PhosphoSite contains information from published literature and high-throughput discovery programs.

López-Bigas N, Ouzounis C. Genome-wide identification of genes likely to be involved in human genetic disease. [Nucleic Acids Research 2004 32(10):3108-3114]: Discusses a computational method for detecting genes likely to be involved in hereditary disease based on “key distinctive features” shared within this group of genes, including greater length of their amino acid sequence, a broader phylogenetic extent, and specific conservation and paralogy profiles compared with all human proteins. Availability (probability score assignments for the human genome):

Middendorf M, et al. Predicting Genetic Regulatory Response using Classification: Yeast Stress Response. [arXiv pre-print archive:]: Discusses a classification-based algorithm called GeneClass for learning to predict gene regulatory response. GeneClass uses the AdaBoost learning algorithm with a margin-based generalization of decision trees called alternating decision trees.

Nemenman I. Information theory, multivariate dependence, and genetic network inference. [arXiv pre-print archive:]: Presents a graphical notation to denote dependence among multiple variables using maximum entropy techniques, which is useful for inference of genetic circuits and other biological signaling networks.

Parkinson J, et al. PartiGene-constructing partial genomes. [Bioinformatics 2004 20(9):1398-1404]: Describes a sequence analysis suite that uses public domain software to process raw trace chromatograms into sequence objects suitable for submission to dbEST; place these sequences within a genomic context; perform customizable first-pass annotation of the data; and present the data as HTML tables and an SQL database resource. Availability:

Pavesi G, et al. RNAProfile: an algorithm for finding conserved secondary structure motifs in unaligned RNA sequences. [Nucleic Acids Research 2004 32(10):3258-3269]: Presents an algorithm that takes as input a set of unaligned RNA sequences expected to share a common motif, and outputs the regions that are most conserved, according to a similarity measure that takes into account both the sequence of the regions and the secondary structure they can form according to base-pairing and thermodynamic rules.

Raymond S, O’Toole N, Cygler M. A data management system for structural genomics. [Proteome Science 2004, 2:4]: Describes a data management system for structural genomics based on a database schema that deals with all facets of the structure determination process, from target selection to data deposition.

Sakharkar M, Kangueane P. Genome SEGE: A database for ‘intronless’ genes in eukaryotic genomes. [BMC Bioinformatics 2004, 5:67]: Presents an improvement on the SEGE database of eukaryotic single-exon genes that includes ‘intronless’ genes in completely sequenced eukaryotic genomes. Availability:

Van Walle I, Lasters I, Wyns L. Align-m — a new algorithm for multiple alignment of highly divergent sequences. [Bioinformatics 2004 20: 1428-1435]: Introduces a new program, Align-m, which uses a non-progressive local approach to guide a global alignment. Availability:

Wang J, et al. M-CGH: Analyzing microarray-based CGH experiments. [BMC Bioinformatics 2004, 5:74]: Describes M-CGH, a MATLAB toolbox with a graphical user interface designed specifically for the analysis of microarray-based comparative genomic hybridization experiments. Availability:

