In Print: Bioinformatics Tool-Related Papers of Note, March 2006


Chang AN, McDermott J, Frazier Z, Guerquin M, Samudrala R. INTEGRATOR: interactive graphical search of large protein interactomes over the Web [BMC Bioinformatics 2006, 7:146]: Describes Integrator, a graphical search tool for protein-protein interaction networks across more than 50 genomes. Availability:

Ciccarelli FD, Doerks T, von Mering C, Creevey CJ, Snel B, Bork P. Toward automatic reconstruction of a highly resolved tree of life. [Science. 2006 Mar 3;311(5765):1283-7]: Discusses an automatable procedure for reconstructing the tree of life with branch lengths comparable across all three domains. The tree has its basis in a concatenation of 31 orthologs occurring in 191 species with sequenced genomes.

Colinge J, Masselot A, Carbonell P, Appel RD. InSilicoSpectro: An Open-Source Proteomics Library. [J Proteome Res. 2006 Mar 3;5(3):619-624]: Introduces an open source proteomics software project called InSilicoSpectro that is aimed at implementing recurrent computations that are necessary for proteomics data analysis, such as mass list file format conversions, protein sequence digestion, theoretical peptide and fragment mass computations, graphical display, matching with experimental data, isoelectric point estimation, and peptide retention time prediction. Availability:

Falkner JA, Falkner JW, Andrews PC. JAF: reference information and tools for proteomics. [Bioinformatics 2006 22(5):632-633]: Describes the Java Analysis Framework (JAF) for proteomics, an open source library of Java code that abstracts all of the atomic masses, known stable isotopes, atomic compositions of amino acids, observed modifications of known amino acids, and ion masses that directly correspond to known amino acid sequences, enabling more rapid development of proteomics tools. Availability:

Glusman G, Qin S, El-Gewely MR, Siegel AF, Roach JC, et al. A third approach to gene prediction suggests thousands of additional human transcribed regions. [PLoS Comput Biol 2(3): e18]: Describes a new gene prediction approach to serve as an alternative to the two primary methods: modeling gene structure and recognizing sequence similarity. The new approach is based on detecting the genomic signatures of transcription accumulated over evolutionary time. The authors describe four algorithms based on the concept: Greens, CHOWDER (CHanges Oriented Within DispErsed Repeats), ROAST (Repeat Orientation Analysis Suggesting Transcripts), and PASTA (Polyadenylation Signal Transcript Analysis). They combined these algorithms into an integrated method called FEAST (Fast Empirical Algorithms Suggesting Transcripts), which they used to predict the location and orientation of "thousands of putative transcription units not overlapping known genes."

Gold ND, Jackson RM. A searchable database for comparing protein-ligand binding sites for the analysis of structure-function relationships. [J Chem Inf Model. 2006 Mar-Apr;46(2):736-42]: Describes a database of ligand binding sites extracted automatically from the Protein Data Bank. This has been combined with a method for calculating binding site similarity based on geometric hashing to create a relational database for the retrieval of site similarity and binding site superposition. Availability:

Grundhoff A, Sullivan CS, Ganem D. A combined computational and microarray-based approach identifies novel microRNAs encoded by human gamma-herpesviruses. [RNA. 2006 Mar 15]: Describes an algorithm, VMir, for identifying microRNAs. The approach was designed for use on genomes of less than 2 Mb and was tested on cells infected by either of two lymphotropic herpesviruses, KSHV and EBV.

Kim JJ, Zhang Z, Park JC, Ng SK. BioContrasts: extracting and exploiting protein-protein contrastive relations from biomedical literature. [Bioinformatics 2006 22(5):597-605]: Describes BioContrasts, a system that extracts protein-protein contrastive information from Medline abstracts and presents it to biologists in a web application that can be used for tasks such as the refinement of biological pathways. Availability:

Lee TJ, Pouliot Y, Wagner V, Gupta P, Stringer-Calvert DW, Tenenbaum JD, Karp PD. BioWarehouse: a bioinformatics database warehouse toolkit. [BMC Bioinformatics. 2006 Mar 23;7(1):170]: Introduces BioWarehouse, an open source toolkit for constructing bioinformatics database warehouses using the MySQL and Oracle relational database managers. BioWarehouse currently supports the integration of UniProt, GenBank, ENZYME, KEGG, BioCyc, NCBI Taxonomy, and CMR. Availability:

Lehrach WP, Husmeier D, Williams CK. A regularized discriminative model for the prediction of protein-peptide interactions. [Bioinformatics 2006 22(5):532-540]: Describes a probabilistic discriminative approach to predicting peptide recognition module (PRM)-mediated protein-protein interactions from sequence data. According to the authors, the method overcomes the problem of susceptibility to over-fitting by adopting a Bayesian a posteriori approach. Availability:

Magdaleno S, Jensen P, Brumwell CL, Seal A, Lehman K, et al. BGEM: An In Situ Hybridization Database of Gene Expression in the Embryonic and Adult Mouse Nervous System. [PLoS Biol 4(4): e86]: Presents the St. Jude Brain Gene Expression Map (BGEM), a collection of in situ hybridization images of gene expression patterns in the nervous system of the developing and adult C57BL/6 mouse. Availability:

Mahé P, Ralaivola L, Stoven V, Vert JP. The pharmacophore kernel for virtual screening with support vector machines. [ArXiv preprint archive:]: Describes a family of kernels designed for applying kernel methods to the 3D structures of molecules. The kernels are based on a comparison of the three-point pharmacophores present in those 3D structures. According to the authors, the approach outperforms algorithms based on the 2D structure of molecules for the detection of inhibitors of several drug targets.

Marioni JC, Thorne NP, Tavare S. BioHMM: a heterogeneous hidden Markov model for segmenting array CGH data. [Bioinformatics. 2006 Mar 13 (e-pub ahead of print)]: Introduces BioHMM, a method for segmenting array comparative genomic hybridization data into states with the same underlying copy number. Availability:

Montgomery SB, Griffith OL, Sleumer MC, Bergman CM, Bilenky M, Pleasance ED, Prychyna Y, Zhang X, Jones SJ. ORegAnno: an open access database and curation system for literature-derived promoters, transcription factor binding sites and regulatory variation. [Bioinformatics 2006 22(5):637-640]: Presents the Open Regulatory Annotation (ORegAnno) database, a collection of literature-curated regulatory regions, transcription factor binding sites, and regulatory mutations that has been designed to manage the submission, indexing, and validation of new annotations from users. Availability:

Reeder J, Hochsmann M, Rehmsmeier M, Voss B, Giegerich R. Beyond Mfold: Recent advances in RNA bioinformatics. [J Biotechnol. 2006 Mar 9]: Presents five recent computational approaches for analyzing RNA secondary structure, including synoptic folding space analysis, pseudoknot prediction, structure alignment, comparative structure prediction, and miRNA target prediction. Availability:

Shannon PT, Reiss DJ, Bonneau R, Baliga NS. Gaggle: An open-source software system for integrating bioinformatics software and data sources. [BMC Bioinformatics. 2006 Mar 28;7(1):176]: Describes Gaggle, an open source Java software environment for software and database integration. Availability:

Smith A, Chandonia JM, Brenner SE. ANDY: a general, fault-tolerant tool for database searching on computer clusters. [Bioinformatics 2006 22(5):618-620]: Introduces ANDY (seArch coordination aND analYsis), a set of Perl programs and modules for distributing large biological database searches across the nodes of a Linux computer cluster. Availability:

Stojmirovic A, Andreae P, Boland M, Jordan TW, Pestov VG. PFMFind: a system for discovery of peptide homology and function. [ArXiv preprint archive:]: Describes Protein Fragment Motif Finder (PFMFind), a system for discovering relationships between short fragments of protein sequences using similarity search. It supports queries based on score matrices and PSSMs obtained through an iterative procedure similar to Psi-Blast.

Urbanczik R. SNA — a toolbox for the stoichiometric analysis of metabolic networks. [BMC Bioinformatics. 2006 Mar 13;7(1):129]: Describes SNA (Stoichiometric Network Analysis), a toolbox for analyzing the possible steady state behavior of metabolic networks by computing the generating and elementary vectors of their flux and conversions cones. Availability:

Zhong W, Sternberg PW. Genome-wide prediction of C. elegans genetic interactions. [Science. 2006 Mar 10;311(5766):1481-4]: Describes the computational integration of interactome data, gene expression data, phenotype data, and functional annotation data from three model organisms (Saccharomyces cerevisiae, Caenorhabditis elegans, and Drosophila melanogaster) to predict genome-wide genetic interactions in C. elegans. The resulting genetic interaction network contains 18,183 interactions.
