Bioinformatics Tool-Related Papers of Note, June 2008

Note: In addition to the listing below, Nucleic Acids Research’s annual bioinformatics web server issue is available here and the proceedings for the upcoming Intelligent Systems for Molecular Biology conference are available via Bioinformatics here.

Bergmann FT, Sauro HM. Comparing simulation results of SBML Capable Simulators. [Bioinformatics. 2008 Jun 30. (e-pub ahead of print)]: Presents an approach for comparing results from different simulation software tools, as well as a website for the computational systems biology research community to share simulation results. Available here.

Goll J, Rajagopala SV, Shiau SC, Wu H, Lamb BT, Uetz P. MPIDB: The microbial protein interaction database. [Bioinformatics. 2008 Jun 13. (e-pub ahead of print)]: Presents the microbial protein interaction database, or MPIDB, which aims to collect and provide all known physical microbial interactions. Currently, the database includes 22,530 experimentally determined interactions among proteins of 191 bacterial species and strains. These microbial interactions have been manually curated from the literature or imported from other databases, such as IntAct, DIP, BIND, and MINT. Available here.

Hsing M, Cherkasov A. Indel PDB: a database of structural insertions and deletions derived from sequence alignments of closely related proteins. [BMC Bioinformatics. 2008 Jun 25;9(1):293]: Describes Indel PDB, a structural database of insertions and deletions identified from the sequence alignments of highly similar proteins found in the Protein Data Bank. Indel PDB contains 117,266 non-redundant indel sites extracted from 11,294 indel-containing proteins. Available here.

Huang W, Marth GT. EagleView: a genome assembly viewer for next-generation sequencing technologies. [Genome Res. 2008 Jun 11. (e-pub ahead of print)]: Describes a data-integration and -visualization tool called EagleView that is designed to handle a “large genome assembly of millions of reads” and to analyze data from next-generation sequencers, according to the paper’s abstract. EagleView supports viewing co-assembly of mixed-type reads from different technologies, and enables the integration of genome feature annotations into genome assemblies. Available here.

Jelier R, 't Hoen PA, Sterrenburg E, den Dunnen JT, van Ommen GJ, Kors JA, Mons B. Literature-aided meta-analysis of microarray data: a compendium study on muscle development and disease. [BMC Bioinformatics 2008, 9:291]: Introduces an algorithm called LAMA, or Literature-Aided Meta-Analysis, that quantifies the similarity between transcriptomics studies in order to support comparative analysis of expression microarray studies. The authors evaluated the algorithm on a compendium of 102 microarray studies published in the field of muscle development and disease, and compared it to similarity measures based on gene overlap and over-representation of biological processes assigned by the Gene Ontology. “While the overlap in both genes and overrepresented GO-terms was poor, LAMA retrieved many more biologically meaningful links between studies, with substantially lower influence of technical factors,” the abstract states.

Keskin O, Nussinov R, Gursoy A. Prism: protein-protein interaction prediction by structural matching. [Methods Mol Biol. 2008;484:505-21]: Introduces Prism, software for predicting protein-protein interactions that is based on a “bottom-up approach” that combines structure and sequence conservation in protein interfaces, according to the paper’s abstract. The algorithm looks for possible binary interactions between proteins through structure similarity and evolutionary conservation of known interfaces. It includes a database of protein interface structures derived from the Protein Data Bank and predicted protein-protein interactions. In the current version, 3,799 structurally nonredundant interfaces are used to predict the interactions among 6,170 proteins. Available here.

Kundrotas PJ, Lensink MF, Alexov E. Homology-based modeling of 3D structures of protein-protein complexes using alignments of modified sequence profiles. [Int J Biol Macromol. 2008 May 21 (e-pub ahead of print)]: Discusses a homology-based modeling method for 3D structures of protein complexes that is based on alignments of modified sequence profiles. The method, called HOMology-BAsed COmplex Prediction, or HOMBACOP, has two “distinctive features,” according to the authors: extra weight on aligning interfacial residues in the dynamic programming algorithm, and increased gap penalties for interfacial segments.
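
The two features described above can be illustrated with a minimal sketch of a global dynamic programming alignment. This is not the HOMBACOP implementation: it aligns plain sequences rather than modified sequence profiles, and the function name, the scoring values, and the two interface parameters (interface_weight, interface_gap_factor) are assumptions chosen only for illustration.

```python
def align_score(a, b, interface_a, match=1.0, mismatch=-1.0, gap=-2.0,
                interface_weight=2.0, interface_gap_factor=2.0):
    """Global alignment score of sequences a and b (Needleman-Wunsch style).

    interface_a is a set of 0-based positions in `a` treated as interfacial.
    All parameter values are hypothetical, not taken from the paper.
    """
    n, m = len(a), len(b)

    def weight(i):
        # Extra weight on substitution scores at interfacial positions.
        return interface_weight if i in interface_a else 1.0

    def gap_cost(i):
        # Increased gap penalty at interfacial positions.
        return gap * (interface_gap_factor if i in interface_a else 1.0)

    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = score[i - 1][0] + gap_cost(i - 1)
    for j in range(1, m + 1):
        score[0][j] = score[0][j - 1] + gap

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = (match if a[i - 1] == b[j - 1] else mismatch) * weight(i - 1)
            score[i][j] = max(
                score[i - 1][j - 1] + sub,          # align a[i-1] with b[j-1]
                score[i - 1][j] + gap_cost(i - 1),  # a[i-1] opposite a gap
                score[i][j - 1] + gap_cost(i - 1),  # gap inserted after a[i-1]
            )
    return score[n][m]


plain = align_score("ACD", "AD", interface_a=set())
weighted = align_score("ACD", "AD", interface_a={1})
```

In this toy case, marking the middle residue as interfacial makes the alignment that skips it more costly, so the weighted score is lower than the plain one.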

Morgulis A, Coulouris G, Raytselis Y, Madden TL, Agarwala R, Schäffer AA. Database indexing for production MegaBLAST searches. [Bioinformatics. 2008 Jun 23. (e-pub ahead of print)]: Describes indexed MegaBlast, a new version of MegaBlast that first finds short seeds for matches by searching a database index. The authors also describe a program called makembindex that preprocesses the database into a data structure for rapid seed searching. “We show that indexed MegaBlast is faster than miBlast, another implementation of BLAST nucleotide searching with a preprocessed database, for most of the 200 queries we tested,” the authors note in the abstract. To deploy indexed MegaBlast as part of the National Center for Biotechnology Information's Web Blast service, “the storage of databases and the queueing mechanism were modified, so that some machines are now dedicated to serving queries for a specific database,” which has made the response time for such web queries “faster than it was when each computer handled queries for multiple databases,” the authors write. Available here.  
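
The general idea of finding seeds through a preprocessed database index can be sketched as follows. This is a simplified illustration and not the data structure built by makembindex: the function names, the toy sequences, and the seed length k=4 are all assumptions.

```python
from collections import defaultdict


def build_kmer_index(database, k):
    """Map each k-mer to the (sequence id, offset) positions where it occurs."""
    index = defaultdict(list)
    for seq_id, seq in database.items():
        for offset in range(len(seq) - k + 1):
            index[seq[offset:offset + k]].append((seq_id, offset))
    return index


def find_seeds(query, index, k):
    """Return (query offset, sequence id, database offset) for exact k-mer hits."""
    seeds = []
    for q_off in range(len(query) - k + 1):
        # .get avoids adding empty entries to the defaultdict on a miss.
        for seq_id, db_off in index.get(query[q_off:q_off + k], []):
            seeds.append((q_off, seq_id, db_off))
    return seeds


# Hypothetical toy database; the index is built once and reused across queries.
database = {"seq1": "ACGTACGTGA", "seq2": "TTACGTGACC"}
index = build_kmer_index(database, k=4)
seeds = find_seeds("CGTGA", index, k=4)
```

Because the index is computed ahead of time, each query needs only dictionary lookups to locate candidate seeds, which would then be extended into full alignments by a later step not shown here.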

Sanderson MJ, Boss D, Chen D, Cranston KA, Wehe A. The PhyLoTA browser: Processing GenBank for molecular phylogenetics research. [Syst Biol. 2008 Jun;57(3):335-46]: Presents an informatics processing pipeline and online database called the PhyLoTA Browser, which provides “a view of GenBank tailored for molecular phylogenetics,” according to the paper’s abstract. The initial version of the browser was computed from 2.6 million sequences that represent the “taxonomically enriched” subset of GenBank sequences for eukaryotes. In addition to summarizing sequence diversity and species diversity, the browser includes 87,000 “potentially phylogenetically informative clusters” of homologous sequences, which can be viewed or downloaded, along with provisional alignments and phylogenetic trees. Available here.

Wright J, Wagner A. The Systems Biology Research Tool: evolvable open-source software. [BMC Systems Biology 2008, 2:55]: Describes the Systems Biology Research Tool, SBRT, which performs 35 methods for analyzing stoichiometric networks and 16 other analytical methods from fields such as graph theory, geometry, algebra, and combinatorics. “New computational techniques can be added to the SBRT via process plug-ins, providing a high degree of evolvability and a unifying framework for software development in systems biology,” according to the paper’s abstract. Available here.

Zhang HL, Lin HH, Tao L, Ma XH, Dai JL, Jia J, Cao ZW. Prediction of antibiotic resistance proteins from sequence-derived properties irrespective of sequence similarity. [Int J Antimicrob Agents. 2008 Jun 24. (e-pub ahead of print)]: Describes a software tool to help predict antibiotic resistance proteins, or ARPs. The authors developed a support vector machine-based ARP prediction system using 1,308 ARPs and 15,587 non-ARPs. They evaluated the performance of the system using 313 ARPs and 7,156 non-ARPs and found that the computed prediction accuracy was 88.5 percent for ARPs and 99.2 percent for non-ARPs.

Zhao Z, Fu B, Alanis FJ, Summa CM. Feedback algorithm and web-server for protein structure alignment. [J Comput Biol. 2008 Jun;15(5):505-24]: Describes a feedback algorithm for protein structure alignment that uses a series of phases to improve the global alignment between two protein backbones. The method uses a “self-improving learning strategy” that sends the output of the global alignment to the next phase as an input, the paper’s abstract states. Available here.

