Bioinformatics Tool-Related Papers of Note, December 15, 2006

Burba AE, Lehnert U, Yu EZ, Gerstein M. Helix Interaction Tool (HIT): a web-based tool for analysis of helix-helix interactions in proteins. [Bioinformatics 2006 22(22):2735-2738]: Presents a package of tools for analyzing helix–helix packing in proteins. The tools include quantitative measures of the helix interaction surface area and helix crossing angle, as well as several methods for visualizing the helical interaction. Availability:

Cerami EG, Bader GD, Gross BE, Sander C. cPath: open source software for collecting, storing, and querying biological pathways. [BMC Bioinformatics 2006, 7:497]: Describes cPath, an open source database and web application for collecting, storing, and querying biological pathway data. cPath allows users to aggregate custom pathway data sets available in standard exchange formats from multiple databases, present pathway data via a customizable web interface, and export pathway data to third-party software for visualization and analysis. Availability:

Dixon SL, Smondyrev AM, Knoll EH, Rao SN, Shaw DE, Friesner RA. PHASE: a new engine for pharmacophore perception, 3D QSAR model development, and 3D database screening: 1. Methodology and preliminary results. [J Comput Aided Mol Des. 2006 Nov 24 (e-pub ahead of print)]: Introduces PHASE, a system for pharmacophore identification and assessment, 3D QSAR model development, and 3D database creation and searching.

Fischer M, Thai QK, Grieb M, Pleiss J. DWARF — a data warehouse system for analyzing protein families. [BMC Bioinformatics. 2006 Nov 9;7(1):495]: Describes DWARF, a data warehouse that integrates data on sequence, structure, and functional annotation for protein fold families.

Grant BJ, Rodrigues AP, ElSawy KM, McCammon JA, Caves LS. Bio3d: an R package for the comparative analysis of protein structures. [Bioinformatics 2006 22(21):2695-2696]: Discusses an automated procedure for analyzing homologous protein structures based on characterizing internal conformational differences and inter-conformer relationships. The method is implemented in bio3d, an R package for analyzing structure and sequence data. Availability:

Huang SY, Zou X. Ensemble docking of multiple protein structures: Considering protein structural variations in molecular docking. [Proteins. 2006 Nov 9]: Introduces a docking algorithm that uses multiple protein structures to account for protein structural variations. According to the authors, the algorithm can simultaneously dock a ligand into an ensemble of protein structures and automatically select an optimal protein structure that best fits the ligand.

Xia K, Dong D, Han JD. IntNetDB v1.0: An integrated protein-protein interaction network database generated by a probabilistic model. [BMC Bioinformatics 2006, 7:508]: Introduces a database of computationally predicted protein-protein interactions (PPIs). By applying a probabilistic model, the authors integrated 27 heterogeneous genomic, proteomic, and functional annotation datasets to predict PPI networks in humans. The Integrated Network Database includes 180,010 predicted protein-protein interactions among 9,901 human proteins. Availability:

Kawas E, Senger M, Wilkinson MD. BioMoby extensions to the Taverna workflow management and enactment software. [BMC Bioinformatics 2006, 7:523]: Describes integration between the BioMoby interoperability system and the Taverna bioinformatics workflow platform. The authors have developed a Taverna plug-in that provides access to many of BioMoby's features through the Taverna interface.

Kim YR, Kwon OI, Paeng SH, Park CJ. Phylogenetic tree constructing algorithms fit for grid computing with SVD. [ArXiv pre-print archive:]: Describes algorithms for constructing a phylogenetic tree by computing the singular value decomposition of flattenings with a small fixed number of rows.

Korkin D, Davis FP, Alber F, Luong T, Shen MY, et al. Modeling of protein interactions by analogy: Application to PSD-95. [PLoS Comput Biol 2(11): e153]: Discusses comparative patch analysis — “a hybrid of comparative modeling based on a template complex and protein docking” — for modeling the structures of multidomain proteins and protein complexes.

Ma X, Lee H, Wang L, Sun F. CGI: a new approach for prioritizing genes by Combining Gene expression and protein-protein Interaction data. [Bioinformatics. 2006 Nov 10 (e-pub ahead of print)]: Describes a method for prioritizing genes associated with a phenotype by combining gene expression and protein interaction data. According to the authors, the method outperforms prioritization methods that use either gene-expression data or protein-interaction data alone. Availability: upon request.

Noguchi H, Park J, Takagi T. MetaGene: prokaryotic gene finding from environmental genome shotgun sequences. [Nucleic Acids Research 2006 34(19):5623-5630]: Describes a prokaryotic gene-finding program called MetaGene, which uses di-codon frequencies estimated from the GC content of a given sequence, along with various other measures. According to the authors, MetaGene can predict a range of prokaryotic genes “based on the anonymous genomic sequences of a few hundred bases,” with a sensitivity of 95 percent and a specificity of 90 percent for artificial shotgun sequences.
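For readers unfamiliar with the two quantities named in the MetaGene summary, the following is a minimal sketch of how GC content and di-codon frequencies can be computed for a single sequence. It is not MetaGene's model: the function names `gc_content` and `dicodon_frequencies`, the choice of overlapping codon pairs, and the fixed reading frame are all assumptions made for illustration.

```python
from collections import Counter


def gc_content(seq):
    """Return the fraction of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def dicodon_frequencies(seq, frame=0):
    """Return relative frequencies of adjacent codon pairs in one frame.

    Assumption: a "di-codon" is taken to be two consecutive codons
    (six bases), and pairs overlap by one codon as the window slides.
    """
    seq = seq.upper()
    # Split the sequence into complete codons starting at the given frame.
    codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    # Join each codon with its successor to form a di-codon.
    pairs = [codons[i] + codons[i + 1] for i in range(len(codons) - 1)]
    counts = Counter(pairs)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: count / total for pair, count in counts.items()}


if __name__ == "__main__":
    example = "ATGGCCATGGCC"  # hypothetical toy sequence
    print(gc_content(example))
    print(dicodon_frequencies(example))
```

A gene finder would compare such frequencies against expectations for coding and non-coding sequence; that scoring step is outside the scope of this sketch.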

Ott MA, Vriend G. Correcting ligands, metabolites, and pathways. [BMC Bioinformatics 2006, 7:517]: Describes a database of metabolites called BioMeta “that augments the existing pathway databases by explicitly assessing the validity, correctness, and completeness of chemical structure and reaction information,” according to the paper’s abstract. Availability:

Pattyn F, Hoebeeck J, Robbrecht P, Michels E, De Paepe A, Bottu G, Coornaert D, Herzog R, Speleman F, Vandesompele J. methBLAST and methPrimerDB: web-tools for PCR based methylation analysis. [BMC Bioinformatics 2006, 7:496]: Presents methBLAST, a sequence similarity search program that queries in silico bisulphite modified genome sequences to evaluate oligonucleotide sequence similarities. The authors have also developed the methPrimerDB database for storing and retrieving validated PCR based methylation assays. Availability: and

Phuong TM, Do CB, Edgar RC, Batzoglou S. Multiple alignment of protein sequences with repeats and rearrangements. [Nucleic Acids Res. 2006 34(20):5932-5942]: Describes ProDA, a system for detecting and aligning homologous regions in collections of proteins. “Given an input set of unaligned sequences, ProDA identifies all homologous regions appearing in one or more sequences, and returns a collection of local multiple alignments for these regions,” according to the paper’s abstract.

Rayner TF, Rocca-Serra P, Spellman PT, et al. A simple spreadsheet-based, MIAME-supportive format for microarray data: MAGE-TAB. [BMC Bioinformatics 2006, 7:489]: Describes a tab-delimited, spreadsheet-based format called MAGE-TAB that can be used for annotating and communicating microarray data in a MIAME-compliant fashion. “MAGE-TAB will enable laboratories without bioinformatics experience or support to manage, exchange and submit well-annotated microarray data in a standard format using a spreadsheet,” the authors note in the paper’s abstract.

Reed JL, Patel TR, Chen KH, Joyce AR, Applebee MK, Herring CD, Bui OT, Knight EM, Fong SS, Palsson BO. Systems approach to refining genome annotation. [Proc Natl Acad Sci USA. 2006 Nov 14;103(46):17480-4]: Discusses an optimization-based algorithm that predicts missing reactions that are required to reconcile computation and experiment in building genome-scale metabolic models.

Shen HB, Chou KC. Virus-PLoc: A fusion classifier for predicting the subcellular localization of viral proteins within host and virus-infected cells. [Biopolymers. 2006 Nov 21 (e-pub ahead of print)]: Describes Virus-PLoc, a tool for annotating the localization of viral proteins within host and virus-infected cells. Virus-PLoc fuses many basic classifiers, each engineered according to the K-nearest neighbor rule. Availability:
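The general idea of fusing several K-nearest neighbor classifiers can be sketched as a majority vote over their individual predictions. This is an illustration only, not Virus-PLoc's implementation: the functions `knn_predict` and `fused_predict`, the Euclidean distance, and the choice to vary only K across the basic classifiers are assumptions, and the toy feature vectors and labels are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train, query, k):
    """Predict a label for query by majority vote of its k nearest neighbors.

    train is a list of (feature_vector, label) pairs.
    """
    # Sort training samples by Euclidean distance to the query point.
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


def fused_predict(train, query, ks=(1, 3, 5)):
    """Fuse several KNN classifiers (one per value of k) by majority vote."""
    votes = Counter(knn_predict(train, query, k) for k in ks)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical two-class toy data; labels stand in for localization classes.
    train = [
        ((0.0, 0.0), "A"), ((0.0, 1.0), "A"), ((1.0, 0.0), "A"),
        ((5.0, 5.0), "B"), ((5.0, 6.0), "B"), ((6.0, 5.0), "B"),
    ]
    print(fused_predict(train, (0.2, 0.1)))
    print(fused_predict(train, (5.5, 5.2)))
```

Voting across several basic classifiers reduces the dependence of the final call on any single parameter setting, which is the usual motivation for this kind of fusion.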

Taylor RC, Shah A, Treatman C, Blevins M. SEBINI: Software Environment for BIological Network Inference. [Bioinformatics 2006 22(21):2706-2708]: Introduces SEBINI, a suite of algorithms for reconstructing the structure of biological regulatory and interaction networks. According to the authors, SEBINI can be used to compare and train network inference methods on artificial networks and simulated gene expression perturbation data, and it also enables the analysis of experimental expression data. Availability:

Wang C, Ding C, Meraz RF, Holbrook SR. PSoL: a positive sample only learning algorithm for finding non-coding RNA genes. [Bioinformatics 2006 22(21):2590-2596]: Describes a machine-learning approach called positive sample only learning (PSoL) to predict non-coding RNA genes in the Escherichia coli genome. According to the authors, PSoL is also applicable to other bioinformatics problems.
