In Print: Bioinformatics Tool-Related Papers of Note, February 2006


Adamcsek B, Palla G, Farkas IJ, Derenyi I, Vicsek T. CFinder: Locating cliques and overlapping modules in biological networks. [ArXiv pre-print archive:]: Presents CFinder, a program that locates and visualizes overlapping, densely interconnected groups of nodes in undirected graphs and lets the user navigate easily between the original graph and the web of these groups. The authors demonstrate that CFinder can be used to predict the function of a single protein and to discover novel modules in a network. Availability:

Baitaluk M, Qian X, Godbole S, Raval A, Ray A, Gupta A. PathSys: integrating molecular interaction graphs for systems biology. [BMC Bioinformatics 2006, 7:55]: Presents PathSys, a graph-based system for creating a combined database of interaction networks. The authors describe the use of PathSys to integrate more than 14 curated and publicly contributed data sources for Saccharomyces cerevisiae. Availability:

Birkland A, Yona G. BIOZON: a system for unification, management and analysis of heterogeneous biological data. [BMC Bioinformatics 2006, 7:70]: Describes a data-integration system called Biozon that unifies multiple biological databases for a variety of data types, including DNA sequences, proteins, interactions, and cellular pathways. Biozon currently holds more than 100 million biological documents and 6.5 billion relations between them. Availability:

Camacho CJ, Ma H, Champ PC. Scoring a diverse set of high-quality docked conformations: A metascore based on electrostatic and desolvation interactions. [Proteins. 2006 Feb 27 (e-pub ahead of print)]: Describes an atomic-level free-energy scoring function that estimates, in units of kcal/mol, both the electrostatic and the desolvation interactions of docked protein-protein conformations. The scoring function was used to rerank blind predictions submitted for six targets to the community-wide Critical Assessment of Prediction of Interactions (CAPRI) experiment. Availability:

Kiebel GR, Auberry KJ, Jaitly N, Clark DA, Monroe ME, Peterson ES, Tolic N, Anderson GA, Smith RD. PRISM: A data management system for high-throughput proteomics. [Proteomics. 2006 Feb 8 (e-pub ahead of print)]: Describes the PRISM (Proteomics Research Information Storage and Management) system, a data-management system that supports the mass and time tag approach developed at Pacific Northwest National Laboratory. According to the authors, the system has "scaled well as data volume has increased over several years, while allowing adaptability for incorporating new and improved data analysis tools for more effective proteomics research."

Li W, Wen L, Feng JA. RepairNET: A bioinformatics toolbox for functional exploration of DNA damage response. [J Cell Physiol. 2006 Feb 1 (e-pub ahead of print)]: Describes RepairNET, a protein-protein interaction network associated with the DNA damage response. RepairNET is assembled from the scientific literature via computational data mining. The network currently contains more than 1,200 proteins with more than 2,300 functional interactions, along with a number of web-based navigation tools. Availability:

Martins W, de Sousa D, Proite K, Guimaraes P, Moretzsohn M, Bertioli D. New softwares for automated microsatellite marker development. [Nucleic Acids Research 2006 34(4):e31]: Introduces a software package to aid the development of microsatellite markers. The package integrates TROLL (Tandem Repeat Occurrence Locator), a program for detecting microsatellites, with the Staden Package. Availability:

Radetzki U, Leser U, Schulze-Rauschenbach SC, Zimmermann J, Lussem J, Bode T, Cremers AB. Adapters, shims, and glue — service interoperability for in silico experiments. [Bioinformatics. 2006 Feb 15 (e-pub ahead of print)]: Describes the IRIS project, a service-oriented architecture for science applications. IRIS uses a semi-automatic procedure for identifying and placing customizable adapters into workflows built by service composition. Availability:

Schmidt H, Jirstrand M. Systems Biology Toolbox for MATLAB: a computational platform for research in systems biology. [Bioinformatics 2006 22(4):514-515]: Introduces a Systems Biology Toolbox for the MATLAB mathematical software package. The toolbox provides an interface for import and export of SBML models and includes a number of analysis methods, such as deterministic and stochastic simulation, parameter estimation, network identification, parameter sensitivity analysis, and bifurcation analysis. Availability:

Valentini G. Clusterv: a tool for assessing the reliability of clusters discovered in DNA microarray data. [Bioinformatics 2006 22(3):369-370]: Describes a new R package for assessing the reliability of clusters discovered in DNA microarray data. Availability:

Vallabhajosyula RR, Chickarmane V, Sauro HM. Conservation analysis of large biochemical networks. [Bioinformatics 2006 22(3):346-353]: Introduces a new algorithm that is computationally efficient and robust at extracting the correct conservation laws for very large biochemical networks. The paper demonstrates that the algorithm can perform the conservation analysis of large biochemical networks and that it evaluates the conserved cycles correctly in comparisons with similar software tools, such as Jarnac and Copasi. Availability:

Wu J, Hu Z, DeLisi C. Gene annotation and network inference by phylogenetic profiling. [BMC Bioinformatics 2006, 7:80]: Introduces a new tool for phylogenetic analysis called correlation enrichment (CE), which assigns genes to functional categories at various levels of resolution. According to the authors, CE "performs better than standard guilt by association … at all levels of coverage."

Yu X, Lin J, Masuda T, Esumi N, Zack DJ, Qian J. Genome-wide prediction and characterization of interactions between transcription factors in Saccharomyces cerevisiae. [Nucleic Acids Research 2006 34(3):917-927]: Introduces a new method for identifying interactions between transcription factors that relies on the relationship of their binding sites. In a test using Saccharomyces cerevisiae as a model system, the algorithm predicted 300 "significant" interactions involving 77 transcription factors.

Zhang W, Rekaya R, Bertrand K. A method for predicting disease subtypes in presence of misclassification among training samples using gene expression: application to human breast cancer. [Bioinformatics 2006 22(3):317-325]: Discusses a procedure for handling potential mislabeling among training samples in the prediction of disease subtypes using gene expression data. In a simulation study of the estrogen receptor (ER) status of breast cancer patients, several training samples were artificially mislabeled. The proposed method was able to correct the ER status of the mislabeled training samples and to predict the ER status of validation samples as accurately as when the 'true' training data were used. Availability:

Zhang Y, DeVries ME, Skolnick J. Structure Modeling of All Identified G Protein-Coupled Receptors in the Human Genome. [PLoS Comput Biol 2(2): e13]: Describes a threading assembly refinement method called TASSER used to generate structure predictions for all 907 putative GPCRs in the human genome. Unlike traditional homology modeling approaches, TASSER does not require solved homologous template structures, according to the authors.

