Bioinformatics Tool-Related Papers of Note, May 2008

Becker SA, Palsson BO. Context-specific metabolic networks are consistent with experiments. [PLoS Comput Biol. 2008 May 16;4(5):e1000082]: Describes a method called Gene Inactivity Moderated by Metabolism and Expression, or GIMME, which uses quantitative gene expression data and one or more “presupposed metabolic objectives” to produce context-specific reconstructions of cellular metabolism that are consistent with the available data. Currently, “these reconstructions are ‘genome-scale’ and strive to include all reactions implied by the genome annotation, as well as those with direct experimental evidence,” the authors note in the abstract. “Clearly, many of the reactions in a genome-scale reconstruction will not be active under particular conditions or in a particular cell type.”

Bogdan I, Rivers J, Beynon RJ, Coca D. High-performance hardware implementation of a parallel database search engine for real-time peptide mass fingerprinting. [Bioinformatics. 2008 May 3 (e-pub ahead of print)]: Builds on a previous project to develop a raw mass spectra processor that could be implemented in FPGA hardware to achieve an almost 170-fold speed gain relative to a conventional software implementation running on a dual-processor server. In the current paper, the authors discuss a “complementary hardware realization of a parallel database search engine” that, when running on a Xilinx Virtex 2 FPGA at 100 MHz, delivers an 1,800-fold speed-up compared with an equivalent C software routine running on a 3.06 GHz Xeon workstation. “The inherent scalability of the design means that processing speed can be multiplied by deploying the design on multiple FPGAs,” according to the paper’s abstract.

Chi B, deLeeuw RJ, Coe BP, Ng RT, MacAulay C, Lam WL. MD-SeeGH: a platform for integrative analysis of multi-dimensional genomic data. [BMC Bioinformatics. 2008 May 20;9:243]: Presents an analysis platform called MD-SeeGH that allows users to analyze datasets from multiple genomic experiments. “Multiple sample analysis in MD-SeeGH allows users to compare profiles from many experiments alongside tracks containing detailed localized gene information, microRNA, CpG islands, and copy number variations,” the authors note in the abstract.

Dinov ID, Rubin D, Lorensen W, Dugan J, Ma J, Murphy S, Kirschner B, Bug W, Sherman M, Floratos A, Kennedy D, Jagadish HV, Schmidt J, Athey B, Califano A, Musen M, Altman R, Kikinis R, Kohane I, Delp S, Parker DS, Toga AW. iTools: a framework for classification, categorization and integration of computational biology resources. [PLoS ONE. 2008 May 28;3(5):e2265]: Introduces iTools, a framework for managing diverse computational biology resources. The system stores information about three types of resources: data, software tools, and web-services. “A large number of resources are already iTools-accessible to the community and this infrastructure is rapidly growing,” according to the paper’s authors. Available here.  

Dolan PC, Denver DR. TileQC: a system for tile-based quality control of Solexa data. [BMC Bioinformatics 2008, 9:250]: Describes TileQC, a tile-based quality control system for Illumina Genome Analyzer data that is written in R. TileQC “provides a means of recognizing bias and error in Solexa output by graphically representing data generated by flow cell tiles,” according to the paper’s abstract. This data is then made available in the R environment for further analysis and automation of error detection.

Lee SY, Skolnick J. Benchmarking of TASSER_2.0: An improved protein structure prediction algorithm with more accurate predicted contact restraints. [Biophys J. 2008 May 16 (e-pub ahead of print)]: Describes a new version of the TASSER structure prediction package called TASSER_2.0, which uses a new approach called the composite-sequence method that results in “more accurate side chain contact restraint predictions,” according to the paper’s abstract. The method also relies on an improved threading algorithm, called PROSPECTOR 3.5.

Li C. Automating dChip: toward reproducible sharing of microarray data analysis. [BMC Bioinformatics 2008, 9:231]: Describes a new automation module for the dChip microarray analysis software package. With the module, users can create dChip automation files that include menu steps, parameters, and data viewpoints. The module includes a data-packaging function that allows users to transfer the dChip software, microarray data, and analysis procedures “so that the second user can reproduce the entire analysis session of the first user,” according to the paper’s abstract.

Nicosia G, Stracquadanio G. Generalized Pattern Search Algorithm for Peptide Structure Prediction. [Biophys J. 2008 May 16 (e-pub ahead of print)]: Presents the Generalized Pattern Search, or GPS, algorithm for predicting the tertiary structure of peptides. The algorithm is based on a class of algorithms called “Search-and-Poll” algorithms.

Polpitiya AD, Qian WJ, Jaitly N, Petyuk VA, Adkins JN, Camp DG 2nd, Anderson GA, Smith RD. DAnTE: a statistical tool for quantitative analysis of -omics data. [Bioinformatics. 2008 May 3 (e-pub ahead of print)]: Introduces the Data Analysis Tool Extension, or DAnTE, a statistical tool for analyzing quantitative bottom-up, shotgun proteomics data. DAnTE includes normalization methods, missing value imputation algorithms, peptide-to-protein rollup methods, plotting functions, and a hypothesis testing scheme. Available here.

Porter CJ, Palidwor GA, Sandie R, Krzyzanowski PM, Muro EM, Perez-Iratxeta C, Andrade-Navarro MA. StemBase: a resource for the analysis of stem cell gene expression data. [Methods Mol Biol. 2007;407:137-48]: Describes StemBase, a database of gene expression data obtained from stem cells and derivatives.

Ritchie W, Théodule FX, Gautheret D. Mireval: a web tool for simple microRNA prediction in genome sequences. [Bioinformatics. 2008 Jun 1;24(11):1394-6]: Presents an online tool called mirEval that can search sequences of up to 10,000 nucleotides for novel microRNAs in multiple organisms. Available here.

Sacan A, Ferhatosmanoglu H, Coskun H. CellTrack: An Open-Source Software for Cell Tracking and Motility Analysis. [Bioinformatics. 2008 May 29 (e-pub ahead of print)]: Introduces CellTrack, a cross-platform software package for cell tracking and motility analysis. The software includes a “novel edge-based method for sensitive tracking of the cell boundaries,” according to the paper’s abstract. Available here.  

White WT, Hendy MD. Compressing DNA sequence databases with COIL. [BMC Bioinformatics. 2008 May 20;9(1):242]: Discusses a sequence database compression tool called COIL. “While much research has been done on compressing individual DNA sequences, surprisingly little has focused on the compression of entire databases of such sequences,” the authors note in the abstract. COIL is based on the idea of edit-tree coding. In the paper, the authors demonstrate a 5 percent improvement in compression ratio over “state-of-the-art general-purpose compression tools” for a GenBank database file containing EST data.

Wilm A, Higgins DG, Notredame C. R-Coffee: a method for multiple alignment of non-coding RNA. [Nucleic Acids Research 2008 36(9):e52]: Introduces R-Coffee, a multiple RNA alignment package derived from T-Coffee that is designed to align RNA sequences while incorporating secondary structure information within the alignment. “It works particularly well as an alignment improver and can be combined with any existing sequence alignment method,” the authors note in the abstract. Available here.

You FM, Huo N, Gu YQ, Luo MC, Ma Y, Hane D, Lazo GR, Dvorak J, Anderson OD. BatchPrimer3: a high throughput web application for PCR and sequencing primer design. [BMC Bioinformatics 2008, 9:253]: Introduces BatchPrimer3, a web primer design program that is based on the Primer3 program and targeted toward large-scale genomic research projects. BatchPrimer3 includes a new score-based primer picking module for picking position-restricted primers. The software implements several types of primer designs, including generic primers, SSR primers together with SSR detection, SNP genotyping primers, and DNA sequencing primers. Available here.

