Chen K, McLellan MD, Ding L, Wendl MC, Kasai Y, Wilson RK, Mardis ER. PolyScan: An automatic indel and SNP detection approach to the analysis of human resequencing data. [Genome Res. 17:659-666, 2007]: Introduces PolyScan, an algorithm and software implementation designed to provide de novo heterozygous indel detection and improved SNP identification in the context of high-throughput medical resequencing.
Colella S, Yau C, Taylor JM, Mirza G, Butler H, Clouston P, Bassett AS, Seller A, Holmes CC, Ragoussis J. QuantiSNP: an Objective Bayes Hidden-Markov Model to detect and accurately map copy number variation using SNP genotyping data. [Nucleic Acids Research 2007 35(6):2013-2025]: Describes a computational framework called QuantiSNP for detecting regions of copy number variation from BeadArray SNP genotyping data using an objective Bayes hidden-Markov model. QuantiSNP can be adapted to other array platforms, according to the authors.
DeJongh M, Formsma K, Boillot P, Gould J, Rycenga M, Best A. Toward the automated generation of genome-scale metabolic networks in the SEED. [BMC Bioinformatics 2007, 8:139]: Describes a method for generating genome-scale metabolic networks that produces “substantially complete” reaction networks, according to the authors. The method partitions the reaction space of central and intermediary metabolism into discrete, interconnected components that can be assembled and verified in isolation from each other, and then integrated and verified at the level of their interconnectivity.
Girardot C, Sklyar O, Grosz S, Huber W, Furlong EE. CoCo: a web application to display, store and curate ChIP-on-chip data integrated with diverse types of gene expression data. [Bioinformatics 2007 23(6):771-773]: Describes CoCo, or ChIP-on-Chip online, an open-source web application that supports the annotation and curation of regulatory regions and associated target genes discovered in ChIP-on-chip experiments. Availability: http://furlonglab.embl.de/methods/tools/coco
Hamilton NA, Pantelic RS, Hanson K, Teasdale RD. Fast automated cell phenotype image classification. [BMC Bioinformatics 2007, 8:110]: Describes a computational method for classifying sub-cellular images based on a concept called threshold adjacency statistics, which thresholds the image and counts the number of above-threshold pixels that have a given number of adjacent above-threshold pixels. According to the authors, threshold adjacency statistics can “remove the need for cropping of individual cells from images, and are an order of magnitude faster to calculate than other commonly used statistics while providing comparable or better classification accuracy.”
Jensen ST, Chen G, Stoeckert CJ. Bayesian Variable Selection and Data Integration for Biological Regulatory Networks. [ArXiv preprint archive: http://arxiv.org/abs/math/0610034]: Proposes a Bayesian hierarchical model that integrates gene expression data, ChIP binding data, and promoter sequence data in a principled variable selection framework. The authors describe the use of the method to discover gene regulatory relationships in yeast. “Our inferred relationships show greater biological relevance on the external validation measures than previous data integration methods,” they write.
Myers CR, Gutenkunst RN, Sethna JP. Python Unleashed on Systems Biology. [ArXiv preprint archive: http://arxiv.org/abs/0704.3259]: Describes an open-source system for modeling biomolecular reaction networks called SloppyCell, which is written in Python. SloppyCell is able to perform dynamic code synthesis, symbolic manipulation, and parallel exploration of complex parameter spaces, according to the authors.
Tjong H, Zhou HX. DISPLAR: an accurate method for predicting DNA-binding sites on protein surfaces. [Nucleic Acids Research 2007 35(5):1465-1477]: Introduces a method for predicting DNA-binding sites on protein surfaces based on the structures of proteins. The authors used a set of 264 protein–DNA complexes from the Protein Data Bank to train and test a neural network predictor of DNA-binding sites. The method “significantly outperforms previous attempts of DNA-binding site predictions,” they write.
Will S, Reiche K, Hofacker IL, Stadler PF, Backofen R. Inferring Noncoding RNA Families and Classes by Means of Genome-Scale Structure-Based Clustering. [PLoS Comput Biol. 2007 Apr 13;3(4):e65]: Discusses a structure-based clustering approach that is capable of extracting putative RNA classes from genome-wide surveys for structured RNAs. The tool, called LocARNA for local alignment of RNA, uses a variant of the Sankoff algorithm that is fast enough to deal with several thousand candidate sequences. According to the authors, the method identifies several RNA families, including microRNA and snoRNA candidates, “and suggests several novel classes of ncRNAs for which to date no representative has been experimentally characterized.”
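
For readers curious about the threshold adjacency statistics mentioned in the Hamilton et al. entry above, the idea as summarized there (threshold the image, then count above-threshold pixels by how many above-threshold neighbors they have) can be illustrated with a minimal sketch. This is not the authors' code. The function name, the single fixed threshold, the use of the eight surrounding pixels as the neighborhood, and the choice to return raw counts are all assumptions made for illustration.

```python
from typing import List, Sequence


def threshold_adjacency_counts(
    image: Sequence[Sequence[float]], threshold: float
) -> List[int]:
    """Count above-threshold pixels by number of above-threshold neighbors.

    Hypothetical helper, not taken from the paper. Entry k of the result is
    the number of above-threshold pixels that have exactly k above-threshold
    pixels among their 8 surrounding pixels (an assumed neighborhood).
    """
    rows = len(image)
    cols = len(image[0]) if rows else 0

    # Step 1: threshold the image into a boolean mask.
    mask = [[pixel > threshold for pixel in row] for row in image]

    # Step 2: for each above-threshold pixel, count its above-threshold
    # neighbors and tally the pixel in the matching bin (0 through 8).
    counts = [0] * 9
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            neighbors = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue
                    rr, cc = r + dr, c + dc
                    # Pixels outside the image are treated as below threshold.
                    if 0 <= rr < rows and 0 <= cc < cols and mask[rr][cc]:
                        neighbors += 1
            counts[neighbors] += 1
    return counts


if __name__ == "__main__":
    toy_image = [
        [0, 9, 0],
        [9, 9, 0],
        [0, 0, 0],
    ]
    print(threshold_adjacency_counts(toy_image, threshold=5))
```

The nine counts could then serve as a feature vector for a classifier. How the published method chooses thresholds or normalizes the counts is not covered in the summary above and is left out of this sketch.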