Bioinformatics Tool-Related Papers of Note, September 2009
Chen VB, Davis IW, Richardson DC. KiNG (Kinemage, Next Generation): A versatile interactive molecular and scientific visualization program. [Protein Sci. 2009 Sep 18. (e-pub ahead of print)]: Describes Kinemage, Next Generation, or KiNG, a Java program for visualizing scientific data, with a focus on macromolecular visualization. According to the paper's abstract, KiNG uses the kinemage graphics format, which is tuned for macromolecular structures, "but is also ideal for many other kinds of spatially embedded information." The paper outlines three applications of KiNG: structural biology, bioinformatics of high-dimensional data, and classroom education. Available here.
Hoffmann S, Otto C, Kurtz S, Sharma CM, Khaitovich P, Vogel J, Stadler PF, Hackermüller J. Fast mapping of short sequences with mismatches, insertions and deletions using index structures. [PLoS Comput Biol. 2009 Sep;5(9):e1000502]: Discusses a matching model for short reads that overcomes limitations of most current methods, which "neglect the necessity to allow not only mismatches, but also insertions and deletions," according to the authors. The method, which can handle mismatches as well as indels, addresses different error models. "In a comparison with current methods for short read mapping, the presented approach shows significantly increased performance not only for 454 reads, but also for Illumina reads," the paper's abstract states. Available here.
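To illustrate what allowing mismatches, insertions, and deletions means when placing a short read, here is a minimal sketch of a textbook semi-global edit-distance search. It is not the index-based method the paper describes. The function name approximate_match and the example sequences are hypothetical, and a plain dynamic-programming scan like this would be far too slow for real sequencing runs.

```python
def approximate_match(read, reference):
    """Return (edits, end) for the best placement of `read` inside `reference`.

    `edits` counts mismatches, insertions, and deletions; `end` is the
    reference position (exclusive) where the best alignment ends.
    Illustrative only: a simple dynamic-programming scan, not an index.
    """
    # Row 0 is all zeros, so an alignment may start anywhere in the reference.
    prev = [0] * (len(reference) + 1)
    for i, read_base in enumerate(read, start=1):
        # Column 0: the first i read bases aligned against nothing.
        curr = [i]
        for j, ref_base in enumerate(reference, start=1):
            cost = 0 if read_base == ref_base else 1
            curr.append(min(
                prev[j - 1] + cost,  # match or mismatch
                prev[j] + 1,         # read base with no reference partner
                curr[j - 1] + 1,     # reference base skipped by the read
            ))
        prev = curr
    edits = min(prev)
    return edits, prev.index(edits)


if __name__ == "__main__":
    reference = "ACGTACGTTAGC"  # made-up example sequence
    for read in ("ACGTT", "ACCTT", "ACGTAGC", "ACGTTTAGC"):
        print(read, approximate_match(read, reference))
```

A mapper that counts only mismatches would score the last two reads poorly, even though each differs from the reference by a single gap.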
Kim JH, Kim WC, Waterman MS, Park S, Li LM. HAPLOWSER: a whole-genome haplotype browser for personal genome and metagenome. [Bioinformatics. 2009 Sep 15;25(18):2430-1]: Describes Haplowser, a whole-genome haplotype browser that the authors say is the first comparative haplotype browser to depict a global picture of whole-genome alignments among haplotypes of different organisms. The browser "enables the comparison of haplotypes from metagenomes, and associates conserved regions or the bases at the conserved regions with functional annotations and custom tracks," according to the paper's abstract. Available here.
Lam TY, Meyer IM. HMMCONVERTER 1.0: a toolbox for hidden Markov models. [Nucleic Acids Res. 2009 Sep 8. (e-pub ahead of print)]: Presents HMMConverter, a software package for setting up probabilistic hidden Markov models, pair-HMMs, and generalized HMMs. Users of HMMConverter can "set up complex applications with a minimum of effort and also perform parameter training and data analyses for large data sets," according to the paper's abstract. Available here.
Lassmann T, Hayashizaki Y, Daub CO. TagDust - A program to eliminate artifacts from next generation sequencing data. [Bioinformatics. 2009 Sep 7. (e-pub ahead of print)]: Introduces TagDust, a program for identifying artifactual sequences in large sequencing runs. TagDust takes a user-defined cutoff for the false discovery rate and identifies all reads that are "explainable by combinations and partial matches to known sequences used during library preparation," the paper's abstract states. Available here.
Manning JR, Hedley A, Mullins JJ, Dunbar DR. Automated seeding of specialised wiki knowledgebases with BioKb. [BMC Bioinformatics. 2009 Sep 16;10:291]: Describes a software system called BioKb, which is implemented as a plug-in for the TWiki engine and designed to help create "a field-specific wiki containing collaborative and automatically generated content," according to the authors. Available here.
Manske HM, Kwiatkowski DP. LookSeq: A browser-based viewer for deep sequencing data. [Genome Res. 2009 Sep 29. (e-pub ahead of print)]: Presents LookSeq, an Ajax-based web viewer for browsing large data sets of aligned sequence reads. The program "assists the user to assimilate information at different levels of resolution, from an overview of a genomic region to fine details such as heterogeneity within the sample," according to the paper's abstract. Available here.
Meinicke P. UFO: a web server for ultra-fast functional profiling of whole genome protein sequences. [BMC Genomics. 2009 Sep 2;10(1):409]: Describes UFO, a web server for "ultra-fast functional profiling" that allows researchers to process large protein sequence collections instantaneously, according to the paper's abstract. The server provides the frequencies of Pfam and GO categories, as well as sequence-specific assignments to Pfam domain families. Available here.
Reisinger F, Martens L. Database on Demand - An online tool for the custom generation of FASTA-formatted sequence databases. [Proteomics. 2009 Sep 1;9(18):4421-4424]: Introduces Database on Demand, "an easy-to-use web tool that can quickly produce a wide variety of customized search databases," according to the authors. The paper's abstract notes that while most protein search engines can derive peptides in silico from protein sequences, "this is usually limited to standard digestion algorithms." On the other hand, "customized search databases that provide detailed control over the search space can vastly outperform such standard strategies, especially in gel-free proteomics experiments," it adds.
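As a rough illustration of the "standard digestion algorithms" mentioned above, the sketch below applies the commonly used trypsin rule: cleave after lysine (K) or arginine (R) unless the next residue is proline (P). It is not code from Database on Demand. The function name tryptic_digest, its missed_cleavages parameter, and the example sequence are hypothetical.

```python
def tryptic_digest(protein, missed_cleavages=0):
    """Return in silico tryptic peptides of `protein`, in sequence order.

    Applies the common rule: cleave after K or R, but not before P.
    `missed_cleavages` also emits peptides spanning up to that many
    uncleaved sites. Illustrative only.
    """
    if not protein:
        return []
    # Collect cut positions, including both ends of the sequence.
    sites = [0]
    for i in range(len(protein) - 1):
        if protein[i] in "KR" and protein[i + 1] != "P":
            sites.append(i + 1)
    sites.append(len(protein))

    peptides = []
    for start in range(len(sites) - 1):
        for missed in range(missed_cleavages + 1):
            end = start + 1 + missed
            if end < len(sites):
                peptides.append(protein[sites[start]:sites[end]])
    return peptides


if __name__ == "__main__":
    example = "MKRPAKGLR"  # made-up sequence
    print(tryptic_digest(example))
    print(tryptic_digest(example, missed_cleavages=1))
```

A customized search database would change or replace rules like this one, for example by using other enzymes or restricting which peptides are kept.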
Schneeberger K, Hagmann J, Ossowski S, Warthmann N, Gesing S, Kohlbacher O, Weigel D. Simultaneous alignment of short reads against multiple genomes. [Genome Biol. 2009 Sep 17;10(9):R98]: Describes GenomeMapper, which simultaneously maps short reads against multiple genomes by integrating related genomes, such as those of individuals of the same species, into a single graph structure. According to the authors, this method is the "first approach for handling multiple references." Available here.
Wagener J, Spjuth O, Willighagen EL, Wikberg JE. XMPP for cloud computing in bioinformatics supporting discovery and invocation of asynchronous web services. [BMC Bioinformatics. 2009 Sep 4;10:279]: Describes a new approach for life science cloud computing based on the open standard Extensible Messaging and Presence Protocol, or XMPP. The approach includes an extension "to comprise discovery, asynchronous invocation, and definition of data types in the service," the paper's abstract states. According to the authors, XMPP with extensions has several advantages over traditional HTTP-based web services.
Zamar D, Tripp B, Ellis G, Daley D. Path: a tool to facilitate pathway-based genetic association analysis. [Bioinformatics. 2009 Sep 15;25(18):2444-6]: To support research into complex diseases that may be determined by interactions among hundreds of thousands of SNPs, the authors developed Path, which integrates information from nine online bioinformatics resources: the National Center for Biotechnology Information, Online Mendelian Inheritance in Man, Kyoto Encyclopedia of Genes and Genomes, UCSC Genome Browser, Seattle SNPs, PharmGKB, Genetic Association Database, the Single Nucleotide Polymorphism database, and the Innate Immune Database. Available here.