Computational Analysis of Short Repetitive Motifs in DNA Sequences. Start date: May 1, 2007. Expires: April 30, 2009. Amount: $74,000. Institution: University of Texas, Arlington. Principal investigator: Nikola Stojanovic. NIH institute: NLM.
Proposal to develop new software for the identification, visualization, and analysis of repeated short degenerate motifs of approximately 5-25 bases in DNA sequences ranging from a few hundred bases to entire chromosomes. “We shall determine which of these motifs correspond to the experimentally confirmed transcription factor binding consensuses, study their phylogenetic conservation and investigate their possible association with repeat families,” according to the grant abstract. The software will be based on an adaptation of classic string processing algorithms “to address the inexact matches in a novel way, by combining the seed elements into statistically significant degenerate motifs.”
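To make the seed idea concrete, the following is a minimal Python sketch and not the investigators' algorithm, which the abstract does not specify. It counts exact k-mers that repeat in a sequence and merges pairs of repeated seeds that differ at a single position into an IUPAC-coded degenerate pattern. The function names, the one-mismatch merge rule, and the restriction to uppercase A/C/G/T input are all assumptions made for illustration, and no statistical significance test is applied.

```python
from collections import Counter
from itertools import combinations

# IUPAC codes for single bases and two-base ambiguities.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
}


def repeated_seeds(seq, k, min_count=2):
    """Return exact k-mers that occur at least min_count times (overlaps allowed)."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return {kmer: n for kmer, n in counts.items() if n >= min_count}


def merge_pair(a, b):
    """Merge two equal-length seeds that differ at exactly one position.

    Returns the degenerate pattern, or None if the seeds differ at zero
    or at more than one position. Assumes uppercase A/C/G/T only.
    """
    diffs = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    if len(diffs) != 1:
        return None
    return "".join(IUPAC[frozenset((x, y))] for x, y in zip(a, b))


def degenerate_candidates(seeds):
    """Return the set of degenerate patterns formed from all mergeable seed pairs."""
    patterns = set()
    for a, b in combinations(sorted(seeds), 2):
        merged = merge_pair(a, b)
        if merged is not None:
            patterns.add(merged)
    return patterns
```

For example, calling repeated_seeds on a sequence and passing the result to degenerate_candidates yields candidate patterns such as ACRT when both ACGT and ACAT repeat. A real tool would still need to score each candidate against a background model before calling it significant.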
Statistical and Computational Methods for Systematically Mining the SNP and Gene [Data]. Start date: May 1, 2007. Expires: April 30, 2009. Amount: $74,250. Institution: University of Maryland, Baltimore, Professional School. Principal investigator: Zhenqiu Liu. NIH institute: NCI.
Funds a pilot project that will develop algorithms for clustering, molecular network construction, and biomarker discovery with integrated SNP and gene-expression data for use in cancer epidemiology research, according to the abstract.
New Methods and Enhanced Software for Predicting Functional SNPs. Start date: May 1, 2007. Expires: April 30, 2011. Amount: $326,139. Institution: Brigham and Women’s Hospital. Principal investigator: Shamil Sunyaev. NIH institute: NIGMS.
Funds continued development of the PolyPhen software package, including improved methods to predict the functional effect of SNPs in the human genome and the transformation of PolyPhen into “scalable user-friendly cross-platform software,” according to the grant abstract. The investigators plan to improve the accuracy of PolyPhen with new computational strategies for predicting the effect of non-synonymous SNPs (nsSNPs) on protein structure and function. These strategies use a structurally optimized Bayesian classifier that predicts the functional effect of nsSNPs from multiple features derived from protein sequence and structure, the abstract states. The investigators also plan to extend the prediction method to non-coding SNPs.
A Data Coordinating Center for modENCODE. Start date: May 4, 2007. Expires: March 31, 2011. Amount: $1,275,000. Institution: Cold Spring Harbor Laboratory. Principal investigator: Lincoln Stein. NIH institute: NHGRI.
Funds the creation of a data-coordinating center for the modENCODE project. The center will “track the data, integrate it with other information sources, and make it available to the research community in a timely and open fashion,” according to the grant abstract. A team of three data managers at Cold Spring Harbor Lab and the University of California, Berkeley, will liaise with data-provider sites to determine data file formats, milestones, and quality control procedures for their datasets. They will also liaise with representatives from the National Center for Biotechnology Information to coordinate modENCODE activities with the primary data repositories at GenBank and the Gene Expression Omnibus, the abstract states. All software systems used by the center will be based on open source tools from the Generic Model Organism Database project, human ENCODE, and other sources.
BioGRID: An Open Resource for Biological Interactions and Network Analysis. Start date: May 15, 2007. Expires: Feb. 28, 2011. Amount: $571,590. Institution: Mount Sinai Hospital/Samuel Lunenfeld Research Institute. Principal investigator: Michael Tyers. NIH institute: NCRR.
Supports development of an open database called BioGRID that contains more than 150,000 molecular interactions from many species, including humans. According to the investigators, BioGRID sees an average of 80,000 queries and serves millions of interactions per month. BioGRID is linked to a visualization tool called Osprey that allows users to build, query, and visualize fully annotated biological interaction networks in graphical format, according to the grant abstract. The investigators plan to further develop BioGRID and Osprey by curating the yeast, worm, fly, plant, and human literature for interactions and post-translational modifications; expanding the BioGRID platform to allow rapid access to and manipulation of many data types; and releasing a new version of Osprey that will enable cross-species network predictions, data integration, and network interrogation.
Software Tools for Analysis and Visualization of Protein Structure. Start date: May 15, 2007. Expires: April 30, 2011. Amount: $265,059. Institution: University of Washington. Principal investigator: Ethan Merritt. NIH institute: NIGMS.
Funds the development, evaluation, maintenance, distribution, and support of software tools relating to protein structure, including the structural visualization tools in the Raster3D package, as well as the WebTools, Parvati, and TLSMD suites, according to the grant abstract. A “major goal” of the project is to extend and distribute TLSMD, which infers protein flexibility and other dynamic properties from single crystal structures. The central TLSMD server will be expanded to handle the demands of researchers who do not have crystallographic software installed locally, and the core TLSMD algorithms will be adapted for use outside applications in crystallographic refinement.
Statistical & Computational Tools for Reconstruction of Gene Regulatory Networks. Start date: July 1, 2007. Expires: June 30, 2008. Amount: $71,960. Institution: University of Rochester. Principal investigator: Peter Salzman. NIH institute: NLM.
Funds a project to study the regulatory relations among genes based on gene expression data using a theoretically based scoring method for network reconstruction in a frequentist and Bayesian framework. “Searching for high-scoring networks is computationally complex because of the super-exponential size of space of possible networks,” according to the grant abstract. The investigators plan to develop an implementation of the simulated annealing and Markov chain Monte Carlo algorithms “that works on a reduced space of orders and thus increases the efficiency.”
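The idea of searching over orders rather than over networks can be illustrated with a generic simulated annealing loop. The Python sketch below is an assumption-laden illustration, not the investigators' implementation: the function names, the swap move, the geometric cooling schedule, and the toy scoring function are all hypothetical. In an order-based approach the score of an order would summarize the networks consistent with it; here a stand-in score simply counts how many hypothetical "regulator before target" hints an order satisfies.

```python
import math
import random


def anneal_order(nodes, score, steps=2000, t_start=1.0, t_end=0.01, seed=0):
    """Simulated annealing over orderings of nodes (needs at least two nodes).

    score maps an ordering (list) to a number; higher is better.
    Returns the best ordering seen and its score. The input is not modified.
    """
    rng = random.Random(seed)
    order = list(nodes)
    current = score(order)
    best_order, best = order[:], current
    for step in range(steps):
        # Geometric cooling from t_start down to t_end (an arbitrary choice).
        t = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        # Propose a neighbor by swapping two positions in the order.
        i, j = rng.sample(range(len(order)), 2)
        order[i], order[j] = order[j], order[i]
        proposed = score(order)
        # Metropolis rule: always accept improvements, sometimes accept losses.
        if proposed >= current or rng.random() < math.exp((proposed - current) / t):
            current = proposed
            if current > best:
                best_order, best = order[:], current
        else:
            order[i], order[j] = order[j], order[i]  # reject: undo the swap
    return best_order, best


def toy_score(order, hints=(("g1", "g2"), ("g2", "g3"), ("g1", "g4"))):
    """Hypothetical score: number of (regulator, target) hints the order satisfies."""
    pos = {gene: i for i, gene in enumerate(order)}
    return sum(1 for a, b in hints if pos[a] < pos[b])
```

Calling anneal_order(["g4", "g3", "g2", "g1"], toy_score) returns an ordering of the four genes together with its score. The appeal of working on orders is that n genes have n! orderings, far fewer than the number of possible directed acyclic networks on the same genes.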