Creation and Application of a Diabetes Knowledge Base. Start date: Jan. 15, 2005. Expires: Jan. 14, 2008. Amount: $152,083. Principal investigator: Atul Butte. Institution: Children’s Hospital (Boston). NIH institute: NLM.
Project will develop an automated system for gathering data related to particular experimental characteristics and applying inferential operators to these data. The investigators propose to intersect data sets by phenotype, and to intersect lists of significant and related genes within these data sets, in an automated manner.
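The intersection step described in this abstract can be illustrated with a minimal Python sketch. It is not the investigators' system: the record layout (`phenotypes`, `significant_genes`), the function name `shared_genes`, and the placeholder gene and study names are all assumptions made for illustration.

```python
from functools import reduce


def shared_genes(datasets, phenotype):
    """Return the genes marked significant in every data set tagged with `phenotype`."""
    # Step 1: intersect by phenotype, keeping only data sets annotated with it.
    gene_lists = [
        set(d["significant_genes"]) for d in datasets if phenotype in d["phenotypes"]
    ]
    if not gene_lists:
        return set()
    # Step 2: intersect the significant-gene lists of the selected data sets.
    return reduce(set.intersection, gene_lists)


# Hypothetical toy records; gene and study names are placeholders, not real results.
datasets = [
    {"name": "study_1", "phenotypes": {"diabetes", "obesity"},
     "significant_genes": {"GENE_A", "GENE_B", "GENE_C"}},
    {"name": "study_2", "phenotypes": {"diabetes"},
     "significant_genes": {"GENE_A", "GENE_B", "GENE_D"}},
    {"name": "study_3", "phenotypes": {"obesity"},
     "significant_genes": {"GENE_C", "GENE_E"}},
]

print(sorted(shared_genes(datasets, "diabetes")))
```

With these toy records, only the first two studies carry the "diabetes" tag, so the result is the set of genes they have in common.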
Software for Structural Bioinformatics of Nucleic Acid. Start date: Feb. 1, 2005. Expires: Jan. 31, 2009. Amount: $275,108. Principal investigator: John SantaLucia. Institution: Wayne State University. NIH institute: NIGMS.
The goal of this proposal is to develop a new software platform for structural bioinformatics of nucleic acids. This will be achieved in four specific aims. The first will be writing software for homology modeling of nucleic acids that will “thread” a sequence whose 3D structure is unknown into a known template structure. The second aim will be to write software for de novo 3D structure prediction of nucleic acids. The third aim will be to extend the AMBER force field and optimization algorithms for nucleic acids. The modified force field will be used to rank predicted 3D structures. Finally, a systematic validation of the quality of the predicted structures will be performed.
Analysis and Annotation Pipeline for Functional Genomics. Start date: Feb. 1, 2005. Expires: Jan. 31, 2007. Amount: $196,948. Principal investigator: Michael Ochs. Institution: Fox Chase Cancer Center. NIH institute: NLM.
Proposal to develop an extensible, scalable, automated data-analysis pipeline for functional genomics data. The pipeline will automatically perform multiple analyses, make it easy to add new functions and data types, run in a distributed computing environment to supply adequate computational power, and integrate automated annotation so that analyses can be guided by biological knowledge. Functional genomics data sets will be encapsulated within data objects that include links to the NCI caBIO objects, and annotations will be retrievable through the Distributed Annotation System.
Multi-level Data and Research Integration for Autism. Start date: Feb. 1, 2005. Expires: Jan. 31, 2007. Amount: $214,274. Principal investigator: Allen Tien. Institution: Medical Decision Logic. NIH institute: NIMH.
SBIR grant supports development of a software system called MultiDataWorks that will address the needs of autism researchers. The proposed system will be flexible and easily configurable in order to manage the collection and analysis of heterogeneous data and meet evolving scientific needs. The system will use a model-driven architecture to integrate genetic and phenotypic data. It will have three major parts: an enhanced version of the existing ISAAC (Internet System for Assessing Autistic Children) system; a genetic data interchange and storage module (GeneDataWorks); and integrated querying of combined genetic and phenotypic data (DataTect).
Computational Design of Proteins with Enzymatic Activity. Start date: Jan. 1, 2005. Expires: June 30, 2007. Amount: $43,976. Principal investigator: Shalom Goldberg. Institution: University of Pennsylvania. NIH institute: NIGMS.
Proposal to use computational methods to redesign the active site of a known protein to give it novel catalytic functions. Genetic selections and high-throughput screens will be developed to allow a large number of designed sequences to be analyzed. The resulting designed proteins will be characterized structurally and mechanistically to evaluate the design process and to compare them to natural enzymes. This information will be used to improve the designs and the methodology.
Computational Toxicity Assessment Using Omic Data. Start date: Feb. 1, 2005. Expires: July 31, 2005. Amount: $125,602. Principal investigator: Tom Fahland. Institution: Genomatica. NIH institute: NIEHS.
Supports the application of multivariate linear and nonlinear statistical and mathematical techniques to find patterns in -omic data that are consistent with validated samples of known toxic exposure. Using supervised machine-learning techniques, the investigators will use training data with cross-validated measurements to quantify the accuracy of the proposed statistical techniques. By applying these techniques to a wide variety of public and in-house data samples consisting of gene expression data and metabonomic NMR concentration data, small correlations and patterns can be measured and used to characterize the type and amount of environmental and toxicant exposure of host organisms, according to the investigators.
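As a rough illustration of how cross-validation can put a number on a supervised classifier's accuracy, the Python sketch below runs k-fold cross-validation with a nearest-centroid classifier. The classifier choice, the function names, the feature values, and the "control"/"exposed" labels are all assumptions made for this example; the abstract does not say which techniques the investigators use.

```python
def fit_centroids(X, y):
    """Compute the mean feature vector (centroid) of each class."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, value in enumerate(row):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict(centroids, row):
    """Assign `row` to the class whose centroid is closest (squared Euclidean distance)."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def cross_validated_accuracy(X, y, k):
    """Hold out each of k interleaved folds in turn; return the fraction predicted correctly."""
    n = len(X)
    correct = 0
    for fold in range(k):
        test_idx = set(range(fold, n, k))
        train_X = [X[i] for i in range(n) if i not in test_idx]
        train_y = [y[i] for i in range(n) if i not in test_idx]
        centroids = fit_centroids(train_X, train_y)
        correct += sum(predict(centroids, X[i]) == y[i] for i in test_idx)
    return correct / n


# Hypothetical two-feature profiles (e.g. one expression value, one metabolite level).
X = [[0.1, 0.2], [0.0, 0.3], [0.2, 0.1], [0.1, 0.0],
     [2.0, 2.1], [2.2, 1.9], [1.9, 2.0], [2.1, 2.2]]
y = ["control"] * 4 + ["exposed"] * 4

print(cross_validated_accuracy(X, y, k=4))
```

Because every sample is scored only by a model that never saw it during fitting, the returned fraction estimates accuracy on unseen samples rather than on the training data.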