Textpresso, an Information Retrieval and Extraction System for Biological Literature. Start date: March 23, 2006. Expires: Jan. 31, 2009. Amount: $300,000. Principal investigator: Paul Sternberg. Institution: California Institute of Technology. NIH institute: NHGRI.
Funds development of an information retrieval and extraction system called Textpresso that processes the full text of biological papers. A prototype is already in use at the WormBase and Saccharomyces Genome Database (SGD) model organism databases, according to the grant abstract. The system splits text into sentences, labels words and phrases according to an ontology, and lets users query a database of the labeled sentences. The current ontology comprises 37 categories of terms.
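The three steps described in the abstract (sentence splitting, ontology labeling, and querying labeled sentences) can be illustrated with a minimal Python sketch. This is not Textpresso's implementation: the two-category ontology, the term lists, and the function names split_sentences, label_sentence, build_index, and query are all hypothetical, and the sentence splitter is deliberately naive.

```python
import re

# Hypothetical toy ontology with two categories; the system described
# above uses 37 categories, which are not listed in the source.
ONTOLOGY = {
    "gene": {"lin-12", "daf-16"},
    "regulation": {"regulates", "represses", "activates"},
}

def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def label_sentence(sentence):
    """Return the ontology categories whose terms occur in the sentence."""
    tokens = set(re.findall(r"[\w-]+", sentence.lower()))
    return {category for category, terms in ONTOLOGY.items() if tokens & terms}

def build_index(text):
    """Pair every sentence with its set of category labels."""
    return [(s, label_sentence(s)) for s in split_sentences(text)]

def query(index, *categories):
    """Return sentences labeled with all of the requested categories."""
    wanted = set(categories)
    return [s for s, labels in index if wanted <= labels]

if __name__ == "__main__":
    paper = (
        "daf-16 regulates stress resistance. "
        "Worms were grown at 20 degrees. "
        "lin-12 is a receptor."
    )
    index = build_index(paper)
    for sentence in query(index, "gene", "regulation"):
        print(sentence)
```

Querying by category rather than by keyword is the point of the labeling step: a search for "gene" plus "regulation" matches any sentence that mentions a term from each category, whichever specific gene or verb appears.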
MouseCyc: A Biochemical Pathway Database for the Mouse. Start date: April 1, 2006. Expires: Jan. 31, 2009. Amount: $252,000. Principal investigator: Carol Bult. Institution: Jackson Laboratory. NIH institute: NHGRI.
Proposal to develop MouseCyc, a biochemical pathway genome database for the mouse that will be integrated with curated information on phenotypes, gene expression data, functional annotations, and mammalian homology for mouse genes available from the Mouse Genome Informatics database. The grantees plan to use the Pathway Tools suite of software to implement the database.
Shape Optimizing Diffeomorphisms for Computational Biology. Start date: April 1, 2006. Expires: March 31, 2010. Amount: $365,513. Principal investigator: James Gee. Institution: University of Pennsylvania. NIH institute: NIBIB.
Funds development of a new system for "rigorous spatiotemporal medical image analysis in a large scale computing environment" for neuroimaging, according to the grant abstract. The goal of the project is to associate changes that occur in an individual over time with causes, including "innate population variability, injury, pathology, or the effects of genotype on phenotype." The method will use the recently proposed Diffeomorphometry system, which "quantifies and relates these variables to an optimal spatiotemporal coordinate system."
Predicting Drug Mechanism Via Chemogenomic Profiling. Start date: April 1, 2006. Expires: March 31, 2011. Amount: $324,500. Principal investigator: Timothy Stevens Gardner. Institution: Boston University Charles River Campus. NIH institute: NIGMS.
Supports development of statistical, computational, and experimental methods to extend and broaden previous work on predicting drug mechanism of action from gene expression data. The investigators will adapt the framework of simultaneous equation models to the problem and develop extensions of recent techniques for sparse inference.
Identifying Genetic Factors for Predisposition in Polygenic Diseases. Start date: April 10, 2006. Expires: April 9, 2008. Amount: $85,500. Principal investigator: Michael Ochs. Institution: Fox Chase Cancer Center. NIH institute: NLM.
Supports development of an algorithm for "determining the underlying factors responsible for predisposition to or protection from polygenic diseases," according to the grant abstract. The algorithm relies on Markov chain Monte Carlo exploration of the space of possible genetic variants coupled to a Bayesian statistical test based on phenotypic ranks.
A High Performance Server for Data-Intensive Computation. Start date: April 15, 2006. Expires: April 14, 2007. Amount: $392,218. Principal investigator: Paul Matsudaira. Institution: Massachusetts Institute of Technology. NIH institute: NCRR.
Supports the purchase of a symmetric multi-processing (SMP) server with storage area network data storage for the Computational and Systems Biology Initiative (CSBi) at MIT. The computational resource is "critically required for the immediate progress of several labs engaged in fundamental aspects of biological research," according to the grant abstract. The grantees plan to purchase a Linux-based SMP server with sixteen 64-bit Itanium 2 processors and 64 GB of memory, along with "several terabytes" of data storage. The resource will be located in the Whitehead-MIT Bioimaging Center and integrated with the CSBi high-performance computing resources.
Extracting Reliable Information From Microarray Data. Start date: April 15, 2006. Expires: March 31, 2008. Amount: $196,234. Principal investigator: Zoltan Szallasi. Institution: Children's Hospital Boston. NIH institute: NLM.
Funds development of a database for use in evaluating algorithms for analyzing microarray data. The investigators will demonstrate the utility of the proposed database "by determining the relative merit of several widely used microarray normalization algorithms," according to the grant abstract. The grantees will also develop probe-sequence-based methods to reduce cross-hybridization noise in microarray measurements.
Bioinformatics Approaches to Characterizing Amino Acid Function. Start date: April 24, 2006. Expires: April 23, 2009. Amount: $152,164. Principal investigator: Sean Mooney. Institution: Indiana University/Purdue University at Indianapolis. NIH institute: NLM.
Supports research examining how sequence, evolutionary, and structural descriptors can be used to quantify function. This knowledge will be used to develop methods that can "associate residues with known functional annotations, perform annotation transfer onto an experimentally determined or modeled protein structure, and determine the likely molecular effects of mutation, thus creating a framework for residue annotation," according to the grant abstract.
A Genetic Association Research Statistical Framework. Start date: May 1, 2006. Expires: April 30, 2009. Amount: $500,000. Principal investigator: Ross Lazarus. Institution: Brigham and Women's Hospital Research Administration. NIH institute: NHGRI.
Funds development of an integrated suite of applications using the R statistical programming language and the Bioconductor framework to support complex disease association research. Specific aims include "software support for importing experimental data and genomic annotation; methods for statistical power calculations and for selecting maximally informative subsets of markers during experimental design; methods for visualizing and summarizing experimental results; established and recently developed methods supporting statistical inference on single markers, multiple markers, and on the epistatic and gene by environment interactions characteristic of these diseases and needed for emerging fields of study such as pharmacogenetics."
Mathematical Foundations for Nonlinear, Stochastic & Hybrid Biochemical Networks. Start date: May 1, 2006. Expires: April 30, 2010. Amount: $362,763. Principal investigator: John Doyle. Institution: California Institute of Technology. NIH institute: NIGMS.
Funds continued development and enhancement of a systems biology software infrastructure developed at Caltech, including deterministic/stochastic simulation algorithms "that are much more efficient than existing stochastic algorithms, and can automatically determine the appropriate scale for different subsystems of a model," according to the grant abstract.