Integrated Cancer Data Management System. Start date: April 2004. Expires: October 2004. Amount: $100,000. Principal investigator: Antony Chiang. Institution: Altrue.com. NIH institute: NCI.
Proposal to “test the use of open source software technologies” in developing a prototype specification for an integrated cancer data management system. The system will use a graphical web-based workspace that allows researchers to organize and manipulate data from different sources, query analytical results across distributed data sources, and access and share that workspace. The system will focus on genomic and cytometric data used in breast cancer research.
Understanding Figures & Captions for Location Proteomics. Start date: April 2004. Expires: March 2007. Amount: $161,668. Principal investigator: William Cohen. Institution: Carnegie Mellon University. NIH institute: NIDA.
Proposal to develop a method to extract information from image data, as opposed to text, in biomedical journal articles. The project will focus on fluorescence microscope images depicting the subcellular localization of proteins.
Integrating Clinical and Molecular Data. Start date: May 2004. Expires: April 2006. Amount: $170,000. Principal investigator: Robert Beck. Institution: Institute for Cancer Research. NIH institute: NLM.
Proposal to build an integrated clinical and molecular database system consisting of a data warehouse and associated tools populated automatically from feeder systems such as microarray analysis tools, population research databases, clinical research and study databases, and production clinical information systems.
Massively Parallel High-Throughput, Low-Cost Sequencing. Start date: May 2004. Expires: April 2006. Amount: $2.4 million. Principal investigator: Kenton Lohman. Institution: 454 Life Sciences. NIH institute: NHGRI.
Funds a multi-disciplinary effort to develop a massively parallel, high-throughput sequencing instrument that combines simultaneous sequencing in hundreds of thousands of picoliter-scale reaction wells with “high-powered” bioinformatics.
Computation and Mass Spec: Predicting Protein Complexes. Start date: May 2004. Expires: April 2008. Amount: $334,435. Principal investigator: Lynn Teneyck. Institution: University of California, San Diego. NIH institute: NIGMS.
Funds development of an approach that combines mass-spectrometry hydrogen-exchange data with computational docking to give the structures of macromolecular complexes. The project will develop algorithms for extracting complete hydrogen exchange data from experiments, and will calibrate the combined method using protein kinase complexes and thrombin-thrombomodulin complexes. Additionally, the project will extend the hybrid approach to systems that change conformation on binding.
Alignment-independent Classification of Proteins. Start date: May 2004. Expires: April 2008. Amount: $278,711. Principal investigator: Ivet Bahar. Institution: University of Pittsburgh. NIH institute: NLM.
Funds development of an alignment-independent classification approach based on a search engine technology developed for classifying medical records. Each protein is represented by a multidimensional vector, the elements of which refer to the protein’s “most discriminative” eta-grams, or sequences of eta amino acids. According to the researchers, preliminary studies on G protein-coupled receptors showed that a simple naive Bayes classifier using eta-gram feature selection in its preprocessing can outperform existing classifiers on standardized GPCR sequence data subsets. Tests on the Protein Information Resource and Pfam databases showed that approximately 70 percent of the protein sequences were classified correctly, according to the researchers. The project will develop a computational tool for protein sequence analysis and classification based on eta-gram distributions, and build a database of protein families based on eta-gram distributions.
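The general idea described above, representing each protein by counts of short subsequences and feeding a selected subset of them to a naive Bayes classifier, can be illustrated with a minimal sketch. This is not the researchers' implementation. It assumes that an eta-gram is an overlapping substring of eta residues, uses a simple made-up score (the spread in per-class occurrence rates) to pick "discriminative" features, and trains on invented toy sequences and family labels. All function names are hypothetical.

```python
from collections import Counter
import math

def eta_grams(seq, eta=2):
    """Return all overlapping substrings of length eta (assumed definition)."""
    return [seq[i:i + eta] for i in range(len(seq) - eta + 1)]

def select_features(seqs, labels, eta=2, top_k=20):
    """Keep the top_k eta-grams whose occurrence rate differs most across classes.

    The score is a toy stand-in for a real feature-selection criterion.
    """
    classes = sorted(set(labels))
    n_docs = Counter(labels)
    doc_freq = {c: Counter() for c in classes}  # sequences per class containing each gram
    for s, y in zip(seqs, labels):
        doc_freq[y].update(set(eta_grams(s, eta)))
    vocab = set().union(*doc_freq.values())

    def score(gram):
        rates = [doc_freq[c][gram] / n_docs[c] for c in classes]
        return max(rates) - min(rates)

    return sorted(vocab, key=lambda g: (-score(g), g))[:top_k]

def train_nb(seqs, labels, features, eta=2):
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""
    feats = set(features)
    classes = sorted(set(labels))
    counts = {c: Counter() for c in classes}
    for s, y in zip(seqs, labels):
        counts[y].update(g for g in eta_grams(s, eta) if g in feats)
    priors = Counter(labels)
    model = {}
    for c in classes:
        total = sum(counts[c].values()) + len(feats)
        log_prior = math.log(priors[c] / len(labels))
        log_lik = {g: math.log((counts[c][g] + 1) / total) for g in feats}
        model[c] = (log_prior, log_lik)
    return model

def classify(model, seq, eta=2):
    """Return the class with the highest log posterior; unselected grams are ignored."""
    def log_posterior(c):
        log_prior, log_lik = model[c]
        return log_prior + sum(log_lik[g] for g in eta_grams(seq, eta) if g in log_lik)
    return max(model, key=log_posterior)

# Invented toy data: two hypothetical "families" with different residue biases.
train_seqs = ["AAAAGAAA", "AAGAAAAA", "LLLLKLLL", "LLKLLLLL"]
train_labels = ["famA", "famA", "famB", "famB"]

features = select_features(train_seqs, train_labels, eta=2, top_k=20)
model = train_nb(train_seqs, train_labels, features, eta=2)
print(classify(model, "AAAGAA"), classify(model, "LLKLL"))
```

Because the classifier works on subsequence counts alone, no sequence alignment is needed at any step, which is the property the grant title refers to. Real protein families would require far more training data and a principled feature-selection criterion.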
Improved Pattern Recognition for Functional Genomics. Start date: May 2004. Expires: April 2009. Amount: $141,533. Principal investigator: Ka Yeung-Rhee. Institution: University of Washington. NIH institute: NCI.
Supports development of improved algorithms for class prediction and identification of gene markers on microarray data related to hepatocellular carcinoma and hepatitis C virus-associated liver disease. Software tools for visualization will also be developed, as well as “practical guidelines” for cluster analysis on microarray data based on empirical studies using an in-house database of thousands of microarray experiments.