NSF Bioinformatics Grants Awarded Aug. 17 — Nov. 8, 2007

Study of Dimension Reduction Methods Driven by Large Scale Biological Data. Start date: Sept. 1, 2007. Expires: Aug. 31, 2010. Awarded amount to date: $139,999. Principal investigator: Ker-Chau Li. Sponsor: University of California, Los Angeles.
This grant funds a project to develop new hybrid dimension-reduction methods for bioinformatics applications. These methods will combine concepts from principal component analysis, K-means, sliced inverse regression, principal Hessian directions, liquid association, and other clustering methods. In addition, the investigators plan to develop dimension-reduction tools for “incorporating clinical or other phenotype data” that are usually subject to censoring; investigate related statistical inference issues concerning false positives; investigate the pattern of cellular coordination at the functional module level; and develop liquid association-based methods for variable selection.

Collaborative Research: A High-Throughput Approach to the Assignment of Orthologous Genes Based on Genome Rearrangement. Start date: Sept. 15, 2007. Expires: Aug. 31, 2008. This grant was awarded to two investigative teams:
  • University of California, Riverside. Principal investigator: Tao Jiang. Awarded amount to date: $76,692.
  • Virginia Polytechnic Institute and State University. Principal investigator: Liqing Zhang. Awarded amount to date: $60,560.
This grant funds continued development of a parsimony approach for assigning orthologs between closely related genomes, which “attempts to transform one genome into another by the smallest number of genome rearrangement events including reversal, translocation, fusion, and fission, as well as gene duplication events,” according to the grant abstract. Investigators plan to address three key algorithmic problems: signed reversal distance with duplicates, signed transposition distance with duplicates, and minimum common string partition. The results will be incorporated into a software system for ortholog assignment called MSOAR.
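As a rough illustration of the kind of rearrangement event the abstract describes, the sketch below applies a single signed reversal to a toy genome and counts breakpoints relative to the identity gene order. It is not MSOAR's algorithm; the function names, the integer encoding of genes, and the example genome are assumptions made for illustration, and duplicated genes are not handled.

```python
def reversal(genome, i, j):
    """Reverse genome[i..j] (inclusive) and flip the orientation of each gene."""
    segment = [-g for g in reversed(genome[i:j + 1])]
    return genome[:i] + segment + genome[j + 1:]


def breakpoints(genome):
    """Count adjacencies that differ from the identity order +1, +2, ..., +n."""
    # Frame the genome with 0 and n + 1 so the two ends are also checked.
    extended = [0] + list(genome) + [len(genome) + 1]
    return sum(1 for a, b in zip(extended, extended[1:]) if b - a != 1)


# Hypothetical toy genome: genes 2 and 3 are inverted relative to the target.
source = [1, -3, -2, 4]
target = reversal(source, 1, 2)

print(source, "has", breakpoints(source), "breakpoints")
print(target, "has", breakpoints(target), "breakpoints")
```

In this toy case one reversal removes both breakpoints, so the reversal distance is 1. The problems named in the abstract are harder because real genomes contain duplicated genes.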

Computing Regulatory DNA by Comparing Plant Genomes. Start date: Sept. 15, 2007. Expires: Aug. 31, 2008. Awarded amount to date: $504,060. Principal investigator: Michael Freeling. Sponsor: University of California, Berkeley, Sponsored Projects Office.
This grant supports the development of databases of regulatory DNA sequences, mapping tools, and software for the plant breeding and research community. The primary goal of the project is to align each gene in sorghum with its orthologous gene in rice, evaluate exon annotations, and then define and store those noncoding sequences that have been conserved over evolutionary time. When the project is complete, the plant community should have approximately 170,000 short (31 base pairs on average) conserved noncoding regions sorted to particular genes. Results are displayed on the project website and are available through a Distributed Annotation System server and the PlantGDB database.

Analysis of Microarray Gene Expression Data. Start date: Sept. 15, 2007. Expires: Aug. 31, 2008. Awarded amount to date: $99,973. Principal investigator: Andrew Knyazev. Sponsor: University of Colorado at Denver.
This grant supports research in microarray data analysis. Specifically, the investigators are developing numerical methods for eigenvalue problems applicable to large data sets. The grantees are developing massively parallel software for principal components analysis and canonical correlation analysis that can be used for data clustering.
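To make the link between eigenvalue problems and principal components analysis concrete, the following minimal sketch finds the leading principal component of a small expression matrix by power iteration on its covariance matrix. Power iteration is used here only as a simple stand-in; the abstract does not say which eigenvalue methods the grantees use, and the function names and toy data are assumptions.

```python
import math


def covariance(rows):
    """Sample covariance matrix; rows are samples, columns are genes."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[k] for r in rows) / n for k in range(d)]
    centered = [[r[k] - means[k] for k in range(d)] for r in rows]
    return [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
            for a in range(d)]


def power_iteration(matrix, steps=200):
    """Approximate the largest eigenvalue and its unit eigenvector."""
    d = len(matrix)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(steps):
        w = [sum(matrix[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v^T M v gives the eigenvalue estimate for unit v.
    mv = [sum(matrix[a][b] * v[b] for b in range(d)) for a in range(d)]
    eigenvalue = sum(v[a] * mv[a] for a in range(d))
    return eigenvalue, v


# Hypothetical expression levels: three samples, two perfectly correlated genes.
expression = [[2.0, 4.0], [4.0, 8.0], [6.0, 12.0]]
variance, component = power_iteration(covariance(expression))
print("variance along first component:", variance)
print("first principal component:", component)
```

For data sets with thousands of genes, the covariance matrix is too large for this naive approach, which is the motivation for the parallel eigenvalue software the grant describes.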

A General Framework for High Throughput Biological Learning: Theory Development and Applications. Start date: Sept. 15, 2007. Expires: Aug. 31, 2010. This grant was awarded to two investigative teams:
  • Yale University. Principal investigator: Hongyu Zhao. Awarded amount to date: $119,999.
  • Columbia University. Principal investigator: Shaw-Hwa Lo. Awarded amount to date: $270,028.
This award supports a project that will investigate a general framework for handling complex large-scale, high-dimensional data sets generated from biological studies. The investigators intend to study problems “related to biological and medical prediction in response to treatments, clinical diagnosis of diseases (such as cancers), discovery of protein-protein interactions and biological network constructions related to disease etiology and motif identification,” according to the grant abstract. They will then evaluate “a series of novel statistical/computation procedures/software which will then be tested by a broad range of real and simulated data, some from current on-going studies.”

Fundamental Algorithms to Enable the Simulation of Multi-Scale Biological Systems. Start date: Sept. 15, 2007. Expires: Aug. 31, 2010. Awarded amount to date: $200,000. Principal investigator: Paul Plassmann. Sponsor: Virginia Polytechnic Institute and State University.
This grant supports the development of algorithms and software tools for simulating multi-scale biological systems. The target application for these tools is the simulation of the adaptive human immune system response to viral infections on multiple scales. “At the inter-cellular level, the movement and interaction of a variety of human cells, viruses, and antibodies must be modeled,” according to the grant abstract. “However, at the intra-cellular level, chemical pathways are simulated by complex network models that occur at much smaller time and length scales.” To address these challenges, the grantees will first develop algorithms, including efficient solution methods for the diffusion of biological agents, diffusion with drift, and fluid flow. Second, they propose a “novel model repository scheme” that will achieve orders-of-magnitude improvement in computational time.
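For readers unfamiliar with diffusion solvers, the sketch below shows the simplest textbook scheme, an explicit finite-difference update for one-dimensional diffusion with reflecting boundaries. It is an illustration only and not the grantees' method; the function name, parameters, and toy concentration profile are assumptions.

```python
def diffuse(conc, d_coef, dx, dt, steps):
    """Advance a 1D concentration profile with the explicit (FTCS) scheme."""
    r = d_coef * dt / dx ** 2
    if r > 0.5:
        # The explicit scheme is unstable above this limit.
        raise ValueError("time step too large for a stable explicit update")
    u = list(conc)
    for _ in range(steps):
        # Copy the edge values outward so no material leaves the domain.
        padded = [u[0]] + u + [u[-1]]
        u = [padded[i + 1] + r * (padded[i] - 2 * padded[i + 1] + padded[i + 2])
             for i in range(len(u))]
    return u


# Hypothetical starting state: all agents concentrated in the middle cell.
initial = [0.0, 0.0, 1.0, 0.0, 0.0]
result = diffuse(initial, d_coef=1.0, dx=1.0, dt=0.25, steps=10)
print(result)
print("total before:", sum(initial), "total after:", sum(result))
```

The stability limit on the time step is one reason explicit schemes become expensive at fine spatial scales, which is part of what makes efficient multi-scale solvers a research problem.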

The PhyloFacts Phylogenomic Encyclopedia of Microbial Protein Families. Start date: Nov. 1, 2007. Expires: Oct. 31, 2010. Awarded amount to date: $1,899,499. Principal investigator: Kimmen Sjolander. Sponsor: University of California, Berkeley, Sponsored Projects Office.
This grant supports a project to create an online phylogenomic encyclopedia of microbial gene families that will enable biologists “to predict the function, biological process, and 3D structure of millions of proteins encoded in microbial genomes,” according to the grant abstract. The PhyloFacts Microbial Encyclopedia will provide pre-computed phylogenomic analyses of millions of microbial genes and will include new bioinformatics methods to reconstruct the evolutionary histories of these ancient gene families, predict protein structure, molecular function and cellular localization, and link genes to metabolic networks and signaling pathways. The resource is expected to contain “tens of thousands” of phylogenetic trees for microbial gene families, the grant abstract notes.

Network Offloading for Genome Sequence Searching using the SmartNIC. Start date: Jan. 1, 2008. Expires: Dec. 31, 2008. Awarded amount to date: $149,704. Principal investigator: Gerald Sabin. Sponsor: RNET Technologies.
This Small Business Technology Transfer Phase I research project aims to develop a co-processing unit based on a field-programmable gate array to enable more rapid searches of protein and nucleic acid data. “Enabling faster comparison and analysis of sequences is essential to the more efficient alignment and identification of molecules such as proteins and nucleic acids whose numbers have grown tremendously over the last two decades and continue to increase,” the grantees note in the project abstract.


