By Vivien Marx
The National Institutes of Health has awarded at least $92 million for 194 bioinformatics-related grants in fiscal year 2009 under the Obama Administration's $787 billion American Recovery and Reinvestment Act.
Bioinformatics-related projects made up 2.1 percent of NIH's total FY 2009 ARRA funding of $4.4 billion and 1.5 percent of the 12,789 NIH stimulus grants awarded, according to an analysis conducted by BioInform based on project titles. Since this method may have inadvertently omitted or miscategorized some bioinformatics-related projects, the number of bioinformatics-based ARRA grants is likely higher than the 194 tallied in this rough analysis.
Awards for bioinformatics support centers and resources garnered the most FY 2009 ARRA funding, with $16.2 million spread across 13 grants. Sequence analysis projects netted the most grants, at 23, and the second-largest amount of total funding, with $12.9 million.
Bioinformatics projects focused on disease and health, including projects related to electronic medical records, came next with 20 grants worth $10 million, the third largest sum.
In terms of the number of awards, bioinformatics training grants and proteomics analysis came next, with 26 and 22 grants, respectively. Training grants totaled $7.9 million and proteomics analysis grants totaled $5.4 million.
Drug discovery projects received $9.5 million, hardware-related awards netted $5.5 million, database-oriented projects received $5.2 million, text mining and ontology-focused projects received $5 million, as did projects focused on network analysis. Grants for modeling, integration tools, microarray analysis, and image analysis all pulled in less than $5 million in grants (see pie charts below for a complete breakdown of grants and funding for these categories).
The largest amount of stimulus funding for bioinformatics-related projects was from the National Institute of General Medical Sciences, which awarded $22.9 million — approximately 10 percent of its total ARRA funding for FY 2009 — to 61 grants. The National Human Genome Research Institute awarded 30 grants for a combined $17.7 million, and the National Library of Medicine's 46 grants totaled $15.6 million (see chart below for a breakdown of grants awarded by NIH institute).
The National Center for Research Resources and National Cancer Institute funded 14 projects each, and the National Institute of Allergy and Infectious Diseases funded five projects, as did the National Institute on Aging. Ten remaining NIH institutes funded three grants or fewer.
California pulled in the largest amount of bioinformatics-related ARRA funding, with $16.6 million, followed by Massachusetts with $15.6 million. Maryland received $6.7 million, Virginia $3.1 million, and Texas $2.1 million in ARRA grants for bioinformatics.
Sustainability and Support
Continuing a trend of researchers seeking sustainability for their resources, the largest chunk of bioinformatics funding, $16.2 million, or 17.5 percent of the bioinformatics-related ARRA funding, went to institutions housing bioinformatics support centers.
Some of these centers are targeting a specific research area. For example, the University of California, San Diego, is creating a pediatric imaging-genomics data resource with a $4.5 million grant from NIDA, and Pacific Northwest National Laboratory is creating a proteomics research resource with a $2 million grant from NCRR.
Pavel Pevzner at UCSD received a $1.3 million grant to support the Center for Computational Mass Spectrometry. According to the grant abstract, Pevzner and his colleagues propose to branch out from current research "into previously unexplored areas" of computational proteomics, to support multiple collaborative efforts, and to address the computational bottleneck in data interpretation that is hindering "the entire proteomics community."
The scientists seek to build open-access software tools to enable "complex" mass spectrometry searches, and to construct proteogenomic annotations and analyze pathways.
A number of data and analysis centers also received funding, such as the Massachusetts Institute of Technology's data analysis center to "support, facilitate, and enhance integrative analyses" for the Model Organism Encyclopedia of DNA Elements project, which is looking to identify functional elements in the fly and worm genomes. The center aims to create a "common computational infrastructure and pipeline" for modENCODE analysis, according to the abstract. The principal investigator on the $1.5 million grant is computer scientist Manolis Kellis.
Richard McIndoe, associate director of the Medical College of Georgia's center for biotechnology and genomic medicine, received $1.8 million for a coordinating and bioinformatics unit to "create and maintain the informatics infrastructure" for the multi-center Animal Models of Diabetic Complications Consortium and the Mouse Metabolic Phenotyping Centers — two consortia that NIH has decided to integrate, the abstract states.
Focus on Applications
In the sequence analysis arena, 21 informatics-related stimulus grants were awarded for a total of $12.9 million. Robert Strausberg of the J. Craig Venter Institute landed the largest sequence analysis grant, for $4.6 million, to sequence 110 reference genomes as a dataset for the Human Microbiome Project.
Boston College computational biologist Gabor Marth and his team will use a grant for $1.1 million to construct a data processing pipeline that can be deployed by the NIH and by scientists who want to analyze large amounts of second-generation sequence data, according to the grant's abstract.
The pipeline aims to facilitate such processes as "quality assessment and validation of short read sequence datasets" and includes tools for data export and visualization, the team said.
With a $400,000 Grand Opportunities grant, geneticist Mark Yandell and colleagues at the University of Utah plan to develop a new platform called VAAST, or the Variant Annotation, Analysis and Selection Tool, to annotate human genome variants. The software will "fulfill NHGRI's need for a technology to assess data quality and call variants," the grant abstract noted.
Among other notable awards, the Broad Institute's Todd Golub is proposing to use a $967,265 grant to develop Connectivity Map 100K, a "comprehensive 'functional look-up table' that links disease biology, genome function, and small-molecule action," according to the grant abstract. Golub's plan is to create 100,000 "Connectivity Map" profiles of genetic and pharmacologic perturbation drawn from lentiviral shRNAs to knock down the expression of 1,000 human genes in 10 diverse cell types, the abstract said.
Another large integration project, led by John Quackenbush at the Dana-Farber Cancer Institute, was awarded $690,000 to derive predictive networks from resources that, according to the grant abstract, "most analytical methods have ignored," namely the knowledge captured in published biomedical literature. Quackenbush and his colleagues want to develop a new method to extract and use the information as "seeds" that can "jump-start the process of building predictive network models."
Henry Jagadish and his colleagues at the University of Michigan at Ann Arbor also want to harness the literature with a grant for $435,850 to develop "techniques to generate putative pathways dynamically, boot-strapping from observed differential expression data, based upon external evidence of relationship from the literature and from interaction databases," according to the abstract.
A total of $5 million went to projects related to text mining and ontology development. Judith Blake and her colleagues at the Jackson Laboratory will use $1 million, the largest grant in this category, to continue work on the Gene Ontology Consortium to establish and maintain standards that describe genomes and gene products as well as analytical tools, according to the grant abstract.
Exploring Disease, Developing Drugs
Several of the 20 disease-related bioinformatics projects funded with a total of $10.2 million emphasized cancer, such as a project led by the Dana-Farber Cancer Institute that was awarded $3 million to functionally annotate cancer genomes, and a separate Grand Opportunities project led by Dana-Farber's Matthew Meyerson to develop the computational infrastructure to use second-generation sequencing datasets for cancer virus discovery. Another cancer project at Dartmouth College was awarded $98,339 to develop machine-learning approaches to predict cancer susceptibility.
ARRA grants are funding bioinformatics projects in other disease areas such as autism, rheumatoid arthritis, hepatitis C, obstructive pulmonary disease, and infectious disease.
Grants related to bioinformatics for drug discovery and pharmacogenomics reached $9.5 million, including several grants for computational ligand discovery methods.
Duke University will be using a $2.3 million ARRA grant to create a cardiovascular research-oriented partnership between the Metabolomics Network for Drug Response Phenotype and two other centers in the Pharmacogenomics Research Network to study variations in response to the "commonly used" cardiovascular antihypertensives atenolol and hydrochlorothiazide and the antiplatelet therapies clopidogrel and aspirin.
Compendia, one of the few informatics firms to land an ARRA grant, will use $263,987 to develop its platform Oncomine into one "unifying" bioinformatics resource for gene expression patterns in cancer. The firm plans to modify its academic pipeline to support "commercial operations," according to the grant abstract.
Two projects are focused on electronic medical records. The University of Virginia, Charlottesville, plans to use a $1.9 million grant to create a "genome-enabled" electronic medical record that includes family history and "personal risk factor data" and allows entry of genetic and molecular test results, linking them to clinical data. The application will be embedded into an EMR called Epicare.
At Vanderbilt University, a $415,228 grant will be used to fund the Vanderbilt Genome-Electronic Records project, which, according to the grant abstract, will apply the VGER model as the staff performs genome-wide association analysis on patient blood samples related to cardiac arrhythmia susceptibility. The researchers also plan to develop natural language processing tools to mine EMRs.
Training the Next Generation
Bioinformatics-related training grants totaling $7.9 million are spread across 16 states and include biomedical informatics training programs and funds for faculty recruitment. One such grant provides $360,000 for a faculty member in computational protein design and evolution at Emory University; the other bioinformatics faculty position, at the University of Louisville, is funded with $405,051.
Several bioinformatics training programs with grants worth over $1 million were funded by the National Library of Medicine.
For example, Rice University landed $1 million in renewal funds for its training program in biomedical informatics and plans to emphasize clinical informatics "more heavily." Oregon Health and Science University received a continuation grant of $1 million for fellowship training in the department of medical informatics and clinical epidemiology, as did the University of Pittsburgh for its training program in biomedical informatics.


