NEW YORK (GenomeWeb News) – Two institutes of the National Institutes of Health intend to fund a proteomics data resource to house protein sequence information that researchers can search and use to support their biomedical research.
The National Human Genome Research Institute and the National Institute of General Medical Sciences will award a grant of approximately $6 million in 2010 to start the Centralized Protein Sequence and Function Resource, which aims to be an easy-to-use and comprehensive resource of annotated proteomic and related information.
The aim of the grant is to fund a center that will enable a range of researchers to use the vast amounts of proteomics data that have been generated through mass spectrometry and sequencing in a broad range of experimental and computational research projects.
The three-year grant may provide up to $5.5 million per year, but is limited to $6 million in total costs. The resource should contain high-quality manual annotations and should be able to handle high-throughput data. In addition, the institutes said it should "leverage but not duplicate what is done by smaller databases."
Specifically, the grant calls for a resource that should include "information on a variety of data types including, but not limited to, protein sequences, nomenclature, alternatively-spliced proteins, homology and paralogy relationships, and family classifications." It also should include relevant information on gene function, potential protein interactions, expression patterns, and pathways.
In addition, it should be easily accessible by the scientific community through multiple methods of querying, including simple web interfaces for common standard queries and tools such as application programming interfaces.
The grant opportunity is primarily the result of discussions at a July 2008 workshop on Protein Sequence Function Resources.