The National Human Genome Research Institute and other NIH institutes are releasing a request for applications this week for a program to fund a genome-wide protein sequence database, NHGRI said Jan. 29.
Although the institute declined to give the dollar amount of the proposed grant, the initiative is aimed at addressing the need for a centralized, web-accessible database of proteins, their sequences, and the corresponding DNA sequences in their respective genomes, said Peter Good, the program director for genome informatics at NHGRI.
“The general concept is that we would like to support a database that will be a rich resource of information for anyone interested in proteins,” he said. While there are many protein databases, most are specific to model organisms such as C. elegans and S. cerevisiae, he added.
In addition, Good said the database would help clear up some of the confusion surrounding gene predictions. “Really there’s some question as to how accurate the predictions are in terms of translation, [and] you also want to make sure there’s information about whether the protein is in a gene family; if the protein is translated from an alternatively spliced form; whether there are isoforms; [and if] the protein is modified,” he said.
The RFA will not be geared toward protein databases that include protein structure or protein interactions, he said, because established databases such as PDB already exist, and because including interactions, while theoretically possible, would make the project less feasible.
Because NHGRI does not wish to duplicate already available protein sequence resources, the grant is open to established databases such as Swiss-Prot, perhaps the best-known protein sequence database, which was started in 1993 by Amos Bairoch, Denis Hochstrasser, and Ron Appel.
“Swiss-Prot is certainly welcome to apply to this RFA, and I think Swiss-Prot is an excellent model for it because it really sets the standards for a well-curated protein database,” Good said. But there are other models for protein sequence databases, he said, and “we’re looking for the best application.”
Good, who joined NHGRI several months ago after leaving a faculty position in developmental biology at Louisiana State University, said that NHGRI and the other institutes sponsoring the grant are more broadly interested in funding projects in proteomics, and are hoping an NIH-sponsored workshop in April will help the institutes “find out what the state-of-the-art technology is and see how it fits with our goals, so that we can decide what to fund.”
“We do have a strong interest in proteomics, on a genome scale,” he said, but added that there is currently no initiative within NIH to fund a proteomics project on the scale of the Human Genome Project.