McLEAN, Va., April 3 - With researchers at the first annual meeting of the Human Proteome Organization clamoring for bioinformatics tools to help analyze protein data, Rolf Apweiler, the coordinator for the Swiss-Prot database of annotated protein sequences, said that his group would work to provide a platform for scientists to "ask questions across many different experiments" and to link Swiss-Prot to other biological databases not specifically related to proteins.
"We are reaching out to a lot of other resources and doing a lot of work trying to verify that a biological object in Swiss-Prot or TrEMBL [a related protein database] corresponds to another biological object in another database," Apweiler said.
Swiss-Prot and TrEMBL, developed in a collaboration between the Swiss Institute of Bioinformatics and the European Bioinformatics Institute, are currently the only databases dedicated to cataloging protein sequence, function, domain structure, and post-translational modification information, said Apweiler.
In its current form, TrEMBL acts as a sort of minor league for Swiss-Prot, where data resides while researchers annotate it and bring it up to the Swiss-Prot standard. Apweiler, a bioinformaticist at EBI, said that he would like to create a system for annotating protein data with confidence values that would help researchers gauge its accuracy.
"Annotating the data is the bottleneck," Apweiler said. While Swiss-Prot contains about 94,000 fully annotated protein entries, TrEMBL has 375,000 partially annotated entries, a fraction of which likely overlap with Swiss-Prot entries, he said.
Apweiler noted, however, that the goal of Swiss-Prot should be not to assemble all the data that might have some relevance to proteins, but to provide links to many databases and ways of searching them simultaneously.
"You don't try to do everything; instead you try to link out to everything and to make it open and interoperable," he said.