When researchers involved in protein-shape prediction wanted to evaluate their abilities and limitations, they kicked off what is now known as CASP, the community-wide Critical Assessment of Techniques for Protein Structure Prediction.
Eight years, a 10-most-wanted list, and a David Baker later, CASP proved that the challenge model can move a field forward. Building on that, bioinformaticists have turned to the KDD Cup (KDD stands for knowledge discovery and data mining) to assess their own community.
“We have found that in other areas of information technology, one way to make progress is to have a bunch of smart people all working on the same problem and comparing their results and the techniques they used to achieve these results,” says Lynette Hirschman of the Mitre Corporation, which provides technical and research support to the government and funded the KDD Cup.
Hirschman and her Mitre colleague Alex Yeh, a chair of the competition, helped promote and participated in the KDD Cup, now in its second year. The two tasks in the challenge were biology-oriented and served as a “community evaluation” for bioinformaticists, says Hirschman.
Results of the challenge were announced this summer in Edmonton, Canada, at the Association for Computing Machinery’s eighth annual conference on knowledge discovery and data mining.
A combined team from ClearForest and Celera beat 31 other entrants by developing a system that was best able to automate the extraction of gene expression information about Drosophila from biomedical journals for curation in FlyBase, the public database from Indiana University. Competitors had to indicate whether each article contained experimental evidence for RNA transcripts, polypeptides, or proteins.
The task was devised in collaboration with FlyBase curators, who needed help determining which articles to curate from the thousands circulating, Yeh says. According to Hirschman, “They were an integral part of this piece.”
Adam Kowalczyk and Bhavani Raskutti, of Australia’s Telstra Research Laboratories, won first prize for predicting the effect of knockout genes on different sub-cellular components in yeast cells from Medline abstracts, categorical features, and data on protein-protein interactions. Fifty-three other teams submitted entries in what chair Mark Craven of the University of Wisconsin called a “very close competition.”
Teams came from academia, government, and industry. As prizes, the winners delivered talks at the ACM conference.
— Dana Frisch