Text Mining: Biofx Accelerates with Competition


When researchers involved in protein-shape prediction wanted to evaluate their abilities and limitations, they kicked off what is now known as CASP, the community-wide Critical Assessment of Techniques for Protein Structure Prediction.

Eight years, a 10-most-wanted list, and a David Baker later, CASP has proved that the challenge model can move a field forward. Building on that, bioinformaticists have turned to the KDD Cup, the knowledge discovery and data mining competition, to assess their own community.

“We have found that in other areas of information technology, one way to make progress is to have a bunch of smart people all working on the same problem and comparing their results and the techniques they used to achieve these results,” says Lynette Hirschman of the Mitre Corporation, which provides technical and research support to the government and funded the KDD Cup.

Hirschman and her Mitre colleague Alex Yeh, a chair of the competition, helped promote and participated in the KDD Cup, now in its second year. The two tasks in the challenge were biology-oriented and served as a “community evaluation” for bioinformaticists, says Hirschman.

Results of the challenge were announced this summer at the Association for Computing Machinery’s eighth annual conference on knowledge discovery and data mining in Edmonton, Canada.

A combined team from ClearForest and Celera beat 31 other entrants by developing the system best able to automate the extraction of gene expression information about Drosophila from biomedical journals for curation in FlyBase, the public database from Indiana University. Competitors had to indicate whether each article contained experimental evidence for RNA transcripts, polypeptides, or proteins.

The task was devised in collaboration with FlyBase curators, who needed help determining which articles to curate from the thousands circulating, Yeh says. According to Hirschman, “They were an integral part of this piece.”

Adam Kowalczyk and Bhavani Raskutti, of Australia’s Telstra Research Laboratories, won first prize for predicting the effect of knockout genes on different sub-cellular components in yeast cells from Medline abstracts, categorical features, and data on protein-protein interactions. Fifty-three other teams submitted entries in what chair Mark Craven of the University of Wisconsin called a “very close competition.”

Teams came from academia, government, and industry. As prizes, the winners delivered talks at the ACM conference.

— Dana Frisch
