Kaggle's Winning HIV Progression Prediction Model Outperforms Established Scientific Methods


Kaggle — an online platform for hosting bioinformatics competitions — has named the winner for its first contest, an effort to develop a tool that would identify markers in the sequence of the human immunodeficiency virus genome that could predict a change in the severity of HIV infection.

Chris Raimondi, a search engine optimization specialist who won the Predict HIV Progression Competition, used the R-based Caret and randomForest software packages to predict changes in viral load with more than 77 percent accuracy, compared to 70 percent for the best methods in the scientific literature.

To develop their prediction models, the 109 teams that participated in the competition downloaded data comprising the nucleotide sequences of each patient's reverse transcriptase and protease, along with viral load and CD4 count at the beginning of therapy. Each team was required to submit predictions for 692 patients.
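As a rough illustration of how data of this kind might be prepared for a model such as a random forest, the sketch below turns two nucleotide sequences and two clinical values into a single numeric feature vector. It is only a sketch under stated assumptions: the function names, the one-hot encoding, and the fixed sequence lengths are hypothetical and are not taken from the competition data format or from the winning entry.

```python
# Hypothetical sketch: names, encoding, and lengths are illustrative assumptions,
# not the competition's schema or the winner's method.
NUCLEOTIDES = "ACGT"


def one_hot_sequence(seq, length):
    """Encode a nucleotide sequence as a flat list of 0/1 values, 4 per position.

    The sequence is truncated or padded to `length`. Missing positions and
    any character other than A, C, G, or T are encoded as four zeros.
    """
    features = []
    for i in range(length):
        base = seq[i].upper() if i < len(seq) else None
        features.extend(1 if base == n else 0 for n in NUCLEOTIDES)
    return features


def patient_features(rt_seq, pr_seq, viral_load, cd4_count, rt_len=6, pr_len=6):
    """Concatenate both encoded sequences with the two baseline clinical values."""
    return (
        one_hot_sequence(rt_seq, rt_len)
        + one_hot_sequence(pr_seq, pr_len)
        + [float(viral_load), float(cd4_count)]
    )


# Toy input: made-up sequences and values, far shorter than real gene sequences.
example = patient_features("ACGTNA", "ccg", viral_load=4.7, cd4_count=250)
print(len(example))  # 6*4 + 6*4 + 2 = 50 features
```

A vector like this, one per patient, is the kind of input a classifier could be trained on; real entries would use full-length sequences and likely more selective features.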

According to the organizers, the competition was set up to find the markers in HIV sequences that predict changes in viral load, an indicator of the severity of the disease. They expect that the models will provide a better understanding of the “genetic blueprint” of HIV that can be used to help develop more effective therapies for the infection.

“This result neatly illustrates the strength of data modeling competitions for scientific research. Whereas the scientific literature tends to evolve slowly…a competition inspires rapid innovation by introducing the problem to a wide audience,” Kaggle CEO Anthony Goldbloom said in an e-mail to BioInform.

As the winner, Raimondi received $500 and will have an opportunity to co-author a paper with the host of the competition.

Kaggle's organizers aim to provide an opportunity for bioinformaticians to develop new data-analysis tools and techniques, and for researchers and organizations to expose their data to a wide range of analytical techniques.

A detailed description of the winning entry is available here.
