The Pistoia Alliance said this week that it is organizing a competition to find the best algorithm for compressing next-generation sequencing data.
As part of the contest, dubbed the Pistoia Alliance Sequence Squeeze competition, participants are required to develop open source algorithms that can compress and decompress NGS data stored in the FASTQ format.
The winning entry will receive a $15,000 prize.
The competition is open to researchers with expertise in methods of compressing data, including bioinformaticians, mathematicians, physicists, and computer scientists.
Participants will have access to a 64-bit environment on the Amazon Web Services EC2 cloud in which to develop their code, along with sample data for testing their algorithms.
Details about how to access the cloud environment, as well as additional information about the competition, are available on the contest's website (http://www.sequencesqueeze.org/how-to-enter/index.html).
The deadline for submissions is Mar. 15, 2012.
Entries will be judged by a panel of four that currently includes representatives from BGI, the Wellcome Trust Sanger Institute, and the Pistoia Alliance, a public/private consortium founded to improve pharmaceutical R&D informatics.
Entries will be scored on compression ratio, compression time and memory use, and decompression time and memory use. They will also receive points for leader, sequence, and quality mismatches.
The overall leader will be selected based on a combination of these scores and the opinions of the judging panel.
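As a rough illustration of the kinds of measurements described above, the following Python sketch round-trips a small FASTQ sample and reports a compression ratio, timings, and per-field mismatches. It is not the contest's scoring harness: gzip stands in for a contestant's algorithm, the function names and sample reads are hypothetical, memory use is not measured, and it assumes the mismatch categories correspond to the header, sequence, and quality lines of each four-line FASTQ record.

```python
import gzip
import time

FIELDS = ("header", "sequence", "quality")

# Hypothetical sample: 50 identical four-line FASTQ records.
SAMPLE = "@read1\nGATTACAGATTACA\n+\nIIIIIIIIIIIIII\n" * 50


def parse_fastq(text):
    """Split FASTQ text into (header, sequence, quality) tuples.

    Assumes the simple four-lines-per-record layout; the '+' separator
    line (third line of each record) is skipped.
    """
    lines = text.strip().split("\n")
    return [(lines[i], lines[i + 1], lines[i + 3])
            for i in range(0, len(lines), 4)]


def evaluate_round_trip(fastq_text):
    """Compress and decompress the text, then compare the records."""
    raw = fastq_text.encode("ascii")

    start = time.perf_counter()
    packed = gzip.compress(raw)  # stand-in for an entrant's compressor
    compress_seconds = time.perf_counter() - start

    start = time.perf_counter()
    restored = gzip.decompress(packed).decode("ascii")
    decompress_seconds = time.perf_counter() - start

    original = parse_fastq(fastq_text)
    recovered = parse_fastq(restored)

    # Count, per field, how many records differ after the round trip.
    mismatches = dict.fromkeys(FIELDS, 0)
    for before, after in zip(original, recovered):
        for name, old, new in zip(FIELDS, before, after):
            if old != new:
                mismatches[name] += 1

    return {
        "compression_ratio": len(raw) / len(packed),
        "compress_seconds": compress_seconds,
        "decompress_seconds": decompress_seconds,
        "missing_records": abs(len(original) - len(recovered)),
        "mismatches": mismatches,
    }


result = evaluate_round_trip(SAMPLE)
print(result)
```

Because gzip is lossless, every mismatch count in this sketch is zero; a lossy scheme, for example one that coarsens quality scores, would show nonzero counts in the corresponding field.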
The Pistoia Alliance is also organizing a competition targeted at informatics vendors with the aim of developing a hosted platform for next-generation sequence data storage and analysis. The organization launched the second phase of the project in July with $50,000 in funding (BI 7/29/2011).
The deadline for vendors interested in participating was Sept. 23 and those chosen to receive funding were to be notified this month.