Pistoia Alliance to Offer $15K Prize for Best NGS Data Compression Algorithm


The Pistoia Alliance said this week that it is organizing a competition to find the best algorithm for compressing next-generation sequencing data.

As part of the contest, dubbed the Pistoia Alliance Sequence Squeeze competition, participants are required to develop open source algorithms that can compress and decompress NGS data stored in the FASTQ format.

The winning entry will receive a $15,000 prize.

The competition is open to researchers with expertise in methods of compressing data, including bioinformaticians, mathematicians, physicists, and computer scientists.

Participants will have access to a 64-bit environment on the Amazon Web Services EC2 cloud on which they can develop their code as well as some sample data with which they can test their algorithms.

Details about how to access the cloud environment as well as additional information about the competition are available on the contest's website.

The deadline for submissions is Mar. 15, 2012.

Entries will be judged by a panel of four that currently includes representatives from BGI, the Wellcome Trust Sanger Institute, and the Pistoia Alliance, a public/private consortium founded to improve pharmaceutical R&D informatics.

Entries will be scored on compression ratio, compression time, and compression memory use, as well as decompression time and memory use. They will also receive points for leader, sequence, and quality mismatches.
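To give a rough sense of what several of these criteria measure, the following is a minimal sketch rather than the contest's actual scoring harness. It uses Python's standard zlib module as a stand-in compressor, and the sample reads, the function name measure_round_trip, and the compression level are all assumptions made for illustration. It reports compression ratio, compression and decompression time, and whether the round trip reproduced the input exactly; memory use is not measured here.

```python
import time
import zlib

# Tiny made-up FASTQ sample (two records, repeated); not competition data.
SAMPLE_FASTQ = (
    "@read1\nGATTACAGATTACAGATTACA\n+\nIIIIIIIIIIIIIIIIIIIII\n"
    "@read2\nACGTACGTACGTACGTACGTA\n+\nIIIIHHHHGGGGFFFFEEEED\n"
) * 500


def measure_round_trip(text: str, level: int = 6) -> dict:
    """Compress and decompress `text` with zlib and report basic metrics.

    zlib is only a generic baseline chosen for this sketch; a contest
    entry would substitute its own FASTQ-specific algorithm.
    """
    raw = text.encode("ascii")

    t0 = time.perf_counter()
    packed = zlib.compress(raw, level)
    t1 = time.perf_counter()
    restored = zlib.decompress(packed)
    t2 = time.perf_counter()

    return {
        # Original size divided by compressed size; higher is better.
        "ratio": len(raw) / len(packed),
        "compress_seconds": t1 - t0,
        "decompress_seconds": t2 - t1,
        # True when decompression reproduces the input byte for byte.
        "lossless": restored == raw,
    }


result = measure_round_trip(SAMPLE_FASTQ)
print(result)
```

Because the sample is highly repetitive, the ratio it produces says little about real sequencing data; the point is only to show how the measured quantities relate to one another.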

The overall leader will be selected based on a combination of these scores and the opinions of the judging panel.

The Pistoia Alliance is also organizing a competition targeted at informatics vendors with the aim of developing a hosted platform for next-generation sequence data storage and analysis. The organization launched the second phase of the project in July with $50,000 in funding (BI 7/29/2011).

The deadline for vendors interested in participating was Sept. 23 and those chosen to receive funding were to be notified this month.
