On his blog, New York University's Stuart Brown says researchers should speak with their colleagues in bioinformatics before they begin sequencing projects, rather than simply dropping off piles of fresh data. Brown says "our informatics effort is much greater on the poorly designed and failed experiments." For example, his department recently processed a "seemingly standard SNP detection using exome sequences with 100 [base pair] paired-end reads … done by a private sequencing contractor" that had also called some SNPs. In searching for overlaps between SNP calls for a variety of samples and controls, Brown and his bioinformatics colleagues found that the sequencing reads generated by the contractor "have a 1.5 percent error rate." When the group looked at additional quality control data, it saw "a steep increase in error at the ends of reads," he says. Now, rather than simply searching for overlaps, the bioinformaticians must "trim all reads down by 10 [percent] to 25 percent and recall SNPs." Brown says his group is left to wonder about the sequencing library's insert sizes, and how they may have contributed to the overall error rate. "Talk to bioinformatics before you build your sequencing libraries," Brown pleads.
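To make the trimming step concrete, here is a minimal Python sketch of what cutting a read down by a fixed fraction might look like. It is not Brown's actual pipeline: the function name `trim_read`, the choice to trim only from the 3' end, and the rounding rule are all assumptions made for illustration, since the post says only that errors rise steeply "at the ends of reads."

```python
from typing import Tuple


def trim_read(seq: str, qual: str, fraction: float) -> Tuple[str, str]:
    """Drop the trailing `fraction` of a read from its bases and qualities.

    Hypothetical helper: trimming only the 3' end is an assumption,
    not something stated in the original post.
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must be in the range [0, 1)")
    # Number of bases to keep after removing the requested share of the read.
    keep = len(seq) - round(len(seq) * fraction)
    return seq[:keep], qual[:keep]


# A made-up 100-base read with a placeholder quality string.
read_seq = "ACGT" * 25
read_qual = "I" * 100

# Trim by 25 percent, the upper end of the range quoted in the post.
trimmed_seq, trimmed_qual = trim_read(read_seq, read_qual, 0.25)
```

In practice a group would more likely use an established read-trimming tool, and SNPs would then be recalled from the trimmed reads; the sketch only shows the arithmetic of shortening a 100-base read to between 75 and 90 bases.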
Bioinformatician: Down with 'Data Drop'
Jun 08, 2011