Writing in Genome Research, the Wellcome Trust Sanger Institute's Jared Simpson and Richard Durbin present memory-efficient data structures and algorithms for genome assembly "using the FM-index derived from the compressed Burrows-Wheeler transform, and a new assembler based on these called SGA," or string graph assembler. Simpson and Durbin also present "algorithms to error correct, assemble, and scaffold large sets of sequence data." The Sanger Institute researchers say their approach makes it possible to assemble a human genome using 54 GB of memory.
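For readers unfamiliar with the FM-index, the toy Python sketch below shows the general idea: the Burrows-Wheeler transform of a sequence, plus two small lookup tables, is enough to count how often a pattern occurs without scanning the sequence itself. This is an illustration only and not the SGA implementation. The function names (bwt, build_fm_index, count_occurrences) are made up for this example, the transform is built by naively sorting rotations, and the tables are stored uncompressed, so it has none of the memory savings the paper describes.

```python
from collections import Counter


def bwt(text):
    """Burrows-Wheeler transform via naive sorted rotations (toy sizes only)."""
    assert "$" not in text, "'$' is reserved as the end-of-text sentinel"
    s = text + "$"  # '$' sorts before A, C, G and T in ASCII
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def build_fm_index(text):
    """Return (C, occ, n) for an uncompressed, illustrative FM-index."""
    last = bwt(text)
    counts = Counter(last)
    # C[c]: number of characters in the text that sort strictly before c.
    C, total = {}, 0
    for c in sorted(counts):
        C[c] = total
        total += counts[c]
    # occ[c][i]: number of occurrences of c in last[:i].
    occ = {c: [0] * (len(last) + 1) for c in counts}
    for i, ch in enumerate(last):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (1 if c == ch else 0)
    return C, occ, len(last)


def count_occurrences(pattern, index):
    """Count matches of a non-empty pattern using backward search."""
    C, occ, n = index
    lo, hi = 0, n  # half-open range of matching rows in the sorted rotations
    for ch in reversed(pattern):
        if ch not in C:
            return 0
        lo = C[ch] + occ[ch][lo]
        hi = C[ch] + occ[ch][hi]
        if lo >= hi:
            return 0
    return hi - lo


if __name__ == "__main__":
    index = build_fm_index("GATTACAGATTACA")
    print(count_occurrences("GATTACA", index))
```

Each step of the backward search narrows a range of sorted rotations by one pattern character, so the work per query depends on the pattern length rather than on the length of the indexed sequence.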
In another paper published online in advance, a team led by investigators at the University of Tokyo reports whole-exome sequencing data for 15 pancreatic tumor cell lines and their matched normal samples. From these data, the team found that "the diversity of the mutation rates was significantly correlated with the distinct MLH1 copy-number status." The researchers go on to suggest that "MLH1 hemizygous deletion, through increasing the rate of indel mutations, could drive the development and progression of sporadic cancers."