This Week in Genome Research: Mar 29, 2017

Researchers from Canada and the US report findings from a phenotypic, genomic, and phylogenetic study of Burkholderia cenocepacia isolates collected over time from chronically infected individuals with cystic fibrosis. Using more than 200 sputum samples collected from 16 individuals with cystic fibrosis over periods ranging from two years to two decades, the team performed systematic phenotyping and genotyping, paired with genome sequencing and comparative genomics, to tease out B. cenocepacia relationships and genetic changes within infected hosts. For example, the authors say, the analysis pointed to "recurrent gene losses in multiple independent longitudinal series," along with recurrent loss-of-function mutations.

A National Human Genome Research Institute-led team describes Canu, an assembler tailored to noisy long reads from single-molecule and nanopore sequencing. In addition to describing details of the assembler, the researchers applied it to reads generated with Pacific Biosciences or Oxford Nanopore instruments, producing complete genomes for microbes such as Escherichia coli and nearly complete genomes for eukaryotic organisms ranging from the plant Arabidopsis thaliana to fruit flies, worms, and humans. The authors note that the Canu assembler "is able to generate highly contiguous assemblies from both PacBio and Nanopore sequencing, but signal-level polishing is required to maximize the final consensus accuracy."

Researchers from Stanford University and the University of California, Berkeley, introduce another long-read assembly approach called HINGE, which "seeks to achieve optimal repeat resolution by distinguishing repeats that can be resolved given the data from those that cannot." The team validated the approach using E. coli reads generated on Oxford Nanopore instruments and PacBio reads for the yeast Saccharomyces cerevisiae, before applying HINGE to nearly 1,000 available bacterial sequence datasets. "The HINGE graph is a natural representation of a set of possible assemblies," the authors note, "and is amenable to further repeat resolution, which can be attempted using additional long-range information such as paired-end reads, Hi-C reads, or by leveraging biological insight."