NEW YORK (GenomeWeb News) – Recent trends in genome sequencing, including next-generation instruments and the rise of metagenomics studies, have spawned several efforts to standardize genomic data formats across platforms so that the bioinformatics community can exchange data efficiently.
Two independent projects are currently addressing this issue from different directions. One, a collaboration among next-generation sequencing vendors, genome centers, and other organizations, has developed a new DNA sequence data format called SSR, for short sequence reads.
The second, a collection of around 30 organizations called the Genomic Standards Consortium, has put together a “checklist” for sequencing experiments called MIGS, or minimum information about a genome sequence.
“The approaches should complement one another,” Asim Siddiqui, vice president of research at Sirius Genomics and the coordinator of the SSR effort, told GenomeWeb News sister publication BioInform last week. “The area [GSC] appears to be tackling is the higher-level problem of how to describe a genome, while we are focused on the nuts and bolts of how to handle data from sequencers and to put it together into a genome assembly.”
Dawn Field, head of the molecular evolution and bioinformatics section at the Center for Ecology and Hydrology at the UK’s Natural Environment Research Council and coordinator for the GSC effort, agreed.
“I think in many ways, the interactions should be close, but they’re two very separate projects,” she said. “They’re describing reads, so that’s really how industry packages up its data to allow biologists and others to exchange data in order to do something useful with it in the future, while we’re doing this very top-level checklist: ‘Why did you do this genomics experiment, what isolate did you use, what phenotypes did it have?’”
Both Siddiqui and Field said that they plan to coordinate their efforts in the future. “We’re working with a lot of the same partners, and it would be nice just for the visibility if we could circulate our stuff to their lists and their stuff to our lists, because it’s all people doing genomics,” Field said.
The complete version of this article appears in the current issue of BioInform, a GenomeWeb News sister publication.