NEW YORK (GenomeWeb News) – The Genomic Standards Consortium (GSC) has published a set of guidelines meant to improve the way researchers classify any and all genomes and metagenomes.
The GSC, an international, open-membership group created in the fall of 2005, includes biologists, bioinformaticians, and computer scientists, as well as individuals affiliated with sequencing centers and organizations such as NCBI’s GenBank and the US Department of Energy’s Joint Genome Institute. In a perspectives article published in this month’s issue of Nature Biotechnology, the consortium outlines what it calls “minimum information about a genome sequence” or MIGS specifications.
“There is great enthusiasm in the community for this project and we are already collecting MIGS-compliant reports,” lead author and GSC founder Dawn Field, a researcher at the UK’s Natural Environment Research Council Center for Ecology and Hydrology, said in a statement. “We are a highly collaborative group and open to new participants joining the GSC at any time.”
The GSC hopes to standardize the way genomic and metagenomic data are collected, described, and stored. This includes providing information about everything from strain names and the location or habitat from which the organism was isolated to detailed information about the sequencing method, coverage, assembly methods, and finishing.
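As a rough illustration of what a report covering these descriptors might look like in practice, the following is a minimal sketch in Python. The class name, field names, and the completeness check are hypothetical and chosen only to mirror the items mentioned above; they are not taken from the MIGS specification or from any GSC software.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class GenomeReport:
    """Hypothetical record; the fields are illustrative, not the official MIGS checklist."""
    strain_name: Optional[str] = None
    isolation_habitat: Optional[str] = None
    sequencing_method: Optional[str] = None
    coverage: Optional[str] = None
    assembly_method: Optional[str] = None
    finishing_status: Optional[str] = None


def missing_fields(report: GenomeReport) -> List[str]:
    """Return the names of descriptors that are unset or empty, in declaration order."""
    return [f.name for f in fields(report) if not getattr(report, f.name)]


# Placeholder values for demonstration only.
report = GenomeReport(strain_name="example strain", sequencing_method="example method")
print(missing_fields(report))
```

A check of this kind would let a submitter see which descriptors still need to be filled in before a report could be considered complete under whatever checklist is actually in force.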
The GSC also emphasizes the need for a consensus-based approach to genomics and metagenomics, to help researchers collect data quickly, share it with others, and integrate it with information already available to the genomics community.
To address such issues, the group has developed a freely available Genome Catalogue system (GCat) intended to help users input data, view and search genome descriptions, and, ultimately, integrate ontological information. Genomic sequence data and initial annotations must be submitted to the International Nucleotide Sequence Database Collaboration, and MIGS contains only primary, curated information, the authors noted.
The MIGS specifications are one of several “minimum information” standards under development. Other groups, including the Microarray and Gene Expression Data Society and the International Nucleotide Sequence Database Collaboration, have developed genome information standards through the Minimum Information about a High-throughput Nucleotide Sequencing Experiment (MINSEQE) and the Genome Project Metadata initiative, respectively.
“With the rapid pace at which new genome sequences are appearing, the need to consider how best to ensure stewardship of these data for the long term has never been more pressing,” Field and her colleagues wrote. “Given the importance of the growing genome collection, the capital investment in its creation and the benefits of leveraging its value through diverse comparative analyses, every effort should be made to describe it as accurately and comprehensively as possible.”