GenBank release 166.0 is now available via FTP from the National Center for Biotechnology Information. It contains data as of June 11. The new release comprises approximately 92 gigabases and 88 million entries from non-WGS, non-CON sequences, and 113.6 gigabases and 39 million entries from WGS sequences.
Uncompressed, the 166.0 release flat files require about 343 GB for sequence files only. The ASN.1 data require approximately 314 GB.
With a combined total of roughly 205.6 gigabases across the WGS and non-WGS, non-CON sequences, GenBank has surpassed the 200-gigabase threshold as of the new release.
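The threshold claim follows from adding the two gigabase figures quoted for the release. As a minimal sketch, the arithmetic can be checked as follows; the constant and function names are illustrative only, and the inputs are the approximate values given above, not figures read from the release files.

```python
# Approximate figures quoted for GenBank release 166.0 (names are illustrative).
NON_WGS_GIGABASES = 92.0        # non-WGS, non-CON sequences
WGS_GIGABASES = 113.6           # WGS sequences
NON_WGS_ENTRIES_MILLIONS = 88
WGS_ENTRIES_MILLIONS = 39
THRESHOLD_GIGABASES = 200.0


def release_totals():
    """Return (total gigabases, total entries in millions) for the release."""
    total_gigabases = NON_WGS_GIGABASES + WGS_GIGABASES
    total_entries_millions = NON_WGS_ENTRIES_MILLIONS + WGS_ENTRIES_MILLIONS
    return total_gigabases, total_entries_millions


total_gigabases, total_entries_millions = release_totals()
print(f"Total: about {total_gigabases:.1f} gigabases "
      f"in about {total_entries_millions} million entries")
print(f"Exceeds {THRESHOLD_GIGABASES:.0f} gigabases: "
      f"{total_gigabases > THRESHOLD_GIGABASES}")
```

Because the non-WGS figure is itself rounded, the total should be read as approximate.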
CLC Bio has released Genomics Workbench, a new software package for analyzing next-generation sequencing data. Genomics Workbench can analyze and visualize data from all the major platforms, including Applied Biosystems’ SOLiD, 454’s GS FLX, Illumina’s Genome Analyzer, and Helicos BioSciences’ HeliScope. It contains an accelerated assembly algorithm. The company said it expects to release a benchmark white paper on the algorithm’s performance in the near future.
The package supports paired-end reads, reference assembly of genomes, de novo assembly of genomes, SNP detection, multiplexing, and high-throughput trimming.
Future versions will support digital gene expression, metagenomics, clustering and assembly of EST and cDNA sequences, large-scale downstream genomics and transcriptomics analyses, and workflows.