Concurrent with last week’s announcement of the completion of the Human Genome Project, both the UCSC genome browser and Ensembl released versions of the latest assembly (NCBI build 33). UCSC is providing bulk downloads of the data via FTP at ftp://genome.ucsc.edu/goldenPath/10april2003/ or through the genome browser downloads link at http://genome.ucsc.edu/downloads.html. Ensembl is providing the data at http://pre.ensembl.org/Homo_sapiens/.
GenBank release 135.0 is now available from NCBI at ftp.ncbi.nih.gov. The release contains 31,099,264,455 base pairs and 24,027,936 entries, up by 1.7 billion base pairs and 992,113 sequence records from the February 134.0 release. Uncompressed, the 135.0 flat files require about 100 GB for the sequence files alone. The ASN.1 version requires around 89.67 GB. As of the April 14 announcement of the completion of the HGP, GenBank contained 9,743,398,611 base pairs of human sequence and 6,574,171 human sequence records.
Lion Bioscience has released a new software wrapper that integrates SRS with IBM’s DiscoveryLink middleware. The wrapper will enable users to write SQL queries to access SRS, while DiscoveryLink customers will be able to use SRS to integrate either flat-file or relational biological databases.
Lion has also released version 3.2 of its Lion DiscoveryCenter integration platform, which integrates chemical compound databases from MDL Information Systems as well as in-house customer databases using MDL’s Relational Chemistry Server data cartridge.
IBM is offering a free trial download of a suite of web services technologies for the NCBI databases, including wrappers for GenBank, PubMed, and BLAST, at www.alphaworks.ibm.com/tech/ws4LS.
SSAHA 3.1 is available at www.sanger.ac.uk/Software/analysis/SSAHA/. The release addresses several platform compatibility issues.
SuMo, a web-based system for finding structural similarities in 3D structures of proteins, is available from Pôle Bio-Informatique Lyonnais at http://sumo-pbil.ibcp.fr.