NEW YORK, July 25 - As part of a four-year project to sequence the genome of the zebrafish, a favorite experimental animal, the Sanger Institute released its first draft assembly earlier this week.
The assembly includes 7,942,778 unique reads, or 82.4 percent of the starting reads.
Estimates of the genome coverage include supercontig coverage of about 77 percent, and contig coverage of about 61 percent.
Sanger researchers warn that the assembly is still provisional. Because the sequence data came from about 1,000 five-day-old fish embryos, regions of high genomic variability have not been pieced together accurately, causing both dropouts and duplications.
The fish genome was assembled using Phusion, an algorithm developed by Sanger Institute researchers to put together the genomes of higher organisms.
The Max Planck Institute and the Hubrecht Lab are also involved in the zebrafish project, which is scheduled to be finished in 2005.
Zebrafish are widely used among genomic researchers because of their high reproductive rate and because phenotypic changes are easily visible in the transparent eggs. The fish is especially well suited for comparative genomics and developmental genetics studies.
Its full genome is estimated to be about 1.7 billion base pairs.
Assembly data is freely available on the Sanger Institute's website.