ROCKVILLE, Md.--Celera Genomics announced last week that it has completed the random shotgun sequencing phase of its work on the Drosophila genome and will begin releasing data to the public in October.
The company said it will now dedicate its complete sequencing resources--a stable of 300 Perkin-Elmer ABI 3700 sequencing machines--to the human genome. The Drosophila sequence is the first to be generated by Celera since Craig Venter and Perkin-Elmer established the company last summer.
Anthony Kerlavage, senior director of bioinformatics for Celera, told BioInform that Celera was able to identify thousands of new genes in commercially important protein families in the organism.
Kerlavage also noted that, since beginning its work on the Drosophila sequence just four months ago, Celera has generated more than 1.8 billion base pairs of raw sequence, or 10x coverage of what is assumed to be a 180-megabase genome. In comparison, scientists at the Institute for Genomic Research spent more than a year on the shotgun sequence of the two-megabase Haemophilus influenzae genome, which was completed in 1995. The Drosophila genome is the largest to be sequenced so far by the whole-genome shotgun method. Celera anticipates that the human genome comprises 3.5 gigabases.
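The 10x coverage figure follows directly from the two numbers quoted above: raw bases sequenced divided by the assumed genome size. A minimal sketch of that arithmetic is shown here; the function name `coverage` is hypothetical and chosen only for illustration, and the 180-megabase genome size is the working assumption cited in the article, not a confirmed value.

```python
def coverage(raw_bases: int, genome_size: int) -> float:
    """Average sequencing depth: raw bases read divided by genome length."""
    return raw_bases / genome_size


# Figures quoted in the article: more than 1.8 billion raw base pairs
# against an assumed 180-megabase Drosophila genome.
drosophila_depth = coverage(1_800_000_000, 180_000_000)
print(f"{drosophila_depth:.0f}x coverage")
```

If the genome proves larger than 180 megabases, the same raw sequence would correspond to proportionally lower coverage.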
Last month, Celera revealed that preliminary results of the Drosophila project indicated that the genome might be larger than was at first predicted, but Kerlavage said the exact size will not be known until additional phases of the project are completed.
Computational assembly of the genome, using a bioinformatics tool developed by Gene Myers of Celera, and data annotation will be undertaken next in collaboration with the Berkeley Drosophila Genome Project. Then, regions of the genome that are not amenable to automatic sequencing--gaps--will be closed by Berkeley in a "finishing" phase, according to Kerlavage.
Celera and the Berkeley team plan to jointly publish the final results of their research in a peer-reviewed journal early in 2000.