Drosophila Genome Sequence Published, Released on Celera's New Discovery Portal



ROCKVILLE, Md.--After Celera Genomics and the Berkeley Drosophila Genome Project jointly published the assembled and annotated genomic sequence of the fruit fly in Science on March 24, Celera posted the data on its new web-based research portal, CeleraScience. The site, which also offers access to other public genomic data, uses Lion Bioscience's SRS bioinformatics tool to enable users to search the data and conduct cross-database queries.

The fruit fly data are also available at the US National Center for Biotechnology Information database, GenBank.

Celera announced last fall that it had completed 10x coverage during the sequencing phase of the Drosophila genome project. In November, 40 bioinformatics experts gathered for an intense two-week "jamboree" at Celera's facility here to assemble and annotate the genome. Using software such as Genie, a high-speed gene-finding system developed by the Berkeley, Calif., company Neomorphic, they were able to identify about 13,600 genes in Drosophila.

The scientists then devoted themselves to sorting and characterizing the genes. Celera has estimated that the data resulting from the collaborative effort are more than 99.99 percent accurate. The fruit fly is the first insect to be sequenced and, at 120 million base pairs, has the largest genome of any organism yet sequenced.

Gerald Rubin, who headed the Berkeley project, said the successful collaboration should serve as a model for future partnerships between publicly funded researchers and private companies. The project is a consortium of researchers from the University of California, Berkeley; Lawrence Berkeley National Laboratory; Baylor College of Medicine; and the Carnegie Institution of Washington. It is funded by the US National Institutes of Health, the Department of Energy, and the Howard Hughes Medical Institute. "We now have a complete Drosophila genome 18 months sooner and millions of dollars cheaper than anyone expected," remarked Rubin, who is now an investigator at the Howard Hughes Medical Institute.

Craig Venter, Celera's president and chief scientific officer, echoed Rubin's sentiments on the merits of the group effort and added that it is a positive sign for Celera's continuing effort to sequence the human genome.

The Berkeley group is now working with CuraGen to create a protein interaction map for Drosophila. Both parties anticipate that the map will lead to a better understanding of the role genes and protein function play in disease. CuraGen expects to begin releasing data to the public later this year.

--Matthew Dougherty
