As expected, last week’s simultaneous publication of Celera’s and the Human Genome Project’s sequencing papers sparked a flurry of debate over the relative strengths of the two assembly methods and the quality of the resulting data.
Proponents of the public effort were quick to point out that Celera relied much more heavily upon data from the Human Genome Project than it had originally claimed, and that the bulk of the analysis in Celera’s Science paper was derived from its compartmentalized shotgun assembly rather than its whole genome assembly.
Some interpreted Celera’s shift toward compartmentalized shotgun assembly as proof that whole genome shotgun sequencing failed to live up to the company’s claims.
Celera, however, downplayed its use of the public data in its final analysis. “Whole genome shotgunning has been an unqualified success,” said Gene Myers of Celera at the press conference marking the occasion. Other Celera proponents have said that the company’s data is more accurate and of better quality.
Rather than delve any further into the fracas, BioInform has provided a comparison of the two assemblies in the hope that the numbers will speak for themselves.
These figures are based on data reported in the Science and Nature papers, and are not intended to represent the current contents of either Celera’s or the Human Genome Project’s databases.
The genome data available for free on Celera’s website (http://public.celera.com) will remain fixed at the October 1, 2000, point described in its paper, while the company will continue to update its subscription-only database. The company has not announced a finish point for its version of the genome sequence.
The Human Genome Project said it intends to complete its annotation of the human genome by 2003.
The primary source data available in three public databases (GenBank, EMBL and DDBJ) will be updated continuously as the project moves forward.