ANN ARBOR, Mich.--Celera Genomics could finish sequencing the human genome by this summer, well ahead of schedule, company president Craig Venter announced during a talk he gave recently at the University of Michigan Medical Center. By then, the company also could have as many as 8 million single nucleotide polymorphisms in its databases, he added.
"Our goal is to have the sequencing phase finished in June," Venter said. By early December 1999, Celera had sequenced more than 3.5 billion base pairs and was adding about a billion base pairs every three weeks or so.
One reason for the accelerated timetable is that Celera can take advantage of the Human Genome Project’s "rough draft," now under construction and also due to be completed in 2000. The Genome Project’s data, obtained from sequencing bacterial artificial chromosomes, either pre-mapped to their proper genomic location or anchored immediately after sequencing, will allow Celera to more easily order its own shotgunned DNA fragments. Venter said that by April or May, "essentially every single BAC from the public databases will be completely ordered, and we’ve already started annotating these." Synergies between the two projects are saving Celera time and money, he remarked.
Venter also responded to skeptics who have argued that the assembly of shotgunned sequence fragments will be an insurmountable challenge due to the large amount of repetitive DNA in the human genome. "A lot of fuss has been made about repeats in the human genome, and other genomes," he said. "Gene Myers, who’s head of our algorithm team, made the simple discovery that if we just ignored the repeats, we could unambiguously assemble 99.7 percent of the genome, going back and filling in the repeats later." Myers’ strategy enabled Celera to complete the assembly of Drosophila melanogaster by December.
While Venter credited the US National Human Genome Research Institute with helping Celera accelerate its own sequencing timetable, he also expressed doubts about the institute’s future, suggesting that it would evolve into "the US Department of Genetic Identity" and predicting that the era of genetic identity cards is not far off. Genetic predictions--especially for behavior--in Venter’s view, smack of "genetic reductionism" and could greatly damage the field of human genetics in particular and biology in general. "Studies in the literature linking polymorphisms to behavior like suicide, sexual preferences, etc., are perhaps among the most dangerous things we face in this field," Venter said. "Bad science going forward will so scare the public that it could lead to the kind of reactions we’re seeing in England and Europe right now with genetic modification of organisms."
But cataloguing human genetic variation is a key part of Celera’s business strategy. The company plans to obtain SNP data by comparing DNA from five individuals, in addition to the Genome Project’s sequence. "We’ll have between 4 and 8 million SNPs in our database by June, accurately, down to a single nucleotide," Venter claimed. "And we’ll be adding to that extensively as we go forward."
Venter also commented on Celera’s gene patenting plans, which have been a source of some concern within the scientific community. "We’re not patenting anything that we don’t have a good prediction on," he said. "We’re only filing provisional applications on things we know the pharmaceutical industry is likely to want to use, such as neurotransmitter receptors and ion channels. Nobody, it seems--especially us--wants to see broad blocking patenting of just predicted genes of the genome."
Venter hinted at the riches already obtained through Celera’s computerized gene-identification efforts. "We’ve found close to 2,000 new G-protein coupled receptors," he said. "Secreted proteins. New ion channels. Kinases." Other examples: "A dopamine receptor, a new muscarinic receptor--we thought we already had all of those--[and] we found several new interleukins."
Once the human genome is finished, Venter said, Celera will immediately tackle the mouse genome. Work is already well underway, in collaboration with the Institute for Genomic Research, to sequence Arabidopsis thaliana, and Venter said rice would be undertaken later this year.
The nonprofit Institute for Genomic Research, which was founded by Venter in 1992 and is now run by his wife, Claire Fraser, has sequenced the genomes of 13 different organisms, including, in 1995, Mycoplasma genitalium. At 517 genes, M. genitalium is the world’s simplest self-replicating organism. In the Dec. 10, 1999, issue of Science, Venter’s group demonstrated that only about 300 of those genes are necessary for life, and could represent a "minimum" genome for the future creation of synthetic life forms. An even more amazing observation, according to Venter, was that the functions of more than one-third of these genes remain unknown.
"Here’s the most minimal cell that is self-replicating, and we have no idea whatsoever what 103 of the 300 genes do," Venter noted. "That’s extremely humbling when you try and extrapolate forward to our own genome, where we have approximately 80,000 genes, in a hundred billion different combinations, and we can’t tell you what the biology is of the simplest cells."
Venter stressed that the human genome sequence will just be the starting point for an effort to understand human biology. Proteomics, he predicted, will soon take off "exponentially," as will use of gene expression arrays, because every gene can be included. The ultimate payoff will be a full understanding of genes and gene interactions--work that has barely begun.
"How could we ask for a more exciting situation going forward?" Venter concluded. "Every student in the room can pick up one of those genes, and has a chance to make mind-boggling breakthroughs about the fundamentals