Publication of Venter's Genome Hints at Personal Genomics, But Raises Issue of Cost
Sep 4, 2007

A team of researchers led by Craig Venter published an analysis of his genome this week — the first diploid genome of a known individual to be sequenced and published. While Venter hailed the study as an important step toward individualized genomics, he cautioned that there are still questions regarding the cost and accuracy of new sequencing methods that will need to be addressed before the field takes off.
The genome, published in this week’s issue of PLoS Biology, is “probably the first and last” individual genome to be sequenced by Sanger technology because of the “cost and time involved,” Venter said in a press briefing announcing the publication of the study.
Using new sequencing techniques, Venter said, will be “cheaper, but probably at the cost of accuracy and completeness.”
Venter said that the variation in his genome was “much greater than expected.” In total, his genome had 4.1 million DNA variants, of which 78 percent were SNPs, although the remaining variants comprised three quarters of all variant bases (see side bar for further details).
But the genome sequence, a “high-quality draft” generated by whole-genome shotgun sequencing with Sanger technology, is not yet finished. To close the gaps and improve the assembly of haplotypes, the scientists are now generating additional data using next-generation sequencing technologies.
The genome, which the scientists hope will become a reference for future individual genomes, was long in the making. Part of the data came from Celera Genomics, which published a human genome in 2001. Sixty percent of that genome was Venter’s own. Since 2003, the Venter Institute has been supplementing this data with additional Sanger reads, resulting in the 32 million reads that were used in the assembly of his diploid genome. The researchers deposited the sequences in GenBank on May 18.
While Celera’s original haploid genome was a consensus assembly of five individuals, the public consortium published a haploid sequence that was made up of many individuals. “In retrospect, both sequencing approaches were flawed,” Venter said.
George Church, a geneticist at Harvard, told In Sequence by e-mail that a diploid genome is needed for genome-wide association studies but cautioned that this sequencing approach is not suitable since it is too expensive. “It’s also worth noting that many diploid genomes are already available in [genome-wide association] studies,” he said, pointing to the HapMap samples, for which diploid blocks of DNA have been identified, although their entire sequences are not known.
Venter estimated that the cost of the project was at least $70 million, including $60 million from the Celera human genome project, which cost $100 million, and “at least an additional $10 million” worth of sequencing at the J. Craig Venter Institute.
“If we did the whole thing from scratch with Sanger sequencing, [the cost] would still be probably in the tens of millions,” Venter said.
When the project began at JCVI in 2003, the scientists decided to go with Sanger sequencing since at the time, it was “the most mature [technology],” Sam Levy, a senior scientist at the institute who headed the project, told In Sequence last week. “Had there been another sequencing technology available then, we may well have opted for it.”
Venter’s genome is not the only one in the making. In May, researchers from 454 Life Sciences and Baylor College of Medicine announced the completion of the genome of Jim Watson, using 454’s sequencing technology. They said the project — which generated 6X coverage of the genome — cost less than $1 million and took two months (see In Sequence 6/5/2007). The researchers have yet to publish their analysis of Watson’s genome.
Venter said his and Watson’s genomes mark the beginning of individualized genomics. “I think this number is going to scale up quite dramatically with some of the new techniques,” he said. “We think potentially we can go down to six weeks or less for doing relatively complete genome coverage,” he said, adding that “for some of the calculations, for example using George Church’s technology, it looks like it could be as low as $100,000 a genome right now.”
Church’s group at Harvard is developing a low-cost sequencing technology for the Personal Genome Project (see In Sequence 6/26/2007). For that project, Church plans to sequence one percent of each individual’s genome for $1,000 (see In Sequence 4/10/2007).
Venter said he believes 30 to 50 human genomes will be sequenced next year. “I expect there is going to be an exponential scale-up. Now that the cost has come way down, we are in a position to look at dozens of new technologies that offer a lot of excitement.”
He also expects at least 10,000 diploid human genomes will be deciphered over the next three to five years. But that might not even be enough to come to meaningful conclusions about how genotypes affect phenotypes.
“We need databases. The smallest we need is 10,000 human genomes; 10 million would be better,” Venter said. “And once we have those, we would be able to sort out basically every fundamental question about nature vs. nurture. What’s genetic? What’s environment? What are all the minor variations that show up in each of our genomes that actually contribute to our individual traits, risk for disease, et cetera?”
Information on minor variants cannot be gleaned from studies like the HapMap project, he said, since they only look at common variations.
JCVI researchers are currently building a genotype-phenotype database. According to Levy, they are working on a genome browser, which they are planning to launch in a few months. “That will enable individuals to come to our website, where you can see all these variants. We hope through that, we can add medical and other annotations to get to all the protein function,” he said.

The scientists assembled Venter’s diploid genome from scratch, modifying the existing Celera assembler to facilitate the identification of different alleles.
“This was a completely independently assembled human genome,” Venter said. “It was not small bits of genome layered on top of one of the composite genomes that are already in GenBank. Had we done that, we would have gotten a very different answer. We would have recapitulated the errors that are in that genome.”
However, not all future genome sequences have to be assembled de novo, he believes. “We should, after having several reference genomes, be able to maybe come up with algorithms for comparison that allow some of these shorter read sequences to be used effectively, but I think those are certainly some of the challenges we have out here as the technology moves forward,” Venter said.
“It’s a shame that the Sanger sequencing costs so much because having the long reads and the high accuracy has been a key part of the success of sequencing thus far, and I think it’s going to be a challenge, as we move into new technologies, to see if they are up for that.”
But not even his own genome is complete yet. “There are still hundreds of sequence gaps,” George Church told In Sequence. “Before we can have a true reference genome, we need to fill the gaps. Also, most of the changes observed are likely to be neutral, so we need accurate computational and experimental methods to determine which are meaningful.”
The JCVI scientists are currently using “some of the newer sequencing technologies” to close the gaps, according to Levy. “Once we have improved the coverage rate two- to three-fold more than where we are currently, we suspect this will be a complete genome in terms of covering all the gaps and providing an accurate definition of all the variants and the allelic contribution from both of Craig’s parents,” he said in this week’s press briefing.
“We expect that we are also missing very large DNA variations that are just very difficult to characterize, even by the existing methods we are using. With more sequence coverage, we believe that we will even capture those in the long run,” Levy told In Sequence. They also have not sequenced regions of heterochromatin, he said.
Creating additional coverage with cheaper technologies will also help the researchers decide whether a SNP is homozygous or heterozygous, something that the existing 7.5-fold coverage is not sufficient for, Levy said.
Levy said the scientists have already started to sequence some regions of Venter’s genome using 454’s technology. However, they are still eying other platforms. “We are keeping our options open as to where we go with future sequencing technologies,” Levy said.
Future genome sequencing projects will likely use new technologies, “and I am sure that all of 454, the Illumina, and AB SOLiD would be used in some form or another,” he said. What is going to be important is “which will end up being the least biased, which will produce the greatest number of reads, mappable back to a reference, and which will enable us to still characterize SNPs and indels.”
Why did Venter choose to sequence his own genome? Back in the Celera days, “I felt it was not correct for me to ask other people to volunteer to have their genome sequenced if I was not doing the same,” he said. “And also, I said many times, I cannot imagine a scientist working in this field not having an extensive curiosity about their own genetic code and their own history. It was a combination of those events that led to it.”
He does not seem to regret it. Rather, he wants to encourage others to do the same. “I was very pleased that Jim Watson has also taken a leadership role in putting his information on the Internet,” he said. Rather than fear the information, “we hope to help teach people they should welcome this information as a breath of fresh air that gives them opportunities in their lives to perhaps change things” to modify the impact of their genetic makeup.