A team of researchers led by Craig Venter published an analysis of his genome last month — the first diploid genome of a known individual to be sequenced and published. While Venter hailed the study as an important step toward individualized genomics, he cautioned that there are still questions regarding the cost and accuracy of new sequencing methods that will need to be addressed before the field takes off.
The genome, published in PLoS Biology, is “probably the first and last” individual genome to be sequenced by Sanger technology because of the “cost and time involved,” Venter said in a press briefing.
Using new sequencing techniques, Venter said, will be “cheaper, but probably at the cost of accuracy and completeness.”
Venter said that the variation in his genome was “much greater than expected.” In total, his genome had 4.1 million DNA variants. Of these, 78 percent were SNPs, but the remaining non-SNP variants accounted for three-quarters of all variant bases.
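To put those percentages in perspective, the following Python sketch converts them into approximate counts. It is back-of-the-envelope arithmetic only: the function and variable names are illustrative, it assumes each SNP contributes exactly one variant base, and the results are rounded estimates derived from the figures quoted here rather than numbers reported in the paper.

```python
# Rough arithmetic from the figures quoted in the article (not from the paper itself).
TOTAL_VARIANTS = 4_100_000   # "4.1 million DNA variants"
SNP_FRACTION = 0.78          # "78 percent were SNPs"
NON_SNP_BASE_SHARE = 0.75    # non-SNP variants: "three-quarters of all variant bases"


def variant_breakdown(total=TOTAL_VARIANTS,
                      snp_fraction=SNP_FRACTION,
                      non_snp_base_share=NON_SNP_BASE_SHARE):
    """Estimate variant counts and variant bases.

    Assumption (ours, not the article's): every SNP is one variant base,
    so SNP bases make up the remaining share of all variant bases.
    """
    snp_count = round(total * snp_fraction)
    non_snp_count = total - snp_count
    snp_bases = snp_count  # one base per SNP (assumption)
    total_variant_bases = round(snp_bases / (1 - non_snp_base_share))
    non_snp_bases = total_variant_bases - snp_bases
    return {
        "snp_count": snp_count,
        "non_snp_count": non_snp_count,
        "total_variant_bases": total_variant_bases,
        "non_snp_bases": non_snp_bases,
        "mean_non_snp_length": non_snp_bases / non_snp_count,
    }


if __name__ == "__main__":
    for name, value in variant_breakdown().items():
        print(f"{name}: {value:,.1f}")
```

Under these assumptions, roughly 3.2 million SNPs and 0.9 million other variants would span about 12.8 million variant bases, with the non-SNP variants averaging around 10 bases each.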
But the genome sequence, a “high-quality draft” generated by whole-genome shotgun sequencing with Sanger technology, is not yet finished. To close the gaps and improve the assembly of haplotypes, the scientists are now generating additional data using next-generation sequencing technologies.
The genome, which the scientists hope will become a reference for future individual genomes, was long in the making. Part of the data came from Celera Genomics, which published a human genome in 2001. Sixty percent of that genome was Venter’s own. Since 2003, the Venter Institute has been supplementing this data with additional Sanger reads, resulting in the 32 million reads that were used in the assembly of his diploid genome.
Venter estimated that the cost of the project was at least $70 million, including $60 million from the Celera human genome project, which cost $100 million, and “at least an additional $10 million” worth of sequencing at the J. Craig Venter Institute.
— Julia Karow
Kansas State University has provided $1 million to fund a four-year Sorghum Translational Genomics program. The effort, led by Frank White, will be carried out in conjunction with Cornell University, Texas A&M University, and the US Department of Agriculture. Researchers plan to sequence eight diverse sorghum strains to 4x to 8x coverage, focusing on 250 megabases of the sorghum genome that are thought to be expressed; that information will then be mined for SNPs.
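For a sense of scale, the sketch below estimates how much raw sequence the sorghum plan implies. It uses the standard definition of coverage (total sequenced bases divided by target size); the function name is illustrative, and it assumes every strain is sequenced over the same 250-megabase target, which the announcement does not spell out.

```python
# Estimate raw sequence needed for the sorghum plan described above.
TARGET_MEGABASES = 250   # expressed portion of the sorghum genome
STRAINS = 8              # "eight diverse sorghum strains"
COVERAGE_RANGE = (4, 8)  # "4x to 8x coverage"


def required_megabases(target_mb, coverage, strains=1):
    """Return megabases of sequence needed: target size x coverage x strains."""
    return target_mb * coverage * strains


if __name__ == "__main__":
    for coverage in COVERAGE_RANGE:
        per_strain = required_megabases(TARGET_MEGABASES, coverage)
        total = required_megabases(TARGET_MEGABASES, coverage, STRAINS)
        print(f"{coverage}x: {per_strain:,} Mb per strain, {total:,} Mb for all strains")
```

On these assumptions the project would need between 1 and 2 gigabases per strain, or 8 to 16 gigabases across all eight strains.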
454 Life Sciences announced the launch of new products, including new plate formats and software, for the ultra-high-throughput Genome Sequencer FLX System. The additions expand the versatility of the GS FLX system and address two key customer requests: decreased cost per read and improved analysis tools.
Researchers at the University of Georgia Warnell School of Forestry and Natural Resources will analyze conifer gene expression under the Department of Energy Joint Genome Institute’s Community Sequencing Program. The project takes the place of sequencing a complete conifer genome, which is unusually large.
Applied Biosystems launched a program to help third-party software developers create bioinformatics tools for its SOLiD next-generation sequencing system. The company is expanding its Software Development Community to include sample data sets, data file formats, and data conversion tools for the SOLiD system.
US Patent 7,263,444. System, method and computer program for non-binary sequence comparison. Inventor: Jeffrey Clark. Assignee: BioInformatica. Issued: August 28, 2007.
According to the abstract, this patent covers “a system and method for performing non-binary comparison of biological sequences includes a new measure ω₀, which is a non-binary counting measure that is used in a stand alone module called VaSSA-1. This measure obtains substantially more information about sequences and comparisons between them than is gathered by conventional bioinformatics techniques.”
US Patent 7,257,562. High throughput method for discovery of gene clusters. Inventors: Chris Farnet, Alfredo Staffa, Emmanuel Zazopoulos. Assignee: Thallion Pharmaceuticals. Issued: August 14, 2007.
In this invention, “fragments in [a] small insert library are sequenced and compared by homology comparison under computer control to a database containing genes, gene fragments or proteins known to be involved in the biosynthesis of microbial natural products.”
The National Cancer Institute plans to award around $2 million in fiscal year 2008 to support new data analysis and visualization technologies for the Cancer Genome Atlas project.