PGP to Publish Initial Data Sets Next Month As Church Predicts $1,000 Genome in 2009

NEW HAVEN, Conn. — Harvard Medical School’s Personal Genome Project expects to publish initial data from its first 10 participants on its website in late October, according to George Church, the project's principal investigator.
Data from the study, which aims to gather and correlate genomic and phenotypic information from thousands of individuals, will include DNA sequence information on 50,000 exons.
Church, a professor of genetics and director of the Center for Computational Genetics at Harvard Medical School, talked about the project at a symposium on current second-generation DNA sequencing technologies at Yale University last week.
He also addressed advances in sequencing technologies that he and his colleagues have been developing, and predicted that scientists will achieve the goal of a $1,000 human genome next year, but did not say what platform would enable that milestone.
The first phase of the Personal Genome Project, PGP1, kicked off about a year ago with 10 volunteers who agreed to have their genomes analyzed and to provide physical-trait and medical information (see In Sequence 7/31/2007).
Those volunteers, who met for the first time in July 2007, and researchers involved in the project are scheduled to gather again at Harvard Medical School on Oct. 20 to review the data generated so far. If the volunteers give their final permission, these data will go up on the PGP website the following day.
The data gathered so far include SNPs and copy-number variation information from Affymetrix genotyping arrays, DNA sequences from 50,000 exons, as well as medical and non-medical trait data, according to Church.
Other measurements are still in progress: The researchers are generating gene-expression data from fibroblasts derived from volunteers' skin samples, as well as from induced pluripotent stem cells and differentiated cells.
In order to selectively sequence 50,000 exons, Church and his colleagues developed a capture method that uses molecular inversion probe-like oligos, which they published about a year ago in Nature Methods (see In Sequence 10/16/2007).
At the time, only 10,000 of the 55,000 targeted exons were detectable by sequencing. Last week, Church said that since the publication, the scientists have improved the capture efficiency by more than 100,000-fold, for example by optimizing the length of the extension and ligation arms of the probes, and by adjusting hybridization times and reagent concentrations. Also, a greater percentage of the probes — 98 percent — is now on target than before.

According to Church, most of the PGP sequence data so far has been generated by either the Illumina Genome Analyzer or by the open-source Polonator platform that his lab developed. The Dover business of Danaher Motion, working in collaboration with the Church lab, began selling the Polonator earlier this year for approximately $150,000 (see In Sequence 2/5/2008).
Danaher and the Church lab have not yet revealed performance characteristics for the instrument, such as accuracy and throughput. Danaher has shipped the instrument to a small number of customers, including the Broad Institute and the Max Planck Institute for Molecular Genetics in Berlin, in addition to the Church lab.
The initial version of the system uses the polony sequencing-by-ligation method, which generates two 13-base tags, each consisting of a 7-base stretch and a 6-base stretch that are separated by a 4- to 5-base gap. Church mentioned, though, that his group has been working on alternative chemistries, both for polymerase-based and for ligation-based sequencing.
One such chemistry uses multiple rounds of ligation, he said. Another, a polymerase-based chemistry, is being developed by a postdoc from Jingyue Ju's lab at Columbia University and delivers read lengths similar to those obtained from commercial platforms that use reversible terminators, he said. Church told In Sequence that Intelligent Bio-Systems, which licensed reversible terminator sequencing-by-synthesis chemistry developed by Ju's lab in late 2006, is "not directly" involved in this project (see In Sequence 10/23/2007).
Church predicted in his talk last week that the cost of sequencing will continue to fall rapidly. By mid-October, he said, he is "pretty confident" that a human genome at 36-fold coverage will be available for $5,000. He did not say whether that data will come from his own lab or from a company developing sequencing technology. Church is an advisor to a number of such firms, including Complete Genomics, Intelligent Bio-Systems, LightSpeed Genomics, and Helicos BioSciences.
He also said he believes that the so-called "$1,000 human genome" will be available by 2009, adding that "you guys can hold me to this."

Following PGP1, the second stage of the project is already underway. This spring, Harvard Medical School's Institutional Review Board approved a scale-up of the project to 100,000 volunteers. According to the project's website, study organizers are in the process of developing educational tools and resources for potential volunteers, and will enroll additional participants in the second half of this year. Church told In Sequence that approximately 5,000 volunteers are currently "queued up at the entrance exam stage."
