1000 Genomes Project Data To Be Released Within Months as Pilot Phase Nears Completion

PHILADELPHIA (GenomeWeb News) – The 1000 Genomes Project collaborators plan to begin releasing data early next year and expect to finish sequencing 1,200 human genomes by around the end of 2009, project representative David Altshuler announced yesterday at the American Society of Human Genetics meeting here.
The team anticipates an official data release starting in January 2009, following a pilot data release this December, said Altshuler, an associate professor of genetics and medicine at Harvard Medical School and a lead investigator for the project. After January, new data will likely be released quarterly.
Meanwhile, the three 1000 Genomes pilot projects — which began in January and are aimed at achieving low coverage of 180 individuals, high coverage of two parent-offspring trios, and targeted sequencing of 1,000 genes in approximately 1,000 individuals — are nearing completion, Altshuler said. Those efforts seem to be generating high-quality data and have already uncovered new genetic variants, he added.
“We declared the pilot project very much a success at this point,” Altshuler told reporters at a press briefing yesterday.
So far, the 1000 Genomes Project has generated 3.8 terabases of data. This September and October, Altshuler said, the team deposited as much data each week as was present in GenBank when the effort began. In 2009, the project is expected to increase that output dramatically, producing a petabyte of data.
Along with the sequencing effort itself, Altshuler emphasized a need for developing shared data formats for different stages of the analysis. In the absence of standard formats or a clear framework for such analysis, he added, efforts to decipher the genetic information would be delayed. Consequently, team members are working to develop draft formats to aid this analysis.
The goal of the 1000 Genomes Project, an international effort, is to uncover the genetic variants that are present at a frequency of one percent or more in the human genome.
Some have suggested that the large-scale sequencing effort may also help researchers impute new information for the more than 100,000 genotyped genomes available already. While that has not been shown for rare variants, Altshuler explained, it is possible that it could add value to the multitude of samples already scanned with chips.
But beyond the direct implications for the 1000 Genomes Project, the effort has spurred researchers to pioneer and evaluate methods that benefit other research efforts as well. For instance, researchers have been working with high-throughput sequencing and developing new approaches for exchanging and analyzing data, discovering SNPs and CNVs, and making imputations based on next-generation sequence data.
Discussing the project at the press briefing yesterday, former National Human Genome Research Institute director Francis Collins noted that while the project itself is not aimed at linking genotypes to phenotypes, “It will be the engine of many follow up studies.”
