1000 Genomes Project Data To Be Released Within Months as Pilot Phase Nears Completion

PHILADELPHIA (GenomeWeb News) – The 1000 Genomes Project collaborators plan to begin releasing data early next year and expect to finish sequencing 1,200 human genomes by around the end of 2009, project representative David Altshuler announced yesterday at the American Society of Human Genetics meeting here.
The team anticipates an official data release starting in January 2009, following a pilot data release this December, said Altshuler, an associate professor of genetics and medicine at Harvard Medical School and a lead investigator for the project. After January, new data will likely be released quarterly.
Meanwhile, the three 1000 Genomes pilot projects — which began in January and are aimed at achieving low coverage of 180 individuals, high coverage of two parent-offspring trios, and targeted sequencing of 1,000 genes in approximately 1,000 individuals — are nearing completion, Altshuler said. Those efforts seem to be generating high-quality data and have already uncovered new genetic variants, he added.
“We declared the pilot project very much a success at this point,” Altshuler told reporters at a press briefing yesterday.
So far, the 1000 Genomes Project has generated 3.8 terabases of data. This September and October, Altshuler said, the team deposited as much data each week as was present in GenBank when the effort began. In 2009, the project is expected to increase that output dramatically, producing a petabyte of data.
Along with the sequencing effort itself, Altshuler emphasized a need for developing shared data formats for different stages of the analysis. In the absence of standard formats or a clear framework for such analysis, he added, efforts to decipher the genetic information would be delayed. Consequently, team members are working to develop draft formats to aid this analysis.
The goal of the 1000 Genomes Project, an international effort, is to uncover the genetic variants that are present at a frequency of one percent or more in the human genome.
Some have suggested that the large-scale sequencing effort may also help researchers impute new information for the more than 100,000 genotyped genomes available already. While that has not been shown for rare variants, Altshuler explained, it is possible that it could add value to the multitude of samples already scanned with chips.
But beyond the direct implications for the 1000 Genomes Project, the effort has spurred researchers to pioneer and evaluate methods that benefit other research efforts as well. For instance, researchers have been working with high-throughput sequencing, developing new approaches for exchanging and analyzing data, discovering SNPs and CNVs, and making imputations based on next-generation sequence data.
Discussing the project at the press briefing yesterday, former National Human Genome Research Institute director Francis Collins noted that while the project itself is not aimed at linking genotypes to phenotypes, “It will be the engine of many follow up studies.”
