‘1,000 Genomes’ Challenges Data Flow

Managing and analyzing the data from the recently announced 1,000 Genomes Project will pose a number of challenges, some related to the nature of the next-generation sequencing platforms, but organizers say the results will boost both medical research and scientists' understanding of human evolutionary history.

The study consortium, which aims to sequence at least 1,000 and up to 2,000 human genomes within three years, includes two data-related working groups. A data flow group will be responsible for collecting and archiving the sequence reads, helping map them to a reference genome, and making the data available to the research community in different formats and levels of detail. Meanwhile, an analysis group will focus on aligning the reads, reconstructing the 1,000 genomes from the data, calling genetic variants, and interpreting the results.

A major challenge will be the "sheer volume of data," according to Gil McVean, a professor of statistical genetics at the University of Oxford, who co-chairs the analysis group. The consortium expects the study to produce on the order of 6 terabases of data, or 60 times the sequence data that has been deposited in public DNA databases over the last 25 years.

According to McVean, the analytical tasks fall into three broad areas: technology-related tasks that focus on translating the raw data into DNA sequence and mapping the sequence reads to a reference genome; calling genetic variants such as SNPs and structural variations and reconstructing individual genomes; and using the results to help disease studies and other research projects.

On the technology side, data analysis experts have to grapple with the fact that the nature of the data produced by existing next-generation sequencers is still in flux. "The data that comes out of the machines is changing pretty much month by month as the engineering improves," McVean said.

Julia Karow

Sequencing Notes

Knome, a personal whole-genome sequencing startup, lined up its first two customers and announced a partnership with the Beijing Genomics Institute. Based in Cambridge, Mass., Knome offers whole-genome sequencing and genomic analysis for $350,000.

DNA sequencing technology startup Genome Corp. raised $250,000 in venture funding to continue work on its massively parallel Sanger sequencing tool. It also named three founding members of its scientific advisory board: Norm Dovichi, Annelise Barron, and Patrick Doyle.

Pacific Biosciences said it expects to commercialize a next-gen sequencer by 2010 that could eventually generate 100 gigabases of sequence per hour, or 10x coverage of a human genome in 15 minutes.

Datapoint

$12 million
NHLBI and NHGRI set aside $12 million in grants for developing cheaper methods of exon sequencing.

Funded grants

$219,331/FY2007
Sequencing DNA by transverse electrical measurements in nanochannels
Grantee: Robert Riehn, North Carolina State University
Began: Aug. 1, 2007; Ends: July 31, 2009
With this exploratory grant, Riehn and his team plan to build a sequencing technology using stretched and linearized DNA in nanofluidic channels, and detection using nanoelectrodes, according to the grant abstract. Riehn says this kind of technology would enable ultralong read frames of more than 100 kilobases.

$140,625/FY2007
Exon Specific Sequencing of Whole Genomic DNA
Grantee: Darren Link, RainDance Technologies
Began: July 1, 2007; Ends: June 30, 2009
Link says his long-term goal is to build a way to simultaneously sequence thousands of different exons from a genomic DNA sample with 30 to 50 times coverage of each exon. The technology will be based on RainDance's microfluidics platform and 454 Life Sciences sequencing, according to the abstract.
