BGI Finds Illumina GA Suitable for Human Genome Reseq; Will Publish Results Soon

Researchers at the Beijing Genomics Institute in Shenzhen have analyzed the genome of an anonymous Chinese man whom they sequenced a year ago using Illumina’s Genome Analyzer.
According to their analysis, which is scheduled to appear in an upcoming issue of the journal Nature, short-read sequencing technologies are well suited for sequencing large eukaryotic genomes, as long as a reference genome sequence is available.
BGI first started talking about the project, at the time dubbed the “First Asian Diploid Genome Project,” a year ago (see In Sequence 9/25/2007). The study is part of the larger Yanhuang project, which aims to sequence at least 100 Chinese individuals over three years. The project, announced in January (see In Sequence 1/8/2008), aims to study genetic polymorphisms in the Chinese population.
Last week, Laurie Goodman, a contract public information officer and editor for BGI-Shenzhen, presented results from the analysis of the first genome at Cambridge Healthtech Institute’s Exploring Next-Generation Sequencing conference in Providence, RI.
Goodman spoke on behalf of Jun Wang, associate director of BGI-Shenzhen, who was unable to obtain a visa to attend the conference. According to Goodman, the full cost of the project was approximately $500,000. The project has been “accepted in principle” by the journal Nature, she said. BGI-Shenzhen announced the acceptance of the manuscript on its website earlier this month.
Goodman said it took BGI scientists approximately two months to generate the data on five Illumina Genome Analyzers, each week yielding between 4 and 8 gigabases of high-quality data.
In total, the scientists generated 3.3 billion reads on the instruments, which they mapped against the NCBI reference genome, covering approximately 99.97 percent of it.
The total coverage of the genome was 36-fold. Of that, 22.5-fold coverage came from unpaired reads and 13.5-fold coverage from paired-end reads. The data covered the autosomes to 34-fold depth, and the X and Y chromosomes to 19-fold depth. The read length varied from 25 to 44 base pairs, though the majority of reads were 35 base pairs long.
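As a rough check on how these coverage figures fit together, the arithmetic can be sketched in a few lines of Python. The genome size used below (about 3.1 gigabases) is an assumption for illustration and was not part of the presentation. The estimate also ignores unmapped or filtered reads and the 25-to-44-base-pair spread in read length, so it should only be expected to land in the neighborhood of the reported figure.

```python
# Coverage figures quoted in the article.
unpaired_fold = 22.5
paired_end_fold = 13.5
total_fold = unpaired_fold + paired_end_fold  # the article's 36-fold total

# Back-of-envelope depth from read count and typical read length.
reads = 3.3e9              # reads generated, per the article
typical_read_bp = 35       # most reads were 35 bp, per the article
ASSUMED_GENOME_BP = 3.1e9  # assumption: approximate human genome size, not stated in the article

raw_fold_estimate = reads * typical_read_bp / ASSUMED_GENOME_BP

print(f"reported total coverage: {total_fold:.1f}-fold")
print(f"rough raw-depth estimate: {raw_fold_estimate:.1f}-fold")
```

Under these assumptions, the raw estimate comes out slightly above the reported 36-fold figure, which is what one would expect if some reads were shorter than 35 base pairs or did not map.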
The researchers identified more than 3 million SNPs in the genome, of which 13.6 percent were not contained in dbSNP. In addition, they discovered 135,000 small indels one to three base pairs in size, as well as 2,682 structural variants.
A comparison with the SNPs identified in the published genome sequences of Craig Venter and Jim Watson showed that the three genomes share approximately 1.2 million SNPs.
The researchers also compared their results against SNPs discovered using the Illumina HapMap 1M BeadChip and found that they covered approximately 99.22 percent of those SNPs by sequencing. They also validated SNPs that were inconsistent between the two platforms by PCR-based Sanger sequencing and found that for more than 80 percent of the inconsistencies, the Illumina sequence data were accurate.
Unsurprisingly, an analysis of the genetic background of the Chinese donor revealed that he is 94 percent Asian.
The researchers concluded that the Illumina sequencing technology is well suited to resequence large eukaryotic genomes, such as the human genome, as long as a reference sequence is available.
They found that the technology allows for “extremely accurate” detection of SNPs and of insertions or deletions up to 3 base pairs in size. However, a mix of long and short reads is still required to detect longer inserts.
BGI-Shenzhen is now applying the experience it gained from its first genome project to the 1000 Genomes Project, where it is responsible for sequencing Asian HapMap individuals as part of one of the pilot studies.
