A team of researchers in China is using Illumina’s next-generation sequencing technology to map the genome of a Chinese individual, In Sequence has learned.
The project, dubbed the “First Asian Diploid Genome Project,” may be publicly presented as early as next month. It is expected to pave the way for a larger undertaking to sequence multiple human genomes, possibly in collaboration with other research institutes outside of China, according to a project spokesperson.
The genome sequence will “clearly provide significant experience, as well as being interesting and important in its own right,” Richard Durbin told In Sequence by e-mail last week. As a principal investigator at the Wellcome Trust Sanger Institute, Durbin, who has been collaborating with the Beijing Genomics Institute on another project, focuses on resequencing human genomes using new sequencing technologies.
More than 100 researchers from three Chinese institutes are participating in the genome project: the Shenzhen campus of the Beijing Genomics Institute; the National Engineering Research Center of Bioinformatics Systems; and the Beijing Institute of Genomics of the Chinese Academy of Sciences.
Between them, these institutes own seven Illumina Genome Analyzers, which they acquired about six months ago, according to Ye Jia, a spokesperson for the project who works at BGI-SZ.
Besides producing sequence data, the scientists are developing “all downstream software” for the project, such as for sequence alignment and assembly, and for identifying variations, according to Jia.
The scientists have not yet decided on the exact fold-coverage, but Jia stressed that “it must be a high coverage.”
The researchers, who will submit the sequence data to GenBank, hope to present a first version of the genome map at the China Hi-Tech Fair in Shenzhen next month, she said, although “we are not sure of the exact time to finish it.”
The estimated cost for the project, which is primarily funded by the government, is $2 million, according to Jia.
The DNA donor is a Chinese man, she said, but “we don’t think it’s time to reveal his identity now.” He is among the first few individuals to have his genome sequenced: Craig Venter published an analysis of his genome last month in PLoS Biology (see In Sequence 9/4/2007), using standard Sanger sequencing technology; and Jim Watson was presented with his genome sequence by researchers from Baylor College of Medicine and 454 Life Sciences in May (see In Sequence 6/5/2007).
Illumina has been sequencing an African HapMap individual. Earlier this year, the company said it had generated 4X coverage with single reads and planned to add paired reads. As of May, the project was still ongoing.
The greatest challenge for the project so far has been the fact that next-generation sequencing technology is still so new, Jia said. “There are many difficulties, such as library construction, the balance between quality and quantity, variation analysis, and so on.”
Jia said that the research teams decided to sequence another individual, even though two such genome sequences already exist, because “it is a demonstration of the new technology and a start toward finding all common polymorphisms.”
Indeed, this project is only the start of a potentially much larger endeavor.
“We are planning to sequence more individual genomes in the future,” Jia said.
It is not yet clear, however, which sequencing platform would be used for a larger project. While the current project is proceeding with Illumina’s technology, “I don’t know whether we will use other sequencing machines in the next stage,” Jia said.
In fact, the Chinese researchers are collaborating with the Sanger Institute on finding the best strategy for scaling up. “We are exchanging information in our evaluation of various platforms, and also discussing strategies for approaching sequencing multiple human genomes, which we are also discussing with others,” Durbin told In Sequence last week via e-mail.
Durbin has an ongoing collaboration with BGI, which developed from his work with Wang Jun on the TreeFam database of animal gene trees.
However, he said, “It is really too early to discuss a specific project with goals and deliverables. The final sequencing technology/ies to be used are also still open questions.”
Last week, BGI-SZ’s website stated that the “Asian Diploid Genome Project” is “the first step in our collaboration with the Wellcome Trust Sanger Institute on sequencing a thousand human genomes.”
Jia did not have any information on this collaboration, and the statement has since disappeared from the institute’s website.
The Beijing Genomics Institute’s Shenzhen branch opened earlier this year. It is a collaboration among the Beijing Institute of Genomics of the Chinese Academy of Sciences, BGI’s Hangzhou branch, BGI, “and other BGI-affiliated institutes,” according to its website.
The institute is equipped with Illumina’s Genome Analyzer, ABI sequencers, and GE Healthcare Megabace sequencers and has a sequencing capacity of 2 billion raw bases per day.
BGI-SZ researchers are currently developing a number of applications for the Illumina sequencer, including whole-genome sequencing, de novo sequencing, genomic tag sequencing, methylation, miRNA discovery and profiling, gene expression profiling, transcriptome sequencing, targeted sequencing of genes and genomic regions, and ChIP-seq.
The institute’s bioinformatics center, which has more than 50 researchers, “aims to support large-scale sequencing,” according to the BGI-SZ website. Its computer department owns a 4.5-teraflop Linux cluster and has a data storage capacity of 300 terabytes.
Besides the Asian human genome project, the center’s website lists two other ongoing research projects.
In the first, titled “High Altitude Genome Research,” scientists plan to sequence and analyze the genomes of humans living in the high-altitude Qinghai-Tibet Plateau area. “The genomic studies of human population[s] who live in such areas will also help us understand how they react and adapt their bodies to such extreme environment[s] at high altitudes,” according to the website.
In addition, the “Panda Genome Project” aims to sequence up to 90 percent of the 3-gigabase panda genome using a combination of Sanger sequencing and Illumina’s sequencing platform. The institute also plans to generate a “digital whole-genome expression map” for the panda, according to the website.