NEW YORK (GenomeWeb News) — An international team led by researchers at Seoul National University has sequenced and annotated the genome of a Korean individual — the second Korean genome to be sequenced in recent months and the seventh complete human genome to be published in the past two years.
In a paper published in this week's Nature, the researchers, led by Jeong-Sun Seo of the Genomic Medicine Institute at Seoul National University, describe how they used a combination of whole-genome shotgun sequencing, targeted bacterial artificial chromosome sequencing, and microarray-based comparative genomic hybridization analysis to analyze and annotate the genome of the individual, known as AK1.
"This combination of approaches improved the accuracy of SNP, indel, and CNV detection, and will assist in the assembly of contiguous sequences," the researchers wrote in the paper.
The team used two strategies for DNA sequencing on the Illumina Genome Analyzer. First, they sequenced 390 genomic regions known to harbor copy number variants at an average of 151-fold coverage. Next, they sequenced the entire genome at an average depth of 27.8-fold. The sequencing took six weeks and cost $200,000, the researchers said.
After aligning the genome to the National Center for Biotechnology Information's reference, the team determined that there were around 3.45 million SNPs in AK1's genome, of which around 590,000 were novel and 10,162 were non-synonymous.
These results are in line with the findings of researchers from Korea's Gachon University of Medicine and Science and the Korean BioInformation Center, who published the genome of another Korean individual, SJK, in Genome Research in May. That team, which also used an Illumina Genome Analyzer, identified about 3.4 million SNPs in SJK's genome, of which about 420,000 were novel and around 9,500 were non-synonymous.
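To make the comparison between the two Korean genomes concrete, the short Python sketch below converts the reported counts into proportions. It uses only the rounded figures quoted above; the variable and function names are illustrative and do not come from either paper.

```python
# Rounded SNP counts as reported in the article for the two Korean genomes.
genomes = {
    "AK1": {"snps": 3_450_000, "novel": 590_000, "nonsyn": 10_162},
    "SJK": {"snps": 3_400_000, "novel": 420_000, "nonsyn": 9_500},
}


def fraction_summary(counts):
    """Return novel and non-synonymous SNPs as percentages of all SNPs."""
    total = counts["snps"]
    return {
        "novel_pct": 100 * counts["novel"] / total,
        "nonsyn_pct": 100 * counts["nonsyn"] / total,
    }


summary = {name: fraction_summary(c) for name, c in genomes.items()}

for name, s in summary.items():
    print(f"{name}: {s['novel_pct']:.1f}% novel, "
          f"{s['nonsyn_pct']:.2f}% non-synonymous")
```

On these rounded inputs, roughly 17 percent of AK1's SNPs and 12 percent of SJK's were novel, while the non-synonymous share is just under 0.3 percent in both genomes.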
Seo and his colleagues also compared AK1's genome to the previously published genomes of James Watson, Craig Venter, a Chinese individual dubbed YH, and the Yoruban HapMap sample known as NA18507. They found that the number of SNPs in AK1's genome was "similar" to that in Watson's, higher than in the Venter and YH genomes, and lower than in NA18507, "which may reflect differences in technical procedures or inter-individual variability."
Venter's genome was sequenced using Sanger technology, Watson's genome was analyzed with the Roche/454 Genome Sequencer, and both the Han Chinese and Yoruban genomes were sequenced on Illumina's Genome Analyzer.
In a comparison of the 9.5 million SNPs detected across the five sequenced genomes, 21 percent were unique to AK1 and 8 percent were shared by all five, the researchers wrote. Around 2.1 million AK1 SNPs were heterozygous, "yielding a higher SNP diversity than in the Venter, Watson or YH genomes, but less than the Yoruba individual."
The researchers noted that "sequencing of other genomes using uniform technical procedures is warranted to evaluate the proportion of genetic variance explained by differences within and between human populations."
The researchers detected 170,202 indels in AK1's genome, ranging in size from -29 to +5 nucleotides. That is fewer than the Gachon University team reported for SJK, in which it identified 342,965 indels ranging from -29 to +14 nucleotides.
The Seoul National University team also used several "complementary" approaches to detect CNVs, including deep sequencing, a custom-designed CGH array with more than 24 million probes, and genotyping arrays. They initially identified 1,237 CNV regions, but used "conservative" criteria to whittle that down to 238 deletions ranging from 277 bases to 196,900 bases and totaling 2.4 megabases, and 77 copy number gains, totaling 7 megabases. Of these CNVs, 148 of the deletions and 33 of the gains were not in the Database of Genomic Variants and are considered to be novel.
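The CNV figures can be tallied with a few lines of Python. The sketch below uses only the counts and totals reported above and treats a megabase as one million bases. The dictionary layout and names are illustrative, and the mean sizes are simple averages derived from the rounded totals rather than values stated by the researchers.

```python
# CNV calls retained after the team's "conservative" filtering, as reported.
cnvs = {
    "deletions": {"count": 238, "novel": 148, "total_bases": 2_400_000},
    "gains": {"count": 77, "novel": 33, "total_bases": 7_000_000},
}

retained_total = sum(c["count"] for c in cnvs.values())
novel_total = sum(c["novel"] for c in cnvs.values())
novel_pct = 100 * novel_total / retained_total

# Simple average size per class, derived from the rounded megabase totals.
mean_size = {kind: c["total_bases"] / c["count"] for kind, c in cnvs.items()}

print(f"Retained CNVs: {retained_total}, novel: {novel_total} ({novel_pct:.1f}%)")
for kind, size in mean_size.items():
    print(f"Mean size of {kind}: about {size / 1000:.1f} kilobases")
```

By this count the team retained 315 CNVs, of which 181, or a little more than half, were absent from the Database of Genomic Variants. The gains were far larger on average than the deletions.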