Korean Genome Project Data May Be Useful for Cancer, Other Disease Studies

NEW YORK – Researchers in South Korea, the US, and the UK have released an initial set of data from the Korean Genome Project (Korea1K), including Korean-specific genome variation patterns, which they said can be a useful resource for clinical and ethnogenetic studies.

The first phase of Korea1K includes 1,094 whole genomes, sequenced at an average depth of 31x, paired with data on 79 quantitative clinical traits, the researchers reported in a study published on Wednesday in Science Advances. They identified 39 million single nucleotide variants and indels, of which half were singletons or doubletons, meaning they are extremely rare.
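To make the terminology concrete: a singleton is a variant whose alternate allele is seen once in the whole cohort, and a doubleton is one seen twice. The following is a minimal Python sketch of that tally. The allele counts are made-up toy values and the helper names are hypothetical; none of it is taken from the study's pipeline.

```python
from collections import Counter


def classify_by_allele_count(allele_count):
    """Label a variant by how often its alternate allele appears in the cohort."""
    if allele_count == 1:
        return "singleton"
    if allele_count == 2:
        return "doubleton"
    return "other"


def rare_class_fraction(allele_counts):
    """Return the fraction of variants that are singletons or doubletons."""
    if not allele_counts:
        raise ValueError("allele_counts must not be empty")
    labels = Counter(classify_by_allele_count(ac) for ac in allele_counts)
    rare = labels["singleton"] + labels["doubleton"]
    return rare / len(allele_counts)


# Toy alternate-allele counts for eight variants (illustrative only).
toy_counts = [1, 1, 2, 5, 40, 1, 2, 300]
print(rare_class_fraction(toy_counts))
```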

"Also, Korea1K, as a reference, showed better imputation accuracy for Koreans than the [1000 Genomes (1KGP)] panel," the authors added. "As proof of utility, germline variants in cancer samples could be filtered out more effectively when the Korea1K variome was used as a panel of normals compared to non-Korean variome sets."
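The panel-of-normals idea in that quote can be illustrated with a short sketch: any variant called in a tumor sample that also appears in a population variome is treated as a likely germline variant and set aside. The Python below is only a toy illustration under that assumption. The function name, the variant tuples, and the coordinates are all hypothetical, and the article does not describe the filtering the authors actually ran.

```python
def filter_with_panel_of_normals(tumor_calls, panel_of_normals):
    """Keep only tumor calls that are absent from the panel of normals.

    Each variant is a (chromosome, position, ref, alt) tuple.
    """
    panel = set(panel_of_normals)  # set membership makes each lookup O(1)
    return [variant for variant in tumor_calls if variant not in panel]


# Hypothetical variants; coordinates are invented for illustration.
panel = [
    ("chr1", 1000, "A", "G"),
    ("chr2", 2000, "C", "T"),
]
tumor = [
    ("chr1", 1000, "A", "G"),  # also in the panel, so treated as germline
    ("chr3", 3000, "G", "A"),  # absent from the panel, so kept as a somatic candidate
]

print(filter_with_panel_of_normals(tumor, panel))
```

In this picture, a panel built from the same population as the patient would be expected to contain more of that patient's germline variants, leaving fewer false somatic candidates. That matches the comparison the authors describe.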

Of the 1,094 Korean genomes in the dataset, 1,007 genomes were newly generated, the researchers said, and they combined these data with systematically acquired clinical and biochemical measurements from the blood and urine of the participants. They characterized SNVs, indels, copy number variations, transposable element (TE) insertion, and human leukocyte antigen (HLA) type in the Korean population and contrasted the Korean data with similar data from other populations.

Approximately half of the variants they identified were classified as singletons or doubletons. Surprisingly, more than 70 percent of them had not been previously reported in dbSNP, and less than 20 percent of the variants were classified as very common. Regarding indels, the researchers observed more deletions than insertions, possibly resulting from skewed variant calling.

They also found 35 drug response variants annotated in ClinVar. Eleven of them had significantly different allele frequencies compared to Chinese or Japanese individuals in the 1KGP set, highlighting the importance of population-specific datasets when interpreting pathogenic or drug-response variants. For example, the variant rs4961 in the ADD1 gene was more frequent in the Korea1K dataset than in any of the other populations. That variant is associated with hypertension and responsiveness to furosemide and spironolactone, as shown in a European study, but the researchers found no significant association with blood pressure in the GWAS they performed using the Korea1K set.
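A comparison of allele frequencies between two populations can be sketched as a test on a 2x2 table of alternate versus reference allele counts. The article does not say which statistical test the researchers used, so the Python below assumes a plain chi-square test with one degree of freedom, and the counts are invented for illustration only.

```python
import math


def allele_frequency_chi_square(alt_a, total_a, alt_b, total_b):
    """Chi-square test (1 degree of freedom) on alt/ref allele counts.

    alt_* is the number of alternate alleles and total_* the number of
    alleles examined in each population. Returns (chi2, p_value).
    """
    ref_a = total_a - alt_a
    ref_b = total_b - alt_b
    n = total_a + total_b
    denominator = total_a * total_b * (alt_a + alt_b) * (ref_a + ref_b)
    if denominator == 0:
        raise ValueError("each row and column of the table must be non-zero")
    chi2 = n * (alt_a * ref_b - ref_a * alt_b) ** 2 / denominator
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Invented counts: 300 of 1,000 alleles in one population, 200 of 1,000 in another.
chi2, p_value = allele_frequency_chi_square(300, 1000, 200, 1000)
print(chi2, p_value)
```

When many variants are compared at once, as with the 35 here, a real analysis would also need to correct for multiple testing.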

Overall, the researchers noted, the current sample size for the dataset is still insufficient to represent the Korean population or to map latent genomic structural variations.

"Our investigation of using Korea1K as a panel of normals for cancer genomics studies can be a small stepping stone for an efficient germline prefiltering process for cancer genome analyses in the future," they also wrote. "However, it is still questionable how much actual benefit such ethnicity-specific variome-based filtering can bring to cancer genome analyses in real clinical settings, especially for rare or individual-specific variant analysis."

Still, the researchers added, the large-scale Korean variome database contained in the Korea1K reference is potentially applicable in studies on various cancers and other diseases in the Korean population, and could indirectly help reduce the cost of certain genetic analyses.

"This kind of personal whole-genome dataset combined with common health check–derived clinical information is possibly a good exemplary path for an ethnicity-relevant reference panel for future personalized medical applications for Koreans," the authors concluded.
