Deep Sequencing of MHC Region in Han Chinese Enables Database for Future Studies

NEW YORK (GenomeWeb) – A team of Anhui Medical University-led researchers has deeply sequenced the major histocompatibility complex region in more than 20,000 people of Han Chinese descent.

The MHC region is notoriously variable as well as linked to susceptibility to a number of diseases, including immune disorders. Anhui's Xuejun Zhang and his colleagues sequenced this entire five-megabase region in some 9,900 psoriasis patients and 10,700 controls, all of Han Chinese ancestry. As they reported in Nature Genetics today, the researchers uncovered features of the MHC region that appear specific to the Han Chinese population and developed a reference panel. In addition, they verified a known psoriasis susceptibility allele and uncovered additional novel ones in the region.

"We anticipate that our Han-MHC database will serve as a useful tool for future studies evaluating the genetic basis of MHC-associated diseases in the Han Chinese population," Zhang and his colleagues wrote in their paper. "As there are marked differences in the haplotypes associated with many autoimmune diseases between populations of Chinese and European ancestry, future studies comparing these haplotypes may also provide valuable insights into the molecular basis and mechanisms involved in complex diseases."

Using a target capture array, Zhang and his colleagues sequenced the MHC region stretching from upstream of HLA-A to downstream of HLA-DPB1 in 20,635 Han Chinese. In the 10,689 healthy individuals, the researchers reported reaching 55X sequencing coverage.

Through re-sequencing, the researchers examined SNPs, short indels, and HLA types in the region, identifying between 5,872 and 27,438 variants relative to the reference haplotype PGF. On average, they reported, each individual harbored 14,680 SNPs and 2,075 indels. Rare alleles and low-frequency alleles accounted for 79 percent and 8 percent of these variants, respectively, while only 13 percent were common alleles.
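The rare, low-frequency, and common split described above can be illustrated with a short sketch. The article does not state the frequency cutoffs the authors used, so the 1 percent and 5 percent thresholds below are assumptions, as are the function names and the example frequencies.

```python
from collections import Counter

# Assumed cutoffs on minor allele frequency (MAF). The article does not
# say which thresholds the study used; 1% and 5% are a common convention.
RARE_MAX = 0.01
LOW_FREQ_MAX = 0.05

CLASSES = ("rare", "low-frequency", "common")


def frequency_class(maf: float) -> str:
    """Assign a variant to a frequency class from its minor allele frequency."""
    if not 0.0 <= maf <= 0.5:
        raise ValueError(f"minor allele frequency out of range: {maf}")
    if maf < RARE_MAX:
        return "rare"
    if maf < LOW_FREQ_MAX:
        return "low-frequency"
    return "common"


def class_shares(mafs: list[float]) -> dict[str, float]:
    """Return the fraction of variants that fall into each frequency class."""
    if not mafs:
        raise ValueError("no variants supplied")
    counts = Counter(frequency_class(m) for m in mafs)
    return {name: counts.get(name, 0) / len(mafs) for name in CLASSES}


# Made-up frequencies for four hypothetical variants.
example_shares = class_shares([0.001, 0.002, 0.03, 0.2])
```

With real data, the input would be one minor allele frequency per variant site, and the resulting shares would correspond to the percentages the researchers reported.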

They then generated genotypes for all of the polymorphic genes in the IMGT/HLA database, including the classical HLA genes HLA-A, HLA-B, and HLA-DRB1, among others, and determined their frequency within the Han Chinese population.

The researchers checked these generated genotypes against 24 individuals with known alleles and 188 individuals for whom they also performed Sanger sequencing-based typing. For the five most polymorphic genes, the average genotyping accuracy was 98 percent at two-digit resolution and nearly 96 percent at four-digit resolution. This, they added, suggests that theirs is the most comprehensive MHC database for a specific population.
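To make the two resolutions concrete: in HLA nomenclature, an allele such as HLA-C*06:02 reads as C*06 at two-digit resolution and C*06:02 at four-digit resolution. The sketch below shows one plausible way to score concordance at each level. It is not the authors' pipeline; the function names and the example genotypes are hypothetical.

```python
from collections import Counter

Genotype = tuple[str, str]  # the two alleles one individual carries for a gene


def truncate_allele(allele: str, fields: int) -> str:
    """Cut an allele name to a number of colon-separated fields.

    fields=1 gives two-digit resolution (C*06:02 -> C*06);
    fields=2 gives four-digit resolution (C*06:02:01 -> C*06:02).
    """
    if "*" not in allele:
        raise ValueError(f"not an HLA allele name: {allele}")
    gene, _, rest = allele.partition("*")
    return gene + "*" + ":".join(rest.split(":")[:fields])


def concordance(called: list[Genotype], truth: list[Genotype], fields: int) -> float:
    """Fraction of alleles in `called` that match `truth` at the given resolution.

    Alleles are compared per individual without regard to order, so
    (A, B) versus (B, A) counts as two matches.
    """
    if len(called) != len(truth) or not called:
        raise ValueError("need two non-empty genotype lists of equal length")
    matched = 0
    for call_pair, true_pair in zip(called, truth):
        call_counts = Counter(truncate_allele(a, fields) for a in call_pair)
        true_counts = Counter(truncate_allele(a, fields) for a in true_pair)
        # Multiset intersection counts each shared allele copy once.
        matched += sum((call_counts & true_counts).values())
    return matched / (2 * len(called))


# Made-up genotypes for two hypothetical individuals.
called = [("C*06:02", "C*07:04"), ("C*06:02", "C*07:02")]
truth = [("C*07:04", "C*06:02"), ("C*06:02", "C*07:04")]

two_digit = concordance(called, truth, fields=1)
four_digit = concordance(called, truth, fields=2)
```

In this toy example the second individual's C*07:02 call agrees with the true C*07:04 only at the coarser level, which is why accuracy at four-digit resolution can be lower than at two-digit resolution.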

The HLA-B gene, they noted, exhibited the greatest diversity as common alleles only accounted for slightly more than a third of the total alleles in the Han Chinese population. This, they added, indicates that HLA-B diversity might have had a key role in environmental adaptation.

Zhang and his colleagues also generated MHC haplotypes for each individual based on 27 allele-distinguishing SNPs in five classical HLA genes. They found that the most prevalent haplotype appeared in just shy of 4 percent of the individuals, while most haplotypes appeared only once.

Their Han-MHC database can act as a reference panel and be used to impute MHC information from, for instance, genotyping data, they added. When they evaluated its performance, they found that its aggregated mean concordance was 0.97 for common, 0.93 for low-frequency, and 0.81 for rare HLA alleles.

Using two different approaches, fine mapping and imputation, the researchers searched for causal variants of psoriasis in the MHC region. By sequencing the MHC region in 9,946 Han Chinese patients with psoriasis, Zhang and his colleagues homed in on the HLA-C*06:02 allele, which had been linked to the disease in both Chinese and European populations, as the most significantly associated allele.

They also uncovered another association at the HLA-C*07:04 allele and at a SNP upstream of the HLA-B gene through conditional analysis. Further examination of the HLA-B gene uncovered two other disease signals there. Conditional analysis based on these loci also revealed an independent signal in the BTNL2 gene.

At the same time, a genotyping-based analysis that imputed missing SNPs and HLA variants using the Han-MHC database also revealed the association at the HLA-C*06:02 allele, as well as those at the HLA-C*07:04 allele and the SNP upstream of the HLA-B gene. The other three associations that the fine-mapping approach uncovered were not found, however. The researchers said that incomplete information and low coverage of the MHC region in the genotyping data might be the source of the discrepancy.

"In comparison to previous large-scale population studies of the MHC region … our Han-MHC database, with its possible application in providing highly accurate imputation results for Han Chinese populations, helps to determine the genetic landscape of the MHC region and fine-map disease-associated or disease-causative mutations," Zhang and his colleagues wrote.