NEW YORK – Researchers at the University of Chicago, Bluestar Genomics, and elsewhere have developed a genome-wide tissue map of the DNA 5-hydroxymethylcytosine (5hmC) modification that revealed its role in tissue-specific development and as a potential diagnostic and prognostic biomarker for a variety of human diseases.
In a study published in Nature Communications on Wednesday, the researchers noted that 5hmC is already associated with gene transcription and is used as a biomarker to investigate dynamic DNA methylation conversion during mammalian development and in human diseases. Various studies have linked changes in global 5hmC to disparate developmental processes, as well as to pathobiology such as the initiation and progression of cancer. But scientific knowledge of 5hmC's precise functions remains incomplete.
To investigate its role further, the investigators sought to develop a genome-wide 5hmC tissue map in different human tissue types, with the goal of determining how it's implicated in transcription activity and tissue specificity.
In total, they characterized 5hmC in 19 human tissues derived from 10 organ systems, using next-generation sequencing (NGS) to identify genome-wide 5hmC distributions that uniquely separated samples by tissue type. Further comparison of the 5hmC profiles with transcriptomes and histone modifications revealed that 5hmC is preferentially enriched on tissue-specific gene bodies and enhancers.
"While previous studies have shown that 5hmC can serve as an excellent biomarker for the diagnosis and prognosis of human diseases, including cancer, the lack of a whole-body tissue map limits our global understanding of this mark and its potential tissue specificity," University of Chicago chemistry professor Chuan He, the study's senior author, said in a statement. "The new map confirms 5hmC as a prevalent gene activation mark for both gene bodies and enhancers with superb tissue and cell type specificity, which is key to future early diagnosis of human cancer and monitoring of human chronic diseases."
Previous studies of 5hmC have either used microarray-based techniques, which lack genome-wide coverage, or focused on only a limited number of tissue types, the researchers said. For this study, they used a sensitive chemical labeling and pull-down method called 5hmC-Seal, followed by NGS. The 5hmC-enriched regions detected in the tissue samples were further evaluated for their regulatory potential and tissue specificity by comparing them with gene expression data and cis-regulatory element data from the Roadmap Epigenomics Project.
The tissue samples represented 19 tissue types from the nervous, cardiovascular, digestive, reproductive, endocrine, respiratory, urinary, integumentary, skeletal, and lymphatic organ systems. Of the 96 total specimens, 79 were taken from non-cancerous organs, while the remaining 17 were normal tissues adjacent to tumors, collected at tumor resection.
A previous report using tiling microarrays had suggested that the HOXA gene cluster had highly variable 5hmC levels in different tissue types, so the researchers investigated whether this observation could be recapitulated in their 5hmC-Seal datasets. They found that the location and enrichment levels of peaks did indeed vary across tissue types, especially between distantly related tissues.
They also found that the enrichment of 5hmC at the HOXA gene cluster appeared to be highly consistent between donor samples within the same tissue type. These data confirmed the profiling accuracy and reproducibility of the genome-wide 5hmC profiles they obtained in various tissues, providing a unique resource to study tissue-specific distributions of 5hmC in the genome.
Once the dataset was established, the researchers determined several common features of the 5hmC distributions in various human tissues. For example, although the total number of identified 5hmC peaks per tissue sample varied between tissue types, the number of peaks per donor sample within each tissue type was similar. The lowest peak numbers were obtained from bone marrow samples, while the highest came from the placenta samples. Further, the researchers found that 5hmC peaks were overrepresented at exons and promoters and underrepresented at intergenic regions, demonstrating that in spite of different tissue identities, the 5hmC loci were consistently distributed within known genomic locations of 5hmC.
The researchers also sought to investigate 5hmC modifications in the context of different epigenomic regions by comparing their data with publicly available datasets. They found that 5hmC was highly enriched in regulatory chromatin states mainly marked by the histone modification H3K4me1 and defining active and flanking transcription start sites, enhancer regions, and bivalent enhancer regions.
In an experiment to determine whether the 5hmC modifications marked tissue-specific genes, the researchers detected a total of 1,723 tissue-specific 5hmC-modified genes that separated all tissues, with the placenta showing the highest number of tissue-specific genes. Samples from related anatomical or physiological systems also showed an apparent overlap in 5hmC enrichment. For example, the brain-specific 5hmC-modified genes also showed higher modification levels in the hypothalamus samples, compared to other tissue types. Further, transverse and sigmoid colon segments exhibited similar modification levels for tissue-specific genes, despite arising from different embryonic lineages.
Overall, the researchers said, the data suggested that 5hmC may mark specific genes or promoters and enhancers that undergo changes representative of specific disease pathogenesis.
"The human tissue 5hmC map presented herein provides a critical and valuable resource to integrate with known data of 5mC distribution, chromatin accessibility, and various histone marks in human tissues to further understand the role of epigenomic regulation in human development and disease progression," they concluded.