By Monica Heger
Researchers from the University of North Carolina have developed a sequencing technique to create maps of regulatory regions in the genome, and used it to identify a gene variant involved in diabetes. The authors of the study, which was published last week in Nature Genetics, say the technique can be applied to other types of disease-relevant tissue samples and will even work on solid tumors.
"It could be applied to any disease-relevant tissue, and is even more powerful in combination with data from genome-wide association studies," said Jason Lieb, associate professor of biology at UNC and a senior author of the study. Lieb's group collaborated with diabetes geneticists at UNC, as well as endocrinologists from the Hospital Clinic de Barcelona in Barcelona, Spain, who provided the tissue samples.
The team developed the technique, formaldehyde-assisted isolation of regulatory elements (FAIRE) combined with high-throughput sequencing, several years ago, but they said this study is the first to apply it to disease-relevant human tissue, in this case pancreatic tissue.
By crosslinking chromatin with formaldehyde and then performing a phenol-chloroform extraction, the researchers were able to separate the open regions of the chromatin, where regulatory proteins can bind, from DNA that is tightly bound to nucleosomes. They then sequenced those open regions on the Illumina Genome Analyzer using 36-base-pair single-end reads, generating 60 million mapped reads from the sample they analyzed.
After creating an accurate picture of the regulatory regions of the chromatin, they were able to show that a SNP previously associated with type 2 diabetes risk in other genetic studies fell within one of those regions. They then tested the DNA surrounding the SNP and found evidence that it did in fact have an effect on regulatory function: cells containing the risk variant supported higher levels of transcription than cells with the non-risk variant.
Lieb said that being able to sequence just the regulatory portion of the genome is useful in studying complex diseases. Often, the mutations affecting a disease don't occur in the genes associated with it but in the regulatory regions that control those genes, he said. Regulatory proteins binding to these regions can affect how much protein a gene makes, or where in the body the gene is expressed. These regions can be difficult to find, however, because they are often located thousands or millions of bases away from the genes they regulate. Focusing on just the regulatory regions eliminates nearly 98 percent of the genome that researchers would otherwise have to search for variants.
"If we can identify the SNPs that are occurring in regions where regulatory proteins are binding, we might be able to make a better hypothesis about which SNPs we should test first," Lieb said.
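The prioritization Lieb describes amounts to an interval lookup: given a list of open-chromatin regions and a list of disease-associated SNPs, keep the SNPs that fall inside a region. The sketch below is only an illustration of that idea, not the study's actual pipeline. The function names, SNP labels, and coordinates are all hypothetical, and it assumes regions are half-open intervals that do not overlap one another on the same chromosome.

```python
from bisect import bisect_right

def build_index(regions):
    """Group (chrom, start, end) regions by chromosome, sorted by start.

    Assumption: intervals are half-open [start, end) and do not overlap
    on the same chromosome.
    """
    index = {}
    for chrom, start, end in regions:
        index.setdefault(chrom, []).append((start, end))
    for intervals in index.values():
        intervals.sort()
    return index

def in_open_region(index, chrom, pos):
    """Return True if pos lies inside an open region on chrom."""
    intervals = index.get(chrom, [])
    # Locate the last interval whose start is <= pos.
    i = bisect_right(intervals, (pos, float("inf"))) - 1
    return i >= 0 and intervals[i][0] <= pos < intervals[i][1]

def prioritize(snps, regions):
    """Keep the names of SNPs that fall within any open region."""
    index = build_index(regions)
    return [name for name, chrom, pos in snps
            if in_open_region(index, chrom, pos)]

# Hypothetical data for illustration only; not from the study.
open_regions = [("chr10", 1000, 1500), ("chr10", 4000, 4200), ("chr2", 300, 900)]
gwas_snps = [
    ("snp_a", "chr10", 1200),  # inside the first chr10 region
    ("snp_b", "chr10", 3000),  # between regions
    ("snp_c", "chr2", 900),    # at the excluded end of a half-open region
    ("snp_d", "chr7", 50),     # chromosome with no open regions
]

candidates = prioritize(gwas_snps, open_regions)
print(candidates)
```

Sorting the regions once and using a binary search keeps each lookup fast even with many thousands of regions. Real peak calls can overlap, in which case they would need to be merged first.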
Lieb said the method is similar to DNase hypersensitivity mapping. "It's a way to find places in the genome where the DNA is accessible," he said, adding that its main advantages are that it is simple to do and can be done on solid tissue samples.
While the method is relatively new, other labs have begun to use it as well. Karine Le Roch, assistant professor of cell biology and neuroscience at the University of California, Riverside, uses it to study malaria and agreed that the method yields clean, reliable data. "We've used microarrays before, and this sequencing technology is much cleaner," she said. "You get resolution at the single-nucleotide level, and it eliminates any type of contamination."
She said her group combines FAIRE-seq, which identifies the open regions of the chromatin, with another technique, MAINE-seq, which uses MNase-mediated purification of mononucleosomes to extract histone-bound DNA for sequencing and so identifies the closed regions of the chromatin. MAINE-seq gives the opposite picture from FAIRE-seq, she said, so the results verify each other. "It gives you a beautiful picture of which parts of the chromosome are active and which are inactive," she said.
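One simple way to check that two such datasets give opposite pictures is to count reads in fixed-size bins along the genome for each assay and compute the correlation between the two sets of counts. A strongly negative value would be consistent with the techniques verifying each other. The sketch below assumes that approach. The bin counts are made up for illustration, and nothing here is taken from Le Roch's analysis.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)

# Hypothetical read counts per genomic bin; not real data.
faire_bins = [120, 15, 10, 95, 8, 110]   # high where chromatin is open
maine_bins = [12, 90, 105, 20, 98, 9]    # high where nucleosomes sit

r = pearson(faire_bins, maine_bins)
print(f"FAIRE vs. MAINE correlation: {r:.2f}")
```

With these invented counts the correlation comes out strongly negative, which is the pattern one would expect if open and nucleosome-bound regions are mutually exclusive.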
Lieb said his group is currently working on a project to use FAIRE-seq on breast tumor samples from the University of North Carolina's medical school. He said preliminary results indicate that it is possible to distinguish different types of tumors based on their chromatin profiles. "There are different subtypes [of tumors] that have to do with the cell type from which they originated. For example, some are hormone responsive, some are non-hormone responsive. And, we can easily distinguish the two subtypes based on their chromatin profiles," Lieb said. "It's a rich set of information because it's telling us what regulatory elements are causing differences in gene expression." Lieb added that he was also encouraged because the preliminary results showed that the method worked on heterogeneous tissue.
Aside from the breast cancer study, Lieb's group has received funding to continue to identify functional regulatory SNPs involved in diabetes and obesity. In the future, his team is interested in mapping open chromatin in cell lines from the HapMap project. Since these cell lines are derived from white blood cells, they may be useful in identifying which variants associated with autoimmune diseases like lupus or rheumatoid arthritis fall within regulatory regions. He also wants to study chromatin variation among healthy individuals.
"The idea is that this will be a rich resource to go and test whether the regions that were identified and sequenced affect disease," he said.