NEW YORK (GenomeWeb) – Working independently, two research teams have combined CRISPR-based genome targeting and engineered ascorbate peroxidase (APEX) labeling to identify proteins linked to specific portions of the genome.
In a pair of papers published this week in Nature Methods, the two groups, one led by researchers at the Broad Institute and one led by researchers at the University of Massachusetts Medical School, demonstrated that the combination of these two recently developed techniques could provide new insights into phenomena including protein-DNA interactions and the three-dimensional structure of chromosomes.
Originally developed by Stanford University researcher Alice Ting (then at the Massachusetts Institute of Technology), APEX labeling relies on APEX tags genetically inserted into molecules of interest. In the presence of biotin-phenol and hydrogen peroxide, the tag generates short-lived biotin-phenoxyl radicals that label nearby proteins in the cell, and these biotinylated proteins can then be pulled out of the sample using streptavidin-based enrichment and analyzed by mass spectrometry.
To date, APEX labeling has primarily been used to study the protein contents of different subcellular compartments or to investigate protein-protein interactions. In such experiments, an APEX tag is inserted into a protein of interest and researchers then look at what proteins it biotinylates, the idea being that these biotinylated proteins are in close proximity to the target protein and, as such, are likely interactors or occupants of the same subcellular compartment.
By using CRISPR to insert the APEX tag at specific genomic loci, the Broad and UMass researchers have extended the approach to DNA-protein interactions.
According to Steve Carr, senior director of proteomics at the Broad and senior author on one of the studies, his group's effort was inspired by Broad President Eric Lander, who, upon observing APEX labeling work being done in Carr's lab, asked if it could be done in a genomic-locus-specific way.
It was left to Samuel Myers, a post-doc in Carr's lab and first author on the paper, to work out exactly how to combine the two techniques.
"I'm interested in transcriptional regulation from a protein-centric angle," Myers said. "This is something I wanted to work out so that we could start looking at things in a more unbiased fashion."
Previously, Carr said, the only way to look at which proteins interacted with which genomic loci was chromatin immunoprecipitation (ChIP) and ChIP-seq, where proteins of interest are pulled down along with their associated portions of DNA, which can then be sequenced to determine the sequences to which a particular protein binds.
This, however, requires knowing which protein you want to target, which, as Myers noted, precludes unbiased analyses of protein-DNA interactions. Additionally, ChIP experiments typically require high-quality antibodies to proteins of interest, which are not always available.
The ability to insert APEX tags at specific genomic loci opens up a variety of research possibilities, Myers said.
"Let's say you have a [single-nucleotide polymorphism] out in the middle of the genome, and you know it's associated with some disease through some GWAS hit," he said. "What you can do with this method is you can land our CASPEX enzyme [consisting of nuclease-dead Cas9 (dCas9) fused to APEX2] right there and find what proteins are associated with that particular disease-relevant locus."
More generally, Myers said he is interested in using the approach to study signaling processes affecting transcription.
"My overall goal is to combine this within a larger workflow where we can look at, if you give a cell a certain signal, like a certain cytokine, and it turns on a gene, can we then follow the phosphorylation signaling that happens to actually bring in the transcriptional machinery to turn on or off a particular response gene," he said.
He gave the example of embryonic stem cells shutting off the Oct-4 and Nanog genes upon exiting pluripotency. "The phosphorylation signaling goes away and [the cells] shut down Oct-4 and Nanog," he said. "Those are the first genes to shut off. So, one of the things I want to ask is what are the first proteins that show up to shut off those genes."
Key to the approach's ability to target very specific loci is the way it tiles loci of interest with CASPEX insertions at multiple locations. This increases confidence that hits reproduced across the different insertion points are true interactors rather than background. Additionally, it helps account for the possibility that a CASPEX insertion could interfere with some protein interactions.
"We're not sure how bad [CASPEX] disrupts the locus," Myers said. "It doesn't seem to be too bad, but if it does bump anything off, the tiled [insertions] around it will capture what was there."
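The tiling logic described above can be illustrated with a minimal sketch. It is not the Broad team's analysis pipeline; the function name `reproducible_hits`, the `min_sites` threshold, and the guide and protein identifiers are all hypothetical placeholders chosen for illustration. The sketch simply counts how many tiled sites each protein was enriched at and keeps those seen at enough sites.

```python
from collections import Counter


def reproducible_hits(hits_by_site, min_sites):
    """Return proteins enriched at at least `min_sites` tiled sites.

    hits_by_site maps a site label to the set of proteins called as
    enriched at that site (how enrichment is called is out of scope here).
    """
    counts = Counter()
    for proteins in hits_by_site.values():
        # set() guards against a protein being listed twice for one site
        counts.update(set(proteins))
    return {protein for protein, n in counts.items() if n >= min_sites}


# Hypothetical, made-up data: three tiled sites around one locus.
hits_by_site = {
    "guide_1": {"PROT_A", "PROT_B", "PROT_X"},
    "guide_2": {"PROT_A", "PROT_B"},
    "guide_3": {"PROT_A", "PROT_C"},
}

# Require a hit at every site (strict) or at any two sites (lenient).
strict = reproducible_hits(hits_by_site, min_sites=3)
lenient = reproducible_hits(hits_by_site, min_sites=2)
```

Under these assumptions, a lenient threshold retains a protein that is missing at one site, which mirrors Myers' point that neighboring tiled insertions can recover an interactor displaced at a single position.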
The UMass study similarly uses dCas9–APEX2 insertions into genomic regions of interest to identify interacting proteins, but, unlike the Broad team's work, this study is focused on repeated regions of the genome like telomeres and satellite centromeres.
CRISPR-based approaches have been used before to insert proximity tagging systems like the biotin ligase BirA* in such repeated genomic regions, said Erik Sontheimer, a UMass professor and senior author on the study, but APEX had not been previously used.
One advantage of using APEX-based labeling is that it has different labeling chemistry from BirA*, Sontheimer said, which means the two approaches will label different sets of proteins.
"That doesn't necessarily make either one better or worse than the other," he said. "It's simply that there are going to be some proteins that are not efficiently labeled by BirA* that are efficiently labeled by APEX and vice versa. And having both options available will minimize the risk of false negatives."
The other advantage, Sontheimer said, is that BirA* labeling has traditionally been an inefficient process that can take on the order of hours.
"If you're looking at a circumstance that is relatively static, that's okay," he said. "But if you want to look with good time resolution at something that is dynamic, then that becomes a limitation. And APEX overcomes this because its labeling protocol is one minute. So, you can get a greater degree, in principle, of time resolution when you're looking at dynamic processes with APEX."
Sontheimer added, however, that this situation is improving with regard to BirA*, noting that Alice Ting's lab has recently published on new versions of BirA* with labeling times of under an hour.
Using the dCas9–APEX2 approach to study telomere-associated proteins, Sontheimer and his colleagues found that they were able to identify roughly half of the proteins firmly established by previous research to be localized there.
"That's with pretty stringent cut-offs," he said. "So, that was something we were pretty happy with."
Comparing these findings to telomeric proteins identified via BirA* labeling and to a biochemical fractionation of chromatin, the researchers found good overlap between all three datasets.
"Additionally, there were a number of known telomeric factors that those previous methods have missed, that we were able to find using APEX," Sontheimer said, adding that, "last but not least, we were able to find some new factors that hadn't been found by those other approaches before and validate them using independent means."
The method was developed as part of the National Institutes of Health's 4D Nucleome project, which aims to map the structure and temporal dynamics of the nucleome to understand their role in both normal cellular function and disease.
"What we want to be able to do now [as part of that project] is to map proteomes onto those three-dimensional [genomic] structures," Sontheimer said.
"For instance, how does the three-dimensional structure of the genome change when you go from an embryonic stem cell to a partially differentiated state? Or, how does the three-dimensional structure of the genome change when you go from the G2 phase of the cell cycle through mitosis and then into G1?" he said. "To identify proteomic changes that either correlate with or cause changes that are seen in genome structure is something that we're very interested in doing."