Cytosine Base Editors Allow Scientists to Study Function of Variants at Scale


NEW YORK – Researchers at the Broad Institute have developed a new, CRISPR-based method to study the function of genomic variants in mammalian cells at scale. They're hoping the technique, which relies on CRISPR-Cas9 cytosine base editors (CBEs), could be used to reliably determine the function of thousands of mutations in the genome to identify whether they cause disease, or confer drug sensitivity or resistance.

The approach could also help to address the problem of variants of unknown significance by providing researchers with a faster and more efficient method of determining a mutation's effect.

In a paper published on Thursday in Cell, the researchers said they benchmarked the performance of CBEs in positive and negative selection screens and were able to identify known loss-of-function mutations in BRCA1 and BRCA2 with high precision. They then tried to determine the utility of the CBE screens by probing small molecule-protein interactions, specifically screening against BH3 mimetics and PARP inhibitors, to find point mutations that might confer drug sensitivity or resistance.

Through these screens, they identified loss-of-function variants in numerous DNA damage repair genes and created a library of single guide RNAs (sgRNAs) predicted to generate 52,034 ClinVar variants in 3,584 genes.

Several methods already exist for variant function screening, but they all have some drawbacks. Saturation mutagenesis can generate all possible single-nucleotide variants in coding or non-coding regions, but it relies on exogenous overexpression of the variant of interest, which may not behave the same way as an endogenously expressed variant, and it is difficult to apply to large genes, the researchers said. Saturation genome editing screens, which use Cas9-mediated homology-directed repair (HDR), have been used to probe all possible variants in several exons of BRCA1. But while this approach is effective in yeast, the low efficiency of HDR in human cells has restricted its use to near-haploid lines.

"We've done some studies like this in the past where the idea is to mutagenize the gene, and you can do that with CRISPR knock-out technology," said John Doench, the study's corresponding author. However, he explained, creating a double-strand DNA break (DSB) with CRISPR creates "a semi-random repair outcome." So while it's possible to mutagenize genes and discover interesting variants using that approach, it's hard to control, and the repairs are out of frame most of the time, resulting in a truncated protein. 

CBEs were developed in the lab of David Liu at the Broad, who was also a co-author on the Cell study. They consist of four elements: a catalytically impaired CRISPR-Cas9 mutant that cannot make DSBs; a single-strand-specific cytidine deaminase that converts C to uracil in the single-stranded DNA bubble created by Cas9; a uracil glycosylase inhibitor that impedes uracil excision and downstream processes that decrease base editing efficiency and product purity; and nickase activity that nicks the non-edited DNA strand, directing cellular mismatch repair to replace the G-containing DNA strand. This kind of editing enables the efficient and permanent conversion of C-G base pairs to T-A base pairs, and addresses point mutations found in roughly 15 percent of genetic diseases.
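As a rough illustration of the C-to-T conversion described above, the following Python sketch rewrites every C inside an assumed editing window of a protospacer as T. The function name `cbe_edit`, the example sequence, and the window of positions 4 to 8 (counted from the PAM-distal end) are illustrative assumptions rather than details from the study, and real editors show position- and context-dependent efficiency that this toy model ignores.

```python
def cbe_edit(protospacer: str, window: tuple = (4, 8)) -> str:
    """Return the protospacer with every C inside the window changed to T.

    Positions are 1-based and counted from the PAM-distal end.
    The default window of 4-8 is an assumption made for illustration.
    """
    start, end = window
    edited = []
    for pos, base in enumerate(protospacer.upper(), start=1):
        if base == "C" and start <= pos <= end:
            edited.append("T")  # deaminated C is ultimately read as T
        else:
            edited.append(base)
    return "".join(edited)


# Hypothetical 20-nt protospacer with Cs at positions 3, 4, 8, 12, 17 and 18.
# Only the Cs at positions 4 and 8 fall inside the assumed window.
print(cbe_edit("GACCTAGCTTACGGATCCAA"))  # GACTTAGTTTACGGATCCAA
```

When a guide targets the strand opposite the coding sequence, the same chemistry reads as a G-to-A change on the coding strand, which is how G-to-A variants can arise in a CBE screen.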

Because they introduce point mutations to the genome, rather than DSBs, CBEs are much easier to control. But more than that, Doench said, the point mutations more faithfully replicate disease-causing mutations or other variants of interest in the genome.

The base editors also require only one guide RNA to direct the Cas enzyme to the targeted loci, they can screen large genomic regions that cannot easily be assayed by saturation mutagenesis, and they're relatively predictable in their editing activity.

In their benchmarking experiment to assess the feasibility of the screening method, the researchers used a library that includes all possible Streptococcus pyogenes sgRNAs targeting the exons of 47 genes that, when inactivated, conferred a phenotype that was readily assayed in a pooled viability screen. These included 10 pan-lethal genes and four genes whose knockout conferred resistance to the BRAF inhibitor vemurafenib (Genentech's Zelboraf).

They first asked whether sgRNAs predicted to introduce neutral or loss-of-function mutations in pan-lethal genes showed differential performance in a negative selection screen. Importantly, they observed that sgRNAs predicted to introduce no edits or only silent mutations performed similarly to non-targeting controls, indicating a low rate of indels or C-to-R editing (that is, conversion of C to a purine, A or G). The validation of the base editor in five different cell lines showed a similar separation between predicted neutral and loss-of-function mutations across all cell lines, indicating effectiveness across cell types of various ploidy.
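To make the idea of a "predicted" outcome concrete, here is a minimal Python sketch that labels the effect of an edit on an in-frame coding sequence as no edit, silent, missense, or nonsense. It is not the researchers' annotation pipeline: the helper names `translate` and `classify_edit` and the example sequences are hypothetical, the standard genetic code is assumed, and splice-site effects are not considered.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate an in-frame, uppercase coding sequence ('*' marks a stop)."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds) - 2, 3))


def classify_edit(ref_cds: str, edited_cds: str) -> str:
    """Label the predicted consequence of an edit within a coding sequence."""
    if ref_cds == edited_cds:
        return "no edit"
    ref_protein, alt_protein = translate(ref_cds), translate(edited_cds)
    if ref_protein == alt_protein:
        return "silent"
    for ref_aa, alt_aa in zip(ref_protein, alt_protein):
        if alt_aa == "*" and ref_aa != "*":
            return "nonsense"  # a new premature stop codon
    return "missense"


print(classify_edit("ATGCAGGAA", "ATGTAGGAA"))  # nonsense (CAG Gln -> TAG stop)
print(classify_edit("ATGCTGGAA", "ATGTTGGAA"))  # silent   (CTG Leu -> TTG Leu)
print(classify_edit("ATGCCTGAA", "ATGCTTGAA"))  # missense (CCT Pro -> CTT Leu)
```

In a screen like the one described, guides in the "no edit" and "silent" bins would be expected to behave like negative controls, while "nonsense" guides in essential genes would be expected to drop out.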

They also used their method to identify known loss-of-function alleles in BRCA1 and BRCA2. Because many nonsense mutations in BRCA1 and BRCA2 have been well characterized, the researchers focused on missense and silent mutations for this experiment.

In one interesting observation, the researchers found that in BRCA2, the most depleted sgRNA showed depletion of alleles containing a G-to-A mutation at the canonical splice donor site of exon 13, which is listed as pathogenic in ClinVar. However, this sgRNA also gave rise to a second depleted allele that contained an intact splice donor site but included a G-to-A mutation 5 nucleotides into the intron, which is listed as "conflicting interpretations of pathogenicity" in ClinVar. When they conducted an analysis of splicing intolerance using ExAC data, the researchers found that this second site was significantly intolerant of non-G nucleotides, and although they couldn't conclusively determine the functional consequence of this mutation without further validation, the data were consistent with the conclusion that G-to-A variants at this site are loss-of-function mutations and disrupt splicing.

Importantly, the researchers also probed drug-target interactions to identify mutations that may confer sensitivity or resistance.

"We mostly focused on gain of function from the standpoint of protein-small molecule binding — so a mutation in a target that retains the function of the gene, but blocks binding of the small molecule," Doench said. "Those things occur in any sort of treatment, especially in cancer. You treat with an EGFR inhibitor and the cell starts to express an isoform of EGFR that doesn't bind the drug. So those things are going to continue to crop up in drug development. And I'm going to tell this to every pharma person I meet: it is drug development malpractice to move forward with a small molecule if you haven't done a study to find mutations that are resistant to it, because that's what proves [if] this is really the target."

For many small molecules the target isn't actually clear, he added, and knowing the resistance mutation would be good evidence of the drug's actual target.

The researchers focused on MCL1 and BCL2L1, a pair of anti-apoptotic genes often upregulated in cancer that share a synthetic lethal relationship and each have targeted inhibitors. Overall, they concluded that base editor screens can effectively identify point mutations that modulate drug response. Importantly, unlike screens that rely on pseudo-random mutagenesis, base editors are efficient enough to allow for negative selection screens, enabling the identification of mutations that sensitize cells to drug treatments.

The study shows that this approach should "have a home in any sort of a drug development process," Doench said. "It's not necessarily step one, but it should certainly be a step before Phase I clinical trials."

The study also showed that the technique has use for the clinical interpretation of genes that have known variation in the human population but where the effect of that variation on susceptibility to common polygenic diseases is not well known. In addition to BRCA, Doench pointed to the cholesterol receptor gene LDLR as an example.

The approach can be used to identify gain-of-function and loss-of-function variants, and to screen variants across many genes in parallel to determine their contribution to a common phenotype, the researchers said. That would lead to the creation of look-up tables that connect sequence variation and gene function, even if a variant has yet to be observed clinically.

While in the case of most diseases, loss-of-function variants are more common, gain-of-function variants are just as important to characterize, Doench said. In the case of BRCA1 variation, for example, knowing whether a variant is causing a loss of function or whether BRCA1 is still active can help a clinician determine whether or not to treat a patient with a PARP inhibitor. "So it affects multiple levels of clinical decision making," he said. "Characterizing variants is going to be very valuable."

Another advantage to this method is that it could help reduce the number of variants of unknown significance that currently exist in clinical databases. One of the biggest challenges in assessing how significant a genetic variant is to human health is finding the right model system to test its function, Doench said. BRCA1 can be assessed through a simple growth assay in the lab, but genes involved in immune response, for example, are much harder to model in HeLa cells.

"The good news is that there are still several thousand genes which have a growth effect, so we'll be able to make good headway with a lot of those genes," he said. "But there are absolutely going to be many genes for which we need better assay systems to say, 'Is this gene functional? Is this variant for this gene functional or not?'"

And while even the base editor approach has limitations, such as which amino acids can be targeted and changed, the technique is an improvement on current methods and gives researchers an additional tool to work with.

And there will likely be even more tools built upon this one in the future. In addition to CBEs, Liu and his colleagues developed an adenine base editor (ABE) in 2017 that is capable of converting A-T base pairs to G-C base pairs. At the time Doench and his colleagues started their study, ABEs had only recently been developed and the version they were working with wasn't as efficient as they wanted for high-throughput screening. Since then, he said, ABEs have gone through several iterations, and the team is now working with the newest version to develop an ABE-based approach similar to the CBE-based method in the Cell paper.

More recently, Massachusetts General Hospital researchers led by Julian Grünewald and Keith Joung and a second team led by researchers from the Chinese Academy of Sciences independently developed CRISPR base editors that induce targeted DNA transversions, changing C to G in human cells. Doench and his colleagues haven't tried that kind of base editor for this type of screening yet, but they plan to do so.

And finally, they're aiming to try modified versions of Cas9 in order to expand the possible target sequences the enzyme can hit. Wild-type Cas9 is restricted to a protospacer adjacent motif (PAM) of NGG, narrowing which bases it can target in the genome. But certain groups (including researchers at the Broad; the University of California, Berkeley; and Mass General) are evolving or engineering new versions of Cas9 to have more relaxed PAM requirements or to be nearly PAM-less, and these Cas9 variants are already being used to improve the targeting abilities of base editors.
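The NGG restriction can be sketched in a few lines of Python. The function below scans one strand of a sequence and returns every candidate protospacer that is immediately followed by an NGG PAM. The name `find_ngg_sites`, the 20-nucleotide guide length, and the example sequence are assumptions made for illustration; a real guide design tool would also scan the reverse complement and score each site.

```python
def find_ngg_sites(seq: str, guide_len: int = 20) -> list:
    """Return (start, protospacer, PAM) for each NGG PAM on the given strand.

    Only the strand as written is scanned; `start` is a 0-based index.
    """
    seq = seq.upper()
    sites = []
    # Stop early enough that a full protospacer plus 3-nt PAM always fits.
    for start in range(len(seq) - guide_len - 2):
        pam = seq[start + guide_len : start + guide_len + 3]
        if pam[1:] == "GG":  # N can be any base, so only the last two matter
            sites.append((start, seq[start : start + guide_len], pam))
    return sites


# Hypothetical 25-nt sequence: one 20-nt protospacer followed by an AGG PAM.
print(find_ngg_sites("GACCTAGCTTACGGATCCAAAGGTT"))
# [(0, 'GACCTAGCTTACGGATCCAA', 'AGG')]
```

Relaxing the `pam[1:] == "GG"` check is the code-level analogue of a more permissive Cas9: the looser the PAM requirement, the more positions qualify as targets.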

"Using these base editors with those more targetable Cas9 versions is already the direction we're going in, and we're certainly very excited about that," Doench said.