NEW YORK (GenomeWeb) – Researchers at Stanford have developed a highly efficient CRISPR-Cas9-based genome editing method that they say can identify, at single-base resolution, the genetic variants driving natural phenotypic variation in yeast.
Using this approach, which the researchers called Cas9 retron precise parallel editing via homology (CRISPEY), they studied the fitness consequences of 16,006 natural genetic variants in yeast, and identified 572 variants with significant fitness differences in glucose media. As they wrote in Cell yesterday, these were highly enriched in promoters, particularly in transcription factor binding sites. Only 19.2 percent affected amino acid sequences. They also noted that nearby variants nearly always favored the same parent's alleles, suggesting that lineage-specific selection is often driven by multiple clustered variants.
"In sum, our genome editing approach reveals the genetic architecture of fitness variation at single-base resolution and could be adapted to measure the effects of genome-wide genetic variation in any screen for cell survival or cell-sortable markers," the authors wrote.
The researchers began by trying to boost the efficiency of Cas9 editing, reasoning that generating a large number of potential donor DNA molecules within the nucleus would maximize the chance of homology-directed repair (HDR). They chose bacterial retrons to generate these donor DNAs because retrons are natural DNA elements that encode a reverse transcriptase (RT) along with a template on which the RT acts to produce multi-copy single-stranded DNA (msDNA).
To build the system, they replaced a portion of the bacterial retron sequence with a 100-nucleotide donor sequence containing a desired mutation flanked by 50-base-pair homology arms, and showed that msDNA was produced in yeast in an RT-dependent manner. This was the method they came to call CRISPEY. The researchers found that coupling retrons with CRISPR-Cas9 in this way enabled precise genome editing with very high efficiency and throughput, allowing them to pinpoint the functional variants underlying variation in fitness in any given environment.
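The donor layout described above can be illustrated with a minimal Python sketch. It is not the authors' design pipeline: the function name `build_donor`, the 0-based coordinate convention, and the choice to place a single-base substitution at the midpoint of a 100-nucleotide window of reference sequence are all assumptions made for illustration.

```python
def build_donor(reference: str, pos: int, alt: str, length: int = 100) -> str:
    """Return a donor of `length` nt carrying `alt` at 0-based position `pos`.

    Hypothetical helper: the variant sits at the midpoint of the window, so
    the flanking reference sequence on each side acts as a homology arm.
    """
    if len(alt) != 1 or alt not in "ACGT":
        raise ValueError("alt must be a single base: A, C, G, or T")
    half = length // 2
    start = pos - half
    end = start + length
    if start < 0 or end > len(reference):
        raise ValueError("variant is too close to the sequence end for this donor length")
    window = reference[start:end]
    offset = pos - start  # index of the variant inside the donor
    if window[offset] == alt:
        raise ValueError("alt matches the reference base; nothing to edit")
    # Swap in the alternate base and leave both flanks untouched.
    return window[:offset] + alt + window[offset + 1:]


# Toy reference sequence (assumption, not yeast genomic sequence).
reference = "ACGT" * 50
donor = build_donor(reference, pos=100, alt="G")
print(len(donor), donor[50])
```

With these assumptions the donor has 50 nucleotides of homology upstream of the substituted base and 49 downstream; a real design would also have to account for the guide RNA target site, which this sketch ignores.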
Through a series of experiments in haploid Saccharomyces cerevisiae, the researchers found that CRISPEY was highly efficient and precise, and that it allowed them to introduce individual variants, one at a time, into various S. cerevisiae strains, enabling comparisons even between reproductively incompatible lines.
They then characterized the variants that showed significant fitness effects in growth competition. Many of these variants had surprisingly large effect sizes. Because an unconditional fitness advantage would be expected to drive the fitter allele to fixation, the researchers suggested that these variants may involve fitness tradeoffs, in which each allele confers a selective advantage in some environments. The researchers also found that their top 23 hits were all in promoter regions, even though promoters accounted for only 29.8 percent of the edits.
They further investigated the promoter variants to test whether they were enriched in transcription factor binding sites (TFBSs), which are key sequence elements in cis-regulation, and found that 33 percent of variants within known TFBSs had significant fitness effects, compared to only 9.6 percent of all tested promoter variants. The fraction of significant hits depended strongly on distance from the nearest TFBS, with variants within 10 base pairs of a TFBS still highly enriched, which was consistent with results from human studies, the researchers noted.
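To put the promoter result in perspective, a back-of-the-envelope calculation helps. The sketch below is not an analysis from the paper: it assumes, purely for illustration, that each top hit would independently land in a promoter with probability equal to the promoter share of edits (29.8 percent), and asks how likely it is that all 23 would do so by chance. The function name is hypothetical.

```python
def prob_all_in_category(category_share: float, n_hits: int) -> float:
    """Chance that every one of `n_hits` independent hits falls in a category
    that makes up `category_share` of all tested edits (simplifying assumption)."""
    if not 0.0 <= category_share <= 1.0:
        raise ValueError("category_share must be between 0 and 1")
    if n_hits < 0:
        raise ValueError("n_hits must be non-negative")
    return category_share ** n_hits


# Figures quoted in the article: promoters were 29.8 percent of edits,
# and the top 23 hits were all promoter variants.
p = prob_all_in_category(0.298, 23)
print(f"{p:.2e}")
```

Under this simple null model the probability comes out below one in a trillion, which is why the authors treat the promoter enrichment as a real signal rather than a sampling accident.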
And although coding variants were under-represented among significant variants, the investigators identified 156 synonymous and 95 missense variants. "Missense variants that change protein sequences were about equally likely to affect fitness as synonymous variants," they wrote. "While this may seem counterintuitive, this does not suggest that random missense and synonymous mutations will have similar effects, because deleterious variants have already been filtered by natural selection. We hypothesized that missense variants may affect fitness via changes to proteins, while synonymous variants may affect translation via codon usage."
Indeed, they found that significant missense variants were more likely to cause non-conservative amino acid changes and that significant synonymous variants were more likely to be present in genes with strong codon usage bias.
Overall, the researchers concluded, it is possible to assay genotype-phenotype relationships at single-base resolution in parallel, enabling the identification of causal variants for polygenic traits. They also noted that refinements to the CRISPEY approach could allow for the detection of even smaller fitness effects, and more symmetric power to detect effects in both directions.
"To conclude, retron-mediated HDR is highly efficient and potentially adaptable to many species. In addition to measuring fitness across diverse conditions or strain backgrounds, CRISPEY screens could easily be adapted to any trait that can lead to differential strain or allele abundance (including cell sorting based on expression of fluorescent reporter genes)," the authors added. "Looking ahead, implementing CRISPEY in other species — including mammalian cells, in which retrons are functional — could potentially allow rapid, base-pair-level investigation of a wide range of traits and diseases."