NEW YORK — Using a systematic approach, researchers from the University of California, Berkeley have identified thousands of new large serine recombinases (LSRs) that could be used in genome editing and other genome engineering approaches.
Bacteriophage-origin recombinases and transposases evolved to allow large swaths of genetic material to be inserted into bacterial genomes at certain sites, a feature that can be harnessed by a range of lab workflows. LSRs in particular have traits that lend themselves to genome engineering applications, such as their ability to insert large cargoes, site specificity, unidirectionality, and simplicity as only one LSR protein and one DNA donor are needed to insert a gene. But their adoption has been hampered by the limited number of LSRs that have been identified and their low integration efficiencies.
By scouring the genomes of nearly 195,000 bacterial isolates, both clinical and environmental, the Berkeley-led team uncovered more than 12,600 candidate LSRs and predicted their attachment sites, as they reported on Monday in Nature Biotechnology. Some of the newly identified LSRs had much higher integration efficiencies than a previously characterized one.
"Given my experience developing CRISPR technologies over the last decade, we've learned that diverse CRISPR-Cas nucleases can have widely varying activities in human cells," co-senior author Patrick Hsu, an assistant professor at Berkeley, said in an email. "We therefore sought to leverage the vast wealth of metagenomic data to systematically describe the broad diversity of LSRs and then experimentally identify versions that could be significantly more efficient in human cells."
The researchers computationally searched 194,585 bacterial genomes for signs of LSRs within mobile genetic elements. Because mobile genetic elements integrated by LSRs retain their attachment sites, the researchers could also analyze the target sites of the LSRs. Through this, they identified 12,638 candidate LSRs, which they whittled down to 6,207 unique LSRs and cognate attachment sites. This boosted the number of known LSRs and cognate attachment sites by more than a hundredfold. The LSR enzymes could further be grouped into three functional classes: landing pad LSRs, human genome-targeting LSRs, and multi-targeting LSRs.
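The idea that an integrated element retains its attachment sites can be illustrated with a toy sketch. It assumes the textbook model of serine recombinase integration, in which a bacterial site (attB) and a phage site (attP) recombine at a shared core to give two hybrid junctions (attL and attR), so that the original sites can be pieced back together from the junctions. The sequences, the two-base core, and the function names below are hypothetical and chosen only for illustration; this is not the pipeline used in the study, which the article does not describe in detail.

```python
def split_at_core(site: str, core: str) -> tuple[str, str]:
    """Split a site into the flanks left and right of its core.

    Toy assumption: the core occurs exactly once in the site.
    """
    if site.count(core) != 1:
        raise ValueError("core must occur exactly once in the site")
    idx = site.find(core)
    return site[:idx], site[idx + len(core):]


def integrate(att_b: str, att_p: str, core: str) -> tuple[str, str]:
    """Simulate integration: attB x attP -> (attL, attR)."""
    b_left, b_right = split_at_core(att_b, core)
    p_left, p_right = split_at_core(att_p, core)
    att_l = b_left + core + p_right  # bacterial left flank, phage right flank
    att_r = p_left + core + b_right  # phage left flank, bacterial right flank
    return att_l, att_r


def reconstruct(att_l: str, att_r: str, core: str) -> tuple[str, str]:
    """Recover (attB, attP) from the junctions of an integrated element."""
    l_left, l_right = split_at_core(att_l, core)
    r_left, r_right = split_at_core(att_r, core)
    att_b = l_left + core + r_right
    att_p = r_left + core + l_right
    return att_b, att_p


# Hypothetical toy sequences; "GT" stands in for the shared core.
CORE = "GT"
ATT_B = "AAACCC" + CORE + "CCAAAA"
ATT_P = "TTTAAA" + CORE + "AAATTT"

att_l, att_r = integrate(ATT_B, ATT_P, CORE)
recovered_b, recovered_p = reconstruct(att_l, att_r, CORE)
print(att_l, att_r)
print(recovered_b == ATT_B, recovered_p == ATT_P)
```

In real genomes the element boundaries and core are not known in advance and have to be inferred, which is the harder part of the problem that this sketch leaves out.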
In all, they synthesized and functionally tested more than 60 diverse LSRs.
As the integration of genetic cargo at pre-installed genomic landing pad sites is a key potential LSR application, the researchers examined how well a handful of the new landing pad LSRs, including Kp03 and Pa01, integrated cargo into landing pad cell lines. Both Kp03 and Pa01 outperformed Bxb1, a previously characterized LSR, improving on its integration efficiency by twofold to sevenfold. They also exhibited integration efficiencies of 40 percent to 75 percent for cargoes larger than 7 kilobases.
They additionally described a new LSR-driven approach to integrate an amplicon library into a landing pad in human cells. This, they added, indicates that higher efficiency landing pad recombinases could be applied to functional genomic applications and shorten genetic screening analysis times by eliminating the need for library cloning and lentiviral delivery.
In addition to that application, Hsu said he is also excited about the therapeutic potential of LSRs as a gene-writing technology. Since LSRs require only one protein and one DNA donor, Hsu noted that therapeutic delivery would be simpler than with other approaches.
"This was a major motivation for us — we wanted to think backwards from the therapeutic potential and simplicity of delivery," he said.
Hsu added that "we're very excited about exploring the translational potential of integrases as a platform approach."