NEW YORK (GenomeWeb) – Building upon their pioneering work with short hairpin RNAs, Cold Spring Harbor Laboratory researchers recently published the details of an algorithm that can help predict the potency of shRNAs against a given target.
The algorithm, called shERWOOD, was trained using data from a massively parallel evaluation of shRNAs generated at CSHL. When combined with a new, optimized microRNA scaffold, the algorithm has allowed the scientists to begin creating highly potent shRNA libraries targeting the human and mouse genomes.
Over the past decade, there have been several attempts to develop algorithms for the construction of effective RNAi molecules from endpoint silencing data, but these have largely focused on siRNAs. And while such siRNA resources could be applied to shRNA design, their value was limited by fundamental differences between the molecules, the CSHL group wrote in Molecular Cell.
For instance, shRNAs expressed from RNA pol II or pol III promoters reach lower intracellular concentrations than transfected, synthetic siRNAs, they noted. At the same time, shRNAs must be processed nucleolytically before they are loaded into RISC, whereas siRNAs require no such processing.
Given the need for shRNA-specific design algorithms, the CSHL researchers used a previously developed sensor assay, which allows for the large-scale parallel assessment of shRNA potencies, to interrogate about 250,000 shRNAs.
In looking at individual nucleotide positions for their predictive capacity, the scientists compared, at each position in the target sequence, each nucleotide's enrichment or depletion in potent versus weak shRNAs. They found that, in general, low GC content is predictive of high efficacy, with the exception of the third nucleotide inside the guide target, which shows a strong selection for cytosine.
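The per-position comparison described above can be illustrated with a short sketch. This is not the CSHL group's code: the function name, the pseudocount, and the toy sequences are all assumptions made for illustration. For each position, it computes the log2 ratio of a nucleotide's frequency in potent versus weak sequences, so that positive values indicate enrichment among potent shRNAs and negative values indicate depletion.

```python
import math
from collections import Counter

NUCLEOTIDES = "ACGU"


def position_log_enrichment(potent, weak, pseudocount=1.0):
    """Return one dict per position mapping nucleotide -> log2 enrichment.

    Enrichment is the smoothed frequency of the nucleotide among potent
    sequences divided by its smoothed frequency among weak sequences.
    The pseudocount (an assumption of this sketch) avoids division by zero.
    """
    length = len(potent[0])
    if any(len(seq) != length for seq in potent + weak):
        raise ValueError("all sequences must have the same length")

    scores = []
    for pos in range(length):
        potent_counts = Counter(seq[pos] for seq in potent)
        weak_counts = Counter(seq[pos] for seq in weak)
        row = {}
        for nt in NUCLEOTIDES:
            p = (potent_counts[nt] + pseudocount) / (len(potent) + 4 * pseudocount)
            w = (weak_counts[nt] + pseudocount) / (len(weak) + 4 * pseudocount)
            row[nt] = math.log2(p / w)
        scores.append(row)
    return scores


# Toy, made-up sequences (not data from the study).
potent = ["AUCAU", "UACUA", "AACUU", "UUCAA"]
weak = ["GCGGC", "CGGCG", "GGACC", "CCUGG"]

scores = position_log_enrichment(potent, weak)
for pos, row in enumerate(scores, start=1):
    print(pos, {nt: round(value, 2) for nt, value in row.items()})
```

In this toy data, cytosine at the third position is enriched among the potent sequences while guanine at the first position is depleted, which mirrors the kind of signal the article describes without reproducing the study's actual values.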
They also examined whether any pairs of positions were predictive of shRNA strength beyond what could be inferred from their individual predictive power and discovered that, for any given nucleotide position within a target gene, the most predictive partner is the neighboring nucleotide. "An exception to this trend is observed in the positions corresponding to the shRNA guide seed, where predictive position pairs are also observed in nucleotides separated by up to four bases," they wrote.
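One way to ask whether a pair of positions is predictive beyond its individual members is to compare the information the pair carries about potency with the information carried by each position alone. The sketch below uses mutual information for this purpose. The article does not say which statistic the CSHL group used, so the measure, the function names, and the toy data here are assumptions.

```python
import math
from collections import Counter


def mutual_information(features, labels):
    """Mutual information, in bits, between two equal-length discrete sequences."""
    n = len(labels)
    joint = Counter(zip(features, labels))
    feature_counts = Counter(features)
    label_counts = Counter(labels)
    total = 0.0
    for (feature, label), count in joint.items():
        total += (count / n) * math.log2(
            count * n / (feature_counts[feature] * label_counts[label])
        )
    return total


def pair_gain(seqs, labels, i, j):
    """Information from the pair (i, j) minus the sum from each position alone."""
    pair = [(seq[i], seq[j]) for seq in seqs]
    single_i = [seq[i] for seq in seqs]
    single_j = [seq[j] for seq in seqs]
    return (
        mutual_information(pair, labels)
        - mutual_information(single_i, labels)
        - mutual_information(single_j, labels)
    )


# Toy, made-up data: a sequence is labeled potent (1) only when its
# first two nucleotides match, so neither position is informative alone.
seqs = ["AAG", "UUG", "AUC", "UAC", "AAC", "UUC", "AUG", "UAG"]
labels = [1, 1, 0, 0, 1, 1, 0, 0]

gain_neighbors = pair_gain(seqs, labels, 0, 1)
gain_distant = pair_gain(seqs, labels, 0, 2)
print(round(gain_neighbors, 3), round(gain_distant, 3))
```

A positive gain suggests that the two positions are more informative together than separately, while a gain near zero or below suggests the pair adds nothing new. In real data, estimates like this would need correction for sample size and multiple comparisons.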
For triplets of nucleotide positions, the researchers found that neighboring triplets of positions within the target show strong predictive power compared with triplets of non-neighboring positions, and that the distance between predictive triplets is extended slightly in the guide seed region of the shRNA.
These results were applied to shERWOOD, along with additional heuristics to maximize the probability of successfully reducing protein levels in most cell and tissue types, the CSHL researchers wrote in their paper. Then, the algorithm was used to create an shRNA library against roughly 2,200 genes associated with cell growth and survival in culture, and about 400 olfactory receptor control genes.
The team found that shRNAs from the shERWOOD library outperformed those from two widely used genome-wide shRNA libraries — The RNAi Consortium (TRC) collection distributed by Sigma-Genosys and the Hannon-Elledge V3 library distributed by GE Dharmacon — with 40 percent of shRNAs against essential genes achieving significant depletion. This compared with 24 percent for the TRC library and 31 percent for the Hannon-Elledge collection.
To further boost the efficacy of their shRNAs, the CSHL investigators took a different approach to their construction by using a variant of the miR-30 scaffold typically used. Specifically, they created the shRNAs by Gibson assembly, removing restriction sites altogether.
When shRNAs were placed in this scaffold, which they called ultramiR, mature small RNA levels increased significantly relative to those observed using the miR-30 scaffold. Meanwhile, the majority of the shRNAs were able to cut target mRNA levels by over 80 percent, and in most cases by over 90 percent. In comparison, TRC and Hannon-Elledge shRNAs showed only modest knockdown, according to the Molecular Cell paper.
The shERWOOD-ultramiR shRNAs also proved highly specific, with limited off-target effects.
Overall, the CSHL team found that they could generate shRNA libraries where nearly 60 percent of all hairpins targeting essential genes are strongly depleted in multiplexed screens.
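The practical weight of a roughly 60 percent per-hairpin success rate can be gauged with a simple binomial calculation. The sketch below assumes a library with exactly four hairpins per gene, treats each hairpin as succeeding independently with probability 0.6, and rounds "nearly 60 percent" up to 0.6. The independence assumption and the function name are simplifications introduced here, not claims from the paper.

```python
from math import comb


def prob_at_least(k, n, p):
    """Probability of at least k successes in n independent trials with success rate p."""
    return sum(comb(n, m) * p**m * (1 - p) ** (n - m) for m in range(k, n + 1))


HAIRPINS_PER_GENE = 4   # assumed fixed library design
SUCCESS_RATE = 0.6      # "nearly 60 percent", rounded for illustration

p_multiple = prob_at_least(2, HAIRPINS_PER_GENE, SUCCESS_RATE)
p_any = prob_at_least(1, HAIRPINS_PER_GENE, SUCCESS_RATE)

print(f"at least two hairpins score: {p_multiple:.3f}")  # 0.821
print(f"at least one hairpin scores: {p_any:.3f}")       # 0.974
```

Under these assumptions, about four in five essential genes would be flagged by two or more hairpins. Real hairpins against the same gene are unlikely to be fully independent, so the figure should be read as a rough guide only.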
"This means that, for any library containing, on average, four hairpins per gene, most bona fide hits will be identified by multiple hairpins, greatly reducing the probability of false-positive calls," the scientists wrote.
Currently, the scientists are constructing two new sequence-verified shRNA libraries targeting the mouse and human genomes. The first, which includes about 75,000 shRNAs against human genes and 40,000 against mouse genes, uses shERWOOD with the canonical miR-30 scaffold. The second, which combines the algorithm's designs with the ultramiR scaffold, is about 50 percent complete and is being made available by Transomic Technologies.
"We feel that the combination of improvements to shRNA technologies described herein creates a next-generation RNAi toolkit that will produce more reliable outcomes for investigators, whether applied on a gene-by-gene basis or in the context of unbiased, genome-wide screens," the CSHL investigators concluded in their paper.