AT A GLANCE
Name: Bernard Mathey-Prevot
Position: Director, Drosophila RNAi Screening Center, Harvard Medical School, 2004-present; and associate professor of pediatrics, Harvard Medical School, 1995-present
Education: Postdoc, biology, Whitehead Institute and Massachusetts Institute of Technology, 1983-1987; PhD, biology, Rockefeller University, 1983
Researchers conducting large-scale RNAi screens in mammalian cells learned early on to guard against siRNA’s propensity for off-target effects. However, it was long thought that off-target effects were not a problem in large-scale RNAi screens in cells from other important model organisms, such as Drosophila and C. elegans.
That idea is now changing, and Bernard Mathey-Prevot’s research group at the Drosophila RNAi Screening Center at Harvard Medical School recently completed a study that provides evidence of off-target effects associated with long dsRNAs in Drosophila cell-based assays.
The group’s research was published in the October issue of Nature Methods along with an accompanying commentary by a consortium of leading RNAi researchers.
This week, Mathey-Prevot took a few moments to discuss his group’s research with CBA News.
Your group conducts high-throughput RNAi screens in Drosophila cells. In the Nature Methods research paper, you wrote that off-target effects in this organism and in C. elegans were previously thought not to be problematic. Can you explain why that view is changing?
I think it’s mostly because of the ability of C. elegans and Drosophila cells to tolerate the presence of long double-stranded RNA without invoking the interferon response. That meant that people were able to use, say, a 500- or 600-nucleotide-long double-stranded RNA that would be processed by Dicer into siRNAs. The thought was that, because of that, you were going to have a large pool of very different siRNAs, each specific to the same gene, since they were all derived from a particular region of that gene. By doing so, you would have what many people have argued is a dilution effect. Let’s say you have a bad siRNA in there that causes some off-target effects, simply because it has some homology to other genes, or because its seed region matches that of a microRNA – which is also recognized nowadays as a source of off-target effects. That effect would basically be diluted by all the other ‘good’ siRNAs present in the cells after Dicer processed the RNA. Therefore, this dilution would protect from the off-target effects that were definitely observed in mammalian cells, where people in the past used just a single siRNA, so the concentration of that siRNA would be much higher than that of any of the processed siRNAs in Drosophila or C. elegans. This really was the dogma, in a way.
I must add that there is some truth to that, and the field, even in mammalian cells, has moved toward pooling siRNAs to reduce off-target effects. There is no question there is a dilution effect, but beyond this, there is definitely still the possibility of observing off-target effects. We can say this with confidence because, rather than just taking a single example of a particular region – like the Ma and Beachy paper in Nature [Ma et al., Nature 2006; 443:359-63. Prevalence of off-target effects in Drosophila RNA interference screens] did – we looked at all the double-stranded RNAs in our collection across 30 genome-wide screens, and at their statistics, basically. We then identified a number of dsRNAs that were called hits by investigators because they caused a phenotype. Those turned out to be hits at much higher frequencies than expected by chance alone when compared to all the other dsRNAs. And one distinguishing factor for these dsRNAs is that each of them had a region of homology of at least 19 nucleotides to other genes. This was the best predictor, and still is, at least in Drosophila, for off-target effects. It was an alarm, because that would definitely lead to the identification of false positives in genome-wide screens.
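The ≥19-nucleotide homology test Mathey-Prevot describes can be sketched as a simple k-mer scan. This is a minimal, hypothetical illustration of the idea – flagging any other transcript that shares a perfect 19-nucleotide stretch with a dsRNA – not the DRSC’s actual prediction pipeline:

```python
def kmer_index(seq, k=19):
    """Map every k-mer in a sequence to its start positions."""
    index = {}
    for i in range(len(seq) - k + 1):
        index.setdefault(seq[i:i + k], []).append(i)
    return index


def off_target_hits(dsrna, transcripts, k=19):
    """Flag transcripts sharing a perfect >= k-nt stretch with a dsRNA.

    `transcripts` maps gene names to mRNA sequences; any gene other
    than the intended target that shares a k-mer with the dsRNA is a
    predicted off-target. (Illustrative only: a real pipeline would
    also scan the reverse complement and use indexed search tools.)
    """
    ds_kmers = set(kmer_index(dsrna, k))
    hits = {}
    for gene, seq in transcripts.items():
        shared = [kmer for kmer in ds_kmers if kmer in seq]
        if shared:
            hits[gene] = shared
    return hits
```

Run against a toy transcriptome, any gene returned by `off_target_hits` that is not the intended target would be treated as a predicted off-target of that dsRNA.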
Why did your group decide to investigate this in the first place? Were you having issues with off-target effects in your screens?
There was always the concern, because these off-target effects were so prominently discussed and recognized in the mammalian field, and we were often asked whether we were worried about them. In the beginning, like everybody else, we thought there was no real issue. But then we realized nobody had really tested that. The first thing that drew our attention to a potential problem is that the collection of dsRNAs we are using at the DRSC was based on an early genome annotation in which, in some cases, two open reading frames were annotated as two different genes when, in later annotations, they turned out to be a single gene. And we had a number of these, maybe several hundred. So we could go back and say, ‘OK, if one in the pair scores in one assay, you’d expect the other to score as a hit in the same assay as well.’ And we found that there was a fair amount of discordance – although with some of them the agreement was perfect. Most of the time, when it didn’t work, it turned out that one or two of the long dsRNAs had perfect homology to other genes. We originally looked at 21 or 23 nucleotides, simply because that is the length of the siRNAs made by Dicer. In fact, when we did a careful analysis, we started to see noticeable problems as soon as we had 17 nucleotides of perfect homology. The big threshold, which we found to be the point where you need to worry, is 19 nucleotides.
This is the reason your group thinks the off-target effects are happening?
Yes. Experimentally, we were able to show that even in the case of a dsRNA with one or two perfect homologies predicted in silico, that doesn’t necessarily mean you’re going to have knockdown of that particular off-target gene. We can’t say for a fact that this will always happen. But in one case, we were able to show very dramatically that one of the predicted off-target genes is completely knocked down in the cell.
The complicating factor in this type of analysis is that Dicer has processive activity. In other words, it will start at one end or the other of a dsRNA and then digest it in 21- or 23-nucleotide blocks. That means you won’t actually get all the siRNAs you could predict by doing a computer analysis of a whole 400-basepair strand. A large number of siRNAs are possible if you slide along one nucleotide at a time, but that’s not the case in vivo, where you will most likely have only a subset of those predicted siRNAs because of Dicer. This is also why you might sometimes have the excellent dilution effect I was alluding to earlier, because you have a lot of good siRNAs; but sometimes, because of Dicer, you may not have this excellent diluting activity.
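The gap between an in-silico sliding-window prediction and processive cleavage can be made concrete with a toy model. This is a hypothetical sketch – real Dicer processing is less regular than strict end-to-end 21-nucleotide blocks – but it shows why only a subset of the computationally enumerable siRNAs may exist in vivo:

```python
def sliding_sirnas(dsrna, n=21):
    """Every n-mer an in-silico scan would enumerate (offset by 1)."""
    return [dsrna[i:i + n] for i in range(len(dsrna) - n + 1)]


def processive_sirnas(dsrna, n=21):
    """siRNAs if Dicer chops in consecutive n-nt blocks from one end.

    A simplified, hypothetical model of processive cleavage: only
    non-overlapping windows starting at one terminus are produced.
    """
    return [dsrna[i:i + n] for i in range(0, len(dsrna) - n + 1, n)]
```

For a 400-basepair dsRNA, the sliding scan enumerates 380 candidate 21-mers, while the block model yields only 19 – so an off-target 19-mer predicted on paper may or may not sit inside an siRNA Dicer actually makes, and the pool available for the dilution effect is smaller than the in-silico count suggests.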
What is the cautionary note to researchers conducting these types of screens?
Every new technology tends to be a little overhyped at the beginning as the ultimate tool to do, in this case, functional genomics. I think that in this early enthusiasm, people started to do these genome-wide screens and to publish long lists of genes implicated in a process. We wanted to make sure the community was aware that, among some of the earlier published gene lists, there may be some false positives, and there is a very easy way to check: for whatever gene was published, look up which dsRNA was used to target it – they can do this through our website – and run it through the program that we use. If the program predicts a large number of off-target effects, my worry would be that the result is in fact just an off-target effect, and basically an artifact. They should not put too much stock in that particular result until they can confirm it with a second, independent dsRNA. A screen is a screen, and people who have done genetic screens know that when you come up with 20 or 30 genes, some of them will just be background and not real, and some will be real. These genome-wide RNAi screens fall into the same category. Each list of genes should be taken as a suggestive list, not necessarily a validated list, because you can’t confirm in vivo the validity of everything you found in a cell-based assay – it’s just too much work.
In the commentary accompanying the paper in Nature Methods, your group, along with other prominent RNAi researchers, mentions that one of several possible solutions for improving specificity is to use high-content, or multiparametric, assays. Can you elaborate?
Many of the screens initially done in our center relied on things like transcriptional reporter assays based on luciferase. These assays really measure a readout downstream of a signal transduction pathway, and they are susceptible to certain situations – let’s say one dsRNA has a metabolic toxicity, where the cells just tend to die. Off-target effects can sometimes happen simply by knocking down the other gene recognized by the siRNA. But in other cases, it looks like you may have very little knockdown of one or two of the off-target genes, yet there is a sort of general malaise of the cells, and we don’t really understand the physiology of that. The cells are going to be a little sick or die, and that may reduce the cell number. And whenever you do these very simple reporter gene assays, you often need to normalize for cell number. At very low cell numbers, there are issues of linearity with the reporter that is basically measuring cell number: it is no longer linear, and you can get skewed values that make you believe you’ve gotten a hit in your assay, when it’s really due to the nonlinearity of your normalization.
If you look at multi-parametric, or high-content, assays, the ideal assay for us is one that … involves a sequence of molecular steps that all need to take place in order to give you a phenotype if the process is disrupted – it’s not just one or two steps and then you no longer have a signal; you have to have maybe several steps. For example, in muscle differentiation – Norbert Perrimon’s lab [at Harvard Medical School] carried out a genome-wide screen in primary cells to look at muscle differentiation. That is really a very complicated process, and I think the moment you see that only the muscle cells in that particular culture seem to be disrupted and unable to carry out differentiation, you feel much more comfortable that you have really disrupted the process, and not just a general metabolic pathway that makes the cells look a little bit sick. Because this was done using a mix of primary cells, including neuronal, muscle, and other cell types, when you see the phenotype affecting only the muscle cells, and you can measure that, you reduce the risk of being fooled by some generic off-target effect. And if you use multi-parametric assays with endogenous reporters – for instance, mRNA transcripts that you have shown by profiling to be typical of a signature response – if you have five or six mRNAs that correspond to that signature and you can follow them inside the cells, it’s going to give you much more confidence that if the program is disrupted across all five or six markers, you’re really looking at a process of interest as opposed to something that is somewhat implicated in maintaining cell health.