Bioinformatics tools are adding precision to the process of selecting sequences for RNAi experiments
By Adrienne J. Burke
As RNA interference quickly becomes the number one new method for studying gene function in model organisms and human cell cultures, bioinformatics programmers are emerging onto the scene to help you design RNA oligos that can do the job.
By most accounts, an RNAi experiment is considered successful when a double-stranded RNA introduced into a cell not only matches the gene of interest and suppresses its activity by inactivating its messenger RNA, but also leaves untouched the genes that you did not intend to perturb.
For those attempting to design synthetic short interfering RNAs for this purpose, RNAi expert Tom Tuschl, known for being the first to silence mammalian genes with the method, offers online guidance for increasing the odds that the 21-base RNA sequence you choose will work. Follow it, he says, and there’s a 70 percent chance that the siRNA you design will succeed in downregulating your gene of choice.
Still, there’s that 30 percent chance it won’t. And just why it won’t is a question that has Tuschl and other RNAi pioneers scratching their heads.
Consider a few questions that remain to be answered by the experts: If an RNAi oligo has homology to more than one gene in your sample, why does it silence one but not the other? If you have two RNA sequences that are both perfect and unique matches to one gene, why would one, but not the other, silence the gene? How could a 21-base siRNA silence another gene in your sample with which it has only 12 bases in common?
Sayda Elbashir, formerly of Tuschl’s Max Planck Institute laboratory and now a senior scientist at the Boston-based RNAi therapeutics company Alnylam, has explored some of those questions. A paper she authored for Methods (No. 26, 2002) outlines some design protocols, and one currently in press for the journal Antisense & Nucleic Acid Drug Development describes a series of comparative studies that look at what role the targeted region of a sequence might play in silencing efficiency. Elbashir says her team spent nine months conducting countless experiments, only to reach no definite conclusions. Among the possible reasons a duplex does not silence a gene? “We speculated that the secondary structure of the messenger RNA might play a role, maybe the area you target is not open to being recognized with the siRNA, maybe the sequence composition plays a role,” Elbashir says. “We don’t have solid evidence to say if you select this sequence from this region you get 100 percent silencing.”
Conundrums like these can be great challenges for experts like Elbashir, but nagging sources of frustration for gene function researchers who just want to take advantage of a new tool. As Whitehead Institute bioinformaticist Fran Lewitter notes, “If you have a gene of 500 bases, there are a lot of 21-length sequences you could choose. These experiments can be pricey and you don’t want to waste time.”
Tuschl, whose Rockefeller University lab is working on perfecting siRNAs, recommends tactics such as choosing a region with 50 percent or less GC content or aiming for at least a four-nucleotide difference from any other 21-base sequence in your organism. Elbashir suggests that if you can’t find a sequence without homology to a gene other than your target, a mid-sequence mismatch to the other gene is best. Tuschl adds, “Never use [just one] RNA. We always make four per target and if one doesn’t work … we have three others that [do].”
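To make those rules of thumb concrete, here is a minimal Python sketch of how a first-pass screen might look. It is an illustration, not Tuschl's or anyone else's actual tool: the function names, the exhaustive window scan, and the way off-target sequences are supplied are all assumptions made for this example. It walks every 21-base window of a target (a 500-base gene yields 500 - 21 + 1 = 480 of them), discards windows above 50 percent GC, and keeps only those that differ by at least four nucleotides from every same-length window of the other sequences you hand it.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def windows(seq, size=21):
    """Yield (start, subsequence) for every window of the given size."""
    for start in range(len(seq) - size + 1):
        yield start, seq[start:start + size]


def min_mismatches(candidate, other):
    """Fewest mismatches between candidate and any same-length window of other.

    Ungapped comparison only; a real homology search (for example Blast)
    would be more thorough.
    """
    size = len(candidate)
    best = size  # if 'other' is shorter than the candidate, nothing can match
    for _, win in windows(other, size):
        mismatches = sum(a != b for a, b in zip(candidate, win))
        best = min(best, mismatches)
    return best


def pick_candidates(target, off_targets, size=21, max_gc=0.5, min_diff=4):
    """Hypothetical screen: GC at or below max_gc, and at least min_diff
    mismatches against every sequence in off_targets."""
    picks = []
    for start, cand in windows(target.upper(), size):
        if gc_fraction(cand) > max_gc:
            continue
        if all(min_mismatches(cand, o.upper()) >= min_diff for o in off_targets):
            picks.append((start, cand))
    return picks
```

A screen like this only narrows the field. As the open questions above suggest, passing these filters does not guarantee silencing, which is one reason Tuschl makes four siRNAs per target.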
The most obvious way to design a 21-mer for a particular gene is to conduct a Blast search of GenBank. But for those who want less trial and error and more precision, there’s a growing list of more sophisticated bioinformatic options. They won’t explain all the RNAi enigmas, but they can increase your chances of coming up with an oligo that works. The box on page 27 lists URLs for several places to get siRNA design guidance, and we describe a few options here.
Public siRNA Programs — Fran Lewitter’s bioinformatics team at Whitehead unveiled a design program in February that codifies Tuschl and Elbashir’s siRNA selection rules. Starting with the messenger RNA instead of genomic DNA, the program returns a list of possible siRNAs, ruling out any homologs. By searching SNP and splice variant databases, the program also lets users know if proposed siRNAs contain known SNPs or alternative splice variants. The program can be instructed to exclude regions with exon boundaries and choose sequences specific to the targeted gene or an entire gene family. Results are available directly on the web or can be e-mailed to the user.
Ye Ding, a bioinformaticist at the New York State Department of Health’s Wadsworth Center, offers another public program. Sfold is a program based on a patent-pending algorithm for predicting mRNA structures. The program can be used for RNA folding and the rational design of RNA-targeting nucleic acids.
RNA oligo vendors — Most companies that sell siRNAs offer free design programs on their websites. Ambion, Dharmacon, and Qiagen currently offer such tools, and Proligo says that it intends to soon. Whether you’re ordering oligos from the company or not, you can use their web tools to search for sequences.
MWG, which sells Dharmacon-made siRNAs in Europe, has just begun offering an online design tool, too. As at other vendors’ sites, customers can paste in a sequence accession number from GenBank, define preferences such as percentage of GC content, and specify which databases they would like the sequence searched against.
Molecula, another siRNA design company, says it is utilizing a proprietary in-house design system called Target, adapted from the company’s 10-year-old antisense oligos business. The program limits GC content to less than 60 percent, generates symmetric duplexes with both 3’ ends as dTdT deoxynucleotides, and chooses designs so that siRNA can be expressed from polIII expression vectors.
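Molecula's Target system is proprietary, so the following Python sketch is only a guess at what the two published constraints imply, not a description of the company's code. It assumes the common layout of a 19-base paired core plus a two-base 3' overhang on each strand (21 bases per strand); the function names and the returned dictionary are invented for illustration. Given a core sequence, it rejects anything at or above 60 percent GC and otherwise writes out a symmetric duplex with dTdT at both 3' ends.

```python
# RNA base-pairing table used to derive the antisense strand.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def to_rna(seq):
    """Convert a DNA-style sequence (with T) to RNA letters (with U)."""
    return seq.upper().replace("T", "U")


def reverse_complement_rna(rna):
    """Reverse complement of an RNA sequence, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def build_duplex(core, max_gc=0.60):
    """Hypothetical helper: return sense and antisense strands, each with a
    3' dTdT overhang, or None if the core's GC content is max_gc or higher."""
    rna = to_rna(core)
    gc = (rna.count("G") + rna.count("C")) / len(rna)
    if gc >= max_gc:
        return None
    return {
        "sense": rna + "dTdT",
        "antisense": reverse_complement_rna(rna) + "dTdT",
        "gc": gc,
    }
```

Both strands are written 5' to 3', so the two dTdT tails sit at opposite ends of the paired region, which is what makes the duplex symmetric. Compatibility with polIII expression vectors is a separate constraint that this sketch does not attempt to model.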
For those willing to give up some control when ordering siRNAs from a vendor, Dharmacon is best known for its knockdown guarantee. The company says that its Smart Selection tool designs siRNAs with a 99.99 percent chance of suppressing the activity of your chosen gene by at least 75 percent. Exactly how Smart Selection works is a carefully guarded secret. “There’s a lot of proprietary information and bioinformatics investment in it,” says Dharmacon CEO Steven Scaringe.
But at a recent RNAi conference in Waltham, Scaringe shed some light on the algorithms Dharmacon uses: through studying about 360 RNAi duplexes on four different genes, the company has honed 34 selection criteria. These criteria, which expand on Tuschl’s rules, use weighted scoring to enhance the probability of silencing. The company then pools four selected siRNAs in a manner that is also proprietary. “Mimicking natural siRNA pools [offers] much higher specificity,” Scaringe said at the conference.
Some are skeptical of the powers of the black-box technology. A Merck researcher who compared 25 siRNAs custom-designed by Dharmacon with 25 sequences selected by the Rosetta bioinformatics team says, “I can tell you that our rules are not significantly different from Dharmacon’s because the results are indistinguishable. My sense is that they don’t have a magic bullet.”
Under its most economical offering, Dharmacon sells a pool of four oligos with the guarantee that one in the cocktail will induce gene silencing. But unless you are willing to pay for a higher level of service, you don’t get to find out which sequence actually worked. Tuschl and others have criticized the company’s policy, but Scaringe defends it, saying that the product is reasonably priced and comes with a guarantee. He acknowledges, however, “We have to work with the market. If you absolutely need the sequences we’ll share them with you. … Who knows, our policy may need to change in six months.”
RNAi services companies — For help developing in vitro double-stranded RNAs, MapRNAi is a new tool from ReceptorBase, a Baltimore company founded by GPCR expert Frank Kolakowski. MapRNAi can purportedly identify every potential RNAi and associated networks for a given genome. Kolakowski explains that the approach “allows us to drop a genome on top of a computer program that will go through and compute all possible mRNA-length oligos and label whether they are unique. … You can use this information to make sure you can design the correct siRNA with a lot of information about the natural network that might be there, or use this approach to control signal transduction pathway or to control a set of genes linked in some other aspect of physiology.”
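ReceptorBase has not published how MapRNAi works, but the basic idea Kolakowski describes, enumerating every oligo of a given length and flagging the unique ones, can be sketched in a few lines of Python. Everything below is an assumption made for illustration: the function names, the 21-base default, the dictionary of named transcripts standing in for "a genome," and the choice to count only exact matches.

```python
from collections import defaultdict


def index_kmers(transcripts, k=21):
    """Map every k-base oligo to the set of transcript names containing it.

    'transcripts' is a dict of {name: sequence}; a stand-in for whatever
    genome-scale input a real tool would read.
    """
    owners = defaultdict(set)
    for name, seq in transcripts.items():
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            owners[seq[i:i + k]].add(name)
    return owners


def label_unique(transcripts, k=21):
    """Label each oligo True if it occurs in exactly one transcript."""
    owners = index_kmers(transcripts, k)
    return {kmer: len(names) == 1 for kmer, names in owners.items()}
```

Run on two toy sequences with a short oligo length, `label_unique({"geneA": "ACGTA", "geneB": "CGTAC"}, k=3)` marks ACG and TAC as unique and CGT and GTA as shared. Because the index records exact matches only, it would not catch the kind of partial homology, such as 12 shared bases out of 21, that can still lead to unintended silencing.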
Alan Kopin, an advisor to the company and a Tufts-New England Medical Center Drosophila researcher, had, until now, been using GenBank Blast searches to choose RNAi sequences. He expects the ReceptorBase tool to turn selection into a rational process. “This takes it to a level that’s much more precise,” he says.
Bioinformatics vendor Compugen argues that you need to consider splice variants when you select your siRNAs and is offering a design service using its transcriptome database. Compugen’s executive director of technical marketing, Alon Amit, says half of all genes have splice variants and it’s necessary to understand their structures in order to fine-tune your siRNA selection.
To Tuschl and other RNAi researchers, however, many of the commercial services seem like icing on the cake. After all, they have yet to encounter a gene that they couldn’t silence eventually.
Where to Find siRNA Design Guidance Online