At A Glance
Name: Fran Lewitter
Position: Director of bioinformatics and research computing, Whitehead Institute
Background: PhD, genetic epidemiology, University of Colorado, Boulder 1977
Education: BS, mathematics, University of Wisconsin 1972
Fran Lewitter started her undergraduate studies focusing on pure math. However, she became interested in the biological sciences after a course on archeology revealed a significant role for mathematics in genetics.
After working as a postdoc at Harvard Medical School, she held various positions at GenBank and Brandeis University. Ultimately, she made her way to the Whitehead Institute, where she has been for 12 years, five of them as head of her own computational biology group.
During her time at Whitehead, she oversaw the development of an online siRNA selection program, located at http://jura.wi.mit.edu/siRNAext, that provides researchers with assistance in finding oligos that will knock down a gene of interest.
She recently spoke to RNAi News about her work.
Let's start with an overview of what you do at Whitehead.
What my group does is work with the scientists here at Whitehead to help their science. We collaborate, we teach courses, and we consult, so people can come in and ask, 'How do you do a multiple alignment?' or something like that.
But what we've seen in the past couple of years are more complex questions, which require much more program building. … Brent Stockwell, who was a Whitehead fellow, came to me very soon after Tom Tuschl's paper [in Nature] was published showing RNAi in mammals. He asked me if there were any bioinformatics tools to apply to the [siRNA] selection process. That's how we started the website … [which] is the first tool from my group that we've made publicly available.
[When Stockwell came to me,] I was by myself at that point. I did the programming [for the siRNA tool], which involved figuring out all the possible 21 mers that could be selected from the sequence he was interested in. Then I decided we should do some BLAST searching to make sure that these different sequences were unique in the genome. It was all pretty manual, and I selected some possible siRNAs that he used. In fact, they worked, and that was gratifying.
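The enumeration step Lewitter describes, listing every possible 21-mer in a target sequence, can be sketched in a few lines of Python. This is only an illustration, not the Whitehead code: the function name `enumerate_kmers` and the 1-based position convention are assumptions made here, and the BLAST uniqueness check she mentions is not shown.

```python
def enumerate_kmers(sequence, k=21):
    """Return (position, k-mer) pairs for every window of length k.

    Positions are 1-based, which is an assumption made for this sketch.
    A sequence shorter than k yields an empty list.
    """
    seq = sequence.upper()
    return [(i + 1, seq[i:i + k]) for i in range(len(seq) - k + 1)]


if __name__ == "__main__":
    # Hypothetical 24-base input, used only to show the call.
    for position, kmer in enumerate_kmers("ACGT" * 6):
        print(position, kmer)
```

A sequence of length n produces n - 20 candidate 21-mers, which is why a manual uniqueness check on each one quickly becomes tedious.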
Then I had a bioinformatics programmer, Bingbing Yuan, join me that summer [after Stockwell] asked for help, and I put her on this project. She developed the website, and she's done a really great job building the tool.
When you first put the tool together, where did you come up with selection criteria?
Brent gave us some ideas about what to do, and I knew Tom Tuschl, who had been a postdoc here at Whitehead. He was a little bit skeptical about doing much bioinformatics, but in the end he's been very supportive and, in fact, has been very helpful in giving us ideas and testing [the siRNA tool]. His people were using it while he was [at the Max Planck Institute], and we've been collaborating with Markus Hossbach who worked with [Tuschl] over in Germany.
So the first version of the tool you put together, you said, was successful?
Well, the first selection that I did fairly manually was successful. Then we opened it up to Whitehead scientists (our first release of the website was just for Whitehead scientists), and we got good reviews from people here. Then we made it publicly available in 2003 because people were asking for it at that point.
Could you give an overview of the website and what it involves?
Let me back up and say that one of the goals is to make it a flexible webserver. A lot of people like black boxes, and want to put in a sequence and get some candidate oligos out. The problem is that the rules keep changing, and new experimental information is being discovered all the time. We try to allow for incorporation of that kind of new information, instead of making the website static.
You, of course, have to start with your sequence search or an accession number, and then you can enter some patterns, and these are the original patterns that Tom Tuschl had come up with: the AA, 19 of anything, and then a TT. But we also allow for custom patterns, so you can enter very complex patterns if you have some idea of what should be in a certain position.
Once you put your sequence in, the next place where there's selection criteria is where you get a list of possible hits that match your pattern, what position they're in within the sequence, GC percent, thermodynamic values, and you're allowed at that point to select which of the sequences you want to use for further analysis. You do a Blast search, and we provide flexibility for what program you would use, whether it's an NCBI Blast or WU Blast. And you can search different species (human, rat, mouse) against different databases such as RefSeq or Ensembl, things like that.
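As a rough illustration of the pattern step described above (AA, 19 of anything, then TT, reported with position and GC percent), the following Python sketch uses a regular expression. It is not the website's implementation. The names `TUSCHL_PATTERN`, `gc_percent`, and `find_candidates` are hypothetical, and computing GC percent over the 19-base core rather than the full 23-base site is an assumption made here. Thermodynamic values and the Blast step are left out.

```python
import re

# AA + 19 of any base + TT. The lookahead lets overlapping sites be reported.
TUSCHL_PATTERN = re.compile(r"(?=(AA[ACGT]{19}TT))")


def gc_percent(seq):
    """Percentage of G and C bases in seq."""
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def find_candidates(sequence):
    """List sites matching AA(N19)TT with 1-based position and GC percent."""
    seq = sequence.upper().replace("U", "T")
    hits = []
    for match in TUSCHL_PATTERN.finditer(seq):
        site = match.group(1)
        core = site[2:21]  # the 19 bases between AA and TT (assumption)
        hits.append({
            "position": match.start() + 1,
            "site": site,
            "gc_percent": round(gc_percent(core), 1),
        })
    return hits
```

A custom pattern of the kind Lewitter mentions would amount to swapping in a different regular expression, for example one that fixes a particular base at a particular position.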
Once you do your search, and this is where the newest feature has come in, you can filter for off-target effects. This is based on some of the work that Tom Tuschl has published in the last year, which shows that a match to other sequences in certain positions is not good. We have the filtering steps of the Blast that let you remove the siRNAs that have potential off-target effects. It's quite flexible: if you have Blast hits that share a certain number of matches, you can eliminate those as a possibility. So, you can say, "Anything that matches with more than 12 bases should be eliminated." You can also do it by position.
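The "more than 12 bases" rule can be sketched as a simple filter. Everything about the data layout here is an assumption for illustration: each candidate siRNA is paired with a list of off-target hit sequences that are already aligned to it without gaps, and the names `count_matches` and `filter_off_target` are hypothetical. The real tool works from Blast output, which this sketch does not parse.

```python
def count_matches(sirna, other, positions=None):
    """Count identical bases between two ungapped, aligned sequences.

    positions: optional iterable of 0-based indices to compare; when
    omitted, every aligned position is compared.
    """
    limit = min(len(sirna), len(other))
    indices = range(limit) if positions is None else positions
    return sum(1 for i in indices if i < limit and sirna[i] == other[i])


def filter_off_target(candidates, hits_by_candidate, max_matches=12,
                      positions=None):
    """Keep candidates whose off-target hits all match at <= max_matches bases.

    hits_by_candidate maps a candidate sequence to its aligned off-target
    hit sequences (a hypothetical structure, not a Blast format).
    """
    kept = []
    for candidate in candidates:
        hits = hits_by_candidate.get(candidate, [])
        if all(count_matches(candidate, hit, positions) <= max_matches
               for hit in hits):
            kept.append(candidate)
    return kept
```

Passing `positions` restricts the comparison to chosen positions in the siRNA, which is one simple way to model filtering by position.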
You talked about wanting this to be a flexible tool. Is there still work being done to refine it?
Currently, we don't have a release date for any new features, but we will. We actually have been getting a lot of user feedback. We do require that people register, and the reason is that we have to limit how many searches a person can do on a given day because we have limited resources. So we ask for a very simple registration, and we have over 6,000 registered users.
Are those pretty much all academic?
A fair amount are; the majority, certainly. But we do allow commercial users. And we're getting requests for licenses to our software. We're working on that.
Is that like a version of the website that people can put onto their computer and run all they want?
Any sense of when that might be available?
We're going to try it with one place first, and see how it goes. That will be coming up soon. I hate to put a date on it, and I'm not convinced that we're going to want to make it publicly available, because then you get into maintenance. Everybody's environment is different, so the [program] is going to behave differently.
When you make updates and changes [to the tool], is that something you do with people on the outside or … do you work on it internally?
Well, for instance, after Phil Zamore (who was also a postdoc here at Whitehead) published the thermodynamics paper in Cell, he sent me an e-mail and said, "Incorporate this." Well, I would have done that anyway.
Then, certainly, I'm in touch with Tom Tuschl and from time to time he will send suggestions. I'm not sure we get that many [comments] from users, per se, but we scan the literature and see if there are new rules. There haven't been a lot of new rules recently, though.