By Meredith W. Salisbury
Ye Ding, a career statistician who works for the New York State Department of Health, isn’t what you’d call a likely hero for the rapidly growing RNA interference crowd. But Ding believes he can improve high-throughput functional genomics with a superior process for designing antisense oligos and choosing short interfering RNAs to be used in gene silencing research.
Ding, whose statistical background harks back to the ’80s at Carnegie Mellon University, joined the Wadsworth Center, a research unit of the state’s health department, in 1990 to do statistical modeling. By 1997, his work had morphed into RNA structure prediction — the key underpinning for his work with siRNAs.
Seeking a more surefire method of choosing siRNAs than the prevailing trial-and-error routine, Ding found that the available algorithms for predicting RNA structure came up short. Mfold, for instance, “predicts one optimal structure and a limited number of alternative structures” for target RNA, says Ding. But because RNA is believed not to have one unique structure, he explains, such a prediction doesn’t give a clear enough picture of the binding sites where siRNAs could be aimed. “The key for this antisense type of nucleic acids to work is target accessibility,” he says.
Unconvinced by the available options, Ding and his Wadsworth colleagues got to work on a new program. Ten thousand lines of code later, funded by both NIH and NSF grants, Ding emerged with Sfold, a completely different approach to structure prediction. The web interface was put together in just six months, and all the work for the software was done in-house at the interdisciplinary Wadsworth Center. “Our algorithm provides user-friendly tools to predict the accessibility of targets,” Ding says.
In contrast to Mfold’s specific predicted structures, “we take a statistical sample of probable structures and summarize all the information into a single, graphical plot,” Ding says. “That way you have statistical confidence that these are good sites regardless of which structures you elect to look at. It overcomes the difficulty of predicting a single structure.”
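To make the idea of summarizing a structure sample concrete, here is a minimal Python sketch. It is not Sfold's code or its algorithm: it assumes a sample of structures is already in hand, written in the common dot-bracket notation (where "." marks an unpaired base), and simply counts how often each position is unpaired. The function names, the toy sample, and the window-averaging step are all hypothetical, chosen only for illustration.

```python
from typing import List, Tuple


def unpaired_probability(sample: List[str]) -> List[float]:
    """Fraction of sampled structures in which each base is unpaired.

    Assumes every structure is a dot-bracket string of equal length.
    """
    if not sample:
        raise ValueError("sample must contain at least one structure")
    length = len(sample[0])
    if any(len(s) != length for s in sample):
        raise ValueError("all structures must have the same length")
    counts = [0] * length
    for structure in sample:
        for i, symbol in enumerate(structure):
            if symbol == ".":  # "." means the base is not paired
                counts[i] += 1
    return [c / len(sample) for c in counts]


def most_accessible_window(profile: List[float], width: int) -> Tuple[int, float]:
    """Return (start index, mean unpaired probability) of the best window.

    Ties are resolved in favor of the leftmost window.
    """
    if width <= 0 or width > len(profile):
        raise ValueError("width must be between 1 and len(profile)")
    starts = range(len(profile) - width + 1)
    best = max(starts, key=lambda s: sum(profile[s:s + width]))
    return best, sum(profile[best:best + width]) / width


# Toy sample of four hypothetical structures for a 12-base target.
sample = [
    "((....))....",
    "(......)....",
    "..((....))..",
    "(((....)))..",
]

profile = unpaired_probability(sample)
start, score = most_accessible_window(profile, width=4)
print("per-base unpaired probability:", profile)
print("best 4-base window starts at index", start, "with mean", score)
```

Plotted against position, a profile like this would show the peaks and valleys Ding describes: high values where most sampled structures leave the bases unpaired, low values where they are usually paired. A real sample would be far larger and drawn by the prediction software itself rather than written by hand.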
Anyone can take advantage of the software. Users submit their jobs to the Sfold website and are notified when the results are in. The results come in two forms: a graphical plot, in which peaks mark the more accessible sites and valleys mark sites to avoid, and a text output file that users can work with in any format. The patent-pending algorithm is free to noncommercial researchers but requires a license for commercial applications.
Sfold has seen a few generations since its inception. The first version took two years, during which Ding also took biology 101 to really get a handle on the problem at hand. Making the algorithm robust and bug-free took another year or two. And now it’s all about maintenance, Ding says: because of all the unknowns in the still-nascent field, he’s continually adding to or tweaking the code to keep up with new discoveries. But he doesn’t mind, so long as he gets to work on RNAi. “It’s the hottest thing in biology,” he says.
Peers consider it important work, too. NSF awarded Ding’s group a three-year, $600,000 grant for the project, and he says his request for a five-year, $2 million grant from NIGMS looks promising as well. “All this federal money is going into the further development of the software with a long-term goal to continuously improve [it],” he says.