By Doug Macron
Researchers using high-throughput RNAi screening to probe gene function and biological pathways should not overlook the need to properly optimize and validate their assays in order to get the most out of such screens, according to the director of New York University's RNAi Core.
At the same time, the data-analysis process is far from straightforward and requires equal attention to detail, according to Chi Yun, who made her remarks at Select Biosciences' Academic Screening Workshop held in Bridgewater, NJ, last week.
The process of conducting a high-throughput RNAi screen begins with the optimization and validation of an assay, and “you either pay as you go in or you pay as you go out,” Yun told Gene Silencing News this week. “If you don't put the time and effort into your optimization, you're going to spend a lot of time and effort at the end of the screen because you're going to have hundreds of putative positives … to wade through.”
Though biology is “noisy, and even a great screen might have noise, as long as you put the effort into optimization, you can extract information,” she said.
Because no two screens are exactly alike, there is no one-size-fits-all approach to doing so. But Yun offered a few guidelines, including making sure that the assay uses transfectable cell lines that are appropriate for the resources available.
NYU's RNAi Core, for instance, offers siRNA, dsRNA, and microRNA libraries, but not shRNA-based ones, limiting the types of cells that can be used for screening. At the same time, while there are technologies available for “high-throughput electroporation-type experiments, we don't have that technology right now,” she said.
Yun also recommends that researchers make certain they are looking for a quantifiable cellular phenotype.
“You need to have something to measure,” she said. “We have had people come in with a phenotype where you can't even see a difference by eye. They're looking for an increase in fluorescence intensity or translocation of a protein from the nucleus to the cytoplasm, and those screens are just really hard to do.”
In such cases, investigators need to “optimize around that phenotype so that you can actually use it for the true screen,” she explained. “We've had people use a [green fluorescent protein]-integrated cell line and [overcoming measurement issues] was as simple as staining with GFP to increase the signal.”
Lastly, Yun suggests that researchers make sure to use adequate controls, which are “really important. You have to have a way to optimize the assay and you want to get an idea of what kind of screening positives you'll get.”
While negative controls are key, she also stressed the need for proper positive controls. “Otherwise, how will you assess where your cutoff will be when you're doing the analysis?” she asked.
“I'm also a strong believer in not just choosing the strongest positive control you have,” but also including intermediate controls, Yun added. “You want to know what kinds of things affect your pathway, but maybe they don't display as strong a phenotype as your strongest control. [This] also gives you a way to tier the positives you get at the end of the screen.”
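The tiering Yun describes can be sketched in code: wells are binned by how their readout compares with the effects of the negative, intermediate, and strongest positive controls. All values and well names below are hypothetical illustration data, not from any actual screen.

```python
def tier_hits(samples, neg_ctrl, strong_pos, mid_pos):
    """Assign each well a tier based on how far its readout departs
    from the negative controls, relative to the intermediate and
    strong positive controls."""
    neg_mean = sum(neg_ctrl) / len(neg_ctrl)
    strong_effect = abs(sum(strong_pos) / len(strong_pos) - neg_mean)
    mid_effect = abs(sum(mid_pos) / len(mid_pos) - neg_mean)

    tiers = {}
    for well, value in samples.items():
        effect = abs(value - neg_mean)
        if effect >= strong_effect:
            tiers[well] = "strong"
        elif effect >= mid_effect:
            tiers[well] = "intermediate"
        else:
            tiers[well] = "negative"
    return tiers

# Hypothetical fluorescence readouts
neg = [100, 98, 102, 101]    # negative controls
strong = [10, 12, 9, 11]     # strongest positive control
mid = [55, 60, 52, 58]       # intermediate positive control
wells = {"A1": 8, "A2": 50, "A3": 97}

print(tier_hits(wells, neg, strong, mid))
# → {'A1': 'strong', 'A2': 'intermediate', 'A3': 'negative'}
```

With only a single strong positive control, everything short of that effect size would collapse into one bin; the intermediate control is what makes the middle tier possible.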
Once a primary screen is completed and before secondary screening and in vivo validation can occur, initial data need to be analyzed, a process for which there is “no one best” method, Yun said. “Every screen is different … [and] some people come up with very imaginative ways of analyzing their data.”
Because screens at the NYU RNAi Core are investigator-initiated, researchers have the final word on how data are analyzed, but the facility does provide advice, references, and feedback on “what might be better ways to analyze data,” she said, adding that her facility also provides tools for data visualization.
“Data analysis is really tough, especially for the academic investigators who are not used to handling that type of data,” Yun said. “Some people do collaborate with bioinformatics people, but even those that do should really understand what [their collaborators] are doing.”
Because investigators know the biology and what data might be relevant to their experiments, they should work closely with bioinformatics partners, not just “hand everything off and see a list later on,” she added.
Yun also suggested using multiple data-analysis parameters, especially on first passes. “Maybe using an average-based method is not the best [approach due to] a non-normal distribution,” she said. “Maybe you want to use a median-based analysis method.”
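The difference between the two approaches is easy to see in code. A mean/standard-deviation z-score is inflated by the very hits a screen is trying to find, while a median/MAD-based (“robust”) z-score is not. The plate values below are hypothetical:

```python
import statistics

def z_scores(values):
    """Classic z-scores based on mean and standard deviation;
    outliers distort both statistics."""
    mu = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mu) / sd for v in values]

def robust_z_scores(values):
    """Median/MAD-based z-scores; the 1.4826 factor scales the MAD
    to match the standard deviation of a normal distribution."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    return [(v - med) / (1.4826 * mad) for v in values]

# Hypothetical plate readouts: six typical wells plus two strong hits
plate = [1.0, 1.1, 0.9, 1.05, 0.95, 1.0, 6.0, 5.5]
print([round(z, 1) for z in z_scores(plate)])
print([round(z, 1) for z in robust_z_scores(plate)])
```

Here the classic z-scores for the two hits land below 2 because the hits themselves drag up the mean and standard deviation, while the robust scores separate them from the background by a wide margin.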
The literature can provide tips on how to analyze data, but it has limits, since the primary data are not always available.
“I always tell people to look to the literature to see what people have used, but unless you have access to the raw data, it's hard to compare,” Yun said.
At the same time, “data-visualization programs are really useful. There are a lot of tools available where you can plug in the data and look at the different outputs,” Yun said. “Then, by looking at different cutoffs, [you can] see how many screening positives pop out of those different analyses.”
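The cutoff-scanning step Yun mentions amounts to counting how the hit list grows or shrinks as the threshold moves. A minimal sketch, using hypothetical z-scores:

```python
def hits_at_cutoff(scores, cutoff):
    """Count wells whose absolute score meets or exceeds a cutoff."""
    return sum(1 for s in scores if abs(s) >= cutoff)

# Hypothetical per-well z-scores from a primary screen
scores = [0.2, -0.5, 3.1, 1.8, -2.4, 0.1, 4.0, -3.3]

for cutoff in (1.5, 2.0, 2.5, 3.0):
    print(f"cutoff {cutoff}: {hits_at_cutoff(scores, cutoff)} positives")
```

Sharp jumps in the count as the cutoff shifts can flag thresholds that are splitting a cluster of borderline wells, which is exactly the kind of judgment call that benefits from visualization rather than a single fixed number.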
“I would not recommend anyone doing this with just Excel,” she noted. “It becomes a lot of work.”
Have topics you'd like to see covered in Gene Silencing News? Contact the editor
at dmacron [at] genomeweb [.] com