Professor, physics of complex systems, Weizmann Institute of Science
• Chairman, council of professors, Weizmann Institute of Science — 1999-2001
• Head, department of physics of complex systems, Weizmann Institute of Science — 1993-1999
• Visiting professor, Oxford University — 1990-1991
• Visiting scholar, Stanford University — 1983-1984
• PhD, theoretical physics, Cornell University — 1977
A team of investigators from the Weizmann Institute of Science this month reported in Nucleic Acids Research on the development of a new tool for identifying functional microRNAs and their targets.
Unlike existing approaches, the new method — called CoSMic, or context-specific microRNA analysis — combines sequence-based prediction algorithms with experimentally derived miRNA and mRNA expression data. As such, CoSMic can identify miRNAs with active roles in specific biological systems and predict their targets with fewer false positives.
This week, Gene Silencing News spoke with the paper's senior author, Eytan Domany, about the work.
Let's start with an overview of your research focus.
I am a theoretical physicist by training, and have worked on computational biology and systems biology for the past 11 years. I have a group of students, [and] we develop and apply methods for analyzing data that come from collaborations with wet labs or that are publicly available.
Within your particular areas of interest, are microRNAs something you've worked on before?
We just published a study on the role of microRNAs in breast cancer, in collaboration with a group in Rome … but this particular problem [of miRNA target identification] is something we've been working on for two and a half to three years. It's a serious problem and we've been racking our brains trying to find a way to do it better.
Can you give an overview of the key difficulties with microRNA target identification?
It's an important problem. MicroRNAs are believed to play a key role in regulating all kinds of networks and biological processes. In order to understand those processes, it is important to understand the role of microRNAs.
The main problem, as I see it, is that there is no high-throughput method to identify targets of microRNAs. It is widely accepted that a microRNA does not perform a biological function on its own; rather, it regulates the level of expression, the level of degradation, and the level of translation of a target messenger RNA. … So in order to understand the role of a microRNA, you have to know which messenger RNAs it targets.
This targeting is done by binding of the microRNA to a particular binding site on the messenger RNA, and there is no high-throughput method that can [identify] the mRNA molecules to which [a particular miRNA] binds. If you have a particular [miRNA/mRNA pair] in mind, you can do an experiment to test whether a microRNA binds to this particular mRNA, but you cannot do it in a genome-wide fashion.
There are several sequence-based prediction algorithms … that have been proposed in the last five or six years, but they all have one problem in common: they all predict a very large number of targets, into the thousands, and that is not much help to an experimentalist who wants to zero in on a few that he can directly test. If you give him a few thousand predicted targets, what does he do? … [Additionally], these few thousand are mostly false positives.
What people do is take two, three, or four of these algorithms, intersect the lists of targets each one produces, and assume that if an mRNA comes up in two or more, it is a true target. This assumption isn't necessarily correct; there is no reason why one algorithm's true positives should coincide with another's.
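The list-intersection practice Domany describes can be illustrated with a minimal Python sketch. The gene names, the three prediction sets, and the `consensus_targets` helper are all hypothetical and are not taken from any real prediction tool; the point is only to show how a "two or more algorithms" rule is applied.

```python
from collections import Counter

def consensus_targets(prediction_lists, min_votes=2):
    """Keep genes predicted by at least `min_votes` of the algorithms."""
    # Count each gene once per algorithm, even if an algorithm lists it twice.
    votes = Counter(gene for preds in prediction_lists for gene in set(preds))
    return {gene for gene, n in votes.items() if n >= min_votes}

# Hypothetical outputs of three sequence-based predictors for one microRNA.
algo_1 = {"G1", "G2", "G3"}
algo_2 = {"G2", "G4"}
algo_3 = {"G2", "G3", "G5"}

consensus = consensus_targets([algo_1, algo_2, algo_3])
print(sorted(consensus))
```

In this toy case G1, G4, and G5 are dropped because only one algorithm reported each of them. If any of them were a real target it would be lost, which is the weakness of the intersection assumption noted above.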
These target-prediction algorithms all use sequence information and look for matching [base] pairs … and try to calculate the conformation of the complex, and from that, to calculate free energy differences between the bound microRNA/mRNA pair and the … dissociated components. This is a hard calculation to do.
Another issue is context specificity. A microRNA will bind a particular target in one biological context and will not bind it with high affinity in another context. [For instance], in a breast tumor, there will be regulation of a particular target by a particular microRNA, but there is no guarantee that the same level of affinity or regulation will be observed in a different tissue. Since RNA sequence does not depend on the biological context, sequence-based algorithms cannot be context-specific.
So how does CoSMic work and how does it sidestep those issues?
CoSMic combines theoretical sequence-based prediction methods … [with] an experimental component. If you are interested in breast cancer, you take breast cancer tissue and measure the expression profile of messenger RNAs and of microRNAs in several conditions.
These conditions can be time points if you are doing dynamic experiments, or they can be breast cancer tissue samples taken from different patients, [for example, but always require] more than one measurement of both microRNA and mRNA on the same biological system you are looking at. Then you can calculate the correlation of expression of the microRNA with each and every one of the mRNAs whose expression level you measured.
Now, focus on a particular microRNA — you have a measurement of this correlation between the expression levels of the microRNA and all its potential targets … done in a context-specific way. … [Using this information along with the results of a particular target-prediction algorithm] ... we look for those mRNAs that have a high correlation with the microRNA and also have a high score with the sequence-based prediction of choice. We combine these, and for every microRNA, we can give a statistical measure of whether it is active in this particular [biological] context or not.
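The filtering idea described here, keeping mRNAs that both correlate strongly with a microRNA across conditions and score well in a sequence-based prediction, can be sketched in Python as follows. This is an illustration under stated assumptions and not the CoSMic implementation: the use of Pearson correlation, the cutoff on the absolute value of the correlation, the threshold values, the gene names, and the toy data are all assumptions, and the sketch does not compute the statistical measure of microRNA activity mentioned in the interview.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def candidate_targets(mirna_profile, mrna_profiles, seq_scores,
                      min_abs_corr=0.8, min_seq_score=0.5):
    """Return (gene, correlation, sequence score) for mRNAs passing both cutoffs.

    Both thresholds are arbitrary values chosen for this illustration.
    """
    hits = []
    for gene, profile in mrna_profiles.items():
        r = pearson(mirna_profile, profile)
        score = seq_scores.get(gene, 0.0)  # no prediction means score 0
        if abs(r) >= min_abs_corr and score >= min_seq_score:
            hits.append((gene, r, score))
    # Strongest correlation (by magnitude) first.
    return sorted(hits, key=lambda h: abs(h[1]), reverse=True)

# Toy expression levels measured in four conditions of the same system.
mirna = [1.0, 2.0, 3.0, 4.0]
mrnas = {
    "GENE_A": [4.0, 3.0, 2.0, 1.0],  # strongly anti-correlated
    "GENE_B": [1.0, 3.0, 2.0, 2.5],  # weakly correlated
    "GENE_C": [8.0, 6.1, 3.9, 2.0],  # strongly anti-correlated
}
# Hypothetical sequence-based prediction scores on a 0-to-1 scale.
seq_scores = {"GENE_A": 0.9, "GENE_B": 0.9, "GENE_C": 0.1}

hits = candidate_targets(mirna, mrnas, seq_scores)
print([gene for gene, r, score in hits])
```

Only GENE_A survives here: GENE_B has a good sequence score but a weak correlation, and GENE_C correlates strongly but has a poor sequence score. Because the expression data come from one biological system, the surviving list is specific to that context in a way a sequence score alone cannot be.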
We end up with a list of active microRNAs … [and] for each of these microRNAs, [CoSMic yields] a list of predicted mRNA targets.
To test this … we ran an experiment with [CoSMic in human mammary epithelial cells] … known to respond to [epidermal growth factor]; in response to EGF, these cells start to migrate. … [Of the several miRNAs identified as having putative targets involved in the migration process], we selected three … and silenced them by introducing siRNAs against [them]. So we lowered the concentration of these microRNAs and, indeed, this manipulation had a clear effect on the migratory response of the cells.
[As detailed in the paper], our success rate [of identifying the true positives] with CoSMic was 30 percent for one microRNA and 50 percent for another. Our purity, [the fraction of real targets among the predicted ones], was around 7 [to] 10 percent, which is much better than any sequence-based predictions and other existing algorithms.
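The two figures quoted above can be made concrete with a short Python sketch. It reads "success rate" as the fraction of validated targets that the method recovers and "purity" as the fraction of predicted targets that are validated, which is one interpretation of the bracketed definitions. The gene sets and the function name are invented for illustration and do not reproduce the paper's data.

```python
def success_rate_and_purity(predicted, validated):
    """Return (success rate, purity) for a set of predicted targets.

    success rate: share of validated targets that were predicted
    purity:       share of predicted targets that are validated
    """
    predicted, validated = set(predicted), set(validated)
    true_hits = predicted & validated
    success_rate = len(true_hits) / len(validated) if validated else 0.0
    purity = len(true_hits) / len(predicted) if predicted else 0.0
    return success_rate, purity

# Hypothetical example: 8 predicted targets, 3 experimentally validated ones.
predicted = {"G1", "G2", "G3", "G4", "G5", "G6", "G7", "G8"}
validated = {"G1", "G20", "G21"}

success_rate, purity = success_rate_and_purity(predicted, validated)
print(f"success rate = {success_rate:.3f}, purity = {purity:.3f}")
```

The two numbers pull in different directions: predicting more targets can raise the success rate while lowering purity, which is why a list of thousands of predictions is of limited use to an experimentalist.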
Do you see opportunities to further refine CoSMic?
We're not going in that direction. We are now in the process of applying it within collaborations in cancer research.