The experimental method called chIP-on-chip, which combines chromatin immunoprecipitation with microarray analysis to identify DNA-binding sites, is gaining popularity as a high-throughput way to explore regulatory interactions across the genome. Now, the same MIT research team that pioneered the chIP-on-chip technique has released a software package to help make the most of the data it produces.
The software is based on an algorithm called GRAM (Genetic Regulatory Modules), which combines chIP-on-chip data with expression data to build regulatory networks that are more accurate than those built with either data source alone, according to David Gifford, a professor of computer science and engineering at MIT.
Several computational approaches are available to reconstruct regulatory networks using expression data, but most of them assume that co-expressed genes are also co-regulated — which isn’t necessarily the case — or use indirect evidence of genetic regulation that doesn’t provide a clear picture of the relationship between regulators and the genes they regulate, Gifford said. On the other hand, chIP-on-chip data alone identifies the presence of regulators at promoter regions, but doesn’t provide any information on the type of interaction — that is, whether the regulators activate or inhibit genes in the network.
GRAM tackles this problem by starting with the chIP-on-chip location data, which are generated by using microarrays to identify fragments of the genome that are bound by a particular protein. “You have a collection of cells at a particular time or with a particular genetic background or after a particular perturbation, and you take a picture of them by adding a cross-linker,” Gifford explained. “That snapshot captures in vivo what is actually, at that particular time, bound at the genome.” After the genome is broken into fragments and analyzed on a microarray, “whatever spots light up are the parts of the genome that were bound by that protein.”
GRAM searches for sets of genes in the location data that share a common set of transcriptional regulators, and then uses expression data to identify a subset of those genes that are co-expressed. The two-step approach “allows us to not only determine what genes are co-expressed, but also what other factors may be binding to the regulatory regions that may explain that co-expression,” Gifford said.
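The two-step idea can be illustrated with a minimal Python sketch. This is not the published GRAM algorithm or its Java implementation. It is a simplified stand-in that assumes the binding data has already been reduced to a set of bound genes per regulator, and it uses a crude filter (correlation with the mean expression profile) for the co-expression step. The function names, the input formats, and the `min_corr` and `min_size` parameters are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def candidate_modules(bound_genes, expression, min_corr=0.8, min_size=2):
    """Sketch of a binding-then-expression module search (assumed formats).

    bound_genes: regulator name -> set of genes it binds
    expression:  gene name -> list of expression values, one per experiment
    Returns a dict mapping a tuple of regulators to a list of genes.
    """
    modules = {}
    regulators = sorted(bound_genes)
    for size in (1, 2):  # illustrative cap on regulator-set size
        for regs in combinations(regulators, size):
            # Step 1: genes bound by every regulator in the set.
            shared = set.intersection(*(bound_genes[r] for r in regs))
            shared = sorted(g for g in shared if g in expression)
            if len(shared) < min_size:
                continue
            # Step 2: keep only genes whose expression tracks the
            # mean profile of the bound set (a crude co-expression test).
            n_exp = len(expression[shared[0]])
            mean = [sum(expression[g][i] for g in shared) / len(shared)
                    for i in range(n_exp)]
            core = [g for g in shared
                    if pearson(expression[g], mean) >= min_corr]
            if len(core) >= min_size:
                modules[regs] = core
    return modules
```

Calling `candidate_modules(bound_genes, expression)` returns only those regulator sets whose shared targets also move together across experiments. A gene that is bound but not co-expressed with the rest is dropped from the module.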
The algorithm relies on “gene modules,” sets of co-expressed genes that are bound by the same transcription factors, to determine the function of regulators. When higher expression levels of a transcription factor are correlated with higher levels of a particular gene module, it’s likely that the transcription factor positively regulates the expression of genes in the module. “Expression data is indirect because it’s the consequence of a mechanism, whereas this binding data is actually the mechanism itself,” Gifford said.
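The correlation test described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not GRAM's actual scoring. The function name, the `threshold` cutoff, and the input formats are hypothetical. Treating a strong negative correlation as evidence of repression is an extrapolation from the article's statement about positive regulation.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def infer_regulator_role(tf_profile, module_profiles, threshold=0.5):
    """Guess a regulator's role from expression correlation (assumed rule).

    tf_profile:      expression values of the transcription factor
    module_profiles: list of expression profiles, one per module gene
    Returns (label, correlation).
    """
    n_exp = len(tf_profile)
    # Summarize the module by its mean expression in each experiment.
    mean = [sum(p[i] for p in module_profiles) / len(module_profiles)
            for i in range(n_exp)]
    r = pearson(tf_profile, mean)
    if r >= threshold:
        return "activator", r    # regulator and module rise together
    if r <= -threshold:
        return "repressor", r    # assumption: opposite trend means inhibition
    return "undetermined", r
```

Because the regulator is already known from the binding data to sit at the promoters of the module's genes, the correlation is used here only to assign a direction to an interaction that has physical support, rather than to infer the interaction itself.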
Gifford and his colleagues describe GRAM in a paper published online in Nature Biotechnology on Oct. 12. They validated the algorithm by reconstructing a regulatory network in yeast using binding data for 106 transcription factors and more than 500 expression experiments. After comparing the results against literature searches, independent chromatin-IP experiments, and other methods, the team was satisfied that “the GRAM algorithm would be useful in analyzing new data sources.” They then created a new genome-wide location analysis data set for 14 transcriptional regulators in yeast cells treated with rapamycin, which inhibits Tor kinase signaling, and used GRAM to reconstruct the novel regulatory network.
A Java implementation of GRAM is available from the MIT website (http://psrg.lcs.mit.edu/GRAM/Index.html), and Gifford said the team is also making its rapamycin chIP-on-chip data freely available “so somebody can take our binding data and put it together with their favorite expression data.”
In addition, Gifford said, because chIP-on-chip is growing in popularity, “we would expect that many laboratories would have their own binding data that they could combine with expression data using this technique.”
— BT