Toolkit: MIT Marries Expression Data and Binding Data to Build a Better Network


The experimental method called chIP-on-chip, which combines chromatin immunoprecipitation with microarray analysis to identify DNA-binding sites, is gaining popularity as a high-throughput way to explore regulatory interactions across the genome. Now, the same MIT research team that pioneered the chIP-on-chip technique has released a software package to help make the most of the data it produces.

The software is based on an algorithm called GRAM (Genetic Regulatory Modules), which combines chIP-on-chip data with expression data to build regulatory networks that are more accurate than those built with either data source alone, according to David Gifford, a professor of computer science and engineering at MIT.

Several computational approaches are available to reconstruct regulatory networks using expression data, but most of them assume that co-expressed genes are also co-regulated — which isn’t necessarily the case — or use indirect evidence of genetic regulation that doesn’t provide a clear picture of the relationship between regulators and the genes they regulate, Gifford said. On the other hand, chIP-on-chip data alone identifies the presence of regulators at promoter regions, but doesn’t provide any information on the type of interaction — that is, whether the regulators activate or inhibit genes in the network.

GRAM tackles this problem by starting with the chIP-on-chip location data, which uses microarrays to identify fragments of the genome that are bound by a particular protein. “You have a collection of cells at a particular time or with a particular genetic background or after a particular perturbation, and you take a picture of them by adding a cross-linker,” Gifford explained. “That snapshot captures in vivo what is actually, at that particular time, bound at the genome.” After the genome is broken into fragments and analyzed on a microarray, “Whatever spots light up are the parts of the genome that were bound by that protein.”

GRAM searches for sets of genes in the location data that share a common set of transcriptional regulators, and then uses expression data to identify a subset of those genes that are co-expressed. The two-step approach “allows us to not only determine what genes are co-expressed, but also what other factors may be binding to the regulatory regions that may explain that co-expression,” Gifford said.
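The two-step idea can be illustrated with a small Python sketch. This is not the published GRAM implementation: the p-value cutoff, the correlation threshold, the centroid-based co-expression test, and all function and gene names below are assumptions chosen only to make the logic concrete.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is flat)."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den if den else 0.0


def candidate_gene_set(binding_pvalues, regulators, p_cutoff=0.001):
    """Step 1: genes whose promoters are bound by every regulator in the set.

    binding_pvalues maps gene -> {regulator: binding p-value}. The cutoff is an
    illustrative assumption, not a value taken from the article.
    """
    return {
        gene
        for gene, pvals in binding_pvalues.items()
        if all(pvals.get(reg, 1.0) <= p_cutoff for reg in regulators)
    }


def coexpressed_core(genes, expression, min_corr=0.8):
    """Step 2: keep genes whose profile correlates with the set's mean profile."""
    genes = sorted(genes)
    if len(genes) < 2:
        return set(genes)
    n = len(expression[genes[0]])
    centroid = [mean(expression[g][i] for g in genes) for i in range(n)]
    return {g for g in genes if pearson(expression[g], centroid) >= min_corr}


# Hypothetical toy data: four genes, two regulators, four expression experiments.
binding = {
    "geneA": {"TF1": 0.0001, "TF2": 0.0005},
    "geneB": {"TF1": 0.0002, "TF2": 0.0003},
    "geneC": {"TF1": 0.0004, "TF2": 0.0001},
    "geneD": {"TF1": 0.0002, "TF2": 0.5},  # not bound by TF2
}
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [1.1, 2.1, 2.9, 4.2],
    "geneC": [4.0, 3.0, 2.0, 1.0],  # bound, but anti-correlated with A and B
    "geneD": [1.0, 2.0, 3.0, 4.0],  # co-expressed, but lacks TF2 binding
}

candidates = candidate_gene_set(binding, ["TF1", "TF2"])
module = coexpressed_core(candidates, expression)
```

In this toy case geneD is excluded by the binding step even though its expression matches, and geneC is excluded by the expression step even though both regulators bind it, which is the kind of filtering that neither data source provides alone.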

The algorithm relies on “gene modules” (sets of co-expressed genes that are bound by the same transcription factors) to determine the function of regulators. When higher expression levels of a transcription factor are correlated with higher expression levels of a particular gene module, it’s likely that the transcription factor positively regulates the expression of genes in the module. “Expression data is indirect because it’s the consequence of a mechanism, whereas this binding data is actually the mechanism itself,” Gifford said.
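As a rough sketch of that reasoning, the snippet below correlates a regulator's expression profile with the mean profile of a module and labels the regulator accordingly. The threshold, the labels, and the treatment of a strong negative correlation as evidence of repression are illustrative assumptions; the article only describes the positive case, and the actual GRAM scoring may differ.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is flat)."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den if den else 0.0


def regulator_role(tf_profile, module_profiles, threshold=0.5):
    """Label a regulator by how its expression tracks the module's mean expression.

    The threshold and the "repressor" label for negative correlation are
    assumptions made for this example.
    """
    n = len(tf_profile)
    module_mean = [mean(p[i] for p in module_profiles) for i in range(n)]
    r = pearson(tf_profile, module_mean)
    if r >= threshold:
        return "activator", r
    if r <= -threshold:
        return "repressor", r
    return "undetermined", r


# Hypothetical profiles across four experiments.
module_profiles = [[2.0, 4.0, 6.0, 8.0], [1.0, 2.0, 3.0, 5.0]]

role_up, r_up = regulator_role([1.0, 2.0, 3.0, 4.0], module_profiles)
role_down, r_down = regulator_role([4.0, 3.0, 2.0, 1.0], module_profiles)
role_flat, r_flat = regulator_role([1.0, 1.0, 1.0, 1.0], module_profiles)
```

A regulator whose expression does not vary across the experiments yields no usable correlation, so the sketch returns "undetermined" rather than guessing a direction.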

Gifford and his colleagues describe GRAM in a paper published online in Nature Biotechnology on Oct. 12. They validated the algorithm by reconstructing a regulatory network in yeast using binding data for 106 transcription factors and more than 500 expression experiments. After comparing the results against literature searches, independent chromatin-IP experiments, and other methods, the team was satisfied that “the GRAM algorithm would be useful in analyzing new data sources.” They then created a new genome-wide location analysis data set for 14 transcriptional regulators in yeast cells treated with rapamycin, which inhibits Tor kinase signaling, and used GRAM to reconstruct the novel regulatory network.

A Java implementation of GRAM is available from the MIT website, and Gifford said the team is also making its rapamycin chIP-on-chip data freely available “so somebody can take our binding data and put it together with their favorite expression data.”

In addition, Gifford said, because chIP-on-chip is growing in popularity, “we would expect that many laboratories would have their own binding data that they could combine with expression data using this technique.”

— BT

