As efforts to tame the epigenetics of various cancers ramp up, academic labs' stores of chromatin immunoprecipitation data continue to swell. At Ohio State University, computational biologist Victor Jin is developing bioinformatics approaches to chip away at cancer epigenetics, with a particular emphasis on using ChIP-seq data.
"Cancer systems biology is moving very fast," Jin says. While some researchers continue to use a ChIP-chip approach, he says most researchers have moved away from arrays, which have largely "been replaced by next-generation sequencing, [because it is] higher resolution, more high-throughput."
In a July PLoS One paper, Jin and his colleagues report their use of ChIP-seq to generate a genome-wide map of TGFβ-induced SMAD4 binding in an epithelial ovarian cancer cell line, as TGFβ-SMAD4 signaling is commonly dysregulated in ovarian cancers. Within TGFβ-stimulated ovarian cancer cells, the team identified more than 2,300 SMAD4 binding loci as well as 318 differentially expressed SMAD4 target genes.
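For readers who want a feel for how binding loci and expression changes can be combined, here is a minimal Python sketch. It is not the team's pipeline, which the article does not describe. It assumes a simple rule (a gene counts as bound if its transcription start site lies within a fixed window of a peak), and the peak coordinates, gene names, and the 10 kb window are all hypothetical toy values.

```python
def genes_near_peaks(peaks, tss, window=10_000):
    """Return genes whose TSS lies within `window` bp of any peak on the same chromosome.

    peaks: list of (chrom, start, end) tuples (toy stand-ins for binding loci)
    tss:   dict mapping gene name -> (chrom, position)
    The window size is an illustrative assumption, not a value from the paper.
    """
    bound = set()
    for gene, (chrom, pos) in tss.items():
        for peak_chrom, start, end in peaks:
            if peak_chrom == chrom and start - window <= pos <= end + window:
                bound.add(gene)
                break  # one nearby peak is enough to call the gene bound
    return bound


# Hypothetical toy inputs.
peaks = [("chr1", 1_000, 1_500), ("chr2", 50_000, 50_400)]
tss = {
    "GENE_A": ("chr1", 5_000),
    "GENE_B": ("chr1", 900_000),
    "GENE_C": ("chr2", 45_000),
}
differentially_expressed = {"GENE_A", "GENE_B"}

bound = genes_near_peaks(peaks, tss)
# Candidate targets are genes that are both bound and differentially expressed.
candidate_targets = bound & differentially_expressed
```

In practice, peak calling and differential expression testing would come from dedicated tools; the sketch covers only the final set intersection.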
Then, using an in silico mining approach on published clinical data sets, the team found that a subset of those 318 differentially expressed genes correlated with patient outcomes. Indeed, when the researchers trained their computational approach on three different, publicly available ovarian cancer data sets, they identified a subset of 187 genes that segregated with patient survival. When the researchers tested that subset of genes on two separate clinical ovarian cancer cohort studies that had reported survival data, they found those genes to be "actually predictive of survival," Jin says. "While our genes are predictive in only a subset of patient data, they have power."
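One generic way to ask whether a gene set "segregates with survival" is to score each patient on the signature, split the cohort at the median score, and compare survival between the two groups. The Python sketch below illustrates that idea only. The article does not say which statistical method Jin's team used, so the scoring rule, the median split, the gene names, and the patient records are all assumptions, and the toy comparison ignores censoring, which a real analysis (for example, Kaplan-Meier curves with a log-rank test) would have to handle.

```python
from statistics import mean, median


def signature_score(expression, signature_genes):
    """Mean expression over the signature genes present in one patient's profile."""
    values = [expression[g] for g in signature_genes if g in expression]
    if not values:
        raise ValueError("no signature genes found in expression profile")
    return mean(values)


def split_by_signature(patients, signature_genes):
    """Split patient IDs into high- and low-score groups at the cohort median."""
    scores = {
        pid: signature_score(record["expression"], signature_genes)
        for pid, record in patients.items()
    }
    cutoff = median(scores.values())
    high = sorted(pid for pid, s in scores.items() if s > cutoff)
    low = sorted(pid for pid, s in scores.items() if s <= cutoff)
    return high, low


def median_survival(patients, ids):
    """Median survival in months for a group (toy version: no censoring)."""
    return median(patients[pid]["survival_months"] for pid in ids)


# Hypothetical toy cohort; a real test cohort would be independent of the
# data sets used to select the signature.
signature = ["GENE_A", "GENE_B"]
cohort = {
    "p1": {"expression": {"GENE_A": 5, "GENE_B": 7}, "survival_months": 12},
    "p2": {"expression": {"GENE_A": 1, "GENE_B": 3}, "survival_months": 40},
    "p3": {"expression": {"GENE_A": 4, "GENE_B": 4}, "survival_months": 20},
    "p4": {"expression": {"GENE_A": 2, "GENE_B": 2}, "survival_months": 60},
}

high, low = split_by_signature(cohort, signature)
high_median = median_survival(cohort, high)
low_median = median_survival(cohort, low)
```

Keeping the cohorts used to choose the signature separate from those used to test it, as the researchers did, is what allows a result like this to be read as predictive rather than merely descriptive.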
Jin says his team's approach to analyzing ChIP-seq data for TGFβ-induced SMAD4 binding may be applicable to other cancers in which TGFβ signaling is dysregulated. The work also points to the power of using a computational mining approach on publicly available patient cohort data to tease out gene signatures that predict clinical outcomes. "Our approach ... can be applied to other projects," he says. "This approach can be used on real clinical patient [data] to see how sets of genes affect [outcomes]."
Jin still has other epigenetic data sets to dip into for additional in silico analysis. "We have a lot of clinical patient data in our hands," he says, noting that he has multiple collaborations with cancer researchers at Ohio State and across the US.
To translate gene signatures that he identifies into clinically useful predictors, Jin has his sights set on a combinatorial solution. One long-term goal, he says, is to "develop a [computational] framework to combine ChIP-seq to methylation to gene expression data, and of course RNA-seq [data], into a platform to try to identify cancer biomarkers." With such a platform, Jin envisions a computational "toolkit for clinicians and researchers ... to say: 'This set of genes is potentially useful for this patient.'"
Jin is quick to point out that such a solution is rarely the result of one investigator's work, particularly when it comes to the epigenetics of cancer. From start to finish, "this is collaborative work," he says, adding that contributions are needed from clinicians, bench scientists, and computational biologists to push the science behind cancer epigenetics and patient survival forward. "This work is a combination [effort] among computational biologists and experimental biologists," he says.