ChIP-chip Technical Guide

Table of Contents

Letter from the Editor
Index of Experts
Q1: When performing ChIP, how do you maximize antibody recognition?
Q2: What factors go into choosing the best array platform?
Q3: What do you consider when deciding which sample amplification technique to use?
Q4: How do you optimize sensitivity and resolution?
Q5: How do you reduce variability from experiment to experiment to ensure reproducibility?
Q6: What do you do to normalize your data and optimize algorithmic analysis?
List of Resources

Letter from the Editor

Genome Technology brings you this technical guide on ChIP-chip. Combining chromatin immunoprecipitation with microarrays isn't new; some ChIP users have even adapted the work for use on a next-gen sequencer. But since those instruments are still relatively pricey, this month's roster of experts has chosen to stick with arrays for their cost and reliability.

ChIP-chip uses chromatin immunoprecipitation to enrich for specific regions of the genome, which are then identified using whole-genome tiling arrays or, in the case of ChIP-seq, next-generation sequencing. In general, ChIP-chip is used to map transcription factors and determine how their genome-wide binding patterns affect gene regulation. More recently, ChIP-chip (as well as ChIP-seq) has also been applied to modified histones, allowing a genome-wide look at chromatin marks and their importance in epigenetic regulation.

In this guide, we've focused on the basics of conducting a ChIP-chip experiment, from how to choose the best array platform to methods for optimizing sensitivity, resolution, and data analysis. As our chromatin immunoprecipitation experts will tell you, there's simply no substitute for nailing down protocols to ensure success no matter which platform you choose to use. Take a look at what our experts have to say and, for even more, check out the handy resources guide at the back. Happy hunting!

— Jeanene Swanson

Index of Experts

Genome Technology would like to thank the following contributors for taking the time to respond to the questions in this tech guide.

Ian Dunham
European Bioinformatics Institute

Ghia Euskirchen
Yale University

Wei Li
Baylor College of Medicine

Jason Lieb
University of North Carolina, Chapel Hill

Xiaole Shirley Liu
Dana-Farber Cancer Institute

Oliver Rando
University of Massachusetts Medical School

Q1: When performing ChIP, how do you maximize antibody recognition?

The starting point has to be a validated antibody. Antibodies that perform well in western blots do not necessarily perform well in ChIP, but a western is at least a starting point; suppliers usually provide western blot data in their technical information or on their websites. For antibodies that recognize histone modifications, you can also perform competition experiments with modified peptides containing the recognized epitope to confirm specificity. After that, you want to titrate antibody and chromatin concentrations within a protocol that is already working (chromatin preparation, shearing/sonication, etc.). The conventional assay is quantitative PCR on known targets, optimizing conditions against that readout. In practice, a quick look at a microarray often gives a feeling for whether conditions are working, and if you have the resources, optimization can be done on the arrays.

— Ian Dunham
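
As an aside, the qPCR readout Dunham describes is commonly summarized as percent of input using the ΔCt method. The sketch below is a generic illustration of that calculation; the function name and cycle-threshold values are hypothetical, not taken from this guide:

```python
import math

def percent_input(ct_ip: float, ct_input: float, input_fraction: float = 0.01) -> float:
    """Express ChIP-qPCR enrichment as percent of input chromatin.

    ct_ip          -- cycle threshold (Ct) for the IP sample
    ct_input       -- Ct for the input sample (input_fraction of the chromatin)
    """
    # Adjust the input Ct as if 100% of the chromatin had been assayed.
    adjusted_input = ct_input - math.log2(1.0 / input_fraction)
    # Each cycle of difference corresponds to a two-fold difference in DNA.
    return 100.0 * 2.0 ** (adjusted_input - ct_ip)

# Hypothetical Ct values: a known target region vs. a negative control region
target = percent_input(ct_ip=28.0, ct_input=30.0)   # ~4% of input
control = percent_input(ct_ip=33.0, ct_input=30.0)  # ~0.1% of input
```

A validated antibody should show clear separation between known targets and negative control regions in this kind of readout.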

First we screen antibodies in test immunoprecipitations (IPs) using nuclear extracts prepared from non-crosslinked cells and only proceed with those antibodies that perform well at this level. For the test IPs we run four lanes on immunoblots: input; eluate from the IP with normal IgG; eluate from the IP with the specific antibody; and supernatant. If the supernatant is not cleared of the factor of interest, the amount of antibody is increased. The use of nuclear rather than whole-cell extracts is important, as is not over-crosslinking the cells in the ChIP.

— Ghia Euskirchen

We usually apply three quality control experiments to check the antibody quality: 1) western blot, 2) co-immunoprecipitation (Co-IP) and 3) ChIP followed by qPCR on known targets.

— Wei Li

I assume here that you mean "how do you get your antibodies to work for ChIP?" There are no strict rules, but some general tips may be useful. Use the minimum fixation time allowable for your experiment. This variable is often not optimized, but lighter fixation may allow better epitope recognition in some cases. Another option is an epitope tag. While in many systems an epitope tag has serious disadvantages (level of expression, possible compromise of function), at least the interaction between the antibody and the epitope is well characterized. Success or failure of an antibody in other applications (immunofluorescence, western) is not a good predictor of success or failure in ChIP. When choosing an epitope among proteins in the same complex, it's best to choose the subunit closest to DNA, if this information is known. This can make a big difference. Histones and histone modifications tend to be the easiest ChIPs, followed by transcription factors and other DNA-binding proteins. Chromatin-associated enzymes tend to be much more difficult. The main lesson is that every antibody is different and every DNA-associated protein is different, and while it is possible to get lucky, some amount of optimization is usually required for the best results.

— Jason Lieb

We are often stuck with the antibodies available on the market. If multiple antibodies are available, it is important to test and compare them before conducting the full experiment. A pilot can be run using ChIP-qPCR or ChIP-chip on a pilot array. We want an antibody that gives reproducible, higher enrichment at positive control regions with lower background noise. Often postdocs can generate better enrichment with lower noise from the same antibody over time, so protocol and experience are important factors too. It is good to learn the ChIP-chip protocol from veteran labs.

— Xiaole Shirley Liu

We typically run an antibody titration with differing amounts of antibody, then quantify the amount of material we ChIP. We compare cells with and without the epitope in question (for example, yeast carrying H3K4R should not be K4 trimethylated), and choose an antibody level that maximizes specific yield.

— Oliver Rando

Q2: What factors go into choosing the best array platform?

A lot of factors go into this. To some extent it is a question of access — which platforms do you have access to, and what designs are available on that platform? — and cost — can you afford to do the experiment with that platform? From a more fundamental point of view, you can also tailor the design to the question you want to answer. So if it is a gene- or promoter-limited question, i.e. you think you know the targets of interest, a custom array may be sufficient and will often be less costly, particularly if the design already exists. If it doesn't exist, then you have to either find a supplier who can readily produce the array design you need, make the design yourself, or adapt a different design to suit your question (e.g. perhaps a cDNA or SNP chip could be used). For a genome-wide question, you have to look at the most cost- and reagent-effective platform for your situation — how many arrays will you have to hybridize, how many points across the genome will be sampled, how much will it cost? Sensitivity is also an issue — in our experience, longer oligos generally give a better signal-to-noise ratio, though this may be overcome by appropriate processing. One of the things that has been learnt recently is that pretty much all questions are really genome-wide, and unless you look over the whole genome you may be missing important effects. In addition, a choice now has to be made as to whether you should use arrays at all: would it be better to sample the ChIP using high-throughput sequencing technologies, if you have access to them?

— Ian Dunham

In general we have found that longer oligos outperform shorter oligos, and we prefer to work with technologies where the tiling paths can be readily customized. Then it depends if one wants to include intergenic regions, which is a good idea as a lot of transcription factor binding occurs outside promoter regions.

— Ghia Euskirchen

In our recent ChIP-chip ENCODE benchmark paper, we found that microarray platform choice is not the primary determinant of overall performance. In fact, variation in performance between labs, protocols, and algorithms within the same array platform was greater than the variation between array platforms. Tiling arrays from all commercial companies tested worked well at around a 10 percent FDR level, especially with the optimal experimental protocol and the best analysis algorithm. NimbleGen and Agilent arrays are more sensitive at detecting regions with very low enrichment (1.25- to two-fold), likely owing to longer oligonucleotide probes and probe sequence optimization, but they are usually more expensive. Affymetrix arrays are generally more cost-effective, especially for high-enrichment targets.

— Wei Li

I was involved in a recent study that used a pre-defined DNA sample (a "spike-in") to measure array performance. We found that the long-oligonucleotide arrays (NimbleGen and Agilent) were slightly more sensitive at very low enrichment levels, but that all of the major array platforms (NimbleGen, Agilent, and Affymetrix) performed very well. We primarily use NimbleGen arrays, but all of the platforms work well; I'd choose based on cost and familiarity with the platform.

— Jason Lieb

We consider cost, performance, accessibility, and reproducibility. We get high-quality data from Affymetrix human/mouse whole-genome tiling microarrays for $2,000 (arrays and labor) per sample at the Dana-Farber Cancer Institute. The core facility here has very experienced staff, high QC standards, and quick turnaround time (about a week). They accept outside academic users, too, at competitive pricing. ChIP-seq is an attractive alternative; if you have access to a machine, the turnaround time is short. Technology and pricing are changing so quickly that it is sometimes difficult to say which is best. Timing is sometimes more important, so we pick a platform that works within a reasonable time and cost.

— Xiaole Shirley Liu

Cost, coverage, and signal-to-noise. In addition to commercial arrays, we also print homemade arrays, where the cost is negligible but the coverage is low and the background tends to be a bit higher.

— Oliver Rando

Q3: What do you consider when deciding which sample amplification technique to use?

We have generally used our experience with unamplified samples on PCR tiling arrays as a benchmark for the results we expect to achieve. We want to see all the targets found with unamplified material still present after the amplification procedure. We have only used amplification when we were unable to get sufficient material to go over several slides, e.g. when using the Affymetrix genome-wide tiling arrays. In our hands, Qiagen's REPLI-g FFPE kit worked pretty well.

— Ian Dunham

We have avoided this issue by working with cell lines that allow us to grow as many cells as necessary to achieve reliable signal enrichments.

— Ghia Euskirchen

We tested the effect of three different amplification procedures: ligation-mediated PCR (LM-PCR), random-priming PCR (RP), and whole-genome amplification (WGA). In general, LM-PCR and WGA reproduced relative enrichment levels with high fidelity. LM-PCR afforded the least reduction in AUC from unamplified to amplified sample on Agilent arrays. On Affymetrix arrays, LM-PCR performed significantly better than RP. The WGA method was used only on the NimbleGen platform, but also produced results with very little reduction in performance.

— Wei Li

In our paper we found that both LM-PCR and WGA worked remarkably well in preserving the original enrichment ratios. In my lab we have found that WGA may be more accurate quantitatively (with respect to what is in the sample tube), but that LM-PCR gives sharper peaks that, in most cases, more accurately represent the actual sites of protein-DNA interaction, and is more sensitive. We typically go with LM-PCR when amplification is required.

— Jason Lieb

Many researchers here use LM-PCR, partly because they learned the ChIP-chip protocol from a few local experts who use LM-PCR. It gives good enrichment, especially if a kinase step is added before the ligation, which greatly improves LM-PCR efficiency. The WGA kit from Sigma is a good alternative, as it is easy to use and gives reproducible enrichment. Either one is fine, and you probably don't need to switch if you already get one working.

— Xiaole Shirley Liu

We use a linear amplification method called TLAD, because Dr. Liu found that short fragments (<~500 bp) show biases in the traditional PCR-based amplification methods. Since we cut DNA to ~150 bp, this is crucial.

— Oliver Rando

Q4: How do you optimize sensitivity and resolution?

Sensitivity is optimized with known targets. We test multiple concentrations of labeled ChIP material in the hybridization solution and monitor the effect on these targets. Resolution in ChIP-chip is ultimately determined by the nature of the shearing/sonication used. However, software approaches can interpolate binding sites from the shapes of the peaks, and transcription factor binding site searches within the identified region can be used to infer potential binding sites. With histone modifications the ultimate resolution can be no better than the nucleosome, so the optimal approach is to use micrococcal nuclease digestion instead of sonication, as has been done in yeast.

— Ian Dunham

We are careful about the shearing and use a robust amount of labeled ChIP DNA. Sensitivity and resolution are also largely determined by the tiling path design parameters. Additionally, as part of the ENCODE consortium, we explored these areas by labeling and hybridizing a series of spike-in DNA samples.

— Ghia Euskirchen

Lower-resolution arrays generally have lower sensitivity than their denser counterparts. Furthermore, replicates are essential for increasing sensitivity, especially on Affymetrix arrays, which require at least three replicates to achieve reasonable sensitivity at whole-genome tiling resolution (35 bp).

— Wei Li

For sensitivity, we use long-oligonucleotide arrays and LM-PCR. For resolution, we use high-density tiling arrays (usually a start-to-start spacing of 50 bp). We also have been able to shear chromatin down to ~300 bp, which helps. For nucleosome mapping and chromatin-modification ChIPs, digestion to mononucleosomes with micrococcal nuclease provides enhanced resolution.

— Jason Lieb

The detection platform and analysis algorithms determine sensitivity and resolution. A recent spike-in study found Affymetrix, Agilent, and NimbleGen to have similar sensitivity at three-fold enrichment or above. Agilent and NimbleGen provide better sensitivity at 1.25- to two-fold enrichment, but Affymetrix genome tiling arrays have better peak resolution because of the dense tiling. The analysis algorithm is also key, as reported in the same study, so we want to pick the best algorithm for each array platform. In addition to detecting the whole enriched region, some algorithms (e.g. MAT for Affymetrix arrays) also provide a best estimate of the precise binding location, which allows better resolution for downstream motif analysis. Although ChIP-seq could theoretically be very sensitive, the current sequencing depth (2 million to 10 million tags for human/mouse genomes) may not be sensitive enough below 10-fold enrichment for most factors. ChIP-seq does, however, offer superior peak resolution.

— Xiaole Shirley Liu

Resolution is improved by shearing as hard as possible, or for nucleosomes, by digesting to mononucleosomes. Also, resolution is increased by increasing the number of probes tiling a region of interest. I can't speak much to optimizing sensitivity beyond the antibody titration — for microarrays obviously sensitivity is improved by minimizing background.

— Oliver Rando

Q5: How do you reduce variability from experiment to experiment to ensure reproducibility?

We try to standardize everything, including cell culture conditions, crosslink conditions, ChIP protocol, etc. We verify that cell growth is behaving the same by observation and cell sorting. One critical parameter is batch-to-batch variation of antibodies. Once you have a well-tested antibody, it is essential to have access to a sufficient amount of it from the supplier to complete all your anticipated experiments.

— Ian Dunham

In the early days of microarrays we used to have a lot of problems with cover slips and hybridization conditions. Those problems disappeared with the availability of MAUI hybridization systems. Otherwise, prior to the array work, we perform PCR analyses of the ChIP DNA using known targets and control regions. Lastly, each data set comprises three or more biological replicates, with one replicate applied per array. We compare the rank-ordered lists of targets across the arrays.

— Ghia Euskirchen

Before we put the ChIP DNA on the array, we normally use ChIP followed by qPCR to check the enrichment level at several known targets across all the biological samples (replicates). Samples inconsistent with the others can be discarded in this way.

— Wei Li

It is important to keep in mind that reproducibility, in and of itself, is not the goal of a ChIP experiment. The goal is to determine the true sites of interaction between a protein and DNA; in other words, one could generate extremely reproducible results that are incorrect. Therefore, while we do all of the standard things that are simply good scientific practice, it is important to vary factors that may introduce spurious results. For example, we always use completely independent preparations of extract from completely independent cultures for our replicates. Replicates are always performed on different days, sometimes by different people at different locations, and typically include a "dye swap." These are factors that should not matter — real targets should be robust to these differences.

— Jason Lieb

From the ENCODE spike-in study, we found experimental and human factors to be the biggest cause of variability, bigger than array platforms and analysis algorithms. Assuming the ChIP antibody is good, an experienced postdoc who consistently follows the same protocol (sonication, washing, DNA amplification) every time, plus an experienced core facility (preferably the same technician) for array hybridization and scanning, will give more reproducible results.

— Xiaole Shirley Liu

We always run several biological replicate experiments, so we are able to directly assess reproducibility. Key factors for us include using water bath incubators for cell growth rather than air incubators (temperature is more constant), having all experiments use the same batch of media, and ensuring that all of our nuclease digestions yield the same range of mono- and dinucleosomes. Other factors will pertain in other situations, of course.

— Oliver Rando

Q6: What do you do to normalize your data and optimize algorithmic analysis?

In general, the better your data, the less of a problem this becomes. We normalize the data by standard microarray approaches and then get an overview by viewing the ratio of the ChIP to input channels in a genome browser. Peak-calling algorithms are still an active area of research in many labs; we have favoured an HMM approach developed by Paul Flicek at the EBI, called Nessie (unpublished). With all the algorithms, though, the questionable area is the data closest to the boundary between the enriched signal distribution and background, and until you have biological confirmation that those regions are true hits, deciding where to draw the line will always be debatable.

— Ian Dunham

We are close collaborators with Mark Gerstein's group (Yale University); they have put a lot of effort into that question, resulting in their Tilescope program.

— Ghia Euskirchen

For one-color Affymetrix arrays, we usually apply the MAT algorithm to standardize the probe signals. MAT uses all the probes on the array to model how nucleotide (i.e. A, C, G, or T) counts, nucleotide positions within a probe, and the number of copies of each 25-mer in the genome affect the observed probe values. Based on the model, the baseline signal of a probe can be predicted directly from its sequence. Probes with similar baselines are then grouped (~3,000 probes per group), and each probe's raw signal is standardized using its predicted baseline and the variation estimated from all the probe signals within the same group.

For two-color array platforms (i.e. NimbleGen or Agilent), most of the probe variation comes from probe GC content rather than position-specific nucleotide effects. Therefore, we use the MA2C algorithm (a MAT-like algorithm for two-color arrays) to normalize probes' ChIP vs. input channel ratios by their GC content, in effect giving probes with different GC content different weights. Based on our ENCODE benchmark, both MAT and MA2C give very promising results.

— Wei Li
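
The GC-stratified standardization Li describes can be sketched roughly as follows. This is a simplified illustration of the idea only, not the actual MA2C implementation; the function name, binning scheme, and data are assumptions:

```python
import numpy as np

def gc_normalize(log_ratios: np.ndarray, probe_seqs: list, bins: int = 20) -> np.ndarray:
    """Standardize probe log-ratios within GC-content strata.

    Probes are binned by GC fraction; within each bin the values are
    centered on the bin median and scaled by a robust spread estimate,
    so GC-rich and GC-poor probes become comparable.
    """
    gc = np.array([(s.count("G") + s.count("C")) / len(s) for s in probe_seqs])
    # Quantile-based bin edges so each stratum holds a similar probe count
    edges = np.quantile(gc, np.linspace(0.0, 1.0, bins + 1))
    idx = np.clip(np.searchsorted(edges, gc, side="right") - 1, 0, bins - 1)
    out = np.empty(len(log_ratios), dtype=float)
    for b in range(bins):
        mask = idx == b
        if not mask.any():
            continue
        vals = log_ratios[mask]
        med = np.median(vals)
        mad = 1.4826 * np.median(np.abs(vals - med))  # robust sigma estimate
        out[mask] = (vals - med) / (mad if mad > 0 else 1.0)
    return out
```

After this step, a GC-rich probe and an AT-rich probe with the same standardized score represent comparable evidence of enrichment.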

For standard ChIP experiments, we still use simple normalization methods (e.g. median centering) and ChIPOTle, which is conceptually simple, easy to run, and typically performs well. Normalization and analysis depend heavily on the biology of the experiment you are doing. For example, we had to apply special normalization to our ChIP experiments involving dosage compensation proteins, since they bound only to the X chromosome and not the autosomes. We often use Z-scores to standardize data from individual replicates. Lately we've been using X. Shirley Liu's MA2C algorithm for normalization and peak calling.

— Jason Lieb
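
The median centering and Z-score standardization Lieb mentions can be sketched as below. This is a minimal illustration resting on the usual assumption that most probes measure background; the function names and replicate values are hypothetical:

```python
import numpy as np

def median_center(log_ratios: np.ndarray) -> np.ndarray:
    """Center one array's log2(ChIP/input) ratios on zero.

    Assumes most probes measure background, so the array-wide median
    approximates the unenriched baseline.
    """
    return log_ratios - np.median(log_ratios)

def zscore(log_ratios: np.ndarray) -> np.ndarray:
    """Standardize one replicate so that replicates are comparable."""
    centered = median_center(log_ratios)
    return centered / centered.std(ddof=1)

# Hypothetical replicates; probe 2 is the enriched region in both
reps = [np.array([0.1, 0.2, 2.5, 0.0, -0.1]),
        np.array([0.0, 0.3, 2.1, 0.1, -0.2])]
combined = np.mean([zscore(r) for r in reps], axis=0)
```

Averaging per-replicate Z-scores is one simple way to combine replicates before feeding a peak caller such as ChIPOTle.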

We normalize the data array by array, since tiling microarrays have a massive number of probes, the vast majority of which measure background noise and cross-hybridization. To normalize probe values across different samples, channels, and probes, we use MAT for Affymetrix arrays and MA2C for two-color arrays, both of which achieve background subtraction and normalization in one step.

Readers are strongly recommended to check out Johnson et al., Genome Research (2008), for more detailed answers to many of these and other questions.

— Xiaole Shirley Liu

We carry out "block" normalization to control for any gradients on the array. Algorithmic analysis is very much customized for the project in question, so that is a difficult question to answer. For transcription factor ChIP-chip, people have developed fairly standard peak-calling algorithms to make hard calls on binding sites, but we don't use these and I can't really be of much help in choosing the optimum.

— Oliver Rando
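
The kind of block normalization Rando describes — centering each physical print block to remove spatial gradients — might look like this in outline. This is a simplified sketch; the function name and block layout are hypothetical:

```python
import numpy as np

def block_normalize(signal: np.ndarray, block_shape: tuple) -> np.ndarray:
    """Subtract each block's median from a 2-D grid of probe intensities.

    signal      -- probe values arranged by physical (row, col) position
    block_shape -- (rows, cols) of one print block; must tile the array
    """
    rows, cols = signal.shape
    br, bc = block_shape
    assert rows % br == 0 and cols % bc == 0, "blocks must tile the array"
    out = signal.astype(float).copy()
    for r in range(0, rows, br):
        for c in range(0, cols, bc):
            block = out[r:r + br, c:c + bc]   # view into the output grid
            block -= np.median(block)         # remove the block-level offset
    return out
```

Because each block is centered on its own median, a smooth intensity gradient across the slide no longer masquerades as differential enrichment.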

List of Resources

Our panel of experts referred to a number of publications and online tools that may help you get a handle on running ChIP-chip experiments. Whether you're a novice or pro, these resources are sure to come in handy.

Publications

Buck MJ, Nobel AB, Lieb JD. ChIPOTle: a user-friendly tool for the analysis of ChIP-chip data. Genome Biology. 2005;6(11):R97. Epub 2005 Oct 19.

Horak CE, Snyder M. ChIP-chip: a genomic approach for identifying transcription factor binding sites. Methods Enzymol. 2002;350:469-83.

Huebert DJ, Kamal M, O'Donovan A, Bernstein BE. Genome-wide analysis of histone modifications by ChIP-on-chip. Methods. 2006 Dec;40(4):365-9.

Johnson DS, Li W, Gordon DB, Bhattacharjee A, Curry B, Ghosh J, Brizuela L, Carroll JS, Brown M, Flicek P, Koch CM, Dunham I, Bieda M, Xu X, Farnham PJ, Kapranov P, Nix DA, Gingeras TR, Zhang X, Holster H, Jiang N, Green RD, Song JS, McCuine SA, Anton E, Nguyen L, Trinklein ND, Ye Z, Ching K, Hawkins D, Ren B, Scacheri PC, Rozowsky J, Karpikov A, Euskirchen G, Weissman S, Gerstein M, Snyder M, Yang A, Moqtaderi Z, Hirsch H, Shulha HP, Fu Y, Weng Z, Struhl K, Myers RM, Lieb JD, Liu XS. Systematic evaluation of variability in ChIP-chip experiments using predefined DNA targets. Genome Research. 2008 Mar;18(3):393-403. Epub 2008 Feb 7.

Johnson WE, Li W, Meyer CA, Gottardo R, Carroll JS, Brown M, Liu XS. Model-based analysis of tiling-arrays for ChIP-chip. Proc Natl Acad Sci USA. 2006 Aug 15;103(33):12457-62. Epub 2006 Aug 8.

Komashko VM, Acevedo LG, Squazzo SL, Iyengar SS, Rabinovich A, O'Geen H, Green R, Farnham PJ. Using ChIP-chip technology to reveal common principles of transcriptional repression in normal and cancer cells. Genome Research. 2008 Apr;18(4):521-32. Epub 2008 Mar 17.

Mikkelsen TS, Ku M, Jaffe DB, Issac B, Lieberman E, Giannoukos G, Alvarez P, Brockman W, Kim TK, Koche RP, Lee W, Mendenhall E, O'Donovan A, Presser A, Russ C, Xie X, Meissner A, Wernig M, Jaenisch R, Nusbaum C, Lander ES, Bernstein BE. Genome-wide maps of chromatin state in pluripotent and lineage-committed cells. Nature. 2007 Aug 2;448(7153):553-60. Epub 2007 Jul 1.

Song JS, Johnson WE, Zhu X, Zhang X, Li W, Manrai AK, Liu JS, Chen R, Liu XS. Model-based analysis of two-color arrays (MA2C). Genome Biology. 2007;8(8):R178.

Wu J, Smith LT, Plass C, Huang TH. ChIP-chip comes of age for genome-wide functional analysis. Cancer Research. 2006 Jul 15;66 (14):6899-902.

Zhang ZD, Rozowsky J, Lam HY, Du J, Snyder M, Gerstein M. Tilescope: online analysis pipeline for high-density tiling microarray data. Genome Biology. 2007;8(5):R81.

Web Tools

http://www.chiponchip.org
http://openwetware.org/wiki/Wikiomics:ChIP-chip
http://tilescope.gersteinlab.org/
https://sourceforge.net/projects/chipotle-2/
https://sourceforge.net/projects/chipotle-perl/
http://chip.dfci.harvard.edu