Name: Françoise Thibaud-Nissen
Title: Staff scientist, department of plant genomics, the Institute for Genomic Research
Professional Background: 2006-present, staff scientist, department of plant genomics, the Institute for Genomic Research; 2003-2006, postdoc, Arabidopsis Laboratory, TIGR; 1999-2003, graduate research assistant, Soybean Biotechnology Laboratory, department of crop sciences, University of Illinois at Urbana-Champaign.
Education: 2003, PhD in plant molecular biology, University of Illinois at Urbana-Champaign; 1993-1996, MSc, biochemistry, Institut National Agronomique de Paris-Grignon, France.
A paper in the July 2006 issue of Plant Journal reinforced what most providers of microarrays for chromatin immunoprecipitation (ChIP)-on-chip have known for the past two years — the market for ChIP-on-chip technology is a dynamic one that will continue to push into diverse areas of research, including the field of molecular plant biology.
In the paper, Françoise Thibaud-Nissen, a staff scientist in the department of plant genomics at the Institute for Genomic Research, and fellow investigators describe how they developed arrays for the high-throughput identification of transcription factor-binding sites in Arabidopsis thaliana using ChIP-on-chip [Plant Journal. 2006 Jul;47(1):152-62].
Specifically, the TIGR researchers worked with NimbleGen to create two array sets: one containing known promoters on a single chip, the other containing the entire Arabidopsis genome spread across three chips. The researchers studied binding sites for the TGA2 transcription factor in plants treated with salicylic acid. To learn more about the new arrays and ChIP-on-chip in general, BioArray News spoke with Thibaud-Nissen last week.
What is your role at TIGR and why did you start this project?
I have been at TIGR for about three years. I was hired as a postdoc specifically to work on this project, mainly to work on the creation of the chip and the analysis of the chip data.
Beyond all the data you get with expression arrays, there’s really a need to find a hierarchy among the genes that are differentially regulated in response to a stimulus, identify which interact with which, and when.
Expression arrays give you the landscape, but they don’t tell you in what order things happen, if there’s an order, and which genes influence the expression of which. The idea was that we would uncover the transcriptional network of a factor important in systemic-acquired resistance using ChIP-chip and expression analysis.
We knew that the TGA transcription factors were playing a role in the induction of salicylic acid and the question was what genes in particular do they govern? That’s really what is driving ChIP-chip because we know that transcription factors are important switches, but in very few cases do we know what targets are downstream of the switches.
How did you develop the arrays?
The first idea was to find the genes that were the most likely targets of the TGA2 factors based on biology and the presence of the TGA factor cis-element in the promoter. We selected around 200 targets, made oligos, and printed them. The only target we detected when we hybridized these first chips with TGA2-immunoprecipitated chromatin was the one we already knew. So we needed a more open approach. Then NimbleGen came up with a fairly cheap way of creating oligos and actually synthesizing them in situ on the slide. We were sort of driven by the technology that was developing at the time. NimbleGen designed the oligos, sent us their designs, and we looked at spacing between the oligos and whether we were missing important parts of the genome and then went with it.
Why did you need to go to a company to do that?
In house we have designed a small chip of 200 probes, but on the promoter array there are 190,000 probes and on the whole-genome array there are almost 1 million probes. And so I think it was cheaper to go to NimbleGen.
You developed two arrays for this project and then compared them in the paper. Can you describe these new arrays?
The first one represents promoters only in the 2 kb region upstream of the start codon. The capabilities increased along with time. In the beginning we were limited by the number of oligos that we could fit onto one chip — 190,000. NimbleGen increased their capabilities to 390,000 oligos per chip and started doing two-color hybridizations. So the cost difference of using the whole-genome versus the promoter only was not that big.
By the time we decided to switch [to a whole-genome chip], we also knew that in humans, binding of transcription factors outside of the traditional promoter regions had been shown. So we expected some targets of the TGA factors to be outside of the promoter region and that’s why we expanded to the whole-genome tiling array.
How did you test them to determine which would be the more optimal array?
The whole-genome array is denser. There is one probe for every 90 bases. For the other one, it’s seven probes for 2 kb, so the distance between adjacent probes is much wider. You have lower confidence in the regions identified with the promoter array because usually your binding sites are only indicated by two or three probes, max, on the promoter array, while on the larger array you can have in some cases seven or eight probes. So your level of confidence is a lot higher. The chance of finding, just by chance, eight probes that give a significantly higher signal with your ChIP sample than with raw chromatin is pretty small.
For the experiment in the paper, we had one sample that we hybridized to both arrays. We looked at the results that we were getting with both and realized that of course we weren’t picking up the regions that were outside of the promoter regions with the promoter array, but more importantly we found that in some cases the enrichment was so close to the start site of the gene that it was not really picked up by the promoter array because the closest probe to the ATG site was a little too far away and we were sort of seeing just half a peak really. In other cases only a single probe showed enrichment in the promoter array, while two or three appeared on the whole-genome array. It makes a difference in the confidence you can have in the data.
I should add that these arrays are not set in stone because the way the NimbleGen technology works is that the oligos are synthesized on the chip. So at any time if you are not satisfied with your design you can change. There is no upfront cost for the design. You could create hybrids of what we’ve got: instead of having a -2 kb to ATG region of the promoter you could expand it to -3 kb to +1kb. You could probably get most of the targets that you are interested in for less than if you use the whole-genome tiling array. In the same way you can vary the density. So it’s really a very flexible system.
What kinds of tools are you using for data analysis?
We are using mostly [Bioconductor’s] R-based software, and what we’ve done so far is just the basic identification of the probes that are above a certain threshold, seeing how close together they are, and identifying clusters of probes that have high ratios.
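The threshold-and-cluster approach described in this answer can be sketched roughly as follows. This is an illustration only, written in Python rather than the R-based Bioconductor software the researchers used, and it is not their code. The function name `find_enriched_clusters`, the threshold of 1.0, the 200-base maximum gap, and the two-probe minimum are all hypothetical values chosen for the example.

```python
from typing import List, Tuple

# A probe is (genomic position in bases, ChIP-vs-input log ratio).
Probe = Tuple[int, float]
# A cluster is (first position, last position, number of probes).
Cluster = Tuple[int, int, int]


def find_enriched_clusters(
    probes: List[Probe],
    threshold: float = 1.0,   # hypothetical ratio cutoff
    max_gap: int = 200,       # hypothetical max distance between neighbors
    min_probes: int = 2,      # hypothetical minimum cluster size
) -> List[Cluster]:
    """Group above-threshold probes that lie close together on the genome."""
    # Step 1: keep only the positions of probes at or above the threshold.
    hits = sorted(pos for pos, ratio in probes if ratio >= threshold)

    clusters: List[Cluster] = []
    current: List[int] = []
    for pos in hits:
        # Step 2: a gap wider than max_gap closes the current cluster.
        if current and pos - current[-1] > max_gap:
            if len(current) >= min_probes:
                clusters.append((current[0], current[-1], len(current)))
            current = []
        current.append(pos)

    # Step 3: close the final cluster, if it is large enough.
    if len(current) >= min_probes:
        clusters.append((current[0], current[-1], len(current)))
    return clusters


if __name__ == "__main__":
    example = [(0, 0.1), (90, 1.5), (180, 1.8), (270, 1.2),
               (360, 0.2), (900, 1.4), (1800, 2.0), (1890, 1.1)]
    print(find_enriched_clusters(example))
```

In this made-up example, the isolated high probe at position 900 is discarded because it has no enriched neighbor, which mirrors the point that a single enriched probe inspires less confidence than several adjacent ones.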
How developed are TIGR’s capabilities for ChIP-chip analysis?
It’s still relatively new. There’s a lot that needs to be done in terms of analysis of the data. One thing is that it is still a relatively expensive technology and the number of replicate chips for a given experiment is going to be small. So one needs to take advantage of the fact that for a single binding site there are several oligos that are found enriched and leverage statistical power out of the proximity of the probes. Basically you have some level of replication built in your chip because of the fact that for single enriched regions you have several probes that are enriched.
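A back-of-the-envelope calculation shows why a run of adjacent enriched probes acts as built-in replication. The sketch below assumes, purely for illustration, that each probe crosses the threshold by chance independently at a fixed rate. Neighboring probes on a real array are not independent, so the result is a rough intuition and not a proper significance test. The function name and the per-probe rate are hypothetical.

```python
def expected_chance_runs(n_probes: int, run_length: int,
                         per_probe_rate: float) -> float:
    """Expected number of windows of `run_length` consecutive probes that
    all exceed the threshold by chance, assuming independent probes."""
    if run_length < 1 or run_length > n_probes:
        return 0.0
    # Number of distinct windows of consecutive probes on the array.
    windows = n_probes - run_length + 1
    # Under independence, one window is all-positive with probability p**k.
    return windows * per_probe_rate ** run_length


if __name__ == "__main__":
    # Roughly one million probes, as on the whole-genome array set;
    # the 5% per-probe chance rate is an assumption for illustration.
    for k in (1, 2, 3, 8):
        print(k, expected_chance_runs(1_000_000, k, 0.05))
```

Under these assumptions the expected number of chance runs falls steeply as the run gets longer, which is the reasoning behind trusting a region marked by seven or eight probes more than one marked by two or three.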
Is either array particularly suited for certain applications?
Well, it sort of depends what you know about your transcription factor. It also depends on your wallet because there is one that is more expensive than the other. I think that the larger tiling array is better. You are going to get more for your money. It’s also more data to analyze.
Where is your research going?
I think we are just hitting the tip of the iceberg. I think that systemic acquired resistance is a fairly complex phenomenon. There’s a lot more transcription factors that are involved than the TGA factors. Bridging the gap between binding of the transcription factor and triggering expression is probably going to be a little harder than we anticipated because there’s probably a number of factors that are involved.
Is there a functioning community of ChIP-chip researchers?
Since we published the paper, people have contacted us to get the raw data. There’s a lot of interest, but I don’t know if there is a community per se. For plants it’s very limited; it’s basically us. But the number of users is growing quickly in the yeast and mammalian communities. So usage is getting bigger, but a community per se? I haven’t gotten wind of anything.
If the technology could be improved, what would you suggest?
I think the price makes a difference. Getting good immunoprecipitated chromatin is getting easier, but the enrichment most people see isn’t through the roof. If the arrays were cheaper you could get more replication and much better data. Another thing is that the resolution you see on the array is really dependent on the size of the chromatin fragments. The smaller the fragments you can get, the better the resolution you can have. So there’s still some progress to be made in the resolution of ChIP-chip.