The Salk Institute for Biological Studies
Name: Ryan Lister
Position: Postdoctoral fellow, The Salk Institute for Biological Studies (Joe Ecker’s group)
Experience and Education:
— PhD in biochemistry, The University of Western Australia, 2006
— BSc in biochemistry and genetics, the University of Western Australia, 2001
Joe Ecker’s group at the Salk Institute for Biological Studies was one of the early-access customers of Solexa’s sequencer, now the Illumina Genome Analyzer.
Two weeks ago, Ecker and his team published in Cell a sequence-based study of the methylome, mRNA transcriptome, and small RNA transcriptome of Arabidopsis thaliana.
Ryan Lister, a postdoctoral fellow in Ecker’s lab, is the lead author of that study. In Sequence caught up with him during Cambridge Healthtech Institute’s Next Generation Sequencing meeting in San Diego last week.
How did the Arabidopsis epigenome project that you just published come about?
Joe and his lab had a collaboration with Steve Jacobsen at UCLA. In 2006, they published a methylome based on tiling array immunoprecipitation results, which was the first look, for any eukaryotic genome, at the cytosine methylome as well as transcription.
But the natural direction is to take that to the highest resolution possible at which you can measure DNA methylation — it’s the perfect epigenetic modification to couple with sequencing. Histone modifications, they will be really informative [too], but they don’t benefit as remarkably as DNA methylation does through the increase in resolution. Joe was at a meeting a couple of years ago where Solexa, then, was first talking about its technology. And he went up to them and said, ‘I want one of these’ and put an early order in. And because he has done a lot of genomic work, [such as] sequencing the Arabidopsis genome, creation of many genomic tools like mutant lines and cDNA collections, we were shipped an instrument in January 2007.
I had just arrived in the lab. I’m on a fellowship to look at DNA methylation and its variation in different strains of Arabidopsis, which is something we are planning to do now — looking at how genetic differences [are the] basis for the epigenetic variation we see between these related strains. It was sort of the right time for me to work on this project, looking at DNA methylation. And it is coupled to the bisulfite conversion technique, it is sort of a logical extension of the technique, which has been done previously for a decade or more.
What kinds of methods did you develop?
The instrument would sequence [but] in the beginning, the only kit we had for it was for genomic DNA sequencing. That’s what we began doing, just for testing it out. But it worked pretty robustly, besides some early teething issues with some hardware. But quickly, we were able to generate quite a bit of sequence and moved on to this bisulfite sequencing project.
I tested several different protocols for doing this bisulfite conversion because there is a lot of talk about different efficiencies of conversion. I settled on one which coupled a pretty high conversion rate, up to 99 percent conversion, with low degradation of DNA.
The next step, where all this has to go, is to move from what other groups were doing, which is just grinding up whole tissues, a big heterogeneous population of cells, into populations of the same cell type, and ultimately, down to a single cell. [For that], it’s essential to find a technique that doesn’t destroy your DNA.
And then, it was just a matter of doing a lot of the sequencing. It took a while, just learning by experience, how to cope with the data type. We developed a lot of scripts to do some statistical analysis, using some binomial distribution to make sure we don’t identify too many false-positives.
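As a rough illustration of the kind of binomial filter described here, the sketch below asks whether the number of unconverted reads at a cytosine is more than the bisulfite non-conversion rate alone would explain. It is not the lab's actual script. The function names, the 1 percent error rate (taken loosely from the "up to 99 percent conversion" figure above), and the 0.01 significance cutoff are all assumptions made for the example.

```python
from math import comb


def binomial_tail(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def call_methylated(unconverted: int, coverage: int,
                    error_rate: float = 0.01, alpha: float = 0.01) -> bool:
    """Call a cytosine methylated if the unconverted read count is unlikely
    to arise from failed bisulfite conversion alone.

    error_rate (assumed): chance an unmethylated C survives conversion.
    alpha (assumed): significance cutoff for the one-sided binomial test.
    """
    if coverage == 0:
        return False  # no reads, no call
    return binomial_tail(unconverted, coverage, error_rate) < alpha


# Hypothetical sites: (unconverted reads, total reads covering the cytosine)
for unconverted, coverage in [(1, 10), (3, 10), (0, 25)]:
    print(unconverted, coverage, call_methylated(unconverted, coverage))
```

With these assumed settings, a single unconverted read out of ten is not enough for a call, because a 1 percent failure rate produces that outcome fairly often by chance. A genome-wide analysis would also need a correction for the very large number of sites tested, which this sketch leaves out.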
We also, then, got early access to the small RNA sequencing kits from Illumina. And Brian Gregory in the lab used that to do the small RNA sequencing from the wildtype plant and all the mutants. That’s the natural partner to the DNA methylation [study] because in plants, it’s known that the small RNAs target DNA methylation. But we wanted to look at this relationship in exactly the same tissue and correlate it at this really high resolution with the DNA methylation.
It’s an established pathway in plants, called RNA-directed DNA methylation. This is the first time we have been able to look at exactly where the methylation is underlying the small RNAs. Before, it could be done at several selected loci throughout the genome, but never before on the whole genome, or many, many targets.
And then, finally, we wanted to look at how the transcriptome was altered in DNA methyltransferase and DNA demethylase mutants. That’s ... something which has never been looked at at this level before, which is demethylation, the removal of the methyl cytosines, and seeing precisely where they are functioning. It’s something that really isn’t well known in mammalian systems, and there is still a lot of argument over whether there are these demethylases. But in plants, the biochemistry of these demethylases has been shown convincingly.
Did you have to develop the mRNA sequencing method?
Yes. It was basically taking the adaptors that we got from the small RNA kit and then coming up with a lot of enzymatic and chemical treatment of that RNA to manipulate it to develop the strand-specific protocol.
You get rid of all the ribosomal RNA, which is why we use this locked nucleic acid depletion method. Our aim was to have as little selective bias as possible, so not selecting for a poly-A tail because there may be subsets of the transcriptome which are not poly-adenylated which we want to look at. And also, to do random fragmentation just by metal hydrolysis. You do that, and then by blocking certain ends of the RNA fragment, ... we can specifically ligate one adaptor on one end, and then put the second adaptor on the [other] end. Now you know that, since you always sequence from adaptor A, it gives you the strand-specific information, so you can comfortably say which strand the transcript originated from.
How do your bisulfite sequencing results differ from those published by Jacobsen’s group in Nature last month (see In Sequence 3/18/2008)?
In their study, they focused intensely on the patterning of DNA methylation. Certain patterns they observed from the bisulfite sequencing by spacing of methyl cytosines, which apparently look like they are spaced between nucleosomes, patterns in the telomeric sequences, and between certain asymmetric methylation. And for that, they did deep sequencing of the wildtype plants.
But then, when they are looking at the various mutants, they only did a really low level of sequencing, whereas we wanted to get a really comprehensive measurement of the methylation in not only wildtype, but in the DNA methyltransferase mutants as well. So we sequenced them to almost the same level as the wildtype. On top of that, we also decided to look at the DNA demethylases, which really give an extra dimension to the data, showing that there is this group of enzymes which are constantly moving [around] the genome, protecting it from methylation, being laid down in particular places, which makes the methylome appear highly dynamic.
[This] would fit in with some recent results from animal systems, looking at this periodicity of methylation and demethylation of promoters, saying that in synchronized cell cultures, you can get promoter methylation and then demethylation in a matter of 100 minutes or so. It will likely end up being something which is as dynamic as the transcriptome, which needs to be measured in detail at different time points and different cell types under various stimuli to get an idea of how it changes.
What are you going to do next?
One thing [is] doing the bisulfite sequencing and transcriptome sequencing and small RNA sequencing for this Cape Verde Islands strain of Arabidopsis, which is a highly diverged strain of Arabidopsis thaliana from off the west coast of Africa. It shows significant phenotypic variation, and we, in a collaboration with 454 and [the Joint Genome Institute], have done a lot of resequencing, with both shotgun and different length paired ends.
Now, armed with a good quality reference sequence for this CVI genome, we will be able to layer on top of it the DNA methylation, the epigenome, the small RNAs which are targeting this DNA methylation, and any consequences on transcription, and look at how transposition events, novel integration events of transposons, may alter the epigenetic pattern surrounding them. Because there have been a few cases where different expression of a gene has been found in a plant, and upon closer inspection, it’s been found that a transposon has inserted right next to that gene, and a consequence ... methylation, for example, histone modifications, aimed at that transposon, can be spread into the surrounding genic regions and alter gene expression. And this may be unintended, or may have a deleterious consequence, but [could] also [be] a process by which variation in gene expression can arise between different natural populations.
Would the idea be to do this not in one but in many other strains?
There is talk of sequencing a large number of Arabidopsis [strains], sequencing 1,001 of these. And we want to follow this up, also, with the associated data of [the] epigenome for all of them. And we have to see how the sequencing technologies progress to determine how feasible it is to do for 1,000 of them.
Knowing the structural variation and sequence variation, we can really use that data to try to find out whether the differences in methylation that we are seeing have a genetic basis. Microarray experiments so far indicate that there is large variation in DNA methylation patterns in the methylomes of different strains of Arabidopsis. But without actually sequencing it and knowing for sure where that methylation is, it’s unknown whether that’s just an artifact of relying on arrays for the detection. Perhaps it has detection limits. So if the detectability was much greater, you may actually find that the methylation patterns look more similar.
How far has this idea for a 1,001 Arabidopsis genome project progressed?
There are a number of labs now that are sequencing different strains of Arabidopsis. It’s gaining some momentum. There is a lot of interest in different strains of Arabidopsis for which recombinant inbred lines have been developed, strains in which you see differential tolerance to abiotic or biotic stresses. So sequencing [will help] to try and get an idea of the genetic basis of these phenotypic differences.