DNA Degradation Features Help Team Map Nucleosomes, Methylation in Ancient Genome Sequences

By exploiting sequence and read depth signals that reflect degradation, damage, and protection in ancient DNA, a Danish-led team has come up with treatment-free methods for profiling genome-wide methylation and nucleosome packaging patterns in ancient genomes.

The approaches rely on read depth and deamination profiles to reveal nucleosome occupancy and cytosine methylation, respectively, rather than sequencing DNA treated with nucleases or bisulfite.

"We don't use anything but the DNA sequence," Ludovic Orlando, a researcher at the University of Copenhagen Centre for GeoGenetics, told In Sequence. "We don't bisulfite treat, we don't capture anything. We just use DNA."

In a study published online last week in Genome Research, Orlando and colleagues from the University of Copenhagen, Aarhus University, and elsewhere used existing genome sequence data from a 4,000-year-old Paleo-Eskimo individual from the Saqqaq culture in proof-of-principle demonstrations of the techniques.

Within that dataset, they saw peaks and valleys in genome-wide read depths that appeared to reflect ancient DNA protection by nucleosomes, which diminish degradation in the sequences they occupy. On the other hand, cytosine deamination proved useful for seeing methylated regions of the genome, since this degradation process affects methylated and unmethylated cytosines differently.

Together, the approaches produced nucleosome occupancy and cytosine methylation maps that largely jibed with patterns in modern human samples — particularly across promoter regions and CTCF transcription regulator binding sequences that the team focused on to validate their results.

Similar epigenetic profiles could also be pulled from available DNA sequence data on samples as recent as 100 years old and as old as 120,000 years old.

But beyond that, the group was able to use the ancient nucleosome and methylation profiles — together with data from previous studies — to make preliminary predictions for the Saqqaq individual's age as well as the gene expression patterns present in the Saqqaq hair sample.

"At the end of the day, we use DNA to do gene expression predictions," Orlando said, "which is kind of weird because normally we do that with RNA."

As such, the study's authors noted that the new approaches could have applications for not only providing access to epigenetic information from ancient organisms and tracking nucleosome biology over time, but also for fleshing out phenotypes in evolutionary or forensic studies.

"In terms of the ancient genomic applications, that's one of the exciting opportunities: we can infer something beyond the genotype of potentially extinct species and other samples that are old," co-first author Jakob Skou Pedersen, a molecular medicine researcher at Aarhus University, told IS.

The idea of tapping into the degradation patterns in ancient DNA is not new, Orlando explained. Investigators who routinely concern themselves with ancient DNA have taken to profiling various forms of DNA damage, for example as a means of assessing sample quality or removing sources of contamination.

In the process of doing such profiling on the Saqqaq genome, which Orlando, Skou Pedersen, and others published in Nature in 2010, the researchers realized that a form of DNA damage called deamination produced different nucleotide misincorporation events in place of the original cytosine bases, depending on the presence or absence of methylation modifications.

Indeed, a closer look at the available read data indicated cytosine deamination events in ancient DNA were occurring in a manner akin to the bisulfite treatment used to convert unmethylated cytosines to uracil during bisulfite sequencing experiments used to assess methylation in modern samples.

Rather than converting cytosine bases to uracil with the help of a chemical treatment, though, the group realized that inherent cytosine deamination in the samples was swapping uracil bases in place of unmethylated cytosine.

That base is picked up by polymerase enzymes such as Taq platinum high fidelity, but missed by others, including the Phusion, or Pfu Taq, polymerase that researchers had used for some of their Saqqaq sequencing experiments.

On the other hand, cytosines that started out methylated got converted to thymine during this type of degradation — a conventional DNA base that's recognized by a range of polymerase enzymes.

The team decided to take advantage of this discrepancy to get a look at previously methylated regions of the Saqqaq genome, using reads generated with the help of Taq HiFi and Phusion enzymes to map nucleotide misincorporation events reflecting this methylation.
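As a rough illustration of the idea, the sketch below tallies cytosine-to-thymine mismatches at CpG sites in reads produced by a polymerase that does not read through uracil, so that a T over a reference CpG cytosine is taken as a remnant of a methylated cytosine. It is a simplified stand-in rather than the team's actual pipeline: the function name, the `(start, sequence)` read format, and the assumption of ungapped forward-strand alignments are all hypothetical.

```python
def cpg_deamination_score(reference, reads):
    """Count C->T mismatches at reference CpG sites.

    reference: reference sequence (forward strand).
    reads: iterable of (start, sequence) pairs, assumed to be ungapped
           forward-strand alignments (a simplifying assumption).
    Returns (converted, covered, fraction), where `covered` counts read
    bases that show C or T over the C of a reference CpG.
    """
    ref = reference.upper()
    converted = 0
    covered = 0
    for start, seq in reads:
        for offset, base in enumerate(seq.upper()):
            pos = start + offset
            if pos + 1 >= len(ref):
                break  # no room left for a CpG dinucleotide
            if ref[pos] == "C" and ref[pos + 1] == "G":
                if base in ("C", "T"):
                    covered += 1
                    if base == "T":
                        converted += 1  # candidate methylated-C deamination
    fraction = converted / covered if covered else 0.0
    return converted, covered, fraction


if __name__ == "__main__":
    reference = "ACGTTCGA"
    reads = [(0, "ATGTTCGA"), (0, "ACGTTTGA"), (4, "TCGA")]
    print(cpg_deamination_score(reference, reads))
```

Because the Saqqaq damage was only sufficient for regional signals, a score like this would be pooled over windows such as promoters rather than read off individual cytosines.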

To gauge the authenticity of the methylation signals gleaned from these patterns, the researchers went on to compare them with known methylation patterns in the human genome.

As expected, the Saqqaq genome showed a dip in thymine bases in regions known for lower-than-usual methylation levels. On the other hand, the team saw a jump in these products of methylated cytosine deamination in spots known for hyper-methylation, including splice site boundaries and certain promoter regions.

"We know that there are some classes of promoters that are highly methylated in the genome, and some have lower methylation rates," Orlando noted. "So we clustered the promoters according to what we expected and checked whether those that were supposed to be hyper-methylated showed really high methylation levels … and vice versa."

Methylated regions in the Saqqaq sample also clustered most closely with other hair samples when the team put them up against array-based methylation profiles from several modern-day human tissue samples, lending further support to the notion that the sequence misincorporation method was picking up remnants of legitimate methylation marks.

Meanwhile, Skou Pedersen and his group made their own observations while working with genome sequences generated from the ancient samples: They noticed that read depths across the genome (sequenced to an average depth of around 20-fold using DNA isolated from a hair sample) showed a regular variation that did not seem to reflect sequence composition or coverage biases alone.

"There was this great periodicity in the read depth, with peaks regularly spaced at 200 base pair intervals," Skou Pedersen said. "It's like peaks and valleys: You have a peak every 200 base pairs and you have a valley in between."
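One simple way to look for that kind of regular spacing is to correlate a per-base read depth track with a shifted copy of itself and see which shift fits best. The sketch below does that with a plain Pearson correlation; the function names and the lag range are illustrative assumptions, and the article does not describe how the team actually quantified the periodicity.

```python
import math


def lagged_correlation(depth, lag):
    """Pearson correlation between a depth track and itself shifted by `lag`."""
    x = depth[:-lag]
    y = depth[lag:]
    n = len(x)
    if n < 2:
        return 0.0
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def dominant_period(depth, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the highest self-correlation."""
    return max(range(min_lag, max_lag + 1),
               key=lambda lag: lagged_correlation(depth, lag))


if __name__ == "__main__":
    # Synthetic track: ~20-fold depth with a peak every 200 positions.
    depth = [20 + 5 * math.cos(2 * math.pi * i / 200) for i in range(2000)]
    print(dominant_period(depth, 100, 300))
```

On real data the track would be noisier, so the best-fitting lag is better read as an estimate of the spacing than as an exact value.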

Even after correcting for biases associated with the presence of guanine and cytosine nucleotides in a given sequence, the group was left with read depth variation that appeared to reflect the proximity to genes and promoter regions in the genome.

Those patterns, as it turned out, "correlated very nicely with what is known about nucleosome positioning," Skou Pedersen explained, prompting enthusiasm about using coverage discrepancies to delve into nucleosome occupancy patterns in the long-deceased Paleo-Eskimo individual.

After using a statistical method to correct for GC-biases introduced during PCR amplification and sequencing steps, the researchers used the residual read depth variation left across the genome to map nucleosome occupancy.
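The article does not say which statistical method was used for the GC correction. As a hedged illustration of the general idea, the sketch below swaps in a simple binned-median normalization: windows are grouped by GC content, and each window's depth is divided by the median depth of its GC bin, leaving a residual signal that is no longer explained by GC content alone. The function names, the window representation, and the 0.05 bin width are assumptions made for this example.

```python
from collections import defaultdict
from statistics import median


def gc_fraction(seq):
    """Fraction of G and C bases in a sequence window."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def gc_corrected_depth(windows, bin_width=0.05):
    """Normalize window depths by the median depth of windows with similar GC.

    windows: list of (gc_fraction, mean_depth) pairs.
    Returns one ratio per window; values above 1 mean more reads than
    expected for that GC content, values below 1 mean fewer.
    """
    bins = defaultdict(list)
    for gc, depth in windows:
        bins[int(gc / bin_width)].append(depth)
    expected = {b: median(depths) for b, depths in bins.items()}

    corrected = []
    for gc, depth in windows:
        exp = expected[int(gc / bin_width)]
        corrected.append(depth / exp if exp > 0 else 0.0)
    return corrected


if __name__ == "__main__":
    windows = [(0.31, 10.0), (0.33, 20.0), (0.32, 30.0),
               (0.61, 40.0), (0.63, 80.0)]
    print(gc_corrected_depth(windows))
```

The corrected ratios, rather than the raw depths, are what one would then scan for peaks and valleys.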

The resulting map had features that were comparable to known nucleosome occupancy patterns in present-day samples, the researchers reported, showing decreased occupancy upstream of apparent transcription start sites and more pronounced occupancy across coding regions, for instance.

The nucleosome features predicted from read depth data in the Saqqaq genome were also consistent within so-called nucleosome array regions — parts of the human genome that typically show similar nucleosome occupancy regardless of the tissue type considered.

The read depth-based method appeared to pick up plausible nucleosome occupancy patterns in genome sequence datasets generated from other ancient samples, too, the group noted, including a 120,000-year-old polar bear bone sample and a 100-year-old hair sample from an Aboriginal Australian.

"We see nearly the exact same pattern — this same variation — in an ancient Aboriginal sample from Australia," Skou Pedersen noted. "This sample is only about 100 years old, but has been stored in rather different conditions than the one from Greenland."

The ability to discern both nucleosome positions and methylation patterns opens the door to a host of potential applications, the authors of the study explained.

Based on recent research suggesting ties between tissue methylation patterns and human aging, for instance, they have already explored the possibility of using methylation information as a window into the Saqqaq individual's age.

Methylation marks detected in the Paleo-Eskimo genome, together with methylation-age data discerned from past studies, suggested that the Paleo-Eskimo likely lived to be 40 to 60 years old.

Using information on the nucleosome positions and methylation profiles known to neighbor active or silenced genes, meanwhile, the researchers took a crack at estimating gene expression in the Paleo-Eskimo hair sample.

"By knowing the pattern of methylation and the positioning of the nucleosomes, we can say something about how expressed a given gene was in an ancient sample," Skou Pedersen said, adding that "this is somewhat at the proof-of-principle stage in this paper."

Even so, from the Saqqaq nucleosome and methylation data, the team was able to pull out gene expression predictions that seemed to jibe with modern hair samples.

When they looked at the genes predicted to be expressed most highly in the ancient hair, the researchers found an over-representation of plausible hair components, including genes coding for hair-specific forms of keratin.
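The article does not specify how over-representation was assessed. A common choice for this kind of question is a hypergeometric test, sketched below under that assumption: given a gene universe of a known size, a category such as keratin genes, and a list of top-ranked genes, it asks how likely it is to see at least the observed number of category genes in the list by chance. The function name and all numbers in the usage example are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, population, category_size, draws):
    """P(X >= k) for a hypergeometric draw without replacement.

    population: total number of genes considered.
    category_size: genes in the category of interest (e.g. keratins).
    draws: size of the top-expressed gene list.
    k: number of category genes observed in that list.
    """
    total = comb(population, draws)
    upper = min(category_size, draws)
    hits = sum(
        comb(category_size, i) * comb(population - category_size, draws - i)
        for i in range(k, upper + 1)
    )
    return hits / total


if __name__ == "__main__":
    # Hypothetical numbers: 20,000 genes, 50 in the category,
    # 200 top-ranked genes, 8 of which fall in the category.
    print(hypergeom_upper_tail(8, 20000, 50, 200))
```

A small tail probability would indicate that the category appears in the top-ranked list more often than random sampling would suggest.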

If such findings hold in future studies, it appears possible that DNA sequence data alone could eventually offer a window into not only DNA sequences from long-deceased individuals, but also epigenetic and expression attributes that can be used to tease apart other functional and phenotypic features.

That might serve as a way of picking up contamination in samples — for instance, errant skin cells that have gotten into an ancient hair sample such as the one assessed in the current study.

Having access to a combination of genetic and epigenetic information for the same samples also raises the possibility of being able to see whether physical features or traits associated with various SNPs matched the predicted expression patterns in those samples, Orlando said. "By merging those different approaches, maybe you'll achieve much better phenotyping of ancient individuals."

For instance, he pointed to the possibility of trying out such techniques with Neandertal and Denisovan genome sequence datasets. Orlando's own group is interested in understanding genetic and phenotypic changes associated with domestication in the horse lineage, another area where additional layers of epigenetic and expression information may prove useful.

But there are other potential applications, too. Skou Pedersen noted that his group is interested in learning more about nucleosome positioning, biology, and function.

Samples profiled using the read depth approach described in the study may help in that respect, he said, since early results hint that natural degradation processes might introduce fewer biases than nuclease treatments typically used to assay for nucleosomal occupancy in modern samples.

"It somehow appears that we have less bias in our [Saqqaq] dataset," Skou Pedersen said. "We don't really know what's cleaving in our datasets, but it appears that we somehow get more sharp peaks than in existing datasets."

Even so, findings from low-coverage datasets representing modern human hair samples suggest that at least some nucleosome occupancy information may be distinguishable from samples that aren't all that old or degraded, he noted.

On the other hand, the ability to precisely profile methylation patterns in the genome seems to depend quite heavily on the extent to which DNA degradation and deamination have occurred within a given sample.

In the case of the Saqqaq sample, for instance, Orlando noted that there was sufficient degradation to see cytosine methylation patterns across genomic regions. But even after several thousand years, the Greenland sample had not undergone the extreme damage that would be needed to see methylation marks on individual cytosines.

Such damage is not a direct function of time, he explained, but instead reflects internal and external factors, including features of the environment where DNA has been preserved or stored, such as temperature and pH.

"If you have a 100-year-old sample from the tropics, it might be much more damaged than a 10,000-year-old sample from the Arctic, for example," Orlando said.