Hidden beneath layers of sediment or deep in the permafrost are the remnants of past life on Earth. Neandertals, dinosaurs, and ancient equids once roamed the Earth's forests and plains — but until now, little was really known about them. That, however, is changing.
With modern genomic and proteomic tools, researchers are able to learn more about the genes and proteins of ancient fossils and determine more about these organisms' evolutionary relationships to modern organisms. The work isn't easy — researchers constantly do battle with small and degraded samples, all under the ever-present threat of contamination.
"The challenges are that it's really poorly preserved molecular material, it's highly damaged, and it's very fragmented. This creates a number of problems, both in terms of an increased risk of contamination, but also some challenges in regard to getting the correct DNA sequence due to miscoding lesions resulting in substitutions or insertion of various bases through amplifying the DNA," says Eske Willerslev, director of the Centre for Ancient Genetics at the University of Copenhagen. "Basically, very poor material is the great challenge."
New and better tools, particularly sequencers for DNA studies and increased mass resolution for protein studies, are letting researchers compensate for that poor starting material. Recently, researchers announced their sequencing of the genomes of an ancient human and of Neandertals. "We have to do more genomes, we have to do more Neandertals, more ancient species," says Eva-Maria Geigl, a co-group leader of the epigenome and paleogenome team at the Jacques Monod Institute in Paris. "Next-generation sequencing is opening up real evolutionary studies. We are very excited. We will not miss the train."
After spending thousands of years under dirt or ice, DNA strands and proteins begin to break down. DNA is prone to fragmentation, diagenetic lesions and cytosine deamination. It's also found in low quantities. Michael Hofreiter, a professor at the University of York, estimates that an ancient DNA sample dating back 30,000 to 40,000 years would have an average fragment length of 50 to 100 base pairs.
The oldest samples of ancient DNA that have been analyzed — and authenticated — to date have come from beneath the permafrost. In 2007, Copenhagen's Willerslev and his colleagues reported their study of the basal part of ice cores from Greenland. No one else was interested in this silty ice part of the cores, he says, since it is difficult to date. When the ice is clean, he says dating is nearly as simple as reading tree rings, but with silty ice, it is less clear. There, Willerslev and his colleagues found pollen grains, from which they extracted and then amplified DNA. "[We] found evidence of a forest, a conifer forest, in Greenland which was pretty surprising," he says, adding that they then used four different methods to determine the age of the silty ice. "They all showed that it was in the range of around half a million years," he says.
Permafrost, though, is a special circumstance and only captures a small slice of life that once roamed on Earth. Many more organisms lived outside of those cold regions, but their DNA is more difficult to come by. "For sequencing information, what I would say is now admitted is 100,000 years in non-permafrost environments," says Jacques Monod's Geigl, who studies fossils from more temperate regions.
Ancient proteins, too, are found at very low levels and in various stages of degradation. They can lose their secondary and tertiary structures as well as undergo hydrolysis or condensation and deamination. "The main problem that we've got right now, first and foremost, is low concentration," says Mary Schweitzer, an associate professor at North Carolina State University. In addition, she says "it is almost a certainty" that any proteins they do see will be highly modified.
But how long until they fall apart and cannot be detected? "That is the $10 million question," says Peggy Ostrom, a professor at Michigan State University. There is good mass spectrometric data showing that proteins can persist for half a million years, particularly for permafrost samples, she adds.
However, Schweitzer says she has been able to reach back further with sandstone-buried dinosaur bones. "I'm pretty convinced that we have endogenous proteins from our dinosaurs pushing that date back to at least 80 million years," she says, adding "that doesn't mean the entire community buys that." She says her team has histological preservation, antibody binding, and protein sequence data for dinosaur proteins.
Sample degradation does not come to an end when bones or other fossils are unearthed. In fact, quite the opposite is true, Geigl says. In a 2007 PNAS paper, she and her colleagues showed that DNA degradation actually increases after excavation. "You take out the bone from the sediment where it was preserved for thousands of years and then wash it and then you store it and the DNA degrades very, very fast," she says. "We calculated from one special case that it degraded 70 times faster than during the 3,000 years in the sediment."
Contaminants all around
Once proteins and DNA are unearthed, not only does degradation continue, but researchers face the increased problem of contamination — from the environment, from the researchers themselves, and from around the lab.
Researchers are continually shedding skin and hair cells, complete with DNA and proteins. "For proteins, contamination is flagrant since we are living in a sea of keratin that's continually sloughing off," Ostrom says. "We have to rely on being a unique amino acid sequence to distinguish the analyte of interest, our particular protein, from keratin."
How much of an effect this modern DNA and these modern proteins have depends on the organism under study. "If you work with humans or very close relatives of humans, like Neandertals, it's rather problematic because most samples are contaminated with modern human DNA," York's Hofreiter says. In addition, samples can be contaminated by other modern DNA including "cattle, chicken, pig, we still have problems because these animals' DNA are very common in our environment," he adds. Hofreiter points out that new sequencing technologies are making this contamination problem less of an issue.
There are strategies to minimize the effects of contamination, both in the field and in the lab. One is to keep the samples set in their surrounding dirt for as long as possible. "If you could retrieve your fossil from an anoxic environment and then take it back to lab, you're probably better off," Ostrom says, but she adds that "most of the time we find our samples sitting at the surface … and then already at least part of that skeleton is going to have seen atmosphere and be exposed to oxidizing conditions at the surface."
Geigl works out an excavation plan before heading into the field. She and her archaeologist or paleoanthropologist collaborators hash out a plan based on both of their sets of constraints. "I want my bones fresh as much as possible. I want them dirty and I want them cold. I do not want them to be washed and then dried in the sun, for example," she says, adding that this process, which helps archaeologists analyze the bones, degrades the DNA. Instead, she invites collaborators to come to her lab where they can analyze the DNA and the bones together. "This can be worked out if you know the people and if you collaborate very tightly," she says.
In addition, Schweitzer insists on having the bones arrive for analysis with all their surrounding sediment. For her recent work on Brachylophosaurus canadensis, that meant that her colleague had to carry about 300 pounds of extra material back to the lab. This way, Schweitzer says, the bones are better preserved and she has more sample to work with. In addition, she can compare the proteins found in the bones to those found in the surrounding environments. "If you are going to show that this stuff is real and endogenous to a dinosaur bone, you should get a signal from the bone but not the sediments immediately adjacent to it," she says.
Then, in the lab, researchers do "all the things you can think of with gloves, et cetera, et cetera," Ostrom says. But for proteins, "it comes down to the subsequent purifications you do to try to isolate your protein from everything else." Her lab goes through many purification and validation steps to ensure they have their protein of interest before beginning LC/MS or MALDI mass spec analysis. Ostrom adds that other labs take more of a shotgun approach and sequence directly from the sample, which gives a lot of information about all the proteins in the sample.
One of the major contamination culprits in the lab for ancient DNA work is modern high-quantity PCR products that drown out the lower-frequency ancient DNA of interest. There, the lab setup itself can help to avoid contamination. "What really helps is having (and almost every lab working in ancient DNA has it) a completely separate lab unit where ancient sample analyses are performed and where no modern DNA PCR products, or any high-copy number DNA, is brought in," Hofreiter says. Additional protocols, such as doing ancient DNA analysis in the morning before working in the modern lab and not returning until the next day, also minimize contamination levels, he says.
It all changes
Ancient DNA samples are often small in two senses: the DNA fragments are short, and there is little of that DNA to begin with. "Really, we challenge any system that we use, any method, any approach that we use. We are always on the limit: the detection limit, the reliability limit," Geigl says.
Until recently, researchers relied heavily on PCR to amplify those fragments up to a usable amount. That approach had drawbacks, not only because of the contamination threat, but also because of the method itself. "The thing is with the traditional primer approach, there was a limit to how short fragments you could actually retrieve [could be] because you had to have primer binding sites and then you had to have something between that you could interpret," Copenhagen's Willerslev says. "That really meant that you could not go much lower than about 90, or possibly 80, base pairs."
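The arithmetic behind that floor can be sketched in a few lines. This is only an illustration: the primer and insert lengths below are assumed round numbers, not figures given by Willerslev, and the function name is hypothetical.

```python
def min_amplifiable_length(fwd_primer, rev_primer, informative_insert):
    """Shortest template (in base pairs) that a primer pair can amplify
    while still leaving `informative_insert` bases of readable sequence
    between the two primer binding sites."""
    return fwd_primer + rev_primer + informative_insert


# Assumed values for illustration only: two 25-base primers and a
# 30- to 40-base stretch of interpretable sequence between them.
for insert in (30, 40):
    total = min_amplifiable_length(25, 25, insert)
    print(f"insert of {insert} bp -> template of at least {total} bp")
```

With those assumed lengths the minimum lands in the 80 to 90 base pair range he describes, which sits at the upper end of the 50 to 100 base pair average fragment length that Hofreiter estimates for samples 30,000 to 40,000 years old.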
These days, next-gen sequencing is allowing researchers to examine smaller stretches of ancient DNA, and at a higher throughput. "That has had an enormous effect. Basically, it completely transformed ancient DNA analysis," Hofreiter says.
It has also broadened avenues of research, Willerslev adds. "Materials that we couldn't get reliable results from earlier, we can actually get to work now," he says, adding that "the amount of DNA sequences you can get out with these new techniques ... really enables us to do ancient genomics."
Earlier this year, Willerslev's group reported the first ancient human genome, that of an extinct Paleo-Eskimo, in Nature. His team extracted DNA from about 4,000-year-old permafrost-frozen hair found in northwestern Greenland at a Saqqaq cultural site. Using Illumina's GAII, they sequenced the Saqqaq sample to an average depth of 20X, covering about 79 percent of the diploid genome. Being able to study ancient humans, Willerslev says, gives insight into human evolution and migration.
Willerslev says that some people think there was one wave of migration that peopled the Americas, but his work hints at a more complex picture of human migration. "In order to get full picture of things like human migration, human evolution, it's not enough to study the genetics of living human beings because there is so much diversity lost along the way. If you want the full picture, the full story, you have to go back in time," he says.
The publication of the draft Neandertal genome has also made the picture of earlier human evolution and migration look more complex. The Neandertal genome, published in Science in May, was composed of samples from three individuals. It was sequenced to about 1.3-fold coverage, the Max Planck Institute for Evolutionary Anthropology's Svante Pääbo said at the Biology of Genomes meeting held in Cold Spring Harbor, NY, that same month. Pääbo and his team also reported that there likely was gene flow between Neandertals and humans after humans left Africa. "Neandertals are closer to Europeans than Africans," he said, adding that they hope to get up to 10- to 20-fold coverage of the Neandertal genome.
"The genome is still not well covered. It has a lot of holes," says Geigl, who was not part of the work. She adds that more Neandertals and ancient species need to be studied.
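A rough way to see why low coverage leaves holes is the idealized Lander-Waterman model, in which reads land uniformly at random and the expected fraction of positions hit by at least one read is 1 - e^(-c) for mean depth c. The sketch below applies that textbook formula to the coverage figures quoted above; it is not the Neandertal team's own analysis, and real ancient DNA reads are neither uniform nor equally mappable, so actual gaps will differ.

```python
import math


def expected_fraction_covered(mean_depth):
    """Expected fraction of genome positions with at least one read,
    assuming reads fall uniformly at random (Poisson depth)."""
    return 1.0 - math.exp(-mean_depth)


# Depths taken from the article: the 1.3-fold draft and the
# 10- to 20-fold target.
for depth in (1.3, 10, 20):
    print(f"{depth:>4}x -> {expected_fraction_covered(depth):.1%} covered")
```

Under this simplified model, 1.3-fold coverage leaves roughly a quarter of positions with no read at all, while 10-fold coverage or more leaves almost none.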
Indeed, Pääbo's team has also begun work on another early hominid. This sample, a 30,000- to 50,000-year-old finger bone, was found in the Denisova cave in Siberia and has been sequenced to 2.1-fold coverage. From their early analysis, Pääbo reported at the meeting that the Denisova hominid diverged from the modern human line about a million years ago and that it is about 13 percent divergent from modern humans, or a bit more divergent than Neandertals are. And unlike with Neandertals, Pääbo said, there is no evidence of gene flow between this hominid and modern humans.
To sequence the Denisova sample, the team is using a DNA repair method that they published earlier in the year in Nucleic Acids Research. The main degradation problem is that cytosine turns to uracil, which is then read as thymine, Pääbo said. With their method, uracil is removed by the enzyme uracil-DNA-glycosylase while another enzyme cuts the DNA at that site before it is sealed by DNA polymerase.
That same problem has plagued others working with ancient DNA. Hofreiter has capitalized on the random nature of the cytosine transformations to determine the correct sequence. "If we analyze three independent molecules, one may be damaged in position five, the other one in position 23, and the other one, say, at position 57, but very rarely the case that all three are damaged in the same position," Hofreiter says. "If you replicate things, you can get a correct consensus." He adds that three replicates were initially preferred, but now only two are necessary.
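The replication idea can be sketched as a column-by-column vote across independently obtained sequences. This is a minimal illustration, not Hofreiter's actual pipeline: the sequences are made up, the function name and the `min_agree` threshold are assumptions, and the replicates are taken to be already aligned to the same length.

```python
from collections import Counter


def consensus(replicates, min_agree=2):
    """Column-wise majority vote over aligned, equal-length replicates.

    A base is called only if at least `min_agree` replicates agree on it;
    otherwise the position is reported as 'N' (unresolved).
    """
    if not replicates:
        raise ValueError("need at least one replicate")
    length = len(replicates[0])
    if any(len(r) != length for r in replicates):
        raise ValueError("replicates must be aligned to the same length")

    called = []
    for column in zip(*replicates):
        base, count = Counter(column).most_common(1)[0]
        called.append(base if count >= min_agree else "N")
    return "".join(called)


# Hypothetical data: each replicate carries one C->T miscoding lesion,
# each at a different position.
true_seq = "ACGTCCGATC"
rep1 = "ACGTTCGATC"  # damaged at index 4
rep2 = "ATGTCCGATC"  # damaged at index 1
rep3 = "ACGTCCGATT"  # damaged at index 9

print(consensus([rep1, rep2, rep3]))  # prints ACGTCCGATC
```

Because the damage falls at different positions, two undamaged copies outvote the damaged one at every site. With only two replicates, this sketch reports any disagreement as unresolved rather than guessing.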
On the protein side of the spectrum, coming to a consensus can be difficult. NC State's Schweitzer and her colleagues reported in 2007 that they had recovered peptides from Tyrannosaurus rex remains — a finding that was met with skepticism, as many researchers say proteins cannot persist that long. The University of York's Matthew Collins and others, including MSU's Ostrom, wrote in a technical comment to Science that this work did not meet authenticity criteria.
For Schweitzer's more recent study of collagen from the hadrosaur B. canadensis, her team kept the skeleton entombed in sandstone to protect it from contaminants. From their antibody and mass spec analysis, they reported in Science that they identified collagen, particularly amino acid repeats that are specific to collagen, that was localized to the B. canadensis tissue. Their collagen sequence placed the hadrosaur into the dinosaur-bird clade and they predicted that B. canadensis is more closely related to birds than to alligators.
In her studies, Ostrom focuses on the protein osteocalcin. This protein is predominantly found in vertebrates, and not in contaminating bacteria or fungi. "You wipe all those contaminants out of there," she says. "The idea is, if you end up having a molecular ion that looks like osteocalcin, it probably can't be anything else." Her team has been able to determine the sequence of osteocalcin from a 42,000-year-old horse and from a 21,000-year-old extinct camelid, as they reported in Geochimica et Cosmochimica Acta in 2006 and 2007, respectively.
Ostrom notes that the ancient protein field is hamstrung by incomplete databases. For many ancient proteins of interest, she says that not even the modern protein sequences are in databases.
But since she started in the field, she says it has been helped by "a huge increase in mass resolution" and an increase in sequencing capabilities. "Sensitivity and resolution are just crazy better than they were then," Ostrom says.
"Ancient DNA work has been ongoing for 30 years; a complete sequence of a protein was published by us in 2002. We've had less than a decade and I think that what we've accomplished in less than a decade given that we don't even have the modern databases — I think we've worked really hard and it's been a landslide," Ostrom says. "We're so far behind the DNA folks because we don't have that wealth of information that they do — but we'll catch up."