A team of researchers in Germany that is sequencing the Neandertal genome using 454’s technology has improved sample prep methods and data analysis, thus increasing the reliability of the results.
The improvements, which address the researchers’ ability to detect damage-related sequence errors and to prevent and detect contamination with modern human DNA, give the scientists confidence that “it will be technically feasible to achieve a reliable Neandertal genome sequence,” according to a recent publication.
The damage analysis will also interest researchers studying DNA from other Pleistocene organisms like the mammoth and the cave bear.
The initial goal of the Neandertal project is to generate 1x coverage of the genome within the next year or two, according to Adrian Briggs, a graduate student in Svante Pääbo’s lab at the Max Planck Institute for Evolutionary Anthropology in Leipzig, Germany.
“Once we have that, then there will be even better technology, and we would eventually like to be able to go for much higher coverage,” said Briggs, who is also the lead author of the study, which was published in PNAS last month.
Currently, Roche subsidiary 454 Life Sciences generates data for the project at its sequencing center in Branford, Conn. Because the Neandertal bone used in the project is heavily contaminated with bacterial DNA, the scientists obtain almost 100 bacterial reads for every Neandertal sequence read.
“That still requires many machines operating in parallel,” Briggs said, “and only the 454 sequencing center is capable of doing that for us right now.”
The MPI researchers have looked into other technologies, but “454 is still the best option for this,” Briggs said. The main reason is the platform’s comparatively long sequence reads — currently 200-300 bases — which make it easier to distinguish Neandertal DNA from microbial sequences.
However, as other platforms improve their read lengths, “we will certainly be looking at [these] in the future,” Briggs said.
In the “very near” future, the scientists want to complete 2 percent of the Neandertal genome, or about 60 million bases. Late last year, they published an analysis of the first million bases of the genome (see GenomeWeb Daily News, an In Sequence sister publication, 11/15/2006).
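The figures quoted so far allow a rough estimate of how much raw sequencing such targets imply. The sketch below is a back-of-the-envelope calculation, not the project's own: it treats "almost 100" bacterial reads per Neandertal read as exactly 100, and it assumes that bacterial and Neandertal reads have the same average length, which the article does not state. The function name is hypothetical.

```python
def raw_bases_needed(target_bases, background_per_target):
    """Total bases to sequence so that `target_bases` of them are Neandertal.

    Assumes (not stated in the article) that background and Neandertal
    reads have the same average length.
    """
    return target_bases * (1 + background_per_target)

near_term_target = 60_000_000                 # 2 percent of the genome (from the article)
implied_genome = near_term_target * 100 // 2  # 60 million bases divided by 0.02
background_ratio = 100                        # "almost 100" bacterial reads per Neandertal read

print(implied_genome)                                        # 3000000000
print(raw_bases_needed(near_term_target, background_ratio))  # 6060000000
print(raw_bases_needed(implied_genome, background_ratio))    # 303000000000
```

Under these assumptions, the 60-million-base milestone corresponds to roughly 6 billion raw bases, and 1x coverage to roughly 300 billion.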
“We have taken a lot of time in the last year improving [the] technical side of the project, for example incorporating the new contamination approaches, and trying to improve our DNA recovery, so we don’t waste bone,” explained Briggs.
They have also identified suitable Neandertal bones from two more sites, in Spain and Germany, in addition to the bone from Croatia that they have been using in their main project. “Now we got a good grip on the technical side of the project, so we can really go for some bulk sequencing,” Briggs said.
In their recent study, the scientists addressed two issues: ways to prevent and detect contamination of Neandertal DNA with modern human DNA, and sequencing errors that result from damage in the ancient DNA.
Contamination with modern human DNA is “one of the biggest pitfalls of sequencing Neandertal DNA,” according to Briggs. In order to prevent it, the researchers extract the DNA from the bone in a clean room environment.
Previously, they sent the DNA extract to 454 for library preparation. But because contamination can occur during that process, they now prepare their own libraries in-house, using a project-specific barcode that is part of the sequencing adaptors. That barcode uses a different four-base key than 454’s normal adaptors and is “unique to our project,” Briggs said.
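In principle, a project-specific key lets reads from the Neandertal libraries be separated from anything that entered the workflow later. The following is a minimal sketch of that idea, not the researchers' actual pipeline. The key sequence `TGAC` is a placeholder, since the article does not give the real one, and the sketch assumes that each raw read begins with its key.

```python
# Placeholder four-base key; the real project key is not given in the article.
PROJECT_KEY = "TGAC"

def split_by_key(reads, key=PROJECT_KEY):
    """Separate reads that begin with the project key from those that do not."""
    tagged, untagged = [], []
    for read in reads:
        if read.upper().startswith(key):
            # Strip the key so only the insert sequence remains.
            tagged.append(read[len(key):])
        else:
            # No project key: treat as a possible contaminant.
            untagged.append(read)
    return tagged, untagged

reads = ["TGACACGTTAGC", "TCAGGGATCCAA", "tgacTTTAAACC"]
tagged, untagged = split_by_key(reads)
print(tagged)    # ['ACGTTAGC', 'TTTAAACC']
print(untagged)  # ['TCAGGGATCCAA']
```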
The researchers also looked at ways to detect contamination with modern human DNA. Until recently, certain types of DNA damage were believed to be a telltale sign of ancient DNA, and looking for them was considered a good way to distinguish ancient from modern DNA. That turns out not to be the case.
“Modern DNA can be just as degraded as ancient DNA,” according to Briggs. For example, bones exposed to the environment for as little as 10 or 20 years show DNA damage that was previously considered to be typical of ancient DNA.
“The only way to positively identify contamination, and measure it, is from sequence,” Briggs said. To that end, the researchers amplified a hypervariable region of mitochondrial DNA in the Neandertal extract and determined what fraction of the resulting sequences matched Neandertal rather than modern human mitochondrial DNA. They found that more than 95 percent of the mitochondrial sequences came from the Neandertal. In addition, all mitochondrial sequence reads in the actual sequencing project matched Neandertal DNA.
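A sequence-based contamination measurement of this kind comes down to counting reads. The sketch below shows one way such a count could be turned into an estimate with an uncertainty range. It uses a Wilson score interval, which is a choice made for this illustration and not a method attributed to the researchers. The read counts are invented purely to show the calculation.

```python
import math

def contamination_estimate(neandertal_reads, human_reads, z=1.96):
    """Return the modern-human fraction and a Wilson score interval for it."""
    n = neandertal_reads + human_reads
    if n == 0:
        raise ValueError("no informative reads")
    p = human_reads / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    # Clamp to [0, 1] to absorb floating-point overshoot.
    return p, max(0.0, centre - half), min(1.0, centre + half)

# Invented counts, for illustration only.
fraction, low, high = contamination_estimate(neandertal_reads=97, human_reads=3)
print(f"contamination {fraction:.1%}, interval {low:.1%} to {high:.1%}")
```

The interval matters because a small number of diagnostic reads can make a contamination estimate look more precise than it is.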
But even if the DNA is uncontaminated, the sequence can be wrong because of base damage in ancient DNA. Certain bases decompose to become other bases, which results in incorrect base calls during sequencing.
In previous ancient DNA analysis projects, Briggs said, scientists sequenced each base repeatedly from different templates. Because damage only hits each base very rarely, they are unlikely to see the same damage-related error more than once. “But because with our shotgun Neandertal project we only sequence everything once, that’s not available to us,” he said.
In order to distinguish damage-related differences between Neandertal and modern human DNA from real differences between the species, the scientists aligned the Neandertal reads to the human genome and studied the distribution of mismatches. They did a similar analysis with mammoth and cave bear DNA.
They found that only two out of 12 possible types of mismatches were elevated above a rate that they would expect from sequencing intact DNA. These mismatches also showed a pattern: they clustered at the ends of the DNA molecules. Most likely, they result from damaged DNA and do not represent real differences.
“Any particular change that you happen to see in the middle of the molecule is more believable than changes that you see at the beginning or end, because there is more damage,” Briggs said.
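The kind of tally described here, mismatches broken down by type and by position in the molecule, can be sketched in a few lines. The 12 possible mismatch types arise because each of the four reference bases can be read as any of the three others. The code below is an illustration, not the researchers' analysis. It assumes gap-free pairwise alignments supplied as equal-length strings, and the five-base `end_window` is an arbitrary threshold chosen for the example, as are the sequences.

```python
from collections import Counter

BASES = "ACGT"

def tally_mismatches(alignments, end_window=5):
    """Count mismatches by type (reference>read) and by position class.

    alignments: iterable of (reference, read) strings of equal length,
    without gaps, both in the orientation of the read.
    """
    counts = Counter()
    for ref, read in alignments:
        if len(ref) != len(read):
            raise ValueError("aligned strings must have equal length")
        for i, (r, q) in enumerate(zip(ref.upper(), read.upper())):
            # Skip matches and ambiguous symbols such as N.
            if r == q or r not in BASES or q not in BASES:
                continue
            # Distance to the nearer end of the molecule (0 = terminal base).
            dist = min(i, len(read) - 1 - i)
            region = "end" if dist < end_window else "middle"
            counts[(r + ">" + q, region)] += 1
    return counts

# Invented sequences, for illustration only.
example = [
    ("CCATGACGTTAGGCTAACGT", "TCATGACGTTAGGCTAACGT"),  # mismatch at the first base
    ("ACGTACGTACGTACGT", "ACGTACGAACGTACGT"),          # mismatch in the interior
]
print(tally_mismatches(example))
```

Comparing the "end" and "middle" counts for each mismatch type is what would reveal the clustering at molecule ends.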
“The data are excellent and the patterns are really neat,” Hendrik Poinar, an associate professor in the department of anthropology at McMaster University in Canada, told In Sequence by e-mail.
He pointed out that contamination with modern DNA is “not as much of a problem” for the study of other extinct species, because DNA from a closely related modern species is unlikely to be present.
Early last year, Poinar and his collaborators published a large-scale sequencing study of mammoth DNA using 454’s technology (see GenomeWeb Daily News 1/3/2006).
The MPI scientists now plan to develop a model that incorporates the type of mismatch and its position in the molecule into a reliability or quality score for each mismatch between Neandertal and human DNA.
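The article does not describe what form that model will take. Purely as an illustration of how mismatch type and position could feed into a single score, the toy sketch below assumes that each class of mismatch (a type at a given position) has an observed rate and a baseline rate expected from intact DNA, and treats the excess over baseline as damage. The formula, the function names, and the rates in the example are all assumptions made here, not the MPI group's method.

```python
import math

def damage_probability(observed_rate, baseline_rate):
    """Share of mismatches in one class attributed to damage (assumed model)."""
    if observed_rate <= 0:
        return 0.0
    return max(0.0, observed_rate - baseline_rate) / observed_rate

def reliability_score(observed_rate, baseline_rate, cap=40.0):
    """Phred-style score: higher means the mismatch is less likely to be damage."""
    p = damage_probability(observed_rate, baseline_rate)
    if p == 0.0:
        return cap
    return min(cap, -10.0 * math.log10(p))

# Invented rates, for illustration only.
print(reliability_score(0.010, 0.001))  # strongly elevated class: low score
print(reliability_score(0.002, 0.002))  # class at baseline: capped score
```

In this toy version, a mismatch type that is strongly elevated at molecule ends gets a low score there, while the same type in the middle of a molecule, where its rate is near baseline, scores high.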