While large-scale research consortia are beginning to sequence hundreds of human genomes to find out how they differ from one another at the DNA level, evolutionary biologists are employing new sequencing technologies to study how humans differ from their closest relatives.
At the Biology of Genomes meeting at Cold Spring Harbor Laboratory last week, scientists from the Max Planck Institute for Evolutionary Anthropology in Leipzig presented updates and initial results from two such projects, both involving 454’s sequencing technology: one aiming to decipher the Neandertal genome and the other to sequence the genome of the bonobo, a close cousin of the chimpanzee.
Max Planck Institute researchers from Leipzig and 454 Life Sciences said in mid-2006 they won funding from the Max Planck Society to help them produce a draft of the Neandertal genome sequence within two years (see In Sequence’s sister publication, GenomeWeb Daily News 7/20/2006).
Later that year, they published their first results, a megabase of Neandertal genome sequence (see GenomeWeb Daily News 11/15/2006).
In a talk at the conference last week, Svante Pääbo, director of the department of evolutionary genetics at the MPI in Leipzig, said that the researchers have increased that amount to approximately 70 megabases of unpaired sequence data. The DNA comes from three different fossil bones found at sites in Spain, Germany, and Croatia. The majority of the data was generated from a 38,000-year-old sample of a male found in Vindija, Croatia.
Neandertal DNA accounted for anywhere from less than 1 percent to 10 percent of the genetic information extracted from the bones, while the rest consisted of microbial DNA. As a result, only a small fraction of the more than 4.5 gigabases of sequence data the researchers have generated came from Neandertal DNA, Pääbo pointed out.
And because the ancient DNA is damaged and broken into small pieces, the average read length is only about 60 base pairs, much shorter than the 250-base-pair reads the 454 FLX technology can generate.
The coverage of the Neandertal genome was uneven, with GC-rich regions being overrepresented, according to Pääbo, possibly because those regions are better preserved or more easily extracted from the fossil bone.
From these data, the researchers were able to reconstruct a complete 16-kilobase Neandertal mitochondrial genome, which was covered 35-fold by the data. They found that this genome differs from almost all modern humans at 133 positions.
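To give a rough sense of scale for the mitochondrial result, the arithmetic can be sketched in a few lines of Python. This is an illustrative estimate only, not the researchers' method: it assumes the genome is exactly 16,000 bases long and that the roughly 60-base-pair average read length quoted for the project also applies to the mitochondrial reads. The function name `reads_for_coverage` is hypothetical.

```python
import math


def reads_for_coverage(genome_bp: int, fold: float, mean_read_bp: float) -> int:
    """Estimate how many reads of a given mean length yield `fold` coverage."""
    total_bases = genome_bp * fold  # bases needed to cover the genome `fold` times
    return math.ceil(total_bases / mean_read_bp)


# Figures from the talk: a 16-kilobase genome covered 35-fold.
# Assumption: mitochondrial reads average about 60 bp, like the data set overall.
mt_reads = reads_for_coverage(genome_bp=16_000, fold=35, mean_read_bp=60)
print(f"Roughly {mt_reads:,} reads")
```

Under those assumptions, the 35-fold coverage corresponds to about 560 kilobases of mitochondrial sequence, or on the order of 9,000 reads.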
Based on a comparison with mitochondrial DNA from modern human populations, Pääbo concluded that Neandertals and modern humans diverged about 600,000 years ago. According to the data, it is also unlikely that the two populations interbred, although more data from the entire genome will be required to confirm this.
The Neandertal mitochondrial genome has also enabled the researchers to assess whether their samples had become contaminated with modern human DNA. In recent data, less than 1 percent of mitochondrial sequence reads came from human DNA, but Pääbo mentioned that the first library they sequenced – data they published in Nature in late 2006 – contained approximately 10 percent human DNA.
In order to guard against contamination, for the last two years or so the researchers have been generating DNA libraries in a clean room in Germany before sending them off to 454 in Connecticut for sequencing, and have been adding project-specific adaptor sequences to the Neandertal DNA (see In Sequence 9/11/2007).
Besides mitochondrial DNA, it would also be useful to have an assay for nuclear genome contamination, Pääbo said. A good candidate for such a marker is the X-chromosome, he said, since the male Neandertal sample only contains one copy, and since parts of that chromosome are expected to be covered two-fold in the initial draft sequence.
By October, Pääbo and his colleagues want to generate approximately 50-fold more, or 3.8 gigabases, of Neandertal sequence data, equivalent to about one-fold coverage of the genome. Because the fossil DNA is contaminated so heavily with microbial genetic material, this will require several hundred runs at 454’s sequencing center.
Over the next few years, the scientists hope to increase the coverage of the Neandertal genome to about 12-fold. It is not clear at the moment which sequencing platform they will use for this scale-up, Pääbo told In Sequence.
The Forgotten Ape
Besides the high-profile Neandertal project, Pääbo’s group is also sequencing the genome of the bonobo, the lesser-known cousin of the chimpanzee and, along with the chimp, the closest living human relative.
In a separate talk at last week’s CSHL conference, Susan Ptak, a researcher in Pääbo’s group, presented initial results from that project.
The bonobo, Pan paniscus, split from the human lineage approximately 6.5 million years ago, and from the chimp, Pan troglodytes, around 1.5 million years ago.
Only a small number of individuals – an estimated 5,000 to 100,000 – remain in the wild, south of the Congo River in Africa. Though bonobos are closely related and similar in appearance to the common chimp, their social behavior differs markedly from that of their cousin: females dominate in the bonobo community, and conflicts are resolved with sex. Among chimps, on the other hand, males dominate, and conflicts are resolved with aggression and often violence.
For their sequencing project, the researchers chose a female bonobo from the Leipzig zoo. So far, using 454’s GS FLX platform, both in-house and at 454’s sequencing center, the researchers have generated approximately 13.3 gigabases of sequence data, equivalent to about 4-fold coverage.
An initial analysis by researchers at the US National Human Genome Research Institute confirmed that, as expected, the bonobo genome is about 0.4 percent divergent from the chimp genome.
The Leipzig team is now building a map-based assembly of the bonobo genome, assembling overlapping sequence reads into contigs after aligning them to the chimp genome, which serves as a guide. As a result, the quality of the assembly for different bonobo chromosomes depends on the completeness of the chimp genome in those chromosomes, Ptak showed.
Once the assembly is complete, the researchers plan to study the evolutionary history of chimps and bonobos. They also want to study copy number and structural variations, and analyze splice sites.
Eventually, they would like to increase the sequence coverage to 12- to 15-fold, Ptak told In Sequence.
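The coverage figures quoted for the bonobo project imply how much more data the planned scale-up would take. The short Python sketch below works only from the numbers reported here (13.3 gigabases for about 4-fold coverage, and a target of 12- to 15-fold). It is a back-of-the-envelope estimate, not the team's own projection, and the function names are hypothetical.

```python
def implied_genome_size_gb(total_gb: float, fold: float) -> float:
    """Genome size implied by a data total and the coverage it represents."""
    return total_gb / fold


def data_needed_gb(genome_gb: float, target_fold: float) -> float:
    """Total sequence data needed to reach a target fold coverage."""
    return genome_gb * target_fold


# Reported so far: about 13.3 gigabases, equivalent to about 4-fold coverage.
genome_gb = implied_genome_size_gb(total_gb=13.3, fold=4)

# Stated goal: 12- to 15-fold coverage.
low_gb = data_needed_gb(genome_gb, target_fold=12)
high_gb = data_needed_gb(genome_gb, target_fold=15)

print(f"Implied genome size: {genome_gb:.2f} Gb")
print(f"Data for 12- to 15-fold coverage: {low_gb:.1f} to {high_gb:.1f} Gb")
```

By this estimate, reaching the goal would mean roughly three to four times the sequence data generated so far, since both figures in the calculation are themselves rounded.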