With the publication of the gorilla genome earlier this year, all the great apes (bonobo, chimpanzee, human, orangutan, and, now, gorilla) have been sequenced. A number of other important non-human primate genomes, like that of the rhesus macaque, have also been unraveled.
From studying the new sequence from a female western lowland gorilla and comparing it to the human and chimpanzee sequences, researchers were able to pinpoint regions of the gorilla genome and other primate genomes that are undergoing rapid evolution. It was not surprising that some of those areas, like those involved in brain function, were evolving rapidly, but it was intriguing that regions associated with hearing also were under selection.
"We don't regard hearing in humans as particularly special [and] we don't really know much about gorilla hearing," says Aylwyn Scally, a postdoctoral fellow at the Wellcome Trust Sanger Institute who was the first author on the Nature paper presenting the gorilla genome. "The animals both live in such different environments that you wouldn't expect them to necessarily have the same selection pressures. … We don't really have a good explanation for why it is, but it is one of these findings that you get when you look across the whole genome."
And now that there are so many non-human primate genome sequences, researchers can do even more whole-genome comparisons and delve deeper into primate genomics. "Now, people are doing whole-genome surveys, looking for signals, looking for things they didn't ever expect to find," Scally adds.
While researchers can apply 'omics approaches like exome sequencing and, increasingly, whole-genome sequencing to tease out answers to questions of evolution and other complex research questions, some tools and resources are in need of refinement. Further, access to samples and restrictions on animal studies remain challenges.
"Over the last several years, because the costs have been coming down and the speed of data production has gone up, the scope of primate genetics and genomics, the ambitions that people have for the work they want to do, have dramatically increased," adds Jeffrey Rogers, an associate professor at Baylor College of Medicine. "People are beginning to undertake and publish studies that just a few years ago weren't even on the radar, so that's very exciting."
Tools of the trade
When new tools are developed, they are usually best suited to study the human genome or the genome of a model organism like the mouse before they are adapted to work with primates. "We're always late to the party," says Betsy Ferguson, an associate scientist at the Oregon National Primate Research Center.
But the tools do come. In the mid-2000s, Robert Norgren, a professor at the University of Nebraska Medical Center, set about developing a gene chip for the rhesus macaque, even though its genome was still in the process of being sequenced. Working with Affymetrix, Norgren tweaked the chip-design process. Usually, he says, gene chips are developed by aligning ESTs against a finished genome and picking the probe selection region of about 570 base pairs at the 3' end of the gene. Instead, Norgren and Christopher Davies at Affymetrix relied on human genomic data to design primers flanking the probe selection regions. With those primers, they used PCR to amplify rhesus macaque DNA and then sequenced the products, yielding about 5,000 rhesus macaque probe selection regions.
They then turned to the Rhesus Macaque Genome Sequencing and Analysis Consortium, led by Baylor, that was in the process of sequencing the rhesus macaque genome. They repeated what they'd done — using human probe selection regions against rhesus macaque DNA — but in silico to pull out more rhesus macaque probe selection regions. From all those probe selection regions, Affymetrix selected a number to include on the chip.
"The biggest user [of the gene chip] is the SIV/HIV community because the rhesus macaque is the animal model for HIV vaccine research and also for HIV dementia, for understanding the pathogenesis for all aspects of HIV," Norgren says. He adds that researchers continue to use the chip today.
Similarly, researchers are now applying human whole-exome capture methods to study non-human primates like chimpanzee and rhesus macaque — and they work fairly well.
"People thought that it might be able to capture 50 percent of the exons using human-based designs … but in fact we're seeing 90 to 95 percent of the exons recovered," Ferguson says. "It performed much better than we had expected, and, for most of our applications, that is good enough." She and colleagues in China reported that result in a study published in PLOS One.
Eric Vallender, an assistant professor at Harvard Medical School and the New England Primate Research Center, notes that with exome capture, researchers don't have to worry as they do with microarrays about differential amplification or quantitation. "For capture, it doesn't matter so much if it doesn't work great because you just want to capture it, period," he says. Though, of course, the more closely related the species of interest is to humans, and thus to the bait, the better the capture will work.
While one advantage to using human-based exome capture methods is that they can be bought off the shelf, another is the higher confidence that the method encompasses the full exome, Vallender says. "When you try to design [the capture method] off of, say the rhesus, it's a lot harder because there are annotation problems, the genome isn't complete," he says, adding "and that's with the rhesus. There is at least a genome for the rhesus. If you moved out to other species like even other old world monkey species where there's no genome, then you can't do it."
Despite the annotation issue, Baylor's Rogers notes that his group is in the midst of developing a rhesus macaque-specific whole-exome capture array.
Of course, the drawback to using a human-based design to study other primates is that it is human based, so researchers will miss anything that is species specific. That, Vallender adds, is something researchers need to stay cognizant of.
As sequencing costs continue to decline, though, researchers may skip straight to performing whole-genome sequencing of non-human primates. Right now, Vallender says, researchers are at a point where exome sequencing makes sense: its cost is so much lower that it is worth doing even though some data may be missed. But that time period is quickly ending. Vallender adds that "once you get past that [time period], if you just do the whole genome, then you don't need to worry about what species it is or anything."
Tomas Marques-Bonet, a research professor at the Institut de Biologia Evolutiva at Universitat Pompeu Fabra in Barcelona, adds that "if you can afford it, you should do whole-genome sequencing."
Norgren, too, is moving toward sequencing, and says that he hopes to make the rhesus macaque gene chip obsolete. But first, he says, there needs to be a better rhesus macaque reference genome.
While a number of non-human primate genomes have been sequenced, they suffer from assembly and annotation issues — most of the genomes are of draft quality, which limits how researchers may use them. "We have been relying on assemblies, reference genomes that are just in draft quality. That means that they are really, really far away and lagging behind compared to the human reference genome," Marques-Bonet says.
Harvard's Vallender adds that "annotation on a lot of the genomes out there could use a lot of work."
Efforts are underway to improve the quality of non-human primate genomes, though many researchers acknowledge that the new reference genomes are unlikely to reach the gold-standard quality of the human reference genome.
Along with colleagues at the University of Maryland, Nebraska's Norgren is working to develop a new rhesus macaque genome. "Draft genomes really cannot be used as reference genomes when you do next-gen sequencing," he says. "And this is not just true of the rhesus macaque — this is true of all draft genomes."
To put together this new rhesus macaque genome, Norgren and his colleagues aren't starting from scratch. They are making use of the MSR-CA assembler out of Maryland that can combine new whole-genome and whole-exome sequences produced by next-gen sequencing with the original Sanger rhesus macaque sequences.
At the same time, Norgren and his colleagues are also performing mRNA-sequencing with a Velvet-Oasis pipeline to assemble those transcripts into a de novo rhesus macaque transcriptome. "We can then align [the transcriptome] with the new rhesus assembly and use it for annotation," he says. "It's really two important steps. There's the assembly and then the annotation."
He expects to have a draft of the new assembly and annotation by the end of the year.
Norgren is also working on bolstering the chimpanzee reference genome by doing more sequencing and RNA-seq of Clint — the male chimpanzee from Yerkes National Primate Research Center whose genome was chosen to be the reference — as well as of cell lines derived from Clint.
While non-human primates are closely related enough to humans for techniques like human-based exome capture to work, they do differ significantly from humans and also show inter- and intra-species variation of their own. Understanding that variation could help inform research into evolution and disease, as well as help control for genetics as a confounding factor in other studies.
Baylor's Rogers and his team set out to characterize genetic variation, particularly SNPs, in captive rhesus macaque populations. In a study published in BMC Genomics last year, Rogers and his colleagues reported their identification and validation of more than 3 million SNPs in rhesus macaque — previously, the researchers note, fewer than 900 had been validated.
"Basically, developing a catalog of the genetic variation that is present in the captive research populations of rhesus monkeys creates many new opportunities for studying the influence of genetic differences among individuals on risk factors or even full disease outcomes that very closely model human disease," he says.
In addition, Rogers and his colleagues predicted that about 400 of those SNPs would have damaging effects. "We expected to find a significant amount of functionally interesting, functionally significant genetic variation in genes related to diseases. But it was really exciting to see that it was actually quite easy to find significant mutations such as stop codons and frameshift mutations that should have dramatic effects on gene function," he says. "The next step is, of course, to investigate and characterize the influence of those mutations on biomedically relevant phenotypes."
He adds that he and his colleagues are following up on this work by sequencing 50 more whole exomes and 10 whole genomes. While they are still crunching the numbers, Rogers says that the amount of genetic variation that they are seeing in the rhesus macaque is only going up. "We are finding lots of very interesting genetic variation in 50 whole exomes," he adds.
Such a strategy, Rogers says, can be applied to other primates as well. "We started with rhesus because they are the most widely used, so that's the biggest impact we could make," he says, "but this general strategy of developing catalogs of common genetic variation in laboratory primates is going to be very important across a number of species, and benefit research in a variety of disease areas."
Such studies of genetic variation in non-human primates can also help get at evolutionary questions, like what differentiates humans from their closest relatives. With more, and better, genomes, more such comparisons can be made.
"We're very interested in having these reference genomes so that we can get a closer look at evolution, human evolution, which is really hard to do without these reference genomes," Norgren says.
Elodie Gazave, now at Cornell University, and her colleagues, who included Universitat Pompeu Fabra's Marques-Bonet, examined differences in copy-number variation — in location and in frequency — among bonobos, chimpanzees, gorillas, humans, and orangutans. To determine what is specific to humans, she says, researchers have to make sure what they've found is unique to people, and "for that you need to make sure they are absent in the closest related species."
Additionally, having a better understanding of the variation within a non-human primate species can also inform conservation strategies, Gazave adds.
As they published in Genome Research last year, Gazave and her colleagues determined that there appear to be species-specific structural variation patterns, though many CNVs are shared among the species. Further, many of the shared CNVs generally fall in line with what is known about phylogenetic relationships. They were also able to pinpoint human-specific CNVs.
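The logic Gazave describes, that a variant counts as human-specific only if it is absent from the closest related species, can be sketched with simple set operations. This is a minimal illustration and not the method used in the Genome Research study: the region labels and the toy calls are hypothetical, and a real comparison would have to match CNV intervals across species by overlap on aligned coordinates rather than by identical labels.

```python
def species_specific(calls_by_species, target):
    """Return regions called in `target` and in no other species."""
    seen_elsewhere = set()
    for species, regions in calls_by_species.items():
        if species != target:
            seen_elsewhere |= regions
    return calls_by_species[target] - seen_elsewhere


def shared_by_all(calls_by_species):
    """Return regions called in every species."""
    return set.intersection(*calls_by_species.values())


# Hypothetical CNV calls; the region names are placeholders, not real loci.
toy_calls = {
    "human": {"region_1", "region_2", "region_5"},
    "chimpanzee": {"region_1", "region_3"},
    "bonobo": {"region_1", "region_3"},
    "gorilla": {"region_1", "region_4"},
    "orangutan": {"region_1"},
}

human_only = species_specific(toy_calls, "human")
shared = shared_by_all(toy_calls)
print("human-specific:", sorted(human_only))
print("shared by all:", sorted(shared))
```

In the toy data, dropping any one of the other species from the comparison could only enlarge the human-specific set, which is why the full set of closest relatives is needed before calling a variant unique to people.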
A related study from Marques-Bonet, from 2009 in Nature, focused on segmental duplications, and it found that there was a "burst" of segmental duplications in some primates. "[In] human, chimp, and gorilla, there seems to be an acceleration in the rate of duplication," he says, "and as a corollary of these observations, that means that, essentially humans, chimps, and gorillas seem to share, at some degree, the distribution of segmental duplications, whereas [in] orangutan and macaque, [it] seems that they have a more ancestral mammalian distribution of segmental duplications."
What this means isn't quite clear. One hypothesis, Marques-Bonet says, is that there was a population bottleneck that helped bring this trait to fixation. "Although this is something, of course, that only with a more completed set of genomic data will we be able to solve this," he adds.
More genomes can also help clarify speciation events and their timing — did speciation occur all of a sudden or slowly over time? Looking at DNA mutation rates across a number of genomes is helping the Sanger's Scally answer that question. Chimpanzees and humans are thought to have split from a common ancestor about 6 million or 7 million years ago, but Scally notes that that is a rather new estimate — that number used to be 3 million or 4 million years ago.
The revision of the timing of the human-chimpanzee split occurred after researchers examined the mutation rate in modern humans by comparing genomes from parents and their children — mutation rates can be used to calibrate the evolutionary time scale. People had thought that a typical nucleotide in the human genome mutated once every billion years, but it turned out to be closer to once every 2 billion years, Scally says. That, then, pushed the speciation timing further back.
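The arithmetic behind that revision can be sketched under a simple molecular-clock assumption, in which differences accumulate on both lineages after a split, so the time since the split is the per-site divergence divided by twice the per-site yearly mutation rate. The divergence value below is hypothetical and chosen only for illustration; it is not a measured figure, and the model ignores complications such as variation already present in the ancestral population.

```python
def split_time_years(divergence_per_site, mutations_per_site_per_year):
    """Estimate time since two lineages split under a simple molecular clock.

    Differences accumulate independently on both lineages, hence the factor of 2.
    """
    if mutations_per_site_per_year <= 0:
        raise ValueError("mutation rate must be positive")
    return divergence_per_site / (2.0 * mutations_per_site_per_year)


# Hypothetical per-site divergence between two species, for illustration only.
ILLUSTRATIVE_DIVERGENCE = 0.007

old_rate = 1.0 / 1e9  # a typical site mutates once every billion years
new_rate = 1.0 / 2e9  # a typical site mutates once every 2 billion years

old_estimate = split_time_years(ILLUSTRATIVE_DIVERGENCE, old_rate)
new_estimate = split_time_years(ILLUSTRATIVE_DIVERGENCE, new_rate)

print(f"older rate: {old_estimate / 1e6:.1f} million years")
print(f"revised rate: {new_estimate / 1e6:.1f} million years")
```

Whatever divergence value is plugged in, halving the mutation rate doubles the estimated time since the split, which is the direction of the shift Scally describes.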
"That's again something that you can only do with whole genomes because you don't see very many of these mutations in any one individual or in any one position," he says. "You need to look across the whole genome in several individuals to be able to count up enough to be able to get an idea of how often they appear."
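The counting Scally describes can be sketched as follows, assuming genotypes are already available for a mother, father, and child at each site. The genotypes below are toy values, the function names are hypothetical, and a real analysis would need quality filters to separate true new mutations from sequencing errors.

```python
def count_de_novo(trio_sites):
    """Count sites where the child carries an allele seen in neither parent."""
    count = 0
    for mother, father, child in trio_sites:
        parental_alleles = set(mother) | set(father)
        if any(allele not in parental_alleles for allele in child):
            count += 1
    return count


def per_generation_rate(de_novo_count, sites_examined):
    """Per-site, per-generation rate; a child inherits two genome copies."""
    return de_novo_count / (2 * sites_examined)


# Toy data: each tuple is (mother, father, child) genotypes at one site.
toy_trios = [
    [("AA", "AA", "AA"), ("CC", "CC", "CT"), ("GG", "GA", "GA")],
    [("TT", "TT", "TT"), ("AA", "AG", "AG"), ("CC", "CC", "CC")],
]

per_trio_counts = [count_de_novo(sites) for sites in toy_trios]
total_de_novo = sum(per_trio_counts)
total_sites = sum(len(sites) for sites in toy_trios)

print("new mutations per trio:", per_trio_counts)
print("pooled rate:", per_generation_rate(total_de_novo, total_sites))
```

Even in the toy data, one family shows no new mutations at all, which illustrates why counts have to be pooled across many sites and several individuals before a rate can be estimated.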
Getting all those non-human primate genomes to compare can be problematic for researchers. Great apes and many other non-human primates are endangered species and are regulated by law. The Convention on International Trade in Endangered Species of Wild Fauna and Flora, or CITES, is an international agreement among some 170 countries to protect the survival of endangered animals. Chimpanzees, gorillas, and rhesus macaques are among those protected by the CITES agreement.
"Some of these species are very endangered and there's lots of restrictions about getting the samples," Scally notes.
Those restrictions and the permits and permissions that go along with them can hamper research, some scientists say.
"I completely understand that we need to protect these animals. At the same time making this conservation so rigorous, it is also limiting our ability to fully understand them," Cornell's Gazave says. She recalls being able to get five samples for one study when she had the funds to do many more.
The permits and permissions take a lot of time, Universitat Pompeu Fabra's Marques-Bonet adds. Further, he notes that restrictions apply to materials derived from the animals, including cell lines and DNA. "Getting samples is difficult, but it is not impossible," he says. "But the thing that we should revisit as a community of scientists is scientist permission and the permits we need to move samples, especially when we are talking about DNA between labs.
"In my opinion, it just doesn't help conservation, which is, at the end, what they are trying to do," Marques-Bonet adds.
Because many non-human primate species are endangered, their wild populations are dwindling. And zoos, Scally adds, don't always have the diversity of species that researchers want to cover. For example, he says, gorillas in most zoos are western lowland gorillas, but there are also eastern lowland gorillas and mountain gorillas.
Further, Gazave notes that the pedigrees of zoo animals are oftentimes unknown. That, she adds, makes concluding anything about species diversity difficult.
There are additional restrictions on chimpanzee research in the US, notes Dario Boffelli, an associate scientist at Children's Hospital Oakland Research Institute, who studies comparative epigenomics. Nearly a year ago, an Institute of Medicine committee concluded that much of biomedical and behavioral research on chimpanzees was not necessary. It did, though, note that comparative genomic studies, among a few others, were still needed.
The US National Institutes of Health accepted IOM's recommendations. However, while NIH is putting processes in place to implement the guidelines, it is not funding any chimpanzee-based research.
"I think comparative genomics will be granted an exception," Boffelli says. "But until then, we are kind of in limbo."