Researchers from the University of Copenhagen have turned to next-generation sequencing for use in conservation efforts and biodiversity monitoring. One project, led by Tom Gilbert, a professor at the university's Center for GeoGenetics, is sequencing DNA from the guts of leeches to more cost-effectively identify rare mammal species in areas of Southeast Asia.
Conventional methods for monitoring biodiversity from substrates such as soil, water, and leeches have involved PCR amplification, cloning, and Sanger sequencing, Gilbert told In Sequence. But, with the advent of next-generation sequencing, and particularly sequencing on one of the desktop systems, cloning can be replaced by amplicon sequencing, providing deeper coverage and greater resolution to identify rare species from a complex sample.
Because the sequencing systems generate thousands to millions of sequences per run, researchers can cut costs dramatically by pooling samples together while still sequencing at a much greater depth than was possible with Sanger.
Compared to Sanger sequencing, this offers two main advantages. "For samples that are very complex and that might have many things in them, like soil, you can really look at the diversity in the community, even the very rare stuff," Gilbert said.
For things that are less complex, like leech guts, which may only contain DNA from one organism, researchers can multiplex, which reduces cost.
Gilbert's team is focused on sequencing DNA found in the guts of leeches and is using an amplicon sequencing strategy on the Illumina MiSeq. He talked recently at the US Department of Energy Joint Genome Institute user meeting in Walnut Creek, Calif., about his group's efforts.
He said the group initially used Roche's 454 GS FLX, but switched to the MiSeq because it "provides enough sequence without being overkill," the read lengths are long enough, and it is "easier to use" than the GS FLX.
The DNA found in the leech guts is of surprisingly good quality, Gilbert said, and remains preserved for several months, due to a special feature of the leech's digestive system. Between its mouth and its gut, the leech has what is called a crop, where it stores blood from the animal it has fed on. The blood is incrementally released into the gut, where it is digested, but can remain stored in the crop for several months. The crop therefore contains some sort of "natural preservative" to keep the blood from spoiling, which has the added benefit of preserving the DNA, Gilbert said.
To extract the DNA, the researchers simply grind up the leech, and then use 5' barcoded primers to amplify the DNA. The team targets a short region of 150 to 200 bases that is generic to many organisms, such as 16S for mammals or tRNA for plants.
In order to multiplex and pool samples, the team will generate "for example, 20 forward and 20 reverse primers that are identical except that they vary at the 5' end with a unique barcode," Gilbert said. "By switching the forward and reverse primer combinations you can generate, for example, 400 different tag combinations. So once you've amplified 400 samples with these 400 different tags, you can pool them all together, convert them into a library, and sequence them," he said.
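The tag arithmetic Gilbert describes can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not the group's actual pipeline: the tag labels (F01, R01, and so on), the sample names, and the helper functions are hypothetical placeholders, and real barcodes would be short nucleotide sequences at the 5' end of each primer.

```python
from itertools import product


def make_tags(n, prefix):
    """Return n placeholder tag labels (hypothetical names, not real barcode sequences)."""
    return [f"{prefix}{i:02d}" for i in range(1, n + 1)]


# 20 forward and 20 reverse primers, each carrying a different 5' tag.
forward_tags = make_tags(20, "F")
reverse_tags = make_tags(20, "R")

# Every forward/reverse pairing is a distinct tag combination: 20 x 20 = 400.
tag_pairs = list(product(forward_tags, reverse_tags))


def assign_tags(sample_ids, pairs):
    """Give each sample its own (forward, reverse) tag pair."""
    if len(sample_ids) > len(pairs):
        raise ValueError("more samples than unique tag combinations")
    return dict(zip(sample_ids, pairs))


def demultiplex(read_tag_pair, assignment):
    """Map the tag pair observed on a read back to its sample, or None if unknown."""
    lookup = {pair: sample for sample, pair in assignment.items()}
    return lookup.get(read_tag_pair)


# Hypothetical sample names for 400 individually tagged leeches.
samples = [f"leech_{i:03d}" for i in range(1, 401)]
assignment = assign_tags(samples, tag_pairs)

print(len(tag_pairs))                           # number of unique combinations
print(demultiplex(("F01", "R02"), assignment))  # sample that carried this pair
```

Because only 40 primers are needed to tag 400 samples, the number of primers grows with the square root of the number of samples rather than linearly.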
The trickier part is doing the actual analysis and determining the precise species, especially if it is a rare species for which little information is available, Gilbert said.
"The databases of the animals we're after are quite incomplete," he said. For example, in one case, the team found DNA from a rare type of rabbit, but on the initial GenBank query, the results were inconclusive. "The sequence came back from GenBank as basically this may be a rabbit or it may be a rodent, because there was no sequence in GenBank for [the animal]."
The other problem is distinguishing sequencing and PCR errors from real findings, he said, especially when a result points to a rare species. One way Gilbert's team gets around this problem is by doing multiple library prep steps and sequencing runs per sample and then limiting the analysis to data that appears several times.
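The replicate-based filtering can be sketched roughly as follows in Python. The article does not say how many appearances the team requires, so the min_replicates threshold, the function name, and the toy sequences are assumptions made purely for illustration.

```python
from collections import Counter


def consistent_sequences(replicates, min_replicates=2):
    """Keep sequences seen in at least min_replicates independent replicates.

    replicates: list of lists, one list of observed sequences per
    library prep / sequencing run of the same sample.
    """
    seen_in = Counter()
    for reads in replicates:
        # Count each sequence once per replicate, however many reads it has.
        seen_in.update(set(reads))
    return {seq for seq, n in seen_in.items() if n >= min_replicates}


# Toy data: three replicates of one sample (made-up sequences).
replicates = [
    ["ACGT", "ACGT", "TTGA"],
    ["ACGT", "TTGC"],          # "TTGC" appears only here, as an error might
    ["ACGT", "TTGA", "GGCC"],
]

kept = consistent_sequences(replicates, min_replicates=2)
print(sorted(kept))
```

A sequence that arises from a random PCR or sequencing error is unlikely to recur in independent preparations, which is why requiring repeat observations tends to remove such artifacts at the cost of some sensitivity.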
In a pilot test looking at DNA from 25 different leeches, the team discovered DNA in 21 leeches from seven different mammals, "a considerable fraction [of which were] quite rare mammals," Gilbert said.
For instance, they found DNA from the Annamite striped rabbit and a type of deer called the Truong Son muntjac, species that were only discovered in the late 1990s, Gilbert said.
"It was interesting, because you tend to imagine that mammals that have been discovered recently are extremely rare," Gilbert said. But the fact that DNA from the Annamite striped rabbit was found in 10 percent to 20 percent of the samples "suggests that actually they are more common they you would imagine."
The cost of the protocol depends on the question the researcher wants to answer, Gilbert said. For instance, if a researcher simply wants to know if a certain species is present in an area, DNA from many different leeches can be pooled and sequenced in one run, with a simple yes or no result.
But for more complicated questions, where a researcher wants to know what species are in each individual leech, costs will increase, since the samples from each leech will have to be barcoded. It is the upfront extraction and sample prep work that makes up the bulk of the cost, he said, not the sequencing itself.
Going forward, Gilbert said his group is moving into mitochondrial sequencing to measure not only what types of species are present, but their abundance.
"If we're doing 16S and want to see how many rabbits are in the region, if we just identify that 16S, which is the same between all the rabbits, we can't say anything about abundance," he said. However, each animal will have slight differences in its mitochondrial genome, which can be used to estimate the number of species in an area.
Another potential future application that Gilbert's group is considering is assembling genomes of the rarer species without a reference. The tricky part of the assembly, though, is that most of the DNA ends up being leech DNA, so methods would have to be developed to subtract out that DNA, he said.
Gilbert said instead of crushing up the entire leech to extract DNA, focusing on just the gut of the leech where the mammal DNA would be located might reduce the amount of leech DNA. Targeted capture approaches could also select out unwanted DNA, he suggested.