As next-generation sequencing prices continue to fall, a technique known as genotyping-by-sequencing is beginning to be implemented in an ever-expanding range of fields, including ecology, animal husbandry, and plant breeding.
In particular, researchers in these fields have found the technique useful for conducting large population-based studies and for analyzing complex genomes or the genomes of species without a reference.
There are "a lot of these groups who historically have never used any high-content genomic technologies starting to incorporate [genotyping-by-sequencing] into their research," Charlie Johnson, director of genomics and bioinformatics at Texas A&M AgriLife Research, told In Sequence. It has a "breadth of applications" from "plant breeding to population ecology," he added.
Additionally, several sequencing service providers have seen an uptick in demand for the genotyping-by-sequencing technique known as RAD-seq, or restriction site associated DNA sequencing. India's SciGenom, which primarily serves the plant genomics field, has said that researchers are increasingly requesting RAD-seq services (IS 4/9/2013). Similarly, service provider Floragenex plans to focus exclusively on RAD-seq (IS 6/18/2013).
In the field of ecology, researchers are using the technique for conservation genomics, for instance in understanding how a specific animal population is or isn't adapting to its environment, as well as to map genomic regions associated with thermal tolerance, migration timing, and disease resistance or susceptibility, explained Shawn Narum, lead geneticist at the Columbia River Inter-Tribal Fish Commission.
Narum said that in his work he is primarily using RAD-seq to do population studies of trout and salmon. Because RAD-seq reduces the amount of sequencing needed, he is able to pool up to 75 individuals on one lane of the Illumina HiSeq 2000, genotyping each sample for thousands of SNPs.
He is also working on developing a protocol that would involve amplicon sequencing of just a few hundred loci in order to pool thousands of individuals on one lane of the HiSeq.
The approach is ideal for population ecology, Narum said, because "the primary species that we deal with have huge, very complex genomes, and there's no reference."
Whole-genome sequencing would be too costly for such large-scale population studies.
"We're trying to sequence many individuals to understand the genetic variation within a population or between populations."
Additionally, even if a reference genome becomes available, "for some of the questions we're addressing, the reality is that whole-genome sequencing may be an abundance of sequence that's not necessary."
Texas A&M's Johnson agreed that RAD-seq and other genotyping-by-sequencing techniques are ideal for species without reference genomes. He has primarily used them on crop species, which, aside from often lacking a reference, also tend to contain many highly repetitive regions and can be polyploid.
"Most major crop species don't have a reference," he said. RAD-seq, he added, has the "ability to genotype absent a reference and of course, for breeders, cost is a big issue, so it allows them to genotype thousands or hundreds of thousands of markers in a cost-effective way."
Johnson estimated that genotyping-by-sequencing techniques are about half to one-quarter the cost of SNP arrays.
It is also an unbiased approach, he said. "In order to measure a SNP [on an array] you have to know it's there," Johnson said.
Aside from RAD-seq, Johnson said, there are many other flavors of genotyping-by-sequencing, all variations on the same approach that differ in the enzymes used to cut the genome. The different enzymes have different frequencies of cut sites, leading to more or less of the genome being sampled for sequencing.
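The effect of cut-site frequency can be illustrated with a rough in-silico digest. The sketch below is not any group's actual protocol: it assumes a toy genome of uniformly random bases (real genomes are not uniform) and uses two well-known enzymes as examples, EcoRI (a six-base site, GAATTC) and SbfI (an eight-base site, CCTGCAGG), neither of which is named in the article. The function names are hypothetical.

```python
import random


def count_cut_sites(genome: str, site: str) -> int:
    """Count occurrences of a recognition site, including overlapping ones."""
    count = 0
    start = genome.find(site)
    while start != -1:
        count += 1
        start = genome.find(site, start + 1)
    return count


def expected_sites(genome_length: int, site_length: int) -> float:
    """Expected number of sites if every base is independent with p = 0.25.

    This uniform-base model is an assumption made for illustration only.
    """
    positions = genome_length - site_length + 1
    return positions * 0.25 ** site_length


# Toy genome: 200,000 random bases with a fixed seed for repeatability.
random.seed(0)
toy_genome = "".join(random.choice("ACGT") for _ in range(200_000))

# Example enzymes (assumed for illustration, not taken from the article).
enzymes = [("EcoRI", "GAATTC"), ("SbfI", "CCTGCAGG")]

for name, site in enzymes:
    observed = count_cut_sites(toy_genome, site)
    expected = expected_sites(len(toy_genome), len(site))
    print(f"{name} ({site}): observed {observed}, expected about {expected:.1f}")
```

Under this simplified model, each additional base in the recognition site makes the site four times rarer, so an eight-base cutter samples far fewer loci than a six-base cutter on the same genome.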
Recently, a group from Martin Luther University in Halle, Germany, developed a technique called RESTseq for restriction fragment sequencing, which they optimized for lower-throughput experiments run on Life Technologies' Ion Torrent PGM (IS 6/4/2013).
Ikhide Imumorin, an assistant professor in the department of animal science at Cornell University, recently applied genotyping-by-sequencing to 47 cattle samples from six different breeds in Nigeria and the US.
He said that while there are SNP chips available specifically for cattle, the chips were developed based on common breeds. But "if you want to profile breeds that were not included, then that chip is not useful," he said. For instance, breeds from Latin America, Asia, and Africa were not included in the discovery work that went into developing the currently available cattle SNP chips, so results for those breeds will be biased. "The prevalence of a particular SNP is breed specific," he said.
By contrast, genotyping-by-sequencing allows SNP discovery and genotyping to be done in a single step without bias, he said. It is also more cost-effective than SNP chips: Imumorin estimated that chips cost around $120 to $150 per animal, while genotyping-by-sequencing runs around $35 to $45 per animal.
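To put those per-animal figures in perspective, the short calculation below works out the implied cost ratio and the total for a cohort. The per-animal ranges are Imumorin's estimates as quoted; the cohort size of 1,300 is borrowed from the cattle study mentioned in this article purely as an illustrative number, and the function names are hypothetical.

```python
# Per-animal cost estimates in US dollars, as (low, high) ranges.
CHIP_COST = (120, 150)  # SNP chip
GBS_COST = (35, 45)     # genotyping-by-sequencing


def cohort_cost(per_animal: tuple, n_animals: int) -> tuple:
    """Scale a (low, high) per-animal cost range to a whole cohort."""
    low, high = per_animal
    return (low * n_animals, high * n_animals)


def ratio_range(gbs: tuple, chip: tuple) -> tuple:
    """Range of the GBS-to-chip cost ratio.

    The lowest ratio pairs the cheapest GBS with the priciest chip,
    and the highest ratio pairs the priciest GBS with the cheapest chip.
    """
    return (gbs[0] / chip[1], gbs[1] / chip[0])


n_animals = 1300  # illustrative cohort size

chip_total = cohort_cost(CHIP_COST, n_animals)
gbs_total = cohort_cost(GBS_COST, n_animals)
low_ratio, high_ratio = ratio_range(GBS_COST, CHIP_COST)

print(f"SNP chips: ${chip_total[0]:,} to ${chip_total[1]:,}")
print(f"GBS:       ${gbs_total[0]:,} to ${gbs_total[1]:,}")
print(f"GBS costs {low_ratio:.0%} to {high_ratio:.0%} of the chip price")
```

The resulting ratio range is in the same neighborhood as Johnson's estimate of about half to one-quarter the cost of SNP arrays.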
Recently, Imumorin collaborated with a team that used genotyping-by-sequencing on 1,300 cattle and identified over 500,000 SNPs. He is also planning a project to look at cattle populations in sub-Saharan Africa and Asia.
"That's where I think genotyping-by-sequencing will find a lot of application — in populations outside of Europe and the US," he said.
He added that industry would likely adopt the technique as an outsource model, although he said that if a lab had a next-gen sequencer, then genotyping-by-sequencing was not too complicated and could be performed in house.
Moving forward, Narum said that while the technique has been gaining in popularity, there is room for improvement, particularly on the bioinformatics side. "As we're generating tremendous amounts of sequence data, the bioinformatics to process that information is still a bit lagging," he said. In the future, he said he would like to see "software pipelines that would enable us to take raw sequence data and distill that down into the genotypes of each of the individuals for each SNP locus."