When Patrick Schnable wanted to study maize as a teenager, he ordered corn with kernels of different colors and planted them in different parts of his family's garden to track the colors' inheritance. "Of course, because I didn't control pollination, as an experiment, this was a disaster," Schnable says. "But it did spark my interest in the phenotypic diversity of maize."
These days, as director of the Center for Plant Genomics at Iowa State University, Schnable has somewhat more sophisticated approaches at his disposal. In recent years, he has moved from using high-density comparative genomic hybridization microarrays to sequence-capture and next-generation sequencing to study genetic variation in maize.
And, Schnable says, crop-breeding companies aren't far behind the research community when it comes to the new technologies. While companies currently use high-throughput PCR-based approaches to track SNPs previously shown to be linked to target genes, Schnable expects that sequencing-based methods "stimulated by the deployment of the Ion Torrent and MiSeq instruments" will supplant the older approaches in the "not too distant future."
According to Schnable, many of the new, informative breeding markers have been identified using RNA sequencing.
"One of the bigger impacts transcriptomics has had so far has been to facilitate the identification of genic SNPs that can be used in molecular breeding applications," Schnable says. "Expression data is being widely used to identify genic leads for both genetically modified and native trait breeding."
Ray Ming, a professor of plant biology at the University of Illinois at Urbana-Champaign, attributes the uptake of transcriptome sequencing to its declining cost and says that the growing availability of the technology has allowed researchers and companies to quickly identify relevant markers and implement them in breeding programs.
"Low-cost transcriptome sequencing certainly expedited research and applications for crop improvement," says Ming, who has been using RNA-seq to study sugarcane. In the past, Ming and fellow researchers had used expression arrays to study the crop, but made the jump to sequencing in recent years because of the greater amount of information it yields.
Ming says that with a small investment of between $400 and $1,000, it is now possible to sequence the transcriptome of a targeted crop species in a couple of weeks, covering up to 80 percent of the genes in the crop's genome. This, in turn, has allowed researchers like Ming to rapidly identify candidate genes for disease resistance and other qualitative traits.
High-density mapping of various crops through transcriptome sequencing is less economical though, Ming says. He adds that sequencing the transcriptome of a segregating crop population is "still costly" and can run anywhere between $20,000 and $40,000. At the same time, he says that the resulting gene maps are "highly informative" for map-based cloning and marker-assisted selection.
SNP chips and GBS
"The next-gen sequencing technologies have greatly facilitated the generation of reference genomic sequences for barley and wheat," adds Shiaoman Chao, a geneticist at the US Department of Agriculture's Agricultural Research Service. "Breeders can readily tap into these resources and design markers closely linked to regions previously identified containing agronomically important genes for marker-assisted selection," she says.
She also says that high-density SNP genotyping arrays continue to be the "main genomic tool" used for whole-genome analysis in cereal crops like barley, oat, and wheat. As an example, she says that a 9,000-SNP array is available for barley studies, a 6,000-SNP array is available for oat studies, and a 9,000-SNP array and a 92,000-SNP array are available for wheat studies.
Illumina manufactures these targeted arrays and others for its agrigenomics customers, and more chips are expected to become available. For example, earlier this year, BGI announced a deal with Affymetrix to co-develop and co-market microarrays for agricultural applications.
The non-exclusive partnership will aim to provide a portfolio of plant and livestock microarrays for genotyping analysis, spanning applications such as marker-assisted trait selection, parentage, quality control, and traceability. The partners will use BGI's next-generation sequencing platforms, bioinformatics capabilities, and sequencing databases to develop new content for Affymetrix's genotyping arrays.
Chao says that breeders often use these SNP chips to select smaller panels of markers that are then run on lower-multiplex SNP genotyping technologies such as Illumina's BeadXpress platform, Fluidigm's integrated fluidic circuits, Sequenom's MassARRAY, or KBioscience's KASP platform.
"Based on the SNP genotype data, barley breeders were able to select 48-SNP sets for marker-assisted selection, and 384-SNP sets for genomic selection," Chao says. "Breeders adjust what SNPs go into these SNP sets according to their breeding goals."
While RNA-seq appears to be replacing older expression microarray technologies for transcriptome profiling in agricultural research applications, Chao says that genotyping arrays still have benefits over their sequencing-based alternatives.
"Genotyping-by-sequencing using next-generation sequencing has been explored in barley, wheat, and oat," Chao says. At the moment, however, the drawback of genotyping-by-sequencing, or GBS, is that the amount of missing data is "rather high" due to "shallow sequencing depth [resulting] from sample multiplexing, which can further complicate the calling for heterozygotes."
According to Chao, breeders often subject early-generation breeding populations to marker screening, so the ability to call heterozygotes reliably is critical when applying markers for crop improvement.
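Why shallow depth causes both problems can be shown with a back-of-the-envelope calculation. The sketch below is illustrative only and is not a method described by Chao: it assumes read counts per site follow a Poisson distribution and that each read at a heterozygous site samples either allele with equal probability. The function names are hypothetical.

```python
import math


def missing_rate(mean_depth: float) -> float:
    """Probability that a site receives zero reads.

    Assumption: read counts per site are Poisson-distributed
    with the given mean depth.
    """
    return math.exp(-mean_depth)


def het_miscall_rate(n_reads: int) -> float:
    """Probability that a true heterozygote looks homozygous.

    Assumption: each read samples one of the two alleles with
    probability 0.5, so all n reads show the same allele with
    probability 2 * 0.5**n.
    """
    if n_reads < 1:
        raise ValueError("need at least one read to make a call")
    return 2 * 0.5 ** n_reads


# Illustrative depths only; real GBS depth varies widely across sites.
for depth in (1, 2, 4, 8):
    print(
        f"depth {depth}: "
        f"missing ~{missing_rate(depth):.1%}, "
        f"het seen as homozygous ~{het_miscall_rate(depth):.1%}"
    )
```

Under these simplified assumptions, a site covered by a single read can never reveal a heterozygote, and the chance of missing the second allele halves with each additional read.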
In addition, GBS "has not become a mainstream genotyping tool," Chao says, "mostly due to the lack of bioinformatics expertise to handle and process gazillions of sequence reads." She says that "at the end of the day, what breeders want is the genotype data for their breeding lines irrespective [of] what genotyping method is used, SNP genotyping or GBS." She speculates that once these hurdles (shallow sequencing depth and bioinformatics limitations) are resolved, GBS will "become a more routine genotyping method, given that the cost for sequencing continues to decrease."
Chao adds that approaches that rely on sequence capture may be a better option for future breeding programs. "Sequence capture [methods have] also been used in wheat and barley to study whole genome structure and genetic diversity," Chao says. Based on her experience, the "use of sequence capture on hundreds to thousands of targeted gene regions, coupled with sequencing, would be a more practical approach for breeding than the use of whole-genome sequencing, as sequencing reduces [the] number of targeted regions [that] would help increase the sequencing depth and improve the heterozygote calls."
Iowa State's Schnable adds that sequence capture-based approaches are now being applied to assay structural variation. "There is a lot of interest in ascertaining the extent to which CNVs and PAVs [presence/absence variants] contribute to genetic variation and heterosis in crops," Schnable says.
Given the data analysis limitations Chao cites, Schnable's solution has been to go into business with a new company, Data2Bio, which has its own version of GBS. The approach is called "tunable GBS" because "it allows us to more stringently select that fraction of the genome that will be sequenced, and hence genotyped," he says.
Schnable says that Data2Bio obtains more reads at each polymorphic site for any given number of sequence reads. "This greatly reduces both the amount of missing data and need for imputation," he adds. "The greater read depth also enables us to call heterozygotes."
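The trade-off that both Schnable and Chao describe, fewer targeted sites in exchange for more reads per site, comes down to simple arithmetic. The sketch below is a hypothetical illustration, not Data2Bio's method: it assumes reads are spread evenly across samples and sites and that every read lands on a targeted site. The function names and the read, sample, and site counts are invented for the example.

```python
def mean_depth(total_reads: int, n_samples: int, n_sites: int) -> float:
    """Average reads per site per sample.

    Assumption: reads are divided evenly among multiplexed samples
    and among targeted sites, with no off-target reads.
    """
    return total_reads / (n_samples * n_sites)


def max_sites_for_depth(total_reads: int, n_samples: int,
                        target_depth: float) -> int:
    """Largest number of sites that still reaches the target mean depth."""
    return int(total_reads // (n_samples * target_depth))


# Hypothetical run: 200 million reads shared by 96 multiplexed samples.
reads, samples = 200_000_000, 96
for sites in (1_000_000, 100_000, 10_000):
    depth = mean_depth(reads, samples, sites)
    print(f"{sites:>9,} sites -> mean depth {depth:,.1f}x")

print("sites affordable at 20x:", max_sites_for_depth(reads, samples, 20))
```

With the read budget held fixed, cutting the number of targeted sites tenfold raises mean depth tenfold, which is the lever that makes heterozygote calls more reliable in this simplified model.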