With the growing ubiquity of low-cost, high-throughput genotyping by sequencing (GBS), agricultural researchers can now generate high-quality data on their crops of interest at an unprecedented scale. However, combining those data with phenotypic information to improve the accuracy of genetic selection in plant breeding remains a challenge.
"Genotyping has become easy because of GBS and SNP chips," said Patrick Schnable, director of the Center for Plant Genomics at Iowa State University in Ames. "It's the phenotyping that has become the roadblock," he said. "Phenotyping is hard and is deserving of a lot of attention."
To tackle this issue, Schnable and fellow maize researchers have established the Genomes to Fields Initiative (G2F), an effort to integrate databases containing genomic information with phenotypic information, with the ultimate aim of providing corn growers with highly detailed, statistical model-generated breeding recommendations.
As part of the initiative, participants will also be evaluating various GBS approaches to determine the best ones for certain cases.
"We want to be able to say that, 'with this soil, with this level of nitrogen, with this stand density, this is what you should do'," Schnable told In Sequence. "We want to be as good as Netflix in making predictions," he said. "Right now, if I go on Netflix and have been rating movies, it does a really good job of telling me movies I don't want, and it is actually good at picking the winners," said Schnable. "We want to be that good."
G2F commenced last year and, in addition to ISU, involves around a dozen research groups affiliated with the University of Wisconsin at Madison, the University of Minnesota, the University of Missouri, Penn State University, Cornell University, the US Department of Agriculture, and others.
Its main supporter so far is the Iowa Corn Promotion Board, an organization that collects one cent per bushel from participating program members and directs those funds toward maize-related R&D efforts. David Ertl, technology commercialization manager for ICPB and a member of G2F's executive committee, told In Sequence that ICPB contributed $135,000 to get the initiative started and has committed an additional $500,000 for G2F this year.
Ertl noted that ICPB has also provided $1.5 million to Schnable's endowed chair in genetics at ISU, money that will "in a way, go toward support of this initiative as well." He added that ICPB is "working with other organizations to raise other funds in addition to ours."
Schnable and co-lead investigator Natalia de Leon, an assistant professor of agronomy at UW Madison, discussed G2F at the Plant and Animal Genome conference held last month in San Diego. In his presentation, Schnable described G2F as the "logical next step to apply information generated by the sequencing of the maize genome." He also said that by assessing genotype-by-environment interaction at the resolution provided by GBS, G2F aims to "enable accurate modeling and prediction of the performance of specific genotypes in specific environments."
"The Genomes to Fields concept is the next step of applying the information generated by initially sequencing the maize genome and then sequencing the other maize genomes, and all the other haplotypes," Schnable said at the conference. "It's a coordinated project that involves lots of genotyped lines that are grown in many different environments, so that we can look at not only how does the genotype control phenotype in one environment, but how does that interaction of genotype and environment play out."
As an example, Schnable noted that some maize lines with certain genotypes grow well in drought conditions while others do not. Such growth, however, may depend on the interaction of the maize with weather, soil type, water, nutrients, disease pressure, agronomic practices, and other variables. G2F therefore aims to integrate GBS-derived genotypic information with collected phenotypic information, as well as information on those environmental variables, to create models that could predict how a maize line might grow under a given set of conditions, and whether other lines might be more suitable for those conditions.
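The idea that genotype, environment, and their interaction jointly shape phenotype can be illustrated with a toy calculation. The sketch below is not G2F's model; it simulates plot yields for two hypothetical lines in two hypothetical environments and recovers the genotype, environment, and interaction effects from the cell means. The line names, yield values, noise level, and function names are all assumptions made up for illustration.

```python
import random
from statistics import mean

# Hypothetical true mean yields (arbitrary units) for two lines in two environments.
# Line "B" is imagined as drought-tolerant; none of these numbers come from G2F.
TRUE_MEANS = {
    ("A", "irrigated"): 10.0, ("B", "irrigated"): 9.5,
    ("A", "drought"): 5.0,    ("B", "drought"): 7.5,
}

def simulate_plots(rng, reps=50, noise_sd=0.3):
    """Return {(line, env): [yields]} with Gaussian plot-to-plot noise."""
    return {cell: [mu + rng.gauss(0, noise_sd) for _ in range(reps)]
            for cell, mu in TRUE_MEANS.items()}

def gxe_effects(plots):
    """Estimate genotype, environment, and interaction effects from cell means."""
    m = {cell: mean(vals) for cell, vals in plots.items()}
    base = m[("A", "irrigated")]
    genotype = m[("B", "irrigated")] - base    # effect of line B in the reference environment
    environment = m[("A", "drought")] - base   # effect of drought on the reference line
    # Whatever the additive effects fail to explain is the interaction.
    interaction = m[("B", "drought")] - (base + genotype + environment)
    return {"genotype": genotype, "environment": environment, "interaction": interaction}

effects = gxe_effects(simulate_plots(random.Random(0)))
print(effects)
```

In this made-up example the interaction term is clearly nonzero: line A does best when irrigated, while line B does best under drought. A purely additive model would miss that reversal, which is why the interaction has to be measured across many environments.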
"If we can understand this well enough to model how a plant is going to grow, so if we know the genotype and can measure the environment and we understand the interaction, that will predict phenotype," said Schnable. "Then we will be able to do some really cool things" he said. "We will improve the accuracy of selection in plant breeding, which will increase genetic gain. We'll be able to breed crops for the world of the future when climate change increases weather variability, and we'll be able to help farmers decide which varieties to grow in a particular field under a particular management practice, which will help them."
Getting started
According to UW Madison's Natalia de Leon, G2F has spent the period since its formation last year increasing seed for between 400 and 500 inbred maize lines that are "important and represent the diversity that we are interested in." The inbreds were selected from sites across North America, from Texas to Ontario and from Nebraska to Delaware, de Leon told In Sequence, and most have been genotyped on Illumina instruments at Ed Buckler's lab within Cornell University's Institute for Genomic Diversity.
These hundreds of female inbreds have since been crossed with five male "tester" lines, creating a resource of about 2,000 hybrids that will be distributed to the 21 sites taking part in G2F, planted in replicate, and observed during the 2014 growing season. "We will see how they perform differently according to location, temperature, rainfall, et cetera," de Leon said. G2F is evaluating the lines for "big picture morphological traits," such as height and yield performance, de Leon said, and the researchers will also use the information to test more specific hypotheses, such as theories of root development.
The next steps are to combine the phenotypic information obtained with the existing genotypic information, to home in on regions of the genome that may be associated with certain morphological traits, and to use that information to guide breeding decisions, de Leon said. Still, she cautioned that it might take several growing seasons to obtain enough phenotypic information to definitively link specific genotypes with phenotypes.
"The year to year variabilities that we observe in phenotype are often greater than those observed from location to location, so we will need more than one year to obtain this information," she said.
The first year will allow researchers to collect environmental information on temperature, light intensity, and wind in a uniform manner that can later be integrated into a database. De Leon added that G2F will collect samples from all of the hybrids generated and genotype them again as a quality control measure, "to make sure we know what we are working with."
Schnable said that G2F will be recruiting computational biologists to aid in the development of the envisioned genotype-phenotype database and statistical modeling.
"We have figured out how to build good genome databases, now we need to figure out how to build good phenotype databases and tie those together," Schnable said. He noted that G2F has been engaging seed companies to move the project along, and said that there has been a lot of interest and that "they see this as a sensible thing to do."
He also acknowledged that G2F would not be possible without the availability of affordable genotyping by sequencing.
"GBS really has had a huge impact on plant biology and could have an equally huge impact on plant breeding," Schnable said. "You just need to look at a large number of individuals, phenotype them, genotype them, build a statistical model, and then you will be able to just look at the genotype and make pretty good predictions."
GBS 'bake-off'
Although G2F's focus is on understanding the effects of environment on phenotype and, in particular, the interaction between genotype and environment, the project will employ various GBS applications to generate data, Schnable said, making it clear that he considers sequencing the technology best suited to the initiative.
"Arrays suffer from ascertainment bias – they only score SNPs that were previously discovered," said Schnable. "Because maize is highly diverse this can be a serious limitation," he said. He added that it is unlikely that G2F will rely on digital PCR approaches. "I don’t believe digital PCR would be cost-effective for scoring thousands of SNPs, and depending on how it is conducted, it has ascertainment bias," he said.
The only question for G2F's organizers, therefore, is which GBS applications to use. "There are various flavors of GBS and sequencing platforms available to us," said Schnable. "We plan to conduct bake-offs to select the right ones for various applications." He declined to provide additional information on how such comparisons would proceed, stating, "we've simply committed to conducting evaluations of alternative technologies to decide upon the ones that are appropriate for specific cases."
Apart from his involvement in G2F, Schnable is also managing partner of Data2Bio, a next-generation sequencing services and analysis company. With offices in Iowa and China, Data2Bio offers a number of services, including an alternative GBS approach called tunable genotyping-by-sequencing (tGBS).
According to Data2Bio, tGBS "more stringently controls the fraction of genome that is sequenced and genotyped." Reads are concentrated at fewer sites, which increases read depth and results in the reporting of tens of thousands of SNPs. On its website, Data2Bio claims that tGBS is suitable for most projects, and that the approach "yields much less missing data than conventional approaches for mapping the genetic determinants of traits in bi-parental crosses and association mapping experiments."
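The trade-off implied by that description, fewer sites in exchange for deeper coverage and less missing data, can be shown with back-of-the-envelope arithmetic. The sketch below is not Data2Bio's method. It assumes reads are spread evenly across sites and that per-site read counts follow a Poisson distribution, which is a simplification of real GBS coverage; the read budget, sample count, and site counts are hypothetical.

```python
import math

def mean_depth(total_reads, n_samples, n_sites):
    """Average reads per site per sample, assuming reads spread evenly across sites."""
    return total_reads / (n_samples * n_sites)

def expected_missing_fraction(depth):
    """Share of site-by-sample calls with zero reads under a Poisson model."""
    return math.exp(-depth)

# Hypothetical sequencing run: same read budget, two different numbers of targeted sites.
TOTAL_READS = 200_000_000
N_SAMPLES = 100
for n_sites in (1_000_000, 100_000):
    depth = mean_depth(TOTAL_READS, N_SAMPLES, n_sites)
    missing = expected_missing_fraction(depth)
    print(f"{n_sites:>9} sites: mean depth {depth:.1f}, expected missing {missing:.2e}")
```

Under these assumptions, cutting the number of targeted sites tenfold raises the mean depth tenfold and shrinks the expected share of uncovered calls sharply, at the cost of genotyping fewer positions.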
Schnable provided an overview of tGBS at a separate workshop at PAG. While he acknowledged that his presentation was not specific to the G2F initiative, he noted that tGBS is being used in a variety of projects. "Whether it will be used in the G2F initiative will depend on the needs of the initiative and the results of the bake-offs," Schnable said.