NEW YORK (GenomeWeb) – Collaborators from China, the US, the Philippines, and France are sharing results from the 3,000 Rice Genomes (3K-RG) Project, an effort to characterize the genetic and genomic variation across Asian cultivated rice accessions.
The team, led by investigators at the Chinese Academy of Agricultural Sciences, BGI-Shenzhen, and the International Rice Research Institute, sequenced and analyzed 3,010 cultivated rice accessions, identifying millions of SNPs, small insertions or deletions, and structural variants within and across the Asian rice populations they identified. The findings, appearing online today in Nature, also highlighted genes that are shared between rice populations, along with genes that belong to the broader rice pan-genome.
"The next challenge will be to examine associations of the 3K-RG genetic variation with agriculturally relevant phenotypes measured under multiple field and laboratory environmental conditions; this will guide and accelerate rice breeding by identifying genetic variation that will be useful in breeding efforts and future sustainable agriculture," the authors wrote.
The team embarked on the 3K-RG effort to spell out the diversity within and across Asian cultivated rice accessions, which until now have been placed in two broad groups within the Oryza sativa rice species. The Xian/Indica (XI) accessions are typically grown in diverse but less challenging environments, while rice accessions in the Geng/Japonica (GJ) group are more often found in harsh, high-altitude, and/or high-latitude environments.
As they outlined in GigaScience in 2014, the researchers sent rice genomic DNA to BGI-Shenzhen for Illumina sequencing. After sequencing 3,024 rice genomes and tossing out data for 14 accessions that did not meet their quality control criteria, they aligned reads for the other 3,010 accessions to a Nipponbare RefSeq rice reference to track down informative variants.
The search led to more than 29 million SNPs, the team reported. Nearly 8 percent of those SNPs appeared to have moderate to high effects on other features in the genome, while just over 5 percent were classified as low-effect SNPs. The genomes also housed some 2.4 million small insertions and deletions.
Based on genome sequence data for 453 rice accessions with particularly deep coverage, the researchers identified 93,683 structural variants in total, with an average of 12,178 structural variants per rice genome.
Along with the five main rice clusters reported in the past, the team identified nine rice sub-populations that coincided with the geographic distributions of these plants. The structural variant profiles, meanwhile, provided the basis for a phylogenetic analysis of the most deeply sequenced accessions.
When they added insights from the 268 million bases of rice sequence that were poorly represented in the rice reference genome, the researchers uncovered 12,465 newly detected protein-coding sequences and thousands of other partial gene sequences. Beyond the 12,770 to 14,826 gene families that made up the core rice genome, the team placed more than 9,000 "distributed" gene families in the broader rice pan-genome, an analysis informed by new reference genome sequences generated for two rice representatives using Illumina and PacBio sequences.
"Our analysis brings more resolution to the within-species diversity of O. sativa," the authors wrote. For example, they noted that the GJ rice from more challenging environments tended to have broader core genomes, while the pan-genome of the XI rice eclipsed that of the GJ accessions.
"A closer look at patterns of haplotype sharing at domestication genes suggests that not all 'domestication' alleles came to XI from GJ," they noted. "Taken together, our results — combined with archeological evidence of XI cultivation for [more than] 9,000 years in both India and China — support multiple independent domestications of O. sativa."