NEW YORK (GenomeWeb News) – In a paper set to appear online in the Proceedings of the National Academy of Sciences this week, researchers from Seoul National University and elsewhere reported that they have sequenced and analyzed the draft genome of Glycine soja — a plant believed to be the ancestor of domestic soybeans.
The team used a re-sequencing approach to cobble together the wild soybean draft sequence based on information gleaned from the genome sequence of a domesticated soybean (G. max) cultivar called Williams 82, published early this year in Nature. By comparing patterns in the G. soja and G. max genomes, the researchers started tracking down genetic differences between wild and domestic plants. The new sequence, they explained, is therefore providing clues about the nature and timing of soybean domestication.
"[T]he genome sequences of wild species should provide key information about the genetic elements involved in speciation and domestication," corresponding author Suk-Ha Lee, a plant genomics researcher at Seoul National University, and co-authors wrote. "[G]enome comparison suggests that the genetic history of soybean is more complicated than previously assumed."
Lee and colleagues used the Illumina Genome Analyzer to sequence DNA isolated from a wild soybean variety known as IT182932 that had been collected from a field in South Korea. They then mapped the newly generated sequence to the G. max reference genome, Glyma1.01, and used the Roche 454 Genome Sequencer FLX platform to verify SNPs and fill in gaps in the genome.
Using this approach, the team generated sequence covering nearly 98 percent of the G. max genome at an average depth of about 43-fold.
The team found that the G. soja genome, like the domestic soybean genome, is rich in duplications, consistent with a pair of polyploidization events occurring about 59 million years ago and 14 million years ago.
Overall, the researchers reported, domestic and wild soybean plants share about 915.4 million bases of consensus sequence. But, they noted, the wild plant is missing more than 32 million bases found in G. max and contains 8.3 million bases not found in its domestic counterpart.
Comparisons with the domestic soybean genome also uncovered some 2.5 million SNPs, including more than 337,000 variants in genic regions and 86,236 SNPs in sequences that actually code for proteins. Of the coding SNPs, 38,598 appear to represent non-synonymous changes, while the other 47,638 appear to be synonymous.
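To put those SNP tallies in proportion, the short Python sketch below works through the arithmetic using only the figures reported above. The variable names and the share() helper are illustrative, and the 2.5 million total is a rounded figure, so the first percentage is approximate.

```python
# SNP counts as reported in the article; the total is rounded ("some 2.5 million").
total_snps = 2_500_000
coding_snps = 86_236
nonsynonymous = 38_598
synonymous = 47_638

# The two coding categories should account for every coding SNP.
assert nonsynonymous + synonymous == coding_snps


def share(part, whole):
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Coding SNPs as a share of all SNPs (approximate, since the total is rounded).
coding_share_of_total = share(coding_snps, total_snps)

# Non-synonymous SNPs as a share of coding SNPs.
nonsyn_share_of_coding = share(nonsynonymous, coding_snps)

# Ratio of non-synonymous to synonymous coding SNPs.
nonsyn_to_syn_ratio = round(nonsynonymous / synonymous, 2)

print(f"Coding SNPs: about {coding_share_of_total}% of all SNPs")
print(f"Non-synonymous: {nonsyn_share_of_coding}% of coding SNPs")
print(f"Non-synonymous to synonymous ratio: {nonsyn_to_syn_ratio}")
```

By this reckoning, coding SNPs make up only about 3.4 percent of the differences detected, and a little under half of those (roughly 44.8 percent) would alter the encoded protein.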
When they sifted through the non-synonymous SNPs using the Polymorphism Phenotyping (PolyPhen) and Sorting Intolerant from Tolerant (SIFT) prediction tools, the team found that about a fifth of the non-synonymous variants are predicted to produce pronounced functional changes in the plant.
Their sequence comparisons and structural analyses also turned up 196,356 small insertions and deletions in the G. soja genome, as well as 5,794 larger deletions, 8,554 insertions, and 194 inversions.
More than a fifth of the genomic regions found in the domestic, but not the wild, soybean harbored transposable elements, the researchers noted, consistent with the notion that these sorts of sequences may expand during the process of domestication.
Based on their findings so far, the team concluded that the previously sequenced plant G. max does, indeed, represent a domesticated version of the wild plant G. soja. They estimate that these wild and domestic plants diverged from one another roughly 267,000 years ago. But, they explained, soybean domestication seems to have happened much more recently, between 6,000 and 9,000 years ago.
"Although a divergence time based on the nucleotide sequences of only two genotypes could be an overestimate," they explained, "these results suggest that the divergence between IT182932 (G. soja) and Williams 82 (G. max) predated soybean domestication."