NEW YORK (GenomeWeb) – A team led by investigators at Michigan State University and the University of California, Davis has produced a near-complete, chromosome-level assembly for the genome of the cultivated strawberry, a plant with a complex octoploid genome produced through the hybridization of two wild octoploid plants that each arose from four diploid progenitors.
"The genomes of the progenitor species, Fragaria virginiana and Fragaria chiloensis, are the products of polyploid evolution: they were formed by the fusion of and interactions among genomes from four diploid progenitor species (that is, subgenomes) approximately 1 million years before present," senior and co-corresponding author Steven Knapp, a plant scientist at UC Davis, and his colleagues wrote in their study, published online this week in Nature Genetics.
To tackle the Fragaria x ananassa garden strawberry genome, the researchers used a combination of Illumina short-read, Pacific Biosciences long-read, and 10x Genomics haplotype phasing and scaffolding data, which were assembled and scaffolded using NRGene's DenovoMAGIC3 software package. The genome was further scaffolded to chromosome scale using Hi-C data in combination with Dovetail's HiRise pipeline, and PBJelly was applied to fill gaps. The resulting de novo assembly contained nearly 108,100 predicted protein-coding genes spread over 28 chromosome-level pseudomolecules, and offered a look at strawberry phylogenetics, the diploid progenitor plants ancestral to it, and the evolution of sub-genome dominance.
"This reference genome should serve as a powerful platform for breeders to develop homoeolog-specific markers to track and leverage allelic diversity at target loci," the authors concluded, noting that "we anticipate that this new reference genome, combined with insights into sub-genome dominance, will greatly accelerate molecular breeding efforts in the cultivated garden strawberry."
Using high molecular weight DNA extracted from leaf tissue of young Fragaria x ananassa Camarosa cultivar plants, the researchers prepared libraries for sequencing on PacBio RSII and Illumina instruments, incorporating additional 10x Genomics Chromium and Dovetail Hi-C data into an 805.5 megabase genome assembly spanning roughly 99 percent of the plant's estimated 813.4 megabase genome.
From there, the team turned to an existing pipeline, protein sequence data, expressed sequence tags, and 10 Fragaria x ananassa transcriptome sets to annotate the genome, uncovering an estimated 108,087 protein-coding genes, 15,621 apparent long intergenic non-coding RNAs (lncRNAs), some 9,265 anti-sense overlapping transcript lncRNAs, and 5,817 sense overlapping transcript lncRNAs.
The researchers also sequenced the transcriptomes of dozens of diploid Fragaria species to aid a subsequent phylogenetic search for strawberry's diploid progenitors. Using phylogenetic and gene conversion analyses, they considered the roots of disease-resistance genes in the strawberry plant, focusing on nucleotide-binding-site, leucine-rich-repeat (NBS-LRR) gene clusters in particular.
Moreover, the researchers noted that more than a third of the new strawberry genome assembly was made up of transposable element (TE) sequences, particularly long-terminal-repeat retrotransposons (LTR-RTs). Their results revealed that gene expression rose as TE density fell, and vice versa, pointing to a role for TE density in the balance of power between sub-genomes in allopolyploid plants.
"Our data support the hypothesis that sub-genome dominance in an allopolyploid is established by TE-density differences near homoeologous genes in each of the diploid progenitor genomes," they explained, noting that "the merger of sub-genomes with different TE densities results in higher gene expression for the dominant homoeolog with fewer TEs."
When they focused on sub-genome expression, for example, the investigators saw increased expression of genes from the sub-genome originating from the F. vesca diploid progenitor plant, which has lower TE densities than the diploid F. nipponica, F. viridis, and F. iinumae progenitors that contributed to the less dominant sub-genomes. That appeared to promote metabolic pathways encoded by the dominant sub-genome, including those involved in strawberry color, flavor, and aroma.
"Pathway analysis showed that certain metabolomic and disease-resistance traits are largely controlled by the dominant sub-genome," the authors wrote, noting that "[t]hese findings and the reference genome should serve as a powerful platform for future evolutionary studies and enable molecular breeding in strawberry."