NEW YORK (GenomeWeb Daily News) – Even though Saccharomyces cerevisiae strains are less genetically diverse than strains belonging to the related wild yeast Saccharomyces paradoxus in terms of SNP content, S. cerevisiae genomes exhibit more diversity in terms of copy-number variation and the presence and absence of genes, according to a study published in Molecular Biology and Evolution.
Investigators led by Gianni Liti, a team leader at the University of Nice in France, noted that S. cerevisiae had previously been found to include higher levels of phenotypic diversity, leading the researchers to speculate that those trait differences may be due to genome content variation.
Additionally, the researchers reported that subtelomeric regions of the yeast genome are hot spots of variation, particularly in genes linked to sugar or metal transport and metabolism and to flocculation, the ability of yeast to form clumps.
Liti and his colleagues sequenced more than 40 strains of S. cerevisiae and S. paradoxus to examine the genomic variation between and within those species and strains.
"These genome sequences allowed us to expose surprising differences between the evolutionary histories of the common baker's yeast and its wild relative," the researchers said in a statement. "Our results suggest that the very large diversity in traits observed between strains of baker's yeast might mostly be due to the presence or absence of entire genes rather than differences in single DNA letters."
The researchers sequenced 42 strains of S. cerevisiae and S. paradoxus, including at least one strain representing each of the major phylogenetic lineages of the species and a number of mosaic strains, to between 10x and 60x coverage using the Illumina GAII and HiSeq platforms. Six strains were sequenced to a greater depth, reaching between 400x and 800x coverage.
Using this data, the researchers assembled the genomes of 14 S. cerevisiae strains and 13 S. paradoxus strains whose coverage was higher than 20x after quality filtering.
These genomes, the researchers noted, contained a smattering of genes not included in the reference, most of which were located in the subtelomeric regions and were linked to flocculation and sugar transport and metabolism.
By making pairwise comparisons between the different strains and examining regions present in one strain but absent in another, the investigators found that S. cerevisiae contained more genome content variation relative to its level of SNP variation, while S. paradoxus had higher levels of SNP variation than genome content variation. In both species, the researchers noted, SNP distance between strains correlates with genome content variation, but that link is weaker in S. cerevisiae than in S. paradoxus.
This effect, the researchers speculated, may contribute to the phenotypic variability exhibited by S. cerevisiae.
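For readers who want a concrete picture of that kind of pairwise comparison, the following is a minimal sketch, not the researchers' actual pipeline. The strain names, allele strings, and gene presence/absence calls are hypothetical toy values invented for illustration, and the simple mismatch fraction and Pearson correlation are assumed stand-ins for whatever distance measures and statistics the study used.

```python
from itertools import combinations
from math import sqrt

# Hypothetical toy inputs (NOT data from the study).
# snps: one allele per variable site; genes: 1 = gene present, 0 = absent.
snps = {
    "strainA": "AAGTC",
    "strainB": "AAGTT",
    "strainC": "TCGAT",
}
genes = {
    "strainA": (1, 1, 0, 1),
    "strainB": (1, 1, 0, 0),
    "strainC": (0, 1, 1, 0),
}


def mismatch_fraction(a, b):
    """Fraction of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def pairwise_distances(table):
    """Distances for every strain pair, in sorted-name order."""
    return [
        mismatch_fraction(table[s], table[t])
        for s, t in combinations(sorted(table), 2)
    ]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


snp_dist = pairwise_distances(snps)
content_dist = pairwise_distances(genes)
print("SNP distances:", snp_dist)
print("Genome content distances:", content_dist)
print("Correlation:", pearson(snp_dist, content_dist))
```

In a sketch like this, a weaker correlation within one species would mean that strains with similar SNP profiles can still differ substantially in which genes they carry.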
Subtelomeric regions in particular house a number of CNVs, the researchers reported. In S. cerevisiae, genes in those regions with CNVs were enriched for gene ontology terms linked to sugar transport and metabolism, ion and metal transport and metabolism, as well as to flocculation – in short, the researchers noted, genes with functions that involve interacting with the environment.
Genes with similar predicted functions also housed CNVs in S. paradoxus.
"These results imply that largely similar evolutionary forces are shaping the landscapes of copy number variation in these two species," Liti and his colleagues wrote.
The ARR gene cluster, which contains genes involved in arsenic detoxification, for instance, exhibits copy-number variation in both yeast species. The researchers tested whether the number of copies of the ARR gene cluster in the organism affected its ability to grow on media containing arsenic. They noted a strong association between the copy number of the gene cluster and mitotic growth and efficiency, and while they found that the rate at which the yeast could expel arsenic increased with each copy, the energy cost remained constant.
In S. paradoxus, the ARR gene cluster copy number followed the population structure of the species, with all the European strains containing two copies, the North American strain containing one, and the East Asian strains lacking the cluster.
In S. cerevisiae, the distribution is more complex, as the researchers found evidence of both introgression and convergent amplification. While the Malaysian strain and the West African mosaic strains lack the cluster, most strains have one copy. However, the sake strain Y12 and two wine/European strains contain two copies of the cluster.
By phasing the haplotypes of one European strain, called BC187, the researchers found that one copy that it contains is phylogenetically related to the copies housed by other European strains. The other copy, though, clusters with the sake strain Y12, indicating it was introgressed from that strain.
The other European strain with two ARR copies, called DBVPG1373, appears to have undergone an independent duplication event.
"These findings demonstrate convergent evolution of ARR cluster duplication and loss both between different lineages within S. cerevisiae and between S. cerevisiae and S. paradoxus," the researchers said. "It is tempting to speculate that this has been driven by differences in environmental arsenic concentrations between the habitats of different yeast lineages."
The subtelomeric regions are also enriched for loss-of-function variants. For example, one S. cerevisiae strain includes a two-base-pair insertion in the gene RIM15, which encodes a kinase thought to be involved in regulating cell division, proliferation, and sporulation in response to nutrient levels. By constructing hybrids between this strain and three other main lineages that were hemizygous for the frameshifted RIM15, the researchers found that the variant has a negative effect on the organism's ability to undergo meiosis and form spores during nutrient starvation.
"Genes displaying copy number and loss-of-function variation as well as genes not present in the reference genome are enriched for functions related to interaction with the external environment, e.g. sugar transport and metabolism, flocculation and cell adhesion, and metal transport and metabolism," the researchers said. "It is plausible that this reflects variation in the environmental conditions of different strain habitats, leading to selective pressures for these cellular functions that vary across time and space and resulting in either gain or loss of gene functions in different lineages."