PNAS Papers on Pepper Populations, Cacao Tree SVs, Genomic Records Metadata

Editor's Note: Some of the articles described below are not yet available at the PNAS site, but they are scheduled to be posted this week.

An international team led by investigators in Italy and Germany outlines population structure and past dispersal patterns for pepper plant accessions in the Capsicum genus. Using genotyping-by-sequencing, collection location data, and a "Regional Mixture" (ReMIXTURE) analysis and visualization method, the researchers profiled more than 10,000 global Capsicum genebank samples, uncovering SNP patterns for exploring the plant's population history. "The method ReMIXTURE — using genetic data to quantify the similarity between the complement of peppers from a focal region and those from other regions — was developed to supplement traditional population genetic analyses," the team writes, noting that the findings point to "west-east routes of expansion, shedding light on the links between South and Mesoamerica, Africa, and East/South Asia."

Investigators at the University of Minnesota and Pennsylvania State University explore structural variant consequences in cacao-producing Theobroma cacao populations. With the help of 10x Genomics linked reads, the team put together 62 haplotype-resolved, chromosome-level de novo genome assemblies that range from 341 million to 387 million bases and represent 31 T. cacao accessions collected from wild populations at four South American sites. From the resulting diploid genome assemblies, the authors identified more than 160,000 structural variants (SVs), which they analyzed in relation to cacao plant fitness, adaptation, and gene expression patterns. The analyses suggest most cacao SVs are deleterious, they report, though some SVs have apparent ties to local adaptations. "Despite the overall detrimental effects," they write, "we identify individual SVs bearing signatures of local adaptation, several of which are associated with genes differentially expressed between populations."

A team from Michigan State University, Penn State, and elsewhere reflects on the importance of meta-data collection, documentation, and data stewardship in advancing efforts to build more diverse genomic datasets. Based on their review of publicly available genomic data for wild or domestic species in the International Nucleotide Sequence Database Collaboration repository, the researchers suggest that roughly one-third of published genomic datasets are accompanied by spatiotemporal meta-data such as the geographical location and the year in which samples were collected. "Streamlined data input processes, updated meta-data deposition policies, and enhanced scientific community awareness are urgently needed to preserve these irreplaceable records of today's genetic biodiversity," the authors propose, "and to plug the growing meta-data gap."