NEW YORK – A team from China, Denmark, and the UK has assembled genome sequences for tens of thousands of marine microbes using metagenomic sequence data generated from ocean samples at sites from around the world, scouring the genomes for clues to microbial diversity, biological capabilities, and potential tools for future biomedical or biotechnical applications.
"This work provides evidence that global-scale sequencing initiatives advance our understanding of how microbial diversity has evolved in the oceans and is maintained, and demonstrates how such initiatives can be sustainably exploited to advance biotechnology and biomedicine," co-senior and co-corresponding author Guangyi Fan, a researcher affiliated with BGI Research and the Hong Kong Polytechnic University, and his colleagues wrote in Nature on Wednesday.
For their study, the investigators analyzed more than 230 terabytes of available marine metagenome sequence data deposited in the National Center for Biotechnology Information (NCBI), European Bioinformatics Institute, and Joint Genome Institute databases between the fall of 2009 and summer of 2020, representing sequences for nearly 24,400 metagenomic samples.
By carefully reanalyzing these data, in combination with bacterial and archaeal genome sequences for marine microbes found in NCBI, the Ocean Microbiomics Database, or the OceanDNA collection, the team put together a "global ocean microbiome genome catalog" (GOMC) containing 43,191 new and known metagenome-assembled genomes, a set that spanned 2.46 billion bases of sequence and 24,195 unique genomes for marine bacterial or archaeal species across 138 phyla.
"This research marks a major step forward in marine metagenomics, showing how important marine microorganisms are for improving human health and supporting environmental sustainability," co-first author Jianwei Chen, a researcher affiliated with BGI Research and the University of Copenhagen, said in a statement. "These findings open up promising opportunities for scientists worldwide to explore and use ocean resources sustainably, while also setting the stage for future breakthroughs in biotechnology and biomedicine."
The collection included microbes found in sediment, bathypelagic zone, and host-associated ecosystems, among others, offering a look at microbial distribution within and between ocean sites and highlighting 56 distinct groupings that the team dubbed "metagenomic provinces."
In addition, more than 41 percent of the metagenome-assembled genomes (more than 9,900) in GOMC were new sequences, significantly expanding on the genomes available for microbes from the Thermoproteota and Halobacteriota archaeal taxa and from Campylobacterota and Desulfobacterota groups of marine bacteria.
"Our newly recovered [metagenome-assembled genomes] significantly increased the known diversity of marine microbiomes, constituting 65 percent of the genomes for the Thermoproteota and Halobacteriota phyla, and accounting for more than 85 percent of Campylobacterota and Desulfobacterota genomes," the authors wrote.
The investigators also gained new insights into the size, structure, and diversity of microbial genomes, and saw apparent trade-offs between antimicrobial capabilities and CRISPR activity, including a tendency for antibiotic resistance gene (ARG) sequences to dip in the genomes of certain microbial phyla containing Cas-coding sequences.
"The observed augmentation in bacterial genome size demonstrates a complex association with the proliferation of the distinct functional domains that are crucial for nutrient acquisition, responsiveness to environmental stimuli, and interactions with other organisms," the authors wrote, noting that "differential abundance and uneven distribution of defense systems across ecosystems reflect the competitive nature of oceans, despite their dilute environment."
Using the GOMC data, the investigators uncovered three previously unappreciated polyethylene terephthalate hydrolase enzymes believed to be capable of degrading plastic. They also uncovered 117 candidate antimicrobial peptides (AMPs) using bioinformatics, before synthesizing and experimentally testing a subset of the potential antimicrobials on human pathogens and other bacteria.
The team also unearthed dozens of CRISPR-Cas9 systems, including a new Om1Cas9 system that was subsequently shown to have genome editing capabilities in double-stranded DNA digestion and human cell line experiments.
"Our investigation unveils valuable information about newly identified CRISPR-Cas9 systems, AMPs, and plastic-degrading enzymes, showcasing the diverse molecular arsenal encoded within the microbial communities of the marine environment," the authors reported, noting that novel CRISPR-Cas9 systems found in the GOMC catalog "hold great application potential in various fields of research and biotechnology."
In a related editorial in Nature, Helmholtz Institute for Functional Marine Biodiversity researcher Murat Eren and Tom Delmont, with the University of Paris-Saclay and the Research Federation for the Study of Global Ocean Systems Ecology and Evolution, pointed to the potential bioprospecting applications of the work.
While the study "cannot be considered to be exhaustive," Eren and Delmont suggested that the new marine microbe investigation "successfully demonstrates the timeliness, feasibility, and potential" for linking genomic observations with follow-up laboratory experiments and "reveals the benefits of pushing the boundaries of microbiology through a fusion of its classical and modern means."