NEW YORK – Although international sequencing efforts have made it possible to identify and track emerging SARS-CoV-2 variants, new research suggests that more extensive standardization and data sharing are needed to improve SARS-CoV-2 surveillance around the world.
"Our findings indicate an urgent need to increase timely and full sharing of sequences, the standardization of metadata files, and support for countries with limited sequencing and bioinformatics capacity," co-first author Zhiyuan Chen, an infectious disease researcher at Fudan University's School of Public Health and its Key Laboratory of Public Health Safety, and his colleagues wrote in their study, published in Nature Genetics on Monday.
Starting with millions of SARS-CoV-2 genome sequences generated from viral isolates collected and analyzed internationally before the end of 2021, researchers in China, Switzerland, and the US looked at country-specific surveillance, sequencing, and reporting patterns in more than 100 countries.
In addition to insights on the emergence and spread of new variants (including the Alpha variant, which spread at the beginning of 2021 and was overtaken by the Delta variant by spring of that year), the team documented different levels of genomic surveillance by location, along with variable sequencing methodologies and metadata releases.
"Our findings suggest that global SARS-CoV-2 genomic surveillance strategies and capacity vary considerably, and are limited in some regions," the authors reported, noting that "our study revealed that in certain countries, a large number of genomes are not available in public databases."
For example, the investigators saw relatively high levels of available SARS-CoV-2 sequence data in 96 countries, while more than 70 countries had moderate sequence data availability and half a dozen had low availability. Separately, they found that routine genome sequencing-based surveillance was done at particularly high levels in nearly four dozen of the countries considered.
"We found that genomic surveillance strategies were globally heterogenous," Chen explained in an email, with 45 countries performing high levels of routine genomic surveillance and many countries in the Eastern Mediterranean, Africa, and the Americas doing limited surveillance.
More than half of the sequences came from sites in Europe, the team noted, followed by the Americas. Also, far more SARS-CoV-2 sequence data came from high-income countries than from lower-income settings.
Across the suite of available SARS-CoV-2 genomes, the team noted that investigators most often turned to short-read sequencing technologies.
More than 3.7 million sequences were generated with Illumina instruments, compared to around 816,000 genomes produced with Oxford Nanopore sequencing platforms, though the availability of specific technologies, and sequencing capacity in general, varied by location and income levels in each country or region.
Likewise, sequencing turnaround times tended to be shorter in high-income regions, the researchers reported. Even so, turnaround times fell across all of the areas they looked at as the pandemic went on, as labs put surveillance processes in place and improved their workflows.
Despite the slew of sequence data generated over the course of the pandemic, the team noted that, by the end of 2021, only a subset of countries had consistently shared sequence data for variants of concern, including the Alpha, Beta, Gamma, and Delta variants. Many of the SARS-CoV-2 sequences also appeared to lack demographic and other metadata, such as sampling methods or an infected individual's symptoms, outcomes, or vaccination status.
"Our analysis of publicly deposited SARS-CoV-2 sequences implied that some countries are not sharing genomic data in public repositories," Chen said. "To counter the threat of emerging variants, we urge international cooperation in encouraging, incentivizing, and enabling the timely and complete sequencing and sharing of SARS-CoV-2 genomic data in all countries."