NEW YORK – With mutants of the SARS-CoV-2 virus on the rise all over the world, some more contagious than the wild type and others dodging the just-released vaccines, genomic surveillance to monitor their spread is becoming more important than ever.
At the Advances in Genome Biology and Technology general meeting, held online this week, two genomic laboratories – one in the UK, the other in Canada – described their approaches for sequencing viral genomes from patient samples routinely and how they have used the data to monitor variants, analyze outbreaks, and guide policymakers.
Jeffrey Barrett, director of the COVID-19 initiative at the Wellcome Sanger Institute, explained that his team's effort is part of the COVID-19 Genomics UK (COG-UK) Consortium, which was founded in March 2020 and delivers large-scale whole-genome virus sequencing, bioinformatics, and expertise to local centers of the UK's National Health Service and to the UK government to help inform policy decisions and manage the pandemic. Besides the Sanger Institute, it involves more than a dozen academic institutions that provide sequencing services, as well as the UK's four public health agencies.
The UK government has also helped set up a network of so-called "lighthouse laboratories" – high-throughput COVID-19 testing labs that are managed by the health department, NHS trusts, commercial suppliers, academic groups, or non-profit organizations that undertake the majority of community-based COVID testing.
Last summer, Barrett said, the Sanger Institute teamed up with the lighthouse labs for genomic surveillance. Every day, the labs deliver leftover samples from their PCR tests to the Sanger Institute, a total of 500,000 samples per day in 96-well plates that contain a mix of positive and negative tests. They arrive in large boxes that are stored in temporary freezers in a car park of the institute until they are sequenced.
Sanger researchers use a specially configured manual and robotic pipeline to pick out only positive samples from the plates – right now, about 20,000 positive tests per week, though that number continues to grow – for sequencing. These are selected based on the number of positive cases in different geographic areas of the UK at the time. Samples are PCR-amplified using the ARTIC protocol, which covers the 30 kb SARS-CoV-2 genome with 98 tiled amplicons, and are sequenced on Illumina NovaSeq instruments.
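To make the tiling idea concrete, here is a rough sketch of how 98 overlapping amplicons can span a genome of about 30 kb. It assumes evenly spaced amplicons of 400 bp each, which is a simplification: the real ARTIC primer scheme uses fixed, unevenly spaced primer positions, and the function names below are hypothetical.

```python
def tile_amplicons(genome_len, n_amplicons, amplicon_len):
    """Return evenly spaced (start, end) coordinates, 0-based and half-open.

    Assumption: equal-length amplicons at equal spacing. Real primer
    schemes place primers where the sequence allows, not on a regular grid.
    """
    if n_amplicons < 2:
        return [(0, min(amplicon_len, genome_len))]
    # Space the start positions so the last amplicon ends at the genome end.
    step = (genome_len - amplicon_len) / (n_amplicons - 1)
    starts = [round(i * step) for i in range(n_amplicons)]
    return [(s, s + amplicon_len) for s in starts]


def min_overlap(amplicons):
    """Smallest overlap (in bp) between neighboring amplicons."""
    return min(
        prev_end - next_start
        for (_, prev_end), (next_start, _) in zip(amplicons, amplicons[1:])
    )


# Rounded figures from the article: 30 kb genome, 98 amplicons.
# The 400 bp amplicon length is an assumption for this sketch.
tiles = tile_amplicons(30_000, 98, 400)
print(len(tiles), tiles[0], tiles[-1], min_overlap(tiles))
```

Under these assumptions, neighboring amplicons overlap by somewhat less than 100 bp, which is what lets the tiles cover the genome without gaps.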
The lab currently generates about 20,000 viral genomes per week. Across the UK, labs have sequenced more than 300,000 samples in total, according to Barrett, half of which were done by Sanger. "It's really a pretty globally unique scale in terms of using genomes to watch and monitor the virus," he said.
Using the sequence data, which usually becomes available about 10 to 14 days after samples were PCR-tested, the researchers have been able to identify superspreader events, in which many virus sequences are identical. One example was an outbreak at a frozen food facility in Swindon that spread into the surrounding community.
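The idea of spotting a possible superspreader event from identical sequences can be illustrated with a toy sketch. It is not the Sanger pipeline: it assumes each genome is already a clean consensus string of equal length, whereas real analyses work on alignments, tolerate ambiguous bases, and combine the result with phylogenetic and epidemiological data. The function name and the cluster-size cutoff are hypothetical.

```python
from collections import defaultdict


def identical_clusters(genomes, min_size=3):
    """Group sample IDs whose consensus sequences are exactly identical.

    genomes: dict mapping sample ID -> consensus sequence string.
    min_size: hypothetical cutoff for reporting a cluster.
    """
    groups = defaultdict(list)
    for sample_id, seq in genomes.items():
        # Normalize case so 'acgt' and 'ACGT' count as the same sequence.
        groups[seq.upper()].append(sample_id)
    clusters = [sorted(ids) for ids in groups.values() if len(ids) >= min_size]
    # Largest clusters first.
    return sorted(clusters, key=len, reverse=True)


# Toy data, invented for illustration only.
toy = {
    "s1": "ACGTAC",
    "s2": "ACGTAC",
    "s3": "acgtac",
    "s4": "ACGTTC",
    "s5": "ACGAAC",
}
clusters = identical_clusters(toy)
print(clusters)
```

A large cluster of identical genomes is only a hint that cases share a recent source; sample dates and locations are still needed to confirm a link.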
They have also followed the spread of the B.1.1.7 variant, which first appeared at the end of November in southeast England and rapidly spread across the country over the next few months.
Today, almost all new infections in the UK are with the B.1.1.7 variant, Barrett said. The variant carries many mutations, including 17 coding mutations and a deletion of two amino acids in the spike protein. That deletion led to the dropout of one of the targets in the TaqPath PCR test from Thermo Fisher Scientific, which is widely used in the UK, and that dropout could be used to follow the spread of the variant.
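The logic of using a target dropout as a proxy can be sketched in a few lines. The TaqPath assay reports three targets (ORF1ab, N, and S); a sample in which the first two amplify but the S target does not is a candidate for the variant. The Ct cutoff and function names below are assumptions for illustration, not the criteria used by UK labs, and a dropout is only a screening signal that sequencing would need to confirm.

```python
from typing import Optional

# Hypothetical cutoff: Ct values at or below this count as a confident detection.
MAX_CT = 30.0


def detected(ct: Optional[float], max_ct: float = MAX_CT) -> bool:
    """A target counts as detected if it has a Ct value at or below the cutoff."""
    return ct is not None and ct <= max_ct


def is_s_dropout(
    orf1ab_ct: Optional[float],
    n_ct: Optional[float],
    s_ct: Optional[float],
    max_ct: float = MAX_CT,
) -> bool:
    """True when ORF1ab and N amplify well but the S target gives no signal."""
    return detected(orf1ab_ct, max_ct) and detected(n_ct, max_ct) and s_ct is None


# Toy results, invented for illustration: (ORF1ab Ct, N Ct, S Ct); None = no signal.
results = {
    "a": (21.0, 22.5, 21.8),   # all three targets detected
    "b": (20.1, 21.0, None),   # S target missing: candidate dropout
    "c": (34.0, 35.2, None),   # weak sample: S may be missing for lack of template
}
flags = {sample: is_s_dropout(*cts) for sample, cts in results.items()}
print(flags)
```

Requiring strong signal from the other two targets is a design choice to avoid flagging samples with so little virus that any target might fail by chance.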
Going forward, Barrett said, the goal is to scale up viral sequencing even more to keep track of mutant strains and help prevent them from spreading while countries race to vaccinate their populations. "We certainly intend … to keep up our surveillance program throughout 2021 because we think it's going to continue to have a lot of value," he said. Vaccine resistance, in particular, is "an issue that's been very closely watched," he added.
Across the ocean in British Columbia, Natalie Prystajecky, program head at the BC Centre for Disease Control Public Health Laboratory, which serves a population of about 5 million people, has been tracking viral genome sequences since the early days of the pandemic. Part of the lab's funding has come from the Canadian COVID Genomics Network (CanCOGeN), a C$40 million (US$31.6 million) program from the Canadian government to address the COVID-19 pandemic and build capacity for future outbreaks.
Initially, the BC lab used sequencing to follow viral lineages in the province that were associated with travel, long-term care facilities, and a superspreader event at a dental conference in early March that continued to seed cases across British Columbia. In the early days, sequencing "was a tool that helped inform policy decisions," Prystajecky said, but the data was also used to investigate outbreaks and how cases were linked.
"At this point, we realized there was more than just research we should be doing with sequencing SARS-CoV-2," she said, and requests for more sequencing were mounting. As a result, the lab started to do surveillance sequencing of specific groups, including school-age children, travelers, and healthcare workers, and is now focused on identifying variants of concern.
In the process of scaling up sequencing, including for clinical purposes, the lab in August switched from its initial setup, which combined the ARTIC amplification protocol with Oxford Nanopore MinION sequencing, to a higher-throughput protocol with sample prep automation that would allow it to sequence at least 750 SARS-CoV-2 genomes per week.
For that, it acquired two Illumina sequencing instruments and switched to an amplification protocol that produces 1,200 bp instead of 400 bp amplicons, which provided more consistency, Prystajecky said.
Right now, the researchers extract RNA with a Thermo Fisher KingFisher instrument, which is also used in routine diagnostics, generate Illumina libraries on an Eppendorf liquid handler, and sequence two libraries per day on a MiSeq Micro Cartridge with a 19-hour run time. The usual turnaround time is three to four days, she said, and the lab sequences 900 to 950 viral genomes per week, which represents about a quarter of all positive cases in the province. However, even that is not enough for the public health authorities, she said, and the lab plans to double its sequencing capacity in the coming months. This will require more liquid handling systems and also a longer work week, going from 6 days to 7 days by mid-March.
One challenge to scaling up is a shortage of supplies, Prystajecky said, including a recent shortage of library prep kits. Part of the plan is to look into alternate library preparation strategies, she said, that are "a little bit more independent of some of the big players."
Another problem is a lack of data analysts. "If you generate twice as much data, you really do need to have the analytics in place and the people in place to be able to process that data," she said, adding that more hiring is underway.
Since August, the lab has generated more than 10,000 genomes with high coverage. It has also been successful in sequencing samples with low viral load, obtaining good results consistently with samples that had a PCR Ct value of 30, and even with samples of higher Ct values, though some samples with both high and low Ct values failed.
The sequencing data is used to study how the predominant lineages in the province change over time and how they impact different regions. It also provides insights into hospital outbreaks, she said, and how transmission happened.
The lab sequenced 25 percent of pediatric cases over a three-month period and found that initially, there was not much transmission inside schools and that transmission mainly happened in the community. That changed in November, when the overall number of cases started to increase and school-based outbreaks started to occur.
More recently, the lab has been following the new variants, including the B.1.1.7 UK variant, which was first detected in December, and the B.1.351 South African variant, which first showed up in January. Both variants still have a fairly low prevalence in BC, according to a study with a qPCR assay in January that also included the P.1 Brazilian variant, but they are increasingly acquired through community spread rather than travel. For the past two weeks, the lab has routinely screened all positive samples for variants of concern, she added.
The lab is also working on doing surveillance for variants of concern in wastewater. In the Vancouver metro area, it has been sequencing wastewater since the fall and has been able to track the virus and see it go up and down along with clinical cases. The plan is to now also screen for the mutant strains in wastewater, Prystajecky said.
Going forward, there will be more tasks. "We know that our role of sequencing will continue to evolve as the pandemic evolves," she said, for example, to detect the emergence of new variants of concern including local variants. Finding reinfections and identifying vaccine escape mutations will also be important.
Better data integration is another future goal, in particular linking genomic data with epidemiological and outcomes data. The lab is co-located with several data groups in these areas, she said, and the aim is to bring all their data together in a so-called COVID Cohort database, which could help researchers understand the clinical impact of new variants of concern.