NEW YORK – A new UK-based research consortium is betting that sequencing the virus that causes COVID-19 can provide policymakers with a clearer picture of the spread of the epidemic across the country as diagnostic testing remains limited.
"I think we'll be able to provide a pretty clear snapshot of COVID-19 across the UK" in a month's time, said Ewan Harrison, an infectious disease researcher at the University of Cambridge and coordinator for the COVID-19 Genomics UK consortium, announced Monday with £20 million ($23.2 million) in funding from the Wellcome Trust and the UK government.
The consortium expects its work to provide useful insights into the spread, distribution, and scale of the epidemic in the UK. Viral genome sequences could also inform the development of treatments and vaccines. And the data may provide feedback on the effects of certain policy interventions, like self-isolation.
Researchers, healthcare providers, and public health officials in more than a dozen cities across the UK will work together to collect viral nucleic acid from patient samples that have tested positive for COVID-19 and analyze it with next-generation sequencing. They'll be allowed to use the whole-genome sequencing technologies and methods they're most comfortable with, provided the protocols and validation studies are shared with the other members or with the public.
"It's a bit of a logistical problem," Harrison admitted. "I think in many ways the sequencing is the easier part."
"I think this is a great idea," said Michael Zody, scientific director of computational biology at the New York Genome Center (NYGC). His institution is trying to set up a similar consortium for COVID-19 sequencing in the New York region, which has been one of the hardest hit in the US. "One of the great challenges of this outbreak is the 'tip of the iceberg' question of how many people are actually infected and how the virus is transmitted between people. Large-scale projects like this to try to capture viral sequences from as many people in as many places as we can are going to be really important to understand the parameters of the outbreak."
But the UK researchers still face many uncertainties, including whether they can get harmonized results despite using different target enrichment strategies and even sequencing technologies. And aspects of the virus's biology present challenges that some methods may address better than others.
The consortium, which was put together over the last several weeks, will draw on the UK's next-generation sequencing expertise, built over decades, to gain an edge in the international health and economic crisis. It is well-positioned to succeed, NYGC CEO and Scientific Director Tom Maniatis said.
"As usual, the UK is highly organized. They're lucky to have Wellcome Trust funding which is obviously a critical factor," he said. Moreover, the centralized health system will help identify and collect patient samples, as well as the appropriate patient consent needed to collect and later analyze them.
The geographical distribution of consortium members is another aspect the UK has gotten right, Zody said. A properly sampled phylogenetic tree for the virus can show how the epidemic is progressing through measures such as the tree's size and shape. It can also help show how the epidemic responds to certain public health interventions, such as closing schools or restricting people's movement in cities.
"We'd love to sequence every case," said Nick Loman, a sequencing expert and consortium participant at the University of Birmingham. With community spread of the coronavirus continuing in the UK, sequencing every patient would be hard, but getting proper sampling is more important than sequencing every last case, he said.
Whether the funding announced so far will allow the consortium to sequence each sample is unclear. Harrison said it wasn't possible yet to determine how many samples the funding would be sufficient for.
Zody estimated that if the entire amount were spent on sequencing runs, it could cover approximately 500,000 samples. However, Harrison said sample logistics and computing resources would also be covered by the available funds.
Sample preparation is a major challenge. Nasal swabs in a healthcare setting don't always provide the highest quality viral RNA, Zody noted, and Harrison said the consortium would be working with different collection methods and even different sample types. In addition, the amount of viral genome available can vary wildly between samples. Loman said his lab has seen threshold cycle (Ct) values, an indirect measure of viral copy number based on the number of PCR amplification cycles needed to produce a signal above background, in the range of 12 to 37 for SARS-CoV-2. "That's a really large number of log differences," he said.
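To put that Ct range in perspective, the short Python sketch below converts a difference in Ct values into an approximate fold difference in starting viral template. It assumes idealized PCR in which the product doubles every cycle; the function name and the `efficiency` parameter are illustrative and do not come from the consortium or from Loman's lab. Real assays rarely reach perfect efficiency, so the result is a rough upper estimate.

```python
import math


def fold_difference(ct_low, ct_high, efficiency=1.0):
    """Approximate ratio of starting template between two Ct values.

    efficiency=1.0 means the product doubles every cycle, which is an
    idealized assumption; real assays are usually somewhat less efficient.
    """
    if ct_high < ct_low:
        raise ValueError("ct_high must be greater than or equal to ct_low")
    return (1.0 + efficiency) ** (ct_high - ct_low)


# Ct range reported in the article: 12 to 37
fold = fold_difference(12, 37)
logs = math.log10(fold)
print(f"{fold:.3g}-fold difference, about {logs:.1f} log10 units")
```

Under the doubling assumption, a 25-cycle gap works out to roughly a 33-million-fold difference in starting material, or about 7.5 orders of magnitude.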
But the method Loman and many others are using is well suited for use in a viral outbreak, he said, having proven its worth with the 2014 Ebola and the 2015 Zika virus outbreaks.
The ARTIC Network SARS-CoV-2 Rampart protocol, released publicly in late January, uses a whole-genome PCR tiling approach to amplify viral sequences, which helps with low-copy number samples. And it's cost-effective, Loman said: "More or less all the reads you get are targeted on the virus."
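The idea behind a tiling scheme can be sketched in a few lines of Python: the genome is covered end to end by short, overlapping amplicons, so even a low-copy sample yields reads across every region. This is only an illustration and not the ARTIC primer design. The amplicon length, overlap, and alternating two-pool assignment are assumptions chosen for the example, and the genome length used is that of the Wuhan-Hu-1 reference sequence (29,903 bases).

```python
def tile_amplicons(genome_len, amplicon_len=400, overlap=50):
    """Return (start, end, pool) tuples covering the genome with overlapping tiles.

    Coordinates are 0-based and half-open. Neighbouring tiles are placed in
    alternating pools (1 and 2) so that overlapping amplicons are not
    amplified in the same reaction. All default values are illustrative.
    """
    if genome_len <= 0:
        raise ValueError("genome_len must be positive")
    if not 0 <= overlap < amplicon_len:
        raise ValueError("overlap must be non-negative and shorter than the amplicon")

    step = amplicon_len - overlap
    tiles = []
    start = 0
    while True:
        end = min(start + amplicon_len, genome_len)
        pool = len(tiles) % 2 + 1
        tiles.append((start, end, pool))
        if end == genome_len:
            break
        start += step
    return tiles


# Length of the Wuhan-Hu-1 reference genome
tiles = tile_amplicons(29903)
print(len(tiles), "amplicons; first:", tiles[0], "last:", tiles[-1])
```

Because nearly every sequenced fragment comes from one of these amplicons, almost all reads map to the virus rather than to human or bacterial material in the sample, which is the cost advantage Loman describes.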
Several researchers have already successfully sequenced patient samples with the protocol. Using Oxford Nanopore sequencing technology, Thomas Williams, a respiratory virus researcher at the University of Edinburgh, was able to get 100x coverage in just 15 minutes on the first COVID-19 samples sequenced in that city, according to a Twitter message he posted.
Williams explained that despite its lower single-base accuracy, nanopore sequencing with the protocol still provided high coverage — "at least 200x in most cases," he said in an email — allowing for consensus basecalling.
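A toy example shows why deep coverage compensates for per-read errors: if most reads agree at a position, occasional miscalls are outvoted. The Python sketch below takes reads that are already aligned to the same coordinates and calls the majority base in each column. The function name, the `min_depth` threshold, and the example reads are hypothetical, and production pipelines use more sophisticated, error-model-aware callers than a simple vote.

```python
from collections import Counter


def consensus(aligned_reads, min_depth=3, gap="-"):
    """Call a majority-vote consensus from equal-length aligned reads.

    Positions with fewer than min_depth non-gap bases, or with no base
    supported by more than half of the reads, are reported as 'N'.
    """
    if not aligned_reads:
        return ""
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("all reads must have the same aligned length")

    called = []
    for column in zip(*aligned_reads):
        counts = Counter(base for base in column if base != gap)
        depth = sum(counts.values())
        if depth < min_depth:
            called.append("N")
            continue
        base, support = counts.most_common(1)[0]
        called.append(base if support > depth / 2 else "N")
    return "".join(called)


# Five noisy reads of the same region; each error appears in only one read
reads = ["ACGTAC", "ACGTAC", "ACCTAC", "ACGTTC", "A-GTAC"]
print(consensus(reads))
```

With only five reads the isolated errors are already outvoted; at the 200x depth Williams describes, the same principle applies with a much larger margin.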
But not all UK consortium members will be using the same methods or even sequencing platforms to collect data. The ARTIC protocol works on both Illumina and nanopore sequencing platforms, for example, and Loman said other researchers are using metagenomic sequencing strategies or target capture-based sample preparations.
Maniatis said this was surprising, noting that one of the classic problems facing consortium-based projects is "making sure everyone uses the standard protocol." But Zody said it wasn't surprising, given the circumstances. "This is a new virus and nobody had a protocol for sequencing it until a few weeks ago," he said. "You're optimizing protocol at the same time you're trying to do vitally important sequencing work, that's one of the challenges."
Loman said that the consortium's "ground rule" is that each method, protocol, and validation study must be shared with the group, at least, or publicly if possible. In his experience, data sets are often comparable. "That's already working on a global scale," he said. "We take sequences from all over the place and we get very nicely comparable results."
It's unclear whether other national governments that have been slow to implement PCR-based COVID-19 testing, including in the US, will follow the UK's lead. The US Centers for Disease Control and Prevention, the National Institute of Allergy and Infectious Diseases, and the office of National Institutes of Health Director Francis Collins did not respond to requests for comment on any plans to use sequencing to complement epidemiology.
Maniatis, who is trying to start a New York-based COVID-19 sequencing project, said he was not aware of any NIH funding opportunities for such work, but he expects to see some soon.
Harrison and Loman said they've been in contact with other researchers around the world about sharing plans and lessons learned, including with people interested in starting national or more local COVID-19 sequencing programs.
Though sequencing can help estimate the number of infections missed by diagnostic-focused PCR testing in some countries, it is also needed "to understand the population dynamics and evolution of the virus, so you can strategize on how to stop the pandemic," Maniatis said.
Sequencing studies of the virus could also reveal gene-based differences in how the viral proteins interact with human innate immunity pathways, enabling a better or worse cellular defense. Already, Italian researchers using a Thermo Fisher Scientific NGS panel have obtained evidence that the virus is genetically stable, making it a good target for a vaccine. Loman added that sequencing human genomes of confirmed COVID-19 cases, even asymptomatic ones, may also provide useful information related to an individual's susceptibility to the virus.
And while the world is still in the throes of the current pandemic, the next one is already in the back of scientists' minds.
"One of the great hopes is that for any future pandemics, this infrastructure will be there and hopefully we'll demonstrate the use case for it," Harrison said. "We're into new territory at this scale. There are lots of things we're likely to see, but until we get the data, it's hard to be sure."