NEW YORK – Two independent research groups have demonstrated that a hardware hack can turn the Illumina NovaSeq 6000 sequencer into a platform for spatial transcriptomics research, paving the way for larger studies that could be too expensive to run on dedicated commercial spatial omics platforms.
Both approaches, detailed in bioRxiv preprints released over the last six months, build on the SeqScope chemistry that was published in 2021 by researchers from the University of Michigan, who used the low-throughput Illumina MiSeq benchtop sequencer. Researchers from the Berlin Institute for Medical Systems Biology at the Max Delbrück Center for Molecular Medicine described Open-ST in December, while a team from Belgium's VIB presented Nova-ST in February.
The essentials are the same for both new high-throughput methods. First, a library of special oligos featuring a unique 32 bp sequence and a section designed to capture RNAs is loaded on a NovaSeq S4 flow cell and run through the sequencer using the 35-cycle reagent kit. This seeds the flow cell nanowells with clusters of spatial barcodes. Their coordinates in the flow cell, as read by the first sequencing run, can be used later on in data analysis. Then, the flow cell is pried apart, cut up, and exposed to a tissue section, capturing RNAs in the sample. These RNAs are then tagged with the spatial barcode in the capture probe, released, and prepped for another sequencing run on any sequencing instrument compatible with Illumina libraries.
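The two-run logic can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not either group's actual pipeline: the barcode table, the read tuples, and the function name `assign_positions` are all hypothetical, and the barcodes are shortened from 32 bp for readability.

```python
from collections import Counter

# Hypothetical result of the first run: spatial barcode -> (x, y) on the flow cell.
# Real barcodes are 32 bp; short stand-ins are used here.
barcode_to_xy = {
    "AACGT": (10.0, 20.0),
    "CCTGA": (10.6, 20.0),
    "GGATC": (11.2, 20.5),
}

# Hypothetical reads from the second run: (spatial barcode, UMI, gene).
reads = [
    ("AACGT", "UMI1", "Actb"),
    ("AACGT", "UMI1", "Actb"),  # duplicate: same barcode, UMI, and gene
    ("AACGT", "UMI2", "Gfap"),
    ("GGATC", "UMI3", "Actb"),
    ("TTTTT", "UMI4", "Actb"),  # barcode absent from the first run
]


def assign_positions(reads, barcode_to_xy):
    """Count unique molecules per (x, y, gene), dropping unknown barcodes."""
    seen = set()
    counts = Counter()
    for barcode, umi, gene in reads:
        xy = barcode_to_xy.get(barcode)
        if xy is None:
            continue  # no coordinate known for this barcode
        key = (barcode, umi, gene)
        if key in seen:
            continue  # collapse duplicates of the same molecule
        seen.add(key)
        counts[(xy[0], xy[1], gene)] += 1
    return counts


spatial_counts = assign_positions(reads, barcode_to_xy)
for (x, y, gene), n in sorted(spatial_counts.items()):
    print(f"x={x} y={y} gene={gene} molecules={n}")
```

The point of the sketch is that the second run never needs to happen on the original flow cell: the coordinate table from the first run is the only link between a read and its position.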
Although the approach consumes at least one of Illumina's most expensive sequencing flow cells, both groups say their methods offer more high-resolution data for less money than similar commercially available methods from companies such as Vizgen, Curio Bioscience, 10x Genomics, and NanoString Technologies.
"The cost of a 10 mm by 8 mm tile comes to around €500 ($538) to €600 per sample, excluding sequencing," mostly due to the flow cell, said Suresh Poovathingal, head of the single-cell and microfluidics expertise unit at VIB-KU Leuven and the first author of the Nova-ST preprint. That's just under five times cheaper than 10x's Visium HD per sample, he said, and seven to 10 times cheaper per square millimeter of tissue.
The Open-ST team said their sample prep costs less than €150 per 12 mm² capture area. "The total costs of a standard sample (3 x 4 mm, 400M sequencing reads, ∼100,000 cells, ∼1,000 [unique molecular identifiers]/cell) are a few hundred euros, primarily driven by the sequencing costs," they wrote.
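To put the quoted figures on a common footing, the per-area arithmetic can be worked out as below. This is a back-of-the-envelope sketch: the helper `cost_per_mm2` is hypothetical, the Open-ST number is an upper bound because the team said "less than" €150, and the two teams may not count the same cost items, so the results are not a direct head-to-head comparison.

```python
def cost_per_mm2(cost_eur, width_mm, height_mm):
    """Cost in euros per square millimeter of capture area."""
    return cost_eur / (width_mm * height_mm)


# Nova-ST: EUR 500 to 600 per 10 mm by 8 mm tile, excluding sequencing.
nova_low = cost_per_mm2(500, 10, 8)
nova_high = cost_per_mm2(600, 10, 8)

# Open-ST: less than EUR 150 per 12 mm2 (3 mm by 4 mm) capture area.
open_st_upper = cost_per_mm2(150, 3, 4)

print(f"Nova-ST: EUR {nova_low:.2f} to {nova_high:.2f} per mm2")
print(f"Open-ST: under EUR {open_st_upper:.2f} per mm2")
```

On these numbers, Nova-ST comes to €6.25 to €7.50 per mm² and Open-ST to under €12.50 per mm².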
For intrepid scientists, the new methods may unlock larger spatial biology studies that would be too expensive to run on commercialized platforms. However, the financial risks are real: Getting started requires an S4 flow cell that costs north of $10,000, not to mention a second sequencing run and the patience to run a method lacking corporate polish and support. But Nova-ST can produce over 80 approximately 1 cm² tiles per NovaSeq flow cell and produce data in less than a week. NovaSeq 6000 flow cell nanowells are 300 nm across, and their centers are 625 nm apart, offering high resolution for an NGS-based readout. For comparison, 10x's Visium HD, which is only hitting the market now, offers 2 μm by 2 μm tiles. However, that product offers a continuous capture surface, while the NovaSeq flow cells have dead capture space between clusters.
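The trade-off between resolution and dead space can be roughed out from the two dimensions quoted above. The sketch below rests on assumptions the article does not state: that nanowells sit on a regular lattice (hexagonal or square, both shown), that each well is a circle 300 nm in diameter, and that only the well itself captures RNA. The function names are hypothetical.

```python
import math

WELL_DIAMETER_UM = 0.300   # nanowell width quoted above
PITCH_UM = 0.625           # center-to-center spacing quoted above
VISIUM_HD_SIDE_UM = 2.0    # side of the 2 um by 2 um unit, for comparison


def wells_per_um2(pitch_um, layout="hexagonal"):
    """Nanowells per square micrometer for an idealized regular lattice."""
    if layout == "hexagonal":
        return 2.0 / (math.sqrt(3.0) * pitch_um ** 2)
    if layout == "square":
        return 1.0 / pitch_um ** 2
    raise ValueError(f"unknown layout: {layout}")


def well_area_fraction(diameter_um, pitch_um, layout="hexagonal"):
    """Fraction of the surface occupied by circular wells."""
    well_area = math.pi * (diameter_um / 2.0) ** 2
    return well_area * wells_per_um2(pitch_um, layout)


for layout in ("hexagonal", "square"):
    density = wells_per_um2(PITCH_UM, layout)
    per_unit = density * VISIUM_HD_SIDE_UM ** 2
    covered = well_area_fraction(WELL_DIAMETER_UM, PITCH_UM, layout)
    print(f"{layout}: {per_unit:.1f} wells per 2 um square, "
          f"{covered:.0%} of surface inside wells")
```

Under either lattice assumption, roughly 10 to 12 nanowells fit in the footprint of one 2 μm by 2 μm square, while only about a fifth of the surface lies inside a well.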
Both methods could add spatial context to established sequencing-based data types, such as targeted immune cell receptor sequencing and epigenomic profiling by ATAC-seq (assay for transposase-accessible chromatin by sequencing). Already, labs in the US and Europe are thinking about how to get started with Nova-ST, according to Poovathingal and Eric Chow, director for the Center for Advanced Technology at the University of California, San Francisco, who is helping another researcher try out this approach.
Open-ST and Nova-ST represent the confluence of several trends in spatial analysis. They follow several methods that use NGS as a readout, including the original spatial transcriptomics method, which has been commercialized as 10x's Visium, and Slide-seq, a method from researchers at the Broad Institute that pre-sequences the barcoded capture array and is being commercialized by Curio as the Seeker platform.
They also join MGI Tech's Stereo-seq and Singular Genomics' G4X in using an instrument built for sequencing to do spatial analysis of tissue sections.
Finally, they follow the DIY trail blazed by methods like SeqScope, which they incorporate, and PySeq, a method developed by researchers at the New York Genome Center that repurposes Illumina HiSeq instruments for low-plex spatial proteomics.
"These papers are really cool," said Silas Maniatis, associate director of technology innovation at the NYGC and senior author of the PySeq paper. "They are the realization of what we and probably every other tech lab thought about as soon as we saw the original SeqScope paper."
"From the data point of view, these papers look pretty great," he said. "Also, we of course love the customizability available to end users when you're able to specify the sequences on the flow cell."
Poovathingal noted that his team accomplished its project without assistance from Illumina. If the sequencing giant has any plans to go down this path, it has not said so publicly. "If they can make the flow cell in a way that the two glass layers are not bonded, that would have saved us a lot of time," he said. However, now his lab has automated the process of disassembling and cutting the glass flow cells into tiles of any size. He said they get almost 100 percent recovery of tiles: At the standard size of 1 cm², that yields more than 80 tiles that can be used to process tissue sections.
In their preprint, the Nova-ST team began by binning transcripts into areas of three sizes: 25 µm by 25 µm, 50 µm by 50 µm, and 100 µm by 100 µm. Approximately 85 percent of reads had valid barcodes and could be mapped, and the percentage of unique molecular identifiers (UMIs) ranged between 36 percent and 79 percent, depending on sequencing depth. The smallest "bin" yielded a median of 994 genes and 2,131 UMIs for deep sequencing. They also shallowly sequenced samples, yielding a median of 251 genes and 374 UMIs. The largest bin offered 6,318 genes and 32,317 UMIs for deep sequencing and 2,503 genes and 5,920 UMIs for shallow sequencing.
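Binning of this kind amounts to grouping barcode coordinates into square cells and summarizing each cell. The following is a minimal sketch on made-up records, not the Nova-ST team's code; the record layout `(x_um, y_um, gene, umi_count)` and the function names are assumptions for illustration.

```python
from collections import defaultdict
from statistics import median


def bin_transcripts(records, bin_size_um):
    """Group (x_um, y_um, gene, umi_count) records into square bins.

    Returns {(bin_x, bin_y): (distinct_genes, total_umis)}.
    """
    genes = defaultdict(set)
    umis = defaultdict(int)
    for x, y, gene, n in records:
        key = (int(x // bin_size_um), int(y // bin_size_um))
        genes[key].add(gene)
        umis[key] += n
    return {key: (len(genes[key]), umis[key]) for key in genes}


def median_per_bin(binned):
    """Median genes and median UMIs across all occupied bins."""
    gene_counts = [g for g, _ in binned.values()]
    umi_counts = [u for _, u in binned.values()]
    return median(gene_counts), median(umi_counts)


# Made-up example records, coordinates in micrometers.
records = [
    (3.0, 4.0, "Actb", 2),
    (20.0, 10.0, "Gfap", 1),
    (30.0, 5.0, "Actb", 1),
    (60.0, 70.0, "Snap25", 3),
]

for size in (25, 50, 100):
    binned = bin_transcripts(records, size)
    print(size, "um bins:", len(binned), "occupied, medians", median_per_bin(binned))
```

Larger bins pool more barcodes, which is why per-bin gene and UMI counts climb with bin size while spatial resolution falls.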
The authors also compared their method with Stereo-seq on mouse brain sections. Using the largest bin size, Nova-ST detected a median of 6,318 genes compared to 4,092 for Stereo-seq. The authors suggested that this means they could decrease bin sizes to achieve increased resolution, compared to Stereo-seq.
Nova-ST had similar sensitivity to Open-ST — which "also demonstrates superior sensitivity compared to Stereo-seq," Poovathingal's team wrote.
In their preprint, the Open-ST team reported about 55 percent of reads mapping to genic regions that carried a spatial barcode. They used a cell segmentation approach in a mouse brain section, sequenced at a depth of nearly 500 million reads. They segmented nearly 50,000 cells, capturing 21,609 total genes, with a median of 621 genes and 880 UMIs per cell, with 42 percent of cells containing over 1,000 transcripts. Open-ST authors were not available for comment before deadline.
One drawback of the two methods is the dead space between nanowells. "It could just mean you have less efficient capture," Chow said. "If there isn't a barcoded oligo underneath it, you're not going to capture RNA above that space."
Poovathingal noted that the total analyzable surface is slightly less than the full surface of each tile; for a 1 cm² tile, the available surface is 10 mm by 8 mm.
The method is also currently limited to transcripts only, though protein analysis through the CITE-seq (cellular indexing of transcriptomes and epitopes) protocol is one of several applications that Poovathingal hopes to explore.
To that end, he is working with another lab on a study, though he declined to name his collaborators. "They have genetic tags they introduce that each cell receives. They want to know spatial location of different barcodes in a tissue," he said.
In addition to CITE-seq, he thinks Nova-ST could also be paired with ATAC-seq and immune-cell receptor profiling. "Stereo-seq can't do ATAC-seq," he said, though DBiT-seq, a spatial method from Rong Fan's lab at Yale University, can do that protocol.
Poovathingal noted that Nova-ST can be used with tissues from organisms other than human and mouse, an important consideration for his lab, which has worked with samples from odd animal models like octopus and hamsters.
Chow said he's currently testing and evaluating both Open-ST and Nova-ST to help another researcher with a large study. "We'll be comparing aspects of each and end up taking bits of both," he said. He suggested that researchers only consider this path if they're ready to make a commitment. "The protocols are not as clear and developed, so it's going to take a bit of work," he said. "You've got to be doing multiple flow cells to make it worth the effort."