NEW YORK – A new method developed at Sweden's Karolinska Institute offers whole-genome sequencing and transcriptomics for the same single cells, with potential applications in cancer research and genetic screens.
Direct nuclear tagmentation and RNA sequencing (DNTR-seq) is a plate-based method developed by Karolinska cancer researcher Martin Enge that combines low-coverage whole-genome sequencing with SMART-seq (switching mechanism at 5' end of RNA template sequencing).
In a paper published Thursday in Molecular Cell, Enge's team described the method and provided proof of concept that DNTR-seq produces libraries with lower positional bias than typical DNA-only methods while achieving the same breadth of coverage with fewer reads. Total library size was 2.48 Gb per cell, on average.
The method is especially suited to calling copy number variants (CNVs), the authors wrote, due to the low positional bias of the reads. In one experiment, the researchers were able to identify minor subclones based on CNVs in frozen samples from leukemia patients.
The entire library preparation takes place within one microwell, with one PCR amplification and three enzymatic steps with no intermediate cleanup. At ultra-low coverage of approximately 1 million reads per cell, 80 percent of the per-sample cost is the cost of sequencing. And the method easily identifies doublets, which often plague single-cell transcriptomics studies, Enge said. "Since we cut the genomic template directly, we can get a very good view of how many chromosomes there are," he said.
The paper demonstrates the ability to analyze thousands of cells at once, said Merja Heinäniemi, a genomics researcher at the University of Eastern Finland who collaborates with Enge on cancer research, but who was not involved in this study. Though droplet-based single-cell RNA-seq methods have redefined what's considered high throughput, DNTR-seq proved useful: "They've gained a lot of insight even from not having a huge number of cells," she said. She noted that low-coverage sequencing does not usually yield data on single-nucleotide variants (SNVs); however, analyzing groups of about 60 clonal cells with DNTR-seq yielded a recall probability of about 80 percent, at a depth of 1 million read pairs of 37 bp each per cell.
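To put those sequencing figures in perspective, the short Python sketch below works out the average depth implied by 1 million read pairs of 37 bp, for one cell and for a pool of 60 clonal cells. It is a back-of-the-envelope illustration, not the paper's recall calculation: the genome size of roughly 3.1 billion bases, the function names, and the Poisson model of read placement are all assumptions made here, and the estimate ignores duplicate reads and overlapping mates.

```python
import math

# Assumption: approximate size of the haploid human genome, in bases.
HUMAN_GENOME_BP = 3.1e9


def mean_depth(read_pairs, read_len_bp, n_cells=1, genome_bp=HUMAN_GENOME_BP):
    """Average per-base depth for paired-end reads pooled across n_cells."""
    sequenced_bp = read_pairs * 2 * read_len_bp * n_cells
    return sequenced_bp / genome_bp


def prob_covered(depth):
    """Chance that a given base is hit by at least one read,
    assuming reads land uniformly at random (Poisson model)."""
    return 1.0 - math.exp(-depth)


single = mean_depth(1_000_000, 37)              # one cell
pooled = mean_depth(1_000_000, 37, n_cells=60)  # 60 clonal cells combined

print(f"single-cell depth: {single:.3f}x")
print(f"pooled depth (60 cells): {pooled:.2f}x")
print(f"fraction of bases covered in pool: {prob_covered(pooled):.2f}")
```

Under these assumptions a single cell is sequenced to well under 0.1x, which is why single-nucleotide variants are out of reach cell by cell, while pooling 60 clonal cells brings the average depth above 1x.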
Moreover, "it looks like you could set this up in any lab," she said. "That also makes it attractive."
"Our main focus is to study the behavior of different cell types in a tumor," Enge said. "We want to be able to understand the transcriptional effect of different mutations, but also identify which cell types exist within a tumor." But the method could also be applied to genetic screens and cell maturation studies, he said.
Enge traces the roots of DNTR-seq back to his time as a postdoc in Stephen Quake's lab at Stanford University. "Back then we were working mostly with RNA-seq data," he said. RNA data alone provided an incomplete view of the genome, he said, and working with cancer made it even more imperative to get DNA data to pair with the transcriptome. With DNTR-seq, "we can take tissue from a patient, separate it into single cells, and understand which different clones exist there. And for each cell we can understand what its phenotype is in the tissue," he said.
Enge's method is the latest to offer simultaneous genomic and transcriptomic analysis of single cells. Genome and transcriptome sequencing (G&T-seq), introduced in 2015, also separates DNA and mRNA for separate processing, using multiple displacement amplification to amplify the whole genome. Meanwhile, DR-Seq, based on quasilinear amplification, does not separate the nucleic acids.
But the existing methods are "laborious and expensive," Enge said. "For this to gain traction, you have to be able to do many cells. Otherwise, what's the point?" he said.
DNTR-seq begins with a fluorescence-activated cell sorter, which deposits one cell in each well of a plate. A lysis buffer leaves the nucleus intact; the nucleus is then spun down and removed from the well with an automated pipettor.
After treatment to remove chromatin, the method uses a tagmentation-based approach to prepare the sequencing library instead of random priming and strand displacement. DNA yield can be up to 60 percent for whole-genome sequencing. The samples are then pooled and sequenced at low coverage.
The transcriptome remaining in the well is processed using the SMART-seq2 protocol. RNA-seq yield is more difficult to ascertain, Enge said, but published estimates for the protocol are about 20 to 25 percent of mRNA.
Sample preparation for WGS costs about $0.50 per cell, and the first, low-coverage pass costs about $1 per cell for sequencing. SMART-seq2 costs vary, Enge said, and can range between $0.50 and $2 per cell.
"Our pipeline typically starts with WGS," he said. "It's a much simpler protocol and it's much cheaper to characterize a lot of cells on the genomic level and then select cells to do RNA-seq on after."
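As a rough illustration of that WGS-first strategy, the Python sketch below totals the per-cell figures Enge quoted for a hypothetical experiment. It treats the quoted numbers as all-in per-cell costs, which the article does not specify; the function name, the 1,000-cell plate count, and the 25 percent follow-up fraction are assumptions chosen only for the example.

```python
# Per-cell figures as quoted in the article (US dollars).
WGS_PREP = 0.50                    # WGS sample preparation
WGS_SEQ = 1.00                     # first low-coverage sequencing pass
RNA_LOW, RNA_HIGH = 0.50, 2.00     # quoted SMART-seq2 range


def cost_range(n_cells, rna_fraction=1.0):
    """Return (low, high) total cost when every cell gets WGS and only
    a chosen fraction of cells goes on to RNA-seq (hypothetical helper)."""
    wgs_total = n_cells * (WGS_PREP + WGS_SEQ)
    n_rna = n_cells * rna_fraction
    return wgs_total + n_rna * RNA_LOW, wgs_total + n_rna * RNA_HIGH


# Both assays on every cell vs. RNA-seq on a selected quarter of the cells.
all_low, all_high = cost_range(1000)
sel_low, sel_high = cost_range(1000, rna_fraction=0.25)

print(f"RNA-seq on all cells:  ${all_low:,.0f} to ${all_high:,.0f}")
print(f"RNA-seq on 25 percent: ${sel_low:,.0f} to ${sel_high:,.0f}")
```

Under these assumptions, profiling all cells by WGS and following up on a subset keeps the total close to the WGS cost alone, with RNA-seq spending concentrated on the clones of interest.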
Though pooled analysis of clones can offer some insight on SNVs, "the basic roadblock right now is that we can't really trust the single nucleotide variants that we find," Enge said. "We start with exactly one template and it's hard to distinguish early-round PCR errors." His team is working on error correction measures to address this.
Heinäniemi noted that one study presented in the paper used frozen acute lymphoblastic leukemia cells. That's a disease she is studying in collaboration with Enge, and she hopes DNTR-seq can produce data from such samples.
"We intend to analyze the impact of genetics versus the transcriptome in these primary patient cells. We're very interested in what determines the response to treatment," she said. "When you have cancer, the response could be related to resistance mutations, or, it could be related to the gene expression pattern, features in the RNA profile."
Cancer therapies, by design, eradicate cancer cells, so those that are left are, ideally, few in number. "The thing you can do now with single-cell methods, and we're very interested to try this, is to cleverly sort the remaining cells on the plate and analyze those in detail," she said.