NEW YORK – Scientists at the MD Anderson Cancer Center in Houston have developed a computational method that combines single-cell RNA-seq (scRNA-seq) and spatial transcriptomics (ST) data to reconstruct a spatial cellular map at single-cell resolution.
The method, called CellTrek, directly maps individual cells to physical locations, in contrast to other spatial deconvolution methods, which generally map mixtures of cells to small but less-precise regions.
The group is now augmenting CellTrek's performance by adding image recognition and deep learning to the workflow, while working with pathologists on CellTrek's clinical applications.
"A lot of pathologists are interested in this tool and in how they can improve their qualitative analysis of tumor tissues by having more quantitative data mapping," Nicholas Navin, the study's corresponding author, said in an interview.
The researchers published a proof-of-principle paper on CellTrek last week in Nature Biotechnology. Navin noted that although CellTrek is an open-source tool and, as such, is unlikely to be commercialized itself, he anticipates collaborating with companies interested in adding it to their own commercial workflows.
CellTrek compares the transcriptional similarities between individual cells, which can account for the stark differences between cell types as well as the more subtle variations between cells of the same type. The algorithm decomposes single-cell gene expression profiles into their principal components, then maps those back to spots on a microarray via a machine learning technique called a random forest classifier, which estimates the distances between data points.
This provides a higher resolution of visual mapping compared to other deconvolution methods, which often map gene expression profiles to predetermined cellular models.
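The mapping step described above can be illustrated with a toy sketch. This is not CellTrek's implementation: it assumes a shared principal-component space computed with NumPy and swaps the random forest for a plain nearest-neighbor search, and every name and data value in it is synthetic and hypothetical.

```python
import numpy as np

def pca_embed(matrix, n_components):
    """Project the rows of `matrix` onto its top principal components (via SVD)."""
    centered = matrix - matrix.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

def map_cells_to_spots(cell_expr, spot_expr, spot_xy, n_components=3):
    """Give each cell the coordinates of its most similar spot.

    Assumption: similarity is Euclidean distance in a joint PCA space,
    a simple stand-in for the random-forest-based distance in the article.
    """
    n_cells = cell_expr.shape[0]
    # Embed cells and spots together so they share one coordinate system.
    joint = np.vstack([cell_expr, spot_expr])
    emb = pca_embed(joint, n_components)
    cell_emb, spot_emb = emb[:n_cells], emb[n_cells:]
    # Pairwise cell-to-spot distances, shape (n_cells, n_spots).
    dists = np.linalg.norm(cell_emb[:, None, :] - spot_emb[None, :, :], axis=2)
    nearest = dists.argmin(axis=1)
    return spot_xy[nearest], nearest

# Synthetic example: four spots on a 2x2 grid, each with a distinct profile.
rng = np.random.default_rng(0)
spot_xy = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
spot_expr = np.array([
    [10.0, 0.0, 0.0, 1.0],
    [0.0, 10.0, 0.0, 1.0],
    [0.0, 0.0, 10.0, 1.0],
    [5.0, 5.0, 5.0, 1.0],
])
# Six hypothetical cells: noisy copies of the spot profiles they came from.
true_spot = np.array([0, 1, 2, 3, 0, 2])
cell_expr = spot_expr[true_spot] + rng.normal(scale=0.1, size=(6, 4))

cell_xy, assigned = map_cells_to_spots(cell_expr, spot_expr, spot_xy)
```

In this toy version each cell simply inherits the coordinates of one spot; the point is only that individual cells, rather than cell-type proportions, are what get placed on the map.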
"Usually," said Runmin Wei, the study's lead author, "people first try to define cell types and using that reference, try to deconvolute each ST spot into the proportion of different cell types."
A given cell type, however, may encompass different cell states that vary along a continuum, and cell type-based deconvolution methods cannot easily distinguish them.
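For contrast, the cell type-based deconvolution Wei describes can be sketched as estimating, for each ST spot, the proportions of a few reference cell-type profiles. The following is a minimal sketch, assuming a spot's expression is a linear mix of reference profiles and using ordinary least squares with clipping rather than the dedicated statistical models of real deconvolution tools; the profiles and cell-type labels are hypothetical.

```python
import numpy as np

def deconvolve_spot(spot_expr, reference):
    """Estimate cell-type proportions for one spot.

    reference: array of shape (n_types, n_genes), one mean profile per type.
    Assumption: the spot is a linear mixture of the reference profiles.
    """
    # Solve reference.T @ weights ~= spot_expr in the least-squares sense.
    weights, *_ = np.linalg.lstsq(reference.T, spot_expr, rcond=None)
    # Negative weights have no biological meaning; clip, then normalize.
    weights = np.clip(weights, 0.0, None)
    return weights / weights.sum()

# Hypothetical reference profiles over four genes.
reference = np.array([
    [8.0, 1.0, 0.0, 2.0],  # "epithelial"
    [0.0, 6.0, 3.0, 2.0],  # "immune"
    [1.0, 0.0, 9.0, 2.0],  # "stromal"
])
# A synthetic spot mixed from 50% / 30% / 20% of the three types.
spot = 0.5 * reference[0] + 0.3 * reference[1] + 0.2 * reference[2]

proportions = deconvolve_spot(spot, reference)
```

Because each cell type is collapsed into a single reference profile, two cells of the same type in different states contribute to the same proportion, which is the limitation noted above.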
With CellTrek, Navin explained, "you can map a trajectory of many intermediate states. You can map a continuous phenotype like the amount of endothelial-to-mesenchymal transition or the invasive signature. So you can think of it as really being able to map cell states [whereas] most deconvolution is kind of cell type level."
Navin and his colleagues tested CellTrek on simulations and in situ datasets before studying the spatial organization of cell types and states using mouse brain and kidney tissues as well as data generated from two human ductal carcinoma in situ (DCIS) samples.
CellTrek first reliably reconstructed tissue patterns and identified cell subtypes from publicly available scRNA-seq and ST data from mouse brain and kidney tissues, with the ST data coming from multiple platforms, in this case 10x Genomics' Visium and Slide-seq v.2, an open-source platform developed at the Broad Institute. In addition to reconstructing spatial patterns, CellTrek also identified subtle gene expression patterns that other deconvolution methods are more likely to miss.
After demonstrating CellTrek's capabilities across platforms, the MD Anderson team applied their technique to DCIS breast cancer samples, using 3' scRNA-seq and Visium, both from 10x Genomics, to build their sequencing and ST datasets.
CellTrek identified tumor subclones and their progenitor clones, showing the evolution of unique patterns within specific tumor regions and the tumor's extensive spatial heterogeneity by mapping cells to distinct ductal regions.
In a separate DCIS tissue sample, Navin's team extended its analysis into the tumor-immune microenvironment, mapping both tumor and immune cells and showing immune cell enrichment near DCIS regions.
As a downstream analysis method, CellTrek remains sensitive to earlier tissue manipulations, such as whether a sample is fresh or has been embedded in paraffin, with the latter leading to a degree of "bleeding" of nucleic acids between spots.
Similarly, technical limitations of the microarray being used to assign cell locations can affect CellTrek's output. Spots on the Visium platform, for instance, allow approximately 50 micrometers of diffusion space between spot centers. Attempts to reduce that distance, Navin said, have led to some trouble in unambiguously identifying individual spots.
"We hope a lot of people will use this tool," Navin said, "and that it will impact pretty diverse areas of biology and biomedicine."