NEW YORK (GenomeWeb) – A team of researchers at the European Molecular Biology Laboratory, European Bioinformatics Institute, and the Wellcome Trust Sanger Institute has developed a computational method to link sequence variation with transcriptional profiles in single cells.
In a study published online today in Nature Methods, the researchers used the tool, called TraCeR, to reconstruct paired T cell receptor (TCR) sequences from RNA sequence data derived from individual T lymphocytes.
The method is part of an effort to link transcriptional status with antigen specificity, the researchers noted, as well as model dynamics of clonal expansion and investigate T cell phenotypic plasticity.
The TraCeR tool "extracts TCR-derived sequencing reads for each cell by alignment against 'combinatorial recombinomes' comprising all possible combinations of V and J segments," the researchers wrote.
The tool assembles reads into contiguous sequences to find full-length, recombined TCR sequences. These span nearly the complete length of the V(D)J region, enabling the researchers to discriminate between closely related gene segments.
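As a rough illustration of the "combinatorial recombinome" idea, the Python sketch below enumerates every pairing of V and J segments and joins each pair into one reference sequence. It is not the TraCeR implementation: the function name build_recombinome, the run of N bases used as a junction placeholder, and the toy segment sequences are all assumptions made for the example.

```python
from itertools import product


def build_recombinome(v_segments, j_segments, junction="NNNNNN"):
    """Return {name: sequence} for every V/J pairing.

    Illustrative only: the junction placeholder (a run of N bases) is an
    assumption, not a detail taken from the study.
    """
    recombinome = {}
    for (v_name, v_seq), (j_name, j_seq) in product(
        v_segments.items(), j_segments.items()
    ):
        # One synthetic reference per V/J combination.
        recombinome[f"{v_name}_{j_name}"] = v_seq + junction + j_seq
    return recombinome


# Toy, made-up segment sequences; real ones would come from a reference database.
v_segments = {"V1": "ATGGCC", "V2": "ATGTTC"}
j_segments = {"J1": "GGTACC", "J2": "GGAAGC", "J3": "TTCGGA"}

recombinome = build_recombinome(v_segments, j_segments)
```

With two V segments and three J segments, the sketch produces six reference sequences, one per combination, against which reads could then be aligned.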
The tool can be applied to any single-cell RNA-seq data derived from full-length cDNA, and the group performed an analysis of Smart-seq data in conjunction with the Fluidigm C1 microfluidics system. EMBL-EBI, the Sanger Institute, and Fluidigm began a collaboration around single-cell genomics data analysis in late 2014. Smart-seq, a method to convert polyadenylated RNA into cDNA, which is then amplified and turned into sequencing libraries, is now exclusively supplied by Clontech Laboratories.
As a proof of principle, the team isolated 272 FACS-sorted CD4+ T cells from mouse spleen, and compared TCR sequences reconstructed by its method with those obtained using a multiplex PCR-based approach developed at Stanford University, finding good concordance. It also learned that a variety of sequencing depths and read types could be used to reconstruct the TCR sequences.
They next found that infection with Salmonella typhimurium resulted in clonal CD4+ cell expansion in mice, with the numbers of cells and clonotypes changing across cell types and time points after infection. The group concluded that the method likely detected the correct combinations of TCR recombinants within the cells because they observed multiple cells sharing all their recombinant sequences, including ones that were nonproductive.
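The reasoning about cells that share all of their recombinant sequences can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure used in the study: the function name group_by_shared_recombinants, the strict "identical set" criterion, and the cell and recombinant identifiers are all hypothetical.

```python
from collections import defaultdict


def group_by_shared_recombinants(cell_recombinants):
    """Group cells whose sets of recombinant identifiers are identical.

    Assumption for illustration: two cells are grouped only if they share
    every recombinant, productive or not. Groups of one cell are dropped.
    """
    groups = defaultdict(list)
    for cell, recombinants in cell_recombinants.items():
        # frozenset makes the recombinant set usable as a dictionary key.
        groups[frozenset(recombinants)].append(cell)
    return {
        key: sorted(cells) for key, cells in groups.items() if len(cells) > 1
    }


# Hypothetical cells and recombinant identifiers.
cell_recombinants = {
    "cell_a": ["alpha_1", "beta_7", "beta_9_nonproductive"],
    "cell_b": ["alpha_2", "beta_3"],
    "cell_c": ["beta_9_nonproductive", "alpha_1", "beta_7"],
}

expanded = group_by_shared_recombinants(cell_recombinants)
```

Here cell_a and cell_c fall into one group because they carry the same three recombinants, including the nonproductive one, while cell_b is left out because no other cell matches it.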
For independent component analysis from single-cell gene expression data, the researchers examined 14,889 informative genes, which they noted was far more than the 17 phenotyping genes used in the PCR-based approach.
They then selected sets of genes that enabled them to group the cells into four populations and determined the distribution of the expanded clonotypes. This revealed that "cells derived from the same progenitor could be seen throughout the activated differentiating, TH1 effector and effector memory populations," suggesting that the progeny of a particular CD4+ T cell that bound to a Salmonella antigen-MHC complex differentiated asynchronously.
"Members of one clonotype exist across the full spectrum of proliferation and differentiation states that occur during the Salmonella response," the researchers wrote.
The method will next be applied to B cells, as the researchers note it is "sensitive, accurate, and easy to adapt." It is currently too costly for surveys of the entire immune repertoire, but this may change as scRNA-seq throughput increases and costs decrease.