By Monica Heger
Despite numerous advances in sequencing technology, characterizing the T-cell receptor repertoire remains difficult, according to several recent studies.
In one of the most comprehensive T-cell sequencing studies to date, researchers from the BC Cancer Agency found that there are at least 1 million clonotypes of T-cell receptors per individual. Furthermore, in different samples from the same person, there is only about a 13 percent overlap between T-cell sequences, and when sequences from two different people are compared, the overlap drops to just 1 percent, indicating that next-generation sequencing of the immune repertoire for clinical applications is still difficult.
Similarly, researchers from St. Jude Children's Research Hospital in Memphis, Tenn., found that the error rate for next-gen sequencing of T-cell receptors is high, although they concluded that with better filtering methods, it could be reduced.
T-cell receptor sequencing is "a long way from being used clinically," said Robert Holt, head of sequencing at the BC Cancer Agency's Genome Sciences Center and senior author of the group's paper, which was recently published in Genome Research. In the future, it "might be used as a diagnostic," but first it would have to be "more reliable and faster," he said.
The team obtained blood samples from two unrelated donors at two different time points. Total RNA was isolated and reverse transcribed with a primer specific for the two conserved TCR beta chain C genes. A 5’ priming site was then added to cDNA molecules during reverse transcription by template switching. The TCRB sequence was then amplified and sheared, leaving only the distal part of the V gene, the informative CDR3 sequence, and the J segment intact and creating template strands 130 to 180 base pairs long.
From the first time point of the first individual, the team generated 142.1 million raw read pairs from paired-end sequencing on the Illumina Genome Analyzer, yielding 181,258 distinct TCRB sequences. While statistical analysis predicted that deeper sequencing would not yield many new sequences, when the team sequenced a second library from the same sample, they found that 74.8 percent of the filtered TCRB sequences were novel.
"Our data clearly show that a single library may not adequately capture the diversity of a biological sample," the authors wrote.
So the team constructed 10 additional libraries, generating 632.7 million pairs of raw reads, equivalent to 50-fold coverage of the human genome, "focused on just this one hypervariable 100-base-pair region," said Holt. "At that level of sequencing, you can be fairly certain you captured the diversity present in the sample," he said.
Nevertheless, when the team sequenced a sample obtained from the same individual about one week later, they found mostly new sequences, with only 13 percent shared between the two time points.
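As a rough illustration of what a "percent shared" figure measures, the sketch below computes the fraction of distinct sequences in one sample that also appear in another. The function name, the choice of the first sample as the denominator, and the toy CDR3 strings are all assumptions made for this example; the article does not say how the BC Cancer Agency team defined or computed its overlap.

```python
def shared_fraction(sample_a, sample_b):
    """Fraction of distinct sequences in sample_a that also occur in sample_b.

    Using sample_a as the denominator is an assumption; the study's
    exact overlap definition is not given in the article.
    """
    a, b = set(sample_a), set(sample_b)
    if not a:
        return 0.0
    return len(a & b) / len(a)


# Invented toy sequences, for illustration only (not data from the study).
week0 = ["CASSLGQGYEQYF", "CASSPTGGNTEAFF", "CASRDRGNQPQHF", "CASSQETQYF"]
week1 = ["CASSLGQGYEQYF", "CASSFSTDTQYF", "CASSLAGGYNEQFF"]

print(f"{shared_fraction(week0, week1):.0%} of week-0 sequences seen again")
```

Note that the measure is asymmetric: swapping the two samples changes the denominator and can change the result.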
Holt said the study highlights the immense diversity found in the immune repertoire, as well as the technical challenges associated with sequencing such a variable region of the genome.
For one thing, it seems that there is "not a way to directly measure total diversity," he said. However, he added, the study did succeed in establishing a new limit on the amount of diversity present. "It's quite clear from this study, where we've [aimed to] sequence to exhaustion, that [the number of clonotypes] is definitely not the trillions that are theoretically possible, but more likely a few million."
One of the main technical challenges in sequencing the immune repertoire is that it is nearly impossible to distinguish between very rare clonotypes and sequencing errors, he said. Additionally, "when you sequence very deeply, you get an accumulation of these sequencing errors, which can be misleading."
Terrence Geiger, the medical director of clinical pathology at St. Jude Children's Research Hospital, who led a study that attempted to characterize the errors produced from immune repertoire sequencing, agreed that distinguishing between the rare clonotypes and sequencing errors is very difficult.
T-cell receptors are present at very different frequencies, he said. "So when you sequence and see a rare event, it's difficult to know whether that's a mis-sequence or if it's real." However, filtering techniques can reduce erroneous sequences, he added.
In his study, published in BMC Genomics, he found that T-cell receptor sequencing on the Illumina GA has an error rate of between 1 and 6 percent. His team then characterized those errors in an attempt to filter out the majority of them.
Most of the errors were introduced by the sequencing process itself rather than by any of the sample-prep steps, he said. Interestingly, different sequencing lanes had different error rates, and certain types of errors were more frequent than others. Transversions, substitutions between a purine base and a pyrimidine base, made up the majority of base substitution errors, while all other substitutions had very low error rates, he said.
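For readers unfamiliar with the terminology, a substitution can be classified by checking whether the two bases belong to the same chemical class. The short sketch below does this for DNA bases; the function name is hypothetical and the code is not drawn from the St. Jude study.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_type(ref, alt):
    """Classify a single-base DNA substitution.

    Transition: purine <-> purine or pyrimidine <-> pyrimidine.
    Transversion: purine <-> pyrimidine.
    """
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == alt:
        raise ValueError("ref and alt are identical; not a substitution")
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


print(substitution_type("A", "G"))  # both purines
print(substitution_type("A", "C"))  # purine to pyrimidine
```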
He also found that examining the ratio of forward to reverse reads helped find errors. Typically about half the sequence reads would be forward and half reverse, but errors tended to be biased in one direction or the other, he said.
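One way such a strand-ratio check could be turned into a filter is sketched below, using an exact two-sided binomial test against an expected 50/50 split. This is an assumption for illustration only: the article says the team examined the forward-to-reverse ratio but does not describe their statistical method, and the function names and the 0.01 cutoff are hypothetical.

```python
from math import comb


def strand_bias_pvalue(forward, reverse):
    """Two-sided exact binomial p-value for a 50/50 forward/reverse split."""
    n = forward + reverse
    if n == 0:
        return 1.0  # no reads, no evidence of bias
    k = min(forward, reverse)
    # Probability of a split at least this lopsided in one direction.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    # Double for the two-sided test, capped at 1.
    return min(1.0, 2 * tail)


def looks_strand_biased(forward, reverse, alpha=0.01):
    """Flag a sequence whose reads lean heavily toward one strand."""
    return strand_bias_pvalue(forward, reverse) < alpha


print(looks_strand_biased(12, 8))  # mild imbalance
print(looks_strand_biased(20, 0))  # all reads from one strand
```

A sequence seen only a handful of times can never reach significance under this test, so a filter like this would need to be combined with other criteria for very rare clonotypes.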
"It's not easy to monitor T-cell repertoires," he said. "We really need some more data to find out whether it will have [clinical] applications." Nevertheless, he said that with careful filtering methods "it is feasible to accurately sequence the T-cell repertoire."
Despite the challenges of sequencing immune repertoires, several companies are offering it as a service and are developing diagnostics based on immune repertoire sequencing, including Sequenta, Adaptive TCR, and iRepertoire (IS 12/14/2010, 8/31/2010, and 7/20/2010).
Chris Carlson, an assistant member of the Fred Hutchinson Cancer Research Center and co-founder of Adaptive TCR, said that despite the apparent challenges illustrated by the studies, sequencing the immune repertoire for clinical purposes is still feasible for some applications.
"We're ready for the low-hanging fruit," such as determining an individual's response to a particular stimulation such as an infection or vaccine, he said. For those applications, the high-frequency clones will be important, not the rare ones, he said.
However, he added, "there will be studies where what matters is in the rare overlap between individuals, and that's more challenging." The technology is also not yet ready to be used to make a clinical decision in real time, he said, but it is "possible that will be feasible in the not-too-distant future."
Holt said the keys to making immune sequencing applicable in a clinical setting are improving accuracy and turnaround time and developing a reproducible assay.
"Anything used clinically for patient care needs to have all the error modes defined, and generate reproducible data at a reasonable cost," he said.
Have topics you'd like to see covered by Clinical Sequencing News? Contact the editor at mheger [at] genomeweb [.] com.