NEW YORK (GenomeWeb) – Researchers at the University of Michigan have demonstrated the advantages of a capture-based transcriptome sequencing protocol for clinical use on formalin-fixed paraffin-embedded (FFPE) tissues, compared to RNA-seq methods that rely on either poly A selection or rRNA depletion.
The team, which has been running the capture-based method alongside the more traditional poly A RNA-seq method for its clinical cancer sequencing pipeline since 2011, has now demonstrated that the capture-based method is as good as or better than the poly A approach on all metrics and plans to use only that method for future clinical cases.
The Michigan team described the method this week in Genome Research. Although capture pull-down prior to RNA-seq has been done in a research setting, the Michigan team was one of the first to implement it in a clinical setting from FFPE samples.
Senior authors Dan Robinson and Arul Chinnaiyan developed the method, lead author Marcin Cieslik told GenomeWeb.
Traditional transcriptome sequencing typically relies on either a poly A selection to enrich for mRNA prior to converting to cDNA or a ribosomal RNA depletion step.
These protocols struggle when RNA from FFPE samples is used because of the amount of crosslinking and degradation, Cieslik said. In addition, he noted, RNA nicking, in which RNA molecules are essentially cut up by enzymatic reactions, biases the poly A approach: if a long molecule is cut up, only the portion near the poly A tail will be enriched, causing a 3' bias.
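The 3' bias Cieslik describes can be pictured with a small simulation. The sketch below is a toy model, not the authors' analysis: it assumes each position in a transcript is nicked independently with a fixed probability (the hypothetical `nick_rate`) and that poly A selection keeps only the fragment still attached to the 3' end.

```python
import random

def three_prime_fragment_coverage(length, nick_rate, n_molecules, seed=0):
    """Per-base coverage after poly A selection of randomly nicked transcripts.

    Toy assumptions: nicks occur independently between adjacent bases with
    probability nick_rate, and only the fragment that still carries the
    poly A tail (the 3' end, index length - 1) is retained.
    """
    rng = random.Random(seed)
    coverage = [0] * length
    for _ in range(n_molecules):
        start = 0  # an intact molecule is retained in full
        # Scan from the 3' end for the nearest nick; everything 5' of it is lost.
        for pos in range(length - 1, 0, -1):
            if rng.random() < nick_rate:
                start = pos
                break
        for i in range(start, length):
            coverage[i] += 1
    return coverage

coverage = three_prime_fragment_coverage(length=200, nick_rate=0.01, n_molecules=500)
print("coverage at 5' end:", coverage[0])
print("coverage at 3' end:", coverage[-1])
```

In this model every retained fragment includes the 3' end, so coverage can only fall toward the 5' end, and it falls faster as the nick rate rises.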
The capture protocol is similar to the poly A selection method, except that instead of starting with a poly A selection, total RNA is first fragmented and converted to cDNA. After A-tailing, adaptor ligation, and several cycles of PCR, an overnight exome capture is performed using exon-targeting RNA probes. After washing and several more rounds of PCR, the library is ready for sequencing.
The Michigan team compared the capture protocol to the poly A method for intact RNA. Both libraries had high alignment rates and strandedness, and for both protocols, 95 percent of all aligned fragments overlapped with known exons. The poly A protocol was slightly better at removing rRNA — with only 1 percent of fragments from rRNA as opposed to about 10 percent in the capture library.
The two methods were also comparable in their ability to detect SNVs and gene expression. There were no differences in their abilities to detect protein-coding genes, although the capture library detected more reads from long noncoding RNA, an emerging biomarker.
Next, the researchers validated the capture method for its ability to estimate absolute gene expression. Cieslik said that the group initially thought that the capture-based method would not be able to quantify gene expression, particularly for genes that might be extremely overexpressed in a cancer transcriptome.
"We thought if a transcript is highly expressed, we might have more molecules than capture probes, so we'd saturate the capture probes and lose the dynamic range," he said.
However, the team found that was not the case. Testing all three protocols, they first determined that each had good reproducibility and agreement across a range of gene expression levels. A small number of genes were underexpressed in the capture library, but those were genes that had been poorly captured, the authors wrote.
Looking at some of the most differentially expressed genes, they found "no evidence of saturation," the authors wrote. In addition, capture efficiency was not biased by GC content.
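For illustration, the saturation scenario the team had initially worried about can be sketched with a toy model. The hard cap below (`probe_capacity`) and the molecule counts are hypothetical values chosen for the example, not figures from the paper; real hybridization kinetics are more gradual than a simple `min`.

```python
def captured_molecules(input_molecules, probe_capacity):
    """Hypothetical hard-cap model: probes pull down at most probe_capacity molecules per gene."""
    return min(input_molecules, probe_capacity)

def observed_fold_change(high, low, probe_capacity):
    """Fold change between two expression levels as seen after capture."""
    return captured_molecules(high, probe_capacity) / captured_molecules(low, probe_capacity)

true_fold_change = 100_000 / 100  # a 1,000-fold overexpressed transcript

# Probes in excess: the full dynamic range survives capture.
print(observed_fold_change(100_000, 100, probe_capacity=1_000_000))  # 1000.0

# Probes saturated by the abundant transcript: the range is compressed.
print(observed_fold_change(100_000, 100, probe_capacity=10_000))  # 100.0
```

If capture had saturated in practice, highly expressed genes would have appeared compressed in this way relative to the other protocols, which is the pattern the authors report not seeing.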
The results "suggest that exome-capture RNA-seq provides precise and largely unbiased estimates of gene expression for the majority of captured genes," the authors wrote.
The team next tested performance on low-quality RNA samples, and found that SNV calling got worse with increasingly degraded RNA for the poly A protocol, but not for the capture protocol. In addition, the capture protocol had better coverage of splice junctions and could detect gene fusions better than the poly A approach. Both protocols could detect a cancer-associated gene fusion, TMPRSS2-ERG, when it was highly expressed, at varying levels of RNA quality and degradation. But when expression of the fusion was repressed, it was only reliably detected in the capture protocol.
Finally, the researchers confirmed their findings in actual clinical cases, testing both fresh frozen and FFPE samples from 13 prostate cancer cases. The group sequenced a total of 29 samples divided into three types of libraries: capture FFPE, capture frozen, and poly A frozen. The main goal was to determine whether capture FFPE enabled precise estimates of gene expression and was a substantial improvement over poly A in frozen samples.
Similar to the cell line experiments, the team found that SNV calling was more sensitive in capture libraries than poly A libraries from frozen samples, and they identified more candidate gene fusions in the capture libraries. Importantly, the capture library from one frozen sample identified a gene fusion involving the ETS gene family in four out of eight patient samples, while the poly A library only detected ETS fusions in three.
Evaluating performance from FFPE samples, the researchers compared the capture protocol to an rRNA depletion protocol. For all patients, the capture protocol identified more putative fusions. A known oncogenic fusion was detected in only three of the nine rRNA depletion libraries.
Cieslik said that the Michigan team plans to drop the poly A RNA-seq approach from its clinical pipeline and use only the capture-based approach. Aside from the method's superior performance, he said, the switch would also help cut costs, since one fewer library will be made per patient. Typically, between four and six sequencing libraries are made for each patient, he said, including a tumor and a normal exome library, a poly A RNA-seq library, and a capture-based RNA-seq library. Dropping one of those four could reduce costs by up to 25 percent, he said.
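The "up to 25 percent" figure follows from simple arithmetic if every library is assumed to cost about the same, which is a simplification made here for illustration and not a claim from the team. Under that assumption, the saving from dropping one library across the range of library counts Cieslik cites works out as follows:

```python
def savings_fraction(libraries_per_patient, libraries_dropped=1):
    """Fraction of library cost saved, assuming every library costs the same."""
    return libraries_dropped / libraries_per_patient

for n in (4, 5, 6):
    print(f"{n} libraries per patient: {savings_fraction(n):.1%} saved")
# 4 libraries per patient: 25.0% saved
# 5 libraries per patient: 20.0% saved
# 6 libraries per patient: 16.7% saved
```

The 25 percent ceiling corresponds to the four-library case; patients with more libraries would see a proportionally smaller saving.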
To date, the team has already run over 500 clinical capture transcriptomes, he said. "It's a very routine part of our workflow."
One drawback to the method, however, is that it relies on an overnight hybridization, so library preparation takes somewhat longer than with the poly A protocol.
At least one other clinical cancer sequencing lab has also adopted the capture-based RNA-seq method. Sameek Roychowdhury, an assistant professor of medicine at Ohio State University's Comprehensive Cancer Center, is using the method in a precision cancer medicine clinical trial. "It's a great strategy," he told GenomeWeb. Roychowdhury previously trained in Chinnaiyan's lab at the University of Michigan and learned of the method there.
As part of Ohio State's trial, eligible patients are first screened via a targeted NGS panel for clinical decision making, and if no actionable results are found in the panel, the sample is sent through a research pipeline where it undergoes exome and transcriptome sequencing.
Roychowdhury said that the main benefit of the capture-based method is that it "allows you to rescue low-quality or degraded samples" that might not otherwise yield results. In addition, it enables targeting of specific genes. In the Genome Research paper, the Michigan team demonstrated the approach on the entire exome, but it is also possible to use probes targeting an even smaller subset of genes, Roychowdhury said. While the strategy is not brand new, he said, the paper does a nice job of illustrating its "applications in the clinic for cancer and potentially other diseases as well, to focus your testing or to rescue degraded samples."