Members of a methods testing and development group at the Broad Institute have published a new comparison of RNA sequencing methods that focuses on samples with low RNA quality and/or quantity.
"The 'easy' samples, people already have solutions for," co-senior author Joshua Levin, with the Broad Institute's Genome Sequencing and Analysis Program, told In Sequence. "But people are always coming to us with other samples that they want to work on, so we were trying to figure out which method would be the best one to use."
As they reported online this weekend in Nature Methods, Levin and his colleagues looked mainly at five RNA sequencing methods: Epicentre's ribosomal RNA removal kit (called Ribo-zero); a duplex-specific nuclease-based approach known as DSN-lite; a method that uses the endoribonuclease enzyme RNase H to remove abundant RNA such as ribosomal RNA; NuGEN's Ovation RNA sequencing system; and Clontech's SMART-seq kit (CSN 7/25/2012).
The group tested all but the SMART-seq approach on chemically degraded RNA samples, while the SMART-seq and NuGEN methods were tested on samples with lower-than-usual RNA levels. The NuGEN approach alone was used to assess a sample plagued by both low-input and low-quality RNA.
Results from experiments done with RNA from a chronic myeloid leukemia cell line suggested that the RNase H approach was best suited to deal with low-quality RNA, followed by Ribo-zero — a conclusion supported by follow-up experiments on clinical or biological samples containing poor-quality RNA. On the other hand, the SMART-seq and NuGEN methods stood out for producing the best results from samples with low RNA input levels.
Levin noted that there have been a few new RNA sequencing approaches introduced onto the market or in the literature since the group's Nature Methods paper went to press, though the analysis "was comprehensive at the time we did it."
Some other RNA sequencing strategies weren't tested in the new study since close scrutiny of those methods suggested they likely weren't compatible with degraded or low-quantity RNA.
The head-to-head comparisons were motivated by a desire to support a range of studies that rely on accurate RNA sequencing of samples with degraded RNA, low amounts of RNA, or both.
For instance, Levin noted that he and his colleagues are collaborating with researchers interested in early embryology who are keen to profile the RNA contents of embryos composed of just a handful of cells.
"There's a lot of interesting biology going on," Levin said, "but there's not very much RNA there to analyze."
On the other hand, Broad researchers participating in the ongoing Genotype-Tissue Expression, or GTEx, project — an effort to profile RNAs in a large number of warm autopsy samples across many individuals — have been faced with samples containing low-quality RNA.
So far the GTEx group has focused its attention on samples with suitable RNA integrity. But Levin noted that results from the new analysis suggest that some of the samples that weren't suitable for testing in the past might be amenable to RNA sequencing strategies that involve RNase H treatment — a strategy that came out on top of the heap in his group's analyses of samples with low-quality RNA.
For that comparison, researchers looked at chemically fragmented RNA stemming from a human chronic myeloid leukemia cell line called K-562. After preparing complementary DNA libraries using the RNase H, Ribo-zero, DSN-lite, and NuGEN methods, they used Illumina's HiSeq 2000 to do paired-end sequencing on each library, generating more than 75 million reads apiece.
The team used RNA from the same cell line to produce SMART-seq and NuGEN libraries for experiments aimed at evaluating methods for dealing with exceptionally low RNA input levels.
Libraries were also produced using Illumina's TruSeq oligo(dT) selection strategy for the low-quantity experiments.
To further aid their comparisons, investigators made two libraries from high-quality and high-quantity RNA, using oligo(dT) selection to enrich for mRNA in one of the libraries. The other control library represented the total RNA content of the K-562 cell line, including rRNAs.
Researchers then used a set of carefully selected criteria to see how the methods performed on a range of tasks — from the efficiency of rRNA depletion in libraries produced using each method to the gene annotation and gene expression profiles ascertained from those libraries.
"The metrics that we chose are ones that might matter in specific kinds of experiments or in all experiments," Levin noted.
Whereas some investigators are primarily concerned with gene expression information, for example, others may need to see the fine details of transcript splicing and annotation.
RNase H performed best across the criteria considered when looking at samples with low-quality RNA, the researchers reported.
The Ribo-zero approach, which likewise removes rRNA at the beginning of the sample preparation process, albeit by a different mechanism than RNase H treatment, also did quite well in that analysis, though the study authors noted that the kit is pricier and appears to require slightly deeper sequencing.
"Ribo-zero performed similarly to RNase H by many metrics; as such, Ribo-zero might be acceptable for researchers who prefer to use a kit or have only a few samples," they wrote.
The RNase H and Ribo-zero methods proved useful for sequencing RNA from a formalin-fixed, paraffin-embedded kidney sample and a pancreas sample, too, suggesting these methods perform well on actual clinical and biological samples.
When dealing with low-quantity RNA samples, meanwhile, the researchers saw SMART-seq and NuGEN in a dead heat for the top spot.
Both performed well, Levin said, but there are differences between the approaches that may make one or the other better suited to a particular study — considerations that he and his colleagues discussed in the supplemental material accompanying their paper.
For example, results from the current analysis indicated that SMART-seq libraries tend to perform poorly for samples in which guanine and cytosine nucleotides are especially prevalent, while that GC bias was less pronounced in samples sequenced after NuGEN treatment.
The team suspects that may have something to do with the first PCR amplification step involved in SMART-seq, though they have not yet published results to support that notion.
Finally, results from the analysis suggest that challenges remain for those interested in looking at samples characterized by both low levels of RNA and poor RNA quality.
The group's experiments using NuGEN indicated that the method can provide some information on these doubly complicated samples, though performance across the metrics considered was generally lower than that found for NuGEN libraries created with fragmented but more plentiful RNA samples.
For the most part, each method required a comparable time commitment, which was one to a few hours longer than standard RNA sequencing library preparation steps. But the DSN-lite approach — a duplex-specific nuclease enzyme method that ousts rRNA at the end of the sample preparation process by targeting especially abundant cDNAs in the library — took a day longer.
On the other hand, commercial kits tend to be pricier, Levin explained. That increased cost-per-sample may not be an issue for labs dealing with relatively few samples, he noted, though such price differences are expected to impact centers doing RNA sequencing on hundreds or thousands of samples.
"If you're just doing five or 10 samples, then it's not that big a deal," he said. "But if you're doing a lot of samples, the costs start to add up."
The group is continuing to compare RNA sequencing methods as they are published and/or introduced into the market. Going forward, Levin noted, there is interest in finding ways to do RNA sequencing with low-quality RNA samples that have increasingly small amounts of RNA.
"In this paper, we tried RNase H with 1 microgram or 1000 nanograms of total RNA," Levin said. "But the challenge is, 'Can you use it with less and how low can you go?'"
"The other challenge with low quantity is then to see how well these methods work for single cells," he noted.
The use of SMART-seq on single cells has already been described. The group hasn't yet tried using the NuGEN method to sequence RNA from single cells, nor are there published examples of NuGEN being used for that application.
Even so, Levin and colleagues published a Nucleic Acids Research study last year outlining experiments that used NuGEN to sequence low-input viral RNA. While it's still difficult to measure RNA concentrations definitively, results from that analysis indicated that NuGEN is useful for sequencing RNA in samples with low RNA content approaching that of some individual cells.
In the meantime, some researchers at the Broad have been using a combination of SMART-seq and Fluidigm's C1 machine to do single-cell RNA sequencing.
In general, though, it still remains to be seen which, if any, of the approaches tested in the current analysis will make their way into production sequencing at the Broad.
While the methods group behind the analysis found advantages to RNase H for samples with low-quality RNA, for instance, that group is not directly involved in production sequencing at the center, Levin explained.
So while the sequencing team is believed to be considering the RNase H method as an option for more routine sequencing of samples with low-quality RNA, it's not currently part of a sequencing pipeline available to everyone at the Broad.
"While we would say that the RNase H method is the method that we would want to use in every case of low-quality samples with sufficient material available, the production group would have to implement that," Levin noted.
"They necessarily lag behind," he added. "Our group has a lot more flexibility because we don't have to build a production process."