By Monica Heger
Since the introduction of the first RNA sequencing protocol for second-generation sequencing several years ago, researchers and companies have developed their own protocols for strand-specific transcriptome sequencing, but these protocols had never been evaluated side-by-side. Now, in a paper published last week in Nature Methods, researchers from the Broad Institute have compared seven different protocols for strand-specific RNA-seq on the Illumina platform, determining "dUTP second-strand marking" to be the leading protocol, with Illumina's RNA ligation method a close second.
Joshua Levin, a research scientist in the Broad's Genome Sequencing and Analysis program and a first author of the paper along with Moran Yassour, said that while a multitude of methods have been developed for strand-specific RNA-seq, including some by the Broad Institute, the absence of any side-by-side comparison of the different protocols made it difficult for researchers to know which method would perform best. To do the evaluation, Levin and his team had to develop a suite of computational analysis tools, which he said would be useful for testing any additional RNA-seq protocols that are developed in the future.
The team tested seven methods: RNA ligation, Illumina RNA ligation, a method developed for RNA-seq on Life Technologies' SOLiD known as SMART, a hybrid between RNA ligation and SMART, bisulfite conversion, not not so random priming (NNSR), and the dUTP second-strand marking method.
Levin noted that the dUTP method, which came out on top in the evaluation, is "very clever." The approach, developed by researchers at the Max Planck Institute for Molecular Genetics in Berlin and described last year in Nucleic Acids Research, uses deoxyuridine triphosphate instead of deoxythymidine triphosphate to mark the second strand of cDNA as it is synthesized. Then, before the amplification step, an enzyme cuts the second strand at all the uracil bases, so only the first strand is amplified, maintaining strand specificity. Aside from strand specificity, the dUTP method provides more even coverage than some of the ligation-based methods. "It seems to be the right combination of steps," he added.
To test the different methods, Levin and his team prepared 11 different libraries based on the seven strand-specific protocols, including libraries based on two different variations of four of the protocols. They also prepared a non-strand-specific cDNA library to use as a control. Additionally, the researchers compiled published data from a library that used an eighth method known as the 3' split adaptor method.
The researchers used the Saccharomyces cerevisiae transcriptome as a benchmark, because it is well annotated, and tested each protocol using the Illumina Genome Analyzer.
They evaluated each method based on different criteria, including library complexity, strand specificity, evenness and continuity of coverage at annotated transcripts, performance at both 3' and 5' ends, and performance in expression profiling.
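Two of these criteria, library complexity and evenness of coverage, lend themselves to a small worked example. The sketch below is illustrative only and is not the authors' pipeline: it assumes complexity is measured as the fraction of reads with a distinct start position and evenness as the coefficient of variation of per-base coverage. The function names and the toy data are hypothetical.

```python
from statistics import mean, pstdev


def library_complexity(read_starts):
    """Fraction of reads whose (chromosome, position, strand) start is distinct.

    Assumed definition: values near 1.0 suggest few duplicate reads.
    """
    starts = list(read_starts)
    if not starts:
        raise ValueError("no reads supplied")
    return len(set(starts)) / len(starts)


def coverage_cv(per_base_coverage):
    """Coefficient of variation of coverage along a transcript.

    Assumed definition: lower values mean more even coverage.
    """
    coverage = list(per_base_coverage)
    average = mean(coverage)
    if average == 0:
        raise ValueError("transcript has no coverage")
    return pstdev(coverage) / average


# Hypothetical toy data: four reads, two of which share a start position.
toy_starts = [("chrI", 100, "+"), ("chrI", 100, "+"),
              ("chrI", 250, "+"), ("chrII", 40, "-")]
print(library_complexity(toy_starts))   # 0.75

# Hypothetical per-base coverage for an even and an uneven transcript.
print(coverage_cv([10, 10, 10, 10]))    # 0.0
print(coverage_cv([0, 20, 0, 20]))      # 1.0
```

In a real analysis the start positions and coverage values would come from aligned reads rather than hand-written lists.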
The dUTP library had the highest percentage of mapped paired-end reads, and four protocols — RNA ligation, Illumina RNA ligation, dUTP, and NNSR — all performed well in terms of strand specificity. The SMART method, which was developed for SOLiD but adapted for use on the Illumina GA, was the "least strand-specific method, by a wide margin," according to the authors.
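One simple way to put a number on strand specificity is the fraction of reads that map to the strand opposite an annotated gene. The sketch below assumes each read has already been assigned to a gene and oriented so that a correctly stranded read matches the gene's strand. The function name and the toy data are hypothetical, and the paper's exact metric may differ.

```python
def antisense_fraction(assignments):
    """Fraction of gene-assigned reads on the strand opposite the annotation.

    `assignments` is an iterable of (read_strand, gene_strand) pairs,
    each strand given as "+" or "-". A perfectly strand-specific library
    would score close to 0.0; an unstranded library close to 0.5.
    """
    pairs = list(assignments)
    if not pairs:
        raise ValueError("no reads assigned to annotated genes")
    antisense = sum(1 for read_strand, gene_strand in pairs
                    if read_strand != gene_strand)
    return antisense / len(pairs)


# Hypothetical toy libraries, 100 reads each, all on a "+" strand gene.
stranded_library = [("+", "+")] * 99 + [("-", "+")]
unstranded_library = [("+", "+")] * 50 + [("-", "+")] * 50

print(antisense_fraction(stranded_library))    # 0.01
print(antisense_fraction(unstranded_library))  # 0.5
```

Genuine antisense transcription also contributes to this fraction, so in practice it is an upper bound on protocol error rather than a direct measurement of it.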
The dUTP method also performed well in terms of accurate gene expression profiling and coverage at both the 5' and 3' ends, and provided the "most compelling overall balance across criteria," the authors reported. The Illumina RNA ligation protocol was a close second, but the dUTP method has the additional advantage of being compatible with paired-end sequencing, which the current Illumina RNA ligation method is not.
The authors tested only RNA-seq protocols that could be performed on the Illumina platform, and did not test protocols for the SOLiD or Helicos platforms. Levin said that if a different platform were to be used, the comparisons would likely not hold. "Each platform will have its own idiosyncrasies," he said.
Levin said that since doing the experiment, the dUTP method has become the Broad team's default method for RNA-seq experiments. While the group tested the methods only on RNA from yeast, he thought the results would hold true regardless of the sample's origin.
Because a comprehensive comparison of the different methods had never been done before, Levin said that when they began, the team did not know which one would yield the best results. But, as they were performing the experiments, it quickly became clear which methods were preferable. "After our initial experiments, we knew which ones we didn't like. If something is hard to do in the lab, that would not be ideal," he said. For instance, the RNA ligation protocol was the most labor-intensive and required the most starting material, while the bisulfite method was the "most computationally challenging."
The experiment also made it clear that having a "good computational pipeline in order to judge whether one method is better than another method seems to be lacking in a lot of other comparisons," Levin said. "When you introduce a new method, you don't always have the tools to know whether it is better or not."