SAN FRANCISCO (GenomeWeb) – Library prep kits for small RNA sequencing that make use of random sequence adapters may be less biased than more conventional kits that use fixed sequence adapters, according to researchers involved in the National Institutes of Health's Extracellular RNA Communication Consortium.
Describing their work this week in Nature Biotechnology, researchers from nine laboratories in the consortium evaluated four different commercial kits for small RNA-seq, as well as an in-house protocol developed by David Galas' lab at the Pacific Northwest Research Institute and different modifications of that protocol.
Maria Giraldez, co-lead author of the study and a postdoc in Muneesh Tewari's lab at the University of Michigan, said the consortium had been tasked with profiling RNA from blood but "soon realized that we don't know much about the methodology" and how it works in biofluid.
"RNA sequencing hasn't been developed for biofluids," which are very different sample types than tissue or cells, she said. "With blood and urine, the amount of RNA you have there is very little."
She said that her group at the University of Michigan is particularly interested in RNA sequencing methods to analyze cancer biomarkers in the blood. So, the group decided to first analyze various protocols in order to characterize how they worked for sequencing small RNA from blood and other biofluids.
The same sample prep kits that are used to sequence mRNA cannot be used for small RNAs due to the much smaller size of those molecules. Instead, an adapter needs to be added to the ends of the small RNA molecules to make them longer. There are two main types of small RNA library prep kits: one type adds a known, fixed sequence adapter and the other technique adds a random sequence adapter, called a degenerate base adapter, or 4N.
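As a rough illustration of what a degenerate adapter implies, the sketch below enumerates the possible random stretches. It assumes that "4N" denotes four degenerate positions, each of which can be any of the four bases, which the article does not spell out; the function name and the use of DNA letters are illustrative choices, not details from the study or any kit.

```python
from itertools import product

# Assumption: each degenerate position can be any one of four bases.
BASES = "ACGT"

def degenerate_variants(n_positions=4):
    """Return every possible sequence for a stretch of n degenerate positions."""
    return ["".join(combo) for combo in product(BASES, repeat=n_positions)]

variants = degenerate_variants(4)
# A fixed adapter presents one sequence to every RNA molecule;
# a 4N stretch presents a mixture of 4**4 different ones.
print(len(variants))   # 256
print(variants[:3])    # ['AAAA', 'AAAC', 'AAAG']
```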
The 4N protocol was originally developed in 2011 by researchers from Mount Sinai, who ultimately licensed the technology to Bioo Scientific, now owned by PerkinElmer.
Most currently available commercial kits use fixed sequence adapters. In the study, the researchers tested three such commercial kits: Illumina TruSeq, New England Biolabs NEBNext, and TriLink Biotech CleanTag. In addition, the researchers tested PerkinElmer's Bioo Scientific NextFlex, which is based on the Mt. Sinai team's 4N protocol, as well as several iterations of the 4N method, including one described in 2015 by researchers from the University of East Anglia and one developed by Galas.
In total, the consortium members sequenced 377 small RNA libraries. The researchers sequenced both control libraries that consisted of a pool of known synthetic RNAs and libraries of human plasma-derived RNA.
When sequencing the synthetic RNAs, the researchers noted that recovery of the known small RNAs, between 16 and 25 nucleotides, varied between protocols and that the biases seen were greater than those known to be present in long RNA-seq.
In addition, the biases seen were consistent between replicates within the same lab and between labs for the specific protocol used. "The libraries formed distinct clusters corresponding to the different protocols included in the study, indicating that the effect of the protocol bias is potentially greater than that of lab-to-lab variation," the authors wrote. As such, the 10 most represented small RNAs and the 10 most underrepresented RNAs varied between protocols.
However, although all protocols showed some bias, the biases were reduced in the 4N methods.
For instance, the researchers calculated the median percentage of sequences that had read counts that were 10 times above or 10 times below what would be expected. They found that for the TruSeq, CleanTag, and NEBNext protocols, which all used defined sequence adapters, those values ranged from 42 percent to 62 percent. But for the 4N protocols, the range was 3 percent to 22 percent.
In general, Giraldez said, the 4N protocols tend to have less bias because when random adapters are used, every RNA molecule "gets the same chance to be ligated, so the representation reflects better what's in your sample."
By contrast, with fixed adapters, "certain sequences will be very favored, while others will not be favored," she said.
"This is especially problematic in biofluids where the total amount is already really low," Giraldez added. "If there's a low amount of a certain microRNA and the adapter does not favor that microRNA, it may not be in the final library." When sequencing RNA from tissue, the problem isn't as pronounced because the total amount of RNA molecules is much higher. "Everything is more extreme in biofluids," she said.
Giraldez also said that she was not surprised that the protocols using fixed sequence adapters showed more bias than the so-called 4N protocols, given previous research that showed those protocols reduce bias, but "we were a little surprised by how huge the bias can be, how far from reality estimates for certain microRNAs can be because they are not that far off when sequencing longer RNAs."
However, she noted, one encouraging result was that all the protocols performed well when the goal was to compare the relative amount between two samples. For instance, she said, if a researcher were comparing a healthy sample with a cancer sample, the different library prep kits would all be able to accurately determine that the cancer sample has double the amount of a certain microRNA compared with the healthy sample. In addition, she said, the protocols were all reproducible among replicates and even between labs, as long as the protocols were followed to a T. "Even changing a small detail in a protocol can give different results," she noted.
In an email, Michael DeMayo, TriLink's vice president of commercial operations, noted that the study did not address other aspects such as the kits' ease of use and automation potential.
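One way to see why relative comparisons can survive a biased protocol is the toy model below. It rests on a simplifying assumption that is not taken from the study: that a protocol's bias acts as a fixed multiplicative factor per sequence, identical in both samples. Under that assumption the factor cancels when the two samples are divided. The microRNA names, abundances, and bias factors are all hypothetical.

```python
def observed_counts(true_abundance, bias):
    """Apply a per-sequence multiplicative bias (a modeling assumption)."""
    return {name: true_abundance[name] * bias[name] for name in true_abundance}

def fold_changes(sample_a, sample_b):
    """Ratio of sample_a to sample_b for each sequence."""
    return {name: sample_a[name] / sample_b[name] for name in sample_a}

# Hypothetical true abundances: miR-A doubles in cancer, miR-B is unchanged.
healthy = {"miR-A": 100, "miR-B": 400}
cancer = {"miR-A": 200, "miR-B": 400}

# Hypothetical protocol bias: miR-A is under-recovered 8x, miR-B over-recovered 4x.
bias = {"miR-A": 0.125, "miR-B": 4.0}

obs_healthy = observed_counts(healthy, bias)
obs_cancer = observed_counts(cancer, bias)

# Absolute estimates are far off, but the ratios match the true fold changes.
ratios = fold_changes(obs_cancer, obs_healthy)
print(obs_healthy)
print(ratios)
```

The sketch does not cover the case Giraldez raises separately, where a disfavored microRNA is so scarce that it drops out of the library entirely; a count of zero cannot be rescued by taking a ratio.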
Giraldez noted that indeed, the commercial kits that use fixed sequence adapters are "more user friendly" than the in-house 4N protocol, which is "longer and more tedious" — something researchers would have to take into consideration when designing their studies.
However, PerkinElmer, which sells the small RNA kit based on random adapters, does have an automated protocol available, according to Arvind Kothandaraman, director of the NGS product portfolio at PerkinElmer. The kit is automated for PerkinElmer's Sciclone G3, "improving workflow efficiency and reducing variability," he said in an email. He noted that the study "demonstrates the importance of reducing bias introduced during library preparation for small RNA sequencing."
Neither Illumina nor New England Biolabs provided comments about the study.
Giraldez said that her lab's next steps are to do miRNA sequencing from real samples, comparing cancer with healthy controls, looking for miRNAs and other small RNAs circulating in blood. For those studies, she said the group would most likely use one of the 4N protocols.
She added that the main point she hopes researchers take away from the study is the need for standardization regardless of which protocol is used. For instance, in a multi-center study, researchers should take care to ensure that "every small detail" is the same. In addition, she said, "I would love to see companies take the time to improve the commercial protocols."