NEW YORK (GenomeWeb) – Oxford Nanopore Technologies has developed a method to do direct RNA sequencing on its MinIon nanopore sequencing device. Company researchers described the protocol in a publication on the bioRxiv site last week.
The company plans to release early developer kits for the protocol this year and is inviting interested customers to give feedback on the types of applications they would like to see available, Daniel Turner, senior director of applications at Oxford Nanopore and senior author of the bioRxiv publication, told GenomeWeb.
Typically, RNA sequencing involves synthesizing a complementary DNA strand and a library prep method that uses PCR. The use of PCR introduces biases, such as PCR duplicates that can be difficult to distinguish from true RNA species, and causes epigenetic information to be lost.
In Oxford Nanopore's direct RNA sequencing approach, the company made a few changes to library prep from its DNA sequencing protocol. First, because the RNA is single-stranded rather than double-stranded, the transposase-based method it uses in its library prep to attach adapters cannot be used. So instead, the company researchers had to use a ligation-based approach, Turner said.
In the study, the researchers started with poly A RNA, ligating an RNA adapter, a reverse transcriptase splint, to the poly A tail. In addition, the company only does 1D sequencing for RNA, since it is single-stranded. Turner said that while a 2D protocol is possible in theory, it would rely on "joining the RNA onto its cDNA strand." And finally, another major difference is that the RNA is sequenced in the reverse direction.
In the library prep, after ligating the RT splint adaptor onto the poly A tail of the RNA, the company performs reverse transcription, creating a cDNA molecule. It then ligates sequencing adaptors and a motor protein onto the mRNA/cDNA complex. Although cDNA is created in the library prep, it is not sequenced.
Turner said that it is possible to do direct RNA sequencing without the cDNA step, but the researchers found that synthesizing cDNA improved the performance, even though it was not sequenced. The study authors wrote that the cDNA helped improve throughput, "possibly by reducing intramolecular secondary structure of the RNA." Turner said that the company is also working on a protocol that gets rid of the cDNA synthesis step while maintaining the improved performance, including "having cleaner adapters, getting rid of RNA that does not have adapters, and running the motor [protein] faster."
After the sequencing adaptors are ligated to the mRNA/cDNA complex, the mRNA is sequenced. The RNA molecule is sequenced in the reverse direction — 3' to 5' — since the initial adaptors were ligated to the poly A tail. Turner said the company would continue to study the possibility of sequencing from the 5' end as well, which he said should be possible. However, one advantage of going in the reverse direction is that the group can design adaptors to attach to the known poly A sequence.
The current levels caused by RNA in the pore differ slightly from the DNA current levels, Turner said, so the company had to train the basecaller to recognize those different current levels. He said the researchers also had to reverse the called sequence, since sequencing was done in the reverse direction. For basecalling, the researchers used a hidden Markov model, but Turner said that the firm intends to move from that model to a recurrent neural network-based approach.
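The reversal step Turner describes can be illustrated with a minimal Python sketch. It assumes the basecaller emits bases in the order they pass through the pore (3' to 5'), so the read has to be flipped to the conventional 5'-to-3' orientation. The function name and the example read are hypothetical and are not taken from the company's software or the bioRxiv paper.

```python
def to_five_prime_first(called_3p_to_5p: str) -> str:
    """Reverse a base string called in 3'-to-5' order so it reads 5' to 3'.

    This is a plain reversal, not a reverse complement: the pore reads the
    RNA strand itself, so the bases stay the same and only their order flips.
    """
    return called_3p_to_5p[::-1]


# Hypothetical read: the poly A tail enters the pore first, so it is called first.
raw_call = "AAAAAAGUCCGAUG"
print(to_five_prime_first(raw_call))  # prints "GUAGCCUGAAAAAA"
```

After reversal, the poly A tail sits at the end of the read, as it would in a conventionally written transcript sequence.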
In the study, the researchers demonstrated the method on a human rhinovirus sample, preparing a 1D RNA template from the 7.5-kilobase single-stranded RNA genome. They also showed that direct RNA sequencing can detect base modifications. They sequenced RNA transcripts with known m6A base modifications and transcripts without any modifications, showing that direct RNA sequencing could distinguish between the fully modified and fully unmodified transcripts.
Turner said that the company will continue to make improvements to the method. For instance, he said, company researchers are screening different motor proteins to "find ones with the best combination of smooth movement, high processivity and fast processing speed." In addition, he said, researchers are testing different mutations in the pore to find the ones that give well-separated current levels. Ideally, the company would use the same pore for both DNA and RNA sequencing, Turner said, "but if it turns out that the best pore for RNA sequencing is different from the best pore for DNA sequencing, we would want to make both available to users."
Direct RNA sequencing on the MinIon initially will be most useful for looking at splice variation. Down the road, Turner said, expression profiles could be generated on the higher-throughput PromethIon system. Other applications include doing 16S rRNA profiling of metagenomic samples to identify species from mixed samples. These 16S rRNAs "are very abundant, and the RNA prep is quick, so this method would be a rapid way to do metagenomic species ID," Turner said.
In 2009, Helicos BioSciences developed a direct RNA sequencing approach that does not rely on first converting RNA to cDNA; SeqLL now offers it as a service. However, that approach has very short read lengths, between 20 and 50 bases. In addition, Turner disputed Helicos' description of the method as direct RNA sequencing, since it is a sequencing-by-synthesis method that reads the DNA bases added to the RNA template rather than the RNA template itself.
In addition, Pacific Biosciences has previously said that it is developing a direct RNA sequencing approach. Its current Iso-Seq method relies on PCR, but it has already demonstrated the power of having very long reads for identifying novel isoforms and sequencing full-length transcripts.
Direct RNA sequencing on the MinIon, and eventually the PromethIon, has the potential to build on these other approaches, since it will have the advantage of long reads as well as a PCR-free protocol.