NEW YORK (GenomeWeb) – A team of German researchers has developed a new sequencing protocol to capture transiently expressed noncoding RNAs.
The approach, dubbed transient transcriptome sequencing or TT-seq, is a variation on 4sU-seq that attempts to account for that approach's 5' bias. As the Max Planck Institute for Biophysical Chemistry's Patrick Cramer and his colleagues reported in Science today, TT-seq was able to recover messenger RNAs and long intergenic noncoding RNAs as well as the more fleetingly present enhancer, antisense, and promoter-associated RNAs in a human cell line.
"TT-seq has afforded insights into the determinants of human genome transcription and provides a complementary tool for transcriptome analysis," Cramer and his colleagues wrote in their paper.
Like 4sU-seq, TT-seq relies on exposing cells to the nucleoside analog 4-thiouridine, which then becomes incorporated into RNA strands during transcription to yield 4sU-labeled RNAs. Those labeled RNAs can then be isolated and sequenced.
But since only a short 3' stretch of nascent transcripts is labeled during 4sU exposure, 4sU-seq doesn't map transcripts uniformly. To get around this problem, the researchers added an RNA fragmentation step to TT-seq that takes place after 4sU exposure but before the labeled fragments are isolated for sequencing. In this way, the researchers say, TT-seq measures newly transcribed RNA fragments uniformly.
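The effect of the fragmentation step can be illustrated with a toy model. The sketch below is not the researchers' analysis: it assumes evenly spaced polymerases, a constant labeled stretch per transcript, and ideal fragmentation in which the purified fragment coincides exactly with the labeled stretch. The function name and all parameter values are hypothetical.

```python
def simulate_coverage(gene_len=1000, labeled_len=100, step=10, fragment=True):
    """Per-base coverage of purified labeled RNA on a toy gene.

    Hypothetical model: each polymerase labels `labeled_len` bases of its
    transcript during the 4sU pulse. All values are illustrative only.
    """
    coverage = [0] * gene_len
    # pol_start is the polymerase position when labeling begins; negative
    # values stand for polymerases that initiate during the labeling window.
    for pol_start in range(-labeled_len, gene_len, step):
        labeled_from = max(pol_start, 0)
        labeled_to = min(pol_start + labeled_len, gene_len)
        if labeled_to <= labeled_from:
            continue  # nothing on the gene was labeled
        # With fragmentation, only the labeled piece is purified. Without it,
        # the whole transcript, including its unlabeled 5' region, comes along.
        first = labeled_from if fragment else 0
        for pos in range(first, labeled_to):
            coverage[pos] += 1
    return coverage


tt_like = simulate_coverage(fragment=True)    # fragmentation before purification
foursu_like = simulate_coverage(fragment=False)  # no fragmentation

print("with fragmentation:   5' end", tt_like[0], "3' end", tt_like[-1])
print("without fragmentation: 5' end", foursu_like[0], "3' end", foursu_like[-1])
```

Under these assumptions, coverage is flat along the gene when fragments are purified, while purifying whole transcripts piles reads up toward the 5' end.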
Cramer and his colleagues compared the two methods in human K562 cells, and found that TT-seq indeed covered newly transcribed regions uniformly while 4sU-seq exhibited a 5' bias. Further, they estimated the coverage of short-lived introns with respect to exons to be 60 percent for TT-seq, while it was 23 percent and 8 percent for 4sU-seq and RNA-seq, respectively.
"TT-seq is highly reproducible and enables complete mapping of transcribed regions," the researchers wrote, further adding that TT-seq complements other approaches like GRO-cap and CAGE that detect the 5' ends of RNA.
Using both TT-seq data and the segmentation algorithm GenoSTAN, Cramer and his colleagues identified 21,874 stretches of uninterrupted transcription, or transcriptional units (TUs), in the cell line. They also noted that TT-seq is sensitive, as it recovered 65 percent of the transcription start sites obtained via GRO-cap.
About 8,500 of these TUs overlapped with GENCODE annotations, the researchers noted. These included 7,810 mRNAs, 300 long intergenic noncoding RNAs, and 430 antisense RNAs. The remaining 10,400 TUs are likely newly detected noncoding RNAs, the researchers said.
They further characterized those remaining 10,400 TUs using GenoSTAN and ENCODE ChIP-seq data, and found that they include 685 upstream antisense RNAs, 778 convergent RNAs, 3,115 opposite-strand asRNAs, 2,580 short intergenic ncRNAs, and 3,257 TUs from enhancer state regions.
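As a quick check on the arithmetic, the category counts reported above can be tallied. This is only a sketch using the rounded figures quoted in this article; the dictionary names are illustrative.

```python
# Counts as quoted in this article (rounded where the article rounds them).
annotated_tus = {
    "mRNA": 7810,
    "lincRNA": 300,
    "antisense RNA": 430,
}
unannotated_tus = {
    "upstream antisense RNA": 685,
    "convergent RNA": 778,
    "opposite-strand asRNA": 3115,
    "short intergenic ncRNA": 2580,
    "enhancer-state TU": 3257,
}

annotated_total = sum(annotated_tus.values())
unannotated_total = sum(unannotated_tus.values())

print("GENCODE-overlapping TUs:", annotated_total)
print("Remaining TUs:", unannotated_total)
```

The two totals come to 8,540 and 10,415, in line with the roughly 8,500 and 10,400 TUs cited.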
The researchers also estimated that mRNAs and lincRNAs had the highest synthesis rates and that the median mRNA half-life was about 50 minutes, though previous estimates had placed that at closer to 139 minutes. Other transcript types, they noted, had lower synthesis rates and shorter half-lives. eRNAs, for instance, have a half-life of a few minutes.
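To put the two half-life estimates in perspective, they can be converted to decay rates. The sketch below assumes simple first-order (exponential) decay, which is an assumption made here for illustration and not necessarily the model used in the paper; the function names are hypothetical.

```python
import math


def decay_rate(half_life_min):
    """First-order decay rate (per minute) implied by a half-life."""
    return math.log(2) / half_life_min


def fraction_remaining(half_life_min, elapsed_min):
    """Fraction of transcripts left after elapsed_min, assuming exponential decay."""
    return 0.5 ** (elapsed_min / half_life_min)


# Half-lives quoted in the article, in minutes.
for label, half_life in [("TT-seq estimate", 50), ("previous estimate", 139)]:
    rate = decay_rate(half_life)
    left = fraction_remaining(half_life, 60)
    print(f"{label}: rate {rate:.4f}/min, {left:.0%} left after one hour")
```

Under this assumption, a 50-minute half-life implies that well under half of a given mRNA pool survives an hour, whereas a 139-minute half-life would leave most of it intact.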
TT-seq can also reveal transcription termination sites, Cramer and his colleagues reported. They uncovered an average of four transcription termination sites in nearly 7,000 mRNA genes, and these sites were enriched for a certain consensus motif that likely destabilizes the polymerase complex in its T/A-rich region, while binding in the C/G-rich region remains stable and can be cleaved.
Altogether, the researchers said, their approach may complement other transcriptome analysis tools.