A team from Fudan University and ShanghaiTech University describes a bioinformatic approach for finding and following polyadenylation sites in intronic gene sequences. The IPAFinder method detects intronic polyadenylation sites de novo by comparing RNA sequence data, the researchers say. When they applied the algorithm to more than 250 tumor and matched normal samples spanning half a dozen tumor types from the Cancer Genome Atlas, for example, they tracked down some 490 recurrent, cancer-associated intronic polyadenylation events at sites that appeared to shift dynamically, a set that included genes with apparent intronic polyadenylation-related regulation. "[T]he IPAFinder method should open up a new avenue for discovering [intronic polyadenylation (IpA)] events and changes in their usage in numerous biological processes using standard RNA-seq," the authors conclude. "This should help to reveal the functional roles of IpA in diverse conditions."
Investigators at Washington University School of Medicine and Yale University School of Medicine explore the gene expression effects of structural variants (SVs) in samples from hundreds of GTEx project participants. When the team profiled almost 61,700 SVs and their gene expression consequences in 613 individuals, it saw signs that common SVs may play an outsized role as expression quantitative trait loci (eQTLs), particularly duplication and deletion variants. Mobile element insertions, on the other hand, contributed only modestly to eQTLs, the authors note, while common and rare SV eQTLs in general appeared to affect the expression of more than 1.8 genes apiece on average. "SVs have a disproportionately large effect on common and rare gene expression changes and often affect multiple genes," they report. "Our findings reinforce the importance of comprehensive variant detection in the design of future trait mapping studies."
A team from the US and the Netherlands outlines a computational strategy for digging into single-cell RNA sequence data to find cell type-relevant marker gene sets. The machine learning-based algorithm, version 2.0 of NS-Forest, is designed with marker gene selection in mind, the researchers note. "The marker genes selected provide an expression barcode that serves as both a useful tool for downstream biological investigation and the necessary and sufficient characteristics for semantic cell type definition," they write. Applied to scRNA-seq profiles from middle temporal gyrus samples from the human brain, for instance, NS-Forest helped to untangle informative cell types, along with related cell signaling and non-coding RNA contributions.