NEW YORK (GenomeWeb News) – A University of Toronto-led research team reported online today in Nature that it has developed a computational method for predicting tissue-specific alternative splicing patterns in mice.
The approach, which relies on incorporating information on hundreds of RNA features to tease apart an alternative splicing code, predicted a slew of tissue-specific splicing patterns as well as splicing differences between adult and embryonic mouse tissues that may point to previously unappreciated regulatory mechanisms. Those involved with the effort say the work also sets the stage for similar studies in human tissues.
"What we've achieved here is really the first step," senior author Brendan Frey, an engineering and molecular genetics researcher affiliated with the University of Toronto and the Canadian Institute for Advanced Research, told GenomeWeb Daily News. "We and other researchers can use the same framework and the same methodology for [studying alternative splicing in] humans."
Nearly all genes with more than one exon can be spliced in a variety of ways. This alternative splicing is thought to help explain how humans achieve such biological complexity with relatively few genes, Frey noted. "Before we can understand how genes work, we have to understand how splicing works," he said.
But rather than looking gene-by-gene or protein-by-protein, he explained, the team decided to tackle splicing complexity across the genome using a coding approach similar to that used by the biological system itself.
"Our method takes as an input a collection of exons and surrounding intron sequences and data profiling how those exons are spliced in different tissues," Frey and his co-authors wrote. "The method assembles a code that can predict how a transcript will be spliced in different tissues."
The team used an algorithm developed by co-lead author Yoseph Barash, a postdoctoral researcher in Frey's lab, to create a code based on data for 3,665 alternative exons in 27 mouse embryonic and adult tissues.
After bringing together information on 171 known exon and intron sequence motifs, as well as 326 new motifs, 460 short motifs, and 57 features associated with specific transcript structures, the researchers narrowed in on about 200 of the most informative features.
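For a sense of scale, the feature counts reported above can be tallied in a few lines of Python. This is only an arithmetic sketch: the dictionary keys are illustrative labels rather than the paper's terminology, and the figure of 200 is the article's approximate number.

```python
# Feature categories and counts as reported in the article.
# The keys are illustrative labels, not names used by the researchers.
feature_counts = {
    "known_motifs": 171,
    "new_motifs": 326,
    "short_motifs": 460,
    "transcript_structure_features": 57,
}

# Size of the full candidate pool before selection.
total_features = sum(feature_counts.values())

# "About 200" per the article, so the fraction below is approximate.
selected_features = 200
fraction_retained = selected_features / total_features

print(f"Candidate features: {total_features}")
print(f"Approximate fraction retained: {fraction_retained:.0%}")
```

Under these numbers, the roughly 200 selected features amount to about one in five of the candidates considered.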
These included a wide range of features, from specific motifs to transcript features such as exon length, Frey said, hinting at complex regulation of alternative splicing.
"It is apparent from examining the splicing code deciphered in the present study that large numbers of sequence features are generally required to achieve tissue-regulated splicing," he and his co-workers wrote.
When the researchers grouped the 27 tissues into four categories (central nervous system tissues, muscle tissues, digestive system tissues, and embryo/embryonic stem cell tissues), they were able to begin characterizing the extent of tissue-specific splicing.
"In most cases that we looked at, there were changes between tissues," Frey said.
In addition, their method predicts patterns in the mouse tissues that could only be explained by previously unappreciated regulatory mechanisms. For example, the researchers found a class of genes that is expressed in both embryonic and adult tissues but doesn't seem to be functional in the adult.
Based on their predictions and follow-up experiments, the team concluded that such differences stem from the inclusion of exons with premature termination codons in adult tissues that stimulate nonsense-mediated mRNA decay (NMD). Indeed, Frey noted, they found roughly 100 examples of genes in which NMD-causing exons seem to get skipped in embryonic tissue.
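As a rough illustration of how including an exon can introduce a premature termination codon, the Python sketch below scans a coding sequence for its first in-frame stop codon, with and without a cassette exon. The sequences and the function name `first_in_frame_stop` are hypothetical toy examples, not data or code from the study, and the sketch does not model the decay pathway itself.

```python
from typing import Optional

STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_in_frame_stop(cds: str) -> Optional[int]:
    """Return the 0-based index of the first in-frame stop codon, or None."""
    cds = cds.upper()
    # Step through the sequence one codon (three nucleotides) at a time.
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i
    return None


# Hypothetical toy sequences, chosen only to illustrate the idea.
upstream_exon = "ATGGCCAAG"    # ATG GCC AAG
cassette_exon = "GGATAGCTC"    # GGA TAG CTC (contains an in-frame stop)
downstream_exon = "GAGCTGTAA"  # GAG CTG TAA (normal stop at the end)

skipped = upstream_exon + downstream_exon
included = upstream_exon + cassette_exon + downstream_exon

for label, seq in (("exon skipped", skipped), ("exon included", included)):
    stop = first_in_frame_stop(seq)
    # A stop is premature here if it occurs before the final codon.
    premature = stop is not None and stop < len(seq) - 3
    print(f"{label}: first stop at {stop}, premature={premature}")
```

In this toy case the transcript that skips the cassette exon reads through to its normal stop codon, while the transcript that includes it terminates early.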
While instances of this had been detected for individual genes in past studies, Frey explained, "there was no understanding of whether this was general or not."
Overall, about 3,000 of the splicing events predicted by the newly developed splicing code were subsequently validated using microarrays, Frey noted, while 14 exons were verified in 14 tissues using reverse transcription PCR. The team also followed up on some of the predictions using mutagenesis and other experiments.
The team has now started doing similar alternative splicing analyses in 20 or more human tissues using RNA sequencing and microarray data, Frey noted. They eventually hope to broaden their analyses to include information on non-coding RNAs, polyadenylation patterns, and more, he added.
Nevertheless, Frey cautioned, the alternative splicing picture is far from complete, even in mice. He predicts that researchers' understanding of the splicing code identified in the current study will continue expanding as more transcript data — including information on additional tissue types — becomes available.
An online tool for investigators interested in determining whether a particular exon is likely to undergo alternative splicing is available at the Website for Alternative Splicing Prediction.
"The tool can scan previously uncharacterized exons, predict tissue-dependent splicing patterns, and produce downloadable exploratory feature maps linked to the UCSC genome browser," the researchers noted.