NEW YORK (GenomeWeb) – Researchers from Brown University and the University of Utah have established a two-pronged assay for identifying and authenticating splicing mutations with high-throughput in vitro and in vivo experiments.
The approach, known as a "massively parallel splicing assay" (MaPSy), considers the potential splicing consequences of disease-related mutations in parallel in a complete cell system and in cell-free supernatant, with the help of mini-reporter constructs to recreate disease-related mutations, explained Brown University computational molecular biology researcher William Fairbrother.
As ever more obscure variants stack up from the flood of human sequencing data unleashed in the past few years, he and his colleagues see a corresponding need for high-throughput strategies to sift through these variants to identify, validate, and characterize those with splicing (and other) effects.
"There's a lot more genotyping information available from people in clinical studies or research studies and there are a lot more variants that are now being discovered," Fairbrother said. "There hasn't been a concurrent increase in the throughput of the validation technologies. So there's a bit of a bottleneck: we're discovering tens of thousands of these variants, but we don't really know which variants are important and which ones aren't."
For example, when the team applied its MaPSy approach to nearly 5,000 mutations in protein-coding exons from the Human Gene Mutation Database (HGMD) for a study published in Nature Genetics this week, it found that roughly one in 10 exon mutations considered could affect splicing. And those alterations appeared to be over-represented in certain exons, genes, and/or disease types.
More generally, MaPSy made it possible to bring "this kind of old-school biochemical analysis" to hereditary disease alleles, leading to some previously unappreciated splicing event clusters, Fairbrother said. "You could start to learn things about how particular exons failed when they couldn't splice correctly."
In the future, the team plans to profile the potential splicing impact of rare, low-frequency, or de novo mutations, which are typically thought to have more pronounced effects than common SNPs.
For their current analysis, the researchers began by focusing on 4,964 HGMD mutations, each falling in an exon shorter than 100 bases that was bordered by non-coding introns. Libraries of splice reporter constructs containing mutant or wild-type sequences were produced at Agilent Technologies using a solid-phase, array-based synthesis approach.
For the in vivo arm of the assay, the researchers used deep sequencing to tally the wild-type and mutant versions of alleles in the libraries, which were then transfected into HeLa cells, where splicing could take place in the context of all of the usual cellular machinery.
"The cell does with them what they do with normal genes: they recognize a promoter, they initiate transcription, splicing occurs, then polyadenylation, and export from the nucleus," Fairbrother said. "There's a whole chain of events that happens during normal gene expression and those are also presumed to take place on the mini-gene reporters."
By sequencing the products of these events, investigators could see shifts in splicing and characterize the nature of these changes. But because some of the mutations may affect the way gene transcripts are sliced, diced, and pasted back together only indirectly, MaPSy includes a second step that compares those in vivo splicing changes to those taking place in a cell-free system capable of splicing.
"It's possible that a mutation could affect the transcription step or the RNA stability step," Fairbrother explained. "So that's really where the in vitro splicing assay comes in."
As part of the in vitro side of MaPSy, the team developed a higher-throughput version of an in vitro splicing assay originally developed in the 1980s, he noted, pairing splice-competent cell extract with synthetic genes that are similar to, but simpler than, those employed for the in vivo assay.
For the most part, the allelic splicing ratios from the in vivo and in vitro sides of MaPSy lined up well for the HGMD mutation sites considered, the authors noted, "[d]espite substantial differences in processing and substrate design."
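One way to picture an allelic splicing ratio is as the mutant-to-wild-type read ratio among spliced products, normalized to the same ratio in the input library. The Python sketch below illustrates that idea only. The article does not give the study's formula, so the function name, the log2 scale, and the pseudocount are all assumptions made for illustration.

```python
import math


def allelic_log_ratio(mut_in, wt_in, mut_out, wt_out, pseudocount=0.5):
    """Hypothetical helper: log2 change in the mutant/wild-type read ratio.

    mut_in, wt_in   -- read counts for each allele in the input library
    mut_out, wt_out -- read counts for each allele among spliced products
    pseudocount     -- assumed small constant that avoids division by zero
    """
    input_ratio = (mut_in + pseudocount) / (wt_in + pseudocount)
    output_ratio = (mut_out + pseudocount) / (wt_out + pseudocount)
    # Negative values mean the mutant allele is depleted among spliced products.
    return math.log2(output_ratio / input_ratio)


# Made-up counts: alleles balanced in the input, mutant fourfold depleted
# among spliced products, which corresponds to log2(0.25) = -2.
print(allelic_log_ratio(100, 100, 50, 200, pseudocount=0))
```

Computing such a value separately for the in vivo and in vitro arms would give two numbers per mutation that could then be compared for agreement.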
At the exon level, their results revealed "all these different features that appeared to sensitize an exon to having a splicing mutation," Fairbrother said, ranging from intron and exon length to broader sequence context.
In genes with an abundance of documented splice site mutations, for example, the team saw a corresponding increase in mutations inside of the profiled exons that changed the transcript splicing outcomes.
When the researchers took the approach to other cell lines, fibroblasts, blood, and post-mortem brain tissues, meanwhile, they saw that more than 80 percent of the splicing-related mutations picked up by MaPSy could also be identified by RT-PCR on corresponding RNA samples.
The team has since gone on to start looking at large sets of mutations from disease-specific datasets, including analyses of de novo mutations identified in individuals with autism spectrum disorder in a Simons Foundation dataset.
Though he was hesitant to pin down a price for MaPSy reagents, which can vary, Fairbrother noted that an experiment covering as many mutations as the HGMD study would typically cost a few thousand dollars.
Along with broadening the types of cells and tissues tested with MaPSy in the future, the team hopes to incorporate more extensive biochemical analyses into the assay, for example to gauge the proteomic impact of particular mutations and corresponding splicing changes. The group is also exploring strategies to ramp up the throughput of the approach to assess libraries with tens or hundreds of thousands of sequences.