This article was originally published May 2.
NEW YORK (GenomeWeb) – Stanford University researchers have secured a three-year, approximately $1.3 million grant from the National Institutes of Health to support the use of their targeted RNA sequencing method for allele-specific expression testing on samples collected for the Genotype-Tissue Expression (GTEx) project.
The approach couples multiplex PCR with deep sequencing on Illumina or other sequencing instruments to determine ratios between alleles at targeted sites in RNA transcripts, Stanford geneticist Jin Billy Li, principal investigator on the NIH grant, told In Sequence.
"Our assay compares the ratio between two different alleles," he said. "We're totally blind to the expression across different genes. We just look at the expression levels of two different alleles of the same gene."
In its current form, the method — known as microfluidics-based multiplex PCR and deep sequencing, or mmPCR-seq — can assess as many as 960 RNA loci in 48 samples at once on a Fluidigm chip by pooling up to 20 primer pairs into each reagent well.
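The 960-locus figure follows from simple multiplication. The Python sketch below works through it. The count of 48 reagent wells per chip is an assumption inferred from the numbers quoted here (960 loci divided by 20 primer pairs per well), and the function names are illustrative only.

```python
# Capacity arithmetic for a single chip, using figures quoted in the article.
SAMPLES_PER_CHIP = 48        # stated in the article
PRIMER_PAIRS_PER_WELL = 20   # stated in the article (upper limit)
REAGENT_WELLS_PER_CHIP = 48  # ASSUMPTION: inferred from 960 / 20


def loci_per_sample(wells=REAGENT_WELLS_PER_CHIP,
                    pairs_per_well=PRIMER_PAIRS_PER_WELL):
    """Number of targeted RNA loci amplified for each sample."""
    return wells * pairs_per_well


def amplicons_per_chip(samples=SAMPLES_PER_CHIP):
    """Total sample-by-locus combinations produced on one chip."""
    return samples * loci_per_sample()


print(loci_per_sample())     # 960
print(amplicons_per_chip())  # 46080
```

Pooling fewer than 20 primer pairs per well lowers the per-sample locus count proportionally.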
By attaching the same barcode to all amplicons generated from a given sample and pooling hundreds of barcoded samples in a single Illumina sequencing lane, the team has shown that it can quantify allelic ratios for a wide range of targeted RNA sites for less than $50 per sample, once appropriate primer sets are available.
"Essentially, we do multiplex PCR in a multiplex platform," Li said, noting that the group generally pools between 100 and 288 samples on a given Illumina HiSeq lane.
In a study published in Nature Methods last November, Li, co-senior author Stephen Montgomery, and colleagues at Stanford and McGill University outlined the rationale for the approach, along with its applications for studying RNA editing and allele-specific expression (ASE).
Another paper appearing online last night in PLOS Genetics by Montgomery, Li, and others highlighted the utility of using ASE patterns obtained with mmPCR-seq and other approaches to ascertain the effects of loss-of-function variants and other rare, deleterious alleles.
Deleterious and loss-of-function variants are among the alleles that Li and his team plan to profile for ASE patterns with mmPCR-seq over the next three years using the newly obtained NIH funding, which went into effect late last month.
All told, the researchers are gearing up to test some 2,400 GTEx samples using mmPCR-seq — both to continue exploring deleterious and loss-of-function variant effects on expression and to verify apparent expression quantitative trait loci identified by GTEx so far.
In their grant abstract, the researchers proposed validating mmPCR-seq-based ASE testing on 800 suspected eQTLs in tissues from 96 individuals, along with ASE profiling on rare and deleterious variants in multiple tissue types from around 50 individuals enrolled in GTEx.
The direct funding to be used over the course of the project is nearly $833,000, Li said, including more than $700,000 to be administered this year.
Generally speaking, efforts such as GTEx and others have helped researchers make headway in untangling variant effects on gene expression in different organisms and tissue types, for instance by using gene expression and genotyping information to find suspected eQTLs.
While those eQTLs offer insights into variants that regulate expression at the population level, they typically require additional validation. Moreover, Li explained that it's often difficult to accurately account for the allelic ratios of genes expressed in the low to medium range using RNA sequencing alone.
"The main barrier to using RNA-seq data to call ASE is low coverage for most of the genes and most of the SNPs in the genes," he said, noting that some 80 to 90 percent of sequence reads generated in a typical RNA sequencing experiment represent transcripts from only around 10 percent of genes.
Some targeted methods of assessing allelic ratios in RNA transcripts have encountered similar problems, Li said, since the abundance of a given RNA in the transcriptome influences how well it can be captured by an oligo, for example.
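To put the RNA-seq coverage skew Li describes into numbers, the short Python sketch below estimates how much more coverage an average highly expressed gene receives than an average gene in the remainder of the transcriptome. It uses only the percentages quoted above. The function name is hypothetical, and the calculation treats each group of genes as an average, which is a simplification.

```python
def fold_enrichment(top_read_fraction, top_gene_fraction=0.10):
    """Ratio of mean reads per gene in the highly expressed group to
    mean reads per gene in the remaining genes.

    top_read_fraction: share of all reads taken by the top group (0.8 to 0.9
                       according to the article).
    top_gene_fraction: share of genes in that group (about 0.10).
    """
    if not (0 < top_read_fraction < 1 and 0 < top_gene_fraction < 1):
        raise ValueError("fractions must lie strictly between 0 and 1")
    top_mean = top_read_fraction / top_gene_fraction
    rest_mean = (1 - top_read_fraction) / (1 - top_gene_fraction)
    return top_mean / rest_mean


for share in (0.8, 0.9):
    # Roughly 36-fold and 81-fold, respectively.
    print(share, round(fold_enrichment(share), 1))
```

Under these averages, most genes receive a small fraction of the coverage given to the top tier, which is consistent with the low-coverage barrier to calling ASE that Li describes.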
In an effort to design a high-throughput method with more uniform and representative transcript amplification, Li and his team decided to look at allelic ratios in a system that uses many cycles of PCR to produce amplicons at targeted alleles while maintaining information about the ratio of alternate alleles at each site.
The multiplexed system was developed around Fluidigm's Access Array microfluidics platform, which is designed to produce PCR products for 48 DNA samples simultaneously.
By initially dialing up the amount of input material — in this case complementary DNA — and using sets of pooled primers to target alleles of interest, the researchers came up with ways to amplify and assess as many as 960 loci per RNA sample.
Generally speaking, higher RNA inputs were needed when using the Fluidigm chip to assess RNA rather than DNA, Li explained, since few copies of genes with low expression make it to the amplification step in the system.
Nevertheless, the team demonstrated in its Nature Methods study that it's possible to profile allelic ratios in low-quality or -quantity RNA samples when a pre-amplification step is included as well.
For both the RNA editing and allele-specific expression applications of mmPCR-seq, the researchers produce saturating levels of amplicons at each site of interest so that coverage becomes uniform, Li said, though the ratios between alleles can still be observed in the subsequent sequencing data.
"Here, you want to measure the ratio between two alleles," he explained. "You're not necessarily interested in the gene expression levels across different SNPs."
The team barcodes amplicons from each sample prior to a deep sequencing step, often pooling barcoded samples from multiple chips so that hundreds of samples can be sequenced simultaneously.
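The pooling arithmetic can be sketched in a few lines of Python. The figures of 48 samples per chip, up to 960 loci per sample, and 100 to 288 samples per lane come from the article. The lane yield is a purely hypothetical placeholder, not a number reported by the researchers, and the function names are illustrative.

```python
import math

SAMPLES_PER_CHIP = 48   # stated in the article
LOCI_PER_SAMPLE = 960   # stated in the article (upper limit)

# ASSUMPTION: placeholder lane yield, chosen only for illustration.
HYPOTHETICAL_LANE_READS = 150_000_000


def chips_needed(n_samples, samples_per_chip=SAMPLES_PER_CHIP):
    """Number of chips required to process n_samples."""
    return math.ceil(n_samples / samples_per_chip)


def mean_reads_per_site(lane_reads, n_samples, loci=LOCI_PER_SAMPLE):
    """Average reads per targeted site per sample if coverage were even."""
    return lane_reads / (n_samples * loci)


for n in (100, 288):
    depth = mean_reads_per_site(HYPOTHETICAL_LANE_READS, n)
    print(f"{n} samples: {chips_needed(n)} chips, ~{depth:.0f} reads per site")
```

Even at the upper end of the pooling range, an evenly covered site would still receive several hundred reads under this assumed lane yield, which is the kind of depth that allelic ratio measurements depend on.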
For their proof-of-concept Nature Methods paper, the researchers focused on samples with well-characterized gene expression profiles, demonstrating that mmPCR-seq provided ASE information that matched the information ascertained by RNA sequencing in combination with genotyping patterns obtained from DNA sequence data.
The same general approach applies whether mmPCR-seq is used to assess RNA editing or ASE, Li said, since both hinge on the proportion of each allele present at sites of interest.
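As a rough illustration of what measuring the proportion of each allele involves, the Python sketch below computes an allelic ratio from read counts at one site and runs an exact two-sided binomial test against a 50:50 expectation. This is a generic textbook calculation, not necessarily the statistical procedure the Stanford team uses. The function names and read counts are hypothetical.

```python
from math import exp, lgamma, log


def allelic_ratio(ref_reads, alt_reads):
    """Fraction of reads at a site that carry the reference allele."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads cover this site")
    return ref_reads / total


def binomial_two_sided_p(ref_reads, alt_reads, expected=0.5):
    """Exact two-sided binomial p-value for allelic imbalance.

    Sums the probabilities of all outcomes that are no more likely than
    the observed one. Log-gamma keeps the terms finite at deep coverage.
    """
    n = ref_reads + alt_reads

    def pmf(k):
        log_p = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
                 + k * log(expected) + (n - k) * log(1 - expected))
        return exp(log_p)

    observed = pmf(ref_reads)
    # Small relative tolerance guards against floating-point ties.
    total = sum(p for p in map(pmf, range(n + 1))
                if p <= observed * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical counts for one heterozygous site in one sample.
print(allelic_ratio(750, 250))         # 0.75
print(binomial_two_sided_p(750, 250))  # very small: strong imbalance
print(binomial_two_sided_p(50, 50))    # close to 1: balanced expression
```

For RNA editing, the same ratio would be read as the fraction of transcripts edited at a site, although the 50:50 null expectation used in the test would not apply there.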
He noted that the approach may be somewhat easier and more cost effective to apply for RNA editing applications, since only a few hundred key RNA loci seem to be affected by such events based on past mouse, fly, and human cell studies.
"For RNA editing, often we already know where the edited sites are and in our experiments we just target that set, no matter which sample we're using," he said.
In contrast, using mmPCR-seq to verify eQTLs or perform other types of ASE analyses may require more extensive primer design, though Li noted that it is still cost effective when looking at large numbers of samples.
"The drawback is that you have to know where you want to look," Li noted. "And for that set, you have to design primers."
The team is currently doing mmPCR-seq using Illumina instruments, adding Illumina adaptors to the amplicons at the time samples are being barcoded.
In their Nature Methods study, for example, the researchers generated deep sequence reads either by single-end sequencing using the Illumina HiSeq 2000 or with paired-end MiSeq sequencing. But Li said the same general mmPCR-seq approach should be platform agnostic if researchers opt to add different adaptors during that step.
In the future, the team is interested in exploring options for applying mmPCR-seq to look at RNA editing and ASE in single cells. Li emphasized that the mmPCR-seq method is meant to complement, rather than replace, existing RNA sequencing methods.