NEW YORK (GenomeWeb News) – Two new high-throughput RNA sequencing studies by independent research groups are offering a glimpse into gene splicing, expression, and regulation patterns in European and Yoruban populations.
The papers, which relied on samples collected through the International HapMap Project, appear online in Nature today.
In the first of these studies, University of Chicago researchers used the Illumina Genome Analyzer II to sequence complementary DNA libraries from lymphoblastoid cell lines created from 69 Yoruban individuals sampled through the HapMap project.
Unlike microarray-based studies of gene expression, RNA-Seq does not rely on probing specific parts of the genome that are thought to be expressed, lead author Joseph Pickrell, a graduate student in University of Chicago geneticist Jonathan Pritchard's lab, told GenomeWeb Daily News.
Instead, Pickrell explained, the sequencing approach offers an unbiased look at expression, providing data that can be used to interpret information about genetic variants influencing gene expression and more.
Their RNA sequencing approach turned up transcripts at more than 4,000 previously unannotated regions, including protein-coding transcripts, alternatively spliced exons, and previously undetected untranslated regions.
By incorporating data on millions of SNPs genotyped through the HapMap project, the researchers uncovered 929 genes or potential exons that have expression quantitative trait loci, or eQTLs, within 200,000 bases of the gene or exon.
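For readers unfamiliar with how a nearby eQTL is detected, the idea can be sketched in a few lines of Python. This is a toy illustration only, not the researchers' pipeline: the SNP names, positions, genotypes, and expression values are invented, and the helper functions `snps_in_cis` and `regress` are hypothetical. It keeps SNPs within 200,000 bases of a gene and then asks whether expression rises or falls with the number of copies of one allele.

```python
from statistics import mean

CIS_WINDOW = 200_000  # bases; the window size quoted in the article


def snps_in_cis(gene_pos, snp_positions, window=CIS_WINDOW):
    """Return the names of SNPs lying within `window` bases of the gene."""
    return [name for name, pos in snp_positions.items()
            if abs(pos - gene_pos) <= window]


def regress(dosages, expression):
    """Least-squares slope and Pearson r of expression on allele dosage (0, 1, 2)."""
    mx, my = mean(dosages), mean(expression)
    sxx = sum((x - mx) ** 2 for x in dosages)
    syy = sum((y - my) ** 2 for y in expression)
    sxy = sum((x - mx) * (y - my) for x, y in zip(dosages, expression))
    return sxy / sxx, sxy / (sxx * syy) ** 0.5


# Invented toy data: one gene, three SNPs, six individuals.
gene_pos = 1_000_000
snp_positions = {"snpA": 950_000, "snpB": 1_150_000, "snpC": 1_400_000}
genotypes = {
    "snpA": [0, 1, 2, 1, 0, 2],
    "snpB": [1, 1, 0, 2, 0, 2],
    "snpC": [2, 0, 1, 1, 2, 0],
}
expression = [1.0, 2.1, 2.9, 2.0, 1.1, 3.0]  # arbitrary expression units

# Only SNPs inside the window are tested against this gene's expression.
candidates = snps_in_cis(gene_pos, snp_positions)
results = {name: regress(genotypes[name], expression) for name in candidates}

for name, (slope, r) in results.items():
    print(f"{name}: slope={slope:.2f}, r={r:.3f}")
```

A real analysis would also need significance testing, correction for the very large number of SNP-gene pairs examined, and adjustment for technical covariates, none of which is shown here.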
They found that some 90 percent of SNPs influencing gene expression are located near the affected genes. Their results also suggest that such eQTLs have allele-specific effects on gene expression.
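Allele-specific expression is typically assessed in individuals who are heterozygous at a site, by counting how many RNA-Seq reads carry each allele. The article does not say which statistical test the team used; the sketch below simply assumes a generic exact binomial test against a 50:50 expectation, with a hypothetical function name and invented read counts.

```python
from math import comb


def binomial_two_sided_p(ref_reads, alt_reads, p=0.5):
    """Two-sided exact binomial p-value for the observed allele read counts.

    Sums the probability of every outcome that is no more likely than the
    observed one, under the assumption that each read carries the reference
    allele with probability `p`.
    """
    n = ref_reads + alt_reads
    observed = comb(n, ref_reads) * p ** ref_reads * (1 - p) ** alt_reads
    total = 0.0
    for k in range(n + 1):
        prob = comb(n, k) * p ** k * (1 - p) ** (n - k)
        if prob <= observed * (1 + 1e-9):  # small tolerance for float ties
            total += prob
    return min(total, 1.0)


# Invented counts at two heterozygous sites.
balanced = binomial_two_sided_p(10, 10)  # equal reads from both alleles
skewed = binomial_two_sided_p(18, 2)     # most reads from one allele

print(f"balanced site p={balanced:.3f}, skewed site p={skewed:.6f}")
```

A small p-value at the skewed site would hint that one copy of the gene is expressed more than the other. In practice, read-mapping bias toward the reference allele and extra variability in read counts would have to be accounted for before drawing that conclusion.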
"We demonstrate that eQTLs near genes generally act by a mechanism involving allele-specific expression," the researchers wrote, "and that variation that influences the inclusion of an exon is enriched within and near the consensus splice sites."
In their subsequent experiments, the team also identified 187 genes that appear to be associated with splicing QTLs, or sQTLs — variants influencing which isoforms get expressed.
The team plans to do more extensive studies aimed at determining which individual SNPs affect gene expression and how, Pickrell said. He noted that the genomes of cell lines used in the study have now been sequenced, giving the researchers access to all possible causal SNPs.
"Studies of variation in gene expression using microarrays have provided insight into the mechanism of action of loci associated with disease," Pickrell and his co-workers wrote. "[T]he increased sensitivity to detect variation by RNA-Seq will greatly enhance these efforts."
Meanwhile, researchers from Switzerland, the UK, and Spain used paired-end sequencing with the Illumina Genome Analyzer II to assess transcriptomes from lymphoblastoid cell lines created from 60 European individuals from the HapMap3 project. The work is providing information on transcript abundance, eQTLs, and allele-specific expression patterns in that population, as well as on exon structure and variants influencing alternative splicing.
"[RNA-sequencing] allows us to get a really deep resolution of the transcriptome," lead author Stephen Montgomery, a post-doctoral researcher who was formerly at the Wellcome Trust Sanger Institute and is now at the University of Geneva, told GWDN, noting that gene expression appears to be much more complex than previously appreciated.
In addition, Montgomery explained, comparisons between his group's data and eQTLs for the top 500 genes from the Yoruban population study showed that some 33 percent of the signals overlapped, suggesting that many of the SNPs affecting gene expression in the European group are shared with the African individuals tested.
Although researchers also expect to see some population- and tissue-specific gene expression patterns, Montgomery added, this overlap between eQTLs in data from distinct populations generated by independent research groups is encouraging. "That was actually a really good replication," he said.
Such gene expression data may eventually inform genome-wide association studies of disease, Montgomery explained, since some of the SNPs coming out of these GWAS fall in non-coding regions of the genome and may exert an effect by altering gene expression.
He and his co-workers at the University of Geneva are currently following up on the study by doing cellular phenotype studies looking at how genetic patterns influence cellular features.
"As sequencing technologies continue to increase the depth and breadth of the interrogation of the genome and the transcriptome, it is anticipated that our understanding of finer scale cellular processes will become more detailed and robust," the researchers concluded.