NEW YORK (GenomeWeb) – In Nature Genetics online today, a University of California at Los Angeles-led team described an imputation-based approach that uses gene expression and genome-wide association study information to search for transcriptome-wide associations related to complex traits or diseases.
Starting from a reference set of individuals with known genotype and gene expression profiles, the researchers came up with a strategy to extend SNP-gene expression ties detected in that group into genotyped and phenotyped individuals whose gene expression profiles had not been tested directly.
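The imputation idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the ridge-regression predictor, the simulated genotypes, and every function name and parameter value below are assumptions chosen for illustration. It fits SNP-to-expression weights in a reference panel, predicts expression in a separate genotyped cohort, and tests the predicted expression against a trait.

```python
import numpy as np


def standardize(genotypes):
    """Center and scale each SNP column to mean 0 and variance 1."""
    return (genotypes - genotypes.mean(axis=0)) / genotypes.std(axis=0)


def fit_expression_weights(ref_genotypes, ref_expression, ridge_penalty=1.0):
    """Fit per-SNP expression weights in the reference panel.

    Ridge regression is an assumed stand-in predictor, not the study's model.
    """
    x = standardize(ref_genotypes)
    y = ref_expression - ref_expression.mean()
    penalty = ridge_penalty * np.eye(x.shape[1])
    return np.linalg.solve(x.T @ x + penalty, x.T @ y)


def impute_expression(genotypes, weights):
    """Predict the genetic component of expression from genotypes alone."""
    return standardize(genotypes) @ weights


def association_t(imputed_expression, trait):
    """t-statistic for the correlation between imputed expression and a trait."""
    r = np.corrcoef(imputed_expression, trait)[0, 1]
    n = len(trait)
    return r * np.sqrt((n - 2) / (1.0 - r**2))


def run_demo(seed=0, n_ref=500, n_gwas=2000, n_snps=20):
    """Simulate one gene; all sample sizes and effect sizes are made up."""
    rng = np.random.default_rng(seed)
    true_weights = rng.normal(scale=0.3, size=n_snps)

    def simulate(n):
        genotypes = rng.binomial(2, 0.3, size=(n, n_snps)).astype(float)
        expression = standardize(genotypes) @ true_weights + rng.normal(size=n)
        return genotypes, expression

    ref_genotypes, ref_expression = simulate(n_ref)
    # In a real GWAS cohort the expression values would be unmeasured;
    # they are kept here only to check imputation accuracy.
    gwas_genotypes, gwas_expression = simulate(n_gwas)
    trait = 0.5 * gwas_expression + rng.normal(size=n_gwas)

    weights = fit_expression_weights(ref_genotypes, ref_expression)
    imputed = impute_expression(gwas_genotypes, weights)
    accuracy = np.corrcoef(imputed, gwas_expression)[0, 1]
    return accuracy, association_t(imputed, trait)


if __name__ == "__main__":
    accuracy, t_stat = run_demo()
    print(f"imputation accuracy (r): {accuracy:.2f}, trait association t: {t_stat:.1f}")
```

In this toy setup the trait depends on expression by construction, so the imputed values recover an association even though expression was never measured in the larger cohort.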
"Our approach builds upon the wealth of GWAS data in massive cohorts to directly implicate the gene-based mechanisms underlying complex traits," senior author Bogdan Pasaniuc, a human genetics and bioinformatics researcher at UCLA, and colleagues wrote.
The group then applied this strategy to transcriptome-wide association studies of blood- and adipose tissue-related traits, using expression profiles from some 3,000 individuals and GWAS data covering hundreds of thousands of related phenotype measurements. The search led to 69 genes with expression ties to obesity-relevant traits such as body mass index, height, and blood lipid levels.
Though past studies have looked for overlap between trait- or disease-associated variants identified by GWAS and variants implicated as expression quantitative trait loci (eQTL), the team argued that this method might not pick up relatively subtle expression effects on a trait of interest.
Moreover, they noted that efforts to directly assess gene expression within a GWAS have largely been limited to studies with small sample sizes, due to the additional cost of this step and the added complication of obtaining appropriate tissue samples.
With that in mind, the team set out to do a transcriptome-wide association study, or TWAS, using imputed gene expression data.
The researchers first explored the heritability of gene expression attributable to cis SNPs near genes and to trans SNPs found further afield, using data for 3,234 genotyped and gene expression-profiled participants from three study cohorts. They settled on a set of more than 6,900 genes that appeared to be both trait-related and near heritable cis SNPs.
In its proof-of-principle experiments, the team found that using heritable cis SNP genotypes could boost the accuracy of gene expression predictions in the individuals' adipose tissue and/or blood samples compared with predictions based on the top cis eQTLs alone.
Bolstered by this finding and their subsequent TWAS simulations, the researchers attempted individual TWAS of height, lipid measurements, and BMI using existing GWAS findings and related phenotype profiles, combined with heritable cis SNP information from the reference group.
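When only GWAS summary statistics are available, one commonly described way to form a gene-level TWAS statistic is a weighted sum of per-SNP GWAS z-scores, scaled by the linkage disequilibrium (LD) among those SNPs. The article does not spell out a formula, so the Python sketch below is an assumption-laden illustration: the function name `twas_z` and the example weights, z-scores, and LD matrix are hypothetical.

```python
import numpy as np


def twas_z(weights, gwas_z, ld):
    """Gene-level z-score from summary data: (w . z) / sqrt(w' LD w).

    weights: expression weights for the cis SNPs (from a reference panel)
    gwas_z:  GWAS z-scores for the same SNPs, in the same order
    ld:      SNP-by-SNP correlation (LD) matrix from a reference panel
    """
    weights = np.asarray(weights, dtype=float)
    gwas_z = np.asarray(gwas_z, dtype=float)
    ld = np.asarray(ld, dtype=float)
    variance = weights @ ld @ weights
    if variance <= 0:
        raise ValueError("weights give zero predicted-expression variance")
    return float(weights @ gwas_z / np.sqrt(variance))


if __name__ == "__main__":
    # Hypothetical values for three cis SNPs around one gene.
    example_weights = [0.5, 0.3, 0.0]
    example_gwas_z = [3.0, 2.5, 0.2]
    example_ld = [
        [1.0, 0.6, 0.1],
        [0.6, 1.0, 0.2],
        [0.1, 0.2, 1.0],
    ]
    print(twas_z(example_weights, example_gwas_z, example_ld))
```

With a single SNP, the statistic reduces to that SNP's GWAS z-score, with its sign set by the direction of the expression weight.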
Most of the 665 trait-gene relationships detected in the analysis coincided with variants already described for these traits, the team noted. But the analysis also picked up ties between lipid, height, and/or BMI and expression levels for 69 genes not implicated in these traits previously, findings the group replicated for a subset of genes using eQTL data from two prior studies.
Among the 40 genes that could be scrutinized in phenotyped mice from the Hybrid Mouse Diversity Panel, the investigators verified that at least some coincided with obesity-related features in mice.
The study's authors cautioned that "the summary-based TWAS cannot account for rare variants that are poorly captured by the [linkage disequilibrium] reference panel or optimally capture non-linear relationships between SNPs and expression."
"Additional sources of information could potentially be incorporated to improve prediction, including significant trans associations, allele-specific expression, splice-QTL affecting individual exons, haplotype effects, and SNP-specific functional priors."