Group Leader, European Molecular Biology Laboratory-European Bioinformatics Institute
Recommended by Ewan Birney, EMBL-EBI
NEW YORK (GenomeWeb) – Genomics offers a plethora of statistical challenges. In particular, there's the issue of connecting high-dimensional phenotypes to gene variations and molecular traits. That is, determining how genetic differences between people lead to variability in gene expression and how that, in turn, leads to phenotypic differences or to disease. And that's what drew Oliver Stegle, who has a background in theoretical physical statistics, to genomics.
"My interest originally stems from the statistical perspective. The field poses several exciting challenges where powerful statistical methods can make all the difference," he told GenomeWeb.
Now in his own group at EMBL-EBI, Stegle is working on genotype-phenotype associations. There, he's focusing on how genetic polymorphisms can lead to expression changes or changes to protein structure.
In that way, he's getting at the genotype-phenotype association issue at the level of the intermediate trait — expression. But through this, he said, he and his lab can begin to move beyond the links that genome-wide association studies find between SNPs and phenotypes to try to get to causation.
"As we profile increasingly deep omics data, associations are becoming increasingly abundant," he said. "Causal analyses will help to get beyond correlative evidence and closer to biological mechanism."
At the same time, his lab is also focusing on new technologies, especially single-cell transcriptomics and induced pluripotent stem cells. Single-cell studies, he said, offer the opportunity to study variation at both the individual and population levels, and efforts like the UK stem cell project HipSci, which plans to generate iPS cells from thousands of individuals, will provide a model system in which to study differentiation and the role of molecular variation in genetics.
The main stumbling block right now, Stegle said, is finding the right people to bring into his new lab to pursue these projects. "Recruitment has certainly been keeping me busy in the first year," he said.
Paper of note
During his PhD, Stegle published a method for handling confounding genetic variation. To know how a gene variant affects the expression level of a particular gene, all the other sources of variation that are unrelated to that SNP have to be controlled for.
As he and his colleagues reported in PLOS Computational Biology in 2010, they developed a Variational Bayesian QTL mapper for gene expression variability to do just that. This framework, they reported, accounts for signals from the genotype, from known factors, and from hidden confounding factors.
Stegle said that this paper has laid the foundation for many projects his group is now working on, as they've developed the approach not only for eQTL and GWAS, but also to remove confounding factors from single-cell transcriptomes.
Looking ahead
Technology developments, particularly in single-cell genomics and proteomics, will drive the field in the next few years, according to Stegle. Single-cell approaches will enable researchers to better understand variation between people and tissues, while proteomics will offer another layer at which to study the effects of variation.
At the same time, he said that basic research and translational research would also become more tightly interwoven. Humans, he noted, are quickly becoming the dominant model system, and as the number of sequenced human genomes grows, there will be greater links between genome sequencing and drug development, healthcare systems, and other commercial products to translate those findings.
And the Nobel goes to…
Stegle said he hopes that his group will be able to make sense of high-dimensional phenotypes. By combining data from all the different omics technologies and linking that to phenotypic information, he aims to understand those different layers of information to enable biological discovery.
This is the twelfth in a series of Young Investigator Profiles for 2015 that will appear on GenomeWeb over the next few months.