Although a number of common variants have been associated with common, complex diseases, they typically explain only a fraction of disease heritability. Researchers are therefore turning to rare variants to unearth the remainder of that so-called missing heritability, but analyzing rare variants brings challenges of its own.
Because of how rare variants are analyzed, estimating the genetic variance they explain, which is an indication of heritability, can be problematic. Because individual variants are so rare, the variants at specific loci are analyzed together. "There are going to be causal and non-causal variants that we are analyzing in aggregate. For the different causal variants, the effect sizes can be quite different; they can also go in different directions where some can increase the quantitative trait values while others can decrease it," says Suzanne Leal, the director of the Center for Statistical Genetics at Baylor College of Medicine.
Because of that, she adds, any estimate of genetic variance will be an underestimate. "We could only really say something about the lower bound. … But the whole problem here is arising because of the aggregate analysis," she says.
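The lower-bound point can be illustrated with a small simulation. The sketch below is not taken from Leal's work: the ten variants, their effect sizes, the 1 percent carrier frequency, and the simple carrier-count "burden" score are all assumptions chosen for illustration. It compares the variance the variants truly contribute with the variance a single aggregate score can account for when some effects point up, some point down, and some are zero.

```python
import random
from statistics import fmean


def variance(xs):
    """Population variance of a list of numbers."""
    m = fmean(xs)
    return fmean([(x - m) ** 2 for x in xs])


def covariance(xs, ys):
    """Population covariance of two equal-length lists."""
    mx, my = fmean(xs), fmean(ys)
    return fmean([(x - mx) * (y - my) for x, y in zip(xs, ys)])


def simulate(n=20000, carrier_freq=0.01, seed=1):
    """Return (true genetic variance, variance explained by the burden score)."""
    rng = random.Random(seed)
    # Hypothetical per-variant effects: three raise the trait, two lower it,
    # and five are non-causal.
    effects = [1.0, 1.0, 1.0, -1.0, -1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
    genetic, burden, trait = [], [], []
    for _ in range(n):
        carried = [1 if rng.random() < carrier_freq else 0 for _ in effects]
        g = sum(b * c for b, c in zip(effects, carried))
        genetic.append(g)                      # true genetic value
        burden.append(sum(carried))            # aggregate score: count of variants carried
        trait.append(g + rng.gauss(0.0, 1.0))  # trait = genetic value + noise
    true_var = variance(genetic)
    # Variance explained by regressing the trait on the single burden score.
    burden_var = covariance(trait, burden) ** 2 / variance(burden)
    return true_var, burden_var


if __name__ == "__main__":
    true_var, burden_var = simulate()
    print(f"true genetic variance:        {true_var:.4f}")
    print(f"explained by aggregate score: {burden_var:.4f}")
```

Under these assumed settings, the aggregate figure should come out well below the true genetic variance, because opposing effects partly cancel and non-causal variants dilute the score. That is the sense in which an aggregate estimate speaks only to a lower bound.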
Further, a variant's effect size may be overestimated because of the winner's curse, Leal adds. In that situation, the researcher has found a true positive signal, but, by chance, carriers of the rare variant were over-sampled, skewing the effect size estimate upward. "It's just because of sampling where you were lucky in one way with your sampling, that's why it is called the winner's curse, because you won, you found something in the beginning, and so that's all great, but your effect size estimates are overestimated," she says.
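A toy simulation shows how the winner's curse arises from selection on significance. Everything here is an assumption made for illustration rather than a detail from Leal's study: a single rare variant with a true effect of 0.5, a 2 percent carrier frequency, 1,000 people per study, a known residual standard deviation of 1, and a significance cutoff of z > 3. The sketch repeats the same study many times and compares the average estimate across all studies with the average across only those that reached significance.

```python
import math
import random
from statistics import fmean


def run_study(rng, n, carrier_freq, true_effect):
    """Simulate one study; return (effect estimate, z score) or None."""
    carriers, others = [], []
    for _ in range(n):
        is_carrier = rng.random() < carrier_freq
        y = rng.gauss(true_effect if is_carrier else 0.0, 1.0)
        (carriers if is_carrier else others).append(y)
    if len(carriers) < 2 or len(others) < 2:
        return None  # too few people in one group to estimate anything
    estimate = fmean(carriers) - fmean(others)
    # Standard error assuming a known residual standard deviation of 1.
    std_err = math.sqrt(1.0 / len(carriers) + 1.0 / len(others))
    return estimate, estimate / std_err


def winners_curse(n_studies=1000, n=1000, carrier_freq=0.02,
                  true_effect=0.5, z_threshold=3.0, seed=7):
    """Return (mean estimate over all studies, mean over significant ones, count)."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_studies):
        r = run_study(rng, n, carrier_freq, true_effect)
        if r is not None:
            results.append(r)
    all_estimates = [est for est, _ in results]
    significant = [est for est, z in results if z > z_threshold]
    mean_sig = fmean(significant) if significant else float("nan")
    return fmean(all_estimates), mean_sig, len(significant)


if __name__ == "__main__":
    mean_all, mean_sig, n_sig = winners_curse()
    print(f"mean estimate, all studies:         {mean_all:.3f}")
    print(f"mean estimate, significant studies: {mean_sig:.3f} (n={n_sig})")
```

Averaged over every simulated study, the estimate should sit close to the true effect of 0.5. Restricted to the studies that "won" by crossing the significance cutoff, the average should be noticeably higher, even though the signal in every one of them is real.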
Leal and her colleague Dajiang Liu developed a new technique, based on a re-sampling approach, that can tease out those rare-variant effects in an unbiased manner using only the discovery population. They presented it in The American Journal of Human Genetics in October.
For common variants, Leal notes, researchers would turn to a new sample, since a replication group would lack that bias, but such an approach would not work for rare variants. "Because of the amount of population substructure and the different allelic architectures in different populations, if you are trying to estimate these genetic parameters even in a slightly different population, so like if you are looking at Italians then you go to the French, you can have different estimates," she says. "[If your] discovery population is different from your replication, it can not only be due to the winner's curse, it can also just be because of the difference in allelic architecture. We thought that this was quite important to have unbiased estimates, also which you could obtain within the original discovery sample."