In this week's Nature Genetics, a University of Michigan-led team presents a new statistical method for overcoming the problem of case-control imbalance in large-scale genetic association studies. According to the researchers, falling genotyping costs have allowed biobanks to genotype all their participants, enabling phenome-wide association studies at genome-wide scale in hundreds of thousands of samples. For most binary traits, however, the biobanks have substantially fewer cases than controls, an imbalance that existing statistical models do not address well. The researchers propose a novel generalized mixed model association test that uses the saddlepoint approximation to calibrate the distribution of score test statistics, and they demonstrate its effectiveness through an analysis of data on more than 400,000 samples from the UK Biobank.
Also in Nature Genetics, researchers from Massachusetts General Hospital and elsewhere present genome-wide polygenic risk scores for five common diseases (coronary artery disease, atrial fibrillation, type 2 diabetes, inflammatory bowel disease, and breast cancer), highlighting the potential utility of polygenic risk prediction in clinical care. With their approach, the investigators identified the percentage of UK Biobank participants at greater than threefold increased risk for each of the diseases. For coronary artery disease in particular, the share of participants at that level of polygenic risk is 20-fold higher than the carrier frequency of rare monogenic mutations conferring comparable risk, they write. "Additional studies are warranted to develop polygenic risk scores for many other common diseases with large GWAS data and validate risk estimates within population biobanks and clinical health systems." GenomeWeb has more on this, here.