Genome-wide association studies have proven their worth in pinpointing genes that play a role in human disease, but they are far from perfect. The genealogy of the individuals in these large-scale studies can throw a wrench in the works, because pairs of individuals in a study are rarely completely unrelated. This pairwise relatedness has occasionally led researchers to believe they have discovered a gene involved in a particular disease when the association is in fact an artifact. While researchers have statistical approaches for dealing with relatedness in the form of either population structure or hidden relatedness, a team of scientists from the University of Michigan and the University of California, Los Angeles, has developed a single statistical approach that deals with both forms. The method has the added benefit of dramatically speeding up the analysis, from years to just a few hours.
"Previously, researchers were able to address these issues using a combination of different techniques, but our method can address all these issues," says Eleazar Eskin, an associate professor of computer science at UCLA. "People have been working on this problem for over 10 years, and our method makes real improvements over these other methods." Eskin and his team used their freely available software package called EMMAX — efficient mixed-model association expedited — to perform association analysis for 10 quantitative traits from the Northern Finland Birth Cohorts and seven common diseases from the Wellcome Trust Case Control Consortium. The group's findings, published in the March issue of Nature, demonstrate how EMMAX outperforms both principal component analysis and genomic control when correcting for sample structure. "Our technique takes into account all of the relationships between the individuals when we compute the correlations, and when we compute it between a genetic variation and a disease trait. We're also taking into account the genetic distance between all the pairs of individuals in our sample," Eskin says. Moving forward, the team hopes that EMMAX will also be able to aid in analyzing admixed populations, or sample sets comprised of individuals of diverse ancestries.