Age of Disease Onset, Family History Inclusion Boosts GWAS Power, Study Finds

NEW YORK — A new approach taking age of disease onset and family history into consideration may boost the power of genome-wide association studies, a new analysis has found.

Most GWAS have relied on case-control data, which may not account for whether individuals have reached the age range when the disease of interest generally is diagnosed, or whether controls have a family history of the disease.

A Danish team of researchers has now developed a multivariate liability threshold model that extends an existing one that is conditioned on family history to also account for age of onset and sex. They proposed that these changes could improve power in GWAS.

As they reported in the American Journal of Human Genetics on Tuesday, the researchers applied their approach, dubbed LT-FH++, to both simulated data and real data from the UK Biobank and the iPSYCH dataset. They estimated that LT-FH++ could improve the statistical power of GWAS by up to 61 percent compared to typical case-control approaches.

"As more genetic datasets with linked health records and family information become available, e.g., in large national biobank projects, we expect the value of statistical methods that can efficiently distill family history and individual health information into biological insight will only increase," senior author Bjarni Vilhjálmsson from Aarhus University and his colleagues wrote in their paper.

The LT-FH++ approach builds on the idea that each person has a certain liability for a disease and that once they pass a certain threshold that is determined by the sample or population prevalence, they are considered to be a case. In the previous LT-FH model, that liability was broken down into genetic and environmental components, where the genetic component can include family history. The new LT-FH++ approach personalizes that liability threshold based on the person's age, birth year — to account for cohort effects — and sex.
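The threshold idea can be illustrated with a short Python sketch. It is not the authors' implementation. It assumes the common convention that liability follows a standard normal distribution, so the threshold is the point above which a fraction of the population equal to the prevalence falls. The function names and the prevalence table are hypothetical, and the numbers are invented purely for illustration.

```python
from statistics import NormalDist

# Assumption: liability is modeled as a standard normal variable.
STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def liability_threshold(prevalence: float) -> float:
    """Return T such that P(liability > T) equals the given prevalence."""
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must be strictly between 0 and 1")
    return STANDARD_NORMAL.inv_cdf(1.0 - prevalence)


# Hypothetical cumulative incidence by (sex, birth year, current age).
# These values are made up for illustration, not taken from any register.
HYPOTHETICAL_CUMULATIVE_INCIDENCE = {
    ("F", 1950, 40): 0.01,
    ("F", 1950, 70): 0.10,
    ("M", 1950, 70): 0.15,
}


def personalized_threshold(sex: str, birth_year: int, age: int) -> float:
    """Look up the stratum-specific prevalence and convert it to a threshold."""
    prevalence = HYPOTHETICAL_CUMULATIVE_INCIDENCE[(sex, birth_year, age)]
    return liability_threshold(prevalence)


if __name__ == "__main__":
    for key in HYPOTHETICAL_CUMULATIVE_INCIDENCE:
        print(key, round(personalized_threshold(*key), 3))
```

In this sketch, a younger person in a stratum with lower cumulative incidence gets a higher threshold, so being unaffected at 40 says less about their liability than being unaffected at 70.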

The researchers benchmarked their method against LT-FH and a case-control approach using simulated data. They found that over 10 simulations, LT-FH++ had power improvements between 34 percent and 61 percent over standard GWAS. By comparison, LT-FH had a power improvement between 14 percent and 54 percent. These power gain estimates, though, varied by sample size and completeness of family or age-of-onset information.

They further applied LT-FH++ to real data from the UK Biobank and the Danish iPSYCH register. Specifically, they conducted a GWAS of mortality with UK Biobank data using the LT-FH++, LT-FH, and case-control approaches. With the standard case-control approach, the researchers were unable to find any significant SNPs, but the LT-FH method uncovered two SNPs with genome-wide significance: one at APOE, which has been associated with mortality, and one at HYKK, which is associated with smoking behavior. LT-FH++ identified those two SNPs as well as eight additional ones, including SNPs near HLA-B, MYCBP2, and ZBBX.

Meanwhile, using the iPSYCH dataset, the LT-FH++ approach identified more genome-wide significant associations than the others across the disorders examined.

However, the researchers noted that for ADHD, there was little power improvement using either LT-FH or LT-FH++ instead of a case-control approach, which they said could be due to the underlying assumptions of the multivariate liability threshold model, such as that there is no environmental covariance between family members or that there are no differences in genetic architecture by age of diagnosis.

Vilhjálmsson and his colleagues added that their approach provided the largest power gains when cases were ascertained through downsampling and when prevalence was high. They cautioned that it is also limited by access to detailed health register data, though they noted that the approach can be applied to individuals with missing or partial data and that prevalence rates could be obtained from national statistics.
