Geisinger PheWAS Uncovers SNPs Associated With Diverse Diseases, Clinical Measurements

NEW YORK (GenomeWeb) – Researchers with the Geisinger Health System, the University of Pennsylvania, and elsewhere have presented results from a large-scale phenome-wide association study (PheWAS) simultaneously spanning hundreds of diseases or clinical measurements.

Using electronic health record data for tens of thousands of genotyped participants in the MyCode Community Health Initiative, the team looked for variants coinciding with a broad range of disease diagnoses or clinical laboratory measurements. These apparent associations were assessed alongside results from related genome-wide association study data, when available.

Moreover, the researchers started digging into the risk variants implicated in the PheWAS and large-scale association studies published previously, searching for functional explanations for some of the associations.

"The comprehensive nature of this PheWAS allows for novel hypothesis generation, the identification of phenotypes for further study for future phenotypic algorithm development, and identification of cross-phenotype associations," senior author Sarah Pendergrass, a biomedical and translational informatics researcher with Geisinger Health System, and her colleagues wrote in a study published online today in the American Journal of Human Genetics.

For their analysis, the researchers considered 541 diagnostic codes from the International Classification of Diseases, version 9 (ICD-9). Along with those binary diagnostic outcomes, they tracked continuous outcomes using average measurements from more than two dozen clinical laboratory tests in Geisinger MyCode Community Health Initiative study participants.

Starting with data for more than 50,700 genotyped individuals, including 45,899 individuals genotyped on the Illumina HumanOmniExpress Exome bead chip array, the team focused on 635,525 SNPs in 38,622 unrelated individuals with sufficient data quality.

"A PheWAS at this scale, where we computed a total of 343,819,025 associations for the diagnostic codes and 15,888,125 associations for the clinical lab measures, presented several big data challenges such as computational burden, high throughput result interpretation, and visualization of the results," the authors noted.
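The test counts in that quote follow directly from the study dimensions reported above: one association per SNP per phenotype. The short Python sketch below checks the arithmetic. The figure of 25 lab measures is inferred here by dividing the reported lab-test total by the SNP count; the article itself says only "more than two dozen." The Bonferroni helper is purely illustrative of how a multiple-testing threshold scales with the number of tests, since the article does not state which significance threshold the authors used.

```python
# Figures reported in the article.
N_SNPS = 635_525
N_ICD9_CODES = 541
REPORTED_DIAGNOSTIC_TESTS = 343_819_025
REPORTED_LAB_TESTS = 15_888_125

# One association test per SNP per diagnostic code.
diagnostic_tests = N_SNPS * N_ICD9_CODES

# Infer the number of lab measures from the reported total
# (an inference from the numbers, not a figure quoted in the article).
n_lab_measures, remainder = divmod(REPORTED_LAB_TESTS, N_SNPS)


def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Illustrative Bonferroni-corrected p-value cutoff.

    Assumption: shown only to convey scale; this is not necessarily
    the threshold the study authors applied.
    """
    return alpha / n_tests


if __name__ == "__main__":
    print(f"Diagnostic-code tests: {diagnostic_tests:,}")
    print(f"Inferred lab measures: {n_lab_measures} (remainder {remainder})")
    print(f"Illustrative cutoff: {bonferroni_threshold(diagnostic_tests):.2e}")
```

Under these assumptions, the product of SNPs and diagnostic codes matches the quoted 343,819,025 exactly, and the lab-measure total divides evenly by the SNP count.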

On the diagnostic code side, for example, the team identified more than 1,100 phenome-wide significant associations, including new and known risk variants for conditions ranging from diabetes to psoriasis, heart disease, and hypertension.

Another 3,024 associations reached phenome-wide significance for clinical lab measurements, the researchers reported, revealing thousands of SNPs linked to levels of bilirubin, blood glucose, and other traits measured by clinical lab tests.

Along with analyses focused on genes closest to potential risk variants, the team considered potential sources of pleiotropy in the associations, while diving into the nature of these associations with functional, regulatory, and epigenetic data.

"Further, epigenomics knowledge of non-coding regions of the genome helped us to refine the genetic associations, to illustrate the biological relevance to the associated disease," the authors wrote. "With these results, we provide a landscape of associations across diseases and quantitative traits, a series of potentially novel associations, and cross-phenotype associations, all within the context of protein-coding and regulatory impact of genetic variants."
