NEW YORK (GenomeWeb) – Researchers from Vanderbilt University Medical Center have used data from electronic health records to develop phenotypic risk scores to identify possible undiagnosed genetic diseases in patients suffering from a number of conditions.
The Vanderbilt team developed phenotypic risk scores for more than 1,200 Mendelian diseases based on the clinical features of conditions captured by electronic health records. When they applied these phenotypic risk scores to more than 21,700 people who'd undergone genotyping, they uncovered 18 genetic associations between rare variants and phenotypes consistent with Mendelian disease, including severe outcomes such as organ transplants. The study was published today in Science.
"We started with a simple idea: look for a cluster of symptoms and diseases to find an undiagnosed underlying disease," senior author Joshua Denny from Vanderbilt said in a statement. "Then we got really excited when we saw how we could systematize it across thousands of genetic diseases to figure out the impact of millions of genetic variants."
He and his colleagues first mapped clinical features of Mendelian diseases to phenotypes that could be extracted from electronic health records, such as heart failure or infertility. In particular, they relied on clinical synopses from the Online Mendelian Inheritance in Man (OMIM) database that had been annotated using the Human Phenotype Ontology (HPO), and then mapped those HPO terms to phecodes, which are groupings of EHR billing codes.
By gauging the extent to which a patient's clinical features, as gleaned from the phecodes, overlapped with the OMIM-HPO terms, the researchers computed a phenotypic risk score (PheRS) for that individual.
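The overlap scoring described above can be illustrated with a minimal sketch. It assumes each matching feature is weighted by the log of its inverse prevalence in the cohort, so that rarer features count for more; the article itself does not spell out the weighting, and the function names and feature labels below are hypothetical.

```python
import math

def feature_weights(cohort, features):
    """Weight each feature by log(1 / prevalence) in the cohort (an assumed scheme)."""
    n = len(cohort)
    weights = {}
    for code in features:
        carriers = sum(1 for codes in cohort.values() if code in codes)
        # Features never observed in the cohort contribute nothing.
        weights[code] = math.log(n / carriers) if carriers else 0.0
    return weights

def phers(patient_codes, disease_features, weights):
    """Sum the weights of the disease features that appear in the patient's record."""
    return sum(weights[c] for c in disease_features if c in patient_codes)

# Toy cohort: patient ID -> set of phecode-like labels (hypothetical).
cohort = {
    "p1": {"heart_failure", "infertility"},
    "p2": {"heart_failure"},
    "p3": set(),
    "p4": {"asthma"},
}
disease_features = {"heart_failure", "infertility"}
weights = feature_weights(cohort, disease_features)
for patient_id, codes in cohort.items():
    print(patient_id, round(phers(codes, disease_features, weights), 3))
```

Under this scheme, a patient with none of a disease's features scores zero, and a patient who matches several uncommon features scores highest.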
They first tested their approach for six Mendelian diseases in a cohort of clinically diagnosed patients and controls. For five of the six conditions, the researchers reported that PheRSs were strong predictors of disease status. The sixth condition, phenylketonuria, served as a negative control because treatment for the disease eliminates its manifestations. These results indicated to the researchers that their PheRSs captured the characteristics of those diseases.
Denny and his colleagues then computed PheRSs for 1,204 Mendelian diseases in a discovery cohort of 21,701 genotyped adults of European ancestry from the BioVU DNA biobank, which links DNA samples to de-identified EHRs. In this cohort, they tested for associations between PheRSs and 6,188 rare variants in disease-linked genes, assuming a dominant genetic model.
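As a rough illustration of what a dominant genetic model means in such a test, the sketch below treats carriers of one or two copies of a variant identically and asks whether their scores run higher than those of non-carriers. A simple one-sided permutation test stands in here for whatever statistical model the study actually used, which the article does not describe; all names and data are hypothetical.

```python
import random
from statistics import mean

def dominant_coding(alt_allele_counts):
    """Dominant model: carrying one or two alternate alleles counts the same."""
    return [1 if n >= 1 else 0 for n in alt_allele_counts]

def mean_difference(scores, carrier):
    """Mean score of carriers minus mean score of non-carriers."""
    carriers = [s for s, c in zip(scores, carrier) if c]
    others = [s for s, c in zip(scores, carrier) if not c]
    return mean(carriers) - mean(others)

def permutation_p_value(scores, carrier, n_perm=2000, seed=0):
    """One-sided p-value: how often shuffled labels match or beat the observed gap."""
    rng = random.Random(seed)
    observed = mean_difference(scores, carrier)
    shuffled = list(carrier)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # keeps the number of carriers fixed
        if mean_difference(scores, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)

# Toy data: alternate-allele counts and scores for ten hypothetical people.
alt_counts = [0, 0, 0, 0, 0, 0, 1, 1, 2, 0]
scores = [0.1, 0.0, 0.3, 0.2, 0.0, 0.1, 2.5, 3.0, 2.8, 0.2]
carrier = dominant_coding(alt_counts)
print(carrier, permutation_p_value(scores, carrier))
```

With thousands of variants tested across many diseases, any real analysis would also need a correction for multiple testing, which this sketch omits.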
They uncovered 18 such associations, including two known cystic fibrosis-causing variants, though most, they noted, were novel. They then attempted to replicate five of the significant associations in a cohort of 9,441 people of European ancestry from the Marshfield Clinic and 3,820 people of non-European ancestry from Vanderbilt, and succeeded for four of them.
Most of the patients with significant variants had not been diagnosed with disease, even though some had a high burden of symptoms. For instance, four of the patients who were TG p.G77S heterozygotes (a variant linked to the Mendelian disease thyroid dyshormonogenesis) had undergone thyroidectomies, and of the 40 people who were heterozygous for HFE p.E168 (linked to hemochromatosis), four had received liver transplants.
To search for additional variants segregating among the high PheRS individuals, Denny and his colleagues sequenced the whole exomes of 84 individuals: 36 with elevated PheRSs and 48 without. Four of those with elevated PheRSs had a second, rare nonsynonymous variant within the target gene: two were likely compound heterozygotes and two were homozygotes for the variant uncovered in the discovery analysis. Three of those four patients with second variants had the highest PheRS for their respective diseases. In vitro analyses of three variants supported their likely pathogenicity.
The researchers noted that their findings indicated that some people who are heterozygous for disease variants may still exhibit symptoms and that Mendelian and complex genetic diseases appeared to exist on a spectrum.
"In view of our findings, familiar medical categories such as 'complex' versus 'genetic', or 'dominant' versus 'recessive' begin to appear more like continuums," first author Lisa Bastarache from Vanderbilt said in a statement.