NEW YORK (GenomeWeb) – Increasing numbers of generally healthy individuals have their genomes sequenced as part of research projects or through commercial services, but it is largely unknown how much medically useful information they can gain from this.
To help answer this question, researchers from the Marshfield Clinic in Wisconsin and Complete Genomics sequenced and analyzed the genomes of 300 deceased patients and compared the results with their long-term electronic health records. They found that a large percentage carried variants that could have had a clinical impact during their lifetime.
Max He, a researcher at the Center for Human Genetics and the Biomedical Informatics Research Center at the Marshfield Clinic Research Foundation, presented results of the study, which is under review for publication, earlier this month at the American College of Medical Genetics and Genomics annual meeting in Tampa and spoke with GenomeWeb last week.
The 300 patients analyzed in the study were part of the Personalized Medicine Research Project (PMRP), which the Marshfield Clinic launched in 2001. With more than 20,000 participants, it is one of the largest population-based genetic research projects in the US.
The patients had agreed to participate in the study but had said that they did not want any clinically actionable results returned, He said, which is why the researchers sequenced their genomes after they had died. At the time of their death, they each had at least 30 years' worth of electronic health records.
The team assessed whether any of the genetic variants the patients carried could have predicted the diseases they developed during their lifetime, adverse drug reactions they suffered, or even the cause of their death. "We wanted to check if whole-genome sequencing could be useful for clinical practice," He said.
Complete Genomics, a partner in the project, sequenced the genomes using blood samples the patients had submitted when they joined the PMRP. He and his team then analyzed the data using a toolset called SeqHBase that they had developed for the rapid analysis of large family-based genome or exome sequencing datasets and published last year.
Specifically, they analyzed variants in 149 genes: 56 disease-causing genes recommended for secondary findings analysis by ACMG, 60 additional disease genes described by the Exome Sequencing Project, and 33 genes that have been recognized by the US Food and Drug Administration to be involved in drug response.
While these genes could have been assessed by targeted sequencing rather than whole-genome sequencing, He said, a targeted approach might have missed certain splice site variants, and the quality of whole-genome sequence data is often better. Also, the collaboration with Complete Genomics provided a rare opportunity to assess the whole-genome approach, He added.
Overall, they identified 38 pathogenic variants in the 300 patients, including 10 novel loss-of-function variants. A total of 83 individuals, or 28 percent, carried at least one potentially pathogenic variant in one of the 116 disease genes. Also, 16 patients, or 5 percent, were predicted to have a Mendelian disorder, based on their genotype.
For more than 30 percent of the presumed pathogenic variants, the researchers found clinical outcomes in the medical records that were expected for these variants, and for another 20 percent, they found atypical results in the EHRs. Only two patients, however, had been clinically diagnosed with a genetic disease during their lifetime.
In several cases, the patients' deaths were likely linked to pathogenic variants they carried. One female patient, for example, carried a frameshift deletion in the BRCA1 gene and was diagnosed at age 53 with breast and ovarian cancer, to which she succumbed at age 59. Because she had no known family history of the disease, she did not receive annual breast exams or mammograms, and it is conceivable that such screening or other preventive measures could have extended her life, the authors wrote in their abstract.
Another patient had a mutation in the SCN5A gene, which has been associated with arrhythmia. This patient had a family history of atrial fibrillation and was diagnosed with it in her 60s. She required a pacemaker and eventually died from heart failure in her mid-80s.
But the researchers also found that for almost half the expected pathogenic variants — among them several novel loss-of-function variants — no corresponding disease could be corroborated by the electronic health record. One explanation is that the disease variants were not 100 percent penetrant, He said, and that they might not be pathogenic in every population. Also, the patients could have sought healthcare outside of Marshfield Clinic, which would not be documented in their EHRs.
The discrepancies between pathogenic variants and clinical phenotype are consistent with other studies, He said. For example, a recent study from the Electronic Medical Records and Genomics (eMERGE) project found that reportedly pathogenic mutations in two arrhythmia-associated genes were not associated with an abnormal phenotype in an unselected population.
Pharmacogenomic variants affected even more patients than disease-associated variants: 93 percent of patients carried variants in at least one of the 33 pharmacogenomic (PGx) genes, and almost 75 percent were prescribed drugs whose effects may have been altered by these variants. Several patients indeed had adverse drug events documented, some of which might have been prevented if the variant data had been available during their lifetime.
There are no firm plans yet for expanding the study to larger numbers, He said, though more study subjects are available: Since their enrollment, about 1,800 PMRP participants have passed away, and about 1,000 of them have more than 30 years' worth of EHRs. Marshfield Clinic is applying to become part of the Precision Medicine Initiative, He said, which might involve many more patients in its healthcare system, and the clinic has been in talks with a pharmaceutical company about large-scale genome sequencing of patients.