NEW YORK (GenomeWeb) — Researchers from the ongoing MedSeq clinical sequencing exploratory research project have published a report on their strategy for summarizing polygenic risk for complex diseases in whole-genome sequencing reports.
The study, published last month in Genetics in Medicine, describes the group's methodology for reporting complex disease risk from whole-genome sequence data by combining multiple risk alleles into what is called a multiplicative polygenic risk score, or MPRS.
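As a rough illustration of what "multiplicative" means here, the sketch below multiplies per-allele odds ratios across loci, raising each to the number of risk alleles a person carries. It is a minimal sketch of the general idea and not the MedSeq group's published formula; the function name, the odds ratios, and the genotype counts are all hypothetical.

```python
import math


def multiplicative_prs(odds_ratios, allele_counts):
    """Combine risk loci into a single multiplicative score.

    odds_ratios:   hypothetical per-allele odds ratio for each risk locus
    allele_counts: number of risk alleles carried at each locus (0, 1, or 2)
    """
    if len(odds_ratios) != len(allele_counts):
        raise ValueError("need one allele count per locus")
    log_score = 0.0
    for odds_ratio, count in zip(odds_ratios, allele_counts):
        if count not in (0, 1, 2):
            raise ValueError("allele count must be 0, 1, or 2")
        # Summing logs is equivalent to multiplying odds_ratio ** count,
        # and stays numerically stable when there are many loci.
        log_score += count * math.log(odds_ratio)
    return math.exp(log_score)


# Made-up example: three loci, two risk alleles at the first,
# none at the second, one at the third.
score = multiplicative_prs([1.2, 1.1, 1.3], [2, 0, 1])
print(round(score, 3))
```

A score of 1.0 in this sketch means no risk alleles were found at any of the measured loci; the raw number is only meaningful once it is compared with scores from a reference population.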
MedSeq, unlike many other CSER projects funded by the National Human Genome Research Institute, is exploring the effects and impacts of broad genome sequencing not only on subjects with a particular disease that sequencing might help diagnose or treat, but also on healthy individuals in the context of routine primary care.
In the project, researchers primarily report back to participants and their doctors several types of so-called "actionable" findings — pathogenic variants in known Mendelian disease genes, loss-of-function mutations in other disease-associated genes, as well as pharmacogenomic and other variants.
In addition, the team has developed a strategy to use whole-genome sequencing data to describe study subjects' risk for complex diseases using a polygenic risk score method. Sek Won Kong, the first author of the study, told Clinical Sequencing News that complex disease risk calculations are being reported back to participants in MedSeq along with other actionable variants.
Because whole-genome sequencing, unlike exome sequencing, covers not only coding regions but the entire genome, it opens the possibility of reporting genetic risk variants for complex diseases, which occur predominantly in intronic and other non-coding regions.
"Since MedSeq is focused on whole-genome sequencing," Kong said, "we wanted to see what information we could extract. Standard actionable variants, of course, and risk alleles were the two things we could think of right away."
To take advantage of the potential to interpret risk variants for complex diseases, the MedSeq project sought to develop a strategy for estimating and summarizing risk.
In developing their strategy, which they call a multiplicative polygenic risk score, Kong and his colleagues focused their reporting on alleles associated with cardiovascular disease, which is the subject of the disease-focused arm of the MedSeq project.
According to the authors, a future in which predicting complex disease risk based on reported risk alleles becomes widely used in clinical practice is predicated on the collection of much more, and much better, data. This would require "rigorous large-scale studies in clinical cohorts for which validity, clinical utility, and cost-effectiveness can be demonstrated."
It may also require that personal genomics and direct-to-consumer genetics companies — the gatekeepers of vast population-scale datasets on complex disease risk variants, and increasingly also how the frequency of these variants correlates with genetic ancestry patterns — agree to share their data.
But in the meantime, one step that can be taken, the authors wrote, is the establishment of standardized reporting strategies for genetic risk information from whole-genome sequencing data.
To take that step, even in the context of a currently limited pool of well-characterized risk alleles, Kong and the MedSeq group set out to develop a method of normalized risk scoring using data from the 1000 Genomes Project as a population reference.
For the MedSeq reports, the strategy focuses on cardiovascular disease traits or phenotypes, including abdominal aortic aneurysm, atrial fibrillation, coronary heart disease (CHD), type 2 diabetes (T2D), hypertension, metabolic syndrome, platelet aggregation, and QT prolongation.
Unlike genetic risk reporting by companies like 23andMe and its shorter-lived peers in the direct-to-consumer genetic testing field, the MedSeq researchers decided not to report an absolute disease risk estimate for each person. Instead, they use a ranking system that compares an individual's risk to that of the general population using deciles, placing each patient in one of 10 risk categories spanning from 0-10 percent to 90-100 percent.
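The decile ranking can be sketched in a few lines: score the same loci in a reference population, find where the individual's score falls among those reference scores, and report the 10-point band. This is only an illustration under stated assumptions. The reference scores below are made up rather than drawn from the 1000 Genomes Project, the function names are hypothetical, and counting tied scores as half is a choice made for this sketch, not something the study specifies.

```python
from bisect import bisect_left, bisect_right


def percentile_rank(score, reference_scores):
    """Percent of reference scores below `score` (ties count as half)."""
    ref = sorted(reference_scores)
    if not ref:
        raise ValueError("reference population is empty")
    below = bisect_left(ref, score)
    ties = bisect_right(ref, score) - below
    return 100.0 * (below + 0.5 * ties) / len(ref)


def decile_label(percentile):
    """Map a 0-100 percentile onto one of ten bands, e.g. '70-80 percent'."""
    # Clamp so that a percentile of exactly 100 lands in the top band.
    lower = min(int(percentile // 10) * 10, 90)
    return f"{lower}-{lower + 10} percent"


# Made-up reference scores standing in for a population panel.
reference = [0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7]
rank = percentile_rank(1.45, reference)
print(rank, decile_label(rank))
```

Reporting the band rather than the raw score keeps the result relative: it says how a person compares with the reference population, not how likely that person is to develop the disease.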
For example, a patient report summarized in the study shows how population prevalence of a certain phenotype and the expected proportion of variation that can be explained by common genetic variants are listed alongside the number of risk loci measured, the total number of alleles identified, and the subject's decile, or percentile risk ranking.
In addition, according to the authors, the genetic risk is presented in the context of medical and family histories, lifestyle, and other factors.
"An individual with a high polygenic risk of a particular disease may have a modest overall risk once non-genetic factors are considered," the authors wrote.
According to the researchers, a major challenge for reporting relative genetic risk for complex disease is that studies that establish new risk loci, as well as population datasets like the 1000 Genomes Project that are used to contextualize individual risk scores, have not comprehensively explored different ancestral and ethnic populations. A rare variant in one population can be very common in another, and its contribution to disease risk may thus vary widely among different ancestries.
Addressing this challenge will be a central need in future studies, Kong said, in order to expand the ability of WGS data to provide meaningful complex disease risk information.
As part of its overall goals, the MedSeq project is tracking the clinical application of the complex disease risk portion of its reporting. According to the study authors, while more rigorous data will be necessary to establish the clinical utility of WGS-based disease risk assessments, the group hopes that combining even their current limited strategy with family history and other clinical information will be informative for clinicians and patients in the project, and potentially impactful on health behaviors and clinical decision making.