NEW YORK (GenomeWeb) – At the annual meeting of the Association for Molecular Pathology last week, many talks focused on the growing challenges in variant interpretation that have followed the expansion of genetic testing, and how existing strategies might be improved to make interpretation more uniform across the field.
In one session, researchers shared early data from an AMP working group project called VITAL (Variant Interpretation Testing Across Laboratories), which just finished its first round of classifications in October.
A volunteer-based classification challenge, VITAL aims to uncover the precise reasons that labs might vary in their interpretation of the same genetic variant, even when they employ strategies or use guidelines that have been developed to make this process more uniform.
ACMG, along with AMP and the College of American Pathologists, published a newly updated variant classification system in March of last year.
The guide is intended as a resource for labs and clinical geneticists to help overcome inconsistency in how they interpret inherited disease variants identified via genome sequencing or other molecular tests.
Published in Genetics in Medicine last year, the updated 2015 guidelines include standard terminology — "pathogenic," "likely pathogenic," "uncertain significance," "likely benign," and "benign" — for characterizing variants.
The guide also describes and codifies standardized classes of evidence, including population, computational, functional, and segregation data that labs should use to define the status of a particular variant.
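To make the combining step concrete, the simplified sketch below shows how weighted evidence codes of the kind the guidelines define (very strong, strong, moderate, and supporting pathogenic criteria; stand-alone, strong, and supporting benign criteria) might be tallied into the five-tier terminology. The rule logic is an abbreviated, illustrative subset written for this example, not the guidelines' full combining tables, which are spelled out in the Genetics in Medicine paper.

# Illustrative only: an abbreviated subset of evidence-combining logic in the
# spirit of the 2015 ACMG-AMP framework; the published guidelines define more
# combinations than are captured here.
from collections import Counter

def classify(evidence_codes):
    """Map a list of evidence codes (e.g., ["PVS1", "PM2", "PP3"]) to a five-tier call."""
    tiers = Counter()
    for code in evidence_codes:
        # Bucket each code by its strength prefix (order matters: PVS before PS, etc.).
        for prefix in ("PVS", "PS", "PM", "PP", "BA", "BS", "BP"):
            if code.startswith(prefix):
                tiers[prefix] += 1
                break

    # Benign side (simplified): one stand-alone criterion, or multiple benign criteria.
    if tiers["BA"] >= 1 or tiers["BS"] >= 2:
        return "benign"
    if (tiers["BS"] >= 1 and tiers["BP"] >= 1) or tiers["BP"] >= 2:
        return "likely benign"

    # Pathogenic side (simplified subset of the published combinations).
    if tiers["PVS"] >= 1 and (tiers["PS"] >= 1 or tiers["PM"] >= 2):
        return "pathogenic"
    if tiers["PS"] >= 2:
        return "pathogenic"
    if (tiers["PVS"] >= 1 and tiers["PM"] >= 1) or \
       (tiers["PS"] >= 1 and tiers["PM"] >= 1) or tiers["PM"] >= 3:
        return "likely pathogenic"

    # Anything that satisfies neither side falls to the middle category.
    return "uncertain significance"

print(classify(["PVS1", "PS3"]))        # pathogenic
print(classify(["PS1", "PM2"]))         # likely pathogenic
print(classify(["PM2", "PP3"]))         # uncertain significance
print(classify(["BA1"]))                # benign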
However, attendees at last week's AMP meeting highlighted that even with these updated tools, labs are still not calling all variants as uniformly as the field would likely prefer.
"We have seen and heard over the last year … that people may have questions about how to use specific evidences," said Elaine Lyon, medical director of both molecular genetics and pharmacogenomics at ARUP laboratories, a former AMP president, and a member of the VITAL working group. Other group members include Sherri Bale, Julie Gastier-Foster, Madhuri Hegde, and Carolyn Sue Richards.
With VITAL, Lyon and colleagues hope to analyze how labs and individuals are reaching a particular classification and use that data to identify specific aspects of the system that most need clarification.
"While [the 2015] guidance recommendation is a good step forward to help achieve uniformity in variant classification across laboratories, it requires widespread testing and validation to assess its strengths and weaknesses and fine-tune the process," AMP says on the website for its VITAL challenge.
Improving the uniformity of variant classification is not only important for the good of patients, but could also be important as laboratories navigate a future in which genetic testing is subject to the same scrutiny and potential liability as other medical procedures and interventions.
Earlier this year, a wrongful death case was brought against Quest Diagnostics and Athena Diagnostics alleging that Athena failed to follow federal lab regulations and accurately classify the genetic mutation causing a young boy's epileptic seizures, leading him to receive treatment that worsened his condition and ultimately caused his death.
If the case goes forward, a jury will be faced with deciding whether Athena was negligent in failing to determine a pathogenic link between the boy's mutation and a known seizure disorder called Dravet syndrome, based on the available literature on the variant in 2007 and the lab's internal protocols for classifying variants at the time.
The question of liability in the case of genomic variant interpretation is complicated by the challenge of defining negligence in a field that lacks standardization. There are no mandated standards for how genetic testing labs classify and interpret detected variants. The ACMG guidelines are voluntary, and the field is ever evolving.
In the wake of the case filing, a high-profile, independent committee of the National Academies of Sciences, Engineering, and Medicine met last month to consider some of these issues.
If variants are not interpreted consistently across testing labs, it becomes difficult for a single lab to defend its actions (or for a plaintiff to argue negligence) based on allegiance to (or deviation from) standard industry practices.
The hope has been that tools like the newest ACMG-AMP guidelines can help reduce variability in how variants are classified from lab to lab. Unfortunately, even using a consistent strategy or resource, individual genetic testing professionals may come to divergent conclusions about the pathogenicity of a variant.
With VITAL, and other studies that have been enabled by the new guidelines, researchers in the field now hope to conduct precise analyses of where and why labs concur or diverge in their variant calling — to systematically identify the reasons that one lab calls a variant one way and another does not and develop strategies to overcome these discordances.
For example, in an early pilot project testing the ACMG-AMP system published in May of this year in the American Journal of Human Genetics, researchers from the Clinical Sequencing Exploratory Research consortium (CSER) found that clinical labs initially agreed only 34 percent of the time when classifying variants.
Encouragingly, inter-lab agreement in that study improved to 71 percent after participants discussed their variant calls by phone or email. In about 5 percent of cases, however, remaining discordances could affect patients' medical management, the study concluded.
One reason the AJHG authors cited for the increased consensus following these discussions is that the conversations revealed that one lab or another had not actually applied the ACMG-AMP rules appropriately.
At AMP, Lyon said that VITAL aims to identify more of these same kinds of patterns and hopefully use that information to develop improvements to the ACMG-AMP interpretation guide.
Labs can sign up online to participate in the challenge, which has so far consisted of one set of 10 cases — soon to be followed by a second set. The working group members each contributed cases to the challenge, aiming to represent a range of different types of challenging cases.
Participants in VITAL then use the provided information, which includes a case scenario, along with the terminology (pathogenic, likely pathogenic, VUS, likely benign, benign) and other schema from the ACMG-AMP guidelines, to determine the pathogenicity of each variant.
They are also instructed to report the evidence criteria they used to reach that classification and to rate the difficulty of classifying each variant. Submissions are anonymized.
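As a rough illustration of what each lab reports per case, a single submission could be represented by a record along the following lines; the field names and example values here are hypothetical, chosen for illustration rather than drawn from the actual VITAL submission form.

from dataclasses import dataclass, field
from typing import List

# Hypothetical record for one VITAL challenge response; the field names and
# example values are assumptions for illustration, not the actual challenge form.
@dataclass
class VitalSubmission:
    case_id: str                       # which challenge case the response addresses
    classification: str                # pathogenic, likely pathogenic, VUS, likely benign, or benign
    evidence_codes: List[str] = field(default_factory=list)  # ACMG-AMP criteria the lab invoked
    difficulty: int = 3                # self-rated difficulty, e.g., 1 (easy) to 5 (hard)

example = VitalSubmission(
    case_id="case_04",                 # placeholder identifier
    classification="likely pathogenic",
    evidence_codes=["PS3", "PM2", "PP3"],
    difficulty=4,
)
print(example.classification, example.evidence_codes)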
Although it has only been a few weeks since the results from the first set of 10 cases came in, Lyon said at the meeting that the working group has already pulled out some high-level data that speak to the challenges of implementing the ACMG-AMP system.
In an email to GenomeWeb this week, Lyon said that the working group strove to collect variants that would be challenging in different ways. "This doesn't represent a 'typical day' in the life of a molecular geneticist," she wrote.
Attendees at the AMP presentation, many of whom raised their hands when asked if they had participated in the first round of 10 challenges, echoed this sentiment. "It's not that these [cases] were the most difficult," one attendee said. "I think we have all seen even more challenging examples in our own labs, we just don't see nine of them in a day."
According to Lyon, many of the individual challenge cases seemed to illustrate that use of the guidelines does lead to a majority of labs reaching a close consensus.
In a few cases, at least 80 percent of respondents clustered around the benign/likely benign categories or the pathogenic/likely pathogenic categories.
For one of the 10 — a case with two variants in the gene XYLT2 — 95 percent of the respondents classified the first variant as either pathogenic or likely pathogenic. For the second variant, 100 percent of the group returned one of those two classifications.
Importantly, it also seemed, for at least some of the examples, that respondents took similar strategies in their use of the guidelines' evidentiary categories, although Lyon said at the meeting that the AMP working group hasn't yet been able to analyze this in detail.
In contrast, the set also included cases where there was much more variability in the classifications labs returned. In one challenge, featuring a variant in the gene PDE11A, responses ranged over the full spectrum from benign to pathogenic, with significant proportions at opposite ends.
In terms of both the classifications and the evidence criteria cited to support them, "this one was all over the place," Lyon said at the meeting.
As the working group moves forward with a closer analysis of these patterns, the hope is that they will be able to pick out specific areas where the evidence criteria delineated in the ACMG guide may not be used or considered correctly.
Then they can start to work on developing improvements to the guidance to help fix these problems.
"Most laboratories consistently use evidences such as allele frequency or absence in population databases. [But] there were several very difficult variants that had more variability in classifications as well as the evidences used. These will be the ones that will help us determine which evidences may be used differently than intended," Lyon told GenomeWeb in her email.
During the presentation Q&A, Steve Lincoln, senior vice president at the genetic testing firm Invitae, suggested that if the challenge was divided by specialty so that labs with a particular expertise — in cardiovascular genetics, for example — could focus on samples in their wheelhouse, the implementation of evidence and variant calling might reach higher consensus levels.
Getting specific
At another session at the AMP meeting, Stanford University's Jillian Buchan discussed work by her group as part of a ClinGen project bringing together professionals with particular disease specialties to try to tweak the ACMG-AMP guidelines in ways that could help the field as a whole.
Overall, the ACMG-AMP guidelines were designed as a general guide across diseases, Lyon said. Not all genetic testing labs focus on specific disorders, and broad sequencing-based assays are growing in use and scope. Several of the VITAL cases were exome-sequencing results, she said in her presentation.
But that doesn't discount that specific expertise will be needed to adapt and improve the guidelines in many areas, she told GenomeWeb.
"Modifying evidences for gene or disease-specific applications was expected and encouraged," Lyon said. "One particularly helpful modification is to develop gene-specific data [concerning] greater than expected frequency for the disease prevalence. Another is to identify the types of functional assays that are relevant."
In the presentation on her and her colleagues' ClinGen working group project, Buchan said that while the guidelines may improve concordance across the broad range of diseases and variants, there may be niche areas where using them actually results in less concordance than would be seen if labs stuck to their own internal classification strategies.
For example, she said, her group found in an early experiment focused on cardiomyopathy genetics that three labs classified only three of 10 variants concordantly using the ACMG-AMP strategies, but when asked to go back and use their own methods, they managed to reach concordance on all 10.
Using the guidelines should result in higher concordance between labs, not lower, she said, so this raised the question of why, and prompted a pilot study looking at how modifications to the ACMG-AMP rules might affect classification calls.
First, the researchers worked out modifications to specific rules, removing criteria that were detrimental in the specific context of inherited cardiomyopathies and adding two new evidentiary rules.
In a second phase of their study, the group then tested these new adapted rules, collecting a set of 60 variants in the MYH7 gene and having two individuals classify each. The result was that 52 of the classifications were concordant, and eight were discordant.
As a bonus, Buchan said that the team was able to submit the 60 variants to the NIH ClinVar database with three-star assertion levels at the end of the pilot.
Data sharing
Strategies to help labs weigh evidence for variant pathogenicity aside, many at the AMP meeting also argued that substantial improvements in the uniformity of variant classification and interpretation will rely on greater data sharing.
If information about the pathogenicity of variants is not entered into the public sphere, it can't be used, no matter how clear or standardized guidelines for interpretation become.
These discussions have extended well beyond AMP and have intensified over the past few years.
Heidi Rehm — a leader of the ClinGen consortium and director of the Laboratory for Molecular Medicine at Partners HealthCare Center for Personalized Genetic Medicine — has argued that lab-accrediting organizations, scientific journals, insurers, and the US Food and Drug Administration should all require labs to share their variant interpretations.
Rehm and others have suggested that there is a critical need for labs to deposit interpreted variants and supporting evidence into the ClinVar database of genotype-phenotype relationships, to ensure that all patient reports, regardless of the genetic testing lab or company that produced them, benefit from access to all available evidence.
Rehm has seen some success in these efforts. For example, the FDA has said that it hopes to recognize or certify public genetic variant databases that meet certain standards.