Study Explores ClinVar Variant Misclassification With Population Genome, Exome Sequences

NEW YORK (GenomeWeb) – Researchers have found that the proportion of ClinVar variants picked up in population genome or exome sequences significantly outpaced the prevalence of corresponding diseases, consistent with variant misclassification in the database.

A research team led by Amalio Telenti and Craig Venter, with members from the J. Craig Venter Institute, Human Longevity, and Baylor College of Medicine, began by analyzing ClinVar variant frequencies in whole-genome sequences for nearly 10,500 individuals. Across more than 500 disease-associated genes, the group saw more suspicious (pathogenic or likely pathogenic) variants than expected given the actual rates of related disease in the general population.

Instances of "inflation" — marked by variant misclassification or apparent genetic risk that exceeds disease risk — were more common for variants that were deemed pathogenic based on relatively lower levels of evidence, the researchers reported. Based on patterns in the original genomes and another 138,000 exomes, they found that rare variant misclassification is likely a large contributor to the types of inflation detected.
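As a rough illustration of the "inflation" idea, the sketch below compares how often a cohort carries a supposedly pathogenic variant with how common the disease actually is. The function name and all numbers are hypothetical, and the study's own metric may be defined differently.

```python
def inflation_ratio(carriers: int, cohort_size: int, prevalence: float) -> float:
    """Return observed carrier frequency divided by disease prevalence.

    A ratio well above 1 means more people carry "pathogenic" variants
    than the disease rate can account for.
    """
    if cohort_size <= 0:
        raise ValueError("cohort_size must be positive")
    if not 0 < prevalence <= 1:
        raise ValueError("prevalence must be in (0, 1]")
    carrier_frequency = carriers / cohort_size
    return carrier_frequency / prevalence


# Hypothetical example: 105 carriers among 10,500 genomes (1 percent)
# for a condition assumed to affect 1 in 10,000 people.
ratio = inflation_ratio(carriers=105, cohort_size=10_500, prevalence=1 / 10_000)
print(f"Apparent genetic risk is about {ratio:.0f} times the disease prevalence")
```

With these made-up inputs, carriers are roughly 100 times more common than the disease, the kind of gap that points to misclassification or low penetrance.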

The authors subsequently retraced ClinVar variant reclassifications, comparing the version available in September 2017 with the May 2016 edition. That analysis suggested that a significant proportion of pathogenic or likely pathogenic variants have shifted to more benign or indistinct risk groups as more evidence is incorporated into the dataset.

"[M]ost of the re-classification in ClinVar feeds into 'conflicting interpretation,' [benign/likely benign], and [variant of uncertain significance], and away from [pathogenic/likely pathogenic]," Telenti, Venter, and their co-authors wrote in a study published in the American Journal of Human Genetics today, noting that the "trend of re-classification is expected as more knowledge is acquired and shared."
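A release-to-release comparison of this kind can be sketched as a simple tally of label changes. The variant identifiers, labels, and two small dictionaries below are invented for illustration; they are not ClinVar data, and this is not the authors' pipeline.

```python
from collections import Counter


def tally_reclassifications(old: dict[str, str], new: dict[str, str]) -> Counter:
    """Count (old_label, new_label) moves for variants present in both releases."""
    moves = Counter()
    for variant_id, old_label in old.items():
        new_label = new.get(variant_id)
        # Skip variants missing from the newer release or left unchanged.
        if new_label is not None and new_label != old_label:
            moves[(old_label, new_label)] += 1
    return moves


# Hypothetical snapshots of two database releases.
release_2016 = {"var1": "P/LP", "var2": "P/LP", "var3": "VUS", "var4": "B/LB"}
release_2017 = {"var1": "conflicting", "var2": "P/LP", "var3": "B/LB", "var4": "B/LB"}

for (old_label, new_label), count in tally_reclassifications(release_2016, release_2017).items():
    print(f"{old_label} -> {new_label}: {count}")
```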

Starting with genome sequences for 10,495 of the unrelated individuals that members of the same team profiled for a 2016 study in the Proceedings of the National Academy of Sciences, the researchers searched for pathogenic/likely pathogenic ClinVar variants, supported by varying levels of evidence, across the ACMG 59 genes. That gene set, which the American College of Medical Genetics and Genomics recommends for incidental findings reporting, is tied to 26 conditions. They also profiled variants in 463 genes implicated in 265 rare diseases from the OrphaNet/OrphaData database.

Based on data for more than 67,000 ClinVar variants, the researchers identified three ACMG 59 gene-associated conditions (malignant hyperthermia susceptibility, multiple endocrine neoplasia type 1, and hereditary paraganglioma-pheochromocytoma syndrome) with particularly inflated pathogenic/likely pathogenic variant profiles. Such inflation spanned 24 of the 26 ACMG 59 conditions when they considered variants with conflicting interpretation.

The team dialed down this inflation by folding in disease-specific allele frequency data, taking factors such as disease prevalence, inheritance patterns, and risk variant penetrance into account. Following that filtering, the inflation associated with the conflicting variant set declined for nine of the ACMG 59 conditions.
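One way to picture a filter built from prevalence, inheritance pattern, and penetrance is a Hardy-Weinberg ceiling on allele frequency: a variant more common than the ceiling cannot plausibly cause the disease at the assumed penetrance. The sketch below is a simplified stand-in, not the study's actual filter, and the function names, variant records, and numbers are all hypothetical.

```python
import math


def max_credible_af(prevalence: float, penetrance: float, inheritance: str) -> float:
    """Upper bound on causal allele frequency under Hardy-Weinberg assumptions.

    Dominant:  prevalence ~ 2q * penetrance    ->  q = prevalence / (2 * penetrance)
    Recessive: prevalence ~ q**2 * penetrance  ->  q = sqrt(prevalence / penetrance)
    """
    if not 0 < prevalence <= 1 or not 0 < penetrance <= 1:
        raise ValueError("prevalence and penetrance must be in (0, 1]")
    if inheritance == "dominant":
        ceiling = prevalence / (2 * penetrance)
    elif inheritance == "recessive":
        ceiling = math.sqrt(prevalence / penetrance)
    else:
        raise ValueError("inheritance must be 'dominant' or 'recessive'")
    return min(ceiling, 1.0)


def filter_variants(variants, prevalence, penetrance, inheritance):
    """Keep only variants whose allele frequency does not exceed the ceiling."""
    ceiling = max_credible_af(prevalence, penetrance, inheritance)
    return [v for v in variants if v["af"] <= ceiling]


# Hypothetical variants for a dominant condition assumed to affect
# 1 in 10,000 people with 50 percent penetrance.
variants = [{"id": "varA", "af": 0.00002}, {"id": "varB", "af": 0.003}]
kept = filter_variants(variants, prevalence=1e-4, penetrance=0.5, inheritance="dominant")
print([v["id"] for v in kept])
```

Here the ceiling works out to about 1 in 10,000, so the rarer variant passes and the common one is filtered out as too frequent to be credibly pathogenic.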

In the genome sequence set, the researchers also detected some 2,830 of the almost 13,000 ClinVar variants encompassed in the OrphaNet genes. Again, the apparent genetic risk exceeded disease rates, especially for four of the rare conditions considered.

Similarly, when the team expanded its analyses to encompass 123,136 more exome and 15,496 genome sequences from gnomAD, it noted that ACMG 59 and OrphaNet genes contained ClinVar variants implying a higher genetic risk than expected from disease data.

"The present analyses strongly suggest that ClinVar includes significant amounts of misclassified variants and supports the important role of ClinVar to increase transparency, contrast claims, and foster validation across submitters," the authors wrote, adding that "discordance is higher in non-clinical and older submissions and [for] low-penetrance variants."