GLASGOW – Analyzing exome copy number variants (CNVs) and mitochondrial DNA (mtDNA) variants in existing sequencing datasets can help boost diagnostic yield for rare diseases, according to two new studies by researchers from the Broad Institute.
In a talk on Monday here at the European Society of Human Genetics (ESHG) annual meeting, Gabrielle Lemire, a medical geneticist at the Broad Institute and Boston Children's Hospital, presented results from a study that demonstrated the diagnostic utility of exome CNV analysis for unsolved cases.
Her team analyzed 22,825 exomes from a cohort of 6,678 families with various disease phenotypes, most related to neurodevelopmental disorders. The exome data was originally generated by the Broad’s Center for Mendelian Genomics (CMG), a member of the NIH-funded Genomics Research to Elucidate the Genetics of Rare Diseases (GREGoR) consortium. Since its inception in 2016, the center has sequenced the exomes of thousands of families with suspected genetic diseases, Lemire said.
The researchers applied GATK-gCNV, a bioinformatics tool developed by the Broad, to call CNVs in the exome datasets. After that, each family’s CNV profile was analyzed using Seqr, an open-source, web-based variant search platform also developed by the Broad.
So far, the analysis has helped solve 171 previously undiagnosed cases, translating to an additional 2.5 percent solve rate in the study cohort, Lemire said. The identified CNVs consisted of 140 deletions, 3 insertions, 15 duplications, and 13 complex structural variants (SVs), involving 165 known and six novel genes. The estimated CNV sizes in the solved cases ranged from 292 bases to 80 Mb, with most falling between 1 kb and 100 kb.
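As a rough check on the reported figures, the short Python sketch below recomputes the additional solve rate and confirms that the CNV classes and gene categories each add up to the 171 solved cases. It assumes the rate is calculated per family against the 6,678 families in the cohort, which the talk did not state explicitly; the variable and function names are illustrative only.

```python
# Figures as reported in the talk; names here are illustrative.
solved_cases = 171
total_families = 6_678  # assumed denominator for the solve rate

# Breakdown of the CNVs identified in solved cases.
cnv_counts = {"deletion": 140, "insertion": 3, "duplication": 15, "complex_sv": 13}
gene_counts = {"known": 165, "novel": 6}


def solve_rate_percent(solved: int, total: int) -> float:
    """Return solved cases as a percentage of the cohort."""
    return 100.0 * solved / total


rate = solve_rate_percent(solved_cases, total_families)
cnv_total = sum(cnv_counts.values())
gene_total = sum(gene_counts.values())

print(f"Additional solve rate: {rate:.2f} percent")
print(f"CNV classes sum to {cnv_total}; gene categories sum to {gene_total}")
```

Under that assumption the result comes out at roughly 2.56 percent, in line with the approximately 2.5 percent figure Lemire cited.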
"In conclusion, CNV analysis from existing exome data increases the solve rate for individuals that remain undiagnosed after standard testing approaches," Lemire said.
Besides exome CNVs, data from the Broad CMG also revealed the benefits of mtDNA variant analysis for improving rare disease diagnostic yield.
In a separate conference talk on Saturday, Sarah Stenton, a researcher from the Broad, presented results from a study also funded by the GREGoR consortium that investigated mtDNA variants using datasets from 9,253 individuals in 5,042 families, most with unsolved diseases.
Stenton and her team called single nucleotide variants (SNVs), small indels, and large deletions with at least a 1 percent heteroplasmy level (HL) from exome, genome, and RNA-sequencing data, overcoming technical challenges posed by mtDNA such as its circular genome, heteroplasmy, and mitochondrial-nuclear DNA misalignment. While most of the data in the study were from exome sequencing, she said, "a smaller amount" came from genome sequencing, and about 250 samples underwent RNA sequencing.
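To illustrate what a 1 percent heteroplasmy cutoff means in practice, here is a minimal Python sketch that filters variant calls by the fraction of reads carrying the alternate allele. It is not the researchers' pipeline: defining heteroplasmy as a simple alternate-read fraction is a simplifying assumption, the `MtVariantCall` class and `passes_threshold` function are hypothetical, and the read counts are invented for the example.

```python
from dataclasses import dataclass

MIN_HETEROPLASMY = 0.01  # the 1 percent cutoff described in the talk


@dataclass
class MtVariantCall:
    """Hypothetical record for one mtDNA variant call."""
    position: int
    ref: str
    alt: str
    alt_reads: int
    total_reads: int

    @property
    def heteroplasmy(self) -> float:
        # Assumed definition: share of reads supporting the alternate allele.
        if self.total_reads == 0:
            return 0.0
        return self.alt_reads / self.total_reads


def passes_threshold(call: MtVariantCall, min_hl: float = MIN_HETEROPLASMY) -> bool:
    """Keep calls at or above the minimum heteroplasmy level."""
    return call.heteroplasmy >= min_hl


# Invented read counts; positions serve only as labels.
calls = [
    MtVariantCall(3243, "A", "G", alt_reads=45, total_reads=1500),    # 3 percent
    MtVariantCall(8993, "T", "G", alt_reads=4, total_reads=2000),     # 0.2 percent
    MtVariantCall(11778, "G", "A", alt_reads=1200, total_reads=1200), # 100 percent
]

kept = [call for call in calls if passes_threshold(call)]
kept_positions = [call.position for call in kept]
print(kept_positions)
```

In this toy example, the call at 0.2 percent falls below the cutoff and is dropped, while the low-level 3 percent call and the homoplasmic call are retained.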
After variant calling, the researchers carried out four analyses: identification of pathogenic and likely pathogenic variants previously reported in the human mitochondrial genome database (MITOMAP) and in ClinVar, large deletion analysis, novel variant detection, and outlier profiling.
Overall, the researchers discovered pathogenic or likely pathogenic variants from MITOMAP or ClinVar in over 220 samples, a significantly higher rate than in the gnomAD reference population. In addition, they identified one large deletion case in the cohort and prioritized 193 novel variants, 10 of which were considered "high priority."
Based on the definite, probable, and possible diagnoses made, Stenton said, the diagnostic yield increased by about 0.5 percent across the entire cohort, which came at "minimal added cost" of approximately $0.10 to $0.20 per sample.
"The most promising end note here is that we did provide a potential diagnosis, if not a definite diagnosis, to 19 previously unsolved families," Stenton said.