NEW YORK – A team led by researchers at the University of Cambridge has published a new study describing its use of whole-genome sequencing (WGS) in the UK's national health system to streamline disease diagnosis, and to discover unknown etiological variants in the coding and non-coding regions of the genome.
As the researchers wrote in their paper in Nature on Wednesday, they generated WGS data for 13,037 individuals enrolled at 57 National Health Service (NHS) hospitals in the UK and 26 hospitals in other countries. In total, 9,802 of the participants (75 percent) had a rare disease or an extreme measurement of a quantitative trait, of whom 9,024 were probands and 778 were affected relatives. Of the 7,065 participants for whom there was detailed phenotypic data, the researchers were able to provide a genetic diagnosis to 1,138.
They also identified 95 Mendelian associations between genes and rare diseases, 11 of which have been discovered since 2015 and at least 79 of which were confirmed to be etiological. Importantly, the investigators identified four novel non-coding variants that cause disease by disrupting transcription of ARPC1B, GATA1, LRBA, and MPL.
The patients in the study cohort presented with pathologies of many organ systems. Following bioinformatic analysis, the researchers considered a maximum set of 10,259 unrelated participants, in which they identified more than 172 million short variants, comprising 91.5 percent single-nucleotide variants (SNVs) and 8.5 percent small insertions or deletions (indels). Of these SNVs and indels, 48.6 percent and 40.8 percent, respectively, were absent from major public variant databases.
The researchers reviewed the 1,138 clinical reports that were issued to patients and classed 266 of the 995 SNVs and indels as novel, because they were absent from the Human Gene Mutation Database (HGMD) and were not among the variants in ClinVar. They ranked the 329 diagnostic-grade genes (DGGs) by the number of clinical reports in which they featured. The top three DGGs (BMPR2, ABCA4, and TNFRSF13B) featured in 25 percent of all reports, the subsequent 19 DGGs featured in a further 25 percent of reports, and the remaining 307 DGGs mostly featured in a single report.
Clinical reporting was enhanced by the use of PCR-free WGS with a mean autosomal depth greater than 35X rather than whole-exome sequencing (WES). For example, in a patient with Hermansky-Pudlak syndrome, the researchers identified a causal SNV encoding a start loss of HPS6 that had previously been missed by WES.
Genetic discoveries that arose from the study informed treatment decisions, the researchers noted. For example, patients with KMT2B-mediated early-onset dystonia were treated by deep brain stimulation, and individuals with DIAPH1-related macrothrombocytopenia and deafness were treated for their thrombocytopenia in a preoperative setting.
These findings also improved the accuracy of prognosis. Mutations the researchers observed in BMPR2 and EIF2AK4 conferred a poorer-than-average prognosis for patients with pulmonary arterial hypertension, for example.
"The resolution of unknown rare disease etiologies will be hastened by the standardization and integration of clinical testing and research on a national scale," the authors concluded. "The NHS in England plans to increase provision of WGS-based diagnostics from 8,000 to 30,000 samples per month. To achieve this, it has reduced the number of clinical genomics laboratories to seven and introduced unified staff training in WGS, informatics, and genomics. The development of statistical methodology to interpret the new data and participant consent to recall for follow-up experiments will be of critical importance. Additionally, long-read sequencing may be needed to overcome the difficulty of calling complex structural variants by WGS."