NEW YORK (GenomeWeb) – By analyzing genetic variants in 1,133 children with severe undiagnosed developmental disorders and in their parents, using exome sequencing and array-CGH data, researchers in the UK have identified 12 novel genes associated with developmental disorders, increasing the diagnostic yield from 28 to 31 percent.
The results stem from the UK's Deciphering Developmental Disorders study and were published in Nature today. The analysis, led by Matthew Hurles at the Wellcome Trust Sanger Institute, complements a paper published in The Lancet last week that focused on clinical diagnostic aspects of the study.
According to the authors, the results "validate a large-scale genotype-driven strategy for novel developmental disorder-linked gene discovery that is complementary to the traditional phenotype-driven strategy of studying patients with very similar presentations, and is particularly effective for discovering novel developmental disorders with highly variable or indistinct clinical presentations."
Despite the size of the cohort and the study's success in identifying novel disease genes, its power was limited, leaving the majority of patients undiagnosed. One way to increase the number of diagnoses, the authors wrote, could be to share minimal genotypic and phenotypic data from studies like their own internationally. The DDD study is already doing this, they noted, through a web portal for the DECIPHER database.
The DDD study, launched in 2011, aims to diagnose children with developmental disorders with the help of exome sequencing and microarrays. The project, which plans to recruit 12,000 families, is a collaboration between the UK's National Health Service and the Wellcome Trust Sanger Institute and is funded by the UK's Department of Health and the Wellcome Trust. The latest paper describes the analysis of the first 1,133 patients, recruited through the 24 regional genetics services in the UK and Ireland.
The majority of children suffered from intellectual disability or developmental delay, and most were the only affected members of their family. The researchers sequenced the exomes of the patients and their parents and performed exome-focused array CGH on the children.
On average, they discovered about 19,800 single nucleotide variants in coding regions or at splice sites, about 490 coding or splicing indels, and about 150 copy number variants per child. Overall, they found about 1,600 de novo variants in coding and non-coding regions.
As reported in the Lancet paper, 28 percent of the children had likely pathogenic variants, most of them de novo mutations, in at least one of about 1,130 genes previously shown to be involved in developmental disorders.
To find new candidate disease genes, the scientists looked for genes that were enriched for damaging de novo mutations in the children. To increase their chance of finding such genes, they included de novo mutation data from other published studies covering 2,350 developmental disorder parent-child trios, in which the patients suffered from intellectual disability, epileptic encephalopathy, autism, schizophrenia, or congenital heart disease.
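As a rough illustration of what "enriched for damaging de novo mutations" can mean, the sketch below compares the number of de novo mutations observed in one gene with the number expected by chance, using a one-sided Poisson test. This is not necessarily the authors' method. The per-gene mutation rate and the observed count are hypothetical values chosen for illustration, and treating all 1,133 DDD families as complete trios that can be pooled with the 2,350 published trios is an assumption.

```python
import math


def poisson_upper_tail(observed: int, expected: float) -> float:
    """Return P(X >= observed) for X ~ Poisson(expected)."""
    if observed <= 0:
        return 1.0
    # Sum P(X = k) for k = 0 .. observed - 1, then take the complement.
    cdf = 0.0
    term = math.exp(-expected)  # P(X = 0)
    for k in range(observed):
        cdf += term
        term *= expected / (k + 1)  # P(X = k + 1) from P(X = k)
    return max(0.0, 1.0 - cdf)


def expected_de_novo(n_trios: int, per_gene_rate: float) -> float:
    """Expected de novo count in a gene across n_trios children.

    The factor of 2 assumes an autosomal gene (two copies per child).
    """
    return 2 * n_trios * per_gene_rate


# Trio counts come from the article; pooling them is an assumption.
N_TRIOS = 1133 + 2350

# Hypothetical values, not taken from the study.
HYPOTHETICAL_RATE = 1e-5      # damaging de novo mutations per gene copy per generation
HYPOTHETICAL_OBSERVED = 4     # damaging de novo mutations seen in the gene

expected = expected_de_novo(N_TRIOS, HYPOTHETICAL_RATE)
p_value = poisson_upper_tail(HYPOTHETICAL_OBSERVED, expected)
print(f"expected={expected:.4f}, p={p_value:.3g}")
```

In a real analysis the resulting p-values would need correction for testing thousands of genes, which is one reason a cohort of this size can still have limited power.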
Integrating the statistical genetic evidence with the phenotypic similarity of patients, data from model organisms, and "functional plausibility," they identified 12 novel disease genes that had "compelling evidence for pathogenicity."
Two unrelated children with identical mutations in one of the genes, PCGF2, had a "strikingly similar facial appearance," the authors wrote, representing a "novel and distinct dysmorphic syndrome."
Also, two of the three children with mutations in DNM1, a gene previously identified as a candidate for epileptic encephalopathy, had seizures, as did a mouse model with a defect in that gene.
Overall, the researchers found identical missense mutations in unrelated but phenotypically similar patients for four of the novel genes. For another gene, they identified several non-identical missense mutations that were significantly clustered.
A total of 35 patients had mutations in one of the 12 novel disease genes, increasing the diagnostic yield from 28 to 31 percent. The remaining 69 percent of patients, who are still undiagnosed, likely carry additional pathogenic mutations in single genes "that we have detected but for which compelling evidence is currently lacking," the researchers wrote.
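The arithmetic behind the reported increase can be checked with a short calculation. The sketch assumes that none of the 35 patients were already counted among the 28 percent diagnosed earlier, which the article implies but does not state outright. The function name is illustrative.

```python
def percent_of_cohort(n_patients: int, cohort_size: int) -> float:
    """Share of the cohort represented by n_patients, in percent."""
    return 100.0 * n_patients / cohort_size


COHORT_SIZE = 1133          # children analyzed (from the article)
NEWLY_DIAGNOSED = 35        # patients with mutations in the 12 novel genes
PRIOR_YIELD_PERCENT = 28.0  # yield reported in the Lancet paper

# Assumption: the 35 patients do not overlap with those already diagnosed.
added = percent_of_cohort(NEWLY_DIAGNOSED, COHORT_SIZE)
new_yield = PRIOR_YIELD_PERCENT + added
print(f"added {added:.1f} points, new yield about {round(new_yield)} percent")
```

The 35 patients account for roughly three percentage points of the cohort, consistent with the move from 28 to 31 percent.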