NEW YORK — Researchers have generated a map of genes throughout the human genome that are sensitive to dosage changes.
Rare copy number variants are associated with both Mendelian and complex diseases — particularly neurological conditions — and are thought to contribute to disease risk by deleting or duplicating key genes. To better understand which genes are sensitive to such changes in dosage, researchers from Massachusetts General Hospital and elsewhere analyzed rare CNV data from nearly 1 million people in conjunction with 54 disease phenotypes.
As they reported in Cell on Wednesday, the researchers further combined these rare CNVs with genome annotations to build a model that identifies likely dosage-sensitive genes, uncovering nearly 3,000 haploinsufficient genes and more than 1,500 triplosensitive genes.
"We provide all maps and metrics derived in this study as an open resource for the community and anticipate that they will have broad utility for human genomic research and medical genetics," senior author Michael Talkowski from MGH and the Broad Institute and colleagues wrote in their paper.
They first generated a catalog of rare CNVs by harmonizing data from 17 different sources, including diagnostic laboratories and national biobanks. The resulting set of 950,278 samples included samples from 458,326 individuals who had one or more of 54 different disease-related phenotypes, representing both neurological and non-neurological conditions.
By dividing autosomes into 200-kilobase sliding windows, the researchers searched for rare CNV segments associated with each of the 54 phenotypes, uncovering 163 dosage-sensitive segments associated with at least one disease phenotype. More than half of the 95 dosage-sensitive regions reported in the literature were also detected in this analysis, the researchers noted.
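The windowing step described above can be illustrated with a minimal sketch. It is not the researchers' pipeline: the 200-kilobase window size comes from the article, but the step size, the function names, and the toy coordinates are assumptions made purely for illustration.

```python
from typing import Iterator, List, Tuple

WINDOW_BP = 200_000  # 200-kilobase window, as stated in the article
STEP_BP = 10_000     # hypothetical step size; the article does not give one

Interval = Tuple[int, int]  # half-open (start, end) in base pairs


def sliding_windows(chrom_length: int, window: int = WINDOW_BP,
                    step: int = STEP_BP) -> Iterator[Interval]:
    """Yield half-open windows along one chromosome, clipping the last one."""
    start = 0
    while start < chrom_length:
        yield start, min(start + window, chrom_length)
        if start + window >= chrom_length:
            break  # the chromosome end has been reached
        start += step


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two half-open intervals share at least one base."""
    return a[0] < b[1] and b[0] < a[1]


def cnv_counts_per_window(cnvs: List[Interval], chrom_length: int,
                          window: int = WINDOW_BP,
                          step: int = STEP_BP) -> List[Tuple[Interval, int]]:
    """Count how many CNVs overlap each window (a stand-in for a real test)."""
    return [(win, sum(overlaps(cnv, win) for cnv in cnvs))
            for win in sliding_windows(chrom_length, window, step)]


# Toy example: a 500 kb "chromosome" with one CNV, using a 100 kb step.
toy_counts = cnv_counts_per_window([(150_000, 250_000)], 500_000,
                                   window=200_000, step=100_000)
```

In a real association analysis, the per-window count would be replaced by a statistical comparison of CNV carriers between cases and controls for each phenotype. The sketch shows only how overlapping windows tile a chromosome.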
In a combined analysis of 178 disease-associated rare CNV segments — the 163 dosage-sensitive segments the researchers uncovered plus 15 others found in targeted analyses — the researchers began to tease out patterns among dosage-sensitive segments. Overall, rare CNV segments tended to be gene dense and include a dominant dosage-sensitive driver gene. They were also more likely to overlap with phenotype-matched disease genes than expected and include more genes under strong mutational constraint.
The researchers adapted a fine-mapping approach generally used in genome-wide association studies to prioritize likely causal genes in these segments. They homed in on 31 highly confident and 90 confident genes, which were enriched for plausible driver genes, such as those under mutational constraint.
The researchers further developed a computational model to predict, for each of more than 18,000 autosomal protein-coding genes, the probability that it is haploinsufficient or triplosensitive. This model, they reported, could separate known dosage-sensitive and dosage-insensitive genes with high precision and identified 3,635 highly dosage-sensitive genes. These included 2,987 genes that are haploinsufficient and 1,559 that are triplosensitive, with some genes falling into both categories.
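The two category counts sum to more than the 3,635 total, so the categories must overlap. A short inclusion-exclusion sketch shows the size of that overlap, assuming every one of the 3,635 genes belongs to at least one of the two categories (the article implies this but does not state it outright). The function name is hypothetical.

```python
def split_counts(total: int, haploinsufficient: int, triplosensitive: int):
    """Return (both, haploinsufficient_only, triplosensitive_only).

    Assumes `total` is the size of the union of the two categories.
    """
    both = haploinsufficient + triplosensitive - total
    return both, haploinsufficient - both, triplosensitive - both


# Counts reported in the article.
both, hi_only, ts_only = split_counts(3_635, 2_987, 1_559)
print(both, hi_only, ts_only)  # 911 2076 648
```

Under that assumption, 911 genes would be sensitive to both deletion and duplication, 2,076 to deletion only, and 648 to duplication only.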
The researchers noted that genes that are more sensitive to deletion tend to be larger, be situated farther from other genes, and have a number of cis enhancers. Meanwhile, genes more sensitive to duplication are generally shorter and are found in GC-rich and gene-dense regions.
"Although these patterns are preliminary, they nevertheless provide an important foothold for future investigations of dosage sensitivity at sequence resolution and for decoupling the principles of haploinsufficiency and triplosensitivity throughout the human genome," the researchers added.