NEW YORK – A new set of analytical tools is making it possible to systematically search for links between copy number variants and complex human traits or conditions, according to a study by a pair of investigators at the European Molecular Biology Laboratory's European Bioinformatics Institute.
"[W]e present a robust CNV-to-phenotype discovery process that uses [next-generation sequencing] information that is analogous to the traditional SNP-based GWAS," authors Tomas Fitzgerald and Ewan Birney, both at the EMBL-EBI, wrote in Cell Genomics on Wednesday, noting that the approach "complements the long-standing use of CNV in rare disease discovery and provides a higher-resolution view of common CNV than established SNP array-based methods."
Using their "copy number estimator" (CNest) tool set, the researchers analyzed copy number variation in exome sequences from 200,629 phenotyped UK Biobank participants. They then used the variants they identified in a copy number-based genome-wide association study (CNwas) spanning dozens of human traits, and examined how those CNVs related to SNP variants.
"Although it is widely accepted that CNV can contribute significantly to differences in human traits, to date, methods for large-scale CNV-to-phenotype association studies, the equivalent of GWAS for CNVs, have been hampered by a number of factors, including methodological difficulties, the availability of sufficiently large datasets, and the ability to interpret complex rearrangements from sequencing data," the authors explained, calling CNest "a new discovery method … based on novel normalization techniques for large-scale cohorts."
The team's CNwas identified 646 significant associations between CNVs and the traits considered, while its subsequent fine-mapping analyses highlighted 862 CNV-related associations. The investigators cautioned that a subset of those relationships could be traced back to nearby SNPs. Even so, the association study also highlighted CNVs with previously unappreciated impacts on human traits, along with their potential ties to complex human conditions.
"Many of these associations recapitulate multiple known associations based on previous studies on both CNV and SNP genome association testing," the authors reported, "whereas others discover new CNV-specific findings in relation to the genetics of common human traits."
The investigators noted that CNest can be used within standards established by the Global Alliance for Genomics and Health, and that the method should make it possible to uncover additional informative CNVs in large cohorts.
"We encourage the community to explore the discoveries we have made in this paper, to use CNest to make more CNV associations in both [the] UK Biobank and beyond, and to help extend the CNest method further to provide a more comprehensive view of human variation," the authors wrote, noting that the "ability to jointly model SNPs and CNVs in the same framework will more easily allow for integration of these two types of variation."