NEW YORK (GenomeWeb) – An international team led by investigators at the Wellcome Trust Sanger Institute and the University of Trieste has characterized genomic patterns in eight isolated European populations, in the hopes of getting a clearer picture of human genetic variation and bolstering association studies focused on complex traits and diseases.
The researchers performed whole-genome sequencing on more than 3,000 individuals from eight isolated and two non-isolated populations, comparing these sequences to one another and to available data from the 1000 Genomes Project and UK10K project. The study, published online today in Nature Communications, revealed an uptick in low-frequency variants predicted to have functional impacts in the genome, the types of alterations normally weeded out of larger populations through the process of purifying selection.
"We demonstrate relaxation of purifying selection in the isolates, leading to enrichment of rare and low-frequency functional variants," senior author Eleftheria Zeggini, a human genetics researcher and group leader at the Sanger Institute, and her co-authors wrote. They noted that the sequencing data from isolated populations "give deeper and richer insights into population demography and genetic characteristics than genotype chip data, distinguishing related populations more effectively and allowing their functional variants to be studied more fully."
Past studies have suggested that each population's evolutionary history, environmental exposures, demography, gene flow, and so on can shape the genetic architecture and disease susceptibility patterns for people in that population. But while that seems to be particularly true for isolated populations, the team explained, there have been relatively few genomic studies focused on such populations.
"Isolated populations have special characteristics that can be leveraged to increase the power of association studies," the authors wrote. "Nevertheless, only a small proportion of functional variants have increased in frequency in any one isolate, so multiple isolates must be investigated to reveal the full diversity of associated variants."
Using Illumina GAII, HiSeq 2000, or HiSeq X Ten instruments, the researchers sequenced the genomes of individuals from isolated populations in Kuusamo in northern Finland, Crete in Greece, Italy's Friuli-Venezia Giulia and Val Borbera regions, and the Orkney Islands in the UK, as well as individuals from non-isolated populations in Finland and Greece.
In these newly sequenced genomes, which represent 3,059 individuals sequenced to average depths of 4- to 10-fold apiece, the team identified some 8.3 million common SNPs, 5.5 million low-frequency variants, and roughly 12.2 million rare variants.
When investigators analyzed the genome data for the 10 populations alongside sequences for 2,353 individuals profiled for the 1000 Genomes Project and 3,781 UK10K participants, they saw genetic relationships between the isolated populations and corresponding general populations from the same areas. But the isolated populations typically formed their own genetic clusters, particularly when the analysis focused on rare variants.
In general, the team's analyses supported the notion that isolated populations are more prone to accumulating rare and low-frequency variants, including those falling in essential genes and sequences expected to alter protein function.
The data also made it possible for the researchers to delve into the demographic histories of the populations considered, letting them estimate when the isolated populations split from the general populations, for example, and gauge how population sizes have fluctuated and declined since then.
"With the advent of large-scale whole-genome sequencing, studies in isolates are poised to continue as major contributors to our understanding of complex disease etiology," the authors concluded.