By Monica Heger
Sequencing may be better than genome-wide association studies at finding causal variants in common diseases, according to researchers from Duke University.
The researchers came to this conclusion after performing whole-genome sequencing on 29 individuals and finding that rare variants are significantly more likely than common ones to be functional.
Genome-wide association studies have been used to try to pinpoint the genetic underpinnings of common disease, but have so far been able to explain only a small proportion of the predicted heritability. The Duke study suggests instead that rare variants, which GWAS so far have not been designed to find, may be responsible for common diseases.
To arrive at their results, researchers in David Goldstein's lab at Duke sequenced 29 individuals on the Illumina Genome Analyzer to an average 28-fold coverage, and called 5,491,245 single-nucleotide variants (SNVs).
After applying filters to eliminate SNVs inside repeat regions, copy-number variations, segmental duplications, 1 kilobase regions around assembly gaps, and regions aligned to deletions, the team was left with 3,533,186 high-quality SNVs to analyze.
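The region-based filtering described above can be sketched in a few lines of Python. This is a minimal illustration, not the Duke team's pipeline: the function names (`build_index`, `in_excluded`, `filter_snvs`), the half-open `(chrom, start, end)` interval format, and the toy coordinates are all assumptions made for the example. Padding such as the 1-kilobase margin around assembly gaps would be applied to the intervals before they are passed in.

```python
from bisect import bisect_right


def build_index(regions):
    """Group half-open (chrom, start, end) intervals by chromosome,
    then sort and merge overlapping or touching intervals."""
    by_chrom = {}
    for chrom, start, end in regions:
        by_chrom.setdefault(chrom, []).append((start, end))
    index = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        merged = [list(intervals[0])]
        for start, end in intervals[1:]:
            if start <= merged[-1][1]:
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])
        index[chrom] = ([s for s, _ in merged], [e for _, e in merged])
    return index


def in_excluded(index, chrom, pos):
    """Return True if pos falls inside any excluded interval on chrom."""
    if chrom not in index:
        return False
    starts, ends = index[chrom]
    i = bisect_right(starts, pos) - 1  # last interval starting at or before pos
    return i >= 0 and pos < ends[i]


def filter_snvs(snvs, regions):
    """Keep only (chrom, pos) SNVs that lie outside every excluded region."""
    index = build_index(regions)
    return [(c, p) for c, p in snvs if not in_excluded(index, c, p)]


# Hypothetical excluded regions (e.g. repeats, segmental duplications).
excluded = [("chr1", 100, 200), ("chr1", 150, 300), ("chr2", 0, 50)]
calls = [("chr1", 99), ("chr1", 100), ("chr1", 299),
         ("chr1", 300), ("chr2", 10), ("chr3", 5)]
kept = filter_snvs(calls, excluded)
```

Merging the intervals first means each lookup is a single binary search, which matters when millions of SNVs are checked against many annotated regions.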
They then separated those variants according to their minor allele-frequency value and analyzed them according to evolutionary conservation, gene structure, or regulatory potential.
For each of the functional categories, the researchers found a significant enrichment of lower-frequency variants and calculated that odds ratios were highest when comparing variants whose minor-allele frequency was less than 0.052 to all other variants.
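To make the comparison concrete, an odds ratio of this kind can be computed from a two-by-two table of variant counts: rare versus all other variants, inside versus outside a functional category. The sketch below is illustrative only. The article does not say which statistical test the researchers used, so the 95 percent Wald confidence interval is an assumption, as are the function names and the toy counts.

```python
import math


def count_table(variants, cutoff=0.052):
    """Tally (maf, in_functional_region) pairs into a 2x2 table.
    Variants with maf below cutoff are treated as rare."""
    rare_in = rare_out = common_in = common_out = 0
    for maf, functional in variants:
        if maf < cutoff:
            if functional:
                rare_in += 1
            else:
                rare_out += 1
        elif functional:
            common_in += 1
        else:
            common_out += 1
    return rare_in, rare_out, common_in, common_out


def odds_ratio(rare_in, rare_out, common_in, common_out):
    """Odds ratio with a 95% Wald interval (the interval is an assumed choice)."""
    if min(rare_in, rare_out, common_in, common_out) <= 0:
        raise ValueError("all four counts must be positive")
    ratio = (rare_in * common_out) / (rare_out * common_in)
    se = math.sqrt(1 / rare_in + 1 / rare_out + 1 / common_in + 1 / common_out)
    low = math.exp(math.log(ratio) - 1.96 * se)
    high = math.exp(math.log(ratio) + 1.96 * se)
    return ratio, low, high


# Toy counts, not data from the study.
toy = ([(0.03, True)] * 200 + [(0.03, False)] * 800
       + [(0.20, True)] * 300 + [(0.20, False)] * 2700)
table = count_table(toy)
ratio, low, high = odds_ratio(*table)
```

With these made-up counts, rare variants are functional 200 times out of 1,000 and other variants 300 times out of 3,000, giving an odds ratio of 2.25. An interval whose lower bound stays above one indicates enrichment of rare variants in the category.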
"For all the functional categories, when the variants are more rare they are more likely to be in the functional region and therefore more likely to cause phenotypic effect," said Qianqian Zhu, a co-lead author of the study published in last week's American Journal of Human Genetics.
The team studied a variety of functional regions for the presence of rare variants, including genes, protein-coding genes, both the 3' and 5' untranslated regions, regulatory regions, exons, and introns. They found that for all the regions, rare variants were significantly more enriched than common variants, with odds ratios of up to two.
Zhu said the team conducted the study because of the ongoing debate over whether common diseases are more likely to be caused by rare or common variants. To her, the study provides evidence for the rare-variant theory of disease and underscores sequencing's ability to study disease and find rare variants.
"Sequencing is definitely more powerful [than GWASs] for detecting rare variants," Zhu said. With microarrays, "the SNPs are already fixed, so they are mostly common variants. But, when you do whole-genome sequencing, you find all the variants that you can find, including both rare and common."
Tamara Koopmann, a researcher at the Heart Failure Research Center in Amsterdam, who has used both GWAS and next-gen sequencing to study variants in cardiac arrhythmias, said the study supports the rare-variant theory of disease and highlights next-gen sequencing as a good tool for identifying disease-causing variants.
However, she noted that GWAS can still provide important information.
"I would love to just sequence all the patients we have, but I don't think we'd know what to do with all the variants we find," she said. "Genome-wide association studies can point you to interesting areas, and then you can go back and do sequencing."
For example, she said, whole-exome sequencing will identify between 20,000 and 30,000 variants in an individual. And while the technique can effectively identify disease-causing variants in Mendelian disorders, it is not as straightforward for complex diseases, Koopmann said.
"Working with a more complex or polygenic disease might be more difficult, and GWAS can help you find the more interesting areas," she added.
Robert Hegele, director of the Blackburn Cardiovascular Genetics Lab at the Robarts Research Institute at the University of Western Ontario, who has used GWAS and sequencing to identify variants in hypertriglyceridemia (IS 7/27/2010), agreed that the combination of techniques is useful.
Researchers are "starting to sequence underneath the GWAS peaks and that's the logical starting point," he said. "We can't yet do whole-genome sequencing on everyone in a sample."
However, as prices continue to drop and researchers continue to hone their ability to analyze the data, disease studies will move toward sequencing and away from GWAS, since sequencing uncovers both rare and common variants, Hegele said.
"It certainly advances the use of the technology and gives us a glimpse of what's going to be possible," he added.
Moreover, the study is "consistent with the little glimpses we've been getting about rare variants and complex disease." He said that he has found a similar enrichment of rare variants in functional regions in his studies of hypertriglyceridemia.
While the AJHG study supports the rare-variant theory of common disease, Hegele said he thought each disease would be different, with some being primarily determined by rare variants and others caused by a complex combination of rare and common variants, as well as factors such as the environment.
"There will be different diseases with different mosaic architectures," he said.
While the study was able to look at rare variants at frequencies between 2 and 5 percent, the sample size did not enable its researchers to assess rarer variants found in less than 2 percent of the population.
To compensate, the researchers evaluated whole-exome data from 168 individuals whose exomes were sequenced to about 73-fold coverage. They were able to evaluate variants down to a frequency of 0.3 percent and found that as frequency decreased, the proportion of nonsynonymous variants compared to synonymous variants increased.
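The trend described above, a rising share of nonsynonymous variants as frequency falls, amounts to a ratio computed within frequency bins. The following is a rough sketch of that calculation, not the researchers' own method: the bin edges, the `(maf, kind)` input format, the function name, and the toy variants are all assumptions chosen for illustration.

```python
from collections import defaultdict


def ns_ratio_by_bin(variants, bin_edges):
    """Return {(low, high): nonsynonymous/synonymous ratio} per half-open
    frequency bin. A bin with no synonymous variants maps to None."""
    counts = defaultdict(lambda: [0, 0])  # [nonsynonymous, synonymous]
    bins = list(zip(bin_edges, bin_edges[1:]))
    for maf, kind in variants:
        if kind not in ("nonsynonymous", "synonymous"):
            raise ValueError(f"unexpected variant kind: {kind!r}")
        for low, high in bins:
            if low <= maf < high:
                counts[(low, high)][0 if kind == "nonsynonymous" else 1] += 1
                break
    return {b: (counts[b][0] / counts[b][1] if counts[b][1] else None)
            for b in bins}


# Toy variants, not data from the study.
toy_exome = ([(0.005, "nonsynonymous")] * 3 + [(0.005, "synonymous")] * 1
             + [(0.02, "nonsynonymous")] * 2 + [(0.02, "synonymous")] * 2
             + [(0.30, "nonsynonymous")] * 1 + [(0.30, "synonymous")] * 2)
ratios = ns_ratio_by_bin(toy_exome, [0.003, 0.01, 0.05, 0.51])
```

In this toy input the ratio falls from 3.0 in the rarest bin to 1.0 and then 0.5 in the most common bin, the same direction of effect the exome data showed.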
This result was consistent with a previous whole-exome sequencing study of 200 individuals, which found a similar "excess" of rare, likely deleterious variants (IS 10/5/2010).
"I think we will now start giving higher weight to rare variants [and] begin to identify the real causal variants of disease," Zhu said.