Though some vendors and analysts perceive a lull in the market for high-density SNP genotyping arrays used in genome-wide association studies, at least 10 papers have been published so far this month that specifically describe new, array-based GWAS, according to a review by BioArray News.
Additionally, array-based GWAS continue to spur researchers to carry out follow-on studies, either by meta-analysis of data from one or more studies on a certain condition, or by replicating findings in smaller cohorts. Many of these follow-on studies do not use array technology, however.
Also, the bioinformatics community continues to churn out new algorithms and databases to serve researchers conducting array-based GWAS, according to several of the recently published papers.
Yesterday and Tomorrow's Experiment?
"Don't give up on GWAS," was the message of a letter that appeared in this month's Molecular Psychiatry. Signed by 96 investigators, the letter called for more association studies related to schizophrenia.
"Along with [copy number variation] analysis, GWAS is the only approach that has yielded robust and replicable results for this most enigmatic disorder," corresponding author Patrick Sullivan told BioArray News this week.
Sullivan, a professor of psychiatry at the University of North Carolina at Chapel Hill, cautioned that the letter was intended for those studying schizophrenia. "The key issue is genetic architecture and it depends on the disease," he said. "There is now strong evidence that schizophrenia is typified by thousands of common variants of pretty subtle effects — hence our call for more GWAS. It's the right technology for this disease, and we can predict that doing more will have an outstanding yield."
He cited mental retardation as a condition for which GWAS may not be the most suitable experiment, as the data suggest that severe mental retardation is mostly caused by multiple rare variants.
Sullivan acknowledged questions about the suitability of the approach, but argued that some of these critiques of GWAS are based on unrealistic expectations.
"For schizophrenia, there has been some confusion about the immediate goal of GWAS," said Sullivan. "For most of us, the goal has been to uncover biology, to gain etiological clues for a disorder about which we know very, very little.
"Some have argued that the immediate goal is personalized medicine — disease prediction, improved treatment. I'd disagree, as this was never the proximal goal for most of us," Sullivan added. "Obviously, it is an important ultimate goal. However, deep understanding has to precede rational clinical implementation."
He also predicted that there will be more association studies related to schizophrenia. "For schizophrenia, I'd argue that GWAS was yesterday's experiment, and will be tomorrow's experiment," he said. "Today's experiment is exome sequencing, and the early news is not encouraging." He did not elaborate.
A number of psychiatry-related GWAS were published this month. One paper in Psychiatric Genetics described a genome-wide association study of comorbid depressive syndrome and alcohol dependence.
Using the Illumina Human 1M-Duo DNA Analysis BeadChip, researchers surveyed 467 cases and 407 controls. Although no SNP identified met genome-wide significance criteria, the authors identified 10 markers with P values less than 110, seven of which are located in known genes that have not been previously implicated in either disorder.
Another association study was discussed in Biological Psychiatry. A team of researchers from the Universidade de Santiago de Compostela in Spain performed a case-control association study of common SNPs in Galician samples using the Affymetrix GeneChip Human 20K cSNP Kit, followed by a replication study of the more promising results. Taking into account that another metal ion transporter gene, SLC39A3, is associated with bipolar disorder, the authors said their findings reveal a role for brain metal homeostasis in psychosis.
Beyond neurological conditions, several of the recent studies focused on cancer. Seven new GWAS were discussed in Human Molecular Genetics alone. Among them, one identified a new susceptibility locus for renal cell carcinoma on 12p11.23; one associated the 8q22.3 locus in Chinese Han with idiopathic premature ovarian failure; one identified a potential gene locus for keratoconus, a common cause of corneal transplantation in developed countries; and one identified a locus at 10q22 to be associated with clinical outcomes of adjuvant tamoxifen therapy for Japanese breast cancer patients.
The chips used in the studies included the Affymetrix SNP 6.0 Array and Illumina's HumanCNV370-Duo and Human610-Quad DNA Analysis BeadChips.
Another study, in the Journal of Medical Genetics, described a GWAS of men with symptoms of testicular dysgenesis syndrome, or TDS, along with a network biology interpretation of the results. The research team used Affy SNP 6.0 arrays to screen 488 patients with symptoms of TDS and 439 selected controls with "excellent reproductive health."
Markers located in the region of TGFBR3 and BMP7 showed association with all TDS phenotypes in both the discovery and replication cohorts, and an immunohistochemistry investigation confirmed the presence of transforming growth factor β receptor type III in peritubular and Leydig cells, in both fetal and adult testis. The authors argued that "integrating data from multiple layers can highlight findings in GWAS that are biologically relevant despite having border significance at currently accepted statistical levels."
Coauthor Ramneek Gupta, a biologist at the Technical University of Denmark, told BioArray News this week that GWAS remains a valid approach, but that it should be supplemented with other approaches to be successful.
"I certainly believe there is more to be obtained from GWAS, but part of this solution will arise from augmented analysis that complements traditional statistics with biology using, for instance, emerging systems biology frameworks," Gupta said.
Not all new GWAS are in human samples. A paper in the Journal of Animal Science this month details the use of the Illumina EquineSNP50 BeadChip to study osteochondrosis in French Trotter horses.
And not all GWAS are undertaken using SNP arrays. One study appearing in the European Journal of Human Genetics discussed the use of microsatellite genotyping to survey Faroese samples for panic disorder. And in another study, discussed in Genome Research, researchers used next-generation sequencing to study drug metabolism in 601 pediatric acute lymphoblastic leukemia patients.
Meta-Analysis and Follow-Ups
Some association studies discussed in publications this month detail the use of existing datasets to identify causal variants, rather than genotyping new cohorts.
One paper, in the Journal of Neurochemistry, examined datasets from two previous association studies of French patients with Alzheimer's disease totaling nearly 10,000 samples to identify relevant pathways. Another such meta-analysis was discussed in Carcinogenesis. This paper relied on datasets from three previously conducted prostate cancer studies to identify loci interacting with known prostate cancer-risk-associated genetic variants.
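As a rough illustration of how results from separate association studies can be pooled, the sketch below applies a generic fixed-effect, inverse-variance meta-analysis to a single SNP. It is not the method of either paper mentioned above, which looked at pathways and interacting loci. The function name and the effect sizes are hypothetical.

```python
import math

def fixed_effect_meta(estimates):
    """Pool per-study (beta, standard_error) pairs for one SNP.

    Generic inverse-variance weighting; an illustrative assumption,
    not the approach of any study cited in the article.
    """
    # Each study is weighted by the inverse of its variance
    weights = [1.0 / (se ** 2) for _, se in estimates]
    total_weight = sum(weights)
    pooled_beta = sum(w * beta for w, (beta, _) in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    # Two-sided P value from the normal distribution
    z = pooled_beta / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return pooled_beta, pooled_se, p_value

# Hypothetical log-odds-ratio estimates for one SNP in two cohorts
studies = [(0.20, 0.10), (0.10, 0.10)]
beta, se, p = fixed_effect_meta(studies)
print(f"pooled beta={beta:.3f}, se={se:.3f}, p={p:.3g}")
```

Pooling in this way shrinks the standard error relative to either cohort alone, which is one reason existing datasets are often combined before new samples are genotyped.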
Researchers continue to use a number of technologies to follow up findings from GWAS. Several papers this month discussed the use of RT-PCR assays to validate variants identified in association studies, including a study of obesity and coronary artery disease in Han Chinese; a study of Behçet's disease and Vogt-Koyanagi-Harada syndrome in Han Chinese; and a study of the association of systemic lupus erythematosus in Europeans with decreased expression of miRNA-146a.
Technical University of Denmark's Gupta said that targeted sequencing is also a follow-on method that is "very promising" for researchers carrying out GWAS.
"We have shown that 25,000 SNPs can be assayed through [sequencing] at a reagent cost of $200 per sample," Gupta said. "Sequencing regions of promising GWAS hits holds the promise to reveal causative variation." He added that "the price is falling fast," though the "trick is designing and funding GWAS that have a follow-up plan in mind, and a biology-anchored strategy beyond finding the highest scoring single SNPs."
According to Gupta, the "game has to become more creative than just increasing sample numbers and chip densities."
Software and Databases
In terms of new software tools for researchers conducting association studies, a number of new programs and databases were introduced this month. At least eight papers discussed a new GWAS-related software tool or database, though the bioinformatics community has in recent years tended to focus its energies on developing tools for next-generation sequencing users (BAN 11/8/2011).
In a new Methods in Molecular Biology paper, researchers from the New Jersey Institute of Technology detailed hidden Markov models for controlling the false discovery rate in genome-wide association analysis. According to lead author Zhi Wei, the new tool enables users to capture SNP dependency via a hidden Markov model and provides control for identifying susceptibility loci.
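For readers unfamiliar with false discovery rate control, the sketch below shows the standard Benjamini-Hochberg step-up procedure in Python. It is not the hidden Markov model approach described in the paper. It treats each SNP's P value on its own and makes no use of dependency between neighboring SNPs, which is the information the HMM method is said to capture. The function name and the P values are illustrative assumptions.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return the indices of tests rejected at FDR level alpha.

    Plain Benjamini-Hochberg, shown as a baseline only; it is a
    stand-in, not the HMM-based procedure from the paper.
    """
    m = len(p_values)
    # Rank the tests from smallest to largest P value
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose P value is at most (k / m) * alpha
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank
    # Reject every test ranked at or below the cutoff
    return sorted(order[:cutoff_rank])

# Hypothetical P values for 10 SNPs
snp_p_values = [0.001, 0.008, 0.039, 0.041, 0.042,
                0.06, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(snp_p_values))
```

With these made-up values, only the two smallest P values fall under their rank-scaled thresholds, so only the first two SNPs are reported.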
Researchers at the Wellcome Trust Sanger Institute, meantime, described a new clustering algorithm for identifying problematic samples in association studies in a new Bioinformatics paper.
The statistical algorithm can be used to identify samples with atypical summaries of genome-wide variation, according to the authors. Its use as a semi-automated quality control tool is demonstrated in the paper using several summary statistics, selected to identify different potential problems, and it is applied to two different genotyping platforms and sample collections.
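The general idea of flagging samples with atypical genome-wide summaries can be sketched with a much simpler rule than the clustering algorithm in the paper. The example below uses a robust z-score based on the median and the median absolute deviation of one summary statistic. The function name, the 3.5 cutoff, and the heterozygosity values are assumptions made for illustration.

```python
import statistics

def flag_outlier_samples(stat_by_sample, threshold=3.5):
    """Flag samples whose summary statistic sits far from the cohort median.

    A simple median/MAD rule used as a stand-in; it is not the
    clustering algorithm described in the Bioinformatics paper.
    """
    values = list(stat_by_sample.values())
    median = statistics.median(values)
    # Median absolute deviation: a spread estimate that outliers barely move
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        return []  # no spread to measure against
    flagged = []
    for sample_id, value in stat_by_sample.items():
        # 0.6745 rescales the MAD so the score is comparable to a z-score
        robust_z = 0.6745 * (value - median) / mad
        if abs(robust_z) > threshold:
            flagged.append(sample_id)
    return flagged

# Hypothetical per-sample heterozygosity rates
heterozygosity = {"S1": 0.31, "S2": 0.32, "S3": 0.30,
                  "S4": 0.31, "S5": 0.33, "S6": 0.45}
print(flag_outlier_samples(heterozygosity))
```

In practice such a check would be run on several summary statistics, since each one tends to point to a different kind of sample problem.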
In another paper in the same journal, an international team of researchers described an algorithm to detect genome-wide multi-locus epistatic interactions based on the clustering of relatively frequent items. Experiments on simulated data presented in the paper show that the algorithm is "fast and more powerful" than existing methods, according to the authors.
"On a real genome-wide case-control dataset for age-related macular degeneration, the algorithm has identified genotype combinations that are significantly enriched in the cases," the authors noted in the paper.
In a paper in Human Genetics, researchers from Harvard University warn that "using imputation to combine samples genotyped on different platforms with severely unbalanced case-control ratios" has the potential to produce inflated Type I error rates and that researchers should apply "appropriate" quality filters. "Every SNP found with genome-wide significance should be validated on another platform to verify that its significance is not an artifact of study design," the researchers argued in the paper.
Meantime, a quartet of new or updated GWAS-related databases was profiled in the annual database issue of Nucleic Acids Research that appeared this month.
One, HaploReg, was developed by researchers at the Massachusetts Institute of Technology. The authors describe the tool as useful for "exploring annotations of the non-coding genome among the results of published GWAS or novel sets of variants." Using linkage disequilibrium information from the 1000 Genomes Project, linked SNPs and small indels can be visualized along with their predicted chromatin state in nine cell types, conservation across mammals, and their effect on regulatory motifs, according to the authors.
Another database profiled is DistiLD, which allows users to survey diseases and traits in linkage disequilibrium blocks. As the authors write, DistiLD "aims to increase usage of existing GWAS results by making it easy to query and visualize disease-associated SNPs and genes in their chromosomal context."
The creators of the GWASdb database, meantime, claim that the resource contains "20 times more data than the GWAS Catalog" and includes "less-significant" genomic variants, manually curated from the literature. In addition, GWASdb provides functional annotations for each genomic variant, including genomic mapping information; regulatory effects, such as transcription factor binding sites; microRNA target sites and splicing sites; amino acid substitutions; evolution; gene expression; and disease associations, according to the authors.
Another database described in the literature this month is PolymiRTS, which was designed for linking polymorphisms in microRNA target sites with human disease and complex traits. The second version of the database is profiled in a paper that contains descriptions of new features that allow users to link polymorphisms in miRNA target sites with genes associated with human diseases and traits in GWAS.
These latter two databases should support what Technical University of Denmark's Gupta referred to as "systems biology-influenced GWAS."
"The idea of exploring pathways or functionally related protein-protein 'modules' instead of single SNPs has the promise for unraveling mechanistic insights as well as translating to larger proportions of cohorts than current studies appear to," said Gupta.
"Setting up some biological hypotheses in addition to data-driven exploration will be the way to go," he said. "The informatics tools need to evolve to bring the 'bio' back in to complement the statistics at an early stage."