NEW YORK (GenomeWeb) – In a new analysis, an international team of researchers identified more than 1,200 SNPs associated with educational attainment.
A previous study of nearly 300,000 individuals, published in 2016, linked 74 SNPs to the amount of formal education participants completed. Many of those SNPs had also been tied to brain development.
This new study, however, is much larger. It drew upon genomics data from 1.1 million individuals included in more than 70 datasets, uncovering numerous additional SNPs. As they reported in Nature Genetics today, the researchers also developed a polygenic prediction score based upon the SNPs they identified that accounts for more than 11 percent of the variance seen between individuals in educational attainment. They further traced these SNPs to genes with roles in neuronal communication and neurotransmitter secretion.
"It moves us in a clearer direction in understanding the genetic architecture of complex behavior traits like educational attainment," co-first author Robbee Wedow from the University of Colorado Boulder said in a statement.
Wedow and his colleagues performed a genome-wide association study meta-analysis of years of completed schooling using data from 71 datasets, including those from the UK Biobank and 23andMe. The analysis included more than 1.1 million individuals, though it was restricted to individuals of European ancestry. The researchers identified 1,271 SNPs that reached genome-wide significance, and estimated that the median effect size of the lead SNPs corresponded to 1.7 weeks of schooling per allele.
However, they also conducted a within-family association analysis using four sibling cohorts, totaling 22,135 pairs of siblings, and found that GWAS effect size estimates might be biased upward because of a link between educational attainment and the environment in which children are raised. They noted that a recent paper reported that parental alleles not passed on to children could still influence educational attainment through that rearing environment.
They also reported that the lead SNPs have heterogeneous effect sizes. This imperfect genetic correlation is consistent with previous studies, and would likely also be seen for other phenotypes that are influenced by the environment.
Using the bioinformatics tool DEPICT, the researchers examined the genes located near their lead SNPs and found that they are overwhelmingly enriched for expression in the central nervous system. In particular, they noted that many of these genes encode proteins that are involved in neurotransmitter secretion, ion channel activation, and synaptic plasticity.
With the fine-mapping tool CAVIARBF, they also uncovered 127 SNPs that are likely causal. One is a non-synonymous variant in the CACNA1H gene, which encodes a subunit of a voltage-gated calcium channel that helps traffic N-methyl-D-aspartate receptors.
Based on the SNPs they uncovered, the researchers developed polygenic prediction scores using the National Longitudinal Study of Adolescent to Adult Health and Health and Retirement Study cohorts. These scores, the researchers reported, explain between 11 percent and 13 percent of the variance in educational attainment. The scores could also explain a portion of the variance in other, related phenotypes, such as cognitive performance.
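For readers unfamiliar with the approach, a polygenic score is generally computed as a weighted sum of the number of effect alleles a person carries at each SNP, and "variance explained" is commonly summarized as an R-squared value. The sketch below illustrates that general idea only and is not the researchers' pipeline. The function names, weights, genotypes, and schooling values are all made up for illustration, and the R-squared here is a simple squared correlation with no covariate adjustment.

```python
from statistics import mean


def polygenic_score(allele_counts, weights):
    """Weighted sum of effect-allele counts (0, 1, or 2 per SNP)."""
    if len(allele_counts) != len(weights):
        raise ValueError("need exactly one weight per SNP")
    return sum(g * w for g, w in zip(allele_counts, weights))


def variance_explained(scores, outcomes):
    """Squared Pearson correlation, i.e. R^2 of a one-predictor linear fit."""
    ms, mo = mean(scores), mean(outcomes)
    cov = sum((s - ms) * (o - mo) for s, o in zip(scores, outcomes))
    var_s = sum((s - ms) ** 2 for s in scores)
    var_o = sum((o - mo) ** 2 for o in outcomes)
    return cov * cov / (var_s * var_o)


# Hypothetical per-SNP weights (assumed, not taken from the study).
weights = [0.03, -0.02, 0.05]

# Hypothetical genotypes: one row per person, one allele count per SNP.
genotypes = [
    [0, 1, 2],
    [2, 0, 1],
    [1, 1, 0],
    [2, 2, 2],
]

# Hypothetical years of schooling for the same four people.
years = [12.0, 14.0, 12.0, 16.0]

scores = [polygenic_score(g, weights) for g in genotypes]
r2 = variance_explained(scores, years)
print(f"scores = {scores}")
print(f"R^2 on toy data = {r2:.3f}")
```

With only four invented individuals, the printed R-squared says nothing about real predictive power. The point is the mechanics: per-SNP weights come from the association study, scores are computed in a separate cohort, and the score is then compared against the observed outcome.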
"Having a low polygenic score absolutely does not mean that someone won't achieve a high level of education," Wedow added. "As with many other outcomes, it is a complex interplay between environment and genetics that matters."
He and his colleagues also noted that their study was limited to individuals of European ancestry and that their score has a much lower predictive power in African Americans, suggesting it would have lower predictive power in other non-European groups as well.