NEW YORK (GenomeWeb) – The UK Biobank has collected and developed a resource of genomic and phenotypic data on nearly 500,000 individuals.
The project, spearheaded by the Medical Research Council, the Wellcome Trust, and other agencies, was launched in 2006 with the aim of collecting DNA alongside medical and lifestyle data from half a million individuals across the UK aged 40 to 69.
Now, researchers from the project have described the full dataset in Nature. An interim UK Biobank dataset was released in 2015.
"The UK Biobank dataset represents a step change in the field of human genetics," senior author and University of Oxford researcher Jonathan Marchini said in a statement. "Research groups all over the world are now actively analyzing the data to understand how our genetic code influences disease."
The participants provided blood, saliva, and urine samples for analysis and underwent a battery of tests, including measurements of heart and lung function, hearing and eyesight, and lipid and hormone levels. At the same time, the participants underwent genotyping by the Affymetrix Research Services Laboratory on either the Applied Biosystems UK BiLEVE Axiom Array by Affymetrix or the UK Biobank Axiom Array, two arrays with overlapping marker content.
After applying a custom genotype calling pipeline, filtering, and other quality-control measures, the researchers released a set of genotype calls for 488,377 samples at 805,426 markers. The researchers also estimated haplotypes for their cohort and imputed their dataset using the UK10K and 1000 Genomes phase 3 reference panels to generate a dataset of 93 million autosomal SNPs, indels, and structural variants in 487,442 individuals and another 3.9 million markers on the X chromosome.
When they compared allele frequencies in the UK Biobank cohort to those of individuals of European ancestry in the Exome Aggregation Consortium database, the researchers found them to be broadly similar.
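A comparison of this kind can be illustrated with a minimal sketch in Python. It is not the researchers' pipeline: the genotype counts below are made up, the marker names and helper functions are hypothetical, and the use of a Pearson correlation plus a largest-difference check is an assumption about one reasonable way to judge whether two sets of allele frequencies are "broadly similar."

```python
from math import sqrt


def alt_allele_frequency(n_hom_ref, n_het, n_hom_alt):
    """Alternate-allele frequency from diploid genotype counts."""
    n_alleles = 2 * (n_hom_ref + n_het + n_hom_alt)
    return (n_het + 2 * n_hom_alt) / n_alleles


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical genotype counts per marker: (hom-ref, het, hom-alt).
# These numbers are illustrative only and come from no real dataset.
cohort_a = {
    "marker_1": (810, 180, 10),
    "marker_2": (490, 420, 90),
    "marker_3": (250, 500, 250),
}
cohort_b = {
    "marker_1": (800, 190, 10),
    "marker_2": (500, 400, 100),
    "marker_3": (240, 510, 250),
}

# Compare only markers present in both cohorts.
shared = sorted(cohort_a.keys() & cohort_b.keys())
freqs_a = [alt_allele_frequency(*cohort_a[m]) for m in shared]
freqs_b = [alt_allele_frequency(*cohort_b[m]) for m in shared]

r = pearson(freqs_a, freqs_b)
max_diff = max(abs(a - b) for a, b in zip(freqs_a, freqs_b))
print(f"markers compared: {len(shared)}")
print(f"correlation of allele frequencies: {r:.4f}")
print(f"largest absolute difference: {max_diff:.4f}")
```

In practice, a check like this would run over hundreds of thousands of shared markers, and markers whose frequencies differ sharply between the cohorts would be flagged for a closer look.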
As a further check on their dataset, the researchers tested whether it would yield known results. They imputed HLA types at two-field resolution for 11 HLA genes and then tested for associations with HLA-linked conditions, finding their results to be consistent with previous studies. Likewise, a genome-wide association study for height using this dataset largely reflected the results of the Genetic Investigation of Anthropometric Traits Consortium. It also underscored the increased power of the UK Biobank.
The researchers noted that more data would be added to the Biobank, as the participants have consented to follow-up through their health records. In addition, the researchers are continuing to collect data using, for instance, activity trackers and imaging.
A separate paper, also appearing in Nature today, gives a taste of what those future studies may entail. A subset of UK Biobank participants has undergone functional MRI, which researchers led by Oxford's Stephen Smith noted could give insight into both brain structure and function.
Among these 8,248 individuals, the researchers identified more than 3,000 functional and structural brain imaging-based phenotypes and conducted a genome-wide association study to determine whether these phenotypes were heritable and linked to particular genetic alterations. They found, for instance, that brain volume was more likely to be heritable than cortical thickness.
In addition, they uncovered 1,262 significant associations between SNPs and the image-derived phenotypes, noting that this replicated many of the findings from the ENIGMA consortium. They found associations between T2* imaging of the caudate nucleus, putamen, and pallidum and SNPs in or near genes that affect iron transportation and storage. These genes, the researchers said, also tended to be linked to neurodegenerative disease.
They also found associations between brain features and genes involved in brain development and plasticity, noting that these genes were also linked to schizophrenia and depression.
"This work is just a tantalizing teaser of how much more we will learn once 100,000 UK Biobank participants have undergone brain imaging — a project that should be completed by 2020," added Vanderbilt University's Nancy Cox in a related commentary also appearing in Nature today.
Many more studies fueled by the UK Biobank are expected, as the dataset is an open-access resource. "[W]hat is exciting is that there will be really clever scientists who will exploit these data to improve human health and healthcare in ways that currently we can't imagine," Biobank paper co-author and Oxford researcher Peter Donnelly added in a statement.