GnomAD Resource Introduced at ASHG Meeting, Doubles ExAC Dataset

VANCOUVER, British Columbia (GenomeWeb) – Building on the success of the Exome Aggregation Consortium (ExAC) dataset, members of the same research team have established a collection that contains roughly twice as many exomes as the version of ExAC released to the public two years ago, analyzed alongside more than 15,000 whole-genome sequences.

The Broad Institute's Daniel MacArthur introduced the resource, known as gnomAD, at the American Society of Human Genetics meeting today. MacArthur noted that more than 5,000 principal investigators provided exome and genome data for gnomAD, which has now been released publicly. The dataset currently includes information on 126,216 exomes and 15,136 whole-genome sequences.

ExAC was established to help overcome challenges that researchers have faced when trying to tap into variant data from the massive amounts of genome and exome sequence generated around the world, MacArthur explained. Those challenges range from issues related to informed consent or regulatory constraints to subtle differences in the pipelines used to call variants. Since its launch in October 2014, the ExAC site has been viewed nearly 6 million times. Variant data gleaned from the collection has been used by investigators studying features of protein-coding regions of the human genome, as well as by those filtering variants to home in on disease- or trait-related variants. Details on the ExAC resource and its applications were published earlier this year in Nature.

For the new gnomAD collection, MacArthur and his colleagues called variants in the available exomes and genomes separately using consistent variant calling processes, but ultimately analyzed the sequences together. So far, they have identified nearly 18 million variants in the expanded set of exome sequences, including 7.5 million variants not described previously. The whole-genome sequence data has yielded more than 254 million variants, almost 160 million of which are novel.

Along with the variant coverage available in gnomAD, MacArthur touted the diversity of the dataset, which represents individuals from a wide range of ancestry groups and includes sequences for some 5,000 individuals of Ashkenazi Jewish descent.

MacArthur also cautioned, however, that the gnomAD website is currently in its beta version and urged users to report any unusual variant calls or bugs to the team as it works to continue improving the site. The group plans to finalize quality control and variant filtering for the dataset shortly and will release non-coding variant information from gnomAD in the coming weeks. Protein-coding variants from the collection are already available.

There are no restrictions on the use of gnomAD data or on publications stemming from analyses of it, MacArthur said. But he urged those intent on doing large-scale analyses with the data to contact the team beforehand to avoid duplicating the efforts of other groups.
