NIH Takes on Human Genome Variant Collection, Curation Project with $25M in Grants to Three Groups


Originally published Oct. 7.

By awarding $25 million in grant funding to three groups, the National Institutes of Health has taken on the enormous task of collecting and annotating data on clinical variants across the human genome.

Specifically, the NIH has awarded grants to three groups tasked with managing different aspects of the project, called the Clinical Genome Resource, or ClinGen. Researchers led by Heidi Rehm from Brigham and Women's Hospital have received an $8.25 million grant over three years; a team led by Jonathan Berg and James Evans from the University of North Carolina, Chapel Hill, has received $8.4 million over four years; and a team headed by Stanford University's Carlos Bustamante has been awarded $8.4 million over four years.

Rehm's group is in charge of collecting variant data from labs and clinics, and developing standard formats for depositing this information into ClinVar, a centralized repository of genotype-phenotype information. Bustamante's team will use computational methods and informatics tools to sift through the vast body of collected data on the variants and determine which markers are potentially clinically relevant and should be studied further. Then, Berg and his colleagues will form disease-specific expert groups to analyze the data on the variants flagged by Bustamante's team and determine which variants are relevant for patient care.

Ultimately, researchers involved in ClinGen will work with entities such as the American College of Medical Genetics to establish guidelines on how healthcare providers should use these clinically relevant genetic variants in medical practice.

Lisa Brooks, director of the Genetic Variation Program within the NHGRI's Division of Genome Sciences, believes that ClinGen is long overdue, given that the Human Genome Project was completed more than a decade ago and that large-scale efforts to characterize genomic variation, such as the 1000 Genomes Project and the International HapMap Project, have been launched.

"There are probably by now around 2,000 locus-specific databases [for] curating information on particular genes or diseases, but they are in different formats, [contain] different types of data, [and have] different data qualities," Brooks said. "So, it would be hugely helpful to get this information in one place."

ClinGen is particularly necessary because the increased application of advanced sequencing tools in genomic research helps investigators identify numerous variants of interest much faster than they can determine whether these markers are associated with disease or truly meaningful for patient care. Moreover, since the various databases in which these data are stored have divergent protocols for assessing the clinical relevance of variants, there is conflicting information on which markers are benign, deleterious, or neutral.

In an interview with PGx Reporter sister publication BioInform, Berg noted that before next-generation sequencing tools were available, most labs tackled curating and annotating single genes or a few genes in depth. Now that NGS tools are more readily used by researchers, many labs are increasingly encountering variants for which there is little or no data in terms of disease association (BI 10/4/2013).

The grant "is partly about starting to clean up some of that confusion about which variants are really pathogenic and which … are of uncertain clinical significance," but it's also about setting up infrastructure that other curators can use in their projects, Berg told BioInform.

While the goal of ClinGen is to standardize the annotation of clinical variant information across the genome, the grantees are approaching the project by taking on small pieces of the puzzle. For example, Rehm's team, charged with collecting variant data into ClinVar and developing standards for data collection, is focusing initially on three disease areas. Rehm told PGx Reporter that her team will focus its resources on collecting data and standardizing processes for variants associated with cardiovascular disorders, such as cardiomyopathies, arrhythmias, and rasopathies; hereditary cancers, including colon, breast, and multi-organ syndromes; and inborn errors of metabolism.

The ClinVar database made headlines earlier this year, when University of California's Robert Nussbaum launched Sharing Clinical Reports, a project in which Nussbaum and other volunteers have been asking cancer clinics and patient groups to deposit data on BRCA1 and BRCA2 variants into ClinVar. Certain BRCA1/2 mutations are associated with an increased risk of breast and ovarian cancer in women who inherit them. However, the largest repository of annotated BRCA1/2 markers is held in a database proprietary to Myriad Genetics, the company that markets the most widely used test for gauging these mutations.

Through Sharing Clinical Reports, Nussbaum and his colleagues are hoping to collect information in ClinVar on BRCA variants with uncertain links to breast and ovarian cancer and make it broadly available to all researchers. Nussbaum is one of the researchers on Rehm's team awarded the ClinGen grant (PGx Reporter 6/19/2013).

NHGRI's Brooks pointed out, however, that given ClinGen's scope, clinical variants like BRCA for breast cancer are not going to be the most challenging to annotate. "Ultimately, the goal is to deal with the entire genome. Mostly we don't have a clue. We don't have the datasets that will tell us which variants are actually clinically relevant," she said.

"Obviously, for breast cancer there are very nice databases, there's a huge amount of clinical information, and there is a lot of attention. So, to some extent clinical testing labs have a lot of information on the variants and data types," she continued. "In those cases, we actually have hope of assigning maybe not every variant, but most of the variants, to either benign or pathogenic [status]. But those are special cases."

As ambitious as the project is, Brooks hopes that ClinGen will be able to curate variants from the entire human genome over time. "We'll be getting more of that information, because this is a process designed to last a long time to continue curation of the human genome," she said.

The amount of money given to the grantees will "give them a good start" in coming up with standard methods of collecting and assessing the data, she added. The researchers heading the three ClinGen grants are experts in the field of genetics and have much experience with data annotation, "so, they're not coming up to speed from a standing position," Brooks said.

Once ClinGen researchers determine whether a set of clinical variants is linked to disease, the final step will be to work with bodies that develop clinical practice guidelines and advise healthcare providers on how they should use genetic variant data in patient care. However, the process of developing clinical guidelines around genetic variants is perhaps as disjointed as the data collection and annotation process, and often doesn't mirror what's going on in clinical practice.

For example, the Agency for Healthcare Research and Quality recently issued a report on whether doctors should use CYP2C19 variants and platelet reactivity tests to guide antiplatelet treatment. Specifically, AHRQ found limited evidence showing that loss-of-function CYP2C19 variants are linked to an increased risk of adverse cardiovascular outcomes and insufficient evidence that testing should be used to guide antiplatelet treatment choice.

Counter to these guidelines, the US Food and Drug Administration has updated the label for the antiplatelet clopidogrel to advise doctors to consider alternative treatments for patients who have CYP2C19 variants that make them poor metabolizers of the drug. There are many commercially available tests that gauge CYP2C19 variants, and many of them are approved by the FDA for guiding treatment strategies with drugs metabolized by CYP2C19 enzymes.

Although CYP2C19 testing isn't widespread across cardiology practices in the community setting, several academic centers, such as Vanderbilt University and the University of Florida, are routinely testing for these variants in patients who have cardiovascular conditions and have undergone stent procedures. UF is even collaborating with community hospitals to launch CYP2C19 testing programs in cardiology practices (PGx Reporter 8/21/2013; 9/11/2013).

A focus at ClinGen will be to work closely with groups like ACMG and others in charge of crafting guidelines, so recommendations around genetic variants will reflect the latest evidence. "This is a data-driven project. Some of these things might change over time … [and] hopefully … more and more [variants will] fall into the pathogenic or benign category, as opposed to the unknown," Brooks said. "Our hope is that by being as comprehensive as possible, we'll get to the right conclusions as soon as possible."
