NEW YORK (GenomeWeb News) – The National Human Genome Research Institute plans to launch a grant program that would fund efforts to develop innovative computational approaches for interpreting variants found in the non-protein-coding regions of the human genome.
NHGRI's advisory board yesterday approved the program, which will provide up to $500,000 per year to each of five or six projects to create new tools that will narrow down the number of genomic variants thought to contribute to diseases or other traits.
At its meeting yesterday, one of three it holds each year, the National Advisory Council for Human Genome Research agreed that the request for applications (RFA) program, "Interpreting Variation in Human Non-coding Genomic Regions Using Computational Approaches," should proceed, and the first RFAs are now slated for release in August.
Although most disease-associated variants are not found in protein-coding regions of the genome, most sequencing projects today sequence only exons, rather than whole genomes, Lisa Brooks, director of NHGRI's Genetic Variation Program, explained to the council yesterday. That is partly because exon sequencing is cheaper, but also because interpreting variation in the non-coding regions of the genome is currently much more complex, she added.
This complexity is largely a product of linkage disequilibrium, the tendency of nearby variants to be inherited together. As a result, multiple genes, genomic elements, and variants in a region can all be statistically associated with a certain trait or disease, even though only one of them may actually be causing it.
NHGRI wants these projects to develop computational tools that could be used to untangle these associations and tease out the variants that are actually causing the phenotype.
"So the question is: we know that lots of genetic variants are associated with disease, but which ones are actually causing them? Function is complicated, causation is complicated," said Brooks. "LD is very non-trivial. There are some things there that really are contributing to disease, but they've got a whole bunch of buddies along for the ride."
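The "buddies along for the ride" effect can be illustrated with a small sketch. The Python below is only a toy illustration, not a method from the NHGRI program: the eight haplotypes, the three variant names, and the assumption that the trait is fully determined by one causal variant are all invented for the example. It uses r-squared, a standard measure of linkage disequilibrium, to show that a variant in LD with the causal one is also associated with the trait, while an unlinked variant is not.

```python
def r_squared(a, b):
    """Squared correlation (r^2) between two equal-length lists of 0/1 values."""
    n = len(a)
    p_a = sum(a) / n                              # frequency of allele 1 at a
    p_b = sum(b) / n                              # frequency of allele 1 at b
    p_ab = sum(x * y for x, y in zip(a, b)) / n   # frequency of both together
    d = p_ab - p_a * p_b                          # LD coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))

# Hypothetical alleles on eight chromosomes (1 = variant allele present).
causal   = [1, 1, 1, 1, 0, 0, 0, 0]
buddy    = [1, 1, 1, 1, 0, 0, 0, 1]   # usually inherited with the causal allele
unlinked = [0, 1, 0, 1, 0, 1, 0, 1]   # independent of the causal allele

# Toy assumption: the trait appears exactly when the causal allele is present.
trait = list(causal)

ld_buddy = r_squared(causal, buddy)
assoc_buddy = r_squared(buddy, trait)
assoc_unlinked = r_squared(unlinked, trait)

print(f"LD between causal and buddy (r^2): {ld_buddy:.2f}")
print(f"buddy vs. trait (r^2):             {assoc_buddy:.2f}")
print(f"unlinked vs. trait (r^2):          {assoc_unlinked:.2f}")
```

In this toy data the buddy variant shows a clear association with the trait purely because it travels with the causal variant, which is the ambiguity the funded tools are meant to resolve.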
Exon sequencing only covers about 1.5 percent of the genome, Brooks noted, even though the non-coding regions are known to impact disease.
"We know the non-coding DNA variants affect human diseases," she said. "We know they affect drug use and response to drugs. The [genome-wide association studies] catalogue is full of these associations, 90 percent or so of which are not in exons."
The projects funded under this program will draw on data from other studies, such as GWAS data sets, phenotype data, gene-gene or gene-environment interactions, gene expression data, and other types of information.
Many of these data sets may be obtained from public resources, such as dbGaP, The Cancer Genome Atlas, ENCODE, the 1000 Genomes Project, and GTEx, among others.
Applicants seeking funding may plan to identify one or more traits or diseases to study, such as human diseases, disease resistance, or certain physiological traits. The goal for these investigators should be to deliver a robust approach that could be used to study other traits beyond those in the initial project.
Because these projects may be particularly challenging, Brooks said, there are likely to be two rounds of RFAs with separate due dates, one in January 2014 and another the following January.
Putting out separate RFAs would enable research groups that are ready to develop applications to apply for the earlier round. It also would give other groups time to consider using new computational or experimental methods or to set up collaborations.