NHGRI Clears $123M to Expand ENCODE

By a GenomeWeb staff reporter

NEW YORK (GenomeWeb News) – The multi-year effort at the National Institutes of Health to identify the functional elements in the genomes of humans, fruit flies, and roundworms will expand to create a more complete catalog of those elements, after leadership at the National Human Genome Research Institute approved up to $123 million to fund several new grant programs Monday.

NHGRI's National Advisory Council for Human Genome Research (NACHGR) gave the nod yesterday to three so-called concept clearance plans to develop grant solicitations to start the next phase of the Encyclopedia of DNA Elements (ENCODE) Project.

The expansion is aimed at creating as complete a catalog as possible using current technologies; supporting ways to store, display, and release ENCODE data; and funding individual research grants to expand the pool of researchers who are analyzing all of these data.

Also at its triannual meeting, NACHGR approved up to $30 million to launch a new project to fund investigations of how genome sequences dictate gene expression.

The ENCODE Project, which launched in 2003, and its sibling modENCODE have "been very successful in generating large amounts of high-quality data" that are being "heavily used" by the research community, according to the concept clearance proposal.

However, these programs have still interrogated only a fraction of the cells and tissues needed for a comprehensive catalog of functional genomic elements in human, D. melanogaster, and C. elegans. The ENCODE expansion will be aimed at supporting simultaneous and collaborative efforts to drive the development and implementation of robust new research methods.

"We wanted to expand the ENCODE project to generate as complete catalogs as feasible," ENCODE Program Director at NHGRI Elise Feingold told the council yesterday. "We really wanted to capitalize on the progress that's been made in establishing high-throughput and efficient production pipelines [and to] take advantage of economies of scale, centralized management, and centralized coordination.

"In this initiative we really wanted to focus on the data production and analysis efforts to maximize the utility of the resources, and think about the most useful data resources to collect," Feingold said, noting that this expansion also will place "a big emphasis on continuing annotation of the human genome."

Focusing on human genome annotation "was felt to be a very high payoff project, and useful for the community," she added. "There is a lot of new RNA data that is now available that will fuel the continued annotation of the human genome."

But, she noted, ENCODE researchers wanted to expand this to annotation of the mouse genome and expand the repertoire of data types in ENCODE, particularly to include more classes of RNA molecules and functional elements within RNA molecules for both the human and mouse genomes. "Clearly, when the ENCODE project started four years ago we did not have as good an appreciation for the different classes of RNA molecules as we do now, so we feel this is an important area to expand in," said Feingold.

One of the new ENCODE solicitations will provide between $15 million and $25 million per year for four years to expand the encyclopedia in human and model organisms by funding six to eight new studies.

This program will fund efforts to improve gene models on the basis of new data on RNA transcripts coming from next-generation sequencing platforms, to annotate the mouse genome, to expand the types of data found in ENCODE, particularly to include more classes of RNA molecules, and to extend existing data sets more deeply into the human genome.

These investigations will put the primary focus on the human genome and a secondary focus on the mouse. While the program will continue to support studies of C. elegans and D. melanogaster, those models will make up a smaller segment of the ENCODE program going forward.

Specific studies funded under these grants could include projects that generate maps in more cell types, including maps of binding sites for all transcription factors, sites of open chromatin in more cell types, histone marks and other relevant chromatin proteins, and sites of DNA methylation.

Another new project will provide up to $3.5 million per year over four years to fund the creation of a Data Analysis and Coordination Center for the ENCODE project. This data center will serve as a centralized database for the ENCODE research projects, and it will develop, house, and maintain databases to track, store, and provide access to the data those projects generate.

"What we're doing here is actually consolidating efforts," Feingold explained, pointing out that there already was a data coordination center for ENCODE, another for modENCODE, and data analysis centers for both.

The third ENCODE solicitation will provide up to $3 million per year for three years for six to 10 awards to researchers analyzing ENCODE data. These projects could include combining ENCODE data with functional genomic information, using the data to improve analysis and interpretation of disease maps to identify causal variants, or developing new methods to improve analysis and interpretation of these data.

"We really want to make sure that this data gets out to the community and that people are using it. We'd like to bring additional people in to looking at ENCODE data," she said.

Feingold explained that these grants should support individual research projects using ENCODE data and combining it with related functional genomics data to improve analysis of disease mapping studies and to develop new methods for analyzing the data.

The other concept clearance NACHGR approved yesterday will provide up to $10 million per year over three years to fund between five and eight demonstration projects focused on gene regulation.

The Genomics of Gene Regulation (GGR) project will fund studies of how genome sequence dictates gene expression, with the aim of identifying regulatory sequences and determining how they control where and when genes are turned on and off, their level of expression, and their response to environmental stimuli.

The GGR project's long-term goal is to apply genomic approaches and technologies to understand how genetic regulatory systems are assembled and then function to determine biological processes.