Human Genome Not So Tidy After All, ENCODE Project Suggests

NEW YORK (GenomeWeb News) — The human genome may not be a “tidy collection of independent genes,” but rather “a network in which genes, regulatory elements and other types of DNA sequences interact in complex, overlapping ways,” according to the National Human Genome Research Institute, which today announced the publication of results from its ENCODE Project in the June issue of Nature, and 28 companion papers in Genome Research.
 
The ENCODE consortium, which comprises 35 groups from 80 organizations around the world, was launched in 2003 to build a “parts list” of the biologically functional elements in 1 percent of the human genome. It is a pilot study meant to “test the feasibility of a full-scale initiative to produce a comprehensive catalog of all components of the human genome crucial for biological function.”
 
In the papers, ENCODE partners describe “major findings” in gene transcription and regulation, chromatin and replication, and evolutionary constraint.
 
These findings include the discovery that the majority of DNA in the human genome is transcribed into RNA, and that these transcripts extensively overlap one another. “This broad pattern of transcription challenges the long-standing view that the human genome consists of a relatively small set of discrete genes, along with a vast amount of so-called junk DNA that is not biologically active,” NHGRI said in a statement.

The data also showed that the human genome contains “very little unused sequences” and is a “complex, interwoven network.” According to NHGRI, in this network genes are “just one of many types of DNA sequences that have a functional impact.”

 
In the Nature paper, the authors write, “Our perspective of transcription and genes may have to evolve,” noting that the network model of the genome “poses some interesting mechanistic questions” that have yet to be answered.

Other “surprises” in the ENCODE data could have “major implications” in how researchers understand the evolution of genomes, particularly mammalian genomes. “Until recently, researchers had thought that most of the DNA sequences important for biological function would be in areas of the genome most subject to evolutionary constraint,” NHGRI said.

However, the ENCODE effort found that about half of functional elements in the human genome “do not appear to have been obviously constrained during evolution, at least when examined by current methods used by computational biologists.”

According to the ENCODE researchers, this lack of evolutionary constraint may indicate that many species' genomes contain a “pool of functional elements,” including RNA transcripts, that “provide no specific benefits in terms of survival or reproduction.”

Over time, this pool may serve as a “warehouse for natural selection” by acting as a “source of functional elements unique to each species and of elements that perform the similar functions among species despite having sequences that appear dissimilar,” the researchers speculated.

Other ENCODE findings include the identification of numerous previously unrecognized start sites for DNA transcription; the discovery of evidence that, contrary to traditional views, regulatory sequences are just as likely to be located downstream of a transcription start site on a DNA strand as upstream; the identification of specific signatures of change in histones, and correlation of these signatures with different genomic functions; and a deeper understanding of how histone modification coordinates DNA replication.

 
The NHGRI said that taken together, these findings will “reshape our understanding of how the human genome functions.”
 
The study focused on 44 targets, which together cover about 1 percent of the human genome sequence, or about 30 million DNA base pairs. The targets were selected to provide a representative cross-section of the entire human genome. All told, the ENCODE consortium generated more than 200 datasets and analyzed more than 600 million data points.
 
NHGRI Director Francis Collins said the ENCODE effort has “blazed the way for future efforts to explore the functional landscape of the entire human genome.”
 
The main portal for ENCODE data is the University of California, Santa Cruz's ENCODE Genome Browser, and the analysis effort is coordinated by Ensembl, a joint project of the European Bioinformatics Institute and the Wellcome Trust Sanger Institute.
 
Much of the primary data has been deposited in databases at the NIH's National Center for Biotechnology Information and at EBI.
 
