ENCODE Explosion

The ENCODE project just published 30 papers examining the function of the human genome.

It is well accepted that less than 3% of the human genome actually appears to encode proteins or RNA. Over 8% of the human genome consists of remnants of recognizable viral DNA that was integrated into the genomes of our ancestors over millions of years of evolution. Nevertheless, it is clear that the genomic DNA flanking these genes contains many promoter and repressor elements that regulate their expression, as well as sequences needed for genome structure, replication, repair and degradation. Introns within genes also provide for alternative splicing. All this has been known for decades, although it had certainly not been mapped out as extensively before the ENCODE effort. There are also likely to be many surprises yet in store in the so-called "junk" or "dark" DNA.

That being said, I do wonder how many of the putative transcription factor and histone interactions with DNA sequences may actually be non-specific and truly inconsequential. Such low-level interactions may simply be noise that is tolerated. However, the main reason why I have a hard time accepting that about 80% of the human genome sequence is functional and important is the data from other species with a similar number of genes but extremely divergent amounts of DNA. For example, the genome of the fruit fly Drosophila melanogaster has 0.165 billion nucleotide base pairs (nbp), whereas that of the mountain grasshopper Podisma pedestris has 14 billion nbp, and that of the flower Fritillaria assyriaca has a whopping 124.9 billion nbp. The human genome, at about 3.2 billion nbp, lies between the two insects in size. While the fruit fly has about 85 times less DNA than the mountain grasshopper, both are very successful hexapods.
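As a quick check of the arithmetic, the genome sizes quoted above can be compared in a few lines of Python. This is only a sketch: the figures are the ones given in this comment, and the names `genome_size_gbp` and `fold_difference` are made up for illustration.

```python
# Genome sizes in billions of nucleotide base pairs, as quoted in the comment.
# The dictionary and function names here are illustrative, not from any library.
genome_size_gbp = {
    "Drosophila melanogaster": 0.165,
    "Homo sapiens": 3.2,
    "Podisma pedestris": 14.0,
    "Fritillaria assyriaca": 124.9,
}


def fold_difference(larger, smaller, sizes=genome_size_gbp):
    """Return how many times bigger the first genome is than the second."""
    return sizes[larger] / sizes[smaller]


if __name__ == "__main__":
    human = genome_size_gbp["Homo sapiens"]
    # List the species from smallest to largest genome, relative to human.
    for species, size in sorted(genome_size_gbp.items(), key=lambda kv: kv[1]):
        print(f"{species}: {size} billion nbp ({size / human:.2f}x human)")

    ratio = fold_difference("Podisma pedestris", "Drosophila melanogaster")
    print(f"Grasshopper vs. fruit fly: about {ratio:.0f}-fold")
```

The grasshopper-to-fruit-fly ratio works out to roughly 85-fold, which matches the figure stated above.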

There appears to be strong evolutionary pressure in multicellular organisms to retain excess baggage simply to make sure that the important parts are retained. There are countless cases of this, ranging from the extensive remodelling of embryos during early development to the hundreds of thousands of superfluous phosphorylation sites in the proteins encoded by the human genome. From the level of gross anatomy down to the molecular, there are many examples of inefficiency in biology. As I have pointed out above, DNA sequencing studies in diverse organisms have increasingly demonstrated extreme ranges in genome size among organisms that still have a relatively similar number of genes. It seems highly unlikely that the extra DNA serves to increase the amount of genome regulation in certain organisms over others. Over-regulation can also be highly disadvantageous, as observed, for example, with more bureaucratic governments.

I cannot help but laugh now at Bill Haseltine, who chided J. Craig Venter and Celera, stating that sequencing the whole human genome was not worth it. Wrong again.

I remember standing up in a human genome meeting in the late 1980s and pronouncing, "God don't make junk".