NEW YORK, NY (GenomeWeb News) – The Encyclopedia of DNA Elements, or ENCODE, Consortium is continuing to unravel the functional elements in the human genome, attendees at the Biology of Genomes meeting heard last week, with members integrating a slew of data types from different labs in a careful and systematic way.
HudsonAlpha Institute for Biotechnology President and Director Richard Myers described the approaches that ENCODE participants are using to bring together the disparate data types being generated by numerous high-throughput functional genomics techniques, including RNA-Seq, RNA array, transcription factor ChIP-Seq, and SNP genotyping.
The effort involves many groups using various assays and cell lines but with standardized methods, reagents, and analyses, he explained, doing "lots of experiments in a systematic and standardized way."
ENCODE, an effort to identify all of the functional elements in the human genome, was launched in the fall of 2003 through the National Human Genome Research Institute. In 2007, the ENCODE team published several papers from the project's pilot phase. Later that year, NHGRI announced that it would fund additional studies to scale up ENCODE.
Of late, the team has been focusing on two "Tier 1" cell lines: a HapMap lymphoblastoid cell line, called GM12878, which is also being sequenced by the 1000 Genomes Project, and a chronic myeloid leukemia line called K562. They also plan to look at five "Tier 2" lines — including HeLa cells and primary keratinocytes — as well as other "Tier 3" cell types down the road.
By developing a uniform analysis pipeline, consortium members have already been able to start bringing together and validating data generated in different labs, Myers explained.
Some biological stories are already starting to come out of the data, Myers noted, though researchers still have a ways to go to understand what many of the functional element patterns mean.
And while the results are expected to yield a wealth of biological insights down the road, getting there will involve more experiments, particularly ones that test hypotheses about biological function that come out of the integrated data.
Still, Myers said the payoff for such experiments is higher now than it was even six months or a year ago. He said the project will likely continue to expand, involving more and more cell types and, potentially, tissues and mouse models as well.