The widely held perception that next-generation sequencing technology is moving into areas once dominated by microarrays was given extra credence last week when the National Human Genome Research Institute announced grants for the second phase of its Encyclopedia of DNA Elements, or ENCODE, project.
Begun in 2003, the ENCODE project aims to identify all functional elements in the human genome sequence. The project comprises three phases: a pilot project phase, a technology development phase, and a production phase. The pilot and technology development phases wound up earlier this year, resulting in papers in the June issues of Nature and Genome Research.
During the pilot phase of ENCODE, researchers primarily used tiling arrays from Affymetrix and NimbleGen to identify functional elements in 1 percent of the human genome. In the next phase of the project, however, roughly half of the 10 institutions funded to participate in the project will use next-gen sequencers.
Scott Tenenbaum, an assistant professor of molecular genetics at the State University of New York in Albany, told BioArray News that NHGRI suggested he evaluate next-gen sequencers before embarking on his $2.2 million, three-year slice of the project aimed at comprehensively identifying RNA-based cis-regulatory elements.
“My role is to identify the binding sites of RNA-binding proteins,” Tenenbaum said. “We predominantly have used tiling arrays in the past, but [NHGRI] asked us to determine whether some of the deep sequencing methods are appropriate for us.”
Tenenbaum said he is wary of the “hype” surrounding next-generation sequencing technologies. For example, he said that Affy’s tiling arrays received a similar amount of buzz when they were first launched, but after working with them as an early-access customer, he found that they were “more involved than expected.”
Even though tiling arrays were a “major force” in the pilot phase of ENCODE, they had their limits, Tenenbaum said. “I am using the ENCODE arrays now to do a whole-genome scan and it is 11 arrays,” he said. “That is an expensive experiment to do. At some point the sequencing will be more attractive.”
During the evaluation, Tenenbaum’s lab will run NimbleGen and Affy tiling arrays in house and will send aliquots to some of the manufacturers. It will also look at sequencers sold by Illumina, Applied Biosystems, and 454. In the end, he said his lab will “let the chips land where they may” before adopting a platform for its part of the ENCODE project.
“We are not unhappy with tiling array data; this is just something that NHGRI has asked us to do,” he said. “They said they don’t want to fund us for something today that three years from now we will have to repeat using another technology.”
Arrays Plus Sequencers
While Tenenbaum’s lab is weighing arrays against sequencers, other labs funded for the second phase of the ENCODE project will be using both technology platforms. For example, Greg Crawford’s lab at Duke University’s Institute for Genome Sciences and Policy has been awarded $6.5 million to identify and characterize regions of open chromatin using a cross-platform approach.
Crawford told BioArray News this week in an e-mail that Duke aims to “identify at high resolution all active gene regulatory elements in the human genome among cell types representative of most human tissues.” His lab will seek to accomplish this goal by identifying regions of open chromatin with two independent and complementary methods: DNaseI hypersensitivity assays and formaldehyde-assisted isolation of regulatory elements. Crawford said this will be combined with chromatin immunoprecipitation for a small set of selected regulatory factors.
“Each method will be coupled to two detection platforms: high-resolution whole-genome tiled arrays [from NimbleGen] and Illumina's sequencing by synthesis strategy,” Crawford wrote. “We believe that using this dual platform approach allows for global validation.”
Crawford’s decision to use NimbleGen’s arrays with Illumina’s sequencer highlights another trend in the new market for arrays and sequencers: a lack of brand loyalty in cross-platform experiments. Several companies, like Illumina and Roche, have sought to become comprehensive tool providers, selling both arrays and sequencers. But so far, researchers don’t appear to see an advantage in buying both platforms from the same vendor. Instead they opt to use NimbleGen arrays with an Illumina Genome Analyzer, or Affy tiling arrays with a sequencer from 454 Life Sciences. NimbleGen and 454 are owned by Roche.
“We've found that for our purposes we need huge numbers of sequences, and we can efficiently generate larger numbers of reads using the Illumina system,” wrote Crawford about the choice to use Illumina instead of 454. “Even though the read lengths of Illumina are shorter than 454, the 20-35mer Illumina sequences are more than adequate for uniquely mapping to the genome.”
Moreover, Crawford argued that rather than replacing arrays, next-gen sequencers will make the results of his part of the ENCODE project more reliable. “Sequencing is useful in that it generates to-the-basepair resolution ... and you have to worry less about cross hybridization artifacts,” he wrote. “Arrays are useful in that they provide a cheap platform to validate the quality of the material before going genome-wide, and the platform inherently normalizes for copy number variation, which is a huge advantage when analyzing complex aneuploid cell lines.”
Affy’s Cool $10.2 Million
Of the projects funded to contribute to the second phase of ENCODE, one will rely mostly on array technology. Affymetrix was awarded $10.2 million to use its GeneChip tiling arrays to identify protein-coding and non-protein-coding RNA transcripts. Affy’s team will be led by Thomas Gingeras, the firm’s vice president of biological science.
“Our project will work on mapping all transcribed regions in the human genome,” Gingeras wrote in an e-mail to BioArray News this week. “We will characterize both the protein-coding and nonprotein-coding transcripts as to where they start and stop, as well as their structure. Additionally, we will try to formulate a classification system to help group both protein-coding and nonprotein-coding transcripts.”
According to Gingeras, Affy will use different versions of its whole-genome tiling arrays, along with other types of custom technologies. He said he was uncertain which other projects will be using Affy tiling arrays but noted that fellow ENCODE researchers John Stamatoyannopoulos at the University of Washington in Seattle and Michael Snyder at Yale University have used them in the past. Neither Stamatoyannopoulos nor Snyder returned e-mails seeking comment by press time.
Regardless, Gingeras wrote that Affy benefited from its participation in the pilot phase of ENCODE and that its business is likely to benefit from the new four-year project.
“After the previous collaboration, Affymetrix was able to develop various types of whole-genome tiling arrays, sample prep and labeling assays, and computational methods to deal with large amount of data,” Gingeras wrote.
“We will now take a deeper look in order to develop a comprehensive map and characterization of both protein-coding and non-coding transcripts for the human genome,” he wrote. “This tool will be an important and valuable resource for the scientific community in both basic and applied research.”