COLD SPRING HARBOR, NY — Researchers at the Broad Institute, in collaboration with Helicos BioSciences and a group at the University of Massachusetts Medical School, have used the company's sequencing technology to identify origins of replication in the genomes of several fission yeast species, thus doubling the number of eukaryotes for which such an analysis has been performed on a genome-wide level.
At the Biology of Genomes meeting at Cold Spring Harbor Laboratory last week, Chad Nusbaum, co-director of the Broad's genome sequencing and analysis program, reported that the Helicos single-molecule sequencing technology seemed a good fit for the project because it generates large numbers of reads and does not require amplification of the DNA during the sample preparation, so the data would not suffer from amplification-induced bias.
Origins of replication, small DNA regions scattered across the genome where DNA replication starts during the S phase of the cell cycle, are difficult to identify in eukaryotes, he explained, because they are not conserved in sequence or position along the genome. Until recently, origins had been mapped genome-wide in only two eukaryotes: baker's yeast, Saccharomyces cerevisiae, and the fission yeast Schizosaccharomyces pombe.
For their project, in collaboration with Nick Rhind's group at the UMass Medical School, the researchers chose to analyze replication origins in two other, recently sequenced fission yeasts, S. octosporus and S. japonicus, as well as to re-analyze S. pombe as a control. They extracted DNA from cells in the S phase and in the G2 phase of the cell cycle and sent it to Helicos for sequencing. The sequencing generated 60 million mappable unpaired reads per genome.
After mapping the reads back to the genome and subtracting G2-phase from S-phase data, the scientists were able to identify origins of replication, which were covered by greater numbers of reads than other areas of the genome because their DNA had already begun to be replicated.
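The subtraction step described above can be illustrated with a minimal Python sketch. It is not the Broad team's pipeline: the fixed-size genomic bins, the scaling of each sample to a fraction of its total reads, the threshold value, and the function names origin_signal and call_candidate_origins are all assumptions made for illustration.

```python
def origin_signal(s_counts, g2_counts):
    """Per-bin S-phase minus G2-phase read depth.

    Each sample is scaled to a fraction of its own total reads first
    (an assumed normalization) so that sequencing depth does not
    dominate the difference.
    """
    if len(s_counts) != len(g2_counts):
        raise ValueError("both samples must cover the same bins")
    s_total = sum(s_counts)
    g2_total = sum(g2_counts)
    if s_total == 0 or g2_total == 0:
        raise ValueError("both samples need at least one mapped read")
    return [s / s_total - g / g2_total for s, g in zip(s_counts, g2_counts)]


def call_candidate_origins(signal, threshold):
    """Return indices of bins that are local maxima above the threshold."""
    peaks = []
    for i, value in enumerate(signal):
        left = signal[i - 1] if i > 0 else float("-inf")
        right = signal[i + 1] if i < len(signal) - 1 else float("-inf")
        if value > threshold and value >= left and value >= right:
            peaks.append(i)
    return peaks


# Made-up read counts per bin, for illustration only.
s_phase = [10, 12, 30, 11, 10, 9, 25, 10]
g2_phase = [10, 10, 10, 10, 10, 10, 10, 10]

signal = origin_signal(s_phase, g2_phase)
print(call_candidate_origins(signal, threshold=0.05))  # bins 2 and 6
```

In this toy example the two bins with excess S-phase coverage stand out after the G2 background is removed. Real data would need a bin size and threshold chosen for the genome and read depth at hand.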
The resolution of the origin maps was finer than a kilobase, and the researchers were able to distinguish between origins used at low frequency and those used at high frequency. The data were also very reproducible, with less than 10 percent of the signal representing noise, Nusbaum said. The S. pombe map corresponded well with known or predicted origins in this species. The Broad team is currently analyzing the nature of the origins in the other two fission yeast species further and plans to conduct time-course experiments.
"With four lanes of Helicos sequencing, we were able to double the number of eukaryotes with genome-wide maps of origins," Nusbaum noted, adding that in principle, the same approach could be used in other species, including humans.
Asked by an audience member whether the amplification-free sample prep actually made a difference in the experiment, he said that his team had not performed a direct comparison with a sequencing technology that requires amplification, such as the Illumina Genome Analyzer, but that he suspected amplification would introduce bias.
The replication origin project was one of several test projects the Broad Institute conducted in collaboration with Helicos prior to the institute's decision late last year to bring the Helicos Genetic Analysis System in house (see In Sequence 12/16/2008).
In another such project, the Broad researchers sequenced the genome of one of the fission yeast species at 40- to 50-fold coverage and assembled the data with the Velvet short-read assembler. The results enabled them to close several hundred gaps in the existing genome sequence of this species that were due to unclonable sequence stretches, Nusbaum said.
They also tested the Helicos technology in a ChIP-seq experiment, obtaining results similar to those from Illumina ChIP-seq projects. Only nanograms of DNA were required as input material, Nusbaum pointed out.
Overall, he said the Helicos sequencer is currently well-suited for "counting applications" such as transcriptome analysis, copy number variation analysis, and ChIP-seq because it generates "cheap abundant reads," has a "very easy" sample prep that involves no amplification, and requires small amounts of starting material.
The Broad Institute is currently using its HeliScope for transcriptional profiling, copy number variation analyses, ChIP-seq, capturing unclonable sequence, and origin of replication analyses, he said.