MONTREAL - Roche, Illumina, and Applied Biosystems presented updates on applications for and projects involving their respective next-gen sequencing platforms at last week’s HGM 2007 Human Genome Meeting in Montreal.
Illumina and ABI also spoke about how their instruments will likely perform in the near future (see table, below).
Roche, which this week closed its acquisition of 454 Life Sciences (see Short Reads), recently added human whole-genome sequencing to its list of applications: 454 this week plans to present Jim Watson with the sequence data from his genome at Baylor College of Medicine, according to a company spokesperson.
Earlier this year, 454 and its Baylor collaborators presented initial results from the project at the Advances in Genome Technology and Biology meeting (see In Sequence 3/13/2007).
At the HUGO meeting, users of Roche’s sequencer talked about adding new applications and methods. During a company workshop, Yijun Ruan from the Genome Institute of Singapore presented a new method for studying long-range chromatin interactions.
The method, which Ruan calls ChIA-PET, involves crosslinking DNA and the proteins bound to it; fragmenting the DNA; using a linker to ligate DNA that is in close proximity; and using 454 sequencing to map the ligation sites.
Last year, Ruan published a paired-end sequencing method for the Genome Sequencer using linked PETs, or paired-end ditags (see GenomeWeb News, In Sequence’s sister publication, 9/5/2006).
Michael Boutros from the German Cancer Research Center in Heidelberg, Germany, showed how his lab sequenced Drosophila microRNAs from a mixture of different libraries, using four base-pair tags as library barcodes.
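The barcoding idea can be sketched in a few lines of Python. This is only an illustration and not the Boutros lab's pipeline: the tag sequences, the library names, and the assumption that the four-base tag sits at the very start of each read are all hypothetical, since the presentation details are not given here.

```python
from collections import defaultdict

# Hypothetical 4-bp tags and library names, chosen only for illustration.
BARCODES = {
    "ACGT": "library_A",
    "TGCA": "library_B",
    "GATC": "library_C",
}
TAG_LENGTH = 4  # four base-pair tags, as described in the talk


def demultiplex(reads, barcodes=BARCODES, tag_length=TAG_LENGTH):
    """Group reads by their leading tag and strip the tag from each read.

    Assumes the tag occupies the first `tag_length` bases of the read.
    Reads whose tag is not in `barcodes` are collected under "unassigned".
    """
    bins = defaultdict(list)
    for read in reads:
        tag, insert = read[:tag_length], read[tag_length:]
        library = barcodes.get(tag, "unassigned")
        bins[library].append(insert)
    return dict(bins)


if __name__ == "__main__":
    pooled_reads = ["ACGTTGAGGTAGTAGG", "TGCATATCACAGCC", "GGGGTTTT"]
    for library, inserts in demultiplex(pooled_reads).items():
        print(library, inserts)
```

A real pipeline would also need to tolerate sequencing errors within the tag; with only four bases per tag, the set of tags has to be chosen carefully for a single miscall not to turn one tag into another.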
Finally, Jan Korbel from Mike Snyder’s lab at Yale University presented a new paired-end strategy that used 454 sequencing to analyze structural variation in the human genome. Snyder presented data from this research earlier this month at the Biology of Genomes meeting (see In Sequence 5/15/2007).
Presenting for Illumina, chief scientist David Bentley said that by the end of the year the company plans to increase the read length of its Genome Analyzer to 50 base pairs, make paired reads widely available, and increase the output from 1 gigabase to 3 gigabases per run.
He also talked about a variety of projects and applications for the platform, among them studies that couple chromatin immunoprecipitation with sequencing (see related article, this issue), and an in-house project that used the technology to discover potentially novel microRNAs.
Others have used the technology for gene-expression profiling. According to Bentley, researchers at Illumina have sequenced mRNA from brain samples and compared it to gene expression in a universal human reference. They also identified alternative splice sites and discovered novel transcripts.
As for genome sequencing applications, scientists at the Wellcome Trust Sanger Institute have sequenced Streptococcus suis and have generated an assembly of its genome using no paired reads, Bentley said. The coverage was a little over 97 percent, but paired reads will likely bring it up to 100 percent, he added.
Additionally, in a collaboration between Illumina and researchers from Johns Hopkins University and the National Human Genome Research Institute, Illumina resequenced 140 kb of human DNA at greater than 5-fold coverage. The team called previously known SNPs and found a large number of novel SNPs that have since been confirmed by capillary sequencing, Bentley reported (see GenomeWeb News 10/30/2006).
Illumina and Sanger researchers have also sequenced the X chromosome from a CEPH sample and have analyzed structural variants using paired reads.
Bentley also said that plug-ins and modules will soon be available to view and analyze Illumina sequencing data using the Ensembl genome browser. The software results from a three-year collaboration between Solexa, Imperial College, the European Bioinformatics Institute, and the Wellcome Trust Sanger Institute that was funded by the UK government.
Applied Biosystems will start shipping early-access versions of its SOLiD platform to users “in the next few weeks,” company representatives told In Sequence during the HUGO meeting.
In selecting customers for the early-access program, which will start mid-summer, the company is looking at the “capability of the institutions” as well as the diversity of applications for which they plan to use the instrument.
ABI reiterated its plan to fully launch the platform early next year. Running costs have not been completely determined yet but will be “extremely competitive,” according to Michael Gallad, senior manager of Americas marketing for genetic analysis and genotyping at ABI.
Gallad said the company plans to have “more service coming online” this summer, adding that there is already “quite a queue” at the former Agencourt Personal Genomics in Beverly, Mass.
However, ABI does not plan to offer fee-for-service sequencing over the long term; instead, it intends to use the service to woo customers and test new applications.
During a company presentation, Kevin McKernan, ABI’s senior director for scientific operations for high throughput discovery, said that ABI recently showed proof-of-principle for a new ligation protocol that increased the read length for single reads to 45 base pairs.
He also pointed out the importance of paired-end reads for sequence-based analyses of structural variations in the human genome. To this end, he said ABI is working on three different types of mate-pair libraries with insert sizes ranging from six to 14 kilobases.
The company has also started to sequence on two slides per run, increasing the output to 1.4 gigabases per run. The company hopes to increase the output to 20 gigabases per run in the next few years by increasing the density of the templates, among other improvements, McKernan said.
He mentioned results from several collaborations with outside researchers to which ABI has contributed SOLiD data. Among them is a project with George Weinstock at Baylor College of Medicine to resequence an E. coli strain. During this study the researchers detected a duplication and corrected the Sanger reference sequence.
In addition, McKernan talked about a transcription-sequencing project in collaboration with Sean Grimmond at University of Queensland in Australia (see In Sequence 3/20/2007), as well as a histone-analysis project in C. elegans with Arend Sidow’s lab at Stanford University, and an update on a P. stipitis-resequencing project with researchers at the US Department of Energy’s Joint Genome Institute (see In Sequence 3/6/2007).
How They Stack Up
The performance of next-gen sequencing platforms is a moving target.
The table below compares the three platforms that are currently or imminently available.
|Name||Genome Sequencer FLX||Genome Analyzer (Solexa sequencing)||SOLiD System|
|Availability||Fully commercialized||Fully commercialized||Limited early access in June 2007 to be followed by more placements in the fall of 2007|
|Base pairs per run||100 megabases||1 gigabase of high-quality data (>3 gigabases by end of 2007)||1.6 mappable Gb (2 segmented slides running mate-pair library) this summer (2-4 gigabases later in 2007). "Mappable reads" for a defined read length (25) would have at least 50% of the beads with zero or 1 error when aligned to reference and no more than 50% of the beads with 2 or 3 errors when aligned to reference|
|Average read length||200-300 bp or 2x20 bp mate pairs||35 bp (50 bp in 2007)||25 bp (fragment library); up to 35 bp at 99.9% accuracy; 2x25 bp (mate-pair library)|
|Reads per run||>400,000 filtered reads||> 40 million reads (clusters) per run of high quality data||Up to 40 million|
|Run time (no sample prep)||7.5 hours||3 days||4 days (fragment library); 8 days (mate-pair library); 3 days and 6 days expected for later systems|
|Cost of sequencing (reagents)||NA||$3,000 per run; $400 per 4 megabase bacterium (25-30X); $4 per gene (25X, 100-plex); $100,000 per human genome (25-30X)||Approximately $3,000 per Gb (depending on application)|
|Single-read accuracy||>99.5% over 200 bases||In 1 gigabase production run, over 90% of 35 bp reads will have 2 or fewer errors; over 50% >99.9% accurate (Q30)||97% (99.9% after error correction from 2-base encoding)|
|Paired-end reads||Yes: fragment sizes 2-2.5 kb, 20 bp tags; alternative method with 2x100 bp reads from 2.5 kb fragments planned for later in 2007||Yes: currently in early access testing; available in 2007||Yes: insert sizes 3-4 kb, 6-8 kb, (10-14 kb in development)|
|Multiplexing||Up to 16 samples/plate (3 gasket formats to subdivide picotiter plate); bar-coding tags in development||Currently can process 8 samples/slide (separate channels); bar-coding tags in development to enable multiplexing||Up to 8 samples/run (4 samples/slide) for June 2007; up to 32 samples/run in 2008; bar-coding tags in development|
|List price per instrument||NA||$430,000 (includes cluster station)||$525,000 for analyzer; $600,000 for system, which includes analyzer and ancillary equipment. Analyzer also includes compute system: 12 DualCore CPU cluster with 9 TB data storage for June 2007 systems. Depending on availability of terabyte drives, fall units may increase to 13 TB of data storage|
|Recommended DNA/RNA starting material||NA||0.1-1 microgram||Application dependent: 10 micrograms (fragment libraries); 30 micrograms (mate-pair library)|
|Anticipated performance beyond 2007||500 bp reads in development||6+ Gb in 2008||In 2008: 50 bp single reads; 4-8 Gb/run; Reduced DNA sample required; 99.95% raw read accuracy. Several years: Up to 20 gigabases|
SOURCES: Company presentations and interviews.
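
The per-run figures in the table can be turned into rough per-genome reagent estimates with simple arithmetic. The sketch below is a back-of-the-envelope illustration, not the vendors' own costing. It takes the $3,000-per-run reagent cost and the 1-gigabase and 3-gigabase outputs listed for the Genome Analyzer, and it assumes a human genome of roughly 3 gigabases, fully usable output, and uniform coverage. The function names are hypothetical.

```python
def runs_needed(genome_size_bp, fold_coverage, bases_per_run):
    """Whole runs needed to generate genome_size * coverage bases."""
    total_bases = genome_size_bp * fold_coverage
    # Integer ceiling division: a partial run still costs a full run.
    return -(-total_bases // bases_per_run)


def reagent_cost(genome_size_bp, fold_coverage, bases_per_run, cost_per_run):
    """Reagent cost in dollars, ignoring labor, instruments, and failed runs."""
    return runs_needed(genome_size_bp, fold_coverage, bases_per_run) * cost_per_run


HUMAN_GENOME_BP = 3_000_000_000  # assumption: roughly 3 Gb
COST_PER_RUN = 3_000             # reagent cost per run, from the table

if __name__ == "__main__":
    for label, bases_per_run in [("1 Gb/run", 1_000_000_000),
                                 ("3 Gb/run", 3_000_000_000)]:
        runs = runs_needed(HUMAN_GENOME_BP, 30, bases_per_run)
        cost = reagent_cost(HUMAN_GENOME_BP, 30, bases_per_run, COST_PER_RUN)
        print(f"{label}: {runs} runs, ${cost:,} in reagents at 30X")
```

Under these assumptions, 30-fold coverage takes 90 runs ($270,000) at 1 gigabase per run and 30 runs ($90,000) at 3 gigabases per run. The second figure is in the same range as the $100,000 per human genome listed in the table, though the sources do not say how that number was derived.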