By Julia Karow
After announcing its "genome in a day" HiSeq 2500 platform at the JP Morgan Healthcare Conference last month, Illumina last week showed data generated on the instrument at a user meeting at the Advances in Genome Biology and Technology conference in Marco Island, Fla.
At the meeting, the company and several customers also highlighted applications for the MiSeq desktop sequencer, and Illumina outlined further improvements for the instrument, including 2x400-base reads.
Earlier this year, Illumina announced the HiSeq 2500 (IS 1/10/2011), which will have a high-output run configuration similar to the HiSeq 2000 as well as a “genome in a day” configuration that will generate 120 gigabases, or a human genome at about 40x coverage, in 27 hours from 2x100-base reads. The instrument, slated for launch in the second half of the year, uses a two-lane flow cell as well as on-board clustering and fast chemistry that were originally developed for the MiSeq.
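The 40x figure follows from simple arithmetic. As a rough sketch, assuming a haploid human genome size of about 3.1 gigabases (a common approximation, not a number supplied by Illumina), the quoted yield relates to mean coverage and to the implied number of read pairs as follows. The function and constant names are illustrative only.

```python
# Back-of-the-envelope check of the "genome in a day" figures.
# GENOME_SIZE_GB is an assumed approximation, not a number from the article.
GENOME_SIZE_GB = 3.1   # approximate haploid human genome size, in gigabases
RUN_YIELD_GB = 120.0   # fast-mode yield quoted for the HiSeq 2500
READ_LENGTH = 100      # bases per read; 2x100 means two reads per pair


def mean_coverage(yield_gb: float, genome_gb: float) -> float:
    """Average depth: total sequenced bases divided by genome size."""
    return yield_gb / genome_gb


def read_pairs(yield_gb: float, read_length: int) -> float:
    """Number of read pairs needed to produce the given yield."""
    return yield_gb * 1e9 / (2 * read_length)


coverage = mean_coverage(RUN_YIELD_GB, GENOME_SIZE_GB)
pairs = read_pairs(RUN_YIELD_GB, READ_LENGTH)
print(f"mean coverage: about {coverage:.0f}x")
print(f"read pairs: about {pairs / 1e6:.0f} million")
```

Under these assumptions, a 120-gigabase run corresponds to roughly 600 million read pairs and a mean depth just under the 40x quoted; the exact depth depends on the genome size used.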
The company has not yet determined how much it will cost to sequence a human genome using the fast run mode.
Current HiSeq 2000 users can upgrade to the HiSeq 2500 for $50,000, and users of the HiSeq 1000, which has only one flow cell, can do an equivalent upgrade to the HiSeq 1500.
While recent HiSeq 2000 owners will get the full performance of the 2500, owners of an early-version HiSeq 2000 will, for technical reasons, obtain an instrument with a slightly slower run time that requires a day and a half instead of a day to sequence a human genome in fast mode, according to an Illumina representative.
During the user meeting, Geoff Smith, senior director of research at Illumina, said that fast HiSeq 2500 runs for customers have shown good performance, with 90 percent of the data exceeding Q30 quality. The entire workflow from library preparation to annotated variants takes only about 50 hours, he said, using 500 nanograms of human DNA as input and TruSeq PCR-free and gel-free sample preparation.
Internally, the company has sequenced a HapMap trio from just 100 nanograms of starting DNA, using PCR-free and gel-free sample prep and loading the library directly onto the flow cell without further quality control, which saved additional time, he said.
Over the last few weeks, Illumina has sequenced whole human genomes and exomes in fast HiSeq 2500 mode for several customers, including the Genome Institute at Washington University, the Broad Institute, the Sanger Institute, the Genome Sciences Centre at the BC Cancer Agency, and Children’s Mercy Hospital.
As an example, Smith mentioned a tumor/normal pair for which Illumina started the sequencing run on a Monday and shipped the data to the customer on Friday.
Sheila Fisher, assistant director of technology development at the Broad Institute, said that a comparison of a genome sequenced by the Broad on the HiSeq 2000 and by Illumina on the HiSeq 2500 showed the data quality of the 2500 is “superior at this point.”
The company has also been able to produce high-quality whole-genome data on the HiSeq 2500 from a human FFPE sample, she said.
Illumina has also been working on making human genome data easier to move around: at the conference, the company distributed USB sticks containing four human genomes sequenced on either the HiSeq 2000 or the 2500. To do this, it converted the BAM files into a new file format, called CRAM, developed by the European Bioinformatics Institute.
Since Illumina launched its MiSeq desktop sequencer last September, it has placed about a third of the instruments shipped so far at clinical or translational research labs, according to Smith. The company has not disclosed the total number of MiSeqs it has shipped.
Applications for the system have so far included amplicon sequencing, sample quality control prior to large-scale HiSeq runs, bacterial genome sequencing, and infectious disease sequencing.
With an upgrade slated for mid-year, the maximum output for the MiSeq is expected to increase from the current 2 gigabases to 7 gigabases per run through a combination of more reads and longer reads, which will increase in length from 2x150 to 2x250 bases.
An improved DNA polymerase and a novel reagent formulation will also make the instrument faster, decreasing the run time for 2x150-base runs from 27 hours to under 24 hours.
In house, Illumina has achieved an 8.1-gigabase run with 2x250 base pair reads, of which 4.7 gigabases consisted of perfect reads, Smith said.
Internally, Illumina will be working this year on pushing the read length for MiSeq even higher, to 2x400 base pairs, though Smith said the company is “not happy” with the data quality yet.
Its longest perfect read from two overlapping paired-end 400-base pair reads has been 678 bases, which Smith said is “getting to the sort of read length you might be expecting from a capillary sequencer.”
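The 678-base figure also implies how much the two reads overlapped. A minimal sketch of that arithmetic, assuming both reads are exactly 400 bases long and that the merged length is simply the sum of the two read lengths minus their overlap; the helper name is hypothetical and this is not a description of Illumina's merging software.

```python
def overlap_length(read1_len: int, read2_len: int, merged_len: int) -> int:
    """Overlap implied when two paired-end reads merge into one sequence.

    Assumes merged_len = read1_len + read2_len - overlap.
    """
    longest = max(read1_len, read2_len)
    total = read1_len + read2_len
    if not longest <= merged_len <= total:
        raise ValueError("merged length is inconsistent with the read lengths")
    return total - merged_len


# Figures quoted by Smith: two 400-base reads merged into a 678-base read.
overlap = overlap_length(400, 400, 678)
print(f"implied overlap: {overlap} bases")
```

Under these assumptions, the two reads shared 122 bases, so the underlying fragment was only slightly shorter than the 800 bases that two non-overlapping 400-base reads would span.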
Asked whether Illumina plans to make similar read length improvements for the HiSeq, he said that improvements for MiSeq tend to migrate to the HiSeq eventually, though Illumina has no firm plans for that yet.
The Broad Institute has used its six MiSeqs for a variety of applications, according to Fisher, who spoke at the user meeting. Since it received the instruments last July, it has performed more than 220 runs, with read lengths ranging from 8 bases to 2x250 bases.
Fisher said the platform has had low failure rates, and has been easy to get up and running. One technician is enough to operate all six instruments. She cited the platform’s speed as an advantage, especially for development work, along with its ease of use, yield, high-quality data, and long reads compared to HiSeq. The platform has been easy to integrate, particularly since it can use all the same software tools the Broad had already developed for the HiSeq.
About 40 percent of MiSeq runs at the Broad have been used to check library quality and the evenness of representation in pooled samples, 25 percent for resequencing, 13 percent for amplicon sequencing to validate mutations, 16 percent for de novo assembly, and 6 percent for long-read sequencing, she said.
The error rate of the MiSeq’s 2x250-base reads is improving quickly, and promising applications for the platform’s relatively long reads include de novo assembly of bacterial genomes and viral populations as well as 16S microbial profiling, she said.
Because MiSeq data cost less than 454 data, the Broad is looking to switch certain long-read applications from 454 to MiSeq in the coming months, she said.
Stephan Schuster at Penn State University, who also spoke at the user meeting, has applied the MiSeq to a variety of projects as well, including targeted resequencing of bacteria and mammals for genotyping purposes, exploratory plant genome sequencing, and insect mitochondrial sequencing. Among other things, he developed a method for paired-end sequencing with 1.5-kilobase insert sizes that uses 2x150-base reads.
Have topics you'd like to see covered in In Sequence? Contact the editor at jkarow [at] genomeweb [.] com.