By Monica Heger
In anticipation of the launch of its MiSeq instrument next quarter, Illumina presented data on the system's performance earlier this month in a poster at the Biology of Genomes meeting at Cold Spring Harbor Laboratory.
In the poster, company researchers compared MiSeq's performance to the HiSeq for human genome sequencing, amplicon sequencing, 16S metagenomics, and bacterial sequencing and de novo assembly.
In general, the data, which are representative of an average run on MiSeq, are comparable to the HiSeq data, Geoff Smith, Illumina's senior director of DNA sequencing, told In Sequence.
The MiSeq is at an "advanced stage of development," he said, and the company is now finalizing a "series of robustness testing and fine-tuning of the software to support the instrument when it's launched."
He said the company is still on track to start shipping to early-access customers in the third quarter, although he declined to specify how many orders Illumina had for instruments, or who the early-access users will be. Currently, all sequencing runs on the instrument are being done in-house.
To compare the instrument's performance on human genome sequencing, Illumina sequenced the library of a previously sequenced human genome. One 101-base paired-end sequencing run on one flow cell on the MiSeq generated about 8 million reads and 1.6 gigabases of sequence data, corresponding to about 0.5-fold coverage of the genome.
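The coverage figure follows from simple arithmetic, which can be sketched as below. This is a back-of-the-envelope check, not Illumina's calculation: it assumes the "8 million reads" are read pairs (8 million single 101-base reads would yield only about 0.8 gigabases) and uses an approximate human genome size of 3.1 gigabases, a value not given in the poster data.

```python
def fold_coverage(total_bases: float, genome_size: float) -> float:
    """Average fold coverage = total sequenced bases / genome size."""
    return total_bases / genome_size


READ_LENGTH = 101        # bases per read, as reported
READ_PAIRS = 8_000_000   # assumption: "8 million reads" counted as read pairs
GENOME_SIZE = 3.1e9      # assumption: approximate human genome size in bases

# Each pair contributes two reads of READ_LENGTH bases.
total_bases = READ_PAIRS * 2 * READ_LENGTH
coverage = fold_coverage(total_bases, GENOME_SIZE)

print(f"{total_bases / 1e9:.2f} Gb, {coverage:.2f}-fold coverage")
# prints "1.62 Gb, 0.52-fold coverage"
```

Under those assumptions the result lines up with the reported 1.6 gigabases and roughly 0.5-fold coverage.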
About 90.4 percent of the reads passed filter, compared to 90.8 percent on the HiSeq, with 87.7 percent of bases passing Q30 metrics and 62.7 percent above Q35, compared to 85.5 percent and 67.7 percent for the HiSeq, respectively.
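For readers less familiar with quality scores, the short sketch below converts a Q value into a per-base error probability. It assumes the Q30 and Q35 thresholds cited here follow the standard Phred convention, Q = -10 log10(P), which the poster data as reported do not spell out.

```python
def phred_error_probability(q: float) -> float:
    """Per-base error probability implied by a Phred quality score Q."""
    return 10 ** (-q / 10)


for q in (30, 35):
    p = phred_error_probability(q)
    # 1 / p is the expected number of bases per miscalled base.
    print(f"Q{q}: error probability {p:.6f}, about 1 error in {round(1 / p):,} bases")
```

On that scale, a Q30 base has about a 1-in-1,000 chance of being wrong, and a Q35 base about 1 in 3,162.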
On the MiSeq, 82 percent of first reads and 76 percent of second reads were considered perfect, slightly higher than on the HiSeq, where 80 percent of first reads and 73 percent of second reads were perfect.
Because the MiSeq only sequenced to 0.5-fold coverage of the human genome, the data were normalized to evaluate how well they cover features like genes, exons, and promoters. Similar to the HiSeq, the MiSeq was able to cover all of the genes and exons and about 80 percent of CpG islands and promoters, the Illumina researchers said.
Additionally, the team demonstrated the instrument's ability to detect structural variation. Taking PCR products spanning a panel of structural variants that had been previously identified by the whole-genome sequencing of a mother, father, and child, the team pooled and indexed the products, and then converted them to Nextera libraries. They then sequenced the libraries using a 48-base paired-end sequencing strategy, identifying a homozygous deletion in the child and mother and a heterozygous deletion in the father.
Average run times range from four hours for a 36-base single-end sequencing run to 27 hours for a 151-base paired-end sequencing run. Sample prep with Nextera adds about 1.5 hours, and alignment and variant calling add about another two hours.
The machine will likely compete with Life Technologies' Ion Torrent PGM and the Roche 454 GS Junior machine. The three machines are all lower-cost versions of their larger, higher-throughput predecessors such as the SOLiD, GS FLX, and HiSeq, and are all targeting the clinical market.
The PGM and GS Junior have a bit of a head start in the marketplace since they are already available, but Smith said one advantage of the MiSeq is that all the protocols and applications that researchers have already developed for the HiSeq and Genome Analyzer will also work for the MiSeq, so it would be an easy transition.
'Addressing the Clinical Market'
To address whether the MiSeq would be a good machine for clinical applications, Illumina demonstrated its ability to do amplicon sequencing of the cancer-associated BRAF and KRAS exons from formalin-fixed, paraffin-embedded samples from ovarian, rectal, and gastric tumors, as well as controls.
FFPE-extracted DNA was amplified for the BRAF and KRAS exons with tailed indexed primers, quantified, pooled, and sequenced using a 77-base paired-end ultra-deep sequencing protocol. For each tumor and normal sample, the Illumina team generated over 15,000-fold coverage of the exons and was able to detect variants down to 1 percent frequency.
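To give a sense of why depth matters for low-frequency variants, the sketch below estimates how many reads would be expected to carry a variant present at 1 percent when a position is covered 15,000-fold. It uses a simple binomial sampling model chosen for illustration; it is not the variant-calling method Illumina used, and it ignores sequencing and amplification error.

```python
import math


def expected_variant_reads(depth: int, allele_fraction: float) -> tuple[float, float]:
    """Mean and standard deviation of variant-supporting reads (binomial model)."""
    mean = depth * allele_fraction
    sd = math.sqrt(depth * allele_fraction * (1 - allele_fraction))
    return mean, sd


# Figures from the article: 15,000-fold coverage, 1 percent variant frequency.
mean, sd = expected_variant_reads(15_000, 0.01)
print(f"expected variant reads: {mean:.0f} +/- {sd:.1f}")
```

Under this model, a 1 percent variant would be seen in roughly 150 reads, give or take about 12, which leaves it well above a handful of stray miscalls.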
This is "very sensitive" and "much better than what you'd get with capillary sequencing," said Smith.
Smith said that so far, sequencing from FFPE samples has not been an issue, particularly on exons that are short. The BRAF and KRAS exons were 96 base pairs and 76 base pairs, respectively.
He added that he thought amplicon sequencing, particularly amplicons with diagnostic or medical implications, will be one of the main applications of the MiSeq instrument.
"We are aiming to address the clinical market by putting the MiSeq platform through FDA 510(k) approval," he said, reiterating a goal that the company set for the instrument when it first announced plans to launch it (IS 1/18/2011).
The company has been working with a variety of clinical groups to sequence clinically relevant amplicons, including researchers at the University of Oxford who are sequencing B-cell chronic lymphocytic leukemia patients at various time points throughout treatment. Anna Schuh, a hematologist at the University of Oxford, presented initial data from the study at the Biology of Genomes meeting, which included using the MiSeq to do ultra-deep sequencing of genes that had been found to be mutated in the patients (CSN 5/17/2011).
The Illumina team also sequenced and performed a de novo assembly of the E. coli genome, comparing the MiSeq to the HiSeq. Sequencing an E. coli library with 101-base paired-end reads, they generated about 6 million reads. Each platform detected the same percentage of high-quality reads, as well as the same number of SNPs and indels.
Using the Velvet assembler, both platforms assembled the genome into contigs with an N50 of 132,865 base pairs for the MiSeq and 148,770 base pairs for the HiSeq.
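N50 is the contig length at which contigs of that size or longer account for at least half of the assembled bases, so a higher value indicates a more contiguous assembly. The sketch below shows the usual way to compute it; the contig lengths are made-up toy values, not data from either platform, and this is not Velvet's own code.

```python
def n50(contig_lengths):
    """Length L such that contigs of length >= L hold at least half of all bases."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise ValueError("contig_lengths must not be empty")


# Hypothetical contig lengths for illustration only.
toy_contigs = [100, 80, 60, 40, 20]
print(n50(toy_contigs))  # prints 80
```

In the toy example the contigs total 300 bases, and the two longest (100 and 80) together pass the halfway mark of 150, so the N50 is 80.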
Finally, they demonstrated the platform's applicability in metagenomics, sequencing the 254-base-pair V4 region of the 16S rRNA gene with 151-base paired-end reads in 32 indexed samples, including samples from dogs, their owners, and soil, as well as human samples from Malawi.
Again, the sequencing results were comparable to what was generated from the HiSeq, and the different samples showed unique bacterial makeup.
Have topics you'd like to see covered by In Sequence? Contact the editor at mheger [at] genomeweb [.] com.