By Julia Karow
As second-generation sequencing platforms have matured, all three major high-throughput sequencing systems — Illumina's Genome Analyzer, Life Technologies' ABI SOLiD, and Roche's 454 GS FLX — have entered labs across the world, according to a recent survey by In Sequence.
Although two-thirds of the 50 labs polled in the survey own only a single type of next-gen sequencing platform, a number of centers have started to mix and match systems from different vendors.
More than two-thirds of survey respondents said they plan to purchase another high-throughput sequencer in the next 12 to 18 months. For those users, Illumina's sequencers top the wish list, followed by Pacific Biosciences' real-time single-molecule platform and ABI's SOLiD.
When it comes to the next generation of sequencers, which are currently in development or not fully commercialized, most users place their bets on systems from PacBio and Oxford Nanopore Technologies, followed by Life Tech's single-molecule sequencer.
The anonymous survey, consisting of 10 questions, was conducted over 19 days in late December and early January, using Surveyor software. A link to the survey was e-mailed to 213 known users of second-generation sequencing platforms in more than 30 countries, based on our own reporting as well as a UK-based database of high-throughput sequencing facilities.
In order to avoid multiple responses from the same institution, the survey was sent to only one e-mail address at each organization. For those organizations with multiple sequencing facilities, the survey was sent to only one e-mail address per unit. Fifty individuals, or 23 percent, responded to three or more questions and were included in the final analysis.
Slightly more than half the responses came from universities, about a fifth from non-profit research institutes, and the remainder from commercial service providers, pharmaceutical/biotech companies, and government labs or agencies (see chart, below, for details).
Respondents are based all over the world. More than half are located in North America, a third in Europe, and the remainder in Asia-Pacific and South America (see chart).
About two-thirds of respondents said they have at least one Illumina Genome Analyzer, followed by half with at least one Genome Sequencer FLX, and 44 percent with at least one Applied Biosystems SOLiD. Two respondents have the Helicos Genetic Analysis platform, and one has a Polonator. There is some overlap in these numbers since respondents could provide multiple answers (see chart).
Almost two-thirds of users own only a single type of platform, and of those, almost half have Illumina GAs. Among the quarter of users who have two kinds of sequencing platforms, the most popular combination is Illumina and 454. Fourteen percent of users have three types of platforms installed, all of them Illumina, SOLiD, and 454. Two respondents have a fourth sequencing platform — one a Helicos system, the other a Polonator (see chart).
Of the 32 Illumina GA users, about 40 percent have a single instrument of that type, and one-quarter have two GAs. A single user reported having more than 30 Illumina machines.
By comparison, three-quarters of the 25 users of 454 systems own a single GS FLX, and none have more than three. Among the 22 SOLiD users, 60 percent own a single system of this type, and one user has between 11 and 20 SOLiDs installed. One Helicos user has a single unit and the other has four, while the sole Polonator user has a single machine of this type.
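The ownership shares quoted above follow directly from the user counts given in the text (32 Illumina, 25 454, 22 SOLiD, two Helicos, and one Polonator user among 50 respondents). The short sketch below reproduces that arithmetic; the names `OWNERS`, `TOTAL_RESPONDENTS`, and `share_percent` are illustrative and not part of the survey.

```python
# Counts of respondents owning at least one instrument of each type,
# as reported in the article. Respondents could name several platforms.
TOTAL_RESPONDENTS = 50
OWNERS = {
    "Illumina GA": 32,
    "454 GS FLX": 25,
    "SOLiD": 22,
    "Helicos": 2,
    "Polonator": 1,
}


def share_percent(count: int, total: int = TOTAL_RESPONDENTS) -> float:
    """Return the share of respondents, in percent, who own a platform."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * count / total


shares = {platform: share_percent(n) for platform, n in OWNERS.items()}

for platform, pct in shares.items():
    print(f"{platform}: {pct:.0f} percent of respondents")

# The counts add up to more than the number of respondents because
# a lab that owns several platform types is counted once per type.
print(f"Total platform mentions: {sum(OWNERS.values())} from {TOTAL_RESPONDENTS} respondents")
```

The shares sum to more than 100 percent, which reflects the overlap among labs that own more than one type of platform.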
The typical performance users get out of their sequencing systems, by and large, seems to be in line with vendors' performance specifications.
On average, Illumina GA users reported an upper end of 16.8 gigabases of sequence data per run, ranging between 1.2 gigabases and 40 gigabases. Users also reported an average number of reads per run of almost 160 million, ranging from 100 million to 360 million, and an average read length of up to 72 base pairs for single reads, and 80 base pairs for paired reads, ranging from 36 base pairs to 108 base pairs.
For comparison, Illumina states on its website that the GAIIx generates up to 9 gigabases of data from a single-read 35-base pair run, and between 18 and 50 gigabases of data per paired-end run — depending on the read length, which can vary from 2 x 35 base pairs to 2 x 100 base pairs. The system generates up to 250 million filter-passing clusters per run, each cluster leading to a read.
SOLiD users, on average, said they obtain an upper end of 23.9 gigabases of data per run, ranging from 1.5 gigabases to 60 gigabases. Their systems produce up to 519 million reads per run on average, ranging from 320 million to 1.1 billion. All SOLiD users said they obtain 50-base pair fragment reads and either 25-, 35-, or 50-base pair paired-end reads from their machines.
On its website, Life Tech's ABI states that the SOLiD 3 Plus has a typical output of up to 30 gigabases with a fragment library, and up to 60 gigabases with a mate-paired library, using 50 base pair reads. The total number of tags, or reads, per run is up to 500 million for a fragment library and up to a billion for a mate-paired library.
Users of 454's GS FLX system reported, on average, a data output of up to 0.48 gigabases per run, ranging from 0.2 gigabases to 0.58 gigabases. The average number of reported reads per run is 960,000, ranging from 80,000 to 1.6 million, and the read length averages 405 base pairs, ranging from 350 base pairs to 540 base pairs.
The company's website states that the GS FLX, running Titanium Series reagents, generates up to 0.6 gigabases of high-quality, filter-passed bases per run, with an average read length of 400 bases and more than a million high-quality reads per run.
A single Helicos user reported an output of 23.5 gigabases per run, 672 million reads per run, and 35-base pair reads. The company, on its website, says the instrument routinely produces up to 28 gigabases per run and up to 800 million usable strands per run, as well as an average read length of up to 35 bases.
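As a rough plausibility check on the figures above, the output of a run can be approximated as the number of reads multiplied by the read length. The sketch below applies that approximation to the survey averages quoted in this article. It is only an estimate: the averages for output, read count, and read length were reported separately, paired reads are not counted twice here, and the Illumina read count is rounded to 160 million from "almost 160 million." The names `SURVEY_AVERAGES` and `estimated_gigabases` are illustrative.

```python
# Survey averages quoted in the article, per platform:
# (reads per run, read length in base pairs, reported gigabases per run).
# The Illumina read count is an approximation of "almost 160 million".
SURVEY_AVERAGES = {
    "Illumina GA": (160e6, 72, 16.8),
    "SOLiD": (519e6, 50, 23.9),
    "454 GS FLX": (960e3, 405, 0.48),
    "Helicos": (672e6, 35, 23.5),
}


def estimated_gigabases(reads: float, read_length_bp: float) -> float:
    """Approximate run output in gigabases as reads times read length."""
    return reads * read_length_bp / 1e9


for platform, (reads, length, reported) in SURVEY_AVERAGES.items():
    estimate = estimated_gigabases(reads, length)
    print(f"{platform}: estimated {estimate:.2f} Gb, reported {reported} Gb")
```

The Helicos figures, which come from a single user, line up closely with this estimate, while the multi-user averages for the other platforms differ more, as would be expected when each average is computed over a different mix of run types.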
Applications and Target Enrichment
We also asked users about the applications they run on their sequencers. The most frequently cited application is mRNA-seq, followed by whole-genome de novo sequencing, small RNA sequencing, whole-genome resequencing and targeted sequencing/deep sequencing, ChIP-seq, digital gene expression/expression tag sequencing, metagenomic sequencing, and methylation/bisulfite sequencing. "Other applications" included sequence-based physical mapping, DNase analysis, replication timing, and custom tag-based applications. Respondents were able to provide multiple answers for this question (see chart).
Users performing targeted sequencing enrich their target DNA in many different ways, according to the replies. The most popular method is traditional PCR, followed by Agilent's SureSelect in-solution enrichment and NimbleGen's sequence capture microarrays. Multiple answers were also possible for this question (see chart).
Room for Improvement
We also asked users how reliable their platforms have been over the past year, and to list any specific problems they have encountered. In addition, we asked what aspects of their systems they would like vendors to improve over the next year.
The vast majority of Illumina users reported one or several problems with their platforms (see chart), among them reagent issues, mechanical issues, and software problems.
Among the most frequently cited requests for improvement of the Illumina platform is sample preparation, including a better library quantification method and automatable library construction protocols. Back-end analysis is also high on user wish lists, including more tools for data analysis and more reliable software. As in previous years, users asked for longer reads, more reads per run, and lower costs per run and per base.
Most SOLiD users also cited either one or a few problems with their systems. Reported troubles included complex library preparation, hardware issues, reagent issues, and software communication problems. A majority of SOLiD users asked for a shorter and more automated library construction process, and several said they would like to see better and more efficient bioinformatics tools. In addition, users said they would like to see lower costs per run.
Most users of Roche's 454 GS FLX platform also said they had one or a few problems with their systems over the last year. No single issue dominated, and several users noted that their technical problems were quickly resolved. High up on user wish lists is an increase in read length and in the number of reads, as well as a decrease in cost per run and per base. Users also asked for shorter and more automated library preparation.
One of the Helicos users cited a "software glitch" that affected half a run as the single problem so far and said he wished for longer reads, more condensed arrays, and better software.
The Polonator user did not elaborate on issues with that system.
In addition, we asked users whether they are planning to purchase additional sequencing platforms over the next 12 to 18 months, and which ones they are considering. Thirty-four users, or about two-thirds of all respondents, said they are planning another purchase, while six users said they are not currently doing so. The remaining 10 either did not answer this question or said they remain undecided.
Top among the choices for additional machines is Illumina's Genome Analyzer (Illumina had not yet announced the HiSeq 2000 and GAIIe at the time of the survey), followed by Pacific Biosciences' real-time single-molecule system, and ABI's SOLiD. Multiple answers were possible for this question (see chart). One user said he is considering using Complete Genomics' human genome sequencing service as an alternative to buying another instrument.
Finally, we asked users which next generation of sequencing technology they are most excited about. Pacific Biosciences was mentioned most often, followed by Oxford Nanopore Technologies and Life Technologies' single-molecule sequencer. Respondents could select multiple answers (see chart).
Among the six users who answered "none," one said he is waiting to "see who releases a viable instrument first," and another stated that no data from future generations of sequencing instruments is currently available "to get excited about any of them."