By Julia Karow and Bernadette Toner
Sixty percent of next-generation sequencing users recently polled by In Sequence operate an Illumina sequencer, while only about a third each own a 454 or a SOLiD platform, confirming Illumina's dominant position in the marketplace.
According to the results of In Sequence's fourth annual survey of the next-generation sequencing community, almost three-quarters of users have NGS instruments from a single vendor. The four most popular platforms are the Illumina Genome Analyzer IIx, the Illumina HiSeq 2000, Roche's 454 GS FLX, and Life Technologies' SOLiD 4.
But the picture is about to change as new platforms enter the market. Over the next 12 months, current NGS users plan to purchase a variety of instrument types, including the HiSeq 2000, the Ion Torrent Personal Genome Machine, the PacBio RS, and the 5500xl SOLiD.
Also, almost half of current NGS users believe that PacBio will provide the next big leap in sequencing, while others bet on Ion Torrent and Oxford Nanopore Technologies.
The following is a high-level analysis of the survey results. Complete results are available in a supplementary document here.
NGS Users from 27 Countries Respond
The anonymous survey, conducted with SurveyMonkey software, ran from Dec. 7 through Dec. 31, 2010, and generated 415 responses.
After eliminating duplicates and responses from vendors (based on IP address), we were left with 341 participants. Of those, 65 percent (221) said they currently operate a next-generation sequencing system, and unless noted otherwise, our analysis focuses on these.
Almost 40 percent of NGS users work in universities, and about a quarter in non-profit research institutes. The remainder work in pharmaceutical or biotechnology companies, government labs or agencies, or commercial service providers.
NGS users hail from all over the world — 27 countries in total. More than half are based in North America, a third in Europe, 11 percent in Asia-Pacific, and the remainder in South America and Africa.
Sixty percent of NGS users reported having an Illumina platform, followed by 31 percent who operate Roche/454 sequencers, 30 percent who use Life Technologies' SOLiD system, 5 percent (10 users) with an Ion Torrent PGM, 5 percent (9 users) with a Pacific Biosciences instrument, and 5 percent (10 users) with other instrument types, among them Helicos BioSciences' system (4 users), the Polonator (2 users), and the Intelligent Bio-Systems sequencer (1 user) (see figure, below).
We also asked users to select ranges for the number of instruments of each model they have (one, two, 3 to 10, 11 to 20, 21 through 50, or more than 50). In total, there are at least 785 and up to more than 1,341 next-gen sequencers currently in use among survey respondents.
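Because respondents picked ranges rather than exact counts, totals like these can only be stated as bounds. The Python sketch below shows one way such bounds could be tallied. It assumes each answer counts at the low end of its range toward the minimum and at the high end toward the maximum, with the open-ended "more than 50" answer counted as 51 and flagged. The tally in the example is hypothetical and is not survey data.

```python
# Answer ranges offered in the survey: (lowest count, highest count).
# None marks the open-ended "more than 50" answer.
BINS = {
    "1": (1, 1),
    "2": (2, 2),
    "3-10": (3, 10),
    "11-20": (11, 20),
    "21-50": (21, 50),
    ">50": (51, None),
}


def instrument_bounds(tally):
    """Return (minimum, maximum, open_ended) totals for one instrument model.

    `tally` maps an answer range to the number of users who selected it.
    """
    minimum = 0
    maximum = 0
    open_ended = False
    for bin_name, users in tally.items():
        low, high = BINS[bin_name]
        minimum += low * users
        if high is None:
            # No ceiling is known, so count the floor and flag the total.
            maximum += low * users
            open_ended = open_ended or users > 0
        else:
            maximum += high * users
    return minimum, maximum, open_ended


# Hypothetical tally for illustration only (not taken from the survey).
example_tally = {"1": 10, "2": 5, "3-10": 3, ">50": 1}
low, high, more = instrument_bounds(example_tally)
print(f"at least {low} and as many as {high}{' or more' if more else ''}")
```

Flagging the open-ended answer separately is what produces wording such as "up to more than": the upper figure is itself only a floor whenever a user reports more than 50 instruments.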
With 90 users, the Illumina Genome Analyzer IIx appears to be the most popular platform, and represents the largest number of instruments of a single type in our survey — at least 240 and as many as 418 or more. About half the GAIIx users run a single instrument, almost 30 percent operate two, and a single user reported having more than 50 instruments.
Less than a year after its launch, the HiSeq 2000, with 72 users, is already the second most popular platform in our survey, with between 204 and more than 344 total instruments. More than 70 percent of HiSeq 2000 users run a single instrument, and one user reported having more than 50.
Sixty-five survey respondents operate Roche's 454 GS FLX, making it the third most popular instrument. More than 80 percent of these users run a single instrument, and together they operate at least 65 and possibly more than 103 GS FLX machines.
With 59 users, Life Technologies' SOLiD 4 came in fourth in popularity in our survey. About half of the SOLiD 4 users operate a single system, and one user reported more than 50 instruments. In total, these users own between 129 and more than 258 SOLiD 4 instruments.
The remaining platforms count considerably fewer users among our survey participants, in part because they only recently came to market or are still in early-access testing. Sixteen users own the Illumina Genome Analyzer IIe (Illumina has already decided to discontinue this platform), 10 users have Life Tech's Ion Torrent, nine users each operate Pacific Biosciences' sequencer or Illumina's HiScan SQ, seven users each operate Roche's 454 GS Junior or Life Tech's 5500xl SOLiD, and four users each have Illumina's HiSeq 1000 or Life Tech's 5500 SOLiD. In addition, nine users reported having a SOLiD 3, the predecessor of the SOLiD 4 (see table, below).
I'll Have a Combo
We also analyzed whether users operate instruments from more than one vendor, and if so, what combinations they favor.
The overwhelming majority — almost three-quarters — have instruments from a single vendor: 43 percent only operate Illumina sequencers, 15 percent only have SOLiDs, and 14 percent only run 454 machines (a minority said they operate only the Helicos, Polonator, Pacific Biosciences, or Intelligent Bio-Systems platform).
Another 20 percent of users operate platforms from two vendors, the most popular combinations being Illumina/454 (7 percent), SOLiD/454 (6 percent), and Illumina/SOLiD (4 percent).
A fraction of users run platforms from three, four, or five different vendors. Two users operate a hodgepodge of Illumina, SOLiD, 454, Pacific Biosciences, and Ion Torrent sequencers (see table, below).
Performance of NGS Platforms
On their websites, vendors post certain performance specifications for their platforms, which these instruments are supposed to deliver in a customer's lab. We asked users whether their instruments indeed met — or exceeded — those specifications (as of late 2010) in their best runs.
The following analysis includes the four most popular platforms, as well as two new instrument types that were in the hands of early-access customers in late 2010 — the PacBio RS and the Ion Torrent PGM. Complete performance data for all systems is available in the supplementary material.
For the GAIIx, Illumina specified an output of 85 to 95 gigabases for 2 x 150 base pair reads, as well as 320 million clusters passing filter and up to 640 million paired-end reads. Moreover, more than 85 percent of bases are supposed to have a quality value higher than Q30 in a 2 x 100 base pair run.
Fewer than half of GAIIx users reported meeting or exceeding the stated output per run, and about two-thirds said they generated fewer than 320 million reads per run. Also, fewer than half the users confirmed that at least 85 percent of the bases were greater than Q30.
For the HiSeq 2000, Illumina cited an output of 150 to 200 gigabases for 2 x 100 base reads, up to a billion clusters passing filter and up to two billion paired-end reads. Also, more than 80 percent of bases are expected to have a quality value higher than Q30 in a 2 x 100 base pair run.
Almost three-quarters of HiSeq 2000 users achieved or exceeded the stated output, with more than half getting more than 200 gigabases, and 21 percent more than 250 gigabases per run. Half the users also said they had generated more than a billion reads, and about three-quarters said the base quality is at least as good as stated, with about 10 percent reporting more than 90 percent of bases being greater than Q30.
Roche's 454 specified that the GS FLX generates 400 to 600 million high-quality, filter-passed bases per run (400 to 600 megabases), and more than a million high-quality reads per run. Moreover, the Q20 read length is 400 bases — that is, the accuracy is 99 percent at 400 bases, and higher for prior bases.
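The Q20 and Q30 figures quoted in these specifications are Phred-scaled quality values, where Q = -10 x log10(p) and p is the probability that a base call is wrong. The short Python sketch below converts between the two scales; the function names are illustrative and do not come from any vendor's software.

```python
import math


def phred_to_accuracy(q):
    """Per-base accuracy implied by a Phred quality value Q."""
    error_probability = 10 ** (-q / 10)
    return 1.0 - error_probability


def error_to_phred(error_probability):
    """Phred quality value for a given per-base error probability."""
    return -10 * math.log10(error_probability)


for q in (20, 30):
    print(f"Q{q}: {phred_to_accuracy(q):.1%} accuracy")
```

On this scale Q20 corresponds to 99 percent accuracy, as stated for the GS FLX at 400 bases, and Q30 to 99.9 percent.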
Almost 80 percent of GS FLX users said they generated at least the specified output, and almost a fifth achieved more than 600 megabases in a run. More than half obtained or exceeded the specified number of reads per run, but only about a third said that more than 99 percent of bases in 400-base reads are at least Q20.
Life Technologies stated for its SOLiD 4 an output of 80 to 100 gigabases of mappable data for a mate-paired run, along with at least 700 million (unpaired) tags per run, and a "system accuracy" of greater than 99.94 percent due to 2-base encoding.
More than half of SOLiD 4 users said they have generated at least the stated output, and 15 percent reported an output greater than 100 gigabases per run. However, 60 percent of users generated fewer reads than specified by Life Tech. Almost 60 percent said their "system accuracy" matched what Life Tech promised.
Pacific Biosciences in late November 2010 released performance specs for its beta instruments at early-access customer sites, which included an average read length of 500 to 550 bases and a raw read accuracy of 80 to 85 percent (IS 12/7/2010). The company also said previously that about a third of the wells in its chip generate sequence data, translating to about 15,000 reads per run with the 45,000-well development chip, and 25,000 reads per run with the 75,000-well commercial chip.
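The reads-per-run figures follow directly from the share of wells that produce data. The Python sketch below reproduces that arithmetic and also multiplies the read count by the stated average read length to give a rough base yield per run. The base yield is a back-of-the-envelope estimate under these assumptions, not a figure published by the company, and the function names are illustrative.

```python
def reads_per_run(wells, productive_fraction=1 / 3):
    """Reads expected per run if a fixed share of wells yields sequence."""
    return round(wells * productive_fraction)


def bases_per_run(reads, mean_read_length):
    """Rough base yield: number of reads times average read length."""
    return reads * mean_read_length


for chip, wells in (("development", 45_000), ("commercial", 75_000)):
    reads = reads_per_run(wells)
    low = bases_per_run(reads, 500)   # lower end of the stated read length
    high = bases_per_run(reads, 550)  # upper end of the stated read length
    print(f"{chip} chip: {reads:,} reads, roughly {low:,} to {high:,} bases")
```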
In our survey, half the PacBio users said they achieve at least the specified average read length, with 25 percent exceeding it. Also, half the users said the raw read accuracy is 80 to 90 percent (10 to 20 percent raw read error rate), whereas a quarter achieve better and a quarter worse error rates. About 60 percent of users said they are getting at least 15,000 reads per run, and one user reported getting more than 25,000 reads.
For the Ion Torrent platform, Life Technologies also released initial performance specs in late November (IS 12/7/2010): a read length of 100 to 200 bases, and at least 100,000 reads per run — translating to an output of 5 to 10 megabases — and an error rate of about 1 percent.
More than half of Ion Torrent users said they are getting fewer reads and a smaller output than specified, while more than half obtain the specified average read lengths. Also, about three-quarters reported an error rate of at least 1 percent, and a fifth reported an error rate greater than 2 percent.
We also wanted to know which factors were most important for users in choosing a particular vendor's platform. We asked them to rank throughput, accuracy, read length, run time, ease of sample prep, instrument price, and reagent price on a scale of 1 (not important) to 4 (very important) and to add other factors as needed.
For Illumina users, accuracy was the most important criterion, followed by throughput, ease of sample prep, and read length. Other factors mentioned were a "history of excellent technical, customer, and sales support," "base-space data," the fact that the Illumina platform was a "more established platform than SOLiD at the time of choice" and that "it looks nice." One core facility member "had to buy an Illumina because it was the next big thing."
For SOLiD users, accuracy was also the most important factor, again followed by throughput, but the next two criteria were reagent and instrument price. Other responses included "support from Life Technologies," "scalable runs, different chemistries (PE, barcode, fragment, etc.) in same run," and "available reagent kits, esp. for RNA-seq."
Users of 454 platforms ranked read length first, well ahead of accuracy and run time. Other answers included "service," "versatility, changing protocols, and innovating chemistries," "no other next-gen platforms available at the time," "variety available to customers in addition to SOLiD and Illumina," "reliability – very important," and "none of the above, [it] was the first instrument on the market." One user noted, "We are a core facility. We had to buy a 454 because it is the next big thing. All other factors were secondary."
For PacBio users, read length and run time ranked first, followed by ease of sample prep. Ion Torrent users said they chose the platform mostly for the instrument price, followed by run time and read length.
We also asked users what applications they are using their platforms for. For Illumina users, the three top applications were RNA-seq/transcriptome sequencing and whole-genome resequencing, followed by targeted/amplicon/exome sequencing. One user mentioned "RAD sequencing" under "other applications."
For SOLiD, users listed whole-genome resequencing as the top application, followed closely by RNA-seq/transcriptome sequencing and targeted/amplicon/exome sequencing. Two users mentioned "reduced representation libraries" and "DNA-encoded chemical libraries" under "other applications."
Among 454 users, the most frequently cited applications were whole-genome de novo sequencing and targeted/amplicon/exome sequencing.
PacBio users mentioned whole-genome resequencing most often, followed by whole-genome de novo sequencing.
For the Ion Torrent platform, the application mentioned most often was targeted/amplicon/exome sequencing.
Target Selection Methods
Targeted high-throughput sequencing — in particular exome sequencing — has taken off over the last few years, and we wanted to know what target selection or enrichment methods researchers prefer.
The clear favorite, according to our survey, is Agilent's SureSelect Target Enrichment System, used by more than half of NGS users polled. Interestingly, half the respondents also said they still use traditional multiplexed PCR to enrich their targets.
Next in line were Agilent SureSelect DNA Capture arrays, NimbleGen Sequence Capture arrays, and NimbleGen SeqCap EZ, followed by RainDance Technologies, Fluidigm, and several other selection methods (see figure, below).
Next, we asked users to rank the importance of future improvements in next-gen sequencing (from 1 "not important" to 4 "very important").
As in previous years, users listed "better analytical tools" as their top priority, closely followed by "decreasing cost." Several users added "longer reads" as another desired improvement, one of them stating that "we've gotten so used to all the headaches associated with short-read technologies that we've almost forgotten the manifold advantages of longer read lengths" (see figure, below).
Looking toward the future, we also asked current NGS users what additional platforms, if any, they plan to bring in over the next 12 months. More than a third each mentioned the HiSeq 2000 and the Ion Torrent Personal Genome Machine, and about a quarter each named the PacBio RS and the 5500xl SOLiD. However, at the time of our survey, Illumina had not yet announced its MiSeq platform (IS 1/18/2011), which is expected to compete with both the Ion Torrent and the 454 GS Junior and might change users' purchasing plans.
Interestingly, about a fifth of current NGS users said they are not planning to add any new instrument within the next year.
We also analyzed responses from those survey participants who do not currently operate a next-generation sequencing platform. Some of these, presumably, will be NGS users in the future, while others might not be working scientists.
About a quarter said they plan to bring in the Ion Torrent PGM, and 17 percent favor the HiSeq 2000. More than 40 percent, however, said they are not planning to purchase a sequencer over the next 12 months.
Finally, we asked current NGS users to look into the crystal ball for us, telling us what new technology they think will provide the next big leap in sequencing.
Almost half said it will be Pacific Biosciences, and about a third each bet on Ion Torrent and Oxford Nanopore Technologies. About 15 percent said it will be none of the platforms listed as options.
Among participants who do not currently have an NGS system, predictions were similar.