By Julia Karow
2009 proved to be another good year for sequencing technology development, as existing high-throughput sequencing platforms increased in performance and entered labs large and small around the world, new technologies made their debut, and researchers continued to use “next-generation” sequencing for a variety of applications and in many research areas.
According to a database established by UK researchers that aggregates self-reported data from users of high-throughput sequencing systems, instruments from Roche’s 454 Life Sciences, Life Technologies’ Applied Biosystems, Illumina, and Helicos BioSciences are now installed in labs in more than 30 countries on five continents, with the greatest number of users in the US and in the UK.
Genome centers around the world continued to add high-throughput sequencers to their stables, with probably the largest scale-up to date at the Broad Institute, which in November purchased an additional 30 Illumina Genome Analyzers for a total of 89.
Besides a multitude of individual studies, several large-scale research projects funded by the US National Institutes of Health and other funding agencies continued to use the new technologies extensively. These efforts included the Cancer Genome Atlas, various cancer research projects under the umbrella of the International Cancer Genome Consortium, the 1000 Genomes Project, the Encyclopedia of DNA Elements Project, the modENCODE Project, the Human Microbiome Project, and the Roadmap Epigenomics Program.
Notably, researchers adopted both targeted and whole-genome massively parallel sequencing for human disease research last year, including studies investigating cancer and rare Mendelian disorders. Funding for many more such studies was awarded toward the end of 2009.
Researchers at Washington University in St. Louis, for example, published results from sequencing the genome of an acute myeloid leukemia patient, the group's second AML genome, in the New England Journal of Medicine in the summer, and scientists at the Wellcome Trust Sanger Institute recently published several whole-genome sequencing studies on melanoma, lung, and breast cancer.
Meanwhile, a group at the University of Washington used exome sequencing to identify the cause of a rare Mendelian disorder, a study they published in the fall. According to experts, their approach has the potential to speed up the discovery of the genetic basis of hundreds of similar disorders.
Probably encouraged by these and other early results, the NIH decided to spend big bucks on sequencing-powered disease studies. In the fall, for example, the Cancer Genome Atlas was awarded $275 million in additional funding to study more than 20 types of cancer over the next two years, of which about $125 million will be used for DNA sequencing, according to an NIH official.
Also, several NIH institutes awarded close to $90 million in fiscal year 2009 stimulus funding for disease studies involving large-scale sequencing, according to an analysis by In Sequence. These projects aim to find genetic causes underlying, for example, heart, lung, and blood diseases; diabetes; and neurological diseases such as autism and schizophrenia.
In addition to disease-focused research studies, several groups started to explore the use of high-throughput sequencing in diagnostics, initially for genetic diseases, a trend expected to continue in 2010.
For example, NewGene, a UK-based molecular diagnostics startup, said that it planned to offer full-gene sequencing tests for several genetic diseases on the 454 platform by the end of last year or early this year. At the same time, MLL, a German reference laboratory that diagnoses hematological cancers, has been testing the 454 system’s ability to detect mutations in patient samples.
In the US, Correlagen Diagnostics launched a sequencing-based genetic test for the diagnosis of familial cardiac disease in the fall. The test runs on the Helicos platform but can also be performed on the Illumina Genome Analyzer.
And a European Union-funded consortium of European laboratories called Techgene is currently testing several second-generation sequencing platforms for genetic testing applications.
2009 also marked the advent of high-throughput human genome sequencing, with the publication in peer-reviewed journals of about a dozen individual human genomes sequenced on four platforms (Illumina’s Genome Analyzer, ABI’s SOLiD, the Helicos Genetic Analysis System, and Complete Genomics’ proprietary technology) and the start of a $48,000 personal genome sequencing service by Illumina. Throughout the year, cost estimates for sequencing a human genome kept falling, with Complete Genomics quoting consumables costs as low as $1,700 for one genome.
In addition, both 454 Life Sciences’ and Illumina’s sequencing platforms proved themselves in de novo sequencing projects of eukaryotic species with large genomes last year, thus chipping away at one of the last bastions of Sanger sequencing.
Toward the end of the year, researchers at the Beijing Genomics Institute published a de novo assembly of the giant panda genome as well as an assembly of two human genomes, both relying solely on Illumina data.
A consortium of Norwegian researchers, meanwhile, said in the fall that they had assembled the cod genome de novo from 454 data alone. In addition, 454 Life Sciences and collaborators said in the spring that they had sequenced and assembled the oil palm genome solely from 454 data. Neither of these projects, however, has been published to date.
While existing high-throughput sequencing platforms expanded their reach and improved their performance, funding for the development of new, “third-generation” sequencing technology continued to flow. In the fall, the National Human Genome Research Institute awarded approximately $50 million to research teams that aim to reduce the total cost of sequencing a human genome at high quality to $1,000 or less. Several of the grants went to commercial entities that had not previously received funding under the program, including IBM Research and Ion Torrent Systems.
The three major high-throughput sequencing platforms — Illumina’s Genome Analyzer, Applied Biosystems’ SOLiD, and 454 Life Sciences’ Genome Sequencer FLX — increased their performance last year, as all three vendors laid the groundwork for the next generation of platforms and technologies.
In addition, Helicos BioSciences established a customer base for its single-molecule sequencing instrument last year, as it demonstrated the platform’s capabilities in scientific studies. Toward the end of the year, Pacific Biosciences released a number of specifications for its real-time single-molecule platform, which is expected to launch commercially later this year. Human genome sequencing service provider Complete Genomics, in the meantime, demonstrated its technology in pilot projects and plans to ramp up its business this year.
Illumina continued to build its customer base for the Genome Analyzer in 2009, and still holds a dominant position at the majority of large-scale genome centers.
Early in the year, Illumina introduced a hardware upgrade to the platform, called GAIIx, which more than three quarters of its users had adopted by the end of September. It also launched a new cluster amplification station, called cBot, in the fall. According to the most recent specifications on the company’s website — current as of mid-October — the system now offers read lengths up to 2x100 base pairs and more than 300 million reads per run, resulting in up to 33 gigabases of data per run. Paired-end sequencing is available with insert sizes ranging from 200 base pairs to 5 kilobases. Preparing a sample takes about 10 hours in total, followed by a 10-day sequencing run for 2x100 base pair reads.
The company said in October, though, that several of its customers had already achieved runs of more than 55 gigabases, and that it was on track to achieve a 95-gigabase run internally around the end of 2009.
In addition to the GA, Illumina plans to start shipping a sequencing module for its iScan reader in the second quarter of 2010; the module will offer lower performance than the Genome Analyzer. Originally, the company had planned to start shipping the module, called iScanSQ, by the end of 2009.
Illumina also placed a new bet early in 2009 on the next generation of sequencing technology by making an $18 million equity investment in UK startup Oxford Nanopore Technologies and striking a strategic alliance with the firm, which is developing a protein nanopore/exonuclease-based single-molecule sequencing technology. Early in the year, the Oxford Nanopore researchers demonstrated in a publication that their nanopore detector can reliably distinguish the four DNA bases as well as methylated cytosine. The firm, which moved into larger facilities last summer, has provided no timeline yet for the commercialization of its technology.
During 2009, Illumina rarely mentioned Avantome, a sequencing technology development startup which it acquired in 2008, but the company is expected to release more information about the status of that technology, which promises to offer longer reads, this year.
Life Technologies continued to broaden the user base for its Applied Biosystems SOLiD system, which was introduced after Illumina’s GA and has been widely perceived to be in a catch-up race with that platform. A number of large-scale genome centers have chosen the SOLiD as their primary platform, including the Human Genome Sequencing Center at Baylor, which increased its fleet of SOLiDs to 20 in the fall, and the Institute for Molecular Bioscience at the University of Queensland in Australia, which scaled up to 11 instruments last year.
Like the GA, the SOLiD’s performance increased over the year. According to the most recent product specifications on the company’s website, SOLiD version 3 Plus now generates up to 2x50 base pair reads and up to a billion tags per run from a mate-pair library, with insert sizes between 600 base pairs and 10 kilobases available. The output per mate-pair run is up to 60 gigabases, and a run takes 12 to 14 days for 2x50 base pairs.
Life Tech plans to launch an automation solution for the SOLiD’s front-end sample prep early this year.
The company is also working on a single-molecule sequencing technology, based in part on methods developed by VisiGen Biotechnologies, which it acquired in 2008, and on quantum dot technology from its Invitrogen branch.
Early-access testing of that platform is expected to start later this year and continue into 2011. This year, the company also plans to show data publicly for the first time, according to a company official.
In addition to further developing its “next-gen” sequencing platforms, Life Tech also launched a new low-to-medium-throughput capillary electrophoresis instrument last year, called the 3500 Genetic Analyzer, which is aimed at hospitals and diagnostic labs.
454 Life Sciences
Roche’s 454 Life Sciences last year continued to emphasize the key differentiator between its platform and those of competitors: long and accurate reads. Though no large-scale genome center seems to have adopted 454's technology as its dominant platform, most of them have at least one instrument, if not several, installed.
According to 454’s most recent product specifications, the Genome Sequencer FLX produces more than a million high-quality reads averaging 400 base pairs, or 0.6 gigabases of sequence data, in a 10-hour run. Paired-end reads, each averaging more than 140 base pairs, are available with 3-kilobase, 8-kilobase, and 20-kilobase insert sizes.
Late in 2009, the firm started early-access testing for an enhancement of its GS FLX Titanium chemistry that increases the platform’s read length up to 1,000 base pairs.
In addition, 454 is working on shortening and further automating the sample prep process this year.
The company also said late in 2009 that it plans to launch a scaled-down “desktop” version of the Genome Sequencer, called GS Junior, in the spring or early summer. The Junior, aimed at small research labs, will offer read lengths of 400 to 500 base pairs and an output of about 35 megabases per run and will cost about a fourth to a fifth of the GS FLX.
After a couple of false starts in 2008, Helicos BioSciences in 2009 started to build a customer base, and demonstrated the capabilities of its single-molecule platform in several scientific publications.
In the fall, the company said it had 11 Helicos Genetic Analysis systems installed and had recognized revenue for two of them. Three of the systems are placed at academic institutions for scientific and commercial evaluation, and one is at the Broad Institute at no cost to the institute.
In the summer, as Helicos' cash was running low, it hired investment firm Thomas Weisel to help it evaluate strategic alternatives, such as a sale of the company. But in November, after raising $9.4 million in a private placement of shares and warrants to new and existing investors, Helicos said it was no longer considering a sale because its business prospects and market valuation had improved. In December, the company said it was about to close another stock offering for gross proceeds of $6.4 million.
According to the company's most recent spec sheet — listing the sequencing performance as of Dec. 2008 — the instrument routinely generates 600 to 800 million usable strands per run, or 21 to 28 gigabases of data, with an average read length of 30 to 35 base pairs and a run time of eight days.
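The quoted yield follows directly from the strand count and read length. As a rough check, the short Python sketch below multiplies the two figures from the spec sheet; the function name `yield_gigabases` is purely illustrative, and the calculation assumes every usable strand contributes one read of the stated average length.

```python
def yield_gigabases(strands: float, read_length_bp: float) -> float:
    """Return total sequence per run in gigabases (1 gigabase = 1e9 bases).

    Assumes each usable strand yields one read of the given average length.
    """
    return strands * read_length_bp / 1e9


# Figures quoted from the Helicos spec sheet: 600 to 800 million usable
# strands per run, average read length of 30 to 35 base pairs.
low = yield_gigabases(600e6, 35)
high = yield_gigabases(800e6, 35)
print(f"At 35 bp: {low:.0f} to {high:.0f} gigabases per run")

# For comparison, the same strand counts at the lower end of the read range.
low_30 = yield_gigabases(600e6, 30)
high_30 = yield_gigabases(800e6, 30)
print(f"At 30 bp: {low_30:.0f} to {high_30:.0f} gigabases per run")
```

Under these assumptions, the 21-to-28-gigabase range matches the upper end of the read-length range (35 base pairs); at 30 base pairs, the same strand counts would give 18 to 24 gigabases.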
Helicos said it planned to make paired-end technology available to customers before the end of 2009.
In several publications by the company or its customers, Helicos demonstrated last year how its system can be used for digital gene expression analysis, direct RNA sequencing, whole human genome sequencing, RNA sequencing from formalin-fixed paraffin-embedded tissue, and chromatin immunoprecipitation sequencing.
Pacific Biosciences continued to raise exceptional amounts of funding in 2009 as it further developed its single-molecule real-time sequencing technology.
In August, the firm raised $68 million from new and existing investors, bringing its total VC financing since it was founded to more than $260 million.
Also, the firm said that it has started collaborating with a number of early-access customers, including three genome centers, investor Monsanto, and the Scripps Institute.
Toward the end of the year, PacBio revealed a number of specifications for its instrument, which is scheduled for launch in late 2010. The initial version will run arrays with 80,000 zero-mode waveguides, or nanowells, a third of which will be occupied by DNA polymerase. The minimum run time will be 10 to 15 minutes, depending on the desired read length, and an experiment, including sample preparation, can be completed in less than 12 hours. The system will be able to generate read lengths equivalent to or longer than Sanger sequencing, or about 1,000 base pairs.
PacBio said it plans to start beta-testing the second version of its system in 2013. That version will run a chip with at least a million zero-mode waveguides and use a faster polymerase, among other improvements.
In addition, the company is working on applications besides DNA sequencing, such as direct RNA sequencing and protein translation analysis.
PacBio is eyeing an initial public offering this year, and expects to be profitable by mid-2012, according to a company official.
For human genome sequencing service provider Complete Genomics, 2009 was all about demonstrating the capabilities of its proprietary human genome sequencing technology. In August, the firm closed a $45 million Series D funding round, six months later than expected, causing it to postpone the start of its large-scale human genome sequencing service from June until January 2010.
In September, the company said it had delivered sequence data to some of its pilot project customers, which include Pfizer, the Ontario Institute for Cancer Research, the Institute for Systems Biology, and the Broad Institute.
ISB researchers presented results from a disease research project in the fall, for which Complete Genomics sequenced the genomes of a four-member family, allowing them to identify three candidate disease genes for rare Mendelian disorders.
A couple of months later, the company published a proof-of-concept study in Science, in which it used its combinatorial probe anchor ligation chemistry and patterned DNA nanoarrays to sequence three human genomes, including two HapMap samples.
After completing its genome center, Complete Genomics plans to scale up its operations and sequence 10,000 genomes in 2010, to become cash-flow positive by mid-year, and to position itself for an initial public offering.