For years, scientists have been predicting the demise of the microarray. After all, arrays are really a stopgap technology, designed to enable experiments that would otherwise be prohibitively expensive at genome scale. As the cost of sequencing drops, conventional wisdom says, more and more scientists will abandon their arrays for the more quantitative, more comprehensive data of a sequencer.
Now that next-gen sequencing has brought costs down to unprecedented lows, it would seem that some folks are ready to give microarrays the RIP treatment. But to paraphrase Mark Twain, reports of their death have been greatly exaggerated.
There’s no doubt that some applications are rapidly making the expected transition — chromatin immunoprecipitation studies are a great example. No sooner had the term “ChIP-chip” made its way into ’omics vernacular than researchers came up with ChIP-seq, and many people aren’t looking back.
Still, the noteworthy trend here is not the path that people have long predicted, but rather the situation that has actually played out: scientists are finding innovative ways to pair the two technologies to deliver better results than either tool could generate on its own. Sequencing, it turns out, makes an ideal follow-up platform to array-based association studies. And recent publications have demonstrated that arrays can be the perfect sample prep filter for next-gen sequencing projects.
Of course, today’s complementary approaches don’t guarantee a bright and shiny future for microarrays. Experts say low-cost sequencing still threatens to overtake them. “The question is, has it not happened for some fundamental reason — or has it just not happened yet?” says Jay Shendure, an assistant professor at the University of Washington. In a Genome Technology Online poll asking whether sequencing would replace microarrays, half of respondents said that at some point the transition was bound to happen, whether it was in the near term or a decade down the road. Almost 40 percent of voters said that arrays would never be completely replaced because the tools work well together. (The remaining respondents chose the snarky option: 12 percent voted for “Wait, there are still microarrays?”)
As next-generation sequencers continue to drive prices down, the community will be eager to see whether arrays can indeed prove that they’re more than a transitional technology. Michael Zwick, an assistant professor at Emory University, says that three factors will determine how successful next-gen sequencing is in displacing the reigning tech: “sample prep has to become cheap and easy,” he says, “throughput has got to increase and price has to continue to decrease,” and finally, bioinformatics capabilities for handling new sequence data will have to improve tremendously.
Add to those challenges the inherent advantages of microarrays — a mammoth install base and investment in array-driven databases, as well as their low cost — and it’s easy to see why some people argue that there’s simply no getting rid of this technology, no matter how cheap sequencing gets. “I think that you’re going to find that both of these tools are going to be useful, and it’s going to be economics at the end of the day to determine which of these tools will be used at which point in the process,” says analyst John Sullivan, managing director of life science equity research with Leerink Swann.
To really understand the issue at hand, it’s worth a quick refresher on just how microarrays got this inferiority complex. “Arrays came about because sequencing was too expensive,” says Dick McCombie at Cold Spring Harbor Laboratory. While scientists snapped up arrays as if they were the best tool ever invented, the truth is that within the community they’ve always had something of a reputation as a stand-in for another, far preferable technology.
As for the data they generate, that’s a sore spot for researchers. Microarrays were designed to compare things — samples, sequences, you name it. But as a comparative tool, their output is “always relative,” says Steve Henikoff at the Fred Hutchinson Cancer Research Center. Sequencing, on the other hand, offers “absolute measurements, so you can compare things and be more quantitative about it.”
And of course, the other trouble with microarrays is that they limit scientists to asking questions of known genomic regions, SNPs, or whatever else is spotted down. “Most arrays are set up to interrogate things you already know,” says Mike Snyder at Yale. That can lead to bias in the results, or simply missing a rare variant that isn’t yet characterized in a database.
Last but not least, microarrays have long struggled with variability and reproducibility issues. “Despite years of effort, there’s still quite a bit of variability with microarrays on a platform-to-platform basis,” says Shendure. “With sequencing it remains to be seen, but on an intuitive level, it should be less.”
Sequencing, meanwhile, represents the antithesis of these problems: it provides an absolute measure, and you don’t need to know what you’re looking for to ask a question about a genome. When it comes to data quality, the only widely cited drawback to sequencing is its cost. That factor is becoming less and less of an objection thanks to next-generation platforms from companies like 454/Roche, Illumina, and Applied Biosystems. “My guess is that were the costs … equivalent and the accessibility equivalent, I think people would choose to do sequencing over the microarrays,” Shendure says.
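The relative-versus-absolute distinction researchers describe can be sketched in a few lines of Python. This is purely a toy illustration — the function names and all of the numbers are invented, not taken from the article:

```python
# Toy illustration of why array output is "relative" while read counts
# are "absolute." All names and numbers here are hypothetical.

def array_readout(intensity_sample, intensity_reference):
    """A two-channel microarray reports the ratio of hybridization
    intensities -- the absolute abundance of either sample is lost."""
    return intensity_sample / intensity_reference

def sequencing_readout(read_counts, total_reads):
    """Sequencing yields a read count per feature; normalizing by
    library size gives a per-feature abundance estimate."""
    return {gene: n / total_reads for gene, n in read_counts.items()}

# A hypothetical gene measured in a sample versus a reference:
ratio = array_readout(intensity_sample=2000.0, intensity_reference=500.0)
# The array says the gene is 4x the reference -- but not how abundant it is.

# Hypothetical counts for two genes from one sequencing run:
abundance = sequencing_readout({"geneA": 400, "geneB": 100},
                               total_reads=1_000_000)
# Counts support both absolute abundance and any pairwise comparison.
fold_change = abundance["geneA"] / abundance["geneB"]
```

The point of the sketch is simply that a ratio discards the baseline, while counts retain it — which is what makes sequencing data comparable across experiments without a shared reference channel.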
Richard Gibbs, director of Baylor’s genome center, says that isn’t so far off. By his estimates, current prices of sequencing are “at least 10-fold off” the cost of microarrays, “but not 100-fold off,” he says.
People who predicted that falling prices would convert scientists to sequencing platforms haven’t been disappointed. Among the early success stories: ChIP, gene expression, and even structural variation. In general, the rule of thumb is that sequencing will be especially useful in large-scale discovery research.
As for chromatin immunoprecipitation, says Henikoff at the Hutch, “if you have a transcription factor, boy, this is really the way to go. The beauty of ChIP-seq is that you get an absolute measure because you’re counting the number of times you get an alignable hit.”
Tristan Orpin, senior vice president of commercial operations at Illumina, says that for ChIP work, “sequencing solutions already offer superior performance, genome-wide analysis, and superior economics.”
Experts agree that ChIP-seq could become the poster child for a successful transition from arrays to a sequencing platform, and will probably be first to take hold. “That really, within a year, is going to be mostly sequencing,” says CSHL’s McCombie.
Gene expression, long the bread and butter of microarrays, could be a close second. “I think gene expression is one of the big topics,” says Roland Wicki, director for gene expression applications for the SOLiD system at Applied Biosystems. “This could be noncoding RNAs or small RNAs which have not been accurately represented on microarrays.” Sequencing offers higher sensitivity, which will be critical especially in clinical applications such as biomarker discovery or detection.
Patrice Francois, a scientist in the Genomic Research Laboratory at the University of Geneva Hospitals, says that he’s just beginning to perform expression analysis studies using a next-gen sequencing platform. He first sequenced strains of methicillin-resistant Staph aureus, and is now following that up with gene expression work. “We would like to have an inventory of all expressed genes,” he says. “I think this type of platform probably outperforms the … microarray platform.”
Wicki says that structural variation represents “another big field” for sequencing use. Yale’s Snyder says that sequencing data would be helpful for large insertions, which can be tough to pin down with microarrays. “If it’s a deletion and you capture the whole event, boom, you’re done,” he says. “Where you start to deduce things is where you capture one end of an insertion and you capture another one. … It’s nice to be able to find both ends of a break point” — a feat that comes easily with sequencing and much more laboriously with arrays.
Still, some applications are more likely to stay on microarray platforms. Peer Stahler, CSO of DNA analysis firm Febit, says that microRNAs would be a tough sell for sequencing because certain microRNAs have proven difficult to prepare for use on a next-gen sequencing platform.
Comparative genomic hybridization is another case, says Gibbs. While it would actually probably be done better with sequencing, “it’s just sort of impractical,” he says. “It’s very quick and dirty to do it with [arrays]. There are very standard reagents.”
Even gene expression, while clearly moving to sequencing, will remain in use on arrays, according to Xinmin Zhang, senior marketing manager for sequencing products at NimbleGen (now part of Roche). “Transcriptome sequencing might become sufficiently cost-effective over time to represent a nice way to look at gene expression globally, but even if this happens, microarrays may still be the choice for researchers who want to focus on disease-specific gene panels and investigate a large number of samples,” Zhang says.
Illumina’s Orpin says that large-scale SNP or CNV analysis is also likely to remain on arrays. “Costs for whole-genome resequencing would have to be below $1,000 per sample to drive a technology change in this area,” he says.
And regardless of the application, microarrays have in their favor a key element: familiarity. Scientists have invested in array technology for nearly 20 years, and that has led to innumerable papers, databases, and standards specific to arrays. Adding technical and biological replicates is both affordable and common on a microarray platform, says Fang Liu in the tumor biology department at the Norwegian Radium Hospital. “That provides us with statistical confidence in the data,” she says — and at this point it’s still cost-prohibitive to do the same thing on a sequencing platform. That technology “needs some time to prove itself,” she adds.
There’s also just no getting around price. “Microarrays are really cheap right now,” says ABI’s Wicki, “and they have some headroom — not much, but some — to become even cheaper.”
Regardless of the ultimate outcome of sequencers challenging microarrays, for now, scientists are defying the death knell by finding ways to pair the technologies for intriguing new projects. Whether the tools are being used in tandem or are actually being merged, it’s a rapidly evolving scene that promises to enable better science.
In one of the more predictable pairings, sequencers and microarrays are being used in cycles to validate results and improve resolution. For example, researchers who perform genome-wide association studies with microarrays might follow that up by sequencing the regions of interest that emerged from the genotyping. At the end of an association study, scientists likely have a list of promising tag SNPs. But “the tagging SNP is probably not the causative SNP,” says Wicki, so “they need to go in and sequence those regions” to find the variant in question.
Orpin at Illumina says the next natural step after that is to include the SNPs that turned up from sequencing on a custom genotyping array that can then be used for additional variation studies across very large populations.
Hutch’s Henikoff says that the beauty of using data from both technologies lies in improved accuracy. Best known in the field for developing a reverse genetic method known as TILLING, or targeting induced local lesions in genomes, Henikoff studies epigenetics and has begun to perform his studies on both sequencers and arrays. “One of the things I really like about having two very different technologies is that you don’t have to worry that something that you’re looking at is a readout artifact when you get concordance between the two methods,” he says.
In a completely innovative approach to using the platforms together, a flurry of papers came out late last year demonstrating the use of microarrays to capture interesting parts of the genome as a filter for what to feed the sequencers. “We need these other technologies to serve as the front-end equivalent to PCR,” says Shendure. “Whatever PCR was to Sanger sequencing, we need that to direct our sequencing to the things that are most relevant to the individual investigator. We still can’t drop a couple hundred thousand every time we want to sequence a genome.”
To that end, groups from Cold Spring Harbor, Baylor, Emory, Harvard, and NimbleGen published several papers using arrays as the sample prep technique. Stahler at Febit says that in this kind of work, “the microarray is more or less a molecular tweezer that you use to grab out those juicy bits” of the genome that you want to sequence. (For more on this genome partitioning or genome capture work, see sidebar.)
No doubt there will be more examples as scientists imagine new uses for them. “I think there’ll be other synergies that result from microarray technology not being used in the conventional way,” Shendure says.
Thought Experiment: Free Science
One way to truly figure out which technology has the edge is to ask scientists which platform they would use if microarrays and sequencing were both free. Talk about clarifying the situation: everyone voted for sequencing. A few responses:
“There would be no reason to do arrays,” says Jay Shendure at Washington. “If it was free, it would be a no-brainer.”
“Obviously you do sequencing,” says Richard Gibbs. Think of it this way: “Could the world live without microarrays? Probably. Could the world live without de novo sequencing machines? It would be a bleak place indeed,” he adds.
“Probably I would try to sell my scanner!” Patrice Francois quipped.
New Science Alert: Arrays as Sample Prep for Sequencers
While next-gen sequencers have been hailed as a technology and pricing breakthrough, the reality is that they’re still expensive — too expensive to make whole-genome sequencing a trivial matter. With older sequencers, an easy cost-cutter was to use PCR to amplify the genomic regions of real interest and then to sequence just those. But “long-range PCR is several times more expensive than [next-gen] sequencing,” says Dick McCombie at Cold Spring Harbor Laboratory. It was so pricey compared to the sequencing that it could essentially wipe out much of the savings of the new platforms.
So McCombie embarked on a different approach: using arrays to capture the high-value sequences — ones that represented protein-coding exons, for example — without getting the unwanted, intergenic portions of the human genome. He and his colleagues then elute the captured fragments from the arrays and feed them to a next-gen sequencer. McCombie estimates that the method is somewhere between a third and half the cost of PCR. “We’re still optimizing right now,” he says. McCombie and his crew worked with NimbleGen to develop the arrays.
NimbleGen was active in seeking out partners, and company scientists were included on several of the papers that came out in Nature Methods and Nature Genetics late last year. “The idea of using microarrays to target next-generation sequencing has been kicked around for a while now,” says Thomas Albert, senior director of advanced research at NimbleGen, “but nobody really knew if it would work. The publications demonstrated feasibility, and many labs are now running with the idea.”
Jay Shendure, previously at Harvard and now at the University of Washington, and Baylor’s Richard Gibbs were senior authors on similar, separate papers. Shendure and his Harvard partners worked on genome partitioning with Agilent arrays that could capture and release oligos. Gibbs and his colleagues partnered with NimbleGen.
What has surprised some people is how well the concept has worked on the first pass. “First off it worked at the 90 percent level. You could do production at the 90 percent level,” Gibbs says. “It’s already gotten better. This is really quite sensational.”
McCombie says Cold Spring Harbor is already planning to add more people to this line of work. They’ll be aiming to improve the target capture rate, which he says is already “at over 90 percent,” to find different ways to shear the DNA, and to design “arrays to use on contiguous genomic regions as opposed to just exons.”
As scientists evaluate the methods, one issue that will need further attention is “whether solid or liquid phase reagents are the best,” Gibbs says. While he adds that “it’s remarkable how well it’s working in solid phase,” a number of teams did their work in solution.
Shendure and Agilent, for example, used liquid phase. Jay Kaufman, senior director for genomics marketing at Agilent, believes this approach “is more amenable to automation.” And in these early studies, “it appears as though the on-array-based approaches require more DNA than the solution-based approach,” he adds.
Kalim Mir at the Wellcome Trust Centre for Human Genetics says that, ideally, the capture and sequencing steps would all take place right on the microarray. He and his team are working on an approach that would “capture directly on the microarray and sequence directly using those capture molecules as primers,” he says. The team is using a stepwise sequencing-by-synthesis method performed directly on the array.
Ultimately, though, even the genome capture concept is a proxy for what scientists would really like to do: sequence the whole genome. Target capture or partitioning is “also an intermediate step,” says Mike Snyder at Yale, “because at some point sequencing will be cheap enough that you’ll just sequence the whole genome.”