Name: Seth Crosby
Title: Director, Genome Technology Access Center, Washington University School of Medicine
Background: 2003-present, director, Genome Technology Access Center, Washington University School of Medicine, St. Louis; 2000-2003, principal research scientist, Pharmacia/Pfizer; 1998-2000, adjunct professor, pathology, Chicago Medical School; 1993-2000, senior scientist, Abbott Labs; 1989-1993, fellow, Washington University School of Medicine
Education: MD, University of Texas at San Antonio; BA, biology, University of California, Santa Barbara; BA, English literature, University of California, Santa Barbara
"Microarrays: the reports of my death have been greatly exaggerated." That was the title of a workshop held at the Association of Biomolecular Resource Facilities conference in San Antonio in February.
While the title poked fun at the idea that next-generation sequencing technologies are rapidly displacing arrays, the core-laboratory directors who presented spoke to the real challenges of helping clients choose the right technology for their projects.
As director of the Genome Technology Access Center at the Washington University School of Medicine in St. Louis, Seth Crosby has first-hand experience dealing with researchers who want to use next-generation sequencing technology but whose projects are more likely to succeed with an array-based approach.
Crosby's ABRF talk, entitled "Microarray futures: don't decommission your scanners just yet," touched on some of the major issues he has encountered in helping researchers with their projects.
BioArray News spoke with Crosby last week to get a better understanding of how arrays and sequencing can complement one another in genomic research.
At ABRF you urged array users not to decommission their scanners, and array companies would no doubt say the same thing. At the same time, it often seems that the sequencing companies believe their platforms will eventually replace array technology.
It was the same with the microarray world a few years ago, when the technology was seen as a replacement for PCR or Northern blots. In some applications, such as RNA-seq, the transition may occur more quickly than in others. I just don't see it happening as quickly as sequencing companies are predicting.
What were some of the main themes of your talk?
I showed a headstone with the inscription, "Microarray: 1980s-2008." Around the time of the "demise," in 2007-2008, Rick Wilson called me into his office at the Genome Center and said, "Look, I want to change your job title from array core facility director to the director of translational research." He said, "I want you to engage people in sequencing because microarrays are dead. They are going to be gone. So there's no need for you to be the head of the microarray facility anymore." At the same time the microarray facility was getting busier and busier, so it wasn't long before I returned to the facility. In 2010, we saw that arrays were coming off the assembly line faster than ever. In the microarray facility of GTAC, we had our busiest year yet in 2010.
So we had to analyze why this was happening. First, sequencing was still quite expensive and tricky. Another phenomenon was all the new sequence coming off of novel organisms: researchers were taking that new sequence and turning it into microarrays. In addition, GWAS and various sequencing technologies were presenting hot loci to scientists who, rather than using sequencing, still a more expensive and time-consuming technology than arrays, were turning back to arrays to characterize the new and rarer polymorphisms that resequencing was discovering for them.
I get at least two visitors a week who say, "I want to analyze RNA, but I don't know if I should use microarray or RNA-seq." I run through the various strengths and weaknesses of both of those technologies. Then I discuss the price: in my facility, for a 16-human-sample study, it still costs more than four times as much and takes longer to analyze the samples using RNA-seq rather than an Illumina expression array. When you compare the volume of data from an array experiment with that from the same experiment done by sequencing, it is not even in the same order of magnitude. That is a big logistical challenge. In addition, with microarrays, the analysis is pretty turnkey. But with RNA-seq and with various DNA sequencing applications, the strategic options [for analysis] are still evolving. It is a much bigger deal to store and analyze a sequencing experiment than it is to analyze a microarray experiment.
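The budget arithmetic Crosby describes can be sketched as a simple cost model. The per-sample prices below are hypothetical placeholders, not GTAC's actual rates; the only figure taken from the interview is the "more than four times as much" ratio for a 16-sample RNA-seq study versus an Illumina expression array.

```python
# Illustrative cost model for a 16-sample expression study.
# Per-sample prices are hypothetical, chosen only to reproduce the
# "more than four times as much" ratio mentioned in the interview.

SAMPLES = 16
ARRAY_COST_PER_SAMPLE = 200    # hypothetical Illumina expression array rate
RNASEQ_COST_PER_SAMPLE = 850   # hypothetical RNA-seq rate (prep + lanes)

array_total = SAMPLES * ARRAY_COST_PER_SAMPLE
rnaseq_total = SAMPLES * RNASEQ_COST_PER_SAMPLE
ratio = rnaseq_total / array_total

print(f"Array study:   ${array_total:,}")
print(f"RNA-seq study: ${rnaseq_total:,}")
print(f"RNA-seq costs {ratio:.2f}x the array study")
```

The point of a model like this is that the ratio, not the absolute prices, drives the platform choice; storage and analysis costs, which the interview notes are far higher for sequencing, would widen the gap further.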
Do you think more researchers want to run sequencing experiments because it is a newer technology?
The sad reality is that when you are making a grant proposal or submitting a paper for publication, reviewers tend to give you a brownie point when they see sequencing rather than microarrays. People like to fund sexy stuff, and they like to publish sexy stuff. That said, there are still going to be times when you direct people not to use the sexiest thing, because you want to support their hypothesis in the most effective way.
Recently Don Baldwin of the University of Pennsylvania [Microarray Core Facility] suggested during a session that candidates, when writing about this choice, acknowledge the existence of sequencing. To say, "We are aware of the power of sequencing. Here is how we could apply the next-generation technologies to our question. The reason we chose not to do sequencing is …" And there can be very practical reasons. For example, if you want to be very sensitive to differences between samples, you will want to use an appropriate number of biological replicates. This is especially the case when you are using human tissue. If you are using a lot of biological replicates in a sequencing study, it can get expensive. So you can inform your grant reviewers that you considered sequencing but rejected it: being adequately powered would have required too many replicates at sequencing prices, and you wanted to be a good steward of the institution's funds. In spite of the power of sequencing, it is possible to get the answer with a cheaper technology and preserve the pot of gold for other research questions.
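The replicates-versus-power trade-off Crosby raises can be made concrete with a back-of-the-envelope sample-size calculation. The sketch below uses the standard normal approximation to the two-sample t-test; the effect sizes, alpha, and power values are illustrative assumptions, not numbers from the interview.

```python
# Rough per-group sample-size estimate for a two-group comparison,
# via the normal approximation n = 2 * ((z_alpha + z_beta) / d)^2.
# Effect sizes (Cohen's d), alpha, and power here are illustrative.
from math import ceil
from statistics import NormalDist

def replicates_per_group(effect_size, alpha=0.05, power=0.80):
    """Biological replicates per group needed to detect a
    standardized mean difference at the given alpha and power."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)

# Subtle differences demand many more replicates than large ones,
# which is where per-sample cost starts to dominate platform choice.
for d in (1.5, 1.0, 0.5):
    print(f"effect size {d}: ~{replicates_per_group(d)} replicates per group")
```

Doubling sensitivity roughly quadruples the replicate count, so at sequencing prices a well-powered study of subtle effects can quickly exhaust a budget that an array-based design would fit comfortably.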
I think that is a reasonable approach. It shows that you are still cutting edge: you are aware of this technology and have run its numbers, but you chose a more appropriate, albeit older, technology. At the end of the day, the most important thing is the design, not the sex appeal.
But will sequencing displace some arrays?
I would suggest that the first microarray that is probably going to go down is the whole-transcriptome array. I think that within 12 to 18 months, a lot of people are going to be doing their whole-transcriptome analysis by sequencing. Focused expression microarrays will stick around longer. For the next three years it will still be cheaper and easier to do a focused expression array. With SNP arrays, I think it is obvious that focused genotyping arrays will be around probably forever, because they are very cheap to make and easy to use. You can now get multiple arrays onto a single piece of glass. Whole-genome genotyping arrays, with their adoption into clinical cytogenetics, are also probably going to be around for a long time. Those arrays are only going to get cheaper, the analysis on them is relatively easy, and the amount of disk space you have to commit to a whole-genome analysis is modest compared to sequence. For these reasons, cytogenetics folks are going to stick with arrays forever. And by forever, I mean between six and eight years.
You have run large GWAS in your center before. Where is that approach at the moment? Do you anticipate a coming second round of studies?
Everybody is kind of holding their breath. People are standing back. [Illumina CEO Jay] Flatley has said that the market is currently divided between a few first adopters and those waiting to see the results using the new arrays with rare variant content. I have no idea if we are going to see a second wave. I don't know whether, using these five-million-probe arrays, people will be able to put together studies that are powered enough to extract useful information, or how much more we will get from them over imputation on lower-density arrays. There were a lot of expectations for the 500,000- and million-SNP arrays. And to some degree there was disappointment that there weren't a lot of smoking guns that came out of that. I am not sure if anything additional will come out of these higher-density arrays. I'm one of Jay Flatley's 'wait and see-ers.'
We have done a number of projects in our core. But right now the funding institutions are getting behind sequencing. It is hard to push a GWAS proposal through. But if some hot papers come out from the first adopters, then the NIH might loosen up GWAS money, and that would open the floodgates for the second round of GWAS. Right now we are still doing GWAS, but nobody has come to us yet and said that they want to do a major, adequately powered study using the 2.5 million SNP array.
That said, at the end of this process, either by arrays or focused sequencing — probably a combination of both — there will be a limited number of polymorphisms that will be considered to be biomedically relevant. We will be able to take those and make tiny arrays out of them, process 40 or 60 samples on a single chip or on pegs. There is no way, as we understand sequencing right now, that sequencing could compete against an array that would cost you $37 a pop. That is how things stand right now.
You have been at WUSTL since '02, running the array facility. When did it become GTAC?
I came to Washington University eight years ago to run the array facility. At the time, it was a pretty modest affair. We had about three customers and we made spotted arrays. I happened to arrive at the time when there was a big technology transition. Suddenly arrays became a lot more useful and versatile. The core exploded. We went from three customers to hundreds of customers; we went from three species to 30; we established an international reputation; we no longer need departmental supplementary funding. It was all very good.
The Department of Genetics here got a new chairman, Jeff Milbrandt. Jeff wanted to set up a sequencing facility for Washington University researchers, the reason being that the current sequencing operation in the Genome Institute wasn't really put together to handle small projects. As much as they wanted to handle small projects, it was difficult for them to do. So Jeff asked if I would consider directing the "sequencing for the rest of us" facility. I suggested we create an über facility, so that instead of just offering sequencing, we could capture all the different DNA/RNA technologies under the same umbrella. The vision is that a researcher comes to us with a hypothesis, not a request for a certain technology. It is our job to hook him up with a technology that will address his hypothesis in the most economic and sensitive way. So in addition to arrays, sequencing, and high-throughput qPCR, we offer technologies like NanoString and Luminex, these sorts of double-hybridization technologies, as well. In addition, in conjunction with the Department of Pathology we have a CAP/CLIA-certified lab and, in partnership with Agilent, are currently testing our first panel of disease-relevant genes for patient exon sequencing for clinicians and clinical trials.
GTAC offers Affymetrix, Agilent, Illumina, and Roche NimbleGen arrays. When it comes to serving customers, what distinguishes each?
There are a few differences. One of them is flexibility. For Affymetrix or Illumina, making a new array requires a significant investment: Illumina has to synthesize beads, and Affy has to create photolithography masks, which are expensive. So if you are coming in with a study on an organism that hasn't yet been put on a chip, then we tend to use Agilent, because we can create the array on the fly, and the design is free. You can have Agilent make a single array, hybridize it, look at the data, tweak some of the probes, then order another array and repeat the process. It's a wonderfully iterative way of developing a new array. This can be done with Roche as well, but they have a design fee, and it doesn't lend itself as well to those kinds of small projects.
On the other hand, if someone comes in with a widely studied model organism, like human, mouse, or rat, then we typically offer Illumina, because the arrays are quite reasonably priced. Demand for Illumina right now is the highest because it is the most economical array. That's followed by Affymetrix; people request it because of its traditional position and comparable data quality. The Affy SNP 6.0 is what we currently use for our cytogenetics. There is also Affy's exon array; I don't think there is any exon-level offering other than Affymetrix's.
Agilent gives us the flexibility. I think Roche wins in terms of the density. For tiling arrays, ChIP-chip, people tend to use the Roche arrays. In addition, Roche is coming out with a custom SNP array that can handle more than 13,000 SNPs and does not require a big order.
Leerink Swann recently commissioned a survey of core lab directors, and predicted flat array growth for 2011, as compared to 2010. Would you make a similar prediction?
We did see array growth last year but, honestly, had you asked me in 2009 to predict it, I would have missed it: I would have predicted flattish demand for 2010. For 2011, perhaps making the same mistake, I am going to predict flat growth. We don't anticipate any decrease in demand, and few of the next generation of GWAS projects are coming to us, so I think we are just going to have a repeat of 2010.
Before you joined Washington University, you served as a scientist at both Pharmacia/Pfizer and Abbott Labs. Is it really true that pharmas and biotechs have been slower to adopt next-gen technologies?
It's true and it's also very smart. Sequencing and microarray technologies are very difficult to develop. Why not go ahead and let universities and the government make the investments to put these things in place and get the bugs out? Once things are running smoothly, bring the technology in house, or, better yet, continue to outsource. Then 90 percent of your tech development has been done by universities and the government. I think that's a smart thing to do from their perspective.