An Array of Opinions
GT invited three microarray users — one from pharma, one from biotech, and one from academia — to join us for lunch and tell us what goes into choosing a vendor. Here is the conversation that ensued between bites of turkey sandwiches.
The Cast (In order of appearance)
Steven Perrin, head of expression profiling group, Biogen
Li-Li Hsiao, nephrology fellow at Massachusetts General Hospital / Brigham & Women’s Hospital, Harvard University School of Medicine
Timothy Connelly, research scientist responsible for expression profiling, Aventis Pharmaceuticals genomics center
Where is the microarray market heading — buy or roll-your-own?
Steve: It’s a changing landscape. Several years back the only way to make microarrays was to make them in-house. Anything that you bought commercially was fairly inadequate mostly from a quality control and content level. But the landscape’s changing a little bit. There are several products out there that are very competitive now, and to mimic what they’ve done internally is somewhat difficult.
Li-Li: Commercially available chips used to be very pricey. Recently the price has come down, so for academic purposes they are much more available. We were told recently that a commercial vendor is going to make customized chips for academic purposes as well. With homemade chips, the problem is that there is no good software available to correctly interpret the image, compared to commercially available chips. The quality of spotting on homemade chips is still a problem for the results. There are many hybridization-quality problems to be solved before we’ll be able to produce reliable data.
Are there other vendors that are now providing chips of the quality and reproducibility that make them viable alternatives?
Steve: There are at least 20 companies out there producing, but their content quality varies incredibly. And their platforms vary considerably. Agilent and Motorola are touting oligo-based chips. Mergen is touting oligo-based chips. Incyte’s touting cDNA-based chips with good content. So it depends what you want to do with your chips. From a biotech perspective, the only thing we use our internal platforms for nowadays is a validation-type strategy more so than trying to internally cover the genome. Other vendors are doing that. Why bother trying to do it in-house? It’s too difficult.
Is there a consensus about which is better, oligos or cDNAs?
Li-Li: It is all about whether there is a good program out there to help select a good probe. Whether it is an oligonucleotide or a cDNA, probe selection is very important. For example, we recently found out with Affymetrix that the probes they spot on the chips are not necessarily all good. Lots of scientists believe that all chip data are perfect. But that’s not true.
Steve: There’s a huge problem with the oligos on Affy. It’s not even a small percentage of the oligos — there are a lot of oligos you want to filter out. You don’t want to be utilizing every single piece of intensity information off of every oligo on Affy chips. Some of them are hurting your data quality more than helping. Oligos offer a lot of advantages over cDNA, but they’re much more difficult to implement internally, because the process of picking the right oligos is not trivial. There are a lot of pitfalls to be overcome. cDNAs are much, much easier to implement.
What do you look for in a vendor? What would make it worth your time and resources to give them a shot?
Steve: Pilot experiments are worth a thousand discussions. Everybody has a very powerful solution and everybody knows what the issues are. But until you do a pilot experiment for a technology platform, you can’t evaluate it. You want to compare it to your internal standards and to other vendors you’ve tried.
Timothy: I agree. I think pilot experiments are very important in terms of vendors’ systems and how their technology works: the time, the flexibility, the quality of the data, the reagents, and how they accommodate the existing infrastructure.
Li-Li: A cDNA probe is much easier to verify. So in the future, if I am looking for a vendor — either Affymetrix or some other vendor — I would like them to provide me with sequence information. They could even provide a service to verify the sequence, so that everybody has the same baseline, knowing that we are looking at the same gene and there is no confusion.
We all went through the difficulty of the mouse clones, which is certainly something I don’t think anybody wants to re-experience. Another thing that is very helpful for academic purposes is if the company can provide good software linked to whatever probes they are designing, with easy access not just to the sequence information but also to other information related to that particular probe and gene.
Is that something Affymetrix doesn’t do?
Li-Li: No. They don’t give you the sequence of the probe.
Steve: That’s an understatement.
Li-Li: It’s frustrating, yes.
What gets your attention from a vendor and gets them in the door? Is it price, data quality, unique content, or software?
Steve: From a biotech perspective, cost isn’t really the issue. It’s quality. What matters is that we don’t have to lie awake at night because the data quality is questionable from one week to the next. That’s the biggest problem. Cost is relative. Clinical trials cost more than chips. That’s the bottom line.
Timothy: So is interpreting data and using it in downstream applications. So you can’t hide behind cost. Cost is not a major issue.
Li-Li: I wish I could say that!
Steve: It’s actually funny that we have brought up software issues twice, because in my experience in biotech at Biogen, as well as previously, we have informatics tools that make that a non-issue. We have good-quality LIMS systems to track samples, quality assurance of chip stamping internally, quality control on hybridizations for Affy or internal chips, and tools for mining data and searching sequences against the genome. We have informatics departments working on those tools. For biotech, and I’m sure for pharma, it’s not a problem.
Li-Li: Academically, not every single laboratory has a LIMS system. And so if the vendor wants to reach into the academic area, they would have to take that into consideration. And price is, of course, always a consideration. I see more and more people doing chips studies now because the price has come down significantly. So price does have impact. Another thing I would like to see is reproducibility. If they claim quality, whether they can show me that their chips are reproducible.
Steve: There is a balance of two powers. Some of the vendors have incredibly good content, but their reproducibility is not good enough to warrant their content. I don’t care if your content is good. If I can’t believe one hybridization to the next, it doesn’t really do me any good. On the other end of the spectrum, there are some vendors out there with very good reproducibility and very good reagents, but they don’t have enough content to basically knock off “King of the Hill” Affymetrix to make them competitive in the market.
Timothy: I think in terms of a custom platform, we look at three things: sensitivity, reproducibility, accuracy.
Li-Li: What do you mean, accuracy?
Timothy: How well that particular gene expression is validated by another platform. Whether it is reproducible from one platform to another.
In a dynamic, changing field we expect commercial providers to be far ahead of where they are today a year from now, and to stay abreast of technological developments. If they have a large resource of cDNAs for cDNA spotting, they should refine the sequences to get better probe or cDNA selection and refine their sequence algorithms. If it’s oligos, they need to improve their sequence-selection capabilities and actually change the sequences. I will say that for Affymetrix, the challenge ahead is that it’s a very dynamic and changing field.
Steve: You’re hitting the nail on the head. You don’t want to commit a million-plus dollars putting in an infrastructure, including equipment, consumables, and an informatics pipeline, for a company that’s going to be out of business in nine months. And one company that I can name that tried to enter into this market and died was Stratagene. They’ve got one of the best contents in the world. They made the first libraries, they have thousands of them sitting in their freezer. They came up with a chip product line, and it died in nine months. And I certainly wouldn’t want to spend a million dollars formatting my expression profile by doing it on Stratagene, and months later they’re out of business.
For what kind of research are microarrays best? And where are their limitations?
Steve: Over the last three years, we’ve seen it completely change. Five years ago, people were using these technologies for gene discovery. Now the technology is used throughout the entire drug discovery pipeline. It’s everywhere. It’s in the beginning in the discovery phase, it’s in the toxicology phase, it’s in the pharmacogenomics phase, and it’s in the validation phase.
Li-Li: From an academic point of view, the current gene chips help you form a new hypothesis, which you can then take back to the traditional research bench, focusing on a small range of genes and information.
The biggest challenge for us is how to develop a methodology to extract reliable data. Fold changes are not necessarily the best criterion. For example, some genes may show only 20 percent changes and still be significant. However, with the quality of the chips and the quality of probe selection, such changes may not be identifiable.

Currently most people select more than four-fold or five-fold changes. However, even that is still difficult. There is a lot of noise and variation at both the biological and the instrumental level. How do you identify that noise and subtract it, so you know whether a five-fold change is truly a five-fold change or just instrumental noise?
Steve: You don’t want to be mining expression data by some fold-change cutoff, because what’s accurate and relevant in one experiment may not be in another. You want to do your first filtering on statistical confidence levels that that fold change — be it a five-fold change or a two-fold change or a 50-fold change — is accurate or not accurate.
Once you filter out the inaccurate expression calls in your technology platform, then the rest of the data, you can play with it. If a biologist comes through and says ‘I want to detect a 20 percent change in a transcript,’ chip technology is not the way to go there. The only thing that has the dynamic range to detect something that sensitive is probably TaqMan. Real-time PCR can probably discriminate that very accurately with very good reproducibility. Affymetrix and microarraying cannot detect that subtle a change. You have to pick the technology platform to get the answer you’re looking for.
It’s not, Where is the fold change? It’s, How good is the quality of your hybridizations and the chip manufacturing process? That dictates how subtle a change you can detect. It varies depending upon how many replicates you do. The more you do, the better the power of the analysis to put a better P value on it.
But how many replicates can you do? You can’t do a lot. It’s too expensive. If you can get two replicates per sample, that’s at least the minimum requirement for me. I don’t know if some groups are doing no replicates, but that’s awfully dangerous to calculate statistics on.
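Steve’s filtering strategy can be sketched in a few lines: rank genes by the statistical confidence of the change across replicates, not by a raw fold-change cutoff. Everything below — the data, gene names, and the |t| threshold — is hypothetical, and a real analysis would use proper p-values with multiple-testing correction; this is only a minimal illustration of the idea.

```python
import math

def t_statistic(control, treated):
    """Welch t statistic comparing replicate log2 intensities."""
    mc = sum(control) / len(control)
    mt = sum(treated) / len(treated)
    vc = sum((x - mc) ** 2 for x in control) / (len(control) - 1)
    vt = sum((x - mt) ** 2 for x in treated) / (len(treated) - 1)
    return (mt - mc) / math.sqrt(vc / len(control) + vt / len(treated))

# Hypothetical log2 intensities, two replicate hybridizations per condition.
genes = {
    "geneA": ([8.0, 8.1], [10.0, 10.1]),  # ~4-fold change, tight replicates
    "geneB": ([6.0, 11.0], [9.0, 14.0]),  # ~8-fold nominal change, noisy replicates
    "geneC": ([8.0, 8.0], [8.1, 8.2]),    # tiny change
}

# First filter: statistical confidence of the change, not its magnitude.
T_CUTOFF = 4.0  # hypothetical threshold; a real analysis would use p-values
confident = {name for name, (c, t) in genes.items()
             if abs(t_statistic(c, t)) > T_CUTOFF}
```

Note how geneB, despite the largest nominal fold change, is filtered out because its replicates disagree, while geneA’s modest but reproducible four-fold change survives — exactly the point about not mining by fold-change cutoff alone.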
Li-Li: For us academics, it’s very difficult to convince your collaborator to do a replicate of anything … although we are [becoming] successful in showing the data to prove to them that it’s a must. This is going to be the standard from now on. Everybody who’s going to do a chip experiment is going to have at least two replicates to be able to calculate. The more the better. But that’s the least you have to do.
We would even prefer people to do this using the same sample. Some of the genes show 64-fold changes. Are they truly there, or is it actually only noise? You don’t always know if the probe is not good or your sample is not good or your hybridization process is not good. So you can’t rely on this particular chip.
Steve: The first thing you have to do when you’re setting up your core facility, or setting up your new technology platform, or if you’re evaluating one technology platform versus another, is to find out what your technology variance is. Because if you don’t do that first, you’re never going to know what your biological variance is. And those two things have to be separated.
You want to design an error model to control your technology variance, so the rest of your statistics are all based solely on biological error. And that’s really the first thing you want to do when you’re evaluating a technology, or implementing one. If you don’t do that, there’s just too much data coming out of these assays, and you don’t know where your noise is coming from.
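Steve’s point about separating the two variance components can be illustrated with a toy calculation: hybridize one RNA sample several times to estimate the technology variance alone, then measure independent biological replicates, whose variance contains both components. All the numbers below are hypothetical.

```python
def variance(xs):
    """Unbiased sample variance."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

# Hypothetical log2 intensities for one probe.
same_sample = [8.00, 8.10, 7.95, 8.05]  # one RNA sample hybridized four times
bio_samples = [8.20, 7.60, 8.90, 7.90]  # four independent biological samples

tech_var = variance(same_sample)          # technology variance alone
total_var = variance(bio_samples)         # technology + biology combined
bio_var = max(total_var - tech_var, 0.0)  # estimated biological variance
```

With the technical component estimated up front, downstream statistics can be based on biological variation alone — which is the practical meaning of knowing where your noise is coming from.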
Are microarrays a transition technology until the next big thing that is even more powerful? Do any of you already see something that is on the horizon? Or are microarrays here to stay?
Steve: I think they’re here to stay. I don’t think the number of users is going to keep increasing the way it has over the past couple of years. You can only gene-expression-profile so much. But there are other applications for it. You’re seeing more and more different applications coming up, beyond just spotting cDNA down on a piece of glass.
Li-Li: I agree. One thing that is already happening is the protein chips. When you study gene expression and function you have an extra step. The next question is, What are the proteins doing?
Steve: They’re much more difficult to implement from a technology perspective. But even the quality of samples is much more difficult. Even if somebody comes up with a technology platform tomorrow, that’s great. But how are you going to teach a biologist to generate a protein sample that has high enough quality to apply to the technology platform? There’s a big gap there still. It’s not going to be closed any time soon.
Does upfront investment in equipment play a big part in deciding to stay with a particular vendor?
Steve: It’s kind of like buying a car. The equipment is cheap. It’s the consumables over the long run that’ll get you.
Li-Li: Affymetrix equipment is $180,000 all together as a package. And we’re talking about the scanner, fluidics station, rotisserie oven, and software. The price hasn’t really changed much. So for academics, that’s where the core facility makes sense. There’s no need for every laboratory to spend so much money to buy the equipment.
Steve: Instead of picking on Affy, let’s talk about the cost of setting up a microarray facility: $130,000 for a spotter, $50,000 for a scanner, $500,000 on sequence-verified clone sets, $400,000 or $500,000 on automation to basically have the pipetting and relaying tools — the list goes on and on.
As a practicing physician, as well as a microarray researcher, what do you see as the challenges of using microarrays in the clinic?
Li-Li: One thing that will be challenging for everybody in the field is to develop a protocol to deal with a very small quantity of sample. As a clinician, I can’t take this patient into the operating room and get a big chunk of tissue all the time. The most common clinical scenario is: we want to find out the details so we do a biopsy. It’s a very small piece of material. So far there is no good method to deal with such a small quantity, other than laser capture, which is, as far as I’m concerned, still a very difficult method to use.
It would be nice to have a method to save a piece of a biopsy tissue. I’d take one half and give it to the pathologist to do conventional diagnosis. If we can use a small quantity of tissue, we can also do what we normally do without compromising the patient. As a clinician, I see the advantage of using chips applied to clinical applications. Eventually, we will put it into clinical use.