At A Glance
David Haaland, senior scientist, Sandia National Laboratories (1972-present)
Education: 1968 — BS, chemistry, University of New Mexico
1972 — PhD, physical chemistry, University of Rochester
David Haaland believes that microarrays can go far beyond today's two- to four-color analysis systems. He and a team from Sandia National Labs and the University of New Mexico have created a hyperspectral scanner and multivariate statistical analysis that go toward that goal.
He recently spoke with BioArray News to explain the technology and its applications.
How did you get involved with microarrays?
For the last 25 years, I've been involved in quantitative spectroscopy, applying multivariate data analysis methods to quantitative spectroscopy. About five or six years ago, I got involved in hyperspectral imaging. That means you are taking a digital image, but rather than [getting] a gray-scale or three-color—red, green, blue—image at each pixel, you are getting a full spectrum in whatever spectral region you might be looking at. So at each pixel [from a hyperspectral image] you have from hundreds to a thousand [different] wavelengths. Our first spectrometer had 4,000 pixels and more recently, we are collecting many millions of pixels. One problem with this is that you get enormous amounts of data, and the question is: How do you appropriately evaluate the data because you just can't look at every spectrum when there are that many data points? We have spent a lot of time developing the methods of analyzing those data automatically to produce quantitative results.
That's the background of what I was doing probably five years ago, when I happened to be at a soccer game. My son was playing, and there was another kid on the team whose mother was a biologist. We asked each other what we were doing, and she mentioned she was starting to work in the microarray business and told me how they did two-color measurements. And I said, 'Well, gee, if you could measure many colors, not just do binary comparisons, but do many comparisons on the same microarray, would that be an advantage?' She said, 'Yeah, that would be great.' That's how it started.
Who won the game?
That I don't remember. But I do remember talking with the biologist. Her name is Maggie Werner-Washburne, a professor in the University of New Mexico biology department. We created a collaboration that led to some other collaborations with the University of New Mexico Health Sciences Center and the Cancer Research and Treatment Center. The project is now funded by the Department of Energy's Genomes to Life program.
It actually took us a year and a half or two years of writing proposals to finally get funded. Cheryl Willman and the Cancer Center, in a joint Sandia-University of New Mexico proposal, led the proposal that originally got funded through the Keck Foundation. In order for us to actually use the [grant] money at Sandia, we had to get some additional funding to support our time. So I was able to get an internal laboratory-directed research and development project to allow us to design the system and build it, and to work with the biologists.
How long did it take to get it operational?
Approximately two and a half years ago, we took our first image. And what we noticed was that there was some contaminating fluorescence in the slide that was spot-localized in the microarray. That is bad, because if there is another emission source, other than the fluorophor you put in there, you have a problem: If it has photons in the spectral region that the optical filters [of commercial instruments] are examining, then the contaminant signal is confounded with the dye signal. You can't tell them apart.
So, we started investigating other commercially printed yeast microarray slides, [in] a study that came out in Nucleic Acids Research. We looked at printed slides from a total of four different commercial suppliers, and all four had these contaminants. We also found it in the in-house microarrays [Werner-Washburne] printed herself. It turns out [the contaminants are] from the buffers used in printing the DNA microarrays. The [buffers] are proprietary, so [the manufacturers] don't tell us what is in them. We talked to the manufacturers of the pre-printed arrays, and at first some of them denied it was there, but they finally admitted that it was and said, well, it would wash away in the hybridization. We did a mock hybridization where we did everything but label it with dye and then scanned it and saw that the contaminant was still there. It was reduced but it didn't go away.
How did your scanner find this contaminant?
With our hyperspectral scanner, we actually can quantify each and every emitting source, and separate them out at every pixel in the image. And with our multivariate data analysis, a method we call multivariate curve resolution, we can effectively do quantitative spectroscopy, in terms of relative concentrations, without any standards, without any a priori information about what is in the sample and what is in the fluorescent image.
And so that is how we were able to discover a number of fluorescent sources in these slides. We could see the standard Cy3 and Cy5 dyes, even though we were exciting only with the green 532 [nanometer] laser. We also saw glass emission, because glass, borosilicate glass, on these microarrays has an emission spectrum of its own. This contaminant shows up in the green channel, the Cy3 channel. With a commercial instrument, it is indistinguishable from the Cy3 fluorophor emission.
Our hyperspectral scanner, coupled with the multivariate curve resolution methods that we are using, allowed us to discover all those emitting sources and then do a relative quantitation of each and every one of those components in each pixel of the entire image. It meant that we could separate out the effects of the contaminant and the glass emissions from those of the fluorophor. Therefore, we could compare our results with those of commercial scanners and see how much error was present.
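The unmixing idea can be illustrated with a simplified sketch. Multivariate curve resolution, as described here, estimates both the component spectra and their relative amounts without standards. The sketch below shows only the easier half of that problem: it assumes the pure spectra are already known and fits one pixel spectrum by ordinary least squares. The function name `unmix_pixel` and the six-point spectra are hypothetical, made up for illustration and not taken from the Sandia software.

```python
def unmix_pixel(spectrum, components):
    """Least-squares amount of each known component spectrum in one pixel.

    Simplified illustration only: real multivariate curve resolution also
    estimates the component spectra themselves.
    """
    k = len(components)
    # Build the normal equations (S S^T) c = S y.
    a = [[sum(p * q for p, q in zip(components[i], components[j]))
          for j in range(k)] for i in range(k)]
    b = [sum(p * y for p, y in zip(components[i], spectrum)) for i in range(k)]

    # Gaussian elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, k):
            factor = a[row][col] / a[col][col]
            for j in range(col, k):
                a[row][j] -= factor * a[col][j]
            b[row] -= factor * b[col]

    # Back substitution.
    amounts = [0.0] * k
    for i in range(k - 1, -1, -1):
        known = sum(a[i][j] * amounts[j] for j in range(i + 1, k))
        amounts[i] = (b[i] - known) / a[i][i]
    return amounts


# Hypothetical six-wavelength spectra (arbitrary units).
dye = [0.0, 1.0, 4.0, 1.0, 0.0, 0.0]
contaminant = [0.0, 0.0, 1.0, 3.0, 1.0, 0.0]
glass = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]

# A pixel made of 3 parts dye, 2 parts contaminant, 1 part glass.
pixel = [3 * d + 2 * c + 1 * g for d, c, g in zip(dye, contaminant, glass)]
amounts = unmix_pixel(pixel, [dye, contaminant, glass])
```

Because the three spectra overlap but have different shapes, the fit recovers each amount separately, which a single filtered intensity per channel cannot do.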
What numbers did you find on this error?
What we found was, in that particular sample, which didn't have large numbers of highly expressed genes, 75 percent of the spots' red/green ratios were in error by a factor of 2 or more. I think 50 percent of them were in error by a factor of 3, and 25 percent were in error by a factor of 4.5. Now, that can vary from slide to slide, depending, of course, on the amount of contaminant and how strongly the genes are expressed. This contaminant was about 800 counts, with a standard deviation of about 600. So, it was quite variable and you couldn't predict what it was without actually measuring it, and it meant that any spot that was not highly fluorescent would be in error.
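To see how a contaminant of this size could distort ratios, here is a small arithmetic sketch. It assumes the contaminant adds counts only to the green (Cy3) channel and that nothing else perturbs the measurement. The function name and the example green intensities are hypothetical; only the 800-count figure comes from the interview.

```python
def ratio_error_factor(true_green, contaminant_counts):
    """True red/green divided by measured red/(green + contaminant).

    The red signal cancels, so the factor depends only on the green channel.
    """
    if true_green <= 0:
        raise ValueError("true_green must be positive")
    return (true_green + contaminant_counts) / true_green


CONTAMINANT = 800.0  # counts, the average quoted in the interview

# Hypothetical true green intensities for a weak, medium, and bright spot.
factors = {g: ratio_error_factor(g, CONTAMINANT) for g in (400.0, 800.0, 8000.0)}
```

Under these assumptions, a spot with 400 true green counts is off by a factor of 3 and one with 800 counts by a factor of 2, while a bright 8,000-count spot is off by only about 10 percent.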
How do you get rid of the contaminant?
There are multiple ways. No. 1, you use a different solution to do the printing in. For example, we showed that DMSO solvent doesn't have any contaminant. Or, we could pre-treat the coated slides for four or six hours in a moist environment and then print, reducing the contaminant emission to a very low level, maybe 40 counts or less, and much more consistent. It doesn't get rid of the contaminant during the print, but it doesn't stick too strongly and therefore most of it washes away during the hybridization.
We have learned that the hyperspectral scanner is very useful in understanding and identifying problems with microarrays. We have actually had maybe 10 different groups send us slides that are problems, and we have helped them identify problems with the slides.
What are some of these problems?
You can have high backgrounds and the question is, is that high background from dye that sticks to the glass, is it a contaminant? What is it? We can tell those types of things. In one set of slides, we saw that the Cy5 was sticking to the slide everywhere. There was not much Cy3 on the slide. When we did our multivariate curve resolution analysis, we noticed the spectrum of the Cy5 was shifted by 12 nanometers from the normal spectrum that we observed. And that was consistent with free Cy5 dye not incorporated into the DNA. We went back and found problems with the cleanup of the dye with a commercial column that is supposed to get rid of the Cy5 free dye leaving you only Cy5 dye that is incorporated into the DNA. That cleanup process is much harder with the Cy5 than the Cy3. So, sometimes free Cy5 gets through. So a combination of that and inadequate blocking of the slide caused the problem.
We have sometimes seen other fluorophors where dyes that people use for other purposes have contaminated the microarrays. Maybe it comes from the glassware. We are not sure, but we know it was in the lab and it did get into the arrays. So, we have seen those types of problems and we can identify them, we can tell them what the dye is, and then they tell us, 'Yes, we were using that for another purpose and didn't realize we were contaminating our microarrays.'
Will you describe the scanner?
The system was designed and built by Mike Sinclair at Sandia and is being operated by Jeri Timlin, who also does the bulk of the data analysis. It's a system that uses laser excitation. We currently have three lasers available to us: blue, green, and red. And so we take a laser spot and form a line image with a Powell lens, and that line then goes through the optical system, onto a dichroic beam splitter. What that does is reflect some colors and transmit others. It reflects the laser light down through a microscope objective onto a microscope slide that is on an X-Y positioner. Generally, we can operate on 10-micron spatial resolutions, so the line is about a millimeter long and 10 microns wide. Therefore, every pixel in the image will be 10 microns on a side.
So we have this microarray with a laser line on it that excites whatever can emit in that sample. And, the fluorescence emission is collected by the microscope objective and is projected back onto the dichroic beam splitter, and now, because the emission is at a longer wavelength, instead of being reflected, it transmits through the dichroic beam splitter. A telescope is then used to get the line emission to the right size to go into our imaging spectrometer. We currently have an imaging grating spectrometer, so the spectrometer takes that line and spreads it out into various wavelengths. Our spectrometer can currently measure emission wavelengths from about 480 to 900 nanometers, which covers the visible and near IR [spectral range]. The output of the spectrometer is imaged onto a CCD array, which is a two-dimensional array and is silicon-based. We image one 288 pixel line at a time, and at each pixel we obtain a spectrum at over 500 wavelengths. We capture that frame from the CCD camera, then move the stage 10 microns, and take another image. We just serpentine the laser line back and forth along the microarray slide until we have scanned the whole slide.
We can scan in about 15 minutes. It's slower than the normal five minutes of the commercial scanner, but we get so much more information. The commercial scanners, because they are univariate scanners, get one intensity for each of the two colors. Because you can't distinguish sample emission from background emission that is always present, you measure the emission around the spot and subtract it from the emission in the spot, making the assumption that the intensity of the background around the spot is the same under the spot. Of course, if you have spot-localized contaminant, that's not true. That's why that process doesn't work in a commercial scanner when you have contaminant present.
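The failure of local background subtraction can be sketched in a few lines. The pixel values and the function name below are hypothetical, and the mean is used only for simplicity; commercial scanner software may estimate background differently.

```python
from statistics import mean


def background_corrected(spot_pixels, surround_pixels):
    """Univariate correction: mean spot intensity minus mean surrounding intensity."""
    return mean(spot_pixels) - mean(surround_pixels)


# Hypothetical single-channel counts per pixel.
DYE = 500.0          # true dye signal under the spot
BACKGROUND = 100.0   # uniform background, present on and off the spot
CONTAMINANT = 800.0  # spot-localized emission, present only under the spot

surround = [BACKGROUND] * 8
clean_spot = [DYE + BACKGROUND] * 8
contaminated_spot = [DYE + BACKGROUND + CONTAMINANT] * 8

# Uniform background is removed correctly: the estimate equals the dye signal.
clean_estimate = background_corrected(clean_spot, surround)

# The contaminant is absent from the surround, so it is never subtracted.
contaminated_estimate = background_corrected(contaminated_spot, surround)
```

In this toy case the clean spot comes out at 500 counts, while the contaminated spot comes out at 1,300 counts, because the surrounding pixels carry no information about emission that exists only under the spot.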
Where is the system now?
We have it up and running after a number of months of modifications. We upgraded a lot of features to make the system much more flexible. We have three lasers, but each time we changed lasers, we had to realign the system, which is slow and tedious. Soon we will have the capability of rapidly switching excitation sources. We have put in a microscope turret so we can quickly change optical spatial resolution, that is, the magnification of the system. We incorporated more reliable, higher precision and more accurate X-Y positioners. We have just received a new detector, which is twice as sensitive, has twice the number of pixels, and is 10 times faster at reading the data than our previous CCD camera. So we will now be able to take data much faster. Sinclair has also designed and purchased parts for a new prism-based imaging spectrometer, which will have better image quality and about a factor of 3 greater throughput, meaning more light onto the CCD camera. The new spectrometer, in combination with the new detector, should theoretically give a factor of 6 improvement in sensitivity over what we currently have.
What are your hopes for the system?
We want to change how people do microarray experiments. We want to add many more fluorophors per microarray chip, for every slide that we run. Right now, people are doing essentially two-color studies. There are some commercial systems with five lasers but mostly people vary what dyes they use. They are not using five fluorophors simultaneously. Our collaborators, who have one of these commercial systems, have tried three dyes and found too much cross-talk between the dyes with that system, so they couldn't get accurate, quantitative data. The advantage of our system is that we can put in many dyes. I used to say maybe a dozen or 20. But, right now, we are limited by how many the biologists can put in. So, we can do as many as they can put in. We have demonstrated that we can hybridize with four green dyes and we will pick up some red dyes also. This was just a demo. We had the glass and four dyes, so we are simultaneously separating out five different components, four of which the biologists are interested in.
And if we do the same thing in the red channel, we should easily be able to get eight different dyes, so you can compare eight samples on the same slide. The advantage of that is if you are doing, say, a time-course experiment, you can do all your times on one slide. You should get more precise results with fewer microarrays.
There is enormous promise in the microarray area that I think has been held back by some of the problems and by the experimental variance in the data, which makes it difficult to see the biology. These approaches will help. They will make the whole industry better, and biologists will get more information.
How will it be commercialized?
We have a number of non-disclosure agreements with commercial companies and we have started the process of talking about transferring the technology. We are doing research, but you want somebody commercially to take it over and make it an instrument that is easy to use. That is certainly part of our goal. Most of our IP is not in the instrument, but in how you analyze the data, because that is really what you need to see at the end.