Part two of a two-part series.
At A Glance
Mark Ashwell, vice-president, Drug Discovery Chemistry
Dave Leggett, senior project manager (oncology programs)
Bin Zhang, senior investigator, Chemical Biology (manages high-content screening)
Andrew Smellie, senior investigator II, Discovery Informatics and Modeling
ArQule began its transition from a chemistry services company to a biotech firm with designs on its own drug discovery at approximately the same time that high-content image-based screening began to find its way into the drug-discovery process. It's no coincidence, then, that the Woburn, Mass.-based company, like many of its competitors, incorporated high-content screening into its own discovery process, which revolves primarily around anti-cancer compounds.
Several ArQule researchers recently published a research paper in the Oct. 18 online edition of the Journal of Biomolecular Screening that served as a proof of principle for using HCS to identify small molecules that induce mitotic arrest. The scientists involved in that particular research also reflect ArQule's general approach to the use of HCS: the company prefers to assign different people to each part of the HCS process, from assay design to instrumentation to image analysis and informatics.
ArQule also is a real-world example of a company that is rolling with the changes in a nascent and frequently volatile market, as it has already exchanged its original high-content screening platform, a Beckman Coulter IC 100, for an IN Cell Analyzer from GE Healthcare.
Last week, ArQule's Ashwell and Leggett discussed the company's history and how it has integrated HCS into its drug-discovery program. In the second part of this two-part interview, Zhang and Smellie weigh in on changing HCS platforms, dealing with huge amounts of data, and developing algorithms that suit ArQule's particular needs.
In the JBS paper, your group used the Beckman/Q3DM instrument, the EIDAQ 100. How did you come to use that platform originally?
BZ: Regarding the Beckman instrument (formerly Q3DM): we have several instruments around, such as the [Cellomics] Arrayscan VTI, the [GE Healthcare] IN Cell 1000, and the [Q3DM] EIDAQ 100. We have recently upgraded the EIDAQ 100 to the IC 100 [from Beckman Coulter]. We have been using this instrument for almost two years, and a clear advantage of this instrument is its open architecture, especially in terms of being able to extract all the raw data, which allows us to exercise our own algorithms in multiple ways. On the market, the most popular models, like the Arrayscan VTI, are relatively closed systems. You are provided with software modules that are easy to use, but you don't really know what happens in the detailed data processing. Since we have a very strong IT group here, we are able to develop our own algorithms, such as [for] KS scoring, internally. In this way we can integrate our assay development with data processing, and put it all together with the other critical components, such as instrumentation and data archiving, for our future drug discovery. We believe the investment will allow us to quickly determine so-called structure-activity relationships.
Also in the paper, you said that there were some problems with the robustness of the Beckman platform. Can you detail that at all?
BZ: We have experienced out-of-focus problems when integrating the EIDAQ 100 with our robot, which right now is the Zymark [Caliper Life Sciences] Twister II. Those two sometimes create problems when they communicate with one another. But recently, some other nice systems have been developed, and these auto-focus problems have been largely overcome. So this is another reason why we moved on to another system, the IN Cell 1000. We just purchased this system, and we can use it with high accuracy for high-throughput screening because, in theory, it allows us to scan up to 60 96-well plates per day.
But you didn't have major problems with the Beckman software? You mentioned that you liked it from the beginning because of its open architecture…
AS: The software is actually fine. It basically analyzes the image and generates a bunch of summary statistics on the pixels themselves. To that extent, the software is great. The problem is, we quickly realized that our problem was really one of studying differences between populations of cells, because there are collections of cells in each image. The Beckman Coulter software got you as far as being able to compute simple averages. So it would compute, for example, the average intensity of the pixels over all cells in the well. The problem is, there is more information in the image than that, which is why we developed the KS score, which is a measure of the difference between two distributions of values, rather than summary statistics. So the software is fine for generating the summary statistics, but it's not so good at handling these distribution issues. It also wasn't obvious in the software when things didn't work. So on days when we had these focus problems, it wasn't always obvious from the software that things were wrong. So we had to develop all these other methodologies to help us identify that.
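The distinction Smellie draws — comparing whole distributions of per-cell values rather than well averages — is the two-sample Kolmogorov-Smirnov (KS) statistic: the maximum gap between the empirical cumulative distribution functions of two samples. The sketch below is an illustration of that general technique, not ArQule's actual implementation; the variable names and the simulated "intensity" values are hypothetical.

```python
import numpy as np

def ks_score(sample, reference):
    """Two-sample KS statistic: the maximum absolute difference between
    the empirical CDFs of the two samples. Ranges from 0 (identical
    distributions) to 1 (completely separated distributions)."""
    sample = np.sort(np.asarray(sample, dtype=float))
    reference = np.sort(np.asarray(reference, dtype=float))
    # Evaluate both step-function CDFs at every observed value;
    # the maximum gap can only occur at one of these points.
    all_vals = np.concatenate([sample, reference])
    cdf_s = np.searchsorted(sample, all_vals, side="right") / len(sample)
    cdf_r = np.searchsorted(reference, all_vals, side="right") / len(reference)
    return float(np.max(np.abs(cdf_s - cdf_r)))

# Hypothetical per-cell intensity values: control wells vs. a treated
# well whose distribution has shifted (e.g., a mitotic-arrest phenotype).
rng = np.random.default_rng(0)
control = rng.normal(100, 15, 500)
treated = rng.normal(130, 15, 500)
print(ks_score(control, treated))
```

Note that two wells can share the same mean intensity yet have very different distributions (say, a bimodal treated population), which is exactly the case where a well-average summary statistic misses the phenotype and a KS-style score does not.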
So now you're working with the IN Cell, which has a high throughput. Do you anticipate that the software that comes with this will allow you the flexibility you need in developing your own algorithms? Will you need to incorporate another type of image-analysis software with this?
BZ: The reason we chose the IN Cell 1000 is that it also has an open architecture. Basically, people like to say the IN Cell has the back door and front door both wide open. And GE just signed a deal with Cellomics, such that if you install the Gateway software, you will be able to use all of Cellomics' analysis algorithms. So that gives us a lot of flexibility. We don't necessarily have to use all of them at this point, but down the road, if we want to use one of the algorithms from Cellomics, we would certainly buy that. The other thing I want to mention is data management. With all of these databases, to go back and retrieve data, you need good organization software. Your database fills up very quickly because images take up a lot of space. That has become a big hurdle, and you will get 100 terabytes [of data] very quickly. Cellomics offers the Store software, which links all of the data together in a searchable database format. Also, at the recent High-Content Analysis meeting in Washington, DC, there were published efforts from an academic lab at the University of Dundee doing the same thing, organizing data in a searchable format. And they're trying to do that internationally so people can share data.
AS: To add to that answer, the parts that we developed internally, we developed in a very agnostic fashion. So we're confident that we can extract the information we need from the new instrument using our in-house algorithms, and we've deployed our in-house solutions in Pipeline Pilot from SciTegic, so they're actually readily available on the desktop of the biologist.
BZ: And they also link to the image.
How heavily do you anticipate integrating high-content screening into your drug discovery in the future?
MA: I think from a user point of view, it will have a huge impact on kick-starting several of our new projects, and adding value to some of our long-standing projects. So it's difficult to quantify, but I think it's already proven its value to us, and we've been spending a lot of time really getting under the covers and making sure that not only can we use it to understand cellular processes in an efficient way, but also that we can translate that into decision making about which compounds and which small molecules to advance, and how to be successful in getting those molecules against targets of interest. So it is firmly part of our armamentarium.