Group Leader, division of genomics
Genomics Institute of the Novartis Research Foundation
At A Glance
Name: Sumit Chanda
Position: Group Leader, division of genomics, Genomics Institute of the Novartis Research Foundation, San Diego, Calif., 2003-present
Background: Research assistant, National AIDS Research Institute, Pune, India, 1994; Research assistant, the Salk Institute for Biological Sciences, 1993-1994; Research assistant, department of microbiology, Cornell University, 1991-1993; PhD, molecular pharmacology, Stanford University, 2001
As group leader for genomics at the Genomics Institute of the Novartis Research Foundation, Sumit Chanda has been heavily involved in the adoption of high-content imaging technology as a tool for functional genomics and target ID and validation studies. Chanda, who will present on his recent work at IBC's upcoming ScreenTech 2006 conference in San Diego, took a few moments this week to discuss the work with CBA News.
You co-authored a paper published in Genome Research last year, which focused on the use of high-content image analysis to identify mammalian growth regulatory factors (see CBA News, 7/25/2005). How does the work you'll be presenting at ScreenTech tie into that?
Taking one step back — what we're doing, in contrast with what [Jeremy] Caldwell's group [at GNF] is doing (see CBA News, 3/3/2006), is functional genomics, instead of chemical genomics. We screen siRNAs and cDNAs and look for gain-of-function or loss-of-function phenotypes in cells. When we first started, we used your standard luciferase or GFP reporter assays, which were great, and they provide really nice readouts, but there are only so many reporter assays out there. We wanted to be able to explore more physiological phenotypes that you really couldn't with other detection methods. This is when we moved into high-content imaging. This was a really great fit for genomics at the time, because the throughput was more in the realm of 25,000 wells per day, instead of a million or 1.5 million wells per day. It was more compatible with our technology, although now we're beginning to move in that direction with our chemical screens. The paper that we published in Genome Research was really just proof of principle. Could we express an arrayed set of cDNAs — overexpress them — and see whether any of them could confer a proliferative advantage, or cause cell death? We wanted the readout to be totally independent of any sort of chemical reporter and human intervention. We imaged the wells on day two and day three, and it was essentially, 'How many cells did we have on day three as opposed to day two, and could we identify cDNAs that imparted a higher rate of proliferation or apoptosis?' It was strictly a cell-counting assay, and it was an exercise to see if we could take images, convert them into numbers, and extract biologically meaningful data from that. In the rest of the paper, we went on to show that some of the ones that we pulled out that increased proliferation turned out to be putative oncogenes. That really got the ball rolling on a lot of different types of assays that we have subsequently run.
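The cell-counting readout described above (cells on day three versus day two, per well) can be sketched in a few lines. This is only an illustration under stated assumptions, not the analysis used in the Genome Research paper: the function names, the robust z-score normalization, and the threshold of 3 are all hypothetical choices made for the example.

```python
from statistics import median

def proliferation_ratios(day2_counts, day3_counts):
    """Day-3 / day-2 cell-count ratio per well (None if the day-2 count is 0)."""
    ratios = {}
    for well, d2 in day2_counts.items():
        d3 = day3_counts[well]
        ratios[well] = d3 / d2 if d2 > 0 else None
    return ratios

def robust_z_scores(ratios):
    """Score each well against the plate median using the MAD (assumed normalization)."""
    values = [r for r in ratios.values() if r is not None]
    center = median(values)
    mad = median(abs(v - center) for v in values)
    scale = 1.4826 * mad  # scales the MAD to be comparable to a standard deviation
    if scale == 0:
        return {w: 0.0 for w, r in ratios.items() if r is not None}
    return {w: (r - center) / scale for w, r in ratios.items() if r is not None}

def call_hits(z_scores, threshold=3.0):
    """Split wells into candidate proliferation-up and proliferation-down hits."""
    up = sorted(w for w, z in z_scores.items() if z >= threshold)
    down = sorted(w for w, z in z_scores.items() if z <= -threshold)
    return up, down

# Hypothetical counts for six wells of one plate.
day2 = {"A1": 100, "A2": 100, "A3": 100, "A4": 100, "A5": 100, "A6": 100}
day3 = {"A1": 200, "A2": 210, "A3": 190, "A4": 205, "A5": 400, "A6": 50}
up_hits, down_hits = call_hits(robust_z_scores(proliferation_ratios(day2, day3)))
print(up_hits, down_hits)
```

In this toy plate, the well that quadruples stands out as a candidate growth promoter and the well that loses cells as a candidate for growth arrest or cell death; a real screen would also need replicates and plate-level quality control.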
Now you're using HCS for target identification and validation?
The initial study was also a target ID study, in that those putative genes that were involved in proliferation — we found that some of them are overexpressed in cancer. Theoretically you could target those as anti-cancer molecules. Anytime we're doing these sorts of genomics studies, we would categorize them as target ID studies. Even though it was more of a proof-of-principle study, you can always get some sort of target ID out of it.
We're starting to refine our HCS screens to interrogate more discrete phenomena, though. As an example, one of the things we're looking at is cell migration, which has obvious implications for metastasis and cancer. The assay was scratch-based, so we'd transfect in siRNAs on a genome scale one at a time; scratch the middle of each well in an automated fashion; and then look for siRNAs that blocked the cells from filling in that scratch. It's a standard assay for cell motility, done in high throughput. There, we could use high-content imaging to look for different targets that impact metastasis or cell motility, just by looking at the spatial distribution of the cells and how they fill in the gap.
We're starting to become more sophisticated in our readouts, so other things we're looking at are things like cell cycle analysis — DNA content, phosphohistone staining, and tubulin staining. We ran an assay that I'll be talking about at ScreenTech where we essentially knocked down every single gene in the genome — about 25,000 — and asked which genes were required for progression through G1/S/G2/M. Now we have this catalog of mammalian genes that are required for cell cycle progression — probably ten-fold the number of genes that were previously known to affect this process. Of course, inhibiting mitosis and cell cycle have obvious relevance towards cancer therapeutics.
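As a rough illustration of how a DNA-content measurement can be turned into a cell cycle call, the sketch below bins each cell by its integrated DNA stain intensity relative to the G1 (2N) peak. It is an assumption-laden example rather than GNF's method: the function names, the tolerance value, and the bin edges are hypothetical, and it uses DNA content alone, so it cannot separate G2 from M the way an added phosphohistone stain would.

```python
from collections import Counter

def assign_phase(dna_content, g1_peak, tolerance=0.15):
    """Bin one cell by DNA content relative to the G1 peak (assumed bin edges)."""
    rel = dna_content / g1_peak
    if rel < 1 - tolerance:
        return "sub-G1"
    if rel <= 1 + tolerance:
        return "G1"
    if rel < 2 - 2 * tolerance:
        return "S"
    if rel <= 2 + 2 * tolerance:
        return "G2/M"
    return ">4N"

def phase_fractions(dna_values, g1_peak):
    """Fraction of cells in each phase for one well."""
    counts = Counter(assign_phase(v, g1_peak) for v in dna_values)
    total = len(dna_values)
    return {phase: n / total for phase, n in counts.items()}

# Hypothetical per-cell intensities from one well, with the G1 peak at 100.
well = [100, 105, 150, 200, 210, 95, 60, 300]
print(phase_fractions(well, g1_peak=100))
```

Comparing these fractions between a knockdown well and control wells is one simple way to flag genes whose loss shifts cells toward a particular phase.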
Some other relatively inexpensive technologies that CBA News has recently covered can be used for some of these basic assays — electrical impedance sensing for scratch assays, or benchtop microfluidic cytometers for cell-cycle analysis, for instance. What are the benefits of using high-content imaging for these types of assays, especially considering that these platforms are expensive?
It's versatility. We can assay for anything that you can see under a microscope where there is a qualitative difference, whether it's neurite outgrowth, any sort of fluorescent stain, or morphology. It's an incredibly flexible platform. I basically tell people that if the average scientist can tell the difference between a control and a treatment well, then we should be able to develop an algorithm that can detect it. Because we run a core facility here, we want a platform where someone might say, 'This is the process I want to study. Here's the staining, here's the antibody, and here are the treatment controls,' and we would be able to run the assay to identify genes that regulate that process. So versatility and flexibility are key for us.
Secondly, looking within a single assay, you get a lot more bang for your buck. A lot of people might not think that they need it, but let me give you an interesting example from our cell cycle analysis. Not only did we get cell cycle information, we could sub-categorize G1 cells that have small nuclei, or large cytoplasm. You get a tremendous amount of additional data. You can collect 150 to 200 different measurements of each cell. That enables you to start doing discovery away from the actual indications you were looking at, and start to tie together different aspects of biology, for example cell size, cell morphology, or nuclear morphology, to the cell cycle. A lot of unexpected discoveries happen when you don't limit yourself to the one readout you're looking at.
Can all this information, while useful, create a bit of a bottleneck itself in the discovery process?
It's definitely a challenge, especially on the genomics end, with the kinds of genome-wide technologies that are being put out there; it really becomes an embarrassment of riches. The key to making discoveries is being able to home in on the one target that looks like it's going to be the most important. But these additional parameters help you classify and triage your data so you can focus in on important targets.
The other advantage that we're really excited about is that this data can be mined a year or two down the road. Someone can say, 'We didn't know that these two processes were linked at the time, so this person didn't think to look,' but the data's right there to be reanalyzed. No one needs to re-run the screen; they can just go back and start making in silico discoveries, and start planning and actually having projects and targets without even lifting a pipette. You can just sit at your computer and look at the GNF database of these high-content assays, and requery it based on new knowledge.
These approaches to target ID and validation — does pharma have a problem adapting to all this excess information?
The more conferences I go to, the more pharma companies approach me and are really interested in doing things like this. A lot of pharma companies are actually better suited to handle this than academic labs, because they have been in the high-throughput screening environment, and have had experience with data overload. I think that pharma might be better suited to make immediate advances in this area, especially given the cost of a lot of these systems, and the infrastructure needed to set things up. Not only do you need the instruments — you almost need a dedicated user, like a FACS operator. Not to mention the cost of the screens — it starts to get a little out of reach for your average academic lab.
What type of throughput are you doing right now with this work?
We're generally looking at about 20 minutes per 384-well plate. GNF has a lot of custom automation that we've built, so we have a robotic suite that allows us to do high-throughput viral production for siRNAs, and also siRNA oligonucleotide screens. That includes up to fixing the cells and antibody staining. We've also integrated a stacker on our high-content imaging systems, and then we just take the stained cells in the plates and load them in the stacker. The throughput is still a little low for the imaging, but there is not a lot of hands-on time. The automation has solved some of the throughput issues, though, and the consistency of the fixing and staining process has improved, too. Lab workers can introduce some variability in that arena. The data has gotten a lot better since we've gone into a fully-automated assay execution mode.
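The arithmetic behind that imaging rate is easy to check. The sketch below converts minutes per plate into wells per day; the function names are hypothetical, and the assumptions that the stacker keeps the imager running for a fixed number of hours and that a genome-scale screen uses one well per gene are made only for illustration.

```python
import math

def wells_per_day(minutes_per_plate, wells_per_plate=384, hours_per_day=24.0):
    """Wells imaged per day, counting only whole plates finished in the time window."""
    plates = int(hours_per_day * 60 // minutes_per_plate)
    return plates * wells_per_plate

def days_for_screen(total_wells, minutes_per_plate, hours_per_day=24.0):
    """Whole days of imaging needed for a screen of total_wells wells."""
    per_day = wells_per_day(minutes_per_plate, hours_per_day=hours_per_day)
    return math.ceil(total_wells / per_day)

# 20 minutes per 384-well plate, running around the clock (assumed).
print(wells_per_day(20))                            # 27648
# About 25,000 genes at one well per gene (assumed), on an 8-hour imaging day.
print(days_for_screen(25000, 20, hours_per_day=8))  # 3
```

Running continuously, 20 minutes per plate works out to roughly 27,600 wells per day, in line with the figure of about 25,000 wells per day mentioned earlier in the interview.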
Is your lab still using the Q3DM/Beckman EIDAQ 100 imaging technology?
We're still using it. There are a few applications that have been started on that, so we haven't really migrated everything as of yet. We're still considering what to do with that and waiting to see what Beckman's plan is for that instrument. We like the instrument and the software, but again, we're not going to run something that's not supported, so we're taking a wait-and-see stance.
What's next for this work?
We have a couple of papers either in press or review looking at various cellular phenotypes. We're looking at a nuclear translocation assay that we pulled out some interesting things with. The sky is the limit here, really. At GNF, we are opening up our screening center — including our high-content imaging center — to academic and non-profit groups, so we anticipate getting a lot more very interesting and challenging assays to run HCS on. We're very excited about some of the screens that we'll be running in the near future.
One of the major hurdles that we're not going to solve is the throughput issue — most instrument companies are trying to address that. A couple of issues on the back end that we are getting involved with include image analysis — that's still a difficult step. It really requires somebody with computer savvy to be able to partner with the biologist to do this kind of analysis. What we really want to shoot for, and we're working with a lot of other people to try and get this done, is to make it so simple that a biologist can use it. Basically 'show me your positive control, show me your negative control,' and your software will pick out an algorithm that will allow you to distinguish across a screen which wells are closer to the negative or positive populations.
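The 'show me your positive control, show me your negative control' idea can be illustrated with a very simple classifier. The sketch below is a nearest-centroid rule on per-well feature vectors, chosen here only because it is easy to read; it is not the software Chanda's group is building, and the function names, labels, and example features are hypothetical.

```python
import math

def centroid(rows):
    """Mean feature vector of a list of per-well feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def feature_scales(rows):
    """Per-feature standard deviation, used so no single feature dominates distances."""
    scales = []
    for col in zip(*rows):
        mean = sum(col) / len(col)
        var = sum((x - mean) ** 2 for x in col) / len(col)
        scales.append(math.sqrt(var) or 1.0)  # fall back to 1.0 for constant features
    return scales

def classify_wells(positive_controls, negative_controls, wells):
    """Label each well by whichever control centroid is closer in scaled feature space."""
    scales = feature_scales(positive_controls + negative_controls)
    pos_c = centroid(positive_controls)
    neg_c = centroid(negative_controls)

    def dist(a, b):
        return math.sqrt(sum(((x - y) / s) ** 2 for x, y, s in zip(a, b, scales)))

    return {
        name: "positive-like" if dist(f, pos_c) < dist(f, neg_c) else "negative-like"
        for name, f in wells.items()
    }

# Hypothetical features per well: [cell count, mean nuclear area].
pos = [[10, 1.0], [12, 1.2]]
neg = [[50, 3.0], [48, 2.8]]
print(classify_wells(pos, neg, {"B1": [11, 1.1], "B2": [49, 2.9]}))
```

With 150 to 200 measurements per cell, a practical tool would likely need feature selection and a stronger model, but the interface is the same: two sets of control wells in, a per-well call out.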
Other challenges include back-end storage of these data. We see an inherent value in keeping all of the data, not just for reconfirmation, but to reexamine it. Storing and retrieving this data is hard, because there's just a tremendous amount that comes off these screens, both in the images and the underlying calculations. The creation of databases that can organize these files and retrieve them in a reasonable format and time frame is going to be another challenge.
Another technical challenge will be to start running things in 3D slices, doing more organ culture approaches. That will be a cutting-edge area that we'll be looking at, and I'm sure that a lot of others are also, so hopefully we can move together. When someone makes an advance in this field, everybody benefits, so we really try to work with as many people as possible to get the new technology to the forefront. Between our compound and genomics screens we've probably done more than 200 or 300 screens, and our best datasets invariably come from high-content imaging. There's really no comparison: the high-content data is far superior to the data we obtain from other detection methodologies.