Last April, SuperGen, a late-stage clinical development company, acquired Montigen Pharmaceuticals, an oncology-focused drug-discovery firm based in Salt Lake City. In addition to a pipeline of product candidates, the $18 million acquisition gave SuperGen a computational discovery platform that Montigen had developed called CLIMB (Computational Lead Identification Modeling Biology).
Since then, the CLIMB platform has helped the combined firm advance several compounds toward the clinic, with one slated for an Investigational New Drug application early this year.
BioInform recently spoke to David Bearss, chief scientist at SuperGen and formerly chief scientific officer at Montigen, about the CLIMB platform and how it has helped the company speed the drug-discovery process. The transcript of the interview, edited for length, follows.
Can you provide some background on SuperGen and how the CLIMB process was developed?
Our goal as a company is to produce at least two IND-ready compounds every year — two compounds ready to go into the clinic — and really that’s driven by the discovery platform that we developed at Montigen and now here at SuperGen, and that’s really driven by a computational approach.
I think most people are looking at technology to drive efficiencies in their already established process, [but] what we’ve done is take an approach that we want to use new technology and computational techniques to really drive our process, so they’re at the core of what we do — they’re not really just add-on efficiency tools.
Our drug discovery process we call CLIMB, and that stands for computational lead identification modeling biology. And really the key is our ability to model. We try to build models of everything we can around a specific protein or around a specific pathway that that potential protein target exists within. And then, from those models we make predictions. So our process is very simple — it’s building models, making predictions, and then taking those predictions and testing them in the lab very quickly.
The laboratory component is tied to our ability to predict things that we think are important for whether this protein is a potential good target, specific activities around that protein, and then to test some of the protein-ligand interactions that we can model.
From those predictions we go very quickly into the lab, and all of our laboratory experiments are really set up around testing predictions that have been made from our models. … Then once we know that we can make good predictions from that model, we can really expand and utilize that model to go and search for things that we’re looking for.
For example, we use it to find new biomarkers, we use it to find our lead chemicals that we use for our drug-discovery process. We have an extremely large virtual library of chemicals, small molecules, and it’s organized in fragments and scaffolds, and then all of those scaffolds have been substituted in whatever way they can with as many functional groups, and we have close to 2 billion compounds now in that library that we can organize and search, and once we know we have a model that can make good predictions we can go and utilize that virtual screen to find leads.
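As a rough illustration of how a scaffold-and-substituent library of this kind grows, the following Python sketch enumerates every combination of functional groups on a set of scaffolds. The scaffold names, attachment-site counts, and substituent list are hypothetical toy values, and the code is not drawn from CLIMB itself; it only shows why such libraries reach very large sizes.

```python
from itertools import product

def enumerate_library(scaffolds, substituents):
    """Yield (scaffold, substituent_combo) for every way of filling each
    attachment site on a scaffold with one substituent."""
    for scaffold, n_sites in scaffolds.items():
        for combo in product(substituents, repeat=n_sites):
            yield scaffold, combo

def library_size(scaffolds, n_substituents):
    """Count the compounds without enumerating them:
    each scaffold contributes n_substituents ** n_sites."""
    return sum(n_substituents ** n_sites for n_sites in scaffolds.values())

# Hypothetical toy data: scaffold name -> number of attachment sites.
scaffolds = {"scaffold_A": 2, "scaffold_B": 3}
substituents = ["H", "CH3", "OH", "F"]

compounds = list(enumerate_library(scaffolds, substituents))
print(len(compounds))                               # 4**2 + 4**3 = 80
print(library_size(scaffolds, len(substituents)))   # same count, computed directly
```

Because the count grows exponentially with the number of attachment sites, even a modest set of scaffolds and functional groups can yield billions of virtual compounds, which is why such a library is searched computationally rather than synthesized.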
We’ve been very efficient and effective at doing that. It’s been a very different process than I’ve been involved with in the past to find new drugs. We actually, in the lab, screen very few compounds. Normally, for a project, when we go through this process of building these models and identifying a specific target that we’re interested in, we’ll screen probably around 100 compounds per project, which is extremely small compared to what most people are doing out there. We have not had a project where we haven’t had success, and we currently have 14 drug-discovery projects in house, with leads for all 14 of those projects.
Has everything that makes up the CLIMB platform been developed in house, or do you use third-party components?
CLIMB has a workflow component to it that’s a proprietary software package that combines [around] 54 third-party software packages. Then we have some proprietary algorithms, and CLIMB is really tied together by this workflow.
We have set out what our particular process is and split up our process into different stages, and the workflow helps guide us through those stages by calling up the different software packages that are available to us to help us either build a model or make predictions from that model, and then captures the information from the lab that helps us analyze whether that model was accurate in making predictions.
CLIMB calls up different programs as we need them as we’re moving through the process. In the beginning, we may use systems biology types of modeling software to help us look at particular pathways that we’re interested in for a particular disease. … If we inhibit this particular protein, are there other proteins that can compensate for its function? What would the predicted downstream events be? Would we be seeing a cell cycle type of event or would we be seeing induction of apoptosis or some other tractable event that we can then model and go to the lab and test? So we use that software up front to really help us define and decide what types of targets we want to look at, and then from that original modeling, we’ll go to the lab and use genetic tools to validate whether what we made predictions about in those pathways or with those particular protein activities, whether those are really things that we can see in the lab.
The workflow software is built in a way that as soon as those models are built and the predictions are made, the workflow alerts the scientists in the lab that here is the model and here are the predictions from the model, and here is the experiment that we need to do to test this particular prediction.
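The model-prediction-experiment hand-off described here could be represented roughly as in the sketch below. All class, field, and function names are hypothetical, as are the example target and assay; the actual CLIMB workflow software is proprietary and is not described in this level of detail.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Prediction:
    model: str                        # which model produced the prediction
    statement: str                    # what the model predicts
    experiment: str                   # the lab experiment that tests it
    confirmed: Optional[bool] = None  # unknown until the lab reports back

def alert(prediction: Prediction) -> str:
    """Build the message sent to lab scientists once a prediction is made."""
    return (f"[{prediction.model}] Predicted: {prediction.statement}. "
            f"Run: {prediction.experiment}")

def record_result(prediction: Prediction, confirmed: bool) -> str:
    """Store the lab outcome and return the next step for the model."""
    prediction.confirmed = confirmed
    return "expand use of model" if confirmed else "refine model"

# Hypothetical example values.
p = Prediction(
    model="pathway_model_v1",
    statement="knocking down target X induces apoptosis",
    experiment="RNAi knockdown with apoptosis readout",
)
print(alert(p))
print(record_result(p, confirmed=False))  # refine model
```

Keeping the prediction and its test in one record means every assay run in the lab can be traced back to the specific question it was meant to answer.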
And then as we move along, we obviously will model the protein itself. So once we’ve identified a target, we’ll look and see what kind of structural information we know. … The crystal structure is just kind of the snapshot of a type of conformation that protein can exist in. But we know that proteins are flexible and have regulatory domains on them that change the conformation, so we try to model all of that and then test all of those models with a standard test algorithm that we use, and it’s tied to a way that we can test them in the lab. And a lot of times the model that seems to make the best predictions is not a solved crystal structure — it’s actually a model that’s been derived from some other structural information.
So we use that in high-throughput docking types of algorithms. We utilize a number of different types of docking algorithms, and we have our own scoring functions and our own algorithms that we’ve built to really detect what we consider is important for how a small molecule fits within a particular targeted site on the protein.
What we look for is what we call binding efficiency. Most docking algorithms have a scoring function associated with them that looks at binding energy, a calculation of energy. And what we try to calculate is binding efficiency, which is a little different term. It certainly has energy associated with it, but we’re looking at specific interactions that the small molecule can make with the protein, and we’ve developed that algorithm in house.
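SuperGen's binding-efficiency algorithm is proprietary and is not specified in this interview. As a stand-in, the sketch below uses the published ligand efficiency metric (binding free energy per heavy atom), which is a different and much simpler measure, only to show how an efficiency-style score can rank compounds differently from raw binding energy. The compound names, dissociation constants, and atom counts are hypothetical.

```python
import math

R_KCAL = 0.0019872   # gas constant in kcal/(mol*K)
TEMP_K = 298.15      # assumed temperature, 25 degrees C

def binding_free_energy(kd_molar: float) -> float:
    """Delta G = RT ln(Kd), in kcal/mol (more negative = tighter binding)."""
    return R_KCAL * TEMP_K * math.log(kd_molar)

def ligand_efficiency(kd_molar: float, heavy_atoms: int) -> float:
    """Ligand efficiency: -Delta G divided by the number of non-hydrogen atoms."""
    return -binding_free_energy(kd_molar) / heavy_atoms

# Hypothetical compounds: (Kd in mol/L, heavy-atom count).
compounds = {
    "small_ligand": (1e-8, 25),   # 10 nM, 25 heavy atoms
    "large_ligand": (1e-9, 40),   # 1 nM, 40 heavy atoms
}

by_energy = min(compounds, key=lambda n: binding_free_energy(compounds[n][0]))
by_efficiency = max(compounds, key=lambda n: ligand_efficiency(*compounds[n]))
print(by_energy)       # large_ligand: tighter binder overall
print(by_efficiency)   # small_ligand: more binding per atom
```

The two rankings disagree here: the larger compound binds more tightly, but the smaller one gets more binding out of each atom, which is the general idea behind scoring on efficiency rather than energy alone.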
From that, we’ll take the most efficient scored ligands and take those into the lab as quickly as we can and test those in a number of different systems. And we don’t rely on high-throughput types of screens. We’re very focused on what the industry is calling high-content screens now, so we go right into cell-based screening right away, and once again, utilize our modeling and predictions and the previous information that we received … to see if we can copy or mimic what we saw with the genetic-based screens — using RNAi or antisense to knock out expression of genes and see if we can copy that with a particular pharmacologic inhibitor.
And then we go through a process to try and optimize that small molecule to conform to a list of properties that we consider to be what an ideal drug candidate would look like. We have close to 30 criteria … and what we try to do is to develop that lead to have all of those properties. We do that in parallel; we don’t do it one at a time. We think it’s important to do that because if you try to optimize just one property, and then you move onto the next one, a lot of times you’ll take two steps forward and four steps backward. You’ve got optimized potency, but you’ve completely killed its distribution or its metabolism profile. So we try to use computational models once again to give us a whole perspective of those 30 criteria that we’re trying to improve at the same time.
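The difference between optimizing one property at a time and judging all criteria together can be sketched as follows. The three criteria, their thresholds, and the analog data are hypothetical placeholders; the roughly 30 criteria SuperGen uses are not listed in the interview.

```python
# Hypothetical pass/fail criteria for a lead compound.
CRITERIA = {
    "potency_nM": lambda v: v <= 100,               # potent enough
    "solubility_uM": lambda v: v >= 50,             # soluble enough
    "microsomal_half_life_min": lambda v: v >= 30,  # metabolically stable
}

def criteria_met(props: dict) -> int:
    """Count how many criteria a compound satisfies simultaneously."""
    return sum(1 for name, passes in CRITERIA.items() if passes(props[name]))

# Hypothetical analogs of one lead series.
candidates = {
    "analog_1": {"potency_nM": 5, "solubility_uM": 10,
                 "microsomal_half_life_min": 8},
    "analog_2": {"potency_nM": 40, "solubility_uM": 120,
                 "microsomal_half_life_min": 45},
}

# Optimizing potency alone picks the analog that fails everything else.
best_by_potency = min(candidates, key=lambda n: candidates[n]["potency_nM"])
# Scoring all criteria together picks the balanced analog.
best_overall = max(candidates, key=lambda n: criteria_met(candidates[n]))

print(best_by_potency, criteria_met(candidates[best_by_potency]))  # analog_1 1
print(best_overall, criteria_met(candidates[best_overall]))        # analog_2 3
```

In this toy case the most potent analog meets only one criterion, while a somewhat less potent analog meets all three, mirroring the "two steps forward and four steps backward" problem of sequential optimization.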
What are the advantages of this discovery platform in terms of cost or time saved?
We have about 25 people working here, so we have a very small group working on 14 different projects, but we’re able to push a project through from the very start, from the conception of ‘This particular protein or target looks interesting,’ to the time where we have an IND-ready candidate, and we’re looking at an 18- to 24-month time frame from start to finish. What you see most people quote is five to seven years from the time that projects are initiated to when they actually go into the clinic, so we’re able to reduce that with very, very few people.
Building this workflow has really focused our efforts. There are so many things that we can do and so much information that we can generate, that a lot of times biologists run a bunch of assays that aren’t really necessary and look for a lot of things that aren’t really adding value to a program, just because they can do it. What our workflow really has done — and we didn’t really mean it to be this way, but it turns out that it really helps us focus — is that we only run assays that really are directed at testing specific predictions. So we’re really focused in the types of experiments that we actually do in the lab, and I think that’s turned into a huge time saving for us because our biologists and scientists aren’t spending time just trying to do everything they can think of doing. They’re only testing specific questions that are posed to them from our modeling efforts. And I think that’s been a huge advantage for us.
So it’s less of a trial-and-error approach.
Yes. It’s, ‘Does this do this?’ and the answer is yes or no. And if it does, that’s great. We’ve made a prediction that seems to be true. And if it’s not, then we can go back and refine the model.
So really we’re only restrained by the number of processors we have, by the number of software licenses we have, and our biology and chemistry actually move pretty quickly. The good thing about being restrained by hardware and software is that it’s easy to fix. If we need more processors, it’s easy to get our hands on that.
You developed a proprietary workflow platform, but there are several commercial workflow systems available now from companies like SciTegic and InforSense. What’s the advantage of building your own?
We’ll evaluate and look at everything that’s out there, because we’re not going to reinvent something that would work for us, that we could adapt. But the reason why we built our own workflow is that I don’t think there’s another product out there that can really capture what we do and really direct what we do. As I mentioned, most companies are looking for tools that can create efficiencies in an already existing process, and because we really built our process around technology, adapting those tools that were really built to help create efficiencies in the high-throughput screening process or the traditional type of drug-discovery process, they just didn’t work as well for us because our process at its core is very different.
So we tried and looked at available programs that are out there, and certainly there are components of things that they do that our software does as well, but I think at its core, because our approach is different and it’s not using this workflow so much as an efficiency tool but really driving the whole process, it turned out that it just made sense for us to have that software be something that was specific to what we wanted it to do instead of trying to adapt something to fit the standard model of how drugs are discovered.
The neat thing about it is that it also connects to our LIMS system, so the workflow can capture information directly from the instruments that are being used in the lab, and manage all that information, centralize that, and present it in a way that makes sense to us based on our process.
How do you envision this platform changing or evolving going forward? What new methods are coming online that you’d like to bring into CLIMB?
We’re constantly adapting it. We have four full-time software programmers who work for us and are constantly adding new functionality and adapting and changing what we do. The core of what we’ve been doing recently is really focusing on our own binding-efficiency algorithm, so that it calculates what we think is important with respect to small-molecule/protein interaction, so that’s what we spend a lot of time working on.
Additional things that we’re trying to do include expanding the workflow to really encompass not just what we do on the discovery side, but to look at what’s happening more on the development side. As we move into the clinic, we’re going to extend the workflow to capture the clinical information that’s coming from early clinical trials and feed that information back into our process. … So as we move compounds into the clinic, we’re extending our workflow and our process to capture that information as well, and really allow the clinical scientists at the company to utilize the same tool. The beauty of that is that they can back up in time and look at how did we get here and what were the reasons that we chose this particular indication, what information do we have about this pathway in that indication, and to really capture all that information.
So that’s what we’re really excited about at the moment, is to not just have this be what the discovery scientists use, but to really utilize this through our whole company so that we can all have access to that information and we all have the same paradigm, so that even in the clinic, we’re trying to model specific things about what we’re looking for and then taking that information and seeing whether what we’re actually observing is what we’ve modeled and predicted from our computational efforts.