Part one in a two-part series.
BOSTON - A number of informatics providers are developing computational methods for predicting ADMET (absorption, distribution, metabolism, excretion, toxicity) properties, but most pharmaceutical firms still perceive ADMET as a primarily experimental process.
Last week, during IBC's Drug Discovery Technology conference, BioInform sat down with representatives from several informatics companies to discuss the current state of in silico ADMET technology, and how companies in the sector can expand adoption within pharma.
Ian Welsford: manager of application science, biosciences group, Fujitsu (Fujitsu recently launched an in silico ADMET prediction service [BioInform 07-18-05]).
Megan Laurance: product research scientist, Ingenuity Systems (Ingenuity's pathway informatics platform is used by researchers for a number of applications, including ADMET prediction).
Gregory Banik: general manager, informatics division, Bio-Rad (Last week, Bio-Rad released version 6.0 of its KnowItAll informatics platform, which includes a number of new in silico ADMET features).
How would you assess the field of computational methods for ADMET prediction? How good are current methods, and how much better will they need to be in order to see wider adoption?
Welsford: I think it's quite appropriate that you have three different vendors in different spaces here. Because what we see is really that no one comprehensive solution is meeting people's needs. There's ample opportunity for the predictive structural modeling that Bio-Rad has, and the knowledgebase approach that Ingenuity is basically the forerunner of, and we are more in the cheminformatics and chemical computing and chemical modeling space, and there's a role for each of these technologies.
Either it's early enough in the technology adoption cycle that one vendor hasn't agglomerated the one solution, or it could be that it's such a multifactorial problem that this will never occur, but I think there's tremendous value in having a variety of approaches, and I think in all the places that we're seeing it used, different groups are using different tools in different parts of the process.
Laurance: When we first built the [Ingenuity Pathways Analysis] software application that most people are familiar with, it really was designed around target ID, target validation. We actually found that customers were dragging us down that drug discovery/development process, straight into pharmacogenomics, toxicogenomics, primarily because they had embraced things like full-scale gene expression analysis, proteomics, whatever methods they could to try to capture a molecular signature of tox effects, and [were trying to] put that into perspective as far as what that implied about particular mechanisms that were disrupted. So for us, it's making sure that whatever methods are being used out there, to capture that state of toxicity, or before it gets to that state of tox, and help them put that data into perspective, and start to build signature pathways around effects that can be measured by traditional [ADMET] methods.
Banik: One of the cornerstones of our strategy going forward has been one of partnerships and collaboration, and Ian made a good point: There is no one who has everything; there is no one solution. It's early enough in the life cycle of this technology, or these technologies, that no one has completed the mythical whole product. And that's the nature of discovery: it's a vast range of end points and technologies, ranging from large molecules to small molecules, so I think integration and collaboration and cooperation is absolutely the key.
Laurance: For a lot of our customers who are relying on gene expression patterns to predict whether something at a low dose can lead to toxicity at a high dose, they need to reference established effects of drugs in a similar class. That might not be something we would have in our knowledgebase, but they certainly can create that inside the knowledgebase, and then, as you said, kind of iterate out to what else is out there. So absolutely [these are] very complementary approaches.
Welsford: We're really at a nascent stage of understanding these interactions, so basically you have two streams running as fast as they can in parallel. One is the pharmaceutical industry, which is under tremendous pressure to find new chemical entities and to get them derivatized and trying to push the limits of the chemistry. In the other stream are the vendors. The new chemistry, unfortunately, challenges us as predictive vendors to try and keep up with this. Since most models are based on existing well-validated scaffolds, as the scaffolds change, [and] new chemistries come online, getting high-quality ADMET predictions is more challenging, so that's going to lead back into that cycle, and ends up being this sort of iterative process, with both sides evolving.
Laurance: The other piece that for us is important in the knowledgebase approach is that because this affects so many different types of disciplines, there has to be some kind of common language that everyone can collaborate on and focus on. We're doing some work with the FDA to give them tools so that, as people are doing these voluntary submissions for toxicogenomics, they have some way to communicate about what this actually implies at the molecular level. I was at a biomarker meeting a few years ago, and there was a very strong FDA presence there, and it was the same conversation: People need a way for chemists to talk to biologists, and how do we know that we don't already have in-house information about this, and how do you transfer this knowledge from group to group?
So would you say that current methods are at the predictive level yet, or is it still at the level of gathering information and putting it in the right place to help people access what they already know?
Laurance: For the cases that I'm familiar with, [customers are looking for] a proof of principle that [they] can use this technology to equate something that's happening at the molecular level with something [they] can see at the pathology level. And once those proof-of-principle experiments are in place, there's going to be a lot more confidence about using something like functional genomics or transcript analysis to predict [ADMET properties].
Banik: Our approach is much more of a small-molecule perspective, in utilizing tools and technologies through our partners, as well as developed internally, that have been tested over the years. In quantitative structure-activity relationships [QSAR], where we're looking at the relationship between molecular descriptors and some measured property or activity, those technologies are well known. The application of them and the validation of those particular models for those specific end points is one of the difficulties. In particular, whether or not a model that has been developed, a global model, a one-size-fits-all model, is appropriate for the particular chemical space of the scaffolds that the pharmaceutical or biotech company is addressing in their discovery program. And that domain applicability is a critical issue: whether or not users can have confidence in the particular values that have been predicted.
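To make the QSAR idea concrete, the following is a minimal sketch, not any vendor's method. It fits a straight line relating a single molecular descriptor to a measured property by ordinary least squares. The descriptor and property values are invented for illustration, the function names `fit_line` and `predict` are hypothetical, and real QSAR models typically use many descriptors and more elaborate statistics.

```python
# Minimal QSAR-style sketch: relate one molecular descriptor to a measured
# property with ordinary least squares. All data below are invented.

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of the descriptor, and its co-deviation with the property.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Predict the property for a new compound from its descriptor value."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical training set: descriptor value and measured property per compound.
descriptors = [1.0, 2.0, 3.0, 4.0]
measured = [2.1, 3.9, 6.2, 7.8]

model = fit_line(descriptors, measured)
estimate = predict(model, 2.5)  # a compound inside the descriptor range of the training set
```

A prediction for a descriptor value far outside the 1.0 to 4.0 range used for fitting would be an extrapolation, which is the domain-applicability concern Banik raises.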
Welsford: We see, at Fujitsu, the need for filling in the holes. So I think that Ingenuity has been doing a good job of setting up the framework: what are some basic amounts of information one needs to know in order to have the discussion about how to model? Bio-Rad has done a good job of integrating these sorts of standard tools and applications. Where we see our value is sort of on the back-end informatics side in terms of the hardware necessary to deal with the acceleration of this information, and also in some of the high-end modeling that has to occur, because you don't always know whether the up-regulation of a protein is a good or a bad event. You have these multifaceted, interactive up- and down-regulatory pathways, so, for example, you get a transcriptional signal: a particular CYP goes up. Well, that could be a very good event (from an ADME perspective it could be very good), but in a very simplistic sense, you may be thinking, 'I don't want CYP up-regulation, CYP up-regulation is bad.' Well, whether it is good or bad is a context-specific event: where it is in the body, and which subcellular region this activity occurs in, is crucial to understanding its significance. And [we] don't always have all this specific information about the proteins that we're modeling. So we see a need for some sort of high-end modeling for some of these interactions, because we just don't know what that thing looks like yet, and we don't always have the crystal structure, and in some instances this can be very difficult to get. There's progress being made, but that still doesn't mean we can fill in all those holes.
On the back end, the storage and informatics, there are terabytes of data being generated by these applications, and there's a need for very fast database searching, and hardware, and we see needs in all of those areas.
Banik: Frost & Sullivan [wrote] a report on the ADME-tox market in the US in March, and there's a table of all the players in this field, and it lists 110. And I don't see any of those as necessarily competitors. They're all collaborators as far as I'm concerned because no one has completed the puzzle, no one has created the whole product, and the different methods are so complementary. There's a need to deal with the back end. The good news about this kind of prediction is that you generate lots of data. The bad news is that you generate lots of data. And then you need a way to make it accessible and store it all and convert that data, distill it into actionable information.
Laurance: So much of our focus as we progress with the technology at Ingenuity is, obviously, we're not going to have all the knowledge that's out there. We're talking a lot more about opening up the application behind the firewall at a lot of our large clients so that they can integrate with their internal information.
And the context-specific point that you brought up: absolutely, we're making sure that we understand that this causal connection is something that only occurs in this tissue, and that effect is different in this tissue versus that tissue.
Welsford: From the vendor side, you hear frustration that many pharmaceuticals and biopharmas are slow to adopt new technology. I think a lot of times it's easy to lose perspective about how much pressure these companies are under and how much regulatory authority hangs over them every minute of every day, and you tend to forget that they're trying to avoid Vioxx and Bextra every single day, and if they don't, basically their entire world may fall apart. So they're under tremendous stress. And I think that is why we're sort of at the 'dipping-the-toe-in-the-water' stage a little bit, with pharma saying, 'What value is there in this?' But to expect the drug development industry to just whole-hog dive into [what we offer] without being concerned about the consequences is a bit naive.
You bring up Vioxx and Bextra as if that is holding up adoption for in silico ADMET prediction, but I've heard it presented the other way: that the increasing demand for improved safety will help drive adoption of these methods.
Welsford: I guess what I was saying was that at some level that's probably accurate, that more information is probably better. But to assume that simply because we've had these kinds of adverse events, that automatically these companies are going to kind of wholesale take everything they've done, 10 to 15 years of data, and just say, 'Well, let's just go with QSAR, that's all we're going to do,' is also naive. Yes, I think downstream that there's value in that, but I don't think there's going to be an immediate switchover.
Laurance: I agree. A lot of our clients are completely embracing this technology, but there is concern that until it really gets to a predictive state and until there is really solid proof of concept and proof of principle, there is concern about misinterpretation. So for example, to take an example that you talked about, until we know that up-regulation of CYP is always a bad event, that's information that the FDA is going to be asking for, so there has to be some embracing of how this data is interpreted. I don't think it's keeping people from adopting [the technology], but I would be cautious about how to interpret all this stuff yet.
Banik: The hope is that in silico [methods] really [are] not going to be a replacement [for experimental methods], but really a prioritization tool. It's something that's, relative to in vitro or in vivo testing, inexpensive and much higher throughput. It can't be used as a replacement, but rather as a refinement tool or a prioritization tool to put forward the best compounds, to run them in more expensive and more exacting in vivo and in vitro testing.
Welsford: It might also be good to think on sort of the flip side of the adoption of the technology: as you're moving toward the multifactorial and the complex, it's increasingly apparent that there aren't single good answers for animal models, and while a lot of animal models used in testing for a lot of these factors are effective, there are certain other ones that are informative, but not necessarily definitive. So I think that's also driving demand for research organizations to try and utilize whatever surrogate data they can, and use those knowledgebases, infrastructure systems, and models, and leverage them sooner in the process.
Banik: There are two quotes in this area that I absolutely love. One is Yogi Berra, and it's, 'Predictions are difficult to make, especially about the future.' The other is from W. Edwards Deming, the quality management guru, and it's, 'All models are wrong, some models are useful.' And I think that bears remembering as well: all these things are extrapolations of reality. Even an animal model is a model; it's not a human, it's an extrapolation of reality. And we hope that what we do in those instances will be somehow representative and informative, in terms of putting potentially hazardous compounds into humans.
Laurance: I think that gets back to the point of all the complementary approaches. I really think that our customers are trying to gain as much confidence as possible about how to proceed with a particular candidate. A lot of that comes from what they can infer from pathways and what they can infer from the existing knowledge in the knowledgebase. A lot of it is going to come from the history of experimental data that they've generated in house. I don't think anyone is going to plug it through a computer model and say, 'OK, this is what we're going to do.' It's going to be a compendium of evidence. It's science: They're going to get together at their lab meeting every month and figure out what the evidence is telling them to move forward.
Welsford: Fujitsu is known as an in silico company, but several years back, some of our customers in Japan said, 'These models are great, but you have to test them.' So they presented Fujitsu Laboratories with a challenge to provide a system to enable higher-throughput cellular screening. We are showing this at the show [CellInjector]. The lesson would seem to be that the in silico part is important, but at some point people are going to have to go into cellular assays, so Fujitsu is developing that part of the process, so it's not an entirely in silico game.
Banik: Going back to the point Megan made about confidence, one of the things that I hear from companies time and time again is [the need to provide] some level of confidence in the models, and validation is one way to do that. Have a predictive model, compare it with compounds that were not part of the training set to plot a predicted versus actual and generate relevant statistics that give you some confidence in your model or that particular validation set. But people have requested more than that, and one of our partners, ChemSilico, had a feature that is now implemented in our KnowItAll system that allowed, for a particular endpoint, a confidence level to be estimated based on whether or not the compound being put through the model was close enough to the chemical space upon which that model was built, the training set's chemical space. And we've now implemented that in the new release that we're announcing at the show.
I think we're all in that stage where we have to convey to our customers and potential customers that this stuff does in fact work, and that they can see that it works. No one is going to use it without believing it, and seeing is believing.
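The two confidence checks Banik describes can be sketched in a few lines, under stated assumptions. This is a generic illustration, not the ChemSilico or KnowItAll implementation, whose details the discussion does not give. It assumes compounds are represented as numeric descriptor vectors, uses Euclidean distance to the nearest training compound as a stand-in for "closeness" in chemical space, and treats the cutoff as a user-chosen value. All function names are hypothetical.

```python
import math


def rmse(predicted, actual):
    """Root-mean-square error between predicted and measured values."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))


def r_squared(predicted, actual):
    """Coefficient of determination for a predicted-versus-actual comparison."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for p, a in zip(predicted, actual))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def nearest_training_distance(query, training_set):
    """Euclidean distance from a query descriptor vector to its closest training compound."""
    return min(math.dist(query, t) for t in training_set)


def in_domain(query, training_set, cutoff):
    """Assumed rule: a prediction is trusted only if the query lies within
    `cutoff` of at least one training compound in descriptor space."""
    return nearest_training_distance(query, training_set) <= cutoff


# Invented example: two training compounds described by two descriptors each.
training = [(0.0, 0.0), (10.0, 10.0)]
query = (3.0, 4.0)
trusted = in_domain(query, training, cutoff=6.0)
```

The statistics functions would be applied to a validation set of compounds held out from training; the domain check would be applied per compound at prediction time, so that a value is reported together with an indication of whether the model was built on similar chemistry.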
Next week's issue of BioInform will include the second half of the discussion, which covers the quality of current training data, the impact of biological data on a historically chemical field, and the non-technical adoption hurdles for in silico ADMET.