In silico drug design shows promise as a quick, low-cost alternative to high-throughput screening and other expensive experimental methods, but the field has been hampered by the limited number of experimentally derived three-dimensional protein structures. Although the Protein Data Bank is growing quickly (it contained 29,101 structures as of Jan. 11), that total is estimated to represent only around 2 percent of all known proteins.
Some researchers have turned to computational methods, such as homology modeling, in order to expand this available set of protein structures in the service of their in silico drug design efforts. BioInform recently spoke to Alexander Hillisch, who heads up Bayer Healthcare’s computational chemistry group in Wuppertal, Germany, to discuss the promise of this approach, as well as its limitations.
How common is the use of homology modeling in the pharmaceutical industry today?
Comparative or homology modeling is just one task done by computational chemistry or structural biology departments in the pharmaceutical industry. Although it is an important method, I would certainly say that it’s not a task that we do every day.
The method relies on the observation that in nature, protein structure is more conserved than sequence, and small or medium changes in the sequence normally result only in small changes of the three-dimensional structure. Homology modeling utilizes experimentally determined protein structures to predict the conformation of another protein with similar amino acid sequence.
One thing you have to keep in mind is that it’s only applicable if certain prerequisites are fulfilled. The target protein of unknown structure has to share significant sequence identity (50 to 99 percent, in some rare cases down to 30 percent) with a protein of known structure, the template protein.
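The identity ranges mentioned above can be illustrated with a minimal Python sketch. It assumes the target and template have already been aligned by some external tool, and it counts identity only over columns where neither sequence has a gap, which is one of several conventions in use. The function names, the category labels, and the example sequences are hypothetical and are not taken from the interview.

```python
def percent_identity(aligned_target: str, aligned_template: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_target.upper(), aligned_template.upper()):
        if a == "-" or b == "-":
            continue  # skip gap columns (an assumed convention)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * matches / compared


def template_suitability(identity: float) -> str:
    """Map an identity value onto the ranges quoted in the interview."""
    if identity >= 50.0:
        return "usual range (50 percent or more)"
    if identity >= 30.0:
        return "rare cases (30 to 50 percent)"
    return "below the range cited (under 30 percent)"


# Made-up aligned fragments, for illustration only.
target = "MKT-AYIAKQR"
template = "MKTWAYLAKQR"
identity = percent_identity(target, template)
print(f"{identity:.1f}% identity: {template_suitability(identity)}")
```

The thresholds are only a reading of the figures quoted here; in practice the identity within the binding site may matter more than the global value.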
What are the primary limitations for homology modeling in drug design today?
As mentioned before, the quality and, thus, the utility of homology models is highly dependent on the sequence identity between the target and the template protein. The primary limitation for a broader use of homology models in drug design is the poor quality of models that are based on a lower sequence identity to the template protein. In many cases such models are not sufficient for detailed predictions on how small-molecule ligands bind and where to modify them to obtain a stronger interaction with the target protein.
What would you like to see improved over the next few years?
The quality and the reliability of the structures. That is one thing that is limiting the applicability of the method. And also the inclusion of protein flexibility, like induced-fit phenomena. For example, Schrödinger has recently released a program that deals with induced-fit phenomena.
Do you use that software?
Yes. We see induced-fit phenomena in many of our protein-structure based design projects. The more we know about binding of ligands from X-ray structures, the more often we see such effects.
What do you see as the low-hanging fruit in terms of using this technology for drug discovery?
I see three kinds of applications that have short-term promise:
Homology model-based drug design, where you use protein structure models to identify ligands that modulate the activity of that target protein. Of course, experimental structural information is much better than models, but in some cases if you are starting a project in the pharmaceutical industry, you may not have the experimental structure available, and in that case, homology modeling can give additional insights about how to identify and optimize compounds with respect to binding affinity.
Another application with short-term promise would be the design of selective compounds. Let’s say you have two or more homologous proteins and you want to target only one of them with a drug. If the structure of one of these proteins is solved by X-ray crystallography or NMR spectroscopy, you can model the homologous proteins. This information may aid in the design of selective compounds that interact specifically with parts of the binding pockets that differ between the homologs. This is a very common task in computational chemistry nowadays.
A further option would be the design of compounds with broad-spectrum activity, against targets for anti-infective treatment. If you knew the binding sites of the target protein of choice for a variety of relevant organisms, you could design compounds that preferentially bind to conserved regions and avoid interactions with amino acids that vary throughout the binding pockets.
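As a rough illustration of the broad-spectrum idea, the following Python sketch finds the binding-site positions that are identical across several organisms. It assumes the binding-site residues have already been extracted from structures or models and aligned position by position. The organism names, residue strings, and function name are hypothetical.

```python
def conserved_positions(aligned_sites: dict[str, str]) -> list[int]:
    """Return 0-based positions where every organism has the same residue."""
    sequences = list(aligned_sites.values())
    if not sequences:
        return []
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("all binding-site strings must have equal length")
    return [
        i
        for i, column in enumerate(zip(*sequences))
        if len(set(column)) == 1 and column[0] != "-"  # identical and not a gap
    ]


# Made-up binding-site residues for three hypothetical pathogens.
sites = {
    "organism_A": "HDSGY",
    "organism_B": "HDTGY",
    "organism_C": "HDSGF",
}
print(conserved_positions(sites))  # positions 0, 1 and 3 are conserved
```

Positions missing from the result are the variable ones that a broad-spectrum compound would ideally avoid contacting.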
Another low-hanging fruit would be the prediction of animal model suitability, especially for pharmacological experiments. ... Progress in molecular biology has allowed [us] to use human expressed proteins and cells in the early drug discovery phase. However, in vivo pharmacological characterizations have to be done in animals, which is certainly a step forward with respect to the complexity of the test system and closer to the human situation than single cells. But the target protein, although probably highly similar, may differ between animals and humans. This disconnect in the pharmacology of animals and humans poses [the risk of losing] promising compounds only because the wrong animal model was selected.
Homology modeling of orthologous proteins provides one way to predict the suitability of animal models for pharmacological studies. Since in many cases proteins of species relevant for pharmacological tests — for example, rats, mice, or guinea pigs — are very similar to human proteins, one can easily build models for these proteins if the structure of the human target protein is known. If, for example, the binding site of a target protein from guinea pigs differs substantially from that of humans, one should avoid [using] this species for pharmacological investigations.
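The comparison of orthologous binding sites described above can be sketched in Python as follows. The sketch assumes that the binding-site residues of the human protein and of each modeled animal ortholog are available as position-aligned strings. The residue strings, species data, and function names are hypothetical, and no cutoff for "differs substantially" is applied because the interview does not give one.

```python
def binding_site_differences(human_site: str, ortholog_site: str) -> list[tuple[int, str, str]]:
    """List (position, human residue, ortholog residue) for every mismatch."""
    if len(human_site) != len(ortholog_site):
        raise ValueError("binding-site strings must have equal length")
    return [
        (i, h, o)
        for i, (h, o) in enumerate(zip(human_site, ortholog_site))
        if h != o
    ]


def rank_species(human_site: str, ortholog_sites: dict[str, str]) -> list[tuple[str, int]]:
    """Sort species by number of binding-site differences from human, fewest first."""
    counts = {
        species: len(binding_site_differences(human_site, site))
        for species, site in ortholog_sites.items()
    }
    return sorted(counts.items(), key=lambda item: item[1])


# Made-up binding-site residues, for illustration only.
human = "HDSGYW"
orthologs = {"rat": "HDSGYW", "mouse": "HDSGFW", "guinea_pig": "HESGFW"}
for species, n_diff in rank_species(human, orthologs):
    print(species, n_diff, binding_site_differences(human, orthologs[species]))
```

A simple mismatch count treats every substitution alike; a conservative change may matter far less than one that alters charge or size, so the ranking is only a first filter.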
How about longer-term applications?
I would say that it’s actually very difficult to make structure-based predictions on drug metabolism and toxicity. Proteins of relevance here are cytochrome P450 enzymes or the human ether-a-go-go potassium channel. Some X-ray structures of relevant proteins have been solved recently. But in order to make reliable predictions one needs more protein structure information on such anti-target proteins.
Another long-term promise would be structure-based assessment of target druggability. It is clear that only a tiny fraction of the entire proteome can be affected by drug-like (preferably orally bioavailable) small molecules. Based on the total numbers of known genes, disease-modifying genes, and druggable proteins, the number of drug target proteins in humans has been estimated to be between about 600 and 1,500. For small molecules, sets of properties have been established that distinguish drugs from other compounds, for example by flagging compounds with poor oral-absorption properties. Since drug molecules and their corresponding target proteins are highly complementary, some rules that distinguish good target proteins from others should be deducible. Deep lipophilic pockets with distinct polar interaction sites are clearly superior to shallow, highly charged protein surface regions. If one can choose from a variety of target proteins, one would certainly want to address the target with a binding site that can be modulated by drug-like compounds. Progress has been made in this area with homology modeling, but precise predictions are challenging at the moment.
Have you determined which of the available homology software tools works best for your requirements?
We have evaluated some of the methods out there, but I can’t give a real answer as to what is best. Some programs address special issues and are in some respects better than others, but there are none that I would say have solved all the problems.
Have you developed any of your own homology modeling tools at Bayer, or do you primarily use commercial software?
We’re not developing software for homology modeling ourselves. This is really something that we leave to software companies. But we are in constant contact with some of the companies and help them to improve the tools by communicating our experiences.
How long do you think it will take before homology modeling becomes a reliable method for 100-percent in silico drug design?
I think that 100-percent reliability will probably never be reached. Even if you take structure-based drug design, where you have one or several experimental X-ray structures, the reliability is far from 100 percent. The problem with homology-based drug design is that, in addition to the inherent uncertainties of structure-based design, you cannot be sure that your protein model is absolutely correct.
It really depends on how close the sequence that you want to model is to that of an experimental X-ray structure. If the identity is 90 percent, then in most cases you can take the sequence, build the model, and it will be okay. If it’s 30 percent, you may end up with a bad prediction, which in the end will not help you very much for drug design. As mentioned before, I think that homology models that are based on 50 to 99 percent sequence identity are quite okay for doing structure-based drug design.
In general, I would say that while complete experimental structures of pharmacologically important target proteins are missing, homology modeling provides one approach to bridge the time gap until the experimental structure becomes available and is thus a useful method for drug design purposes.