Computational drug design firm Numerate said this week that Boehringer Ingelheim will use its platform to generate in silico small-molecule drug leads for an undisclosed infectious disease target.
Numerate's drug-design platform uses a series of proprietary algorithms to produce predictive models for molecular properties of candidate compounds. The company said these models can predict drug success or failure with an accuracy of between 50 percent and 80 percent, which it claims is comparable to experimental laboratory testing.
The algorithms, which run on Amazon's Elastic Compute Cloud (EC2) infrastructure, use large collections of compounds from publicly available and private sources to create "virtual libraries" comprising more than 10 billion compounds. The company then creates "virtual assays" for specific drug design properties, such as activity and toxicity, that it applies to the virtual libraries in order to identify a handful of compounds that are most likely to satisfy the design criteria.
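The screening step described here, scoring every compound in a library against several virtual assays and keeping only a handful, can be illustrated with a minimal sketch. This is not Numerate's code: the `screen` function, the toy score tables, and the rule of ranking a compound by its worst assay score are all assumptions made for illustration.

```python
import heapq
from typing import Callable, Iterable, List, Tuple

# Hypothetical stand-in: a "virtual assay" maps a compound ID to a predicted
# score, where higher means a better fit to that design criterion.
VirtualAssay = Callable[[str], float]


def screen(library: Iterable[str], assays: List[VirtualAssay],
           keep: int) -> List[Tuple[float, str]]:
    """Stream through a library and return the `keep` best compounds.

    Assumption: a compound's overall score is its worst assay score, so it
    must do reasonably well on every criterion to rank highly.
    """
    best: List[Tuple[float, str]] = []  # min-heap of (score, compound)
    for compound in library:
        score = min(assay(compound) for assay in assays)
        if len(best) < keep:
            heapq.heappush(best, (score, compound))
        elif score > best[0][0]:
            # Evict the current weakest survivor.
            heapq.heapreplace(best, (score, compound))
    return sorted(best, reverse=True)


# Toy predictions standing in for trained models (invented numbers).
toy_activity = {"cmpd-A": 0.9, "cmpd-B": 0.4, "cmpd-C": 0.8, "cmpd-D": 0.7}
toy_safety = {"cmpd-A": 0.3, "cmpd-B": 0.9, "cmpd-C": 0.6, "cmpd-D": 0.8}

top = screen(toy_activity, [toy_activity.get, toy_safety.get], keep=2)
print(top)
```

Because the heap never holds more than `keep` entries, memory use stays constant however large the library is, which matters when the library runs to billions of compounds.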
Launched in 2007, Numerate applies a "data-driven approach" to designing drug molecules and charges its customers only when it successfully creates and delivers compounds that work, Guido Lanza, the company’s CEO, explained to BioInform.
The San Bruno, Calif.-based company's business model is based on forming risk-sharing partnerships with pharmaceutical and biotechnology companies as well as with academic groups.
In addition to its latest catch, Boehringer Ingelheim, the company has forged similar partnerships with Intellikine and Presidio Pharmaceuticals in the past. Currently, it is working on projects with two pharmas and a biotech firm, Lanza said, although he could not provide specific details.
Numerate acquired the algorithms it uses in its pipeline from Pharmix — a now-defunct computational drug design startup that Lanza co-founded in 2001 to commercialize technology developed at Stanford University.
Pharmix focused on designing and commercializing small-molecule compounds with predicted, optimized, and validated characteristics.
However, that turned out not to be the "best model for this kind of technology," Lanza said, because "any time you say the words 'computer' and 'drug design,' you are going to get a fair bit of justifiable skepticism."
As a result, "we decided to structure our business model in such a way that we are rewarded not based on giving people access to our technology but based on delivering compounds that work," he said.
Additionally, "we realized that in order to make a big dent in this kind of space, simply going out and licensing ... packages and figuring out a more clever way to stitch them together was not going to get us there," he explained. Instead, the company would "have to take a step back, look at the real world, and actually solve the problems that generate the most value."
Lanza said that Numerate's approach to drug discovery differs from other computational methods — such as docking, simulation, and quantitative structure-activity relationship modeling — that attempt to make general predictions about a compound's activities based on structural characteristics.
"The key flaw of many of the existing approaches is that they either learn too little or too much from existing data," he said. "What you are really trying to learn in a series of compounds that are all very similar is what distinguishes a more active from a less active compound, not what active compounds have in common."
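One way to picture the distinction Lanza draws is to train on differences between pairs of compounds in a series rather than on the compounds themselves. The sketch below is only an illustration of that idea, not a description of Numerate's proprietary algorithms: the binary descriptors, the activity values, and the use of a simple perceptron are all assumptions.

```python
from itertools import combinations
from typing import List, Sequence


def pairwise_differences(features: Sequence[Sequence[float]],
                         activities: Sequence[float]) -> List[List[float]]:
    """Return (more active minus less active) descriptor differences."""
    diffs = []
    for i, j in combinations(range(len(features)), 2):
        if activities[i] == activities[j]:
            continue  # a tie says nothing about what drives activity
        hi, lo = (i, j) if activities[i] > activities[j] else (j, i)
        diffs.append([a - b for a, b in zip(features[hi], features[lo])])
    return diffs


def fit_ranker(diffs: List[List[float]], epochs: int = 20) -> List[float]:
    """Perceptron on difference vectors: seek w with w . d > 0 for every d."""
    w = [0.0] * len(diffs[0])
    for _ in range(epochs):
        for d in diffs:
            if sum(wi * di for wi, di in zip(w, d)) <= 0:
                w = [wi + di for wi, di in zip(w, d)]
    return w


# Invented toy series. Descriptor 0 is a scaffold every compound shares;
# descriptors 1 and 2 are substituents that vary across the series.
features = [[1, 0, 0], [1, 1, 0], [1, 1, 1], [1, 0, 1]]
activities = [5.0, 6.0, 5.5, 4.5]

diffs = pairwise_differences(features, activities)
weights = fit_ranker(diffs)
print(weights)
```

In this toy series the shared scaffold cancels out of every pairwise difference, so its weight never moves from zero, while the two varying substituents end up with weights of opposite sign. The model learns only what separates a more active compound from a less active one.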
Additionally, these approaches "fall short" on the statistical analysis of the data, he said.
In contrast, Numerate tries to help chemists answer the question, "What should I make next?" given all available information, he explained.
Additionally, he said that the company’s approach addresses experimental noise as well as the mismatch between the small number of drug compounds and the large number of variables needed to represent the phenomena being modeled.
Numerate has a list of criteria that it uses to determine whether potential projects are a good fit for its technology platform, including a minimum amount of initial data available for the project and the size of the compounds required. The company also conducts feasibility studies to test whether its method can appropriately address the client's questions.
While the platform has so far been "generally applicable" to most studies, there are some projects, such as those in which there is no initial data or very little intellectual property, "where a structure-based approach would be more effective," Lanza explained, adding that the company has turned down projects for this reason.
Once it embarks upon a project, the company begins gathering and cleaning related data from multiple sources and uses this information to build its predictive models.
Numerate's system "learn[s]" from publicly available data on chemical compounds, data provided by the company's partners, and patent information, and then generates predictive models to help it sort through large lists of potential molecules, Lanza said.
For a project to design a kinase inhibitor, for example, Numerate's scientists design virtual libraries containing several billion compounds and then use in silico modeling techniques to "prioritize" the compounds that best match the design criteria, such as metabolic stability, activity, or selectivity, Lanza explained.
After the system applies virtual assays for each of the drug design goals to each of the compounds in the library, a process that takes around a month on around 1,000 compute nodes, it suggests around three to five compound series containing 10 analogs each.
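A rough back-of-the-envelope calculation gives a sense of what those figures imply per compound. The article gives only "around a month," "around 1,000 compute nodes," and "several billion compounds," so the 30-day month and the 5 billion library size below are assumed placeholders, and the function name is hypothetical.

```python
def node_seconds_per_compound(compounds: float, nodes: int, days: float) -> float:
    """Total node-seconds of compute divided by the number of compounds."""
    total_node_seconds = nodes * days * 24 * 3600
    return total_node_seconds / compounds


NODES = 1_000               # "around 1,000 compute nodes"
DAYS = 30                   # assumed length of "around a month"
COMPOUNDS = 5_000_000_000   # assumed stand-in for "several billion"

node_hours = NODES * DAYS * 24
budget = node_seconds_per_compound(COMPOUNDS, NODES, DAYS)

print(f"{node_hours:,} node-hours in total")
print(f"{budget:.4f} node-seconds per compound, across all virtual assays")
```

Under these assumptions, the run amounts to 720,000 node-hours, or roughly half a node-second per compound for all of the virtual assays combined. A larger library or a shorter run would shrink that budget proportionally.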
In cases where the system returns more recommendations than are required, Numerate's scientists take additional steps to further narrow the list, Lanza explained.
Numerate's system, which is Java-based, performs all its calculations on Amazon's cloud, where it uses the Sun Grid Engine to schedule compute jobs across hundreds of instances and OpenVPN to create a secure virtual network for its computation. It stores its results in Amazon's Simple Storage Service.
The company presents its list of compounds to its partners, who can either accept or reject Numerate's suggestions.
If the partner chooses to continue, the compounds are synthesized and screened using pre-selected assays, either by the pharma partner itself or by other companies to which the work is outsourced, Lanza said.
Compounds that pass that screening stage can then be further tested and validated in partners' laboratories.
The rights to any intellectual property that is generated during the process are determined when the initial contract is inked. Such arrangements can vary depending on the nature of the partnership, Lanza said.
For example, in a situation where the partner does not contribute any initial IP, Numerate may retain rights to the results if it meets previously established goals for the project, he explained. Conversely, if the partner provides most of the initial data, then it would own all or most of the IP.
Lanza said he isn't aware of any other computational drug design companies with a similar business model, although there are many firms that offer tools with some of the same capabilities and target similar clientele.
However, Lanza hopes that giving customers the option to pay only on receipt of a functional compound will keep his firm a step ahead of traditional software vendors in the space.