Collaborative Drug Discovery Bags Phase I SBIR to Build Tools for Sharing Models of Drug Molecules


By Uduak Grace Thomas

Collaborative Drug Discovery recently received a Phase I Small Business Innovation Research grant from the National Library of Medicine that it will use to develop a method of securely sharing in silico drug models that can be used to predict the absorption, distribution, metabolism, excretion, and toxicity of possible treatments for tuberculosis.

Barry Bunin, president and CEO of CDD, told BioInform that the company will use the funding, worth $150,000, to build a new module onto its existing drug-discovery platform, CDD Vault, that will allow customers to selectively share drug models without disclosing proprietary chemical structures.

In addition, in the second phase of the project, CDD plans to incorporate open source cheminformatics modeling components into its platform that users can use to build their models, he said.

As an example of the potential for the new capabilities, Bunin explained that a pharmaceutical company could build models based on proprietary information and then use CDD Vault to test them on data from potential collaborators without revealing any sensitive data.

CDD's current platform is "very good for securely collaborating in terms of being able to partition or selectively share [IP-sensitive] data with collaborators … before publications and patents," Bunin said. With this grant, CDD plans to "take the exact same concept and extend it to model the meta information based on the data without sharing the underlying data."

IP-sensitive data is similar to patient data in terms of security concerns, Bunin said. With patient data, "you have to make sure that it's de-identified two ways so you can't accidentally find the person by zip code and age. It’s a similar concept here."

He added that the new capabilities will allow researchers to "share a model based on the patterns in the data" that could be used to explore which molecules are effective for a particular disease and which aren’t, for example.

Bunin said the company expects its new infrastructure to encourage drug companies and research groups to collaborate with each other while keeping sensitive data protected — a goal that lines up with those of the Pistoia Alliance, a consortium of pharmaceutical firms and other groups that aims to improve pharmaceutical R&D informatics (BI 7/29/2011).

In fact, Sean Ekins, CDD's collaborations director, was voted onto Pistoia's board of directors earlier this month.

Furthermore, these models could help research groups working on "neglected diseases and orphan indications," which larger pharmaceutical companies have historically "ignored" because they aren’t seen as "highly profitable opportunities," CDD's grant abstract states.

During the six-month first phase of the SBIR project, CDD and collaborators at Weill Cornell Medical College, the National Institute of Allergy and Infectious Diseases, and the Infectious Disease Research Institute aim to demonstrate the value of creating "computational TB screening models derived from distributed, heterogeneous collections of data" and using them to "filter and prioritize the molecules scheduled to be screened."

A second component of the grant will be to develop a set of "initial standards for specifying models" that will form the basis for "more comprehensive software prototypes" that will be developed and tested in the second phase of the project, according to the abstract.

"Sometimes it will make more sense to combine datasets to build a stronger and a better model based on more data," Bunin said.

However, because of differences in the way experiments are set up and variations in the technologies that are used, combining data might not always be the best option, so part of the task will be "to set up ways to rigorously and reproducibly demonstrate when people benefit from combining data for models from disparate sets and when it hurts," he explained.

Using open-source tools as a baseline, CDD and its collaborators hope to develop a general "framework" that's broad enough to allow any user to build a model with any platform and then test it against multiple datasets, Bunin said.

CDD is looking for additional partners on the project and invites interested parties to contact the company.

Free Agents

In addition to establishing a baseline for its modeling framework, CDD chose to include open source cheminformatics tools in the project to ensure that drug models would not depend on a single commercial or academic software program, Bunin said. He also noted that there really isn't a difference between the models built using open source tools and those from commercial products.

CDD chose not to develop its own model-building tools because it wanted to remain a "neutral" part of the model building and sharing process, he said.

Commercial cheminformatics packages currently used to create models include Accelrys' Discovery Studio and Chemical Computing Group's Molecular Operating Environment, or MOE. An example of an open-source package is the Chemistry Development Kit, or CDK, a Java library that’s available on SourceForge.

CDD and Pfizer last year published a paper in the journal Drug Metabolism and Disposition, in which they compared the model-building capabilities of MOE and CDK.

Their results showed that "computational models generated for very large datasets with open cheminformatics tools were equivalent to those generated with commercial tools," CDD's Ekins noted in a statement.

In the paper, the researchers wrote that when they ran both MOE and CDK on a training set of 50,000 molecules and a test set of 25,000 molecules, CDK's model had a sensitivity of 0.57 and specificity of 0.91 while MOE's had a sensitivity of 0.58 and a specificity of 0.91.
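For readers less familiar with the two metrics, the sketch below shows how sensitivity and specificity are conventionally computed from a binary classifier's predictions. It is an illustration only: the function names and the toy labels are hypothetical and are not taken from the paper or from either software package.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fn, tn, fp) for binary labels, where 1 = active and 0 = inactive."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp, fn, tn, fp


def sensitivity(tp: int, fn: int) -> float:
    """Fraction of truly active molecules that the model flags as active."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of truly inactive molecules that the model flags as inactive."""
    return tn / (tn + fp)


# Toy labels for illustration only (hypothetical, not data from the study).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]

tp, fn, tn, fp = confusion_counts(y_true, y_pred)
print(f"sensitivity = {sensitivity(tp, fn):.2f}")
print(f"specificity = {specificity(tn, fp):.2f}")
```

Read this way, the reported figures suggest that both models identified a little over half of the active molecules in the test set while correctly rejecting about nine in ten inactive ones.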

Bunin said that the company plans to develop its model-sharing module "in a way that's inclusive of both commercial producers of models and technologies as well as open source ones and then let the [users] decide" which tools they want to use.

Currently, users pay a subscription fee for access to CDD Vault's data-handling and chemical-modeling registration capabilities. The company hasn’t decided on a business model for the platform's new capabilities, Bunin said, since the project is still only in its first phase.

"We have a half a year or so for phase one and then a year for phase two, and so that will play out and become self evident in the fullness of time," he said.

However, CDD will keep its customer base in mind when putting its business plan in place.

"Our bread and butter is really providing CDD Vault for smaller labs, cash-sensitive startup companies, and academics, so that’s the bulk of a lot of our users," he said. "They also could benefit from this technology when it is developed."

Have topics you'd like to see covered in BioInform? Contact the editor at uthomas [at] genomeweb [.] com
