Cambridge, Mass. — The third annual CSBi Symposium kicked off last week with a clear message: Computational systems biology is emerging from blue-sky mode and finding use in drug discovery.
The first day of the Computational and Systems Biology symposium, hosted by the Massachusetts Institute of Technology’s campus-wide CSBi initiative, was dominated by talks about real-world applications of tools developed to support a systems-based approach to biological research.
The phrase “systems biology” often serves as a catch-all for any biological research done using high-throughput technology (a definition that some might disagree with), but this open-ended outlook has served CSBi’s mission well. The MIT initiative includes several technology areas, such as microfabrication and image analysis, that other systems-biology efforts bypass in favor of the usual ‘omics tools. Now, the initiative is setting its sights on small molecules as well.
Peter Sorger, director of CSBi, said during the symposium that one goal of the initiative is to “aggressively pursue” on- and off-pathway targets for drugs that have already been approved by the FDA, in an effort to better understand ADME-tox properties of small molecules.
Sorger said that the resulting data would be released in the public domain, but did not provide a timeframe for the research project.
Sorger’s comments followed a presentation by MIT’s Erik Brauner on his work integrating small-molecule data with biological information. Brauner co-developed the ChemBank database while at Harvard. He recently built a database that links structural information he identified in drugs listed in the FDA’s “Orange Book” of approved therapeutics with biological assay information and structural data on targets from the Protein Data Bank. This kind of integrated resource, he said, has not been made available in the public domain before.
After removing redundancies and sorting out some thorny nomenclature issues, Brauner found that there are just 1,355 “unique active molecular ingredients” that the FDA has approved to date — a number that Brauner said he was “surprised” to see so low. In addition, these compounds hit a total of only 147 protein targets (primary mechanisms only).
In another unexpected finding, Brauner said that when he checked the approved compounds against the so-called Lipinski rules for “drug-likeness” — molecular weight, lipophilicity, and several other criteria that medicinal chemists tend to take as gospel — around 26 percent of them violated one or more of the rules. Furthermore, he said, when he tracked these approved compounds over time, he found that the average number of Lipinski violations is actually climbing.
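For readers unfamiliar with the Lipinski check, the tally Brauner describes can be sketched roughly as follows. This is a minimal illustration, not Brauner's code: the thresholds are the commonly cited "rule of five" cutoffs (the article names only molecular weight and lipophilicity), and the `Descriptors` class, function names, and example values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed properties for one compound (hypothetical container)."""
    mol_weight: float       # molecular weight in daltons
    log_p: float            # lipophilicity (octanol-water partition coefficient)
    h_bond_donors: int
    h_bond_acceptors: int


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of the four commonly cited rule-of-five limits are exceeded."""
    checks = [
        d.mol_weight > 500,
        d.log_p > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ]
    return sum(checks)


def fraction_violating(compounds) -> float:
    """Fraction of compounds that break one or more rules."""
    flagged = sum(1 for d in compounds if lipinski_violations(d) >= 1)
    return flagged / len(compounds)


# Illustrative values only; these are not taken from Brauner's database.
example_set = [
    Descriptors(180.2, 1.2, 1, 4),
    Descriptors(350.4, 3.1, 2, 6),
    Descriptors(720.9, 5.8, 3, 12),
    Descriptors(495.0, 5.4, 2, 8),
]

print(fraction_violating(example_set))
```

Grouping the same count by approval year would give the kind of trend over time that Brauner reported.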
Brauner said that the full significance of this trend remains unclear, but noted that many pharmaceutical companies “throw out all molecules that violate one of the Lipinski rules” — a precaution that might not be necessary with improved biological understanding of the target molecules that these drugs are being designed for.
Steven Altschuler of the Bauer Center for Genomics Research at Harvard University also discussed a project based on small-molecule compounds.
Altschuler’s team used a library of 100 compounds across a range of dosages to perturb HeLa cells in a microscopy-based study. The study used 11 biomarkers as “broad readouts” whose location and concentration were analyzed within the treated cells over the course of 20 hours. This resulted in around 1 billion measurements, and about 1.5 terabytes of data.
The biggest bottleneck in the project, Altschuler said, “was going from the [microscopy] images to something that was computable.” Imaging data is “incredibly rich,” he said, but very difficult to process in a high-throughput manner. His team developed statistical methods to ensure that the images were “robust enough” to compare “compound profiles,” heat-map style visualizations that use the same red and green coding as microarray experiments to show whether biomarker concentrations increased or decreased in the presence of a drug.
Once the compound profiles are in hand, they can be clustered and mined just like any other biological data set to search for trends, Altschuler said. He added that one advantage of microscopy over microarrays or other experimental methods is that researchers can go back and extract more information from the original images “if they see something interesting in the data.”
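As a rough illustration of what comparing compound profiles might involve, the sketch below builds each profile as a log2 ratio of treated to control biomarker levels and compares profiles by Pearson correlation. Altschuler did not describe his team's statistical methods in this detail, so the log-ratio encoding, the correlation measure, the function names, and the numbers are all assumptions made for the example.

```python
import math


def compound_profile(treated, control):
    """Log2 ratio per biomarker: positive means increased, negative means decreased."""
    return [math.log2(t / c) for t, c in zip(treated, control)]


def pearson(a, b):
    """Pearson correlation between two equal-length profiles."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def most_similar(name, profiles):
    """Return the other compound whose profile correlates best with `name`."""
    others = (other for other in profiles if other != name)
    return max(others, key=lambda other: pearson(profiles[name], profiles[other]))


# Hypothetical biomarker readouts (arbitrary units), not data from the study.
control = [100.0, 200.0, 50.0, 80.0]
treated = {
    "drug_a": [200.0, 100.0, 50.0, 160.0],
    "drug_b": [400.0, 50.0, 50.0, 320.0],
    "drug_c": [50.0, 400.0, 100.0, 40.0],
}
profiles = {name: compound_profile(vals, control) for name, vals in treated.items()}

print(most_similar("drug_a", profiles))
```

In this toy data, drug_b shifts the same biomarkers in the same direction as drug_a, only more strongly, so the two would land in the same cluster.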
In an indirect indication of the significance of his work, Altschuler noted during his acknowledgements that “almost all of the lead authors on this have left for industry, so that must be a sign of something.”
Moving from the drug side of the equation to the disease side, several speakers discussed how they are applying computational systems biology to gain a better understanding of human disease mechanisms.
Vamsi Mootha, who recently moved from MIT to Harvard’s Systems Biology department, has built upon his earlier work in identifying the role of the OXPHOS family of genes in diabetes. Mootha said he had narrowed his search to PGC1-alpha, a co-activator of the OXPHOS genes, but because PGC1-alpha is active in a number of tissues, it is unsuitable as a potential drug target.
In order to “dissect the pathway further,” Mootha used a software program called motifADE (Motifs Associated with Differential Expression) that identifies all the potential upstream regulatory elements for each gene in a ranked set of differentially expressed genes. Those motifs that cluster together are likely to play an important role in the pathway, even if the genes don’t appear to be significantly differentially expressed, Mootha said.
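The idea of scoring a motif against a ranked gene list can be sketched as below. This is not the motifADE implementation, and the article does not say which statistic the program uses. The example assumes a normalized Mann-Whitney style score: the fraction of gene pairs, one with the motif and one without, in which the motif-bearing gene ranks higher. Gene and motif names are placeholders.

```python
def rank_sum_score(ranked_genes, motif_genes):
    """Score in [0, 1]; values near 1 mean motif genes sit near the top of the ranking.

    ranked_genes: gene names ordered from most to least differentially expressed.
    motif_genes: set of genes carrying the motif in their upstream region.
    """
    positions = [i for i, g in enumerate(ranked_genes) if g in motif_genes]
    n_with = len(positions)
    n_without = len(ranked_genes) - n_with
    if n_with == 0 or n_without == 0:
        return 0.5  # no comparison possible; treat as no enrichment

    u = 0
    for k, i in enumerate(positions):
        genes_below = len(ranked_genes) - 1 - i
        motif_genes_below = n_with - 1 - k
        # Count the non-motif genes that this motif gene outranks.
        u += genes_below - motif_genes_below
    return u / (n_with * n_without)


# Placeholder ranking and motif assignments, for illustration only.
ranking = ["g1", "g2", "g3", "g4", "g5", "g6", "g7", "g8"]
motif_x = {"g1", "g2", "g4"}
motif_y = {"g3", "g6", "g8"}

print(rank_sum_score(ranking, motif_x))
print(rank_sum_score(ranking, motif_y))
```

Because the score depends on where the whole gene set falls in the ranking, a motif can stand out even when no single gene carrying it passes a significance cutoff, which is the point Mootha made.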
Using this approach, Mootha identified two motifs that stood out: the binding sites for ERR-alpha and Gabpa, two genes that exhibit a “double-positive feedback loop” that appears to play a role in diabetes. Mootha said that his team has validated its hypothesis using an ERR-alpha inhibitor, and is seeking to perform further work with an ERR-alpha agonist.