Stanford University researchers have developed an algorithm that uses publicly available information on genes and drugs to match existing medicines to new indications.
The team, led by Atul Butte, associate professor of systems medicine in pediatrics at Stanford and director of the Center for Pediatric Bioinformatics at the Lucile Packard Children's Hospital, designed the algorithm to mine public gene expression databases for drug/disease pairs that could have a therapeutic future together.
They described the results of their work in two papers published this week in Science Translational Medicine — one that outlines the computational approach and its validation and another that demonstrates the use of the algorithm to identify potential new therapies for inflammatory bowel disease.
In addition to suggesting new uses for old drugs, the Stanford algorithm also confirmed known therapeutic uses for some of the medicines, further validating the approach as a useful tool for suggesting new treatments. For instance, it predicted that prednisolone could treat inflammatory bowel disease, one of the conditions for which the drug is currently indicated.
The Stanford team has created a spinoff company called NuMedii that is looking into commercial options for the algorithm.
For the studies, the researchers first gathered gene expression data on 100 diseases from the Gene Expression Omnibus and 164 drugs from the Connectivity Map, a collection of gene expression data from human cell lines treated with therapeutic compounds that is maintained by the Broad Institute. They then developed an algorithm to search through all possible disease/drug combinations to pair those whose gene expression patterns cancelled each other out. In the case of diseases that increase the activity of certain genes, for example, the program looks for drugs with expression patterns that exhibit decreased activity of those genes.
As the team explains in its methods paper, "if a disease state is signified by a specific set of genome-wide expression changes, and if exposure to a particular drug causes the reverse set of changes in a model cell line … then that drug has the potential to have a therapeutic effect on that disease."
The algorithm groups diseases by how they change gene activity rather than by affected tissues, and then pairs them off against drugs whose gene-activity effects oppose those changes.
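The matching idea described above can be illustrated with a minimal sketch. It is not the Stanford team's code, which has not been published; the gene names, signatures, and the `reversal_fraction` measure are all hypothetical and chosen only to show how every disease can be checked against every drug for opposing expression changes.

```python
from itertools import product

# Hypothetical toy signatures: gene -> direction of change (+1 up, -1 down).
disease_signatures = {
    "disease_A": {"GENE1": +1, "GENE2": +1, "GENE3": -1},
    "disease_B": {"GENE1": -1, "GENE4": +1},
}
drug_signatures = {
    "drug_X": {"GENE1": -1, "GENE2": -1, "GENE3": +1},
    "drug_Y": {"GENE1": +1, "GENE2": +1, "GENE4": +1},
}


def reversal_fraction(disease, drug):
    """Fraction of genes measured in both signatures whose change the drug reverses."""
    shared = disease.keys() & drug.keys()
    if not shared:
        return 0.0
    reversed_count = sum(1 for gene in shared if disease[gene] == -drug[gene])
    return reversed_count / len(shared)


def rank_pairs(diseases, drugs):
    """Score every disease/drug combination, strongest reversal first."""
    pairs = [
        (disease_name, drug_name, reversal_fraction(disease_sig, drug_sig))
        for (disease_name, disease_sig), (drug_name, drug_sig)
        in product(diseases.items(), drugs.items())
    ]
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)


ranked = rank_pairs(disease_signatures, drug_signatures)
for disease_name, drug_name, score in ranked:
    print(f"{disease_name} vs {drug_name}: {score:.2f}")
```

In this toy data, drug_X reverses every change seen in disease_A, so that pair rises to the top of the list. The published method works on genome-wide rankings rather than simple up/down flags.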
Among other predictions, the algorithm indicated that cimetidine, an ulcer drug, could treat lung cancer, a prediction that the researchers then confirmed experimentally in a mouse model.
The team also predicted that topiramate, an anti-seizure drug, could treat inflammatory bowel disorders and validated that prediction in vivo as well.
Meanwhile, the researchers noticed during their analysis that diseases with similar molecular processes, such as those that affect the immune system, clustered together, as did drugs that had similar biological effects, such as slowing down cell division.
The researchers believe that studying the unexpected members of these clusters could provide insights into how some diseases progress and how some drugs work at the molecular level.
The Drug-Disease Intersection
The rationale for this work dates back to the early 2000s, Butte said, when he began assembling a systematic way to mine the wealth of underused data in public databases for drug-disease interactions.
"I was wondering: can we predict these intersections, instead of stumbling across them?" he said in a statement.
Butte told BioInform that about four years ago, his team began to work on "taking all the different drugs that have been tested against all the different diseases we had collected and just seeing if there were these types of matches where the drug was having the opposite effect of the disease."
At the time they began their work, he continued, there were only about 100 diseases in the GEO repository and about 130 drugs in the Connectivity Map, so the team "ran with all of them" for the analysis.
At present, there are more than 300 diseases in the repository, and the Connectivity Map contains more than 7,000 expression profiles representing 1,309 compounds.
Additionally, Butte said, there are 750,000 measurements of activity levels of every gene in a cell or tissue stored in publicly available databases, a nearly 30-fold increase since 2004.
Despite this wealth of available information, "for various reasons, very few scientists use the data that's already entered. So we thought we'd make use of this info," he said.
One reason he believes researchers have been reluctant to use the freely available information is a perception that because it’s free, it's valueless. "People think it can't be great data — that there must be quality problems," he said, but the fact is that "people that are depositing the data are usually in the best institutions around the country and the world."
A second issue is that bioinformaticians are often focused on tool development as opposed to translational work, which means they may miss opportunities to discover new connections in publicly available data.
"It's rare for bioinformaticians to start, ask, and answer a biomedical question," Butte said. "Too often in the field, we have people developing tools but not necessarily using those tools on the public data and actually testing an idea/hypothesis and taking it all the way through to animal models ... [and] making use of it, especially in the translational kind of way for healthcare like what we are showing here."
Finding New Pairs
The researchers mined gene expression data from GEO and the Connectivity Map to build a "compendium of disease-drug relationship predictions based on matching genome-wide signatures of disease pathophysiology and drug effect."
Rather than "examining a single-drug disease pair" or "looking at reactions of a large set of drugs on a single disease," the researchers wrote, "we focused on discovering connections between drugs and diseases across all the available gene measurements."
Next, to discover potential "therapeutic indications," the researchers "statistically compared each of the disease signatures to each reference drug expression signature" using a "nonparametric, rank-based, pattern-matching strategy."
They calculated a similarity score for the up- or down-regulated genes for each drug-disease pair and generated a ranked list of potential treatments for each disease, in which +1 indicated a "perfect correlation of signatures" and -1 indicated "exactly opposite signatures."
The closer the score is to -1, the more likely that the drug is a potential therapy for the particular disease it is paired with, Butte said.
"If things were perfectly correlated, [and] the drug seems to have the same effect as the process of getting the disease, they'd be on the positive one" side of the spectrum. On the other hand, "if they are exactly negatively correlated, they will be closer to minus one," he said.
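To make the -1 to +1 scale concrete, here is a small sketch that uses Spearman rank correlation as a stand-in. This is an assumption made for illustration: the article says only that the team used a nonparametric, rank-based strategy, and the exact statistic may differ. The expression values and drug names below are invented.

```python
import math


def average_ranks(values):
    """Rank values from 1 (lowest) upward, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return ranks


def signature_similarity(disease_changes, drug_changes):
    """Spearman correlation of two signatures measured over the same genes."""
    if len(disease_changes) != len(drug_changes):
        raise ValueError("signatures must cover the same genes")
    x = average_ranks(disease_changes)
    y = average_ranks(drug_changes)
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a signature with no variation cannot be ranked")
    return covariance / math.sqrt(var_x * var_y)


# Hypothetical expression changes for four genes, in the same gene order.
disease = [2.0, 1.5, -0.5, -1.8]
drugs = {
    "drug_same": [1.8, 1.2, -0.2, -2.0],
    "drug_mixed": [0.5, -1.0, 1.0, -0.5],
    "drug_opposite": [-1.9, -1.0, 0.3, 2.2],
}

# Most negative score first: the strongest candidate therapies lead the list.
candidates = sorted(
    ((name, signature_similarity(disease, changes)) for name, changes in drugs.items()),
    key=lambda item: item[1],
)
for name, score in candidates:
    print(f"{name}: {score:+.2f}")
```

Sorting in ascending order puts the drug whose signature most closely opposes the disease at the head of the ranked list, mirroring how a score near -1 flags a potential therapy.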
Furthermore, to check the significance of the predictions, Butte's team used a permutation approach that generated random drug signatures and repeated the analysis 100 times for each disease. Then they calculated the false discovery rate for each drug-disease similarity score.
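A rough sketch of this kind of significance check follows. It makes several assumptions the article does not confirm: random drug signatures are produced by shuffling the real one, a plain Pearson correlation stands in for the similarity score, and the Benjamini-Hochberg procedure is used to control the false discovery rate. All values are invented.

```python
import math
import random


def correlation(x, y):
    """Pearson correlation, used here as a stand-in similarity score."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / math.sqrt(var_x * var_y)


def permutation_p_value(disease, drug, n_permutations=100, seed=0):
    """Share of shuffled drug signatures scoring at least as negative as the real one."""
    rng = random.Random(seed)
    observed = correlation(disease, drug)
    shuffled = list(drug)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if correlation(disease, shuffled) <= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


def benjamini_hochberg(p_values):
    """Convert p-values to FDR-adjusted q-values (assumed procedure)."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    q_values = [0.0] * n
    running_min = 1.0
    for rank in range(n, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * n / rank)
        q_values[index] = running_min
    return q_values


# Hypothetical disease signature and a drug that exactly reverses it.
disease = [2.1, 1.7, 1.2, 0.4, -0.3, -0.9, -1.5, -2.2]
reversing_drug = [-value for value in disease]

p_value = permutation_p_value(disease, reversing_drug)
q_values = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
print(f"permutation p-value: {p_value:.3f}")
print("q-values:", [round(q, 4) for q in q_values])
```

With only 100 permutations the smallest attainable p-value is about 0.01, which is one reason a correction across all drug-disease pairs matters when thousands of scores are tested at once.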
Out of roughly 16,000 possible drug-disease pairings (100 diseases against 164 drugs) scored by their algorithm, 2,664 were deemed statistically significant, and more than half of those suggested a possible "therapeutic relationship."
On the whole, the method "provided significant candidate therapeutic drug-disease relationships" for 53 out of 100 diseases with the 164 drugs used in the study, the team said.
The remaining 47 diseases could not be "significantly associated with any of the treatments" and were excluded.
Among other results, the method predicted that a group of treatments known as histone deacetylase inhibitors would be "efficacious for the largest set of diseases."
For instance, vorinostat, an HDAC inhibitor marketed under the brand name Zolinza by Merck and used to treat cutaneous T cell lymphoma, had associations with 21 of the 100 diseases in the dataset.
Additionally, trichostatin A, an HDAC inhibitor and antifungal antibiotic, and gefitinib, an epidermal growth factor receptor inhibitor marketed under the brand name Iressa by AstraZeneca and Teva to treat some kinds of cancer, were found to have more than 16 and 20 "significant predicted therapeutic indications," respectively.
Furthermore, the authors found that several cancers, including transitional cell carcinoma and melanoma, showed the "highest number of significant matches to therapies," with 95 and 79 indications respectively.
From In Silico to In Vivo
Butte's team chose to investigate two of the predicted drug-disease combinations experimentally: cimetidine, an anti-ulcer medicine that was predicted to treat lung cancer; and topiramate, an anticonvulsant, which the algorithm matched to IBD.
The researchers tested cimetidine, sold over the counter by GlaxoSmithKline under the brand name Tagamet, on human tumor xenograft models of non-small cell lung carcinoma implanted in mice.
Results showed that the treatment successfully inhibited tumor cell growth compared to a control group that did not receive the treatment.
To test the topiramate-IBD match-up, the researchers used rats exhibiting bowel disease symptoms such as diarrhea and inflammation.
Topiramate was once marketed under the brand name Topamax by Ortho-McNeil Neurologics for treating psychiatric disorders and is currently being marketed by Janssen Pharmaceuticals for treating migraine headaches.
The researchers noted that they selected these drugs for their experimental analysis because the US Food and Drug Administration regards them as generally safe in humans and because they are already available for clinical use.
Butte added that the selection was based on "convenience" as well as "story interest."
"Finding an anti-cancer use for an off-patent, over-the-counter drug, I think, made for an interesting story from our perspective," he said. "And same for the seizure drug that is also off-patent right now."
The investigators noted that current treatments for IBDs like Crohn's disease and ulcerative colitis (UC) are "aimed toward controlling symptoms, promoting remission, and preventing relapse."
However, treatments such as corticosteroids and systemic anti-inflammatory medications have "severe side effects," while surgical removal of affected regions of the small or large intestine is both expensive and invasive.
The algorithm gave topiramate a higher score than prednisolone, a commonly used therapy for Crohn's disease, and the drug was also "one of the strongest predicted therapies for UC, indicating it could potentially treat both conditions."
The researchers reported that topiramate reduced damage to colon tissue in the rat model.
The researchers are planning further studies around the proposed new uses for these treatments, including finding ways to "dose them for human studies" as well as testing pairings of other off-patent drugs and early-stage compounds.
Despite the promise of the findings for cimetidine and topiramate, the results are preliminary and Butte and his colleagues do not advocate self-treatment using drugs originally intended for other purposes. Clinical trials are still needed to demonstrate the efficacy of these repurposed drugs in humans.
"In theory, having [the drugs] already FDA approved, an investigator could launch a trial today if they had sufficient funding," Butte said. "They would have to file with the FDA, but these drugs have known safety profiles."
However, "the reality for us is some of these trials — especially cancer ones — might be expensive trials, so trying to figure out how to capture enough value in this molecule that is already available in generic form ... to fund the clinical trial is more of a business model challenge than a technical one," he said.
The researchers are also pursuing commercial avenues for their algorithm.
"One of the things we are realizing is that pharmaceutical companies actually do have this challenge of where to test their drugs initially in clinical trials while trying to avoid ... failure," Butte said, adding that the key goal of NuMedii will be "to pursue the commercial side of this for those early-stage molecules as well."
Additionally, the researchers are looking to repurpose some of these medicines for orphan or rare diseases.
The team isn't planning to publish its source code but is willing to work with researchers looking to try out the algorithm on their private datasets.
Other Roads to Drug Repurposing
Butte's team isn't the only one attempting to reroute medications in new directions.
Earlier this year, an international group of investigators published an in silico method that uses a drug's interactions with proteins both to predict the possibility of an adverse event and to identify potential new uses for existing treatments.
Unlike the Stanford method, the Drug Repositioning Potential and Adverse Drug Reaction via the Chemical-Protein Interactome, or DRAR-CPI, web server makes its predictions by calculating "the binding energy" between the drug molecule and its protein target and provides an "interaction profile of the [query] drug towards all targets in [a] database" comprising 385 structural models of human proteins and 254 small drug molecules (BI 6/3/11).
That approach is moving forward in "a great direction," Butte said. However, "I think we are limited right now in terms of how many protein structures we have across the genome and our ability to dock them with the molecules."
He believes genome-wide approaches will be a better technique for finding unexpected connections. "As a physician, it's sort of intuitive to me that when we give drugs to treat diseases, in the end we are actually treating a pattern of symptoms, for example, or we are treating a pattern of what we see under the microscope, in pathology," he said. "To me, microarrays and the gene signatures [are] the next level in terms of a quantitative molecular pathology. If I can tell across these ten or twenty thousand genes in a genome how much they are changing [and] if I saw similarities that match, then I could take those further."
On the commercial front, SOM Biotech, a privately held, Barcelona, Spain-based company founded in 2009, has built its business on drug repositioning using a proprietary in silico solution that identifies new biological activities for drugs.
According to its website, SOM makes its money by discovering, confirming, patenting, and licensing the application of already known drugs for their development and commercial use in unknown indications. A repositioning program takes between nine and 16 months.
The company's business model is based on signing licensing and joint-venture agreements for the development of reprofiled drugs.
Other firms have attempted to capitalize on genomic-based drug repositioning in the past. Gene Logic launched such an effort in the mid-2000s and in 2007 sold off all its genomic assets to Ocimum Biosolutions in order to focus on drug repositioning, changing its name to Ore Pharmaceuticals.
In 2009, the company changed its name to Ore Holdings and reorganized as a "pharmaceutical asset management company" with the aim of outlicensing four compounds in its development portfolio that it had identified as potential repositioning candidates.
As of March 31, 2011, Ore was still seeking licensees for the four compounds.