An international team of researchers has developed a method for building models of disease mechanisms that it claims can be used to identify diagnostic markers and personalize treatments for patients.
In a paper published in December in PLoS Computational Biology, the group described how they used text mining and manual curation to create a 51-node gene regulatory network to study the roles of two types of T helper cells in allergic disorders. They combined network analysis algorithms, in silico knockouts, and gene expression microarray analysis to determine that the two cell types — Th1 and Th2 cells — do not counter-regulate each other, as previously thought.
Mikael Benson, a researcher at the Unit for Clinical Systems Biology at Sweden's Sahlgrenska Academy and a co-author of the paper, said in a statement that the method could lead to "major savings in both time and money" because it could yield "quicker and better-designed experiments" that would in turn produce "new knowledge about diagnostic markers or medicines."
Benson told BioInform that he and his colleagues decided to study allergy because it is among the simplest of the complex disorders. "You know the external cause ... pollen, and you know the key cell, which is a Th2 cell, and you know that that cell releases Th2 cytokines," he said.
The researchers relied on automated text-mining of 18 million Medline articles to build a model of genes involved in the differentiation of Th cells into Th1 and Th2 cells.
Benson explained that the scientists first identified about 400 genes associated with Th cells and then manually curated the list to pare it down to 51 genes used to create the model.
While several of the study's authors had previously created a gene network for the same system, it was based on 17 genes, which the authors deemed "a relatively small, though relevant, number of genes and interactions." Furthermore, they added, the previous model "did not include comparisons with biological data."
Since that model was developed, Benson and colleagues noted, new algorithms have made it possible to analyze more complex regulatory networks. Specifically, they used an algorithm based on Boolean satisfiability, or SAT, developed by researchers at the Royal Institute of Technology in Stockholm "to perform in silico studies based on a more comprehensive gene network model, which included a larger number of genes."
The researchers used this approach to explore the dynamics of the model and ultimately identified four reaction patterns. Three of them were compatible with the hypothesis that Th1 and Th2 cells counter-regulate each other, and one was not.
Next, they performed single-gene in silico knockout experiments for every gene in the network to monitor how the four reaction patterns behaved, and found that the results did not support the counter-regulation hypothesis.
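To make the idea of reaction patterns and in silico knockouts concrete, here is a minimal sketch on a toy Boolean network. Everything in it is an assumption for illustration: the three nodes X, Y, and Z and their rules are hypothetical and are not taken from the 51-gene model, and the search simply enumerates every state by brute force rather than using the SAT-based algorithm the researchers describe. A "reaction pattern" is read here as an attractor, meaning a state or cycle of states the network settles into.

```python
from itertools import product

# Hypothetical toy network (not from the paper): X and Y inhibit each other,
# and Z switches on when either X or Y is on.
NODES = ("X", "Y", "Z")
RULES = {
    "X": lambda s: not s["Y"],
    "Y": lambda s: not s["X"],
    "Z": lambda s: s["X"] or s["Y"],
}

def step(state, knocked_out=None):
    """Update all nodes synchronously; a knocked-out node is held at False."""
    s = dict(zip(NODES, state))
    return tuple(False if n == knocked_out else bool(RULES[n](s)) for n in NODES)

def attractors(knocked_out=None):
    """Follow every start state until it revisits a state; collect the cycles."""
    found = set()
    for start in product((False, True), repeat=len(NODES)):
        # Apply the knockout to the starting state as well.
        state = tuple(False if n == knocked_out else v for n, v in zip(NODES, start))
        seen = []
        while state not in seen:
            seen.append(state)
            state = step(state, knocked_out)
        # The attractor is the part of the trajectory from the first repeat onward.
        found.add(frozenset(seen[seen.index(state):]))
    return found

if __name__ == "__main__":
    print("wild type:", len(attractors()), "attractor(s)")
    for gene in NODES:
        print(f"knockout {gene}:", len(attractors(gene)), "attractor(s)")
```

Comparing the attractors of the intact network with those of each single-node knockout is the toy analogue of the experiment described above. Brute-force enumeration only works at this scale because the network has 2^3 = 8 states.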
Finally, to test whether these four types of responses were consistent with records of immunological diseases, the team compared its findings to gene expression microarray data from patients with allergic disorders stored in the Gene Expression Omnibus. Benson said they found that "the reaction patterns in that model corresponded to what was found in different inflammatory diseases."
He said that the approach captures the inherent complexity of living systems, and even though the model only used 51 genes, "the same methods could be used ... [on] much larger numbers of genes."
As the team notes in the paper, one of the challenges of using network modeling is that the number of possible states grows exponentially as the number of nodes — or genes — increases.
As an example, they wrote that storing the transition states of a 40-node network would typically require 6 terabytes, while their experiment with 50 genes would have required 7 petabytes of storage had they not used the SAT-based algorithm, which is based on methods used to model very-large-scale integrated circuits.
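The scale of that growth can be checked with a short back-of-the-envelope calculation. The sketch below assumes a Boolean network in which each of the 2^n states stores its successor state as n bits; the article does not say how the authors arrived at their figures, so this storage layout and the function name are assumptions.

```python
def transition_table_bytes(n_nodes: int) -> float:
    """Bytes to store one n-bit successor state for each of the 2**n network states.

    Assumed layout for illustration only; not necessarily the authors' calculation.
    """
    return (2 ** n_nodes) * n_nodes / 8

if __name__ == "__main__":
    for n in (40, 50):
        size = transition_table_bytes(n)
        print(f"{n} nodes: {size / 1e12:,.1f} TB ({size / 1e15:.2f} PB)")
```

Under this assumption the estimate comes to about 5.5 terabytes for 40 nodes and about 7.0 petabytes for 50 nodes, in line with the figures quoted above. Adding 10 nodes multiplies the number of states by 2^10 = 1,024.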
Although the models are useful tools, Benson observed that a possible limitation of the method is that clinical data stored in public databases "is not specifically designed for the problem that you [are studying]." As a result, while a researcher might be able to decrease the need for costly experimental trials, "you cannot [use] this method instead of a clinical study," he said.
Benson said that he and his colleagues plan to use the model "to identify diagnostic markers so that we can personalize medication that we're testing in clinical studies of allergy patients."
He plans to further develop the model using time series data from patients with allergies and to compare the findings to data from healthy controls to see if "the dynamics of the model is different in patients compared to controls" as well as whether "the corresponding protein can be used diagnostically."
While his team has no plans to use the model for other disease types, Benson hopes other research groups will. "I think that the analytical principles are highly applicable to other diseases ... there is a fantastic amount of data in the public domain, like in Medline and high-throughput data in repositories."