Researchers at the University of Saskatchewan have developed a pipeline for analyzing data from peptide kinase arrays that could have applications in the drug development space.
The pipeline, dubbed Platform for Intelligent, Integrated Kinome Analysis, or PIIKA, and its application using bovine array data are described in a paper published in April in Science Signaling.
The software addresses an unmet need for tools to make sense of data from kinases, which have "a central role in controlling cellular processes and are associated with many diseases," Scott Napper, an associate professor of biochemistry at U of S and one of the co-authors on the Science Signaling paper, explained in a statement.
As a result of their activity, kinases are "logical points for understanding biology and represent important treatment targets," he said. The enzymes are of particular interest to the pharmaceutical industry. It is estimated that kinases comprise about a quarter of the druggable genome and that they account for between 20 percent and 30 percent of all targets screened in pharmaceutical firms.
Microarrays are an emerging lab tool in kinase research because they allow researchers to analyze many different kinases within a sample simultaneously. To date, researchers using these arrays have applied software packages developed to interpret gene-expression data, but the U of S researchers note that "the distinct biological nature of kinome data motivates questioning the use of the same systematic approaches as are used for gene expression analysis."
In response, the U of S researchers developed PIIKA specifically to handle data from kinase microarrays. As the paper shows, "we are able to get more data, and with more accuracy" than methods developed for cDNA arrays, Tony Kusalik, director of the bioinformatics and computational biology research lab at U of S and a paper co-author, said in a statement.
Kusalik told BioInform that in published literature on kinase arrays that he and his colleagues examined, the researchers "simply picked up software that was designed for gene-expression microarrays and ... if there were parameters that were tunable or settable, they didn’t bother, they just left them as you would use them [for cDNA arrays]."
Computational tools used to analyze data from DNA microarrays don’t work well with kinase data due to differences in the density of gene-expression microarrays and peptide arrays, Napper, who is also a senior scientist at the Vaccine and Infectious Disease Organization at U of S, told BioInform.
"Most of the gene-expression microarrays have tens of thousands of data points whereas with the peptide arrays ... you are only looking at a few hundred," he explained. "What this means is that you have to apply different levels of stringency ... to make sure you are not excluding data that might actually mean something."
In terms of biological activity, "when you look at the magnitude of changes that can occur with gene expression, they can be really quite extreme, whereas when you are looking at changes on the level of kinase activity, it's much more subtle," he explained.
As a result, "when you are applying these types of statistical tests to determine what represents technical variability and biological variability, it's really important to keep in mind those differences between gene-expression and kinome arrays," he said.
PIIKA, implemented in R, uses a set of statistical steps to address "problems of variability that exist among technical and biological replicates," the paper noted. While this overall approach is also used to analyze data from gene-expression microarrays, the U of S team put things together a little differently, Kusalik explained to BioInform.
For one thing, the developers used different statistical thresholds so that fewer data points would be eliminated in the analysis process, he said.
"When you are working with nucleotide microarray data, usually you have 30,000 or more probes or spots ... so if you set your threshold so that you end up with some false negatives and throw out some data points, that's not so bad," he said. However, peptide arrays have only about 300 spots. "We want to try to keep as many as we can and only toss them if we have to."
As a first step, PIIKA uses the variance stabilization and normalization, or VSN, algorithm to "transform" the kinase data so that it "more closely approximates a normal distribution." This step also retains more information than the more commonly used logarithmic transformation, which results in lost data.
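To see why the choice of transformation matters, consider a minimal Python sketch. It is not PIIKA's code, and the intensity values and the `scale` parameter are made up for illustration. VSN is built around an inverse hyperbolic sine (arsinh) transformation whose calibration parameters are estimated from the data; the sketch skips that estimation and simply applies arsinh with a fixed, hypothetical scale to show that, unlike a logarithm, it is defined for zero and negative background-corrected readings.

```python
import math


def log2_or_none(value):
    """Return log2(value), or None when the logarithm is undefined."""
    return math.log2(value) if value > 0 else None


def arsinh_transform(value, scale=1.0):
    """Inverse hyperbolic sine of a scaled intensity.

    Defined for every real number, so negative readings are kept.
    `scale` is a hypothetical fixed constant; real VSN fits its
    calibration parameters to the array data instead.
    """
    return math.asinh(value / scale)


# Made-up background-corrected spot intensities, some of them negative.
intensities = [-120.0, -15.5, 0.0, 42.0, 310.0, 2750.0]

log_values = [log2_or_none(v) for v in intensities]
arsinh_values = [arsinh_transform(v, scale=50.0) for v in intensities]

# Count the spots that a plain log transform cannot represent.
lost = sum(v is None for v in log_values)

for raw, lg, ar in zip(intensities, log_values, arsinh_values):
    print(f"{raw:>8.1f}  log2={lg}  arsinh={ar:.3f}")
print(f"spots lost to the log transform: {lost} of {len(intensities)}")
```

In this toy set, half of the spots have no logarithm, while the arsinh version keeps every spot and preserves their order.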
"The background intensity" of kinase array data "is very strong," Kusalik explained. "What can happen is that one quarter to one third of the data values are actually negative numbers, ...[but] you can't take the logarithm of a negative number so you end up having to assign those values to zero or to throw them away or something like that. It's not a very good situation because you've only got 300 spots on your array to begin with."
Next, PIIKA applies a chi-squared test to replicates on the array in order to filter out spots that have too much variability, Kusalik said. In the case of multiple arrays, PIIKA runs an F-test in order to determine the variability of particular spots across several arrays.
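The replicate filter can be sketched in Python along the following lines. The article does not spell out how PIIKA sets up its chi-squared test, so this version rests on stated assumptions: each spot's replicate variance is compared with the mean variance across all spots, the threshold `alpha` is an arbitrary choice, and the peptide names and values are invented. The helper `chi2_sf` computes the upper tail of the chi-squared distribution using only the standard library.

```python
import math
from statistics import mean, variance


def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variable with integer df."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    half = df / 2.0
    # Closed-form starting points: df = 2 (even) or df = 1 (odd).
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    # Recurrence Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1).
    while a < half:
        q += y ** a * math.exp(-y) / math.gamma(a + 1)
        a += 1.0
    return min(q, 1.0)


def flag_variable_spots(replicates, alpha=0.01):
    """Return the names of spots whose replicates vary too much.

    Assumption: the reference variance is the mean of the per-spot
    variances. This is an illustrative choice, not PIIKA's documented one.
    """
    spot_var = {name: variance(vals) for name, vals in replicates.items()}
    reference_var = mean(list(spot_var.values()))
    flagged = []
    for name, vals in replicates.items():
        df = len(vals) - 1
        statistic = df * spot_var[name] / reference_var
        if chi2_sf(statistic, df) < alpha:
            flagged.append(name)
    return flagged


# Invented transformed intensities: seven consistent spots, one noisy spot.
centers = [1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5]
replicates = {
    f"peptide_{i}": [c, c + 0.1, c - 0.1] for i, c in enumerate(centers, 1)
}
replicates["noisy_peptide"] = [1.0, 2.0, 3.0]

flagged = flag_variable_spots(replicates)
print("spots filtered out:", flagged)
```

A strict `alpha` such as 0.01 reflects the developers' stated preference for discarding spots only when the evidence of inconsistency is strong.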
Finally, Kusalik said, PIIKA can use a t-test to look for differences in phosphorylation for a given spot on the array between a treatment and a control case, or to compare two treatments. It can also perform hierarchical clustering, generate heat maps, or run principal component analysis on the data to discover, for example, which spots are consistently phosphorylated across a number of individuals or which treatments give a consistent phosphorylation pattern across a number of spots.
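The treatment-versus-control comparison for a single spot might look like the following Python sketch. The article does not say which form of the t-test PIIKA uses, so a paired test on matched replicates is an assumption here, and the intensity values are invented. The two-sided p-value is obtained by numerically integrating the Student's t density with Simpson's rule, which avoids any dependency beyond the standard library.

```python
import math
from statistics import mean, variance


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (
        math.sqrt(df * math.pi) * math.gamma(df / 2)
    )
    return coeff * (1 + t * t / df) ** (-(df + 1) / 2)


def two_sided_p(t_stat, df, steps=2000):
    """Two-sided p-value via Simpson's rule (steps must be even)."""
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3  # probability between 0 and |t|
    return max(0.0, 1.0 - 2.0 * central_area)


def paired_t_test(treated, control):
    """Paired t-test on matched replicates of one spot (an assumed design)."""
    diffs = [a - b for a, b in zip(treated, control)]
    n = len(diffs)
    t_stat = mean(diffs) / math.sqrt(variance(diffs) / n)
    return t_stat, two_sided_p(t_stat, n - 1)


# Invented transformed intensities for one peptide spot.
treated = [2.9, 3.1, 3.4, 3.0]
control = [2.1, 2.0, 2.6, 2.3]

t_stat, p_value = paired_t_test(treated, control)
print(f"t = {t_stat:.2f}, two-sided p = {p_value:.4f}")
```

Running the same comparison for every spot that survives the variability filters would yield a list of peptides whose phosphorylation differs between the two conditions.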
The researchers have filed a patent for their approach. Meanwhile, U of S is holding discussions with an unnamed company that has an option to license the software for commercial purposes, although it isn't clear what the outcome of those negotiations will be, Kusalik said.
Academic researchers, for their part, have access to a free version of the software, which is available here.
Napper expects PIIKA to be of assistance to researchers in pharmaceutical companies who are developing kinase inhibitors.
These companies can use kinase arrays to determine the specificity of a potential treatment so that "ideally you are only taking out ... a limited number of kinases." The approach can also be used to understand the "secondary effects" of the therapy, he said.
Kinase arrays "allow you to carefully monitor kinase activity in the presence of, say, a kinase inhibitor or even other kinds of therapeutics," he explained. "So it's going to give you a much better insight into predicting the efficacy of treatments, but also predicting possible side effects."
Kinase arrays could also be of use in personalized medicine, Napper said.
"Certain kinase inhibitors might be more effective for particular diseases or in particular individuals, so by applying these arrays in advance, you might be able to customize someone's drug delivery program for them," he explained.