Glasgow Team Takes a Page(Rank) from Google with New Gene-Prioritization Tool

A team of researchers has adapted Google's PageRank algorithm to fit the needs of microarray researchers.

Julie Morrison and colleagues at the University of Glasgow and the University of Strathclyde hit upon the algorithm as a handy way to prioritize lists of differentially expressed genes, and recently published the results of their work in BMC Bioinformatics (available at http://www.biomedcentral.com/1471-2105/6/233/).

PageRank, developed by Google founders Larry Page and Sergey Brin, ranks each web page in a list of search results based on how many other highly ranked web pages contain hyperlinks to it. Likewise, the modified version of the algorithm, which the authors have dubbed GeneRank, ranks a gene higher in a list if it is linked to other highly ranked genes.
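
The ranking idea described above can be illustrated with a minimal power-iteration sketch. It is not Google's production algorithm or the authors' code. The toy link graph, the function name `pagerank`, and the damping value of 0.85 are assumptions chosen for illustration only.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Rank pages by power iteration.

    links maps each page to the list of pages it links to.
    Every link target must also appear as a key (hypothetical toy format).
    """
    pages = sorted(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from a uniform ranking
    for _ in range(iterations):
        # every page receives a small baseline share regardless of links
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outgoing in links.items():
            if outgoing:
                # a page passes its rank on, split evenly over its links
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # a page with no links spreads its rank over all pages
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Toy web: pages "a", "b", and "d" all link to "c"; "c" links to "d".
toy_web = {"a": ["c"], "b": ["c"], "c": ["d"], "d": ["c"]}
ranks = pagerank(toy_web)
best = max(ranks, key=ranks.get)
```

In this toy graph, page "d" receives only one link, but that link comes from the top-ranked page "c", so "d" ends up well ahead of "a" and "b", which receive none.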

According to the developers, this approach addresses a common shortcoming in microarray analysis, in which genes are ranked solely by their change in expression. Ranking on expression change alone can overlook genes that play an important biological role but show low differential expression. "Our aim is to rank genes higher in the ordered list if they have little change in expression in the experiment, but are connected to other genes that are highly changed," Morrison explained in an e-mail to BioInform. "In normal prioritization methods, these genes would not appear to be significant within the experiment, although we believe they should be given some biological consideration."

As an example, the authors note in their paper, a gene with low differential expression could be a transcription factor that controls the expression of all genes connected to it. "The transcription factor itself may be 'activated' by the experimental treatment but not change its expression — but its target genes will. Hence, GeneRank should be able to highlight the transcription factor among the results."
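
The transcription factor scenario can be sketched with a PageRank-style iteration in which each gene's baseline share is proportional to its absolute expression change. This is an illustrative sketch, not the authors' Matlab code. The function name `generank_sketch`, the gene names, the expression values, and the damping value `d=0.5` are all hypothetical.

```python
def generank_sketch(neighbors, expr_change, d=0.5, iterations=200):
    """PageRank-style gene ranking with expression-based baseline weights.

    neighbors maps each gene to the genes it is linked to (undirected links
    are listed in both directions); expr_change maps each gene to its
    expression change. Both inputs are hypothetical toy structures.
    """
    genes = sorted(neighbors)
    total = sum(abs(expr_change[g]) for g in genes)
    # baseline weight: absolute expression change, normalized to sum to 1
    weight = {g: abs(expr_change[g]) / total for g in genes}
    rank = dict(weight)
    for _ in range(iterations):
        new_rank = {g: (1.0 - d) * weight[g] for g in genes}
        for g in genes:
            linked = neighbors[g]
            if not linked:
                continue  # an unlinked gene keeps only its baseline share
            share = d * rank[g] / len(linked)
            for other in linked:
                new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical network: TF is linked to three strongly changed targets;
# geneX has a moderate change but no links.
neighbors = {
    "TF": ["T1", "T2", "T3"],
    "T1": ["TF"],
    "T2": ["TF"],
    "T3": ["TF"],
    "geneX": [],
}
expr_change = {"TF": 0.1, "T1": 2.0, "T2": -1.8, "T3": 2.2, "geneX": 1.0}

ranks = generank_sketch(neighbors, expr_change)
by_expression = sorted(expr_change, key=lambda g: abs(expr_change[g]), reverse=True)
by_generank = sorted(ranks, key=ranks.get, reverse=True)
```

With these made-up numbers, `TF` is last when genes are ordered by expression change alone but first under the network-aware ranking, because all three of its neighbors changed strongly.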

Typical prioritization methods order genes based on their change in expression in a microarray experiment, but "the drawback here," Morrison said, "is that only the microarray experiment results are used to make conclusions, and it is understood that the results of these experiments are subject to a large amount of experimental variation."

The authors note that GeneRank results "are not designed to replace the actual expression measurements, but should be used alongside the results with additional biological knowledge." For example, a gene that is not ranked highly from the microarray results alone but is highly ranked by GeneRank "should be given further biological consideration."

GeneRank uses Gene Ontology annotations as the basis for assessing the "links" between genes, but Morrison said that any other shared annotation system would work. She said that the algorithm described in the BMC Bioinformatics paper "boils down to a simple matrix computation that is simple to implement," and that a Matlab file with the code is also available for download (http://www.biomedcentral.com/content/supplementary/1471-2105-6-233-S2.mat).
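
One way such a ranking reduces to a matrix computation is shown below, as a personalized-PageRank-style formulation rather than the paper's exact notation. All symbols here are assumptions: $W$ is a symmetric 0/1 matrix marking which genes share an annotation, $D$ is the diagonal matrix of each gene's link count, $ex$ is the vector of normalized absolute expression changes, $d \in (0, 1)$ is a damping parameter, and $r$ is the resulting ranking vector.

```latex
% Fixed-point form: each gene keeps a share of its own expression weight
% and receives the rest from the genes linked to it.
r = (1 - d)\, ex + d\, W D^{-1} r

% Rearranged into a single linear system that any solver can handle:
\left( I - d\, W D^{-1} \right) r = (1 - d)\, ex
```

Setting $d = 0$ returns the ordering by expression change alone, while values of $d$ closer to 1 give the network links more influence.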

Morrison said that the underlying principle of the approach is "likely to be accepted" by the microarray community, because "everybody has experienced how Google can successfully rank a huge collection of web pages."

There are a few possible drawbacks, however. Namely, "biological networks may not have the same topological structure as the web, and thus PageRanking in this context may behave differently," she noted. In addition, the Glasgow team used "absolute values of expression change in order to match the positiveness of the personalization weights." While it is "theoretically possible to use two-signed expression data to account for under- and over-expression," Morrison said, "this might be pushing the PageRank analogy too far."
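
The point about absolute values can be made concrete. PageRank's personalization weights act like a probability distribution, so they must be non-negative and sum to one, and signed fold changes do not satisfy that directly. The sketch below shows one simple conversion under that assumption. The function name `personalization_weights` and the gene values are hypothetical.

```python
def personalization_weights(log_fold_changes):
    """Turn signed expression changes into non-negative weights summing to 1.

    Taking absolute values treats under- and over-expression alike,
    so the direction of change is discarded.
    """
    magnitudes = {g: abs(v) for g, v in log_fold_changes.items()}
    total = sum(magnitudes.values())
    if total == 0:
        raise ValueError("all expression changes are zero; nothing to weight")
    return {g: m / total for g, m in magnitudes.items()}


# Hypothetical signed log fold changes: geneB is as strongly repressed
# as geneA is induced.
signed = {"geneA": 2.0, "geneB": -2.0, "geneC": 0.5, "geneD": -0.5}
weights = personalization_weights(signed)
```

Here `geneA` and `geneB` receive equal weight even though they moved in opposite directions, which is the information lost by using absolute values.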

Morrison said that the algorithm described in the paper is a proof of principle and that the researchers are hoping to get more feedback from the microarray community before taking further development steps. "Results from a wider range of datasets might shed more light on the strengths and weaknesses of the algorithm and could suggest possible improvements," she said.

By Bernadette Toner
