Glasgow Team Takes a Page(Rank) from Google with New Gene-Prioritization Tool

A team of researchers has adapted Google's PageRank algorithm to fit the needs of microarray researchers.

Julie Morrison and colleagues at the University of Glasgow and the University of Strathclyde hit upon the algorithm as a handy way to prioritize lists of differentially expressed genes, and recently published the results of their work in BMC Bioinformatics (available at http://www.biomedcentral.com/1471-2105/6/233/).

PageRank, developed by Google founders Larry Page and Sergey Brin, ranks each web page in a list of search results based on how many other highly ranked web pages contain hyperlinks to it. Likewise, the modified version of the algorithm, which the authors have dubbed GeneRank, ranks a gene higher in a list if it is linked to other highly ranked genes.

According to the developers, this approach addresses a common shortcoming in microarray analysis, in which genes are ranked solely by their change in expression. Ranking on expression change alone can overlook genes that play an important biological role but show low differential expression. "Our aim is to rank genes higher in the ordered list if they have little change in expression in the experiment, but are connected to other genes that are highly changed," Morrison explained in an e-mail to BioInform. "In normal prioritization methods, these genes would not appear to be significant within the experiment, although we believe they should be given some biological consideration."

As an example, the authors note in their paper, a gene with low differential expression could be a transcription factor that controls the expression of all genes connected to it. "The transcription factor itself may be 'activated' by the experimental treatment but not change its expression — but its target genes will. Hence, GeneRank should be able to highlight the transcription factor among the results."
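The transcription-factor scenario can be illustrated with a toy network. The sketch below is not the authors' code. It assumes a PageRank-style update in which each gene's score mixes its own normalized absolute expression change with score passed along from linked genes. The function name `rank_genes`, the gene labels, the expression values, and the damping value `d=0.5` are all hypothetical choices made for the illustration.

```python
def rank_genes(links, expr_change, d=0.5, n_iter=100):
    """Iterate a PageRank-style update over an undirected gene network.

    links: dict mapping each gene to the set of genes it is linked to
           (assumed symmetric: if A lists B, then B lists A)
    expr_change: dict mapping each gene to its expression change
                 (assumed to contain at least one nonzero value)
    d: hypothetical damping factor weighting network links vs. expression
    """
    # Use absolute changes, normalized to sum to 1, as the starting weights.
    total = sum(abs(v) for v in expr_change.values())
    ex = {g: abs(v) / total for g, v in expr_change.items()}
    score = dict(ex)
    for _ in range(n_iter):
        new_score = {}
        for g in ex:
            # Each linked gene shares its current score equally among its links.
            from_links = sum(score[n] / len(links[n]) for n in links.get(g, ()))
            new_score[g] = (1 - d) * ex[g] + d * from_links
        score = new_score
    return score


# Hypothetical data: "TF" does not change expression, but its targets do.
links = {
    "TF": {"T1", "T2", "T3"},
    "T1": {"TF"}, "T2": {"TF"}, "T3": {"TF"},
    "G4": {"G5"}, "G5": {"G4"},
}
expr_change = {"TF": 0.0, "T1": 2.0, "T2": -2.0, "T3": 2.0, "G4": 1.0, "G5": 0.5}

scores = rank_genes(links, expr_change)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In this made-up example "TF" would sit at the bottom of a list ordered by expression change alone, yet it comes out on top of the network-aware ranking because all three of its linked genes changed strongly.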

Typical prioritization methods order genes based on their change in expression in a microarray experiment, but "the drawback here," Morrison said, "is that only the microarray experiment results are used to make conclusions, and it is understood that the results of these experiments are subject to a large amount of experimental variation."

The authors note that GeneRank results "are not designed to replace the actual expression measurements, but should be used alongside the results with additional biological knowledge." For example, a gene that is not ranked highly from the microarray results alone but is highly ranked by GeneRank "should be given further biological consideration."

GeneRank uses Gene Ontology annotations as the basis for assessing the "links" between genes, but Morrison said that any other shared annotation system would work. She said that the algorithm described in the BMC Bioinformatics paper "boils down to a simple matrix computation that is simple to implement," and that a Matlab file with the code is also available for download (http://www.biomedcentral.com/content/supplementary/1471-2105-6-233-S2.mat).
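To give a sense of what a "simple matrix computation" of this kind might look like, here is a minimal sketch in Python with NumPy rather than Matlab. It is not the downloadable file mentioned above. It assumes the ranking vector r satisfies a personalized-PageRank-style linear system, (I - d W D^-1) r = (1 - d) ex, where W is a symmetric 0/1 matrix of links between genes, D holds each gene's number of links, and ex is the normalized absolute expression change. The function name `generank_solve`, the default `d=0.5`, and the normalization are assumptions, and the paper's exact formulation may differ.

```python
import numpy as np


def generank_solve(W, expr_change, d=0.5):
    """Solve (I - d * W * D^-1) r = (1 - d) * ex for the score vector r.

    W: symmetric adjacency matrix (1 where two genes share an annotation)
    expr_change: expression change per gene, at least one nonzero value
    d: hypothetical damping factor in [0, 1)
    """
    W = np.asarray(W, dtype=float)
    ex = np.abs(np.asarray(expr_change, dtype=float))
    ex = ex / ex.sum()          # normalize so the scores sum to 1
    deg = W.sum(axis=0)
    deg[deg == 0] = 1.0         # unlinked genes keep only their own expression term
    # Broadcasting divides column i of W by deg[i], which gives W * D^-1.
    A = np.eye(len(ex)) - d * W / deg
    return np.linalg.solve(A, (1 - d) * ex)


# Hypothetical three-gene network: gene 0 is linked to genes 1 and 2.
W = [[0, 1, 1],
     [1, 0, 0],
     [1, 0, 0]]
changes = [0.0, 3.0, 1.0]

scores = generank_solve(W, changes)
print(scores)
```

Setting `d=0` in this sketch reduces the scores to the normalized expression changes, so the parameter controls how much weight the links carry relative to the microarray measurements.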

Morrison said that the underlying principle of the approach is "likely to be accepted" by the microarray community, because "everybody has experienced how Google can successfully rank a huge collection of web pages."

There are a few possible drawbacks, however. Namely, "biological networks may not have the same topological structure as the web, and thus PageRanking in this context may behave differently," she noted. In addition, the Glasgow team used "absolute values of expression change in order to match the positiveness of the personalization weights." While it is "theoretically possible to use two-signed expression data to account for under- and over-expression," Morrison said, "this might be pushing the PageRank analogy too far."

Morrison said that the algorithm described in the paper is a proof of principle and that the researchers are hoping to get more feedback from the microarray community before taking further development steps. "Results from a wider range of datasets might shed more light on the strengths and weaknesses of the algorithm and could suggest possible improvements," she said.

By Bernadette Toner
