A recent study from a team of researchers at the Virginia Bioinformatics Institute and Virginia Tech suggests that pathway informatics tools have a long way to go before they meet the research needs of end users.
Pathway informatics is a "very hot area" of development in bioinformatics, but the tools available to most biologists are "not all the way there yet," Karen Duca, a VBI professor and co-author on the paper, told BioInform.
Duca and her colleagues, Purvi Saraiya and Chris North, a human-computer interaction expert at Virginia Tech, conducted the study as a follow-on to a paper they published last year that assessed the usability of several microarray analysis packages [BioInform 01-17-05].
Many researchers now want to visualize their microarray data within the context of sub-cellular pathways, so Duca and her team began exploring pathway visualization packages. "What people really want is to get their high-throughput data in some kind of a mechanistic framework so they can really see how whole activities and events are regulated," she said. "What people want is an overview of what pathways are changing."
However, Duca said, visualizing expression data within a pathway-based framework presents a steep learning curve for many users, who question the biological value of current tools. "It's a big step away from using a heat map to being able to map the data onto these diagrams. They're too static, and they're incomplete, and we're looking at ways to make them more dynamic," she said.
In the study, "Visualizing biological pathways: requirements analysis, systems evaluation and research agenda," published in the Autumn 2005 issue of Information Visualization, Duca and her colleagues used a series of interviews, focus groups, and questionnaires to identify 13 key requirements for pathway analysis tools, which they grouped into three main categories: pathway assembly, information overlay, and pathway analysis.
Then, six scientists evaluated a handful of specific tools (GenMAPP, Cytoscape, GScope, PathwayAssist, Patika, and BioCarta) to determine how well they met those requirements. The authors acknowledged that the systems chosen for evaluation represent only a fraction of available pathway-analysis tools, and noted that the six tools were chosen "based on availability, popularity in the bioinformatics community, and visualization and data-analysis capabilities."
The most "serious shortcoming" of the tools that were evaluated "is that they do not provide adequate domain-specific biological context, and users must perform many tedious tasks to search for and extract relevant information," the authors wrote. "Unless the tools provide users with rapid biologically relevant insight that relates the data to the underlying biological meanings (e.g., to phenotype), most life scientists will be reluctant to use them." (See below for a list of the top five "unmet" requirements).
The paper highlighted a gap between the "cartoon" representations of canonical pathways that many biologists are comfortable with and the "ball-and-stick" visualizations derived from graph theory that are common in most pathway analysis packages. For this reason, the authors note, "although we had not originally planned to include it," several users requested that BioCarta, a simple repository of pathway diagrams rather than an analytical tool, be evaluated along with the other pathway visualization programs. The BioCarta diagrams "were considered most biologically meaningful, and were preferred by life scientists over ball-and-stick graphs," the authors wrote.
"The symbols, shapes, and organization of the [BioCarta] diagrams are familiar, and similar to those found in textbooks. … [Users] felt that none of the other pathway analysis tools provided as much information in such a helpful and biologically meaningful visual format," they wrote, concluding that although BioCarta "does not provide features like the other tools to automate pathway analysis or overlay gene-expression data," it can still serve as a "reference library" for users to construct pathways.
On the other side of the spectrum, some users described the network maps created by the open source Cytoscape program as "computer scientists' conceptions of pathways" and noted that "it would be very difficult to understand maps created by someone else in Cytoscape."
Are Commercial Tools Better?
The study, while limited in scope, did indicate that commercial pathway tools may meet user needs better than publicly available tools. According to the paper, the only commercially available tool that was evaluated, Ariadne Genomics' PathwayAssist, fared much better among users than its freely available counterparts.
Top Five Unmet Requirements for Pathway Visualization Systems
Saraiya et al. identified five "critical requirements" for pathway information systems in their paper:
"All the scientists were impressed with PathwayAssist's pathway assembly capabilities," the authors wrote, adding that the users "liked the ability to create pathways directly from the ResNet database and from PubMed using [natural language processing] algorithms."
Ilya Mazo, president of Ariadne, told BioInform that he was happy that the scientists regarded PathwayAssist highly, but noted that his company is not alone in its mission to reconcile biological context and network modeling. "They didn't mention Ingenuity, for example, in this study, but they have similar stuff," he said. "I believe that the commercial developers, at least, are trying to take everything that these static diagrams can provide and integrate that into the network tools."
Ramon Felciano, CTO of Ingenuity Systems, said he was "pleased to see this paper come out" because there has been "surprisingly little" published in the area of human-computer interaction studies for pathway analytics or other bioinformatics tools. However, he said, "I was kind of disappointed that they didn't look at Ingenuity Pathways Analysis."
Felciano said that he "walked through" the list of requirements identified in the table with the most recent release of IPA, "and it met every single one, so [the study] is kind of like post facto market research validation for us."
Likewise, Julie Bryant, vice president of business development at GeneGo, viewed the study as validation of her company's MetaCore pathway-analysis system because "the majority of their recommendations are already included in MetaCore."
In an e-mail to BioInform, Bryant said that GeneGo users want both canonical pathway maps and dynamic network visualizations. "They want canonical maps to get a general understanding of certain pathways in an easy-to-understand format," she said. "Then they want to dig deeper using network tools building pathways on the fly."
Mazo said that "the list of requirements [the authors] compiled is pretty accurate," but noted that the study was "limited" in its focus on visualization, which is only one aspect within the broader field of pathway informatics.
"It looks like the field in general right now is moving away from just visualization and there is more and more interest in simulation and different algorithms to predict pathways from microarray data or to use pathways to interpret microarray data," Mazo said. "So there is more emphasis on different tools and algorithms rather than just visualization."
Felciano echoed Mazo's comments, noting that "visualization is an important part of tools like this, but it's by no means the only part." In particular, he said, "I was a little surprised that they didn't focus on the content component."
A "particular risk" of focusing on visualization, he said, is that biologists may be fooled by "pretty diagrams" and will "believe the data underneath without digging into it."
Content may be what sets the commercial tools apart from their freely available counterparts. Felciano noted that Ingenuity "had to invest an enormous amount in building out the infrastructure and then the content and the tools on top of that infrastructure, and that is something that, typically, freely available tools simply don't have." That initial investment "has given us an advantage," he said, because "a lot of the challenges that a lot of the free tools are working through now, we were able to work through up front through usability studies, through market research and so forth."
Usability studies along the lines of the VBI study should ultimately help drive progress in the field by ensuring that bioinformatics tools are in line with the needs of biologists, Felciano said.
"When we built even the first version of our pathways product, we ran usability studies where we sat there with a stopwatch and said, 'We need to put this in front of a normal biologist and they need to find something interesting in their dataset within 10 minutes,' and we watched how many clicks they used, and how long it took them to learn the system, and how many errors they made, and so forth," he said.
Even though that approach is common in the consumer software industry, it's been "by and large ignored by the bioinformatics community," Felciano said, "and I'm really glad to see these guys putting forward studies like this because I think that coming up with software systems that are usable by a much broader range of scientists is absolutely critical to delivering the value of bioinformatics."
Bernadette Toner