You can tell a lot about a person by learning about those with whom he interacts. Knowing that someone spends time with top researchers or with heavy drinkers — or both — you might hypothesize about what’s important to him.
Whether it’s people or proteins that interest you, it’s helpful to have ways to visualize these interactions. You may start with the complete proteome of an organism, or a set of genes or proteins identified by an experimental screen. Whatever the area of your research, the basic questions about a group of proteins may be similar: Do they interact? How? Are you sure? Do they participate in a similar metabolic theme? Do they have a shared biological function?
We’ve used several applications for visualizing networks, including Osprey, VisANT, and Cytoscape. All are free, at least for non-commercial users, but only Cytoscape and VisANT are open source. In addition to protein-protein interactions, these tools can display other types of interaction, such as protein-DNA binding and chemical pathways. All come with some public interaction data, but you may also load your own. All work well for small datasets but differ in how easily they handle large ones. Some are primarily for visualization, while others also include analysis features.
Osprey, a Java-based application, provides a good introduction to network analysis. Available for Linux and other desktop operating systems, it comes with popular interaction data. The project is linked to the General Repository for Interaction Datasets, or GRID, which formats public interaction data from yeast, fly, and worm and provides updates for Osprey. Importantly, custom data can be loaded in a format as simple as a two-column, tab-delimited text file.
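To make the two-column, tab-delimited idea concrete, here is a minimal Python sketch that writes and re-reads such a file. The gene names are placeholders, the function names are our own, and any further requirements Osprey may place on its input (headers, identifiers, extra columns) are not covered here.

```python
import csv
import io

# Placeholder interaction pairs; substitute identifiers from your own screen.
interactions = [
    ("GENE_A", "GENE_B"),
    ("GENE_A", "GENE_C"),
    ("GENE_B", "GENE_D"),
]


def write_pairs(pairs, handle):
    """Write one interaction per line as two tab-separated columns."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    for first, second in pairs:
        writer.writerow([first, second])


def read_pairs(handle):
    """Read the same layout back, skipping blank or incomplete lines."""
    pairs = []
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) >= 2 and row[0] and row[1]:
            pairs.append((row[0], row[1]))
    return pairs


# An in-memory buffer stands in for a real file on disk.
buffer = io.StringIO()
write_pairs(interactions, buffer)
text = buffer.getvalue()
print(text)

round_trip = read_pairs(io.StringIO(text))
```

In practice you would replace the in-memory buffer with `open("interactions.txt", "w", newline="")`, where the file name is also just an example.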
For data from species beyond the scope of GRID, however, it appears time-consuming to prepare the annotation data needed to take full advantage of the application. As with other tools, large networks can really stress out the computer. Options to stop or reverse the last command could be a big help. The menus are short and straightforward, so it’s quite easy to get started. Visible proteins can be filtered by Gene Ontology term (when available), and interactions can be filtered by such qualities as consistency between experimental methods. Details about nodes and edges are clearly displayed on a side panel, and several basic layout options are available, some with additional configuration possible. Once you have a view you like, network images can be exported in bitmap or SVG formats, or the view can be preserved by saving in Osprey format. As a whole, Osprey is an effective tool for viewing networks and is easy for people new to networks, especially for those interested in the three GRID species.
VisANT is a Java-based Web application that is free for all. It uses an interesting model: a remote server back end holds the interaction data in its Predictome database. There’s no need to download data files, but loading large datasets from their remote sources does take a while. Again, a stop button would be appreciated by those who are exploring the tool (or are just impatient). Once the interactions are loaded, however, there are lots of ways to analyze and visualize them. Pull-down menus can perform statistics and draw charts describing network topology, and a click of a button identifies all features of a given type, such as feedback loops. Mousing over nodes and edges creates a popup with details. Analyzing large datasets, like human data from multiple sources, seems to require some preprocessing to reduce the network to a manageable size. Several layout options can produce some attractive views, yet they can only be saved as JPEG images. Because it depends on a remote server, VisANT is a useful tool for smallish networks, but it’s frustrating for larger analyses.
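For readers curious what topology statistics and feedback-loop detection involve, the following Python sketch computes a degree distribution and finds one directed cycle by depth-first search. It illustrates the general ideas only and is not VisANT’s implementation; the edge list and function names are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical directed interactions as (source, target) pairs.
edges = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D"), ("E", "D")]


def degree_distribution(edge_list):
    """Return {degree: number of nodes with that degree}, ignoring direction."""
    degree = Counter()
    for source, target in edge_list:
        degree[source] += 1
        degree[target] += 1
    return dict(Counter(degree.values()))


def find_feedback_loop(edge_list):
    """Return one directed cycle as a list of nodes, or None if none exists."""
    graph = defaultdict(list)
    for source, target in edge_list:
        graph[source].append(target)

    unvisited, in_progress, finished = 0, 1, 2
    state = defaultdict(int)  # every node starts as unvisited (0)
    path = []

    def visit(node):
        state[node] = in_progress
        path.append(node)
        for neighbor in graph[node]:
            if state[neighbor] == in_progress:
                # Reached a node already on the current path: a loop.
                return path[path.index(neighbor):]
            if state[neighbor] == unvisited:
                found = visit(neighbor)
                if found:
                    return found
        path.pop()
        state[node] = finished
        return None

    for start in list(graph):
        if state[start] == unvisited:
            found = visit(start)
            if found:
                return found
    return None


distribution = degree_distribution(edges)
loop = find_feedback_loop(edges)
print(distribution)
print(loop)
```

The recursive search is fine for small examples; a network with many thousands of nodes would call for an iterative version or a dedicated graph library.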
Cytoscape is the most elegant and flexible of the three tools for visualizing and analyzing interaction networks. This Java tool, developed as a collaborative project and free for all, has a deceptively simple interface. The user manual is well written and helpful where menu descriptions aren’t self-explanatory. As with other applications, Cytoscape comes with sample data, and it’s easy to import custom data, as long as it’s in “simple interaction file” format. Edges can also carry quantitative values, which is very helpful for large-scale datasets that come with measures of confidence or strength.
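As a rough illustration of the simple interaction file idea, the sketch below parses lines laid out as source, interaction type, and one or more targets, then keeps only edges above a confidence cutoff. We are assuming that common layout here; the node names, the "pp" and "pd" labels, the confidence scores, and the 0.5 cutoff are all invented for the example, and Cytoscape’s own way of loading edge values is not shown.

```python
def parse_sif(text):
    """Parse SIF-style lines: source, interaction type, then target(s)."""
    edges = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Prefer tabs when present so names containing spaces survive.
        fields = line.split("\t") if "\t" in line else line.split()
        if len(fields) < 3:
            continue  # a node listed alone has no edges to record
        source, interaction, targets = fields[0], fields[1], fields[2:]
        for target in targets:
            edges.append((source, interaction, target))
    return edges


# Hypothetical file contents: P1 interacts with P2 and P3, P5 stands alone.
sif_text = "P1\tpp\tP2\tP3\nP2\tpd\tG4\nP5\n"
edges = parse_sif(sif_text)

# Hypothetical confidence scores, keyed by edge.
confidence = {
    ("P1", "pp", "P2"): 0.9,
    ("P1", "pp", "P3"): 0.4,
    ("P2", "pd", "G4"): 0.7,
}

# Keep only edges scoring at or above an arbitrary cutoff.
strong = [edge for edge in edges if confidence.get(edge, 0.0) >= 0.5]
print(strong)
```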
Large datasets are quick to load, but are not drawn — the real CPU-intensive step — without an explicit request. As a result, you can manipulate a network without having to visualize it. This saves a lot of processor time, and since trying to get some meaning out of a huge mess of nodes and edges usually just causes headaches, you can also save on aspirin. As you zoom into a network, be sure to turn on the “Overview,” which shows where you are in the whole network. Via Visual Properties, nodes and edges are fully configurable. A half dozen layout styles are possible, although Cytoscape doesn’t contain some that are found in other tools. Several types of filters can be applied to select desired nodes or edges, but we had trouble figuring out how they all work at first. Cytoscape does easily integrate expression information so node color can be defined by expression level.
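Coloring nodes by expression level boils down to mapping a number onto a color scale. The sketch below shows one way to do that in Python, assuming log-ratio values and a blue-white-red gradient; the function name, the range limits, and the sample values are our own choices, not Cytoscape settings.

```python
def expression_to_hex(value, low=-2.0, high=2.0):
    """Map a value to a hex color: blue at low, white at the midpoint, red at high."""
    clamped = max(low, min(high, value))  # out-of-range values use the end colors
    midpoint = (low + high) / 2.0
    if clamped <= midpoint:
        fraction = (clamped - low) / (midpoint - low)
        red, green, blue = fraction, fraction, 1.0
    else:
        fraction = (clamped - midpoint) / (high - midpoint)
        red, green, blue = 1.0, 1.0 - fraction, 1.0 - fraction
    channels = (round(channel * 255) for channel in (red, green, blue))
    return "#{:02x}{:02x}{:02x}".format(*channels)


# Hypothetical log-ratio expression values per node.
expression = {"P1": -2.0, "P2": 0.0, "P3": 2.0, "P4": 5.0}
node_colors = {node: expression_to_hex(level) for node, level in expression.items()}
print(node_colors)
```

A diverging scale like this suits log ratios, where zero means no change; a single-hue gradient would be a better fit for absolute expression levels.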
Some groups have also extended Cytoscape with plug-ins for specific tasks, like Gene Ontology enrichment analysis, merging of networks, and access to remote databases. It would be great if one could also show a small heatmap of expression data from several different conditions or cells at once. Once you like what you see, you can save the network in GML format, which preserves the layout for future loading, or export it in just about any image format, including PDF and SVG. Overall, of the network visualization and analysis tools we tried, we’re most impressed with Cytoscape. It’s powerful, intuitive, and well designed, and it can generate attractive figures.
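GML preserves a layout because each node record can carry its own coordinates. The Python sketch below writes a minimal GML document with x and y positions to show the idea; the node names, coordinates, and function name are hypothetical, and a file saved by Cytoscape will contain more attributes than this.

```python
def to_gml(positions, edges):
    """Serialize nodes with x/y coordinates and edges into GML text."""
    # GML refers to nodes by integer id, so number them in insertion order.
    ids = {name: index for index, name in enumerate(positions)}
    lines = ["graph ["]
    for name, (x, y) in positions.items():
        lines.append(
            '  node [ id {} label "{}" graphics [ x {:.1f} y {:.1f} ] ]'.format(
                ids[name], name, x, y
            )
        )
    for source, target in edges:
        lines.append("  edge [ source {} target {} ]".format(ids[source], ids[target]))
    lines.append("]")
    return "\n".join(lines) + "\n"


# Hypothetical layout: two nodes with fixed positions and one edge between them.
positions = {"A": (0.0, 0.0), "B": (100.0, 50.0)}
gml_text = to_gml(positions, [("A", "B")])
print(gml_text)
```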
These tools provide different interfaces for the analysis and visualization of networks. Having a favorite type of analysis, variety of layout, or desired figure format could help you decide which tool to use. Regardless of the tool we use, the more time we spend thinking about global interaction networks, the more creative we get about finding data to integrate and the more we appreciate the power and flexibility of the better analysis tools.
Fran Lewitter, PhD, is director of bioinformatics and research computing at Whitehead Institute for Biomedical Research. This column was written in collaboration with George Bell, PhD, a bioinformatics scientist in Fran’s group.
VisANT - http://visant.bu.edu/
Cytoscape - http://www.cytoscape.org/