NetworkPrioritizer - Documentation

Prioritization

The prioritization workflow can be divided into two main stages. First, during the centrality analysis, the importance of nodes is estimated by centrality measures, which yields one ranking per computed measure. Second, these rankings can be inspected and combined into an overall ranking with the help of the Ranking Manager.

Input Data

In principle, NetworkPrioritizer can work with any kind of molecular network loaded in Cytoscape. For disease gene prioritization, the most useful networks are protein interaction networks, functional similarity networks, phenotype networks, and heterogeneous networks. In addition to the network, the user has to specify a set of seed nodes, either directly in the network or via an additional file (see the 'Centrality Analysis and Ranking' section for details). For the prioritization of candidate disease genes, these seed nodes are the already known disease-associated genes/proteins in the network.

To exploit the versatility of the Cytoscape platform when compiling input networks for NetworkPrioritizer, consider, for example, the following Cytoscape plugins:

MiMIplugin generates interaction networks from MiMI.
PSICQUICUniversalClient imports interaction networks from many public databases.
PathwayCommons imports pathways and networks from the Pathway Commons database.
iRefScape retrieves interactions from the iRefIndex.
CluePedia integrates different data into networks.

Of course, you can also import the pre-compiled Prioritizer network or Functional Linkage Network, published in Franke et al. (2006) and Linghu et al. (2009), respectively.

For additional information on functional networks, see e.g.:

Hawkins et al. (2010). Functional enrichment analyses and construction of functional similarity networks with high confidence function prediction by PFP. BMC Bioinformatics 11:265, doi:10.1186/1471-2105-11-265.
Dannenfelser et al. (2012). Genes2FANs: connecting genes through functional association networks. BMC Bioinformatics 13:156, doi:10.1186/1471-2105-13-156.

Centrality Analysis and Ranking

NetworkPrioritizer can compute several centrality measures for the nodes of a given network (see 'Definitions' section for details). Currently, these are:

degree centrality
shortest path betweenness
shortest path closeness
random walk betweenness
random walk receiver closeness
random walk transmitter closeness

All these measures are implemented as 'generalized centralities'. This means that they can be computed for weighted and unweighted networks alike. The influence of edge weights is adjustable by setting a parameter alpha in the NetworkPrioritizer preferences (see 'Preferences' section for details). The default value of alpha is 0.8. Although this value showed good overall performance in our benchmark study, it may be beneficial to adjust it for specific cases.

Before the actual prioritization, one should select the seed nodes in the network. To do so, select some nodes in the network and then click NetworkPrioritizer -> Seed nodes -> Tag from selection. Alternatively, click NetworkPrioritizer -> Seed nodes -> Tag from file and choose a file containing the seed nodes. Such a file specifies the name of one seed node per line.
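As a small illustration of the seed file layout described above (one node name per line), the following Python sketch builds and reads such file content. The function names and the example gene names are hypothetical and not part of NetworkPrioritizer; skipping blank lines is an assumption made here for robustness.

```python
def format_seed_file(seed_names):
    """Return seed file content: one seed node name per line."""
    return "\n".join(seed_names) + "\n"


def parse_seed_file(text):
    """Return the seed node names found in seed file content.

    Blank lines are skipped and surrounding whitespace is removed
    (an assumption of this sketch, not documented plugin behavior).
    """
    return [line.strip() for line in text.splitlines() if line.strip()]


# Hypothetical example names; use the node names of your own network.
content = format_seed_file(["BRCA1", "BRCA2", "TP53"])
seeds = parse_seed_file(content)
```

The resulting text can be saved to a file and selected via Tag from file.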

To start the prioritization, select Rank nodes from the NetworkPrioritizer menu and choose whether to rank the nodes with regard to newly computed centrality measures or with regard to an existing numerical node attribute. In the former case, it is possible to select the centrality measures to compute. After the ranking process, NetworkPrioritizer automatically compiles one ranking per measure and displays the rankings in the Ranking Manager. Alternatively, the centralities can be computed without compiling rankings immediately afterwards: open the Compute centralities entry from the NetworkPrioritizer menu and select the measures that should be computed.

Ranking Manager

The Ranking Manager is the central element to handle rankings in NetworkPrioritizer. It pops up automatically once a prioritization is completed and it can be opened manually via NetworkPrioritizer -> Show Ranking Manager.

Rankings are displayed in tabs on the left-hand side of the Ranking Manager. All functionalities of the Ranking Manager are accessible via the buttons on the right-hand side. These functionalities comprise the aggregation of rankings, the import and export of rankings and ranking distances, the computation of rank list distances, and the mapping of scores to nodes. The type of distance to compute and the aggregation algorithm to use can be chosen in the NetworkPrioritizer preferences. By default, the Weighted Borda Fuse algorithm is used to aggregate rankings and the Spearman footrule is calculated as distance measure.
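To make the two default methods more concrete, here is a minimal Python sketch of a normalized Spearman footrule distance and a weighted Borda-style aggregation. It follows the textbook definitions; the exact normalization and weighting used inside NetworkPrioritizer are not specified in this section, so the function names, the normalization by floor(n^2/2), and the per-ranking weights are assumptions.

```python
def spearman_footrule(rank_a, rank_b, normalize=True):
    """Sum of absolute rank differences between two rankings.

    rank_a, rank_b: dicts mapping item name -> rank (1 = best),
    both covering the same items. With normalize=True the sum is divided
    by its maximum possible value, floor(n^2 / 2), giving a value in [0, 1].
    """
    if set(rank_a) != set(rank_b):
        raise ValueError("both rankings must contain the same items")
    n = len(rank_a)
    total = sum(abs(rank_a[name] - rank_b[name]) for name in rank_a)
    if not normalize or n < 2:
        return total
    return total / (n * n // 2)


def weighted_borda(rankings, weights=None):
    """Aggregate rankings (lists ordered best first) into one ranking.

    Each item receives (n - position) points per ranking, multiplied by
    that ranking's weight. Ties are broken alphabetically.
    """
    if weights is None:
        weights = [1.0] * len(rankings)
    scores = {}
    for ranking, weight in zip(rankings, weights):
        n = len(ranking)
        for position, name in enumerate(ranking):
            scores[name] = scores.get(name, 0.0) + weight * (n - position)
    return sorted(scores, key=lambda name: (-scores[name], name))


rankings = [["A", "B", "C"], ["B", "A", "C"], ["A", "C", "B"]]
consensus = weighted_borda(rankings)
reweighted = weighted_borda(rankings, weights=[1, 5, 1])
```

Giving the second ranking a larger weight moves its top item to the front of the aggregated ranking, which is the intended effect of weighting.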

Preferences

In the Graph-Related section of the preferences you can choose which edge attribute contains the weights, select which type of graph-theoretic distance to use for the computations, and set a value for alpha. Alpha lies in the range [0,1] and is important for any 'tuned' distance measure (e.g. the tunedinverted distance). If it is set to 0, the weights have no influence on the distance and centrality computation. If it is set to 1, the distance between nodes and the centrality of nodes fully depend on the edge weights.
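The effect of alpha can be illustrated with a degree-like measure that interpolates between the number of neighbors and the sum of edge weights. This is one common formulation of a generalized degree from the literature on weighted networks; it reproduces the two limiting cases described above, but whether NetworkPrioritizer uses exactly this formula is an assumption of this sketch, and the function name is hypothetical.

```python
def generalized_degree(edge_weights, alpha=0.8):
    """Degree-like score tuned by alpha for a single node.

    edge_weights: positive weights of the edges incident to the node.
    Computes k^(1 - alpha) * s^alpha, where k is the number of edges
    and s the sum of their weights (an assumed formulation).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    k = len(edge_weights)
    if k == 0:
        return 0.0
    s = float(sum(edge_weights))
    return k ** (1.0 - alpha) * s ** alpha


weights = [2.0, 2.0]
unweighted = generalized_degree(weights, alpha=0.0)     # only the edge count matters
fully_weighted = generalized_degree(weights, alpha=1.0)  # only the weight sum matters
default = generalized_degree(weights)                    # alpha = 0.8, in between
```

With alpha = 0 the score equals the plain degree, and with alpha = 1 it equals the summed edge weight; intermediate values trade off the two.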

In the Ranking-Related part of the preferences you can choose a ranking distance measure and the rank aggregation algorithm to use.

Import and Export of Rankings and Ranking Distances

Rankings and ranking distances can be imported and exported from within the Ranking Manager. After clicking on the corresponding button (Import Rankings, Export Rankings, Import Ranking Distances, or Export Ranking Distances), one has to select a directory. Exported data is written into this directory, and imported data is read from it.

Rankings are stored in tab-delimited text files with the ending .rlf (ranking list file). Each row represents one entry in the ranking. The first column contains the name of the ranked element, the second column contains its score, and the third column contains its rank. As an example, consider the following ranking, where A achieved a score of 10 and is ranked first, B achieved a score of 5 and is ranked second, and C achieved a score of 2 and is ranked third:

		A	10	1
		B	5	2
		C	2	3
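For processing exported rankings outside Cytoscape, the three-column layout above can be read with a few lines of Python. This is a minimal sketch; the function name parse_rlf is hypothetical, and it assumes that files contain no header line and that leading indentation, if any, can be ignored.

```python
def parse_rlf(text):
    """Parse ranking list content into (name, score, rank) tuples.

    Expects tab-delimited rows: element name, score, rank.
    Blank lines are skipped (an assumption of this sketch).
    """
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        name, score, rank = line.strip().split("\t")
        entries.append((name, float(score), int(rank)))
    return entries


example = "A\t10\t1\nB\t5\t2\nC\t2\t3\n"
ranking = parse_rlf(example)
```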

Ranking distances are stored in tab-delimited text files with the ending .rdf (ranking distance file). The first line lists the names of all rankings that are considered in the file. The following lines then contain the (symmetric) distance matrix for the listed rankings. For example, the file SFD.rdf containing the Spearman footrule distance between three rankings Ra, Rb, and Rc, could look like this:

		Ra	Rb	Rc
		0	0.3	0.1
		0.3	0	0.3
		0.1	0.3	0
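A ranking distance file of this shape can be read in a similar way. The sketch below returns the ranking names and the distance matrix and checks that the matrix is square and symmetric; the function name parse_rdf is hypothetical, and the validation steps are additions of this sketch rather than documented plugin behavior.

```python
def parse_rdf(text):
    """Parse ranking distance content into (names, matrix).

    The first non-blank line holds the tab-delimited ranking names,
    the following lines hold the rows of the distance matrix.
    """
    rows = [line.strip().split("\t") for line in text.splitlines() if line.strip()]
    if not rows:
        raise ValueError("empty ranking distance content")
    names = rows[0]
    matrix = [[float(value) for value in row] for row in rows[1:]]
    n = len(names)
    if len(matrix) != n or any(len(row) != n for row in matrix):
        raise ValueError("distance matrix must be square and match the header")
    for i in range(n):
        for j in range(i):
            if matrix[i][j] != matrix[j][i]:
                raise ValueError("distance matrix must be symmetric")
    return names, matrix


example = "Ra\tRb\tRc\n0\t0.3\t0.1\n0.3\t0\t0.3\n0.1\t0.3\t0\n"
names, matrix = parse_rdf(example)
# Look up the distance between two rankings by name.
distance_ra_rc = matrix[names.index("Ra")][names.index("Rc")]
```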