The disparity filter[1] is a network reduction algorithm (a.k.a. graph sparsification algorithm[2]) used to extract the backbone structure of an undirected weighted network. Many real-world networks, such as citation networks, food webs, and airport networks, display heavy-tailed statistical distributions of node weight and strength. The disparity filter can substantially reduce such a network without destroying its multi-scale nature. The algorithm was developed by M. Ángeles Serrano, Marián Boguñá and Alessandro Vespignani.
k-core decomposition is an algorithm that reduces a graph to a maximal connected subgraph in which every vertex has degree at least k. It can only be applied to unweighted graphs.
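The pruning step can be sketched in a few lines of plain Python (an illustrative sketch, not code from the cited literature): repeatedly remove every vertex whose degree has fallen below k until none remain.

```python
# Minimal k-core sketch: prune vertices of degree < k until stable.
def k_core(adj, k):
    """adj: dict mapping node -> set of neighbours (undirected graph)."""
    adj = {u: set(vs) for u, vs in adj.items()}  # work on a copy
    changed = True
    while changed:
        changed = False
        for u in list(adj):
            if len(adj[u]) < k:
                for v in adj[u]:          # detach u from its neighbours
                    adj[v].discard(u)
                del adj[u]
                changed = True
    return adj

# Example: a triangle {0, 1, 2} plus a pendant vertex 3.
g = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
core = k_core(g, 2)
# The 2-core drops the pendant vertex and keeps the triangle.
```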
A minimum spanning tree is a tree-shaped subgraph of a given graph G that keeps all of G's nodes while minimizing the total weight of its edges. It is the least expensive way to keep the nodes of a connected component connected. The significant limitation of this algorithm is that it oversimplifies the structure of the network: the minimum spanning tree destroys local cycles and clustering coefficients, which are usually present in real networks and are considered important in network measurement.
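A minimal Kruskal-style sketch (illustrative only; the node count, edge list and weights below are invented for the example) shows how cycle-closing edges are discarded, which is exactly why clustering is lost:

```python
# Kruskal's algorithm sketch: sort edges by weight, add an edge only
# if it joins two previously disconnected components (union-find).
def minimum_spanning_tree(n, edges):
    """n: number of nodes (0..n-1); edges: list of (weight, u, v)."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    tree = []
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:                       # skip edges that would close a cycle
            parent[ru] = rv
            tree.append((w, u, v))
    return tree

# A 4-node cycle with one heavy edge: the MST drops the heaviest edge,
# destroying the cycle.
edges = [(1, 0, 1), (2, 1, 2), (3, 2, 3), (10, 3, 0)]
mst = minimum_spanning_tree(4, edges)
# mst == [(1, 0, 1), (2, 1, 2), (3, 2, 3)]
```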
A weighted graph can be easily reduced to a subgraph containing only the edges whose weight exceeds a given threshold wc. This technique has been applied to study the resistance of food webs[3] and functional networks that connect correlated human brain sites.[4] The shortcoming of this method is that it disregards nodes with small strength: in real networks, both strength and weight distributions generally follow heavy-tailed distributions spanning several orders of magnitude, so applying a simple weight cutoff removes all the information below it.
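The global threshold is a one-liner (a sketch with made-up edge data), which makes its drawback easy to see: every weak edge is dropped regardless of how important it is to its low-strength endpoints.

```python
# Global weight threshold: keep only edges with weight strictly above wc.
def threshold_filter(edges, wc):
    """edges: list of (u, v, weight) tuples for an undirected graph."""
    return [(u, v, w) for u, v, w in edges if w > wc]

edges = [("a", "b", 0.5), ("b", "c", 3.0), ("c", "d", 0.1)]
backbone = threshold_filter(edges, 1.0)
# Only the heavy edge survives; nodes "a" and "d" lose all their links,
# which is the drawback noted above.
# backbone == [("b", "c", 3.0)]
```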
In network science, the strength s_i of a node i is defined as s_i = Σ_j w_ij, where w_ij is the weight of the link between i and j.
In order to apply the disparity filter algorithm without overlooking nodes with low strength, a normalized weight p_ij is defined as p_ij = w_ij/s_i. In the null model, the normalized weights of a node with degree k are generated as follows: k − 1 pins are placed uniformly at random on the interval [0, 1], dividing it into k subintervals. The lengths of these subintervals represent the normalized weights of the k links in the null model.
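Both the normalization and the null-model sampling are straightforward to sketch (the weight values below are illustrative):

```python
import random

# Normalized weights p_ij = w_ij / s_i for the links of one node.
def normalized_weights(weights):
    s = sum(weights)                      # node strength s_i
    return [w / s for w in weights]

# Null model: place k-1 uniform "pins" on [0, 1]; the k subinterval
# lengths are one sample of null-model normalized weights.
def null_model_weights(k, rng=random):
    pins = sorted(rng.random() for _ in range(k - 1))
    cuts = [0.0] + pins + [1.0]
    return [b - a for a, b in zip(cuts, cuts[1:])]

p = normalized_weights([2.0, 1.0, 1.0])   # -> [0.5, 0.25, 0.25]
null = null_model_weights(4)              # four lengths summing to 1
```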
Based on this null model, one can derive that the normalized weight distribution of a node with degree k follows
\rho(x)\,dx = (k-1)(1-x)^{k-2}\,dx
The disparity filter algorithm is based on a p-value[5] statistical significance test[6] against the null model: for a given normalized weight p_ij, its p-value α_ij under the null model is given by
\alpha_{ij} = 1 - (k-1)\int_0^{p_{ij}} (1-x)^{k-2}\,dx = (1-p_{ij})^{k-1}
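Using the closed form above, the filter itself can be sketched in a few lines (an illustrative implementation with a made-up example graph; in the published method an edge is kept if it is significant at either of its endpoints):

```python
# Disparity filter sketch: keep an edge if its p-value
# alpha_ij = (1 - p_ij)^(k-1) falls below the significance level alpha.
def disparity_backbone(adj, alpha=0.05):
    """adj: dict node -> dict neighbour -> weight (undirected graph)."""
    keep = set()
    for i, nbrs in adj.items():
        k = len(nbrs)
        if k <= 1:
            continue                       # degree-1 nodes carry no statistics
        s = sum(nbrs.values())             # strength s_i
        for j, w in nbrs.items():
            p_ij = w / s
            if (1 - p_ij) ** (k - 1) < alpha:
                keep.add(frozenset((i, j)))
    return keep

# Node "a" has one dominant link and four weak ones; only the dominant
# link is statistically significant and survives in the backbone.
adj = {
    "a": {"b": 100, "c": 1, "d": 1, "e": 1, "f": 1},
    "b": {"a": 100}, "c": {"a": 1}, "d": {"a": 1},
    "e": {"a": 1}, "f": {"a": 1},
}
backbone = disparity_backbone(adj, alpha=0.05)
# backbone == {frozenset({"a", "b"})}
```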
The disparity filter algorithm has been shown to be a particular case of the Pólya Filter[7] (built around the famous combinatorial scheme known as the Pólya urn). The Pólya Filter is able to adapt the filtering procedure to the network's own heterogeneity by using a maximum-likelihood procedure to set its free parameter \alpha.