See also: Configuration model.
Degree preserving randomization is a technique used in network science to assess whether patterns observed in a given graph could simply be an artifact of the graph's inherent structural properties rather than properties unique to the nodes of the observed network.
Cataloged as early as 1996,[1] the simplest implementation of degree preserving randomization relies on a Monte Carlo algorithm that rearranges, or "rewires", the network at random. Every rewire leaves the network's degree distribution identical to the initial degree distribution, and with a sufficient number of rewires the topological structure of the network becomes distinct from that of the original network.
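Degree preserving randomization has many different forms, but it typically takes a relatively simple approach: for any network consisting of $N$ nodes and $E$ edges, two edges are selected at random and their endpoints are exchanged, so that every node keeps its original degree.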
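The rewiring step can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: it assumes an undirected simple graph given as an edge list, and the names `degree_preserving_rewire` and `degrees`, the `max_tries_factor` parameter, and the toy edge list are all hypothetical. Proposed swaps that would create a self-loop or a duplicate edge are rejected.

```python
import random
from collections import Counter


def degrees(edges):
    """Count the degree of every node in an undirected edge list."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def degree_preserving_rewire(edges, n_swaps, seed=None, max_tries_factor=100):
    """Return a rewired copy of an undirected simple graph (assumed input)."""
    rng = random.Random(seed)
    edge_list = [tuple(sorted(e)) for e in edges]
    edge_set = set(edge_list)
    done = tries = 0
    while done < n_swaps and tries < n_swaps * max_tries_factor:
        tries += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        if rng.random() < 0.5:
            c, d = d, c  # try either orientation of the second edge
        if a == d or c == b:
            continue  # the swap would create a self-loop
        new1, new2 = tuple(sorted((a, d))), tuple(sorted((c, b)))
        if new1 in edge_set or new2 in edge_set or new1 == new2:
            continue  # the swap would create a duplicate edge
        # Replace (a, b), (c, d) with (a, d), (c, b): each node keeps its degree.
        edge_set -= {edge_list[i], edge_list[j]}
        edge_set |= {new1, new2}
        edge_list[i], edge_list[j] = new1, new2
        done += 1
    return edge_list


# Hypothetical toy graph: a six-node ring with two chords.
original = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (0, 3), (1, 4)]
rewired = degree_preserving_rewire(original, n_swaps=20, seed=1)
print(degrees(rewired) == degrees(original))
```

Rejecting invalid swaps instead of forcing them keeps the graph simple; the cap on attempts only guards against graphs where few valid swaps exist.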
As is common with algorithms based on Markov chains, the number of iterations, or individual rewires, that must occur before a given graph is sufficiently random and distinct from the original graph is unknown. Espinoza[2] asserts that a safe minimum threshold is $Q \cdot E$, where $Q$ is a constant multiplier. A second suggested minimum is $\frac{E}{2} \ln(1/\epsilon)$, with $\epsilon$ between $10^{-6}$ and $10^{-7}$.
Several published studies have explicitly employed degree preserving randomization to analyze network properties. Dekker used rewiring to model observed social networks more accurately by adding a secondary variable, $\pi$.
Additionally, some work has investigated how degree preserving randomization may be used to address anonymity in networked data research, which has been shown to be a cause for concern in social network analysis, as in the case of a study by Lewis et al. The work conducted by Ying and Wu, which starts from degree preserving randomization and then puts forward several modifications, has shown moderate advances in protecting anonymity without compromising the utility of the observed network.
The method is also similar in nature to the broadly used exponential random graph models popularized in social science, and more generally to the various approaches that compare models against observed networks in order to identify and theorize about the differences expressed in real networks. Importantly, degree preserving randomization provides a simple algorithmic design that those familiar with programming can apply to an available observed network.
What follows is a small example showing how one may apply degree preserving randomization to an observed network in order to compare it against otherwise random variation while maintaining the network's degree distribution. The Association of Internet Researchers has a Listserv that hosts the majority of discussion threads surrounding their work. On it, members post updates about their own research, upcoming conferences, and calls for papers, and also engage one another in substantive discussions in their field. These emails can in turn constitute a directed and temporal network graph, where nodes are individual e-mail accounts belonging to the Listserv and edges are cases in which one e-mail address responds to another e-mail address on the Listserv.
In this observed network, the properties of the Listserv are relatively simple to calculate: for the network of 3,235 individual e-mail accounts and 9,824 exchanges in total, the observed reciprocity of the network is about 0.074, and the average path length is about 4.46. Could these values be arrived at simply through the nature of the network's inherent structure?
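One way to approach that question is to compare the observed statistic with its distribution over degree-preserved randomizations. The sketch below does this for reciprocity on a directed graph. It is illustrative only: the edge list is a hypothetical toy graph, not the Listserv data; reciprocity is assumed here to mean the fraction of directed edges whose reverse edge is also present; and the function names, the ensemble size of 200, and the 50 swaps per sample are arbitrary choices.

```python
import random
from collections import Counter


def reciprocity(edges):
    """Fraction of directed edges (u, v) whose reverse (v, u) also exists."""
    edge_set = set(edges)
    if not edge_set:
        return 0.0
    return sum((v, u) in edge_set for u, v in edge_set) / len(edge_set)


def out_in_degrees(edges):
    """Out-degree and in-degree counters for a directed edge list."""
    return Counter(u for u, _ in edges), Counter(v for _, v in edges)


def rewire_directed(edges, n_swaps, rng):
    """Swap targets of random edge pairs: a->b, c->d becomes a->d, c->b."""
    edge_list = list(dict.fromkeys(edges))  # drop duplicates, keep order
    edge_set = set(edge_list)
    done = tries = 0
    while done < n_swaps and tries < 100 * n_swaps:
        tries += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        (a, b), (c, d) = edge_list[i], edge_list[j]
        if a == d or c == b:
            continue  # would create a self-loop
        if (a, d) in edge_set or (c, b) in edge_set:
            continue  # would create a duplicate edge
        edge_set -= {(a, b), (c, d)}
        edge_set |= {(a, d), (c, b)}
        edge_list[i], edge_list[j] = (a, d), (c, b)
        done += 1
    return edge_list


# Hypothetical toy reply network (sender -> recipient).
toy_edges = [(0, 1), (1, 0), (1, 2), (2, 1), (2, 3),
             (3, 0), (3, 4), (4, 5), (5, 3), (0, 4)]
rng = random.Random(0)
observed = reciprocity(toy_edges)
null_values = [reciprocity(rewire_directed(toy_edges, 50, rng))
               for _ in range(200)]
null_mean = sum(null_values) / len(null_values)
share_at_least = sum(v >= observed for v in null_values) / len(null_values)
print(f"observed={observed:.3f} null mean={null_mean:.3f} "
      f"share of null >= observed: {share_at_least:.3f}")
```

If the observed value sits well inside the null distribution, the degree sequence alone may be enough to account for it; a value far in the tail suggests structure beyond degree.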
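Applying the $\frac{E}{2} \ln(1/\epsilon)$ rule to this network gives the number of individual edge rewires needed before the degree-preserved graph can be treated as sufficiently random for comparison with the observed network.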
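As a worked calculation, the $\frac{E}{2} \ln(1/\epsilon)$ bound can be evaluated for the 9,824 exchanges of the Listserv network, treating each exchange as one edge. The helper name `min_rewires` is hypothetical, and rounding up to a whole number of rewires is an assumption of this sketch.

```python
import math


def min_rewires(n_edges, epsilon):
    """Evaluate (E / 2) * ln(1 / epsilon), rounded up to a whole rewire."""
    return math.ceil(n_edges / 2 * math.log(1 / epsilon))


E = 9824  # exchanges (edges) in the observed Listserv network
rewires = {eps: min_rewires(E, eps) for eps in (1e-6, 1e-7)}
for eps, count in rewires.items():
    print(f"epsilon={eps:g}: at least {count} rewires")
```

For this network the bound comes to roughly 68,000 rewires at $\epsilon = 10^{-6}$ and roughly 79,000 at $\epsilon = 10^{-7}$, so tightening $\epsilon$ by a factor of ten adds only about a sixth more work.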