Capacitated minimum spanning tree explained

Capacitated minimum spanning tree is a minimal cost spanning tree of a graph that has a designated root node

and satisfies the capacity constraint

. The capacity constraint ensures that all subtrees (maximal subgraphs connected to the root by a single edge) incident on the root node

have no more than

nodes. If the tree nodes have weights, then the capacity constraint may be interpreted as follows: the sum of weights in any subtree should be no greater than

. The edges connecting the subgraphs to the root node are called gates. Finding the optimal solution is NP-hard.

Algorithms

Suppose we have a graph

G=(V,E)

n=|G|

with a root

r\inG

. Let

a_i

be all other nodes in

. Let

c_ij

be the edge cost between vertices

a_i

and

a_j

which form a cost matrix

C={c_ij

Esau-Williams heuristic

Esau-Williams heuristic finds suboptimal CMST that are very close to the exact solutions, but on average EW produces better results than many other heuristics.

(star graph) and the network's cost is

	n
\displaystyle\sum
	i=0

c_ri

; each of these edges is a gate. At each iteration, we seek the closest neighbor

a_j

for every node in

G-{r}

and evaluate the tradeoff function:

t(a_i)=g_i-c_ij

. We look for the greatest

t(a_i)

among the positive tradeoffs and, if the resulting subtree does not violate the capacity constraints, remove the gate

g_i

connecting the

-th subtree to

a_j

by an edge

c_ij

. We repeat the iterations until we can not make any further improvements to the tree.

Esau-Williams heuristics for computing a suboptimal CMST:

function CMST(c,C,r): T = while have changes: for each node

a_i

a_j

= closest node in a different subtree

t(a_i)

g_i

c_ij

t_max = max(

t(a_i)

) k = i such that

t(a_i)

= t_max if (cost(i) + cost(j) <= c) T = T -

g_k

T = T union

c_kj

return T

It is easy to see that EW finds a solution in polynomial time.

^[1]

Ahuja's heuristic

Ahuja's heuristic ^[2] uses a local search in a large multi-exchange neighborhood from a randomized greedy initial solution.

Initial solution

The initial solution is found by using a randomized version of Esau-Williams. Randomization is achieved by executing a uniformly random join from the best

ones instead of the best one in each step.

Local Search Neighborhood

Let

be the initial solution with root

. The neighborhood consists of any combination of a single node or subtree (general subtrees, not as in the introduction of this article) displacing one in a different component of

T\setminusr

such that the displaced structure is the next displacer, the last displacer displaces the first displacer, no original component has more than one displacer and the capacity is not exceeded in any resulting component.

Improvement Graph

An improvement graph is a tool to search a very large neighborhood efficiently. Paths through an improvement graph correspond to changes to a solution and the cost of the path is the change in the cost of the solution when applying the change. Here the improvement graph is a directed multigraph built by using 2 copies

i',i''

of each node

i\inV(T)

and up to 4 edges from any node to any node in a different component of

T\setminusr

. The edge

i',j''

corresponds to the change of removing the node

from its original component and replacing the subtree rooted at

in the target component. Combining nodes

and subtrees

i''

yields the 4 possible edges. An edge exists if the corresponding change does not lead to the target component exceeding the capacity. The cost of an edge is the difference in the cost of the minimal spanning trees on the vertices in the target component before and after the displacement. Thus neighbors in the local search correspond to cycles in the improvement graph that contain at most one node from each component.

Local Search Step

The local search step uses a dynamic programming approach to find a minimum cost cycle in the improvement graph. Paths through the improvement graph with increasing length are generated and only the most favorable with the same start and end as well as involved components is stored. To this end a hash table with the tuple of those 3 properties as key is used to hold paths. Since in each negative cycle there is a node such that all paths within that cycle containing this node have negative cost, only paths with negative cost need to be considered at all. As the comparison of sets of involved components between paths is one of the most common operations in the algorithm, it is implemented as comparison of indicator bit arrays stored as integers for speed. This however clearly stems from a lot of hash collisions, which might be a consequence of the particular choice of hash function and table structure, as well as high load factor due to space restrictions (paper from 2003).

Performance

At the time the paper was written (2003) this algorithm was state of the art on a standard operations research benchmark. The execution was dominated by the building (respectively updating) of the improvement graph. The number of edges in the improvement graph empirically scaled quadratically with the size of the input graph and since this determines the number of times the comparatively complex step of finding a minimum spanning tree has to be run, this is the most critical factor. Thus one can conclude that less dense input graphs greatly benefit the running time, as this reduces the number of edges in the improvement graph.

Applications

CMST problem is important in network design: when many terminal computers have to be connected to the central hub, the star configuration is usually not the minimum cost design. Finding a CMST that organizes the terminals into subnetworks can lower the cost of implementing a network.

References

10.1147/sj.53.0142. Esau . L.R. . Williams . K.C. . 1966 . On teleprocessing network design: Part II. A method for approximating the optimal network. . IBM Systems Journal . 5 . 3 . 142–147.
10.1016/S0167-6377(02)00236-5 . 2003 . 31 . 3 . R.K. . Ahuja . J.B. . Orlin . D. . Sharma . A composite very large-scale neighborhood structure for the capacitated minimum spanning tree problem . Operations Research Letters . 185–194.