In graph theory, the metric k-center problem is a combinatorial optimization problem studied in theoretical computer science. Given n cities with specified distances, one wants to build k warehouses in different cities and minimize the maximum distance of a city to a warehouse. In graph theory, this means finding a set of k vertices for which the largest distance of any point to its closest vertex in the k-set is minimum. The vertices must be in a metric space, providing a complete graph that satisfies the triangle inequality.
Let $(\mathcal{X}, d)$ be a metric space, where $\mathcal{X}$ is a set and $d$ is a metric. Given a set $V \subseteq \mathcal{X}$ of points and a parameter $k$, the goal is to find a subset $\mathcal{C} \subseteq V$ with $|\mathcal{C}| = k$ that minimizes the cover radius

$r_{\mathcal{C}}(V) = \max_{v \in V} d(v, \mathcal{C})$,

where $d(v, \mathcal{C}) = \min_{c \in \mathcal{C}} d(v, c)$ is the distance from $v$ to its closest point in $\mathcal{C}$. That is, every point in a cluster is within distance at most $r_{\mathcal{C}}(V)$ of its respective center.
The k-Center Clustering problem can also be defined on a complete undirected graph G = (V, E) as follows:
Given a complete undirected graph G = (V, E) with distances $d(v_i, v_j) \in \mathbb{N}$ satisfying the triangle inequality, find a subset $C \subseteq V$ with $|C| = k$ while minimizing

$\max_{v \in V} \min_{c \in C} d(v, c)$.
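As a concrete illustration of this objective, the following Python sketch evaluates the cover radius of a candidate center set. The Euclidean metric, the sample points, and the function name cover_radius are assumptions made for the example, not part of the problem statement.

```python
import math

def cover_radius(points, centers, dist):
    """r_C(V): the largest distance from any point of V to its closest center in C."""
    return max(min(dist(v, c) for c in centers) for v in points)

# Illustrative instance: five points in the plane with the Euclidean metric.
V = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (8.0, 0.0), (9.0, 0.0)]
C = [(1.0, 0.0), (8.0, 0.0)]               # a candidate set of k = 2 centers
print(cover_radius(V, C, math.dist))       # 1.0: every point is within distance 1 of a center
```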
In a complete undirected graph G = (V, E), sort the edges in non-decreasing order of their distances, $d(e_1) \leq d(e_2) \leq \cdots \leq d(e_m)$, and let $G_i = (V, E_i)$, where $E_i = \{e_1, e_2, \ldots, e_i\}$. The k-center problem is equivalent to finding the smallest index i such that $G_i$ has a dominating set of size at most k.
Although Dominating Set is NP-complete, the k-center problem remains NP-hard. This is because the optimality of a given feasible solution for the k-center problem can be verified through the Dominating Set reduction only if we already know the size of the optimal solution (i.e. the smallest index i such that $G_i$ has a dominating set of size at most k), which is precisely the hard part of the problem. A Turing reduction can, however, get around this issue by trying all values of i.
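To make the equivalence and the Turing-reduction idea tangible, here is a hedged brute-force Python sketch (exponential time, usable only on tiny instances): edges are added in non-decreasing order of distance and each Gi is tested for a dominating set of size k. The function names and the edge-list input format are illustrative assumptions.

```python
from itertools import combinations

def has_dominating_set(n, adj, k):
    """Brute force: does some set D of k vertices dominate V (every vertex in D or adjacent to D)?
    Checking subsets of size exactly k suffices, since padding a dominating set preserves domination."""
    for D in combinations(range(n), k):
        dominated = set(D)
        for u in D:
            dominated |= adj[u]
        if len(dominated) == n:
            return True
    return False

def smallest_feasible_index(n, weighted_edges, k):
    """Return (i, d(e_i)) for the smallest i such that G_i has a dominating set of size at most k.
    weighted_edges lists (u, v, d(u, v)) for every vertex pair; vertices are 0..n-1 and k <= n."""
    edges = sorted(weighted_edges, key=lambda e: e[2])   # d(e_1) <= d(e_2) <= ... <= d(e_m)
    adj = [set() for _ in range(n)]                      # adjacency lists of G_i
    for i, (u, v, w) in enumerate(edges, start=1):
        adj[u].add(v)
        adj[v].add(u)
        if has_dominating_set(n, adj, k):
            return i, w          # d(e_i) is exactly the optimal k-center radius
    return None
```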
A simple greedy approximation algorithm that achieves an approximation factor of 2 builds $\mathcal{C}$ in $k$ iterations using a farthest-first traversal. Pick an arbitrary point $\bar{c}_1$ as the first center, forming $C_1$; for every point $v \in V$ compute its distance $d_1[v]$ to $\bar{c}_1$; then pick the point $\bar{c}_2$ with the highest distance from $\bar{c}_1$ as the next center and denote the expanded center set by $C_2$. Continue in this way, always adding the point farthest from the current set of centers, until $k$ centers have been chosen. Selecting the $i$-th center and updating the distances takes $O(n)$ time, and there are $k$ such iterations, so the algorithm runs in $O(nk)$ time overall.
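The description above translates almost line-for-line into the following Python sketch; the function name greedy_k_center, the pluggable dist argument, and the arbitrary choice of points[0] as the first center are assumptions of the illustration.

```python
def greedy_k_center(points, k, dist):
    """Farthest-first traversal: repeatedly add the point farthest from the current centers.
    Assumes 1 <= k <= len(points); ties are broken arbitrarily."""
    centers = [points[0]]                                  # an arbitrary first center
    # d[j] is the distance from points[j] to its closest chosen center so far
    d = [dist(v, centers[0]) for v in points]
    while len(centers) < k:
        farthest = max(range(len(points)), key=lambda j: d[j])
        centers.append(points[farthest])                   # the point farthest from the centers
        d = [min(d[j], dist(points[j], centers[-1])) for j in range(len(points))]
    return centers
```

Each iteration performs one O(n) pass to pick the farthest point and another to update the distances, matching the O(nk) bound stated above. On the illustrative instance from the earlier snippet, for example, greedy_k_center(V, 2, math.dist) returns a center set with cover radius 2.0, within the factor-2 bound of the optimum 1.0.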
The solution obtained using the simple greedy algorithm is a 2-approximation to the optimal solution. This section focuses on proving this approximation factor.
Given a set of $n$ points $V \subseteq \mathcal{X}$ belonging to a metric space $(\mathcal{X}, d)$, the greedy $k$-center algorithm computes a set $K$ of $k$ centers, such that $K$ is a 2-approximation to the optimal $k$-center clustering of $V$, i.e.

$r_K(V) \leq 2\, r^{opt}(V, k)$.[1]

This theorem can be proven by considering the following two cases.
Case 1: Every cluster of $\mathcal{C}_{opt}$ contains exactly one point of $K$.

Consider a point $v \in V$, let $\bar{c}$ be the center it belongs to in the optimal clustering $\mathcal{C}_{opt}$, and let $\bar{k}$ be the center of $K$ lying in $\Pi(\mathcal{C}_{opt}, \bar{c})$, the cluster of $\mathcal{C}_{opt}$ centered at $\bar{c}$. Then

$d(v, \bar{c}) = d(v, \mathcal{C}_{opt}) \leq r^{opt}(V, k)$ and $d(\bar{k}, \bar{c}) = d(\bar{k}, \mathcal{C}_{opt}) \leq r^{opt}$,

so by the triangle inequality

$d(v, \bar{k}) \leq d(v, \bar{c}) + d(\bar{c}, \bar{k}) \leq 2\, r^{opt}$.
Case 2: Two centers $\bar{u}$ and $\bar{k}$ of $K$ lie in the same cluster $\Pi(\mathcal{C}_{opt}, \bar{c})$ of the optimal clustering, for some $\bar{c} \in \mathcal{C}_{opt}$; by the pigeonhole principle, this is the only other possibility.

Assume, without loss of generality, that $\bar{k}$ was added to $K$ by the greedy algorithm later than $\bar{u}$, say in the $i$-th iteration, so that $\bar{u} \in C_{i-1}$. Since the greedy algorithm always chooses the point farthest away from the current set of centers, $\bar{k}$ is the point of $V$ farthest from $C_{i-1}$, and since $C_{i-1} \subseteq K$, adding further centers can only shrink the cover radius. Therefore

\begin{align} r_K(V) \leq r_{C_{i-1}}(V) &= d(\bar{k}, C_{i-1}) \\ &\leq d(\bar{k}, \bar{u}) \\ &\leq d(\bar{k}, \bar{c}) + d(\bar{c}, \bar{u}) \\ &\leq 2\, r^{opt}(V, k). \end{align}
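Assuming the cover_radius and greedy_k_center sketches from the earlier snippets are in scope, the bound can be sanity-checked on a small random instance by comparing the greedy radius with the optimum found by exhaustive search over all k-subsets:

```python
import itertools, math, random

random.seed(0)
pts = [(random.random(), random.random()) for _ in range(12)]
k = 3

greedy_r = cover_radius(pts, greedy_k_center(pts, k, math.dist), math.dist)
opt_r = min(cover_radius(pts, C, math.dist) for C in itertools.combinations(pts, k))
print(greedy_r, opt_r, greedy_r / opt_r)   # the ratio never exceeds 2 on metric instances
```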
Another algorithm with the same approximation factor takes advantage of the fact that the k-center problem is equivalent to finding the smallest index i such that Gi has a dominating set of size at most k: it scans the graphs Gi in order of increasing i, computes a maximal independent set of the squared graph Gi² for each, and returns the first such independent set whose size is at most k. It is not possible to find an approximation algorithm with an approximation factor of 2 - ε for any ε > 0, unless P = NP. Furthermore, the distances of all edges in G must satisfy the triangle inequality if the k-center problem is to be approximated within any constant factor, unless P = NP.
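A hedged sketch of this bottleneck approach, written here in the squared-graph form usually attributed to Hochbaum and Shmoys, is given below; the function name, the edge-list input format, and the particular greedy construction of the maximal independent set are choices made for the illustration, not a canonical implementation.

```python
def bottleneck_2_approx(n, weighted_edges, k):
    """weighted_edges lists (u, v, d(u, v)) for every vertex pair; vertices are 0..n-1, k >= 1.
    Scan the thresholds d(e_1) <= d(e_2) <= ...; for each G_i greedily build a maximal
    independent set of the squared graph G_i^2 and return it once it has at most k
    vertices. Its cover radius is then at most 2 * d(e_i) <= 2 * OPT."""
    edges = sorted(weighted_edges, key=lambda e: e[2])
    adj = [set() for _ in range(n)]                          # adjacency lists of G_i
    for u, v, _ in edges:
        adj[u].add(v)
        adj[v].add(u)
        # neighbours in G_i^2: all vertices within two hops in G_i
        sq_adj = [adj[x].union(*(adj[y] for y in adj[x])) - {x} for x in range(n)]
        mis, blocked = [], set()                             # greedy maximal independent set
        for x in range(n):
            if x not in blocked:
                mis.append(x)
                blocked |= {x} | sq_adj[x]
        if len(mis) <= k:
            return mis                                       # the chosen centers
    return []    # not reached when all vertex pairs are listed and k >= 1
```

This naive version rebuilds Gi² from scratch after every edge insertion, so it is meant only to mirror the description above rather than to be efficient.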
It can be shown that the k-Center problem is W[2]-hard to approximate within a factor of 2 - ε for any ε > 0, when using k as the parameter.[2] This is also true when parameterizing by the doubling dimension (in fact the dimension of a Manhattan metric), unless P=NP.[3] When considering the combined parameter given by k and the doubling dimension, k-Center is still W[1]-hard but it is possible to obtain a parameterized approximation scheme.[4] This is even possible for the variant with vertex capacities, which bound how many vertices can be assigned to an opened center of the solution.[5]
Har-Peled, Sariel (2011). Geometric Approximation Algorithms. Boston, MA, USA: American Mathematical Society. ISBN 978-0821849118.