The maximum coverage problem is a classical question in computer science, computational complexity theory, and operations research. It is widely taught in courses on approximation algorithms.
As input you are given several sets and a number $k$. The sets may have some elements in common. You must select at most $k$ of these sets such that the maximum number of elements is covered, i.e., the union of the selected sets has maximal size.
Formally, (unweighted) Maximum Coverage:

Instance: A number $k$ and a collection of sets $S = \{S_1, S_2, \ldots, S_m\}$.
Objective: Find a subset $S' \subseteq S$ of sets, such that $\left|S'\right| \leq k$ and the number of covered elements $\left|\bigcup_{S_i \in S'} S_i\right|$ is maximized.
The maximum coverage problem is NP-hard, and cannot be approximated within $1 - \frac{1}{e} + o(1) \approx 0.632$ under standard assumptions. This result essentially matches the approximation ratio achieved by the greedy algorithm described below.
The maximum coverage problem can be formulated as the following integer linear program.
maximize $\sum_{e_j \in E} y_j$   (maximizing the sum of covered elements)
subject to $\sum_{i} x_i \leq k$   (no more than $k$ sets are selected)
$\sum_{e_j \in S_i} x_i \geq y_j$   (if $y_j > 0$ then at least one set containing $e_j$ is selected)
$y_j \in \{0, 1\}$   (if $y_j = 1$ then $e_j$ is covered)
$x_i \in \{0, 1\}$   (if $x_i = 1$ then $S_i$ is selected for the cover)
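For small instances the integer program can be checked against an exhaustive search. The following sketch (the instance data is made up for illustration) enumerates every choice of at most $k$ sets:

```python
from itertools import combinations

def max_coverage_exact(sets, k):
    """Exhaustively solve maximum coverage: try every choice of at
    most k sets and keep the one covering the most elements.
    Exponential time, so only suitable for tiny instances."""
    best_choice, best_cover = (), set()
    for r in range(1, k + 1):
        for choice in combinations(range(len(sets)), r):
            covered = set().union(*(sets[i] for i in choice))
            if len(covered) > len(best_cover):
                best_choice, best_cover = choice, covered
    return best_choice, best_cover

# Toy instance: four sets, select at most k = 2 of them.
choice, covered = max_coverage_exact([{1, 2, 3}, {3, 4}, {4, 5}, {1, 5}], 2)
```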
The greedy algorithm for maximum coverage chooses sets according to one rule: at each stage, choose a set which contains the largest number of uncovered elements. It can be shown that this algorithm achieves an approximation ratio of $1 - \frac{1}{e}$. Inapproximability results show that the greedy algorithm is essentially the best possible polynomial-time approximation algorithm for maximum coverage unless $P = NP$.
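The greedy rule can be sketched in a few lines (a minimal illustration, not tuned for efficiency):

```python
def greedy_max_coverage(sets, k):
    """At each stage pick the set covering the largest number of
    still-uncovered elements; stop after k sets or when no set
    adds anything new."""
    covered, chosen = set(), []
    for _ in range(k):
        best = max(range(len(sets)), key=lambda i: len(sets[i] - covered))
        if not sets[best] - covered:
            break  # no remaining set covers a new element
        chosen.append(best)
        covered |= sets[best]
    return chosen, covered
```

For the instance $S_1 = \{1,2,3\}$, $S_2 = \{1,4\}$, $S_3 = \{4,5,6\}$ with $k = 2$, the rule picks $S_1$ and then $S_3$, covering all six elements.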
The inapproximability results apply to all extensions of the maximum coverage problem since they contain the maximum coverage problem as a special case.
The maximum coverage problem can be applied to road traffic situations; one such example is selecting which bus routes in a public transportation network should be fitted with pothole detectors so as to maximize coverage when only a limited number of sensors is available. This problem is a known extension of the maximum coverage problem and was first explored in the literature by Junade Ali and Vladimir Dyo.[4]
In the weighted version every element $e_j$ has a weight $w(e_j)$. The task is to find a maximum coverage which has maximum weight. The basic version is the special case in which all weights are $1$.
maximize $\sum_{e_j \in E} w(e_j) \cdot y_j$   (maximizing the weighted sum of covered elements)
subject to $\sum_{i} x_i \leq k$   (no more than $k$ sets are selected)
$\sum_{e_j \in S_i} x_i \geq y_j$   (if $y_j > 0$ then at least one set containing $e_j$ is selected)
$y_j \in \{0, 1\}$   (if $y_j = 1$ then $e_j$ is covered)
$x_i \in \{0, 1\}$   (if $x_i = 1$ then $S_i$ is selected for the cover)
The greedy algorithm for weighted maximum coverage chooses, at each stage, a set that contains the maximum weight of uncovered elements. This algorithm achieves an approximation ratio of $1 - \frac{1}{e}$.
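The weighted greedy rule differs from the unweighted one only in its selection criterion; a minimal sketch:

```python
def weighted_greedy_max_coverage(sets, weight, k):
    """At each stage pick the set whose uncovered elements have the
    largest total weight; `weight` maps each element to w(e_j)."""
    covered, chosen = set(), []
    for _ in range(k):
        gains = [sum(weight[e] for e in s - covered) for s in sets]
        best = max(range(len(sets)), key=lambda i: gains[i])
        if gains[best] == 0:
            break  # nothing of positive weight remains coverable
        chosen.append(best)
        covered |= sets[best]
    return chosen, covered
```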
In the budgeted maximum coverage version, not only does every element $e_j$ have a weight $w(e_j)$, but every set $S_i$ also has a cost $c(S_i)$. Instead of the number $k$ that limits how many sets may be selected, a budget $B$ is given; this budget $B$ limits the total cost of the cover that can be chosen.
maximize $\sum_{e_j \in E} w(e_j) \cdot y_j$   (maximizing the weighted sum of covered elements)
subject to $\sum_{i} c(S_i) \cdot x_i \leq B$   (the cost of the selected sets cannot exceed $B$)
$\sum_{e_j \in S_i} x_i \geq y_j$   (if $y_j > 0$ then at least one set containing $e_j$ is selected)
$y_j \in \{0, 1\}$   (if $y_j = 1$ then $e_j$ is covered)
$x_i \in \{0, 1\}$   (if $x_i = 1$ then $S_i$ is selected for the cover)
A greedy algorithm will no longer produce solutions with a performance guarantee: its worst-case behavior might be very far from the optimal solution. The approximation algorithm is therefore extended in the following way. First, define a modified greedy algorithm that, at each stage, selects the set $S_i$ with the best ratio of the weight of its uncovered elements to its cost. Second, among covers of cardinality $1, 2, \ldots, k-1$, find the best cover that does not violate the budget; call it $H_1$. Third, find all covers of cardinality $k$ that do not violate the budget, use each of them as a starting point for the modified greedy algorithm, and let $H_2$ be the best cover found this way. The algorithm returns the better of $H_1$ and $H_2$. It achieves an approximation ratio of $1 - \frac{1}{e}$ for $k \geq 3$, and this is the best possible approximation ratio unless $NP \subseteq DTIME(n^{O(\log\log n)})$.
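The modified greedy step on its own can be sketched as follows (only the ratio-based selection rule; the enumeration over small covers described above is omitted, and all names are illustrative):

```python
def modified_greedy_step(sets, weight, cost, budget, chosen=None):
    """Repeatedly add the affordable set maximizing the ratio
    (total weight of its uncovered elements) / (its cost),
    optionally starting from an initial partial cover `chosen`."""
    chosen = list(chosen or [])
    covered = set().union(*(sets[i] for i in chosen)) if chosen else set()
    spent = sum(cost[i] for i in chosen)
    while True:
        best, best_ratio = None, 0.0
        for i, s in enumerate(sets):
            if i in chosen or spent + cost[i] > budget:
                continue  # already taken, or would exceed the budget
            gain = sum(weight[e] for e in s - covered)
            if gain / cost[i] > best_ratio:
                best, best_ratio = i, gain / cost[i]
        if best is None:
            break  # no affordable set adds positive weight
        chosen.append(best)
        spent += cost[best]
        covered |= sets[best]
    return chosen, covered
```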
In the generalized maximum coverage version every set $S_i$ has a cost $c(S_i)$, and an element $e_j$ has a different weight and cost depending on which set covers it: if $e_j$ is covered by the set $S_i$, then its weight is $w_i(e_j)$ and its cost is $c_i(e_j)$. A budget $B$ is given for the total cost of the solution.
maximize $\sum_{e_j \in E, S_i} w_i(e_j) \cdot y_{ij}$   (maximizing the weighted sum of covered elements in the sets in which they are covered)
subject to $\sum_{i, j} c_i(e_j) \cdot y_{ij} + \sum_{i} c(S_i) \cdot x_i \leq B$   (the total cost of the solution cannot exceed $B$)
$\sum_{i} y_{ij} \leq 1$   (element $e_j$ can be covered by at most one set)
$x_i \geq y_{ij}$   (if $y_{ij} > 0$ then the set $S_i$ covering $e_j$ must be selected)
$y_{ij} \in \{0, 1\}$   (if $y_{ij} = 1$ then $e_j$ is covered by set $S_i$)
$x_i \in \{0, 1\}$   (if $x_i = 1$ then $S_i$ is selected for the cover)
The algorithm uses the concept of residual cost and residual weight. These are measured relative to a tentative solution: the residual cost (or weight) of an item is the difference between its cost (or weight) and the cost (or weight) it already contributes to the tentative solution. The algorithm has several stages. First, find a solution using the greedy algorithm: in each iteration, add to the tentative solution the set that maximizes the residual weight of its elements divided by the residual cost of those elements together with the residual cost of the set. Second, compare the solution obtained in the first step with the best solution that uses only a small number of sets. Third, return the best of all examined solutions. This algorithm achieves an approximation ratio of $1 - 1/e - o(1)$.
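The residual weight and residual cost of a candidate set against a tentative solution can be made concrete with a small helper (a sketch under assumed data structures: `assigned` maps each covered element to the set currently covering it, and all names are illustrative):

```python
def residual(i, sets, w, c, set_cost, assigned):
    """Residual weight/cost of adding set i to a tentative solution.
    An element switches to set i only if that increases its weight;
    the residual cost also includes the cost of the set itself."""
    res_w = res_c = 0
    for e in sets[i]:
        cur = assigned.get(e)           # set currently covering e, if any
        cur_w = w[cur][e] if cur is not None else 0
        if w[i][e] > cur_w:             # e would be reassigned to set i
            res_w += w[i][e] - cur_w
            res_c += c[i][e] - (c[cur][e] if cur is not None else 0)
    return res_w, set_cost[i] + res_c
```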