In combinatorics, a branch of mathematics, a weighted matroid is a matroid endowed with a function that assigns a weight to each element. Formally, let
M=(E,I)
w:E → R+
E
E
w(A)
w(x)
x
A
A basic problem regarding weighted matroids is to find an independent set with a maximum total weight. This problem can be solved using the following simple greedy algorithm:
This algorithm does not need to know anything about the matroid structure; it just needs an independence oracle for the matroid - a subroutine for testing whether a set is independent.
Jack Edmonds[1] proved that this simple algorithm indeed finds an independent set with maximum weight. Denote the set found by the algorithm by e1,...,ek. By the matroid properties, it is clear that k=rank(M), otherwise the set could be extended. Assume by contradiction that there is another set with a higher weight. Without loss of generality, it is possible to assume that this set has rank(M) elements too; denote it by f1,...,fk. Order these items such that w(f1) ≥ ... ≥ w(fk). Let j be the first index for which w(fj) > w(ej). Apply the augmentation property to the sets and ; we conclude that there must be some i ≤ j such that fi could be added to while keeping it independent. But w(fi) ≥ w(fj) > w(ej), so fi should have been chosen in step j instead of ej - a contradiction.[2]
As a simple example, say we wish to find the maximum spanning forest of a graph. That is, given a graph and a weight for each edge, find a forest containing every vertex and maximizing the total weight of the edges in the tree. This problem arises in some clustering applications. It can be solved by Kruskal's algorithm, which can be seen as the special case of the above greedy algorithm to a graphical matroid.
If we look at the definition of the forest matroid, we see that the maximum spanning forest is simply the independent set with largest total weight - such a set must span the graph, for otherwise we can add edges without creating cycles. But how do we find it?
There is a simple algorithm for finding a basis:
A
x
E
A\cup\{x\}
A
A\cup\{x\}
The result is clearly an independent set. It is a maximal independent set because if
B\cup\{x\}
B
A
A\cup\{x\}
An independent set of largest total weight is called an optimal set. Optimal sets are always bases, because if an edge can be added, it should be; this only increases the total weight. As it turns out, there is a trivial greedy algorithm for computing an optimal set of a weighted matroid. It works as follows:
A
x
E
A\cup\{x\}
A
A\cup\{x\}
This algorithm finds a basis, since it is a special case of the above algorithm. It always chooses the element of largest weight that it can while preserving independence (thus the term "greedy"). This always produces an optimal set: suppose that it produces
A=\{e1,e2,\ldots,er\}
B=\{f1,f2,\ldots,fr\}
k
1\lek\ler
O1=\{e1,\ldots,ek-1\}
O2=\{f1,\ldots,fk\}
O1
O2
O2
O1
ek
O1
ek
O2
ek
fk
k
A
B
The easiest way to traverse the members of
E
O(|E|log|E|)
x
A\cup\{x\}
O(f(|E|))
O(|E|log|E|+|E|f(|E|))
If we want to find a minimum spanning tree instead, we simply "invert" the weight function by subtracting it from a large constant. More specifically, let
wmin(x)=W-w(x)
W
Note also that if we take a set
I
I1
I2
|I1|<|I2|
e\inI2\setminusI1
I1\cupe
Pick an
\epsilon>0
\tau>0
(1+2\epsilon)|I1|+\tau|E|<|I2|
I1\cupI2
2
2+2\epsilon
I1\setminusI2
1+\epsilon
1+2\epsilon
I2\setminusI1
1
1+\epsilon
0
\tau
I1
I2\setminusI1
(1+2\epsilon)|I1|+\tau|E|+|I1\cupI2|
I2
This optimization algorithm may be used to characterize matroids: if a family F of sets, closed under taking subsets, has the property that, no matter how the sets are weighted, the greedy algorithm finds a maximum-weight set in the family, then F must be the family of independent sets of a matroid.[3]
The notion of matroid has been generalized to allow for other types of sets on which a greedy algorithm gives optimal solutions; see greedoid and matroid embedding for more information. Korte and Lovász would generalize these ideas to objects called greedoids, which allow even larger classes of problems to be solved by greedy algorithms.