Column generation or delayed column generation is an efficient algorithm for solving large linear programs.
The overarching idea is that many linear programs are too large to consider all the variables explicitly. The algorithm therefore starts by solving the program with only a subset of its variables. Then, iteratively, variables that have the potential to improve the objective function are added to the program. Once it can be demonstrated that adding new variables would no longer improve the value of the objective function, the procedure stops. The hope when applying a column generation algorithm is that only a very small fraction of the variables will ever be generated. This hope is supported by the fact that in the optimal solution, most variables will be non-basic and take the value zero, so the optimal solution can be found without them.
In many cases, this method makes it possible to solve large linear programs that would otherwise be intractable. The classical example of a problem where column generation is used successfully is the cutting stock problem. One particular technique in linear programming which uses this kind of approach is the Dantzig–Wolfe decomposition algorithm. Column generation has also been applied to many other problems, such as crew scheduling, vehicle routing, and the capacitated p-median problem.
The algorithm considers two problems: the master problem and the subproblem. The master problem is the original problem with only a subset of variables being considered. The subproblem is a new problem created to identify an improving variable (i.e. which can improve the objective function of the master problem).
The algorithm then proceeds as follows:

1. Initialise the master problem and the subproblem.
2. Solve the master problem.
3. Search for an improving variable with the subproblem.
4. If an improving variable is found, add it to the master problem and go back to step 2.
5. Otherwise, the solution of the master problem is optimal; stop.
The most difficult part of this procedure is how to find a variable that can improve the objective function of the master problem. This can be done by finding the variable with the most negative reduced cost (assuming without loss of generality that the problem is a minimization problem). If no variable has a negative reduced cost, then the current solution of the master problem is optimal.
When the number of variables is very large, it is not possible to find an improving variable by computing all the reduced costs and picking one that is negative. The idea is instead to compute only the variable with the minimum reduced cost. This is done via an optimization problem called the pricing subproblem, whose form strongly depends on the structure of the original problem: its objective function is the reduced cost of the searched variable with respect to the current dual variables, and its constraints require that the variable obey the naturally occurring constraints of the problem. The column generation method is particularly efficient when this structure makes it possible to solve the subproblem with an efficient algorithm, typically a dedicated combinatorial algorithm.
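As a concrete sketch of this loop, the following illustrates column generation for the cutting stock problem: the master problem is the LP relaxation of pattern selection, and the pricing subproblem is an unbounded knapsack solved by dynamic programming (assuming integer widths). The instance data (`W`, `widths`, `demands`) are made up for illustration, and `scipy.optimize.linprog` stands in for a generic LP solver:

```python
import numpy as np
from scipy.optimize import linprog

def solve_master(patterns, demands):
    """LP relaxation: minimize the number of rolls so that the chosen
    patterns cover the demands."""
    A = np.array(patterns, dtype=float).T      # rows = widths, cols = patterns
    n = A.shape[1]
    # A x >= d is written as -A x <= -d for linprog.
    res = linprog(c=np.ones(n), A_ub=-A, b_ub=-np.array(demands, float),
                  bounds=(0, None), method="highs")
    duals = -res.ineqlin.marginals             # dual prices of the >= constraints
    return res.x, res.fun, duals

def price(duals, widths, W):
    """Pricing subproblem (unbounded knapsack): find the pattern a that
    maximizes u^T a subject to the roll-width limit."""
    best = [0.0] * (W + 1)
    choice = [-1] * (W + 1)                    # -1 means "leave this unit unused"
    for cap in range(1, W + 1):
        best[cap], choice[cap] = best[cap - 1], -1
        for i, w in enumerate(widths):
            if w <= cap and best[cap - w] + duals[i] > best[cap]:
                best[cap], choice[cap] = best[cap - w] + duals[i], i
    a, cap = [0] * len(widths), W              # reconstruct the best pattern
    while cap > 0:
        i = choice[cap]
        if i < 0:
            cap -= 1
        else:
            a[i] += 1
            cap -= widths[i]
    return a, best[W]

def column_generation(widths, demands, W, max_iters=50):
    # Start with one trivial pattern per width: as many pieces as fit on a roll.
    patterns = [[(W // w if j == i else 0) for j in range(len(widths))]
                for i, w in enumerate(widths)]
    for _ in range(max_iters):
        x, obj, duals = solve_master(patterns, demands)
        pattern, value = price(duals, widths, W)
        if value <= 1 + 1e-7:     # reduced cost 1 - u^T a >= 0 for all patterns
            break                 # no improving column: the master LP is optimal
        patterns.append(pattern)  # improving column found: add it and re-solve
    return patterns, x, obj

patterns, x, obj = column_generation(widths=[3, 5, 7], demands=[25, 20, 18], W=10)
```

On this toy instance the loop terminates after generating a single extra pattern; only a handful of the exponentially many cutting patterns are ever constructed, which is the point of the method.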
We now detail how and why to compute the reduced cost of the variables. Consider the following linear program in standard form:
\begin{align}
& \min_x \; c^T x \\
& \text{subject to} \\
& Ax = b \\
& x \in \mathbb{R}^+
\end{align}

which we will call the primal problem, as well as its dual linear program:

\begin{align}
& \max_u \; u^T b \\
& \text{subject to} \\
& u^T A \leq c \\
& u \in \mathbb{R}
\end{align}
Moreover, let $x^*$ and $u^*$ be optimal solutions of these two problems, which can be provided by any linear solver. These solutions satisfy the constraints of their respective programs and, by duality, have the same objective value: $c^T x^* = u^{*T} b$. We denote this common optimal value by $z^*$; it is a function of the coefficients of the primal problem, $z^* = z^*(c, A, b)$. Note that there exists one dual variable $u_i^*$ for each constraint of the primal model. It is possible to show that an optimal dual variable $u_i^*$ can be interpreted as the partial derivative of the optimal value $z^*$ of the objective function with respect to the term $b_i$ of the right-hand side:

\begin{align}
u_i^* = \frac{\partial z^*}{\partial b_i},
\end{align}

or, more compactly,

\begin{align}
u^* = \frac{\partial z^*}{\partial b}.
\end{align}

In other words, the dual variable $u_i^*$ indicates by how much the optimal objective value $z^*$ changes per unit increase of $b_i$.
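This sensitivity interpretation can be checked numerically on a small illustrative program (a sketch using `scipy.optimize.linprog`; the two-variable instance is made up for the example). For $\min x_1 + 2x_2$ subject to $x_1 + x_2 = b$ with $b = 4$, the dual optimum is $u^* = 1$, and perturbing $b$ changes $z^*$ at exactly that rate:

```python
from scipy.optimize import linprog

def primal_value(b):
    """z*(b) for: min x1 + 2*x2  s.t.  x1 + x2 = b,  x >= 0."""
    res = linprog(c=[1, 2], A_eq=[[1, 1]], b_eq=[b], method="highs")
    return res.fun

# Dual problem for b = 4: max 4u  s.t.  u <= 1, u <= 2, u free
# (written as minimization of -4u for linprog).
dual = linprog(c=[-4], A_ub=[[1], [1]], b_ub=[1, 2],
               bounds=[(None, None)], method="highs")
u_star = dual.x[0]

# The dual variable matches dz*/db, estimated here by finite differences.
eps = 1e-3
slope = (primal_value(4 + eps) - primal_value(4)) / eps
```

Here `slope` agrees with `u_star`: increasing the right-hand side by one unit forces one more unit of the cheap variable $x_1$ into the solution, raising the objective by its cost of 1.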
Consider now a variable $y$ that has not been considered so far in the primal problem. Note that this is equivalent to saying that the variable $y$ is present in the model but fixed at the value 0. We will now observe the impact of changing the value of $y$ from 0 to $\hat{y}$. If $c_y$ and $A_y$ are respectively the coefficient of $y$ in the objective function and its column in the constraint matrix, then the linear program becomes:

\begin{align}
& \min_x \; c^T x + c_y \hat{y} \\
& \text{subject to} \\
& Ax = b - A_y \hat{y} \\
& x \in \mathbb{R}^+
\end{align}
In order to know whether it is worthwhile to add the variable $y$ to the problem (i.e. to let it take a non-zero value), we want to know whether the optimal value $z^*_{\hat{y}}$ of this new problem decreases as the value $\hat{y}$ of the variable $y$ increases; in other words, we want to compute $\frac{\partial z^*_{\hat{y}}}{\partial \hat{y}}$. To do so, note that $z^*_{\hat{y}}$ can be expressed in terms of the optimal value of the initial problem: $z^*_{\hat{y}} = c_y \hat{y} + z^*(c, A, b - A_y \hat{y})$. Applying the chain rule together with the sensitivity interpretation of the dual variables gives

\begin{align}
\frac{\partial z^*_{\hat{y}}}{\partial \hat{y}} &= c_y + \frac{\partial z^*}{\partial b} \cdot \frac{\partial (b - A_y \hat{y})}{\partial \hat{y}} \\
&= c_y - u^{*T} A_y.
\end{align}
In other words, the impact of changing the value $\hat{y}$ on the objective value $z^*_{\hat{y}}$ combines two terms: the direct contribution $c_y$ of the variable to the objective function, and the indirect effect $-u^{*T} A_y$ of the change it induces in the constraints. Once the optimal solutions $x^*$ and $u^*$ of the restricted problem are known, the quantity $\frac{\partial z^*_{\hat{y}}}{\partial \hat{y}} = c_y - u^{*T} A_y$ can therefore be computed directly. It is called the reduced cost of the variable $y$ and is denoted $cr_y$. If $cr_y < 0$, increasing $\hat{y}$ from 0 strictly decreases the objective value, so $y$ is an improving variable.
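To illustrate the reduced-cost test on a toy restricted master problem (a sketch with made-up data, using `scipy.optimize.linprog`): for $\min x_1 + 2x_2$ subject to $x_1 + x_2 = 4$, the dual optimum is $u^* = 1$; a candidate variable $y$ with cost $c_y = 0.5$ and column $A_y = (1)$ has reduced cost $0.5 - 1 \cdot 1 = -0.5 < 0$, and adding it indeed lowers the optimum:

```python
import numpy as np
from scipy.optimize import linprog

# Restricted master: min x1 + 2*x2  s.t.  x1 + x2 = 4,  x >= 0
master = linprog(c=[1, 2], A_eq=[[1, 1]], b_eq=[4], method="highs")

# Its dual: max 4u  s.t.  u <= 1, u <= 2, u free
# (written as minimization of -4u for linprog).
dual = linprog(c=[-4], A_ub=[[1], [1]], b_ub=[1, 2],
               bounds=[(None, None)], method="highs")
u_star = dual.x

# Candidate variable y with objective coefficient c_y and column A_y.
c_y, A_y = 0.5, np.array([1.0])
reduced_cost = c_y - u_star @ A_y      # cr_y = c_y - u*^T A_y

# Since cr_y < 0, adding y to the master problem improves the objective.
extended = linprog(c=[1, 2, c_y], A_eq=[[1, 1, 1]], b_eq=[4], method="highs")
```

The extended problem shifts all of the right-hand side onto the cheaper variable $y$, dropping the objective from 4 to 2, consistent with the negative reduced cost.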