Interior-point methods (also referred to as barrier methods or IPMs) are algorithms for solving linear and non-linear convex optimization problems. IPMs combine two advantages of previously-known algorithms:

- Theoretically, their run-time is polynomial, in contrast to the simplex method, whose run-time is exponential in the worst case.
- Practically, they run about as fast as the simplex method, in contrast to the ellipsoid method, which has polynomial run-time in theory but is very slow in practice.
In contrast to the simplex method which traverses the boundary of the feasible region, and the ellipsoid method which bounds the feasible region from outside, an IPM reaches a best solution by traversing the interior of the feasible region—hence the name.
An interior point method was discovered by Soviet mathematician I. I. Dikin in 1967.[1] The method was reinvented in the U.S. in the mid-1980s. In 1984, Narendra Karmarkar developed a method for linear programming called Karmarkar's algorithm,[2] which runs in provably polynomial time (O(n^{3.5} L) operations on L-bit numbers) and is also very efficient in practice. Two years later, James Renegar invented the first path-following interior-point method, with run-time O(n^3 L). The method was later extended from linear to convex optimization problems, based on a self-concordant barrier function used to encode the convex set.[3]
Any convex optimization problem can be transformed into minimizing (or maximizing) a linear function over a convex set by converting to the epigraph form.[4] The idea of encoding the feasible set using a barrier and designing barrier methods was studied by Anthony V. Fiacco, Garth P. McCormick, and others in the early 1960s. These ideas were mainly developed for general nonlinear programming, but they were later abandoned due to the presence of more competitive methods for this class of problems (e.g. sequential quadratic programming).
Yurii Nesterov and Arkadi Nemirovski came up with a special class of such barriers that can be used to encode any convex set. They guarantee that the number of iterations of the algorithm is bounded by a polynomial in the dimension and accuracy of the solution.[5]
The class of primal-dual path-following interior-point methods is considered the most successful. Mehrotra's predictor–corrector algorithm provides the basis for most implementations of this class of methods.[6]
We are given a convex program of the form:

    minimize f(x) s.t. x in G,

where f is a convex function and G is a convex set. Without loss of generality, we can assume that the objective f is a linear function. Usually, the convex set G is represented by a set of convex inequalities and linear equalities; the linear equalities can be eliminated using linear algebra, so for simplicity we assume there are only convex inequalities, and the program can be described as follows, where the g_i are convex functions:

    minimize f(x) s.t. g_i(x) ≤ 0 for i in 1,...,m.

We assume that the constraint functions belong to some family (e.g. quadratic functions), so that the program can be represented by a finite vector of coefficients (e.g. the coefficients of the quadratic functions). The dimension of this coefficient vector is called the size of the program. A numerical solver for a given family of programs is an algorithm that, given the coefficient vector, generates a sequence of approximate solutions x_t for t = 1, 2, ..., using finitely many arithmetic operations. A numerical solver is called convergent if, for any program from the family and any ε > 0, there is some T (which may depend on the program and on ε) such that, for any t > T, the approximate solution x_t is ε-approximate, that is:

    f(x_t) - f* ≤ ε,
    g_i(x_t) ≤ ε for i in 1,...,m,

where f* is the optimal value. A solver is called polynomial if the total number of arithmetic operations in the first T steps is at most

    poly(problem-size) · log(V/ε),

where V is some data-dependent constant, e.g., the difference between the largest and smallest value in the feasible set. In other words, V/ε is the relative accuracy of the solution (the accuracy with respect to the largest coefficient), and log(V/ε) is the number of accuracy digits. Therefore, a solver is polynomial if each additional digit of accuracy requires a number of operations that is polynomial in the problem size.
Types of interior point methods include:

- Potential-reduction methods: Karmarkar's algorithm was the first one.
- Path-following methods: the methods of Renegar and Gonzaga were the first ones.
- Primal-dual methods.
Given a convex optimization program (P) with constraints, we can convert it to an unconstrained program by adding a barrier function. Specifically, let b be a smooth convex function, defined in the interior of the feasible region G, such that for any sequence {x_j} whose limit is on the boundary of G:

    lim_{j→∞} b(x_j) = ∞.

We also assume that b is non-degenerate, that is, its Hessian b''(x) is positive definite for every x in the interior of G. Now, for each t > 0, consider the program:

    (P_t)  minimize t·f(x) + b(x).

Technically the program is restricted, since b is defined only in the interior of G. But practically, it can be solved as an unconstrained program, since any solver trying to minimize the function will not approach the boundary, where b approaches infinity. Therefore, (P_t) has a unique solution, denoted by x*(t). The function x* is a continuous function of t, which is called the central path. All limit points of x*(t), as t approaches infinity, are optimal solutions of the original program (P).
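As a concrete illustration, the sketch below traces the central path of the toy one-dimensional problem minimize f(x) = x over 0 ≤ x ≤ 2 with the barrier b(x) = -ln(x) - ln(2 - x); the problem instance and the guarded Newton iteration are illustrative choices, not from the original text. As t grows, x*(t) moves toward the optimum x = 0 while staying strictly inside the feasible interval:

```python
def central_path_point(t, x0=1.0, iters=100):
    """Minimize t*x - ln(x) - ln(2 - x) over (0, 2) by Newton's method,
    halving the step whenever it would leave the domain."""
    x = x0
    for _ in range(iters):
        g = t - 1.0 / x + 1.0 / (2.0 - x)          # first derivative
        h = 1.0 / x**2 + 1.0 / (2.0 - x)**2        # second derivative (> 0)
        step = g / h
        while not (0.0 < x - step < 2.0):          # stay strictly feasible
            step *= 0.5
        x -= step
    return x

# As t grows, x*(t) approaches the true optimum x = 0 from the interior.
for t in [1, 10, 100, 1000]:
    print(t, central_path_point(t))
```

For t = 1 the minimizer can be computed by hand (x*(1) = 2 - √2), which gives a quick sanity check on the iteration.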
A path-following method is a method of tracking the function x* along a certain increasing sequence t_1, t_2, ..., that is: computing a good-enough approximation x_i to the point x*(t_i), such that the difference x_i - x*(t_i) approaches 0 as i approaches infinity; then the sequence x_i approaches the optimal solution of (P). This requires specifying three things:

- The barrier function b(x).
- A policy for determining the penalty parameters t_i.
- The solver used to compute x_i from x_{i-1}, an approximate minimizer of t_i·f + b (e.g. Newton's method).
The main challenge in proving that the method is polytime is that, as the penalty parameter grows, the solution gets near the boundary, and the function becomes steeper. The run-time of solvers such as Newton's method becomes longer, and it is hard to prove that the total runtime is polynomial.
Renegar and Gonzaga proved that a specific instance of a path-following method is polytime:

- The constraints (and the objective) are linear functions.
- The barrier function is logarithmic: b(x) := -Σ_j ln(-g_j(x)).
- The penalty parameter is updated geometrically:

    t_{i+1} := μ · t_i,  where μ = 1 + 0.001/√m.

- The solver is Newton's method, and a single Newton step is done for each single step in t.
They proved that, in this case, the difference x_i - x*(t_i) remains at most 0.01, and f(x_i) - f* is at most 2m/t_i. Thus, the solution accuracy is proportional to 1/t_i, so to add a single accuracy digit it is sufficient to multiply t_i by 2 (or any other constant factor), which requires O(√m) Newton steps. Since each Newton step takes O(m n^2) operations, the total complexity is O(m^{3/2} n^2) operations per accuracy digit.
Yurii Nesterov extended the idea from linear to non-linear programs. He noted that the main property of the logarithmic barrier, used in the above proofs, is that it is self-concordant with a finite barrier parameter. Therefore, many other classes of convex programs can be solved in polytime using a path-following method, if we can find a suitable self-concordant barrier function for their feasible region.
We are given a convex optimization problem (P) in "standard form":
    minimize c^T x s.t. x in G,

where G is convex and closed. We can also assume that G is bounded (we can easily make it bounded by adding a constraint |x| ≤ R for some sufficiently large R).
To use the interior-point method, we need a self-concordant barrier for G. Let b be an M-self-concordant barrier for G, where M≥1 is the self-concordance parameter. We assume that we can compute efficiently the value of b, its gradient, and its Hessian, for every point x in the interior of G.
For every t>0, we define the penalized objective f_t(x) := t·c^T x + b(x). We define the path of minimizers by: x*(t) := arg min f_t(x). We approximate this path along an increasing sequence t_i. The sequence is initialized by a certain non-trivial two-phase initialization procedure, and then updated according to the following rule:

    t_{i+1} := μ · t_i.
For each t_i, we find an approximate minimum of f_{t_i}, denoted by x_i. The approximate minimum is chosen to satisfy the following "closeness condition", stating that the Newton decrement of f_{t_i} at x_i is at most the path tolerance L:

    √( ∇f_{t_i}(x_i)^T [∇²f_{t_i}(x_i)]^{-1} ∇f_{t_i}(x_i) ) ≤ L.

To find x_{i+1}, we start with x_i and apply the damped Newton method. We apply several steps of this method, until the above closeness condition is satisfied. The first point that satisfies it is denoted by x_{i+1}.
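The outer/inner structure above can be sketched as follows for a toy linear program over a box; the path tolerance L_tol and update rate μ below are illustrative choices, not the exact constants from the theorems:

```python
import numpy as np

# Toy LP: minimize c^T x over the box 0 <= x <= 1, written as A x <= b.
A = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
b = np.array([1.0, 1.0, 0.0, 0.0])
c = np.array([1.0, 1.0])
m = len(b)

def grad_hess(x, t):
    """Gradient and Hessian of f_t(x) = t*c^T x - sum_i ln(b_i - a_i^T x)."""
    s = b - A @ x                           # slacks, must stay positive
    return t * c + A.T @ (1.0 / s), A.T @ ((1.0 / s**2)[:, None] * A)

x = np.array([0.5, 0.5])                    # strictly feasible start
t, mu, L_tol = 1.0, 1.0 + 0.1 / np.sqrt(m), 0.25
for _ in range(400):                        # outer loop: raise t geometrically
    while True:                             # inner loop: damped Newton steps
        g, H = grad_hess(x, t)
        step = np.linalg.solve(H, -g)
        lam = np.sqrt(max(-g @ step, 0.0))  # Newton decrement
        if lam <= L_tol:                    # closeness condition satisfied
            break
        x = x + step / (1.0 + lam)          # damped step keeps x strictly feasible
    t *= mu
print(x)                                    # approaches the optimum (0, 0)
```

The damped step length 1/(1 + λ) is the standard choice for self-concordant functions: it guarantees the iterate stays inside the feasible region without an explicit line search.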
The convergence rate of the method is given by the following formula, for every i:

    c^T x_i - c* ≤ (2M / t_0) · μ^{-i}.

Taking μ = (1 + r/√M), the number of Newton steps required for an ε-approximate solution is at most:

    O(1) · √M · ln( M/(t_0 ε) + 1 ),

where the constant factor O(1) depends only on r and L. The number of Newton steps required for the two-phase initialization procedure is at most:

    O(1) · √M · ln( M/(1 - π_{x_f}(x̄)) + 1 ) + O(1) · √M · ln( M·Var_G(c)/ε + 1 ),

where the constant factor O(1) depends only on r and L, x̄ is the point used to initialize the method, π_{x_f}(x̄) < 1 measures how close x̄ is to the boundary of G (the Minkowski functional of G with pole at the analytic center x_f), and

    Var_G(c) := max_{x in G} c^T x - min_{x in G} c^T x.

Overall, the number of Newton steps required for an ε-approximate solution is at most

    O(1) · √M · ln( V/ε + 1 ),

where V is the problem-dependent constant V = Var_G(c) / (1 - π_{x_f}(x̄)). Each Newton step takes O(n^3) arithmetic operations.
To initialize the path-following methods, we need a point in the relative interior of the feasible region G. In other words: if G is defined by the inequalities g_i(x) ≤ 0, then we need some x for which g_i(x) < 0 for all i in 1,...,m. If we do not have such a point, we need to find one using a so-called phase-I method. A simple phase-I method is to solve the following convex program:

    minimize s s.t. g_i(x) ≤ s for all i in 1,...,m.

Denote the optimal solution by (x*, s*). If s* < 0, then x* is strictly feasible for the original problem; if s* > 0, the original problem is infeasible.

For this program it is easy to get an interior point: we can take x = 0 arbitrarily, and take s to be any number larger than max(g_1(0),...,g_m(0)). Therefore, it can be solved using interior-point methods. However, the run-time is proportional to log(1/s*). As s* comes near 0, it becomes harder and harder to find an exact solution to the phase-I problem, and thus harder to decide whether the original problem is feasible.
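A minimal sketch of this phase-I construction for linear inequalities a_i^T x ≤ b_i; the solver constants and the example instance are illustrative assumptions, and the code assumes the augmented constraint matrix has full column rank:

```python
import numpy as np

def phase_one(A, b, t_final=1e6):
    """Find a strictly feasible x for A x < b by solving
       minimize s  s.t.  a_i^T x - s <= b_i,
    with a log-barrier path-following method. A strictly feasible start for
    the augmented problem is trivial: x = 0, s just above max_i(-b_i)."""
    m, n = A.shape
    A_aug = np.hstack([A, -np.ones((m, 1))])   # constraints on z = (x, s)
    c_aug = np.zeros(n + 1); c_aug[-1] = 1.0   # objective: minimize s
    z = np.zeros(n + 1); z[-1] = -b.min() + 1.0
    t, mu = 1.0, 1.0 + 0.25 / np.sqrt(m)
    while t < t_final:
        for _ in range(50):                    # damped Newton steps per t
            d = b - A_aug @ z                  # slacks, must stay positive
            g = t * c_aug + A_aug.T @ (1.0 / d)
            H = A_aug.T @ ((1.0 / d**2)[:, None] * A_aug)
            step = np.linalg.solve(H, -g)
            lam = np.sqrt(max(-g @ step, 0.0)) # Newton decrement
            z = z + step / (1.0 + lam)
            if lam < 1e-8:
                break
        t *= mu
    return z[:-1], z[-1]                       # s < 0 certifies feasibility

# Example: the unit box 0 <= x <= 1.
A = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
b = np.array([1.0, 1.0, 0.0, 0.0])
x0, s0 = phase_one(A, b)   # s0 < 0, so x0 is strictly feasible
```

For the unit box the phase-I optimum is s* = -0.5, attained at the center (0.5, 0.5), so the returned s0 should be close to -0.5.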
The theoretic guarantees assume that the penalty parameter is increased at the rate

    μ = (1 + r/√M),

so the worst-case number of required Newton steps is O(√M). In practice, however, the parameter is often increased much faster; the worst-case bound on the number of Newton steps then becomes O(M), but the number of steps observed in practice is usually much smaller.
For potential-reduction methods, the problem is presented in the conic form:
    minimize c^T x s.t. x in (b + L) ∩ K,

where b is a vector in R^n, L is a linear subspace in R^n (so b + L is an affine plane), and K is a closed pointed convex cone with a nonempty interior. Every convex program can be converted to the conic form. To use the potential-reduction method (specifically, the extension of Karmarkar's algorithm to convex programming), we need the following assumptions:

A. The feasible set {x : x in (b + L) ∩ K} is bounded and intersects the interior of the cone K.
B. We are given in advance a strictly feasible solution, that is, a feasible solution in the interior of K.
C. We know in advance the optimal objective value c* of the problem.
D. We are given an M-logarithmically-homogeneous self-concordant barrier F for the cone K.
Assumptions A, B and D are needed in most interior-point methods. Assumption C is specific to Karmarkar's approach; it can be alleviated by using a "sliding objective value". It is possible to further reduce the program to the Karmarkar format:
    minimize s^T x s.t. x in M ∩ K and e^T x = 1,

where M is a linear subspace of R^n, e is a fixed vector, and the optimal objective value is 0.
The method is based on the following scalar potential function:

    v(x) = F(x) + M · ln(s^T x),

where F is the M-self-concordant barrier for the feasible cone. It is possible to prove that, when x is strictly feasible and v(x) is very small (i.e., very negative), x is approximately optimal. The idea of the potential-reduction method is to modify x such that the potential at each iteration drops by at least a fixed constant X (specifically, X = 1/3 - ln(4/3)). This implies that, after i iterations, the difference between the objective value and the optimal objective value is at most V · exp(-i·X/M), where V is a data-dependent constant. Therefore, the number of Newton steps required for an ε-approximate solution is at most
    O(1) · M · ln( V/ε + 1 ) + 1.

Note that in path-following methods the corresponding expression has √M rather than M, which is better in theory.
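The guarantee above, an objective gap of at most V·exp(-i·X/M) after i iterations, can be inverted to give an explicit iteration count: i ≥ (M/X)·ln(V/ε) suffices for an ε-approximate solution. A short worked computation (the numbers plugged in are illustrative):

```python
import math

def potential_reduction_iters(M, V, eps):
    """Smallest i (up to rounding) with V * exp(-i*X/M) <= eps,
    where X = 1/3 - ln(4/3) is the guaranteed potential drop per iteration."""
    X = 1.0 / 3.0 - math.log(4.0 / 3.0)
    return math.ceil(M / X * math.log(V / eps))

# e.g. barrier parameter M = 100 and relative accuracy V/eps = 10^6
print(potential_reduction_iters(100, 1.0, 1e-6))
```

Since X ≈ 0.0457, the factor M/X is roughly 22·M, so the linear dependence on M (rather than √M) is visible directly in the count.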
The primal-dual method's idea is easy to demonstrate for constrained nonlinear optimization.[8][9] For simplicity, consider the following nonlinear optimization problem with inequality constraints:

    minimize f(x) subject to c_i(x) ≥ 0 for i = 1,...,m,  x in R^n.   (1)

This inequality-constrained optimization problem is solved by converting it into an unconstrained objective function whose minimum we hope to find efficiently. Specifically, the logarithmic barrier function associated with (1) is

    B(x, μ) = f(x) - μ Σ_{i=1}^m ln(c_i(x)).   (2)

Here μ is a small positive scalar, sometimes called the "barrier parameter". As μ converges to zero, the minimum of B(x, μ) should converge to a solution of (1).
The gradient of a differentiable function h : R^n → R is denoted ∇h. The gradient of the barrier function is

    ∇B(x, μ) = ∇f(x) - μ Σ_{i=1}^m (1/c_i(x)) ∇c_i(x).   (3)

In addition to the original ("primal") variable x, we introduce a Lagrange-multiplier-inspired dual variable λ in R^m, defined by the condition

    c_i(x) λ_i = μ  for all i = 1,...,m.   (4)
Equation (4) is sometimes called the "perturbed complementarity" condition, for its resemblance to "complementary slackness" in KKT conditions.
We try to find those (x_μ, λ_μ) for which the gradient of the barrier function is zero. Substituting 1/c_i(x) = λ_i/μ from (4) into (3), we get an equation for the gradient:

    ∇f(x) - J(x)^T λ = 0,   (5)

where the matrix J is the Jacobian of the constraints c(x).

The intuition behind (5) is that the gradient of f(x) should lie in the subspace spanned by the constraints' gradients. The "perturbed complementarity" condition (4), with small μ, can be understood as requiring that the solution should either lie near the boundary c_i(x) = 0, or that the projection of the gradient ∇f on the gradient of the constraint c_i(x) should be almost zero.
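Conditions (4) and (5) can be checked numerically on a tiny example (an illustrative toy problem, not from the original text): minimize f(x) = x subject to c_1(x) = x ≥ 0. The barrier objective B(x, μ) = x - μ·ln(x) is minimized at x_μ = μ, and the dual variable defined by (4) then satisfies (5) exactly:

```python
mu = 0.01
x_mu = mu                 # minimizer of B(x, mu) = x - mu*ln(x): B' = 1 - mu/x = 0
c1 = x_mu                 # constraint value c1(x) = x at the minimizer
lam = mu / c1             # dual variable from (4): lam * c1(x) = mu

grad_condition = 1.0 - 1.0 * lam   # (5): f'(x) - c1'(x)*lam, with f' = c1' = 1

print(lam * c1, mu)       # perturbed complementarity: both equal 0.01
print(grad_condition)     # gradient condition (5): exactly 0
```

Note that λ = 1 here for every μ, matching the KKT multiplier of the limiting problem as μ → 0.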
Let (p_x, p_λ) be the search direction for iteratively updating (x, λ). Applying Newton's method to the system of equations (4) and (5), we get an equation for (p_x, p_λ):

    [ H(x, λ)        -J(x)^T    ] [ p_x ]   =  [ -∇f(x) + J(x)^T λ  ]
    [ diag(λ) J(x)   diag(c(x)) ] [ p_λ ]      [ μ·1 - diag(c(x)) λ ]

where H is the Hessian matrix of B(x, μ), diag(λ) is the diagonal matrix of λ, and diag(c(x)) is the diagonal matrix whose diagonal entries are c_1(x),...,c_m(x).
Because of (1) and (4), the condition

    λ ≥ 0

should be enforced at each step. This can be done by choosing an appropriate step size α and updating

    (x, λ) → (x + α p_x, λ + α p_λ).
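The full iteration (the block Newton system plus a step-size rule that keeps λ > 0 and c(x) > 0) can be sketched on a toy problem with linear constraints c(x) = x ≥ 0, so that H reduces to the Hessian of f; the problem instance and the schedule μ ← 0.7μ are illustrative choices:

```python
import numpy as np

# Toy problem: minimize f(x) = (x1 + 0.5)^2 + (x2 - 1)^2  s.t.  x >= 0.
# Constrained optimum: x = (0, 1) with multipliers lam = (1, 0).
def grad_f(x):
    return np.array([2.0 * (x[0] + 0.5), 2.0 * (x[1] - 1.0)])

H = 2.0 * np.eye(2)   # Hessian of f; constraints are linear, so no extra terms
J = np.eye(2)         # Jacobian of c(x) = x

def primal_dual_step(x, lam, mu):
    """Solve the block Newton system for the search direction (p_x, p_lam)."""
    c = x
    r1 = -grad_f(x) + J.T @ lam
    r2 = mu * np.ones(2) - c * lam
    K = np.block([[H, -J.T],
                  [np.diag(lam) @ J, np.diag(c)]])
    p = np.linalg.solve(K, np.concatenate([r1, r2]))
    return p[:2], p[2:]

x = np.array([1.0, 1.0])
lam = np.array([1.0, 1.0])
mu = 1.0
for _ in range(60):
    px, plam = primal_dual_step(x, lam, mu)
    alpha = 1.0                      # fraction-to-boundary: keep x, lam > 0
    for v, p in ((x, px), (lam, plam)):
        neg = p < 0
        if neg.any():
            alpha = min(alpha, 0.99 * np.min(-v[neg] / p[neg]))
    x, lam = x + alpha * px, lam + alpha * plam
    mu *= 0.7                        # drive the barrier parameter to zero
print(x, lam)   # x -> (0, 1), lam -> (1, 0)
```

The 0.99 factor is the usual fraction-to-boundary safeguard: the iterates approach the boundary x_1 = 0 but never touch it while μ > 0.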
Here are some special cases of convex programs that can be solved efficiently by interior-point methods.
Consider a linear program of the form:

    minimize c^T x s.t. Ax ≤ b.

We can apply path-following methods with the logarithmic barrier

    b(x) := -Σ_{j=1}^m ln(b_j - a_j^T x).

The function b is an m-self-concordant barrier (one logarithmic term per constraint), so the path-following method requires O(√m) Newton steps per accuracy digit.
Given a quadratically constrained quadratic program of the form:

    minimize d^T x s.t. q_j(x) := x^T A_j x + b_j^T x + c_j ≤ 0 for j in 1,...,m,

where all matrices A_j are positive-semidefinite, we can apply path-following methods with the barrier

    b(x) := -Σ_{j=1}^m ln(-q_j(x)).

The function b is a self-concordant barrier with parameter m.
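A small sketch of the gradient and Hessian of one barrier term φ(x) = -ln(-q(x)) for a quadratic constraint, checked against finite differences; the particular matrices below are illustrative:

```python
import numpy as np

# One quadratic constraint q(x) = x^T A x + b^T x + c <= 0, with barrier
#   phi(x) = -ln(-q(x)),
#   grad phi = (grad q)/(-q),
#   hess phi = (2A)/(-q) + (grad q)(grad q)^T / q^2.
A = np.array([[1.0, 0.0], [0.0, 2.0]])    # positive semidefinite
bvec = np.array([0.0, 0.0])
cst = -1.0                                 # so q(0) = -1 < 0 (strictly feasible)

def q(x):      return x @ A @ x + bvec @ x + cst
def grad_q(x): return 2.0 * A @ x + bvec

def barrier_grad(x):
    return grad_q(x) / (-q(x))

def barrier_hess(x):
    g = grad_q(x)
    return 2.0 * A / (-q(x)) + np.outer(g, g) / q(x)**2

# sanity check against central finite differences at a strictly feasible point
x0 = np.array([0.3, -0.2])
eps = 1e-6
num_grad = np.array([
    (-np.log(-q(x0 + eps * e)) + np.log(-q(x0 - eps * e))) / (2 * eps)
    for e in np.eye(2)
])
print(np.allclose(num_grad, barrier_grad(x0), atol=1e-5))
```

The rank-one term (∇q)(∇q)^T/q² is what makes the Hessian blow up near the boundary q(x) = 0, which is exactly the barrier behavior the method relies on.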
Consider a problem of the form

    minimize Σ_j |v_j - u_j^T x|_p,

where each u_j is a vector, each v_j is a scalar, and |·|_p is the p-norm with 1 < p < ∞.
Consider the problem
There is a self-concordant barrier with parameter 2k + m. The path-following method has Newton complexity O(m k^2 + k^3 + n^3) and total complexity O((k + m)^{1/2} [m k^2 + k^3 + n^3]).
Interior point methods can be used to solve semidefinite programs.