Remez algorithm explained
The Remez algorithm or Remez exchange algorithm, published by Evgeny Yakovlevich Remez in 1934, is an iterative algorithm used to find simple approximations to functions, specifically, approximations by functions in a Chebyshev space that are the best in the uniform norm L∞ sense.[1] It is sometimes referred to as Remes algorithm or Reme algorithm.
A typical example of a Chebyshev space is the subspace of polynomials of degree at most n (spanned by the Chebyshev polynomials of order up to n) in the space of real continuous functions on an interval, C[a, b]. The polynomial of best approximation within a given subspace is defined to be the one that minimizes the maximum absolute difference between the polynomial and the function. In this case, the form of the solution is made precise by the equioscillation theorem.
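As a small illustration of equioscillation (an assumed example, not taken from the text above), consider approximating f(x) = x² on [0, 1] by a polynomial of degree 1. The candidate p(x) = x − 1/8 has an error that reaches its largest magnitude, 1/8, at three points with alternating sign, which is the pattern the theorem asks for when n = 1. The Python sketch below checks this numerically; all names in it are illustrative.

```python
import numpy as np

# Assumed example: f(x) = x^2 on [0, 1], approximated by a line (n = 1).
f = lambda x: x**2
p = lambda x: x - 0.125              # candidate best linear approximation

# Largest absolute error on a dense grid of the interval.
x = np.linspace(0.0, 1.0, 10001)
err = p(x) - f(x)
max_err = np.max(np.abs(err))

# n + 2 = 3 points where the error should equal +-max_err, alternating in sign.
ref = np.array([0.0, 0.5, 1.0])
ref_err = p(ref) - f(ref)

print(max_err)    # largest |p - f| on the grid
print(ref_err)    # signed errors at the three reference points
```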
Procedure
The Remez algorithm starts with the function $f$ to be approximated and a set $X$ of $n+2$ sample points $x_1, x_2, \ldots, x_{n+2}$ in the approximation interval, usually the extrema of the Chebyshev polynomial linearly mapped to the interval. The steps are:
- Solve the linear system of equations $b_0 + b_1 x_i + \cdots + b_n x_i^n + (-1)^i E = f(x_i)$ (where $i = 1, 2, \ldots, n+2$), for the unknowns $b_0, b_1, \ldots, b_n$ and $E$.
- Use the $b_i$ as coefficients to form a polynomial $P_n$.
- Find the set $M$ of points of local maximum error $|P_n(x) - f(x)|$.
- If the errors at every $m \in M$ are of equal magnitude and alternate in sign, then $P_n$ is the minimax approximation polynomial. If not, replace $X$ with $M$ and repeat the steps above.
The result is called the polynomial of best approximation or the minimax approximation.
A review of technicalities in implementing the Remez algorithm is given by W. Fraser.[2]
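The steps above can be sketched in a few lines of Python with NumPy. This is a minimal illustration under stated assumptions, not a production implementation: the function name `remez`, its parameters, and the use of a dense grid to locate the error extrema are all choices made here. It indexes the sample points from 0 to n+1, replaces all of them in each pass, and gives up if the error does not show exactly n+2 alternating extrema.

```python
import numpy as np

def remez(f, n, a, b, max_iter=30, grid_size=4001):
    """Sketch of the exchange iteration for a degree-n polynomial on [a, b]."""
    k = np.arange(n + 2)
    signs = (-1.0) ** k
    # Initial sample points: n+2 Chebyshev extrema mapped linearly to [a, b].
    x = 0.5 * (a + b) - 0.5 * (b - a) * np.cos(np.pi * k / (n + 1))
    grid = np.linspace(a, b, grid_size)
    for _ in range(max_iter):
        # Solve b_0 + b_1 x_i + ... + b_n x_i^n + (-1)^i E = f(x_i).
        A = np.hstack([np.vander(x, n + 1, increasing=True), signs[:, None]])
        sol = np.linalg.solve(A, f(x))
        coeffs, E = sol[:-1], sol[-1]
        # Error of the current polynomial on the dense grid.
        err = np.polynomial.polynomial.polyval(grid, coeffs) - f(grid)
        # Cut the grid where the error changes sign and take the point of
        # largest |error| in each piece as a new sample point.
        positive = err >= 0
        cuts = np.nonzero(positive[1:] != positive[:-1])[0] + 1
        pieces = np.split(np.arange(grid_size), cuts)
        new_x = np.array([grid[p[np.argmax(np.abs(err[p]))]] for p in pieces])
        if len(new_x) != n + 2:
            raise RuntimeError("this sketch assumes exactly n+2 alternating extrema")
        if np.array_equal(new_x, x):
            break  # the sample points no longer move on this grid
        x = new_x
    return coeffs, E, np.max(np.abs(err))

# Degree-3 approximation of exp on [-1, 1].
coeffs, E, max_err = remez(np.exp, 3, -1.0, 1.0)
print(coeffs, abs(E), max_err)
```

Because the extrema are searched only on a grid, the result is the best approximation on that grid; a finer grid or a local refinement of each extremum would be needed for higher accuracy.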
Choice of initialization
The Chebyshev nodes are a common choice for the initial approximation because of their role in the theory of polynomial interpolation. When the optimization problem for a function $f$ is initialized with the Lagrange interpolant $L_n(f)$, it can be shown that the error of this initial approximation is bounded in terms of the best approximation error over the space $P_n$ of polynomials of degree at most $n$:
$$\lVert f - L_n(f)\rVert_\infty \le (1 + \lVert L_n\rVert_\infty) \inf_{p \in P_n} \lVert f - p\rVert_\infty$$
with the norm or Lebesgue constant of the Lagrange interpolation operator $L_n$ of the nodes $(t_1, \ldots, t_{n+1})$ being
$$\lVert L_n\rVert_\infty = \overline{\Lambda}_n(T) = \max_{-1 \le x \le 1} \lambda_n(T; x),$$
$T$ being the zeros of the Chebyshev polynomials, and the Lebesgue functions being
$$\lambda_n(T; x) = \sum_{j=1}^{n+1} \left| l_j(x) \right|, \quad l_j(x) = \prod_{\substack{i=1 \\ i \ne j}}^{n+1} \frac{x - t_i}{t_j - t_i}.$$
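The Lebesgue function can be evaluated directly from its definition as a sum of absolute values of the Lagrange basis polynomials. The sketch below (the function name `lebesgue_function` and the comparison with equally spaced nodes are illustrative assumptions) estimates the Lebesgue constant on a grid for n = 10, once for the zeros of the Chebyshev polynomial of degree n+1 and once for equally spaced nodes.

```python
import numpy as np

def lebesgue_function(nodes, x):
    """Evaluate sum_j |l_j(x)| for the Lagrange basis l_j on `nodes`."""
    nodes = np.asarray(nodes, dtype=float)
    x = np.atleast_1d(np.asarray(x, dtype=float))
    total = np.zeros_like(x)
    for j, tj in enumerate(nodes):
        others = np.delete(nodes, j)
        # l_j(x) = prod_{i != j} (x - t_i) / (t_j - t_i)
        lj = np.prod((x[:, None] - others) / (tj - others), axis=1)
        total += np.abs(lj)
    return total

n = 10
j = np.arange(1, n + 2)
cheb = np.cos((2 * j - 1) * np.pi / (2 * (n + 1)))   # zeros of T_{n+1}
equi = np.linspace(-1.0, 1.0, n + 1)                 # equally spaced nodes

xs = np.linspace(-1.0, 1.0, 20001)
leb_cheb = lebesgue_function(cheb, xs).max()   # grid estimate of the constant
leb_equi = lebesgue_function(equi, xs).max()
print(leb_cheb, leb_equi)
```

The grid maximum is only an estimate of the true maximum over [-1, 1], but it is enough to show how much smaller the constant is for Chebyshev nodes than for equally spaced ones.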
Theodore A. Kilgore,[3] Carl de Boor, and Allan Pinkus[4] proved that for each $L_n$ there exists a unique optimal set of nodes $t_i$, although it is not known explicitly for (ordinary) polynomials. Similarly, $\underline{\Lambda}_n(T) = \min_{-1 \le x \le 1} \lambda_n(T; x)$, and the optimality of a choice of nodes can be expressed as $\overline{\Lambda}_n - \underline{\Lambda}_n \ge 0$.
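For Chebyshev nodes, which provide a suboptimal but analytically explicit choice, the asymptotic behavior is known to be[5]
$$\overline{\Lambda}_n(T) = \frac{2}{\pi}\log(n+1) + \frac{2}{\pi}\left(\gamma + \log\frac{8}{\pi}\right) + \alpha_{n+1}$$
($\gamma$ being the Euler–Mascheroni constant) with
$$0 < \alpha_n < \frac{\pi}{72 n^2} \quad \text{for } n \ge 1,$$
and upper bound[6]
$$\overline{\Lambda}_n(T) \le \frac{2}{\pi}\log(n+1) + 1.$$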
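The logarithmic growth of the Lebesgue constant for Chebyshev nodes can be checked numerically. The sketch below assumes the usual form of the estimate, with leading term (2/π)(log(n+1) + γ + log(8/π)), a remainder between 0 and π/(72(n+1)²), and upper bound (2/π) log(n+1) + 1. The function name and the grid-based maximum are illustrative choices.

```python
import math
import numpy as np

def chebyshev_lebesgue_constant(n, samples=20001):
    """Grid estimate of the Lebesgue constant for the n+1 zeros of T_{n+1}."""
    j = np.arange(1, n + 2)
    t = np.cos((2 * j - 1) * np.pi / (2 * (n + 1)))
    x = np.linspace(-1.0, 1.0, samples)
    lam = np.zeros_like(x)
    for i, ti in enumerate(t):
        others = np.delete(t, i)
        lam += np.abs(np.prod((x[:, None] - others) / (ti - others), axis=1))
    return lam.max()

rows = []
for n in (1, 5, 10, 20):
    lam = chebyshev_lebesgue_constant(n)
    leading = (2 / math.pi) * (math.log(n + 1) + np.euler_gamma
                               + math.log(8 / math.pi))
    alpha = lam - leading                        # remainder term
    upper = (2 / math.pi) * math.log(n + 1) + 1  # simple upper bound
    rows.append((n, lam, alpha, upper))
    print(f"n={n:2d}  Lambda={lam:.6f}  alpha={alpha:.2e}  upper={upper:.6f}")
```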
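Lev Brutman[7] obtained a bound for $n \ge 3$, with $\hat{T}$ being the zeros of the expanded Chebyshev polynomials. The bound is a closed-form expression involving $\overline{\Lambda}_3$, a cotangent term and $\gamma - \log\pi$, and it evaluates to approximately 0.201:
$$\overline{\Lambda}_n(\hat{T}) - \underline{\Lambda}_n(\hat{T}) < 0.201.$$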
Rüdiger Günttner[8] obtained from a sharper estimate for $n \ge 40$
$$\overline{\Lambda}_n(\hat{T}) - \underline{\Lambda}_n(\hat{T}) < 0.0196.$$
Detailed discussion
This section provides more information on the steps outlined above. In this section, the index i runs from 0 to n+1.
Step 1: Given $x_0, x_1, \ldots, x_{n+1}$, solve the linear system of $n+2$ equations
$$b_0 + b_1 x_i + \cdots + b_n x_i^n + (-1)^i E = f(x_i) \quad (\text{where } i = 0, 1, \ldots, n+1),$$
for the unknowns $b_0, b_1, \ldots, b_n$ and $E$.
It should be clear that $(-1)^i E$ in this equation makes sense only if the nodes $x_0, \ldots, x_{n+1}$ are ordered, either strictly increasing or strictly decreasing. Then this linear system has a unique solution. (As is well known, not every linear system has a solution.) Also, the solution can be obtained with only $O(n^2)$ arithmetic operations, while a standard solver from the library would take $O(n^3)$ operations. Here is the simple proof:
Compute the standard $n$-th degree interpolant $p_1(x)$ to $f(x)$ at the first $n+1$ nodes and also the standard $n$-th degree interpolant $p_2(x)$ to the ordinates $(-1)^i$:
$$p_1(x_i) = f(x_i), \quad p_2(x_i) = (-1)^i, \quad i = 0, \ldots, n.$$
To this end, use each time Newton's interpolation formula with the divided differences of order $0, \ldots, n$ and $O(n^2)$ arithmetic operations.
The polynomial $p_2(x)$ has its $i$-th zero between $x_{i-1}$ and $x_i$, $i = 1, \ldots, n$, and thus no further zeroes between $x_n$ and $x_{n+1}$: $p_2(x_n)$ and $p_2(x_{n+1})$ have the same sign $(-1)^n$.
The linear combination $p(x) := p_1(x) - p_2(x) \cdot E$ is also a polynomial of degree $n$ and
$$p(x_i) = p_1(x_i) - p_2(x_i) \cdot E = f(x_i) - (-1)^i E, \quad i = 0, \ldots, n.$$
This is the same as the equation above for $i = 0, \ldots, n$ and for any choice of $E$. The same equation for $i = n+1$ is
$$p(x_{n+1}) = p_1(x_{n+1}) - p_2(x_{n+1}) \cdot E = f(x_{n+1}) - (-1)^{n+1} E$$
and needs special reasoning: solved for the variable $E$, it is the definition of $E$:
$$E := \frac{p_1(x_{n+1}) - f(x_{n+1})}{p_2(x_{n+1}) + (-1)^n}.$$
As mentioned above, the two terms in the denominator have the same sign: $E$ and thus $p(x) \equiv b_0 + b_1 x + \cdots + b_n x^n$ are always well-defined.
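The construction of Step 1 via two Newton interpolants can be sketched as follows. The helper names (`divided_differences`, `newton_eval`, `levelled_error`) are made up for this illustration, and the polynomial is returned in Newton form rather than as the coefficients b_0, ..., b_n. The result is compared with a dense solve of the full linear system.

```python
import numpy as np

def divided_differences(x, y):
    """Newton coefficients [y_0], [y_0, y_1], ... in O(n^2) operations."""
    c = np.array(y, dtype=float)
    for k in range(1, len(x)):
        c[k:] = (c[k:] - c[k - 1:-1]) / (x[k:] - x[:-k])
    return c

def newton_eval(x, c, t):
    """Evaluate the Newton-form polynomial (nodes x, coefficients c) at t."""
    result = c[-1]
    for k in range(len(c) - 2, -1, -1):
        result = result * (t - x[k]) + c[k]
    return result

def levelled_error(f, x):
    """Return E and p = p1 - E*p2 for ordered nodes x_0, ..., x_{n+1}."""
    n = len(x) - 2
    head, last = x[:-1], x[-1]
    c1 = divided_differences(head, f(head))                      # p1 -> f
    c2 = divided_differences(head, (-1.0) ** np.arange(n + 1))   # p2 -> (-1)^i
    E = (newton_eval(head, c1, last) - f(last)) / (
        newton_eval(head, c2, last) + (-1.0) ** n)
    c = c1 - E * c2          # same nodes, so Newton coefficients combine linearly
    return E, (lambda t: newton_eval(head, c, t))

# Example: n = 3, nodes at the Chebyshev extrema of [-1, 1], f = exp.
n = 3
i = np.arange(n + 2)
x = -np.cos(np.pi * i / (n + 1))
E, p = levelled_error(np.exp, x)

# Reference: solve the full (n+2) x (n+2) system with a general solver.
A = np.hstack([np.vander(x, n + 1, increasing=True), ((-1.0) ** i)[:, None]])
E_dense = np.linalg.solve(A, np.exp(x))[-1]
print(E, E_dense)
```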
The error at the given $n+2$ ordered nodes is positive and negative in turn because
$$p(x_i) - f(x_i) = -(-1)^i E, \quad i = 0, \ldots, n+1.$$
The theorem of de La Vallée Poussin states that under this condition no polynomial of degree $n$ exists with error less than $E$. Indeed, if such a polynomial existed, call it $\tilde{p}(x)$, then the difference
$$p(x) - \tilde{p}(x) = (p(x) - f(x)) - (\tilde{p}(x) - f(x))$$
would still be positive/negative in turn at the $n+2$ nodes $x_i$ and therefore have at least $n+1$ zeros, which is impossible for a polynomial of degree $n$. Thus, this $E$ is a lower bound for the minimum error which can be achieved with polynomials of degree $n$.
Step 2 changes the notation from $b_0 + b_1 x + \cdots + b_n x^n$ to $p(x)$.
Step 3 improves upon the input nodes $x_0, \ldots, x_{n+1}$ and their errors $\pm E$ as follows.
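In each P-region (where the error $p(x) - f(x)$ is positive), the current node $x_i$ is replaced with the local maximizer $\bar{x}_i$, and in each N-region (where the error is negative) $x_i$ is replaced with the local minimizer. (Expect $\bar{x}_0$ at $A$, the $\bar{x}_i$ near $x_i$, and $\bar{x}_{n+1}$ at $B$, where $A$ and $B$ denote the ends of the approximation interval.) No high precision is required here; the standard line search with a couple of quadratic fits should suffice. (See [9])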
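A low-precision search of this kind might look like the sketch below. It is one possible realization, not the method of the cited reference: the helper `refine_extremum` is hypothetical, the bracket [lo, hi] around the current node is assumed to be known, and a coarse scan followed by a single quadratic fit stands in for the line search.

```python
import numpy as np

def refine_extremum(err, lo, hi, sign, samples=33):
    """Approximate the maximizer of sign*err on [lo, hi].

    sign = +1 looks for a local maximum of err, sign = -1 for a local minimum.
    """
    t = np.linspace(lo, hi, samples)
    v = sign * err(t)
    k = int(np.argmax(v))
    if k == 0 or k == samples - 1:
        return t[k]                   # extremum sits at an end of the bracket
    # Vertex of the parabola through the best sample and its two neighbours.
    h = t[1] - t[0]
    curvature = v[k - 1] - 2.0 * v[k] + v[k + 1]
    if curvature >= 0.0:
        return t[k]                   # flat neighbourhood: keep the grid point
    return t[k] + 0.5 * h * (v[k - 1] - v[k + 1]) / curvature

# Illustrative error curve: sin(3t) has a maximum at pi/6 and a minimum at pi/2.
err = lambda t: np.sin(3.0 * t)
x_max = refine_extremum(err, 0.0, 1.0, +1)
x_min = refine_extremum(err, 1.0, 2.0, -1)
print(x_max, x_min)
```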
Let $z_i := p(\bar{x}_i) - f(\bar{x}_i)$. Each amplitude $|z_i|$ is greater than or equal to $E$. The theorem of de La Vallée Poussin and its proof also apply to $z_0, \ldots, z_{n+1}$, with $\min\{|z_i|\} \ge E$ as the new lower bound for the best error possible with polynomials of degree $n$.
Moreover, $\max\{|z_i|\}$ comes in handy as an obvious upper bound for that best possible error.
Step 4: With $\min\{|z_i|\}$ and $\max\{|z_i|\}$ as lower and upper bound for the best possible approximation error, one has a reliable stopping criterion: repeat the steps until $\max\{|z_i|\} - \min\{|z_i|\}$ is sufficiently small or no longer decreases. These bounds indicate the progress.
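The bracketing behind this stopping criterion can be written as one chain of inequalities. Here $E^*$ denotes the smallest maximum error achievable with polynomials of degree $n$ and $\varepsilon$ is a user-chosen tolerance; both symbols are introduced only for this illustration, and the relative form of the test is one possible choice. The upper bound assumes that the largest $|z_i|$ is the global maximum of $|p - f|$ on the interval.

$$|E| \le \min_i |z_i| \le E^* \le \max_i |z_i|, \qquad \text{stop when } \max_i |z_i| - \min_i |z_i| \le \varepsilon \max_i |z_i|.$$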
Variants
Some modifications of the algorithm are present in the literature. These include:
- Replacing more than one sample point with the locations of nearby maximum absolute differences.
- Replacing all of the sample points in a single iteration with the locations of all, alternating sign, maximum differences.[10]
- Using the relative error to measure the difference between the approximation and the function, especially if the approximation will be used to compute the function on a computer which uses floating point arithmetic.
- Including zero-error point constraints.
- The Fraser-Hart variant, used to determine the best rational Chebyshev approximation.[11]
Notes and References
- E. Ya. Remez, "Sur la détermination des polynômes d'approximation de degré donnée", Comm. Soc. Math. Kharkov 10, 41 (1934); "Sur un procédé convergent d'approximations successives pour déterminer les polynômes d'approximation", Compt. Rend. Acad. Sc. 198, 2063 (1934); "Sur le calcul effectif des polynômes d'approximation de Tschebyscheff", Compt. Rend. Acad. Sc. 199, 337 (1934).
- W. Fraser, "A Survey of Methods of Computing Minimax and Near-Minimax Polynomial Approximations for Functions of a Single Independent Variable", J. ACM 12 (3), 295–314 (1965). doi:10.1145/321281.321282. S2CID 2736060.
- T. A. Kilgore, "A characterization of the Lagrange interpolating projection with minimal Tchebycheff norm", J. Approx. Theory 24 (4), 273–288 (1978). doi:10.1016/0021-9045(78)90013-8.
- C. de Boor and A. Pinkus, "Proof of the conjectures of Bernstein and Erdös concerning the optimal nodes for polynomial interpolation", J. Approx. Theory 24 (4), 289–303 (1978). doi:10.1016/0021-9045(78)90014-X.
- F. W. Luttmann and T. J. Rivlin, "Some numerical experiments in the theory of polynomial interpolation", IBM J. Res. Dev. 9 (3), 187–191 (1965). doi:10.1147/rd.93.0187.
- T. Rivlin, "The Lebesgue constants for polynomial interpolation", in Proceedings of the Int. Conf. on Functional Analysis and Its Application, edited by H. G. Garnier et al. (Springer-Verlag, Berlin, 1974), p. 422; The Chebyshev polynomials (Wiley-Interscience, New York, 1974).
- L. Brutman, "On the Lebesgue Function for Polynomial Interpolation", SIAM J. Numer. Anal. 15 (4), 694–704 (1978). doi:10.1137/0715046. Bibcode:1978SJNA...15..694B.
- R. Günttner, "Evaluation of Lebesgue Constants", SIAM J. Numer. Anal. 17 (4), 512–520 (1980). doi:10.1137/0717043. Bibcode:1980SJNA...17..512G.
- David G. Luenberger, Introduction to Linear and Nonlinear Programming, Addison-Wesley Publishing Company, 1973.
- G. C. Temes, V. Barcilon and F. C. Marshall, "The optimization of bandlimited systems", Proceedings of the IEEE 61 (2), 196–234 (1973). doi:10.1109/PROC.1973.9004. ISSN 0018-9219.
- Charles B. Dunham, "Convergence of the Fraser-Hart algorithm for rational Chebyshev approximation", Mathematics of Computation 29 (132), 1078–1082 (1975). doi:10.1090/S0025-5718-1975-0388732-9. ISSN 0025-5718.