Recursive least squares (RLS) is an adaptive filter algorithm that recursively finds the coefficients that minimize a weighted linear least squares cost function relating to the input signals. This approach is in contrast to other algorithms such as the least mean squares (LMS) that aim to reduce the mean square error. In the derivation of the RLS, the input signals are considered deterministic, while for the LMS and similar algorithms they are considered stochastic. Compared to most of its competitors, the RLS exhibits extremely fast convergence. However, this benefit comes at the cost of high computational complexity.
RLS was discovered by Gauss but lay unused or ignored until 1950 when Plackett rediscovered the original work of Gauss from 1821. In general, the RLS can be used to solve any problem that can be solved by adaptive filters. For example, suppose that a signal $d(n)$ is transmitted over an echoey, noisy channel that causes it to be received as

$$x(n) = \sum_{k=0}^{q} b_n(k)\, d(n-k) + v(n)$$

where $v(n)$ represents additive noise. The intent of the RLS filter is to recover the desired signal $d(n)$ by use of a $(p+1)$-tap FIR filter $\mathbf{w}$:

$$d(n) \approx \sum_{k=0}^{p} w(k)\, x(n-k) = \mathbf{w}^T \mathbf{x}_n$$

where $\mathbf{x}_n = [x(n)\ x(n-1)\ \ldots\ x(n-p)]^T$ is the column vector containing the $p+1$ most recent samples of $x(n)$. The estimate of the recovered desired signal is

$$\hat{d}(n) = \sum_{k=0}^{p} w_n(k)\, x(n-k) = \mathbf{w}_n^T \mathbf{x}_n$$
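To make this setup concrete, the following sketch simulates a toy version of the channel model in NumPy and evaluates the FIR estimate $\mathbf{w}^T\mathbf{x}_n$ for an arbitrary, not yet optimized coefficient vector. The specific channel taps, noise level, and helper names are illustrative assumptions, not part of the algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 1000

# Toy channel model: x(n) = sum_k b(k) d(n-k) + v(n)
d = rng.standard_normal(n_samples)        # transmitted signal d(n)
b = np.array([1.0, 0.5, -0.3])            # hypothetical (time-invariant) channel taps b_n(k)
v = 0.1 * rng.standard_normal(n_samples)  # additive noise v(n)
x = np.convolve(d, b)[:n_samples] + v     # received signal x(n)

# FIR estimate d_hat(n) = w^T x_n with x_n = [x(n), x(n-1), ..., x(n-p)]^T
p = 3                                     # filter order
w = rng.standard_normal(p + 1)            # arbitrary coefficient vector (not yet optimized)

def d_hat(n, w, x, p):
    """Return w^T x_n, treating samples before time 0 as zero."""
    x_n = np.array([x[n - k] if n - k >= 0 else 0.0 for k in range(p + 1)])
    return w @ x_n

print(d_hat(10, w, x, p))
```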
The goal is to estimate the parameters of the filter $\mathbf{w}$, and at each time $n$ we refer to the current estimate as $\mathbf{w}_n$ and the adapted least-squares estimate as $\mathbf{w}_{n+1}$. $\mathbf{w}_n$ is also a column vector, as shown below, and the transpose, $\mathbf{w}_n^T$, is a row vector. The matrix product $\mathbf{w}_n^T \mathbf{x}_n$ (which is the dot product of $\mathbf{w}_n$ and $\mathbf{x}_n$) is $\hat{d}(n)$, a scalar. The estimate is considered good if $\hat{d}(n) - d(n)$ is small in magnitude in some least-squares sense.
As time evolves, it is desired to avoid completely redoing the least squares algorithm to find the new estimate for $\mathbf{w}_{n+1}$ in terms of $\mathbf{w}_n$.
The benefit of the RLS algorithm is that there is no need to invert matrices, thereby saving computational cost. Another advantage is that it provides intuition behind such results as the Kalman filter.
The idea behind RLS filters is to minimize a cost function $C$ by appropriately selecting the filter coefficients $\mathbf{w}_n$, updating the filter as new data arrives. The error signal $e(n)$ is defined with respect to the desired signal $d(n)$, and it implicitly depends on the filter coefficients through the estimate $\hat{d}(n)$:

$$e(n) = d(n) - \hat{d}(n)$$
The weighted least squares error function $C$, the cost function we desire to minimize, is a function of $e(n)$ and is therefore also dependent on the filter coefficients:

$$C(\mathbf{w}_n) = \sum_{i=0}^{n} \lambda^{n-i} e^2(i)$$

where $0 < \lambda \le 1$ is the forgetting factor, which gives exponentially less weight to older error samples.
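As a concrete reading of this cost function, the sketch below evaluates $C(\mathbf{w}_n)$ for a given coefficient vector by direct summation. The zero-padding of $x$ for negative indices is an assumption, matching the prewindowed convention used in the algorithm summary later on.

```python
import numpy as np

def weighted_cost(w, x, d, lam):
    """C(w_n) = sum_{i=0}^{n} lam^(n-i) * e(i)^2, with e(i) = d(i) - w^T x_i
    and x_i = [x(i), x(i-1), ..., x(i-p)]^T (samples before time 0 taken as zero)."""
    p = len(w) - 1
    n = len(d) - 1
    cost = 0.0
    for i in range(n + 1):
        x_i = np.array([x[i - k] if i - k >= 0 else 0.0 for k in range(p + 1)])
        e_i = d[i] - w @ x_i
        cost += lam ** (n - i) * e_i ** 2
    return cost
```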
The cost function is minimized by taking the partial derivatives for all entries $k$ of the coefficient vector $\mathbf{w}_n$ and setting the results to zero:

$$\frac{\partial C(\mathbf{w}_n)}{\partial w_n(k)} = \sum_{i=0}^{n} 2\lambda^{n-i} e(i) \cdot \frac{\partial e(i)}{\partial w_n(k)} = -\sum_{i=0}^{n} 2\lambda^{n-i} e(i)\, x(i-k) = 0, \qquad k = 0, 1, \ldots, p$$
Next, replace $e(n)$ with the definition of the error signal:

$$\sum_{i=0}^{n} \lambda^{n-i} \left[d(i) - \sum_{\ell=0}^{p} w_n(\ell)\, x(i-\ell)\right] x(i-k) = 0, \qquad k = 0, 1, \ldots, p$$

Rearranging the equation yields

$$\sum_{\ell=0}^{p} w_n(\ell) \left[\sum_{i=0}^{n} \lambda^{n-i} x(i-\ell)\, x(i-k)\right] = \sum_{i=0}^{n} \lambda^{n-i} d(i)\, x(i-k), \qquad k = 0, 1, \ldots, p$$
This form can be expressed in terms of matrices as

$$\mathbf{R}_x(n)\, \mathbf{w}_n = \mathbf{r}_{dx}(n)$$

where $\mathbf{R}_x(n)$ is the weighted sample covariance matrix for $x(n)$, and $\mathbf{r}_{dx}(n)$ is the equivalent estimate of the cross-covariance between $d(n)$ and $x(n)$. Based on this expression, the coefficients which minimize the cost function are

$$\mathbf{w}_n = \mathbf{R}_x^{-1}(n)\, \mathbf{r}_{dx}(n)$$

This is the main result of the discussion.
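The non-recursive solution follows directly from this result by accumulating $\mathbf{R}_x(n)$ and $\mathbf{r}_{dx}(n)$ and solving the linear system once. The sketch below does exactly that; it is the baseline that the recursive algorithm in the next section avoids repeating at every time step.

```python
import numpy as np

def batch_wls(x, d, p, lam):
    """Non-recursive solution: build R_x(n) and r_dx(n) at the final time n
    and solve R_x(n) w_n = r_dx(n) directly."""
    n = len(d) - 1
    R = np.zeros((p + 1, p + 1))
    r = np.zeros(p + 1)
    for i in range(n + 1):
        x_i = np.array([x[i - k] if i - k >= 0 else 0.0 for k in range(p + 1)])
        R += lam ** (n - i) * np.outer(x_i, x_i)   # weighted sample covariance
        r += lam ** (n - i) * d[i] * x_i           # weighted cross-covariance
    return np.linalg.solve(R, r)                   # w_n = R_x^{-1}(n) r_dx(n)
```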
The smaller $\lambda$ is, the smaller the contribution of previous samples to the covariance matrix. This makes the filter more sensitive to recent samples, which means more fluctuations in the filter coefficients. The $\lambda = 1$ case is referred to as the growing-window RLS algorithm. In practice, $\lambda$ is usually chosen between 0.98 and 1. By using type-II maximum likelihood estimation, the optimal $\lambda$ can be estimated from a set of data.
The discussion resulted in a single equation to determine a coefficient vector which minimizes the cost function. In this section we want to derive a recursive solution of the form
$$\mathbf{w}_n = \mathbf{w}_{n-1} + \Delta\mathbf{w}_{n-1}$$

where $\Delta\mathbf{w}_{n-1}$ is a correction factor at time $n-1$. We start the derivation of the recursive algorithm by expressing the cross-covariance $\mathbf{r}_{dx}(n)$ in terms of $\mathbf{r}_{dx}(n-1)$:

$$\begin{aligned}
\mathbf{r}_{dx}(n) &= \sum_{i=0}^{n} \lambda^{n-i} d(i)\, \mathbf{x}(i) \\
&= \sum_{i=0}^{n-1} \lambda^{n-i} d(i)\, \mathbf{x}(i) + \lambda^0 d(n)\, \mathbf{x}(n) \\
&= \lambda\, \mathbf{r}_{dx}(n-1) + d(n)\, \mathbf{x}(n)
\end{aligned}$$

where $\mathbf{x}(i)$ is the $(p+1)$-dimensional data vector

$$\mathbf{x}(i) = [x(i),\, x(i-1),\, \ldots,\, x(i-p)]^T$$
Similarly we express $\mathbf{R}_x(n)$ in terms of $\mathbf{R}_x(n-1)$ by

$$\begin{aligned}
\mathbf{R}_x(n) &= \sum_{i=0}^{n} \lambda^{n-i} \mathbf{x}(i)\, \mathbf{x}^T(i) \\
&= \lambda\, \mathbf{R}_x(n-1) + \mathbf{x}(n)\, \mathbf{x}^T(n)
\end{aligned}$$
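These two rank-one update identities are easy to verify numerically. The sketch below recomputes $\mathbf{R}_x$ and $\mathbf{r}_{dx}$ from scratch at times $n-1$ and $n$ and checks them against the recursions; the signal, filter order, and forgetting factor are arbitrary test values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
lam, p = 0.99, 2
x = rng.standard_normal(50)
d = rng.standard_normal(50)

def regressor(x, i, p):
    """x(i) = [x(i), x(i-1), ..., x(i-p)]^T, with samples before time 0 taken as zero."""
    return np.array([x[i - k] if i - k >= 0 else 0.0 for k in range(p + 1)])

def batch_R_r(x, d, n, p, lam):
    """Recompute R_x(n) and r_dx(n) by direct summation."""
    R = np.zeros((p + 1, p + 1)); r = np.zeros(p + 1)
    for i in range(n + 1):
        xi = regressor(x, i, p)
        R += lam ** (n - i) * np.outer(xi, xi)
        r += lam ** (n - i) * d[i] * xi
    return R, r

n = 30
R_prev, r_prev = batch_R_r(x, d, n - 1, p, lam)
R_n, r_n = batch_R_r(x, d, n, p, lam)
xn = regressor(x, n, p)
assert np.allclose(R_n, lam * R_prev + np.outer(xn, xn))   # R_x(n) = λ R_x(n-1) + x(n) x^T(n)
assert np.allclose(r_n, lam * r_prev + d[n] * xn)          # r_dx(n) = λ r_dx(n-1) + d(n) x(n)
```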
In order to generate the coefficient vector we are interested in the inverse of the deterministic autocovariance matrix. For that task the Woodbury matrix identity comes in handy, with

$A = \lambda\, \mathbf{R}_x(n-1)$, which is $(p+1)\times(p+1)$
$U = \mathbf{x}(n)$, which is $(p+1)\times 1$
$V = \mathbf{x}^T(n)$, which is $1\times(p+1)$
$C = I_1$, the $1\times 1$ identity matrix

The Woodbury matrix identity then gives

$$\begin{aligned}
\mathbf{R}_x^{-1}(n) &= \left[\lambda\, \mathbf{R}_x(n-1) + \mathbf{x}(n)\, \mathbf{x}^T(n)\right]^{-1} \\
&= \lambda^{-1}\left\{\mathbf{R}_x^{-1}(n-1) - \frac{\mathbf{R}_x^{-1}(n-1)\, \mathbf{x}(n)\, \mathbf{x}^T(n)\, \mathbf{R}_x^{-1}(n-1)}{\lambda + \mathbf{x}^T(n)\, \mathbf{R}_x^{-1}(n-1)\, \mathbf{x}(n)}\right\}
\end{aligned}$$

To come in line with the standard literature, we define

$$\begin{aligned}
\mathbf{P}(n) &= \mathbf{R}_x^{-1}(n) \\
&= \lambda^{-1}\mathbf{P}(n-1) - \mathbf{g}(n)\, \mathbf{x}^T(n)\, \lambda^{-1}\mathbf{P}(n-1)
\end{aligned}$$
where the gain vector $\mathbf{g}(n)$ is

$$\begin{aligned}
\mathbf{g}(n) &= \lambda^{-1}\mathbf{P}(n-1)\, \mathbf{x}(n)\left\{1 + \mathbf{x}^T(n)\, \lambda^{-1}\mathbf{P}(n-1)\, \mathbf{x}(n)\right\}^{-1} \\
&= \mathbf{P}(n-1)\, \mathbf{x}(n)\left\{\lambda + \mathbf{x}^T(n)\, \mathbf{P}(n-1)\, \mathbf{x}(n)\right\}^{-1}
\end{aligned}$$
Before we move on, it is necessary to bring $\mathbf{g}(n)$ into another form:

$$\begin{aligned}
\mathbf{g}(n)\left\{1 + \mathbf{x}^T(n)\, \lambda^{-1}\mathbf{P}(n-1)\, \mathbf{x}(n)\right\} &= \lambda^{-1}\mathbf{P}(n-1)\, \mathbf{x}(n) \\
\mathbf{g}(n) + \mathbf{g}(n)\, \mathbf{x}^T(n)\, \lambda^{-1}\mathbf{P}(n-1)\, \mathbf{x}(n) &= \lambda^{-1}\mathbf{P}(n-1)\, \mathbf{x}(n)
\end{aligned}$$

Subtracting the second term on the left side yields

$$\begin{aligned}
\mathbf{g}(n) &= \lambda^{-1}\mathbf{P}(n-1)\, \mathbf{x}(n) - \mathbf{g}(n)\, \mathbf{x}^T(n)\, \lambda^{-1}\mathbf{P}(n-1)\, \mathbf{x}(n) \\
&= \lambda^{-1}\left[\mathbf{P}(n-1) - \mathbf{g}(n)\, \mathbf{x}^T(n)\, \mathbf{P}(n-1)\right]\mathbf{x}(n)
\end{aligned}$$
With the recursive definition of $\mathbf{P}(n)$, the desired form follows:

$$\mathbf{g}(n) = \mathbf{P}(n)\, \mathbf{x}(n)$$
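Both the rank-one update for $\mathbf{P}(n)$ and the identity $\mathbf{g}(n) = \mathbf{P}(n)\,\mathbf{x}(n)$ can be checked numerically. The sketch below does so for an arbitrary positive-definite $\mathbf{R}_x(n-1)$ and regressor; all numerical values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
lam, p = 0.98, 3

M = rng.standard_normal((p + 1, p + 1))
R_prev = M @ M.T + np.eye(p + 1)          # an arbitrary positive-definite R_x(n-1)
xn = rng.standard_normal(p + 1)           # current regressor x(n)

P_prev = np.linalg.inv(R_prev)                        # P(n-1) = R_x^{-1}(n-1)
g = (P_prev @ xn) / (lam + xn @ P_prev @ xn)          # g(n) = P(n-1)x(n) / {λ + x^T(n)P(n-1)x(n)}
P_new = (P_prev - np.outer(g, xn @ P_prev)) / lam     # P(n) = λ^{-1}[P(n-1) - g(n)x^T(n)P(n-1)]

# P(n) matches the direct inverse of λ R_x(n-1) + x(n)x^T(n), and g(n) = P(n)x(n)
assert np.allclose(P_new, np.linalg.inv(lam * R_prev + np.outer(xn, xn)))
assert np.allclose(g, P_new @ xn)
```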
Now we are ready to complete the recursion. As derived above,

$$\begin{aligned}
\mathbf{w}_n &= \mathbf{P}(n)\, \mathbf{r}_{dx}(n) \\
&= \lambda\, \mathbf{P}(n)\, \mathbf{r}_{dx}(n-1) + d(n)\, \mathbf{P}(n)\, \mathbf{x}(n)
\end{aligned}$$

The second step follows from the recursive definition of $\mathbf{r}_{dx}(n)$. Next we incorporate the recursive definition of $\mathbf{P}(n)$ together with the alternate form of $\mathbf{g}(n)$ and get

$$\begin{aligned}
\mathbf{w}_n &= \lambda\left[\lambda^{-1}\mathbf{P}(n-1) - \mathbf{g}(n)\, \mathbf{x}^T(n)\, \lambda^{-1}\mathbf{P}(n-1)\right]\mathbf{r}_{dx}(n-1) + d(n)\, \mathbf{g}(n) \\
&= \mathbf{P}(n-1)\, \mathbf{r}_{dx}(n-1) - \mathbf{g}(n)\, \mathbf{x}^T(n)\, \mathbf{P}(n-1)\, \mathbf{r}_{dx}(n-1) + d(n)\, \mathbf{g}(n) \\
&= \mathbf{P}(n-1)\, \mathbf{r}_{dx}(n-1) + \mathbf{g}(n)\left[d(n) - \mathbf{x}^T(n)\, \mathbf{P}(n-1)\, \mathbf{r}_{dx}(n-1)\right]
\end{aligned}$$

With $\mathbf{w}_{n-1} = \mathbf{P}(n-1)\, \mathbf{r}_{dx}(n-1)$ we arrive at the update equation

$$\begin{aligned}
\mathbf{w}_n &= \mathbf{w}_{n-1} + \mathbf{g}(n)\left[d(n) - \mathbf{x}^T(n)\, \mathbf{w}_{n-1}\right] \\
&= \mathbf{w}_{n-1} + \mathbf{g}(n)\, \alpha(n)
\end{aligned}$$

where $\alpha(n) = d(n) - \mathbf{x}^T(n)\, \mathbf{w}_{n-1}$ is the a priori error. Compare this with the a posteriori error, the error calculated after the filter is updated:

$$e(n) = d(n) - \mathbf{x}^T(n)\, \mathbf{w}_n$$
That means we found the correction factor

$$\Delta\mathbf{w}_{n-1} = \mathbf{g}(n)\, \alpha(n)$$

This intuitively satisfying result indicates that the correction factor is directly proportional to both the error and the gain vector, which controls how much sensitivity is desired, through the weighting factor $\lambda$.
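A single update step also makes the relation between the two errors tangible: substituting the update equation into $e(n)$ gives $e(n) = \alpha(n)\left[1 - \mathbf{x}^T(n)\mathbf{g}(n)\right] = \alpha(n)\,\lambda / \left(\lambda + \mathbf{x}^T(n)\mathbf{P}(n-1)\mathbf{x}(n)\right)$, so the a posteriori error is always a scaled-down copy of the a priori error. The sketch below, with arbitrary illustrative values, verifies this for one step.

```python
import numpy as np

rng = np.random.default_rng(3)
lam, p = 0.99, 3
P = 100.0 * np.eye(p + 1)                # P(n-1), large initial "uncertainty"
w = np.zeros(p + 1)                      # w_{n-1}
xn = rng.standard_normal(p + 1)          # current regressor x(n)
dn = 1.0                                 # current desired sample d(n)

alpha = dn - xn @ w                      # a priori error α(n)
g = (P @ xn) / (lam + xn @ P @ xn)       # gain vector g(n)
w_new = w + g * alpha                    # w_n = w_{n-1} + g(n) α(n)
e = dn - xn @ w_new                      # a posteriori error e(n)

# e(n) = α(n) · λ / (λ + x^T(n) P(n-1) x(n)), hence |e(n)| <= |α(n)|
assert np.isclose(e, alpha * lam / (lam + xn @ P @ xn))
```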
The RLS algorithm for a p-th order RLS filter can be summarized as
Parameters:
 $p$ = filter order
 $\lambda$ = forgetting factor
 $\delta$ = value used to initialize $\mathbf{P}(0)$

Initialization:
 $\mathbf{w}(0) = 0$
 $x(k) = 0,\ k = -p, \ldots, -1$
 $d(k) = 0,\ k = -p, \ldots, -1$
 $\mathbf{P}(0) = \delta I$, where $I$ is the identity matrix of rank $p+1$

Computation: For $n = 1, 2, \ldots$

$$\mathbf{x}(n) = \begin{bmatrix} x(n) \\ x(n-1) \\ \vdots \\ x(n-p) \end{bmatrix}$$

$$\alpha(n) = d(n) - \mathbf{x}^T(n)\, \mathbf{w}(n-1)$$

$$\mathbf{g}(n) = \mathbf{P}(n-1)\, \mathbf{x}(n)\left\{\lambda + \mathbf{x}^T(n)\, \mathbf{P}(n-1)\, \mathbf{x}(n)\right\}^{-1}$$

$$\mathbf{P}(n) = \lambda^{-1}\mathbf{P}(n-1) - \mathbf{g}(n)\, \mathbf{x}^T(n)\, \lambda^{-1}\mathbf{P}(n-1)$$

$$\mathbf{w}(n) = \mathbf{w}(n-1) + \alpha(n)\, \mathbf{g}(n)$$
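A direct NumPy transcription of this summary is sketched below. The prewindowing convention ($x(k) = 0$ for $k < 0$), the system-identification test signal, and the default values of $\lambda$ and $\delta$ are illustrative assumptions rather than part of the algorithm.

```python
import numpy as np

def rls(x, d, p, lam=0.99, delta=100.0):
    """RLS filter following the summary above.
    x, d  : input and desired signals (equal length)
    p     : filter order (p+1 taps)
    lam   : forgetting factor λ
    delta : δ used to initialize P(0) = δI
    Returns the final weights and the a priori error sequence α(n)."""
    w = np.zeros(p + 1)                       # w(0) = 0
    P = delta * np.eye(p + 1)                 # P(0) = δI
    x_pad = np.concatenate([np.zeros(p), x])  # x(k) = 0 for k < 0 (prewindowing)
    alphas = np.zeros(len(d))
    for n in range(len(d)):
        xn = x_pad[n:n + p + 1][::-1]         # x(n) = [x(n), x(n-1), ..., x(n-p)]^T
        alpha = d[n] - xn @ w                 # a priori error α(n)
        g = (P @ xn) / (lam + xn @ P @ xn)    # gain vector g(n)
        P = (P - np.outer(g, xn @ P)) / lam   # P(n) = λ^{-1}P(n-1) - g(n)x^T(n)λ^{-1}P(n-1)
        w = w + alpha * g                     # w(n) = w(n-1) + α(n)g(n)
        alphas[n] = alpha
    return w, alphas

# Illustrative use: identify a hypothetical 3-tap FIR system from noisy observations
rng = np.random.default_rng(0)
x = rng.standard_normal(2000)
h = np.array([1.0, 0.5, -0.3])
d = np.convolve(x, h)[:2000] + 0.01 * rng.standard_normal(2000)
w, _ = rls(x, d, p=2)
print(w)   # should be close to h
```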
The recursion for $\mathbf{P}$ follows an algebraic Riccati equation and thus draws parallels to the Kalman filter.
The lattice recursive least squares adaptive filter is related to the standard RLS except that it requires fewer arithmetic operations (order N).[4] It offers additional advantages over conventional LMS algorithms such as faster convergence rates, modular structure, and insensitivity to variations in eigenvalue spread of the input correlation matrix. The LRLS algorithm described is based on a posteriori errors and includes the normalized form. The derivation is similar to the standard RLS algorithm and is based on the definition of
$d(k)$. In the forward prediction case, we have $d(k) = x(k)$ with the input signal $x(k-1)$ as the most recent sample. The backward prediction case is $d(k) = x(k-i-1)$, where $i$ is the index of the sample in the past we want to predict, and the input signal $x(k)$ is the most recent sample.

Parameter summary:
 $\kappa_f(k,i)$ is the forward reflection coefficient
 $\kappa_b(k,i)$ is the backward reflection coefficient
 $e_f(k,i)$ represents the instantaneous a posteriori forward prediction error
 $e_b(k,i)$ represents the instantaneous a posteriori backward prediction error
 $\xi^d_{b_{\min}}(k,i)$ is the minimum least-squares backward prediction error
 $\xi^d_{f_{\min}}(k,i)$ is the minimum least-squares forward prediction error
 $\gamma(k,i)$ is a conversion factor between a priori and a posteriori errors
 $v_i(k)$ are the feedforward multiplier coefficients
 $\varepsilon$ is a small positive constant that can be 0.01
The algorithm for an LRLS filter can be summarized as
Initialization: For $i = 0, 1, \ldots, N$
 $\delta(-1,i) = \delta_D(-1,i) = 0$
 $\xi^d_{b_{\min}}(-1,i) = \xi^d_{f_{\min}}(-1,i) = \varepsilon$
 $\gamma(-1,i) = 1$
 $e_b(-1,i) = 0$
End

Computation: For $k \ge 0$
 $\gamma(k,0) = 1$
 $e_b(k,0) = e_f(k,0) = x(k)$
 $\xi^d_{b_{\min}}(k,0) = \xi^d_{f_{\min}}(k,0) = x^2(k) + \lambda\, \xi^d_{f_{\min}}(k-1,0)$
 $e(k,0) = d(k)$
 For $i = 0, 1, \ldots, N$
  $\delta(k,i) = \lambda\, \delta(k-1,i) + \dfrac{e_b(k-1,i)\, e_f(k,i)}{\gamma(k-1,i)}$
  $\gamma(k,i+1) = \gamma(k,i) - \dfrac{e_b^2(k,i)}{\xi^d_{b_{\min}}(k,i)}$
  $\kappa_b(k,i) = \dfrac{\delta(k,i)}{\xi^d_{f_{\min}}(k,i)}$
  $\kappa_f(k,i) = \dfrac{\delta(k,i)}{\xi^d_{b_{\min}}(k-1,i)}$
  $e_b(k,i+1) = e_b(k-1,i) - \kappa_b(k,i)\, e_f(k,i)$
  $e_f(k,i+1) = e_f(k,i) - \kappa_f(k,i)\, e_b(k-1,i)$
  $\xi^d_{b_{\min}}(k,i+1) = \xi^d_{b_{\min}}(k-1,i) - \delta(k,i)\, \kappa_b(k,i)$
  $\xi^d_{f_{\min}}(k,i+1) = \xi^d_{f_{\min}}(k,i) - \delta(k,i)\, \kappa_f(k,i)$
  Feedforward filtering
  $\delta_D(k,i) = \lambda\, \delta_D(k-1,i) + \dfrac{e(k,i)\, e_b(k,i)}{\gamma(k,i)}$
  $v_i(k) = \dfrac{\delta_D(k,i)}{\xi^d_{b_{\min}}(k,i)}$
  $e(k,i+1) = e(k,i) - v_i(k)\, e_b(k,i)$
 End
End
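The tabulated recursions translate almost line by line into code. The sketch below is one such transcription; the variable names, the handling of the time-$(k-1)$ state, and the choice to return the last-stage joint-process error are my own, while the regularization constant $\varepsilon$ follows the initialization above.

```python
import numpy as np

def lrls(x, d, N, lam=0.99, eps=1e-2):
    """A posteriori lattice RLS (LRLS) joint-process estimator, transcribing the
    recursions tabulated above. N is the number of lattice stages; returns the
    joint-process error e(k, N+1) at each time k."""
    # time-(k-1) state, initialized as in the table
    delta   = np.zeros(N + 1)            # δ(-1, i) = 0
    delta_D = np.zeros(N + 1)            # δ_D(-1, i) = 0
    gamma_p = np.ones(N + 2)             # γ(-1, i) = 1
    eb_p    = np.zeros(N + 2)            # e_b(-1, i) = 0
    xi_b_p  = np.full(N + 2, eps)        # ξ^d_b_min(-1, i) = ε
    xi_f_p  = np.full(N + 2, eps)        # ξ^d_f_min(-1, i) = ε
    out = np.zeros(len(x))

    for k in range(len(x)):
        gamma = np.ones(N + 2)
        eb = np.zeros(N + 2); ef = np.zeros(N + 2); e = np.zeros(N + 2)
        xi_b = np.zeros(N + 2); xi_f = np.zeros(N + 2)
        # order-zero initialization
        eb[0] = ef[0] = x[k]
        xi_b[0] = xi_f[0] = x[k] ** 2 + lam * xi_f_p[0]
        e[0] = d[k]
        for i in range(N + 1):
            delta[i] = lam * delta[i] + eb_p[i] * ef[i] / gamma_p[i]
            gamma[i + 1] = gamma[i] - eb[i] ** 2 / xi_b[i]
            kappa_b = delta[i] / xi_f[i]
            kappa_f = delta[i] / xi_b_p[i]
            eb[i + 1] = eb_p[i] - kappa_b * ef[i]
            ef[i + 1] = ef[i] - kappa_f * eb_p[i]
            xi_b[i + 1] = xi_b_p[i] - delta[i] * kappa_b
            xi_f[i + 1] = xi_f[i] - delta[i] * kappa_f
            # feedforward filtering
            delta_D[i] = lam * delta_D[i] + e[i] * eb[i] / gamma[i]
            e[i + 1] = e[i] - (delta_D[i] / xi_b[i]) * eb[i]
        out[k] = e[N + 1]
        gamma_p, eb_p, xi_b_p, xi_f_p = gamma, eb, xi_b, xi_f
    return out
```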
The normalized form of the LRLS has fewer recursions and variables. It can be calculated by applying a normalization to the internal variables of the algorithm, which keeps their magnitude bounded by one. It is generally not used in real-time applications because of the number of division and square-root operations, which come with a high computational load.
The algorithm for an NLRLS filter can be summarized as
Initialization: For $i = 0, 1, \ldots, N$
 $\overline{\delta}(-1,i) = 0$
 $\overline{\delta}_D(-1,i) = 0$
 $\overline{e}_b(-1,i) = 0$
End
 $\sigma_x^2(-1) = \sigma_d^2(-1) = \varepsilon$

Computation: For $k \ge 0$
 $\sigma_x^2(k) = \lambda\, \sigma_x^2(k-1) + x^2(k)$ (input signal energy)
 $\sigma_d^2(k) = \lambda\, \sigma_d^2(k-1) + d^2(k)$ (reference signal energy)
 $\overline{e}_b(k,0) = \overline{e}_f(k,0) = \dfrac{x(k)}{\sigma_x(k)}$
 $\overline{e}(k,0) = \dfrac{d(k)}{\sigma_d(k)}$
 For $i = 0, 1, \ldots, N$
  $\overline{\delta}(k,i) = \overline{\delta}(k-1,i)\sqrt{\left(1-\overline{e}_b^2(k-1,i)\right)\left(1-\overline{e}_f^2(k,i)\right)} + \overline{e}_b(k-1,i)\, \overline{e}_f(k,i)$
  $\overline{e}_b(k,i+1) = \dfrac{\overline{e}_b(k-1,i) - \overline{\delta}(k,i)\, \overline{e}_f(k,i)}{\sqrt{\left(1-\overline{\delta}^2(k,i)\right)\left(1-\overline{e}_f^2(k,i)\right)}}$
  $\overline{e}_f(k,i+1) = \dfrac{\overline{e}_f(k,i) - \overline{\delta}(k,i)\, \overline{e}_b(k-1,i)}{\sqrt{\left(1-\overline{\delta}^2(k,i)\right)\left(1-\overline{e}_b^2(k-1,i)\right)}}$
  Feedforward filter
  $\overline{\delta}_D(k,i) = \overline{\delta}_D(k-1,i)\sqrt{\left(1-\overline{e}_b^2(k,i)\right)\left(1-\overline{e}^2(k,i)\right)} + \overline{e}(k,i)\, \overline{e}_b(k,i)$
  $\overline{e}(k,i+1) = \dfrac{\overline{e}(k,i) - \overline{\delta}_D(k,i)\, \overline{e}_b(k,i)}{\sqrt{\left(1-\overline{\delta}_D^2(k,i)\right)\left(1-\overline{e}_b^2(k,i)\right)}}$
 End
End
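As with the LRLS case, the normalized recursions can be transcribed directly. The sketch below follows the table above, again with illustrative variable names and without safeguards against the square-root arguments approaching zero.

```python
import numpy as np

def nlrls(x, d, N, lam=0.99, eps=1e-2):
    """Normalized lattice RLS (NLRLS) joint-process estimator, transcribing the
    recursions tabulated above; returns the normalized error ē(k, N+1) per time k."""
    dl   = np.zeros(N + 1)               # δ̄(-1, i) = 0
    dl_D = np.zeros(N + 1)               # δ̄_D(-1, i) = 0
    eb_p = np.zeros(N + 2)               # ē_b(-1, i) = 0
    sig_x2 = sig_d2 = eps                # σ_x²(-1) = σ_d²(-1) = ε
    out = np.zeros(len(x))

    for k in range(len(x)):
        eb = np.zeros(N + 2); ef = np.zeros(N + 2); e = np.zeros(N + 2)
        sig_x2 = lam * sig_x2 + x[k] ** 2          # input signal energy
        sig_d2 = lam * sig_d2 + d[k] ** 2          # reference signal energy
        eb[0] = ef[0] = x[k] / np.sqrt(sig_x2)
        e[0] = d[k] / np.sqrt(sig_d2)
        for i in range(N + 1):
            dl[i] = dl[i] * np.sqrt((1 - eb_p[i] ** 2) * (1 - ef[i] ** 2)) + eb_p[i] * ef[i]
            eb[i + 1] = (eb_p[i] - dl[i] * ef[i]) / np.sqrt((1 - dl[i] ** 2) * (1 - ef[i] ** 2))
            ef[i + 1] = (ef[i] - dl[i] * eb_p[i]) / np.sqrt((1 - dl[i] ** 2) * (1 - eb_p[i] ** 2))
            # feedforward filter
            dl_D[i] = dl_D[i] * np.sqrt((1 - eb[i] ** 2) * (1 - e[i] ** 2)) + e[i] * eb[i]
            e[i + 1] = (e[i] - dl_D[i] * eb[i]) / np.sqrt((1 - dl_D[i] ** 2) * (1 - eb[i] ** 2))
        out[k] = e[N + 1]
        eb_p = eb
    return out
```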