A Savitzky–Golay filter is a digital filter that can be applied to a set of digital data points for the purpose of smoothing the data, that is, to increase the precision of the data without distorting the signal tendency. This is achieved, in a process known as convolution, by fitting successive sub-sets of adjacent data points with a low-degree polynomial by the method of linear least squares. When the data points are equally spaced, an analytical solution to the least-squares equations can be found, in the form of a single set of "convolution coefficients" that can be applied to all data sub-sets, to give estimates of the smoothed signal, (or derivatives of the smoothed signal) at the central point of each sub-set. The method, based on established mathematical procedures,[1] [2] was popularized by Abraham Savitzky and Marcel J. E. Golay, who published tables of convolution coefficients for various polynomials and sub-set sizes in 1964.[3] [4] Some errors in the tables have been corrected.[5] The method has been extended for the treatment of 2- and 3-dimensional data.
Savitzky and Golay's paper is one of the most widely cited papers in the journal Analytical Chemistry[6] and is classed by that journal as one of its "10 seminal papers" saying "it can be argued that the dawn of the computer-controlled analytical instrument can be traced to this article".[7]
The data consist of a set of points {x_j, y_j}, j = 1, ..., n, where x_j is an independent variable and y_j is an observed value. They are treated with a set of m convolution coefficients, C_i, according to the expression

Y_j = \sum_{i=\frac{1-m}{2}}^{\frac{m-1}{2}} C_i\, y_{j+i}, \qquad \frac{m+1}{2} \le j \le n - \frac{m-1}{2}

For example, for smoothing by a 5-point quadratic polynomial, m = 5, i = −2, −1, 0, 1, 2, and the jth smoothed data point, Y_j, is given by

Y_j = \frac{1}{35}\left(-3y_{j-2} + 12y_{j-1} + 17y_j + 12y_{j+1} - 3y_{j+2}\right),

where C_{−2} = −3/35, C_{−1} = 12/35, etc.
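As a quick check, this 5-point smoothing can be applied with a discrete convolution. A minimal sketch using NumPy; the data values are arbitrary illustrative numbers:

```python
import numpy as np

# 5-point quadratic/cubic Savitzky-Golay smoothing coefficients (m = 5)
C = np.array([-3.0, 12.0, 17.0, 12.0, -3.0]) / 35.0

# arbitrary illustrative data
y = np.array([2.0, 2.0, 5.0, 2.0, 1.0, 0.0, 1.0, 4.0])

# "valid" mode only produces output where the full window fits, so the
# first and last (m - 1)/2 = 2 points of the series are not smoothed
Y = np.convolve(y, C, mode="valid")
print(Y[0])  # smoothed value at j = 3: (1/35)(-3*2 + 12*2 + 17*5 + 12*2 - 3*1)
```

Because the smoothing coefficients are symmetric, the kernel flip performed by `np.convolve` makes no difference here; for the antisymmetric derivative coefficients the kernel would have to be reversed first.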
The following are applications of numerical differentiation of data.[8] Note: when calculating the nth derivative, an additional scaling factor of \frac{n!}{h^n} may be applied to all calculated data points to obtain absolute values of the derivative \frac{d^nY}{dx^n}, where h is the spacing between data points.
See main article: Moving average.
The "moving average filter" is a trivial example of a Savitzky–Golay filter that is commonly used with time series data to smooth out short-term fluctuations and highlight longer-term trends or cycles. Each subset of the data set is fitted with a straight horizontal line, as opposed to a higher-order polynomial. An unweighted moving average filter is the simplest convolution filter.
The moving average is often used for a quick technical analysis of financial data, like stock prices, returns or trading volumes. It is also used in economics to examine gross domestic product, employment or other macroeconomic time series.
It was not included in some tables of Savitzky–Golay convolution coefficients because all the coefficient values are identical, with the value \frac{1}{m}.
When the data points are equally spaced, an analytical solution to the least-squares equations can be found.[2] This solution forms the basis of the convolution method of numerical smoothing and differentiation. Suppose that the data consists of a set of n points (xj, yj) (j = 1, ..., n), where xj is an independent variable and yj is a datum value. A polynomial will be fitted by linear least squares to a set of m (an odd number) adjacent data points, each separated by an interval h. Firstly, a change of variable is made
z = \frac{x - \bar{x}}{h}

where \bar{x} is the value of the central point of the sub-set. z takes the values \frac{1-m}{2}, \ldots, 0, \ldots, \frac{m-1}{2}. The polynomial, of degree k, is defined as

Y = a_0 + a_1 z + a_2 z^2 + \cdots + a_k z^k.

The coefficients a_0, a_1, etc. are obtained by solving the normal equations (\mathbf{a} is the vector of polynomial coefficients and \mathbf{y} the vector of data values)

\mathbf{a} = \left(\mathbf{J}^T \mathbf{J}\right)^{-1} \mathbf{J}^T \mathbf{y},

where \mathbf{J} is the Jacobian (design) matrix whose ith row has the elements 1, z_i, z_i^2, \ldots
For example, for a cubic polynomial fitted to 5 points, z= −2, −1, 0, 1, 2 the normal equations are solved as follows.
\mathbf{J} = \begin{pmatrix} 1 & -2 & 4 & -8 \\ 1 & -1 & 1 & -1 \\ 1 & 0 & 0 & 0 \\ 1 & 1 & 1 & 1 \\ 1 & 2 & 4 & 8 \end{pmatrix}

\mathbf{J}^T\mathbf{J} = \begin{pmatrix} m & \sum z & \sum z^2 & \sum z^3 \\ \sum z & \sum z^2 & \sum z^3 & \sum z^4 \\ \sum z^2 & \sum z^3 & \sum z^4 & \sum z^5 \\ \sum z^3 & \sum z^4 & \sum z^5 & \sum z^6 \end{pmatrix} = \begin{pmatrix} m & 0 & \sum z^2 & 0 \\ 0 & \sum z^2 & 0 & \sum z^4 \\ \sum z^2 & 0 & \sum z^4 & 0 \\ 0 & \sum z^4 & 0 & \sum z^6 \end{pmatrix} = \begin{pmatrix} 5 & 0 & 10 & 0 \\ 0 & 10 & 0 & 34 \\ 10 & 0 & 34 & 0 \\ 0 & 34 & 0 & 130 \end{pmatrix}
Now, the normal equations can be factored into two separate sets of equations, by rearranging rows and columns, with
\left(\mathbf{J}^T\mathbf{J}\right)_{\text{even}} = \begin{pmatrix} 5 & 10 \\ 10 & 34 \end{pmatrix} \quad\text{and}\quad \left(\mathbf{J}^T\mathbf{J}\right)_{\text{odd}} = \begin{pmatrix} 10 & 34 \\ 34 & 130 \end{pmatrix}

The inverses of these matrices are

\left(\mathbf{J}^T\mathbf{J}\right)^{-1}_{\text{even}} = \frac{1}{70}\begin{pmatrix} 34 & -10 \\ -10 & 5 \end{pmatrix} \quad\text{and}\quad \left(\mathbf{J}^T\mathbf{J}\right)^{-1}_{\text{odd}} = \frac{1}{144}\begin{pmatrix} 130 & -34 \\ -34 & 10 \end{pmatrix}

so that the coefficients are obtained as

\begin{pmatrix} a_0 \\ a_2 \end{pmatrix}_j = \frac{1}{70}\begin{pmatrix} 34 & -10 \\ -10 & 5 \end{pmatrix}\begin{pmatrix} 1 & 1 & 1 & 1 & 1 \\ 4 & 1 & 0 & 1 & 4 \end{pmatrix}\begin{pmatrix} y_{j-2} \\ y_{j-1} \\ y_j \\ y_{j+1} \\ y_{j+2} \end{pmatrix}

\begin{pmatrix} a_1 \\ a_3 \end{pmatrix}_j = \frac{1}{144}\begin{pmatrix} 130 & -34 \\ -34 & 10 \end{pmatrix}\begin{pmatrix} -2 & -1 & 0 & 1 & 2 \\ -8 & -1 & 0 & 1 & 8 \end{pmatrix}\begin{pmatrix} y_{j-2} \\ y_{j-1} \\ y_j \\ y_{j+1} \\ y_{j+2} \end{pmatrix}
\begin{align} a_{0,j} &= \frac{1}{35}\left(-3y_{j-2} + 12y_{j-1} + 17y_j + 12y_{j+1} - 3y_{j+2}\right) \\ a_{1,j} &= \frac{1}{12}\left(y_{j-2} - 8y_{j-1} + 8y_{j+1} - y_{j+2}\right) \\ a_{2,j} &= \frac{1}{14}\left(2y_{j-2} - y_{j-1} - 2y_j - y_{j+1} + 2y_{j+2}\right) \\ a_{3,j} &= \frac{1}{12}\left(-y_{j-2} + 2y_{j-1} - 2y_{j+1} + y_{j+2}\right) \end{align}
The coefficients of y in these expressions are known as convolution coefficients. They are elements of the matrix
\mathbf{C} = \left(\mathbf{J}^T\mathbf{J}\right)^{-1}\mathbf{J}^T

and the convolution is written as

\left(\mathbf{C} \otimes \mathbf{y}\right)_j = Y_j = \sum_{i=-\frac{m-1}{2}}^{\frac{m-1}{2}} C_i\, y_{j+i}, \qquad \frac{m+1}{2} \le j \le n - \frac{m-1}{2}

In matrix form, for the 5-point smoothing example,

\begin{pmatrix} Y_3 \\ Y_4 \\ Y_5 \\ \vdots \end{pmatrix} = \frac{1}{35}\begin{pmatrix} -3 & 12 & 17 & 12 & -3 & 0 & 0 & \cdots \\ 0 & -3 & 12 & 17 & 12 & -3 & 0 & \cdots \\ 0 & 0 & -3 & 12 & 17 & 12 & -3 & \cdots \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \ddots \end{pmatrix}\begin{pmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \\ y_6 \\ y_7 \\ \vdots \end{pmatrix}
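The derivation above is easy to reproduce numerically. A minimal sketch using NumPy for the same cubic, 5-point case:

```python
import numpy as np

# Jacobian for a cubic polynomial fitted to 5 points, z = -2..2;
# the columns are 1, z, z^2, z^3
z = np.arange(-2, 3)
J = np.vander(z, 4, increasing=True)

# convolution coefficient matrix C = (J^T J)^{-1} J^T;
# row k holds the convolution coefficients for the polynomial coefficient a_k
C = np.linalg.solve(J.T @ J, J.T)

print(np.round(C[0] * 35).astype(int))  # smoothing row:       [-3 12 17 12 -3]
print(np.round(C[1] * 12).astype(int))  # 1st-derivative row:  [ 1 -8  0  8 -1]
```

Using `np.linalg.solve` on the normal equations avoids forming the inverse explicitly, which is the numerically preferable route.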
Tables of convolution coefficients, calculated in the same way for m up to 25, were published for the Savitzky–Golay smoothing filter in 1964.[3] [5] The value of the central point, z = 0, is obtained from a single set of coefficients: a0 for smoothing, a1 for the 1st derivative, and so on. The numerical derivatives are obtained by differentiating Y, which means that the derivatives are calculated for the smoothed data curve. For a cubic polynomial
\begin{align} Y &= a_0 + a_1 z + a_2 z^2 + a_3 z^3 = a_0 && \text{at } z = 0,\ x = \bar{x} \\ \frac{dY}{dx} &= \frac{1}{h}\left(a_1 + 2a_2 z + 3a_3 z^2\right) = \frac{a_1}{h} && \text{at } z = 0,\ x = \bar{x} \end{align}
In general, polynomials of degree (0 and 1),[15] (2 and 3), (4 and 5) etc. give the same coefficients for smoothing and even derivatives. Polynomials of degree (1 and 2), (3 and 4) etc. give the same coefficients for odd derivatives.
It is not always necessary to use the Savitzky–Golay tables. The summations in the matrix \mathbf{J}^T\mathbf{J} can be evaluated in closed form,

\begin{align} \sum z^2 &= \frac{m(m^2-1)}{12} \\ \sum z^4 &= \frac{m(m^2-1)(3m^2-7)}{240} \\ \sum z^6 &= \frac{m(m^2-1)(3m^4-18m^2+31)}{1344} \end{align}

so that the convolution coefficients can be expressed directly in terms of m and i.
Smoothing, polynomial degree 2 or 3:

C_{0i} = \frac{\left(3m^2 - 7 - 20i^2\right)/4}{m\left(m^2-4\right)/3}; \qquad \frac{1-m}{2} \le i \le \frac{m-1}{2}

1st derivative, polynomial degree 3 or 4:

C_{1i} = \frac{5\left(3m^4 - 18m^2 + 31\right)i - 28\left(3m^2-7\right)i^3}{m\left(m^2-1\right)\left(3m^4 - 39m^2 + 108\right)/15}

2nd derivative, polynomial degree 2 or 3:

C_{2i} = \frac{12mi^2 - m\left(m^2-1\right)}{m^2\left(m^2-1\right)\left(m^2-4\right)/30}

3rd derivative, polynomial degree 3 or 4:

C_{3i} = \frac{-\left(3m^2-7\right)i + 20i^3}{m\left(m^2-1\right)\left(3m^4 - 39m^2 + 108\right)/420}
Simpler expressions, which can be used with curves that don't have an inflection point, are:

Smoothing, polynomial degree 0 or 1 (moving average):

C_{0i} = \frac{1}{m}

1st derivative, polynomial degree 1 or 2:

C_{1i} = \frac{i}{m\left(m^2-1\right)/12}
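The closed-form expressions can be checked against the tabulated coefficients. A sketch for the quadratic/cubic smoothing case:

```python
import numpy as np

def sg_smoothing_coeffs(m):
    """Quadratic/cubic Savitzky-Golay smoothing coefficients C_0i from the
    closed-form expression; m must be an odd window width >= 5."""
    i = np.arange(-(m - 1) // 2, (m - 1) // 2 + 1)
    return ((3 * m**2 - 7 - 20.0 * i**2) / 4) / (m * (m**2 - 4) / 3)

# m = 5 reproduces the familiar (-3, 12, 17, 12, -3)/35 set
print(sg_smoothing_coeffs(5) * 35)
```

For any valid m the coefficients sum to 1, as they must for a smoothing filter to preserve a constant signal.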
An alternative to fitting m data points by a simple polynomial in the subsidiary variable, z, is to use orthogonal polynomials.
Y = b_0 P_0(z) + b_1 P_1(z) + \cdots + b_k P_k(z).
Savitzky–Golay filters are most commonly used to obtain the smoothed or derivative value at the central point, z = 0, using a single set of convolution coefficients. The (m − 1)/2 points at the start and end of the series cannot be calculated using this process. Various strategies can be employed to avoid this inconvenience.
For a cubic polynomial
\begin{align} Y &= a_0 + a_1 z + a_2 z^2 + a_3 z^3 \\ \frac{dY}{dx} &= \frac{1}{h}\left(a_1 + 2a_2 z + 3a_3 z^2\right) \\ \frac{d^2Y}{dx^2} &= \frac{1}{h^2}\left(2a_2 + 6a_3 z\right) \\ \frac{d^3Y}{dx^3} &= \frac{6}{h^3} a_3 \end{align}
When the data points are not all given equal weight, the least-squares expression to be minimized is

U = \sum_i w_i \left(Y_i - y_i\right)^2

and the normal equations become

\mathbf{a} = \left(\mathbf{J}^T\mathbf{W}\mathbf{J}\right)^{-1}\mathbf{J}^T\mathbf{W}\mathbf{y}

where \mathbf{W} is a diagonal matrix of weights, W_{ii} = w_i (in general W_{ii} \ne 1).
If the same set of diagonal weights is used for all data subsets, \mathbf{W} = \operatorname{diag}(w_1, w_2, \ldots, w_m), the normal equations matrix for a quadratic polynomial is

\mathbf{J}^T\mathbf{W}\mathbf{J} = \begin{pmatrix} \sum w_i & \sum w_i z_i & \sum w_i z_i^2 \\ \sum w_i z_i & \sum w_i z_i^2 & \sum w_i z_i^3 \\ \sum w_i z_i^2 & \sum w_i z_i^3 & \sum w_i z_i^4 \end{pmatrix}
An explicit expression for the inverse of this matrix can be obtained using Cramer's rule. A set of convolution coefficients may then be derived as
\mathbf{C} = \left(\mathbf{J}^T\mathbf{W}\mathbf{J}\right)^{-1}\mathbf{J}^T\mathbf{W}.
Alternatively the coefficients, C, could be calculated in a spreadsheet, employing a built-in matrix inversion routine to obtain the inverse of the normal equations matrix. This set of coefficients, once calculated and stored, can be used with all calculations in which the same weighting scheme applies. A different set of coefficients is needed for each different weighting scheme.
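As a sketch of that matrix calculation, the weighted coefficients can be computed directly; the weights below are an arbitrary, hypothetical scheme chosen only for illustration:

```python
import numpy as np

# Weighted convolution coefficients C = (J^T W J)^{-1} J^T W for a
# quadratic polynomial fitted to 5 points
z = np.arange(-2, 3)
J = np.vander(z, 3, increasing=True)        # columns 1, z, z^2
W = np.diag([1.0, 2.0, 4.0, 2.0, 1.0])      # hypothetical diagonal weights

C = np.linalg.solve(J.T @ W @ J, J.T @ W)

# whatever the weights, the smoothing coefficients (row for a_0) sum to 1
print(C[0].sum())
```

The sum-to-one property follows from C J = I: the first column of J is all ones, so the first row of C applied to a constant signal returns that constant.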
It has been shown that the Savitzky–Golay filter can be improved by introducing weights that decrease toward the ends of the fitting interval.[19]
Two-dimensional smoothing and differentiation can also be applied to tables of data values, such as intensity values in a photographic image which is composed of a rectangular grid of pixels.[20] [21] Such a grid is referred to as a kernel, and the data points that constitute the kernel are referred to as nodes. The trick is to transform the rectangular kernel into a single row by a simple ordering of the indices of the nodes. Whereas the one-dimensional filter coefficients are found by fitting a polynomial in the subsidiary variable z to a set of m data points, the two-dimensional coefficients are found by fitting a polynomial in subsidiary variables v and w to the values at the m × n kernel nodes. The following example, for a bivariate polynomial of total degree 3, m = 7, and n = 5, illustrates the process, which parallels the process for the one-dimensional case, above.[22]
v = \frac{x - \bar{x}}{h(x)}; \qquad w = \frac{y - \bar{y}}{h(y)}

Y = a_{00} + a_{10}v + a_{01}w + a_{20}v^2 + a_{11}vw + a_{02}w^2 + a_{30}v^3 + a_{21}v^2w + a_{12}vw^2 + a_{03}w^3
| w \ v | −3 | −2 | −1 | 0 | 1 | 2 | 3 |
|---|---|---|---|---|---|---|---|
| −2 | d1 | d2 | d3 | d4 | d5 | d6 | d7 |
| −1 | d8 | d9 | d10 | d11 | d12 | d13 | d14 |
| 0 | d15 | d16 | d17 | d18 | d19 | d20 | d21 |
| 1 | d22 | d23 | d24 | d25 | d26 | d27 | d28 |
| 2 | d29 | d30 | d31 | d32 | d33 | d34 | d35 |
These values form the vector d = (d1 ... d35)^T. The Jacobian has 10 columns, one for each of the parameters a00 through a03, and 35 rows, one for each pair of v and w values. Each row has the form

J_{\text{row}} = \left(1,\ v,\ w,\ v^2,\ vw,\ w^2,\ v^3,\ v^2w,\ vw^2,\ w^3\right)

The convolution coefficients are calculated as

\mathbf{C} = \left(\mathbf{J}^T\mathbf{J}\right)^{-1}\mathbf{J}^T

The first row of C contains 35 convolution coefficients which, multiplied with the 35 data values, yield the polynomial coefficient a00, the smoothed value at the central node of the kernel.
Nikitas and Pappa-Louisi showed that, depending on the form of the polynomial used, the quality of smoothing may vary significantly.[23] They recommend using a polynomial of the form
Y = \sum_{i=0}^{p} \sum_{j=0}^{q} a_{ij}\, v^i w^j
because such polynomials can achieve good smoothing both in the central and in the near-boundary regions of a kernel, and therefore they can be confidently used in smoothing both at the internal and at the near-boundary data points of a sampled domain. To avoid ill-conditioning when solving the least-squares problem, p < m and q < n. Software that calculates the two-dimensional coefficients, and a database of precomputed sets of C, are available.
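A minimal sketch of computing such two-dimensional coefficients, assuming the recommended v^i w^j polynomial form with p < m and q < n:

```python
import numpy as np

def sg2d_coeffs(m, n, p, q):
    """Convolution coefficients for an m x n kernel, fitting
    Y = sum_{i<=p} sum_{j<=q} a_ij v^i w^j by linear least squares."""
    v = np.arange(m) - (m - 1) / 2          # subsidiary variable, columns
    w = np.arange(n) - (n - 1) / 2          # subsidiary variable, rows
    rows = []
    for wk in w:                            # order the kernel nodes row by row
        for vk in v:
            rows.append([vk**i * wk**j
                         for i in range(p + 1) for j in range(q + 1)])
    J = np.array(rows)
    return np.linalg.solve(J.T @ J, J.T)    # C = (J^T J)^{-1} J^T

C = sg2d_coeffs(7, 5, 2, 2)
# the first row of C, applied to the 35 node values, smooths the central node
print(C[0].sum())                           # smoothing coefficients sum to 1
```

Each row of C corresponds to one polynomial coefficient a_ij, in the same i, j ordering used to build the Jacobian rows.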
The idea of two-dimensional convolution coefficients can be extended to the higher spatial dimensions as well, in a straightforward manner,[24] by arranging multidimensional distribution of the kernel nodes in a single row. Following the aforementioned finding by Nikitas and Pappa-Louisi in two-dimensional cases, usage of the following form of the polynomial is recommended in multidimensional cases:
Y = \sum_{i_1=0}^{p_1} \sum_{i_2=0}^{p_2} \cdots \sum_{i_D=0}^{p_D} a_{i_1 i_2 \cdots i_D}\, u_1^{i_1} u_2^{i_2} \cdots u_D^{i_D}

where D is the dimension of the space, the a_{i_1 i_2 \cdots i_D} are the polynomial coefficients, and u_1, \ldots, u_D are the subsidiary variables along the respective dimensions.
Accurate computation of C in multidimensional cases becomes challenging, as the precision of the standard floating-point numbers available in computer programming languages is no longer sufficient. The insufficient precision causes the floating-point truncation errors to become comparable to the magnitudes of some elements of C, which in turn severely degrades its accuracy and renders it useless. Chandra Shekhar has released two open-source software packages, Advanced Convolution Coefficient Calculator (ACCC) and Precise Convolution Coefficient Calculator (PCCC), which handle these accuracy issues adequately. ACCC performs the computation iteratively using floating-point numbers.[25] The precision of the floating-point numbers is gradually increased in each iteration, by using GNU MPFR, until the obtained coefficients stop changing to within the required precision.
A database of precomputed sets of C is also available.
It is inevitable that the signal will be distorted in the convolution process. From property 3 above, when data which has a peak is smoothed the peak height will be reduced and the half-width will be increased. Both the extent of the distortion and the S/N (signal-to-noise ratio) improvement:

- decrease as the degree of the polynomial increases
- increase as the width, m, of the smoothing function increases
For example, if the noise in all data points is uncorrelated and has a constant standard deviation, σ, the standard deviation of the noise will be decreased by convolution with an m-point smoothing function to[29] [30]

polynomial degree 0 or 1: \sqrt{\frac{1}{m}}\,\sigma

polynomial degree 2 or 3: \sqrt{\frac{3\left(3m^2-7\right)}{4m\left(m^2-4\right)}}\,\sigma
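This noise reduction factor is simply \sqrt{\sum_i C_i^2}, which can be verified numerically; a sketch using the closed-form quadratic/cubic coefficients given above:

```python
import numpy as np

# Compare sqrt(sum C_i^2) with the closed form sqrt(3(3m^2-7) / (4m(m^2-4)))
m = 9
i = np.arange(-(m - 1) // 2, (m - 1) // 2 + 1)
C = ((3 * m**2 - 7 - 20.0 * i**2) / 4) / (m * (m**2 - 4) / 3)

empirical = np.sqrt(np.sum(C**2))
closed_form = np.sqrt(3 * (3 * m**2 - 7) / (4 * m * (m**2 - 4)))
print(empirical, closed_form)   # the two values agree
```

The identity holds because, for uncorrelated noise of unit variance, the variance of a linear combination \sum_i C_i y_i is \sum_i C_i^2.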
Although the moving average function gives better noise reduction it is unsuitable for smoothing data which has curvature over m points. A quadratic filter function is unsuitable for getting a derivative of a data curve with an inflection point because a quadratic polynomial does not have one. The optimal choice of polynomial order and number of convolution coefficients will be a compromise between noise reduction and distortion.[31]
One way to mitigate distortion and improve noise removal is to use a filter of smaller width and perform more than one convolution with it. For two passes of the same filter this is equivalent to one pass of a filter obtained by convolution of the original filter with itself. For example, two passes of the filter with coefficients (1/3, 1/3, 1/3) are equivalent to one pass of the filter with coefficients (1/9, 2/9, 3/9, 2/9, 1/9).
The disadvantage of multipassing is that the equivalent filter width for n passes of an m-point function is n(m − 1) + 1, so multipassing is subject to greater end effects.
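The self-convolution claim is easy to verify:

```python
import numpy as np

# Two passes of a 3-point moving average equal one pass of the filter
# obtained by convolving the moving average with itself
f = np.full(3, 1/3)
g = np.convolve(f, f)
print(g * 9)                     # [1. 2. 3. 2. 1.] -> (1/9, 2/9, 3/9, 2/9, 1/9)

# equivalent width for n passes of an m-point filter is n(m - 1) + 1
n_passes, m = 2, 3
print(len(g) == n_passes * (m - 1) + 1)   # True
```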
Convolution maps to multiplication in the Fourier co-domain. The discrete Fourier transform of a convolution filter is a real-valued function which can be represented as
FT(\theta) = \sum_{j=\frac{1-m}{2}}^{\frac{m-1}{2}} C_j \cos(j\theta)
θ runs from 0 to 180 degrees, after which the function merely repeats itself. The plot for a 9-point quadratic/cubic smoothing function is typical. At very low angles the plot is almost flat, meaning that low-frequency components of the data are virtually unchanged by the smoothing operation. As the angle increases the value decreases, so that higher-frequency components are increasingly attenuated. This shows that the convolution filter can be described as a low-pass filter: the noise that is removed is primarily high-frequency noise, while low-frequency noise passes through the filter. Some high-frequency noise components are attenuated more than others, as shown by undulations in the Fourier transform at large angles. This can give rise to small oscillations in the smoothed data,[33] and to phase reversal, i.e., high-frequency oscillations in the data get inverted by Savitzky–Golay filtering.[34]
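The transfer function can be probed directly; a sketch for the 9-point quadratic/cubic smoother discussed above:

```python
import numpy as np

# FT(theta) = sum_j C_j cos(j * theta) for the 9-point quadratic/cubic smoother
m = 9
j = np.arange(-(m - 1) // 2, (m - 1) // 2 + 1)
C = ((3 * m**2 - 7 - 20.0 * j**2) / 4) / (m * (m**2 - 4) / 3)

theta = np.deg2rad(np.arange(181))        # 0..180 degrees
FT = np.cos(np.outer(theta, j)) @ C       # transfer function at each angle

print(FT[0])    # 1.0: the lowest frequencies pass through unchanged
```

Plotting FT against θ (e.g. with matplotlib) reproduces the flat low-frequency response, roll-off, and stopband undulations described in the text.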
Convolution affects the correlation between errors in the data. The effect of convolution can be expressed as a linear transformation.
Y_j = \sum_{i=-\frac{m-1}{2}}^{\frac{m-1}{2}} C_i\, y_{j+i}
By the law of error propagation, the variance-covariance matrix of the data, A will be transformed into B according to
\mathbf{B} = \mathbf{C}\mathbf{A}\mathbf{C}^T

For example, when a 3-point moving average is applied to 5 data points whose errors are uncorrelated and have equal variance σ² (so that A = σ²I),

\mathbf{B} = \frac{\sigma^2}{9}\begin{pmatrix} 1 & 1 & 1 & 0 & 0 \\ 0 & 1 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix} = \frac{\sigma^2}{9}\begin{pmatrix} 3 & 2 & 1 \\ 2 & 3 & 2 \\ 1 & 2 & 3 \end{pmatrix}
In this case the correlation coefficients,

\rho_{ij} = \frac{B_{ij}}{\sqrt{B_{ii}B_{jj}}}

between calculated points i and j are

\rho_{i,i+1} = \frac{2}{3} = 0.66, \qquad \rho_{i,i+2} = \frac{1}{3} = 0.33
In general, the calculated values are correlated even when the observed values are not correlated. The correlation extends over 2m − 1 calculated points at a time.
To illustrate the effect of multipassing on the noise and correlation of a set of data, consider the effects of a second pass of a 3-point moving average filter. For the second pass[35]
\mathbf{C}\mathbf{B}\mathbf{C}^T = \frac{\sigma^2}{81}\begin{pmatrix} 1 & 1 & 1 & 0 & 0 \\ 0 & 1 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 & 1 \end{pmatrix}\begin{pmatrix} 3 & 2 & 1 & 0 & 0 \\ 2 & 3 & 2 & 0 & 0 \\ 1 & 2 & 3 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}\begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix} = \frac{\sigma^2}{81}\begin{pmatrix} 19 & 16 & 10 & 4 & 1 \\ 16 & 19 & 16 & 10 & 4 \\ 10 & 16 & 19 & 16 & 10 \\ 4 & 10 & 16 & 19 & 16 \\ 1 & 4 & 10 & 16 & 19 \end{pmatrix}
The noise standard deviation is now \sqrt{19/81}\,\sigma = 0.48\sigma, compared with \sqrt{1/3}\,\sigma \approx 0.58\sigma after a single pass.
Correlation now extends over a span of 4 sequential points with correlation coefficients
\rho_{i,i+1} = \frac{16}{19} = 0.84, \quad \rho_{i,i+2} = \frac{10}{19} = 0.53, \quad \rho_{i,i+3} = \frac{4}{19} = 0.21, \quad \rho_{i,i+4} = \frac{1}{19} = 0.05
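The matrix products above can be checked numerically (a sketch with σ set to 1):

```python
import numpy as np

def conv_matrix(coeffs, n):
    """Banded matrix that applies an m-point filter to n data points."""
    m = len(coeffs)
    C = np.zeros((n - m + 1, n))
    for r in range(n - m + 1):
        C[r, r:r + m] = coeffs
    return C

# first pass on 7 uncorrelated points (A = I), then a second pass on the result
C1 = conv_matrix([1/3, 1/3, 1/3], 7)
C2 = conv_matrix([1/3, 1/3, 1/3], 5)
B  = C1 @ np.eye(7) @ C1.T          # variance-covariance after one pass
B2 = C2 @ B @ C2.T                  # ... after the second pass

print(B2[0, 0] * 81)    # 19: variance 19/81 sigma^2, i.e. std 0.48 sigma
print(B2[0, 1] * 81)    # 16: giving rho_{i,i+1} = 16/19
```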
The advantage obtained by performing two passes with the narrower smoothing function is that it introduces less distortion into the calculated data.
Compared with other smoothing filters, e.g. convolution with a Gaussian or multi-pass moving-average filtering, Savitzky–Golay filters have an initially flatter response and sharper cutoff in the frequency domain, especially for high orders of the fit polynomial (see frequency characteristics). For data with limited signal bandwidth, this means that Savitzky–Golay filtering can provide better signal-to-noise ratio than many other filters; e.g., peak heights of spectra are better preserved than for other filters with similar noise suppression. Disadvantages of the Savitzky–Golay filters are comparably poor suppression of some high frequencies (poor stopband suppression) and artifacts when using polynomial fits for the first and last points.
Alternative smoothing methods that share the advantages of Savitzky–Golay filters and mitigate at least some of their disadvantages are Savitzky–Golay filters with properly chosen alternative fitting weights, Whittaker–Henderson smoothing and the Hodrick–Prescott filter (equivalent methods closely related to smoothing splines), and convolution with a windowed sinc function.
Consider a set of data points (x_j, y_j), j = 1, ..., n. The Savitzky–Golay tables refer to the case where the step x_j − x_{j−1} is constant, h. Examples of the use of the so-called convolution coefficients, with a cubic polynomial and a window size, m, of 5 points, are as follows.
Smoothing:

Y_j = \frac{1}{35}\left(-3y_{j-2} + 12y_{j-1} + 17y_j + 12y_{j+1} - 3y_{j+2}\right)

1st derivative:

Y'_j = \frac{1}{12h}\left(y_{j-2} - 8y_{j-1} + 8y_{j+1} - y_{j+2}\right)

2nd derivative:

Y''_j = \frac{1}{7h^2}\left(2y_{j-2} - y_{j-1} - 2y_j - y_{j+1} + 2y_{j+2}\right)
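In practice these convolutions rarely need to be coded by hand; for example, SciPy provides `scipy.signal.savgol_filter`. A sketch, assuming SciPy is available (the data values and spacing are arbitrary illustrative choices):

```python
import numpy as np
from scipy.signal import savgol_filter

y = np.array([2.0, 2.0, 5.0, 2.0, 1.0, 0.0, 1.0, 4.0])
h = 0.5                                       # assumed uniform sample spacing

Y   = savgol_filter(y, window_length=5, polyorder=3)        # smoothing
dY  = savgol_filter(y, 5, 3, deriv=1, delta=h)              # 1st derivative
d2Y = savgol_filter(y, 5, 3, deriv=2, delta=h)              # 2nd derivative

# interior points match the 5-point cubic formulas above
print(Y[2])   # (1/35)(-3*2 + 12*2 + 17*5 + 12*2 - 3*1)
```

By default `savgol_filter` handles the (m − 1)/2 end points by fitting a polynomial to the edge windows (`mode='interp'`), one of the end-point strategies mentioned earlier.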