Taylor's theorem explained
In calculus, Taylor's theorem gives an approximation of a k-times differentiable function around a given point by a polynomial of degree k, called the k-th-order Taylor polynomial. For a smooth function, the Taylor polynomial is the truncation at order k of the Taylor series of the function. The first-order Taylor polynomial is the linear approximation of the function, and the second-order Taylor polynomial is often referred to as the quadratic approximation.[1] There are several versions of Taylor's theorem, some giving explicit estimates of the approximation error of the function by its Taylor polynomial.
Taylor's theorem is named after the mathematician Brook Taylor, who stated a version of it in 1715,[2] although an earlier version of the result was already mentioned in 1671 by James Gregory.[3]
Taylor's theorem is taught in introductory-level calculus courses and is one of the central elementary tools in mathematical analysis. It gives simple arithmetic formulas to accurately compute values of many transcendental functions such as the exponential function and trigonometric functions. It is the starting point of the study of analytic functions, and is fundamental in various areas of mathematics, as well as in numerical analysis and mathematical physics. Taylor's theorem also generalizes to multivariate and vector-valued functions. It provided the mathematical basis for some landmark early computing machines: Charles Babbage's Difference Engine calculated sines, cosines, logarithms, and other transcendental functions by numerically integrating the first 7 terms of their Taylor series.
Motivation
If a real-valued function f(x) is differentiable at the point x = a, then it has a linear approximation near this point. This means that there exists a function h_1(x) such that

f(x) = f(a) + f'(a)(x - a) + h_1(x)(x - a), \qquad \lim_{x \to a} h_1(x) = 0.
Here

P_1(x) = f(a) + f'(a)(x - a)

is the linear approximation of f(x) for x near the point a, whose graph is the tangent line to the graph of f at x = a. The error in the approximation is:

R_1(x) = f(x) - P_1(x) = h_1(x)(x - a).

As x tends to a, this error goes to zero much faster than x - a, making f(x) \approx P_1(x) a useful approximation.
For a better approximation to f(x), we can fit a quadratic polynomial instead of a linear function:

P_2(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2}(x - a)^2.
Instead of just matching one derivative of f(x) at x = a, this polynomial has the same first and second derivatives, as is evident upon differentiation.
Taylor's theorem ensures that the quadratic approximation is, in a sufficiently small neighborhood of x = a, more accurate than the linear approximation. Specifically,

f(x) = P_2(x) + h_2(x)(x - a)^2, \qquad \lim_{x \to a} h_2(x) = 0.
Here the error in the approximation is

R_2(x) = f(x) - P_2(x) = h_2(x)(x - a)^2,

which, given the limiting behavior of h_2, goes to zero faster than (x - a)^2 as x tends to a.
Similarly, we might get still better approximations to f if we use polynomials of higher degree, since then we can match even more derivatives with f at the selected base point.
In general, the error in approximating a function by a polynomial of degree k will go to zero much faster than (x - a)^k as x tends to a. However, there are functions, even infinitely differentiable ones, for which increasing the degree of the approximating polynomial does not increase the accuracy of approximation: we say such a function fails to be analytic at x = a: it is not (locally) determined by its derivatives at this point.
Taylor's theorem is of asymptotic nature: it only tells us that the error R_k in an approximation by a k-th order Taylor polynomial P_k tends to zero faster than any nonzero k-th degree polynomial as x \to a. It does not tell us how large the error is in any concrete neighborhood of the center of expansion, but for this purpose there are explicit formulas for the remainder term (given below) which are valid under some additional regularity assumptions on f. These enhanced versions of Taylor's theorem typically lead to uniform estimates for the approximation error in a small neighborhood of the center of expansion, but the estimates do not necessarily hold for neighborhoods which are too large, even if the function f is analytic. In that situation one may have to select several Taylor polynomials with different centers of expansion to have reliable Taylor-approximations of the original function.
There are several ways we might use the remainder term:
- Estimate the error for a polynomial P_k(x) of degree k estimating f(x) on a given interval (a − r, a + r). (Given the interval and degree, we find the error.)
- Find the smallest degree k for which the polynomial P_k(x) approximates f(x) to within a given error tolerance on a given interval (a − r, a + r). (Given the interval and error tolerance, we find the degree.)
- Find the largest interval (a − r, a + r) on which P_k(x) approximates f(x) to within a given error tolerance. (Given the degree and error tolerance, we find the interval; see the sketch below.)
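As a concrete sketch of these three uses (a minimal Python example of our own; the helper name remainder_bound and the choice f(x) = e^x about a = 0 are illustrative, not from the original text), one can combine the uniform Lagrange bound |R_k(x)| ≤ M r^{k+1}/(k+1)! derived in "Estimates for the remainder" below with M = e^r:

```python
import math

# Sketch: the three uses of the remainder bound for f(x) = exp(x) about a = 0.
# On (-r, r) every derivative of exp is bounded by M = exp(r), so the Lagrange
# form gives the uniform bound |R_k(x)| <= exp(r) * r**(k+1) / (k+1)!.

def remainder_bound(k: int, r: float) -> float:
    """Uniform bound for the degree-k Taylor remainder of exp on (-r, r)."""
    return math.exp(r) * r ** (k + 1) / math.factorial(k + 1)

# 1. Given interval and degree, estimate the error.
print(remainder_bound(k=5, r=1.0))       # ~3.8e-3

# 2. Given interval and error tolerance, find the smallest degree.
k = 0
while remainder_bound(k, 1.0) > 1e-5:
    k += 1
print(k)                                 # degree 8 suffices on (-1, 1)

# 3. Given degree and error tolerance, find (roughly) the largest interval.
r = 1.0
while remainder_bound(5, r) > 1e-5:
    r *= 0.9
print(r)                                 # half-width of an admissible interval
```

Note that the sharper bound M = e^r yields degree 8 here, while the worked example below, which deliberately uses only the cruder bound e^x ≤ 4, settles for degree 9; both are valid.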
Taylor's theorem in one real variable
Statement of the theorem
The precise statement of the most basic version of Taylor's theorem is as follows:

Taylor's theorem: Let k ≥ 1 be an integer and let the function f : \R \to \R be k times differentiable at the point a \in \R. Then there exists a function h_k : \R \to \R such that

f(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \cdots + \frac{f^{(k)}(a)}{k!}(x - a)^k + h_k(x)(x - a)^k

and \lim_{x \to a} h_k(x) = 0. This is called the Peano form of the remainder.
The polynomial appearing in Taylor's theorem is the k-th order Taylor polynomial

P_k(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \cdots + \frac{f^{(k)}(a)}{k!}(x - a)^k
of the function f at the point a. The Taylor polynomial is the unique "asymptotic best fit" polynomial in the sense that if there exists a function h_k : \R \to \R and a k-th order polynomial p such that

f(x) = p(x) + h_k(x)(x - a)^k, \qquad \lim_{x \to a} h_k(x) = 0,

then p = P_k. Taylor's theorem describes the asymptotic behavior of the remainder term

R_k(x) = f(x) - P_k(x),

which is the approximation error when approximating f with its Taylor polynomial. Using the little-o notation, the statement in Taylor's theorem reads as

R_k(x) = o\big(|x - a|^k\big) \quad \text{as } x \to a.
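A quick numerical check of this little-o behavior (a Python sketch under our own choice f(x) = sin x, a = 0, k = 3; not part of the original statement):

```python
import math

# h_k(x) = R_k(x) / (x - a)**k should tend to 0 as x -> a. For f = sin,
# a = 0, k = 3 the Taylor polynomial is P_3(x) = x - x**3/6.
def h3(x: float) -> float:
    return (math.sin(x) - (x - x ** 3 / 6)) / x ** 3

for x in (1e-1, 1e-2, 1e-3):
    print(x, h3(x))
# The output shrinks roughly like x**2 / 120, consistent with R_3(x) = o(|x|**3).
```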
Explicit formulas for the remainder
Under stronger regularity assumptions on f there are several precise formulas for the remainder term R_k of the Taylor polynomial, the most common ones being the following.

Mean-value forms of the remainder: Let f : \R \to \R be (k + 1) times differentiable on the open interval between a and x, with f^{(k)} continuous on the closed interval between a and x. Then

R_k(x) = \frac{f^{(k+1)}(\xi_L)}{(k+1)!}(x - a)^{k+1}

for some number \xi_L between a and x; this is the Lagrange form of the remainder. Similarly,

R_k(x) = \frac{f^{(k+1)}(\xi_C)}{k!}(x - \xi_C)^k (x - a)

for some number \xi_C between a and x; this is the Cauchy form of the remainder.
These refinements of Taylor's theorem are usually proved using the mean value theorem, whence the name. Additionally, notice that this is precisely the mean value theorem when k = 0. Also other similar expressions can be found. For example, if G(t) is continuous on the closed interval and differentiable with a non-vanishing derivative on the open interval between a and x, then

R_k(x) = \frac{f^{(k+1)}(\xi)}{k!}(x - \xi)^k \frac{G(x) - G(a)}{G'(\xi)}

for some number \xi between a and x. This version covers the Lagrange and Cauchy forms of the remainder as special cases, and is proved below using Cauchy's mean value theorem. The Lagrange form is obtained by taking G(t) = (x - t)^{k+1} and the Cauchy form is obtained by taking G(t) = t - a.
The statement for the integral form of the remainder is more advanced than the previous ones, and requires understanding of Lebesgue integration theory for the full generality:

Integral form of the remainder: Let f^{(k)} be absolutely continuous on the closed interval between a and x. Then

R_k(x) = \int_a^x \frac{f^{(k+1)}(t)}{k!}(x - t)^k \, dt.

However, it holds also in the sense of Riemann integral provided the (k + 1)-th derivative of f is continuous on the closed interval [a, x].
Due to the absolute continuity of f^{(k)} on the closed interval between a and x, its derivative f^{(k+1)} exists as an L^1-function, and the result can be proven by a formal calculation using the fundamental theorem of calculus and integration by parts.
Estimates for the remainder
It is often useful in practice to be able to estimate the remainder term appearing in the Taylor approximation, rather than having an exact formula for it. Suppose that f is (k + 1)-times continuously differentiable in an interval I containing a. Suppose that there are real constants q and Q such that

q \le f^{(k+1)}(x) \le Q

throughout I. Then the remainder term satisfies the inequality

q \frac{(x - a)^{k+1}}{(k+1)!} \le R_k(x) \le Q \frac{(x - a)^{k+1}}{(k+1)!}

if x > a, and a similar estimate if x < a. This is a simple consequence of the Lagrange form of the remainder. In particular, if

|f^{(k+1)}(x)| \le M

on an interval I = (a − r, a + r) with some r > 0, then

|R_k(x)| \le M \frac{|x - a|^{k+1}}{(k+1)!} \le M \frac{r^{k+1}}{(k+1)!}

for all x \in (a − r, a + r). The second inequality is called a uniform estimate, because it holds uniformly for all x on the interval (a − r, a + r).
Example
Suppose that we wish to find the approximate value of the function f(x) = e^x on the interval [−1, 1] while ensuring that the error in the approximation is no more than 10^{−5}. In this example we pretend that we only know the following properties of the exponential function:

e^0 = 1, \qquad \frac{d}{dx} e^x = e^x, \qquad e^x > 0, \qquad x \in \R. \qquad (*)

From these properties it follows that f^{(k)}(x) = e^x for all k, and in particular, f^{(k)}(0) = 1. Hence the k-th order Taylor polynomial of f at 0 and its remainder term in the Lagrange form are given by

P_k(x) = 1 + x + \frac{x^2}{2!} + \cdots + \frac{x^k}{k!}, \qquad R_k(x) = \frac{e^\xi}{(k+1)!} x^{k+1},

where \xi is some number between 0 and x. Since e^x is increasing by (*), we can simply use e^x \le 1 for x \in [−1, 0] to estimate the remainder on the subinterval [−1, 0]. To obtain an upper bound for the remainder on [0, 1], we use the property e^\xi < e^x for 0 < \xi < x to estimate

e^x = 1 + x + \frac{e^\xi}{2} x^2 < 1 + x + \frac{e^x}{2} x^2, \qquad 0 < x \le 1,

using the second-order Taylor expansion. Then we solve for e^x to deduce that

e^x \le \frac{1 + x}{1 - \frac{x^2}{2}} = 2 \frac{1 + x}{2 - x^2} \le 4, \qquad 0 \le x \le 1,

simply by maximizing the numerator and minimizing the denominator. Combining these estimates for e^x we see that

|R_k(x)| \le \frac{4 |x|^{k+1}}{(k+1)!} \le \frac{4}{(k+1)!}, \qquad -1 \le x \le 1,

so the required precision is certainly reached when 4/(k+1)! < 10^{−5}. (Compute by hand the values 9! = 362880 and 10! = 3628800.) As a conclusion, Taylor's theorem leads to the approximation

e^x = 1 + x + \frac{x^2}{2!} + \cdots + \frac{x^9}{9!} + R_9(x), \qquad |R_9(x)| < 10^{-5}, \qquad -1 \le x \le 1.
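The conclusion is easy to sanity-check numerically; the following minimal Python sketch (our own, not part of the original example) compares the degree-9 polynomial against math.exp on a grid:

```python
import math

# Check: the degree-9 Taylor polynomial of exp at 0 stays within 1e-5 of
# math.exp on the whole interval [-1, 1].
def p9(x: float) -> float:
    return sum(x ** k / math.factorial(k) for k in range(10))

worst = max(abs(math.exp(x) - p9(x)) for x in (i / 1000 for i in range(-1000, 1001)))
print(worst)   # ~3e-7 (attained near x = 1), comfortably below 1e-5
```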
Example
The function

f : \R \to \R, \qquad f(x) = \frac{1}{1 + x^2}

is real analytic, that is, locally determined by its Taylor series. This function illustrates the fact that some elementary functions cannot be approximated by Taylor polynomials in neighborhoods of the center of expansion which are too large. This kind of behavior is easily understood in the framework of complex analysis. Namely, the function f extends into a meromorphic function

f : \Complex \cup \{\infty\} \to \Complex \cup \{\infty\}, \qquad f(z) = \frac{1}{1 + z^2}

on the compactified complex plane. It has simple poles at z = i and z = −i, and it is analytic elsewhere. Its Taylor series centered at z_0 converges on any disc B(z_0, r) whose radius r is smaller than the distance from z_0 to the nearest pole. Therefore, the Taylor series of f centered at 0 converges on B(0, 1) and it does not converge for any z \in \Complex with |z| > 1 due to the poles at i and −i. For the same reason the Taylor series of f centered at 1 converges on B(1, \sqrt{2}) and does not converge for any z \in \Complex with |z - 1| > \sqrt{2}.
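A small numerical illustration of this convergence barrier (our own Python sketch; the series coefficients follow from the geometric series 1/(1 + x²) = \sum_m (−1)^m x^{2m}):

```python
# Partial sums of the Taylor series of f(x) = 1/(1 + x**2) centered at 0.
def partial_sum(x: float, terms: int) -> float:
    return sum((-1) ** m * x ** (2 * m) for m in range(terms))

for x in (0.5, 1.5):
    print(x, [partial_sum(x, n) for n in (5, 10, 20)])
# At x = 0.5 the sums settle near 1/1.25 = 0.8; at x = 1.5 they oscillate with
# growing amplitude, even though f itself is smooth there: the complex poles
# at +/- i cap the radius of convergence at 1.
```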
Generalizations of Taylor's theorem
Higher-order differentiability
A function f : \R^n \to \R is differentiable at \boldsymbol{a} \in \R^n if and only if there exists a linear functional L : \R^n \to \R and a function h : \R^n \to \R such that

f(\boldsymbol{x}) = f(\boldsymbol{a}) + L(\boldsymbol{x} - \boldsymbol{a}) + h(\boldsymbol{x}) \lVert \boldsymbol{x} - \boldsymbol{a} \rVert, \qquad \lim_{\boldsymbol{x} \to \boldsymbol{a}} h(\boldsymbol{x}) = 0.

If this is the case, then L = df(\boldsymbol{a}) is the (uniquely defined) differential of f at the point \boldsymbol{a}. Furthermore, then the partial derivatives of f exist at \boldsymbol{a} and the differential of f at \boldsymbol{a} is given by

df(\boldsymbol{a})(\boldsymbol{v}) = \frac{\partial f}{\partial x_1}(\boldsymbol{a}) v_1 + \cdots + \frac{\partial f}{\partial x_n}(\boldsymbol{a}) v_n.
Introduce the multi-index notation

|\alpha| = \alpha_1 + \cdots + \alpha_n, \qquad \alpha! = \alpha_1! \cdots \alpha_n!, \qquad \boldsymbol{x}^\alpha = x_1^{\alpha_1} \cdots x_n^{\alpha_n}

for \alpha \in \N^n and \boldsymbol{x} \in \R^n. If all the k-th order partial derivatives of f : \R^n \to \R are continuous at \boldsymbol{a} \in \R^n, then by Clairaut's theorem, one can change the order of mixed derivatives at \boldsymbol{a}, so the notation

D^\alpha f = \frac{\partial^{|\alpha|} f}{\partial x_1^{\alpha_1} \cdots \partial x_n^{\alpha_n}}, \qquad |\alpha| \le k

for the higher order partial derivatives is justified in this situation. The same is true if all the (k − 1)-th order partial derivatives of f exist in some neighborhood of \boldsymbol{a} and are differentiable at \boldsymbol{a}.[4] Then we say that f is k times differentiable at the point \boldsymbol{a}.
Taylor's theorem for multivariate functions
Using notations of the preceding section, one has the following theorem.

Multivariate version of Taylor's theorem: Let f : \R^n \to \R be a k-times continuously differentiable function at the point \boldsymbol{a} \in \R^n. Then there exist functions h_\alpha : \R^n \to \R, where |\alpha| = k, such that

f(\boldsymbol{x}) = \sum_{|\alpha| \le k} \frac{D^\alpha f(\boldsymbol{a})}{\alpha!} (\boldsymbol{x} - \boldsymbol{a})^\alpha + \sum_{|\alpha| = k} h_\alpha(\boldsymbol{x}) (\boldsymbol{x} - \boldsymbol{a})^\alpha, \qquad \lim_{\boldsymbol{x} \to \boldsymbol{a}} h_\alpha(\boldsymbol{x}) = 0.
If the function f : \R^n \to \R is (k + 1)-times continuously differentiable in a closed ball

B = \{ \boldsymbol{y} \in \R^n : \lVert \boldsymbol{a} - \boldsymbol{y} \rVert \le r \}

for some r > 0, then one can derive an exact formula for the remainder in terms of (k + 1)-th order partial derivatives of f in this neighborhood.[5] Namely,

f(\boldsymbol{x}) = \sum_{|\alpha| \le k} \frac{D^\alpha f(\boldsymbol{a})}{\alpha!} (\boldsymbol{x} - \boldsymbol{a})^\alpha + \sum_{|\beta| = k+1} R_\beta(\boldsymbol{x}) (\boldsymbol{x} - \boldsymbol{a})^\beta,

R_\beta(\boldsymbol{x}) = \frac{|\beta|}{\beta!} \int_0^1 (1 - t)^{|\beta| - 1} D^\beta f\big(\boldsymbol{a} + t(\boldsymbol{x} - \boldsymbol{a})\big) \, dt.
In this case, due to the continuity of the (k + 1)-th order partial derivatives in the compact set B, one immediately obtains the uniform estimates

\left| R_\beta(\boldsymbol{x}) \right| \le \frac{1}{\beta!} \max_{|\alpha| = |\beta|} \, \max_{\boldsymbol{y} \in B} |D^\alpha f(\boldsymbol{y})|, \qquad \boldsymbol{x} \in B.
Example in two dimensions
For example, the third-order Taylor polynomial of a smooth function f : \R^2 \to \R is, denoting \boldsymbol{x} - \boldsymbol{a} = \boldsymbol{v},

P_3(\boldsymbol{x}) = f(\boldsymbol{a}) + \frac{\partial f}{\partial x_1}(\boldsymbol{a}) v_1 + \frac{\partial f}{\partial x_2}(\boldsymbol{a}) v_2 + \frac{\partial^2 f}{\partial x_1^2}(\boldsymbol{a}) \frac{v_1^2}{2!} + \frac{\partial^2 f}{\partial x_1 \partial x_2}(\boldsymbol{a}) v_1 v_2 + \frac{\partial^2 f}{\partial x_2^2}(\boldsymbol{a}) \frac{v_2^2}{2!} + \frac{\partial^3 f}{\partial x_1^3}(\boldsymbol{a}) \frac{v_1^3}{3!} + \frac{\partial^3 f}{\partial x_1^2 \partial x_2}(\boldsymbol{a}) \frac{v_1^2 v_2}{2!} + \frac{\partial^3 f}{\partial x_1 \partial x_2^2}(\boldsymbol{a}) \frac{v_1 v_2^2}{2!} + \frac{\partial^3 f}{\partial x_2^3}(\boldsymbol{a}) \frac{v_2^3}{3!}
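The multi-index formula can be checked mechanically; here is a short sketch using sympy (an added dependency of ours, not part of the original text) that rebuilds P_3 at \boldsymbol{a} = (0, 0) for the sample function f(x_1, x_2) = e^{x_1} \sin x_2:

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
f = sp.exp(x1) * sp.sin(x2)          # an arbitrary smooth test function

# P_3 = sum over multi-indices alpha with |alpha| <= 3 of
#       D^alpha f(0, 0) / alpha! * x1**alpha_1 * x2**alpha_2
P3 = sp.S.Zero
for a1 in range(4):
    for a2 in range(4 - a1):         # a1 + a2 = |alpha| <= 3
        c = sp.diff(f, x1, a1, x2, a2).subs({x1: 0, x2: 0})
        P3 += c * x1 ** a1 * x2 ** a2 / (sp.factorial(a1) * sp.factorial(a2))

print(sp.expand(P3))   # x2 + x1*x2 + x1**2*x2/2 - x2**3/6 (up to term order)
```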
Proofs
Proof for Taylor's theorem in one real variable
Let

h_k(x) = \begin{cases} \dfrac{f(x) - P(x)}{(x - a)^k} & x \ne a \\ 0 & x = a \end{cases}

where, as in the statement of Taylor's theorem,

P(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \cdots + \frac{f^{(k)}(a)}{k!}(x - a)^k.

It is sufficient to show that

\lim_{x \to a} h_k(x) = 0.
The proof here is based on repeated application of L'Hôpital's rule. Note that, for each j = 0, 1, ..., k − 1, f^{(j)}(a) = P^{(j)}(a). Hence each of the first k − 1 derivatives of the numerator in h_k(x) vanishes at x = a, and the same is true of the denominator. Also, since the condition that the function f be k times differentiable at a point requires differentiability up to order k − 1 in a neighborhood of said point (this is true, because differentiability requires a function to be defined in a whole neighborhood of a point), the numerator and its k − 2 derivatives are differentiable in a neighborhood of a. Clearly, the denominator also satisfies said condition, and additionally, doesn't vanish unless x = a, therefore all conditions necessary for L'Hôpital's rule are fulfilled, and its use is justified. So

\lim_{x \to a} \frac{f(x) - P(x)}{(x - a)^k} = \lim_{x \to a} \frac{\frac{d}{dx}\big(f(x) - P(x)\big)}{\frac{d}{dx}(x - a)^k} = \cdots = \lim_{x \to a} \frac{\frac{d^{k-1}}{dx^{k-1}}\big(f(x) - P(x)\big)}{k!\,(x - a)} = \frac{1}{k!} \lim_{x \to a} \frac{f^{(k-1)}(x) - P^{(k-1)}(x)}{x - a} = \frac{1}{k!}\big(f^{(k)}(a) - P^{(k)}(a)\big) = 0
where the second-to-last equality follows by the definition of the derivative at x=a.
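A symbolic spot-check of this limit (our own sketch, with sympy assumed as a dependency), for the sample choice f(x) = cos x, a = 0, k = 4:

```python
import sympy as sp

x = sp.symbols('x')
f, a, k = sp.cos(x), 0, 4

# P is the k-th order Taylor polynomial of f at a.
P = sum(f.diff(x, j).subs(x, a) / sp.factorial(j) * (x - a) ** j
        for j in range(k + 1))
print(sp.limit((f - P) / (x - a) ** k, x, a))   # 0, as the proof asserts
```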
Alternate proof for Taylor's theorem in one real variable
Let f(x) be any real-valued continuous function to be approximated by the Taylor polynomial.

Step 1: Let F and G be functions. Set F and G to be

F(x) = f(x) - \sum_{k=0}^{n-1} \frac{f^{(k)}(a)}{k!}(x - a)^k

G(x) = (x - a)^n
Step 2: Properties of F and G:

F(a) = f(a) - f(a) - f'(a)(a - a) - \cdots - \frac{f^{(n-1)}(a)(a - a)^{n-1}}{(n-1)!} = 0

G(a) = (a - a)^n = 0

Similarly,

F'(a) = f'(a) - f'(a) - \frac{f''(a)(a - a)}{1!} - \cdots - \frac{f^{(n-1)}(a)(a - a)^{n-2}}{(n-2)!} = 0

G'(a) = n(a - a)^{n-1} = 0

\vdots

G^{(n-1)}(a) = F^{(n-1)}(a) = 0
Step 3: Use Cauchy's Mean Value Theorem

Let f_1 and g_1 be continuous functions on [a, b]. Since a < x < b, we can work with the interval [a, x]. Let f_1 and g_1 be differentiable on (a, x). Assume g_1'(t) \ne 0 for all t \in (a, x). Then there exists c_1 \in (a, x) such that

\frac{f_1(x) - f_1(a)}{g_1(x) - g_1(a)} = \frac{f_1'(c_1)}{g_1'(c_1)}

Note: G'(t) \ne 0 in (a, x) and F(a) = G(a) = 0, so

\frac{F(x)}{G(x)} = \frac{F(x) - F(a)}{G(x) - G(a)} = \frac{F'(c_1)}{G'(c_1)}

for some c_1 \in (a, x).

This can also be performed for F' and G':

\frac{F'(c_1)}{G'(c_1)} = \frac{F'(c_1) - F'(a)}{G'(c_1) - G'(a)} = \frac{F''(c_2)}{G''(c_2)}

for some c_2 \in (a, c_1). This can be continued to c_n.

This gives a partition in (a, x):

a < c_n < c_{n-1} < \dots < c_1 < x

with

\frac{F(x)}{G(x)} = \frac{F'(c_1)}{G'(c_1)} = \dots = \frac{F^{(n)}(c_n)}{G^{(n)}(c_n)}.

Set c = c_n:

\frac{F(x)}{G(x)} = \frac{F^{(n)}(c)}{G^{(n)}(c)}
Step 4: Substitute back

\frac{F(x)}{G(x)} = \frac{f(x) - \sum_{k=0}^{n-1} \frac{f^{(k)}(a)}{k!}(x - a)^k}{(x - a)^n} = \frac{F^{(n)}(c)}{G^{(n)}(c)}

By the Power Rule, repeated derivatives of (x - a)^n give G^{(n)}(c) = n(n - 1) \cdots 1 = n!, and since the polynomial part of F has degree n − 1, F^{(n)} = f^{(n)}, so:

\frac{F^{(n)}(c)}{G^{(n)}(c)} = \frac{f^{(n)}(c)}{n!}.

This leads to:

f(x) - \sum_{k=0}^{n-1} \frac{f^{(k)}(a)}{k!}(x - a)^k = \frac{f^{(n)}(c)}{n!}(x - a)^n.

By rearranging, we get:

f(x) = \sum_{k=0}^{n-1} \frac{f^{(k)}(a)}{k!}(x - a)^k + \frac{f^{(n)}(c)}{n!}(x - a)^n,

or, because c lies between a and x and hence c tends to a as x tends to a:

f(x) = \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!}(x - a)^k.
Derivation for the mean value forms of the remainder
Let G be any real-valued function, continuous on the closed interval between a and x and differentiable with a non-vanishing derivative on the open interval between a and x, and define

F(t) = f(t) + f'(t)(x - t) + \frac{f''(t)}{2!}(x - t)^2 + \cdots + \frac{f^{(k)}(t)}{k!}(x - t)^k

for t in the closed interval between a and x. Then, by Cauchy's mean value theorem,

\frac{F(x) - F(a)}{G(x) - G(a)} = \frac{F'(\xi)}{G'(\xi)} \qquad (\star)

for some \xi on the open interval between a and x. Note that here the numerator F(x) - F(a) = R_k(x) is exactly the remainder of the Taylor polynomial for f(x). Compute

F'(t) = f'(t) + \big(f''(t)(x - t) - f'(t)\big) + \left(\frac{f^{(3)}(t)}{2!}(x - t)^2 - \frac{f^{(2)}(t)}{1!}(x - t)\right) + \cdots + \left(\frac{f^{(k+1)}(t)}{k!}(x - t)^k - \frac{f^{(k)}(t)}{(k-1)!}(x - t)^{k-1}\right) = \frac{f^{(k+1)}(t)}{k!}(x - t)^k,

plug it into (\star) and rearrange terms to find that

R_k(x) = \frac{f^{(k+1)}(\xi)}{k!}(x - \xi)^k \frac{G(x) - G(a)}{G'(\xi)}.

This is the form of the remainder term mentioned after the actual statement of Taylor's theorem with remainder in the mean value form. The Lagrange form of the remainder is found by choosing G(t) = (x - t)^{k+1} and the Cauchy form by choosing G(t) = t - a.
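Numerically, the intermediate point \xi in the Lagrange form can be exhibited directly; a small Python sketch (our own choices f = exp, a = 0, x = 1, k = 2, not from the derivation):

```python
import math

# Lagrange form: R_2(1) = exp(xi) / 3! for some xi in (0, 1). Since every
# derivative of exp is exp, we can solve for xi explicitly.
remainder = math.exp(1) - (1 + 1 + 0.5)   # R_2(1) = e - P_2(1)
xi = math.log(6 * remainder)              # from exp(xi)/6 = R_2(1)
print(xi)                                 # ~0.27, indeed strictly between 0 and 1
```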
Remark. Using this method one can also recover the integral form of the remainder by choosing

G(t) = \int_a^t \frac{f^{(k+1)}(s)}{k!}(x - s)^k \, ds,

but the requirements on f needed for the use of the mean value theorem are too strong, if one aims to prove the claim in the case that f^{(k)} is only absolutely continuous. However, if one uses the Riemann integral instead of the Lebesgue integral, the assumptions cannot be weakened.
Derivation for the integral form of the remainder
Due to the absolute continuity of f^{(k)} on the closed interval between a and x, its derivative f^{(k+1)} exists as an L^1-function, and we can use the fundamental theorem of calculus and integration by parts. This same proof applies for the Riemann integral assuming that f^{(k)} is continuous on the closed interval and differentiable on the open interval between a and x, and this leads to the same result as using the mean value theorem.
The fundamental theorem of calculus states that
f(x) = f(a) + \int_a^x f'(t) \, dt.
Now we can integrate by parts and use the fundamental theorem of calculus again to see that

f(x) = f(a) + \big(x f'(x) - a f'(a)\big) - \int_a^x t f''(t) \, dt = f(a) + x \left( f'(a) + \int_a^x f''(t) \, dt \right) - a f'(a) - \int_a^x t f''(t) \, dt = f(a) + (x - a) f'(a) + \int_a^x (x - t) f''(t) \, dt,

which is exactly Taylor's theorem with remainder in the integral form in the case k = 1. The general statement is proved using induction. Suppose that

f(x) = f(a) + \frac{f'(a)}{1!}(x - a) + \cdots + \frac{f^{(k)}(a)}{k!}(x - a)^k + \int_a^x \frac{f^{(k+1)}(t)}{k!}(x - t)^k \, dt. \qquad (\star\star)

Integrating the remainder term by parts we arrive at

\int_a^x \frac{f^{(k+1)}(t)}{k!}(x - t)^k \, dt = -\left[ \frac{f^{(k+1)}(t)}{(k+1)k!}(x - t)^{k+1} \right]_a^x + \int_a^x \frac{f^{(k+2)}(t)}{(k+1)k!}(x - t)^{k+1} \, dt = \frac{f^{(k+1)}(a)}{(k+1)!}(x - a)^{k+1} + \int_a^x \frac{f^{(k+2)}(t)}{(k+1)!}(x - t)^{k+1} \, dt.
Substituting this into the formula (\star\star) shows that if it holds for the value k, it must also hold for the value k + 1. Therefore, since it holds for k = 1, it must hold for every positive integer k.
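The integral form is easy to validate with quadrature; a minimal Python sketch (our own choices: f = exp, a = 0, x = 1, k = 3, and a plain midpoint rule):

```python
import math

# For f = exp every derivative is exp, so the integral form predicts
# e - P_3(1) = integral over [0, 1] of exp(t) * (1 - t)**3 / 3! dt.
k = 3
remainder = math.exp(1) - sum(1 / math.factorial(j) for j in range(k + 1))

n, h = 100_000, 1.0 / 100_000             # midpoint rule on [0, 1]
integral = h * sum(
    math.exp((i + 0.5) * h) * (1 - (i + 0.5) * h) ** k for i in range(n)
) / math.factorial(k)
print(remainder, integral)                # both ~0.05162
```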
Derivation for the remainder of multivariate Taylor polynomials
We prove the special case, where f : \R^n \to \R has continuous partial derivatives up to the order k + 1 in some closed ball B with center \boldsymbol{a}. The strategy of the proof is to apply the one-variable case of Taylor's theorem to the restriction of f to the line segment adjoining \boldsymbol{x} and \boldsymbol{a}. Parametrize the line segment between \boldsymbol{a} and \boldsymbol{x} by

\boldsymbol{u}(t) = \boldsymbol{a} + t(\boldsymbol{x} - \boldsymbol{a}).
We apply the one-variable version of Taylor's theorem to the function g(t) = f(\boldsymbol{u}(t)):

f(\boldsymbol{x}) = g(1) = g(0) + \sum_{j=1}^{k} \frac{1}{j!} g^{(j)}(0) + \int_0^1 \frac{(1 - t)^k}{k!} g^{(k+1)}(t) \, dt.
Applying the chain rule for several variables gives

g^{(j)}(t) = \frac{d^j}{dt^j} f(\boldsymbol{u}(t)) = \frac{d^j}{dt^j} f\big(\boldsymbol{a} + t(\boldsymbol{x} - \boldsymbol{a})\big) = \sum_{|\alpha| = j} \binom{j}{\alpha} (D^\alpha f)\big(\boldsymbol{a} + t(\boldsymbol{x} - \boldsymbol{a})\big) (\boldsymbol{x} - \boldsymbol{a})^\alpha

where \binom{j}{\alpha} = \frac{j!}{\alpha_1! \cdots \alpha_n!} is the multinomial coefficient. Since \frac{1}{j!} \binom{j}{\alpha} = \frac{1}{\alpha!}, we get:

f(\boldsymbol{x}) = f(\boldsymbol{a}) + \sum_{1 \le |\alpha| \le k} \frac{1}{\alpha!} (D^\alpha f)(\boldsymbol{a}) (\boldsymbol{x} - \boldsymbol{a})^\alpha + \sum_{|\alpha| = k+1} \frac{k+1}{\alpha!} (\boldsymbol{x} - \boldsymbol{a})^\alpha \int_0^1 (1 - t)^k (D^\alpha f)\big(\boldsymbol{a} + t(\boldsymbol{x} - \boldsymbol{a})\big) \, dt.
Notes and References
- "Linear and quadratic approximation" (2013). Retrieved December 6, 2018.
- Taylor, Brook (1715). Methodus Incrementorum Directa et Inversa [Direct and Reverse Methods of Incrementation] (in Latin). London. pp. 21–23 (Prop. VII, Thm. 3, Cor. 2). Translated into English in Struik, D. J. (1969). A Source Book in Mathematics 1200–1800. Cambridge, Massachusetts: Harvard University Press. pp. 329–332.
- .
- This follows from iterated application of the theorem that if the partial derivatives of a function f exist in a neighborhood of a and are continuous at a, then the function is differentiable at a. See, for instance, .
- Folland, G. B. "Higher-Order Derivatives and Taylor's Formula in Several Variables". Department of Mathematics, University of Washington. Retrieved 2024-02-21.