In mathematics, the symmetry of second derivatives (also called the equality of mixed partials) is the fact that exchanging the order of partial derivatives of a multivariate function

f\left(x_1, x_2, \ldots, x_n\right)

does not change the result, provided suitable continuity conditions hold:

\frac{\partial}{\partial x_i}\left(\frac{\partial f}{\partial x_j}\right) = \frac{\partial}{\partial x_j}\left(\frac{\partial f}{\partial x_i}\right).
Sufficient conditions for the symmetry to hold are given by Schwarz's theorem, also called Clairaut's theorem or Young's theorem.[1]
In the context of partial differential equations, it is called the Schwarz integrability condition.
In symbols, the symmetry may be expressed as:

\frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) \quad\text{or}\quad \frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial^2 f}{\partial y\,\partial x}.

Another notation is:

\partial_x\partial_y f = \partial_y\partial_x f \quad\text{or}\quad f_{yx} = f_{xy}.
In terms of composition of the differential operator D_i which takes the partial derivative with respect to x_i:

D_i \circ D_j = D_j \circ D_i.
From this relation it follows that the ring of differential operators with constant coefficients, generated by the D_i, is commutative; but this is only true as operators over a domain of sufficiently differentiable functions. It is easy to check the symmetry as applied to monomials, so that one can take polynomials in the x_i as a domain. In fact smooth functions are another valid domain.
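The commutativity on a polynomial domain can be checked symbolically; the following is a minimal sketch using the sympy library, with the polynomial chosen arbitrarily for illustration:

```python
# Symbolic check that the two orders of partial differentiation agree
# on a polynomial (sympy computes derivatives in the order given).
import sympy as sp

x, y = sp.symbols('x y')
p = 3*x**4*y**2 - 5*x*y**3 + 7  # an arbitrary polynomial

dxy = sp.diff(p, x, y)  # first d/dx, then d/dy
dyx = sp.diff(p, y, x)  # first d/dy, then d/dx
assert sp.simplify(dxy - dyx) == 0
```

The same check passes for any sufficiently differentiable expression, which is the content of Schwarz's theorem below.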
The result on the equality of mixed partial derivatives under certain conditions has a long history. The list of unsuccessful proposed proofs started with Euler's, published in 1740, although already in 1721 Bernoulli had implicitly assumed the result with no formal justification. Clairaut also published a proposed proof in 1740, with no other attempts until the end of the 18th century. Starting then, for a period of 70 years, a number of incomplete proofs were proposed. The proof of Lagrange (1797) was improved by Cauchy (1823), but assumed the existence and continuity of the partial derivatives
\tfrac{\partial^2 f}{\partial x^2} and \tfrac{\partial^2 f}{\partial y^2}.
Six years after that, Schwarz succeeded in giving the first rigorous proof. Dini later contributed by finding more general conditions than those of Schwarz. Eventually a clean and more general version was found by Jordan in 1883 that is still the proof found in most textbooks. Minor variants of earlier proofs were published by Laurent (1885), Peano (1889 and 1893), J. Edwards (1892), P. Haag (1893), J. K. Whittemore (1898), Vivanti (1899) and Pierpont (1905). Further progress was made in 1907-1909 when E. W. Hobson and W. H. Young found proofs with weaker conditions than those of Schwarz and Dini. In 1918, Carathéodory gave a different proof based on the Lebesgue integral.
In mathematical analysis, Schwarz's theorem (or Clairaut's theorem on equality of mixed partials), named after Alexis Clairaut and Hermann Schwarz, states the following: for a function f\colon \Omega \to \mathbb{R} defined on a set \Omega \subset \mathbb{R}^n, if p \in \mathbb{R}^n is a point such that some neighborhood of p is contained in \Omega and f has continuous second partial derivatives on that neighborhood of p, then for all i and j in \{1, 2, \ldots, n\},

\frac{\partial^2}{\partial x_i\,\partial x_j} f(p) = \frac{\partial^2}{\partial x_j\,\partial x_i} f(p).
The partial derivatives of this function commute at that point.
One easy way to establish this theorem (in the case where n = 2, i = 1, and j = 2, which readily entails the result in general) is by applying Green's theorem to the gradient of f.
An elementary proof for functions on open subsets of the plane is as follows (by a simple reduction, the general case for the theorem of Schwarz easily reduces to the planar case). Let f(x, y) be a differentiable function on an open rectangle \Omega containing a point (a, b) and suppose that df is continuous with continuous \partial_x\partial_y f and \partial_y\partial_x f on \Omega. Define

\begin{align} u\left(h, k\right) &= f\left(a+h, b+k\right) - f\left(a+h, b\right),\\ v\left(h, k\right) &= f\left(a+h, b+k\right) - f\left(a, b+k\right),\\ w\left(h, k\right) &= f\left(a+h, b+k\right) - f\left(a+h, b\right) - f\left(a, b+k\right) + f\left(a, b\right). \end{align}

These functions are defined for \left|h\right|, \left|k\right| < \varepsilon, where \varepsilon > 0 and the square \left[a-\varepsilon, a+\varepsilon\right] \times \left[b-\varepsilon, b+\varepsilon\right] is contained in \Omega.
By the mean value theorem, for fixed h and k non-zero, \theta, \theta', \phi, \phi' can be found in the open interval (0, 1) with

\begin{align} w\left(h, k\right) &= u\left(h, k\right) - u\left(0, k\right) = h\,\partial_x u\left(\theta h, k\right)\\ &= h\left[\partial_x f\left(a+\theta h, b+k\right) - \partial_x f\left(a+\theta h, b\right)\right]\\ &= hk\,\partial_y\partial_x f\left(a+\theta h, b+\theta' k\right),\\ w\left(h, k\right) &= v\left(h, k\right) - v\left(h, 0\right) = k\,\partial_y v\left(h, \phi k\right)\\ &= k\left[\partial_y f\left(a+h, b+\phi k\right) - \partial_y f\left(a, b+\phi k\right)\right]\\ &= hk\,\partial_x\partial_y f\left(a+\phi' h, b+\phi k\right). \end{align}

Since h, k \ne 0, the two expressions for w\left(h, k\right) are equal and can be divided by hk:

\begin{align} hk\,\partial_y\partial_x f\left(a+\theta h, b+\theta' k\right) &= hk\,\partial_x\partial_y f\left(a+\phi' h, b+\phi k\right),\\ \partial_y\partial_x f\left(a+\theta h, b+\theta' k\right) &= \partial_x\partial_y f\left(a+\phi' h, b+\phi k\right). \end{align}
Letting h, k tend to zero and using the continuity of \partial_y\partial_x f and \partial_x\partial_y f at (a, b) gives

\frac{\partial^2}{\partial x\,\partial y} f\left(a, b\right) = \frac{\partial^2}{\partial y\,\partial x} f\left(a, b\right).
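The double difference w(h, k)/(hk) at the heart of this proof can be checked numerically; the following is a sketch with an arbitrarily chosen smooth function f(x, y) = \sin(x)e^y:

```python
# The symmetric double difference w(h,k)/(hk) approximates both mixed
# partials of a smooth function. Here f(x,y) = sin(x)*exp(y), for which
# f_xy = f_yx = cos(x)*exp(y).
import math

def f(x, y):
    return math.sin(x) * math.exp(y)

a, b, h, k = 0.5, 0.3, 1e-5, 1e-5
w = f(a + h, b + k) - f(a + h, b) - f(a, b + k) + f(a, b)
approx = w / (h * k)
exact = math.cos(a) * math.exp(b)
assert abs(approx - exact) < 1e-4
```

Both orders of differentiation yield the same limit, which is what the continuity hypothesis guarantees.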
This account is a straightforward classical method found in many textbooks, for example in Burkill, Apostol and Rudin. Although the derivation above is elementary, the approach can also be viewed from a more conceptual perspective so that the result becomes more apparent. Indeed the difference operators \Delta^t_x, \Delta^t_y commute, and \Delta^t_x f, \Delta^t_y f tend to \partial_x f, \partial_y f as t tends to 0, with a similar statement for the second-order operators. Here, for z a vector in the plane and u a directional vector \tbinom{1}{0} or \tbinom{0}{1}, the difference operator is defined by

\Delta^t_u f(z) = \frac{f(z + tu) - f(z)}{t}.
By the fundamental theorem of calculus for C^1 functions f on an open interval I with (a, b) \subset I,

\int_a^b f'(x)\,dx = f(b) - f(a).

Hence

|f(b) - f(a)| \le (b - a)\,\sup_{c \in (a,b)} |f'(c)|.
This is a generalized version of the mean value theorem. Recall that the elementary discussion on maxima or minima for real-valued functions implies that if f is continuous on [a, b] and differentiable on (a, b), then there is a point c in (a, b) such that

\frac{f(b) - f(a)}{b - a} = f'(c).
For vector-valued functions with values in a finite-dimensional normed space V, there is no exact analogue of this equality: the argument for scalar functions uses the order relation \inf f' \le f'(c) \le \sup f', which has no counterpart in V. The inequality, however, carries over:

\|f(b) - f(a)\| \le (b - a)\,\sup_{c \in (a,b)} \|f'(c)\|.

These versions of the mean value theorem are discussed in Rudin, Hörmander and elsewhere.
For f a C^2 function on an open set \Omega, set D_1 = \partial_x and D_2 = \partial_y, and for t \ne 0 define

\Delta^t_1 f(x, y) = [f(x+t, y) - f(x, y)]/t, \qquad \Delta^t_2 f(x, y) = [f(x, y+t) - f(x, y)]/t.

Then for (x_0, y_0) in \Omega,

\left|\Delta^t_1\Delta^t_2 f(x_0, y_0) - D_1D_2 f(x_0, y_0)\right| \le \sup_{0 \le s \le 1} \left|\Delta^t_1 D_2 f(x_0, y_0 + ts) - D_1D_2 f(x_0, y_0)\right| \le \sup_{0 \le r, s \le 1} \left|D_1D_2 f(x_0 + tr, y_0 + ts) - D_1D_2 f(x_0, y_0)\right|.
Thus \Delta^t_1\Delta^t_2 f(x_0, y_0) tends to D_1D_2 f(x_0, y_0) as t tends to 0. The same argument shows that \Delta^t_2\Delta^t_1 f(x_0, y_0) tends to D_2D_1 f(x_0, y_0). Since the difference operators commute, \Delta^t_1\Delta^t_2 f = \Delta^t_2\Delta^t_1 f, and taking the limit t \to 0 shows that the partial differential operators D_1 and D_2 also commute.
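That the difference operators commute exactly is an algebraic identity, since both compositions expand to the same four-point expression; the following sketch checks this directly (the helper names delta1 and delta2 and the test function are chosen for illustration):

```python
# Both compositions of the difference operators expand to
# [f(x+t,y+t) - f(x+t,y) - f(x,y+t) + f(x,y)] / t^2,
# so they agree exactly (up to floating-point rounding).
import math

def delta1(f, t):
    return lambda x, y: (f(x + t, y) - f(x, y)) / t

def delta2(f, t):
    return lambda x, y: (f(x, y + t) - f(x, y)) / t

f = lambda x, y: math.sin(x * y) + x**3 * y
t = 0.1
d12 = delta1(delta2(f, t), t)
d21 = delta2(delta1(f, t), t)
assert abs(d12(0.4, 0.7) - d21(0.4, 0.7)) < 1e-12
```

The subtlety of the theorem is therefore entirely in the passage to the limit, not in the discrete identity.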
Remark. By two applications of the classical mean value theorem,

\Delta^t_1\Delta^t_2 f(x_0, y_0) = D_1D_2 f(x_0 + t\theta, y_0 + t\theta')

for some \theta and \theta' in (0, 1).
The properties of repeated Riemann integrals of a continuous function F on a compact rectangle [a, b] \times [c, d] are easily established. The uniform continuity of F implies immediately that the functions

g(x) = \int_c^d F(x, y)\,dy, \qquad h(y) = \int_a^b F(x, y)\,dx

are continuous. It follows that

\int_a^b \int_c^d F(x, y)\,dy\,dx = \int_c^d \int_a^b F(x, y)\,dx\,dy;

moreover it is immediate that the iterated integral is positive if F is positive. The equality above is a simple case of Fubini's theorem, involving no measure theory, and can be proved in a straightforward way using Riemann approximating sums corresponding to subdivisions of the rectangle into smaller rectangles.
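This simple case of Fubini's theorem can be illustrated with midpoint Riemann sums over a subdivided rectangle; the integrand below is an arbitrary choice for the sketch:

```python
# Midpoint Riemann sums: the two iterated integrals of a continuous
# function agree. Here F(x,y) = x*exp(y) on [0,1] x [0,2], whose exact
# integral is (1/2)*(e^2 - 1).
import math

def F(x, y):
    return x * math.exp(y)

a, b, c, d, n = 0.0, 1.0, 0.0, 2.0, 200
hx, hy = (b - a) / n, (d - c) / n
xs = [a + (i + 0.5) * hx for i in range(n)]
ys = [c + (j + 0.5) * hy for j in range(n)]

I_dydx = sum(sum(F(x, y) for y in ys) * hy for x in xs) * hx  # y first
I_dxdy = sum(sum(F(x, y) for x in xs) * hx for y in ys) * hy  # x first

exact = 0.5 * (math.exp(2.0) - 1.0)
assert abs(I_dydx - I_dxdy) < 1e-8
assert abs(I_dydx - exact) < 1e-3
```

The two iterated sums contain exactly the same terms, only added in a different order, which is why no measure theory is needed here.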
To prove Clairaut's theorem, assume f is a differentiable function on an open set, for which the mixed second partial derivatives f_{yx} and f_{xy} exist and are continuous. Using the fundamental theorem of calculus twice,

\int_c^d \int_a^b f_{yx}(x, y)\,dx\,dy = \int_c^d \left[f_y(b, y) - f_y(a, y)\right] dy = f(b, d) - f(a, d) - f(b, c) + f(a, c).

Similarly

\int_a^b \int_c^d f_{xy}(x, y)\,dy\,dx = \int_a^b \left[f_x(x, d) - f_x(x, c)\right] dx = f(b, d) - f(a, d) - f(b, c) + f(a, c).

The two iterated integrals are therefore equal. On the other hand, since f_{xy} is continuous, the second iterated integral can be performed by first integrating over x and then afterwards over y. But then the iterated integral of f_{yx} - f_{xy} on [a, b] \times [c, d] must vanish. However, if the iterated integral of a continuous function F vanishes for all rectangles, then F must be identically zero; for otherwise F or -F would be strictly positive at some point and therefore, by continuity, on a rectangle, which is not possible. Hence f_{yx} - f_{xy} must vanish identically, so that f_{yx} = f_{xy} everywhere.
A weaker condition than the continuity of the second partial derivatives (which is implied by the latter) that suffices to ensure symmetry is that all partial derivatives are themselves differentiable. Another strengthening of the theorem, in which existence of the permuted mixed partial is asserted, was provided by Peano in a short 1890 note in Mathesis:
If f\colon E \to \mathbb{R} where E \subset \mathbb{R}^2 is an open set, \partial_1 f(x, y) and \partial_{2,1} f(x, y) exist everywhere on E, \partial_{2,1} f is continuous at \left(x_0, y_0\right) \in E, and \partial_2 f(x, y_0) exists in a neighborhood of x = x_0, then \partial_{1,2} f exists at \left(x_0, y_0\right) and

\partial_{1,2} f\left(x_0, y_0\right) = \partial_{2,1} f\left(x_0, y_0\right).
The theory of distributions (generalized functions) eliminates analytic problems with the symmetry. The derivative of an integrable function can always be defined as a distribution, and symmetry of mixed partial derivatives always holds as an equality of distributions. The use of formal integration by parts to define differentiation of distributions puts the symmetry question back onto the test functions, which are smooth and certainly satisfy this symmetry. In more detail (where f is a distribution, written as an operator on test functions, and φ is a test function),
\left(D_1 D_2 f\right)[\phi] = -\left(D_2 f\right)\left[D_1\phi\right] = f\left[D_2 D_1\phi\right] = f\left[D_1 D_2\phi\right] = -\left(D_1 f\right)\left[D_2\phi\right] = \left(D_2 D_1 f\right)[\phi].
Another approach, which defines the Fourier transform of a function, is to note that on such transforms partial derivatives become multiplication operators that commute much more obviously.
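As a sketch of this observation (using the transform convention \hat{f}(\xi) = \int_{\mathbb{R}^n} f(x)\,e^{-i x\cdot\xi}\,dx), each partial derivative becomes multiplication by i\xi_j, and scalar multiplications commute:

```latex
\widehat{\partial_x\partial_y f}(\xi)
  = (i\xi_1)(i\xi_2)\,\hat{f}(\xi)
  = (i\xi_2)(i\xi_1)\,\hat{f}(\xi)
  = \widehat{\partial_y\partial_x f}(\xi).
```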
The symmetry may be broken if the function fails to have differentiable partial derivatives, which is possible if the hypotheses of Clairaut's theorem are not satisfied (the second partial derivatives are not continuous).
An example of non-symmetry is the function (due to Peano)

f(x, y) = \begin{cases} \dfrac{xy\left(x^2 - y^2\right)}{x^2 + y^2} & \text{for } (x, y) \ne (0, 0),\\ 0 & \text{for } (x, y) = (0, 0). \end{cases}

This can be visualized by the polar form

f(r\cos(\theta), r\sin(\theta)) = \frac{r^2\sin(4\theta)}{4}.
The function is everywhere continuous, and the limit of difference quotients shows that f_x(0, 0) = f_y(0, 0) = 0, so the graph z = f(x, y) has a horizontal tangent plane at the origin; the partial derivatives f_x, f_y exist everywhere and are everywhere continuous. However, along the x-axis the y-derivative is f_y(x, 0) = x, and so

f_{yx}(0, 0) = \lim_{\varepsilon \to 0} \frac{f_y(\varepsilon, 0) - f_y(0, 0)}{\varepsilon} = 1.

In contrast, along the y-axis the x-derivative is f_x(0, y) = -y, and so f_{xy}(0, 0) = -1. That is, f_{yx} \ne f_{xy} at (0, 0).
The above function, written in polar coordinates, can be expressed as

f(r, \theta) = \frac{r^2\sin(4\theta)}{4},
showing that the function oscillates four times when traveling once around an arbitrarily small loop containing the origin. Intuitively, therefore, the local behavior of the function at (0, 0) cannot be described as a quadratic form, and the Hessian matrix thus fails to be symmetric.
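The two mixed partials at the origin can be checked symbolically; the following is a sketch using sympy with Peano's function f(x, y) = xy(x^2 - y^2)/(x^2 + y^2) (extended by 0 at the origin, handled below via limits of difference quotients):

```python
# Peano's example: f_y(x,0) = x and f_x(0,y) = -y, so the two mixed
# second partials at the origin are 1 and -1 respectively.
import sympy as sp

x, y, e = sp.symbols('x y e')
f = x*y*(x**2 - y**2)/(x**2 + y**2)

fy = sp.diff(f, y)
fx = sp.diff(f, x)
assert sp.simplify(fy.subs(y, 0) - x) == 0     # f_y(x, 0) = x
assert sp.simplify(fx.subs(x, 0) + y) == 0     # f_x(0, y) = -y

# difference quotients at the origin (f_x(0,0) = f_y(0,0) = 0)
fyx = sp.limit(fy.subs(y, 0).subs(x, e) / e, e, 0)
fxy = sp.limit(fx.subs(x, 0).subs(y, e) / e, e, 0)
assert (fyx, fxy) == (1, -1)
```

Away from the origin the two mixed partials agree; the failure is confined to the single point where the second partials are discontinuous.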
In general, the interchange of limiting operations need not commute. Given two variables near (0, 0) and two limiting processes on

f(h, k) - f(h, 0) - f(0, k) + f(0, 0)

corresponding to making h → 0 first, and to making k → 0 first, it can matter, looking at the first-order terms, which is applied first. This leads to the construction of pathological examples in which second derivatives are non-symmetric. This kind of example belongs to the theory of real analysis where the pointwise value of functions matters. When viewed as a distribution the second partial derivative's values can be changed at an arbitrary set of points as long as this has Lebesgue measure 0. Since in the example the Hessian is symmetric everywhere except at (0, 0), there is no contradiction with the fact that the Hessian, viewed as a Schwartz distribution, is symmetric.
Consider the first-order differential operators Di to be infinitesimal operators on Euclidean space. That is, Di in a sense generates the one-parameter group of translations parallel to the xi-axis. These groups commute with each other, and therefore the infinitesimal generators do also; the Lie bracket
[D_i, D_j] = 0
is this property's reflection. In other words, the Lie derivative of one coordinate with respect to another is zero.
The Clairaut–Schwarz theorem is the key fact needed to prove that for every C^\infty (or at least twice differentiable) differential form \omega \in \Omega^k(M), the second exterior derivative vanishes: d^2\omega := d(d\omega) = 0. This implies that every exact differentiable form (i.e., a form \alpha such that \alpha = d\omega for some form \omega) is closed (i.e., d\alpha = 0), since d\alpha = d(d\omega) = 0.
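For instance, for a smooth function f on the plane (a 0-form), the identity d(df) = 0 unwinds to exactly the Clairaut–Schwarz symmetry:

```latex
d(df) = d\left(\partial_x f\,dx + \partial_y f\,dy\right)
      = \partial_y\partial_x f\;dy\wedge dx + \partial_x\partial_y f\;dx\wedge dy
      = \left(\partial_x\partial_y f - \partial_y\partial_x f\right) dx\wedge dy = 0.
```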
In the middle of the 18th century, the theory of differential forms was first studied in the simplest case of 1-forms in the plane, i.e. A\,dx + B\,dy, where A and B are functions in the plane. If a 1-form \omega = A\,dx + B\,dy satisfies d\omega = 0 on an open rectangle, then \omega is the differential df of a function f, which can be written as

f(x, y) = \int_{x_0}^x A(x, y)\,dx + \int_{y_0}^y B(x, y)\,dy;

while if \omega = df, the closed property d\omega = 0 is precisely the identity \partial_x\partial_y f = \partial_y\partial_x f.