In mathematics, the Legendre transformation (or Legendre transform), first introduced by Adrien-Marie Legendre in 1787 when studying the minimal surface problem,[1] is an involutive transformation on real-valued functions that are convex on a real variable. Specifically, if a real-valued multivariable function is convex on one of its independent real variables, then the Legendre transform with respect to this variable is applicable to the function.
In physical problems, the Legendre transform is used to convert functions of one quantity (such as position, pressure, or temperature) into functions of the conjugate quantity (momentum, volume, and entropy, respectively). In this way, it is commonly used in classical mechanics to derive the Hamiltonian formalism out of the Lagrangian formalism (or vice versa) and in thermodynamics to derive the thermodynamic potentials, as well as in the solution of differential equations of several variables.
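For sufficiently smooth functions on the real line, the Legendre transform $f^*$ of a function $f$ can be specified, up to an additive constant, by the condition that the functions' first derivatives are inverse functions of each other. In Euler's derivative notation this reads $Df(\cdot) = (Df^*)^{-1}(\cdot)$, where $D$ is the differentiation operator, $\cdot$ stands for the argument of the associated function, and $(\phi)^{-1}(\cdot)$ is the inverse function, satisfying $(\phi)^{-1}(\phi(x)) = x$. Equivalently, in Lagrange's notation, $f'(f^{*\prime}(x^*)) = x^*$ and $f^{*\prime}(f'(x)) = x$.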
The generalization of the Legendre transformation to affine spaces and non-convex functions is known as the convex conjugate (also called the Legendre–Fenchel transformation), which can be used to construct a function's convex hull.
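In one real dimension, let $I \subset \mathbb{R}$ be an interval and $f : I \to \mathbb{R}$ a convex function. The Legendre transform of $f$ is the function $f^* : I^* \to \mathbb{R}$ defined by
$$f^*(x^*) = \sup_{x \in I}\bigl(x^* x - f(x)\bigr), \qquad I^* = \{x^* \in \mathbb{R} : f^*(x^*) < \infty\},$$
where the supremum is taken over $x \in I$; it is finite only for those $x^*$ for which $x^* x - f(x)$ is bounded above on $I$ (when $f(x)$ is a linear function on $I = \mathbb{R}$, for instance, this holds for a single value of $x^*$).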
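The transform is always well-defined when $f(x)$ is convex. This definition requires $x^* x - f(x)$ to be bounded from above on $I$ in order for the supremum to exist.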
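In $\mathbb{R}^n$, the generalization to convex functions $f : X \to \mathbb{R}$ on a convex set $X \subset \mathbb{R}^n$ is straightforward: $f^* : X^* \to \mathbb{R}$ has domain
$$X^* = \Bigl\{x^* \in \mathbb{R}^n : \sup_{x \in X}\bigl(\langle x^*, x\rangle - f(x)\bigr) < \infty\Bigr\}$$
and is defined by
$$f^*(x^*) = \sup_{x \in X}\bigl(\langle x^*, x\rangle - f(x)\bigr), \qquad x^* \in X^*,$$
where $\langle x^*, x\rangle$ denotes the dot product of $x^*$ and $x$.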
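The supremum in the definition can be approximated numerically by maximizing over a finite set of sample points. The following is a minimal sketch under stated assumptions, not a definitive implementation: the function name `legendre_transform_on_grid`, the test function `square`, and the grid bounds and spacing are all illustrative choices that do not come from the text.

```python
def legendre_transform_on_grid(f, xs, p):
    """Approximate f*(p) = sup_x (p*x - f(x)) by a maximum over the sample points xs."""
    return max(p * x - f(x) for x in xs)


# Hypothetical test function: f(x) = x**2, whose exact transform is p**2 / 4.
def square(x):
    return x * x


# Sample the interval [-10, 10] with spacing 0.001 (an arbitrary choice).
grid = [i / 1000.0 for i in range(-10000, 10001)]

approx_value = legendre_transform_on_grid(square, grid, 3.0)
exact_value = 3.0 ** 2 / 4.0
```

A grid maximum can only underestimate the true supremum, and it is meaningful only when the maximizer lies inside the sampled interval.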
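The function $f^*$ is called the convex conjugate function of $f$. For historical reasons (rooted in analytic mechanics), the conjugate variable is often denoted $p$ instead of $x^*$. If the convex function $f$ is defined on the whole line and is everywhere differentiable, then
$$f^*(p) = \sup_{x \in I}\bigl(px - f(x)\bigr) = \bigl(px - f(x)\bigr)\big|_{x = (f')^{-1}(p)}$$
can be interpreted as the negative of the $y$-intercept of the tangent line to the graph of $f$ that has slope $p$.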
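The Legendre transformation is an application of the duality relationship between points and lines. The functional relationship specified by $f$ can be represented equally well as a set of $(x, y)$ points, or as a set of tangent lines specified by their slope and intercept values.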
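For a differentiable convex function $f$ on the real line with first derivative $f'$ and its inverse $(f')^{-1}$, the Legendre transform of $f$, $f^*$, can be specified, up to an additive constant, by the condition that the functions' first derivatives are inverse functions of each other, i.e., $f' = ((f^*)')^{-1}$ and $(f^*)' = (f')^{-1}$.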
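To see this, first note that if $f$, as a convex function on the real line, is differentiable and $\overline{x}$ is a critical point of the function $x \mapsto p \cdot x - f(x)$, then the supremum is achieved at $\overline{x}$ (by convexity of $f$). Therefore, the Legendre transform of $f$ is $f^*(p) = p \cdot \overline{x} - f(\overline{x})$.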
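Then, suppose that the first derivative $f'$ is invertible and let the inverse be $g = (f')^{-1}$. Then for each $p$, the point $g(p)$ is the unique critical point $\overline{x}$ of the function $x \mapsto px - f(x)$ (i.e., $\overline{x} = g(p)$), because $f'(g(p)) = p$ and the first derivative of this function with respect to $x$ at $g(p)$ is $p - f'(g(p)) = 0$. Hence $f^*(p) = p \cdot g(p) - f(g(p))$ for each $p$. Differentiating with respect to $p$ gives
$$(f^*)'(p) = g(p) + p \cdot g'(p) - f'(g(p)) \cdot g'(p).$$
Since $f'(g(p)) = p$, this simplifies to $(f^*)'(p) = g(p) = (f')^{-1}(p)$. In other words, $(f^*)'$ and $f'$ are inverses of each other.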
In general, if $h' = (f')^{-1}$ is the inverse of $f'$, then $h' = (f^*)'$, so integration gives $f^* = h + c$ with a constant $c$.
In practical terms, given $f(x)$, the parametric plot of $x f'(x) - f(x)$ versus $f'(x)$ amounts to the graph of $f^*(p)$ versus $p$.
In some cases (e.g. thermodynamic potentials, below), a non-standard requirement is used, amounting to an alternative definition of $f^*$ with a minus sign, $f(x) - f^*(p) = xp$.
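The parametric recipe for the standard transform (plot $x f'(x) - f(x)$ against $f'(x)$) can be sketched in a few lines. This is an illustrative example only: the helper name `parametric_transform_points` and the test function $f(x) = x^4/4$ are assumptions chosen because its transform has the closed form $f^*(p) = \tfrac{3}{4}p^{4/3}$ for $p \ge 0$.

```python
def parametric_transform_points(f, df, xs):
    """Return (p, f*(p)) pairs using p = f'(x) and f*(p) = x*f'(x) - f(x)."""
    return [(df(x), x * df(x) - f(x)) for x in xs]


# Hypothetical test function f(x) = x**4 / 4 with derivative f'(x) = x**3.
def quartic(x):
    return x ** 4 / 4.0


def quartic_derivative(x):
    return x ** 3


def quartic_transform_exact(p):
    # Closed form for p >= 0, obtained by substituting x = p**(1/3).
    return 0.75 * p ** (4.0 / 3.0)


points = parametric_transform_points(quartic, quartic_derivative, [0.5, 1.0, 2.0])
```

Each pair in `points` lies on the graph of $f^*$, so no explicit inversion of $f'$ is needed to trace the transform.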
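In analytical mechanics and thermodynamics, the Legendre transformation is usually defined as follows: suppose $f$ is a function of $x$; then we have
$$\mathrm{d}f = \frac{\mathrm{d}f}{\mathrm{d}x}\,\mathrm{d}x.$$
Performing the Legendre transformation on this function means that we take $p = \frac{\mathrm{d}f}{\mathrm{d}x}$ as the independent variable, so that the above expression can be written as
$$\mathrm{d}f = p\,\mathrm{d}x,$$
and according to Leibniz's rule $\mathrm{d}(uv) = u\,\mathrm{d}v + v\,\mathrm{d}u$, we then have
$$\mathrm{d}(xp - f) = x\,\mathrm{d}p,$$
and taking $f^* = xp - f$, we have $\mathrm{d}f^* = x\,\mathrm{d}p$, which means
$$\frac{\mathrm{d}f^*}{\mathrm{d}p} = x.$$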
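As a worked instance of the single-variable recipe, the relation $\mathrm{d}f^*/\mathrm{d}p = x$ can be checked by hand. The choice $f(x) = \cosh x$ is an illustrative assumption, not an example taken from the text.

```latex
\begin{align*}
f(x) &= \cosh x, \qquad p = \frac{\mathrm{d}f}{\mathrm{d}x} = \sinh x
        \;\Longrightarrow\; x = \operatorname{arsinh} p, \\
f^*(p) &= xp - f = p \operatorname{arsinh} p - \sqrt{1 + p^2}, \\
\frac{\mathrm{d}f^*}{\mathrm{d}p}
       &= \operatorname{arsinh} p + \frac{p}{\sqrt{1 + p^2}} - \frac{p}{\sqrt{1 + p^2}}
        = \operatorname{arsinh} p = x .
\end{align*}
```

The second line uses $\cosh(\operatorname{arsinh} p) = \sqrt{1 + p^2}$; the two fractions cancel for the same reason that $x\,\mathrm{d}p$ is the only term left in $\mathrm{d}(xp - f)$.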
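When $f$ is a function of $n$ variables $x_1, x_2, \ldots, x_n$, we can perform the Legendre transformation on one or several of the variables. We have
$$\mathrm{d}f = p_1\,\mathrm{d}x_1 + p_2\,\mathrm{d}x_2 + \cdots + p_n\,\mathrm{d}x_n,$$
where $p_i = \frac{\partial f}{\partial x_i}$. If we want to perform the Legendre transformation on, e.g., $x_1$, then we take $p_1$ together with $x_2, \ldots, x_n$ as independent variables, and Leibniz's rule gives
$$\mathrm{d}(f - x_1 p_1) = -x_1\,\mathrm{d}p_1 + p_2\,\mathrm{d}x_2 + \cdots + p_n\,\mathrm{d}x_n,$$
so for the function $\varphi(p_1, x_2, \ldots, x_n) = f(x_1, x_2, \ldots, x_n) - x_1 p_1$ we have
$$\frac{\partial\varphi}{\partial p_1} = -x_1, \quad \frac{\partial\varphi}{\partial x_2} = p_2, \quad \ldots, \quad \frac{\partial\varphi}{\partial x_n} = p_n.$$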
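We can also do this transformation for the variables $x_2, \ldots, x_n$. If we do it for all the variables, then we have
$$\mathrm{d}\varphi = -x_1\,\mathrm{d}p_1 - x_2\,\mathrm{d}p_2 - \cdots - x_n\,\mathrm{d}p_n, \quad\text{where}\quad \varphi = f - x_1 p_1 - x_2 p_2 - \cdots - x_n p_n.$$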
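In analytical mechanics, this transformation is performed on the variables $\dot{q}_1, \dot{q}_2, \ldots, \dot{q}_n$ of the Lagrangian $L(q_1, \ldots, q_n, \dot{q}_1, \ldots, \dot{q}_n)$ to obtain the Hamiltonian:
$$H(q_1, \ldots, q_n, p_1, \ldots, p_n) = \sum_{i=1}^{n} p_i \dot{q}_i - L(q_1, \ldots, q_n, \dot{q}_1, \ldots, \dot{q}_n).$$
In thermodynamics, the transformation is performed on variables chosen according to the type of thermodynamic system under study. For example, starting from the cardinal function of state, the internal energy $U(S, V)$, we have
$$\mathrm{d}U = T\,\mathrm{d}S - p\,\mathrm{d}V,$$
so we can perform the Legendre transformation on either or both of $S, V$ to obtain
$$\mathrm{d}H = \mathrm{d}(U + pV) = T\,\mathrm{d}S + V\,\mathrm{d}p,$$
$$\mathrm{d}F = \mathrm{d}(U - TS) = -S\,\mathrm{d}T - p\,\mathrm{d}V,$$
$$\mathrm{d}G = \mathrm{d}(U - TS + pV) = -S\,\mathrm{d}T + V\,\mathrm{d}p,$$
and each of these three expressions has a physical meaning.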
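This definition of the Legendre transformation is the one originally introduced by Legendre in his work in 1787, and it is still applied by physicists today. Indeed, this definition can be made mathematically rigorous if we treat all the variables and functions defined above, e.g. $f, x_1, \ldots, x_n, p_1, \ldots, p_n$, as differentiable functions defined on an open set of $\mathbb{R}^n$ or on a differentiable manifold, and $\mathrm{d}f, \mathrm{d}x_i, \mathrm{d}p_i$ as their differentials (which are treated as cotangent vector fields in the context of differentiable manifolds). This definition is equivalent to the modern mathematical definition as long as $f$ is differentiable and convex in the variables $x_1, x_2, \ldots, x_n$.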
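The Legendre transform of a twice-differentiable convex function $f(x)$ whose second derivative is everywhere positive and whose derivative is invertible is again a convex function with everywhere positive second derivative. To see this, fix $p$ and let $\bar{x}$ maximize the function $px - f(x)$ over $x$. Then the Legendre transformation of $f$ is $f^*(p) = p\bar{x} - f(\bar{x})$, and the maximizing condition
$$\frac{\mathrm{d}}{\mathrm{d}x}\bigl(px - f(x)\bigr) = p - f'(x) = 0$$
gives $f'(\bar{x}) = p$. Note that $\bar{x}$ depends on $p$: $\bar{x} = g(p)$, where $g \equiv (f')^{-1}$, meaning that $g$ is the inverse of $f'$, the derivative of $f$ (so $f'(g(p)) = p$). By the inverse function rule, $g$ is also differentiable, with
$$\frac{\mathrm{d}g(p)}{\mathrm{d}p} = \frac{1}{f''(g(p))}.$$
Thus $f^*(p) = p\,g(p) - f(g(p))$ is a composition of differentiable functions and hence differentiable. Applying the product rule and the chain rule together with $\bar{x} = g(p)$ yields
$$\frac{\mathrm{d}(f^*)}{\mathrm{d}p} = g(p) + \bigl(p - f'(g(p))\bigr)\frac{\mathrm{d}g(p)}{\mathrm{d}p} = g(p), \qquad \frac{\mathrm{d}^2(f^*)}{\mathrm{d}p^2} = \frac{\mathrm{d}g(p)}{\mathrm{d}p} = \frac{1}{f''(g(p))} > 0,$$
so $f^*$ is convex with everywhere positive second derivative.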
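The Legendre transformation is an involution, i.e., $f^{**} = f$. Using the identities above, $f'(\bar{x}) = p$, $\bar{x} = g(p)$, $f^*(p) = p\bar{x} - f(\bar{x})$ and its derivative $(f^*)'(p) = g(p)$, we find
$$f^{**}(y) = \bigl(y\bar{p} - f^*(\bar{p})\bigr)\big|_{(f^*)'(\bar{p}) = y} = g(\bar{p})\,\bar{p} - \bigl(\bar{p}\,g(\bar{p}) - f(g(\bar{p}))\bigr) = f(g(\bar{p})) = f(y).$$
This derivation does not require the second derivative of the original function $f$ to be positive everywhere.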
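As shown above, for a convex function $f(x)$, with $x = \bar{x}$ maximizing $px - f(x)$ (or making it bounded) at each $p$ so as to define the Legendre transform $f^*(p) = p\bar{x} - f(\bar{x})$, and with $g \equiv (f')^{-1}$, the following identities hold: $f'(\bar{x}) = p$, $\bar{x} = g(p)$, and $(f^*)'(p) = g(p)$.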
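Consider the exponential function $f(x) = e^x$, which has the domain $I = \mathbb{R}$. From the definition, the Legendre transform is
$$f^*(x^*) = \sup_{x \in \mathbb{R}}\bigl(x^* x - e^x\bigr), \qquad x^* \in I^*,$$
where $I^*$ remains to be determined. To evaluate the supremum, compute the derivative of $x^* x - e^x$ with respect to $x$ and set it equal to zero:
$$\frac{\mathrm{d}}{\mathrm{d}x}\bigl(x^* x - e^x\bigr) = x^* - e^x = 0.$$
The second derivative $-e^x$ is negative everywhere, so the maximal value is achieved at $x = \ln(x^*)$. Thus, the Legendre transform is
$$f^*(x^*) = x^* \ln(x^*) - e^{\ln(x^*)} = x^*\bigl(\ln(x^*) - 1\bigr)$$
and has domain $I^* = (0, \infty)$.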
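To find the Legendre transformation of the Legendre transformation of $f$,
$$f^{**}(x) = \sup_{x^* \in I^*}\bigl(x x^* - x^*(\ln(x^*) - 1)\bigr), \qquad x \in I,$$
where the variable $x$ is intentionally used as the argument of $f^{**}$ to exhibit the involution property $f^{**} = f$, we compute
$$0 = \frac{\mathrm{d}}{\mathrm{d}x^*}\bigl(x x^* - x^*(\ln(x^*) - 1)\bigr) = x - \ln(x^*).$$
Thus the maximum occurs at $x^* = e^x$, because the second derivative of the maximized expression with respect to $x^*$, $-\frac{1}{x^*}$, is negative over the domain $I^* = (0, \infty)$ of $f^*$. As a result, $f^{**}$ is found as
$$f^{**}(x) = x e^x - e^x\bigl(\ln(e^x) - 1\bigr) = e^x,$$
thereby confirming that $f = f^{**}$, as expected.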
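The exponential example can be cross-checked numerically. The sketch below is only an illustration: the helper `sup_on_grid`, the search intervals, the grid sizes, and the sample values `p = 2.0` and `x = 0.7` are assumptions made for the check, and the grid maximum is an approximation to the true supremum.

```python
import math


def sup_on_grid(h, lo, hi, n):
    """Maximize h over n + 1 evenly spaced points in [lo, hi]."""
    return max(h(lo + (hi - lo) * i / n) for i in range(n + 1))


def exp_transform(p):
    """Closed form f*(p) = p (ln p - 1) for f(x) = exp(x), valid for p > 0."""
    return p * (math.log(p) - 1.0)


# First transform: sup_x (p*x - exp(x)) compared with the closed form.
p = 2.0
numeric_transform = sup_on_grid(lambda x: p * x - math.exp(x), -5.0, 5.0, 100000)

# Second transform: sup_q (x*q - f*(q)) should give back exp(x).
x = 0.7
numeric_double_transform = sup_on_grid(
    lambda q: x * q - exp_transform(q), 1e-6, 20.0, 200000
)
```

The search intervals were picked so that the maximizers $x = \ln 2$ and $x^* = e^{0.7}$ lie well inside them; other inputs may need wider intervals.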
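Let $f(x) = cx^2$ be defined on $\mathbb{R}$, where $c > 0$ is a fixed constant.
For fixed $x^*$, the function of $x$, $x^* x - f(x) = x^* x - cx^2$, has first derivative $x^* - 2cx$ and second derivative $-2c$; there is one stationary point at $x = x^*/2c$, which is always a maximum.
Thus $I^* = \mathbb{R}$ and
$$f^*(x^*) = \frac{{x^*}^2}{4c}.$$
The first derivatives of $f$, $2cx$, and of $f^*$, $\frac{x^*}{2c}$, are inverse functions of each other. Furthermore,
$$f^{**}(x) = \frac{1}{4(1/4c)}x^2 = cx^2,$$
namely $f^{**} = f$.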
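Let $f(x) = x^2$ for $x \in I = [2, 3]$.
For fixed $x^*$, $x^* x - f(x)$ is continuous on the compact set $I$, hence it always attains a finite maximum on it; it follows that the domain of the Legendre transform of $f$ is $I^* = \mathbb{R}$.
The stationary point at $x = x^*/2$ (found by setting the first derivative of $x^* x - f(x)$ with respect to $x$ equal to zero) lies in the domain $[2, 3]$ if and only if $4 \le x^* \le 6$. Otherwise the maximum is taken at either $x = 2$ or $x = 3$, because the second derivative of $x^* x - f(x)$ with respect to $x$ is $-2$, which is negative: for $x^* < 4$ the maximum over $x \in [2, 3]$ is attained at $x = 2$, while for $x^* > 6$ it is attained at $x = 3$. It follows that
$$f^*(x^*) = \begin{cases} 2x^* - 4, & x^* < 4, \\ \frac{{x^*}^2}{4}, & 4 \le x^* \le 6, \\ 3x^* - 9, & x^* > 6. \end{cases}$$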
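The three regimes of this example (maximum at the left endpoint, in the interior, or at the right endpoint) can be compared against a brute-force maximum over $[2, 3]$. This is a sketch for illustration; the function names and the grid size are assumptions.

```python
def restricted_square_transform(p):
    """Piecewise closed form of f*(p) for f(x) = x**2 on the interval [2, 3]."""
    if p < 4:
        return 2 * p - 4        # maximum attained at the endpoint x = 2
    if p <= 6:
        return p * p / 4        # interior stationary point x = p / 2
    return 3 * p - 9            # maximum attained at the endpoint x = 3


def restricted_square_transform_brute(p, n=10000):
    """Brute-force maximum of p*x - x**2 over n + 1 grid points in [2, 3]."""
    return max(p * x - x * x for x in (2 + i / n for i in range(n + 1)))


sample_slopes = [-1.0, 3.0, 4.0, 4.5, 5.0, 6.0, 8.0]
differences = [
    abs(restricted_square_transform(p) - restricted_square_transform_brute(p))
    for p in sample_slopes
]
```

All entries of `differences` should be at rounding-error level for these slopes, since each maximizer happens to fall on a grid point.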
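The function $f(x) = cx$ is convex for every $x$ (strict convexity is not required for the Legendre transformation to be well defined). Clearly $x^* x - f(x) = (x^* - c)x$ is never bounded from above as a function of $x$, unless $x^* - c = 0$. Hence $f^*$ is defined on $I^* = \{c\}$ and $f^*(c) = 0$. (The definition of the Legendre transform requires the existence of the supremum, which in turn requires an upper bound.)
One may check involutivity: of course, $x^* x - f^*(x^*)$ is always bounded as a function of $x^* \in \{c\}$, hence $I^{**} = \mathbb{R}$. Then, for all $x$ one has
$$\sup_{x^* \in \{c\}}\bigl(x x^* - f^*(x^*)\bigr) = xc,$$
and hence $f^{**}(x) = cx = f(x)$.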
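As an example of a convex continuous function that is not everywhere differentiable, consider $f(x) = |x|$. This gives
$$f^*(x^*) = \sup_x\bigl(x x^* - |x|\bigr) = \max\Bigl(\sup_{x \ge 0} x(x^* - 1),\ \sup_{x \le 0} x(x^* + 1)\Bigr),$$
and thus $f^*(x^*) = 0$ on its domain $I^* = [-1, 1]$.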
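Let $f(x) = \langle x, Ax\rangle + c$ be defined on $X = \mathbb{R}^n$, where $A$ is a real, positive definite matrix.
Then $f$ is convex, and $\langle p, x\rangle - f(x) = \langle p, x\rangle - \langle x, Ax\rangle - c$ has gradient $p - 2Ax$ and Hessian $-2A$, which is negative definite; hence the stationary point $x = A^{-1}p/2$ is a maximum.
We have $X^* = \mathbb{R}^n$, and
$$f^*(p) = \frac{1}{4}\langle p, A^{-1}p\rangle - c.$$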
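For a concrete check of the quadratic-form example, the sketch below works in two dimensions with plain tuples. It assumes $A$ is symmetric as well as positive definite (needed for the gradient $p - 2Ax$); the matrix, the constant, and the vector used are arbitrary illustrative values, and the function names are hypothetical.

```python
def quadratic_form_transform(A, c, p):
    """f*(p) = <p, A^{-1} p> / 4 - c for f(x) = <x, A x> + c, with A a symmetric 2x2 matrix."""
    (a, b), (b2, d) = A
    det = a * d - b * b2
    # y = A^{-1} p, written out explicitly for a 2x2 matrix.
    y = ((d * p[0] - b * p[1]) / det, (-b2 * p[0] + a * p[1]) / det)
    return 0.25 * (p[0] * y[0] + p[1] * y[1]) - c


def objective(A, c, p, x):
    """Evaluate <p, x> - f(x), the quantity maximized in the definition."""
    Ax = (A[0][0] * x[0] + A[0][1] * x[1], A[1][0] * x[0] + A[1][1] * x[1])
    return p[0] * x[0] + p[1] * x[1] - (x[0] * Ax[0] + x[1] * Ax[1]) - c


# Illustrative data (assumed): A is symmetric positive definite.
A = ((2.0, 1.0), (1.0, 3.0))
c = 0.5
p = (1.0, -2.0)
x_star = (0.5, -0.5)            # stationary point A^{-1} p / 2 for this data

transform_value = quadratic_form_transform(A, c, p)
value_at_stationary_point = objective(A, c, p, x_star)
```

Evaluating `objective` at points near `x_star` gives smaller values, in line with the negative definite Hessian.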
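The Legendre transform is linked to integration by parts, $p\,\mathrm{d}x = \mathrm{d}(px) - x\,\mathrm{d}p$.
Let $f(x, y)$ be a function of two independent variables $x$ and $y$, with the differential
$$\mathrm{d}f = \frac{\partial f}{\partial x}\,\mathrm{d}x + \frac{\partial f}{\partial y}\,\mathrm{d}y = p\,\mathrm{d}x + v\,\mathrm{d}y.$$
Assume that the function $f$ is convex in $x$ for all $y$, so that one may perform the Legendre transform on $f$ in $x$, with $p$ the variable conjugate to $x$ (for information, there is a relation $\frac{\partial f}{\partial x}\big|_{\bar{x}} = p$, where $\bar{x}$ is a point in $x$ maximizing $px - f(x, y)$, or making it bounded, for given $p$ and $y$). Since the new independent variable is $p$, the differentials $\mathrm{d}x$ and $\mathrm{d}y$ are replaced by $\mathrm{d}p$ and $\mathrm{d}y$ in the differential of the transform.
We thus consider the function $g(p, y) = f - px$, so that
$$\mathrm{d}g = \mathrm{d}f - p\,\mathrm{d}x - x\,\mathrm{d}p = -x\,\mathrm{d}p + v\,\mathrm{d}y, \qquad x = -\frac{\partial g}{\partial p}, \qquad v = \frac{\partial g}{\partial y}.$$
The function $-g(p, y)$ is the Legendre transform of $f(x, y)$, where only the independent variable $x$ has been supplanted by $p$. This is widely used in thermodynamics, as illustrated below.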
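A Legendre transform is used in classical mechanics to derive the Hamiltonian formulation from the Lagrangian formulation, and conversely. A typical Lagrangian has the form
$$L(v, q) = \tfrac{1}{2}\langle v, Mv\rangle - V(q),$$
where $(v, q)$ are coordinates on $\mathbb{R}^n \times \mathbb{R}^n$, $M$ is a positive definite real matrix, and $\langle x, y\rangle = \sum_j x_j y_j$.
For every fixed $q$, $L(v, q)$ is a convex function of $v$, while $V(q)$ plays the role of a constant.
Hence the Legendre transform of $L(v, q)$ as a function of $v$ is the Hamiltonian function,
$$H(p, q) = \tfrac{1}{2}\langle p, M^{-1}p\rangle + V(q).$$
In a more general setting, $(v, q)$ are local coordinates on the tangent bundle $T\mathcal{M}$ of a manifold $\mathcal{M}$. For each $q$, $L(v, q)$ is a convex function on the tangent space $V_q$. The Legendre transform gives the Hamiltonian $H(p, q)$ as a function of the coordinates $(p, q)$ of the cotangent bundle $T^*\mathcal{M}$.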
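In one dimension with a scalar mass, passing from the Lagrangian to the Hamiltonian amounts to inverting $p = \partial L/\partial v = mv$ and evaluating $pv - L$. The sketch below illustrates this under stated assumptions: the function name `hamiltonian_1d` and the harmonic potential are hypothetical choices, not part of the text.

```python
def hamiltonian_1d(p, q, m, potential):
    """Legendre transform in v of L(v, q) = m*v**2/2 - potential(q), for scalar m > 0."""
    v = p / m                                     # invert p = dL/dv = m * v
    lagrangian = 0.5 * m * v * v - potential(q)
    return p * v - lagrangian


# Hypothetical potential: a unit-stiffness harmonic well.
def harmonic_potential(q):
    return 0.5 * q * q


energy = hamiltonian_1d(3.0, 2.0, 2.0, harmonic_potential)
expected_energy = 3.0 ** 2 / (2 * 2.0) + harmonic_potential(2.0)
```

The result agrees with $H = p^2/2m + V(q)$, the scalar case of $\tfrac{1}{2}\langle p, M^{-1}p\rangle + V(q)$.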
The strategy behind the use of Legendre transforms in thermodynamics is to shift from a function that depends on a variable to a new (conjugate) function that depends on a new variable, the conjugate of the original one. The new variable is the partial derivative of the original function with respect to the original variable. The new function is the difference between the original function and the product of the old and new variables. Typically, this transformation is useful because it shifts the dependence of, e.g., the energy from an extensive variable to its conjugate intensive variable, which can often be controlled more easily in a physical experiment.
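For example, the internal energy $U$ is an explicit function of the extensive variables entropy $S$, volume $V$, and chemical composition $N_i$ (e.g., $i = 1, 2, 3, \ldots$),
$$U = U(S, V, \{N_i\}),$$
which has the total differential
$$\mathrm{d}U = T\,\mathrm{d}S - P\,\mathrm{d}V + \sum_i \mu_i\,\mathrm{d}N_i,$$
where
$$T = \left.\frac{\partial U}{\partial S}\right\vert_{V, N_i \text{ for all } i}, \quad P = -\left.\frac{\partial U}{\partial V}\right\vert_{S, N_i \text{ for all } i}, \quad \mu_i = \left.\frac{\partial U}{\partial N_i}\right\vert_{S, V, N_j \text{ for all } j \ne i}.$$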
(The subscripts are not required by the definition of partial derivatives but are kept here to make the fixed variables explicit.) Stipulating some common reference state, the enthalpy $H$ may be obtained by using the (non-standard) Legendre transform of the internal energy $U$ with respect to volume $V$, as follows.
To get the (standard) Legendre transform $U^*$ of the internal energy $U$ with respect to volume $V$, the function $u(p, S, V, \{N_i\}) = pV - U$ is defined first; it is then maximized, or bounded, over $V$. To do this, the condition $\frac{\partial u}{\partial V} = p - \frac{\partial U}{\partial V} = 0$, i.e. $p = \frac{\partial U}{\partial V}$, needs to be satisfied, so $U^* = \frac{\partial U}{\partial V}V - U$ is obtained. This approach is justified because $U$ is a linear function with respect to $V$ (and so a convex function of $V$) by the definition of extensive variables. The non-standard Legendre transform here is obtained by negating the standard version, so $-U^* = H = U - \frac{\partial U}{\partial V}V = U + PV$.
$H$ is definitely a state function, as it is obtained by adding $PV$ ($P$ and $V$ being state variables) to a state function $U = U(S, V, \{N_i\})$, so its differential is an exact differential. Because $\mathrm{d}H = T\,\mathrm{d}S + V\,\mathrm{d}P + \sum_i \mu_i\,\mathrm{d}N_i$ and it must be an exact differential, $H = H(S, P, \{N_i\})$.
The enthalpy is suitable for description of processes in which the pressure is controlled from the surroundings.
It is likewise possible to shift the dependence of the energy from the extensive variable of entropy, $S$, to the (often more convenient) intensive variable $T$, resulting in the Helmholtz and Gibbs free energies. The Helmholtz free energy $A$ and the Gibbs energy $G$ are obtained by performing Legendre transforms of the internal energy and enthalpy, respectively,
$$A = U - TS, \qquad G = H - TS = U + PV - TS.$$
The Helmholtz free energy is often the most useful thermodynamic potential when temperature and volume are controlled from the surroundings, while the Gibbs energy is often the most useful when temperature and pressure are controlled from the surroundings.
As another example from physics, consider a parallel conductive plate capacitor, in which the plates can move relative to one another. Such a capacitor would allow transfer of the electric energy which is stored in the capacitor into external mechanical work, done by the force acting on the plates. One may think of the electric charge as analogous to the "charge" of a gas in a cylinder, with the resulting mechanical force exerted on a piston.
Compute the force on the plates as a function of $x$, the distance which separates them. To find the force, compute the potential energy, and then apply the definition of force as the gradient of the potential energy function.
The electrostatic potential energy stored in a capacitor of capacitance $C(x)$ with a positive electric charge $+Q$ or a negative charge $-Q$ on each conductive plate is (using the definition of the capacitance, $C = Q/V$)
$$U(Q, x) = \frac{1}{2}QV(Q, x) = \frac{1}{2}\frac{Q^2}{C(x)},$$
where the dependence on the area of the plates, the dielectric constant of the insulating material between the plates, and the separation $x$ are abstracted away as the capacitance $C(x)$. (For a parallel plate capacitor, this is proportional to the area of the plates and inversely proportional to the separation.)
The force $F$ between the plates due to the electric field created by the charge separation is then
$$F(x) = -\frac{\mathrm{d}U}{\mathrm{d}x}.$$
If the capacitor is not connected to any electric circuit, then the electric charges on the plates remain constant and the voltage varies when the plates move with respect to each other, and the force is the negative gradient of the electrostatic potential energy,
$$F(x) = \frac{1}{2}\frac{\mathrm{d}C(x)}{\mathrm{d}x}\frac{Q^2}{C(x)^2} = \frac{1}{2}\frac{\mathrm{d}C(x)}{\mathrm{d}x}V(x)^2,$$
where $V(Q, x) = V(x)$, as the charge is fixed in this configuration.
However, suppose instead that the voltage $V$ between the plates is held constant as the plates move, by connection to a battery, which is a reservoir for electric charges at a constant potential difference. Then the amount of charge $Q$ is a variable instead of the voltage; $Q$ and $V$ are Legendre conjugate to each other. To find the force, first compute the non-standard Legendre transform $U^*$ of $U$, in which $V$ replaces $Q$ as the independent variable (again using $C = Q/V$),
$$U^* = U - \left.\frac{\partial U}{\partial Q}\right|_x Q = U - \frac{1}{2C(x)}\left.\frac{\partial Q^2}{\partial Q}\right|_x Q = U - QV = \frac{1}{2}QV - QV = -\frac{1}{2}QV = -\frac{1}{2}V^2 C(x).$$
This transformation is possible because $U$ is now a linear function of $Q$ and so is convex in it. The force now becomes the negative gradient of this Legendre transform, resulting in the same force obtained from the original function $U$,
$$F(x) = -\frac{\mathrm{d}U^*}{\mathrm{d}x} = \frac{1}{2}\frac{\mathrm{d}C(x)}{\mathrm{d}x}V^2.$$
The two conjugate energies $U$ and $U^*$ happen to stand opposite to each other (their signs are opposite) only because of the linearity of the capacitance, except that now $Q$ is no longer a constant. They reflect the two different pathways of storing energy into the capacitor, resulting in, for instance, the same "pull" between a capacitor's plates.
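That the fixed-charge energy $U$ and the fixed-voltage transform $U^*$ give the same force can be checked numerically for a parallel-plate model. The sketch below is illustrative only: it assumes $C(x) = \varepsilon A / x$ with the product $\varepsilon A$ set to an arbitrary value, and it uses a central finite difference in place of the exact derivative.

```python
EPSILON_TIMES_AREA = 1.0   # assumed value of (permittivity * plate area), arbitrary units


def capacitance(x):
    """Parallel-plate model C(x) = epsilon * A / x (an assumption for this sketch)."""
    return EPSILON_TIMES_AREA / x


def derivative(g, x, h=1e-6):
    """Central finite-difference approximation of g'(x)."""
    return (g(x + h) - g(x - h)) / (2 * h)


def force_fixed_charge(Q, x):
    """F = -dU/dx with U(Q, x) = Q**2 / (2 C(x)), charge held constant."""
    return -derivative(lambda s: 0.5 * Q * Q / capacitance(s), x)


def force_fixed_voltage(V, x):
    """F = -dU*/dx with U*(V, x) = -V**2 C(x) / 2, voltage held constant."""
    return -derivative(lambda s: -0.5 * V * V * capacitance(s), x)


# Consistent state: at x = 0.5 the capacitance is 2, so Q = 2 corresponds to V = 1.
force_from_U = force_fixed_charge(2.0, 0.5)
force_from_U_star = force_fixed_voltage(1.0, 0.5)
```

Both forces come out negative in this model, meaning they act to decrease the separation $x$, which is the attraction between the plates.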
In large deviations theory, the rate function is defined as the Legendre transformation of the logarithm of the moment generating function of a random variable. An important application of the rate function is in the calculation of tail probabilities of sums of i.i.d. random variables, in particular in Cramér's theorem.
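If $X_n$ are i.i.d. random variables, let $S_n = X_1 + \cdots + X_n$ be the associated random walk and $M(\xi)$ the moment generating function of $X_1$. For $\xi \in \mathbb{R}$, $E[e^{\xi S_n}] = M(\xi)^n$. Hence, by Markov's inequality, one has for $\xi \ge 0$ and $a \in \mathbb{R}$
$$P(S_n/n > a) \le e^{-n\xi a}M(\xi)^n = \exp\bigl[-n(\xi a - \Lambda(\xi))\bigr],$$
where $\Lambda(\xi) = \log M(\xi)$. Since the left-hand side is independent of $\xi$, we may take the infimum of the right-hand side, which leads one to consider the supremum of $\xi a - \Lambda(\xi)$, i.e., the Legendre transform of $\Lambda$, evaluated at $x = a$.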
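As an illustration of the bound, the rate function can be evaluated numerically for a specific distribution. The sketch below assumes fair Bernoulli variables ($X_1 \in \{0, 1\}$ with probability $1/2$ each), for which $\Lambda(\xi) = \log\frac{1 + e^{\xi}}{2}$; the function names, the search interval for $\xi$, and the grid size are illustrative assumptions.

```python
import math


def log_mgf_fair_bernoulli(xi):
    """Lambda(xi) = log E[exp(xi * X)] for X in {0, 1} with probability 1/2 each."""
    return math.log(0.5 * (1.0 + math.exp(xi)))


def rate_function(a, xi_max=20.0, n=200000):
    """Grid approximation of sup over 0 <= xi <= xi_max of xi*a - Lambda(xi)."""
    return max(
        xi * a - log_mgf_fair_bernoulli(xi)
        for xi in (xi_max * i / n for i in range(n + 1))
    )


def tail_bound(a, num_terms):
    """Upper bound exp(-n * rate) on P(S_n / n > a) from the inequality in the text."""
    return math.exp(-num_terms * rate_function(a))


# Closed form for comparison: log 2 + a log a + (1 - a) log(1 - a), for 1/2 <= a < 1.
a = 0.75
exact_rate = math.log(2) + a * math.log(a) + (1 - a) * math.log(1 - a)
```

At the mean $a = 1/2$ the rate is zero, so the bound is trivial there; it becomes exponentially small in $n$ only for $a$ above the mean.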
The Legendre transformation arises naturally in microeconomics in the process of finding the supply $S(P)$ of some product given a fixed price $P$ on the market, knowing the cost function $C(Q)$, i.e. the cost for the producer to make/mine/etc. $Q$ units of the given product.
A simple theory explains the shape of the supply curve based solely on the cost function. Let us suppose the market price for one unit of our product is $P$. For a company selling this good, the best strategy is to adjust the production $Q$ so that its profit is maximized. We can maximize the profit
$$\text{profit} = \text{revenue} - \text{costs} = PQ - C(Q)$$
by differentiating with respect to $Q$ and solving
$$P - C'(Q_\text{opt}) = 0.$$
$Q_\text{opt}$ represents the optimal quantity $Q$ of goods that the producer is willing to supply, which is indeed the supply itself:
$$S(P) = Q_\text{opt}(P) = (C')^{-1}(P).$$
If we consider the maximal profit as a function of price, $\text{profit}_\text{max}(P)$, we see that it is the Legendre transform of the cost function $C(Q)$.
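For a concrete cost function the supply curve and the maximal profit follow directly from the first-order condition. The sketch below assumes a quadratic cost $C(Q) = kQ^2$ with $k > 0$; this cost function, the coefficient name `k`, and the numbers used are hypothetical and serve only to illustrate the relationship.

```python
def quadratic_cost(quantity, k):
    """Assumed cost function C(Q) = k * Q**2 with k > 0."""
    return k * quantity * quantity


def supply(price, k):
    """Optimal quantity from P - C'(Q) = 0, i.e. Q = P / (2k) for the quadratic cost."""
    return price / (2.0 * k)


def max_profit(price, k):
    """Profit P*Q - C(Q) evaluated at the optimal quantity."""
    quantity = supply(price, k)
    return price * quantity - quadratic_cost(quantity, k)


best_profit = max_profit(10.0, 0.5)
legendre_value = 10.0 ** 2 / (4 * 0.5)   # Legendre transform of k*Q**2 at P is P**2 / (4k)
```

Here `max_profit` coincides with the Legendre transform $P^2/4k$ of the cost, and `supply` is the derivative of that transform with respect to price.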
For a strictly convex function, the Legendre transformation can be interpreted as a mapping between the graph of the function and the family of tangents of the graph. (For a function of one variable, the tangents are well-defined at all but at most countably many points, since a convex function is differentiable at all but at most countably many points.)
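The equation of a line with slope $p$ and $y$-intercept $b$ is given by $y = px + b$. For this line to be tangent to the graph of a function $f$ at the point $\left(x_0, f(x_0)\right)$ requires
$$f(x_0) = p x_0 + b \qquad\text{and}\qquad p = f'(x_0).$$
Being the derivative of a strictly convex function, the function $f'$ is strictly monotone and thus injective. The second equation can be solved for $x_0 = f'^{-1}(p)$, allowing elimination of $x_0$ from the first, and solving for the $y$-intercept $b$ of the tangent as a function of its slope $p$,
$$b = f(x_0) - p x_0 = f\left(f'^{-1}(p)\right) - p \cdot f'^{-1}(p) = -f^\star(p),$$
where $f^\star$ denotes the Legendre transform of $f$.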
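The family of tangent lines of the graph of $f$ parameterized by the slope $p$ is therefore given by
$$y = px - f^\star(p),$$
or, written implicitly, by the solutions of the equation
$$F(x, y, p) = y + f^\star(p) - px = 0.$$
The graph of the original function can be reconstructed from this family of lines as the envelope of this family by demanding
$$\frac{\partial F(x, y, p)}{\partial p} = f^{\star\prime}(p) - x = 0.$$
Eliminating $p$ from these two equations gives
$$y = x \cdot f^{\star\prime -1}(x) - f^\star\left(f^{\star\prime -1}(x)\right).$$
Identifying $y$ with $f(x)$ and recognizing the right side of the preceding equation as the Legendre transform of $f^\star$ yields
$$f(x) = f^{\star\star}(x).$$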
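The statement that the $y$-intercept of the tangent with slope $p$ equals $-f^\star(p)$ is easy to test on a function whose transform is known. The sketch below uses $f(x) = e^x$ with $f^\star(p) = p(\ln p - 1)$; the helper names and the sample points are illustrative assumptions.

```python
import math


def tangent_intercept(f, df, x0):
    """y-intercept b of the tangent line to the graph of f at x0: b = f(x0) - f'(x0) * x0."""
    return f(x0) - df(x0) * x0


def exp_transform(p):
    """Legendre transform of exp: p (ln p - 1), for p > 0."""
    return p * (math.log(p) - 1.0)


sample_points = [-1.0, 0.0, 1.5]
# For f = exp the slope at x0 is p = exp(x0); compare b with -f*(p).
mismatches = [
    abs(tangent_intercept(math.exp, math.exp, x0) + exp_transform(math.exp(x0)))
    for x0 in sample_points
]
```

Each entry of `mismatches` is zero up to rounding, so the pairs (slope, negative intercept) trace the graph of the transform.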
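For a differentiable real-valued function $f$ on an open convex subset $U$ of $\mathbb{R}^n$, the Legendre conjugate of the pair $(U, f)$ is defined to be the pair $(V, g)$, where $V$ is the image of $U$ under the gradient mapping $Df$, and $g$ is the function on $V$ given by the formula
$$g(y) = \langle y, x\rangle - f(x), \qquad x = (Df)^{-1}(y),$$
where $\langle u, v\rangle = \sum_{k=1}^{n} u_k v_k$ is the scalar product on $\mathbb{R}^n$. The multidimensional transform can be interpreted as an encoding of the convex hull of the function's epigraph in terms of its supporting hyperplanes.[2] This can be seen as a consequence of the following two observations. On the one hand, the hyperplane tangent to the epigraph of $f$ at some point $(x, f(x)) \in U \times \mathbb{R}$ has normal vector $(\nabla f(x), -1) \in \mathbb{R}^{n+1}$. On the other hand, any closed convex set $C \subseteq \mathbb{R}^m$ can be characterized via the set of its supporting hyperplanes by the equations $x \cdot n = h_C(n)$, where $h_C(n)$ is the support function of $C$. But the definition of the Legendre transform via maximization matches precisely that of the support function, that is, $f^*(x) = h_{\operatorname{epi}(f)}(x, -1)$. We thus conclude that the Legendre transform characterizes the epigraph, in the sense that the tangent plane to the epigraph at any point $(x, f(x))$ is determined explicitly by it.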
Alternatively, if $X$ is a vector space and $Y$ is its dual vector space, then for each point $x$ of $X$ and $y$ of $Y$, there is a natural identification of the cotangent spaces $T^*X_x$ with $Y$ and $T^*Y_y$ with $X$. If $f$ is a real differentiable function over $X$, then its exterior derivative, $\mathrm{d}f$, is a section of the cotangent bundle $T^*X$ and as such, we can construct a map from $X$ to $Y$. Similarly, if $g$ is a real differentiable function over $Y$, then $\mathrm{d}g$ defines a map from $Y$ to $X$. If both maps happen to be inverses of each other, we say we have a Legendre transform. The notion of the tautological one-form is commonly used in this setting.
When the function is not differentiable, the Legendre transform can still be extended, and is known as the Legendre-Fenchel transformation. In this more general setting, a few properties are lost: for example, the Legendre transform is no longer its own inverse (unless there are extra assumptions, like convexity).
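Let $M$ be a smooth manifold, let $E$ and $\pi : E \to M$ be a vector bundle on $M$ and its associated bundle projection, respectively, and let $L : E \to \mathbb{R}$ be a smooth function, which we think of as a Lagrangian by analogy with classical mechanics.
As usual, the dual of $E$ is denoted by $E^*$. The fiber of $\pi$ over $x \in M$ is denoted $E_x$, and the restriction of $L$ to $E_x$ is denoted by $L|_{E_x} : E_x \to \mathbb{R}$. The Legendre transformation of $L$ is the smooth morphism $\mathbf{F}L : E \to E^*$ defined by $\mathbf{F}L(v) = \mathrm{d}(L|_{E_x})_v \in E_x^*$, where $x = \pi(v)$. Here we use the fact that since $E_x$ is a vector space, $T_v(E_x)$ can be identified with $E_x$. In other words, $\mathbf{F}L(v) \in E_x^*$ is the covector that sends $w \in E_x$ to the directional derivative $\left.\frac{\mathrm{d}}{\mathrm{d}t}\right|_{t=0} L(v + tw) \in \mathbb{R}$.
To describe the Legendre transformation locally, let $U \subseteq M$ be a coordinate chart over which $E$ is trivial. Picking a trivialization of $E$ over $U$, we obtain charts $E_U \cong U \times \mathbb{R}^r$ and $E_U^* \cong U \times \mathbb{R}^r$. In terms of these charts, we have $\mathbf{F}L(x; v_1, \dotsc, v_r) = (x; p_1, \dotsc, p_r)$, where $p_i = \frac{\partial L}{\partial v_i}(x; v_1, \dotsc, v_r)$ for all $i = 1, \dots, r$. If, as in the classical case, the restriction of $L : E \to \mathbb{R}$ to each fiber $E_x$ is strictly convex and bounded below by a positive definite quadratic form minus a constant, then the Legendre transform $\mathbf{F}L : E \to E^*$ is a diffeomorphism.[3] Suppose that $\mathbf{F}L$ is a diffeomorphism and let $H : E^* \to \mathbb{R}$ be the "Hamiltonian" function defined by $H(p) = p \cdot v - L(v)$, where $v = (\mathbf{F}L)^{-1}(p)$. Using the natural isomorphism $E \cong E^{**}$, we may view the Legendre transformation of $H$ as a map $\mathbf{F}H : E^* \to E$. Then we have[3]
$$(\mathbf{F}L)^{-1} = \mathbf{F}H.$$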
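The Legendre transformation has the following scaling properties: for $a > 0$,
$$f(x) = a \cdot g(x) \Rightarrow f^\star(p) = a \cdot g^\star\left(\frac{p}{a}\right),$$
$$f(x) = g(a \cdot x) \Rightarrow f^\star(p) = g^\star\left(\frac{p}{a}\right).$$
It follows that if a function is homogeneous of degree $r$ then its image under the Legendre transformation is a homogeneous function of degree $s$, where $1/r + 1/s = 1$. (Since $f(x) = x^r/r$, with $r > 1$, implies $f^*(p) = p^s/s$.) Thus, the only monomial whose degree is invariant under Legendre transform is the quadratic.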
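Both scaling rules can be checked numerically on a non-quadratic example. The sketch below assumes $g(x) = x^4/4$, whose transform is $g^\star(p) = \tfrac{3}{4}|p|^{4/3}$ (degrees $r = 4$ and $s = 4/3$, so $1/r + 1/s = 1$); the helper names, the search interval, the grid size, and the values of `a` and `p` are illustrative choices.

```python
def sup_on_grid(h, lo, hi, n):
    """Maximize h over n + 1 evenly spaced points in [lo, hi]."""
    return max(h(lo + (hi - lo) * i / n) for i in range(n + 1))


def transform(f, p, lo=-5.0, hi=5.0, n=100000):
    """Grid approximation of f*(p) = sup_x (p*x - f(x)) on the interval [lo, hi]."""
    return sup_on_grid(lambda x: p * x - f(x), lo, hi, n)


def g(x):
    return x ** 4 / 4.0


def g_star(p):
    # Closed form of the transform of x**4 / 4.
    return 0.75 * abs(p) ** (4.0 / 3.0)


a, p = 2.0, 3.0
scaled_value = transform(lambda x: a * g(x), p)     # f(x) = a * g(x)
dilated_value = transform(lambda x: g(a * x), p)    # f(x) = g(a * x)
```

The two numerical values can then be compared with $a\,g^\star(p/a)$ and $g^\star(p/a)$ respectively.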
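Let $A : \mathbb{R}^n \to \mathbb{R}^m$ be a linear transformation. For any convex function $f$ on $\mathbb{R}^n$, one has
$$(Af)^\star = f^\star A^\star,$$
where $A^\star$ is the adjoint operator of $A$ defined by
$$\langle Ax, y^\star\rangle = \langle x, A^\star y^\star\rangle,$$
and $Af$ is the push-forward of $f$ along $A$,
$$(Af)(y) = \inf\{f(x) : x \in X, Ax = y\}.$$
A closed convex function $f$ is symmetric with respect to a given set $G$ of orthogonal linear transformations,
$$f(Ax) = f(x), \quad \forall x, \ \forall A \in G,$$
if and only if $f^\star$ is symmetric with respect to $G$.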
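The infimal convolution of two functions $f$ and $g$ is defined as
$$(f \star_{\inf} g)(x) = \inf\{f(x - y) + g(y) \mid y \in \mathbb{R}^n\}.$$
Let $f_1, \ldots, f_m$ be proper convex functions on $\mathbb{R}^n$. Then
$$(f_1 \star_{\inf} \cdots \star_{\inf} f_m)^\star = f_1^\star + \cdots + f_m^\star.$$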
For any function $f$ and its convex conjugate $f^\star$, Fenchel's inequality (also known as the Fenchel–Young inequality) holds for every $x \in X$ and $p \in X^*$, i.e., for independent pairs $x, p$,
$$\langle p, x\rangle \le f(x) + f^\star(p).$$
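The inequality can be probed numerically by evaluating the gap $f(x) + f^\star(p) - px$ on independent pairs. The sketch below assumes $f(x) = e^x$ with $f^\star(p) = p(\ln p - 1)$ for $p > 0$; the function name `fenchel_gap` and the sample values are illustrative.

```python
import math


def fenchel_gap(x, p):
    """Gap f(x) + f*(p) - p*x for f = exp and f*(p) = p (ln p - 1), with p > 0."""
    return math.exp(x) + p * (math.log(p) - 1.0) - p * x


# Independent (x, p) pairs: the gap should never be negative.
sample_x = [-2.0, -0.5, 0.0, 1.0, 3.0]
sample_p = [0.1, 0.5, 1.0, 2.0, 10.0]
gaps = [fenchel_gap(x, p) for x in sample_x for p in sample_p]

# The gap closes exactly when p = f'(x) = exp(x).
gap_at_conjugate_pair = fenchel_gap(1.0, math.exp(1.0))
```

Equality holds only for conjugate pairs, which is the one-dimensional form of the condition $p = f'(x)$ used throughout the article.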