Calculus of variations explained
The calculus of variations (or variational calculus) is a field of mathematical analysis that uses variations, which are small changes in functions and functionals, to find maxima and minima of functionals: mappings from a set of functions to the real numbers. Functionals are often expressed as definite integrals involving functions and their derivatives. Functions that maximize or minimize functionals may be found using the Euler–Lagrange equation of the calculus of variations.
A simple example of such a problem is to find the curve of shortest length connecting two points. If there are no constraints, the solution is a straight line between the points. However, if the curve is constrained to lie on a surface in space, then the solution is less obvious, and possibly many solutions may exist. Such solutions are known as geodesics. A related problem is posed by Fermat's principle: light follows the path of shortest optical length connecting two points, which depends upon the material of the medium. One corresponding concept in mechanics is the principle of least/stationary action.
Many important problems involve functions of several variables. Solutions of boundary value problems for the Laplace equation satisfy Dirichlet's principle. Plateau's problem requires finding a surface of minimal area that spans a given contour in space: a solution can often be found by dipping a frame in soapy water. Although such experiments are relatively easy to perform, their mathematical formulation is far from simple: there may be more than one locally minimizing surface, and the surfaces may have non-trivial topology.
History
The calculus of variations may be said to begin with Newton's minimal resistance problem in 1687, followed by the brachistochrone curve problem raised by Johann Bernoulli (1696).[1] It immediately occupied the attention of Jacob Bernoulli and the Marquis de l'Hôpital, but Leonhard Euler first elaborated the subject, beginning in 1733. Lagrange was influenced by Euler's work to contribute significantly to the theory. After Euler saw the 1755 work of the 19-year-old Lagrange, Euler dropped his own partly geometric approach in favor of Lagrange's purely analytic approach and renamed the subject the calculus of variations in his 1756 lecture Elementa Calculi Variationum.[2] [3]
Legendre (1786) laid down a method, not entirely satisfactory, for the discrimination of maxima and minima. Isaac Newton and Gottfried Leibniz also gave some early attention to the subject.[4] To this discrimination Vincenzo Brunacci (1810), Carl Friedrich Gauss (1829), Siméon Poisson (1831), Mikhail Ostrogradsky (1834), and Carl Jacobi (1837) have been among the contributors. An important general work is that of Sarrus (1842), which was condensed and improved by Cauchy (1844). Other valuable treatises and memoirs have been written by Strauch (1849), Jellett (1850), Otto Hesse (1857), Alfred Clebsch (1858), and Lewis Buffett Carll (1885), but perhaps the most important work of the century is that of Weierstrass. His celebrated course on the theory is epoch-making, and it may be asserted that he was the first to place it on a firm and unquestionable foundation. The 20th and 23rd Hilbert problems, published in 1900, encouraged further development.
In the 20th century David Hilbert, Oskar Bolza, Gilbert Ames Bliss, Emmy Noether, Leonida Tonelli, Henri Lebesgue and Jacques Hadamard among others made significant contributions. Marston Morse applied calculus of variations in what is now called Morse theory.[5] Lev Pontryagin, Ralph Rockafellar and F. H. Clarke developed new mathematical tools for the calculus of variations in optimal control theory. The dynamic programming of Richard Bellman is an alternative to the calculus of variations.[6] [7] [8]
Extrema
The calculus of variations is concerned with the maxima or minima (collectively called extrema) of functionals. A functional maps functions to scalars, so functionals have been described as "functions of functions." Functionals have extrema with respect to the elements $y$ of a given function space defined over a given domain. A functional $J[y]$ is said to have an extremum at the function $f$ if $\Delta J = J[y] - J[f]$ has the same sign for all $y$ in an arbitrarily small neighborhood of $f.$ The function $f$ is called an extremal function or extremal. The extremum $J[f]$ is called a local maximum if $\Delta J \leq 0$ everywhere in an arbitrarily small neighborhood of $f,$ and a local minimum if $\Delta J \geq 0$ there.
For a function space of continuously differentiable functions, extrema of functionals are called weak or strong according to how the neighborhood of $f$ is measured. A weak extremum is taken over functions $y$ for which both $y$ and its first derivative $y'$ are close to those of $f$; a strong extremum is taken over the larger set of functions for which only $y$ itself is required to be close to $f.$ Thus a strong extremum is also a weak extremum, but the converse may not hold. Finding strong extrema is more difficult than finding weak extrema. An example of a necessary condition that is used for finding weak extrema is the Euler–Lagrange equation.
Euler–Lagrange equation
See main article: Euler–Lagrange equation. Finding the extrema of functionals is similar to finding the maxima and minima of functions. The maxima and minima of a function may be located by finding the points where its derivative vanishes (i.e., is equal to zero). The extrema of functionals may be obtained by finding functions for which the functional derivative is equal to zero. This leads to solving the associated Euler–Lagrange equation.
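Consider the functional
$$J[y] = \int_{x_1}^{x_2} L\left(x,y(x),y'(x)\right)\,dx,$$
where $x_1, x_2$ are constants, $y(x)$ is twice continuously differentiable, $y'(x) = \frac{dy}{dx},$ and $L\left(x,y(x),y'(x)\right)$ is twice continuously differentiable with respect to its arguments $x,$ $y,$ and $y'.$
If the functional $J[y]$ attains a local minimum at $f,$ and $\eta(x)$ is an arbitrary function that has at least one derivative and vanishes at the endpoints $x_1$ and $x_2,$ then for any number $\varepsilon$ close to 0,
$$J[f] \le J[f + \varepsilon \eta].$$
The term $\varepsilon \eta$ is called the variation of the function $f$ and is denoted by $\delta f.$
Substituting $f + \varepsilon \eta$ for $y$ in the functional $J[y],$ the result is a function of $\varepsilon,$
$$\Phi(\varepsilon) = J[f + \varepsilon \eta].$$
Since the functional $J[y]$ has a minimum for $y = f,$ the function $\Phi(\varepsilon)$ has a minimum at $\varepsilon = 0$ and thus,
$$\Phi'(0) \equiv \left.\frac{d\Phi}{d\varepsilon}\right|_{\varepsilon = 0} = \int_{x_1}^{x_2} \left.\frac{dL}{d\varepsilon}\right|_{\varepsilon = 0} dx = 0.$$
Taking the total derivative of $L\left[x, y, y'\right],$ where $y = f + \varepsilon \eta$ and $y' = f' + \varepsilon \eta'$ are considered as functions of $\varepsilon$ rather than $x,$ yields
$$\frac{dL}{d\varepsilon} = \frac{\partial L}{\partial y}\frac{dy}{d\varepsilon} + \frac{\partial L}{\partial y'}\frac{dy'}{d\varepsilon},$$
and because $\frac{dy}{d\varepsilon} = \eta$ and $\frac{dy'}{d\varepsilon} = \eta',$
$$\frac{dL}{d\varepsilon} = \frac{\partial L}{\partial y}\eta + \frac{\partial L}{\partial y'}\eta'.$$
Therefore,
$$\begin{aligned}
\int_{x_1}^{x_2} \left.\frac{dL}{d\varepsilon}\right|_{\varepsilon = 0} dx
&= \int_{x_1}^{x_2} \left(\frac{\partial L}{\partial f}\eta + \frac{\partial L}{\partial f'}\eta'\right) dx \\
&= \int_{x_1}^{x_2} \frac{\partial L}{\partial f}\eta \,dx + \left.\frac{\partial L}{\partial f'}\eta\right|_{x_1}^{x_2} - \int_{x_1}^{x_2} \eta \frac{d}{dx}\frac{\partial L}{\partial f'}\,dx \\
&= \int_{x_1}^{x_2} \left(\frac{\partial L}{\partial f}\eta - \eta \frac{d}{dx}\frac{\partial L}{\partial f'}\right) dx,
\end{aligned}$$
where $L\left[x,y,y'\right] \to L\left[x,f,f'\right]$ when $\varepsilon = 0$ and we have used integration by parts on the second term. The second term on the second line vanishes because $\eta = 0$ at $x_1$ and $x_2$ by definition. Also, as previously mentioned, the left side of the equation is zero, so that
$$\int_{x_1}^{x_2} \eta(x) \left(\frac{\partial L}{\partial f} - \frac{d}{dx}\frac{\partial L}{\partial f'}\right) dx = 0.$$
According to the fundamental lemma of calculus of variations, the part of the integrand in parentheses is zero, i.e.
$$\frac{\partial L}{\partial f} - \frac{d}{dx}\frac{\partial L}{\partial f'} = 0,$$
which is called the Euler–Lagrange equation. The left hand side of this equation is called the functional derivative of $J[f]$ and is denoted $\delta J/\delta f(x).$
In general this gives a second-order ordinary differential equation which can be solved to obtain the extremal function $f(x).$ The Euler–Lagrange equation is a necessary, but not sufficient, condition for an extremum $J[f].$ A sufficient condition for a minimum is given in the section Variations and sufficient condition for a minimum.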
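The stationarity condition $\Phi'(0) = 0$ can be checked numerically. The following is a minimal sketch under assumptions that are not in the text: the integrand is taken to be $L = y'^2$ on $[0, 1],$ the extremal is $f(x) = x$ (which satisfies the Euler–Lagrange equation $f'' = 0$), and the variation is $\eta(x) = \sin(\pi x).$ The function names are illustrative only.

```python
import math

def functional(y_prime, a=0.0, b=1.0, n=2000):
    """Midpoint-rule approximation of J[y] = integral over [a, b] of y'(x)^2 dx."""
    h = (b - a) / n
    return h * sum(y_prime(a + (i + 0.5) * h) ** 2 for i in range(n))

def phi(eps):
    """Phi(eps) = J[f + eps*eta] for f(x) = x and eta(x) = sin(pi*x)."""
    # Derivative of the varied curve: f'(x) + eps * eta'(x).
    return functional(lambda x: 1.0 + eps * math.pi * math.cos(math.pi * x))

def phi_prime_at_zero(step=1e-4):
    """Central-difference estimate of Phi'(0)."""
    return (phi(step) - phi(-step)) / (2.0 * step)

if __name__ == "__main__":
    print(phi_prime_at_zero())   # close to 0: f is a stationary point of J
    print(phi(0.1) - phi(0.0))   # positive: this variation raises J
```

For this particular choice, $\Phi(\varepsilon) = 1 + \varepsilon^2 \pi^2 / 2$ in closed form, so the numerical derivative at zero should vanish up to rounding error.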
Example
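In order to illustrate this process, consider the problem of finding the extremal function $y = f(x),$ which is the shortest curve that connects two points $\left(x_1, y_1\right)$ and $\left(x_2, y_2\right).$ The arc length of the curve is given by
$$A[y] = \int_{x_1}^{x_2} \sqrt{1 + [y'(x)]^2}\,dx,$$
with
$$y'(x) = \frac{dy}{dx}, \quad y_1 = f(x_1), \quad y_2 = f(x_2).$$
Note that assuming $y$ is a function of $x$ loses generality; ideally both should be functions of some other parameter. This approach is good solely for instructive purposes.
The Euler–Lagrange equation will now be used to find the extremal function $f(x)$ that minimizes the functional $A[y]$:
$$\frac{\partial L}{\partial f} - \frac{d}{dx}\frac{\partial L}{\partial f'} = 0$$
with
$$L = \sqrt{1 + [f'(x)]^2}.$$
Since $f$ does not appear explicitly in $L,$ the first term in the Euler–Lagrange equation vanishes for all $f(x)$ and thus,
$$\frac{d}{dx}\frac{\partial L}{\partial f'} = 0.$$
Substituting for $L$ and taking the derivative,
$$\frac{d}{dx}\,\frac{f'(x)}{\sqrt{1 + [f'(x)]^2}} = 0.$$
Thus
$$\frac{f'(x)}{\sqrt{1 + [f'(x)]^2}} = c$$
for some constant $c.$ Then
$$\frac{[f'(x)]^2}{1 + [f'(x)]^2} = c^2,$$
where
$$0 \le c^2 < 1.$$
Solving, we get
$$[f'(x)]^2 = \frac{c^2}{1 - c^2},$$
which implies that
$$f'(x) = m$$
is a constant and therefore that the shortest curve that connects two points $\left(x_1, y_1\right)$ and $\left(x_2, y_2\right)$ is
$$f(x) = m x + b \qquad \text{with} \quad m = \frac{y_2 - y_1}{x_2 - x_1} \quad \text{and} \quad b = \frac{x_2 y_1 - x_1 y_2}{x_2 - x_1},$$
and we have thus found the extremal function $f(x)$ that minimizes the functional $A[y]$ so that $A[f]$ is a minimum. The equation for a straight line is $y = mx + b.$ In other words, the shortest distance between two points is a straight line.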
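As an illustration of this result, the sketch below compares the length of the straight line through two points with the length of a perturbed curve that shares the same endpoints. The endpoints $(0, 0)$ and $(1, 2),$ the sinusoidal bump, and the chord-sum approximation of arc length are all assumptions chosen for the example.

```python
import math

def polyline_length(f, x1, x2, n=1000):
    """Approximate the arc length of y = f(x) on [x1, x2] by summing n chords."""
    h = (x2 - x1) / n
    xs = [x1 + i * h for i in range(n + 1)]
    return sum(math.hypot(xs[i + 1] - xs[i], f(xs[i + 1]) - f(xs[i]))
               for i in range(n))

def line_through(x1, y1, x2, y2):
    """Return f(x) = m*x + b through (x1, y1) and (x2, y2)."""
    m = (y2 - y1) / (x2 - x1)
    b = (x2 * y1 - x1 * y2) / (x2 - x1)
    return lambda x: m * x + b

straight = line_through(0.0, 0.0, 1.0, 2.0)

def bumped(x):
    """Same endpoints as the straight line, with a bump that vanishes at x = 0 and x = 1."""
    return straight(x) + 0.2 * math.sin(math.pi * x)

if __name__ == "__main__":
    print(polyline_length(straight, 0.0, 1.0))  # about sqrt(5)
    print(polyline_length(bumped, 0.0, 1.0))    # strictly larger
```

The comparison with one perturbed curve does not prove minimality; it only shows the functional increasing along one admissible variation.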
Beltrami's identity
In physics problems it may be the case that $\frac{\partial L}{\partial x} = 0,$ meaning the integrand is a function of $f(x)$ and $f'(x)$ but $x$ does not appear separately. In that case, the Euler–Lagrange equation can be simplified to the Beltrami identity[9]
$$L - f' \frac{\partial L}{\partial f'} = C,$$
where $C$ is a constant. The left hand side is the Legendre transformation of $L$ with respect to $f'(x).$
The intuition behind this result is that, if the variable $x$ is actually time, then the statement $\frac{\partial L}{\partial x} = 0$ implies that the Lagrangian is time-independent. By Noether's theorem, there is an associated conserved quantity. In this case, this quantity is the Hamiltonian, the Legendre transform of the Lagrangian, which (often) coincides with the energy of the system. This is (minus) the constant in Beltrami's identity.
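The identity can be checked on a concrete integrand. The sketch below assumes $L(f, f') = f\sqrt{1 + f'^2}$ (the integrand of the minimal surface of revolution problem, chosen here only as an example with no explicit $x$-dependence), for which $f(x) = \cosh x$ satisfies the Euler–Lagrange equation. The partial derivative $\partial L/\partial f'$ is estimated by a central difference, and the non-extremal curve $1 + x^2$ is included for contrast.

```python
import math

def lagrangian(y, yp):
    """L(y, y') = y * sqrt(1 + y'^2); it has no explicit dependence on x."""
    return y * math.sqrt(1.0 + yp * yp)

def beltrami_quantity(y, yp, step=1e-6):
    """L - y' * dL/dy', with the partial derivative taken by central difference."""
    dL_dyp = (lagrangian(y, yp + step) - lagrangian(y, yp - step)) / (2.0 * step)
    return lagrangian(y, yp) - yp * dL_dyp

def along_curve(f, f_prime, xs):
    """Evaluate the Beltrami quantity at each sample point of the curve y = f(x)."""
    return [beltrami_quantity(f(x), f_prime(x)) for x in xs]

xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
catenary = along_curve(math.cosh, math.sinh, xs)                      # extremal
parabola = along_curve(lambda x: 1.0 + x * x, lambda x: 2.0 * x, xs)  # not extremal

if __name__ == "__main__":
    print(catenary)  # constant along the extremal
    print(parabola)  # varies with x
```

For this integrand the quantity reduces to $f/\sqrt{1 + f'^2},$ which equals 1 along the catenary.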
Euler–Poisson equation
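If $S$ depends on higher derivatives of $y(x),$ that is, if
$$S = \int_a^b f\left(x, y(x), y'(x), \dots, y^{(n)}(x)\right) dx,$$
then $y$ must satisfy the Euler–Poisson equation,[10]
$$\frac{\partial f}{\partial y} - \frac{d}{dx}\left(\frac{\partial f}{\partial y'}\right) + \dots + (-1)^n \frac{d^n}{dx^n}\left[\frac{\partial f}{\partial y^{(n)}}\right] = 0.$$
Du Bois-Reymond's theorem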
The discussion thus far has assumed that extremal functions possess two continuous derivatives, although the existence of the integral $J$ requires only first derivatives of trial functions. The condition that the first variation vanishes at an extremal may be regarded as a weak form of the Euler–Lagrange equation. The theorem of Du Bois-Reymond asserts that this weak form implies the strong form. If $L$ has continuous first and second derivatives with respect to all of its arguments, and if
$$\frac{\partial^2 L}{\partial f'^2} \ne 0,$$
then $f$ has two continuous derivatives, and it satisfies the Euler–Lagrange equation.
Lavrentiev phenomenon
Hilbert was the first to give good conditions for the Euler–Lagrange equations to give a stationary solution. For a convex region and a positive, thrice differentiable Lagrangian, the solutions are composed of a countable collection of sections that either run along the boundary or satisfy the Euler–Lagrange equations in the interior.
However, Lavrentiev in 1926 showed that there are circumstances where there is no optimum solution but one can be approached arbitrarily closely by increasing numbers of sections. The Lavrentiev Phenomenon identifies a difference in the infimum of a minimization problem across different classes of admissible functions. For instance the following problem, presented by Manià in 1934:[11]
$$L[x] = \int_0^1 (x^3 - t)^2 x'^6 \,dt,$$
$$A = \{x \in W^{1,1}(0,1) : x(0) = 0,\ x(1) = 1\}.$$
Clearly, $x(t) = t^{\frac{1}{3}}$ minimizes the functional, but we find any function $x \in W^{1,\infty}$ gives a value bounded away from the infimum.
Examples (in one dimension) are traditionally manifested across $W^{1,1}$ and $W^{1,\infty},$ but Ball and Mizel[12] procured the first functional that displayed Lavrentiev's Phenomenon across $W^{1,p}$ and $W^{1,q}$ for $1 \leq p < q < \infty.$ There are several results that give criteria under which the phenomenon does not occur (for instance 'standard growth', a Lagrangian with no dependence on the second variable, or an approximating sequence satisfying Cesari's Condition (D)), but results are often particular, and applicable to a small class of functionals.
Connected with the Lavrentiev Phenomenon is the repulsion property: any functional displaying Lavrentiev's Phenomenon will display the weak repulsion property.[13]
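Manià's functional, the integral over $[0, 1]$ of $(x^3 - t)^2 x'^6,$ can be evaluated numerically for two admissible curves to make the gap concrete. This is a rough sketch: it compares the minimizer $x(t) = t^{1/3}$ with a single Lipschitz competitor, $x(t) = t,$ chosen here as an assumption. It illustrates the phenomenon but does not establish the lower bound over all of $W^{1,\infty}.$

```python
def mania_functional(x, x_prime, n=1000):
    """Midpoint-rule value of the integral over [0, 1] of (x(t)^3 - t)^2 * x'(t)^6 dt."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h  # midpoints avoid the singular derivative at t = 0
        total += (x(t) ** 3 - t) ** 2 * x_prime(t) ** 6
    return total * h

# The minimizer x(t) = t^(1/3): the first factor of the integrand vanishes identically.
cube_root_value = mania_functional(lambda t: t ** (1.0 / 3.0),
                                   lambda t: t ** (-2.0 / 3.0) / 3.0)

# A Lipschitz competitor x(t) = t with the same boundary values.
identity_value = mania_functional(lambda t: t, lambda t: 1.0)

if __name__ == "__main__":
    print(cube_root_value)  # zero up to rounding
    print(identity_value)   # about 8/105
```

For $x(t) = t$ the integral is $\int_0^1 t^2 (t^2 - 1)^2 \,dt = 8/105$ in closed form.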
Functions of several variables
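For example, if $\varphi(x, y)$ denotes the displacement of a membrane above the domain $D$ in the $x,y$ plane, then its potential energy is proportional to its surface area:
$$U[\varphi] = \iint_D \sqrt{1 + \nabla\varphi \cdot \nabla\varphi}\,dx\,dy.$$
Plateau's problem consists of finding a function that minimizes the surface area while assuming prescribed values on the boundary of $D$; the solutions are called minimal surfaces. The Euler–Lagrange equation for this problem is nonlinear:
$$\varphi_{xx}(1 + \varphi_y^2) + \varphi_{yy}(1 + \varphi_x^2) - 2\varphi_x \varphi_y \varphi_{xy} = 0.$$
See Courant (1950) for details.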
Dirichlet's principle
It is often sufficient to consider only small displacements of the membrane, whose energy difference from no displacement is approximated by
$$V[\varphi] = \frac{1}{2}\iint_D \nabla\varphi \cdot \nabla\varphi \,dx\,dy.$$
The functional $V$ is to be minimized among all trial functions $\varphi$ that assume prescribed values on the boundary of $D.$ If $u$ is the minimizing function and $v$ is an arbitrary smooth function that vanishes on the boundary of $D,$ then the first variation of $V[u + \varepsilon v]$ must vanish:
$$\left.\frac{d}{d\varepsilon} V[u + \varepsilon v]\right|_{\varepsilon = 0} = \iint_D \nabla u \cdot \nabla v \,dx\,dy = 0.$$
Provided that $u$ has two derivatives, we may apply the divergence theorem to obtain
$$\iint_D \nabla \cdot (v \nabla u)\,dx\,dy = \iint_D \left(\nabla u \cdot \nabla v + v \nabla \cdot \nabla u\right) dx\,dy = \int_C v \frac{\partial u}{\partial n}\,ds,$$
where $C$ is the boundary of $D,$ $s$ is arclength along $C,$ and $\partial u/\partial n$ is the normal derivative of $u$ on $C.$ Since $v$ vanishes on $C$ and the first variation vanishes, the result is
$$\iint_D v \nabla \cdot \nabla u \,dx\,dy = 0$$
for all smooth functions $v$ that vanish on the boundary of $D.$ The proof for the case of one dimensional integrals may be adapted to this case to show that
$$\nabla \cdot \nabla u = 0$$
in $D.$
The difficulty with this reasoning is the assumption that the minimizing function $u$ must have two derivatives. Riemann argued that the existence of a smooth minimizing function was assured by the connection with the physical problem: membranes do indeed assume configurations with minimal potential energy. Riemann named this idea the Dirichlet principle in honor of his teacher Peter Gustav Lejeune Dirichlet. However, Weierstrass gave an example of a variational problem with no solution: minimize
$$W[\varphi] = \int_{-1}^{1} (x \varphi')^2 \,dx$$
among all functions $\varphi$ that satisfy $\varphi(-1) = -1$ and $\varphi(1) = 1.$ $W$ can be made arbitrarily small by choosing piecewise linear functions that make a transition between −1 and 1 in a small neighborhood of the origin. However, there is no function that makes $W = 0.$ Eventually it was shown that Dirichlet's principle is valid, but it requires a sophisticated application of the regularity theory for elliptic partial differential equations; see Jost and Li–Jost (1998).
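Weierstrass's counterexample can be explored numerically. The sketch below evaluates $W[\varphi] = \int_{-1}^{1} (x\varphi')^2\,dx$ for the piecewise linear function that equals $-1$ for $x \le -\delta,$ rises linearly to $1$ on $[-\delta, \delta],$ and equals $1$ afterwards. The transition half-width $\delta$ and the function name are assumptions introduced for the illustration.

```python
def weierstrass_energy(delta, n=2000):
    """Midpoint-rule value of W = integral over [-1, 1] of (x * phi'(x))^2 dx
    for the piecewise linear phi that rises from -1 to 1 on [-delta, delta]."""
    h = 2.0 / n
    total = 0.0
    for i in range(n):
        x = -1.0 + (i + 0.5) * h
        # phi' is 1/delta inside the transition layer and 0 outside it.
        slope = 1.0 / delta if abs(x) < delta else 0.0
        total += (x * slope) ** 2
    return total * h

energies = [weierstrass_energy(d) for d in (0.4, 0.2, 0.1, 0.05)]

if __name__ == "__main__":
    print(energies)  # decreasing toward 0, but never reaching it
```

In closed form $W = 2\delta/3$ for these trial functions, so the infimum 0 is approached as $\delta \to 0$ while the limiting function is discontinuous and therefore not admissible.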
Generalization to other boundary value problems
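A more general expression for the potential energy of a membrane is
$$V[\varphi] = \iint_D \left[\frac{1}{2}\nabla\varphi \cdot \nabla\varphi + f(x,y)\varphi\right] dx\,dy + \int_C \left[\frac{1}{2}\sigma(s)\varphi^2 + g(s)\varphi\right] ds.$$
This corresponds to an external force density $f(x,y)$ in $D,$ an external force $g(s)$ on the boundary $C,$ and elastic forces with modulus $\sigma(s)$ acting on $C.$ The function that minimizes the potential energy with no restriction on its boundary values will be denoted by $u.$ Provided that $f$ and $g$ are continuous, regularity theory implies that the minimizing function $u$ will have two derivatives. In taking the first variation, no boundary condition need be imposed on the increment $v.$ The first variation of $V[u + \varepsilon v]$ is given by
$$\iint_D \left[\nabla u \cdot \nabla v + f v\right] dx\,dy + \int_C \left[\sigma u v + g v\right] ds = 0.$$
If we apply the divergence theorem, the result is
$$\iint_D \left[-v \nabla \cdot \nabla u + v f\right] dx\,dy + \int_C v \left[\frac{\partial u}{\partial n} + \sigma u + g\right] ds = 0.$$
If we first set $v = 0$ on $C,$ the boundary integral vanishes, and we conclude as before that
$$-\nabla \cdot \nabla u + f = 0$$
in $D.$ Then if we allow $v$ to assume arbitrary boundary values, this implies that $u$ must satisfy the boundary condition
$$\frac{\partial u}{\partial n} + \sigma u + g = 0$$
on $C.$ This boundary condition is a consequence of the minimizing property of $u$: it is not imposed beforehand. Such conditions are called natural boundary conditions.
The preceding reasoning is not valid if $\sigma$ vanishes identically on $C.$ In such a case, we could allow a trial function $\varphi \equiv c,$ where $c$ is a constant. For such a trial function,
$$V[c] = c\left[\iint_D f \,dx\,dy + \int_C g \,ds\right].$$
By appropriate choice of $c,$ $V$ can assume any value unless the quantity inside the brackets vanishes. Therefore, the variational problem is meaningless unless
$$\iint_D f \,dx\,dy + \int_C g \,ds = 0.$$
This condition implies that net external forces on the system are in equilibrium. If these forces are in equilibrium, then the variational problem has a solution, but it is not unique, since an arbitrary constant may be added. Further details and examples are in Courant and Hilbert (1953).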
Eigenvalue problems
Both one-dimensional and multi-dimensional eigenvalue problems can be formulated as variational problems.
Sturm–Liouville problems
See also: Sturm–Liouville theory. The Sturm–Liouville eigenvalue problem involves a general quadratic form
$$Q[y] = \int_{x_1}^{x_2} \left[p(x) y'(x)^2 + q(x) y(x)^2\right] dx,$$
where $y$ is restricted to functions that satisfy the boundary conditions
$$y(x_1) = 0, \quad y(x_2) = 0.$$
Let $R$ be a normalization integral
$$R[y] = \int_{x_1}^{x_2} r(x) y(x)^2 \,dx.$$
The functions $p(x)$ and $r(x)$ are required to be everywhere positive and bounded away from zero. The primary variational problem is to minimize the ratio $Q/R$ among all $y$ satisfying the endpoint conditions, which is equivalent to minimizing $Q[y]$ under the constraint that $R[y]$ is constant. It is shown below that the Euler–Lagrange equation for the minimizing $u$ is
$$-(p u')' + q u - \lambda r u = 0,$$
where $\lambda$ is the quotient
$$\lambda = \frac{Q[u]}{R[u]}.$$
It can be shown (see Gelfand and Fomin 1963) that the minimizing $u$ has two derivatives and satisfies the Euler–Lagrange equation. The associated $\lambda$ will be denoted by $\lambda_1$; it is the lowest eigenvalue for this equation and boundary conditions. The associated minimizing function will be denoted by $u_1(x).$ This variational characterization of eigenvalues leads to the Rayleigh–Ritz method: choose an approximating $u$ as a linear combination of basis functions (for example trigonometric functions) and carry out a finite-dimensional minimization among such linear combinations. This method is often surprisingly accurate.
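The quality of such approximations can be seen even with a single trial function. The sketch below assumes the simplest case $p = 1,$ $q = 0,$ $r = 1$ on $[0, 1]$ with $y(0) = y(1) = 0,$ for which the lowest eigenvalue is $\pi^2$ with eigenfunction $\sin(\pi x).$ It evaluates the ratio $Q/R$ for the polynomial trial function $x(1 - x)$; it is a one-term illustration of the variational bound, not a full Rayleigh–Ritz minimization over a basis.

```python
import math

def rayleigh_quotient(y, y_prime, n=2000):
    """Q[y]/R[y] on [0, 1] with p = 1, q = 0, r = 1, using the midpoint rule."""
    h = 1.0 / n
    q_sum = 0.0
    r_sum = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        q_sum += y_prime(x) ** 2  # integrand of Q: p * y'^2
        r_sum += y(x) ** 2        # integrand of R: r * y^2
    return q_sum / r_sum          # the common factor h cancels

# Trial function that vanishes at both endpoints.
polynomial = rayleigh_quotient(lambda x: x * (1.0 - x), lambda x: 1.0 - 2.0 * x)

# The exact lowest eigenfunction, for comparison.
sine = rayleigh_quotient(lambda x: math.sin(math.pi * x),
                         lambda x: math.pi * math.cos(math.pi * x))

if __name__ == "__main__":
    print(polynomial, sine, math.pi ** 2)
```

The polynomial gives a ratio close to 10, an upper bound within about 1.4% of $\pi^2 \approx 9.87.$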
The next smallest eigenvalue and eigenfunction can be obtained by minimizing $Q$ under the additional constraint
$$\int_{x_1}^{x_2} r(x) u_1(x) y(x) \,dx = 0.$$
This procedure can be extended to obtain the complete sequence of eigenvalues and eigenfunctions for the problem.
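The variational problem also applies to more general boundary conditions. Instead of requiring that $y$ vanish at the endpoints, we may not impose any condition at the endpoints, and set
$$Q[y] = \int_{x_1}^{x_2} \left[p(x) y'(x)^2 + q(x) y(x)^2\right] dx + a_1 y(x_1)^2 + a_2 y(x_2)^2,$$
where $a_1$ and $a_2$ are arbitrary. If we set $y = u + \varepsilon v,$ the first variation for the ratio $Q/R$ is
$$V_1 = \frac{2}{R[u]}\left(\int_{x_1}^{x_2} \left[p(x) u'(x) v'(x) + q(x) u(x) v(x) - \lambda r(x) u(x) v(x)\right] dx + a_1 u(x_1) v(x_1) + a_2 u(x_2) v(x_2)\right),$$
where λ is given by the ratio $Q[u]/R[u]$ as previously. After integration by parts,
$$\frac{R[u]}{2} V_1 = \int_{x_1}^{x_2} v(x)\left[-(p u')' + q u - \lambda r u\right] dx + v(x_1)\left[-p(x_1) u'(x_1) + a_1 u(x_1)\right] + v(x_2)\left[p(x_2) u'(x_2) + a_2 u(x_2)\right].$$
If we first require that $v$ vanish at the endpoints, the first variation will vanish for all such $v$ only if
$$-(p u')' + q u - \lambda r u = 0 \quad \text{for} \quad x_1 < x < x_2.$$
If $u$ satisfies this condition, then the first variation will vanish for arbitrary $v$ only if
$$-p(x_1) u'(x_1) + a_1 u(x_1) = 0 \quad \text{and} \quad p(x_2) u'(x_2) + a_2 u(x_2) = 0.$$
These latter conditions are the natural boundary conditions for this problem, since they are not imposed on trial functions for the minimization, but are instead a consequence of the minimization.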
Eigenvalue problems in several dimensions
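Eigenvalue problems in higher dimensions are defined in analogy with the one-dimensional case. For example, given a domain $D$ with boundary $B$ in three dimensions we may define
$$Q[\varphi] = \iiint_D \left[p(X) \nabla\varphi \cdot \nabla\varphi + q(X) \varphi^2\right] dx\,dy\,dz + \iint_B \sigma(S) \varphi^2 \,dS,$$
and
$$R[\varphi] = \iiint_D r(X) \varphi(X)^2 \,dx\,dy\,dz.$$
Let $u$ be the function that minimizes the quotient $Q[\varphi]/R[\varphi],$ with no condition prescribed on the boundary $B.$ The Euler–Lagrange equation satisfied by $u$ is
$$-\nabla \cdot (p(X) \nabla u) + q(X) u - \lambda r(X) u = 0,$$
where
$$\lambda = \frac{Q[u]}{R[u]}.$$
The minimizing $u$ must also satisfy the natural boundary condition
$$p(S)\frac{\partial u}{\partial n} + \sigma(S) u = 0$$
on the boundary $B.$ This result depends upon the regularity theory for elliptic partial differential equations; see Jost and Li–Jost (1998) for details. Many extensions, including completeness results, asymptotic properties of the eigenvalues and results concerning the nodes of the eigenfunctions are in Courant and Hilbert (1953).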
Applications
Optics
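Fermat's principle states that light takes a path that (locally) minimizes the optical length between its endpoints. If the $x$-coordinate is chosen as the parameter along the path, and $y = f(x)$ along the path, then the optical length is given by
$$A[f] = \int_{x_0}^{x_1} n(x, f(x)) \sqrt{1 + f'(x)^2}\,dx,$$
where the refractive index $n(x,y)$ depends upon the material. If we try $f(x) = f_0(x) + \varepsilon f_1(x),$ then the first variation of $A$ (the derivative of $A$ with respect to ε) is
$$\delta A[f_0, f_1] = \int_{x_0}^{x_1} \left[\frac{n(x, f_0) f_0'(x) f_1'(x)}{\sqrt{1 + f_0'(x)^2}} + n_y(x, f_0) f_1 \sqrt{1 + f_0'(x)^2}\right] dx.$$
After integration by parts of the first term within brackets, we obtain the Euler–Lagrange equation
$$-\frac{d}{dx}\left[\frac{n(x, f_0) f_0'}{\sqrt{1 + f_0'^2}}\right] + n_y(x, f_0) \sqrt{1 + f_0'(x)^2} = 0.$$
The light rays may be determined by integrating this equation. This formalism is used in the context of Lagrangian optics and Hamiltonian optics.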
Snell's law
There is a discontinuity of the refractive index when light enters or leaves a lens. Let
$$n(x,y) = \begin{cases} n_{(-)} & \text{if} \quad x < 0, \\ n_{(+)} & \text{if} \quad x > 0, \end{cases}$$
where $n_{(-)}$ and $n_{(+)}$ are constants. Then the Euler–Lagrange equation holds as before in the region where $x < 0$ or $x > 0,$ and in fact the path is a straight line there, since the refractive index is constant. At $x = 0,$ $f$ must be continuous, but $f'$ may be discontinuous. After integration by parts in the separate regions and using the Euler–Lagrange equations, the first variation takes the form
$$\delta A[f_0, f_1] = f_1(0)\left[n_{(-)}\frac{f_0'(0^-)}{\sqrt{1 + f_0'(0^-)^2}} - n_{(+)}\frac{f_0'(0^+)}{\sqrt{1 + f_0'(0^+)^2}}\right].$$
The factor multiplying $n_{(-)}$ is the sine of the angle of the incident ray with the $x$ axis, and the factor multiplying $n_{(+)}$ is the sine of the angle of the refracted ray with the $x$ axis. Snell's law for refraction requires that these terms be equal. As this calculation demonstrates, Snell's law is equivalent to vanishing of the first variation of the optical path length.
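This equivalence can be checked by direct minimization. In the sketch below a ray travels from a point with $x < 0$ to a point with $x > 0$ along two straight segments that meet on the interface $x = 0$; the crossing height is found by a ternary search on the optical length, and the two Snell terms are then compared. The refractive indices, the endpoints, and the choice of ternary search (valid here because the optical length is a convex function of the crossing height) are assumptions made for the example.

```python
import math

def optical_length(y, n_minus, n_plus, start, end):
    """Optical length of the two-segment path start -> (0, y) -> end."""
    (x0, y0), (x1, y1) = start, end
    return (n_minus * math.hypot(0.0 - x0, y - y0)
            + n_plus * math.hypot(x1 - 0.0, y1 - y))

def best_crossing(n_minus, n_plus, start, end, iterations=200):
    """Ternary search for the crossing height that minimizes the optical length."""
    lo, hi = min(start[1], end[1]), max(start[1], end[1])
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if (optical_length(m1, n_minus, n_plus, start, end)
                < optical_length(m2, n_minus, n_plus, start, end)):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

def snell_terms(y, n_minus, n_plus, start, end):
    """Return (n_minus * sin(theta_minus), n_plus * sin(theta_plus)),
    where both angles are measured from the x axis."""
    (x0, y0), (x1, y1) = start, end
    sin_minus = (y - y0) / math.hypot(0.0 - x0, y - y0)
    sin_plus = (y1 - y) / math.hypot(x1 - 0.0, y1 - y)
    return n_minus * sin_minus, n_plus * sin_plus

START, END = (-1.0, 0.0), (1.0, 1.0)
N_MINUS, N_PLUS = 1.0, 1.5
y_star = best_crossing(N_MINUS, N_PLUS, START, END)
left, right = snell_terms(y_star, N_MINUS, N_PLUS, START, END)

if __name__ == "__main__":
    print(y_star, left, right)  # the two Snell terms agree at the minimizer
```

Because the search compares floating-point path lengths near a flat minimum, the two terms agree only to roughly seven digits, not to machine precision.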
Fermat's principle in three dimensions
It is expedient to use vector notation: let $X = (x_1, x_2, x_3),$ let $t$ be a parameter, let $X(t)$ be the parametric representation of a curve $C,$ and let $\dot X(t)$ be its tangent vector. The optical length of the curve is given by
$$A[C] = \int_{t_0}^{t_1} n(X) \sqrt{\dot X \cdot \dot X}\,dt.$$
Note that this integral is invariant with respect to changes in the parametric representation of $C.$ The Euler–Lagrange equations for a minimizing curve have the symmetric form
$$\frac{d}{dt} P = \sqrt{\dot X \cdot \dot X}\,\nabla n,$$
where
$$P = \frac{n(X) \dot X}{\sqrt{\dot X \cdot \dot X}}.$$
It follows from the definition that $P$ satisfies
$$P \cdot P = n(X)^2.$$
Therefore, the integral may also be written as
$$A[C] = \int_{t_0}^{t_1} P \cdot \dot X \,dt.$$
This form suggests that if we can find a function $\psi$ whose gradient is given by $P,$ then the integral $A$ is given by the difference of $\psi$ at the endpoints of the interval of integration. Thus the problem of studying the curves that make the integral stationary can be related to the study of the level surfaces of $\psi.$ In order to find such a function, we turn to the wave equation, which governs the propagation of light. This formalism is used in the context of Lagrangian optics and Hamiltonian optics.
Connection with the wave equation
The wave equation for an inhomogeneous medium is
$$u_{tt} = c^2 \nabla \cdot \nabla u,$$
where $c$ is the velocity, which generally depends upon $X.$ Wave fronts for light are characteristic surfaces for this partial differential equation: they satisfy
$$\varphi_t^2 = c(X)^2 \,\nabla\varphi \cdot \nabla\varphi.$$
We may look for solutions in the form
$$\varphi(t, X) = t - \psi(X).$$
In that case, $\psi$ satisfies
$$\nabla\psi \cdot \nabla\psi = n^2,$$
where $n = 1/c.$ According to the theory of first-order partial differential equations, if $P = \nabla\psi,$ then $P$ satisfies
$$\frac{dP}{ds} = n \,\nabla n$$
along a system of curves (the light rays) that are given by
$$\frac{dX}{ds} = P.$$
These equations for solution of a first-order partial differential equation are identical to the Euler–Lagrange equations if we make the identification
$$\frac{ds}{dt} = \frac{\sqrt{\dot X \cdot \dot X}}{n}.$$
We conclude that the function $\psi$ is the value of the minimizing integral $A$ as a function of the upper end point. That is, when a family of minimizing curves is constructed, the values of the optical length satisfy the characteristic equation corresponding to the wave equation. Hence, solving the associated partial differential equation of first order is equivalent to finding families of solutions of the variational problem. This is the essential content of the Hamilton–Jacobi theory, which applies to more general variational problems.
Mechanics
See main article: Action (physics). In classical mechanics, the action, $S,$ is defined as the time integral of the Lagrangian, $L.$ The Lagrangian is the difference of energies,
$$L = T - U,$$
where $T$ is the kinetic energy of a mechanical system and $U$ its potential energy. Hamilton's principle (or the action principle) states that the motion of a conservative holonomic (integrable constraints) mechanical system is such that the action integral
$$S = \int_{t_0}^{t_1} L(x, \dot x, t)\,dt$$
is stationary with respect to variations in the path $x(t).$ The Euler–Lagrange equations for this system are known as Lagrange's equations:
$$\frac{d}{dt}\frac{\partial L}{\partial \dot x} = \frac{\partial L}{\partial x},$$
and they are equivalent to Newton's equations of motion (for such systems).
The conjugate momenta $p$ are defined by
$$p = \frac{\partial L}{\partial \dot x}.$$
For example, if
$$T = \frac{1}{2} m \dot x^2,$$
then
$$p = m \dot x.$$
Hamiltonian mechanics results if the conjugate momenta are introduced in place of $\dot x$ by a Legendre transformation of the Lagrangian $L$ into the Hamiltonian $H$ defined by
$$H(x, p, t) = p\,\dot x - L(x, \dot x, t).$$
The Hamiltonian is the total energy of the system: $H = T + U.$ Analogy with Fermat's principle suggests that solutions of Lagrange's equations (the particle trajectories) may be described in terms of level surfaces of some function of $x.$ This function is a solution of the Hamilton–Jacobi equation:
$$\frac{\partial \psi}{\partial t} + H\left(x, \frac{\partial \psi}{\partial x}, t\right) = 0.$$
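The Legendre transformation from Lagrangian to Hamiltonian can be illustrated on a concrete system. The sketch below assumes a mass on a spring with $T = \tfrac{1}{2} m \dot x^2$ and $U = \tfrac{1}{2} k x^2$; the values of $m$ and $k$ are arbitrary. The conjugate momentum $\partial L/\partial \dot x$ is estimated by a central difference, and for simplicity the Hamiltonian is evaluated as a function of position and velocity, not of position and momentum.

```python
def lagrangian(x, v, m=2.0, k=3.0):
    """L = T - U for a mass on a spring: T = m v^2 / 2, U = k x^2 / 2."""
    return 0.5 * m * v * v - 0.5 * k * x * x

def conjugate_momentum(x, v, step=1e-5):
    """p = dL/dv, estimated by a central difference in the velocity."""
    return (lagrangian(x, v + step) - lagrangian(x, v - step)) / (2.0 * step)

def hamiltonian(x, v):
    """H = p * v - L, the Legendre transform of L with respect to v."""
    return conjugate_momentum(x, v) * v - lagrangian(x, v)

if __name__ == "__main__":
    x, v = 0.3, 1.2
    print(conjugate_momentum(x, v))  # m * v
    print(hamiltonian(x, v))         # T + U
```

With $m = 2,$ $k = 3,$ $x = 0.3,$ and $\dot x = 1.2,$ the momentum is $2.4$ and the Hamiltonian is $T + U = 1.44 + 0.135 = 1.575.$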
Variations and sufficient condition for a minimum
Calculus of variations is concerned with variations of functionals, which are small changes in the functional's value due to small changes in the function that is its argument. The first variation is defined as the linear part of the change in the functional, and the second variation is defined as the quadratic part.
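For example, if $J[y]$ is a functional with the function $y = y(x)$ as its argument, and there is a small change in its argument from $y$ to $y + h,$ where $h = h(x)$ is a function in the same function space as $y,$ then the corresponding change in the functional is
$$\Delta J[h] = J[y + h] - J[y].$$
The functional $J[y]$ is said to be differentiable if
$$\Delta J[h] = \varphi[h] + \varepsilon \|h\|,$$
where $\varphi[h]$ is a linear functional, $\|h\|$ is the norm of $h,$ and $\varepsilon \to 0$ as $\|h\| \to 0.$ The linear functional $\varphi[h]$ is the first variation of $J[y]$ and is denoted by
$$\delta J[h] = \varphi[h].$$
The functional $J[y]$ is said to be twice differentiable if
$$\Delta J[h] = \varphi_1[h] + \varphi_2[h] + \varepsilon \|h\|^2,$$
where $\varphi_1[h]$ is a linear functional (the first variation), $\varphi_2[h]$ is a quadratic functional, and $\varepsilon \to 0$ as $\|h\| \to 0.$ The quadratic functional $\varphi_2[h]$ is the second variation of $J[y]$ and is denoted by
$$\delta^2 J[h] = \varphi_2[h].$$
The second variation $\delta^2 J[h]$ is said to be strongly positive if
$$\delta^2 J[h] \ge k \|h\|^2$$
for all $h$ and for some constant $k > 0.$
Using the above definitions, especially the definitions of first variation, second variation, and strongly positive, the following sufficient condition for a minimum of a functional can be stated.
Sufficient condition for a minimum: the functional $J[y]$ has a minimum at $y = \hat{y}$ if its first variation $\delta J[h] = 0$ at $y = \hat{y}$ and its second variation $\delta^2 J[h]$ is strongly positive at $y = \hat{y}.$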
Further reading
- Benesova, B. and Kruzik, M.: "Weak Lower Semicontinuity of Integral Functionals and Applications". SIAM Review 59(4) (2017), 703–766.
- Bolza, O.: Lectures on the Calculus of Variations. Chelsea Publishing Company, 1904, available on Digital Mathematics library. 2nd edition republished in 1961, paperback in 2005.
- Cassel, Kevin W.: Variational Methods with Applications in Science and Engineering, Cambridge University Press, 2013.
- Clegg, J.C.: Calculus of Variations, Interscience Publishers Inc., 1968.
- Courant, R.: Dirichlet's principle, conformal mapping and minimal surfaces. Interscience, 1950.
- Dacorogna, Bernard: "Introduction", in Introduction to the Calculus of Variations, 3rd edition. World Scientific Publishing, 2014.
- Elsgolc, L.E.: Calculus of Variations, Pergamon Press Ltd., 1962.
- Forsyth, A.R.: Calculus of Variations, Dover, 1960.
- Fox, Charles: An Introduction to the Calculus of Variations, Dover Publ., 1987.
- Giaquinta, Mariano; Hildebrandt, Stefan: Calculus of Variations I and II, Springer-Verlag.
- Jost, J. and X. Li-Jost: Calculus of Variations. Cambridge University Press, 1998.
- Lebedev, L.P. and Cloud, M.J.: The Calculus of Variations and Functional Analysis with Optimal Control and Applications in Mechanics, World Scientific, 2003, pages 1–98.
- Logan, J. David: Applied Mathematics, 3rd edition. Wiley-Interscience, 2006.
- Pike, Ralph W.: "Chapter 8: Calculus of Variations", in Optimization for Engineering Systems. Louisiana State University. http://www.mpri.lsu.edu/textbook/Chapter8-b.htm. Archived at https://web.archive.org/web/20070705141725/http://www.mpri.lsu.edu/bookindex.html on 2007-07-05.
- Roubicek, T.: "Calculus of variations". Chap. 17 in: Mathematical Tools for Physicists. (Ed. M. Grinfeld) J. Wiley, Weinheim, 2014, pp. 551–588.
- Sagan, Hans: Introduction to the Calculus of Variations, Dover, 1992.
- Weinstock, Robert: Calculus of Variations with Applications to Physics and Engineering, Dover, 1974 (reprint of 1952 ed.).
Notes and References
- Gelfand, I. M.; Fomin, S. V. (2000). Silverman, Richard A. (ed.). Calculus of Variations (Unabridged repr.). Mineola, New York: Dover Publications. p. 3. ISBN 978-0486414485.
- Thiele, Rüdiger (2007). "Euler and the Calculus of Variations". In Bradley, Robert E.; Sandifer, C. Edward (eds.). Leonhard Euler: Life, Work and Legacy. Elsevier. p. 249. ISBN 9780080471297. https://books.google.com/books?id=75vJL_Y-PvsC&pg=PA249
- Goldstine, Herman H. (2012). A History of the Calculus of Variations from the 17th through the 19th Century. Springer Science & Business Media. p. 110. ISBN 9781461381068.
- van Brunt, Bruce (2004). The Calculus of Variations. Springer. ISBN 978-0-387-40247-5.
- Ferguson, James (2004). "Brief Survey of the History of the Calculus of Variations and its Applications". arXiv:math/0402357.
- Dimitri Bertsekas.
- Bellman, Richard E. (1954). "Dynamic Programming and a new formalism in the calculus of variations". Proc. Natl. Acad. Sci. 40 (4): 231–235. doi:10.1073/pnas.40.4.231. Bibcode 1954PNAS...40..231B. PMC 527981. PMID 16589462.
- "Richard E. Bellman Control Heritage Award". American Automatic Control Council. 2004. Archived at https://web.archive.org/web/20181001032837/http://a2c2.org/awards/richard-e-bellman-control-heritage-award on 2018-10-01. Retrieved 2013-07-28.
- Weisstein, Eric W. "Euler–Lagrange Differential Equation". mathworld.wolfram.com. Wolfram. Eq. (5).
- Kot, Mark (2014). "Chapter 4: Basic Generalizations". A First Course in the Calculus of Variations. American Mathematical Society. ISBN 978-1-4704-1495-5.
- Manià, Bernard (1934). "Sopra un esempio di Lavrentieff". Bollettino dell'Unione Matematica Italiana. 13: 147–153.
- Ball & Mizel (1985). "One-dimensional Variational problems whose Minimizers do not satisfy the Euler-Lagrange equation". Archive for Rational Mechanics and Analysis. 90 (4): 325–388. doi:10.1007/BF00276295. Bibcode 1985ArRMA..90..325B. S2CID 55005550.
- Ferriero, Alessandro (2007). "The Weak Repulsion property". Journal de Mathématiques Pures et Appliquées. 88 (4): 378–388. doi:10.1016/j.matpur.2007.06.002.