Covariant derivative explained

In mathematics, the covariant derivative is a way of specifying a derivative along tangent vectors of a manifold. Alternatively, the covariant derivative is a way of introducing and working with a connection on a manifold by means of a differential operator, to be contrasted with the approach given by a principal connection on the frame bundle – see affine connection. In the special case of a manifold isometrically embedded into a higher-dimensional Euclidean space, the covariant derivative can be viewed as the orthogonal projection of the Euclidean directional derivative onto the manifold's tangent space. In this case the Euclidean derivative is broken into two parts, the extrinsic normal component (dependent on the embedding) and the intrinsic covariant derivative component.

The name is motivated by the importance of changes of coordinates in physics: the covariant derivative transforms covariantly under a general coordinate transformation, that is, linearly via the Jacobian matrix of the transformation.^[1]

This article presents an introduction to the covariant derivative of a vector field with respect to a vector field, both in a coordinate-free language and using a local coordinate system and the traditional index notation. The covariant derivative of a tensor field is presented as an extension of the same concept. The covariant derivative generalizes straightforwardly to a notion of differentiation associated to a connection on a vector bundle, also known as a Koszul connection.

History

Historically, at the turn of the 20th century, the covariant derivative was introduced by Gregorio Ricci-Curbastro and Tullio Levi-Civita in the theory of Riemannian and pseudo-Riemannian geometry.^[2] Ricci and Levi-Civita (following ideas of Elwin Bruno Christoffel) observed that the Christoffel symbols used to define the curvature could also provide a notion of differentiation which generalized the classical directional derivative of vector fields on a manifold.^[3] ^[4] This new derivative – the Levi-Civita connection – was covariant in the sense that it satisfied Riemann's requirement that objects in geometry should be independent of their description in a particular coordinate system.

It was soon noted by other mathematicians, prominent among these being Hermann Weyl, Jan Arnoldus Schouten, and Élie Cartan,^[5] that a covariant derivative could be defined abstractly without the presence of a metric. The crucial feature was not a particular dependence on the metric, but that the Christoffel symbols satisfied a certain precise second-order transformation law. This transformation law could serve as a starting point for defining the derivative in a covariant manner. Thus the theory of covariant differentiation forked off from the strictly Riemannian context to include a wider range of possible geometries.

In the 1940s, practitioners of differential geometry began introducing other notions of covariant differentiation in general vector bundles which were, in contrast to the classical bundles of interest to geometers, not part of the tensor analysis of the manifold. By and large, these generalized covariant derivatives had to be specified ad hoc by some version of the connection concept. In 1950, Jean-Louis Koszul unified these new ideas of covariant differentiation in a vector bundle by means of what is known today as a Koszul connection or a connection on a vector bundle.^[6] Using ideas from Lie algebra cohomology, Koszul successfully converted many of the analytic features of covariant differentiation into algebraic ones. In particular, Koszul connections eliminated the need for awkward manipulations of Christoffel symbols (and other analogous non-tensorial objects) in differential geometry. Thus they quickly supplanted the classical notion of covariant derivative in many post-1950 treatments of the subject.

Motivation

The covariant derivative is a generalization of the directional derivative from vector calculus. As with the directional derivative, the covariant derivative is a rule, $\nabla_u v$, which takes as its inputs: (1) a vector, $u$, defined at a point $P$, and (2) a vector field $v$ defined in a neighborhood of $P$.^[7] The output is the vector $\nabla_u v(P)$, also at the point $P$. The primary difference from the usual directional derivative is that $\nabla_u v$ must, in a certain precise sense, be independent of the manner in which it is expressed in a coordinate system.

A vector may be described as a list of numbers in terms of a basis, but as a geometrical object the vector retains its identity regardless of how it is described. For a geometric vector written in components with respect to one basis, when the basis is changed the components transform according to a change of basis formula, with the coordinates undergoing a covariant transformation. The covariant derivative is required to transform, under a change in coordinates, by a covariant transformation in the same way as a basis does (hence the name).

In the case of Euclidean space, one usually defines the directional derivative of a vector field in terms of the difference between two vectors at two nearby points. In such a system one translates one of the vectors to the origin of the other, keeping it parallel, then takes their difference within the same vector space. With a Cartesian (fixed orthonormal) coordinate system, "keeping it parallel" amounts to keeping the components constant. This ordinary directional derivative on Euclidean space is the first example of a covariant derivative.

Next, one must take into account changes of the coordinate system. For example, if the Euclidean plane is described by polar coordinates, "keeping it parallel" does not amount to keeping the polar components constant under translation, since the coordinate grid itself "rotates". Thus, the same covariant derivative written in polar coordinates contains extra terms that describe how the coordinate grid itself rotates, or how in more general coordinates the grid expands, contracts, twists, interweaves, etc.

Consider the example of a particle moving along a curve $\gamma(t)$ in the Euclidean plane. In polar coordinates, $\gamma$ may be written in terms of its radial and angular coordinates by $\gamma(t) = (r(t), \theta(t))$. A vector at a particular time $t$^[8] (for instance, a constant acceleration of the particle) is expressed in terms of $(e_r, e_\theta)$, where $e_r$ and $e_\theta$ are unit tangent vectors for the polar coordinates, serving as a basis to decompose a vector in terms of radial and tangential components. At a slightly later time, the new basis in polar coordinates appears slightly rotated with respect to the first set. The covariant derivative of the basis vectors (the Christoffel symbols) serves to express this change.
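The rotation of the polar basis can be made explicit with a short worked computation. This is only a sketch: the symbols $\mathbf a$, $a^r$ and $a^\theta$ are introduced here for illustration, and $\theta(t)$ denotes the polar angle of the particle along its curve. In Cartesian components the unit basis vectors are $e_r = (\cos\theta, \sin\theta)$ and $e_\theta = (-\sin\theta, \cos\theta)$, so along the curve

$$\frac{d e_r}{dt} = \dot\theta\, e_\theta, \qquad \frac{d e_\theta}{dt} = -\dot\theta\, e_r .$$

For a vector $\mathbf a = a^r e_r + a^\theta e_\theta$ carried along the curve, the product rule then gives

$$\frac{d\mathbf a}{dt} = \left(\dot a^r - \dot\theta\, a^\theta\right) e_r + \left(\dot a^\theta + \dot\theta\, a^r\right) e_\theta .$$

The terms proportional to $\dot\theta$ are the extra contributions that come from the rotation of the basis. In particular, for a constant vector ($d\mathbf a/dt = 0$) the polar components are not constant: they must satisfy $\dot a^r = \dot\theta\, a^\theta$ and $\dot a^\theta = -\dot\theta\, a^r$.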

In a curved space, such as the surface of the Earth (regarded as a sphere), the translation of tangent vectors between different points is not well defined, and its analog, parallel transport, depends on the path along which the vector is translated. Consider a vector at a point Q on the equator of a globe, directed to the north. Suppose we transport the vector (keeping it parallel) first along the equator to the point P, then drag it along a meridian to the north pole N, and finally transport it along another meridian back to Q. Then we notice that the vector parallel-transported along a closed circuit does not return as the same vector; instead, it has another orientation. This would not happen in Euclidean space and is caused by the curvature of the surface of the globe. The same effect occurs if we drag the vector around an infinitesimally small closed loop, successively along two directions and then back. This infinitesimal change of the vector is a measure of the curvature, and can be defined in terms of the covariant derivative.
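The globe example can be checked numerically. The following Python sketch is illustrative only: it assumes a unit sphere embedded in three-dimensional Euclidean space, places P a quarter of the way around the equator from Q (a choice made here, not in the text above), and approximates parallel transport by repeatedly projecting the vector onto the tangent plane at the next point of the path and restoring its length. The helper names `great_circle` and `transport` are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def great_circle(a, b, steps):
    """Sample the great-circle arc from unit vector a to unit vector b."""
    omega = math.acos(max(-1.0, min(1.0, dot(a, b))))  # angle between a and b
    points = []
    for k in range(steps + 1):
        t = k / steps
        wa = math.sin((1 - t) * omega) / math.sin(omega)
        wb = math.sin(t * omega) / math.sin(omega)
        points.append(tuple(wa * x + wb * y for x, y in zip(a, b)))
    return points

def transport(v, path):
    """Approximate parallel transport of v along path by repeated projection."""
    for p in path[1:]:
        length = math.sqrt(dot(v, v))
        # Remove the component normal to the sphere at p (p is the unit normal).
        w = tuple(x - dot(v, p) * y for x, y in zip(v, p))
        # Rescale so that the transported vector keeps its length.
        scale = length / math.sqrt(dot(w, w))
        v = tuple(scale * x for x in w)
    return v

Q = (1.0, 0.0, 0.0)  # point on the equator
P = (0.0, 1.0, 0.0)  # second point on the equator (assumed 90 degrees from Q)
N = (0.0, 0.0, 1.0)  # north pole

start = (0.0, 0.0, 1.0)  # tangent vector at Q pointing north
v = start
for a, b in ((Q, P), (P, N), (N, Q)):
    v = transport(v, great_circle(a, b, 1000))

cos_angle = dot(start, v) / math.sqrt(dot(start, start) * dot(v, v))
angle = math.acos(max(-1.0, min(1.0, cos_angle)))
print(math.degrees(angle))  # approximately 90 for this particular circuit
```

With these choices the vector returns to Q rotated by about a quarter turn, pointing along the equator instead of north. Choosing P closer to Q should give a proportionally smaller rotation.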

Remarks

The definition of the covariant derivative does not use the metric in space. However, for each metric there is a unique torsion-free covariant derivative called the Levi-Civita connection such that the covariant derivative of the metric is zero.
The properties of a derivative imply that $\nabla_v u$ depends on the values of $u$ on an arbitrarily small neighborhood of a point $p$ in the same way as, for example, the derivative of a scalar function $f$ along a curve at a given point $p$ depends on the values of $f$ in an arbitrarily small neighborhood of $p$.

The information on the neighborhood of a point in the covariant derivative can be used to define parallel transport of a vector. Also the curvature, torsion, and geodesics may be defined only in terms of the covariant derivative or other related variation on the idea of a linear connection.

Informal definition using an embedding into Euclidean space

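Suppose an open subset $U$ of a $d$-dimensional Riemannian manifold $M$ is embedded into Euclidean space $(\mathbb{R}^n, \langle \cdot, \cdot \rangle)$ via a twice continuously-differentiable ($C^2$) mapping $\vec\Psi : \mathbb{R}^d \supset U \to \mathbb{R}^n$ such that the tangent space at $\vec\Psi(p)$ is spanned by the vectors

$$\left\{ \left. \frac{\partial\vec\Psi}{\partial x^i} \right|_p : i \in \{1, \dots, d\} \right\}$$

and the scalar product $\left\langle \cdot, \cdot \right\rangle$ on $\mathbb{R}^n$ is compatible with the metric on $M$:

$$g_{ij} = \left\langle \frac{\partial\vec\Psi}{\partial x^i}, \frac{\partial\vec\Psi}{\partial x^j} \right\rangle.$$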

(Since the manifold metric is always assumed to be regular, the compatibility condition implies linear independence of the partial derivative tangent vectors.)

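For a tangent vector field $\vec V = v^j \frac{\partial\vec\Psi}{\partial x^j}$, one has

$$\frac{\partial\vec V}{\partial x^i} = \frac{\partial}{\partial x^i}\left(v^j \frac{\partial\vec\Psi}{\partial x^j}\right) = \frac{\partial v^j}{\partial x^i} \frac{\partial\vec\Psi}{\partial x^j} + v^j \frac{\partial^2\vec\Psi}{\partial x^i \, \partial x^j}.$$

The last term is not tangential to $M$, but can be expressed as a linear combination of the tangent space base vectors, using the Christoffel symbols as linear factors, plus a vector orthogonal to the tangent space:

$$v^j \frac{\partial^2\vec\Psi}{\partial x^i \, \partial x^j} = v^j \Gamma^k{}_{ij} \frac{\partial\vec\Psi}{\partial x^k} + \vec n.$$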

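In the case of the Levi-Civita connection, the covariant derivative $\nabla_{\mathbf e_i} \vec V$, also written $\nabla_i \vec V$, is defined as the orthogonal projection of the usual derivative onto tangent space:

$$\nabla_{\mathbf e_i} \vec V := \frac{\partial\vec V}{\partial x^i} - \vec n = \left(\frac{\partial v^k}{\partial x^i} + v^j \Gamma^k{}_{ij}\right) \frac{\partial\vec\Psi}{\partial x^k}.$$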
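As a worked illustration of this projection, consider the unit sphere in $\mathbb{R}^3$. This example is a sketch under assumptions that are not part of the text above: the coordinates are the spherical angles $(\theta, \varphi)$, and the embedding is $\vec\Psi(\theta, \varphi) = (\sin\theta\cos\varphi, \sin\theta\sin\varphi, \cos\theta)$, valid away from the poles where $\sin\theta \neq 0$. Here $\vec\Psi$ itself is the unit normal to the surface, and the second derivatives split into tangential and normal parts as

$$\frac{\partial^2\vec\Psi}{\partial\theta^2} = -\vec\Psi, \qquad \frac{\partial^2\vec\Psi}{\partial\theta\,\partial\varphi} = \cot\theta\, \frac{\partial\vec\Psi}{\partial\varphi}, \qquad \frac{\partial^2\vec\Psi}{\partial\varphi^2} = -\sin\theta\cos\theta\, \frac{\partial\vec\Psi}{\partial\theta} - \sin^2\theta\, \vec\Psi .$$

Discarding the normal parts and reading off the coefficients of the tangent vectors gives the only non-zero Christoffel symbols,

$$\Gamma^\theta{}_{\varphi\varphi} = -\sin\theta\cos\theta, \qquad \Gamma^\varphi{}_{\theta\varphi} = \Gamma^\varphi{}_{\varphi\theta} = \cot\theta .$$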

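From here it may be computationally convenient to obtain a relation between the Christoffel symbols for the Levi-Civita connection and the metric. To do this we first note that, since the vector $\vec n$ in the previous equation is orthogonal to the tangent space,

$$\left\langle \frac{\partial^2\vec\Psi}{\partial x^i \, \partial x^j}, \frac{\partial\vec\Psi}{\partial x^l} \right\rangle = \left\langle \Gamma^k{}_{ij} \frac{\partial\vec\Psi}{\partial x^k} + \vec n, \frac{\partial\vec\Psi}{\partial x^l} \right\rangle = \left\langle \frac{\partial\vec\Psi}{\partial x^k}, \frac{\partial\vec\Psi}{\partial x^l} \right\rangle \Gamma^k{}_{ij} = g_{kl} \, \Gamma^k{}_{ij}.$$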

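Then, since the partial derivative of a component $g_{ab}$ of the metric with respect to a coordinate $x^c$ is

$$\frac{\partial g_{ab}}{\partial x^c} = \frac{\partial}{\partial x^c} \left\langle \frac{\partial\vec\Psi}{\partial x^a}, \frac{\partial\vec\Psi}{\partial x^b} \right\rangle = \left\langle \frac{\partial^2\vec\Psi}{\partial x^c \, \partial x^a}, \frac{\partial\vec\Psi}{\partial x^b} \right\rangle + \left\langle \frac{\partial\vec\Psi}{\partial x^a}, \frac{\partial^2\vec\Psi}{\partial x^c \, \partial x^b} \right\rangle,$$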

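any triplet $i, j, k$ of indices yields a system of equations

$$\left\{ \begin{aligned} \frac{\partial g_{jk}}{\partial x^i} &= \left\langle \frac{\partial^2\vec\Psi}{\partial x^i \, \partial x^j}, \frac{\partial\vec\Psi}{\partial x^k} \right\rangle + \left\langle \frac{\partial\vec\Psi}{\partial x^j}, \frac{\partial^2\vec\Psi}{\partial x^i \, \partial x^k} \right\rangle \\ \frac{\partial g_{ki}}{\partial x^j} &= \left\langle \frac{\partial^2\vec\Psi}{\partial x^j \, \partial x^k}, \frac{\partial\vec\Psi}{\partial x^i} \right\rangle + \left\langle \frac{\partial\vec\Psi}{\partial x^k}, \frac{\partial^2\vec\Psi}{\partial x^j \, \partial x^i} \right\rangle \\ \frac{\partial g_{ij}}{\partial x^k} &= \left\langle \frac{\partial^2\vec\Psi}{\partial x^k \, \partial x^i}, \frac{\partial\vec\Psi}{\partial x^j} \right\rangle + \left\langle \frac{\partial\vec\Psi}{\partial x^i}, \frac{\partial^2\vec\Psi}{\partial x^k \, \partial x^j} \right\rangle \end{aligned} \right.$$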
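The product-rule identity for the metric components can be solved for the Christoffel symbols. The following is a sketch of the standard argument rather than part of the text above. It assumes that the mixed partial derivatives of $\vec\Psi$ commute, so that $\Gamma^k{}_{ij} = \Gamma^k{}_{ji}$, and it writes $g^{ml}$ for the inverse of the metric matrix, a symbol introduced here for illustration. Because the normal vector drops out of the scalar product with any tangent vector, $\left\langle \frac{\partial^2\vec\Psi}{\partial x^i \, \partial x^j}, \frac{\partial\vec\Psi}{\partial x^l} \right\rangle = g_{kl}\,\Gamma^k{}_{ij}$, and the identity becomes

$$\frac{\partial g_{ab}}{\partial x^c} = g_{kb}\,\Gamma^k{}_{ca} + g_{ka}\,\Gamma^k{}_{cb}.$$

Adding the cases $(a, b, c) = (j, l, i)$ and $(l, i, j)$ and subtracting the case $(i, j, l)$, all terms cancel in pairs except two equal ones:

$$\frac{\partial g_{jl}}{\partial x^i} + \frac{\partial g_{li}}{\partial x^j} - \frac{\partial g_{ij}}{\partial x^l} = 2\, g_{kl}\,\Gamma^k{}_{ij}.$$

Contracting with the inverse metric then gives

$$\Gamma^m{}_{ij} = \frac{1}{2}\, g^{ml} \left( \frac{\partial g_{jl}}{\partial x^i} + \frac{\partial g_{li}}{\partial x^j} - \frac{\partial g_{ij}}{\partial x^l} \right).$$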

Notes and References

1. Einstein, Albert (1922). "The General Theory of Relativity". The Meaning of Relativity.
2. Levi-Civita, T.; Ricci, G. (1901). "Méthodes de calcul différentiel absolu et leurs applications". Mathematische Annalen. 54 (1–2): 125–201. doi:10.1007/bf01454201. S2CID 120009332.
3. Riemann, G. F. B. (1866). "Über die Hypothesen, welche der Geometrie zu Grunde liegen". Gesammelte Mathematische Werke. Reprint, ed. Weber, H. (1953), New York: Dover.
4. Christoffel, E. B. (1869). "Über die Transformation der homogenen Differentialausdrücke zweiten Grades". Vol. 70, pp. 46–70.
5. cf. with Cartan, É. (1923). "Sur les variétés à connexion affine et la théorie de la relativité généralisée". Annales Scientifiques de l'École Normale Supérieure. 40: 325–412. doi:10.24033/asens.751.
6. Koszul, J. L. (1950). "Homologie et cohomologie des algèbres de Lie". Bulletin de la Société Mathématique de France. 78: 65–127. doi:10.24033/bsmf.1410.
7. The covariant derivative is also denoted variously by $\partial_v u$, $D_v u$, or other notations.
8. In many applications, it may be better not to think of $t$ as corresponding to time, at least for applications in general relativity. It is simply regarded as an abstract parameter varying smoothly and monotonically along the path.