Rayleigh quotient
The Rayleigh quotient[1] for a given complex Hermitian matrix $M$ and nonzero vector $x$ is defined as:[2][3]
$$R(M,x) = \frac{x^{*} M x}{x^{*} x}.$$
For real matrices and vectors, the condition of being Hermitian reduces to that of being symmetric, and the conjugate transpose $x^{*}$ to the usual transpose $x^{\mathsf T}$. Note that $R(M, cx) = R(M,x)$ for any non-zero scalar $c$. Recall that a Hermitian (or real symmetric) matrix is diagonalizable with only real eigenvalues. It can be shown that, for a given matrix, the Rayleigh quotient reaches its minimum value $\lambda_{\min}$ (the smallest eigenvalue of $M$) when $x$ is $v_{\min}$ (the corresponding eigenvector).[4] Similarly, $R(M,x) \le \lambda_{\max}$ and $R(M, v_{\max}) = \lambda_{\max}$.
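The definition and the scale invariance can be checked numerically. The following is a minimal sketch in plain Python; the function name `rayleigh_quotient`, the list-of-rows matrix layout, and the 2×2 example matrix are illustrative assumptions, not part of the article.

```python
def rayleigh_quotient(M, x):
    """Return (x* M x) / (x* x) for a square matrix M (list of rows) and nonzero x."""
    n = len(x)
    Mx = [sum(M[i][j] * x[j] for j in range(n)) for i in range(n)]
    num = sum(xi.conjugate() * yi for xi, yi in zip(x, Mx))
    den = sum(xi.conjugate() * xi for xi in x)
    return num / den


# Hypothetical real symmetric example with eigenvalues 1 and 3;
# (1, -1) and (1, 1) are the corresponding eigenvectors.
M = [[2.0, 1.0], [1.0, 2.0]]

r_min = rayleigh_quotient(M, [1.0, -1.0])      # at the eigenvector for eigenvalue 1
r_max = rayleigh_quotient(M, [1.0, 1.0])       # at the eigenvector for eigenvalue 3
r_mid = rayleigh_quotient(M, [3.0, 1.0])       # a generic vector
r_scaled = rayleigh_quotient(M, [-7.5, -2.5])  # the same vector scaled by c = -2.5
```

In this example `r_min` and `r_max` coincide with the two eigenvalues, while `r_mid` and `r_scaled` agree with each other and lie between them.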
The Rayleigh quotient is used in the min-max theorem to get exact values of all eigenvalues. It is also used in eigenvalue algorithms (such as Rayleigh quotient iteration) to obtain an eigenvalue approximation from an eigenvector approximation.
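As a rough illustration of how an eigenvector approximation yields an eigenvalue approximation, here is a sketch of Rayleigh quotient iteration restricted to real symmetric 2×2 matrices, so that the shifted linear system can be solved with Cramer's rule. The function name, the tolerance, the iteration cap, and the example matrix are assumptions made for this sketch.

```python
import math


def rayleigh_quotient_iteration(M, x, tol=1e-12, max_iter=20):
    """Sketch of Rayleigh quotient iteration for a real symmetric 2x2 matrix M."""
    (a, b), (c, d) = M
    norm = math.hypot(x[0], x[1])
    x = [x[0] / norm, x[1] / norm]
    mu = None
    for _ in range(max_iter):
        Mx = [a * x[0] + b * x[1], c * x[0] + d * x[1]]
        mu = x[0] * Mx[0] + x[1] * Mx[1]  # Rayleigh quotient of the unit vector x
        residual = math.hypot(Mx[0] - mu * x[0], Mx[1] - mu * x[1])
        det = (a - mu) * (d - mu) - b * c  # determinant of the shifted matrix M - mu*I
        if residual < tol or det == 0.0:
            break
        # Solve (M - mu*I) y = x by Cramer's rule, then renormalize.
        y0 = ((d - mu) * x[0] - b * x[1]) / det
        y1 = (-c * x[0] + (a - mu) * x[1]) / det
        norm = math.hypot(y0, y1)
        x = [y0 / norm, y1 / norm]
    return mu, x


M = [[2.0, 1.0], [1.0, 3.0]]  # hypothetical example; eigenvalues are (5 ± sqrt(5)) / 2
mu, v = rayleigh_quotient_iteration(M, [1.0, 1.0])
```

For larger matrices the Cramer's-rule step would be replaced by a general linear solver; the structure of the loop stays the same.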
The range of the Rayleigh quotient (for any matrix, not necessarily Hermitian) is called the numerical range and contains its spectrum. When the matrix is Hermitian, the numerical radius is equal to the spectral norm. Still in functional analysis, the largest eigenvalue in absolute value, $\max_i |\lambda_i|$, is known as the spectral radius. In the context of $C^{*}$-algebras or algebraic quantum mechanics, the function that to $M$ associates the Rayleigh–Ritz quotient $R(M,x)$ for a fixed $x$ and $M$ varying through the algebra would be referred to as a vector state of the algebra.
In quantum mechanics, the Rayleigh quotient gives the expectation value of the observable corresponding to the operator $M$ for a system whose state is given by $x$.
If we fix the complex matrix $M$, then the resulting Rayleigh quotient map (considered as a function of $x$) completely determines $M$ via the polarization identity; indeed, this remains true even if we allow $M$ to be non-Hermitian. However, if we restrict the field of scalars to the real numbers, then the Rayleigh quotient only determines the symmetric part of $M$.
Bounds for Hermitian M
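As stated in the introduction, for any nonzero vector $x$, one has $R(M,x) \in \left[\lambda_{\min}, \lambda_{\max}\right]$, where $\lambda_{\min}, \lambda_{\max}$ are respectively the smallest and largest eigenvalues of $M$. This is immediate after observing that the Rayleigh quotient is a weighted average of eigenvalues of $M$:
$$R(M,x) = \frac{x^{*} M x}{x^{*} x} = \frac{\sum_{i=1}^{n} \lambda_i |y_i|^2}{\sum_{i=1}^{n} |y_i|^2},$$
where $(\lambda_i, v_i)$ is the $i$-th eigenpair after orthonormalization and $y_i = v_i^{*} x$ is the $i$-th coordinate of $x$ in the eigenbasis. It is then easy to verify that the bounds are attained at the corresponding eigenvectors $v_{\min}, v_{\max}$.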
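The weighted-average form can be compared against the direct definition on a small case. This sketch assumes a real symmetric 2×2 matrix whose orthonormal eigenpairs are known in closed form; the matrix, the test vector, and all variable names are illustrative.

```python
import math

M = [[2.0, 1.0], [1.0, 2.0]]
s = 1.0 / math.sqrt(2.0)
# Orthonormal eigenpairs (lambda_i, v_i) of M, written down by hand for this example.
eigenpairs = [(3.0, [s, s]), (1.0, [s, -s])]
x = [3.0, 1.0]

# Direct evaluation of x^T M x / x^T x.
Mx = [sum(M[i][j] * x[j] for j in range(2)) for i in range(2)]
direct = sum(xi * mi for xi, mi in zip(x, Mx)) / sum(xi * xi for xi in x)

# Coordinates y_i = v_i^T x in the eigenbasis, then the weighted average of eigenvalues.
y = [sum(vi * xi for vi, xi in zip(v, x)) for _, v in eigenpairs]
weights = [yi * yi for yi in y]
weighted = sum(lam * w for (lam, _), w in zip(eigenpairs, weights)) / sum(weights)
```

Both `direct` and `weighted` evaluate the same quantity, and because the weights are non-negative the result cannot leave the interval between the smallest and largest eigenvalue.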
The fact that the quotient is a weighted average of the eigenvalues can be used to identify the second, the third, ... largest eigenvalues. Let $\lambda_{\max} = \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_n = \lambda_{\min}$ be the eigenvalues in decreasing order. If $x$ is constrained to be orthogonal to $v_1$, in which case $y_1 = v_1^{*} x = 0$, then $R(M,x)$ has maximum value $\lambda_2$, which is achieved when $x = v_2$.
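The effect of the orthogonality constraint can be observed by sampling. The sketch below assumes a hypothetical 3×3 symmetric matrix with eigenvalues 3, 1 and 0.5 whose eigenvectors are known by inspection, and draws random vectors orthogonal to the leading eigenvector; the seed and sample count are arbitrary.

```python
import random


def rayleigh_quotient(M, x):
    """Real-valued Rayleigh quotient x^T M x / x^T x."""
    n = len(x)
    Mx = [sum(M[i][j] * x[j] for j in range(n)) for i in range(n)]
    return sum(xi * yi for xi, yi in zip(x, Mx)) / sum(xi * xi for xi in x)


# Eigenpairs by inspection: 3 -> (1, 1, 0), 1 -> (1, -1, 0), 0.5 -> (0, 0, 1).
M = [[2.0, 1.0, 0.0], [1.0, 2.0, 0.0], [0.0, 0.0, 0.5]]
v2 = [1.0, -1.0, 0.0]

random.seed(0)
values = []
for _ in range(1000):
    a, b = random.uniform(-1.0, 1.0), random.uniform(-1.0, 1.0)
    if a == 0.0 and b == 0.0:
        continue  # skip the zero vector, where the quotient is undefined
    x = [a, -a, b]  # a*v2 + b*v3, hence orthogonal to v1 = (1, 1, 0)
    values.append(rayleigh_quotient(M, x))

largest_sampled = max(values)
at_v2 = rayleigh_quotient(M, v2)
```

None of the sampled values exceeds the second eigenvalue, and that value is reached at `v2`.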
Special case of covariance matrices
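An empirical covariance matrix $M$ can be represented as the product $A^{\mathsf T} A$ of the data matrix $A$ pre-multiplied by its transpose $A^{\mathsf T}$. Being a positive semi-definite matrix, $M$ has non-negative eigenvalues, and orthogonal (or orthogonalisable) eigenvectors, which can be demonstrated as follows.
Firstly, that the eigenvalues $\lambda_i$ are non-negative:
$$\begin{aligned}
& M v_i = A^{\mathsf T} A v_i = \lambda_i v_i \\
\Rightarrow{} & v_i^{\mathsf T} A^{\mathsf T} A v_i = \lambda_i v_i^{\mathsf T} v_i \\
\Rightarrow{} & \left\| A v_i \right\|^2 = \lambda_i \left\| v_i \right\|^2 \\
\Rightarrow{} & \lambda_i = \frac{\left\| A v_i \right\|^2}{\left\| v_i \right\|^2} \ge 0.
\end{aligned}$$
Secondly, that the eigenvectors $v_i$ are orthogonal to one another:
$$\begin{aligned}
& M v_i = \lambda_i v_i \\
\Rightarrow{} & v_j^{\mathsf T} M v_i = \lambda_i v_j^{\mathsf T} v_i \\
\Rightarrow{} & \left( M v_j \right)^{\mathsf T} v_i = \lambda_i v_j^{\mathsf T} v_i \\
\Rightarrow{} & \lambda_j v_j^{\mathsf T} v_i = \lambda_i v_j^{\mathsf T} v_i \\
\Rightarrow{} & \left( \lambda_j - \lambda_i \right) v_j^{\mathsf T} v_i = 0 \\
\Rightarrow{} & v_j^{\mathsf T} v_i = 0
\end{aligned}$$
if the eigenvalues are different – in the case of multiplicity, the basis can be orthogonalized.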
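The non-negativity argument rests on the identity $x^{\mathsf T} A^{\mathsf T} A x = \|Ax\|^2$, which can be checked on a toy case. The data matrix below is a made-up example (three observations of two variables, not centered), and the variable names are assumptions of this sketch.

```python
# Hypothetical data matrix A: 3 observations (rows) of 2 variables (columns).
A = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
rows, cols = len(A), len(A[0])

# M = A^T A, formed entry by entry.
M = [[sum(A[k][i] * A[k][j] for k in range(rows)) for j in range(cols)]
     for i in range(cols)]

x = [1.0, -1.0]

# Left-hand side: the quadratic form x^T M x.
Mx = [sum(M[i][j] * x[j] for j in range(cols)) for i in range(cols)]
quad_form = sum(xi * mi for xi, mi in zip(x, Mx))

# Right-hand side: the squared norm of A x, which cannot be negative.
Ax = [sum(A[k][j] * x[j] for j in range(cols)) for k in range(rows)]
norm_sq = sum(v * v for v in Ax)

quotient = quad_form / sum(xi * xi for xi in x)
```

Since the numerator is a squared norm, the quotient is non-negative for every nonzero `x`, and in particular at each eigenvector.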
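To now establish that the Rayleigh quotient is maximized by the eigenvector with the largest eigenvalue, consider decomposing an arbitrary vector $x$ on the basis of the eigenvectors $v_i$:
$$x = \sum_{i=1}^{n} \alpha_i v_i,$$
where $\alpha_i = \frac{x^{\mathsf T} v_i}{v_i^{\mathsf T} v_i}$ is the coordinate of $x$ orthogonally projected onto $v_i$. Therefore, we have:
$$R(M,x) = \frac{x^{\mathsf T} A^{\mathsf T} A x}{x^{\mathsf T} x} = \frac{\left( \sum_{j=1}^{n} \alpha_j v_j \right)^{\mathsf T} \left( A^{\mathsf T} A \right) \left( \sum_{i=1}^{n} \alpha_i v_i \right)}{\left( \sum_{j=1}^{n} \alpha_j v_j \right)^{\mathsf T} \left( \sum_{i=1}^{n} \alpha_i v_i \right)} = \frac{\left( \sum_{j=1}^{n} \alpha_j v_j \right)^{\mathsf T} \left( \sum_{i=1}^{n} \alpha_i \lambda_i v_i \right)}{\left( \sum_{j=1}^{n} \alpha_j v_j \right)^{\mathsf T} \left( \sum_{i=1}^{n} \alpha_i v_i \right)},$$
which, by orthonormality of the eigenvectors, becomes:
$$R(M,x) = \frac{\sum_{i=1}^{n} \alpha_i^2 \lambda_i}{\sum_{i=1}^{n} \alpha_i^2} = \sum_{i=1}^{n} \lambda_i \frac{\left( x^{\mathsf T} v_i \right)^2}{\left( x^{\mathsf T} x \right) \left( v_i^{\mathsf T} v_i \right)}.$$
The last representation establishes that the Rayleigh quotient is the sum of the squared cosines of the angles formed by the vector $x$ and each eigenvector $v_i$, weighted by corresponding eigenvalues.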
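If a vector $x$ maximizes $R(M,x)$, then any non-zero scalar multiple $kx$ also maximizes $R$, so the problem can be reduced to the Lagrange problem of maximizing $\sum_{i=1}^{n} \alpha_i^2 \lambda_i$ under the constraint that $\sum_{i=1}^{n} \alpha_i^2 = 1$.
Define: $\beta_i = \alpha_i^2$. This then becomes a linear program in the $\beta_i$ (with $\beta_i \ge 0$ and $\sum_{i=1}^{n} \beta_i = 1$), which always attains its maximum at one of the corners of the domain. A maximum point will have $\alpha_1 = \pm 1$ and $\alpha_i = 0$ for all $i > 1$ (when the eigenvalues are ordered by decreasing magnitude).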
Thus, the Rayleigh quotient is maximized by the eigenvector with the largest eigenvalue.
Formulation using Lagrange multipliers
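Alternatively, this result can be arrived at by the method of Lagrange multipliers. The first part is to show that the quotient is constant under scaling $x \to cx$, where $c$ is a non-zero scalar:
$$R(M, cx) = \frac{(cx)^{*} M (cx)}{(cx)^{*} (cx)} = \frac{c^{*} c}{c^{*} c} \, \frac{x^{*} M x}{x^{*} x} = R(M,x).$$
Because of this invariance, it is sufficient to study the special case $\|x\|^2 = x^{\mathsf T} x = 1$. The problem is then to find the critical points of the function
$$R(M,x) = x^{\mathsf T} M x,$$
subject to the constraint $\|x\|^2 = x^{\mathsf T} x = 1$. In other words, it is to find the critical points of
$$\mathcal{L}(x) = x^{\mathsf T} M x - \lambda \left( x^{\mathsf T} x - 1 \right),$$
where $\lambda$ is a Lagrange multiplier. The stationary points of $\mathcal{L}(x)$ occur at
$$\begin{aligned}
& \frac{d\mathcal{L}(x)}{dx} = 0 \\
\Rightarrow{} & 2 x^{\mathsf T} M - 2 \lambda x^{\mathsf T} = 0 \\
\Rightarrow{} & 2 M x - 2 \lambda x = 0 \quad \text{(taking the transpose of both sides and using the symmetry of } M \text{)} \\
\Rightarrow{} & M x = \lambda x
\end{aligned}$$
and
$$R(M,x) = \frac{x^{\mathsf T} M x}{x^{\mathsf T} x} = \lambda \frac{x^{\mathsf T} x}{x^{\mathsf T} x} = \lambda.$$
Therefore, the eigenvectors $x_1, \ldots, x_n$ of $M$ are the critical points of the Rayleigh quotient and their corresponding eigenvalues $\lambda_1, \ldots, \lambda_n$ are the stationary values of $\mathcal{L}$. This property is the basis for principal components analysis and canonical correlation.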
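The critical-point property can be probed numerically: a finite-difference gradient of the Rayleigh quotient should vanish at an eigenvector and generally not elsewhere. The sketch below uses central differences on a hypothetical 2×2 symmetric matrix; the step size `h` and the function names are assumptions.

```python
def rayleigh_quotient(M, x):
    """Real-valued Rayleigh quotient x^T M x / x^T x."""
    n = len(x)
    Mx = [sum(M[i][j] * x[j] for j in range(n)) for i in range(n)]
    return sum(xi * yi for xi, yi in zip(x, Mx)) / sum(xi * xi for xi in x)


def numerical_gradient(M, x, h=1e-6):
    """Central-difference approximation of the gradient of R(M, .) at x."""
    grad = []
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((rayleigh_quotient(M, x_plus) - rayleigh_quotient(M, x_minus)) / (2 * h))
    return grad


M = [[2.0, 1.0], [1.0, 2.0]]  # eigenvectors (1, 1) and (1, -1)

grad_at_eigenvector = numerical_gradient(M, [1.0, 1.0])  # close to (0, 0)
grad_elsewhere = numerical_gradient(M, [1.0, 0.0])       # not an eigenvector
```

For this matrix the quotient is $2 + 2 x_1 x_2 / (x_1^2 + x_2^2)$, so the gradient at $(1, 0)$ is $(0, 2)$, which the finite-difference estimate reproduces up to discretization error.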
Use in Sturm–Liouville theory
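Sturm–Liouville theory concerns the action of the linear operator
$$L(y) = \frac{1}{w(x)} \left( -\frac{d}{dx} \left[ p(x) \frac{dy}{dx} \right] + q(x) y \right)$$
on the inner product space defined by
$$\langle y_1, y_2 \rangle = \int_a^b w(x) \, y_1(x) \, y_2(x) \, dx$$
of functions satisfying some specified boundary conditions at $a$ and $b$. In this case the Rayleigh quotient is
$$\frac{\langle y, L y \rangle}{\langle y, y \rangle} = \frac{\int_a^b y(x) \left( -\frac{d}{dx} \left[ p(x) \frac{dy}{dx} \right] + q(x) y(x) \right) dx}{\int_a^b w(x) \, y(x)^2 \, dx}.$$
This is sometimes presented in an equivalent form, obtained by separating the integral in the numerator and using integration by parts:
$$\frac{\langle y, L y \rangle}{\langle y, y \rangle} = \frac{-\left[ y(x) \, p(x) \, y'(x) \right]_a^b + \int_a^b p(x) \, y'(x)^2 \, dx + \int_a^b q(x) \, y(x)^2 \, dx}{\int_a^b w(x) \, y(x)^2 \, dx}.$$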
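As a worked illustration of the integrated-by-parts form, consider the simplest case $p = w = 1$, $q = 0$ on $[0, \pi]$ with $y(0) = y(\pi) = 0$, that is $-y'' = \lambda y$, whose smallest eigenvalue is 1. The sketch below evaluates the quotient for the trial function $y(x) = x(\pi - x)$ with composite Simpson's rule; the choice of problem, trial function, and helper name `simpson` are all assumptions of this example.

```python
import math


def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3


a, b = 0.0, math.pi
p = lambda x: 1.0
q = lambda x: 0.0
w = lambda x: 1.0
y = lambda x: x * (math.pi - x)   # trial function, vanishes at both endpoints
dy = lambda x: math.pi - 2.0 * x  # its derivative

# Boundary term -[y p y']_a^b; it vanishes here because y(a) = y(b) = 0.
boundary = -(y(b) * p(b) * dy(b) - y(a) * p(a) * dy(a))
numerator = boundary + simpson(lambda x: p(x) * dy(x) ** 2 + q(x) * y(x) ** 2, a, b)
denominator = simpson(lambda x: w(x) * y(x) ** 2, a, b)
quotient = numerator / denominator
```

Evaluating the two integrals by hand gives $\pi^3/3$ and $\pi^5/30$, so the quotient is $10/\pi^2 \approx 1.013$, slightly above the smallest eigenvalue, as expected for a trial function that is not an exact eigenfunction.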
Generalizations
- For a given pair $(A, B)$ of matrices, and a given non-zero vector $x$, the generalized Rayleigh quotient is defined as: $$R(A,B;x) := \frac{x^{*} A x}{x^{*} B x}.$$ The generalized Rayleigh quotient can be reduced to the Rayleigh quotient $R(D, C^{*} x)$ through the transformation $D = C^{-1} A \left( C^{*} \right)^{-1}$, where $C C^{*}$ is the Cholesky decomposition of the Hermitian positive-definite matrix $B$.
- For a given pair $(x, y)$ of non-zero vectors, and a given Hermitian matrix $H$, the generalized Rayleigh quotient can be defined as: $$R(H;x,y) := \frac{y^{*} H x}{\sqrt{y^{*} y \cdot x^{*} x}},$$ which coincides with $R(H,x)$ when $x = y$. In quantum mechanics, this quantity is called a "matrix element" or sometimes a "transition amplitude".
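The Cholesky reduction in the first item can be verified on a small real example. The sketch below assumes 2×2 matrices so that the Cholesky factor and its inverse can be written in closed form; the matrices `A` and `B`, the vector `x`, and the helper names are illustrative.

```python
import math


def matvec(P, v):
    return [sum(P[i][j] * v[j] for j in range(len(v))) for i in range(len(P))]


def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q))) for j in range(len(Q[0]))]
            for i in range(len(P))]


def transpose(P):
    return [list(col) for col in zip(*P)]


def quad(P, v):
    """Quadratic form v^T P v."""
    return sum(vi * wi for vi, wi in zip(v, matvec(P, v)))


A = [[2.0, 1.0], [1.0, 2.0]]  # symmetric
B = [[4.0, 2.0], [2.0, 3.0]]  # symmetric positive-definite

# Closed-form 2x2 Cholesky factor C (lower triangular) with B = C C^T.
c11 = math.sqrt(B[0][0])
c21 = B[1][0] / c11
c22 = math.sqrt(B[1][1] - c21 ** 2)
C = [[c11, 0.0], [c21, c22]]
C_inv = [[1.0 / c11, 0.0], [-c21 / (c11 * c22), 1.0 / c22]]

D = matmul(matmul(C_inv, A), transpose(C_inv))  # D = C^{-1} A C^{-T}

x = [1.0, 2.0]
generalized = quad(A, x) / quad(B, x)           # R(A, B; x)
z = matvec(transpose(C), x)                     # z = C^T x
reduced = quad(D, z) / sum(zi * zi for zi in z)  # R(D, C^T x)
```

The two values agree because $z^{\mathsf T} D z = x^{\mathsf T} A x$ and $z^{\mathsf T} z = x^{\mathsf T} B x$ when $z = C^{\mathsf T} x$.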
References
- [1] Also known as the Rayleigh–Ritz ratio; named after Walther Ritz and Lord Rayleigh.
- [2] Horn, R. A.; Johnson, C. R. (1985). Matrix Analysis. Cambridge University Press. pp. 176–180. ISBN 0-521-30586-1.
- [3] Parlett, B. N. (1998). The Symmetric Eigenvalue Problem. Classics in Applied Mathematics. SIAM. ISBN 0-89871-402-8.
- [4] Costin, Rodica D. (2013). Midterm notes. Mathematics 5102 Linear Mathematics in Infinite Dimensions, lecture notes. The Ohio State University.