Characteristic polynomial explained

In linear algebra, the characteristic polynomial of a square matrix is a polynomial which is invariant under matrix similarity and has the eigenvalues as roots. It has the determinant and the trace of the matrix among its coefficients. The characteristic polynomial of an endomorphism of a finite-dimensional vector space is the characteristic polynomial of the matrix of that endomorphism over any base (that is, the characteristic polynomial does not depend on the choice of a basis). The characteristic equation, also known as the determinantal equation,^[1] ^[2] ^[3] is the equation obtained by equating the characteristic polynomial to zero.

In spectral graph theory, the characteristic polynomial of a graph is the characteristic polynomial of its adjacency matrix.^[4]

Motivation

In linear algebra, eigenvalues and eigenvectors play a fundamental role, since, given a linear transformation, an eigenvector is a vector whose direction is not changed by the transformation, and the corresponding eigenvalue is the measure of the resulting change of magnitude of the vector.

More precisely, if the transformation is represented by a square matrix

an eigenvector

and the corresponding eigenvalue

must satisfy the equation

A \mathbf = \lambda \mathbf,

or, equivalently,

(\lambda I - A) \mathbf =\mathbf 0

where

is the identity matrix, and

v\ne0

(although the zero vector satisfies this equation for every

λ,

it is not considered an eigenvector).

It follows that the matrix

(λI-A)

must be singular, and its determinant

\det(\lambda I - A) = 0

must be zero.

In other words, the eigenvalues of are the roots of $\det(xI - A),$ which is a monic polynomial in of degree if is a matrix. This polynomial is the characteristic polynomial of .

Formal definition

Consider an

n x n

matrix

The characteristic polynomial of

denoted by

p_A(t),

is the polynomial defined by^[5]

p_A(t) = \det (t I - A)

where

denotes the

n x n

identity matrix.

Some authors define the characteristic polynomial to be

\det(A-tI).

That polynomial differs from the one defined here by a sign

(-1)^n,

so it makes no difference for properties like having as roots the eigenvalues of

; however the definition above always gives a monic polynomial, whereas the alternative definition is monic only when

is even.

Examples

To compute the characteristic polynomial of the matrix $A = \begin2 & 1\\-1& 0\end.$ the determinant of the following is computed: $t I-A = \begint-2&-1\\1&t-0\end$ and found to be

(t-2)t-1(-1)=t^2-2t+1,

the characteristic polynomial of

Another example uses hyperbolic functions of a hyperbolic angle φ.For the matrix take $A = \begin \cosh(\varphi) & \sinh(\varphi)\\ \sinh(\varphi)& \cosh(\varphi) \end.$ Its characteristic polynomial is $\det (tI - A) = (t - \cosh(\varphi))^2 - \sinh^2(\varphi) = t^2 - 2 t \ \cosh(\varphi) + 1 = (t - e^\varphi) (t - e^).$

Properties

The characteristic polynomial

p_A(t)

of a

n x n

matrix is monic (its leading coefficient is

) and its degree is

The most important fact about the characteristic polynomial was already mentioned in the motivational paragraph: the eigenvalues of

are precisely the roots of

p_A(t)

(this also holds for the minimal polynomial of

but its degree may be less than

). All coefficients of the characteristic polynomial are polynomial expressions in the entries of the matrix. In particular its constant coefficient of

t⁰

\det(-A)=(-1)ⁿ\det(A),

the coefficient of

tⁿ

is one, and the coefficient of

t^n-1

is, where is the trace of

(The signs given here correspond to the formal definition given in the previous section; for the alternative definition these would instead be

\det(A)

and respectively.^[6])

For a

2 x 2

matrix

the characteristic polynomial is thus given by

t^2 - \operatorname(A) t + \det(A).

Using the language of exterior algebra, the characteristic polynomial of an

n x n

matrix

may be expressed as

p_A (t) = \sum_^n t^ (-1)^k \operatorname\left(\textstyle\bigwedge^k A\right)

where

\operatorname\left(\bigwedge^k A\right)

is the trace of the

th exterior power of

which has dimension

\binom .

This trace may be computed as the sum of all principal minors of

of size

The recursive Faddeev–LeVerrier algorithm computes these coefficients more efficiently.

When the characteristic of the field of the coefficients is

each such trace may alternatively be computed as a single determinant, that of the

k x k

matrix,

\operatorname\left(\textstyle\bigwedge^k A\right) = \frac \begin \operatornameA & k-1 &0&\cdots &0 \\\operatornameA^2 &\operatornameA& k-2 &\cdots &0 \\ \vdots & \vdots & & \ddots & \vdots \\\operatornameA^ &\operatornameA^& & \cdots & 1 \\ \operatornameA^k &\operatornameA^& & \cdots & \operatornameA\end ~.

The Cayley–Hamilton theorem states that replacing

in the characteristic polynomial (interpreting the resulting powers as matrix powers, and the constant term

times the identity matrix) yields the zero matrix. Informally speaking, every matrix satisfies its own characteristic equation. This statement is equivalent to saying that the minimal polynomial of

divides the characteristic polynomial of

Two similar matrices have the same characteristic polynomial. The converse however is not true in general: two matrices with the same characteristic polynomial need not be similar.

The matrix

and its transpose have the same characteristic polynomial.

is similar to a triangular matrix if and only if its characteristic polynomial can be completely factored into linear factors over

(the same is true with the minimal polynomial instead of the characteristic polynomial). In this case

is similar to a matrix in Jordan normal form.

Characteristic polynomial of a product of two matrices

and

are two square

n x n

matrices then characteristic polynomials of

and

coincide:

p_(t)=p_(t).\,

When

is non-singular this result follows from the fact that

and

are similar:

BA = A^ (AB) A.

For the case where both

and

are singular, the desired identity is an equality between polynomials in

and the coefficients of the matrices. Thus, to prove this equality, it suffices to prove that it is verified on a non-empty open subset (for the usual topology, or, more generally, for the Zariski topology) of the space of all the coefficients. As the non-singular matrices form such an open subset of the space of all matrices, this proves the result.

More generally, if

is a matrix of order

m x n

and

is a matrix of order

n x m,

then

m x m

and

n x n

matrix, and one has

p_(t) = t^ p_(t).\,

To prove this, one may suppose

n>m,

by exchanging, if needed,

and

Then, by bordering

on the bottom by

n-m

rows of zeros, and

on the right, by,

n-m

columns of zeros, one gets two

n x n

matrices

A^\prime

and

B^\prime

such that

B^\primeA^\prime=BA

and

A^\primeB^\prime

is equal to

bordered by

n-m

rows and columns of zeros. The result follows from the case of square matrices, by comparing the characteristic polynomials of

A^\primeB^\prime

and

AB.

Characteristic polynomial of A^k

is an eigenvalue of a square matrix

with eigenvector

then

λ^k

is an eigenvalue of

A^k

because

A^k \textbf = A^ A \textbf = \lambda A^ \textbf = \dots = \lambda^k \textbf.

The multiplicities can be shown to agree as well, and this generalizes to any polynomial in place of

x^k

:^[7] That is, the algebraic multiplicity of

f(A)

equals the sum of algebraic multiplicities of

λ'

over

λ'

such that

f(λ')=λ.

In particular,

\operatorname{tr}(f(A))=

	n
style\sum
	i=1

f(λ_i)

and

\operatorname{det}(f(A))=

	n
style\prod
	i=1

f(λ_i).

Here a polynomial

f(t)=t^3+1,

for example, is evaluated on a matrix

simply as

f(A)=A^3+I.

The theorem applies to matrices and polynomials over any field or commutative ring.^[8] However, the assumption that

p_A(t)

has a factorization into linear factors is not always true, unless the matrix is over an algebraically closed field such as the complex numbers.

Secular function and secular equation

Secular function

The term secular function has been used for what is now called characteristic polynomial (in some literature the term secular function is still used). The term comes from the fact that the characteristic polynomial was used to calculate secular perturbations (on a time scale of a century, that is, slow compared to annual motion) of planetary orbits, according to Lagrange's theory of oscillations.

Secular equation

Secular equation may have several meanings.

In linear algebra it is sometimes used in place of characteristic equation.
In astronomy it is the algebraic or numerical expression of the magnitude of the inequalities in a planet's motion that remain after the inequalities of a short period have been allowed for.^[9]
In molecular orbital calculations relating to the energy of the electron and its wave function it is also used instead of the characteristic equation.

For general associative algebras

The above definition of the characteristic polynomial of a matrix

A\inM_n(F)

with entries in a field

generalizes without any changes to the case when

is just a commutative ring. defines the characteristic polynomial for elements of an arbitrary finite-dimensional (associative, but not necessarily commutative) algebra over a field

and proves the standard properties of the characteristic polynomial in this generality.

References

T.S. Blyth & E.F. Robertson (1998) Basic Linear Algebra, p 149, Springer .
John B. Fraleigh & Raymond A. Beauregard (1990) Linear Algebra 2nd edition, p 246, Addison-Wesley .
- Werner Greub (1974) Linear Algebra 4th edition, pp 120 - 5, Springer, .
Paul C. Shields (1980) Elementary Linear Algebra 3rd edition, p 274, Worth Publishers .
Gilbert Strang (1988) Linear Algebra and Its Applications 3rd edition, p 246, Brooks/Cole .

Notes and References

Book: Guillemin, Ernst . Introductory Circuit Theory . Ernst Guillemin . 1953 . Wiley . 366, 541 . 0471330663.
Forsythe . George E. . Motzkin . Theodore . January 1952 . An Extension of Gauss' Transformation for Improving the Condition of Systems of Linear Equations . Mathematics of Computation. 6 . 37 . 18–34 . 10.1090/S0025-5718-1952-0048162-0 . 3 October 2020. free .
Frank . Evelyn . 1946 . On the zeros of polynomials with complex coefficients . Bulletin of the American Mathematical Society . 52 . 2 . 144–157 . 10.1090/S0002-9904-1946-08526-2 . free .
Web site: Characteristic Polynomial of a Graph – Wolfram MathWorld. August 26, 2011.
Book: Advanced linear algebra . Steven Roman . 3540978372 . 2 . 1992 . Springer . 137.
Theorem 4 in these lecture notes
Book: Horn . Roger A. . Johnson . Charles R. . Matrix Analysis . . 978-0-521-54823-6 . 2013 . 2nd. pp. 108–109, Section 2.4.2.
Book: Lang, Serge . Algebra . Springer . 1993 . 978-1-4613-0041-0 . New York . 852792828. p.567, Theorem 3.10.
Web site: secular equation. January 21, 2010.

Characteristic polynomial explained

Motivation

Formal definition

Examples

Properties

Characteristic polynomial of a product of two matrices

Characteristic polynomial of Ak

Secular function and secular equation

Secular function

Secular equation

For general associative algebras

See also

References

Notes and References

Characteristic polynomial of A^k