Diagonalizable matrix explained

A square matrix $A$ is called diagonalizable or non-defective if it is similar to a diagonal matrix. That is, if there exists an invertible matrix $P$ and a diagonal matrix $D$ such that $P^{-1}AP=D$. This is equivalent to $A=PDP^{-1}$. (Such $P$, $D$ are not unique.) This property exists for any linear map: for a finite-dimensional vector space $V$, a linear map $T:V\to V$ is called diagonalizable if there exists an ordered basis of $V$ consisting of eigenvectors of $T$. These definitions are equivalent: if $T$ has a matrix representation $A=PDP^{-1}$ as above, then the column vectors of $P$ form a basis consisting of eigenvectors of $T$, and the diagonal entries of $D$ are the corresponding eigenvalues of $T$; with respect to this eigenvector basis, $T$ is represented by $D$.

Diagonalization is the process of finding the above $P$ and $D$, and it makes many subsequent computations easier. One can raise a diagonal matrix $D$ to a power by simply raising the diagonal entries to that power. The determinant of a diagonal matrix is simply the product of all diagonal entries. Such computations generalize easily to $A=PDP^{-1}$.

The geometric transformation represented by a diagonalizable matrix is an inhomogeneous dilation (or anisotropic scaling). That is, it can scale the space by a different amount in different directions. The direction of each eigenvector is scaled by a factor given by the corresponding eigenvalue.

A square matrix that is not diagonalizable is called defective. It can happen that a matrix $A$ with real entries is defective over the real numbers, meaning that $A=PDP^{-1}$ is impossible for any invertible $P$ and diagonal $D$ with real entries, but it is possible with complex entries, so that $A$ is diagonalizable over the complex numbers. For example, this is the case for a generic rotation matrix.

Many results for diagonalizable matrices hold only over an algebraically closed field (such as the complex numbers). In this case, diagonalizable matrices are dense in the space of all matrices, which means any defective matrix can be deformed into a diagonalizable matrix by a small perturbation; and the Jordan–Chevalley decomposition states that any matrix is uniquely the sum of a diagonalizable matrix and a nilpotent matrix. Over an algebraically closed field, diagonalizable matrices are equivalent to semi-simple matrices.

Definition

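A square $n \times n$ matrix $A$ with entries in a field $F$ is called diagonalizable or nondefective if there exists an $n \times n$ invertible matrix $P$ (i.e. an element of the general linear group $\mathrm{GL}_n(F)$) such that $P^{-1}AP$ is a diagonal matrix. Formally,

A\in F^{n\times n}\text{ diagonalizable}\iff\exists\,P\in\mathrm{GL}_n(F):\;P^{-1}AP\text{ diagonal}.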

Characterization

The fundamental fact about diagonalizable maps and matrices is expressed by the following:

An $n \times n$ matrix $A$ over a field $F$ is diagonalizable if and only if the sum of the dimensions of its eigenspaces is equal to $n$, which is the case if and only if there exists a basis of $F^n$ consisting of eigenvectors of $A$. If such a basis has been found, one can form the matrix $P$ having these basis vectors as columns, and $P^{-1}AP$ will be a diagonal matrix whose diagonal entries are the eigenvalues of $A$. The matrix $P$ is known as a modal matrix for $A$.

A linear map $T:V\to V$ is diagonalizable if and only if the sum of the dimensions of its eigenspaces is equal to $\dim(V)$, which is the case if and only if there exists a basis of $V$ consisting of eigenvectors of $T$. With respect to such a basis, $T$ will be represented by a diagonal matrix. The diagonal entries of this matrix are the eigenvalues of $T$.

The following sufficient (but not necessary) condition is often useful.

An $n \times n$ matrix $A$ is diagonalizable over the field $F$ if it has $n$ distinct eigenvalues in $F$, i.e. if its characteristic polynomial has $n$ distinct roots in $F$; however, the converse may be false. Consider

\begin{bmatrix} -1 & 3 & -1 \\ -3 & 5 & -1 \\ -3 & 3 & 1 \end{bmatrix},

which has eigenvalues 1, 2, 2 (not all distinct) and is diagonalizable with diagonal form (similar to $A$)

\begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{bmatrix}

and change of basis matrix $P$:

\begin{bmatrix} 1 & 1 & -1 \\ 1 & 1 & 0 \\ 1 & 0 & 3 \end{bmatrix}.

The converse fails when $A$ has an eigenspace of dimension higher than 1. In this example, the eigenspace of $A$ associated with the eigenvalue 2 has dimension 2.
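The claim that this matrix has a full eigenvector basis despite the repeated eigenvalue can be checked directly. The following is a minimal sketch in plain Python; the helper name `matvec` and the pairing of each column of the change-of-basis matrix with an eigenvalue (read off from the diagonal form, in order) are illustrative choices, not part of the original text.

```python
# Example matrix from the text.
A = [[-1, 3, -1],
     [-3, 5, -1],
     [-3, 3, 1]]

# Columns of the change-of-basis matrix, paired with the diagonal
# entries 1, 2, 2 in the same order (assumed pairing for this sketch).
eigenpairs = [
    (1, [1, 1, 1]),
    (2, [1, 1, 0]),
    (2, [-1, 0, 3]),
]


def matvec(m, v):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in m]


# Each column v should satisfy A v = lambda * v.
checks = [matvec(A, v) == [lam * x for x in v] for lam, v in eigenpairs]
print(checks)
```

The two vectors paired with the eigenvalue 2 are not multiples of each other, which is what gives that eigenspace dimension 2.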

A linear map $T:V\to V$ with $n=\dim(V)$ is diagonalizable if it has $n$ distinct eigenvalues, i.e. if its characteristic polynomial has $n$ distinct roots in $F$.

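Let $A$ be a matrix over $F$. If $A$ is diagonalizable, then so is any power of it. Conversely, if $A$ is invertible, $F$ is algebraically closed, and $A^n$ is diagonalizable for some $n$ that is not an integer multiple of the characteristic of $F$, then $A$ is diagonalizable. Proof: If $A^n$ is diagonalizable, then $A$ is annihilated by some polynomial $\left(x^n-\lambda_1\right)\cdots\left(x^n-\lambda_k\right)$, where the $\lambda_j$ are the distinct eigenvalues of $A^n$. This polynomial has no multiple root (since $\lambda_j\neq 0$) and is divided by the minimal polynomial of $A$.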

Over the complex numbers $\mathbb{C}$, almost every matrix is diagonalizable. More precisely: the set of complex $n \times n$ matrices that are not diagonalizable over $\mathbb{C}$, considered as a subset of $\mathbb{C}^{n\times n}$, has Lebesgue measure zero. One can also say that the diagonalizable matrices form a dense subset with respect to the Zariski topology: the non-diagonalizable matrices lie inside the vanishing set of the discriminant of the characteristic polynomial, which is a hypersurface. From that follows also density in the usual (strong) topology given by a norm. The same is not true over $\mathbb{R}$.

The Jordan–Chevalley decomposition expresses an operator as the sum of its semisimple (i.e., diagonalizable) part and its nilpotent part. Hence, a matrix is diagonalizable if and only if its nilpotent part is zero. Put in another way, a matrix is diagonalizable if each block in its Jordan form has no nilpotent part; i.e., each "block" is a one-by-one matrix.

Diagonalization

Consider the following two arbitrary bases $E=\{\boldsymbol{e}_i \mid \forall i\in[n]\}$ and $F=\{\boldsymbol{\alpha}_i \mid \forall i\in[n]\}$. Suppose that there exists a linear transformation represented by a matrix $A_E$ which is written with respect to basis E. Suppose also that there exists the following eigen-equation:

A_E\boldsymbol{\alpha}_{E,i}=\lambda_i\boldsymbol{\alpha}_{E,i}

The alpha eigenvectors are also written with respect to the E basis. Since the set F is a basis consisting of eigenvectors of the matrix A, there exists a diagonal matrix $D_F$ that is similar to $A_E$. In other words, $A_E$ is diagonalizable: it becomes diagonal when the transformation is written in the basis F. We perform the change of basis calculation using the transition matrix $S$, which changes basis from E to F as follows:

D_F=S_E^F \, A_E \, \left(S_E^F\right)^{-1},

where $S_E^F$ is the transition matrix from E-basis to F-basis. The inverse can then be equated to a new transition matrix $P$ which changes basis from F to E instead, and so we have the following relationship:

\left(S_E^F\right)^{-1}=P_F^E

Both $S$ and $P$ transition matrices are invertible. Thus we can manipulate the matrices in the following fashion:

\begin{align} D &= S \, A_E \, S^{-1} \\ D &= P^{-1} \, A_E \, P \end{align}

The matrix $A_E$ will be denoted as $A$, which is still in the E-basis. Similarly, the diagonal matrix is in the F-basis.

If a matrix $A$ can be diagonalized, that is,

P^{-1}AP=\begin{bmatrix} \lambda_1&0&\cdots&0\\ 0&\lambda_2&\cdots&0\\ \vdots&\vdots&\ddots&\vdots\\ 0&0&\cdots&\lambda_n \end{bmatrix}=D,

then:

AP=P\begin{bmatrix} \lambda_1&0&\cdots&0\\ 0&\lambda_2&\cdots&0\\ \vdots&\vdots&\ddots&\vdots\\ 0&0&\cdots&\lambda_n \end{bmatrix}.

The transition matrix S has the E-basis vectors as columns written in the basis F. Conversely, the inverse transition matrix P has the F-basis vectors $\boldsymbol{\alpha}_i$ written in the basis of E, so that we can represent P in block matrix form in the following manner:

P=\begin{bmatrix}\boldsymbol{\alpha}_{E,1}&\boldsymbol{\alpha}_{E,2}&\cdots&\boldsymbol{\alpha}_{E,n}\end{bmatrix},

as a result we can write:

A\begin{bmatrix}\boldsymbol{\alpha}_{E,1}&\boldsymbol{\alpha}_{E,2}&\cdots&\boldsymbol{\alpha}_{E,n}\end{bmatrix}=\begin{bmatrix}\boldsymbol{\alpha}_{E,1}&\boldsymbol{\alpha}_{E,2}&\cdots&\boldsymbol{\alpha}_{E,n}\end{bmatrix}D.

In block matrix form, we can consider the A-matrix to be a matrix of $1 \times 1$ dimensions whilst P is a $1 \times n$ dimensional matrix. The D-matrix can be written in full form with all the diagonal elements as an $n \times n$ dimensional matrix:

A\begin{bmatrix}\boldsymbol{\alpha}_{E,1}&\boldsymbol{\alpha}_{E,2}&\cdots&\boldsymbol{\alpha}_{E,n}\end{bmatrix}=\begin{bmatrix}\boldsymbol{\alpha}_{E,1}&\boldsymbol{\alpha}_{E,2}&\cdots&\boldsymbol{\alpha}_{E,n}\end{bmatrix}\begin{bmatrix} \lambda_1&0&\cdots&0\\ 0&\lambda_2&\cdots&0\\ \vdots&\vdots&\ddots&\vdots\\ 0&0&\cdots&\lambda_n \end{bmatrix}.

Performing the above matrix multiplication we end up with the following result:

A\begin{bmatrix}\boldsymbol{\alpha}_1&\boldsymbol{\alpha}_2&\cdots&\boldsymbol{\alpha}_n\end{bmatrix}=\begin{bmatrix}\lambda_1\boldsymbol{\alpha}_1&\lambda_2\boldsymbol{\alpha}_2&\cdots&\lambda_n\boldsymbol{\alpha}_n\end{bmatrix}

Taking each component of the block matrix individually on both sides, we end up with the following:

A\boldsymbol{\alpha}_i=\lambda_i\boldsymbol{\alpha}_i \quad (i=1,2,\dots,n).

So the column vectors of $P$ are right eigenvectors of $A$, and the corresponding diagonal entry is the corresponding eigenvalue. The invertibility of $P$ also implies that the eigenvectors are linearly independent and form a basis of $F^n$ (with $F$ here denoting the underlying field). This is the necessary and sufficient condition for diagonalizability and the canonical approach of diagonalization. The row vectors of $P^{-1}$ are the left eigenvectors of $A$.

When a complex matrix $A\in\mathbb{C}^{n\times n}$ is a Hermitian matrix (or more generally a normal matrix), eigenvectors of $A$ can be chosen to form an orthonormal basis of $\mathbb{C}^n$, and $P$ can be chosen to be a unitary matrix. If in addition, $A\in\mathbb{R}^{n\times n}$ is a real symmetric matrix, then its eigenvectors can be chosen to be an orthonormal basis of $\mathbb{R}^n$ and $P$ can be chosen to be an orthogonal matrix.

For most practical work matrices are diagonalized numerically using computer software. Many algorithms exist to accomplish this.

Simultaneous diagonalization

A set of matrices is said to be simultaneously diagonalizable if there exists a single invertible matrix $P$ such that $P^{-1}AP$ is a diagonal matrix for every $A$ in the set. The following theorem characterizes simultaneously diagonalizable matrices: A set of diagonalizable matrices commutes if and only if the set is simultaneously diagonalizable.^[1]

The set of all $n \times n$ diagonalizable matrices (over $\mathbb{C}$) with $n>1$ is not simultaneously diagonalizable. For instance, the matrices

\begin{bmatrix}1&0\\ 0&0\end{bmatrix} \quad\text{and}\quad \begin{bmatrix}1&1\\ 0&0\end{bmatrix}

are diagonalizable but not simultaneously diagonalizable because they do not commute.
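The failure to commute can be confirmed by multiplying the two matrices in both orders. This is a minimal sketch in plain Python; the names `matmul`, `A` and `B` are illustrative labels for the two matrices above, not notation from the original text.

```python
def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]


A = [[1, 0],
     [0, 0]]
B = [[1, 1],
     [0, 0]]

AB = matmul(A, B)
BA = matmul(B, A)

# The two products differ, so the matrices do not commute.
print(AB, BA, AB == BA)
```

Since commuting is necessary for simultaneous diagonalizability, the differing products are enough to rule it out for this pair.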

A set consists of commuting normal matrices if and only if it is simultaneously diagonalizable by a unitary matrix; that is, there exists a unitary matrix $U$ such that $U^{*}AU$ is diagonal for every $A$ in the set.

In the language of Lie theory, a set of simultaneously diagonalizable matrices generates a toral Lie algebra.

Examples

Diagonalizable matrices

Involutions are diagonalizable over the reals (and indeed any field of characteristic not 2), with ±1 on the diagonal.
Finite order endomorphisms are diagonalizable over $\mathbb{C}$ (or any algebraically closed field where the characteristic of the field does not divide the order of the endomorphism) with roots of unity on the diagonal. This follows since the minimal polynomial is separable, because the roots of unity are distinct.

Projections are diagonalizable, with 0s and 1s on the diagonal.
Real symmetric matrices are diagonalizable by orthogonal matrices; i.e., given a real symmetric matrix $A$, $Q^{T}AQ$ is diagonal for some orthogonal matrix $Q$. More generally, matrices are diagonalizable by unitary matrices if and only if they are normal. In the case of the real symmetric matrix, we see that $A=A^{T}$, so clearly $AA^{T}=A^{T}A$ holds. Examples of normal matrices are real symmetric (or skew-symmetric) matrices (e.g. covariance matrices) and Hermitian matrices (or skew-Hermitian matrices). See spectral theorems for generalizations to infinite-dimensional vector spaces.

Matrices that are not diagonalizable

In general, a rotation matrix is not diagonalizable over the reals, but all rotation matrices are diagonalizable over the complex field. Even if a matrix is not diagonalizable, it is always possible to "do the best one can", and find a matrix with the same properties consisting of eigenvalues on the leading diagonal, and either ones or zeroes on the superdiagonal – known as Jordan normal form.

Some matrices are not diagonalizable over any field, most notably nonzero nilpotent matrices. This happens more generally if the algebraic and geometric multiplicities of an eigenvalue do not coincide. For instance, consider

C=\begin{bmatrix}0&1\\ 0&0\end{bmatrix}.

This matrix is not diagonalizable: there is no matrix $U$ such that $U^{-1}CU$ is a diagonal matrix. Indeed, $C$ has one eigenvalue (namely zero) and this eigenvalue has algebraic multiplicity 2 and geometric multiplicity 1.

Some real matrices are not diagonalizable over the reals. Consider for instance the matrix

B=\left[\begin{array}{rr}0&1\\ -1&0\end{array}\right].

The matrix $B$ does not have any real eigenvalues, so there is no real matrix $Q$ such that $Q^{-1}BQ$ is a diagonal matrix. However, we can diagonalize $B$ if we allow complex numbers. Indeed, if we take

Q=\begin{bmatrix}1&i\\ i&1\end{bmatrix},

then $Q^{-1}BQ$ is diagonal. It is easy to find that $B$ is the rotation matrix which rotates counterclockwise by angle $\theta=-\frac{\pi}{2}$.
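The complex diagonalization of this rotation matrix can be carried out with Python's built-in complex numbers. The sketch below is illustrative only: the helpers `matmul` and `inverse_2x2` are hypothetical names introduced here, and the inverse uses the standard adjugate formula for a 2 x 2 matrix.

```python
def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]


def inverse_2x2(m):
    """Invert a 2 x 2 matrix via the adjugate formula (assumes det != 0)."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det],
            [-c / det, a / det]]


B = [[0, 1],
     [-1, 0]]
Q = [[1, 1j],
     [1j, 1]]

# Q^{-1} B Q should be diagonal, with the complex eigenvalues of B
# on the diagonal.
D = matmul(matmul(inverse_2x2(Q), B), Q)
print(D)
```

The diagonal entries come out as $i$ and $-i$, which are not real, consistent with the statement that no real $Q$ can diagonalize this matrix.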

Note that the above examples show that the sum of diagonalizable matrices need not be diagonalizable.

How to diagonalize a matrix

Diagonalizing a matrix is the same process as finding its eigenvalues and eigenvectors, in the case that the eigenvectors form a basis. For example, consider the matrix

A=\left[\begin{array}{rrr} 0&1&-2\\ 0&1&0\\ 1&-1&3 \end{array}\right].

The roots of the characteristic polynomial $p(\lambda)=\det(\lambda I-A)$ are the eigenvalues $\lambda_1=1$, $\lambda_2=1$, $\lambda_3=2$. Solving the linear system $(I-A)v=0$ gives the eigenvectors $v_1=(1,1,0)$ and $v_2=(0,2,1)$, while $(2I-A)v=0$ gives $v_3=(1,0,-1)$; that is, $Av_i=\lambda_i v_i$ for $i=1,2,3$. These vectors form a basis of $V=\mathbb{R}^3$, so we can assemble them as the column vectors of a change-of-basis matrix $P$ to get:

P^{-1}AP=\left[\begin{array}{rrr} 1&0&1\\ 1&2&0\\ 0&1&-1 \end{array}\right]^{-1}\left[\begin{array}{rrr} 0&1&-2\\ 0&1&0\\ 1&-1&3 \end{array}\right]\left[\begin{array}{rrr} 1&0&1\\ 1&2&0\\ 0&1&-1 \end{array}\right]=\begin{bmatrix} 1&0&0\\ 0&1&0\\ 0&0&2 \end{bmatrix}=D.

We may see this equation in terms of transformations: $P$ takes the standard basis to the eigenbasis, $P\mathbf{e}_i=\mathbf{v}_i$, so we have:

P^{-1}AP\mathbf{e}_i=P^{-1}A\mathbf{v}_i=P^{-1}(\lambda_i\mathbf{v}_i)=\lambda_i\mathbf{e}_i,

so that $P^{-1}AP$ has the standard basis as its eigenvectors, which is the defining property of $D$.

Note that there is no preferred order of the eigenvectors in $P$; changing the order of the eigenvectors in $P$ just changes the order of the eigenvalues in the diagonalized form of $A$.^[2]
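The worked example can be reproduced with exact rational arithmetic. The following is a minimal sketch, assuming a hand-written Gauss-Jordan inverse (the helper names `matmul` and `inverse` are illustrative); in practice a numerical library would normally be used instead.

```python
from fractions import Fraction


def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]


def inverse(m):
    """Invert a square matrix by Gauss-Jordan elimination over the rationals.

    Raises StopIteration if the matrix is singular (no pivot found).
    """
    n = len(m)
    # Augment m with the identity matrix.
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        # Swap a row with a nonzero pivot into position.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row so the pivot becomes 1.
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        # Clear the pivot column in every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


A = [[0, 1, -2],
     [0, 1, 0],
     [1, -1, 3]]
# Columns are the eigenvectors (1,1,0), (0,2,1), (1,0,-1).
P = [[1, 0, 1],
     [1, 2, 0],
     [0, 1, -1]]

P_inv = inverse(P)
D = matmul(matmul(P_inv, A), P)
print(D)
```

Reordering the columns of `P` in this sketch permutes the diagonal entries of `D` accordingly.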

Application to matrix functions

Diagonalization can be used to efficiently compute the powers of a matrix $A=PDP^{-1}$:

\begin{align}A^k&=\left(PDP^{-1}\right)^k=\left(PDP^{-1}\right)\left(PDP^{-1}\right)\cdots\left(PDP^{-1}\right)\\ &=PD\left(P^{-1}P\right)D\left(P^{-1}P\right)\cdots\left(P^{-1}P\right)DP^{-1}=PD^kP^{-1}, \end{align}

and the latter is easy to calculate since it only involves the powers of a diagonal matrix. For example, for the matrix $A$ with eigenvalues $\lambda=1,1,2$ in the example above we compute:

\begin{align} A^k=PD^kP^{-1}&=\left[\begin{array}{rrr} 1&0&1\\ 1&2&0\\ 0&1&-1 \end{array}\right] \begin{bmatrix}1^k&0&0\\ 0&1^k&0\\ 0&0&2^k\end{bmatrix} \left[\begin{array}{rrr} 1&0&1\\ 1&2&0\\ 0&1&-1 \end{array}\right]^{-1}\\[1em] &=\begin{bmatrix} 2-2^k&-1+2^k&2-2^{k+1}\\ 0&1&0\\ -1+2^k&1-2^k&-1+2^{k+1}\end{bmatrix}. \end{align}
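The identity $A^k=PD^kP^{-1}$ for this example can be compared against repeated multiplication. The sketch below is illustrative: `P_INV` is the inverse of $P$ worked out by hand for this example (an assumption of the sketch, confirmed in the code by checking that `P` times `P_INV` is the identity), and the function names are hypothetical.

```python
def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]


IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
A = [[0, 1, -2],
     [0, 1, 0],
     [1, -1, 3]]
P = [[1, 0, 1],
     [1, 2, 0],
     [0, 1, -1]]
# Hand-computed inverse of P; it has integer entries because det(P) = -1.
P_INV = [[2, -1, 2],
         [-1, 1, -1],
         [-1, 1, -2]]
assert matmul(P, P_INV) == IDENTITY


def power_via_diagonalization(k):
    """Compute A^k as P D^k P^{-1}, where D = diag(1, 1, 2)."""
    d_k = [[1, 0, 0], [0, 1, 0], [0, 0, 2 ** k]]
    return matmul(matmul(P, d_k), P_INV)


# Compare with plain repeated multiplication for the first few powers.
power = IDENTITY
agree = []
for k in range(8):
    agree.append(power == power_via_diagonalization(k))
    power = matmul(power, A)

print(all(agree))
print(power_via_diagonalization(3))
```

Only the single entry $2^k$ changes with $k$, which is why the diagonal route stays cheap for large exponents.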

This approach can be generalized to the matrix exponential and other matrix functions that can be defined as power series. For example, defining $\exp(A)=I+A+\frac{1}{2!}A^2+\frac{1}{3!}A^3+\cdots$, we have:

\begin{align} \exp(A)=P\exp(D)P^{-1}&=\left[\begin{array}{rrr} 1&0&1\\ 1&2&0\\ 0&1&-1 \end{array}\right] \begin{bmatrix}e^1&0&0\\ 0&e^1&0\\ 0&0&e^2\end{bmatrix} \left[\begin{array}{rrr} 1&0&1\\ 1&2&0\\ 0&1&-1 \end{array}\right]^{-1}\\[1em] &=\begin{bmatrix} 2e-e^2&-e+e^2&2e-2e^2\\ 0&e&0\\ -e+e^2&e-e^2&-e+2e^2\end{bmatrix}. \end{align}

This is particularly useful in finding closed form expressions for terms of linear recursive sequences, such as the Fibonacci numbers.
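As an illustration of the Fibonacci remark, here is a minimal sketch. It assumes the standard formulation, which is not spelled out in the text: the recurrence is driven by the matrix with rows (1, 1) and (1, 0), whose eigenvalues are $(1\pm\sqrt{5})/2$, and diagonalizing it yields the closed form used below. The function names are hypothetical, and floating-point rounding limits this version to moderate $n$.

```python
import math

SQRT5 = math.sqrt(5.0)
PHI = (1 + SQRT5) / 2  # larger eigenvalue of the recurrence matrix
PSI = (1 - SQRT5) / 2  # smaller eigenvalue of the recurrence matrix


def fib_closed_form(n):
    """Fibonacci number from the eigenvalue powers (floating point)."""
    return round((PHI ** n - PSI ** n) / SQRT5)


def fib_iterative(n):
    """Fibonacci number by direct iteration, used as a reference."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


matches = [fib_closed_form(n) == fib_iterative(n) for n in range(40)]
print(all(matches))
```

The closed form replaces $n$ matrix multiplications with two scalar powers, which is the same saving that $D^k$ provides in the matrix setting.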

Particular application

For example, consider the following matrix:

M=\begin{bmatrix}a&b-a\\ 0&b\end{bmatrix}.

Calculating the various powers of $M$ reveals a surprising pattern:

M^2=\begin{bmatrix}a^2&b^2-a^2\\ 0&b^2\end{bmatrix},\quad M^3=\begin{bmatrix}a^3&b^3-a^3\\ 0&b^3\end{bmatrix},\quad M^4=\begin{bmatrix}a^4&b^4-a^4\\ 0&b^4\end{bmatrix},\quad\ldots

The above phenomenon can be explained by diagonalizing $M$. To accomplish this, we need a basis of $\mathbb{R}^2$ consisting of eigenvectors of $M$. One such eigenvector basis is given by

u=\begin{bmatrix}1\\ 0\end{bmatrix}=e_1,\quad v=\begin{bmatrix}1\\ 1\end{bmatrix}=e_1+e_2,

where $e_i$ denotes the standard basis of $\mathbb{R}^n$. The reverse change of basis is given by

e₁=u, e₂=v-u.

Straightforward calculations show that

Mu=au, Mv=bv.

Thus, a and b are the eigenvalues corresponding to u and v, respectively. By linearity of matrix multiplication, we have that

Mⁿu=aⁿu, Mⁿv=bⁿv.

Switching back to the standard basis, we have

\begin{align} M^n e_1&=M^n u=a^n e_1,\\ M^n e_2&=M^n\left(v-u\right)=b^n v-a^n u=\left(b^n-a^n\right)e_1+b^n e_2. \end{align}

The preceding relations, expressed in matrix form, are

M^n=\begin{bmatrix}a^n&b^n-a^n\\ 0&b^n\end{bmatrix},

thereby explaining the above phenomenon.
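The pattern derived above can be spot-checked numerically. This is a minimal sketch with arbitrary sample values for $a$ and $b$ (chosen here only for illustration); `matmul` and `predicted_power` are hypothetical helper names.

```python
def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]


def predicted_power(a, b, n):
    """The n-th power of M as predicted by the eigenvector argument."""
    return [[a ** n, b ** n - a ** n],
            [0, b ** n]]


a, b = 3, 5  # arbitrary sample values, not taken from the text
M = [[a, b - a],
     [0, b]]

# Compare repeated multiplication with the predicted form for n = 0..5.
power = [[1, 0], [0, 1]]
matches = []
for n in range(6):
    matches.append(power == predicted_power(a, b, n))
    power = matmul(power, M)

print(all(matches))
```

A check with sample values does not prove the formula; the proof is the eigenvector argument above, and the sketch only guards against slips in the algebra.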

Quantum mechanical application

In quantum mechanical and quantum chemical computations matrix diagonalization is one of the most frequently applied numerical processes. The basic reason is that the time-independent Schrödinger equation is an eigenvalue equation, albeit in most of the physical situations on an infinite dimensional Hilbert space.

A very common approximation is to truncate Hilbert space to finite dimension, after which the Schrödinger equation can be formulated as an eigenvalue problem of a real symmetric, or complex Hermitian matrix. Formally this approximation is founded on the variational principle, valid for Hamiltonians that are bounded from below.

First-order perturbation theory also leads to a matrix eigenvalue problem for degenerate states.

Notes and References

[1] Horn, Roger A.; Johnson, Charles R. (2013). Matrix Analysis, second edition. Cambridge University Press. ISBN 9780521839402.
[2] Anton, H.; Rorres, C. (22 Feb 2000). Elementary Linear Algebra (Applications Version) (8th ed.). John Wiley & Sons. ISBN 978-0-471-17052-5.
