Weyr canonical form explained

In linear algebra, a Weyr canonical form (also called a Weyr form or Weyr matrix) is a square matrix which (in some sense) induces "nice" properties on the matrices it commutes with. It also has a particularly simple structure, and the conditions for possessing a Weyr form are fairly weak, making it a suitable tool for studying classes of commuting matrices. A square matrix is said to be in Weyr canonical form if it has the structure defined below. The Weyr form was discovered by the Czech mathematician Eduard Weyr in 1885.^[1] ^[2] ^[3] It did not become popular among mathematicians and was overshadowed by the closely related, but distinct, Jordan canonical form.^[3] The Weyr form has been rediscovered several times since Weyr's original discovery in 1885.^[4] It has variously been called the modified Jordan form, reordered Jordan form, second Jordan form, and H-form.^[4] The current terminology is credited to Shapiro, who introduced it in a paper published in the American Mathematical Monthly in 1999.^[4] ^[5]

Recently several applications have been found for the Weyr matrix. Of particular interest is an application of the Weyr matrix in the study of phylogenetic invariants in biomathematics.

Definitions

Definition

A basic Weyr matrix with eigenvalue $\lambda$ is an $n \times n$ matrix $W$ of the following form: there is an integer partition

$n_1 + n_2 + \cdots + n_r = n$ with $n_1 \ge n_2 \ge \cdots \ge n_r \ge 1$

such that, when $W$ is viewed as an $r \times r$ block matrix $(W_{ij})$, where the $(i,j)$ block $W_{ij}$ is an $n_i \times n_j$ matrix, the following three features are present:

The main diagonal blocks $W_{ii}$ are the $n_i \times n_i$ scalar matrices $\lambda I$ for $i = 1, \ldots, r$.
The first superdiagonal blocks $W_{i,i+1}$ are full column rank $n_i \times n_{i+1}$ matrices in reduced row-echelon form (that is, an identity matrix followed by zero rows) for $i = 1, \ldots, r-1$.
All other blocks of $W$ are zero (that is, $W_{ij} = 0$ when $j \ne i, i+1$).

In this case, we say that $W$ has Weyr structure $(n_1, n_2, \ldots, n_r)$.
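The three features of the definition translate directly into a construction. The following is a minimal sketch, assuming matrices are stored as plain lists of rows; the function name `basic_weyr` is hypothetical and chosen here only for illustration.

```python
def basic_weyr(eigenvalue, structure):
    """Build the basic Weyr matrix with the given eigenvalue and Weyr
    structure (n_1 >= n_2 >= ... >= n_r >= 1), as a list of rows.

    `basic_weyr` is an illustrative name, not taken from any library.
    """
    structure = list(structure)
    if (not structure or structure[-1] < 1
            or any(a < b for a, b in zip(structure, structure[1:]))):
        raise ValueError("structure must be non-increasing positive integers")

    n = sum(structure)
    W = [[0] * n for _ in range(n)]

    # Feature 1: every diagonal block is the scalar matrix eigenvalue * I.
    for i in range(n):
        W[i][i] = eigenvalue

    # Feature 2: each first superdiagonal block W_{i,i+1} is an identity
    # matrix of size n_{i+1} followed by n_i - n_{i+1} zero rows.
    offset = 0  # index of the first row of the current diagonal block
    for n_i, n_next in zip(structure, structure[1:]):
        for c in range(n_next):
            W[offset + c][offset + n_i + c] = 1
        offset += n_i

    # Feature 3: all remaining blocks stay zero.
    return W
```

For instance, `basic_weyr(5, (4, 2, 2, 1))` returns a 9 x 9 matrix whose only nonzero off-diagonal entries are the five 1's contributed by the superdiagonal blocks.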

Example

The following is an example of a basic Weyr matrix.

$$W = \begin{bmatrix} W_{11} & W_{12} & & \\ & W_{22} & W_{23} & \\ & & W_{33} & W_{34} \\ & & & W_{44} \end{bmatrix}$$

In this matrix, $n = 9$ and $n_1 = 4$, $n_2 = 2$, $n_3 = 2$, $n_4 = 1$. So $W$ has the Weyr structure $(4,2,2,1)$. Also,

$$W_{11} = \begin{bmatrix} \lambda & 0 & 0 & 0 \\ 0 & \lambda & 0 & 0 \\ 0 & 0 & \lambda & 0 \\ 0 & 0 & 0 & \lambda \end{bmatrix} = \lambda I_4, \quad W_{22} = \begin{bmatrix} \lambda & 0 \\ 0 & \lambda \end{bmatrix} = \lambda I_2, \quad W_{33} = \begin{bmatrix} \lambda & 0 \\ 0 & \lambda \end{bmatrix} = \lambda I_2, \quad W_{44} = \begin{bmatrix} \lambda \end{bmatrix} = \lambda I_1$$

and

$$W_{12} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}, \quad W_{23} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad W_{34} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}.$$
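For readers who prefer to see the entries rather than the blocks, the same example can be expanded into a single 9 x 9 matrix. The display below is only a direct assembly of the blocks listed above (rows and columns 1 to 4 belong to the first block, 5 to 6 to the second, 7 to 8 to the third, and 9 to the fourth).

```latex
W = \begin{bmatrix}
\lambda & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & \lambda & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & \lambda & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & \lambda & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & \lambda & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & \lambda & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & \lambda & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & \lambda & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \lambda
\end{bmatrix}
```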

Definition

Let $W$ be a square matrix and let $\lambda_1, \ldots, \lambda_k$ be the distinct eigenvalues of $W$. We say that $W$ is in Weyr form (or is a Weyr matrix) if $W$ has the following form:

$$W = \begin{bmatrix} W_1 & & & \\ & W_2 & & \\ & & \ddots & \\ & & & W_k \end{bmatrix}$$

where $W_i$ is a basic Weyr matrix with eigenvalue $\lambda_i$ for $i = 1, \ldots, k$.

Example

An example of a general Weyr matrix is one consisting of three basic Weyr matrix blocks: the basic Weyr matrix in the top-left corner has the structure (4,2,1) with eigenvalue 4, the middle block has structure (2,2,1,1) with eigenvalue -3, and the one in the lower-right corner has the structure (3,2) with eigenvalue 0.
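The matrix described in this example can be assembled programmatically as a block diagonal sum of basic Weyr matrices. The sketch below assumes list-of-rows matrices; `basic_weyr` and `direct_sum` are hypothetical helper names introduced only for this illustration.

```python
def basic_weyr(eigenvalue, structure):
    """Basic Weyr matrix for a non-increasing structure (illustrative helper)."""
    n = sum(structure)
    W = [[eigenvalue if i == j else 0 for j in range(n)] for i in range(n)]
    offset = 0
    for n_i, n_next in zip(structure, structure[1:]):
        for c in range(n_next):
            # identity part of the superdiagonal block W_{i,i+1}
            W[offset + c][offset + n_i + c] = 1
        offset += n_i
    return W


def direct_sum(blocks):
    """Place the given square matrices along the diagonal of a zero matrix."""
    n = sum(len(b) for b in blocks)
    M = [[0] * n for _ in range(n)]
    offset = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, value in enumerate(row):
                M[offset + i][offset + j] = value
        offset += len(b)
    return M


# The three basic blocks named in the example: sizes 7, 6 and 5.
W = direct_sum([
    basic_weyr(4, (4, 2, 1)),
    basic_weyr(-3, (2, 2, 1, 1)),
    basic_weyr(0, (3, 2)),
])

for row in W:
    print(" ".join(f"{x:2d}" for x in row))
```

The result is an 18 x 18 matrix, since the three structures sum to 7, 6 and 5 respectively.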

Relation between Weyr and Jordan forms

The Weyr canonical form $W = P^{-1} J P$ is related to the Jordan form $J$ by a simple permutation $P$ for each Weyr basic block as follows: The first index of each Weyr subblock forms the largest Jordan chain. After crossing out these rows and columns, the first index of each new subblock forms the second largest Jordan chain, and so forth.^[6]
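The chain-picking rule above can be checked on a small case. The following sketch, which assumes list-of-rows matrices and uses hypothetical helper names (`chain_order`, `jordan_sizes`, and so on), reorders the indices of a basic Weyr matrix chain by chain and compares the result with a direct sum of Jordan blocks. It reindexes rows and columns simultaneously, which is one convention for the permutation similarity; the source's $P$ may be the inverse of the permutation used here.

```python
def basic_weyr(eigenvalue, structure):
    """Basic Weyr matrix for a non-increasing structure (illustrative helper)."""
    n = sum(structure)
    W = [[eigenvalue if i == j else 0 for j in range(n)] for i in range(n)]
    offset = 0
    for n_i, n_next in zip(structure, structure[1:]):
        for c in range(n_next):
            W[offset + c][offset + n_i + c] = 1
        offset += n_i
    return W


def jordan_sizes(structure):
    """Jordan block sizes: chain c passes through every subblock of size > c,
    so the sizes form the conjugate partition of the Weyr structure."""
    return [sum(1 for n_i in structure if n_i > c) for c in range(structure[0])]


def chain_order(structure):
    """Indices listed chain by chain: the c-th index of each subblock that
    has one, for c = 0, 1, 2, ..."""
    offsets = [sum(structure[:i]) for i in range(len(structure))]
    return [offsets[i] + c
            for c in range(structure[0])
            for i, n_i in enumerate(structure) if n_i > c]


def permute(M, order):
    """Simultaneously reorder rows and columns of M."""
    return [[M[a][b] for b in order] for a in order]


def jordan_matrix(eigenvalue, sizes):
    """Direct sum of upper-triangular Jordan blocks of the given sizes."""
    n = sum(sizes)
    J = [[0] * n for _ in range(n)]
    start = 0
    for size in sizes:
        for t in range(size):
            J[start + t][start + t] = eigenvalue
            if t + 1 < size:
                J[start + t][start + t + 1] = 1
        start += size
    return J


structure = (4, 2, 2, 1)
W = basic_weyr(2, structure)
J = permute(W, chain_order(structure))
print(jordan_sizes(structure))
print(J == jordan_matrix(2, jordan_sizes(structure)))
```

For the structure (4,2,2,1) the chains have lengths 4, 3, 1 and 1, so the corresponding Jordan form has one block of each of those sizes.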

The Weyr form is canonical

That the Weyr form is a canonical form of a matrix is a consequence of the following result:^[3] Each square matrix $A$ over an algebraically closed field is similar to a Weyr matrix $W$ which is unique up to permutation of its basic blocks. The matrix $W$ is called the Weyr (canonical) form of $A$.

Computation of the Weyr canonical form

Reduction to the nilpotent case

Let $A$ be a square matrix of order $n$ over an algebraically closed field and let the distinct eigenvalues of $A$ be $\lambda_1, \lambda_2, \ldots, \lambda_k$. The Jordan–Chevalley decomposition theorem states that $A$ is similar to a block diagonal matrix of the form

$$\begin{bmatrix} \lambda_1 I + N_1 & & & \\ & \lambda_2 I + N_2 & & \\ & & \ddots & \\ & & & \lambda_k I + N_k \end{bmatrix} = \begin{bmatrix} \lambda_1 I & & & \\ & \lambda_2 I & & \\ & & \ddots & \\ & & & \lambda_k I \end{bmatrix} + \begin{bmatrix} N_1 & & & \\ & N_2 & & \\ & & \ddots & \\ & & & N_k \end{bmatrix} = D + N$$

where $D$ is a diagonal matrix, $N$ is a nilpotent matrix, and $[D,N] = 0$, justifying the reduction of $N$ into subblocks $N_i$. So the problem of reducing $A$ to the Weyr form reduces to the problem of reducing the nilpotent matrices $N_i$ to the Weyr form. This leads to the generalized eigenspace decomposition theorem.
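A small worked instance may make the decomposition concrete. The 3 x 3 matrix below is chosen here purely for illustration and does not come from the source; it is already block diagonal, with eigenvalues $\lambda_1 = 2$ and $\lambda_2 = 3$.

```latex
\begin{bmatrix} 2 & 1 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{bmatrix}
= \underbrace{\begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{bmatrix}}_{D}
+ \underbrace{\begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}}_{N},
\qquad
DN = ND = \begin{bmatrix} 0 & 2 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix},
\qquad
N^2 = 0 .
```

Here $N_1 = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}$ and $N_2 = \begin{bmatrix} 0 \end{bmatrix}$, so only these two nilpotent blocks need to be brought to Weyr form. In this case both already are: $N_1$ has Weyr structure $(1,1)$ and $N_2$ has Weyr structure $(1)$.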

Reduction of a nilpotent matrix to the Weyr form

Given a nilpotent square matrix $A$ of order $n$ over an algebraically closed field $F$, the following algorithm produces an invertible matrix $C$ and a Weyr matrix $W$ such that $W = C^{-1} A C$.

Step 1

Let $A_1 = A$.

Step 2

Compute a basis for the null space of $A_1$.
Extend the basis for the null space of $A_1$ to a basis for the $n$-dimensional vector space $F^n$.
Form the matrix $P_1$ consisting of these basis vectors.
Compute $P_1^{-1} A_1 P_1 = \begin{bmatrix} 0 & B_2 \\ 0 & A_2 \end{bmatrix}$. $A_2$ is a square matrix of size $n - \operatorname{nullity}(A_1)$.
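Step 2 can be sketched in exact rational arithmetic as follows. This is only one way to carry it out, under stated assumptions: the field is taken to be the rationals, the null-space basis is read off from the reduced row-echelon form, and the basis is extended with the standard basis vectors of the pivot columns (one valid choice among many). All function names are hypothetical.

```python
from fractions import Fraction


def rref(M):
    """Return (reduced row-echelon form, list of pivot columns)."""
    rows = [[Fraction(x) for x in row] for row in M]
    pivots, r = [], 0
    for c in range(len(rows[0])):
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        lead = rows[r][c]
        rows[r] = [x / lead for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == len(rows):
            break
    return rows, pivots


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def inverse(M):
    """Gauss-Jordan inverse via the augmented matrix [M | I]."""
    n = len(M)
    aug = [list(row) + [int(i == j) for j in range(n)] for i, row in enumerate(M)]
    R, pivots = rref(aug)
    if pivots[:n] != list(range(n)):
        raise ValueError("matrix is not invertible")
    return [row[n:] for row in R]


def reduction_step(A):
    """One pass of Step 2: return (P, B_next, A_next)."""
    n = len(A)
    R, pivots = rref(A)
    free = [c for c in range(n) if c not in pivots]
    # One null-space basis vector per free column.
    null_basis = []
    for f in free:
        v = [Fraction(0)] * n
        v[f] = Fraction(1)
        for row, p in zip(R, pivots):
            v[p] = -row[f]
        null_basis.append(v)
    # Extend with the standard basis vectors of the pivot columns.
    extension = [[Fraction(int(i == p)) for i in range(n)] for p in pivots]
    P = [list(col) for col in zip(*(null_basis + extension))]  # vectors as columns
    M = matmul(inverse(P), matmul(A, P))
    k = len(null_basis)  # nullity of A
    B_next = [row[k:] for row in M[:k]]
    A_next = [row[k:] for row in M[k:]]
    return P, B_next, A_next
```

Calling `reduction_step` repeatedly on the returned `A_next` mirrors the iteration described in the following steps; the nullities encountered along the way are the entries of the Weyr structure.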

Step 3

If $A_2$ is nonzero, repeat Step 2 on $A_2$.

Compute a basis for the null space of $A_2$.
Extend the basis for the null space of $A_2$ to a basis for the vector space having dimension $n - \operatorname{nullity}(A_1)$.
Form the matrix $P_2$ consisting of these basis vectors.
Compute $P_2^{-1} A_2 P_2 = \begin{bmatrix} 0 & B_3 \\ 0 & A_3 \end{bmatrix}$. $A_3$ is a square matrix of size $n - \operatorname{nullity}(A_1) - \operatorname{nullity}(A_2)$.

Step 4

Continue the processes of Steps 2 and 3 to obtain increasingly smaller square matrices $A_1, A_2, A_3, \ldots$ and associated invertible matrices $P_1, P_2, P_3, \ldots$ until the first zero matrix $A_r$ is obtained.

Step 5

The Weyr structure of $A$ is $(n_1, n_2, \ldots, n_r)$ where $n_i = \operatorname{nullity}(A_i)$.
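The Weyr structure can also be obtained without forming the matrices $P_i$, using the standard fact that for a nilpotent matrix $n_i = \operatorname{nullity}(A^i) - \operatorname{nullity}(A^{i-1})$ (the Weyr characteristic). The sketch below takes this shortcut rather than following Steps 2 to 4 literally; it assumes rational entries, and the function names are hypothetical.

```python
from fractions import Fraction


def rank(M):
    """Rank by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in M]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for c in range(ncols):
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            if f:
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def weyr_structure(N):
    """Weyr structure (n_1, ..., n_r) of a nilpotent matrix N, computed as
    successive differences of the nullities of N, N^2, N^3, ..."""
    n = len(N)
    structure, power, previous = [], N, 0
    while previous < n:
        nullity = n - rank(power)
        if nullity == previous:
            # The nullities stopped growing before reaching n.
            raise ValueError("matrix is not nilpotent")
        structure.append(nullity - previous)
        previous = nullity
        power = matmul(power, N)
    return structure
```

Applied to a nilpotent matrix with Jordan blocks of sizes 4, 3, 1 and 1, this yields the structure (4,2,2,1) used in the earlier example.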

Step 6

Compute the matrix $P = P_1 \begin{bmatrix} I & 0 \\ 0 & P_2 \end{bmatrix} \begin{bmatrix} I & 0 \\ 0 & P_3 \end{bmatrix} \cdots \begin{bmatrix} I & 0 \\ 0 & P_r \end{bmatrix}$ (here the $I$'s are appropriately sized identity matrices).

Compute $X = P^{-1} A P$. $X$ is a matrix of the following form:

$$X = \begin{bmatrix} 0 & X_{12} & X_{13} & \cdots & X_{1,r-1} & X_{1r} \\ & 0 & X_{23} & \cdots & X_{2,r-1} & X_{2r} \\ & & & \ddots & & \\ & & & \cdots & 0 & X_{r-1,r} \\ & & & & & 0 \end{bmatrix}$$

Step 7

Use elementary row operations to find an invertible matrix $Y_{r-1}$ of appropriate size such that the product $Y_{r-1} X_{r-1,r}$ is a matrix of the form $I_{r,r-1} = \begin{bmatrix} I \\ O \end{bmatrix}$.
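Finding such a matrix by elementary row operations amounts to Gauss-Jordan elimination on the block augmented with an identity matrix. The following is a minimal sketch over the rationals, assuming the block has full column rank; the names `left_reducer` and `matmul` are hypothetical.

```python
from fractions import Fraction


def left_reducer(X):
    """Return an invertible Y with Y X = [I; 0] for a full-column-rank X.

    Row-reduces the augmented matrix [X | I]; the right-hand part then
    records the product of the elementary row operations applied.
    """
    m, k = len(X), len(X[0])
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(m)]
           for i, row in enumerate(X)]
    for c in range(k):
        # With full column rank, column c has a pivot in some row >= c.
        p = next((i for i in range(c, m) if aug[i][c] != 0), None)
        if p is None:
            raise ValueError("X does not have full column rank")
        aug[c], aug[p] = aug[p], aug[c]
        lead = aug[c][c]
        aug[c] = [x / lead for x in aug[c]]
        for i in range(m):
            if i != c and aug[i][c] != 0:
                f = aug[i][c]
                aug[i] = [a - f * b for a, b in zip(aug[i], aug[c])]
    return [row[k:] for row in aug]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


X = [[2, 1], [4, 3], [6, 5]]
Y = left_reducer(X)
print(matmul(Y, X))  # identity block on top, zero row below
```

Because `Y` is accumulated from elementary row operations, it is invertible by construction, which is what the conjugation in the next step requires.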

Step 8

Set $Q_1 = \operatorname{diag}(I, I, \ldots, Y_{r-1}^{-1}, I)$ and compute $Q_1^{-1} X Q_1$. In this matrix, the $(r-1,r)$-block is $I_{r,r-1}$.

Step 9

Find a matrix $R_1$ formed as a product of elementary matrices such that $R_1^{-1} Q_1^{-1} X Q_1 R_1$ is a matrix in which all the blocks above the block $I_{r,r-1}$ contain only $0$'s.

Step 10

Repeat Steps 8 and 9 on column $r-1$, converting the $(r-2,r-1)$-block to $I_{r-1,r-2}$ via conjugation by some invertible matrix $Q_2$. Use this block to clear out the blocks above, via conjugation by a product $R_2$ of elementary matrices.

Step 11

Repeat these processes on columns $r-2, r-3, \ldots, 3, 2$, using conjugations by $Q_3, R_3, \ldots, Q_{r-2}, R_{r-2}, Q_{r-1}$. The resulting matrix $W$ is now in Weyr form.

Step 12

Let $C = P_1 \operatorname{diag}(I, P_2) \cdots \operatorname{diag}(I, P_{r-1}) Q_1 R_1 Q_2 \cdots R_{r-2} Q_{r-1}$. Then $W = C^{-1} A C$.

Applications of the Weyr form

Some well-known applications of the Weyr form are listed below:^[3]

The Weyr form can be used to simplify the proof of Gerstenhaber's Theorem, which asserts that the subalgebra generated by two commuting $n \times n$ matrices has dimension at most $n$.
A finite set of matrices is said to be approximately simultaneously diagonalizable if they can be perturbed to simultaneously diagonalizable matrices. The Weyr form is used to prove approximate simultaneous diagonalizability of various classes of matrices. The approximate simultaneous diagonalizability property has applications in the study of phylogenetic invariants in biomathematics.
The Weyr form can be used to simplify the proofs of the irreducibility of the variety of all k-tuples of commuting complex matrices.
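The bound in Gerstenhaber's Theorem can be observed numerically on small cases. The sketch below does not use the Weyr form and proves nothing; it only computes the dimension of the algebra generated by two commuting matrices, relying on the fact that this algebra is spanned by the products $A^i B^j$ with $0 \le i, j < n$ (a consequence of commutativity and the Cayley–Hamilton theorem). Rational entries are assumed and all function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def rank(M):
    """Rank by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(rows[0])):
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            if f:
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def generated_dimension(A, B):
    """Dimension of the unital algebra generated by commuting A and B."""
    n = len(A)
    if matmul(A, B) != matmul(B, A):
        raise ValueError("A and B do not commute")
    identity = [[int(i == j) for j in range(n)] for i in range(n)]
    powers_a, powers_b = [identity], [identity]
    for _ in range(n - 1):
        powers_a.append(matmul(powers_a[-1], A))
        powers_b.append(matmul(powers_b[-1], B))
    # Flatten every product A^i B^j into a vector of length n*n and take
    # the rank of the resulting family.
    vectors = [[x for row in matmul(pa, pb) for x in row]
               for pa, pb in product(powers_a, powers_b)]
    return rank(vectors)


N = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
print(generated_dimension(N, matmul(N, N)))
```

In both this 3 x 3 case and the 4 x 4 case used in the accompanying check, the computed dimension does not exceed $n$, as the theorem asserts.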

Notes and References

Eduard Weyr. "Répartition des matrices en espèces et formation de toutes les espèces". Comptes Rendus de l'Académie des Sciences de Paris. 1885. 100: 966–969. Retrieved 10 December 2013.
Eduard Weyr. "Zur Theorie der bilinearen Formen". Monatshefte für Mathematik und Physik. 1890. 1: 163–236.
Kevin C. O'Meara, John Clark, Charles I. Vinsonhaler. Advanced Topics in Linear Algebra: Weaving Matrix Problems through the Weyr Form. Oxford University Press. 2011.
Kevin C. O'Meara, John Clark, Charles I. Vinsonhaler. Advanced Topics in Linear Algebra: Weaving Matrix Problems through the Weyr Form. Oxford University Press. 2011. pp. 44, 81–82.
H. Shapiro. "The Weyr characteristic". The American Mathematical Monthly. 1999. 106 (10): 919–929. doi:10.2307/2589746. JSTOR 2589746. S2CID 56072601.
Sergeichuk. "Canonical matrices for linear matrix problems". arXiv:0709.2485 [math.RT]. 2007.