Schur–Horn theorem explained

In mathematics, particularly linear algebra, the Schur–Horn theorem, named after Issai Schur and Alfred Horn, characterizes the diagonal of a Hermitian matrix with given eigenvalues. It has inspired investigations and substantial generalizations in the setting of symplectic geometry. Important generalizations include Kostant's convexity theorem, the Atiyah–Guillemin–Sternberg convexity theorem, and the Kirwan convexity theorem.

Statement

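Theorem. Let $d_1 \geq \cdots \geq d_N$ and $\lambda_1 \geq \cdots \geq \lambda_N$ be non-increasing sequences of real numbers. There is a Hermitian matrix with diagonal entries $d_1, \ldots, d_N$ and eigenvalues $\lambda_1, \ldots, \lambda_N$ if and only if the following conditions hold: $\begin{alignedat}{2}d_1 &\;\leq\;&& \lambda_1 \\[0.3ex]d_2 + d_1 &\;\leq\;&& \lambda_1 + \lambda_2 \\[0.3ex]\vdots &\;\leq\;&& \vdots \\[0.3ex]d_{N-1} + \cdots + d_2 + d_1 &\;\leq\;&& \lambda_1 + \lambda_2 + \cdots + \lambda_{N-1} \\[0.3ex]d_N + d_{N-1} + \cdots + d_2 + d_1 &\;=\;&& \lambda_1 + \lambda_2 + \cdots + \lambda_{N-1} + \lambda_N. \\[0.3ex]\end{alignedat}$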

The Schur–Horn theorem may thus be restated more succinctly and in plain English:

Schur–Horn theorem: Given any non-increasing real sequences of desired diagonal elements

d_1 \geq \cdots \geq d_N

and desired eigenvalues

λ_1 \geq \cdots \geq λ_N,

there exists a Hermitian matrix with these eigenvalues and diagonal elements if and only if these two sequences have the same sum and for every possible integer

1 \leq n \leq N,

the sum of the first

n

desired diagonal elements never exceeds the sum of the first

n

desired eigenvalues.

Reformulation allowing unordered diagonals and eigenvalues

Although this theorem requires that

d_1 \geq \cdots \geq d_N

and

λ_1 \geq \cdots \geq λ_N

be non-increasing, it is possible to reformulate this theorem without these assumptions.

We start with the assumption

λ_1 \geq \cdots \geq λ_N.

The left-hand side of the theorem's characterization (that is, "there exists a Hermitian matrix with these eigenvalues and diagonal elements") depends on the order of the desired diagonal elements

d_1, \ldots, d_N

(because changing their order would change the Hermitian matrix whose existence is in question) but it does not depend on the order of the desired eigenvalues

λ_1, \ldots, λ_N.

On the right-hand side of the characterization, only the values of

λ_1 + \cdots + λ_n

depend on the assumption

λ_1 \geq \cdots \geq λ_N.

Notice that this assumption means that the expression

λ_1 + \cdots + λ_n

is just notation for the sum of the

n

largest desired eigenvalues. Replacing the expression

λ_1 + \cdots + λ_n

with this written equivalent makes the assumption

λ_1 \geq \cdots \geq λ_N

completely unnecessary:

Schur–Horn theorem: Given any

N

desired real eigenvalues and a non-increasing real sequence of desired diagonal elements

d_1 \geq \cdots \geq d_N,

there exists a Hermitian matrix with these eigenvalues and diagonal elements if and only if these two sequences have the same sum and for every possible integer

1 \leq n \leq N,

the sum of the first

n

desired diagonal elements never exceeds the sum of the

n

largest desired eigenvalues.
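
The condition in this restatement is easy to check mechanically. The following is a minimal sketch in Python, not a reference implementation: the function name schur_horn_admissible and the tolerance parameter tol are illustrative assumptions. It sorts the eigenvalues so that "the first n" means "the n largest", and it also sorts the diagonal as a convenience, whereas the statement above assumes the diagonal is already non-increasing.

```python
from itertools import accumulate
from math import isclose


def schur_horn_admissible(diagonal, eigenvalues, tol=1e-9):
    """Check the Schur-Horn condition for real sequences given in any order.

    The name and the tolerance `tol` are illustrative choices, not part of the theorem.
    """
    if len(diagonal) != len(eigenvalues):
        return False
    # Non-increasing order, so the first n entries are the n largest.
    d = sorted(diagonal, reverse=True)
    lam = sorted(eigenvalues, reverse=True)
    d_sums = list(accumulate(d))
    lam_sums = list(accumulate(lam))
    if not d_sums:
        return True  # empty sequences satisfy the condition vacuously
    # The two sequences must have the same total sum (the trace).
    if not isclose(d_sums[-1], lam_sums[-1], abs_tol=tol):
        return False
    # Each partial sum of the diagonal must not exceed that of the eigenvalues.
    return all(ds <= ls + tol for ds, ls in zip(d_sums, lam_sums))
```

For instance, schur_horn_admissible([2, 2, 2], [1, 3, 2]) holds, because the partial sums 2, 4, 6 are bounded by 3, 5, 6, while the diagonal [4, 1, 1] fails against the same eigenvalues at the first inequality.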

Permutation polytope generated by a vector

The permutation polytope generated by

\tilde{x} = (x_1, x_2, \ldots, x_n) \in \Reals^n,

denoted by

\mathcal{K}_{\tilde{x}},

is defined as the convex hull of the set

\{(x_{\pi(1)}, x_{\pi(2)}, \ldots, x_{\pi(n)}) \in \Reals^n : \pi \in S_n\}.

Here

S_n

denotes the symmetric group on

\{1,2,\ldots,n\}.

In other words, the permutation polytope generated by

(x_1,...,x_n)

is the convex hull of the set of all points in

\Reals^n

that can be obtained by rearranging the coordinates of

(x_1,...,x_n).

The permutation polytope of

(1,1,2),

for instance, is the convex hull of the set

\{(1,1,2),(1,2,1),(2,1,1)\},

which in this case is the solid (filled) triangle whose vertices are the three points in this set. Notice, in particular, that rearranging the coordinates of

(x_1,...,x_n)

does not change the resulting permutation polytope; in other words, if a point

\tilde{y}

can be obtained from

\tilde{x}=(x_1,...,x_n)

by rearranging its coordinates, then

\mathcal{K}_{\tilde{y}} = \mathcal{K}_{\tilde{x}}.
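
The generating set of a permutation polytope can be enumerated directly for small vectors. The sketch below is only an illustration: the helper names polytope_generators and convex_combination are assumptions, and exact fractions are used so that the weights sum to exactly 1.

```python
from fractions import Fraction
from itertools import permutations


def polytope_generators(x):
    """Distinct rearrangements of x; their convex hull is the permutation polytope."""
    return sorted(set(permutations(x)))


def convex_combination(points, weights):
    """Weighted average of points with non-negative weights that sum to 1."""
    if any(w < 0 for w in weights) or sum(weights) != 1:
        raise ValueError("weights must be non-negative and sum to 1")
    dim = len(points[0])
    return tuple(sum(w * p[i] for w, p in zip(weights, points)) for i in range(dim))


generators = polytope_generators((1, 1, 2))
# Rearranging the input does not change the generating set.
same_generators = polytope_generators((2, 1, 1))
# Equal weights give the centroid of the triangle, a point of the polytope.
centroid = convex_combination(generators, [Fraction(1, 3)] * 3)
```

Here generators is the three-point set from the example above, and centroid is the point with all coordinates equal to 4/3.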

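The following lemma characterizes the permutation polytope of a vector in

\Reals^n.

Lemma. If

\tilde{x} = (x_1, \ldots, x_n), \tilde{y} = (y_1, \ldots, y_n) \in \Reals^n

satisfy

x_1 \geq \cdots \geq x_n, y_1 \geq \cdots \geq y_n,

then the following are equivalent:

(i) \tilde{y} \in \mathcal{K}_{\tilde{x}}.

(ii) y_1 \leq x_1, y_1 + y_2 \leq x_1 + x_2, \ldots, y_1 + \cdots + y_{n-1} \leq x_1 + \cdots + x_{n-1}, and y_1 + \cdots + y_n = x_1 + \cdots + x_n.

(iii) There are points \tilde{x}^{(1)}, \ldots, \tilde{x}^{(n)} in \mathcal{K}_{\tilde{x}} such that \tilde{x}^{(1)} = \tilde{x}, \tilde{x}^{(n)} = \tilde{y}, and \tilde{x}^{(k+1)} = t\tilde{x}^{(k)} + (1-t)\tau(\tilde{x}^{(k)}) for each k in \{1, \ldots, n-1\}, some transposition \tau in S_n, and some t in [0,1], depending on k.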

Reformulation of Schur–Horn theorem

In view of the equivalence of (i) and (ii) in the lemma mentioned above, one may reformulate the theorem in the following manner.

Theorem. Let

d_1,...,d_N

and

λ_1,...,λ_N

be real numbers. There is a Hermitian matrix with diagonal entries

d_1,...,d_N

and eigenvalues

λ_1,...,λ_N

if and only if the vector

(d_1, \ldots, d_N)

is in the permutation polytope generated by

(λ_1, \ldots, λ_N).

Note that in this formulation, one does not need to impose any ordering on the entries of the vectors

d_1,...,d_N

and

λ_1,...,λ_N.
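
For N = 2 the permutation polytope generated by (λ_1, λ_2) is the segment joining (λ_1, λ_2) and (λ_2, λ_1), and a matrix with a prescribed diagonal in that segment can be written down explicitly. The sketch below illustrates this special case only; the function names are assumptions, and the off-diagonal entry is chosen real, which is one of many valid choices.

```python
from math import sqrt


def two_by_two_with(d1, lam1, lam2):
    """Real symmetric 2x2 matrix with eigenvalues lam1, lam2 and top-left entry d1."""
    hi, lo = max(lam1, lam2), min(lam1, lam2)
    if not lo <= d1 <= hi:
        raise ValueError("d1 must lie between the two eigenvalues")
    d2 = lam1 + lam2 - d1              # forced by equality of the traces
    b = sqrt((hi - d1) * (d1 - lo))    # makes the determinant equal lam1 * lam2
    return [[d1, b], [b, d2]]


def eigenvalues_2x2_symmetric(m):
    """Eigenvalues (largest first) of a real symmetric 2x2 matrix."""
    (a, b), (_, c) = m
    mean = (a + c) / 2
    radius = sqrt(((a - c) / 2) ** 2 + b * b)
    return (mean + radius, mean - radius)


matrix = two_by_two_with(2, 3, 1)
spectrum = eigenvalues_2x2_symmetric(matrix)
```

With eigenvalues 3 and 1 and desired diagonal (2, 2), this produces the matrix with 2 on the diagonal and 1 off the diagonal, whose eigenvalues are again 3 and 1.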

Proof of the Schur–Horn theorem

Let

A = (a_{jk})

be an

n \times n

Hermitian matrix with eigenvalues

\{λ_i\}_{i=1}^n,

counted with multiplicity. Denote the diagonal of

A

by

\tilde{a},

thought of as a vector in

\Reals^n,

and the vector

(λ_1, λ_2, \ldots, λ_n)

by

\tilde{λ}.

Let

Λ

be the diagonal matrix having

λ_1, λ_2, \ldots, λ_n

on its diagonal.

(⇒)

A

may be written in the form

UΛU^{-1},

where

U

is a unitary matrix. Then

a_{ii} = \sum_{j=1}^n \lambda_j |u_{ij}|^2, \; i = 1, 2, \ldots, n.

Let

S = (s_{ij})

be the matrix defined by

s_{ij} = |u_{ij}|^2.

Since

U

is a unitary matrix,

S

is a doubly stochastic matrix and we have

\tilde{a} = S\tilde{λ}.

By the Birkhoff–von Neumann theorem,

S

can be written as a convex combination of permutation matrices. Thus

\tilde{a}

is in the permutation polytope generated by

\tilde{λ}.

This proves Schur's theorem.
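
The two facts used in this direction, that the matrix of squared moduli of a unitary matrix is doubly stochastic and that it maps the eigenvalue vector to the diagonal, can be checked numerically. The following sketch uses only the Python standard library; the helper names, the particular 3 x 3 unitary matrix (a product of two plane rotations), and the eigenvalues 3, 2, 1 are all illustrative assumptions.

```python
import cmath
from math import cos, sin


def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def conj_transpose(X):
    """Conjugate transpose, which is the inverse when X is unitary."""
    return [[complex(X[j][i]).conjugate() for j in range(len(X))]
            for i in range(len(X[0]))]


def plane_rotation(n, j, k, theta, phase):
    """n x n unitary that rotates coordinates j and k and fixes the others."""
    U = [[complex(i == m) for m in range(n)] for i in range(n)]
    w = cmath.exp(1j * phase)
    U[j][j] = complex(cos(theta))
    U[j][k] = -sin(theta) * w
    U[k][j] = sin(theta) * w.conjugate()
    U[k][k] = complex(cos(theta))
    return U


def diagonal_and_weights(U, lam):
    """Return the diagonal of U diag(lam) U^{-1} and the matrix S = (|u_ij|^2)."""
    n = len(lam)
    Lam = [[lam[i] if i == j else 0.0 for j in range(n)] for i in range(n)]
    A = matmul(matmul(U, Lam), conj_transpose(U))
    S = [[abs(U[i][j]) ** 2 for j in range(n)] for i in range(n)]
    return [A[i][i].real for i in range(n)], S


lam = [3.0, 2.0, 1.0]
U = matmul(plane_rotation(3, 0, 1, 0.7, 0.3), plane_rotation(3, 1, 2, 1.1, -0.5))
diag_A, S = diagonal_and_weights(U, lam)
# S applied to the eigenvalue vector should reproduce the diagonal of A.
S_lam = [sum(S[i][j] * lam[j] for j in range(3)) for i in range(3)]
```

Up to rounding, every row and column of S sums to 1 and diag_A agrees with S_lam entry by entry.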

(⇐) If

\tilde{a}

occurs as the diagonal of a Hermitian matrix with eigenvalues

\{λ_i\}_{i=1}^n,

then

t\tilde{a} + (1-t)\tau(\tilde{a})

also occurs as the diagonal of some Hermitian matrix with the same set of eigenvalues, for any transposition

\tau

in

S_n

and any t in [0,1]. One may prove that in the following manner.

Let

\xi

be a complex number of modulus

1

such that

\overline{\xi a_{jk}} = -\xi a_{jk}

and let

U

be a unitary matrix with

\xi\sqrt{t}, \sqrt{t}

in the

j,j

and

k,k

entries, respectively,

-\sqrt{1-t}, \xi\sqrt{1-t}

at the

j,k

and

k,j

entries, respectively,

1

at all diagonal entries other than

j,j

and

k,k,

and

0

at all other entries. Then

UAU^{-1}

has

ta_{jj} + (1-t)a_{kk}

at the

j,j

entry,

(1-t)a_{jj} + ta_{kk}

at the

k,k

entry, and

a_{ll}

at the

l,l

entry where

l \neq j,k.

Let

\tau

be the transposition of

\{1, 2, \ldots, n\}

that interchanges

j

and

k.

Then the diagonal of

UAU^{-1}

is

t\tilde{a} + (1-t)\tau(\tilde{a}).

Λ

is a Hermitian matrix with eigenvalues

\{λ_i\}_{i=1}^n.

Using the equivalence of (i) and (iii) in the lemma mentioned above, we see that any vector in the permutation polytope generated by

(λ_1, λ_2, \ldots, λ_n)

occurs as the diagonal of a Hermitian matrix with the prescribed eigenvalues. This proves Horn's theorem.
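
The unitary matrix used in this step can be built and tested numerically. The sketch below follows the construction in the proof with 0-based indices; the function name mixing_unitary, the sample Hermitian matrix, and the value t = 0.25 are illustrative assumptions, and ξ is taken to be 1 when the entry a_jk vanishes (any ξ of modulus 1 works in that case).

```python
from math import sqrt


def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def conj_transpose(X):
    """Conjugate transpose, which is the inverse when X is unitary."""
    return [[complex(X[j][i]).conjugate() for j in range(len(X))]
            for i in range(len(X[0]))]


def mixing_unitary(A, j, k, t):
    """Unitary U for which diag(U A U^{-1}) mixes a_jj and a_kk with weight t."""
    n = len(A)
    a_jk = complex(A[j][k])
    # xi has modulus 1 and makes xi * a_jk purely imaginary,
    # which is the condition conj(xi * a_jk) = -xi * a_jk.
    xi = 1j * a_jk.conjugate() / abs(a_jk) if a_jk != 0 else complex(1)
    U = [[complex(i == m) for m in range(n)] for i in range(n)]
    U[j][j] = xi * sqrt(t)
    U[k][k] = complex(sqrt(t))
    U[j][k] = complex(-sqrt(1 - t))
    U[k][j] = xi * sqrt(1 - t)
    return U


# A sample Hermitian matrix with diagonal (3, 1, -2).
A = [[3, 1 + 2j, 0.5],
     [1 - 2j, 1, -1j],
     [0.5, 1j, -2]]
t = 0.25
U = mixing_unitary(A, 0, 1, t)
B = matmul(matmul(U, A), conj_transpose(U))
new_diag = [B[i][i].real for i in range(3)]
# t * a + (1 - t) * tau(a), where tau swaps the first two coordinates.
expected = [t * 3 + (1 - t) * 1, (1 - t) * 3 + t * 1, -2]
```

Up to rounding, new_diag agrees with expected, and since B is unitarily similar to A it has the same eigenvalues.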

Symplectic geometry perspective

The Schur–Horn theorem may be viewed as a corollary of the Atiyah–Guillemin–Sternberg convexity theorem in the following manner. Let

\mathcal{U}(n)

denote the group of

n \times n

unitary matrices. Its Lie algebra, denoted by

\mathfrak{u}(n),

is the set of skew-Hermitian matrices. One may identify the dual space

\mathfrak{u}(n)^*

with the set of Hermitian matrices

\mathcal{H}(n)

via the linear isomorphism

\Psi : \mathcal{H}(n) \to \mathfrak{u}(n)^*

defined by

\Psi(A)(B) = \mathrm{tr}(iAB)

for

A \in \mathcal{H}(n), B \in \mathfrak{u}(n).

The unitary group

\mathcal{U}(n)

acts on

\mathcal{H}(n)

by conjugation and acts on

\mathfrak{u}(n)^*

by the coadjoint action. Under these actions,

\Psi

is a

\mathcal{U}(n)

-equivariant map, i.e. for every

U \in \mathcal{U}(n)

and every A \in \mathcal{H}(n),

\Psi(UAU^{-1}) = \mathrm{Ad}^*_U(\Psi(A)).

Let

\tilde{λ} = (λ_1, λ_2, \ldots, λ_n) \in \Reals^n

and

Λ \in \mathcal{H}(n)

denote the diagonal matrix with entries given by

\tilde{λ}.

Let

\mathcal{O}_{\tilde{λ}}

denote the orbit of

Λ

under the

\mathcal{U}(n)

-action, i.e. conjugation. Under the

\mathcal{U}(n)

-equivariant isomorphism

\Psi,

the symplectic structure on the corresponding coadjoint orbit may be brought onto

\mathcal{O}_{\tilde{λ}}.

Thus

\mathcal{O}_{\tilde{λ}}

is a Hamiltonian

\mathcal{U}(n)

-manifold.

Let

\mathbb{T}

denote the Cartan subgroup of

\mathcal{U}(n)

which consists of diagonal complex matrices with diagonal entries of modulus

1.

The Lie algebra

\mathfrak{t}

of

\mathbb{T}

consists of diagonal skew-Hermitian matrices and the dual space

\mathfrak{t}^*

consists of diagonal Hermitian matrices, under the isomorphism

\Psi.

In other words,

\mathfrak{t}

consists of diagonal matrices with purely imaginary entries and

\mathfrak{t}^*

consists of diagonal matrices with real entries. The inclusion map

\mathfrak{t} \hookrightarrow \mathfrak{u}(n)

induces a map

\Phi : \mathcal{H}(n) \cong \mathfrak{u}(n)^* \to \mathfrak{t}^*,

which projects a matrix

A

to the diagonal matrix with the same diagonal entries as

A.

The set

\mathcal{O}_{\tilde{λ}}

is a Hamiltonian

\mathbb{T}

-manifold, and the restriction of

\Phi

to this set is a moment map for this action.

By the Atiyah–Guillemin–Sternberg theorem,

\Phi(\mathcal{O}_{\tilde{λ}})

is a convex polytope, namely the convex hull of the images under \Phi of the points fixed by the action of the Cartan subgroup. A matrix

A \in \mathcal{H}(n)

is fixed under conjugation by every element of the Cartan subgroup

\mathbb{T}

if and only if

A

is diagonal. The only diagonal matrices in

\mathcal{O}_{\tilde{λ}}

are the ones with diagonal entries

λ_1, λ_2, \ldots, λ_n

in some order. Thus, these matrices generate the convex polytope

\Phi(\mathcal{O}_{\tilde{λ}}).

This is exactly the statement of the Schur–Horn theorem.

References

Schur, Issai, Über eine Klasse von Mittelbildungen mit Anwendungen auf die Determinantentheorie, Sitzungsber. Berl. Math. Ges. 22 (1923), 9–20.
Horn, Alfred, Doubly stochastic matrices and the diagonal of a rotation matrix, American Journal of Mathematics 76 (1954), 620–630.
Kadison, R. V.; Pedersen, G. K., Means and Convex Combinations of Unitary Operators, Math. Scand. 57 (1985), 249–266.
Kadison, R. V., The Pythagorean Theorem: I. The finite case, Proc. Natl. Acad. Sci. USA 99 (2002), no. 7, 4178–4184 (electronic).

External links

MathWorld
Terry Tao: 254A, Notes 3a: Eigenvalues and sums of Hermitian matrices
Sheela Devadas, Peter J. Haine, Keaton Stubis: The Schur-Horn Theorem