In matrix theory, the Perron–Frobenius theorem, proved by Oskar Perron (1907) and Georg Frobenius (1912), asserts that a real square matrix with positive entries has a unique eigenvalue of largest magnitude, that this eigenvalue is real, and that the corresponding eigenvector can be chosen to have strictly positive components. It also asserts a similar statement for certain classes of nonnegative matrices. This theorem has important applications to probability theory (ergodicity of Markov chains); to the theory of dynamical systems (subshifts of finite type); to economics (Okishio's theorem,[1] Hawkins–Simon condition); to demography (Leslie population age distribution model); to social networks (DeGroot learning process); to Internet search engines (PageRank);[2] and even to the ranking of American football teams. The first to discuss the ordering of players within tournaments using Perron–Frobenius eigenvectors was Edmund Landau.
Let positive and non-negative respectively describe matrices with exclusively positive real numbers as elements and matrices with exclusively non-negative real numbers as elements. The eigenvalues of a real square matrix A are complex numbers that make up the spectrum of the matrix. The exponential growth rate of the matrix powers $A^k$ as k → ∞ is controlled by the eigenvalue of A with the largest absolute value (modulus). The Perron–Frobenius theorem describes the properties of the leading eigenvalue and of the corresponding eigenvectors when A is a non-negative real square matrix. Early results were due to Oskar Perron (1907) and concerned positive matrices. Later, Georg Frobenius (1912) found their extension to certain classes of non-negative matrices.
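Let $A = (a_{ij})$ be an $n \times n$ positive matrix: $a_{ij} > 0$ for $1 \le i, j \le n$. Then the following statements hold.
1. There is a positive real number r, called the Perron root or the Perron–Frobenius eigenvalue, such that r is an eigenvalue of A and every other eigenvalue λ (possibly complex) is strictly smaller in absolute value, |λ| < r. Thus the spectral radius $\rho(A)$ is equal to r.
2. The Perron–Frobenius eigenvalue is simple: r is a simple root of the characteristic polynomial of A. Consequently, the eigenspace associated with r is one-dimensional.
3. There exists an eigenvector v of A with eigenvalue r all of whose components are positive. Likewise there is a positive left eigenvector w, with $w^T A = r w^T$.
4. There are no other non-negative eigenvectors except positive multiples of v.
5. $\lim_{k \to \infty} A^k / r^k = v w^T$, where the right and left eigenvectors v and w are normalized so that $w^T v = 1$. The matrix $v w^T$ is the projection onto the eigenspace corresponding to r, called the Perron projection.
6. Collatz–Wielandt formula: for all non-negative non-zero vectors x, let f(x) be the minimum value of $[Ax]_i / x_i$ taken over all those i such that $x_i \ne 0$. Then f is a real-valued function whose maximum is the Perron–Frobenius eigenvalue.
7. The Perron–Frobenius eigenvalue satisfies the inequalities $\min_i \sum_j a_{ij} \le r \le \max_i \sum_j a_{ij}$.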
All of these properties extend beyond strictly positive matrices to primitive matrices (see below). Facts 1–7 can be found in Meyer chapter 8 claims 8.2.11–15 page 667 and exercises 8.2.5,7,9 pages 668–669.
The left and right eigenvectors w and v are sometimes normalized so that the sum of their components is equal to 1; in this case, they are sometimes called stochastic eigenvectors. Often they are normalized so that the right eigenvector v sums to one, while $w^T v = 1$.
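The statements for positive matrices can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the matrix `A` is a hypothetical example chosen only for illustration, and the variable names are not part of the theorem.

```python
import numpy as np

# Hypothetical positive matrix, chosen only for illustration.
A = np.array([[2.0, 1.0, 1.0],
              [1.0, 3.0, 2.0],
              [3.0, 1.0, 4.0]])

eigvals, right_vecs = np.linalg.eig(A)
k = np.argmax(np.abs(eigvals))           # position of the eigenvalue of largest modulus
r = eigvals[k].real                      # Perron root
others = np.delete(np.abs(eigvals), k)   # moduli of the remaining eigenvalues

v = right_vecs[:, k].real
v = v / v.sum()                          # v sums to 1; this also fixes the overall sign

# Left eigenvectors of A are right eigenvectors of the transpose.
eigvals_t, left_vecs = np.linalg.eig(A.T)
w = left_vecs[:, np.argmax(np.abs(eigvals_t))].real
w = w / (w @ v)                          # normalize so that w^T v = 1

print("Perron root r:", r)
print("all other eigenvalues smaller in modulus:", bool(np.all(others < r)))
print("v strictly positive:", bool(np.all(v > 0)))
print("w strictly positive:", bool(np.all(w > 0)))
```

The normalization mirrors the convention described above: v sums to one and w is scaled so that the inner product of w and v equals 1.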
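There is an extension to matrices with non-negative entries. Since any non-negative matrix can be obtained as a limit of positive matrices, one obtains the existence of an eigenvector with non-negative components; the corresponding eigenvalue will be non-negative and greater than or equal, in absolute value, to all other eigenvalues. However, for the example $A = \left(\begin{smallmatrix}0&1\\ 1&0\end{smallmatrix}\right)$, the maximum eigenvalue r = 1 has the same absolute value as the other eigenvalue −1, while for $A = \left(\begin{smallmatrix}0&1\\ 0&0\end{smallmatrix}\right)$, the maximum eigenvalue is r = 0, which is not a simple root of the characteristic polynomial, and the corresponding eigenvector (1, 0) is not strictly positive.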
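However, Frobenius found a special subclass of non-negative matrices, the irreducible matrices, for which a non-trivial generalization is possible. For such a matrix, although the eigenvalues attaining the maximal absolute value might not be unique, their structure is under control: they have the form $\omega r$, where $r$ is a real strictly positive eigenvalue and $\omega$ ranges over the complex h-th roots of 1 for some positive integer h, called the period of the matrix. The eigenvector corresponding to $r$ has strictly positive components, in contrast with the general case of non-negative matrices, where the components are only non-negative.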
Let A be an n × n square matrix over a field F. The matrix A is irreducible if any of the following equivalent properties holds.
Definition 1: A does not have non-trivial invariant coordinate subspaces. Here a non-trivial coordinate subspace means a linear subspace spanned by any proper subset of standard basis vectors of $F^n$. More explicitly, for any linear subspace spanned by standard basis vectors $e_{i_1}, \ldots, e_{i_k}$, 0 < k < n, its image under the action of A is not contained in the same subspace.
Definition 2: A cannot be conjugated into block upper triangular form by a permutation matrix P:
$$PAP^{-1} \ne \begin{pmatrix} E & F \\ O & G \end{pmatrix},$$
where E and G are non-trivial (i.e. of size greater than zero) square matrices.
Definition 3: One can associate with a matrix A a certain directed graph $G_A$. It has n vertices labeled 1,...,n, and there is an edge from vertex i to vertex j precisely when $a_{ij} \ne 0$. Then the matrix A is irreducible if and only if its associated graph $G_A$ is strongly connected.
If F is the field of real or complex numbers, then we also have the following condition.
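Definition 4: The group representation of $(\mathbb{R}, +)$ on $\mathbb{R}^n$ or of $(\mathbb{C}, +)$ on $\mathbb{C}^n$ given by $t \mapsto \exp(tA)$ has no non-trivial invariant coordinate subspaces. (By comparison, this would be an irreducible representation if there were no non-trivial invariant subspaces at all, not only coordinate subspaces.)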
A matrix is reducible if it is not irreducible.
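Definition 3 translates directly into a test. The following is a minimal sketch, assuming NumPy; the function name `is_irreducible` is hypothetical. It checks strong connectivity by searching from one vertex in the graph of A and in the reversed graph.

```python
import numpy as np

def is_irreducible(A):
    """Return True if the directed graph of A (edge i -> j iff A[i, j] != 0)
    is strongly connected."""
    adjacency = np.asarray(A) != 0
    n = adjacency.shape[0]

    def reaches_all(adj):
        # Depth-first search from vertex 0 along the edges of adj.
        seen = {0}
        stack = [0]
        while stack:
            i = stack.pop()
            for j in np.flatnonzero(adj[i]):
                j = int(j)
                if j not in seen:
                    seen.add(j)
                    stack.append(j)
        return len(seen) == n

    # Strongly connected iff vertex 0 reaches every vertex both in the
    # graph and in the graph with all edges reversed.
    return reaches_all(adjacency) and reaches_all(adjacency.T)

cycle = np.array([[0, 1], [1, 0]])        # two-cycle: irreducible
nilpotent = np.array([[0, 1], [0, 0]])    # strictly upper triangular: reducible
print(is_irreducible(cycle), is_irreducible(nilpotent))
```

Only the zero pattern of the matrix is used, in line with the definitions above, which do not depend on the sizes of the non-zero entries.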
A real matrix A is primitive if it is non-negative and its m-th power is positive for some natural number m (i.e. all entries of $A^m$ are positive).
Let A be real and non-negative. Fix an index i and define the period of index i to be the greatest common divisor of all natural numbers m such that $(A^m)_{ii} > 0$. When A is irreducible, the period of every index is the same and is called the period of A. In fact, when A is irreducible, the period can be defined as the greatest common divisor of the lengths of the closed directed paths in $G_A$ (see Kitchens page 16). The period is also called the index of imprimitivity (Meyer page 674) or the order of cyclicity. If the period is 1, A is aperiodic. It can be proved that primitive matrices are the same as irreducible aperiodic non-negative matrices.
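The definition of the period can be evaluated directly from the zero patterns of the matrix powers. The following is a minimal sketch, assuming NumPy and an irreducible non-negative input; the function name `period` and the cutoff of 3n powers are choices made for this illustration (for an n × n irreducible matrix, closed walks of length at most 3n already determine the greatest common divisor).

```python
import numpy as np
from math import gcd

def period(A, i=0):
    """Period of index i: gcd of all m with (A^m)[i, i] > 0.

    Assumes A is non-negative and irreducible. Powers up to 3n are
    examined, a deliberately generous cutoff for an n x n matrix.
    """
    pattern = (np.asarray(A) > 0).astype(np.int64)   # zero pattern of A
    n = pattern.shape[0]
    power = np.eye(n, dtype=np.int64)
    g = 0
    for m in range(1, 3 * n + 1):
        # Zero pattern of A^m; thresholding keeps the entries at 0 or 1.
        power = ((power @ pattern) > 0).astype(np.int64)
        if power[i, i]:
            g = gcd(g, m)
    return g

two_cycle = np.array([[0, 1], [1, 0]])
three_cycle = np.array([[0, 1, 0], [0, 0, 1], [1, 0, 0]])
complete = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]])
print(period(two_cycle), period(three_cycle), period(complete))
```

A result of 1 indicates an aperiodic matrix, which for an irreducible non-negative matrix is the same as being primitive.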
All statements of the Perron–Frobenius theorem for positive matrices remain true for primitive matrices. The same statements also hold for a non-negative irreducible matrix, except that it may possess several eigenvalues whose absolute value is equal to its spectral radius, so the statements need to be correspondingly modified. In fact the number of such eigenvalues is equal to the period.
Results for non-negative matrices were first obtained by Frobenius in 1912.
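Let A be an irreducible non-negative $N \times N$ matrix with period h and spectral radius $\rho(A) = r$. Then the following statements hold.
1. The number $r \in \mathbb{R}^+$ is a positive real number and it is an eigenvalue of the matrix A. It is called the Perron–Frobenius eigenvalue.
2. The Perron–Frobenius eigenvalue r is simple. Both right and left eigenspaces associated with r are one-dimensional.
3. A has both a right and a left eigenvector, respectively v and w, with eigenvalue r and whose components are all positive. Moreover, the only eigenvectors whose components are all positive are those associated with the eigenvalue r.
4. The matrix A has exactly h (where h is the period) complex eigenvalues with absolute value r. Each of them is a simple root of the characteristic polynomial and is the product of r with an h-th root of unity.
5. Let $\omega = 2\pi/h$. Then the matrix A is similar to $e^{i\omega}A$; consequently the spectrum of A is invariant under multiplication by $e^{i\omega}$ (i.e. under rotations of the complex plane by the angle $\omega$).
6. If $h > 1$ then there exists a permutation matrix P such that
$$PAP^{-1} = \begin{pmatrix} O & A_1 & O & O & \ldots & O \\ O & O & A_2 & O & \ldots & O \\ \vdots & \vdots & \vdots & \vdots & & \vdots \\ O & O & O & O & \ldots & A_{h-1} \\ A_h & O & O & O & \ldots & O \end{pmatrix},$$
where O denotes a zero matrix and the blocks along the main diagonal are square matrices.
7. Collatz–Wielandt formula: for all non-negative non-zero vectors x, let f(x) be the minimum value of $[Ax]_i / x_i$ taken over all those i such that $x_i \ne 0$. Then f is a real-valued function whose maximum is the Perron–Frobenius eigenvalue.
8. The Perron–Frobenius eigenvalue satisfies the inequalities
$$\min_i \sum_j a_{ij} \le r \le \max_i \sum_j a_{ij}.$$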
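The example $A = \left(\begin{smallmatrix} 0&0&1\\ 0&0&1\\ 1&1&0 \end{smallmatrix}\right)$ shows that the (square) zero matrices along the diagonal may be of different sizes, the blocks $A_j$ need not be square, and h need not divide n.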
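The spectral claims for irreducible matrices can be illustrated on the 3 × 3 example above. The following is a minimal sketch, assuming NumPy; it counts the eigenvalues of maximal modulus, checks that they are r times roots of unity, and compares r with the row sums.

```python
import numpy as np

# The 3 x 3 example matrix from the text (irreducible, period 2).
A = np.array([[0.0, 0.0, 1.0],
              [0.0, 0.0, 1.0],
              [1.0, 1.0, 0.0]])

eigvals = np.linalg.eigvals(A)
r = np.abs(eigvals).max()                               # spectral radius
peripheral = eigvals[np.isclose(np.abs(eigvals), r)]    # eigenvalues of modulus r
h = len(peripheral)                                     # should equal the period

# Each peripheral eigenvalue divided by r should be an h-th root of unity.
roots_of_unity = bool(np.allclose((peripheral / r) ** h, 1.0))

row_sums = A.sum(axis=1)
print("r =", r, " h =", h, " roots of unity:", roots_of_unity)
print("row-sum bounds:", row_sums.min(), "<= r <=", row_sums.max())
```

Here the eigenvalues are the two square roots of 2 with opposite signs, together with 0, so the matrix has two eigenvalues of maximal modulus even though its size is 3.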
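Let A be an irreducible non-negative matrix with Perron–Frobenius eigenvalue r. Then:
- If some power $A^q$ is reducible, then it is completely reducible, i.e. for some permutation matrix P,
$$PA^qP^{-1} = \begin{pmatrix} A_1 & O & O & \dots & O \\ O & A_2 & O & \dots & O \\ \vdots & \vdots & \vdots & & \vdots \\ O & O & O & \dots & A_d \end{pmatrix},$$
where the $A_i$ are irreducible matrices having the same maximal eigenvalue.
- Cesàro averages: $\lim_{k \to \infty} \frac{1}{k} \sum_{i=0}^{k} A^i / r^i = v w^T$, where the right and left eigenvectors v and w of A are normalized so that $w^T v = 1$.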
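A matrix A is primitive provided it is non-negative and $A^m$ is positive for some m, and hence $A^k$ is positive for all k ≥ m. To check primitivity, one needs a bound on how large the minimal such m can be, depending on the size of A: if A is a non-negative primitive matrix of size n, then $A^{n^2 - 2n + 2}$ is positive. This bound is the best possible, since for the matrix
$$M = \left(\begin{smallmatrix} 0&1&0&0& \cdots &0\\ 0&0&1&0& \cdots &0\\ 0&0&0&1& \cdots &0\\ \vdots&\vdots&\vdots&\vdots&&\vdots\\ 0&0&0&0& \cdots &1\\ 1&1&0&0& \cdots &0 \end{smallmatrix}\right)$$
the power $M^k$ is not positive for any $k < n^2 - 2n + 2$; indeed $(M^{n^2 - 2n + 1})_{1,1} = 0$.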
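A primitivity test along these lines can be sketched as follows, assuming NumPy and using Wielandt's bound $n^2 - 2n + 2$ on the exponent as the cutoff. The function names `primitivity_exponent` and `wielandt_matrix` are hypothetical, introduced only for this illustration.

```python
import numpy as np

def primitivity_exponent(A):
    """Smallest m with A^m entrywise positive, or None if A is not primitive.

    Only powers up to n^2 - 2n + 2 are examined.
    """
    pattern = (np.asarray(A) > 0).astype(np.int64)   # zero pattern of A
    n = pattern.shape[0]
    power = np.eye(n, dtype=np.int64)
    for m in range(1, n * n - 2 * n + 3):
        # Zero pattern of A^m; thresholding keeps the entries at 0 or 1.
        power = ((power @ pattern) > 0).astype(np.int64)
        if power.all():
            return m
    return None

def wielandt_matrix(n):
    """Matrix M from the text: ones on the superdiagonal, last row (1, 1, 0, ..., 0)."""
    M = np.zeros((n, n), dtype=np.int64)
    M[np.arange(n - 1), np.arange(1, n)] = 1
    M[n - 1, 0] = 1
    M[n - 1, 1] = 1
    return M

for n in (2, 3, 4):
    print(n, primitivity_exponent(wielandt_matrix(n)), n * n - 2 * n + 2)
```

Working with zero patterns rather than the actual powers avoids integer overflow and is enough here, because positivity of an entry of a power of a non-negative matrix depends only on which entries of the matrix are non-zero.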
Numerous books have been written on the subject of non-negative matrices, and Perron–Frobenius theory is invariably a central feature. The examples given below only scratch the surface of its vast application domain.
The Perron–Frobenius theorem does not apply directly to non-negative matrices. Nevertheless, any reducible square matrix A may be written in upper-triangular block form (known as the normal form of a reducible matrix)
$$PAP^{-1} = \left(\begin{smallmatrix} B_1&*&*& \cdots &*\\ 0&B_2&*& \cdots &*\\ \vdots&\vdots&\vdots&&\vdots\\ 0&0&0& \cdots &*\\ 0&0&0& \cdots &B_h \end{smallmatrix}\right)$$
where P is a permutation matrix and each $B_i$ is a square matrix that is either irreducible or zero. Now if A is non-negative then so too is each block of $PAP^{-1}$; moreover the spectrum of A is just the union of the spectra of the $B_i$.
The invertibility of A can also be studied. The inverse of $PAP^{-1}$ (if it exists) must have diagonal blocks of the form $B_i^{-1}$, so if any $B_i$ isn't invertible then neither is $PAP^{-1}$ or A. Conversely let D be the block-diagonal matrix corresponding to $PAP^{-1}$, in other words $PAP^{-1}$ with the asterisks zeroised. If each $B_i$ is invertible then so is D, and $D^{-1}(PAP^{-1})$ is equal to the identity plus a nilpotent matrix. But such a matrix is always invertible (if $N^k = 0$ the inverse of 1 − N is $1 + N + N^2 + \cdots + N^{k-1}$), so $PAP^{-1}$ and A are both invertible.
Therefore, many of the spectral properties of A may be deduced by applying the theorem to the irreducible $B_i$. For example, the Perron root is the maximum of the $\rho(B_i)$. While there will still be eigenvectors with non-negative components, it is quite possible that none of these will be positive.
A row (column) stochastic matrix is a square matrix each of whose rows (columns) consists of non-negative real numbers whose sum is unity. The theorem cannot be applied directly to such matrices because they need not be irreducible.
If A is row-stochastic then the column vector with each entry 1 is an eigenvector corresponding to the eigenvalue 1, which is also ρ(A) by the remark above. It might not be the only eigenvalue on the unit circle, and the associated eigenspace can be multi-dimensional. If A is row-stochastic and irreducible then the Perron projection is also row-stochastic and all its rows are equal.
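The remarks on stochastic matrices can be illustrated numerically. The following is a minimal sketch, assuming NumPy; the transition matrix `A` is a hypothetical irreducible (in fact primitive) row-stochastic example, and the name `stationary` is used here for the left eigenvector normalized to sum to 1.

```python
import numpy as np

# Hypothetical irreducible row-stochastic matrix (each row sums to 1).
A = np.array([[0.50, 0.50, 0.00],
              [0.25, 0.50, 0.25],
              [0.00, 0.50, 0.50]])

ones = np.ones(3)
is_eigenvector = bool(np.allclose(A @ ones, ones))   # A 1 = 1

# Left eigenvector for the eigenvalue 1, normalized to sum to 1.
eigvals, left_vecs = np.linalg.eig(A.T)
stationary = left_vecs[:, np.argmin(np.abs(eigvals - 1.0))].real
stationary = stationary / stationary.sum()

# Perron projection v w^T with v = ones and w = stationary, so that w^T v = 1.
P = np.outer(ones, stationary)
P_limit = np.linalg.matrix_power(A, 200)             # powers of A converge to P here

print("all-ones vector is an eigenvector:", is_eigenvector)
print("left eigenvector:", stationary)
print("max |A^200 - P|:", np.abs(P_limit - P).max())
```

Every row of `P` equals the normalized left eigenvector, so `P` is itself row-stochastic with equal rows, as stated above. The convergence of the powers relies on this particular example being primitive.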
The theorem has particular use in algebraic graph theory. The "underlying graph" of a nonnegative n-square matrix is the graph with vertices numbered 1, ..., n and arc ij if and only if Aij ≠ 0. If the underlying graph of such a matrix is strongly connected, then the matrix is irreducible, and thus the theorem applies. In particular, the adjacency matrix of a strongly connected graph is irreducible.[9] [10]
The theorem has a natural interpretation in the theory of finite Markov chains (where it is the matrix-theoretic equivalent of the convergence of an irreducible finite Markov chain to its stationary distribution, formulated in terms of the transition matrix of the chain; see, for example, the article on the subshift of finite type).
See main article: Krein–Rutman theorem. More generally, it can be extended to the case of non-negative compact operators, which, in many ways, resemble finite-dimensional matrices. These are commonly studied in physics, under the name of transfer operators, or sometimes Ruelle–Perron–Frobenius operators (after David Ruelle). In this case, the leading eigenvalue corresponds to the thermodynamic equilibrium of a dynamical system, and the lesser eigenvalues to the decay modes of a system that is not in equilibrium. Thus, the theory offers a way of discovering the arrow of time in what would otherwise appear to be reversible, deterministic dynamical processes, when examined from the point of view of point-set topology.[11]
A common thread in many proofs is the Brouwer fixed point theorem. Another popular method is that of Wielandt (1950). He used the Collatz–Wielandt formula described above to extend and clarify Frobenius's work. Another proof is based on the spectral theory from which part of the arguments are borrowed.
If A is a positive (or more generally primitive) matrix, then there exists a real positive eigenvalue r (Perron–Frobenius eigenvalue or Perron root), which is strictly greater in absolute value than all other eigenvalues, hence r is the spectral radius of A.
This statement does not hold for general non-negative irreducible matrices, which have h eigenvalues with the same absolute value as r, where h is the period of A.
Let A be a positive matrix, and assume that its spectral radius ρ(A) = 1 (otherwise consider A/ρ(A)). Hence, there exists an eigenvalue λ on the unit circle, and all the other eigenvalues are less than or equal to 1 in absolute value. Suppose that an eigenvalue λ ≠ 1 also falls on the unit circle. Then there exists a positive integer m such that $A^m$ is a positive matrix and the real part of $\lambda^m$ is negative. Let ε be half the smallest diagonal entry of $A^m$ and set $T = A^m - \varepsilon I$, which is yet another positive matrix. Moreover, if Ax = λx then $A^m x = \lambda^m x$, thus $\lambda^m - \varepsilon$ is an eigenvalue of T. Because of the choice of m this point lies outside the unit disk, and consequently ρ(T) > 1. On the other hand, all the entries in T are positive and less than or equal to those in $A^m$, so by Gelfand's formula $\rho(T) \le \rho(A^m) \le \rho(A)^m = 1$. This contradiction means that λ = 1 and there can be no other eigenvalues on the unit circle.
Absolutely the same arguments can be applied to the case of primitive matrices; we just need to mention the following simple lemma, which clarifies the properties of primitive matrices.
Given a non-negative A, assume there exists m such that $A^m$ is positive; then $A^{m+1}, A^{m+2}, A^{m+3}, \ldots$ are all positive.
$A^{m+1} = AA^m$, so it can have a zero element only if some row of A is entirely zero, but in this case the same row of $A^m$ would be zero.
Applying the same arguments as above to primitive matrices proves the main claim.
For a positive (or more generally irreducible non-negative) matrix A the dominant eigenvector is real and strictly positive (for a general non-negative A it is only non-negative).
This can be established using the power method, which states that for a sufficiently generic (in the sense below) matrix A the sequence of vectors $b_{k+1} = Ab_k / |Ab_k|$ converges to the eigenvector with the maximum eigenvalue. (The initial vector $b_0$ can be chosen arbitrarily except for some measure zero set.) Starting with a non-negative vector $b_0$ produces a sequence of non-negative vectors $b_k$. Hence the limiting vector is also non-negative. By the power method this limiting vector is the dominant eigenvector for A, proving the assertion. The corresponding eigenvalue is non-negative.
The proof requires two additional arguments. First, the power method converges for matrices which do not have several eigenvalues of the same absolute value as the maximal one. The previous section's argument guarantees this.
Second, strict positivity of all of the components of the eigenvector must be ensured in the case of irreducible matrices. This follows from the following fact, which is of independent interest:
Lemma: given a positive (or more generally irreducible non-negative) matrix A and v as any non-negative eigenvector for A, then it is necessarily strictly positive and the corresponding eigenvalue is also strictly positive.
Proof. One of the definitions of irreducibility for non-negative matrices is that for all indices i, j there exists m such that $(A^m)_{ij}$ is strictly positive. Let v be a non-negative eigenvector with eigenvalue r, and suppose that at least one of its components, say the i-th, is strictly positive. Then the eigenvalue is strictly positive: choosing n such that $(A^n)_{ii} > 0$ gives $r^n v_i = (A^n v)_i \ge (A^n)_{ii} v_i > 0$, hence r > 0. Strict positivity of the eigenvector follows in the same way: for any index j, choosing m such that $(A^m)_{ji} > 0$ gives $r^m v_j = (A^m v)_j \ge (A^m)_{ji} v_i > 0$, hence $v_j$ is strictly positive, i.e., the eigenvector is strictly positive.
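The power method used in this argument can be sketched as follows, assuming NumPy. The function name `power_iteration`, the choice of the 1-norm for the normalization, and the stopping tolerance are assumptions made for this illustration; convergence is assumed rather than checked, which is justified when the matrix is primitive.

```python
import numpy as np

def power_iteration(A, iterations=500, tol=1e-12):
    """Iterate b <- A b / ||A b||_1 starting from a positive vector.

    Returns (estimate of the dominant eigenvalue, estimate of its eigenvector).
    Convergence is assumed, as is the case for primitive A.
    """
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    b = np.full(n, 1.0 / n)                  # positive b_0 with entries summing to 1
    for _ in range(iterations):
        Ab = A @ b
        b_next = Ab / np.abs(Ab).sum()
        converged = np.abs(b_next - b).max() < tol
        b = b_next
        if converged:
            break
    # b sums to 1, so for an eigenvector with eigenvalue r the sum of A b is r.
    return (A @ b).sum(), b

A = np.array([[1.0, 2.0],
              [3.0, 2.0]])                   # positive matrix with eigenvalues 4 and -1
r, v = power_iteration(A)
print("dominant eigenvalue:", r)
print("dominant eigenvector:", v)
```

Because the starting vector and the matrix are non-negative, every iterate is non-negative, which is the observation the proof relies on.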
This section proves that the Perron–Frobenius eigenvalue is a simple root of the characteristic polynomial of the matrix. Hence the eigenspace associated to Perron–Frobenius eigenvalue r is one-dimensional. The arguments here are close to those in Meyer.
Let v be a strictly positive eigenvector corresponding to r, and suppose w is another eigenvector with the same eigenvalue that is not a multiple of v. (The vectors v and w can be chosen to be real, because A and r are both real, so the null space of A − r has a basis consisting of real vectors.) Assume at least one of the components of w is positive (otherwise multiply w by −1). Take the maximal possible α such that u = v − αw is non-negative; then one of the components of u is zero, otherwise α is not maximal. The vector u is an eigenvector. It is non-negative, hence by the lemma of the previous section it is strictly positive. On the other hand, at least one component of u is zero. The contradiction implies that w does not exist.
Case: There are no Jordan cells corresponding to the Perron–Frobenius eigenvalue r and all other eigenvalues which have the same absolute value.
If there is a Jordan cell, then the infinity norm $\|(A/r)^k\|_\infty$ tends to infinity as k → ∞, but that contradicts the existence of the positive eigenvector.
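Assume r = 1 (otherwise consider A/r). Let v be a strictly positive Perron–Frobenius eigenvector, so that Av = v. Then
$$\|v\|_\infty = \|A^k v\|_\infty \ge \|A^k\|_\infty \min_i(v_i) \quad\Rightarrow\quad \|A^k\|_\infty \le \|v\|_\infty / \min_i(v_i),$$
so $\|A^k\|_\infty$ is bounded for all k. On the other hand, a Jordan cell for an eigenvalue λ with |λ| = 1 would make the powers unbounded; for a 2 × 2 cell,
$$J^k = \begin{pmatrix} \lambda & 1 \\ 0 & \lambda \end{pmatrix}^k = \begin{pmatrix} \lambda^k & k\lambda^{k-1} \\ 0 & \lambda^k \end{pmatrix},$$
and the entry $k\lambda^{k-1}$ has absolute value k, which tends to infinity. The resulting contradiction shows that there are no Jordan cells for the corresponding eigenvalues.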
Combining the two claims above reveals that the Perron–Frobenius eigenvalue r is a simple root of the characteristic polynomial. In the case of nonprimitive matrices, there exist other eigenvalues which have the same absolute value as r. The same claim is true for them, but requires more work.
Given positive (or more generally irreducible non-negative matrix) A, the Perron–Frobenius eigenvector is the only (up to multiplication by constant) non-negative eigenvector for A.
Other eigenvectors must contain negative or complex components since eigenvectors for different eigenvalues are orthogonal in some sense, but two positive eigenvectors cannot be orthogonal, so they must correspond to the same eigenvalue, but the eigenspace for the Perron–Frobenius is one-dimensional.
Assume there exists an eigenpair (λ, y) for A such that the vector y is positive, and let (r, x) be an eigenpair where x is the left Perron–Frobenius eigenvector for A (i.e. an eigenvector for $A^T$). Then $r x^T y = (x^T A) y = x^T (Ay) = \lambda x^T y$, and also $x^T y > 0$, so one has r = λ. Since the eigenspace for the Perron–Frobenius eigenvalue r is one-dimensional, the non-negative eigenvector y is a multiple of the Perron–Frobenius one.
Given a positive (or more generally irreducible non-negative) matrix A, one defines the function f on the set of all non-negative non-zero vectors x such that f(x) is the minimum value of $[Ax]_i / x_i$ taken over all those i such that $x_i \ne 0$. Then f is a real-valued function whose maximum is the Perron–Frobenius eigenvalue r.
For the proof we denote the maximum of f by R; it must be shown that R = r. Inserting the Perron–Frobenius eigenvector v into f, we obtain f(v) = r and conclude r ≤ R. For the opposite inequality, we consider an arbitrary non-negative vector x and let ξ = f(x). The definition of f gives 0 ≤ ξx ≤ Ax (componentwise). Now we use the positive left eigenvector w of A for the Perron–Frobenius eigenvalue r, which satisfies $w^T A = r w^T$; then $\xi w^T x = w^T \xi x \le w^T (Ax) = (w^T A)x = r w^T x$. Since $w^T x > 0$, it follows that f(x) = ξ ≤ r, which implies R ≤ r.
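The Collatz–Wielandt function is easy to evaluate numerically. The following is a minimal sketch, assuming NumPy; the function name `collatz_wielandt`, the example matrix, and the random sampling are assumptions made for this illustration, and sampling of course only illustrates the bound rather than proving it.

```python
import numpy as np

def collatz_wielandt(A, x):
    """f(x) = min over i with x_i != 0 of (A x)_i / x_i, for non-negative non-zero x."""
    A = np.asarray(A, dtype=float)
    x = np.asarray(x, dtype=float)
    support = x != 0
    return ((A @ x)[support] / x[support]).min()

A = np.array([[1.0, 2.0],
              [3.0, 2.0]])              # positive matrix with Perron root 4
r = np.abs(np.linalg.eigvals(A)).max()
v = np.array([2.0, 3.0])                # right Perron eigenvector: A v = 4 v

rng = np.random.default_rng(0)
samples = rng.random((1000, 2))         # random non-negative vectors
values = np.array([collatz_wielandt(A, x) for x in samples])

print("f(v) =", collatz_wielandt(A, v))
print("largest sampled value of f:", values.max(), " r =", r)
```

The sampled values never exceed r, and the value at the Perron eigenvector attains it, in line with the two inequalities of the proof.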
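Let A be a positive (or more generally, primitive) matrix, and let r be its Perron–Frobenius eigenvalue.
1. There exists a limit of $A^k/r^k$ for k → ∞; denote it by P.
2. P is a projection operator: $P^2 = P$, which commutes with A: AP = PA.
3. The image of P is one-dimensional and spanned by the Perron–Frobenius eigenvector v (respectively, the image of $P^T$ is spanned by the Perron–Frobenius eigenvector w of $A^T$).
4. $P = vw^T$, where v and w are normalized such that $w^T v = 1$.
5. Hence P is a positive operator.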
Hence P is a spectral projection for the Perron–Frobenius eigenvalue r, and is called the Perron projection. The above assertion is not true for general non-negative irreducible matrices.
Actually the claims above (except claim 5) are valid for any matrix M such that there exists an eigenvalue r which is strictly greater than the other eigenvalues in absolute value and is the simple root of the characteristic polynomial. (These requirements hold for primitive matrices as above).
Given that M is diagonalizable, M is conjugate to a diagonal matrix with eigenvalues $r_1, \ldots, r_n$ on the diagonal (denote $r_1 = r$). The matrix $M^k/r^k$ is then conjugate to the diagonal matrix with entries $(1, (r_2/r)^k, \ldots, (r_n/r)^k)$, which tends to $(1, 0, 0, \ldots, 0)$ as k → ∞, so the limit exists. The same method works for general M (without assuming that M is diagonalizable).
The projection and commutativity properties are elementary corollaries of the definition: $M M^k/r^k = M^k/r^k M$; $P^2 = \lim M^{2k}/r^{2k} = P$. The third fact is also elementary: $M(Pu) = M \lim M^k/r^k u = \lim r M^{k+1}/r^{k+1} u$, so taking the limit yields M(Pu) = r(Pu). Hence the image of P lies in the r-eigenspace for M, which is one-dimensional by the assumptions.
Denote by v an r-eigenvector for M (and by w one for $M^T$). The columns of P are multiples of v, because the image of P is spanned by it; likewise, the rows of P are multiples of $w^T$. So P takes the form $a v w^T$ for some scalar a. Hence its trace equals $a w^T v$. The trace of a projector equals the dimension of its image, which was shown above to be at most one. From the definition one sees that P acts as the identity on the r-eigenvector for M, so the image is exactly one-dimensional. Choosing the normalization $w^T v = 1$ therefore gives a = 1, that is, $P = vw^T$.
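The properties of the Perron projection can be verified on a small example. The following is a minimal sketch, assuming NumPy; the matrix and its eigenvectors are a hypothetical example worked out by hand (eigenvalues 4 and −1), and the power 80 is an arbitrary choice that is large enough here.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 2.0]])               # positive matrix with eigenvalues 4 and -1
r = 4.0
v = np.array([2.0, 3.0])                 # right eigenvector: A v = 4 v
w = np.array([1.0, 1.0])                 # left eigenvector:  w^T A = 4 w^T
w = w / (w @ v)                          # normalize so that w^T v = 1

P = np.outer(v, w)                       # candidate Perron projection v w^T
P_limit = np.linalg.matrix_power(A / r, 80)

print("limit of A^k / r^k agrees with v w^T:", bool(np.allclose(P_limit, P)))
print("P is idempotent:", bool(np.allclose(P @ P, P)))
print("A P = P A = r P:", bool(np.allclose(A @ P, r * P) and np.allclose(P @ A, r * P)))
print("trace of P:", np.trace(P))
```

The trace equal to 1 reflects the fact used in the proof: the trace of a projector is the dimension of its image.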
For any non-negative matrix A its Perron–Frobenius eigenvalue r satisfies the inequality:
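$$r \le \max_i \sum_j a_{ij}.$$
This is not specific to non-negative matrices: for any matrix A with an eigenvalue $\lambda$ it is true that $|\lambda| \le \max_i \sum_j |a_{ij}|$. A direct proof uses matrix norms. Any matrix induced norm satisfies the inequality $\|A\| \ge |\lambda|$ for any eigenvalue $\lambda$ because, if $x$ is a corresponding eigenvector, $\|A\| \ge |Ax|/|x| = |\lambda x|/|x| = |\lambda|$. The infinity norm of a matrix is the maximum of its absolute row sums: $\|A\|_\infty = \max_{1 \le i \le n} \sum_{j=1}^{n} |a_{ij}|$. Hence the desired inequality is exactly $\|A\|_\infty \ge |\lambda|$ applied to the non-negative matrix A.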
Another inequality is:
$$\min_i \sum_j a_{ij} \le r.$$
This fact is specific to non-negative matrices; for general matrices there is nothing similar. Given that A is positive (not just non-negative), then there exists a positive eigenvector w such that Aw = rw and the smallest component of w (say wi) is 1. Then r = (Aw)i ≥ the sum of the numbers in row i of A. Thus the minimum row sum gives a lower bound for r and this observation can be extended to all non-negative matrices by continuity.
Another way to argue it is via the Collatz-Wielandt formula. One takes the vector x = (1, 1, ..., 1) and immediately obtains the inequality.
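Both inequalities can be checked numerically. The following is a minimal sketch, assuming NumPy; the random matrices and the seed are arbitrary choices for this illustration. The upper bound is tested on a general real matrix and the two-sided bound on a non-negative one.

```python
import numpy as np

rng = np.random.default_rng(1)

# Upper bound for an arbitrary real matrix: |lambda| <= maximum absolute row sum.
B = rng.standard_normal((5, 5))
spectral_radius_B = np.abs(np.linalg.eigvals(B)).max()
inf_norm_B = np.abs(B).sum(axis=1).max()

# Two-sided bound for a non-negative matrix: min row sum <= r <= max row sum.
A = rng.random((5, 5))
r = np.abs(np.linalg.eigvals(A)).max()
row_sums = A.sum(axis=1)

print("general matrix:", spectral_radius_B, "<=", inf_norm_B)
print("non-negative matrix:", row_sums.min(), "<=", r, "<=", row_sums.max())
```

The quantity `inf_norm_B` coincides with `np.linalg.norm(B, np.inf)`, the induced infinity norm used in the argument above.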
The proof now proceeds using spectral decomposition. The trick here is to split the Perron root from the other eigenvalues. The spectral projection associated with the Perron root is called the Perron projection and it enjoys the following property:
The Perron projection of an irreducible non-negative square matrix is a positive matrix.
Perron's findings and also (1)–(5) of the theorem are corollaries of this result. The key point is that a positive projection always has rank one. This means that if A is an irreducible non-negative square matrix then the algebraic and geometric multiplicities of its Perron root are both one. Also if P is its Perron projection then AP = PA = ρ(A)P so every column of P is a positive right eigenvector of A and every row is a positive left eigenvector. Moreover, if Ax = λx then PAx = λPx = ρ(A)Px which means Px = 0 if λ ≠ ρ(A). Thus the only positive eigenvectors are those associated with ρ(A). If A is a primitive matrix with ρ(A) = 1 then it can be decomposed as P ⊕ (1 − P)A so that $A^n = P + (1 − P)A^n$. As n increases the second of these terms decays to zero, leaving P as the limit of $A^n$ as n → ∞.
The power method is a convenient way to compute the Perron projection of a primitive matrix. If v and w are the positive row and column vectors that it generates then the Perron projection is just wv/vw. The spectral projections aren't neatly blocked as in the Jordan form. Here they are overlaid and each generally has complex entries extending to all four corners of the square matrix. Nevertheless, they retain their mutual orthogonality which is what facilitates the decomposition.
The analysis when A is irreducible and non-negative is broadly similar. The Perron projection is still positive but there may now be other eigenvalues of modulus ρ(A) that negate use of the power method and prevent the powers of (1 − P)A decaying as in the primitive case whenever ρ(A) = 1. So we consider the peripheral projection, which is the spectral projection of A corresponding to all the eigenvalues that have modulus ρ(A). It may then be shown that the peripheral projection of an irreducible non-negative square matrix is a non-negative matrix with a positive diagonal.
Suppose in addition that ρ(A) = 1 and A has h eigenvalues on the unit circle. If P is the peripheral projection then the matrix R = AP = PA is non-negative and irreducible, $R^h = P$, and the cyclic group $P, R, R^2, \ldots, R^{h-1}$ represents the harmonics of A. The spectral projection of A at the eigenvalue λ on the unit circle is given by the formula $h^{-1} \sum_{k=1}^{h} \lambda^{-k} R^k$.
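The matrices $L = \left(\begin{smallmatrix} 1&0&0\\ 1&0&0\\ 1&1&1 \end{smallmatrix}\right)$, $P = \left(\begin{smallmatrix} 1&0&0\\ 1&0&0\\ -1&1&1 \end{smallmatrix}\right)$, $T = \left(\begin{smallmatrix} 0&1&1\\ 1&0&1\\ 1&1&0 \end{smallmatrix}\right)$ and $M = \left(\begin{smallmatrix} 0&1&0&0&0\\ 1&0&0&0&0\\ 0&0&0&1&0\\ 0&0&0&0&1\\ 0&0&1&0&0 \end{smallmatrix}\right)$ provide simple examples of what can go wrong if the necessary conditions are not met. L is reducible, and its spectral projection at the eigenvalue 1 is P, which has a negative entry, so for a reducible matrix the projection may lose non-negativity. T is a primitive matrix with zero diagonal. M is a reducible matrix, block diagonal with cyclic blocks of sizes 2 and 3, whose eigenvalue 1 has a two-dimensional eigenspace.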
A problem that causes confusion is a lack of standardisation in the definitions. For example, some authors use the terms strictly positive and positive to mean > 0 and ≥ 0 respectively. In this article positive means > 0 and non-negative means ≥ 0. Another vexed area concerns decomposability and reducibility: irreducible is an overloaded term. For avoidance of doubt a non-zero non-negative square matrix A such that 1 + A is primitive is sometimes said to be connected. Then irreducible non-negative square matrices and connected matrices are synonymous.[12]
The nonnegative eigenvector is often normalized so that the sum of its components is equal to unity; in this case, the eigenvector is the vector of a probability distribution and is sometimes called a stochastic eigenvector.
Perron–Frobenius eigenvalue and dominant eigenvalue are alternative names for the Perron root. Spectral projections are also known as spectral projectors and spectral idempotents. The period is sometimes referred to as the index of imprimitivity or the order of cyclicity.