See main article: Perturbation theory.

In mathematics, an eigenvalue perturbation problem is that of finding the eigenvectors and eigenvalues of a system $Ax=\lambda x$ that is perturbed from one with known eigenvectors and eigenvalues $A_0x_0=\lambda_0x_0$. This is useful for studying how sensitive the original system's eigenvectors and eigenvalues $x_{0i},\ \lambda_{0i},\ i=1,\dots,n$ are to changes in the system.
The derivations in this article are essentially self-contained and can be found in many texts on numerical linear algebra or numerical functional analysis. This article focuses on the case of the perturbation of a simple eigenvalue (see multiplicity of eigenvalues).
In the entry applications of eigenvalues and eigenvectors we find numerous scientific fields in which eigenvalues are used to obtain solutions. Generalized eigenvalue problems are less widespread, but are key in the study of vibrations. They are useful when we use the Galerkin method or the Rayleigh–Ritz method to find approximate solutions of partial differential equations modeling the vibrations of structures such as strings and plates; the paper of Courant (1943) [2] is fundamental. The finite element method is a widespread special case.
In classical mechanics, generalized eigenvalue problems appear when we look for vibrations of multiple-degrees-of-freedom systems close to equilibrium: the kinetic energy provides the mass matrix $M$, and the potential strain energy provides the stiffness matrix $K$. With both methods, we obtain a system of differential equations (a matrix differential equation)
$$M\ddot{x}+B\dot{x}+Kx=0,$$
with mass matrix $M$, damping matrix $B$, and stiffness matrix $K$. If we neglect the damping effect, we set $B=0$ and look for a solution of the form $x=e^{i\omega t}u$; we obtain that the mode shape $u$ and the squared frequency $\omega^2$ are solutions of the generalized eigenvalue problem
$$-\omega^2Mu+Ku=0.$$
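To make this reduction concrete, here is a minimal numerical sketch (the 3-degree-of-freedom spring–mass chain is a made-up example, not data from the article): `scipy.linalg.eigh` solves $Ku=\omega^2Mu$ for symmetric $K$ and positive definite $M$, and returns $M$-orthonormal eigenvectors, which is exactly the normalization (2) adopted below.

```python
import numpy as np
from scipy.linalg import eigh

# Hypothetical mass and stiffness matrices of a 3-mass spring chain.
M = np.diag([2.0, 1.0, 1.0])
K = np.array([[ 2.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  1.0]])

# eigh solves K u = lambda M u; lambda = omega^2 gives the natural
# frequencies, and the eigenvectors satisfy u_i^T M u_j = delta_ij.
lam, U = eigh(K, M)
print(np.sqrt(lam))        # natural angular frequencies omega
print(U.T @ M @ U)         # ~ identity matrix: M-orthonormality
```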
Suppose we have solutions to the generalized eigenvalue problem,
$$K_0x_{0i}=\lambda_{0i}M_0x_{0i}, \qquad (0)$$
where $K_0$ and $M_0$ are known matrices; that is, we know the eigenvalues $\lambda_{0i}$ and eigenvectors $x_{0i}$ for $i=1,\dots,N$.
Now suppose we want to change the matrices by a small amount. That is, we want to find the eigenvalues and eigenvectors of
$$Kx_i=\lambda_iMx_i, \qquad (1)$$
with
$$K=K_0+\delta K, \qquad M=M_0+\delta M,$$
where the perturbations $\delta K$ and $\delta M$ are small compared to $K$ and $M$ respectively. Then we expect the new eigenvalues and eigenvectors to be similar to the originals, up to small perturbations:
$$\lambda_i=\lambda_{0i}+\delta\lambda_i, \qquad x_i=x_{0i}+\delta x_i.$$
We assume that the matrices are symmetric and positive definite, and assume we have scaled the eigenvectors such that
$$x_{0j}^\top M_0\,x_{0i}=\delta_{ij}, \qquad x_i^\top M\,x_j=\delta_{ij}, \qquad (2)$$
where $\delta_{ij}$ is the Kronecker delta. Now we want to solve the equation
$$Kx_i-\lambda_iMx_i=0.$$
Substituting in (1), we get
$$(K_0+\delta K)(x_{0i}+\delta x_i)=\left(\lambda_{0i}+\delta\lambda_i\right)\left(M_0+\delta M\right)\left(x_{0i}+\delta x_i\right),$$
which expands to
\begin{align}
K_0x_{0i}+\delta K\,x_{0i}+K_0\,\delta x_i+\delta K\,\delta x_i={}&\lambda_{0i}M_0x_{0i}+\lambda_{0i}M_0\,\delta x_i+\lambda_{0i}\,\delta M\,x_{0i}+\delta\lambda_i\,M_0x_{0i}\\
&+\lambda_{0i}\,\delta M\,\delta x_i+\delta\lambda_i\,\delta M\,x_{0i}+\delta\lambda_i\,M_0\,\delta x_i+\delta\lambda_i\,\delta M\,\delta x_i.
\end{align}
Canceling from (0) ($K_0x_{0i}=\lambda_{0i}M_0x_{0i}$) leaves
\begin{align}
\delta K\,x_{0i}+K_0\,\delta x_i+\delta K\,\delta x_i={}&\lambda_{0i}M_0\,\delta x_i+\lambda_{0i}\,\delta M\,x_{0i}+\delta\lambda_i\,M_0x_{0i}\\
&+\lambda_{0i}\,\delta M\,\delta x_i+\delta\lambda_i\,\delta M\,x_{0i}+\delta\lambda_i\,M_0\,\delta x_i+\delta\lambda_i\,\delta M\,\delta x_i.
\end{align}
Removing the higher-order terms, this simplifies to
$$K_0\,\delta x_i+\delta K\,x_{0i}=\lambda_{0i}M_0\,\delta x_i+\lambda_{0i}\,\delta M\,x_{0i}+\delta\lambda_i\,M_0x_{0i}. \qquad (3)$$
In other words, equation (3) is the linearized equation that determines the first-order perturbations $\delta\lambda_i$ and $\delta x_i$.
As the matrix $M_0$ is symmetric, the unperturbed eigenvectors are $M_0$-orthogonal, and we can use them as a basis for the perturbation of the eigenvector:
$$\delta x_i=\sum_{j=1}^{N}\varepsilon_{ij}\,x_{0j}, \qquad (4)$$
where $\varepsilon_{ij}=x_{0j}^\top M_0\,\delta x_i$.
In the same way, substituting in (2) and removing higher-order terms, we get
$$\delta x_j^\top M_0\,x_{0i}+x_{0j}^\top M_0\,\delta x_i+x_{0j}^\top\delta M\,x_{0i}=0. \qquad (5)$$
The derivation can proceed along two forks.

For the first fork, we start with (3),
$$K_0\,\delta x_i+\delta K\,x_{0i}=\lambda_{0i}M_0\,\delta x_i+\lambda_{0i}\,\delta M\,x_{0i}+\delta\lambda_i\,M_0x_{0i};$$
we left-multiply with $x_{0i}^\top$ and use the unperturbed problem (0) together with the normalization $x_{0i}^\top M_0x_{0i}=1$ to obtain
$$x_{0i}^\top\delta K\,x_{0i}=\lambda_{0i}\,x_{0i}^\top\delta M\,x_{0i}+\delta\lambda_i,$$
that is,
$$\delta\lambda_i=x_{0i}^\top\delta K\,x_{0i}-\lambda_{0i}\,x_{0i}^\top\delta M\,x_{0i}.$$
We notice that $\delta\lambda_i$ is the first-order variation of the generalized Rayleigh quotient with fixed $x_{0i}$:
$$R(K,M;x_{0i})=x_{0i}^\top K\,x_{0i}\big/\left(x_{0i}^\top M\,x_{0i}\right), \qquad \text{with } x_{0i}^\top M\,x_{0i}=1.$$
Moreover, for $M=I$ the formula reduces to $\delta\lambda_i=x_{0i}^\top\delta K\,x_{0i}$, the classical result for the ordinary eigenvalue problem.
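As a numerical sanity check of this first-order formula, the following sketch (randomly generated symmetric positive definite test matrices, not data from the article) compares the prediction $\lambda_{0i}+\delta\lambda_i$ with the exactly recomputed eigenvalues:

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)
n = 5
def spd():
    # random symmetric positive definite test matrix
    A = rng.standard_normal((n, n))
    return A @ A.T + n * np.eye(n)

K0, M0 = spd(), spd()
lam0, X0 = eigh(K0, M0)              # columns of X0 are M0-orthonormal
t = 1e-6                             # size of the perturbation
dK, dM = t * spd(), t * spd()

lam, _ = eigh(K0 + dK, M0 + dM)      # exact perturbed eigenvalues
# first-order prediction for every eigenvalue
pred = lam0 + np.array([X0[:, i] @ dK @ X0[:, i]
                        - lam0[i] * (X0[:, i] @ dM @ X0[:, i])
                        for i in range(n)])
print(np.max(np.abs(lam - pred)))    # O(t^2): far smaller than t
```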
Next, we left-multiply (3) with $x_{0j}^\top$, $j\ne i$, and obtain
$$x_{0j}^\top K_0\,\delta x_i+x_{0j}^\top\delta K\,x_{0i}=\lambda_{0i}\,x_{0j}^\top M_0\,\delta x_i+\lambda_{0i}\,x_{0j}^\top\delta M\,x_{0i}+\delta\lambda_i\,x_{0j}^\top M_0x_{0i}.$$
We use $x_{0j}^\top K_0=\lambda_{0j}\,x_{0j}^\top M_0$ and $x_{0j}^\top M_0x_{0i}=0$ for $j\ne i$, which gives
$$\lambda_{0j}\,x_{0j}^\top M_0\,\delta x_i+x_{0j}^\top\delta K\,x_{0i}=\lambda_{0i}\,x_{0j}^\top M_0\,\delta x_i+\lambda_{0i}\,x_{0j}^\top\delta M\,x_{0i},$$
or, rearranging,
$$(\lambda_{0j}-\lambda_{0i})\,x_{0j}^\top M_0\,\delta x_i+x_{0j}^\top\delta K\,x_{0i}=\lambda_{0i}\,x_{0j}^\top\delta M\,x_{0i}.$$
Because the eigenvalues are simple, $\lambda_{0j}-\lambda_{0i}\ne 0$ for $j\ne i$, and we obtain the coefficients of (4):
$$\varepsilon_{ij}=x_{0j}^\top M_0\,\delta x_i=\frac{\lambda_{0i}\,x_{0j}^\top\delta M\,x_{0i}-x_{0j}^\top\delta K\,x_{0i}}{\lambda_{0j}-\lambda_{0i}}, \qquad i=1,\dots,N;\ j=1,\dots,N;\ j\ne i.$$
The remaining coefficient follows from the first-order variation (5) of the normalization, taken with $j=i$:
$$2\varepsilon_{ii}=2\,x_{0i}^\top M_0\,\delta x_i=-x_{0i}^\top\delta M\,x_{0i}.$$
The perturbation $\delta x_i$ is then recovered from (4).
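These coefficients can also be checked numerically. The following sketch (same kind of random test matrices as in the check above; all names hypothetical) assembles $\delta x_i$ from the $\varepsilon_{ij}$ and compares it with the exact perturbed eigenvector, after aligning the arbitrary sign of the computed eigenvector:

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)
n = 5
def spd():
    A = rng.standard_normal((n, n))
    return A @ A.T + n * np.eye(n)

K0, M0 = spd(), spd()
lam0, X0 = eigh(K0, M0)                  # M0-orthonormal basis x_0j
t = 1e-6
dK, dM = t * spd(), t * spd()

i = 2                                    # eigenpair to perturb
eps = np.empty(n)
for j in range(n):
    if j == i:                           # from 2*eps_ii = -x0i^T dM x0i
        eps[j] = -0.5 * (X0[:, i] @ dM @ X0[:, i])
    else:                                # coefficients eps_ij of (4)
        eps[j] = (lam0[i] * (X0[:, j] @ dM @ X0[:, i])
                  - X0[:, j] @ dK @ X0[:, i]) / (lam0[j] - lam0[i])
dx = X0 @ eps                            # delta x_i = sum_j eps_ij x_0j

_, X = eigh(K0 + dK, M0 + dM)
xi = X[:, i] * np.sign(X[:, i] @ M0 @ X0[:, i])   # fix sign freedom
print(np.linalg.norm(xi - (X0[:, i] + dx)))       # O(t^2)
```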
For the second fork, substituting (4) into (3) and rearranging gives
\begin{align}
K_0\sum_{j=1}^{N}\varepsilon_{ij}x_{0j}+\delta K\,x_{0i}&=\lambda_{0i}M_0\sum_{j=1}^{N}\varepsilon_{ij}x_{0j}+\lambda_{0i}\,\delta M\,x_{0i}+\delta\lambda_i\,M_0x_{0i}\\
\sum_{j=1}^{N}\varepsilon_{ij}K_0x_{0j}+\delta K\,x_{0i}&=\lambda_{0i}M_0\sum_{j=1}^{N}\varepsilon_{ij}x_{0j}+\lambda_{0i}\,\delta M\,x_{0i}+\delta\lambda_i\,M_0x_{0i}&&\text{(applying }K_0\text{ to the sum)}\\
\sum_{j=1}^{N}\varepsilon_{ij}\lambda_{0j}M_0x_{0j}+\delta K\,x_{0i}&=\lambda_{0i}M_0\sum_{j=1}^{N}\varepsilon_{ij}x_{0j}+\lambda_{0i}\,\delta M\,x_{0i}+\delta\lambda_i\,M_0x_{0i}&&\text{(using Eq. (0))}
\end{align}
Because the eigenvectors are $M_0$-orthogonal when $M_0$ is positive definite, we can remove the summations by left-multiplying by $x_{0i}^\top$:
$$x_{0i}^\top\varepsilon_{ii}\lambda_{0i}M_0x_{0i}+x_{0i}^\top\delta K\,x_{0i}=\lambda_{0i}\,x_{0i}^\top M_0\,\varepsilon_{ii}x_{0i}+\lambda_{0i}\,x_{0i}^\top\delta M\,x_{0i}+\delta\lambda_i\,x_{0i}^\top M_0x_{0i}.$$
By use of equation (0) again:
$$x_{0i}^\top K_0\,\varepsilon_{ii}x_{0i}+x_{0i}^\top\delta K\,x_{0i}=\lambda_{0i}\,x_{0i}^\top M_0\,\varepsilon_{ii}x_{0i}+\lambda_{0i}\,x_{0i}^\top\delta M\,x_{0i}+\delta\lambda_i\,x_{0i}^\top M_0x_{0i}. \qquad (6)$$
The two terms containing $\varepsilon_{ii}$ are equal, because left-multiplying (0) by $x_{0i}^\top$ gives
$$x_{0i}^\top K_0\,x_{0i}=\lambda_{0i}\,x_{0i}^\top M_0\,x_{0i}.$$
Canceling those terms in (6) leaves
$$x_{0i}^\top\delta K\,x_{0i}=\lambda_{0i}\,x_{0i}^\top\delta M\,x_{0i}+\delta\lambda_i\,x_{0i}^\top M_0\,x_{0i}.$$
Rearranging gives
$$\delta\lambda_i=\frac{x_{0i}^\top\left(\delta K-\lambda_{0i}\,\delta M\right)x_{0i}}{x_{0i}^\top M_0\,x_{0i}}.$$
But by (2), this denominator is equal to 1. Thus
$$\delta\lambda_i=x_{0i}^\top\left(\delta K-\lambda_{0i}\,\delta M\right)x_{0i}.$$
Then, as $\lambda_i\ne\lambda_k$ for $i\ne k$, left-multiplying the expanded equation by $x_{0k}^\top$ gives
$$\varepsilon_{ik}=\frac{x_{0k}^\top\left(\delta K-\lambda_{0i}\,\delta M\right)x_{0i}}{\lambda_{0i}-\lambda_{0k}}, \qquad i\ne k.$$
Or by changing the name of the indices:
$$\varepsilon_{ij}=\frac{x_{0j}^\top\left(\delta K-\lambda_{0i}\,\delta M\right)x_{0i}}{\lambda_{0i}-\lambda_{0j}}, \qquad i\ne j.$$
To find $\varepsilon_{ii}$, use the fact that $x_i^\top M\,x_i=1$ implies
$$\varepsilon_{ii}=-\tfrac{1}{2}\,x_{0i}^\top\delta M\,x_{0i}.$$
In the case where all the matrices are Hermitian positive definite and all the eigenvalues are distinct,
\begin{align}
\lambda_i&=\lambda_{0i}+x_{0i}^\top\left(\delta K-\lambda_{0i}\,\delta M\right)x_{0i}\\
x_i&=x_{0i}\left(1-\tfrac{1}{2}\,x_{0i}^\top\delta M\,x_{0i}\right)+\sum_{j=1\atop j\ne i}^{N}\frac{x_{0j}^\top\left(\delta K-\lambda_{0i}\,\delta M\right)x_{0i}}{\lambda_{0i}-\lambda_{0j}}\,x_{0j}
\end{align}
for infinitesimal $\delta K$ and $\delta M$.
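These first-order formulas can also be probed empirically. In the sketch below (random symmetric positive definite test matrices, a hypothetical setup), shrinking the perturbation size $t$ tenfold shrinks the residual roughly a hundredfold, consistent with a remainder of second order in $\|\delta K\|+\|\delta M\|$:

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(1)
n = 5
def spd():
    A = rng.standard_normal((n, n))
    return A @ A.T + n * np.eye(n)

K0, M0 = spd(), spd()
lam0, X0 = eigh(K0, M0)
dK1, dM1 = spd(), spd()                  # fixed perturbation directions

for t in [1e-2, 1e-3, 1e-4]:
    lam, _ = eigh(K0 + t * dK1, M0 + t * dM1)
    pred = lam0 + t * np.array([X0[:, i] @ dK1 @ X0[:, i]
                                - lam0[i] * (X0[:, i] @ dM1 @ X0[:, i])
                                for i in range(n)])
    # the residual shrinks like t^2, i.e. it is o(t), as claimed
    print(t, np.max(np.abs(lam - pred)))
```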
So far, we have not proved that these higher-order terms may be neglected. This point can be derived using the implicit function theorem; in the next section, we summarize the use of this theorem in order to obtain a first-order expansion.
In the next paragraph, we shall use the implicit function theorem (statement of the theorem). We notice that for a continuously differentiable function $f:\R^{n+m}\to\R^m,\ (x,y)\mapsto f(x,y)$, with an invertible partial Jacobian $J_{f,y}(x_0,y_0)$ at a point $(x_0,y_0)$ such that $f(x_0,y_0)=0$, the equation $f(x,y)=0$ defines, for $x$ close to $x_0$, a continuously differentiable function $y=g(x)$ whose Jacobian satisfies
$$J_{f,y}(x,g(x))\,J_{g,x}(x)+J_{f,x}(x,g(x))=0. \qquad (6')$$
The first-order perturbation of $g$ follows: writing $f(x_0+\delta x,\,y_0+\delta y)=0$ and keeping first-order terms,
$$J_{f,x}(x,g(x))\,\delta x+J_{f,y}(x,g(x))\,\delta y=0,$$
so that $\delta y=J_{g,x}(x)\,\delta x$, which is equivalent to $(6')$.
We use the previous paragraph (perturbation of an implicit function) with somewhat different notation suited to eigenvalue perturbation. We introduce
$$\tilde{f}:\R^{2n^2}\times\R^{n+1}\to\R^{n+1}, \qquad \tilde{f}(K,M,\lambda,x)=\binom{f(K,M,\lambda,x)}{f_{n+1}(M,x)},$$
with
$$f(K,M,\lambda,x)=Kx-\lambda Mx, \qquad f_{n+1}(M,x)=x^\top Mx-1.$$
In order to apply the implicit function theorem, we study the invertibility of the partial Jacobian $J_{\tilde{f};\lambda,x}(K,M;\lambda_{0i},x_{0i})$, which acts on $(\delta\lambda,\delta x_i)$ as
$$J_{\tilde{f};\lambda,x}(K,M;\lambda_i,x_i)(\delta\lambda,\delta x_i)=\binom{-Mx_i}{0}\delta\lambda+\binom{K-\lambda M}{2x_i^\top M}\delta x_i.$$
The solution of
$$J_{\tilde{f};\lambda,x}(K,M;\lambda_{0i},x_{0i})(\delta\lambda_i,\delta x_i)=\binom{y}{y_{n+1}}$$
may be computed by expanding $\delta x_i$ on the basis of unperturbed eigenvectors: left-multiplying the first row by $x_{0i}^\top$ gives $\delta\lambda_i=-x_{0i}^\top y$; left-multiplying by $x_{0j}^\top$, $j\ne i$, gives $x_{0j}^\top M\,\delta x_i=x_{0j}^\top y/(\lambda_{0j}-\lambda_{0i})$; and the last row gives $2x_{0i}^\top M\,\delta x_i=y_{n+1}$. When $\lambda_{0i}$ is a simple eigenvalue, since the eigenvectors $x_{0j},\ j=1,\dots,n,$ form an orthonormal basis, we obtain a unique solution for any right-hand side: the Jacobian is invertible.
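Numerically, the pair $(\delta x_i,\delta\lambda_i)$ can be obtained by solving this Jacobian system directly. The sketch below (random test matrices, hypothetical names; the bordered-matrix assembly is just one way to organize the system above) recovers the same $\delta\lambda_i$ as the explicit formula:

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)
n = 5
def spd():
    A = rng.standard_normal((n, n))
    return A @ A.T + n * np.eye(n)

K0, M0 = spd(), spd()
lam0, X0 = eigh(K0, M0)
t = 1e-6
dK, dM = t * spd(), t * spd()

i = 2
x0, l0 = X0[:, i], lam0[i]
# bordered Jacobian of f = K x - lambda M x, f_{n+1} = x^T M x - 1,
# with respect to (x, lambda), at the unperturbed eigenpair
J = np.block([[K0 - l0 * M0,           -(M0 @ x0)[:, None]],
              [2 * (M0 @ x0)[None, :],  np.zeros((1, 1))]])
rhs = -np.concatenate([(dK - l0 * dM) @ x0, [x0 @ dM @ x0]])
sol = np.linalg.solve(J, rhs)            # invertible: lambda_0i is simple
dx, dl = sol[:-1], sol[-1]
print(dl, x0 @ dK @ x0 - l0 * (x0 @ dM @ x0))   # same first-order value
```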
The implicit function theorem then provides a continuously differentiable function $(K,M)\mapsto(\lambda_i(K,M),x_i(K,M))$, hence the expansions
$$\lambda_i=\lambda_{0i}+\delta\lambda_i+o(\|\delta K\|+\|\delta M\|), \qquad x_i=x_{0i}+\delta x_i+o(\|\delta K\|+\|\delta M\|),$$
with
$$\delta\lambda_i=x_{0i}^\top\delta K\,x_{0i}-\lambda_{0i}\,x_{0i}^\top\delta M\,x_{0i};$$
$$\delta x_i=\sum_{j\ne i}\left(x_{0j}^\top M_0\,\delta x_i\right)x_{0j}, \quad\text{with}\quad x_{0j}^\top M_0\,\delta x_i=\frac{\lambda_{0i}\,x_{0j}^\top\delta M\,x_{0i}-x_{0j}^\top\delta K\,x_{0i}}{\lambda_{0j}-\lambda_{0i}}, \qquad i=1,\dots,n;\ j=1,\dots,n;\ j\ne i,$$
the component along $x_{0i}$ being fixed by the normalization, as before.
This means it is possible to efficiently do a sensitivity analysis on $\lambda_i$ as a function of changes in the entries of the matrices. (Recall that the matrices are symmetric, so changing $K_{(k\ell)}$ will also change $K_{(\ell k)}$, hence the $(2-\delta_{k\ell})$ term below.)
\begin{align}
\frac{\partial\lambda_i}{\partial K_{(k\ell)}}&=\frac{\partial}{\partial K_{(k\ell)}}\left(\lambda_{0i}+x_{0i}^\top\left(\delta K-\lambda_{0i}\,\delta M\right)x_{0i}\right)=x_{0i(k)}\,x_{0i(\ell)}\left(2-\delta_{k\ell}\right)\\
\frac{\partial\lambda_i}{\partial M_{(k\ell)}}&=\frac{\partial}{\partial M_{(k\ell)}}\left(\lambda_{0i}+x_{0i}^\top\left(\delta K-\lambda_{0i}\,\delta M\right)x_{0i}\right)=-\lambda_i\,x_{0i(k)}\,x_{0i(\ell)}\left(2-\delta_{k\ell}\right).
\end{align}
Similarly,
\begin{align}
\frac{\partial x_i}{\partial K_{(k\ell)}}&=\sum_{j=1\atop j\ne i}^{N}\frac{x_{0j(k)}\,x_{0i(\ell)}\left(2-\delta_{k\ell}\right)}{\lambda_{0i}-\lambda_{0j}}\,x_{0j}\\
\frac{\partial x_i}{\partial M_{(k\ell)}}&=-x_{0i}\,\frac{x_{0i(k)}\,x_{0i(\ell)}}{2}\left(2-\delta_{k\ell}\right)-\sum_{j=1\atop j\ne i}^{N}\frac{\lambda_{0i}\,x_{0j(k)}\,x_{0i(\ell)}}{\lambda_{0i}-\lambda_{0j}}\,x_{0j}\left(2-\delta_{k\ell}\right).
\end{align}
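The entry-wise eigenvalue derivative is easy to validate by finite differences. The sketch below (random test matrices, hypothetical index choices) perturbs the symmetric entry pair $(k,\ell)$ of $K$ and compares with $x_{0i(k)}\,x_{0i(\ell)}(2-\delta_{k\ell})$:

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)
n = 5
def spd():
    A = rng.standard_normal((n, n))
    return A @ A.T + n * np.eye(n)

K0, M0 = spd(), spd()
lam0, X0 = eigh(K0, M0)

i, k, l = 1, 0, 2                        # eigenvalue i, entry (k, l)
h = 1e-7
dK = np.zeros((n, n))
dK[k, l] += h
if k != l:
    dK[l, k] += h                        # keep K symmetric
lam, _ = eigh(K0 + dK, M0)
fd = (lam[i] - lam0[i]) / h              # finite-difference derivative
formula = X0[k, i] * X0[l, i] * (2.0 - (k == l))
print(fd, formula)                       # agree up to O(h)
```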
A simple case is $K=\begin{bmatrix}2&b\\ b&0\end{bmatrix}$; its smallest eigenvalue is $\lambda=1-\sqrt{b^2+1}$, which can be computed directly, giving
$$\frac{\partial\lambda}{\partial b}=\frac{-x}{\sqrt{x^2+1}},$$
evaluated at $b=x$. The corresponding (unnormalized) eigenvector is $\tilde{x}_0=[x,\ -(\sqrt{x^2+1}+1)]^\top$. For the normalized eigenvector, $x_{01}x_{02}=\tilde{x}_{01}\tilde{x}_{02}/\|\tilde{x}_0\|^2$, with
$$\|\tilde{x}_0\|^2=2\sqrt{x^2+1}\left(\sqrt{x^2+1}+1\right), \qquad \tilde{x}_{01}\tilde{x}_{02}=-x\left(\sqrt{x^2+1}+1\right),$$
so that
$$x_{01}x_{02}=-\frac{x}{2\sqrt{x^2+1}}.$$
The sensitivity formula then gives
$$\frac{\partial\lambda}{\partial b}=2x_{01}x_{02},$$
in agreement with the direct computation, and $\delta\lambda=2x_{01}x_{02}\,\delta b$.
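The whole example can be replayed numerically in a few lines (a sketch; the value $b=0.7$ is an arbitrary choice):

```python
import numpy as np

b = 0.7
K = np.array([[2.0, b], [b, 0.0]])
lam, X = np.linalg.eigh(K)               # eigenvalues in ascending order
x0 = X[:, 0]                             # eigenvector of the smallest one
print(lam[0], 1 - np.sqrt(b**2 + 1))     # same value
print(2 * x0[0] * x0[1],                 # sensitivity formula 2*x01*x02
      -b / np.sqrt(b**2 + 1))            # closed-form d(lambda)/db
```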
Note that in the above example we assumed that both the unperturbed and the perturbed systems involved symmetric matrices, which guaranteed the existence of $N$ linearly independent eigenvectors. An eigenvalue problem involving non-symmetric matrices is not guaranteed to have $N$ linearly independent eigenvectors, though a sufficient condition is that $K$ and $M$ be simultaneously diagonalizable.
A technical report of Rellich [4] on the perturbation of eigenvalue problems provides several examples; the elementary examples are in chapter 2, and the report may be downloaded from archive.org. We present an example in which the eigenvectors behave badly.
Consider the matrices
$$B(\epsilon)=\epsilon\begin{bmatrix}\cos(2/\epsilon)&\sin(2/\epsilon)\\ \sin(2/\epsilon)&-\cos(2/\epsilon)\end{bmatrix}, \qquad A(\epsilon)=I-e^{-1/\epsilon^2}B(\epsilon); \quad A(0)=I.$$
For $\epsilon\ne 0$, the matrix $A(\epsilon)$ has eigenvectors
$$\Phi_1=[\cos(1/\epsilon),\ \sin(1/\epsilon)]^T, \qquad \Phi_2=[\sin(1/\epsilon),\ -\cos(1/\epsilon)]^T,$$
belonging to the eigenvalues
$$\lambda_1=1-\epsilon\,e^{-1/\epsilon^2}, \qquad \lambda_2=1+\epsilon\,e^{-1/\epsilon^2}.$$
Since $\lambda_1\ne\lambda_2$ for $\epsilon\ne 0$, any normalized eigenvectors $u_j(\epsilon),\ j=1,2,$ belonging to the eigenvalues $\lambda_j(\epsilon),\ j=1,2,$ are necessarily of the form
$$u_j=e^{i\alpha_j(\epsilon)}\,\Phi_j(\epsilon),$$
where $\alpha_j,\ j=1,2,$ are real-valued functions defined for $\epsilon\ne 0$. However, no choice of $\alpha_1(\epsilon)$ makes $u_1(\epsilon)$ continuous at $\epsilon=0$: as $\epsilon\to 0$, the magnitude of its first component, $|\cos(1/\epsilon)|$, oscillates and has no limit.
Note that in this example the entries $A_{jk}(\epsilon)$ of $A(\epsilon)$ depend infinitely differentiably on $\epsilon$; the bad behavior is entirely in the eigenvectors.
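The oscillation is easy to observe numerically. A sketch of the matrix as reconstructed above (note that in double precision the factor $e^{-1/\epsilon^2}$ falls below machine precision relative to 1 once $\epsilon$ is much below $0.2$, so only moderate values of $\epsilon$ are usable):

```python
import numpy as np

def A(eps):
    c, s = np.cos(2 / eps), np.sin(2 / eps)
    B = eps * np.array([[c, s], [s, -c]])
    return np.eye(2) - np.exp(-1 / eps**2) * B

for eps in [0.5, 0.35, 0.25, 0.2]:
    _, V = np.linalg.eigh(A(eps))
    # |first component| of the eigenvector of the smaller eigenvalue
    # equals |cos(1/eps)|, which keeps oscillating as eps -> 0
    print(eps, abs(V[0, 0]), abs(np.cos(1 / eps)))
```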
This example is less nasty than the previous one. Suppose $[K_0]$ is the $2\times 2$ identity matrix; then any vector is an eigenvector, in particular $u_0=[1,1]^\top/\sqrt{2}$. Now perturb it to
$$[K]=[K_0]+\begin{bmatrix}\epsilon&0\\0&0\end{bmatrix}.$$
Then the eigenvectors are $v_1=[1,0]^\top$ and $v_2=[0,1]^\top$ for any $\epsilon\ne 0$, however small; consequently $\|u_0-v_1\|$ does not tend to zero as $\epsilon\to 0$: close to a multiple eigenvalue, the eigenvectors are not continuous functions of the matrix entries.
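A sketch of this phenomenon (assuming, as reconstructed above, $K_0=I$):

```python
import numpy as np

u0 = np.array([1.0, 1.0]) / np.sqrt(2)   # eigenvector of K0 = I
for eps in [1e-1, 1e-4, 1e-8]:
    K = np.eye(2) + np.diag([eps, 0.0])
    _, V = np.linalg.eigh(K)
    v1 = V[:, 1]                         # eigenvector of 1 + eps
    v1 = v1 * np.sign(v1[0])             # fix sign: v1 = [1, 0]
    # the gap stays ~0.765 no matter how small eps is
    print(eps, np.linalg.norm(u0 - v1))
```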
Rellich, F., & Berkowitz, J. (1969). Perturbation Theory of Eigenvalue Problems. CRC Press.