In mathematics, a block matrix pseudoinverse is a formula for the pseudoinverse of a partitioned matrix. It is useful for decomposing or approximating many algorithms that update parameters in signal processing, which are based on the least squares method.
Consider a column-wise partitioned matrix:
\begin{bmatrix}A & B\end{bmatrix}, \quad A \in \mathbb{R}^{m \times n}, \quad B \in \mathbb{R}^{m \times p}, \quad m \geq n + p.
If the above matrix has full column rank, the Moore–Penrose inverses of it and its transpose are
\begin{align}
\begin{bmatrix}A & B\end{bmatrix}^+ &= \left(\begin{bmatrix}A & B\end{bmatrix}^\mathsf{T} \begin{bmatrix}A & B\end{bmatrix}\right)^{-1} \begin{bmatrix}A & B\end{bmatrix}^\mathsf{T}, \\
\begin{bmatrix} A^\mathsf{T} \\ B^\mathsf{T} \end{bmatrix}^+ &= \begin{bmatrix}A & B\end{bmatrix} \left(\begin{bmatrix}A & B\end{bmatrix}^\mathsf{T} \begin{bmatrix}A & B\end{bmatrix}\right)^{-1}.
\end{align}
This computation of the pseudoinverse requires (n + p)-square matrix inversion and does not take advantage of the block form.
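As a quick numerical check, the following NumPy sketch (the sizes and seed are illustrative choices, not from the original) confirms that the normal-equation formula above reproduces a general-purpose pseudoinverse when the partitioned matrix has full column rank:

import numpy as np

rng = np.random.default_rng(0)
m, n, p = 12, 4, 3                    # m >= n + p
A = rng.standard_normal((m, n))
B = rng.standard_normal((m, p))
M = np.hstack([A, B])                 # the partitioned matrix [A B]

# Normal-equation form of the pseudoinverse: (M^T M)^{-1} M^T.
# Solving the linear system avoids forming the inverse explicitly.
M_pinv = np.linalg.solve(M.T @ M, M.T)

print(np.allclose(M_pinv, np.linalg.pinv(M)))   # True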
To reduce computational costs to n- and p-square matrix inversions and to introduce parallelism, treating the blocks separately, one derives [1]
\begin{align}
\begin{bmatrix}A & B\end{bmatrix}^+ &= \begin{bmatrix}
\left(A^\mathsf{T} P_B^\perp A\right)^{-1} A^\mathsf{T} P_B^\perp \\
\left(B^\mathsf{T} P_A^\perp B\right)^{-1} B^\mathsf{T} P_A^\perp
\end{bmatrix} = \begin{bmatrix}
\left(P_B^\perp A\right)^+ \\
\left(P_A^\perp B\right)^+
\end{bmatrix}, \\
\begin{bmatrix} A^\mathsf{T} \\ B^\mathsf{T} \end{bmatrix}^+ &= \begin{bmatrix}
P_B^\perp A \left(A^\mathsf{T} P_B^\perp A\right)^{-1} &
P_A^\perp B \left(B^\mathsf{T} P_A^\perp B\right)^{-1}
\end{bmatrix} = \begin{bmatrix}
\left(A^\mathsf{T} P_B^\perp\right)^+ &
\left(B^\mathsf{T} P_A^\perp\right)^+
\end{bmatrix},
\end{align}
where the orthogonal projection matrices are defined by
\begin{align}
P_A^\perp &= I - A\left(A^\mathsf{T} A\right)^{-1} A^\mathsf{T}, \\
P_B^\perp &= I - B\left(B^\mathsf{T} B\right)^{-1} B^\mathsf{T}.
\end{align}
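The block decomposition is easy to verify numerically. The sketch below (illustrative sizes and seed) forms the two projectors and checks that stacking \left(P_B^\perp A\right)^+ over \left(P_A^\perp B\right)^+ reproduces the full pseudoinverse:

import numpy as np

rng = np.random.default_rng(1)
m, n, p = 12, 4, 3
A = rng.standard_normal((m, n))
B = rng.standard_normal((m, p))
I = np.eye(m)

# Orthogonal projectors onto the orthogonal complements of range(A), range(B).
P_A_perp = I - A @ np.linalg.solve(A.T @ A, A.T)
P_B_perp = I - B @ np.linalg.solve(B.T @ B, B.T)

# Stack the two smaller pseudoinverses and compare with the direct one.
block = np.vstack([np.linalg.pinv(P_B_perp @ A),    # (P_B^perp A)^+, n x m
                   np.linalg.pinv(P_A_perp @ B)])   # (P_A^perp B)^+, p x m
print(np.allclose(block, np.linalg.pinv(np.hstack([A, B]))))   # True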
The block formulas above are not necessarily valid if \begin{bmatrix}A & B\end{bmatrix} does not have full column rank. For example, if A \neq 0, then
\begin{bmatrix}A & A\end{bmatrix}^+ = \frac{1}{2} \begin{bmatrix} A^+ \\ A^+ \end{bmatrix} \neq \begin{bmatrix} \left(P_A^\perp A\right)^+ \\ \left(P_A^\perp A\right)^+ \end{bmatrix} = 0.
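This failure mode can be reproduced numerically. In the sketch below (arbitrary sizes and seed), the true pseudoinverse of \begin{bmatrix}A & A\end{bmatrix} is half the stacked A^+, while P_A^\perp A vanishes, so the block formula yields zero:

import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((6, 2))
M = np.hstack([A, A])                 # rank-deficient: rank 2, not 4

half_stack = 0.5 * np.vstack([np.linalg.pinv(A), np.linalg.pinv(A)])
print(np.allclose(np.linalg.pinv(M), half_stack))   # True

P_A_perp = np.eye(6) - A @ np.linalg.solve(A.T @ A, A.T)
# P_A^perp annihilates range(A), so (P_A^perp A)^+ = 0^+ = 0.
print(np.allclose(P_A_perp @ A, 0))                 # True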
Given the same matrices as above, we consider the following least squares problems, which appear as multiple objective optimizations or constrained problems in signal processing. Eventually, we can implement a parallel algorithm for least squares based on the following results.
Suppose a solution
x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
solves the over-determined system
\begin{bmatrix} A & B \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = d, \quad d \in \mathbb{R}^{m \times 1}.
Using the block matrix pseudoinverse, we have
x = \begin{bmatrix} A & B \end{bmatrix}^+ d = \begin{bmatrix} \left(P_B^\perp A\right)^+ \\ \left(P_A^\perp B\right)^+ \end{bmatrix} d.
Therefore, we have a decomposed solution:
x_1 = \left(P_B^\perp A\right)^+ d, \quad x_2 = \left(P_A^\perp B\right)^+ d.
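The decomposition means x_1 and x_2 can be computed independently of each other. The following sketch (illustrative NumPy code) confirms this against a joint least squares solve:

import numpy as np

rng = np.random.default_rng(3)
m, n, p = 12, 4, 3
A = rng.standard_normal((m, n))
B = rng.standard_normal((m, p))
d = rng.standard_normal(m)
I = np.eye(m)

P_A_perp = I - A @ np.linalg.solve(A.T @ A, A.T)
P_B_perp = I - B @ np.linalg.solve(B.T @ B, B.T)

# Each sub-vector of the solution is a separate, smaller problem.
x1 = np.linalg.pinv(P_B_perp @ A) @ d
x2 = np.linalg.pinv(P_A_perp @ B) @ d

x_joint, *_ = np.linalg.lstsq(np.hstack([A, B]), d, rcond=None)
print(np.allclose(np.concatenate([x1, x2]), x_joint))   # True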
Suppose a solution x solves the under-determined system
\begin{bmatrix} A^\mathsf{T} \\ B^\mathsf{T} \end{bmatrix} x = \begin{bmatrix} e \\ f \end{bmatrix}, \quad e \in \mathbb{R}^{n \times 1}, \quad f \in \mathbb{R}^{p \times 1}.
The minimum-norm solution is given by
x = \begin{bmatrix} A^\mathsf{T} \\ B^\mathsf{T} \end{bmatrix}^+ \begin{bmatrix} e \\ f \end{bmatrix}.
Using the block matrix pseudoinverse, we have
x = \begin{bmatrix} \left(A^\mathsf{T} P_B^\perp\right)^+ & \left(B^\mathsf{T} P_A^\perp\right)^+ \end{bmatrix} \begin{bmatrix} e \\ f \end{bmatrix} = \left(A^\mathsf{T} P_B^\perp\right)^+ e + \left(B^\mathsf{T} P_A^\perp\right)^+ f.
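As in the column-wise case, the two terms can be computed independently. A sketch of the check (illustrative sizes and seed):

import numpy as np

rng = np.random.default_rng(4)
m, n, p = 12, 4, 3
A = rng.standard_normal((m, n))
B = rng.standard_normal((m, p))
e = rng.standard_normal(n)
f = rng.standard_normal(p)
I = np.eye(m)

P_A_perp = I - A @ np.linalg.solve(A.T @ A, A.T)
P_B_perp = I - B @ np.linalg.solve(B.T @ B, B.T)

# Minimum-norm solution assembled from two independent terms.
x = np.linalg.pinv(A.T @ P_B_perp) @ e + np.linalg.pinv(B.T @ P_A_perp) @ f

x_direct = np.linalg.pinv(np.vstack([A.T, B.T])) @ np.concatenate([e, f])
print(np.allclose(x, x_direct))   # True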
Instead of
\left(\begin{bmatrix}A & B\end{bmatrix}^\mathsf{T} \begin{bmatrix}A & B\end{bmatrix}\right)^{-1},
we need to calculate, directly or indirectly, the smaller inverses
\left(A^\mathsf{T} A\right)^{-1}, \quad \left(B^\mathsf{T} B\right)^{-1}, \quad \left(A^\mathsf{T} P_B^\perp A\right)^{-1}, \quad \left(B^\mathsf{T} P_A^\perp B\right)^{-1}.
In a dense and small system, we can use singular value decomposition, QR decomposition, or Cholesky decomposition to replace the matrix inversions with numerical routines. In a large system, we may employ iterative methods such as Krylov subspace methods.
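For example, a Cholesky factorization can stand in for the explicit inverse, since each matrix being inverted is symmetric positive definite under the full-rank assumption. A minimal sketch, assuming SciPy is available (sizes arbitrary):

import numpy as np
from scipy.linalg import cho_factor, cho_solve

rng = np.random.default_rng(5)
m, n = 12, 4
A = rng.standard_normal((m, n))
d = rng.standard_normal(m)

# Solve (A^T A) x = A^T d by Cholesky instead of forming (A^T A)^{-1}.
gram = A.T @ A                       # symmetric positive definite for full-rank A
x = cho_solve(cho_factor(gram), A.T @ d)

print(np.allclose(x, np.linalg.lstsq(A, d, rcond=None)[0]))   # True

A QR factorization of A avoids forming A^\mathsf{T} A altogether, which is preferable when A is ill-conditioned, since the Gram matrix squares the condition number.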
Considering parallel algorithms, we can compute \left(A^\mathsf{T} A\right)^{-1} and \left(B^\mathsf{T} B\right)^{-1} in parallel, and then compute \left(A^\mathsf{T} P_B^\perp A\right)^{-1} and \left(B^\mathsf{T} P_A^\perp B\right)^{-1}, also in parallel.
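A minimal sketch of this two-stage schedule, using Python's standard thread pool (the sizes, seed, and two-worker choice are illustrative; NumPy's BLAS-backed operations release the GIL, so the threads can genuinely overlap):

import numpy as np
from concurrent.futures import ThreadPoolExecutor

rng = np.random.default_rng(6)
m, n, p = 2000, 400, 300
A = rng.standard_normal((m, n))
B = rng.standard_normal((m, p))

def gram_inverse(X):
    return np.linalg.inv(X.T @ X)    # only an n- or p-square inversion

# Stage 1: (A^T A)^{-1} and (B^T B)^{-1} are independent tasks.
with ThreadPoolExecutor(max_workers=2) as pool:
    inv_AtA, inv_BtB = pool.map(gram_inverse, [A, B])

# Stage 2: build the projectors, then invert the two projected Gram
# matrices, which are again independent of each other.
P_A_perp = np.eye(m) - A @ inv_AtA @ A.T
P_B_perp = np.eye(m) - B @ inv_BtB @ B.T
with ThreadPoolExecutor(max_workers=2) as pool:
    inv_ApBA, inv_BpAB = pool.map(
        np.linalg.inv, [A.T @ P_B_perp @ A, B.T @ P_A_perp @ B])

Process pools or distributed workers would serve the same role in larger systems; the essential point is only that the two inversions at each stage have no data dependence on each other.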