Multilinear multiplication

In multilinear algebra, applying a map that is the tensor product of linear maps to a tensor is called a multilinear multiplication.

Abstract definition

Let F be a field of characteristic zero, such as \mathbb{R} or \mathbb{C}. Let V_k be a finite-dimensional vector space over F, and let \mathcal{A} \in V_1 \otimes V_2 \otimes \cdots \otimes V_d be an order-d simple tensor, i.e., there exist some vectors \mathbf{v}_k \in V_k such that \mathcal{A} = \mathbf{v}_1 \otimes \mathbf{v}_2 \otimes \cdots \otimes \mathbf{v}_d. If we are given a collection of linear maps A_k : V_k \to W_k, then the multilinear multiplication of \mathcal{A} with (A_1, A_2, \ldots, A_d) is defined[1] as the action on \mathcal{A} of the tensor product of these linear maps,[2] namely

\begin{align}
A_1 \otimes A_2 \otimes \cdots \otimes A_d : V_1 \otimes V_2 \otimes \cdots \otimes V_d &\to W_1 \otimes W_2 \otimes \cdots \otimes W_d, \\
\mathbf{v}_1 \otimes \mathbf{v}_2 \otimes \cdots \otimes \mathbf{v}_d &\mapsto A_1(\mathbf{v}_1) \otimes A_2(\mathbf{v}_2) \otimes \cdots \otimes A_d(\mathbf{v}_d).
\end{align}

Since the tensor product of linear maps is itself a linear map, and because every tensor admits a tensor rank decomposition, the above expression extends linearly to all tensors. That is, for a general tensor \mathcal{A} \in V_1 \otimes V_2 \otimes \cdots \otimes V_d, the multilinear multiplication is

\begin{align}
\mathcal{B} &:= (A_1 \otimes A_2 \otimes \cdots \otimes A_d)(\mathcal{A}) \\
&= (A_1 \otimes A_2 \otimes \cdots \otimes A_d)\left(\sum_{i=1}^r \mathbf{a}_i^1 \otimes \mathbf{a}_i^2 \otimes \cdots \otimes \mathbf{a}_i^d\right) \\
&= \sum_{i=1}^r A_1(\mathbf{a}_i^1) \otimes A_2(\mathbf{a}_i^2) \otimes \cdots \otimes A_d(\mathbf{a}_i^d),
\end{align}

where \mathcal{A} = \sum_{i=1}^r \mathbf{a}_i^1 \otimes \mathbf{a}_i^2 \otimes \cdots \otimes \mathbf{a}_i^d with \mathbf{a}_i^k \in V_k is one of \mathcal{A}'s tensor rank decompositions. The validity of the above expression is not limited to a tensor rank decomposition; in fact, it is valid for any expression of \mathcal{A} as a linear combination of pure tensors, which follows from the universal property of the tensor product.
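
As a concrete numerical illustration, the following NumPy sketch (array sizes and variable names are chosen arbitrarily) applies three linear maps to a tensor given as a sum of rank-one terms, once term by term and once to the assembled array; by the linearity just described, both routes give the same result.

    import numpy as np

    rng = np.random.default_rng(0)

    # A rank-3 order-3 tensor built from pure (rank-one) terms a_i^1 ⊗ a_i^2 ⊗ a_i^3.
    r, n1, n2, n3 = 3, 4, 5, 6
    a1 = rng.standard_normal((r, n1))
    a2 = rng.standard_normal((r, n2))
    a3 = rng.standard_normal((r, n3))
    A = np.einsum('ri,rj,rk->ijk', a1, a2, a3)

    # Linear maps A_k : F^{n_k} -> F^{m_k} as matrices.
    m1, m2, m3 = 2, 3, 4
    M1 = rng.standard_normal((m1, n1))
    M2 = rng.standard_normal((m2, n2))
    M3 = rng.standard_normal((m3, n3))

    # Apply the maps term by term to the rank-one summands ...
    B_terms = np.einsum('ri,rj,rk->ijk', a1 @ M1.T, a2 @ M2.T, a3 @ M3.T)
    # ... and directly to the assembled array; both give (M1, M2, M3) · A.
    B_full = np.einsum('pi,qj,sk,ijk->pqs', M1, M2, M3, A)

    assert np.allclose(B_terms, B_full)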

It is standard to use the following shorthand notations in the literature for multilinear multiplications:

(A_1, A_2, \ldots, A_d) \cdot \mathcal{A} := (A_1 \otimes A_2 \otimes \cdots \otimes A_d)(\mathcal{A})

and

A_k \cdot_k \mathcal{A} := (\operatorname{Id}_{V_1}, \ldots, \operatorname{Id}_{V_{k-1}}, A_k, \operatorname{Id}_{V_{k+1}}, \ldots, \operatorname{Id}_{V_d}) \cdot \mathcal{A},

where \operatorname{Id}_{V_k} : V_k \to V_k is the identity operator.
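
For order d = 2 the shorthand reduces to familiar two-sided matrix multiplication, (M_1, M_2) \cdot A = M_1 A M_2^T, since the first factor acts on the column space and the second on the row space. A minimal NumPy sketch of this identity (shapes are illustrative):

    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.standard_normal((4, 5))       # an order-2 tensor, i.e. a matrix
    M1 = rng.standard_normal((2, 4))      # M1 : F^4 -> F^2
    M2 = rng.standard_normal((3, 5))      # M2 : F^5 -> F^3

    # (M1, M2) · A applies M1 to the first factor and M2 to the second ...
    B = np.einsum('pi,qj,ij->pq', M1, M2, A)
    # ... which for matrices is exactly M1 A M2^T.
    assert np.allclose(B, M1 @ A @ M2.T)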

Definition in coordinates

In computational multilinear algebra it is conventional to work in coordinates. Assume that an inner product is fixed on V_k and let V_k^* denote the dual vector space of V_k. Let \{e_1^k, \ldots, e_{n_k}^k\} be a basis for V_k, let \{(e_1^k)^*, \ldots, (e_{n_k}^k)^*\} be the dual basis, and let \{f_1^k, \ldots, f_{m_k}^k\} be a basis for W_k. The linear map

M_k = \sum_{i=1}^{m_k} \sum_{j=1}^{n_k} m_{i,j}^{(k)} \, f_i^k \otimes (e_j^k)^*

is then represented by the matrix \widehat{M}_k = [m_{i,j}^{(k)}] \in F^{m_k \times n_k}.
Likewise, with respect to the standard tensor product basis \{e_{j_1}^1 \otimes e_{j_2}^2 \otimes \cdots \otimes e_{j_d}^d\}_{j_1, j_2, \ldots, j_d}, the abstract tensor

\mathcal{A} = \sum_{j_1=1}^{n_1} \sum_{j_2=1}^{n_2} \cdots \sum_{j_d=1}^{n_d} a_{j_1, j_2, \ldots, j_d} \, e_{j_1}^1 \otimes e_{j_2}^2 \otimes \cdots \otimes e_{j_d}^d

is represented by the multidimensional array \widehat{\mathcal{A}} = [a_{j_1, j_2, \ldots, j_d}] \in F^{n_1 \times n_2 \times \cdots \times n_d}. Observe that

\widehat{\mathcal{A}} = \sum_{j_1=1}^{n_1} \sum_{j_2=1}^{n_2} \cdots \sum_{j_d=1}^{n_d} a_{j_1, j_2, \ldots, j_d} \, \mathbf{e}_{j_1}^1 \otimes \mathbf{e}_{j_2}^2 \otimes \cdots \otimes \mathbf{e}_{j_d}^d,

where \mathbf{e}_j^k \in F^{n_k} is the jth standard basis vector of F^{n_k} and the tensor product of vectors is the affine Segre map

\otimes : (\mathbf{v}^{(1)}, \mathbf{v}^{(2)}, \ldots, \mathbf{v}^{(d)}) \mapsto [v_{i_1}^{(1)} v_{i_2}^{(2)} \cdots v_{i_d}^{(d)}]_{i_1, i_2, \ldots, i_d}.

It follows from the above choices of bases that the multilinear multiplication \mathcal{B} = (M_1, M_2, \ldots, M_d) \cdot \mathcal{A} becomes

\begin{align}
\widehat{\mathcal{B}} &= (\widehat{M}_1, \widehat{M}_2, \ldots, \widehat{M}_d) \cdot \sum_{j_1=1}^{n_1} \sum_{j_2=1}^{n_2} \cdots \sum_{j_d=1}^{n_d} a_{j_1, j_2, \ldots, j_d} \, \mathbf{e}_{j_1}^1 \otimes \mathbf{e}_{j_2}^2 \otimes \cdots \otimes \mathbf{e}_{j_d}^d \\
&= \sum_{j_1=1}^{n_1} \sum_{j_2=1}^{n_2} \cdots \sum_{j_d=1}^{n_d} a_{j_1, j_2, \ldots, j_d} \, (\widehat{M}_1, \widehat{M}_2, \ldots, \widehat{M}_d) \cdot (\mathbf{e}_{j_1}^1 \otimes \mathbf{e}_{j_2}^2 \otimes \cdots \otimes \mathbf{e}_{j_d}^d) \\
&= \sum_{j_1=1}^{n_1} \sum_{j_2=1}^{n_2} \cdots \sum_{j_d=1}^{n_d} a_{j_1, j_2, \ldots, j_d} \, (\widehat{M}_1 \mathbf{e}_{j_1}^1) \otimes (\widehat{M}_2 \mathbf{e}_{j_2}^2) \otimes \cdots \otimes (\widehat{M}_d \mathbf{e}_{j_d}^d).
\end{align}

The resulting tensor \widehat{\mathcal{B}} lives in F^{m_1 \times m_2 \times \cdots \times m_d}.
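
The last line of this derivation can be read off directly as an algorithm: scale the outer product of the selected columns of \widehat{M}_1, \ldots, \widehat{M}_d by a_{j_1, \ldots, j_d} and sum over all index tuples. A minimal NumPy sketch for d = 3 (shapes are illustrative, and the explicit loop is for clarity rather than efficiency):

    import numpy as np
    from itertools import product

    rng = np.random.default_rng(2)
    n1, n2, n3 = 3, 4, 2
    A  = rng.standard_normal((n1, n2, n3))
    M1 = rng.standard_normal((5, n1))
    M2 = rng.standard_normal((6, n2))
    M3 = rng.standard_normal((4, n3))

    # B-hat = sum_{j1,j2,j3} a_{j1 j2 j3} (M1 e_{j1}) ⊗ (M2 e_{j2}) ⊗ (M3 e_{j3}),
    # where M_k e_{j_k} is simply the j_k-th column of M_k.
    B = np.zeros((5, 6, 4))
    for j1, j2, j3 in product(range(n1), range(n2), range(n3)):
        B += A[j1, j2, j3] * np.multiply.outer(
            np.multiply.outer(M1[:, j1], M2[:, j2]), M3[:, j3])

    # Same result as a single contraction of A with M1, M2, M3.
    assert np.allclose(B, np.einsum('ia,jb,kc,abc->ijk', M1, M2, M3, A))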

Element-wise definition

From the above expression, an element-wise definition of the multilinear multiplication is obtained. Indeed, since \widehat{\mathcal{B}} is a multidimensional array, it may be expressed as

\widehat{\mathcal{B}} = \sum_{j_1=1}^{m_1} \sum_{j_2=1}^{m_2} \cdots \sum_{j_d=1}^{m_d} b_{j_1, j_2, \ldots, j_d} \, \mathbf{e}_{j_1}^1 \otimes \mathbf{e}_{j_2}^2 \otimes \cdots \otimes \mathbf{e}_{j_d}^d,

where b_{j_1, j_2, \ldots, j_d} \in F are the coefficients. Then it follows from the above formulae that

\begin{align}
\left((\mathbf{e}_{i_1}^1)^T, (\mathbf{e}_{i_2}^2)^T, \ldots, (\mathbf{e}_{i_d}^d)^T \right) \cdot \widehat{\mathcal{B}}
&= \sum_{j_1=1}^{m_1} \sum_{j_2=1}^{m_2} \cdots \sum_{j_d=1}^{m_d} b_{j_1, j_2, \ldots, j_d} \left((\mathbf{e}_{i_1}^1)^T \mathbf{e}_{j_1}^1 \right) \otimes \left((\mathbf{e}_{i_2}^2)^T \mathbf{e}_{j_2}^2\right) \otimes \cdots \otimes \left((\mathbf{e}_{i_d}^d)^T \mathbf{e}_{j_d}^d \right) \\
&= \sum_{j_1=1}^{m_1} \sum_{j_2=1}^{m_2} \cdots \sum_{j_d=1}^{m_d} b_{j_1, j_2, \ldots, j_d} \, \delta_{i_1, j_1} \cdot \delta_{i_2, j_2} \cdots \delta_{i_d, j_d} \\
&= b_{i_1, i_2, \ldots, i_d},
\end{align}

where \delta_{i,j} is the Kronecker delta. Hence, if \mathcal{B} = (M_1, M_2, \ldots, M_d) \cdot \mathcal{A}, then

\begin{align}
b_{i_1, i_2, \ldots, i_d} &= \left((\mathbf{e}_{i_1}^1)^T, (\mathbf{e}_{i_2}^2)^T, \ldots, (\mathbf{e}_{i_d}^d)^T \right) \cdot \widehat{\mathcal{B}} \\
&= \left((\mathbf{e}_{i_1}^1)^T, (\mathbf{e}_{i_2}^2)^T, \ldots, (\mathbf{e}_{i_d}^d)^T \right) \cdot (\widehat{M}_1, \widehat{M}_2, \ldots, \widehat{M}_d) \cdot \sum_{j_1=1}^{n_1} \sum_{j_2=1}^{n_2} \cdots \sum_{j_d=1}^{n_d} a_{j_1, j_2, \ldots, j_d} \, \mathbf{e}_{j_1}^1 \otimes \mathbf{e}_{j_2}^2 \otimes \cdots \otimes \mathbf{e}_{j_d}^d \\
&= \sum_{j_1=1}^{n_1} \sum_{j_2=1}^{n_2} \cdots \sum_{j_d=1}^{n_d} a_{j_1, j_2, \ldots, j_d} \, ((\mathbf{e}_{i_1}^1)^T \widehat{M}_1 \mathbf{e}_{j_1}^1) \otimes ((\mathbf{e}_{i_2}^2)^T \widehat{M}_2 \mathbf{e}_{j_2}^2) \otimes \cdots \otimes ((\mathbf{e}_{i_d}^d)^T \widehat{M}_d \mathbf{e}_{j_d}^d) \\
&= \sum_{j_1=1}^{n_1} \sum_{j_2=1}^{n_2} \cdots \sum_{j_d=1}^{n_d} a_{j_1, j_2, \ldots, j_d} \, m_{i_1, j_1}^{(1)} \cdot m_{i_2, j_2}^{(2)} \cdots m_{i_d, j_d}^{(d)},
\end{align}

where the m_{i,j}^{(k)} are the elements of \widehat{M}_k as defined above.
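
The element-wise formula translates directly into nested loops; the following NumPy sketch for d = 2 (dimensions chosen arbitrarily) checks it against a single contraction:

    import numpy as np

    rng = np.random.default_rng(3)
    n1, n2 = 3, 4
    m1, m2 = 2, 5
    A = rng.standard_normal((n1, n2))
    M1 = rng.standard_normal((m1, n1))
    M2 = rng.standard_normal((m2, n2))

    # Element-wise: b_{i1,i2} = sum_{j1,j2} a_{j1,j2} m^(1)_{i1,j1} m^(2)_{i2,j2}
    B = np.zeros((m1, m2))
    for i1 in range(m1):
        for i2 in range(m2):
            for j1 in range(n1):
                for j2 in range(n2):
                    B[i1, i2] += A[j1, j2] * M1[i1, j1] * M2[i2, j2]

    # Same result as contracting all factors at once.
    assert np.allclose(B, np.einsum('ia,jb,ab->ij', M1, M2, A))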

Properties

Let \mathcal{A} \in V_1 \otimes V_2 \otimes \cdots \otimes V_d be an order-d tensor over the tensor product of F-vector spaces.

Since a multilinear multiplication is the tensor product of linear maps, we have the following multilinearity property (in the construction of the map):

A_1 \otimes \cdots \otimes A_{k-1} \otimes (\alpha A_k + \beta B) \otimes A_{k+1} \otimes \cdots \otimes A_d = \alpha \, A_1 \otimes \cdots \otimes A_d + \beta \, A_1 \otimes \cdots \otimes A_{k-1} \otimes B \otimes A_{k+1} \otimes \cdots \otimes A_d

Multilinear multiplication is a linear map:

(M_1, M_2, \ldots, M_d) \cdot (\alpha \, \mathcal{A} + \beta \, \mathcal{B}) = \alpha \; (M_1, M_2, \ldots, M_d) \cdot \mathcal{A} + \beta \; (M_1, M_2, \ldots, M_d) \cdot \mathcal{B}

It follows from the definition that the composition of two multilinear multiplications is also a multilinear multiplication:

(M_1, M_2, \ldots, M_d) \cdot \left((K_1, K_2, \ldots, K_d) \cdot \mathcal{A} \right) = (M_1 \circ K_1, M_2 \circ K_2, \ldots, M_d \circ K_d) \cdot \mathcal{A},

where M_k : U_k \to W_k and K_k : V_k \to U_k are linear maps.

Observe specifically that multilinear multiplications in different factors commute,

M_k \cdot_k \left(M_\ell \cdot_\ell \mathcal{A} \right) = M_\ell \cdot_\ell \left(M_k \cdot_k \mathcal{A} \right) = M_k \cdot_k M_\ell \cdot_\ell \mathcal{A},

if k \ne \ell.
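
These properties are easy to check numerically. The sketch below (order 3, with arbitrary dimensions; the helpers multilinear and mode_product are illustrative, not part of any standard API) verifies the composition rule and the commutation of factor-k multiplications:

    import numpy as np

    rng = np.random.default_rng(4)
    n = (3, 4, 5)       # dims of the V_k
    u = (4, 3, 2)       # dims of the U_k
    w = (2, 5, 3)       # dims of the W_k

    A = rng.standard_normal(n)
    K = [rng.standard_normal((u[k], n[k])) for k in range(3)]   # K_k : V_k -> U_k
    M = [rng.standard_normal((w[k], u[k])) for k in range(3)]   # M_k : U_k -> W_k

    def multilinear(mats, T):
        """(M_1, M_2, M_3) · T for an order-3 array T."""
        return np.einsum('ia,jb,kc,abc->ijk', mats[0], mats[1], mats[2], T)

    def mode_product(Mk, T, k):
        """M_k ·_k T, contracting M_k with factor k and restoring the axis order."""
        return np.moveaxis(np.tensordot(Mk, T, axes=(1, k)), 0, k)

    # Composition: (M_1, ..., M_3) · ((K_1, ..., K_3) · A) = (M_1 K_1, ..., M_3 K_3) · A
    lhs = multilinear(M, multilinear(K, A))
    rhs = multilinear([M[k] @ K[k] for k in range(3)], A)
    assert np.allclose(lhs, rhs)

    # Factor products in different factors commute: M_1 ·_1 (M_2 ·_2 C) = M_2 ·_2 (M_1 ·_1 C)
    C = multilinear(K, A)            # an order-3 array with dimensions u
    X1 = mode_product(M[0], mode_product(M[1], C, 1), 0)
    X2 = mode_product(M[1], mode_product(M[0], C, 0), 1)
    assert np.allclose(X1, X2)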

Computation

The factor-k multilinear multiplication M_k \cdot_k \mathcal{A} can be computed in coordinates as follows. Observe first that

\begin{align}
M_k \cdot_k \mathcal{A} &= M_k \cdot_k \sum_{j_1=1}^{n_1} \sum_{j_2=1}^{n_2} \cdots \sum_{j_d=1}^{n_d} a_{j_1, j_2, \ldots, j_d} \, \mathbf{e}_{j_1}^1 \otimes \mathbf{e}_{j_2}^2 \otimes \cdots \otimes \mathbf{e}_{j_d}^d \\
&= \sum_{j_1=1}^{n_1} \cdots \sum_{j_{k-1}=1}^{n_{k-1}} \sum_{j_{k+1}=1}^{n_{k+1}} \cdots \sum_{j_d=1}^{n_d} \mathbf{e}_{j_1}^1 \otimes \cdots \otimes \mathbf{e}_{j_{k-1}}^{k-1} \otimes M_k \left(\sum_{j_k=1}^{n_k} a_{j_1, j_2, \ldots, j_d} \, \mathbf{e}_{j_k}^k \right) \otimes \mathbf{e}_{j_{k+1}}^{k+1} \otimes \cdots \otimes \mathbf{e}_{j_d}^d.
\end{align}

Next, since

F^{n_1} \otimes F^{n_2} \otimes \cdots \otimes F^{n_d} \simeq F^{n_k} \otimes (F^{n_1} \otimes \cdots \otimes F^{n_{k-1}} \otimes F^{n_{k+1}} \otimes \cdots \otimes F^{n_d}) \simeq F^{n_k} \otimes F^{N_k},

there is a bijective map, called the factor-k standard flattening, denoted by (\cdot)_{(k)}, that identifies M_k \cdot_k \mathcal{A} with an element from the latter space, namely

\left(M_k \cdot_k \mathcal{A} \right)_{(k)} := \sum_{j_1=1}^{n_1} \cdots \sum_{j_{k-1}=1}^{n_{k-1}} \sum_{j_{k+1}=1}^{n_{k+1}} \cdots \sum_{j_d=1}^{n_d} M_k \left(\sum_{j_k=1}^{n_k} a_{j_1, j_2, \ldots, j_d} \, \mathbf{e}_{j_k}^{k} \right) \otimes \mathbf{e}_{\mu_k(j_1, \ldots, j_{k-1}, j_{k+1}, \ldots, j_d)} := M_k \mathcal{A}_{(k)},

where \mathbf{e}_j is the jth standard basis vector of F^{N_k}, N_k = n_1 \cdots n_{k-1} n_{k+1} \cdots n_d, and \mathcal{A}_{(k)} \in F^{n_k} \otimes F^{N_k} \simeq F^{n_k \times N_k} is the factor-k flattening matrix of \mathcal{A} whose columns are the factor-k vectors [a_{j_1, \ldots, j_{k-1}, i, j_{k+1}, \ldots, j_d}]_{i=1}^{n_k} in some order, determined by the particular choice of the bijective map

\mu_k : [1, n_1] \times \cdots \times [1, n_{k-1}] \times [1, n_{k+1}] \times \cdots \times [1, n_d] \to [1, N_k].

In other words, the multilinear multiplication (M_1, M_2, \ldots, M_d) \cdot \mathcal{A} can be computed as a sequence of d factor-k multilinear multiplications, which themselves can be implemented efficiently as classic matrix multiplications.
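
A minimal NumPy sketch of this flattening-based computation (the helper mode_k_product is illustrative, and its reshape convention amounts to one particular choice of the bijection \mu_k):

    import numpy as np

    def mode_k_product(M, A, k):
        """Compute M ·_k A by flattening: unfold A along factor k, multiply, refold."""
        n = A.shape
        # Factor-k flattening: an n_k x N_k matrix whose columns are the factor-k vectors.
        A_k = np.moveaxis(A, k, 0).reshape(n[k], -1)
        # Classic matrix multiplication with the flattening.
        B_k = M @ A_k
        # Refold into an array in which n_k has been replaced by m_k.
        new_shape = (M.shape[0],) + n[:k] + n[k + 1:]
        return np.moveaxis(B_k.reshape(new_shape), 0, k)

    rng = np.random.default_rng(5)
    A = rng.standard_normal((3, 4, 5))
    M1, M2, M3 = (rng.standard_normal((2, 3)),
                  rng.standard_normal((6, 4)),
                  rng.standard_normal((7, 5)))

    # (M1, M2, M3) · A as a sequence of factor-k products ...
    B = mode_k_product(M3, mode_k_product(M2, mode_k_product(M1, A, 0), 1), 2)
    # ... agrees with the single contraction.
    assert np.allclose(B, np.einsum('ia,jb,kc,abc->ijk', M1, M2, M3, A))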

Applications

The higher-order singular value decomposition (HOSVD) factorizes a tensor given in coordinates \mathcal{A} \in F^{n_1 \times n_2 \times \cdots \times n_d} as the multilinear multiplication \mathcal{A} = (U_1, U_2, \ldots, U_d) \cdot \mathcal{S}, where U_k \in F^{n_k \times n_k} are orthogonal matrices and \mathcal{S} \in F^{n_1 \times n_2 \times \cdots \times n_d}.
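
One way to obtain such a factorization for real data is to take U_k from the left singular vectors of the factor-k flattenings and to form the core as \mathcal{S} = (U_1^T, \ldots, U_d^T) \cdot \mathcal{A}; the composition property then gives (U_1, \ldots, U_d) \cdot \mathcal{S} = \mathcal{A}. A minimal NumPy sketch of this construction (the helper names unfold and hosvd are illustrative):

    import numpy as np

    def unfold(A, k):
        """Factor-k flattening of A as an n_k x N_k matrix."""
        return np.moveaxis(A, k, 0).reshape(A.shape[k], -1)

    def hosvd(A):
        """Return orthogonal factors U_k and a core S with A = (U_1, ..., U_d) · S."""
        U = [np.linalg.svd(unfold(A, k))[0] for k in range(A.ndim)]
        # Core: S = (U_1^T, ..., U_d^T) · A, applied one factor at a time.
        S = A
        for k, Uk in enumerate(U):
            S = np.moveaxis(np.tensordot(Uk.T, S, axes=(1, k)), 0, k)
        return U, S

    rng = np.random.default_rng(6)
    A = rng.standard_normal((3, 4, 5))
    U, S = hosvd(A)

    # Reconstruct A = (U_1, U_2, U_3) · S.
    B = S
    for k, Uk in enumerate(U):
        B = np.moveaxis(np.tensordot(Uk, B, axes=(1, k)), 0, k)
    assert np.allclose(A, B)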

Further reading

  1. Landsberg, J. M. Tensors: Geometry and Applications. Providence, R.I.: American Mathematical Society, 2012. ISBN 9780821869079. OCLC 733546583.
  2. Greub, Werner. Multilinear Algebra. Universitext. Springer, 1978. ISBN 9780387902845.