Tensor reshaping explained

In multilinear algebra, a reshaping of tensors is any bijection between the set of indices of an order-$M$ tensor and the set of indices of an order-$L$ tensor, where $L < M$. The use of indices presupposes tensors in coordinate representation with respect to a basis. The coordinate representation of a tensor can be regarded as a multi-dimensional array, and a bijection from one set of indices to another therefore amounts to a rearrangement of the array elements into an array of a different shape. Such a rearrangement constitutes a particular kind of linear map between the vector space of order-$M$ tensors and the vector space of order-$L$ tensors.

Definition

Given a positive integer $M$, the notation $[M]$ refers to the set $\{1,\ldots,M\}$ of the first $M$ positive integers.

For each integer $m$ where $1 \le m \le M$ for a positive integer $M$, let $V_m$ denote an $I_m$-dimensional vector space over a field $F$. Then there are vector space isomorphisms (linear maps)

$\begin{aligned} V_1 \otimes \cdots \otimes V_M & \simeq F^{I_1} \otimes \cdots \otimes F^{I_M} \\ & \simeq F^{I_{\pi(1)}} \otimes \cdots \otimes F^{I_{\pi(M)}} \\ & \simeq F^{I_{\pi(1)} I_{\pi(2)}} \otimes F^{I_{\pi(3)}} \otimes \cdots \otimes F^{I_{\pi(M)}} \\ & \simeq F^{I_{\pi(1)} I_{\pi(3)}} \otimes F^{I_{\pi(2)}} \otimes F^{I_{\pi(4)}} \otimes \cdots \otimes F^{I_{\pi(M)}} \\ & \,\,\,\vdots \\ & \simeq F^{I_1 I_2 \ldots I_M}, \end{aligned}$

where $\pi \in \mathfrak{S}_M$ is any permutation and $\mathfrak{S}_M$ is the symmetric group on $M$ elements. Via these (and other) vector space isomorphisms, a tensor can be interpreted in several ways as an order-$L$ tensor where $L \le M$.
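One way to see why such isomorphisms can exist is to count indices: every space in the chain has the same dimension, so the corresponding index sets have the same size. The following is a minimal Python sketch of that count. The dimensions `dims`, the permutation `pi`, and the helper `index_set` are illustrative assumptions, not part of the definition.

```python
from itertools import product
from math import prod

def index_set(shape):
    """All 1-based index tuples of [I_1] x ... x [I_M]."""
    return list(product(*(range(1, n + 1) for n in shape)))

dims = (2, 3, 4)   # I_1, I_2, I_3 (illustrative values)
pi = (3, 1, 2)     # a permutation of (1, 2, 3), written 1-based

permuted = tuple(dims[p - 1] for p in pi)          # (I_pi(1), I_pi(2), I_pi(3))
merged = (permuted[0] * permuted[1], permuted[2])  # first two factors grouped
vectorized = (prod(dims),)                         # all factors grouped

# Each index set has I_1 * I_2 * I_3 elements, so bijections between them exist.
sizes = [len(index_set(s)) for s in (dims, permuted, merged, vectorized)]
print(sizes)  # [24, 24, 24, 24]
```

Equal sizes only show that a bijection exists; which bijection is used is a separate choice.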

Coordinate representation

The first vector space isomorphism on the list above, $V_1 \otimes \cdots \otimes V_M \simeq F^{I_1} \otimes \cdots \otimes F^{I_M}$, gives the coordinate representation of an abstract tensor. Assume that each of the $M$ vector spaces $V_m$ has a basis $\{v_1^m, v_2^m, \ldots, v_{I_m}^m\}$. The expression of a tensor with respect to this basis has the form

$\mathcal{A} = \sum_{i_1=1}^{I_1} \ldots \sum_{i_M=1}^{I_M} a_{i_1,i_2,\ldots,i_M} v_{i_1}^1 \otimes v_{i_2}^2 \otimes \cdots \otimes v_{i_M}^M,$

where the coefficients $a_{i_1,i_2,\ldots,i_M}$ are elements of $F$. The coordinate representation of $\mathcal{A}$ is

$\sum_{i_1=1}^{I_1} \ldots \sum_{i_M=1}^{I_M} a_{i_1,i_2,\ldots,i_M} \mathbf{e}_{i_1}^1 \otimes \mathbf{e}_{i_2}^2 \otimes \cdots \otimes \mathbf{e}_{i_M}^M,$

where $\mathbf{e}_i^m$ is the $i$th standard basis vector of $F^{I_m}$. This can be regarded as an $M$-way array whose elements are the coefficients $a_{i_1,i_2,\ldots,i_M}$.
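As a small illustration, take $M = 2$ and $I_1 = I_2 = 2$ (values chosen here purely for concreteness). The coordinate representation then expands to four terms, and its coefficients can be read off as a $2 \times 2$ array:

$\sum_{i_1=1}^{2} \sum_{i_2=1}^{2} a_{i_1,i_2} \, \mathbf{e}_{i_1}^1 \otimes \mathbf{e}_{i_2}^2 = a_{1,1} \, \mathbf{e}_1^1 \otimes \mathbf{e}_1^2 + a_{1,2} \, \mathbf{e}_1^1 \otimes \mathbf{e}_2^2 + a_{2,1} \, \mathbf{e}_2^1 \otimes \mathbf{e}_1^2 + a_{2,2} \, \mathbf{e}_2^1 \otimes \mathbf{e}_2^2 \;\longleftrightarrow\; \begin{bmatrix} a_{1,1} & a_{1,2} \\ a_{2,1} & a_{2,2} \end{bmatrix}$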

General flattenings

For any permutation $\pi \in \mathfrak{S}_M$ there is a canonical isomorphism between the two tensor products of vector spaces $V_1 \otimes V_2 \otimes \cdots \otimes V_M$ and $V_{\pi(1)} \otimes V_{\pi(2)} \otimes \cdots \otimes V_{\pi(M)}$. Parentheses are usually omitted from such products due to the natural isomorphism between $V_i \otimes (V_j \otimes V_k)$ and $(V_i \otimes V_j) \otimes V_k$, but may, of course, be reintroduced to emphasize a particular grouping of factors. In the grouping

$(V_{\pi(1)} \otimes \cdots \otimes V_{\pi(r_1)}) \otimes (V_{\pi(r_1+1)} \otimes \cdots \otimes V_{\pi(r_2)}) \otimes \cdots \otimes (V_{\pi(r_{L-1}+1)} \otimes \cdots \otimes V_{\pi(r_L)}),$

there are $L$ groups with $r_l - r_{l-1}$ factors in the $l$th group (where $r_0 = 0$ and $r_L = M$).

Letting $S_l = (\pi(r_{l-1}+1), \pi(r_{l-1}+2), \ldots, \pi(r_l))$ for each $l$ satisfying $1 \le l \le L$, an $(S_1, S_2, \ldots, S_L)$-flattening of a tensor $\mathcal{A}$, denoted $\mathcal{A}_{(S_1,S_2,\ldots,S_L)}$, is obtained by applying two processes, coordinate representation and vectorization, within each of the $L$ groups of factors. That is, the coordinate representation of the $l$th group of factors is obtained using the isomorphism $(V_{\pi(r_{l-1}+1)} \otimes V_{\pi(r_{l-1}+2)} \otimes \cdots \otimes V_{\pi(r_l)}) \simeq (F^{I_{\pi(r_{l-1}+1)}} \otimes F^{I_{\pi(r_{l-1}+2)}} \otimes \cdots \otimes F^{I_{\pi(r_l)}})$, which requires specifying bases for all of the vector spaces $V_k$. The result is then vectorized using a bijection $\mu_l : [I_{\pi(r_{l-1}+1)}] \times [I_{\pi(r_{l-1}+2)}] \times \cdots \times [I_{\pi(r_l)}] \to [I_{S_l}]$ to obtain an element of $F^{I_{S_l}}$, where $I_{S_l} := \prod_{i=r_{l-1}+1}^{r_l} I_{\pi(i)}$, the product of the dimensions of the vector spaces in the $l$th group of factors. The result of applying these isomorphisms within each group of factors is an element of $F^{I_{S_1}} \otimes \cdots \otimes F^{I_{S_L}}$, which is a tensor of order $L$.
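The grouping can be made concrete with a short Python sketch. It assumes the tensor is stored as a dictionary from 1-based index tuples to coefficients, and that each $\mu_l$ lets the first mode listed in its group vary fastest. Both are illustrative choices, and the function name `flatten` is hypothetical.

```python
from itertools import product
from math import prod

def flatten(coeffs, shape, groups):
    """(S_1, ..., S_L)-flattening of {1-based index tuple: coefficient}.

    `groups` lists S_1, ..., S_L as tuples of 1-based mode numbers. Within each
    group the first listed mode varies fastest (an assumed choice of mu_l).
    """
    new_shape = tuple(prod(shape[m - 1] for m in group) for group in groups)
    result = {}
    for index, value in coeffs.items():
        new_index = []
        for group in groups:
            position, stride = 1, 1
            for m in group:
                position += (index[m - 1] - 1) * stride
                stride *= shape[m - 1]
            new_index.append(position)
        result[tuple(new_index)] = value
    return result, new_shape

# Encode the coefficient a_{i,j,k} as the number 100*i + 10*j + k.
shape = (2, 3, 2)
a = {ix: 100 * ix[0] + 10 * ix[1] + ix[2]
     for ix in product(*(range(1, n + 1) for n in shape))}

# S_1 = (3, 1), S_2 = (2): an order-3 tensor becomes an order-2 tensor.
flat, flat_shape = flatten(a, shape, [(3, 1), (2,)])
print(flat_shape)    # (4, 3)
print(flat[(2, 3)])  # 132
```

In this example entry $(2, 3)$ of the flattened tensor is $a_{1,3,2}$, because position 2 in the group $(3, 1)$ corresponds to $i_3 = 2$, $i_1 = 1$.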

Vectorization

By means of a bijective map $\mu : [I_1] \times \cdots \times [I_M] \to [I_1 \cdots I_M]$, a vector space isomorphism between $F^{I_1} \otimes \cdots \otimes F^{I_M}$ and $F^{I_1 \cdots I_M}$ is constructed via the mapping $\mathbf{e}_{i_1}^1 \otimes \cdots \otimes \mathbf{e}_{i_m}^m \otimes \cdots \otimes \mathbf{e}_{i_M}^M \mapsto \mathbf{e}_{\mu(i_1,i_2,\ldots,i_M)}$, where for every natural number $i$ such that $1 \le i \le I_1 \cdots I_M$, the vector $\mathbf{e}_i$ denotes the $i$th standard basis vector of $F^{I_1 \cdots I_M}$. In such a reshaping, the tensor is simply interpreted as a vector in $F^{I_1 \cdots I_M}$. This is known as vectorization, and is analogous to vectorization of matrices. A standard choice of bijection $\mu$ is such that

$\operatorname{vec}(\mathcal{A}) = \begin{bmatrix} a_{1,1,\ldots,1} & a_{2,1,\ldots,1} & \cdots & a_{I_1,1,\ldots,1} & a_{1,2,1,\ldots,1} & \cdots & a_{I_1,I_2,\ldots,I_M} \end{bmatrix}^T,$

which is consistent with the way in which the colon operator in Matlab and GNU Octave reshapes a higher-order tensor into a vector. In general, the vectorization of $\mathcal{A}$ is the vector $[a_{\mu^{-1}(i)}]_{i=1}^{I_1 \cdots I_M}$.

The vectorization of $\mathcal{A}$, denoted $\operatorname{vec}(\mathcal{A})$ or $\mathcal{A}_{[:]}$, is an $[S_1, S_2]$-reshaping where $S_1 = (1,2,\ldots,M)$ and $S_2 = \emptyset$.
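The standard bijection, in which the first index varies fastest, can be sketched in Python as follows. The dictionary storage and the names `mu` and `vec` are assumptions made for this illustration; the code is not taken from Matlab or GNU Octave, it only mirrors the ordering described above.

```python
from itertools import product

def mu(index, shape):
    """Column-major bijection [I_1] x ... x [I_M] -> [I_1 ... I_M], 1-based.

    The first index varies fastest.
    """
    position, stride = 1, 1
    for i, size in zip(index, shape):
        position += (i - 1) * stride
        stride *= size
    return position

def vec(coeffs, shape):
    """Vectorize a tensor stored as {1-based index tuple: coefficient}."""
    out = [None] * len(coeffs)
    for index in product(*(range(1, n + 1) for n in shape)):
        out[mu(index, shape) - 1] = coeffs[index]
    return out

# A 2 x 3 example whose coefficient a_{i,j} is encoded as the number 10*i + j.
shape = (2, 3)
a = {(i, j): 10 * i + j for i in (1, 2) for j in (1, 2, 3)}
print(vec(a, shape))  # [11, 21, 12, 22, 13, 23]
```

For a matrix this reproduces the usual column-stacking vectorization.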

Mode-m Flattening / Mode-m Matrixization

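Let $\mathcal{A} \in F^{I_1} \otimes F^{I_2} \otimes \cdots \otimes F^{I_M}$ be the coordinate representation of an abstract tensor with respect to a basis. Mode-m matrixizing (a.k.a. flattening) of $\mathcal{A}$ is an $[S_1, S_2]$-reshaping in which $S_1 = (m)$ and $S_2 = (1,2,\ldots,m-1,m+1,\ldots,M)$. Usually, a standard matrixizing is denoted by

$\mathbf{A}_{[m]} = \mathcal{A}_{[S_1,S_2]}$

This reshaping is sometimes called matrixizing, matricizing, flattening or unfolding in the literature. A standard choice for the bijections $\mu_1, \mu_2$ is the one that is consistent with the reshape function in Matlab and GNU Octave, namely

$\mathbf{A}_{[m]} := \begin{bmatrix} a_{1,1,\ldots,1,1,1,\ldots,1} & a_{2,1,\ldots,1,1,1,\ldots,1} & \cdots & a_{I_1,I_2,\ldots,I_{m-1},1,I_{m+1},\ldots,I_M} \\ a_{1,1,\ldots,1,2,1,\ldots,1} & a_{2,1,\ldots,1,2,1,\ldots,1} & \cdots & a_{I_1,I_2,\ldots,I_{m-1},2,I_{m+1},\ldots,I_M} \\ \vdots & \vdots & & \vdots \\ a_{1,1,\ldots,1,I_m,1,\ldots,1} & a_{2,1,\ldots,1,I_m,1,\ldots,1} & \cdots & a_{I_1,I_2,\ldots,I_{m-1},I_m,I_{m+1},\ldots,I_M} \end{bmatrix}$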
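A minimal Python sketch of this standard mode-m matrixizing is given here. It assumes dictionary storage with 1-based index tuples, and the function name `mode_m_matrixize` is hypothetical. The row index is $i_m$, and the column index sweeps the remaining modes with smaller mode numbers varying fastest.

```python
from itertools import product

def mode_m_matrixize(coeffs, shape, m):
    """Mode-m matrixizing of {1-based index tuple: coefficient}.

    Row index j = i_m; the column index k sweeps the remaining modes with
    smaller mode numbers varying fastest. Returns a list of rows.
    """
    rows = shape[m - 1]
    cols = 1
    for n, size in enumerate(shape, start=1):
        if n != m:
            cols *= size
    matrix = [[None] * cols for _ in range(rows)]
    for index, value in coeffs.items():
        k, stride = 1, 1
        for n, size in enumerate(shape, start=1):
            if n == m:
                continue  # mode m supplies the row index, not the column
            k += (index[n - 1] - 1) * stride
            stride *= size
        matrix[index[m - 1] - 1][k - 1] = value
    return matrix

# Encode the coefficient a_{i,j,k} as the number 100*i + 10*j + k.
shape = (2, 3, 2)
a = {ix: 100 * ix[0] + 10 * ix[1] + ix[2]
     for ix in product(*(range(1, n + 1) for n in shape))}

for row in mode_m_matrixize(a, shape, m=2):
    print(row)
# [111, 211, 112, 212]
# [121, 221, 122, 222]
# [131, 231, 132, 232]
```

Each row fixes the second index, and within a row the first index changes before the third.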

Definition Mode-m Matrixizing: $[\mathbf{A}_{[m]}]_{jk} = a_{i_1 \ldots i_m \ldots i_M}, \;\; \text{where } j = i_m \text{ and } k = 1 + \sum_{n=1,\, n \neq m}^{M} (i_n - 1) \prod_{l=1,\, l \neq m}^{n-1} I_l.$ The mode-m matrixizing of a tensor $\mathcal{A} \in F^{I_1 \times \cdots \times I_M}$ is defined as the matrix $\mathbf{A}_{[m]} \in F^{I_m \times (I_1 \cdots I_{m-1} I_{m+1} \cdots I_M)}$. As the parenthetical ordering indicates, the mode-m column vectors are arranged by sweeping all the other mode indices through their ranges, with smaller mode indexes varying more rapidly than larger ones; thu