Elastic maps provide a tool for nonlinear dimensionality reduction. By construction, an elastic map is a system of elastic springs embedded in the data space. This system approximates a low-dimensional manifold. The elastic coefficients of the system allow a switch from completely unstructured k-means clustering (zero elasticity) to estimators located close to linear PCA manifolds (for high bending and low stretching moduli). With intermediate values of the elasticity coefficients, the system effectively approximates nonlinear principal manifolds. The approach is based on a mechanical analogy between principal manifolds, which pass through "the middle" of the data distribution, and elastic membranes and plates. The method was developed by A.N. Gorban, A.Y. Zinovyev and A.A. Pitenko in 1996–1998.
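Let $\mathcal{S}$ be a data set in a finite-dimensional Euclidean space. The elastic map is represented by a set of nodes $\mathbf{w}_j$ in the same space. Each data point $s \in \mathcal{S}$ has a host node, namely the closest node $\mathbf{w}_j$. The data set $\mathcal{S}$ is thus divided into classes

$$K_j = \{ s \mid \mathbf{w}_j \text{ is a host of } s \}.$$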
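The approximation energy $D$ is the distortion

$$D = \frac{1}{2} \sum_{j=1}^{k} \sum_{s \in K_j} \| s - \mathbf{w}_j \|^2,$$

which is the energy of springs with unit elasticity that connect each data point to its host node. Weighting factors can be applied to the terms of this sum, for example to reflect the standard deviation of the probability density function of any subset of data points $\{s_i\}$.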
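The host assignment and the distortion can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function names `assign_hosts` and `distortion` and the toy data are hypothetical, all weights are taken as one, and ties between equally close nodes are broken by the lowest node index (the behaviour of `argmin`).

```python
import numpy as np

def assign_hosts(data, nodes):
    """Return, for each data point, the index j of its closest node w_j."""
    # Squared Euclidean distances, shape (n_points, n_nodes).
    d2 = ((data[:, None, :] - nodes[None, :, :]) ** 2).sum(axis=2)
    # argmin picks the lowest index when several nodes are equally close.
    return d2.argmin(axis=1)

def distortion(data, nodes, hosts):
    """Approximation energy D = 1/2 * sum_j sum_{s in K_j} ||s - w_j||^2."""
    residuals = data - nodes[hosts]
    return 0.5 * float((residuals ** 2).sum())

# Toy data: four points in the plane and two nodes (illustrative values).
data = np.array([[0.0, 0.0], [1.0, 0.0], [4.0, 0.0], [5.0, 1.0]])
nodes = np.array([[0.0, 0.0], [5.0, 0.0]])

hosts = assign_hosts(data, nodes)   # class K_j = points with hosts == j
D = distortion(data, nodes, hosts)
print(hosts.tolist(), D)
```

Here the first two points are hosted by node 0 and the last two by node 1, so the classes are read off directly from the `hosts` array.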
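On the set of nodes an additional structure is defined. Some pairs of nodes, $(\mathbf{w}_i, \mathbf{w}_j)$, are connected by elastic edges; call this set of pairs $E$. Some triplets of nodes, $(\mathbf{w}_i, \mathbf{w}_j, \mathbf{w}_k)$, form bending ribs; call this set of triplets $G$.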
The stretching energy is

$$U_E = \frac{1}{2} \lambda \sum_{(\mathbf{w}_i, \mathbf{w}_j) \in E} \| \mathbf{w}_i - \mathbf{w}_j \|^2.$$
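The bending energy is

$$U_G = \frac{1}{2} \mu \sum_{(\mathbf{w}_i, \mathbf{w}_j, \mathbf{w}_k) \in G} \| \mathbf{w}_i - 2\mathbf{w}_j + \mathbf{w}_k \|^2,$$

where $\lambda$ and $\mu$ are the stretching and bending moduli, respectively.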
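A minimal sketch of the two elastic energies follows. It assumes that edges and ribs are stored as tuples of node indices; the function names and the three-node example are hypothetical.

```python
import numpy as np

def stretching_energy(nodes, edges, lam):
    """U_E = 1/2 * lam * sum over edges (i, j) of ||w_i - w_j||^2."""
    i, j = np.asarray(edges).T
    return 0.5 * lam * float(((nodes[i] - nodes[j]) ** 2).sum())

def bending_energy(nodes, ribs, mu):
    """U_G = 1/2 * mu * sum over ribs (i, j, k) of ||w_i - 2 w_j + w_k||^2."""
    i, j, k = np.asarray(ribs).T
    second_difference = nodes[i] - 2.0 * nodes[j] + nodes[k]
    return 0.5 * mu * float((second_difference ** 2).sum())

# A three-node chain with a kink at the middle node (illustrative values).
nodes = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]])
edges = [(0, 1), (1, 2)]
ribs = [(0, 1, 2)]

U_E = stretching_energy(nodes, edges, lam=0.5)
U_G = bending_energy(nodes, ribs, mu=2.0)

# Equally spaced collinear nodes have zero second difference, hence no bending energy.
straight = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
U_G_straight = bending_energy(straight, ribs, mu=2.0)
print(U_E, U_G, U_G_straight)
```

The bending term penalises only deviation from an evenly spaced straight rib, which is why high $\mu$ pulls the map towards a linear manifold.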
For example, on the 2D rectangular grid the elastic edges are just vertical and horizontal edges (pairs of closest vertices) and the bending ribs are the vertical or horizontal triplets of consecutive (closest) vertices.
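The rectangular-grid case can be made concrete with a short sketch that enumerates the edges and ribs. It assumes row-major node numbering, and the helper name `grid_edges_and_ribs` is hypothetical.

```python
def grid_edges_and_ribs(n_rows, n_cols):
    """Enumerate elastic edges and bending ribs of an n_rows x n_cols grid.

    Nodes are numbered row by row: node (r, c) has index r * n_cols + c.
    """
    def idx(r, c):
        return r * n_cols + c

    edges, ribs = [], []
    for r in range(n_rows):
        for c in range(n_cols):
            # Edges: pairs of horizontally or vertically adjacent vertices.
            if c + 1 < n_cols:
                edges.append((idx(r, c), idx(r, c + 1)))
            if r + 1 < n_rows:
                edges.append((idx(r, c), idx(r + 1, c)))
            # Ribs: horizontal or vertical triplets of consecutive vertices.
            if c + 2 < n_cols:
                ribs.append((idx(r, c), idx(r, c + 1), idx(r, c + 2)))
            if r + 2 < n_rows:
                ribs.append((idx(r, c), idx(r + 1, c), idx(r + 2, c)))
    return edges, ribs

edges, ribs = grid_edges_and_ribs(3, 3)
print(len(edges), len(ribs))
```

For a 3 × 3 grid this gives 12 edges (6 horizontal, 6 vertical) and 6 ribs (one per row and one per column).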
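The total energy of the elastic map is thus

$$U = D + U_E + U_G.$$

The positions of the nodes $\{\mathbf{w}_j\}$ are determined by the mechanical equilibrium of the elastic map, that is, they are chosen to minimize the total energy $U$.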
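For a given splitting of the dataset $\mathcal{S}$ into classes $K_j$, minimization of the quadratic functional $U$ is a linear problem with a sparse matrix of coefficients. Therefore, as in principal component analysis or k-means, a splitting method is used: for given $\{\mathbf{w}_j\}$, find $\{K_j\}$; then, for given $\{K_j\}$, minimize $U$ and find $\{\mathbf{w}_j\}$; repeat until the classes no longer change.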
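The alternating scheme can be sketched as follows. This is a minimal dense-matrix illustration, not a reference implementation: the function name, the chain topology, the synthetic data and the parameter values are all assumptions. With the classes fixed, setting the gradient of $U$ to zero gives the linear system $(N + \lambda B^\top B + \mu R^\top R)\,W = C$, where $N$ is the diagonal matrix of class sizes, $C$ stacks the per-class sums of data points, and $B$ and $R$ are the edge and rib difference operators. The sketch assumes $\lambda > 0$ and a connected edge graph, so that the system is nonsingular even when a node hosts no data.

```python
import numpy as np

def fit_elastic_map(data, nodes, edges, ribs, lam, mu, max_iter=100):
    """Alternate host assignment and a linear solve until the classes are stable."""
    k = len(nodes)
    # Difference operators: U_E = lam/2 * ||B W||^2 and U_G = mu/2 * ||R W||^2.
    B = np.zeros((len(edges), k))
    for row, (i, j) in enumerate(edges):
        B[row, i], B[row, j] = 1.0, -1.0
    R = np.zeros((len(ribs), k))
    for row, (i, j, m) in enumerate(ribs):
        R[row, i], R[row, j], R[row, m] = 1.0, -2.0, 1.0
    elastic = lam * B.T @ B + mu * R.T @ R

    nodes = np.array(nodes, dtype=float)
    hosts = None
    for _ in range(max_iter):
        # For given nodes, find the classes K_j (closest-node assignment).
        d2 = ((data[:, None, :] - nodes[None, :, :]) ** 2).sum(axis=2)
        new_hosts = d2.argmin(axis=1)
        if hosts is not None and np.array_equal(new_hosts, hosts):
            break  # No change in the classes: terminate.
        hosts = new_hosts
        # For given K_j, minimize U by solving (N + elastic) W = C.
        counts = np.bincount(hosts, minlength=k).astype(float)
        sums = np.zeros_like(nodes)
        np.add.at(sums, hosts, data)  # per-class sums of data points
        nodes = np.linalg.solve(np.diag(counts) + elastic, sums)
    return nodes, hosts

# Synthetic example: noisy parabola fitted by a chain of 8 nodes.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 60)
data = np.column_stack([x, x ** 2]) + 0.05 * rng.normal(size=(60, 2))
k = 8
init = np.column_stack([np.linspace(-1.0, 1.0, k), np.zeros(k)])
edges = [(i, i + 1) for i in range(k - 1)]
ribs = [(i, i + 1, i + 2) for i in range(k - 2)]

fitted, hosts = fit_elastic_map(data, init, edges, ribs, lam=0.01, mu=0.1)
```

A dense solve is used here only for brevity; since the coefficient matrix is sparse, a sparse solver would be the natural choice for large grids.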
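This expectation-maximization algorithm guarantees a local minimum of $U$. Various additional methods have been proposed to improve the approximation. For example, the softening strategy starts with a rigid grid (large stretching and bending moduli $\lambda$ and $\mu$) and finishes with a soft grid (small $\lambda$ and $\mu$). Training proceeds in several epochs, each with its own grid rigidity.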
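A softening strategy, in which training starts with a rigid grid (large $\lambda$ and $\mu$) and ends with a soft one (small $\lambda$ and $\mu$), can be illustrated with a simple schedule of moduli, one pair per epoch. The geometric decay, the function name and the numerical values below are assumptions made for illustration only.

```python
import numpy as np

def softening_schedule(lam_start, lam_end, mu_start, mu_end, n_epochs):
    """Return one (lam, mu) pair per epoch, decreasing geometrically."""
    lams = np.geomspace(lam_start, lam_end, n_epochs)
    mus = np.geomspace(mu_start, mu_end, n_epochs)
    return list(zip(lams.tolist(), mus.tolist()))

# Three epochs from a rigid grid to a soft grid (illustrative values).
schedule = softening_schedule(1.0, 0.01, 10.0, 0.1, n_epochs=3)
for epoch, (lam, mu) in enumerate(schedule):
    # In each epoch the map would be refitted with these moduli,
    # starting from the node positions found in the previous epoch.
    print(epoch, lam, mu)
```

Each epoch's fit is warm-started from the previous one, so the early rigid epochs fix the overall shape of the map and the later soft epochs refine it.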
The most important applications of the method and its free software are in bioinformatics,[2][3] for exploratory data analysis and visualisation of multidimensional data; in data visualisation for economics, social and political sciences;[4] and as an auxiliary tool for data mapping in geographic information systems and for visualisation of data of various kinds.
The method is applied in quantitative biology for reconstructing the curved surface of a tree leaf from a stack of light microscopy images.[5] This reconstruction is used for quantifying the geodesic distances between trichomes and their patterning, which is a marker of the capability of a plant to resist pathogens.
More recently, the method has been adapted as a support tool in the decision process underlying the selection, optimization, and management of financial portfolios.[6]
The method of elastic maps has been systematically tested and compared with several machine learning methods on the applied problem of identifying the flow regime of a gas-liquid flow in a pipe.[7] There are various regimes: single-phase water or air flow, bubbly flow, bubbly-slug flow, slug flow, slug-churn flow, churn flow, churn-annular flow, and annular flow. The simplest and most common way to identify the flow regime is visual observation. This approach is, however, subjective and unsuitable for relatively high gas and liquid flow rates. For this reason, many authors have proposed machine learning methods. The methods are applied to differential pressure data collected during a calibration process. The method of elastic maps provided a 2D map on which the area of each regime is represented. The comparison with some other machine learning methods is presented in Table 1 for various pipe diameters and pressures.
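Table 1. Flow regime identification accuracy (%) of different machine learning algorithms

Method | Calibration | Testing | Larger diameter | Higher pressure |
---|---|---|---|---|
Elastic map | 100 | 98.2 | 100 | 100 |
ANN | 99.1 | 89.2 | 76.2 | 70.5 |
SVM | 100 | 88.5 | 61.7 | 70.5 |
SOM (small) | 94.9 | 94.2 | 83.6 | 88.6 |
SOM (large) | 100 | 94.6 | 82.1 | 84.1 |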
Here, ANN stands for backpropagation artificial neural networks, SVM for support vector machines, and SOM for self-organizing maps. A hybrid technology was developed for engineering applications.[8] In this technology, elastic maps are used in combination with principal component analysis (PCA), independent component analysis (ICA) and backpropagation ANN.
The textbook[9] provides a systematic comparison of elastic maps and self-organizing maps (SOMs) in applications to economic and financial decision-making.