In machine learning, a Hyper basis function network, or HyperBF network, is a generalization of the radial basis function (RBF) network concept, in which a Mahalanobis-like distance is used instead of the Euclidean distance measure. Hyper basis function networks were first introduced by Poggio and Girosi in the 1990 paper "Networks for Approximation and Learning".[1][2]
The typical HyperBF network structure consists of a real input vector x \in \mathbb{R}^n, a hidden layer of activation functions, and a linear output layer. The output of the network is a scalar function of the input vector, \phi : \mathbb{R}^n \to \mathbb{R}, given by

\phi(x) = \sum_{j=1}^{N} a_j \rho_j(\| x - \mu_j \|)
where N is the number of neurons in the hidden layer, and \mu_j and a_j are the center and weight of neuron j, respectively. The activation function \rho_j(\| x - \mu_j \|) of the HyperBF network takes the form

\rho_j(\| x - \mu_j \|) = e^{-(x - \mu_j)^T R_j (x - \mu_j)}
where R_j is a positive definite d \times d matrix. Depending on the application, the following types of matrices R_j are usually considered:

- R_j = \frac{1}{2\sigma^2} I_{d \times d}, where \sigma > 0. This case corresponds to the regular RBF network.
- R_j = \frac{1}{2\sigma_j^2} I_{d \times d}, where \sigma_j > 0. In this case, the basis functions are radially symmetric, but each neuron has its own width.
- R_j = \operatorname{diag}\left( \frac{1}{2\sigma_{j1}^2}, \ldots, \frac{1}{2\sigma_{jd}^2} \right) I_{d \times d}, where \sigma_{ji} > 0. In this case, every neuron has an elliptic shape with a varying size.
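As a concrete illustration of the formulas above, the following is a minimal sketch of the forward pass in Python/JAX; the function names hyperbf_forward and diagonal_R, and the array shapes, are illustrative assumptions rather than part of the original formulation. diagonal_R builds the diagonal "elliptic" choice of R_j from the list above.

    import jax.numpy as jnp

    def hyperbf_forward(x, a, mu, R):
        """phi(x) = sum_j a_j * exp(-(x - mu_j)^T R_j (x - mu_j)).

        x  : (d,)       input vector
        a  : (N,)       output weights a_j
        mu : (N, d)     neuron centers mu_j
        R  : (N, d, d)  positive definite shape matrices R_j
        """
        diff = x - mu                                      # (N, d): x - mu_j for every neuron j
        quad = jnp.einsum('jd,jde,je->j', diff, R, diff)   # Mahalanobis-like squared distances
        return jnp.sum(a * jnp.exp(-quad))

    def diagonal_R(sigmas):
        """Elliptic case: R_j = diag(1/(2*sigma_j1^2), ..., 1/(2*sigma_jd^2)); sigmas: (N, d), all > 0."""
        return jnp.eye(sigmas.shape[1]) * (1.0 / (2.0 * sigmas ** 2))[:, None, :]

Setting every row of sigmas to the same constant recovers the regular RBF network as a special case.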
Training HyperBF networks involves estimation of the weights a_j, the shape matrices R_j, and the centers \mu_j of the neurons.
Consider the quadratic loss of the network,

H[\phi^*] = \sum_{i=1}^{N} \left( y_i - \phi^*(x_i) \right)^2.

At the optimum, the following conditions must be satisfied:

\frac{\partial H(\phi^*)}{\partial a_j} = 0, \quad \frac{\partial H(\phi^*)}{\partial \mu_j} = 0, \quad \frac{\partial H(\phi^*)}{\partial W} = 0,
where R_j = W^T W. Then, in the gradient descent method, the values of a_j, \mu_j, and W that minimize H[\phi^*] can be found as a stable fixed point of the following dynamic system:

\dot{a}_j = -\omega \frac{\partial H(\phi^*)}{\partial a_j}, \quad \dot{\mu}_j = -\omega \frac{\partial H(\phi^*)}{\partial \mu_j}, \quad \dot{W} = -\omega \frac{\partial H(\phi^*)}{\partial W},

where \omega is a fixed parameter that determines the rate of convergence of the gradient descent method.
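As a rough sketch of these training dynamics (not the original Poggio-Girosi procedure), one can take Euler steps of the gradient flow and let automatic differentiation supply \partial H / \partial a_j, \partial H / \partial \mu_j, and \partial H / \partial W. Here R_j is parameterized as W_j^T W_j with a per-neuron factor W_j, which is an assumption (a single shared W, as in the equations above, would work similarly); all function and variable names (loss, gradient_step, omega) are illustrative.

    import jax
    import jax.numpy as jnp

    def loss(params, X, y):
        """Quadratic loss H[phi*] = sum_i (y_i - phi*(x_i))^2, with R_j = W_j^T W_j (assumption)."""
        a, mu, W = params                                     # a: (N,), mu: (N, d), W: (N, d, d)
        R = jnp.einsum('jkd,jke->jde', W, W)                  # R_j = W_j^T W_j, positive semidefinite
        diff = X[:, None, :] - mu[None, :, :]                 # (i, j, d): x_i - mu_j
        quad = jnp.einsum('ijd,jde,ije->ij', diff, R, diff)   # (x_i - mu_j)^T R_j (x_i - mu_j)
        preds = jnp.exp(-quad) @ a                            # phi*(x_i)
        return jnp.sum((y - preds) ** 2)

    @jax.jit
    def gradient_step(params, X, y, omega=1e-3):
        """One Euler step of the gradient flow: each parameter moves by -omega * dH/dparameter."""
        grads = jax.grad(loss)(params, X, y)
        return jax.tree_util.tree_map(lambda p, g: p - omega * g, params, grads)

    # Illustrative usage: params = (a, mu, W); iterate until H[phi*] stops decreasing.
    # for _ in range(10_000):
    #     params = gradient_step(params, X_train, y_train)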
Overall, training HyperBF networks can be computationally challenging. Moreover, the high degree of freedom of HyperBF networks can lead to overfitting and poor generalization. However, HyperBF networks have the important advantage that a small number of neurons is enough for learning complex functions.[2]