In functional analysis (a branch of mathematics), a reproducing kernel Hilbert space (RKHS) is a Hilbert space of functions in which point evaluation is a continuous linear functional. Roughly speaking, this means that if two functions f and g in the RKHS are close in norm, i.e., \|f-g\| is small, then f and g are also pointwise close, i.e., |f(x)-g(x)| is small for every x. The converse does not need to be true. For example, the sequence of functions \sin^{2n}(x) converges pointwise but not uniformly (i.e., not with respect to the supremum norm); note, however, that the supremum norm does not arise from any inner product, so this sequence does not give a counterexample within a Hilbert space.
It is not entirely straightforward to construct a Hilbert space of functions which is not an RKHS.[1] Some examples, however, have been found.[2][3]
L2 spaces are not Hilbert spaces of functions (and hence not RKHSs), but rather Hilbert spaces of equivalence classes of functions (for example, the functions f and g defined by f(x) = 0 and g(x) = 1_{\mathbb{Q}}(x) are equivalent in L2, since they differ only on a set of measure zero).
An RKHS is associated with a kernel that reproduces every function in the space in the sense that for every x in the set on which the functions are defined, "evaluation at x" can be performed by taking an inner product with a function determined by the kernel.
The reproducing kernel was first introduced in the 1907 work of Stanisław Zaremba concerning boundary value problems for harmonic and biharmonic functions. James Mercer simultaneously examined functions which satisfy the reproducing property in the theory of integral equations. The idea of the reproducing kernel remained untouched for nearly twenty years until it appeared in the dissertations of Gábor Szegő, Stefan Bergman, and Salomon Bochner. The subject was eventually systematically developed in the early 1950s by Nachman Aronszajn and Stefan Bergman.[4]
These spaces have wide applications, including complex analysis, harmonic analysis, and quantum mechanics. Reproducing kernel Hilbert spaces are particularly important in the field of statistical learning theory because of the celebrated representer theorem which states that every function in an RKHS that minimises an empirical risk functional can be written as a linear combination of the kernel function evaluated at the training points. This is a practically useful result as it effectively simplifies the empirical risk minimization problem from an infinite dimensional to a finite dimensional optimization problem.
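For concreteness, one standard finite-sample formulation of the representer theorem (stated here for illustration; the loss V, the regularizer g, and the data (x_i, y_i) are generic placeholders rather than notation used elsewhere in this article) is the following: if g is a strictly increasing function and f^* minimizes

\sum_{i=1}^{n} V(y_i, f(x_i)) + g(\|f\|_H)

over f \in H, then f^* admits a representation

f^*(\cdot) = \sum_{i=1}^{n} \alpha_i \, K(\cdot, x_i)

for some coefficients \alpha_1, \ldots, \alpha_n \in \mathbb{R}.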
For ease of understanding, we provide the framework for real-valued Hilbert spaces. The theory can be easily extended to spaces of complex-valued functions and hence include the many important examples of reproducing kernel Hilbert spaces that are spaces of analytic functions.[5]
Let X be an arbitrary set and H a Hilbert space of real-valued functions on X. The evaluation functional over the Hilbert space of functions H is a linear functional that evaluates each function at a point x,

L_x : f \mapsto f(x) \quad \forall f \in H.

We say that H is a reproducing kernel Hilbert space if, for all x in X, L_x is continuous at every f in H or, equivalently, if L_x is a bounded operator on H, i.e. there exists some M_x > 0 such that

|L_x(f)| := |f(x)| \le M_x \, \|f\|_H \quad \forall f \in H. \qquad (1)

Although M_x < \infty is assumed for all x \in X, it might still be the case that \sup_x M_x = \infty.
While property (1) is the weakest condition that ensures both the existence of an inner product and the evaluation of every function in H at every point in the domain, it does not lend itself to easy application in practice. A more intuitive definition of the RKHS can be obtained by observing that this property guarantees that the evaluation functional can be represented by taking the inner product of f with a function K_x in H. This function is the so-called reproducing kernel for the Hilbert space H, from which the RKHS takes its name. More formally, the Riesz representation theorem implies that for all x in X there exists a unique element K_x of H with the reproducing property,

f(x) = L_x(f) = \langle f, K_x \rangle_H \quad \forall f \in H. \qquad (2)
Since K_x is itself a function defined on X with values in the field \mathbb{R} (or \mathbb{C} in the case of complex Hilbert spaces) and since K_x is in H, we have that

K_x(y) = L_y(K_x) = \langle K_x, K_y \rangle_H,

where K_y \in H is the element of H associated to the evaluation functional L_y.
This allows us to define the reproducing kernel of H as a function K : X \times X \to \mathbb{R} (or \mathbb{C}) by

K(x,y) = \langle K_x, K_y \rangle_H.
From this definition it is easy to see that K : X \times X \to \mathbb{R} (or \mathbb{C}) is both symmetric (resp. conjugate symmetric) and positive definite, i.e.

\sum_{i,j=1}^{n} c_i c_j K(x_i, x_j) = \sum_{i=1}^{n} c_i \left\langle K_{x_i}, \sum_{j=1}^{n} c_j K_{x_j} \right\rangle_H = \left\langle \sum_{i=1}^{n} c_i K_{x_i}, \sum_{j=1}^{n} c_j K_{x_j} \right\rangle_H = \left\| \sum_{i=1}^{n} c_i K_{x_i} \right\|_H^2 \ge 0

for every n \in \mathbb{N}, x_1, \ldots, x_n \in X, and c_1, \ldots, c_n \in \mathbb{R}.
The Moore–Aronszajn theorem (see below) is a sort of converse to this: if a function K satisfies these conditions then there is a Hilbert space of functions on X for which it is a reproducing kernel.
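As a quick numerical illustration of this positive-definiteness condition (an added sketch, not part of the standard exposition; the Gaussian kernel and the sample points below are arbitrary choices), one can form the Gram matrix (K(x_i, x_j))_{i,j} for finitely many points and confirm that its eigenvalues are nonnegative:

```python
import numpy as np

# An arbitrary positive definite kernel (Gaussian/RBF) chosen for illustration.
def k(x, y, sigma=1.0):
    return np.exp(-np.abs(x - y) ** 2 / (2 * sigma ** 2))

x = np.array([-1.3, 0.0, 0.4, 2.1, 3.7])          # arbitrary points x_1, ..., x_n
gram = np.array([[k(xi, xj) for xj in x] for xi in x])

# sum_{i,j} c_i c_j K(x_i, x_j) = c^T gram c >= 0 for all c
# is equivalent to the Gram matrix being positive semi-definite.
eigvals = np.linalg.eigvalsh(gram)
print(eigvals)                 # all eigenvalues are >= 0 (up to round-off)
c = np.random.randn(len(x))
print(c @ gram @ c >= -1e-12)  # the quadratic form is nonnegative
```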
The space of bandlimited continuous functions H is an RKHS, as we now show. Formally, fix some cutoff frequency 0 < a < \infty and define the Hilbert space

H = \{ f \in C(\mathbb{R}) \mid \operatorname{supp}(F) \subset [-a, a] \}

where C(\mathbb{R}) is the set of continuous and square integrable functions and F is the Fourier transform of f. As the inner product we use

\langle f, g \rangle_{L^2} = \int_{-\infty}^{\infty} f(x) \cdot \overline{g(x)} \, dx.
From the Fourier inversion theorem, we have
f(x) = \frac{1}{2\pi} \int_{-a}^{a} F(\omega) \, e^{i x \omega} \, d\omega.
It then follows by the Cauchy–Schwarz inequality and Plancherel's theorem that, for all x,

|f(x)| \le \frac{1}{2\pi} \sqrt{ 2a \int_{-a}^{a} |F(\omega)|^2 \, d\omega } = \frac{\sqrt{2a}}{2\pi} \sqrt{ \int_{-\infty}^{\infty} |F(\omega)|^2 \, d\omega } = \sqrt{\frac{a}{\pi}} \, \|f\|_{L^2}.
This inequality shows that the evaluation functional is bounded, proving that H is indeed an RKHS.
The kernel function K_x in this case is given by

K_x(y) = \frac{a}{\pi} \operatorname{sinc}\left( \frac{a}{\pi} (y - x) \right) = \frac{\sin(a(y - x))}{\pi (y - x)}.
The Fourier transform of K_x(y) defined above is given by

\int_{-\infty}^{\infty} K_x(y) \, e^{-i\omega y} \, dy = \begin{cases} e^{-i\omega x} & \text{if } \omega \in [-a, a], \\ 0 & \text{otherwise}, \end{cases}
which is a consequence of the time-shifting property of the Fourier transform. Consequently, using Plancherel's theorem, we have
\langle f, K_x \rangle_{L^2} = \int_{-\infty}^{\infty} f(y) \cdot \overline{K_x(y)} \, dy = \frac{1}{2\pi} \int_{-a}^{a} F(\omega) \cdot e^{i\omega x} \, d\omega = f(x).
Thus we obtain the reproducing property of the kernel.
Note that K_x in this case is the "bandlimited version" of the Dirac delta function \delta(y - x), and that K_x(y) converges to \delta(y - x) in the weak sense as the cutoff frequency a tends to infinity.
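The reproducing property can also be checked numerically (an illustration added here, not part of the original derivation): take a band-limited function f, for instance f = K_0 itself, and compare the integral \langle f, K_x \rangle_{L^2} with f(x). The cutoff a and the quadrature grid below are arbitrary choices.

```python
import numpy as np

a = 2.0  # cutoff frequency (arbitrary)

def K(x, y):
    # sin(a(y-x)) / (pi (y-x)), with the removable singularity handled by sinc
    return (a / np.pi) * np.sinc(a * (y - x) / np.pi)

f = lambda y: K(0.0, y)               # f = K_0 is itself band-limited to [-a, a]
y = np.linspace(-200, 200, 400001)    # wide grid; the kernel decays like 1/|y|

x0 = 1.3
inner = np.trapz(f(y) * K(x0, y), y)  # <f, K_{x0}>_{L^2}
print(inner, f(x0))                   # the values agree closely; the small residual
                                      # comes from truncating the integration domain
```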
We have seen how a reproducing kernel Hilbert space defines a reproducing kernel function that is both symmetric and positive definite. The Moore–Aronszajn theorem goes in the other direction; it states that every symmetric, positive definite kernel defines a unique reproducing kernel Hilbert space. The theorem first appeared in Aronszajn's Theory of Reproducing Kernels, although he attributes it to E. H. Moore.
Theorem. Suppose K is a symmetric, positive definite kernel on a set X. Then there is a unique Hilbert space of functions on X for which K is a reproducing kernel.
Proof. For all x in X, define Kx = K(x, ⋅). Let H0 be the linear span of \{K_x : x \in X\}. Define an inner product on H0 by
\left\langle \sum_{j=1}^{n} b_j K_{y_j}, \sum_{i=1}^{m} a_i K_{x_i} \right\rangle_{H_0} = \sum_{i=1}^{m} \sum_{j=1}^{n} a_i b_j K(y_j, x_i),
which implies

K(x,y) = \left\langle K_x, K_y \right\rangle_{H_0}.
Let H be the completion of H0 with respect to this inner product. Then H consists of functions of the form
f(x) = \sum_{i=1}^{\infty} a_i K_{x_i}(x) \quad \text{where} \quad \lim_{n \to \infty} \sup_{p \ge 0} \left\| \sum_{i=n}^{n+p} a_i K_{x_i} \right\|_{H_0} = 0.
Now we can check the reproducing property (2):

\langle f, K_x \rangle_H = \sum_{i=1}^{\infty} a_i \left\langle K_{x_i}, K_x \right\rangle_{H_0} = \sum_{i=1}^{\infty} a_i K(x_i, x) = f(x).
To prove uniqueness, let G be another Hilbert space of functions for which K is a reproducing kernel. For every x and y in X, (2) implies that

\langle K_x, K_y \rangle_H = K(x,y) = \langle K_x, K_y \rangle_G.

By linearity, \langle \cdot, \cdot \rangle_H = \langle \cdot, \cdot \rangle_G on the span of \{K_x : x \in X\}. Then H \subset G because G is complete and contains H_0 and hence contains its completion.
Now we need to prove that every element of G is in H. Let f be an element of G. Since H is a closed subspace of G, we can write f = f_H + f_{H^\bot} where f_H \in H and f_{H^\bot} \in H^\bot. Now if x \in X then, using the reproducing property of K in G,

f(x) = \langle K_x, f \rangle_G = \langle K_x, f_H \rangle_G + \langle K_x, f_{H^\bot} \rangle_G = \langle K_x, f_H \rangle_G = \langle K_x, f_H \rangle_H = f_H(x),

where we have used the fact that K_x belongs to H, so that its inner product with f_{H^\bot} in G is zero. This shows that f = f_H in G and concludes the proof.
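The construction of H_0 above is easy to mirror numerically (an illustrative sketch; the Gaussian kernel and the points below are arbitrary stand-ins for K and the x_i): a finite combination f = \sum_i a_i K_{x_i} is represented by its coefficients, inner products reduce to Gram-matrix bilinear forms, and the reproducing property can be checked directly.

```python
import numpy as np

def k(x, y, sigma=1.0):
    # arbitrary positive definite kernel (Gaussian), standing in for K
    return np.exp(-(x - y) ** 2 / (2 * sigma ** 2))

xs = np.array([0.0, 0.5, 2.0])     # centers x_i
a = np.array([1.0, -0.3, 0.7])     # coefficients a_i

# f = sum_i a_i K_{x_i}, an element of H_0
f = lambda t: sum(ai * k(xi, t) for ai, xi in zip(a, xs))

# <sum_i a_i K_{x_i}, sum_j b_j K_{y_j}>_{H_0} = sum_{i,j} a_i b_j K(x_i, y_j)
def inner(a, xs, b, ys):
    G = np.array([[k(xi, yj) for yj in ys] for xi in xs])
    return a @ G @ b

# Reproducing property: <f, K_x>_{H_0} = f(x)
x = 1.2
print(inner(a, xs, np.array([1.0]), np.array([x])), f(x))  # the two values coincide
```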
We may characterize a symmetric positive definite kernel K via the integral operator using Mercer's theorem and obtain an additional view of the RKHS. Let X be a compact space equipped with a strictly positive finite Borel measure \mu and K : X \times X \to \mathbb{R} a continuous, symmetric, and positive definite function. Define the integral operator T_K : L_2(X) \to L_2(X) as

[T_K f](\cdot) = \int_X K(\cdot, t) \, f(t) \, d\mu(t)

where L_2(X) is the space of square integrable functions with respect to \mu.
Mercer's theorem states that the spectral decomposition of the integral operator T_K of K yields a series representation of K in terms of the eigenvalues and eigenfunctions of T_K. This then implies that K is a reproducing kernel and that the corresponding RKHS can be defined in terms of these eigenvalues and eigenfunctions. We provide the details below.
Under these assumptions T_K is a compact, continuous, self-adjoint, and positive operator. The spectral theorem for self-adjoint compact operators implies that there is an at most countable decreasing sequence (\sigma_i)_i \ge 0 such that \lim_{i \to \infty} \sigma_i = 0 and T_K \varphi_i(x) = \sigma_i \varphi_i(x), where the \{\varphi_i\} form an orthonormal basis of L_2(X). By the positivity of T_K, \sigma_i > 0 for all i. One can also show that T_K maps continuously into the space of continuous functions C(X), and therefore we may choose continuous functions as the eigenvectors, that is, \varphi_i \in C(X) for all i. Then by Mercer's theorem K may be written in terms of the eigenvalues and continuous eigenfunctions as

K(x,y) = \sum_{j=1}^{\infty} \sigma_j \, \varphi_j(x) \, \varphi_j(y)

for all x, y \in X such that

\lim_{n \to \infty} \sup_{u,v} \left| K(u,v) - \sum_{j=1}^{n} \sigma_j \, \varphi_j(u) \, \varphi_j(v) \right| = 0.
The above series representation is referred to as a Mercer kernel or Mercer representation of K.
Furthermore, it can be shown that the RKHS H of K is given by

H = \left\{ f \in L_2(X) \;\middle|\; \sum_{i=1}^{\infty} \frac{\langle f, \varphi_i \rangle_{L_2}^2}{\sigma_i} < \infty \right\}

where the inner product of H is given by

\left\langle f, g \right\rangle_H = \sum_{i=1}^{\infty} \frac{\langle f, \varphi_i \rangle_{L_2} \, \langle g, \varphi_i \rangle_{L_2}}{\sigma_i}.
This representation of the RKHS has application in probability and statistics, for example to the Karhunen-Loève representation for stochastic processes and kernel PCA.
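A common way to approximate the Mercer representation numerically is the Nyström method (an added sketch, not part of this article's exposition; the choice X = [0, 1] with the uniform measure and a Gaussian kernel is arbitrary): discretize the integral operator T_K on a grid, take the eigendecomposition of the weighted Gram matrix, and rebuild K from a truncated eigen-sum.

```python
import numpy as np

n = 400
x = np.linspace(0.0, 1.0, n)       # grid on X = [0, 1]
w = 1.0 / n                        # quadrature weight for the uniform measure

def k(s, t, sigma=0.3):
    return np.exp(-(s - t) ** 2 / (2 * sigma ** 2))

K = k(x[:, None], x[None, :])

# Discretized integral operator: (T_K f)(x_i) ~ sum_j K(x_i, x_j) f(x_j) w.
# Its eigendecomposition approximates the Mercer eigenvalues/eigenfunctions.
evals, evecs = np.linalg.eigh(K * w)
idx = np.argsort(evals)[::-1]
sigma_hat = evals[idx]                      # approximations of sigma_i
phi_hat = evecs[:, idx] / np.sqrt(w)        # approximations of phi_i on the grid

# Truncated Mercer sum: K(x, y) ~ sum_{j<m} sigma_j phi_j(x) phi_j(y)
m = 10
K_approx = (phi_hat[:, :m] * sigma_hat[:m]) @ phi_hat[:, :m].T
print(np.max(np.abs(K - K_approx)))         # small uniform error
```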
A feature map is a map \varphi \colon X \to F, where F is a Hilbert space which we will call the feature space. The first sections presented the connection between bounded/continuous evaluation functionals, positive definite functions, and integral operators, and in this section we provide another representation of the RKHS in terms of feature maps.

Every feature map defines a kernel via

K(x,y) = \langle \varphi(x), \varphi(y) \rangle_F. \qquad (3)

Clearly K is symmetric, and positive definiteness follows from the properties of the inner product in F. Conversely, every positive definite function, and hence every reproducing kernel, admits infinitely many associated feature maps such that (3) holds.

For example, we can trivially take F = H and \varphi(x) = K_x for all x \in X; then (3) is satisfied by the reproducing property. Another classical example of a feature map relates to the previous section on integral operators by taking F = \ell^2 and \varphi(x) = (\sqrt{\sigma_i}\,\varphi_i(x))_i.

This connection between kernels and feature maps provides us with a new way to understand positive definite functions, and hence reproducing kernels, as inner products in H.
Lastly, feature maps allow us to construct function spaces that reveal another perspective on the RKHS. Consider the linear space

H_{\varphi} = \{ f : X \to \mathbb{R} \mid \exists w \in F,\; f(x) = \langle w, \varphi(x) \rangle_F \;\; \forall x \in X \}.

We can define a norm on H_{\varphi} by

\|f\|_{\varphi} = \inf\{ \|w\|_F : w \in F,\; f(x) = \langle w, \varphi(x) \rangle_F \;\; \forall x \in X \}.

It can be shown that H_{\varphi} is an RKHS with kernel defined by K(x,y) = \langle \varphi(x), \varphi(y) \rangle_F.
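To make (3) concrete, here is a small finite-dimensional illustration (added here; the quadratic kernel and the sample points are arbitrary choices): for the homogeneous quadratic kernel K(x,y) = \langle x, y \rangle^2 on \mathbb{R}^2, an explicit feature map into F = \mathbb{R}^3 reproduces the kernel as an inner product.

```python
import numpy as np

def phi(x):
    # feature map for K(x, y) = <x, y>^2 on R^2, mapping into F = R^3
    return np.array([x[0] ** 2, np.sqrt(2) * x[0] * x[1], x[1] ** 2])

def K(x, y):
    return np.dot(x, y) ** 2

x = np.array([1.0, -2.0])
y = np.array([0.5, 3.0])
print(np.dot(phi(x), phi(y)), K(x, y))   # both equal <x, y>^2 = 30.25
```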
Useful properties of RKHSs:

- Let (X_i)_{i=1}^{p} be a sequence of sets and (K_i)_{i=1}^{p} be a sequence of corresponding positive definite functions on (X_i)_{i=1}^{p}. It then follows that

  K((x_1, \ldots, x_p), (y_1, \ldots, y_p)) = K_1(x_1, y_1) \cdots K_p(x_p, y_p)

  is a kernel on X = X_1 \times \cdots \times X_p. (A numerical illustration follows this list.)

- Let X_0 \subset X; then the restriction of K to X_0 \times X_0 is also a reproducing kernel.

- Consider a normalized kernel K such that K(x,x) = 1 for all x \in X. Define a pseudo-metric on X as

  d_K(x,y) = \|K_x - K_y\|_H^2 = 2(1 - K(x,y)) \quad \forall x, y \in X.

  By the Cauchy–Schwarz inequality,

  K(x,y)^2 \le K(x,x) \, K(y,y) = 1 \quad \forall x, y \in X.

  This inequality allows us to view K as a measure of similarity between inputs: if x, y \in X are similar then K(x,y) will be closer to 1, while if x, y \in X are dissimilar then K(x,y) will be closer to 0.

- The closure of the span of \{K_x \mid x \in X\} coincides with H.
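The product-kernel property in the first item above can be illustrated numerically (an added sketch; the kernels and points are arbitrary): for p = 2, the Gram matrix of the product kernel is the entrywise (Hadamard) product of the individual Gram matrices, which is again positive semi-definite by the Schur product theorem.

```python
import numpy as np

x1 = np.random.randn(6)          # points in X_1
x2 = np.random.randn(6)          # points in X_2 (paired with x1)

K1 = np.exp(-np.abs(x1[:, None] - x1[None, :]))     # Laplacian kernel on X_1
K2 = (1.0 + x2[:, None] * x2[None, :]) ** 2         # polynomial kernel on X_2

# Gram matrix of K((x1, x2), (y1, y2)) = K1(x1, y1) * K2(x2, y2)
K = K1 * K2
print(np.linalg.eigvalsh(K1).min(), np.linalg.eigvalsh(K2).min(),
      np.linalg.eigvalsh(K).min())   # all >= 0 (up to round-off)
```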
Common examples of kernels include:

- Linear kernel: K(x,y) = \langle x, y \rangle. The RKHS H corresponding to this kernel is the dual space, consisting of functions f(x) = \langle x, \beta \rangle satisfying \|f\|_H^2 = \|\beta\|^2.

- Polynomial kernel: K(x,y) = (\alpha \langle x, y \rangle + 1)^d, \quad \alpha \in \mathbb{R},\; d \in \mathbb{N}.

- Radial basis function kernels: these are another common class of kernels which satisfy K(x,y) = K(\|x - y\|). Examples include:

  - Gaussian or squared exponential kernel (used in the regression sketch below): K(x,y) = e^{-\frac{\|x - y\|^2}{2\sigma^2}}, \quad \sigma > 0.

  - Laplacian kernel: K(x,y) = e^{-\frac{\|x - y\|}{\sigma}}, \quad \sigma > 0.
The squared norm of a function f in the RKHS H with this kernel is

\|f\|_H^2 = \int_{\mathbb{R}} \left( \frac{1}{\sigma} f(x)^2 + \sigma f'(x)^2 \right) dx.
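As an illustration of how RKHS functions of the form f = \sum_i \alpha_i K(\cdot, x_i) built from the Gaussian kernel are used in practice, here is a kernel ridge regression sketch in the spirit of the representer theorem mentioned in the introduction (added here; the data, the parameter choices, and the use of ridge regression itself are illustrative assumptions rather than part of this article).

```python
import numpy as np

def gauss_kernel(x, y, sigma=0.5):
    return np.exp(-(x[:, None] - y[None, :]) ** 2 / (2 * sigma ** 2))

rng = np.random.default_rng(0)
x_train = np.linspace(0, 2 * np.pi, 20)
y_train = np.sin(x_train) + 0.1 * rng.standard_normal(20)   # noisy samples

lam = 1e-2
K = gauss_kernel(x_train, x_train)
alpha = np.linalg.solve(K + lam * np.eye(len(x_train)), y_train)

# The fitted function lives in the RKHS: f(x) = sum_i alpha_i K(x, x_i)
x_test = np.linspace(0, 2 * np.pi, 5)
f_test = gauss_kernel(x_test, x_train) @ alpha
print(np.round(f_test, 3))           # roughly tracks sin(x_test)
print(np.round(np.sin(x_test), 3))
```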
We also provide examples of Bergman kernels. Let X be finite and let H consist of all complex-valued functions on X. Then an element of H can be represented as an array of complex numbers. If the usual inner product is used, then Kx is the function whose value is 1 at x and 0 everywhere else, and
K(x,y) resembles an identity matrix, that is,

K(x,y) = \begin{cases} 1 & x = y \\ 0 & x \neq y. \end{cases}

In this case, H is isomorphic to \Complex^n.
The case of X = D (where D denotes the unit disc) is more sophisticated. Here the Bergman space H^2(D) is the space of square-integrable holomorphic functions on D. It can be shown that the reproducing kernel for H^2(D) is

K(x,y) = \frac{1}{\pi} \, \frac{1}{(1 - x\overline{y})^2}.
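The reproducing property of this Bergman kernel can be checked numerically for a simple holomorphic function (an added sketch; the test function f(z) = z^2, the evaluation point, and the quadrature grid are arbitrary choices):

```python
import numpy as np

def bergman_kernel(z, w):
    # K(z, w) = 1 / (pi (1 - z conj(w))^2) on the unit disc
    return 1.0 / (np.pi * (1.0 - z * np.conj(w)) ** 2)

f = lambda z: z ** 2            # a square-integrable holomorphic function on D
z0 = 0.3 + 0.2j                 # evaluation point inside the disc

# <f, K_{z0}> = integral of f(w) * conj(K_{z0}(w)) over D, and
# conj(K_{z0}(w)) = K(z0, w); use polar coordinates with area element r dr dtheta.
r = np.linspace(0.0, 1.0, 600)
t = np.linspace(0.0, 2.0 * np.pi, 600)
R, T = np.meshgrid(r, t)
W = R * np.exp(1j * T)
integrand = f(W) * bergman_kernel(z0, W) * R
inner = np.trapz(np.trapz(integrand, r, axis=1), t)
print(inner, f(z0))             # both approximately 0.05 + 0.12i
```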
Lastly, the space of band limited functions in L^2(\mathbb{R}) with bandwidth 2a is an RKHS with reproducing kernel

K(x,y) = \frac{\sin a(x - y)}{\pi (x - y)}.
In this section we extend the definition of the RKHS to spaces of vector-valued functions, as this extension is particularly important in multi-task learning and manifold regularization. The main difference is that the reproducing kernel \Gamma is a symmetric function that is now a positive semi-definite matrix for every x, y in X. More formally, we define a vector-valued RKHS (vvRKHS) as a Hilbert space H of functions f : X \to \mathbb{R}^T such that for all c \in \mathbb{R}^T and x \in X

\Gamma_x c(y) = \Gamma(x,y) c \in H \quad \text{for } y \in X

and

\langle f, \Gamma_x c \rangle_H = f(x)^{\intercal} c.
This second property parallels the reproducing property for the scalar-valued case. This definition can also be connected to integral operators, bounded evaluation functionals, and feature maps as we saw for the scalar-valued RKHS. We can equivalently define the vvRKHS as a vector-valued Hilbert space with a bounded evaluation functional and show that this implies the existence of a unique reproducing kernel by the Riesz representation theorem. Mercer's theorem can also be extended to address the vector-valued setting, and we can therefore obtain a feature map view of the vvRKHS. Lastly, it can also be shown that the closure of the span of \{\Gamma_x c : x \in X, c \in \mathbb{R}^T\} coincides with H, the analogue of the result in the scalar-valued case.
We can gain intuition for the vvRKHS by taking a component-wise perspective on these spaces. In particular, we find that every vvRKHS is isometrically isomorphic to a scalar-valued RKHS on a particular input space. Let \Lambda = \{1, \ldots, T\}. Consider the space X \times \Lambda and the corresponding scalar-valued reproducing kernel

\gamma : (X \times \Lambda) \times (X \times \Lambda) \to \mathbb{R}. \qquad (4)

As noted above, the RKHS associated to this reproducing kernel is given by the closure of the span of \{\gamma_{(x,t)} : x \in X, t \in \Lambda\}, where \gamma_{(x,t)}(y,s) = \gamma((x,t),(y,s)) for every pair (y,s) \in X \times \Lambda.

The connection to the scalar-valued RKHS can then be made by the fact that every matrix-valued kernel can be identified with a kernel of the form of (4) via

\Gamma(x,y)_{(t,s)} = \gamma((x,t),(y,s)).
Moreover, every kernel of the form of (4) defines a matrix-valued kernel with the above expression. Now letting the map D : H_\Gamma \to H_\gamma be defined as

(Df)(x,t) = \langle f(x), e_t \rangle_{\mathbb{R}^T}

where e_t is the t-th vector of the canonical basis of \mathbb{R}^T, one can show that D is well-defined, bijective, and norm-preserving; in other words, D is an isometry between H_\Gamma and H_\gamma.
While this view of the vvRKHS can be useful in multi-task learning, this isometry does not reduce the study of the vector-valued case to that of the scalar-valued case. In fact, this isometry procedure can make both the scalar-valued kernel and the input space too difficult to work with in practice as properties of the original kernels are often lost.[11] [12] [13]
An important class of matrix-valued reproducing kernels are separable kernels, which can be factorized as the product of a scalar-valued kernel and a T \times T symmetric positive semi-definite matrix. In light of our previous discussion these kernels are of the form

\gamma((x,t),(y,s)) = K(x,y) \, K_T(t,s)

for all x, y in X and t, s in \{1, \ldots, T\}.
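A separable kernel is straightforward to realize numerically (an illustrative sketch; the scalar kernel, the output matrix B playing the role of K_T, and the points are arbitrary choices): the block Gram matrix over n inputs and T output components is the Kronecker product of the scalar Gram matrix with B, and it is positive semi-definite whenever both factors are.

```python
import numpy as np

T = 3
B = np.array([[2.0, 0.5, 0.0],      # symmetric PSD matrix K_T encoding output structure
              [0.5, 1.0, 0.2],
              [0.0, 0.2, 1.5]])

def k(x, y, sigma=1.0):
    return np.exp(-(x - y) ** 2 / (2 * sigma ** 2))   # scalar kernel K

x = np.array([0.0, 0.7, 1.5, 3.0])
Kx = k(x[:, None], x[None, :])

def Gamma(xi, yj):
    # separable matrix-valued kernel: Gamma(x, y) = K(x, y) * B  (a T x T matrix)
    return k(xi, yj) * B

# Block Gram matrix over all (input, output-component) pairs = Kronecker product
G = np.kron(Kx, B)
print(np.linalg.eigvalsh(G).min() >= -1e-10)   # True: positive semi-definite
```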
We lastly remark that the above theory can be further extended to spaces of functions with values in function spaces but obtaining kernels for these spaces is a more difficult task.[14]
The ReLU function is commonly defined as f(x) = \max\{0, x\} and is a staple in the architecture of neural networks, where it is used as an activation function. One can construct a ReLU-like nonlinear function using the theory of reproducing kernel Hilbert spaces.
We will work with the Hilbert space

\mathcal{H} = L^1_2(0)[0, \infty)

of absolutely continuous functions with f(0) = 0 and square integrable (i.e. L_2) derivative. It has the inner product

\langle f, g \rangle_{\mathcal{H}} = \int_0^{\infty} f'(x) \, g'(x) \, dx.
To construct the reproducing kernel it suffices to consider a dense subspace, so let f \in C^1[0,\infty) with f(0) = 0. The fundamental theorem of calculus then gives
f(y) = \int_0^{y} f'(x) \, dx = \int_0^{\infty} G(x,y) \, f'(x) \, dx = \langle K_y, f \rangle
where

G(x,y) = \begin{cases} 1, & x < y \\ 0, & \text{otherwise} \end{cases}

and K_y'(x) = G(x,y), \; K_y(0) = 0, i.e.,

K(x,y) = K_y(x) = \int_0^{x} G(z,y) \, dz = \begin{cases} x, & 0 \le x < y \\ y, & \text{otherwise} \end{cases} = \min(x,y).
This implies that K_y = K(\cdot, y) reproduces f.
Moreover, the minimum function on X \times X = [0,\infty) \times [0,\infty) has the following representations in terms of the ReLU function:
min(x,y)=x-\operatorname{ReLU}(x-y)=y-\operatorname{ReLU}(y-x).
Using this formulation, one can apply the representer theorem to this RKHS, which allows one to prove the optimality of using ReLU activations in neural network settings.
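A brief numerical check of these identities and of the kernel K(x,y) = \min(x,y) (an added sketch; the sample points are arbitrary choices):

```python
import numpy as np

relu = lambda t: np.maximum(0.0, t)

x = np.array([0.2, 1.0, 2.5, 4.0])

# min(x, y) = x - ReLU(x - y) = y - ReLU(y - x)
X, Y = np.meshgrid(x, x, indexing="ij")
K_min = np.minimum(X, Y)
print(np.allclose(K_min, X - relu(X - Y)))   # True
print(np.allclose(K_min, Y - relu(Y - X)))   # True

# The Gram matrix of K(x, y) = min(x, y) on points of [0, infinity) is PSD
print(np.linalg.eigvalsh(K_min).min() >= -1e-12)   # True
```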