Reproducing Kernel Hilbert Space

Having written about the simulation of fractional Brownian motion, I'll spend an hour writing about reproducing kernel Hilbert spaces (RKHS).

Definition: A Hilbert space \mathcal{H} of functions on a set X is an RKHS if every evaluation functional is bounded: |\mathcal{F}_t[f]|=|f(t)|\leq M_t\|f\|_{\mathcal{H}}, \forall f\in \mathcal{H}, where the constant M_t may depend on t.

Theorem (Riesz representation): If \mathcal{H} is an RKHS, then for each t \in X there exists a function K_t \in \mathcal{H} (called the representer of t) with the reproducing property

\mathcal{F}_t[f]=<K_t,f>_{\mathcal{H}}=f(t), \forall f\in\mathcal{H}

Therefore, K_t(t^{'})=<K_t,K_{t^{'}}>_{\mathcal{H}}.

Definition: K: X\times X\rightarrow\mathbb{R} is a reproducing kernel if it’s symmetric and positive definite.

Theorem (Moore–Aronszajn): An RKHS defines a corresponding reproducing kernel. Conversely, a reproducing kernel defines a unique RKHS.

Once we have the kernel, if f(\cdot)=\sum_i \alpha_i K(t_i,\cdot) and g(\cdot)=\sum_j \beta_j K(t^{'}_j,\cdot),

then <f,g>_{\mathcal{H}}=\sum_i\sum_j \alpha_i \beta_j K(t_i, t^{'}_j).
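This inner product and the reproducing property are easy to verify numerically. Here is a minimal Python sketch, assuming a Gaussian RBF kernel as the symmetric positive-definite kernel; the kernel choice, the points t_i, t'_j, and the coefficients are all invented for illustration:

```python
import numpy as np

def K(s, t, gamma=1.0):
    """A symmetric positive-definite kernel (Gaussian RBF, an illustrative choice)."""
    return np.exp(-gamma * (s - t) ** 2)

# f(.) = sum_i alpha_i K(t_i, .),  g(.) = sum_j beta_j K(tp_j, .)
alpha, t = [1.0, -0.5], [0.0, 1.0]
beta, tp = [2.0, 0.3], [0.5, 2.0]

def inner(coef1, pts1, coef2, pts2):
    """<f, g>_H = sum_i sum_j alpha_i beta_j K(t_i, tp_j)."""
    return sum(a * b * K(ti, tj)
               for a, ti in zip(coef1, pts1)
               for b, tj in zip(coef2, pts2))

fg = inner(alpha, t, beta, tp)
assert np.isclose(fg, inner(beta, tp, alpha, t))      # symmetry of the inner product

# Reproducing property: <f, K_s>_H = f(s), taking g = K(s, .) i.e. beta = [1]
s = 0.7
f_s = sum(a * K(ti, s) for a, ti in zip(alpha, t))
assert np.isclose(inner(alpha, t, [1.0], [s]), f_s)
print(fg)
```

The last assertion shows the bilinear formula recovering the pointwise evaluation f(s), which is exactly the reproducing property.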

Now let us turn to fractional Brownian motion (fBm).

Theorem: For the fBm RKHS, the kernel K(x,x^{'})=R(x,x^{'}) (the covariance function of fBm) is symmetric and positive definite, and there exists K^H(x,\cdot) s.t. K(x,x^{'})=\int K^H(x,y)K^H(x^{'},y)dy.

Take the Wiener integral as an example:

\begin{pmatrix}    \textit{dual space}&\textit{inner product}&E&\leftrightarrow&<\cdot,\cdot>_{\mathcal{H}}&\leftrightarrow&<\cdot,\cdot>_{L^{2}}\\    \delta_{t}(\cdot)& &B^H_t&\leftrightarrow&R(\cdot,t)&\leftrightarrow&K^H(t,\cdot)\\    f(\cdot)& &\textit{"}\int_0^T f(t)dB^H_t\textit{"}&\leftrightarrow&K^H\circ(K^H)^{*}f&\leftrightarrow&(K^H)^{*}f    \end{pmatrix}

where K^H\circ f(t)=\int_0^T K^H(t,s)f(s)ds,

(K^H)^{*}\circ f(t)=\int_0^T K^H(s,t)f(s)ds and

K^H\circ(K^H)^{*}f(t)=\int_0^T R(t,s)f(s)ds

For example, when H=\frac{1}{2}, K^H(t,s)=1_{[0,t]}(s), and we can check

E(B_tB_s)=<t\wedge\cdot, s\wedge\cdot>_{\mathcal{H}}=<1_{[0,t]}(\cdot),1_{[0,s]}(\cdot)>_{L^2}
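Both the factorization K(t,s)=\int K^H(t,y)K^H(s,y)dy and the operator identity K^H\circ(K^H)^{*}f(t)=\int_0^T R(t,s)f(s)ds can be checked by Riemann sums in the case H=\frac{1}{2}. A rough Python sketch, where the grid size, the points t, s, and the test function f\equiv 1 are all chosen arbitrarily:

```python
import numpy as np

# H = 1/2: K^H(t,s) = 1_{[0,t]}(s) and R(t,s) = min(t,s); grid size is arbitrary
T, m = 1.0, 4000
y = np.linspace(0.0, T, m)
dy = y[1] - y[0]
t_, s_ = 0.6, 0.3

# Factorization: ∫_0^T K^H(t,y) K^H(s,y) dy = t ∧ s = E(B_t B_s)
factor = np.sum((y <= t_) & (y <= s_)) * dy
assert abs(factor - min(t_, s_)) < 1e-2

# Operator identity: K^H∘(K^H)* f(t) = ∫_0^T R(t,s) f(s) ds, with f ≡ 1
f = np.ones(m)
Kstar_f = np.array([np.sum(f[y >= u]) * dy for u in y])   # (K^H)* f(u) = ∫_u^T f(s) ds
lhs = np.sum(Kstar_f[y <= t_]) * dy                        # K^H∘(K^H)* f(t) = ∫_0^t (K^H)* f
rhs = np.sum(np.minimum(t_, y) * f) * dy                   # ∫_0^T min(t,s) f(s) ds
assert abs(lhs - rhs) < 1e-2
print(factor, lhs, rhs)
```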

Claim: By the previous theorem, the reproducing kernel R(t,s) uniquely determines an RKHS, namely the closure of the linear span L(R(t,\cdot)) under the inner product <\cdot,\cdot>_{\mathcal{H}} above.


Hilbert-Schmidt operators on a Hilbert probability space

Now let me take twenty minutes to explain why we consider Hilbert-Schmidt operators on a Hilbert probability space.

Consider an operator F: H\rightarrow H, where H is a Hilbert space equipped with a probability measure \mu. When F is Hilbert-Schmidt, write z=\sum_i a_i e_i, where F^*F e_i=\lambda_i e_i; then we have

\int\|Fz\|^2_H\mu(dz)=\int <F^*Fz,z>_H\mu(dz)=\int <\sum_i a_i\lambda_i e_i, \sum_i a_i e_i>_H \mu(da_1,da_2,\dots)=\sum_i \lambda_i\int a_i^2 \mu(da_1,da_2,\dots),

which is finite whenever F is Hilbert-Schmidt (so that \sum_i \lambda_i=\|F\|^2_{HS}<\infty) and the second moments \int a_i^2 \mu(da_1,da_2,\dots) are uniformly bounded.
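This computation can be checked by Monte Carlo in finite dimensions. A sketch assuming H=\mathbb{R}^d and taking \mu to be a standard Gaussian in the eigenbasis of F^*F (so \int a_i^2\,\mu=1 for each i); the dimension, seed, and sample size are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5                                     # toy dimension (arbitrary)
F = rng.standard_normal((d, d))

# Eigen-decomposition: F*F e_i = lambda_i e_i
lam, E = np.linalg.eigh(F.T @ F)

# z = sum_i a_i e_i with a_i i.i.d. N(0,1): a toy Gaussian measure mu with E[a_i^2] = 1
n = 200_000
A = rng.standard_normal((n, d))           # one row of coefficients (a_1,...,a_d) per sample
Z = A @ E.T                               # rows are the sampled z's

mc = np.mean(np.sum((Z @ F.T) ** 2, axis=1))   # Monte Carlo estimate of ∫ ||Fz||^2 mu(dz)
exact = lam.sum()                               # sum_i lambda_i * E[a_i^2] = ||F||_HS^2
assert abs(mc - exact) / exact < 0.05
print(mc, exact)
```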

Hilbert-Schmidt determinant

In the previous post, I mentioned the intuition behind the Fredholm determinant for trace-class operators, i.e. those with Tr(|K|)<\infty. When it comes to the Hilbert-Schmidt determinant, for operators with Tr(K^*K)<\infty, that definition (let's call it det_1) may fail, since the trace of an H-S operator may be infinite. So we need to come up with another definition (let's call it det_2) that makes sense for H-S operators.

The intuition is the following: consider the eigenvalues of an H-S operator K, say \lambda_1,\lambda_2,\dots, and notice that 1+\lambda\leq exp(\lambda) for all \lambda, so we can define det_2(I+K)=\prod_i\frac{1+\lambda_i}{exp(\lambda_i)}=det_1(I+K)\cdot exp(-Tr(K)).
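The identity det_2(I+K)=det_1(I+K)\cdot exp(-Tr(K)) can be sanity-checked numerically. A sketch with a small random symmetric matrix standing in for K (in finite dimensions every operator is trace class and H-S, so this only illustrates the algebra):

```python
import numpy as np

rng = np.random.default_rng(1)
K = rng.standard_normal((4, 4))
K = (K + K.T) / 2                          # symmetric, so the eigenvalues are real
lam = np.linalg.eigvalsh(K)

det1 = np.prod(1 + lam)                    # det_1(I + K) = prod_i (1 + lambda_i)
det2 = np.prod((1 + lam) / np.exp(lam))    # det_2(I + K) = prod_i (1 + lambda_i) / exp(lambda_i)

# Since Tr(K) = sum_i lambda_i, the two expressions for det_2 agree
assert np.isclose(det2, det1 * np.exp(-np.trace(K)))
print(det1, det2)
```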

Fredholm determinant

In the study of measure transformations on Gaussian spaces, there is a fundamental object that I want to write down here: the Fredholm determinant. Let me take half an hour to explain the intuition behind this determinant for finite-dimensional matrices.

So suppose we have a matrix A=(a_{ij})\in \mathbb{R}^{n\times n}; our goal is to find a definition of det(I+A) that makes sense for trace-class operators (i.e. those with Tr(|A|)<\infty). We know its trace is Tr(A)=\sum_i a_{ii}, and assuming A is diagonalizable with eigenvalues \lambda_1,\lambda_2,\dots,\lambda_n, we have

AX=X\begin{pmatrix}  \lambda_1 & & &\\  & \lambda_2 & &\\  & & \ddots &\\  & & & \lambda_n  \end{pmatrix}, so

Tr(A)=Tr(X^{-1}AX)=\sum \lambda_i.

And because det(I+A)=\prod_i (1+\lambda_i)=\sum_{k=0}^{n}\Big(\sum_{i_1<\dots<i_k}\lambda_{i_1}\cdots\lambda_{i_k}\Big),

let's take a look at the terms in this summation:

when k=0, it’s 1,

when k=1, it’s \lambda_1+\lambda_2+\dots+\lambda_n,

when k=2, it's \sum_{i<j}\lambda_i\lambda_j,

which is the trace of the operator \Lambda^2(A) acting on the linear space with basis \{e_i\wedge e_j : i<j\}, where \wedge is the wedge product.

Similarly, we arrive at the expression given in the Wikipedia article on the Fredholm determinant: for a general trace-class operator A,

det(I+A)=\sum_{k=0}^{\infty}Tr(\Lambda^k(A)), where \Lambda^k(A) is a linear operator on the space spanned by the basis \{e_{i_1}\wedge e_{i_2}\wedge\dots\wedge e_{i_k}\mid i_1<i_2<\dots<i_k\}.

Finally, I want to mention that when \sum_i |\lambda_i|<\infty, it is easy to check that |det(I+A)|\leq\prod_i(1+|\lambda_i|)\leq exp(\sum_i|\lambda_i|)<\infty indeed.
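In finite dimensions Tr(\Lambda^k(A)) is the sum of the k\times k principal minors of A, so the series \sum_k Tr(\Lambda^k(A)) can be compared directly with det(I+A). A Python sketch with an arbitrary 4\times 4 matrix:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(2)
n = 4
A = rng.standard_normal((n, n))

def tr_wedge(A, k):
    """Tr(Lambda^k(A)) = sum of the k x k principal minors of A (Tr(Lambda^0) = 1)."""
    if k == 0:
        return 1.0
    return sum(np.linalg.det(A[np.ix_(idx, idx)])
               for idx in combinations(range(A.shape[0]), k))

# Fredholm series truncates at k = n in finite dimensions
fredholm = sum(tr_wedge(A, k) for k in range(n + 1))
assert np.isclose(fredholm, np.linalg.det(np.eye(n) + A))
print(fredholm)
```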