# Hilbert Schmidt on Hilbert probability space

Let me take twenty minutes to explain why we consider Hilbert-Schmidt operators on a Hilbert probability space.

We consider an operator $F: H\rightarrow H$, where $H$ is a Hilbert space equipped with a probability measure $\mu$. When $F$ is Hilbert-Schmidt, write $z=\sum a_i e_i$, where $\{e_i\}$ is an orthonormal basis of eigenvectors with $F^*F e_i=\lambda_i e_i$. Then we have

$\int_H\|Fz\|^2_H\,\mu(dz)=\int_H \langle F^*Fz, z\rangle_H\,\mu(dz)=\int \langle\sum a_i\lambda_i e_i, \sum a_i e_i\rangle_H\, \mu(da_1,da_2,\dots)=\sum_i \lambda_i\int a_i^2\, \mu(da_1,da_2,\dots)$
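To see why the Hilbert-Schmidt condition is the natural one here, the following is a hedged worked step rather than part of the original argument. It assumes the $e_i$ are orthonormal, $a_i=\langle z,e_i\rangle_H$, and that the coordinate second moments $\sigma_i^2:=\int a_i^2\,\mu(dz)$ are bounded (the symbol $\sigma_i$ is introduced only for this illustration). Since $\lambda_i\geq 0$ and $\sum_i\lambda_i=Tr(F^*F)$:

```latex
\int_H \|Fz\|_H^2 \,\mu(dz)
  = \sum_i \lambda_i \int a_i^2 \,\mu(dz)
  = \sum_i \lambda_i \sigma_i^2
  \le \Big(\sup_i \sigma_i^2\Big) \sum_i \lambda_i
  = \Big(\sup_i \sigma_i^2\Big)\, Tr(F^*F).
```

Under these assumptions the integral is finite as soon as $Tr(F^*F)<\infty$, which is exactly the Hilbert-Schmidt condition.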

# Hilbert-Schmidt determinant

In the previous post, I described the intuition behind the Fredholm determinant for a trace-class operator $K$, i.e. one with $Tr(|K|)<\infty$. For a Hilbert-Schmidt operator, i.e. one with $Tr(K^*K)<\infty$, that definition (let’s call it $det_1$) may fail, since the trace of an H-S operator may be infinite. So we need another definition (let’s call it $det_2$) that makes sense for H-S operators.

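The intuition is as follows. Consider the eigenvalues of an H-S matrix $K$, say $\lambda_1,\lambda_2,\dots$. Notice that $1+\lambda\leq \exp(\lambda)$ for every real $\lambda$, and more precisely $(1+\lambda)\exp(-\lambda)=1-\lambda^2/2+O(\lambda^3)$, so the product below converges as soon as $\sum|\lambda_i|^2<\infty$. We can therefore define $det_2(I+K)=\prod_i\frac{1+\lambda_i}{\exp(\lambda_i)}$, which equals $det_1(I+K)\cdot \exp(-Tr(K))$ whenever $K$ is trace class.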
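A small numerical sketch may make the difference between $det_1$ and $det_2$ concrete. It assumes the hypothetical eigenvalues $\lambda_i=1/i$, which are square-summable but not summable, so they model an H-S operator that is not trace class. The function names are made up for this illustration, and the products are accumulated in log space for numerical stability.

```python
import math

def harmonic_eigs(n):
    """Hypothetical eigenvalues lambda_i = 1/i, i = 1..n (assumed for illustration)."""
    return [1.0 / i for i in range(1, n + 1)]

def log_det1_partial(eigs):
    """log of the partial product prod (1 + lambda_i)."""
    return sum(math.log1p(l) for l in eigs)

def log_det2_partial(eigs):
    """log of the partial product prod (1 + lambda_i) * exp(-lambda_i)."""
    return sum(math.log1p(l) - l for l in eigs)

for n in (10, 1000, 100000):
    eigs = harmonic_eigs(n)
    det1 = math.exp(log_det1_partial(eigs))  # keeps growing with n
    det2 = math.exp(log_det2_partial(eigs))  # settles down as n grows
    print(n, det1, det2)
```

For this particular choice the partial products of $det_1$ telescope to $n+1$ and diverge, while the regularized partial products of $det_2$ stabilize.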

# Fredholm determinant

In the study of measure transformations on Gaussian spaces, there is a fundamental object that I want to write down here: the Fredholm determinant. Let me take half an hour to explain the intuition behind this determinant in the finite-dimensional matrix setting.

Suppose we have a matrix $A=(a_{ij})\in \mathbb{R}^{n\times n}$. Our goal is to find a definition of $det(I+A)$ that still makes sense for a trace-class operator (i.e. $Tr(|A|)<\infty$). We know that its trace is $Tr(A)=\sum a_{ii}$. Assume $A$ is diagonalizable with eigenvalues $\lambda_1,\lambda_2, \dots, \lambda_n$, and let $X$ be the matrix whose columns are the corresponding eigenvectors. Then we have

$AX=X\begin{pmatrix} \lambda_1 & & & \\ & \lambda_2 & & \\ & & \ddots & \\ & & & \lambda_n \end{pmatrix}$, so

$Tr(A)=Tr(X^{-1}AX)=\sum \lambda_i$.

Moreover, $det(I+A)=\prod_{i=1}^{n} (1+\lambda_i)=\sum_{k=0}^{n}\left(\sum_{i_1<\dots<i_k}\lambda_{i_1}\cdots\lambda_{i_k}\right)$.

Let’s take a look at the terms in this summation:

when $k=0$, it’s 1,

when $k=1$, it’s $\lambda_1+\lambda_2+\dots+\lambda_n$,

when $k=2$, it’s $\sum_{i<j}\lambda_i\lambda_j$,

which is the trace of the operator $\Lambda^2(A)$ on the linear space with basis $e_i\wedge e_j$, $i<j$, where $\wedge$ denotes the wedge product.
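The expansion of $\prod(1+\lambda_i)$ into sums over $i_1<\dots<i_k$ can be checked numerically. The sketch below uses made-up eigenvalues (an assumption, chosen only for illustration) and computes each $k$-th term literally as a sum over increasing index tuples, which in this finite-dimensional diagonalizable setting is the trace of $\Lambda^k(A)$.

```python
import math
from itertools import combinations

def elementary_symmetric(eigs, k):
    """Sum of lambda_{i_1} * ... * lambda_{i_k} over i_1 < ... < i_k."""
    return sum(math.prod(c) for c in combinations(eigs, k))

eigs = [0.5, -0.25, 2.0, 1.0]  # hypothetical eigenvalues of A

# One term per k = 0, 1, ..., n (k = 0 gives the empty product, i.e. 1).
terms = [elementary_symmetric(eigs, k) for k in range(len(eigs) + 1)]

det_from_product = math.prod(1.0 + l for l in eigs)  # prod (1 + lambda_i)
det_from_traces = sum(terms)                         # sum over k of the k-th term

print(terms)
print(det_from_product, det_from_traces)
```

Both ways of computing $det(I+A)$ agree, and the $k=1$ term is just $Tr(A)$.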

Similarly, we have the expression given on Wikipedia under “Fredholm determinant”: for a general trace-class operator $A$,

$det(I+A)=\sum_{k=0}^{\infty}Tr(\Lambda^k(A))$, where $\Lambda^k(A)$ is the linear operator induced by $A$ on the space spanned by the basis $\{e_{i_1}\wedge e_{i_2}\wedge\dots\wedge e_{i_k}\mid i_1<i_2<\dots<i_k\}$.

Finally, I want to mention that when $A$ is trace class, so that $\sum |\lambda_i|<\infty$, it is easy to check that $det(I+A)$ is indeed finite, since $|det(I+A)|=\prod|1+\lambda_i|\leq \exp(\sum|\lambda_i|)$.

# One way ANOVA vs Two way ANOVA

Today in class we talked about two-way ANOVA, so my question is: how should we understand the essential difference between one-way and two-way ANOVA? I’ll try to spend less than an hour writing down my thoughts.

Let’s look at an example. We have treatments $i \in \{1,2,\dots,I\}$ and groups $j\in \{1,2,\dots,J\}$, and for each treatment $i$ and group $j$ we have $m_{ij}$ samples, so our two-way ANOVA model would be

$X_{ijk}=\mu_{ij}+\epsilon_{ijk}$, where $\mu_{ij}$ is the mean of the $i$th treatment and $j$th group, and $k\in\{1,\dots,m_{ij}\}$ indexes the samples.

What if we want to use one-way ANOVA to model it? What would that look like? Because there may be an interaction between treatment and group, in my understanding we should treat this as a linear model in three terms, i.e. a treatment level $\alpha_i$, a group level $\beta_j$ and an interaction level $\gamma_{ij}$, so the model should be $X_{ijk}=\mu_0+\alpha_{i} +\beta_{j} +\gamma_{ij} +\epsilon_{ijk}$. Now consider the least-squares solution for $\mu_0,\alpha_i,\beta_j,\gamma_{ij}$:

$\min_{\mu_0,\alpha_i,\beta_j,\gamma_{ij}} S=\sum_{i,j,k} (X_{ijk}-\mu_0-\alpha_i-\beta_j-\gamma_{ij})^2$

By taking derivatives with respect to $\mu_0,\alpha_i,\beta_j,\gamma_{ij}$ and setting them to zero, we have

$\mu_0 \sum_{ij} m_{ij}=\sum_{ijk} X_{ijk}-\sum_{i}\alpha_i(\sum_j m_{ij})-\sum_{j}\beta_j(\sum_{i} m_{ij})-\sum_{i,j} \gamma_{ij} m_{ij}$

$\alpha_i\sum_j m_{ij}=\sum_{jk} X_{ijk}-\mu_0\sum_j m_{ij}-\sum_j\beta_j m_{ij}-\sum_j\gamma_{ij} m_{ij}$

$\beta_j\sum_i m_{ij}=\sum_{ik} X_{ijk}-\mu_0\sum_i m_{ij}-\sum_i\alpha_i m_{ij}-\sum_i\gamma_{ij} m_{ij}$

$\gamma_{ij}=\frac{1}{m_{ij}}\sum_k X_{ijk}-\mu_0 -\alpha_i-\beta_j$

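Then the solution of these equations (made unique by identifiability constraints such as $\sum_i\alpha_i=\sum_j\beta_j=0$ and $\sum_i\gamma_{ij}=\sum_j\gamma_{ij}=0$) is the two-way ANOVA estimator.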
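As a minimal sketch of one such solution, the code below fits $\mu_0,\alpha_i,\beta_j,\gamma_{ij}$ on a tiny made-up data set. It assumes unweighted sum-to-zero constraints on $\alpha$, $\beta$ and $\gamma$ (one common convention, not something fixed by the equations above), under which the fitted value $\mu_0+\alpha_i+\beta_j+\gamma_{ij}$ equals the cell mean. The helper `alpha_residual` (a hypothetical name) checks the equation obtained by differentiating $S$ with respect to $\alpha_i$.

```python
from statistics import mean

def two_way_fit(cells):
    """cells[i][j] is the list of observations X_ijk for treatment i, group j."""
    I, J = len(cells), len(cells[0])
    cell_mean = [[mean(cells[i][j]) for j in range(J)] for i in range(I)]
    # Sum-to-zero parameterization built from the cell means (an assumption).
    mu0 = mean([cell_mean[i][j] for i in range(I) for j in range(J)])
    alpha = [mean(cell_mean[i]) - mu0 for i in range(I)]
    beta = [mean([cell_mean[i][j] for i in range(I)]) - mu0 for j in range(J)]
    gamma = [[cell_mean[i][j] - mu0 - alpha[i] - beta[j] for j in range(J)]
             for i in range(I)]
    return mu0, alpha, beta, gamma

def alpha_residual(cells, i, mu0, alpha, beta, gamma):
    """Left side minus right side of the normal equation for alpha_i."""
    m = [len(cell) for cell in cells[i]]  # m_ij for this treatment i
    lhs = alpha[i] * sum(m)
    rhs = (sum(sum(cell) for cell in cells[i]) - mu0 * sum(m)
           - sum(beta[j] * m[j] for j in range(len(m)))
           - sum(gamma[i][j] * m[j] for j in range(len(m))))
    return lhs - rhs

# Made-up, unbalanced data: 2 treatments x 2 groups.
cells = [
    [[4.0, 6.0], [7.0, 9.0, 8.0]],
    [[1.0, 3.0, 2.0], [10.0]],
]
mu0, alpha, beta, gamma = two_way_fit(cells)
print(mu0, alpha, beta, gamma)
print([alpha_residual(cells, i, mu0, alpha, beta, gamma) for i in range(2)])
```

The residuals come out as zero up to rounding, so this particular parameterization does satisfy the $\alpha_i$ equation. Other constraint choices would give different $\alpha,\beta,\gamma$ but the same fitted cell means.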

So as we can see, if there is only one factor, the interaction term $\gamma$ does not exist, and we can simply regard the original linear model as $I$ different groups and analyse each one individually.

In sum, the so-called “two-way ANOVA” mainly deals with the interaction between two different “factors”. So the question is: if there are three different factors, can we follow the same idea and do a “many-factor ANOVA”?

# miscellaneous

After years of studying probability, I have grown more and more fond of the construction of a probability space. It is seemingly simple, yet interesting enough that it may be applied to many areas beyond math.

Let’s review a little bit about probability spaces in math. Given a space $(\Omega, \mathbb{F}, \mathbb{P})$, assume that there exists a r.v. $X$ s.t. $\{\omega|X(\omega)>0\}\in\mathbb{F}$; then we can measure the event $\{X>0\}$ by $\mathbb{P}(X>0)$. What’s more, adding a time dimension makes it a stochastic process $X_t(\omega)$, and then we have properties like $\mathbb{F}_t\subset\mathbb{F}_{t^+}$ and so on.

Secondly, let’s consider another fact: different people make different decisions when given a task. Why is that? It apparently depends on the knowledge each of us has. How can I relate this kind of decision making to the idea behind the construction of a probability space? This has been on my mind these days, and I’ll post what I think later.

To be continued……

# Trip to Mount Emei

During this summer vacation back home, I visited Mount Emei in Chengdu, Sichuan with my cousin. It was quite a tough but pleasant trip in my view. It was tough because, among the several ways to get to the top, we walked the longest and most difficult one in almost a single day. It was pleasant because of the satisfaction at the end, compared with the other hikers who thought it impossible to achieve in one day.

Unlike hikes in the States, hiking Mount Emei means climbing stairs almost all the way, sometimes steep and sometimes gentle. We started our trip on the afternoon of July 24th from Baoguosi, at the foot of the mountain. According to our plan, we would stay around Guangfusi, about 1/6 of the way along the whole route, so the first day of hiking was really just a warm-up, though we still almost failed to reach our scheduled stop before dark.

On the morning of July 25th we set out at 9:00 am. The first destination was the monkey area. Those monkeys were naughtier than any child: as soon as they saw travellers’ food or bottled water sticking out of a backpack, they would grab it without hesitation. At the beginning, my cousin and I were both excited about the monkeys; later in the trip, whenever we came across several of them on the road, we were both cautious in case they grabbed our food.

If someone asks what the most difficult part of the hike was, it would definitely be the “Ninety-nine turnings”, right after the monkey area. It was the steepest and longest continuous stretch of stairs, 15km long if I remember correctly. We spent hours on this section, and at times we had to rest for a minute after every minute of walking! One funny thing is that even on such a narrow and difficult path, you can pay someone to carry you up to the top of the mountain in a sedan chair carried by two people. The price is around ￥1500 per trip; usually it is businessmen from Hong Kong or Guangzhou who hire such help, like the ones we met there that day.

As I remember, the distance between Hongchunpin and Jinding (the mountain top) was 42km. We met many travellers on the way, some sitting in sedan chairs carried by others, some pilgrims also heading to Jinding, but only we planned to walk to the top that day. We felt afraid at times during the trip, but we finally made it, reaching Jinding at 10:00 pm after more than 12 hours of climbing.

# comparison

I was trying to simulate something related to two-dimensional Brownian motion or fractional Brownian motion.

As shown in the graph, can you tell the difference between these two sets of pictures? At first glance they look the same, but their underlying properties are actually very different. What is the difference?
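For readers who want to produce pictures of this kind, here is a minimal sketch of simulating a two-dimensional standard Brownian path by summing independent Gaussian increments. It is not the code behind the original pictures. The function name, step size `dt` and seed are all assumptions made for illustration. A fractional Brownian motion with Hurst parameter different from 1/2 would need correlated increments instead, which this sketch does not cover.

```python
import random

def brownian_path_2d(n_steps, dt=0.01, seed=0):
    """Return a list of (x, y) points of a 2D Brownian path started at the origin."""
    rng = random.Random(seed)  # fixed seed so the path is reproducible
    sd = dt ** 0.5             # each increment is N(0, dt) in each coordinate
    x, y = 0.0, 0.0
    path = [(x, y)]
    for _ in range(n_steps):
        x += rng.gauss(0.0, sd)
        y += rng.gauss(0.0, sd)
        path.append((x, y))
    return path

path = brownian_path_2d(1000)
print(len(path), path[-1])
```

The resulting list of points can be handed to any plotting tool to draw the trajectory.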