Hilbert-Schmidt determinant

In the previous post, I mentioned the intuition behind the Fredholm determinant for a trace class operator K, i.e. one with Tr(|K|)<\infty. When it comes to a Hilbert-Schmidt operator, i.e. one with Tr(K^*K)<\infty, that definition (let’s call it det_1) may fail, since the trace of an H-S operator may be infinite. So we need to come up with another definition (let’s call it det_2) that makes sense for H-S operators.

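The intuition is as follows. Consider the eigenvalues of an H-S operator K, say \lambda_1,\lambda_2,\dots. Notice that for every real \lambda we have 1+\lambda\leq \exp(\lambda), and that (1+\lambda)\exp(-\lambda)=1+O(\lambda^2) for small \lambda, so the product below converges whenever \sum_i |\lambda_i|^2<\infty. So we can define det_2(I+K)=\prod_i \frac{1+\lambda_i}{\exp(\lambda_i)}, which equals det_1(I+K)\cdot \exp(-Tr(K)) whenever K is trace class.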
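As a rough numerical sanity check (a sketch, not part of the original argument), the snippet below compares the two expressions for det_2 on a small symmetric matrix, and then looks at the sequence \lambda_i=1/i, for which \sum\lambda_i diverges but \sum\lambda_i^2 converges. The test matrix, the function names, and the choice \lambda_i=1/i are all assumptions made purely for illustration.

```python
import numpy as np

def det2_via_eigs(K):
    """prod_i (1 + lambda_i) * exp(-lambda_i) for a symmetric matrix K."""
    lam = np.linalg.eigvalsh(K)          # real eigenvalues of a symmetric matrix
    return float(np.prod((1.0 + lam) * np.exp(-lam)))

def det2_via_det(K):
    """det(I + K) * exp(-Tr(K)), the right-hand side of the identity."""
    n = K.shape[0]
    return float(np.linalg.det(np.eye(n) + K) * np.exp(-np.trace(K)))

# Hypothetical small symmetric test matrix (finite dimension, so it is trace class).
rng = np.random.default_rng(0)
B = rng.normal(size=(5, 5))
K = 0.1 * (B + B.T)

lhs = det2_via_eigs(K)
rhs = det2_via_det(K)
print(lhs, rhs)                          # the two values should agree up to rounding

def partial_products(n):
    """Partial products of det_1 and det_2 for the illustrative choice lambda_i = 1/i."""
    lam = 1.0 / np.arange(1, n + 1)
    log_det1 = np.sum(np.log1p(lam))          # log prod (1 + lambda_i)
    log_det2 = np.sum(np.log1p(lam) - lam)    # log prod (1 + lambda_i) * exp(-lambda_i)
    return float(np.exp(log_det1)), float(np.exp(log_det2))

for n in (10, 1000, 100000):
    print(n, partial_products(n))
```

For \lambda_i=1/i the first partial product is exactly n+1 and grows without bound, while the second one settles down to a finite limit, which is the behaviour det_2 is designed to capture.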

Fredholm determinant

In the study of measure transformation in Gaussian space, there is a fundamental object that I want to write down here: the Fredholm determinant. Let me try to use half an hour to explain the intuition behind this determinant in the finite-dimensional matrix setting.

So suppose we have a matrix A=(a_{ij})\in \mathbb{R}^{n\times n}. Our goal is to find a definition of det(I+A) that still makes sense for a trace class operator (i.e. Tr(|A|)<\infty). We know that its trace is Tr(A)=\sum_i a_{ii}. Assume A is diagonalizable with eigenvalues \lambda_1,\lambda_2, \dots, \lambda_n, and let X be the matrix whose columns are the corresponding eigenvectors. Then we have

AX=X\begin{pmatrix}  \lambda_1 & & & \\  & \lambda_2 & & \\  & & \ddots & \\  & & & \lambda_n  \end{pmatrix}, so

Tr(A)=Tr(X^{-1}AX)=\sum \lambda_i.

Moreover, det(I+A)=\prod_{i=1}^{n} (1+\lambda_i)=\sum_{k=0}^{n}\left(\sum_{i_1<\dots<i_k}\lambda_{i_1}\dots\lambda_{i_k}\right).

Let’s take a look at the terms in this summation:

when k=0, it’s 1,

when k=1, it’s \lambda_1+\lambda_2+\dots+\lambda_n,

when k=2, it’s \sum_{i<j}\lambda_i\lambda_j,

which is the trace of the operator \Lambda^2(A) on the linear space with basis e_i\wedge e_j, i<j, where \wedge is the wedge product.

Similarly, we have the expression given on Wikipedia under “Fredholm determinant”: for a general trace-class operator A,

det(I+A)=\sum_{k=0}^{\infty}Tr(\Lambda^k(A)), where \Lambda^k(A) is the linear operator induced by A on the space spanned by the basis \{e_{i_1}\wedge e_{i_2}\wedge\dots\wedge e_{i_k}| i_1<i_2<\dots<i_k\}.
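In finite dimension this expansion can be checked directly. In the basis e_{i_1}\wedge\dots\wedge e_{i_k}, the diagonal entries of \Lambda^k(A) are the k\times k principal minors of A, so Tr(\Lambda^k(A)) is the sum of those minors. The sketch below uses that fact; the helper name `trace_exterior_power` and the random test matrix are assumptions for illustration only.

```python
import numpy as np
from itertools import combinations

def trace_exterior_power(A, k):
    """Tr(Lambda^k(A)), computed as the sum of all k x k principal minors of A."""
    n = A.shape[0]
    if k == 0:
        return 1.0                       # Lambda^0(A) acts as the identity on scalars
    total = 0.0
    for idx in combinations(range(n), k):
        # Principal submatrix with rows and columns i_1 < ... < i_k.
        total += np.linalg.det(A[np.ix_(idx, idx)])
    return total

# Hypothetical 4 x 4 test matrix.
rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4))
n = A.shape[0]

series = sum(trace_exterior_power(A, k) for k in range(n + 1))
direct = np.linalg.det(np.eye(n) + A)
print(series, direct)                    # should agree up to rounding

# The k = 1 and k = 2 terms match the eigenvalue expressions.
lam = np.linalg.eigvals(A)
e2 = sum(lam[i] * lam[j] for i, j in combinations(range(n), 2)).real
print(trace_exterior_power(A, 1), np.trace(A))
print(trace_exterior_power(A, 2), e2)
```

The loop visits all 2^n index subsets in total, so this is only meant as a check on small matrices, not as a way to compute determinants.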

Finally, I want to mention that when \sum |\lambda_i|<\infty, it is easy to check that det(I+A) is indeed finite, since |det(I+A)|\leq\prod (1+|\lambda_i|)\leq \exp(\sum |\lambda_i|).
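To see the finiteness concretely, one can use 1+x\leq e^x to get \prod(1+|\lambda_i|)\leq \exp(\sum|\lambda_i|) and watch the partial products stay below that bound. The following sketch does this for the assumed summable sequence \lambda_i=1/i^2, chosen only as an example.

```python
import numpy as np

n = 100000
# Illustrative summable eigenvalue sequence lambda_i = 1 / i^2 (an assumption).
lam = 1.0 / np.arange(1, n + 1, dtype=float) ** 2

partial = np.cumprod(1.0 + lam)          # partial products of prod (1 + lambda_i)
bound = float(np.exp(np.sum(lam)))       # exp(sum lambda_i), using 1 + x <= e^x

print(partial[9], partial[999], partial[-1])
print(bound)
```

The partial products increase but never cross the bound, so the infinite product converges.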

One way ANOVA vs Two way ANOVA

Today in class we talked about Two-way ANOVA, so my question is how to understand the essence of these two kinds of ANOVA? I’ll try to spend less than one hour to write down what I thought about it.

Let’s take a look at an example. We have treatments i \in \{1,2,\dots,I\} and groups j\in \{1,2,\dots,J\}, and for each treatment i and group j we have m_{ij} samples X_{ijk}, k=1,\dots,m_{ij}, so our two-way ANOVA model would be

X_{ijk}=\mu_{ij}+\epsilon_{ijk}, where \mu_{ij} is the mean of the ith treatment and jth group.

What if we want to use one-way ANOVA to model it? What would that look like? Because there may be interaction between the treatment and the group, in my understanding we should consider this model to be a linear model with respect to three terms, i.e. a treatment effect, a group effect and an interaction effect, so the model should be X_{ijk}=\mu_0+\alpha_{i} +\beta_{j} +\gamma_{ij} +\epsilon_{ijk}. Now we consider the least squares solution for \mu_0,\alpha_i,\beta_j,\gamma_{ij}:

\min_{\mu_0,\alpha_i,\beta_j,\gamma_{ij}} S=\sum_{i,j,k} (X_{ijk}-\mu_0-\alpha_i-\beta_j-\gamma_{ij})^2

By taking derivatives with respect to \mu_0,\alpha,\beta,\gamma, we have

\mu_0 \sum_{ij} m_{ij}=\sum_{ijk} X_{ijk}-\sum_{i}\alpha_i(\sum_j m_{ij})-\sum_{j}\beta_j(\sum_{i} m_{ij})-\sum_{i,j} \gamma_{ij} m_{ij}

\alpha_i\sum_j m_{ij}=\sum_{jk} X_{ijk}-\mu_0\sum_j m_{ij}-\sum_j \beta_j m_{ij}-\sum_j \gamma_{ij} m_{ij}

\beta_j\sum_i m_{ij}=\sum_{ik} X_{ijk}-\mu_0\sum_i m_{ij}-\sum_i \alpha_i m_{ij}-\sum_i \gamma_{ij} m_{ij}

\gamma_{ij}=\frac{1}{m_{ij}}\sum_k X_{ijk}-\mu_0 -\alpha_i-\beta_j

Then the solution of these equations is the two-way ANOVA estimator.
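Here is a minimal numerical sketch of this least squares problem, under some assumptions of my own: the cell sizes m_{ij}, the simulated data, and the dummy-variable layout of the design matrix are all made up for illustration. Note that the model has 1+I+J+IJ parameters but only IJ cell means, so the equations above do not pin down \mu_0,\alpha,\beta,\gamma uniquely without extra constraints. `np.linalg.lstsq` returns the minimum-norm solution, which is just one of many. The fitted values \mu_0+\alpha_i+\beta_j+\gamma_{ij} are unique, though, and by the last equation they equal the cell means.

```python
import numpy as np

rng = np.random.default_rng(1)
I, J = 3, 2
m = np.array([[2, 3], [4, 2], [3, 5]])   # hypothetical unbalanced cell sizes m[i, j]

rows, y, cells = [], [], []
for i in range(I):
    for j in range(J):
        for _ in range(m[i, j]):
            x = np.zeros(1 + I + J + I * J)
            x[0] = 1.0                        # column for mu_0
            x[1 + i] = 1.0                    # column for alpha_i
            x[1 + I + j] = 1.0                # column for beta_j
            x[1 + I + J + i * J + j] = 1.0    # column for gamma_ij
            rows.append(x)
            cells.append((i, j))
            # Made-up data: some main effects, an interaction, and noise.
            y.append(1.0 + 0.5 * i - 0.3 * j + 0.2 * i * j + rng.normal(scale=0.1))

X = np.array(rows)
y = np.array(y)
cells = np.array(cells)

# Minimum-norm least squares solution of the rank-deficient problem.
theta, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ theta
rank = np.linalg.matrix_rank(X)

# Compare fitted values with the cell means (1/m_ij) * sum_k X_ijk.
max_dev = 0.0
for i in range(I):
    for j in range(J):
        mask = (cells[:, 0] == i) & (cells[:, 1] == j)
        max_dev = max(max_dev, float(np.max(np.abs(fitted[mask] - y[mask].mean()))))

print("rank of design matrix:", rank, "out of", X.shape[1], "columns")
print("largest |fitted - cell mean|:", max_dev)
```

In practice one usually adds identifiability constraints such as sum-to-zero conditions on \alpha, \beta and \gamma to single out one solution. The sketch skips that step and only checks the fitted values.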

So, as we can see, if there is only one factor then there is no interaction term \gamma, and we can simply regard the original linear model as I different groups and analyse each of them individually.

So in sum, the so-called “Two-way ANOVA” is mainly about handling the interaction between two different “factors”. So the question is: if there are three different factors, can we follow the same idea and do a “many-factor ANOVA”?