Hypothesis and Error SSP Matrices for Multivariate General Linear Model Contrasts

Theorem

Edit Issues Pull Requests Attributions Admin

Discussion

Proof

[proofplan] We reduce the hypothesis $C B M = 0$ to an ordinary multivariate general linear model for the transformed response matrix $Z := YM$. The unrestricted least-squares fit gives the residual SSP matrix $E_M$, while the constrained least-squares fit under $C\Gamma = 0$ gives an increase in fitted SSP equal to $H_M$. Once $E_M$ is positive definite, the pair $(H_M,E_M)$ determines a well-defined generalized eigenvalue problem, and the four classical MANOVA statistics are exactly the standard symmetric functions of those roots. [/proofplan] [step:Pass to the response contrast model] Define the response contrast matrix \begin{align*} Z := YM \in \mathbb{R}^{N \times \ell}. \end{align*} Since $Y = XB + U$, we have \begin{align*} Z = XBM + UM. \end{align*} Define the contrast coefficient matrix \begin{align*} \Gamma := BM \in \mathbb{R}^{r \times \ell} \end{align*} and the contrast error matrix \begin{align*} V := UM \in \mathbb{R}^{N \times \ell}. \end{align*} Then the transformed model is \begin{align*} Z = X\Gamma + V. \end{align*} The hypothesis $H_0 : C B M = 0$ is therefore exactly \begin{align*} H_0 : C\Gamma = 0. \end{align*} Because $X$ has full column rank, $X^\top X$ is invertible. Hence the unrestricted least-squares estimator of $\Gamma$ is \begin{align*} \hat\Gamma &= (X^\top X)^{-1}X^\top Z \\ &= (X^\top X)^{-1}X^\top Y M \\ &= \hat B M. \end{align*} [guided] The role of $M$ is to select the response directions in which the hypothesis is tested. We therefore replace the original response matrix $Y$ by the transformed response matrix \begin{align*} Z := YM \in \mathbb{R}^{N \times \ell}. \end{align*} Substituting the model equation $Y = XB + U$ gives \begin{align*} Z = YM = XBM + UM. \end{align*} Define \begin{align*} \Gamma := BM \in \mathbb{R}^{r \times \ell}, \qquad V := UM \in \mathbb{R}^{N \times \ell}. \end{align*} Then the transformed model has the standard multivariate linear form \begin{align*} Z = X\Gamma + V. \end{align*} The original null hypothesis $C B M = 0$ becomes exactly \begin{align*} C\Gamma = 0. \end{align*} Since $X$ has full column rank, the Gram matrix $X^\top X$ is invertible. Thus the least-squares estimator in the transformed model is \begin{align*} \hat\Gamma &= (X^\top X)^{-1}X^\top Z \\ &= (X^\top X)^{-1}X^\top Y M \\ &= \hat B M. \end{align*} This shows that testing $C B M = 0$ in the original model is the same as testing $C\Gamma = 0$ in the contrast model. [/guided] [/step] [step:Identify the error SSP matrix from the unrestricted residuals] The unrestricted fitted value in the contrast model is \begin{align*} \hat Z := X\hat\Gamma = X(X^\top X)^{-1}X^\top Z = P_X Z. \end{align*} The unrestricted residual matrix is therefore \begin{align*} R := Z - \hat Z = (I_N - P_X)Z. \end{align*} The error SSP matrix is \begin{align*} R^\top R &= Z^\top (I_N - P_X)^\top (I_N - P_X)Z. \end{align*} Since $P_X$ is symmetric and idempotent, $I_N - P_X$ is also symmetric and idempotent. Hence \begin{align*} R^\top R &= Z^\top (I_N - P_X)Z \\ &= M^\top Y^\top (I_N - P_X)Y M \\ &= E_M. \end{align*} Thus $E_M$ is the error sum-of-squares-and-products matrix for the transformed response $YM$. [/step] [step:Compute the constrained least-squares increase under $C\Gamma = 0$] Let \begin{align*} A := C(X^\top X)^{-1}C^\top \in \mathbb{R}^{k \times k}. \end{align*} Because $C$ has full row rank and $X^\top X$ is positive definite, $A$ is positive definite and hence invertible. Define \begin{align*} \tilde\Gamma := \hat\Gamma - (X^\top X)^{-1}C^\top A^{-1}C\hat\Gamma. \end{align*} Then \begin{align*} C\tilde\Gamma &= C\hat\Gamma - C(X^\top X)^{-1}C^\top A^{-1}C\hat\Gamma \\ &= C\hat\Gamma - A A^{-1}C\hat\Gamma \\ &= 0, \end{align*} so $\tilde\Gamma$ is feasible for the constraint $C\Gamma=0$. We now prove that $\tilde\Gamma$ is the constrained least-squares minimizer. Define the residual sum-of-squares functional \begin{align*} Q: \mathbb{R}^{r \times \ell} &\to \mathbb{R} \\ G &\mapsto \operatorname{tr}\!\left((Z-XG)^\top(Z-XG)\right). \end{align*} Since $X^\top(Z-X\hat\Gamma)=0$, expanding around $\hat\Gamma$ gives, for every $G \in \mathbb{R}^{r \times \ell}$, \begin{align*} Q(G) &= Q(\hat\Gamma) + \operatorname{tr}\!\left((G-\hat\Gamma)^\top X^\top X(G-\hat\Gamma)\right). \end{align*} Let $G \in \mathbb{R}^{r \times \ell}$ satisfy $CG=0$, and define $\Delta := G-\tilde\Gamma \in \mathbb{R}^{r \times \ell}$. Then $C\Delta=0$. Moreover \begin{align*} \tilde\Gamma-\hat\Gamma = -(X^\top X)^{-1}C^\top A^{-1}C\hat\Gamma, \end{align*} and hence the $X^\top X$-inner product cross-term vanishes: \begin{align*} \operatorname{tr}\!\left((\tilde\Gamma-\hat\Gamma)^\top X^\top X\Delta\right) &= -\operatorname{tr}\!\left((C\hat\Gamma)^\top A^{-1}C\Delta\right) \\ &= 0. \end{align*} Therefore \begin{align*} Q(G)-Q(\tilde\Gamma) &= \operatorname{tr}\!\left(\Delta^\top X^\top X\Delta\right) \geq 0. \end{align*} Since $X^\top X$ is positive definite, equality holds only when $\Delta=0$. Thus $\tilde\Gamma$ is the unique constrained least-squares estimator under $C\Gamma=0$. The fitted-value difference between the unrestricted and constrained fits is \begin{align*} X\hat\Gamma - X\tilde\Gamma = X(X^\top X)^{-1}C^\top A^{-1}C\hat\Gamma. \end{align*} Therefore the hypothesis SSP matrix, defined as the cross-product of this fitted-value difference, is \begin{align*} (X\hat\Gamma - X\tilde\Gamma)^\top(X\hat\Gamma - X\tilde\Gamma) &= \hat\Gamma^\top C^\top A^{-1}C(X^\top X)^{-1}X^\top X(X^\top X)^{-1}C^\top A^{-1}C\hat\Gamma \\ &= \hat\Gamma^\top C^\top A^{-1}C(X^\top X)^{-1}C^\top A^{-1}C\hat\Gamma \\ &= \hat\Gamma^\top C^\top A^{-1}A A^{-1}C\hat\Gamma \\ &= \hat\Gamma^\top C^\top A^{-1}C\hat\Gamma. \end{align*} Using $\hat\Gamma = \hat B M$, this becomes \begin{align*} H_M = M^\top \hat B^\top C^\top \left(C(X^\top X)^{-1}C^\top\right)^{-1}C\hat B M. \end{align*} Thus $H_M$ is the hypothesis sum-of-squares-and-products matrix associated with $H_0 : C B M = 0$. [guided] The constrained estimator is the unrestricted estimator adjusted just enough to satisfy $C\Gamma = 0$. Define \begin{align*} A := C(X^\top X)^{-1}C^\top \in \mathbb{R}^{k \times k}. \end{align*} The matrix $X^\top X$ is positive definite because $X$ has full column rank. Since $C$ has full row rank, for every nonzero $a \in \mathbb{R}^k$ we have $C^\top a \neq 0$, and therefore \begin{align*} a^\top A a = (C^\top a)^\top (X^\top X)^{-1}(C^\top a) > 0. \end{align*} So $A$ is positive definite and invertible. Define \begin{align*} \tilde\Gamma := \hat\Gamma - (X^\top X)^{-1}C^\top A^{-1}C\hat\Gamma. \end{align*} This matrix satisfies the constraint because \begin{align*} C\tilde\Gamma &= C\hat\Gamma - C(X^\top X)^{-1}C^\top A^{-1}C\hat\Gamma \\ &= C\hat\Gamma - A A^{-1}C\hat\Gamma \\ &= 0. \end{align*} So $\tilde\Gamma$ is feasible, but feasibility alone is not enough: we must prove it minimizes the least-squares criterion among all matrices satisfying $C\Gamma=0$. Define the least-squares objective \begin{align*} Q: \mathbb{R}^{r \times \ell} &\to \mathbb{R} \\ G &\mapsto \operatorname{tr}\!\left((Z-XG)^\top(Z-XG)\right). \end{align*} The unrestricted normal equation is $X^\top(Z-X\hat\Gamma)=0$. Therefore, when we expand $Q(G)$ around $\hat\Gamma$, the linear cross-term disappears and we get \begin{align*} Q(G) &= Q(\hat\Gamma) + \operatorname{tr}\!\left((G-\hat\Gamma)^\top X^\top X(G-\hat\Gamma)\right) \end{align*} for every $G \in \mathbb{R}^{r \times \ell}$. Now let $G \in \mathbb{R}^{r \times \ell}$ be any feasible coefficient matrix, so $CG=0$, and define \begin{align*} \Delta := G-\tilde\Gamma \in \mathbb{R}^{r \times \ell}. \end{align*} Since both $G$ and $\tilde\Gamma$ satisfy the constraint, $C\Delta=0$. Also \begin{align*} \tilde\Gamma-\hat\Gamma = -(X^\top X)^{-1}C^\top A^{-1}C\hat\Gamma. \end{align*} Thus the correction $\tilde\Gamma-\hat\Gamma$ is orthogonal, in the $X^\top X$ inner product, to every feasible direction $\Delta$: \begin{align*} \operatorname{tr}\!\left((\tilde\Gamma-\hat\Gamma)^\top X^\top X\Delta\right) &= -\operatorname{tr}\!\left((C\hat\Gamma)^\top A^{-1}C\Delta\right) \\ &= 0. \end{align*} Consequently, \begin{align*} Q(G)-Q(\tilde\Gamma) &= \operatorname{tr}\!\left(\Delta^\top X^\top X\Delta\right) \geq 0. \end{align*} The matrix $X^\top X$ is positive definite, so equality occurs only when $\Delta=0$, that is, only when $G=\tilde\Gamma$. Hence $\tilde\Gamma$ is the unique constrained least-squares fit. This is exactly the projection of $\hat\Gamma$ onto the constraint space $\{G \in \mathbb{R}^{r \times \ell}: CG=0\}$ with respect to the $X^\top X$ inner product. The hypothesis SSP matrix measures how much fitted variation is lost by imposing the null hypothesis. The difference between unrestricted and constrained fitted values is \begin{align*} X\hat\Gamma - X\tilde\Gamma = X(X^\top X)^{-1}C^\top A^{-1}C\hat\Gamma. \end{align*} Taking its cross-product gives \begin{align*} (X\hat\Gamma - X\tilde\Gamma)^\top(X\hat\Gamma - X\tilde\Gamma) &= \hat\Gamma^\top C^\top A^{-1}C(X^\top X)^{-1}X^\top X(X^\top X)^{-1}C^\top A^{-1}C\hat\Gamma \\ &= \hat\Gamma^\top C^\top A^{-1}C(X^\top X)^{-1}C^\top A^{-1}C\hat\Gamma \\ &= \hat\Gamma^\top C^\top A^{-1}A A^{-1}C\hat\Gamma \\ &= \hat\Gamma^\top C^\top A^{-1}C\hat\Gamma. \end{align*} Finally, since $\hat\Gamma = \hat B M$, this is exactly \begin{align*} H_M = M^\top \hat B^\top C^\top \left(C(X^\top X)^{-1}C^\top\right)^{-1}C\hat B M. \end{align*} This verifies that $H_M$ is the hypothesis SSP matrix for the contrast hypothesis. [/guided] [/step] [step:Express the four MANOVA statistics through the real nonnegative roots of $(H_M,E_M)$] Assume $E_M$ is positive definite. Since $H_M$ is a cross-product matrix, it is positive semidefinite. Let $E_M^{1/2} \in \mathbb{R}^{\ell \times \ell}$ denote the positive definite square root of $E_M$, and define \begin{align*} S := E_M^{-1/2}H_M E_M^{-1/2} \in \mathbb{R}^{\ell \times \ell}. \end{align*} Then $S$ is symmetric positive semidefinite. Also \begin{align*} E_M^{-1}H_M = E_M^{-1/2} S E_M^{1/2}, \end{align*} so $E_M^{-1}H_M$ is similar to $S$. Therefore the generalized roots of $(H_M,E_M)$ are real and nonnegative. Denote the nonzero roots, counted with multiplicity, by \begin{align*} \lambda_1,\dots,\lambda_s, \end{align*} where $s \leq \ell$. The four standard MANOVA statistics are then obtained from these roots: \begin{align*} \Lambda_{\mathrm{Wilks}} &:= \frac{\det(E_M)}{\det(E_M + H_M)} = \prod_{i=1}^{s}(1+\lambda_i)^{-1}, \\ V_{\mathrm{Pillai}} &:= \operatorname{tr}\!\left(H_M(H_M+E_M)^{-1}\right) = \sum_{i=1}^{s}\frac{\lambda_i}{1+\lambda_i}, \\ T_{\mathrm{LH}} &:= \operatorname{tr}\!\left(E_M^{-1}H_M\right) = \sum_{i=1}^{s}\lambda_i, \\ \Theta_{\mathrm{Roy}} &:= \lambda_{\max}, \end{align*} where $\lambda_{\max} := \max\{\lambda_i : 1 \leq i \leq s\}$. Thus Wilks' lambda, Pillai's trace, the Lawley-Hotelling trace, and Roy's largest root are precisely the root statistics associated with $(H_M,E_M)$. [guided] The final step is to check that the root statistics are well-defined. The hypothesis matrix $H_M$ was constructed as a cross-product matrix, so for every $a \in \mathbb{R}^\ell$, \begin{align*} a^\top H_M a \geq 0. \end{align*} Thus $H_M$ is positive semidefinite. The theorem assumes $E_M$ is positive definite, so it has a unique positive definite square root $E_M^{1/2}$ and an inverse square root $E_M^{-1/2}$. Define \begin{align*} S := E_M^{-1/2}H_M E_M^{-1/2} \in \mathbb{R}^{\ell \times \ell}. \end{align*} This matrix is symmetric because both $E_M^{-1/2}$ and $H_M$ are symmetric, and it is positive semidefinite because \begin{align*} a^\top S a = (E_M^{-1/2}a)^\top H_M(E_M^{-1/2}a) \geq 0 \end{align*} for every $a \in \mathbb{R}^\ell$. Therefore all eigenvalues of $S$ are real and nonnegative. The generalized eigenvalue problem for the pair $(H_M,E_M)$ is equivalent to the ordinary eigenvalue problem for $S$. Indeed, \begin{align*} E_M^{-1}H_M = E_M^{-1/2} S E_M^{1/2}, \end{align*} so $E_M^{-1}H_M$ is similar to $S$ and has the same eigenvalues. Denote the nonzero eigenvalues by \begin{align*} \lambda_1,\dots,\lambda_s, \end{align*} counted with multiplicity, where $s \leq \ell$. These are the real nonnegative generalized roots of $(H_M,E_M)$. The classical MANOVA tests are symmetric functions of these roots: \begin{align*} \Lambda_{\mathrm{Wilks}} &:= \frac{\det(E_M)}{\det(E_M + H_M)} = \prod_{i=1}^{s}(1+\lambda_i)^{-1}, \\ V_{\mathrm{Pillai}} &:= \operatorname{tr}\!\left(H_M(H_M+E_M)^{-1}\right) = \sum_{i=1}^{s}\frac{\lambda_i}{1+\lambda_i}, \\ T_{\mathrm{LH}} &:= \operatorname{tr}\!\left(E_M^{-1}H_M\right) = \sum_{i=1}^{s}\lambda_i, \\ \Theta_{\mathrm{Roy}} &:= \lambda_{\max}, \end{align*} where $\lambda_{\max} := \max\{\lambda_i : 1 \leq i \leq s\}$. Hence the pair $(H_M,E_M)$ supplies exactly the roots used by Wilks' lambda, Pillai's trace, the Lawley-Hotelling trace, and Roy's largest root. [/guided] [/step] [step:Conclude that the constructed matrices are the required SSP matrices] The preceding steps construct $E_M$ as the residual cross-product matrix for the unrestricted fit of $YM$ on $X$, and construct $H_M$ as the fitted cross-product increase obtained by comparing the unrestricted fit with the constrained fit satisfying $C B M = 0$. Hence $H_M$ and $E_M$ are exactly the hypothesis and error SSP matrices associated with the estimable hypothesis $H_0 : C B M = 0$. When $E_M$ is positive definite, the matrix pair $(H_M,E_M)$ has well-defined generalized roots, and the four classical MANOVA tests are obtained by applying the corresponding Wilks, Pillai, Lawley-Hotelling, and Roy root statistics to that pair. This proves the theorem. [/step]

Prerequisites (0/1 completed)

Prerequisites Graph

Interactive dependency map showing how this theorem builds on foundational concepts

Loading dependency graph...

Theorems

test

What brings you to Androma?

Start with a route through the knowledge graph.