Reduction of Multivariate Linear Hypotheses to MANOVA

Reduction of Multivariate Linear Hypotheses to MANOVA (Theorem # 4065)

Theorem

Edit Issues Pull Requests Attributions Admin

Discussion

Proof

[proofplan] We rewrite the hypothesis part using the matrix-normal distribution of $C\hat B M$. The row covariance of this estimator is $C(X^\top X)^{-1}C^\top$, so premultiplying by its inverse square root reduces the hypothesis sum of squares to the cross-product of an $r \times s$ matrix with independent $\mathcal{N}_s(0,\Omega_M)$ rows. The residual matrix is the projection of the noise onto the orthogonal complement of the column space of $X$, and after applying $M$ its cross-product is a Wishart matrix with $n-q$ degrees of freedom. Orthogonality of the projection onto $\operatorname{Range}(X)$ and its complement gives independence of the hypothesis and error parts. [/proofplan] [step:Standardize the estimable hypothesis matrix] Define \begin{align*} A_C := C(X^\top X)^{-1}C^\top \in \mathbb{R}^{r \times r}. \end{align*} Since $X$ has rank $q$, the matrix $X^\top X$ is symmetric positive definite. Since $C$ has rank $r$, $A_C$ is symmetric positive definite. Let $A_C^{-1/2}$ denote the symmetric positive definite inverse square root of $A_C$. Let $(\Omega_{\mathrm{prob}},\mathcal{F}_{\mathrm{prob}},\mathbb{P})$ denote the probability space on which the random error matrix $U: \Omega_{\mathrm{prob}} \to \mathbb{R}^{n \times p}$ is defined. The least-squares estimator satisfies \begin{align*} \hat B = B + (X^\top X)^{-1}X^\top U. \end{align*} Therefore \begin{align*} C\hat B M = CBM + C(X^\top X)^{-1}X^\top U M. \end{align*} Under $H_0$, $CBM=0$, so \begin{align*} C\hat B M = C(X^\top X)^{-1}X^\top U M. \end{align*} Define the standardized hypothesis matrix \begin{align*} Z: \Omega_{\mathrm{prob}} &\to \mathbb{R}^{r \times s} \\ \omega &\mapsto A_C^{-1/2}C\hat B(\omega)M. \end{align*} Because $UM$ has independent rows with distribution $\mathcal{N}_s(0,\Omega_M)$ and \begin{align*} A_C^{-1/2}C(X^\top X)^{-1}X^\top \end{align*} has row Gram matrix \begin{align*} A_C^{-1/2}C(X^\top X)^{-1}X^\top X(X^\top X)^{-1}C^\top A_C^{-1/2} = A_C^{-1/2}A_C A_C^{-1/2} = I_r, \end{align*} the rows of $Z$ are independent $\mathcal{N}_s(0,\Omega_M)$ random vectors. Hence \begin{align*} H_M = (A_C^{-1/2}C\hat B M)^\top(A_C^{-1/2}C\hat B M) = Z^\top Z. \end{align*} Thus \begin{align*} H_M \sim W_s(\Omega_M,r). \end{align*} [guided] The role of $A_C$ is to measure the covariance among the $r$ linear hypothesis directions. We define \begin{align*} A_C := C(X^\top X)^{-1}C^\top. \end{align*} This matrix is invertible because $X^\top X$ is symmetric positive definite and $C$ has full row rank. Thus the inverse square root $A_C^{-1/2}$ exists and standardizes the hypothesis directions. Let $(\Omega_{\mathrm{prob}},\mathcal{F}_{\mathrm{prob}},\mathbb{P})$ denote the probability space on which the random error matrix $U: \Omega_{\mathrm{prob}} \to \mathbb{R}^{n \times p}$ is defined. The least-squares estimator is \begin{align*} \hat B = (X^\top X)^{-1}X^\top Y = (X^\top X)^{-1}X^\top(XB+U) = B + (X^\top X)^{-1}X^\top U. \end{align*} Multiplying on the left by $C$ and on the right by $M$ gives \begin{align*} C\hat B M = CBM + C(X^\top X)^{-1}X^\top U M. \end{align*} Under the null hypothesis $CBM=0$, this becomes \begin{align*} C\hat B M = C(X^\top X)^{-1}X^\top U M. \end{align*} Now define \begin{align*} Z: \Omega_{\mathrm{prob}} &\to \mathbb{R}^{r \times s} \\ \omega &\mapsto A_C^{-1/2}C\hat B(\omega)M. \end{align*} The rows of $UM$ are independent centered normal vectors with covariance \begin{align*} \Omega_M = M^\top \Sigma M. \end{align*} The row covariance of $Z$ is the Gram matrix of the deterministic multiplier: \begin{align*} A_C^{-1/2}C(X^\top X)^{-1}X^\top X(X^\top X)^{-1}C^\top A_C^{-1/2} = A_C^{-1/2}A_C A_C^{-1/2} = I_r. \end{align*} Thus $Z$ has independent rows, each distributed as $\mathcal{N}_s(0,\Omega_M)$. Its cross-product is exactly \begin{align*} Z^\top Z &= M^\top \hat B^\top C^\top A_C^{-1/2}A_C^{-1/2}C\hat B M \\ &= M^\top \hat B^\top C^\top A_C^{-1}C\hat B M \\ &= H_M. \end{align*} Therefore $H_M$ is a central Wishart matrix with scale $\Omega_M$ and $r$ degrees of freedom: \begin{align*} H_M \sim W_s(\Omega_M,r). \end{align*} [/guided] [/step] [step:Represent the residual cross-product as a projected Gaussian cross-product] Define the projection matrices \begin{align*} P_X := X(X^\top X)^{-1}X^\top \in \mathbb{R}^{n \times n}, \qquad Q_X := I_n - P_X \in \mathbb{R}^{n \times n}. \end{align*} Then \begin{align*} Y-X\hat B = (I_n-P_X)Y = Q_X(XB+U)=Q_XU, \end{align*} because $Q_XX=0$. Hence \begin{align*} E_M = M^\top U^\top Q_X U M = (Q_XUM)^\top(Q_XUM). \end{align*} The matrix $Q_X$ is symmetric idempotent with rank $n-q$. Therefore there exists an orthogonal matrix $O \in \mathbb{R}^{n \times n}$ such that \begin{align*} Q_X = O^\top \begin{pmatrix} I_{n-q} & 0 \\ 0 & 0_q \end{pmatrix} O. \end{align*} Since left multiplication by an orthogonal matrix preserves independence and the common covariance of the rows of $UM$, the first $n-q$ rows of $OUM$ are independent $\mathcal{N}_s(0,\Omega_M)$ random vectors, and the remaining rows do not contribute to the cross-product. Therefore \begin{align*} E_M \sim W_s(\Omega_M,n-q). \end{align*} [guided] The residual matrix is obtained by projecting $Y$ away from the column space of $X$. Define \begin{align*} P_X := X(X^\top X)^{-1}X^\top, \qquad Q_X := I_n-P_X. \end{align*} The matrix $P_X$ is the [orthogonal projection](/theorems/437) onto $\operatorname{Range}(X)$, and $Q_X$ is the orthogonal projection onto $\operatorname{Range}(X)^\perp$. Since $\hat B=(X^\top X)^{-1}X^\top Y$, we have \begin{align*} X\hat B = X(X^\top X)^{-1}X^\top Y = P_XY. \end{align*} Thus \begin{align*} Y-X\hat B = Y-P_XY = Q_XY. \end{align*} Substituting $Y=XB+U$ gives \begin{align*} Y-X\hat B = Q_X(XB+U)=Q_XXB+Q_XU. \end{align*} Because $P_XX=X$, we have $Q_XX=(I_n-P_X)X=0$, so \begin{align*} Y-X\hat B = Q_XU. \end{align*} After projecting through $M$, the error cross-product becomes \begin{align*} E_M = M^\top U^\top Q_X^\top Q_X U M. \end{align*} Since $Q_X$ is symmetric and idempotent, $Q_X^\top Q_X=Q_X$, hence \begin{align*} E_M = M^\top U^\top Q_X U M = (Q_XUM)^\top(Q_XUM). \end{align*} The matrix $Q_X$ is symmetric idempotent with rank $n-q$, so it is orthogonally diagonalizable with eigenvalue $1$ repeated $n-q$ times and eigenvalue $0$ repeated $q$ times. Therefore there is an orthogonal matrix $O \in \mathbb{R}^{n \times n}$ such that \begin{align*} Q_X = O^\top \begin{pmatrix} I_{n-q} & 0 \\ 0 & 0_q \end{pmatrix} O. \end{align*} Let \begin{align*} V: \Omega_{\mathrm{prob}} &\to \mathbb{R}^{n \times s} \\ \omega &\mapsto O\,U(\omega)M. \end{align*} The rows of $UM$ are independent $\mathcal{N}_s(0,\Omega_M)$ random vectors, and orthogonal left multiplication preserves the joint centered Gaussian law and transforms the row covariance by $OO^\top=I_n$. Hence the rows of $V$ are independent $\mathcal{N}_s(0,\Omega_M)$ random vectors. The cross-product becomes the sum of the first $n-q$ row outer products: \begin{align*} E_M = V^\top \begin{pmatrix} I_{n-q} & 0 \\ 0 & 0_q \end{pmatrix} V. \end{align*} Thus $E_M$ is the cross-product of $n-q$ independent $\mathcal{N}_s(0,\Omega_M)$ rows, and therefore \begin{align*} E_M \sim W_s(\Omega_M,n-q). \end{align*} [/guided] [/step] [step:Use orthogonal projection to prove independence] The standardized hypothesis matrix $Z$ depends on $U$ only through $P_XUM$, because \begin{align*} A_C^{-1/2}C(X^\top X)^{-1}X^\top U M = A_C^{-1/2}C(X^\top X)^{-1}X^\top P_X U M. \end{align*} The residual cross-product $E_M$ depends on $U$ only through $Q_XUM$. Since $P_XQ_X=0$, the Gaussian random matrices $P_XUM$ and $Q_XUM$ are independent. Therefore $Z$ and $E_M$ are independent, and hence $H_M=Z^\top Z$ and $E_M$ are independent. [/step] [step:Identify the resulting MANOVA hypothesis-error structure] We have shown that, under $H_0$, \begin{align*} H_M \sim W_s(\Omega_M,r), \qquad E_M \sim W_s(\Omega_M,n-q), \end{align*} and that $H_M$ and $E_M$ are independent. The matrix $H_M$ is therefore the hypothesis sum-of-squares-and-products matrix for $r$ standardized hypothesis directions and $s$ response combinations, while $E_M$ is the corresponding error sum-of-squares-and-products matrix with $n-q$ residual degrees of freedom. This is precisely the central MANOVA hypothesis-error structure for the transformed responses $YM$ and the linear hypothesis $CBM=0$. [/step]

Prerequisites (0/3 completed)

Prerequisites Graph

Interactive dependency map showing how this theorem builds on foundational concepts

Loading dependency graph...

Theorems

Orthogonal Projection

Definitions & Concepts

Explore Further

Distribution Definition Orthogonality Definition Orthogonal Projection Theorem #437 Wilks' Lambda Product Formula probability Strong Consistency of the Multivariate Normal Maximum Likelihood Estimators probability Noncentral F Distribution of Hotelling's $T^2$ Statistic probability Bonferroni Simultaneous Confidence Intervals for Linear Contrasts of a Multivariate Normal Mean probability Tracy-Widom Limit for the Largest Eigenvalue of a Real Wishart Matrix probability Quadratic Discriminant Analysis Bayes Rule probability Spectral Density Formula for a Causal ARMA Process probability Durbin-Levinson Recursion probability

What brings you to Androma?

Start with a route through the knowledge graph.