One-Sample Hotelling $T^2$ Distribution Theorem — Statement & Proof

One-Sample Hotelling $T^2$ Distribution Theorem (Theorem # 4022)

Theorem

Edit Issues Pull Requests Attributions Admin

Discussion

Proof

[proofplan] We first reduce the statistic to a whitened normal-Wishart problem: under $H_0$, the standardized sample mean is standard Gaussian and is independent of the standardized sample covariance matrix. The core distributional calculation is then the identity that, for $Y \sim \mathcal{N}_p(0,I_p)$ independent of $B \sim W_p(n-1,I_p)$, the quadratic form $Y^\top B^{-1}Y$ is a ratio of independent chi-square variables. This follows by rotating $Y$ to the first coordinate axis and applying a Schur-complement decomposition of a standard Wishart matrix. The claimed $F$ distribution is then obtained by restoring the factor relating $B$ to $(n-1)S$. [/proofplan] [step:Whiten the sample mean and covariance matrix] Let $\Sigma^{1/2} \in \mathbb{R}^{p \times p}$ denote the symmetric positive definite square root of $\Sigma$, and let $C=(\Sigma^{1/2})^{-1} \in \mathbb{R}^{p \times p}$ denote its symmetric positive definite inverse square root, so that $C\Sigma C = I_p$. Define the random vector $Y: \Omega \to \mathbb{R}^p$ and the random matrix $B: \Omega \to \mathbb{R}^{p \times p}$ by \begin{align*} Y &= \sqrt{n}\,C(\bar{X}-\mu_0), & B &= (n-1)CSC. \end{align*} By the normal-Wishart sampling theorem for multivariate normal samples (citing a result not yet in the wiki: normal-Wishart independence theorem), under $H_0$ we have \begin{align*} Y &\sim \mathcal{N}_p(0,I_p), & B &\sim W_p(n-1,I_p), \end{align*} and $Y$ and $B$ are independent. Here $W_p(\nu,\Psi)$ denotes the $p$-variate Wishart distribution with $\nu$ degrees of freedom and scale matrix $\Psi$, equivalently the law of $\sum_{k=1}^{\nu} Z_k Z_k^\top$ for independent random vectors $Z_k \sim \mathcal{N}_p(0,\Psi)$. Since $n>p$, the Wishart matrix $B$ has $n-1 \ge p$ degrees of freedom and is invertible almost surely. Hence $S$ is invertible almost surely because \begin{align*} S = \frac{1}{n-1}\Sigma^{1/2}B\Sigma^{1/2}. \end{align*} On this almost sure event, \begin{align*} S^{-1} = (n-1)CB^{-1}C. \end{align*} Therefore \begin{align*} T^2 &= n(\bar{X}-\mu_0)^\top S^{-1}(\bar{X}-\mu_0) \\ &= n(\bar{X}-\mu_0)^\top (n-1)CB^{-1}C(\bar{X}-\mu_0) \\ &= (n-1)Y^\top B^{-1}Y. \end{align*} [guided] The goal is to remove the unknown covariance matrix $\Sigma$ from the statistic. Since $\Sigma$ is symmetric positive definite, it has a symmetric positive definite square root $\Sigma^{1/2}$ and a symmetric positive definite inverse square root $C=\Sigma^{-1/2}$ satisfying $C\Sigma C=I_p$. Define \begin{align*} Y &= \sqrt{n}\,C(\bar{X}-\mu_0), & B &= (n-1)CSC. \end{align*} The random vector $Y: \Omega \to \mathbb{R}^p$ is the whitened sample mean, and the random matrix $B: \Omega \to \mathbb{R}^{p \times p}$ is the whitened scaled sample covariance matrix. By the normal-Wishart sampling theorem for multivariate normal samples, under the null hypothesis $H_0:\mu=\mu_0$, \begin{align*} Y &\sim \mathcal{N}_p(0,I_p), & B &\sim W_p(n-1,I_p), \end{align*} and $Y$ is independent of $B$. Here $W_p(\nu,\Psi)$ denotes the $p$-variate Wishart distribution with $\nu$ degrees of freedom and scale matrix $\Psi$, equivalently the law of $\sum_{k=1}^{\nu} Z_k Z_k^\top$ for independent random vectors $Z_k \sim \mathcal{N}_p(0,\Psi)$. The hypothesis $n>p$ is used here to ensure that the Wishart matrix has enough degrees of freedom: $n-1 \ge p$. A standard Wishart matrix with at least $p$ degrees of freedom is invertible almost surely, so $B$ is invertible almost surely. The relation between $B$ and $S$ is \begin{align*} B=(n-1)CSC. \end{align*} Solving this for $S$ gives \begin{align*} S = \frac{1}{n-1}\Sigma^{1/2}B\Sigma^{1/2}. \end{align*} Thus, whenever $B$ is invertible, \begin{align*} S^{-1} = (n-1)CB^{-1}C. \end{align*} Substituting this into Hotelling's statistic gives \begin{align*} T^2 &= n(\bar{X}-\mu_0)^\top S^{-1}(\bar{X}-\mu_0) \\ &= n(\bar{X}-\mu_0)^\top (n-1)CB^{-1}C(\bar{X}-\mu_0). \end{align*} Because $Y=\sqrt{n}\,C(\bar{X}-\mu_0)$, this becomes \begin{align*} T^2 = (n-1)Y^\top B^{-1}Y. \end{align*} So the theorem is reduced to identifying the distribution of $Y^\top B^{-1}Y$ when $Y$ is standard Gaussian and independent of a standard Wishart matrix. [/guided] [/step] [step:Compute the inverse-Wishart quadratic form by rotating the Gaussian vector] Set $m=n-1$. We prove the following distributional identity: if $Y \sim \mathcal{N}_p(0,I_p)$, $B \sim W_p(m,I_p)$, $m \ge p$, and $Y$ is independent of $B$, then \begin{align*} \frac{m-p+1}{p}Y^\top B^{-1}Y \sim F_{p,m-p+1}. \end{align*} Define the nonnegative random variable $R: \Omega \to [0,\infty)$ by \begin{align*} R = (Y^\top Y)^{1/2}. \end{align*} Then $R^2=Y^\top Y \sim \chi^2_p$. On the event $Y\ne 0$, choose an orthogonal matrix $Q_Y \in \mathbb{R}^{p \times p}$ such that \begin{align*} Q_YY = R e_1, \end{align*} where $e_1=(1,0,\dots,0)^\top \in \mathbb{R}^p$. Since $Y=0$ has probability $0$, this defines the relevant distribution almost surely. Conditional on $Y$, the matrix \begin{align*} A = Q_Y B Q_Y^\top \end{align*} has distribution $W_p(m,I_p)$ and is independent of $R^2$, because the standard Wishart law is invariant under deterministic orthogonal conjugation and $B$ is independent of $Y$. Hence \begin{align*} Y^\top B^{-1}Y &= (Q_YY)^\top (Q_YBQ_Y^\top)^{-1}(Q_YY) \\ &= R^2 e_1^\top A^{-1}e_1. \end{align*} [claim:Schur complement of a standard Wishart matrix] Let $A \sim W_p(m,I_p)$ with $m \ge p$. Write $A$ in block form \begin{align*} A = \begin{pmatrix} a_{11} & a_{12}^\top \\ a_{12} & A_{22} \end{pmatrix}, \end{align*} where $a_{11} \in \mathbb{R}$, $a_{12} \in \mathbb{R}^{p-1}$, and $A_{22} \in \mathbb{R}^{(p-1)\times(p-1)}$. Then $A_{22}$ is invertible almost surely, and the Schur complement \begin{align*} C_A = a_{11}-a_{12}^\top A_{22}^{-1}a_{12} \end{align*} satisfies \begin{align*} C_A \sim \chi^2_{m-p+1}. \end{align*} Moreover, \begin{align*} e_1^\top A^{-1}e_1 = C_A^{-1}. \end{align*} [/claim] [proof] Represent $A$ as \begin{align*} A = G^\top G, \end{align*} where $G: \Omega \to \mathbb{R}^{m \times p}$ is a random matrix whose entries are independent $\mathcal{N}(0,1)$ random variables. Partition $G$ by columns as \begin{align*} G = (g_1\; G_2), \end{align*} where $g_1: \Omega \to \mathbb{R}^m$ is the first column and $G_2: \Omega \to \mathbb{R}^{m \times (p-1)}$ is the remaining column block. Then \begin{align*} a_{11} &= g_1^\top g_1, & a_{12} &= G_2^\top g_1, & A_{22} &= G_2^\top G_2. \end{align*} Since $m \ge p$, the Gaussian matrix $G_2$ has rank $p-1$ almost surely, so $A_{22}=G_2^\top G_2$ is invertible almost surely. On this event, define the [orthogonal projection](/theorems/437) matrix $P_2: \mathbb{R}^m \to \mathbb{R}^m$ onto $\operatorname{Range}(G_2)$ by \begin{align*} P_2 = G_2(G_2^\top G_2)^{-1}G_2^\top. \end{align*} Then \begin{align*} C_A &= g_1^\top g_1 - g_1^\top G_2(G_2^\top G_2)^{-1}G_2^\top g_1 \\ &= g_1^\top(I_m-P_2)g_1. \end{align*} Conditional on $G_2$, the matrix $I_m-P_2$ is an orthogonal projection of rank \begin{align*} m-(p-1)=m-p+1. \end{align*} Since $g_1 \sim \mathcal{N}_m(0,I_m)$ and $g_1$ is independent of $G_2$, the quadratic form $g_1^\top(I_m-P_2)g_1$ has conditional distribution $\chi^2_{m-p+1}$. This conditional distribution does not depend on $G_2$, so $C_A \sim \chi^2_{m-p+1}$. Finally, the block inverse formula for an invertible block matrix with invertible lower-right block gives \begin{align*} e_1^\top A^{-1}e_1 = \left(a_{11}-a_{12}^\top A_{22}^{-1}a_{12}\right)^{-1} = C_A^{-1}. \end{align*} [/proof] By the claim, \begin{align*} Y^\top B^{-1}Y = \frac{R^2}{C_A}, \end{align*} where $R^2 \sim \chi^2_p$ and $C_A \sim \chi^2_{m-p+1}$. The random variables $R^2$ and $C_A$ are independent because $R^2$ is a function of $Y$ and $C_A$ is computed from the conditionally rotated Wishart matrix, whose conditional distribution is independent of $Y$. Therefore, by the definition of the $F$ distribution as a ratio of independent normalized chi-square variables, \begin{align*} \frac{m-p+1}{p}Y^\top B^{-1}Y = \frac{R^2/p}{C_A/(m-p+1)} \sim F_{p,m-p+1}. \end{align*} [guided] We now prove the distributional identity that drives the theorem. Let $m=n-1$, and suppose \begin{align*} Y &\sim \mathcal{N}_p(0,I_p), & B &\sim W_p(m,I_p), \end{align*} with $Y$ independent of $B$ and $m\ge p$. The first idea is rotational symmetry. Define the nonnegative random variable $R: \Omega \to [0,\infty)$ by \begin{align*} R = (Y^\top Y)^{1/2}. \end{align*} Since $Y$ is standard Gaussian in $\mathbb{R}^p$, we have $R^2=Y^\top Y\sim \chi^2_p$. On the event $Y\ne 0$, choose an orthogonal matrix $Q_Y\in\mathbb{R}^{p\times p}$ such that \begin{align*} Q_YY = R e_1, \end{align*} where $e_1=(1,0,\dots,0)^\top$. The event $Y=0$ has probability $0$, so ignoring it does not change the distribution of the statistic. Now define \begin{align*} A = Q_YBQ_Y^\top. \end{align*} Conditional on $Y$, the matrix $Q_Y$ is deterministic. Since the standard Wishart distribution $W_p(m,I_p)$ is invariant under orthogonal conjugation, the conditional distribution of $A$ is again $W_p(m,I_p)$. Also, because $B$ is independent of $Y$, this conditional distribution does not depend on $Y$. The quadratic form becomes \begin{align*} Y^\top B^{-1}Y &= (Q_YY)^\top (Q_YBQ_Y^\top)^{-1}(Q_YY) \\ &= R^2 e_1^\top A^{-1}e_1. \end{align*} So the only remaining task is to identify the upper-left entry of $A^{-1}$ for a standard Wishart matrix $A$. Write $A$ in block form: \begin{align*} A = \begin{pmatrix} a_{11} & a_{12}^\top \\ a_{12} & A_{22} \end{pmatrix}, \end{align*} where $a_{11}\in\mathbb{R}$, $a_{12}\in\mathbb{R}^{p-1}$, and $A_{22}\in\mathbb{R}^{(p-1)\times(p-1)}$. We claim that \begin{align*} e_1^\top A^{-1}e_1 = \left(a_{11}-a_{12}^\top A_{22}^{-1}a_{12}\right)^{-1}, \end{align*} and that the Schur complement \begin{align*} C_A = a_{11}-a_{12}^\top A_{22}^{-1}a_{12} \end{align*} has distribution $\chi^2_{m-p+1}$. To prove this, represent the Wishart matrix as $A=G^\top G$, where $G:\Omega\to\mathbb{R}^{m\times p}$ has independent standard normal entries. Partition $G$ by columns: \begin{align*} G=(g_1\;G_2), \end{align*} with $g_1:\Omega\to\mathbb{R}^m$ and $G_2:\Omega\to\mathbb{R}^{m\times(p-1)}$. Then \begin{align*} a_{11} &= g_1^\top g_1, & a_{12} &= G_2^\top g_1, & A_{22} &= G_2^\top G_2. \end{align*} Since $m\ge p$, the Gaussian matrix $G_2$ has full column rank $p-1$ almost surely, so $A_{22}$ is invertible almost surely. Define the orthogonal projection matrix $P_2:\mathbb{R}^m\to\mathbb{R}^m$ onto the column space $\operatorname{Range}(G_2)$ by \begin{align*} P_2 = G_2(G_2^\top G_2)^{-1}G_2^\top. \end{align*} Then the Schur complement is \begin{align*} C_A &= g_1^\top g_1 - g_1^\top G_2(G_2^\top G_2)^{-1}G_2^\top g_1 \\ &= g_1^\top(I_m-P_2)g_1. \end{align*} Conditional on $G_2$, the matrix $I_m-P_2$ is an orthogonal projection. Its rank is the [dimension of the orthogonal complement](/theorems/3297) of $\operatorname{Range}(G_2)$, namely \begin{align*} m-(p-1)=m-p+1. \end{align*} Since $g_1$ is an $\mathcal{N}_m(0,I_m)$ random vector independent of $G_2$, the projected squared length $g_1^\top(I_m-P_2)g_1$ is chi-square with $m-p+1$ degrees of freedom. Hence \begin{align*} C_A\sim \chi^2_{m-p+1}. \end{align*} The block inverse formula gives \begin{align*} e_1^\top A^{-1}e_1 = C_A^{-1}. \end{align*} Therefore \begin{align*} Y^\top B^{-1}Y = \frac{R^2}{C_A}. \end{align*} Here $R^2\sim\chi^2_p$, while $C_A\sim\chi^2_{m-p+1}$, and these two variables are independent because $R^2$ depends only on $Y$ whereas the Schur complement comes from the independently rotated Wishart matrix. By the definition of the $F$ distribution, \begin{align*} \frac{R^2/p}{C_A/(m-p+1)} \sim F_{p,m-p+1}. \end{align*} Equivalently, \begin{align*} \frac{m-p+1}{p}Y^\top B^{-1}Y \sim F_{p,m-p+1}. \end{align*} [/guided] [/step] [step:Restore the Hotelling scaling] From the first step, \begin{align*} T^2=(n-1)Y^\top B^{-1}Y. \end{align*} From the second step with $m=n-1$, \begin{align*} \frac{n-p}{p}Y^\top B^{-1}Y \sim F_{p,n-p}. \end{align*} Multiplying the identity $T^2=(n-1)Y^\top B^{-1}Y$ by $\frac{n-p}{p(n-1)}$ gives \begin{align*} \frac{n-p}{p(n-1)}T^2 = \frac{n-p}{p}Y^\top B^{-1}Y \sim F_{p,n-p}. \end{align*} This is the asserted distribution of the one-sample Hotelling statistic under $H_0$. [/step]

Prerequisites (0/2 completed)

Prerequisites Graph

Interactive dependency map showing how this theorem builds on foundational concepts

Loading dependency graph...

Theorems

Orthogonal Projection

Definitions & Concepts

Distribution

Explore Further

Distribution Definition Orthogonal Projection Theorem #437 Affine Stability of the Multivariate Normal Distribution probability Linear Filter Spectral Transformation Theorem probability Asymptotic Normality of ARMA Maximum Likelihood Estimators probability Independence of the Sample Mean and Sample Covariance for a Multivariate Normal Sample probability Two-Sample Hotelling $T^2$ Distribution Theorem probability Nelson's Strict Stationarity Theorem for the GARCH(1,1) Process probability Unbiasedness of the Sample Mean and Sample Covariance Matrix probability Root-Pattern Sensitivity of Classical One-Way MANOVA Criteria probability

What brings you to Androma?

Start with a route through the knowledge graph.

One-Sample Hotelling $T^2$ Distribution Theorem (Theorem # 4022)

Discussion

Proof

Prerequisites (0/2 completed)

Prerequisites Graph

Explore Further

Sign in to Androma

Check your inbox

One last step

One-Sample Hotelling $T^2$ Distribution Theorem (Theorem # 4022)

Discussion

Proof

Prerequisites (0/2 completed)

Prerequisites Graph

Explore Further