Attributions & Verification

Track contributions and verify content correctness

Proof


[step:Verify independence of $Z$ and $W$, then invoke the $t_{n-p}$ definition]The definition of $t_{n-p}$ requires $Z \sim N(0,1)$, $W \sim \chi^2_{n-p}$, and $Z \perp\!\!\!\perp W$. We verify all three. **$Z \sim N(0,1)$:** Step 3 gives $\hat Y^* - Y^* \sim N(0, \sigma^2(\tau^2 + 1))$, and $Z$ is its standardisation. **$W \sim \chi^2_{n-p}$:** By [Distributional Properties of the Normal Linear Model](/theorems/1445) part (2). **$Z \perp\!\!\!\perp W$:** The random variable $Z$ is a (measurable) function of the pair $(\hat\beta, \varepsilon^*)$, and $W$ is a (measurable) function of $\mathrm{RSS}$. It therefore suffices to show that $\mathrm{RSS}$ is independent of the pair $(\hat\beta, \varepsilon^*)$. We use two facts: (i) $\mathrm{RSS} \perp\!\!\!\perp \hat\beta$ (within the training data), by [Distributional Properties of the Normal Linear Model](/theorems/1445) part (3); (ii) $(\hat\beta, \mathrm{RSS}) \perp\!\!\!\perp \varepsilon^*$ (training vs future noise), because $\varepsilon^*$ is independent of $Y$ by hypothesis and the pair $(\hat\beta, \mathrm{RSS})$ is a (measurable) function of $Y$. Note that the pairwise independences $\mathrm{RSS} \perp\!\!\!\perp \hat\beta$ and $\mathrm{RSS} \perp\!\!\!\perp \varepsilon^*$ alone would not suffice in general; it is fact (ii), which concerns the pair, that makes the argument work. For Borel sets $A$, $B$, $C$, \begin{align*} P(\mathrm{RSS} \in A,\ \hat\beta \in B,\ \varepsilon^* \in C) &= P(\mathrm{RSS} \in A,\ \hat\beta \in B)\, P(\varepsilon^* \in C) \\ &= P(\mathrm{RSS} \in A)\, P(\hat\beta \in B)\, P(\varepsilon^* \in C), \end{align*} using (ii) and then (i). Hence $\mathrm{RSS}$, $\hat\beta$ and $\varepsilon^*$ are mutually independent, and in particular $\mathrm{RSS}$ is independent of the pair $(\hat\beta, \varepsilon^*)$. Passing to measurable functions, $W \perp\!\!\!\perp Z$. 
Having verified all three conditions, the definition of the $t$ distribution gives \begin{align*} \frac{\hat Y^* - Y^*}{\tilde\sigma \sqrt{\tau^2 + 1}} &= \frac{Z}{\sqrt{W/(n-p)}} \sim t_{n-p}, \end{align*} completing the proof.[/step]
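The argument above can be sanity-checked numerically. The following is a minimal simulation sketch, not part of the proof, under assumptions that are not stated in this section: a simple linear regression with an intercept and one covariate (so $p = 2$), $\tau^2 = x^{*\top}(X^\top X)^{-1}x^*$, which for this design equals $1/n + (x^* - \bar x)^2/S_{xx}$, and $\tilde\sigma^2 = \mathrm{RSS}/(n-p)$ as implied by the displayed identity. The function name `simulate_pivot` and all parameter values are hypothetical. It checks that $Z$ and $W$ look uncorrelated and that the variance of the ratio is close to $(n-p)/(n-p-2)$, the variance of a $t_{n-p}$ variable.

```python
import math
import random


def mean(v):
    return sum(v) / len(v)


def variance(v):
    m = mean(v)
    return sum((a - m) ** 2 for a in v) / (len(v) - 1)


def corr(a, b):
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return cov / math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))


def simulate_pivot(n=12, reps=20000, beta0=1.0, beta1=2.0,
                   sigma=1.5, x_star=0.7, seed=0):
    """Simulate Z, W and Z / sqrt(W / (n - p)) for simple linear regression (p = 2)."""
    rng = random.Random(seed)
    p = 2
    xs = [i / (n - 1) for i in range(n)]           # fixed design points (assumed)
    x_bar = mean(xs)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    # tau^2 = x*^T (X^T X)^{-1} x* for the design with columns (1, x)
    tau2 = 1.0 / n + (x_star - x_bar) ** 2 / sxx
    zs, ws, ts = [], [], []
    for _ in range(reps):
        ys = [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]
        y_bar = mean(ys)
        b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
        b0 = y_bar - b1 * x_bar                    # least-squares estimates
        rss = sum((y - b0 - b1 * x) ** 2 for x, y in zip(xs, ys))
        y_new = beta0 + beta1 * x_star + rng.gauss(0.0, sigma)  # future observation
        y_hat = b0 + b1 * x_star
        z = (y_hat - y_new) / (sigma * math.sqrt(tau2 + 1.0))
        w = rss / sigma ** 2
        zs.append(z)
        ws.append(w)
        ts.append(z / math.sqrt(w / (n - p)))
    return zs, ws, ts


if __name__ == "__main__":
    zs, ws, ts = simulate_pivot()
    nu = 12 - 2
    print("var(Z)       :", variance(zs), "(theory 1)")
    print("mean(W)      :", mean(ws), "(theory", nu, ")")
    print("corr(Z, W)   :", corr(zs, ws), "(theory 0)")
    print("corr(Z^2, W) :", corr([z * z for z in zs], ws), "(theory 0)")
    print("var(T)       :", variance(ts), "(theory", nu / (nu - 2), ")")
```

Vanishing sample correlations are only a necessary consequence of independence, so this is a plausibility check rather than a verification; the proof above is what establishes $Z \perp\!\!\!\perp W$.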


[guided]The $t$-ratio construction closely parallels the proof of [t-Distribution of Normalised Coefficient Estimate](/theorems/1446): a standard normal over the square root of an independent $\chi^2_{n-p}/(n-p)$. The only new element is the extra independent noise term $\varepsilon^*$ in the numerator, so we need to check carefully that this extra term does not disturb the independence between numerator and denominator. *Summary of what needs to be independent.* The $t$-ratio $Z/\sqrt{W/(n-p)}$ requires $Z \perp\!\!\!\perp W$. Here $Z$ depends on both $\hat\beta$ (through $\hat Y^* = x^{*\top}\hat\beta$) and $\varepsilon^*$ (through $Y^* = x^{*\top}\beta + \varepsilon^*$), while $W = \mathrm{RSS}/\sigma^2$ depends on the training data only through $\mathrm{RSS}$. So we must verify that $\mathrm{RSS}$ is independent of $(\hat\beta, \varepsilon^*)$ jointly. *Step 1: $\mathrm{RSS} \perp\!\!\!\perp \hat\beta$.* This is [Distributional Properties of the Normal Linear Model](/theorems/1445) part (3). *Step 2: $\mathrm{RSS} \perp\!\!\!\perp \varepsilon^*$.* The new noise $\varepsilon^*$ is independent of $Y$ by hypothesis. Since $\mathrm{RSS} = \phi(Y)$ is a Borel-measurable function of $Y$ (explicitly $\mathrm{RSS} = Y^\top (I_n - P) Y$), we have $\sigma(\mathrm{RSS}) \subseteq \sigma(Y)$, so the independence of $\varepsilon^*$ and $Y$ passes to $\mathrm{RSS}$ and we get $\mathrm{RSS} \perp\!\!\!\perp \varepsilon^*$. *Step 3: Joint independence $\mathrm{RSS} \perp\!\!\!\perp (\hat\beta, \varepsilon^*)$.* This is strictly stronger than the two pairwise independences in Steps 1 and 2 and does not follow from them alone; it follows from Step 1 together with the independence of $\varepsilon^*$ from the whole training vector $Y$. The cleanest way is to check that the joint distribution factorises. The random vector $(Y, \varepsilon^*)$ has joint distribution $\mu_{Y} \otimes \mu_{\varepsilon^*}$ because $\varepsilon^* \perp\!\!\!\perp Y$. 
Applying the measurable map $(y, e) \mapsto (\mathrm{RSS}(y), \hat\beta(y), e)$, the image measure is the product of the joint distribution of $(\mathrm{RSS}, \hat\beta)$ (which itself factors as $\mu_{\mathrm{RSS}} \otimes \mu_{\hat\beta}$ by [Distributional Properties of the Normal Linear Model](/theorems/1445) part (3)) with $\mu_{\varepsilon^*}$. So \begin{align*} \mu_{(\mathrm{RSS}, \hat\beta, \varepsilon^*)} = \mu_{\mathrm{RSS}} \otimes \mu_{\hat\beta} \otimes \mu_{\varepsilon^*}, \end{align*} i.e., the three variables are mutually independent, and in particular $\mathrm{RSS}$ is independent of the pair $(\hat\beta, \varepsilon^*)$. *Step 4: $W \perp\!\!\!\perp Z$.* Since $W = \mathrm{RSS}/\sigma^2$ is a measurable function of $\mathrm{RSS}$, and $Z$ is a measurable function of the pair $(\hat\beta, \varepsilon^*)$, the previous joint independence gives $W \perp\!\!\!\perp Z$, because measurable functions of independent variables remain independent. *Why the extra $\varepsilon^*$ does not change the degrees of freedom.* This is the subtle point. One might worry that carrying around an extra normal in the numerator should "cost" us some chi-squared variability and reduce the degrees of freedom. It does not, because $\varepsilon^*$ is completely independent of the training data and therefore of $W$; its only effect is the extra $+1$ in the numerator variance $\sigma^2(\tau^2 + 1)$. The only degrees of freedom consumed in the denominator are the $p$ coefficients estimated from the $n$ training observations, leaving $n - p$, exactly the $\chi^2_{n - p}$ degrees of freedom that appeared in the $t$-ratio for normalised coefficients. Applying the definition of the $t$ distribution: \begin{align*} \frac{Z}{\sqrt{W/(n-p)}} \sim t_{n-p}, \end{align*} and substituting the definitions of $Z$ and $W$, which give $Z/\sqrt{W/(n-p)} = (\hat Y^* - Y^*)/(\tilde\sigma \sqrt{\tau^2 + 1})$, \begin{align*} \frac{\hat Y^* - Y^*}{\tilde\sigma \sqrt{\tau^2 + 1}} \sim t_{n-p}. \end{align*} This is the pivot used to construct prediction intervals for a future observation.[/guided]
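
As a sketch of how the pivot above is typically turned into a prediction interval (an illustration, not part of the proof): write $t_{n-p}(\alpha/2)$ for the upper $\alpha/2$ point of the $t_{n-p}$ distribution, a notation assumed here rather than taken from this page. Since the $t_{n-p}$ distribution is symmetric about zero, \begin{align*} 1 - \alpha &= P\!\left( -t_{n-p}(\alpha/2) \le \frac{\hat Y^* - Y^*}{\tilde\sigma \sqrt{\tau^2 + 1}} \le t_{n-p}(\alpha/2) \right) \\ &= P\!\left( \hat Y^* - t_{n-p}(\alpha/2)\, \tilde\sigma \sqrt{\tau^2 + 1} \le Y^* \le \hat Y^* + t_{n-p}(\alpha/2)\, \tilde\sigma \sqrt{\tau^2 + 1} \right), \end{align*} so $\hat Y^* \pm t_{n-p}(\alpha/2)\, \tilde\sigma \sqrt{\tau^2 + 1}$ covers the future observation $Y^*$ with probability $1 - \alpha$.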


Verification Progress

9 Total Blocks

0 Verified

0% verified

Contributors

admin 9 blocks (0 verified)

Who Can Verify

Areas: Probability & Statistics
Subareas: Regression Analysis

Viktor Miykov Admin

Max Vassiliev Global Reviewer

Horia Neagu Global Reviewer

강현욱 Global Reviewer

Demo Testing Global Reviewer

Archie Pennycook Global Reviewer
