Unbiasedness of the Ordinary Least Squares Estimator Under Exogeneity

Unbiasedness of the Ordinary Least Squares Estimator Under Exogeneity (Theorem # 4423)

Theorem

Edit Issues Pull Requests Attributions Admin

Discussion

Proof

[proofplan] We first verify that the ordinary least squares formula is well-defined by using the full column rank of $X$ to show that $X^\top X$ is invertible. Then we substitute the model equation $y=X\beta+\varepsilon$ into the estimator and separate the deterministic signal term from the random error term. Finally, linearity of expectation and the exogeneity assumption $\mathbb{E}[\varepsilon]=0$ make the error contribution vanish. [/proofplan] [step:Use full column rank to justify the inverse] Since $X$ has full column rank, its null space is $\{0\}$. Let $v \in \mathbb{R}^p$ satisfy $X^\top Xv=0$. Multiplying on the left by $v^\top$ gives \begin{align*} 0=v^\top X^\top Xv=|Xv|^2. \end{align*} Hence $Xv=0$, and the full column rank of $X$ implies $v=0$. Therefore $\ker(X^\top X)=\{0\}$. Since $X^\top X \in \mathbb{R}^{p \times p}$ is a square matrix with trivial kernel, $X^\top X$ is invertible. Thus $(X^\top X)^{-1}X^\top y$ is well-defined as an $\mathbb{R}^p$-valued random vector. [/step] [step:Decompose the estimator into the true coefficient and an error term] Define the deterministic [linear map](/page/Linear%20Map) $A:\mathbb{R}^n \to \mathbb{R}^p$ by \begin{align*} A z=(X^\top X)^{-1}X^\top z \end{align*} for $z \in \mathbb{R}^n$. Using $y=X\beta+\varepsilon$, we compute pointwise on $\Omega$: \begin{align*} \hat{\beta} &=(X^\top X)^{-1}X^\top y \\ &=(X^\top X)^{-1}X^\top(X\beta+\varepsilon) \\ &=(X^\top X)^{-1}X^\top X\beta+(X^\top X)^{-1}X^\top\varepsilon \\ &=\beta+A\varepsilon. \end{align*} The last equality uses $(X^\top X)^{-1}X^\top X=I_p$, where $I_p$ denotes the identity matrix on $\mathbb{R}^p$. [/step] [step:Take expectations and use exogeneity to eliminate the error term] Because $\varepsilon$ is integrable and $A:\mathbb{R}^n \to \mathbb{R}^p$ is deterministic and linear, $A\varepsilon$ is integrable and \begin{align*} \mathbb{E}[A\varepsilon]=A\mathbb{E}[\varepsilon]. \end{align*} Taking expectations in the decomposition $\hat{\beta}=\beta+A\varepsilon$ gives \begin{align*} \mathbb{E}[\hat{\beta}] &=\mathbb{E}[\beta+A\varepsilon] \\ &=\beta+\mathbb{E}[A\varepsilon] \\ &=\beta+A\mathbb{E}[\varepsilon] \\ &=\beta+A0 \\ &=\beta. \end{align*} Thus the ordinary least squares estimator satisfies $\mathbb{E}[\hat{\beta}]=\beta$, so it is unbiased for $\beta$. [/step]

Prerequisites (0/1 completed)

Prerequisites Graph

Interactive dependency map showing how this theorem builds on foundational concepts

Loading dependency graph...

Definitions & Concepts

Expectation

Explore Further

Expectation Definition Backwards Martingale Convergence Theorem Martingale Theory Bonferroni Inequalities Probability Theory White Heteroskedasticity-Robust Covariance Consistency Theorem Probability & Statistics Gaussian Integral Probability Theory Immediate Return to Zero for Brownian Motion Brownian Motion Coordinate Characterisation of Product Measurability Probability & Statistics Continuity of Probability Probability Theory Harmonicity of the Brownian Dirichlet Solution Brownian Motion Probability & Statistics Area

What brings you to Androma?

Start with a route through the knowledge graph.