[proofplan]
We first verify that the ordinary least squares formula is well-defined by using the full column rank of $X$ to show that $X^\top X$ is invertible. Then we substitute the model equation $y=X\beta+\varepsilon$ into the estimator and separate the deterministic signal term from the random error term. Finally, linearity of expectation and the exogeneity assumption $\mathbb{E}[\varepsilon]=0$ make the error contribution vanish.
[/proofplan]
[step:Use full column rank to justify the inverse]
Since $X$ has full column rank, its null space is $\{0\}$. Let $v \in \mathbb{R}^p$ satisfy $X^\top Xv=0$. Multiplying on the left by $v^\top$ gives
\begin{align*}
0=v^\top X^\top Xv=|Xv|^2.
\end{align*}
Hence $Xv=0$, and the full column rank of $X$ implies $v=0$. Therefore $\ker(X^\top X)=\{0\}$. Since $X^\top X \in \mathbb{R}^{p \times p}$ is a square matrix with trivial kernel, $X^\top X$ is invertible. Thus $(X^\top X)^{-1}X^\top y$ is well-defined as an $\mathbb{R}^p$-valued random vector.
[/step]
[step:Decompose the estimator into the true coefficient and an error term]
Define the deterministic [linear map](/page/Linear%20Map) $A:\mathbb{R}^n \to \mathbb{R}^p$ by
\begin{align*}
A z=(X^\top X)^{-1}X^\top z
\end{align*}
for $z \in \mathbb{R}^n$. Using $y=X\beta+\varepsilon$, we compute pointwise on $\Omega$:
\begin{align*}
\hat{\beta}
&=(X^\top X)^{-1}X^\top y \\
&=(X^\top X)^{-1}X^\top(X\beta+\varepsilon) \\
&=(X^\top X)^{-1}X^\top X\beta+(X^\top X)^{-1}X^\top\varepsilon \\
&=\beta+A\varepsilon.
\end{align*}
The last equality uses $(X^\top X)^{-1}X^\top X=I_p$, where $I_p$ denotes the identity matrix on $\mathbb{R}^p$.
[/step]
[step:Take expectations and use exogeneity to eliminate the error term]
Because $\varepsilon$ is integrable and $A:\mathbb{R}^n \to \mathbb{R}^p$ is deterministic and linear, $A\varepsilon$ is integrable and
\begin{align*}
\mathbb{E}[A\varepsilon]=A\mathbb{E}[\varepsilon].
\end{align*}
Taking expectations in the decomposition $\hat{\beta}=\beta+A\varepsilon$ gives
\begin{align*}
\mathbb{E}[\hat{\beta}]
&=\mathbb{E}[\beta+A\varepsilon] \\
&=\beta+\mathbb{E}[A\varepsilon] \\
&=\beta+A\mathbb{E}[\varepsilon] \\
&=\beta+A0 \\
&=\beta.
\end{align*}
Thus the ordinary least squares estimator satisfies $\mathbb{E}[\hat{\beta}]=\beta$, so it is unbiased for $\beta$.
[/step]