[proofplan]
We prove the projection result directly in the [Hilbert space](/page/Hilbert%20Space) $L^2(\Omega,\mathcal F,\mathbb P)$. First we take a minimizing sequence in the closed linear subspace $\mathcal H$ and use the parallelogram identity to show that the sequence is Cauchy. Closedness of $\mathcal H$ gives a minimizer. We then prove uniqueness by applying the same convexity argument to two hypothetical minimizers, and finally derive the orthogonality condition by minimizing a one-variable quadratic along affine lines inside $\mathcal H$. The converse follows from the Pythagorean identity.
[/proofplan]
[step:Construct a minimizing sequence and prove that it is Cauchy]Define the distance from $Y$ to $\mathcal H$ by
\begin{align*}
d:=\inf_{Z\in\mathcal H}\|Y-Z\|_{L^2}.
\end{align*}
By the definition of the infimum, there exists a sequence $(Z_k)_{k=1}^\infty$ in $\mathcal H$ such that
\begin{align*}
\|Y-Z_k\|_{L^2}\to d.
\end{align*}
For $m,n\in\mathbb N$, the midpoint
\begin{align*}
M_{m,n}:=\frac{Z_m+Z_n}{2}
\end{align*}
belongs to $\mathcal H$ because $\mathcal H$ is a linear subspace. Hence
\begin{align*}
\|Y-M_{m,n}\|_{L^2}\ge d.
\end{align*}
Applying the parallelogram identity in $L^2(\Omega,\mathcal F,\mathbb P)$ to the two vectors $Y-Z_m$ and $Y-Z_n$ gives
\begin{align*}
\|Z_m-Z_n\|_{L^2}^2
&=2\|Y-Z_m\|_{L^2}^2+2\|Y-Z_n\|_{L^2}^2
-4\left\|Y-\frac{Z_m+Z_n}{2}\right\|_{L^2}^2 \\
&\le 2\|Y-Z_m\|_{L^2}^2+2\|Y-Z_n\|_{L^2}^2-4d^2.
\end{align*}
Since $\|Y-Z_k\|_{L^2}\to d$, the right-hand side tends to $2d^2+2d^2-4d^2=0$ as $m,n\to\infty$. Therefore $(Z_k)_{k=1}^\infty$ is Cauchy in $L^2(\Omega,\mathcal F,\mathbb P)$.[/step]
[guided]The quantity we want to minimize is the distance from $Y$ to the subspace $\mathcal H$, so we first name it:
\begin{align*}
d:=\inf_{Z\in\mathcal H}\|Y-Z\|_{L^2}.
\end{align*}
Because $d$ is an infimum, we may choose a sequence $(Z_k)_{k=1}^\infty$ in $\mathcal H$ whose distances to $Y$ approach $d$:
\begin{align*}
\|Y-Z_k\|_{L^2}\to d.
\end{align*}
The main point is to prove that the approximants $Z_k$ approach one another. For $m,n\in\mathbb N$, define the midpoint
\begin{align*}
M_{m,n}:=\frac{Z_m+Z_n}{2}.
\end{align*}
Since $\mathcal H$ is a linear subspace, it is closed under addition and scalar multiplication, so $M_{m,n}\in\mathcal H$. Therefore the definition of $d$ gives
\begin{align*}
\|Y-M_{m,n}\|_{L^2}\ge d.
\end{align*}
Now apply the parallelogram identity to the two vectors $Y-Z_m$ and $Y-Z_n$ in the Hilbert space $L^2(\Omega,\mathcal F,\mathbb P)$. Their difference is $Z_n-Z_m$, and their average is
\begin{align*}
\frac{(Y-Z_m)+(Y-Z_n)}{2}=Y-\frac{Z_m+Z_n}{2}.
\end{align*}
Thus
\begin{align*}
\|Z_m-Z_n\|_{L^2}^2
&=2\|Y-Z_m\|_{L^2}^2+2\|Y-Z_n\|_{L^2}^2
-4\left\|Y-\frac{Z_m+Z_n}{2}\right\|_{L^2}^2 \\
&\le 2\|Y-Z_m\|_{L^2}^2+2\|Y-Z_n\|_{L^2}^2-4d^2.
\end{align*}
The last inequality uses the midpoint estimate $\|Y-M_{m,n}\|_{L^2}\ge d$. Since both $\|Y-Z_m\|_{L^2}$ and $\|Y-Z_n\|_{L^2}$ tend to $d$, the upper bound tends to $0$ as $m,n\to\infty$. Hence $(Z_k)_{k=1}^\infty$ is Cauchy in $L^2(\Omega,\mathcal F,\mathbb P)$.[/guided]
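The limit in the Cauchy estimate can be made quantitative. The following is a short worked bound; the tolerance $\varepsilon$ and the index $N$ are notation introduced only for this illustration. Given $\varepsilon>0$, the convergence $\|Y-Z_k\|_{L^2}\to d$ lets us choose $N$ such that $\|Y-Z_k\|_{L^2}^2\le d^2+\varepsilon$ for all $k\ge N$. Then, for all $m,n\ge N$,
\begin{align*}
\|Z_m-Z_n\|_{L^2}^2
&\le 2\|Y-Z_m\|_{L^2}^2+2\|Y-Z_n\|_{L^2}^2-4d^2 \\
&\le 2(d^2+\varepsilon)+2(d^2+\varepsilon)-4d^2 \\
&=4\varepsilon.
\end{align*}
Since $\varepsilon>0$ was arbitrary, this is exactly the Cauchy property.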
[step:Use completeness and closedness to obtain a minimizer]The space $L^2(\Omega,\mathcal F,\mathbb P)$ is complete, so there exists $P\in L^2(\Omega,\mathcal F,\mathbb P)$ such that
\begin{align*}
Z_k\to P
\end{align*}
in $L^2(\Omega,\mathcal F,\mathbb P)$. Since each $Z_k\in\mathcal H$ and $\mathcal H$ is closed in $L^2(\Omega,\mathcal F,\mathbb P)$, we have $P\in\mathcal H$. By the [reverse triangle inequality](/theorems/2300),
\begin{align*}
\left|\|Y-Z_k\|_{L^2}-\|Y-P\|_{L^2}\right|
\le \|Z_k-P\|_{L^2},
\end{align*}
so $\|Y-Z_k\|_{L^2}\to\|Y-P\|_{L^2}$. Since also $\|Y-Z_k\|_{L^2}\to d$, it follows that
\begin{align*}
\|Y-P\|_{L^2}=d.
\end{align*}
Thus $P$ is a minimizer in $\mathcal H$.[/step]
[guided]The previous step showed that $(Z_k)_{k=1}^\infty$ is Cauchy in $L^2(\Omega,\mathcal F,\mathbb P)$. The space $L^2(\Omega,\mathcal F,\mathbb P)$ is a Hilbert space, hence complete, so there is an element $P\in L^2(\Omega,\mathcal F,\mathbb P)$ such that
\begin{align*}
Z_k\to P
\end{align*}
in the $L^2$ norm.
We must still check that the limit belongs to the constraint set. This is exactly where closedness of $\mathcal H$ is used. Since every $Z_k$ lies in $\mathcal H$ and $\mathcal H$ is closed in $L^2(\Omega,\mathcal F,\mathbb P)$, the limit $P$ also lies in $\mathcal H$.
It remains to prove that $P$ actually attains the infimum. The reverse triangle inequality gives
\begin{align*}
\left|\|Y-Z_k\|_{L^2}-\|Y-P\|_{L^2}\right|
\le \|(Y-Z_k)-(Y-P)\|_{L^2}
=\|Z_k-P\|_{L^2}.
\end{align*}
Since $\|Z_k-P\|_{L^2}\to 0$, we get
\begin{align*}
\|Y-Z_k\|_{L^2}\to\|Y-P\|_{L^2}.
\end{align*}
But the sequence was chosen so that $\|Y-Z_k\|_{L^2}\to d$. Therefore
\begin{align*}
\|Y-P\|_{L^2}=d,
\end{align*}
so $P$ is a minimizer in $\mathcal H$.[/guided]
[step:Prove uniqueness of the minimizer by midpoint convexity]
Suppose $P,Q\in\mathcal H$ both satisfy
\begin{align*}
\|Y-P\|_{L^2}=d,
\qquad
\|Y-Q\|_{L^2}=d.
\end{align*}
Since $\mathcal H$ is a linear subspace,
\begin{align*}
R:=\frac{P+Q}{2}
\end{align*}
belongs to $\mathcal H$, and therefore $\|Y-R\|_{L^2}\ge d$. Applying the parallelogram identity to $Y-P$ and $Y-Q$ gives
\begin{align*}
\|P-Q\|_{L^2}^2
&=2\|Y-P\|_{L^2}^2+2\|Y-Q\|_{L^2}^2
-4\left\|Y-\frac{P+Q}{2}\right\|_{L^2}^2 \\
&\le 2d^2+2d^2-4d^2 \\
&=0.
\end{align*}
Hence $\|P-Q\|_{L^2}=0$, so $P=Q$ in $L^2(\Omega,\mathcal F,\mathbb P)$, that is, $P=Q$ almost surely. The minimizer is therefore unique; denote it by $P_{\mathcal H}Y$.
[/step]
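The same computation gives slightly more than uniqueness. As a sketch, with $W$ denoting an arbitrary element of $\mathcal H$ (notation introduced only for this remark) and $P$ a minimizer, apply the parallelogram identity to $Y-P$ and $Y-W$ and use that the midpoint $\frac{P+W}{2}$ lies in $\mathcal H$:
\begin{align*}
\|P-W\|_{L^2}^2
&=2\|Y-P\|_{L^2}^2+2\|Y-W\|_{L^2}^2
-4\left\|Y-\frac{P+W}{2}\right\|_{L^2}^2 \\
&\le 2d^2+2\|Y-W\|_{L^2}^2-4d^2 \\
&=2\left(\|Y-W\|_{L^2}^2-d^2\right).
\end{align*}
So any element of $\mathcal H$ that is nearly optimal is close to $P$; taking $\|Y-W\|_{L^2}=d$ recovers the uniqueness statement.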
[step:Derive the orthogonality condition from one-dimensional minimization]Let $P:=P_{\mathcal H}Y$. Fix $Z\in\mathcal H$ and define the function
\begin{align*}
\varphi:\mathbb R&\to\mathbb R \\
t&\mapsto \|Y-(P+tZ)\|_{L^2}^2.
\end{align*}
Since $P+tZ\in\mathcal H$ for every $t\in\mathbb R$ and $P$ minimizes the distance from $Y$ to $\mathcal H$, the function $\varphi$ has a minimum at $t=0$. Expanding the square using the $L^2$ inner product,
\begin{align*}
\varphi(t)
&=\|Y-P-tZ\|_{L^2}^2 \\
&=\|Y-P\|_{L^2}^2-2t(Y-P,Z)_{L^2}+t^2\|Z\|_{L^2}^2.
\end{align*}
This quadratic polynomial has a minimum at $t=0$, so its derivative at $0$ is $0$:
\begin{align*}
\varphi'(0)=-2(Y-P,Z)_{L^2}=0.
\end{align*}
Therefore
\begin{align*}
(Y-P_{\mathcal H}Y,Z)_{L^2}=0
\end{align*}
for every $Z\in\mathcal H$.[/step]
[guided]Fix an arbitrary element $Z\in\mathcal H$, and write $P:=P_{\mathcal H}Y$ for the unique minimizer. To extract an orthogonality condition, we move away from $P$ in the direction $Z$. Define
\begin{align*}
\varphi:\mathbb R&\to\mathbb R \\
t&\mapsto \|Y-(P+tZ)\|_{L^2}^2.
\end{align*}
Because $\mathcal H$ is a linear subspace and both $P$ and $Z$ belong to $\mathcal H$, the point $P+tZ$ belongs to $\mathcal H$ for every $t\in\mathbb R$. Since $P$ minimizes the distance from $Y$ to $\mathcal H$, the function $\varphi$ has a minimum at $t=0$.
Now compute $\varphi$ explicitly. Using bilinearity and symmetry of the real $L^2$ inner product,
\begin{align*}
\varphi(t)
&=\|Y-P-tZ\|_{L^2}^2 \\
&=(Y-P-tZ,Y-P-tZ)_{L^2} \\
&=\|Y-P\|_{L^2}^2-2t(Y-P,Z)_{L^2}+t^2\|Z\|_{L^2}^2.
\end{align*}
This is a quadratic polynomial in $t$, hence differentiable. Since it attains its minimum at $t=0$, its derivative at $0$ must vanish:
\begin{align*}
\varphi'(0)=-2(Y-P,Z)_{L^2}=0.
\end{align*}
Therefore
\begin{align*}
(Y-P_{\mathcal H}Y,Z)_{L^2}=0.
\end{align*}
Because $Z\in\mathcal H$ was arbitrary, the orthogonality condition holds for every element of $\mathcal H$.[/guided]
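The orthogonality condition can be checked numerically in a finite-dimensional model. The sketch below is an illustration only and rests on assumptions not made in the text: a finite sample space with strictly positive probabilities `p`, so that $L^2$ becomes $\mathbb R^n$ with the weighted inner product $(U,V)=\sum_i p_iU_iV_i$, and a subspace spanned by the columns of a randomly chosen matrix `basis`. The helper names `inner` and `project` are hypothetical, and the projection is computed by weighted least squares rather than by the minimizing-sequence argument of the proof.

```python
import numpy as np

def inner(u, v, p):
    """Weighted inner product (U, V) = E[U V] on a finite probability space."""
    return float(np.sum(p * u * v))

def project(y, basis, p):
    """Project y onto the column span of `basis` in the p-weighted L^2 norm."""
    w = np.sqrt(p)
    # Weighted least squares: minimise sum_i p_i * (y_i - (basis @ c)_i)^2.
    coeffs, *_ = np.linalg.lstsq(w[:, None] * basis, w * y, rcond=None)
    return basis @ coeffs

rng = np.random.default_rng(0)
n = 6
p = rng.random(n) + 0.1          # strictly positive weights (assumption)
p /= p.sum()                     # normalise to a probability vector
y = rng.normal(size=n)           # the random variable Y
basis = rng.normal(size=(n, 2))  # columns span the subspace H (assumption)

proj = project(y, basis, p)
residual = y - proj

# Orthogonality: (Y - P, Z) should vanish for each spanning vector Z of H.
orth = [inner(residual, basis[:, j], p) for j in range(basis.shape[1])]

# phi(t) = ||Y - (P + t Z)||^2 along one direction Z in H; minimum at t = 0.
z = basis @ np.array([1.0, -2.0])
ts = np.linspace(-1.0, 1.0, 11)  # ts[5] is the grid point t = 0
phi = [inner(residual - t * z, residual - t * z, p) for t in ts]

print("inner products with spanning vectors:", orth)
print("grid index of the minimum of phi:", int(np.argmin(phi)))
```

Up to floating-point error the inner products in `orth` vanish, and the sampled values of $\varphi$ are smallest at the grid point $t=0$, matching the one-dimensional minimization used in the proof.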
[step:Recover minimality and uniqueness from the orthogonality condition]
Conversely, suppose $P\in\mathcal H$ satisfies
\begin{align*}
(Y-P,Z)_{L^2}=0
\end{align*}
for every $Z\in\mathcal H$. For any $W\in\mathcal H$, the difference $P-W$ belongs to $\mathcal H$, so
\begin{align*}
(Y-P,P-W)_{L^2}=0.
\end{align*}
Using the decomposition
\begin{align*}
Y-W=(Y-P)+(P-W),
\end{align*}
we obtain
\begin{align*}
\|Y-W\|_{L^2}^2
&=\|(Y-P)+(P-W)\|_{L^2}^2 \\
&=\|Y-P\|_{L^2}^2+2(Y-P,P-W)_{L^2}+\|P-W\|_{L^2}^2 \\
&=\|Y-P\|_{L^2}^2+\|P-W\|_{L^2}^2 \\
&\ge \|Y-P\|_{L^2}^2.
\end{align*}
Thus $P$ minimizes the distance from $Y$ to $\mathcal H$. Since the minimizer has already been shown to be unique, $P=P_{\mathcal H}Y$. This proves the characterization and completes the proof of the theorem.
[/step]
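As an illustration of the characterization, consider the simplest nontrivial subspace. This example is an addition and assumes that the inner product is $(U,V)_{L^2}=\mathbb E[UV]$. Let $\mathcal H$ be the set of almost surely constant random variables, which is one-dimensional and therefore closed in $L^2(\Omega,\mathcal F,\mathbb P)$. For the candidate $P=\mathbb E[Y]$ and any constant $c\in\mathbb R$,
\begin{align*}
(Y-\mathbb E[Y],c)_{L^2}
=c\,\mathbb E\big[Y-\mathbb E[Y]\big]
=0,
\end{align*}
so the orthogonality condition holds and $P_{\mathcal H}Y=\mathbb E[Y]$. The Pythagorean identity with $W=c$ then reads
\begin{align*}
\|Y-c\|_{L^2}^2
&=\|Y-\mathbb E[Y]\|_{L^2}^2+\|\mathbb E[Y]-c\|_{L^2}^2 \\
&=\operatorname{Var}(Y)+(\mathbb E[Y]-c)^2,
\end{align*}
which shows directly that the mean is the unique constant minimizing the mean squared error.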