[proofplan]
We minimize the $L^2$ distance from $Y$ to the closed subspace $H$. A minimizing sequence is shown to be Cauchy by applying the parallelogram identity to two near-minimizers and observing that the midpoint of the two near-minimizers is still in $H$. Completeness of $L^2$ and closedness of $H$ give the minimizer. Orthogonality follows by varying the minimizer along each direction in $H$, and uniqueness follows because the difference of two minimizers lies in $H$ and is orthogonal to every element of $H$, hence to itself.
[/proofplan]
[step:Define the distance from $Y$ to $H$ and choose a minimizing sequence]
Define the $L^2$ norm $\|\cdot\|_{L^2}: L^2(\Omega,\mathcal F,\mathbb P)\to [0,\infty)$ by
\begin{align*}
\|X\|_{L^2} := \mathbb E[|X|^2]^{1/2}.
\end{align*}
Set
\begin{align*}
d := \inf_{Z\in H}\|Y-Z\|_{L^2}.
\end{align*}
Since $0\in H$, we have $0\le d\le \|Y\|_{L^2}<\infty$, so by the definition of the infimum there exists a sequence $(Z_n)_{n\in\mathbb N}$ in $H$ such that
\begin{align*}
\|Y-Z_n\|_{L^2}^2 \le d^2+\frac{1}{n}
\end{align*}
for every $n\in\mathbb N$.
[/step]
[step:Use the parallelogram identity to prove the minimizing sequence is Cauchy]
For $m,n\in\mathbb N$, define
\begin{align*}
X_m := Y-Z_m,
\qquad
X_n := Y-Z_n.
\end{align*}
The parallelogram identity in the real inner product space $L^2(\Omega,\mathcal F,\mathbb P)$ gives
\begin{align*}
\|X_m-X_n\|_{L^2}^2
=
2\|X_m\|_{L^2}^2
+
2\|X_n\|_{L^2}^2
-
4\left\|\frac{X_m+X_n}{2}\right\|_{L^2}^2.
\end{align*}
Because $H$ is a linear subspace, the midpoint
\begin{align*}
\frac{Z_m+Z_n}{2}
\end{align*}
belongs to $H$. Also,
\begin{align*}
\frac{X_m+X_n}{2}
=
Y-\frac{Z_m+Z_n}{2}.
\end{align*}
Therefore, by the definition of $d$,
\begin{align*}
\left\|\frac{X_m+X_n}{2}\right\|_{L^2} \ge d.
\end{align*}
Using the minimizing bound for $Z_m$ and $Z_n$, we obtain
\begin{align*}
\|Z_m-Z_n\|_{L^2}^2
&=
\|X_n-X_m\|_{L^2}^2 \\
&\le
2\left(d^2+\frac{1}{m}\right)
+
2\left(d^2+\frac{1}{n}\right)
-
4d^2 \\
&=
\frac{2}{m}+\frac{2}{n}.
\end{align*}
Since the right-hand side tends to $0$ as $m,n\to\infty$, the sequence $(Z_n)_{n\in\mathbb N}$ is Cauchy in $L^2(\Omega,\mathcal F,\mathbb P)$.
[guided]
The key point is that two near-minimizers cannot be far apart. To make this quantitative, define
\begin{align*}
X_m := Y-Z_m,
\qquad
X_n := Y-Z_n.
\end{align*}
These are the two residuals. The parallelogram identity gives the exact relation
\begin{align*}
\|X_m-X_n\|_{L^2}^2
=
2\|X_m\|_{L^2}^2
+
2\|X_n\|_{L^2}^2
-
4\left\|\frac{X_m+X_n}{2}\right\|_{L^2}^2.
\end{align*}
The last term is where convexity of the linear subspace enters. Since $H$ is a real linear subspace,
\begin{align*}
\frac{Z_m+Z_n}{2}\in H.
\end{align*}
Moreover,
\begin{align*}
\frac{X_m+X_n}{2}
=
\frac{Y-Z_m+Y-Z_n}{2}
=
Y-\frac{Z_m+Z_n}{2}.
\end{align*}
Thus the average residual is the residual associated to an admissible element of $H$. By the definition of the distance $d$ from $Y$ to $H$,
\begin{align*}
\left\|\frac{X_m+X_n}{2}\right\|_{L^2}
=
\left\|Y-\frac{Z_m+Z_n}{2}\right\|_{L^2}
\ge d.
\end{align*}
Substituting this lower bound into the parallelogram identity and using
\begin{align*}
\|X_m\|_{L^2}^2 \le d^2+\frac{1}{m},
\qquad
\|X_n\|_{L^2}^2 \le d^2+\frac{1}{n},
\end{align*}
we get
\begin{align*}
\|Z_m-Z_n\|_{L^2}^2
&=
\|X_n-X_m\|_{L^2}^2 \\
&\le
2\left(d^2+\frac{1}{m}\right)
+
2\left(d^2+\frac{1}{n}\right)
-
4d^2 \\
&=
\frac{2}{m}+\frac{2}{n}.
\end{align*}
The right-hand side tends to $0$ as $m,n\to\infty$, so $(Z_n)_{n\in\mathbb N}$ is Cauchy in $L^2(\Omega,\mathcal F,\mathbb P)$.
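For reference, the parallelogram identity used above can be checked directly by expanding inner products. The following is only a sketch: $U,V$ denote generic elements of $L^2(\Omega,\mathcal F,\mathbb P)$, and $\langle U,V\rangle := \mathbb E[UV]$ is notation introduced here for the inner product. Expanding the squares gives
\begin{align*}
\|U-V\|_{L^2}^2 &= \|U\|_{L^2}^2-2\langle U,V\rangle+\|V\|_{L^2}^2, \\
\|U+V\|_{L^2}^2 &= \|U\|_{L^2}^2+2\langle U,V\rangle+\|V\|_{L^2}^2.
\end{align*}
Adding the two lines cancels the cross terms, so
\begin{align*}
\|U-V\|_{L^2}^2+\|U+V\|_{L^2}^2
=
2\|U\|_{L^2}^2+2\|V\|_{L^2}^2.
\end{align*}
Taking $U=X_m$ and $V=X_n$, and writing $\|X_m+X_n\|_{L^2}^2=4\left\|\frac{X_m+X_n}{2}\right\|_{L^2}^2$, recovers the form of the identity used in this step.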
[/guided]
[/step]
[step:Pass to the limit inside the closed subspace]
Since $L^2(\Omega,\mathcal F,\mathbb P)$ is complete, there exists $P\in L^2(\Omega,\mathcal F,\mathbb P)$ such that
\begin{align*}
Z_n \to P
\end{align*}
in $L^2(\Omega,\mathcal F,\mathbb P)$. Since each $Z_n$ belongs to $H$ and $H$ is closed in the $L^2$ norm, we have $P\in H$.
The reverse triangle inequality gives
\begin{align*}
\left|\|Y-Z_n\|_{L^2}-\|Y-P\|_{L^2}\right|
\le
\|Z_n-P\|_{L^2}.
\end{align*}
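Since $Z_n\in H$, we have $d\le\|Y-Z_n\|_{L^2}\le\left(d^2+\frac{1}{n}\right)^{1/2}$ for every $n\in\mathbb N$, so $\|Y-Z_n\|_{L^2}\to d$, while the right-hand side of the inequality above tends to $0$. Letting $n\to\infty$ therefore yields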
\begin{align*}
\|Y-P\|_{L^2}=d.
\end{align*}
Therefore
\begin{align*}
\mathbb E[|Y-P|^2]
=
d^2
=
\inf_{Z\in H}\mathbb E[|Y-Z|^2].
\end{align*}
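As an illustration only (this special case is an assumption made here and is not part of the statement being proved), suppose $H$ is the one-dimensional subspace of almost surely constant random variables, which is closed because it is finite-dimensional. For a constant $c\in\mathbb R$, the cross term vanishes because $\mathbb E[Y-\mathbb E[Y]]=0$, so
\begin{align*}
\mathbb E[|Y-c|^2]
&=
\mathbb E[|Y-\mathbb E[Y]|^2]
+2(\mathbb E[Y]-c)\,\mathbb E[Y-\mathbb E[Y]]
+(\mathbb E[Y]-c)^2 \\
&=
\operatorname{Var}(Y)+(\mathbb E[Y]-c)^2.
\end{align*}
In this special case the infimum is therefore $d^2=\operatorname{Var}(Y)$, and it is attained exactly at the constant $P=\mathbb E[Y]$.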
[/step]
[step:Vary the minimizer along each direction in $H$ to obtain orthogonality]
Let $W\in H$ be arbitrary, and define the residual
\begin{align*}
R := Y-P.
\end{align*}
For every $a\in\mathbb R$, the element $P+aW$ belongs to $H$. Since $P$ minimizes the squared distance, the function
\begin{align*}
\varphi:\mathbb R &\to \mathbb R \\
a &\mapsto \|Y-(P+aW)\|_{L^2}^2
\end{align*}
has a minimum at $a=0$. Expanding the square gives
\begin{align*}
\varphi(a)
&=
\|R-aW\|_{L^2}^2 \\
&=
\mathbb E[|R-aW|^2] \\
&=
\mathbb E[|R|^2]-2a\,\mathbb E[RW]+a^2\mathbb E[|W|^2].
\end{align*}
The product $RW$ is integrable because
\begin{align*}
|RW|\le \frac{1}{2}|R|^2+\frac{1}{2}|W|^2
\end{align*}
and $R,W\in L^2(\Omega,\mathcal F,\mathbb P)$. Since the polynomial $\varphi$ is differentiable and attains its minimum at $a=0$, its derivative there, $\varphi'(0)=-2\,\mathbb E[RW]$, must vanish. Hence
\begin{align*}
\mathbb E[RW]=0.
\end{align*}
Because $W\in H$ was arbitrary,
\begin{align*}
\mathbb E[(Y-P)W]=0
\end{align*}
for every $W\in H$.
[guided]
Fix an arbitrary direction $W\in H$. We want to show that the residual $R:=Y-P$ has zero inner product with this direction. Since $H$ is a real linear subspace and $P,W\in H$, every perturbation
\begin{align*}
P+aW
\end{align*}
belongs to $H$ for every $a\in\mathbb R$. Thus the minimizing property of $P$ says that the real-valued function
\begin{align*}
\varphi:\mathbb R &\to \mathbb R \\
a &\mapsto \|Y-(P+aW)\|_{L^2}^2
\end{align*}
has a minimum at $a=0$.
Now compute this function exactly. Since $R=Y-P$,
\begin{align*}
\varphi(a)
&=
\|R-aW\|_{L^2}^2 \\
&=
\mathbb E[|R-aW|^2] \\
&=
\mathbb E[|R|^2]-2a\,\mathbb E[RW]+a^2\mathbb E[|W|^2].
\end{align*}
The expectation $\mathbb E[RW]$ is finite because the elementary inequality
\begin{align*}
|RW|\le \frac{1}{2}|R|^2+\frac{1}{2}|W|^2
\end{align*}
and the facts $R,W\in L^2(\Omega,\mathcal F,\mathbb P)$ imply $RW\in L^1(\Omega,\mathcal F,\mathbb P)$.
A real quadratic polynomial minimized at $a=0$ must have zero linear coefficient. Indeed, if $\mathbb E[RW]\neq 0$, choosing $a\neq 0$ with the same sign as $\mathbb E[RW]$ and with $|a|$ sufficiently small would make
\begin{align*}
-2a\,\mathbb E[RW]+a^2\mathbb E[|W|^2] <0,
\end{align*}
contradicting the minimality of $a=0$. Hence
\begin{align*}
\mathbb E[RW]=0.
\end{align*}
Since the direction $W\in H$ was arbitrary, the residual satisfies
\begin{align*}
\mathbb E[(Y-P)W]=0
\end{align*}
for every $W\in H$.
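The same conclusion can be read off by completing the square. The following sketch assumes $\mathbb E[|W|^2]>0$ (otherwise $W=0$ almost surely and $\mathbb E[RW]=0$ trivially), and the symbol $a^\ast$ is introduced here for illustration:
\begin{align*}
\varphi(a)
&=
\mathbb E[|R|^2]-2a\,\mathbb E[RW]+a^2\,\mathbb E[|W|^2] \\
&=
\mathbb E[|R|^2]-\frac{\mathbb E[RW]^2}{\mathbb E[|W|^2]}
+\mathbb E[|W|^2]\left(a-a^\ast\right)^2,
\qquad
a^\ast:=\frac{\mathbb E[RW]}{\mathbb E[|W|^2]}.
\end{align*}
Thus $\varphi$ attains its minimum only at $a=a^\ast$. Since $\varphi$ is minimized at $a=0$, we must have $a^\ast=0$, which is again $\mathbb E[RW]=0$.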
[/guided]
[/step]
[step:Use orthogonality of residuals to prove uniqueness]
Suppose $P_1,P_2\in H$ both minimize the squared distance from $Y$ to $H$. Applying the previous orthogonality argument to each minimizer gives
\begin{align*}
\mathbb E[(Y-P_1)W]=0
\qquad\text{and}\qquad
\mathbb E[(Y-P_2)W]=0
\end{align*}
for every $W\in H$. Take
\begin{align*}
W:=P_1-P_2\in H.
\end{align*}
Subtracting the two orthogonality identities gives
\begin{align*}
0
&=
\mathbb E[(Y-P_1)W]-\mathbb E[(Y-P_2)W] \\
&=
\mathbb E[(P_2-P_1)(P_1-P_2)] \\
&=
-\mathbb E[|P_1-P_2|^2].
\end{align*}
Thus
\begin{align*}
\mathbb E[|P_1-P_2|^2]=0,
\end{align*}
so $P_1=P_2$ as elements of $L^2(\Omega,\mathcal F,\mathbb P)$. Therefore the minimizer is unique. Denoting it by $P_HY$, the existence, minimizing property, and orthogonality assertion are all proved.
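As a complementary sketch, the orthogonality relation also yields a Pythagorean identity that makes the minimizing property and uniqueness visible at once. For any $Z\in H$ we have $P_HY-Z\in H$, so the cross term below vanishes:
\begin{align*}
\mathbb E[|Y-Z|^2]
&=
\mathbb E[|Y-P_HY|^2]
+2\,\mathbb E[(Y-P_HY)(P_HY-Z)]
+\mathbb E[|P_HY-Z|^2] \\
&=
\mathbb E[|Y-P_HY|^2]+\mathbb E[|P_HY-Z|^2].
\end{align*}
In particular, $\mathbb E[|Y-Z|^2]\ge \mathbb E[|Y-P_HY|^2]$ for every $Z\in H$, with equality if and only if $Z=P_HY$ almost surely.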
[/step]