[proofplan]
Fix $x \in H$ and compare an arbitrary vector $m \in M$ with the projected vector $P_Mx \in M$. The key decomposition is $x-m=(x-P_Mx)+(P_Mx-m)$, where the first summand is orthogonal to $M$ and the second summand lies in $M$. Expanding the squared norm gives a Pythagorean identity, from which the minimizing inequality and the equality condition follow by positivity and definiteness of the Hilbert norm.
[/proofplan]
[step:Decompose the error into orthogonal summands]
Fix $x \in H$ and $m \in M$. Define
\begin{align*}
r:=x-P_Mx \in H.
\end{align*}
Since $P_M$ is the [orthogonal projection](/theorems/437) onto $M$, we have $P_Mx \in M$ and $r \in M^\perp$. Define
\begin{align*}
y:=P_Mx-m \in H.
\end{align*}
Because $P_Mx \in M$, $m \in M$, and $M$ is a linear subspace of $H$, we have $y \in M$. Since $r \in M^\perp$ and $y \in M$,
\begin{align*}
(r,y)_H=0.
\end{align*}
Finally,
\begin{align*}
x-m=r+y.
\end{align*}
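As a concrete illustration (an assumed example, not part of the proof), take $H=\mathbb{R}^2$ with the standard inner product and $M=\{(t,0) : t \in \mathbb{R}\}$, for which $P_M(a,b)=(a,0)$. For the sample choices $x=(3,4)$ and $m=(1,0)$, the decomposition reads
\begin{align*}
r&=x-P_Mx=(3,4)-(3,0)=(0,4), \\
y&=P_Mx-m=(3,0)-(1,0)=(2,0), \\
(r,y)_H&=0\cdot 2+4\cdot 0=0, \\
r+y&=(2,4)=x-m.
\end{align*}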
[/step]
[step:Expand the squared norm using orthogonality]Using the [Hilbert space](/page/Hilbert%20Space) norm identity $\|z\|_H^2=(z,z)_H$ for $z \in H$ and the decomposition $x-m=r+y$, we obtain
\begin{align*}
\|x-m\|_H^2=(r+y,r+y)_H.
\end{align*}
By sesquilinearity of the [inner product](/page/Inner%20Product),
\begin{align*}
(r+y,r+y)_H=(r,r)_H+(r,y)_H+(y,r)_H+(y,y)_H.
\end{align*}
Since $(r,y)_H=0$, conjugate symmetry gives $(y,r)_H=\overline{(r,y)_H}=0$. Therefore
\begin{align*}
\|x-m\|_H^2=\|r\|_H^2+\|y\|_H^2.
\end{align*}
Substituting back $r=x-P_Mx$ and $y=P_Mx-m$ gives
\begin{align*}
\|x-m\|_H^2=\|x-P_Mx\|_H^2+\|P_Mx-m\|_H^2.
\end{align*}[/step]
[guided]We want to compare the distance from $x$ to the arbitrary point $m \in M$ with the distance from $x$ to the special point $P_Mx \in M$. The comparison is not made directly; instead, we first split the error $x-m$ into two orthogonal pieces. Define
\begin{align*}
r:=x-P_Mx.
\end{align*}
By the defining property of the orthogonal projection, $P_Mx \in M$ and $r \in M^\perp$. Now define
\begin{align*}
y:=P_Mx-m.
\end{align*}
Because both $P_Mx$ and $m$ belong to the linear subspace $M$, their difference $y$ also belongs to $M$. Hence $r$ is orthogonal to $y$, so
\begin{align*}
(r,y)_H=0.
\end{align*}
The decomposition is
\begin{align*}
x-m=(x-P_Mx)+(P_Mx-m)=r+y.
\end{align*}
Now expand the squared norm using the Hilbert inner product:
\begin{align*}
\|x-m\|_H^2=(r+y,r+y)_H.
\end{align*}
Sesquilinearity gives
\begin{align*}
(r+y,r+y)_H=(r,r)_H+(r,y)_H+(y,r)_H+(y,y)_H.
\end{align*}
The cross term $(r,y)_H$ is zero because $r \in M^\perp$ and $y \in M$. By conjugate symmetry of the inner product, $(y,r)_H$ is also zero. Thus only the two squared lengths remain:
\begin{align*}
\|x-m\|_H^2=\|r\|_H^2+\|y\|_H^2.
\end{align*}
Replacing $r$ and $y$ by their definitions gives the Pythagorean identity
\begin{align*}
\|x-m\|_H^2=\|x-P_Mx\|_H^2+\|P_Mx-m\|_H^2.
\end{align*}
This identity is the whole mechanism of the theorem: the squared error of an arbitrary $m \in M$ equals the squared projection error plus a non-negative squared distance measured inside $M$.[/guided]
[step:Deduce the minimizing inequality]
Since $\|P_Mx-m\|_H^2 \geq 0$, the identity from the previous step implies
\begin{align*}
\|x-m\|_H^2 \geq \|x-P_Mx\|_H^2.
\end{align*}
Both norms are non-negative [real numbers](/page/Real%20Numbers), so monotonicity of the square-root function on $[0,\infty)$ gives
\begin{align*}
\|x-P_Mx\|_H \leq \|x-m\|_H.
\end{align*}
Since $x \in H$ and $m \in M$ were arbitrary, this proves the asserted best approximation inequality.
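As a numerical sanity check (an assumed example, not part of the proof), let $H=\mathbb{R}^2$ with the standard inner product, let $M$ be the first coordinate axis, so that $P_M(a,b)=(a,0)$, and choose $x=(3,4)$ and $m=(1,0)$. Then
\begin{align*}
\|x-m\|_H^2&=\|(2,4)\|_H^2=20, \\
\|x-P_Mx\|_H^2&=\|(0,4)\|_H^2=16, \\
\|P_Mx-m\|_H^2&=\|(2,0)\|_H^2=4,
\end{align*}
so that $20=16+4$ and $\|x-P_Mx\|_H=4 \leq 2\sqrt{5}=\|x-m\|_H$.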
[/step]
[step:Characterize the equality case by definiteness of the norm]
For the fixed pair $x \in H$ and $m \in M$, equality in
\begin{align*}
\|x-P_Mx\|_H \leq \|x-m\|_H
\end{align*}
holds if and only if equality holds after squaring both sides, because both sides are non-negative. By the Pythagorean identity, this is equivalent to
\begin{align*}
\|P_Mx-m\|_H^2=0.
\end{align*}
By positive definiteness of the Hilbert norm, this holds if and only if
\begin{align*}
P_Mx-m=0.
\end{align*}
Equivalently, $m=P_Mx$. This proves both the inequality and the equality characterization.
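The equality condition can be seen in an assumed concrete case (an illustration only, not part of the proof). Take $H=\mathbb{R}^2$ with the standard inner product, $M=\{(t,0) : t \in \mathbb{R}\}$ with $P_M(a,b)=(a,0)$, the point $x=(3,4)$, and a general $m=(t,0) \in M$. Then
\begin{align*}
\|x-m\|_H^2&=(3-t)^2+16, \\
\|x-P_Mx\|_H^2&=16,
\end{align*}
so the two distances agree if and only if $(3-t)^2=0$, that is, $t=3$ and $m=(3,0)=P_Mx$.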
[/step]