[proofplan]
We first show that the residual $x-p$ is orthogonal to every vector in $V$ by differentiating the squared distance along the lines $t \mapsto p+tv$ with $v \in V$, which lie inside $V$. Testing this orthogonality relation against each spanning vector $\varphi_i$ gives the normal equations. The coefficient matrix of these equations is the Gram matrix of the $\varphi_j$, whose quadratic form is the squared Hilbert norm of a linear combination of the $\varphi_j$. When the spanning family is linearly independent, this forces the matrix to be invertible, and hence the coefficient vector to be unique.
[/proofplan]
[step:Differentiate the squared distance along every direction in $V$]
Let $v \in V$ be arbitrary. Define the function $q_v: \mathbb{R} \to \mathbb{R}$ by
\begin{align*}
q_v(t) := \|x-(p+tv)\|_H^2.
\end{align*}
Since $V$ is a linear subspace and $p,v \in V$, we have $p+tv \in V$ for every $t \in \mathbb{R}$. The best-approximation property of $p$ therefore gives
\begin{align*}
q_v(0) \leq q_v(t)
\end{align*}
for every $t \in \mathbb{R}$, so $q_v$ attains its global minimum at $t=0$.
Expanding the square in the real [Hilbert space](/page/Hilbert%20Space) $H$ gives
\begin{align*}
q_v(t) = \|x-p\|_H^2 - 2t(x-p,v)_H + t^2\|v\|_H^2.
\end{align*}
Thus $q_v$ is differentiable and
\begin{align*}
q_v'(0) = -2(x-p,v)_H.
\end{align*}
Since the differentiable function $q_v$ attains its minimum over $\mathbb{R}$ at $t=0$, its derivative vanishes there, that is, $q_v'(0)=0$. Hence
\begin{align*}
(x-p,v)_H = 0.
\end{align*}
Because $v \in V$ was arbitrary, $x-p$ is orthogonal to every element of $V$.
[guided]
The point of this step is to convert the metric statement “$p$ is closest to $x$” into the Hilbert-space statement “the error $x-p$ is orthogonal to $V$.” To do this, fix an arbitrary vector $v \in V$ and move from $p$ in the direction $v$. Since $V$ is a linear subspace and $p,v \in V$, every point $p+tv$ with $t \in \mathbb{R}$ still lies in $V$.
Define $q_v: \mathbb{R} \to \mathbb{R}$ by
\begin{align*}
q_v(t) := \|x-(p+tv)\|_H^2.
\end{align*}
The best-approximation hypothesis says that $p$ is at least as close to $x$ as every other vector in $V$. Applying this to the particular vector $p+tv \in V$ gives
\begin{align*}
q_v(0) = \|x-p\|_H^2 \leq \|x-(p+tv)\|_H^2 = q_v(t)
\end{align*}
for every $t \in \mathbb{R}$. Therefore $q_v$ has a minimum at $0$.
Now write $q_v(t)$ in terms of the real Hilbert [inner product](/page/Inner%20Product):
\begin{align*}
q_v(t) = (x-p-tv,x-p-tv)_H.
\end{align*}
Expanding by bilinearity and symmetry, this becomes
\begin{align*}
q_v(t) = \|x-p\|_H^2 - 2t(x-p,v)_H + t^2\|v\|_H^2.
\end{align*}
Thus $q_v$ is a polynomial in $t$, so it is differentiable, and its derivative at $0$ is
\begin{align*}
q_v'(0) = -2(x-p,v)_H.
\end{align*}
A differentiable real-valued function with a minimum at an interior point has derivative zero there, so $q_v'(0)=0$. Hence
\begin{align*}
(x-p,v)_H = 0.
\end{align*}
Since the direction $v \in V$ was arbitrary, the residual $x-p$ is orthogonal to every vector in $V$.
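As a concrete check, and purely as an illustration under assumptions not made in the statement, take $H=\mathbb{R}^2$ with the Euclidean inner product, $V=\operatorname{span}\{(1,0)\}$ and $x=(3,4)$, whose closest point in $V$ is $p=(3,0)$. With the direction $v=(1,0)$ we get
\begin{align*}
q_v(t) = \|(3,4)-(3+t,0)\|^2 = \|(-t,4)\|^2 = t^2+16,
\end{align*}
which is minimized at $t=0$. Moreover,
\begin{align*}
q_v'(0) = 0 = -2\,\bigl((0,4),(1,0)\bigr) = -2(x-p,v)_H,
\end{align*}
in agreement with the general formula: the residual $x-p=(0,4)$ is orthogonal to $V$.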
[/guided]
[/step]
[step:Test the orthogonality relation against the spanning vectors]
For each $i \in \{1,\dots,n\}$, the vector $\varphi_i$ belongs to $V$. Applying the orthogonality relation from the previous step with $v=\varphi_i$ gives
\begin{align*}
(x-p,\varphi_i)_H = 0.
\end{align*}
Using the representation $p=\sum_{j=1}^n a_j\varphi_j$ and linearity of the inner product in the first variable, we obtain
\begin{align*}
0 = (x,\varphi_i)_H - \left(\sum_{j=1}^n a_j\varphi_j,\varphi_i\right)_H.
\end{align*}
Therefore
\begin{align*}
0 = (x,\varphi_i)_H - \sum_{j=1}^n a_j(\varphi_j,\varphi_i)_H.
\end{align*}
Rearranging gives
\begin{align*}
\sum_{j=1}^n a_j(\varphi_j,\varphi_i)_H = (x,\varphi_i)_H.
\end{align*}
This is the desired normal equation for the index $i$. Since $i$ was arbitrary, the full system holds.
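To see the system in a small case, assume for illustration only that $H=\mathbb{R}^3$ carries the Euclidean inner product and that $\varphi_1=(1,0,0)$, $\varphi_2=(1,1,0)$ and $x=(1,2,3)$; these choices are hypothetical and not part of the statement. The normal equations for $i=1$ and $i=2$ then read
\begin{align*}
a_1 + a_2 &= 1, \\
a_1 + 2a_2 &= 3,
\end{align*}
so $a_2=2$ and $a_1=-1$. This gives
\begin{align*}
p = -\varphi_1 + 2\varphi_2 = (1,2,0), \qquad x-p = (0,0,3),
\end{align*}
and the residual $x-p$ is indeed orthogonal to both $\varphi_1$ and $\varphi_2$.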
[/step]
[step:Identify the coefficient matrix as a Gram matrix]
Define the real $n \times n$ matrix $G$ by
\begin{align*}
G_{ij} := (\varphi_j,\varphi_i)_H
\end{align*}
for $i,j \in \{1,\dots,n\}$. Define the coefficient vector $a \in \mathbb{R}^n$ by $a=(a_1,\dots,a_n)$ and define the right-hand side vector $b \in \mathbb{R}^n$ by
\begin{align*}
b_i := (x,\varphi_i)_H.
\end{align*}
Since $(Ga)_i = \sum_{j=1}^n G_{ij}a_j = \sum_{j=1}^n a_j(\varphi_j,\varphi_i)_H$ for each $i$, the normal equations are exactly the finite-dimensional linear system
\begin{align*}
Ga=b.
\end{align*}
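As a sketch of how $G$ and $b$ look in a function space, assume (hypothetically, since the statement fixes no particular $H$) that $H=L^2(0,1)$ with $(f,g)_H=\int_0^1 f(s)g(s)\,ds$, and take $\varphi_1(s)=1$, $\varphi_2(s)=s$ and $x(s)=s^2$. Computing the integrals gives
\begin{align*}
G = \begin{pmatrix} 1 & \tfrac12 \\ \tfrac12 & \tfrac13 \end{pmatrix},
\qquad
b = \begin{pmatrix} \tfrac13 \\ \tfrac14 \end{pmatrix}.
\end{align*}
Solving $Ga=b$ yields $a_1=-\tfrac16$ and $a_2=1$, so under these assumptions the best approximation of $s^2$ by affine functions is
\begin{align*}
p(s) = s - \tfrac16.
\end{align*}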
[/step]
[step:Prove that the Gram matrix is invertible under linear independence]
Assume that $\varphi_1,\dots,\varphi_n$ are linearly independent. Since $G$ is a square matrix, it is invertible as soon as its kernel is trivial, so it is enough to prove that $Gc=0$ implies $c=0$.
Let $c=(c_1,\dots,c_n) \in \mathbb{R}^n$ satisfy $Gc=0$. Define the vector $y \in H$ by
\begin{align*}
y := \sum_{j=1}^n c_j\varphi_j.
\end{align*}
Since $Gc=0$, for each $i \in \{1,\dots,n\}$ we have
\begin{align*}
\sum_{j=1}^n c_j(\varphi_j,\varphi_i)_H = 0.
\end{align*}
Multiplying the equation for index $i$ by $c_i$ and summing over $i$ gives
\begin{align*}
\sum_{i=1}^n c_i\sum_{j=1}^n c_j(\varphi_j,\varphi_i)_H = 0.
\end{align*}
By bilinearity of the real Hilbert inner product, the left-hand side is
\begin{align*}
\left(\sum_{j=1}^n c_j\varphi_j,\sum_{i=1}^n c_i\varphi_i\right)_H = \|y\|_H^2.
\end{align*}
Therefore
\begin{align*}
\|y\|_H^2 = 0.
\end{align*}
Positive definiteness of the inner product gives $y=0$. Since $\varphi_1,\dots,\varphi_n$ are linearly independent, the equality
\begin{align*}
\sum_{j=1}^n c_j\varphi_j = 0
\end{align*}
implies $c_1=\dots=c_n=0$. Thus $\ker G=\{0\}$, so $G$ is invertible.
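The role of linear independence can be seen in a hypothetical example that is not part of the statement. In $H=\mathbb{R}^2$ with the Euclidean inner product, take the dependent family $\varphi_1=(1,0)$, $\varphi_2=(2,0)$. Then
\begin{align*}
G = \begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix},
\qquad
c = \begin{pmatrix} 2 \\ -1 \end{pmatrix},
\qquad
Gc = \begin{pmatrix} 0 \\ 0 \end{pmatrix},
\end{align*}
and the corresponding vector is $y = 2\varphi_1 - \varphi_2 = 0$ although $c \neq 0$. Here the argument stops at $y=0$ and $G$ is singular. By contrast, for the independent family $\varphi_1=(1,0,0)$, $\varphi_2=(1,1,0)$ in $\mathbb{R}^3$ one finds
\begin{align*}
\sum_{i=1}^2 c_i\sum_{j=1}^2 c_j(\varphi_j,\varphi_i)_H = c_1^2 + 2c_1c_2 + 2c_2^2 = (c_1+c_2)^2 + c_2^2 = \|c_1\varphi_1+c_2\varphi_2\|_H^2,
\end{align*}
which vanishes only for $c_1=c_2=0$.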
[/step]
[step:Conclude uniqueness of the normal-equation solution]
When $\varphi_1,\dots,\varphi_n$ are linearly independent, the previous step shows that the coefficient matrix $G$ is invertible. Hence the linear system $Ga=b$ has exactly one solution, namely $a=G^{-1}b \in \mathbb{R}^n$. Therefore the normal equations determine a unique coefficient vector, completing the proof.
[/step]