[guided]The supersolution direction is the subtler half because a lower test function must be compared with an actual minimizing trajectory in the dynamic programming formula. We begin exactly as in the terse proof. Let $\varphi: (0,\infty)\times \mathbb{R}^n \to \mathbb{R}$ be $C^1$, suppose $u-\varphi$ has a local minimum at $(t_0,x_0)$, and normalize so that $u(t_0,x_0)=\varphi(t_0,x_0)$. We need to show
\begin{align*}
\partial_t \varphi(t_0,x_0) + \frac{1}{2}|\nabla \varphi(t_0,x_0)|^2 \geq 0.
\end{align*}
Fix $h \in (0,t_0)$. The dynamic programming identity says that $u(t_0,x_0)$ is obtained by minimizing over the previous spatial point $z$ at time $t_0-h$. The map $z \mapsto u(t_0-h,z)$ is continuous by the continuity step, and the added quadratic cost is coercive. Since $u$ is bounded by $B$, the whole objective is coercive. Hence the minimum is attained. Choose $z_h \in \mathbb{R}^n$ such that
\begin{align*}
u(t_0,x_0) = u(t_0-h,z_h) + \frac{|x_0-z_h|^2}{2h}.
\end{align*}
The boundedness of $f$ gives $|u| \leq B$, where $B=\|f\|_\infty$. Hence
\begin{align*}
\frac{|x_0-z_h|^2}{2h} = u(t_0,x_0)-u(t_0-h,z_h) \leq 2B.
\end{align*}
This estimate is important: it forces $z_h \to x_0$ as $h \downarrow 0$, so the minimizing point remains inside the neighbourhood where the test function touches $u$ from below.
Because $u-\varphi$ has a local minimum at $(t_0,x_0)$, and because $(t_0-h,z_h) \to (t_0,x_0)$, for all sufficiently small $h$ we have
\begin{align*}
u(t_0-h,z_h)-u(t_0,x_0) \geq \varphi(t_0-h,z_h)-\varphi(t_0,x_0).
\end{align*}
Substituting the equality for $u(t_0,x_0)$ into this contact inequality gives
\begin{align*}
0 \geq \varphi(t_0-h,z_h)-\varphi(t_0,x_0) + \frac{|x_0-z_h|^2}{2h}.
\end{align*}
Now introduce the velocity variable $r_h \in \mathbb{R}^n$ by
\begin{align*}
r_h := \frac{x_0-z_h}{h}.
\end{align*}
Then $z_h=x_0-hr_h$, and the inequality becomes
\begin{align*}
0 \geq \varphi(t_0-h,x_0-hr_h)-\varphi(t_0,x_0) + \frac{h}{2}|r_h|^2.
\end{align*}
The point of this notation is that the quadratic cost becomes the kinetic term $\frac{h}{2}|r_h|^2$, while the test function can be expanded to first order. Before using a first-order expansion along this moving direction, we prove that the velocities $r_h$ are bounded. Since $\varphi$ is $C^1$, it is locally Lipschitz near $(t_0,x_0)$. Thus there are constants $\rho>0$ and $M>0$ such that
\begin{align*}
|\varphi(t,x)-\varphi(t_0,x_0)| \leq M(|t-t_0|+|x-x_0|)
\end{align*}
whenever $|t-t_0|+|x-x_0|<\rho$. Since $(t_0-h,z_h)\to(t_0,x_0)$, this applies for all sufficiently small $h$. Combining the local Lipschitz estimate with
\begin{align*}
0 \geq \varphi(t_0-h,x_0-hr_h)-\varphi(t_0,x_0) + \frac{h}{2}|r_h|^2
\end{align*}
gives
\begin{align*}
0 \geq -Mh(1+|r_h|) + \frac{h}{2}|r_h|^2.
\end{align*}
After division by $h>0$, the quadratic inequality
\begin{align*}
\frac{1}{2}|r_h|^2 \leq M(1+|r_h|)
\end{align*}
shows that $(r_h)$ is bounded along small $h$.
Now the differentiability expansion is legitimate in a uniform form along the selected sequence of directions: because $r_h$ is bounded, the increment $(-h,-hr_h)$ has size $O(h)$, and hence
\begin{align*}
\varphi(t_0-h,x_0-hr_h)-\varphi(t_0,x_0) = -h\,\partial_t\varphi(t_0,x_0) - h\,\nabla\varphi(t_0,x_0)\cdot r_h + o(h).
\end{align*}
Choose a sequence $h_k \downarrow 0$ such that $r_{h_k} \to r \in \mathbb{R}^n$. Divide the inequality by $h_k$ and pass to the limit. The differentiability remainder vanishes because the velocities are bounded, and we obtain
\begin{align*}
0 \geq -\partial_t\varphi(t_0,x_0) - \nabla\varphi(t_0,x_0)\cdot r + \frac{1}{2}|r|^2.
\end{align*}
Finally, complete the square:
\begin{align*}
-\nabla\varphi(t_0,x_0)\cdot r + \frac{1}{2}|r|^2 = \frac{1}{2}|r-\nabla\varphi(t_0,x_0)|^2 - \frac{1}{2}|\nabla\varphi(t_0,x_0)|^2.
\end{align*}
The square is non-negative, so
\begin{align*}
-\nabla\varphi(t_0,x_0)\cdot r + \frac{1}{2}|r|^2 \geq -\frac{1}{2}|\nabla\varphi(t_0,x_0)|^2.
\end{align*}
Combining this lower bound with the limiting inequality gives
\begin{align*}
0 \geq -\partial_t\varphi(t_0,x_0) - \frac{1}{2}|\nabla\varphi(t_0,x_0)|^2.
\end{align*}
Equivalently,
\begin{align*}
\partial_t\varphi(t_0,x_0) + \frac{1}{2}|\nabla\varphi(t_0,x_0)|^2 \geq 0.
\end{align*}
This proves the viscosity supersolution inequality.[/guided]