[proofplan]
Fix an admissible two-sided variation direction $h$ at $y$ and reduce the functional problem to a one-variable problem along the affine variation curve $\varepsilon\mapsto y+\varepsilon h$. The local extremum assumption on $J$ implies that the induced real-valued function has a local extremum at $\varepsilon=0$. The ordinary one-variable necessary condition for differentiable local extrema forces the derivative to vanish at $0$, and the definition of first variation identifies that derivative with $\delta J[y;h]$.
[/proofplan]
[step:Restrict the functional to the admissible variation line]
Let $h\in\mathcal V$ be an admissible two-sided variation direction at $y$. By admissibility, there exists $\rho_h>0$ such that $y+\varepsilon h\in\mathcal A$ for every $\varepsilon\in(-\rho_h,\rho_h)$. Define the one-variable variation map
\begin{align*}
\phi:(-\rho_h,\rho_h)&\to\mathbb R
\end{align*}
by
\begin{align*}
\phi(\varepsilon)=J[y+\varepsilon h].
\end{align*}
By the assumed existence of the first variation in the direction $h$, the derivative $\phi'(0)$ exists and satisfies
\begin{align*}
\phi'(0)=\delta J[y;h].
\end{align*}
[guided]
We fix one admissible two-sided variation direction $h\in\mathcal V$ at $y$. The qualifier ``two-sided'' is what makes the ordinary derivative test applicable: it provides a whole open interval around $0$, not just perturbations with $\varepsilon\geq0$ or with $\varepsilon\leq0$. Thus there exists $\rho_h>0$ such that $y+\varepsilon h\in\mathcal A$ for every $\varepsilon\in(-\rho_h,\rho_h)$.
Now define the associated real-valued one-variable function
\begin{align*}
\phi:(-\rho_h,\rho_h)&\to\mathbb R
\end{align*}
by
\begin{align*}
\phi(\varepsilon)=J[y+\varepsilon h].
\end{align*}
This function records the value of the functional $J$ along the admissible line through $y$ in the direction $h$. Since the first variation of $J$ at $y$ in the direction $h$ exists by hypothesis, the derivative of this one-variable function at $0$ exists, and the definition of first variation gives
\begin{align*}
\phi'(0)=\frac{d}{d\varepsilon}\bigg|_{\varepsilon=0}J[y+\varepsilon h]=\delta J[y;h].
\end{align*}
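As a purely illustrative instance, not taken from the statement being proved, assume that $J[y]=\int_0^1 y'(x)^2\,dx$ on $C^1$ functions with prescribed boundary values, so that the admissible directions are the $C^1$ functions $h$ with $h(0)=h(1)=0$. Under these assumptions $y+\varepsilon h$ is admissible for every real $\varepsilon$, so any $\rho_h>0$ works, and expanding the square gives
\begin{align*}
\phi(\varepsilon)&=\int_0^1\bigl(y'(x)+\varepsilon h'(x)\bigr)^2\,dx\\
&=J[y]+2\varepsilon\int_0^1 y'(x)h'(x)\,dx+\varepsilon^2\int_0^1 h'(x)^2\,dx.
\end{align*}
In this hypothetical case $\phi$ is a quadratic polynomial in $\varepsilon$, so $\phi'(0)$ exists and
\begin{align*}
\phi'(0)=2\int_0^1 y'(x)h'(x)\,dx=\delta J[y;h].
\end{align*}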
[/guided]
[/step]
[step:Transfer the local extremum from $J$ to the one-variable function]
Assume first that $y$ is a local minimizer relative to admissible two-sided variations. Then, after decreasing $\rho_h$ if necessary, we have
\begin{align*}
J[y]\leq J[y+\varepsilon h]
\end{align*}
for all $\varepsilon\in(-\rho_h,\rho_h)$. Hence
\begin{align*}
\phi(0)\leq \phi(\varepsilon)
\end{align*}
for all $\varepsilon\in(-\rho_h,\rho_h)$, so $0$ is a local minimizer of $\phi$.
If $y$ is instead a local maximizer, the same argument with the inequality reversed gives, after decreasing $\rho_h$ if necessary,
\begin{align*}
\phi(0)\geq \phi(\varepsilon)
\end{align*}
for all $\varepsilon\in(-\rho_h,\rho_h)$, so $0$ is a local maximizer of $\phi$.
[/step]
[step:Apply the one-variable derivative test at the interior extremum]
In either case, $0$ is an interior point of the interval $(-\rho_h,\rho_h)$ and $\phi$ is differentiable at $0$. By Fermat's one-variable necessary condition for differentiable local extrema, $\phi'(0)=0$.
Combining this with the identity obtained from the definition of first variation gives
\begin{align*}
\delta J[y;h]=\phi'(0)=0.
\end{align*}
Because $h$ was an arbitrary admissible two-sided variation direction at $y$, the conclusion holds for every such $h$.
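For completeness, the appeal to Fermat's condition can be unpacked as follows. This is a sketch of the standard one-variable argument in the minimizer case; the maximizer case follows by reversing the inequalities. Since $\phi(\varepsilon)-\phi(0)\geq0$ on $(-\rho_h,\rho_h)$, the difference quotients satisfy
\begin{align*}
\frac{\phi(\varepsilon)-\phi(0)}{\varepsilon}&\geq0\quad\text{for }0<\varepsilon<\rho_h,\\
\frac{\phi(\varepsilon)-\phi(0)}{\varepsilon}&\leq0\quad\text{for }-\rho_h<\varepsilon<0.
\end{align*}
Because $\phi'(0)$ exists, both one-sided limits of these quotients as $\varepsilon\to0$ equal $\phi'(0)$. Passing to the limit in each inequality gives $\phi'(0)\geq0$ and $\phi'(0)\leq0$, hence $\phi'(0)=0$. If only perturbations with $\varepsilon\geq0$ were admissible, only the first inequality would be available and the argument would yield just a sign condition on the right-hand derivative; this is the place where two-sidedness is used.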
[/step]