[proofplan]
The proof computes the first variation for one admissible endpoint-moving family and then rewrites the boundary contribution in terms of the actual endpoint velocity in $\mathbb{R}^2$. The moving upper limit contributes the term $L(b,y(b),y'(b))\delta b$, while [integration by parts](/theorems/210) contributes $\partial_{y'}L(b,y(b),y'(b))h(b)$ from the vertical graph variation. The endpoint velocity is not $(\delta b,h(b))$, but rather $(\delta b,\delta y)$ with $\delta y=h(b)+y'(b)\delta b$, and this substitution produces the transversality vector. Since every tangent direction to $\Gamma$ is realised by an admissible variation, stationarity forces that vector to be orthogonal to the full tangent line of $\Gamma$.
[/proofplan]
[step:Introduce the endpoint and graph variation velocities]
Fix an admissible variation $\varepsilon \mapsto (b_\varepsilon,Y_\varepsilon)$. Define the endpoint abscissa velocity $\delta b \in \mathbb{R}$ by
\begin{align*}
\delta b=\frac{d}{d\varepsilon}\Big|_{\varepsilon=0}b_\varepsilon.
\end{align*}
Define the graph variation field $h:[a,b]\to\mathbb{R}$ by
\begin{align*}
h(x)=\frac{\partial Y_\varepsilon}{\partial\varepsilon}(0,x).
\end{align*}
Because $(\varepsilon,x)\mapsto Y_\varepsilon(x)$ is $C^2$ on $(-\varepsilon_0,\varepsilon_0)\times U$, the function $h$ is $C^1$ on $[a,b]$ and
\begin{align*}
h'(x)=\frac{\partial}{\partial\varepsilon}\Big|_{\varepsilon=0}\partial_xY_\varepsilon(x)
\end{align*}
for every $x\in[a,b]$.
Because $Y_\varepsilon(a)=y(a)$ for all $\varepsilon$, differentiating at $\varepsilon=0$ gives
\begin{align*}
h(a)=0.
\end{align*}
Define the endpoint ordinate velocity $\delta y \in \mathbb{R}$ by
\begin{align*}
\delta y=\frac{d}{d\varepsilon}\Big|_{\varepsilon=0}Y_\varepsilon(b_\varepsilon).
\end{align*}
The chain rule applied at $(0,b)$ to the map $(\varepsilon,x)\mapsto Y_\varepsilon(x)$, which is $C^2$ and hence $C^1$, together with $Y_0|_{[a,b]}=y$, gives
\begin{align*}
\delta y=h(b)+y'(b)\delta b.
\end{align*}
Thus the endpoint velocity is
\begin{align*}
\tau=(\delta b,\delta y)=(\delta b,h(b)+y'(b)\delta b).
\end{align*}
[guided]
The variation has two different first-order quantities at the right endpoint, and it is important not to confuse them. First, the endpoint moves horizontally with velocity
\begin{align*}
\delta b=\frac{d}{d\varepsilon}\Big|_{\varepsilon=0}b_\varepsilon.
\end{align*}
Second, the graph itself changes vertically at a fixed abscissa. We encode that fixed-abscissa vertical variation by the map $h:[a,b]\to\mathbb{R}$ defined by
\begin{align*}
h(x)=\frac{\partial Y_\varepsilon}{\partial\varepsilon}(0,x).
\end{align*}
The left endpoint is fixed, so $Y_\varepsilon(a)=y(a)$ for every $\varepsilon$. Differentiating this identity with respect to $\varepsilon$ gives
\begin{align*}
h(a)=0.
\end{align*}
The ordinate of the moving endpoint is not simply $Y_\varepsilon(b)$. It is $Y_\varepsilon(b_\varepsilon)$, so its derivative has two contributions: the graph changes with $\varepsilon$, and the evaluation point $b_\varepsilon$ moves along the original graph. Define
\begin{align*}
\delta y=\frac{d}{d\varepsilon}\Big|_{\varepsilon=0}Y_\varepsilon(b_\varepsilon).
\end{align*}
By the chain rule,
\begin{align*}
\delta y=\frac{\partial Y_\varepsilon}{\partial\varepsilon}(0,b)+\partial_xY_0(b)\delta b.
\end{align*}
Since $Y_0|_{[a,b]}=y$, this becomes
\begin{align*}
\delta y=h(b)+y'(b)\delta b.
\end{align*}
Therefore the velocity of the endpoint as a point of the plane is
\begin{align*}
\tau=(\delta b,\delta y)=(\delta b,h(b)+y'(b)\delta b).
\end{align*}
This identity is the bridge between the variational boundary term, which naturally contains $h(b)$, and the geometric constraint, which is stated in terms of tangent vectors to $\Gamma$ in $\mathbb{R}^2$.
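As a sanity check, consider the simplest kind of family. This is only a sketch under assumptions that are not part of the statement: $b_\varepsilon=b+\varepsilon\,\delta b$ and $Y_\varepsilon=y+\varepsilon h$ for a fixed $C^2$ function $h$ with $h(a)=0$, where $y$ and $h$ are taken to be defined on the neighbourhood $U$ of $[a,b]$. Whether such a family is admissible depends on $\Gamma$; only the first-order computation is illustrated here. In this case
\begin{align*}
Y_\varepsilon(b_\varepsilon)=y(b+\varepsilon\,\delta b)+\varepsilon\,h(b+\varepsilon\,\delta b),
\end{align*}
and differentiating with the product and chain rules gives
\begin{align*}
\frac{d}{d\varepsilon}Y_\varepsilon(b_\varepsilon)=y'(b+\varepsilon\,\delta b)\,\delta b+h(b+\varepsilon\,\delta b)+\varepsilon\,h'(b+\varepsilon\,\delta b)\,\delta b.
\end{align*}
At $\varepsilon=0$ the last term vanishes, which leaves
\begin{align*}
\delta y=y'(b)\delta b+h(b),
\end{align*}
in agreement with the general chain rule formula.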
[/guided]
[/step]
[step:Differentiate the functional and isolate the boundary terms]
Define the auxiliary functions
\begin{align*}
L_y:U\times\mathbb{R}\times\mathbb{R}\to\mathbb{R}
\end{align*}
and
\begin{align*}
L_{y'}:U\times\mathbb{R}\times\mathbb{R}\to\mathbb{R}
\end{align*}
by
\begin{align*}
L_y(x,z,p)=\partial_yL(x,z,p)
\end{align*}
and
\begin{align*}
L_{y'}(x,z,p)=\partial_{y'}L(x,z,p).
\end{align*}
After shrinking $\varepsilon_0>0$ if necessary, the continuity of $\varepsilon\mapsto b_\varepsilon$ and the inequality $a<b_0=b$ ensure that $a<b_\varepsilon$ for all $|\varepsilon|<\varepsilon_0$, so each interval $[a,b_\varepsilon]$ has the intended orientation. The map $(\varepsilon,x)\mapsto L(x,Y_\varepsilon(x),\partial_xY_\varepsilon(x))$ is $C^1$ on a neighbourhood of $\{0\}\times[a,b]$, because $L\in C^2$ and $(\varepsilon,x)\mapsto Y_\varepsilon(x)$ is $C^2$. The one-dimensional Leibniz rule for a $C^1$ integrand with $C^1$ moving upper limit therefore gives
\begin{align*}
I'(0)=\int_{[a,b]}\left(L_y(x,y(x),y'(x))h(x)+L_{y'}(x,y(x),y'(x))h'(x)\right)\,d\mathcal{L}^1(x)+L(b,y(b),y'(b))\delta b.
\end{align*}
Since $L\in C^2$ and $y\in C^2([a,b])$, the function
\begin{align*}
x\mapsto L_{y'}(x,y(x),y'(x))
\end{align*}
is $C^1$ on $[a,b]$. [Integration by parts](/theorems/2098) on $[a,b]$ gives
\begin{align*}
\int_{[a,b]}L_{y'}(x,y(x),y'(x))h'(x)\,d\mathcal{L}^1(x)=L_{y'}(b,y(b),y'(b))h(b)-L_{y'}(a,y(a),y'(a))h(a)-\int_{[a,b]}\frac{d}{dx}L_{y'}(x,y(x),y'(x))h(x)\,d\mathcal{L}^1(x).
\end{align*}
Substituting this into the formula for $I'(0)$ and using $h(a)=0$ gives
\begin{align*}
I'(0)=\int_{[a,b]}\left(L_y(x,y(x),y'(x))-\frac{d}{dx}L_{y'}(x,y(x),y'(x))\right)h(x)\,d\mathcal{L}^1(x)+L_{y'}(b,y(b),y'(b))h(b)+L(b,y(b),y'(b))\delta b.
\end{align*}
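To see the formula in a concrete case, take the illustrative Lagrangian $L(x,z,p)=\tfrac12p^2$, an assumption made only for this example. Then $L_y=0$ and $L_{y'}(x,z,p)=p$, so the Leibniz rule gives
\begin{align*}
I'(0)=\int_{[a,b]}y'(x)h'(x)\,d\mathcal{L}^1(x)+\tfrac12y'(b)^2\,\delta b,
\end{align*}
and integration by parts with $h(a)=0$ turns this into
\begin{align*}
I'(0)=-\int_{[a,b]}y''(x)h(x)\,d\mathcal{L}^1(x)+y'(b)h(b)+\tfrac12y'(b)^2\,\delta b.
\end{align*}
The interior integrand $-y''h$ is the Euler-Lagrange expression for this $L$ multiplied by $h$, and the two boundary terms are the instances of $L_{y'}h(b)$ and $L\,\delta b$ in the general formula.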
[/step]
[step:Use stationarity to cancel the interior Euler-Lagrange term]
By the hypothesis of the statement that fixed-endpoint interior variations are admissible, for every $\phi\in C_c^1((a,b))$ the family $b_\varepsilon=b$, $Y_\varepsilon=y+\varepsilon\phi$ is admissible. For such a family $\delta b=0$ and $h=\phi$, so $h(b)=\phi(b)=0$ and both boundary terms in the formula for $I'(0)$ vanish. Stationarity, $I'(0)=0$, therefore gives the weak Euler-Lagrange identity
\begin{align*}
\int_{[a,b]}\left(L_y(x,y(x),y'(x))-\frac{d}{dx}L_{y'}(x,y(x),y'(x))\right)\phi(x)\,d\mathcal{L}^1(x)=0
\end{align*}
for every $\phi\in C_c^1((a,b))$. Since the factor multiplying $\phi$ is continuous on $[a,b]$, the fundamental lemma of the [calculus of variations](/page/Calculus%20of%20Variations) gives the Euler-Lagrange equation along $y$:
\begin{align*}
L_y(x,y(x),y'(x))-\frac{d}{dx}L_{y'}(x,y(x),y'(x))=0
\end{align*}
for every $x\in(a,b)$. Substituting this identity into the [first variation formula](/theorems/2728) and using stationarity once more gives, for every admissible endpoint-moving variation,
\begin{align*}
0=I'(0)=L_{y'}(b,y(b),y'(b))h(b)+L(b,y(b),y'(b))\delta b.
\end{align*}
[/step]
[step:Rewrite the boundary expression using the actual endpoint velocity]
From the endpoint velocity relation,
\begin{align*}
h(b)=\delta y-y'(b)\delta b.
\end{align*}
Substituting this into the boundary identity yields
\begin{align*}
0=L_{y'}(b,y(b),y'(b))(\delta y-y'(b)\delta b)+L(b,y(b),y'(b))\delta b.
\end{align*}
Collecting the coefficients of $\delta b$ and $\delta y$ gives
\begin{align*}
0=\left(L(b,y(b),y'(b))-y'(b)L_{y'}(b,y(b),y'(b))\right)\delta b+L_{y'}(b,y(b),y'(b))\delta y.
\end{align*}
Equivalently,
\begin{align*}
\left(L(b,y(b),y'(b))-y'(b)\partial_{y'}L(b,y(b),y'(b)),\partial_{y'}L(b,y(b),y'(b))\right)\cdot(\delta b,\delta y)=0.
\end{align*}
Since $(\delta b,\delta y)=\tau$, the displayed orthogonality holds for the endpoint velocity of the chosen admissible variation.
[guided]
For the chosen admissible variation, stationarity gives $I'(0)=0$. The differentiated functional and the integration by parts computation give
\begin{align*}
I'(0)=\int_{[a,b]}\left(L_y(x,y(x),y'(x))-\frac{d}{dx}L_{y'}(x,y(x),y'(x))\right)h(x)\,d\mathcal{L}^1(x)+L_{y'}(b,y(b),y'(b))h(b)+L(b,y(b),y'(b))\delta b.
\end{align*}
The fixed-endpoint interior variations imply the Euler-Lagrange equation
\begin{align*}
L_y(x,y(x),y'(x))-\frac{d}{dx}L_{y'}(x,y(x),y'(x))=0
\end{align*}
for every $x\in(a,b)$. Because the left-hand side is continuous on $[a,b]$, the integrand of the interior integral vanishes on all of $[a,b]$. Hence the interior integral vanishes, and stationarity reduces the first variation to the boundary identity
\begin{align*}
0=L_{y'}(b,y(b),y'(b))h(b)+L(b,y(b),y'(b))\delta b.
\end{align*}
This expression is not yet the desired transversality condition because the geometric constraint is imposed on the moving endpoint as a point of $\mathbb{R}^2$. The relevant endpoint velocity is
\begin{align*}
\tau=(\delta b,\delta y),
\end{align*}
not $(\delta b,h(b))$.
The chain rule for the moving endpoint gives
\begin{align*}
\delta y=h(b)+y'(b)\delta b.
\end{align*}
Solving for $h(b)$ gives
\begin{align*}
h(b)=\delta y-y'(b)\delta b.
\end{align*}
Substituting this into the boundary identity gives
\begin{align*}
0=L_{y'}(b,y(b),y'(b))(\delta y-y'(b)\delta b)+L(b,y(b),y'(b))\delta b.
\end{align*}
Now group the coefficient of the horizontal endpoint velocity $\delta b$ and the coefficient of the vertical endpoint velocity $\delta y$:
\begin{align*}
0=\left(L(b,y(b),y'(b))-y'(b)L_{y'}(b,y(b),y'(b))\right)\delta b+L_{y'}(b,y(b),y'(b))\delta y.
\end{align*}
Since $L_{y'}=\partial_{y'}L$ by definition, this is exactly the dot product identity
\begin{align*}
\left(L(b,y(b),y'(b))-y'(b)\partial_{y'}L(b,y(b),y'(b)),\partial_{y'}L(b,y(b),y'(b))\right)\cdot(\delta b,\delta y)=0.
\end{align*}
The key point is that the term $-y'(b)\partial_{y'}L$ appears because a horizontal displacement of the endpoint also changes the endpoint ordinate along the original curve by $y'(b)\delta b$.
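Two special cases may help in reading this identity. They are illustrations under extra assumptions on the chosen variation, not part of the proof. If the endpoint moves purely vertically, so that $\delta b=0$ and $\delta y\neq0$, the identity reduces to
\begin{align*}
\partial_{y'}L(b,y(b),y'(b))=0,
\end{align*}
which is the familiar natural boundary condition for a free ordinate at a fixed abscissa. If instead the endpoint moves purely horizontally, so that $\delta y=0$ and $\delta b\neq0$, the identity reduces to
\begin{align*}
L(b,y(b),y'(b))-y'(b)\partial_{y'}L(b,y(b),y'(b))=0.
\end{align*}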
[/guided]
[/step]
[step:Apply the realisation hypothesis for tangent directions]
For every admissible variation, the endpoint curve condition
\begin{align*}
(b_\varepsilon,Y_\varepsilon(b_\varepsilon))\in\Gamma
\end{align*}
implies that its endpoint velocity $\tau$ lies in $T_{(b,y(b))}\Gamma$. The preceding step proves the required dot product identity for every tangent vector that is realised as such an endpoint velocity. By hypothesis, every vector in $T_{(b,y(b))}\Gamma$ is realised by some admissible endpoint-moving variation. Therefore, for every $\tau\in T_{(b,y(b))}\Gamma$,
\begin{align*}
\left(L(b,y(b),y'(b))-y'(b)\partial_{y'}L(b,y(b),y'(b)),\partial_{y'}L(b,y(b),y'(b))\right)\cdot\tau=0.
\end{align*}
This is the asserted transversality condition.
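As an illustration, assume for this example only that $L$ is the arc-length Lagrangian $L(x,z,p)=\sqrt{1+p^2}$. Then
\begin{align*}
\partial_{y'}L(x,z,p)=\frac{p}{\sqrt{1+p^2}}
\end{align*}
and
\begin{align*}
L(x,z,p)-p\,\partial_{y'}L(x,z,p)=\sqrt{1+p^2}-\frac{p^2}{\sqrt{1+p^2}}=\frac{1}{\sqrt{1+p^2}}.
\end{align*}
Evaluated at $p=y'(b)$, the transversality vector is therefore
\begin{align*}
\frac{1}{\sqrt{1+y'(b)^2}}\left(1,y'(b)\right),
\end{align*}
the unit tangent vector of the graph of $y$ at the endpoint. In this special case the transversality condition says that the graph of $y$ meets $\Gamma$ orthogonally at $(b,y(b))$.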
[/step]