[proofplan]
We express $J[y + \varepsilon h]$ as the integral of a parametric function $\Phi(x, \varepsilon) = L(x, y(x) + \varepsilon h(x), y'(x) + \varepsilon h'(x))$ and compute $\partial_\varepsilon \Phi$ via the multivariable chain rule. The $C^2$ regularity of $L$, combined with the compactness of $[a,b]$ and the boundedness of $y, y', h, h'$, yields a uniform bound on $|\partial_\varepsilon \Phi|$ for $\varepsilon$ near $0$, providing an integrable dominator. The Leibniz integral rule then justifies interchanging differentiation and integration, and evaluating at $\varepsilon = 0$ yields the formula.
[/proofplan]
[step:Reduce to a parametric integral and compute $\partial_\varepsilon \Phi$ via the chain rule]
Define the parametric integrand
\begin{align*}
\Phi: [a,b] \times \mathbb{R} &\to \mathbb{R} \\
(x, \varepsilon) &\mapsto L\!\left(x,\; y(x) + \varepsilon h(x),\; y'(x) + \varepsilon h'(x)\right).
\end{align*}
Then $J[y + \varepsilon h] = \int_a^b \Phi(x, \varepsilon) \, d\mathcal{L}^1(x)$ for every $\varepsilon \in \mathbb{R}$.
For each fixed $x \in [a,b]$, the map $\varepsilon \mapsto \Phi(x, \varepsilon)$ is the composition $L \circ \alpha_x$, where
\begin{align*}
\alpha_x: \mathbb{R} &\to [a,b] \times \mathbb{R}^n \times \mathbb{R}^n \\
\varepsilon &\mapsto \left(x,\; y(x) + \varepsilon h(x),\; y'(x) + \varepsilon h'(x)\right)
\end{align*}
is affine, hence smooth. Since $L$ is $C^2$ and $\alpha_x$ is smooth, the multivariable chain rule gives that $\varepsilon \mapsto \Phi(x, \varepsilon)$ is differentiable with
\begin{align*}
\partial_\varepsilon \Phi(x, \varepsilon) &= \sum_{i=1}^n \frac{\partial L}{\partial u_i}\!\left(x,\; y(x) + \varepsilon h(x),\; y'(x) + \varepsilon h'(x)\right) h_i(x) \\
&\quad + \sum_{i=1}^n \frac{\partial L}{\partial p_i}\!\left(x,\; y(x) + \varepsilon h(x),\; y'(x) + \varepsilon h'(x)\right) h'_i(x).
\end{align*}
In vector notation:
\begin{align*}
\partial_\varepsilon \Phi(x, \varepsilon) = \frac{\partial L}{\partial u}\!\left(x,\; y(x) + \varepsilon h(x),\; y'(x) + \varepsilon h'(x)\right) \cdot h(x) + \frac{\partial L}{\partial p}\!\left(x,\; y(x) + \varepsilon h(x),\; y'(x) + \varepsilon h'(x)\right) \cdot h'(x).
\end{align*}
[guided]
The first variation $\delta J[y; h]$ is defined as $\frac{d}{d\varepsilon}\big|_{\varepsilon=0} J[y + \varepsilon h]$, so the problem reduces to differentiating the integral $J[y + \varepsilon h] = \int_a^b L(x, y(x) + \varepsilon h(x), y'(x) + \varepsilon h'(x)) \, d\mathcal{L}^1(x)$ with respect to $\varepsilon$. Since $J[y + \varepsilon h]$ is itself an integral over $x$, we need to interchange the $\varepsilon$-derivative and the $x$-integral. Before justifying that interchange (Step 2), we first compute what $\partial_\varepsilon$ of the integrand equals.
Define the parametric integrand
\begin{align*}
\Phi: [a,b] \times \mathbb{R} &\to \mathbb{R} \\
(x, \varepsilon) &\mapsto L\!\left(x,\; y(x) + \varepsilon h(x),\; y'(x) + \varepsilon h'(x)\right),
\end{align*}
so that $J[y + \varepsilon h] = \int_a^b \Phi(x, \varepsilon) \, d\mathcal{L}^1(x)$.
For each fixed $x \in [a,b]$, the map $\varepsilon \mapsto \Phi(x, \varepsilon)$ is a composition. The inner map is affine:
\begin{align*}
\alpha_x: \mathbb{R} &\to [a,b] \times \mathbb{R}^n \times \mathbb{R}^n \\
\varepsilon &\mapsto \left(x,\; y(x) + \varepsilon h(x),\; y'(x) + \varepsilon h'(x)\right),
\end{align*}
and the outer map is $L$, which is $C^2$ by hypothesis. Since an affine map is smooth, the composition $\Phi(x, \cdot) = L \circ \alpha_x$ is differentiable (in fact $C^2$ in $\varepsilon$). The multivariable chain rule gives
\begin{align*}
\partial_\varepsilon \Phi(x, \varepsilon) &= \sum_{i=1}^n \frac{\partial L}{\partial u_i}\!\left(x,\; y(x) + \varepsilon h(x),\; y'(x) + \varepsilon h'(x)\right) \cdot \frac{d}{d\varepsilon}\!\left[y_i(x) + \varepsilon h_i(x)\right] \\
&\quad + \sum_{i=1}^n \frac{\partial L}{\partial p_i}\!\left(x,\; y(x) + \varepsilon h(x),\; y'(x) + \varepsilon h'(x)\right) \cdot \frac{d}{d\varepsilon}\!\left[y'_i(x) + \varepsilon h'_i(x)\right].
\end{align*}
The $\varepsilon$-derivatives of the affine components are $\frac{d}{d\varepsilon}[y_i(x) + \varepsilon h_i(x)] = h_i(x)$ and $\frac{d}{d\varepsilon}[y'_i(x) + \varepsilon h'_i(x)] = h'_i(x)$. Note that $x$ is held fixed, so the first argument of $L$ contributes $\frac{\partial L}{\partial x} \cdot \frac{dx}{d\varepsilon} = 0$. Substituting:
\begin{align*}
\partial_\varepsilon \Phi(x, \varepsilon) &= \sum_{i=1}^n \frac{\partial L}{\partial u_i}\!\left(x,\; y(x) + \varepsilon h(x),\; y'(x) + \varepsilon h'(x)\right) h_i(x) \\
&\quad + \sum_{i=1}^n \frac{\partial L}{\partial p_i}\!\left(x,\; y(x) + \varepsilon h(x),\; y'(x) + \varepsilon h'(x)\right) h'_i(x).
\end{align*}
In compact vector notation, this reads
\begin{align*}
\partial_\varepsilon \Phi(x, \varepsilon) = \frac{\partial L}{\partial u}\!\left(x,\; y(x) + \varepsilon h(x),\; y'(x) + \varepsilon h'(x)\right) \cdot h(x) + \frac{\partial L}{\partial p}\!\left(x,\; y(x) + \varepsilon h(x),\; y'(x) + \varepsilon h'(x)\right) \cdot h'(x).
\end{align*}
The two terms have a clean interpretation: the first captures how the integrand responds to perturbing the curve values $y(x) \rightsquigarrow y(x) + \varepsilon h(x)$, and the second captures the response to perturbing the derivatives $y'(x) \rightsquigarrow y'(x) + \varepsilon h'(x)$.
[/guided]
[/step]
[step:Bound $|\partial_\varepsilon \Phi|$ uniformly on $[a,b] \times [-1,1]$ to obtain an integrable dominator]
Since $y, h \in C^1([a,b], \mathbb{R}^n)$ and $[a,b]$ is compact, the continuous functions $y, y', h, h': [a,b] \to \mathbb{R}^n$ are bounded. Define
\begin{align*}
M_y := \max_{x \in [a,b]} |y(x)|, \quad M_{y'} := \max_{x \in [a,b]} |y'(x)|, \quad M_h := \max_{x \in [a,b]} |h(x)|, \quad M_{h'} := \max_{x \in [a,b]} |h'(x)|.
\end{align*}
For every $(x, \varepsilon) \in [a,b] \times [-1, 1]$, the triangle inequality gives $|y(x) + \varepsilon h(x)| \leq M_y + M_h$ and $|y'(x) + \varepsilon h'(x)| \leq M_{y'} + M_{h'}$. Therefore the argument of $L$ in $\Phi(x, \varepsilon)$ lies in the compact set
\begin{align*}
K := [a,b] \times \overline{B}(0,\, M_y + M_h) \times \overline{B}(0,\, M_{y'} + M_{h'}) \subset [a,b] \times \mathbb{R}^n \times \mathbb{R}^n.
\end{align*}
Since $L \in C^2([a,b] \times \mathbb{R}^n \times \mathbb{R}^n)$, its first-order partial derivatives $\frac{\partial L}{\partial u_i}$ and $\frac{\partial L}{\partial p_i}$ are of class $C^1$, hence continuous. By the extreme value theorem applied to each $\left|\frac{\partial L}{\partial u_i}\right|$ and $\left|\frac{\partial L}{\partial p_i}\right|$ on the compact set $K$:
\begin{align*}
C_u := \max_{1 \leq i \leq n}\; \max_{(x,u,p) \in K} \left|\frac{\partial L}{\partial u_i}(x, u, p)\right| < \infty, \qquad C_p := \max_{1 \leq i \leq n}\; \max_{(x,u,p) \in K} \left|\frac{\partial L}{\partial p_i}(x, u, p)\right| < \infty.
\end{align*}
Applying the Cauchy–Schwarz inequality on $\mathbb{R}^n$ to each dot product in the expression for $\partial_\varepsilon \Phi$ from the previous step, and bounding each gradient norm by $\left|\frac{\partial L}{\partial u}\right| = \left(\sum_{i=1}^n \left|\frac{\partial L}{\partial u_i}\right|^2\right)^{1/2} \leq \sqrt{n}\, C_u$ (and likewise for $\frac{\partial L}{\partial p}$):
\begin{align*}
|\partial_\varepsilon \Phi(x, \varepsilon)| &\leq \left|\frac{\partial L}{\partial u}(\cdots)\right| |h(x)| + \left|\frac{\partial L}{\partial p}(\cdots)\right| |h'(x)| \leq \sqrt{n}\, C_u \cdot M_h + \sqrt{n}\, C_p \cdot M_{h'} =: M
\end{align*}
for all $(x, \varepsilon) \in [a,b] \times [-1,1]$, where $(\cdots)$ abbreviates the common argument $(x, y(x) + \varepsilon h(x), y'(x) + \varepsilon h'(x)) \in K$. The constant function $g: [a,b] \to [0,\infty)$ defined by $g(x) := M$ is integrable since $\int_a^b M \, d\mathcal{L}^1(x) = M(b-a) < \infty$.
[guided]
To differentiate under the integral sign, we will invoke the Leibniz integral rule, whose key hypothesis is the existence of an integrable function that dominates $|\partial_\varepsilon \Phi(x, \varepsilon)|$ uniformly in $\varepsilon$ near $0$. How do we manufacture such a bound? The strategy is: (a) confine the arguments of $L$ to a compact set, (b) use continuity of the partial derivatives on that compact set to get a uniform pointwise bound, and (c) combine with the boundedness of $h$ and $h'$.
**Part (a): confine to a compact set.** Since $y, h \in C^1([a,b], \mathbb{R}^n)$ and $[a,b]$ is compact, the continuous functions $y, y', h, h': [a,b] \to \mathbb{R}^n$ attain their maxima. Define
\begin{align*}
M_y := \max_{x \in [a,b]} |y(x)|, \quad M_{y'} := \max_{x \in [a,b]} |y'(x)|, \quad M_h := \max_{x \in [a,b]} |h(x)|, \quad M_{h'} := \max_{x \in [a,b]} |h'(x)|.
\end{align*}
For $\varepsilon \in [-1,1]$, the triangle inequality gives $|y(x) + \varepsilon h(x)| \leq M_y + |\varepsilon| M_h \leq M_y + M_h$ and $|y'(x) + \varepsilon h'(x)| \leq M_{y'} + M_{h'}$. So the argument of $L$ in $\Phi(x, \varepsilon)$ is confined to the compact set
\begin{align*}
K := [a,b] \times \overline{B}(0,\, M_y + M_h) \times \overline{B}(0,\, M_{y'} + M_{h'}) \subset [a,b] \times \mathbb{R}^n \times \mathbb{R}^n.
\end{align*}
Why does bounding on $K$ suffice? Because we only need to dominate $|\partial_\varepsilon \Phi|$ for $\varepsilon$ in *some* neighbourhood of $0$ — the interval $[-1, 1]$ is such a neighbourhood, and it determines $K$.
**Part (b): bound the partial derivatives of $L$ on $K$.** Since $L \in C^2$, its first-order partial derivatives $\frac{\partial L}{\partial u_i}$ and $\frac{\partial L}{\partial p_i}$ ($1 \leq i \leq n$) are of class $C^1$ and in particular continuous. A continuous real-valued function on a compact set attains its maximum (extreme value theorem), so
\begin{align*}
C_u := \max_{1 \leq i \leq n}\; \max_{(x,u,p) \in K} \left|\frac{\partial L}{\partial u_i}(x, u, p)\right| < \infty, \qquad C_p := \max_{1 \leq i \leq n}\; \max_{(x,u,p) \in K} \left|\frac{\partial L}{\partial p_i}(x, u, p)\right| < \infty.
\end{align*}
**Part (c): combine the bounds.** From the formula in the previous step, the Cauchy–Schwarz inequality $|a \cdot b| \leq |a||b|$ on $\mathbb{R}^n$ gives
\begin{align*}
|\partial_\varepsilon \Phi(x, \varepsilon)| &\leq \left|\frac{\partial L}{\partial u}(\cdots)\right| |h(x)| + \left|\frac{\partial L}{\partial p}(\cdots)\right| |h'(x)|,
\end{align*}
where $(\cdots)$ abbreviates the common argument $(x, y(x) + \varepsilon h(x), y'(x) + \varepsilon h'(x)) \in K$. Since each component $\left|\frac{\partial L}{\partial u_i}\right| \leq C_u$ on $K$, we have $\left|\frac{\partial L}{\partial u}\right| = \left(\sum_{i=1}^n \left|\frac{\partial L}{\partial u_i}\right|^2\right)^{1/2} \leq \sqrt{n}\, C_u$, and similarly $\left|\frac{\partial L}{\partial p}\right| \leq \sqrt{n}\, C_p$. Therefore
\begin{align*}
|\partial_\varepsilon \Phi(x, \varepsilon)| \leq \sqrt{n}\, C_u \cdot M_h + \sqrt{n}\, C_p \cdot M_{h'} =: M
\end{align*}
for all $(x, \varepsilon) \in [a,b] \times [-1,1]$. The constant $M$ is finite and depends on $L$, $y$, $h$, and $n$, but not on $x$ or $\varepsilon$. The dominating function $g(x) := M$ is integrable over $[a,b]$ since $\int_a^b M \, d\mathcal{L}^1(x) = M(b-a) < \infty$.
*Remark.* Only $C^1$ regularity of $L$ is needed for this argument: $L \in C^1$ already implies the first partial derivatives are continuous, which is all we use to obtain the uniform bound. The full $C^2$ hypothesis is not consumed here — it is typically needed for the Euler–Lagrange equation, which differentiates the first variation formula once more with respect to $x$.
[/guided]
[/step]
[step:Apply the Leibniz integral rule and evaluate at $\varepsilon = 0$]
We verify the hypotheses of the Leibniz integral rule (differentiation of a parameter-dependent Lebesgue integral):
**(i)** For each $x \in [a,b]$, the map $\varepsilon \mapsto \Phi(x, \varepsilon)$ is differentiable on $(-1, 1)$ — established in the first step via the chain rule.
**(ii)** For each $\varepsilon \in (-1, 1)$, the map $x \mapsto \Phi(x, \varepsilon)$ is continuous on $[a,b]$ (as the composition of continuous functions $y, y', h, h'$ with the $C^2$ map $L$), hence Lebesgue measurable and integrable on the bounded interval $[a,b]$.
**(iii)** There exists an integrable function $g: [a,b] \to [0, \infty)$ such that $|\partial_\varepsilon \Phi(x, \varepsilon)| \leq g(x)$ for all $(x, \varepsilon) \in [a,b] \times (-1,1)$ — established in the second step with $g \equiv M$.
By the Leibniz integral rule, for every $\varepsilon \in (-1, 1)$:
\begin{align*}
\frac{d}{d\varepsilon} J[y + \varepsilon h] = \frac{d}{d\varepsilon} \int_a^b \Phi(x, \varepsilon) \, d\mathcal{L}^1(x) = \int_a^b \partial_\varepsilon \Phi(x, \varepsilon) \, d\mathcal{L}^1(x).
\end{align*}
Evaluating at $\varepsilon = 0$ and substituting the chain rule formula from the first step:
\begin{align*}
\delta J[y; h] = \frac{d}{d\varepsilon}\bigg|_{\varepsilon=0} J[y + \varepsilon h] = \int_a^b \left[\frac{\partial L}{\partial u}(x, y(x), y'(x)) \cdot h(x) + \frac{\partial L}{\partial p}(x, y(x), y'(x)) \cdot h'(x)\right] d\mathcal{L}^1(x).
\end{align*}
[guided]
We are now ready to interchange the $\varepsilon$-derivative and the $x$-integral. The tool is the Leibniz integral rule, which states: if $\Phi: [a,b] \times I \to \mathbb{R}$ (where $I \subseteq \mathbb{R}$ is an open interval) satisfies (i) $\varepsilon \mapsto \Phi(x, \varepsilon)$ is differentiable for each $x$, (ii) $x \mapsto \Phi(x, \varepsilon)$ is integrable for each $\varepsilon$, and (iii) there exists an integrable function $g: [a,b] \to [0,\infty)$ with $|\partial_\varepsilon \Phi(x, \varepsilon)| \leq g(x)$ for all $(x, \varepsilon)$, then
\begin{align*}
\frac{d}{d\varepsilon} \int_a^b \Phi(x, \varepsilon) \, d\mathcal{L}^1(x) = \int_a^b \partial_\varepsilon \Phi(x, \varepsilon) \, d\mathcal{L}^1(x).
\end{align*}
(This rule is proved by applying the dominated convergence theorem to the difference quotients $\frac{\Phi(x, \varepsilon + \delta) - \Phi(x, \varepsilon)}{\delta}$ as $\delta \to 0$, with dominator $g$.)
We verify each hypothesis for $I = (-1, 1)$:
**(i)** For each fixed $x \in [a,b]$, the map $\varepsilon \mapsto \Phi(x, \varepsilon) = L \circ \alpha_x(\varepsilon)$ is differentiable on $(-1,1)$, as shown in the first step via the chain rule applied to the composition of the $C^2$ map $L$ with the smooth affine map $\alpha_x$.
**(ii)** For each fixed $\varepsilon \in (-1,1)$, the map $x \mapsto \Phi(x, \varepsilon) = L(x, y(x) + \varepsilon h(x), y'(x) + \varepsilon h'(x))$ is continuous on $[a,b]$, because it is a composition of the continuous functions $x \mapsto (x, y(x) + \varepsilon h(x), y'(x) + \varepsilon h'(x))$ and the $C^2$ (hence continuous) map $L$. A continuous function on the compact interval $[a,b]$ is bounded, and a bounded measurable function on a set of finite Lebesgue measure is integrable.
**(iii)** The integrable dominator $g(x) = M$ was constructed in the second step. The bound $|\partial_\varepsilon \Phi(x, \varepsilon)| \leq M$ holds for all $(x, \varepsilon) \in [a,b] \times [-1,1]$, which contains $[a,b] \times (-1,1)$.
All three conditions are satisfied. The Leibniz integral rule gives
\begin{align*}
\frac{d}{d\varepsilon} J[y + \varepsilon h] = \frac{d}{d\varepsilon} \int_a^b \Phi(x, \varepsilon) \, d\mathcal{L}^1(x) = \int_a^b \partial_\varepsilon \Phi(x, \varepsilon) \, d\mathcal{L}^1(x)
\end{align*}
for every $\varepsilon \in (-1,1)$. In particular, this identity holds at $\varepsilon = 0$. Substituting the chain rule formula from the first step with $\varepsilon = 0$ — so that the arguments of the partial derivatives simplify from $(x, y(x) + \varepsilon h(x), y'(x) + \varepsilon h'(x))$ to $(x, y(x), y'(x))$ — we obtain
\begin{align*}
\delta J[y; h] &= \frac{d}{d\varepsilon}\bigg|_{\varepsilon=0} J[y + \varepsilon h] = \int_a^b \partial_\varepsilon \Phi(x, 0) \, d\mathcal{L}^1(x) \\
&= \int_a^b \left[\frac{\partial L}{\partial u}(x, y(x), y'(x)) \cdot h(x) + \frac{\partial L}{\partial p}(x, y(x), y'(x)) \cdot h'(x)\right] d\mathcal{L}^1(x).
\end{align*}
This completes the proof: the first variation $\delta J[y; h]$ exists (the derivative at $\varepsilon = 0$ is well-defined) and equals the stated integral formula.
[/guided]
[/step]