Formula for the First Variation — Statement & Proof

Theorem

Edit Issues Pull Requests Attributions Admin

Discussion

No discussion available for this theorem.

Proof

[proofplan] We express $J[y + \varepsilon h]$ as the integral of a parametric function $\Phi(x, \varepsilon) = L(x, y(x) + \varepsilon h(x), y'(x) + \varepsilon h'(x))$ and compute $\partial_\varepsilon \Phi$ via the multivariable chain rule. The $C^2$ regularity of $L$, combined with the compactness of $[a,b]$ and the boundedness of $y, y', h, h'$, yields a uniform bound on $|\partial_\varepsilon \Phi|$ for $\varepsilon$ near $0$, providing an integrable dominator. The Leibniz integral rule then justifies interchanging differentiation and integration, and evaluating at $\varepsilon = 0$ yields the formula. [/proofplan] [step:Reduce to a parametric integral and compute $\partial_\varepsilon \Phi$ via the chain rule] Define the parametric integrand \begin{align*} \Phi: [a,b] \times \mathbb{R} &\to \mathbb{R} \\ (x, \varepsilon) &\mapsto L\!\left(x,\; y(x) + \varepsilon h(x),\; y'(x) + \varepsilon h'(x)\right). \end{align*} Then $J[y + \varepsilon h] = \int_a^b \Phi(x, \varepsilon) \, d\mathcal{L}^1(x)$ for every $\varepsilon \in \mathbb{R}$. For each fixed $x \in [a,b]$, the map $\varepsilon \mapsto \Phi(x, \varepsilon)$ is the composition $L \circ \alpha_x$, where \begin{align*} \alpha_x: \mathbb{R} &\to [a,b] \times \mathbb{R}^n \times \mathbb{R}^n \\ \varepsilon &\mapsto \left(x,\; y(x) + \varepsilon h(x),\; y'(x) + \varepsilon h'(x)\right) \end{align*} is affine, hence smooth. Since $L$ is $C^2$ and $\alpha_x$ is smooth, the multivariable chain rule gives that $\varepsilon \mapsto \Phi(x, \varepsilon)$ is differentiable with \begin{align*} \partial_\varepsilon \Phi(x, \varepsilon) &= \sum_{i=1}^n \frac{\partial L}{\partial u_i}\!\left(x,\; y(x) + \varepsilon h(x),\; y'(x) + \varepsilon h'(x)\right) h_i(x) \\ &\quad + \sum_{i=1}^n \frac{\partial L}{\partial p_i}\!\left(x,\; y(x) + \varepsilon h(x),\; y'(x) + \varepsilon h'(x)\right) h'_i(x). \end{align*} In vector notation: \begin{align*} \partial_\varepsilon \Phi(x, \varepsilon) = \frac{\partial L}{\partial u}\!\left(x,\; y(x) + \varepsilon h(x),\; y'(x) + \varepsilon h'(x)\right) \cdot h(x) + \frac{\partial L}{\partial p}\!\left(x,\; y(x) + \varepsilon h(x),\; y'(x) + \varepsilon h'(x)\right) \cdot h'(x). \end{align*} [guided] The first variation $\delta J[y; h]$ is defined as $\frac{d}{d\varepsilon}\big|_{\varepsilon=0} J[y + \varepsilon h]$, so the problem reduces to differentiating the integral $J[y + \varepsilon h] = \int_a^b L(x, y(x) + \varepsilon h(x), y'(x) + \varepsilon h'(x)) \, d\mathcal{L}^1(x)$ with respect to $\varepsilon$. Since $J[y + \varepsilon h]$ is itself an integral over $x$, we need to interchange the $\varepsilon$-derivative and the $x$-integral. Before justifying that interchange (Step 2), we first compute what $\partial_\varepsilon$ of the integrand equals. Define the parametric integrand \begin{align*} \Phi: [a,b] \times \mathbb{R} &\to \mathbb{R} \\ (x, \varepsilon) &\mapsto L\!\left(x,\; y(x) + \varepsilon h(x),\; y'(x) + \varepsilon h'(x)\right), \end{align*} so that $J[y + \varepsilon h] = \int_a^b \Phi(x, \varepsilon) \, d\mathcal{L}^1(x)$. For each fixed $x \in [a,b]$, the map $\varepsilon \mapsto \Phi(x, \varepsilon)$ is a composition. The inner map is affine: \begin{align*} \alpha_x: \mathbb{R} &\to [a,b] \times \mathbb{R}^n \times \mathbb{R}^n \\ \varepsilon &\mapsto \left(x,\; y(x) + \varepsilon h(x),\; y'(x) + \varepsilon h'(x)\right), \end{align*} and the outer map is $L$, which is $C^2$ by hypothesis. Since an affine map is smooth, the composition $\Phi(x, \cdot) = L \circ \alpha_x$ is differentiable (in fact $C^2$ in $\varepsilon$). The multivariable chain rule gives \begin{align*} \partial_\varepsilon \Phi(x, \varepsilon) &= \sum_{i=1}^n \frac{\partial L}{\partial u_i}\!\left(x,\; y(x) + \varepsilon h(x),\; y'(x) + \varepsilon h'(x)\right) \cdot \frac{d}{d\varepsilon}\!\left[y_i(x) + \varepsilon h_i(x)\right] \\ &\quad + \sum_{i=1}^n \frac{\partial L}{\partial p_i}\!\left(x,\; y(x) + \varepsilon h(x),\; y'(x) + \varepsilon h'(x)\right) \cdot \frac{d}{d\varepsilon}\!\left[y'_i(x) + \varepsilon h'_i(x)\right]. \end{align*} The $\varepsilon$-derivatives of the affine components are $\frac{d}{d\varepsilon}[y_i(x) + \varepsilon h_i(x)] = h_i(x)$ and $\frac{d}{d\varepsilon}[y'_i(x) + \varepsilon h'_i(x)] = h'_i(x)$. Note that $x$ is held fixed, so the first argument of $L$ contributes $\frac{\partial L}{\partial x} \cdot \frac{dx}{d\varepsilon} = 0$. Substituting: \begin{align*} \partial_\varepsilon \Phi(x, \varepsilon) &= \sum_{i=1}^n \frac{\partial L}{\partial u_i}\!\left(x,\; y(x) + \varepsilon h(x),\; y'(x) + \varepsilon h'(x)\right) h_i(x) \\ &\quad + \sum_{i=1}^n \frac{\partial L}{\partial p_i}\!\left(x,\; y(x) + \varepsilon h(x),\; y'(x) + \varepsilon h'(x)\right) h'_i(x). \end{align*} In compact vector notation, this reads \begin{align*} \partial_\varepsilon \Phi(x, \varepsilon) = \frac{\partial L}{\partial u}\!\left(x,\; y(x) + \varepsilon h(x),\; y'(x) + \varepsilon h'(x)\right) \cdot h(x) + \frac{\partial L}{\partial p}\!\left(x,\; y(x) + \varepsilon h(x),\; y'(x) + \varepsilon h'(x)\right) \cdot h'(x). \end{align*} The two terms have a clean interpretation: the first captures how the integrand responds to perturbing the curve values $y(x) \rightsquigarrow y(x) + \varepsilon h(x)$, and the second captures the response to perturbing the derivatives $y'(x) \rightsquigarrow y'(x) + \varepsilon h'(x)$. [/guided] [/step] [step:Bound $|\partial_\varepsilon \Phi|$ uniformly on $[a,b] \times [-1,1]$ to obtain an integrable dominator] Since $y, h \in C^1([a,b], \mathbb{R}^n)$ and $[a,b]$ is compact, the continuous functions $y, y', h, h': [a,b] \to \mathbb{R}^n$ are bounded. Define \begin{align*} M_y := \max_{x \in [a,b]} |y(x)|, \quad M_{y'} := \max_{x \in [a,b]} |y'(x)|, \quad M_h := \max_{x \in [a,b]} |h(x)|, \quad M_{h'} := \max_{x \in [a,b]} |h'(x)|. \end{align*} For every $(x, \varepsilon) \in [a,b] \times [-1, 1]$, the triangle inequality gives $|y(x) + \varepsilon h(x)| \leq M_y + M_h$ and $|y'(x) + \varepsilon h'(x)| \leq M_{y'} + M_{h'}$. Therefore the argument of $L$ in $\Phi(x, \varepsilon)$ lies in the compact set \begin{align*} K := [a,b] \times \overline{B}(0,\, M_y + M_h) \times \overline{B}(0,\, M_{y'} + M_{h'}) \subset [a,b] \times \mathbb{R}^n \times \mathbb{R}^n. \end{align*} Since $L \in C^2([a,b] \times \mathbb{R}^n \times \mathbb{R}^n)$, its first-order partial derivatives $\frac{\partial L}{\partial u_i}$ and $\frac{\partial L}{\partial p_i}$ are of class $C^1$, hence continuous. By the extreme value theorem applied to each $\left|\frac{\partial L}{\partial u_i}\right|$ and $\left|\frac{\partial L}{\partial p_i}\right|$ on the compact set $K$: \begin{align*} C_u := \max_{1 \leq i \leq n}\; \max_{(x,u,p) \in K} \left|\frac{\partial L}{\partial u_i}(x, u, p)\right| < \infty, \qquad C_p := \max_{1 \leq i \leq n}\; \max_{(x,u,p) \in K} \left|\frac{\partial L}{\partial p_i}(x, u, p)\right| < \infty. \end{align*} Applying the Cauchy–Schwarz inequality on $\mathbb{R}^n$ to each dot product in the expression for $\partial_\varepsilon \Phi$ from the previous step, and bounding each gradient norm by $\left|\frac{\partial L}{\partial u}\right| = \left(\sum_{i=1}^n \left|\frac{\partial L}{\partial u_i}\right|^2\right)^{1/2} \leq \sqrt{n}\, C_u$ (and likewise for $\frac{\partial L}{\partial p}$): \begin{align*} |\partial_\varepsilon \Phi(x, \varepsilon)| &\leq \left|\frac{\partial L}{\partial u}(\cdots)\right| |h(x)| + \left|\frac{\partial L}{\partial p}(\cdots)\right| |h'(x)| \leq \sqrt{n}\, C_u \cdot M_h + \sqrt{n}\, C_p \cdot M_{h'} =: M \end{align*} for all $(x, \varepsilon) \in [a,b] \times [-1,1]$, where $(\cdots)$ abbreviates the common argument $(x, y(x) + \varepsilon h(x), y'(x) + \varepsilon h'(x)) \in K$. The constant function $g: [a,b] \to [0,\infty)$ defined by $g(x) := M$ is integrable since $\int_a^b M \, d\mathcal{L}^1(x) = M(b-a) < \infty$. [guided] To differentiate under the integral sign, we will invoke the Leibniz integral rule, whose key hypothesis is the existence of an integrable function that dominates $|\partial_\varepsilon \Phi(x, \varepsilon)|$ uniformly in $\varepsilon$ near $0$. How do we manufacture such a bound? The strategy is: (a) confine the arguments of $L$ to a compact set, (b) use continuity of the partial derivatives on that compact set to get a uniform pointwise bound, and (c) combine with the boundedness of $h$ and $h'$. **Part (a): confine to a compact set.** Since $y, h \in C^1([a,b], \mathbb{R}^n)$ and $[a,b]$ is compact, the continuous functions $y, y', h, h': [a,b] \to \mathbb{R}^n$ attain their maxima. Define \begin{align*} M_y := \max_{x \in [a,b]} |y(x)|, \quad M_{y'} := \max_{x \in [a,b]} |y'(x)|, \quad M_h := \max_{x \in [a,b]} |h(x)|, \quad M_{h'} := \max_{x \in [a,b]} |h'(x)|. \end{align*} For $\varepsilon \in [-1,1]$, the triangle inequality gives $|y(x) + \varepsilon h(x)| \leq M_y + |\varepsilon| M_h \leq M_y + M_h$ and $|y'(x) + \varepsilon h'(x)| \leq M_{y'} + M_{h'}$. So the argument of $L$ in $\Phi(x, \varepsilon)$ is confined to the compact set \begin{align*} K := [a,b] \times \overline{B}(0,\, M_y + M_h) \times \overline{B}(0,\, M_{y'} + M_{h'}) \subset [a,b] \times \mathbb{R}^n \times \mathbb{R}^n. \end{align*} Why does bounding on $K$ suffice? Because we only need to dominate $|\partial_\varepsilon \Phi|$ for $\varepsilon$ in *some* neighbourhood of $0$ — the interval $[-1, 1]$ is such a neighbourhood, and it determines $K$. **Part (b): bound the partial derivatives of $L$ on $K$.** Since $L \in C^2$, its first-order partial derivatives $\frac{\partial L}{\partial u_i}$ and $\frac{\partial L}{\partial p_i}$ ($1 \leq i \leq n$) are of class $C^1$ and in particular continuous. A continuous real-valued function on a compact set attains its maximum (extreme value theorem), so \begin{align*} C_u := \max_{1 \leq i \leq n}\; \max_{(x,u,p) \in K} \left|\frac{\partial L}{\partial u_i}(x, u, p)\right| < \infty, \qquad C_p := \max_{1 \leq i \leq n}\; \max_{(x,u,p) \in K} \left|\frac{\partial L}{\partial p_i}(x, u, p)\right| < \infty. \end{align*} **Part (c): combine the bounds.** From the formula in the previous step, the Cauchy–Schwarz inequality $|a \cdot b| \leq |a||b|$ on $\mathbb{R}^n$ gives \begin{align*} |\partial_\varepsilon \Phi(x, \varepsilon)| &\leq \left|\frac{\partial L}{\partial u}(\cdots)\right| |h(x)| + \left|\frac{\partial L}{\partial p}(\cdots)\right| |h'(x)|, \end{align*} where $(\cdots)$ abbreviates the common argument $(x, y(x) + \varepsilon h(x), y'(x) + \varepsilon h'(x)) \in K$. Since each component $\left|\frac{\partial L}{\partial u_i}\right| \leq C_u$ on $K$, we have $\left|\frac{\partial L}{\partial u}\right| = \left(\sum_{i=1}^n \left|\frac{\partial L}{\partial u_i}\right|^2\right)^{1/2} \leq \sqrt{n}\, C_u$, and similarly $\left|\frac{\partial L}{\partial p}\right| \leq \sqrt{n}\, C_p$. Therefore \begin{align*} |\partial_\varepsilon \Phi(x, \varepsilon)| \leq \sqrt{n}\, C_u \cdot M_h + \sqrt{n}\, C_p \cdot M_{h'} =: M \end{align*} for all $(x, \varepsilon) \in [a,b] \times [-1,1]$. The constant $M$ is finite and depends on $L$, $y$, $h$, and $n$, but not on $x$ or $\varepsilon$. The dominating function $g(x) := M$ is integrable over $[a,b]$ since $\int_a^b M \, d\mathcal{L}^1(x) = M(b-a) < \infty$. *Remark.* Only $C^1$ regularity of $L$ is needed for this argument: $L \in C^1$ already implies the first partial derivatives are continuous, which is all we use to obtain the uniform bound. The full $C^2$ hypothesis is not consumed here — it is typically needed for the Euler–Lagrange equation, which differentiates the first variation formula once more with respect to $x$. [/guided] [/step] [step:Apply the Leibniz integral rule and evaluate at $\varepsilon = 0$] We verify the hypotheses of the Leibniz integral rule (differentiation of a parameter-dependent Lebesgue integral): **(i)** For each $x \in [a,b]$, the map $\varepsilon \mapsto \Phi(x, \varepsilon)$ is differentiable on $(-1, 1)$ — established in the first step via the chain rule. **(ii)** For each $\varepsilon \in (-1, 1)$, the map $x \mapsto \Phi(x, \varepsilon)$ is continuous on $[a,b]$ (as the composition of continuous functions $y, y', h, h'$ with the $C^2$ map $L$), hence Lebesgue measurable and integrable on the bounded interval $[a,b]$. **(iii)** There exists an integrable function $g: [a,b] \to [0, \infty)$ such that $|\partial_\varepsilon \Phi(x, \varepsilon)| \leq g(x)$ for all $(x, \varepsilon) \in [a,b] \times (-1,1)$ — established in the second step with $g \equiv M$. By the Leibniz integral rule, for every $\varepsilon \in (-1, 1)$: \begin{align*} \frac{d}{d\varepsilon} J[y + \varepsilon h] = \frac{d}{d\varepsilon} \int_a^b \Phi(x, \varepsilon) \, d\mathcal{L}^1(x) = \int_a^b \partial_\varepsilon \Phi(x, \varepsilon) \, d\mathcal{L}^1(x). \end{align*} Evaluating at $\varepsilon = 0$ and substituting the chain rule formula from the first step: \begin{align*} \delta J[y; h] = \frac{d}{d\varepsilon}\bigg|_{\varepsilon=0} J[y + \varepsilon h] = \int_a^b \left[\frac{\partial L}{\partial u}(x, y(x), y'(x)) \cdot h(x) + \frac{\partial L}{\partial p}(x, y(x), y'(x)) \cdot h'(x)\right] d\mathcal{L}^1(x). \end{align*} [guided] We are now ready to interchange the $\varepsilon$-derivative and the $x$-integral. The tool is the Leibniz integral rule, which states: if $\Phi: [a,b] \times I \to \mathbb{R}$ (where $I \subseteq \mathbb{R}$ is an open interval) satisfies (i) $\varepsilon \mapsto \Phi(x, \varepsilon)$ is differentiable for each $x$, (ii) $x \mapsto \Phi(x, \varepsilon)$ is integrable for each $\varepsilon$, and (iii) there exists an integrable function $g: [a,b] \to [0,\infty)$ with $|\partial_\varepsilon \Phi(x, \varepsilon)| \leq g(x)$ for all $(x, \varepsilon)$, then \begin{align*} \frac{d}{d\varepsilon} \int_a^b \Phi(x, \varepsilon) \, d\mathcal{L}^1(x) = \int_a^b \partial_\varepsilon \Phi(x, \varepsilon) \, d\mathcal{L}^1(x). \end{align*} (This rule is proved by applying the dominated convergence theorem to the difference quotients $\frac{\Phi(x, \varepsilon + \delta) - \Phi(x, \varepsilon)}{\delta}$ as $\delta \to 0$, with dominator $g$.) We verify each hypothesis for $I = (-1, 1)$: **(i)** For each fixed $x \in [a,b]$, the map $\varepsilon \mapsto \Phi(x, \varepsilon) = L \circ \alpha_x(\varepsilon)$ is differentiable on $(-1,1)$, as shown in the first step via the chain rule applied to the composition of the $C^2$ map $L$ with the smooth affine map $\alpha_x$. **(ii)** For each fixed $\varepsilon \in (-1,1)$, the map $x \mapsto \Phi(x, \varepsilon) = L(x, y(x) + \varepsilon h(x), y'(x) + \varepsilon h'(x))$ is continuous on $[a,b]$, because it is a composition of the continuous functions $x \mapsto (x, y(x) + \varepsilon h(x), y'(x) + \varepsilon h'(x))$ and the $C^2$ (hence continuous) map $L$. A continuous function on the compact interval $[a,b]$ is bounded, and a bounded measurable function on a set of finite Lebesgue measure is integrable. **(iii)** The integrable dominator $g(x) = M$ was constructed in the second step. The bound $|\partial_\varepsilon \Phi(x, \varepsilon)| \leq M$ holds for all $(x, \varepsilon) \in [a,b] \times [-1,1]$, which contains $[a,b] \times (-1,1)$. All three conditions are satisfied. The Leibniz integral rule gives \begin{align*} \frac{d}{d\varepsilon} J[y + \varepsilon h] = \frac{d}{d\varepsilon} \int_a^b \Phi(x, \varepsilon) \, d\mathcal{L}^1(x) = \int_a^b \partial_\varepsilon \Phi(x, \varepsilon) \, d\mathcal{L}^1(x) \end{align*} for every $\varepsilon \in (-1,1)$. In particular, this identity holds at $\varepsilon = 0$. Substituting the chain rule formula from the first step with $\varepsilon = 0$ — so that the arguments of the partial derivatives simplify from $(x, y(x) + \varepsilon h(x), y'(x) + \varepsilon h'(x))$ to $(x, y(x), y'(x))$ — we obtain \begin{align*} \delta J[y; h] &= \frac{d}{d\varepsilon}\bigg|_{\varepsilon=0} J[y + \varepsilon h] = \int_a^b \partial_\varepsilon \Phi(x, 0) \, d\mathcal{L}^1(x) \\ &= \int_a^b \left[\frac{\partial L}{\partial u}(x, y(x), y'(x)) \cdot h(x) + \frac{\partial L}{\partial p}(x, y(x), y'(x)) \cdot h'(x)\right] d\mathcal{L}^1(x). \end{align*} This completes the proof: the first variation $\delta J[y; h]$ exists (the derivative at $\varepsilon = 0$ is well-defined) and equals the stated integral formula. [/guided] [/step]

What brings you to Androma?

Start with a route through the knowledge graph.

Formula for the First Variation (Theorem # 3519)

Discussion

Proof

Explore Further

Sign in to Androma

Check your inbox

One last step

Formula for the First Variation (Theorem # 3519)

Discussion

Proof

Explore Further