Androma — The Home of Mathematics on the Internet

[guided]We now compute the KKT equations carefully, because the sign and scaling of the multiplier are the whole point of the theorem. The free finite-dimensional variables are $(x_1,\dots,x_N,u_0,\dots,u_{N-1})$; the initial value $x_0=x_0^*$ is fixed data. The defect constraint at index $k$ is the map $G_{k,h}:\mathbb{R}^{nN}\times\mathbb{R}^{mN}\to\mathbb{R}^n$ defined, with $x_0=x_0^*$ held fixed, by \begin{align*} G_{k,h}(x_1,\dots,x_N,u_0,\dots,u_{N-1})=\frac{x_{k+1}-x_k}{h_k}-f(x_k,u_k). \end{align*} The discrete Lagrangian $\mathcal{L}_h$ is the objective plus $\sum_{k=0}^{N-1}\lambda_{k,h}\cdot G_{k,h}$. We use the finite-dimensional KKT necessary condition for equality constraints: for a $C^1$ equality-constrained problem, LICQ at a local minimizer guarantees that Lagrange multipliers satisfying stationarity exist and are unique. Since LICQ is assumed at the local minimizer for these equality constraints, there is a unique family of defect multipliers $(\lambda_{k,h})_{k=0}^{N-1}$, and $\mathcal{L}_h$ is stationary with respect to every free variable. First vary $u_k$. Only the objective term $h_kL(x_k,u_k)$ and the defect term $\lambda_{k,h}\cdot G_{k,h}$ depend on $u_k$. Differentiating gives \begin{align*} 0=h_k\nabla_u L(x_{k,h},u_{k,h})-\nabla_u f(x_{k,h},u_{k,h})^\top\lambda_{k,h}. \end{align*} The recovered costate is defined by $\mu_{k,h}=-h_k^{-1}\lambda_{k,h}$, so $\lambda_{k,h}=-h_k\mu_{k,h}$. Substitution gives \begin{align*} 0=h_k\nabla_u L(x_{k,h},u_{k,h})+h_k\nabla_u f(x_{k,h},u_{k,h})^\top\mu_{k,h}. \end{align*} Since $h_k>0$, division by $h_k$ gives \begin{align*} 0=\nabla_u L(x_{k,h},u_{k,h})+\nabla_u f(x_{k,h},u_{k,h})^\top\mu_{k,h}. \end{align*} This is exactly $0=\nabla_u H(x_{k,h},\mu_{k,h},u_{k,h})$, because $H(x,p,u)=L(x,u)+p\cdot f(x,u)$. Next vary the terminal variable $x_N$. The variable $x_N$ appears in $\Phi(x_N)$ and in the last defect $G_{N-1,h}$ through the term $x_N/h_{N-1}$. 
Therefore stationarity gives \begin{align*} 0=\nabla\Phi(x_{N,h})+\frac{\lambda_{N-1,h}}{h_{N-1}}. \end{align*} Using $\mu_{N-1,h}=-h_{N-1}^{-1}\lambda_{N-1,h}$, this becomes \begin{align*} \mu_{N-1,h}=\nabla\Phi(x_{N,h}). \end{align*} This is the discrete transversality condition. Finally fix $1\le k\le N-1$ and vary the interior state $x_k$. This variable appears in three places: in the running cost $h_kL(x_k,u_k)$, in the previous defect $G_{k-1,h}$ through $x_k/h_{k-1}$, and in the current defect $G_{k,h}$ through $-x_k/h_k-f(x_k,u_k)$. Differentiating those three contributions gives \begin{align*} 0=h_k\nabla_x L(x_{k,h},u_{k,h})+\frac{\lambda_{k-1,h}}{h_{k-1}}-\frac{\lambda_{k,h}}{h_k}-\nabla_x f(x_{k,h},u_{k,h})^\top\lambda_{k,h}. \end{align*} Substituting $\lambda_{k-1,h}=-h_{k-1}\mu_{k-1,h}$ and $\lambda_{k,h}=-h_k\mu_{k,h}$ gives \begin{align*} 0=h_k\nabla_x L(x_{k,h},u_{k,h})-\mu_{k-1,h}+\mu_{k,h}+h_k\nabla_x f(x_{k,h},u_{k,h})^\top\mu_{k,h}. \end{align*} Rearranging yields the backward discrete adjoint equation \begin{align*} \mu_{k-1,h}-\mu_{k,h}-h_k\nabla_x f(x_{k,h},u_{k,h})^\top\mu_{k,h}=h_k\nabla_x L(x_{k,h},u_{k,h}). \end{align*} The appearance of $h_k$ in this recursion is a direct consequence of differentiating the cost and defect at node $k$; no reindexing is being hidden.[/guided]
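As a sanity check on the signs and scaling derived above, the following minimal Python sketch evaluates the transversality condition and the backward recursion on a small scalar problem, then compares $h_k\nabla_u H(x_k,\mu_k,u_k)$ with a central finite-difference gradient of the discrete objective regarded as a function of the controls alone. The choices $f(x,u)=\sin x+u$, $L(x,u)=\tfrac12(x^2+u^2)$, $\Phi(x)=\tfrac12x^2$, the mesh, the control values, and all function names are illustrative assumptions, not taken from the source. The states are eliminated through the defect constraints, so this is a reduced-gradient check rather than a solve of the full KKT system.

```python
import math

# Hypothetical scalar test problem (assumed for illustration only).
def f(x, u):     return math.sin(x) + u
def f_x(x, u):   return math.cos(x)
def f_u(x, u):   return 1.0
def L(x, u):     return 0.5 * (x * x + u * u)
def L_x(x, u):   return x
def L_u(x, u):   return u
def Phi(x):      return 0.5 * x * x
def Phi_x(x):    return x

def rollout(x0, us, hs):
    """Solve the defect constraints G_k = 0, i.e. x_{k+1} = x_k + h_k f(x_k, u_k)."""
    xs = [x0]
    for u, h in zip(us, hs):
        xs.append(xs[-1] + h * f(xs[-1], u))
    return xs

def objective(x0, us, hs):
    """Discrete objective sum_k h_k L(x_k, u_k) + Phi(x_N) along the rollout."""
    xs = rollout(x0, us, hs)
    return sum(h * L(x, u) for x, u, h in zip(xs, us, hs)) + Phi(xs[-1])

def costates(xs, us, hs):
    """Transversality mu_{N-1} = Phi'(x_N), then the backward adjoint recursion
    mu_{k-1} = mu_k + h_k (L_x + f_x * mu_k) evaluated at node k."""
    N = len(us)
    mu = [0.0] * N
    mu[N - 1] = Phi_x(xs[N])
    for k in range(N - 1, 0, -1):
        mu[k - 1] = mu[k] + hs[k] * (L_x(xs[k], us[k]) + f_x(xs[k], us[k]) * mu[k])
    return mu

def adjoint_gradient(x0, us, hs):
    """Reduced gradient dJ/du_k = h_k * grad_u H(x_k, mu_k, u_k)."""
    xs = rollout(x0, us, hs)
    mu = costates(xs, us, hs)
    return [h * (L_u(x, u) + f_u(x, u) * m) for x, u, h, m in zip(xs, us, hs, mu)]

def fd_gradient(x0, us, hs, eps=1e-6):
    """Central finite differences of the objective with respect to each control."""
    grad = []
    for k in range(len(us)):
        up = list(us); up[k] += eps
        dn = list(us); dn[k] -= eps
        grad.append((objective(x0, up, hs) - objective(x0, dn, hs)) / (2 * eps))
    return grad

x0 = 0.8
us = [0.3, -0.2, 0.5, 0.1]       # arbitrary controls, not a minimizer
hs = [0.1, 0.25, 0.15, 0.2]      # deliberately non-uniform mesh
g_adj = adjoint_gradient(x0, us, hs)
g_fd = fd_gradient(x0, us, hs)
max_gap = max(abs(a - b) for a, b in zip(g_adj, g_fd))
print("max |adjoint - finite difference| =", max_gap)
```

If the two gradients agree up to finite-difference error on a non-uniform mesh, that is consistent with the step $h_k$, rather than $h_{k-1}$, multiplying the node-$k$ terms in the recursion. At a KKT point the reduced gradient vanishes, which matches the condition $0=\nabla_u H(x_{k,h},\mu_{k,h},u_{k,h})$.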

[step:Prove that the residual in the error equation vanishes uniformly]We prove that \begin{align*} \max_{1\le k\le N-1}|r_{k,h}|\to0. \end{align*} Define the continuous adjoint vector field $F:[0,T]\to\mathbb{R}^n$ by \begin{align*} F(t)=\nabla_x L(x^*(t),u^*(t))+\nabla_x f(x^*(t),u^*(t))^\top p^*(t). \end{align*} The continuous adjoint equation is exactly $\dot p^*(t)=-F(t)$ for every $t\in[0,T]$. Since $x^*$, $u^*$, and $p^*$ are continuous and $\nabla_x L$ and $\nabla_x f$ are continuous, $F$ is continuous on the compact interval $[0,T]$ and hence uniformly continuous and bounded. By the [fundamental theorem of calculus](/theorems/632) applied componentwise to $p^*$, \begin{align*} p^*(t_k)-p^*(t_{k-1})=-\int_{t_{k-1}}^{t_k}F(s)\,d\mathcal{L}^1(s). \end{align*} For $1\le k\le N-1$, define $F_{k,h}\in\mathbb{R}^n$ by \begin{align*} F_{k,h}=\nabla_x L(x_{k,h},u_{k,h})+\nabla_x f(x_{k,h},u_{k,h})^\top p^*(t_k). \end{align*} Then the residual can be written as \begin{align*} r_{k,h}=h_kF_{k,h}-\int_{t_{k-1}}^{t_k}F(s)\,d\mathcal{L}^1(s). \end{align*} Let $M_F=\max_{0\le t\le T}|F(t)|$. Define the compact reference set $K_0\subset\mathbb{R}^n\times\mathbb{R}^m$ by \begin{align*} K_0=\{(x^*(t),u^*(t)):0\le t\le T\}. \end{align*} It is compact because it is the continuous image of $[0,T]$. The assumed uniform convergence of the discrete states and controls implies that there is $R>0$ such that, for all sufficiently small $h$, every $(x_{k,h},u_{k,h})$ lies in the compact set \begin{align*} K_R=\{(x,u)\in\mathbb{R}^n\times\mathbb{R}^m:\operatorname{dist}((x,u),K_0)\le R\}. \end{align*} On $K_R$, the continuous maps $\nabla_x L$ and $\nabla_x f$ are uniformly continuous, and $p^*$ is bounded on $[0,T]$. Hence \begin{align*} \varepsilon_h:=\max_{1\le k\le N-1}|F_{k,h}-F(t_k)|\to0. \end{align*} For each $k$, \begin{align*} |r_{k,h}|\le h_k|F_{k,h}-F(t_k)|+\left|h_kF(t_k)-\int_{t_{k-1}}^{t_k}F(s)\,d\mathcal{L}^1(s)\right|. 
\end{align*} The first term is bounded by $h\varepsilon_h$ because $h_k\le h$. For the second term, note that the integration interval has length $h_{k-1}=t_k-t_{k-1}$, so adding and subtracting $h_{k-1}F(t_k)$ gives \begin{align*} \left|h_kF(t_k)-\int_{t_{k-1}}^{t_k}F(s)\,d\mathcal{L}^1(s)\right|\le |h_k-h_{k-1}|M_F+\int_{t_{k-1}}^{t_k}|F(t_k)-F(s)|\,d\mathcal{L}^1(s). \end{align*} Both $h_k$ and $h_{k-1}$ lie in $(0,h]$, so $|h_k-h_{k-1}|\le h$; and $|t_k-s|\le h_{k-1}\le h$ on the integration interval, so the integrand is at most $\omega_F(h)$. The right-hand side is therefore bounded by \begin{align*} hM_F+h\omega_F(h), \end{align*} where $\omega_F(h)=\sup\{|F(t)-F(s)|:s,t\in[0,T], |s-t|\le h\}$ and $\omega_F(h)\to0$ by uniform continuity of $F$. Therefore \begin{align*} \max_{1\le k\le N-1}|r_{k,h}|\le h\varepsilon_h+hM_F+h\omega_F(h)\to0. \end{align*}[/step]

[guided]The residual must be compared with the differential equation solved by the continuous costate, not merely bounded term by term. Define \begin{align*} F(t)=\nabla_x L(x^*(t),u^*(t))+\nabla_x f(x^*(t),u^*(t))^\top p^*(t). \end{align*} The continuous adjoint equation says $-\dot p^*(t)=F(t)$ for every $t\in[0,T]$. This is the consistency input: it identifies the increment of the sampled continuous costate with the integral of the same vector field that appears in the discrete adjoint recursion. Since $x^*$, $u^*$, and $p^*$ are continuous, and since $\nabla_x L$ and $\nabla_x f$ are continuous, the map $F:[0,T]\to\mathbb{R}^n$ is continuous. Because $[0,T]$ is compact, $F$ is uniformly continuous and bounded. By the fundamental theorem of calculus applied componentwise to $p^*$, \begin{align*} p^*(t_k)-p^*(t_{k-1})=\int_{t_{k-1}}^{t_k}\dot p^*(s)\,d\mathcal{L}^1(s)=-\int_{t_{k-1}}^{t_k}F(s)\,d\mathcal{L}^1(s). \end{align*} Now define the discrete sampled vector field $F_{k,h}\in\mathbb{R}^n$ by \begin{align*} F_{k,h}=\nabla_x L(x_{k,h},u_{k,h})+\nabla_x f(x_{k,h},u_{k,h})^\top p^*(t_k). \end{align*} Using the definition of $r_{k,h}$ and the preceding integral identity gives \begin{align*} r_{k,h}=h_kF_{k,h}-\int_{t_{k-1}}^{t_k}F(s)\,d\mathcal{L}^1(s). \end{align*} This formula shows exactly what must be controlled: the discrete vector field must approximate $F(t_k)$, and the integral of $F$ over the previous mesh interval must be approximated by a one-point quadrature term. Define the compact reference set $K_0\subset\mathbb{R}^n\times\mathbb{R}^m$ by \begin{align*} K_0=\{(x^*(t),u^*(t)):0\le t\le T\}. \end{align*} This set is compact because it is the continuous image of the compact interval $[0,T]$. 
Uniform convergence of $(x_{k,h},u_{k,h})$ to $(x^*(t_k),u^*(t_k))$ implies that there is $R>0$ such that all discrete pairs lie, for all sufficiently small $h$, in the compact set \begin{align*} K_R=\{(x,u)\in\mathbb{R}^n\times\mathbb{R}^m:\operatorname{dist}((x,u),K_0)\le R\}. \end{align*} On $K_R$, the continuous maps $\nabla_x L$ and $\nabla_x f$ are uniformly continuous, and $p^*$ is bounded on $[0,T]$. Therefore \begin{align*} \varepsilon_h:=\max_{1\le k\le N-1}|F_{k,h}-F(t_k)|\to0. \end{align*} Let $M_F=\max_{0\le t\le T}|F(t)|$. For each $1\le k\le N-1$, \begin{align*} |r_{k,h}|\le h_k|F_{k,h}-F(t_k)|+\left|h_kF(t_k)-\int_{t_{k-1}}^{t_k}F(s)\,d\mathcal{L}^1(s)\right|. \end{align*} The first term is at most $h\varepsilon_h$. For the second term, insert $h_{k-1}F(t_k)$, because the integration interval has length $h_{k-1}=t_k-t_{k-1}$: \begin{align*} \left|h_kF(t_k)-\int_{t_{k-1}}^{t_k}F(s)\,d\mathcal{L}^1(s)\right|\le |h_k-h_{k-1}|M_F+\int_{t_{k-1}}^{t_k}|F(t_k)-F(s)|\,d\mathcal{L}^1(s). \end{align*} Since both $h_k$ and $h_{k-1}$ are bounded by $h$, we have $|h_k-h_{k-1}|\le h$. If \begin{align*} \omega_F(h)=\sup\{|F(t)-F(s)|:s,t\in[0,T], |s-t|\le h\}, \end{align*} then [uniform continuity](/page/Uniform%20Continuity) gives $\omega_F(h)\to0$, and the integral term is at most $h\omega_F(h)$. Consequently \begin{align*} \max_{1\le k\le N-1}|r_{k,h}|\le h\varepsilon_h+hM_F+h\omega_F(h)\to0. \end{align*} This proves the required uniform residual consistency. The continuous adjoint equation is essential here: it is what turns the difference $p^*(t_k)-p^*(t_{k-1})$ into the integral of the correct vector field.[/guided]
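The bound above can be illustrated numerically. The sketch below is a toy check under stated assumptions, none of which come from the source: the scalar field $F(t)=\cos t$ on $[0,1]$, the idealization $F_{k,h}=F(t_k)$ (so $\varepsilon_h=0$), $M_F=1$, and the estimate $\omega_F(h)\le h$, which holds because $|F'|\le1$. It evaluates $r_{k,h}=h_kF(t_k)-\int_{t_{k-1}}^{t_k}F$ exactly on a mesh whose steps alternate between two sizes, and compares the largest residual with $hM_F+h\omega_F(h)$.

```python
import math

# Hypothetical adjoint vector field and its exact integral (assumed for illustration).
def F(t):
    return math.cos(t)

def integral_F(a, b):
    return math.sin(b) - math.sin(a)

def alternating_mesh(n, T=1.0):
    """2n steps alternating between 2T/(3n) and T/(3n); they sum to T."""
    steps = []
    for _ in range(n):
        steps += [2 * T / (3 * n), T / (3 * n)]
    return steps

def uniform_mesh(N, T=1.0):
    return [T / N] * N

def residuals(hs):
    """r_k = h_k F(t_k) - int_{t_{k-1}}^{t_k} F(s) ds for 1 <= k <= N-1,
    with the idealization F_{k,h} = F(t_k)."""
    ts = [0.0]
    for h in hs:
        ts.append(ts[-1] + h)
    return [hs[k] * F(ts[k]) - integral_F(ts[k - 1], ts[k]) for k in range(1, len(hs))]

def bound(hs):
    """h*M_F + h*omega_F(h), using M_F = 1 and omega_F(h) <= h for F = cos."""
    h = max(hs)
    return h * 1.0 + h * h

def max_residual(hs):
    return max(abs(r) for r in residuals(hs))

for n in (10, 100, 1000):
    hs = alternating_mesh(n)
    print(n, max_residual(hs), bound(hs))
```

On a uniform mesh the term $|h_k-h_{k-1}|M_F$ drops out and only the quadrature term, of size at most $h\omega_F(h)$, remains. On the alternating mesh the mesh-variation term dominates, which is why the bound is of order $h$ there.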
