Finite-Horizon Linear Quadratic Regulator Theorem

Theorem

Proof

[proofplan] We first solve the Riccati terminal-value problem by viewing it as an ordinary differential equation on the finite-dimensional [vector space](/page/Vector%20Space) of symmetric matrices. A completion-of-squares identity on any interval where the Riccati solution exists gives positivity and a uniform bound, which prevents finite-time blow-up and extends the solution back to time $0$. The same identity then proves optimality: every admissible control has cost equal to $x_0^\top P(0)x_0$ plus a non-negative square, and equality forces exactly the stated feedback law. [/proofplan] [step:Declare the control class and cost functional] Let $\mathcal{L}^1$ denote one-dimensional [Lebesgue measure](/page/Lebesgue%20Measure) on $\mathbb{R}$. For an initial state $x_0\in\mathbb{R}^n$, define the admissible control class \begin{align*} \mathcal U_T(x_0) := L^2((0,T);\mathbb{R}^m), \end{align*} where each $u\in\mathcal U_T(x_0)$ is identified up to $\mathcal L^1$-almost everywhere equality and determines the unique absolutely continuous state $x_u:[0,T]\to\mathbb{R}^n$ satisfying $x_u'(t)=Ax_u(t)+Bu(t)$ for $\mathcal L^1$-almost every $t\in(0,T)$ and $x_u(0)=x_0$. Define the finite-horizon cost functional \begin{align*} J_T[\cdot;x_0]:\mathcal U_T(x_0)\to[0,\infty) \end{align*} by \begin{align*} J_T[u;x_0] := x_u(T)^\top Sx_u(T)+\int_0^{\!T}\left(x_u(t)^\top Qx_u(t)+u(t)^\top Ru(t)\right)\,d\mathcal L^1(t). \end{align*} [/step] [step:Solve the Riccati equation locally in the symmetric matrix space] Let $\operatorname{Sym}_n(\mathbb{R})$ denote the vector space of real symmetric $n \times n$ matrices, and define the matrix \begin{align*} M := BR^{-1}B^\top \in \mathbb{R}^{n \times n}. \end{align*} Since $R = R^\top > 0$, the inverse $R^{-1}$ exists and satisfies $(R^{-1})^\top = R^{-1}$, so $M = M^\top$. Define the vector field \begin{align*} F: \operatorname{Sym}_n(\mathbb{R}) \to \operatorname{Sym}_n(\mathbb{R}), \qquad F(Y) = -A^\top Y - YA + YMY - Q. 
\end{align*} For $Y = Y^\top$, the sum $F(Y)$ is symmetric, because transposing it only reorders the first two terms: \begin{align*} (-A^\top Y - YA + YMY - Q)^\top = -YA - A^\top Y + YMY - Q. \end{align*} Thus $F$ maps $\operatorname{Sym}_n(\mathbb{R})$ into itself. Its coordinate functions are polynomial functions of the entries of $Y$, hence locally Lipschitz. Identifying $\operatorname{Sym}_n(\mathbb{R})$ with $\mathbb{R}^{n(n+1)/2}$ by the independent matrix entries, the domain is the whole finite-dimensional Euclidean space and the vector field $-F$ is locally Lipschitz. Apply the [Picard-Lindelöf Theorem](/theorems/69) to the time-reversed initial-value problem $\widetilde P'(s)=-F(\widetilde P(s))$ with $\widetilde P(0)=S$. Setting $P(t)=\widetilde P(T-t)$ gives a maximal interval $(\tau,T] \subseteq [0,T]$ and a unique solution \begin{align*} P: (\tau,T] \to \operatorname{Sym}_n(\mathbb{R}) \end{align*} of \begin{align*} P'(t) = F(P(t)), \qquad P(T) = S. \end{align*} Equivalently, \begin{align*} P'(t) + A^\top P(t) + P(t)A - P(t)BR^{-1}B^\top P(t) + Q = 0. \end{align*} [guided] The Riccati equation is an ordinary differential equation whose unknown is a matrix-valued function. The correct ambient space is not all matrices but the finite-dimensional vector space $\operatorname{Sym}_n(\mathbb{R})$, because the theorem asserts that the solution remains symmetric. We first define \begin{align*} M := BR^{-1}B^\top. \end{align*} The condition $R = R^\top > 0$ implies that $R$ is invertible and that $R^{-1}$ is symmetric. Therefore \begin{align*} M^\top = (BR^{-1}B^\top)^\top = BR^{-1}B^\top = M. \end{align*} Now define \begin{align*} F: \operatorname{Sym}_n(\mathbb{R}) \to \operatorname{Sym}_n(\mathbb{R}), \qquad F(Y) = -A^\top Y - YA + YMY - Q. \end{align*} We verify that this is really a map into $\operatorname{Sym}_n(\mathbb{R})$. If $Y = Y^\top$, then \begin{align*} F(Y)^\top = -YA - A^\top Y + YMY - Q. 
\end{align*} Since matrix addition is commutative term-by-term, this equals $F(Y)$. Thus $F(Y)$ is symmetric whenever $Y$ is symmetric. The entries of $F(Y)$ are polynomial expressions in the entries of $Y$, because $A$, $M$, and $Q$ are fixed matrices and the only nonlinear term is $YMY$. Polynomial maps on finite-dimensional Euclidean spaces are locally Lipschitz. To apply the usual initial-time existence theorem, introduce the time-reversed unknown $\widetilde P$ with $\widetilde P(s)=P(T-s)$. Then $\widetilde P$ must solve \begin{align*} \widetilde P'(s)=-F(\widetilde P(s)), \qquad \widetilde P(0)=S. \end{align*} The vector field $-F$ is locally Lipschitz on the same finite-dimensional space. After identifying $\operatorname{Sym}_n(\mathbb{R})$ with $\mathbb{R}^{n(n+1)/2}$ by the independent entries, the hypotheses of the [Picard-Lindelöf Theorem](/theorems/69) are satisfied: the domain is open, the initial value $S$ lies in it, and the vector field is locally Lipschitz. Hence there is a unique solution $\widetilde P$ on a maximal forward interval. Returning to $t=T-s$ gives a unique solution $P$ on a maximal backward interval $(\tau,T]$ satisfying $P'(t)=F(P(t))$ and $P(T)=S$. Written out in the original data, the differential equation is \begin{align*} P'(t) + A^\top P(t) + P(t)A - P(t)BR^{-1}B^\top P(t) + Q = 0. \end{align*} [/guided] [/step] [step:Derive the completion-of-squares identity on the existence interval] Fix $t_0 \in (\tau,T]$, a state $z \in \mathbb{R}^n$, and a control $u \in L^2((t_0,T);\mathbb{R}^m)$. Let \begin{align*} x: [t_0,T] \to \mathbb{R}^n \end{align*} be the absolutely continuous solution of \begin{align*} x'(t) = Ax(t) + Bu(t), \qquad x(t_0) = z. \end{align*} Define \begin{align*} W: [t_0,T] \to \mathbb{R}, \qquad W(t) = x(t)^\top P(t)x(t). \end{align*} Since $x$ is absolutely continuous and $P$ is $C^1$, the function $W$ is absolutely continuous. 
For $\mathcal{L}^1$-almost every $t \in (t_0,T)$, differentiating the scalar product of the absolutely continuous functions $x$ and $Px$ and using the symmetry of $P(t)$ gives \begin{align*} W'(t) = x(t)^\top P'(t)x(t) + 2x(t)^\top P(t)x'(t). \end{align*} Substituting $x'(t)=Ax(t)+Bu(t)$ and \begin{align*} P'(t) = -A^\top P(t) - P(t)A + P(t)MP(t) - Q \end{align*} yields \begin{align*} W'(t) = x(t)^\top P(t)MP(t)x(t) - x(t)^\top Qx(t) + 2x(t)^\top P(t)Bu(t). \end{align*} Since $M=BR^{-1}B^\top$, this becomes \begin{align*} x(t)^\top Qx(t) + u(t)^\top Ru(t) + W'(t) = \left(u(t)+R^{-1}B^\top P(t)x(t)\right)^\top R\left(u(t)+R^{-1}B^\top P(t)x(t)\right). \end{align*} Integrating over $(t_0,T)$ with respect to $\mathcal{L}^1$ and using $P(T)=S$, we obtain \begin{align*} x(T)^\top Sx(T) + \int_{t_0}^{T} \left(x(t)^\top Qx(t) + u(t)^\top Ru(t)\right)\,d\mathcal{L}^1(t) = z^\top P(t_0)z + \int_{t_0}^{T} \left(u(t)+R^{-1}B^\top P(t)x(t)\right)^\top R\left(u(t)+R^{-1}B^\top P(t)x(t)\right)\,d\mathcal{L}^1(t). \end{align*} [/step] [step:Use the identity to prove positivity and prevent blow-up] Let $x_P: [t_0,T] \to \mathbb{R}^n$ be the unique absolutely continuous solution of \begin{align*} x_P'(t) = \left(A - BR^{-1}B^\top P(t)\right)x_P(t), \qquad x_P(t_0)=z. \end{align*} Define the feedback control \begin{align*} u_P: (t_0,T) \to \mathbb{R}^m \end{align*} by \begin{align*} u_P(t) := -R^{-1}B^\top P(t)x_P(t). \end{align*} Since $P$ and $x_P$ are continuous on $[t_0,T]$, the map $u_P$ belongs to $L^2((t_0,T);\mathbb{R}^m)$. Substituting this admissible control into the previous identity makes the square term vanish. Therefore \begin{align*} z^\top P(t_0)z = x_P(T)^\top Sx_P(T) + \int_{t_0}^{T} x_P(t)^\top Qx_P(t)\,d\mathcal{L}^1(t) \geq 0, \end{align*} where the inequality holds because $S\geq 0$ and $Q\geq 0$. Since $z \in \mathbb{R}^n$ was arbitrary, $P(t_0)\geq 0$ for all $t_0 \in (\tau,T]$. Next define the zero control as the map \begin{align*} u_0: (t_0,T) \to \mathbb{R}^m. 
\end{align*} Set $u_0(t)=0$ for $\mathcal{L}^1$-almost every $t \in (t_0,T)$. The corresponding state is the map \begin{align*} x_{\mathrm{zero},z}: [t_0,T] \to \mathbb{R}^n. \end{align*} For $r\in\mathbb R$, define the matrix exponential $e^{Ar}:=\sum_{k=0}^{\infty}A^k r^k/k!$. For $t\in[t_0,T]$, define $x_{\mathrm{zero},z}(t)=e^{A(t-t_0)}z$. The square term is non-negative because $R>0$, so the identity gives \begin{align*} z^\top P(t_0)z \leq (x_{\mathrm{zero},z}(T))^\top Sx_{\mathrm{zero},z}(T) + \int_{t_0}^{T} (x_{\mathrm{zero},z}(t))^\top Qx_{\mathrm{zero},z}(t)\,d\mathcal{L}^1(t). \end{align*} Here $|\cdot|$ denotes the Euclidean norm on $\mathbb{R}^n$, and $\|\cdot\|_{\mathrm{op}}$ denotes the operator norm induced by the Euclidean norm on the relevant matrix space. Define the finite constant \begin{align*} C_T := \sup_{0 \leq r \leq T} \|e^{Ar}\|_{\mathrm{op}}^2\left(\|S\|_{\mathrm{op}} + T\|Q\|_{\mathrm{op}}\right). \end{align*} Then, for every $t_0 \in (\tau,T]$ and $z \in \mathbb{R}^n$, \begin{align*} 0 \leq z^\top P(t_0)z \leq C_T |z|^2. \end{align*} Thus $\|P(t_0)\|_{\mathrm{op}} \leq C_T$ for every $t_0 \in (\tau,T]$. We now spell out the finite-dimensional extension argument. Suppose, for contradiction, that $\tau>0$. The closed operator-norm ball \begin{align*} K_T:=\{Y\in\operatorname{Sym}_n(\mathbb{R}):\|Y\|_{\mathrm{op}}\le C_T\} \end{align*} is compact because $\operatorname{Sym}_n(\mathbb{R})$ is finite-dimensional, and $P((\tau,T])\subset K_T$. The vector field $F$ is continuous, so it is bounded on the compact set $K_T$; let \begin{align*} D_T:=\sup_{Y\in K_T}\|F(Y)\|_{\mathrm{op}}<\infty. \end{align*} For $s,t\in(\tau,T]$, the integral form of the equation gives \begin{align*} \|P(t)-P(s)\|_{\mathrm{op}}\le D_T|t-s|. \end{align*} Hence $P(t)$ has a limit $P_\tau\in K_T$ as $t\downarrow\tau$. Extend the old solution continuously by setting $P(\tau):=P_\tau$. 
For every $t\in(\tau,T]$, the integral form of the equation and continuity of $F$ give \begin{align*} P(t)=P_\tau+\int_\tau^t F(P(r))\,d\mathcal L^1(r), \end{align*} so the extended function solves the Riccati equation on $[\tau,T]$ in integral form. Applying the [Picard-Lindelöf Theorem](/theorems/69) with initial value $P_\tau$ at time $\tau$ gives a local solution on an interval containing $\tau$; uniqueness on the overlap with $[\tau,T]$ identifies it with the continuous extension, so it glues to $P$ and extends the solution to times strictly smaller than $\tau$. This contradicts maximality. Therefore $\tau=0$, and $P$ extends to a unique function \begin{align*} P \in C^1([0,T];\operatorname{Sym}_n(\mathbb{R})) \end{align*} with $P(t)\geq 0$ for all $t \in [0,T]$. [guided] The square identity is not only an optimality identity; it also controls the Riccati solution itself. Fix $t_0 \in (\tau,T]$ and $z\in\mathbb{R}^n$. Let $x_P: [t_0,T]\to\mathbb{R}^n$ be the unique absolutely continuous solution of \begin{align*} x_P'(t) = \left(A - BR^{-1}B^\top P(t)\right)x_P(t), \qquad x_P(t_0)=z. \end{align*} Define $u_P: (t_0,T)\to\mathbb{R}^m$ by \begin{align*} u_P(t) := -R^{-1}B^\top P(t)x_P(t). \end{align*} Because $P$ and $x_P$ are continuous on the compact interval $[t_0,T]$, the control $u_P$ is continuous and hence belongs to $L^2((t_0,T);\mathbb{R}^m)$. Substituting this pair into the square identity makes the square term vanish, so \begin{align*} z^\top P(t_0)z = x_P(T)^\top Sx_P(T) + \int_{t_0}^{T} x_P(t)^\top Qx_P(t)\,d\mathcal{L}^1(t). \end{align*} Since $S\geq 0$ and $Q\geq 0$, the right-hand side is non-negative. Therefore $z^\top P(t_0)z\geq 0$ for every $z\in\mathbb{R}^n$, which proves $P(t_0)\geq 0$. To prevent blow-up, use the zero control. Define \begin{align*} u_0: (t_0,T)\to\mathbb{R}^m. \end{align*} Set $u_0(t)=0$ for $\mathcal{L}^1$-almost every $t\in(t_0,T)$. 
For $r\in\mathbb R$, define the matrix exponential $e^{Ar}:=\sum_{k=0}^{\infty}A^k r^k/k!$. Let $x_{\mathrm{zero},z}:[t_0,T]\to\mathbb{R}^n$ be given by \begin{align*} x_{\mathrm{zero},z}(t):=e^{A(t-t_0)}z. \end{align*} The square term in the identity is non-negative because $R>0$, so \begin{align*} z^\top P(t_0)z \leq (x_{\mathrm{zero},z}(T))^\top Sx_{\mathrm{zero},z}(T) + \int_{t_0}^{T} (x_{\mathrm{zero},z}(t))^\top Qx_{\mathrm{zero},z}(t)\,d\mathcal{L}^1(t). \end{align*} Here $|\cdot|$ denotes the Euclidean norm on $\mathbb{R}^n$, and $\|\cdot\|_{\mathrm{op}}$ denotes the operator norm induced by the Euclidean norm. Define \begin{align*} C_T := \sup_{0 \leq r \leq T} \|e^{Ar}\|_{\mathrm{op}}^2\left(\|S\|_{\mathrm{op}} + T\|Q\|_{\mathrm{op}}\right). \end{align*} The supremum is finite because $r\mapsto e^{Ar}$ is continuous on the compact interval $[0,T]$. The preceding estimate gives \begin{align*} 0 \leq z^\top P(t_0)z \leq C_T |z|^2. \end{align*} Since $P(t_0)$ is symmetric and non-negative, this implies $\|P(t_0)\|_{\mathrm{op}}\leq C_T$. Now we turn the bound into extension. Suppose $\tau>0$. Define the compact set \begin{align*} K_T:=\{Y\in\operatorname{Sym}_n(\mathbb{R}):\|Y\|_{\mathrm{op}}\le C_T\}. \end{align*} The estimate gives $P((\tau,T])\subset K_T$. Because $F$ is continuous, it is bounded on $K_T$; define \begin{align*} D_T:=\sup_{Y\in K_T}\|F(Y)\|_{\mathrm{op}}<\infty. \end{align*} For any $s,t\in(\tau,T]$, the integral form of the differential equation gives \begin{align*} \|P(t)-P(s)\|_{\mathrm{op}}\le D_T|t-s|. \end{align*} Thus $P(t)$ is Cauchy as $t\downarrow\tau$, so there is $P_\tau\in K_T$ with $P(t)\to P_\tau$. Define $P(\tau):=P_\tau$ to extend the old solution continuously to the endpoint. The integral equation on $(\tau,T]$ is \begin{align*} P(t)=P(s)+\int_s^t F(P(r))\,d\mathcal L^1(r) \end{align*} for $\tau<s<t\leq T$. 
Letting $s\downarrow\tau$ is justified as follows: $P(s)\to P_\tau$, and the integrands $r\mapsto F(P(r))\mathbb{1}_{(s,t)}(r)$ converge pointwise to $r\mapsto F(P(r))\mathbb{1}_{(\tau,t)}(r)$ while being dominated by the integrable constant function $D_T$ on $(\tau,t)$. By the [Dominated Convergence Theorem](/theorems/4), the integrals converge. Hence \begin{align*} P(t)=P_\tau+\int_\tau^t F(P(r))\,d\mathcal L^1(r) \end{align*} for every $t\in[\tau,T]$. Since $F$ is locally Lipschitz on the whole finite-dimensional space, the [Picard-Lindelöf Theorem](/theorems/69) applied at the initial value $P_\tau$ gives a local solution through time $\tau$. On the part of its interval lying to the right of $\tau$, uniqueness identifies that local solution with the continuous extension just constructed. Therefore the local solution glues to the old one and extends it to times strictly smaller than $\tau$, contradicting maximality of $(\tau,T]$. Hence $\tau=0$, and the solution extends uniquely to $P\in C^1([0,T];\operatorname{Sym}_n(\mathbb{R}))$ with $P(t)\geq 0$ for all $t\in[0,T]$. [/guided] [/step] [step:Apply the square identity to the original control problem] Now fix $x_0 \in \mathbb{R}^n$ and let $u \in \mathcal{U}_T(x_0)$ be arbitrary. Let \begin{align*} x: [0,T] \to \mathbb{R}^n \end{align*} be the corresponding state. Applying the completion-of-squares identity with $t_0=0$ and $z=x_0$ on the interval $[0,T]$ gives \begin{align*} J_T[u;x_0] = x_0^\top P(0)x_0 + \int_0^{\!T}\left(u(t)+R^{-1}B^\top P(t)x(t)\right)^\top R\left(u(t)+R^{-1}B^\top P(t)x(t)\right)\,d\mathcal{L}^1(t). \end{align*} Since $R>0$, the integrand is non-negative for $\mathcal{L}^1$-almost every $t \in (0,T)$. Therefore \begin{align*} J_T[u;x_0] \geq x_0^\top P(0)x_0. \end{align*} Define the continuous coefficient map \begin{align*} C:[0,T]\to\mathbb{R}^{n\times n}, \qquad C(t):=A-BR^{-1}B^\top P(t). 
\end{align*} For a fixed $a\in[0,T]$, the fixed-point operator $\Phi:C([0,a];\mathbb{R}^n)\to C([0,a];\mathbb{R}^n)$ defined by \begin{align*} (\Phi y)(t):=x_0+\int_0^t C(r)y(r)\,d\mathcal L^1(r) \end{align*} has a unique fixed point, by the contraction argument on sufficiently short subintervals, iterated finitely many times because $C$ is bounded on $[0,T]$. Let $x_{\mathrm{opt}}: [0,T]\to\mathbb{R}^n$ be the resulting unique solution of \begin{align*} {x_{\mathrm{opt}}}'(t)=\left(A-BR^{-1}B^\top P(t)\right)x_{\mathrm{opt}}(t), \qquad x_{\mathrm{opt}}(0)=x_0, \end{align*} and define \begin{align*} u_{\mathrm{opt}}: (0,T) \to \mathbb{R}^m, \qquad u_{\mathrm{opt}}(t)=-R^{-1}B^\top P(t)x_{\mathrm{opt}}(t). \end{align*} The coefficient $t \mapsto A-BR^{-1}B^\top P(t)$ is continuous on $[0,T]$, so $x_{\mathrm{opt}}$ is continuously differentiable, and $u_{\mathrm{opt}}$ extends continuously to $[0,T]$. In particular $u_{\mathrm{opt}} \in L^2((0,T);\mathbb{R}^m)$, so $u_{\mathrm{opt}} \in \mathcal{U}_T(x_0)$. For this pair, the square integrand vanishes for every $t \in (0,T)$, hence \begin{align*} J_T[u_{\mathrm{opt}};x_0] = x_0^\top P(0)x_0. \end{align*} Thus \begin{align*} \inf_{u \in \mathcal{U}_T(x_0)} J_T[u;x_0] = x_0^\top P(0)x_0. \end{align*} [guided] The verification argument starts by rederiving the square identity on the full horizon. Let $u \in \mathcal{U}_T(x_0)$ be arbitrary, and let $x: [0,T] \to \mathbb{R}^n$ be its absolutely continuous state, so $x'(t)=Ax(t)+Bu(t)$ for $\mathcal{L}^1$-almost every $t \in (0,T)$ and $x(0)=x_0$. Define the scalar map \begin{align*} W: [0,T] \to \mathbb{R}. \end{align*} For $t\in[0,T]$, set $W(t)=x(t)^\top P(t)x(t)$. Since $x$ is absolutely continuous and $P\in C^1([0,T];\operatorname{Sym}_n(\mathbb{R}))$, the scalar map $W$ is absolutely continuous. 
Differentiating the scalar product of the absolutely continuous functions $x$, $Px$, and using the symmetry of $P(t)$, we obtain for $\mathcal{L}^1$-almost every $t\in(0,T)$ \begin{align*} W'(t)=x(t)^\top P'(t)x(t)+2x(t)^\top P(t)x'(t). \end{align*} Substituting the state equation and the Riccati equation gives \begin{align*} W'(t)=x(t)^\top P(t)BR^{-1}B^\top P(t)x(t)-x(t)^\top Qx(t)+2x(t)^\top P(t)Bu(t). \end{align*} Now we add $x(t)^\top Qx(t)+u(t)^\top Ru(t)$ to both sides and complete the square with respect to the positive definite matrix $R$: \begin{align*} x(t)^\top Qx(t)+u(t)^\top Ru(t)+W'(t)=\left(u(t)+R^{-1}B^\top P(t)x(t)\right)^\top R\left(u(t)+R^{-1}B^\top P(t)x(t)\right). \end{align*} Integrating over $(0,T)$ with respect to $\mathcal{L}^1$ and using $P(T)=S$ gives \begin{align*} J_T[u;x_0]=x_0^\top P(0)x_0+\int_0^{\!T}\left(u(t)+R^{-1}B^\top P(t)x(t)\right)^\top R\left(u(t)+R^{-1}B^\top P(t)x(t)\right)\,d\mathcal{L}^1(t). \end{align*} This identity is useful because the first term depends only on the initial state, while the second term measures how far the control is from the feedback law. Since $R=R^\top>0$, the quadratic form $v\mapsto v^\top Rv$ is non-negative and vanishes only at $v=0$. Hence every admissible control satisfies \begin{align*} J_T[u;x_0]\geq x_0^\top P(0)x_0. \end{align*} To show that this lower bound is attained, we choose the control that makes the square equal to zero. Let $x_{\mathrm{opt}}: [0,T]\to\mathbb{R}^n$ solve \begin{align*} {x_{\mathrm{opt}}}'(t)=\left(A-BR^{-1}B^\top P(t)\right)x_{\mathrm{opt}}(t), \qquad x_{\mathrm{opt}}(0)=x_0. \end{align*} This linear system has continuous coefficients because $P\in C^1([0,T];\operatorname{Sym}_n(\mathbb{R}))$. 
More explicitly, with $C(t):=A-BR^{-1}B^\top P(t)$, the integral operator $\Phi:C([0,a];\mathbb{R}^n)\to C([0,a];\mathbb{R}^n)$ defined by $\Phi y(t):=x_0+\int_0^t C(r)y(r)\,d\mathcal L^1(r)$ is a contraction on sufficiently short time intervals, since $C$ is bounded there. Iterating over finitely many subintervals covering $[0,T]$ gives a unique continuously differentiable solution on the full interval. Define \begin{align*} u_{\mathrm{opt}}: (0,T) \to \mathbb{R}^m \end{align*} by \begin{align*} u_{\mathrm{opt}}(t):=-R^{-1}B^\top P(t)x_{\mathrm{opt}}(t). \end{align*} Since $P$ and $x_{\mathrm{opt}}$ are continuous on the compact interval $[0,T]$, the function $u_{\mathrm{opt}}$ is continuous and therefore belongs to $L^2((0,T);\mathbb{R}^m)$. Moreover, \begin{align*} {x_{\mathrm{opt}}}'(t)=Ax_{\mathrm{opt}}(t)+Bu_{\mathrm{opt}}(t), \end{align*} so $u_{\mathrm{opt}}$ is admissible for the initial state $x_0$. For this feedback control, the square term is \begin{align*} u_{\mathrm{opt}}(t)+R^{-1}B^\top P(t)x_{\mathrm{opt}}(t)=0 \end{align*} for every $t\in(0,T)$. Substituting into the identity gives \begin{align*} J_T[u_{\mathrm{opt}};x_0]=x_0^\top P(0)x_0. \end{align*} Thus the lower bound is exactly the optimal cost. [/guided] [/step] [step:Prove uniqueness of the optimal input] Let $u \in \mathcal{U}_T(x_0)$ be optimal, and let $x: [0,T]\to\mathbb{R}^n$ be its state. Since $J_T[u;x_0]=x_0^\top P(0)x_0$, the non-negative integral in the square identity must vanish: \begin{align*} \int_0^{\!T}\left(u(t)+R^{-1}B^\top P(t)x(t)\right)^\top R\left(u(t)+R^{-1}B^\top P(t)x(t)\right)\,d\mathcal{L}^1(t)=0. \end{align*} Because $R>0$, this implies \begin{align*} u(t)=-R^{-1}B^\top P(t)x(t) \end{align*} for $\mathcal{L}^1$-almost every $t \in (0,T)$. Hence $x$ satisfies the closed-loop equation \begin{align*} x'(t)=\left(A-BR^{-1}B^\top P(t)\right)x(t), \qquad x(0)=x_0. 
\end{align*} By uniqueness for this linear ordinary differential equation, $x=x_{\mathrm{opt}}$ on $[0,T]$. Therefore $u=u_{\mathrm{opt}}$ in $L^2((0,T);\mathbb{R}^m)$. This proves that the optimal input is unique and is precisely the stated feedback control. [guided] Let $u\in\mathcal{U}_T(x_0)$ be optimal, and let $x:[0,T]\to\mathbb{R}^n$ be the corresponding absolutely continuous state. The square identity gives \begin{align*} J_T[u;x_0] = x_0^\top P(0)x_0 + \int_0^{\!T}\left(u(t)+R^{-1}B^\top P(t)x(t)\right)^\top R\left(u(t)+R^{-1}B^\top P(t)x(t)\right)\,d\mathcal{L}^1(t). \end{align*} Optimality means $J_T[u;x_0]=x_0^\top P(0)x_0$, so the integral of the non-negative integrand must be zero: \begin{align*} \int_0^{\!T}\left(u(t)+R^{-1}B^\top P(t)x(t)\right)^\top R\left(u(t)+R^{-1}B^\top P(t)x(t)\right)\,d\mathcal{L}^1(t)=0. \end{align*} Because $R=R^\top>0$, the quadratic form $v\mapsto v^\top Rv$ is positive definite. Hence a non-negative measurable function with zero integral must vanish for $\mathcal{L}^1$-almost every $t\in(0,T)$, and therefore \begin{align*} u(t)=-R^{-1}B^\top P(t)x(t) \end{align*} for $\mathcal{L}^1$-almost every $t\in(0,T)$. Substituting this identity into the state equation gives the closed-loop equation \begin{align*} x'(t)=\left(A-BR^{-1}B^\top P(t)\right)x(t), \qquad x(0)=x_0, \end{align*} for $\mathcal{L}^1$-almost every $t\in(0,T)$. The coefficient $t\mapsto A-BR^{-1}B^\top P(t)$ is continuous on $[0,T]$, so the [Picard-Lindelöf Theorem](/theorems/69) applies to this linear closed-loop system. The state $x_{\mathrm{opt}}$ constructed above is the unique solution, so $x=x_{\mathrm{opt}}$ on $[0,T]$. The almost-everywhere feedback identity then gives $u=u_{\mathrm{opt}}$ in $L^2((0,T);\mathbb{R}^m)$. Thus the optimal input is unique as an element of $L^2((0,T);\mathbb{R}^m)$, that is, up to $\mathcal L^1$-almost everywhere equality. [/guided] [/step]
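
The statements proved above can be checked numerically in the simplest case. The following is a minimal sketch, assuming the scalar setting $n=m=1$ with illustrative values for $A$, $B$, $Q$, $R$, $S$, $T$, $x_0$ that are not taken from the theorem. The helper names `riccati_rhs`, `solve_riccati`, and `simulate_cost` are hypothetical, and the Runge-Kutta and Heun discretizations are choices made for this sketch, not part of the proof. The script integrates the Riccati equation backward from $P(T)=S$, compares the simulated feedback cost with $x_0^\top P(0)x_0$, and evaluates the cost of a control shifted by a constant offset $\delta$, for which the square identity predicts an excess of $\int_0^T R\,\delta^2\,dt$.

```python
import math


def riccati_rhs(p, a, b, q, r):
    # Scalar case of F(Y) = -A^T Y - Y A + Y M Y - Q with M = b^2 / r.
    return -2.0 * a * p + (b * b / r) * p * p - q


def solve_riccati(a, b, q, r, s, T, n):
    """Integrate P' = F(P) backward from P(T) = s with classical RK4.

    Returns a list p with p[k] approximating P(k * T / n).
    """
    h = T / n
    p = [0.0] * (n + 1)
    p[n] = s
    for k in range(n, 0, -1):
        y = p[k]
        # RK4 with a negative time step -h (moving from t_k to t_{k-1}).
        k1 = riccati_rhs(y, a, b, q, r)
        k2 = riccati_rhs(y - 0.5 * h * k1, a, b, q, r)
        k3 = riccati_rhs(y - 0.5 * h * k2, a, b, q, r)
        k4 = riccati_rhs(y - h * k3, a, b, q, r)
        p[k - 1] = y - h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return p


def simulate_cost(p, a, b, q, r, s, T, x0, delta=0.0):
    """Cost of the control u = -(b/r) P(t) x + delta.

    The state is advanced with Heun's method and the running cost is
    accumulated with the trapezoidal rule; delta = 0 is the feedback law.
    """
    n = len(p) - 1
    h = T / n
    gain = b / r
    x = x0
    cost = 0.0
    for k in range(n):
        u0 = -gain * p[k] * x + delta
        f0 = a * x + b * u0
        l0 = q * x * x + r * u0 * u0
        x_pred = x + h * f0                      # Euler predictor
        u1 = -gain * p[k + 1] * x_pred + delta
        f1 = a * x_pred + b * u1
        x_next = x + 0.5 * h * (f0 + f1)         # Heun corrector
        u_next = -gain * p[k + 1] * x_next + delta
        l1 = q * x_next * x_next + r * u_next * u_next
        cost += 0.5 * h * (l0 + l1)
        x = x_next
    return cost + s * x * x                      # add the terminal cost


# Illustrative data (assumptions of this sketch, not from the theorem).
a, b, q, r, s, T, x0 = 1.0, 1.0, 1.0, 1.0, 0.5, 1.0, 2.0
delta = 0.3

p = solve_riccati(a, b, q, r, s, T, 2000)
value = p[0] * x0 * x0                           # x0^T P(0) x0
J_opt = simulate_cost(p, a, b, q, r, s, T, x0)
J_pert = simulate_cost(p, a, b, q, r, s, T, x0, delta)
# Scalar version of C_T = sup_{0<=r<=T} |e^{Ar}|^2 (|S| + T |Q|).
C_T = max(1.0, math.exp(2.0 * a * T)) * (s + T * q)

print("x0^2 P(0)        :", value)
print("feedback cost    :", J_opt)
print("perturbed excess :", J_pert - value, "predicted:", r * delta * delta * T)
print("bound C_T        :", C_T, "max P:", max(p))
```

Up to discretization error, the feedback cost should agree with `value`, the perturbed cost should exceed it by roughly $R\,\delta^2\,T$, and every grid value of $P$ should lie in $[0, C_T]$, which mirrors the positivity and a priori bound used to rule out blow-up.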
