Continuous-Time Algebraic Riccati Equation Existence and LQR Optimality Theorem

Proof

[proofplan] We obtain the infinite-horizon solution as the monotone limit of the finite-horizon Riccati value matrices. Stabilisability supplies one finite-cost stabilising comparison feedback, so the finite-horizon value matrices are uniformly bounded; monotonicity then gives a positive semidefinite limit. Passing the Riccati equation to the limit gives the algebraic Riccati equation, detectability forces the associated closed-loop matrix to be Hurwitz, and completing the square proves both optimality and uniqueness among stabilising positive semidefinite solutions. [/proofplan] [step:Build finite-horizon value matrices and bound them by a stabilising feedback] Let $\mathcal L^1$ denote one-dimensional [Lebesgue measure](/page/Lebesgue%20Measure) on the Borel subsets of $\mathbb R$. A real square matrix is called Hurwitz if every complex eigenvalue has strictly negative real part. A solution of the algebraic Riccati equation is called stabilising if its associated closed-loop matrix is Hurwitz. For $T>0$, define the finite-horizon admissible class $\mathcal U_T$ to be the set of measurable maps $u:(0,T)\to\mathbb R^m$ with $u\in L^2((0,T);\mathbb R^m)$. For $u\in\mathcal U_T$ and $x_0\in\mathbb R^n$, let $x_{u,x_0,T}:[0,T]\to\mathbb R^n$ denote the absolutely continuous solution of \begin{align*} x_{u,x_0,T}'(t)=Ax_{u,x_0,T}(t)+Bu(t), \qquad x_{u,x_0,T}(0)=x_0. \end{align*} Define the finite-horizon cost functional $J_T[\cdot;x_0]:\mathcal U_T\to[0,\infty)$ by \begin{align*} J_T[u;x_0]=\int_0^{\!T}\left(x_{u,x_0,T}(t)^\top Qx_{u,x_0,T}(t)+u(t)^\top Ru(t)\right)\,d\mathcal L^1(t). \end{align*} Since $Q=C^\top C$, $Q$ is symmetric positive semidefinite. Since $R=R^\top>0$, the running cost is non-negative and strictly convex in $u$. 
By the finite-horizon continuous-time Riccati theorem with terminal weight $0$ and sign convention \begin{align*} -\Pi_T'(t)=A^\top\Pi_T(t)+\Pi_T(t)A-\Pi_T(t)BR^{-1}B^\top\Pi_T(t)+Q, \end{align*} with terminal condition $\Pi_T(T)=0$, for every $T>0$ there is a continuously differentiable symmetric positive semidefinite Riccati matrix $\Pi_T:[0,T]\to\mathbb R^{n\times n}$ and a symmetric positive semidefinite matrix $P_T:=\Pi_T(0)\in\mathbb R^{n\times n}$ such that \begin{align*} \inf_{u\in\mathcal U_T}J_T[u;x_0]=x_0^\top P_Tx_0 \end{align*} for every $x_0\in\mathbb R^n$. Because $(A,B)$ is stabilisable, choose a matrix $K\in\mathbb R^{m\times n}$ such that $A-BK$ is Hurwitz. Define the comparison closed-loop trajectory $x_K:[0,\infty)\to\mathbb R^n$ by \begin{align*} x_K'(t)=(A-BK)x_K(t), \qquad x_K(0)=x_0, \end{align*} and the comparison control $u_K:(0,\infty)\to\mathbb R^m$ by $u_K(t)=-Kx_K(t)$. Since $A-BK$ is Hurwitz, there are constants $M_K\ge 1$ and $\alpha_K>0$ such that \begin{align*} |x_K(t)|\le M_Ke^{-\alpha_Kt}|x_0| \end{align*} for every $t\ge 0$. Let $\|Q\|_{\mathrm{op}}$ and $\|R\|_{\mathrm{op}}$ denote the Euclidean operator norms of $Q$ and $R$. Then \begin{align*} x_K(t)^\top Qx_K(t)+u_K(t)^\top Ru_K(t)\le \left(\|Q\|_{\mathrm{op}}+\|K\|_{\mathrm{op}}^2\|R\|_{\mathrm{op}}\right)M_K^2e^{-2\alpha_Kt}|x_0|^2. \end{align*} Integrating with respect to $\mathcal L^1$ gives \begin{align*} J_T[u_K|_{(0,T)};x_0]\le \frac{\left(\|Q\|_{\mathrm{op}}+\|K\|_{\mathrm{op}}^2\|R\|_{\mathrm{op}}\right)M_K^2}{2\alpha_K}|x_0|^2. \end{align*} Define \begin{align*} \Gamma_K:=\frac{\left(\|Q\|_{\mathrm{op}}+\|K\|_{\mathrm{op}}^2\|R\|_{\mathrm{op}}\right)M_K^2}{2\alpha_K}. \end{align*} Since $P_T$ gives the infimum over $\mathcal U_T$, we have \begin{align*} 0\le x_0^\top P_Tx_0\le \Gamma_K|x_0|^2 \end{align*} for every $T>0$ and every $x_0\in\mathbb R^n$. [guided] The first purpose of this step is to produce a family of finite-horizon quadratic value functions. 
For a horizon $T>0$, the admissible controls are the square-integrable maps $u:(0,T)\to\mathbb R^m$, and each such control determines the state trajectory $x_{u,x_0,T}$ by the linear differential equation \begin{align*} x_{u,x_0,T}'(t)=Ax_{u,x_0,T}(t)+Bu(t), \qquad x_{u,x_0,T}(0)=x_0. \end{align*} The corresponding finite-horizon cost is \begin{align*} J_T[u;x_0]=\int_0^{\!T}\left(x_{u,x_0,T}(t)^\top Qx_{u,x_0,T}(t)+u(t)^\top Ru(t)\right)\,d\mathcal L^1(t). \end{align*} The hypotheses needed for the finite-horizon Riccati theorem are exactly the finite-dimensional linear dynamics, the positive semidefinite state weight, and the positive definite control weight. Here $Q=C^\top C$ is symmetric positive semidefinite because $x^\top Qx=|Cx|^2\ge 0$ for all $x\in\mathbb R^n$, and $R=R^\top>0$ is positive definite by hypothesis. Therefore the finite-horizon continuous-time Riccati theorem with terminal weight $0$ and sign convention \begin{align*} -\Pi_T'(t)=A^\top\Pi_T(t)+\Pi_T(t)A-\Pi_T(t)BR^{-1}B^\top\Pi_T(t)+Q, \end{align*} with terminal condition $\Pi_T(T)=0$, gives a continuously differentiable symmetric positive semidefinite Riccati matrix $\Pi_T:[0,T]\to\mathbb R^{n\times n}$ and a symmetric positive semidefinite matrix $P_T:=\Pi_T(0)\in\mathbb R^{n\times n}$ satisfying \begin{align*} \inf_{u\in\mathcal U_T}J_T[u;x_0]=x_0^\top P_Tx_0 \end{align*} for every initial state $x_0\in\mathbb R^n$. The second purpose is to prove that these matrices cannot grow without bound as $T$ increases. Stabilisability is used precisely here. Since $(A,B)$ is stabilisable, there exists a feedback matrix $K\in\mathbb R^{m\times n}$ such that the closed-loop matrix $A-BK$ is Hurwitz. Define the comparison trajectory $x_K:[0,\infty)\to\mathbb R^n$ by \begin{align*} x_K'(t)=(A-BK)x_K(t), \qquad x_K(0)=x_0, \end{align*} and define the comparison control by $u_K(t)=-Kx_K(t)$. 
The Hurwitz property gives exponential decay: there are constants $M_K\ge 1$ and $\alpha_K>0$ such that \begin{align*} |x_K(t)|\le M_Ke^{-\alpha_Kt}|x_0| \end{align*} for all $t\ge 0$. Using the Euclidean operator norms of $Q$, $R$, and $K$, we estimate the running cost along this stabilising feedback: \begin{align*} x_K(t)^\top Qx_K(t)\le \|Q\|_{\mathrm{op}}|x_K(t)|^2 \end{align*} and \begin{align*} u_K(t)^\top Ru_K(t)\le \|R\|_{\mathrm{op}}|u_K(t)|^2\le \|R\|_{\mathrm{op}}\|K\|_{\mathrm{op}}^2|x_K(t)|^2. \end{align*} Substituting the exponential bound gives \begin{align*} x_K(t)^\top Qx_K(t)+u_K(t)^\top Ru_K(t)\le \left(\|Q\|_{\mathrm{op}}+\|K\|_{\mathrm{op}}^2\|R\|_{\mathrm{op}}\right)M_K^2e^{-2\alpha_Kt}|x_0|^2. \end{align*} The right-hand side is integrable on $(0,\infty)$ with respect to $\mathcal L^1$, so for every finite $T$, \begin{align*} J_T[u_K|_{(0,T)};x_0]\le \frac{\left(\|Q\|_{\mathrm{op}}+\|K\|_{\mathrm{op}}^2\|R\|_{\mathrm{op}}\right)M_K^2}{2\alpha_K}|x_0|^2. \end{align*} Define this explicit comparison constant by \begin{align*} \Gamma_K:=\frac{\left(\|Q\|_{\mathrm{op}}+\|K\|_{\mathrm{op}}^2\|R\|_{\mathrm{op}}\right)M_K^2}{2\alpha_K}. \end{align*} Since $P_T$ is the optimal finite-horizon value matrix, its value is no larger than the cost of this particular admissible stabilising control. Hence \begin{align*} 0\le x_0^\top P_Tx_0\le \Gamma_K|x_0|^2 \end{align*} for every $T>0$ and every $x_0\in\mathbb R^n$. [/guided] [/step] [step:Take the monotone limit and pass to the algebraic Riccati equation] If $0<T_1<T_2$, restricting any admissible control on $(0,T_2)$ to $(0,T_1)$ and using the non-negativity of the running cost gives \begin{align*} x_0^\top P_{T_1}x_0\le x_0^\top P_{T_2}x_0 \end{align*} for every $x_0\in\mathbb R^n$. Thus $(P_T)_{T>0}$ is increasing in the Loewner order and is bounded above by $\Gamma_KI_n$. 
Since the space of real symmetric $n\times n$ matrices is finite-dimensional, there exists a symmetric positive semidefinite matrix $P\in\mathbb R^{n\times n}$ such that $P_T\to P$ in every matrix norm and \begin{align*} \lim_{T\to\infty}x_0^\top P_Tx_0=x_0^\top Px_0 \end{align*} for every $x_0\in\mathbb R^n$. Let $\Pi_T:[0,T]\to\mathbb R^{n\times n}$ be the finite-horizon Riccati matrix with terminal condition $\Pi_T(T)=0$, so $P_T=\Pi_T(0)$. The finite-horizon continuous-time Riccati theorem gives \begin{align*} -\Pi_T'(t)=A^\top\Pi_T(t)+\Pi_T(t)A-\Pi_T(t)BR^{-1}B^\top\Pi_T(t)+Q. \end{align*} By time-[translation invariance](/theorems/4911) of the autonomous problem, $\Pi_T(s)=P_{T-s}$ for $0\le s\le T$. Therefore, for fixed $h>0$ and $T>h$, \begin{align*} P_T-P_{T-h}=\int_0^h \left(A^\top P_{T-r}+P_{T-r}A-P_{T-r}BR^{-1}B^\top P_{T-r}+Q\right)\,d\mathcal L^1(r). \end{align*} The map $\mathcal R:\mathbb R^{n\times n}\to\mathbb R^{n\times n}$ defined by \begin{align*} \mathcal R(X)=A^\top X+XA-XBR^{-1}B^\top X+Q \end{align*} is polynomial in the entries of $X$, hence continuous. Since $0\le P_s\le \Gamma_KI_n$ for every $s>0$, the family $\{\mathcal R(P_s):s>0\}$ is bounded. Moreover, for $r\in[0,h]$, $P_{T-r}\to P$ uniformly in $r$ as $T\to\infty$, because $T-r\ge T-h\to\infty$ and the monotone bounded matrix limit has already been identified. Therefore $\mathcal R(P_{T-r})\to\mathcal R(P)$ uniformly on $[0,h]$, so the limit may be passed through the [Lebesgue integral](/page/Lebesgue%20Integral) over the finite [measure space](/page/Measure%20Space) $([0,h],\mathcal B([0,h]),\mathcal L^1)$. Letting $T\to\infty$, the left-hand side tends to $0$, and the right-hand side tends to \begin{align*} h\left(A^\top P+PA-PBR^{-1}B^\top P+Q\right). \end{align*} Since $h>0$, the limit matrix satisfies the algebraic Riccati equation \begin{align*} A^\top P+PA-PBR^{-1}B^\top P+Q=0. 
\end{align*} [guided] The finite-horizon value cannot decrease when the time horizon is enlarged, because an admissible control on the longer interval has a restriction to the shorter interval and the extra running cost is non-negative. Thus, for $0<T_1<T_2$, \begin{align*} x_0^\top P_{T_1}x_0\le x_0^\top P_{T_2}x_0 \end{align*} for every $x_0\in\mathbb R^n$. This means that $(P_T)_{T>0}$ is increasing in the Loewner order. The comparison estimate from the preceding step gives $P_T\le \Gamma_KI_n$, so every scalar quadratic form $x_0^\top P_Tx_0$ is monotone and bounded. Since real symmetric $n\times n$ matrices form a finite-dimensional [vector space](/page/Vector%20Space), these scalar limits determine a symmetric positive semidefinite matrix $P\in\mathbb R^{n\times n}$, and in fact $P_T\to P$ in matrix norm. Now we extract the stationary Riccati equation from the finite-horizon differential equation. Let $\Pi_T:[0,T]\to\mathbb R^{n\times n}$ be the Riccati matrix with terminal condition $\Pi_T(T)=0$, so that $P_T=\Pi_T(0)$. The finite-horizon continuous-time Riccati theorem gives \begin{align*} -\Pi_T'(t)=A^\top\Pi_T(t)+\Pi_T(t)A-\Pi_T(t)BR^{-1}B^\top\Pi_T(t)+Q. \end{align*} Because the dynamics and cost are autonomous, solving backward from terminal time $T$ and observing the system at time $s$ is the same finite-horizon problem with remaining horizon $T-s$. Hence $\Pi_T(s)=P_{T-s}$ for $0\le s\le T$. Integrating the differential equation from $0$ to $h$ gives, for $T>h$, \begin{align*} P_T-P_{T-h}=\int_0^h \left(A^\top P_{T-r}+P_{T-r}A-P_{T-r}BR^{-1}B^\top P_{T-r}+Q\right)\,d\mathcal L^1(r). \end{align*} The only delicate point is passing to the limit inside this integral. Define $\mathcal R:\mathbb R^{n\times n}\to\mathbb R^{n\times n}$ by \begin{align*} \mathcal R(X)=A^\top X+XA-XBR^{-1}B^\top X+Q. \end{align*} This map is continuous because it is polynomial in the entries of $X$. 
Also $0\le P_s\le \Gamma_KI_n$, so the matrices $P_s$ remain in a compact subset of the finite-dimensional matrix space. For $r\in[0,h]$, monotonicity gives $P_{T-h}\le P_{T-r}\le P$, and $P_{T-h}\to P$ as $T\to\infty$, so $P_{T-r}\to P$ uniformly in $r$. It follows that $\mathcal R(P_{T-r})\to\mathcal R(P)$ uniformly on $[0,h]$. Since $[0,h]$ has finite $\mathcal L^1$-measure, [uniform convergence](/page/Uniform%20Convergence) justifies passing the limit through the integral. The left-hand side satisfies $P_T-P_{T-h}\to P-P=0$. The right-hand side tends to \begin{align*} h\left(A^\top P+PA-PBR^{-1}B^\top P+Q\right). \end{align*} Since $h>0$, division by $h$ yields \begin{align*} A^\top P+PA-PBR^{-1}B^\top P+Q=0. \end{align*} Thus the monotone limit of the finite-horizon value matrices solves the algebraic Riccati equation. [/guided] [/step] [step:Use detectability to prove that the limiting feedback is stabilising] Define the closed-loop matrix \begin{align*} A_P:=A-BR^{-1}B^\top P\in\mathbb R^{n\times n}. \end{align*} Since $R=R^\top>0$, there is a unique symmetric positive definite matrix $R^{-1/2}\in\mathbb R^{m\times m}$ such that $R^{-1/2}R^{-1/2}=R^{-1}$. Using the algebraic Riccati equation, for every $x\in\mathbb R^n$, \begin{align*} x^\top(A_P^\top P+PA_P)x=-x^\top Qx-x^\top PBR^{-1}B^\top Px. \end{align*} Since $Q=C^\top C$ and $R^{-1}=R^{-\top}>0$, this becomes \begin{align*} x^\top(A_P^\top P+PA_P)x=-|Cx|^2-|R^{-1/2}B^\top Px|^2\le 0. \end{align*} We prove that $A_P$ has no eigenvalue in the closed right half-plane. Let $\lambda\in\mathbb C$ and $v\in\mathbb C^n\setminus\{0\}$ satisfy $A_Pv=\lambda v$. Both sides of the displayed identity are quadratic forms of real symmetric matrices, so the identity holds at the matrix level, and complexifying it gives \begin{align*} v^*(A_P^*P+PA_P)v=-|Cv|^2-|R^{-1/2}B^\top Pv|^2. \end{align*} The left-hand side equals \begin{align*} (\overline\lambda+\lambda)v^*Pv=2\operatorname{Re}\lambda\,v^*Pv. 
\end{align*} Since $P\ge 0$, the left-hand side is non-negative whenever $\operatorname{Re}\lambda\ge 0$, while the right-hand side is non-positive. Therefore, if $\operatorname{Re}\lambda\ge 0$, both sides are zero, and in particular $Cv=0$ and $B^\top Pv=0$. Hence \begin{align*} Av=A_Pv+BR^{-1}B^\top Pv=\lambda v. \end{align*} Detectability of $(C,A)$ means that every eigenvector of $A$ with $Cv=0$ has eigenvalue in the open left half-plane. This contradicts $\operatorname{Re}\lambda\ge 0$. Thus every eigenvalue of $A_P$ has negative real part, so $A_P$ is Hurwitz. [guided] The stabilising property means that the matrix \begin{align*} A_P:=A-BR^{-1}B^\top P \end{align*} has all eigenvalues in the open left half-plane. We begin with the Lyapunov identity supplied by the algebraic Riccati equation. Because $R=R^\top>0$, the inverse $R^{-1}$ is symmetric positive definite, and there is a unique symmetric positive definite square root $R^{-1/2}\in\mathbb R^{m\times m}$ satisfying $R^{-1/2}R^{-1/2}=R^{-1}$. Using \begin{align*} A^\top P+PA-PBR^{-1}B^\top P+Q=0 \end{align*} and $A_P=A-BR^{-1}B^\top P$, we compute for $x\in\mathbb R^n$: \begin{align*} x^\top(A_P^\top P+PA_P)x=-x^\top Qx-x^\top PBR^{-1}B^\top Px. \end{align*} Since $Q=C^\top C$, the first term on the right is $-|Cx|^2$. Since $P=P^\top$ and $R^{-1}=R^{-1/2}R^{-1/2}$, the second term is $-|R^{-1/2}B^\top Px|^2$. Hence \begin{align*} x^\top(A_P^\top P+PA_P)x=-|Cx|^2-|R^{-1/2}B^\top Px|^2\le 0. \end{align*} Now take a complex eigenpair $A_Pv=\lambda v$ with $v\in\mathbb C^n\setminus\{0\}$. The same identity holds after complexification, with transpose replaced by conjugate transpose: \begin{align*} v^*(A_P^*P+PA_P)v=-|Cv|^2-|R^{-1/2}B^\top Pv|^2. \end{align*} The eigenvalue equation gives $v^*A_P^*P v=\overline\lambda v^*Pv$ and $v^*PA_Pv=\lambda v^*Pv$, so the left-hand side is \begin{align*} 2\operatorname{Re}\lambda\,v^*Pv. 
\end{align*} If $\operatorname{Re}\lambda\ge 0$, this quantity is non-negative because $P\ge 0$. The right-hand side is a sum of two non-positive terms. Therefore equality can hold only if both non-positive terms vanish: \begin{align*} Cv=0 \end{align*} and \begin{align*} B^\top Pv=0. \end{align*} The second identity converts the closed-loop eigenvector back into an open-loop eigenvector: \begin{align*} Av=A_Pv+BR^{-1}B^\top Pv=\lambda v. \end{align*} Thus $v$ is an eigenvector of $A$ with output $Cv=0$ and eigenvalue $\lambda$ in the closed right half-plane. Detectability of $(C,A)$ rules out exactly this situation: unobservable eigenvectors must correspond only to eigenvalues with negative real part. This contradiction proves that no eigenvalue of $A_P$ has non-negative real part. Hence $A_P$ is Hurwitz. [/guided] [/step] [step:Complete the square to prove optimality and identify the infinite-horizon value] Define the infinite-horizon admissible class $\mathcal U_\infty(x_0)$ to be the set of measurable controls $u:(0,\infty)\to\mathbb R^m$ with $u\in L^2_{\mathrm{loc}}((0,\infty);\mathbb R^m)$ for which the state $x_{u,x_0}:[0,\infty)\to\mathbb R^n$ is absolutely continuous, satisfies $x_{u,x_0}'(t)=Ax_{u,x_0}(t)+Bu(t)$ and $x_{u,x_0}(0)=x_0$, and has well-defined extended cost \begin{align*} J_\infty[u;x_0]=\int_0^\infty \left(x_{u,x_0}(t)^\top Qx_{u,x_0}(t)+u(t)^\top Ru(t)\right)\,d\mathcal L^1(t)\in[0,\infty]. \end{align*} For $u\in\mathcal U_\infty(x_0)$ and $T>0$, the restriction $u|_{(0,T)}$ is admissible for the finite-horizon problem, so \begin{align*} \int_0^{\!T}\left(x_{u,x_0}(t)^\top Qx_{u,x_0}(t)+u(t)^\top Ru(t)\right)\,d\mathcal L^1(t)\ge x_0^\top P_Tx_0. \end{align*} Since the integrand is non-negative, the left-hand side increases to $J_\infty[u;x_0]$ as $T\to\infty$ by monotone convergence for Lebesgue integrals, while $x_0^\top P_Tx_0\to x_0^\top Px_0$. Hence \begin{align*} J_\infty[u;x_0]\ge x_0^\top Px_0. 
\end{align*} For the feedback trajectory $x^*:[0,\infty)\to\mathbb R^n$ solving $x^{*'}(t)=A_Px^*(t)$ and $x^*(0)=x_0$, define $u^*:(0,\infty)\to\mathbb R^m$ by \begin{align*} u^*(t)=-R^{-1}B^\top Px^*(t). \end{align*} Since $A_P$ is Hurwitz, $x^*(t)$ decays exponentially, and therefore $u^*$ and the corresponding running cost are integrable on $(0,\infty)$. Define the feedback defect $w^*:(0,\infty)\to\mathbb R^m$ by \begin{align*} w^*(t)=u^*(t)+R^{-1}B^\top Px^*(t). \end{align*} Then $w^*(t)=0$ for every $t>0$. Using $x^{*'}=Ax^*+Bu^*$ and the algebraic Riccati equation, differentiation gives \begin{align*} \frac{d}{dt}\left(x^*(t)^\top Px^*(t)\right)=w^*(t)^\top Rw^*(t)-x^*(t)^\top Qx^*(t)-u^*(t)^\top Ru^*(t). \end{align*} Integrating over $(0,T)$ with respect to $\mathcal L^1$ and using $w^*=0$ gives \begin{align*} \int_0^{\!T}\left(x^*(t)^\top Qx^*(t)+u^*(t)^\top Ru^*(t)\right)\,d\mathcal L^1(t)=x_0^\top Px_0-x^*(T)^\top Px^*(T). \end{align*} Letting $T\to\infty$ and using exponential decay of $x^*$ yields \begin{align*} J_\infty[u^*;x_0]=x_0^\top Px_0. \end{align*} Thus the feedback is optimal and the optimal cost is $x_0^\top Px_0$. [guided] The lower bound for arbitrary controls does not require a separate transversality theorem. We define the infinite-horizon admissible class $\mathcal U_\infty(x_0)$ to consist of measurable controls $u:(0,\infty)\to\mathbb R^m$ with $u\in L^2_{\mathrm{loc}}((0,\infty);\mathbb R^m)$ such that the state $x_{u,x_0}:[0,\infty)\to\mathbb R^n$ is absolutely continuous, solves \begin{align*} x_{u,x_0}'(t)=Ax_{u,x_0}(t)+Bu(t) \end{align*} with $x_{u,x_0}(0)=x_0$, and has extended non-negative cost \begin{align*} J_\infty[u;x_0]=\int_0^\infty \left(x_{u,x_0}(t)^\top Qx_{u,x_0}(t)+u(t)^\top Ru(t)\right)\,d\mathcal L^1(t). \end{align*} For any $T>0$, the restricted control $u|_{(0,T)}$ is admissible for the finite-horizon problem with the same initial state. 
Therefore finite-horizon optimality gives \begin{align*} \int_0^{\!T}\left(x_{u,x_0}(t)^\top Qx_{u,x_0}(t)+u(t)^\top Ru(t)\right)\,d\mathcal L^1(t)\ge x_0^\top P_Tx_0. \end{align*} The integrand is non-negative because $Q\ge 0$ and $R>0$. Hence the finite-horizon integrals increase to the infinite-horizon integral by monotone convergence for Lebesgue integrals. At the same time, $P_T\to P$, so $x_0^\top P_Tx_0\to x_0^\top Px_0$. Passing to the limit gives \begin{align*} J_\infty[u;x_0]\ge x_0^\top Px_0. \end{align*} This proves the lower bound for every admissible control, including controls of infinite cost. It remains to show that the claimed feedback attains this lower bound. Let $x^*:[0,\infty)\to\mathbb R^n$ solve \begin{align*} x^{*'}(t)=A_Px^*(t),\qquad x^*(0)=x_0, \end{align*} and define $u^*:(0,\infty)\to\mathbb R^m$ by \begin{align*} u^*(t)=-R^{-1}B^\top Px^*(t). \end{align*} Since $A_P$ is Hurwitz, $x^*$ decays exponentially. Consequently $u^*$ also decays exponentially, so $u^*\in L^2((0,\infty);\mathbb R^m)$ and the infinite-horizon cost is finite. Define the feedback defect $w^*:(0,\infty)\to\mathbb R^m$ by \begin{align*} w^*(t)=u^*(t)+R^{-1}B^\top Px^*(t). \end{align*} For the feedback control, $w^*(t)=0$. Differentiating the quadratic function $x\mapsto x^\top Px$ along the controlled trajectory and using the algebraic Riccati equation gives \begin{align*} \frac{d}{dt}\left(x^*(t)^\top Px^*(t)\right)=w^*(t)^\top Rw^*(t)-x^*(t)^\top Qx^*(t)-u^*(t)^\top Ru^*(t). \end{align*} Since $w^*=0$, integration over $(0,T)$ with respect to $\mathcal L^1$ gives \begin{align*} \int_0^{\!T}\left(x^*(t)^\top Qx^*(t)+u^*(t)^\top Ru^*(t)\right)\,d\mathcal L^1(t)=x_0^\top Px_0-x^*(T)^\top Px^*(T). \end{align*} Exponential decay of $x^*$ implies $x^*(T)^\top Px^*(T)\to 0$. Letting $T\to\infty$ therefore yields \begin{align*} J_\infty[u^*;x_0]=x_0^\top Px_0. 
\end{align*} The lower bound and the attaining feedback together prove optimality and identify the infinite-horizon value. [/guided] [/step] [step:Compare two stabilising solutions by a Lyapunov identity] Let $S\in\mathbb R^{n\times n}$ be another symmetric positive semidefinite solution of the algebraic Riccati equation such that \begin{align*} A_S:=A-BR^{-1}B^\top S \end{align*} is Hurwitz. Define the symmetric matrix $D\in\mathbb R^{n\times n}$ by $D:=S-P$, and define the symmetric positive semidefinite matrix $G\in\mathbb R^{n\times n}$ by $G:=BR^{-1}B^\top$. Subtracting the Riccati equation for $P$ from the Riccati equation for $S$ gives \begin{align*} A^\top D+DA-SGS+PGP=0. \end{align*} Using $S=P+D$ and $A_P=A-GP$, this is equivalent to \begin{align*} A_P^\top D+DA_P-DGD=0. \end{align*} For an arbitrary $x_0\in\mathbb R^n$, let $y_P:[0,\infty)\to\mathbb R^n$ be the solution of $y_P'(t)=A_Py_P(t)$ with $y_P(0)=x_0$. Since $A_P$ is Hurwitz, $y_P(t)\to 0$ as $t\to\infty$. Differentiating $y_P(t)^\top Dy_P(t)$ gives \begin{align*} \frac{d}{dt}\left(y_P(t)^\top Dy_P(t)\right)=y_P(t)^\top DGDy_P(t)\ge 0. \end{align*} Integrating over $(0,T)$ with respect to $\mathcal L^1$ and letting $T\to\infty$ yields \begin{align*} x_0^\top D x_0=-\int_0^\infty y_P(t)^\top DGDy_P(t)\,d\mathcal L^1(t)\le 0. \end{align*} Hence $S\le P$ in the Loewner order. The same subtraction can be written relative to $A_S=A-GS$ as \begin{align*} A_S^\top D+DA_S+DGD=0. \end{align*} Let $y_S:[0,\infty)\to\mathbb R^n$ be the solution of $y_S'(t)=A_Sy_S(t)$ with $y_S(0)=x_0$. Since $A_S$ is Hurwitz, $y_S(t)\to 0$ as $t\to\infty$. Differentiating $y_S(t)^\top Dy_S(t)$ gives \begin{align*} \frac{d}{dt}\left(y_S(t)^\top Dy_S(t)\right)=-y_S(t)^\top DGDy_S(t)\le 0. \end{align*} Integrating over $(0,T)$ with respect to $\mathcal L^1$ and letting $T\to\infty$ yields \begin{align*} x_0^\top D x_0=\int_0^\infty y_S(t)^\top DGDy_S(t)\,d\mathcal L^1(t)\ge 0. \end{align*} Thus $P\le S$. 
Therefore $x_0^\top(P-S)x_0=0$ for every $x_0\in\mathbb R^n$. The polarization identity for symmetric bilinear forms gives $P=S$. This proves uniqueness of the positive semidefinite stabilising solution and completes the theorem. [guided] Suppose $S\in\mathbb R^{n\times n}$ is another symmetric positive semidefinite solution of the algebraic Riccati equation and that its closed-loop matrix \begin{align*} A_S:=A-BR^{-1}B^\top S \end{align*} is Hurwitz. We avoid any transversality assumption for arbitrary admissible controls by comparing the two Riccati equations directly. Define $D:=S-P\in\mathbb R^{n\times n}$ and $G:=BR^{-1}B^\top\in\mathbb R^{n\times n}$. Both $D$ and $G$ are symmetric, and $G$ is positive semidefinite because $R^{-1}$ is positive definite. Subtract the algebraic Riccati equation for $P$ from the algebraic Riccati equation for $S$: \begin{align*} A^\top D+DA-SGS+PGP=0. \end{align*} Since $S=P+D$, we expand $SGS=(P+D)G(P+D)$. After cancellation of the $PGP$ terms, the identity becomes \begin{align*} A^\top D+DA-PGD-DGP-DGD=0. \end{align*} Because $A_P=A-GP$, this is exactly \begin{align*} A_P^\top D+DA_P-DGD=0. \end{align*} Now let $y_P:[0,\infty)\to\mathbb R^n$ solve $y_P'(t)=A_Py_P(t)$ with $y_P(0)=x_0$. The matrix $A_P$ is Hurwitz, so $y_P(t)\to 0$ exponentially. Differentiating the scalar function $t\mapsto y_P(t)^\top Dy_P(t)$ and using the last displayed identity gives \begin{align*} \frac{d}{dt}\left(y_P(t)^\top Dy_P(t)\right)=y_P(t)^\top(A_P^\top D+DA_P)y_P(t)=y_P(t)^\top DGDy_P(t). \end{align*} The last term is non-negative because $DGD=(B^\top D)^\top R^{-1}(B^\top D)$. Therefore integration over $(0,T)$ gives \begin{align*} y_P(T)^\top Dy_P(T)-x_0^\top D x_0=\int_0^{\!T}y_P(t)^\top DGDy_P(t)\,d\mathcal L^1(t). \end{align*} Letting $T\to\infty$ is justified by exponential decay of $y_P$ and finiteness of the fixed matrix $D$, so $y_P(T)^\top Dy_P(T)\to 0$. 
Hence \begin{align*} x_0^\top D x_0=-\int_0^\infty y_P(t)^\top DGDy_P(t)\,d\mathcal L^1(t)\le 0. \end{align*} Thus $S\le P$. To get the opposite inequality, use the same algebraic difference but write it relative to $A_S=A-GS$. Since $S=P+D$, the subtraction identity is equivalent to \begin{align*} A_S^\top D+DA_S+DGD=0. \end{align*} Let $y_S:[0,\infty)\to\mathbb R^n$ solve $y_S'(t)=A_Sy_S(t)$ with $y_S(0)=x_0$. Because $A_S$ is Hurwitz by assumption, $y_S(t)\to 0$ exponentially. Differentiating along this trajectory gives \begin{align*} \frac{d}{dt}\left(y_S(t)^\top Dy_S(t)\right)=y_S(t)^\top(A_S^\top D+DA_S)y_S(t)=-y_S(t)^\top DGDy_S(t)\le 0. \end{align*} Integrating over $(0,T)$ and passing to the limit $T\to\infty$ gives \begin{align*} x_0^\top D x_0=\int_0^\infty y_S(t)^\top DGDy_S(t)\,d\mathcal L^1(t)\ge 0. \end{align*} Thus $P\le S$. Combining $S\le P$ and $P\le S$ gives \begin{align*} x_0^\top(P-S)x_0=0 \end{align*} for every $x_0\in\mathbb R^n$. Since $P-S$ is symmetric, the polarization identity for symmetric bilinear forms implies $P-S=0$. Hence $P=S$, proving uniqueness of the positive semidefinite stabilising solution. [/guided] [/step]
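
The five steps can be checked by hand in the scalar case. The following is only a sketch, under assumptions that are not part of the theorem: take $n=m=1$ with hypothetical scalars $A=a\in\mathbb R$, $B=b\neq 0$, $C=c\neq 0$, and $R=r>0$, so that $Q=c^2$, the pair $(a,b)$ is stabilisable, and the pair $(c,a)$ is detectable. Write $\omega:=\sqrt{a^2+b^2c^2/r}$, so that $\omega>|a|$. The algebraic Riccati equation reduces to a quadratic in $p$ with two real roots: \begin{align*} 2ap-\frac{b^2}{r}p^2+c^2=0, \qquad p_\pm=\frac{r\,(a\pm\omega)}{b^2}. \end{align*} Since $\omega>|a|$, we have $p_+>0>p_-$, so $p_+$ is the only positive semidefinite solution. Its closed-loop scalar is \begin{align*} a-\frac{b^2}{r}p_+=a-(a+\omega)=-\omega<0, \end{align*} which is Hurwitz, whereas the negative root gives $a-\frac{b^2}{r}p_-=\omega>0$. Solving the scalar Riccati differential equation with terminal weight $0$ gives the finite-horizon value in closed form: \begin{align*} P_T=\frac{c^2\sinh(\omega T)}{\omega\cosh(\omega T)-a\sinh(\omega T)}, \qquad \frac{d}{dT}P_T=\frac{c^2\omega^2}{\left(\omega\cosh(\omega T)-a\sinh(\omega T)\right)^2}>0. \end{align*} Hence $P_T$ is strictly increasing in $T$ with $P_T\to 0$ as $T\downarrow 0$, and \begin{align*} \lim_{T\to\infty}P_T=\frac{c^2}{\omega-a}=\frac{c^2(\omega+a)}{\omega^2-a^2}=\frac{r\,(a+\omega)}{b^2}=p_+. \end{align*} In this example the optimal feedback is $u^*(t)=-\frac{a+\omega}{b}x^*(t)$ and the optimal cost is $p_+x_0^2$, which agrees with the monotone limit, the Hurwitz closed loop, and the uniqueness statement obtained in the proof.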
