Linear Quadratic Regulator Energy Identity — Statement & Proof

Theorem

Discussion

Proof

[proofplan] We differentiate the quadratic storage function $V(x)=x^\top Px$ along the closed-loop system $x'=(A-BK)x$. The Riccati equation converts the terms involving $A$ and $B$ into the negative state cost $-x^\top Qx$ and the negative control cost $-u^{*\top}Ru^*$. The energy identity itself uses the Riccati equation, symmetry and positive semidefiniteness of $P$, the definition of $K$, the positivity of $R$, the factorisation $Q=C^\top C$, and the stabilising property that $A-BK$ is Hurwitz. Since $A-BK$ is Hurwitz, the closed-loop trajectory decays to zero, so integrating the differential identity over $[0,T]$ and letting $T \to \infty$ gives the energy identity. [/proofplan] [step:Differentiate the quadratic storage function along the closed-loop trajectory] Define the storage function $V: \mathbb{R}^n \to \mathbb{R}$ by \begin{align*} V(z)=z^\top Pz. \end{align*} Since $x: [0,\infty) \to \mathbb{R}^n$ is differentiable and $P=P^\top$, the product rule gives, for every $t \ge 0$, \begin{align*} \frac{d}{dt}V(x(t))=x'(t)^\top Px(t)+x(t)^\top Px'(t). \end{align*} Using $x'(t)=(A-BK)x(t)$, this becomes \begin{align*} \frac{d}{dt}V(x(t))=x(t)^\top\left((A-BK)^\top P+P(A-BK)\right)x(t). \end{align*} Expanding the matrix expression gives \begin{align*} \frac{d}{dt}V(x(t))=x(t)^\top\left(A^\top P+PA-K^\top B^\top P-PBK\right)x(t). \end{align*} [guided] The natural quantity to differentiate is the quadratic function determined by the Riccati solution. Define $V: \mathbb{R}^n \to \mathbb{R}$ by \begin{align*} V(z)=z^\top Pz. \end{align*} The map $x: [0,\infty) \to \mathbb{R}^n$ is differentiable because it solves a linear constant-coefficient ordinary differential equation. Since $P=P^\top$, differentiating $V(x(t))=x(t)^\top Px(t)$ by the product rule gives \begin{align*} \frac{d}{dt}V(x(t))=x'(t)^\top Px(t)+x(t)^\top Px'(t). \end{align*} Substitute the closed-loop equation $x'(t)=(A-BK)x(t)$ into both appearances of $x'(t)$. 
The first term becomes \begin{align*} x'(t)^\top Px(t)=x(t)^\top(A-BK)^\top Px(t), \end{align*} and the second term becomes \begin{align*} x(t)^\top Px'(t)=x(t)^\top P(A-BK)x(t). \end{align*} Adding these two identities yields \begin{align*} \frac{d}{dt}V(x(t))=x(t)^\top\left((A-BK)^\top P+P(A-BK)\right)x(t). \end{align*} Now expand the closed-loop matrix terms: \begin{align*} (A-BK)^\top P+P(A-BK)=A^\top P-K^\top B^\top P+PA-PBK. \end{align*} Therefore \begin{align*} \frac{d}{dt}V(x(t))=x(t)^\top\left(A^\top P+PA-K^\top B^\top P-PBK\right)x(t). \end{align*} This is the point of introducing $V$: the derivative produces exactly the matrix combination that can be simplified by the Riccati equation. [/guided] [/step] [step:Use the Riccati equation to identify the state and control costs] Because $K=R^{-1}B^\top P$, $R=R^\top$, and $P=P^\top$, we have \begin{align*} K^\top B^\top P=PBR^{-1}B^\top P. \end{align*} Also, \begin{align*} PBK=PBR^{-1}B^\top P. \end{align*} Hence \begin{align*} A^\top P+PA-K^\top B^\top P-PBK=A^\top P+PA-2PBR^{-1}B^\top P. \end{align*} The Riccati equation gives \begin{align*} A^\top P+PA-PBR^{-1}B^\top P=-Q. \end{align*} Therefore \begin{align*} A^\top P+PA-2PBR^{-1}B^\top P=-Q-PBR^{-1}B^\top P. \end{align*} Thus, for every $t \ge 0$, \begin{align*} \frac{d}{dt}V(x(t))=-x(t)^\top Qx(t)-x(t)^\top PBR^{-1}B^\top Px(t). \end{align*} Finally, since $u^*(t)=-Kx(t)$, \begin{align*} u^*(t)^\top Ru^*(t)=x(t)^\top K^\top RKx(t). \end{align*} Since $R=R^\top>0$, the matrix $R$ is invertible and taking transposes in $RR^{-1}=I$ gives $(R^{-1})^\top R=I$, hence $(R^{-1})^\top=R^{-1}$. Using this identity together with $K=R^{-1}B^\top P$ gives \begin{align*} K^\top RK=PBR^{-1}B^\top P. \end{align*} Consequently, \begin{align*} \frac{d}{dt}\left(x(t)^\top Px(t)\right)=-x(t)^\top Qx(t)-u^*(t)^\top Ru^*(t). \end{align*} [guided] We now convert the algebraic expression from the differentiated storage function into the LQR cost terms. 
The gain matrix is $K=R^{-1}B^\top P$, and $R=R^\top>0$ implies that $R$ is invertible. Since $P=P^\top$ and $R^{-1}$ is symmetric, taking the transpose of $K$ gives $K^\top=PBR^{-1}$. Therefore \begin{align*} K^\top B^\top P=PBR^{-1}B^\top P. \end{align*} The same definition of $K$ gives \begin{align*} PBK=PBR^{-1}B^\top P. \end{align*} Substituting these two identities into the derivative formula yields \begin{align*} A^\top P+PA-K^\top B^\top P-PBK=A^\top P+PA-2PBR^{-1}B^\top P. \end{align*} The algebraic Riccati equation states that \begin{align*} A^\top P+PA-PBR^{-1}B^\top P+Q=0. \end{align*} Rearranging this equation gives \begin{align*} A^\top P+PA-PBR^{-1}B^\top P=-Q. \end{align*} Hence \begin{align*} A^\top P+PA-2PBR^{-1}B^\top P=-Q-PBR^{-1}B^\top P. \end{align*} Substituting this back into the derivative of $V(x(t))$ gives, for every $t\ge 0$, \begin{align*} \frac{d}{dt}V(x(t))=-x(t)^\top Qx(t)-x(t)^\top PBR^{-1}B^\top Px(t). \end{align*} It remains to identify the second quadratic form with the control cost. Since $u^*(t)=-Kx(t)$, the sign disappears in the quadratic expression and \begin{align*} u^*(t)^\top Ru^*(t)=x(t)^\top K^\top RKx(t). \end{align*} Using $K=R^{-1}B^\top P$ and $K^\top=PBR^{-1}$, we obtain \begin{align*} K^\top RK=PBR^{-1}RR^{-1}B^\top P=PBR^{-1}B^\top P. \end{align*} Therefore \begin{align*} \frac{d}{dt}\left(x(t)^\top Px(t)\right)=-x(t)^\top Qx(t)-u^*(t)^\top Ru^*(t). \end{align*} [/guided] [/step] [step:Integrate the differential identity and pass to the infinite horizon] For $T>0$, integrate the differential identity over $[0,T]$ with respect to one-dimensional [Lebesgue measure](/page/Lebesgue%20Measure): \begin{align*} V(x(T))-V(x_0)=-\int_0^{\!T}\left(x(t)^\top Qx(t)+u^*(t)^\top Ru^*(t)\right)\, d\mathcal{L}^1(t). \end{align*} Equivalently, \begin{align*} V(x_0)-V(x(T))=\int_0^{\!T}\left(x(t)^\top Qx(t)+u^*(t)^\top Ru^*(t)\right)\, d\mathcal{L}^1(t). 
\end{align*} Because $P$ is assumed to be the stabilising Riccati solution, the closed-loop matrix $A-BK$ is Hurwitz, meaning that every eigenvalue of $A-BK$ has negative real part. In finite dimensions, this spectral condition yields constants $M>0$ and $\alpha>0$ such that $|e^{t(A-BK)}z|\leq Me^{-\alpha t}|z|$ for every $z\in\mathbb{R}^n$ and every $t\geq 0$. The closed-loop solution is \begin{align*} x(t)=e^{t(A-BK)}x_0, \end{align*} so $|x(t)|\leq Me^{-\alpha t}|x_0|$ and hence $x(t)\to 0$ as $t\to\infty$. Since $V$ is continuous, $V(x(T))\to V(0)=0$ as $T\to\infty$. The integrand is nonnegative because $Q=C^\top C$ and $R$ is positive definite: \begin{align*} x(t)^\top Qx(t)=|Cx(t)|^2 \ge 0, \end{align*} and \begin{align*} u^*(t)^\top Ru^*(t)\ge 0. \end{align*} Also $V(x(T))=x(T)^\top Px(T)\ge 0$ for every $T>0$ because $P\ge 0$. Therefore the finite-horizon integrals are monotone nondecreasing in $T$ and, by the finite-horizon identity, bounded above by $V(x_0)$. The nonnegative measurable function $t\mapsto x(t)^\top Qx(t)+u^*(t)^\top Ru^*(t)$ on $[0,\infty)$ therefore has a well-defined improper integral as the monotone limit of its finite-horizon integrals with respect to $\mathcal{L}^1$. Letting $T\to\infty$ gives \begin{align*} x_0^\top Px_0=\int_0^\infty \left(x(t)^\top Qx(t)+u^*(t)^\top Ru^*(t)\right)\, d\mathcal{L}^1(t). \end{align*} This is the asserted LQR energy identity. [guided] We integrate the differential identity over a finite time interval first, because the infinite-horizon identity requires a limiting argument. For $T>0$, the [fundamental theorem of calculus](/theorems/632) applied to the continuously differentiable function $t\mapsto V(x(t))$ gives \begin{align*} V(x(T))-V(x_0)=-\int_0^{\!T}\left(x(t)^\top Qx(t)+u^*(t)^\top Ru^*(t)\right)\, d\mathcal{L}^1(t). \end{align*} Equivalently, \begin{align*} V(x_0)-V(x(T))=\int_0^{\!T}\left(x(t)^\top Qx(t)+u^*(t)^\top Ru^*(t)\right)\, d\mathcal{L}^1(t). \end{align*} Because $P$ is the stabilising Riccati solution, the matrix $A-BK$ is Hurwitz. 
In finite dimensions, this spectral condition implies exponential decay of the matrix exponential: there exist constants $M>0$ and $\alpha>0$ such that $|e^{t(A-BK)}z|\leq Me^{-\alpha t}|z|$ for every $z\in\mathbb{R}^n$ and every $t\ge 0$. The closed-loop solution is \begin{align*} x(t)=e^{t(A-BK)}x_0, \end{align*} so $x(t)\to 0$ as $t\to\infty$. Since $V: \mathbb{R}^n\to\mathbb{R}$ is continuous and $V(0)=0$, it follows that $V(x(T))\to 0$ as $T\to\infty$. We also need the right-hand side to have a well-defined infinite-horizon limit. The factorisation $Q=C^\top C$ gives \begin{align*} x(t)^\top Qx(t)=|Cx(t)|^2\ge 0, \end{align*} and the positive definiteness of $R$ gives \begin{align*} u^*(t)^\top Ru^*(t)\ge 0. \end{align*} Thus the function $t\mapsto x(t)^\top Qx(t)+u^*(t)^\top Ru^*(t)$ is nonnegative and measurable on $[0,\infty)$, because $x$ and $u^*$ are continuous. The finite-horizon integrals are therefore monotone nondecreasing in $T$. Moreover $V(x(T))=x(T)^\top Px(T)\ge 0$ because $P\ge 0$, so the finite-horizon identity gives \begin{align*} \int_0^{\!T}\left(x(t)^\top Qx(t)+u^*(t)^\top Ru^*(t)\right)\, d\mathcal{L}^1(t)\le V(x_0). \end{align*} Hence the improper integral is the monotone limit of these bounded finite-horizon integrals. Passing to the limit $T\to\infty$ gives \begin{align*} x_0^\top Px_0=\int_0^\infty \left(x(t)^\top Qx(t)+u^*(t)^\top Ru^*(t)\right)\, d\mathcal{L}^1(t). \end{align*} This proves the asserted LQR energy identity. [/guided] [/step]
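
As a sanity check on the identity proved above, the following minimal sketch evaluates both sides numerically in the scalar case $n=1$, where the Riccati equation reduces to $2aP-(b^2/r)P^2+q=0$. The numerical values $a=1$, $b=1$, $q=3$, $r=1$, $x_0=2$, the helper names `scalar_lqr` and `running_cost_integral`, and the choice of a trapezoidal rule on a truncated horizon are illustrative assumptions and not part of the proof.

```python
import math


def scalar_lqr(a, b, q, r):
    """Return (P, K) for the scalar problem (hypothetical helper).

    P is the nonnegative root of 2aP - (b^2/r)P^2 + q = 0, which is the
    stabilising one when b != 0 and q > 0; K = R^{-1} B^T P becomes b*P/r.
    """
    p = r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)
    k = b * p / r
    return p, k


def running_cost_integral(a, b, q, r, k, x0, horizon, steps):
    """Trapezoidal approximation of int_0^T (q x^2 + r u^2) dt (hypothetical helper).

    Uses the closed-loop solution x(t) = exp((a - b k) t) x0 and u = -k x.
    """
    closed_loop = a - b * k
    dt = horizon / steps
    total = 0.0
    for i in range(steps + 1):
        x = math.exp(closed_loop * i * dt) * x0
        u = -k * x
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoid end-point weights
        total += weight * (q * x * x + r * u * u) * dt
    return total


# Assumed illustrative data; any b != 0, q > 0, r > 0 would do.
a, b, q, r, x0 = 1.0, 1.0, 3.0, 1.0, 2.0
p, k = scalar_lqr(a, b, q, r)

riccati_residual = 2 * a * p - (b * b / r) * p * p + q  # should vanish
energy = p * x0 * x0                                    # left side: x0^T P x0
cost = running_cost_integral(a, b, q, r, k, x0, horizon=10.0, steps=100_000)

print("P =", p, " K =", k, " A - BK =", a - b * k)
print("Riccati residual =", riccati_residual)
print("x0^T P x0 =", energy, " truncated cost integral =", cost)
```

With these assumed values the stabilising solution is $P=3$, so $K=3$ and $A-BK=-2$, and the left-hand side is $x_0^\top Px_0=12$. The truncated integral should agree with this up to quadrature error, since the neglected tail beyond $T=10$ decays like $e^{-4T}$.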


Linear Quadratic Regulator Energy Identity (Theorem # 6404)
