Continuous-Time Algebraic Riccati Equation Existence and LQR Optimality Theorem (Theorem # 6403)
Theorem
Let $A\in\mathbb R^{n\times n}$, $B\in\mathbb R^{n\times m}$, $C\in\mathbb R^{p\times n}$, $Q=C^\top C\in\mathbb R^{n\times n}$, and $R\in\mathbb R^{m\times m}$ with $R=R^\top>0$. Assume that $(A,B)$ is stabilisable and $(C,A)$ is detectable. Then the algebraic Riccati equation
\begin{align*}
A^\top P+PA-PBR^{-1}B^\top P+Q=0
\end{align*}
has a unique positive semidefinite stabilising solution $P\ge 0$. For $x_0\in\mathbb R^n$, define the infinite-horizon cost of an admissible control $u:(0,\infty)\to\mathbb R^m$ and its state trajectory $x:[0,\infty)\to\mathbb R^n$, satisfying $x'(t)=Ax(t)+Bu(t)$ and $x(0)=x_0$, by
\begin{align*}
J_\infty[u;x_0]=\int_0^\infty \left(x(t)^\top Qx(t)+u(t)^\top Ru(t)\right)\,d\mathcal L^1(t).
\end{align*}
The infinite-horizon LQR problem has optimal feedback
\begin{align*}
u^*(t)=-R^{-1}B^\top Px^*(t).
\end{align*}
Here
\begin{align*}
{x^*}'(t)=\left(A-BR^{-1}B^\top P\right)x^*(t)
\end{align*}
and $x^*(0)=x_0$. The optimal cost is
\begin{align*}
\inf_u J_\infty[u;x_0]=x_0^\top Px_0.
\end{align*}
Discussion
This theorem is a standard result for linear control systems. Under stabilisability of $(A,B)$ and detectability of $(C,A)$, the algebraic Riccati equation has a unique positive semidefinite stabilising solution $P$, the linear state feedback $u^*=-R^{-1}B^\top Px^*$ is optimal for the infinite-horizon LQR problem, and the optimal cost from the initial state $x_0$ is $x_0^\top Px_0$.
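The statement can be checked by hand in the scalar case $n=m=p=1$. The following minimal sketch assumes $B\ne 0$ and uses illustrative values $A=B=C=R=1$; the helper name `scalar_care` is hypothetical and not taken from any library. It computes the stabilising root of the scalar equation $2AP-(B^2/R)P^2+Q=0$, then evaluates the residual, the closed-loop coefficient, and the feedback gain.

```python
import math

def scalar_care(a, b, q, r):
    """Stabilising root of 2*a*P - (b**2/r)*P**2 + q = 0 (scalar case, b != 0)."""
    if b == 0:
        raise ValueError("the closed form below assumes b != 0")
    g = b * b / r                      # scalar version of B R^{-1} B^T
    # Larger root of g*P^2 - 2*a*P - q = 0; it is the non-negative one.
    return (a + math.sqrt(a * a + g * q)) / g

a, b, q, r = 1.0, 1.0, 1.0, 1.0        # unstable open loop, illustrative values
P = scalar_care(a, b, q, r)
residual = 2 * a * P - (b * b / r) * P * P + q
closed_loop = a - (b * b / r) * P      # scalar A - B R^{-1} B^T P
gain = b * P / r                       # u*(t) = -gain * x*(t)
print(P, residual, closed_loop, gain)
```

For these values $P=1+\sqrt2$ and the closed-loop coefficient is $-\sqrt2$, so the open-loop instability is removed by the feedback.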
Proof
[proofplan]
We obtain the infinite-horizon solution as the monotone limit of the finite-horizon Riccati value matrices. Stabilisability supplies one finite-cost stabilising comparison feedback, so the finite-horizon value matrices are uniformly bounded; monotonicity then gives a positive semidefinite limit. Passing the Riccati equation to the limit gives the algebraic Riccati equation, detectability forces the associated closed-loop matrix to be Hurwitz, and completing the square proves both optimality and uniqueness among stabilising positive semidefinite solutions.
[/proofplan]
[step:Build finite-horizon value matrices and bound them by a stabilising feedback]
Let $\mathcal L^1$ denote one-dimensional [Lebesgue measure](/page/Lebesgue%20Measure) on the Borel subsets of $\mathbb R$. A real square matrix is called Hurwitz if every complex eigenvalue has strictly negative real part. A solution of the algebraic Riccati equation is called stabilising if its associated closed-loop matrix is Hurwitz. For $T>0$, define the finite-horizon admissible class $\mathcal U_T$ to be the set of measurable maps $u:(0,T)\to\mathbb R^m$ with $u\in L^2((0,T);\mathbb R^m)$. For $u\in\mathcal U_T$ and $x_0\in\mathbb R^n$, let $x_{u,x_0,T}:[0,T]\to\mathbb R^n$ denote the absolutely continuous solution of
\begin{align*}
x_{u,x_0,T}'(t)=Ax_{u,x_0,T}(t)+Bu(t), \qquad x_{u,x_0,T}(0)=x_0.
\end{align*}
Define the finite-horizon cost functional $J_T[\cdot;x_0]:\mathcal U_T\to[0,\infty)$ by
\begin{align*}
J_T[u;x_0]=\int_0^{\!T}\left(x_{u,x_0,T}(t)^\top Qx_{u,x_0,T}(t)+u(t)^\top Ru(t)\right)\,d\mathcal L^1(t).
\end{align*}
Since $Q=C^\top C$, $Q$ is symmetric positive semidefinite. Since $R=R^\top>0$, the running cost is non-negative and strictly convex in $u$.
By the finite-horizon continuous-time Riccati theorem with terminal weight $0$ and sign convention
\begin{align*}
-\Pi_T'(t)=A^\top\Pi_T(t)+\Pi_T(t)A-\Pi_T(t)BR^{-1}B^\top\Pi_T(t)+Q,
\end{align*}
with terminal condition $\Pi_T(T)=0$, for every $T>0$ there is a continuously differentiable symmetric positive semidefinite Riccati matrix $\Pi_T:[0,T]\to\mathbb R^{n\times n}$ and a symmetric positive semidefinite matrix $P_T:=\Pi_T(0)\in\mathbb R^{n\times n}$ such that
\begin{align*}
\inf_{u\in\mathcal U_T}J_T[u;x_0]=x_0^\top P_Tx_0
\end{align*}
for every $x_0\in\mathbb R^n$.
Because $(A,B)$ is stabilisable, choose a matrix $K\in\mathbb R^{m\times n}$ such that $A-BK$ is Hurwitz. Define the comparison closed-loop trajectory $x_K:[0,\infty)\to\mathbb R^n$ by
\begin{align*}
x_K'(t)=(A-BK)x_K(t), \qquad x_K(0)=x_0,
\end{align*}
and the comparison control $u_K:(0,\infty)\to\mathbb R^m$ by $u_K(t)=-Kx_K(t)$. Since $A-BK$ is Hurwitz, there are constants $M_K\ge 1$ and $\alpha_K>0$ such that
\begin{align*}
|x_K(t)|\le M_Ke^{-\alpha_Kt}|x_0|
\end{align*}
for every $t\ge 0$. Let $\|Q\|_{\mathrm{op}}$ and $\|R\|_{\mathrm{op}}$ denote the Euclidean operator norms of $Q$ and $R$. Then
\begin{align*}
x_K(t)^\top Qx_K(t)+u_K(t)^\top Ru_K(t)\le \left(\|Q\|_{\mathrm{op}}+\|K\|_{\mathrm{op}}^2\|R\|_{\mathrm{op}}\right)M_K^2e^{-2\alpha_Kt}|x_0|^2.
\end{align*}
Integrating with respect to $\mathcal L^1$ gives
\begin{align*}
J_T[u_K|_{(0,T)};x_0]\le \frac{\left(\|Q\|_{\mathrm{op}}+\|K\|_{\mathrm{op}}^2\|R\|_{\mathrm{op}}\right)M_K^2}{2\alpha_K}|x_0|^2.
\end{align*}
Define
\begin{align*}
\Gamma_K:=\frac{\left(\|Q\|_{\mathrm{op}}+\|K\|_{\mathrm{op}}^2\|R\|_{\mathrm{op}}\right)M_K^2}{2\alpha_K}.
\end{align*}
Since $P_T$ gives the infimum over $\mathcal U_T$, we have
\begin{align*}
0\le x_0^\top P_Tx_0\le \Gamma_K|x_0|^2
\end{align*}
for every $T>0$ and every $x_0\in\mathbb R^n$.
[guided]
The first purpose of this step is to produce a family of finite-horizon quadratic value functions. For a horizon $T>0$, the admissible controls are the square-integrable maps $u:(0,T)\to\mathbb R^m$, and each such control determines the state trajectory $x_{u,x_0,T}$ by the linear differential equation
\begin{align*}
x_{u,x_0,T}'(t)=Ax_{u,x_0,T}(t)+Bu(t), \qquad x_{u,x_0,T}(0)=x_0.
\end{align*}
The corresponding finite-horizon cost is
\begin{align*}
J_T[u;x_0]=\int_0^{\!T}\left(x_{u,x_0,T}(t)^\top Qx_{u,x_0,T}(t)+u(t)^\top Ru(t)\right)\,d\mathcal L^1(t).
\end{align*}
The hypotheses needed for the finite-horizon Riccati theorem are exactly the finite-dimensional linear dynamics, the positive semidefinite state weight, and the positive definite control weight. Here $Q=C^\top C$ is symmetric positive semidefinite because $x^\top Qx=|Cx|^2\ge 0$ for all $x\in\mathbb R^n$, and $R=R^\top>0$ is positive definite by hypothesis. Therefore the finite-horizon continuous-time Riccati theorem with terminal weight $0$ and sign convention
\begin{align*}
-\Pi_T'(t)=A^\top\Pi_T(t)+\Pi_T(t)A-\Pi_T(t)BR^{-1}B^\top\Pi_T(t)+Q,
\end{align*}
with terminal condition $\Pi_T(T)=0$, gives a continuously differentiable symmetric positive semidefinite Riccati matrix $\Pi_T:[0,T]\to\mathbb R^{n\times n}$ and a symmetric positive semidefinite matrix $P_T:=\Pi_T(0)\in\mathbb R^{n\times n}$ satisfying
\begin{align*}
\inf_{u\in\mathcal U_T}J_T[u;x_0]=x_0^\top P_Tx_0
\end{align*}
for every initial state $x_0\in\mathbb R^n$.
The second purpose is to prove that these matrices cannot grow without bound as $T$ increases. Stabilisability is used precisely here. Since $(A,B)$ is stabilisable, there exists a feedback matrix $K\in\mathbb R^{m\times n}$ such that the closed-loop matrix $A-BK$ is Hurwitz. Define the comparison trajectory $x_K:[0,\infty)\to\mathbb R^n$ by
\begin{align*}
x_K'(t)=(A-BK)x_K(t), \qquad x_K(0)=x_0,
\end{align*}
and define the comparison control by $u_K(t)=-Kx_K(t)$. The Hurwitz property gives exponential decay: there are constants $M_K\ge 1$ and $\alpha_K>0$ such that
\begin{align*}
|x_K(t)|\le M_Ke^{-\alpha_Kt}|x_0|
\end{align*}
for all $t\ge 0$.
Using the Euclidean operator norms of $Q$, $R$, and $K$, we estimate the running cost along this stabilising feedback:
\begin{align*}
x_K(t)^\top Qx_K(t)\le \|Q\|_{\mathrm{op}}|x_K(t)|^2
\end{align*}
and
\begin{align*}
u_K(t)^\top Ru_K(t)\le \|R\|_{\mathrm{op}}|u_K(t)|^2\le \|R\|_{\mathrm{op}}\|K\|_{\mathrm{op}}^2|x_K(t)|^2.
\end{align*}
Substituting the exponential bound gives
\begin{align*}
x_K(t)^\top Qx_K(t)+u_K(t)^\top Ru_K(t)\le \left(\|Q\|_{\mathrm{op}}+\|K\|_{\mathrm{op}}^2\|R\|_{\mathrm{op}}\right)M_K^2e^{-2\alpha_Kt}|x_0|^2.
\end{align*}
The right-hand side is integrable on $(0,\infty)$ with respect to $\mathcal L^1$, so for every finite $T$,
\begin{align*}
J_T[u_K|_{(0,T)};x_0]\le \frac{\left(\|Q\|_{\mathrm{op}}+\|K\|_{\mathrm{op}}^2\|R\|_{\mathrm{op}}\right)M_K^2}{2\alpha_K}|x_0|^2.
\end{align*}
Define this explicit comparison constant by
\begin{align*}
\Gamma_K:=\frac{\left(\|Q\|_{\mathrm{op}}+\|K\|_{\mathrm{op}}^2\|R\|_{\mathrm{op}}\right)M_K^2}{2\alpha_K}.
\end{align*}
Since $P_T$ is the optimal finite-horizon value matrix, its value is no larger than the cost of this particular admissible stabilising control. Hence
\begin{align*}
0\le x_0^\top P_Tx_0\le \Gamma_K|x_0|^2
\end{align*}
for every $T>0$ and every $x_0\in\mathbb R^n$.
[/guided]
[/step]
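The bound $0\le x_0^\top P_Tx_0\le\Gamma_K|x_0|^2$ can be observed numerically in the scalar case. The sketch below is an illustration under assumed values $A=B=Q=R=1$ and comparison gain $K=3$; the function names are hypothetical. It integrates the Riccati equation in the remaining-horizon variable $s=T-t$, where $p(s)=\Pi_T(T-s)$ satisfies $p'(s)=\mathcal R(p(s))$ with $p(0)=0$, using a classical fourth-order Runge-Kutta scheme. In the scalar case the decay estimate holds with $M_K=1$ and $\alpha_K=BK-A$.

```python
def riccati_rhs(p, a, b, q, r):
    """Scalar Riccati right-hand side 2*a*p - (b**2/r)*p**2 + q."""
    return 2 * a * p - (b * b / r) * p * p + q

def finite_horizon_value(T, a, b, q, r, steps=2000):
    """Approximate P_T = Pi_T(0) by RK4 on p'(s) = rhs(p), p(0) = 0, s in [0, T]."""
    h = T / steps
    p = 0.0
    for _ in range(steps):
        k1 = riccati_rhs(p, a, b, q, r)
        k2 = riccati_rhs(p + 0.5 * h * k1, a, b, q, r)
        k3 = riccati_rhs(p + 0.5 * h * k2, a, b, q, r)
        k4 = riccati_rhs(p + h * k3, a, b, q, r)
        p += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return p

a, b, q, r = 1.0, 1.0, 1.0, 1.0          # illustrative scalar data
k = 3.0                                  # comparison gain, a - b*k = -2 < 0
alpha = b * k - a                        # decay rate alpha_K (M_K = 1 here)
gamma_K = (q + k * k * r) / (2 * alpha)  # comparison constant Gamma_K
values = [finite_horizon_value(T, a, b, q, r) for T in (0.5, 1.0, 2.0, 5.0, 10.0)]
print(values, gamma_K)
```

The computed values stay below $\Gamma_K=2.5$ for every horizon tried, as the comparison argument predicts; a sharper gain $K$ would give a smaller constant, but any stabilising gain suffices for boundedness.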
[step:Take the monotone limit and pass to the algebraic Riccati equation]
If $0<T_1<T_2$, restricting any admissible control on $(0,T_2)$ to $(0,T_1)$ and using the non-negativity of the running cost gives
\begin{align*}
x_0^\top P_{T_1}x_0\le x_0^\top P_{T_2}x_0
\end{align*}
for every $x_0\in\mathbb R^n$. Thus $(P_T)_{T>0}$ is increasing in the Loewner order and is bounded above by $\Gamma_KI_n$. Since the space of real symmetric $n\times n$ matrices is finite-dimensional, there exists a symmetric positive semidefinite matrix $P\in\mathbb R^{n\times n}$ such that $P_T\to P$ in every matrix norm and
\begin{align*}
\lim_{T\to\infty}x_0^\top P_Tx_0=x_0^\top Px_0
\end{align*}
for every $x_0\in\mathbb R^n$.
Let $\Pi_T:[0,T]\to\mathbb R^{n\times n}$ be the finite-horizon Riccati matrix with terminal condition $\Pi_T(T)=0$, so $P_T=\Pi_T(0)$. The finite-horizon continuous-time Riccati theorem gives
\begin{align*}
-\Pi_T'(t)=A^\top\Pi_T(t)+\Pi_T(t)A-\Pi_T(t)BR^{-1}B^\top\Pi_T(t)+Q.
\end{align*}
By time-[translation invariance](/theorems/4911) of the autonomous problem, $\Pi_T(s)=P_{T-s}$ for $0\le s\le T$. Therefore, for fixed $h>0$ and $T>h$,
\begin{align*}
P_T-P_{T-h}=\int_0^h \left(A^\top P_{T-r}+P_{T-r}A-P_{T-r}BR^{-1}B^\top P_{T-r}+Q\right)\,d\mathcal L^1(r).
\end{align*}
The map $\mathcal R:\mathbb R^{n\times n}\to\mathbb R^{n\times n}$ defined by
\begin{align*}
\mathcal R(X)=A^\top X+XA-XBR^{-1}B^\top X+Q
\end{align*}
is polynomial in the entries of $X$, hence continuous. Since $0\le P_s\le \Gamma_KI_n$ for every $s>0$, the family $\{\mathcal R(P_s):s>0\}$ is bounded. Moreover, for $r\in[0,h]$, $P_{T-r}\to P$ uniformly in $r$ as $T\to\infty$, because $T-r\ge T-h\to\infty$ and the monotone bounded matrix limit has already been identified. Therefore $\mathcal R(P_{T-r})\to\mathcal R(P)$ uniformly on $[0,h]$, so the limit may be passed through the [Lebesgue integral](/page/Lebesgue%20Integral) over the finite [measure space](/page/Measure%20Space) $([0,h],\mathcal B([0,h]),\mathcal L^1)$.
Letting $T\to\infty$, the left-hand side tends to $0$, and the right-hand side tends to
\begin{align*}
h\left(A^\top P+PA-PBR^{-1}B^\top P+Q\right).
\end{align*}
Since $h>0$, the limit matrix satisfies the algebraic Riccati equation
\begin{align*}
A^\top P+PA-PBR^{-1}B^\top P+Q=0.
\end{align*}
[guided]
The finite-horizon value cannot decrease when the time horizon is enlarged, because an admissible control on the longer interval has a restriction to the shorter interval and the extra running cost is non-negative. Thus, for $0<T_1<T_2$,
\begin{align*}
x_0^\top P_{T_1}x_0\le x_0^\top P_{T_2}x_0
\end{align*}
for every $x_0\in\mathbb R^n$. This means that $(P_T)_{T>0}$ is increasing in the Loewner order. The comparison estimate from the preceding step gives $P_T\le \Gamma_KI_n$, so every scalar quadratic form $x_0^\top P_Tx_0$ is monotone and bounded. Since real symmetric $n\times n$ matrices form a finite-dimensional [vector space](/page/Vector%20Space), these scalar limits determine a symmetric positive semidefinite matrix $P\in\mathbb R^{n\times n}$, and in fact $P_T\to P$ in matrix norm.
Now we extract the stationary Riccati equation from the finite-horizon differential equation. Let $\Pi_T:[0,T]\to\mathbb R^{n\times n}$ be the Riccati matrix with terminal condition $\Pi_T(T)=0$, so that $P_T=\Pi_T(0)$. The finite-horizon continuous-time Riccati theorem gives
\begin{align*}
-\Pi_T'(t)=A^\top\Pi_T(t)+\Pi_T(t)A-\Pi_T(t)BR^{-1}B^\top\Pi_T(t)+Q.
\end{align*}
Because the dynamics and cost are autonomous, solving backward from terminal time $T$ and observing the system at time $s$ is the same finite-horizon problem with remaining horizon $T-s$. Hence $\Pi_T(s)=P_{T-s}$ for $0\le s\le T$. Integrating the differential equation from $0$ to $h$ gives, for $T>h$,
\begin{align*}
P_T-P_{T-h}=\int_0^h \left(A^\top P_{T-r}+P_{T-r}A-P_{T-r}BR^{-1}B^\top P_{T-r}+Q\right)\,d\mathcal L^1(r).
\end{align*}
The only delicate point is passing to the limit inside this integral. Define $\mathcal R:\mathbb R^{n\times n}\to\mathbb R^{n\times n}$ by
\begin{align*}
\mathcal R(X)=A^\top X+XA-XBR^{-1}B^\top X+Q.
\end{align*}
This map is continuous because it is polynomial in the entries of $X$. Also $0\le P_s\le \Gamma_KI_n$, so the matrices $P_s$ remain in a compact subset of the finite-dimensional matrix space, on which $\mathcal R$ is uniformly continuous. For $r\in[0,h]$, the parameter $T-r$ tends to infinity uniformly as $T\to\infty$, and therefore $P_{T-r}\to P$ uniformly in $r$. It follows that $\mathcal R(P_{T-r})\to\mathcal R(P)$ uniformly on $[0,h]$. Since $[0,h]$ has finite $\mathcal L^1$-measure, [uniform convergence](/page/Uniform%20Convergence) justifies passing the limit through the integral.
The left-hand side satisfies $P_T-P_{T-h}\to P-P=0$. The right-hand side tends to
\begin{align*}
h\left(A^\top P+PA-PBR^{-1}B^\top P+Q\right).
\end{align*}
Since $h>0$, division by $h$ yields
\begin{align*}
A^\top P+PA-PBR^{-1}B^\top P+Q=0.
\end{align*}
Thus the monotone limit of the finite-horizon value matrices solves the algebraic Riccati equation.
[/guided]
[/step]
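As a sanity check on this step, consider the scalar integrator (an illustrative choice, not part of the original argument): $n=m=p=1$, $A=0$, and $B=C=R=1$, so $Q=1$. The finite-horizon Riccati equation and its solution are
\begin{align*}
-\Pi_T'(t)=1-\Pi_T(t)^2,\qquad \Pi_T(T)=0,\qquad \Pi_T(t)=\tanh(T-t),
\end{align*}
so $P_T=\Pi_T(0)=\tanh T$ and $\Pi_T(s)=P_{T-s}$. The values $P_T$ increase with $T$, are bounded by $1$, and
\begin{align*}
P=\lim_{T\to\infty}\tanh T=1,\qquad A^\top P+PA-PBR^{-1}B^\top P+Q=-1+1=0.
\end{align*}
The increment identity reads
\begin{align*}
P_T-P_{T-h}=\int_0^h\left(1-\tanh^2(T-r)\right)\,d\mathcal L^1(r)=\tanh T-\tanh(T-h),
\end{align*}
and both sides tend to $0$ as $T\to\infty$.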
[step:Use detectability to prove that the limiting feedback is stabilising]
Define the closed-loop matrix
\begin{align*}
A_P:=A-BR^{-1}B^\top P\in\mathbb R^{n\times n}.
\end{align*}
Since $R=R^\top>0$, there is a unique symmetric positive definite matrix $R^{-1/2}\in\mathbb R^{m\times m}$ such that $R^{-1/2}R^{-1/2}=R^{-1}$. Using the algebraic Riccati equation, for every $x\in\mathbb R^n$,
\begin{align*}
x^\top(A_P^\top P+PA_P)x=-x^\top Qx-x^\top PBR^{-1}B^\top Px.
\end{align*}
Since $Q=C^\top C$ and $R^{-1}=R^{-\top}>0$, this becomes
\begin{align*}
x^\top(A_P^\top P+PA_P)x=-|Cx|^2-|R^{-1/2}B^\top Px|^2\le 0.
\end{align*}
We prove that $A_P$ has no eigenvalue in the closed right half-plane. Let $\lambda\in\mathbb C$ and $v\in\mathbb C^n\setminus\{0\}$ satisfy $A_Pv=\lambda v$. Complexifying the displayed identity gives
\begin{align*}
v^*(A_P^*P+PA_P)v=-|Cv|^2-|R^{-1/2}B^\top Pv|^2.
\end{align*}
The left-hand side equals
\begin{align*}
(\overline\lambda+\lambda)v^*Pv=2\operatorname{Re}\lambda\,v^*Pv.
\end{align*}
Since $P\ge 0$, the left-hand side is non-negative whenever $\operatorname{Re}\lambda\ge 0$, while the right-hand side is non-positive. Therefore, if $\operatorname{Re}\lambda\ge 0$, both sides are zero, and in particular $Cv=0$ and $B^\top Pv=0$. Hence
\begin{align*}
Av=A_Pv+BR^{-1}B^\top Pv=\lambda v.
\end{align*}
Detectability of $(C,A)$ means that every eigenvector of $A$ with $Cv=0$ has eigenvalue in the open left half-plane. This contradicts $\operatorname{Re}\lambda\ge 0$. Thus every eigenvalue of $A_P$ has negative real part, so $A_P$ is Hurwitz.
[guided]
The stabilising property means that the matrix
\begin{align*}
A_P:=A-BR^{-1}B^\top P
\end{align*}
has all eigenvalues in the open left half-plane. We begin with the Lyapunov identity supplied by the algebraic Riccati equation. Because $R=R^\top>0$, the inverse $R^{-1}$ is symmetric positive definite, and there is a unique symmetric positive definite square root $R^{-1/2}\in\mathbb R^{m\times m}$ satisfying $R^{-1/2}R^{-1/2}=R^{-1}$. Using
\begin{align*}
A^\top P+PA-PBR^{-1}B^\top P+Q=0
\end{align*}
and $A_P=A-BR^{-1}B^\top P$, we compute for $x\in\mathbb R^n$:
\begin{align*}
x^\top(A_P^\top P+PA_P)x=-x^\top Qx-x^\top PBR^{-1}B^\top Px.
\end{align*}
Since $Q=C^\top C$, the first term on the right is $-|Cx|^2$. Since $P=P^\top$ and $R^{-1}=R^{-1/2}R^{-1/2}$, the second term is $-|R^{-1/2}B^\top Px|^2$. Hence
\begin{align*}
x^\top(A_P^\top P+PA_P)x=-|Cx|^2-|R^{-1/2}B^\top Px|^2\le 0.
\end{align*}
Now take a complex eigenpair $A_Pv=\lambda v$ with $v\in\mathbb C^n\setminus\{0\}$. The same identity holds after complexification, with transpose replaced by conjugate transpose:
\begin{align*}
v^*(A_P^*P+PA_P)v=-|Cv|^2-|R^{-1/2}B^\top Pv|^2.
\end{align*}
The eigenvalue equation gives $v^*A_P^*P v=\overline\lambda v^*Pv$ and $v^*PA_Pv=\lambda v^*Pv$, so the left-hand side is
\begin{align*}
2\operatorname{Re}\lambda\,v^*Pv.
\end{align*}
If $\operatorname{Re}\lambda\ge 0$, this quantity is non-negative because $P\ge 0$. The right-hand side is a sum of two non-positive terms. Therefore equality can hold only if both non-positive terms vanish:
\begin{align*}
Cv=0
\end{align*}
and
\begin{align*}
B^\top Pv=0.
\end{align*}
The second identity converts the closed-loop eigenvector back into an open-loop eigenvector:
\begin{align*}
Av=A_Pv+BR^{-1}B^\top Pv=\lambda v.
\end{align*}
Thus $v$ is an eigenvector of $A$ with output $Cv=0$ and eigenvalue $\lambda$ in the closed right half-plane. Detectability of $(C,A)$ rules out exactly this situation: unobservable eigenvectors must correspond only to eigenvalues with negative real part. This contradiction proves that no eigenvalue of $A_P$ has non-negative real part. Hence $A_P$ is Hurwitz.
[/guided]
[/step]
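To see why detectability cannot be dropped in this step, consider the following scalar illustration (an assumed example, not taken from the source): $n=m=p=1$, $A=B=R=1$, and $C=0$, so $Q=0$. The pair $(A,B)$ is stabilisable, but $(C,A)$ is not detectable, since $v=1$ satisfies $Av=v$ and $Cv=0$ with eigenvalue $1$. The control $u=0$ has zero cost on every horizon, so $P_T=0$ for all $T>0$ and the monotone limit is $P=0$. This limit solves the algebraic Riccati equation,
\begin{align*}
2P-P^2=0,
\end{align*}
but its closed-loop matrix is $A_P=1-P=1$, which is not Hurwitz. The other root $P=2$ gives $A_P=-1$, yet its cost $2x_0^2$ exceeds the infimum $0$ taken over controls that are not required to stabilise the state.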
[step:Complete the square to prove optimality and identify the infinite-horizon value]
Define the infinite-horizon admissible class $\mathcal U_\infty(x_0)$ to be the set of measurable controls $u:(0,\infty)\to\mathbb R^m$ with $u\in L^2_{\mathrm{loc}}((0,\infty);\mathbb R^m)$ for which the state $x_{u,x_0}:[0,\infty)\to\mathbb R^n$ is absolutely continuous, satisfies $x_{u,x_0}'(t)=Ax_{u,x_0}(t)+Bu(t)$ and $x_{u,x_0}(0)=x_0$, and has well-defined extended cost
\begin{align*}
J_\infty[u;x_0]=\int_0^\infty \left(x_{u,x_0}(t)^\top Qx_{u,x_0}(t)+u(t)^\top Ru(t)\right)\,d\mathcal L^1(t)\in[0,\infty].
\end{align*}
For $u\in\mathcal U_\infty(x_0)$ and $T>0$, the restriction $u|_{(0,T)}$ is admissible for the finite-horizon problem, so
\begin{align*}
\int_0^{\!T}\left(x_{u,x_0}(t)^\top Qx_{u,x_0}(t)+u(t)^\top Ru(t)\right)\,d\mathcal L^1(t)\ge x_0^\top P_Tx_0.
\end{align*}
Since the integrand is non-negative, the left-hand side increases to $J_\infty[u;x_0]$ as $T\to\infty$ by monotone convergence for Lebesgue integrals, while $x_0^\top P_Tx_0\to x_0^\top Px_0$. Hence
\begin{align*}
J_\infty[u;x_0]\ge x_0^\top Px_0.
\end{align*}
For the feedback trajectory $x^*:[0,\infty)\to\mathbb R^n$ solving $x^{*'}(t)=A_Px^*(t)$ and $x^*(0)=x_0$, define $u^*:(0,\infty)\to\mathbb R^m$ by
\begin{align*}
u^*(t)=-R^{-1}B^\top Px^*(t).
\end{align*}
Since $A_P$ is Hurwitz, $x^*(t)$ decays exponentially, and therefore $u^*\in L^2((0,\infty);\mathbb R^m)$ and the corresponding running cost is integrable on $(0,\infty)$. Define the feedback defect $w^*:(0,\infty)\to\mathbb R^m$ by
\begin{align*}
w^*(t)=u^*(t)+R^{-1}B^\top Px^*(t).
\end{align*}
Then $w^*(t)=0$ for every $t>0$. Using $x^{*'}=Ax^*+Bu^*$ and the algebraic Riccati equation, differentiation gives
\begin{align*}
\frac{d}{dt}\left(x^*(t)^\top Px^*(t)\right)=w^*(t)^\top Rw^*(t)-x^*(t)^\top Qx^*(t)-u^*(t)^\top Ru^*(t).
\end{align*}
Integrating over $(0,T)$ with respect to $\mathcal L^1$ and using $w^*=0$ gives
\begin{align*}
\int_0^{\!T}\left(x^*(t)^\top Qx^*(t)+u^*(t)^\top Ru^*(t)\right)\,d\mathcal L^1(t)=x_0^\top Px_0-x^*(T)^\top Px^*(T).
\end{align*}
Letting $T\to\infty$ and using exponential decay of $x^*$ yields
\begin{align*}
J_\infty[u^*;x_0]=x_0^\top Px_0.
\end{align*}
Thus the feedback is optimal and the optimal cost is $x_0^\top Px_0$.
[guided]
The lower bound for arbitrary controls does not require a separate transversality theorem. We define the infinite-horizon admissible class $\mathcal U_\infty(x_0)$ to consist of measurable controls $u:(0,\infty)\to\mathbb R^m$ with $u\in L^2_{\mathrm{loc}}((0,\infty);\mathbb R^m)$ such that the state $x_{u,x_0}:[0,\infty)\to\mathbb R^n$ is absolutely continuous, solves
\begin{align*}
x_{u,x_0}'(t)=Ax_{u,x_0}(t)+Bu(t)
\end{align*}
with $x_{u,x_0}(0)=x_0$, and has extended non-negative cost
\begin{align*}
J_\infty[u;x_0]=\int_0^\infty \left(x_{u,x_0}(t)^\top Qx_{u,x_0}(t)+u(t)^\top Ru(t)\right)\,d\mathcal L^1(t).
\end{align*}
For any $T>0$, the restricted control $u|_{(0,T)}$ is admissible for the finite-horizon problem with the same initial state. Therefore finite-horizon optimality gives
\begin{align*}
\int_0^{\!T}\left(x_{u,x_0}(t)^\top Qx_{u,x_0}(t)+u(t)^\top Ru(t)\right)\,d\mathcal L^1(t)\ge x_0^\top P_Tx_0.
\end{align*}
The integrand is non-negative because $Q\ge 0$ and $R>0$. Hence the finite-horizon integrals increase to the infinite-horizon integral by monotone convergence for Lebesgue integrals. At the same time, $P_T\to P$, so $x_0^\top P_Tx_0\to x_0^\top Px_0$. Passing to the limit gives
\begin{align*}
J_\infty[u;x_0]\ge x_0^\top Px_0.
\end{align*}
This proves the lower bound for every admissible control, including controls of infinite cost.
It remains to show that the claimed feedback attains this lower bound. Let $x^*:[0,\infty)\to\mathbb R^n$ solve
\begin{align*}
x^{*'}(t)=A_Px^*(t),\qquad x^*(0)=x_0,
\end{align*}
and define $u^*:(0,\infty)\to\mathbb R^m$ by
\begin{align*}
u^*(t)=-R^{-1}B^\top Px^*(t).
\end{align*}
Since $A_P$ is Hurwitz, $x^*$ decays exponentially. Consequently $u^*$ also decays exponentially, so $u^*\in L^2((0,\infty);\mathbb R^m)$ and the infinite-horizon cost is finite.
Define the feedback defect $w^*:(0,\infty)\to\mathbb R^m$ by
\begin{align*}
w^*(t)=u^*(t)+R^{-1}B^\top Px^*(t).
\end{align*}
For the feedback control, $w^*(t)=0$. Differentiating the quadratic function $x\mapsto x^\top Px$ along the controlled trajectory and using the algebraic Riccati equation gives
\begin{align*}
\frac{d}{dt}\left(x^*(t)^\top Px^*(t)\right)=w^*(t)^\top Rw^*(t)-x^*(t)^\top Qx^*(t)-u^*(t)^\top Ru^*(t).
\end{align*}
Since $w^*=0$, integration over $(0,T)$ with respect to $\mathcal L^1$ gives
\begin{align*}
\int_0^{\!T}\left(x^*(t)^\top Qx^*(t)+u^*(t)^\top Ru^*(t)\right)\,d\mathcal L^1(t)=x_0^\top Px_0-x^*(T)^\top Px^*(T).
\end{align*}
Exponential decay of $x^*$ implies $x^*(T)^\top Px^*(T)\to 0$. Letting $T\to\infty$ therefore yields
\begin{align*}
J_\infty[u^*;x_0]=x_0^\top Px_0.
\end{align*}
The lower bound and the attaining feedback together prove optimality and identify the infinite-horizon value.
[/guided]
[/step]
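The optimality claim can be probed numerically in the scalar case. The sketch below is an illustration under assumed values $A=B=Q=R=1$ and $x_0=2$; the function name `feedback_cost` is hypothetical. It uses the closed-form cost of a linear feedback $u=-kx$, for which $x(t)=x_0e^{(A-Bk)t}$, and compares the Riccati gain with a few other stabilising gains. It only compares linear feedbacks, whereas the theorem covers every admissible control.

```python
import math

def feedback_cost(k, a, b, q, r, x0):
    """Infinite-horizon cost of u = -k*x, assuming x0 != 0 and q + r*k*k > 0."""
    rate = b * k - a                    # x(t) = x0 * exp(-rate * t)
    if rate <= 0:
        return math.inf                 # state does not decay, cost diverges
    return (q + r * k * k) * x0 * x0 / (2 * rate)

a, b, q, r, x0 = 1.0, 1.0, 1.0, 1.0, 2.0  # illustrative scalar data
g = b * b / r
P = (a + math.sqrt(a * a + g * q)) / g    # stabilising scalar Riccati root
k_star = b * P / r                        # scalar R^{-1} B^T P
optimal = feedback_cost(k_star, a, b, q, r, x0)
others = [feedback_cost(k, a, b, q, r, x0) for k in (1.5, 2.0, 3.0, 5.0)]
print(optimal, P * x0 * x0, others)
```

The cost of the Riccati gain agrees with $Px_0^2$ up to rounding, and each of the other gains tried costs at least as much.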
[step:Compare two stabilising solutions by a Lyapunov identity]
Let $S\in\mathbb R^{n\times n}$ be another symmetric positive semidefinite solution of the algebraic Riccati equation such that
\begin{align*}
A_S:=A-BR^{-1}B^\top S
\end{align*}
is Hurwitz. Define the symmetric matrix $D\in\mathbb R^{n\times n}$ by $D:=S-P$, and define the symmetric positive semidefinite matrix $G\in\mathbb R^{n\times n}$ by $G:=BR^{-1}B^\top$. Subtracting the Riccati equation for $P$ from the Riccati equation for $S$ gives
\begin{align*}
A^\top D+DA-SGS+PGP=0.
\end{align*}
Using $S=P+D$ and $A_P=A-GP$, this is equivalent to
\begin{align*}
A_P^\top D+DA_P-DGD=0.
\end{align*}
For an arbitrary $x_0\in\mathbb R^n$, let $y_P:[0,\infty)\to\mathbb R^n$ be the solution of $y_P'(t)=A_Py_P(t)$ with $y_P(0)=x_0$. Since $A_P$ is Hurwitz, $y_P(t)\to 0$ as $t\to\infty$. Differentiating $y_P(t)^\top Dy_P(t)$ gives
\begin{align*}
\frac{d}{dt}\left(y_P(t)^\top Dy_P(t)\right)=y_P(t)^\top DGDy_P(t)\ge 0.
\end{align*}
Integrating over $(0,T)$ with respect to $\mathcal L^1$ and letting $T\to\infty$ yields
\begin{align*}
x_0^\top D x_0=-\int_0^\infty y_P(t)^\top DGDy_P(t)\,d\mathcal L^1(t)\le 0.
\end{align*}
Hence $S\le P$ in the Loewner order.
The same subtraction can be written relative to $A_S=A-GS$ as
\begin{align*}
A_S^\top D+DA_S+DGD=0.
\end{align*}
Let $y_S:[0,\infty)\to\mathbb R^n$ be the solution of $y_S'(t)=A_Sy_S(t)$ with $y_S(0)=x_0$. Since $A_S$ is Hurwitz, $y_S(t)\to 0$ as $t\to\infty$. Differentiating $y_S(t)^\top Dy_S(t)$ gives
\begin{align*}
\frac{d}{dt}\left(y_S(t)^\top Dy_S(t)\right)=-y_S(t)^\top DGDy_S(t)\le 0.
\end{align*}
Integrating over $(0,T)$ with respect to $\mathcal L^1$ and letting $T\to\infty$ yields
\begin{align*}
x_0^\top D x_0=\int_0^\infty y_S(t)^\top DGDy_S(t)\,d\mathcal L^1(t)\ge 0.
\end{align*}
Thus $P\le S$. Therefore $x_0^\top(P-S)x_0=0$ for every $x_0\in\mathbb R^n$. The polarization identity for symmetric bilinear forms gives $P=S$. This proves uniqueness of the positive semidefinite stabilising solution and completes the theorem.
[guided]
Suppose $S\in\mathbb R^{n\times n}$ is another symmetric positive semidefinite solution of the algebraic Riccati equation and that its closed-loop matrix
\begin{align*}
A_S:=A-BR^{-1}B^\top S
\end{align*}
is Hurwitz. We avoid any transversality assumption for arbitrary admissible controls by comparing the two Riccati equations directly. Define $D:=S-P\in\mathbb R^{n\times n}$ and $G:=BR^{-1}B^\top\in\mathbb R^{n\times n}$. Both $D$ and $G$ are symmetric, and $G$ is positive semidefinite because $R^{-1}$ is positive definite.
Subtract the algebraic Riccati equation for $P$ from the algebraic Riccati equation for $S$:
\begin{align*}
A^\top D+DA-SGS+PGP=0.
\end{align*}
Since $S=P+D$, we expand $SGS=(P+D)G(P+D)$. After cancellation of the $PGP$ terms, the identity becomes
\begin{align*}
A^\top D+DA-PGD-DGP-DGD=0.
\end{align*}
Because $A_P=A-GP$, this is exactly
\begin{align*}
A_P^\top D+DA_P-DGD=0.
\end{align*}
Now let $y_P:[0,\infty)\to\mathbb R^n$ solve $y_P'(t)=A_Py_P(t)$ with $y_P(0)=x_0$. The matrix $A_P$ is Hurwitz, so $y_P(t)\to 0$ exponentially. Differentiating the scalar function $t\mapsto y_P(t)^\top Dy_P(t)$ and using the last displayed identity gives
\begin{align*}
\frac{d}{dt}\left(y_P(t)^\top Dy_P(t)\right)=y_P(t)^\top(A_P^\top D+DA_P)y_P(t)=y_P(t)^\top DGDy_P(t).
\end{align*}
The last term is non-negative because $DGD=(B^\top D)^\top R^{-1}(B^\top D)$. Therefore integration over $(0,T)$ gives
\begin{align*}
y_P(T)^\top Dy_P(T)-x_0^\top D x_0=\int_0^{\!T}y_P(t)^\top DGDy_P(t)\,d\mathcal L^1(t).
\end{align*}
Letting $T\to\infty$ is justified because $y_P$ decays exponentially and $D$ is a fixed matrix, so $y_P(T)^\top Dy_P(T)\to 0$. Hence
\begin{align*}
x_0^\top D x_0=-\int_0^\infty y_P(t)^\top DGDy_P(t)\,d\mathcal L^1(t)\le 0.
\end{align*}
Thus $S\le P$.
To get the opposite inequality, use the same algebraic difference but write it relative to $A_S=A-GS$. Since $S=P+D$, the subtraction identity is equivalent to
\begin{align*}
A_S^\top D+DA_S+DGD=0.
\end{align*}
Let $y_S:[0,\infty)\to\mathbb R^n$ solve $y_S'(t)=A_Sy_S(t)$ with $y_S(0)=x_0$. Because $A_S$ is Hurwitz by assumption, $y_S(t)\to 0$ exponentially. Differentiating along this trajectory gives
\begin{align*}
\frac{d}{dt}\left(y_S(t)^\top Dy_S(t)\right)=y_S(t)^\top(A_S^\top D+DA_S)y_S(t)=-y_S(t)^\top DGDy_S(t)\le 0.
\end{align*}
Integrating over $(0,T)$ and passing to the limit $T\to\infty$ gives
\begin{align*}
x_0^\top D x_0=\int_0^\infty y_S(t)^\top DGDy_S(t)\,d\mathcal L^1(t)\ge 0.
\end{align*}
Thus $P\le S$. Combining $S\le P$ and $P\le S$ gives
\begin{align*}
x_0^\top(P-S)x_0=0
\end{align*}
for every $x_0\in\mathbb R^n$. Since $P-S$ is symmetric, the polarization identity for symmetric bilinear forms implies $P-S=0$. Hence $P=S$, proving uniqueness of the positive semidefinite stabilising solution.
[/guided]
[/step]
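In the scalar case the comparison identity can be solved by hand, which shows concretely why two stabilising solutions cannot coexist. The following is a sketch under the assumption $n=m=1$ and $B\ne 0$, so that $G=B^2/R$ is a positive number. The identity $A_P^\top D+DA_P-DGD=0$ becomes
\begin{align*}
2A_PD-GD^2=0,\qquad\text{so}\qquad D=0\quad\text{or}\quad D=\frac{2A_P}{G}<0,
\end{align*}
where the sign follows from $A_P<0$. For the non-zero root, $S=P+D$ has closed-loop coefficient
\begin{align*}
A_S=A-GS=A_P-GD=-A_P>0,
\end{align*}
so $A_S$ is not Hurwitz. Hence the only stabilising solution is $S=P$, in agreement with the general argument.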