Attributions & Verification

Track contributions and verify content correctness

Proof


[step:Complete the square for the conditional mean dynamics] For the conditional mean dynamics \begin{align*} d\hat{x}(t)=A\hat{x}(t)\,dt+Bu(t)\,dt+L\,d\nu(t), \end{align*} apply [Itô's formula](/theorems/2099) to the function $V_P:\mathbb R^n \to \mathbb R$ defined by \begin{align*} V_P(z)=z^\top Pz. \end{align*} Using the Riccati identity \begin{align*} A^\top P+PA+Q-PBR^{-1}B^\top P=0 \end{align*} together with $K=R^{-1}B^\top P$, we obtain, first for the processes stopped at $\tau_N:=\inf\{t:|\hat{x}(t)|\ge N\}\wedge N$ and after taking expectations, the identity \begin{align*} \hat{x}(t)^\top Q\hat{x}(t)+u(t)^\top Ru(t) = \left(u(t)+K\hat{x}(t)\right)^\top R\left(u(t)+K\hat{x}(t)\right) -\frac{d}{dt}\left(\hat{x}(t)^\top P\hat{x}(t)\right) +\operatorname{tr}(PLVL^\top). \end{align*} Here the stopped stochastic integral has expectation zero because its integrand is bounded and predictable. For fixed $T$, local square-integrability of $u$ and locally finite second moments of $\hat{x}$ make the drift terms integrable on $[0,T]$. The stopped terminal quadratic terms converge to the unstopped terminal quadratic term along a subsequence and are uniformly controlled by these finite second moments, so dominated convergence for the drift integrals and $L^1$ convergence of the stopped terminal terms allow us to let $N\to\infty$. Thus the identity holds on every finite interval. Integrating over $[0,T]$ with respect to $\mathcal L^1$ gives \begin{align*} \mathbb E\left[\int_0^{T}\left(\hat{x}(t)^\top Q\hat{x}(t)+u(t)^\top Ru(t)\right)\,d\mathcal L^1(t)\right] &= \mathbb E\left[\int_0^{T}\left(u(t)+K\hat{x}(t)\right)^\top R\left(u(t)+K\hat{x}(t)\right)\,d\mathcal L^1(t)\right] \\ &\quad +\mathbb E[\hat{x}(0)^\top P\hat{x}(0)] -\mathbb E[\hat{x}(T)^\top P\hat{x}(T)] +T\operatorname{tr}(PLVL^\top). \end{align*} Because $R$ is positive definite, the integrand \begin{align*} \left(u(t)+K\hat{x}(t)\right)^\top R\left(u(t)+K\hat{x}(t)\right) \end{align*} is nonnegative and is minimised pointwise by \begin{align*} u(t)=-K\hat{x}(t)=-R^{-1}B^\top P\hat{x}(t). \end{align*} [/step]
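The algebra behind the completed square can be checked numerically in the scalar case. The following is a minimal sketch, not part of the proof: the function names `regulator_riccati_scalar` and `square_identity_gap` are hypothetical, all matrices are assumed to be $1\times 1$, and the Itô correction $\operatorname{tr}(PLVL^\top)$ is left out because it appears on both sides and cancels.

```python
import math


def regulator_riccati_scalar(a, b, q, r):
    """Positive (stabilising) root p of 2*a*p + q - (b*p)**2 / r = 0.

    Scalar version of A^T P + P A + Q - P B R^{-1} B^T P = 0.
    Assumes b != 0, q > 0 and r > 0.
    """
    return r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)


def square_identity_gap(a, b, q, r, x, u):
    """Left side minus right side of the scalar completed-square identity.

    Left:  q*x^2 + r*u^2
    Right: r*(u + k*x)^2 - 2*p*x*(a*x + b*u)
    The term 2*p*x*(a*x + b*u) is the dt-drift of p*x^2 without the
    Ito correction, which cancels between the two sides.
    """
    p = regulator_riccati_scalar(a, b, q, r)
    k = b * p / r  # scalar version of K = R^{-1} B^T P
    left = q * x * x + r * u * u
    right = r * (u + k * x) ** 2 - 2.0 * p * x * (a * x + b * u)
    return left - right


if __name__ == "__main__":
    # Arbitrary illustrative values; the gap should vanish up to rounding.
    print(square_identity_gap(0.5, 1.0, 2.0, 0.3, 1.7, -0.4))
```

The gap vanishes for every pair $(x,u)$ only because $p$ solves the Riccati equation; replacing $p$ by any other value leaves a residual multiple of $x^2$.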


[guided]We verify the hypotheses of the external inputs first. The regulator Riccati theorem [quotetheorem:TEMP-43] applies because $(A,B)$ is stabilisable, $(Q^{1/2},A)$ is detectable, and $R=R^\top>0$. It gives the stabilising solution $P$ and the gain $K=R^{-1}B^\top P$. The steady-state filter Riccati theorem [quotetheorem:TEMP-47] applies because $(A,C)$ is detectable, $(A,GW^{1/2})$ is stabilisable, and $V=V^\top>0$. It gives the stabilising covariance $\Sigma$ and the gain $L=\Sigma C^\top V^{-1}$. The Kalman-Bucy filter theorem [quotetheorem:TEMP-46] applies because the initial state is Gaussian and independent of the Gaussian noises, the measurement covariance is positive definite, and the observation-adapted control $u$ is known to the filter at time $t$. The steady-state initial covariance assumption gives $S(0)=\Sigma$, so uniqueness for the covariance Riccati equation keeps $S(t)=\Sigma$ for all $t\ge 0$. Therefore the exact conditional mean equation is \begin{align*} d\hat{x}(t)=A\hat{x}(t)\,dt+Bu(t)\,dt+L(dy(t)-C\hat{x}(t)\,dt), \end{align*} where $dy(t)-C\hat{x}(t)\,dt=d\nu(t)$ is the innovation increment with covariance $V\,dt$. Now decompose the state as $x(t)=\hat{x}(t)+\tilde{x}(t)$. Since $\hat{x}(t)$ is the conditional expectation of $x(t)$, the error satisfies $\mathbb E[\tilde{x}(t)\mid\mathcal Y_t]=0$. Expanding $x(t)^\top Qx(t)$ and conditioning on $\mathcal Y_t$ eliminates the cross terms. Because the conditional covariance is $\Sigma$, the error contribution is \begin{align*} \mathbb E[\tilde{x}(t)^\top Q\tilde{x}(t)\mid\mathcal Y_t]=\operatorname{tr}(Q\Sigma). \end{align*} Thus the original average cost is the conditional-mean average cost plus the fixed constant $\operatorname{tr}(Q\Sigma)$. It remains to minimise the conditional-mean part. For fixed $T$ and $N$, stop the process at $\tau_N=\inf\{t:|\hat{x}(t)|\ge N\}\wedge N$. 
On $[0,T\wedge\tau_N]$ the Itô integrands arising from $V_P(\hat{x})=\hat{x}^\top P\hat{x}$ are bounded predictable processes, so the stochastic integral has expectation zero. The drift identity obtained from Itô's formula, understood after taking expectations, is \begin{align*} \hat{x}(t)^\top Q\hat{x}(t)+u(t)^\top Ru(t) = (u(t)+K\hat{x}(t))^\top R(u(t)+K\hat{x}(t))-\frac{d}{dt}(\hat{x}(t)^\top P\hat{x}(t))+\operatorname{tr}(PLVL^\top). \end{align*} The equality follows by substituting the regulator Riccati identity $A^\top P+PA+Q-PBR^{-1}B^\top P=0$ and $K=R^{-1}B^\top P$. The admissibility hypotheses give local square-integrability of $u$ and locally finite second moments of $\hat{x}$, so the stopped drift integrals converge in $L^1$ to the unstopped drift integrals on $[0,T]$, and the stopped terminal quadratic terms converge in $L^1$ to $\hat{x}(T)^\top P\hat{x}(T)$. Hence the integrated identity holds without stopping. Divide that identity by $T$ and take $\limsup_{T\to\infty}$. The initial term divided by $T$ tends to zero. The admissibility terminal condition removes the terminal quadratic term. The square term is nonnegative because $R>0$, and it is zero exactly when $u(t)=-K\hat{x}(t)$. Under this feedback the matrix $A-BK$ is Hurwitz, so the conditional-mean second moment is bounded and the terminal condition is satisfied. Therefore the certainty-equivalence feedback attains the minimum, and adding back the fixed error-cost constant gives the minimum average cost \begin{align*} \operatorname{tr}(Q\Sigma)+\operatorname{tr}(PLVL^\top). \end{align*}[/guided]
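To make the final formula concrete, the gains and the minimum average cost can be evaluated in closed form when every quantity is scalar. This is a minimal sketch under that assumption: the function name `lqg_scalar` and the lowercase parameters `a, b, c, g, q, r, w, v` (scalar stand-ins for $A,B,C,G,Q,R,W,V$) are hypothetical, and the filter Riccati equation is assumed to take the standard scalar form $2a\sigma+g^2w-c^2\sigma^2/v=0$.

```python
import math


def lqg_scalar(a, b, c, g, q, r, w, v):
    """Steady-state scalar LQG quantities.

    Assumes b != 0, c != 0 and q, r, w, v > 0, so that both Riccati
    equations have a unique positive (stabilising) root.
    Returns a dict with p, sigma, the gains k and l, and the
    minimum average cost q*sigma + p*l^2*v.
    """
    # Regulator Riccati: 2*a*p + q - (b*p)^2 / r = 0, positive root.
    p = r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)
    # Filter Riccati (assumed form): 2*a*s + g^2*w - (c*s)^2 / v = 0.
    sigma = v * (a + math.sqrt(a * a + c * c * g * g * w / v)) / (c * c)
    k = b * p / r        # scalar version of K = R^{-1} B^T P
    l = sigma * c / v    # scalar version of L = Sigma C^T V^{-1}
    cost = q * sigma + p * l * l * v  # tr(Q Sigma) + tr(P L V L^T)
    return {"p": p, "sigma": sigma, "k": k, "l": l, "cost": cost}


if __name__ == "__main__":
    # Integrator plant with unit weights and unit noise intensities.
    result = lqg_scalar(a=0.0, b=1.0, c=1.0, g=1.0, q=1.0, r=1.0, w=1.0, v=1.0)
    print(result)
```

For the integrator example, both Riccati roots equal 1, so both gains equal 1 and the two cost terms contribute 1 each. In the matrix case the same quantities would come from a numerical Riccati solver rather than from these closed-form roots.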


Verification Progress

9 Total Blocks

0 Verified

0% verified

Contributors

admin 9 blocks (0 verified)

Who Can Verify

No area tags assigned. Only global reviewers can verify.

Viktor Miykov Admin

Max Vassiliev Global Reviewer

Horia Neagu Global Reviewer

강현욱 Global Reviewer

Demo Testing Global Reviewer

Archie Pennycook Global Reviewer
