Attributions & Verification

Track contributions and verify content correctness

Proof

[guided]We expand the score, not the likelihood itself, because the estimator is characterised by the first-order condition. Define the score map \begin{align*} S_n: U &\to \mathbb{R}^{p+q},\\ \vartheta &\mapsto \nabla \ell_n(\vartheta), \end{align*} and define the observed Hessian matrix map as the Jacobian of the score, \begin{align*} H_n: U &\to \mathbb{R}^{(p+q)\times(p+q)},\\ \vartheta &\mapsto JS_{n,\vartheta}. \end{align*} Thus the $(i,j)$ entry of $H_n(\vartheta)$ is $\partial_{\vartheta_j}S_{n,i}(\vartheta)$ for $1 \le i,j \le p+q$. Let $\mathcal{L}^1$ denote one-dimensional Lebesgue measure on $[0,1]$. We use the integral form of Taylor's theorem rather than a scalar mean value theorem: in several parameters, a componentwise mean value theorem need not produce one common intermediate point for every component of the score, whereas the integral formula is valid for vector-valued maps. On $E_n$, the estimator belongs to $B(\vartheta_0,r)$, and this ball is convex as a Euclidean ball in $\mathbb{R}^{p+q}$. Therefore for every $t \in [0,1]$, the parameter \begin{align*} \vartheta_{n,t}:=\vartheta_0+t(\hat{\vartheta}_n-\vartheta_0) \end{align*} belongs to $B(\vartheta_0,r) \subset U$. The same event $E_n$ also gives twice continuous differentiability of $\ell_n$ on $U$, hence continuous differentiability of $S_n$ on the whole segment $\{\vartheta_{n,t}:0\le t\le 1\}$. These hypotheses allow us to reduce the vector-valued expansion to one-dimensional calculus along the line segment. Define the pathwise score curve \begin{align*} g_n: [0,1] &\to \mathbb{R}^{p+q},\\ t &\mapsto S_n(\vartheta_{n,t}). \end{align*} Because $S_n$ is continuously differentiable on the segment, every component of $g_n$ is continuously differentiable on $[0,1]$, and the chain rule gives \begin{align*} g_n'(t) = H_n(\vartheta_{n,t})(\hat{\vartheta}_n-\vartheta_0). \end{align*} By the [Fundamental Theorem of Calculus](/theorems/632) applied componentwise to $g_n$, \begin{align*} S_n(\hat{\vartheta}_n)-S_n(\vartheta_0) = g_n(1)-g_n(0) = \int_0^1 g_n'(t)\,d\mathcal{L}^1(t) = \int_0^1 H_n(\vartheta_{n,t})(\hat{\vartheta}_n-\vartheta_0)\,d\mathcal{L}^1(t).
\end{align*} Since the vector $\hat{\vartheta}_n-\vartheta_0$ does not depend on the integration variable $t$, we may pull it out of the integral, to the right of the matrix integral. Define the averaged observed Hessian matrix \begin{align*} \bar{H}_n: E_n &\to \mathbb{R}^{(p+q)\times(p+q)},\\ \omega &\mapsto \int_0^1 H_n\bigl(\vartheta_0+t(\hat{\vartheta}_n(\omega)-\vartheta_0)\bigr)\,d\mathcal{L}^1(t), \end{align*} where the integral is taken entrywise. Then the Taylor identity becomes \begin{align*} S_n(\hat{\vartheta}_n) = S_n(\vartheta_0)+\bar{H}_n(\hat{\vartheta}_n-\vartheta_0). \end{align*} Because $\hat{\vartheta}_n$ is an interior maximiser on $E_n$, the first-order likelihood equation gives $S_n(\hat{\vartheta}_n)=0$. Hence \begin{align*} 0 = S_n(\vartheta_0)+\bar{H}_n(\hat{\vartheta}_n-\vartheta_0). \end{align*} Multiplying by $n^{-1/2}$, writing $n^{-1/2}\bar{H}_n = n^{-1}\bar{H}_n\sqrt n$, and moving the Hessian term to the left gives \begin{align*} -\frac{1}{n}\bar{H}_n\sqrt n(\hat{\vartheta}_n-\vartheta_0) = \frac{1}{\sqrt n}S_n(\vartheta_0). \end{align*} This identity is the mechanism of the proof: the left side contains the averaged observed information applied to the normalised estimation error, while the right side is the normalised score.[/guided]
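
The integral Taylor identity above holds between any two points joined by a segment inside $U$, so it can be checked numerically. The following is a minimal sketch and not part of the proof. It assumes a two-parameter logistic log-likelihood as a stand-in for $\ell_n$; the model, the simulated data, and the function names `score`, `hessian`, and `averaged_hessian` are illustrative assumptions, not taken from the source. The averaged Hessian is approximated by Gauss-Legendre quadrature in $t$, and the two sides of $S_n(\vartheta_1)-S_n(\vartheta_0)=\bar{H}_n(\vartheta_1-\vartheta_0)$ are compared.

```python
import numpy as np


def score(theta, X, y):
    """Score S_n(theta) of an (assumed) logistic log-likelihood."""
    p = 1.0 / (1.0 + np.exp(-X @ theta))
    return X.T @ (y - p)


def hessian(theta, X):
    """Observed Hessian H_n(theta), i.e. the Jacobian of the score."""
    p = 1.0 / (1.0 + np.exp(-X @ theta))
    return -(X * (p * (1.0 - p))[:, None]).T @ X


def averaged_hessian(theta0, theta1, X, nodes=20):
    """Approximate the integral of H_n over the segment from theta0 to theta1.

    Gauss-Legendre nodes on [-1, 1] are mapped to t in [0, 1].
    """
    x, w = np.polynomial.legendre.leggauss(nodes)
    t = 0.5 * (x + 1.0)
    wt = 0.5 * w
    return sum(
        wi * hessian(theta0 + ti * (theta1 - theta0), X)
        for ti, wi in zip(t, wt)
    )


# Simulated data (illustrative only).
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])
theta0 = np.array([0.3, -0.7])
y = (rng.uniform(size=n) < 1.0 / (1.0 + np.exp(-X @ theta0))).astype(float)

# Any second point on a segment inside the domain works; it need not be the MLE.
theta1 = theta0 + np.array([0.2, 0.15])

H_bar = averaged_hessian(theta0, theta1, X)
lhs = score(theta1, X, y) - score(theta0, X, y)
rhs = H_bar @ (theta1 - theta0)
residual = float(np.max(np.abs(lhs - rhs)))
print("max abs difference between the two sides:", residual)
```

The residual should be at the level of quadrature and rounding error. Note that a single matrix $\bar{H}_n$ serves all components of the score at once, which is exactly what a componentwise mean value theorem cannot guarantee.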

[step:Show the averaged observed information converges to Fisher information] For $t \in [0,1]$, define the random parameter \begin{align*} \vartheta_{n,t}:=\vartheta_0+t(\hat{\vartheta}_n-\vartheta_0). \end{align*} For a matrix $A \in \mathbb{R}^{(p+q)\times(p+q)}$, let $\|A\|_{\mathrm{op}}$ denote the operator norm induced by the Euclidean norm on $\mathbb{R}^{p+q}$: \begin{align*} \|A\|_{\mathrm{op}}:=\sup\{|Av|:v\in\mathbb{R}^{p+q},\ |v|=1\}. \end{align*} Since \begin{align*} |\vartheta_{n,t}-\vartheta_0| = t|\hat{\vartheta}_n-\vartheta_0| \le |\hat{\vartheta}_n-\vartheta_0|, \end{align*} every point of the segment $\{\vartheta_{n,t}:0\le t\le 1\}$ lies within distance $|\hat{\vartheta}_n-\vartheta_0|$ of $\vartheta_0$. In particular, on the event $\{|\hat{\vartheta}_n-\vartheta_0|<\delta\}$ the whole segment lies in the ball $B(\vartheta_0,\delta)$. Because $\hat{\vartheta}_n \xrightarrow{\mathbb{P}} \vartheta_0$, for every fixed $\delta>0$, \begin{align*} \mathbb{P}\bigl(|\hat{\vartheta}_n-\vartheta_0|\ge\delta\bigr) \to 0. \end{align*} The assumed local uniform observed-Hessian convergence says that \begin{align*} \sup_{\vartheta \in B(\vartheta_0,\delta)}\left\| -\frac{1}{n}H_n(\vartheta)-I(\vartheta_0)\right\|_{\mathrm{op}} \xrightarrow{\mathbb{P}} 0 \end{align*} as first $n\to\infty$ and then $\delta\downarrow 0$, meaning that for every $\varepsilon>0$, \begin{align*} \lim_{\delta\downarrow 0}\limsup_{n\to\infty}\mathbb{P}\left( \sup_{\vartheta \in B(\vartheta_0,\delta)}\left\| -\frac{1}{n}H_n(\vartheta)-I(\vartheta_0)\right\|_{\mathrm{op}}>\varepsilon \right)=0. \end{align*} Hence, for every $\varepsilon>0$ and $\eta>0$, first choose $\delta>0$ so small that the second probability below is less than $\eta/2$ for all sufficiently large $n$, and then use consistency to make the first probability less than $\eta/2$ for all sufficiently large $n$. This gives \begin{align*} \mathbb{P}\left( \sup_{0\le t\le 1}\left\| -\frac{1}{n}H_n(\vartheta_{n,t})-I(\vartheta_0)\right\|_{\mathrm{op}}>\varepsilon \right) &\le \mathbb{P}\bigl(|\hat{\vartheta}_n-\vartheta_0|\ge\delta\bigr)\\ &\quad+ \mathbb{P}\left( \sup_{\vartheta \in B(\vartheta_0,\delta)}\left\| -\frac{1}{n}H_n(\vartheta)-I(\vartheta_0)\right\|_{\mathrm{op}}>\varepsilon \right)\\ &<\eta \end{align*} for all sufficiently large $n$. Therefore \begin{align*} \sup_{0\le t\le 1}\left\| -\frac{1}{n}H_n(\vartheta_{n,t})-I(\vartheta_0)\right\|_{\mathrm{op}} \xrightarrow{\mathbb{P}} 0.
\end{align*} Since $\int_0^1 I(\vartheta_0)\,d\mathcal{L}^1(t)=I(\vartheta_0)$, the definition of $\bar{H}_n$ and the fact that the operator norm of an integral is at most the integral of the operator norm give \begin{align*} \left\|-\frac{1}{n}\bar{H}_n-I(\vartheta_0)\right\|_{\mathrm{op}} &= \left\|\int_0^1\left(-\frac{1}{n}H_n(\vartheta_{n,t})-I(\vartheta_0)\right)\,d\mathcal{L}^1(t)\right\|_{\mathrm{op}}\\ &\le \int_0^1\left\|-\frac{1}{n}H_n(\vartheta_{n,t})-I(\vartheta_0)\right\|_{\mathrm{op}}\,d\mathcal{L}^1(t)\\ &\le \sup_{0\le t\le 1}\left\| -\frac{1}{n}H_n(\vartheta_{n,t})-I(\vartheta_0)\right\|_{\mathrm{op}}, \end{align*} so \begin{align*} -\frac{1}{n}\bar{H}_n \xrightarrow{\mathbb{P}} I(\vartheta_0). \end{align*} Because $I(\vartheta_0)$ is nonsingular and the determinant is continuous, choose $\rho>0$ such that every matrix $A \in \mathbb{R}^{(p+q)\times(p+q)}$ with $\|A-I(\vartheta_0)\|_{\mathrm{op}}<\rho$ is nonsingular. The convergence just established therefore implies that $-n^{-1}\bar{H}_n$ is nonsingular with probability tending to one. The adjugate formula for matrix inversion, together with continuity of the determinant and the cofactors, shows that the inversion map is continuous at $I(\vartheta_0)$. Hence, by the continuous mapping theorem, \begin{align*} \left(-\frac{1}{n}\bar{H}_n\right)^{-1} \xrightarrow{\mathbb{P}} I(\vartheta_0)^{-1}. \end{align*} [/step]
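
The convergence $-n^{-1}\bar{H}_n \xrightarrow{\mathbb{P}} I(\vartheta_0)$ can be illustrated by simulation. The sketch below is only an illustration under stated assumptions, not the setting of the theorem: it uses a one-parameter exponential model with rate $\vartheta$, for which $H_n(\vartheta)=-n/\vartheta^2$, $I(\vartheta_0)=1/\vartheta_0^2$, and $\hat{\vartheta}_n$ is the reciprocal of the sample mean. The model, the sample sizes, and the function name `averaged_information` are hypothetical choices made for the example. In this model the averaged quantity has the closed form $-n^{-1}\bar{H}_n = 1/(\vartheta_0\hat{\vartheta}_n)$, which gives an independent check on the quadrature.

```python
import numpy as np


def averaged_information(theta0, theta_hat, nodes=20):
    """Approximate -(1/n) * H_bar_n for the exponential-rate model.

    Here -(1/n) * H_n(theta) = 1 / theta**2, averaged over the segment
    theta0 + t * (theta_hat - theta0), t in [0, 1], by Gauss-Legendre quadrature.
    """
    x, w = np.polynomial.legendre.leggauss(nodes)
    t = 0.5 * (x + 1.0)  # map nodes from [-1, 1] to [0, 1]
    theta_t = theta0 + t * (theta_hat - theta0)
    return float(np.sum(0.5 * w / theta_t**2))


rng = np.random.default_rng(1)
theta0 = 2.0
fisher = 1.0 / theta0**2  # Fisher information at the true rate

errors = {}
for n in (100, 10_000, 1_000_000):
    sample = rng.exponential(scale=1.0 / theta0, size=n)
    theta_hat = 1.0 / sample.mean()  # maximum likelihood estimate of the rate
    errors[n] = abs(averaged_information(theta0, theta_hat) - fisher)
    print(n, errors[n])
```

The printed errors should typically shrink as $n$ grows, at roughly the rate of $|\hat{\vartheta}_n-\vartheta_0|$, reflecting that the averaged information is controlled by the supremum over the segment between $\vartheta_0$ and $\hat{\vartheta}_n$.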

Verification Progress

6 Total Blocks

0 Verified

0% verified

Contributors

admin 6 blocks (0 verified)

