[guided]We expand the score, not the likelihood itself, because the estimator is characterised by the first-order condition. Define the score map
\begin{align*}
S_n: U &\to \mathbb{R}^{p+q},\\
\vartheta &\mapsto \nabla \ell_n(\vartheta),
\end{align*}
and define the observed Hessian matrix map as the Jacobian of the score,
\begin{align*}
H_n: U &\to \mathbb{R}^{(p+q)\times(p+q)},\\
\vartheta &\mapsto JS_{n,\vartheta}.
\end{align*}
Thus the $(i,j)$ entry of $H_n(\vartheta)$ is $\partial_{\vartheta_j}S_{n,i}(\vartheta)$ for $1 \le i,j \le p+q$. In several parameters, a componentwise Taylor theorem need not produce one common intermediate point for every component of the score, so the scalar one-dimensional mean value theorem cannot be applied directly in the vector setting. We use the integral form of Taylor's formula instead. Let $\mathcal{L}^1$ denote one-dimensional Lebesgue measure on $[0,1]$.
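To see concretely why a common intermediate point can fail to exist, consider the following toy curve, which is assumed here purely for illustration and is not part of the model: let $f:[0,1]\to\mathbb{R}^2$ with $f(t)=(\cos 2\pi t,\sin 2\pi t)$. Then
\begin{align*}
f(1)-f(0)=0,
\qquad
f'(t)=2\pi(-\sin 2\pi t,\cos 2\pi t),
\qquad
\|f'(t)\|=2\pi\neq 0 \text{ for every } t\in[0,1],
\end{align*}
so no single $t^*\in[0,1]$ satisfies $f(1)-f(0)=f'(t^*)$, although each component separately has its own intermediate point. The integral form has no such defect, since $\int_0^1 f'(t)\,d\mathcal{L}^1(t)=f(1)-f(0)$ holds as a vector identity.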
On $E_n$, the estimator belongs to $B(\vartheta_0,r)$, and this ball is convex as a Euclidean ball in $\mathbb{R}^{p+q}$. Therefore for every $t \in [0,1]$, the parameter
\begin{align*}
\vartheta_{n,t}:=\vartheta_0+t(\hat{\vartheta}_n-\vartheta_0)
\end{align*}
belongs to $B(\vartheta_0,r) \subset U$. The same event $E_n$ also gives twice continuous differentiability of $\ell_n$ on $U$, hence continuous differentiability of $S_n$ on the whole segment $\{\vartheta_{n,t}:0\le t\le 1\}$. These hypotheses allow us to reduce the vector-valued expansion to one-dimensional calculus along the line segment. Define the pathwise score curve
\begin{align*}
g_n: [0,1] &\to \mathbb{R}^{p+q},\\
t &\mapsto S_n(\vartheta_{n,t}).
\end{align*}
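Because $S_n$ is continuously differentiable on the segment and $t\mapsto\vartheta_{n,t}$ is affine with derivative $\hat{\vartheta}_n-\vartheta_0$, the chain rule shows that every component of $g_n$ is continuously differentiable on $[0,1]$, with $g_n'(t)=H_n(\vartheta_{n,t})(\hat{\vartheta}_n-\vartheta_0)$. By the [Fundamental Theorem of Calculus](/theorems/632) applied componentwise to $g_n$,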
\begin{align*}
S_n(\hat{\vartheta}_n)-S_n(\vartheta_0)
=
g_n(1)-g_n(0)
=
\int_0^1 H_n(\vartheta_{n,t})(\hat{\vartheta}_n-\vartheta_0)\,d\mathcal{L}^1(t).
\end{align*}
Since the vector $\hat{\vartheta}_n-\vartheta_0$ does not depend on the integration variable $t$, linearity of the integral lets us factor it out to the right of the matrix integral, which is understood entrywise. Define the averaged observed Hessian matrix
\begin{align*}
\bar{H}_n: E_n &\to \mathbb{R}^{(p+q)\times(p+q)},\\
\omega &\mapsto \int_0^1 H_n\bigl(\vartheta_0+t(\hat{\vartheta}_n(\omega)-\vartheta_0)\bigr)\,d\mathcal{L}^1(t).
\end{align*}
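Written out in components, as a sketch that uses only the definitions above (the component notation $\hat{\vartheta}_{n,j}$, $\vartheta_{0,j}$ for the $j$-th coordinates is introduced here for illustration), the $(i,j)$ entry of $\bar{H}_n$ and the $i$-th component of the integral expansion read
\begin{align*}
(\bar{H}_n)_{ij}
&=
\int_0^1 \partial_{\vartheta_j}S_{n,i}(\vartheta_{n,t})\,d\mathcal{L}^1(t),\\
S_{n,i}(\hat{\vartheta}_n)-S_{n,i}(\vartheta_0)
&=
\sum_{j=1}^{p+q}(\bar{H}_n)_{ij}\,(\hat{\vartheta}_{n,j}-\vartheta_{0,j}),
\end{align*}
for $1\le i\le p+q$. Each row is thus averaged along the whole segment rather than evaluated at a row-specific intermediate point.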
Then the Taylor identity becomes
\begin{align*}
S_n(\hat{\vartheta}_n)
=
S_n(\vartheta_0)+\bar{H}_n(\hat{\vartheta}_n-\vartheta_0).
\end{align*}
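As a sanity check, consider a one-parameter toy case that is assumed here purely for illustration and is not the model of this section: take $U=\mathbb{R}$ and $\ell_n(\vartheta)=-\tfrac12\sum_{i=1}^n(X_i-\vartheta)^2$ for real observations $X_1,\dots,X_n$. Then
\begin{align*}
S_n(\vartheta)&=\sum_{i=1}^n(X_i-\vartheta),
&
H_n(\vartheta)&=-n,
&
\bar{H}_n&=\int_0^1(-n)\,d\mathcal{L}^1(t)=-n,
\end{align*}
and the right side of the Taylor identity reduces to
\begin{align*}
S_n(\vartheta_0)+\bar{H}_n(\hat{\vartheta}_n-\vartheta_0)
=
\sum_{i=1}^n(X_i-\vartheta_0)-n(\hat{\vartheta}_n-\vartheta_0)
=
\sum_{i=1}^n(X_i-\hat{\vartheta}_n)
=
S_n(\hat{\vartheta}_n),
\end{align*}
so the identity holds exactly. In this quadratic case the Hessian is constant and averaging along the segment changes nothing; for a non-quadratic log-likelihood $\bar{H}_n$ may differ from $H_n(\vartheta_0)$.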
Because $\hat{\vartheta}_n$ is an interior maximiser on $E_n$, the first-order likelihood equation gives $S_n(\hat{\vartheta}_n)=0$. Hence
\begin{align*}
0
=
S_n(\vartheta_0)+\bar{H}_n(\hat{\vartheta}_n-\vartheta_0).
\end{align*}
Multiplying by $n^{-1/2}$, writing $n^{-1/2}=n^{-1}\sqrt n$ in the Hessian term, and moving that term to the left gives
\begin{align*}
-\frac{1}{n}\bar{H}_n\sqrt n(\hat{\vartheta}_n-\vartheta_0)
=
\frac{1}{\sqrt n}S_n(\vartheta_0).
\end{align*}
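This identity is the mechanism of the proof: the left side is the normalised averaged observed information $-n^{-1}\bar{H}_n$ applied to the rescaled estimation error $\sqrt n(\hat{\vartheta}_n-\vartheta_0)$, while the right side is the normalised score at $\vartheta_0$.[/guided]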