[proofplan]
The Fisher scoring quadratic approximation is minimized by solving the weighted least-squares problem with working response $z^{(k)}$ and weight matrix $W^{(k)}$. We expand the weighted residual criterion as a quadratic function of $\beta$, compute its directional derivative, and identify the normal equations. The assumed invertibility of $X^\top W^{(k)}X$ gives a unique critical point. Positive semidefiniteness of the quadratic form, combined with that invertibility, shows that this critical point is the unique global minimizer.
[/proofplan]
[step:Expand the weighted residual criterion as a quadratic polynomial in $\beta$]
For this fixed iteration $k$, write $z := z^{(k)} \in \mathbb{R}^n$ and $W := W^{(k)} \in \mathbb{R}^{n \times n}$. The matrix $W$ is symmetric because it is diagonal. Define
\begin{align*}
Q:\mathbb{R}^p &\to \mathbb{R} \\
\beta &\mapsto (z-X\beta)^\top W(z-X\beta).
\end{align*}
Expanding the product and using $W^\top=W$ gives
\begin{align*}
Q(\beta)
&= z^\top Wz - z^\top WX\beta - \beta^\top X^\top Wz + \beta^\top X^\top WX\beta \\
&= z^\top Wz - 2\beta^\top X^\top Wz + \beta^\top X^\top WX\beta.
\end{align*}
Here $z^\top WX\beta$ and $\beta^\top X^\top Wz$ are equal because each is a real scalar and, since $W^\top=W$, they are transposes of one another.
[guided]
We first remove the residual notation so that the objective is visibly a quadratic polynomial in the unknown vector $\beta$. For this fixed iteration $k$, set $z := z^{(k)}$ and $W := W^{(k)}$. The weighted residual objective is the map
\begin{align*}
Q:\mathbb{R}^p &\to \mathbb{R} \\
\beta &\mapsto (z-X\beta)^\top W(z-X\beta).
\end{align*}
Because $W$ is diagonal, it is symmetric: $W^\top=W$. Expanding the product gives
\begin{align*}
Q(\beta)
&= (z^\top-\beta^\top X^\top)W(z-X\beta) \\
&= z^\top Wz - z^\top WX\beta - \beta^\top X^\top Wz + \beta^\top X^\top WX\beta.
\end{align*}
The two mixed terms are equal as real scalars. Indeed,
\begin{align*}
(z^\top WX\beta)^\top=\beta^\top X^\top W^\top z=\beta^\top X^\top Wz.
\end{align*}
Therefore
\begin{align*}
Q(\beta)=z^\top Wz - 2\beta^\top X^\top Wz + \beta^\top X^\top WX\beta.
\end{align*}
This expansion isolates the constant term, the linear term, and the quadratic curvature matrix $X^\top WX$.
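As a quick sanity check of the expansion, consider a hypothetical scalar instance (the numbers are illustrative and not taken from the statement): $n=2$, $p=1$, $X=(1,2)^\top$, $W=\operatorname{diag}(2,1)$, and $z=(1,3)^\top$. Then $z^\top Wz=11$, $X^\top Wz=8$, and $X^\top WX=6$, so the expanded form reads
\begin{align*}
Q(\beta)=11-16\beta+6\beta^2,
\end{align*}
which agrees with the direct computation of the weighted sum of squared residuals:
\begin{align*}
2(1-\beta)^2+(3-2\beta)^2
=(2-4\beta+2\beta^2)+(9-12\beta+4\beta^2)
=11-16\beta+6\beta^2.
\end{align*}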
[/guided]
[/step]
[step:Differentiate the quadratic objective and obtain the weighted normal equations]
Let $h \in \mathbb{R}^p$ be an arbitrary direction. For each $t \in \mathbb{R}$,
\begin{align*}
Q(\beta+th)
&= z^\top Wz - 2(\beta+th)^\top X^\top Wz
+(\beta+th)^\top X^\top WX(\beta+th).
\end{align*}
Subtracting $Q(\beta)$, dividing by $t \neq 0$, and letting $t \to 0$ gives the directional derivative
\begin{align*}
\frac{d}{dt}\Big|_{t=0} Q(\beta+th)
= 2h^\top X^\top WX\beta - 2h^\top X^\top Wz.
\end{align*}
Thus the gradient is
\begin{align*}
\nabla Q(\beta)=2X^\top WX\beta-2X^\top Wz.
\end{align*}
A critical point therefore satisfies the weighted normal equations
\begin{align*}
X^\top WX\beta=X^\top Wz.
\end{align*}
[guided]
To find the minimizer of the quadratic objective, we compute its first variation in an arbitrary direction. Let $h \in \mathbb{R}^p$ be fixed. For $t \in \mathbb{R}$, substitute $\beta+th$ into the expanded expression for $Q$:
\begin{align*}
Q(\beta+th)
&= z^\top Wz - 2(\beta+th)^\top X^\top Wz
+(\beta+th)^\top X^\top WX(\beta+th).
\end{align*}
Now expand only the terms depending on $t$:
\begin{align*}
Q(\beta+th)
&= Q(\beta)
-2t h^\top X^\top Wz
+2t h^\top X^\top WX\beta
+t^2 h^\top X^\top WXh.
\end{align*}
The coefficient of $t$ is the directional derivative at $\beta$ in the direction $h$, so
\begin{align*}
\frac{d}{dt}\Big|_{t=0} Q(\beta+th)
=2h^\top X^\top WX\beta-2h^\top X^\top Wz.
\end{align*}
Since this identity holds for every direction $h \in \mathbb{R}^p$, the gradient is
\begin{align*}
\nabla Q(\beta)=2X^\top WX\beta-2X^\top Wz.
\end{align*}
At any minimizer of a differentiable function on $\mathbb{R}^p$ (every point of which is an interior point), the gradient must vanish. Hence every minimizer must satisfy
\begin{align*}
X^\top WX\beta=X^\top Wz.
\end{align*}
These are the weighted normal equations.
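The gradient formula can be checked numerically. The following is a minimal sketch in plain Python, assuming hypothetical toy data with $n=3$ and $p=2$ (the arrays `X`, `w`, `z` and the helper names are illustrative and do not come from the text). It compares a central finite difference of $t \mapsto Q(\beta+th)$ with $h^\top \nabla Q(\beta)$.

```python
# Hypothetical toy data (not from the text): n = 3 observations, p = 2 coefficients.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
w = [0.5, 1.0, 2.0]   # diagonal entries of W, all positive
z = [1.0, 2.0, 4.0]   # working response


def residuals(beta):
    """Components of z - X beta."""
    return [zi - sum(x * b for x, b in zip(row, beta)) for row, zi in zip(X, z)]


def Q(beta):
    """Weighted residual criterion (z - X beta)^T W (z - X beta)."""
    return sum(wi * r * r for wi, r in zip(w, residuals(beta)))


def grad_Q(beta):
    """Closed-form gradient 2 X^T W X beta - 2 X^T W z = -2 X^T W (z - X beta)."""
    g = [0.0] * len(beta)
    for row, wi, r in zip(X, w, residuals(beta)):
        for j, x in enumerate(row):
            g[j] -= 2.0 * wi * x * r
    return g


def directional_derivative(beta, h, t=1e-6):
    """Central finite difference approximating d/dt Q(beta + t h) at t = 0."""
    plus = [b + t * hj for b, hj in zip(beta, h)]
    minus = [b - t * hj for b, hj in zip(beta, h)]
    return (Q(plus) - Q(minus)) / (2.0 * t)


beta = [0.3, -1.2]
h = [1.0, 2.0]
numeric = directional_derivative(beta, h)
analytic = sum(gj * hj for gj, hj in zip(grad_Q(beta), h))
print(abs(numeric - analytic))  # expected to be tiny (rounding error only)
```

Because $Q$ is quadratic, the central difference has no truncation error, so any discrepancy is due to floating-point rounding alone.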
[/guided]
[/step]
[step:Solve the normal equations using the assumed invertibility]
By hypothesis, the matrix $X^\top WX \in \mathbb{R}^{p \times p}$ is invertible. Therefore the weighted normal equations have the unique solution
\begin{align*}
\beta_*=(X^\top WX)^{-1}X^\top Wz.
\end{align*}
Returning to the iteration notation $z=z^{(k)}$ and $W=W^{(k)}$, this is
\begin{align*}
\beta_*=\left(X^\top W^{(k)}X\right)^{-1}X^\top W^{(k)}z^{(k)}.
\end{align*}
[/step]
[step:Verify that the critical point is the unique global minimizer]
Let $\beta \in \mathbb{R}^p$ be arbitrary and set $u:=\beta-\beta_* \in \mathbb{R}^p$. Expanding $Q$ around $\beta_*$ and using the symmetry of $X^\top WX$ gives
\begin{align*}
Q(\beta)
&=Q(\beta_*+u) \\
&=Q(\beta_*)+2u^\top\left(X^\top WX\beta_*-X^\top Wz\right)+u^\top X^\top WXu \\
&=Q(\beta_*)+u^\top X^\top WXu,
\end{align*}
where the linear term vanishes because $\beta_*$ satisfies $X^\top WX\beta_*=X^\top Wz$.
Because $W=\operatorname{diag}(w_1,\dots,w_n)$ with each $w_i>0$, we have
\begin{align*}
u^\top X^\top WXu=(Xu)^\top W(Xu)=\sum_{i=1}^{n} w_i ((Xu)_i)^2 \ge 0.
\end{align*}
Thus $Q(\beta)\ge Q(\beta_*)$ for every $\beta \in \mathbb{R}^p$.
If equality holds, then $u^\top X^\top WXu=0$. Since every $w_i>0$, the sum above forces $(Xu)_i=0$ for each $i$, so $Xu=0$ and hence $X^\top WXu=0$. Because $X^\top WX$ is invertible, $u=0$, that is, $\beta=\beta_*$. Therefore $\beta_*$ is the unique global minimizer of $Q$.
Substituting back $W=W^{(k)}$ and $z=z^{(k)}$ gives the Fisher scoring, equivalently IRLS, update
\begin{align*}
\beta^{(k+1)}=\left(X^\top W^{(k)}X\right)^{-1}X^\top W^{(k)}z^{(k)}.
\end{align*}
This proves the statement.
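The update and the minimality argument can be illustrated with exact arithmetic. The sketch below assumes hypothetical toy data with $n=3$ and $p=2$ (the values of `X`, `w`, `z` and all helper names are illustrative, not part of the statement). It solves the weighted normal equations by Cramer's rule, which is adequate only for this $2 \times 2$ case, and exposes the two sides of the identity $Q(\beta)-Q(\beta_*)=u^\top X^\top WXu$.

```python
from fractions import Fraction as F

# Hypothetical toy data (not from the text): n = 3, p = 2, positive weights.
X = [[F(1), F(0)], [F(1), F(1)], [F(1), F(2)]]
w = [F(1, 2), F(1), F(2)]   # diagonal entries of W
z = [F(1), F(2), F(4)]      # working response


def Q(beta):
    """Weighted residual criterion (z - X beta)^T W (z - X beta)."""
    return sum(wi * (zi - row[0] * beta[0] - row[1] * beta[1]) ** 2
               for row, wi, zi in zip(X, w, z))


# A = X^T W X (symmetric 2 x 2) and b = X^T W z.
A = [[sum(wi * row[i] * row[j] for row, wi in zip(X, w)) for j in range(2)]
     for i in range(2)]
b = [sum(wi * row[i] * zi for row, wi, zi in zip(X, w, z)) for i in range(2)]

# Invertibility of A is the standing hypothesis; for 2 x 2 it means det != 0.
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
assert det != 0

# Solve A beta = b by Cramer's rule (illustration only, not a general solver).
beta_star = [(b[0] * A[1][1] - A[0][1] * b[1]) / det,
             (A[0][0] * b[1] - A[1][0] * b[0]) / det]


def curvature(u):
    """Quadratic form u^T X^T W X u."""
    return sum(u[i] * A[i][j] * u[j] for i in range(2) for j in range(2))


def excess(beta):
    """Q(beta) - Q(beta_star); the proof shows this equals curvature(beta - beta_star)."""
    return Q(beta) - Q(beta_star)


beta = [F(2), F(-1)]
u = [beta[0] - beta_star[0], beta[1] - beta_star[1]]
print(excess(beta) == curvature(u))  # the two sides agree exactly in rational arithmetic
```

Using `Fraction` keeps the comparison exact. For larger $p$ or floating-point data, one would more plausibly call a linear solver from a numerical library instead of Cramer's rule.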
[/step]