[proofplan]
The proof is a direct finite-dimensional duality estimate. We first rewrite the coordinate remainder as the product of a row vector and an error vector, then bound this product by the $\ell_\infty$ norm of the row times the $\ell_1$ norm of the vector. The stated probabilistic order follows from multiplying two $O_{\mathbb P}$ bounds, and the nodewise formulation follows by transposing the column residual when $\hat\Sigma$ is symmetric.
[/proofplan]
[step:Bound the coordinate remainder by $\ell_\infty$ and $\ell_1$ duality]Define the row vector $a_j^\top \in \mathbb{R}^{1 \times p}$ by
\begin{align*}
a_j^\top := e_j^\top(I_p-\hat\Theta\hat\Sigma).
\end{align*}
Define the error vector $h \in \mathbb{R}^p$ by
\begin{align*}
h := \hat\beta-\beta^*.
\end{align*}
Define the scalar coordinate remainder $R_j \in \mathbb{R}$ by
\begin{align*}
R_j:=e_j^\top(I_p-\hat\Theta\hat\Sigma)(\hat\beta-\beta^*)=a_j^\top h.
\end{align*}
Writing $a_{j,k}$ for the $k$-th component of $a_j$ and $h_k$ for the $k$-th component of $h$, the triangle inequality gives
\begin{align*}
|R_j|
= \left|\sum_{k=1}^{p} a_{j,k}h_k\right|
\leq \sum_{k=1}^{p} |a_{j,k}|\,|h_k|.
\end{align*}
Since $|a_{j,k}|\leq \|a_j^\top\|_\infty$ for every $k \in \{1,\dots,p\}$, we obtain
\begin{align*}
|R_j|
\leq \|a_j^\top\|_\infty\sum_{k=1}^{p}|h_k|
= \|e_j^\top(I_p-\hat\Theta\hat\Sigma)\|_\infty\,\|\hat\beta-\beta^*\|_1.
\end{align*}[/step]
[guided]The coordinate remainder is a scalar obtained by pairing one row vector with one estimation error vector. To make that pairing explicit, define the row vector $a_j^\top \in \mathbb{R}^{1\times p}$ by
\begin{align*}
a_j^\top := e_j^\top(I_p-\hat\Theta\hat\Sigma).
\end{align*}
Define the error vector $h \in \mathbb{R}^p$ by
\begin{align*}
h := \hat\beta-\beta^*.
\end{align*}
Define the scalar coordinate remainder $R_j \in \mathbb{R}$ by
\begin{align*}
R_j:=e_j^\top(I_p-\hat\Theta\hat\Sigma)(\hat\beta-\beta^*)=a_j^\top h.
\end{align*}
If $a_{j,k}$ denotes the $k$-th component of $a_j$ and $h_k$ denotes the $k$-th component of $h$, then the scalar product is
\begin{align*}
R_j=\sum_{k=1}^{p}a_{j,k}h_k.
\end{align*}
Taking absolute values and applying the triangle inequality,
\begin{align*}
|R_j|
\leq \sum_{k=1}^{p}|a_{j,k}|\,|h_k|.
\end{align*}
The $\ell_\infty$ norm of $a_j^\top$ is the largest absolute component of the row, so for each $k$,
\begin{align*}
|a_{j,k}|\leq \|a_j^\top\|_\infty.
\end{align*}
Substituting this componentwise bound into the previous inequality gives
\begin{align*}
|R_j|
\leq \sum_{k=1}^{p}\|a_j^\top\|_\infty |h_k|
= \|a_j^\top\|_\infty \sum_{k=1}^{p}|h_k|
= \|a_j^\top\|_\infty \|h\|_1.
\end{align*}
Returning to the definitions of $a_j^\top$ and $h$, this is exactly
\begin{align*}
|R_j|
\leq \|e_j^\top(I_p-\hat\Theta\hat\Sigma)\|_\infty\,\|\hat\beta-\beta^*\|_1.
\end{align*}
This is the finite-dimensional $\ell_\infty$-$\ell_1$ duality estimate.[/guided]
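As a small numerical illustration of the bound above, take $p=3$ with the following hypothetical values, chosen only for easy arithmetic and not taken from any estimator:
\begin{align*}
a_j^\top=(0.1,\,-0.2,\,0.05),
\qquad
h=(1,\,0.5,\,-2)^\top.
\end{align*}
Then
\begin{align*}
R_j &= (0.1)(1)+(-0.2)(0.5)+(0.05)(-2)=0.1-0.1-0.1=-0.1,\\
\|a_j^\top\|_\infty &= 0.2,
\qquad
\|h\|_1=1+0.5+2=3.5,
\end{align*}
so $|R_j|=0.1\leq 0.2\cdot 3.5=0.7$, consistent with the inequality. Equality can occur: with the same $a_j^\top$ and $h=(0,\,-1,\,0)^\top$, both sides equal $0.2$, because $h$ is then supported on the coordinate where $|a_{j,k}|$ is largest.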
[step:Multiply the two probabilistic rates without dividing by the normalizer]Set the nonnegative deterministic sequence
\begin{align*}
r_n:=\sqrt{\frac{\log p}{n}}.
\end{align*}
Let
\begin{align*}
X_n := \|e_j^\top(I_p-\hat\Theta\hat\Sigma)\|_\infty
\end{align*}
and
\begin{align*}
Y_n := \|\hat\beta-\beta^*\|_1.
\end{align*}
By hypothesis, in the stated convention for stochastic boundedness with possibly vanishing normalizers,
\begin{align*}
X_n = O_{\mathbb P}(r_n),
\qquad
Y_n = O_{\mathbb P}(s r_n).
\end{align*}
To verify the product rule without forming the ratios $X_n/r_n$ or $Y_n/(s r_n)$, fix $\varepsilon>0$. By the definition of $O_{\mathbb P}(r_n)$ and $O_{\mathbb P}(s r_n)$, choose constants $M_1>0$ and $M_2>0$ such that, for all sufficiently large $n$,
\begin{align*}
\mathbb P(X_n>M_1 r_n)<\frac{\varepsilon}{2}
\end{align*}
and
\begin{align*}
\mathbb P(Y_n>M_2 s r_n)<\frac{\varepsilon}{2}.
\end{align*}
This remains meaningful when $r_n=0$, because the definition then controls the event on which the corresponding nonnegative random variable is positive. On the complement of these two events, $X_n\leq M_1r_n$ and $Y_n\leq M_2sr_n$, hence
\begin{align*}
X_nY_n\leq M_1M_2 s r_n^2.
\end{align*}
Therefore the union bound gives
\begin{align*}
\mathbb P(X_nY_n>M_1M_2 s r_n^2)<\varepsilon
\end{align*}
for all sufficiently large $n$, which is exactly
\begin{align*}
\|e_j^\top(I_p-\hat\Theta\hat\Sigma)\|_\infty\,\|\hat\beta-\beta^*\|_1
=O_{\mathbb P}(s r_n^2).
\end{align*}
Since
\begin{align*}
s r_n^2=s\frac{\log p}{n},
\end{align*}
the deterministic inequality $|R_j|\leq X_nY_n$ from the previous step implies
\begin{align*}
R_j=O_{\mathbb P}\left(\frac{s\log p}{n}\right).
\end{align*}[/step]
[guided]The purpose of this step is to justify that the deterministic duality bound preserves the stated stochastic rate. Define
\begin{align*}
r_n:=\sqrt{\frac{\log p}{n}},
\end{align*}
and define the nonnegative random variables
\begin{align*}
X_n := \|e_j^\top(I_p-\hat\Theta\hat\Sigma)\|_\infty,
\qquad
Y_n := \|\hat\beta-\beta^*\|_1.
\end{align*}
The hypotheses say that $X_n=O_{\mathbb P}(r_n)$ and $Y_n=O_{\mathbb P}(s r_n)$. By the definition of stochastic boundedness, for each $\varepsilon>0$ there are constants $M_1>0$ and $M_2>0$ such that, for all sufficiently large $n$,
\begin{align*}
\mathbb P(X_n>M_1 r_n)<\frac{\varepsilon}{2},
\qquad
\mathbb P(Y_n>M_2 s r_n)<\frac{\varepsilon}{2}.
\end{align*}
On the complement of these two exceptional events, both bounds hold simultaneously, and hence
\begin{align*}
X_nY_n\leq M_1M_2s r_n^2.
\end{align*}
Therefore the union bound gives
\begin{align*}
\mathbb P(X_nY_n>M_1M_2s r_n^2)
\leq \mathbb P(X_n>M_1r_n)+\mathbb P(Y_n>M_2sr_n)
<\varepsilon.
\end{align*}
This proves
\begin{align*}
X_nY_n=O_{\mathbb P}(s r_n^2).
\end{align*}
Since
\begin{align*}
s r_n^2=s\frac{\log p}{n},
\end{align*}
and since the previous step proved $|R_j|\leq X_nY_n$, we conclude
\begin{align*}
R_j=O_{\mathbb P}\left(\frac{s\log p}{n}\right).
\end{align*}[/guided]
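To give a sense of scale for the normalizers, here is a short arithmetic sketch under the hypothetical values $n=1000$, $p=500$, $s=5$, with $\log$ taken as the natural logarithm. These values are assumptions made only for illustration; the statement itself is asymptotic and fixes no such values.
\begin{align*}
r_n^2 &= \frac{\log 500}{1000}\approx 0.00621,
& r_n &\approx 0.0788,\\
s r_n &\approx 0.394,
& s r_n^2 &\approx 0.0311.
\end{align*}
The product of the two individual normalizers, $r_n\cdot s r_n\approx 0.0788\cdot 0.394$, reproduces $s r_n^2=s\log p/n\approx 0.0311$, which is the normalizer obtained for $R_j$. These numbers say nothing about the constants $M_1$ and $M_2$, which depend on $\varepsilon$.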
[step:Recover the row bound from the nodewise column bound under symmetry]
Assume that $\hat\Sigma=\hat\Sigma^\top$ and that the $j$-th row of $\hat\Theta$ is $\hat\theta_j^\top$, where $\hat\theta_j \in \mathbb{R}^p$. Then
\begin{align*}
e_j^\top(I_p-\hat\Theta\hat\Sigma)
= e_j^\top-\hat\theta_j^\top\hat\Sigma
= (e_j-\hat\Sigma\hat\theta_j)^\top,
\end{align*}
where the last identity uses $\hat\Sigma^\top=\hat\Sigma$. Therefore
\begin{align*}
\|e_j^\top(I_p-\hat\Theta\hat\Sigma)\|_\infty
= \|(e_j-\hat\Sigma\hat\theta_j)^\top\|_\infty
= \|\hat\Sigma\hat\theta_j-e_j\|_\infty.
\end{align*}
Hence any nodewise bound of the form
\begin{align*}
\|\hat\Sigma\hat\theta_j-e_j\|_\infty
=O_{\mathbb P}\left(\sqrt{\frac{\log p}{n}}\right)
\end{align*}
implies the required row bound $\|e_j^\top(I_p-\hat\Theta\hat\Sigma)\|_\infty=O_{\mathbb P}\bigl(\sqrt{\log p/n}\bigr)$. Combining this with the two preceding steps proves the claimed coordinate remainder rate.
[/step]
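A minimal worked check of the transposition identity, using hypothetical values for $p=2$ and $j=1$ that are not the output of any nodewise procedure: let
\begin{align*}
\hat\Sigma=\begin{pmatrix}2&1\\1&2\end{pmatrix},
\qquad
\hat\theta_1=\begin{pmatrix}0.6\\-0.2\end{pmatrix}.
\end{align*}
Then
\begin{align*}
\hat\Sigma\hat\theta_1 &= \begin{pmatrix}1.2-0.2\\0.6-0.4\end{pmatrix}=\begin{pmatrix}1\\0.2\end{pmatrix},
& \hat\Sigma\hat\theta_1-e_1 &= \begin{pmatrix}0\\0.2\end{pmatrix},\\
\hat\theta_1^\top\hat\Sigma &= (1,\;0.2),
& e_1^\top-\hat\theta_1^\top\hat\Sigma &= (0,\;-0.2),
\end{align*}
so the row residual is the transpose of the negated column residual and both have $\ell_\infty$ norm $0.2$. Without symmetry the row would instead equal $(e_1-\hat\Sigma^\top\hat\theta_1)^\top$, which is why the assumption $\hat\Sigma=\hat\Sigma^\top$ is needed to pass from the column bound to the row bound.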