[proofplan]
The proof is an algebraic consequence of the optimality conditions for the nodewise Lasso problem. The Karush-Kuhn-Tucker condition controls every coordinate of $X_{-j}^\top \hat r_j/n$ by $\lambda_j$, while the subgradient identity also identifies the special $j$th coordinate $X_j^\top \hat r_j/n$ with $\hat\tau_j^2$. Since $X\hat c_j=\hat r_j$, the vector $\hat\Sigma\hat\theta_j$ is exactly $X^\top \hat r_j/(n\hat\tau_j^2)$, and the desired coordinatewise bound follows.
[/proofplan]
custom_env
admin
[step:Write the nodewise optimality condition in subgradient form]Fix $j \in \{1,\dots,p\}$. Define the convex objective function $Q_j: \mathbb{R}^{p-1} \to \mathbb{R}$ by
\begin{align*}
Q_j(\gamma) = \frac{1}{n}|X_j - X_{-j}\gamma|^2 + 2\lambda_j\|\gamma\|_1.
\end{align*}
Since $\hat{\gamma}_j$ minimizes $Q_j$, the first-order convex optimality condition gives
\begin{align*}
0 \in -\frac{2}{n}X_{-j}^\top(X_j - X_{-j}\hat{\gamma}_j) + 2\lambda_j\partial\|\hat{\gamma}_j\|_1.
\end{align*}
Using the definition $\hat r_j := X_j - X_{-j}\hat\gamma_j$, there exists a vector $\hat\kappa_j \in \mathbb{R}^{p-1}$ with $\hat\kappa_j \in \partial\|\hat\gamma_j\|_1$ such that
\begin{align*}
\frac{1}{n}X_{-j}^\top\hat r_j = \lambda_j \hat\kappa_j.
\end{align*}
For the $\ell^1$ subdifferential, this means $\|\hat\kappa_j\|_\infty \leq 1$ and
\begin{align*}
\hat\gamma_j^\top\hat\kappa_j = \|\hat\gamma_j\|_1.
\end{align*}[/step]
custom_env
admin
[guided]We start by converting the nodewise Lasso minimization problem into a coordinatewise identity. The function being minimized is the map $Q_j: \mathbb{R}^{p-1} \to \mathbb{R}$ defined by
\begin{align*}
Q_j(\gamma) = \frac{1}{n}|X_j - X_{-j}\gamma|^2 + 2\lambda_j\|\gamma\|_1.
\end{align*}
This function is convex because it is the sum of a convex quadratic function and the convex $\ell^1$ norm. Since $\hat\gamma_j$ is a minimizer, the convex first-order condition is
\begin{align*}
0 \in \partial Q_j(\hat\gamma_j).
\end{align*}
The differentiable quadratic part has gradient
\begin{align*}
-\frac{2}{n}X_{-j}^\top(X_j - X_{-j}\hat\gamma_j),
\end{align*}
and the nonsmooth part contributes $2\lambda_j\partial\|\hat\gamma_j\|_1$. Hence
\begin{align*}
0 \in -\frac{2}{n}X_{-j}^\top(X_j - X_{-j}\hat{\gamma}_j) + 2\lambda_j\partial\|\hat{\gamma}_j\|_1.
\end{align*}
Using $\hat r_j := X_j - X_{-j}\hat\gamma_j$, this says that there is a subgradient vector $\hat\kappa_j \in \partial\|\hat\gamma_j\|_1 \subset \mathbb{R}^{p-1}$ such that
\begin{align*}
\frac{1}{n}X_{-j}^\top\hat r_j = \lambda_j\hat\kappa_j.
\end{align*}
The useful content of the $\ell^1$ subgradient is twofold: each coordinate satisfies $|(\hat\kappa_j)_k|\leq 1$, and the subgradient pairs with $\hat\gamma_j$ to recover the norm,
\begin{align*}
\hat\gamma_j^\top\hat\kappa_j = \|\hat\gamma_j\|_1.
\end{align*}
These two facts are exactly what will give the off-diagonal bound and the diagonal normalization.[/guided]
custom_env
admin
[step:Identify the diagonal coordinate with the nodewise scale]
Using $X_j=\hat r_j+X_{-j}\hat\gamma_j$, we first expand
\begin{align*}
\frac{1}{n}X_j^\top\hat r_j = \frac{1}{n}|\hat r_j|^2 + \hat\gamma_j^\top\left(\frac{1}{n}X_{-j}^\top\hat r_j\right).
\end{align*}
Substituting the KKT identity $X_{-j}^\top\hat r_j/n = \lambda_j\hat\kappa_j$ gives
\begin{align*}
\frac{1}{n}X_j^\top\hat r_j = \frac{1}{n}|\hat r_j|^2 + \lambda_j\hat\gamma_j^\top\hat\kappa_j.
\end{align*}
Using $\hat\gamma_j^\top\hat\kappa_j=\|\hat\gamma_j\|_1$ and the definition of $\hat\tau_j^2$, we obtain
\begin{align*}
\frac{1}{n}X_j^\top\hat r_j = \frac{1}{n}|\hat r_j|^2 + \lambda_j\|\hat\gamma_j\|_1 = \hat\tau_j^2.
\end{align*}
Thus the $j$th coordinate of $X^\top\hat r_j/n$ equals $\hat\tau_j^2$.
[/step]
custom_env
admin
[step:Bound the off-diagonal coordinates by the nodewise tuning parameter]
Let $k \in \{1,\dots,p\}$ with $k\neq j$. The coordinate $X_k$ appears as one of the columns of $X_{-j}$. Therefore the KKT identity gives
\begin{align*}
\left|\frac{1}{n}X_k^\top\hat r_j\right| \leq \lambda_j\|\hat\kappa_j\|_\infty \leq \lambda_j.
\end{align*}
Together with the previous step, this gives the coordinatewise relations
\begin{align*}
\frac{1}{n}X_j^\top\hat r_j = \hat\tau_j^2.
\end{align*}
For every $k \in \{1,\dots,p\}$ with $k\neq j$, it also gives
\begin{align*}
\left|\frac{1}{n}X_k^\top\hat r_j\right|\leq \lambda_j.
\end{align*}
[/step]
custom_env
admin
[step:Translate the residual identities into the approximate inverse bound]
By the definition of $\hat c_j$, the product $X\hat c_j$ satisfies
\begin{align*}
X\hat c_j = X_j - X_{-j}\hat\gamma_j = \hat r_j.
\end{align*}
Since $\hat\theta_j=\hat c_j/\hat\tau_j^2$ and $\hat\Sigma=X^\top X/n$, we have
\begin{align*}
\hat\Sigma\hat\theta_j
=
\frac{1}{\hat\tau_j^2}\hat\Sigma\hat c_j
=
\frac{1}{\hat\tau_j^2}\frac{1}{n}X^\top X\hat c_j
=
\frac{1}{\hat\tau_j^2}\frac{1}{n}X^\top\hat r_j.
\end{align*}
The $j$th coordinate of this vector is $1$, while every coordinate $k\neq j$ has absolute value at most $\lambda_j/\hat\tau_j^2$. Hence
\begin{align*}
\|\hat\Sigma\hat\theta_j-e_j\|_\infty
\leq
\frac{\lambda_j}{\hat\tau_j^2}.
\end{align*}
[/step]
custom_env
admin
[step:Use symmetry of the empirical Gram matrix to obtain the row bound]
Let $\hat\Theta \in \mathbb{R}^{p\times p}$ be the matrix whose $j$th row is $\hat\theta_j^\top$. Since $\hat\Sigma=X^\top X/n$, the matrix $\hat\Sigma$ is symmetric. Therefore
\begin{align*}
e_j^\top(\hat\Theta\hat\Sigma-I_p)
=
\hat\theta_j^\top\hat\Sigma-e_j^\top
=
(\hat\Sigma\hat\theta_j-e_j)^\top.
\end{align*}
Taking the $\ell^\infty$ norm and using the column bound already proved gives
\begin{align*}
\|e_j^\top(\hat\Theta\hat\Sigma-I_p)\|_\infty
=
\|(\hat\Sigma\hat\theta_j-e_j)^\top\|_\infty
\leq
\frac{\lambda_j}{\hat\tau_j^2}.
\end{align*}
This proves both asserted bounds.
[/step]