[proofplan]
We prove the score bound directly from the defining minimality of the Lasso solution. The argument compares the Lasso objective at the minimizer with coordinatewise positive and negative perturbations. The positive perturbations give an upper bound on each score coordinate, the negative perturbations give the corresponding lower bound, and the coordinatewise bounds combine to give the stated $\ell_\infty$ feasibility estimate.
[/proofplan]
[step:Write the one-coordinate optimality inequality]
For each $j \in \{1,\dots,p\}$, let $e_j \in \mathbb{R}^p$ denote the $j$-th standard basis vector. Define the residual vector $r \in \mathbb{R}^n$ by
\begin{align*}
r := Y-X\hat\beta_{\mathrm{L}}.
\end{align*}
Define the score vector $s \in \mathbb{R}^p$ by
\begin{align*}
s := \frac{X^\top r}{n}.
\end{align*}
Since $\hat\beta_{\mathrm{L}}$ minimizes the Lasso objective over $\mathbb{R}^p$, for every $t \in \mathbb{R}$ we have
\begin{align*}
0
&\leq
\frac{1}{2n}|Y-X(\hat\beta_{\mathrm{L}}+t e_j)|^2+\lambda_L\|\hat\beta_{\mathrm{L}}+t e_j\|_1
-\frac{1}{2n}|Y-X\hat\beta_{\mathrm{L}}|^2-\lambda_L\|\hat\beta_{\mathrm{L}}\|_1.
\end{align*}
Because $Y-X(\hat\beta_{\mathrm{L}}+t e_j)=r-tXe_j$, expanding the squared Euclidean norm gives
\begin{align*}
|r-tXe_j|^2-|r|^2
&=
-2t(Xe_j)\cdot r+t^2|Xe_j|^2.
\end{align*}
Also $(Xe_j)\cdot r=e_j^\top X^\top r=n s_j$. Therefore
\begin{align*}
0
\leq
-t s_j+\frac{t^2}{2n}|Xe_j|^2
+\lambda_L\left(\|\hat\beta_{\mathrm{L}}+t e_j\|_1-\|\hat\beta_{\mathrm{L}}\|_1\right).
\end{align*}
[/step]
[step:Use positive perturbations to prove $s_j \leq \lambda_L$]
Let $t>0$. The ordinary triangle inequality for the $\ell_1$ norm, rearranged, gives
\begin{align*}
\|\hat\beta_{\mathrm{L}}+t e_j\|_1-\|\hat\beta_{\mathrm{L}}\|_1
\leq
\|t e_j\|_1
=
t.
\end{align*}
Using this in the one-coordinate optimality inequality yields
\begin{align*}
0
&\leq
-t s_j+\frac{t^2}{2n}|Xe_j|^2+\lambda_L t.
\end{align*}
Dividing by $t>0$ gives
\begin{align*}
s_j
\leq
\frac{t}{2n}|Xe_j|^2+\lambda_L.
\end{align*}
Letting $t \to 0^+$ gives $s_j \leq \lambda_L$.
[guided]
Fix a coordinate $j \in \{1,\dots,p\}$. Let $e_j \in \mathbb{R}^p$ denote the $j$-th standard basis vector. Define the residual vector $r \in \mathbb{R}^n$ by
\begin{align*}
r := Y-X\hat\beta_{\mathrm{L}}.
\end{align*}
Define the score vector $s \in \mathbb{R}^p$ by
\begin{align*}
s := \frac{X^\top r}{n}.
\end{align*}
We perturb the minimizer only in the positive $e_j$ direction. Let $t>0$. Since $\hat\beta_{\mathrm{L}}$ is a global minimizer of the Lasso objective over $\mathbb{R}^p$, the objective value at $\hat\beta_{\mathrm{L}}+t e_j$ is at least the objective value at $\hat\beta_{\mathrm{L}}$. Therefore
\begin{align*}
0
\leq
\frac{1}{2n}|Y-X(\hat\beta_{\mathrm{L}}+t e_j)|^2+\lambda_L\|\hat\beta_{\mathrm{L}}+t e_j\|_1
-\frac{1}{2n}|Y-X\hat\beta_{\mathrm{L}}|^2-\lambda_L\|\hat\beta_{\mathrm{L}}\|_1.
\end{align*}
Because $Y-X(\hat\beta_{\mathrm{L}}+t e_j)=r-tXe_j$, expanding the squared Euclidean norm gives
\begin{align*}
|r-tXe_j|^2-|r|^2
=
-2t(Xe_j)\cdot r+t^2|Xe_j|^2.
\end{align*}
Also $(Xe_j)\cdot r=e_j^\top X^\top r=n s_j$, by the definition of $s$ and the definition of the $j$-th coordinate. Substituting these identities into the objective comparison gives
\begin{align*}
0
\leq
-t s_j+\frac{t^2}{2n}|Xe_j|^2
+\lambda_L\left(\|\hat\beta_{\mathrm{L}}+t e_j\|_1-\|\hat\beta_{\mathrm{L}}\|_1\right).
\end{align*}
The remaining term is the change in the $\ell_1$ norm. We control it using the triangle inequality for the $\ell_1$ norm in the form
\begin{align*}
\|a+b\|_1\leq \|a\|_1+\|b\|_1,
\end{align*}
which implies
\begin{align*}
\|a+b\|_1-\|a\|_1 \leq \|b\|_1.
\end{align*}
Apply this with $a=\hat\beta_{\mathrm{L}}$ and $b=t e_j$. Since $t>0$ and $e_j$ has exactly one nonzero coordinate equal to $1$,
\begin{align*}
\|t e_j\|_1=t.
\end{align*}
Thus
\begin{align*}
\|\hat\beta_{\mathrm{L}}+t e_j\|_1-\|\hat\beta_{\mathrm{L}}\|_1
\leq t.
\end{align*}
Substituting this estimate into the objective comparison gives
\begin{align*}
0
&\leq
-t s_j+\frac{t^2}{2n}|Xe_j|^2+\lambda_L t.
\end{align*}
After dividing by $t>0$, we obtain
\begin{align*}
s_j
\leq
\frac{t}{2n}|Xe_j|^2+\lambda_L.
\end{align*}
The term $\frac{t}{2n}|Xe_j|^2$ tends to $0$ as $t \to 0^+$ because $|Xe_j|^2/(2n)$ is a fixed finite scalar. Hence
\begin{align*}
s_j \leq \lambda_L.
\end{align*}
This is the upper coordinate bound for the score.
[/guided]
[/step]
[step:Use negative perturbations to prove $-s_j \leq \lambda_L$]
Let $t<0$. Again the ordinary triangle inequality for the $\ell_1$ norm, rearranged, gives
\begin{align*}
\|\hat\beta_{\mathrm{L}}+t e_j\|_1-\|\hat\beta_{\mathrm{L}}\|_1
\leq
\|t e_j\|_1
=
|t|.
\end{align*}
Since $t<0$, $|t|=-t$. The one-coordinate optimality inequality becomes
\begin{align*}
0
&\leq
-t s_j+\frac{t^2}{2n}|Xe_j|^2-\lambda_L t.
\end{align*}
Dividing by $-t>0$ gives
\begin{align*}
s_j
\geq
\frac{t}{2n}|Xe_j|^2-\lambda_L.
\end{align*}
Letting $t \to 0^-$ gives $s_j \geq -\lambda_L$, equivalently $-s_j \leq \lambda_L$.
[/step]
[step:Take the maximum over coordinates]
The previous two steps show that for every $j \in \{1,\dots,p\}$,
\begin{align*}
-\lambda_L \leq s_j \leq \lambda_L.
\end{align*}
Hence $|s_j|\leq \lambda_L$ for every coordinate $j$. By the definition of the $\ell_\infty$ norm on $\mathbb{R}^p$,
\begin{align*}
\|s\|_\infty
=
\max_{1\leq j\leq p}|s_j|
\leq
\lambda_L.
\end{align*}
Substituting $s=X^\top(Y-X\hat\beta_{\mathrm{L}})/n$ gives
\begin{align*}
\left\|\frac{X^\top(Y-X\hat\beta_{\mathrm{L}})}{n}\right\|_\infty \leq \lambda_L,
\end{align*}
which is the desired score feasibility bound.
[/step]