[guided]The restricted problem is solved only on the coordinates where the true coefficient is nonzero. Define
\begin{align*}
\bar{\beta}_{n,S}
\in
\operatorname*{arg\,min}_{b\in \mathbb{R}^{|S|}}
\left\{
\frac{1}{2n}|Y_n-X_{n,S}b|^2
+
\lambda_n\sum_{j\in S}w_{n,j}|b_j|
\right\},
\qquad
\bar{\beta}_{n,S^c}:=0.
\end{align*}
The matrix $X_{n,S}\in\mathbb{R}^{n\times |S|}$ is obtained from $X_n$ by keeping only the columns indexed by $S$. Since $C$ is positive definite, the principal submatrix $C_{SS}$ is positive definite. Because $C_{n,SS}\to C_{SS}$, the matrices $C_{n,SS}$ are positive definite for all sufficiently large $n$, so the restricted objective is strictly convex for those $n$. Therefore the displayed argmin consists of a unique vector for all sufficiently large $n$, and this unique vector is denoted by $\bar{\beta}_{n,S}$.
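To make the strict convexity explicit, the quadratic part of the restricted objective can be expanded as follows. This is only a sketch, assuming the normalization $C_{n,SS}=n^{-1}X_{n,S}^\top X_{n,S}$ and nonnegative $\lambda_n$ and $w_{n,j}$, as is usual for adaptive weights:
\begin{align*}
\frac{1}{2n}|Y_n-X_{n,S}b|^2
=
\frac{1}{2}b^\top C_{n,SS}b
-
\frac{1}{n}b^\top X_{n,S}^\top Y_n
+
\frac{1}{2n}|Y_n|^2.
\end{align*}
The Hessian in $b$ is $C_{n,SS}$, so the quadratic part is strictly convex whenever $C_{n,SS}$ is positive definite. Adding the convex penalty $\lambda_n\sum_{j\in S}w_{n,j}|b_j|$ preserves strict convexity, and a strictly convex function has at most one minimizer.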
The important point is to avoid assuming in advance that the restricted minimizer has no zero active coordinates. The penalty is nondifferentiable at zero, so we use subgradients. By convex optimality, there exists a subgradient vector $\tau_{n,S}\in[-1,1]^{|S|}$ of the active absolute-value penalty at $\bar{\beta}_{n,S}$, meaning that for each $j\in S$, $\tau_{n,j}=\operatorname{sgn}(\bar{\beta}_{n,j})$ if $\bar{\beta}_{n,j}\ne0$ and $\tau_{n,j}\in[-1,1]$ if $\bar{\beta}_{n,j}=0$, such that the subgradient optimality condition for the restricted convex problem holds:
\begin{align*}
0
=
-\frac{1}{n}X_{n,S}^\top(Y_n-X_{n,S}\bar{\beta}_{n,S})
+
\lambda_n W_{n,S}\tau_{n,S},
\end{align*}
where $W_{n,S}\in\mathbb{R}^{|S|\times |S|}$ is diagonal with diagonal entries $w_{n,j}$ for $j\in S$.
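As an illustration of why zero coordinates cannot be excluded beforehand, consider the hypothetical one-dimensional case $|S|=1$. The scalars $c_n:=n^{-1}X_{n,S}^\top X_{n,S}$ and $z_n:=n^{-1}X_{n,S}^\top Y_n$ are shorthand introduced only for this example, and we assume $c_n>0$ and $\lambda_n w_n>0$. The optimality condition then reads $0=c_n\bar{\beta}_n-z_n+\lambda_n w_n\tau_n$, and solving it case by case gives the soft-thresholding form
\begin{align*}
\bar{\beta}_n
=
\frac{\operatorname{sgn}(z_n)\max\{|z_n|-\lambda_n w_n,\,0\}}{c_n},
\qquad
\tau_n
=
\begin{cases}
z_n/(\lambda_n w_n), & |z_n|\le\lambda_n w_n,\\
\operatorname{sgn}(z_n), & |z_n|>\lambda_n w_n.
\end{cases}
\end{align*}
So for a fixed $n$ the restricted minimizer is exactly zero whenever $|z_n|\le\lambda_n w_n$, and in that case $\tau_n$ is determined by the data rather than by the sign of $\bar{\beta}_n$.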
Substitute the model identity $Y_n=X_{n,S}\beta^*_S+\varepsilon_n$. The optimality condition becomes
\begin{align*}
0
=
-\frac{1}{n}X_{n,S}^\top\{X_{n,S}\beta^*_S+\varepsilon_n-X_{n,S}\bar{\beta}_{n,S}\}
+
\lambda_n W_{n,S}\tau_{n,S}.
\end{align*}
Expanding the braces and collecting the terms containing $\bar{\beta}_{n,S}-\beta^*_S$ gives
\begin{align*}
0
=
C_{n,SS}(\bar{\beta}_{n,S}-\beta^*_S)
-
\frac{1}{n}X_{n,S}^\top\varepsilon_n
+
\lambda_n W_{n,S}\tau_{n,S}.
\end{align*}
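For reference, the intermediate algebra can be written out as follows, assuming the normalization $C_{n,SS}=n^{-1}X_{n,S}^\top X_{n,S}$:
\begin{align*}
-\frac{1}{n}X_{n,S}^\top\{X_{n,S}\beta^*_S+\varepsilon_n-X_{n,S}\bar{\beta}_{n,S}\}
&=
\frac{1}{n}X_{n,S}^\top X_{n,S}(\bar{\beta}_{n,S}-\beta^*_S)
-
\frac{1}{n}X_{n,S}^\top\varepsilon_n
\\
&=
C_{n,SS}(\bar{\beta}_{n,S}-\beta^*_S)
-
\frac{1}{n}X_{n,S}^\top\varepsilon_n.
\end{align*}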
Thus
\begin{align*}
C_{n,SS}(\bar{\beta}_{n,S}-\beta^*_S)
=
\frac{1}{n}X_{n,S}^\top\varepsilon_n
-
\lambda_n W_{n,S}\tau_{n,S}.
\end{align*}
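For all sufficiently large $n$ the matrix $C_{n,SS}$ is invertible. Multiplying by $\sqrt{n}$, left-multiplying by $C_{n,SS}^{-1}$, and using $Z_{n,S}=X_{n,S}^\top\varepsilon_n/\sqrt{n}$ gives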
\begin{align*}
\sqrt{n}(\bar{\beta}_{n,S}-\beta^*_S)
=
C_{n,SS}^{-1}Z_{n,S}
-
\sqrt{n}\lambda_n C_{n,SS}^{-1}W_{n,S}\tau_{n,S}.
\end{align*}
The first term is the ordinary least-squares fluctuation on the true active variables. The second term is the adaptive penalty bias. On active coordinates, $W_{n,S}=O_{\mathbb{P}}(1)$ by the weight convergence from the first step, and $|\tau_{n,S}|\le\sqrt{|S|}$ because each component lies in $[-1,1]$. Since $\sqrt{n}\lambda_n\to0$ and $C_{n,SS}^{-1}\to C_{SS}^{-1}$, the penalty bias is $o_{\mathbb{P}}(1)$. Moreover, $Z_{n,S}=O_{\mathbb{P}}(1)$ because it converges in distribution, so $(C_{n,SS}^{-1}-C_{SS}^{-1})Z_{n,S}=o_{\mathbb{P}}(1)$ and $C_{n,SS}^{-1}$ may be replaced by $C_{SS}^{-1}$ in the first term. Therefore
\begin{align*}
\sqrt{n}(\bar{\beta}_{n,S}-\beta^*_S)
=
C_{SS}^{-1}Z_{n,S}+o_{\mathbb{P}}(1).
\end{align*}
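The two remainder terms absorbed into $o_{\mathbb{P}}(1)$ can be bounded explicitly. In this sketch, $\|\cdot\|_{\mathrm{op}}$ denotes the operator norm, a notation introduced here only for illustration:
\begin{align*}
\left|\sqrt{n}\lambda_n C_{n,SS}^{-1}W_{n,S}\tau_{n,S}\right|
&\le
\sqrt{n}\lambda_n\,
\|C_{n,SS}^{-1}\|_{\mathrm{op}}\,
\max_{j\in S}|w_{n,j}|\,
\sqrt{|S|},
\\
\left|(C_{n,SS}^{-1}-C_{SS}^{-1})Z_{n,S}\right|
&\le
\|C_{n,SS}^{-1}-C_{SS}^{-1}\|_{\mathrm{op}}\,
|Z_{n,S}|.
\end{align*}
In the first bound, $\sqrt{n}\lambda_n\to0$ multiplies factors that are bounded or $O_{\mathbb{P}}(1)$. In the second, a factor tending to zero multiplies $|Z_{n,S}|$, which is $O_{\mathbb{P}}(1)$ because $Z_{n,S}$ converges in distribution.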
Since $Z_{n,S}\xrightarrow{d}\mathcal{N}(0,\sigma^2 C_{SS})$, [Slutsky's theorem](/page/Slutsky%27s%20Theorem) yields
\begin{align*}
\sqrt{n}(\bar{\beta}_{n,S}-\beta^*_S)
\xrightarrow{d}
\mathcal{N}(0,\sigma^2 C_{SS}^{-1}).
\end{align*}
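The limiting covariance follows from the linear map $z\mapsto C_{SS}^{-1}z$ applied to the Gaussian limit of $Z_{n,S}$, using the symmetry of $C_{SS}$:
\begin{align*}
C_{SS}^{-1}\left(\sigma^2 C_{SS}\right)\left(C_{SS}^{-1}\right)^\top
=
\sigma^2 C_{SS}^{-1}C_{SS}C_{SS}^{-1}
=
\sigma^2 C_{SS}^{-1}.
\end{align*}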
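In particular $\sqrt{n}(\bar{\beta}_{n,S}-\beta^*_S)=O_{\mathbb{P}}(1)$, so $\bar{\beta}_{n,S}\xrightarrow{\mathbb{P}}\beta^*_S$. Every coordinate of $\beta^*_S$ is nonzero, and $\bar{\beta}_{n,j}=0$ forces $|\bar{\beta}_{n,j}-\beta^*_j|=|\beta^*_j|>0$. Since $S$ is finite, a union bound over $j\in S$ together with convergence in probability implies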
\begin{align*}
\mathbb{P}(\bar{\beta}_{n,j}\ne 0\text{ for every }j\in S)\to 1.
\end{align*}[/guided]