Pick $\delta \in (2p, 1/2)$ and set $R = 1 - H(\delta) > 0$. By the GSV bound (Proposition 6.6), there exists a code $C_n$ of length $n$ and minimum distance $\lfloor n\delta \rfloor$ with $|C_n| \geq 2^{n(1-H(\delta))}$. Passing to a subcode, assume $|C_n| = \lfloor 2^{nR} \rfloor$ while retaining minimum distance $\geq \lfloor n\delta \rfloor$.
Under minimum distance decoding, an error occurs only if the BSC makes at least $\lfloor(n\delta - 1)/2\rfloor$ errors. Choosing $\varepsilon > 0$ with $p + \varepsilon < \delta/2$, for $n$ large enough we have $(n\delta - 1)/2 > n(p+\varepsilon)$, so $\hat{e}(C_n) \leq \mathbb{P}(\text{BSC makes} \geq n(p+\varepsilon) \text{ errors}) \to 0$. The last limit follows because the number of errors is a sum of $n$ independent Bernoulli$(p)$ variables, and the law of large numbers forces the sample mean to concentrate near $p$.