[proofplan]
We transfer an arbitrary finite $L^p(Q)$ cover of $\mathcal F$ through the Lipschitz map $\phi$. The pointwise Lipschitz inequality shows that $\phi$ increases distances by at most the factor $L$, and integration with respect to $Q$ converts this pointwise estimate into an $L^p(Q)$ estimate. Taking the infimum over the sizes of all finite covers gives the covering-number inequality. The envelope statement follows separately from $\phi(0)=0$, which implies $|\phi(t)|\le L|t|$ for every real number $t$.
[/proofplan]
[step:Transfer an arbitrary finite $L^p(Q)$ cover through $\phi$]
Fix a probability measure $Q$ on $(S,\mathcal S)$, fix $p\ge 1$, and fix $\varepsilon>0$. Define the extended $L^p(Q)$ pseudometric $d_{p,Q}$ on measurable real-valued functions by
\begin{align*}
d_{p,Q}(u,v):=\left(\int_S |u(x)-v(x)|^p\,dQ(x)\right)^{1/p}.
\end{align*}
If $N(\varepsilon,\mathcal F,L^p(Q))=\infty$, the desired inequality is immediate. Suppose therefore that $\mathcal F$ admits a finite $\varepsilon$-cover in $L^p(Q)$. Let $m\in\mathbb N$ and let $g_1,\dots,g_m:S\to\mathbb R$ be [measurable functions](/page/Measurable%20Functions) such that for every $f\in\mathcal F$ there exists $j\in\{1,\dots,m\}$ with
\begin{align*}
d_{p,Q}(f,g_j)\le\varepsilon.
\end{align*}
Since $\phi$ is Lipschitz, it is continuous and hence Borel measurable. Therefore each composition $\phi\circ g_j:S\to\mathbb R$ is measurable.
Let $f\in\mathcal F$, and choose $j\in\{1,\dots,m\}$ such that $d_{p,Q}(f,g_j)\le\varepsilon$. For every $x\in S$, the Lipschitz condition gives
\begin{align*}
|(\phi\circ f)(x)-(\phi\circ g_j)(x)|\le L|f(x)-g_j(x)|.
\end{align*}
Raising both sides to the power $p$, integrating with respect to $Q$, and taking the $p$-th root gives
\begin{align*}
d_{p,Q}(\phi\circ f,\phi\circ g_j)\le Ld_{p,Q}(f,g_j)\le L\varepsilon.
\end{align*}
Thus $\phi\circ g_1,\dots,\phi\circ g_m$ form an $L\varepsilon$-cover of $\phi\circ\mathcal F$ in $L^p(Q)$.
[guided]
Fix a probability measure $Q$ on $(S,\mathcal S)$, a number $p\ge 1$, and a scale $\varepsilon>0$. The distance used by the covering number is
\begin{align*}
d_{p,Q}(u,v):=\left(\int_S |u(x)-v(x)|^p\,dQ(x)\right)^{1/p}
\end{align*}
for measurable functions $u,v:S\to\mathbb R$.
If $N(\varepsilon,\mathcal F,L^p(Q))=\infty$, the inequality holds trivially because its right-hand side is infinite. So assume that $\mathcal F$ has a finite $\varepsilon$-cover. This means that there are measurable functions $g_1,\dots,g_m:S\to\mathbb R$ such that every $f\in\mathcal F$ lies within $L^p(Q)$ distance $\varepsilon$ of one of them:
\begin{align*}
d_{p,Q}(f,g_j)\le\varepsilon
\end{align*}
for at least one index $j\in\{1,\dots,m\}$.
The natural candidate cover for $\phi\circ\mathcal F$ is obtained by applying $\phi$ to each centre. This is legitimate because a [Lipschitz function](/page/Lipschitz%20Function) $\phi:\mathbb R\to\mathbb R$ is continuous, hence Borel measurable, and therefore $\phi\circ g_j:S\to\mathbb R$ is measurable for each $j$.
Now fix $f\in\mathcal F$ and choose $j$ so that $d_{p,Q}(f,g_j)\le\varepsilon$. The Lipschitz property of $\phi$ gives the pointwise estimate
\begin{align*}
|(\phi\circ f)(x)-(\phi\circ g_j)(x)|\le L|f(x)-g_j(x)|
\end{align*}
for every $x\in S$. Both sides are nonnegative and $t\mapsto t^p$ is nondecreasing on $[0,\infty)$, so raising them to the power $p$ preserves the inequality:
\begin{align*}
|(\phi\circ f)(x)-(\phi\circ g_j)(x)|^p\le L^p|f(x)-g_j(x)|^p.
\end{align*}
Integrating with respect to the probability measure $Q$ gives
\begin{align*}
\int_S |(\phi\circ f)(x)-(\phi\circ g_j)(x)|^p\,dQ(x)\le L^p\int_S |f(x)-g_j(x)|^p\,dQ(x).
\end{align*}
Taking the $p$-th root yields
\begin{align*}
d_{p,Q}(\phi\circ f,\phi\circ g_j)\le Ld_{p,Q}(f,g_j)\le L\varepsilon.
\end{align*}
Thus every member of $\phi\circ\mathcal F$ is within $L\varepsilon$ of one of the finitely many functions $\phi\circ g_1,\dots,\phi\circ g_m$. Hence these functions form an $L\varepsilon$-cover of $\phi\circ\mathcal F$.
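As an illustration (an example chosen here, not taken from the statement), consider the positive-part map $\phi(t)=\max\{t,0\}$, which is Lipschitz with constant $L=1$ because $|\max\{a,0\}-\max\{b,0\}|\le|a-b|$ for all real $a,b$. Writing $f^+:=\phi\circ f$ and $g_j^+:=\phi\circ g_j$, the chain above specialises to
\begin{align*}
d_{p,Q}(f^+,g_j^+)=\left(\int_S |f^+(x)-g_j^+(x)|^p\,dQ(x)\right)^{1/p}\le\left(\int_S |f(x)-g_j(x)|^p\,dQ(x)\right)^{1/p}\le\varepsilon,
\end{align*}
so the positive parts $g_1^+,\dots,g_m^+$ of the original centres cover $\{f^+:f\in\mathcal F\}$ at the same scale $\varepsilon$.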
[/guided]
[/step]
[step:Take the infimum over all covers]
The preceding step shows that every finite $\varepsilon$-cover of $\mathcal F$ with $m$ centres produces an $L\varepsilon$-cover of $\phi\circ\mathcal F$ with at most $m$ distinct centres. Therefore
\begin{align*}
N(L\varepsilon,\phi\circ\mathcal F,L^p(Q))\le m
\end{align*}
for every admissible $m$. Taking the infimum over all such $m$ gives
\begin{align*}
N(L\varepsilon,\phi\circ\mathcal F,L^p(Q))\le N(\varepsilon,\mathcal F,L^p(Q)).
\end{align*}
Since $Q$, $p$, and $\varepsilon$ were arbitrary, the covering-number inequality holds for every probability measure $Q$, every $p\ge 1$, and every $\varepsilon>0$.
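It is sometimes convenient to read the inequality at a prescribed scale for the transformed class. As a sketch, under the additional assumption $L>0$ (which the argument above does not require): for $\delta>0$, substituting $\varepsilon=\delta/L$ gives
\begin{align*}
N(\delta,\phi\circ\mathcal F,L^p(Q))\le N(\delta/L,\mathcal F,L^p(Q)).
\end{align*}
Because the bound holds for each $Q$ separately, it is also preserved when the supremum is taken over probability measures $Q$ on $(S,\mathcal S)$:
\begin{align*}
\sup_Q N(L\varepsilon,\phi\circ\mathcal F,L^p(Q))\le\sup_Q N(\varepsilon,\mathcal F,L^p(Q)).
\end{align*}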
[/step]
[step:Use $\phi(0)=0$ to transfer the envelope bound]
Assume now that $\phi(0)=0$ and that $F:S\to[0,\infty]$ is an envelope for $\mathcal F$. For every $t\in\mathbb R$, the Lipschitz condition applied to $t$ and $0$ gives
\begin{align*}
|\phi(t)|=|\phi(t)-\phi(0)|\le L|t|.
\end{align*}
Let $h\in\phi\circ\mathcal F$. Then there exists $f\in\mathcal F$ such that $h=\phi\circ f$. For every $x\in S$,
\begin{align*}
|h(x)|=|\phi(f(x))|\le L|f(x)|\le LF(x).
\end{align*}
Thus $LF:S\to[0,\infty]$, read with the convention $0\cdot\infty=0$ in the case $L=0$, bounds the absolute value of every member of $\phi\circ\mathcal F$, and hence $LF$ is an envelope for $\phi\circ\mathcal F$.
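The hypothesis $\phi(0)=0$ cannot simply be dropped. As a hypothetical example for illustration, take $\phi(t)=t+1$, which is Lipschitz with $L=1$ but has $\phi(0)=1$, and let $\mathcal F=\{0\}$ consist of the zero function alone, with envelope $F\equiv 0$. Then
\begin{align*}
|(\phi\circ 0)(x)|=1>0=LF(x)
\end{align*}
for every $x\in S$, so $LF$ is not an envelope for $\phi\circ\mathcal F$. Without the normalisation, the same Lipschitz argument only yields
\begin{align*}
|\phi(t)|\le|\phi(t)-\phi(0)|+|\phi(0)|\le L|t|+|\phi(0)|,
\end{align*}
which suggests $|\phi(0)|+LF$ as an envelope for $\phi\circ\mathcal F$ in that case.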
[/step]