[guided]We start with arbitrary points $x_1,x_2\in X$ because the Lipschitz condition must hold for every pair of points in the domain. The composition $g\circ f$ sends $x\in X$ first to $f(x)\in Y$ and then to $g(f(x))\in Z$, so the distance we must estimate is
\begin{align*}
d_Z((g\circ f)(x_1),(g\circ f)(x_2)).
\end{align*}
By the definition of composition, this is
\begin{align*}
d_Z(g(f(x_1)),g(f(x_2))).
\end{align*}
The map $g:Y\to Z$ is $M$-Lipschitz, meaning that for every $y_1,y_2\in Y$,
\begin{align*}
d_Z(g(y_1),g(y_2))\le M\,d_Y(y_1,y_2).
\end{align*}
We may apply this with $y_1=f(x_1)$ and $y_2=f(x_2)$ because $f$ takes values in $Y$. Therefore,
\begin{align*}
d_Z(g(f(x_1)),g(f(x_2)))\le M\,d_Y(f(x_1),f(x_2)).
\end{align*}
Now we estimate the remaining distance in $Y$. The map $f:X\to Y$ is $L$-Lipschitz, meaning that for every $u_1,u_2\in X$,
\begin{align*}
d_Y(f(u_1),f(u_2))\le L\,d_X(u_1,u_2).
\end{align*}
Applying this with $u_1=x_1$ and $u_2=x_2$ gives
\begin{align*}
d_Y(f(x_1),f(x_2))\le L\,d_X(x_1,x_2).
\end{align*}
Since $M\ge 0$, multiplying this inequality by $M$ preserves the direction of the inequality:
\begin{align*}
M\,d_Y(f(x_1),f(x_2))\le ML\,d_X(x_1,x_2).
\end{align*}
Substituting this bound into the earlier estimate gives
\begin{align*}
d_Z(g(f(x_1)),g(f(x_2)))\le ML\,d_X(x_1,x_2).
\end{align*}
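The two estimates can also be read as a single chain. It is recorded here only as a compact summary of the steps above and uses no hypotheses beyond those already stated:
\begin{align*}
d_Z(g(f(x_1)),g(f(x_2)))
&\le M\,d_Y(f(x_1),f(x_2)) && \text{($g$ is $M$-Lipschitz)}\\
&\le ML\,d_X(x_1,x_2) && \text{($f$ is $L$-Lipschitz and $M\ge 0$)}.
\end{align*}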
This is precisely the desired Lipschitz estimate for $g\circ f$ at the pair $x_1,x_2$. Since $x_1$ and $x_2$ were arbitrary, $g\circ f$ is $ML$-Lipschitz.[/guided]