[guided]We want to estimate the Wasserstein distance between the two intermediate measures $\mu_s$ and $\mu_t$. The definition of $W_p$ asks us to minimize over all couplings of $\mu_s$ and $\mu_t$, so it is enough to exhibit one admissible coupling and compute its cost.
The natural coupling is obtained by taking a random geodesic $\gamma$ distributed according to $\Pi$ and looking at its two positions $\gamma(s)$ and $\gamma(t)$. Formally, define
\begin{align*}
\pi_{s,t}:=(e_s,e_t)_\#\Pi.
\end{align*}
This is a probability measure on $X\times X$. Its first marginal is $(e_s)_\#\Pi=\mu_s$, and its second marginal is $(e_t)_\#\Pi=\mu_t$, so $\pi_{s,t}\in\Pi(\mu_s,\mu_t)$.
Because $W_p(\mu_s,\mu_t)^p$ is the infimum of $\int_{X\times X}d(x,y)^p\,d\rho(x,y)$ over all $\rho\in\Pi(\mu_s,\mu_t)$, the particular admissible plan $\pi_{s,t}$ gives the upper bound
\begin{align*}
W_p(\mu_s,\mu_t)^p\le \int_{X\times X} d(x,y)^p\,d\pi_{s,t}(x,y).
\end{align*}
Now we rewrite this integral using the definition of pushforward. Since $\pi_{s,t}=(e_s,e_t)_\#\Pi$, we have
\begin{align*}
\int_{X\times X} d(x,y)^p\,d\pi_{s,t}(x,y)=\int_{\operatorname{Geo}(X)} d(\gamma(s),\gamma(t))^p\,d\Pi(\gamma).
\end{align*}
The key point is that $\Pi$ is supported on constant-speed geodesics. Therefore, for every $\gamma\in\operatorname{Geo}(X)$ and every $0\le s\le t\le1$,
\begin{align*}
d(\gamma(s),\gamma(t))=(t-s)d(\gamma(0),\gamma(1)).
\end{align*}
Substituting this identity into the previous estimate gives
\begin{align*}
W_p(\mu_s,\mu_t)^p\le (t-s)^p\int_{\operatorname{Geo}(X)} d(\gamma(0),\gamma(1))^p\,d\Pi(\gamma).
\end{align*}
Finally, the assumption $(e_0,e_1)_\#\Pi=\pi$ says that the distribution of the pair of endpoints $(\gamma(0),\gamma(1))$ is exactly $\pi$. Hence
\begin{align*}
\int_{\operatorname{Geo}(X)} d(\gamma(0),\gamma(1))^p\,d\Pi(\gamma)=\int_{X\times X} d(x,y)^p\,d\pi(x,y).
\end{align*}
The coupling $\pi$ is optimal for $W_p(\mu_0,\mu_1)$, so
\begin{align*}
\int_{X\times X} d(x,y)^p\,d\pi(x,y)=W_p(\mu_0,\mu_1)^p.
\end{align*}
Combining these identities yields
\begin{align*}
W_p(\mu_s,\mu_t)^p\le (t-s)^pW_p(\mu_0,\mu_1)^p.
\end{align*}
Taking $p$th roots is valid because both sides are non-negative, and we obtain
\begin{align*}
W_p(\mu_s,\mu_t)\le (t-s)W_p(\mu_0,\mu_1).
\end{align*}[/guided]