[guided]The non-ergodic case is not obtained by pretending that the exponents are constant. Instead, we decompose $\mu$ into ergodic invariant measures and apply the ergodic inequality on each component.
Define a nonnegative measurable function $\Phi:M\to[0,\infty]$ by setting, for $x$ in the Oseledets full-measure set $M_0$,
\begin{align*}
\Phi(x):=\sum_{1\leq i\leq r(x),\, \lambda_i(x)>0}m_i(x)\lambda_i(x).
\end{align*}
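On $M\setminus M_0$ the value of $\Phi$ may be chosen arbitrarily, for instance $\Phi:=0$. Since $\mu(M\setminus M_0)=0$, changing $\Phi$ on $M\setminus M_0$ does not affect integration with respect to $\mu$ or with respect to almost every ergodic component in the ergodic decomposition of $\mu$.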
Let $\tau$ be the ergodic decomposition measure on the set $\mathcal{E}_f(M)$ of ergodic $f$-invariant Borel probability measures. The ergodic decomposition theorem gives two facts. First, for every nonnegative Borel function $G:M\to[0,\infty]$,
\begin{align*}
\int_M G(x)\,d\mu(x)=\int_{\mathcal{E}_f(M)}\int_M G(x)\,d\nu(x)\,d\tau(\nu).
\end{align*}
Second, metric entropy decomposes affinely:
\begin{align*}
h_\mu(f)=\int_{\mathcal{E}_f(M)}h_\nu(f)\,d\tau(\nu).
\end{align*}
We apply these statements with $G=\Phi$.
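The first fact also accounts for the earlier remark about $M\setminus M_0$. As a short check, assuming as usual that $M_0$ is chosen to be a Borel set, take $G=\mathbf{1}_{M\setminus M_0}$ (the indicator function, a notation introduced here only for this verification) and use $\mu(M\setminus M_0)=0$:
\begin{align*}
0=\mu(M\setminus M_0)=\int_{\mathcal{E}_f(M)}\nu(M\setminus M_0)\,d\tau(\nu).
\end{align*}
Since the integrand is nonnegative, $\nu(M\setminus M_0)=0$, that is, $\nu(M_0)=1$, for $\tau$-almost every $\nu$.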
Every $\nu\in\mathcal{E}_f(M)$ is an ergodic $f$-invariant Borel probability measure on the same compact Riemannian manifold $M$, and $f$ is the same $C^1$ diffeomorphism. Thus the Ergodic Margulis-Ruelle Inequality applies to each such $\nu$. If $\alpha_1(\nu)>\cdots>\alpha_{s(\nu)}(\nu)$ are the $\nu$-almost everywhere constant Lyapunov exponents and $q_j(\nu)$ are their multiplicities, it gives
\begin{align*}
h_\nu(f)\leq \sum_{1\leq j\leq s(\nu),\, \alpha_j(\nu)>0}q_j(\nu)\alpha_j(\nu).
\end{align*}
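The Lyapunov exponents and their multiplicities are determined pointwise by the derivative cocycle $Df$, which does not depend on the choice of invariant measure. Hence, for $\tau$-almost every $\nu$, the set $M_0$ has full $\nu$-measure and the Oseledets spectrum recorded by $\Phi$ agrees with this constant spectrum on a $\nu$-full set. Therefore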
\begin{align*}
\int_M \Phi(x)\,d\nu(x)=\sum_{1\leq j\leq s(\nu),\, \alpha_j(\nu)>0}q_j(\nu)\alpha_j(\nu).
\end{align*}
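Combining the previous [entropy inequality](/theorems/6729) with this identity gives $h_\nu(f)\leq\int_M\Phi\,d\nu$ for $\tau$-almost every $\nu$. Integrating with respect to $\tau$ and using the affine decomposition of entropy gives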
\begin{align*}
h_\mu(f)=\int_{\mathcal{E}_f(M)}h_\nu(f)\,d\tau(\nu)\leq \int_{\mathcal{E}_f(M)}\int_M \Phi(x)\,d\nu(x)\,d\tau(\nu).
\end{align*}
Finally, the defining integration formula for the ergodic decomposition, applied to the nonnegative function $\Phi$, gives
\begin{align*}
\int_{\mathcal{E}_f(M)}\int_M \Phi(x)\,d\nu(x)\,d\tau(\nu)=\int_M \Phi(x)\,d\mu(x).
\end{align*}
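As an illustration only, consider the hypothetical special case (not assumed in the proof) in which $\mu=t\nu_1+(1-t)\nu_2$ for two distinct ergodic measures $\nu_1,\nu_2$ and some $t\in(0,1)$. Then $\tau=t\delta_{\nu_1}+(1-t)\delta_{\nu_2}$, and the argument reduces to a finite computation:
\begin{align*}
h_\mu(f)&=t\,h_{\nu_1}(f)+(1-t)\,h_{\nu_2}(f)\\
&\leq t\int_M\Phi(x)\,d\nu_1(x)+(1-t)\int_M\Phi(x)\,d\nu_2(x)\\
&=\int_M\Phi(x)\,d\mu(x).
\end{align*}
Here the exponents are constant on each component but need not agree between the two components, which is why $\Phi$ is integrated against $\mu$ rather than evaluated at a single point.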
Substituting the definition of $\Phi$ proves
\begin{align*}
h_\mu(f)\leq \int_M \sum_{1\leq i\leq r(x),\, \lambda_i(x)>0}m_i(x)\lambda_i(x)\,d\mu(x).
\end{align*}[/guided]