[proofplan]
The proof uses Kru\u017ekov's doubling of variables technique.
We write the entropy inequality for $u$ with $k = v(y,s)$ and for $v$ with $k = u(x,t)$, test against a product kernel $\psi((x+y)/2, (t+s)/2)\,\rho_\varepsilon(x-y)\,\rho_\varepsilon(t-s)$ concentrating on the diagonal, add the two inequalities, and pass to the limit $\varepsilon \to 0$ to obtain a distributional inequality for $|u - v|$.
A suitable choice of test function then yields the $L^1$ contraction.
[/proofplan]
[step:Double the entropy inequalities with $k = v(y,s)$ and $k = u(x,t)$]
Since $u$ is an entropy solution, the Kru\u017ekov inequality holds for every constant $k$.
In particular, for $k = v(y,s)$ (viewed as a parameter):
\begin{align*}
\int_0^T \!\!\int_{\mathbb{R}} \bigl(|u(x,t) - v(y,s)|\,\partial_t\varphi + \operatorname{sgn}(u - v(y,s))(f(u) - f(v(y,s)))\,\partial_x\varphi\bigr) \, d\mathcal{L}^1(x) \, d\mathcal{L}^1(t) \geq 0
\end{align*}
for all non-negative $\varphi$.
Similarly, $v$ satisfies the same inequality with the roles of $(x,t)$ and $(y,s)$ swapped and $k = u(x,t)$.
[/step]
[step:Choose a product test function and add the two inequalities]
Choose $\varphi(x,t,y,s) = \psi\!\left(\frac{x+y}{2}, \frac{t+s}{2}\right)\rho_\varepsilon(x-y)\,\rho_\varepsilon(t-s)$, where $\psi \geq 0$ is a smooth test function in the centre-of-mass variables and $\rho_\varepsilon$ is a standard mollifier concentrating at $0$.
Add the two entropy inequalities.
[/step]
[step:Pass to the limit $\varepsilon \to 0$ to collapse to a two-variable inequality]
As $\varepsilon \to 0$, $\rho_\varepsilon(x-y)\,\rho_\varepsilon(t-s)$ concentrates on the diagonal $\{x = y,\, t = s\}$.
The four-variable integrals collapse to two-variable integrals by the properties of convolution.
The result is:
\begin{align*}
\int_0^T \!\!\int_{\mathbb{R}} \bigl(|u - v|\,\partial_t\psi + \operatorname{sgn}(u-v)(f(u)-f(v))\,\partial_x\psi\bigr) \, d\mathcal{L}^2 + \int_{\mathbb{R}} |u_0 - v_0|\,\psi(x,0) \, d\mathcal{L}^1 \geq 0.
\end{align*}
[/step]
[step:Choose a test function approximating a characteristic function to extract the $L^1$ contraction]
Choose a sequence $\psi_m$ approximating $\mathbb{1}_{[0,\tau]}(t) \cdot \mathbb{1}_{[-R,R]}(x)$ for large $R$.
The flux term $\operatorname{sgn}(u-v)(f(u)-f(v))$ is bounded since $f$ is Lipschitz and $u, v \in L^\infty$.
As $R \to \infty$, the spatial boundary contributions vanish.
The temporal derivative of $\psi_m$ concentrates at $t = 0$ (with weight $+1$) and $t = \tau$ (with weight $-1$).
In the limit:
\begin{align*}
\int_{\mathbb{R}} |u(x,\tau) - v(x,\tau)| \, d\mathcal{L}^1(x) \leq \int_{\mathbb{R}} |u_0(x) - v_0(x)| \, d\mathcal{L}^1(x)
\end{align*}
for almost every $\tau \in (0,T)$.
[/step]