[step:Show $\nabla f$ is locally $BV$ via monotonicity]
[claim:For every open ball $B \Subset U$, $\nabla f \in BV(B; \mathbb{R}^n)$, and the distributional Jacobian $D(\nabla f)$ is an $n \times n$ matrix of finite signed Radon measures on $B$, symmetric in the sense that $\partial_{x_i}(\partial_{x_j} f) = \partial_{x_j}(\partial_{x_i} f)$ as distributions.]
[/claim]
*Proof of claim.* Fix an open ball $B \Subset U$. Choose a slightly enlarged ball $B' \Subset U$ with $\overline{B} \Subset B'$, and set $\delta := \operatorname{dist}(\overline{B}, \partial B') > 0$.
For each unit vector $v \in S^{n-1}$ and each $h \in (0, \delta)$, define the directional difference quotient
\begin{align*}
\Delta_v^h \nabla f: B &\to \mathbb{R}^n, \\
x &\mapsto \frac{\nabla f(x + h v) - \nabla f(x)}{h}.
\end{align*}
This is well-defined for a.e. $x \in B$ since $x + hv \in B'$.
By the monotonicity of the subdifferential, applied with $p = \nabla f(x + hv) \in \partial f(x + hv)$ and $q = \nabla f(x) \in \partial f(x)$ (at points of differentiability),
\begin{align*}
(\nabla f(x + hv) - \nabla f(x)) \cdot ((x + hv) - x) \ge 0,
\end{align*}
i.e., $h \, v \cdot \Delta_v^h \nabla f(x) \ge 0$. Dividing by $h^2 > 0$,
\begin{align*}
v \cdot \Delta_v^h \nabla f(x) \ge 0 \qquad \text{for a.e. } x \in B.
\end{align*}
In particular, taking $v = e_k$ shows that the scalar $\partial_{x_k} f$ is monotone-nondecreasing in the $x_k$-direction along $B$ (in the difference-quotient sense). Taking $v = (e_i + e_j)/\sqrt 2$ and expanding $v \cdot \nabla f = (\partial_{x_i} f + \partial_{x_j} f)/\sqrt 2$ shows that the scalar $\partial_{x_i} f + \partial_{x_j} f$ is monotone-nondecreasing in the direction $v$. Taking $v = (e_i - e_j)/\sqrt 2$ shows analogously that $\partial_{x_i} f - \partial_{x_j} f$ is monotone-nondecreasing in $v' = (e_i - e_j)/\sqrt 2$.
Integrating these difference-quotient inequalities against test functions $\varphi \in C_c^\infty(B)$ with $\varphi \ge 0$, and passing to the distributional limit as $h \to 0^+$, yields nonnegative distributions:
\begin{align*}
T_k &:= \partial_{x_k}(\partial_{x_k} f) \ge 0, \\
T_{ij}^+ &:= \partial_v(\partial_{v} f) = \tfrac{1}{2}(\partial_{x_i} + \partial_{x_j})(\partial_{x_i} f + \partial_{x_j} f) = \tfrac{1}{2}(\partial_{ii} f + 2 \partial_{ij} f + \partial_{jj} f) \ge 0, \\
T_{ij}^- &:= \partial_{v'}(\partial_{v'} f) = \tfrac{1}{2}(\partial_{x_i} - \partial_{x_j})(\partial_{x_i} f - \partial_{x_j} f) = \tfrac{1}{2}(\partial_{ii} f - 2 \partial_{ij} f + \partial_{jj} f) \ge 0.
\end{align*}
By Schwartz's theorem on positive distributions (every nonnegative distribution is a Radon measure; see Schwartz, *Théorie des distributions*, Ch. I §4), each of $T_k$, $T_{ij}^+$, $T_{ij}^-$ is a nonnegative Radon measure on $B$.
We now extract the off-diagonal entries by polarization:
\begin{align*}
\partial_{ij} f = \tfrac{1}{2}\bigl(T_{ij}^+ - T_{ij}^-\bigr),
\end{align*}
exhibiting $\partial_{ij} f$ as a difference of two nonnegative Radon measures, hence a finite signed Radon measure on $B$. The diagonal entries $\partial_{ii} f = T_i$ are themselves nonnegative measures.
Since each component $\partial_{x_i} f \in L^\infty(B)$ has each distributional partial derivative $\partial_{x_j}(\partial_{x_i} f)$ a finite signed Radon measure on $B$, we conclude $\partial_{x_i} f \in BV(B)$ by the definition of $BV$, and hence $\nabla f \in BV(B; \mathbb{R}^n)$.
The symmetry $\partial_{x_i}(\partial_{x_j} f) = \partial_{x_j}(\partial_{x_i} f)$ holds as distributions because mixed partials commute when applied to test functions (using $f \in L^1_{\mathrm{loc}}(B)$, which follows from $f$ being locally Lipschitz hence locally bounded):
\begin{align*}
\partial_{x_j}(\partial_{x_i} f)(\varphi) = -\int_B \partial_{x_i} f \, \partial_{x_j}\varphi = \int_B f \, \partial_{x_i}\partial_{x_j}\varphi = -\int_B \partial_{x_j} f\, \partial_{x_i}\varphi = \partial_{x_i}(\partial_{x_j} f)(\varphi).
\end{align*}
$\square$
By the claim, $\nabla f \in BV(B; \mathbb{R}^n)$ and $D(\nabla f) = (\partial_{x_j}(\partial_{x_i} f))_{i,j} =: D^2 f$ is an $n \times n$ matrix of finite signed Radon measures on $B$.
By Lebesgue's decomposition for the matrix of signed measures $D^2 f$, write
\begin{align*}
D^2 f = D^2 f_{\mathrm{ac}} + D^2 f_{\mathrm{s}},
\end{align*}
where $D^2 f_{\mathrm{ac}} \ll \mathcal{L}^n$ and $D^2 f_{\mathrm{s}} \perp \mathcal{L}^n$ on $B$. The Radon–Nikodym theorem provides a measurable matrix-valued density $A_B \in L^1(B; \mathbb{R}^{n \times n}_{\mathrm{sym}})$ with $D^2 f_{\mathrm{ac}} = A_B \, \mathcal{L}^n$. The symmetry of $D^2 f$ as a distribution descends to symmetry of $A_B(x)$ for $\mathcal{L}^n$-a.e. $x$. The densities $A_B$ on different balls agree on overlaps (the absolutely continuous part of $D^2 f$ on $B$ is the restriction to $B$ of the absolutely continuous part of $D^2 f$ on any larger ball $B'' \supseteq B$), so they patch into a single density $A \in L^1_{\mathrm{loc}}(U; \mathbb{R}^{n \times n}_{\mathrm{sym}})$.
[/step]