[proofplan]
Define $M_t$ as the unique solution of $dM_t = -M_t\, dz_t$ with $M_0 = I_e$; existence and uniqueness follow from the linearity of the equation and standard CDE theory. We show $J_t M_t \equiv I_e$ and $M_t J_t \equiv I_e$ on $[0,T]$. For $A_t := J_t M_t$, the [Product Rule for CDEs](/theorems/???) shows that $A_t$ satisfies the homogeneous linear matrix CDE in commutator form $dA_t = [dz_t, A_t]$ with $A_0 = I_e$. Since $[dz_t, I_e] = dz_t \cdot I_e - I_e \cdot dz_t = 0$, the constant path $A_t \equiv I_e$ solves this equation, and uniqueness forces $A_t \equiv I_e$. For $B_t := M_t J_t$, the product rule gives $dB_t = 0$ directly, so $B_t \equiv B_0 = I_e$. Together these give a two-sided inverse, proving both the nonsingularity of $J_t$ and the claimed equation for $M_t = J_t^{-1}$.
[/proofplan]
[step:Define the candidate inverse $M_t$ as the solution of the dual linear CDE]
Consider the matrix-valued linear CDE
\begin{align*}
M : [0,T] &\to \mathbb{R}^{e \times e} \\
dM_t &= -M_t \cdot dz_t, \qquad M_0 = I_e,
\end{align*}
where $z$ is the bounded-variation driving path of the linearised CDE for the Jacobian. The right-hand side is linear in the state: for each fixed $\zeta \in \mathbb{R}^{e \times e}$, the map $M \mapsto -M \cdot \zeta$ is a bounded linear operator $\mathbb{R}^{e \times e} \to \mathbb{R}^{e \times e}$. Since $z$ also has bounded variation, the [Existence and Uniqueness of Linear CDEs](/theorems/???) gives a unique global solution $M_t$ on $[0,T]$. We do **not** yet claim $M_t = J_t^{-1}$; this is proved in the next steps.
[guided]
The strategy is to define a *candidate inverse* $M_t$ as the solution of its own CDE, without referring to $J_t^{-1}$ at all, and then verify after the fact that it is indeed the inverse. Why proceed this way? If we tried to define $M_t := J_t^{-1}$ from the outset, we would need to know that $J_t$ is invertible *before* we have any tool to prove invertibility, which is circular. The clean way is to write down a candidate whose equation is well-posed on its own, then check $J M = M J = I$ as a consequence.
The choice of equation $dM_t = -M_t\,dz_t$ is forced by the answer we want. Differentiating the desired identity $J_t M_t = I$ via the product rule (formally, even before justifying it) gives
\begin{align*}
0 = d(J_t M_t) = dJ_t \cdot M_t + J_t \cdot dM_t = dz_t \cdot J_t M_t + J_t \cdot dM_t = dz_t + J_t \cdot dM_t,
\end{align*}
so $J_t \cdot dM_t = -dz_t$, i.e. $dM_t = -J_t^{-1}\, dz_t = -M_t \cdot dz_t$ (using the desired identity $M_t = J_t^{-1}$ to rearrange). This identifies $-M_t \cdot dz_t$ as the right-hand side of the equation for $M_t$. The calculation is only a heuristic, not a proof, because it assumes the invertibility we are trying to establish. Its role is to tell us which candidate to write down.
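As a sanity check of this heuristic, consider the scalar case $e = 1$ with $z$ continuous and $z_0 = 0$ (both are assumptions made only for this illustration). Everything commutes, and both equations can be solved in closed form:
\begin{align*}
dJ_t = J_t\, dz_t, \quad J_0 = 1 &\implies J_t = e^{z_t}, \\
dM_t = -M_t\, dz_t, \quad M_0 = 1 &\implies M_t = e^{-z_t},
\end{align*}
so $J_t M_t = e^{z_t} e^{-z_t} = 1$ for all $t$. In the matrix case these exponential formulas fail in general, because increments of $z$ at different times need not commute, which is why the proof works with the CDEs directly.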
We verify the hypotheses of the [Existence and Uniqueness of Linear CDEs](/theorems/???): (i) the vector field is linear and bounded in the state: for each fixed $\zeta \in \mathbb{R}^{e \times e}$, the map $M \mapsto -M\zeta$ is right multiplication by $-\zeta$, with $\|M\zeta\| \le \|M\|\,\|\zeta\|$ in any submultiplicative matrix norm; (ii) the driver $z$ has bounded variation, since $z = \int \nabla f_\theta(y_\cdot)\,dx_\cdot$ where $\nabla f_\theta$ is bounded and continuous (by the hypothesis that $f_\theta \in C^1$ has bounded derivatives) and $x$ has bounded variation. Both hypotheses are met, so $M_t$ exists uniquely on $[0,T]$.
[/guided]
[/step]
[step:Compute $dA_t$ for $A_t := J_t M_t$ via the product rule]
Define $A : [0,T] \to \mathbb{R}^{e \times e}$ by $A_t := J_t \cdot M_t$. The Jacobian $J_t$ satisfies its own forward CDE $dJ_t = dz_t \cdot J_t$ (the linearisation of the original CDE about the trajectory $y_t$). Applying the [Product Rule for CDEs](/theorems/???) — valid because $J$ and $M$ are matrix-valued solutions of CDEs driven by paths of bounded variation, so the cross-variation term vanishes — we obtain
\begin{align*}
dA_t &= (dJ_t) \cdot M_t + J_t \cdot (dM_t) \\
&= (dz_t \cdot J_t) \cdot M_t + J_t \cdot (-M_t \cdot dz_t) \\
&= dz_t \cdot (J_t M_t) - (J_t M_t) \cdot dz_t \\
&= dz_t \cdot A_t - A_t \cdot dz_t \\
&= [dz_t,\, A_t],
\end{align*}
where $[B, C] := BC - CB$ is the matrix commutator. The initial condition is $A_0 = J_0 M_0 = I_e \cdot I_e = I_e$.
[guided]
Why does the cross-variation term vanish? In the [Product Rule for CDEs](/theorems/???) — equivalently, the [Itô Product Rule](/theorems/???) specialised to bounded-variation drivers — the formula reads
\begin{align*}
d(J_t M_t) = (dJ_t) M_t + J_t (dM_t) + d[J, M]_t,
\end{align*}
where $[J, M]_t$ is the cross-variation. Solutions of CDEs driven by a bounded-variation path are themselves of bounded variation, so $J$ and $M$ have zero cross-variation (there is no quadratic variation to contribute), and the Itô product rule reduces to the Leibniz rule of classical calculus. This is the same reason that classical ODE theory gives $\frac{d}{dt}(JM) = J' M + J M'$ without correction terms.
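The vanishing of the cross-variation can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the driver `driver(t)` is a hypothetical smooth $2 \times 2$ path (not taken from the text), the CDEs are replaced by plain Euler steps, and the "discrete cross-variation" is the sum $\sum_k (J_{k+1} - J_k)(M_{k+1} - M_k)$ over the grid. One expects this sum to shrink as the grid is refined.

```python
def matmul(a, b):
    """Multiply two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(c, a):
    """Multiply every entry of a by the scalar c."""
    return [[c * x for x in row] for row in a]


def identity(n):
    """n-by-n identity matrix."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def driver(t):
    """Hypothetical smooth matrix path z_t (smooth, hence of bounded variation)."""
    return [[0.0, t], [t * t, 0.0]]


def discrete_cross_variation(n, T=1.0):
    """Largest entry (in absolute value) of sum_k dJ_k dM_k for n Euler steps."""
    J = identity(2)
    M = identity(2)
    total = [[0.0, 0.0], [0.0, 0.0]]
    for k in range(n):
        z0 = driver(T * k / n)
        z1 = driver(T * (k + 1) / n)
        dz = add(z1, scale(-1.0, z0))        # increment of the driver
        dJ = matmul(dz, J)                   # Euler step for dJ = dz J
        dM = scale(-1.0, matmul(M, dz))      # Euler step for dM = -M dz
        total = add(total, matmul(dJ, dM))   # accumulate the cross term
        J = add(J, dJ)
        M = add(M, dM)
    return max(abs(x) for row in total for x in row)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, discrete_cross_variation(n))
```

Each cross term is a product of two increments, so it is of second order in the step size, and summing $n$ of them still leaves a quantity that tends to zero with the mesh. This is the discrete counterpart of the statement that the correction term in the product rule is absent.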
Substituting $dJ_t = dz_t \cdot J_t$ (the standard linearisation of the CDE: differentiate $dy_t = f_\theta(y_t)\, dx_t$ with respect to $y_0$ to get $dJ_t = \nabla f_\theta(y_t)\, dx_t \cdot J_t$, and write $dz_t := \nabla f_\theta(y_t)\, dx_t$ for the auxiliary driver), and substituting $dM_t = -M_t\, dz_t$ from the definition, we get
\begin{align*}
dA_t = dz_t \cdot J_t \cdot M_t - J_t \cdot M_t \cdot dz_t = dz_t A_t - A_t dz_t = [dz_t, A_t].
\end{align*}
The initial condition $A_0 = I_e$ holds by direct evaluation: $J_0 = I_e$ (at $t = 0$ the flow map $y_0 \mapsto y_0$ is the identity, whose Jacobian is $I_e$) and $M_0 = I_e$ (our chosen initial condition).
[/guided]
[/step]
[step:Show $A_t \equiv I_e$ by observing that the constant path solves the commutator equation]
We claim the constant path $A_t \equiv I_e$ solves the equation $dA_t = [dz_t, A_t]$ with $A_0 = I_e$. Indeed, $[dz_t, I_e] = dz_t \cdot I_e - I_e \cdot dz_t = 0$, so the right-hand side vanishes identically and $A_t = A_0 = I_e$ is consistent.
For uniqueness: the equation $dA_t = [dz_t, A_t]$ is a linear matrix CDE, since its right-hand side depends on $A$ through the linear map $A \mapsto [\zeta, A]$ for each increment direction $\zeta$. We verify the hypotheses of the [Existence and Uniqueness of Linear CDEs](/theorems/???): for every fixed $B$, the operator $A \mapsto [B, A]$ is bounded linear on $\mathbb{R}^{e \times e}$ with norm $\le 2\|B\|$, and $z$ has bounded variation. Hence the solution from initial condition $A_0 = I_e$ is unique. Therefore
\begin{align*}
A_t = J_t M_t = I_e \qquad \text{for all } t \in [0, T].
\end{align*}
[guided]
The structure of the argument is: we have a linear CDE with a known initial condition, we exhibit a particular solution (the constant path), and uniqueness forces every solution with that initial condition to coincide with it. The constant path works precisely because $[B, I_e] = B I_e - I_e B = 0$ for every $B$: the identity matrix commutes with everything (as does every scalar multiple of it), so it is a fixed point of every commutator equation.
It is essential to verify uniqueness, not assume it. Exhibiting the constant solution only shows that *some* solution starting from $I_e$ is constant; to conclude that $J_t M_t$ is that solution, we must know that the initial value problem has no other solution. (Solutions starting from other initial conditions $A_0 = C$ need not be constant.) This is exactly what the [Existence and Uniqueness of Linear CDEs](/theorems/???) provides. We verify hypotheses: (i) for each fixed $B$, the map $A \mapsto [B, A]$ is bounded linear with $\|[B, \cdot]\|_{\mathrm{op}} \le 2 \|B\|$ on $\mathbb{R}^{e \times e}$ (left multiplication plus right multiplication, each of norm at most $\|B\|$); (ii) $z$ has bounded variation, as verified when $M_t$ was constructed in the first step.
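To make the dependence on the initial condition concrete, here is a short worked computation. It uses only the product rule and the two CDEs for $J$ and $M$; the constant matrix $C \in \mathbb{R}^{e \times e}$ is a hypothetical initial value introduced for illustration. Since $C$ is constant,
\begin{align*}
d(J_t C M_t) &= (dJ_t)\, C M_t + J_t C\, (dM_t) \\
&= dz_t \cdot J_t C M_t - J_t C M_t \cdot dz_t \\
&= [dz_t,\, J_t C M_t],
\end{align*}
so $t \mapsto J_t C M_t$ solves the commutator equation with initial value $C$, and by uniqueness it is the only such solution. For $C = I_e$ this path is $J_t M_t$ itself, which uniqueness identifies with the constant path $I_e$; for a general $C$ the path need not be constant.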
[/guided]
[/step]
[step:Repeat the argument for $B_t := M_t J_t$ to obtain the two-sided inverse]
Define $B_t := M_t \cdot J_t$. By the [Product Rule for CDEs](/theorems/???) again,
\begin{align*}
dB_t &= (dM_t) J_t + M_t (dJ_t) \\
&= (-M_t \cdot dz_t) J_t + M_t (dz_t \cdot J_t) \\
&= -M_t \cdot dz_t \cdot J_t + M_t \cdot dz_t \cdot J_t \\
&= 0.
\end{align*}
Hence $B$ is constant: $B_t \equiv B_0 = M_0 J_0 = I_e \cdot I_e = I_e$.
This computation is more direct than the commutator argument for $A_t$: since $dB_t = 0$ identically, no appeal to uniqueness is needed.
Combining with the previous step, we have shown
\begin{align*}
J_t M_t = M_t J_t = I_e \qquad \text{for all } t \in [0, T].
\end{align*}
This proves both that $J_t$ is nonsingular for every $t \in [0, T]$ and that $J_t^{-1} = M_t$, where $M_t$ is the unique solution to $dM_t = -M_t \cdot dz_t$ with $M_0 = I_e$. This completes the proof.
[guided]
The asymmetry between $A_t = J_t M_t$ (commutator equation) and $B_t = M_t J_t$ (zero equation) is striking and worth understanding. Why does the order matter?
When we compute $dA_t = (dJ_t) M_t + J_t (dM_t) = dz_t J_t M_t - J_t M_t dz_t$, the $dz_t$ factors sit at the *outside* of the product on each term — once on the left, once on the right — so they do not cancel and we get a commutator.
When we compute $dB_t = (dM_t) J_t + M_t (dJ_t) = -M_t dz_t J_t + M_t dz_t J_t = 0$, the $dz_t$ factor sits in the *middle* of both terms, sandwiched between $M_t$ and $J_t$. The two contributions are exact negatives of each other, so they cancel pointwise. This is why the second computation yields $dB_t \equiv 0$ directly.
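The ordering in the equation $dM_t = -M_t \cdot dz_t$ (with $M_t$ on the *left* of $dz_t$) was chosen specifically so that this cancellation in $B_t$ happens: it is the "right-acting" companion to the "left-acting" Jacobian equation $dJ_t = dz_t \cdot J_t$. If we had instead defined $\tilde{M}_t$ by $d\tilde{M}_t = -dz_t \cdot \tilde{M}_t$, neither product would work: $d(\tilde{M}_t J_t) = -dz_t \cdot \tilde{M}_t J_t + \tilde{M}_t \cdot dz_t \cdot J_t$ and $d(J_t \tilde{M}_t) = dz_t \cdot J_t \tilde{M}_t - J_t \cdot dz_t \cdot \tilde{M}_t$, and neither expression vanishes or closes into an equation for the product unless the matrices involved commute. Such an $\tilde{M}_t$ is instead the inverse of the solution $K_t$ of the right-acting equation $dK_t = K_t \cdot dz_t$ with $K_0 = I_e$, since $d(K_t \tilde{M}_t) = K_t \cdot dz_t \cdot \tilde{M}_t - K_t \cdot dz_t \cdot \tilde{M}_t = 0$.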
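The role of the ordering can be checked on a small example. The sketch below is an illustration under stated assumptions, not part of the proof: the driver is a hypothetical piecewise-linear path in $\mathbb{R}^{2 \times 2}$ that first moves along $E = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$ and then along $F = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}$, and the CDEs are integrated by Euler steps. Because $E^2 = F^2 = 0$, the Euler steps happen to reproduce the exact solutions for this particular driver. The variable `Mt` is a name chosen here for a left-acting variant solving $d\tilde{M}_t = -dz_t \cdot \tilde{M}_t$.

```python
def matmul(a, b):
    """Multiply two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(c, a):
    """Multiply every entry of a by the scalar c."""
    return [[c * x for x in row] for row in a]


def identity(n):
    """n-by-n identity matrix."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def defect(p):
    """Largest entrywise deviation of p from the identity matrix."""
    n = len(p)
    return max(abs(p[i][j] - (1.0 if i == j else 0.0))
               for i in range(n) for j in range(n))


def driver_increments(steps_per_leg=4):
    """Increments of a hypothetical driver: a leg along E, then a leg along F."""
    E = [[0.0, 1.0], [0.0, 0.0]]
    F = [[0.0, 0.0], [1.0, 0.0]]
    h = 1.0 / steps_per_leg
    return [scale(h, E)] * steps_per_leg + [scale(h, F)] * steps_per_leg


def integrate(increments):
    """Euler steps for dJ = dz J, dM = -M dz, and the variant dMt = -dz Mt."""
    J = M = Mt = identity(2)
    for dz in increments:
        J = add(J, matmul(dz, J))                   # J  <- J + dz J
        M = add(M, scale(-1.0, matmul(M, dz)))      # M  <- M - M dz
        Mt = add(Mt, scale(-1.0, matmul(dz, Mt)))   # Mt <- Mt - dz Mt
    return J, M, Mt


if __name__ == "__main__":
    J, M, Mt = integrate(driver_increments())
    print("defect of J M :", defect(matmul(J, M)))
    print("defect of M J :", defect(matmul(M, J)))
    print("defect of Mt J:", defect(matmul(Mt, J)))
```

In this example both products of $J$ with the right-acting $M$ equal the identity, while the product with the left-acting variant does not, because $E$ and $F$ do not commute. A driver whose increments all commute would not distinguish the two orderings.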
[/guided]
[/step]