[proofplan]
We prove the integration by parts formula for cadlag BV functions by a discrete telescoping identity on partitions followed by a passage to the limit. On any partition $s = r_0 < r_1 < \cdots < r_N = t$, we decompose each product increment $a(r_i)b(r_i) - a(r_{i-1})b(r_{i-1})$ into three terms: $a(r_{i-1})$ times the increment of $b$, $b(r_{i-1})$ times the increment of $a$, and the product of the two increments (the cross term). As the mesh tends to zero, the first two sums converge to the Lebesgue-Stieltjes integrals $\int_s^t a(r^-)\, db(r)$ and $\int_s^t b(r^-)\, da(r)$ by the [Riemann Sum Approximation](/theorems/2072), since the functions $a(\cdot^-)$ and $b(\cdot^-)$ are left-continuous and bounded. The cross term converges to the sum of simultaneous jumps $\sum_{s < r \leq t} \Delta a(r)\, \Delta b(r)$.
[/proofplan]
[step:Write the telescoping identity on a partition]
Fix $0 \leq s \leq t \leq T$ and let $\pi = \{s = r_0 < r_1 < \cdots < r_N = t\}$ be a partition of $[s,t]$. Then
\begin{align*}
a(t)b(t) - a(s)b(s) = \sum_{i=1}^{N} \bigl[a(r_i)b(r_i) - a(r_{i-1})b(r_{i-1})\bigr].
\end{align*}
For each summand, we apply the algebraic identity $\alpha\beta - \gamma\delta = \gamma(\beta - \delta) + \delta(\alpha - \gamma) + (\alpha - \gamma)(\beta - \delta)$ with $\alpha = a(r_i)$, $\beta = b(r_i)$, $\gamma = a(r_{i-1})$, $\delta = b(r_{i-1})$:
\begin{align*}
a(r_i)b(r_i) - a(r_{i-1})b(r_{i-1}) = a(r_{i-1})\bigl[b(r_i) - b(r_{i-1})\bigr] + b(r_{i-1})\bigl[a(r_i) - a(r_{i-1})\bigr] + \bigl[a(r_i) - a(r_{i-1})\bigr]\bigl[b(r_i) - b(r_{i-1})\bigr].
\end{align*}
Summing over $i = 1, \ldots, N$ gives
\begin{align*}
a(t)b(t) - a(s)b(s) = \underbrace{\sum_{i=1}^{N} a(r_{i-1})\bigl[b(r_i) - b(r_{i-1})\bigr]}_{S_1^{(\pi)}} + \underbrace{\sum_{i=1}^{N} b(r_{i-1})\bigl[a(r_i) - a(r_{i-1})\bigr]}_{S_2^{(\pi)}} + \underbrace{\sum_{i=1}^{N} \bigl[a(r_i) - a(r_{i-1})\bigr]\bigl[b(r_i) - b(r_{i-1})\bigr]}_{S_3^{(\pi)}}.
\end{align*}
This identity holds exactly for every partition $\pi$.
[guided]
The proof begins with the observation that the product $a(t)b(t)$ telescopes over any partition. The key algebraic trick is to decompose each product increment into three terms. Why three? Because $a(r_i)b(r_i) - a(r_{i-1})b(r_{i-1})$ describes how the product changes, and we want to separate the contribution of $a$ changing (with $b$ held at its old value), $b$ changing (with $a$ held at its old value), and the "interaction" where both change simultaneously.
Formally, we use the identity $\alpha\beta - \gamma\delta = \gamma(\beta - \delta) + \delta(\alpha - \gamma) + (\alpha - \gamma)(\beta - \delta)$, which can be verified by expanding the right-hand side: $\gamma\beta - \gamma\delta + \delta\alpha - \delta\gamma + \alpha\beta - \alpha\delta - \gamma\beta + \gamma\delta = \alpha\beta - \gamma\delta$, since the terms $\pm\gamma\beta$, $\pm\alpha\delta$, and one pair $\pm\gamma\delta$ cancel. Applying this with $\alpha = a(r_i)$, $\beta = b(r_i)$, $\gamma = a(r_{i-1})$, $\delta = b(r_{i-1})$ and summing over $i$ yields the three sums $S_1^{(\pi)}$, $S_2^{(\pi)}$, $S_3^{(\pi)}$.
This identity is exact: no approximation has been made yet. The work lies in identifying the limit of each sum as the partition mesh tends to zero.
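As a quick numerical sanity check (the values are chosen here purely for illustration and are not part of the argument), take $\alpha = 3$, $\beta = 5$, $\gamma = 1$, $\delta = 2$, as if $a$ moved from $1$ to $3$ and $b$ from $2$ to $5$ across a single subinterval:
\begin{align*}
\alpha\beta - \gamma\delta &= 15 - 2 = 13, \\
\gamma(\beta - \delta) + \delta(\alpha - \gamma) + (\alpha - \gamma)(\beta - \delta) &= 1 \cdot 3 + 2 \cdot 2 + 2 \cdot 3 = 13.
\end{align*}
Here the cross term accounts for $6$ of the $13$, so it cannot be discarded when both increments are large, which is exactly what happens at a simultaneous jump.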
[/guided]
[/step]
[step:Show $S_1^{(\pi)}$ and $S_2^{(\pi)}$ converge to the Lebesgue-Stieltjes integrals]
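Consider a sequence of partitions $(\pi_m)$ of $[s,t]$, $\pi_m = \{s = r_0^{(m)} < r_1^{(m)} < \cdots < r_{N_m}^{(m)} = t\}$, with mesh $|\pi_m| \to 0$. The sum $S_1^{(\pi_m)}$ is a left-endpoint Riemann sum against the signed measure induced by $b$, and its limiting integrand is $r \mapsto a(r^-)$: for $r \in (r_{i-1}^{(m)}, r_i^{(m)}]$ the left endpoint $r_{i-1}^{(m)}$ lies strictly below $r$ and within $|\pi_m|$ of it, so $a(r_{i-1}^{(m)}) \to a(r^-)$ as $m \to \infty$. Explicitly,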
\begin{align*}
S_1^{(\pi_m)} = \sum_{i=1}^{N_m} a(r_{i-1}^{(m)})\bigl[b(r_i^{(m)}) - b(r_{i-1}^{(m)})\bigr].
\end{align*}
The function $r \mapsto a(r^-)$ is left-continuous (since the left-limit of a cadlag function is itself left-continuous) and bounded on $[s,t]$ (since $a$ is cadlag on a compact interval, hence bounded). By the [Riemann Sum Approximation](/theorems/2072) applied with $h(r) = a(r^-)$ and the BV function $b$,
\begin{align*}
\lim_{m \to \infty} S_1^{(\pi_m)} = \int_s^t a(r^-)\, db(r).
\end{align*}
By symmetry, $S_2^{(\pi_m)}$ is a left-endpoint Riemann sum for the left-continuous bounded function $r \mapsto b(r^-)$ against the signed measure induced by $a$, so the same theorem gives
\begin{align*}
\lim_{m \to \infty} S_2^{(\pi_m)} = \int_s^t b(r^-)\, da(r).
\end{align*}
[guided]
The sums $S_1^{(\pi_m)}$ and $S_2^{(\pi_m)}$ are exactly the Riemann sums appearing in the [Riemann Sum Approximation](/theorems/2072). To apply that theorem, we need to verify its hypotheses for each sum.
For $S_1^{(\pi_m)}$: The integrand is $h(r) = a(r^-)$. We check:
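- **Left-continuity:** For $r > s$, $a(r^-) = \lim_{u \uparrow r} a(u)$ exists because $a$ is cadlag. For the left-continuity of $r \mapsto a(r^-)$ itself: fix $r_0 > s$ and $\varepsilon > 0$. By the definition of $a(r_0^-)$ there is $\delta \in (0, r_0 - s]$ such that $|a(u) - a(r_0^-)| < \varepsilon$ for all $u \in (r_0 - \delta, r_0)$. For any $r \in (r_0 - \delta, r_0)$, the value $a(r^-)$ is a limit of values $a(u)$ with $u \in (r_0 - \delta, r)$, so $|a(r^-) - a(r_0^-)| \leq \varepsilon$. Hence $a(r^-) \to a(r_0^-)$ as $r \uparrow r_0$, and $h$ is left-continuous.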
- **Boundedness:** Since $a$ is cadlag on the compact interval $[s,t]$, it is bounded: $\|a\|_\infty := \sup_{r \in [s,t]} |a(r)| < \infty$. Therefore $|a(r^-)| \leq \|a\|_\infty$ for all $r$.
- **BV integrator:** $b$ is cadlag and in $BV[0,T]$ by hypothesis.
The Riemann Sum Approximation theorem then gives $S_1^{(\pi_m)} \to \int_s^t a(r^-)\, db(r)$.
The argument for $S_2^{(\pi_m)}$ is identical with the roles of $a$ and $b$ swapped: $h(r) = b(r^-)$ is left-continuous and bounded, and $a$ is the BV integrator.
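To see why the left limit appears in the integrand, consider the following illustrative example (the functions and partitions are assumptions made here, not part of the original argument). On $[s,t] = [0,1]$ let $a(r) = r + \mathbf{1}_{[1/2,1]}(r)$ and $b(r) = 2 \cdot \mathbf{1}_{[1/2,1]}(r)$, and take the uniform partition $r_i = i/N$ with $N$ even, so that $1/2$ is a partition point. Only the subinterval ending at $1/2$ carries an increment of $b$, and $b(r_{i-1}) = 2$ exactly when $i > N/2$, so
\begin{align*}
S_1^{(\pi)} &= a\bigl(\tfrac{1}{2} - \tfrac{1}{N}\bigr)\bigl[b\bigl(\tfrac{1}{2}\bigr) - b\bigl(\tfrac{1}{2} - \tfrac{1}{N}\bigr)\bigr] = \bigl(\tfrac{1}{2} - \tfrac{1}{N}\bigr) \cdot 2 = 1 - \tfrac{2}{N} \longrightarrow 1 = a\bigl(\tfrac{1}{2}^-\bigr)\, \Delta b\bigl(\tfrac{1}{2}\bigr) = \int_0^1 a(r^-)\, db(r), \\
S_2^{(\pi)} &= \sum_{i = N/2 + 1}^{N} 2 \cdot \tfrac{1}{N} = 1 = \int_0^1 b(r^-)\, da(r).
\end{align*}
Evaluating the integrand at $a(1/2) = 3/2$ instead of $a(1/2^-) = 1/2$ would give $3$ rather than $1$ for the first integral, which is not what the left-endpoint sums approach.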
[/guided]
[/step]
[step:Show the cross-term $S_3^{(\pi)}$ converges to $\sum_{s < r \leq t} \Delta a(r)\, \Delta b(r)$]
We must show
\begin{align*}
\lim_{m \to \infty} \sum_{i=1}^{N_m} \bigl[a(r_i^{(m)}) - a(r_{i-1}^{(m)})\bigr]\bigl[b(r_i^{(m)}) - b(r_{i-1}^{(m)})\bigr] = \sum_{s < r \leq t} \Delta a(r)\, \Delta b(r).
\end{align*}
First, we verify that the right-hand side is well-defined and finite. Since $a, b \in BV[s,t]$, each has at most countably many jump points. The set of simultaneous jumps $J := \{r \in (s,t] : \Delta a(r) \neq 0 \text{ and } \Delta b(r) \neq 0\}$ is at most countable. By the Cauchy-Schwarz inequality for sums,
\begin{align*}
\sum_{r \in J} |\Delta a(r)| \cdot |\Delta b(r)| \leq \Bigl(\sum_{r \in J} |\Delta a(r)|^2\Bigr)^{1/2} \Bigl(\sum_{r \in J} |\Delta b(r)|^2\Bigr)^{1/2} \leq V_a(t) \cdot V_b(t) < \infty,
\end{align*}
where the second inequality uses the fact that $\sum_r |\Delta a(r)|^2 \leq \bigl(\sup_r |\Delta a(r)|\bigr) \sum_r |\Delta a(r)| \leq V_a(t)^2$, and similarly for $b$. Thus the sum converges absolutely.
To prove the limit, write
\begin{align*}
S_3^{(\pi_m)} - \sum_{s < r \leq t} \Delta a(r)\, \Delta b(r) = \sum_{i=1}^{N_m} \Bigl[\bigl(a(r_i^{(m)}) - a(r_{i-1}^{(m)})\bigr)\bigl(b(r_i^{(m)}) - b(r_{i-1}^{(m)})\bigr) - \sum_{r_{i-1}^{(m)} < r \leq r_i^{(m)}} \Delta a(r)\, \Delta b(r)\Bigr].
\end{align*}
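On each subinterval $I_i^{(m)} := (r_{i-1}^{(m)}, r_i^{(m)}]$, the increment of $a$ decomposes as $a(r_i^{(m)}) - a(r_{i-1}^{(m)}) = a^c(r_i^{(m)}) - a^c(r_{i-1}^{(m)}) + \sum_{r \in I_i^{(m)}} \Delta a(r)$, where $a^c$ is the continuous part of $a$, and similarly for $b$. Expanding the product of increments, every term containing an increment of $a^c$ or $b^c$ is controlled by the oscillation of the continuous parts, while the product of the two jump sums equals the diagonal sum $\sum_{r \in I_i^{(m)}} \Delta a(r)\, \Delta b(r)$ plus the off-diagonal products $\Delta a(r)\, \Delta b(u)$ with $r \neq u$ in $I_i^{(m)}$. Specifically, we bound
\begin{align*}
\bigl|S_3^{(\pi_m)} - \sum_{s < r \leq t} \Delta a(r)\, \Delta b(r)\bigr| &\leq \sup_i \operatorname{osc}(a; I_i^{(m)}) \cdot V_b(t) + \sup_i \operatorname{osc}(b; I_i^{(m)}) \cdot V_a(t) + R_m, \qquad R_m := \sum_{i=1}^{N_m} \sum_{\substack{r, u \in I_i^{(m)} \\ r \neq u}} |\Delta a(r)|\, |\Delta b(u)|,
\end{align*}
where $\operatorname{osc}(a; I)$ denotes the oscillation of the continuous part of $a$ on the interval $I$. Since $a^c$ and $b^c$ are continuous functions on the compact interval $[s,t]$, they are uniformly continuous, and the two oscillation terms tend to zero as $|\pi_m| \to 0$. For $R_m$, fix $\varepsilon > 0$ and choose a finite set $F \subset (s,t]$ with $\sum_{r \notin F} |\Delta a(r)| < \varepsilon$. The pairs with $r \notin F$ contribute at most $\varepsilon \sum_u |\Delta b(u)| \leq \varepsilon\, V_b(t)$. For each of the finitely many $r \in F$, the subinterval containing $r$ lies in $(r - |\pi_m|, r + |\pi_m|)$, and $\sum_{0 < |u - r| < \delta} |\Delta b(u)| \to 0$ as $\delta \to 0$ because $\sum_u |\Delta b(u)| < \infty$. Hence $\limsup_{m \to \infty} R_m \leq \varepsilon\, V_b(t)$, and since $\varepsilon$ was arbitrary, $R_m \to 0$. Therefore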
\begin{align*}
\lim_{m \to \infty} S_3^{(\pi_m)} = \sum_{s < r \leq t} \Delta a(r)\, \Delta b(r).
\end{align*}
[guided]
The cross-term is the most subtle part of the argument. We need to understand what happens to the product of increments $[a(r_i) - a(r_{i-1})][b(r_i) - b(r_{i-1})]$ as the partition becomes finer.
**Why does this converge to a sum of jump products rather than zero?** For continuous BV functions, the cross-term does converge to zero: the oscillation of a continuous function on small intervals tends to zero, and the sum over a partition of the products of increments is bounded by the maximum oscillation of one function times the total variation of the other, which tends to zero.
But for cadlag BV functions, jumps create "lumps" of variation concentrated at single points. When both $a$ and $b$ jump at the same time $r$, the subinterval containing $r$ picks up a contribution of approximately $\Delta a(r) \cdot \Delta b(r)$ that does not shrink as the mesh tends to zero.
**Why is the sum well-defined?** A BV function can have at most countably many jumps (since $\sum_r |\Delta a(r)| \leq V_a(t) < \infty$ forces all but countably many jumps to be zero). The absolute convergence of the sum of products follows from the Cauchy-Schwarz inequality:
\begin{align*}
\sum_{r \in J} |\Delta a(r)| \cdot |\Delta b(r)| \leq \Bigl(\sum_r |\Delta a(r)|^2\Bigr)^{1/2} \Bigl(\sum_r |\Delta b(r)|^2\Bigr)^{1/2} \leq V_a(t) \cdot V_b(t) < \infty,
\end{align*}
where we used $\sum_r |\Delta a(r)|^2 \leq \sup_r |\Delta a(r)| \cdot \sum_r |\Delta a(r)| \leq V_a(t)^2$.
**The convergence argument:** Decompose $a = a^c + a^d$ where $a^c$ is the continuous part and $a^d(r) = \sum_{s < u \leq r} \Delta a(u)$ is the pure jump part. Similarly $b = b^c + b^d$. Then the increment on each subinterval decomposes as
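\begin{align*}
[a(r_i) - a(r_{i-1})][b(r_i) - b(r_{i-1})] &= [a^c(r_i) - a^c(r_{i-1})][b^c(r_i) - b^c(r_{i-1})] + [a^c(r_i) - a^c(r_{i-1})][b^d(r_i) - b^d(r_{i-1})] \\
&\quad + [a^d(r_i) - a^d(r_{i-1})][b^c(r_i) - b^c(r_{i-1})] + [a^d(r_i) - a^d(r_{i-1})][b^d(r_i) - b^d(r_{i-1})].
\end{align*}
The purely continuous product sum tends to zero because $\sup_i |a^c(r_i) - a^c(r_{i-1})| \to 0$ by uniform continuity of $a^c$ on $[s,t]$, and the total variation of $b^c$ is bounded. The two mixed sums vanish for the same reason, since the jump parts $a^d$ and $b^d$ also have finite total variation. In the last sum, the product of the jump increments over a subinterval splits into the diagonal products $\Delta a(r)\, \Delta b(r)$, which add up to $\sum_{s < r \leq t} \Delta a(r)\, \Delta b(r)$, and the off-diagonal products $\Delta a(r)\, \Delta b(u)$ with $r \neq u$ in the same subinterval. The total off-diagonal contribution tends to zero as the mesh shrinks, because the jumps of each function are absolutely summable and distinct jump times are eventually separated by the partition.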
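A small worked example may make the limit concrete (the functions and partitions below are assumptions chosen for illustration). On $[0,1]$ let $a(r) = r + \mathbf{1}_{[1/2,1]}(r)$ and $b(r) = 2 \cdot \mathbf{1}_{[1/2,1]}(r)$, so that both functions jump at $r = 1/2$ with $\Delta a(1/2) = 1$ and $\Delta b(1/2) = 2$. For the uniform partition $r_i = i/N$ with $N$ even, the increment of $b$ vanishes on every subinterval except the one ending at $1/2$, hence
\begin{align*}
S_3^{(\pi)} = \bigl[a\bigl(\tfrac{1}{2}\bigr) - a\bigl(\tfrac{1}{2} - \tfrac{1}{N}\bigr)\bigr]\bigl[b\bigl(\tfrac{1}{2}\bigr) - b\bigl(\tfrac{1}{2} - \tfrac{1}{N}\bigr)\bigr] = \bigl(1 + \tfrac{1}{N}\bigr) \cdot 2 = 2 + \tfrac{2}{N} \longrightarrow 2 = \Delta a\bigl(\tfrac{1}{2}\bigr)\, \Delta b\bigl(\tfrac{1}{2}\bigr).
\end{align*}
The vanishing remainder $2/N$ is the increment of the continuous part $a^c(r) = r$ multiplied by the jump of $b$, which is a mixed term of the kind described above.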
[/guided]
[/step]
[step:Combine the three limits to obtain the integration by parts formula]
Since the telescoping identity $a(t)b(t) - a(s)b(s) = S_1^{(\pi_m)} + S_2^{(\pi_m)} + S_3^{(\pi_m)}$ holds for every $m$, and the left-hand side is independent of $m$, taking $m \to \infty$ gives
\begin{align*}
a(t)b(t) - a(s)b(s) = \int_s^t a(r^-)\, db(r) + \int_s^t b(r^-)\, da(r) + \sum_{s < r \leq t} \Delta a(r)\, \Delta b(r).
\end{align*}
This is the desired integration by parts formula for cadlag BV functions.
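As a consistency check on an illustrative example (an assumption made here, not part of the original statement), take $[s,t] = [0,1]$, $a(r) = r + \mathbf{1}_{[1/2,1]}(r)$ and $b(r) = 2 \cdot \mathbf{1}_{[1/2,1]}(r)$. The measure $db$ is a point mass of size $2$ at $1/2$, the measure $da$ is Lebesgue measure plus a unit point mass at $1/2$, and $b(r^-) = 2 \cdot \mathbf{1}_{(1/2,1]}(r)$, so
\begin{align*}
\int_0^1 a(r^-)\, db(r) &= a\bigl(\tfrac{1}{2}^-\bigr) \cdot 2 = 1, \\
\int_0^1 b(r^-)\, da(r) &= \int_{1/2}^1 2\, dr + b\bigl(\tfrac{1}{2}^-\bigr) \cdot 1 = 1 + 0 = 1, \\
\sum_{0 < r \leq 1} \Delta a(r)\, \Delta b(r) &= 1 \cdot 2 = 2.
\end{align*}
The right-hand side is $1 + 1 + 2 = 4$, which agrees with $a(1)b(1) - a(0)b(0) = 2 \cdot 2 - 0 \cdot 0 = 4$. Without the jump term the formula would be off by exactly $2$.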
[/step]