Injectivity For Time-Augmented Paths (Theorem # 2494)
Theorem
Let $x, y \in C_p([a,b], V)$ with $p \in [1,2)$. Suppose that in some basis $(e_1, \ldots, e_d)$ for $V$ the first coordinates of $x$ and $y$ agree and are strictly monotone:
\begin{align*}
x^1 = y^1 = \rho : [a,b] \to \mathbb{R}, \qquad \rho \text{ strictly monotone}.
\end{align*}
If $x_a = y_a$ and $S(x) = S(y)$, then $x = y$.
Proof
[proofplan]
The key observation is that a strictly monotone first coordinate behaves like a "clock" against which we can integrate. After reparameterizing both paths so that this clock becomes the identity $i(t) = t$, signature components of the form $S(x)^{(1, 1, \dots, 1, j)}$ encode polynomial moments $\int t^k \, dx^j_t$ of each remaining coordinate. Since these moments agree for $x$ and $y$, integration by parts converts their equality into $\int t^k x^j_t \, d\mathcal{L}^1(t) = \int t^k y^j_t \, d\mathcal{L}^1(t)$ for every $k \ge 0$, so $x^j - y^j$ is orthogonal to all polynomials in $L^2([a,b])$. The Weierstrass approximation theorem then forces $x^j - y^j$ to vanish identically. Since the first coordinate already agrees by hypothesis, $x = y$.
[/proofplan]
[step:Reduce to the case where the first coordinate is the identity $i(t) = t$]
The implication $x = y \Rightarrow S(x) = S(y)$ is immediate from the definition of the signature, so we focus on the converse.
By replacing $(x, y)$ with $(-x, -y)$ if necessary, assume without loss of generality that $\rho: [a, b] \to \mathbb{R}$ is strictly **increasing** (the negation flips the monotonicity, preserves equality of $x_a$ and $y_a$, and preserves equality of signatures because $S(-x)^{(i_1, \dots, i_k)} = (-1)^k S(x)^{(i_1, \dots, i_k)}$, which is the same transformation applied to $x$ and $y$).
Set $[\alpha, \beta] := \rho([a, b])$, which is a closed interval since $\rho$ is continuous and $[a,b]$ is compact and connected. Strict monotonicity makes $\rho: [a, b] \to [\alpha, \beta]$ a continuous bijection, hence a homeomorphism by compactness, with continuous strictly increasing inverse $\sigma := \rho^{-1}: [\alpha, \beta] \to [a, b]$.
Define the reparameterized paths
\begin{align*}
\tilde x: [\alpha, \beta] &\to V, & \tilde x_t := x_{\sigma(t)}, \\
\tilde y: [\alpha, \beta] &\to V, & \tilde y_t := y_{\sigma(t)}.
\end{align*}
The first coordinate of $\tilde x$ is $\tilde x^1_t = x^1_{\sigma(t)} = \rho(\sigma(t)) = t$, so $\tilde x^1 = \tilde y^1 = i$ where $i: [\alpha, \beta] \to \mathbb{R}$, $t \mapsto t$, is the identity. Apply [Reparameterization Invariance](/theorems/2492) (the hypothesis $\sigma$ continuous and non-decreasing is satisfied) to conclude
\begin{align*}
S(\tilde x)_{[\alpha, \beta]} = S(x)_{[a, b]} = S(y)_{[a, b]} = S(\tilde y)_{[\alpha, \beta]}.
\end{align*}
We also have $\tilde x_\alpha = x_a = y_a = \tilde y_\alpha$. Reparameterization preserves all the data of the problem, so it suffices to prove $\tilde x = \tilde y$ on $[\alpha, \beta]$ (this implies $x = y$ on $[a, b]$ by composing with $\rho$). To lighten notation we relabel $[\alpha, \beta]$ as $[a, b]$ and $\tilde x, \tilde y$ as $x, y$, so from this point onward $x^1_t = y^1_t = t$ for all $t \in [a, b]$.
Write $x^- := (x^2, \dots, x^d): [a, b] \to \mathbb{R}^{d-1}$ and analogously $y^-$, so $x_t = (t, x^-_t)$ and $y_t = (t, y^-_t)$. The goal becomes: $x^- = y^-$.
[guided]
The hypothesis is that one coordinate $\rho$ of $x$ (and the same of $y$) is **strictly monotone**. The strategy is to use this special coordinate as a clock, normalising the time parameter so the clock reads $\rho(t) = t$. Three things need verifying along the way: (i) WLOG strictly increasing, (ii) the reparameterized paths are well-defined and live on a common interval, (iii) signature equality is preserved under the reparameterization.
Step (i): WLOG strictly increasing. The hypothesis says $\rho := x^1 = y^1$ is strictly monotone but does not specify the direction. If $\rho$ is strictly decreasing, then $-\rho$ is strictly increasing. Replacing $(x, y)$ with $(-x, -y)$ preserves the conclusion (we want to prove $x = y$, which is equivalent to $-x = -y$), turns the starting-point condition $x_a = y_a$ into the equivalent condition $-x_a = -y_a$, and preserves signature equality, because negation multiplies each level-$k$ component by the same sign:
\begin{align*}
S(-x)^{(i_1, \dots, i_k)}_{[a,b]} = (-1)^k S(x)^{(i_1, \dots, i_k)}_{[a,b]},
\end{align*}
since each iterated integral pulls out one factor of $-1$ per integration. The same identity holds for $y$, so $S(x) = S(y)$ implies $S(-x) = S(-y)$ component by component. Hence WLOG $\rho$ is strictly increasing.
Step (ii): the reparameterized interval and inverse. Set $[\alpha, \beta] := \rho([a,b])$. By continuity of $\rho$ and connectedness/compactness of $[a,b]$, this is a closed bounded interval. Strict monotonicity makes $\rho: [a,b] \to [\alpha, \beta]$ a continuous bijection between compact Hausdorff spaces, hence a homeomorphism. Its inverse $\sigma := \rho^{-1}: [\alpha, \beta] \to [a,b]$ is continuous and strictly increasing.
Define the reparameterized paths
\begin{align*}
\tilde x: [\alpha, \beta] &\to V, & \tilde x_t := x_{\sigma(t)}, \\
\tilde y: [\alpha, \beta] &\to V, & \tilde y_t := y_{\sigma(t)}.
\end{align*}
Computing the first coordinate:
\begin{align*}
\tilde x^1_t = x^1_{\sigma(t)} = \rho(\sigma(t)) = t,
\end{align*}
so $\tilde x^1 = \tilde y^1 = i$, where $i: [\alpha, \beta] \to \mathbb{R}$ is the identity map.
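As a concrete illustration (the path below is a hypothetical example chosen for this sketch, not part of the theorem), take $[a, b] = [1, 2]$, $V = \mathbb{R}^2$ and $x_t = (t^2, t^3)$. Then $\rho(t) = t^2$ is strictly increasing, and the construction gives
\begin{align*}
[\alpha, \beta] = \rho([1, 2]) = [1, 4], \qquad \sigma(s) = \sqrt{s}, \qquad \tilde x_s = x_{\sigma(s)} = \bigl(s, s^{3/2}\bigr), \quad s \in [1, 4].
\end{align*}
The first coordinate of $\tilde x$ is the identity on $[1, 4]$, as expected, while the second coordinate records the same values as $x^2$ read against the new clock.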
Step (iii): signature equality is preserved. We apply [Reparameterization Invariance](/theorems/2492). The hypothesis of that theorem is that the reparameterization map is continuous and non-decreasing (allowing plateaus). Our $\sigma$ is continuous and strictly increasing — hence in particular continuous and non-decreasing — so theorem 2492 applies. It gives
\begin{align*}
S(\tilde x)_{[\alpha, \beta]} = S(x)_{[a, b]} = S(y)_{[a, b]} = S(\tilde y)_{[\alpha, \beta]},
\end{align*}
where the middle equality is the hypothesis. Likewise $\tilde x_\alpha = x_{\sigma(\alpha)} = x_a = y_a = \tilde y_\alpha$.
Bookkeeping. Reparameterization preserves all hypotheses, and the conclusion $\tilde x = \tilde y$ on $[\alpha, \beta]$ holds if and only if $x = y$ on $[a,b]$ (composing with the homeomorphism $\rho$ or its inverse $\sigma$). To declutter notation, relabel $[\alpha, \beta]$ as $[a,b]$ and $\tilde x, \tilde y$ as $x, y$. From this point on, we have the simplifying assumption $x^1_t = y^1_t = t$ for all $t \in [a, b]$.
Why does this normalisation help? Because $dx^1 = dy^1 = dt$ as Stieltjes measures; integrating against the first coordinate is the same as integrating against Lebesgue measure. This is what unlocks the moment computation in Step 2 — signature components of the form $S(x)^{(1, \dots, 1, j)}$ become polynomial moments $\int t^k \, dx^j_t$.
[/guided]
[/step]
[step:Compute the signature components $S(x)^{(1, \dots, 1, j)}$ as polynomial moments of $x^j$]
Fix $k \ge 0$ and $j \in \{1, \dots, d\}$. Consider the multi-index $w = (1, 1, \dots, 1, j)$ of length $k + 2$ — that is, $k + 1$ copies of the index $1$ followed by a single $j$. We compute $S(x)^w_{[a,b]}$.
By the definition of the iterated integral and the fact that $x^1_t = t$, hence $dx^1_t = dt = d\mathcal{L}^1(t)$ as a Lebesgue–Stieltjes measure on $[a, b]$,
\begin{align*}
S(x)^{(1, \dots, 1)}_{[a, t]} = \int_{a < s_1 < \dots < s_{k+1} < t} d\mathcal{L}^1(s_1) \cdots d\mathcal{L}^1(s_{k+1}) = \mathcal{L}^{k+1}(\Delta_{k+1}^{[a, t]}),
\end{align*}
where the multi-index $(1, \dots, 1)$ has length $k + 1$ and $\Delta_{k+1}^{[a, t]} := \{(s_1, \dots, s_{k+1}) \in [a, t]^{k+1} : a < s_1 < \dots < s_{k+1} < t\}$.
The standard simplex volume formula gives
\begin{align*}
\mathcal{L}^{k+1}(\Delta_{k+1}^{[a, t]}) = \frac{(t - a)^{k+1}}{(k+1)!}.
\end{align*}
(Same Fubini-on-the-simplex computation as in the proof of [Factorial Decay](/theorems/2493).) Thus
\begin{align*}
S(x)^{(1, \dots, 1)}_{[a, t]} = \frac{(t - a)^{k+1}}{(k+1)!}.
\end{align*}
Now apply this inside the outermost integral defining $S(x)^w$:
\begin{align*}
S(x)^{(1, \dots, 1, j)}_{[a, b]} = \int_a^b S(x)^{(1, \dots, 1)}_{[a, t]} \, dx^j_t = \frac{1}{(k+1)!} \int_a^b (t - a)^{k+1} \, dx^j_t.
\end{align*}
Identical reasoning for $y$ gives
\begin{align*}
S(y)^{(1, \dots, 1, j)}_{[a, b]} = \frac{1}{(k+1)!} \int_a^b (t - a)^{k+1} \, dy^j_t.
\end{align*}
The hypothesis $S(x) = S(y)$ implies $S(x)^w = S(y)^w$ for every multi-index, so
\begin{align*}
\int_a^b (t - a)^{k+1} \, dx^j_t = \int_a^b (t - a)^{k+1} \, dy^j_t \qquad \forall k \ge 0, \ j \in \{1, \dots, d\}.
\end{align*}
Expanding $(t - a)^{k+1}$ in the monomial basis and using linearity of Stieltjes integration in the integrand, an equivalent reformulation is
\begin{align*}
\int_a^b t^k \, dx^j_t = \int_a^b t^k \, dy^j_t \qquad \forall k \ge 0, \ j \in \{1, \dots, d\}.
\end{align*}
(The case $k = 0$ uses just the multi-index $(j)$, giving $x^j_b - x^j_a = y^j_b - y^j_a$; for $k \ge 1$ we obtain a triangular system in the monomial moments which we invert to extract individual moments.)
[guided]
The strategy. Extract polynomial moments of $x^j$ and $y^j$ from the signature equality, by selecting a specific family of multi-indices designed to interact nicely with the normalisation $x^1_t = t$ from Step 1. The chosen family is $w = (1, 1, \dots, 1, j)$ with $k+1$ leading $1$s and a single $j$ at the end, giving a multi-index of length $k+2$.
Why this family? Because integrating against $dx^1$ is integrating against $dt$ (since $x^1_t = t$), which produces polynomial pre-factors. Iterating this $k+1$ times produces the power $(t - a)^{k+1}$ divided by $(k+1)!$, i.e., a single shifted monomial up to a combinatorial constant. The final integration against $dx^j$ then converts the polynomial pre-factor into a moment of the path $x^j$.
Computing the prefix. By definition of the iterated integral and the identity $dx^1_t = dt = d\mathcal{L}^1(t)$ on $[a,b]$,
\begin{align*}
S(x)^{(1, \dots, 1)}_{[a, t]} = \int_{a < s_1 < \cdots < s_{k+1} < t} d\mathcal{L}^1(s_1) \cdots d\mathcal{L}^1(s_{k+1}) = \mathcal{L}^{k+1}(\Delta_{k+1}^{[a, t]}),
\end{align*}
where the multi-index $(1, \dots, 1)$ has length $k+1$ and $\Delta_{k+1}^{[a,t]} := \{(s_1, \dots, s_{k+1}) \in [a,t]^{k+1} : a < s_1 < \cdots < s_{k+1} < t\}$ is the standard ordered simplex.
The simplex volume. The cube $[a, t]^{k+1}$ decomposes (up to a measure-zero set of coincident coordinates) into $(k+1)!$ congruent simplices, one for each permutation $\pi \in S_{k+1}$ — the simplex for $\pi$ is $\{s_{\pi(1)} < s_{\pi(2)} < \cdots < s_{\pi(k+1)}\}$. By symmetry under permutation of coordinates (Lebesgue measure on $\mathbb{R}^{k+1}$ is invariant under coordinate permutations), all these simplices have the same Lebesgue measure, so
\begin{align*}
\mathcal{L}^{k+1}(\Delta_{k+1}^{[a, t]}) = \frac{\mathcal{L}^{k+1}([a,t]^{k+1})}{(k+1)!} = \frac{(t - a)^{k+1}}{(k+1)!}.
\end{align*}
This is the same calculation underlying [Factorial Decay](/theorems/2493).
Substituting back,
\begin{align*}
S(x)^{(1, \dots, 1)}_{[a, t]} = \frac{(t - a)^{k+1}}{(k+1)!}.
\end{align*}
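For small values of $k$ the formula can be checked by hand, integrating the innermost variable first. A sketch for $k + 1 = 2$ and $k + 1 = 3$:
\begin{align*}
\mathcal{L}^{2}(\Delta_{2}^{[a, t]}) &= \int_a^t \int_a^{s_2} d s_1 \, d s_2 = \int_a^t (s_2 - a) \, d s_2 = \frac{(t - a)^2}{2!}, \\
\mathcal{L}^{3}(\Delta_{3}^{[a, t]}) &= \int_a^t \frac{(s_3 - a)^2}{2!} \, d s_3 = \frac{(t - a)^3}{3!}.
\end{align*}
Each further integration raises the power by one and contributes the next factor of the factorial.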
Computing the full multi-index. Now integrate the prefix against $dx^j$ in the outermost integral:
\begin{align*}
S(x)^{(1, \dots, 1, j)}_{[a, b]} = \int_a^b S(x)^{(1, \dots, 1)}_{[a, t]} \, dx^j_t = \frac{1}{(k+1)!} \int_a^b (t - a)^{k+1} \, dx^j_t.
\end{align*}
The same calculation for $y$ gives the analogous expression with $y^j$ in place of $x^j$.
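As a sanity check on a hypothetical example (assumed only for illustration), take $[a, b] = [0, 1]$ and a coordinate $x^j_t = t^2$ with $j \ge 2$. The formula then gives
\begin{align*}
S(x)^{(1, \dots, 1, j)}_{[0, 1]} = \frac{1}{(k+1)!} \int_0^1 t^{k+1} \cdot 2t \, d\mathcal{L}^1(t) = \frac{2}{(k+1)! \, (k+3)}.
\end{align*}
For $k = 0$ this is $2/3$, which agrees with the direct computation $S(x)^{(1, j)}_{[0,1]} = \int_0^1 t \, d(t^2) = \int_0^1 2 t^2 \, d\mathcal{L}^1(t) = 2/3$.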
Extracting moment equality. The hypothesis $S(x) = S(y)$ in $T((V))$ implies coordinate-wise equality, in particular for the multi-index $w = (1, \dots, 1, j)$:
\begin{align*}
\int_a^b (t - a)^{k+1} \, dx^j_t = \int_a^b (t - a)^{k+1} \, dy^j_t \qquad \forall k \ge 0, \ j \in \{1, \dots, d\}.
\end{align*}
From shifted moments to ordinary moments. Expand $(t - a)^{k+1}$ in the standard monomial basis:
\begin{align*}
(t - a)^{k+1} = \sum_{m=0}^{k+1} \binom{k+1}{m} t^m (-a)^{k+1-m}.
\end{align*}
By linearity of Stieltjes integration in the integrand, equality of $\int (t-a)^{k+1} \, dx^j$ and $\int (t-a)^{k+1} \, dy^j$ for all $k \ge 0$, together with the level-1 equality $\int 1 \, dx^j = \int 1 \, dy^j$ from the multi-index $(j)$, is equivalent to equality of $\int t^m \, dx^j$ and $\int t^m \, dy^j$ for all $m \ge 0$ (a triangular linear system in the indices $k, m$, hence invertible).
Concretely: the multi-index $(j)$ gives $\int 1 \, dx^j = \int 1 \, dy^j$, i.e., $x^j_b - x^j_a = y^j_b - y^j_a$. For $k = 0$, expanding $(t - a)^1$ gives an equation involving $\int t \, dx^j$ and $\int 1 \, dx^j$; using the level-1 equation we extract $\int t \, dx^j = \int t \, dy^j$. By induction on $k$ we extract every monomial moment:
\begin{align*}
\int_a^b t^m \, dx^j_t = \int_a^b t^m \, dy^j_t \qquad \forall m \ge 0, \ j \in \{1, \dots, d\}.
\end{align*}
This is the moment equality we will exploit in Step 3 to reduce to an $L^2$-orthogonality statement about $x^j - y^j$ as a function of $t$.
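To make the inductive step explicit, here is the quadratic case written out as a sketch. The multi-index $(1, 1, j)$ gives equality of $\int_a^b (t - a)^2 \, dx^j_t$ and $\int_a^b (t - a)^2 \, dy^j_t$, and
\begin{align*}
\int_a^b (t - a)^2 \, dx^j_t = \int_a^b t^2 \, dx^j_t - 2a \int_a^b t \, dx^j_t + a^2 \int_a^b 1 \, dx^j_t,
\end{align*}
with the same expansion for $y^j$. Once the moments of order $0$ and $1$ are known to agree, subtracting the two expansions leaves $\int_a^b t^2 \, dx^j_t = \int_a^b t^2 \, dy^j_t$.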
Why this works only when $\rho$ is a coordinate. The moment construction relies critically on $x^1_t = t$. If no coordinate of $x$ were strictly monotone, no analogous moment computation would be available — and indeed, without time augmentation, the signature is not injective even on smooth paths (a famous example: the figure-eight loop traversed forwards and backwards has zero signature at every level $\ge 1$ but is not the constant path).
[/guided]
[/step]
[step:Convert moment equality of $dx^j$ to moment equality of $x^j$ via integration by parts]
We now translate the Stieltjes-moment equality of Step 2 into an $L^2$-orthogonality statement about $x^j - y^j$ as a function of $t$.
For each $j \ge 2$ and $k \ge 1$, [integration by parts](/theorems/???) for Stieltjes integrals (valid because $t \mapsto t^k$ is $C^1$ on $[a,b]$, hence of finite $q$-variation with $q = 1$, while $x^j$ is continuous of bounded variation when $p = 1$ and of finite $p$-variation when $1 < p < 2$, so the integral exists as a Young integral since $1/p + 1/q > 1$) gives
\begin{align*}
\int_a^b t^k \, dx^j_t = \bigl[t^k x^j_t\bigr]_a^b - \int_a^b k t^{k-1} x^j_t \, d\mathcal{L}^1(t) = b^k x^j_b - a^k x^j_a - k \int_a^b t^{k-1} x^j_t \, d\mathcal{L}^1(t),
\end{align*}
where we used $dt = d\mathcal{L}^1(t)$. The same identity holds for $y^j$ with $y$ in place of $x$. Subtracting,
\begin{align*}
\int_a^b t^k \, d(x^j - y^j)_t = b^k(x^j_b - y^j_b) - a^k(x^j_a - y^j_a) - k \int_a^b t^{k-1} (x^j_t - y^j_t) \, d\mathcal{L}^1(t).
\end{align*}
The hypothesis $x_a = y_a$ implies $x^j_a = y^j_a$ for every $j$, so the second boundary term vanishes. The signature equality at level $1$ (multi-index $(j)$) gives $S(x)^{(j)} = S(y)^{(j)}$, i.e.,
\begin{align*}
x^j_b - x^j_a = y^j_b - y^j_a,
\end{align*}
which combined with $x^j_a = y^j_a$ yields $x^j_b = y^j_b$, eliminating the first boundary term as well. The left-hand side vanishes by Step 2. Thus, for every $k \ge 1$,
\begin{align*}
0 = -k \int_a^b t^{k-1} (x^j_t - y^j_t) \, d\mathcal{L}^1(t).
\end{align*}
Dividing by $-k$ and reindexing $m := k - 1 \ge 0$,
\begin{align*}
\int_a^b t^m (x^j_t - y^j_t) \, d\mathcal{L}^1(t) = 0 \qquad \forall m \ge 0, \ \forall j \in \{1, \dots, d\}.
\end{align*}
[guided]
The setup. Step 2 produced moment equality of the **derivatives** in the Stieltjes sense: $\int_a^b t^k \, dx^j_t = \int_a^b t^k \, dy^j_t$ for all $k \ge 0$. We want moment equality of the **paths themselves**: $\int_a^b t^m (x^j_t - y^j_t) \, d\mathcal{L}^1(t) = 0$. The bridge between $dx^j$ and $x^j$ is integration by parts.
Why integration by parts? Because [integration by parts](/theorems/???) for Stieltjes integrals trades a $dx^j$ on a smooth function for a $d\mathcal{L}^1$ on $x^j$ itself, modulo boundary terms. Concretely, for a $C^1$ function $\varphi$ and a continuous function $x^j$ of bounded variation,
\begin{align*}
\int_a^b \varphi(t) \, dx^j_t = \bigl[\varphi(t) x^j_t\bigr]_a^b - \int_a^b \varphi'(t) x^j_t \, d\mathcal{L}^1(t).
\end{align*}
Verifying hypotheses for our $\varphi(t) = t^k$: the function $t^k$ is $C^\infty$ on $[a,b]$ with derivative $k t^{k-1}$, so $\varphi$ is $C^1$. The integrator $x^j$ is continuous of bounded variation in the $p = 1$ case, and continuous Young-integrable in $q$-variation in the $1 < p < 2$ case (with $q$ chosen so $1/p + 1/q > 1$). In either case the integration-by-parts formula applies, and we get
\begin{align*}
\int_a^b t^k \, dx^j_t = b^k x^j_b - a^k x^j_a - k \int_a^b t^{k-1} x^j_t \, d\mathcal{L}^1(t).
\end{align*}
The same identity holds with $y^j$ in place of $x^j$.
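A quick check of the formula on a hypothetical example (assumed only for illustration): take $[a, b] = [0, 1]$, $x^j_t = t^2$ and $k = 2$. Both sides can be computed directly:
\begin{align*}
\int_0^1 t^2 \, d(t^2) &= \int_0^1 2 t^3 \, d\mathcal{L}^1(t) = \frac{1}{2}, \\
b^k x^j_b - a^k x^j_a - k \int_0^1 t^{k-1} x^j_t \, d\mathcal{L}^1(t) &= 1 - 0 - 2 \int_0^1 t^3 \, d\mathcal{L}^1(t) = \frac{1}{2}.
\end{align*}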
Subtracting and analysing the boundary terms. Subtract the $y$-version from the $x$-version:
\begin{align*}
\int_a^b t^k \, d(x^j - y^j)_t = b^k(x^j_b - y^j_b) - a^k(x^j_a - y^j_a) - k \int_a^b t^{k-1} (x^j_t - y^j_t) \, d\mathcal{L}^1(t).
\end{align*}
Every term in this identity **except** the last integral vanishes, for three separate reasons.
- **The starting-point boundary term** $a^k(x^j_a - y^j_a)$ vanishes because the hypothesis $x_a = y_a$ in $V$ implies $x^j_a = y^j_a$ for every coordinate $j$.
- **The endpoint boundary term** $b^k(x^j_b - y^j_b)$ vanishes because of the level-1 signature equality: the multi-index $(j)$ gives $S(x)^{(j)}_{[a,b]} = \int_a^b dx^j_t = x^j_b - x^j_a$, and similarly for $y$. The hypothesis $S(x) = S(y)$ at level 1 says $x^j_b - x^j_a = y^j_b - y^j_a$. Combined with $x^j_a = y^j_a$ from the previous bullet, this gives $x^j_b = y^j_b$.
- **The left-hand side** $\int_a^b t^k \, d(x^j - y^j)_t = \int_a^b t^k \, dx^j_t - \int_a^b t^k \, dy^j_t$ vanishes by Step 2.
Conclusion. After all three vanish, we are left with
\begin{align*}
0 = -k \int_a^b t^{k-1} (x^j_t - y^j_t) \, d\mathcal{L}^1(t) \qquad \forall k \ge 1.
\end{align*}
Dividing by $-k$ (legal for $k \ge 1$) and reindexing $m := k - 1 \ge 0$,
\begin{align*}
\int_a^b t^m (x^j_t - y^j_t) \, d\mathcal{L}^1(t) = 0 \qquad \forall m \ge 0, \ j \in \{1, \dots, d\}.
\end{align*}
This is the desired $L^2$-orthogonality statement: the difference $h^j := x^j - y^j$, viewed as a function of $t \in [a,b]$, is orthogonal to every monomial $t^m$ in the standard $L^2([a,b], \mathcal{L}^1)$ inner product.
What happens for $j = 1$? Recall $x^1_t = y^1_t = t$ after the normalisation in Step 1, so $h^1 \equiv 0$ identically and there is nothing to prove. The substantive content of the orthogonality statement is for $j \in \{2, \dots, d\}$, which is exactly the coordinates of $x^-$ and $y^-$ that we still need to identify. (For uniform notation we keep $j \in \{1, \dots, d\}$.)
The strategy succeeded because three distinct hypotheses fed three distinct cancellations: the starting point $x_a = y_a$, the level-1 signature equality $S(x)^{(j)} = S(y)^{(j)}$, and the higher-level signature equalities $S(x)^{(1, \dots, 1, j)} = S(y)^{(1, \dots, 1, j)}$ from Step 2. None can be omitted — without $x_a = y_a$ we would have a non-zero boundary at $t = a$ obstructing the cancellation; without level-1 signature equality, the boundary at $t = b$ would not match.
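The role of the starting point can be seen in a simple counterexample sketch (assuming $d \ge 2$; the constant $c$ is introduced here for illustration). Fix $c \ne 0$ and set $y_t := x_t + c \, e_2$. Then
\begin{align*}
y^1 = x^1, \qquad dy^j_t = dx^j_t \ \text{ for all } j, \qquad \text{hence} \quad S(y)_{[a,b]} = S(x)_{[a,b]},
\end{align*}
because the signature is built only from increments. Yet $y_a = x_a + c \, e_2 \ne x_a$ and $y \ne x$, so the conclusion fails once the hypothesis $x_a = y_a$ is dropped.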
[/guided]
[/step]
[step:Conclude $x^j = y^j$ via the Weierstrass approximation theorem]
Define $h^j: [a, b] \to \mathbb{R}$, $t \mapsto x^j_t - y^j_t$ for each $j$. By Step 3,
\begin{align*}
\int_a^b t^m h^j_t \, d\mathcal{L}^1(t) = 0 \qquad \forall m \ge 0.
\end{align*}
This says $h^j$ is $L^2$-orthogonal to every monomial $t^m$, hence (by linearity of the inner product) to every polynomial.
The function $h^j$ is continuous on $[a, b]$ (as the difference of two continuous paths) and hence lies in $L^2([a, b], \mathcal{L}^1)$. By the [Weierstrass approximation theorem](/theorems/???), polynomials are dense in $C([a, b])$ under the supremum norm. Since $C([a, b])$ embeds continuously into $L^2([a, b])$ with $\|f\|_{L^2}^2 \le (b - a) \|f\|_\infty^2$, uniform approximation implies $L^2$ approximation; in particular, the continuous function $h^j$ can be approximated by polynomials in $L^2([a, b])$.
For any $\varepsilon > 0$, pick a polynomial $P_\varepsilon$ with $\|h^j - P_\varepsilon\|_{L^2} < \varepsilon$. Then by the Cauchy–Schwarz inequality applied to the inner product $(f, g)_{L^2} = \int_a^b f \cdot g \, d\mathcal{L}^1$,
\begin{align*}
\|h^j\|_{L^2}^2 = (h^j, h^j)_{L^2} = (h^j, h^j - P_\varepsilon)_{L^2} + (h^j, P_\varepsilon)_{L^2} = (h^j, h^j - P_\varepsilon)_{L^2} \le \|h^j\|_{L^2} \cdot \varepsilon,
\end{align*}
where the third equality uses $(h^j, P_\varepsilon)_{L^2} = 0$ (orthogonality to polynomials). Letting $\varepsilon \to 0$, $\|h^j\|_{L^2}^2 \le 0$, hence $\|h^j\|_{L^2} = 0$, so $h^j = 0$ in $L^2([a, b])$. Equivalently, $x^j_t = y^j_t$ for $\mathcal{L}^1$-a.e. $t \in [a, b]$.
By continuity of $x^j$ and $y^j$, equality on a dense subset (a co-null set is dense) extends to equality everywhere on $[a, b]$:
\begin{align*}
x^j_t = y^j_t \qquad \forall t \in [a, b], \ \forall j \in \{1, \dots, d\}.
\end{align*}
Thus $x = y$ on $[a, b]$.
Reverting the reparameterization from Step 1 (composing with $\rho$, which is a homeomorphism), the same equality holds for the original $x, y \in C_p([a, b], V)$, completing the proof.
[guided]
Setup. Step 3 produced the moment equation $\int_a^b t^m h^j_t \, d\mathcal{L}^1(t) = 0$ for all $m \ge 0$ and $j \in \{1, \dots, d\}$, where $h^j_t := x^j_t - y^j_t$. We want to conclude $h^j \equiv 0$, that is, $x^j = y^j$ pointwise on $[a, b]$.
The mechanism: the moment problem for $L^2$. The function $h^j: [a, b] \to \mathbb{R}$ is continuous (as the difference of two continuous paths) and the interval $[a, b]$ is compact, so $\|h^j\|_{L^2}^2 = \int_a^b (h^j_t)^2 \, d\mathcal{L}^1(t) \le (b - a) \|h^j\|_\infty^2 < \infty$, i.e., $h^j \in L^2([a, b], \mathcal{L}^1)$.
In the Hilbert space $L^2([a, b], \mathcal{L}^1)$ with inner product $(f, g)_{L^2} := \int_a^b f \cdot g \, d\mathcal{L}^1$, our moment equation says
\begin{align*}
(h^j, t^m)_{L^2} = 0 \qquad \forall m \ge 0.
\end{align*}
By linearity of the inner product, this extends to $(h^j, P)_{L^2} = 0$ for **every** polynomial $P$.
Density of polynomials. By the [Weierstrass approximation theorem](/theorems/???), polynomials are dense in $C([a, b])$ under the supremum norm. The continuous embedding $C([a,b]) \hookrightarrow L^2([a, b])$ given by
\begin{align*}
\|f\|_{L^2}^2 = \int_a^b |f|^2 \, d\mathcal{L}^1 \le (b - a) \|f\|_\infty^2
\end{align*}
implies that uniform convergence in $C([a,b])$ entails $L^2$ convergence. Since $C([a,b])$ is dense in $L^2([a, b])$ (continuous functions are dense in $L^p$ on compact intervals), and polynomials are dense in $C([a,b])$, polynomials are dense in $L^2([a, b])$.
The Cauchy–Schwarz argument. Fix $\varepsilon > 0$. By density, choose a polynomial $P_\varepsilon$ with $\|h^j - P_\varepsilon\|_{L^2} < \varepsilon$. Then expanding the squared norm:
\begin{align*}
\|h^j\|_{L^2}^2 = (h^j, h^j)_{L^2} = (h^j, h^j - P_\varepsilon)_{L^2} + (h^j, P_\varepsilon)_{L^2}.
\end{align*}
The second inner product vanishes by orthogonality of $h^j$ to all polynomials. The first is bounded by [Cauchy–Schwarz](/theorems/???):
\begin{align*}
|(h^j, h^j - P_\varepsilon)_{L^2}| \le \|h^j\|_{L^2} \cdot \|h^j - P_\varepsilon\|_{L^2} \le \|h^j\|_{L^2} \cdot \varepsilon.
\end{align*}
Hence
\begin{align*}
\|h^j\|_{L^2}^2 \le \|h^j\|_{L^2} \cdot \varepsilon.
\end{align*}
If $\|h^j\|_{L^2} > 0$, divide both sides by $\|h^j\|_{L^2}$ to get $\|h^j\|_{L^2} \le \varepsilon$. Since $\varepsilon$ is arbitrary, $\|h^j\|_{L^2} = 0$ in either case. Hence $h^j = 0$ in $L^2([a, b])$, equivalently $x^j_t = y^j_t$ for $\mathcal{L}^1$-a.e. $t \in [a, b]$.
Upgrading to pointwise equality. The hypothesis is about pointwise paths, not equivalence classes mod $\mathcal{L}^1$-null sets. We need to upgrade a.e. equality to everywhere equality. The mechanism is continuity: $x^j$ and $y^j$ are continuous on $[a, b]$ (by hypothesis $x, y \in C_p$ are continuous paths), so $h^j = x^j - y^j$ is continuous. A continuous function on $[a, b]$ that vanishes a.e. vanishes everywhere — because if $h^j(t_0) \ne 0$ at some point, then by continuity $h^j$ is bounded away from zero on a neighbourhood of $t_0$, which has positive Lebesgue measure, contradicting a.e. vanishing.
Hence $x^j_t = y^j_t$ for **all** $t \in [a, b]$ and all $j \in \{1, \dots, d\}$. This means $x = y$ as paths in $V = \mathbb{R}^d$ on the reparameterized interval $[a, b]$.
Reverting the reparameterization. Recall the relabelling at the end of Step 1: we used $[a, b]$ for the rescaled interval $[\alpha, \beta] = \rho([a_{\text{old}}, b_{\text{old}}])$ and $x, y$ for the reparameterized paths $\tilde x = x_{\text{old}} \circ \rho^{-1}$. So $\tilde x = \tilde y$ on $[\alpha, \beta]$ implies $x_{\text{old}} = \tilde x \circ \rho = \tilde y \circ \rho = y_{\text{old}}$ on $[a_{\text{old}}, b_{\text{old}}]$. The conclusion holds for the original paths.
Why this argument fails without continuity. If we allowed $x^j, y^j$ to be merely measurable (or only $L^p$), then we could only conclude $x^j = y^j$ a.e., not everywhere. This is fine for many purposes but not for proving $x = y$ as **paths** (which are by definition pointwise functions). The hypothesis $x, y \in C_p$ — continuous paths — is exactly what makes the upgrade to pointwise equality possible.
[/guided]
[/step]