Span of Signatures (Theorem # 2496)
Theorem
For every $N \in \mathbb{N}$, the truncated tensor algebra satisfies
\begin{align*}
T_N(V) = \operatorname{span}(\mathcal{S}_N),
\end{align*}
where $\mathcal{S}_N = \pi_N(\mathcal{S})$ denotes the set of truncated signatures of all paths in $C_p(V)$.
Proof
[proofplan]
The truncated tensor algebra $T_N(V) = \bigoplus_{k=0}^N V^{\otimes k}$ is spanned by pure tensors of the form $a_1 \otimes \cdots \otimes a_k$ with $k \le N$ and $a_i \in V$, so it suffices to exhibit each such pure tensor as a linear combination of truncated signatures. The key is that the signature of the linear path $x^a_t = ta$ on $[0,1]$ equals $\exp(i_1(a))$ in $T((V))$, where $i_1: V \hookrightarrow T((V))$ is the canonical inclusion at level $1$. Inverting via the tensor logarithm and exploiting Chen's identity (which links products of signatures to signatures of concatenations) produces $i_1(a)$ as a linear combination of signatures, and thus does the same for arbitrary words $a_1 \otimes \cdots \otimes a_k$ via the level-$k$ component of $\exp(i_1(a_1)) \cdots \exp(i_1(a_k))$. Truncating at level $N$ converts the formal series into a finite linear combination, which yields the spanning property.
[/proofplan]
[step:Reduce to expressing pure tensors $a_1 \otimes \cdots \otimes a_k$ with $k \le N$ as linear combinations of signatures]
The truncated tensor algebra decomposes as
\begin{align*}
T_N(V) = \bigoplus_{k=0}^{N} V^{\otimes k},
\end{align*}
and each summand $V^{\otimes k}$ is, by definition, spanned by the pure tensors $\{a_1 \otimes \cdots \otimes a_k : a_1, \dots, a_k \in V\}$. The level-$0$ component is spanned by $\mathbf{1} = (1, 0, \dots, 0) = \pi_N(S(x^0)_{[0,1]}) \in \mathcal{S}_N$, where $x^0$ is the constant path. To prove the theorem it is therefore enough to show that for every $1 \le k \le N$ and every $(a_1, \dots, a_k) \in V^k$, the element $a_1 \otimes \cdots \otimes a_k$ (viewed as an element of $T_N(V)$ via the inclusion $V^{\otimes k} \hookrightarrow T_N(V)$) lies in $\operatorname{span}(\mathcal{S}_N)$.
[guided]
The truncated tensor algebra is, by definition, the direct sum
\begin{align*}
T_N(V) = \bigoplus_{k=0}^{N} V^{\otimes k},
\end{align*}
and a basic fact of multilinear algebra is that $V^{\otimes k}$ is spanned (over the underlying field) by pure tensors $a_1 \otimes \cdots \otimes a_k$ with $a_i \in V$. So $T_N(V)$ is spanned by the union of these pure tensors over $k = 0, 1, \dots, N$.
What does each level contribute? The level-$0$ slot consists of scalars; the unit element is $\mathbf{1} = (1, 0, 0, \dots)$. We can already realise $\mathbf{1}$ as a signature: take the constant path $x^0_t \equiv 0$. Its signature has all iterated integrals equal to $0$ except the level-$0$ scalar $1$, so $S(x^0)_{[0,1]} = \mathbf{1}$, and consequently $\pi_N(S(x^0)_{[0,1]}) = \mathbf{1} \in \mathcal{S}_N$.
Why does this reduction help? Because once we know that every pure tensor $a_1 \otimes \cdots \otimes a_k$ for $k \le N$ lies in $\operatorname{span}(\mathcal{S}_N)$, linearity gives us $V^{\otimes k} \subseteq \operatorname{span}(\mathcal{S}_N)$ for each such $k$, and taking the direct sum gives $T_N(V) \subseteq \operatorname{span}(\mathcal{S}_N)$. The reverse inclusion $\operatorname{span}(\mathcal{S}_N) \subseteq T_N(V)$ holds tautologically since $\mathcal{S}_N \subseteq T_N(V)$ by construction.
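To make the reduction concrete, here is a small illustration under an assumption not made in the source: take $V = \mathbb{R}^2$ with basis $e_1, e_2$ and $N = 2$. Then
\begin{align*}
T_2(\mathbb{R}^2) &= \mathbb{R} \oplus \mathbb{R}^2 \oplus (\mathbb{R}^2)^{\otimes 2}, \\
\dim T_2(\mathbb{R}^2) &= 1 + 2 + 4 = 7,
\end{align*}
and a spanning set of pure tensors is given by $\mathbf{1}$, $e_1$, $e_2$, and $e_i \otimes e_j$ for $i, j \in \{1, 2\}$. In this example the reduction asks for each of these seven tensors to be written as a finite linear combination of truncated signatures.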
[/guided]
[/step]
[step:Compute the signature of the linear path $x^a$ as $\exp(i_1(a))$]
Fix $a \in V$. Define the linear path
\begin{align*}
x^a: [0,1] &\to V \\
t &\mapsto t a.
\end{align*}
Then $x^a \in C_p([0,1], V)$ for every $p \in [1, \infty)$, since $x^a$ is $C^1$ and hence of bounded variation. The increment is $x^a_t - x^a_s = (t-s) a$ for $0 \le s \le t \le 1$. The level-$k$ iterated integral of $x^a$ is
\begin{align*}
S(x^a)_{[0,1]}^{(k)} = \int_{0 \le t_1 \le \cdots \le t_k \le 1} dx^a_{t_1} \otimes \cdots \otimes dx^a_{t_k} = a^{\otimes k} \int_{0 \le t_1 \le \cdots \le t_k \le 1} d\mathcal{L}^1(t_1) \cdots d\mathcal{L}^1(t_k) = \frac{1}{k!} a^{\otimes k},
\end{align*}
where the last equality holds because the $k$-simplex $\{0 \le t_1 \le \cdots \le t_k \le 1\}$ has $\mathcal{L}^k$-measure $1/k!$. Therefore
\begin{align*}
S(x^a)_{[0,1]} = \sum_{k=0}^{\infty} \frac{1}{k!} a^{\otimes k} = \sum_{k=0}^{\infty} \frac{1}{k!} (i_1(a))^{\otimes k} = \exp(i_1(a)) \quad \text{in } T((V)),
\end{align*}
where $i_1: V \to T((V))$ is the canonical level-$1$ inclusion and $\exp$ denotes the tensor exponential.
[guided]
We work with the linear path
\begin{align*}
x^a: [0,1] &\to V \\
t &\mapsto t a.
\end{align*}
Since $x^a$ is smooth (in fact, affine), it has bounded total variation, hence finite $p$-variation for every $p \ge 1$, so $x^a \in C_p([0,1], V)$ in particular for $p < 2$. The increment is $x^a_t - x^a_s = ta - sa = (t-s)a$, and in differential form $dx^a_t = a \, d\mathcal{L}^1(t)$.
The level-$k$ component of the signature is the iterated integral
\begin{align*}
S(x^a)_{[0,1]}^{(k)} = \int_{0 \le t_1 \le \cdots \le t_k \le 1} dx^a_{t_1} \otimes \cdots \otimes dx^a_{t_k}.
\end{align*}
Substituting $dx^a_{t_i} = a \, d\mathcal{L}^1(t_i)$ pulls the constant tensor $a^{\otimes k}$ out of the integral:
\begin{align*}
S(x^a)_{[0,1]}^{(k)} = a^{\otimes k} \int_{0 \le t_1 \le \cdots \le t_k \le 1} d\mathcal{L}^1(t_1) \cdots d\mathcal{L}^1(t_k).
\end{align*}
The remaining scalar integral is the volume of the standard $k$-simplex with respect to $\mathcal{L}^k$, which equals $1/k!$. Hence
\begin{align*}
S(x^a)_{[0,1]}^{(k)} = \frac{a^{\otimes k}}{k!}.
\end{align*}
Summing over $k$:
\begin{align*}
S(x^a)_{[0,1]} = \sum_{k=0}^{\infty} \frac{a^{\otimes k}}{k!}.
\end{align*}
Recognising the right-hand side as the formal tensor exponential $\sum_{k \ge 0} v^{\otimes k}/k!$ evaluated at $v = i_1(a) \in T((V))$,
\begin{align*}
S(x^a)_{[0,1]} = \exp(i_1(a)).
\end{align*}
This identity is the linchpin: it converts the geometric object $S(x^a)$ into an algebraic exponential, which we can invert via $\log$.
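As a small worked check of the simplex volume (an illustrative computation, not part of the original argument), the cases $k = 2$ and $k = 3$ can be integrated directly:
\begin{align*}
\int_{0 \le t_1 \le t_2 \le 1} d\mathcal{L}^1(t_1) \, d\mathcal{L}^1(t_2) &= \int_0^1 t_2 \, dt_2 = \frac{1}{2}, \\
\int_{0 \le t_1 \le t_2 \le t_3 \le 1} d\mathcal{L}^1(t_1) \, d\mathcal{L}^1(t_2) \, d\mathcal{L}^1(t_3) &= \int_0^1 \frac{t_3^2}{2} \, dt_3 = \frac{1}{6}.
\end{align*}
Truncating at level $3$, the signature of the linear path is therefore
\begin{align*}
\pi_3\!\left(S(x^a)_{[0,1]}\right) = \left(1, \; a, \; \tfrac{1}{2} a^{\otimes 2}, \; \tfrac{1}{6} a^{\otimes 3}\right),
\end{align*}
which matches the first four terms of the exponential series.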
[/guided]
[/step]
[step:Recover $i_1(a)$ as a series of products of signatures via $\log$ and Chen's identity]
Apply the tensor logarithm $\log: \mathbf{1} + T_+((V)) \to T_+((V))$ — defined by the formal power series
\begin{align*}
\log(\mathbf{1} + u) = \sum_{m=1}^{\infty} \frac{(-1)^{m+1}}{m} u^{\otimes m}, \qquad u \in T_+((V)),
\end{align*}
where $T_+((V))$ denotes elements with vanishing scalar part — to the identity of the previous step. Since $\exp$ and $\log$ are mutual inverses on the relevant subsets of $T((V))$,
\begin{align*}
i_1(a) = \log(\exp(i_1(a))) = \log(S(x^a)_{[0,1]}) = \sum_{m=1}^{\infty} \frac{(-1)^{m+1}}{m} (S(x^a)_{[0,1]} - \mathbf{1})^{\otimes m}.
\end{align*}
For any positive integer $m$, write $x^a *^m := x^a * x^a * \cdots * x^a$ ($m$ copies concatenated). By Chen's identity,
\begin{align*}
S(x^a *^m)_{[0,m]} = S(x^a)_{[0,1]}^{\otimes m} = \exp(i_1(a))^{\otimes m} = \exp(m \, i_1(a)),
\end{align*}
where the final equality uses commutativity of the powers of the single element $i_1(a)$ in $T((V))$. In particular, every product $S(x^a)_{[0,1]}^{\otimes m}$ equals $S(x^a *^m)_{[0,m]} \in \mathcal{S}$. Expanding $(S(x^a)_{[0,1]} - \mathbf{1})^{\otimes m}$ via the binomial theorem in the (commutative) subalgebra generated by $S(x^a)_{[0,1]}$,
\begin{align*}
(S(x^a)_{[0,1]} - \mathbf{1})^{\otimes m} = \sum_{j=0}^{m} \binom{m}{j} (-1)^{m-j} S(x^a)_{[0,1]}^{\otimes j} = \sum_{j=0}^{m} \binom{m}{j} (-1)^{m-j} S(x^a *^j)_{[0,j]},
\end{align*}
with the convention $x^a *^0$ being the constant path so that $S(x^a *^0) = \mathbf{1} \in \mathcal{S}$. Substituting,
\begin{align*}
i_1(a) = \sum_{m=1}^{\infty} \frac{(-1)^{m+1}}{m} \sum_{j=0}^{m} \binom{m}{j} (-1)^{m-j} S(x^a *^j)_{[0,j]} \quad \text{in } T((V)).
\end{align*}
Each finite partial sum on the right is a $\mathbb{Q}$-linear combination of signatures.
[guided]
We now invert the identity $S(x^a)_{[0,1]} = \exp(i_1(a))$ to extract $i_1(a)$. The tensor logarithm is the formal series
\begin{align*}
\log: \mathbf{1} + T_+((V)) &\to T_+((V)) \\
\mathbf{1} + u &\mapsto \sum_{m=1}^{\infty} \frac{(-1)^{m+1}}{m} u^{\otimes m},
\end{align*}
where $T_+((V)) = \prod_{k \ge 1} V^{\otimes k}$ is the augmentation ideal. The series converges level-wise because the level-$k$ component of $u^{\otimes m}$ vanishes whenever $m > k$, so only finitely many terms contribute at each level. The identity $\log \circ \exp = \operatorname{id}$ on $T_+((V))$ holds by formal power-series calculation (and is just the classical identity for the exponential and logarithm of formal series). Applying it,
\begin{align*}
i_1(a) = \log(\exp(i_1(a))) = \log(S(x^a)_{[0,1]}) = \sum_{m=1}^{\infty} \frac{(-1)^{m+1}}{m} (S(x^a)_{[0,1]} - \mathbf{1})^{\otimes m}.
\end{align*}
The right-hand side involves tensor powers of the signature itself. To show that each such power is a signature, we use **Chen's identity**: for any two paths $x \in C_p([s, t], V)$ and $y \in C_p([t, u], V)$ that join up at $t$ (so the concatenation $x * y$ is well-defined), one has
\begin{align*}
S(x * y)_{[s, u]} = S(x)_{[s, t]} \otimes S(y)_{[t, u]}.
\end{align*}
Iterating Chen's identity for $m$ copies of $x^a$ — that is, building the path $x^a *^m$ by concatenating $m$ copies of $x^a$ on $[0, m]$ — yields
\begin{align*}
S(x^a *^m)_{[0, m]} = S(x^a)_{[0, 1]}^{\otimes m}.
\end{align*}
This is the crucial fact: a tensor power of a signature is itself a signature (of the concatenated path).
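As a sanity check of the iterated identity for $m = 2$ (a sketch that writes $a$ for $i_1(a)$ and compares both sides only up to level $2$): the concatenation $x^a * x^a$ on $[0, 2]$ is the path $t \mapsto ta$, and the same simplex computation on $[0, 2]$ gives level-$k$ component $2^k a^{\otimes k}/k!$. Hence
\begin{align*}
S(x^a *^2)_{[0,2]} &= \mathbf{1} + 2a + 2 \, a^{\otimes 2} + \cdots, \\
S(x^a)_{[0,1]}^{\otimes 2} &= \left(\mathbf{1} + a + \tfrac{1}{2} a^{\otimes 2} + \cdots\right) \otimes \left(\mathbf{1} + a + \tfrac{1}{2} a^{\otimes 2} + \cdots\right) \\
&= \mathbf{1} + 2a + \left(\tfrac{1}{2} + 1 + \tfrac{1}{2}\right) a^{\otimes 2} + \cdots,
\end{align*}
and the two expansions agree through level $2$.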
To handle $(S(x^a)_{[0,1]} - \mathbf{1})^{\otimes m}$, expand by the binomial theorem. The element $S(x^a)_{[0,1]} = \exp(i_1(a))$ commutes with $\mathbf{1}$ (the unit) in $T((V))$, so the binomial theorem applies in the (commutative) subalgebra generated by $S(x^a)_{[0,1]}$:
\begin{align*}
(S(x^a)_{[0,1]} - \mathbf{1})^{\otimes m} &= \sum_{j=0}^{m} \binom{m}{j} (-1)^{m-j} \, \mathbf{1}^{\otimes (m-j)} \otimes S(x^a)_{[0,1]}^{\otimes j} \\
&= \sum_{j=0}^{m} \binom{m}{j} (-1)^{m-j} \, S(x^a *^j)_{[0, j]},
\end{align*}
adopting the convention that $x^a *^0$ is the constant path on $\{0\}$ (or, equivalently, no path) so $S(x^a *^0) = \mathbf{1} \in \mathcal{S}$.
Substituting back,
\begin{align*}
i_1(a) = \sum_{m=1}^{\infty} \frac{(-1)^{m+1}}{m} \sum_{j=0}^{m} \binom{m}{j} (-1)^{m-j} \, S(x^a *^j)_{[0, j]}.
\end{align*}
Each finite partial sum is a finite $\mathbb{Q}$-linear combination of signatures. We have therefore expressed $i_1(a)$ as the limit (in $T((V))$, level-wise) of finite linear combinations of signatures.
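A quick check of the logarithm series through level $2$ may help here. This is an illustrative computation only; it writes $a$ for $i_1(a)$, sets $u := S(x^a)_{[0,1]} - \mathbf{1}$, and uses the shorthand $S_j := S(x^a *^j)_{[0,j]}$, introduced just for this example, whose level-$k$ component is $j^k a^{\otimes k}/k!$.
\begin{align*}
u &= a + \tfrac{1}{2} a^{\otimes 2} + (\text{levels} \ge 3), \\
u^{\otimes 2} &= a^{\otimes 2} + (\text{levels} \ge 3), \\
u - \tfrac{1}{2} u^{\otimes 2} &= a + (\text{levels} \ge 3).
\end{align*}
The terms with $m \ge 3$ only affect levels $\ge 3$, so the level-$1$ and level-$2$ components of the series are $a$ and $0$, as expected. The binomial expansion for $m = 2$ can be checked in the same way:
\begin{align*}
S_2 - 2 S_1 + S_0 = \left(\mathbf{1} + 2a + 2 \, a^{\otimes 2}\right) - 2\left(\mathbf{1} + a + \tfrac{1}{2} a^{\otimes 2}\right) + \mathbf{1} + (\text{levels} \ge 3) = a^{\otimes 2} + (\text{levels} \ge 3).
\end{align*}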
[/guided]
[/step]
[step:Truncate at level $N$ to express $i_1^N(a)$ as a finite linear combination of truncated signatures]
The projection $\pi_N: T((V)) \to T_N(V)$ is a linear map. The level-$k$ component of $S(x^a *^j)_{[0,j]} = \exp(j \, i_1(a))$ equals $j^k a^{\otimes k}/k!$ (with the convention $0^0 = 1$, so that the level-$0$ component is always $1$). For any fixed level $k$, the level-$k$ component of $(S(x^a)_{[0,1]} - \mathbf{1})^{\otimes m}$ vanishes whenever $m > k$, since $S(x^a)_{[0,1]} - \mathbf{1}$ has zero level-$0$ component and the level-$k$ component of an $m$-fold tensor product of elements with zero level-$0$ component is a sum over decompositions $k = k_1 + \cdots + k_m$ with each $k_i \ge 1$, which forces $m \le k$.
Therefore
\begin{align*}
\pi_N(i_1(a)) = i_1^N(a) = \sum_{m=1}^{N} \frac{(-1)^{m+1}}{m} \pi_N\!\left((S(x^a)_{[0,1]} - \mathbf{1})^{\otimes m}\right) = \sum_{m=1}^{N} \sum_{j=0}^{m} \frac{(-1)^{m+1}}{m} \binom{m}{j} (-1)^{m-j} \pi_N\!\left(S(x^a *^j)_{[0,j]}\right),
\end{align*}
which is a finite $\mathbb{Q}$-linear combination of elements of $\mathcal{S}_N$. Hence $i_1^N(a) \in \operatorname{span}(\mathcal{S}_N)$ for every $a \in V$.
[guided]
The series $i_1(a) = \sum_{m \ge 1} \frac{(-1)^{m+1}}{m} (S(x^a)_{[0,1]} - \mathbf{1})^{\otimes m}$ is infinite, but truncation collapses it to a finite sum. Why?
The element $S(x^a)_{[0,1]} - \mathbf{1}$ has its level-$0$ component equal to $0$ (since both $S(x^a)_{[0,1]}$ and $\mathbf{1}$ have level-$0$ component $1$). When you take an $m$-fold tensor product of elements whose level-$0$ component is zero, the result has all components of level $< m$ equal to zero. More precisely, the level-$k$ component of $u^{\otimes m}$ is
\begin{align*}
(u^{\otimes m})^{(k)} = \sum_{k_1 + \cdots + k_m = k} u^{(k_1)} \otimes \cdots \otimes u^{(k_m)},
\end{align*}
and since $u^{(0)} = 0$ for $u = S(x^a)_{[0,1]} - \mathbf{1}$, every term with some $k_i = 0$ drops out. The remaining non-zero terms require all $k_i \ge 1$, so $k = \sum k_i \ge m$. Hence $(u^{\otimes m})^{(k)} = 0$ whenever $m > k$.
Applying $\pi_N$ to the series for $i_1(a)$, the level-$k$ contribution from each $m$ vanishes when $m > k$, and so the only $m$ that can contribute to levels $1, \dots, N$ are $m = 1, \dots, N$. Therefore
\begin{align*}
i_1^N(a) := \pi_N(i_1(a)) = \sum_{m=1}^{N} \frac{(-1)^{m+1}}{m} \pi_N\!\left((S(x^a)_{[0,1]} - \mathbf{1})^{\otimes m}\right).
\end{align*}
Substituting the binomial expansion derived above,
\begin{align*}
i_1^N(a) = \sum_{m=1}^{N} \sum_{j=0}^{m} \frac{(-1)^{m+1}}{m} \binom{m}{j} (-1)^{m-j} \pi_N\!\left(S(x^a *^j)_{[0, j]}\right),
\end{align*}
a finite $\mathbb{Q}$-linear combination of elements of $\mathcal{S}_N = \pi_N(\mathcal{S})$. Therefore $i_1^N(a) \in \operatorname{span}(\mathcal{S}_N)$ for every $a \in V$, which establishes $V^{\otimes 1} \subset \operatorname{span}(\mathcal{S}_N)$.
The reason truncation is essential: without it, $i_1(a)$ is defined only as a formal limit in $T((V))$, and "linear combination" over an infinite sum requires care. After truncation, only finitely many signatures are involved, and the result is a genuine finite linear combination — the only kind that "spans" requires.
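For illustration, the case $N = 2$ can be written out in full. Here $S_j := \pi_2\!\left(S(x^a *^j)_{[0,j]}\right) = \mathbf{1} + j a + \tfrac{j^2}{2} a^{\otimes 2}$ is shorthand introduced only for this example, and $a$ stands for $i_1(a)$. Collecting the $m = 1$ and $m = 2$ terms gives
\begin{align*}
i_1^2(a) &= (S_1 - S_0) - \tfrac{1}{2}\left(S_2 - 2 S_1 + S_0\right) \\
&= -\tfrac{3}{2} S_0 + 2 S_1 - \tfrac{1}{2} S_2.
\end{align*}
Checking level by level:
\begin{align*}
\text{level } 0: &\quad -\tfrac{3}{2} + 2 - \tfrac{1}{2} = 0, \\
\text{level } 1: &\quad \left(2 \cdot 1 - \tfrac{1}{2} \cdot 2\right) a = a, \\
\text{level } 2: &\quad \left(2 \cdot \tfrac{1}{2} - \tfrac{1}{2} \cdot 2\right) a^{\otimes 2} = 0,
\end{align*}
so the combination of three truncated signatures equals $(0, a, 0)$ in $T_2(V)$.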
[/guided]
[/step]
[step:Extract the pure tensor $a_1 \otimes \cdots \otimes a_k$ from concatenations of linear paths via Chen's identity]
Fix $k \in \{1, \dots, N\}$ and $a_1, \dots, a_k \in V$. Concatenate the linear paths to form $y := x^{a_1} * x^{a_2} * \cdots * x^{a_k} \in C_p([0, k], V)$. By repeated application of Chen's identity,
\begin{align*}
S(y)_{[0, k]} = S(x^{a_1})_{[0,1]} \otimes S(x^{a_2})_{[0,1]} \otimes \cdots \otimes S(x^{a_k})_{[0,1]} = \exp(i_1(a_1)) \otimes \exp(i_1(a_2)) \otimes \cdots \otimes \exp(i_1(a_k)) \quad \text{in } T((V)).
\end{align*}
Expanding each exponential as a power series and collecting the level-$k$ component:
\begin{align*}
S(y)_{[0, k]}^{(k)} = \sum_{(m_1, \dots, m_k): m_1 + \cdots + m_k = k, \, m_i \ge 0} \frac{1}{m_1! \cdots m_k!} a_1^{\otimes m_1} \otimes a_2^{\otimes m_2} \otimes \cdots \otimes a_k^{\otimes m_k}.
\end{align*}
Among the multi-indices $(m_1, \dots, m_k)$ with $\sum m_i = k$ and $m_i \ge 0$, the unique tuple with all $m_i \ge 1$ (forced since $\sum m_i = k$ and $k$ summands must each then equal exactly $1$) is $(1, 1, \dots, 1)$, contributing
\begin{align*}
\frac{1}{1! \cdots 1!} a_1 \otimes a_2 \otimes \cdots \otimes a_k = a_1 \otimes a_2 \otimes \cdots \otimes a_k.
\end{align*}
Every other tuple has some $m_i = 0$, so the vector $a_i$ does not appear in the corresponding term. Because the $m_i$ still sum to $k$, such a tuple also has some $m_j \ge 2$, and the vector $a_j$ then occupies $m_j$ consecutive slots through the factor $a_j^{\otimes m_j}/m_j!$. The shape of the level-$k$ component is therefore
\begin{align*}
S(y)_{[0, k]}^{(k)} = a_1 \otimes \cdots \otimes a_k + R(a_1, \dots, a_k),
\end{align*}
where $R(a_1, \dots, a_k)$ is a finite linear combination of pure tensors $b_1 \otimes \cdots \otimes b_k$ in $V^{\otimes k}$ with each $b_j$ chosen from $\{a_1, \dots, a_k\}$ but with at least one $a_i$ omitted (equivalently, at least one repeated). In particular, every term in $R$ corresponds to a multi-index $(m_1, \dots, m_k)$ with at least one $m_i = 0$ and at least one $m_j \ge 2$, whose contribution is
\begin{align*}
\frac{1}{m_1! \cdots m_k!} a_1^{\otimes m_1} \otimes \cdots \otimes a_k^{\otimes m_k}.
\end{align*}
[guided]
We aim to realise the pure tensor $a_1 \otimes \cdots \otimes a_k$ as a controllable quantity built from signatures. The idea: take linear paths $x^{a_1}, \dots, x^{a_k}$, concatenate them, and read off the level-$k$ component of the resulting signature.
Define
\begin{align*}
y := x^{a_1} * x^{a_2} * \cdots * x^{a_k}: [0, k] \to V,
\end{align*}
the path that runs along $a_1$ on $[0,1]$, then along $a_2$ on $[1,2]$, and so on. Each $x^{a_i}$ is in $C_p([i-1, i], V)$ for every $p \ge 1$, and concatenations of finite-$p$-variation paths are again of finite $p$-variation, so $y \in C_p([0, k], V)$.
Chen's identity states $S(\alpha * \beta) = S(\alpha) \otimes S(\beta)$ for compatible paths $\alpha, \beta$. Iterating:
\begin{align*}
S(y)_{[0, k]} = S(x^{a_1})_{[0,1]} \otimes S(x^{a_2})_{[0,1]} \otimes \cdots \otimes S(x^{a_k})_{[0,1]}.
\end{align*}
Substituting the formula $S(x^{a_i})_{[0,1]} = \exp(i_1(a_i))$ from the earlier step,
\begin{align*}
S(y)_{[0, k]} = \exp(i_1(a_1)) \otimes \exp(i_1(a_2)) \otimes \cdots \otimes \exp(i_1(a_k)).
\end{align*}
Now expand each exponential as a power series. The level-$k$ component of a tensor product is computed by summing all ways the levels can be partitioned across the factors. For $\exp(i_1(a_j)) = \sum_{m_j \ge 0} \frac{a_j^{\otimes m_j}}{m_j!}$, the level-$m_j$ part is $a_j^{\otimes m_j}/m_j!$. Therefore the level-$k$ part of the product is the sum over multi-indices $(m_1, \dots, m_k)$ with $m_1 + \cdots + m_k = k$ (each $m_i \ge 0$) of
\begin{align*}
\frac{a_1^{\otimes m_1}}{m_1!} \otimes \cdots \otimes \frac{a_k^{\otimes m_k}}{m_k!} = \frac{1}{m_1! \cdots m_k!} \, a_1^{\otimes m_1} \otimes \cdots \otimes a_k^{\otimes m_k}.
\end{align*}
Among these multi-indices, $(1, 1, \dots, 1)$ is special: it is the only one with every $m_i \ge 1$, and it gives the term $a_1 \otimes \cdots \otimes a_k$. Why is it the only such tuple? Because if all $m_i \ge 1$ and $\sum m_i = k$ with $k$ terms, the only solution is $m_i = 1$ for all $i$.
Every other tuple has some $m_{i_0} = 0$ (meaning the slot $a_{i_0}$ contributes nothing — it produces the level-$0$ scalar $1$ from $\exp$) and necessarily some $m_{j_0} \ge 2$ (to make the sum $k$). Splitting off the $(1, \dots, 1)$ contribution:
\begin{align*}
S(y)_{[0, k]}^{(k)} = a_1 \otimes \cdots \otimes a_k + R(a_1, \dots, a_k),
\end{align*}
where the **remainder** $R$ collects the contributions from all other tuples — terms in which the input vectors $a_1, \dots, a_k$ are not all used distinctly (some are repeated, some are dropped).
The shape of $R$ is what we exploit next: every term in $R$ has the same level $k$ but arises from a degenerate multi-index. We will eliminate $R$ by inclusion–exclusion in the final step.
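For a concrete picture of the remainder, consider $k = 2$ (an illustrative special case). The multi-indices with $m_1 + m_2 = 2$ are $(2, 0)$, $(1, 1)$ and $(0, 2)$, so
\begin{align*}
S(y)_{[0, 2]}^{(2)} = \tfrac{1}{2} a_1^{\otimes 2} + a_1 \otimes a_2 + \tfrac{1}{2} a_2^{\otimes 2},
\qquad
R(a_1, a_2) = \tfrac{1}{2} a_1^{\otimes 2} + \tfrac{1}{2} a_2^{\otimes 2}.
\end{align*}
Each term of $R$ uses only one of the two vectors, which is exactly the degeneracy described above.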
[/guided]
[/step]
[step:Eliminate the lower-order terms by inclusion–exclusion over subsets]
Define, for any (possibly empty) subset $J \subseteq \{1, \dots, k\}$, the path $y_J$ obtained by concatenating $x^{a_j}$ for $j \in J$ in increasing order (so $y_\emptyset$ is the constant path on a single point, with signature $\mathbf{1}$). By the same Chen-identity computation as above,
\begin{align*}
S(y_J)_{[0, |J|]}^{(k)} = \sum_{(m_j)_{j \in J}: \, \sum m_j = k, \, m_j \ge 0} \frac{1}{\prod_{j \in J} m_j!} \bigotimes_{j \in J}^{\to} a_j^{\otimes m_j},
\end{align*}
where $\bigotimes_{j \in J}^{\to}$ denotes the ordered tensor product. Since $J \subseteq \{1, \dots, k\}$, we always have $|J| \le k$. For $J = \emptyset$ and $k \ge 1$ no tuple satisfies the constraint, so the sum is empty and the level-$k$ component is $0$, in agreement with $S(y_\emptyset) = \mathbf{1}$.
Form the alternating sum
\begin{align*}
T_k := \sum_{J \subseteq \{1, \dots, k\}} (-1)^{k - |J|} S(y_J)_{[0, |J|]}^{(k)} \in V^{\otimes k}.
\end{align*}
Substitute the multi-index expansion and switch the order of summation. A tuple $(m_j)_{j \in J}$ is identified with the tuple $(m_1, \dots, m_k)$ obtained by setting $m_i = 0$ for $i \notin J$, which gives
\begin{align*}
T_k = \sum_{(m_1, \dots, m_k): m_i \ge 0, \, \sum m_i = k} \left( \sum_{J \supseteq \{i : m_i \ge 1\}} (-1)^{k - |J|} \right) \frac{1}{\prod_i m_i!} \, a_1^{\otimes m_1} \otimes \cdots \otimes a_k^{\otimes m_k}.
\end{align*}
For a fixed tuple $(m_1, \dots, m_k)$, let $Z := \{i : m_i = 0\}$ be the set of dropped indices. The tuple appears in the expansion of $S(y_J)^{(k)}$ exactly for the subsets $J \supseteq \{1, \dots, k\} \setminus Z$ (any superset of the active indices). Writing $J = (\{1, \dots, k\} \setminus Z) \cup J'$ with $J' \subseteq Z$, the inner alternating sum becomes $\sum_{J' \subseteq Z} (-1)^{|Z| - |J'|} = (1 - 1)^{|Z|}$, which equals $0$ unless $Z = \emptyset$, in which case it equals $1$. Therefore the only multi-index that survives is $(1, 1, \dots, 1)$, and
\begin{align*}
T_k = a_1 \otimes a_2 \otimes \cdots \otimes a_k.
\end{align*}
The identity above concerns only the level-$k$ component, so one more step is needed to reach the truncated signatures themselves. Set
\begin{align*}
E_k := \sum_{J \subseteq \{1, \dots, k\}} (-1)^{k - |J|} \pi_N\!\left(S(y_J)_{[0, |J|]}\right) \in \operatorname{span}(\mathcal{S}_N).
\end{align*}
Repeating the alternating-sum argument at an arbitrary level $\ell \le N$ shows that the level-$\ell$ component of $E_k$ is the sum of the terms $\frac{1}{m_1! \cdots m_k!} a_1^{\otimes m_1} \otimes \cdots \otimes a_k^{\otimes m_k}$ over tuples with every $m_i \ge 1$ and $\sum m_i = \ell$. There are no such tuples for $\ell < k$, and only $(1, \dots, 1)$ for $\ell = k$, so $E_k = a_1 \otimes \cdots \otimes a_k + Q_k$, where $Q_k$ is a finite linear combination of pure tensors of levels $k+1, \dots, N$. Now argue by downward induction on $k$. For $k = N$ we have $Q_N = 0$, so $a_1 \otimes \cdots \otimes a_N = E_N \in \operatorname{span}(\mathcal{S}_N)$. For $k < N$, the induction hypothesis places every pure tensor of level $k+1, \dots, N$ in $\operatorname{span}(\mathcal{S}_N)$, hence $Q_k \in \operatorname{span}(\mathcal{S}_N)$ and $a_1 \otimes \cdots \otimes a_k = E_k - Q_k \in \operatorname{span}(\mathcal{S}_N)$.
Every pure tensor of level $k \le N$ therefore lies in $\operatorname{span}(\mathcal{S}_N)$, and by Step 1, $T_N(V) = \operatorname{span}(\mathcal{S}_N)$, completing the proof.
[guided]
We have $S(y)_{[0, k]}^{(k)} = a_1 \otimes \cdots \otimes a_k + R$, and we want to extract the pure tensor cleanly. The remainder $R$ collects terms where some $a_i$ is dropped (i.e. some $m_i = 0$). The trick is **inclusion–exclusion**: subtracting off signatures of paths built from proper subsets of the $a_i$ kills the unwanted terms.
Concretely, for each subset $J \subseteq \{1, \dots, k\}$, define
\begin{align*}
y_J := *_{j \in J}^{\to} x^{a_j}: [0, |J|] \to V,
\end{align*}
the concatenation in increasing order (with $y_\emptyset$ the constant path). The same Chen-identity calculation gives
\begin{align*}
S(y_J)_{[0, |J|]} = \bigotimes_{j \in J}^{\to} \exp(i_1(a_j)),
\end{align*}
and the level-$k$ component is the multi-index sum over tuples $(m_j)_{j \in J}$ with $\sum_{j \in J} m_j = k$, $m_j \ge 0$:
\begin{align*}
S(y_J)_{[0, |J|]}^{(k)} = \sum_{(m_j)_{j \in J}: \, \sum m_j = k} \frac{1}{\prod_{j \in J} m_j!} \bigotimes_{j \in J}^{\to} a_j^{\otimes m_j}.
\end{align*}
(For $J = \emptyset$, the level-$k$ component is $0$ for $k \ge 1$, which is the relevant case.)
Form the alternating sum
\begin{align*}
T_k := \sum_{J \subseteq \{1, \dots, k\}} (-1)^{k - |J|} S(y_J)_{[0, |J|]}^{(k)}.
\end{align*}
Why this sign convention? Because we want to count with weight $1$ the multi-index $(1, \dots, 1)$ (all indices used) and weight $0$ everything else.
Switch the order of summation. For each multi-index $(m_1, \dots, m_k)$ with $\sum m_i = k$, let $A := \{i : m_i \ge 1\}$ be the active set. The multi-index appears in $S(y_J)^{(k)}$ for exactly those $J$ with $A \subseteq J$ (and in those, with the same coefficient $\prod_i 1/m_i!$). The alternating sum over such $J$ is
\begin{align*}
\sum_{J \supseteq A} (-1)^{k - |J|} = (-1)^{k - |A|} \sum_{J': J' \subseteq \{1, \dots, k\} \setminus A} (-1)^{-|J'|},
\end{align*}
where $J' = J \setminus A$. This last sum equals $\sum_{J' \subseteq \{1, \dots, k\} \setminus A} (-1)^{|J'|} = (1 - 1)^{|\{1,\dots,k\} \setminus A|} = 0^{k - |A|}$, which is $1$ if $|A| = k$ (i.e. $A = \{1, \dots, k\}$) and $0$ otherwise.
Hence the only surviving multi-index is the one with all $m_i \ge 1$ — but combined with $\sum m_i = k$, this forces $m_i = 1$ for all $i$. The coefficient is $1/(1! \cdots 1!) = 1$, and the corresponding tensor is $a_1 \otimes \cdots \otimes a_k$:
\begin{align*}
T_k = a_1 \otimes a_2 \otimes \cdots \otimes a_k.
\end{align*}
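The cancellation can be seen explicitly for $k = 2$ (an illustrative special case). The four subsets of $\{1, 2\}$ contribute the level-$2$ components
\begin{align*}
S(y_{\{1,2\}})^{(2)} &= \tfrac{1}{2} a_1^{\otimes 2} + a_1 \otimes a_2 + \tfrac{1}{2} a_2^{\otimes 2}, &
S(y_{\{1\}})^{(2)} &= \tfrac{1}{2} a_1^{\otimes 2}, \\
S(y_{\{2\}})^{(2)} &= \tfrac{1}{2} a_2^{\otimes 2}, &
S(y_{\emptyset})^{(2)} &= 0,
\end{align*}
with signs $+, -, -, +$ respectively, so
\begin{align*}
T_2 = S(y_{\{1,2\}})^{(2)} - S(y_{\{1\}})^{(2)} - S(y_{\{2\}})^{(2)} + S(y_{\emptyset})^{(2)} = a_1 \otimes a_2.
\end{align*}
In terms of weights: the multi-index $(2, 0)$ has active set $A = \{1\}$ and appears for $J = \{1\}$ (sign $-1$) and $J = \{1, 2\}$ (sign $+1$), which sum to $0$.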
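To phrase this in $T_N(V)$ rather than $V^{\otimes k}$, note that the level-$k$ component of a truncated signature is not itself known to lie in $\operatorname{span}(\mathcal{S}_N)$, so we work with the full truncated signatures. Set
\begin{align*}
E_k := \sum_{J \subseteq \{1, \dots, k\}} (-1)^{k - |J|} \pi_N\!\left(S(y_J)_{[0, |J|]}\right),
\end{align*}
which is a finite $\mathbb{Z}$-linear combination of elements of $\mathcal{S}_N$ and hence lies in $\operatorname{span}(\mathcal{S}_N)$. The same Möbius-inversion identity applies at every level $\ell \le N$: the level-$\ell$ component of $E_k$ keeps only the multi-indices with all $m_i \ge 1$ and $\sum m_i = \ell$. For $\ell < k$ there are none, so all lower-level components of $E_k$ vanish, and for $\ell = k$ only $(1, \dots, 1)$ survives. For $\ell > k$, however, terms can remain. For example, with $k = 2$ and $N = 3$ the level-$3$ component of $E_2$ is $\tfrac{1}{2} a_1^{\otimes 2} \otimes a_2 + \tfrac{1}{2} a_1 \otimes a_2^{\otimes 2}$. In general $E_k = a_1 \otimes \cdots \otimes a_k + Q_k$, where $Q_k$ is a finite linear combination of pure tensors of levels $k+1, \dots, N$.
These higher-level terms are handled by downward induction on $k$. For $k = N$ there is no room above level $N$, so $Q_N = 0$ and $a_1 \otimes \cdots \otimes a_N = E_N \in \operatorname{span}(\mathcal{S}_N)$. For $k < N$, assume every pure tensor of level $k+1, \dots, N$ already lies in $\operatorname{span}(\mathcal{S}_N)$. Then $Q_k \in \operatorname{span}(\mathcal{S}_N)$, and therefore $a_1 \otimes \cdots \otimes a_k = E_k - Q_k \in \operatorname{span}(\mathcal{S}_N)$.
Combining the steps: every pure tensor $a_1 \otimes \cdots \otimes a_k$ for $k \le N$ lies in $\operatorname{span}(\mathcal{S}_N)$. By Step 1, this implies $T_N(V) = \operatorname{span}(\mathcal{S}_N)$, completing the proof.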
[/guided]
[/step]