[guided]The strategy of the universality proof is to embed every truncated-signature linear functional into the class $\mathcal{B}_K$ of linear neural CDEs. The first step is to realise the truncated signature itself as the solution of a linear CDE — this is the algebraic content that makes the embedding possible.
The truncated signature $Y_t = \pi_{\le N} S(x)_{0,t}$ takes values in the truncated tensor algebra $T^N(\mathbb{R}^d)$. The full signature $S(x)_{0,t}$ is built from iterated Riemann-Stieltjes integrals of the path $x$, and it satisfies a fundamental integral recursion, referred to here as **Chen's relation**, which follows directly from the iterated-integral definition:
\begin{align*}
S(x)_{0,t} = 1 + \int_0^t S(x)_{0,s} \otimes dx_s.
\end{align*}
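Truncating at level $N$, and observing that $\pi_{\le N}(\xi \otimes a) = \pi_{\le N}(\mathfrak{i}_{\le N}(\pi_{\le N}(\xi)) \cdot \mathfrak{i}_1(a))$ for $a \in \mathbb{R}^d$ embedded in degree one (right-multiplication by a degree-one element raises the degree by exactly one, so the components of $\xi$ above level $N$ contribute nothing after truncation), we obtain the recursion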
\begin{align*}
Y_t = (1, 0, \ldots, 0) + \int_0^t \pi_{\le N}(\mathfrak{i}_{\le N}(Y_s) \cdot \mathfrak{i}_1(dx_s)),
\end{align*}
which is the integral form of a linear CDE on $T^N(\mathbb{R}^d)$. In differential form,
\begin{align*}
dY_t = f(Y_t)\,dx_t, \qquad Y_0 = (1, 0, \ldots, 0),
\end{align*}
with $f: T^N(\mathbb{R}^d) \to \mathcal{L}(\mathbb{R}^d, T^N(\mathbb{R}^d))$ given by $f(y)\,a := \pi_{\le N}(\mathfrak{i}_{\le N}(y) \cdot \mathfrak{i}_1(a))$.
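As a concrete illustration, take $d = 2$ and $N = 2$. The level components $y = (y^0, y^1, y^2)$ with $y^0 \in \mathbb{R}$, $y^1 \in \mathbb{R}^2$, $y^2 \in \mathbb{R}^2 \otimes \mathbb{R}^2$ are notation introduced only for this sketch. The vector field shifts each level up by one and discards the level-$3$ term $y^2 \otimes a$:
\begin{align*}
f(y)\,a &= (0,\; y^0 a,\; y^1 \otimes a), \\
dY^0_t &= 0, \qquad dY^1_t = Y^0_t\,dx_t, \qquad dY^2_t = Y^1_t \otimes dx_t.
\end{align*}
Solving level by level from $Y_0 = (1, 0, 0)$ gives $Y^0_t = 1$, $Y^1_t = x_t - x_0$ and $Y^2_t = \int_0^t (x_s - x_0) \otimes dx_s$, which are the first three levels of the signature.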
We verify that $f$ is a linear vector field of the type required by the definition of $\mathcal{B}_K$. Linearity in $y$: the embedding $\mathfrak{i}_{\le N}$ and the projection $\pi_{\le N}$ are linear and the tensor product is bilinear, so for fixed $a$ the map $y \mapsto f(y)\,a$ is linear. Linearity in $a$: $\mathfrak{i}_1$ is linear by construction and the tensor product is again bilinear, so for fixed $y$ the map $a \mapsto f(y)\,a$ is linear. Hence $(y, a) \mapsto f(y)\,a$ is *bilinear*; equivalently, $f$ is an element of $\mathcal{L}(T^N(\mathbb{R}^d), \mathcal{L}(\mathbb{R}^d, T^N(\mathbb{R}^d)))$.
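In coordinates this bilinearity can be written with one matrix per input channel. The following is a sketch: the standard basis vectors $e_i$ of $\mathbb{R}^d$, the components $a^i$ of $a$, and the operators $A_i$ are notation assumed here, not taken from the definition of $\mathcal{B}_K$.
\begin{align*}
f(y)\,a = \sum_{i=1}^{d} a^i A_i\, y, \qquad A_i\, y := \pi_{\le N}(\mathfrak{i}_{\le N}(y) \cdot \mathfrak{i}_1(e_i)).
\end{align*}
Each $A_i$ maps level $k$ into level $k+1$ for $k < N$ and annihilates level $N$, so any product of $N+1$ of these operators vanishes.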
We verify the existence-and-uniqueness hypotheses of the [Existence and Uniqueness of CDE Solutions](/theorems/???) theorem: (i) the vector field $f$ is linear on a finite-dimensional space, hence $C^\infty$ with constant (and therefore globally bounded) derivative, so it is globally Lipschitz and in particular Lipschitz on bounded subsets; (ii) the driver $x \in C_{1,0,t}$ has bounded $1$-variation by the definition of the path space. Both hypotheses are met, so the linear CDE has a unique global solution. The truncated signature $t \mapsto \pi_{\le N} S(x)_{0,t}$ satisfies the truncated form of Chen's relation, which is the integral form of this CDE, so by uniqueness it is precisely this solution $Y$. Evaluating at $t = T$ identifies the truncated signature as the terminal value of the linear neural CDE.
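The identification can be checked numerically. The sketch below assumes a piecewise-linear driver given as an array of points, and flattens $T^N(\mathbb{R}^d)$ level by level in row-major order. The function names `level_offsets`, `vector_field_matrices` and `solve_linear_cde` are hypothetical helpers written for this illustration. On each linear segment the CDE reduces to a linear ODE whose generator is nilpotent, so the matrix exponential is a finite sum and the segment update is exact.

```python
import numpy as np


def level_offsets(d, N):
    """Start index of each level 0..N in the flattened vector; the last entry is M."""
    sizes = [d**k for k in range(N + 1)]
    return np.concatenate(([0], np.cumsum(sizes)))


def vector_field_matrices(d, N):
    """Matrices A[i] with f(y) e_i = A[i] @ y (truncated right-multiplication by e_i)."""
    off = level_offsets(d, N)
    M = off[-1]
    A = np.zeros((d, M, M))
    for k in range(N):  # level k maps to level k+1; level N is cut by the truncation
        for j in range(d**k):
            for i in range(d):
                # a word with row-major index j, extended by letter i, has index j*d + i
                A[i, off[k + 1] + j * d + i, off[k] + j] = 1.0
    return A


def solve_linear_cde(path, N):
    """Solve dY = f(Y) dx along a piecewise-linear path of shape (n_points, d)."""
    d = path.shape[1]
    A = vector_field_matrices(d, N)
    Y = np.zeros(A.shape[1])
    Y[0] = 1.0  # initial condition (1, 0, ..., 0)
    for delta in np.diff(path, axis=0):
        B = np.tensordot(delta, A, axes=1)  # sum_i delta^i A_i
        term, Y_new = Y.copy(), Y.copy()
        for k in range(1, N + 1):  # B is nilpotent, so exp(B) = sum_{k<=N} B^k / k!
            term = B @ term / k
            Y_new = Y_new + term
        Y = Y_new
    return Y


# Usage: a planar path with three segments, truncated at level 2.
path = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 2.0], [-0.5, 1.0]])
Y_T = solve_linear_cde(path, N=2)
level1 = Y_T[1:3]                 # should equal the total increment of the path
level2 = Y_T[3:7].reshape(2, 2)   # should equal the iterated integral of x against dx
```

The level-one block can be compared with `path[-1] - path[0]`, and the level-two block with the segment-wise sum of $(x_{t_k} - x_0) \otimes \Delta_k + \tfrac{1}{2}\Delta_k \otimes \Delta_k$, which is the iterated integral of a piecewise-linear path computed directly. The length of the state vector is `level_offsets(d, N)[-1]`, the dimension of $T^N(\mathbb{R}^d)$.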
This is the key reduction: every truncated-signature linear functional is "$g$ applied to the terminal value of a linear neural CDE". In the next step we make the parameter dimension $M = \sum_{k=0}^{N} d^k = (d^{N+1}-1)/(d-1)$ (for $d \ge 2$; $M = N+1$ when $d = 1$) match the requirement of $\mathcal{B}_K$ via the linear isomorphism $T^N(\mathbb{R}^d) \cong \mathbb{R}^M$.[/guided]