Universality of Signature Kernels (Theorem # 2519)
Theorem
Let $k_\phi$ be a signature kernel determined by a weight function $\phi$ satisfying the conditions of *Sufficient Condition for Signature Membership*, and let $\mathcal{H}_\phi$ be the associated RKHS. Assume that $\mathcal{C}_p$ is equipped with a topology for which $S : \mathcal{C}_p \to T_\phi((V))$ is continuous. Then $k_\phi$ is universal.
Proof
[proofplan]
By the [Universality Equivalence](/theorems/2518), it suffices to prove that the restricted family $\mathcal{H}_\phi|_\mathcal{K} = \{k_\phi^h|_\mathcal{K} : h \in T_\phi((V))\}$ is uniformly dense in $C(\mathcal{K})$ for every compact $\mathcal{K} \subset \mathcal{C}_p$. We work with the smaller family $\mathcal{A} := \{k_\phi^h|_\mathcal{K} : h \in T(V)\}$ generated by tensors of *finite* tensor degree, prove that $\mathcal{A}$ satisfies the hypotheses of the [Stone--Weierstrass theorem](/theorems/???) on $\mathcal{K}$, and conclude. The four hypotheses to verify are: (i) $\mathcal{A}$ consists of continuous real-valued functions on $\mathcal{K}$; (ii) $\mathcal{A}$ is closed under pointwise products, which follows from the $\phi$-shuffle identity $k_\phi^h \cdot k_\phi^g = \langle h \shuffle_\phi g, S(\cdot)\rangle_\phi = k_\phi^{h \shuffle_\phi g}$; (iii) $\mathcal{A}$ contains the constants, as exhibited by $h = \mathbf{1} \in V^{\otimes 0}$; (iv) $\mathcal{A}$ separates points of $\mathcal{K}$, which uses that the signature separates paths up to tree-like equivalence and that $\mathcal{C}_p$ is taken to consist of paths that are pairwise inequivalent under tree-like equivalence (equivalently, $S$ is injective on $\mathcal{C}_p$). Stone--Weierstrass then gives uniform density of $\mathcal{A}$ in $C(\mathcal{K})$. Since $\mathcal{A} \subseteq \mathcal{H}_\phi|_\mathcal{K}$, the larger family is uniformly dense as well, completing the proof.
[/proofplan]
[step:Reduce to showing the restricted RKHS family is uniformly dense via the Universality Equivalence]
Fix a compact subset $\mathcal{K} \subset \mathcal{C}_p$. By the [Universality Equivalence](/theorems/2518), $k_\phi$ is universal if and only if $\mathcal{H}_\phi|_\mathcal{K} = \{k_\phi^h|_\mathcal{K} : h \in T_\phi((V))\}$ is dense in $C(\mathcal{K})$ for the topology of uniform convergence, for every such $\mathcal{K}$. The hypotheses of that theorem hold by assumption: $\phi$ satisfies the conditions of the [Sufficient Condition for Signature Membership](/theorems/???), and $S : \mathcal{C}_p \to T_\phi((V))$ is continuous in the given topology on $\mathcal{C}_p$.
Define
\begin{align*}
\mathcal{A} := \{ k_\phi^h|_\mathcal{K} : h \in T(V) \},
\end{align*}
where $T(V) := \bigoplus_{n=0}^\infty V^{\otimes n}$ is the *algebraic* tensor algebra (finite-degree tensors only). Since $T(V) \subseteq T_\phi((V))$, we have $\mathcal{A} \subseteq \mathcal{H}_\phi|_\mathcal{K}$. Hence uniform density of $\mathcal{A}$ in $C(\mathcal{K})$ implies uniform density of $\mathcal{H}_\phi|_\mathcal{K}$ in $C(\mathcal{K})$, and the Universality Equivalence then yields universality of $k_\phi$. We therefore aim to show $\mathcal{A}$ is uniformly dense in $C(\mathcal{K})$.
[/step]
[step:Verify $\mathcal{A}$ consists of continuous real-valued functions on $\mathcal{K}$]
For $h \in T(V)$, the function $k_\phi^h : \mathcal{C}_p \to \mathbb{R}$ is given by $k_\phi^h(\gamma) = \langle h, S(\gamma)\rangle_\phi$. Since $S : \mathcal{C}_p \to T_\phi((V))$ is continuous (by hypothesis) and $\langle h, \cdot\rangle_\phi : T_\phi((V)) \to \mathbb{R}$ is continuous (it is a bounded linear functional on the Hilbert space $T_\phi((V))$, with operator norm $\leq \|h\|_\phi$ by Cauchy--Schwarz), the composition $k_\phi^h$ is continuous on $\mathcal{C}_p$, hence on $\mathcal{K}$. Therefore
\begin{align*}
\mathcal{A} \subseteq C(\mathcal{K}).
\end{align*}
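The Cauchy--Schwarz bound behind this step can be checked numerically. The following is a minimal sketch and not part of the source argument. It assumes $V = \mathbb{R}^2$ with the standard basis, piecewise-linear paths, signatures truncated at degree 3, and an illustrative weight $\phi(n) = 1/n!$ chosen only because it is positive (it is not claimed to be the weight intended by the theorem). The helper names `segment_signature`, `chen_product`, `signature`, `pair_phi` and `norm_phi` are hypothetical. Because $h$ has degree at most 3, only levels 0 to 3 of $S(\gamma) - S(\gamma')$ enter the pairing, so the truncated norm is all the bound needs.

```python
from itertools import product
from math import factorial, prod, sqrt

DEPTH = 3

def phi(n):
    # Illustrative positive weight (an assumption, not taken from the source).
    return 1.0 / factorial(n)

def segment_signature(delta, depth):
    # Signature of a straight segment: level n is (delta tensor n times) / n!.
    d = len(delta)
    return [{w: prod(delta[i] for i in w) / factorial(n)
             for w in product(range(d), repeat=n)}
            for n in range(depth + 1)]

def chen_product(a, b, depth):
    # Truncated tensor product, used to concatenate path segments.
    out = [dict() for _ in range(depth + 1)]
    for n in range(depth + 1):
        for k in range(n + 1):
            for u, x in a[k].items():
                for v, y in b[n - k].items():
                    out[n][u + v] = out[n].get(u + v, 0.0) + x * y
    return out

def signature(points, depth):
    # Truncated signature of the piecewise-linear path through `points`.
    sig = segment_signature([0.0] * len(points[0]), depth)  # unit element
    for p, q in zip(points, points[1:]):
        delta = [qi - pi for pi, qi in zip(p, q)]
        sig = chen_product(sig, segment_signature(delta, depth), depth)
    return sig

def pair_phi(h, s):
    # <h, s>_phi = sum_n phi(n) <h^(n), s^(n)> in the standard word basis.
    return sum(phi(n) * sum(c * s[n].get(w, 0.0) for w, c in h[n].items())
               for n in range(len(h)))

def norm_phi(x):
    return sqrt(pair_phi(x, x))

# A finite-degree tensor h with components at levels 1 and 3.
h = [{}, {(0,): 1.0, (1,): -2.0}, {}, {(0, 1, 1): 3.0, (1, 0, 0): -1.0}]

gamma = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0), (-0.5, 1.5)]
gamma_near = [(0.0, 0.0), (1.01, 0.0), (1.0, 1.98), (-0.5, 1.5)]

s1, s2 = signature(gamma, DEPTH), signature(gamma_near, DEPTH)
diff = [{w: s1[n][w] - s2[n][w] for w in s1[n]} for n in range(DEPTH + 1)]

# |k^h(gamma) - k^h(gamma')| <= ||h||_phi * ||S(gamma) - S(gamma')||_phi
lhs = abs(pair_phi(h, s1) - pair_phi(h, s2))
rhs = norm_phi(h) * norm_phi(diff)
print(lhs, rhs, lhs <= rhs + 1e-12)
```

The inequality printed at the end is the Lipschitz estimate that turns continuity of $S$ into continuity of $k_\phi^h$.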
[/step]
[step:Establish the $\phi$-shuffle identity to show closure under pointwise products]
For homogeneous $h \in V^{\otimes n}$ and $g \in V^{\otimes m}$, define the $\phi$-weighted shuffle product
\begin{align*}
h \shuffle_\phi g := \frac{\phi(n)\phi(m)}{\phi(n+m)} \, h \shuffle g \in V^{\otimes (n+m)},
\end{align*}
extended bilinearly to $T(V) \times T(V) \to T(V)$. We verify the algebraic identity
\begin{align*}
k_\phi^h(\gamma) \cdot k_\phi^g(\gamma) = \langle h \shuffle_\phi g, S(\gamma)\rangle_\phi = k_\phi^{h \shuffle_\phi g}(\gamma) \qquad \text{for all } \gamma \in \mathcal{C}_p.
\end{align*}
Take $h \in V^{\otimes n}$ and $g \in V^{\otimes m}$ homogeneous (the general case follows by bilinearity). By definition of $k_\phi^h$, $k_\phi^g$ and using that only the levels $n$ and $m$ of $h$ and $g$ are nonzero,
\begin{align*}
k_\phi^h(\gamma) \cdot k_\phi^g(\gamma) = \phi(n) \langle h, S(\gamma)^{(n)}\rangle_{V^{\otimes n}} \cdot \phi(m) \langle g, S(\gamma)^{(m)}\rangle_{V^{\otimes m}}.
\end{align*}
By the [shuffle product identity for signatures](/theorems/???), which states that $\langle h, S(\gamma)^{(n)}\rangle \langle g, S(\gamma)^{(m)}\rangle = \langle h \shuffle g, S(\gamma)^{(n+m)}\rangle$ and expresses the fact that the signature is a character on the shuffle algebra (a consequence of integration by parts for iterated integrals), the right-hand side equals
\begin{align*}
\phi(n)\phi(m) \langle h \shuffle g, S(\gamma)^{(n+m)}\rangle_{V^{\otimes (n+m)}}.
\end{align*}
Multiplying and dividing by $\phi(n+m) > 0$ (positivity of $\phi$ on integers $\geq 0$ is guaranteed by the conditions of the Sufficient Condition for Signature Membership),
\begin{align*}
\phi(n)\phi(m) \langle h \shuffle g, S(\gamma)^{(n+m)}\rangle = \phi(n+m) \cdot \frac{\phi(n)\phi(m)}{\phi(n+m)} \langle h \shuffle g, S(\gamma)^{(n+m)}\rangle = \phi(n+m) \langle h \shuffle_\phi g, S(\gamma)^{(n+m)}\rangle.
\end{align*}
The last expression is, by definition, $\langle h \shuffle_\phi g, S(\gamma)\rangle_\phi$ since $h \shuffle_\phi g \in V^{\otimes (n+m)}$ and only the level $n+m$ of $h \shuffle_\phi g$ contributes. This establishes the identity for homogeneous tensors; bilinearity extends it to all $h, g \in T(V)$.
Consequently, for any $h, g \in T(V)$ the product $h \shuffle_\phi g \in T(V)$, and
\begin{align*}
k_\phi^h \cdot k_\phi^g = k_\phi^{h \shuffle_\phi g} \in \mathcal{A}.
\end{align*}
Hence $\mathcal{A}$ is closed under pointwise multiplication. Linearity in $h$ shows $\mathcal{A}$ is also closed under addition and real scalar multiplication. Therefore $\mathcal{A}$ is a real subalgebra of $C(\mathcal{K})$.
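The $\phi$-shuffle identity can be sanity-checked on a concrete path. The following is a minimal sketch under stated assumptions, not the source's construction: $V = \mathbb{R}^2$ with the standard basis, a piecewise-linear path whose truncated signature is computed by concatenating segments, and an illustrative positive weight $\phi(n) = 1/n!$. All function names are hypothetical. Tensors are stored as dictionaries mapping words (tuples of indices) to coefficients.

```python
from itertools import product
from math import factorial, prod

def phi(n):
    # Illustrative positive weight (an assumption, not taken from the source).
    return 1.0 / factorial(n)

def segment_signature(delta, depth):
    # Signature of a straight segment: level n is (delta tensor n times) / n!.
    d = len(delta)
    return [{w: prod(delta[i] for i in w) / factorial(n)
             for w in product(range(d), repeat=n)}
            for n in range(depth + 1)]

def chen_product(a, b, depth):
    # Truncated tensor product, used to concatenate path segments.
    out = [dict() for _ in range(depth + 1)]
    for n in range(depth + 1):
        for k in range(n + 1):
            for u, x in a[k].items():
                for v, y in b[n - k].items():
                    out[n][u + v] = out[n].get(u + v, 0.0) + x * y
    return out

def signature(points, depth):
    # Truncated signature of the piecewise-linear path through `points`.
    sig = segment_signature([0.0] * len(points[0]), depth)  # unit element
    for p, q in zip(points, points[1:]):
        delta = [qi - pi for pi, qi in zip(p, q)]
        sig = chen_product(sig, segment_signature(delta, depth), depth)
    return sig

def shuffle_words(u, v):
    # All interleavings of the words u and v, listed with multiplicity.
    if not u:
        return [v]
    if not v:
        return [u]
    return ([(u[0],) + w for w in shuffle_words(u[1:], v)]
            + [(v[0],) + w for w in shuffle_words(u, v[1:])])

def shuffle_phi(h, n, g, m):
    # phi-weighted shuffle of homogeneous tensors h (level n) and g (level m).
    scale = phi(n) * phi(m) / phi(n + m)
    out = {}
    for u, x in h.items():
        for v, y in g.items():
            for w in shuffle_words(u, v):
                out[w] = out.get(w, 0.0) + scale * x * y
    return out

def k_phi(h, n, sig):
    # k_phi^h(gamma) = phi(n) <h, S(gamma)^(n)> for h homogeneous of level n.
    return phi(n) * sum(c * sig[n].get(w, 0.0) for w, c in h.items())

sig = signature([(0.0, 0.0), (1.0, 0.0), (1.0, 2.0), (-0.5, 1.5)], 3)
h, n = {(0,): 1.0, (1,): -2.0}, 1
g, m = {(0, 1): 1.0, (1, 1): 0.5}, 2

lhs = k_phi(h, n, sig) * k_phi(g, m, sig)          # pointwise product
rhs = k_phi(shuffle_phi(h, n, g, m), n + m, sig)   # single functional
print(lhs, rhs)
```

The two printed numbers should agree up to floating-point rounding. Dropping the factor `scale` breaks the agreement for this weight, which is the reason for rescaling the shuffle.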
[guided]
Closure under pointwise products is the linchpin of the Stone--Weierstrass argument, and the technical content is the $\phi$-shuffle identity. We prove the identity from first principles, then read off closure under products and the algebra structure.
**The unweighted shuffle identity for signatures.** The signature is a *character* on the shuffle algebra: for $h, g \in T(V)$,
\begin{align*}
\langle h, S(\gamma) \rangle \cdot \langle g, S(\gamma)\rangle = \langle h \shuffle g, S(\gamma)\rangle,
\end{align*}
where $\shuffle$ is the shuffle product on $T(V)$ (sums over interleavings of indices). This is the [shuffle product identity for signatures](/theorems/???), a consequence of integration by parts for iterated integrals. On homogeneous components $h \in V^{\otimes n}$, $g \in V^{\otimes m}$, only the level-$n$ piece of $S(\gamma)$ pairs nontrivially with $h$ (and likewise for $g$ at level $m$), and shuffling produces a tensor of degree $n + m$, so
\begin{align*}
\langle h, S(\gamma)^{(n)}\rangle_{V^{\otimes n}} \cdot \langle g, S(\gamma)^{(m)}\rangle_{V^{\otimes m}} = \langle h \shuffle g, S(\gamma)^{(n+m)}\rangle_{V^{\otimes (n+m)}}.
\end{align*}
**Translating to the $\phi$-weighted inner product.** Take $h \in V^{\otimes n}$ and $g \in V^{\otimes m}$ homogeneous. Recall the $\phi$-weighted inner product is defined level-by-level by $\langle h, S(\gamma)\rangle_\phi = \phi(n)\langle h, S(\gamma)^{(n)}\rangle_{V^{\otimes n}}$ when $h$ lives at level $n$. So
\begin{align*}
k_\phi^h(\gamma) \cdot k_\phi^g(\gamma) = \phi(n)\langle h, S(\gamma)^{(n)}\rangle_{V^{\otimes n}} \cdot \phi(m)\langle g, S(\gamma)^{(m)}\rangle_{V^{\otimes m}} = \phi(n)\phi(m) \langle h \shuffle g, S(\gamma)^{(n+m)}\rangle_{V^{\otimes (n+m)}}.
\end{align*}
We want to express the right-hand side as $\langle \star, S(\gamma)\rangle_\phi$ for some $\star \in V^{\otimes (n+m)}$. By the same level-by-level definition, $\langle \star, S(\gamma)\rangle_\phi = \phi(n+m) \langle \star, S(\gamma)^{(n+m)}\rangle$, so we want
\begin{align*}
\phi(n)\phi(m) \langle h \shuffle g, S(\gamma)^{(n+m)}\rangle = \phi(n+m) \langle \star, S(\gamma)^{(n+m)}\rangle.
\end{align*}
Since this must hold for every $\gamma \in \mathcal{C}_p$ simultaneously, the simplest choice is to rescale the tensor itself:
\begin{align*}
\star := \frac{\phi(n)\phi(m)}{\phi(n+m)} \, h \shuffle g.
\end{align*}
We define $h \shuffle_\phi g := \star$ for homogeneous $h, g$ and extend bilinearly to $T(V) \times T(V) \to T(V)$. This is well-defined since $\phi(n+m) > 0$ for all $n, m \ge 0$ (the conditions of the [Sufficient Condition for Signature Membership](/theorems/???) include positivity of $\phi$ on $\mathbb{N} \cup \{0\}$).
**Putting it together.** With this definition,
\begin{align*}
k_\phi^h(\gamma) \cdot k_\phi^g(\gamma) = \phi(n+m) \langle h \shuffle_\phi g, S(\gamma)^{(n+m)}\rangle = \langle h \shuffle_\phi g, S(\gamma)\rangle_\phi = k_\phi^{h \shuffle_\phi g}(\gamma).
\end{align*}
Bilinearity of $\shuffle_\phi$ extends the identity from homogeneous tensors to all of $T(V) \times T(V)$.
**Why was rescaling necessary?** The product of two weighted inner products carries the factor $\phi(n)\phi(m)$, while a single weighted inner product on a degree-$(n+m)$ tensor carries the factor $\phi(n+m)$. The unweighted shuffle does not know about $\phi$, so directly substituting it would give the wrong scaling. The rescaling factor $\phi(n)\phi(m)/\phi(n+m)$ in the definition of $\shuffle_\phi$ is exactly the correction needed. For the unweighted choice $\phi \equiv 1$, the factor is $1$ and we recover the ordinary shuffle.
**Closure of $T(V)$ under $\shuffle_\phi$.** If $h \in V^{\otimes n}$ and $g \in V^{\otimes m}$ then $h \shuffle g \in V^{\otimes (n+m)}$ (interleaving indices preserves total degree), so $h \shuffle_\phi g$ is a scalar multiple of $h \shuffle g$ and lives in $V^{\otimes (n+m)} \subset T(V)$. Bilinearity extends closure to all finite-degree tensors. Hence $h, g \in T(V) \implies h \shuffle_\phi g \in T(V)$, and consequently
\begin{align*}
k_\phi^h \cdot k_\phi^g = k_\phi^{h \shuffle_\phi g} \in \mathcal{A}.
\end{align*}
**Why work with $\mathcal{A}$ rather than the full $\mathcal{H}_\phi|_\mathcal{K}$?** The shuffle product of two finite-degree tensors is a finite-degree tensor, so $T(V)$ is closed under $\shuffle_\phi$. For infinite-degree elements $h, g \in T_\phi((V))$, the formal expression $h \shuffle_\phi g$ may not converge in $T_\phi((V))$ — there is no general guarantee that the resulting tensor has finite $\phi$-norm. By restricting to $\mathcal{A}$, we sidestep this convergence question. Since $\mathcal{A} \subseteq \mathcal{H}_\phi|_\mathcal{K}$, uniform density of $\mathcal{A}$ implies uniform density of the larger family, which is what we need for Universality Equivalence.
**Algebra structure of $\mathcal{A}$.** Closure under sums and real scalar multiplication follows from linearity of $h \mapsto k_\phi^h$ in $h$: $k_\phi^{c_1 h + c_2 g}(\gamma) = c_1 k_\phi^h(\gamma) + c_2 k_\phi^g(\gamma)$, and $c_1 h + c_2 g \in T(V)$. Combined with closure under products, this makes $\mathcal{A}$ a real subalgebra of $C(\mathcal{K})$.
[/guided]
[/step]
[step:Verify that $\mathcal{A}$ contains the constants]
The element $\mathbf{1} := 1 \in V^{\otimes 0} \cong \mathbb{R} \subset T(V)$ is a finite-degree tensor concentrated at level zero. By definition of $k_\phi^{\mathbf{1}}$,
\begin{align*}
k_\phi^{\mathbf{1}}(\gamma) = \langle \mathbf{1}, S(\gamma)\rangle_\phi = \phi(0) \langle 1, S(\gamma)^{(0)}\rangle_{V^{\otimes 0}} = \phi(0) \cdot 1 \cdot 1 = \phi(0)
\end{align*}
for all $\gamma \in \mathcal{C}_p$. Since $\phi(0) > 0$ (positivity from the conditions of Sufficient Condition for Signature Membership), the constant function $\gamma \mapsto \phi(0)$ lies in $\mathcal{A}$. By linearity ($c \cdot \mathbf{1} \in T(V)$ for any $c \in \mathbb{R}$ and $k_\phi^{c\mathbf{1}} = c k_\phi^{\mathbf{1}} = c\phi(0)$), every real constant lies in $\mathcal{A}$. In particular, $\mathcal{A}$ contains the constant function $1$ (taking $c = 1/\phi(0)$).
[/step]
[step:Verify that $\mathcal{A}$ separates points of $\mathcal{K}$]
We must show: for any $\gamma_1, \gamma_2 \in \mathcal{K}$ with $\gamma_1 \neq \gamma_2$, there exists $h \in T(V)$ with $k_\phi^h(\gamma_1) \neq k_\phi^h(\gamma_2)$.
Fix such $\gamma_1 \neq \gamma_2$. By the [signature uniqueness theorem](/theorems/???) (Hambly--Lyons): the signature map $S : \mathcal{C}_p \to T_\phi((V))$ is injective on $\mathcal{C}_p$, where the path space $\mathcal{C}_p$ is taken modulo tree-like equivalence (equivalently, parameterised so that distinct elements have distinct signatures). Hence $S(\gamma_1) \neq S(\gamma_2)$, so the difference $S(\gamma_1) - S(\gamma_2) \in T_\phi((V))$ is nonzero.
Choose $n \in \mathbb{N} \cup \{0\}$ to be the smallest index with $S(\gamma_1)^{(n)} \neq S(\gamma_2)^{(n)}$ in $V^{\otimes n}$. Then the tensor $w := S(\gamma_1)^{(n)} - S(\gamma_2)^{(n)} \in V^{\otimes n}$ is nonzero. By the non-degeneracy of the Hilbert--Schmidt inner product on the finite-dimensional Hilbert space $V^{\otimes n}$, there exists $h_n \in V^{\otimes n}$ with
\begin{align*}
\langle h_n, w \rangle_{V^{\otimes n}} = \langle h_n, S(\gamma_1)^{(n)}\rangle_{V^{\otimes n}} - \langle h_n, S(\gamma_2)^{(n)}\rangle_{V^{\otimes n}} \neq 0.
\end{align*}
Indeed, take $h_n := w$ itself, giving $\langle w, w\rangle_{V^{\otimes n}} = \|w\|^2 > 0$.
Now lift $h_n$ to $T(V)$ by extending by zero in all other levels: $h := (0, \ldots, 0, h_n, 0, 0, \ldots) \in V^{\otimes n} \subset T(V)$. Then
\begin{align*}
k_\phi^h(\gamma_i) = \phi(n) \langle h_n, S(\gamma_i)^{(n)}\rangle_{V^{\otimes n}}, \quad i = 1, 2,
\end{align*}
and the difference
\begin{align*}
k_\phi^h(\gamma_1) - k_\phi^h(\gamma_2) = \phi(n) \langle h_n, w \rangle_{V^{\otimes n}} = \phi(n) \|w\|^2 > 0
\end{align*}
because $\phi(n) > 0$ and $w \neq 0$. Therefore $k_\phi^h(\gamma_1) \neq k_\phi^h(\gamma_2)$, and $\mathcal{A}$ separates the points $\gamma_1, \gamma_2$. Since $\gamma_1, \gamma_2 \in \mathcal{K}$ were arbitrary, $\mathcal{A}$ separates points of $\mathcal{K}$.
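The separating tensor can be built explicitly for two concrete paths. The following is a minimal sketch under stated assumptions: $V = \mathbb{R}^2$, two piecewise-linear paths with the same endpoints (right then up, versus up then right), signatures truncated at degree 3, and an illustrative positive weight $\phi(n) = 1/n!$. The function names are hypothetical. The two paths agree at levels 0 and 1, so the lowest separating level is found at degree 2.

```python
from itertools import product
from math import factorial, prod

DEPTH = 3

def phi(n):
    # Illustrative positive weight (an assumption, not taken from the source).
    return 1.0 / factorial(n)

def segment_signature(delta, depth):
    # Signature of a straight segment: level n is (delta tensor n times) / n!.
    d = len(delta)
    return [{w: prod(delta[i] for i in w) / factorial(n)
             for w in product(range(d), repeat=n)}
            for n in range(depth + 1)]

def chen_product(a, b, depth):
    # Truncated tensor product, used to concatenate path segments.
    out = [dict() for _ in range(depth + 1)]
    for n in range(depth + 1):
        for k in range(n + 1):
            for u, x in a[k].items():
                for v, y in b[n - k].items():
                    out[n][u + v] = out[n].get(u + v, 0.0) + x * y
    return out

def signature(points, depth):
    # Truncated signature of the piecewise-linear path through `points`.
    sig = segment_signature([0.0] * len(points[0]), depth)  # unit element
    for p, q in zip(points, points[1:]):
        delta = [qi - pi for pi, qi in zip(p, q)]
        sig = chen_product(sig, segment_signature(delta, depth), depth)
    return sig

def k_phi(h, n, sig):
    # k_phi^h(gamma) = phi(n) <h, S(gamma)^(n)> for h homogeneous of level n.
    return phi(n) * sum(c * sig[n].get(w, 0.0) for w, c in h.items())

s1 = signature([(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)], DEPTH)  # right, then up
s2 = signature([(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)], DEPTH)  # up, then right

# Smallest level at which the two truncated signatures differ.
n = next(k for k in range(DEPTH + 1)
         if any(abs(s1[k][w] - s2[k][w]) > 1e-12 for w in s1[k]))

# Separating tensor h := w = S(gamma_1)^(n) - S(gamma_2)^(n), placed at level n.
w = {word: s1[n][word] - s2[n][word] for word in s1[n]}
norm_sq = sum(c * c for c in w.values())

gap = k_phi(w, n, s1) - k_phi(w, n, s2)
print(n, gap, phi(n) * norm_sq)
```

The printed gap coincides with $\phi(n)\|w\|^2$, matching the last display of the step. The level-2 difference is the antisymmetric part of the signature, which records the signed area between the two paths.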
[guided]
Stone--Weierstrass requires the algebra $\mathcal{A}$ to *separate points*: for any two distinct $\gamma_1, \gamma_2 \in \mathcal{K}$, some function in $\mathcal{A}$ takes different values on them. We prove this in three movements: (a) translate distinctness of paths to distinctness of signatures via injectivity of $S$; (b) extract a non-degenerate level of the signature difference; (c) build a finite-degree tensor $h \in T(V)$ that detects this level.
**(a) Distinct paths have distinct signatures.** Fix $\gamma_1, \gamma_2 \in \mathcal{K}$ with $\gamma_1 \neq \gamma_2$. By the [signature uniqueness theorem](/theorems/???) (Hambly--Lyons), the signature map
\begin{align*}
S : \mathcal{C}_p \to T_\phi((V))
\end{align*}
is injective on $\mathcal{C}_p$, where $\mathcal{C}_p$ is taken modulo tree-like equivalence (equivalently, paths in $\mathcal{C}_p$ are parameterised so that distinct elements have distinct signatures — the standing convention for this theorem). Hence $S(\gamma_1) \neq S(\gamma_2)$, so the difference
\begin{align*}
v := S(\gamma_1) - S(\gamma_2) \in T_\phi((V))
\end{align*}
is nonzero in the Hilbert space $T_\phi((V))$.
**(b) Locating a non-degenerate level.** Decompose $v$ by tensor degree: $v = \sum_{k=0}^\infty v^{(k)}$ where $v^{(k)} := S(\gamma_1)^{(k)} - S(\gamma_2)^{(k)} \in V^{\otimes k}$. Since $v \neq 0$, at least one $v^{(k)}$ is nonzero. Let
\begin{align*}
n := \min\{k \ge 0 : v^{(k)} \neq 0\}.
\end{align*}
Then $w := v^{(n)} = S(\gamma_1)^{(n)} - S(\gamma_2)^{(n)} \in V^{\otimes n}$ is nonzero. The choice of the *lowest* nonzero level is for concreteness; any level where $v^{(k)} \neq 0$ would work equally well.
**(c) Constructing a separating tensor in $T(V)$.** We want $h \in T(V)$ — a *finite-degree* tensor — such that $k_\phi^h(\gamma_1) \neq k_\phi^h(\gamma_2)$. Take $h_n := w \in V^{\otimes n}$ and lift it to $T(V)$ by extending by zero in all other levels:
\begin{align*}
h := (0, 0, \ldots, 0, w, 0, 0, \ldots) \in T(V),
\end{align*}
where $w$ sits at level $n$. By the level-by-level definition of $\langle \cdot, \cdot\rangle_\phi$, the function $k_\phi^h$ evaluated at $\gamma_i$ is
\begin{align*}
k_\phi^h(\gamma_i) = \langle h, S(\gamma_i)\rangle_\phi = \phi(n) \langle w, S(\gamma_i)^{(n)}\rangle_{V^{\otimes n}}, \qquad i = 1, 2.
\end{align*}
Subtracting,
\begin{align*}
k_\phi^h(\gamma_1) - k_\phi^h(\gamma_2) = \phi(n)\langle w, S(\gamma_1)^{(n)} - S(\gamma_2)^{(n)}\rangle_{V^{\otimes n}} = \phi(n) \langle w, w\rangle_{V^{\otimes n}} = \phi(n) \|w\|^2_{V^{\otimes n}}.
\end{align*}
Since $w \neq 0$, $\|w\|^2 > 0$ by the non-degeneracy of the Hilbert--Schmidt inner product on the finite-dimensional Hilbert space $V^{\otimes n}$. Since $\phi(n) > 0$ — guaranteed by the conditions of [Sufficient Condition for Signature Membership](/theorems/???), which require $\phi$ to be strictly positive on $\mathbb{N} \cup \{0\}$ — we have $\phi(n)\|w\|^2 > 0$. Therefore
\begin{align*}
k_\phi^h(\gamma_1) - k_\phi^h(\gamma_2) > 0, \qquad \text{so } k_\phi^h(\gamma_1) \neq k_\phi^h(\gamma_2).
\end{align*}
**Verifying $h \in T(V)$.** The construction places a single nonzero component at level $n$ and zeros elsewhere, so $h \in V^{\otimes n} \subset T(V)$. This is essential: the algebra $\mathcal{A}$ requires its parameter $h$ to be a finite-degree tensor. If we instead took $h := S(\gamma_1) - S(\gamma_2)$ in full (which would also separate the points by the same Hilbert-space inner-product argument), $h$ would in general have infinitely many nonzero levels, so $k_\phi^h|_\mathcal{K}$ would be an element of $\mathcal{H}_\phi|_\mathcal{K}$ that is not exhibited as an element of $\mathcal{A}$. Truncating to a single level keeps us inside $\mathcal{A}$.
**Where each hypothesis is consumed.**
- *Injectivity of $S$* (signature uniqueness) gives $v \neq 0$ in (a).
- *Non-degeneracy of the inner product on $V^{\otimes n}$* gives $\|w\|^2 > 0$ in (c).
- *Strict positivity $\phi(n) > 0$* gives $\phi(n)\|w\|^2 > 0$ in (c).
Removing any of these would break the construction. For example, if $\phi(n) = 0$ at the level we picked, the pairing would vanish and we would have to move to another level $k$ with $v^{(k)} \neq 0$ and $\phi(k) > 0$. If $\phi$ has zeros, no such level need exist, which is why strict positivity of $\phi$ on $\mathbb{N} \cup \{0\}$ is built into the hypothesis.
Since $\gamma_1 \neq \gamma_2$ in $\mathcal{K}$ were arbitrary, $\mathcal{A}$ separates the points of $\mathcal{K}$.
[/guided]
[/step]
[step:Apply the Stone--Weierstrass theorem and conclude universality]
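The set $\mathcal{K}$ is compact in $\mathcal{C}_p$, and it is Hausdorff: $S$ is a continuous injection of $\mathcal{C}_p$ into the Hilbert space $T_\phi((V))$ (injectivity being the standing convention used in the previous step), so for $\gamma_1 \neq \gamma_2$ the preimages under $S$ of disjoint open neighbourhoods of $S(\gamma_1)$ and $S(\gamma_2)$ are disjoint open neighbourhoods of $\gamma_1$ and $\gamma_2$. Hence $\mathcal{K}$ is a compact Hausdorff space. The previous four steps establish: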
\begin{itemize}
\item $\mathcal{A} \subseteq C(\mathcal{K}, \mathbb{R})$ — continuous real-valued functions;
\item $\mathcal{A}$ is a real subalgebra (closed under sums, scalar multiplication, and pointwise products);
\item $\mathcal{A}$ contains the constant function $1$;
\item $\mathcal{A}$ separates points of $\mathcal{K}$.
\end{itemize}
By the [Stone--Weierstrass theorem](/theorems/???), $\mathcal{A}$ is dense in $C(\mathcal{K}, \mathbb{R})$ in the uniform norm.
Since $\mathcal{A} \subseteq \mathcal{H}_\phi|_\mathcal{K}$, it follows that $\mathcal{H}_\phi|_\mathcal{K}$ is also uniformly dense in $C(\mathcal{K})$. Since $\mathcal{K}$ was an arbitrary compact subset of $\mathcal{C}_p$, the [Universality Equivalence](/theorems/2518) implies $k_\phi$ is universal, completing the proof.
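In the simplest case the density statement reduces to the classical Weierstrass theorem, which makes it easy to illustrate. The following is a minimal sketch under assumptions that are not in the source: take $V = \mathbb{R}$ and the straight paths $\gamma_x(t) = tx$ for $x \in [0,1]$, so that $S(\gamma_x)^{(n)} = x^n/n!$ and, since $\phi(n) > 0$, the functions $x \mapsto k_\phi^h(\gamma_x)$ with $h \in T(V)$ are exactly the polynomials in $x$. Assuming the family $\{\gamma_x\}$ is compact in the chosen topology, uniform density of $\mathcal{A}$ says that every continuous function of $x$ is a uniform limit of polynomials. Bernstein polynomials provide one explicit approximating sequence; the target function and the helper names `bernstein` and `sup_error` are hypothetical.

```python
from math import comb

def bernstein(f, n, x):
    # Degree-n Bernstein polynomial of f on [0, 1], evaluated at x.
    return sum(f(k / n) * comb(n, k) * x**k * (1 - x)**(n - k)
               for k in range(n + 1))

def sup_error(f, n, grid=201):
    # Maximum of |f - B_n f| over an evenly spaced grid on [0, 1].
    xs = [i / (grid - 1) for i in range(grid)]
    return max(abs(f(x) - bernstein(f, n, x)) for x in xs)

def target(x):
    # A continuous but non-smooth function of the path parameter x.
    return abs(x - 0.5)

errors = {n: sup_error(target, n) for n in (5, 20, 80)}
for n, err in errors.items():
    print(n, err)
```

The grid error shrinks as the degree grows, which is the one-dimensional shadow of the density of $\mathcal{A}$ in $C(\mathcal{K})$. For general $V$ and general compact $\mathcal{K}$ no such explicit sequence is claimed here; Stone--Weierstrass only guarantees that approximants exist.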
[guided]
We assemble the four ingredients verified in the previous steps and feed them into the Stone--Weierstrass theorem, then chain back through the Universality Equivalence to conclude universality.
**Verifying the hypotheses of Stone--Weierstrass.** The [Stone--Weierstrass theorem](/theorems/???) asserts: if $X$ is a compact Hausdorff space and $\mathcal{A} \subseteq C(X, \mathbb{R})$ is a subalgebra that contains the constants and separates points, then $\mathcal{A}$ is uniformly dense in $C(X, \mathbb{R})$. We check each hypothesis on $X = \mathcal{K}$:
\begin{itemize}
\item *Compact Hausdorff space.* $\mathcal{K} \subset \mathcal{C}_p$ is compact by assumption. The path space $\mathcal{C}_p$ carries a topology (the standard $p$-variation topology, or any other for which $S$ is continuous) under which it is Hausdorff. Indeed, $S : \mathcal{C}_p \to T_\phi((V))$ is continuous and, by the standing convention for $\mathcal{C}_p$, injective into a Hausdorff Hilbert space; for $\gamma_1 \neq \gamma_2$, disjoint open neighbourhoods of $S(\gamma_1)$ and $S(\gamma_2)$ pull back under $S$ to disjoint open neighbourhoods of $\gamma_1$ and $\gamma_2$. Consequently the subspace $\mathcal{K}$ is Hausdorff as well.
\item *$\mathcal{A} \subseteq C(\mathcal{K}, \mathbb{R})$.* Verified in Step 2.
\item *$\mathcal{A}$ is a real subalgebra.* Closure under sums and real scalar multiplication follows from linearity of $h \mapsto k_\phi^h$; closure under pointwise products is the content of Step 3 via the $\phi$-shuffle identity. Together these make $\mathcal{A}$ a real subalgebra.
\item *$\mathcal{A}$ contains the constants.* Verified in Step 4 by exhibiting $\mathbf{1} \in V^{\otimes 0} \subset T(V)$ and noting that real scalar multiples are also in $\mathcal{A}$, so the constant function $1$ lies in $\mathcal{A}$.
\item *$\mathcal{A}$ separates points.* Verified in Step 5 using injectivity of $S$ and strict positivity of $\phi$.
\end{itemize}
All five hypotheses hold, so Stone--Weierstrass applies and yields
\begin{align*}
\overline{\mathcal{A}}^{\,\|\cdot\|_\infty} = C(\mathcal{K}, \mathbb{R}),
\end{align*}
i.e. $\mathcal{A}$ is uniformly dense in $C(\mathcal{K})$.
**Lifting to $\mathcal{H}_\phi|_\mathcal{K}$.** From the inclusion $\mathcal{A} \subseteq \mathcal{H}_\phi|_\mathcal{K}$ established in Step 1, any $f \in C(\mathcal{K})$ that can be uniformly approximated by elements of $\mathcal{A}$ can also be uniformly approximated by elements of $\mathcal{H}_\phi|_\mathcal{K}$ — the same approximant $a \in \mathcal{A}$ also lies in $\mathcal{H}_\phi|_\mathcal{K}$. Therefore $\mathcal{H}_\phi|_\mathcal{K}$ is uniformly dense in $C(\mathcal{K})$.
**Closing via Universality Equivalence.** The compact $\mathcal{K} \subset \mathcal{C}_p$ was chosen arbitrarily at the start of the argument, so $\mathcal{H}_\phi|_\mathcal{K}$ is uniformly dense in $C(\mathcal{K})$ for *every* compact $\mathcal{K} \subset \mathcal{C}_p$. By the [Universality Equivalence](/theorems/2518), in the direction from density to universality, it follows that $k_\phi$ is a universal kernel on $\mathcal{C}_p$. The proof is complete.
**Why the smaller algebra $\mathcal{A}$ rather than the full $\mathcal{H}_\phi|_\mathcal{K}$?** Stone--Weierstrass needs an *algebra*, and the family of functions generated by finite-degree tensors is the natural candidate because the pointwise product of two such functions is again of this form, by the $\phi$-shuffle identity (Step 3). The full RKHS $\mathcal{H}_\phi|_\mathcal{K}$ would also separate points but is not a priori closed under products, since the $\phi$-shuffle product of two infinite-degree elements may fail to lie in $T_\phi((V))$. Working with the smaller algebra is therefore not just convenient but essential. The chain of inclusions $\mathcal{A} \subseteq \mathcal{H}_\phi|_\mathcal{K} \subseteq C(\mathcal{K})$ then transfers density up the line: density of the bottom family in $C(\mathcal{K})$ implies density of the middle one, which by the Universality Equivalence is universality.
[/guided]
[/step]