[proofplan]
We construct the inverse to $\Phi$ explicitly using the canonical inclusions of the homogeneous components into $T(V)$ and the canonical projections of $T(V)$ onto them. The argument has three logical parts: (i) $\Phi$ is well-defined and linear because every $v \in T(V)$ has only finitely many non-zero homogeneous components, so $\Phi_f(v)$ is a finite sum; (ii) $\Phi$ is injective because the values of $\Phi_f$ on the homogeneous subspaces $V^{\otimes k} \subseteq T(V)$ recover $f_k$ for every $k$; (iii) $\Phi$ is surjective because every $\phi \in T(V)^*$ is determined by its restrictions $\phi_k := \phi|_{V^{\otimes k}}$, and assembling the tuple $(\phi_k)_{k \in \mathbb{N}_0}$ produces a preimage in $T((V^*))$. The canonical identification $(V^*)^{\otimes k} \cong (V^{\otimes k})^*$ is used at each level to regard the functionals $\phi_k$ as components of a formal series.
[/proofplan]
[step:Set up notation for the homogeneous decomposition of $T(V)$ and $T((V^*))$]
Recall the structures involved. The tensor algebra $T(V) = \bigoplus_{k \in \mathbb{N}_0} V^{\otimes k}$ is the algebraic direct sum: every $v \in T(V)$ has a unique decomposition $v = \sum_{k \in A} v_k$ with $A \subseteq \mathbb{N}_0$ finite and $v_k \in V^{\otimes k}$. The formal series space $T((V^*)) = \prod_{k \in \mathbb{N}_0} (V^*)^{\otimes k}$ is the direct product: an element $f = \sum_{k \in \mathbb{N}_0} f_k$ is an arbitrary sequence with $f_k \in (V^*)^{\otimes k}$.
For each $k \in \mathbb{N}_0$ we have the canonical inclusion
\begin{align*}
\mathfrak{i}_k: V^{\otimes k} &\hookrightarrow T(V) \\
w &\mapsto w \quad (\text{embedded in degree } k)
\end{align*}
and the canonical projection
\begin{align*}
\pi_k: T(V) &\to V^{\otimes k} \\
\sum_{j \in A} v_j &\mapsto \begin{cases} v_k & \text{if } k \in A \\ 0 & \text{else.} \end{cases}
\end{align*}
These satisfy $\pi_k \circ \mathfrak{i}_k = \operatorname{id}_{V^{\otimes k}}$ and $\pi_j \circ \mathfrak{i}_k = 0$ for $j \neq k$.
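As a concrete illustration (the choice $V = \mathbb{R}^2$ with basis $e_1, e_2$ is an assumption made only for this example), take $v = 3 + 2 e_1 - e_1 \otimes e_2 \in T(V)$, whose support is $A = \{0, 1, 2\}$. The projections read off the homogeneous components:
\begin{align*}
\pi_0(v) = 3, \qquad \pi_1(v) = 2 e_1, \qquad \pi_2(v) = - e_1 \otimes e_2, \qquad \pi_k(v) = 0 \quad (k \geq 3),
\end{align*}
and reassembling them with the inclusions returns $v = \mathfrak{i}_0(3) + \mathfrak{i}_1(2 e_1) + \mathfrak{i}_2(- e_1 \otimes e_2)$.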
For every $k$, the canonical map
\begin{align*}
\iota_k: (V^*)^{\otimes k} &\to (V^{\otimes k})^* \\
\xi^1 \otimes \cdots \otimes \xi^k &\mapsto \bigl(w_1 \otimes \cdots \otimes w_k \mapsto \xi^1(w_1) \cdots \xi^k(w_k)\bigr)
\end{align*}
is injective, and it is an isomorphism when $V$ is finite-dimensional (this is the standard pairing between tensor powers of dual spaces and duals of tensor powers; for finite-dimensional $V$ it is finite-dimensional duality, while in general we work with the image of $\iota_k$, which is the identification fixed in the theorem statement). Throughout the proof we identify $f_k \in (V^*)^{\otimes k}$ with $\iota_k(f_k) \in (V^{\otimes k})^*$ without further mention.
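To illustrate the pairing, assume for this example only that $V = \mathbb{R}^2$ with basis $e_1, e_2$ and dual basis $\varepsilon^1, \varepsilon^2$. Evaluating $\iota_2(\varepsilon^1 \otimes \varepsilon^2)$ on two elementary tensors gives
\begin{align*}
\iota_2(\varepsilon^1 \otimes \varepsilon^2)(e_1 \otimes e_2) &= \varepsilon^1(e_1)\, \varepsilon^2(e_2) = 1, \\
\iota_2(\varepsilon^1 \otimes \varepsilon^2)(e_2 \otimes e_1) &= \varepsilon^1(e_2)\, \varepsilon^2(e_1) = 0.
\end{align*}
In degree $0$, under the usual convention $V^{\otimes 0} = \mathbb{R} = (V^*)^{\otimes 0}$, the map $\iota_0$ sends a scalar $c$ to the functional $w \mapsto c\, w$.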
[/step]
[step:Verify that $\Phi_f: T(V) \to \mathbb{R}$ is a well-defined linear functional]
Fix $f = \sum_{k \in \mathbb{N}_0} f_k \in T((V^*))$. For any $v = \sum_{k \in A} v_k \in T(V)$ with $A$ finite, the expression
\begin{align*}
\Phi_f(v) = \sum_{k \in A} f_k(v_k)
\end{align*}
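is a finite sum of scalars, hence well-defined. Since $v_k = \pi_k(v)$ for $k \in A$ and $\pi_k(v) = 0$ otherwise, the same value can be written as $\Phi_f(v) = \sum_{k \in \mathbb{N}_0} f_k(\pi_k(v))$, in which all but finitely many terms vanish. Linearity follows because each homogeneous projection $\pi_k: T(V) \to V^{\otimes k}$ is linear and each $f_k: V^{\otimes k} \to \mathbb{R}$ is linear (as an element of $(V^{\otimes k})^*$ after identification):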
\begin{align*}
\Phi_f(\alpha u + \beta v) = \sum_{k \in \mathbb{N}_0} f_k(\pi_k(\alpha u + \beta v)) = \alpha \sum_k f_k(\pi_k(u)) + \beta \sum_k f_k(\pi_k(v)) = \alpha \Phi_f(u) + \beta \Phi_f(v),
\end{align*}
where each sum is finite because only finitely many $\pi_k(u)$ and $\pi_k(v)$ are non-zero. Thus $\Phi_f \in T(V)^*$.
The assignment $f \mapsto \Phi_f$ is itself linear: for $f, g \in T((V^*))$, $\alpha, \beta \in \mathbb{R}$, and $v = \sum_{k \in A} v_k \in T(V)$,
\begin{align*}
\Phi_{\alpha f + \beta g}(v) = \sum_{k \in A} (\alpha f_k + \beta g_k)(v_k) = \alpha \sum_{k \in A} f_k(v_k) + \beta \sum_{k \in A} g_k(v_k) = \alpha \Phi_f(v) + \beta \Phi_g(v).
\end{align*}
So $\Phi: T((V^*)) \to T(V)^*$ is a linear map.
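A small worked evaluation may help; the data here are assumptions chosen only for illustration. Let $V = \mathbb{R}^2$ with basis $e_1, e_2$ and dual basis $\varepsilon^1, \varepsilon^2$, let $f \in T((V^*))$ have components $f_k = (\varepsilon^1)^{\otimes k}$ for all $k$ (so $f_0 = 1$ and infinitely many components are non-zero), and let $v = 3 + 2 e_1 - e_1 \otimes e_2$. Then
\begin{align*}
\Phi_f(v) &= f_0(3) + f_1(2 e_1) + f_2(- e_1 \otimes e_2) \\
&= 1 \cdot 3 + 2\, \varepsilon^1(e_1) - \varepsilon^1(e_1)\, \varepsilon^1(e_2) \\
&= 3 + 2 - 0 = 5.
\end{align*}
Although $f$ is a genuinely infinite series, only the three components indexed by the support $\{0, 1, 2\}$ of $v$ enter the computation.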
[guided]
We need to check three things: that the formula defining $\Phi_f(v)$ makes sense, that $\Phi_f$ is linear in $v$, and that $\Phi$ itself is linear in $f$. The first is the most important — why does the formula not produce an infinite sum that might fail to converge?
The answer is structural. The space $T(V) = \bigoplus_k V^{\otimes k}$ is defined as the **direct sum**, not the direct product. By definition of direct sum, every element has only finitely many non-zero components. So when we expand $v = \sum_{k \in A} v_k$ with $A \subseteq \mathbb{N}_0$ finite, we are not making a choice — we are writing down the canonical decomposition. The sum $\Phi_f(v) = \sum_{k \in A} f_k(v_k)$ is therefore literally finite. This contrasts sharply with $T((V^*)) = \prod_k (V^*)^{\otimes k}$, which is the direct **product** and allows arbitrary sequences $(f_k)_{k \in \mathbb{N}_0}$. The asymmetry is exactly why the duality holds: dualizing a direct sum produces a direct product.
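The last statement is an instance of a general fact about vector spaces. Writing $(W_k)_k$ for an arbitrary family of vector spaces (notation introduced only for this remark), restriction to the summands gives an isomorphism
\begin{align*}
\Bigl(\bigoplus_{k} W_k\Bigr)^* &\xrightarrow{\ \cong\ } \prod_{k} W_k^* \\
\phi &\mapsto \bigl(\phi|_{W_k}\bigr)_k.
\end{align*}
With $W_k = V^{\otimes k}$ this reads $T(V)^* \cong \prod_k (V^{\otimes k})^*$, so the remaining work in the theorem is the comparison of $(V^{\otimes k})^*$ with $(V^*)^{\otimes k}$ through $\iota_k$.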
Now linearity. Fix $f$ and consider $\Phi_f: T(V) \to \mathbb{R}$. For $u = \sum_{j \in B} u_j$ and $v = \sum_{k \in A} v_k$ in $T(V)$, the sum $\alpha u + \beta v$ has homogeneous decomposition
\begin{align*}
\alpha u + \beta v = \sum_{m \in A \cup B} (\alpha \pi_m(u) + \beta \pi_m(v)),
\end{align*}
and only finitely many indices contribute. Applying $\Phi_f$ and using linearity of each $f_m$:
\begin{align*}
\Phi_f(\alpha u + \beta v) = \sum_{m \in A \cup B} f_m(\alpha \pi_m(u) + \beta \pi_m(v)) = \alpha \sum_m f_m(\pi_m(u)) + \beta \sum_m f_m(\pi_m(v)) = \alpha \Phi_f(u) + \beta \Phi_f(v).
\end{align*}
The split is justified term-by-term: each $f_m$ is linear, and the sum has finitely many terms.
Linearity of $\Phi$ in $f$ is a separate check. For fixed $v$, each term of $\Phi_f(v) = \sum_{k \in A} f_k(v_k)$ is linear in $f_k$, since $(\alpha f_k + \beta g_k)(v_k) = \alpha f_k(v_k) + \beta g_k(v_k)$ by the definition of the vector-space operations on functionals, and only finitely many $f_k$ enter for any fixed $v$. Hence $\Phi_{\alpha f + \beta g} = \alpha \Phi_f + \beta \Phi_g$ as functionals.
[/guided]
[/step]
[step:Prove injectivity by recovering $f_k$ from $\Phi_f$ via the canonical inclusion $\mathfrak{i}_k$]
Suppose $\Phi_f = \Phi_g$ in $T(V)^*$. We show $f = g$, equivalently $f_k = g_k$ for every $k \in \mathbb{N}_0$.
Fix $k \in \mathbb{N}_0$ and $w \in V^{\otimes k}$. The element $\mathfrak{i}_k(w) \in T(V)$ has homogeneous decomposition supported on the singleton $\{k\}$: its degree-$k$ component is $w$ and all other components vanish. Therefore
\begin{align*}
\Phi_f(\mathfrak{i}_k(w)) = f_k(w) \qquad \text{and} \qquad \Phi_g(\mathfrak{i}_k(w)) = g_k(w).
\end{align*}
The hypothesis $\Phi_f = \Phi_g$ gives $f_k(w) = g_k(w)$ for all $w \in V^{\otimes k}$. Under the identification $\iota_k: (V^*)^{\otimes k} \to (V^{\otimes k})^*$, equality of $f_k$ and $g_k$ as functionals on $V^{\otimes k}$ corresponds (via the injectivity of $\iota_k$) to equality $f_k = g_k$ in $(V^*)^{\otimes k}$. Since $k$ was arbitrary, $f = g$ in $T((V^*))$. Hence $\ker \Phi = \{0\}$ and $\Phi$ is injective.
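Read contrapositively, the argument says that two series differing in some degree are separated by a probe in that degree. As an illustrative sketch (all data are assumptions for this example), let $V = \mathbb{R}^2$ with basis $e_1, e_2$ and dual basis $\varepsilon^1, \varepsilon^2$, let $f_k = (\varepsilon^1)^{\otimes k}$ for all $k$, and let $g$ agree with $f$ except that $g_2 = \varepsilon^1 \otimes \varepsilon^2$. Probing with $w = e_1 \otimes e_2 \in V^{\otimes 2}$ gives
\begin{align*}
\Phi_f(\mathfrak{i}_2(w)) &= f_2(e_1 \otimes e_2) = \varepsilon^1(e_1)\, \varepsilon^1(e_2) = 0, \\
\Phi_g(\mathfrak{i}_2(w)) &= g_2(e_1 \otimes e_2) = \varepsilon^1(e_1)\, \varepsilon^2(e_2) = 1,
\end{align*}
so $\Phi_f \neq \Phi_g$, even though $f$ and $g$ coincide in every degree other than $2$.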
[guided]
We want to show that $\Phi$ has trivial kernel: if $\Phi_f = \Phi_g$ then $f = g$. The element $f - g \in T((V^*))$ has homogeneous components $(f_k - g_k)_{k \in \mathbb{N}_0}$, and $f = g$ means $f_k = g_k$ for every $k$. So we need a way to **isolate the $k$-th component** $f_k$ from the data of $\Phi_f$ on all of $T(V)$.
The trick is to feed $\Phi_f$ a test element whose homogeneous decomposition is supported on a single degree. The canonical inclusion $\mathfrak{i}_k: V^{\otimes k} \hookrightarrow T(V)$ from Step 1 produces exactly such elements: for any $w \in V^{\otimes k}$, the element $\mathfrak{i}_k(w) \in T(V)$ has its degree-$k$ component equal to $w$ and all other components zero.
Applying $\Phi_f$ to this test element collapses the defining sum to a single term:
\begin{align*}
\Phi_f(\mathfrak{i}_k(w)) = \sum_{j \in \mathbb{N}_0} f_j(\pi_j(\mathfrak{i}_k(w))) = f_k(\pi_k(\mathfrak{i}_k(w))) = f_k(w),
\end{align*}
where the second equality uses the orthogonality $\pi_j \circ \mathfrak{i}_k = \delta_{jk} \operatorname{id}$ from Step 1, and only the $j = k$ term survives. By the same calculation $\Phi_g(\mathfrak{i}_k(w)) = g_k(w)$.
Now the hypothesis $\Phi_f = \Phi_g$ in $T(V)^*$ means $\Phi_f(v) = \Phi_g(v)$ for **every** $v \in T(V)$. Specialising $v = \mathfrak{i}_k(w)$,
\begin{align*}
f_k(w) = \Phi_f(\mathfrak{i}_k(w)) = \Phi_g(\mathfrak{i}_k(w)) = g_k(w) \qquad \forall w \in V^{\otimes k}.
\end{align*}
This gives equality of $\iota_k(f_k)$ and $\iota_k(g_k)$ as elements of $(V^{\otimes k})^*$. Since the canonical map $\iota_k: (V^*)^{\otimes k} \to (V^{\otimes k})^*$ is injective (and an isomorphism in finite dimensions), it follows that $f_k = g_k$ as elements of $(V^*)^{\otimes k}$.
Since $k \in \mathbb{N}_0$ was arbitrary, every homogeneous component of $f - g$ vanishes, so $f = g$ in $T((V^*))$. Equivalently, $\ker \Phi = \{0\}$, so $\Phi$ is injective.
The argument exploits exactly the duality between direct sum and direct product: the direct-sum structure on $T(V)$ provides the "probes" $\mathfrak{i}_k(w)$ that can isolate a single homogeneous component, while the direct-product structure on $T((V^*))$ allows arbitrary tuples $(f_k)$ to be assembled. If we tried to invert $\Phi$ on the dual side (i.e., produce $f$ from $\Phi_f$), we would need exactly this kind of probe — which is what Step 4 will exploit for surjectivity.
[/guided]
[/step]
[step:Prove surjectivity by constructing a preimage from the homogeneous restrictions of $\phi$]
Let $\phi \in T(V)^*$. We construct $f \in T((V^*))$ with $\Phi_f = \phi$.
For each $k \in \mathbb{N}_0$, define
\begin{align*}
\phi_k: V^{\otimes k} &\to \mathbb{R} \\
w &\mapsto \phi(\mathfrak{i}_k(w)).
\end{align*}
This is a composition of the linear map $\mathfrak{i}_k$ with the linear functional $\phi$, hence $\phi_k \in (V^{\otimes k})^*$. Using $\iota_k^{-1}$ (or, more precisely, identifying $(V^{\otimes k})^*$ with $(V^*)^{\otimes k}$ as in the theorem statement), regard $\phi_k$ as an element of $(V^*)^{\otimes k}$ and define
\begin{align*}
f := \sum_{k \in \mathbb{N}_0} \phi_k \in T((V^*)).
\end{align*}
This is a well-formed element of the direct product $\prod_k (V^*)^{\otimes k}$.
We claim $\Phi_f = \phi$. Take any $v = \sum_{k \in A} v_k \in T(V)$ with $A$ finite. Since $v = \sum_{k \in A} \mathfrak{i}_k(v_k)$ by the definition of the homogeneous decomposition, linearity of $\phi$ gives
\begin{align*}
\phi(v) = \phi\Bigl(\sum_{k \in A} \mathfrak{i}_k(v_k)\Bigr) = \sum_{k \in A} \phi(\mathfrak{i}_k(v_k)) = \sum_{k \in A} \phi_k(v_k) = \sum_{k \in A} f_k(v_k) = \Phi_f(v).
\end{align*}
Since $v$ was arbitrary, $\phi = \Phi_f$. Hence $\Phi$ is surjective.
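The construction can be traced on a simple example; the setting is an assumption made for illustration. Let $V = \mathbb{R}$ with basis vector $e$ and dual basis vector $\varepsilon$, so that $V^{\otimes k}$ is spanned by $e^{\otimes k}$, and let $\phi \in T(V)^*$ be the functional that sums coefficients, $\phi\bigl(\sum_{k \in A} a_k e^{\otimes k}\bigr) = \sum_{k \in A} a_k$. Then
\begin{align*}
\phi_k(e^{\otimes k}) &= \phi(\mathfrak{i}_k(e^{\otimes k})) = 1 = \varepsilon(e)^k, \qquad \text{so} \qquad \phi_k = \varepsilon^{\otimes k} \text{ under } \iota_k, \\
f &= \sum_{k \in \mathbb{N}_0} \varepsilon^{\otimes k}, \\
\Phi_f\Bigl(\sum_{k \in A} a_k e^{\otimes k}\Bigr) &= \sum_{k \in A} a_k\, \varepsilon(e)^k = \sum_{k \in A} a_k = \phi\Bigl(\sum_{k \in A} a_k e^{\otimes k}\Bigr).
\end{align*}
Every component of this preimage is non-zero, so $f$ lies in the direct product $T((V^*))$ but not in the direct sum $T(V^*)$; the product is needed for surjectivity.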
[guided]
The strategy is to invert $\Phi$ by hand. Given $\phi \in T(V)^*$, we ask: what should $f_k$ be? The formula for $\Phi$ tells us that $\Phi_f(\mathfrak{i}_k(w)) = f_k(w)$ for every $w \in V^{\otimes k}$. So if $\Phi_f$ is to equal $\phi$, we must have $f_k(w) = \phi(\mathfrak{i}_k(w))$. This forces our hand: define $\phi_k := \phi \circ \mathfrak{i}_k \in (V^{\otimes k})^*$.
Two things now need to happen. First, we must check that $\phi_k$ lives in the right space: we need it as an element of $(V^*)^{\otimes k}$ to assemble it into $T((V^*)) = \prod_k (V^*)^{\otimes k}$. The canonical map $\iota_k: (V^*)^{\otimes k} \to (V^{\otimes k})^*$ from Step 1 lets us pass between the two; we use $\iota_k^{-1}$ to pull $\phi_k$ back from $(V^{\otimes k})^*$ to $(V^*)^{\otimes k}$. (When $V$ is finite-dimensional, $\iota_k$ is an isomorphism and nothing more needs to be said. When $V$ is infinite-dimensional, $\iota_k$ may fail to be surjective; the convention adopted in the theorem is to identify $(V^*)^{\otimes k}$ with the image of $\iota_k$ and to consider only those $\phi_k \in (V^{\otimes k})^*$ that lie in this image, which suffices because the surjectivity claim concerns precisely that image.)
Second, we must verify that the resulting $f = \sum_k \phi_k$ does map back to $\phi$. This is where the direct-sum structure of $T(V)$ pays off. Take any $v \in T(V)$. Its decomposition $v = \sum_{k \in A} v_k$ has only finitely many non-zero pieces, and each piece is the image under $\mathfrak{i}_k$ of $v_k \in V^{\otimes k}$. So $v = \sum_{k \in A} \mathfrak{i}_k(v_k)$ as elements of $T(V)$. Applying $\phi$ and using its linearity (now applied to a finite sum, no convergence issues),
\begin{align*}
\phi(v) = \sum_{k \in A} \phi(\mathfrak{i}_k(v_k)) = \sum_{k \in A} \phi_k(v_k).
\end{align*}
The right-hand side is exactly $\Phi_f(v)$ by the definition of $\Phi$. Why does this argument fail for $T((V))$ in place of $T(V)$? Because elements of $T((V))$ would have infinitely many non-zero components, and we would not be able to split $\phi$ across the sum without a convergence theory. The asymmetry between $T(V)$ (direct sum) and $T((V^*))$ (direct product) is essential.
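The failure can be made concrete with a one-dimensional sketch (an assumed example, not part of the theorem). Let $V = \mathbb{R}$ with basis vector $e$ and dual basis vector $\varepsilon$, take $f = \sum_{k \in \mathbb{N}_0} \varepsilon^{\otimes k} \in T((V^*))$, and consider the formal series $v = \sum_{k \in \mathbb{N}_0} e^{\otimes k} \in T((V))$. The would-be pairing is
\begin{align*}
\sum_{k \in \mathbb{N}_0} f_k(v_k) = \sum_{k \in \mathbb{N}_0} \varepsilon(e)^k = \sum_{k \in \mathbb{N}_0} 1,
\end{align*}
which diverges. For $v \in T(V)$ the same sum runs over a finite support $A$ and no such problem can occur.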
[/guided]
[/step]
[step:Conclude that $\Phi$ is an isomorphism]
Step 2 establishes that $\Phi: T((V^*)) \to T(V)^*$ is a well-defined linear map. Step 3 shows $\Phi$ is injective. Step 4 shows $\Phi$ is surjective. A bijective linear map is an isomorphism of vector spaces, completing the proof.
[/step]