[proofplan]
We work directly from the defining identity $T^*(g) = g \circ T$ for $g \in Y^*$. Linearity of $T^*$ on $Y^*$ follows from the pointwise definition and linearity of evaluation. The norm computation interchanges two suprema taken over the closed unit balls: $\|T^*\| = \sup_g \sup_x |g(Tx)| = \sup_x \sup_g |g(Tx)| = \sup_x \|Tx\|_Y = \|T\|$, where the third equality uses [Hahn-Banach (Normed Space)](/theorems/2629) part (ii), the existence of norming functionals, to identify $\|Tx\|_Y = \sup_{\|g\| \le 1} |g(Tx)|$. The algebraic properties (i)–(iii) are immediate calculations from the definition.
[/proofplan]
[step:Define $T^*$ and verify it maps $Y^*$ into $X^*$]
For $T \in \mathcal{L}(X, Y)$, define
\begin{align*}
T^*: Y^* &\to X^*, \\
g &\mapsto g \circ T.
\end{align*}
For $g \in Y^*$, the composition $g \circ T: X \to \mathbb{F}$ is linear (composition of linear maps) and bounded:
\begin{align*}
|(g \circ T)(x)| = |g(T(x))| \le \|g\|_{Y^*}\,\|T(x)\|_Y \le \|g\|_{Y^*}\,\|T\|\,\|x\|_X.
\end{align*}
Hence $g \circ T \in X^*$ with
\begin{align*}
\|T^*(g)\|_{X^*} = \|g \circ T\|_{X^*} \le \|g\|_{Y^*}\,\|T\|.
\end{align*}
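As a finite-dimensional illustration (the spaces, the matrix $A$, and the vector $b$ below are assumptions chosen for this sketch, not part of the theorem), take $X = Y = \mathbb{R}^2$ and $T(x) = Ax$. Every $g \in Y^*$ can be written as $g(y) = b^\top y$ for a unique $b \in \mathbb{R}^2$, and then
\begin{align*}
T^*(g)(x) = g(Ax) = b^\top A x = (A^\top b)^\top x,
\end{align*}
so under this identification $T^*$ acts as the transpose $A^\top$. For instance, with
\begin{align*}
A = \begin{pmatrix} 1 & -2 \\ 3 & 0 \end{pmatrix}, \qquad b = \begin{pmatrix} 1 \\ 1 \end{pmatrix},
\end{align*}
we get $g(Ax) = (x_1 - 2x_2) + 3x_1 = 4x_1 - 2x_2$, which matches $A^\top b = (4, -2)^\top$.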
[/step]
[step:Verify linearity of $T^*: Y^* \to X^*$]
For $g, h \in Y^*$ and scalars $\lambda, \mu \in \mathbb{F}$, evaluate at any $x \in X$:
\begin{align*}
T^*(\lambda g + \mu h)(x) &= ((\lambda g + \mu h) \circ T)(x) \\
&= (\lambda g + \mu h)(T(x)) \\
&= \lambda\, g(T(x)) + \mu\, h(T(x)) \\
&= \lambda\, T^*(g)(x) + \mu\, T^*(h)(x) \\
&= \bigl(\lambda T^*(g) + \mu T^*(h)\bigr)(x).
\end{align*}
Since this holds for every $x \in X$, $T^*(\lambda g + \mu h) = \lambda T^*(g) + \mu T^*(h)$ in $X^*$. Thus $T^*$ is linear.
Combined with the bound from the previous step, $T^* \in \mathcal{L}(Y^*, X^*)$ with $\|T^*\| \le \|T\|$.
[/step]
[step:Compute $\|T(x)\|_Y$ as a supremum over norming functionals via Hahn-Banach]
Let $B_{Y^*} := \{g \in Y^* : \|g\|_{Y^*} \le 1\}$ denote the closed unit ball in $Y^*$. We claim that for every $y \in Y$,
\begin{align*}
\|y\|_Y = \sup_{g \in B_{Y^*}} |g(y)|.
\end{align*}
The inequality $\sup_{g \in B_{Y^*}} |g(y)| \le \|y\|_Y$ is immediate from $|g(y)| \le \|g\|_{Y^*}\|y\|_Y \le \|y\|_Y$ for $g \in B_{Y^*}$.
For the reverse inequality, if $y = 0$ both sides are zero. If $y \ne 0$, [Hahn-Banach (Normed Space)](/theorems/2629) part (ii), applied to the normed space $Y$ and the vector $y \in Y \setminus \{0\}$, yields a functional $g_y \in Y^*$ with $\|g_y\|_{Y^*} = 1$ and $g_y(y) = \|y\|_Y$. The hypotheses of theorem 2629(ii) require only that $Y$ be a real or complex normed space, which it is. Then $g_y \in B_{Y^*}$ and $|g_y(y)| = \|y\|_Y$, so $\sup_{g \in B_{Y^*}} |g(y)| \ge \|y\|_Y$. Combining the two inequalities gives equality.
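To make the norming functional concrete, here is a small sketch in a space chosen purely for illustration: $Y = (\mathbb{R}^2, \|\cdot\|_1)$ and $y = (3, -4)$, so that $\|y\|_1 = 7$. Define $g_y(z) := z_1 - z_2$, using the signs of the coordinates of $y$. Then
\begin{align*}
|g_y(z)| &\le |z_1| + |z_2| = \|z\|_1 \quad \text{for all } z \in \mathbb{R}^2, \\
g_y(y) &= 3 - (-4) = 7 = \|y\|_1.
\end{align*}
The first line gives $\|g_y\|_{Y^*} \le 1$ and the second forces $\|g_y\|_{Y^*} \ge 1$, so $g_y$ is a norming functional for $y$. In this finite-dimensional example $g_y$ can be written down by hand; in a general normed space its existence is exactly what theorem 2629(ii) supplies.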
[guided]
The bound $\|T^*\| \le \|T\|$ established in the previous steps is straightforward; the harder direction is $\|T^*\| \ge \|T\|$, equivalently $\|T\| \le \|T^*\|$. For this we need to control $\|T(x)\|_Y$ from above by the values $|g(T(x))|$ of *functionals on $Y$*. The bridge is the dual representation of the norm.
The fact $\|y\|_Y = \sup_{\|g\|_{Y^*} \le 1} |g(y)|$ is a direct consequence of [Hahn-Banach (Normed Space)](/theorems/2629) part (ii). Given $y \ne 0$, the theorem produces a *norming functional* $g_y$ — a functional that achieves both $\|g_y\|_{Y^*} = 1$ and $g_y(y) = \|y\|_Y$, *witnessing* the supremum. Theorem 2629(ii) requires only that $Y$ be a normed space and $y \in Y \setminus \{0\}$, both of which hold here.
Without Hahn-Banach, the supremum $\sup_g |g(y)|$ might fall short of $\|y\|_Y$: a priori the dual $Y^*$ could fail to "see" $y$ at full size. Hahn-Banach guarantees there are enough functionals to recover the norm.
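One way to see how a norming functional delivers the harder direction is the following direct estimate, given here as a sketch that runs parallel to the supremum argument. Fix $x \in X$ with $T(x) \ne 0$ and let $g := g_{T(x)}$ be a norming functional for $T(x)$. Then
\begin{align*}
\|T(x)\|_Y = g(T(x)) = T^*(g)(x) \le \|T^*(g)\|_{X^*}\,\|x\|_X \le \|T^*\|\,\|g\|_{Y^*}\,\|x\|_X = \|T^*\|\,\|x\|_X.
\end{align*}
The same bound holds trivially when $T(x) = 0$, and taking the supremum over $\|x\|_X \le 1$ gives $\|T\| \le \|T^*\|$.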
[/guided]
[/step]
[step:Compute $\|T^*\|$ by interchanging suprema]
Let $B_X := \{x \in X : \|x\|_X \le 1\}$ denote the closed unit ball in $X$. We compute, using the dual norm representation from the previous step in the fifth equality:
\begin{align*}
\|T^*\| &= \sup_{g \in B_{Y^*}} \|T^*(g)\|_{X^*} \\
&= \sup_{g \in B_{Y^*}} \sup_{x \in B_X} |T^*(g)(x)| \\
&= \sup_{g \in B_{Y^*}} \sup_{x \in B_X} |g(T(x))| \\
&= \sup_{x \in B_X} \sup_{g \in B_{Y^*}} |g(T(x))| \\
&= \sup_{x \in B_X} \|T(x)\|_Y \\
&= \|T\|,
\end{align*}
where:
- The first equality is the definition of the operator norm of $T^*: Y^* \to X^*$.
- The second equality is the definition of the dual norm $\|T^*(g)\|_{X^*} = \sup_{x \in B_X} |T^*(g)(x)|$.
- The third equality is the definition $T^*(g)(x) = g(T(x))$.
- The fourth equality interchanges the two suprema (which is justified for *any* function $\Phi: B_{Y^*} \times B_X \to [0, \infty)$, since both iterated suprema equal $\sup_{(g, x)} \Phi(g, x)$).
- The fifth equality is the previous step applied to $y = T(x) \in Y$.
- The sixth equality is the definition of $\|T\|$.
Combined with $\|T^*\| \le \|T\|$ from earlier, we have $\|T^*\| = \|T\|$, so the map $*: \mathcal{L}(X, Y) \to \mathcal{L}(Y^*, X^*)$, $T \mapsto T^*$, is isometric.
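As a numerical sanity check, consider the following sketch, in which the choice of spaces and of the matrix $A$ is an assumption made only for illustration. Take $X = Y = (\mathbb{R}^2, \|\cdot\|_1)$ and $T(x) = Ax$. Identifying each functional with a vector $b$ via $g(y) = b^\top y$, the dual norm is $\|\cdot\|_\infty$ and $T^*$ acts as $A^\top$. The standard formulas for induced matrix norms (maximum absolute column sum for the $1$-norm, maximum absolute row sum for the $\infty$-norm) give
\begin{align*}
A &= \begin{pmatrix} 1 & -2 \\ 3 & 0 \end{pmatrix}, & \|T\| &= \max\{|1| + |3|,\; |-2| + |0|\} = 4, \\
A^\top &= \begin{pmatrix} 1 & 3 \\ -2 & 0 \end{pmatrix}, & \|T^*\| &= \max\{|1| + |3|,\; |-2| + |0|\} = 4.
\end{align*}
The columns of $A$ are the rows of $A^\top$, which is why the two maxima coincide.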
[guided]
The norm equality is a six-line calculation whose pivotal move is the swap of suprema. The swap itself is elementary: two iterated suprema of a non-negative function always agree, since both are equal to the joint supremum
\begin{align*}
\sup_{(g, x) \in B_{Y^*} \times B_X} |g(T(x))|.
\end{align*}
This is what makes the proof work: *both* orders of iteration yield this same joint supremum.
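For completeness, the agreement can be checked by two inequalities. Write $\Phi(g, x) := |g(T(x))|$ and $M := \sup_{(g, x) \in B_{Y^*} \times B_X} \Phi(g, x)$ (the symbols $\Phi$ and $M$ are shorthand introduced here). Then
\begin{align*}
\sup_{x \in B_X} \Phi(g, x) &\le M \quad \text{for every } g \in B_{Y^*}, & &\text{so} \quad \sup_{g \in B_{Y^*}} \sup_{x \in B_X} \Phi(g, x) \le M, \\
\Phi(g, x) &\le \sup_{g' \in B_{Y^*}} \sup_{x' \in B_X} \Phi(g', x') \quad \text{for every } (g, x), & &\text{so} \quad M \le \sup_{g \in B_{Y^*}} \sup_{x \in B_X} \Phi(g, x).
\end{align*}
Exchanging the roles of $g$ and $x$ gives the same conclusion for the other order of iteration.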
Reading the chain from the top: starting from $\|T^*\| = \sup_g \|T^*(g)\|_{X^*}$, expanding $\|T^*(g)\|_{X^*}$ as a supremum over $x \in B_X$, and unfolding $T^*(g)(x) = g(T(x))$, we have a double supremum over $g$ then $x$. Reading from the bottom: starting from $\|T\| = \sup_x \|T(x)\|_Y$, expanding $\|T(x)\|_Y = \sup_g |g(T(x))|$ via Hahn-Banach (previous step), we have a double supremum over $x$ then $g$. The swap in the middle bridges the two perspectives.
Without Hahn-Banach the bottom expansion would fail and we could only conclude $\|T^*\| \le \|T\|$.
[/guided]
[/step]
[step:Verify the algebraic properties (i), (ii), (iii)]
**(i) $(\operatorname{id}_X)^* = \operatorname{id}_{X^*}$.** For any $g \in X^*$ and any $x \in X$,
\begin{align*}
(\operatorname{id}_X)^*(g)(x) = g(\operatorname{id}_X(x)) = g(x),
\end{align*}
so $(\operatorname{id}_X)^*(g) = g = \operatorname{id}_{X^*}(g)$. Hence $(\operatorname{id}_X)^* = \operatorname{id}_{X^*}$.
**(ii) $(\lambda S + \mu T)^* = \lambda S^* + \mu T^*$.** Let $S, T \in \mathcal{L}(X, Y)$ and $\lambda, \mu \in \mathbb{F}$. For $g \in Y^*$ and $x \in X$,
\begin{align*}
(\lambda S + \mu T)^*(g)(x) &= g\bigl((\lambda S + \mu T)(x)\bigr) \\
&= g\bigl(\lambda S(x) + \mu T(x)\bigr) \\
&= \lambda\, g(S(x)) + \mu\, g(T(x)) \\
&= \lambda\, S^*(g)(x) + \mu\, T^*(g)(x) \\
&= (\lambda S^* + \mu T^*)(g)(x).
\end{align*}
Since this holds for every $g$ and every $x$, $(\lambda S + \mu T)^* = \lambda S^* + \mu T^*$.
**(iii) $(S \circ T)^* = T^* \circ S^*$.** Let $T \in \mathcal{L}(X, Y)$, $S \in \mathcal{L}(Y, Z)$. Then $S \circ T \in \mathcal{L}(X, Z)$. For $h \in Z^*$ and $x \in X$,
\begin{align*}
(S \circ T)^*(h)(x) &= h\bigl((S \circ T)(x)\bigr) \\
&= h(S(T(x))) \\
&= S^*(h)(T(x)) \\
&= T^*(S^*(h))(x) \\
&= (T^* \circ S^*)(h)(x).
\end{align*}
Since this holds for every $h \in Z^*$ and every $x \in X$, $(S \circ T)^* = T^* \circ S^*$ as maps $Z^* \to X^*$.
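The order reversal in (iii) can be seen in a small matrix sketch. The setting is an assumption made for illustration only: $X = Y = Z = \mathbb{R}^2$, $T(x) = Ax$, $S(y) = By$, with functionals identified with vectors via $g(y) = b^\top y$ so that adjoints act as transposes. With
\begin{align*}
A = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}, \qquad B = \begin{pmatrix} 1 & 0 \\ 3 & 1 \end{pmatrix}, \qquad BA = \begin{pmatrix} 1 & 2 \\ 3 & 7 \end{pmatrix},
\end{align*}
we find
\begin{align*}
(BA)^\top = \begin{pmatrix} 1 & 3 \\ 2 & 7 \end{pmatrix} = A^\top B^\top, \qquad \text{whereas} \qquad B^\top A^\top = \begin{pmatrix} 7 & 3 \\ 2 & 1 \end{pmatrix}.
\end{align*}
So $(S \circ T)^*$ corresponds to $A^\top B^\top$, that is, to $T^* \circ S^*$, and not to $S^* \circ T^*$.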
[/step]
[step:Assemble the conclusion]
The first two steps show $T^* \in \mathcal{L}(Y^*, X^*)$ with $\|T^*\| \le \|T\|$. Steps three and four establish the isometry $\|T^*\| = \|T\|$, using [Hahn-Banach (Normed Space)](/theorems/2629) part (ii) to identify $\|y\|_Y = \sup_{g \in B_{Y^*}} |g(y)|$. The fifth step verifies the three algebraic properties (i), (ii), (iii). Together these establish all conclusions of the theorem.
[/step]