[proofplan]
The proof establishes the equivalence of three characterisations of tempered distributions. The implication (1) $\Rightarrow$ (2) is immediate from general topology. The implication (2) $\Rightarrow$ (3) is the substantive direction: a contrapositive argument constructs, from the failure of every seminorm bound, a sequence $\phi_k \to 0$ in $\mathcal{S}$ along which $u(\phi_k)$ does not converge to zero. The implication (3) $\Rightarrow$ (1) uses the seminorm bound to exhibit, inside the preimage of every interval $(-\varepsilon, \varepsilon)$, a neighbourhood of zero.
[/proofplan]
[step:Prove (1) $\Rightarrow$ (2): continuity implies sequential continuity]
For maps between topological spaces, continuity implies sequential continuity.
If $\phi_k \to 0$ in $\mathcal{S}(\mathbb{R}^n)$ and $u$ is continuous, then $u(\phi_k) \to u(0)$, and $u(0) = 0$ by linearity.
[/step]
[step:Prove (2) $\Rightarrow$ (3): construct a witness sequence if the seminorm bound fails]
We argue by contrapositive.
Suppose (3) fails: for every $C > 0$ and all integers $N, M \geq 0$, there exists $\phi \in \mathcal{S}(\mathbb{R}^n)$ with
\begin{align*}
|u(\phi)| &> C \sum_{|\alpha| \leq N, \, |\beta| \leq M} \|\phi\|_{\alpha,\beta}.
\end{align*}
Applying this with $C = k$ and $N = M = k$ for $k = 1, 2, 3, \ldots$ produces a sequence $\{\psi_k\}_{k=1}^\infty \subseteq \mathcal{S}(\mathbb{R}^n)$ satisfying
\begin{align*}
|u(\psi_k)| &> k \sum_{|\alpha| \leq k, \, |\beta| \leq k} \|\psi_k\|_{\alpha,\beta}.
\end{align*}
In particular $u(\psi_k) \neq 0$, so defining $\phi_k := \psi_k / u(\psi_k)$ gives $u(\phi_k) = 1$ for all $k$.
For every pair $(\alpha, \beta)$ with $|\alpha| \leq k$ and $|\beta| \leq k$,
\begin{align*}
\|\phi_k\|_{\alpha,\beta} &= \frac{\|\psi_k\|_{\alpha,\beta}}{|u(\psi_k)|} < \frac{1}{k}.
\end{align*}
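The strict inequality can be unpacked in one line. This is a sketch in which the primed indices $\alpha', \beta'$ are summation labels introduced here only to avoid a clash with the fixed pair $(\alpha, \beta)$. Since every seminorm is nonnegative, a single term is at most the full sum, and the defining property of $\psi_k$ gives
\begin{align*}
\|\psi_k\|_{\alpha,\beta} &\leq \sum_{|\alpha'| \leq k, \, |\beta'| \leq k} \|\psi_k\|_{\alpha',\beta'} < \frac{|u(\psi_k)|}{k}.
\end{align*}
Dividing by $|u(\psi_k)| > 0$ yields the bound $1/k$.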
Since every fixed pair $(\alpha, \beta) \in \mathbb{N}_0^n \times \mathbb{N}_0^n$ satisfies $|\alpha| \leq k$ and $|\beta| \leq k$ for all sufficiently large $k$, we have $\|\phi_k\|_{\alpha,\beta} \to 0$ for every $\alpha, \beta$.
This is precisely convergence $\phi_k \to 0$ in the Schwartz topology.
But $u(\phi_k) = 1$ for every $k$, so $u(\phi_k) \not\to 0$ and (2) fails.
[guided]
The contrapositive strategy is natural: if no finite collection of seminorms controls $u$, then $u$ is "too wild" to be sequentially continuous.
We make this precise by choosing, for each $k$, a test function $\psi_k$ that witnesses the failure of the bound with $C = k$ and $N = M = k$.
The normalisation $\phi_k = \psi_k / u(\psi_k)$ fixes $u(\phi_k) = 1$.
The key calculation is that $\|\phi_k\|_{\alpha,\beta} = \|\psi_k\|_{\alpha,\beta}/|u(\psi_k)| < 1/k$ whenever $|\alpha| \leq k$ and $|\beta| \leq k$.
For any fixed $(\alpha, \beta)$, the condition $|\alpha| \leq k$ and $|\beta| \leq k$ holds for all $k \geq \max(|\alpha|, |\beta|)$, so $\|\phi_k\|_{\alpha,\beta} \to 0$.
Since a sequence converges to zero in the Schwartz topology exactly when it converges to zero in every seminorm $\|\cdot\|_{\alpha,\beta}$, we conclude $\phi_k \to 0$ in $\mathcal{S}$.
But $u(\phi_k) = 1$ for all $k$, so $u$ is not sequentially continuous.
[/guided]
[/step]
[step:Prove (3) $\Rightarrow$ (1): the seminorm bound implies continuity]
Suppose the seminorm bound holds with constants $C, N, M$.
Define the continuous seminorm
\begin{align*}
p(\phi) &:= C \sum_{|\alpha| \leq N, \, |\beta| \leq M} \|\phi\|_{\alpha,\beta}
\end{align*}
on $\mathcal{S}(\mathbb{R}^n)$ (a finite sum of generating seminorms, scaled by $C$).
The hypothesis gives $|u(\phi)| \leq p(\phi)$ for all $\phi$.
For every $\varepsilon > 0$, the preimage $u^{-1}((-\varepsilon, \varepsilon))$ contains the open set $\{\phi : p(\phi) < \varepsilon\}$, which is a neighbourhood of $0$ in the Schwartz topology; hence $u$ is continuous at $0$.
Since $u$ is linear, continuity at $0$ implies continuity everywhere.
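This last step can be spelled out as follows, writing $\phi_0$ for an arbitrary base point (a symbol introduced here for illustration). By linearity and the bound $|u| \leq p$,
\begin{align*}
|u(\phi) - u(\phi_0)| &= |u(\phi - \phi_0)| \leq p(\phi - \phi_0),
\end{align*}
so $|u(\phi) - u(\phi_0)| < \varepsilon$ whenever $p(\phi - \phi_0) < \varepsilon$. The set of such $\phi$ is the translate $\phi_0 + \{\psi : p(\psi) < \varepsilon\}$, which is an open neighbourhood of $\phi_0$ because translations are homeomorphisms of $\mathcal{S}(\mathbb{R}^n)$.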
[/step]