The dual of a [function](/page/Function) space — the space of continuous linear functionals on it — is where most of the objects in PDE theory live. Distributions are continuous linear functionals on [test functions](/page/Test%20Function). Tempered distributions are continuous linear functionals on the [Schwartz space](/page/Schwartz%20Space). Bounded linear functionals on $L^p$ spaces arise as weak [limits](/page/Limit) in existence arguments. In every case, one needs a topology on the dual space that is weak enough to produce useful compactness results, yet strong enough to be compatible with the operations of analysis. The weak* topology is that topology.
The construction is the same in every case: given a space $E$ and its dual $E'$, topologise $E'$ by declaring that a net of functionals converges if and only if it converges pointwise when tested against every element of $E$. But the *setting* varies: $E$ might be a Banach space (for $L^p$ duality), a [Fréchet space](/page/Fr%C3%A9chet%20Space) (for [tempered distributions](/page/Tempered%20Distributions)), or an LF-space (for [distributions](/page/Distribution)). The Banach-space definition — which appears in most introductory functional analysis courses — is a special case of the general locally convex construction, and conflating the two can create confusion when one moves to distribution theory and beyond. This page develops the general definition first, then specialises to [Banach spaces](/page/Banach%20Space) and proves that the two viewpoints coincide.
## Motivation
[motivation]
### The Norm Topology Is Too Strong
The most natural topology on a dual space $X^*$ is the operator norm topology: $f_n \to f$ if $\|f_n - f\|_{X^*} \to 0$. This is a strong notion of convergence — it requires uniform control over all of $X$ simultaneously — and it produces a complete normed space. But it has a fatal defect for applications: the unit ball $B_{X^*} = \{f \in X^* : \|f\|_{X^*} \leq 1\}$ is compact in the norm topology if and only if $X$ is finite-dimensional (by the Riesz lemma). In infinite dimensions, bounded [sequences](/page/Sequence) in $X^*$ generally have no norm-convergent subsequences, so the most basic compactness arguments of the [calculus of variations](/page/Calculus%20of%20Variations) — extract a bounded sequence, pass to a convergent subsequence, identify the limit — break down completely.
### The Search for a Weaker Topology
The remedy is to weaken the notion of convergence, accepting less control in exchange for more compactness. The weakest reasonable requirement is *pointwise* convergence: $f_n \to f$ if $f_n(x) \to f(x)$ for every $x \in E$, where $E$ is the space on which the functionals act. This is weak enough that norm-closed balls in the dual of a normed space become compact (the Banach–Alaoglu theorem), yet strong enough that limits of linear functionals remain linear and that the pairing $\langle f, x \rangle = f(x)$ is continuous in $f$ for each fixed $x$.
### The Need for Generality
In a first functional analysis course, one typically defines the weak* topology only for the dual of a Banach space. But the spaces that arise in distribution theory — $\mathcal{D}'(\Omega)$, $\mathcal{S}'(\mathbb{R}^n)$, $\mathcal{E}'(\Omega)$ — are duals of spaces that are not Banach. The [test function space](/page/Test%20Function) $\mathcal{D}(\Omega)$ is an LF-space (a countable strict inductive limit of Fréchet spaces); the [Schwartz space](/page/Schwartz%20Space) $\mathcal{S}(\mathbb{R}^n)$ is Fréchet but not normable; $\mathcal{E}(\Omega) = C^\infty(\Omega)$ is likewise Fréchet. The weak* topology on each of these duals is defined by the same pointwise-convergence construction, and it governs all the standard convergence results in distribution theory: mollifier approximations $T * \rho_\varepsilon \to T$, convergence of partial Fourier sums, and the distributional limits that appear in PDE existence proofs. A definition restricted to Banach spaces would exclude precisely the applications that motivate the concept.
[/motivation]
## The General Definition
The weak* topology is defined for the dual of any locally convex [topological vector space](/page/Topological%20Vector%20Space). A **locally convex space** is a topological vector space $E$ whose topology is generated by a family of seminorms — equivalently, a topological vector space that has a neighbourhood basis at the origin consisting of convex sets. Every normed space is locally convex (one seminorm suffices: the norm), and every Fréchet space is locally convex (countably many seminorms generating a complete metrizable topology). The LF-spaces arising in distribution theory are also locally convex, though they are neither metrizable nor normable.
The **continuous dual** of a locally convex space $E$, denoted $E'$, is the vector space of all continuous linear functionals $f: E \to \mathbb{R}$. What "continuous" means depends on the topology on $E$: for a normed space it is equivalent to boundedness, for a Fréchet space it means domination by finitely many seminorms, and for the LF-space $\mathcal{D}(\Omega)$ it means the [characterisation given by Theorem 449](/theorems/449).
[definition: Weak Star Topology General]
Let $E$ be a locally convex topological vector space over $\mathbb{R}$, and let $E'$ be its continuous dual. The **weak\* topology** on $E'$, denoted $\sigma(E', E)$, is the coarsest topology on $E'$ under which the evaluation map
\begin{align*}
\operatorname{ev}_x: E' &\to \mathbb{R} \\
f &\mapsto f(x)
\end{align*}
is continuous for every $x \in E$.
[/definition]
The topology $\sigma(E', E)$ is an **initial topology**: it is constructed by pulling back the standard topology on $\mathbb{R}$ through the family of maps $\{\operatorname{ev}_x\}_{x \in E}$. A subbasis for $\sigma(E', E)$ consists of all sets of the form $\operatorname{ev}_x^{-1}(U) = \{f \in E' : f(x) \in U\}$ where $x \in E$ and $U \subseteq \mathbb{R}$ is open. A basis consists of finite intersections of such sets, which gives the **basic open neighbourhoods** of a point $f_0 \in E'$:
\begin{align*}
V(f_0; x_1, \ldots, x_m; \varepsilon) &:= \{f \in E' : |f(x_i) - f_0(x_i)| < \varepsilon \text{ for all } i = 1, \ldots, m\}
\end{align*}
where $x_1, \ldots, x_m \in E$ and $\varepsilon > 0$. These are "pointwise $\varepsilon$-neighbourhoods": $f$ is close to $f_0$ if they nearly agree on finitely many test vectors.
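As a rough illustration of these neighbourhoods, the following Python sketch takes $E = \mathbb{R}^4$ as a finite-dimensional stand-in (an assumption made purely for illustration) and identifies functionals with coefficient vectors. The helper names `pair` and `in_basic_neighbourhood` are hypothetical. The sketch shows that a basic neighbourhood constrains a functional only on the chosen test vectors, so it can contain functionals of very large norm while excluding small ones.

```python
def pair(f, x):
    """Evaluate the functional with coefficient vector f at the vector x."""
    return sum(fi * xi for fi, xi in zip(f, x))


def in_basic_neighbourhood(f, f0, test_vectors, eps):
    """Return True if |f(x_i) - f0(x_i)| < eps for every test vector x_i."""
    return all(abs(pair(f, x) - pair(f0, x)) < eps for x in test_vectors)


# Toy model (assumption): E = R^4, functionals identified with coefficient vectors.
f0 = [0.0, 0.0, 0.0, 0.0]
test_vectors = [[1.0, 2.0, 0.0, 0.0], [0.0, 1.0, 1.0, 0.0]]

# A functional that vanishes on both test vectors but has large norm.
large = [0.0, 0.0, 0.0, 1000.0]
# A small functional that is nevertheless detected by the first test vector.
small = [0.2, 0.0, 0.0, 0.0]

large_inside = in_basic_neighbourhood(large, f0, test_vectors, 0.1)
small_inside = in_basic_neighbourhood(small, f0, test_vectors, 0.1)
print(large_inside, small_inside)
```

Here `large` lies in $V(0; x_1, x_2; 0.1)$ because it is invisible to both test vectors, whereas `small` is excluded because $|f(x_1)| = 0.2 \geq 0.1$.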
[definition: Weak Star Convergence General]
Let $E$ be a locally convex space with continuous dual $E'$, and let $\{f_\alpha\}_{\alpha \in A}$ be a net in $E'$ and $f \in E'$. We say $f_\alpha$ **converges to $f$ in the weak\* topology**, written $f_\alpha \overset{*}{\rightharpoonup} f$, if $f_\alpha \to f$ in the topology $\sigma(E', E)$.
[/definition]
The following characterisation reduces weak* convergence to a concrete pointwise condition.
[quotetheorem:504]
This characterisation is what makes the weak* topology usable in practice: to check weak* convergence, one verifies the pointwise condition $f_\alpha(x) \to f(x)$ for each $x \in E$, rather than working directly with the topology. The proof is a direct consequence of the initial topology construction: the forward direction composes a convergent net with a continuous evaluation map; the reverse direction uses the fact that every $\sigma(E', E)$-neighbourhood of $f$ contains a basic neighbourhood determined by finitely many evaluation conditions, and finitely many convergent nets can be synchronised since the index set is directed.
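The pointwise condition can be seen in a standard example, sketched here in Python under the assumption $E = \ell^2$ with the coordinate functionals $f_n(x) := x_n$. The names `coordinate_functional` and `x` are hypothetical, and the code checks the condition for a single test sequence only, whereas weak* convergence requires it for every $x \in \ell^2$.

```python
import math


def coordinate_functional(n):
    """Return f_n, which maps a sequence x (given as k -> x_k) to x_n."""
    return lambda x: x(n)


def x(k):
    """The square-summable sequence x_k = 1/k, indexed from k = 1."""
    return 1.0 / k


# Pointwise values f_n(x) = 1/n tend to 0, as weak* convergence to 0 requires.
pairings = [coordinate_functional(n)(x) for n in (1, 10, 100, 1000)]

# In the dual norm, f_n - f_m corresponds to e_n - e_m, whose l^2 norm is
# sqrt(2) for n != m, so no subsequence of (f_n) is Cauchy in norm.
norm_gap = math.sqrt(1.0**2 + (-1.0) ** 2)

print(pairings, norm_gap)
```

Since $x_n \to 0$ for every square-summable $x$, the sequence $f_n$ converges to $0$ in $\sigma(E', E)$ even though $\|f_n\| = 1$ for all $n$.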
Three basic properties of $\sigma(E', E)$ follow immediately from the definition. First, $\sigma(E', E)$ is **locally convex**: each basic neighbourhood $V(f_0; x_1, \ldots, x_m; \varepsilon)$ is convex (as an intersection of preimages of convex sets under [linear maps](/page/Linear%20Map)). Second, $\sigma(E', E)$ is always **Hausdorff**: if $f \neq g$ in $E'$, then by the definition of equality of functions there exists $x \in E$ with $f(x) \neq g(x)$, and $\operatorname{ev}_x$ separates $f$ from $g$. No hypothesis on $E$ is needed here. It is the companion statement, that $E'$ separates the points of $E$ and hence that the weak topology $\sigma(E, E')$ on $E$ is Hausdorff, which requires $E$ to be Hausdorff and relies on the Hahn–Banach theorem. Third, $\sigma(E', E)$ is in general **not metrizable**: a countable neighbourhood basis at the origin would force $E$ to have countable algebraic dimension, and every infinite-dimensional Fréchet space (in particular every infinite-dimensional Banach space) has uncountable algebraic dimension.
### The Metrizable Case: Sequential Characterisation
Despite the global non-metrizability, in many important cases the weak* topology becomes metrizable when restricted to *bounded* subsets of $E'$. This is particularly relevant for Fréchet spaces.
When $E$ is a **separable metrizable** locally convex space — in particular a separable Fréchet space such as $\mathcal{S}(\mathbb{R}^n)$ — a countable dense subset $\{x_k\} \subseteq E$ provides countably many evaluation maps that already determine weak* convergence on bounded sets. The idea is the same as for the Banach space case (see the [metrizability theorem](/theorems/495) below): the metric $d(f, g) := \sum_k 2^{-k} |f(x_k) - g(x_k)| / (1 + |f(x_k) - g(x_k)|)$ induces the weak* topology on any equicontinuous (and hence bounded-in-a-suitable-sense) subset of $E'$. Once the topology is metrizable on the relevant sets, sequential convergence determines the topology there, and one can work with sequences rather than nets.
This is why the [sequential characterisation of distributions](/theorems/449) works for $\mathcal{D}'(\Omega)$ despite $\mathcal{D}(\Omega)$ being non-metrizable: the LF-space structure reduces continuity questions to the Fréchet pieces $\mathcal{D}_K(\Omega)$, where the sequential/[topological](/page/Topology) distinction collapses. It is also why weak* convergence in $\mathcal{S}'(\mathbb{R}^n)$ can be checked on sequences: $\mathcal{S}(\mathbb{R}^n)$ is a separable Fréchet space, so bounded subsets of $\mathcal{S}'(\mathbb{R}^n)$ are weak* metrizable.
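The metric $d$ above can be sketched numerically. The Python snippet below is a toy model under stated assumptions: $E = \ell^2$, the test vectors are the basis vectors $x_k = e_k$ (whose span is dense, although the set itself is not, which is still enough on norm-bounded sets), and the series is truncated at 60 terms. All function names are hypothetical.

```python
def weak_star_metric(f, g, test_vectors):
    """Truncated sum of 2^{-k} |f(x_k) - g(x_k)| / (1 + |f(x_k) - g(x_k)|)."""
    total = 0.0
    for k, x in enumerate(test_vectors, start=1):
        diff = abs(f(x) - g(x))
        total += 2.0 ** (-k) * diff / (1.0 + diff)
    return total


def basis_vector(k):
    """Return e_k as a function j -> 1 if j == k else 0."""
    return lambda j: 1.0 if j == k else 0.0


def coordinate_functional(n):
    """Return f_n, which maps a sequence x (given as j -> x_j) to x_n."""
    return lambda x: x(n)


def zero(x):
    """The zero functional."""
    return 0.0


# Assumption: test vectors x_k = e_k for k = 1..60 (dense span, truncated series).
test_vectors = [basis_vector(k) for k in range(1, 61)]

distances = [
    weak_star_metric(coordinate_functional(n), zero, test_vectors)
    for n in (1, 5, 10, 20)
]
print(distances)
```

Only the $n$-th term of the sum is nonzero, so $d(f_n, 0) = 2^{-(n+1)} \to 0$: the bounded sequence $f_n$ converges to $0$ in the metric, in agreement with its weak* convergence, although $\|f_n\| = 1$ throughout.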
## Why Full Generality Is Needed
### Distributions
The space of [distributions](/page/Distribution) $\mathcal{D}'(\Omega)$ is the continuous dual of the [test function space](/page/Test%20Function) $\mathcal{D}(\Omega)$. The weak* topology $\sigma(\mathcal{D}'(\Omega), \mathcal{D}(\Omega))$ — i.e. the topology of pointwise convergence on test functions — is the standard topology used in distribution theory. A sequence of distributions $T_k$ converges to $T$ in $\mathcal{D}'(\Omega)$ if and only if $T_k(\varphi) \to T(\varphi)$ for every $\varphi \in \mathcal{D}(\Omega)$.
[example: Mollifier Approximation In Distributional Sense]
Let $T \in \mathcal{D}'(\mathbb{R}^n)$ be a distribution and let $\rho_\varepsilon$ be the [standard mollifier](/page/Standard%20Mollifier). The mollification $T_\varepsilon := T * \rho_\varepsilon$ (defined by $T_\varepsilon(\varphi) := T(\tilde{\rho}_\varepsilon * \varphi)$, where $\tilde{\rho}_\varepsilon(x) := \rho_\varepsilon(-x)$) converges to $T$ in the weak* topology as $\varepsilon \to 0$. Concretely, for every $\varphi \in \mathcal{D}(\mathbb{R}^n)$, we have $\tilde{\rho}_\varepsilon * \varphi \to \varphi$ in $\mathcal{D}(\mathbb{R}^n)$ (the supports stay in a fixed compact set and all [derivatives](/page/Derivative) converge uniformly), so $T_\varepsilon(\varphi) = T(\tilde{\rho}_\varepsilon * \varphi) \to T(\varphi)$ by continuity of $T$. This is weak* convergence $T_\varepsilon \overset{*}{\rightharpoonup} T$ in $\mathcal{D}'(\mathbb{R}^n)$.
[/example]
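For the special case $T = \delta_0$ in one dimension, the definition gives $T_\varepsilon(\varphi) = \int \rho_\varepsilon(y) \varphi(y) \, dy$, which should approach $\varphi(0)$. The following Python sketch checks this numerically. It assumes the usual bump $\rho(t) \propto \exp(-1/(1 - t^2))$ as the mollifier, normalises it numerically rather than with the exact constant, and uses a hypothetical test function `phi` supported in $[-2, 2]$.

```python
import math


def bump(t):
    """Unnormalised bump exp(-1/(1 - t^2)) on (-1, 1), zero elsewhere."""
    return math.exp(-1.0 / (1.0 - t * t)) if abs(t) < 1.0 else 0.0


def phi(x):
    """A test function supported in [-2, 2] with phi(0) = exp(-1)."""
    return bump(x / 2.0)


def mollified_delta_pairing(eps, nodes=2001):
    """Midpoint-rule value of the integral of rho_eps(y) * phi(y) dy.

    After substituting y = eps * t the integral runs over t in (-1, 1).
    The mollifier is normalised numerically, so the weights sum to 1.
    """
    h = 2.0 / nodes
    ts = [-1.0 + (i + 0.5) * h for i in range(nodes)]
    weights = [bump(t) for t in ts]
    mass = sum(weights)
    return sum(w * phi(eps * t) for w, t in zip(weights, ts)) / mass


errors = [abs(mollified_delta_pairing(eps) - phi(0.0)) for eps in (1.0, 0.1, 0.01)]
print(errors)
```

The errors shrink as $\varepsilon$ decreases, which is the weak* convergence $\rho_\varepsilon \overset{*}{\rightharpoonup} \delta_0$ tested against one particular $\varphi$.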
The Banach-space definition of the weak* topology does not apply here: $\mathcal{D}(\Omega)$ is not a normed space (it is not even metrizable), so there is no "operator norm" on $\mathcal{D}'(\Omega)$ and no Banach-space dual to invoke. The general locally convex definition is required.
### Tempered Distributions
The space of [tempered distributions](/page/Tempered%20Distributions) $\mathcal{S}'(\mathbb{R}^n)$ is the continuous dual of the [Schwartz space](/page/Schwartz%20Space) $\mathcal{S}(\mathbb{R}^n)$. The Schwartz space is a Fréchet space (metrizable and complete, with topology generated by the countable family of seminorms $\|\varphi\|_{\alpha, \beta} := \sup_{x \in \mathbb{R}^n} |x^\alpha \partial^\beta \varphi(x)|$ indexed by multi-indices $\alpha, \beta$), but it is not normable: no single norm generates the topology. Again, the Banach-space weak* definition does not apply. The general definition gives $\sigma(\mathcal{S}', \mathcal{S})$: a sequence $T_k \in \mathcal{S}'(\mathbb{R}^n)$ converges to $T$ if and only if $T_k(\varphi) \to T(\varphi)$ for every Schwartz function $\varphi$. This is the topology under which the [Fourier transform](/page/Fourier%20Transform) $\mathcal{F}: \mathcal{S}'(\mathbb{R}^n) \to \mathcal{S}'(\mathbb{R}^n)$ is a homeomorphism.
## The Banach Space Case
When the underlying space is a Banach space (or more generally a normed space), the general definition specialises to the classical weak* topology of functional analysis. The following theorem makes this precise.
### Equivalence of Definitions
The general definition asks for the continuous dual $E'$ of $E$ as a locally convex space. In a normed space, [continuity](/page/Continuity) of a linear functional is equivalent to boundedness — this is the only fact needed to identify the general construction with the standard one.
[quotetheorem:494]
The proof is elementary but the conclusion is important: it means that one can freely use the general machinery (initial topologies, nets, general compactness results) while working with the concrete Banach-space dual $X^*$, without worrying about a discrepancy between two different notions of "continuous linear functional."
### Sequential Characterisation
Since the weak* topology is defined as an initial topology, convergence in $\sigma(X^*, X)$ reduces to pointwise convergence of the defining evaluation maps. This gives a clean sequential criterion.
[quotetheorem:256]
The proof follows the same pattern as the general characterisation discussed in the section on the general definition: the forward direction uses continuity of each evaluation map, and the reverse direction uses the fact that basic weak* neighbourhoods involve only finitely many evaluation conditions, so finitely many pointwise convergence conditions suffice to enter any prescribed neighbourhood. The criterion applies to every sequence in $X^*$, with no boundedness assumption, because the basic neighbourhoods of $\sigma(X^*, X)$ are defined by the same finite intersection structure regardless of norms. Boundedness enters only when one asks whether sequences determine the topology, which is the metrizability question. When $X$ is a Banach space, a weak* convergent sequence is in any case norm-bounded, by the uniform boundedness principle.
### Weak* Versus [Weak Convergence](/page/Weak%20Convergence)
The weak* topology on $X^*$ is not the only "weak" topology one might consider. The **weak topology** on $X^*$, denoted $\sigma(X^*, X^{**})$, is the coarsest topology making every element of the bidual $X^{**}$ act continuously. Since the canonical embedding $J: X \hookrightarrow X^{**}$ defined by $J(x)(f) := f(x)$ maps $X$ into $X^{**}$, every evaluation map $\operatorname{ev}_x = J(x)$ used in the weak* topology is also an element of $X^{**}$. The weak topology therefore imposes *more* continuity conditions (all of $X^{**}$, not just $J(X)$), which means it is finer: every weak* [open set](/page/Open%20Set) is weakly open, but not conversely (in general).
The gap between the two topologies is controlled by reflexivity. When $X$ is **reflexive** — meaning $J: X \to X^{**}$ is surjective — the two families of evaluation maps coincide: $X^{**} = J(X)$, so testing against all of $X^{**}$ is the same as testing against $J(X)$. The weak and weak* topologies on $X^*$ therefore agree. This is the case for $L^p(\Omega)$ with $1 < p < \infty$: the dual is $L^q(\Omega)$ (where $1/p + 1/q = 1$), the bidual is $(L^q)^* \cong L^p$ again, and $J$ is the identity under this identification.
[example: The Gap For Non-Reflexive Spaces]
When $X$ is not reflexive, the gap is genuine. The most important example in PDE theory is $X = L^1(\Omega)$, where $\Omega \subseteq \mathbb{R}^n$ is measurable and carries Lebesgue measure. Here $X^* = L^\infty(\Omega)$, and a bounded sequence $\{f_n\}$ in $L^\infty(\Omega)$ always has a weak* convergent subsequence by the [Sequential Banach–Alaoglu theorem](/theorems/496) (since $L^1(\Omega)$ is separable). This means: there exist a subsequence $f_{n_k}$ and $f \in L^\infty(\Omega)$ with
\begin{align*}
\int_\Omega f_{n_k} \, g \, d\mathcal{L}^n \to \int_\Omega f \, g \, d\mathcal{L}^n \quad \text{for every } g \in L^1(\Omega).
\end{align*}
Weak convergence in $L^\infty$ would instead require testing against all of $(L^\infty)^* \supsetneq L^1$, which includes finitely additive measures and other pathological objects. The weak* topology is strictly coarser than the weak topology on $L^\infty$, and correspondingly far more sequences converge in it. This is precisely why PDE arguments that produce $L^\infty$ bounds extract weak* (not weak) convergent subsequences.
[/example]
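A second standard example, not taken from the discussion above, makes the gap visible on a single sequence: take $X = c_0$, so that $X^* = \ell^1$ and $X^{**} = \ell^\infty$, and consider the unit vectors $e_n \in \ell^1$. The Python sketch below (with hypothetical helper names) pairs $e_n$ with one element of $c_0$ and with the constant sequence $1 \in \ell^\infty \setminus c_0$.

```python
def pair_l1(n, x):
    """Pair e_n in l^1 with a bounded sequence x (given as k -> x_k): gives x_n."""
    return x(n)


def null_sequence(k):
    """An element of c_0: x_k = 1/k tends to 0."""
    return 1.0 / k


def constant_one(k):
    """An element of l^infinity that does not lie in c_0."""
    return 1.0


indices = (1, 10, 100, 1000)
# Testing against c_0 (weak* topology): the values tend to 0.
weak_star_tests = [pair_l1(n, null_sequence) for n in indices]
# Testing against l^infinity (weak topology): the values stay equal to 1.
weak_tests = [pair_l1(n, constant_one) for n in indices]
print(weak_star_tests, weak_tests)
```

Because $x_n \to 0$ for every $x \in c_0$, the sequence $e_n$ converges weak* to $0$ in $\ell^1$, but the pairing with the constant sequence shows that it does not converge weakly to $0$.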
## Compactness
The most important theorem about the weak* topology — and the reason it is the "right" topology for dual spaces — is that it makes bounded sets compact.
### The Banach–Alaoglu Theorem
[quotetheorem:212]
The proof is a direct application of Tychonoff's theorem: embed the unit ball $B_{X^*}$ into the product $\prod_{x \in X} [-\|x\|_X, \|x\|_X]$ via the evaluation map $f \mapsto (f(x))_{x \in X}$, verify that the image is closed (linearity and the norm bound are preserved by pointwise limits), and conclude compactness from Tychonoff. The result holds for all normed spaces without any [separability](/page/Separable), reflexivity, or completeness hypothesis.
The compactness here is in the sense of nets: every net in $B_{X^*}$ has a subnet converging in $\sigma(X^*, X)$. This does *not* immediately give sequential compactness — for that, one needs the additional hypothesis of separability.
### Metrizability and Sequential Compactness
When the underlying space $X$ is separable, the weak* topology on bounded subsets of $X^*$ becomes metrizable. This is the bridge between the topological compactness of Banach–Alaoglu and the sequential compactness that is directly used in PDE arguments.
[quotetheorem:495]
The idea is natural: a countable dense set $\{x_k\} \subseteq X$ provides countably many "test directions." On a bounded set, these countably many tests determine the weak* topology — there are no additional directions that can distinguish functionals. The metric $d$ is a weighted sum of the distances in each test direction, with exponentially decaying weights to ensure convergence of the [series](/page/Series).
The metrizability theorem combines with Banach–Alaoglu to give the sequential version.
[quotetheorem:496]
This is the theorem that is invoked (often implicitly) in virtually every PDE existence argument that uses compactness in dual spaces. The typical pattern is: a sequence of approximate solutions produces a bounded sequence in some dual space (e.g. $L^\infty$, $W^{-1,q}$, or a space of measures); the Sequential Banach–Alaoglu theorem extracts a weak* convergent subsequence; and one then identifies the limit as a solution by passing to the limit in the weak formulation.
### The General Case: Alaoglu–Bourbaki
The Banach–Alaoglu theorem has a generalisation to arbitrary locally convex spaces, due to Alaoglu and Bourbaki. In this setting, there is no "unit ball" (the dual need not be normed), so the compact sets are described differently: for any neighbourhood $U$ of the origin in $E$, the **polar** $U^\circ := \{f \in E' : |f(x)| \leq 1 \text{ for all } x \in U\}$ is compact in $\sigma(E', E)$. This reduces to the Banach–Alaoglu theorem when $E$ is a normed space and $U$ is the unit ball (since $U^\circ = B_{E^*}$). The proof follows the same Tychonoff embedding strategy.
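The Alaoglu–Bourbaki theorem is what underlies compactness results for distributions and tempered distributions: equicontinuous sets in $\mathcal{D}'(\Omega)$ and $\mathcal{S}'(\mathbb{R}^n)$ are relatively weak* compact. Since $\mathcal{D}(\Omega)$ and $\mathcal{S}(\mathbb{R}^n)$ are barrelled, the equicontinuous sets in their duals are exactly the weak* bounded ones.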
## Applications
### Weak* Convergence in $L^\infty$ and Homogenisation
In homogenisation theory, one studies PDEs with rapidly oscillating coefficients $a_\varepsilon(x) := a(x/\varepsilon)$ where $a$ is periodic. As $\varepsilon \to 0$, the coefficients $a_\varepsilon$ do not converge strongly in $L^\infty$ (the oscillations persist at every scale), but they do converge weak* in $L^\infty$:
\begin{align*}
\int_\Omega a_\varepsilon(x) \, g(x) \, d\mathcal{L}^n(x) \to \bar{a} \int_\Omega g(x) \, d\mathcal{L}^n(x) \quad \text{for every } g \in L^1(\Omega),
\end{align*}
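where $\bar{a} = \fint_Y a(y) \, d\mathcal{L}^n(y)$ is the average of $a$ over one period cell $Y$. The weak* limit is the constant function $\bar{a}$. The effective (homogenised) PDE also has constant coefficients, but these are in general *not* given by $\bar{a}$: in the flux $a_\varepsilon \nabla u_\varepsilon$ both factors converge only weakly, and the weak limit of a product need not be the product of the limits. In one space dimension the homogenised coefficient is the harmonic mean $\left(\fint_Y a(y)^{-1} \, d\mathcal{L}^1(y)\right)^{-1}$, and in higher dimensions it is determined by solving cell problems on $Y$. Weak* convergence of the coefficients is the starting point of the analysis rather than the whole answer.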
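The weak* convergence of the coefficients can be checked numerically in one dimension. The Python sketch below assumes the hypothetical choices $a(y) = 2 + \sin(2\pi y)$ (so $\bar{a} = 2$), $\Omega = (0, 1)$ and the single test function $g(x) = x^2$, and uses a midpoint rule; it illustrates the convergence against one $g$ and is not a proof.

```python
import math


def a(y):
    """A 1-periodic coefficient with mean value 2 over one period."""
    return 2.0 + math.sin(2.0 * math.pi * y)


def g(x):
    """An integrable test function on (0, 1)."""
    return x * x


def pairing(eps, nodes=100_000):
    """Midpoint-rule value of the integral of a(x/eps) * g(x) over (0, 1)."""
    h = 1.0 / nodes
    return h * sum(a((i + 0.5) * h / eps) * g((i + 0.5) * h) for i in range(nodes))


def sup_distance(eps, nodes=100_000):
    """Grid estimate of sup |a(x/eps) - 2| over (0, 1)."""
    h = 1.0 / nodes
    return max(abs(a((i + 0.5) * h / eps) - 2.0) for i in range(nodes))


limit = 2.0 / 3.0  # mean of a times the integral of g over (0, 1)
errors = [abs(pairing(eps) - limit) for eps in (0.1, 0.01, 0.001)]
sup_gaps = [sup_distance(eps) for eps in (0.1, 0.01, 0.001)]
print(errors, sup_gaps)
```

The pairing errors decay roughly like $\varepsilon / (2\pi)$, while the sup-norm distance between $a_\varepsilon$ and $\bar{a}$ stays close to $1$ for every $\varepsilon$, so the convergence is weak* and not strong.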
### Compactness Arguments in PDE Existence
The standard strategy for proving existence of solutions to nonlinear PDEs by a compactness argument proceeds as follows. One constructs a sequence of approximate solutions $\{u_n\}$ (e.g. by Galerkin approximation, penalisation, or regularisation) and obtains *a priori* bounds $\|u_n\|_{X^*} \leq C$ in some dual space. The [Sequential Banach–Alaoglu theorem](/theorems/496) (or its locally convex generalisation) extracts a subsequence $u_{n_k} \overset{*}{\rightharpoonup} u$ converging weak* to a limit $u$. One then passes to the limit in the weak formulation of the equation: each linear term of the form $\langle u_{n_k}, \varphi \rangle$ converges to $\langle u, \varphi \rangle$ by the definition of weak* convergence. The passage to the limit in nonlinear terms requires additional tools (lower semicontinuity, compensated compactness, or monotonicity methods), and only once these terms are handled does the limit $u$ satisfy the equation in the distributional sense. The initial extraction of a convergent subsequence, however, is typically a weak* compactness argument.
## References
1. W. Rudin, *Functional Analysis*, 2nd ed. (1991).
2. J. B. Conway, *A Course in Functional Analysis*, 2nd ed. (1990).
3. F. Trèves, *Topological Vector Spaces, Distributions, and Kernels* (1967).
4. L. Hörmander, *The Analysis of Linear Partial Differential Operators I* (1983).
5. H. Brezis, *Functional Analysis, [Sobolev Spaces](/page/Sobolev%20Space) and Partial Differential Equations* (2011).