The [Orthogonal System](/page/Orthogonal%20System) page establishes the abstract theory: given a complete orthonormal system in a [Hilbert space](/page/Hilbert%20Space), every element has a convergent Fourier expansion and Parseval's identity holds. But it leaves open the concrete question: *which* orthonormal system should we use, and *is it complete*? This page answers both questions for the trigonometric system $\{e^{inx}\}_{n \in \mathbb{Z}}$ in $L^2$ of the circle, and develops the convergence theory that is invisible in the abstract framework — pointwise convergence, summability, and the interplay between smoothness and coefficient decay.
[motivation]
## Motivation
### From Vibrating Strings to Frequency Decomposition
[Fourier series](/page/Fourier%20Series) originate in the 18th-century debate over the vibrating string. Daniel Bernoulli proposed that every motion of a string is a superposition of sinusoidal modes $\sin(nx)$; Euler and d'Alembert objected that such sums — being analytic — could not represent "arbitrary" initial shapes with corners or jumps. The question "which [functions](/page/Function) can be expanded in trigonometric [series](/page/Series)?" drove the development of rigorous analysis for over a century, from Dirichlet's first convergence theorem (1829) through Carleson's resolution of the $L^2$ pointwise convergence problem (1966).
### What the Abstract Theory Does Not Give
The [Orthogonal System](/page/Orthogonal%20System) page proves that if an ONS $\{e_k\}$ is complete, then [Bessel's inequality](/theorems/540) sharpens to Parseval's identity and the Fourier series converges in norm. But completeness of a specific system is not free — it requires proving that the span is dense, which for the trigonometric system depends on an approximation argument (Fejér's theorem). Furthermore, $L^2$ convergence says nothing about whether $S_N f(x) \to f(x)$ for a specific $x$: the partial sums may oscillate pointwise even when they converge in norm. The pointwise theory requires entirely different tools — the Dirichlet and Fejér kernels — that have no abstract analogue.
### Why the Trigonometric System is Special
Among all orthonormal systems in $L^2(\mathbb{T})$, the trigonometric system is distinguished by its interaction with calculus. [Differentiation](/page/Derivative) acts diagonally ($\frac{d}{dx} e^{inx} = in \, e^{inx}$). Convolution becomes multiplication ($\widehat{f * g}(n) = 2\pi \hat{f}(n)\hat{g}(n)$ for the unnormalised convolution $(f * g)(x) = \int_{-\pi}^\pi f(x - y) g(y) \, d\mathcal{L}^1(y)$). Finally, the substitution $z = e^{ix}$ turns a Fourier series into a formal Laurent series evaluated on the unit circle, whose terms with $n \geq 0$ form the power series of a holomorphic function on the disk evaluated on the [boundary](/page/Boundary); this connection to complex analysis provides tools unavailable for a generic ONS. These algebraic structures make the trigonometric system the natural basis for PDE applications on periodic domains, notably the [heat equation](/page/Heat%20Equation), where each mode $e^{inx}$ evolves independently as $e^{inx - n^2 t}$.
[/motivation]
## Setup and Definitions
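We work on the circle $\mathbb{T} = \mathbb{R}/(2\pi\mathbb{Z})$, represented by the interval $[-\pi, \pi]$ with its endpoints identified. The Hilbert space is $L^2(\mathbb{T})$ with inner product $(f, g) = \int_{-\pi}^\pi f(x) \overline{g(x)} \, d\mathcal{L}^1(x)$ and norm $\|f\|_2 = (f, f)^{1/2}$. With this inner product the exponentials satisfy $(e^{inx}, e^{imx}) = 2\pi\delta_{nm}$, so the orthonormal system is strictly $\{e^{inx}/\sqrt{2\pi}\}_{n \in \mathbb{Z}}$; the factor $\frac{1}{2\pi}$ in the Fourier coefficients defined below absorbs this normalisation.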
The building blocks of Fourier analysis are the exponentials $e^{inx}$ for $n \in \mathbb{Z}$. These are eigenfunctions of the derivative operator, and their finite linear combinations are the trigonometric polynomials — the "polynomials" of periodic function theory.
[definition: Fourier Coefficients]
For $f \in L^1(\mathbb{T})$, the **Fourier coefficients** of $f$ are
\begin{align*}
\hat{f}(n) := \frac{1}{2\pi}\int_{-\pi}^\pi f(x) e^{-inx} \, d\mathcal{L}^1(x), \quad n \in \mathbb{Z}.
\end{align*}
The **Fourier partial sum** of order $N$ is $S_N f(x) := \sum_{|n| \leq N} \hat{f}(n) e^{inx}$.
[/definition]
In the real form, $S_N f(x) = \frac{a_0}{2} + \sum_{n=1}^N (a_n \cos nx + b_n \sin nx)$ with $a_n = \frac{1}{\pi}\int_{-\pi}^\pi f(x) \cos nx \, d\mathcal{L}^1(x)$ and $b_n = \frac{1}{\pi}\int_{-\pi}^\pi f(x) \sin nx \, d\mathcal{L}^1(x)$. The complex exponential form is algebraically cleaner: convolution, differentiation, and the kernel representations below are all simpler in exponential notation.
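The definition can be checked numerically. The following is a minimal sketch, assuming a plain midpoint rule is accurate enough for illustration; the helper names `fourier_coefficient` and `partial_sum` and the sample count are illustrative choices, not part of the theory above. It approximates $\hat{f}(n)$ for $f(x) = x$ and compares the result with the closed form $\hat{f}(n) = \frac{(-1)^{n+1}}{in}$.

```python
import cmath
import math


def fourier_coefficient(f, n, samples=20000):
    """Approximate (1/2pi) * integral_{-pi}^{pi} f(x) e^{-inx} dx by the midpoint rule."""
    h = 2 * math.pi / samples
    total = 0j
    for k in range(samples):
        x = -math.pi + (k + 0.5) * h  # midpoint of the k-th subinterval
        total += f(x) * cmath.exp(-1j * n * x)
    return total * h / (2 * math.pi)


def partial_sum(coeffs, x):
    """Evaluate S_N f(x) from a dict {n: f_hat(n)} covering |n| <= N."""
    return sum(c * cmath.exp(1j * n * x) for n, c in coeffs.items())


# Example: f(x) = x, whose exact coefficients are (-1)^(n+1) / (i n) for n != 0.
N = 5
coeffs = {n: fourier_coefficient(lambda x: x, n) for n in range(-N, N + 1)}
exact = {n: ((-1) ** (n + 1)) / (1j * n) for n in range(-N, N + 1) if n != 0}

for n in sorted(exact):
    print(n, coeffs[n], exact[n])
```

Because $f$ is real, the computed coefficients satisfy $\hat{f}(-n) = \overline{\hat{f}(n)}$, so `partial_sum(coeffs, x)` is real up to rounding error.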
## The [Dirichlet Kernel](/page/Dirichlet%20Kernel)
The question "does $S_N f(x) \to f(x)$?" reduces to a question about a convolution kernel. Since the partial sum is a finite sum of integrals, exchanging sum and integral writes $S_N f$ as a convolution with the **Dirichlet kernel** $D_N$. The convergence of $S_N f$ therefore depends on the properties of $D_N$ — and the fundamental difficulty of Fourier series is that $D_N$ is *not* a positive kernel.
[quotetheorem:581]
The closed form $D_N(x) = \sin((N+1/2)x)/\sin(x/2)$ reveals the problem: $D_N$ peaks at $D_N(0) = 2N+1$ and oscillates in lobes of alternating sign whose width shrinks like $1/N$. Its normalised $L^1$ norm (the Lebesgue constant) grows logarithmically, $\frac{1}{2\pi}\|D_N\|_{L^1(\mathbb{T})} \sim \frac{4}{\pi^2}\log N$, which means the operators $f \mapsto S_N f$ are not uniformly bounded on $C(\mathbb{T})$. These two features are the root cause of the convergence difficulties: the sign-changing lobes produce the Gibbs phenomenon, and the logarithmic growth underlies the existence of continuous functions whose Fourier series diverge at a point (du Bois-Reymond, 1876).
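As a numerical sanity check, here is a minimal sketch that compares the defining sum $D_N(x) = \sum_{|n| \leq N} e^{inx}$ with the closed form and estimates the normalised integral $\frac{1}{2\pi}\int_{-\pi}^{\pi} |D_N|$. The function names and the midpoint quadrature are assumptions made for illustration. Since the asymptotic $\frac{4}{\pi^2}\log N$ holds only up to a bounded additive term, the printed columns are expected to differ by a roughly constant offset rather than agree.

```python
import math


def dirichlet_sum(N, x):
    """D_N(x) = sum_{|n| <= N} e^{inx}, written in real form as 1 + 2 * sum cos(nx)."""
    return 1.0 + 2.0 * sum(math.cos(n * x) for n in range(1, N + 1))


def dirichlet_closed(N, x):
    """Closed form sin((N + 1/2) x) / sin(x / 2), with the limiting value 2N + 1 at x = 0."""
    s = math.sin(x / 2)
    if abs(s) < 1e-12:
        return 2.0 * N + 1.0
    return math.sin((N + 0.5) * x) / s


def lebesgue_constant(N, samples=20000):
    """Midpoint-rule estimate of (1/2pi) * integral_{-pi}^{pi} |D_N(x)| dx."""
    h = 2 * math.pi / samples
    total = 0.0
    for k in range(samples):
        x = -math.pi + (k + 0.5) * h
        total += abs(dirichlet_closed(N, x))
    return total * h / (2 * math.pi)


for N in (4, 16, 64, 256):
    # Columns: N, estimated Lebesgue constant, leading-order term (4/pi^2) log N.
    print(N, round(lebesgue_constant(N), 4), round(4 / math.pi ** 2 * math.log(N), 4))
```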
## Pointwise Convergence
Despite the oscillatory nature of $D_N$, the Fourier series converges pointwise under mild regularity conditions on $f$. The key mechanism is cancellation: the oscillations of $D_N$ annihilate each other in the convolution integral, provided the function $f$ is not too rough near the point of evaluation.
The first tool is the Riemann-Lebesgue lemma, which says that Fourier coefficients always tend to zero — regardless of any convergence of the series itself.
[quotetheorem:245]
The Riemann-Lebesgue lemma is a *necessary* condition for convergence but far from sufficient, just as the harmonic series $\sum 1/n$ has terms tending to zero but diverges. For *sufficient* conditions, we need control on the local regularity of $f$. Dini's criterion provides a widely applicable one: writing $s$ for the candidate limit (typically $s = f(x)$ at a point of continuity), it requires only that the symmetric difference $f(x+t) + f(x-t) - 2s$ be [integrable](/page/Integral) against $1/t$ near $t = 0$, which is satisfied whenever $f$ has a Dini-type modulus of [continuity](/page/Continuity) at $x$.
[quotetheorem:583]
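Dini's criterion subsumes several standard pointwise convergence results: if $f$ is Lipschitz at $x$, or more generally Hölder continuous at $x$, the Dini condition is satisfied with $s = f(x)$ and $S_N f(x) \to f(x)$. Bounded variation in a neighbourhood of $x$ is not covered by Dini's criterion; that case is handled by the separate Dirichlet-Jordan test, which gives $S_N f(x) \to \frac{1}{2}\big(f(x^+) + f(x^-)\big)$.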
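As a worked check of the Hölder case, assume the Dini condition takes the form $\int_0^\delta |f(x+t) + f(x-t) - 2s| \, t^{-1} \, d\mathcal{L}^1(t) < \infty$ with $s = f(x)$ (the precise statement is the one in the theorem above), and suppose $|f(x \pm t) - f(x)| \leq C t^\alpha$ for $0 < t \leq \delta$ with some $0 < \alpha \leq 1$. Then
\begin{align*}
\int_0^\delta \frac{|f(x+t) + f(x-t) - 2f(x)|}{t} \, d\mathcal{L}^1(t) \leq \int_0^\delta \frac{2C t^\alpha}{t} \, d\mathcal{L}^1(t) = \frac{2C\delta^\alpha}{\alpha} < \infty.
\end{align*}
The integrability comes entirely from $\alpha > 0$: any positive Hölder exponent turns the singular weight $1/t$ into the integrable weight $t^{\alpha - 1}$.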
[remark: Carleson's Theorem]
For $f \in L^2(\mathbb{T})$, Carleson (1966) proved that $S_N f(x) \to f(x)$ for almost every $x$. This is far deeper than Dini's criterion — it does not assume any pointwise regularity — and its proof uses entirely different techniques (time-frequency analysis). We state it without proof; the result does not extend to $L^1$ (Kolmogorov constructed an $L^1$ function whose Fourier series diverges everywhere).
[/remark]
## The [Fejér Kernel](/page/Fej%C3%A9r%20Kernel) and Cesàro Summability
The Dirichlet kernel is not positive and its $L^1$ norm grows, so the partial sums $S_N f$ can misbehave. Fejér's idea was to *average* the partial sums: the [Cesàro means](/page/Ces%C3%A0ro%20Means) $\sigma_N f = \frac{1}{N+1}\sum_{n=0}^N S_n f$ are [convolutions](/page/Convolution) with the Fejér kernel, which *is* positive — and positive kernels with unit mass that concentrate at the origin are approximate identities, for which convergence is automatic.
[quotetheorem:584]
The positivity of $F_N$ is the decisive advantage over $D_N$: it makes the convolution a weighted average of $f$, which cannot overshoot the maximum of $f$ or undershoot the minimum. This is why $\sigma_N f$ [converges uniformly](/page/Uniform%20Convergence) for every continuous $f$, while $S_N f$ may not.
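The contrast can be seen numerically on the square wave $f = \operatorname{sgn}$. The sketch below is a minimal illustration with hypothetical function names. It uses the real form $S_N f(x) = \frac{4}{\pi}\sum_{n \text{ odd},\, n \leq N} \frac{\sin nx}{n}$ and the standard identity $\sigma_N f(x) = \sum_{|n| \leq N} \big(1 - \frac{|n|}{N+1}\big)\hat{f}(n)e^{inx}$, which follows from averaging the partial sums.

```python
import math


def square_wave_partial_sum(N, x):
    """S_N f(x) for f = sgn on [-pi, pi]: (4/pi) * sum over odd n <= N of sin(nx)/n."""
    return 4 / math.pi * sum(math.sin(n * x) / n for n in range(1, N + 1, 2))


def square_wave_cesaro_mean(N, x):
    """sigma_N f(x): the same sum with each mode weighted by (1 - n/(N+1))."""
    return 4 / math.pi * sum(
        (1 - n / (N + 1)) * math.sin(n * x) / n for n in range(1, N + 1, 2)
    )


N = 41
grid = [k * math.pi / 2000 for k in range(1, 2000)]  # sample points in (0, pi)
max_partial = max(square_wave_partial_sum(N, x) for x in grid)
max_cesaro = max(square_wave_cesaro_mean(N, x) for x in grid)
print(f"max S_N f     = {max_partial:.4f}")  # exceeds 1 (Gibbs overshoot)
print(f"max sigma_N f = {max_cesaro:.4f}")   # stays at or below 1
```

The partial sum overshoots the value $1$ near the jump, while the Cesàro mean, being a weighted average of values of $f$, does not.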
The uniform density of trigonometric polynomials in $C(\mathbb{T})$ is the trigonometric analogue of the [Weierstrass approximation theorem](/theorems/480) (which gives density of algebraic polynomials on compact intervals). It is also the key input for proving completeness.
## Completeness and $L^2$ Convergence
The abstract theory on the [Orthogonal System](/page/Orthogonal%20System) page tells us that completeness of an ONS is equivalent to density of its span. Fejér's theorem provides exactly this density for the trigonometric system — first in $C(\mathbb{T})$ (uniformly), then in $L^2(\mathbb{T})$ (by a routine approximation). Once completeness is established, the full power of the abstract theory applies: $L^2$ convergence, Parseval's identity, and the $\ell^2$ isomorphism are all immediate consequences.
[quotetheorem:585]
Parseval's identity in its concrete form — $\frac{1}{2\pi}\int |f|^2 = \sum |\hat{f}(n)|^2$ — is a powerful computational tool: it converts integrals into series and vice versa. The following examples illustrate this.
[example: The Basel Problem Via Parseval]
Let $f(x) = x$ on $[-\pi, \pi]$. The Fourier coefficients are $\hat{f}(0) = 0$ and $\hat{f}(n) = \frac{(-1)^{n+1}}{in}$ for $n \neq 0$ (computed by [integration by parts](/theorems/210)). Parseval's identity gives:
\begin{align*}
\frac{1}{2\pi}\int_{-\pi}^\pi x^2 \, d\mathcal{L}^1(x) = \sum_{n \neq 0} \frac{1}{n^2} = 2\sum_{n=1}^\infty \frac{1}{n^2}.
\end{align*}
The left side is $\frac{1}{2\pi} \cdot \frac{2\pi^3}{3} = \frac{\pi^2}{3}$. Therefore $\sum_{n=1}^\infty \frac{1}{n^2} = \frac{\pi^2}{6}$.
[/example]
[example: Sum Of Fourth Powers Via Parseval]
Let $f(x) = x^2$ on $[-\pi, \pi]$. Integration by parts (twice) gives $\hat{f}(0) = \pi^2/3$ and $\hat{f}(n) = \frac{2(-1)^n}{n^2}$ for $n \neq 0$. Parseval gives:
\begin{align*}
\frac{1}{2\pi}\int_{-\pi}^\pi x^4 \, d\mathcal{L}^1(x) = \frac{\pi^4}{9} + 2\sum_{n=1}^\infty \frac{4}{n^4}.
\end{align*}
The left side is $\frac{\pi^4}{5}$. Solving: $\sum_{n=1}^\infty \frac{1}{n^4} = \frac{\pi^4}{90}$.
[/example]
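Both examples can be checked numerically by truncating the coefficient side of Parseval's identity. This is a minimal sketch with illustrative helper names. By Bessel's inequality the truncated sums stay below the integral side, and the printed gaps shrink as the cutoff $N$ grows.

```python
import math


def parseval_rhs_x(N):
    """Sum of |f_hat(n)|^2 over 0 < |n| <= N for f(x) = x, where |f_hat(n)| = 1/|n|."""
    return 2 * sum(1 / n ** 2 for n in range(1, N + 1))


def parseval_rhs_x_squared(N):
    """Sum of |f_hat(n)|^2 over |n| <= N for f(x) = x^2."""
    return (math.pi ** 2 / 3) ** 2 + 2 * sum((2 / n ** 2) ** 2 for n in range(1, N + 1))


lhs_x = math.pi ** 2 / 3          # (1/2pi) * integral of x^2 over [-pi, pi]
lhs_x_squared = math.pi ** 4 / 5  # (1/2pi) * integral of x^4 over [-pi, pi]

for N in (10, 1000, 100000):
    # Columns: N, gap for f(x) = x, gap for f(x) = x^2.
    print(N, lhs_x - parseval_rhs_x(N), lhs_x_squared - parseval_rhs_x_squared(N))
```

The gap for $f(x) = x$ closes only like $2/N$, while the gap for $f(x) = x^2$ closes like $N^{-3}$, reflecting the faster decay of the coefficients of the smoother function.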
## Smoothness and Decay of Fourier Coefficients
The central structural principle of Fourier analysis is the duality between smoothness of $f$ in the spatial variable and decay of $\hat{f}(n)$ in the frequency variable. Differentiation maps $e^{inx} \mapsto in \, e^{inx}$, so it amplifies the $n$-th frequency by a factor of $|n|$. Conversely, if $f \in C^k(\mathbb{T})$, then $k$ integrations by parts move the derivatives from the exponential onto $f$, each contributing a factor of $1/(in)$ and together producing $O(|n|^{-k})$ decay.
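The integration by parts can be written out explicitly. For $f \in C^1(\mathbb{T})$ and $n \neq 0$, the boundary term vanishes by $2\pi$-periodicity, so
\begin{align*}
\widehat{f'}(n) = \frac{1}{2\pi}\int_{-\pi}^\pi f'(x) e^{-inx} \, d\mathcal{L}^1(x) = \frac{1}{2\pi}\Big[f(x)e^{-inx}\Big]_{-\pi}^{\pi} + \frac{in}{2\pi}\int_{-\pi}^\pi f(x) e^{-inx} \, d\mathcal{L}^1(x) = in \, \hat{f}(n).
\end{align*}
Iterating $k$ times for $f \in C^k(\mathbb{T})$ gives
\begin{align*}
\hat{f}(n) = \frac{1}{(in)^k}\widehat{f^{(k)}}(n), \qquad |\hat{f}(n)| \leq \frac{1}{|n|^k} \cdot \frac{1}{2\pi}\int_{-\pi}^\pi |f^{(k)}(x)| \, d\mathcal{L}^1(x).
\end{align*}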
[quotetheorem:586]
The converse is equally important: if the coefficients decay fast enough, the Fourier series — and its term-by-term derivatives — converge uniformly, giving a smooth function. The threshold is absolute summability of $|n|^k |\hat{f}(n)|$.
[quotetheorem:587]
Together, these two results establish a dictionary: $C^k$ regularity of $f$ forces $O(|n|^{-k})$ decay of $\hat{f}(n)$, while decay slightly faster than $|n|^{-(k+1)}$ (enough to make $\sum |n|^k|\hat{f}(n)|$ finite) forces $C^k$ regularity. In particular, $C^\infty$ regularity is equivalent to rapid decay (faster than any power of $1/|n|$). The Fourier characterisation of [Sobolev spaces](/page/Sobolev%20Space) extends this to fractional regularity: $f \in H^s(\mathbb{T})$ if and only if $\sum (1+|n|^2)^{s}|\hat{f}(n)|^2 < \infty$.
## Application: Heat Equation on the Circle
The smoothness-decay duality explains the smoothing effect of the [heat equation](/page/Heat%20Equation) in Fourier-analytic terms. Consider $\partial_t u = \partial_x^2 u$ on $\mathbb{T} \times (0, \infty)$ with initial data $u(\cdot, 0) = f \in L^2(\mathbb{T})$. Each Fourier mode evolves independently: $\widehat{u}(n, t) = \hat{f}(n) e^{-n^2 t}$. For any $t > 0$ the factor $e^{-n^2 t}$ decays like a Gaussian in $n$, so even if $\hat{f}(n)$ decays slowly (corresponding to rough initial data; for $f \in L^2$ the coefficients are square-summable and hence bounded), the product $\hat{f}(n)e^{-n^2 t}$ decays faster than any power of $1/|n|$. By the [Smoothness from Decay theorem](/theorems/587), $u(\cdot, t) \in C^\infty(\mathbb{T})$ for all $t > 0$: infinite smoothing from any $L^2$ initial condition.
[example: Heat Equation With A Step Function]
Let $f(x) = \operatorname{sgn}(x)$ on $[-\pi, \pi]$. The Fourier coefficients are $\hat{f}(n) = \frac{2}{in\pi}$ for odd $n$ and $0$ for even $n$. The solution to the heat equation is:
\begin{align*}
u(x, t) = \sum_{\substack{n \text{ odd}}} \frac{2}{in\pi} e^{inx - n^2 t}.
\end{align*}
At $t = 0$, the series converges to $\operatorname{sgn}(x)$ in $L^2$, and pointwise at every point of continuity; at the jumps $x = 0$ and $x = \pm\pi$ (the latter arising from the periodic identification) it converges to the midpoint value $0$. For any $t > 0$, the exponential damping makes $|\hat{u}(n, t)| = \frac{2}{|n|\pi}e^{-n^2 t}$ for odd $n$, which is rapidly decreasing. The discontinuities are instantly smoothed out, though traces of them persist as steep transition layers for small $t$.
[/example]
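The smoothing can be observed by evaluating the series directly. The sketch below is a minimal illustration, assuming a hypothetical cutoff `max_mode` large enough that the discarded terms are negligible for the times shown. It uses the real form $u(x, t) = \frac{4}{\pi}\sum_{n \text{ odd}} \frac{\sin nx}{n} e^{-n^2 t}$, obtained by pairing the modes $\pm n$.

```python
import math


def heat_solution(x, t, max_mode=2001):
    """Truncated series for u(x, t) with sgn initial data, in real form:
    (4/pi) * sum over odd n <= max_mode of sin(nx)/n * exp(-n^2 t)."""
    return 4 / math.pi * sum(
        math.sin(n * x) / n * math.exp(-n * n * t)
        for n in range(1, max_mode + 1, 2)
    )


def coefficient_magnitude(n, t):
    """|u_hat(n, t)| = 2 / (|n| pi) * exp(-n^2 t) for odd n."""
    return 2 / (abs(n) * math.pi) * math.exp(-n * n * t)


for t in (1e-4, 1e-2, 1.0):
    profile = [round(heat_solution(x, t), 4) for x in (0.05, 0.5, math.pi / 2)]
    print(f"t = {t}: u at x = 0.05, 0.5, pi/2 -> {profile}")
```

For small $t$ the value at $x = 0.05$ is already close to $1$, which shows how narrow the transition layer is; as $t$ grows the layer widens and the profile flattens towards the mean value $0$.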
## References
- Katznelson, Y. (2004). *An Introduction to Harmonic Analysis* (3rd ed.). Cambridge University Press.
- Grafakos, L. (2014). *Classical Fourier Analysis* (3rd ed.). Springer.
- Stein, E. M. and Shakarchi, R. (2003). *Fourier Analysis: An Introduction*. Princeton University Press.
- Zygmund, A. (2002). *Trigonometric Series* (3rd ed.). Cambridge University Press.