The Fejér kernel is the convolution kernel for the [Cesàro means](/page/Cesàro%20Means) of [Fourier series](/page/Fourier%20Series). Where the [Dirichlet kernel](/page/Dirichlet%20Kernel) represents the Fourier partial sums $S_N f$ and suffers from oscillation and growing $L^1$ norm, the Fejér kernel $F_N$ represents the averaged partial sums $\sigma_N f = \frac{1}{N+1}\sum_{n=0}^N S_n f$ and is everywhere non-negative. This positivity makes $F_N$ an approximate identity on $\mathbb{T}$, guaranteeing [uniform convergence](/page/Uniform%20Convergence) $\sigma_N f \to f$ for every continuous $f$ — Fejér's theorem — and providing the key input for proving [completeness of the trigonometric system](/theorems/585) in $L^2(\mathbb{T})$.
[motivation]
## Motivation
### The Problem with the Dirichlet Kernel
The [Dirichlet kernel](/page/Dirichlet%20Kernel) $D_N$ has unit mass ($\frac{1}{2\pi}\int D_N = 1$) and concentrates near the origin (the central peak has height $2N + 1$ and width $\sim 1/N$). But its oscillating side lobes mean $\|D_N\|_{L^1} \sim \frac{4}{\pi^2}\log N \to \infty$, so it fails to be an approximate identity. This failure is not just aesthetic: it leads to divergent Fourier series for some continuous functions (du Bois-Reymond) and the Gibbs overshoot at discontinuities.
### Fejér's Idea: Average Away the Oscillations
Fejér (1900) observed that the arithmetic means $\sigma_N f = \frac{1}{N+1}\sum_{n=0}^N S_n f$ can be written as $f * F_N$ where $F_N = \frac{1}{N+1}\sum_{n=0}^N D_n$. Averaging the Dirichlet kernels has a remarkable effect: the oscillating lobes, which alternate in sign from one $D_n$ to the next, cancel in the average, producing a kernel that is everywhere non-negative. Once positivity is established, the standard approximate identity argument gives uniform convergence for free.
### What This Page Covers
This page defines the Fejér kernel, derives its closed form, proves the three approximate identity properties (positivity, unit mass, concentration), and states [Fejér's theorem](/theorems/584). The general theory of Cesàro summability — why averaging improves convergence — is on the [Cesàro Means](/page/Cesàro%20Means) page.
[/motivation]
## Definition
The Fejér kernel is the arithmetic mean of the first $N + 1$ Dirichlet kernels, where the [Dirichlet kernel](/page/Dirichlet%20Kernel) of order $n$ is $D_n(x) = \sum_{|k| \leq n} e^{ikx}$.
[definition: Fejér Kernel]
For $N \in \mathbb{N}_0$, the **Fejér kernel** of order $N$ is the function $F_N: \mathbb{T} \to \mathbb{R}$ defined by
\begin{align*}
F_N(x) := \frac{1}{N+1}\sum_{n=0}^N D_n(x).
\end{align*}
[/definition]
Since each $D_n$ is a trigonometric polynomial of degree $n$, the Fejér kernel $F_N$ is a trigonometric polynomial of degree $N$. Writing out the Fourier coefficients: $\hat{F}_N(k) = \frac{1}{N+1}\sum_{n=|k|}^N 1 = \frac{N + 1 - |k|}{N+1} = 1 - \frac{|k|}{N+1}$ for $|k| \leq N$, and $\hat{F}_N(k) = 0$ for $|k| > N$. Thus $F_N$ applies a **triangular window** to the Fourier coefficients: it keeps the low frequencies nearly unchanged ($\hat{F}_N(k) \approx 1$ for $|k| \ll N$) and linearly attenuates the high frequencies toward zero.
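The agreement between the definition (an average of Dirichlet kernels) and the triangular-window description can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the function names `dirichlet`, `fejer_from_average` and `fejer_from_window` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np

def dirichlet(n, x):
    """D_n(x) = sum_{|k| <= n} e^{ikx} = 1 + 2 * sum_{k=1}^n cos(kx)."""
    k = np.arange(1, n + 1)
    return 1.0 + 2.0 * np.cos(np.outer(x, k)).sum(axis=1)

def fejer_from_average(N, x):
    """F_N as the arithmetic mean of D_0, ..., D_N (the definition)."""
    return sum(dirichlet(n, x) for n in range(N + 1)) / (N + 1)

def fejer_from_window(N, x):
    """F_N as the trigonometric polynomial with triangular coefficients."""
    k = np.arange(1, N + 1)
    weights = 1.0 - k / (N + 1)  # hat F_N(k) for k = 1, ..., N
    return 1.0 + 2.0 * (np.cos(np.outer(x, k)) * weights).sum(axis=1)

x = np.linspace(-np.pi, np.pi, 401)
N = 8
gap = np.max(np.abs(fejer_from_average(N, x) - fejer_from_window(N, x)))
print(f"max difference for N={N}: {gap:.2e}")
```

The printed difference should be at the level of floating-point rounding, reflecting that the two expressions are the same trigonometric polynomial.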
## Closed Form
The closed form of $F_N$ follows from a product-to-sum telescoping argument. For $x \notin 2\pi\mathbb{Z}$, write $D_n(x) = \sin((n+1/2)x)/\sin(x/2)$ and multiply by $2\sin^2(x/2)$. The identity $2\sin A\sin B = \cos(A-B) - \cos(A+B)$ gives $2\sin^2(x/2)\,D_n(x) = 2\sin(x/2)\sin((n+1/2)x) = \cos(nx) - \cos((n+1)x)$, so the sum over $n = 0, \dots, N$ telescopes to $1 - \cos((N+1)x) = 2\sin^2((N+1)x/2)$. Dividing by $2\sin^2(x/2)$ and by $N + 1$:
\begin{align*}
F_N(x) = \frac{1}{N+1} \cdot \frac{\sin^2\!\left(\frac{(N+1)x}{2}\right)}{\sin^2(x/2)} \quad \text{for } x \notin 2\pi\mathbb{Z},
\end{align*}
and $F_N(0) = N + 1$ (by L'Hôpital or direct computation: $F_N(0) = \frac{1}{N+1}\sum_{n=0}^N (2n+1) = N+1$). The full derivation is in the proof of [Fejér's Theorem](/theorems/584).
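As a sanity check, the closed form can be compared against the definition on a grid. This is a sketch under stated assumptions: NumPy is available, the helper names `fejer_closed_form` and `fejer_definition` are hypothetical, and the threshold `1e-12` used to detect $x \in 2\pi\mathbb{Z}$ is an arbitrary numerical choice.

```python
import numpy as np

def fejer_closed_form(N, x):
    """sin^2((N+1)x/2) / ((N+1) sin^2(x/2)), with the value N+1 on 2*pi*Z."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    s = np.sin(x / 2.0)
    out = np.full(x.shape, float(N + 1))  # limiting value where sin(x/2) = 0
    ok = np.abs(s) > 1e-12                # assumed tolerance for "x not in 2*pi*Z"
    out[ok] = np.sin((N + 1) * x[ok] / 2.0) ** 2 / ((N + 1) * s[ok] ** 2)
    return out

def fejer_definition(N, x):
    """(1/(N+1)) * sum_{n=0}^N D_n(x), with D_n summed directly from e^{ikx}."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    total = np.zeros_like(x)
    for n in range(N + 1):
        k = np.arange(-n, n + 1)
        total += np.exp(1j * np.outer(x, k)).sum(axis=1).real
    return total / (N + 1)

x = np.linspace(-np.pi, np.pi, 201)
for N in (0, 1, 2, 10):
    err = np.max(np.abs(fejer_closed_form(N, x) - fejer_definition(N, x)))
    print(f"N={N:2d}  max |closed form - definition| = {err:.2e}")
```

Special-casing the points where $\sin(x/2)$ vanishes mirrors the separate statement $F_N(0) = N + 1$ and avoids a $0/0$ division.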
[example: Small Values Of N]
For $N = 0$: $F_0(x) = D_0(x) = 1$ (the constant function).
For $N = 1$: $F_1(x) = \frac{1}{2}(D_0(x) + D_1(x)) = \frac{1}{2}(1 + 1 + 2\cos x) = 1 + \cos x$. The closed form gives $F_1(x) = \frac{1}{2} \cdot \frac{\sin^2(x)}{\sin^2(x/2)} = \frac{1}{2} \cdot \frac{4\sin^2(x/2)\cos^2(x/2)}{\sin^2(x/2)} = 2\cos^2(x/2) = 1 + \cos x$. Note: $F_1(x) \geq 0$ for all $x$ (since $1 + \cos x \geq 0$), confirming positivity.
For $N = 2$: $F_2(x) = \frac{1}{3}(1 + (1 + 2\cos x) + (1 + 2\cos x + 2\cos 2x)) = 1 + \frac{4}{3}\cos x + \frac{2}{3}\cos 2x$. From the closed form: $F_2(x) = \frac{1}{3}\frac{\sin^2(3x/2)}{\sin^2(x/2)}$. At $x = 0$: $F_2(0) = 3$. At $x = \pi$: $F_2(\pi) = \frac{1}{3}\frac{\sin^2(3\pi/2)}{\sin^2(\pi/2)} = \frac{1}{3} \cdot \frac{1}{1} = 1/3$. The kernel is non-negative everywhere, and on $[-\pi, \pi]$ it vanishes exactly at $x = \pm 2\pi/3$, where $\sin(3x/2) = 0$.
Compare with the Dirichlet kernels at $x = \pi$: $D_0(\pi) = 1$, $D_1(\pi) = -1$ and $D_2(\pi) = 1$. Averaging gives $F_2(\pi) = (1 + (-1) + 1)/3 = 1/3$: the negative value of $D_1$ is absorbed by the averaging.
[/example]
## The Three Approximate Identity Properties
The crucial fact about $F_N$ — and the reason it succeeds where $D_N$ fails — is that it satisfies all three properties of an [approximate identity](/page/Convolution).
**Positivity.** The closed form $F_N(x) = \frac{1}{N+1}\frac{\sin^2((N+1)x/2)}{\sin^2(x/2)}$ is a ratio of non-negative quantities ($\sin^2$ terms) divided by a positive constant $N + 1$, so $F_N(x) \geq 0$ for all $x$. This is the single most important property and the one the Dirichlet kernel lacks.
**Unit mass.** Since $\frac{1}{2\pi}\int D_n = 1$ for each $n$ and $F_N$ is their average: $\frac{1}{2\pi}\int F_N = \frac{1}{N+1}\sum_{n=0}^N 1 = 1$.
**Concentration.** Fix $\delta \in (0, \pi]$. For $\delta \leq |x| \leq \pi$, the denominator satisfies $\sin^2(x/2) \geq \sin^2(\delta/2) > 0$ while the numerator satisfies $\sin^2((N+1)x/2) \leq 1$, so
\begin{align*}
\sup_{\delta \leq |x| \leq \pi} F_N(x) \leq \frac{1}{(N+1)\sin^2(\delta/2)} \to 0 \quad \text{as } N \to \infty.
\end{align*}
Combined with positivity and unit mass, this implies that all the mass of $F_N$ concentrates in $(-\delta, \delta)$ as $N \to \infty$ — exactly the approximate identity behaviour.
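The three properties can be probed numerically. The sketch below is illustrative only: it assumes NumPy, uses hypothetical helper names `fejer` and `check_properties`, and deliberately evaluates $F_N$ from its triangular Fourier coefficients (not the closed form) so that non-negativity is observed and not built in. The grid size `M` is an assumed parameter; the supremum is taken over grid points only.

```python
import numpy as np

def fejer(N, x):
    """F_N(x) = 1 + 2 * sum_{k=1}^N (1 - k/(N+1)) cos(kx)."""
    k = np.arange(1, N + 1)
    weights = 1.0 - k / (N + 1)
    return 1.0 + 2.0 * (np.cos(np.outer(x, k)) * weights).sum(axis=1)

def check_properties(N, delta, M=4096):
    # Equispaced grid on [-pi, pi). For a trigonometric polynomial of
    # degree < M, the grid mean equals (1/2pi) * integral over the circle.
    x = -np.pi + 2.0 * np.pi * np.arange(M) / M
    F = fejer(N, x)
    minimum = F.min()                        # positivity: should be >= 0 up to rounding
    mass = F.mean()                          # unit mass: should be 1
    tail_sup = F[np.abs(x) >= delta].max()   # grid sup over delta <= |x| <= pi
    tail_bound = 1.0 / ((N + 1) * np.sin(delta / 2.0) ** 2)
    return minimum, mass, tail_sup, tail_bound

for N in (4, 16, 64):
    mn, mass, sup, bound = check_properties(N, delta=0.5)
    print(f"N={N:3d}  min={mn:.2e}  mass={mass:.12f}  tail sup={sup:.4f}  bound={bound:.4f}")
```

The tail supremum should stay below the stated bound and shrink as $N$ grows, while the minimum and mass stay at $0$ and $1$ up to rounding.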
## Fejér's Theorem
The approximate identity properties immediately yield uniform convergence of the Cesàro means for continuous [functions](/page/Function). The argument is the standard approximate identity proof: using unit mass, write $\sigma_N f(x) - f(x) = \frac{1}{2\pi}\int_{-\pi}^{\pi} [f(x-t) - f(x)]F_N(t) \, d\mathcal{L}^1(t)$ and split the [integral](/page/Integral) into a near-origin piece $|t| < \delta$ (controlled by uniform [continuity](/page/Continuity) of $f$) and a far piece $\delta \leq |t| \leq \pi$ (controlled by $2\|f\|_\infty$ and the concentration estimate). The positivity of $F_N$ is essential in the near-origin estimate: it gives $\frac{1}{2\pi}\int_{|t|<\delta} |F_N| = \frac{1}{2\pi}\int_{|t|<\delta} F_N \leq 1$ for every $N$, whereas the corresponding integral of $|D_N|$ grows like $\log N$.
[quotetheorem:584]
The density of trigonometric polynomials in $C(\mathbb{T})$ is the trigonometric [Weierstrass approximation theorem](/theorems/480) — the periodic analogue of the classical result that algebraic polynomials are dense in $C([a,b])$. The connection between the two is not coincidental: the Weierstrass theorem can be proved by the same approximate identity method, using the Landau kernel (a polynomial approximate identity on $[-1, 1]$) instead of the Fejér kernel.
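Fejér's theorem can be illustrated on a concrete continuous function. The sketch below uses $f(x) = |x|$ on $[-\pi, \pi]$, an example chosen here and not taken from the text above; its Fourier coefficients $\hat{f}(0) = \pi/2$ and $\hat{f}(k) = ((-1)^k - 1)/(\pi k^2)$ for $k \neq 0$ come from a standard integration by parts. NumPy is assumed, and `abs_coeff` and `cesaro_mean_abs` are hypothetical helper names.

```python
import numpy as np

def abs_coeff(k):
    """Fourier coefficients of f(x) = |x| on [-pi, pi] for integers k >= 1."""
    k = np.asarray(k)
    return ((-1.0) ** k - 1.0) / (np.pi * k ** 2)

def cesaro_mean_abs(N, x):
    """sigma_N f(x) = sum_{|k| <= N} (1 - |k|/(N+1)) * hat f(k) * e^{ikx}."""
    k = np.arange(1, N + 1)
    weights = 1.0 - k / (N + 1)
    # hat f is real and even, so the k and -k terms combine into 2*cos(kx).
    series = (np.cos(np.outer(x, k)) * (weights * abs_coeff(k))).sum(axis=1)
    return np.pi / 2.0 + 2.0 * series

x = np.linspace(-np.pi, np.pi, 1001)
errors = {N: np.max(np.abs(cesaro_mean_abs(N, x) - np.abs(x))) for N in (16, 64, 256)}
for N, err in errors.items():
    print(f"N={N:4d}  sup-norm error on grid = {err:.4f}")
```

The grid sup-norm error should decrease as $N$ grows, consistent with uniform convergence; the decay is slow near the kinks at $0$ and $\pm\pi$.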
## Consequences
### Completeness of the Trigonometric System
Fejér's theorem gives density of trigonometric polynomials in $C(\mathbb{T})$ with respect to $\|\cdot\|_\infty$. Since uniform convergence on $\mathbb{T}$ implies $L^2$ convergence, and $C(\mathbb{T})$ is dense in $L^2(\mathbb{T})$ (indeed in $L^p(\mathbb{T})$ for every $p < \infty$), a two-step approximation shows that trigonometric polynomials are dense in $L^2(\mathbb{T})$. By the [Characterisation of Complete Orthonormal Systems](/theorems/541), this is equivalent to completeness of $\{e^{inx}/\sqrt{2\pi}\}_{n \in \mathbb{Z}}$. See [Completeness of the Trigonometric System](/theorems/585) for the formal statement.
### Uniqueness of Fourier Series
If $f \in L^1(\mathbb{T})$ has $\hat{f}(n) = 0$ for all $n$, then $\sigma_N f = 0$ for all $N$, since $\sigma_N f(x) = \sum_{|n| \leq N}\left(1 - \frac{|n|}{N+1}\right)\hat{f}(n)e^{inx}$. If $f$ is continuous, Fejér's theorem gives $0 = \sigma_N f \to f$ uniformly, so $f = 0$. For general $f \in L^1(\mathbb{T})$, the same approximate identity argument gives $\sigma_N f \to f$ in $L^1$ norm, so $f = 0$ almost everywhere.
### No Gibbs Phenomenon
Let $f$ be real-valued and bounded. Since $\sigma_N f = f * F_N$ is a convolution of $f$ with a non-negative kernel of unit mass, $\inf f \leq \sigma_N f(x) \leq \sup f$ for all $x$ (the convolution is a weighted average, and a weighted average of values in $[\inf f, \sup f]$ stays in that interval). The Cesàro means therefore never overshoot the range of $f$, so the Gibbs overshoot does not occur. This is the practical advantage of Fejér summation over ordinary partial sums.
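The contrast with ordinary partial sums can be seen on the square wave $f(x) = \operatorname{sign}(x)$ on $(-\pi, \pi)$, whose Fourier series is $\frac{4}{\pi}\sum_{k \text{ odd}} \frac{\sin(kx)}{k}$. This example and the helper name `square_wave_sums` are assumptions made for illustration, and NumPy is assumed; the sketch evaluates both $S_N f$ and $\sigma_N f$ on a grid.

```python
import numpy as np

def square_wave_sums(N, x):
    """Return (S_N f, sigma_N f) on the grid x for f = sign on (-pi, pi)."""
    k = np.arange(1, N + 1, 2)                           # only odd frequencies contribute
    terms = (4.0 / np.pi) * np.sin(np.outer(x, k)) / k   # (4/pi) * sin(kx) / k
    partial = terms.sum(axis=1)                          # ordinary partial sum S_N f
    cesaro = (terms * (1.0 - k / (N + 1))).sum(axis=1)   # triangular weights give sigma_N f
    return partial, cesaro

x = np.linspace(-np.pi, np.pi, 4001)
partial, cesaro = square_wave_sums(50, x)
print(f"max S_N f     = {partial.max():.4f}")
print(f"max sigma_N f = {cesaro.max():.4f}")
```

On the grid, the partial sum should exceed $1$ near the jump (the Gibbs overshoot), while the Cesàro mean should stay within $[-1, 1]$ up to rounding.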
## References
- Fejér, L. (1900). Sur les fonctions bornées et intégrables. *Comptes Rendus*, 131, 984–987.
- Katznelson, Y. (2004). *An Introduction to Harmonic Analysis* (3rd ed.). Cambridge University Press.
- Stein, E. M. and Shakarchi, R. (2003). *Fourier Analysis: An Introduction*. Princeton University Press.