Approximation theory studies how complicated functions, data, and operators can be represented by simpler and more tractable objects, and how well that representation can be made. The course begins with the basic notion of best approximation in normed spaces, then develops the central problem of measuring and controlling approximation error in concrete settings. From there it moves through the classical questions of density, existence, and construction: when polynomials, splines, rational functions, or orthogonal expansions can approximate a target function, and how the smoothness or structure of the target governs the rate of approximation.
The main theme is the balance between qualitative existence results and quantitative error estimates. The chapters build from foundational normed-space formulations to constructive Weierstrass theory, then to moduli of smoothness, direct and inverse approximation theorems, and Bernstein-type inequalities that connect approximation rates with regularity. Later chapters specialise to minimax and Chebyshev approximation, orthogonal polynomials and spectral methods, spline spaces and B-splines, and rational and Padé approximation. The course then broadens to approximation in $L^p$ spaces and nonlinear approximation, before ending with applications in numerical analysis, where these ideas become practical tools for computation, interpolation, quadrature, and the stable solution of analytic and differential problems.
# Introduction
Approximation theory begins with a practical tension: many functions that arise in analysis, geometry, physics, and computation are too complicated to manipulate exactly, while finite-dimensional objects are simple enough to store, differentiate, integrate, and evaluate. The course studies how a function can be replaced by a simpler surrogate, how the error of replacement is measured, and how qualitative information such as smoothness or analyticity predicts quantitative convergence rates. Polynomial, trigonometric, spline, and rational approximants will appear as different answers to the same guiding problem: choose a structured family and understand the best error it can achieve.
The point of this introductory chapter is to fix the language used throughout the course. We separate the approximation problem itself from the method used to construct an approximant, and we distinguish existence, uniqueness, stability, and convergence as different questions. This viewpoint links real analysis, functional analysis, Fourier analysis, and numerical linear algebra into a single framework.
## What Is Being Approximated?
The first question is not how to approximate, but what kind of object is being approximated and in what sense two objects are considered close. A [continuous function](/page/Continuous%20Function) on a compact interval, an $L^2$ function, a periodic signal, and a function with isolated singularities demand different approximation spaces and different error measures.
[definition: Approximation Problem]
An approximation problem consists of a normed space $(X, \|\cdot\|_X)$, a target element $f \in X$, and a subset $A \subset X$ whose elements are called admissible approximants.
[/definition]
The subset $A$ is usually much simpler than the ambient space $X$: it might be a finite-dimensional subspace of polynomials, a nonlinear family of rational functions, or a space of splines with prescribed knots. This definition identifies the data of the problem, but it does not yet assign a number to the quality of the best possible replacement. To compare different approximation families, we need a single quantity that records the smallest error that the family permits.
[definition: Best Approximation Error]
Let $(X, \|\cdot\|_X)$ be a normed space and let $A \subset X$. The best approximation error from $A$ is the map
\begin{align*}
E_A(\cdot)_X &: X \to [0,\infty], & E_A(f)_X &:= \inf_{a \in A} \|f-a\|_X.
\end{align*}
[/definition]
The infimum records the performance limit of the family $A$, regardless of whether a minimiser has been found. A substantial part of approximation theory asks whether this infimum is attained, whether the minimiser is unique, and whether a computable procedure can find it without amplifying data errors.
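As a rough numerical illustration of the infimum in this definition, the sketch below estimates $E_A(f)_X$ by brute force. Everything specific in it is an assumption made for illustration and does not come from the course text: the target $f(x)=x^2$ on $[0,1]$, the one-parameter family $A=\{x\mapsto ax : a\in\mathbb R\}$, the uniform norm replaced by a maximum over a sample grid, and the hypothetical helper name `estimate_best_error`. A grid search of this kind only approximates the infimum; it does not show that a minimiser exists.

```python
import numpy as np

def estimate_best_error(f, family, params, x):
    """Grid estimate of inf_a ||f - a|| in a sampled uniform norm.

    f       -- target function, evaluated on the sample points x
    family  -- family(p, x) returns the approximant with parameter p on x
    params  -- 1-D array of candidate parameters (a finite subset of A)
    x       -- sample points standing in for the domain
    """
    fx = f(x)
    # Sampled uniform-norm error of every candidate in the finite subset.
    errors = np.array([np.max(np.abs(fx - family(p, x))) for p in params])
    best = int(np.argmin(errors))
    return params[best], errors[best]

# Illustrative data (assumed): f(x) = x^2 on [0, 1], A = {x -> a*x}.
x = np.linspace(0.0, 1.0, 1001)
slopes = np.linspace(0.0, 2.0, 2001)
a_best, err_best = estimate_best_error(
    lambda t: t**2, lambda a, t: a * t, slopes, x
)

# Balancing the deviation at x = 1 against the interior extremum at x = a/2
# gives a = 2*sqrt(2) - 2 and error 3 - 2*sqrt(2) for this particular family.
print(a_best, err_best)
print(2 * np.sqrt(2) - 2, 3 - 2 * np.sqrt(2))
```

The grid values should agree with the closed-form pair only up to the resolution of the two grids, which is the sense in which the computed number is an estimate of the infimum and not the infimum itself.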
[definition: Best Approximant]
Let $(X, \|\cdot\|_X)$ be a normed space, let $f \in X$, and let $A \subset X$. An element $a^* \in A$ is a best approximant to $f$ from $A$ if
\begin{align*}
\|f-a^*\|_X = E_A(f)_X.
\end{align*}
[/definition]
A best approximant plays the role of a nearest-point projection of $f$ onto the model class, but it need not behave like an [orthogonal projection](/theorems/437) unless the space has Hilbert structure: in a general normed space it may fail to be unique, and it may depend nonlinearly on $f$. This distinction explains why least-squares approximation and uniform approximation have rather different theories.
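A small finite-dimensional sketch can make the contrast concrete. The setting is an assumption chosen for illustration and is not taken from the text: $X=\mathbb R^2$, target $f=(0,1)$, and $A=\{(t,0): t\in\mathbb R\}$, compared under the Euclidean norm and the maximum norm. In the Euclidean case the nearest point is the orthogonal projection $(0,0)$; in the maximum norm every $(t,0)$ with $|t|\le 1$ has error $1$, so the best approximant is not unique.

```python
import numpy as np

# Assumed toy problem: X = R^2, target f = (0, 1), A = {(t, 0) : t real}.
f = np.array([0.0, 1.0])
t = np.linspace(-2.0, 2.0, 401)                       # sampled parameters of A
candidates = np.stack([t, np.zeros_like(t)], axis=1)  # rows are points (t, 0)

residuals = f - candidates
err_euclid = np.linalg.norm(residuals, ord=2, axis=1)  # sqrt(t^2 + 1)
err_max = np.max(np.abs(residuals), axis=1)            # max(|t|, 1)

def minimisers(errors, tol=1e-12):
    """Return the sampled parameters whose error is within tol of the minimum."""
    return t[errors <= errors.min() + tol]

euclid_min = minimisers(err_euclid)  # only the sample at t = 0
max_min = minimisers(err_max)        # every sampled t with |t| <= 1

print(len(euclid_min), euclid_min)
print(len(max_min), max_min.min(), max_min.max())
```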
[example: Best Constant Approximation In Two Norms]
Consider $f(x)=x$ on $[0,1]$ and approximate it by constants. Write $\mathcal L^1$ for one-dimensional [Lebesgue measure](/page/Lebesgue%20Measure). For a constant $c\in\mathbb R$, the squared $L^2$ error is
\begin{align*}
\int_0^1 (x-c)^2\,d\mathcal L^1(x)=\int_0^1 (x^2-2cx+c^2)\,d\mathcal L^1(x)=\frac13-c+c^2.
\end{align*}
Completing the square gives
\begin{align*}
\frac13-c+c^2=\left(c-\frac12\right)^2+\frac1{12}.
\end{align*}
The term $\left(c-\frac12\right)^2$ is minimised exactly when $c=1/2$, so the best constant in $L^2(0,1)$ is $1/2$, with squared error $1/12$.
In $C[0,1]$ with the uniform norm, the error of the same constant $c$ is
\begin{align*}
\|x-c\|_\infty=\sup_{0\le x\le 1}|x-c|.
\end{align*}
Every $c$ satisfies the endpoint lower bound
\begin{align*}
1=|1-0|=|(1-c)+(c-0)|\le |1-c|+|c|\le 2\sup_{0\le x\le 1}|x-c|.
\end{align*}
Hence $\|x-c\|_\infty\ge 1/2$ for every constant $c$. For $c=1/2$, every $x\in[0,1]$ satisfies $-1/2\le x-1/2\le 1/2$, so
\begin{align*}
\left\|x-\frac12\right\|_\infty=\frac12.
\end{align*}
Thus the best constant is again $1/2$, but the $L^2$ computation comes from minimising an integral, while the uniform-norm computation comes from balancing the endpoint deviations.
[/example]
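The two computations in this example can be checked numerically. The following is a minimal sketch, assuming a grid of candidate constants in $[0,1]$, a midpoint rule for the integral, and a sampled maximum in place of the supremum; these discretisation choices are illustrative and not part of the example itself.

```python
import numpy as np

# Candidate constants c and sample points for f(x) = x on [0, 1].
c = np.linspace(0.0, 1.0, 101)
n = 1000
x_mid = (np.arange(n) + 0.5) / n       # midpoints for the L^2 integral
x_grid = np.linspace(0.0, 1.0, n + 1)  # includes both endpoints for the sup

# Squared L^2 error by the midpoint rule: mean of (x - c)^2 over the midpoints.
l2_sq = np.mean((x_mid[None, :] - c[:, None]) ** 2, axis=1)
# Uniform error: largest sampled deviation |x - c|.
sup = np.max(np.abs(x_grid[None, :] - c[:, None]), axis=1)

c_l2, err_l2 = c[np.argmin(l2_sq)], l2_sq.min()
c_sup, err_sup = c[np.argmin(sup)], sup.min()

print(c_l2, err_l2)    # compare with c = 1/2 and squared error 1/12
print(c_sup, err_sup)  # compare with c = 1/2 and error 1/2
```

Both searches should select $c=1/2$, with the $L^2$ value matching $1/12$ up to quadrature error and the uniform value matching $1/2$.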
## Measuring Error
The second question is how the error should be measured. The norm encodes the features of the target that the approximation is required to preserve: average accuracy, pointwise worst-case accuracy, smoothness of derivatives, or stability under an operator.
[definition: Uniform Norm]
Let $K$ be a compact [topological space](/page/Topological%20Space). The uniform norm is the map
\begin{align*}
\|\cdot\|_\infty &: C(K) \to [0,\infty), & \|f\|_\infty &:= \sup_{x \in K} |f(x)|.
\end{align*}
[/definition]
Uniform approximation is strong because it controls the error at every point of the domain. It is the natural setting for polynomial density on compact intervals and for Chebyshev approximation, where the maximal deviation determines the quality of the approximant. Many applications instead care about accumulated or average error, so the next definition squares the pointwise discrepancy and integrates it over the domain.
[definition: Least-Squares Error]
Let $(E, \mathcal E, \mu)$ be a [measure space](/page/Measure%20Space). The least-squares error is the map
\begin{align*}
L^2(E,\mathcal E,\mu) \times L^2(E,\mathcal E,\mu) &\to [0,\infty), & (f,g) &\mapsto \|f-g\|_{L^2}^2,
\end{align*}
where
\begin{align*}
\|f-g\|_{L^2}^2 = \int_E |f-g|^2\,d\mu.
\end{align*}
[/definition]
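To see how differently the uniform norm and the least-squares error can behave, the sketch below compares them on $f_n(x)=x^n$ against $g=0$ on $[0,1]$ with Lebesgue measure. The functions, the measure space, the exponents, the quadrature rule, and the helper names `sup_error` and `least_squares_error` are all illustrative assumptions. Exactly, $\|f_n\|_\infty=1$ for every $n$, while $\|f_n\|_{L^2}^2=1/(2n+1)$ tends to zero, so a small least-squares error does not force a small uniform error.

```python
import numpy as np

def sup_error(f, g, x):
    """Sampled uniform error: max |f - g| over the sample points x."""
    return float(np.max(np.abs(f(x) - g(x))))

def least_squares_error(f, g, n=100_000):
    """Midpoint-rule estimate of the integral of |f - g|^2 over [0, 1]."""
    x_mid = (np.arange(n) + 0.5) / n
    return float(np.mean(np.abs(f(x_mid) - g(x_mid)) ** 2))

def zero(x):
    """The comparison function g = 0."""
    return np.zeros_like(x)

x = np.linspace(0.0, 1.0, 1001)  # contains the endpoint x = 1

results = {}
for power in (1, 10, 100):
    f = lambda s, p=power: s ** p
    results[power] = (sup_error(f, zero, x), least_squares_error(f, zero))
    # Exact values for comparison: sup = 1, squared L^2 error = 1 / (2p + 1).
    print(power, results[power], 1.0 / (2 * power + 1))
```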
Least-squares approximation replaces pointwise control by averaged quadratic control. The square is not only a modelling choice: with exponent $2$ the norm comes from an inner product, so $L^2$ is a Hilbert space and the geometry of angles and orthogonality becomes available. The next theorem answers the basic computational question for finite-dimensional least-squares problems: how can we recognise the minimiser without testing every candidate in the approximation space?