Consider the polynomial $p(x) = x^3 - 2$. Since $p(1) = -1 < 0$ and $p(2) = 6 > 0$, the function changes sign on $[1, 2]$, so there must be a root somewhere in between — namely $\sqrt[3]{2}$. This reasoning is the Intermediate Value Theorem, and it requires continuity. If we replace $p$ with the discontinuous function
\begin{align*}
f(x) = \begin{cases} -1 & \text{if } x < \sqrt[3]{2}, \\ 1 & \text{if } x \geq \sqrt[3]{2}, \end{cases}
\end{align*}
then $f(1) = -1 < 0$ and $f(2) = 1 > 0$, yet $f$ never equals zero — the jump at $\sqrt[3]{2}$ allows the function to "skip" the value $0$ entirely. The discontinuity creates a gap in the range that continuity would have prevented.
This simple observation, that continuous [functions](/page/Function) cannot skip values, is the entry point to a remarkably deep theory. Continuity is the condition under which [limits](/page/Limit) commute with function evaluation, and it is the minimal regularity needed for the major theorems of real analysis: the Intermediate Value Theorem, the Extreme Value Theorem, and the [Fundamental Theorem of Calculus](/theorems/632) all require it. Yet continuity is also surprisingly permissive: a continuous function can oscillate wildly at every scale (the Weierstrass function), and a function can be continuous at every irrational point while being discontinuous on a dense set (the Thomae function). Both examples defy the geometric intuition one might bring from drawing smooth curves.
This page develops continuity for functions $f: E \to \mathbb{R}$ where $E \subseteq \mathbb{R}$, the concrete setting of Cambridge IA Analysis. The general metric-space and [topological](/page/Topology) definitions are developed on the parent page [Continuity](/page/Continuity).
## The $\varepsilon$-$\delta$ Definition
The informal idea of continuity — "small changes in input produce small changes in output" — must be made precise. The question is: how small is "small"? The $\varepsilon$-$\delta$ definition answers this by demanding that *for any tolerance $\varepsilon$ on the output*, there exists a corresponding tolerance $\delta$ on the input that guarantees the output stays within $\varepsilon$ of $f(a)$.
[definition:Continuity At A Point Real Line]
Let $E \subseteq \mathbb{R}$ and let $f: E \to \mathbb{R}$. The function $f$ is **continuous at** $a \in E$ if for every $\varepsilon > 0$ there exists $\delta > 0$ such that
\begin{align*}
|x - a| < \delta \quad \text{and} \quad x \in E \quad \implies \quad |f(x) - f(a)| < \varepsilon.
\end{align*}
The function $f$ is **continuous on $E$** if it is continuous at every point $a \in E$.
[/definition]
The quantifier structure $\forall \varepsilon > 0 \; \exists \delta > 0$ is the engine of the definition. Earlier mathematicians, including Newton and Leibniz, spoke of "infinitesimal changes": quantities smaller than any positive number yet nonzero. The insight of Bolzano and Cauchy, put into its modern form by Weierstrass, was to eliminate infinitesimals entirely: instead of asserting that the difference $|f(x) - f(a)|$ *becomes* infinitesimal, the definition demands a verifiable test for *every* positive $\varepsilon$, however small. This quantifier machinery works in abstract spaces where geometric intuition fails, and it became the foundation of modern analysis.
Notice that $\delta$ is allowed to depend on both $\varepsilon$ *and* the point $a$. This dependence is essential: for $f(x) = x^2$ near $a = 100$, a much smaller $\delta$ is needed than near $a = 1$ to achieve the same $\varepsilon$. The question of when $\delta$ can be chosen independently of $a$ — *uniform* continuity — is one of the central themes of this page.
[example:Epsilon Delta Estimation For A Quadratic]
We prove that $f(x) = x^2$ is continuous at $a = 3$. The key technique is to *factor* $|f(x) - f(a)|$ and control each factor separately.
**Step 1: Factor the distance.** $|x^2 - 9| = |x - 3| \cdot |x + 3|$. The first factor $|x - 3|$ is directly controlled by $\delta$. The second factor $|x + 3|$ must be bounded, which requires restricting $x$ to a neighbourhood of $3$.
**Step 2: Bound the uncontrolled factor.** If $|x - 3| < 1$, then $|x| < 4$, so $|x + 3| \leq |x| + 3 < 7$. This gives $|x^2 - 9| < 7|x - 3|$.
**Step 3: Choose $\delta$.** Given $\varepsilon > 0$, set $\delta = \min(1, \varepsilon/7)$. If $|x - 3| < \delta$, then both restrictions are active: $|x - 3| < 1$ (so the bound from Step 2 applies) and $|x - 3| < \varepsilon/7$. Therefore $|x^2 - 9| < 7 \cdot \varepsilon/7 = \varepsilon$.
This "factor and bound" technique is the standard method for $\varepsilon$-$\delta$ proofs. The trick $\delta = \min(1, \varepsilon/7)$ — imposing a preliminary bound to control the "bad" factor — recurs throughout analysis.
[/example]
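The choice $\delta = \min(1, \varepsilon/7)$ from the example can be spot-checked numerically. The following is a minimal sketch, not a proof: it samples points with $|x - 3| < \delta$ and tests $|x^2 - 9| < \varepsilon$. The function names `delta_for` and `check` are illustrative choices, not part of the source.

```python
def delta_for(eps: float) -> float:
    """The delta chosen in Step 3 of the example: min(1, eps/7)."""
    return min(1.0, eps / 7.0)


def check(eps: float, samples: int = 1000) -> bool:
    """Sample points with |x - 3| < delta and test |x^2 - 9| < eps."""
    delta = delta_for(eps)
    for k in range(1, samples):
        # Offsets sweep the open interval (-delta, delta).
        offset = delta * (2 * k / samples - 1)
        x = 3.0 + offset
        if abs(x * x - 9.0) >= eps:
            return False
    return True


for eps in (10.0, 1.0, 0.1, 1e-3):
    print(eps, delta_for(eps), check(eps))
```

A sampled check of this kind can only fail to find a counterexample; the factor-and-bound argument is what guarantees the implication for every $x$.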
## The Sequential Characterisation
The $\varepsilon$-$\delta$ definition is precise but unwieldy for many arguments. In $\mathbb{R}$, there is a cleaner equivalent: continuity means preserving convergent [sequences](/page/Sequence).
[quotetheorem:179]
The sequential characterisation is the standard tool for proving discontinuity: to show $f$ is discontinuous at $a$, exhibit a single [sequence](/page/Sequence) $x_n \to a$ in $E$ with $f(x_n) \not\to f(a)$. This is far easier than showing that *no* $\delta$ works for some $\varepsilon$ in the $\varepsilon$-$\delta$ definition. The forward direction (continuity implies sequence preservation) is immediate. The reverse is proved by contraposition: if $f$ is not continuous at $a$, then some $\varepsilon > 0$ admits no $\delta$, so taking $\delta = 1/n$ produces points $x_n \in E$ with $|x_n - a| < 1/n$ but $|f(x_n) - f(a)| \geq \varepsilon$, yielding a "bad sequence" with $x_n \to a$ but $f(x_n) \not\to f(a)$.
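As a small illustration of the "single bad sequence" test, the sketch below uses a hypothetical step function with its jump at $0$ (a simplified analogue of the jump function from the introduction, with the jump moved to $0$ so that floating-point comparisons are exact). The sequence $x_n = -1/n$ converges to $0$, but the values $f(x_n)$ do not approach $f(0)$.

```python
def step(x: float) -> float:
    """A jump at 0: value -1 to the left, +1 at 0 and to the right."""
    return -1.0 if x < 0 else 1.0


a = 0.0
xs = [-1.0 / n for n in range(1, 8)]   # x_n -> 0 from the left
values = [step(x) for x in xs]

print(values)    # every entry is -1.0
print(step(a))   # 1.0, so f(x_n) does not converge to f(a)
```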
## Pathological Examples
The power of the $\varepsilon$-$\delta$ definition is that it handles functions far wilder than anything one might draw by hand. The following three examples, each a milestone in the history of analysis, reveal just how strange continuous and discontinuous functions can be.
### The Dirichlet Function: Discontinuous Everywhere
[example:The Dirichlet Function]
Define $\mathbf{1}_\mathbb{Q}: \mathbb{R} \to \mathbb{R}$ by
\begin{align*}
\mathbf{1}_\mathbb{Q}(x) = \begin{cases} 1 & \text{if } x \in \mathbb{Q}, \\ 0 & \text{if } x \notin \mathbb{Q}. \end{cases}
\end{align*}
This function is discontinuous at every point. At any $a \in \mathbb{R}$, the density of both $\mathbb{Q}$ and $\mathbb{R} \setminus \mathbb{Q}$ in $\mathbb{R}$ provides two sequences converging to $a$ with different image limits.
**Proof of discontinuity.** Fix any $a \in \mathbb{R}$. By the density of the rationals, there exists a sequence of rationals $r_n \to a$; by the density of the irrationals, there exists a sequence of irrationals $s_n \to a$. Then $\mathbf{1}_\mathbb{Q}(r_n) = 1 \to 1$ and $\mathbf{1}_\mathbb{Q}(s_n) = 0 \to 0$. Since $1 \neq 0$, $\mathbf{1}_\mathbb{Q}$ cannot be continuous at $a$: if it were, both sequences would have to converge to $\mathbf{1}_\mathbb{Q}(a)$, but they converge to different values.
The Dirichlet function is the standard example showing that a function can be discontinuous at *every* point. It is also not [Riemann integrable](/page/Riemann%20Integral): on every subinterval $[x_{j-1}, x_j]$, the supremum of $\mathbf{1}_\mathbb{Q}$ is $1$ and the infimum is $0$, so the upper and lower sums never agree.
[/example]
### The Thomae Function: Continuous on a Dense Set
[example:The Thomae Function]
Define $f: (0, 1) \to \mathbb{R}$ by
\begin{align*}
f(x) = \begin{cases} \frac{1}{q} & \text{if } x = \frac{p}{q} \text{ in lowest terms with } p, q \in \mathbb{N}, \\ 0 & \text{if } x \notin \mathbb{Q}. \end{cases}
\end{align*}
This function is continuous at every irrational and discontinuous at every rational: a striking demonstration that the set of continuity points and the set of discontinuity points of a function can both be dense.
**Proof of continuity at irrationals.** Fix an irrational $a \in (0, 1)$ and $\varepsilon > 0$. The key observation is that only *finitely many* rationals in $(0, 1)$ have denominator $q \leq 1/\varepsilon$: these are the fractions $p/q$ with $1 \leq p < q \leq \lfloor 1/\varepsilon \rfloor$. Since this set $B_\varepsilon$ is finite and $a \notin B_\varepsilon$ (because $a$ is irrational), we can choose $\delta > 0$ small enough that the interval $(a - \delta, a + \delta)$ contains none of the points of $B_\varepsilon$.
For any $x \in (a - \delta, a + \delta) \cap (0, 1)$: if $x$ is irrational, $f(x) = 0$; if $x = p/q$ in lowest terms, then $q > 1/\varepsilon$ (since $x \notin B_\varepsilon$), so $f(x) = 1/q < \varepsilon$. In both cases, $|f(x) - f(a)| = |f(x)| < \varepsilon$.
**Proof of discontinuity at rationals.** Fix a rational $a = p/q \in (0,1)$, so $f(a) = 1/q > 0$. Choose a sequence of irrationals $s_n \to a$. Then $f(s_n) = 0 \to 0 \neq 1/q = f(a)$.
The Thomae function teaches a fundamental technique: to prove an $\varepsilon$-$\delta$ statement, identify the *finitely many* obstructions (the rationals with small denominator), then choose $\delta$ to avoid all of them. This "finitely many bad points" argument recurs throughout analysis — in the proof that monotone functions have only countably many discontinuities, in the proof that Riemann-integrable functions are continuous almost everywhere, and in the construction of the [Riemann integral](/page/Riemann%20Integral) itself.
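Unlike the Dirichlet function, the Thomae function *is* Riemann [integrable](/page/Integral), with $\int_0^1 f = 0$. The finitely-many-bad-points argument controls the upper sum: for any $\varepsilon > 0$, only finitely many points have $f(x) \geq \varepsilon/2$ (the small-denominator rationals). Confine these to subintervals of total length less than $\varepsilon/2$; since $f \leq 1$ there and $f < \varepsilon/2$ on the remaining subintervals, this gives $S(f, \mathcal{D}) < \varepsilon$ for a suitably chosen partition.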
[/example]
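The "finitely many bad points" step can be made concrete. The sketch below enumerates the finite set $B_\varepsilon$, takes $\delta$ to be the distance from $a$ to the nearest point of $B_\varepsilon$, and checks that nearby rationals have small Thomae value. It rests on assumptions: the irrational $a$ is replaced by the floating-point approximation of $\sqrt{2}/2$, only rationals with denominator up to $300$ are tested, and the names `thomae`, `bad_set`, and `delta_for` are illustrative.

```python
from fractions import Fraction
from math import floor, sqrt


def thomae(x: Fraction) -> Fraction:
    """Thomae's function at a rational x in (0, 1): 1/q with x = p/q in lowest terms."""
    return Fraction(1, x.denominator)


def bad_set(eps: float) -> set:
    """B_eps: the rationals p/q in (0, 1) with denominator q <= 1/eps."""
    q_max = floor(1 / eps)
    return {Fraction(p, q) for q in range(2, q_max + 1) for p in range(1, q)}


def delta_for(a: float, eps: float) -> float:
    """Distance from a to the nearest point of B_eps (assumes B_eps is nonempty)."""
    return min(abs(a - float(r)) for r in bad_set(eps))


a = sqrt(2) / 2          # float stand-in for an irrational point of (0, 1)
eps = 0.1
delta = delta_for(a, eps)

# Rationals with denominator up to 300 lying inside (a - delta, a + delta).
nearby = {Fraction(p, q) for q in range(2, 301) for p in range(1, q)
          if abs(float(Fraction(p, q)) - a) < delta}
print(delta, max(thomae(r) for r in nearby) < eps)
```

Because every rational within $\delta$ of $a$ lies outside $B_\varepsilon$, its denominator exceeds $1/\varepsilon$, which is exactly the mechanism of the proof.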
### The Topologist's Sine Curve: Oscillation Without Settling
[example:The Topologist's Sine Curve]
Define $f: \mathbb{R} \to \mathbb{R}$ by $f(x) = \sin(1/x)$ for $x \neq 0$ and $f(0) = 0$. This function is continuous at every $x \neq 0$ (as a composition of continuous functions) but discontinuous at $0$: the sequence $x_n = 1/(2\pi n + \pi/2) \to 0$ gives $f(x_n) = \sin(2\pi n + \pi/2) = 1$ for all $n$, so $f(x_n) \to 1 \neq 0 = f(0)$.
Compare with $g(x) = x \sin(1/x)$ for $x \neq 0$ and $g(0) = 0$. Now $g$ *is* continuous at $0$: for any sequence $x_n \to 0$, $|g(x_n)| = |x_n| \cdot |\sin(1/x_n)| \leq |x_n| \to 0$, so $g(x_n) \to 0 = g(0)$. The factor of $x$ "kills" the oscillation by forcing the amplitude to decay even as the frequency diverges.
This pair illustrates a general principle: oscillation alone does not destroy continuity — what matters is whether the *amplitude* of the oscillation tends to zero. The function $\sin(1/x)$ oscillates with constant amplitude $1$ near $0$, causing discontinuity. The function $x \sin(1/x)$ oscillates with amplitude $|x| \to 0$, preserving continuity.
[/example]
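The contrast between the two functions shows up directly along the sequence $x_n = 1/(2\pi n + \pi/2)$ used in the example. The sketch below evaluates both in floating point, so the printed values are approximations; the names `f` and `g` follow the example.

```python
import math


def f(x: float) -> float:
    """sin(1/x), with the value 0 assigned at x = 0."""
    return math.sin(1.0 / x) if x != 0 else 0.0


def g(x: float) -> float:
    """x * sin(1/x), with the value 0 assigned at x = 0."""
    return x * f(x)


for n in (1, 10, 100, 1000):
    x_n = 1.0 / (2 * math.pi * n + math.pi / 2)
    # f stays near 1 along the sequence, while g shrinks with x_n.
    print(n, x_n, f(x_n), g(x_n))
```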
## Algebra of Continuous Functions
Continuity is preserved by the algebraic operations: sums, products, quotients (where the denominator is nonzero), and compositions of continuous functions are continuous. This follows directly from the corresponding properties of [limits](/page/Limit) of sequences.
[quotetheorem:197]
These closure properties have an important consequence: every polynomial is continuous on $\mathbb{R}$ (since $x \mapsto x$ and $x \mapsto c$ are continuous, and polynomials are built from these by addition and multiplication), and every rational function $p(x)/q(x)$ is continuous wherever $q(x) \neq 0$. Combined with the continuity of $\sin$, $\cos$, $\exp$, and $\log$ (established via [power series](/page/Power%20Series) or $\varepsilon$-$\delta$ arguments), this gives continuity of every "elementary" function on its natural domain.
## Global Consequences on Closed Bounded Intervals
A function that is continuous at every point of its domain can still behave badly if the domain is not well-chosen: $f(x) = 1/x$ on $(0, 1)$ is continuous but unbounded, and $f(x) = x$ on $(0, 1)$ is continuous but attains neither its supremum nor its infimum. The theorems below show that these pathologies disappear on *closed bounded intervals* $[a, b]$ — the domain structure, not just pointwise continuity, is what enables strong global conclusions.
### The Intermediate Value Theorem
[quotetheorem:180]
The Intermediate Value Theorem says that continuous functions on intervals cannot "skip" values: the image of an interval under a continuous function is again an interval. This is the topological statement that continuous images of connected [sets](/page/Set) are connected, specialised to $\mathbb{R}$.
The proof requires the completeness of $\mathbb{R}$. Suppose, say, that $f(a) < \gamma < f(b)$. The natural candidate for the point $c$ where $f(c) = \gamma$ is $c = \sup\{x \in [a, b] : f(x) < \gamma\}$. This supremum exists by the completeness axiom (the set contains $a$ and is bounded above by $b$). Continuity then forces $f(c) = \gamma$: if $f(c) < \gamma$, continuity gives a neighbourhood of $c$ on which $f < \gamma$, so the set contains points to the right of $c$, contradicting that $c$ is an upper bound; if $f(c) > \gamma$, continuity gives a neighbourhood of $c$ on which $f > \gamma$, so some point to the left of $c$ is already an upper bound, contradicting that $c$ is the least one.
In $\mathbb{Q}$, the IVT fails: the function $f(x) = x^2 - 2$ on $[1, 2] \cap \mathbb{Q}$ is continuous (restricted to $\mathbb{Q}$), changes sign, but has no rational root. The "gap" at $\sqrt{2}$ allows $f$ to skip the value $0$. Completeness plugs such gaps.
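The sign-change reasoning also suggests a computation. The sketch below uses interval bisection, a different route from the supremum argument above, to approximate the root of $x^3 - 2$ on $[1, 2]$ from the introduction. It works in floating point, so it returns an approximation rather than the exact root; the function name `bisect` and the tolerance are illustrative choices.

```python
def bisect(f, lo: float, hi: float, tol: float = 1e-12) -> float:
    """Approximate a root of a continuous f on [lo, hi], assuming f(lo) < 0 < f(hi)."""
    assert f(lo) < 0 < f(hi), "need a sign change on the starting interval"
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid   # the sign change lies in [mid, hi]
        else:
            hi = mid   # the sign change lies in [lo, mid]
    return (lo + hi) / 2


root = bisect(lambda x: x**3 - 2, 1.0, 2.0)
print(root)
```

Each step keeps an interval on which the function changes sign, and continuity is what guarantees that the nested intervals close in on an actual root rather than on a jump.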
[example:Fixed Points Of Continuous Self-Maps]
Every continuous function $f: [0, 1] \to [0, 1]$ has a fixed point. Define $g(x) = f(x) - x$. Then $g$ is continuous, $g(0) = f(0) \geq 0$ (since $f(0) \in [0, 1]$), and $g(1) = f(1) - 1 \leq 0$ (since $f(1) \in [0, 1]$). By the [Intermediate Value Theorem](/theorems/180), there exists $c \in [0, 1]$ with $g(c) = 0$, i.e., $f(c) = c$.
This is the one-dimensional Brouwer fixed-point theorem. In higher dimensions, the standard proofs use algebraic topology or combinatorial tools such as Sperner's lemma (the full Brouwer theorem states that every continuous map from the closed unit ball $\overline{B}^n$ to itself has a fixed point), but in one dimension, the IVT suffices.
[/example]
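The auxiliary function $g(x) = f(x) - x$ from the example also gives a way to approximate a fixed point. The sketch below bisects on the sign of $g$ and applies it to $\cos$, which maps $[0, 1]$ into itself; the helper name `fixed_point` and the choice of $\cos$ are illustrative assumptions, and the result is a floating-point approximation.

```python
import math


def fixed_point(f, tol: float = 1e-12) -> float:
    """Approximate a fixed point of a continuous f: [0, 1] -> [0, 1]
    by bisecting on the sign of g(x) = f(x) - x."""
    lo, hi = 0.0, 1.0          # invariant: g(lo) >= 0 and g(hi) <= 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) - mid >= 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


c = fixed_point(math.cos)      # cos maps [0, 1] into [cos 1, 1], a subset of [0, 1]
print(c, math.cos(c))
```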
### Boundedness and Extreme Values
[quotetheorem:181]
[quotetheorem:182]
The Extreme Value Theorem guarantees not just that $f$ is bounded, but that the bounds are *attained*: there exist points where $f$ achieves its maximum and minimum. The proof uses the [Bolzano-Weierstrass Theorem](/theorems/628). Since $f$ takes values arbitrarily close to $M = \sup f([a,b])$, there is a sequence $x_n$ in $[a, b]$ with $f(x_n) \to M$; by Bolzano-Weierstrass, $x_n$ has a convergent subsequence $x_{n_k} \to c \in [a, b]$ (the *closedness* of $[a, b]$ ensures $c$ stays in the domain); continuity then gives $f(c) = \lim f(x_{n_k}) = M$.
Both hypotheses on the domain are essential. Closedness: $f(x) = x$ on $(0, 1)$ achieves neither its supremum $1$ nor its infimum $0$, because the limit points $0$ and $1$ lie outside the domain, and $f(x) = 1/x$ on $(0, 1]$ is continuous but unbounded, because the missing left endpoint allows the function to escape to $+\infty$. Boundedness: $f(x) = x$ on the closed but unbounded interval $[0, \infty)$ is continuous, yet it is unbounded and has no maximum.
## Uniform Continuity
Ordinary continuity allows $\delta$ to depend on both $\varepsilon$ *and* the point $a$: for each point, a different $\delta$ may be needed. Uniform continuity strengthens this by demanding a single $\delta$ that works simultaneously at every point.
[definition:Uniformly Continuous Real Line]
A function $f: E \to \mathbb{R}$ (where $E \subseteq \mathbb{R}$) is **uniformly continuous on $E$** if for every $\varepsilon > 0$ there exists $\delta > 0$ such that
\begin{align*}
|x - y| < \delta \implies |f(x) - f(y)| < \varepsilon \quad \text{for all } x, y \in E.
\end{align*}
[/definition]
The difference between continuity and uniform continuity is a quantifier swap. Continuity says: $\forall a \in E \; \forall \varepsilon > 0 \; \exists \delta > 0 \; \ldots$ Uniform continuity says: $\forall \varepsilon > 0 \; \exists \delta > 0 \; \forall x, y \in E \; \ldots$ Moving the universal quantifier over points from before the existential $\exists \delta$ to after it gives a strictly stronger condition: $\delta$ must be chosen before the points are known, so it cannot depend on them.
[example:Why The Quantifier Swap Matters]
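The function $f(x) = x^2$ is continuous on all of $\mathbb{R}$ but *not* uniformly continuous on $\mathbb{R}$. Since $|x^2 - a^2| = |x - a| \cdot |x + a|$ and $|x + a| \leq 2|a| + 1$ whenever $|x - a| < 1$, the choice $\delta = \min(1, \varepsilon / (2|a| + 1))$ achieves $|x^2 - a^2| < \varepsilon$, and no $\delta$ much larger than $\varepsilon/(2|a|)$ can work: the required $\delta$ shrinks as $|a|$ grows. No single $\delta$ works for all $a$ simultaneously.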
Formally: take $x_n = n + 1/(2n)$ and $y_n = n$. Then $|x_n - y_n| = 1/(2n) \to 0$, but $|f(x_n) - f(y_n)| = |x_n^2 - n^2| = 2n \cdot \frac{1}{2n} + \frac{1}{4n^2} = 1 + \frac{1}{4n^2} > 1/2$ for every $n$. So any $\delta$ that is supposed to work for $\varepsilon = 1/2$ is violated by the pair $(x_n, y_n)$ as soon as $1/(2n) < \delta$.
On the compact interval $[0, M]$, however, $f$ *is* uniformly continuous — this is guaranteed by the [Heine-Cantor Theorem](/theorems/280), which we state next.
[/example]
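The witness pairs $x_n = n + 1/(2n)$, $y_n = n$ from the example can be tabulated exactly. The sketch below uses rational arithmetic so that no rounding is involved; the helper name `gap` is an illustrative choice. The input gap shrinks towards $0$ while the output gap stays above $1/2$.

```python
from fractions import Fraction


def gap(n: int):
    """Return (|x_n - y_n|, |x_n^2 - y_n^2|) for x_n = n + 1/(2n) and y_n = n."""
    x = n + Fraction(1, 2 * n)
    y = Fraction(n)
    return abs(x - y), abs(x * x - y * y)


for n in (1, 10, 100, 1000):
    dx, df = gap(n)
    print(n, float(dx), float(df))
```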
### The Heine-Cantor Theorem
The remarkable fact about continuous functions on closed bounded intervals is that uniform continuity comes for free.
[quotetheorem:280]
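The proof uses compactness. Fix $\varepsilon > 0$. For each $c \in [a, b]$, continuity provides a $\delta_c > 0$ such that $|f(x) - f(c)| < \varepsilon/2$ whenever $|x - c| < \delta_c$. The intervals $(c - \delta_c/2, c + \delta_c/2)$ form an open cover of $[a, b]$; by [Heine-Borel](/theorems/271), finitely many, centred at $c_1, \ldots, c_k$, suffice. Taking $\delta = \min_i \delta_{c_i}/2$ over this finite subcover gives a uniform $\delta$: if $|x - y| < \delta$ and $x$ lies in the interval around $c_i$, then both $x$ and $y$ are within $\delta_{c_i}$ of $c_i$, so $|f(x) - f(y)| < \varepsilon$. The ability to extract a finite subcover (compactness) is what converts the pointwise guarantee into a global one.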
Uniform continuity is essential for Riemann integration: the proof that every continuous function on $[a, b]$ is [Riemann integrable](/page/Riemann%20Integral) uses uniform continuity to control the oscillation of $f$ on each subinterval of a partition. Without it, the upper and lower sums might never converge.
## The Regularity Hierarchy
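Continuity is the weakest of a chain of increasingly strong regularity conditions. For functions on an interval, each class is contained in the next:
\begin{align*}
\text{differentiable with bounded derivative} \;\subsetneq\; \text{Lipschitz} \;\subsetneq\; \text{uniformly continuous} \;\subsetneq\; \text{continuous}.
\end{align*}
Differentiability alone does not fit into this chain: a differentiable function need not be Lipschitz, and a Lipschitz function need not be differentiable. Each inclusion is strict for a suitable choice of interval, and examples at each [boundary](/page/Boundary) reveal what the stronger condition adds.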
[definition:Lipschitz Continuous Real Line]
A function $f: E \to \mathbb{R}$ is **Lipschitz continuous** (or $L$-Lipschitz) on $E$ if there exists $L \geq 0$ such that
\begin{align*}
|f(x) - f(y)| \leq L |x - y| \quad \text{for all } x, y \in E.
\end{align*}
The smallest such $L$ is the **Lipschitz constant** $\operatorname{Lip}(f)$.
[/definition]
Every Lipschitz function is uniformly continuous (take $\delta = \varepsilon / L$). Lipschitz continuity says $\delta$ can be chosen *proportional* to $\varepsilon$, which is stronger than merely saying $\delta$ can be chosen independently of the point.
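The recipe $\delta = \varepsilon / L$ can be tried on $f(x) = x^2$ restricted to $[0, M]$, where $|x^2 - y^2| = |x - y| \cdot |x + y| \leq 2M |x - y|$, so $L = 2M$ is a Lipschitz bound. The sketch below is a spot check under stated assumptions: the endpoint `M = 5` is a hypothetical choice, the names `delta_for` and `works_everywhere` are illustrative, and only pairs on a finite grid are tested, using exact rational arithmetic.

```python
from fractions import Fraction

M = 5          # hypothetical right endpoint of the interval [0, M]
L = 2 * M      # Lipschitz bound for x^2 on [0, M]


def delta_for(eps: Fraction) -> Fraction:
    """The uniform delta suggested by the Lipschitz bound: eps / L."""
    return eps / L


def works_everywhere(eps: Fraction, grid: int = 200) -> bool:
    """Test pairs (x, y) with |x - y| < delta at grid points across [0, M]."""
    step = delta_for(eps) * Fraction(99, 100)   # just inside the delta window
    for k in range(grid + 1):
        x = Fraction(M * k, grid)
        y = min(x + step, Fraction(M))
        if abs(x * x - y * y) >= eps:
            return False
    return True


print(works_everywhere(Fraction(1, 10)), works_everywhere(Fraction(1, 1000)))
```

The same $\delta$ is used at every grid point, which is the content of uniformity: the bound $2M$ does not depend on where in $[0, M]$ the pair sits.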
**Boundary examples separating adjacent levels:**
**Continuous but not uniformly continuous:** $f(x) = \sin(1/x)$ on $(0, 1)$ — the oscillation frequency diverges near $0$, so no single $\delta$ controls the output for all pairs of nearby points.
**Uniformly continuous but not Lipschitz:** $f(x) = \sqrt{x}$ on $[0, 1]$ — uniformly continuous by [Heine-Cantor](/theorems/280), but not Lipschitz because $|\sqrt{x} - \sqrt{0}| / |x - 0| = 1/\sqrt{x} \to \infty$ as $x \to 0^+$. The slope is unbounded despite the function being "tame."
**Lipschitz but not [differentiable](/page/Derivative):** $f(x) = |x|$ — Lipschitz with constant $L = 1$ (since $||x| - |y|| \leq |x - y|$), but not differentiable at $0$ because the left and right difference quotients approach $-1$ and $+1$ respectively.
**Differentiable but not Lipschitz (on unbounded domains):** $f(x) = x^2$ on $\mathbb{R}$ — differentiable everywhere, but the derivative $f'(x) = 2x$ is unbounded, so no finite Lipschitz constant exists.
In the differentiable setting, the [Mean Value Theorem](/theorems/186) provides the connection: if $f$ is differentiable on $(a, b)$ with $|f'(x)| \leq L$ for all $x$, then $|f(x) - f(y)| = |f'(c)||x - y| \leq L|x - y|$ for some $c$ between $x$ and $y$. A bounded derivative therefore implies Lipschitz continuity. A partial converse is [Rademacher's theorem](/page/Rademacher's%20Theorem), which states that a Lipschitz function is differentiable almost everywhere.
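The difference quotients at $0$ make the boundary examples for $|x|$ and $\sqrt{x}$ tangible. The sketch below evaluates $|f(x) - f(0)| / |x - 0|$ at points approaching $0$ from the right; the helper name `max_quotient` and the sample points are illustrative assumptions, and a finite sample can only suggest, not prove, boundedness or unboundedness.

```python
import math


def max_quotient(f, points) -> float:
    """Largest difference quotient |f(x) - f(0)| / |x - 0| over the given points."""
    return max(abs(f(x) - f(0.0)) / abs(x) for x in points)


points = [10.0 ** (-k) for k in range(1, 9)]   # 0.1, 0.01, ..., 1e-8

print(max_quotient(abs, points))         # stays at 1 for |x|
print(max_quotient(math.sqrt, points))   # grows like 1/sqrt(x) as x -> 0+
```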
## Continuity Does Not Imply Differentiability
The most dramatic example separating continuity from differentiability is the Weierstrass function, which is continuous everywhere but differentiable *nowhere*.
[example:Weierstrass Nowhere-Differentiable Function]
Weierstrass (1872) constructed the function
\begin{align*}
W(x) = \sum_{n=0}^{\infty} a^n \cos(b^n \pi x),
\end{align*}
with $0 < a < 1$, $b$ an odd integer, and $ab > 1 + \frac{3\pi}{2}$. The [series](/page/Series) [converges uniformly](/page/Uniform%20Convergence) by the Weierstrass $M$-test (with $M_n = a^n$), so $W$ is continuous. But the high-frequency oscillations at every scale — amplified by the factor $b^n$ — prevent any tangent line from forming.
The key is the interplay between two competing effects: the amplitude $a^n$ decays geometrically (making the function continuous), while the frequency $b^n$ grows geometrically (making the slope of the difference quotient oscillate without limit). The condition $ab > 1 + \frac{3\pi}{2}$ ensures that the frequency growth dominates the amplitude decay (Hardy later showed that $ab \geq 1$ already suffices), so the difference quotients $[W(x + h) - W(x)]/h$ oscillate with increasing amplitude as $h \to 0$, never settling to a limit.
Before Weierstrass, mathematicians believed that continuous functions were differentiable "except at isolated points." This example shattered that intuition and revealed that continuity is fundamentally weaker than differentiability — not by a small margin, but by an enormous gulf. In fact, "most" continuous functions (in the sense of Baire category) are nowhere differentiable.
[/example]
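Both competing effects can be observed numerically on a partial sum. The sketch below is an illustration under stated assumptions, not a proof of nowhere-differentiability: it uses the illustrative parameters $a = 1/2$, $b = 13$ (so $ab = 6.5 > 1 + 3\pi/2$), truncates the series after the term $n = 12$, and looks only at the point $x = 0$ with $h = b^{-m}$. For that choice every term with $n \geq m$ has $\cos(b^{n-m}\pi) = -1$ because $b$ is odd, so it contributes $-2a^n$ to $W(h) - W(0)$ and the quotient grows at least like $(ab)^m$. The names `w_partial` and `quotient_at_zero` are hypothetical.

```python
import math

A, B = 0.5, 13          # illustrative parameters: A * B = 6.5 > 1 + 3*pi/2
N_TERMS = 12            # index of the last term kept in the partial sum


def w_partial(x: float, n_terms: int = N_TERMS) -> float:
    """Partial sum of the Weierstrass series over n = 0, ..., n_terms."""
    return sum(A**n * math.cos(B**n * math.pi * x) for n in range(n_terms + 1))


def quotient_at_zero(m: int) -> float:
    """Difference quotient (W(h) - W(0)) / h of the partial sum, with h = B**(-m)."""
    h = float(B) ** (-m)
    return (w_partial(h) - w_partial(0.0)) / h


# M-test tail bound: the neglected terms sum to at most A**(N+1) / (1 - A).
tail_bound = A ** (N_TERMS + 1) / (1 - A)
print(tail_bound)

for m in range(1, 6):
    print(m, quotient_at_zero(m))
```

The small tail bound reflects uniform convergence (hence continuity), while the rapidly growing quotients reflect the frequency growth winning over the amplitude decay.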