Continuity tells us that a function does not jump, but it says nothing about how quickly the values are allowed to change as the input scale shrinks. That missing rate matters whenever a limiting process has to survive differentiation, integration, approximation, or compactness: a [continuous function](/page/Continuous%20Function) can oscillate more and more violently at small scales while still obeying the epsilon-delta definition, so continuity alone is often too coarse for analysis.
The first useful refinement is not differentiability. Differentiability asks for a linear approximation, which is too rigid for many boundary-value problems, singular integrals, and variational limits. Hölder continuity sits between continuity and Lipschitz continuity: it asks for a power-law control
\begin{align*}
|f(x)-f(y)| \le C|x-y|^\gamma
\end{align*}
with an exponent $0 < \gamma \le 1$ and a constant $C\ge0$ that does not depend on $x$ or $y$. The exponent measures the roughness: larger $\gamma$ means stronger control at small scales, while smaller $\gamma$ permits sharper cusps.
[example: A Continuous Function with a Precise Cusp Rate]
Let $f:[-1,1]\to\mathbb R$ be given by
\begin{align*}
f(x)=\sqrt{|x|}.
\end{align*}
We show first that $f$ has the square-root modulus $r\mapsto r^{1/2}$. For $a,b\ge0$ with $a\ge b$, set $d=a-b$. If $d=0$, then $|\sqrt a-\sqrt b|=0$. If $d>0$, then
\begin{align*}
\sqrt a-\sqrt b=\frac{a-b}{\sqrt a+\sqrt b}=\frac{d}{\sqrt a+\sqrt b}.
\end{align*}
Since $d=a-b\le a$, we have $\sqrt d\le \sqrt a\le \sqrt a+\sqrt b$, and therefore
\begin{align*}
\frac{d}{\sqrt a+\sqrt b}\le \frac{d}{\sqrt d}=\sqrt d.
\end{align*}
Thus, by symmetry in $a$ and $b$, $|\sqrt a-\sqrt b|\le \sqrt{|a-b|}$ for all $a,b\ge0$. Applying this with $a=|x|$ and $b=|y|$ gives
\begin{align*}
|\sqrt{|x|}-\sqrt{|y|}|\le \sqrt{||x|-|y||}.
\end{align*}
Also, the triangle inequality gives $|x|\le |x-y|+|y|$, so $|x|-|y|\le |x-y|$; swapping $x$ and $y$ gives $|y|-|x|\le |x-y|$. Hence
\begin{align*}
||x|-|y||\le |x-y|.
\end{align*}
Combining the two estimates,
\begin{align*}
|\sqrt{|x|}-\sqrt{|y|}|\le |x-y|^{1/2}
\end{align*}
for all $x,y\in[-1,1]$. In particular, $f$ is Hölder continuous with exponent $1/2$, hence continuous.
The same function is not Lipschitz on $[-1,1]$. For $0<x\le1$,
\begin{align*}
\frac{|f(x)-f(0)|}{|x-0|}=\frac{|\sqrt x-0|}{x}=\frac{\sqrt x}{x}=x^{-1/2}.
\end{align*}
If a Lipschitz constant $L$ existed, then this quotient would satisfy $x^{-1/2}\le L$ for every $0<x\le1$. Taking $x=1$ forces $L\ge1$, so $1/L^2\le1$, and then any $0<x<1/L^2$ lies in the domain and gives $x^{-1/2}>L$, a contradiction. Thus $f$ is continuous with a precise square-root rate, but its cusp at $0$ is too sharp for any linear distance bound.
[/example]
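The two estimates in this example can be checked numerically. The following Python sketch is illustrative only: the grid of 401 sample points and the helper name `max_quotient` are assumptions made here, and a finite sample can support the proof but not replace it.

```python
import math


def f(x: float) -> float:
    """The cusp f(x) = sqrt(|x|) from the example."""
    return math.sqrt(abs(x))


def max_quotient(points, gamma):
    """Largest value of |f(x)-f(y)| / |x-y|**gamma over distinct sample points."""
    best = 0.0
    for i, x in enumerate(points):
        for y in points[:i]:
            best = max(best, abs(f(x) - f(y)) / abs(x - y) ** gamma)
    return best


# Hypothetical sample: 401 equally spaced points in [-1, 1], including 0.
grid = [k / 200 for k in range(-200, 201)]

# Exponent 1/2: the quotient should not exceed 1, up to rounding.
half_quotient = max_quotient(grid, 0.5)

# Exponent 1 at the origin: the quotient f(x)/x equals x**(-1/2) and grows.
lipschitz_quotients = [f(10.0 ** -k) / 10.0 ** -k for k in range(1, 7)]
```

On this grid the exponent-$1/2$ quotient is $1$ up to rounding, attained at pairs of the form $(x,0)$, while the difference quotients at $0$ grow like $x^{-1/2}$.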
This example is the basic reason Hölder continuity deserves its own page rather than being treated as a minor variant of continuity. The square-root cusp is not a pathology to discard; it is a stable regularity class that appears naturally in elliptic equations, harmonic functions near rough boundaries, and compactness arguments.
## Definition
The parent notion is [continuity](/page/Continuity): small changes in input force small changes in output. Hölder continuity strengthens this by prescribing a specific power of the distance that controls the output change. The definition is usually made on metric spaces, because the only structure it needs is a notion of distance in the domain and codomain.
[definition: Hölder Continuity]
Let $(X,d_X)$ and $(Y,d_Y)$ be metric spaces, let $0<\gamma\le 1$, and let $f:X\to Y$ be a function. The function $f$ is Hölder continuous with exponent $\gamma$ if there exists a constant $C\ge 0$ such that
\begin{align*}
d_Y(f(x),f(y)) \le C d_X(x,y)^\gamma
\end{align*}
for all $x,y\in X$.
[/definition]
The parameter $\gamma$ is part of the assertion. Saying that a function is Hölder continuous without naming the exponent usually means that there exists at least one exponent $\gamma\in(0,1]$ for which the condition holds. In Euclidean spaces, the definition becomes the familiar estimate $|f(x)-f(y)|\le C|x-y|^\gamma$.
The rest of the page develops the machinery that makes this estimate usable: moduli that compare different rates of continuity, seminorms that measure the best constant, local variants for interior regularity, and Hölder spaces that support compactness and PDE estimates.
## From Continuity to Quantitative Continuity
### Global rate estimates
Continuity at a point allows the permissible input radius to depend on a tolerance in an unspecified way. Hölder continuity replaces that dependence by an explicit formula. This gives estimates that can be transported through inequalities, compactness arguments, and approximation schemes.
Before using Hölder continuity as a regularity assumption, we need to check that it really is a strengthening of continuity. The power-law bound supplies a uniform modulus of continuity on the whole domain.
[quotetheorem:8226]
This theorem explains why Hölder continuity belongs under the continuity family in the definition graph. It is not a different kind of mapping condition; it is a quantified strengthening of the same idea. The next question is whether the exact exponent is rigid, or whether controlling a sharper power also controls rougher powers. On bounded sets, the diameter provides the missing scale conversion.
[quotetheorem:8227]
The boundedness hypothesis is not cosmetic. When distances can be arbitrarily large, the inequality $|x-y|^\gamma\le C|x-y|^\alpha$ fails for large $|x-y|$ if $\alpha<\gamma$. Local estimates avoid this issue by working on compact subsets.
[example: Why Boundedness Matters for Exponent Comparison]
Let $f:\mathbb R\to\mathbb R$ be given by $f(x)=x$. For any $x,y\in\mathbb R$,
\begin{align*}
|f(x)-f(y)|=|x-y|.
\end{align*}
Thus $f$ is Lipschitz with constant $1$, and therefore satisfies the Hölder estimate with exponent $1$.
Now fix $0<\alpha<1$. We show that no constant can make $f$ Hölder continuous with exponent $\alpha$ on all of $\mathbb R$. If such a constant $C\ge0$ existed, then for every $x>0$, applying the estimate to the pair $(x,0)$ would give
\begin{align*}
|f(x)-f(0)|\le C|x-0|^\alpha.
\end{align*}
Since $f(x)=x$ and $f(0)=0$, this becomes
\begin{align*}
x\le Cx^\alpha.
\end{align*}
Because $x>0$, dividing both sides by $x^\alpha$ gives
\begin{align*}
x^{1-\alpha}\le C.
\end{align*}
This inequality would have to hold for all $x>0$. If $C=0$, taking $x=1$ gives $1\le0$, impossible. If $C>0$, choose $x>C^{1/(1-\alpha)}$; then raising both sides to the positive power $1-\alpha$ gives
\begin{align*}
x^{1-\alpha}>C.
\end{align*}
This contradicts $x^{1-\alpha}\le C$. Hence $f$ is Lipschitz on the unbounded domain $\mathbb R$, but it is not globally Hölder continuous there with any exponent $\alpha<1$; the boundedness assumption in exponent comparison supplies the missing large-scale control.
[/example]
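The contradiction in this example is constructive: every proposed constant $C>0$ comes with an explicit point that defeats it. A minimal Python sketch of that choice follows; the function names `identity_quotient` and `witness` and the particular test values are assumptions made for illustration.

```python
def identity_quotient(x: float, alpha: float) -> float:
    """Quotient |f(x) - f(0)| / |x - 0|**alpha for f(x) = x and x > 0."""
    return abs(x) / abs(x) ** alpha


def witness(C: float, alpha: float) -> float:
    """A point x > C**(1/(1-alpha)) that defeats a proposed constant C > 0."""
    return 2.0 * C ** (1.0 / (1.0 - alpha))


# For each proposed constant C and exponent alpha < 1, record the quotient
# at the witness point; it equals 2**(1-alpha) * C, which exceeds C.
results = [
    (C, alpha, identity_quotient(witness(C, alpha), alpha))
    for C in (0.5, 1.0, 10.0, 1000.0)
    for alpha in (0.1, 0.5, 0.9)
]
```

The witness points grow rapidly as $\alpha$ approaches $1$, which reflects that the failure happens only at large scales.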
### Local rates and moduli
The unbounded-line example shows that a single global Hölder constant can fail for reasons having nothing to do with small-scale regularity. In analysis we often care about what happens near each point before imposing behaviour at infinity or at the boundary. A solution of a differential equation may have controlled oscillation on every compact subset of the domain, while the constants deteriorate near the boundary or at infinity. To capture that interior notion precisely, the next definition is needed: it lets the Hölder constant depend on the compact region being tested.
[definition: Local Hölder Continuity]
Let $U\subset\mathbb R^n$ be open, let $m\in\mathbb N$, let $0<\gamma\le 1$, and let $f:U\to \mathbb R^m$ be a function. The function $f$ is locally Hölder continuous with exponent $\gamma$ if for every compact set $K\subset U$ there exists a constant $C_K\ge 0$ such that
\begin{align*}
|f(x)-f(y)|\le C_K |x-y|^\gamma
\end{align*}
for all $x,y\in K$.
[/definition]
The distinction between global and local control is not a technicality. Global Hölder continuity asks for one constant on the whole domain; local Hölder continuity allows the constant to depend on the compact set. That difference is exactly what separates [boundary regularity](/theorems/99) from interior regularity.
This leaves a broader question: if Hölder estimates are only one kind of rate estimate, what is the general object that records a function's allowed oscillation at scale $r$? Naming that object separates qualitative continuity from the special power-law rates used in Hölder theory.
[definition: Modulus of Continuity]
Let $(X,d_X)$ and $(Y,d_Y)$ be metric spaces, and let $f:X\to Y$ be a function. A modulus of continuity for $f$ is a function $\omega:[0,\infty)\to[0,\infty)$ such that $\omega(r)\to 0$ as $r\to 0$ and
\begin{align*}
d_Y(f(x),f(y))\le \omega(d_X(x,y))
\end{align*}
for all $x,y\in X$.
[/definition]
With this language, Hölder continuity says that $\omega(r)=Cr^\gamma$ is an admissible modulus. This is restrictive enough to support quantitative estimates but flexible enough to include nondifferentiable examples.
This page uses a minimal convention for moduli of continuity: only the decay condition $\omega(r)\to0$ and the oscillation bound are required. Many texts additionally require $\omega$ to be nondecreasing, and some impose subadditivity. Those stronger conventions are compatible with the examples here, but they are not needed for the basic comparison with Hölder rates.
## Examples and Sharp Exponents
### Power-law calibration
The exponent in a Hölder estimate is often the main information. A function may satisfy the condition for many small exponents but fail at the natural endpoint. Examples with powers and logarithms show how to read this information from the behaviour near a singular point.
Power functions are the standard calibration. They show that the exponent in the definition corresponds to the visible order of a cusp.
[example: Power Cusps on an Interval]
Let $0<\beta\le 1$ and define $f:[-1,1]\to\mathbb R$ by
\begin{align*}
f(x)=|x|^\beta.
\end{align*}
We show that $f$ is Hölder continuous with every exponent $0<\gamma\le\beta$, and that no exponent larger than $\beta$ is possible at the cusp.
First let $a,b\ge0$. By symmetry assume $a\ge b$, and write $d=a-b\ge0$. If $d=0$, then $a^\beta-b^\beta=0$. If $d>0$, set $t=b/d$. Then $a=d(t+1)$ and $b=dt$, so
\begin{align*}
a^\beta-b^\beta=d^\beta\big((t+1)^\beta-t^\beta\big).
\end{align*}
For $t=0$, the factor in parentheses is $1$. For $t>0$, the function $\phi(t)=(t+1)^\beta-t^\beta$ satisfies
\begin{align*}
\phi'(t)=\beta\big((t+1)^{\beta-1}-t^{\beta-1}\big)\le0,
\end{align*}
because $\beta-1\le0$ and $t+1\ge t>0$. Hence $\phi(t)\le\phi(0)=1$, and therefore
\begin{align*}
|a^\beta-b^\beta|\le |a-b|^\beta.
\end{align*}
Applying this with $a=|x|$ and $b=|y|$ gives
\begin{align*}
\big||x|^\beta-|y|^\beta\big|\le \big||x|-|y|\big|^\beta.
\end{align*}
The triangle inequality gives $\big||x|-|y|\big|\le |x-y|$, so
\begin{align*}
\big||x|^\beta-|y|^\beta\big|\le |x-y|^\beta.
\end{align*}
Now $x,y\in[-1,1]$ implies $0\le |x-y|\le2$. If $0<\gamma\le\beta$ and $|x-y|>0$, then
\begin{align*}
|x-y|^\beta=|x-y|^\gamma |x-y|^{\beta-\gamma}\le 2^{\beta-\gamma}|x-y|^\gamma.
\end{align*}
The same inequality is immediate when $x=y$. Thus
\begin{align*}
|f(x)-f(y)|\le 2^{\beta-\gamma}|x-y|^\gamma
\end{align*}
for all $x,y\in[-1,1]$, so $f$ is Hölder continuous with every exponent $\gamma\le\beta$.
Now fix $\gamma>\beta$. If a Hölder constant $C\ge0$ existed for exponent $\gamma$, then applying the estimate to $x\in(0,1]$ and $0$ would give
\begin{align*}
x^\beta=|f(x)-f(0)|\le C|x-0|^\gamma=Cx^\gamma.
\end{align*}
Dividing by $x^\gamma>0$ gives
\begin{align*}
x^{\beta-\gamma}\le C.
\end{align*}
If $C=0$, taking $x=1$ gives $1\le0$, impossible. If $C>0$, choose $0<x<\min\{1,C^{-1/(\gamma-\beta)}\}$. Then
\begin{align*}
x^{\beta-\gamma}=\frac{1}{x^{\gamma-\beta}}>C,
\end{align*}
contradicting the required bound. Hence no Hölder exponent $\gamma>\beta$ is possible, and the sharp exponent of the cusp is exactly $\beta$.
[/example]
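The sharp exponent of a cusp can also be read off numerically as the slope of $\log|f(x_0+r)-f(x_0)|$ against $\log r$. The sketch below is a heuristic under stated assumptions: the helper `estimate_exponent`, the dyadic scales, and the least-squares fit are choices made here, and decay measured at a single point only suggests an upper bound for the Hölder exponent near that point.

```python
import math


def estimate_exponent(f, x0, scales):
    """Least-squares slope of log|f(x0 + r) - f(x0)| against log r.

    Assumes f(x0 + r) != f(x0) for every r in scales, so the logarithm is defined.
    """
    xs = [math.log(r) for r in scales]
    ys = [math.log(abs(f(x0 + r) - f(x0))) for r in scales]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator


# Hypothetical dyadic scales r = 2**-5, ..., 2**-24.
scales = [2.0 ** -k for k in range(5, 25)]

# For the power cusp |x|**beta the fitted slope should recover beta.
slope = estimate_exponent(lambda x: abs(x) ** 0.3, 0.0, scales)
```

For a pure power cusp the fit recovers $\beta$ up to rounding; for less regular data the slope depends on the range of scales chosen.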
This calculation is the model for identifying sharp regularity. The upper estimate gives membership in a Hölder class, while the quotient along a carefully chosen pair of points rules out higher exponents.
### Slow moduli
Not every uniformly continuous function has a positive Hölder exponent. Some functions approach continuity too slowly to be bounded by any power near the origin.
[example: Uniform Continuity Without Any Positive Hölder Exponent]
Define $f:[0,e^{-1}]\to\mathbb R$ by
\begin{align*}
f(0)=0,\qquad f(x)=\frac{1}{|\log x|}\quad \text{for }0<x\le e^{-1}.
\end{align*}
For $0<x\le e^{-1}$ we have $\log x\le -1$, so $|\log x|=-\log x$ and $f(x)=1/(-\log x)$. As $x\to0^+$, the quantity $-\log x$ tends to $\infty$, hence $1/(-\log x)\to0=f(0)$. On $(0,e^{-1}]$, the function is a composition of the continuous functions $x\mapsto \log x$, $t\mapsto -t$, and $s\mapsto 1/s$ on $[1,\infty)$, so $f$ is continuous there as well. Thus $f$ is continuous on the compact interval $[0,e^{-1}]$, and therefore uniformly continuous.
Now fix $\gamma>0$. For $0<x\le e^{-1}$,
\begin{align*}
\frac{|f(x)-f(0)|}{|x-0|^\gamma}=\frac{\left|\frac{1}{|\log x|}-0\right|}{x^\gamma}.
\end{align*}
Since $|\log x|=-\log x$ on this interval, this becomes
\begin{align*}
\frac{|f(x)-f(0)|}{|x-0|^\gamma}=\frac{1}{x^\gamma(-\log x)}.
\end{align*}
Put $t=-\log x$. Then $x=e^{-t}$, and $x\to0^+$ is equivalent to $t\to\infty$. Substituting gives
\begin{align*}
\frac{1}{x^\gamma(-\log x)}=\frac{1}{e^{-\gamma t}t}=\frac{e^{\gamma t}}{t}.
\end{align*}
Because $\gamma>0$, choose $T=2/\gamma$. For $t\ge T$, we have $\gamma t/2\ge1$, and the elementary inequality $e^s\ge s$ for $s\ge0$ gives
\begin{align*}
e^{\gamma t/2}\ge \frac{\gamma t}{2}.
\end{align*}
Therefore
\begin{align*}
\frac{e^{\gamma t}}{t}=\frac{e^{\gamma t/2}e^{\gamma t/2}}{t}\ge \frac{\gamma}{2}e^{\gamma t/2}.
\end{align*}
The right-hand side tends to $\infty$ as $t\to\infty$, so
\begin{align*}
\frac{|f(x)-f(0)|}{|x-0|^\gamma}\to\infty
\end{align*}
as $x\to0^+$. If $f$ were Hölder continuous with exponent $\gamma$, there would be a constant $C\ge0$ such that $|f(x)-f(0)|\le C|x-0|^\gamma$ for every $x\in(0,e^{-1}]$, which would force the displayed quotient to be at most $C$ for every such $x$. The divergence above contradicts this. Hence $f$ is uniformly continuous but is not Hölder continuous with any positive exponent.
[/example]
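The divergence in this example is slow for small exponents, so it helps to evaluate the quotient in the variable $t=-\log x$, where it equals $e^{\gamma t}/t$. The following Python sketch does this; the function names and the sampled values of $\gamma$ and $t$ are assumptions made for illustration.

```python
import math


def quotient_at(t: float, gamma: float) -> float:
    """Quotient |f(x) - f(0)| / x**gamma at x = exp(-t), t >= 1, as exp(gamma*t)/t."""
    return math.exp(gamma * t) / t


def lower_bound(t: float, gamma: float) -> float:
    """The bound (gamma/2) * exp(gamma*t/2) derived in the example."""
    return 0.5 * gamma * math.exp(0.5 * gamma * t)


def direct_quotient(x: float, gamma: float) -> float:
    """The same quotient computed from f(x) = 1/|log x| for 0 < x <= 1/e."""
    return (1.0 / abs(math.log(x))) / x ** gamma


# Sample the quotient at t = 2/gamma, 10/gamma, 100/gamma for several exponents.
rows = []
for gamma in (1.0, 0.1, 0.01):
    for scale in (2.0, 10.0, 100.0):
        t = scale / gamma
        rows.append((gamma, t, quotient_at(t, gamma), lower_bound(t, gamma)))
```

Working in $t$ avoids floating-point underflow of $x=e^{-t}$: for $\gamma=0.01$ the quotient only becomes large once $t$ is in the thousands, where $x$ itself is far too small to represent.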
This failure separates Hölder continuity from [uniform continuity](/page/Uniform%20Continuity). A modulus may tend to zero, yet do so more slowly than every power. At the other end, differentiability can create the strongest Hölder estimate when derivatives are bounded, so the next result connects the power-law scale back to ordinary calculus.
The endpoint exponent $1$ deserves its own name because it is the rate produced by bounded derivatives and because it remains unchanged under composition with another endpoint estimate. Naming the endpoint also prevents a common confusion: a linear distance bound is not merely a convenient Hölder estimate, but a regularity class with its own calculus rules and stability properties. The following definition isolates that endpoint so later comparisons can distinguish genuinely fractional behaviour from linear control.
[definition: Lipschitz Continuity]
Let $(X,d_X)$ and $(Y,d_Y)$ be metric spaces, and let $f:X\to Y$ be a function. The function $f$ is Lipschitz continuous if there exists a constant $L\ge 0$ such that
\begin{align*}
d_Y(f(x),f(y))\le L d_X(x,y)
\end{align*}
for all $x,y\in X$.
[/definition]
Thus Lipschitz continuity is Hölder continuity with exponent $1$. The language of Hölder exponents lets us discuss this endpoint together with rougher power laws, but it is useful to remember that the endpoint has special stability properties.
[remark: Endpoint Convention]
Some authors reserve the phrase Hölder continuous for exponents $0<\gamma<1$ and treat the case $\gamma=1$ separately under the name Lipschitz continuous. This page includes the endpoint in the Hölder scale while still naming it separately whenever the endpoint behaviour matters.
[/remark]
This endpoint is not only terminology. To use Lipschitz continuity in estimates, we need a bridge from the derivative information supplied by calculus to the distance estimate required by the definition. Convexity of the domain supplies the missing geometry: it lets the mean-value argument run along the straight segment joining two points.
[quotetheorem:8228]
The theorem places Lipschitz continuity at the differentiable end of the Hölder scale. But power cusps show that the Hölder scale also records meaningful regularity when derivatives fail to exist or become unbounded.
## Hölder Seminorms and Hölder Spaces
Analysis rarely uses a regularity property only as a yes-or-no condition. We need spaces that carry norms, convergence, completeness, and compactness. This section records only the basic function-space structure needed to use Hölder continuity; the broader theory of Hölder spaces belongs to [Holder Space](/page/Holder%20Space).
Once a rate has been chosen, the next question is how large the constant must be. The smallest admissible constant is not only a bookkeeping device; it is the seminorm that turns Hölder regularity into a usable function-space norm.
[definition: Hölder Seminorm]
Let $X\subset\mathbb R^n$, let $m\in\mathbb N$, and let $0<\gamma\le1$. The Hölder seminorm of exponent $\gamma$ is the functional $[\cdot]_{C^{0,\gamma}(X)}:\{f:X\to\mathbb R^m\}\to [0,\infty]$ defined by
\begin{align*}
[f]_{C^{0,\gamma}(X)}:=\sup_{\{x,y\in X:x\ne y\}}\frac{|f(x)-f(y)|}{|x-y|^\gamma}.
\end{align*}
[/definition]
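On a finite sample the supremum in this definition becomes a maximum over pairs. The Python sketch below treats only the scalar one-dimensional case; the helper name `sample_holder_seminorm` and the sample grid are assumptions made here, and the computed value is only a lower bound for the true seminorm because the supremum runs over more pairs than the sample contains.

```python
import math
from itertools import combinations


def sample_holder_seminorm(f, points, gamma):
    """Largest quotient |f(x) - f(y)| / |x - y|**gamma over distinct sample pairs.

    The sample points must be pairwise distinct real numbers.
    """
    return max(
        abs(f(x) - f(y)) / abs(x - y) ** gamma
        for x, y in combinations(points, 2)
    )


points = [k / 100 for k in range(-100, 101)]  # hypothetical sample of [-1, 1]

# A constant function has seminorm 0.
constant = sample_holder_seminorm(lambda x: 3.0, points, 0.5)
# The identity has Lipschitz constant 1, that is, seminorm 1 at exponent 1.
identity = sample_holder_seminorm(lambda x: x, points, 1.0)
# The square-root cusp has seminorm 1 at exponent 1/2.
cusp = sample_holder_seminorm(lambda x: math.sqrt(abs(x)), points, 0.5)
```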
The word seminorm is important: constant functions have seminorm $0$. The supremum records the worst scale-normalized oscillation of $f$ over all pairs of distinct points. Hölder continuity with exponent $\gamma$ is exactly the finiteness of this quantity, but a seminorm alone does not control the size of the function. For compactness, completeness, and estimates with inhomogeneous terms, we need a full function space that combines bounded size with controlled oscillation.
The zeroth-order Hölder space packages boundedness and Hölder oscillation into a single norm, which makes it the natural setting for compactness and fixed-point arguments involving continuous functions with controlled roughness.
[definition: $C^{0,\gamma}$ Space]
Let $X\subset\mathbb R^n$, let $m\in\mathbb N$, and let $0<\gamma\le1$. The space $C^{0,\gamma}(X;\mathbb R^m)$ consists of all bounded functions $f:X\to\mathbb R^m$ such that
\begin{align*}
[f]_{C^{0,\gamma}(X)}<\infty.
\end{align*}
It is equipped with the norm functional
\begin{align*}
\|\cdot\|_{C^{0,\gamma}(X)}:C^{0,\gamma}(X;\mathbb R^m)\to[0,\infty)
\end{align*}
defined by
\begin{align*}
\|f\|_{C^{0,\gamma}(X)}:=\|f\|_\infty+[f]_{C^{0,\gamma}(X)},
\end{align*}
where $\|f\|_\infty:=\sup_{x\in X}|f(x)|$.
[/definition]
The space $C^{0,\gamma}$ controls the oscillation of the function itself, but many estimates control derivatives after differentiating an equation or a variational identity. To express that higher regularity, the Hölder condition must be imposed on the top-order derivatives rather than only on the original function.
[definition: $C^{k,\gamma}$ Space]
Let $U\subset\mathbb R^n$ be open, let $m\in\mathbb N$, let $k\in\mathbb N\cup\{0\}$, let $0<\gamma\le1$, and let $f:U\to\mathbb R^m$ be a function. The function $f$ belongs to $C^{k,\gamma}(U;\mathbb R^m)$ if $f\in C^k(U;\mathbb R^m)$,
\begin{align*}
\sup_{x\in U}|D^\alpha f(x)|<\infty
\end{align*}
for every multi-index $\alpha$ with $|\alpha|\le k$, and
\begin{align*}
[D^\alpha f]_{C^{0,\gamma}(U)}<\infty
\end{align*}
for every multi-index $\alpha$ with $|\alpha|=k$.
[/definition]
This is the global open-domain convention: the bounds are taken over all of $U$. Many regularity theorems do not produce such uniform control, especially on unbounded domains or near rough boundaries. Interior estimates instead say that every compactly contained region has its own bound, with constants allowed to worsen as the region approaches the boundary. The local version below is needed to express exactly that compact-by-compact control.
[definition: Local $C^{k,\gamma}$ Space]
Let $U\subset\mathbb R^n$ be open, let $m\in\mathbb N$, let $k\in\mathbb N\cup\{0\}$, let $0<\gamma\le1$, and let $f:U\to\mathbb R^m$ be a function. The function $f$ belongs to $C^{k,\gamma}_{\mathrm{loc}}(U;\mathbb R^m)$ if $f\in C^k(U;\mathbb R^m)$ and for every compact set $K\subset U$,
\begin{align*}
\sum_{|\alpha|\le k}\sup_{x\in K}|D^\alpha f(x)|+\sum_{|\alpha|=k}[D^\alpha f]_{C^{0,\gamma}(K)}<\infty.
\end{align*}
[/definition]
Local regularity is the right language for interior estimates, where constants may worsen as the compact set approaches the boundary. Boundary regularity and elliptic estimates on closed domains usually need a global closed-domain version instead, because the norm must control derivatives on $\overline{U}$ with one constant.
[definition: Global $C^{k,\gamma}$ Space on a Closure]
Let $U\subset\mathbb R^n$ be bounded and open, let $m\in\mathbb N$, let $k\in\mathbb N\cup\{0\}$, let $0<\gamma\le1$, and let $f:U\to\mathbb R^m$ be a function. The function $f$ belongs to $C^{k,\gamma}(\overline{U};\mathbb R^m)$ if each derivative $D^\alpha f$ with $|\alpha|\le k$ extends continuously to $\overline{U}$ and, for each multi-index $\alpha$ with $|\alpha|=k$, the continuous extension of $D^\alpha f$ has finite Hölder seminorm on $\overline{U}$.
[/definition]
This closed-domain convention is central in Schauder theory. To use the space in estimates, the definition must be paired with a norm that records lower derivatives and top-order Hölder oscillation.
[definition: $C^{k,\gamma}$ Norm]
Let $U\subset\mathbb R^n$ be bounded and open, let $m\in\mathbb N$, let $k\in\mathbb N\cup\{0\}$, and let $0<\gamma\le1$. The $C^{k,\gamma}$ norm on $C^{k,\gamma}(\overline{U};\mathbb R^m)$ is the functional $\|\cdot\|_{C^{k,\gamma}(\overline{U})}:C^{k,\gamma}(\overline{U};\mathbb R^m)\to [0,\infty)$ defined by
\begin{align*}
\|f\|_{C^{k,\gamma}(\overline{U})}:=\sum_{|\alpha|\le k}\|D^\alpha f\|_\infty+\sum_{|\alpha|=k}[D^\alpha f]_{C^{0,\gamma}(\overline{U})}.
\end{align*}
[/definition]
The lower-order derivatives are controlled by sup norms, while the top-order derivatives carry the Hölder seminorm. When a compact set $K\subset\mathbb R^n$ appears in a norm such as $C^{0,\gamma}(K)$ or $C^{0,\alpha}(K)$, the norm is the same sum of sup norm and pairwise seminorm, with all points taken in $K$. When a compact set $K\subset U$ appears in a higher-order norm such as $C^{2,\gamma}(K)$, the corresponding derivatives are understood to extend continuously to a neighbourhood of $K$, and the displayed sup norms and seminorms are evaluated on $K$. This convention matches the way differentiating an equation shifts regularity from the function to its derivatives. The norm would be much less useful if Cauchy sequences of approximations could converge outside the class, so completeness is the structural property needed next.
[quotetheorem:8229]
Completeness is what permits Hölder spaces to function as solution spaces. Approximate solutions can be constructed by smoothing, discretization, or compactness, and the Banach-space structure keeps the limit in the same regularity class.
## Stability and Compactness
### Algebra and Composition
A regularity class is useful only if it survives the operations used in analysis. Hölder continuous functions behave well under linear combinations, products, and many compositions, with constants that can be tracked explicitly.
The algebra property is the first stability test. Products appear in nonlinear equations and coefficient-weighted estimates, so we need to know when multiplication preserves Hölder regularity.
[quotetheorem:8230]
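A product estimate of this kind can be tested on a finite sample. The sketch below assumes the estimate has the standard form $[fg]_{C^{0,\gamma}}\le\|f\|_\infty[g]_{C^{0,\gamma}}+\|g\|_\infty[f]_{C^{0,\gamma}}$; that form, the helper names, the sample grid, and the two test functions are all assumptions made here for illustration.

```python
import math
from itertools import combinations


def sup_norm(f, points):
    """Largest value of |f| on the sample."""
    return max(abs(f(x)) for x in points)


def seminorm(f, points, gamma):
    """Largest Hölder quotient of exponent gamma over distinct sample pairs."""
    return max(
        abs(f(x) - f(y)) / abs(x - y) ** gamma
        for x, y in combinations(points, 2)
    )


points = [k / 100 for k in range(101)]  # hypothetical sample of [0, 1]
gamma = 0.5


def f(x):
    return math.sqrt(x)  # Hölder with exponent 1/2, not Lipschitz


def g(x):
    return math.cos(3.0 * x)  # smooth, hence Hölder with exponent 1/2 on [0, 1]


def fg(x):
    return f(x) * g(x)


# Sample versions of the two sides of the assumed product estimate.
lhs = seminorm(fg, points, gamma)
rhs = (sup_norm(f, points) * seminorm(g, points, gamma)
       + sup_norm(g, points) * seminorm(f, points, gamma))
```

The same inequality holds exactly for the sampled quantities, because the pointwise splitting $f(x)g(x)-f(y)g(y)=f(x)(g(x)-g(y))+g(y)(f(x)-f(y))$ uses only values at sample points.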
This estimate is the Hölder analogue of the product rule at order zero. It says that oscillation of a product is controlled by oscillation of each factor, weighted by the size of the other factor. Nonlinear changes of variables require a different stability principle, because the output of one function becomes the input of another.
[quotetheorem:8231]
The multiplication of exponents is a useful warning. Repeated composition can degrade regularity, especially when neither map is Lipschitz. This is one reason the endpoint exponent $1$ is structurally stable.
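The degradation under composition is visible already for the square-root cusp composed with itself, which gives $|x|^{1/4}$. The Python sketch below checks this at the origin; the function names and sample points are assumptions made here, and the calculation illustrates the multiplication of exponents rather than proving the general statement.

```python
import math


def h(x: float) -> float:
    """The square-root cusp, Hölder with exponent 1/2."""
    return math.sqrt(abs(x))


def composed(x: float) -> float:
    """The composition h(h(x)) = |x|**(1/4)."""
    return h(h(x))


def origin_quotient(x: float, gamma: float) -> float:
    """Quotient |composed(x) - composed(0)| / |x|**gamma for x != 0."""
    return abs(composed(x) - composed(0.0)) / abs(x) ** gamma


samples = [10.0 ** -k for k in range(1, 9)]
# Exponent 1/4 = (1/2) * (1/2): the quotient stays equal to 1.
product_exponent = [origin_quotient(x, 0.25) for x in samples]
# Exponent 1/2: the quotient equals x**(-1/4) and grows as x shrinks.
original_exponent = [origin_quotient(x, 0.5) for x in samples]
```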
### Limits and Compactness
Uniform limits preserve continuity, but they do not automatically preserve a Hölder exponent with a finite seminorm. The missing ingredient is a uniform bound on the Hölder seminorms, which is exactly the estimate analysts try to prove before passing to a limit.
[quotetheorem:8232]
This result is a typical compactness passage: estimates survive limits when the constants are controlled before taking the limit. It still gives convergence only in the uniform norm unless more structure is available. For strong compactness in a Hölder norm, the usual move is to relax the exponent.
[quotetheorem:8234]
The strict inequality $\alpha<\gamma$ is essential. Bounded sets in $C^{0,\gamma}$ need not be compact in the same $C^{0,\gamma}$ norm; compactness appears after relaxing the exponent.
[example: Oscillations Prevent Compactness at the Same Exponent]
For $0<\gamma\le1$ and $j\in\mathbb N$, define $f_j:[0,1]\to\mathbb R$ by
\begin{align*}
f_j(x)=j^{-\gamma}\sin(jx).
\end{align*}
We show that the sequence is uniformly bounded in $C^{0,\gamma}([0,1])$, converges uniformly to $0$, but cannot converge to $0$ in the same Hölder norm.
For every $x\in[0,1]$, the bound $|\sin(jx)|\le1$ gives
\begin{align*}
|f_j(x)|=j^{-\gamma}|\sin(jx)|\le j^{-\gamma}\le1.
\end{align*}
Hence $\|f_j\|_\infty\le1$ for every $j$.
Now fix $a,b\in[0,1]$ and put $h=|a-b|$. If $h=0$, then $f_j(a)=f_j(b)$. Assume $h>0$. Since $|\sin u-\sin v|\le2$ for all real $u,v$, we have
\begin{align*}
|f_j(a)-f_j(b)|=j^{-\gamma}|\sin(ja)-\sin(jb)|\le 2j^{-\gamma}.
\end{align*}
Also, the derivative of $\sin$ is $\cos$ and $|\cos t|\le1$ for all $t$, so the one-variable mean value estimate gives
\begin{align*}
|\sin(ja)-\sin(jb)|\le |ja-jb|=jh.
\end{align*}
Therefore
\begin{align*}
|f_j(a)-f_j(b)|\le j^{-\gamma}jh=j^{1-\gamma}h.
\end{align*}
If $jh\le1$, then
\begin{align*}
j^{1-\gamma}h=(jh)^{1-\gamma}h^\gamma\le h^\gamma,
\end{align*}
because $1-\gamma\ge0$ and $0<jh\le1$. If $jh\ge1$, then $h\ge j^{-1}$, so
\begin{align*}
2j^{-\gamma}\le 2h^\gamma.
\end{align*}
Combining the two cases gives
\begin{align*}
|f_j(a)-f_j(b)|\le 2h^\gamma=2|a-b|^\gamma.
\end{align*}
Since this holds for all distinct $a,b\in[0,1]$,
\begin{align*}
[f_j]_{C^{0,\gamma}([0,1])}\le2.
\end{align*}
The same sequence converges uniformly to $0$, since
\begin{align*}
\|f_j\|_\infty=\sup_{x\in[0,1]}j^{-\gamma}|\sin(jx)|\le j^{-\gamma}
\end{align*}
and $j^{-\gamma}\to0$ as $j\to\infty$.
However, the Hölder seminorms do not converge to $0$. For $j\ge2$, set $a_j=\pi/(2j)$ and $b_j=0$; then $a_j,b_j\in[0,1]$. We have
\begin{align*}
f_j(a_j)=j^{-\gamma}\sin(\pi/2)=j^{-\gamma}
\end{align*}
and
\begin{align*}
f_j(b_j)=j^{-\gamma}\sin(0)=0.
\end{align*}
Also,
\begin{align*}
|a_j-b_j|^\gamma=\left(\frac{\pi}{2j}\right)^\gamma=\left(\frac{\pi}{2}\right)^\gamma j^{-\gamma}.
\end{align*}
Therefore
\begin{align*}
\frac{|f_j(a_j)-f_j(b_j)|}{|a_j-b_j|^\gamma}=\frac{j^{-\gamma}}{\left(\frac{\pi}{2}\right)^\gamma j^{-\gamma}}=\left(\frac{2}{\pi}\right)^\gamma.
\end{align*}
Thus
\begin{align*}
[f_j]_{C^{0,\gamma}([0,1])}\ge \left(\frac{2}{\pi}\right)^\gamma
\end{align*}
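for every $j\ge2$. Convergence in the $C^{0,\gamma}$ norm to $0$ would require $[f_j]_{C^{0,\gamma}([0,1])}\to0$, so the sequence converges uniformly but not in the same Hölder norm. The same lower bound holds along every subsequence, and any limit in the $C^{0,\gamma}$ norm would also be a uniform limit and hence equal to $0$; therefore no subsequence converges in $C^{0,\gamma}([0,1])$, even though the sequence is bounded there. This shows why compactness of Hölder-bounded families appears only after weakening the exponent.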
[/example]
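The quotient at the pair $(\pi/(2j),0)$ can be evaluated directly, both with the original exponent and with a smaller one. The Python sketch below is illustrative: the names `pair_quotient`, `gamma`, and `alpha` and the sampled values of $j$ are assumptions made here, and a single pair does not by itself bound the full seminorm of exponent $\alpha$.

```python
import math


def f(j: int, x: float, gamma: float) -> float:
    """The oscillation f_j(x) = j**(-gamma) * sin(j*x)."""
    return j ** (-gamma) * math.sin(j * x)


def pair_quotient(j: int, gamma: float, exponent: float) -> float:
    """Quotient at the pair (pi/(2j), 0), measured with the given exponent."""
    a = math.pi / (2 * j)
    return abs(f(j, a, gamma) - f(j, 0.0, gamma)) / a ** exponent


gamma, alpha = 0.5, 0.25  # hypothetical choice with alpha < gamma
indices = (2, 10, 100, 1000)

# Measured with exponent gamma, the quotient is (2/pi)**gamma for every j.
same_exponent = [pair_quotient(j, gamma, gamma) for j in indices]
# Measured with exponent alpha, it equals (2/pi)**alpha * j**(alpha - gamma).
lower_exponent = [pair_quotient(j, gamma, alpha) for j in indices]
# The sup norm is bounded by j**(-gamma), which tends to 0.
sup_bounds = [j ** (-gamma) for j in indices]
```

With the exponent lowered from $\gamma$ to $\alpha$, the quotient on this pair decays like $j^{\alpha-\gamma}$, which is the mechanism described in the text.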
This example captures the compactness mechanism: high-frequency oscillations can have small amplitude and still retain scale-normalized roughness. Lowering the exponent makes those oscillations disappear in the norm.
## Hölder Regularity in Analysis and PDE
Hölder continuity is not only a refinement of continuity; it is often the exact regularity scale for elliptic and parabolic equations with Hölder continuous coefficients. In these problems, derivatives may exist classically, but their continuity must be measured quantitatively to close estimates.
[Second-order elliptic equations](/page/Second-Order%20Elliptic%20Equations) show the role of Hölder spaces. If the coefficients and forcing term are Hölder continuous, Schauder theory gives estimates that keep second derivatives in the same Hölder scale. The fully general [regularity theorem](/theorems/2750) belongs to elliptic PDE; the local a priori estimate below records the form of the control used here.
For this statement, $\mathcal{L}^n$ denotes [Lebesgue measure](/page/Lebesgue%20Measure) on $\mathbb R^n$, and derivatives on closed balls are understood through continuous extension from a neighbourhood of the ball.
[quotetheorem:4947]
This estimate is deliberately stated as an a priori interior estimate: it controls a classical $C^{2,\gamma}$ solution on a smaller ball by data on a larger ball. General versions replace balls by compact subsets separated from the boundary, with constants depending on the separation and the relevant domain geometry. Separate elliptic regularity theorems explain when a weaker solution is automatically regular enough to enter this estimate. A different route starts with weak derivatives in $L^p$ and asks when integral control alone forces a pointwise representative with a Hölder modulus. The dimension-dependent threshold is measured by Morrey's inequality.
[quotetheorem:8233]
This theorem connects Hölder continuity to [Sobolev Space](/page/Sobolev%20Space) theory. Integrability of weak derivatives becomes pointwise continuity once the exponent crosses the dimension threshold $p>n$.
The same theme appears in harmonic analysis and potential theory: estimates first control averages or derivatives weakly, then convert that control into pointwise oscillation bounds. Hölder continuity is the language in which those oscillation bounds are recorded.
[example: Morrey Exponent from a Radial Model]
Let $n\ge2$, choose $p>n$, and set $\gamma=1-n/p$. Fix any $\alpha>\gamma$. Since $\gamma<\min\{\alpha,1\}$, choose $\beta$ such that $\gamma<\beta<\min\{\alpha,1\}$, and define $u:B(0,1)\to\mathbb R$ by $u(x)=|x|^\beta$.
For $x\ne0$, differentiating $|x|^\beta=(x_1^2+\cdots+x_n^2)^{\beta/2}$ gives
\begin{align*}
\partial_{x_i}u(x)=\frac{\beta}{2}(x_1^2+\cdots+x_n^2)^{\beta/2-1}2x_i=\beta |x|^{\beta-2}x_i.
\end{align*}
Hence
\begin{align*}
|\nabla u(x)|^2=\sum_{i=1}^n \beta^2 |x|^{2\beta-4}x_i^2=\beta^2 |x|^{2\beta-4}|x|^2=\beta^2 |x|^{2\beta-2}.
\end{align*}
Since $\beta>0$, this gives $|\nabla u(x)|=\beta |x|^{\beta-1}$ for $x\ne0$. The standard weak-derivative formula for radial powers therefore gives the same expression for the weak gradient on $B(0,1)$.
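The function itself lies in $L^p(B(0,1))$, because $0\le |u(x)|^p=|x|^{\beta p}\le1$ on $B(0,1)$ and $\mathcal L^n(B(0,1))<\infty$. For the gradient, write $\omega_{n-1}$ for the surface area of the unit sphere $S^{n-1}\subset\mathbb R^n$ (not a modulus of continuity). The polar-coordinate formula gives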
\begin{align*}
\int_{B(0,1)}|\nabla u(x)|^p\,d\mathcal L^n(x)=\omega_{n-1}\beta^p\int_0^1 r^{p(\beta-1)}r^{n-1}\,dr.
\end{align*}
Thus
\begin{align*}
\int_{B(0,1)}|\nabla u(x)|^p\,d\mathcal L^n(x)=\omega_{n-1}\beta^p\int_0^1 r^{p(\beta-1)+n-1}\,dr.
\end{align*}
Put $q=p(\beta-1)+n-1$. The integral $\int_0^1 r^q\,dr$ is finite exactly when $q>-1$: if $q>-1$, it equals $1/(q+1)$; if $q=-1$, it is $\int_0^1 r^{-1}\,dr=\infty$; and if $q<-1$, the improper integral diverges at the lower limit $r=0$. Therefore the gradient belongs to $L^p(B(0,1))$ exactly when
\begin{align*}
p(\beta-1)+n-1>-1.
\end{align*}
This inequality is equivalent to
\begin{align*}
p\beta-p+n>0.
\end{align*}
Dividing by $p>0$ gives
\begin{align*}
\beta>1-\frac np=\gamma.
\end{align*}
Our choice of $\beta$ satisfies this, so $u\in W^{1,p}(B(0,1))$.
Now test Hölder continuity of exponent $\alpha$ at the origin. For $0<|x|<1$,
\begin{align*}
\frac{|u(x)-u(0)|}{|x-0|^\alpha}=\frac{||x|^\beta-0|}{|x|^\alpha}=|x|^{\beta-\alpha}.
\end{align*}
Because $\beta<\alpha$, we have $\beta-\alpha<0$, so
\begin{align*}
|x|^{\beta-\alpha}=\frac{1}{|x|^{\alpha-\beta}}\to\infty
\end{align*}
as $x\to0$. If $u$ were Hölder continuous with exponent $\alpha$, this quotient would be bounded above by one Hölder constant for all $x\ne0$, which is impossible. Thus this radial Sobolev function has exactly the kind of cusp that blocks any uniform improvement from Morrey's exponent $1-n/p$ to a larger exponent $\alpha$.
[/example]
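The integrability threshold in this example reduces to a one-line test on the radial exponent. The Python sketch below encodes it; the function names and the sample values $n=2$, $p=4$ are assumptions made here, and the comparison at the exact threshold is reliable only because these particular numbers are exactly representable in floating point.

```python
import math


def morrey_exponent(n: int, p: float) -> float:
    """The exponent gamma = 1 - n/p, assuming p > n."""
    return 1.0 - n / p


def radial_gradient_integral(n: int, p: float, beta: float) -> float:
    """Value of the integral of r**(p*(beta-1) + n - 1) over (0, 1).

    The factor omega_{n-1} * beta**p from the polar-coordinate formula is
    omitted, since it does not affect finiteness.
    """
    q = p * (beta - 1.0) + n - 1.0
    return 1.0 / (q + 1.0) if q > -1.0 else math.inf


n, p = 2, 4.0                                          # hypothetical choice with p > n
gamma = morrey_exponent(n, p)                          # 1 - 2/4 = 0.5
above = radial_gradient_integral(n, p, 0.6)            # beta > gamma: finite
at_threshold = radial_gradient_integral(n, p, gamma)   # beta = gamma: infinite
below = radial_gradient_integral(n, p, 0.4)            # beta < gamma: infinite
```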
The radial calculation is only a model, but it gives the correct scale. Dimension competes with integrability: higher dimension gives more room for singular behaviour, while larger $p$ suppresses it.
## Beyond and Connected Topics
Hölder continuity is the first stop after [continuity](/page/Continuity) when rates of convergence matter. The next natural direction is [Holder Space](/page/Holder%20Space), where the seminorms and norms introduced here become the full functional-analytic framework for elliptic estimates, interpolation, and compactness.
A second direction is [Sobolev Space](/page/Sobolev%20Space) theory. Sobolev spaces measure weak differentiability in integral norms, while Hölder spaces measure pointwise oscillation. Morrey-type embeddings explain when one kind of control implies the other, and this bridge is central in regularity theory.
A third direction is elliptic PDE, especially Schauder estimates for second-order equations. Hölder regularity of coefficients is often the threshold that allows classical second derivatives of solutions to inherit Hölder continuity. This connects the present page to the study of maximum principles, Green functions, and boundary regularity.
A fourth direction is compactness in analysis. The Arzelà-Ascoli theorem and compact embeddings of Hölder spaces explain why uniform Hölder estimates are stronger than uniform boundedness and equicontinuity alone. These ideas appear throughout functional analysis, [calculus of variations](/page/Calculus%20of%20Variations), and the analysis of nonlinear equations.
Two refinements become important in more advanced regularity theory. Little Hölder spaces are the closures of smooth functions in Hölder norms and are useful when approximation by smooth functions is part of the structure. Parabolic Hölder spaces change the distance scaling so that time has weight two relative to space, matching heat-type equations where one time derivative corresponds to two spatial derivatives.
## References
Androma, [Continuity](/page/Continuity).
Androma, [Holder Space](/page/Holder%20Space).
Androma, [Sobolev Space](/page/Sobolev%20Space).
Androma, [Cambridge II Analysis of Functions](/page/Cambridge%20II%20Analysis%20of%20Functions).
Androma, [Cambridge IB Analysis and Topology](/page/Cambridge%20IB%20Analysis%20and%20Topology).
Androma, [Cambridge II Linear Analysis](/page/Cambridge%20II%20Linear%20Analysis).
Androma, [Cambridge III Functional Analysis](/page/Cambridge%20III%20Functional%20Analysis).
Evans, *Partial Differential Equations* (2010).
Gilbarg and Trudinger, *Elliptic Partial Differential Equations of Second Order* (2001).
Adams and Fournier, *Sobolev Spaces* (2003).
Hölder Continuity
Also known as: Holder continuity, Hölder condition, Holder condition, Hölder regularity, Hölder continuous functions