Distances are useful only when they survive being passed through functions. Continuity says that nearby inputs have nearby outputs, but it does not say how fast errors can grow. A Lipschitz function is the version of continuity that comes with a linear error budget: if the input changes by at most $\delta$, the output changes by at most $L\delta$. That single constant is what makes estimates portable across fixed point arguments, differential equations, [approximation theory](/page/Approximation%20Theory), metric geometry, and [numerical analysis](/page/Numerical%20Analysis).
The need for such a linear budget appears as soon as continuity is used quantitatively. A [continuous function](/page/Continuous%20Function) may preserve closeness at every point while still allowing tiny input errors to be amplified at a rate that depends on where the error occurs. Lipschitz control rules out that moving target. It replaces many local choices of $\delta$ by one global slope bound, so the same estimate can be reused throughout a computation.
[example: A Continuous Function with No Linear Error Budget]
Consider $f:[0,1]\to \mathbb{R}$ defined by $f(x)=\sqrt{x}$. This function is continuous on $[0,1]$, but it has no global linear error budget. To see this, suppose for contradiction that there is a real number $L\ge 0$ such that
\begin{align*}
|\sqrt{x}-\sqrt{y}|\le L|x-y|
\end{align*}
for every $x,y\in[0,1]$.
For any $t$ with $0<t\le 1$, the points $t^2$ and $0$ both lie in $[0,1]$. Substituting $x=t^2$ and $y=0$ into the Lipschitz estimate gives
\begin{align*}
|\sqrt{t^2}-\sqrt{0}|\le L|t^2-0|.
\end{align*}
Since $t>0$, we have $\sqrt{t^2}=t$ and $\sqrt{0}=0$, so the left-hand side is $|t-0|=t$. The right-hand side is $L|t^2|=Lt^2$, because $t^2>0$. Hence
\begin{align*}
t\le Lt^2.
\end{align*}
Dividing by the positive number $t$ gives
\begin{align*}
1\le Lt.
\end{align*}
Now choose $t$ with $0<t<1/L$ if $L>0$; then $Lt<1$, contradicting $1\le Lt$. If $L=0$, the inequality $1\le Lt$ reads $1\le 0$, also impossible. Therefore no such Lipschitz constant exists.
The failure is concentrated at the origin: continuity still holds there, but the ratio $|\sqrt{x}-\sqrt{0}|/|x-0|=1/\sqrt{x}$ becomes unbounded as $x$ approaches $0$ from the right.
[/example]
This example explains why the word "uniform" is still not strong enough for many estimates. The square-root function is uniformly continuous on $[0,1]$, so every desired output tolerance can be protected by some input tolerance. Lipschitz continuity demands more: the same proportional rule must work at every scale.
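The blow-up of the difference quotient at the origin can also be observed numerically. The following Python sketch is an illustration only, not part of the argument; the helper name `sqrt_quotient` is introduced here for convenience and does not come from the text.

```python
import math

def sqrt_quotient(x: float) -> float:
    """Difference quotient |sqrt(x) - sqrt(0)| / |x - 0| for x > 0."""
    return abs(math.sqrt(x) - math.sqrt(0.0)) / abs(x - 0.0)

# Evaluate the quotient at x = 10^-2, 10^-4, 10^-6.
# It behaves like 1/sqrt(x), so no single constant L bounds it.
for k in (2, 4, 6):
    x = 10.0 ** (-k)
    print(f"x = {x:.0e}  quotient = {sqrt_quotient(x):.1f}")
```

Each hundredfold decrease in $x$ multiplies the quotient by roughly ten, which is the numerical shadow of the identity $|\sqrt{x}-\sqrt{0}|/|x-0|=1/\sqrt{x}$.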
## Definition
The most common analytic setting is a function defined on a set sitting inside a [normed vector space](/page/Normed%20Vector%20Space). Here distance is measured by the ambient norm, and the Lipschitz condition says that the output displacement is bounded by a fixed multiple of the input displacement. This formulation includes functions on intervals, domains in Euclidean space, and subsets of Banach spaces, which is the form used most often in real analysis, functional analysis, and differential equations.
[definition: Lipschitz Function]
Let $(V,\|\cdot\|_V)$ and $(W,\|\cdot\|_W)$ be normed vector spaces, let $E\subset V$, and let $f:E\to W$ be a function. For a real number $L\ge 0$, the function $f$ is $L$-Lipschitz if for every $x,y\in E$,
\begin{align*}
\|f(x)-f(y)\|_W\le L\,\|x-y\|_V.
\end{align*}
The function $f$ is Lipschitz if there exists a real number $L\ge 0$ such that $f$ is $L$-Lipschitz.
[/definition]
The definition above still remembers the ambient vector spaces, even when no addition or scalar multiplication is being used. Distance functions, fixed point arguments, and geometric maps often live in spaces where distances exist but vectors do not. To use the same error budget in that setting, we isolate the purely metric form.
[definition: Lipschitz Function Between Metric Spaces]
Let $(X,d_X)$ and $(Y,d_Y)$ be metric spaces, and let $f:X\to Y$ be a function. For a real number $L\ge 0$, the function $f$ is $L$-Lipschitz between metric spaces if for every $x,y\in X$,
\begin{align*}
d_Y(f(x),f(y))\le L\,d_X(x,y).
\end{align*}
The function $f$ is Lipschitz between metric spaces if there exists a real number $L\ge 0$ such that this estimate holds.
[/definition]
The number $L$ is not part of the function; it is a certificate for the estimate. If one value works, every larger value works as well. To compare two Lipschitz estimates, or to put Lipschitz functions into a function space, we need a canonical way to record the smallest expansion rate rather than an arbitrary certificate.
[definition: Lipschitz Seminorm]
Let $(X,d_X)$ be a [metric space](/page/Metric%20Space), and let $(Y,\|\cdot\|_Y)$ be a normed [vector space](/page/Vector%20Space). The Lipschitz seminorm is the extended-valued map
\begin{align*}
[\cdot]_{\mathrm{Lip}(X;Y)}: \{f:X\to Y\}\to [0,\infty].
\end{align*}
It is defined by
\begin{align*}
[f]_{\mathrm{Lip}(X;Y)}=\sup_{x,y\in X,\ x\ne y}\frac{\|f(x)-f(y)\|_Y}{d_X(x,y)}.
\end{align*}
If there are no distinct points $x,y\in X$, this supremum is defined to be $0$.
[/definition]
After this definition, the phrase "$f$ is Lipschitz" is exactly the assertion that $[f]_{\mathrm{Lip}(X;Y)}<\infty$. When finite, this quantity is the smallest $L$ for which $f$ is $L$-Lipschitz; it is also called the best Lipschitz constant of $f$ and is often denoted $\operatorname{Lip}(f)$ when the domain and codomain are understood. The word "seminorm" is deliberate: adding a constant vector to $f$ does not change differences, so the seminorm vanishes on constant functions.
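On a finite sample of points the supremum becomes a maximum that can be computed directly. The sketch below assumes real-valued functions on a grid in $\mathbb{R}$; the name `sampled_lipschitz_quotient` is hypothetical, and the value it returns is only a lower bound for the true seminorm, since pairs outside the sample are never examined.

```python
from itertools import combinations
from typing import Callable, Sequence

def sampled_lipschitz_quotient(f: Callable[[float], float],
                               points: Sequence[float]) -> float:
    """Largest difference quotient of f over distinct pairs of sample points.

    Returns 0.0 when there are no two distinct sample points, matching the
    convention used for the empty supremum in the definition.
    """
    best = 0.0
    for x, y in combinations(points, 2):
        if x != y:
            best = max(best, abs(f(x) - f(y)) / abs(x - y))
    return best

grid = [k / 100 for k in range(-300, 301)]  # sample points in [-3, 3]
print(sampled_lipschitz_quotient(abs, grid))             # absolute value
print(sampled_lipschitz_quotient(lambda x: 5.0, grid))   # constant function
```

The constant function gives the value zero, in line with the remark that the seminorm vanishes on constants.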
The first structural consequence is that Lipschitz continuity automatically gives [uniform continuity](/page/Uniform%20Continuity). This is the theorem that turns a slope bound into the epsilon-delta language of elementary analysis.
[quotetheorem:1097]
The converse fails, and the square-root example above is the standard warning. Uniform continuity controls error by an arbitrary modulus; Lipschitz continuity asks that this modulus be bounded above by a straight line through the origin.
## Constants, Slopes, and Elementary Sources
### Derivative Bounds
On intervals in $\mathbb{R}$, Lipschitz estimates are often found by bounding derivatives. This is not just a computational trick: it is the bridge between a local infinitesimal slope and a global finite-distance estimate.
[quotetheorem:328]
The theorem gives a fast way to recognise good error budgets. It also makes the failure of $\sqrt{x}$ transparent: on $(0,1]$ its derivative is
\begin{align*}
f'(x)=\frac{1}{2\sqrt{x}},
\end{align*}
so no finite bound holds near $0$.
[example: Bounded Derivative Gives a Reusable Constant]
Let $g:\mathbb{R}\to\mathbb{R}$ be defined by $g(x)=\sin x$. For every $t\in\mathbb{R}$, the derivative is
\begin{align*}
g'(t)=\cos t.
\end{align*}
Since $-1\le \cos t\le 1$, we have
\begin{align*}
|g'(t)|=|\cos t|\le 1.
\end{align*}
Therefore, by *Mean Value Bound for Lipschitz Continuity*, for every $x,y\in\mathbb{R}$,
\begin{align*}
|g(x)-g(y)|\le 1\cdot |x-y|.
\end{align*}
Substituting $g(x)=\sin x$ and $g(y)=\sin y$ gives
\begin{align*}
|\sin x-\sin y|\le |x-y|.
\end{align*}
Thus $\sin$ is $1$-Lipschitz on all of $\mathbb{R}$: the same constant $1$ controls every pair of points on the real line.
[/example]
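As a numerical sanity check of the relation between a derivative bound and difference quotients, the following Python sketch compares a sampled bound for $|\cos t|$ with sampled difference quotients of $\sin$. The interval $[-50,50]$, the sample sizes, and the random seed are arbitrary choices made for this illustration; a finite sample cannot replace the proof.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the run is reproducible

# Sampled derivative bound: |cos t| on a grid of [-50, 50].
derivative_bound = max(abs(math.cos(k / 100)) for k in range(-5000, 5001))

# Sampled difference quotients of sin over random pairs in [-50, 50].
pairs = [(rng.uniform(-50, 50), rng.uniform(-50, 50)) for _ in range(10_000)]
worst_quotient = max(abs(math.sin(x) - math.sin(y)) / abs(x - y)
                     for x, y in pairs if x != y)

print(derivative_bound, worst_quotient)
```

The sampled quotient never exceeds the sampled derivative bound, as the mean value estimate predicts.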
Many Lipschitz functions are not differentiable everywhere, so derivative bounds are a sufficient source rather than a definition. The absolute value function is the model case: it has a sharp corner, but distance to the corner still changes at speed at most one.
[example: The Absolute Value Function]
Let $h:\mathbb{R}\to\mathbb{R}$ be defined by $h(x)=|x|$. We show that $h$ is $1$-Lipschitz by proving the estimate
\begin{align*}
|h(x)-h(y)|\le |x-y|
\end{align*}
for arbitrary $x,y\in\mathbb{R}$.
By the triangle inequality applied to $x=(x-y)+y$,
\begin{align*}
|x|\le |x-y|+|y|.
\end{align*}
Subtracting $|y|$ from both sides gives
\begin{align*}
|x|-|y|\le |x-y|.
\end{align*}
Interchanging $x$ and $y$ gives
\begin{align*}
|y|-|x|\le |y-x|.
\end{align*}
Since $|y-x|=|x-y|$, this becomes
\begin{align*}
-(|x|-|y|)\le |x-y|.
\end{align*}
Thus both $|x|-|y|\le |x-y|$ and $-(|x|-|y|)\le |x-y|$ hold, so
\begin{align*}
\big||x|-|y|\big|\le |x-y|.
\end{align*}
Because $h(x)=|x|$ and $h(y)=|y|$, this is exactly
\begin{align*}
|h(x)-h(y)|\le |x-y|.
\end{align*}
Therefore $h$ is $1$-Lipschitz on $\mathbb{R}$.
The function still has a corner at $0$: for $t>0$, $(|t|-|0|)/(t-0)=1$, while for $t<0$, $(|t|-|0|)/(t-0)=-1$. Lipschitz continuity therefore does not require differentiability everywhere; it only requires that distances are not amplified faster than linearly.
[/example]
### Distance Functions
Many arguments need to turn a subset of a metric space into a numerical function without losing control of perturbations. Distance to a fixed set is the canonical construction: it measures how far a point is from the set, and the next theorem says that this measurement cannot change faster than the point itself moves.
[quotetheorem:8308]
Distance functions are often the simplest way to turn a set into an analytic object. A [closed set](/page/Closed%20Set) can be recognised as the zero set of $\operatorname{dist}_A$, and the Lipschitz estimate keeps that recognition stable under small perturbations.
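A small experiment makes the estimate concrete. The sketch below assumes the Euclidean plane and a hypothetical three-point set `A`; it counts sampled pairs for which $|\operatorname{dist}_A(p)-\operatorname{dist}_A(q)|$ exceeds the distance between $p$ and $q$, allowing a tiny tolerance for floating-point rounding.

```python
import math
import random

def dist_to_set(p, A):
    """Distance from the point p to the finite set A in the Euclidean plane."""
    return min(math.dist(p, a) for a in A)

A = [(0.0, 0.0), (2.0, 1.0), (-1.0, 3.0)]  # hypothetical finite set
rng = random.Random(1)                      # fixed seed for reproducibility

def random_point():
    return (rng.uniform(-5, 5), rng.uniform(-5, 5))

violations = 0
for _ in range(5_000):
    p, q = random_point(), random_point()
    # The 1-Lipschitz estimate, with a small tolerance for rounding.
    if abs(dist_to_set(p, A) - dist_to_set(q, A)) > math.dist(p, q) + 1e-12:
        violations += 1

print(violations)
```

No violations are expected, and points of `A` itself are sent to zero, which is the zero-set description in the finite case.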
## Local Control and Global Control
### Local Lipschitz Regularity
The global condition can be too demanding when the domain is large or when the function grows faster than linearly at infinity. Analysis often separates local regularity from global growth by asking for Lipschitz bounds only on bounded neighbourhoods.
For a metric space $(X,d_X)$, write $B(x_0,r)=\{x\in X:d_X(x,x_0)<r\}$ for the open ball of radius $r$ centred at $x_0$.
[definition: Locally Lipschitz Function]
Let $(X,d_X)$ and $(Y,d_Y)$ be metric spaces, and let $f:X\to Y$ be a function. The function $f$ is locally Lipschitz if for every point $x_0\in X$ there exist a radius $r>0$ and a real number $L\ge 0$ such that the restriction $f\big|_{B(x_0,r)}:B(x_0,r)\to Y$ is $L$-Lipschitz with respect to the restricted metric on $B(x_0,r)$.
[/definition]
Local Lipschitz continuity is the right regularity for many existence and uniqueness theorems because solutions are usually constructed on a time interval where the state remains in a controlled neighbourhood. A global constant is useful when available, but the local version is often the sharp hypothesis.
[example: Local Without Global]
The function $p:\mathbb{R}\to \mathbb{R}$ defined by $p(x)=x^2$ is locally Lipschitz, but it is not Lipschitz on all of $\mathbb{R}$. Fix $R>0$. For every $t\in(-R,R)$,
\begin{align*}
p'(t)=2t.
\end{align*}
Since $|t|\le R$ on $(-R,R)$, we have
\begin{align*}
|p'(t)|=|2t|=2|t|\le 2R.
\end{align*}
By *Mean Value Bound for Lipschitz Continuity*, the restriction of $p$ to $[-R,R]$ is $2R$-Lipschitz. Hence, for any point $x_0\in\mathbb{R}$, choosing $R>|x_0|$ and $r=R-|x_0|$ gives an open ball $B(x_0,r)\subset(-R,R)$ on which $p$ is $2R$-Lipschitz, so $p$ is locally Lipschitz.
It remains to show that no single Lipschitz constant works on all of $\mathbb{R}$. Suppose, for contradiction, that $p$ is $L$-Lipschitz on $\mathbb{R}$ for some $L\ge 0$. Then, for every $x\in\mathbb{R}$,
\begin{align*}
|p(x)-p(0)|\le L|x-0|.
\end{align*}
Substituting $p(x)=x^2$ and $p(0)=0$ gives
\begin{align*}
|x^2|\le L|x|.
\end{align*}
For $x>0$, this becomes
\begin{align*}
x^2\le Lx.
\end{align*}
Dividing by the positive number $x$ gives
\begin{align*}
x\le L.
\end{align*}
Choosing $x=L+1$ contradicts $x\le L$. Therefore $p(x)=x^2$ is not Lipschitz on all of $\mathbb{R}$.
The obstruction is growth at infinity: each bounded interval has its own finite slope budget, but those budgets increase with the size of the interval.
[/example]
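The growth of the slope budget with the interval can be tabulated. In the Python sketch below, the helper names and the grid size are choices made for this illustration; the sampled maximum is a lower bound for the best constant on $[-R,R]$ and should stay just below $2R$.

```python
def square_quotient(x: float, y: float) -> float:
    """Difference quotient of p(x) = x**2; algebraically this equals |x + y|."""
    return abs(x * x - y * y) / abs(x - y)

def sampled_constant(R: float, n: int = 200) -> float:
    """Largest quotient over an evenly spaced grid of n + 1 points in [-R, R]."""
    grid = [-R + 2 * R * k / n for k in range(n + 1)]
    return max(square_quotient(x, y)
               for i, x in enumerate(grid) for y in grid[i + 1:])

# The sampled constant scales with R, so no single value serves all of R.
for R in (1.0, 10.0, 100.0):
    print(R, sampled_constant(R))
```

Each tenfold increase of $R$ raises the sampled constant tenfold, so no finite constant can serve every bounded interval at once.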
### ODE Uniqueness
The distinction between local and global control becomes decisive in differential equations, where a vector field may have good slope bounds only on the region visited by a solution. The guiding uniqueness principle is elementary in spirit: if two solution curves start from the same point, a Lipschitz bound on the vector field turns the distance between the curves into a quantity that cannot grow from zero. Precise ODE statements carry further technical hypotheses, for instance about measurable inputs and the intervals on which solutions are defined, but the role of the Lipschitz bound is just this mechanism: it prevents solutions from separating.
In ODE language, the constant controls how fast the distance between competing trajectories can grow. Local Lipschitz control is therefore enough for uniqueness as long as the trajectories remain in a region where one common constant applies.
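A discrete analogue of this mechanism can be simulated. The sketch below is an illustration under stated assumptions, not a proof of uniqueness: it uses the vector field $F(x)=\sin x$, which is $1$-Lipschitz, and the explicit Euler scheme with step $h$, both chosen here for convenience. For Euler iterates the gap between two trajectories is multiplied by at most $1+hL\le e^{hL}$ per step, so after $k$ steps it is at most $e^{Lhk}$ times the initial gap.

```python
import math
from typing import List

def euler_path(x0: float, h: float, steps: int) -> List[float]:
    """Explicit Euler iterates for x' = sin(x); the field sin is 1-Lipschitz."""
    xs = [x0]
    for _ in range(steps):
        xs.append(xs[-1] + h * math.sin(xs[-1]))
    return xs

L, h, steps = 1.0, 0.01, 500
a = euler_path(1.0, h, steps)
b = euler_path(1.0 + 1e-3, h, steps)  # slightly perturbed starting point

# Gap after k steps versus the exponential bound gap_0 * exp(L * h * k).
gaps = [abs(u - v) for u, v in zip(a, b)]
bounds = [gaps[0] * math.exp(L * h * k) for k in range(steps + 1)]
within_bound = all(g <= bd * (1 + 1e-9) for g, bd in zip(gaps, bounds))

print(within_bound)
```

When the two starting points coincide, the initial gap is zero and the bound forces every later gap to be zero as well, which is the discrete counterpart of the uniqueness principle.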
## Algebra of Lipschitz Estimates
### Composition
Lipschitz estimates are useful because they combine predictably. Sums, scalar multiples, restrictions, and compositions have constants that can be read directly from the pieces. This arithmetic is what lets long arguments keep a visible error budget.
[quotetheorem:8309]
Composition is the most common bookkeeping rule. If a numerical approximation perturbs the input by at most $\delta$, and the model and post-processing map have constants $L$ and $M$, the final error is bounded by $ML\delta$. The constant records how error propagates through the pipeline.
[example: A Simple Error-Propagation Chain]
Let $f:[0,1]\to \mathbb{R}$ be defined by $f(x)=3x+1$, and let $g:\mathbb{R}\to \mathbb{R}$ be defined by $g(u)=\sin u$. For $x,y\in[0,1]$,
\begin{align*}
|f(x)-f(y)|=|(3x+1)-(3y+1)|.
\end{align*}
Cancelling the constant terms gives
\begin{align*}
|(3x+1)-(3y+1)|=|3x-3y|.
\end{align*}
Factoring out $3$ and using $3>0$ gives
\begin{align*}
|3x-3y|=|3(x-y)|=3|x-y|.
\end{align*}
Thus $f$ is $3$-Lipschitz on $[0,1]$. Also, since $g'(u)=\cos u$ and $|\cos u|\le 1$ for every $u\in\mathbb{R}$, *Mean Value Bound for Lipschitz Continuity* gives
\begin{align*}
|\sin u-\sin v|\le |u-v|
\end{align*}
for all $u,v\in\mathbb{R}$, so $g$ is $1$-Lipschitz. By *[Composition of Lipschitz Maps](/theorems/8309)*, $g\circ f$ is $(1\cdot 3)$-Lipschitz. The same bound can be checked directly: for $x,y\in[0,1]$,
\begin{align*}
|(g\circ f)(x)-(g\circ f)(y)|=|\sin(3x+1)-\sin(3y+1)|.
\end{align*}
Using the $1$-Lipschitz estimate for $\sin$ with $u=3x+1$ and $v=3y+1$ gives
\begin{align*}
|\sin(3x+1)-\sin(3y+1)|\le |(3x+1)-(3y+1)|.
\end{align*}
The right-hand side is
\begin{align*}
|(3x+1)-(3y+1)|=3|x-y|.
\end{align*}
Therefore
\begin{align*}
|\sin(3x+1)-\sin(3y+1)|\le 3|x-y|.
\end{align*}
So replacing $x$ by a nearby input changes $\sin(3x+1)$ by at most three times the input error.
[/example]
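The same chain can be checked on a grid. This Python sketch mirrors the maps $f$ and $g$ of the example; the grid spacing and the name `chain` are choices made here, and the sampled maximum is only a lower bound for the best constant of $g\circ f$.

```python
import math

def f(x: float) -> float:
    return 3 * x + 1          # 3-Lipschitz on [0, 1]

def g(u: float) -> float:
    return math.sin(u)        # 1-Lipschitz on the real line

def chain(x: float) -> float:
    return g(f(x))            # composition g o f

grid = [k / 200 for k in range(201)]  # evenly spaced points in [0, 1]
worst = max(abs(chain(x) - chain(y)) / abs(x - y)
            for i, x in enumerate(grid) for y in grid[i + 1:])

print(worst)
```

The sampled value comes close to $3$ because $3x+1$ passes through $\pi$ inside $[0,1]$, where $|\cos|$ equals $1$; the product of the constants is therefore not improvable for this pair of maps.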
### Sums and Seminorm Estimates
Error terms are often added, not merely composed. When two vector-valued Lipschitz maps contribute to the same model, the next estimate gives the combined error budget and explains why the Lipschitz seminorm behaves like a seminorm.
[quotetheorem:8310]
The seminorm formulation packages this theorem as the inequality
\begin{align*}
[f+g]_{\mathrm{Lip}(X;V)}\le [f]_{\mathrm{Lip}(X;V)}+[g]_{\mathrm{Lip}(X;V)}.
\end{align*}
That inequality is the same triangle inequality seen through the lens of difference quotients.
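The inequality can be observed on sampled difference quotients. The sketch below takes the scalar case $V=\mathbb{R}$ with the hypothetical choices $f=\sin$ and $g=|\cdot|$; because all three maxima are taken over the same pairs, the sampled version of the inequality holds up to floating-point rounding.

```python
import math
from itertools import combinations

def sampled_seminorm(func, points):
    """Maximum difference quotient of func over pairs of distinct grid points."""
    return max((abs(func(x) - func(y)) / abs(x - y)
                for x, y in combinations(points, 2)), default=0.0)

f = math.sin
g = abs

def f_plus_g(x: float) -> float:
    return f(x) + g(x)

grid = [k / 50 for k in range(-150, 151)]  # distinct points in [-3, 3]

lhs = sampled_seminorm(f_plus_g, grid)
rhs = sampled_seminorm(f, grid) + sampled_seminorm(g, grid)
print(lhs, rhs)
```

For this pair the two sides nearly agree, since just to the right of the origin both $\sin$ and $|\cdot|$ have slope close to $1$ with the same sign.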
## Contractions and Fixed Points
A Lipschitz constant smaller than one has a special meaning: the function shrinks every distance by at least a fixed factor. This turns repeated iteration into a convergence mechanism rather than merely a stability estimate.
[definition: Contraction Mapping]
Let $(X,d)$ be a metric space. A function $T:X\to X$ is a contraction mapping if there exists a real number $q$ with $0\le q<1$ such that
\begin{align*}
d(Tx,Ty)\le q\,d(x,y)
\end{align*}
for every $x,y\in X$.
[/definition]
The strict inequality $q<1$ is what changes the qualitative behaviour. Once errors shrink geometrically, repeated iteration should converge to a stable point, but that conclusion also needs a space in which Cauchy sequences have limits. The fixed point theorem combines those two ingredients.
[quotetheorem:71]
This theorem is the engine behind many existence proofs. Instead of solving an equation directly, one rewrites it as a fixed point problem $T(x)=x$ and proves that the chosen map contracts distances in a complete space.
[example: Solving a Linear Fixed Point Problem]
Let $T:\mathbb{R}\to \mathbb{R}$ be defined by $T(x)=\frac{1}{2}x+1$. For arbitrary $x,y\in\mathbb{R}$, we have
\begin{align*}
|T(x)-T(y)|=\left|\left(\frac{1}{2}x+1\right)-\left(\frac{1}{2}y+1\right)\right|.
\end{align*}
The constant terms cancel, so
\begin{align*}
\left|\left(\frac{1}{2}x+1\right)-\left(\frac{1}{2}y+1\right)\right|=\left|\frac{1}{2}x-\frac{1}{2}y\right|.
\end{align*}
Factoring out $\frac{1}{2}$ gives
\begin{align*}
\left|\frac{1}{2}x-\frac{1}{2}y\right|=\left|\frac{1}{2}(x-y)\right|.
\end{align*}
Since $\frac{1}{2}>0$, this is
\begin{align*}
\left|\frac{1}{2}(x-y)\right|=\frac{1}{2}|x-y|.
\end{align*}
Therefore
\begin{align*}
|T(x)-T(y)|\le \frac{1}{2}|x-y|.
\end{align*}
Thus $T$ is a contraction mapping with contraction constant $\frac{1}{2}$.
To find its fixed point, solve $T(x)=x$. This equation is
\begin{align*}
\frac{1}{2}x+1=x.
\end{align*}
Subtracting $\frac{1}{2}x$ from both sides gives
\begin{align*}
1=\frac{1}{2}x.
\end{align*}
Multiplying both sides by $2$ gives
\begin{align*}
x=2.
\end{align*}
Hence the only fixed point is $x_\ast=2$.
Since $\mathbb{R}$ is complete and $T$ is a contraction, the *[Banach Fixed Point Theorem](/theorems/270)* implies that, for every starting point $x_0\in\mathbb{R}$, the sequence defined by $x_{k+1}=T(x_k)$ converges to $2$. The step-by-step error identity is
\begin{align*}
|x_{k+1}-2|=|T(x_k)-2|.
\end{align*}
Substituting $T(x_k)=\frac{1}{2}x_k+1$ gives
\begin{align*}
|T(x_k)-2|=\left|\frac{1}{2}x_k+1-2\right|.
\end{align*}
Combining the constant terms gives
\begin{align*}
\left|\frac{1}{2}x_k+1-2\right|=\left|\frac{1}{2}x_k-1\right|.
\end{align*}
Since $1=\frac{1}{2}\cdot 2$, we have
\begin{align*}
\left|\frac{1}{2}x_k-1\right|=\left|\frac{1}{2}x_k-\frac{1}{2}\cdot 2\right|.
\end{align*}
Factoring out $\frac{1}{2}$ gives
\begin{align*}
\left|\frac{1}{2}x_k-\frac{1}{2}\cdot 2\right|=\left|\frac{1}{2}(x_k-2)\right|.
\end{align*}
Since $\frac{1}{2}>0$, this becomes
\begin{align*}
\left|\frac{1}{2}(x_k-2)\right|=\frac{1}{2}|x_k-2|.
\end{align*}
Therefore
\begin{align*}
|x_{k+1}-2|=\frac{1}{2}|x_k-2|.
\end{align*}
Each iteration halves the distance to the fixed point, so the convergence is governed by the same contraction constant that controls the Lipschitz estimate.
[/example]
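The iteration is short enough to run directly. In this Python sketch the starting point $x_0=10$ and the number of steps are arbitrary choices; with these values every iterate is exactly representable in binary floating point, so the halving of the error is reproduced without rounding.

```python
def T(x: float) -> float:
    """The contraction T(x) = x/2 + 1 with fixed point 2."""
    return 0.5 * x + 1.0

def iterate(x0: float, steps: int) -> list:
    """Return [x_0, x_1, ..., x_steps] with x_{k+1} = T(x_k)."""
    xs = [x0]
    for _ in range(steps):
        xs.append(T(xs[-1]))
    return xs

xs = iterate(10.0, 20)
errors = [abs(x - 2.0) for x in xs]

for k in range(4):
    print(k, xs[k], errors[k])
# prints:
# 0 10.0 8.0
# 1 6.0 4.0
# 2 4.0 2.0
# 3 3.0 1.0
```

After $k$ steps the error is $8\cdot 2^{-k}$, in agreement with the identity $|x_{k+1}-2|=\frac{1}{2}|x_k-2|$.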
The fixed point theorem also explains why Lipschitz hypotheses appear in integral equations and ODE proofs. The map that sends a candidate solution to the right-hand side of an integral equation must shrink distances in a function space, and the Lipschitz constant is the number that makes the shrinkage estimate precise.
## Lipschitz Geometry
In metric geometry, Lipschitz maps are the morphisms that do not expand distances faster than linearly. When a map has a Lipschitz inverse as well, it preserves the metric structure up to multiplicative constants.
[definition: Bi-Lipschitz Embedding]
Let $(X,d_X)$ and $(Y,d_Y)$ be metric spaces. A function $f:X\to Y$ is a bi-Lipschitz embedding if there exist [real numbers](/page/Real%20Numbers) $c,C>0$ such that for every $x,y\in X$,
\begin{align*}
c\,d_X(x,y)\le d_Y(f(x),f(y))\le C\,d_X(x,y).
\end{align*}
[/definition]
The upper bound says that $f$ is Lipschitz, while the lower bound prevents distinct points from collapsing too close together. Thus a bi-Lipschitz embedding preserves both the large-scale and the small-scale shape of the metric space up to controlled distortion.
[example: Rescaling Euclidean Space]
For a fixed real number $\lambda>0$, define $S_\lambda:\mathbb{R}^n\to \mathbb{R}^n$ by $S_\lambda(x)=\lambda x$. For arbitrary $x,y\in\mathbb{R}^n$, we compute
\begin{align*}
S_\lambda(x)-S_\lambda(y)=\lambda x-\lambda y.
\end{align*}
Factoring out $\lambda$ gives
\begin{align*}
\lambda x-\lambda y=\lambda(x-y).
\end{align*}
Taking the Euclidean norm and using absolute homogeneity of the norm gives
\begin{align*}
|S_\lambda(x)-S_\lambda(y)|=|\lambda(x-y)|=|\lambda|\,|x-y|.
\end{align*}
Since $\lambda>0$, we have $|\lambda|=\lambda$, so
\begin{align*}
|S_\lambda(x)-S_\lambda(y)|=\lambda |x-y|.
\end{align*}
Thus the upper bi-Lipschitz estimate holds with $C=\lambda$:
\begin{align*}
|S_\lambda(x)-S_\lambda(y)|\le \lambda |x-y|.
\end{align*}
The same equality also gives the lower estimate with $c=\lambda$:
\begin{align*}
\lambda |x-y|\le |S_\lambda(x)-S_\lambda(y)|.
\end{align*}
Therefore $S_\lambda$ is bi-Lipschitz with $c=C=\lambda$.
Its inverse is $S_{1/\lambda}$. Indeed, for every $x\in\mathbb{R}^n$,
\begin{align*}
S_{1/\lambda}(S_\lambda(x))=\frac{1}{\lambda}(\lambda x).
\end{align*}
Associativity of scalar multiplication gives
\begin{align*}
\frac{1}{\lambda}(\lambda x)=\left(\frac{1}{\lambda}\lambda\right)x=x.
\end{align*}
Similarly, for every $x\in\mathbb{R}^n$,
\begin{align*}
S_\lambda(S_{1/\lambda}(x))=\lambda\left(\frac{1}{\lambda}x\right)=\left(\lambda\frac{1}{\lambda}\right)x=x.
\end{align*}
So $S_\lambda$ changes every Euclidean distance by exactly the fixed factor $\lambda$, which is the model case of metric distortion by uniform rescaling.
[/example]
Bi-Lipschitz maps preserve many metric properties because their inequalities can be run in both directions. Completeness, Cauchy behaviour, boundedness, and Hausdorff dimension estimates all respond well to such two-sided control.
The one-sided Lipschitz condition is weaker and deliberately so. Many useful maps collapse information: projections, distance functions, and quotient maps may be Lipschitz without being invertible. The upper estimate alone is enough whenever the question is whether errors can grow too fast.
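The contrast between two-sided and one-sided control shows up in sampled distortion ratios. The sketch below works in the Euclidean plane with two hypothetical maps: the rescaling $S_\lambda$ with $\lambda=3$ and the projection onto the first coordinate axis. The sampled minimum and maximum are stand-ins for the constants $c$ and $C$ and do not certify them.

```python
import math
import random

def distortion_range(f, pairs):
    """Smallest and largest ratio |f(p) - f(q)| / |p - q| over the sampled pairs."""
    ratios = [math.dist(f(p), f(q)) / math.dist(p, q) for p, q in pairs]
    return min(ratios), max(ratios)

def scale(p, lam=3.0):
    """The rescaling S_lambda with lambda = 3."""
    return (lam * p[0], lam * p[1])

def project(p):
    """Projection onto the first coordinate axis."""
    return (p[0], 0.0)

rng = random.Random(2)  # fixed seed for reproducibility

def random_point():
    return (rng.uniform(-1, 1), rng.uniform(-1, 1))

pairs = [(random_point(), random_point()) for _ in range(2_000)]
# One vertical pair: the projection sends both points to the same image.
pairs.append(((0.5, -1.0), (0.5, 1.0)))

print(distortion_range(scale, pairs))
print(distortion_range(project, pairs))
```

The rescaling has ratio $3$ on every pair, while the projection never exceeds ratio $1$ but reaches ratio $0$ on the vertical pair, so it is Lipschitz without being bi-Lipschitz.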
## Almost-Everywhere Differentiability
The examples above show that Lipschitz functions need not be differentiable everywhere. A remarkable theorem says that in Euclidean spaces this is the only possible defect: differentiability may fail on a set of [Lebesgue measure](/page/Lebesgue%20Measure) zero, but not on a set of positive measure.
In the statement below, $\mathcal{L}^n$ denotes Lebesgue measure on $\mathbb{R}^n$, so "$\mathcal{L}^n$-almost every point" means outside a subset of $U$ with $\mathcal{L}^n$-measure zero.
[quotetheorem:3069]
[Rademacher's theorem](/theorems/3069) is one reason Lipschitz functions sit between smooth functions and merely continuous functions. They are flexible enough to include corners and distance functions, yet rigid enough to possess a derivative almost everywhere.
[example: A Corner with Almost-Everywhere Derivative]
Let $h:\mathbb{R}\to\mathbb{R}$ be defined by $h(x)=|x|$. At $0$, the difference quotient along positive increments is
\begin{align*}
\frac{h(t)-h(0)}{t-0}=\frac{|t|-0}{t}=1
\end{align*}
for every $t>0$, while along negative increments it is
\begin{align*}
\frac{h(t)-h(0)}{t-0}=\frac{|t|-0}{t}=\frac{-t}{t}=-1
\end{align*}
for every $t<0$. Since the two one-sided limits of the difference quotient are different, $h$ is not differentiable at $0$.
If $x>0$, then for every $t$ sufficiently close to $0$ with $x+t>0$,
\begin{align*}
\frac{h(x+t)-h(x)}{t}=\frac{|x+t|-|x|}{t}=\frac{(x+t)-x}{t}=1.
\end{align*}
Thus $h'(x)=1$ for every $x>0$. If $x<0$, then for every $t$ sufficiently close to $0$ with $x+t<0$,
\begin{align*}
\frac{h(x+t)-h(x)}{t}=\frac{|x+t|-|x|}{t}=\frac{-(x+t)-(-x)}{t}=\frac{-t}{t}=-1.
\end{align*}
Thus $h'(x)=-1$ for every $x<0$.
Therefore the only point where differentiability fails is $0$. Since the singleton set $\{0\}$ has Lebesgue measure zero in $\mathbb{R}$, the derivative exists almost everywhere even though the graph has a corner at the origin.
[/example]
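The one-sided difference quotients from this example can be evaluated numerically. In the sketch below the increment $t=2^{-20}$ is chosen so that the arithmetic is exact at the three sample points; the function name `one_sided_quotients` is introduced here for illustration.

```python
def one_sided_quotients(x: float, t: float = 2.0 ** -20):
    """Right and left difference quotients of h(x) = |x| with increment t > 0."""
    right = (abs(x + t) - abs(x)) / t       # increment +t
    left = (abs(x - t) - abs(x)) / (-t)     # increment -t
    return right, left

for x in (-1.0, 0.0, 1.0):
    print(x, one_sided_quotients(x))
```

Away from the origin the two quotients agree, at $-1$ for negative $x$ and at $1$ for positive $x$; at the origin they are $1$ and $-1$, which is the corner.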
The theorem also connects Lipschitz analysis to [Sobolev spaces](/page/Sobolev%20Space). On a [Euclidean domain](/page/Euclidean%20Domain), a Lipschitz function has weak first derivatives in $L^\infty_{\mathrm{loc}}$, and the essential bound on those weak derivatives recovers the local Lipschitz scale in many standard settings.
## Beyond and Connected Topics
The most immediate continuation is [Cambridge II Analysis of Functions](/page/Cambridge%20II%20Analysis%20of%20Functions), where Lipschitz estimates appear alongside [uniform convergence](/page/Uniform%20Convergence), differentiability, and function-space arguments. There the condition is part of a broader toolkit for controlling limits of functions.
For functional-analytic applications, [Cambridge II Linear Analysis](/page/Cambridge%20II%20Linear%20Analysis) places Lipschitz and contraction estimates in the setting of complete metric and normed spaces. Banach's fixed point theorem is one of the standard bridges from metric estimates to existence theorems.
In geometry and measure theory, [Geometric Measure Theory III: BV Functions and Sets of Finite Perimeter](/page/Geometric%20Measure%20Theory%20III%3A%20BV%20Functions%20and%20Sets%20of%20Finite%20Perimeter) uses Lipschitz maps and distance functions as basic tools for rectifiability, perimeter estimates, and approximation. Rademacher's theorem is part of the background explaining why Lipschitz maps have enough differential structure for geometric arguments.
Lipschitz constants also appear in learning theory and optimisation. In [Cambridge II Mathematics of Machine Learning](/page/Cambridge%20II%20Mathematics%20of%20Machine%20Learning), they measure stability of loss functions, gradients, and prediction maps under perturbations of data or parameters.
## References
Androma, [Cambridge II Analysis of Functions](/page/Cambridge%20II%20Analysis%20of%20Functions).
Androma, [Cambridge II Linear Analysis](/page/Cambridge%20II%20Linear%20Analysis).
Androma, [Geometric Measure Theory III: BV Functions and Sets of Finite Perimeter](/page/Geometric%20Measure%20Theory%20III%3A%20BV%20Functions%20and%20Sets%20of%20Finite%20Perimeter).
Androma, [Cambridge II Mathematics of Machine Learning](/page/Cambridge%20II%20Mathematics%20of%20Machine%20Learning).
Androma, [Sobolev Space](/page/Sobolev%20Space).
Walter Rudin, *Principles of Mathematical Analysis* (1976).
Lawrence C. Evans and Ronald F. Gariepy, *Measure Theory and Fine Properties of Functions* (1992).
Juha Heinonen, *Lectures on Analysis on Metric Spaces* (2001).