Integration is, alongside [differentiation](/page/Derivative), one of the two fundamental operations of calculus. Where differentiation extracts the instantaneous rate of change of a function, integration reconstructs a quantity from its rate of change — it computes areas, volumes, total mass, accumulated work, and probabilities. The [Fundamental Theorem of Calculus](/theorems/632) reveals that these two operations are inverses: differentiation undoes integration, and integration undoes differentiation.
But the concept of "integral" is not a single definition: it is a family of increasingly powerful constructions, each designed to handle a larger class of [functions](/page/Function). The [Riemann integral](/page/Riemann%20Integral) (partitioning the domain into subintervals) suffices for [continuous](/page/Continuous) functions and many applications in calculus and physics. The [Lebesgue integral](/page/Lebesgue%20Integral) (partitioning the range and measuring the preimages) handles a far larger class (every nonnegative measurable function, and every measurable function whose absolute value has finite integral) and supports the powerful convergence theorems that analysis requires. Beyond these, the Stieltjes integral, the Henstock-Kurzweil integral, the integral with respect to a general measure, and the integral on manifolds each extend the concept further.
This page develops the integral at the conceptual level: the problem it solves, the key ideas behind the Riemann and Lebesgue approaches, the properties that any reasonable integral must satisfy, the Fundamental Theorem of Calculus, the convergence theorems, and integration in higher dimensions and on general spaces. The detailed constructions are developed on dedicated pages.
[motivation]
## Motivation
### The Area Problem
The oldest motivation for integration is the computation of area. The area of a rectangle with sides $a$ and $b$ is $ab$ — but what is the area of the region bounded by a curve $y = f(x)$, the $x$-axis, and the vertical lines $x = a$ and $x = b$? Archimedes answered this for parabolas using the method of exhaustion: approximate the region by inscribed and circumscribed polygons, show the two approximations converge to the same value, and declare that value to be the area. This is, in essence, the Riemann integral — the area is the [limit](/page/Limit) of sums of rectangular areas as the rectangles become infinitely thin.
### Beyond Area: Accumulation
Integration is far more than an area computation. Whenever a quantity accumulates at a varying rate, the total accumulation is an integral. If $v(t)$ is velocity, then $\int_a^b v(t) \, d\mathcal{L}^1(t)$ is displacement. If $\rho(x)$ is mass density, then $\int_U \rho(x) \, d\mathcal{L}^n(x)$ is total mass. If $f(x)$ is a probability density, then $\int_A f(x) \, d\mathcal{L}^1(x)$ is the probability of the event $A$. In each case, the integral "adds up" infinitesimal contributions to produce a finite total.
### Why Multiple Theories?
The Riemann integral — defined by partitioning the $x$-axis and summing $f(x_i^*) \Delta x_i$ — works well for continuous functions but fails for many functions that arise naturally in analysis. The characteristic function $\mathbb{1}_\mathbb{Q}$ of the rationals (equal to $1$ on $\mathbb{Q}$ and $0$ on $\mathbb{R} \setminus \mathbb{Q}$) is not Riemann integrable, because every subinterval contains both rationals and irrationals, so the upper and lower Riemann sums never agree. The pointwise limit of a [sequence](/page/Sequence) of Riemann-integrable functions need not be Riemann integrable. And the Riemann integral does not interact well with [limits](/page/Limit): there exist sequences $f_n \to f$ pointwise with $\int f_n \not\to \int f$.
The Lebesgue integral addresses these difficulties by replacing the domain partition with a range partition: instead of asking "what is $f$ on the interval $[x_i, x_{i+1}]$?", ask "how much of the domain does $f$ map to values near $y$?" This requires measuring the "size" of sets (measure theory), but it produces an integral that is defined for every nonnegative measurable function and for every measurable function whose absolute value has finite integral. It also gives precise conditions under which limits and integrals may be interchanged (such as the [Dominated Convergence Theorem](/theorems/4)), and it provides the foundation for probability theory, functional analysis, and PDE theory.
[/motivation]
## Definition
There is no single "definition of the integral" — each theory of integration provides its own construction. However, all reasonable integrals share a common structure: the integral is a **linear functional** on a space of functions that is **monotone** (nonnegative functions have nonnegative integrals) and satisfies an appropriate **convergence theorem**.
### The Abstract Viewpoint
At the most abstract level, an integral on a [set](/page/Set) $X$ with respect to a measure $\mu$ is a linear functional
\begin{align*}
I: \mathcal{F} \to \mathbb{R} \quad (\text{or } \mathbb{C}), \qquad f \mapsto I(f) = \int_X f \, d\mu,
\end{align*}
defined on some vector space $\mathcal{F}$ of functions $f: X \to \mathbb{R}$, satisfying:
1. **Linearity:** $I(\alpha f + \beta g) = \alpha I(f) + \beta I(g)$.
2. **Monotonicity:** if $f \ge 0$ a.e., then $I(f) \ge 0$.
3. **Normalisation:** $I(\mathbb{1}_{[0,1]}) = 1$ (or more generally, $I(\mathbb{1}_E) = \mu(E)$ for measurable sets $E$).
The different theories of integration differ in the choice of $\mathcal{F}$ (which functions are integrable?), the construction of $I$ (how is the integral computed?), and the convergence theorems (when does $\int f_n \to \int f$?).
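As a toy illustration of the three axioms, the following Python sketch integrates functions on a four-point set against a weighted counting measure. The set $X = \{0, 1, 2, 3\}$, the point masses in `MU`, and the helper names `integral`, `indicator`, and `measure` are assumptions chosen for this example. On a finite set every function is simple, so the integral reduces to a finite weighted sum, and normalisation takes the general form $I(\mathbb{1}_E) = \mu(E)$.

```python
from fractions import Fraction

# Hypothetical measure space: X = {0, 1, 2, 3} with point masses mu({x}).
MU = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 8), 3: Fraction(1, 8)}


def integral(f, mu=MU):
    """I(f) = sum of f(x) * mu({x}) over the points x of the finite set."""
    return sum(Fraction(f(x)) * mass for x, mass in mu.items())


def indicator(subset):
    """Indicator function of a subset E of X."""
    return lambda x: 1 if x in subset else 0


def measure(subset, mu=MU):
    """mu(E), the total mass of the points in E."""
    return sum(mu[x] for x in subset)


f = lambda x: x * x
g = lambda x: 3 - x
alpha, beta = Fraction(2), Fraction(-5)

# Linearity: I(alpha f + beta g) = alpha I(f) + beta I(g).
combined = integral(lambda x: alpha * f(x) + beta * g(x))
print(combined == alpha * integral(f) + beta * integral(g))

# Monotonicity: f >= 0 everywhere, so I(f) >= 0.
print(integral(f) >= 0)

# Normalisation: I(1_E) = mu(E).
E = {1, 2}
print(integral(indicator(E)) == measure(E))
```

Exact rational arithmetic is used so that the identities hold exactly and are not blurred by floating-point rounding.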
[definition:Integral With Respect To A Measure]
Let $(X, \Sigma, \mu)$ be a measure space (where $X$ is a set, $\Sigma$ is a $\sigma$-algebra of measurable subsets, and $\mu: \Sigma \to [0, \infty]$ is a measure). For a measurable function $f: X \to [0, \infty]$, the **Lebesgue integral** of $f$ with respect to $\mu$ is
\begin{align*}
\int_X f \, d\mu := \sup\left\{\int_X s \, d\mu : 0 \le s \le f,\; s \text{ simple}\right\},
\end{align*}
where a simple function $s = \sum_{k=1}^n c_k \mathbb{1}_{E_k}$ (with $E_k \in \Sigma$ and $c_k \ge 0$) has integral $\int_X s \, d\mu = \sum_{k=1}^n c_k \mu(E_k)$.
For a general measurable function $f: X \to \mathbb{R}$, write $f = f^+ - f^-$ where $f^+ = \max(f, 0)$ and $f^- = \max(-f, 0)$. Then $f$ is **integrable** (written $f \in L^1(X, \mu)$) if both $\int f^+ \, d\mu < \infty$ and $\int f^- \, d\mu < \infty$, and
\begin{align*}
\int_X f \, d\mu := \int_X f^+ \, d\mu - \int_X f^- \, d\mu.
\end{align*}
[/definition]
This is the Lebesgue integral. It extends the Riemann integral: every Riemann-integrable function on $[a, b]$ is Lebesgue integrable, and the two integrals agree. When $X = [a, b]$, $\Sigma$ is the $\sigma$-algebra of Lebesgue measurable subsets of $[a, b]$ (the completion of the Borel $\sigma$-algebra $\mathcal{B}([a, b])$), and $\mu = \mathcal{L}^1$ (Lebesgue measure), the notation $\int_a^b f(x) \, d\mathcal{L}^1(x)$ or simply $\int_a^b f(x) \, dx$ is used.
## The Riemann and Lebesgue Approaches
The two classical approaches to integration partition different things. The Riemann integral partitions the *domain*; the Lebesgue integral partitions the *range*. This conceptual difference is the source of all the technical differences between the two theories.
### The Riemann Integral
The Riemann integral of a bounded function $f: [a, b] \to \mathbb{R}$ is defined by partitioning $[a, b]$ into subintervals $[x_{i-1}, x_i]$, forming the upper sums $U(f, P) = \sum (\sup_{[x_{i-1}, x_i]} f) \Delta x_i$ and lower sums $L(f, P) = \sum (\inf_{[x_{i-1}, x_i]} f) \Delta x_i$, and declaring $f$ Riemann integrable if $\inf_P U(f, P) = \sup_P L(f, P)$. The common value is then the integral of $f$.
The Riemann integral handles all [continuous](/page/Continuous) functions (by the [Heine-Cantor theorem](/theorems/280), [continuity](/page/Continuity) on $[a, b]$ is uniform, so the upper and lower sums converge), all monotone functions, and more generally all bounded functions with a set of discontinuities of Lebesgue measure zero (the Lebesgue criterion). Its limitation is that it requires the oscillation of $f$ on small intervals to vanish — a condition that fails for highly discontinuous functions like $\mathbb{1}_\mathbb{Q}$.
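The upper and lower sums can be computed exactly in a simple case. The sketch below assumes a nondecreasing function (here $f(x) = x^2$ on $[0, 1]$) and a uniform partition, so that the supremum on each subinterval is attained at the right endpoint and the infimum at the left endpoint. The helper name `darboux_sums` is illustrative.

```python
from fractions import Fraction


def darboux_sums(f, a, b, n):
    """Upper and lower sums of a nondecreasing f on the uniform n-piece partition of [a, b].

    Assumes f is nondecreasing, so sup and inf on each piece sit at the endpoints.
    """
    a, b = Fraction(a), Fraction(b)
    dx = (b - a) / n
    points = [a + i * dx for i in range(n + 1)]
    lower = sum(f(points[i]) * dx for i in range(n))      # inf at the left endpoint
    upper = sum(f(points[i + 1]) * dx for i in range(n))  # sup at the right endpoint
    return upper, lower


for n in (10, 100, 1000):
    upper, lower = darboux_sums(lambda x: x * x, 0, 1, n)
    # For a monotone f the gap telescopes to (f(b) - f(a)) * (b - a) / n.
    print(n, float(lower), float(upper), upper - lower)
```

The gap $U(f, P) - L(f, P)$ equals $1/n$ here, which is why monotone functions are Riemann integrable: the two sums squeeze a single value (in this case $1/3$).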
### The Lebesgue Integral
The Lebesgue integral partitions the range: instead of asking what $f$ does on the interval $[x_{i-1}, x_i]$, it asks how much of the domain satisfies $y_{j-1} \le f(x) < y_j$, and sums $y_j \cdot \mu(\{x : y_{j-1} \le f(x) < y_j\})$. This requires a notion of "size" for sets (measure theory) but makes the integral depend only on the *distribution* of values of $f$, not on the arrangement of those values along the domain.
The key advantage is that the Lebesgue integral is defined for every nonnegative measurable function (and for every measurable function with $\int |f| \, d\mu < \infty$) and supports the convergence theorems (Monotone Convergence, Dominated Convergence, Fatou's Lemma) that are the workhorses of modern analysis. Every Riemann-integrable function is Lebesgue integrable with the same value, but the Lebesgue theory handles vastly more functions and yields complete function spaces (the $L^p$ spaces are [Banach spaces](/page/Banach%20Space)).
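The range-partition idea can be sketched numerically for $f(x) = x^2$ on $[0, 1]$. This example relies on an assumption specific to this $f$: the level set $\{x : y_{j-1} \le x^2 < y_j\}$ is the interval $[\sqrt{y_{j-1}}, \sqrt{y_j})$, so its Lebesgue measure has a closed form. The function name `lebesgue_sums` and the choice of equally spaced levels are illustrative.

```python
import math


def lebesgue_sums(m):
    """Range-partition sums for f(x) = x^2 on [0, 1] with m equal levels in [0, 1].

    The level set {x : y_prev <= x^2 < y_next} is [sqrt(y_prev), sqrt(y_next)),
    whose Lebesgue measure is the difference of the square roots.
    The single point x = 1 is left out; it has measure zero.
    """
    lower = upper = 0.0
    for j in range(1, m + 1):
        y_prev, y_next = (j - 1) / m, j / m
        size = math.sqrt(y_next) - math.sqrt(y_prev)  # measure of the level set
        lower += y_prev * size  # simple function below f
        upper += y_next * size  # simple function above f
    return lower, upper


for m in (10, 100, 1000):
    print(m, lebesgue_sums(m))
```

Both sums are integrals of simple functions, one below $f$ and one above it, and they differ by exactly $1/m$ times the total measure of the domain, so they converge to the common value $1/3$.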
[example:The Dirichlet Function]
The function $\mathbb{1}_\mathbb{Q}: [0, 1] \to \{0, 1\}$ (equal to $1$ on rationals, $0$ on irrationals) is not Riemann integrable: every subinterval contains both rationals and irrationals, so $U(f, P) = 1$ and $L(f, P) = 0$ for every partition $P$.
However, $\mathbb{1}_\mathbb{Q}$ is Lebesgue integrable: $\mathbb{Q} \cap [0, 1]$ is countable, hence has Lebesgue measure $\mathcal{L}^1(\mathbb{Q} \cap [0,1]) = 0$, and $[0,1] \setminus \mathbb{Q}$ has measure $1$. Therefore $\int_0^1 \mathbb{1}_\mathbb{Q} \, d\mathcal{L}^1 = 1 \cdot 0 + 0 \cdot 1 = 0$.
[/example]
## Properties of the Integral
Regardless of which construction is used, the integral satisfies a collection of fundamental properties that follow from linearity and monotonicity. These properties are the tools used in virtually every computation and estimate involving integrals.
### Linearity
The integral is linear: $\int (\alpha f + \beta g) \, d\mu = \alpha \int f \, d\mu + \beta \int g \, d\mu$ for all scalars $\alpha, \beta$ and integrable functions $f, g$. This is immediate from the definition for simple functions and extends by approximation to all integrable functions. Linearity is what makes the integral a functional-analytic object — it defines a bounded linear functional on $L^1$.
### Monotonicity and the Triangle Inequality
If $f \le g$ a.e., then $\int f \, d\mu \le \int g \, d\mu$ (apply linearity to $g - f \ge 0$). The absolute value satisfies the **integral triangle inequality**:
\begin{align*}
\left|\int_X f \, d\mu\right| \le \int_X |f| \, d\mu.
\end{align*}
This follows from $-|f| \le f \le |f|$ and monotonicity. The integral triangle inequality is the foundation of all norm estimates in $L^p$ spaces.
### Additivity Over Domains
If $A$ and $B$ are disjoint measurable sets, then $\int_{A \cup B} f \, d\mu = \int_A f \, d\mu + \int_B f \, d\mu$. More generally, if $X = \bigsqcup_{n=1}^\infty E_n$ is a countable disjoint decomposition, then $\int_X f \, d\mu = \sum_{n=1}^\infty \int_{E_n} f \, d\mu$ (for nonnegative $f$, this follows from the [Monotone Convergence Theorem](/theorems/509)).
### Null Sets and "Almost Everywhere"
If $f = g$ a.e. (i.e., $\mu(\{x : f(x) \neq g(x)\}) = 0$), then $\int f \, d\mu = \int g \, d\mu$. The integral is blind to modifications on null sets. This is the reason that elements of $L^p$ spaces are equivalence classes of functions (two functions that agree a.e. are identified), and it is the reason that pointwise values of an $L^p$ function are meaningless — only the integral against [test functions](/page/Test%20Function) has invariant meaning.
[example:Modification On A Null Set]
Define $f: [0, 1] \to \mathbb{R}$ by $f(x) = x$ and $g: [0, 1] \to \mathbb{R}$ by $g(x) = x$ for $x \neq 1/2$ and $g(1/2) = 1000$. Then $f$ and $g$ differ only at the single point $1/2$, which has Lebesgue measure zero, so $\int_0^1 f \, d\mathcal{L}^1 = \int_0^1 g \, d\mathcal{L}^1 = 1/2$. The integral does not "see" the modification.
[/example]
## The Fundamental Theorem of Calculus
The deepest result in one-dimensional integration is the Fundamental Theorem of Calculus, which establishes that differentiation and integration are inverse operations. It is the theorem that makes the evaluation of integrals practical — instead of computing limits of Riemann sums, one finds an antiderivative and evaluates at the endpoints.
[quotetheorem:632]
Part I says that the "accumulation function" $F(x) = \int_a^x f(t) \, d\mathcal{L}^1(t)$ is an antiderivative of $f$. This is the precise sense in which integration undoes differentiation: if you integrate a function and then differentiate the result, you recover the original function.
Part II says that any antiderivative can be used to evaluate a definite integral: $\int_a^b f \, d\mathcal{L}^1 = G(b) - G(a)$. This is the evaluation formula that reduces integration to the algebraic problem of finding antiderivatives — the basis of all "by inspection" integral computations in calculus.
The FTC extends to the Lebesgue setting with modifications. If $f \in L^1([a, b])$, then $F(x) = \int_a^x f \, d\mathcal{L}^1$ is absolutely continuous and $F'(x) = f(x)$ for a.e. $x$ (the [Lebesgue differentiation theorem](/theorems/74)). Conversely, if $G$ is absolutely continuous on $[a, b]$, then $G' \in L^1$ and $G(b) - G(a) = \int_a^b G' \, d\mathcal{L}^1$. Absolute continuity replaces differentiability as the correct regularity condition in the Lebesgue theory.
[example:Evaluation Via The FTC]
To compute $\int_0^1 x^2 \, d\mathcal{L}^1$: the function $G(x) = x^3/3$ satisfies $G'(x) = x^2$, so by Part II:
\begin{align*}
\int_0^1 x^2 \, d\mathcal{L}^1 = G(1) - G(0) = \frac{1}{3} - 0 = \frac{1}{3}.
\end{align*}
Without the FTC, one would need to compute this as $\lim_{n \to \infty} \sum_{k=1}^n (k/n)^2 \cdot (1/n) = \lim_{n \to \infty} \frac{n(n+1)(2n+1)}{6n^3} = 1/3$ — a much harder calculation.
[/example]
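The Riemann-sum computation in the example above can be checked with exact rational arithmetic. This is a small sketch; the function names are illustrative.

```python
from fractions import Fraction


def right_riemann_sum(n):
    """Right-endpoint Riemann sum of x^2 on [0, 1] with n equal subintervals."""
    return sum(Fraction(k, n) ** 2 * Fraction(1, n) for k in range(1, n + 1))


def closed_form(n):
    """The same sum via 1^2 + ... + n^2 = n(n+1)(2n+1)/6."""
    return Fraction(n * (n + 1) * (2 * n + 1), 6 * n**3)


for n in (1, 10, 100, 1000):
    s = right_riemann_sum(n)
    assert s == closed_form(n)
    # The gap to 1/3 is exactly 1/(2n) + 1/(6n^2), so it shrinks like 1/(2n).
    print(n, s, float(s - Fraction(1, 3)))
```

The slow $1/(2n)$ convergence of the sums makes the contrast with the one-line antiderivative evaluation concrete.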
## Convergence Theorems
The most important advantage of the Lebesgue integral over the Riemann integral is the availability of powerful convergence theorems — conditions under which $\lim \int f_n = \int \lim f_n$. These theorems are the workhorses of modern analysis; they justify the interchange of limits and integrals that is needed in virtually every existence proof, every passage to the limit in PDE theory, and every approximation argument.
### The Monotone Convergence Theorem (for Integrals)
If $0 \le f_1 \le f_2 \le \cdots$ is an increasing sequence of nonnegative measurable functions, then
\begin{align*}
\lim_{n \to \infty} \int_X f_n \, d\mu = \int_X \lim_{n \to \infty} f_n \, d\mu.
\end{align*}
No additional hypotheses are needed — monotonicity and nonnegativity suffice. This is the theorem that makes $\sigma$-additivity of the integral automatic (apply it to partial sums of a [series](/page/Series) of nonnegative functions) and that underpins the construction of the Lebesgue integral itself.
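A standard illustration, chosen here as an example: on $(0, 1]$ let $f_n(x) = \min(n, x^{-1/2})$. The sequence increases pointwise to the unbounded function $x^{-1/2}$, and since $x^{-1/2} \ge n$ exactly when $x \le 1/n^2$,
\begin{align*}
\int_0^1 f_n \, d\mathcal{L}^1 = \int_0^{1/n^2} n \, dx + \int_{1/n^2}^1 x^{-1/2} \, dx = \frac{1}{n} + \left(2 - \frac{2}{n}\right) = 2 - \frac{1}{n} \;\longrightarrow\; 2 = \int_0^1 x^{-1/2} \, d\mathcal{L}^1.
\end{align*}
The integrals of the bounded truncations increase to the integral of the limit, as the theorem predicts.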
### Fatou's Lemma
For any sequence of nonnegative measurable functions $f_n \ge 0$:
\begin{align*}
\int_X \liminf_{n \to \infty} f_n \, d\mu \le \liminf_{n \to \infty} \int_X f_n \, d\mu.
\end{align*}
The inequality can be strict: mass can "escape to infinity" or "concentrate on a null set." Fatou's lemma is the failsafe: even when the full interchange of limit and integral fails, the integral of the limit inferior is bounded above by the limit inferior of the integrals.
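Two standard examples, given here as illustrations, show both mechanisms. On $[0, 1]$, the functions $f_n = n \, \mathbb{1}_{(0, 1/n)}$ concentrate their mass near the single point $0$; on $\mathbb{R}$, the functions $g_n = \mathbb{1}_{[n, n+1]}$ carry their mass off to infinity. Both sequences converge to $0$ at every point, and every term has integral $1$, so
\begin{align*}
\int \liminf_{n \to \infty} f_n \, d\mathcal{L}^1 = 0 < 1 = \liminf_{n \to \infty} \int f_n \, d\mathcal{L}^1,
\end{align*}
and the same strict inequality holds for $g_n$.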
### The Dominated Convergence Theorem
If $f_n \to f$ a.e. and there exists an integrable function $g$ with $|f_n| \le g$ a.e. for all $n$, then
\begin{align*}
\lim_{n \to \infty} \int_X f_n \, d\mu = \int_X f \, d\mu.
\end{align*}
The dominating function $g$ is the price of full interchange: without it, mass can escape (as the examples on the [Limit](/page/Limit) page show, $\int f_n \not\to \int f$ for $f_n(x) = nx(1-x^2)^n$ on $[0, 1]$). The Dominated Convergence Theorem is arguably the single most-used result in analysis: it justifies differentiation under the integral sign, passage to limits in weak formulations of PDEs, and virtually every "approximate by nice functions, pass to the limit" argument in functional analysis.
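The failure for $f_n(x) = nx(1-x^2)^n$ on $[0, 1]$ can be observed numerically. The substitution $u = 1 - x^2$ gives the exact value $\int_0^1 f_n = n / (2(n+1))$, which tends to $1/2$, while $f_n(x) \to 0$ for every fixed $x \in [0, 1]$. The sketch below uses a plain midpoint rule; the helper names and the step count are illustrative choices.

```python
def f(n, x):
    """f_n(x) = n * x * (1 - x^2)^n on [0, 1]."""
    return n * x * (1.0 - x * x) ** n


def midpoint_integral(g, a, b, steps=20000):
    """Midpoint rule with equal subintervals."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))


for n in (1, 10, 100, 1000):
    approx = midpoint_integral(lambda x: f(n, x), 0.0, 1.0)
    exact = n / (2 * (n + 1))  # from the substitution u = 1 - x^2
    # The last column samples the pointwise limit at x = 0.5, which tends to 0.
    print(n, approx, exact, f(n, 0.5))
```

The integrals approach $1/2$ even though the pointwise limit is the zero function: the mass concentrates in a shrinking neighbourhood of $0$, and no single integrable function dominates the whole sequence.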
[example:Differentiation Under The Integral Sign]
Let $F(\alpha) = \int_0^\infty e^{-\alpha x} \sin(x) / x \, d\mathcal{L}^1(x)$ for $\alpha > 0$. To compute $F'(\alpha)$, differentiate under the integral:
\begin{align*}
F'(\alpha) = \int_0^\infty \frac{\partial}{\partial \alpha}\left(\frac{e^{-\alpha x} \sin x}{x}\right) d\mathcal{L}^1 = -\int_0^\infty e^{-\alpha x} \sin x \, d\mathcal{L}^1.
\end{align*}
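The interchange is justified by the Dominated Convergence Theorem: by the mean value theorem and the bound $|\sin x / x| \le 1$, the difference quotient $\frac{e^{-(\alpha + h)x} - e^{-\alpha x}}{h} \cdot \frac{\sin x}{x}$ is dominated in absolute value by $x e^{-(\alpha - \delta)x}$ for $0 < |h| < \delta < \alpha$, and this bound is integrable on $(0, \infty)$. The resulting integral is $-\int_0^\infty e^{-\alpha x} \sin x \, d\mathcal{L}^1 = -1/(1 + \alpha^2)$ (by [integration by parts](/theorems/210) twice), giving $F(\alpha) = \pi/2 - \arctan(\alpha)$ (using $F(\alpha) \to 0$ as $\alpha \to \infty$). Since $\sin x / x$ is not Lebesgue integrable on $(0, \infty)$, the value at $\alpha = 0$ must be read as an improper integral. Letting $\alpha \to 0^+$ (a step that needs a separate argument, because dominated convergence does not apply there) gives $\lim_{R \to \infty} \int_0^R \frac{\sin x}{x} \, d\mathcal{L}^1 = \pi/2$.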
[/example]
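The closed form $F(\alpha) = \pi/2 - \arctan(\alpha)$ can be sanity-checked numerically for a few values of $\alpha > 0$. The sketch below uses a composite Simpson rule on a truncated interval. The cutoff at $x = 60$ and the number of subintervals are assumptions that are adequate for $\alpha$ of order $1$, where the exponential factor makes the tail negligible; they are not suitable for $\alpha$ close to $0$.

```python
import math


def sinc(x):
    """sin(x) / x, extended by its limit 1 at x = 0."""
    return 1.0 if x == 0.0 else math.sin(x) / x


def simpson(g, a, b, n):
    """Composite Simpson rule with an even number n of subintervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3


def F(alpha, cutoff=60.0, n=6000):
    """Approximate F(alpha) by integrating over [0, cutoff] only."""
    return simpson(lambda x: math.exp(-alpha * x) * sinc(x), 0.0, cutoff, n)


for alpha in (0.5, 1.0, 2.0):
    print(alpha, F(alpha), math.pi / 2 - math.atan(alpha))
```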
## Techniques of Integration
The Fundamental Theorem reduces integration to finding antiderivatives, but antiderivatives are often harder to find than derivatives. The main techniques — substitution and integration by parts — are the integral versions of the chain rule and the product rule.
### Substitution (Change of Variables)
[quotetheorem:211]
Substitution transforms the integral $\int_a^b f(x) \, dx$ into $\int_\alpha^\beta f(g(t)) g'(t) \, dt$ by the change of variable $x = g(t)$, where $g(\alpha) = a$ and $g(\beta) = b$. The formula is the integral analogue of the chain rule: if $F' = f$, then $(F \circ g)' = f(g) \cdot g'$, so $\int f(g(t))g'(t) \, dt = F(g(t)) + C$.
In higher dimensions, the change of variables formula replaces $g'(t)$ with the absolute value of the Jacobian determinant: if $g$ is a $C^1$ diffeomorphism from an open set $U \subseteq \mathbb{R}^n$ onto $g(U)$, then $\int_{g(U)} f(x) \, d\mathcal{L}^n(x) = \int_U f(g(u)) |\det Dg(u)| \, d\mathcal{L}^n(u)$, where $Dg$ is the derivative matrix. This is the foundation of integration in polar, cylindrical, and spherical coordinates.
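A standard worked example, included here as an illustration: polar coordinates $g(r, \theta) = (r \cos\theta, r \sin\theta)$ have $|\det Dg(r, \theta)| = r$ and map $(0, \infty) \times (0, 2\pi)$ onto the plane minus a null set (the nonnegative $x_1$-axis), so
\begin{align*}
\int_{\mathbb{R}^2} e^{-|x|^2} \, d\mathcal{L}^2(x) = \int_0^{2\pi} \int_0^\infty e^{-r^2} \, r \, dr \, d\theta = 2\pi \left[-\tfrac{1}{2} e^{-r^2}\right]_0^\infty = \pi.
\end{align*}
Since the left-hand side factors as $\left(\int_{\mathbb{R}} e^{-t^2} \, dt\right)^2$, this yields the Gaussian integral $\int_{\mathbb{R}} e^{-t^2} \, dt = \sqrt{\pi}$.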
### Integration by Parts
[quotetheorem:210]
Integration by parts transforms $\int f'g$ into $[fg] - \int fg'$ — it "moves" the derivative from one factor to another. This is the integral analogue of the product rule. In the Lebesgue setting, integration by parts extends to [Sobolev functions](/page/Sobolev%20Space): it is the definition of the weak derivative ($\int u \partial_i \phi = -\int (\partial_i u) \phi$ for $\phi \in C_c^\infty$), and it is the mechanism by which PDE theory converts differential equations into variational problems.
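As a short worked example (an illustrative choice), take $f(x) = e^x$ and $g(x) = x$ on $[0, 1]$, so that $f'g = x e^x$ and $fg' = e^x$:
\begin{align*}
\int_0^1 x e^x \, dx = \left[x e^x\right]_0^1 - \int_0^1 e^x \, dx = e - (e - 1) = 1.
\end{align*}
The derivative has moved from $e^x$ onto $x$, leaving an integral that can be evaluated directly.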
## Integration in Higher Dimensions
The one-dimensional integral $\int_a^b f(x) \, dx$ extends to multiple dimensions in two fundamentally different ways: iterated integration (computing a multi-dimensional integral as a sequence of one-dimensional integrals) and direct integration against higher-dimensional measures.
### Iterated Integrals and Fubini's Theorem
The most important result in multi-dimensional integration is Fubini's theorem (and its companion, the Tonelli theorem): under appropriate conditions, a double integral can be computed as an iterated integral, and the order of integration can be reversed.
For a nonnegative measurable function $f: X \times Y \to [0, \infty]$ on a product of $\sigma$-finite measure spaces $(X, \mu)$ and $(Y, \nu)$ (Tonelli): $\int_{X \times Y} f \, d(\mu \times \nu) = \int_X \left(\int_Y f(x, y) \, d\nu(y)\right) d\mu(x) = \int_Y \left(\int_X f(x, y) \, d\mu(x)\right) d\nu(y)$.
For an integrable function $f \in L^1(X \times Y, \mu \times \nu)$ (Fubini): the same equalities hold, and the inner integrals are defined for a.e. value of the outer variable.
The Tonelli theorem is used to verify integrability (compute the iterated integral of $|f|$ and check it is finite), while Fubini is used to compute (switch the order of integration to find a simpler form).
[example:Computing A Double Integral Via Fubini]
Compute $\int_0^1 \int_0^1 \frac{x - y}{(x + y)^3} \, d\mathcal{L}^1(y) \, d\mathcal{L}^1(x)$. Writing $x - y = 2x - (x + y)$, the inner integral (in $y$) is, for $x > 0$,
\begin{align*}
\int_0^1 \frac{x - y}{(x+y)^3} \, dy = \left[-\frac{x}{(x+y)^2} + \frac{1}{x+y}\right]_0^1 = \left[\frac{y}{(x+y)^2}\right]_0^1 = \frac{1}{(1+x)^2},
\end{align*}
so the iterated integral equals $\int_0^1 (1+x)^{-2} \, dx = 1/2$. Because the integrand changes sign when $x$ and $y$ are swapped, computing in the *other* order gives $-1/2$: the iterated integrals differ in sign. This does not contradict Fubini because $\int_0^1 \int_0^1 |f| = \infty$: the function is not in $L^1([0,1]^2)$, so Fubini does not apply. The order of integration matters when absolute integrability fails.
[/example]
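The two iterated integrals of $f(x, y) = (x - y)/(x + y)^3$ over $[0, 1]^2$, which equal $1/2$ and $-1/2$, can be reproduced numerically. The sketch below uses the closed form $\int_0^1 f(x, y) \, dy = 1/(1+x)^2$ for the inner integral (and its negative with the roles of $x$ and $y$ swapped), because direct two-dimensional quadrature is unreliable near the singularity at the origin. The helper names are illustrative.

```python
def f(x, y):
    return (x - y) / (x + y) ** 3


def midpoint(g, a, b, steps=2000):
    """Midpoint rule; the sample points never touch the endpoints a and b."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))


def inner_over_y(x):
    """Closed form of the integral of f(x, y) over y in [0, 1], valid for x > 0."""
    return 1.0 / (1.0 + x) ** 2


def inner_over_x(y):
    """Closed form of the integral of f(x, y) over x in [0, 1], valid for y > 0."""
    return -1.0 / (1.0 + y) ** 2


# Sanity check of the closed form at one sample point away from the singularity.
check = midpoint(lambda y: f(0.5, y), 0.0, 1.0)
print(check, inner_over_y(0.5))

y_then_x = midpoint(inner_over_y, 0.0, 1.0)  # integrate in y first, then x
x_then_y = midpoint(inner_over_x, 0.0, 1.0)  # integrate in x first, then y
print(y_then_x, x_then_y)
```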
### Integration on Manifolds and Differential Forms
On a smooth manifold $M$, the correct objects to integrate are *differential forms*, not functions. A $k$-form on an $n$-dimensional manifold is a smooth section of the $k$-th exterior power of the cotangent bundle, and its integral over a $k$-dimensional oriented submanifold is defined via pullback to $\mathbb{R}^k$. The change-of-variables formula becomes the pullback formula: for an orientation-preserving diffeomorphism $\phi: M \to N$ and a form $\omega$ on $N$, $\int_M \phi^* \omega = \int_N \omega$. The fundamental theorem of calculus generalises to **Stokes' theorem**: $\int_M d\omega = \int_{\partial M} \omega$, unifying the classical theorems of Green, Gauss, and Stokes.
## Integration and Function Spaces
The integral is the tool that builds the function spaces of analysis. The $L^p$ spaces, the [Sobolev spaces](/page/Sobolev%20Space), and the spaces of [distributions](/page/Distribution) are all defined via integrals, and their properties (completeness, duality, embedding) are proved using the convergence theorems.
### $L^p$ Spaces
For $1 \le p < \infty$, the space $L^p(X, \mu)$ consists of measurable functions with $\int |f|^p \, d\mu < \infty$, equipped with the norm $\|f\|_{L^p} = (\int |f|^p \, d\mu)^{1/p}$. For $p = \infty$, $\|f\|_{L^\infty} = \operatorname{ess\,sup} |f|$. These are [Banach spaces](/page/Banach%20Space) (the Riesz-Fischer theorem), and $L^2$ is a [Hilbert space](/page/Hilbert%20Space) with inner product $(f, g) = \int f \overline{g} \, d\mu$.
The $L^p$ spaces are the natural setting for harmonic analysis ([Fourier transforms](/page/Fourier%20Transform) map $L^1 \to L^\infty$ and $L^2 \to L^2$), for probability theory ($L^p$ integrability of random variables), and for PDE theory (weak solutions live in Sobolev spaces, which are subspaces of $L^p$).
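Membership in $L^p$ depends on $p$. As an illustrative example, take $f(x) = x^{-1/3}$ on $(0, 1)$ with Lebesgue measure. For $1 \le p < 3$,
\begin{align*}
\|f\|_{L^p}^p = \int_0^1 x^{-p/3} \, dx = \frac{3}{3 - p} < \infty,
\end{align*}
while the integral diverges for $p \ge 3$. Thus $f \in L^2((0, 1))$ with $\|f\|_{L^2} = \sqrt{3}$, but $f \notin L^3((0, 1))$.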
### The Integral as a Linear Functional
The integral defines a bounded linear functional on $L^1$: the map $f \mapsto \int f \, d\mu$ belongs to $(L^1)^*$. More generally, the [Riesz representation theorem](/theorems/221) identifies the dual of $L^p$ (for $1 < p < \infty$, and also for $p = 1$ when $\mu$ is $\sigma$-finite) with $L^q$ (where $1/p + 1/q = 1$, with $q = \infty$ when $p = 1$): every bounded linear functional on $L^p$ has the form $\Lambda(f) = \int fg \, d\mu$ for a unique $g \in L^q$.
This dual pairing $\langle f, g \rangle = \int fg \, d\mu$ is the foundation of weak formulations in PDE theory, the definition of regular distributions ($T_f(\phi) = \int f \phi$ for a locally integrable function $f$), and the duality theory of Banach spaces.
### Weak Derivatives and Sobolev Spaces
Integration by parts, taken as a definition in place of a theorem, produces the notion of **weak derivative**: $v$ is the [weak derivative](/page/Weak%20Derivative) of $u$ (in the $i$-th coordinate direction) if $\int u \partial_i \phi \, d\mathcal{L}^n = -\int v \phi \, d\mathcal{L}^n$ for all $\phi \in C_c^\infty$. This integral identity is what defines [Sobolev spaces](/page/Sobolev%20Space) (the natural domains for elliptic PDE operators), and it is the mechanism by which differentiation is extended from smooth functions to suitable $L^p$ functions.
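The defining identity can be tested numerically in one dimension for $u(x) = |x|$, whose weak derivative is $v(x) = \operatorname{sign}(x)$ even though $u$ is not differentiable at $0$. The sketch below checks $\int u \phi' = -\int v \phi$ for a single smooth bump $\phi$. The particular bump (centred at $0.3$ so that the check is not trivially symmetric), the midpoint rule, and the step count are all assumptions made for illustration. One test function does not prove the identity; it only illustrates it.

```python
import math


def phi(x, center=0.3):
    """Smooth bump supported on (center - 1, center + 1); a hypothetical test function."""
    t = x - center
    s = 1.0 - t * t
    return math.exp(-1.0 / s) if s > 0.0 else 0.0


def dphi(x, center=0.3):
    """Classical derivative of phi, from the chain rule."""
    value = phi(x, center)
    if value == 0.0:
        return 0.0
    t = x - center
    s = 1.0 - t * t
    return value * (-2.0 * t) / (s * s)


def midpoint(g, a, b, steps=2000):
    """Midpoint rule with equal subintervals."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))


def sign(x):
    return (x > 0) - (x < 0)


a, b = -0.7, 1.3  # the support of phi
lhs = midpoint(lambda x: abs(x) * dphi(x), a, b)    # integral of u * phi'
rhs = -midpoint(lambda x: sign(x) * phi(x), a, b)   # minus the integral of v * phi
print(lhs, rhs)
```

The two printed values agree up to quadrature error, which is the numerical counterpart of the statement that $\operatorname{sign}$ is the weak derivative of $|x|$.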
## References
- Rudin, W., *Real and Complex Analysis* (3rd ed., 1987).
- Folland, G. B., *Real Analysis: Modern Techniques and Their Applications* (2nd ed., 1999).
- Stein, E. M. and Shakarchi, R., *Real Analysis: Measure Theory, Integration, and Hilbert Spaces* (2005).
- Bartle, R. G., *The Elements of Integration and Lebesgue Measure* (1995).