Given a [measure space](/page/Measure%20Space) $(X, \mathcal{A}, \mu)$ and a second measure $\nu$ on the same $\sigma$-algebra $\mathcal{A}$, one frequently needs to express integrals with respect to $\nu$ in terms of integrals with respect to $\mu$. For instance, if $\mu = \mathcal{L}^1$ is Lebesgue measure on $\mathbb{R}$ and $\nu$ is a probability measure with a density, we write $d\nu = f \, d\mu$ for some nonnegative measurable function $f: \mathbb{R} \to [0, \infty)$. But when does such a representation exist? And when it does, in what sense is the "density" $f$ unique?
The answer is governed by a single structural relationship between measures: [absolute continuity](/page/Absolutely%20Continuous%20Measures). The Radon--Nikodym theorem asserts that absolute continuity $\nu \ll \mu$ is not only necessary but, under a $\sigma$-finiteness hypothesis, also *sufficient* for the existence of a density. This result is one of the central pillars of modern analysis. It connects abstract measure theory to concrete computation: it underpins conditional expectation in probability, duality of [Lp spaces](/page/L%5Ep%20Spaces), the change-of-variables formula, and the classical Fundamental Theorem of Calculus.
To appreciate why the theorem requires care, consider a first attempt that fails.
[example: Failure Without Absolute Continuity]
Let $\mu = \mathcal{L}^1$ be Lebesgue measure on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$, and let $\nu = \delta_0$ be the Dirac measure concentrated at the origin, defined by $\delta_0(A) = \mathbb{1}_A(0)$ for each Borel set $A \in \mathcal{B}(\mathbb{R})$. Suppose, seeking a contradiction, that there exists a measurable function $f: \mathbb{R} \to [0, \infty)$ satisfying $\delta_0(A) = \int_A f \, d\mathcal{L}^1$ for every Borel set $A$.
Taking $A = \{0\}$, we need $\int_{\{0\}} f \, d\mathcal{L}^1 = 1$. But $\mathcal{L}^1(\{0\}) = 0$, so $\int_{\{0\}} f \, d\mathcal{L}^1 = 0$ regardless of the choice of $f$. This is a contradiction. No density can exist.
The obstruction is structural: $\delta_0$ assigns positive mass to the set $\{0\}$, which has $\mathcal{L}^1$-measure zero. In other words, $\delta_0$ is *not* absolutely continuous with respect to $\mathcal{L}^1$. The Radon--Nikodym theorem identifies absolute continuity as the precise condition that prevents this failure.
[/example]
## Motivation
[motivation]
### Why density representations matter
Throughout analysis and probability, one repeatedly encounters pairs of measures on the same measurable space and needs to pass between them. Three representative situations illustrate the scope of the problem.
**Change of variables.** If $T: \mathbb{R}^n \to \mathbb{R}^n$ is a $C^1$ diffeomorphism and $\mu = \mathcal{L}^n$, then the pushforward measure $\nu = T_\# \mu$ satisfies $\nu(A) = \mathcal{L}^n(T^{-1}(A))$. Computing $\int f \, d\nu$ requires expressing $\nu$ in terms of $\mathcal{L}^n$, and the classical change-of-variables formula does exactly that: $\frac{d\nu}{d\mathcal{L}^n}(y) = |\det D(T^{-1})(y)|$. The Jacobian factor is a density.
**Probability.** Given a random variable $X: (\Omega, \mathcal{F}, \mathbb{P}) \to (\mathbb{R}, \mathcal{B}(\mathbb{R}))$, its distribution $\mu_X = \mathbb{P} \circ X^{-1}$ is a Borel probability measure. The question "does $X$ have a density?" is precisely the question of whether $\mu_X \ll \mathcal{L}^1$.
**Duality.** Identifying the dual space of $L^p(X, \mathcal{A}, \mu)$ for $1 \le p < \infty$ requires showing that every bounded linear functional on $L^p$ can be represented as integration against some $g \in L^q$. The key step is constructing $g$ as a Radon--Nikodym derivative.
### What "absolute continuity" captures
The naive hope — that any two measures on the same $\sigma$-algebra admit a density — fails spectacularly, as the Dirac measure example above shows. The failure occurs whenever $\nu$ sees sets that $\mu$ considers negligible. Absolute continuity $\nu \ll \mu$ is the condition that $\nu$ is "blind to $\mu$-null sets": whenever $\mu(A) = 0$, we also have $\nu(A) = 0$. This is both necessary and (under $\sigma$-finiteness) sufficient for $d\nu = f \, d\mu$.
### Why $\sigma$-finiteness cannot be dropped
The theorem fails without $\sigma$-finiteness on $\mu$. Consider the simplest possible example: $X = \{0\}$ (a single point), $\mathcal{A} = \{\varnothing, \{0\}\}$, with $\mu$ defined by $\mu(\{0\}) = \infty$ and $\nu(\{0\}) = 1$. Note that $\mu$ is *not* $\sigma$-finite: the only cover of $X$ is $\{0\}$ itself, which has infinite $\mu$-measure. Yet $\nu \ll \mu$ holds (the only $\mu$-null set is $\varnothing$, and $\nu(\varnothing) = 0$). Any candidate density $f$ would need $\int_{\{0\}} f \, d\mu = f(0) \cdot \mu(\{0\}) = f(0) \cdot \infty$. If $f(0) > 0$, this integral is $+\infty \ne 1$; if $f(0) = 0$, the integral is $0 \ne 1$. No density exists, despite absolute continuity holding.
A more substantial example arises on uncountable spaces. Let $X = [0,1]$ with $\mathcal{A} = \mathcal{B}([0,1])$, let $\mu$ be the counting measure on $[0,1]$ (which assigns to each set its cardinality), and let $\nu = \mathcal{L}^1|_{[0,1]}$. Then $\nu \ll \mu$ (if $\mu(A) = 0$ then $A = \varnothing$, so $\nu(A) = 0$). But $\mu$ is not $\sigma$-finite: $[0,1]$ cannot be covered by countably many sets of finite counting measure, since each such set is finite and a countable union of finite sets is countable. Any candidate density $f$ would need $\int_A f \, d\mu = \sum_{x \in A} f(x) = \nu(A) = \mathcal{L}^1(A)$. Taking $A = \{x\}$ gives $f(x) = \mathcal{L}^1(\{x\}) = 0$ for every $x$, forcing $f \equiv 0$, which cannot reproduce $\nu([0,1]) = 1$.
[/motivation]
## Definition
The central object is the Radon--Nikodym derivative, which is defined only when absolute continuity holds.
[definition: Radon-Nikodym Derivative]
Let $(X, \mathcal{A})$ be a measurable space, and let $\mu$ and $\nu$ be $\sigma$-finite measures on $\mathcal{A}$ with $\nu \ll \mu$ (that is, $\nu$ is [absolutely continuous](/page/Absolutely%20Continuous%20Measures) with respect to $\mu$). The **Radon--Nikodym derivative** of $\nu$ with respect to $\mu$ is the $\mu$-a.e. unique $\mathcal{A}$-measurable function
\begin{align*}
\frac{d\nu}{d\mu}: X \to [0, \infty)
\end{align*}
satisfying
\begin{align*}
\nu(A) = \int_A \frac{d\nu}{d\mu} \, d\mu \quad \text{for every } A \in \mathcal{A}.
\end{align*}
The existence and $\mu$-a.e. uniqueness of this function is the content of the Radon--Nikodym theorem stated below.
When $\nu$ is a $\sigma$-finite signed measure with $|\nu| \ll \mu$, the Radon--Nikodym derivative takes values in $\mathbb{R}$ rather than $[0, \infty)$, and is the $\mu$-a.e. unique measurable function satisfying the same integral identity.
[/definition]
The notation $\frac{d\nu}{d\mu}$ is suggestive: it behaves like a "ratio" of two measures, much as the classical derivative $\frac{df}{dx}$ is a ratio of infinitesimal increments. The analogy is not merely cosmetic: the chain rule and reciprocal rule for Radon--Nikodym derivatives mirror their classical counterparts, as the section on the calculus of Radon--Nikodym derivatives shows.
An equivalent and frequently useful formulation is the **change-of-measure identity**: if $f := \frac{d\nu}{d\mu}$, then for every $\mathcal{A}$-measurable function $g: X \to [0, \infty]$,
\begin{align*}
\int_X g \, d\nu = \int_X g \cdot f \, d\mu.
\end{align*}
This identity is what makes the Radon--Nikodym derivative a computational tool, not merely an existence statement. It says that integrating against $\nu$ is the same as integrating against $\mu$ with a "reweighting factor" $f$.
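On a finite space the identity can be checked by hand, since the density is simply the ratio of point masses. The following is a minimal sketch under that assumption; the four-point space, the masses, and the test function `g` are hypothetical choices made for illustration only.

```python
from fractions import Fraction

# Hypothetical four-point space; every point has positive mu-mass, so nu << mu.
mu = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 8), 3: Fraction(1, 8)}
nu = {0: Fraction(1, 4), 1: Fraction(1, 4), 2: Fraction(1, 2), 3: Fraction(0)}

# On a finite space the density is the ratio of point masses.
f = {x: nu[x] / mu[x] for x in mu}

def integrate(h, measure):
    """Integral of the function h against a measure given by point masses."""
    return sum(h(x) * mass for x, mass in measure.items())

def g(x):
    """An arbitrary nonnegative test function."""
    return x * x + 1

lhs = integrate(g, nu)                        # integral of g with respect to nu
rhs = integrate(lambda x: g(x) * f[x], mu)    # integral of g * f with respect to mu
print(lhs == rhs)
```

Exact rational arithmetic is used so that the two sides can be compared with `==` rather than up to rounding.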
## The Radon--Nikodym Theorem
The fundamental question is: does the Radon--Nikodym derivative actually exist? And if so, under what conditions? Two hypotheses are needed — absolute continuity (to prevent the Dirac measure obstruction) and $\sigma$-finiteness (to prevent the counting measure obstruction). Together, they are sufficient.
[quotetheorem:1247]
Several aspects of this theorem warrant careful discussion.
**Necessity of absolute continuity.** The hypothesis $\nu \ll \mu$ is indispensable. If $f$ is any nonnegative measurable function and $\nu(A) = \int_A f \, d\mu$, then whenever $\mu(A) = 0$, the integral $\int_A f \, d\mu = 0$, so $\nu(A) = 0$. Thus the existence of a density *implies* absolute continuity. The theorem's content is the converse: absolute continuity *implies* the existence of a density.
**Necessity of $\sigma$-finiteness.** As demonstrated in the motivation, the counting measure on $[0,1]$ provides a concrete example where $\nu \ll \mu$ holds but no density exists, because $\mu$ fails to be $\sigma$-finite. The $\sigma$-finiteness of $\nu$ can be relaxed: if $\mu$ is $\sigma$-finite and $\nu \ll \mu$ is an arbitrary measure, a density still exists provided it is allowed to take values in $[0, \infty]$, and it is finite $\mu$-a.e. exactly when $\nu$ is $\sigma$-finite. The $\sigma$-finiteness of $\mu$, by contrast, is essential. There are extensions to non-$\sigma$-finite settings (e.g., using the concept of a *localizable* measure), but these require substantially different formulations.
**Uniqueness is only $\mu$-a.e.** The density $f$ is determined only up to modification on $\mu$-null sets. On any $\mu$-null set $N$, one can redefine $f$ arbitrarily without affecting any integral $\int_A f \, d\mu$. This is consistent with the general principle that $L^p$ functions are equivalence classes modulo $\mu$-null sets.
**The signed measure extension.** The theorem extends to signed measures: if $\nu$ is a signed measure with $\nu \ll \mu$ (meaning $|\nu| \ll \mu$, where $|\nu|$ is the total variation of $\nu$), then $\frac{d\nu}{d\mu}$ exists and takes values in $\mathbb{R}$. One applies the theorem separately to the positive and negative parts in the Jordan decomposition $\nu = \nu^+ - \nu^-$.
## The Lebesgue Decomposition
When $\nu$ is not absolutely continuous with respect to $\mu$, one cannot write $d\nu = f \, d\mu$ for the entirety of $\nu$. However, a natural question arises: can we at least decompose $\nu$ into a part that *does* admit a density and a remainder that is as "orthogonal" to $\mu$ as possible? The Lebesgue decomposition theorem provides exactly this.
The notion of orthogonality for measures is captured by [mutual singularity](/page/Mutually%20Singular%20Measures): two measures $\mu$ and $\lambda$ are mutually singular, written $\mu \perp \lambda$, if there exists a set $A \in \mathcal{A}$ such that $\mu(A) = 0$ and $\lambda(X \setminus A) = 0$. The measures "live on disjoint parts" of $X$.
[quotetheorem:1207]
The Lebesgue decomposition and the Radon--Nikodym theorem work in tandem: the decomposition isolates the part of $\nu$ that admits a density, and the Radon--Nikodym theorem produces that density. Combining them, we can write
\begin{align*}
\nu(A) = \int_A f \, d\mu + \nu_s(A) \quad \text{for every } A \in \mathcal{A},
\end{align*}
where $f = \frac{d\nu_{\mathrm{ac}}}{d\mu}$ and $\nu_s \perp \mu$.
**Uniqueness of the decomposition.** If $\nu = \nu_{\mathrm{ac}}' + \nu_s'$ is another such decomposition, then $\nu_{\mathrm{ac}} - \nu_{\mathrm{ac}}' = \nu_s' - \nu_s$. The left side is absolutely continuous with respect to $\mu$; the right side is singular with respect to $\mu$. A measure that is simultaneously absolutely continuous and singular must be the zero measure (if $\lambda \ll \mu$ and $\lambda \perp \mu$, take a set $A$ with $\mu(A) = 0$ and $\lambda(X \setminus A) = 0$; then for any $E \in \mathcal{A}$, $\lambda(E) = \lambda(E \cap A) + \lambda(E \setminus A) = \lambda(E \cap A)$, but $\mu(E \cap A) \le \mu(A) = 0$, so $\lambda(E \cap A) = 0$ by absolute continuity, giving $\lambda(E) = 0$). Hence $\nu_{\mathrm{ac}} = \nu_{\mathrm{ac}}'$ and $\nu_s = \nu_s'$.
[example: Lebesgue Decomposition of a Mixed Distribution]
Consider the probability measure $\nu$ on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$ defined by
\begin{align*}
\nu(A) = \frac{1}{4} \int_A e^{-|x|} \, d\mathcal{L}^1(x) + \frac{1}{2} \delta_0(A)
\end{align*}
for each Borel set $A \in \mathcal{B}(\mathbb{R})$. Since $\int_{\mathbb{R}} e^{-|x|} \, d\mathcal{L}^1(x) = 2$, the total mass is $\frac{1}{4} \cdot 2 + \frac{1}{2} = 1$. This is a "mixed" distribution: half the mass is spread according to the double-exponential (Laplace) density $\frac{1}{2} e^{-|x|}$, and half is concentrated at the origin.
The Lebesgue decomposition with respect to $\mu = \mathcal{L}^1$ is:
\begin{align*}
\nu_{\mathrm{ac}}(A) &= \frac{1}{4} \int_A e^{-|x|} \, d\mathcal{L}^1(x), \\
\nu_s(A) &= \frac{1}{2} \delta_0(A).
\end{align*}
The absolutely continuous part has Radon--Nikodym derivative $f(x) = \frac{1}{4} e^{-|x|}$. The singular part is a point mass at the origin, which is mutually singular with $\mathcal{L}^1$ (take the witness set $A = \{0\}$: $\mathcal{L}^1(\{0\}) = 0$ and $\nu_s(\mathbb{R} \setminus \{0\}) = \frac{1}{2} \delta_0(\mathbb{R} \setminus \{0\}) = 0$).
Note that no single Radon--Nikodym derivative $\frac{d\nu}{d\mathcal{L}^1}$ exists for the full measure $\nu$, precisely because of the singular part. The Lebesgue decomposition is the correct framework for handling such mixed measures.
[/example]
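The same splitting can be carried out mechanically on a finite set, where the $\mu$-null points play the role of the origin in the example. The sketch below assumes measures given as dictionaries of point masses; the function name `lebesgue_decomposition` and the three-point data are hypothetical and serve only as illustration.

```python
from fractions import Fraction

def lebesgue_decomposition(nu, mu):
    """Split nu into (nu_ac, nu_s, f) relative to mu on a finite set of points.

    nu and mu are dicts mapping points to nonnegative point masses.
    """
    points = set(nu) | set(mu)
    null = {x for x in points if mu.get(x, 0) == 0}      # largest mu-null set
    nu_s = {x: nu.get(x, 0) for x in null}               # carried by the mu-null set
    nu_ac = {x: nu.get(x, 0) for x in points - null}     # vanishes on mu-null sets
    f = {x: Fraction(nu_ac[x]) / mu[x] for x in nu_ac}   # density of nu_ac w.r.t. mu
    return nu_ac, nu_s, f

# Hypothetical data: half of nu sits on the mu-null point "c".
mu = {"a": Fraction(1, 2), "b": Fraction(1, 2), "c": Fraction(0)}
nu = {"a": Fraction(1, 4), "b": Fraction(1, 4), "c": Fraction(1, 2)}
nu_ac, nu_s, f = lebesgue_decomposition(nu, mu)
print(nu_ac, nu_s, f)
```

The witness set for $\nu_s \perp \mu$ is the set `null` of points with zero $\mu$-mass, exactly as $\{0\}$ is the witness set for the point mass above.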
## Calculus of Radon--Nikodym Derivatives
The notation $\frac{d\nu}{d\mu}$ suggests that Radon--Nikodym derivatives should obey the same algebraic rules as ordinary derivatives. This is indeed the case, and these rules form the basic computational toolkit for working with densities.
### The Chain Rule
When three $\sigma$-finite measures satisfy $\lambda \ll \nu \ll \mu$, one expects the densities to compose multiplicatively — just as the classical chain rule gives $\frac{dz}{dx} = \frac{dz}{dy} \cdot \frac{dy}{dx}$.
[quotetheorem:1208]
This is more than a formal identity — it is a practical computation tool. Whenever a density is easier to compute through an intermediate measure, the chain rule allows one to factor the computation. The identity also confirms that the notation $\frac{d\nu}{d\mu}$ is internally consistent: the "fractions" cancel as expected.
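**What the chain rule does not say.** The identity is a $\mu$-a.e. statement. There may be a $\mu$-null set on which the two sides differ. Additionally, the chain rule requires all three measures to be $\sigma$-finite. If the intermediate measure $\nu$ is not $\sigma$-finite, the density $\frac{d\lambda}{d\nu}$ need not exist, and $\frac{d\nu}{d\mu}$ can equal $+\infty$ on a set of positive $\mu$-measure, so the product on the right-hand side is not well defined.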
### The Reciprocal Rule
If two measures are mutually absolutely continuous — each absolutely continuous with respect to the other — then one expects $\frac{d\mu}{d\nu} = \left(\frac{d\nu}{d\mu}\right)^{-1}$.
[quotetheorem:1209]
The positivity $\frac{d\nu}{d\mu} > 0$ $\mu$-a.e. is essential for the reciprocal to be well-defined. To see why it holds: if $A = \{x : \frac{d\nu}{d\mu}(x) = 0\}$, then $\nu(A) = \int_A \frac{d\nu}{d\mu} \, d\mu = 0$. Since $\mu \ll \nu$, we get $\mu(A) = 0$. So the density vanishes only on a $\mu$-null set, and one can safely take the reciprocal $\mu$-a.e.
### Sums and Scalar Multiples
[quotetheorem:1210]
This follows directly from linearity of the integral.
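The three rules of this section can be sanity-checked on a finite space, where every density is a ratio of point masses. The sketch below assumes a hypothetical four-point space on which $\mu$, $\nu$ and $\lambda$ all charge every point, so that $\lambda \ll \nu \ll \mu$ and $\mu \ll \nu$ hold trivially. The linearity check is written for the combination $\nu + c\lambda$, which is an assumption about the form of the sum rule rather than a quotation of it.

```python
from fractions import Fraction

X = range(4)
# Hypothetical point masses; all strictly positive.
mu = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)]
nu = [Fraction(1, 4), Fraction(1, 4), Fraction(1, 4), Fraction(1, 4)]
lam = [Fraction(1, 10), Fraction(2, 10), Fraction(3, 10), Fraction(4, 10)]

def rn(a, b):
    """Density da/db on a finite space where b charges every point."""
    return [a[x] / b[x] for x in X]

# Chain rule: d(lam)/d(mu) = d(lam)/d(nu) * d(nu)/d(mu).
chain_ok = all(rn(lam, mu)[x] == rn(lam, nu)[x] * rn(nu, mu)[x] for x in X)

# Reciprocal rule: d(mu)/d(nu) = 1 / (d(nu)/d(mu)).
reciprocal_ok = all(rn(mu, nu)[x] == 1 / rn(nu, mu)[x] for x in X)

# Linearity: d(nu + c*lam)/d(mu) = d(nu)/d(mu) + c * d(lam)/d(mu).
c = Fraction(3)
combo = [nu[x] + c * lam[x] for x in X]
linear_ok = all(rn(combo, mu)[x] == rn(nu, mu)[x] + c * rn(lam, mu)[x] for x in X)

print(chain_ok, reciprocal_ok, linear_ok)
```

On a finite space the "fractions" in the notation are literal fractions, which is why each rule reduces to elementary arithmetic.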
## Connection to the Fundamental Theorem of Calculus
One of the most striking applications of the Radon--Nikodym theorem is its connection to classical calculus. On the real line, the relationship between a function and its derivative is governed by the Fundamental Theorem of Calculus. The Radon--Nikodym theorem reveals that this relationship is, at its core, a statement about absolute continuity of measures.
### From functions to measures
Given a function $F: [a,b] \to \mathbb{R}$ that is nondecreasing and right-continuous, one can construct a unique Borel measure $\mu_F$ on $([a,b], \mathcal{B}([a,b]))$, the **Lebesgue--Stieltjes measure**, satisfying $\mu_F((c,d]) = F(d) - F(c)$ for all $a \le c < d \le b$. Whether $\mu_F$ has a density with respect to Lebesgue measure $\mathcal{L}^1$ turns out to depend on a regularity property of $F$ that is strictly stronger than continuity.
[definition: Absolutely Continuous Function]
A function $F: [a,b] \to \mathbb{R}$ is **absolutely continuous** (in the classical sense) if for every $\varepsilon > 0$ there exists $\delta > 0$ such that for any finite collection of pairwise disjoint intervals $(a_k, b_k) \subset [a,b]$, $k = 1, \ldots, n$,
\begin{align*}
\sum_{k=1}^n (b_k - a_k) < \delta \implies \sum_{k=1}^n |F(b_k) - F(a_k)| < \varepsilon.
\end{align*}
[/definition]
This is a strengthening of uniform continuity: it controls not just single oscillations but the *total* oscillation over collections of small intervals.
[quotetheorem:1211]
This theorem reveals a deep structural point: the classical derivative $F'$ *is* the Radon--Nikodym derivative $\frac{d\mu_F}{d\mathcal{L}^1}$. The Radon--Nikodym theorem is thus a far-reaching generalization of the Fundamental Theorem of Calculus, extending from the real line to arbitrary $\sigma$-finite measure spaces.
**What goes wrong without absolute continuity.** The Cantor function $F_C: [0,1] \to [0,1]$ is continuous, nondecreasing, satisfies $F_C(0) = 0$ and $F_C(1) = 1$, and has $F_C' = 0$ $\mathcal{L}^1$-a.e. (since $F_C$ is constant on each interval in the complement of the Cantor set, which has full Lebesgue measure). Yet $F_C$ is not absolutely continuous: the associated Lebesgue--Stieltjes measure $\mu_{F_C}$ is the *Cantor measure*, which is singular with respect to $\mathcal{L}^1$. Specifically, $\mu_{F_C}$ is supported on the Cantor set $C$ (which satisfies $\mathcal{L}^1(C) = 0$), so $\mu_{F_C} \perp \mathcal{L}^1$. The function $F_C$ has a derivative a.e., but the integral of that derivative ($\int_0^1 0 \, d\mathcal{L}^1 = 0$) does not recover the total variation $F_C(1) - F_C(0) = 1$. The Fundamental Theorem of Calculus fails because $\mu_{F_C}$ is singular, not absolutely continuous.
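The singularity of the Cantor measure can be made concrete: the $2^n$ closed intervals of the $n$-th construction stage carry all of the mass of $\mu_{F_C}$ while their total length is $(2/3)^n$. The sketch below checks this with exact rational arithmetic. The helper names `cantor` and `cantor_intervals` are hypothetical, and `cantor` is exact only at points with a finite ternary expansion shorter than `depth`, which covers all interval endpoints used here.

```python
from fractions import Fraction

def cantor(x, depth=60):
    """Cantor function at a rational x in [0, 1], read off from ternary digits."""
    if x >= 1:
        return Fraction(1)
    value, scale = Fraction(0), Fraction(1, 2)
    for _ in range(depth):
        x *= 3
        digit = int(x)          # next ternary digit of x
        x -= digit
        if digit == 1:          # x lies in a removed middle third: F is constant there
            return value + scale
        value += scale * (digit // 2)   # ternary digit 2 becomes binary digit 1
        scale /= 2
    return value

def cantor_intervals(n):
    """Closed intervals remaining after n steps of the Cantor set construction."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        nxt = []
        for a, b in intervals:
            third = (b - a) / 3
            nxt += [(a, a + third), (b - third, b)]
        intervals = nxt
    return intervals

n = 8
stage = cantor_intervals(n)
total_length = sum(b - a for a, b in stage)                  # Lebesgue measure of the cover
total_mass = sum(cantor(b) - cantor(a) for a, b in stage)    # Cantor measure of the cover
print(float(total_length), total_mass)
```

As $n$ grows the cover keeps all of the Cantor mass while its Lebesgue measure tends to $0$, which is the mutual singularity $\mu_{F_C} \perp \mathcal{L}^1$ in finite form.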
## Dual of $L^p$ Spaces
A central application of the Radon--Nikodym theorem is the identification of the dual space of $L^p$. The problem is the following: given a bounded linear functional $\varphi: L^p(X, \mathcal{A}, \mu) \to \mathbb{R}$, can we always represent $\varphi$ as integration against some function $g$? Without the Radon--Nikodym theorem, there is no mechanism to produce $g$ from the abstract functional $\varphi$.
### Constructing the representing function
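The strategy is to convert the functional $\varphi$ into a measure, then apply the Radon--Nikodym theorem to extract a density. Assume first that $\mu(X) < \infty$, so that $\mathbb{1}_A \in L^p$ for every $A \in \mathcal{A}$; the $\sigma$-finite case follows by exhausting $X$ with sets of finite measure. Given a bounded linear functional $\varphi \in (L^p(X, \mathcal{A}, \mu))^*$, define a set function $\nu: \mathcal{A} \to \mathbb{R}$ by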
\begin{align*}
\nu(A) := \varphi(\mathbb{1}_A) \quad \text{for each } A \in \mathcal{A} \text{ with } \mu(A) < \infty.
\end{align*}
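One verifies that $\nu$ is a signed measure with $\nu \ll \mu$. Countable additivity uses $p < \infty$: if $A$ is the disjoint union of sets $A_n$ and $\mu(A) < \infty$, then $\sum_{n \le N} \mathbb{1}_{A_n} \to \mathbb{1}_A$ in $L^p$ as $N \to \infty$, and $\varphi$ is continuous. Absolute continuity holds because $\mu(A) = 0$ gives $\mathbb{1}_A = 0$ in $L^p$, so $\varphi(\mathbb{1}_A) = 0$. The Radon--Nikodym theorem produces $g = \frac{d\nu}{d\mu}$, and one then shows that $g \in L^q$ (where $\frac{1}{p} + \frac{1}{q} = 1$) and that $\varphi(u) = \int_X u \cdot g \, d\mu$ for all $u \in L^p$.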
[quotetheorem:901]
This identification $(L^p)^* \cong L^q$ is an isometric isomorphism of Banach spaces. It fails in general when $p = \infty$: the dual of $L^\infty$ is typically strictly larger than $L^1$, since it contains functionals induced by finitely additive set functions that are not countably additive. The failure at $p = \infty$ is not a deficiency of the Radon--Nikodym theorem. For $p = \infty$ the indicators $\mathbb{1}_{A_n}$ of sets with $\mu(A_n) \to 0$ need not converge to $0$ in norm, so the set function $A \mapsto \varphi(\mathbb{1}_A)$ need not be countably additive, and there may be no measure to differentiate.
The $\sigma$-finiteness hypothesis in the Radon--Nikodym theorem is what makes the proof work: it ensures that the set function $\nu$ constructed from $\varphi$ is a genuine $\sigma$-finite signed measure, to which the Radon--Nikodym theorem applies.
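The construction of the representing function can be followed step by step on a finite space, where $L^p$ is just $\mathbb{R}^3$ with a weighted norm. In the sketch below the functional `phi` is treated as a black box; the vector `hidden_g` exists only to define it and is a hypothetical choice, as are the point masses of `mu`.

```python
from fractions import Fraction

mu = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]    # point masses of mu
hidden_g = [Fraction(2), Fraction(-1), Fraction(5)]      # used only to define phi

def phi(u):
    """A linear functional on functions u: {0, 1, 2} -> R, treated as a black box."""
    return sum(u[i] * hidden_g[i] * mu[i] for i in range(3))

def indicator(A):
    """Indicator function of the set A as a vector."""
    return [Fraction(1) if i in A else Fraction(0) for i in range(3)]

def nu(A):
    """Signed measure induced by phi: nu(A) = phi(1_A)."""
    return phi(indicator(A))

# Radon--Nikodym derivative on a finite space: ratio of point masses.
g = [nu({i}) / mu[i] for i in range(3)]

# The recovered g represents phi as integration against g with respect to mu.
u = [Fraction(3), Fraction(4), Fraction(-2)]
represented = sum(u[i] * g[i] * mu[i] for i in range(3))
print(g, represented == phi(u))
```

Only evaluations of `phi` on indicators are used to build `g`, which mirrors how the proof extracts the density from the abstract functional.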
## Conditional Expectation
In probability theory, the Radon--Nikodym theorem provides the theoretical foundation for conditional expectation. The challenge is to define $\mathbb{E}[X \mid \mathcal{G}]$ — the expected value of a random variable $X$ given partial information encoded in a sub-$\sigma$-algebra $\mathcal{G}$ — in a way that goes beyond the elementary case of conditioning on events of positive probability.
### The problem of partial information
Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space, let $X: \Omega \to \mathbb{R}$ be an integrable random variable ($X \in L^1(\Omega, \mathcal{F}, \mathbb{P})$), and let $\mathcal{G} \subset \mathcal{F}$ be a sub-$\sigma$-algebra representing partial information. We seek a $\mathcal{G}$-measurable random variable $Y$ that "best predicts" $X$ given the information in $\mathcal{G}$, in the sense that $Y$ preserves the averages of $X$ over all $\mathcal{G}$-measurable sets:
\begin{align*}
\int_G Y \, d\mathbb{P} = \int_G X \, d\mathbb{P} \quad \text{for every } G \in \mathcal{G}.
\end{align*}
Define a signed measure $\nu$ on $(\Omega, \mathcal{G})$ by $\nu(G) := \int_G X \, d\mathbb{P}$. Since $X$ is integrable, $\nu$ is a finite signed measure, and $\nu \ll \mathbb{P}|_{\mathcal{G}}$ (if $\mathbb{P}(G) = 0$ then $|\nu(G)| \le \int_G |X| \, d\mathbb{P} = 0$). Both $\nu$ and $\mathbb{P}|_{\mathcal{G}}$ are finite (hence $\sigma$-finite), so the Radon--Nikodym theorem applies on the measure space $(\Omega, \mathcal{G}, \mathbb{P}|_{\mathcal{G}})$, producing a $\mathcal{G}$-measurable function $Y = \frac{d\nu}{d\mathbb{P}|_{\mathcal{G}}}$ satisfying the desired identity.
[definition: Conditional Expectation]
Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space, let $X \in L^1(\Omega, \mathcal{F}, \mathbb{P})$, and let $\mathcal{G} \subset \mathcal{F}$ be a sub-$\sigma$-algebra. The **conditional expectation** of $X$ given $\mathcal{G}$, denoted $\mathbb{E}[X \mid \mathcal{G}]$, is the $\mathbb{P}|_{\mathcal{G}}$-a.e. unique $\mathcal{G}$-measurable function satisfying
\begin{align*}
\int_G \mathbb{E}[X \mid \mathcal{G}] \, d\mathbb{P} = \int_G X \, d\mathbb{P} \quad \text{for every } G \in \mathcal{G}.
\end{align*}
Equivalently, $\mathbb{E}[X \mid \mathcal{G}] = \frac{d\nu}{d\mathbb{P}|_{\mathcal{G}}}$, where $\nu(G) = \int_G X \, d\mathbb{P}$ for $G \in \mathcal{G}$.
[/definition]
Without the Radon--Nikodym theorem, the existence of conditional expectation would require a separate construction (such as an orthogonal projection argument in $L^2$, which handles square-integrable random variables directly and needs an additional approximation step to reach $L^1$). The Radon--Nikodym approach handles every integrable $X \in L^1$ in a single step.
[example: Conditional Expectation With Respect to a Partition]
Let $\Omega = [0,1]$, $\mathcal{F} = \mathcal{B}([0,1])$, $\mathbb{P} = \mathcal{L}^1|_{[0,1]}$, and $X(\omega) = \omega^2$. Let $\mathcal{G}$ be the $\sigma$-algebra generated by the partition $\{[0, \frac{1}{2}), [\frac{1}{2}, 1]\}$, so that $\mathcal{G} = \{\varnothing, [0, \frac{1}{2}), [\frac{1}{2}, 1], [0,1]\}$.
A $\mathcal{G}$-measurable function is constant on each atom of the partition. So $\mathbb{E}[X \mid \mathcal{G}]$ must take the form
\begin{align*}
\mathbb{E}[X \mid \mathcal{G}](\omega) = \begin{cases} c_1 & \text{if } \omega \in [0, \frac{1}{2}), \\ c_2 & \text{if } \omega \in [\frac{1}{2}, 1]. \end{cases}
\end{align*}
The defining property requires:
\begin{align*}
c_1 \cdot \mathbb{P}([0, \tfrac{1}{2})) &= \int_0^{1/2} \omega^2 \, d\mathcal{L}^1(\omega), \\
c_1 \cdot \tfrac{1}{2} &= \left[\frac{\omega^3}{3}\right]_0^{1/2} = \frac{1}{24}, \\
c_1 &= \frac{1}{12}.
\end{align*}
Similarly:
\begin{align*}
c_2 \cdot \mathbb{P}([\tfrac{1}{2}, 1]) &= \int_{1/2}^{1} \omega^2 \, d\mathcal{L}^1(\omega), \\
c_2 \cdot \tfrac{1}{2} &= \left[\frac{\omega^3}{3}\right]_{1/2}^{1} = \frac{1}{3} - \frac{1}{24} = \frac{7}{24}, \\
c_2 &= \frac{7}{12}.
\end{align*}
The conditional expectation is the function that averages $X = \omega^2$ over each atom of $\mathcal{G}$, replacing the "detailed" function by its mean on each piece of the partition. Note that $\mathbb{E}[\mathbb{E}[X \mid \mathcal{G}]] = \frac{1}{12} \cdot \frac{1}{2} + \frac{7}{12} \cdot \frac{1}{2} = \frac{1}{3} = \mathbb{E}[X]$, as the law of total expectation (the tower property with the trivial $\sigma$-algebra on the outside) requires.
[/example]
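The averaging in this example is easy to reproduce with exact arithmetic. The sketch below is specific to $X(\omega) = \omega^2$ on $[0,1]$ with Lebesgue measure; the function names are hypothetical, and the atoms are passed as pairs of endpoints (whether an endpoint is included does not affect the integrals).

```python
from fractions import Fraction

def integral_of_square(a, b):
    """Exact integral of w^2 over [a, b] with respect to Lebesgue measure."""
    return (b**3 - a**3) / 3

def conditional_expectation(atoms):
    """Value of E[X | G] on each atom (a, b) of the partition, for X(w) = w^2."""
    return {(a, b): integral_of_square(a, b) / (b - a) for a, b in atoms}

atoms = [(Fraction(0), Fraction(1, 2)), (Fraction(1, 2), Fraction(1))]
cond = conditional_expectation(atoms)

# Integrating E[X | G] over the whole space must give E[X].
total = sum(value * (b - a) for (a, b), value in cond.items())
print(cond, total)
```

Passing a finer list of atoms, for example four intervals of length $\frac{1}{4}$, gives the conditional expectation with respect to the correspondingly finer $\sigma$-algebra, and `total` is unchanged.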
## Techniques for Computing Radon--Nikodym Derivatives
Having established the existence theory, we turn to the practical question: how does one actually compute Radon--Nikodym derivatives? The theorem guarantees existence but does not hand over an explicit formula. Several standard techniques cover the most common situations.
### Direct verification
The most straightforward approach: guess a density $f$ and verify that $\nu(A) = \int_A f \, d\mu$ for all measurable $A$. By uniqueness, if such an $f$ works, it must be *the* Radon--Nikodym derivative. This technique applies whenever the measure $\nu$ is explicitly defined through an integral formula.
[example: Direct Verification for a Weighted Measure]
Let $\mu = \mathcal{L}^1$ on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$, and define $\nu$ by
\begin{align*}
\nu(A) = \int_A \frac{1}{\sqrt{2\pi}} e^{-x^2/2} \, d\mathcal{L}^1(x) \quad \text{for each } A \in \mathcal{B}(\mathbb{R}).
\end{align*}
This is the standard Gaussian (normal) measure. The function $f: \mathbb{R} \to [0, \infty)$ defined by $f(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$ satisfies $\nu(A) = \int_A f \, d\mathcal{L}^1$ by definition. Hence $\frac{d\nu}{d\mathcal{L}^1}(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$.
To verify that $\nu \ll \mathcal{L}^1$: if $\mathcal{L}^1(A) = 0$, then $\nu(A) = \int_A f \, d\mathcal{L}^1 = 0$ since integrating any measurable function over a null set yields zero. The $\sigma$-finiteness of $\nu$ holds because $\nu$ is a finite measure ($\nu(\mathbb{R}) = 1$).
[/example]
### The chain rule technique
When a Radon--Nikodym derivative with respect to $\mu$ is difficult to compute directly, one can factor through an intermediate measure. If $\frac{d\nu}{d\lambda}$ and $\frac{d\lambda}{d\mu}$ are both known (or easier to compute), then $\frac{d\nu}{d\mu} = \frac{d\nu}{d\lambda} \cdot \frac{d\lambda}{d\mu}$.
[example: Chain Rule Computation]
Let $\mu = \mathcal{L}^1$ on $([0, \infty), \mathcal{B}([0,\infty)))$. Define $\lambda$ by $\frac{d\lambda}{d\mu}(x) = 2x$ (so $\lambda(A) = \int_A 2x \, d\mathcal{L}^1(x)$), and define $\nu$ by $\frac{d\nu}{d\lambda}(x) = e^{-x^2}$.
By the chain rule:
\begin{align*}
\frac{d\nu}{d\mu}(x) = \frac{d\nu}{d\lambda}(x) \cdot \frac{d\lambda}{d\mu}(x) = e^{-x^2} \cdot 2x = 2x \, e^{-x^2}.
\end{align*}
We can verify independently: $\nu(A) = \int_A e^{-x^2} \, d\lambda(x) = \int_A e^{-x^2} \cdot 2x \, d\mathcal{L}^1(x)$ for each Borel set $A \subset [0,\infty)$, which is exactly $\int_A 2x \, e^{-x^2} \, d\mathcal{L}^1(x)$.
[/example]
### Pushforward and change of variables
If $T: (X, \mathcal{A}) \to (Y, \mathcal{B})$ is a measurable map and $\nu = \mu \circ T^{-1}$ is the pushforward of $\mu$ by $T$, then computing $\frac{d\nu}{d\lambda}$ (where $\lambda$ is a reference measure on $Y$) often reduces to a change-of-variables computation.
[example: Radon-Nikodym Derivative via Change of Variables]
Let $\mu = \mathcal{L}^1|_{(0,1)}$ (uniform measure on $(0,1)$), and let $T: (0,1) \to (0, \infty)$ be defined by $T(x) = -\log(x)$. The pushforward $\nu = \mu \circ T^{-1}$ is a measure on $((0,\infty), \mathcal{B}((0,\infty)))$.
For a Borel set $A \subset (0,\infty)$:
\begin{align*}
\nu(A) &= \mu(T^{-1}(A)) = \mathcal{L}^1(\{x \in (0,1) : -\log(x) \in A\}) \\
&= \mathcal{L}^1(\{e^{-y} : y \in A\}).
\end{align*}
Since $T$ is a $C^1$ diffeomorphism from $(0,1)$ to $(0,\infty)$ with inverse $T^{-1}(y) = e^{-y}$ and $|(T^{-1})'(y)| = e^{-y}$, the change-of-variables formula gives
\begin{align*}
\nu(A) = \int_A e^{-y} \, d\mathcal{L}^1(y).
\end{align*}
Therefore $\frac{d\nu}{d\mathcal{L}^1}(y) = e^{-y}$ for $y \in (0,\infty)$. This is the exponential distribution with rate $1$ — the well-known fact that if $U$ is uniformly distributed on $(0,1)$, then $-\log(U)$ is exponentially distributed.
[/example]
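For an interval $A = (a, b) \subset (0, \infty)$ both sides of the identity $\nu(A) = \int_A e^{-y} \, d\mathcal{L}^1(y)$ can be computed independently: the left side as the length of the preimage $(e^{-b}, e^{-a})$, the right side by numerical quadrature. The sketch below does this with a midpoint rule; the function names, the step count, and the chosen interval are hypothetical.

```python
import math

def pushforward_mass(a, b):
    """nu((a, b)): Lebesgue measure of {x in (0, 1) : a < -log(x) < b}."""
    return math.exp(-a) - math.exp(-b)    # the preimage is the interval (e^{-b}, e^{-a})

def density_integral(a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of e^{-y} over (a, b)."""
    h = (b - a) / steps
    return h * sum(math.exp(-(a + (k + 0.5) * h)) for k in range(steps))

a, b = 0.5, 2.0
gap = abs(pushforward_mass(a, b) - density_integral(a, b))
print(gap)
```

The two numbers agree up to the quadrature error, which is consistent with $e^{-y}$ being the density of the pushforward.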
### The Lebesgue differentiation approach
On $\mathbb{R}^n$ with Lebesgue measure, the Radon--Nikodym derivative can be recovered pointwise as a limit of ratios. If $\nu \ll \mathcal{L}^n$ on $\mathbb{R}^n$ and $\nu$ is finite on bounded sets (so that the density is locally integrable), then the Lebesgue differentiation theorem gives
\begin{align*}
\frac{d\nu}{d\mathcal{L}^n}(x) = \lim_{r \to 0^+} \frac{\nu(B(x,r))}{\mathcal{L}^n(B(x,r))} \quad \text{for } \mathcal{L}^n\text{-a.e. } x \in \mathbb{R}^n.
\end{align*}
This connects the abstract Radon--Nikodym derivative to a geometric, "infinitesimal ratio" interpretation. It also explains why the notation $\frac{d\nu}{d\mu}$ is so apt: the derivative is literally a limit of ratios of measures of shrinking sets, mirroring the classical difference quotient $\frac{f(x+h) - f(x)}{h}$.
This approach relies on the geometry of $\mathbb{R}^n$, or more generally of spaces where a suitable covering theorem (such as the Vitali or Besicovitch covering theorem) is available. On an abstract measure space there are no balls to shrink, so in general one works with the integral characterization alone.
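The limit of ratios can be observed numerically for the standard Gaussian measure on $\mathbb{R}$, where $\nu(B(x,r))$ is available in closed form through the error function. The sketch below is an illustration only; the evaluation point `x = 0.7` and the radii are arbitrary choices.

```python
import math

def gaussian_cdf(t):
    """Distribution function of the standard Gaussian measure."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def gaussian_density(x):
    """Radon--Nikodym derivative of the Gaussian measure w.r.t. Lebesgue measure."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

def ball_ratio(x, r):
    """nu(B(x, r)) / L^1(B(x, r)) for the standard Gaussian measure nu on R."""
    return (gaussian_cdf(x + r) - gaussian_cdf(x - r)) / (2.0 * r)

x = 0.7
ratios = [ball_ratio(x, 10.0 ** (-k)) for k in range(1, 5)]   # r = 0.1, ..., 0.0001
errors = [abs(q - gaussian_density(x)) for q in ratios]
print(errors)
```

The errors shrink as $r$ decreases. For much smaller radii, floating-point cancellation in the difference of the two distribution-function values would eventually dominate, so the radii are kept moderate.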
## Radon--Nikodym for Signed and Complex Measures
So far we have focused on nonnegative measures, but many applications — particularly the duality of $L^p$ spaces and the representation of signed charges in potential theory — require extending the Radon--Nikodym theorem to signed measures.
[definition: Signed Measure]
A **signed measure** $\nu$ on $(X, \mathcal{A})$ is a countably additive set function $\nu: \mathcal{A} \to [-\infty, \infty]$ with $\nu(\varnothing) = 0$ (and $\nu$ assumes at most one of the values $+\infty$ and $-\infty$). The **total variation** $|\nu| := \nu^+ + \nu^-$ is a nonnegative measure, where $\nu^+$ and $\nu^-$ are defined by the Jordan decomposition below.
[/definition]
[quotetheorem:1212]
[quotetheorem:1213]
The condition $|\nu| \ll \mu$ is equivalent to requiring both $\nu^+ \ll \mu$ and $\nu^- \ll \mu$. When this holds, $f = \frac{d\nu^+}{d\mu} - \frac{d\nu^-}{d\mu}$.
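On a finite space the Jordan decomposition and the resulting density can be written out explicitly. The sketch below assumes a hypothetical three-point space with strictly positive $\mu$-masses and a signed measure given by signed point masses.

```python
from fractions import Fraction

mu = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]     # positive point masses
nu = [Fraction(3, 10), Fraction(-1, 5), Fraction(0)]      # signed point masses

# Jordan decomposition on a finite space: positive and negative parts pointwise.
nu_plus = [max(w, Fraction(0)) for w in nu]
nu_minus = [max(-w, Fraction(0)) for w in nu]

# Densities of the two parts, then the signed density as their difference.
f_plus = [p / m for p, m in zip(nu_plus, mu)]
f_minus = [q / m for q, m in zip(nu_minus, mu)]
f = [a - b for a, b in zip(f_plus, f_minus)]

# The density of the total variation |nu| = nu^+ + nu^- is |f|.
tv_density = [a + b for a, b in zip(f_plus, f_minus)]
print(f, tv_density)
```

Because $\nu^+$ and $\nu^-$ live on disjoint sets, at most one of `f_plus` and `f_minus` is nonzero at each point, which is why the density of $|\nu|$ comes out as $|f|$.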
For **complex measures** $\nu: \mathcal{A} \to \mathbb{C}$, the same result holds with $f: X \to \mathbb{C}$. A complex measure is always finite (it does not take infinite values), so the $\sigma$-finiteness of $\nu$ is automatic.
[remark: Absolute Continuity for Signed Measures]
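For a signed measure $\nu$ and a positive measure $\mu$, absolute continuity is usually stated as $|\nu| \ll \mu$. This is equivalent to the apparently weaker condition that $\mu(A) = 0 \implies \nu(A) = 0$ for every $A \in \mathcal{A}$: if $X = P \cup N$ is a Hahn decomposition for $\nu$ and $\mu(A) = 0$, then $\mu(A \cap P) = \mu(A \cap N) = 0$, so $\nu^+(A) = \nu(A \cap P) = 0$ and $\nu^-(A) = -\nu(A \cap N) = 0$. The formulation through $|\nu|$ is preferred because it makes explicit that $\nu^+$ and $\nu^-$ are separately absolutely continuous, which is what the signed Radon--Nikodym theorem uses.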
[/remark]
## Summary of Hypotheses and Their Roles
A natural question after seeing the various theorems above is: which hypotheses are genuinely essential, and what does each one rule out? The following table collects the key hypotheses and pairs each with a concrete failure mode.
| Hypothesis | What it prevents | Failure example |
|---|---|---|
| $\nu \ll \mu$ | $\nu$ charging $\mu$-null sets | $\delta_0$ vs. $\mathcal{L}^1$ |
| $\mu$ $\sigma$-finite | Non-localizable measures | Counting measure on $[0,1]$ vs. $\mathcal{L}^1$ |
| $\nu$ $\sigma$-finite | Densities equal to $+\infty$ on a set of positive $\mu$-measure | $\nu = \infty \cdot \mathcal{L}^1$, that is, $\nu(A) = \infty$ if $\mathcal{L}^1(A) > 0$ and $\nu(A) = 0$ otherwise |
| $\lvert\nu\rvert \ll \mu$ (signed case) | Signed charge on $\mu$-null sets | Needed for both $\nu^+$ and $\nu^-$ |
## References
- Folland, G. B., *Real Analysis: Modern Techniques and Their Applications*, 2nd ed. (1999).
- Royden, H. L. and Fitzpatrick, P. M., *Real Analysis*, 4th ed. (2010).
- Rudin, W., *Real and Complex Analysis*, 3rd ed. (1987).
- Billingsley, P., *Probability and Measure*, 3rd ed. (1995).
- Evans, L. C. and Gariepy, R. F., *Measure Theory and Fine Properties of Functions*, revised ed. (2015).