test #33
Changes to Content
Original Content
In classical analysis, the derivative and integral are built on the regularity of the [functions](/page/Function) involved. The Riemann-Stieltjes integral $\int_0^T f \, dg$ requires the integrator $g$ to have bounded variation, and the chain rule $\frac{d}{dt}F(g(t)) = F'(g(t))g'(t)$ requires $g$ to be [differentiable](/page/Derivative). Brownian motion — the canonical model for continuous random fluctuations — violates both of these assumptions: its sample paths are almost surely continuous but nowhere differentiable, and they have infinite variation on every interval. Any attempt to build a calculus for stochastic processes must therefore confront the failure of the classical machinery at a fundamental level.
Stochastic calculus provides the resolution. It constructs an integral with respect to Brownian motion (and more general semimartingales), derives a modified chain rule — Itô's formula — that accounts for the roughness of the integrator, and establishes the existence and uniqueness theory for stochastic differential equations. The additional second-order correction term that appears in Itô's formula, absent in classical calculus, arises directly from the nontrivial quadratic variation of Brownian motion and is the signature feature of the theory.
Throughout this article, we work on a fixed filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, \mathbb{P})$ satisfying the **usual conditions**: the filtration is right-continuous and $\mathcal{F}_0$ contains all $\mathbb{P}$-null [sets](/page/Set). The symbol $W$ always denotes a standard Brownian motion adapted to $(\mathcal{F}_t)$.
[motivation]
### The Failure of Riemann-Stieltjes Integration
The Riemann-Stieltjes integral $\int_0^T f(t) \, dg(t)$ is well-defined when the integrator $g$ has bounded variation on $[0,T]$. Recall that the total variation of a function $g: [0,T] \to \mathbb{R}$ is defined as:
\begin{align*}
V_T(g) := \sup_{\Pi} \sum_{k=0}^{n-1} |g(t_{k+1}) - g(t_k)|,
\end{align*}
where the supremum is over all partitions $\Pi = \{0 = t_0 < t_1 < \cdots < t_n = T\}$.
For a standard Brownian motion $W$, the increments $W(t_{k+1}) - W(t_k)$ are independent Gaussian random variables with variance $t_{k+1} - t_k$. For any partition with mesh $|\Pi| = \max_k(t_{k+1} - t_k)$:
\begin{align*}
\sum_{k=0}^{n-1} |W(t_{k+1}) - W(t_k)|^2 \le \max_{0 \le k < n} |W(t_{k+1}) - W(t_k)| \cdot \sum_{k=0}^{n-1} |W(t_{k+1}) - W(t_k)|,
\end{align*}
since each squared increment is at most the largest increment times its own absolute value. The sum of squared increments converges in $L^2$ to $T$ (as we show below in the discussion of quadratic variation), hence almost surely along a subsequence of partitions, while the largest increment tends to zero almost surely as the mesh shrinks, because Brownian paths are uniformly continuous on $[0,T]$. The sum of absolute increments is therefore bounded below by a quantity that diverges along that subsequence, so $V_T(W) = +\infty$ almost surely for every $T > 0$.
The pathological roughness of Brownian paths means that even the simplest integral $\int_0^T W(t) \, dW(t)$ cannot be defined by Riemann-Stieltjes theory. Any construction of stochastic integrals must therefore abandon the bounded variation framework entirely.
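The contrast between the two sums can be observed numerically. The following is a minimal sketch using only the Python standard library; the function names `brownian_path` and `variation_sums`, the seed, and the grid sizes are illustrative choices, not part of the theory. It samples one path on a fine grid and evaluates both sums on nested dyadic sub-partitions.

```python
import math
import random


def brownian_path(n_steps, horizon=1.0, seed=0):
    """Sample W on a uniform grid of n_steps intervals over [0, horizon]."""
    rng = random.Random(seed)
    dt = horizon / n_steps
    path = [0.0]
    for _ in range(n_steps):
        path.append(path[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return path


def variation_sums(path, stride):
    """Sum of |increments| and sum of squared increments over every stride-th grid point."""
    points = path[::stride]
    increments = [b - a for a, b in zip(points, points[1:])]
    total = sum(abs(d) for d in increments)
    quadratic = sum(d * d for d in increments)
    return total, quadratic


path = brownian_path(2 ** 14)
for level in range(4, 15, 2):
    stride = 2 ** (14 - level)
    total, quadratic = variation_sums(path, stride)
    print(f"n = 2^{level:2d}: sum |dW| = {total:8.3f}, sum dW^2 = {quadratic:.4f}")
```

On nested partitions the sum of absolute increments can only grow (by the triangle inequality), and for a simulated path it grows roughly like $\sqrt{n}$, while the sum of squares settles near $T = 1$.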
### The Choice of Evaluation Point: Why Itô and Not Stratonovich
Even granting that the Riemann-Stieltjes integral is undefined, one might attempt to define the integral as a limit of Riemann-type sums. For a partition $\Pi = \{0 = t_0 < \cdots < t_n = T\}$ and a process $f$, consider sums of the form:
\begin{align*}
S_\Pi(\alpha) := \sum_{k=0}^{n-1} f(\alpha t_{k+1} + (1-\alpha) t_k) \, (W(t_{k+1}) - W(t_k)),
\end{align*}
where $\alpha \in [0,1]$ controls the evaluation point within each subinterval.
A striking feature of stochastic integration, and a fundamental departure from classical analysis, is that the limit can depend on $\alpha$. For continuous integrands and finite-variation integrators, all evaluation points give the same limit, but when the integrand is itself as rough as Brownian motion (for instance $f = W$), the infinite variation of the integrator creates a discrepancy.
The two principal choices are:
- **$\alpha = 0$ (left endpoint):** This yields the **Itô integral**. Evaluating $f$ at the left endpoint $t_k$ ensures that $f(t_k)$ is $\mathcal{F}_{t_k}$-measurable and hence independent of the future increment $W(t_{k+1}) - W(t_k)$. This independence is the source of the martingale property and the Itô isometry — the two structural pillars of the theory.
- **$\alpha = 1/2$ (midpoint):** This yields the **Stratonovich integral**, denoted $\int f \circ dW$. The midpoint evaluation restores the classical chain rule (no second-order correction), but the resulting integral is not a martingale, and the isometry fails. In applications where the noise arises as a limit of smooth approximations (e.g., the Wong-Zakai theorem), the Stratonovich integral is the natural object.
We develop the Itô theory here, as the martingale and isometry properties make it the correct framework for mathematical finance, filtering theory, and the general theory of stochastic differential equations. The Stratonovich integral can always be recovered from the Itô integral by a correction term involving the quadratic covariation.
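The dependence on $\alpha$ can be seen in a small experiment with the integrand $f = W$. The sketch below is an illustration under stated assumptions: the function name `riemann_sums_of_w`, the seed, and the partition size are hypothetical choices. The path is sampled on a grid twice as fine as the partition so that the midpoint values are available.

```python
import math
import random


def riemann_sums_of_w(n_intervals, horizon=1.0, seed=0):
    """Return (left-endpoint sum, midpoint sum, W(T)) for the integrand f = W.

    The path is sampled on a grid twice as fine as the partition, so that
    odd grid indices are the midpoints of the partition intervals.
    """
    rng = random.Random(seed)
    half_step = horizon / (2 * n_intervals)
    w = [0.0]
    for _ in range(2 * n_intervals):
        w.append(w[-1] + rng.gauss(0.0, math.sqrt(half_step)))

    left_sum = 0.0
    mid_sum = 0.0
    for k in range(n_intervals):
        increment = w[2 * k + 2] - w[2 * k]
        left_sum += w[2 * k] * increment       # alpha = 0
        mid_sum += w[2 * k + 1] * increment    # alpha = 1/2
    return left_sum, mid_sum, w[-1]


horizon = 1.0
left_sum, mid_sum, w_end = riemann_sums_of_w(10_000, horizon)
print("left endpoint:", left_sum, "vs", 0.5 * w_end ** 2 - 0.5 * horizon)
print("midpoint:     ", mid_sum, "vs", 0.5 * w_end ** 2)
print("difference:   ", mid_sum - left_sum, "vs", 0.5 * horizon)
```

For a fine partition the two sums differ by approximately $T/2$: the left-endpoint sum tracks $\frac{1}{2}W(T)^2 - \frac{1}{2}T$, while the midpoint sum tracks the classical value $\frac{1}{2}W(T)^2$.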
### From Sums to $L^2$ [Limits](/page/Limit)
The Itô integral is constructed not by pointwise limits of Riemann sums (which may fail to converge pathwise), but by $L^2(\Omega)$ limits. The key insight is that for $\alpha = 0$, the independence between $f(t_k)$ and $\Delta W_k := W(t_{k+1}) - W(t_k)$ produces the identity:
\begin{align*}
\mathbb{E}\left[\left(\sum_{k} f(t_k) \Delta W_k\right)^2\right] = \sum_{k} \mathbb{E}[f(t_k)^2] (t_{k+1} - t_k),
\end{align*}
which in the limit becomes the **Itô isometry** $\mathbb{E}[(\int f \, dW)^2] = \mathbb{E}[\int f^2 \, dt]$. This isometry identifies the stochastic integral as an isometric embedding of the adapted processes in $L^2(\Omega \times [0,T])$ into $L^2(\Omega)$, and the construction proceeds by density: define the integral for simple (step) processes, verify the isometry, and extend to the closure.
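The step that uses independence is the vanishing of the cross terms. As a short sketch, write $f_k := f(t_k)$ and assume each $f_k$ is bounded (as in the definition of simple processes), so that all expectations are finite. For $j < k$ the product $f_j \, \Delta W_j \, f_k$ is $\mathcal{F}_{t_k}$-measurable, while $\Delta W_k$ is independent of $\mathcal{F}_{t_k}$ with mean zero:
\begin{align*}
\mathbb{E}\left[\left(\sum_{k} f_k \Delta W_k\right)^2\right] &= \sum_{k} \mathbb{E}[f_k^2 (\Delta W_k)^2] + 2 \sum_{j < k} \mathbb{E}[f_j \, \Delta W_j \, f_k \, \Delta W_k] \\
&= \sum_{k} \mathbb{E}[f_k^2] \, \mathbb{E}[(\Delta W_k)^2] + 2 \sum_{j < k} \mathbb{E}[f_j \, \Delta W_j \, f_k] \, \mathbb{E}[\Delta W_k] \\
&= \sum_{k} \mathbb{E}[f_k^2] \, (t_{k+1} - t_k).
\end{align*}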
[/motivation]
## The Itô Integral
### Construction for Simple Processes
The construction begins with the simplest class of integrands: processes that are piecewise constant and adapted.
[definition: Simple Process]
A **simple process** (or elementary process) is a stochastic process $f: [0,T] \times \Omega \to \mathbb{R}$ of the form:
\begin{align*}
f(t, \omega) = \sum_{k=0}^{n-1} \xi_k(\omega) \, \mathbb{1}_{(t_k, t_{k+1}]}(t),
\end{align*}
where $0 = t_0 < t_1 < \cdots < t_n = T$ is a partition and each $\xi_k: \Omega \to \mathbb{R}$ is a bounded, $\mathcal{F}_{t_k}$-measurable random variable.
[/definition]
For a simple process $f$, the Itô integral is defined algebraically as the finite sum:
\begin{align*}
\int_0^T f(t) \, dW(t) := \sum_{k=0}^{n-1} \xi_k \, (W(t_{k+1}) - W(t_k)).
\end{align*}
Each term in this sum is the product of an $\mathcal{F}_{t_k}$-measurable random variable $\xi_k$ and the independent increment $W(t_{k+1}) - W(t_k)$. This independence is the mechanism behind all the structural properties of the integral.
[example: The Integral $\int_0^T W \, dW$ via Approximation]
We compute $\int_0^T W(t) \, dW(t)$ by approximating $W(t)$ with simple processes. For a partition $\Pi = \{0 = t_0 < \cdots < t_n = T\}$, the left-endpoint approximation is:
\begin{align*}
S_\Pi = \sum_{k=0}^{n-1} W(t_k)(W(t_{k+1}) - W(t_k)).
\end{align*}
We use the algebraic identity $a(b - a) = \frac{1}{2}(b^2 - a^2) - \frac{1}{2}(b-a)^2$ with $a = W(t_k)$ and $b = W(t_{k+1})$:
\begin{align*}
S_\Pi &= \frac{1}{2} \sum_{k=0}^{n-1} \left(W(t_{k+1})^2 - W(t_k)^2\right) - \frac{1}{2}\sum_{k=0}^{n-1} (W(t_{k+1}) - W(t_k))^2 \\
&= \frac{1}{2} W(T)^2 - \frac{1}{2} \sum_{k=0}^{n-1} (W(t_{k+1}) - W(t_k))^2.
\end{align*}
The first sum telescopes. The second sum converges in $L^2(\Omega)$ to $T$ (this is the quadratic variation of Brownian motion, proved below). Therefore:
\begin{align*}
\int_0^T W(t) \, dW(t) = \frac{1}{2}W(T)^2 - \frac{1}{2}T.
\end{align*}
The correction term $-\frac{1}{2}T$ is absent from the classical Riemann-Stieltjes calculation (where $\int g \, dg = \frac{1}{2}g^2$). It arises because the sum of squared increments does not vanish — it converges to $T$ — and this is the first concrete manifestation of the Itô correction.
Note also that $\mathbb{E}[\int_0^T W \, dW] = \frac{1}{2}\mathbb{E}[W(T)^2] - \frac{1}{2}T = \frac{1}{2}T - \frac{1}{2}T = 0$, consistent with the martingale property.
[/example]
### Extension to $L^2$ Integrands
The Itô integral is extended from simple processes to the full class of square-integrable adapted processes by an isometric density argument.
[definition: Itô Integrable Process]
Let $\mathcal{H}^2(0,T)$ denote the space of all progressively measurable processes $f: [0,T] \times \Omega \to \mathbb{R}$ satisfying:
\begin{align*}
\|f\|_{\mathcal{H}^2}^2 := \mathbb{E}\left[\int_0^T f(t)^2 \, dt\right] < \infty.
\end{align*}
[/definition]
The space $\mathcal{H}^2(0,T)$, equipped with the norm $\|\cdot\|_{\mathcal{H}^2}$, is a [Hilbert space](/page/Hilbert%20Space). The Itô integral for simple processes is an isometry from the subspace of simple processes into $L^2(\Omega)$, and we extend it to all of $\mathcal{H}^2(0,T)$ by [continuity](/page/Continuity). The following theorem collects the fundamental properties.
[theorem: Properties of the Itô Integral]
Let $f \in \mathcal{H}^2(0,T)$. The Itô integral $I(f) := \int_0^T f(t) \, dW(t)$ satisfies:
1. **Linearity:** $I(\alpha f + \beta g) = \alpha I(f) + \beta I(g)$ for $\alpha, \beta \in \mathbb{R}$ and $f, g \in \mathcal{H}^2$.
2. **Itô Isometry:**
\begin{align*}
\mathbb{E}\left[\left(\int_0^T f(t) \, dW(t)\right)^2\right] = \mathbb{E}\left[\int_0^T f(t)^2 \, dt\right].
\end{align*}
3. **Martingale Property:** The process $M(t) := \int_0^t f(s) \, dW(s)$ is an $(\mathcal{F}_t)$-martingale. In particular, $\mathbb{E}[M(t)] = 0$ for all $t$.
4. **Continuous Paths:** The process $t \mapsto M(t)$ has a version with continuous sample paths.
[/theorem]
The Itô isometry deserves emphasis. In classical integration, the $L^2$ norm of the integral is controlled by Hölder or Cauchy-Schwarz estimates that involve the variation of the integrator. Here, the isometry replaces the role of bounded variation: the $L^2(\Omega)$ norm of the stochastic integral equals the $L^2(\Omega \times [0,T])$ norm of the integrand. This identity is a consequence of the orthogonality of the increments $\Delta W_k$ — the same independence that we exploited in the construction.
The martingale property is equally fundamental. It says that the stochastic integral has no systematic drift: the best prediction of $M(T)$ given information up to time $t$ is the current value $M(t)$. This property fails for the Stratonovich integral, which is one reason the Itô formulation is preferred in probability theory.
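Both properties can be checked by simulation for the integrand $f = W$, where the isometry predicts $\mathbb{E}[(\int_0^T W \, dW)^2] = \int_0^T t \, dt = T^2/2$. The following Monte Carlo sketch is illustrative only: the function names, sample sizes, and seed are assumptions, and the integral is replaced by its left-endpoint sum on a uniform partition.

```python
import math
import random


def left_endpoint_integral(n_steps, horizon, rng):
    """One sample of the left-endpoint sum approximating the Ito integral of W dW."""
    dt = horizon / n_steps
    w = 0.0
    total = 0.0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))
        total += w * dw   # integrand uses the value before the increment
        w += dw
    return total


def moments(n_paths=5000, n_steps=50, horizon=1.0, seed=0):
    """Sample mean and sample second moment of the approximate integral."""
    rng = random.Random(seed)
    samples = [left_endpoint_integral(n_steps, horizon, rng) for _ in range(n_paths)]
    mean = sum(samples) / n_paths
    second_moment = sum(s * s for s in samples) / n_paths
    return mean, second_moment


mean, second_moment = moments()
print("sample mean:         ", mean, "(martingale property predicts 0)")
print("sample second moment:", second_moment, "(isometry predicts T^2/2 = 0.5)")
```

For the discretized integrand the exact second moment is $\sum_k t_k \, \Delta t = T^2 (n-1)/(2n)$, slightly below $T^2/2$, so a small downward bias at coarse partitions is expected in addition to sampling noise.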
## Quadratic Variation
The quadratic variation is the key analytical concept that distinguishes stochastic from classical calculus. For a classical $C^1$ function $g$, the sum of squared increments $\sum (g(t_{k+1}) - g(t_k))^2$ converges to zero as the mesh of the partition tends to zero. For Brownian motion, this sum converges to a nonzero quantity — and it is this nonvanishing limit that forces the correction term in Itô's formula.
[definition: Quadratic Variation]
Let $X: [0,T] \times \Omega \to \mathbb{R}$ be a continuous semimartingale. The **quadratic variation** of $X$ on $[0,T]$ is defined as the limit in probability:
\begin{align*}
[X,X](T) := \lim_{|\Pi| \to 0} \sum_{k=0}^{n-1} (X(t_{k+1}) - X(t_k))^2,
\end{align*}
where $\Pi = \{0 = t_0 < \cdots < t_n = T\}$ is a partition and $|\Pi| = \max_k(t_{k+1} - t_k)$ denotes the mesh.
More generally, the **quadratic covariation** of two continuous semimartingales $X$ and $Y$ is:
\begin{align*}
[X,Y](T) := \lim_{|\Pi| \to 0} \sum_{k=0}^{n-1} (X(t_{k+1}) - X(t_k))(Y(t_{k+1}) - Y(t_k)).
\end{align*}
[/definition]
The following result establishes the quadratic variation of Brownian motion and is the computational engine behind Itô's formula.
[theorem: Quadratic Variation of Brownian Motion]
Let $W$ be a standard Brownian motion. Then for every $T > 0$:
\begin{align*}
[W,W](T) = T.
\end{align*}
More precisely, for any [sequence](/page/Sequence) of partitions $\Pi_n$ of $[0,T]$ with $|\Pi_n| \to 0$:
\begin{align*}
\sum_{k=0}^{n-1} (W(t_{k+1}^{(n)}) - W(t_k^{(n)}))^2 \xrightarrow{L^2(\Omega)} T.
\end{align*}
[/theorem]
The proof is a direct second-moment computation. Let $Q_n := \sum_k (\Delta W_k)^2$ where $\Delta W_k = W(t_{k+1}) - W(t_k)$. Since the increments are independent with $\mathbb{E}[(\Delta W_k)^2] = \Delta t_k$ and $\operatorname{Var}((\Delta W_k)^2) = 2(\Delta t_k)^2$ (the fourth moment of a centred Gaussian is $3\sigma^4$, so the variance of the square is $3\sigma^4 - \sigma^4 = 2\sigma^4$):
\begin{align*}
\mathbb{E}[Q_n] &= \sum_k \Delta t_k = T, \\
\operatorname{Var}(Q_n) &= \sum_k 2(\Delta t_k)^2 \le 2 |\Pi_n| \sum_k \Delta t_k = 2T |\Pi_n| \to 0.
\end{align*}
Convergence in $L^2$ follows.
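The two moment formulas in this proof can be compared against simulation. The sketch below assumes a uniform partition, for which $\operatorname{Var}(Q_n) = 2T^2/n$ exactly; the function names, sample sizes, and seed are illustrative.

```python
import math
import random


def squared_increment_sum(n_steps, horizon, rng):
    """One sample of Q_n on a uniform partition of [0, horizon]."""
    dt = horizon / n_steps
    return sum(rng.gauss(0.0, math.sqrt(dt)) ** 2 for _ in range(n_steps))


def mean_and_variance(n_steps, n_paths=2000, horizon=1.0, seed=0):
    """Sample mean and unbiased sample variance of Q_n."""
    rng = random.Random(seed)
    samples = [squared_increment_sum(n_steps, horizon, rng) for _ in range(n_paths)]
    mean = sum(samples) / n_paths
    variance = sum((q - mean) ** 2 for q in samples) / (n_paths - 1)
    return mean, variance


for n_steps in (10, 100):
    mean, variance = mean_and_variance(n_steps)
    print(f"n = {n_steps:3d}: mean = {mean:.4f}, variance = {variance:.5f}, "
          f"2 T^2 / n = {2.0 / n_steps:.5f}")
```

The sample mean should stay near $T = 1$ for every $n$, while the sample variance should shrink in proportion to $1/n$.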
This result is often written in differential shorthand as $dW \cdot dW = dt$. Combined with the heuristic rules $dt \cdot dW = 0$ and $dt \cdot dt = 0$ (which express the fact that terms of order higher than $dt$ are negligible), these multiplication rules determine the correction term in Itô's formula. For an $m$-dimensional Brownian motion $W = (W_1, \dots, W_m)$, the covariation rules are:
\begin{align*}
dW_j \cdot dW_\ell = \delta_{j\ell} \, dt,
\end{align*}
where $\delta_{j\ell}$ denotes the Kronecker symbol.
## Itô Processes and Itô's Formula
### Itô Processes
Having constructed the stochastic integral, we can define the class of processes that serve as the solutions of stochastic differential equations.
[definition: Itô Process]
Let $T > 0$ and $n, m \in \mathbb{N}$. Let $W: [0,T] \times \Omega \to \mathbb{R}^m$ be an $m$-dimensional Brownian motion adapted to $(\mathcal{F}_t)_{t \ge 0}$.
Let $b: [0,T] \times \mathbb{R}^n \to \mathbb{R}^n$ and $\sigma: [0,T] \times \mathbb{R}^n \to \mathbb{R}^{n \times m}$ be measurable functions. An $\mathbb{R}^n$-valued stochastic process $X: [0,T] \times \Omega \to \mathbb{R}^n$ is called an **Itô process** if, for each $i \in \{1, \dots, n\}$:
\begin{align*}
X_i(t) = X_i(0) + \int_0^t b_i(s, X(s)) \, ds + \sum_{j=1}^m \int_0^t \sigma_{ij}(s, X(s)) \, dW_j(s),
\end{align*}
where the first integral is a [Lebesgue integral](/page/Lebesgue%20Integral) and the second is an Itô integral.
[/definition]
In differential notation, the Itô process is written:
\begin{align*}
dX_i = b_i(t, X) \, dt + \sum_{j=1}^m \sigma_{ij}(t, X) \, dW_j.
\end{align*}
The function $b$ is the **drift coefficient** (deterministic trend) and $\sigma$ is the **diffusion coefficient** (random fluctuations). The matrix $a_{ik}(t,x) := \sum_{j=1}^m \sigma_{ij}(t,x) \sigma_{kj}(t,x)$ is called the **diffusion matrix** and governs the local covariance structure of $X$.
### Itô's Formula in $\mathbb{R}^n$
Itô's formula is the stochastic analogue of the chain rule. It describes how a smooth function of an Itô process evolves in time. The key difference from the classical chain rule is the presence of a second-order term involving the diffusion matrix — a direct consequence of the nonvanishing quadratic variation of Brownian motion.
[theorem: Itô's Formula]
Let $X: [0,T] \times \Omega \to \mathbb{R}^n$ be an Itô process as in the preceding definition. Let $f: [0,T] \times \mathbb{R}^n \to \mathbb{R}$ be a function of class $C^{1,2}$ (continuously differentiable in $t$ and twice continuously differentiable in $x$).
Then the real-valued process $Y(t) := f(t, X(t))$ satisfies, for all $t \in [0,T]$:
\begin{align*}
Y(t) &= Y(0) + \int_0^t \partial_t f(s, X(s)) \, ds + \sum_{i=1}^n \int_0^t \partial_{x_i} f(s, X(s)) \, b_i(s, X(s)) \, ds \\
&\quad + \frac{1}{2} \sum_{i=1}^n \sum_{k=1}^n \int_0^t \partial_{x_i x_k} f(s, X(s)) \, a_{ik}(s, X(s)) \, ds \\
&\quad + \sum_{i=1}^n \sum_{j=1}^m \int_0^t \partial_{x_i} f(s, X(s)) \, \sigma_{ij}(s, X(s)) \, dW_j(s),
\end{align*}
where $a_{ik} = \sum_{j=1}^m \sigma_{ij} \sigma_{kj}$.
[/theorem]
In differential shorthand, Itô's formula reads:
\begin{align*}
df = \left(\partial_t f + \sum_i b_i \partial_{x_i} f + \frac{1}{2} \sum_{i,k} a_{ik} \partial_{x_i x_k} f\right) dt + \sum_{i,j} \sigma_{ij} \partial_{x_i} f \, dW_j.
\end{align*}
The coefficient of $dt$ collects three contributions: the time derivative, the drift contribution (which matches the classical chain rule), and the **Itô correction** $\frac{1}{2}\sum_{i,k} a_{ik} \partial_{x_i x_k} f$. This correction term has no classical analogue; it arises because the Taylor expansion of $f$ along a Brownian path must be carried to second order, since $(\Delta W)^2 \sim \Delta t$ rather than being negligible.
The remaining $dW$ term is a stochastic integral, and it is a martingale under sufficient [integrability](/page/Integral) of the integrand. Taking expectations therefore removes it, leaving:
\begin{align*}
\mathbb{E}[f(t, X(t))] = \mathbb{E}[f(0, X(0))] + \mathbb{E}\left[\int_0^t \left(\partial_t f + \sum_i b_i \partial_{x_i} f + \frac{1}{2} \sum_{i,k} a_{ik} \partial_{x_i x_k} f\right) ds\right].
\end{align*}
This identity connects the expected value of $f(t,X(t))$ to the action of the **infinitesimal generator** of the diffusion, $\mathcal{L}f := \sum_i b_i \partial_{x_i} f + \frac{1}{2}\sum_{i,k} a_{ik} \partial_{x_i x_k} f$, and is the starting point for the connection between stochastic calculus and partial differential equations (the Kolmogorov equations).
**Proof idea.** The proof proceeds by applying the deterministic Taylor expansion to $f(t_{k+1}, X(t_{k+1})) - f(t_k, X(t_k))$ over each subinterval of a partition, retaining terms up to second order in the spatial increments $\Delta X_i$. The first-order terms produce the drift and stochastic integral contributions. The second-order terms produce $\frac{1}{2}\sum_{i,k} \partial_{x_i x_k} f \cdot \Delta X_i \Delta X_k$. Using the heuristic $\Delta X_i \Delta X_k \approx a_{ik} \Delta t$ (which is made rigorous via the quadratic variation), summing over the partition, and passing to the limit yields the formula.
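As a one-dimensional check of the formula, take $X = W$ (so $n = m = 1$, $b = 0$, $\sigma = 1$, $a = 1$) and $f(x) = x^2$, with $f'(x) = 2x$ and $f''(x) = 2$:
\begin{align*}
d(W^2) = 2W \, dW + \frac{1}{2} \cdot 2 \cdot 1 \, dt = 2W \, dW + dt.
\end{align*}
Integrating over $[0,T]$ and using $W(0) = 0$ gives $W(T)^2 = 2\int_0^T W \, dW + T$, that is,
\begin{align*}
\int_0^T W(t) \, dW(t) = \frac{1}{2}W(T)^2 - \frac{1}{2}T,
\end{align*}
which agrees with the value obtained earlier from the left-endpoint sums.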
[example: Geometric Brownian Motion]
Consider the one-dimensional Itô process $X$ with drift $b(t,x) = \mu x$ and diffusion $\sigma(t,x) = \sigma x$, where $\mu \in \mathbb{R}$ and $\sigma > 0$ are constants:
\begin{align*}
dX = \mu X \, dt + \sigma X \, dW, \qquad X(0) = x_0 > 0.
\end{align*}
This is the **geometric Brownian motion** (GBM) equation, and it is the foundational model for asset prices in mathematical finance.
To solve the equation, apply Itô's formula to $f(x) = \log x$ (here $f$ does not depend on $t$). We have $f'(x) = 1/x$ and $f''(x) = -1/x^2$. By Itô's formula:
\begin{align*}
d(\log X) &= \frac{1}{X} \, dX + \frac{1}{2}\left(-\frac{1}{X^2}\right)(\sigma X)^2 \, dt \\
&= \frac{1}{X}(\mu X \, dt + \sigma X \, dW) - \frac{1}{2}\sigma^2 \, dt \\
&= \left(\mu - \frac{\sigma^2}{2}\right) dt + \sigma \, dW.
\end{align*}
The second-order Itô correction $-\frac{1}{2}\sigma^2$ shifts the effective growth rate from $\mu$ to $\mu - \sigma^2/2$. Integrating:
\begin{align*}
\log X(t) = \log x_0 + \left(\mu - \frac{\sigma^2}{2}\right)t + \sigma W(t),
\end{align*}
so the explicit solution is:
\begin{align*}
X(t) = x_0 \exp\left(\left(\mu - \frac{\sigma^2}{2}\right)t + \sigma W(t)\right).
\end{align*}
Since $X(t) > 0$ for all $t$ (the exponential is strictly positive), the process never hits zero — a desirable property for a price model. The shift $\mu \to \mu - \sigma^2/2$ means that the expected growth rate of $\log X$ is strictly less than $\mu$: volatility depresses the logarithmic return. This is the **volatility drag** phenomenon, entirely a consequence of the Itô correction.
[/example]
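The volatility drag in the geometric Brownian motion example can be observed by sampling the explicit solution directly. The sketch below is illustrative: the parameter values, function names, and seed are assumptions. It uses the standard fact that $\mathbb{E}[X(t)] = x_0 e^{\mu t}$, which follows from the Gaussian moment generating function, alongside the formula for $\log X(t)$ derived in the example.

```python
import math
import random


def sample_gbm(x0, mu, sigma, t, rng):
    """Draw X(t) from the explicit solution, using W(t) ~ N(0, t)."""
    w_t = rng.gauss(0.0, math.sqrt(t))
    return x0 * math.exp((mu - 0.5 * sigma ** 2) * t + sigma * w_t)


def gbm_statistics(x0=1.0, mu=0.05, sigma=0.2, t=1.0, n_paths=20_000, seed=0):
    """Sample means of X(t) and of log X(t)."""
    rng = random.Random(seed)
    samples = [sample_gbm(x0, mu, sigma, t, rng) for _ in range(n_paths)]
    mean_level = sum(samples) / n_paths
    mean_log = sum(math.log(x) for x in samples) / n_paths
    return mean_level, mean_log


mean_level, mean_log = gbm_statistics()
print("mean of X(1):    ", mean_level, "vs exp(mu) =", math.exp(0.05))
print("mean of log X(1):", mean_log, "vs mu - sigma^2/2 =", 0.05 - 0.5 * 0.2 ** 2)
```

The level grows on average at rate $\mu$, while the logarithm grows at the smaller rate $\mu - \sigma^2/2$; the gap between the two is the Itô correction.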
[example: The Ornstein-Uhlenbeck Process]
The Ornstein-Uhlenbeck (OU) process models a diffusion that is pulled towards a long-term mean. It satisfies:
\begin{align*}
dX = -\theta X \, dt + \sigma \, dW, \qquad X(0) = x_0,
\end{align*}
where $\theta > 0$ is the mean-reversion rate and $\sigma > 0$ is the volatility. To solve this linear SDE, we use the integrating factor $e^{\theta t}$.
Apply Itô's formula to $Y(t) = e^{\theta t} X(t)$, using $f(t,x) = e^{\theta t} x$ with $\partial_t f = \theta e^{\theta t} x$, $\partial_x f = e^{\theta t}$, $\partial_{xx} f = 0$:
\begin{align*}
dY &= \theta e^{\theta t} X \, dt + e^{\theta t} \, dX + 0 \\
&= \theta e^{\theta t} X \, dt + e^{\theta t}(-\theta X \, dt + \sigma \, dW) \\
&= \sigma e^{\theta t} \, dW.
\end{align*}
The drift terms cancel (as designed by the integrating factor), and no Itô correction appears because $\partial_{xx} f = 0$. Integrating:
\begin{align*}
Y(t) = x_0 + \sigma \int_0^t e^{\theta s} \, dW(s),
\end{align*}
so the solution is:
\begin{align*}
X(t) = x_0 e^{-\theta t} + \sigma \int_0^t e^{-\theta(t-s)} \, dW(s).
\end{align*}
Since the stochastic integral of a deterministic function of $s$ against $dW(s)$ is a Gaussian random variable, $X(t)$ is Gaussian with mean $x_0 e^{-\theta t}$ and variance $\frac{\sigma^2}{2\theta}(1 - e^{-2\theta t})$. As $t \to \infty$, the process converges in [distribution](/page/Distribution) to $\mathcal{N}(0, \sigma^2/(2\theta))$, the stationary distribution.
[/example]
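The variance quoted in the Ornstein-Uhlenbeck example follows from the Itô isometry applied to the deterministic integrand $s \mapsto e^{-\theta(t-s)}$:
\begin{align*}
\operatorname{Var}(X(t)) &= \sigma^2 \, \mathbb{E}\left[\left(\int_0^t e^{-\theta(t-s)} \, dW(s)\right)^2\right] = \sigma^2 \int_0^t e^{-2\theta(t-s)} \, ds \\
&= \frac{\sigma^2}{2\theta}\left(1 - e^{-2\theta t}\right).
\end{align*}
Letting $t \to \infty$ gives the stationary variance $\sigma^2/(2\theta)$.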
## Existence and Uniqueness for Stochastic Differential Equations
The examples above relied on explicit solutions, which are available only for special coefficient structures (linear, or amenable to a known transformation). For general SDEs, we require an abstract existence and uniqueness theorem analogous to the Picard-Lindelöf theorem in ODE theory.
[definition: Stochastic Differential Equation]
A **stochastic differential equation** (SDE) on $[0,T]$ is the integral equation:
\begin{align*}
X(t) = X(0) + \int_0^t b(s, X(s)) \, ds + \int_0^t \sigma(s, X(s)) \, dW(s), \qquad t \in [0,T],
\end{align*}
where $b: [0,T] \times \mathbb{R}^n \to \mathbb{R}^n$ is the drift, $\sigma: [0,T] \times \mathbb{R}^n \to \mathbb{R}^{n \times m}$ is the diffusion, $W$ is an $m$-dimensional Brownian motion, and $X(0)$ is an $\mathcal{F}_0$-measurable random variable with $\mathbb{E}[|X(0)|^2] < \infty$.
A **strong solution** is an $(\mathcal{F}_t)$-adapted, continuous process $X$ satisfying this equation almost surely.
[/definition]
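The integral equation suggests a simple time-stepping approximation in which the coefficients are frozen at the left endpoint of each step. This is the Euler-Maruyama scheme, which is not discussed in this article; the sketch below is a hedged illustration for scalar equations, with hypothetical function and parameter names, tested against the closed-form geometric Brownian motion solution driven by the same increments.

```python
import math
import random


def euler_maruyama(drift, diffusion, x0, horizon, n_steps, rng):
    """Approximate a scalar SDE path; returns (approximation of X(T), W(T))."""
    dt = horizon / n_steps
    x = x0
    w = 0.0
    for step in range(n_steps):
        t = step * dt
        dw = rng.gauss(0.0, math.sqrt(dt))
        # Coefficients are frozen at the left endpoint, as in the Ito sums.
        x += drift(t, x) * dt + diffusion(t, x) * dw
        w += dw
    return x, w


mu, sigma, x0, horizon = 0.05, 0.2, 1.0, 1.0
approx, w_end = euler_maruyama(
    lambda t, x: mu * x,
    lambda t, x: sigma * x,
    x0, horizon, 2000, random.Random(0),
)
exact = x0 * math.exp((mu - 0.5 * sigma ** 2) * horizon + sigma * w_end)
print("Euler-Maruyama:", approx)
print("closed form:   ", exact)
```

Evaluating the coefficients before the increment is drawn mirrors the left-endpoint convention of the Itô integral; evaluating them elsewhere in the step would approximate a different equation.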
The standard conditions ensuring well-posedness are the Lipschitz and linear growth conditions, which are the stochastic analogues of the conditions in the Picard-Lindelöf theorem.
[theorem: Existence and Uniqueness of Strong Solutions]
Let $b: [0,T] \times \mathbb{R}^n \to \mathbb{R}^n$ and $\sigma: [0,T] \times \mathbb{R}^n \to \mathbb{R}^{n \times m}$ be measurable and satisfy:
1. **Global Lipschitz condition:** There exists $K > 0$ such that for all $t \in [0,T]$ and $x, y \in \mathbb{R}^n$:
\begin{align*}
|b(t,x) - b(t,y)| + |\sigma(t,x) - \sigma(t,y)| \le K|x - y|.
\end{align*}
2. **Linear growth condition:** There exists $K > 0$ such that for all $t \in [0,T]$ and $x \in \mathbb{R}^n$:
\begin{align*}
|b(t,x)| + |\sigma(t,x)| \le K(1 + |x|).
\end{align*}
Then for any $\mathcal{F}_0$-measurable $X(0)$ with $\mathbb{E}[|X(0)|^2] < \infty$, there exists a unique strong solution $X$ satisfying:
\begin{align*}
\mathbb{E}\left[\sup_{0 \le t \le T} |X(t)|^2\right] < \infty.
\end{align*}
[/theorem]
The proof follows the Picard iteration scheme: define $X^{(0)}(t) = X(0)$ and recursively set:
\begin{align*}
X^{(k+1)}(t) = X(0) + \int_0^t b(s, X^{(k)}(s)) \, ds + \int_0^t \sigma(s, X^{(k)}(s)) \, dW(s).
\end{align*}
The Lipschitz condition and the Itô isometry combine to give:
\begin{align*}
\mathbb{E}\left[\sup_{0 \le s \le t} |X^{(k+1)}(s) - X^{(k)}(s)|^2\right] \le C \int_0^t \mathbb{E}\left[\sup_{0 \le r \le s}|X^{(k)}(r) - X^{(k-1)}(r)|^2\right] ds,
\end{align*}
where the Burkholder-Davis-Gundy inequality is used to control the supremum of the martingale part. Iterating this bound gives terms of order $(CT)^k/k!$, which are summable, so the iterates converge. The linear growth condition ensures the solution does not explode in finite time.
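The iteration can be made explicit. Write $\Delta_k(t) := \mathbb{E}[\sup_{0 \le s \le t} |X^{(k+1)}(s) - X^{(k)}(s)|^2]$, so that the bound above reads $\Delta_k(t) \le C \int_0^t \Delta_{k-1}(s) \, ds$, and assume $\Delta_0(t) \le A$ on $[0,T]$ for some finite constant $A$ (this follows from the linear growth condition and $\mathbb{E}[|X(0)|^2] < \infty$). By induction on $k$:
\begin{align*}
\Delta_k(t) \le C \int_0^t A \, \frac{(Cs)^{k-1}}{(k-1)!} \, ds = A \, \frac{(Ct)^k}{k!}.
\end{align*}
Consequently,
\begin{align*}
\sum_{k=0}^{\infty} \Delta_k(T)^{1/2} \le A^{1/2} \sum_{k=0}^{\infty} \frac{(CT)^{k/2}}{(k!)^{1/2}} < \infty,
\end{align*}
so the iterates form a Cauchy sequence in the norm $\left(\mathbb{E}[\sup_{0 \le t \le T} |\cdot|^2]\right)^{1/2}$, and the limit is the candidate solution.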
The Lipschitz condition is not merely a technical convenience: some regularity of the coefficients is essential for uniqueness. There exist SDEs with non-Lipschitz coefficients (e.g., $dX = |X|^{\alpha} \, dW$ with $0 < \alpha < 1/2$ and $X(0) = 0$) for which the zero process is a solution but not the only one, so uniqueness fails even in law. Extensions to locally Lipschitz coefficients or the Yamada-Watanabe conditions (which relax Lipschitz to Hölder continuity of order at least $1/2$ for the diffusion coefficient) are important refinements but require more delicate arguments.
## The Martingale Representation Theorem
A remarkable feature of Brownian filtrations is that every square-integrable martingale can be represented as a stochastic integral. This result underpins the theory of hedging in mathematical finance and the Girsanov change of measure.
[theorem: Martingale Representation]
Let $(\mathcal{F}_t)_{t \ge 0}$ be the natural filtration of a standard Brownian motion $W$, augmented by the $\mathbb{P}$-null sets. Let $M$ be a continuous $(\mathcal{F}_t)$-martingale with $\mathbb{E}[M(T)^2] < \infty$. Then there exists a unique process $f \in \mathcal{H}^2(0,T)$ such that:
\begin{align*}
M(t) = M(0) + \int_0^t f(s) \, dW(s) \qquad \text{for all } t \in [0,T].
\end{align*}
[/theorem]
The theorem says that in the Brownian world, the stochastic integral with respect to $W$ is the *only* source of randomness. Every square-integrable martingale is a constant plus a stochastic integral, so there are no "hidden" sources of randomness beyond $W$ itself. The proof relies on the fact that the filtration $(\mathcal{F}_t)$ is generated by $W$, and proceeds by showing that the stochastic integrals $\{\int_0^T f \, dW : f \in \mathcal{H}^2\}$ form a closed subspace of $L^2(\Omega, \mathcal{F}_T, \mathbb{P})$ whose orthogonal complement consists only of the constants. Since stochastic integrals have mean zero, every square-integrable $\mathcal{F}_T$-measurable random variable $\xi$ then decomposes as $\xi = \mathbb{E}[\xi] + \int_0^T f \, dW$.
In mathematical finance, this result implies that in the Black-Scholes model (where the stock price is a geometric Brownian motion), every contingent claim can be replicated by a dynamic trading strategy — the integrand $f$ is the hedging portfolio.
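As a concrete instance of the representation, consider the martingale $M(t) := \mathbb{E}[W(T)^2 \mid \mathcal{F}_t]$. Writing $W(T) = W(t) + (W(T) - W(t))$ and using the independence of the increment from $\mathcal{F}_t$ gives $M(t) = W(t)^2 + (T - t)$. Applying Itô's formula to $f(t,x) = x^2 + T - t$, with $\partial_t f = -1$, $\partial_x f = 2x$, and $\partial_{xx} f = 2$:
\begin{align*}
dM = \left(-1 + \frac{1}{2} \cdot 2\right) dt + 2W \, dW = 2W \, dW.
\end{align*}
Hence $M(0) = T$ and the representing integrand is $f(s) = 2W(s)$:
\begin{align*}
M(t) = T + \int_0^t 2W(s) \, dW(s),
\end{align*}
with $\mathbb{E}[\int_0^T (2W(s))^2 \, ds] = 2T^2 < \infty$, so $f \in \mathcal{H}^2(0,T)$ as the theorem requires.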
## References
K. Itô, *On Stochastic Differential Equations*, Memoirs of the American Mathematical Society (1951).
B. Øksendal, *Stochastic Differential Equations: An Introduction with Applications*, 6th edition (2003).
I. Karatzas and S. E. Shreve, *Brownian Motion and Stochastic Calculus*, 2nd edition (1991).
P. Protter, *Stochastic Integration and Differential Equations*, 2nd edition (2005).
Proposed Changes
Stochastic calculus provides the resolution. It constructs an integral with respect to Brownian motion (and more general semimartingales), derives a modified chain rule — Itô's formula — that accounts for the roughness of the integrator, and establishes the existence and uniqueness theory for stochastic differential equations. The additional second-order correction term that appears in Itô's formula, absent in classical calculus, arises directly from the nontrivial quadratic variation of Brownian motion and is the signature feature of the theory.
Throughout this article, we work on a fixed filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, \mathbb{P})$ satisfying the **usual conditions**: the filtration is right-continuous and $\mathcal{F}_0$ contains all $\mathbb{P}$-null [sets](/page/Set). The symbol $W$ always denotes a standard Brownian motion adapted to $(\mathcal{F}_t)$.
[motivation]
### The Failure of Riemann-Stieltjes Integration
The Riemann-Stieltjes integral $\int_0^T f(t) \, dg(t)$ is well-defined when the integrator $g$ has bounded variation on $[0,T]$. Recall that the total variation of a function $g: [0,T] \to \mathbb{R}$ is defined as:
\begin{align*}
V_T(g) := \sup_{\Pi} \sum_{k=0}^{n-1} |g(t_{k+1}) - g(t_k)|,
\end{align*}
where the supremum is over all partitions $\Pi = \{0 = t_0 < t_1 < \cdots < t_n = T\}$.
For a standard Brownian motion $W$, the increments $W(t_{k+1}) - W(t_k)$ are independent Gaussian random variables with variance $t_{k+1} - t_k$. For any partition with mesh $|\Pi| = \max_k(t_{k+1} - t_k)$:
\begin{align*}
\sum_{k=0}^{n-1} |W(t_{k+1}) - W(t_k)| \ge \left(\sum_{k=0}^{n-1} |W(t_{k+1}) - W(t_k)|^2\right)^{1/2},
\end{align*}
by the Cauchy-Schwarz inequality. The sum of squared increments converges in $L^2$ to $T$ (as we show below in the discussion of quadratic variation). Since $\sqrt{T} > 0$, the total variation diverges as the partition is refined. In fact, $V_T(W) = +\infty$ almost surely for every $T > 0$.
The pathological roughness of Brownian paths means that even the simplest integral $\int_0^T W(t) \, dW(t)$ cannot be defined by Riemann-Stieltjes theory. Any construction of stochastic integrals must therefore abandon the bounded variation framework entirely.
### The Choice of Evaluation Point: Why Itô and Not Stratonovich
Even granting that the Riemann-Stieltjes integral is undefined, one might attempt to define the integral as a limit of Riemann-type sums. For a partition $\Pi = \{0 = t_0 < \cdots < t_n = T\}$ and a process $f$, consider sums of the form:
\begin{align*}
S_\Pi(\alpha) := \sum_{k=0}^{n-1} f(\alpha t_{k+1} + (1-\alpha) t_k) \, (W(t_{k+1}) - W(t_k)),
\end{align*}
where $\alpha \in [0,1]$ controls the evaluation point within each subinterval.
A striking feature of stochastic integration — and a fundamental departure from classical analysis — is that the limit depends on $\alpha$. For deterministic integrands and finite-variation integrators, all evaluation points give the same limit, but the infinite variation of Brownian motion creates a discrepancy.
The two principal choices are:
- **$\alpha = 0$ (left endpoint):** This yields the **Itô integral**. Evaluating $f$ at the left endpoint $t_k$ ensures that $f(t_k)$ is $\mathcal{F}_{t_k}$-measurable and hence independent of the future increment $W(t_{k+1}) - W(t_k)$. This independence is the source of the martingale property and the Itô isometry — the two structural pillars of the theory.
- **$\alpha = 1/2$ (midpoint):** This yields the **Stratonovich integral**, denoted $\int f \circ dW$. The midpoint evaluation restores the classical chain rule (no second-order correction), but the resulting integral is not a martingale, and the isometry fails. In applications where the noise arises as a limit of smooth approximations (e.g., the Wong-Zakai theorem), the Stratonovich integral is the natural object.
We develop the Itô theory here, as the martingale and isometry properties make it the correct framework for mathematical finance, filtering theory, and the general theory of stochastic differential equations. The Stratonovich integral can always be recovered from the Itô integral by a correction term involving the quadratic covariation.
### From Sums to $L^2$ [Limits](/page/Limit)
The Itô integral is constructed not by pointwise limits of Riemann sums (which may fail to converge pathwise), but by $L^2(\Omega)$ limits. The key insight is that for $\alpha = 0$, the independence between $f(t_k)$ and $\Delta W_k := W(t_{k+1}) - W(t_k)$ produces the identity:
\begin{align*}
\mathbb{E}\left[\left(\sum_{k} f(t_k) \Delta W_k\right)^2\right] = \sum_{k} \mathbb{E}[f(t_k)^2] (t_{k+1} - t_k),
\end{align*}
which in the limit becomes the **Itô isometry** $\mathbb{E}[(\int f \, dW)^2] = \mathbb{E}[\int f^2 \, dt]$. This isometry identifies the stochastic integral as an isometric embedding of $L^2(\Omega \times [0,T])$ into $L^2(\Omega)$, and the construction proceeds by density: define the integral for simple (step) processes, verify the isometry, and extend to the closure.
[/motivation]
## The Itô Integral
### Construction for Simple Processes
The construction begins with the simplest class of integrands: processes that are piecewise constant and adapted.
[definition: Simple Process]
A **simple process** (or elementary process) is a stochastic process $f: [0,T] \times \Omega \to \mathbb{R}$ of the form:
\begin{align*}
f(t, \omega) = \sum_{k=0}^{n-1} \xi_k(\omega) \, \mathbb{1}_{(t_k, t_{k+1}]}(t),
\end{align*}
where $0 = t_0 < t_1 < \cdots < t_n = T$ is a partition and each $\xi_k: \Omega \to \mathbb{R}$ is a bounded, $\mathcal{F}_{t_k}$-measurable random variable.
[/definition]
For a simple process $f$, the Itô integral is defined algebraically as the finite sum:
\begin{align*}
\int_0^T f(t) \, dW(t) := \sum_{k=0}^{n-1} \xi_k \, (W(t_{k+1}) - W(t_k)).
\end{align*}
Each term in this sum is the product of an $\mathcal{F}_{t_k}$-measurable random variable $\xi_k$ and the independent increment $W(t_{k+1}) - W(t_k)$. This independence is the mechanism behind all the structural properties of the integral.
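The finite sum above translates directly into code. The following is a minimal sketch, assuming a Brownian path sampled at the partition points with Python's standard `random` module; the helper names `brownian_path` and `ito_integral_simple` are hypothetical and chosen only for illustration.

```python
import math
import random


def brownian_path(times, rng):
    """Sample W at the given increasing times (times[0] == 0, W(0) = 0)."""
    w = [0.0]
    for t_prev, t_next in zip(times, times[1:]):
        w.append(w[-1] + rng.gauss(0.0, math.sqrt(t_next - t_prev)))
    return w


def ito_integral_simple(xi, w):
    """Sum of xi[k] * (W(t_{k+1}) - W(t_k)) over the partition."""
    if len(xi) != len(w) - 1:
        raise ValueError("need one coefficient per subinterval")
    return sum(x * (w[k + 1] - w[k]) for k, x in enumerate(xi))


rng = random.Random(0)
times = [0.0, 0.25, 0.5, 1.0]
w = brownian_path(times, rng)
# xi[k] depends only on the path up to t_k; clipping keeps it bounded.
xi = [max(-1.0, min(1.0, w[k])) for k in range(len(times) - 1)]
value = ito_integral_simple(xi, w)
print(value)
```

Each coefficient `xi[k]` is computed from `w[k]` alone, which mirrors the requirement that $\xi_k$ be $\mathcal{F}_{t_k}$-measurable.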
[example: The Integral $\int_0^T W \, dW$ via Approximation]
We compute $\int_0^T W(t) \, dW(t)$ by approximating $W(t)$ with simple processes. For a partition $\Pi = \{0 = t_0 < \cdots < t_n = T\}$, the left-endpoint approximation is:
\begin{align*}
S_\Pi = \sum_{k=0}^{n-1} W(t_k)(W(t_{k+1}) - W(t_k)).
\end{align*}
We use the algebraic identity $a(b - a) = \frac{1}{2}(b^2 - a^2) - \frac{1}{2}(b-a)^2$ with $a = W(t_k)$ and $b = W(t_{k+1})$:
\begin{align*}
S_\Pi &= \frac{1}{2} \sum_{k=0}^{n-1} \left(W(t_{k+1})^2 - W(t_k)^2\right) - \frac{1}{2}\sum_{k=0}^{n-1} (W(t_{k+1}) - W(t_k))^2 \\
&= \frac{1}{2} W(T)^2 - \frac{1}{2} \sum_{k=0}^{n-1} (W(t_{k+1}) - W(t_k))^2.
\end{align*}
The first sum telescopes. The second sum converges in $L^2(\Omega)$ to $T$ (this is the quadratic variation of Brownian motion, proved below). Therefore:
\begin{align*}
\int_0^T W(t) \, dW(t) = \frac{1}{2}W(T)^2 - \frac{1}{2}T.
\end{align*}
The correction term $-\frac{1}{2}T$ is absent from the classical Riemann-Stieltjes calculation (where $\int g \, dg = \frac{1}{2}g^2$). It arises because the sum of squared increments does not vanish — it converges to $T$ — and this is the first concrete manifestation of the Itô correction.
Note also that $\mathbb{E}[\int_0^T W \, dW] = \frac{1}{2}\mathbb{E}[W(T)^2] - \frac{1}{2}T = \frac{1}{2}T - \frac{1}{2}T = 0$, consistent with the martingale property.
[/example]
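The algebra in this example can be checked numerically on a single simulated path. The sketch below is illustrative only: it assumes a uniform partition, and it uses the average of the two endpoint values of $W$ as a common stand-in for midpoint evaluation (an assumption of this sketch, not something taken from the text above).

```python
import math
import random


def endpoint_sums(n_steps, horizon, seed):
    """Return (left_sum, trapezoid_sum, squared_increments, w_T) for one path."""
    rng = random.Random(seed)
    dt = horizon / n_steps
    w = 0.0
    left = trap = squares = 0.0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))
        left += w * dw                 # integrand at the left endpoint (Itô)
        trap += (w + 0.5 * dw) * dw    # average of the two endpoint values
        squares += dw * dw
        w += dw
    return left, trap, squares, w


left, trap, squares, w_T = endpoint_sums(n_steps=20_000, horizon=1.0, seed=1)
print(left, 0.5 * w_T**2 - 0.5 * squares)   # equal up to rounding
print(trap, 0.5 * w_T**2)                   # equal up to rounding
print(squares)                              # close to the horizon T = 1
```

The left-endpoint sum reproduces $\frac{1}{2}W(T)^2 - \frac{1}{2}\sum_k (\Delta W_k)^2$ exactly, while the symmetric sum telescopes to $\frac{1}{2}W(T)^2$ with no correction.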
### Extension to $L^2$ Integrands
The Itô integral is extended from simple processes to the full class of square-integrable adapted processes by an isometric density argument.
[definition: Itô Integrable Process]
Let $\mathcal{H}^2(0,T)$ denote the space of all progressively measurable processes $f: [0,T] \times \Omega \to \mathbb{R}$ satisfying:
\begin{align*}
\|f\|_{\mathcal{H}^2}^2 := \mathbb{E}\left[\int_0^T f(t)^2 \, dt\right] < \infty.
\end{align*}
[/definition]
The space $\mathcal{H}^2(0,T)$, equipped with the norm $\|\cdot\|_{\mathcal{H}^2}$, is a [Hilbert space](/page/Hilbert%20Space). The Itô integral for simple processes is an isometry from the subspace of simple processes into $L^2(\Omega)$. Because simple processes are dense in $\mathcal{H}^2(0,T)$, the integral extends uniquely to all of $\mathcal{H}^2(0,T)$ by [continuity](/page/Continuity). The following theorem collects the fundamental properties.
[theorem: Properties of the Itô Integral]
Let $f \in \mathcal{H}^2(0,T)$. The Itô integral $I(f) := \int_0^T f(t) \, dW(t)$ satisfies:
1. **Linearity:** $I(\alpha f + \beta g) = \alpha I(f) + \beta I(g)$ for $\alpha, \beta \in \mathbb{R}$ and $f, g \in \mathcal{H}^2$.
2. **Itô Isometry:**
\begin{align*}
\mathbb{E}\left[\left(\int_0^T f(t) \, dW(t)\right)^2\right] = \mathbb{E}\left[\int_0^T f(t)^2 \, dt\right].
\end{align*}
3. **Martingale Property:** The process $M(t) := \int_0^t f(s) \, dW(s)$ is an $(\mathcal{F}_t)$-martingale with $M(0) = 0$. In particular, $\mathbb{E}[M(t)] = 0$ for all $t$.
4. **Continuous Paths:** The process $t \mapsto M(t)$ has a version with continuous sample paths.
[/theorem]
The Itô isometry deserves emphasis. In classical integration, the $L^2$ norm of the integral is controlled by Hölder or Cauchy-Schwarz estimates that involve the variation of the integrator. Here, the isometry replaces the role of bounded variation: the $L^2(\Omega)$ norm of the stochastic integral equals the $L^2(\Omega \times [0,T])$ norm of the integrand. This identity is a consequence of the orthogonality of the increments $\Delta W_k$ — the same independence that we exploited in the construction.
The martingale property is equally fundamental. It says that the stochastic integral has no systematic drift: the best prediction of $M(T)$ given information up to time $t$ is the current value $M(t)$. This property fails for the Stratonovich integral, which is one reason the Itô formulation is preferred in probability theory.
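Both properties can be probed by Monte Carlo for the integrand $f = W$ on $[0,1]$, where the right-hand side of the isometry is $\mathbb{E}[\int_0^1 W(t)^2 \, dt] = \int_0^1 t \, dt = 1/2$. This is a rough sketch under stated assumptions: left-endpoint sums on a uniform grid stand in for the integral, and the path and step counts are arbitrary choices, so the printed values agree only up to sampling and discretisation error.

```python
import math
import random


def ito_sum_w_dw(n_steps, horizon, rng):
    """Left-endpoint sum approximating the Itô integral of W against dW."""
    dt = horizon / n_steps
    sd = math.sqrt(dt)
    w = total = 0.0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, sd)
        total += w * dw
        w += dw
    return total


def isometry_check(n_paths=4000, n_steps=100, horizon=1.0, seed=2):
    """Sample mean and second moment of the approximate integral."""
    rng = random.Random(seed)
    samples = [ito_sum_w_dw(n_steps, horizon, rng) for _ in range(n_paths)]
    mean = sum(samples) / n_paths
    second_moment = sum(s * s for s in samples) / n_paths
    return mean, second_moment


mean, second_moment = isometry_check()
# Mean should be near 0 (martingale property); second moment near T^2 / 2.
print(mean, second_moment, 1.0**2 / 2)
```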
## Quadratic Variation
The quadratic variation is the key analytical concept that distinguishes stochastic from classical calculus. For a classical $C^1$ function $g$, the sum of squared increments $\sum (g(t_{k+1}) - g(t_k))^2$ converges to zero as the mesh of the partition tends to zero. For Brownian motion, this sum converges to a nonzero quantity — and it is this nonvanishing limit that forces the correction term in Itô's formula.
[definition: Quadratic Variation]
Let $X: [0,T] \times \Omega \to \mathbb{R}$ be a continuous semimartingale. The **quadratic variation** of $X$ on $[0,T]$ is defined as the limit in probability:
\begin{align*}
[X,X](T) := \lim_{|\Pi| \to 0} \sum_{k=0}^{n-1} (X(t_{k+1}) - X(t_k))^2,
\end{align*}
where $\Pi = \{0 = t_0 < \cdots < t_n = T\}$ is a partition and $|\Pi| = \max_k(t_{k+1} - t_k)$ denotes the mesh.
More generally, the **quadratic covariation** of two continuous semimartingales $X$ and $Y$ is:
\begin{align*}
[X,Y](T) := \lim_{|\Pi| \to 0} \sum_{k=0}^{n-1} (X(t_{k+1}) - X(t_k))(Y(t_{k+1}) - Y(t_k)).
\end{align*}
[/definition]
The following result establishes the quadratic variation of Brownian motion and is the computational engine behind Itô's formula.
[theorem: Quadratic Variation of Brownian Motion]
Let $W$ be a standard Brownian motion. Then for every $T > 0$:
\begin{align*}
[W,W](T) = T.
\end{align*}
More precisely, for any [sequence](/page/Sequence) of partitions $\Pi_n$ of $[0,T]$ with $|\Pi_n| \to 0$:
\begin{align*}
\sum_{k=0}^{n-1} (W(t_{k+1}^{(n)}) - W(t_k^{(n)}))^2 \xrightarrow{L^2(\Omega)} T.
\end{align*}
[/theorem]
The proof is a direct second-moment computation. Let $Q_n := \sum_k (\Delta W_k)^2$ where $\Delta W_k = W(t_{k+1}) - W(t_k)$. Since the increments are independent with $\mathbb{E}[(\Delta W_k)^2] = \Delta t_k$ and $\operatorname{Var}((\Delta W_k)^2) = 2(\Delta t_k)^2$ (the fourth moment of a centred Gaussian is $3\sigma^4$, so the variance of the square is $3\sigma^4 - \sigma^4 = 2\sigma^4$):
\begin{align*}
\mathbb{E}[Q_n] &= \sum_k \Delta t_k = T, \\
\operatorname{Var}(Q_n) &= \sum_k 2(\Delta t_k)^2 \le 2 |\Pi_n| \sum_k \Delta t_k = 2T |\Pi_n| \to 0.
\end{align*}
Convergence in $L^2$ follows.
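The two moment computations can be compared with simulation. The sketch below assumes uniform partitions of $[0,1]$, for which the bound $2T|\Pi_n|$ equals the variance $2T^2/n$ exactly; the sample sizes are arbitrary and the function names are hypothetical.

```python
import math
import random


def sum_of_squared_increments(n_steps, horizon, rng):
    """One sample of Q_n on a uniform partition with n_steps intervals."""
    sd = math.sqrt(horizon / n_steps)
    return sum(rng.gauss(0.0, sd) ** 2 for _ in range(n_steps))


def qv_statistics(n_steps, horizon=1.0, n_paths=400, seed=3):
    """Sample mean and sample variance of Q_n over independent paths."""
    rng = random.Random(seed)
    q = [sum_of_squared_increments(n_steps, horizon, rng) for _ in range(n_paths)]
    mean = sum(q) / n_paths
    var = sum((x - mean) ** 2 for x in q) / (n_paths - 1)
    return mean, var


for n_steps in (10, 100, 1000):
    mean, var = qv_statistics(n_steps)
    bound = 2.0 * 1.0 * (1.0 / n_steps)   # 2 * T * mesh with T = 1
    print(n_steps, mean, var, bound)
```

The sample mean stays near $T = 1$ for every mesh, while the sample variance shrinks in proportion to the mesh, as the bound predicts.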
This result is often written in differential shorthand as $dW \cdot dW = dt$. Combined with the heuristic rules $dt \cdot dW = 0$ and $dt \cdot dt = 0$ (which express the fact that terms of order higher than $dt$ are negligible), these multiplication rules determine the correction term in Itô's formula. For an $m$-dimensional Brownian motion $W = (W_1, \dots, W_m)$, the covariation rules are:
\begin{align*}
dW_j \cdot dW_\ell = \delta_{j\ell} \, dt,
\end{align*}
where $\delta_{j\ell}$ denotes the Kronecker symbol.
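A short simulation illustrates the off-diagonal rule $dW_1 \cdot dW_2 = 0$ alongside the diagonal rule. This is a sketch under the assumption of two independent components sampled on a uniform grid; `covariation_sums` is a hypothetical helper.

```python
import math
import random


def covariation_sums(n_steps, horizon, seed):
    """Discrete [W1,W1], [W1,W2], [W2,W2] for two independent Brownian motions."""
    rng = random.Random(seed)
    sd = math.sqrt(horizon / n_steps)
    s11 = s12 = s22 = 0.0
    for _ in range(n_steps):
        dw1 = rng.gauss(0.0, sd)
        dw2 = rng.gauss(0.0, sd)
        s11 += dw1 * dw1
        s12 += dw1 * dw2
        s22 += dw2 * dw2
    return s11, s12, s22


s11, s12, s22 = covariation_sums(n_steps=20_000, horizon=1.0, seed=4)
print(s11, s12, s22)   # roughly T, 0, T with T = 1
```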
## Itô Processes and Itô's Formula
### Itô Processes
Having constructed the stochastic integral, we can define the class of processes that serve as the solutions of stochastic differential equations.
[definition: Itô Process]
Let $T > 0$ and $n, m \in \mathbb{N}$. Let $W: [0,T] \times \Omega \to \mathbb{R}^m$ be an $m$-dimensional Brownian motion adapted to $(\mathcal{F}_t)_{t \ge 0}$.
Let $b: [0,T] \times \mathbb{R}^n \to \mathbb{R}^n$ and $\sigma: [0,T] \times \mathbb{R}^n \to \mathbb{R}^{n \times m}$ be measurable functions. An $\mathbb{R}^n$-valued stochastic process $X: [0,T] \times \Omega \to \mathbb{R}^n$ is called an **Itô process** if, for each $i \in \{1, \dots, n\}$:
\begin{align*}
X_i(t) = X_i(0) + \int_0^t b_i(s, X(s)) \, ds + \sum_{j=1}^m \int_0^t \sigma_{ij}(s, X(s)) \, dW_j(s),
\end{align*}
where the first integral is a [Lebesgue integral](/page/Lebesgue%20Integral) and the second is an Itô integral.
[/definition]
In differential notation, the Itô process is written:
\begin{align*}
dX_i = b_i(t, X) \, dt + \sum_{j=1}^m \sigma_{ij}(t, X) \, dW_j.
\end{align*}
The function $b$ is the **drift coefficient** (deterministic trend) and $\sigma$ is the **diffusion coefficient** (random fluctuations). The matrix $a_{ik}(t,x) := \sum_{j=1}^m \sigma_{ij}(t,x) \sigma_{kj}(t,x)$ is called the **diffusion matrix** and governs the local covariance structure of $X$.
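In matrix form the definition reads $a = \sigma \sigma^{\top}$, so $a$ is symmetric and positive semidefinite. A small sketch, using a made-up $2 \times 2$ coefficient purely for illustration:

```python
def diffusion_matrix(sigma):
    """Return a = sigma * sigma^T for an n-by-m matrix given as nested lists."""
    n = len(sigma)
    m = len(sigma[0])
    return [
        [sum(sigma[i][j] * sigma[k][j] for j in range(m)) for k in range(n)]
        for i in range(n)
    ]


# Hypothetical 2-by-2 diffusion coefficient, chosen only for illustration.
sigma = [[1.0, 0.0],
         [0.5, 2.0]]
a = diffusion_matrix(sigma)
print(a)   # [[1.0, 0.5], [0.5, 4.25]]
```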
### Itô's Formula in $\mathbb{R}^n$
Itô's formula is the stochastic analogue of the chain rule. It describes how a smooth function of an Itô process evolves in time. The key difference from the classical chain rule is the presence of a second-order term involving the diffusion matrix — a direct consequence of the nonvanishing quadratic variation of Brownian motion.
[theorem: Itô's Formula]
Let $X: [0,T] \times \Omega \to \mathbb{R}^n$ be an Itô process as in the preceding definition. Let $f: [0,T] \times \mathbb{R}^n \to \mathbb{R}$ be a function of class $C^{1,2}$ (continuously differentiable in $t$ and twice continuously differentiable in $x$).
Then the real-valued process $Y(t) := f(t, X(t))$ satisfies, for all $t \in [0,T]$:
\begin{align*}
Y(t) &= Y(0) + \int_0^t \partial_t f(s, X(s)) \, ds + \sum_{i=1}^n \int_0^t \partial_{x_i} f(s, X(s)) \, b_i(s, X(s)) \, ds \\
&\quad + \frac{1}{2} \sum_{i=1}^n \sum_{k=1}^n \int_0^t \partial_{x_i x_k} f(s, X(s)) \, a_{ik}(s, X(s)) \, ds \\
&\quad + \sum_{i=1}^n \sum_{j=1}^m \int_0^t \partial_{x_i} f(s, X(s)) \, \sigma_{ij}(s, X(s)) \, dW_j(s),
\end{align*}
where $a_{ik} = \sum_{j=1}^m \sigma_{ij} \sigma_{kj}$.
[/theorem]
In differential shorthand, Itô's formula reads:
\begin{align*}
df = \left(\partial_t f + \sum_i b_i \partial_{x_i} f + \frac{1}{2} \sum_{i,k} a_{ik} \partial_{x_i x_k} f\right) dt + \sum_{i,j} \sigma_{ij} \partial_{x_i} f \, dW_j.
\end{align*}
The $dt$ coefficient collects three terms: the time derivative, the drift contribution (which matches the classical chain rule), and the **Itô correction** $\frac{1}{2}\sum_{i,k} a_{ik} \partial_{x_i x_k} f$. This correction term has no classical analogue; it arises because the Taylor expansion of $f$ along a Brownian path must be carried to second order, since $(\Delta W)^2 \sim \Delta t$ rather than being negligible.
The $dW$ term is a stochastic integral and, given sufficient [integrability](/page/Integral) of its integrand, a martingale starting at zero. Its expectation then vanishes, leaving:
\begin{align*}
\mathbb{E}[f(t, X(t))] = \mathbb{E}[f(0, X(0))] + \mathbb{E}\left[\int_0^t \left(\partial_t f + \sum_i b_i \partial_{x_i} f + \frac{1}{2} \sum_{i,k} a_{ik} \partial_{x_i x_k} f\right) ds\right].
\end{align*}
This identity connects the expected value of $f(t,X(t))$ to the action of the **infinitesimal generator** of the diffusion, $\mathcal{L}f := \sum_i b_i \partial_{x_i} f + \frac{1}{2}\sum_{i,k} a_{ik} \partial_{x_i x_k} f$, and is the starting point for the connection between stochastic calculus and partial differential equations (the Kolmogorov equations).
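As a quick check of the expectation identity, take $n = m = 1$, $X = W$ (so $b = 0$, $\sigma = 1$, $a = 1$) and $f(x) = x^4$, for which $\partial_t f = 0$ and $\partial_{xx} f = 12x^2$. The martingale term may be dropped here because Brownian motion has finite moments of every order:
\begin{align*}
\mathbb{E}[W(t)^4] = 0 + \mathbb{E}\left[\int_0^t \frac{1}{2} \cdot 12 \, W(s)^2 \, ds\right] = 6 \int_0^t \mathbb{E}[W(s)^2] \, ds = 6 \int_0^t s \, ds = 3t^2.
\end{align*}
This agrees with the Gaussian fourth moment $3\sigma^4$ (with $\sigma^2 = t$) used in the proof of the quadratic variation theorem.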
**Proof idea.** The proof proceeds by applying the deterministic Taylor expansion to $f(t_{k+1}, X(t_{k+1})) - f(t_k, X(t_k))$ over each subinterval of a partition, retaining terms up to second order in the spatial increments $\Delta X_i$. The first-order terms produce the drift and stochastic integral contributions. The second-order terms produce $\frac{1}{2}\sum_{i,k} \partial_{x_i x_k} f \cdot \Delta X_i \Delta X_k$. Using the heuristic $\Delta X_i \Delta X_k \approx a_{ik} \Delta t$ (which is made rigorous via the quadratic variation), summing over the partition, and passing to the limit yields the formula.
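The formula can be sanity-checked against the earlier computation of $\int_0^T W \, dW$. Take $n = m = 1$, $X = W$ (so $b = 0$, $\sigma = 1$, $a = 1$) and $f(x) = x^2$, with $\partial_t f = 0$, $\partial_x f = 2x$, $\partial_{xx} f = 2$:
\begin{align*}
W(t)^2 = 0 + \frac{1}{2}\int_0^t 2 \, ds + \int_0^t 2 W(s) \, dW(s) = t + 2\int_0^t W(s) \, dW(s).
\end{align*}
Rearranging gives $\int_0^t W(s) \, dW(s) = \frac{1}{2}W(t)^2 - \frac{1}{2}t$, which matches the result obtained from the left-endpoint sums.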
[example: Geometric Brownian Motion]
Consider the one-dimensional Itô process $X$ with drift $b(t,x) = \mu x$ and diffusion $\sigma(t,x) = \sigma x$, where $\mu \in \mathbb{R}$ and $\sigma > 0$ are constants:
\begin{align*}
dX = \mu X \, dt + \sigma X \, dW, \qquad X(0) = x_0 > 0.
\end{align*}
This is the **geometric Brownian motion** (GBM) equation, and it is the foundational model for asset prices in mathematical finance.
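To solve the equation, apply Itô's formula to $f(x) = \log x$ (here $f$ does not depend on $t$), assuming for the moment that $X$ stays strictly positive, which the explicit solution below confirms. We have $f'(x) = 1/x$ and $f''(x) = -1/x^2$. By Itô's formula: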
\begin{align*}
d(\log X) &= \frac{1}{X} \, dX + \frac{1}{2}\left(-\frac{1}{X^2}\right)(\sigma X)^2 \, dt \\
&= \frac{1}{X}(\mu X \, dt + \sigma X \, dW) - \frac{1}{2}\sigma^2 \, dt \\
&= \left(\mu - \frac{\sigma^2}{2}\right) dt + \sigma \, dW.
\end{align*}
The second-order Itô correction $-\frac{1}{2}\sigma^2$ shifts the effective growth rate from $\mu$ to $\mu - \sigma^2/2$. Integrating:
\begin{align*}
\log X(t) = \log x_0 + \left(\mu - \frac{\sigma^2}{2}\right)t + \sigma W(t),
\end{align*}
so the explicit solution is:
\begin{align*}
X(t) = x_0 \exp\left(\left(\mu - \frac{\sigma^2}{2}\right)t + \sigma W(t)\right).
\end{align*}
Since $X(t) > 0$ for all $t$ (the exponential is strictly positive), the process never hits zero — a desirable property for a price model. The shift $\mu \to \mu - \sigma^2/2$ means that the expected growth rate of $\log X$ is strictly less than $\mu$: volatility depresses the logarithmic return. This is the **volatility drag** phenomenon, entirely a consequence of the Itô correction.
[/example]
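Volatility drag can be seen by sampling the closed-form solution directly. The sketch below is illustrative only: the parameter values are arbitrary, and it relies on the standard fact (not derived above) that $\mathbb{E}[X(T)] = x_0 e^{\mu T}$, so the level grows at rate $\mu$ while the logarithm grows at rate $\mu - \sigma^2/2$.

```python
import math
import random


def gbm_terminal_samples(x0, mu, sigma, horizon, n_paths, seed):
    """Draw X(T) from the closed-form solution, using W(T) ~ N(0, T)."""
    rng = random.Random(seed)
    drift = (mu - 0.5 * sigma**2) * horizon
    scale = sigma * math.sqrt(horizon)
    return [x0 * math.exp(drift + scale * rng.gauss(0.0, 1.0)) for _ in range(n_paths)]


x0, mu, sigma, horizon = 1.0, 0.05, 0.4, 1.0   # illustrative values
samples = gbm_terminal_samples(x0, mu, sigma, horizon, n_paths=20_000, seed=5)
mean_log = sum(math.log(x) for x in samples) / len(samples)
mean_level = sum(samples) / len(samples)
print(mean_log, (mu - 0.5 * sigma**2) * horizon)   # log-growth: mu - sigma^2 / 2
print(mean_level, x0 * math.exp(mu * horizon))     # level growth: mu
```

With these values the average log-return is negative even though $\mu > 0$, because $\sigma^2/2 = 0.08$ exceeds $\mu = 0.05$.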
[example: The Ornstein-Uhlenbeck Process]
The Ornstein-Uhlenbeck (OU) process models a diffusion that is pulled towards a long-term mean. It satisfies:
\begin{align*}
dX = -\theta X \, dt + \sigma \, dW, \qquad X(0) = x_0,
\end{align*}
where $\theta > 0$ is the mean-reversion rate and $\sigma > 0$ is the volatility. To solve this linear SDE, we use the integrating factor $e^{\theta t}$.
Apply Itô's formula to $Y(t) = e^{\theta t} X(t)$, using $f(t,x) = e^{\theta t} x$ with $\partial_t f = \theta e^{\theta t} x$, $\partial_x f = e^{\theta t}$, $\partial_{xx} f = 0$:
\begin{align*}
dY &= \theta e^{\theta t} X \, dt + e^{\theta t} \, dX + 0 \\
&= \theta e^{\theta t} X \, dt + e^{\theta t}(-\theta X \, dt + \sigma \, dW) \\
&= \sigma e^{\theta t} \, dW.
\end{align*}
The drift terms cancel (as designed by the integrating factor), and no Itô correction appears because $\partial_{xx} f = 0$. Integrating:
\begin{align*}
Y(t) = x_0 + \sigma \int_0^t e^{\theta s} \, dW(s),
\end{align*}
so the solution is:
\begin{align*}
X(t) = x_0 e^{-\theta t} + \sigma \int_0^t e^{-\theta(t-s)} \, dW(s).
\end{align*}
Since the stochastic integral of a deterministic function of $s$ against $dW(s)$ is a Gaussian random variable, $X(t)$ is Gaussian with mean $x_0 e^{-\theta t}$ and variance $\frac{\sigma^2}{2\theta}(1 - e^{-2\theta t})$. As $t \to \infty$, the process converges in [distribution](/page/Distribution) to $\mathcal{N}(0, \sigma^2/(2\theta))$, the stationary distribution.
[/example]
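The Gaussian description gives an exact sampling rule: restarting the solution formula at time $t$ shows that $X(t+h)$ given $X(t) = x$ is Gaussian with mean $x e^{-\theta h}$ and variance $\frac{\sigma^2}{2\theta}(1 - e^{-2\theta h})$. A minimal sketch using this rule, with arbitrary illustrative parameters and hypothetical helper names:

```python
import math
import random


def ou_exact_step(x, theta, sigma, h, rng):
    """One exact transition over a step h, from the solution restarted at x."""
    decay = math.exp(-theta * h)
    sd = math.sqrt(sigma**2 / (2.0 * theta) * (1.0 - decay**2))
    return x * decay + sd * rng.gauss(0.0, 1.0)


def ou_terminal_samples(x0, theta, sigma, horizon, n_steps, n_paths, seed):
    """Sample X(horizon) by chaining exact transitions on a uniform grid."""
    rng = random.Random(seed)
    h = horizon / n_steps
    out = []
    for _path in range(n_paths):
        x = x0
        for _step in range(n_steps):
            x = ou_exact_step(x, theta, sigma, h, rng)
        out.append(x)
    return out


x0, theta, sigma, horizon = 2.0, 1.0, 1.0, 1.5   # illustrative values
xs = ou_terminal_samples(x0, theta, sigma, horizon, n_steps=20, n_paths=4000, seed=6)
mean = sum(xs) / len(xs)
var = sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)
print(mean, x0 * math.exp(-theta * horizon))
print(var, sigma**2 / (2 * theta) * (1 - math.exp(-2 * theta * horizon)))
```

Chaining many small exact steps yields the same terminal law as a single step of length $T$, which is a useful consistency check on the mean and variance formulas.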
## Existence and Uniqueness for Stochastic Differential Equations
The examples above relied on explicit solutions, which are available only for special coefficient structures (linear, or amenable to a known transformation). For general SDEs, we require an abstract existence and uniqueness theorem analogous to the Picard-Lindelöf theorem in ODE theory.
[definition: Stochastic Differential Equation]
A **stochastic differential equation** (SDE) on $[0,T]$ is the integral equation:
\begin{align*}
X(t) = X(0) + \int_0^t b(s, X(s)) \, ds + \int_0^t \sigma(s, X(s)) \, dW(s), \qquad t \in [0,T],
\end{align*}
where $b: [0,T] \times \mathbb{R}^n \to \mathbb{R}^n$ is the drift, $\sigma: [0,T] \times \mathbb{R}^n \to \mathbb{R}^{n \times m}$ is the diffusion, $W$ is an $m$-dimensional Brownian motion, and $X(0)$ is an $\mathcal{F}_0$-measurable random variable with $\mathbb{E}[|X(0)|^2] < \infty$.
A **strong solution** is an $(\mathcal{F}_t)$-adapted, continuous process $X$ satisfying this equation almost surely.
[/definition]
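When no closed form is available, the integral equation is usually approximated on a time grid. The sketch below uses the Euler-Maruyama scheme, a standard discretisation that is not discussed in this article, restricted to the scalar case; it is tested on the geometric Brownian motion equation because its exact solution is known. The function name and parameter values are illustrative assumptions.

```python
import math
import random


def euler_maruyama(b, sigma, x0, horizon, increments):
    """Scalar Euler-Maruyama scheme driven by the given Brownian increments."""
    dt = horizon / len(increments)
    x, t = x0, 0.0
    for dw in increments:
        x += b(t, x) * dt + sigma(t, x) * dw
        t += dt
    return x


# Illustrative test case: linear coefficients b = mu * x, sigma = vol * x,
# for which X(T) = x0 * exp((mu - vol^2 / 2) T + vol W(T)) is the known solution.
mu, vol, x0, horizon, n_steps = 0.05, 0.2, 1.0, 1.0, 5000
rng = random.Random(7)
dws = [rng.gauss(0.0, math.sqrt(horizon / n_steps)) for _ in range(n_steps)]
approx = euler_maruyama(lambda t, x: mu * x, lambda t, x: vol * x, x0, horizon, dws)
exact = x0 * math.exp((mu - 0.5 * vol**2) * horizon + vol * sum(dws))
print(approx, exact)
```

Both values are computed from the same increments, so the comparison is pathwise rather than merely in distribution.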
The standard conditions ensuring well-posedness are the Lipschitz and linear growth conditions, which are the stochastic analogues of the conditions in the Picard-Lindelöf theorem.
[theorem: Existence and Uniqueness of Strong Solutions]
Let $b: [0,T] \times \mathbb{R}^n \to \mathbb{R}^n$ and $\sigma: [0,T] \times \mathbb{R}^n \to \mathbb{R}^{n \times m}$ be measurable and satisfy:
1. **Global Lipschitz condition:** There exists $K > 0$ such that for all $t \in [0,T]$ and $x, y \in \mathbb{R}^n$:
\begin{align*}
|b(t,x) - b(t,y)| + |\sigma(t,x) - \sigma(t,y)| \le K|x - y|.
\end{align*}
2. **Linear growth condition:** There exists $K > 0$ such that for all $t \in [0,T]$ and $x \in \mathbb{R}^n$:
\begin{align*}
|b(t,x)| + |\sigma(t,x)| \le K(1 + |x|).
\end{align*}
Then for any $\mathcal{F}_0$-measurable $X(0)$ with $\mathbb{E}[|X(0)|^2] < \infty$, there exists a unique strong solution $X$ satisfying:
\begin{align*}
\mathbb{E}\left[\sup_{0 \le t \le T} |X(t)|^2\right] < \infty.
\end{align*}
[/theorem]
The proof follows the Picard iteration scheme: define $X^{(0)}(t) = X(0)$ and recursively set:
\begin{align*}
X^{(k+1)}(t) = X(0) + \int_0^t b(s, X^{(k)}(s)) \, ds + \int_0^t \sigma(s, X^{(k)}(s)) \, dW(s).
\end{align*}
The Lipschitz condition and the Itô isometry combine to give:
\begin{align*}
\mathbb{E}\left[\sup_{0 \le s \le t} |X^{(k+1)}(s) - X^{(k)}(s)|^2\right] \le C \int_0^t \mathbb{E}\left[\sup_{0 \le r \le s}|X^{(k)}(r) - X^{(k-1)}(r)|^2\right] ds,
\end{align*}
where the Burkholder-Davis-Gundy inequality is used to control the supremum of the martingale part. Iterating this bound shows that the $k$-th difference is bounded by a constant multiple of $(CT)^k/k!$, which is summable in $k$, so the iterates converge. The linear growth condition ensures the solution does not explode in finite time.
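The rapid decay of successive differences can be observed on a fixed simulated path. The following is a rough sketch, not the proof: it assumes a scalar equation with linear coefficients chosen for illustration, and it replaces both integrals in the Picard map by left-endpoint sums on a uniform grid.

```python
import math
import random


def picard_iterates(b, sigma, x0, dt, dws, n_iter):
    """Discretised Picard iteration; each iterate is a list of grid values."""
    n = len(dws)
    current = [x0] * (n + 1)            # X^(0)(t) = X(0)
    iterates = [current]
    for _ in range(n_iter):
        nxt = [x0]
        for j in range(n):
            t = j * dt
            # Integrals of the previous iterate, as left-endpoint sums.
            nxt.append(nxt[-1] + b(t, current[j]) * dt + sigma(t, current[j]) * dws[j])
        iterates.append(nxt)
        current = nxt
    return iterates


def sup_distance(u, v):
    """Maximum absolute difference over the grid."""
    return max(abs(a - c) for a, c in zip(u, v))


rng = random.Random(8)
n_steps, horizon = 200, 1.0
dt = horizon / n_steps
dws = [rng.gauss(0.0, math.sqrt(dt)) for _ in range(n_steps)]
its = picard_iterates(lambda t, x: 0.1 * x, lambda t, x: 0.5 * x, 1.0, dt, dws, n_iter=30)
gaps = [sup_distance(its[k + 1], its[k]) for k in range(30)]
print(gaps[0], gaps[5], gaps[29])   # successive gaps shrink rapidly
```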
The Lipschitz condition is not merely a technical convenience: some regularity of the coefficients is essential for uniqueness. There exist SDEs with non-Lipschitz coefficients (e.g., $dX = |X|^{\alpha} \, dW$ with $0 < \alpha < 1/2$ and $X(0) = 0$) that have multiple weak solutions. Extensions to locally Lipschitz coefficients or the Yamada-Watanabe conditions (which, in one dimension, relax Lipschitz to Hölder continuity of exponent at least $1/2$ for the diffusion coefficient) are important refinements but require more delicate arguments.
## The Martingale Representation Theorem
A remarkable feature of Brownian filtrations is that every square-integrable martingale can be represented as a stochastic integral. This result underpins the theory of hedging in mathematical finance and the Girsanov change of measure.
[theorem: Martingale Representation]
Let $(\mathcal{F}_t)_{t \ge 0}$ be the natural filtration of a standard Brownian motion $W$, augmented by the $\mathbb{P}$-null sets. Let $M$ be a continuous $(\mathcal{F}_t)$-martingale with $\mathbb{E}[M(T)^2] < \infty$. Then there exists a unique process $f \in \mathcal{H}^2(0,T)$ such that:
\begin{align*}
M(t) = M(0) + \int_0^t f(s) \, dW(s) \qquad \text{for all } t \in [0,T].
\end{align*}
[/theorem]
The theorem says that in the Brownian world, the stochastic integral with respect to $W$ is the *only* source of randomness. Every square-integrable martingale is, up to its initial value, a stochastic integral; there are no "hidden" sources of randomness beyond $W$ itself. The proof relies on the fact that the filtration $(\mathcal{F}_t)$ is generated by $W$, and proceeds by showing that the stochastic integrals $\{\int_0^T f \, dW : f \in \mathcal{H}^2\}$ form a closed subspace of $L^2(\Omega, \mathcal{F}_T, \mathbb{P})$ that contains every mean-zero, square-integrable, $\mathcal{F}_T$-measurable random variable.
In mathematical finance, this result implies that in the Black-Scholes model (where the stock price is a geometric Brownian motion), every square-integrable contingent claim can be replicated by a dynamic trading strategy, with the integrand $f$ determining the hedging portfolio.
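For a concrete instance of the representation theorem, consider the martingale $M(t) := \mathbb{E}[W(T)^2 \mid \mathcal{F}_t]$. Writing $W(T) = W(t) + (W(T) - W(t))$ and using the independence of the increment from $\mathcal{F}_t$ gives $M(t) = W(t)^2 + (T - t)$, so $M(0) = T$. Combining this with the identity $W(t)^2 - t = 2\int_0^t W(s) \, dW(s)$ from the example on $\int_0^T W \, dW$:
\begin{align*}
M(t) = T + \int_0^t 2 W(s) \, dW(s), \qquad t \in [0,T].
\end{align*}
The representing integrand is therefore $f(s) = 2W(s)$, which lies in $\mathcal{H}^2(0,T)$ because $\mathbb{E}[\int_0^T 4 W(s)^2 \, ds] = 2T^2 < \infty$.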
## References
K. Itô, *On Stochastic Differential Equations*, Memoirs of the American Mathematical Society (1951).
B. Øksendal, *Stochastic Differential Equations: An Introduction with Applications*, 6th edition (2003).
I. Karatzas and S. E. Shreve, *Brownian Motion and Stochastic Calculus*, 2nd edition (1991).
P. Protter, *Stochastic Integration and Differential Equations*, 2nd edition (2005).