[proofplan]
We use the Darboux characterization of [Riemann integrability](/page/Riemann%20Integrability): choose one fixed partition $Q$ for which the gap between the upper and lower Darboux sums is small. An arbitrary sufficiently fine tagged partition $P$ is then compared to $Q$ by separating the subintervals of $P$ that lie inside intervals of $Q$ from the few subintervals that cross points of $Q$. The inside intervals are controlled by the oscillation appearing in $U(f,Q)-L(f,Q)$, while the crossing intervals have small total length because the mesh of $P$ is small.
[/proofplan]
[step:Choose a Darboux partition with small oscillation gap]
Let
\begin{align*}
I := \int_a^b f\, d\mathcal{L}^1
\end{align*}
denote the [Riemann integral](/page/Riemann%20Integral) of $f$ over $[a,b]$. Since $f$ is Riemann integrable on the compact interval $[a,b]$, the function $f$ is bounded. Choose $M \geq 0$ such that $|f(x)| \leq M$ for every $x \in [a,b]$.
If $M = 0$, then $f(x)=0$ for every $x \in [a,b]$, so every Riemann sum and the integral are equal to $0$. The conclusion follows for any $\delta > 0$. Hence assume $M>0$.
By the Darboux characterization of Riemann integrability, there exists a partition
\begin{align*}Q = \{q_0,q_1,\dots,q_m\}\end{align*}
with
\begin{align*}a=q_0<q_1<\cdots<q_m=b\end{align*}
such that the corresponding upper and lower Darboux sums, defined below, have gap smaller than $\varepsilon/2$.
For each $j \in \{1,\dots,m\}$, define
\begin{align*}
M_j := \sup\{f(x):x \in [q_{j-1},q_j]\}
\end{align*}
and
\begin{align*}
m_j := \inf\{f(x):x \in [q_{j-1},q_j]\}.
\end{align*}
Define the Darboux sums for $Q$ by
\begin{align*}
U(f,Q) := \sum_{j=1}^m M_j(q_j-q_{j-1})
\end{align*}
and
\begin{align*}
L(f,Q) := \sum_{j=1}^m m_j(q_j-q_{j-1}).
\end{align*}
Then
\begin{align*}
U(f,Q)-L(f,Q)=\sum_{j=1}^m (M_j-m_j)(q_j-q_{j-1})
\end{align*}
and, by the choice of $Q$, this quantity is smaller than $\varepsilon/2$.
[guided]
The goal is to compare every sufficiently fine tagged sum with the integral. Since $f$ is Riemann integrable, the Darboux characterization says that upper and lower Darboux sums can be made arbitrarily close. We therefore choose one fixed partition
\begin{align*}Q = \{q_0,q_1,\dots,q_m\}\end{align*}
with
\begin{align*}a=q_0<q_1<\cdots<q_m=b\end{align*}
whose Darboux gap will be smaller than $\varepsilon/2$ after the upper and lower sums are defined below.
We also use boundedness. Since every Riemann integrable function on a compact interval is bounded, there is a number $M \geq 0$ such that $|f(x)| \leq M$ for all $x \in [a,b]$. If $M=0$, then $f$ is identically zero, so every tagged Riemann sum is $0$ and the Riemann integral is $0$. Thus the theorem is immediate in that case. For the rest of the proof we assume $M>0$.
For each interval of the fixed partition $Q$, we record the largest and smallest possible values of $f$ on that interval. For $j \in \{1,\dots,m\}$, define
\begin{align*}
M_j := \sup\{f(x):x \in [q_{j-1},q_j]\}
\end{align*}
and
\begin{align*}
m_j := \inf\{f(x):x \in [q_{j-1},q_j]\}.
\end{align*}
These numbers are finite because $f$ is bounded. The Darboux gap for $Q$ is exactly
\begin{align*}
U(f,Q)-L(f,Q)=\sum_{j=1}^m (M_j-m_j)(q_j-q_{j-1}).
\end{align*}
This sum measures the total oscillation of $f$ at the scale of the partition $Q$. By the choice of $Q$, it satisfies
\begin{align*}
U(f,Q)-L(f,Q) < \frac{\varepsilon}{2}.
\end{align*}
[/guided]
[/step]
[step:Choose the mesh so crossing intervals have small total length]
Define
\begin{align*}
\eta := \min\{q_j-q_{j-1}:1 \leq j \leq m\}.
\end{align*}
Since $Q$ is a finite partition and each interval length is positive, $\eta>0$. Choose
\begin{align*}
\delta := \min\left\{\eta,\frac{\varepsilon}{8M(m-1)}\right\}
\end{align*}
if $m \geq 2$, and choose $\delta := \eta$ if $m=1$.
Let $(P,t)$ be a tagged partition of $[a,b]$ with
\begin{align*}P=\{x_0,x_1,\dots,x_n\}, \qquad a=x_0<x_1<\cdots<x_n=b,\end{align*}
and tags $t_i \in [x_{i-1},x_i]$ for $i \in \{1,\dots,n\}$. Write
\begin{align*}
\Delta x_i := x_i-x_{i-1}.
\end{align*}
Let $B$ be the set of indices $i \in \{1,\dots,n\}$ such that the interval $[x_{i-1},x_i]$ contains at least one point of $\{q_1,\dots,q_{m-1}\}$ in its interior. Let $G:=\{1,\dots,n\}\setminus B$.
Because $|P|<\eta$, each interval $[x_{i-1},x_i]$ contains at most one interior point of $Q$. Hence the map sending $i \in B$ to the unique point of $\{q_1,\dots,q_{m-1}\}$ contained in $(x_{i-1},x_i)$ is injective, so $|B| \leq m-1$. Therefore, if $m \geq 2$,
\begin{align*}
\sum_{i \in B}\Delta x_i \leq (m-1)|P| < (m-1)\delta \leq \frac{\varepsilon}{8M}.
\end{align*}
[guided]
We choose the mesh bound so that intervals of the arbitrary partition $P$ cannot cross too much of the fixed Darboux partition $Q$. Define
\begin{align*}
\eta := \min\{q_j-q_{j-1}:1 \leq j \leq m\}.
\end{align*}
Because $Q$ has finitely many subintervals and each has positive length, $\eta>0$. If $m\geq 2$, set
\begin{align*}
\delta := \min\left\{\eta,\frac{\varepsilon}{8M(m-1)}\right\},
\end{align*}
and if $m=1$, set $\delta:=\eta$.
Let $(P,t)$ be a tagged partition with $|P|<\delta$. Write
\begin{align*}
P=\{x_0,x_1,\dots,x_n\}, \qquad a=x_0<x_1<\cdots<x_n=b,
\end{align*}
with tags $t_i\in[x_{i-1},x_i]$, and define $\Delta x_i:=x_i-x_{i-1}$. Let $B$ be the set of indices $i$ for which $[x_{i-1},x_i]$ contains at least one point of $\{q_1,\dots,q_{m-1}\}$ in its interior, and let $G:=\{1,\dots,n\}\setminus B$.
Since $|P|<\eta$, no interval of $P$ can contain two distinct interior points of $Q$: two such points would be separated by at least $\eta$, while the interval length is at most $|P|<\eta$. Thus each bad interval contains a unique interior point of $Q$, and the map from $B$ to $\{q_1,\dots,q_{m-1}\}$ sending a bad interval to that point is injective. Hence $|B|\leq m-1$. Therefore, when $m\geq 2$,
\begin{align*}
\sum_{i\in B}\Delta x_i \leq (m-1)|P| < (m-1)\delta \leq \frac{\varepsilon}{8M}.
\end{align*}
If $m=1$, there are no interior points of $Q$, so $B=\varnothing$.
[/guided]
[/step]
[step:Control the contribution from intervals contained inside the Darboux partition]
For each $i \in G$, the interval $[x_{i-1},x_i]$ is contained in some interval $[q_{j-1},q_j]$ of the partition $Q$. Define $j(i) \in \{1,\dots,m\}$ to be an index such that
\begin{align*}
[x_{i-1},x_i]\subset [q_{j(i)-1},q_{j(i)}].
\end{align*}
Since the tag satisfies $t_i \in [x_{i-1},x_i]$, we have
\begin{align*}
m_{j(i)} \leq f(t_i) \leq M_{j(i)}.
\end{align*}
Multiplying by the non-negative length $\Delta x_i$ gives
\begin{align*}
m_{j(i)}\Delta x_i \leq f(t_i)\Delta x_i \leq M_{j(i)}\Delta x_i.
\end{align*}
Consequently,
\begin{align*}
\left|\sum_{i\in G} f(t_i)\Delta x_i-\sum_{i\in G} m_{j(i)}\Delta x_i\right|
\leq \sum_{i\in G}(M_{j(i)}-m_{j(i)})\Delta x_i.
\end{align*}
Grouping the terms according to $j(i)$ and using that the total length of good intervals contained in $[q_{j-1},q_j]$ is at most $q_j-q_{j-1}$, we obtain
\begin{align*}
\sum_{i\in G}(M_{j(i)}-m_{j(i)})\Delta x_i
\leq \sum_{j=1}^m (M_j-m_j)(q_j-q_{j-1})
< \frac{\varepsilon}{2}.
\end{align*}
[guided]
For a good interval $i\in G$, the interval $[x_{i-1},x_i]$ does not contain an interior point of $Q$. Since the endpoints of $Q$ partition $[a,b]$, this means there is an index $j(i)\in\{1,\dots,m\}$ such that
\begin{align*}
[x_{i-1},x_i]\subset [q_{j(i)-1},q_{j(i)}].
\end{align*}
The tag $t_i$ lies in this same interval, so the definitions of $m_{j(i)}$ and $M_{j(i)}$ give
\begin{align*}
m_{j(i)} \leq f(t_i) \leq M_{j(i)}.
\end{align*}
Multiplying by the non-negative number $\Delta x_i$ preserves the inequalities:
\begin{align*}
m_{j(i)}\Delta x_i \leq f(t_i)\Delta x_i \leq M_{j(i)}\Delta x_i.
\end{align*}
Subtracting the lower comparison term and using the upper bound gives
\begin{align*}
\left|\sum_{i\in G} f(t_i)\Delta x_i-\sum_{i\in G} m_{j(i)}\Delta x_i\right|
\leq \sum_{i\in G}(M_{j(i)}-m_{j(i)})\Delta x_i.
\end{align*}
Now group the indices $i\in G$ by the value of $j(i)$. For a fixed $j$, the good intervals contained in $[q_{j-1},q_j]$ are disjoint subintervals of $[q_{j-1},q_j]$, so their total length is at most $q_j-q_{j-1}$. Therefore
\begin{align*}
\sum_{i\in G}(M_{j(i)}-m_{j(i)})\Delta x_i
\leq \sum_{j=1}^m (M_j-m_j)(q_j-q_{j-1})
= U(f,Q)-L(f,Q)
< \frac{\varepsilon}{2}.
\end{align*}
[/guided]
[/step]
[step:Sandwich the tagged sum between Darboux bounds up to a small crossing error]
The Riemann sum is
\begin{align*}
S(P,t;f):=\sum_{i=1}^n f(t_i)\Delta x_i.
\end{align*}
For each bad interval $i \in B$, boundedness gives
\begin{align*}
|f(t_i)\Delta x_i| \leq M\Delta x_i.
\end{align*}
Hence
\begin{align*}
\left|\sum_{i\in B} f(t_i)\Delta x_i\right|
\leq M\sum_{i\in B}\Delta x_i.
\end{align*}
If $m=1$, then $B=\varnothing$. If $m\geq 2$, the choice of $\delta$ gives
\begin{align*}
M\sum_{i\in B}\Delta x_i < \frac{\varepsilon}{8}.
\end{align*}
The lower Darboux sum satisfies
\begin{align*}
L(f,Q)=\sum_{j=1}^m m_j(q_j-q_{j-1}),
\end{align*}
and the upper Darboux sum satisfies
\begin{align*}
U(f,Q)=\sum_{j=1}^m M_j(q_j-q_{j-1}).
\end{align*}
For each $i\in B$, split the interval $[x_{i-1},x_i]$ at the points of $\{q_1,\dots,q_{m-1}\}$ that it contains. Each resulting piece lies in a single interval of $Q$, and the total length of all such bad pieces is $\sum_{i\in B}\Delta x_i$.
For each $j\in\{1,\dots,m\}$, the part of $[q_{j-1},q_j]$ not covered by good intervals is contained in the union of bad pieces. Since $|m_j|\leq M$ and $|M_j|\leq M$, deleting those pieces changes the lower and upper Darboux contributions by at most $M$ times their total length. Hence
\begin{align*}
L(f,Q)-M\sum_{i\in B}\Delta x_i
\leq \sum_{i\in G}m_{j(i)}\Delta x_i
\end{align*}
and
\begin{align*}
\sum_{i\in G}M_{j(i)}\Delta x_i
\leq U(f,Q)+M\sum_{i\in B}\Delta x_i.
\end{align*}
The good-interval inequalities give
\begin{align*}
\sum_{i\in G}m_{j(i)}\Delta x_i
\leq \sum_{i\in G}f(t_i)\Delta x_i
\leq \sum_{i\in G}M_{j(i)}\Delta x_i.
\end{align*}
Adding the bad part of the Riemann sum, whose absolute value is bounded by $M\sum_{i\in B}\Delta x_i$, yields
\begin{align*}
L(f,Q)-2M\sum_{i\in B}\Delta x_i
\leq S(P,t;f)
\leq U(f,Q)+2M\sum_{i\in B}\Delta x_i.
\end{align*}
[guided]
We now compare the arbitrary tagged sum with the fixed Darboux sums. The Riemann sum is
\begin{align*}
S(P,t;f):=\sum_{i=1}^n f(t_i)\Delta x_i,
\end{align*}
where $\Delta x_i=x_i-x_{i-1}$.
The good intervals are controlled by the oscillation of $f$ on the corresponding interval of $Q$. The only obstruction is that some intervals of $P$ may cross a point of $Q$. These intervals are the bad intervals $i \in B$. On a bad interval we do not try to use oscillation estimates, because the interval may touch two adjacent intervals of $Q$. Instead we use the crude bound $|f|\leq M$:
\begin{align*}
|f(t_i)\Delta x_i| \leq M\Delta x_i.
\end{align*}
Summing over $i \in B$ gives
\begin{align*}
\left|\sum_{i\in B} f(t_i)\Delta x_i\right|
\leq M\sum_{i\in B}\Delta x_i.
\end{align*}
The previous mesh choice gives
\begin{align*}
M\sum_{i\in B}\Delta x_i < \frac{\varepsilon}{8}
\end{align*}
when $m\geq 2$, and if $m=1$ then there are no interior points of $Q$, so $B=\varnothing$.
Now consider the good intervals. If $i\in G$, then $[x_{i-1},x_i]$ lies inside one of the intervals $[q_{j-1},q_j]$. Calling that index $j(i)$, the tag condition $t_i\in[x_{i-1},x_i]$ implies
\begin{align*}
m_{j(i)} \leq f(t_i) \leq M_{j(i)}.
\end{align*}
Multiplying by $\Delta x_i\geq 0$ gives
\begin{align*}
m_{j(i)}\Delta x_i \leq f(t_i)\Delta x_i \leq M_{j(i)}\Delta x_i.
\end{align*}
If we sum these inequalities over all good intervals, we get a lower and upper estimate for the good part of the Riemann sum. The missing pieces are exactly the bad intervals. Because $|m_j|\leq M$ and $|M_j|\leq M$ for every $j$, deleting bad subintervals from the Darboux lower and upper sums changes either comparison by at most
\begin{align*}
M\sum_{i\in B}\Delta x_i.
\end{align*}
The bad part of the Riemann sum itself contributes at most the same quantity in absolute value. Therefore the whole tagged sum satisfies
\begin{align*}
L(f,Q)-2M\sum_{i\in B}\Delta x_i
\leq S(P,t;f)
\leq U(f,Q)+2M\sum_{i\in B}\Delta x_i.
\end{align*}
This is the central estimate: all dependence on the arbitrary partition $P$ has been compressed into the total length of the bad intervals, which is small because the mesh is small.
[/guided]
[/step]
[step:Use the integral between the Darboux sums to finish the estimate]
Since $I$ is the Riemann integral of $f$, every lower Darboux sum is at most $I$ and every upper Darboux sum is at least $I$:
\begin{align*}
L(f,Q)\leq I \leq U(f,Q).
\end{align*}
From the estimate in the previous step,
\begin{align*}
S(P,t;f)-I
\leq U(f,Q)-L(f,Q)+2M\sum_{i\in B}\Delta x_i
\end{align*}
and
\begin{align*}
I-S(P,t;f)
\leq U(f,Q)-L(f,Q)+2M\sum_{i\in B}\Delta x_i.
\end{align*}
Therefore
\begin{align*}
|S(P,t;f)-I|
\leq U(f,Q)-L(f,Q)+2M\sum_{i\in B}\Delta x_i.
\end{align*}
By the choice of $Q$ and $\delta$,
\begin{align*}
|S(P,t;f)-I|
< \frac{\varepsilon}{2}+2M\cdot \frac{\varepsilon}{8M}
= \frac{3\varepsilon}{4}
< \varepsilon.
\end{align*}
Thus every tagged partition with mesh smaller than $\delta$ has its Riemann sum within $\varepsilon$ of the Riemann integral, which proves the theorem.
[guided]
The Riemann integral $I$ lies between every lower and upper Darboux sum, so for the fixed partition $Q$ we have
\begin{align*}
L(f,Q)\leq I \leq U(f,Q).
\end{align*}
The previous sandwich estimate gives
\begin{align*}
S(P,t;f)\leq U(f,Q)+2M\sum_{i\in B}\Delta x_i
\end{align*}
and
\begin{align*}
S(P,t;f)\geq L(f,Q)-2M\sum_{i\in B}\Delta x_i.
\end{align*}
Using $I\geq L(f,Q)$ in the [first inequality](/theorems/2897) gives
\begin{align*}
S(P,t;f)-I
\leq U(f,Q)-L(f,Q)+2M\sum_{i\in B}\Delta x_i.
\end{align*}
Using $I\leq U(f,Q)$ in the [second inequality](/theorems/2136) gives
\begin{align*}
I-S(P,t;f)
\leq U(f,Q)-L(f,Q)+2M\sum_{i\in B}\Delta x_i.
\end{align*}
Therefore
\begin{align*}
|S(P,t;f)-I|
\leq U(f,Q)-L(f,Q)+2M\sum_{i\in B}\Delta x_i.
\end{align*}
The Darboux gap is smaller than $\varepsilon/2$. If $m=1$, then $B=\varnothing$; if $m\geq 2$, the mesh choice gives $\sum_{i\in B}\Delta x_i<\varepsilon/(8M)$. In either case,
\begin{align*}
|S(P,t;f)-I|
< \frac{\varepsilon}{2}+2M\cdot \frac{\varepsilon}{8M}
= \frac{3\varepsilon}{4}
< \varepsilon.
\end{align*}
This proves that every tagged partition with mesh smaller than $\delta$ has Riemann sum within $\varepsilon$ of the Riemann integral.
[/guided]
[/step]