Let $(X_1, Y_1), \ldots, (X_n, Y_n) \in \mathbb{R}^p \times \{-1, 1\}$ be training data. Suppose that for each $m = 1, \ldots, M$, the base classifier $\hat{h}_m$ selected by AdaBoost satisfies
\begin{align*}
\operatorname{err}_m(\hat{h}_m) \leq \frac{1}{2} - \gamma
\end{align*}
for some fixed $\gamma > 0$ (i.e., each weak learner beats chance by a margin $\gamma$). Then the training error of the final AdaBoost classifier $\operatorname{sgn} \circ \hat{f}_M$ satisfies
\begin{align*}
\frac{1}{n} \sum_{i=1}^n \mathbb{1}_{\{\operatorname{sgn}(\hat{f}_M(X_i)) \neq Y_i\}} \leq \exp(-2\gamma^2 M).
\end{align*}
In particular, the training error converges to zero exponentially fast in $M$.