[proofplan]
We use the definition of [differentiability](/page/Derivative) to decompose the increment $f(a + h) - f(a)$ into a linear part $\tau(h)$ and a sublinear error $|h|\varepsilon(h)$. By the [Lipschitz bound on linear maps](/theorems/321), the linear part satisfies $|\tau(h)| \leq \|\tau\| \cdot |h|$. Both terms tend to zero as $h \to \mathbf{0}$, yielding [continuity](/page/Continuity).
[/proofplan]
[step:Decompose the increment using differentiability and bound each term]
By [differentiability](/page/Derivative) of $f$ at $a$, there exist $\tau \in \mathcal{L}(\mathbb{R}^m, \mathbb{R}^n)$ and a map $\varepsilon: U - a \to \mathbb{R}^n$ with $\varepsilon(h) \to \mathbf{0}$ as $h \to \mathbf{0}$ such that
\begin{align*}
f(a + h) - f(a) = \tau(h) + |h|\varepsilon(h).
\end{align*}
Taking norms, applying the triangle inequality, and then using the [Lipschitz bound for linear maps](/theorems/321) $|\tau(h)| \leq \|\tau\| \cdot |h|$ gives
\begin{align*}
|f(a + h) - f(a)| \leq |\tau(h)| + |h| |\varepsilon(h)| \leq |h|\bigl(\|\tau\| + |\varepsilon(h)|\bigr).
\end{align*}
As $h \to \mathbf{0}$, the factor $\|\tau\| + |\varepsilon(h)|$ tends to the finite value $\|\tau\|$, and $|h| \to 0$. Therefore $|f(a + h) - f(a)| \to 0$, which is [continuity](/page/Continuity) of $f$ at $a$.
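For readers who prefer an explicit $\delta$, the limit statement can be unpacked as follows. This is only a sketch; the symbols $\eta$, $\delta_1$ and $\delta$ are introduced here for illustration and do not appear elsewhere in the proof. Let $\eta > 0$. Since $\varepsilon(h) \to \mathbf{0}$, there is $\delta_1 > 0$ such that $|\varepsilon(h)| \leq 1$ whenever $h \in U - a$ and $|h| < \delta_1$. Setting $\delta = \min\bigl(\delta_1, \eta / (\|\tau\| + 1)\bigr)$, every such $h$ with $|h| < \delta$ satisfies
\begin{align*}
|f(a + h) - f(a)| \leq |h|\bigl(\|\tau\| + |\varepsilon(h)|\bigr) \leq |h|\bigl(\|\tau\| + 1\bigr) < \eta.
\end{align*}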
[guided]
The idea is that a differentiable map looks locally like a [linear map](/page/Linear%20Map) plus a negligible error, and linear maps are continuous. More precisely, by [differentiability](/page/Derivative) at $a$, there exist $\tau = Df_a \in \mathcal{L}(\mathbb{R}^m, \mathbb{R}^n)$ and a map $\varepsilon: U - a \to \mathbb{R}^n$ with $\varepsilon(h) \to \mathbf{0}$ as $h \to \mathbf{0}$ such that
\begin{align*}
f(a + h) - f(a) = \tau(h) + |h|\varepsilon(h).
\end{align*}
We need to show that both terms on the right tend to $\mathbf{0}$ as $h \to \mathbf{0}$. For the first term, by [Linear Maps are Lipschitz](/theorems/321) (Part 1), $|\tau(h)| \leq \|\tau\| \cdot |h|$, which tends to $0$ as $|h| \to 0$. For the second term, $\bigl|\,|h|\varepsilon(h)\bigr| = |h| \cdot |\varepsilon(h)|$; since $|\varepsilon(h)| \to 0$ (so in particular it is bounded near $\mathbf{0}$) and $|h| \to 0$, this product also tends to $0$.
Combining via the triangle inequality:
\begin{align*}
|f(a + h) - f(a)| \leq |\tau(h)| + |h| |\varepsilon(h)| \leq |h|\bigl(\|\tau\| + |\varepsilon(h)|\bigr) \to 0.
\end{align*}
Therefore $f(a + h) \to f(a)$, which is [continuity](/page/Continuity) at $a$.
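As a concrete illustration (an example chosen here, not part of the original argument), take $n = 1$ and $f(x) = |x|^2$ on $\mathbb{R}^m$, assuming $|\cdot|$ denotes the Euclidean norm and $a \cdot h$ the standard inner product. Expanding the square gives
\begin{align*}
f(a + h) - f(a) = 2\, a \cdot h + |h|^2,
\end{align*}
so the decomposition holds with $\tau(h) = 2\, a \cdot h$ and $\varepsilon(h) = |h|$, which does tend to $0$ as $h \to \mathbf{0}$. Under the Euclidean assumption $\|\tau\| = 2|a|$, and the bound above reads
\begin{align*}
|f(a + h) - f(a)| \leq |h|\bigl(2|a| + |h|\bigr) \to 0.
\end{align*}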
[/guided]
[/step]