[proofplan]
Part 1 (linearity) is direct: add the [differentiability](/page/Derivative) expansions for $f$ and $g$ and observe the error terms sum to a vanishing remainder. Part 2 (product rule) expands the product $\phi(a + h)f(a + h)$ using both differentiability expansions, identifies the terms linear in $h$ as the candidate derivative, and verifies every remaining term is $o(|h|)$ by using the operator norm bound and the vanishing of the error functions.
[/proofplan]
[step:Prove linearity of the derivative by adding the differentiability expansions]
By [differentiability](/page/Derivative) of $f$ and $g$ at $a$:
\begin{align*}
(f + g)(a + h) &= f(a) + Df_a(h) + |h|\varepsilon_1(h) + g(a) + Dg_a(h) + |h|\varepsilon_2(h) \\
&= (f + g)(a) + \bigl(Df_a + Dg_a\bigr)(h) + |h|\bigl(\varepsilon_1(h) + \varepsilon_2(h)\bigr).
\end{align*}
Since $\varepsilon_1(h) + \varepsilon_2(h) \to \mathbf{0}$ as $h \to \mathbf{0}$, the sum $f + g$ is differentiable at $a$ with derivative $Df_a + Dg_a$.
[/step]
[step:Expand the product $\phi f$ and identify the linear term]
By [differentiability](/page/Derivative) of $\phi$ and $f$:
\begin{align*}
\phi(a + h) &= \phi(a) + D\phi_a(h) + |h|\delta(h), \\
f(a + h) &= f(a) + Df_a(h) + |h|\varepsilon(h),
\end{align*}
with $\delta(h) \to 0$ and $\varepsilon(h) \to \mathbf{0}$ as $h \to \mathbf{0}$. Multiplying the two expansions:
\begin{align*}
(\phi f)(a + h) = \phi(a)f(a) + \phi(a)Df_a(h) + D\phi_a(h)f(a) + R(h),
\end{align*}
where $R(h)$ collects all remaining terms from the expansion of the product.
[/step]
[step:Verify every term in the remainder $R(h)$ is $o(|h|)$]
The remainder consists of:
\begin{align*}
R(h) &= D\phi_a(h)Df_a(h) + \phi(a)|h|\varepsilon(h) + |h|\delta(h)f(a) \\
&\quad + |h|D\phi_a(h)\varepsilon(h) + |h|\delta(h)Df_a(h) + |h|^2\delta(h)\varepsilon(h).
\end{align*}
For the first term, using the [Lipschitz bound](/theorems/321) on both $D\phi_a$ and $Df_a$:
\begin{align*}
|D\phi_a(h)Df_a(h)| \leq \|D\phi_a\| \cdot |h| \cdot \|Df_a\| \cdot |h| = O(|h|^2) = o(|h|).
\end{align*}
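The second term satisfies $\bigl|\phi(a)\,|h|\,\varepsilon(h)\bigr| = |\phi(a)| \cdot |h| \cdot |\varepsilon(h)| = |h| \cdot o(1)$. The third satisfies $\bigl||h|\,\delta(h)f(a)\bigr| = |\delta(h)| \cdot |f(a)| \cdot |h| = |h| \cdot o(1)$. The three remaining cross terms, $|h|D\phi_a(h)\varepsilon(h)$, $|h|\delta(h)Df_a(h)$, and $|h|^2\delta(h)\varepsilon(h)$, are bounded in norm by $\|D\phi_a\| \cdot |h|^2 \cdot |\varepsilon(h)|$, $|\delta(h)| \cdot \|Df_a\| \cdot |h|^2$, and $|h|^2 \cdot |\delta(h)| \cdot |\varepsilon(h)|$ respectively. Each is $|h|$ times a quantity that vanishes as $h \to \mathbf{0}$, hence $o(|h|)$.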
Therefore $R(h) = |h|\varepsilon_3(h)$ with $\varepsilon_3(h) \to \mathbf{0}$ as $h \to \mathbf{0}$, so $\phi f$ is differentiable at $a$ with $D(\phi f)_a(h) = \phi(a)Df_a(h) + D\phi_a(h)f(a)$.
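To make this last step concrete, one possible way to write the error function explicitly is sketched here. The convention $\varepsilon_3(\mathbf{0}) = \mathbf{0}$ is an assumption made for this sketch. For $h \neq \mathbf{0}$, dividing the full expansion of the product by $|h|$ gives
\begin{align*}
\varepsilon_3(h) &= \frac{D\phi_a(h)Df_a(h)}{|h|} + \phi(a)\varepsilon(h) + \delta(h)f(a) + D\phi_a(h)\varepsilon(h) + \delta(h)Df_a(h) + |h|\delta(h)\varepsilon(h), \\
|\varepsilon_3(h)| &\leq \|D\phi_a\| \, \|Df_a\| \, |h| + |\phi(a)| \, |\varepsilon(h)| + |\delta(h)| \, |f(a)| + \|D\phi_a\| \, |h| \, |\varepsilon(h)| + |\delta(h)| \, \|Df_a\| \, |h| + |h| \, |\delta(h)| \, |\varepsilon(h)|.
\end{align*}
Each of the six summands on the right contains at least one of the factors $|h|$, $|\delta(h)|$, $|\varepsilon(h)|$, all of which tend to zero, while every other factor is bounded near $h = \mathbf{0}$. Hence $\varepsilon_3(h) \to \mathbf{0}$.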
[guided]
The product rule computation follows the same pattern as in single-variable calculus, but the bookkeeping needs care. We expand the product of two expressions, each of the form "value + linear term + error", and the candidate derivative picks up the cross terms that are linear in $h$: the constant part of $\phi$ times the linear part of $f$, and the linear part of $\phi$ times the constant part of $f$.
Every other term is either a product of two factors that are $O(|h|)$, or carries a factor $|h|$ multiplied by a vanishing error function, so it is $o(|h|)$. The critical estimates:
- $D\phi_a(h) \cdot Df_a(h)$: both factors are $O(|h|)$ by the [Lipschitz bound](/theorems/321), so the product is $O(|h|^2) = o(|h|)$.
- $\phi(a) \cdot |h|\varepsilon(h)$: the scalar $\phi(a)$ is constant, and $|h|\varepsilon(h) = o(|h|)$ by definition.
- $|h|\delta(h) \cdot f(a)$: the vector $f(a)$ is constant, and $|h|\delta(h) = o(|h|)$ since $\delta \to 0$.
Combining, $R(h) = o(|h|)$, confirming the [derivative](/page/Derivative) formula $D(\phi f)_a(h) = \phi(a)Df_a(h) + D\phi_a(h)f(a)$.
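As an illustrative check, consider the following example. The functions are chosen here for illustration and are not part of the original statement. Take $\phi \colon \mathbb{R}^2 \to \mathbb{R}$, $\phi(x, y) = xy$, and $f \colon \mathbb{R}^2 \to \mathbb{R}^2$, $f(x, y) = (x, y)$, at a point $a = (a_1, a_2)$ with $h = (h_1, h_2)$. Then $D\phi_a(h) = a_2 h_1 + a_1 h_2$ and $Df_a(h) = h$, so the formula gives
\begin{align*}
\phi(a)Df_a(h) + D\phi_a(h)f(a) &= a_1 a_2 (h_1, h_2) + (a_2 h_1 + a_1 h_2)(a_1, a_2) \\
&= \bigl(2a_1 a_2 h_1 + a_1^2 h_2,\; a_2^2 h_1 + 2a_1 a_2 h_2\bigr).
\end{align*}
This agrees with differentiating $(\phi f)(x, y) = (x^2 y, x y^2)$ directly: the matrix of partial derivatives at $a$ has rows $(2a_1 a_2,\; a_1^2)$ and $(a_2^2,\; 2a_1 a_2)$, and applying it to $h$ yields the same vector.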
[/guided]
[/step]