[proofplan]
We write the convolution in the form $(f*g)(x)=\int_{\mathbb{R}^n} f(y)g(x-y)\,d\mathcal{L}^n(y)$, so that differentiation in $x$ falls only on the smooth compactly supported factor $g(x-y)$. Compact support of $g$ localizes the integral to a compact set depending on $x$, and local integrability of $f$ provides the domination needed to differentiate under the integral. Once the pointwise identity is proved for first derivatives, iteration gives the formula for every multi-index. The distributional identity then follows by testing against a compactly supported smooth function and integrating by parts in the test variable.
[/proofplan]
[step:Verify that the convolution integrals are finite at every point]
Let $K := \operatorname{supp} g \subset \mathbb{R}^n$, which is compact because $g \in C_c^\infty(\mathbb{R}^n)$. For each $x \in \mathbb{R}^n$, the function
\begin{align*}
y \mapsto f(y)g(x-y)
\end{align*}
vanishes outside the compact set $x-K := \{x-z : z \in K\}$. Since $f \in L^1_{\mathrm{loc}}(\mathbb{R}^n)$, the restriction of $f$ to $x-K$ is integrable with respect to $\mathcal{L}^n$. Also $g$ is bounded on $\mathbb{R}^n$, so
\begin{align*}
\int_{\mathbb{R}^n} |f(y)g(x-y)|\,d\mathcal{L}^n(y) \leq \|g\|_\infty \int_{x-K} |f(y)|\,d\mathcal{L}^n(y) < \infty.
\end{align*}
Thus $(f*g)(x)$ is well-defined for every $x \in \mathbb{R}^n$.
The same argument applies to $D^\alpha g$ for every multi-index $\alpha$, since $D^\alpha g \in C_c^\infty(\mathbb{R}^n)$ and $\operatorname{supp}(D^\alpha g) \subseteq K$. Hence $f*(D^\alpha g)$ is also well-defined at every point.
[/step]
[step:Differentiate once by dominated convergence]
Fix an index $i \in \{1,\dots,n\}$ and a point $x_0 \in \mathbb{R}^n$. Choose $r := 1$ and define the compact set
\begin{align*}
K_0 := \overline{B}(x_0,r)-K = \{z-w : z \in \overline{B}(x_0,r),\, w \in K\}.
\end{align*}
If $x \in B(x_0,r)$ and $g(x-y) \neq 0$, then $x-y \in K$, hence $y \in x-K \subset K_0$. Thus all integrals defining $(f*g)(x)$ for $x \in B(x_0,r)$ are supported in $K_0$.
For $h \in \mathbb{R}$ with $0<|h|<r/2$, define
\begin{align*}
Q_h: K_0 \to \mathbb{R}
\end{align*}
by
\begin{align*}
Q_h(y)= f(y)\frac{g(x_0+h e_i-y)-g(x_0-y)}{h},
\end{align*}
where $e_i \in \mathbb{R}^n$ is the $i$th standard basis vector. Since $g$ is smooth, the one-dimensional [mean value theorem](/theorems/186) applied to the map
\begin{align*}
t \mapsto g(x_0+t e_i-y)
\end{align*}
gives, for each $y \in K_0$,
\begin{align*}
\left|\frac{g(x_0+h e_i-y)-g(x_0-y)}{h}\right| \leq \|\partial_{x_i}g\|_\infty.
\end{align*}
Therefore
\begin{align*}
|Q_h(y)| \leq |f(y)|\|\partial_{x_i}g\|_\infty \mathbb{1}_{K_0}(y).
\end{align*}
The dominating function on the right belongs to $L^1(\mathbb{R}^n)$ because $f \in L^1_{\mathrm{loc}}(\mathbb{R}^n)$ and $K_0$ is compact. For each fixed $y \in K_0$, smoothness of $g$ gives
\begin{align*}
Q_h(y) \to f(y)\partial_{x_i}g(x_0-y)
\end{align*}
as $h \to 0$. Dominated convergence yields
\begin{align*}
\partial_{x_i}(f*g)(x_0)=\int_{\mathbb{R}^n} f(y)\partial_{x_i}g(x_0-y)\,d\mathcal{L}^n(y).
\end{align*}
Since $x_0$ was arbitrary,
\begin{align*}
\partial_{x_i}(f*g)=f*(\partial_{x_i}g)
\end{align*}
pointwise on $\mathbb{R}^n$.
[guided]
The point of writing the convolution as
\begin{align*}
(f*g)(x)=\int_{\mathbb{R}^n} f(y)g(x-y)\,d\mathcal{L}^n(y)
\end{align*}
is that the variable $x$ appears only in the smooth factor $g(x-y)$. We want to justify moving $\partial_{x_i}$ through the integral. Since $f$ is only locally integrable, not necessarily bounded or continuous, this requires a domination argument localized to a compact set.
Fix $i \in \{1,\dots,n\}$ and $x_0 \in \mathbb{R}^n$. Let $K:=\operatorname{supp}g$ and choose $r:=1$. Define
\begin{align*}
K_0:=\overline{B}(x_0,r)-K=\{z-w:z\in\overline{B}(x_0,r),\,w\in K\}.
\end{align*}
This set is compact because it is the image of the compact set $\overline{B}(x_0,r)\times K$ under the continuous map $(z,w)\mapsto z-w$. If $x \in B(x_0,r)$ and $g(x-y)\neq 0$, then $x-y\in K$, so $y=x-(x-y)\in x-K\subset K_0$. Thus, near $x_0$, every convolution integral is actually an integral over the same compact set $K_0$.
Let $e_i\in\mathbb{R}^n$ denote the $i$th standard basis vector. For $h\in\mathbb{R}$ with $0<|h|<r/2$, define
\begin{align*}
Q_h:K_0\to\mathbb{R}
\end{align*}
by
\begin{align*}
Q_h(y)=f(y)\frac{g(x_0+h e_i-y)-g(x_0-y)}{h}.
\end{align*}
For each fixed $y\in K_0$, the function
\begin{align*}
t\mapsto g(x_0+t e_i-y)
\end{align*}
is a smooth real-valued function of one real variable. Applying the mean value theorem to this one-variable function gives
\begin{align*}
\left|\frac{g(x_0+h e_i-y)-g(x_0-y)}{h}\right|\leq \|\partial_{x_i}g\|_\infty.
\end{align*}
Consequently,
\begin{align*}
|Q_h(y)|\leq |f(y)|\|\partial_{x_i}g\|_\infty\mathbb{1}_{K_0}(y).
\end{align*}
The right-hand side is integrable with respect to $\mathcal{L}^n$ because $K_0$ is compact and $f\in L^1_{\mathrm{loc}}(\mathbb{R}^n)$. Also, for every $y\in K_0$, smoothness of $g$ implies
\begin{align*}
Q_h(y)\to f(y)\partial_{x_i}g(x_0-y)
\end{align*}
as $h\to 0$. Dominated convergence therefore permits the limit to pass through the integral:
\begin{align*}
\partial_{x_i}(f*g)(x_0)=\lim_{h\to 0}\int_{\mathbb{R}^n}Q_h(y)\,d\mathcal{L}^n(y)=\int_{\mathbb{R}^n}f(y)\partial_{x_i}g(x_0-y)\,d\mathcal{L}^n(y).
\end{align*}
This is exactly
\begin{align*}
\partial_{x_i}(f*g)(x_0)=(f*\partial_{x_i}g)(x_0).
\end{align*}
Since $x_0$ was arbitrary, the identity holds pointwise on all of $\mathbb{R}^n$.
[/guided]
[/step]
[step:Iterate the first derivative identity to obtain the multi-index formula]
Let $\alpha=(\alpha_1,\dots,\alpha_n)\in\mathbb{N}_0^n$. Define $|\alpha|:=\alpha_1+\cdots+\alpha_n$. We prove
\begin{align*}
D^\alpha(f*g)=f*(D^\alpha g)
\end{align*}
by induction on $|\alpha|$.
If $|\alpha|=0$, then $D^\alpha$ is the identity operator and the assertion is the definition of $f*g$. Assume the assertion holds for all multi-indices of order $m\geq 0$. Let $\beta\in\mathbb{N}_0^n$ satisfy $|\beta|=m+1$. Choose $i\in\{1,\dots,n\}$ with $\beta_i\geq 1$, and define $\alpha:=\beta-e_i$, where $e_i$ is the $i$th standard basis multi-index. Then $|\alpha|=m$. By the induction hypothesis,
\begin{align*}
D^\alpha(f*g)=f*(D^\alpha g).
\end{align*}
Since $D^\alpha g\in C_c^\infty(\mathbb{R}^n)$, the first derivative identity applied with $D^\alpha g$ in place of $g$ gives
\begin{align*}
\partial_{x_i}\bigl(f*(D^\alpha g)\bigr)=f*(\partial_{x_i}D^\alpha g).
\end{align*}
Therefore
\begin{align*}
D^\beta(f*g)=\partial_{x_i}D^\alpha(f*g)=\partial_{x_i}\bigl(f*(D^\alpha g)\bigr)=f*(D^\beta g).
\end{align*}
The induction proves the formula for every multi-index $\alpha$. In particular, all derivatives of $f*g$ exist and are continuous because each derivative is the convolution of $f$ with a compactly supported smooth function, and the first derivative argument also proves continuity by the same local dominated convergence estimate.
[/step]
[step:Identify the distributional derivative with the regular distribution of the classical derivative]
Let $\alpha\in\mathbb{N}_0^n$ and let $\phi\in C_c^\infty(\mathbb{R}^n)$ be an arbitrary [test function](/page/Test%20Function). The [regular distribution](/page/Regular%20Distribution) associated to $f*g$ is the linear functional
\begin{align*}
T_{f*g}:C_c^\infty(\mathbb{R}^n)\to\mathbb{R}
\end{align*}
defined by
\begin{align*}
T_{f*g}(\phi)=\int_{\mathbb{R}^n}(f*g)(x)\phi(x)\,d\mathcal{L}^n(x).
\end{align*}
By the definition of the [distributional derivative](/page/Distributional%20Derivative),
\begin{align*}
D^\alpha T_{f*g}(\phi)=(-1)^{|\alpha|}T_{f*g}(D^\alpha\phi).
\end{align*}
Since $\phi$ has compact support and $f*g$ is smooth, repeated one-dimensional [integration by parts](/theorems/210) in the variables indicated by $\alpha$ gives
\begin{align*}
(-1)^{|\alpha|}\int_{\mathbb{R}^n}(f*g)(x)D^\alpha\phi(x)\,d\mathcal{L}^n(x)=\int_{\mathbb{R}^n}D^\alpha(f*g)(x)\phi(x)\,d\mathcal{L}^n(x).
\end{align*}
The boundary terms vanish because $\phi$ has compact support. Using the classical identity already proved,
\begin{align*}
\int_{\mathbb{R}^n}D^\alpha(f*g)(x)\phi(x)\,d\mathcal{L}^n(x)=\int_{\mathbb{R}^n}(f*(D^\alpha g))(x)\phi(x)\,d\mathcal{L}^n(x).
\end{align*}
The right-hand side is $T_{f*(D^\alpha g)}(\phi)$. Since this holds for every $\phi\in C_c^\infty(\mathbb{R}^n)$,
\begin{align*}
D^\alpha T_{f*g}=T_{f*(D^\alpha g)}
\end{align*}
as distributions on $\mathbb{R}^n$. This proves the distributional statement and completes the proof.
[/step]