[proofplan]
Both sides of the identity are graded derivations on the exterior algebra $\Omega^*(M)$. We verify that they are equal by checking three properties: (i) both are degree-preserving derivations, (ii) both commute with $d$, and (iii) both agree on $\Omega^0(M) = C^\infty(M)$. Since $\Omega^*(M)$ is generated as an algebra by $\Omega^0(M)$ and $d\Omega^0(M)$, any two derivations commuting with $d$ and agreeing on $\Omega^0(M)$ must coincide on the whole algebra. The non-trivial part is that $\mathcal{L}_X$ commutes with $d$, which follows from the naturality of $d$ under pullbacks together with the definition $\mathcal{L}_X = \frac{d}{dt}\big|_{t=0}\varphi_t^*$.
[/proofplan]
[step:Confirm both sides are graded derivations of degree zero on $\Omega^*(M)$]
*The Lie derivative.* By the [Lie Derivative Derivation Property](/theorems/???), $\mathcal{L}_X : \Omega^k(M) \to \Omega^k(M)$ is $\mathbb R$-linear and satisfies
\begin{align*}
\mathcal{L}_X(\omega \wedge \eta) = (\mathcal{L}_X \omega) \wedge \eta + \omega \wedge (\mathcal{L}_X \eta),
\end{align*}
for all $\omega \in \Omega^k(M)$, $\eta \in \Omega^\ell(M)$. Hence $\mathcal{L}_X$ is a derivation of degree $0$.
*The operator $D := \iota_X \circ d + d \circ \iota_X$.* Recall $d: \Omega^k \to \Omega^{k+1}$ (degree $+1$) and $\iota_X: \Omega^k \to \Omega^{k-1}$ (degree $-1$); both are graded derivations of their respective signs. Thus $D = \iota_X d + d \iota_X$ maps $\Omega^k \to \Omega^k$ (degree $0$). A direct computation verifies $D$ is a graded derivation: for $\omega \in \Omega^k$, $\eta \in \Omega^\ell$,
\begin{align*}
d(\omega \wedge \eta) &= d\omega \wedge \eta + (-1)^k \omega \wedge d\eta, \\
\iota_X(\omega \wedge \eta) &= \iota_X \omega \wedge \eta + (-1)^k \omega \wedge \iota_X \eta.
\end{align*}
Combining,
\begin{align*}
(\iota_X d + d\iota_X)(\omega \wedge \eta) = (\iota_X d\omega + d\iota_X\omega) \wedge \eta + \omega \wedge (\iota_X d\eta + d\iota_X\eta) = (D\omega) \wedge \eta + \omega \wedge (D\eta),
\end{align*}
after the signs from the graded Leibniz rules cancel in pairs (the $(-1)^{k+1}$ from $d$ on $\iota_X \omega \wedge \eta$ and the $(-1)^k$ from $\iota_X$ on $d\omega \wedge \eta$). Hence $D$ is a degree-$0$ derivation.
[guided]
The plan is to show $\mathcal L_X = D$ where $D := \iota_X d + d\iota_X$, and we want to reduce the identity on $\Omega^*(M)$ to an identity on a generating set. For this strategy to work, both operators must (a) be graded derivations (so they're determined by their values on generators via Leibniz) and (b) commute with $d$ (so they're determined by their values on $0$-forms once they're derivations commuting with $d$).
*Why is $D$ a derivation?* The key sign bookkeeping is that $\iota_X$ has degree $-1$ and $d$ has degree $+1$, so $\iota_X d$ has degree $0$ and $d\iota_X$ has degree $0$ — the sum $D$ has degree $0$. For a $k$-form $\omega$ and $\ell$-form $\eta$:
\begin{align*}
\iota_X d(\omega \wedge \eta) &= \iota_X(d\omega \wedge \eta + (-1)^k \omega \wedge d\eta) \\
&= \iota_X d\omega \wedge \eta + (-1)^{k+1}d\omega \wedge \iota_X\eta + (-1)^k \iota_X\omega \wedge d\eta + \omega \wedge \iota_X d\eta,
\end{align*}
and
\begin{align*}
d\iota_X(\omega \wedge \eta) &= d(\iota_X\omega \wedge \eta + (-1)^k \omega \wedge \iota_X\eta) \\
&= d\iota_X\omega \wedge \eta + (-1)^{k-1}\iota_X\omega \wedge d\eta + (-1)^k d\omega \wedge \iota_X\eta + \omega \wedge d\iota_X\eta.
\end{align*}
Adding, the cross-terms $(-1)^{k+1}d\omega\wedge\iota_X\eta + (-1)^k d\omega\wedge\iota_X\eta = 0$ and $(-1)^k \iota_X\omega\wedge d\eta + (-1)^{k-1}\iota_X\omega\wedge d\eta = 0$ cancel. What remains is
\begin{align*}
D(\omega\wedge\eta) = (\iota_X d\omega + d\iota_X\omega)\wedge\eta + \omega\wedge(\iota_X d\eta + d\iota_X\eta) = D\omega\wedge\eta + \omega\wedge D\eta.
\end{align*}
So $D$ is a degree-$0$ derivation.
The Lie derivative is a derivation by a classical computation: $\mathcal L_X(\omega\wedge\eta) = \frac{d}{dt}\big|_0\varphi_t^*(\omega\wedge\eta) = \frac{d}{dt}\big|_0(\varphi_t^*\omega \wedge \varphi_t^*\eta)$, and the Leibniz rule for the time derivative at $t = 0$ gives the derivation property.
[/guided]
[/step]
[step:Prove $\mathcal L_X$ commutes with $d$ via naturality of $d$ under pullbacks]
Let $(\varphi_t)$ be the (local) flow of $X$, so that by definition of the Lie derivative,
\begin{align*}
\mathcal L_X \omega = \left.\frac{d}{dt}\right|_{t=0} \varphi_t^*\omega, \qquad \omega \in \Omega^k(M).
\end{align*}
By the [Naturality of the Exterior Derivative](/theorems/???), $d$ commutes with pullbacks: $\varphi_t^* \circ d = d \circ \varphi_t^*$. Hence, for any $\omega \in \Omega^k(M)$,
\begin{align*}
d(\mathcal L_X \omega) = d\!\left(\left.\tfrac{d}{dt}\right|_{t=0}\! \varphi_t^*\omega\right) = \left.\tfrac{d}{dt}\right|_{t=0}\! d(\varphi_t^*\omega) = \left.\tfrac{d}{dt}\right|_{t=0}\! \varphi_t^*(d\omega) = \mathcal L_X(d\omega).
\end{align*}
The exchange of $d$ and $\frac{d}{dt}\big|_{t=0}$ is justified because $d$ is a continuous $\mathbb R$-linear operator on smooth forms (it acts by pointwise linear combinations of partial derivatives of coefficients, which commute with the smooth parameter $t$), and the family $t \mapsto \varphi_t^*\omega$ is smooth in $t$.
[guided]
The identity $d\mathcal L_X = \mathcal L_X d$ is the technical heart of the proof. The key ingredient is the **naturality** of the exterior derivative: for any smooth map $F: N \to M$ and any $\omega \in \Omega^k(M)$,
\begin{align*}
F^*(d\omega) = d(F^*\omega).
\end{align*}
Applied to the flow $\varphi_t: M \to M$:
\begin{align*}
\varphi_t^*(d\omega) = d(\varphi_t^*\omega).
\end{align*}
Now we differentiate in $t$ at $t = 0$. The right-hand side is $\frac{d}{dt}\big|_0 d(\varphi_t^*\omega)$. We want to exchange $d$ with $\frac{d}{dt}\big|_0$. Why is this legal? In local coordinates, $\varphi_t^*\omega = \sum f_I(x,t)\, dx^I$ where $f_I \in C^\infty(M \times (-\varepsilon, \varepsilon))$, and
\begin{align*}
d(\varphi_t^*\omega) = \sum_{I, j} \partial_{x_j}f_I(x,t)\, dx_j \wedge dx^I.
\end{align*}
Differentiating in $t$ at $0$:
\begin{align*}
\tfrac{d}{dt}\big|_0 d(\varphi_t^*\omega) = \sum_{I,j} \partial_t \partial_{x_j}f_I(x,0)\, dx_j \wedge dx^I = \sum_{I,j} \partial_{x_j}(\partial_t f_I)(x,0)\, dx_j \wedge dx^I,
\end{align*}
where the final step uses Clairaut's theorem on equality of mixed partials for smooth functions. The last expression equals $d(\mathcal L_X \omega)$. Thus the exchange is justified, and
\begin{align*}
d\mathcal L_X \omega = \mathcal L_X d\omega.
\end{align*}
The analogous identity $dD = Dd$ is immediate from $d^2 = 0$: $dD = d(\iota_X d + d\iota_X) = d\iota_X d$, and $Dd = (\iota_X d + d\iota_X)d = \iota_X d^2 + d\iota_X d = d\iota_X d$, so $dD = Dd$.
[/guided]
[/step]
[step:Verify $D$ commutes with $d$ directly using $d^2 = 0$]
Since $d \circ d = 0$ on $\Omega^*(M)$,
\begin{align*}
d \circ D &= d \circ (\iota_X d + d \iota_X) = d \iota_X d + d^2 \iota_X = d\iota_X d, \\
D \circ d &= (\iota_X d + d \iota_X) \circ d = \iota_X d^2 + d\iota_X d = d\iota_X d.
\end{align*}
Hence $d \circ D = D \circ d$.
[/step]
[step:Verify equality on $\Omega^0(M) = C^\infty(M)$]
For $f \in C^\infty(M)$, the interior product $\iota_X f$ is zero (since $f$ is a $0$-form and $\iota_X$ lowers degree into negative degree, by convention the result is $0$). The exterior derivative $df$ is a $1$-form, and $\iota_X(df) = df(X) = X \cdot f$ (the directional derivative of $f$ along $X$). Therefore
\begin{align*}
D f = \iota_X(df) + d(\iota_X f) = X \cdot f + 0 = X \cdot f.
\end{align*}
On the other hand, by the definition of the Lie derivative on functions,
\begin{align*}
\mathcal L_X f = \left.\tfrac{d}{dt}\right|_{t=0} \varphi_t^*f = \left.\tfrac{d}{dt}\right|_{t=0} f \circ \varphi_t = df_{\varphi_0(\cdot)}(X) = X \cdot f.
\end{align*}
Hence $\mathcal L_X f = D f$ for all $f \in C^\infty(M)$.
[/step]
[step:Conclude by the derivation generating principle on $\Omega^*(M)$]
We have shown:
- $\mathcal L_X$ and $D := \iota_X d + d\iota_X$ are both degree-$0$ graded derivations on $\Omega^*(M)$;
- both commute with $d$;
- they agree on $\Omega^0(M)$.
We claim this forces $\mathcal L_X = D$ on all of $\Omega^*(M)$. Locally on a coordinate chart $(U, (x_1, \dots, x_n))$, every $k$-form is a finite $C^\infty(U)$-linear combination of wedges $dx_{i_1} \wedge \cdots \wedge dx_{i_k} = d x_{i_1} \wedge \cdots \wedge dx_{i_k}$. Each $x_j \in C^\infty(U)$ and each $dx_j = d(x_j)$, so every form on $U$ lies in the algebra generated by $\Omega^0(U)$ and $d\Omega^0(U)$. By the Leibniz (derivation) property, the value of a degree-$0$ derivation on such a product is determined by its values on $\Omega^0$ and its values on $d\Omega^0$; commutation with $d$ reduces the latter to the former via $d\mathcal L_X f = \mathcal L_X df$ and $d D f = D d f$. Since $\mathcal L_X f = D f$ on functions by the previous step, it follows that $\mathcal L_X = D$ on all of $\Omega^*(U)$. As $M$ is covered by such coordinate charts and the identity is local, $\mathcal L_X = D$ globally on $\Omega^*(M)$:
\begin{align*}
\mathcal L_X \omega = \iota_X(d\omega) + d(\iota_X \omega), \qquad \omega \in \Omega^*(M).
\end{align*}
[guided]
The generating principle says: if $A, B$ are degree-$0$ derivations on $\Omega^*(M)$ that both commute with $d$ and agree on $\Omega^0(M)$, then $A = B$. The proof is local. In a chart $(U, x_1, \dots, x_n)$, every $k$-form has the local expression
\begin{align*}
\omega = \sum_{I} f_I\, dx_{i_1} \wedge \cdots \wedge dx_{i_k}.
\end{align*}
By the derivation property applied iteratively to the wedge products,
\begin{align*}
A\omega = \sum_I \Big[ (Af_I)\, dx_{i_1}\wedge\cdots\wedge dx_{i_k} + f_I \sum_{r=1}^k (-1)^{?}\, dx_{i_1}\wedge\cdots\wedge A(dx_{i_r})\wedge\cdots\wedge dx_{i_k} \Big].
\end{align*}
Now $A(dx_{i_r}) = A(d x_{i_r}) = d(Ax_{i_r})$ because $A$ commutes with $d$. Since $x_{i_r} \in \Omega^0(U)$ and $A$ agrees with $B$ on $\Omega^0$, $A(x_{i_r}) = B(x_{i_r})$, and similarly $A(dx_{i_r}) = d A x_{i_r} = d B x_{i_r} = B(dx_{i_r})$. Every ingredient in the formula for $A\omega$ equals the corresponding ingredient for $B\omega$. Hence $A\omega = B\omega$ locally, and by coverage of $M$ by charts, $A = B$ globally.
Applied to $A = \mathcal L_X$ and $B = D$: both are derivations, both commute with $d$, both agree on $C^\infty(M)$. Therefore $\mathcal L_X = \iota_X d + d\iota_X$ on $\Omega^*(M)$, which is Cartan's magic formula.
[/guided]
[/step]