[proofplan]
We differentiate the curve $s\mapsto \exp(X+sY)$ in the ambient [matrix space](/page/Matrix%20Space) and use the standard integral formula for the derivative of the matrix exponential. Let $\mathcal L^1$ denote [Lebesgue measure](/page/Lebesgue%20Measure) on $\mathbb R$. Right translation by $\exp(-X)$ moves the resulting tangent vector at $\exp X$ back to the identity, where the integrand becomes the conjugation expression $\exp(tX)Y\exp(-tX)$. We then identify this conjugation expression with $\exp(t\operatorname{ad}_X)(Y)$ and integrate the uniformly convergent operator exponential series term by term. Finally, right translation by $\exp X$ gives the displayed formula for $(d\exp)_X(Y)$ itself.
[/proofplan]
[step:Differentiate the exponential curve by the matrix variation formula]
Let $\alpha:\mathbb R\to G$ be the smooth curve defined by $\alpha(s)=\exp(X+sY)$. This is a curve in $G$ because $X+sY\in\mathfrak g$ for every $s\in\mathbb R$ by [citetheorem:8786], and the exponential of an element of the [Lie algebra](/page/Lie%20Algebra) of a matrix Lie group lies in the group. By definition of the differential of the exponential map,
\begin{align*}
(d\exp)_X(Y)=\alpha'(0).
\end{align*}
We prove the finite-dimensional matrix exponential variation formula needed here. For matrices $A,B\in M(n,\mathbb C)$, the series defining $\exp(A+sB)$ and its termwise derivative with respect to $s$ converge locally uniformly in $s$ by the standard Weierstrass theorem for [power series](/page/Power%20Series) with values in a finite-dimensional normed space. Therefore termwise differentiation is valid and gives
\begin{align*}
\frac{d}{ds}\bigg|_{s=0}\exp(A+sB)=\sum_{m=1}^{\infty}\frac{1}{m!}\sum_{j=0}^{m-1}A^jBA^{m-1-j}.
\end{align*}
On the other hand, expanding the integrand and using absolute convergence to integrate term by term gives the same series:
\begin{align*}
\int_0^1 \exp(tA)B\exp((1-t)A)\,d\mathcal L^1(t)=\sum_{p,q\ge 0}\frac{A^pBA^q}{p!q!}\int_0^1 t^p(1-t)^q\,d\mathcal L^1(t).
\end{align*}
For nonnegative integers $p$ and $q$, the beta-integral identity gives
\begin{align*}
\int_0^1 t^p(1-t)^q\,d\mathcal L^1(t)=\frac{p!q!}{(p+q+1)!}.
\end{align*}
Writing $m=p+q+1$ recovers the preceding derivative series. Applying this formula with $A=X$ and $B=Y$ gives
\begin{align*}
(d\exp)_X(Y)
=
\int_0^1 \exp(tX)Y\exp((1-t)X)\,d\mathcal L^1(t).
\end{align*}
[guided]
Define $\alpha:\mathbb R\to G$ by $\alpha(s)=\exp(X+sY)$. Since $X,Y\in\mathfrak g$ and $\mathfrak g$ is a real [vector space](/page/Vector%20Space) by [citetheorem:8786], we have $X+sY\in\mathfrak g$ for every $s\in\mathbb R$; hence $\alpha(s)\in G$ by the defining exponential property of the Lie algebra of a matrix Lie group. The differential of $\exp:\mathfrak g\to G$ at $X$ in the direction $Y$ is the velocity of this curve at $s=0$:
\begin{align*}
(d\exp)_X(Y)=\alpha'(0).
\end{align*}
We now compute this velocity. For matrices $A,B\in M(n,\mathbb C)$, the series defining $\exp(A+sB)$ and its termwise derivative with respect to $s$ converge locally uniformly in $s$ by the standard Weierstrass theorem for power series with values in a finite-dimensional normed space. Therefore termwise differentiation is valid and gives
\begin{align*}
\frac{d}{ds}\bigg|_{s=0}\exp(A+sB)=\sum_{m=1}^{\infty}\frac{1}{m!}\sum_{j=0}^{m-1}A^jBA^{m-1-j}.
\end{align*}
Expanding $\exp(tA)B\exp((1-t)A)$ into its absolutely convergent double series and integrating over $[0,1]$ with respect to $\mathcal L^1$ gives
\begin{align*}
\int_0^1 \exp(tA)B\exp((1-t)A)\,d\mathcal L^1(t)=\sum_{p,q\ge 0}\frac{A^pBA^q}{p!q!}\int_0^1 t^p(1-t)^q\,d\mathcal L^1(t).
\end{align*}
The beta-integral identity gives $\int_0^1 t^p(1-t)^q\,d\mathcal L^1(t)=p!q!/(p+q+1)!$. Setting $m=p+q+1$ shows that this integral equals the derivative series above. Therefore, with $A=X$ and $B=Y$,
\begin{align*}
(d\exp)_X(Y)=\int_0^1 \exp(tX)Y\exp((1-t)X)\,d\mathcal L^1(t).
\end{align*}
Right translation by $\exp(-X)$ is the smooth map $R_{\exp(-X)}:G\to G$ given by $A\mapsto A\exp(-X)$. Its differential is right multiplication by $\exp(-X)$ on matrix tangent vectors, so
\begin{align*}
(dR_{\exp(-X)})_{\exp X}\bigl((d\exp)_X(Y)\bigr)=\int_0^1 \exp(tX)Y\exp((1-t)X)\exp(-X)\,d\mathcal L^1(t).
\end{align*}
By the exponential law for one matrix from [citetheorem:8778], $\exp((1-t)X)\exp(-X)=\exp(-tX)$. Hence
\begin{align*}
(dR_{\exp(-X)})_{\exp X}\bigl((d\exp)_X(Y)\bigr)=\int_0^1 \exp(tX)Y\exp(-tX)\,d\mathcal L^1(t).
\end{align*}
Let $T:\mathfrak g\to\mathfrak g$ be the [linear map](/page/Linear%20Map) $T(Z)=\operatorname{ad}_X(Z)=[X,Z]$. This is well-defined by closure of $\mathfrak g$ under commutators, [citetheorem:8787]. For $t\in[0,1]$, conjugation by $\exp(tX)\in G$ preserves $\mathfrak g$ because it is the differential at the identity of the smooth inner automorphism $A\mapsto \exp(tX)A\exp(-tX)$ of $G$. Thus $C_t:\mathfrak g\to\mathfrak g$, defined by $C_t(Z)=\exp(tX)Z\exp(-tX)$, is a well-defined linear map. Differentiating $C_t(Y)$ gives
\begin{align*}
\frac{d}{dt}C_t(Y)=X\exp(tX)Y\exp(-tX)-\exp(tX)Y\exp(-tX)X=[X,C_t(Y)]=T(C_t(Y)),
\end{align*}
with initial value $C_0(Y)=Y$. The unique solution of this finite-dimensional linear equation is
\begin{align*}
C_t(Y)=\exp(tT)(Y)=\sum_{k=0}^{\infty}\frac{t^kT^k(Y)}{k!}.
\end{align*}
Therefore
\begin{align*}
\exp(tX)Y\exp(-tX)=\sum_{k=0}^{\infty}\frac{t^k(\operatorname{ad}_X)^k(Y)}{k!}.
\end{align*}
Choose a norm $\|\cdot\|_{\mathfrak g}$ on the finite-dimensional vector space $\mathfrak g$. Let $\mathcal L(\mathfrak g)$ denote the vector space of linear maps from $\mathfrak g$ to itself, equipped with the corresponding operator norm $\|\cdot\|_{\mathcal L(\mathfrak g)}$. For $t\in[0,1]$,
\begin{align*}
\left\|\frac{t^kT^k(Y)}{k!}\right\|_{\mathfrak g}\le \frac{\|T\|_{\mathcal L(\mathfrak g)}^k\|Y\|_{\mathfrak g}}{k!}.
\end{align*}
The majorising scalar exponential series converges, so the Weierstrass test gives [uniform convergence](/page/Uniform%20Convergence) on $[0,1]$. Termwise integration in the finite-dimensional normed space $\mathfrak g$ is therefore valid, and
\begin{align*}
\int_0^1 \exp(tX)Y\exp(-tX)\,d\mathcal L^1(t)=\sum_{k=0}^{\infty}\int_0^1 \frac{t^k(\operatorname{ad}_X)^k(Y)}{k!}\,d\mathcal L^1(t).
\end{align*}
Since $\int_0^1 t^k\,d\mathcal L^1(t)=1/(k+1)$, this becomes
\begin{align*}
\int_0^1 \exp(tX)Y\exp(-tX)\,d\mathcal L^1(t)=\sum_{k=0}^{\infty}\frac{(\operatorname{ad}_X)^k(Y)}{(k+1)!}.
\end{align*}
Combining with the right-translated derivative identity gives
\begin{align*}
(dR_{\exp(-X)})_{\exp X}\bigl((d\exp)_X(Y)\bigr)=\sum_{k=0}^{\infty}\frac{(\operatorname{ad}_X)^k(Y)}{(k+1)!}.
\end{align*}
Finally, $R_{\exp X}$ and $R_{\exp(-X)}$ are inverse diffeomorphisms of $G$, so their differentials at the corresponding points are inverse linear maps. Applying $(dR_{\exp X})_I$ to both sides yields
\begin{align*}
(d\exp)_X(Y)=(dR_{\exp X})_I\left(\sum_{k=0}^{\infty}\frac{(\operatorname{ad}_X)^k(Y)}{(k+1)!}\right).
\end{align*}
This proves both the stated differential formula and its equivalent right-translated form.
[/guided]
[/step]
[step:Right translate the derivative back to the identity]
Right translation by $\exp(-X)$ is the smooth map
\begin{align*}
R_{\exp(-X)}:G&\to G,\qquad A\mapsto A\exp(-X).
\end{align*}
For a tangent vector represented by a matrix-valued curve, its differential is right multiplication by $\exp(-X)$. Therefore
\begin{align*}
(dR_{\exp(-X)})_{\exp X}\bigl((d\exp)_X(Y)\bigr)
=
\left(\int_0^1 \exp(tX)Y\exp((1-t)X)\,d\mathcal L^1(t)\right)\exp(-X).
\end{align*}
Linearity of right multiplication and of the finite-dimensional integral gives
\begin{align*}
(dR_{\exp(-X)})_{\exp X}\bigl((d\exp)_X(Y)\bigr)
=
\int_0^1 \exp(tX)Y\exp((1-t)X)\exp(-X)\,d\mathcal L^1(t).
\end{align*}
Using the exponential law for the single matrix $X$ from [citetheorem:8778],
\begin{align*}
\exp((1-t)X)\exp(-X)=\exp(-tX).
\end{align*}
Thus
\begin{align*}
(dR_{\exp(-X)})_{\exp X}\bigl((d\exp)_X(Y)\bigr)
=
\int_0^1 \exp(tX)Y\exp(-tX)\,d\mathcal L^1(t).
\end{align*}
[/step]
[step:Expand conjugation by $\exp(tX)$ as the exponential of $\operatorname{ad}_X$]
Define the linear map
\begin{align*}
T:\mathfrak g&\to\mathfrak g,\qquad T(Z)=\operatorname{ad}_X(Z)=[X,Z].
\end{align*}
This map is well-defined because $\mathfrak g$ is closed under commutators by [citetheorem:8787]. For each $t\in[0,1]$, the matrix $\exp(tX)$ lies in $G$, and conjugation by $\exp(tX)$ is the smooth inner automorphism $A\mapsto \exp(tX)A\exp(-tX)$ of $G$. Its differential at $I$ preserves $T_I G=\mathfrak g$, so the map $C_t:\mathfrak g\to\mathfrak g$ defined by $C_t(Z)=\exp(tX)Z\exp(-tX)$ is well-defined. The map $t\mapsto C_t(Y)$ satisfies the linear differential equation
\begin{align*}
\frac{d}{dt}C_t(Y)=T(C_t(Y)),\qquad C_0(Y)=Y.
\end{align*}
Indeed, differentiating the two matrix exponential factors gives
\begin{align*}
\frac{d}{dt}\bigl(\exp(tX)Y\exp(-tX)\bigr)
=
X\exp(tX)Y\exp(-tX)-\exp(tX)Y\exp(-tX)X,
\end{align*}
and $X$ commutes with $\exp(tX)$ and with $\exp(-tX)$ by the exponential law for one matrix. Hence this derivative equals $[X,C_t(Y)]$.
The unique solution of this finite-dimensional linear equation is
\begin{align*}
C_t(Y)=\exp(tT)(Y)
=
\sum_{k=0}^{\infty}\frac{t^kT^k(Y)}{k!}.
\end{align*}
Therefore, for every $t\in[0,1]$,
\begin{align*}
\exp(tX)Y\exp(-tX)
=
\sum_{k=0}^{\infty}\frac{t^k(\operatorname{ad}_X)^k(Y)}{k!}.
\end{align*}
[/step]
[step:Integrate the operator exponential series term by term]
Choose any norm $\|\cdot\|_{\mathfrak g}$ on the finite-dimensional vector space $\mathfrak g$. Let $\mathcal L(\mathfrak g)$ denote the vector space of linear maps from $\mathfrak g$ to itself, equipped with the associated operator norm $\|\cdot\|_{\mathcal L(\mathfrak g)}$. We apply this norm to $T=\operatorname{ad}_X\in\mathcal L(\mathfrak g)$. For $t\in[0,1]$,
\begin{align*}
\left\|\frac{t^kT^k(Y)}{k!}\right\|_{\mathfrak g}
\le
\frac{\|T\|_{\mathcal L(\mathfrak g)}^k\|Y\|_{\mathfrak g}}{k!}.
\end{align*}
The scalar series on the right converges, so the Weierstrass test gives uniform convergence of the vector-valued series on $[0,1]$. Hence termwise integration is valid in the finite-dimensional normed space $\mathfrak g$:
\begin{align*}
\int_0^1 \exp(tX)Y\exp(-tX)\,d\mathcal L^1(t)
=
\sum_{k=0}^{\infty}\int_0^1 \frac{t^k(\operatorname{ad}_X)^k(Y)}{k!}\,d\mathcal L^1(t).
\end{align*}
For each $k\ge 0$,
\begin{align*}
\int_0^1 \frac{t^k(\operatorname{ad}_X)^k(Y)}{k!}\,d\mathcal L^1(t)
=
\frac{(\operatorname{ad}_X)^k(Y)}{(k+1)!}.
\end{align*}
Therefore
\begin{align*}
\int_0^1 \exp(tX)Y\exp(-tX)\,d\mathcal L^1(t)
=
\sum_{k=0}^{\infty}\frac{(\operatorname{ad}_X)^k(Y)}{(k+1)!}.
\end{align*}
Each partial sum lies in $\mathfrak g$, and since $\mathfrak g$ is a finite-dimensional subspace of $M(n,\mathbb C)$, it is closed; consequently the series sum also lies in $\mathfrak g$.
[/step]
[step:Translate the identity form back to the tangent space at $\exp X$]
Combining the preceding steps gives
\begin{align*}
(dR_{\exp(-X)})_{\exp X}\bigl((d\exp)_X(Y)\bigr)
=
\sum_{k=0}^{\infty}\frac{(\operatorname{ad}_X)^k(Y)}{(k+1)!}.
\end{align*}
Apply the inverse right translation differential $(dR_{\exp X})_I$ to both sides. Since $R_{\exp X}$ and $R_{\exp(-X)}$ are inverse diffeomorphisms of $G$, their differentials at the corresponding points are inverse linear maps. Thus
\begin{align*}
(d\exp)_X(Y)
=
(dR_{\exp X})_I\left(\sum_{k=0}^{\infty}\frac{(\operatorname{ad}_X)^k(Y)}{(k+1)!}\right).
\end{align*}
This is exactly the asserted formula, and the displayed right-translated identity is the equivalent formulation.
[/step]