[proofplan]
We compute $\operatorname{BCH}(X,Y)$ by inserting $tY$ in the second variable and studying the path $Z(t)=\operatorname{BCH}(X,tY)$. The differential equation for this path expresses $Z'(t)$ in terms of the operator $\operatorname{ad}_{Z(t)}$, and its Bernoulli expansion determines the homogeneous bracket terms recursively. Comparing degree $2$ terms gives the coefficient of $[X,Y]$, and comparing degree $3$ terms gives the two nested-bracket coefficients after integration from $t=0$ to $t=1$.
[/proofplan]
[step:Introduce the homogeneous expansion of $Z(t)=\operatorname{BCH}(X,tY)$]
Fix $X,Y\in\mathfrak g$ sufficiently close to $0$ so that $tY\in U$ and $\exp(X)\exp(tY)\in V$ for every $t\in[0,1]$. Thus the local Baker--Campbell--Hausdorff expression is defined along the whole path $t\in[0,1]$. Define the smooth map
\begin{align*}Z:[0,1]\to\mathfrak g,\qquad t\mapsto \operatorname{BCH}(X,tY).\end{align*}
Then $Z(0)=X$ and $Z(1)=\operatorname{BCH}(X,Y)$. The local Baker--Campbell--Hausdorff map is analytic over $\mathbb F$ near $(0,0)$. Its homogeneous terms are the universal Lie-polynomial terms in the BCH expansion, so we write its bracket-degree-$k$ part as $Z_k(t)$.
We decompose $Z(t)$ by homogeneous bracket degree in the two variables $X$ and $Y$:
\begin{align*}
Z(t)=X+tY+Z_2(t)+Z_3(t)+O_4(X,Y),
\end{align*}
where $Z_k(t)$ denotes the homogeneous degree-$k$ part, and the remainder $O_4(X,Y)$ contains only iterated brackets of total degree at least $4$. Since $Z(0)=X$, the higher homogeneous terms satisfy
\begin{align*}
Z_2(0)=0,\qquad Z_3(0)=0.
\end{align*}
[/step]
[step:Expand the BCH differential equation through degree three]
For each $A\in\mathfrak g$, define the [linear map](/page/Linear%20Map) \begin{align*}\operatorname{ad}_A:\mathfrak g\to\mathfrak g,\qquad W\mapsto [A,W],\end{align*} and let $\log:V\to U$ denote the local inverse of $\exp:U\to V$ from the theorem statement. For every $t\in[0,1]$, the element $\exp(X)\exp(tY)$ lies in $V$, so $Z(t)=\log(\exp(X)\exp(tY))$ stays in the logarithm chart. The hypotheses of the differential equation for the BCH path [citetheorem:8795] are therefore satisfied for this sufficiently small pair $X,Y$. Applying that theorem to $Z$ gives
\begin{align*}
\frac{dZ}{dt}=\frac{\operatorname{ad}_{Z(t)}}{1-e^{-\operatorname{ad}_{Z(t)}}}(Y).
\end{align*}
The scalar [power series](/page/Power%20Series)
\begin{align*}
\frac{u}{1-e^{-u}} = 1+\frac{u}{2}+\frac{u^2}{12}+O(u^4)
\end{align*}
has no cubic term. Substituting the linear operator $u=\operatorname{ad}_{Z(t)}$ gives, through homogeneous bracket degree $3$,
\begin{align*}
\frac{dZ}{dt}=Y+\frac{1}{2}[Z(t),Y]+\frac{1}{12}[Z(t),[Z(t),Y]]+O_4(X,Y).
\end{align*}
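The scalar expansion used above can be checked with exact rational arithmetic. The following is a minimal sketch, not part of the proof: it assumes only the Taylor series $(1-e^{-u})/u=\sum_{k\ge 0}(-1)^k u^k/(k+1)!$ and inverts it term by term. The names `N`, `denom`, and `coeffs` are illustrative choices.

```python
from fractions import Fraction
from math import factorial

N = 5  # number of series coefficients to compute (illustrative choice)

# Coefficients of (1 - e^{-u}) / u = sum_k (-1)^k u^k / (k+1)!
denom = [Fraction((-1) ** k, factorial(k + 1)) for k in range(N)]

# Invert the series: find coeffs with (sum coeffs_k u^k) * (sum denom_k u^k) = 1.
coeffs = []
for n in range(N):
    target = Fraction(1 if n == 0 else 0)
    known = sum(coeffs[j] * denom[n - j] for j in range(n))
    coeffs.append((target - known) / denom[0])

print([str(c) for c in coeffs])
# prints ['1', '1/2', '1/12', '0', '-1/720']
```

The vanishing fourth entry is the missing cubic term, and the last entry shows that the first omitted term of the series is $-u^4/720$.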
[guided]
The point of this step is that the BCH curve satisfies an ODE whose coefficients are universal power-series functions of the adjoint operator. We use the notation
\begin{align*}
Z(t)=\log(\exp(X)\exp(tY))
\end{align*}
with $\log:V\to U$ the local inverse of $\exp:U\to V$. The theorem [citetheorem:8795] applies because $X$ and $Y$ were chosen sufficiently small and the BCH neighbourhood was shrunk so that $\exp(X)\exp(tY)\in V$ for every $t\in[0,1]$. Thus the path remains inside the logarithm chart on which $Z(t)=\log(\exp(X)\exp(tY))$ is defined, exactly as required by the cited BCH path ODE. Since $\exp(Z(t))=\exp(X)\exp(tY)$ gives $\exp(Z(t))^{-1}\frac{d}{dt}\exp(Z(t))=Y$, its conclusion is the left-trivialized differential equation
\begin{align*}
\frac{dZ}{dt}=\frac{\operatorname{ad}_{Z(t)}}{1-e^{-\operatorname{ad}_{Z(t)}}}(Y).
\end{align*}
Now expand the scalar function
\begin{align*}
\frac{u}{1-e^{-u}}=1+\frac{u}{2}+\frac{u^2}{12}+O(u^4).
\end{align*}
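Substituting $u=\operatorname{ad}_{Z(t)}$ turns $u(Y)$ into $[Z(t),Y]$ and $u^2(Y)$ into $[Z(t),[Z(t),Y]]$. The cubic coefficient vanishes, so the first omitted term of the series is of order $u^4$. Each appearance of $Z(t)$ contributes at least degree $1$ and $Y$ contributes degree $1$, so $u^k(Y)$ has total degree at least $k+1$ in $X$ and $Y$; the omitted series terms therefore have degree at least $5$. The parts of the two retained bracket terms of degree at least $4$ are absorbed into $O_4(X,Y)$. Therefore, up to homogeneous degree $3$,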
\begin{align*}
\frac{dZ}{dt}=Y+\frac{1}{2}[Z(t),Y]+\frac{1}{12}[Z(t),[Z(t),Y]]+O_4(X,Y).
\end{align*}
[/guided]
[/step]
[step:Compare degree two terms to determine $Z_2(t)$]
Using
\begin{align*}
Z(t)=X+tY+Z_2(t)+Z_3(t)+O_4(X,Y),
\end{align*}
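the only degree-$2$ contribution to the right-hand side comes from $\frac{1}{2}[Z(t),Y]$ with $Z(t)$ replaced by its degree-$1$ part $X+tY$, so the degree-$2$ part of the differential equation is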
\begin{align*}
Z_2'(t)=\frac{1}{2}[X+tY,Y].
\end{align*}
Since $[Y,Y]=0$ by antisymmetry of the Lie bracket,
\begin{align*}
Z_2'(t)=\frac{1}{2}[X,Y].
\end{align*}
Together with $Z_2(0)=0$, integration over $[0,t]$ gives
\begin{align*}
Z_2(t)=\frac{t}{2}[X,Y].
\end{align*}
[/step]
[step:Compare degree three terms to determine $Z_3(t)$]
The degree-$3$ terms in the differential equation are
\begin{align*}
Z_3'(t)=\frac{1}{2}[Z_2(t),Y]+\frac{1}{12}[X+tY,[X+tY,Y]].
\end{align*}
Using $Z_2(t)=\frac{t}{2}[X,Y]$, the first term is
\begin{align*}
\frac{1}{2}[Z_2(t),Y]=\frac{t}{4}[[X,Y],Y]=-\frac{t}{4}[Y,[X,Y]].
\end{align*}
For the second term, antisymmetry gives $[Y,Y]=0$, so
\begin{align*}
[X+tY,Y]=[X,Y].
\end{align*}
Hence
\begin{align*}
\frac{1}{12}[X+tY,[X+tY,Y]]=\frac{1}{12}[X+tY,[X,Y]].
\end{align*}
By bilinearity of the Lie bracket,
\begin{align*}
\frac{1}{12}[X+tY,[X,Y]]=\frac{1}{12}[X,[X,Y]]+\frac{t}{12}[Y,[X,Y]].
\end{align*}
Combining the two degree-$3$ contributions yields
\begin{align*}
Z_3'(t)=\frac{1}{12}[X,[X,Y]]-\frac{t}{6}[Y,[X,Y]].
\end{align*}
[guided]
We now isolate exactly the terms of bracket degree $3$. There are two sources. First, the term $\frac{1}{2}[Z(t),Y]$ contributes degree $3$ only when the degree-$2$ part $Z_2(t)$ of $Z(t)$ is used. Second, the term $\frac{1}{12}[Z(t),[Z(t),Y]]$ contributes degree $3$ only when both copies of $Z(t)$ are replaced by their degree-$1$ part $X+tY$. Therefore
\begin{align*}
Z_3'(t)=\frac{1}{2}[Z_2(t),Y]+\frac{1}{12}[X+tY,[X+tY,Y]].
\end{align*}
From the previous step,
\begin{align*}
Z_2(t)=\frac{t}{2}[X,Y].
\end{align*}
Substituting this into the first contribution gives
\begin{align*}
\frac{1}{2}[Z_2(t),Y]=\frac{1}{2}\left[\frac{t}{2}[X,Y],Y\right].
\end{align*}
By bilinearity this is
\begin{align*}
\frac{t}{4}[[X,Y],Y].
\end{align*}
By antisymmetry, $[[X,Y],Y]=-[Y,[X,Y]]$, so
\begin{align*}
\frac{1}{2}[Z_2(t),Y]=-\frac{t}{4}[Y,[X,Y]].
\end{align*}
For the nested contribution, first compute the inner bracket. Bilinearity gives
\begin{align*}
[X+tY,Y]=[X,Y]+t[Y,Y].
\end{align*}
Antisymmetry gives $[Y,Y]=0$, hence
\begin{align*}
[X+tY,Y]=[X,Y].
\end{align*}
Thus
\begin{align*}
\frac{1}{12}[X+tY,[X+tY,Y]]=\frac{1}{12}[X+tY,[X,Y]].
\end{align*}
Expanding again by bilinearity gives
\begin{align*}
\frac{1}{12}[X+tY,[X,Y]]=\frac{1}{12}[X,[X,Y]]+\frac{t}{12}[Y,[X,Y]].
\end{align*}
Adding the two degree-$3$ contributions, the coefficients of $[Y,[X,Y]]$ combine as
\begin{align*}
-\frac{t}{4}+\frac{t}{12}=-\frac{t}{6}.
\end{align*}
Therefore
\begin{align*}
Z_3'(t)=\frac{1}{12}[X,[X,Y]]-\frac{t}{6}[Y,[X,Y]].
\end{align*}
[/guided]
[/step]
[step:Integrate the homogeneous terms and evaluate at $t=1$]
Let $\mathcal L^1$ denote the one-dimensional [Lebesgue measure](/page/Lebesgue%20Measure) on $[0,1]$. Since $Z_3(0)=0$, integrating the formula for $Z_3'(t)$ over $[0,1]$ with respect to $\mathcal L^1$ gives
\begin{align*}
Z_3(1)=\int_0^1 \frac{1}{12}[X,[X,Y]]\,d\mathcal L^1(t)-\int_0^1 \frac{t}{6}[Y,[X,Y]]\,d\mathcal L^1(t).
\end{align*}
The bracket expressions are independent of $t$, and $\int_0^1 1\,d\mathcal L^1(t)=1$ while $\int_0^1 t\,d\mathcal L^1(t)=\frac{1}{2}$, so
\begin{align*}
Z_3(1)=\frac{1}{12}[X,[X,Y]]-\frac{1}{12}[Y,[X,Y]].
\end{align*}
Also $Z_2(1)=\frac{1}{2}[X,Y]$. Substituting $t=1$ into the homogeneous expansion of $Z(t)$ gives
\begin{align*}
\operatorname{BCH}(X,Y)=X+Y+\frac{1}{2}[X,Y]+\frac{1}{12}[X,[X,Y]]-\frac{1}{12}[Y,[X,Y]]+O_4(X,Y).
\end{align*}
This is the claimed Baker--Campbell--Hausdorff formula through homogeneous bracket degree $3$.
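As an independent sanity check, the degree-$3$ truncation can be verified exactly in the free associative algebra on two letters, where $[A,B]=AB-BA$. The sketch below is an illustration under stated assumptions rather than part of the argument: polynomials are stored as dictionaries from words to rational coefficients, every word longer than three letters is discarded, and all helper names (`add`, `mul`, `bracket`, `exp`, `log`, `bch_path`, `expected`) are hypothetical. The path formula in `expected` is obtained by integrating the expressions for $Z_2'(t)$ and $Z_3'(t)$ over $[0,t]$.

```python
from collections import defaultdict
from fractions import Fraction

MAX_DEG = 3  # discard every word of length greater than 3


def add(a, b, scale=Fraction(1)):
    """Return a + scale * b, dropping zero coefficients."""
    out = defaultdict(Fraction, a)
    for word, coeff in b.items():
        out[word] += scale * coeff
    return {w: c for w, c in out.items() if c != 0}


def mul(a, b):
    """Truncated product of two noncommutative polynomials."""
    out = defaultdict(Fraction)
    for wa, ca in a.items():
        for wb, cb in b.items():
            if len(wa) + len(wb) <= MAX_DEG:
                out[wa + wb] += ca * cb
    return {w: c for w, c in out.items() if c != 0}


def bracket(a, b):
    """Commutator [a, b] = ab - ba."""
    return add(mul(a, b), mul(b, a), Fraction(-1))


def exp(a):
    """Truncated exponential series; a must have no constant term."""
    result, power, fact = {(): Fraction(1)}, {(): Fraction(1)}, 1
    for k in range(1, MAX_DEG + 1):
        power = mul(power, a)
        fact *= k
        result = add(result, power, Fraction(1, fact))
    return result


def log(g):
    """Truncated series log(1 + n) for n = g - 1 without constant term."""
    n = add(g, {(): Fraction(1)}, Fraction(-1))
    result, power = {}, {(): Fraction(1)}
    for k in range(1, MAX_DEG + 1):
        power = mul(power, n)
        result = add(result, power, Fraction((-1) ** (k + 1), k))
    return result


X = {("X",): Fraction(1)}
Y = {("Y",): Fraction(1)}


def bch_path(t):
    """Z(t) = log(exp(X) exp(tY)) through degree 3."""
    return log(mul(exp(X), exp(add({}, Y, t))))


def expected(t):
    """X + tY + (t/2)[X,Y] + (t/12)[X,[X,Y]] - (t^2/12)[Y,[X,Y]]."""
    xy = bracket(X, Y)
    z = add(X, Y, t)
    z = add(z, xy, t / 2)
    z = add(z, bracket(X, xy), t / 12)
    return add(z, bracket(Y, xy), -t * t / 12)


samples = [Fraction(0), Fraction(1, 3), Fraction(1, 2), Fraction(1)]
print(all(bch_path(t) == expected(t) for t in samples))
# prints True
```

At $t=1$ the comparison is exactly the displayed formula. The check only confirms the coefficients through degree $3$; it says nothing about convergence or about the remainder $O_4(X,Y)$.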
[/step]