Differential of the Matrix Exponential — Statement & Proof

Differential of the Matrix Exponential (Theorem # 8834)

Theorem

Edit Issues Pull Requests Attributions Admin

Discussion

Proof

[proofplan] We differentiate the curve $s\mapsto \exp(X+sY)$ in the ambient [matrix space](/page/Matrix%20Space) and use the standard integral formula for the derivative of the matrix exponential. Let $\mathcal L^1$ denote [Lebesgue measure](/page/Lebesgue%20Measure) on $\mathbb R$. Right translation by $\exp(-X)$ moves the resulting tangent vector at $\exp X$ back to the identity, where the integrand becomes the conjugation expression $\exp(tX)Y\exp(-tX)$. We then identify this conjugation expression with $\exp(t\operatorname{ad}_X)(Y)$ and integrate the uniformly convergent operator exponential series term by term. Finally, right translation by $\exp X$ gives the displayed formula for $(d\exp)_X(Y)$ itself. [/proofplan] [step:Differentiate the exponential curve by the matrix variation formula] Let $\alpha:\mathbb R\to G$ be the smooth curve defined by $\alpha(s)=\exp(X+sY)$. This is a curve in $G$ because $X+sY\in\mathfrak g$ for every $s\in\mathbb R$ by [citetheorem:8786], and the exponential of an element of the [Lie algebra](/page/Lie%20Algebra) of a matrix Lie group lies in the group. By definition of the differential of the exponential map, \begin{align*} (d\exp)_X(Y)=\alpha'(0). \end{align*} We prove the finite-dimensional matrix exponential variation formula needed here. For matrices $A,B\in M(n,\mathbb C)$, the series defining $\exp(A+sB)$ and its termwise derivative with respect to $s$ converge locally uniformly in $s$ by the standard Weierstrass theorem for [power series](/page/Power%20Series) with values in a finite-dimensional normed space. Therefore termwise differentiation is valid and gives \begin{align*} \frac{d}{ds}\bigg|_{s=0}\exp(A+sB)=\sum_{m=1}^{\infty}\frac{1}{m!}\sum_{j=0}^{m-1}A^jBA^{m-1-j}. \end{align*} On the other hand, expanding the integrand and using absolute convergence to integrate term by term gives the same series: \begin{align*} \int_0^1 \exp(tA)B\exp((1-t)A)\,d\mathcal L^1(t)=\sum_{p,q\ge 0}\frac{A^pBA^q}{p!q!}\int_0^1 t^p(1-t)^q\,d\mathcal L^1(t). \end{align*} For nonnegative integers $p$ and $q$, the beta-integral identity gives \begin{align*} \int_0^1 t^p(1-t)^q\,d\mathcal L^1(t)=\frac{p!q!}{(p+q+1)!}. \end{align*} Writing $m=p+q+1$ recovers the preceding derivative series. Applying this formula with $A=X$ and $B=Y$ gives \begin{align*} (d\exp)_X(Y) = \int_0^1 \exp(tX)Y\exp((1-t)X)\,d\mathcal L^1(t). \end{align*} [guided] Define $\alpha:\mathbb R\to G$ by $\alpha(s)=\exp(X+sY)$. Since $X,Y\in\mathfrak g$ and $\mathfrak g$ is a real [vector space](/page/Vector%20Space) by [citetheorem:8786], we have $X+sY\in\mathfrak g$ for every $s\in\mathbb R$; hence $\alpha(s)\in G$ by the defining exponential property of the Lie algebra of a matrix Lie group. The differential of $\exp:\mathfrak g\to G$ at $X$ in the direction $Y$ is the velocity of this curve at $s=0$: \begin{align*} (d\exp)_X(Y)=\alpha'(0). \end{align*} We now compute this velocity. For matrices $A,B\in M(n,\mathbb C)$, the series defining $\exp(A+sB)$ and its termwise derivative with respect to $s$ converge locally uniformly in $s$ by the standard Weierstrass theorem for power series with values in a finite-dimensional normed space. Therefore termwise differentiation is valid and gives \begin{align*} \frac{d}{ds}\bigg|_{s=0}\exp(A+sB)=\sum_{m=1}^{\infty}\frac{1}{m!}\sum_{j=0}^{m-1}A^jBA^{m-1-j}. \end{align*} Expanding $\exp(tA)B\exp((1-t)A)$ into its absolutely convergent double series and integrating over $[0,1]$ with respect to $\mathcal L^1$ gives \begin{align*} \int_0^1 \exp(tA)B\exp((1-t)A)\,d\mathcal L^1(t)=\sum_{p,q\ge 0}\frac{A^pBA^q}{p!q!}\int_0^1 t^p(1-t)^q\,d\mathcal L^1(t). \end{align*} The beta-integral identity gives $\int_0^1 t^p(1-t)^q\,d\mathcal L^1(t)=p!q!/(p+q+1)!$. Setting $m=p+q+1$ shows that this integral equals the derivative series above. Therefore, with $A=X$ and $B=Y$, \begin{align*} (d\exp)_X(Y)=\int_0^1 \exp(tX)Y\exp((1-t)X)\,d\mathcal L^1(t). \end{align*} Right translation by $\exp(-X)$ is the smooth map $R_{\exp(-X)}:G\to G$ given by $A\mapsto A\exp(-X)$. Its differential is right multiplication by $\exp(-X)$ on matrix tangent vectors, so \begin{align*} (dR_{\exp(-X)})_{\exp X}\bigl((d\exp)_X(Y)\bigr)=\int_0^1 \exp(tX)Y\exp((1-t)X)\exp(-X)\,d\mathcal L^1(t). \end{align*} By the exponential law for one matrix from [citetheorem:8778], $\exp((1-t)X)\exp(-X)=\exp(-tX)$. Hence \begin{align*} (dR_{\exp(-X)})_{\exp X}\bigl((d\exp)_X(Y)\bigr)=\int_0^1 \exp(tX)Y\exp(-tX)\,d\mathcal L^1(t). \end{align*} Let $T:\mathfrak g\to\mathfrak g$ be the [linear map](/page/Linear%20Map) $T(Z)=\operatorname{ad}_X(Z)=[X,Z]$. This is well-defined by closure of $\mathfrak g$ under commutators, [citetheorem:8787]. For $t\in[0,1]$, conjugation by $\exp(tX)\in G$ preserves $\mathfrak g$ because it is the differential at the identity of the smooth inner automorphism $A\mapsto \exp(tX)A\exp(-tX)$ of $G$. Thus $C_t:\mathfrak g\to\mathfrak g$, defined by $C_t(Z)=\exp(tX)Z\exp(-tX)$, is a well-defined linear map. Differentiating $C_t(Y)$ gives \begin{align*} \frac{d}{dt}C_t(Y)=X\exp(tX)Y\exp(-tX)-\exp(tX)Y\exp(-tX)X=[X,C_t(Y)]=T(C_t(Y)), \end{align*} with initial value $C_0(Y)=Y$. The unique solution of this finite-dimensional linear equation is \begin{align*} C_t(Y)=\exp(tT)(Y)=\sum_{k=0}^{\infty}\frac{t^kT^k(Y)}{k!}. \end{align*} Therefore \begin{align*} \exp(tX)Y\exp(-tX)=\sum_{k=0}^{\infty}\frac{t^k(\operatorname{ad}_X)^k(Y)}{k!}. \end{align*} Choose a norm $\|\cdot\|_{\mathfrak g}$ on the finite-dimensional vector space $\mathfrak g$. Let $\mathcal L(\mathfrak g)$ denote the vector space of linear maps from $\mathfrak g$ to itself, equipped with the corresponding operator norm $\|\cdot\|_{\mathcal L(\mathfrak g)}$. For $t\in[0,1]$, \begin{align*} \left\|\frac{t^kT^k(Y)}{k!}\right\|_{\mathfrak g}\le \frac{\|T\|_{\mathcal L(\mathfrak g)}^k\|Y\|_{\mathfrak g}}{k!}. \end{align*} The majorising scalar exponential series converges, so the Weierstrass test gives [uniform convergence](/page/Uniform%20Convergence) on $[0,1]$. Termwise integration in the finite-dimensional normed space $\mathfrak g$ is therefore valid, and \begin{align*} \int_0^1 \exp(tX)Y\exp(-tX)\,d\mathcal L^1(t)=\sum_{k=0}^{\infty}\int_0^1 \frac{t^k(\operatorname{ad}_X)^k(Y)}{k!}\,d\mathcal L^1(t). \end{align*} Since $\int_0^1 t^k\,d\mathcal L^1(t)=1/(k+1)$, this becomes \begin{align*} \int_0^1 \exp(tX)Y\exp(-tX)\,d\mathcal L^1(t)=\sum_{k=0}^{\infty}\frac{(\operatorname{ad}_X)^k(Y)}{(k+1)!}. \end{align*} Combining with the right-translated derivative identity gives \begin{align*} (dR_{\exp(-X)})_{\exp X}\bigl((d\exp)_X(Y)\bigr)=\sum_{k=0}^{\infty}\frac{(\operatorname{ad}_X)^k(Y)}{(k+1)!}. \end{align*} Finally, $R_{\exp X}$ and $R_{\exp(-X)}$ are inverse diffeomorphisms of $G$, so their differentials at the corresponding points are inverse linear maps. Applying $(dR_{\exp X})_I$ to both sides yields \begin{align*} (d\exp)_X(Y)=(dR_{\exp X})_I\left(\sum_{k=0}^{\infty}\frac{(\operatorname{ad}_X)^k(Y)}{(k+1)!}\right). \end{align*} This proves both the stated differential formula and its equivalent right-translated form. [/guided] [/step] [step:Right translate the derivative back to the identity] Right translation by $\exp(-X)$ is the smooth map \begin{align*} R_{\exp(-X)}:G&\to G,\qquad A\mapsto A\exp(-X). \end{align*} For a tangent vector represented by a matrix-valued curve, its differential is right multiplication by $\exp(-X)$. Therefore \begin{align*} (dR_{\exp(-X)})_{\exp X}\bigl((d\exp)_X(Y)\bigr) = \left(\int_0^1 \exp(tX)Y\exp((1-t)X)\,d\mathcal L^1(t)\right)\exp(-X). \end{align*} Linearity of right multiplication and of the finite-dimensional integral gives \begin{align*} (dR_{\exp(-X)})_{\exp X}\bigl((d\exp)_X(Y)\bigr) = \int_0^1 \exp(tX)Y\exp((1-t)X)\exp(-X)\,d\mathcal L^1(t). \end{align*} Using the exponential law for the single matrix $X$ from [citetheorem:8778], \begin{align*} \exp((1-t)X)\exp(-X)=\exp(-tX). \end{align*} Thus \begin{align*} (dR_{\exp(-X)})_{\exp X}\bigl((d\exp)_X(Y)\bigr) = \int_0^1 \exp(tX)Y\exp(-tX)\,d\mathcal L^1(t). \end{align*} [/step] [step:Expand conjugation by $\exp(tX)$ as the exponential of $\operatorname{ad}_X$] Define the linear map \begin{align*} T:\mathfrak g&\to\mathfrak g,\qquad T(Z)=\operatorname{ad}_X(Z)=[X,Z]. \end{align*} This map is well-defined because $\mathfrak g$ is closed under commutators by [citetheorem:8787]. For each $t\in[0,1]$, the matrix $\exp(tX)$ lies in $G$, and conjugation by $\exp(tX)$ is the smooth inner automorphism $A\mapsto \exp(tX)A\exp(-tX)$ of $G$. Its differential at $I$ preserves $T_I G=\mathfrak g$, so the map $C_t:\mathfrak g\to\mathfrak g$ defined by $C_t(Z)=\exp(tX)Z\exp(-tX)$ is well-defined. The map $t\mapsto C_t(Y)$ satisfies the linear differential equation \begin{align*} \frac{d}{dt}C_t(Y)=T(C_t(Y)),\qquad C_0(Y)=Y. \end{align*} Indeed, differentiating the two matrix exponential factors gives \begin{align*} \frac{d}{dt}\bigl(\exp(tX)Y\exp(-tX)\bigr) = X\exp(tX)Y\exp(-tX)-\exp(tX)Y\exp(-tX)X, \end{align*} and $X$ commutes with $\exp(tX)$ and with $\exp(-tX)$ by the exponential law for one matrix. Hence this derivative equals $[X,C_t(Y)]$. The unique solution of this finite-dimensional linear equation is \begin{align*} C_t(Y)=\exp(tT)(Y) = \sum_{k=0}^{\infty}\frac{t^kT^k(Y)}{k!}. \end{align*} Therefore, for every $t\in[0,1]$, \begin{align*} \exp(tX)Y\exp(-tX) = \sum_{k=0}^{\infty}\frac{t^k(\operatorname{ad}_X)^k(Y)}{k!}. \end{align*} [/step] [step:Integrate the operator exponential series term by term] Choose any norm $\|\cdot\|_{\mathfrak g}$ on the finite-dimensional vector space $\mathfrak g$. Let $\mathcal L(\mathfrak g)$ denote the vector space of linear maps from $\mathfrak g$ to itself, equipped with the associated operator norm $\|\cdot\|_{\mathcal L(\mathfrak g)}$. We apply this norm to $T=\operatorname{ad}_X\in\mathcal L(\mathfrak g)$. For $t\in[0,1]$, \begin{align*} \left\|\frac{t^kT^k(Y)}{k!}\right\|_{\mathfrak g} \le \frac{\|T\|_{\mathcal L(\mathfrak g)}^k\|Y\|_{\mathfrak g}}{k!}. \end{align*} The scalar series on the right converges, so the Weierstrass test gives uniform convergence of the vector-valued series on $[0,1]$. Hence termwise integration is valid in the finite-dimensional normed space $\mathfrak g$: \begin{align*} \int_0^1 \exp(tX)Y\exp(-tX)\,d\mathcal L^1(t) = \sum_{k=0}^{\infty}\int_0^1 \frac{t^k(\operatorname{ad}_X)^k(Y)}{k!}\,d\mathcal L^1(t). \end{align*} For each $k\ge 0$, \begin{align*} \int_0^1 \frac{t^k(\operatorname{ad}_X)^k(Y)}{k!}\,d\mathcal L^1(t) = \frac{(\operatorname{ad}_X)^k(Y)}{(k+1)!}. \end{align*} Therefore \begin{align*} \int_0^1 \exp(tX)Y\exp(-tX)\,d\mathcal L^1(t) = \sum_{k=0}^{\infty}\frac{(\operatorname{ad}_X)^k(Y)}{(k+1)!}. \end{align*} Each partial sum lies in $\mathfrak g$, and since $\mathfrak g$ is a finite-dimensional subspace of $M(n,\mathbb C)$, it is closed; consequently the series sum also lies in $\mathfrak g$. [/step] [step:Translate the identity form back to the tangent space at $\exp X$] Combining the preceding steps gives \begin{align*} (dR_{\exp(-X)})_{\exp X}\bigl((d\exp)_X(Y)\bigr) = \sum_{k=0}^{\infty}\frac{(\operatorname{ad}_X)^k(Y)}{(k+1)!}. \end{align*} Apply the inverse right translation differential $(dR_{\exp X})_I$ to both sides. Since $R_{\exp X}$ and $R_{\exp(-X)}$ are inverse diffeomorphisms of $G$, their differentials at the corresponding points are inverse linear maps. Thus \begin{align*} (d\exp)_X(Y) = (dR_{\exp X})_I\left(\sum_{k=0}^{\infty}\frac{(\operatorname{ad}_X)^k(Y)}{(k+1)!}\right). \end{align*} This is exactly the asserted formula, and the displayed right-translated identity is the equivalent formulation. [/step]

Prerequisites (0/5 completed)

Prerequisites Graph

Interactive dependency map showing how this theorem builds on foundational concepts

Loading dependency graph...

Theorems

Definitions & Concepts

Explore Further

Lie Algebra Definition Linear Map Definition Power Series Definition Weierstrass Theorem on Stein Manifolds Theorem #3483 Weierstrass Theorem for Proper Lower Semicontinuous Coercive Functions Theorem #6670 Pointwise Hard Lefschetz Theorem for Hermitian Vector Spaces Geometry Smooth Transition Maps for Translated Logarithm Charts Geometry Curvature Criterion for Positivity of a Holomorphic Line Bundle Geometry Continuous Maps Preserve Connected Components Topology Ambrose-Singer Theorem Geometry Gauge Invariance of the Yang-Mills Curvature Density Geometry Infinite Cofinite Spaces Are Hyperconnected Topology Functoriality of Associated Bundles under Principal Bundle Morphisms Geometry Geometry Area

What brings you to Androma?

Start with a route through the knowledge graph.

Differential of the Matrix Exponential (Theorem # 8834)

Discussion

Proof

Prerequisites (0/5 completed)

Prerequisites Graph

Explore Further

Sign in to Androma

Check your inbox

One last step

Differential of the Matrix Exponential (Theorem # 8834)

Discussion

Proof

Prerequisites (0/5 completed)

Prerequisites Graph

Explore Further