Lie Groups I: Foundations

Also known as: Lie groups, Lie group foundations, Introduction to Lie groups, Matrix Lie groups, Smooth groups

These notes introduce Lie groups through matrices. The first goal is concrete: understand when familiar families of invertible matrices form smooth groups, and learn how curves through the identity turn nonlinear matrix equations into linear infinitesimal equations. A second goal is structural. Once the infinitesimal object has been built, the notes ask how much local group multiplication, global topology, and representation theory it controls. The advanced tools needed for those questions are introduced only when they become necessary in the later chapters. The course therefore moves from examples and calculations toward general structure. Matrix groups come first, Lie algebras and local formulas come next, and the final chapters explain how topology, representations, and integration enter the same story. # Introduction Lie groups are spaces where algebra and geometry meet: they are groups whose elements can be varied continuously and differentiably. This course begins with matrix examples, because matrix multiplication, determinants, eigenvalues, and invariant forms give concrete models in which the main ideas can be computed before the abstract language is introduced. The guiding problem is to understand how much of a group near the identity is controlled by its tangent space there, and how local infinitesimal data determines global structure. The foundations developed here begin with closed subgroups of general linear groups, then pass to Lie algebras, the exponential map, local multiplication formulas, and covering groups. Differential geometry is used as the language of tangent spaces and smooth maps, but the first part of the course keeps the geometry tied to matrices. The main examples are rotations, unitary matrices, determinant-one matrices, symplectic matrices, and nilpotent upper-triangular groups such as the Heisenberg group. ## The Central Questions The first question is why a group should carry a smooth structure at all. 
Finite groups have no interesting local geometry, while groups such as rotations, invertible matrices, and unitary matrices vary in continuous families. A Lie group is designed to capture precisely this compatibility between group operations and smooth geometry. [definition: Lie Group] A Lie group is a smooth manifold $G$ equipped with a group structure such that the multiplication map $m:G\times G\to G$, $m(g,h)=gh$, and the inversion map $i:G\to G$, $i(g)=g^{-1}$, are smooth. [/definition] This definition is the abstract endpoint of the course's first movement. The next problem is how to recognise Lie groups inside familiar linear algebra without first constructing abstract atlases, so we initially work with subgroups of matrix groups. ## Matrix Groups as First Examples We write $M(n,\mathbb C)$ for the [vector space](/page/Vector%20Space) of complex $n\times n$ matrices and $GL(n,\mathbb C)$ for the group of invertible complex $n\times n$ matrices. Real matrix groups are viewed inside this complex ambient group when convenient. Matrix Lie groups provide the concrete class of Lie groups studied in the first chapters: they are matrix subgroups whose topology is controlled well enough to support smooth structure. The closedness requirement is the key extra condition, because a subgroup of invertible matrices can otherwise be algebraically valid while behaving topologically like a dense curve rather than like a manifold of the expected dimension. [definition: Matrix Lie Group] A matrix Lie group is a subgroup $G\le GL(n,\mathbb C)$ that is closed as a subset of $GL(n,\mathbb C)$ in the topology inherited from $M(n,\mathbb C)$. [/definition] The closedness hypothesis is not cosmetic: without it, a subgroup can wind densely through a torus and fail to have the dimension suggested by its parametrisation. The next example records the basic pathology that the closedness condition excludes. 
Here $S^1=\{z\in\mathbb C:|z|=1\}$ is the unit circle, $U(1)$ is the same group viewed as unitary $1\times1$ matrices, and $T^2=S^1\times S^1$ is the two-torus. Throughout the example, $\alpha$ denotes a fixed irrational real number. [example: Irrational Winding In A Torus] [claim]For irrational $\alpha$, the subgroup $H=\{(e^{it},e^{i\alpha t}):t\in\mathbb R\}$ is dense in $T^2$ and is not closed.[/claim] [proof]Define $\phi:\mathbb R\to T^2$ by $\phi(t)=(e^{it},e^{i\alpha t})$. For $s,t\in\mathbb R$, \begin{align*} \phi(s+t)=(e^{i(s+t)},e^{i\alpha(s+t)})=(e^{is}e^{it},e^{i\alpha s}e^{i\alpha t})=\phi(s)\phi(t). \end{align*} Thus $\phi$ is a group homomorphism and $H=\phi(\mathbb R)$ is a subgroup of $T^2$. To prove density, fix a target point $(e^{ia},e^{ib})\in T^2$. Points of $H$ with first coordinate exactly $e^{ia}$ are obtained by taking $t=a+2\pi n$ with $n\in\mathbb Z$, because \begin{align*} e^{i(a+2\pi n)}=e^{ia}e^{2\pi i n}=e^{ia}. \end{align*} For such $t$, the second coordinate is \begin{align*} e^{i\alpha(a+2\pi n)}=e^{i\alpha a}e^{2\pi i\alpha n}. \end{align*} So it remains to show that the set $\{e^{2\pi i\alpha n}:n\in\mathbb Z\}$ is dense in $S^1$. We verify this density by the pigeonhole principle. Since $\alpha$ is irrational, the fractional parts of $0,\alpha,2\alpha,\ldots,N\alpha$ are distinct. Divide $[0,1)$ into $N$ intervals of length $1/N$. Two of these $N+1$ fractional parts lie in the same interval, say those of $j\alpha$ and $k\alpha$ with $0\le j<k\le N$. Setting $q=k-j$, so that $1\le q\le N$, there is an integer $p$ such that \begin{align*} 0<|q\alpha-p|<1/N. \end{align*} Write $\varepsilon=q\alpha-p$. Then $e^{2\pi i q\alpha}=e^{2\pi i\varepsilon}$, and the step size $|\varepsilon|$ can be made arbitrarily small by choosing $N$ large. The multiples $0,\varepsilon,2\varepsilon,\ldots,m\varepsilon$ with $m=\lceil 1/|\varepsilon|\rceil$ advance around the circle $\mathbb R/\mathbb Z$ in steps of length $|\varepsilon|$ and complete at least one full turn, so every point of the circle lies within $|\varepsilon|$ of one of them. Hence the values $e^{2\pi i kq\alpha}$ come arbitrarily close to any prescribed point of $S^1$, and $\{e^{2\pi i\alpha n}:n\in\mathbb Z\}$ is dense in $S^1$. 
Therefore, for every $(e^{ia},e^{ib})\in T^2$ and every neighbourhood of it, one can choose $n$ with $e^{2\pi i\alpha n}$ close enough to $e^{i(b-\alpha a)}$ that \begin{align*} (e^{i(a+2\pi n)},e^{i\alpha(a+2\pi n)})=(e^{ia},e^{i\alpha a}e^{2\pi i\alpha n}) \end{align*} lies in that neighbourhood. Thus $H$ is dense in $T^2$. However, $H\ne T^2$: for example, the point $(1,-1)$ is not in $H$. Indeed, $e^{it}=1$ forces $t=2\pi m$ with $m\in\mathbb Z$, and then $e^{i\alpha t}=e^{2\pi i\alpha m}$ equals $-1$ only if $\alpha m-\tfrac12\in\mathbb Z$, which fails for $m=0$ and would make $\alpha$ rational for $m\ne 0$. A closed dense subset of $T^2$ is all of $T^2$, so the dense proper subgroup $H$ is not closed.[/proof] This is the basic obstruction excluded by the closedness condition in the definition of a matrix Lie group: a subgroup can come from a smooth one-parameter parametrisation and still have the wrong [subspace topology](/page/Subspace%20Topology) to be an embedded Lie subgroup. [/example] ## Matrices Before Manifolds The next problem is how to build enough examples before the full machinery of manifolds is available. Matrix groups are ideal because they sit inside $M(n,\mathbb F)$, where $\mathbb F$ is either $\mathbb R$ or $\mathbb C$, and smoothness can be checked by ordinary calculus on coordinate entries. [quotetheorem:8769] [citeproof:8769] This theorem supplies the ambient object for the first lectures, and its hypotheses explain why $GL(n,\mathbb F)$ is the correct matrix setting: invertibility is an open condition, so tangent vectors can be studied using the surrounding vector space $M(n,\mathbb F)$. The theorem does not say that every subgroup of $GL(n,\mathbb F)$ is automatically a Lie group with the subspace structure; the earlier dense winding example shows why a closedness condition is needed. The next problem is to recognise important closed subgroups of $GL(n,\mathbb F)$ by equations, since the classical groups arise by preserving determinant, [inner product](/page/Inner%20Product), Hermitian form, or symplectic form. 
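The pigeonhole step in the irrational winding example above can be sketched numerically. The following is only an illustration under stated assumptions: the function names `small_rotation_step` and `approximate_on_torus` are hypothetical, and the floating-point value of $\sqrt 2$ stands in for an irrational $\alpha$.

```python
import cmath
import math


def small_rotation_step(alpha, n_boxes):
    """Pigeonhole search: return (q, p) with 1 <= q <= n_boxes and |q*alpha - p| < 1/n_boxes."""
    seen = {}  # box index -> multiplier k whose fractional part landed in that box
    for k in range(n_boxes + 1):
        frac = (k * alpha) % 1.0
        box = min(int(frac * n_boxes), n_boxes - 1)
        if box in seen:
            q = k - seen[box]
            return q, round(q * alpha)
        seen[box] = k
    raise RuntimeError("unreachable: n_boxes + 1 points in n_boxes boxes")


def approximate_on_torus(alpha, a, b, n_boxes):
    """Search t = a + 2*pi*n so that e^{i*alpha*t} is close to e^{ib}; e^{it} = e^{ia} exactly."""
    q, p = small_rotation_step(alpha, n_boxes)
    eps = q * alpha - p                     # signed step on R/Z, |eps| < 1/n_boxes
    steps = math.ceil(1.0 / abs(eps))       # enough multiples of eps for one full turn
    target = cmath.exp(1j * b)
    best_n, best_err = 0, float("inf")
    for k in range(steps + 1):
        n = k * q
        second = cmath.exp(1j * alpha * (a + 2 * math.pi * n))
        err = abs(second - target)
        if err < best_err:
            best_n, best_err = n, err
    return best_n, best_err


alpha = math.sqrt(2)                        # assumption: float stand-in for an irrational number
n, err = approximate_on_torus(alpha, a=0.3, b=2.0, n_boxes=200)
print(n, err)
```

Increasing `n_boxes` shrinks the guaranteed step size, so the reported error can be pushed below any chosen tolerance, which is the numerical shadow of density.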
[example: Orthogonal Matrices] The [orthogonal group](/page/Orthogonal%20Group) is \begin{align*} O(n)=\{A\in GL(n,\mathbb R):A^\top A=I\}. \end{align*} Write the columns of $A$ as $v_1,\ldots,v_n\in\mathbb R^n$. The $(i,j)$-entry of $A^\top A$ is $v_i\cdot v_j$, so the equation $A^\top A=I$ is exactly the collection of equations \begin{align*} v_i\cdot v_j=\delta_{ij}. \end{align*} Thus the columns are pairwise orthogonal unit vectors, hence an [orthonormal basis](/page/Orthonormal%20Basis) of $\mathbb R^n$. Taking determinants in the defining equation gives \begin{align*} \det(A^\top A)=\det(I). \end{align*} Using $\det(A^\top)=\det(A)$ and $\det(AB)=\det(A)\det(B)$, this becomes \begin{align*} \det(A)^2=1. \end{align*} Therefore every [orthogonal matrix](/page/Orthogonal%20Matrix) has determinant $1$ or $-1$. The determinant map is continuous, and its values on $O(n)$ lie in the discrete set $\{1,-1\}$, so a continuous path in $O(n)$ cannot move from determinant $1$ to determinant $-1$. Conversely, $SO(n)=\{A\in O(n):\det A=1\}$ is path-connected: using rotations in coordinate planes, one can successively rotate the first column to $e_1$, then the second column inside the span of $e_2,\ldots,e_n$ to $e_2$, and so on, until the matrix is reduced to $I$ through matrices of determinant $1$. The determinant $-1$ part is obtained from $SO(n)$ by multiplying by the fixed reflection $\operatorname{diag}(-1,1,\ldots,1)$, so it is path-connected for the same reason. Hence $O(n)$ has exactly two connected components, distinguished by determinant, and the identity component is $SO(n)$. [/example] This example shows that a geometric invariance condition can be written as a matrix equation. The next problem is how to extract the linear approximation to such an equation at the identity, because that linear approximation is the object on which calculations become manageable. [definition: Matrix Lie Algebra] Let $G\le GL(n,\mathbb C)$ be a matrix Lie group. 
Its matrix [Lie algebra](/page/Lie%20Algebra) is \begin{align*} \mathfrak g=\{X\in M(n,\mathbb C):\exp(tX)\in G\text{ for all }t\in\mathbb R\}. \end{align*} [/definition] The exponential condition encodes tangent vectors through one-parameter subgroups. The next example shows how the definition turns a nonlinear constraint on matrices into a linear equation on infinitesimal matrices. [example: The Lie Algebra Of The Orthogonal Group] Let $X\in M(n,\mathbb R)$. By the definition of the matrix Lie algebra, $X$ belongs to the Lie algebra of $O(n)$ exactly when $\exp(tX)\in O(n)$ for every $t\in\mathbb R$, which means \begin{align*} \exp(tX)^\top\exp(tX)=I. \end{align*} Set $E(t)=\exp(tX)$. From the [power series](/page/Power%20Series) definition of the exponential, $E(0)=I$ and $E'(0)=X$. Differentiating the identity $E(t)^\top E(t)=I$ at $t=0$ gives, by the product rule, \begin{align*} 0=\frac{d}{dt}\bigg|_{t=0}(E(t)^\top E(t))=E'(0)^\top E(0)+E(0)^\top E'(0). \end{align*} Substituting $E(0)=I$ and $E'(0)=X$ gives \begin{align*} 0=X^\top I+I^\top X=X^\top+X. \end{align*} Thus every tangent matrix in the Lie algebra is skew-symmetric. Conversely, suppose $X^\top+X=0$, so $X^\top=-X$. Taking transposes term-by-term in the exponential series gives \begin{align*} \exp(tX)^\top=\sum_{k=0}^{\infty}\frac{((tX)^k)^\top}{k!}=\sum_{k=0}^{\infty}\frac{(tX^\top)^k}{k!}=\exp(tX^\top)=\exp(-tX). \end{align*} Since $tX$ and $-tX$ commute, the exponential product identity gives \begin{align*} \exp(tX)^\top\exp(tX)=\exp(-tX)\exp(tX)=\exp(0)=I. \end{align*} Therefore $\exp(tX)\in O(n)$ for every $t$, and the Lie algebra is exactly \begin{align*} \mathfrak{so}(n)=\{X\in M(n,\mathbb R):X^\top+X=0\}. \end{align*} The nonlinear orthogonality equation $A^\top A=I$ has therefore become the linear infinitesimal equation $X^\top+X=0$, which is the same pattern used for the unitary, special linear, and symplectic groups. 
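The converse direction of this example can be checked numerically. The sketch below, using a truncated exponential series and hypothetical helper names (`mat_exp`, `orthogonality_defect`, `scaled`), measures how far $\exp(tX)^\top\exp(tX)$ is from $I$ for one skew-symmetric matrix and one matrix that is not skew-symmetric. The two sample matrices are arbitrary choices, not taken from the notes.

```python
def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(a):
    return [list(col) for col in zip(*a)]


def mat_exp(x, terms=40):
    """Truncated exponential series: sum of x^k / k! for k < terms."""
    n = len(x)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]       # current term x^k / k!, starting at k = 0
    for k in range(1, terms):
        term = [[entry / k for entry in row] for row in mat_mul(term, x)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result


def orthogonality_defect(a):
    """Largest absolute entry of A^T A - I."""
    n = len(a)
    ata = mat_mul(transpose(a), a)
    return max(abs(ata[i][j] - (1.0 if i == j else 0.0))
               for i in range(n) for j in range(n))


def scaled(t, x):
    return [[t * entry for entry in row] for row in x]


skew = [[0.0, -0.7, 0.2], [0.7, 0.0, -1.1], [-0.2, 1.1, 0.0]]    # X^T + X = 0
not_skew = [[0.0, 1.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]   # X^T + X != 0

for t in (0.5, 1.0, 2.0):
    print(t, orthogonality_defect(mat_exp(scaled(t, skew))))
print(orthogonality_defect(mat_exp(not_skew)))
```

For `skew` the defect stays at rounding level for each $t$. For `not_skew` the series terminates at $I+E_{12}$, which is not orthogonal, so the defect is of order one.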
[/example] ## The Exponential Map As The Bridge Once the Lie algebra has appeared, the main question becomes how to return from infinitesimal data to group elements. For matrices, the answer begins with the convergent power series defining the exponential. [definition: Matrix Exponential] For $\mathbb F\in\{\mathbb R,\mathbb C\}$, the matrix exponential is the map \begin{align*} \exp:M(n,\mathbb F)&\to GL(n,\mathbb F), & X&\mapsto \sum_{k=0}^{\infty}\frac{X^k}{k!}. \end{align*} [/definition] The matrix exponential is not merely a convenient function; it is the mechanism by which tangent vectors produce curves in the group. The next problem is to verify that these curves respect the group law, since this is what makes infinitesimal directions into genuine one-parameter symmetries. [quotetheorem:8770] [citeproof:8770] This result is the first instance of a larger theme: Lie algebras linearise nonlinear group questions by replacing multiplication near the identity with operations on matrices. The commutativity of $sX$ and $tX$ is essential here; the identity $\exp(A+B)=\exp(A)\exp(B)$ fails in general when $A$ and $B$ do not commute, and this failure is the reason the Baker--Campbell--Hausdorff formula later has correction terms. The theorem also has a global limitation: it produces a homomorphism from $\mathbb R$, but it does not make the exponential map injective or surjective for an arbitrary Lie group. The next example shows this limitation in the smallest rotation group. [example: Rotations Of The Plane] Let $J\in M(2,\mathbb R)$ be defined by $J_{11}=0$, $J_{12}=-1$, $J_{21}=1$, and $J_{22}=0$. Its square has entries $(J^2)_{11}=0\cdot 0+(-1)\cdot 1=-1$, $(J^2)_{12}=0\cdot(-1)+(-1)\cdot 0=0$, $(J^2)_{21}=1\cdot 0+0\cdot 1=0$, and $(J^2)_{22}=1\cdot(-1)+0\cdot 0=-1$, so $J^2=-I$. Hence, by induction, $J^{2m}=(-1)^mI$ and $J^{2m+1}=(-1)^mJ$ for every $m\ge 0$. 
Using the power series definition of the matrix exponential, split the series into even and odd powers: \begin{align*}\exp(tJ)=\sum_{m=0}^{\infty}\frac{t^{2m}J^{2m}}{(2m)!}+\sum_{m=0}^{\infty}\frac{t^{2m+1}J^{2m+1}}{(2m+1)!}.\end{align*} Substituting the formulas for the powers of $J$ gives \begin{align*}\exp(tJ)=\left(\sum_{m=0}^{\infty}\frac{(-1)^m t^{2m}}{(2m)!}\right)I+\left(\sum_{m=0}^{\infty}\frac{(-1)^m t^{2m+1}}{(2m+1)!}\right)J.\end{align*} By the Taylor series for sine and cosine, \begin{align*}\exp(tJ)=\cos(t)I+\sin(t)J.\end{align*} Since $J_{11}=0$, $J_{12}=-1$, $J_{21}=1$, and $J_{22}=0$, this means that $\exp(tJ)$ has first row $(\cos t,-\sin t)$ and second row $(\sin t,\cos t)$. Let $R(t)=\exp(tJ)$. To verify that $R(t)\in SO(2)$, compute the entries of $R(t)^\top R(t)$. The $(1,1)$-entry is $\cos^2t+\sin^2t=1$, the $(1,2)$-entry is $-\cos t\sin t+\sin t\cos t=0$, the $(2,1)$-entry is $-\sin t\cos t+\cos t\sin t=0$, and the $(2,2)$-entry is $\sin^2t+\cos^2t=1$. Thus $R(t)^\top R(t)=I$. Also \begin{align*}\det R(t)=\cos t\cos t-(-\sin t)\sin t=\cos^2t+\sin^2t=1,\end{align*} so $R(t)\in SO(2)$. For $x=(x_1,x_2)\in\mathbb R^2$, multiplication by $R(t)$ gives \begin{align*}R(t)x=(x_1\cos t-x_2\sin t,\ x_1\sin t+x_2\cos t),\end{align*} which is the usual rotation of the plane through angle $t$. The matrices also multiply according to angle addition: the first row of $R(s)R(t)$ is $(\cos s\cos t-\sin s\sin t,\ -\cos s\sin t-\sin s\cos t)$, and the second row is $(\sin s\cos t+\cos s\sin t,\ -\sin s\sin t+\cos s\cos t)$. Using the angle addition formulas, these rows are $(\cos(s+t),-\sin(s+t))$ and $(\sin(s+t),\cos(s+t))$, so $R(s)R(t)=R(s+t)$. Therefore $t\mapsto \exp(tJ)$ is a one-parameter subgroup of $SO(2)$. 
Finally, \begin{align*}\exp((t+2\pi)J)=\cos(t+2\pi)I+\sin(t+2\pi)J=\cos(t)I+\sin(t)J=\exp(tJ).\end{align*} In particular, $\exp(0\cdot J)=\exp(2\pi J)=I$ while $0\cdot J\ne 2\pi J$, so the exponential map $\mathfrak{so}(2)\to SO(2)$ is periodic rather than injective. [/example] ## Local Structure And Global Structure The final organising problem is how local information near the identity spreads through the whole group. Multiplication by a fixed group element transports neighbourhoods of the identity to neighbourhoods elsewhere, so much of the manifold structure is controlled by the identity component. Global questions then ask how many components there are and how covering spaces change the group without changing its local Lie algebra. [definition: Identity Component] Let $G$ be a topological group. The identity component $G_0$ is the connected component of $G$ containing the identity element $e$. [/definition] The identity component is the part seen by curves starting at $e$. The next problem is to justify treating it as a group in its own right and as an invariant normal part of $G$. [quotetheorem:8771] [citeproof:8771] This theorem explains why disconnected examples such as $O(n)$ can be studied by first studying $SO(n)$ and then adding a finite component group. The [topological group structure](/theorems/2505) is essential: for an arbitrary [topological space](/page/Topological%20Space) there is no multiplication to transport the component of a point, and for a disconnected group the theorem does not say that the whole group is controlled by $G_0$. It also does not imply that the quotient $G/G_0$ is finite; Lie groups can have infinitely many connected components. The result instead isolates the connected normal part that the Lie algebra detects, and this prepares for covering groups, where the same Lie algebra can correspond to different global groups. 
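Returning to the plane rotation example above, the identities $\exp(tJ)=\cos(t)I+\sin(t)J$, $R(s)R(t)=R(s+t)$, and $\exp(2\pi J)=I$ can be sketched in a few lines. The helper names `mul2`, `exp2`, `rotation`, `scaled_J`, and `max_diff` are hypothetical, and the exponential is a truncated series rather than an exact value.

```python
import math


def mul2(a, b):
    """Product of two 2 x 2 matrices."""
    return [[a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]],
            [a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]]]


def exp2(x, terms=60):
    """Truncated power series for the exponential of a 2 x 2 matrix."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]         # current term x^k / k!
    for k in range(1, terms):
        prod = mul2(term, x)
        term = [[prod[i][j] / k for j in range(2)] for i in range(2)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result


def rotation(t):
    """Closed form cos(t) I + sin(t) J."""
    return [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]


def max_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


J = [[0.0, -1.0], [1.0, 0.0]]


def scaled_J(t):
    return [[t * entry for entry in row] for row in J]


# Series versus closed form.
print(max_diff(exp2(scaled_J(0.3)), rotation(0.3)))
# One-parameter subgroup law exp(sJ) exp(tJ) = exp((s + t)J).
print(max_diff(mul2(exp2(scaled_J(0.4)), exp2(scaled_J(1.1))), exp2(scaled_J(1.5))))
# Failure of injectivity: 2*pi*J is nonzero but exponentiates to the identity.
print(max_diff(exp2(scaled_J(2 * math.pi)), [[1.0, 0.0], [0.0, 1.0]]))
```

All three printed differences should be at the level of floating-point rounding, while `scaled_J(2 * math.pi)` is visibly not the zero matrix.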
[example: Same Lie Algebra, Different Global Groups] [claim]The Lie algebras of $SU(2)$ and $SO(3)$ are isomorphic, but the Lie groups $SU(2)$ and $SO(3)$ are not isomorphic.[/claim] [proof]Let $E_{ij}$ denote the matrix with $1$ in the $(i,j)$-entry and $0$ elsewhere. The multiplication rule is $E_{ij}E_{kl}=\delta_{jk}E_{il}$. The Lie algebra of $SU(2)$ is \begin{align*}\mathfrak{su}(2)=\{X\in M(2,\mathbb C):X^*+X=0,\ \operatorname{tr}(X)=0\}.\end{align*} Use the basis \begin{align*}A_1=\frac{i}{2}(E_{12}+E_{21}),\quad A_2=\frac{1}{2}(E_{12}-E_{21}),\quad A_3=\frac{i}{2}(E_{11}-E_{22}).\end{align*} For the first bracket, \begin{align*}A_1A_2=\frac{i}{4}(E_{12}+E_{21})(E_{12}-E_{21})=\frac{i}{4}(-E_{11}+E_{22}).\end{align*} Also, \begin{align*}A_2A_1=\frac{i}{4}(E_{12}-E_{21})(E_{12}+E_{21})=\frac{i}{4}(E_{11}-E_{22}).\end{align*} Therefore \begin{align*}[A_1,A_2]=A_1A_2-A_2A_1=\frac{i}{2}(-E_{11}+E_{22})=-A_3.\end{align*} For the second bracket, \begin{align*}A_2A_3=\frac{i}{4}(E_{12}-E_{21})(E_{11}-E_{22})=\frac{i}{4}(-E_{12}-E_{21}).\end{align*} Also, \begin{align*}A_3A_2=\frac{i}{4}(E_{11}-E_{22})(E_{12}-E_{21})=\frac{i}{4}(E_{12}+E_{21}).\end{align*} Hence \begin{align*}[A_2,A_3]=A_2A_3-A_3A_2=-\frac{i}{2}(E_{12}+E_{21})=-A_1.\end{align*} For the third bracket, \begin{align*}A_3A_1=-\frac{1}{4}(E_{11}-E_{22})(E_{12}+E_{21})=-\frac{1}{4}(E_{12}-E_{21}).\end{align*} Also, \begin{align*}A_1A_3=-\frac{1}{4}(E_{12}+E_{21})(E_{11}-E_{22})=\frac{1}{4}(E_{12}-E_{21}).\end{align*} Thus \begin{align*}[A_3,A_1]=A_3A_1-A_1A_3=-\frac{1}{2}(E_{12}-E_{21})=-A_2.\end{align*} The Lie algebra of $SO(3)$ is \begin{align*}\mathfrak{so}(3)=\{X\in M(3,\mathbb R):X^\top+X=0\}.\end{align*} Use the basis \begin{align*}B_1=E_{32}-E_{23},\quad B_2=E_{13}-E_{31},\quad B_3=E_{21}-E_{12}.\end{align*} Using $E_{ij}E_{kl}=\delta_{jk}E_{il}$, only one of the four products in each expansion is nonzero: \begin{align*}B_1B_2=(E_{32}-E_{23})(E_{13}-E_{31})=E_{23}E_{31}=E_{21}.\end{align*} Also, \begin{align*}B_2B_1=(E_{13}-E_{31})(E_{32}-E_{23})=E_{13}E_{32}=E_{12}.\end{align*} 
Therefore \begin{align*}[B_1,B_2]=B_1B_2-B_2B_1=E_{21}-E_{12}=B_3.\end{align*} Similarly, \begin{align*}B_2B_3=(E_{13}-E_{31})(E_{21}-E_{12})=E_{31}E_{12}=E_{32}.\end{align*} And \begin{align*}B_3B_2=(E_{21}-E_{12})(E_{13}-E_{31})=E_{21}E_{13}=E_{23}.\end{align*} Hence \begin{align*}[B_2,B_3]=B_2B_3-B_3B_2=E_{32}-E_{23}=B_1.\end{align*} Finally, \begin{align*}B_3B_1=(E_{21}-E_{12})(E_{32}-E_{23})=E_{12}E_{23}=E_{13}.\end{align*} And \begin{align*}B_1B_3=(E_{32}-E_{23})(E_{21}-E_{12})=E_{32}E_{21}=E_{31}.\end{align*} Thus \begin{align*}[B_3,B_1]=B_3B_1-B_1B_3=E_{13}-E_{31}=B_2.\end{align*} The brackets of the $B_i$ carry the opposite sign to those of the $A_i$, so the correct identification includes a sign. Define $\Phi:\mathfrak{su}(2)\to\mathfrak{so}(3)$ by \begin{align*}\Phi(A_1)=-B_1,\quad \Phi(A_2)=-B_2,\quad \Phi(A_3)=-B_3.\end{align*} Since $\{A_1,A_2,A_3\}$ and $\{-B_1,-B_2,-B_3\}$ are bases, $\Phi$ is a vector space isomorphism. The bracket computations give \begin{align*}\Phi([A_1,A_2])=\Phi(-A_3)=B_3=[B_1,B_2]=[-B_1,-B_2]=[\Phi(A_1),\Phi(A_2)].\end{align*} They also give \begin{align*}\Phi([A_2,A_3])=\Phi(-A_1)=B_1=[B_2,B_3]=[-B_2,-B_3]=[\Phi(A_2),\Phi(A_3)].\end{align*} And \begin{align*}\Phi([A_3,A_1])=\Phi(-A_2)=B_2=[B_3,B_1]=[-B_3,-B_1]=[\Phi(A_3),\Phi(A_1)].\end{align*} By bilinearity and antisymmetry of the commutator bracket, $\Phi([X,Y])=[\Phi(X),\Phi(Y)]$ for all $X,Y\in\mathfrak{su}(2)$, so $\mathfrak{su}(2)\cong\mathfrak{so}(3)$ as Lie algebras. The groups themselves are not isomorphic. Identifying $SU(2)$ with the unit quaternions, conjugation on the three-dimensional space of imaginary quaternions defines a homomorphism $SU(2)\to SO(3)$. Its kernel is $\{\pm I\}$, because the only unit quaternions that commute with every imaginary quaternion are the real ones, $\pm 1$. This homomorphism is surjective, so $SU(2)$ is a double cover of $SO(3)$. Since $SU(2)$ is simply connected while $SO(3)$, having this nontrivial connected double cover, is not, no Lie group isomorphism $SU(2)\cong SO(3)$ can exist.[/proof] Thus the Lie algebra detects the local infinitesimal rotation structure, but it does not by itself determine the global topology of the Lie group. 
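Bracket computations of this kind are easy to get wrong by a sign, so a mechanical check can help. The sketch below rebuilds $A_1,A_2,A_3$ from the elementary matrices $E_{ij}$ and evaluates their commutators. The helper names `E`, `combo`, `mul`, `bracket`, `close`, and `in_su2` are hypothetical and chosen only for this illustration.

```python
def E(i, j, n=2):
    """Elementary n x n matrix with 1 in entry (i, j), using 1-based indices."""
    return [[1 if (r, c) == (i - 1, j - 1) else 0 for c in range(n)] for r in range(n)]


def combo(c1, m1, c2, m2):
    """Entrywise linear combination c1*m1 + c2*m2."""
    n = len(m1)
    return [[c1 * m1[r][c] + c2 * m2[r][c] for c in range(n)] for r in range(n)]


def mul(a, b):
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)] for r in range(n)]


def bracket(x, y):
    """Commutator [x, y] = xy - yx."""
    return combo(1, mul(x, y), -1, mul(y, x))


def close(a, b, tol=1e-12):
    n = len(a)
    return all(abs(a[r][c] - b[r][c]) < tol for r in range(n) for c in range(n))


def in_su2(x):
    """Check X* + X = 0 and tr(X) = 0 for a 2 x 2 matrix."""
    anti_hermitian = all(abs(complex(x[c][r]).conjugate() + x[r][c]) < 1e-12
                         for r in range(2) for c in range(2))
    return anti_hermitian and abs(x[0][0] + x[1][1]) < 1e-12


A1 = combo(0.5j, E(1, 2), 0.5j, E(2, 1))
A2 = combo(0.5, E(1, 2), -0.5, E(2, 1))
A3 = combo(0.5j, E(1, 1), -0.5j, E(2, 2))

print(all(in_su2(a) for a in (A1, A2, A3)))
print(close(bracket(A1, A2), combo(-1, A3, 0, A3)))   # [A1, A2] = -A3
print(close(bracket(A2, A3), combo(-1, A1, 0, A1)))   # [A2, A3] = -A1
print(close(bracket(A3, A1), combo(-1, A2, 0, A2)))   # [A3, A1] = -A2
```

The same `bracket` helper accepts $3\times3$ matrices, so the products of the $B_i$ can be checked in the same way by building them with `E(i, j, n=3)`.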
[/example] ## How The Course Develops The course begins with closed subgroups of $GL(n,\mathbb R)$ and $GL(n,\mathbb C)$, then builds a catalogue of classical examples. The next stage constructs Lie algebras, computes them by differentiating defining equations, and proves that the commutator bracket is the infinitesimal shadow of the group commutator. The exponential map then becomes the main technical tool. It gives local coordinates, produces one-parameter subgroups, and leads to the Baker--Campbell--Hausdorff formula, which describes multiplication in Lie algebra terms near the identity. The middle of the course proves [Cartan's closed subgroup theorem](/theorems/8813). This result justifies the starting point: every closed subgroup of a general linear group is an embedded Lie subgroup, so the matrix examples are not merely examples but a robust source of Lie groups. The final lectures move from local theory to global structure. Covering groups, fundamental groups, and connected components explain why the same Lie algebra can underlie several different Lie groups, and they prepare the way for representation theory and the structure theory of compact Lie groups. These local-to-global ideas set up the concrete model for the course. We now begin with matrix groups, where topology, smooth structure, and algebra can all be seen explicitly inside familiar linear transformations. # 1. Matrix Groups: Definitions and First Examples This opening chapter fixes the concrete setting for the course: Lie groups will first be studied as groups of matrices with a compatible topology and smooth structure. The guiding question is which familiar matrix groups should count as geometric objects, and why the condition of being closed inside a general linear group gives the right answer. 
The chapter builds from the open group $GL(n)$ to the classical groups preserving bilinear, Hermitian, or symplectic forms, then ends with solvable examples that will later contrast with compact and semisimple groups. ## General Linear Groups as Smooth Matrix Groups Before defining a Lie group, we need a model example where the group operations are already compatible with calculus. The space of all matrices is a finite-dimensional vector space, so it carries a natural Euclidean topology and smooth structure; the first question is how invertibility sits inside this ambient vector space. [definition: Matrix Space] For $n \in \mathbb N$ and $\mathbb F \in \{\mathbb R,\mathbb C\}$, let $M(n,\mathbb F)$ denote the vector space of all $n \times n$ matrices with entries in $\mathbb F$. [/definition] [Matrix space](/page/Matrix%20Space) supplies coordinates for doing calculus with matrices: each entry is a coordinate, and polynomial expressions in entries are smooth functions. A group of matrices should at least consist of matrices that can be composed and inverted as linear maps. This leads to the basic open region inside $M(n,\mathbb F)$ where the determinant does not vanish. [definition: General Linear Group] For $\mathbb F \in \{\mathbb R,\mathbb C\}$, the general linear group is \begin{align*} GL(n,\mathbb F) := \{A \in M(n,\mathbb F) : \det A \ne 0\}, \end{align*} with group law given by matrix multiplication. [/definition] The definition singles out exactly those matrices that define invertible linear maps on $\mathbb F^n$. To use differential topology, this set should not be a singular or jagged subset of matrix space. The determinant gives the first test: invertibility is detected by a continuous scalar-valued function, so the invertible matrices form an open domain for calculus. 
[quotetheorem:8772] [citeproof:8772] Openness gives $GL(n,\mathbb F)$ a manifold structure without constructing charts by hand, and this is exactly why the hypothesis $\det A \ne 0$ is imposed as an open condition rather than an equation. The complement $\{A:\det A=0\}$ is a closed hypersurface with singular points, for instance the zero matrix, so it is not a suitable ambient domain for ordinary differential calculus. The theorem says only that the invertible matrices form a smooth manifold as an open subset; it does not yet say that the algebraic group operations respect that smooth structure. The next compatibility question therefore concerns the group law itself: multiplication and inversion must be smooth maps, not merely algebraic operations. This is where the coordinate formulas for products and adjugates enter. [quotetheorem:8773] [citeproof:8773] The theorem makes $GL(n,\mathbb F)$ the prototype for all matrix Lie groups: it is both a group and a smooth manifold, and the two structures interact through smooth operations. Each hypothesis is doing work. Restricting to invertible matrices is necessary for inversion to be a globally defined group operation; on all of $M(n,\mathbb F)$, matrix multiplication remains smooth but singular matrices have no inverse. The openness of $GL(n,\mathbb F)$ is also what allows the rational formula for $A^{-1}$ to be treated as an ordinary smooth coordinate formula, since the denominator $\det A$ stays away from zero at each point of the domain. Finally, the theorem uses the standard matrix multiplication and inversion maps: an open manifold can carry a group law that is not smooth for its given smooth structure, for instance by transporting addition on $\mathbb R$ through a non-smooth homeomorphism. Thus smoothness of the operations is a genuine compatibility condition, not a consequence of the words "group" and "manifold" alone. 
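The rational formula $A^{-1}=\operatorname{adj}(A)/\det A$ mentioned above can be made concrete. The following sketch uses exact rational arithmetic from Python's `fractions` module and cofactor expansion, which is chosen for transparency rather than efficiency. The function names and the sample matrix `A` are illustrative assumptions.

```python
from fractions import Fraction


def minor(a, i, j):
    """Matrix a with row i and column j removed."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]


def det(a):
    """Determinant by cofactor expansion along the first row (a polynomial in the entries)."""
    if not a:
        return 1
    return sum((-1) ** j * a[0][j] * det(minor(a, 0, j)) for j in range(len(a)))


def adjugate(a):
    """Transpose of the cofactor matrix; every entry is a polynomial in the entries of a."""
    n = len(a)
    return [[(-1) ** (i + j) * det(minor(a, j, i)) for j in range(n)] for i in range(n)]


def inverse(a):
    """Inverse as adjugate divided by determinant; defined only where det(a) != 0."""
    d = det(a)
    if d == 0:
        raise ValueError("matrix is singular: det = 0, so it lies outside GL(n)")
    return [[Fraction(entry) / d for entry in row] for row in adjugate(a)]


def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


A = [[Fraction(2), Fraction(1), Fraction(0)],
     [Fraction(1), Fraction(3), Fraction(1)],
     [Fraction(0), Fraction(1), Fraction(4)]]
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

print(det(A))
print(mat_mul(A, inverse(A)) == identity)
```

Each entry of `inverse(a)` is a polynomial in the entries divided by `det(a)`, which is why inversion is smooth exactly on the open set where the determinant is nonzero. The `ValueError` branch marks the boundary of that domain.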
The smallest cases already show that these analytic facts interact with topology, because connectedness depends strongly on whether the entries are real or complex. [example: The Case Of One By One Matrices] For $n=1$, a matrix is just $[a]$, and its determinant is $a$. Hence \begin{align*} GL(1,\mathbb R)=\{[a]:a\in\mathbb R,\ a\ne 0\}\cong \mathbb R^\times \end{align*} and \begin{align*} GL(1,\mathbb C)=\{[z]:z\in\mathbb C,\ z\ne 0\}\cong \mathbb C^\times. \end{align*} In the real case, $\mathbb R^\times=(-\infty,0)\cup(0,\infty)$, and these two intervals are disjoint open subsets of $\mathbb R^\times$. They cannot lie in the same connected component because any continuous path from a negative number to a positive number would have to take the value $0$ by the intermediate value property, but $0\notin\mathbb R^\times$. In the complex case, every $z\in\mathbb C^\times$ can be written as $z=re^{i\theta}$ with $r=|z|>0$. The path \begin{align*} \gamma(t)=((1-t)r+t)e^{i(1-t)\theta} \end{align*} satisfies $\gamma(0)=re^{i\theta}=z$, $\gamma(1)=1$, and $\gamma(t)\ne 0$ for every $t\in[0,1]$ because $(1-t)r+t>0$. Thus every nonzero complex number is connected to $1$ by a path in $\mathbb C^\times$, so $GL(1,\mathbb C)$ is connected. This smallest case already shows that the same algebraic definition can produce different topological behavior over $\mathbb R$ and over $\mathbb C$. [/example] ## Closed Subgroups and the Definition of a Matrix Lie Group Many natural groups arise by imposing equations on matrices, such as preserving a determinant or an inner product. The question is which subgroups of $GL(n,\mathbb C)$ inherit a smooth manifold structure from the ambient matrix space. [definition: Matrix Lie Group] A matrix Lie group is a subgroup $G \le GL(n,\mathbb C)$, for some fixed $n \in \mathbb N$, that is closed as a subset of $GL(n,\mathbb C)$ in the subspace topology inherited from $M(n,\mathbb C)$. 
[/definition] The ambient complex general linear group is used as a uniform container. Real matrix groups such as $GL(n,\mathbb R)$, $O(n)$, and $Sp(2n,\mathbb R)$ are regarded as subgroups of the complex general linear group of the same matrix size, so for instance $O(n)\le GL(n,\mathbb C)$ and $Sp(2n,\mathbb R)\le GL(2n,\mathbb C)$, by inclusion of real matrices among complex matrices. Since this definition uses relative topology, it is useful to compare closedness inside $GL(n,\mathbb C)$ with equations inside the full matrix space. [remark: Relative And Ambient Closedness] By definition of the subspace topology, a subgroup $G \le GL(n,\mathbb C)$ is closed in $GL(n,\mathbb C)$ precisely when there is a closed subset $C \subset M(n,\mathbb C)$ such that $G=C \cap GL(n,\mathbb C)$. Such a $G$ need not be closed in $M(n,\mathbb C)$ itself, because $GL(n,\mathbb C)$ is open rather than closed there; for example, $GL(n,\mathbb C)$ is closed in itself but not in $M(n,\mathbb C)$. Most examples below are cut out by polynomial equations in the matrix entries and their complex conjugates. [/remark] Closedness excludes subgroups that wind densely through a compact group without forming a submanifold of the expected dimension. The basic model is an irrational line on a torus. [example: A Dense Nonclosed Subgroup Of The Torus] Let $\alpha \in \mathbb R \setminus \mathbb Q$ and define \begin{align*} H=\{(e^{it},e^{i\alpha t}) : t \in \mathbb R\} \subset U(1) \times U(1), \end{align*} where $U(1)\times U(1)$ is identified with the diagonal unitary matrices in $GL(2,\mathbb C)$. This set is a subgroup: the identity is obtained from $t=0$, and for $s,t\in\mathbb R$, \begin{align*} (e^{it},e^{i\alpha t})(e^{is},e^{i\alpha s})=(e^{i(t+s)},e^{i\alpha(t+s)})\in H. \end{align*} Also, \begin{align*} (e^{it},e^{i\alpha t})^{-1}=(e^{-it},e^{-i\alpha t})=(e^{i(-t)},e^{i\alpha(-t)})\in H. \end{align*} We show that $H$ is dense in $U(1)\times U(1)$. Given $(e^{iu},e^{iv})\in U(1)\times U(1)$, take $t=u+2\pi k$ with $k\in\mathbb Z$. Then the first coordinate is fixed: \begin{align*} e^{it}=e^{i(u+2\pi k)}=e^{iu}e^{i2\pi k}=e^{iu}. \end{align*} The second coordinate becomes \begin{align*} e^{i\alpha t}=e^{i\alpha u}e^{i2\pi\alpha k}. 
\end{align*} Since $\alpha$ is irrational, the set $\{e^{i2\pi\alpha k}:k\in\mathbb Z\}$ is dense in $U(1)$ by *density of irrational rotations on the circle*. Therefore we can choose integers $k$ for which $e^{i2\pi\alpha k}$ is arbitrarily close to $e^{i(v-\alpha u)}$, and then $e^{i\alpha t}$ is arbitrarily close to $e^{iv}$. Hence every point of $U(1)\times U(1)$ lies in the closure of $H$. The subgroup is not all of $U(1)\times U(1)$. If $(1,e^{iv})\in H$, then $e^{it}=1$, so $t=2\pi k$ for some $k\in\mathbb Z$, and hence the second coordinate must be $e^{i2\pi\alpha k}$. Thus, for example, any point $(1,e^{iv})$ with $e^{iv}$ not in the [countable set](/page/Countable%20Set) $\{e^{i2\pi\alpha k}:k\in\mathbb Z\}$ is not in $H$. Since the closure of $H$ is the whole torus but $H$ is proper, $H$ is dense and nonclosed. [/example] The dense subgroup example explains why arbitrary subgroups are too flexible for differential geometry. Closed subgroups, by contrast, have enough rigidity to carry a canonical embedded manifold structure. The next theorem is the main structural guarantee behind the definition, although its proof requires tools developed later. [quotetheorem:8774] [citeproof:8774] The theorem is a structural existence result, not a recipe for computing every invariant of the subgroup. Its hypothesis is topological closedness inside the ambient general linear group, and that condition is what rules out dense winding subgroups such as the irrational line in the torus. Once closedness and the subgroup property are known, the theorem supplies the compatible smooth manifold structure automatically. It does not by itself compute dimension, connected components, compactness, or the Lie algebra; those are found later by differentiating defining equations and studying the exponential map. Orthogonal groups are the first concrete test of the closed-subgroup principle. 
The defining equation asks a matrix to preserve the Euclidean inner product, so it is both geometrically meaningful and well suited to verification by closed matrix equations. [definition: Orthogonal And Special Orthogonal Groups] The orthogonal group and special orthogonal group are \begin{align*} O(n):=\{A \in GL(n,\mathbb R):A^\top A=I\}, \qquad SO(n):=\{A \in O(n):\det A=1\}. \end{align*} [/definition] Matrices in $O(n)$ preserve the Euclidean inner product on $\mathbb R^n$, because $(Ax)\cdot(Ay)=x\cdot y$ for all $x,y \in \mathbb R^n$ precisely when $A^\top A=I$. In two dimensions this condition can be solved explicitly, giving a concrete model of a compact one-dimensional group. [example: Parametrising The Circle Group SO Two] Let $A\in SO(2)$, and write its entries as $A_{11}=a$, $A_{12}=b$, $A_{21}=c$, and $A_{22}=d$. The condition $A^\top A=I$ says that the two columns are orthonormal, so \begin{align*} a^2+c^2=1,\qquad b^2+d^2=1,\qquad ab+cd=0. \end{align*} Since $(a,c)$ lies on the unit circle, there is some $\theta\in\mathbb R$ with $a=\cos\theta$ and $c=\sin\theta$. The vector $(b,d)$ is a unit vector orthogonal to $(\cos\theta,\sin\theta)$, so it must be either $(-\sin\theta,\cos\theta)$ or $(\sin\theta,-\cos\theta)$. Write $(b,d)=\lambda(-\sin\theta,\cos\theta)$ with $\lambda\in\{1,-1\}$. The determinant condition gives \begin{align*} 1=\det A=ad-bc=(\cos\theta)(\lambda\cos\theta)-(\lambda(-\sin\theta))(\sin\theta)=\lambda(\cos^2\theta+\sin^2\theta)=\lambda. \end{align*} Hence $\lambda=1$, so every element of $SO(2)$ has entries $A_{11}=\cos\theta$, $A_{12}=-\sin\theta$, $A_{21}=\sin\theta$, and $A_{22}=\cos\theta$. Conversely, for any $\theta\in\mathbb R$, the matrix with entries $\cos\theta,-\sin\theta,\sin\theta,\cos\theta$ has column norms \begin{align*} \cos^2\theta+\sin^2\theta=1 \end{align*} and column inner product \begin{align*} (\cos\theta)(-\sin\theta)+(\sin\theta)(\cos\theta)=0. 
\end{align*} Its determinant is \begin{align*} (\cos\theta)(\cos\theta)-(-\sin\theta)(\sin\theta)=\cos^2\theta+\sin^2\theta=1, \end{align*} so it lies in $SO(2)$. The same element is obtained from $\theta+2\pi$ because sine and cosine are $2\pi$-periodic. If $R(\theta)$ denotes this element, then the product $R(\theta)R(\phi)$ has first column entries \begin{align*} \cos\theta\cos\phi-\sin\theta\sin\phi=\cos(\theta+\phi) \end{align*} and \begin{align*} \sin\theta\cos\phi+\cos\theta\sin\phi=\sin(\theta+\phi). \end{align*} Its second column entries are \begin{align*} -\cos\theta\sin\phi-\sin\theta\cos\phi=-\sin(\theta+\phi) \end{align*} and \begin{align*} -\sin\theta\sin\phi+\cos\theta\cos\phi=\cos(\theta+\phi). \end{align*} Thus $R(\theta)R(\phi)=R(\theta+\phi)$, so multiplication corresponds to adding angles modulo $2\pi$. The parametrisation identifies $SO(2)$ with the circle: it is compact as the image of the compact interval $[0,2\pi]$, and it is connected because the path $t\mapsto R((1-t)\theta+t\phi)$ joins $R(\theta)$ to $R(\phi)$ inside $SO(2)$. [/example] The determinant separates the two components of $O(n)$: matrices with determinant $1$ preserve orientation, while those with determinant $-1$ reverse it. The real transpose condition does not express preservation of the complex Hermitian inner product, so we need a definition that uses conjugate transpose instead. This turns the same metric-preservation question into the unitary setting. [definition: Unitary And Special Unitary Groups] The unitary group and special unitary group are \begin{align*} U(n):=\{A \in GL(n,\mathbb C):A^*A=I\}, \qquad SU(n):=\{A \in U(n):\det A=1\}, \end{align*} where $A^*=\overline{A}^{\top}$ is the conjugate transpose. [/definition] The condition $A^*A=I$ says that $A$ preserves the standard Hermitian inner product on $\mathbb C^n$. 
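The rotation law $R(\theta)R(\phi)=R(\theta+\phi)$ and the statement that $A^*A=I$ forces preservation of the Hermitian inner product can both be checked numerically. The following is a minimal Python sketch using only the standard library; the helper names `matmul`, `rot`, `herm`, and `apply_matrix`, as well as the sample angles, phases, and vectors, are illustrative choices and not notation from these notes.

```python
import cmath
import math

def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def rot(theta):
    """The rotation matrix R(theta) in SO(2)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def max_diff(A, B):
    """Largest entrywise absolute difference between A and B."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def herm(x, y):
    """Standard Hermitian inner product on C^n, conjugate-linear in x."""
    return sum(a.conjugate() * b for a, b in zip(x, y))

def apply_matrix(A, x):
    """Matrix-vector product A x."""
    return [sum(A[i][k] * x[k] for k in range(len(x))) for i in range(len(A))]

# Multiplying rotations should add the angles.
theta, phi = 0.7, 2.1
rotation_law_error = max_diff(matmul(rot(theta), rot(phi)), rot(theta + phi))

# A sample unitary matrix: a real rotation times a diagonal matrix of phases.
U = matmul(rot(0.4), [[cmath.exp(0.3j), 0], [0, cmath.exp(-1.1j)]])
x, y = [1 + 2j, -0.5j], [0.25 - 1j, 3 + 0j]
inner_product_error = abs(herm(apply_matrix(U, x), apply_matrix(U, y)) - herm(x, y))
```

Both error quantities should be at the level of floating-point rounding. The check does not prove anything; it only confirms the algebra on one sample.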
Since the entries of $A^*A$ involve complex conjugates, these equations are real polynomial equations in the real and imaginary parts of the entries. In rank two the equations admit a particularly useful parametrisation. [example: Parametrising SU Two] Let $A\in SU(2)$, and set $\alpha=A_{11}$ and $\beta=A_{21}$. Since $A^*A=I$, the first column has norm $1$, so \begin{align*} |\alpha|^2+|\beta|^2=1. \end{align*} Writing $q=A_{12}$ and $s=A_{22}$, orthogonality of the two columns gives \begin{align*} \overline{\alpha}q+\overline{\beta}s=0. \end{align*} The vector $(-\overline{\beta},\overline{\alpha})$ is orthogonal to $(\alpha,\beta)$ because \begin{align*} \overline{\alpha}(-\overline{\beta})+\overline{\beta}\overline{\alpha}=0. \end{align*} Since $(\alpha,\beta)\ne(0,0)$, its orthogonal complement in $\mathbb C^2$ is one-dimensional, so there is some $\lambda\in\mathbb C$ with $q=-\lambda\overline{\beta}$ and $s=\lambda\overline{\alpha}$. The second column has norm $1$, hence \begin{align*} |q|^2+|s|^2=|\lambda|^2|\beta|^2+|\lambda|^2|\alpha|^2=|\lambda|^2(|\alpha|^2+|\beta|^2)=|\lambda|^2. \end{align*} Thus $|\lambda|=1$. The determinant condition now gives \begin{align*} 1=\det A=\alpha s-q\beta=\alpha(\lambda\overline{\alpha})-(-\lambda\overline{\beta})\beta=\lambda(|\alpha|^2+|\beta|^2)=\lambda. \end{align*} Therefore $q=-\overline{\beta}$ and $s=\overline{\alpha}$. Conversely, if $\alpha,\beta\in\mathbb C$ satisfy $|\alpha|^2+|\beta|^2=1$ and $A$ has entries $A_{11}=\alpha$, $A_{12}=-\overline{\beta}$, $A_{21}=\beta$, and $A_{22}=\overline{\alpha}$, then its column norms are \begin{align*} |\alpha|^2+|\beta|^2=1 \end{align*} and \begin{align*} |-\overline{\beta}|^2+|\overline{\alpha}|^2=|\beta|^2+|\alpha|^2=1. \end{align*} The column inner product is \begin{align*} \overline{\alpha}(-\overline{\beta})+\overline{\beta}\overline{\alpha}=0, \end{align*} so $A^*A=I$. 
Its determinant is \begin{align*} \alpha\overline{\alpha}-(-\overline{\beta})\beta=|\alpha|^2+|\beta|^2=1, \end{align*} so $A\in SU(2)$. Writing $\alpha=a+ib$ and $\beta=c+id$, the condition $|\alpha|^2+|\beta|^2=1$ becomes \begin{align*} a^2+b^2+c^2+d^2=1. \end{align*} Thus the parameters identify $SU(2)$ with the unit sphere $S^3\subset\mathbb R^4$. Under this identification, the group law comes from matrix multiplication. It is not abelian: for example, the choices $\alpha=0,\beta=1$ and $\alpha=i,\beta=0$ give two elements whose products in the two possible orders have opposite signs in the off-diagonal entries. [/example] The parametrisation of $SU(2)$ shows that a familiar manifold, the three-sphere, can carry a matrix group structure. The next statement records why this example matters for rotations: $SU(2)$ is the two-fold covering group of $SO(3)$. The course will later construct the map using conjugation on a three-dimensional space of traceless skew-Hermitian matrices. [quotetheorem:8775] [citeproof:8775] This statement explains why closed defining equations are the common mechanism behind the classical families, and it also marks where the hypotheses cannot be weakened without changing the conclusion. The equations must be closed conditions: inside $U(1)$, the set $\{e^{2\pi iq}:q\in\mathbb Q\}$ is a subgroup in the purely algebraic sense, but it is dense in the circle and not closed, so it is not a matrix Lie group with the subspace topology. The equations must also define a subgroup, not only a closed locus: the closed condition $A_{11}=1$ inside $GL(n,\mathbb R)$ with $n\ge 2$ is not preserved by matrix multiplication, so it cannot define a matrix Lie group. The real-entry requirements in $O(n)$, $SO(n)$, and $Sp(2n,\mathbb R)$ are likewise part of the statement; omitting them changes the object to a complex form-preserving group, not the corresponding real classical group.
Finally, determinant-one conditions distinguish special groups from their full form-preserving groups: $O(n)$ and $SO(n)$ have different connected-component behaviour, and $U(n)$ and $SU(n)$ have different central circle structure. The theorem does not by itself give dimensions, connectedness, compactness, or simple-connectedness; for example $O(n)$ has more than one component, $Sp(2n,\mathbb R)$ is noncompact, and $SU(2)$ has different global topology from $SO(3)$ despite their close relationship. These examples will remain central throughout the course. Their Lie algebras, exponential maps, connectedness properties, and covering relationships provide the main test cases for the general theory. ## Solvable Matrix Groups The classical groups are often rigid because they preserve nondegenerate forms. To see a different type of behaviour, we now look at upper-triangular matrix groups, where multiplication produces polynomial coordinate formulas and commutators move entries upward. [definition: Heisenberg Group] The real Heisenberg group is the subgroup $H_3(\mathbb R) \le GL(3,\mathbb R)$ consisting of matrices $h(x,y,z)$ with diagonal entries equal to $1$, entries $A_{12}=x$, $A_{23}=y$, $A_{13}=z$, all entries below the diagonal equal to $0$, and $x,y,z\in\mathbb R$. [/definition] The three coordinates are not multiplied independently. If $h(x,y,z)$ denotes such a matrix, then a direct multiplication gives \begin{align*} h(x,y,z)h(x',y',z')=h(x+x',y+y',z+z'+xy'). \end{align*} [example: Noncommutativity In The Heisenberg Group] Using the multiplication rule $h(a,b,c)h(a',b',c')=h(a+a',b+b',c+c'+ab')$, we first verify the inverse formula: \begin{align*} h(x,y,z)h(-x,-y,xy-z)=h(0,0,z+xy-z+x(-y))=h(0,0,0). \end{align*} Similarly, \begin{align*} h(-x,-y,xy-z)h(x,y,z)=h(0,0,xy-z+z+(-x)y)=h(0,0,0). \end{align*} Since $h(0,0,0)$ is the identity matrix, $h(x,y,z)^{-1}=h(-x,-y,xy-z)$. Now take $h(x,y,0)$ and $h(x',y',0)$. 
Their inverses are \begin{align*} h(x,y,0)^{-1}=h(-x,-y,xy) \end{align*} and \begin{align*} h(x',y',0)^{-1}=h(-x',-y',x'y'). \end{align*} The first product is \begin{align*} h(x,y,0)h(x',y',0)=h(x+x',y+y',xy'). \end{align*} Multiplying by $h(x,y,0)^{-1}$ gives \begin{align*} h(x+x',y+y',xy')h(-x,-y,xy)=h(x',y',xy'+xy+(x+x')(-y)). \end{align*} The third coordinate is \begin{align*} xy'+xy+(x+x')(-y)=xy'+xy-xy-x'y=xy'-x'y. \end{align*} So \begin{align*} h(x,y,0)h(x',y',0)h(x,y,0)^{-1}=h(x',y',xy'-x'y). \end{align*} Multiplying by the final inverse gives \begin{align*} h(x',y',xy'-x'y)h(-x',-y',x'y')=h(0,0,xy'-x'y+x'y'+x'(-y')). \end{align*} The last two terms cancel, so \begin{align*} xy'-x'y+x'y'-x'y'=xy'-x'y. \end{align*} Therefore \begin{align*} h(x,y,0)h(x',y',0)h(x,y,0)^{-1}h(x',y',0)^{-1}=h(0,0,xy'-x'y). \end{align*} For example, choosing $x=1$, $y=0$, $x'=0$, and $y'=1$ gives the nontrivial commutator $h(0,0,1)$, so $H_3(\mathbb R)$ is non-abelian. Since every commutator has the form $h(0,0,z)$, and \begin{align*} h(0,0,z)h(a,b,c)=h(a,b,z+c) \end{align*} while \begin{align*} h(a,b,c)h(0,0,z)=h(a,b,c+z), \end{align*} these commutators lie in the central subgroup consisting of the matrices $h(0,0,z)$. [/example] This is the first example where the group law contains a lower-order linear part and a higher-order correction term. The same pattern appears in the Baker-Campbell-Hausdorff formula in Chapter 5, where products of exponentials are rewritten using commutator corrections. This motivates the next definition, a larger family of upper triangular groups whose commutators climb away from the diagonal. [definition: Upper Triangular Unipotent Group] For $n \ge 2$, the upper triangular unipotent group is \begin{align*} U_n^{\mathrm{up}}(\mathbb R):=\{A \in GL(n,\mathbb R):A_{ii}=1 \text{ for all } i,\ A_{ij}=0 \text{ for } i>j\}. \end{align*} [/definition] These matrices are upper triangular with all diagonal entries equal to $1$. 
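The multiplication rule, the inverse formula, and the commutator computed in the Heisenberg example can be checked on explicit matrices. This is a minimal Python sketch with exact integer arithmetic; the function names `h`, `h_inv`, and `matmul`, and the sample coordinates, are hypothetical helpers introduced for illustration only.

```python
def h(x, y, z):
    """Heisenberg matrix with A12 = x, A23 = y, A13 = z."""
    return [[1, x, z], [0, 1, y], [0, 0, 1]]

def h_inv(x, y, z):
    """Inverse predicted by the formula h(x,y,z)^{-1} = h(-x,-y,xy-z)."""
    return h(-x, -y, x * y - z)

def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

x, y, xp, yp = 2, 3, 5, 7

# Multiplication rule: the third coordinate picks up the extra term x * y'.
product = matmul(h(x, y, 1), h(xp, yp, 4))

# Inverse formula: the product with the predicted inverse is the identity.
identity_check = matmul(h(x, y, 1), h_inv(x, y, 1))

# Commutator g1 g2 g1^{-1} g2^{-1} for g1 = h(x,y,0), g2 = h(x',y',0).
comm = matmul(matmul(h(x, y, 0), h(xp, yp, 0)),
              matmul(h_inv(x, y, 0), h_inv(xp, yp, 0)))
```

With these sample values the formulas in the example predict `product == h(7, 10, 19)` and `comm == h(0, 0, -1)`, since $xy'-x'y=14-15=-1$. The matrices `h(x, y, z)` are exactly the elements of $U_3^{\mathrm{up}}(\mathbb R)$.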
The set $U_n^{\mathrm{up}}(\mathbb R)$ is closed because its defining conditions are affine-linear equations in the matrix entries, and it is a subgroup because products and inverses of upper triangular matrices with unit diagonal are again upper triangular with unit diagonal. The next theorem explains why the algebra of this group is controlled by how far above the diagonal a nonzero entry can occur. [quotetheorem:8776] [citeproof:8776] The theorem has three distinct pieces of content. First, the mechanism is geometric: each commutator pushes possible nonzero entries farther from the diagonal, and after the top-right corner there is nowhere left to go. Second, the hypotheses are restrictive. The upper-triangular shape supplies the filtration by superdiagonals, and the unipotent condition fixes the diagonal so that commutators cannot keep reproducing a diagonal action; if arbitrary nonzero diagonal entries are allowed, the resulting triangular group is solvable but need not be nilpotent. The bound is also tied to the ambient size $n$: for the full group $U_n^{\mathrm{up}}(\mathbb R)$ it is sharp, because successive elementary superdiagonal commutators can reach the highest superdiagonal before vanishing. Third, this example points forward to the general study of solvable and nilpotent Lie groups. The Heisenberg group is the case $n=3$, and later its Lie algebra will show the same central-commutator pattern infinitesimally. These solvable examples show that matrix Lie groups are not limited to rotations and volume-preserving transformations; they also include groups whose structure is controlled by triangular algebra. Having seen the classical matrix groups as closed subgroups of $GL(n,\mathbb C)$ and their basic examples, we next need a tool for moving between linear algebra and group elements. The matrix exponential and logarithm provide that bridge, turning infinitesimal data into actual one-parameter subgroups and recovering it near the identity.

# 2. The Matrix Exponential and Logarithm

The previous chapter introduced matrix groups as concrete closed subgroups of $GL(n,\mathbb C)$ and established the basic topology and algebra of the classical examples, including the classical form-preserving groups and the Heisenberg group. This chapter develops the main analytic device that lets us pass between linear data and group-valued curves: the matrix exponential. We first treat the exponential as a convergent power series, then record the identities that make it compatible with conjugation, trace, determinant, and one-parameter motion. These identities are useful in their own right, but the role needed here is narrower: the exponential is the solution operator for constant-coefficient linear differential equations and the bridge from infinitesimal matrices to one-parameter subgroups. The inverse problem leads to the matrix logarithm, which is local rather than global and already signals why Lie theory is fundamentally local near the identity.

## Constructing the Exponential from a Power Series

The guiding problem is how to turn a tangent vector at the identity into an actual path in the group. For scalar differential equations the solution of $x'(t)=\lambda x(t)$ with $x(0)=1$ is $e^{t\lambda}$, so for matrices we want a construction that solves $g'(t)=Xg(t)$ while respecting matrix multiplication. [definition: Matrix Exponential] For consistency with the matrix-space notation fixed in Chapter 1, the matrix exponential is the map $\exp: M(n,\mathbb C) \to M(n,\mathbb C)$ defined by \begin{align*} \exp(X)=\sum_{m=0}^{\infty} \frac{X^m}{m!}. \end{align*} [/definition] The definition is only useful if the series converges in the finite-dimensional vector space $M(n,\mathbb C)$. We use any submultiplicative matrix norm, meaning a norm satisfying $\|XY\| \le \|X\|\|Y\|$ for all $X,Y \in M(n,\mathbb C)$, and ask whether the scalar exponential series bounds the matrix one.
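The comparison between the matrix series and the scalar series can be observed numerically. The sketch below, in plain Python, forms partial sums of $\sum_m X^m/m!$ and checks the term-by-term bound $\|X^m/m!\| \le \|X\|^m/m!$ that submultiplicativity provides. The Frobenius norm is used as one example of a submultiplicative norm; the function names, the sample matrix, and the cutoff of 40 terms are assumptions made for illustration, not part of the notes.

```python
import math

def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def frobenius(A):
    """Frobenius norm, a standard submultiplicative matrix norm."""
    return math.sqrt(sum(abs(a) ** 2 for row in A for a in row))

def exp_series(X, terms=40):
    """Partial sum of sum_m X^m / m!, together with the norm of each term."""
    n = len(X)
    term = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [row[:] for row in term]
    term_norms = []
    for m in range(1, terms):
        # term becomes X^m / m! from X^(m-1) / (m-1)! times X / m.
        term = [[a / m for a in row] for row in matmul(term, X)]
        total = [[s + a for s, a in zip(rs, ra)] for rs, ra in zip(total, term)]
        term_norms.append(frobenius(term))
    return total, term_norms

X = [[0.5, -1.2], [0.3, 0.8]]
approx, term_norms = exp_series(X)
r = frobenius(X)

# Submultiplicativity gives ||X^m / m!|| <= r^m / m! for every m >= 1.
bounds_hold = all(t <= r ** m / math.factorial(m) + 1e-15
                  for m, t in enumerate(term_norms, start=1))

# Diagonal sanity check: exp(diag(1, -2)) should be diag(e, e^{-2}).
D, _ = exp_series([[1.0, 0.0], [0.0, -2.0]])
```

The small tolerance `1e-15` only absorbs floating-point rounding in the comparison; the inequality itself is exact.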
[quotetheorem:8777] [citeproof:8777] The hypothesis that the norm be submultiplicative is the point of the estimate: without it, the inequality comparing $\|X^m\|$ with $\|X\|^m$ need not be available, so the scalar exponential series would no longer control the matrix series directly. For a concrete failure, put on $M(2,\mathbb C)$ the norm \begin{align*} \|B\|_* = |B_{11}|+\varepsilon |B_{12}|+\varepsilon |B_{21}|+|B_{22}| \end{align*} with $0<\varepsilon<1/\sqrt{2}$, and let $X=E_{12}+E_{21}$. Then $\|X\|_*=2\varepsilon$, while $X^2=I$ and $\|X^2\|_*=2$, so $\|X^2\|_*>\|X\|_*^2$. The theorem does not say that the exponential is a polynomial or that it is injective; for instance, nonzero nilpotent matrices give terminating series, while scalar periodicity later shows non-injectivity over $\mathbb C$. What it does provide is the analytic foundation for everything that follows: it permits termwise differentiation of $t \mapsto \exp(tX)$ and justifies rearrangements of absolutely convergent series when the relevant matrices commute. [example: Nilpotent Matrix Exponential] Let $N \in M(2,\mathbb C)$ have $N_{12}=1$ and all other entries equal to $0$. For each $i,j \in \{1,2\}$, matrix multiplication gives $(N^2)_{ij}=N_{i1}N_{1j}+N_{i2}N_{2j}$. The four entries are \begin{align*} (N^2)_{11}=N_{11}N_{11}+N_{12}N_{21}=0\cdot 0+1\cdot 0=0,\quad (N^2)_{12}=N_{11}N_{12}+N_{12}N_{22}=0\cdot 1+1\cdot 0=0. \end{align*} \begin{align*} (N^2)_{21}=N_{21}N_{11}+N_{22}N_{21}=0\cdot 0+0\cdot 0=0,\quad (N^2)_{22}=N_{21}N_{12}+N_{22}N_{22}=0\cdot 1+0\cdot 0=0. \end{align*} Thus $N^2=0$, and multiplying by further powers of $N$ gives $N^m=0$ for every $m \ge 2$. Therefore the exponential series has only its first two nonzero terms: \begin{align*} \exp(tN)=\sum_{m=0}^{\infty}\frac{(tN)^m}{m!}=I+tN+\sum_{m=2}^{\infty}\frac{t^mN^m}{m!}=I+tN. 
\end{align*} The entries of $I+tN$ are \begin{align*} (I+tN)_{11}=1+t\cdot 0=1,\quad (I+tN)_{12}=0+t\cdot 1=t,\quad (I+tN)_{21}=0+t\cdot 0=0,\quad (I+tN)_{22}=1+t\cdot 0=1. \end{align*} So $\exp(tN)$ is the upper triangular unipotent matrix with diagonal entries $1,1$ and upper-right entry $t$. This shows concretely how unipotent matrices arise from nilpotent infinitesimal data; when a matrix is nilpotent, its exponential series becomes a polynomial. [/example] The nilpotent example is the simplest case where the exponential is not merely diagonalising scalar exponentials. The next identities explain which parts of scalar exponential algebra survive in the noncommutative matrix setting. ## Algebraic Identities for the Exponential The central question is how much of $e^{a+b}=e^ae^b$ remains true for matrices. Noncommutativity means there is no unconditional formula for $\exp(X+Y)$, but powers of a single matrix commute with each other, and that is enough for the identities used throughout the course. [quotetheorem:8778] [citeproof:8778] The restriction to multiples of one fixed matrix is essential: if $X$ and $Y$ do not commute, the equality $\exp(X+Y)=\exp(X)\exp(Y)$ can fail, because the degree-two terms involve $XY+YX$ on one side and $2XY$ on the other. Thus the theorem does not give a general rule for sums of matrices; it gives exactly the one-parameter rule needed to turn $t \mapsto \exp(tX)$ into a homomorphism from the additive group of the line into the multiplicative group $GL(n,\mathbb C)$. A second question is whether this construction depends on the chosen basis for the vector space; the next identity shows that conjugating the infinitesimal matrix conjugates the resulting group element. [quotetheorem:8779] [citeproof:8779] The invertibility of $A$ is necessary because conjugation by $A$ must represent a genuine [change of basis](/page/Change%20Of%20Basis). 
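Two of the claims above lend themselves to a quick numerical check: the failure of $\exp(X+Y)=\exp(X)\exp(Y)$ for non-commuting matrices, and the conjugation identity $\exp(AXA^{-1})=A\exp(X)A^{-1}$. The following Python sketch uses a truncated series for the exponential; the helper `expm`, the 40-term cutoff, and the sample matrices are illustrative assumptions, not objects defined in the notes.

```python
def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(X, terms=40):
    """Partial sum of the exponential series for a square matrix X."""
    n = len(X)
    term = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [row[:] for row in term]
    for m in range(1, terms):
        term = [[a / m for a in row] for row in matmul(term, X)]
        total = [[s + a for s, a in zip(rs, ra)] for rs, ra in zip(total, term)]
    return total

def max_diff(A, B):
    """Largest entrywise absolute difference between A and B."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

# Non-commuting pair: X = E12 and Y = E21.
X = [[0.0, 1.0], [0.0, 0.0]]
Y = [[0.0, 0.0], [1.0, 0.0]]
X_plus_Y = [[0.0, 1.0], [1.0, 0.0]]
failure_gap = max_diff(expm(X_plus_Y), matmul(expm(X), expm(Y)))

# Conjugation by an invertible A, with its inverse written out by hand.
A = [[1.0, 2.0], [0.0, 1.0]]
A_inv = [[1.0, -2.0], [0.0, 1.0]]
Z = [[0.2, -0.7], [0.5, 0.1]]
conjugation_error = max_diff(expm(matmul(matmul(A, Z), A_inv)),
                             matmul(matmul(A, expm(Z)), A_inv))
```

Here `failure_gap` is visibly nonzero, because $\exp(X)\exp(Y)$ has entries $2,1,1,1$ while $\exp(X+Y)$ has entries $\cosh 1$ and $\sinh 1$, whereas `conjugation_error` should be at rounding level.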
A singular replacement can destroy the identity even if one tries to use $AXA$ in place of $AXA^{-1}$: take $A=E_{11}$ and $X=E_{12}$ in $M(2,\mathbb C)$. Then $AXA=0$, so $\exp(AXA)=I$, while $A\exp(X)A=E_{11}$; the two matrices are not equal. The theorem does not say that exponentials determine conjugacy classes in reverse: distinct matrices can have the same exponential, and logarithms are not global. What it does say is that the exponential is intrinsic under change of basis, so Jordan form and triangular form may be used to compute exponentials without changing the answer as a linear transformation. This raises a more structural question: if a group is defined by an equation such as $\det g=1$, what infinitesimal condition on $X$ forces $\exp X$ to satisfy that equation? [quotetheorem:8780] [citeproof:8780] The complex setting matters for this proof because every complex matrix can be triangularised; over $\mathbb R$ one would either complexify or use a different argument. The theorem does not assert that every determinant-one matrix is the exponential of a trace-zero matrix, and the logarithm discussion below explains why surjectivity questions are subtler than the determinant equation alone. Its immediate consequence is one-way but fundamental: the exponential of a trace-zero matrix lies in a special linear group. It also shows why trace appears as the infinitesimal version of determinant in later chapters. [example: Trace Zero Matrices Exponentiate into $SL(n,\mathbb C)$] Let $X \in \mathfrak{sl}(n,\mathbb C)$. By the definition of $\mathfrak{sl}(n,\mathbb C)$, this means $\operatorname{tr}X=0$. Applying the *Determinant Trace Formula* to $X$ gives \begin{align*} \det(\exp X)=\exp(\operatorname{tr}X). \end{align*} Substituting $\operatorname{tr}X=0$ yields \begin{align*} \det(\exp X)=\exp(0). \end{align*} Since the scalar exponential satisfies $\exp(0)=1$, we get \begin{align*} \det(\exp X)=1. 
\end{align*} By the definition of $SL(n,\mathbb C)$ as the subgroup of matrices in $GL(n,\mathbb C)$ with determinant $1$, it follows that $\exp X \in SL(n,\mathbb C)$. This is the prototype for recovering a matrix Lie group's Lie algebra by differentiating the equations defining the group. [/example] The identities above have a common theme: they are global statements obtained from a convergent series, but their greatest force is near the identity matrix. This is the same local viewpoint signalled in the introduction by the identity component and one-parameter subgroups. To invert the exponential near $I$, we now turn to the matrix logarithm. ## The Matrix Logarithm Near the Identity The inverse problem is local. Even for complex numbers, the logarithm cannot be defined as a single-valued global inverse of $z \mapsto e^z$ on $\mathbb C^\times$, so for matrices we ask for a convergent formula near $I$. [definition: Matrix Logarithm Near the Identity] For a fixed submultiplicative matrix norm, the matrix logarithm near the identity is the map $\log:\{I+A \in M(n,\mathbb C) : \|A\|<1\}\to M(n,\mathbb C)$ defined by \begin{align*} \log(I+A)=\sum_{m=1}^{\infty}(-1)^{m+1}\frac{A^m}{m}. \end{align*} [/definition] The norm condition is not a cosmetic assumption: it is what lets the scalar logarithm series control the matrix series. Since different matrix norms give the same local topology but different numerical balls, the first task is to prove convergence wherever the comparison with $\sum r^m/m$ is valid. [quotetheorem:8781] [citeproof:8781] The strict inequality $\|A\|<1$ is needed for this proof because the comparison series $\sum \|A\|^m/m$ diverges at $\|A\|=1$; for example, the scalar endpoint already reproduces the harmonic series. The theorem does not describe every matrix that has some logarithm, since matrices outside this norm ball may still admit logarithms by other methods or branches. 
Its role here is narrower and local: it gives a controlled formula near $I$, but having a convergent series is not yet enough; we need it to undo the exponential. The next theorem is the local inverse statement used later when identifying a neighbourhood of the identity in a Lie group with a neighbourhood of $0$ in its Lie algebra. [quotetheorem:8782] [citeproof:8782] The condition that we restrict to neighbourhoods is necessary: already in $GL(1,\mathbb C)$, the values $0$ and $2\pi i$ have the same exponential, so no global inverse can exist on all of $\mathbb C^\times$. The theorem also does not say that a preferred logarithm has been chosen for every invertible matrix; it only identifies a canonical branch close to $I$. Later, this local inverse property will be the mechanism behind charts on matrix Lie groups, while the failure of a global logarithm will explain phenomena such as compact one-parameter subgroups and periodicity. [example: Scalar Periodicity Blocks a Global Logarithm] In $GL(1,\mathbb C)=\mathbb C^\times$, the matrix exponential is the ordinary scalar exponential. Using the scalar exponential series, \begin{align*} \exp(0)=\sum_{m=0}^{\infty}\frac{0^m}{m!}=1+\sum_{m=1}^{\infty}0=1. \end{align*} [Euler's formula](/theorems/2014) gives \begin{align*} \exp(2\pi i)=\cos(2\pi)+i\sin(2\pi)=1+i\cdot 0=1. \end{align*} Thus \begin{align*} \exp(0)=1=\exp(2\pi i), \end{align*} even though $0\ne 2\pi i$. If there were a global single-valued logarithm $L:\mathbb C^\times\to \mathbb C$ satisfying $\exp(L(z))=z$ and serving as an inverse to $\exp$, then applying it to the same value $1$ would force \begin{align*} L(1)=0 \end{align*} from $\exp(0)=1$, and also \begin{align*} L(1)=2\pi i \end{align*} from $\exp(2\pi i)=1$. This contradicts $0\ne 2\pi i$. The local logarithm avoids the contradiction by choosing one branch near $1$, rather than trying to invert the exponential on all of $\mathbb C^\times$. 
[/example] The local logarithm lets us recover infinitesimal data from group elements close to the identity. For homomorphic curves through the identity, however, the exponential gives a stronger global statement. ## One-Parameter Subgroups A Lie group is studied through the curves in it, and the most rigid curves are homomorphisms from the additive group $\mathbb R$. The problem is to classify all smooth maps $\gamma:\mathbb R \to GL(n,\mathbb C)$ satisfying $\gamma(s+t)=\gamma(s)\gamma(t)$. [definition: One-Parameter Subgroup] A one-parameter subgroup of $GL(n,\mathbb C)$ is a smooth [group homomorphism](/page/Group%20Homomorphism) \begin{align*} \gamma:(\mathbb R,+)\to GL(n,\mathbb C). \end{align*} [/definition] The exponential law already supplies examples: for every $X \in M(n,\mathbb C)$, the map $t \mapsto \exp(tX)$ is a one-parameter subgroup. The obstruction to uniqueness is that a homomorphic curve is global data, while its tangent vector at the identity is only infinitesimal. To make the classification possible, we first need the local-to-global fact that an exponential curve is completely determined among smooth one-parameter subgroups by its initial velocity. [quotetheorem:8783] [citeproof:8783] Smoothness is needed so that the derivative at $0$ exists and the homomorphism equation can be differentiated; without a regularity assumption, pathological homomorphisms from $\mathbb R$ can appear if arbitrary set-theoretic choices are allowed. The theorem does not yet classify an arbitrary one-parameter subgroup; it says that once the initial velocity is fixed, there is only one smooth homomorphic curve with that velocity. This converts a global homomorphism condition into infinitesimal data and leaves the classification problem for an arbitrary one-parameter subgroup: its derivative at $0$ is the only possible infinitesimal generator, and the next theorem proves that this generator recovers the whole curve. 
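The relation between a one-parameter subgroup and its initial velocity can be illustrated numerically. In the Python sketch below, the curve $\gamma(t)=\exp(tX)$ is built from a truncated series, the homomorphism property is tested at two sample times, and a central finite difference is used to approximate $\gamma'(0)$ and $\gamma'(t)$. The names `expm` and `gamma`, the step size `step`, and the sample matrix are assumptions chosen for illustration.

```python
def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(X, terms=40):
    """Partial sum of the exponential series for a square matrix X."""
    n = len(X)
    term = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [row[:] for row in term]
    for m in range(1, terms):
        term = [[a / m for a in row] for row in matmul(term, X)]
        total = [[s + a for s, a in zip(rs, ra)] for rs, ra in zip(total, term)]
    return total

def max_diff(A, B):
    """Largest entrywise absolute difference between A and B."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

X = [[0.3, -1.0], [1.0, -0.2]]

def gamma(t):
    """The curve t -> exp(tX)."""
    return expm([[t * a for a in row] for row in X])

def derivative(t, step=1e-5):
    """Central finite-difference approximation to gamma'(t)."""
    plus, minus = gamma(t + step), gamma(t - step)
    return [[(p - q) / (2 * step) for p, q in zip(rp, rq)]
            for rp, rq in zip(plus, minus)]

s, t = 0.6, -1.4
homomorphism_error = max_diff(gamma(s + t), matmul(gamma(s), gamma(t)))
velocity_error = max_diff(derivative(0.0), X)              # gamma'(0) = X
ode_error = max_diff(derivative(0.9), matmul(X, gamma(0.9)))  # gamma' = X gamma
```

The finite-difference errors are limited by the step size, so they are expected to be small but not at machine precision.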
[quotetheorem:8784] [citeproof:8784] The smoothness hypothesis again rules out non-analytic homomorphism pathologies and ensures that the derivative at the identity is meaningful. The theorem does not say that every element of $GL(n,\mathbb C)$ lies on a unique one-parameter subgroup; different global questions about logarithms and endpoints remain. What it proves is the first precise form of Lie theory's central slogan: continuous symmetries are governed by linear infinitesimal data. The examples below show that this data can encode rotations, compact groups, and non-diagonalizable behaviour. ## Classical Computations and Non-Semisimple Behaviour The final question in this chapter is computational: what does the exponential look like in the matrix Lie algebras that came from the classical groups? These examples also warn that eigenvalue intuition must be combined with nilpotent parts, especially in noncompact groups. [example: The $\mathfrak{so}(2)$ Exponential] Let $J$ be given by $J_{11}=0$, $J_{12}=-1$, $J_{21}=1$, and $J_{22}=0$. Multiplying entries, we get \begin{align*} (J^2)_{11}=J_{11}J_{11}+J_{12}J_{21}=0\cdot 0+(-1)\cdot 1=-1. \end{align*} \begin{align*} (J^2)_{12}=J_{11}J_{12}+J_{12}J_{22}=0\cdot(-1)+(-1)\cdot 0=0. \end{align*} \begin{align*} (J^2)_{21}=J_{21}J_{11}+J_{22}J_{21}=1\cdot 0+0\cdot 1=0. \end{align*} \begin{align*} (J^2)_{22}=J_{21}J_{12}+J_{22}J_{22}=1\cdot(-1)+0\cdot 0=-1. \end{align*} Thus $J^2=-I$. Hence, for every $k\ge 0$, \begin{align*} J^{2k}=(J^2)^k=(-I)^k=(-1)^kI. \end{align*} Also, \begin{align*} J^{2k+1}=J^{2k}J=(-1)^kIJ=(-1)^kJ. \end{align*} Using the definition of the matrix exponential and separating even and odd powers, \begin{align*} \exp(tJ)=\sum_{m=0}^{\infty}\frac{t^mJ^m}{m!}=\sum_{k=0}^{\infty}\frac{t^{2k}J^{2k}}{(2k)!}+\sum_{k=0}^{\infty}\frac{t^{2k+1}J^{2k+1}}{(2k+1)!}. 
\end{align*} Substituting the formulas for $J^{2k}$ and $J^{2k+1}$ gives \begin{align*} \exp(tJ)=\left(\sum_{k=0}^{\infty}\frac{(-1)^kt^{2k}}{(2k)!}\right)I+\left(\sum_{k=0}^{\infty}\frac{(-1)^kt^{2k+1}}{(2k+1)!}\right)J. \end{align*} By the scalar power series definitions of cosine and sine, \begin{align*} \exp(tJ)=\cos(t)I+\sin(t)J. \end{align*} Therefore the entries are \begin{align*} (\exp(tJ))_{11}=\cos(t)\cdot 1+\sin(t)\cdot 0=\cos(t). \end{align*} \begin{align*} (\exp(tJ))_{12}=\cos(t)\cdot 0+\sin(t)\cdot(-1)=-\sin(t). \end{align*} \begin{align*} (\exp(tJ))_{21}=\cos(t)\cdot 0+\sin(t)\cdot 1=\sin(t). \end{align*} \begin{align*} (\exp(tJ))_{22}=\cos(t)\cdot 1+\sin(t)\cdot 0=\cos(t). \end{align*} So $\exp(tJ)$ has first row $(\cos t,-\sin t)$ and second row $(\sin t,\cos t)$. Thus the skew-symmetric generator $J$ exponentiates to the usual counterclockwise rotation subgroup of $SO(2)$. [/example] This computation is the model for compact matrix groups: skew-Hermitian infinitesimal generators lead to unitary motion. The next example packages the same phenomenon in dimension two over $\mathbb C$. [example: The $\mathfrak{su}(2)$ Exponential and Unit Quaternions] Identify $\mathfrak{su}(2)$ with matrices $X$ whose entries satisfy $X_{11}=ia$, $X_{22}=-ia$, $X_{12}=z$, and $X_{21}=-\bar z$, where $a\in \mathbb R$ and $z\in \mathbb C$. We first compute $X^2$ entry by entry: \begin{align*} (X^2)_{11}=X_{11}X_{11}+X_{12}X_{21}=(ia)(ia)+z(-\bar z)=-a^2-|z|^2. \end{align*} \begin{align*} (X^2)_{12}=X_{11}X_{12}+X_{12}X_{22}=iaz+z(-ia)=iaz-iaz=0. \end{align*} \begin{align*} (X^2)_{21}=X_{21}X_{11}+X_{22}X_{21}=(-\bar z)(ia)+(-ia)(-\bar z)=-ia\bar z+ia\bar z=0. \end{align*} \begin{align*} (X^2)_{22}=X_{21}X_{12}+X_{22}X_{22}=(-\bar z)z+(-ia)(-ia)=-|z|^2-a^2. \end{align*} Thus, with $r=(a^2+|z|^2)^{1/2}$, we have $X^2=-r^2I$. If $r>0$, then for every $k\ge 0$, \begin{align*} X^{2k}=(X^2)^k=(-r^2I)^k=(-1)^kr^{2k}I. 
\end{align*} Also, \begin{align*} X^{2k+1}=X^{2k}X=(-1)^kr^{2k}X. \end{align*} Using the definition of the matrix exponential and separating even and odd powers gives \begin{align*} \exp(X)=\sum_{k=0}^{\infty}\frac{X^{2k}}{(2k)!}+\sum_{k=0}^{\infty}\frac{X^{2k+1}}{(2k+1)!}. \end{align*} Substituting the power formulas, \begin{align*} \exp(X)=\left(\sum_{k=0}^{\infty}\frac{(-1)^kr^{2k}}{(2k)!}\right)I+\left(\sum_{k=0}^{\infty}\frac{(-1)^kr^{2k}}{(2k+1)!}\right)X. \end{align*} By the scalar power series definitions of cosine and sine, \begin{align*} \exp(X)=\cos r\,I+\frac{\sin r}{r}X. \end{align*} When $r=0$, the equality $a^2+|z|^2=0$ forces $a=0$ and $z=0$, so $X=0$ and $\exp(X)=I$. It remains to check that these exponentials lie in $SU(2)$. For $r>0$, write $c=\cos r$ and $s=\sin r/r$, so $\exp(X)=cI+sX$. Since $c,s\in\mathbb R$ and $X^*=-X$, we get \begin{align*} (cI+sX)^*=cI-sX. \end{align*} Therefore \begin{align*} (cI+sX)^*(cI+sX)=(cI-sX)(cI+sX)=c^2I-s^2X^2. \end{align*} Using $X^2=-r^2I$, \begin{align*} c^2I-s^2X^2=(c^2+s^2r^2)I=(\cos^2 r+\sin^2 r)I=I. \end{align*} Thus $\exp(X)$ is unitary. Its entries are $c+sia$, $sz$, $-s\bar z$, and $c-sia$, so its determinant is \begin{align*} (c+sia)(c-sia)-(sz)(-s\bar z)=c^2+s^2a^2+s^2|z|^2. \end{align*} Since $r^2=a^2+|z|^2$, \begin{align*} c^2+s^2a^2+s^2|z|^2=c^2+s^2r^2=\cos^2 r+\sin^2 r=1. \end{align*} For $r=0$, $\exp(X)=I$, which is also unitary with determinant $1$. Hence these exponentials lie in $SU(2)$, and the coordinates $(a,z)$ package the usual unit-quaternion, or unit $3$-sphere, parametrization of $SU(2)$. [/example] Compact examples often hide the nilpotent part because skew-Hermitian matrices are diagonalisable over $\mathbb C$. In $\mathfrak{sl}(2,\mathbb R)$, nilpotent elements give a different kind of one-parameter subgroup. 
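Before turning to that nilpotent case, the closed form $\exp(X)=\cos r\,I+\frac{\sin r}{r}X$ from the $\mathfrak{su}(2)$ example can be compared against partial sums of the exponential series. The Python sketch below does this for one sample choice of $a$ and $z$ and also checks that the result is unitary with determinant $1$. The helper `expm`, the 40-term cutoff, and the sample values are illustrative assumptions.

```python
import math

def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(X, terms=40):
    """Partial sum of the exponential series for a square matrix X."""
    n = len(X)
    term = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [row[:] for row in term]
    for m in range(1, terms):
        term = [[a / m for a in row] for row in matmul(term, X)]
        total = [[s + a for s, a in zip(rs, ra)] for rs, ra in zip(total, term)]
    return total

def max_diff(A, B):
    """Largest entrywise absolute difference between A and B."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

# An element of su(2): X11 = ia, X22 = -ia, X12 = z, X21 = -conj(z).
a, z = 0.4, 0.3 - 1.1j
X = [[1j * a, z], [-z.conjugate(), -1j * a]]
r = math.sqrt(a * a + abs(z) ** 2)

# Closed form cos(r) I + (sin(r) / r) X, valid here because r > 0.
c, s = math.cos(r), math.sin(r) / r
closed = [[c + s * X[0][0], s * X[0][1]],
          [s * X[1][0], c + s * X[1][1]]]

closed_form_error = max_diff(expm(X), closed)

# Unitarity and determinant of the closed-form matrix.
star = [[closed[j][i].conjugate() for j in range(2)] for i in range(2)]
unitarity_error = max_diff(matmul(star, closed), [[1.0, 0.0], [0.0, 1.0]])
det = closed[0][0] * closed[1][1] - closed[0][1] * closed[1][0]
```

All three quantities, `closed_form_error`, `unitarity_error`, and `abs(det - 1)`, should be at the level of floating-point rounding for this sample.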
[example: Non-Semisimple Exponentials in $\mathfrak{sl}(2,\mathbb R)$] Let $N\in \mathfrak{sl}(2,\mathbb R)$ be the matrix with $N_{12}=1$ and all other entries equal to $0$, so \begin{align*} N=\begin{pmatrix}0&1\cr 0&0\end{pmatrix}. \end{align*} Its trace is $0+0=0$, so $N\in \mathfrak{sl}(2,\mathbb R)$. Multiplying entries, \begin{align*} (N^2)_{11}=N_{11}N_{11}+N_{12}N_{21}=0\cdot 0+1\cdot 0=0,\quad (N^2)_{12}=N_{11}N_{12}+N_{12}N_{22}=0\cdot 1+1\cdot 0=0. \end{align*} \begin{align*} (N^2)_{21}=N_{21}N_{11}+N_{22}N_{21}=0\cdot 0+0\cdot 0=0,\quad (N^2)_{22}=N_{21}N_{12}+N_{22}N_{22}=0\cdot 1+0\cdot 0=0. \end{align*} Thus $N^2=0$, and hence $N^m=0$ for every $m\ge 2$. The exponential series therefore truncates: \begin{align*} \exp(tN)=\sum_{m=0}^{\infty}\frac{(tN)^m}{m!}=I+tN+\sum_{m=2}^{\infty}\frac{t^mN^m}{m!}=I+tN. \end{align*} In coordinates, \begin{align*} I+tN=\begin{pmatrix}1&0\cr 0&1\end{pmatrix}+t\begin{pmatrix}0&1\cr 0&0\end{pmatrix}=\begin{pmatrix}1&t\cr 0&1\end{pmatrix}. \end{align*} For $t\ne 0$, this matrix is not diagonalisable. Its [characteristic polynomial](/page/Characteristic%20Polynomial) is \begin{align*} \det\left(\begin{pmatrix}1&t\cr 0&1\end{pmatrix}-\lambda I\right)=\det\begin{pmatrix}1-\lambda&t\cr 0&1-\lambda\end{pmatrix}=(1-\lambda)^2, \end{align*} so the only eigenvalue is $1$. The eigenspace for $\lambda=1$ is the kernel of \begin{align*} \begin{pmatrix}1&t\cr 0&1\end{pmatrix}-I=\begin{pmatrix}0&t\cr 0&0\end{pmatrix}. \end{align*} For a vector $(x,y)^\top$, the equation \begin{align*} \begin{pmatrix}0&t\cr 0&0\end{pmatrix}\begin{pmatrix}x\cr y\end{pmatrix}=\begin{pmatrix}ty\cr 0\end{pmatrix}=\begin{pmatrix}0\cr 0\end{pmatrix} \end{align*} forces $y=0$ because $t\ne 0$. Thus the eigenspace is one-dimensional, while a diagonalisable $2\times 2$ matrix with only the eigenvalue $1$ would need a two-dimensional eigenspace. 
Therefore $\exp(tN)$ is unipotent but non-semisimple for $t\ne 0$, showing that exponentials in $SL(2,\mathbb R)$ include more than rotations and diagonal stretches. [/example] These examples complete the basic exponential dictionary for matrix groups: skew-symmetric matrices generate rotations, skew-Hermitian trace-zero matrices generate $SU(2)$ curves, and nilpotent trace-zero matrices generate unipotent subgroups. Chapter 3 uses this dictionary in reverse, extracting the Lie algebra of a closed matrix group from the one-parameter subgroups that remain inside it. The exponential map has shown how matrices generate curves inside a group, so the next question is how to read off the corresponding infinitesimal structure from those curves. We now reverse direction and define the Lie algebra of a matrix Lie group by differentiating the one-parameter subgroups that remain in the group. # 3. The Lie Algebra of a Matrix Lie Group The preceding chapter introduced the matrix exponential as the mechanism that turns linear infinitesimal data into actual one-parameter subgroups of $GL(n,\mathbb C)$. This chapter reverses that perspective: given a matrix Lie group $G$, we ask which matrices generate curves that remain inside $G$. The answer is the Lie algebra of $G$, a real vector space equipped with a bracket that records the first non-commutative information in the group law. The main theme is that the local structure of $G$ near the identity is controlled by matrices $X$ for which $\exp(tX)$ lies in $G$. We first define this space, then prove its linear and bracket properties, then compute it for the classical groups. These computations are the first point at which the equations defining $G$ become a usable infinitesimal calculus. ## Infinitesimal Generators of One-Parameter Subgroups A matrix Lie group is usually given by global equations such as $A^\top A=I$ or $\det A=1$. 
To study the group near the identity, we need a linear object that remembers the possible tangent directions of curves through $I$. The exponential map supplies a canonical test for such directions because $t\mapsto \exp(tX)$ is the unique one-parameter subgroup with initial velocity $X$ in $GL(n,\mathbb C)$. [definition: Lie Algebra of a Matrix Lie Group] Let $G\le GL(n,\mathbb C)$ be a matrix Lie group. The Lie algebra of $G$ is \begin{align*} \mathfrak g=\{X\in M(n,\mathbb C):\exp(tX)\in G\text{ for all }t\in\mathbb R\}. \end{align*} [/definition] This definition uses real time $t$, so $\mathfrak g$ is first a real vector space, even when $G$ is a complex matrix group. The first test case is the ambient group itself: if there are no equations cutting out a subgroup, every matrix should be an allowed infinitesimal direction. [example: General Linear Group] Let $G=GL(n,\mathbb C)$ and let $X\in M(n,\mathbb C)$. For each $t\in\mathbb R$, the matrices $tX$ and $-tX$ commute, so the exponential product rule for commuting matrices gives \begin{align*} \exp(tX)\exp(-tX)=\exp(tX-tX)=\exp(0)=I. \end{align*} The same argument gives \begin{align*} \exp(-tX)\exp(tX)=I. \end{align*} Thus $\exp(tX)$ is invertible for every $t$, with inverse $\exp(-tX)$, so $\exp(tX)\in GL(n,\mathbb C)$ for every $t\in\mathbb R$. Since $X$ was arbitrary, every matrix in $M(n,\mathbb C)$ lies in the Lie algebra, and therefore \begin{align*} \mathfrak{gl}(n,\mathbb C)=M(n,\mathbb C). \end{align*} By default this is a real Lie algebra, because the definition uses real parameters $t$; the calculation reflects that $GL(n,\mathbb C)$ has no additional defining equation near the identity. [/example] The example confirms that the definition agrees with the expected tangent space in the largest possible matrix group. For a closed subgroup, the analogous tangent statement is more delicate because we have not yet proved that the subgroup is an embedded submanifold. 
The following result records the geometric meaning of $\mathfrak g$ and explains why the dimension computations begun for $GL(n,\mathbb F)$ in Chapter 1 eventually reduce to linear algebra. [quotetheorem:8785] [citeproof:8785] The vector-space result captures first-order motion, so it tells us which infinitesimal directions can be added and rescaled. The theorem depends on the closedness of a matrix Lie group: in a non-closed subgroup of a torus with irrational winding, sequences of group elements may converge to points outside the subgroup, so the product-formula limit need not remain in the subgroup. The result also does not say that every element of $G$ near $I$ is the exponential of an element of $\mathfrak g$; that local surjectivity is a later theorem. What it does not yet record is the fact that the group multiplication on $G$ may be non-commutative. To see the first non-commutative correction, we need the matrix operation that measures the failure of two infinitesimal motions to commute. [definition: Commutator Bracket] The commutator bracket is the map \begin{align*} [-,-]:M(n,\mathbb C)\times M(n,\mathbb C)\to M(n,\mathbb C) \end{align*} defined by \begin{align*} [X,Y]=XY-YX \end{align*} for $X,Y\in M(n,\mathbb C)$. [/definition] The bracket measures failure of matrices to commute, but it is not automatic from the definition that this new matrix remains tangent to $G$. The next theorem is the essential compatibility between the group operation and the linear space of infinitesimal generators. It is proved by forming small group commutators and passing to a limit. [quotetheorem:8787] [citeproof:8787] Closure under the commutator gives the operation, but the theorem again uses the matrix Lie group hypotheses through the fact that all small commutator products stay in $G$ and their limits stay in $G$. Without closedness, the same limiting argument can converge in the ambient matrix algebra while leaving the subgroup under discussion. 
The theorem also does not say that an arbitrary linear subspace of $M(n,\mathbb C)$ is closed under brackets; for instance, the symmetric real matrices are not bracket-closed because the commutator of two symmetric matrices is skew-symmetric, and it is non-zero whenever the two matrices fail to commute. A Lie algebra also requires formal identities for its operation. Since our bracket comes from associative matrix multiplication, those identities can be checked algebraically once and then used for every matrix Lie group. [quotetheorem:8788] [citeproof:8788] The bracket therefore packages group-level non-commutativity into a bilinear operation. The identities in the theorem are consequences of associativity; they do not by themselves guarantee that a chosen subspace is a Lie algebra, because bracket closure is a separate condition. For example, diagonal matrices are closed under the ambient commutator and form an abelian Lie algebra, while symmetric matrices fail to form a Lie algebra under the same operation. The ambient associative algebra also matters: if an arbitrary bilinear alternating operation is written on a vector space without associativity behind it, the Jacobi identity may fail. For instance, on a basis $e_1,e_2,e_3$ define $[e_1,e_2]=e_1$, $[e_2,e_3]=e_2$, $[e_3,e_1]=e_3$ and extend by bilinearity and antisymmetry; then the Jacobi sum for $(e_1,e_2,e_3)$ is $[e_1,[e_2,e_3]]+[e_2,[e_3,e_1]]+[e_3,[e_1,e_2]]=[e_1,e_2]+[e_2,e_3]+[e_3,e_1]=e_1+e_2+e_3$, not $0$. Chapter 5 makes this link with multiplication precise through the Baker-Campbell-Hausdorff formula, which expresses $\exp X\exp Y$ as the exponential of a series involving $X$, $Y$, and repeated brackets. ## Conjugation and the Adjoint Action A group acts on itself by conjugation, and this action fixes the identity. Differentiating conjugation at the identity should therefore produce a linear action on the Lie algebra. For matrix groups this differentiation can be written without abstract tangent notation. 
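As a numerical illustration of differentiating conjugation at the identity, the sketch below takes a fixed $g\in SL(2,\mathbb R)$ and a trace-zero $X$, and compares a central finite difference of the curve $t\mapsto g\exp(tX)g^{-1}$ at $t=0$ with the matrix $gXg^{-1}$. The sample matrices, the step size `h`, and the helper names are assumptions chosen for illustration; the exponential is a truncated series rather than a production routine.

```python
def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm_series(X, terms=40):
    """Truncated series sum_{k < terms} X^k / k! (illustrative, not robust)."""
    n = len(X)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[entry / k for entry in row] for row in matmul(term, X)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def inverse_2x2(g):
    """Inverse of an invertible 2x2 matrix via the adjugate formula."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]

g = [[2.0, 1.0], [1.0, 1.0]]      # determinant 1, so g lies in SL(2, R)
X = [[0.5, 1.0], [-2.0, -0.5]]    # trace zero, so X lies in sl(2, R)
g_inv = inverse_2x2(g)

def conjugated_curve(t):
    """The curve t -> g exp(tX) g^{-1}, which passes through I at t = 0."""
    tX = [[t * x for x in row] for row in X]
    return matmul(matmul(g, expm_series(tX)), g_inv)

h = 1e-5  # finite-difference step (an assumption; not tuned)
plus, minus = conjugated_curve(h), conjugated_curve(-h)
velocity = [[(plus[i][j] - minus[i][j]) / (2 * h) for j in range(2)]
            for i in range(2)]

expected = matmul(matmul(g, X), g_inv)   # the matrix g X g^{-1}
velocity_error = max(abs(velocity[i][j] - expected[i][j])
                     for i in range(2) for j in range(2))
expected_trace = expected[0][0] + expected[1][1]
print(velocity_error, expected_trace)
```

The finite-difference velocity should match $gXg^{-1}$ up to discretisation error, and the conjugated matrix should again have trace zero, consistent with conjugation by $g$ mapping the Lie algebra to itself.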
[definition: Adjoint Action of a Matrix Lie Group] Let $G\le GL(n,\mathbb C)$ be a matrix Lie group with Lie algebra $\mathfrak g$. For $g\in G$, the adjoint action of $g$ is the map \begin{align*} \operatorname{Ad}_g:\mathfrak g\to\mathfrak g \end{align*} defined by \begin{align*} \operatorname{Ad}_g(X)=gXg^{-1} \end{align*} for $X\in\mathfrak g$. [/definition] This is well-defined because conjugating a one-parameter subgroup of $G$ gives another one-parameter subgroup of $G$. Indeed, \begin{align*} \exp(tgXg^{-1})=g\exp(tX)g^{-1}. \end{align*} After well-definedness, the next issue is whether these maps respect the group law and the bracket. That compatibility is needed before $\operatorname{Ad}$ can be used as a representation of $G$ on its infinitesimal object. [quotetheorem:8821] [citeproof:8821] The adjoint representation is a group-level action, so it varies as $g$ moves through $G$. The theorem uses invertibility and membership in $G$: conjugation by an arbitrary matrix outside the normaliser of $G$ need not preserve $G$ or its Lie algebra. It also does not say that $\operatorname{Ad}$ is faithful. In $GL(n,\mathbb C)$, every non-zero scalar matrix $\lambda I$ is central, so $\operatorname{Ad}_{\lambda I}(X)=X$ for all $X\in\mathfrak{gl}(n,\mathbb C)$ even when $\lambda I\ne I$. The representation theorem shows that conjugation preserves all Lie-algebraic structure, and this motivates asking for its infinitesimal generator. To answer that question, we restrict $g$ to a one-parameter subgroup $\exp(tX)$ and name the resulting derivative at $t=0$. [definition: Infinitesimal Adjoint Action] Let $\mathfrak g$ be the Lie algebra of a matrix Lie group $G$. For $X\in\mathfrak g$, define \begin{align*} \operatorname{ad}_X:\mathfrak g\to\mathfrak g,\qquad \operatorname{ad}_X(Y)=[X,Y]. \end{align*} [/definition] The notation separates the group-level operation $\operatorname{Ad}$ from the Lie-algebra-level operation $\operatorname{ad}$. 
The main point is not only that $\operatorname{ad}_X$ is a [linear map](/page/Linear%20Map), but that it is exactly the derivative of the curve $t\mapsto \operatorname{Ad}_{\exp(tX)}$. The following identity integrates that differential equation back to time $1$. [quotetheorem:8789] [citeproof:8789] This identity says that the exponential map intertwines conjugation in the group with exponentiation of the derivation $\operatorname{ad}_X$ on the Lie algebra. The statement is about the conjugation action on $\mathfrak g$, not about equality of group elements such as $\exp X\exp Y=\exp([X,Y])$, which is false in general. The finite-dimensional matrix setting matters because the proof solves an ordinary linear differential equation on $\mathfrak g$; without a well-defined invariant Lie algebra, the derivative would not land in the same vector space. It is often the fastest way to compute how one-parameter subgroups act on infinitesimal directions. ## Classical Lie Algebras from Defining Equations The practical power of the definition appears when $G$ is cut out by matrix equations. Substituting $\exp(tX)$ into the defining equation and differentiating at $t=0$ turns nonlinear group conditions into linear equations for $X$. [example: Special Linear Lie Algebra] Let $G=SL(n,\mathbb C)=\{A\in GL(n,\mathbb C):\det A=1\}$, and let $X\in M(n,\mathbb C)$. By *Determinant of Matrix Exponential*, for every $t\in\mathbb R$, \begin{align*} \det(\exp(tX))=\exp(\operatorname{tr}(tX)). \end{align*} Since trace is linear, \begin{align*} \operatorname{tr}(tX)=t\operatorname{tr}X. \end{align*} Therefore \begin{align*} \det(\exp(tX))=\exp(t\operatorname{tr}X). \end{align*} If $\exp(tX)\in SL(n,\mathbb C)$ for every $t\in\mathbb R$, then $\det(\exp(tX))=1$ for every $t$, so \begin{align*} \exp(t\operatorname{tr}X)=1. \end{align*} Differentiating the scalar function $t\mapsto \exp(t\operatorname{tr}X)$ at $t=0$ gives \begin{align*} \operatorname{tr}X\cdot \exp(0)=0. 
\end{align*} Since $\exp(0)=1$, this implies $\operatorname{tr}X=0$. Conversely, if $\operatorname{tr}X=0$, then for every $t\in\mathbb R$, \begin{align*} \det(\exp(tX))=\exp(t\cdot 0)=\exp(0)=1. \end{align*} Thus $\exp(tX)\in SL(n,\mathbb C)$ for every real $t$ exactly when $\operatorname{tr}X=0$, and hence \begin{align*} \mathfrak{sl}(n,\mathbb C)=\{X\in M(n,\mathbb C):\operatorname{tr}X=0\}. \end{align*} The nonlinear determinant-one condition on the group has become the linear trace-zero condition on its infinitesimal generators. [/example] The determinant condition becomes the trace condition because trace is the derivative of determinant at the identity. Orthogonal and unitary groups work the same way, except their defining equations involve transposes or conjugate transposes. [example: Orthogonal and Special Orthogonal Lie Algebras] Let $G=O(n)=\{A\in GL(n,\mathbb R):A^\top A=I\}$, and let $X\in M(n,\mathbb R)$. We compute which $X$ satisfy $\exp(tX)\in O(n)$ for every $t\in\mathbb R$. If this holds, then \begin{align*} \exp(tX)^\top\exp(tX)=I \end{align*} for every $t$. Set $E(t)=\exp(tX)$. Since $E(0)=I$ and $E'(0)=X$, differentiating the identity $E(t)^\top E(t)=I$ at $t=0$ gives \begin{align*} (E'(0))^\top E(0)+E(0)^\top E'(0)=0. \end{align*} Substituting $E(0)=I$ and $E'(0)=X$ yields \begin{align*} X^\top I+I X=0. \end{align*} Hence \begin{align*} X^\top+X=0. \end{align*} Conversely, suppose $X^\top+X=0$, so $X^\top=-X$. Termwise transposition of the exponential series gives \begin{align*} \exp(tX)^\top=\exp(tX^\top)=\exp(-tX). \end{align*} Since $-tX$ and $tX$ commute, the exponential product rule for commuting matrices gives \begin{align*} \exp(tX)^\top\exp(tX)=\exp(-tX)\exp(tX)=\exp(0)=I. \end{align*} Thus $\exp(tX)\in O(n)$ for every $t$, and the Lie algebra of $O(n)$ is \begin{align*} \mathfrak{so}(n)=\{X\in M(n,\mathbb R):X^\top+X=0\}. \end{align*} The same Lie algebra is obtained for $SO(n)$. 
Since $SO(n)\subseteq O(n)$, every infinitesimal generator of $SO(n)$ is already skew-symmetric. Conversely, if $X^\top+X=0$, then each diagonal entry satisfies $x_{ii}=-x_{ii}$, so $x_{ii}=0$ and therefore $\operatorname{tr}X=0$. By *Determinant of Matrix Exponential* and linearity of trace, \begin{align*} \det(\exp(tX))=\exp(\operatorname{tr}(tX))=\exp(t\operatorname{tr}X)=\exp(0)=1. \end{align*} Together with $\exp(tX)^\top\exp(tX)=I$, this shows $\exp(tX)\in SO(n)$ for every $t$. Thus $O(n)$ and $SO(n)$ have the same infinitesimal generators: the skew-symmetric real matrices. [/example] The unitary case is the complex analogue. The skew-Hermitian condition replaces skew-symmetry, and the special unitary condition adds trace zero. [example: Unitary and Special Unitary Lie Algebras] Let $G=U(n)=\{A\in GL(n,\mathbb C):A^*A=I\}$, and let $X\in M(n,\mathbb C)$. We compute which $X$ satisfy $\exp(tX)\in U(n)$ for every $t\in\mathbb R$. If this holds, set $E(t)=\exp(tX)$, so \begin{align*} E(t)^*E(t)=I. \end{align*} Since $E(0)=I$ and $E'(0)=X$, differentiating at $t=0$ gives \begin{align*} (E'(0))^*E(0)+E(0)^*E'(0)=0. \end{align*} Substituting $E(0)=I$ and $E'(0)=X$ gives \begin{align*} X^*I+I^*X=0. \end{align*} Since $I^*=I$, this is \begin{align*} X^*+X=0. \end{align*} Conversely, suppose $X^*+X=0$, so $X^*=-X$. Termwise adjointing the exponential series gives \begin{align*} \exp(tX)^*=\exp(tX^*)=\exp(-tX). \end{align*} The matrices $-tX$ and $tX$ commute, so the exponential product rule for commuting matrices gives \begin{align*} \exp(tX)^*\exp(tX)=\exp(-tX)\exp(tX)=\exp(0)=I. \end{align*} Thus $\exp(tX)\in U(n)$ for every real $t$, and therefore \begin{align*} \mathfrak u(n)=\{X\in M(n,\mathbb C):X^*+X=0\}. \end{align*} For $SU(n)$ we add the condition $\det A=1$. If $\exp(tX)\in SU(n)$ for every $t$, then the unitary computation already gives $X^*+X=0$. 
Also, by *Determinant of Matrix Exponential* and linearity of trace, \begin{align*} 1=\det(\exp(tX))=\exp(\operatorname{tr}(tX))=\exp(t\operatorname{tr}X) \end{align*} for every $t\in\mathbb R$. Differentiating the scalar function $t\mapsto \exp(t\operatorname{tr}X)$ at $t=0$ gives \begin{align*} 0=\operatorname{tr}X\cdot \exp(0)=\operatorname{tr}X. \end{align*} Conversely, if $X^*+X=0$ and $\operatorname{tr}X=0$, then the first part gives $\exp(tX)\in U(n)$, and \begin{align*} \det(\exp(tX))=\exp(t\operatorname{tr}X)=\exp(0)=1. \end{align*} Hence $\exp(tX)\in SU(n)$ for every real $t$, so \begin{align*} \mathfrak{su}(n)=\{X\in M(n,\mathbb C):X^*+X=0,\ \operatorname{tr}X=0\}. \end{align*} The unitary equation becomes the linear skew-Hermitian condition, and the special condition contributes exactly the trace-zero constraint. [/example] Symplectic groups are defined by preserving a non-degenerate skew form. The infinitesimal equation says that $X$ is skew-adjoint with respect to that form. [example: Symplectic Lie Algebra] Let $J\in M(2n,\mathbb R)$ be the standard symplectic matrix, with block entries $J_{12}=I_n$, $J_{21}=-I_n$, and $J_{11}=J_{22}=0$. Let \begin{align*} Sp(2n,\mathbb R)=\{A\in GL(2n,\mathbb R):A^\top JA=J\}. \end{align*} We compute the matrices $X\in M(2n,\mathbb R)$ for which $\exp(tX)\in Sp(2n,\mathbb R)$ for every $t\in\mathbb R$. Suppose first that $\exp(tX)\in Sp(2n,\mathbb R)$ for every $t$. Set $E(t)=\exp(tX)$. Then \begin{align*} E(t)^\top J E(t)=J. \end{align*} Since $E(0)=I$ and $E'(0)=X$, differentiating at $t=0$ gives \begin{align*} (E'(0))^\top J E(0)+E(0)^\top J E'(0)=0. \end{align*} Substituting $E(0)=I$ and $E'(0)=X$ gives \begin{align*} X^\top JI+I^\top JX=0. \end{align*} Since $I^\top=I$, this is \begin{align*} X^\top J+JX=0. \end{align*} Conversely, suppose $X^\top J+JX=0$. Again set $E(t)=\exp(tX)$, so $E'(t)=X E(t)$. Define \begin{align*} F(t)=E(t)^\top J E(t). 
\end{align*} Then \begin{align*} F'(t)=(XE(t))^\top J E(t)+E(t)^\top JX E(t). \end{align*} Using $(XE(t))^\top=E(t)^\top X^\top$, this becomes \begin{align*} F'(t)=E(t)^\top X^\top J E(t)+E(t)^\top JX E(t). \end{align*} Factoring the common left and right terms gives \begin{align*} F'(t)=E(t)^\top (X^\top J+JX)E(t)=E(t)^\top 0 E(t)=0. \end{align*} Thus $F(t)$ is constant, and since $F(0)=I^\top JI=J$, we have $E(t)^\top J E(t)=J$ for every $t$. Therefore $\exp(tX)\in Sp(2n,\mathbb R)$ for every real $t$. Hence \begin{align*} \mathfrak{sp}(2n,\mathbb R)=\{X\in M(2n,\mathbb R):X^\top J+JX=0\}. \end{align*} Now write $X$ in block form with block entries $X_{11}=A$, $X_{12}=B$, $X_{21}=C$, and $X_{22}=D$. The block entries of $X^\top J$ are \begin{align*} (X^\top J)_{11}=-C^\top,\quad (X^\top J)_{12}=A^\top,\quad (X^\top J)_{21}=-D^\top,\quad (X^\top J)_{22}=B^\top. \end{align*} The block entries of $JX$ are \begin{align*} (JX)_{11}=C,\quad (JX)_{12}=D,\quad (JX)_{21}=-A,\quad (JX)_{22}=-B. \end{align*} Therefore $X^\top J+JX=0$ is equivalent to the four block equations \begin{align*} -C^\top+C=0,\quad A^\top+D=0,\quad -D^\top-A=0,\quad B^\top-B=0. \end{align*} These say exactly that $C^\top=C$, $B^\top=B$, and $D=-A^\top$; the equation $-D^\top-A=0$ follows from $D=-A^\top$. Thus every element of $\mathfrak{sp}(2n,\mathbb R)$ has block form $X=(A,B;C,-A^\top)$ with $B$ and $C$ symmetric, and every matrix of that block form satisfies the symplectic infinitesimal equation. [/example] These examples illustrate the general method: the defining equation for $G$ is nonlinear in $A$, but its derivative at $I$ is linear in $X$. Closedness of the resulting space under the commutator is not an extra calculation for each group; it follows from the general closure theorem. ## The Three-Dimensional Rotation Algebra The algebra $\mathfrak{so}(3)$ is the first non-abelian Lie algebra with a geometric model that can be seen directly. 
Its elements are infinitesimal rotations, and the bracket corresponds to the cross product on $\mathbb R^3$. [definition: Hat Map] The hat map is the map \begin{align*} \widehat{\cdot}:\mathbb R^3\to\mathfrak{so}(3) \end{align*} defined as follows: for $a=(a_1,a_2,a_3)\in\mathbb R^3$, $\widehat a$ is the linear map whose action on $v\in\mathbb R^3$ is \begin{align*} \widehat a v=a\times v. \end{align*} [/definition] The matrix $\widehat a$ is skew-symmetric because the cross product is orthogonal to each of its inputs. The hat map is an isomorphism of real vector spaces from $\mathbb R^3$ onto $\mathfrak{so}(3)$. To prove that this is more than a coordinate identification, we must show that it preserves the bracket, which motivates the following theorem. [quotetheorem:8790] [citeproof:8790] This computation explains why $SO(3)$ is the group of rotations and $\mathfrak{so}(3)$ is the algebra of angular velocities. The theorem depends on the chosen orientation and cross-product convention on $\mathbb R^3$; reversing orientation would reverse the sign of the bracket model. It also depends on the specific normalisation of the hat map, since replacing $\widehat a$ by $c\widehat a$ rescales the resulting structure constants by $c$. The bracket records how infinitesimal rotations around different axes fail to commute. A basis calculation turns this geometric statement into a table that can be used in computations. [example: Basis and Bracket Table for So Three] Let $e_1=(1,0,0)$, $e_2=(0,1,0)$, and $e_3=(0,0,1)$, and set $E_i=\widehat{e_i}$. The cross-product bracket identity for $\mathfrak{so}(3)$ gives \begin{align*} [E_1,E_2]=[\widehat{e_1},\widehat{e_2}]=\widehat{e_1\times e_2}=\widehat{e_3}=E_3. \end{align*} Similarly, \begin{align*} [E_2,E_3]=[\widehat{e_2},\widehat{e_3}]=\widehat{e_2\times e_3}=\widehat{e_1}=E_1. \end{align*} The third cyclic bracket is \begin{align*} [E_3,E_1]=[\widehat{e_3},\widehat{e_1}]=\widehat{e_3\times e_1}=\widehat{e_2}=E_2. 
\end{align*} The diagonal brackets vanish because the commutator of any matrix with itself is zero: \begin{align*} [E_i,E_i]=E_iE_i-E_iE_i=0. \end{align*} The reversed brackets follow by expanding the commutator in the opposite order: \begin{align*} [E_j,E_i]=E_jE_i-E_iE_j=-(E_iE_j-E_jE_i)=-[E_i,E_j]. \end{align*} Thus the full bracket table is determined by \begin{align*} [E_1,E_2]=E_3,\qquad [E_2,E_3]=E_1,\qquad [E_3,E_1]=E_2, \end{align*} together with $[E_i,E_i]=0$ and antisymmetry. This turns bracket computations in $\mathfrak{so}(3)$ into the same cyclic arithmetic as the standard cross product basis of $\mathbb R^3$. [/example] The same pattern will reappear in $\mathfrak{su}(2)$, but with a different normalisation of the basis. This relation is the infinitesimal shadow of the covering map $SU(2)\to SO(3)$ introduced in Chapter 1 and developed topologically in Chapter 8. ## The Lie Algebra $\mathfrak{su}(2)$ The group $SU(2)$ is small enough for explicit matrices but rich enough to encode the double cover of three-dimensional rotations. Its Lie algebra consists of traceless skew-Hermitian $2\times 2$ matrices. [example: Pauli Basis for Su Two] Let $\sigma_1,\sigma_2,\sigma_3\in M(2,\mathbb C)$ be characterised by $\sigma_1e_1=e_2$, $\sigma_1e_2=e_1$, $\sigma_2e_1=ie_2$, $\sigma_2e_2=-ie_1$, $\sigma_3e_1=e_1$, and $\sigma_3e_2=-e_2$. These formulas give $\sigma_1^*=\sigma_1$, $\sigma_2^*=\sigma_2$, and $\sigma_3^*=\sigma_3$, because in each case the coefficient of $e_j$ in $\sigma_k e_\ell$ is the complex conjugate of the coefficient of $e_\ell$ in $\sigma_k e_j$. Hence \begin{align*} (i\sigma_k)^*=\overline{i}\,\sigma_k^*=-i\sigma_k \end{align*} for $k=1,2,3$, so $i\sigma_1,i\sigma_2,i\sigma_3$ are skew-Hermitian. 
Also $\sigma_1$ and $\sigma_2$ have zero diagonal entries in the basis $e_1,e_2$, while $\sigma_3e_1=e_1$ and $\sigma_3e_2=-e_2$, so \begin{align*} \operatorname{tr}(i\sigma_1)=i\operatorname{tr}(\sigma_1)=0,\quad \operatorname{tr}(i\sigma_2)=i\operatorname{tr}(\sigma_2)=0,\quad \operatorname{tr}(i\sigma_3)=i(1-1)=0. \end{align*} Thus all three matrices lie in $\mathfrak{su}(2)$. Now let $X\in\mathfrak{su}(2)$. Write its entries as $X_{11}=a$, $X_{12}=b$, $X_{21}=c$, and $X_{22}=d$. The condition $X^*+X=0$ gives \begin{align*} \overline a+a=0,\quad \overline d+d=0,\quad \overline c+b=0,\quad \overline b+c=0. \end{align*} The trace-zero condition gives \begin{align*} a+d=0. \end{align*} Since $\overline a+a=0$, there is a unique real number $x_3$ with $a=ix_3$; then $d=-a=-ix_3$. Write $b=x_2+ix_1$ with unique $x_1,x_2\in\mathbb R$. From $c=-\overline b$, we get \begin{align*} c=-(x_2-ix_1)=-x_2+ix_1. \end{align*} On the other hand, the matrix $x_1 i\sigma_1+x_2 i\sigma_2+x_3 i\sigma_3$ sends $e_1$ to $ix_3e_1+(-x_2+ix_1)e_2$ and sends $e_2$ to $(x_2+ix_1)e_1-ix_3e_2$, so its entries are exactly $a,b,c,d$. Therefore \begin{align*} X=x_1 i\sigma_1+x_2 i\sigma_2+x_3 i\sigma_3. \end{align*} The numbers $x_1,x_2,x_3$ are forced by $a=ix_3$ and $b=x_2+ix_1$, so the expression is unique. Hence $i\sigma_1,i\sigma_2,i\sigma_3$ form a real basis of $\mathfrak{su}(2)$. [/example] The Pauli basis gives coordinates on $\mathfrak{su}(2)$, but the Lie algebra is determined by its bracket, not only by its vector space structure. [example: Bracket Table for $\mathfrak{su}(2)$ in the Pauli Basis] The three Pauli matrices are as follows: $\sigma_1$ has first row $(0,1)$ and second row $(1,0)$; $\sigma_2$ has first row $(0,-i)$ and second row $(i,0)$; and $\sigma_3$ has first row $(1,0)$ and second row $(0,-1)$. Set $F_i=i\sigma_i$. 
The multiplication identities $\sigma_1\sigma_2=i\sigma_3$, $\sigma_2\sigma_3=i\sigma_1$, and $\sigma_3\sigma_1=i\sigma_2$, together with the reversed products carrying the opposite sign, give for example $[F_1,F_2]=(i\sigma_1)(i\sigma_2)-(i\sigma_2)(i\sigma_1)=-(\sigma_1\sigma_2-\sigma_2\sigma_1)=-2i\sigma_3=-2F_3$. The other two cyclic brackets are computed in the same way, so \begin{align*} [F_1,F_2]&=-2F_3,& [F_2,F_3]&=-2F_1,& [F_3,F_1]&=-2F_2. \end{align*} These relations are the bracket table of $\mathfrak{su}(2)$ in the Pauli basis. [/example] Rescaling the basis by $E_i=-F_i/2$ gives the same structure constants as the cross product basis for $\mathfrak{so}(3)$. The factor $-2$ in the table is not an intrinsic obstruction; it comes from the choice $F_i=i\sigma_i$, and a different normalisation changes the constants. This is also a real Lie algebra statement: treating $\mathfrak{su}(2)$ as a complex vector space would be wrong, since multiplication by $i$ does not preserve skew-Hermitian matrices. Thus $\mathfrak{su}(2)$ and $\mathfrak{so}(3)$ are isomorphic as real Lie algebras, even though the groups $SU(2)$ and $SO(3)$ are not isomorphic. [remark: Infinitesimal Versus Global Information] The Lie algebra detects the local multiplication law near the identity, but it does not determine the global topology of the group. The groups $SU(2)$ and $SO(3)$ have isomorphic Lie algebras, while $SU(2)$ is simply connected and $SO(3)$ has fundamental group of order $2$. Later chapters explain how covering groups separate the local Lie algebra from the global group. [/remark] This chapter has built the dictionary from matrix groups to Lie algebras: one-parameter subgroups define $\mathfrak g$, closedness gives the vector-space and bracket operations, conjugation gives $\operatorname{Ad}$ and $\operatorname{ad}$, and defining equations give concrete computations. The next step is to understand how much of the group law can be reconstructed from brackets, beginning with the exponential map near the identity in Chapter 4 and continuing with the Baker-Campbell-Hausdorff formula in Chapter 5. 
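The two bracket tables of this chapter can be verified mechanically. The sketch below is an illustration under stated assumptions, with hypothetical helper names: it writes the hat map as the explicit skew-symmetric matrix of $v\mapsto a\times v$, checks $[\widehat a,\widehat b]=\widehat{a\times b}$ on one sample pair of vectors, and then checks the $\mathfrak{su}(2)$ table for $F_i=i\sigma_i$ together with the rescaled basis $E_i=-F_i/2$.

```python
def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(A, B):
    """Commutator [A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]

def scale(c, A):
    """Scalar multiple c * A."""
    return [[c * entry for entry in row] for row in A]

def max_abs_diff(A, B):
    """Largest entrywise absolute difference between A and B."""
    n = len(A)
    return max(abs(A[i][j] - B[i][j]) for i in range(n) for j in range(n))

def hat(a):
    """Matrix of v -> a x v in the standard basis of R^3."""
    a1, a2, a3 = a
    return [[0, -a3, a2], [a3, 0, -a1], [-a2, a1, 0]]

def cross(a, b):
    """Standard cross product on R^3."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

# so(3): the hat map turns the cross product into the commutator.
a, b = (1, 2, 3), (-2, 5, 4)   # sample vectors (an arbitrary choice)
so3_defect = max_abs_diff(bracket(hat(a), hat(b)), hat(cross(a, b)))

# su(2): Pauli matrices, F_i = i sigma_i, and the rescaled basis E_i = -F_i / 2.
sigma = [[[0, 1], [1, 0]],
         [[0, -1j], [1j, 0]],
         [[1, 0], [0, -1]]]
F = [scale(1j, s) for s in sigma]
E = [scale(-0.5, f) for f in F]
cyclic = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]

# [F_i, F_j] = -2 F_k on cyclic index triples.
su2_table_defect = max(max_abs_diff(bracket(F[i], F[j]), scale(-2, F[k]))
                       for i, j, k in cyclic)
# [E_i, E_j] = E_k, the same constants as the cross-product basis of so(3).
rescaled_defect = max(max_abs_diff(bracket(E[i], E[j]), E[k])
                      for i, j, k in cyclic)
print(so3_defect, su2_table_defect, rescaled_defect)
```

All three defects should vanish. The last one is the computational content of the statement that $\mathfrak{su}(2)$ and $\mathfrak{so}(3)$ are isomorphic as real Lie algebras; it says nothing about the groups themselves.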
Once the Lie algebra has been extracted from the group, the natural next step is to understand how far that infinitesimal information determines the local group law. The exponential map and local structure results do exactly that, showing how neighborhoods of the identity are controlled by the Lie algebra. # 4. The Exponential Map and Local Structure The previous chapter constructed the Lie algebra $\mathfrak g$ of a matrix Lie group $G$ by differentiating paths through the identity. This chapter turns that infinitesimal object back into group elements using the exponential map. The central questions are local: how much of $G$ is seen by exponentiating elements near $0 \in \mathfrak g$, how do these exponential coordinates interact with left-invariant vector fields, and how can the resulting charts be used to organise the smooth structure of a matrix Lie group? ## The Exponential Map Near the Identity The first problem is to justify that exponentiating a small Lie algebra element really gives a small piece of the group, not merely a distinguished family of curves in the ambient matrix algebra. For matrix groups the exponential series is already defined in $M(n,\mathbb C)$, but the point is that it lands in $G$ when the exponent lies in $\mathfrak g$ and behaves locally like the identity map on tangent spaces. [definition: Matrix Exponential] As in Chapter 2, the matrix exponential is the map \begin{align*} \exp:M(n,\mathbb C)\to M(n,\mathbb C) \end{align*} defined, for $X \in M(n,\mathbb C)$, by \begin{align*} \exp(X) = \sum_{k=0}^{\infty} \frac{X^k}{k!}. \end{align*} [/definition] The series converges absolutely in any matrix norm, and it is compatible with conjugation and with commuting sums. In the setting of a matrix Lie group $G \le GL(n,\mathbb C)$, the Lie algebra $\mathfrak g$ was defined so that $\exp(tX) \in G$ for all $t \in \mathbb R$ whenever $X \in \mathfrak g$. 
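The series definition and the membership property $\exp(tX)\in G$ for $X\in\mathfrak g$ can be illustrated together. The following sketch truncates the exponential series and measures how far $\exp(tX)^\top\exp(tX)$ is from $I$ for one skew-symmetric $X$, so that $G=O(3)$ and $\mathfrak g=\mathfrak{so}(3)$. The sample matrix, the chosen values of $t$, and the truncation length are assumptions; a truncated series is a teaching device rather than a robust algorithm for matrices of large norm.

```python
def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(A):
    """Transpose of a matrix stored as nested lists."""
    return [list(col) for col in zip(*A)]

def expm_series(X, terms=60):
    """Truncated series sum_{k < terms} X^k / k! (illustrative, not robust)."""
    n = len(X)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term now holds X^k / k!
        term = [[entry / k for entry in row] for row in matmul(term, X)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

# A skew-symmetric sample matrix: X^T + X = 0.
X = [[0.0, -1.0, 0.5],
     [1.0, 0.0, -2.0],
     [-0.5, 2.0, 0.0]]

def orthogonality_defect(t):
    """Largest entry of exp(tX)^T exp(tX) - I in absolute value."""
    E = expm_series([[t * x for x in row] for row in X])
    P = matmul(transpose(E), E)
    return max(abs(P[i][j] - (i == j)) for i in range(3) for j in range(3))

defects = {t: orthogonality_defect(t) for t in (-1.0, 0.25, 1.0, 2.0)}
for t, d in defects.items():
    print(t, d)
```

Each defect should be at the level of floating-point rounding, which is the numerical face of the statement that the whole one-parameter subgroup stays inside the group.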
[example: First Order Behaviour Of The Exponential] Let $X \in M(n,\mathbb C)$ and set $\gamma(t)=\exp(tX)$. From the defining series for the matrix exponential, \begin{align*} \gamma(t)=\sum_{k=0}^{\infty}\frac{(tX)^k}{k!}=\sum_{k=0}^{\infty}\frac{t^kX^k}{k!}. \end{align*} At $t=0$, all terms with $k\geq 1$ contain a factor $0^k$, while the $k=0$ term is $I$, so \begin{align*} \gamma(0)=I. \end{align*} Differentiating the power series term by term near $t=0$, justified by [uniform convergence](/page/Uniform%20Convergence) of the differentiated matrix power series on bounded $t$-intervals, gives \begin{align*} \gamma'(t)=\sum_{k=1}^{\infty}\frac{k t^{k-1}X^k}{k!}=\sum_{k=1}^{\infty}\frac{t^{k-1}X^k}{(k-1)!}. \end{align*} Evaluating at $t=0$, the $k=1$ term is $X$, and every term with $k\geq 2$ contains a positive power of $0$, hence \begin{align*} \gamma'(0)=X. \end{align*} Thus the exponential path begins at the identity and has initial velocity exactly $X$, so exponentiation recovers the infinitesimal direction used to define the Lie algebra. [/example] The preceding example identifies the derivative at the origin, and the next problem is whether that first-order agreement controls the actual map near the origin. This motivates applying the [inverse function theorem](/theorems/51) to the exponential map. [quotetheorem:8791] [citeproof:8791] This theorem is the local bridge between linear algebra and group theory, but its hypotheses and locality are doing real work. The conclusion uses that $G$ is a matrix Lie group sitting inside some $GL(n,\mathbb C)$; for an arbitrary subset of $GL(n,\mathbb C)$ that is not a closed subgroup, there is no reason for the intersection with an ambient coordinate patch to be a manifold. It also does not say that $\exp$ is globally injective or surjective: on $SO(2)$, $\exp(\theta J)=\exp((\theta+2\pi)J)$, so injectivity fails once the neighbourhood is too large. 
The theorem is exactly what is needed for the next sections, where calculations near arbitrary group elements are reduced to this identity-neighbourhood chart by translation. [remark: Local Rather Than Global] The theorem makes no assertion that every element of $G$ is an exponential. It says that the exponential map is a diffeomorphism after restricting to sufficiently small neighbourhoods of $0$ and $I$. [/remark] This local warning matters throughout the subject. Many structural questions are local near the identity, but global topology and disconnectedness can prevent exponential coordinates from covering all of $G$. ## Canonical Coordinates And Left-Invariant Vector Fields Once $\exp$ is known to be locally invertible, the next problem is to choose coordinates that are intrinsic to the group rather than inherited from a large matrix space. A basis of $\mathfrak g$ gives such coordinates near $I$, and left translation moves the same coordinates to neighbourhoods of arbitrary group elements. [definition: Canonical Coordinates Of The First Kind] Let $G$ be a matrix Lie group with Lie algebra $\mathfrak g$, let $E_1,\dots,E_m$ be an ordered basis of $\mathfrak g$, and let $U\subset\mathfrak g$, $V\subset G$ be open neighbourhoods of $0$ and $I$ respectively on which $\exp|_U:U\to V$ is a diffeomorphism. The canonical coordinate map of the first kind is the map \begin{align*} \kappa:V\to \mathbb R^m \end{align*} defined by $\kappa(g)=(x_1,\dots,x_m)$, where $x_1E_1+\cdots+x_mE_m$ is the unique element of $U$ with \begin{align*} g = \exp(x_1E_1+\cdots+x_mE_m). \end{align*} [/definition] The phrase "first kind" records that the coordinates exponentiate a single Lie algebra element. Later, coordinates built from ordered products such as $\exp(x_1E_1)\cdots\exp(x_mE_m)$ will behave differently: the two systems agree to first order, but the noncommutativity of the factors already changes the second-order terms. [example: Canonical Coordinates On The Circle Group] For $G=SO(2)$, let $J$ be the matrix with entries $J_{12}=-1$, $J_{21}=1$, and $J_{11}=J_{22}=0$.
Multiplying entries gives \begin{align*} (J^2)_{11}=0\cdot 0+(-1)\cdot 1=-1,\quad (J^2)_{12}=0\cdot(-1)+(-1)\cdot 0=0,\quad (J^2)_{21}=1\cdot 0+0\cdot 1=0,\quad (J^2)_{22}=1\cdot(-1)+0\cdot 0=-1. \end{align*} Thus $J^2=-I$, so by induction $J^{2r}=(-1)^rI$ and $J^{2r+1}=(-1)^rJ$ for every $r\geq 0$. Using the defining series for the matrix exponential, \begin{align*} \exp(\theta J)=\sum_{k=0}^{\infty}\frac{\theta^kJ^k}{k!}=\sum_{r=0}^{\infty}\frac{\theta^{2r}J^{2r}}{(2r)!}+\sum_{r=0}^{\infty}\frac{\theta^{2r+1}J^{2r+1}}{(2r+1)!}. \end{align*} Substituting the power formulas for $J$ gives \begin{align*} \exp(\theta J)=\left(\sum_{r=0}^{\infty}\frac{(-1)^r\theta^{2r}}{(2r)!}\right)I+\left(\sum_{r=0}^{\infty}\frac{(-1)^r\theta^{2r+1}}{(2r+1)!}\right)J=\cos\theta\, I+\sin\theta\, J. \end{align*} Therefore the entries of $\exp(\theta J)$ are $R_{11}=R_{22}=\cos\theta$, $R_{12}=-\sin\theta$, and $R_{21}=\sin\theta$, which is exactly rotation through angle $\theta$. If $\theta,\phi\in(-\varepsilon,\varepsilon)$ with $\varepsilon<\pi$ and $\exp(\theta J)=\exp(\phi J)$, then the displayed entries give $\cos\theta=\cos\phi$ and $\sin\theta=\sin\phi$, hence $\theta-\phi\in 2\pi\mathbb Z$. Since $|\theta-\phi|<2\varepsilon<2\pi$, this forces $\theta=\phi$. Thus $\theta\mapsto \exp(\theta J)$ gives a one-to-one canonical coordinate parametrisation of the identity neighbourhood consisting of rotations with angle in $(-\varepsilon,\varepsilon)$. [/example] The example shows how exponential coordinates describe points by infinitesimal generators. The next problem is to describe infinitesimal motion at every group element using the same generator at the identity, which motivates the definition of a left-invariant vector field. [definition: Left-Invariant Vector Field] Let $G$ be a matrix Lie group and let $X \in \mathfrak g$. 
The left-invariant vector field associated to $X$ is the section \begin{align*} \widetilde X:G\to TG,\qquad g\mapsto (dL_g)_I(X), \end{align*} where $L_g:G\to G$ is left translation, $L_g(h)=gh$. [/definition] For a matrix group this has the concrete form $\widetilde X_g=gX$ as a tangent vector at $g$ inside the ambient matrix space. The natural question is whether every left-invariant smooth vector field arises in this way, and the answer identifies $\mathfrak g$ with a space of global geometric objects. [quotetheorem:8792] [citeproof:8792] This identification is the reason $\mathfrak g$ is not just a tangent space: it controls a distinguished class of global vector fields. The left-invariance hypothesis is essential; a smooth vector field may have the same value as $\widetilde X$ at $I$ but differ elsewhere, for instance by multiplying $\widetilde X$ by a nonconstant smooth function $f:G\to\mathbb R$ with $f(I)=1$. The theorem also does not classify all vector fields on $G$, only those rigidly transported by every left translation. Its value is that this rigid class is closed under the vector-field bracket, and the Lie bracket on $\mathfrak g$ will later be recovered from that closure. ## The Differential Of The Exponential Away From Zero The derivative at $0$ only sees the first-order behaviour of $\exp$ at the identity. To understand how exponential coordinates distort tangent vectors at a general $X\in\mathfrak g$, one needs a formula for $(d\exp)_X$. Noncommutativity is measured by the adjoint operator $\operatorname{ad}_X(Y)=[X,Y]$. [definition: Adjoint Operator On The Lie Algebra] For $X\in\mathfrak g$, the adjoint operator $\operatorname{ad}_X:\mathfrak g\to\mathfrak g$ is \begin{align*} \operatorname{ad}_X(Y)=[X,Y]=XY-YX. \end{align*} [/definition] The operator $\operatorname{ad}_X$ records the failure of $X$ to commute with other infinitesimal directions. 
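A small computation makes the adjoint operator concrete. The sketch below assumes the standard basis $H,E,F$ of $\mathfrak{sl}(2,\mathbb R)$, chosen here only as an illustration. The function names `bracket` and `ad` are hypothetical helpers, not notation from these notes.

```python
# Sketch: ad_X(Y) = XY - YX on the standard basis of sl(2, R).
# The basis H, E, F and the helper names are illustrative assumptions.

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(x, y):
    """Matrix commutator [x, y] = xy - yx."""
    xy, yx = mat_mul(x, y), mat_mul(y, x)
    n = len(x)
    return [[xy[i][j] - yx[i][j] for j in range(n)] for i in range(n)]

def ad(x):
    """Return the linear operator Y -> [x, Y]."""
    return lambda y: bracket(x, y)

H = [[1, 0], [0, -1]]
E = [[0, 1], [0, 0]]
F = [[0, 0], [1, 0]]

print(ad(H)(E))          # [[0, 2], [0, 0]], that is 2E
print(ad(H)(F))          # [[0, 0], [-2, 0]], that is -2F
print(ad(E)(F))          # [[1, 0], [0, -1]], that is H
print(ad(H)(ad(H)(E)))   # [[0, 4], [0, 0]], that is (ad_H)^2 E = 4E
```

In this basis $\operatorname{ad}_H$ acts diagonally on $E$ and $F$, which shows that powers of $\operatorname{ad}_X$ can be nonzero to every order when the algebra is not abelian.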
The next problem is to measure exactly how this failure changes the derivative of $\exp$ away from the origin. [quotetheorem:8834] [citeproof:8834] The formula shows that the differential is the identity at $X=0$ and is perturbed by commutators at higher order. The right-translation in the statement is necessary because $(d\exp)_X(Y)$ is a tangent vector at $\exp X$, while the series in brackets lies in the identity tangent space $\mathfrak g$. The matrix Lie group hypotheses are also part of the mechanism, not decoration: the proof differentiates an actual matrix-valued exponential curve, identifies tangent vectors inside the ambient vector space $M_n(\mathbb C)$, and uses matrix conjugation to turn $\exp(tX)Y\exp(-tX)$ into the operator series $e^{t\operatorname{ad}_X}Y$. For a general smooth Lie group an analogous formula exists, but it has to be proved using flows or the adjoint representation rather than raw matrix multiplication. In the boundary case of an abelian matrix Lie algebra, all adjoint operators vanish, so the formula collapses to the right-translate of $Y$; this is the case in which exponential coordinates behave linearly. The theorem does not say that $\exp$ remains locally invertible at every $X$: the linear operator represented by the series can fail to be invertible at special points, as happens for compact groups when exponential coordinates wrap around periodically. These commutator corrections are the first local signs of the Baker-Campbell-Hausdorff formula. [example: Abelian Case] Let \begin{align*} X=\operatorname{diag}(x_1,\dots,x_m),\qquad Y=\operatorname{diag}(y_1,\dots,y_m) \end{align*} with all $x_j,y_j\in\mathbb R$. Then \begin{align*} XY=\operatorname{diag}(x_1y_1,\dots,x_my_m) \end{align*} and \begin{align*} YX=\operatorname{diag}(y_1x_1,\dots,y_mx_m)=\operatorname{diag}(x_1y_1,\dots,x_my_m), \end{align*} so $[X,Y]=XY-YX=0$. 
Hence $\operatorname{ad}_X(Y)=0$, and every higher iterate also vanishes: \begin{align*} (\operatorname{ad}_X)^k(Y)=0\quad\text{for every }k\geq 1. \end{align*} Therefore the series in the differential formula becomes \begin{align*} \sum_{k=0}^{\infty}\frac{(\operatorname{ad}_X)^k(Y)}{(k+1)!}=Y+\sum_{k=1}^{\infty}0=Y. \end{align*} The exponential can also be read entry by entry. Since \begin{align*} X^r=\operatorname{diag}(x_1^r,\dots,x_m^r) \end{align*} for every $r\geq 0$, the defining series gives \begin{align*} \exp X=\operatorname{diag}\left(\sum_{r=0}^{\infty}\frac{x_1^r}{r!},\dots,\sum_{r=0}^{\infty}\frac{x_m^r}{r!}\right)=\operatorname{diag}(e^{x_1},\dots,e^{x_m}). \end{align*} Along the line $X+tY=\operatorname{diag}(x_1+ty_1,\dots,x_m+ty_m)$, this gives \begin{align*} \exp(X+tY)=\operatorname{diag}(e^{x_1+ty_1},\dots,e^{x_m+ty_m}). \end{align*} Differentiating each scalar entry at $t=0$ yields \begin{align*} (d\exp)_X(Y)=\operatorname{diag}(y_1e^{x_1},\dots,y_me^{x_m}). \end{align*} On the other hand, right translation by $\exp X$ sends a tangent vector $Z$ at the identity to $Z\exp X$, so \begin{align*} (dR_{\exp X})_I(Y)=Y\exp X=\operatorname{diag}(y_1e^{x_1},\dots,y_me^{x_m}). \end{align*} Thus in this abelian diagonal group, the differential of the exponential is exactly the right-translate of $Y$, with no commutator correction terms. [/example] For nonabelian groups, the commutator corrections are the first signs of the Baker-Campbell-Hausdorff formula of Chapter 5. They explain why exponential coordinates are linear to first order but not compatible with multiplication beyond first order. ## Smooth Structure From Logarithm Charts The next problem is to assemble local exponential coordinates into a manifold structure on all of $G$. Near the identity the inverse of $\exp$ is denoted by $\log$, and near a general point $g\in G$ we use left translation to move the same chart. 
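The local inverse $\log$ can be made concrete for matrices. One standard realisation is the series $\log(I+A)=\sum_{k\geq 1}(-1)^{k+1}A^k/k$, which converges when $\|A\|<1$. Whether these notes define $\log$ through this series is an assumption made for the sketch below, and the helper names and sample matrix are illustrative.

```python
# Sketch: log(exp(X)) recovers X for a small matrix X.
# The series realisation of log and all helper names are assumptions.

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    return [[float(i == j) for j in range(n)] for i in range(n)]

def mat_exp(x, terms=30):
    """Partial sum of the exponential series."""
    n = len(x)
    result = identity(n)
    term = identity(n)
    for k in range(1, terms):
        term = [[v / k for v in row] for row in mat_mul(term, x)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

def mat_log(g, terms=60):
    """Partial sum of log(I + A) = sum_{k>=1} (-1)^(k+1) A^k / k, A = g - I."""
    n = len(g)
    a = [[g[i][j] - (i == j) for j in range(n)] for i in range(n)]
    result = [[0.0] * n for _ in range(n)]
    power = identity(n)
    for k in range(1, terms + 1):
        power = mat_mul(power, a)
        sign = 1.0 if k % 2 == 1 else -1.0
        result = [[result[i][j] + sign * power[i][j] / k for j in range(n)]
                  for i in range(n)]
    return result

# A small matrix, so that exp(X) - I has norm below 1.
X = [[0.0, -0.2], [0.3, 0.1]]
recovered = mat_log(mat_exp(X))
error = max(abs(recovered[i][j] - X[i][j]) for i in range(2) for j in range(2))
print(error)
```

The series only makes sense close to $I$, which mirrors the point of this section: the logarithm is a local inverse, and charts elsewhere are obtained by translating back to this neighbourhood.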
[definition: Logarithm Chart] Let $U\subset\mathfrak g$ and $V\subset G$ be neighbourhoods on which $\exp:U\to V$ is a diffeomorphism. The logarithm chart at $g\in G$ is the map \begin{align*} \psi_g:gV\to U,\qquad \psi_g(h)=\log(g^{-1}h). \end{align*} [/definition] These charts are built from group multiplication, inversion, and the local inverse to the exponential. The remaining issue is compatibility: when two such charts overlap, their transition maps must be smooth maps between open subsets of $\mathfrak g$. [quotetheorem:8793] [citeproof:8793] This transition result proves that translated logarithm charts are compatible, but only after restricting to the part of the overlap where the displayed expression is defined. That restriction is essential: even for $SO(2)$, a chosen logarithm branch near $I$ cannot be evaluated on rotations near angle $\pi$ unless the chart domain is changed or shrunk. Thus the theorem is not a global formula for a single logarithm on all of $G$; it is a local compatibility statement for chart overlaps. The next problem is to promote that compatibility into a full manifold structure on $G$ and check that the group operations remain smooth in the resulting atlas. [quotetheorem:8794] [citeproof:8794] The theorem completes the local reconstruction promised in the earlier chapters: a matrix Lie group is not only a topological group with tangent directions at the identity, but a smooth manifold whose local coordinates are controlled by $\mathfrak g$. The matrix Lie group hypothesis matters because the construction used the ambient smooth structure of $GL(n,\mathbb C)$ and the closed-subgroup local form; an arbitrary abstract group has no such ambient calculus. The result is also local in nature: it produces compatible charts, not a guarantee that one exponential chart covers the whole group. 
For example, $SO(2)$ is a smooth one-dimensional Lie group, but no single logarithm chart from the identity can represent every rotation without cutting the circle. This distinction sets up the final section, where the global image of the exponential map is separated from its local role in the smooth structure. ## Global Behaviour Of The Exponential Map The final question in this chapter is how far the local picture extends. The exponential map always describes a neighbourhood of the identity, but its global image depends on the topology and algebra of the group. [example: Non-Surjectivity In Special Linear Groups] In $SL(2,\mathbb R)$, let $A$ have entries $A_{11}=A_{22}=-1$, $A_{12}=1$, and $A_{21}=0$. Its determinant is \begin{align*} \det A=(-1)(-1)-1\cdot 0=1. \end{align*} Thus $A\in SL(2,\mathbb R)$. We show that $A$ is not $e^X$ for any real trace-zero matrix $X$. First identify the Jordan form of $A$. The matrix $\lambda I-A$ has diagonal entries $\lambda+1$, upper-right entry $-1$, and lower-left entry $0$, so \begin{align*} \det(\lambda I-A)=(\lambda+1)(\lambda+1)-(-1)\cdot 0=(\lambda+1)^2. \end{align*} Hence the only eigenvalue is $-1$, with algebraic multiplicity $2$. Also $A+I$ has entries $(A+I)_{12}=1$ and all other entries $0$, so \begin{align*} (A+I)(u,v)=(v,0). \end{align*} Therefore $\ker(A+I)=\{(u,0):u\in\mathbb R\}$ is one-dimensional. The eigenspace has dimension $1$ while the algebraic multiplicity is $2$, so $A$ has one Jordan block of size $2$ for the negative eigenvalue $-1$. By the *Culver real logarithm criterion*, an invertible real matrix has a real logarithm only if, for each negative real eigenvalue, the number of Jordan blocks of each fixed size is even. For the eigenvalue $-1$, the matrix $A$ has exactly one Jordan block of size $2$, and one is an odd number of blocks, so the criterion fails. Hence $A$ has no real logarithm in $M_2(\mathbb R)$.
In particular there is no real trace-zero matrix $X\in\mathfrak{sl}(2,\mathbb R)$ with $e^X=A$, so the exponential map $\mathfrak{sl}(2,\mathbb R)\to SL(2,\mathbb R)$ is not surjective. [/example] This example shows that connectedness alone is not enough to guarantee surjectivity of the exponential map. It is a local parametrisation, and global obstructions can appear even in familiar matrix groups. [example: Surjectivity For The Special Unitary Group] Let $U\in SU(2)$. By the *spectral theorem for unitary matrices*, there is a unitary matrix $P$ and complex numbers $\lambda_1,\lambda_2$ such that \begin{align*} U=P\operatorname{diag}(\lambda_1,\lambda_2)P^* \end{align*} with $|\lambda_1|=|\lambda_2|=1$. Hence $\lambda_1=e^{i\alpha}$ and $\lambda_2=e^{i\beta}$ for some real $\alpha,\beta$. Since $U\in SU(2)$, $\det U=1$, and determinant is unchanged by unitary conjugation, so \begin{align*} 1=\det U=\det(P)\det(\operatorname{diag}(e^{i\alpha},e^{i\beta}))\det(P^*)=\det(P)e^{i(\alpha+\beta)}\overline{\det(P)}=e^{i(\alpha+\beta)}. \end{align*} Thus $\alpha+\beta\in 2\pi\mathbb Z$, so $e^{i\beta}=e^{-i\alpha}$. Setting $\theta=\alpha$, we may write \begin{align*} U=P\operatorname{diag}(e^{i\theta},e^{-i\theta})P^*. \end{align*} Define \begin{align*} X=P\operatorname{diag}(i\theta,-i\theta)P^*. \end{align*} Then \begin{align*} X^*=P\operatorname{diag}(-i\theta,i\theta)P^*=-X \end{align*} and \begin{align*} \operatorname{tr}X=\operatorname{tr}(\operatorname{diag}(i\theta,-i\theta))=i\theta-i\theta=0, \end{align*} so $X\in\mathfrak{su}(2)$. For every $k\geq 0$, \begin{align*} X^k=P\operatorname{diag}(i\theta,-i\theta)^kP^* \end{align*} because $P^*P=I$. Therefore the exponential series gives \begin{align*} e^X=P\exp(\operatorname{diag}(i\theta,-i\theta))P^*. \end{align*} For a diagonal matrix, powers are computed entrywise, so \begin{align*} \exp(\operatorname{diag}(i\theta,-i\theta))=\operatorname{diag}(e^{i\theta},e^{-i\theta}). 
\end{align*} Hence \begin{align*} e^X=P\operatorname{diag}(e^{i\theta},e^{-i\theta})P^*=U. \end{align*} Since every $U\in SU(2)$ has been written as $e^X$ for some $X\in\mathfrak{su}(2)$, the exponential map $\exp:\mathfrak{su}(2)\to SU(2)$ is surjective. [/example] The contrast between $SL(2,\mathbb R)$ and $SU(2)$ is a useful guide for later chapters. Local Lie theory is governed by $\mathfrak g$ and the exponential map near $0$, while global Lie theory asks how these local pieces wrap around the whole group. Covering maps make this distinction concrete: locally isomorphic groups can have the same Lie algebra but different global topology, as in the relationship between $SU(2)$ and $SO(3)$. Compactness and representation theory also enter through this global question, because compact matrix groups have strong spectral decompositions while their exponential maps may still identify many different Lie algebra elements. The local picture is now clear enough to ask how the Lie algebra encodes multiplication itself. The Baker--Campbell--Hausdorff formula answers that question by expressing the product of nearby exponentials directly in terms of the bracket. # 5. The Baker–Campbell–Hausdorff Formula The exponential map turns infinitesimal data in a Lie algebra into group elements near the identity. Building on Chapters 2 through 4, this chapter assumes the preceding material on matrix Lie groups, Lie algebras, the exponential map, the local logarithm near the identity, and the adjoint action by commutators. The basic local question is therefore multiplicative: if $X$ and $Y$ are small elements of a Lie algebra, how can we rewrite $\exp X\exp Y$ as a single exponential? The answer is the Baker-Campbell-Hausdorff formula, whose convergence and local consequences explain why the Lie bracket controls the group law near the identity. ## The Logarithm of a Product The starting problem is that $\exp X\exp Y$ is usually not $\exp(X+Y)$. 
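A quick numerical experiment illustrates this failure of additivity. The sketch below uses two non-commuting nilpotent $2\times 2$ matrices and a pair of commuting diagonal matrices. All matrices and helper names are illustrative choices, and the exponential is a truncated series.

```python
# Sketch: exp(X) exp(Y) versus exp(X + Y) for non-commuting and commuting pairs.
# Sample matrices and helper names are illustrative assumptions.

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(a, b):
    n = len(a)
    return [[a[i][j] + b[i][j] for j in range(n)] for i in range(n)]

def mat_exp(x, terms=30):
    """Partial sum of the exponential series."""
    n = len(x)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in mat_mul(term, x)]
        result = mat_add(result, term)
    return result

def max_abs_diff(a, b):
    n = len(a)
    return max(abs(a[i][j] - b[i][j]) for i in range(n) for j in range(n))

s = 0.1
X = [[0.0, s], [0.0, 0.0]]
Y = [[0.0, 0.0], [s, 0.0]]

product = mat_mul(mat_exp(X), mat_exp(Y))
exp_of_sum = mat_exp(mat_add(X, Y))
gap = max_abs_diff(product, exp_of_sum)

# Compare the discrepancy with half the commutator [X, Y] = XY - YX.
XY, YX = mat_mul(X, Y), mat_mul(Y, X)
half_bracket = [[0.5 * (XY[i][j] - YX[i][j]) for j in range(2)]
                for i in range(2)]
difference = [[product[i][j] - exp_of_sum[i][j] for j in range(2)]
              for i in range(2)]
residual = max_abs_diff(difference, half_bracket)

# Commuting (diagonal) matrices: the two sides agree.
D1 = [[0.3, 0.0], [0.0, -0.2]]
D2 = [[-0.1, 0.0], [0.0, 0.4]]
commuting_gap = max_abs_diff(mat_mul(mat_exp(D1), mat_exp(D2)),
                             mat_exp(mat_add(D1, D2)))

print(gap, residual, commuting_gap)
```

For the non-commuting pair the gap is of second order in $s$ and is dominated by $\tfrac12[X,Y]$, with a noticeably smaller residual. For the commuting pair the gap is only rounding error.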
The failure of additivity is measured by commutators, and the Baker-Campbell-Hausdorff formula gives the correction terms in a universal way. [definition: Baker--Campbell--Hausdorff Series] Let $\mathfrak g$ be a Lie algebra over $\mathbb R$ or $\mathbb C$. The Baker-Campbell-Hausdorff series is the formal map \begin{align*} \operatorname{BCH}:\mathfrak g\times \mathfrak g\longrightarrow \mathfrak g \end{align*} defined as a formal Lie series in two variables $X,Y\in\mathfrak g$ whose value is \begin{align*} \operatorname{BCH}(X,Y)=\log(\exp X\exp Y) \end{align*} whenever the logarithm of $\exp X\exp Y$ is defined in a neighbourhood of the identity. [/definition] Equivalently, as a formal identity near $(0,0)$, the BCH series is characterised by \begin{align*} \exp(\operatorname{BCH}(X,Y))=\exp X\exp Y. \end{align*} The definition is local: it refers to the logarithm in a neighbourhood of the identity. The content of the theorem later in the chapter is that the right-hand side is represented by a convergent Lie series for sufficiently small $X$ and $Y$. [example: Abelian Lie Algebra] Let $\mathfrak g=\mathbb R^n$ with zero bracket, regarded as the Lie algebra of the additive Lie group $\mathbb R^n$. In this group convention the exponential map is the identity map, so for $X,Y\in\mathbb R^n$ we have $\exp X=X$ and $\exp Y=Y$. The group product is addition, hence \begin{align*} \exp X\exp Y=X+Y. \end{align*} Applying the local logarithm, which is also the identity map in these coordinates, gives \begin{align*} \operatorname{BCH}(X,Y)=\log(\exp X\exp Y)=\log(X+Y)=X+Y. \end{align*} Since $[U,V]=0$ for all $U,V\in\mathfrak g$, every iterated commutator term in the BCH series vanishes, so the abelian case has no correction terms beyond ordinary addition. [/example] For a matrix group, the same statement can be read through the usual matrix exponential. 
If all matrices commute, then the exponential power series gives $e^Xe^Y=e^{X+Y}$, so the noncommutative terms must be built from brackets such as $[X,Y]=XY-YX$. ## Deriving the Differential Equation To find the correction terms, we introduce a parameter and differentiate the logarithm of a product. The useful path is \begin{align*} Z(t)=\log(\exp X\exp(tY)), \end{align*} so that $Z(0)=X$ and $Z(1)=\operatorname{BCH}(X,Y)$. [definition: Adjoint Operator] For a Lie algebra $\mathfrak g$ and $X\in\mathfrak g$, the adjoint operator is the linear map \begin{align*} \operatorname{ad}_X:\mathfrak g\to\mathfrak g,\qquad \operatorname{ad}_X(Y)=[X,Y]. \end{align*} [/definition] The adjoint operator packages repeated commutators into powers: \begin{align*} \operatorname{ad}_X^2(Y)=[X,[X,Y]]. \end{align*} The next task is to find the velocity of the logarithmic path $Z(t)$; once that velocity is written using $\operatorname{ad}_{Z(t)}$, the BCH coefficients can be read recursively. [quotetheorem:8795] [citeproof:8795] This equation is the mechanism behind the BCH formula, but its hypotheses are doing real work. The path must remain in a logarithm chart, so the statement is local near the identity and gives no global logarithm for arbitrary products. The matrix Lie group assumption lets us use the matrix exponential and its left-trivialised derivative directly; for abstract Lie groups the same formula is obtained after choosing a chart and invoking the general exponential theory. The operator fraction is also interpreted through a convergent power series only for sufficiently small $Z(t)$, which is why the theorem is a coefficient-extraction device rather than a global multiplication formula. To convert it into explicit coefficients, we need the scalar [generating function](/page/Generating%20Function) for the operator fraction appearing in the velocity formula. 
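The scalar series for the operator fraction can be generated mechanically. The sketch below assumes that the fraction in the velocity formula is $z/(1-e^{-z})$ with $z=\operatorname{ad}_{Z(t)}$, the form whose expansion is quoted in this chapter. It inverts the power series of $(1-e^{-z})/z$ in exact rational arithmetic. The function name `reciprocal_series` is a hypothetical helper.

```python
from fractions import Fraction
from math import factorial

# Sketch: Taylor coefficients of z / (1 - exp(-z)) by power-series inversion.
# The helper name is an illustrative assumption.

def reciprocal_series(f, order):
    """Coefficients g_0..g_order of 1/f, given coefficients of f with f[0] != 0."""
    g = [Fraction(1) / f[0]]
    for n in range(1, order + 1):
        # The coefficient of z^n in f * g must vanish for n >= 1.
        s = sum(f[k] * g[n - k] for k in range(1, n + 1))
        g.append(-s / f[0])
    return g

order = 4
# (1 - exp(-z)) / z = sum_{k >= 0} (-1)^k z^k / (k + 1)!
f = [Fraction((-1) ** k, factorial(k + 1)) for k in range(order + 1)]
coeffs = reciprocal_series(f, order)

for n, c in enumerate(coeffs):
    print(n, c)   # coefficient of z^n: 1, 1/2, 1/12, 0, -1/720
```

Exact fractions are used instead of floating point so that the vanishing cubic coefficient is visible as an exact zero rather than as rounding noise.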
[definition: Bernoulli Numbers] The Bernoulli numbers $B_n$ are defined by the generating function \begin{align*} \frac{z}{e^z-1}=\sum_{n=0}^{\infty}B_n\frac{z^n}{n!}. \end{align*} [/definition] The equivalent expansion needed for the BCH path is \begin{align*} \frac{z}{1-e^{-z}}=1+\frac{z}{2}+\frac{z^2}{12}-\frac{z^4}{720}+\cdots. \end{align*} Substituting $z=\operatorname{ad}_{Z(t)}$ expresses $dZ/dt$ as a sum of iterated brackets. ## The First Terms of the Formula The next question is what the differential equation gives when terms are grouped by degree. We regard $X$ and $Y$ as degree $1$, and each bracket adds degrees. [quotetheorem:8796] [citeproof:8796] The first term $X+Y$ is the abelian approximation, the half-bracket is the first correction, and the degree-three terms record the first asymmetry between $X$ acting twice and $Y$ acting once after the commutator has appeared. This statement is a truncation, so it determines the product only modulo brackets of degree at least $4$. The smallness hypothesis is inherited from the logarithm and convergence questions: without it, the expression may still make formal sense while no single-valued logarithm of $\exp X\exp Y$ is being described. For instance, in a compact group such as $SO(2)$ the exponential map is periodic, so large elements of the Lie algebra cannot be recovered uniquely from their exponentials. The degree-three formula is therefore a local asymptotic expansion, not a global product formula. [example: Heisenberg BCH] Let $\mathfrak h$ be the Heisenberg Lie algebra with basis $X,Y,Z$ and brackets $[X,Y]=Z$ and $[X,Z]=[Y,Z]=0$. For $A=aX+bY+cZ$ and $B=a'X+b'Y+c'Z$, bilinearity of the bracket gives \begin{align*} [A,B]=aa'[X,X]+ab'[X,Y]+ac'[X,Z]+ba'[Y,X]+bb'[Y,Y]+bc'[Y,Z]+ca'[Z,X]+cb'[Z,Y]+cc'[Z,Z]. \end{align*} The alternating property gives $[X,X]=[Y,Y]=[Z,Z]=0$, antisymmetry gives $[Y,X]=-[X,Y]=-Z$, and the assumptions $[X,Z]=[Y,Z]=0$ also imply $[Z,X]=[Z,Y]=0$. 
Hence \begin{align*} [A,B]=ab'Z-ba'Z=(ab'-ba')Z. \end{align*} This bracket lies in the central line $\mathbb R Z$, so for every $C\in\mathfrak h$ we have \begin{align*} [C,[A,B]]=(ab'-ba')[C,Z]=0. \end{align*} Thus every iterated commutator of length at least $3$ vanishes, and the BCH series truncates after the degree-two term: \begin{align*} \operatorname{BCH}(A,B)=A+B+\frac{1}{2}[A,B]. \end{align*} Substituting the computed bracket and expanding $A+B$ gives \begin{align*} \operatorname{BCH}(A,B)=(a+a')X+(b+b')Y+\left(c+c'+\frac{1}{2}(ab'-ba')\right)Z. \end{align*} The Heisenberg group law therefore differs from ordinary vector addition exactly by the central correction $\frac{1}{2}(ab'-ba')Z$, and the polynomial is finite because the Lie algebra is two-step nilpotent. [/example] The Heisenberg example is the model case where the formal series terminates. For a general Lie algebra, the iterated commutators continue indefinitely, so convergence becomes a genuine analytic issue. ## Convergence and the Local Group Law A formal expression is not enough for Lie groups: we need the series to converge in a neighbourhood of the origin. The convergence theorem says that the universal Lie series defines an actual local multiplication law on the Lie algebra. [quotetheorem:8797] [citeproof:8797] The constant $\log 2$ is a convenient universal radius, not a sharp boundary in most examples. Finite-dimensionality and the norm inequality let the nested commutators be dominated by an ordinary scalar series; without such control, formal Lie series need not define convergent analytic functions. Absolute convergence in this ball also does not make BCH global: products outside the chosen neighbourhood can leave every logarithm chart or meet different branches of the logarithm. The theorem therefore gives a reliable local analytic law, and the next question is whether that law can be used as multiplication even before choosing a particular group. 
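The local nature of the series can be probed numerically. The sketch below uses the standard degree-three truncation $X+Y+\tfrac12[X,Y]+\tfrac1{12}[X,[X,Y]]-\tfrac1{12}[Y,[X,Y]]$, which is assumed to agree with the truncation quoted earlier in this chapter, and compares its exponential with $\exp X\exp Y$ at two scales. The sample matrices and helper names are illustrative.

```python
# Sketch: error of the degree-three BCH truncation shrinks like the 4th power.
# The truncation formula is the standard one; helper names are assumptions.

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def combine(*pairs):
    """Linear combination of (coefficient, matrix) pairs."""
    n = len(pairs[0][1])
    return [[sum(c * m[i][j] for c, m in pairs) for j in range(n)]
            for i in range(n)]

def bracket(x, y):
    return combine((1.0, mat_mul(x, y)), (-1.0, mat_mul(y, x)))

def mat_exp(x, terms=30):
    """Partial sum of the exponential series."""
    n = len(x)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in mat_mul(term, x)]
        result = combine((1.0, result), (1.0, term))
    return result

def bch3(x, y):
    """Degree-three truncation of the BCH series."""
    xy = bracket(x, y)
    return combine((1.0, x), (1.0, y), (0.5, xy),
                   (1.0 / 12, bracket(x, xy)), (-1.0 / 12, bracket(y, xy)))

E = [[0.0, 1.0], [0.0, 0.0]]
F = [[0.0, 0.0], [1.0, 0.0]]

def truncation_error(s):
    """Largest entry of |exp(sE) exp(sF) - exp(bch3(sE, sF))|."""
    x, y = combine((s, E)), combine((s, F))
    exact = mat_mul(mat_exp(x), mat_exp(y))
    approx = mat_exp(bch3(x, y))
    return max(abs(exact[i][j] - approx[i][j])
               for i in range(2) for j in range(2))

e1, e2 = truncation_error(0.1), truncation_error(0.05)
print(e1, e2, e1 / e2)
```

Halving the scale should reduce the error by a factor close to $2^4$, which is what one expects when the first omitted terms have degree four. The experiment says nothing about large $X$ and $Y$, where the series need not describe a single-valued logarithm.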
[quotetheorem:8798] [citeproof:8798] This theorem is a conceptual turning point in the course. The Lie algebra is not merely the tangent space with a bracket; together with BCH, it contains the full local multiplication table near the identity. The repeated shrinking of neighbourhoods is essential because a local group law is only required to multiply pairs whose product remains in the chosen chart. It does not decide how far the local multiplication can be continued, whether the exponential map is globally injective, or which discrete identifications a global group may have. Thus BCH reconstructs the germ of the group at the identity, while global topology still lies outside the local formula. ## Commutators and Local Isomorphisms The last step is to extract structural consequences. The group commutator detects the Lie bracket to lowest nonzero order, and local isomorphism follows from transporting the BCH law through a Lie algebra isomorphism. [quotetheorem:8799] [citeproof:8799] This expansion explains why the Lie bracket is the infinitesimal shadow of the group commutator. The smallness assumption keeps all four exponentials and their product inside a logarithm chart, so the expression is again local. The notation $O_3(X,Y)$ records all iterated-bracket terms of degree at least $3$, which means the theorem determines only the leading nonzero term of the commutator word. It cannot distinguish two Lie group laws whose commutator expansions first differ in higher degree, and it gives no information about commutators formed far from the identity. In an abelian Lie group the group commutator is the identity near $e$; in a nonabelian group, the first nonzero term is exactly the bracket. [example: Commutator in the Heisenberg Group] Let $A,B\in\mathfrak h$. From the previous Heisenberg computation, $[A,B]$ lies in the central line $\mathbb R Z$, so $[A,[A,B]]=0$ and $[B,[A,B]]=0$. 
Hence the BCH product of any two elements $U,V\in\mathfrak h$ is \begin{align*} U*V=U+V+\frac{1}{2}[U,V]. \end{align*} We compute the commutator word in exponential coordinates. First, \begin{align*} A*B=A+B+\frac{1}{2}[A,B]. \end{align*} Multiplying by $-A$ gives \begin{align*} (A*B)*(-A)=A+B+\frac{1}{2}[A,B]-A+\frac{1}{2}\left[A+B+\frac{1}{2}[A,B],-A\right]. \end{align*} By bilinearity and centrality of $[A,B]$, \begin{align*} \left[A+B+\frac{1}{2}[A,B],-A\right]=-[A,A]-[B,A]-\frac{1}{2}[[A,B],A]. \end{align*} Since $[A,A]=0$, $[B,A]=-[A,B]$, and $[[A,B],A]=0$, this bracket is $[A,B]$. Therefore \begin{align*} (A*B)*(-A)=B+[A,B]. \end{align*} Now multiply by $-B$: \begin{align*} ((A*B)*(-A))*(-B)=B+[A,B]-B+\frac{1}{2}[B+[A,B],-B]. \end{align*} Again using bilinearity and centrality, \begin{align*} [B+[A,B],-B]=-[B,B]-[[A,B],B]=0. \end{align*} Thus \begin{align*} ((A*B)*(-A))*(-B)=[A,B]. \end{align*} Translating back from BCH coordinates, this is exactly \begin{align*} \exp A\exp B\exp(-A)\exp(-B)=\exp([A,B]). \end{align*} For sufficiently small $A,B$ this follows in any local Heisenberg group chart; in the simply connected Heisenberg group the exponential coordinates are global, so the same identity holds for all $A,B\in\mathfrak h$. The error term $O_3(X,Y)$ disappears here because every triple commutator is zero in a two-step nilpotent Lie algebra. [/example] The Heisenberg computation shows how much information about commutators is already encoded in the Lie bracket. The next question is whether an isomorphism of Lie algebras therefore forces the corresponding group multiplications to agree in exponential coordinates. [quotetheorem:8800] [citeproof:8800] Connectedness is not needed for the construction near the identity, but it matters for the wider course because connected Lie groups are generated by small neighbourhoods of the identity. 
The theorem is local: the map $F$ is defined only after choosing exponential coordinate neighbourhoods, and the product identity is asserted only when $g$, $h$, and $gh$ remain in the selected neighbourhood. A Lie algebra isomorphism therefore need not produce a global group isomorphism. The standard example is the shared Lie algebra of $SU(2)$ and $SO(3)$: their Lie algebras are isomorphic and their identity neighbourhoods are locally isomorphic, but globally $SU(2)$ is a double cover of $SO(3)$. BCH shows that no further local data are missing, while fundamental groups and discrete central quotients remain global information. After BCH, the focus shifts from local formulas to maps between Lie groups and the infinitesimal maps they induce. Homomorphisms and the functor $\operatorname{Lie}$ formalize that passage, showing how smooth group maps differentiate to Lie algebra maps and how local data constrain the global ones. # 6. Homomorphisms and the Functor Lie Homomorphisms are the maps that preserve the group structure, but in Lie theory they also have to interact correctly with smooth structure. This chapter assumes the preceding material on matrix Lie groups, one-parameter subgroups, the exponential map, tangent spaces at the identity, and the Lie bracket obtained from commutators. With that background in place, the goal is to explain why the differential at the identity captures the infinitesimal part of a homomorphism, and why much of the global map can be reconstructed from that infinitesimal data. The central theme is that the assignment $\operatorname{Lie}$ is a functor: it turns Lie groups into Lie algebras and turns compatible group maps into compatible linear maps. ## Lie Group Homomorphisms and Automatic Smoothness The first question is whether a map that respects multiplication should also be required to respect the smooth structure. 
For abstract Lie groups smoothness is part of the definition, but for matrix Lie groups the topology and algebra interact so strongly that continuity already forces differentiability. [definition: Lie Group Homomorphism] Let $G$ and $H$ be Lie groups. A Lie group homomorphism is a smooth map $\varphi: G \to H$ such that \begin{align*} \varphi(g_1g_2) = \varphi(g_1)\varphi(g_2) \end{align*} for all $g_1,g_2 \in G$. [/definition] The condition implies $\varphi(e_G)=e_H$ and $\varphi(g^{-1})=\varphi(g)^{-1}$ for every $g\in G$: the first follows by cancelling $\varphi(e_G)$ in $\varphi(e_G)=\varphi(e_G)\varphi(e_G)$, and the second from $\varphi(g)\varphi(g^{-1})=\varphi(e_G)=e_H$. In a course that begins with closed matrix groups, this definition raises a useful technical issue: many maps are first constructed algebraically and checked to be continuous, while differentiability may be harder to verify directly. [quotetheorem:8801] [citeproof:8801] This result justifies a common working convention in matrix Lie theory: after continuity has been checked, algebraic homomorphisms may be treated as Lie group homomorphisms. The continuity hypothesis is doing real work: without it, abstract group homomorphisms can be built using wild, discontinuous additive maps of $\mathbb R$ that do not respect the usual topology and hence are not smooth. The matrix-group hypothesis also matters because the proof uses exponential coordinates and the classification of continuous one-parameter subgroups inside closed matrix groups. The theorem does not say that every abstract group homomorphism between underlying groups of Lie groups is smooth; it says that continuity rules out the pathological ones. The next example shows the convention in a familiar family where the differential is already visible from first-order matrix calculus. [example: Determinant Homomorphism] Consider $\det: GL(n,\mathbb C)\to \mathbb C^\times$. For $A,B\in GL(n,\mathbb C)$, multiplicativity of determinants gives $\det(AB)=\det(A)\det(B)$, so $\det$ is a group homomorphism.
Since $\det(A)$ is the polynomial in the entries of $A=(a_{ij})$ given by \begin{align*} \det(A)=\sum_{\sigma\in S_n}\operatorname{sgn}(\sigma)\prod_{i=1}^n a_{i,\sigma(i)}, \end{align*} its restriction to the open subset $GL(n,\mathbb C)\subset M_n(\mathbb C)$ is smooth. Now fix $X=(x_{ij})\in\mathfrak{gl}(n,\mathbb C)$. Applying the same formula to $I+tX$ gives \begin{align*} \det(I+tX)=\sum_{\sigma\in S_n}\operatorname{sgn}(\sigma)\prod_{i=1}^n(\delta_{i,\sigma(i)}+t x_{i,\sigma(i)}). \end{align*} For the identity permutation, the term is \begin{align*} \prod_{i=1}^n(1+t x_{ii})=1+t\sum_{i=1}^n x_{ii}+O(t^2)=1+t\operatorname{tr}(X)+O(t^2). \end{align*} If $\sigma\ne\operatorname{id}$, then $\sigma$ moves at least two indices, so at least two factors have $\delta_{i,\sigma(i)}=0$; hence that whole permutation term is $O(t^2)$. Therefore \begin{align*} \det(I+tX)=1+t\operatorname{tr}(X)+O(t^2). \end{align*} Differentiating the curve $t\mapsto \det(I+tX)$ at $t=0$ gives $d(\det)_I(X)=\operatorname{tr}(X)\in T_1\mathbb C^\times\cong\mathbb C$, so the multiplicative determinant becomes the additive trace infinitesimally. [/example] The determinant example is a prototype: a group-level multiplicative invariant becomes an additive linear invariant after differentiation. We now make this differentiation process intrinsic. ## The Differential at the Identity A homomorphism is determined near any point by what it does near the identity, because $\varphi(gx)=\varphi(g)\varphi(x)$. The natural infinitesimal object is therefore the derivative at the identity, regarded as a map between tangent spaces. [definition: Differential of a Lie Group Homomorphism] Let $\varphi: G \to H$ be a Lie group homomorphism. Its differential is the linear map \begin{align*} d\varphi_e: T_eG \to T_eH. \end{align*} Identifying $T_eG=\mathfrak g$ and $T_eH=\mathfrak h$, we write \begin{align*} d\varphi: \mathfrak g \to \mathfrak h. 
\end{align*} [/definition] The definition isolates the identity-level derivative, but computations with Lie groups usually pass through one-parameter subgroups rather than arbitrary tangent curves. The next theorem explains why this is legitimate: differentiating a homomorphism at the identity tells exactly how it transports exponentials, so the local group map can be read from $d\varphi$. [quotetheorem:8802] [citeproof:8802] This theorem is the main computational bridge between group maps and Lie algebra maps. The homomorphism hypothesis is essential: a smooth map between Lie groups may have a perfectly good derivative at the identity without carrying one-parameter subgroups to one-parameter subgroups. The theorem also does not imply that $\varphi$ is globally determined by $d\varphi$ on an arbitrary group, because the exponential map need not be globally surjective and disconnected components are invisible from $T_eG$. For instance, a homomorphism out of a disconnected Lie group may be changed on another component while leaving the identity-level differential unchanged. What the theorem does give is the local exponential compatibility needed to prove that the differential respects the bracket, because the bracket is encoded in second-order commutator behaviour near the identity. [quotetheorem:8803] [citeproof:8803] The bracket condition is more than a consistency check: it is the extra structure that distinguishes Lie algebra maps from arbitrary linear maps. The homomorphism hypothesis is again essential, since an arbitrary smooth map can have any prescribed linear derivative at the identity without respecting commutators. The conclusion also goes only one way for a general source group: a bracket-preserving linear map need not integrate to a homomorphism unless a global topology condition is imposed later. 
In practice this result lets us compute differentials by checking how a map behaves on generators, and it prepares the formal statement that $G \mapsto \mathfrak g$ is functorial. [example: Differential of the Inclusion of a Closed Subgroup] Let $K \le G$ be a closed Lie subgroup, and let $i:K\hookrightarrow G$ be the inclusion homomorphism. If $X\in\mathfrak k=T_eK$, choose a smooth curve $k(t)$ in $K$ with $k(0)=e$ and $k'(0)=X$. Since $i(k(t))=k(t)$ as a curve in $G$, the derivative of the included curve is the same tangent vector, now viewed in $T_eG$: \begin{align*} di_e(X)=\frac{d}{dt}\bigg|_{t=0} i(k(t))=\frac{d}{dt}\bigg|_{t=0} k(t)=X\in\mathfrak g. \end{align*} Thus $di:\mathfrak k\to\mathfrak g$ is exactly the inclusion of the tangent subspace $T_eK\subseteq T_eG$. For $X,Y\in\mathfrak k$, the inclusion map is a Lie group homomorphism, so *Differential Preserves Lie Brackets* gives \begin{align*} di([X,Y]_{\mathfrak k})=[di(X),di(Y)]_{\mathfrak g}. \end{align*} Substituting $di(X)=X$ and $di(Y)=Y$ gives \begin{align*} [X,Y]_{\mathfrak k}=[X,Y]_{\mathfrak g}. \end{align*} So the bracket on $\mathfrak k$ is the restriction of the bracket on $\mathfrak g$, and the Lie algebra of a closed subgroup is a Lie subalgebra of the ambient Lie algebra. [/example] The examples so far show that $\operatorname{Lie}$ passes homomorphisms to linear bracket-preserving maps. The next section records that this construction behaves correctly with products and quotients, matching the standard operations on groups. ## Functoriality for Products and Quotients A functor should preserve composition and identities, and it should turn standard constructions into their infinitesimal analogues. For Lie groups this means that products become direct sums, while quotients by normal subgroups become quotients by ideals. 
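The claim that products become direct sums can be checked by hand in a matrix model. The sketch below assumes, purely for illustration, that both factors are matrix groups and that the product group is realised by block-diagonal matrices $\operatorname{diag}(g,h)$; the helper names `matmul`, `bracket`, and `block_diag` are hypothetical and are not notation from these notes. Under that assumption the commutator of two block-diagonal matrices should be computed block by block.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def bracket(A, B):
    """Matrix commutator [A, B] = AB - BA for square matrices."""
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]


def block_diag(X, Y):
    """Embed the pair (X, Y) as the block-diagonal matrix diag(X, Y)."""
    p, q = len(X), len(Y)
    top = [list(row) + [0] * q for row in X]
    bottom = [[0] * p + list(row) for row in Y]
    return top + bottom


# Illustrative tangent vectors: X1, X2 in the first factor, Y1, Y2 in the second.
X1, X2 = [[0, 1], [0, 0]], [[0, 0], [1, 0]]
Y1, Y2 = [[1, 2], [3, 4]], [[0, 1], [-1, 0]]

# Bracket computed in the product model versus bracket computed per factor.
lhs = bracket(block_diag(X1, Y1), block_diag(X2, Y2))
rhs = block_diag(bracket(X1, X2), bracket(Y1, Y2))
componentwise = (lhs == rhs)
```

The check uses exact integer arithmetic, so `componentwise` compares the two sides entry by entry with no rounding. It illustrates the direct-sum bracket in one model only and is not a substitute for the proof.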
[quotetheorem:8804] [citeproof:8804] The functor theorem handles maps, and its proof is exactly where smoothness enters: without differentiability there is no chain rule, and without the homomorphism property the derivative would not necessarily preserve brackets. The theorem does not assert that the functor is faithful on all Lie groups, since distinct homomorphisms can have the same differential when disconnected components or discrete kernels are present. A concrete example is the zero-dimensional Lie group $\mathbb Z/2\mathbb Z$: the identity homomorphism and the constant homomorphism to the identity element are distinct maps $\mathbb Z/2\mathbb Z\to\mathbb Z/2\mathbb Z$, but both have differential $0:0\to 0$. This shows that $\operatorname{Lie}$ forgets purely discrete information. The next structural question is what Lie algebra is attached to a product group. The answer should be a direct sum if the functor $\operatorname{Lie}$ really respects independent group components, and the next theorem verifies this expectation. [quotetheorem:8805] [citeproof:8805] The product theorem depends on using the direct product group law; for a semidirect product with a genuine action, the underlying manifold may still look like a product, but the bracket acquires an action term rather than being componentwise. For example, the group of orientation-preserving affine maps $x\mapsto ax+b$ of the line is $\mathbb R\rtimes\mathbb R^\times_{+}$ with multiplication \begin{align*} (b,a)(b',a')=(b+ab',aa'). \end{align*} Write $a=e^t$, so the Lie algebra has basis $B$ for translations and $T$ for dilations. In the matrix realisation sending $(b,a)$ to the upper-triangular matrix with rows $(a,b)$ and $(0,1)$, these are $T=e_{11}$ and $B=e_{12}$, so $[T,B]=e_{11}e_{12}-e_{12}e_{11}=e_{12}=B$, not $0$. Thus the underlying manifold coordinates are product-like, but the Lie algebra is not the abelian direct sum $\mathbb R\oplus\mathbb R$. The quotient construction is subtler because it requires normality.
Normality is exactly what makes the quotient a group, and infinitesimally it becomes the condition that the Lie algebra of the [normal subgroup](/page/Normal%20Subgroup) is an ideal. [quotetheorem:8806] [citeproof:8806] This quotient theorem is the infinitesimal version of the group quotient construction. Closedness and the quotient Lie group structure are essential: if a normal subgroup is not closed, the topological quotient need not be a Lie group in the usual sense. Normality is also essential, since without it there is no [quotient group](/theorems/790) and infinitesimally the subalgebra need not be an ideal. The theorem does not recover discrete normal subgroups from the Lie algebra, because a discrete subgroup has zero Lie algebra. It explains why Lie algebra ideals are the correct objects for forming quotient Lie algebras, and it points next to the kernel of a homomorphism as the basic source of such ideals. [example: Projective Linear Groups] Let $G=GL(n,\mathbb C)$ and let $N=\mathbb C^\times I=\{\lambda I:\lambda\in\mathbb C^\times\}$. For every $A\in GL(n,\mathbb C)$ and every $\lambda\in\mathbb C^\times$, \begin{align*} A(\lambda I)=\lambda A=(\lambda I)A. \end{align*} Thus $N$ is central, hence normal, and the quotient $PGL(n,\mathbb C)=GL(n,\mathbb C)/\mathbb C^\times I$ is the group obtained by identifying matrices that differ by a nonzero scalar. We compute the Lie algebra of $N$. A smooth curve in $N$ through $I$ has the form $\gamma(t)=\lambda(t)I$ with $\lambda(0)=1$, so \begin{align*} \gamma'(0)=\lambda'(0)I. \end{align*} Conversely, for any $c\in\mathbb C$, the curve $\gamma_c(t)=e^{tc}I$ lies in $N$, satisfies $\gamma_c(0)=I$, and has \begin{align*} \gamma_c'(0)=cI. \end{align*} Therefore \begin{align*} \operatorname{Lie}(N)=\mathbb C I\subset \mathfrak{gl}(n,\mathbb C). 
\end{align*} By *Lie Algebra of a Quotient* applied to $GL(n,\mathbb C)$ and its closed normal subgroup $\mathbb C^\times I$, \begin{align*} \mathfrak{pgl}(n,\mathbb C)=\operatorname{Lie}(PGL(n,\mathbb C))\cong \mathfrak{gl}(n,\mathbb C)/\mathbb C I. \end{align*} So passing from linear to projective transformations kills exactly the infinitesimal scalar direction: $X$ and $X+cI$ define the same tangent direction in $\mathfrak{pgl}(n,\mathbb C)$, just as $A$ and $\lambda A$ define the same projective transformation. [/example] Products and quotients show that Lie respects structural operations. We now focus on how much information the differential retains about a single homomorphism. ## Kernel, Image, and Determination by the Differential A group homomorphism is measured algebraically by its kernel and image. The corresponding infinitesimal statement compares these subgroups with the kernel and image of the differential. [quotetheorem:8807] [citeproof:8807] The image behaves similarly near the identity, though global image subgroups may fail to be closed in general Lie theory. The closed-subgroup structure on $\ker\varphi$ matters because the equality identifies tangent vectors to an actual embedded subgroup; without a Lie subgroup structure, the right-hand side would not be defined. The result does not say that $\ker\varphi$ is determined by $\ker(d\varphi)$ as a group: a discrete kernel has zero Lie algebra and is therefore invisible infinitesimally. This limitation is often useful rather than harmful, because it is exactly what identifies covering maps and discrete central quotients. [example: Discrete Kernel of a Covering Homomorphism] Let $p:\widetilde G\to G$ be a covering homomorphism of Lie groups. Because $p$ is a covering map, there is an open neighbourhood $U$ of $e_{\widetilde G}$ such that $p|_U:U\to p(U)$ is a diffeomorphism. 
Therefore its derivative at the identity is an isomorphism: \begin{align*} dp_{e_{\widetilde G}}:T_{e_{\widetilde G}}\widetilde G\to T_{e_G}G. \end{align*} Under the identifications $T_{e_{\widetilde G}}\widetilde G=\operatorname{Lie}(\widetilde G)$ and $T_{e_G}G=\mathfrak g$, this says \begin{align*} dp:\operatorname{Lie}(\widetilde G)\to\mathfrak g \end{align*} is a linear isomorphism. Hence \begin{align*} \ker(dp)=\{0\}. \end{align*} By *Kernel of the Differential*, \begin{align*} \operatorname{Lie}(\ker p)=\ker(dp)=\{0\}. \end{align*} The subgroup $\ker p$ is normal because if $a\in\ker p$ and $\widetilde g\in\widetilde G$, then \begin{align*} p(\widetilde g a\widetilde g^{-1})=p(\widetilde g)p(a)p(\widetilde g)^{-1}=p(\widetilde g)e_Gp(\widetilde g)^{-1}=e_G. \end{align*} Thus $\widetilde g a\widetilde g^{-1}\in\ker p$. Since $\operatorname{Lie}(\ker p)=0$, the identity component of $\ker p$ is zero-dimensional, so $\ker p$ is discrete. The covering homomorphism therefore has an isomorphism on Lie algebras, and its kernel records only discrete group-level information. [/example] The covering example shows that the differential forgets discrete kernel information, so equality of differentials cannot distinguish maps that differ only on disconnected pieces or covering data. The next theorem gives the positive result that remains when the source is connected: agreement near the identity forces agreement on the whole group. [quotetheorem:8808] [citeproof:8808] Determination by the differential gives uniqueness, but it does not give existence of a homomorphism with a prescribed differential. The connectedness hypothesis is necessary: from $\mathbb Z/2\mathbb Z$ viewed as a zero-dimensional Lie group to itself, both the identity homomorphism and the constant homomorphism to the identity element have the same zero differential, but they are not the same map. Even for connected groups, existence has a separate obstruction. 
For example, a linear map $A:\mathbb R\to\mathbb R$ integrates on universal covers, but to descend to a homomorphism $S^1\to S^1$ it must respect the period lattice. The natural existence problem is whether a Lie algebra homomorphism can be integrated, and the obstruction is global topology rather than local calculus. [quotetheorem:8809] [citeproof:8809] This integration theorem is the converse direction to functoriality under the strongest source-topology hypothesis. Simple connectedness is the condition that removes path-continuation ambiguity; without it, integration may exist on the universal covering group but fail to descend to the original group. Concretely, identify $S^1$ with $\mathbb R/2\pi\mathbb Z$. A Lie algebra map $A:\mathbb R\to\mathbb R$ given by $A(t)=\lambda t$ descends to a homomorphism $S^1\to S^1$ only when $\lambda\in\mathbb Z$, because going once around the source circle must map to an integral number of turns in the target circle. Thus the theorem does not say that every Lie algebra homomorphism integrates for every connected source; it says that the only obstruction has disappeared when the source is simply connected. ## The Adjoint Example and the Double Cover $SU(2) \to SO(3)$ The most important example of a homomorphism built from conjugation is the adjoint representation. It turns a Lie group into linear transformations of its own Lie algebra, and in compact matrix groups it often reveals covering maps onto rotation groups. [definition: Adjoint Representation] Let $G$ be a Lie group with Lie algebra $\mathfrak g$. For each $g\in G$, define $C_g:G\to G$ by \begin{align*} C_g(x)=gxg^{-1}. \end{align*} The adjoint representation is the homomorphism \begin{align*} \operatorname{Ad}:G\to GL(\mathfrak g), \qquad \operatorname{Ad}_g=d(C_g)_e. \end{align*} [/definition] Conjugation fixes the identity, so its differential acts on $T_eG=\mathfrak g$. 
Differentiating the adjoint representation itself gives, for each $X\in\mathfrak g$, a linear map $\operatorname{ad}_X:\mathfrak g\to\mathfrak g$ defined by $\operatorname{ad}_X(Y)=[X,Y]$. Equivalently, $\operatorname{ad}:\mathfrak g\to\mathfrak{gl}(\mathfrak g)$ sends $X$ to $\operatorname{ad}_X$. [example: The Double Cover from $SU(2)$ to $SO(3)$] The Lie algebra \begin{align*} \mathfrak{su}(2)=\{X\in M_2(\mathbb C):X^*=-X,\operatorname{tr}(X)=0\} \end{align*} is real three-dimensional. Let $e_{ij}$ denote the matrix unit with $1$ in the $(i,j)$ entry and $0$ elsewhere. A real basis is \begin{align*} E_1=i(e_{12}+e_{21}),\quad E_2=e_{12}-e_{21},\quad E_3=i(e_{11}-e_{22}). \end{align*} The inner product \begin{align*} (X,Y)=-\frac{1}{2}\operatorname{tr}(XY) \end{align*} is preserved by conjugation in $SU(2)$: if $U\in SU(2)$, then $U^{-1}=U^*$, and cyclicity of trace gives \begin{align*} (UXU^{-1},UYU^{-1})=-\frac{1}{2}\operatorname{tr}(UXU^{-1}UYU^{-1})=-\frac{1}{2}\operatorname{tr}(UXYU^{-1})=-\frac{1}{2}\operatorname{tr}(XY)=(X,Y). \end{align*} Thus $\operatorname{Ad}_U$ is orthogonal on $\mathfrak{su}(2)$. Since $SU(2)$ is connected and $U\mapsto\det(\operatorname{Ad}_U)$ is continuous with value $1$ at $U=I$, the determinant is always $1$. Therefore the adjoint representation lands in \begin{align*} SO(\mathfrak{su}(2))\cong SO(3). \end{align*} We compute the kernel. If $U\in\ker(\operatorname{Ad})$, then $UXU^{-1}=X$ for every $X\in\mathfrak{su}(2)$, equivalently $UX=XU$ for every $X\in\mathfrak{su}(2)$. Write $U=(u_{ij})$. Commuting with $E_3=i(e_{11}-e_{22})$ forces the off-diagonal entries to vanish, because the $(1,2)$ entries of $UE_3$ and $E_3U$ are $-iu_{12}$ and $iu_{12}$, while the $(2,1)$ entries are $iu_{21}$ and $-iu_{21}$. Hence $u_{12}=u_{21}=0$, so $U$ is diagonal. Commuting with $E_1=i(e_{12}+e_{21})$ then forces the two diagonal entries to be equal, because the $(1,2)$ entries of $UE_1$ and $E_1U$ are $iu_{11}$ and $iu_{22}$. Thus $U=aI$. 
Since $U\in SU(2)$, we have $|a|=1$ and $\det(aI)=a^2=1$, so $a=\pm1$. Hence \begin{align*} \ker(\operatorname{Ad})=\{I,-I\}. \end{align*} Now compute the differential. For $X,Y\in\mathfrak{su}(2)$, \begin{align*} d(\operatorname{Ad})_I(X)(Y)=\frac{d}{dt}\bigg|_{t=0}\operatorname{Ad}_{\exp(tX)}(Y)=\frac{d}{dt}\bigg|_{t=0}\exp(tX)Y\exp(-tX)=XY-YX=[X,Y]. \end{align*} Using $e_{ij}e_{kl}=\delta_{jk}e_{il}$, the commutators of the basis elements are \begin{align*} [E_1,E_2]=-2E_3,\quad [E_2,E_3]=-2E_1,\quad [E_3,E_1]=-2E_2. \end{align*} If $X=x_1E_1+x_2E_2+x_3E_3$ and $[X,Y]=0$ for every $Y\in\mathfrak{su}(2)$, then taking $Y=E_1$ gives \begin{align*} 0=[X,E_1]=2x_2E_3-2x_3E_2. \end{align*} Since $E_2,E_3$ are linearly independent, $x_2=x_3=0$. Taking $Y=E_2$ gives \begin{align*} 0=[X,E_2]=-2x_1E_3+2x_3E_1. \end{align*} Since $x_3=0$ and $E_3$ is nonzero, $x_1=0$. Therefore $X=0$, so $d(\operatorname{Ad})_I$ is injective. Both $\mathfrak{su}(2)$ and $\mathfrak{so}(3)$ have real dimension $3$, so $d(\operatorname{Ad})_I$ is an isomorphism. Because the differential is an isomorphism, $\operatorname{Ad}$ is a local diffeomorphism near the identity, so its image contains an open neighbourhood of the identity in $SO(3)$. The image is a subgroup, hence it is open in $SO(3)$. It is also compact because it is the continuous image of compact $SU(2)$, and therefore it is closed. Since $SO(3)$ is connected, the image is all of $SO(3)$. Thus $\operatorname{Ad}:SU(2)\to SO(3)$ is surjective with kernel $\{I,-I\}$, so every fiber has two elements and the local diffeomorphism is a two-to-one covering homomorphism. [/example] This example illustrates several themes at once. The differential detects the local isomorphism, the kernel records the lost discrete information, and simple connectedness explains why $SU(2)$ is the natural group integrating the Lie algebra of rotations. 
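The computations in the double-cover example can be spot-checked numerically. The following sketch uses only the Python standard library. It assumes the basis $E_1,E_2,E_3$ and the inner product $(X,Y)=-\tfrac12\operatorname{tr}(XY)$ from the example; the sample matrix `U` and the helper names (`ad_matrix`, `det3`, `max_diff`, and so on) are illustrative choices, not part of the notes. It checks the three commutator relations, then checks that the matrix of $\operatorname{Ad}_U$ in this basis is orthogonal with determinant $1$ and that $U$ and $-U$ give the same rotation.

```python
import cmath
import math


def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def bracket(A, B):
    """Matrix commutator [A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]


def scale(c, A):
    return [[c * x for x in row] for row in A]


def dagger(A):
    """Conjugate transpose."""
    n = len(A)
    return [[A[j][i].conjugate() for j in range(n)] for i in range(n)]


def max_diff(A, B):
    """Largest entrywise distance between two matrices of the same shape."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))


# The real basis of su(2) used in the example.
E1 = [[0j, 1j], [1j, 0j]]
E2 = [[0j, 1 + 0j], [-1 + 0j, 0j]]
E3 = [[1j, 0j], [0j, -1j]]
BASIS = [E1, E2, E3]


def inner(X, Y):
    """Inner product (X, Y) = -tr(XY)/2 on su(2)."""
    XY = matmul(X, Y)
    return (-0.5 * (XY[0][0] + XY[1][1])).real


def ad_matrix(U):
    """3x3 real matrix of Ad_U; column j holds the coordinates of U E_j U^*."""
    U_inv = dagger(U)  # valid because U is unitary
    images = [matmul(matmul(U, E), U_inv) for E in BASIS]
    return [[inner(BASIS[i], images[j]) for j in range(3)] for i in range(3)]


def det3(R):
    return (R[0][0] * (R[1][1] * R[2][2] - R[1][2] * R[2][1])
            - R[0][1] * (R[1][0] * R[2][2] - R[1][2] * R[2][0])
            + R[0][2] * (R[1][0] * R[2][1] - R[1][1] * R[2][0]))


# Commutator relations [E1,E2] = -2 E3, [E2,E3] = -2 E1, [E3,E1] = -2 E2.
bracket_errors = [
    max_diff(bracket(E1, E2), scale(-2, E3)),
    max_diff(bracket(E2, E3), scale(-2, E1)),
    max_diff(bracket(E3, E1), scale(-2, E2)),
]

# An arbitrary sample element of SU(2): |a|^2 + |b|^2 = 1.
a = math.cos(0.3) * cmath.exp(0.7j)
b = math.sin(0.3) * cmath.exp(-1.1j)
U = [[a, -b.conjugate()], [b, a.conjugate()]]

R = ad_matrix(U)
gram = [[sum(R[k][i] * R[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)]
identity3 = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]

orthogonality_error = max_diff(gram, identity3)  # measures R^T R - I
determinant = det3(R)
sign_error = max_diff(ad_matrix(scale(-1, U)), R)  # Ad_{-U} versus Ad_U
```

A single sample matrix proves nothing, of course; the value of the check is that it exercises the sign conventions in the commutator table and the coordinate formula for $\operatorname{Ad}_U$, which are easy to get wrong by hand.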
The functorial viewpoint makes it natural to ask when a subgroup inherited from a matrix group is itself a Lie group. Cartan's closed subgroup theorem provides the structural justification, turning the closed-subgroup condition into the key criterion for Lie subgroup structure. # 7. Cartan's Closed Subgroup Theorem This chapter proves the structural result that justifies the definition of matrix Lie groups used throughout the course. Earlier chapters treated closed subgroups of $GL(n,\mathbb C)$ as matrix groups, used the exponential map from Chapter 2, and computed their Lie algebras by differentiating one-parameter subgroups as in Chapter 3. Cartan's closed subgroup theorem says that the topological condition of closedness already forces the smooth manifold structure, so no separate chart construction is needed for each classical group. The proof is local near the identity. The exponential map identifies the directions that remain inside the subgroup, the inverse function theorem separates those directions from transverse ones, and closedness rules out hidden subgroup elements accumulating transversely. ## The Problem of Smooth Structure Suppose $G \le GL(n,\mathbb C)$ is already known to be a matrix Lie group, and $H \le G$ is a closed subgroup in the subspace topology. The algebraic statement $H \le G$ does not by itself say that $H$ has charts, tangent spaces, or smooth multiplication. The question is whether closedness supplies enough analytic control to make $H$ into an embedded Lie subgroup. [definition: Closed Subgroup] Let $G$ be a matrix Lie group. A subgroup $H \le G$ is a closed subgroup if $H$ is closed as a subset of the topological space $G$. [/definition] Closedness is a topological condition inside the ambient group, not an additional smoothness assumption. In practice it is often checked by writing $H$ as the zero set of continuous matrix equations. 
[example: Orthogonal Group As A Closed Subgroup] Inside $GL(n,\mathbb R)$, define \begin{align*} O(n)=\{A\in GL(n,\mathbb R):A^\top A=I\}. \end{align*} This is a subgroup: $I^\top I=I$, and if $A^\top A=I$ and $B^\top B=I$, then \begin{align*} (AB)^\top(AB)=B^\top A^\top A B=B^\top I B=B^\top B=I. \end{align*} Also $A^\top A=I$ implies $A^{-1}=A^\top$, so \begin{align*} (A^{-1})^\top A^{-1}=(A^\top)^\top A^\top=AA^\top=A(A^{-1})=I. \end{align*} Thus $A^{-1}\in O(n)$. Now define $F:GL(n,\mathbb R)\to M(n,\mathbb R)$ by $F(A)=A^\top A-I$. Its $(i,j)$-entry is \begin{align*} F(A)_{ij}=\sum_{k=1}^n A_{ki}A_{kj}-\delta_{ij}, \end{align*} a polynomial in the entries of $A$, so $F$ is continuous. Moreover, \begin{align*} F(A)=0 \end{align*} is exactly the condition \begin{align*} A^\top A-I=0, \end{align*} which is equivalent to \begin{align*} A^\top A=I. \end{align*} Hence $O(n)=F^{-1}(\{0\})$. Since $\{0\}$ is closed in $M(n,\mathbb R)$ and $F$ is continuous, $O(n)$ is closed in $GL(n,\mathbb R)$. Cartan's theorem will therefore give this closed subgroup its Lie group structure without requiring separate chart constructions. [/example] This example illustrates why the theorem is useful: many natural symmetry groups are defined by preserving tensors, forms, or flags, and those preservation conditions are closed equations. The next step is to identify the correct candidate for the Lie algebra of such a subgroup. ## The Infinitesimal Subgroup A tangent vector to $H$ at the identity should be a tangent vector $X\in\mathfrak g$ whose exponential curve stays in $H$. Since the manifold structure on $H$ is not yet known, this condition must be phrased using only the already existing exponential map of $G$. [definition: Infinitesimal Subgroup] Let $G$ be a matrix Lie group with Lie algebra $\mathfrak g$, and let $H\le G$ be a subgroup. Define \begin{align*} \mathfrak h=\{X\in\mathfrak g:\exp(tX)\in H\text{ for all }t\in\mathbb R\}. 
\end{align*} [/definition] The definition records exactly the one-parameter subgroups of $G$ that are contained in $H$. To use this set as the tangent space of $H$, we need it to be stable under linear combinations and Lie brackets; otherwise it would not carry the algebraic structure expected of a Lie algebra. [quotetheorem:8810] [citeproof:8810] Closedness is the key hypothesis here: without it, a limit of products in $H$ might land outside $H$. For example, an irrational-slope subgroup of a torus is a subgroup whose closure is the whole torus, so limiting arguments inside the ambient group produce elements not belonging to the subgroup. The subgroup hypothesis is also essential, because the product and commutator approximations use closure under multiplication and inverse at every finite stage; a closed subset of $G$ containing the relevant exponential curves need not contain their products. The ambient matrix Lie group structure supplies the exponential map, the [product topology](/page/Product%20Topology), and the convergence statements used in the limit formulae, so the theorem is not a statement about arbitrary topological groups. Its limitation is equally important: it does not construct charts on $H$ and it does not identify every abstract subgroup with a Lie subgroup. It only proves that the infinitesimal directions forced by one-parameter subgroups form a Lie subalgebra, which is the algebraic input needed for the next step, where local coordinates separate the $\mathfrak h$ directions from complementary transverse directions. [example: Lie Algebra Of The Orthogonal Group] For $H=O(n)\le GL(n,\mathbb R)$, the infinitesimal subgroup condition says that $X\in\mathfrak h$ exactly when $\exp(tX)\in O(n)$ for every $t\in\mathbb R$, equivalently \begin{align*} \exp(tX)^\top\exp(tX)=I \end{align*} for every $t$. 
Since $\frac{d}{dt}\exp(tX)\vert_{t=0}=X$ and $\frac{d}{dt}\exp(tX)^\top\vert_{t=0}=X^\top$, differentiating the left side at $t=0$ gives \begin{align*} \frac{d}{dt}\bigl(\exp(tX)^\top\exp(tX)\bigr)\big\vert_{t=0}=X^\top I+IX=X^\top+X. \end{align*} The right side is the constant matrix $I$, whose derivative is $0$, so every such $X$ satisfies \begin{align*} X^\top+X=0. \end{align*} Thus the only possible Lie algebra is \begin{align*} \mathfrak o(n)=\{X\in M(n,\mathbb R):X^\top+X=0\}. \end{align*} Conversely, suppose $X^\top+X=0$, so $X^\top=-X$. For each $m\ge 0$, \begin{align*} (X^m)^\top=(X^\top)^m=(-X)^m. \end{align*} Using the power series for the matrix exponential, \begin{align*} \exp(tX)^\top=\sum_{m=0}^\infty \frac{t^m(X^m)^\top}{m!}=\sum_{m=0}^\infty \frac{t^m(-X)^m}{m!}=\exp(-tX). \end{align*} Therefore \begin{align*} \exp(tX)^\top\exp(tX)=\exp(-tX)\exp(tX)=\exp(0)=I, \end{align*} because $-tX$ and $tX$ commute. Hence $\exp(tX)\in O(n)$ for every $t$, so the Lie algebra is exactly the space of skew-symmetric matrices. [/example] This computation matches the tangent-space calculation for orthogonal groups in Chapter 3. Cartan's theorem explains why such calculations are legitimate for every closed subgroup, not only for examples where charts have already been constructed. ## Local Coordinates Adapted To A Subgroup The proof of Cartan's theorem now asks for coordinates in $G$ that distinguish the subgroup directions from the transverse directions. Since $\mathfrak h$ is a vector subspace of $\mathfrak g$, choose a linear complement $\mathfrak m$ with \begin{align*} \mathfrak g=\mathfrak h\oplus\mathfrak m. \end{align*} The local coordinate map is obtained by exponentiating first in the subgroup direction and then in the transverse direction. [definition: Adapted Exponential Chart] Let $\mathfrak g=\mathfrak h\oplus\mathfrak m$ be a direct sum of real vector spaces. 
The adapted exponential map is the function \begin{align*} \Phi:\mathfrak h\oplus\mathfrak m\to G,\quad (X,Y)\mapsto \exp(X)\exp(Y). \end{align*} [/definition] The adapted exponential chart packages the chosen splitting into a product parametrisation of elements near the identity. The next theorem addresses the remaining local-coordinate problem: after choosing a complement to $\mathfrak h$, can every nearby group element be decomposed into a unique subgroup part and a unique transverse part? This uniqueness is what will later let us prove that the transverse part of a nearby element of $H$ vanishes. [quotetheorem:8811] [citeproof:8811] These coordinates reduce the problem to understanding which points of $H$ near the identity have a nonzero transverse coordinate $Y$. The complement hypothesis is doing two separate jobs. First, the derivative $D\Phi_{(0,0)}(X,Y)=X+Y$ is an isomorphism only because every element of $\mathfrak g$ has a unique decomposition into an $\mathfrak h$ part and an $\mathfrak m$ part. If $\mathfrak m$ were not a complement, the derivative would either miss directions in $\mathfrak g$ or identify two different pairs $(X,Y)$, and the inverse function theorem would not give a coordinate chart. Second, that same uniqueness is what makes the local coordinates capable of detecting whether a nearby element lies entirely in the subgroup direction. The inverse function theorem gives only a local statement, and this restriction is unavoidable: for instance, on the circle group $S^1$, the exponential parameter $\theta\mapsto e^{i\theta}$ is not globally injective, so product expressions built from exponentials cannot be expected to be unique far from the identity. The theorem is therefore local near $e$, it does not use closedness of $H$, and it cannot by itself prevent nonzero transverse elements of $H$ from accumulating at the identity. That exclusion is the next lemma, where the closed-subgroup hypothesis begins to control the adapted chart.
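The derivative formula $D\Phi_{(0,0)}(X,Y)=X+Y$ can be illustrated with a finite-difference quotient. The sketch below is written under stated assumptions: the ambient group is taken to be $GL(2,\mathbb R)$, the subalgebra $\mathfrak h$ is taken to be the skew-symmetric matrices, and the complement $\mathfrak m$ is taken to be the symmetric matrices. These choices, the truncated power series in `expm`, and all helper names are illustrative and not part of the notes.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def scale(c, A):
    return [[c * x for x in row] for row in A]


def expm(A, terms=30):
    """Matrix exponential via a truncated power series (adequate for small A)."""
    result = identity(len(A))
    term = identity(len(A))
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, A))  # term is now A^k / k!
        result = add(result, term)
    return result


def phi(X, Y):
    """Adapted exponential map (X, Y) -> exp(X) exp(Y)."""
    return matmul(expm(X), expm(Y))


X = [[0.0, -1.0], [1.0, 0.0]]   # skew-symmetric: the assumed h direction
Y = [[1.0, 2.0], [2.0, -3.0]]   # symmetric: the assumed complement m

s = 1e-5
P = phi(scale(s, X), scale(s, Y))
I2 = identity(2)

# Difference quotient (Phi(sX, sY) - I) / s, which should approach X + Y.
quotient = [[(P[i][j] - I2[i][j]) / s for j in range(2)] for i in range(2)]
target = add(X, Y)
error = max(abs(quotient[i][j] - target[i][j])
            for i in range(2) for j in range(2))
```

The remaining `error` is of order $s$, coming from the second-order terms $\tfrac12X^2+XY+\tfrac12Y^2$ in the product of the two exponential series. This is consistent with $X+Y$ being the first-order part of $\Phi$ at the origin.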
[quotetheorem:8812] [citeproof:8812] This lemma is the analytic heart of the proof. Closedness is necessary for the limiting step: the irrational-slope one-parameter subgroup of a torus is not closed, and its elements accumulate in directions belonging to the ambient torus rather than to the subgroup as an embedded submanifold. The complement hypothesis is also necessary. If $\mathfrak m$ met $\mathfrak h$ in a nonzero vector, then any nonzero $Y\in\mathfrak m\cap\mathfrak h$ would satisfy $\exp(tY)\in H$ for all $t$, so no neighbourhood $V_{\mathfrak m}$ of $0$ in $\mathfrak m$ could satisfy $H\cap\exp(V_{\mathfrak m})=\{e\}$. The local transverse formulation is part of the content as well: even for a closed subgroup, $\exp(\mathfrak m)$ may meet $H$ again far from the identity because exponential maps can be periodic or otherwise noninjective outside a small chart. Thus the lemma neither describes global intersections nor controls overlap among distant exponential patches. What it gives is exactly the missing ingredient for Cartan's theorem: in adapted coordinates near $e$, membership in $H$ will force the transverse coordinate to vanish. ## Cartan's Closed Subgroup Theorem The remaining obstacle is to turn the infinitesimal subalgebra $\mathfrak h$ into actual charts on $H$. Adapted coordinates give nearby elements of $G$ the form $\exp(X)\exp(Y)$, but a priori a nearby element of $H$ could still have a nonzero transverse coordinate. This is where closed subgroups and dense nonclosed subgroups behave differently: a closed subgroup has no hidden sequence of transverse elements converging to the identity, while a dense irrational line in a torus has elements accumulating in every ambient direction. The no-small-transverse-elements lemma is designed to rule out exactly that local failure. [quotetheorem:8813] [citeproof:8813] Cartan's theorem reduces the smooth-structure problem for a subgroup to a purely topological check, namely closedness.
The closedness hypothesis cannot be weakened to mere subgroup closure under multiplication and inverse, because dense one-parameter subgroups of tori are algebraic subgroups in that sense but are not embedded closed submanifolds with the inherited topology. The conclusion also does not say that $H$ is an open subgroup of $G$ or that its manifold topology agrees with an arbitrary abstract topology placed on the same group; it is the embedded smooth structure coming from the inclusion $H\subset G$. This result justifies the working method used in the next examples: prove a matrix subgroup is closed by equations, then compute its Lie algebra infinitesimally. [quotetheorem:8814] [citeproof:8814] This corollary is often the most convenient form of the theorem, but each hypothesis has a distinct role. Being inside $GL(n,\mathbb C)$ or $GL(n,\mathbb R)$ supplies the ambient matrix Lie group and its exponential map; a closed subset of an arbitrary matrix space is not covered unless it is first viewed as a subgroup of one of these open general linear groups. The subgroup condition is independent of closedness: a zero set of continuous equations may be closed but fail to be closed under multiplication, so it would not inherit group operations from Cartan's theorem. Closedness in the ambient general linear group remains essential as well: a dense cyclic subgroup of a compact matrix group is still a subgroup of some $GL(n,\mathbb C)$, but it is not a closed embedded matrix Lie subgroup with the inherited topology. The result also says nothing about singular closed subsets that are not groups, such as the union of two coordinate axes in a matrix space. In applications it is therefore used in two stages: first verify the algebraic subgroup property and topological closedness inside the relevant general linear group, then use Lie-theoretic tools such as tangent spaces, exponential curves, and homogeneous spaces. 
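The second stage of this workflow, using exponential curves once the subgroup is known to be closed, can be illustrated for the orthogonal group treated earlier in the chapter. The sketch below assumes a particular skew-symmetric matrix `X` and a truncated power series for the matrix exponential; both are illustrative choices, and the helper names are hypothetical. It measures how far $\exp(tX)^\top\exp(tX)$ is from $I$ for a skew-symmetric `X`, and contrasts this with a matrix `N` that is not skew-symmetric.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def scale(c, A):
    return [[c * x for x in row] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def expm(A, terms=60):
    """Matrix exponential via a truncated power series."""
    result = identity(len(A))
    term = identity(len(A))
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, A))  # term is now A^k / k!
        result = add(result, term)
    return result


def orthogonality_defect(A):
    """Largest entry of |A^T A - I|; zero exactly when A is orthogonal."""
    n = len(A)
    G = matmul(transpose(A), A)
    I = identity(n)
    return max(abs(G[i][j] - I[i][j]) for i in range(n) for j in range(n))


# A skew-symmetric matrix: X^T + X = 0, so exp(tX) should stay in O(3).
X = [[0.0, -1.0, 0.5],
     [1.0, 0.0, -2.0],
     [-0.5, 2.0, 0.0]]

# A nilpotent matrix that is not skew-symmetric: exp(N) = I + N leaves O(3).
N = [[0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0],
     [0.0, 0.0, 0.0]]

skew_defects = [orthogonality_defect(expm(scale(t, X))) for t in (0.1, 1.0, 2.5)]
nilpotent_defect = orthogonality_defect(expm(N))
```

The defects for `X` are at the level of floating-point rounding, while the defect for `N` is of size $1$. This matches the statement that the infinitesimal condition for $O(n)$ is $X^\top+X=0$.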
## Classical Groups And Stabilizers Constructing charts separately for every stabilizer would be inefficient: preserving an inner product, a volume form, or a flag usually gives matrix equations before it gives local coordinates. Cartan's theorem turns those equations into a smooth-structure argument once subgroup closure and topological closedness have been checked. This section records the pattern for the classical examples used later in geometry. [example: Unitary And Special Unitary Groups] The unitary group is \begin{align*} U(n)=\{A\in GL(n,\mathbb C):A^*A=I\}, \end{align*} and the special unitary group is \begin{align*} SU(n)=\{A\in U(n):\det A=1\}. \end{align*} If $A,B\in U(n)$, then \begin{align*} (AB)^*(AB)=B^*A^*AB=B^*IB=B^*B=I. \end{align*} Also $A^*A=I$ gives $A^{-1}=A^*$, and hence \begin{align*} (A^{-1})^*A^{-1}=(A^*)^*A^*=AA^*=A(A^{-1})=I. \end{align*} Thus $U(n)$ is a subgroup. Since $\det(AB)=\det A\,\det B$ and $\det(A^{-1})=(\det A)^{-1}$, the condition $\det A=1$ is also stable under products and inverses, so $SU(n)$ is a subgroup. Define $F:GL(n,\mathbb C)\to M(n,\mathbb C)$ by $F(A)=A^*A-I$. Its $(i,j)$-entry is \begin{align*} F(A)_{ij}=\sum_{k=1}^n \overline{A_{ki}}A_{kj}-\delta_{ij}, \end{align*} which is continuous in the real and imaginary parts of the entries of $A$. Hence $U(n)=F^{-1}(\{0\})$ is closed in $GL(n,\mathbb C)$. The determinant is a polynomial in the entries of $A$, so it is continuous, and therefore \begin{align*} SU(n)=U(n)\cap \det^{-1}(\{1\}) \end{align*} is closed in $GL(n,\mathbb C)$. Now let $X$ be an infinitesimal direction for $U(n)$, so $\exp(tX)\in U(n)$ for every real $t$. Then \begin{align*} \exp(tX)^*\exp(tX)=I. \end{align*} Using $\frac{d}{dt}\exp(tX)\vert_{t=0}=X$ and $\frac{d}{dt}\exp(tX)^*\vert_{t=0}=X^*$, differentiating at $t=0$ gives \begin{align*} X^*I+IX=X^*+X=0. \end{align*} Conversely, if $X^*+X=0$, then $X^*=-X$, so for each $m\ge 0$, \begin{align*} (X^m)^*=(X^*)^m=(-X)^m. 
\end{align*} The exponential series gives \begin{align*} \exp(tX)^*=\sum_{m=0}^\infty \frac{t^m(X^m)^*}{m!}=\sum_{m=0}^\infty \frac{t^m(-X)^m}{m!}=\exp(-tX). \end{align*} Since $-tX$ and $tX$ commute, \begin{align*} \exp(tX)^*\exp(tX)=\exp(-tX)\exp(tX)=\exp(0)=I. \end{align*} Thus \begin{align*} \mathfrak u(n)=\{X\in M(n,\mathbb C):X^*+X=0\}. \end{align*} For $SU(n)$ we add the determinant condition. If $\exp(tX)\in SU(n)$ for every $t$, then $\det(\exp(tX))=1$ for every $t$. The identity $\det(\exp Z)=\exp(\operatorname{tr}Z)$ gives \begin{align*} 1=\det(\exp(tX))=\exp(t\,\operatorname{tr}X). \end{align*} Differentiating this scalar identity at $t=0$ gives $\operatorname{tr}X=0$. Conversely, if $X\in\mathfrak u(n)$ and $\operatorname{tr}X=0$, then $\exp(tX)\in U(n)$ by the computation above, and \begin{align*} \det(\exp(tX))=\exp(t\,\operatorname{tr}X)=\exp(0)=1. \end{align*} Therefore \begin{align*} \mathfrak{su}(n)=\{X\in\mathfrak u(n):\operatorname{tr}X=0\}. \end{align*} Both groups are therefore closed matrix subgroups, and their infinitesimal conditions are exactly skew-Hermitian matrices, with the trace-zero condition added in the special unitary case. [/example] The unitary examples show the standard pattern: write the group as the solution set of preservation equations, check that the equations define a subgroup, and then differentiate those equations to find the infinitesimal condition. Skew forms give a different preservation law because the form is alternating rather than Hermitian, but the same closed-subgroup mechanism applies. This example is important because it introduces the symplectic groups, which later connect Lie theory with Hamiltonian geometry. [example: Symplectic Group] Let $J$ be the $2n\times 2n$ block matrix whose diagonal blocks are $0$, whose upper-right block is $I_n$, and whose lower-left block is $-I_n$. The real symplectic group is \begin{align*} Sp(2n,\mathbb R)=\{A\in GL(2n,\mathbb R):A^\top J A=J\}. 
\end{align*} It is a subgroup of $GL(2n,\mathbb R)$. First, \begin{align*} I^\top JI=J. \end{align*} If $A^\top JA=J$ and $B^\top JB=J$, then \begin{align*} (AB)^\top J(AB)=B^\top A^\top JAB=B^\top(A^\top JA)B=B^\top JB=J. \end{align*} If $A^\top JA=J$, then multiplying on the left by $(A^{-1})^\top$ and on the right by $A^{-1}$ gives \begin{align*} (A^{-1})^\top A^\top JAA^{-1}=(A^{-1})^\top JA^{-1}. \end{align*} The left side is $J$, so \begin{align*} (A^{-1})^\top JA^{-1}=J. \end{align*} Thus $A^{-1}\in Sp(2n,\mathbb R)$. Define $F:GL(2n,\mathbb R)\to M(2n,\mathbb R)$ by $F(A)=A^\top JA$. Its $(i,j)$-entry is \begin{align*} F(A)_{ij}=\sum_{p=1}^{2n}\sum_{q=1}^{2n}A_{pi}J_{pq}A_{qj}, \end{align*} a polynomial in the entries of $A$, since the entries $J_{pq}$ are fixed [real numbers](/page/Real%20Numbers). Hence $F$ is continuous, and \begin{align*} Sp(2n,\mathbb R)=F^{-1}(\{J\}). \end{align*} Because $\{J\}$ is closed in $M(2n,\mathbb R)$, the group $Sp(2n,\mathbb R)$ is closed in $GL(2n,\mathbb R)$. By *Cartan's closed subgroup theorem*, it is therefore a matrix Lie group. Now compute its Lie algebra. If $X$ is an infinitesimal direction for $Sp(2n,\mathbb R)$, then for every real $t$, \begin{align*} \exp(tX)^\top J\exp(tX)=J. \end{align*} Differentiating at $t=0$ and using the product rule gives \begin{align*} \left.\frac{d}{dt}\right|_{t=0}\bigl(\exp(tX)^\top J\exp(tX)\bigr)=X^\top JI+I^\top JX=X^\top J+JX. \end{align*} The right side $J$ is constant, so its derivative is $0$. Therefore every infinitesimal direction satisfies \begin{align*} X^\top J+JX=0. \end{align*} Conversely, suppose $X^\top J+JX=0$. Let \begin{align*} C(t)=\exp(tX)^\top J\exp(tX). \end{align*} Since $\frac{d}{dt}\exp(tX)=X\exp(tX)$, we have \begin{align*} C'(t)=\exp(tX)^\top X^\top J\exp(tX)+\exp(tX)^\top JX\exp(tX). \end{align*} Factoring the common terms gives \begin{align*} C'(t)=\exp(tX)^\top (X^\top J+JX)\exp(tX)=\exp(tX)^\top 0\exp(tX)=0. \end{align*} Thus $C(t)$ is constant. 
Since \begin{align*} C(0)=I^\top JI=J, \end{align*} we get \begin{align*} \exp(tX)^\top J\exp(tX)=J \end{align*} for every real $t$. Hence $\exp(tX)\in Sp(2n,\mathbb R)$ for every $t$, and the Lie algebra is exactly \begin{align*} \mathfrak{sp}(2n,\mathbb R)=\{X\in M(2n,\mathbb R):X^\top J+JX=0\}. \end{align*} The symplectic condition is therefore the infinitesimal statement that $X$ preserves the alternating form with matrix $J$ to first order. [/example] The symplectic example still fits the pattern of preserving a single [bilinear form](/page/Bilinear%20Form). Many Lie-theoretic subgroups instead arise by preserving a nested family of subspaces, so the defining equations are incidence conditions rather than tensor-preservation equations. The next example shows that Cartan's theorem applies just as well to these stabilizers, which are typically noncompact and need not be described by orthogonality or unitary conditions. [example: Stabilizer Of A Flag] Let $V_0=0$ and let $0<V_1<\cdots<V_k=\mathbb R^n$ be a flag. Choose a basis $e_1,\ldots,e_n$ adapted to the flag, so that, with $r_i=\dim V_i$, we have \begin{align*} V_i=\operatorname{span}(e_1,\ldots,e_{r_i}). \end{align*} For a matrix $A=(A_{pq})$ acting on column vectors, the condition $A(V_i)\subset V_i$ means that for every $q\le r_i$, the vector $Ae_q$ has no component in the rows $p>r_i$. Since \begin{align*} Ae_q=\sum_{p=1}^n A_{pq}e_p, \end{align*} this condition is exactly \begin{align*} A_{pq}=0\quad\text{whenever }q\le r_i\text{ and }p>r_i. \end{align*} Thus, in the adapted basis, the matrices preserving the flag are precisely the block upper-triangular invertible matrices with block sizes $r_i-r_{i-1}$. The stabilizer is \begin{align*} P=\{A\in GL(n,\mathbb R):A(V_i)=V_i\text{ for each }i\}. \end{align*} It is a subgroup: $I(V_i)=V_i$, and if $A(V_i)=V_i$ and $B(V_i)=V_i$, then \begin{align*} (AB)(V_i)=A(B(V_i))=A(V_i)=V_i. 
\end{align*} Also, if $A(V_i)=V_i$, then applying $A^{-1}$ to both sides gives \begin{align*} A^{-1}(A(V_i))=A^{-1}(V_i), \end{align*} so \begin{align*} V_i=A^{-1}(V_i). \end{align*} Hence $A^{-1}\in P$. The same coordinate equations prove closedness. For each pair $(p,q)$ with $q\le r_i$ and $p>r_i$ for some $i$, the condition $A_{pq}=0$ is the inverse image of $\{0\}$ under the continuous coordinate function $A\mapsto A_{pq}$. Therefore $P$ is the intersection of $GL(n,\mathbb R)$ with finitely many closed coordinate hyperplanes in $M(n,\mathbb R)$, so $P$ is closed in $GL(n,\mathbb R)$. By *Cartan's closed subgroup theorem*, $P$ is a matrix Lie group. Its Lie algebra is the space of matrices preserving the flag infinitesimally. If $X$ is in the Lie algebra, then $\exp(tX)(V_i)=V_i$ for every $t$. For $q\le r_i$ and $p>r_i$, the $(p,q)$-entry of $\exp(tX)$ is therefore $0$ for every $t$. Differentiating that scalar entry at $t=0$ gives \begin{align*} X_{pq}=\left.\frac{d}{dt}\right|_{t=0}(\exp(tX))_{pq}=0. \end{align*} Thus $X(V_i)\subset V_i$ for each $i$. Conversely, suppose $X(V_i)\subset V_i$ for each $i$. Then $X^m(V_i)\subset V_i$ for every $m\ge 0$, by repeated application of $X$. Hence for $v\in V_i$, each partial sum \begin{align*} \sum_{m=0}^N\frac{t^mX^m v}{m!} \end{align*} lies in $V_i$. Since $V_i$ is a finite-dimensional closed subspace of $\mathbb R^n$, the limit also lies in $V_i$, so \begin{align*} \exp(tX)v\in V_i. \end{align*} Applying the same argument to $-tX$ shows $\exp(-tX)(V_i)\subset V_i$, so $\exp(tX)(V_i)=V_i$. Therefore \begin{align*} \operatorname{Lie}(P)=\{X\in M(n,\mathbb R):X(V_i)\subset V_i\text{ for each }i\}, \end{align*} which is exactly the space of block upper-triangular matrices in the adapted basis. [/example] Flag stabilizers are the basic examples of parabolic subgroups in linear Lie theory. 
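The pattern in the three examples above (solve the linearised equation, then exponentiate) can be spot-checked numerically. The sketch below is a sanity check under stated assumptions rather than part of the proofs: the helper names (`mat_exp`, `adjoint`, and so on) are hypothetical, the exponential is a truncated power series, and the sample matrices `X_u`, `X_sp`, `X_flag` are arbitrary choices satisfying $X^*+X=0$ with $\operatorname{tr}X=0$, $X^\top J+JX=0$ for $n=1$, and upper-triangularity respectively.

```python
def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def adjoint(A):
    """Conjugate transpose; for real entries this is the ordinary transpose."""
    return [[complex(A[i][j]).conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mat_exp(X, terms=40):
    """Truncated exponential series: sum of X^m / m! for m < terms."""
    result, term = identity(len(X)), identity(len(X))
    for m in range(1, terms):
        term = [[entry / m for entry in row] for row in mat_mul(term, X)]
        result = [[r + s for r, s in zip(row_r, row_s)]
                  for row_r, row_s in zip(result, term)]
    return result

def max_abs_diff(A, B):
    return max(abs(a - b) for row_a, row_b in zip(A, B)
               for a, b in zip(row_a, row_b))

# Skew-Hermitian with trace zero, so X_u lies in su(2).
X_u = [[1j, 2 + 1j], [-2 + 1j, -1j]]
U = mat_exp(X_u)
unitary_error = max_abs_diff(mat_mul(adjoint(U), U), identity(2))
det_U = U[0][0] * U[1][1] - U[0][1] * U[1][0]

# X_sp satisfies X^T J + J X = 0 for the 2x2 matrix J (the case n = 1).
J = [[0.0, 1.0], [-1.0, 0.0]]
X_sp = [[0.3, 1.2], [-0.7, -0.3]]
A = mat_exp(X_sp)
symplectic_error = max_abs_diff(mat_mul(adjoint(A), mat_mul(J, A)), J)

# Upper-triangular X_flag preserves the standard full flag in R^3.
X_flag = [[1.0, 2.0, -1.0], [0.0, 0.5, 3.0], [0.0, 0.0, -2.0]]
P = mat_exp(X_flag)
below_diagonal = max(abs(P[p][q]) for p in range(3) for q in range(p))

print(unitary_error, abs(det_U - 1), symplectic_error, below_diagonal)
```

All four printed quantities should be zero up to rounding error. The last check is the numerical counterpart of the flag example: exponentials of upper-triangular matrices are again upper-triangular, which is how flag stabilizers acquire their Lie algebras.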
Their construction shows that Cartan's theorem is not merely a tool for compact classical groups, but also for noncompact groups with rich algebraic structure. [remark: Why Closedness Cannot Be Dropped] A subgroup may be dense without being all of the ambient Lie group. For instance, an irrational-slope one-parameter subgroup in a torus has dense image. Such a subgroup is not an embedded closed submanifold with the subspace topology inherited from the torus. Cartan's theorem identifies closedness as the hypothesis that excludes this pathology. [/remark] The chapter leaves us with a robust working principle: when a matrix subgroup is specified by continuous equations, verify subgroup closure and topological closedness, then use Cartan's theorem to obtain the Lie group structure. This is also the bridge from matrix calculations to quotient geometry: once a stabilizer $H\le G$ is known to be closed, later constructions can treat $G/H$ as a homogeneous space rather than merely as a set of cosets. In algebraic examples, the theorem explains why solution sets of polynomial matrix equations often carry the expected Lie group structure after the subgroup and closedness checks have been made. Chapter 8 uses this principle when passing from matrix computations to covering groups and global topology; later courses use the same closed-subgroup method for homogeneous spaces and algebraic groups. With the subgroup theorem in place, we can move from local algebra back to global topology. Connectedness, simple connectedness, and the fundamental group explain which Lie groups share the same local model and which global distinctions remain invisible to the Lie algebra. # 8. Connectedness, Simple Connectedness, and $\pi_1$ This chapter turns the local theory of Lie groups into global topology. Chapters 4 through 6 showed that a Lie algebra controls a neighbourhood of the identity through the exponential map, the Baker--Campbell--Hausdorff law, and local homomorphism theorems. 
We now ask what extra information is needed to recover the whole group, and the answer is topological: connected components, fundamental groups, covering groups, and discrete central quotients. ## Connected Components of Lie Groups The first global question is whether the piece of a Lie group containing the identity behaves like a subgroup. For a general topological space, connected components are only closed pieces of the space; for a Lie group, multiplication and inversion force the identity component to carry algebraic structure. [definition: Identity Component] Let $G$ be a Lie group. The identity component $G_0$ is the connected component of the identity element $e \in G$. [/definition] The point of isolating $G_0$ is that it is the largest connected part of $G$ visible from the identity, but connected components in an arbitrary topological space have no reason to respect any extra structure. In a Lie group, translations and conjugations are homeomorphisms, so they move connected components in a controlled way. The next issue is whether this topological piece is compatible with multiplication, since only then can disconnected Lie groups be reduced to a connected subgroup together with a component quotient. [quotetheorem:8815] [citeproof:8815] The local connectedness hypothesis is doing real work: in a general topological group, the identity component need not be open, so the quotient by it need not be discrete. The theorem also does not say that $G$ splits as a direct or semidirect product of $G_0$ with the component group; it only gives a canonical normal subgroup and a discrete quotient. Thus disconnectedness in a Lie group is not arbitrary, but it may still encode nontrivial extension data beyond the set of components. Since $G_0$ is open, the quotient group $G/G_0$ is discrete and its elements are exactly the connected components. [example: Components of the Orthogonal Group] Let $n\ge 1$ and let $A\in O(n)$. 
Since $A^\top A=I$, taking determinants gives \begin{align*} \det(A^\top A)=\det(I). \end{align*} Using $\det(A^\top)=\det(A)$ and multiplicativity of the determinant, this becomes \begin{align*} \det(A)^2=1. \end{align*} Thus $\det(A)\in\{\pm 1\}$, so the continuous map $\det:O(n)\to\{\pm1\}$ separates $O(n)$ into the two subsets $\det^{-1}(1)$ and $\det^{-1}(-1)$. Its kernel is \begin{align*} \ker(\det)=\{A\in O(n):\det A=1\}=SO(n). \end{align*} Because $\det$ is continuous with discrete target, it is constant on each connected component. The identity matrix has determinant $1$, so the identity component lies in $SO(n)$. Conversely, every element of $SO(n)$ can be joined to $I$ by a path in $SO(n)$: after an orthogonal change of basis, it is block diagonal with planar rotation blocks $R_{\theta}$, diagonal entries $+1$, and an even number of diagonal entries $-1$ (even because the determinant is $1$), and each pair of $-1$ entries is the rotation block $R_{\pi}$ on a two-dimensional plane. Replacing each $R_{\theta}$ by $R_{(1-t)\theta}$ for $0\le t\le 1$ gives a path from the matrix to $I$ with determinant always $1$. Hence the identity component is $SO(n)$. If $D=\operatorname{diag}(-1,1,\dots,1)$, then $\det D=-1$ and every determinant $-1$ matrix $A\in O(n)$ has the form $DB$ with $B\in SO(n)$: since $D^2=I$, the matrix $B=DA$ satisfies $DB=A$, is orthogonal, and has determinant $(-1)(-1)=1$. Conversely $\det(DB)=(-1)(1)=-1$ for every $B\in SO(n)$. Therefore the other component is the translate $D\,SO(n)$, which is connected because left translation by $D$ is a homeomorphism, and \begin{align*} O(n)/SO(n)\cong \{\pm1\}\cong \mathbb Z/2\mathbb Z. \end{align*} The quotient records the only component-level distinction: determinant $1$ matrices preserve orientation, while determinant $-1$ matrices reverse it. [/example] The quotient by the identity component is useful because many Lie-theoretic constructions reduce first to the connected group $G_0$, with the discrete component group acting by conjugation. This reduction has a sharp boundary: infinitesimal methods can analyse $G_0$, but they cannot recover how the other components are attached. The remark below records this loss of information at the level of Lie algebras.
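The two steps of the orthogonal-group example, the path $R_{(1-t)\theta}$ inside $SO(n)$ and the factorisation of a determinant $-1$ matrix as $DB$, can be checked numerically in the smallest interesting case $n=2$. This is a minimal sketch under stated assumptions: the angle values and the helper names (`rotation`, `orthogonality_defect`) are illustrative choices, not notation from the notes.

```python
import math

def rotation(theta):
    """Planar rotation block R_theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def det2(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def orthogonality_defect(A):
    """Largest entry of A^T A - I in absolute value."""
    At = [[A[j][i] for j in range(2)] for i in range(2)]
    G = mat_mul(At, A)
    return max(abs(G[i][j] - (1.0 if i == j else 0.0))
               for i in range(2) for j in range(2))

theta = 2.5   # arbitrary sample angle
steps = 50
# Path t -> R_{(1-t) theta} from R_theta (t = 0) to the identity (t = 1).
path = [rotation((1 - k / steps) * theta) for k in range(steps + 1)]
path_dets = [det2(P) for P in path]
path_defects = [orthogonality_defect(P) for P in path]

# A determinant -1 matrix A factors as D B with B = D A in SO(2), since D^2 = I.
D = [[-1.0, 0.0], [0.0, 1.0]]
A = [[math.cos(0.8), math.sin(0.8)],
     [math.sin(0.8), -math.cos(0.8)]]   # a reflection, so det A = -1
B = mat_mul(D, A)

print(min(path_dets), max(path_dets), det2(A), det2(B))
```

Every matrix on the sampled path is orthogonal with determinant $1$ up to rounding, while the sign of the determinant distinguishes $A$ from $B$ and cannot change along any continuous path in $O(2)$.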
[remark: Lie Algebra of a Disconnected Lie Group] If $G$ is any Lie group, then $\mathfrak g = T_eG$ is also the Lie algebra of $G_0$. The Lie algebra therefore sees only the identity component, and it cannot detect how many other components the group has. [/remark] ## Fundamental Groups of Classical Lie Groups The next question is how much topology remains after restricting to connected Lie groups. The fundamental group $\pi_1(G,e)$ measures loops based at the identity up to homotopy, and for a connected Lie group it is the first obstruction to simple connectedness. [definition: Fundamental Group] Let $X$ be a topological space and $x_0 \in X$. The fundamental group $\pi_1(X,x_0)$ is the group of homotopy classes of loops $\gamma:[0,1]\to X$ with $\gamma(0)=\gamma(1)=x_0$, under concatenation. [/definition] For a connected Lie group the base point is often omitted, because changing base point gives an isomorphic group. We need concrete computations now: they show which classical groups are already simply connected and which ones must be replaced by a covering group. [quotetheorem:8816] [citeproof:8816] These computations illustrate an important distinction: groups with very similar Lie algebras may have different fundamental groups. The theorem is limited to the listed identity components and low-rank cases; it does not classify all real forms or all quotients with the same Lie algebra. For instance, $SL(2,\mathbb R)$ and $SU(2)$ both have three-dimensional simple Lie algebras over $\mathbb R$, but their topology is different: $SL(2,\mathbb R)$ has fundamental group $\mathbb Z$, while $SU(2)$ is simply connected. The examples $SO(3)$ and $SU(2)$ show the sharper phenomenon that the same infinitesimal rotations can occur in two connected groups, only one of which is simply connected. This is the topology behind spin groups and projective representations, where a representation of a quotient group may lift only after passing to a covering group. 
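For the circle group $U(1)\cong SO(2)$, the homotopy class of a based loop is detected by an integer, obtained by lifting the loop to the real line and reading off the endpoint. The following sketch illustrates that lifting procedure on sampled loops. It is a hedged illustration: the function names `winding_number`, `standard_loop`, and `wobbly_loop` are hypothetical, and the method assumes the samples are fine enough that consecutive points differ by less than half a turn.

```python
import cmath
import math

def winding_number(loop_points):
    """Lift a sampled loop in U(1) to R and return the net number of turns.

    Each step contributes the angle of z1 / z0, chosen in (-pi, pi]; summing
    these increments is a discrete version of lifting along s -> exp(i s).
    """
    total_angle = 0.0
    for z0, z1 in zip(loop_points, loop_points[1:]):
        total_angle += cmath.phase(z1 / z0)
    return round(total_angle / (2 * math.pi))

def standard_loop(k, samples=400):
    """Samples of t -> exp(2 pi i k t) on [0, 1]."""
    return [cmath.exp(2j * math.pi * k * j / samples) for j in range(samples + 1)]

def wobbly_loop(k, samples=400):
    """A based loop homotopic to the standard one: the lift is perturbed by a
    term that vanishes at both endpoints."""
    points = []
    for j in range(samples + 1):
        t = j / samples
        lift = k * t + 0.3 * math.sin(2 * math.pi * t)
        points.append(cmath.exp(2j * math.pi * lift))
    return points

for k in range(-3, 4):
    print(k, winding_number(standard_loop(k)), winding_number(wobbly_loop(k)))
```

Both loops in each row report the same integer $k$, consistent with the winding number being an invariant of the based homotopy class.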
[example: Loops in the Circle Group] Identify $SO(2)$ with $U(1)$ by sending the rotation through angle $s$ to $e^{is}$. For an integer $k$, define \begin{align*} \gamma_k(t)=e^{2\pi i kt},\qquad 0\le t\le 1. \end{align*} Then $\gamma_k(0)=e^0=1$ and $\gamma_k(1)=e^{2\pi i k}=1$, so $\gamma_k$ is a loop based at $1$. It lifts along the covering map $p:\mathbb R\to U(1)$, $p(s)=e^{is}$, to \begin{align*} \widetilde\gamma_k(t)=2\pi kt, \end{align*} because $p(\widetilde\gamma_k(t))=e^{2\pi i kt}=\gamma_k(t)$. The lifted path starts at $\widetilde\gamma_k(0)=0$ and ends at $\widetilde\gamma_k(1)=2\pi k$, so the integer $k$ is the [winding number](/page/Winding%20Number) of the loop. If two based loops are homotopic through based loops, lift the homotopy to $\mathbb R$ starting at $0$. At $t=1$ the lifted endpoint must always lie in $p^{-1}(1)=2\pi\mathbb Z$, and a continuous path into the discrete set $2\pi\mathbb Z$ is constant. Thus the endpoint $2\pi k$, and therefore $k$, is unchanged by based homotopy. Hence $\gamma_k$ represents the class $k\in\mathbb Z$, and $\pi_1(U(1),1)$ records exactly the integer winding number of loops. [/example] The case of $SO(3)$ is geometrically different because a $2\pi$ rotation is not homotopic to the constant loop in $SO(3)$, while a $4\pi$ rotation is. The two-to-one cover $SU(2)\to SO(3)$ packages this phenomenon algebraically. [example: The Double Cover of Rotations] View $SU(2)$ as the unit quaternions $S^3\subset \mathbb H$, with $\pm I$ corresponding to the quaternions $\pm 1$. For a unit quaternion $q$, define \begin{align*} \rho_q:\operatorname{Im}\mathbb H\to \operatorname{Im}\mathbb H,\qquad \rho_q(v)=qvq^{-1}. \end{align*} If $v\in\operatorname{Im}\mathbb H$, then $\overline v=-v$, and since $q$ is unit we have $q^{-1}=\overline q$. Hence \begin{align*} \overline{qvq^{-1}}=\overline{q^{-1}}\,\overline v\,\overline q=q(-v)q^{-1}=-qvq^{-1}. \end{align*} Thus $qvq^{-1}$ is again imaginary. 
The norm is preserved because the quaternion norm is multiplicative: \begin{align*} |qvq^{-1}|=|q|\,|v|\,|q^{-1}|=1\cdot |v|\cdot 1=|v|. \end{align*} So $\rho_q$ is an orthogonal linear map of $\operatorname{Im}\mathbb H\cong \mathbb R^3$. The map $q\mapsto \det(\rho_q)$ is continuous from the [connected space](/page/Connected%20Space) $S^3$ to $\{\pm1\}$, and $\det(\rho_1)=1$, so $\det(\rho_q)=1$ for every unit quaternion $q$. Therefore $\rho_q\in SO(3)$. The quaternions $q$ and $-q$ give the same rotation because $(-q)^{-1}=-q^{-1}$, so \begin{align*} \rho_{-q}(v)=(-q)v(-q)^{-1}=(-q)v(-q^{-1})=qvq^{-1}=\rho_q(v). \end{align*} Conversely, suppose $\rho_q=\rho_r$. Then $s=r^{-1}q$ satisfies $svs^{-1}=v$ for every $v\in\operatorname{Im}\mathbb H$, so $sv=vs$ for $v=i,j,k$. Write $s=a+bi+cj+dk$. From $si=is$ we get \begin{align*} ai-b-c k+d j=ai-b+c k-d j, \end{align*} so $c=d=0$. From $sj=js$ we then get \begin{align*} aj+b k=aj-b k, \end{align*} so $b=0$. Hence $s=a$ is real. Since $s$ is unit, $a^2=1$, so $s=\pm1$ and therefore $q=\pm r$. Thus the homomorphism $q\mapsto \rho_q$ has kernel $\{\pm1\}$, and its image is all of $SO(3)$ because every rotation in $\mathbb R^3$ is rotation by some angle $\theta$ about some unit axis $u\in\operatorname{Im}\mathbb H$, realised by \begin{align*} q=\cos(\theta/2)+u\sin(\theta/2). \end{align*} Consequently \begin{align*} SU(2)/\{\pm I\}\cong SO(3). \end{align*} The double cover identifies antipodal unit quaternions, so a single rotation in $SO(3)$ has exactly two lifts in $SU(2)$. [/example] ## Simply Connected Covers The computations above suggest the central construction of the chapter: replace a connected Lie group by a simply connected covering group. The problem is to do this inside the category of Lie groups, not merely as a covering space of manifolds. [definition: Simply Connected Lie Group] A Lie group $G$ is simply connected if it is connected and $\pi_1(G,e)=0$. 
[/definition] Simple connectedness of the underlying space is the topological condition that removes ambiguity in lifting paths. To use it for Lie theory, we need covers that respect multiplication, because a covering space alone does not yet tell us how to multiply lifted points. [definition: Covering Homomorphism] A Lie group homomorphism $p:\widetilde G\to G$ is a covering homomorphism if $p$ is also a covering map of smooth manifolds. [/definition] Covering homomorphisms have discrete kernels, because the fibre over $e$ is a discrete subset of $\widetilde G$. The next theorem shows that the usual topological universal cover of a connected Lie group carries a compatible Lie group structure and has the same infinitesimal data. [quotetheorem:8817] [citeproof:8817] The theorem separates Lie algebra from global topology. Connectedness is essential here: without it, the chosen lift of multiplication on the identity component would not determine multiplication between unrelated components. The theorem also does not say that $G$ itself is determined by its Lie algebra; it says that the simply connected cover is the canonical object with that infinitesimal data. Other connected groups with the same Lie algebra arise by quotienting this representative by suitable discrete central subgroups. The next result returns to the rotation example from the fundamental group computation and identifies which member of the pair $SU(2)\to SO(3)$ is the universal representative. [quotetheorem:8818] [citeproof:8818] This corollary is the prototype for much of compact Lie theory: the same infinitesimal rotations may be realised by different global groups, and the simply connected one is often the most convenient for representation theory. The hypotheses matter in two directions.
First, the map must be a covering homomorphism, not merely a covering map of underlying manifolds; otherwise the lifted space would not carry the relevant group structure. Second, the source must be simply connected: the covering $U(1)\to U(1)$ given by $z\mapsto z^2$ is a two-to-one covering homomorphism, but its domain is not simply connected, so it is not the universal cover of the circle. Thus the result does not say that every double cover is universal; it says that this particular double cover is universal because $SU(2)\cong S^3$ has no noncontractible loops. [example: The Groups with Lie Algebra so Three] [claim]The connected Lie groups with Lie algebra $\mathfrak{so}(3)$ are, up to isomorphism, exactly $SU(2)$ and $SO(3)$.[/claim] [proof]By *SU Two Covers SO Three*, $SU(2)$ is the simply connected Lie group covering $SO(3)$, so its Lie algebra is $\mathfrak{so}(3)$. By *Classification by Discrete Central Subgroups*, every connected Lie group with this Lie algebra has the form $SU(2)/\Gamma$ for a discrete subgroup $\Gamma\le Z(SU(2))$. Identify $SU(2)$ with the unit quaternions. Let $q=a+bi+cj+dk$ lie in the centre. Since $q$ commutes with $i$, we have $qi=iq$. Expanding both sides gives \begin{align*} qi=(a+bi+cj+dk)i=ai-b-c k+d j. \end{align*} Also \begin{align*} iq=i(a+bi+cj+dk)=ai-b+c k-d j. \end{align*} Equality forces $-c=c$ and $d=-d$, hence $c=d=0$. Now $q=a+bi$, and centrality also gives $qj=jq$. Expanding, \begin{align*} qj=(a+bi)j=aj+b k. \end{align*} Meanwhile \begin{align*} jq=j(a+bi)=aj-b k. \end{align*} Equality forces $b=-b$, hence $b=0$. Thus every central unit quaternion is real, so $q=a$. Since $q$ is unit, $a^2=1$, and therefore $q=\pm1$. Hence \begin{align*} Z(SU(2))=\{\pm I\}. \end{align*} The only subgroups of $\{\pm I\}$ are $\{I\}$ and $\{\pm I\}$, and both are discrete. For $\Gamma=\{I\}$ the quotient is \begin{align*} SU(2)/\{I\}\cong SU(2). 
\end{align*} For $\Gamma=\{\pm I\}$, *SU Two Covers SO Three* gives \begin{align*} SU(2)/\{\pm I\}\cong SO(3). \end{align*} Therefore the classification leaves exactly the two connected possibilities $SU(2)$ and $SO(3)$.[/proof] The example shows how the same infinitesimal rotation algebra has two connected global realisations, distinguished by whether the central subgroup $\{\pm I\}$ has been divided out. [/example] ## Discrete Central Quotients The final question is the classification problem: given a Lie algebra $\mathfrak g$, how many connected Lie groups realise it? The universal cover gives the largest connected realisation, and all the others are obtained by dividing by discrete central subgroups. [quotetheorem:8819] [citeproof:8819] The theorem explains why the centre appears in the classification. The connectedness assumption is essential for centrality: if $\widetilde G$ is a non-abelian discrete Lie group and $p:\widetilde G\to \{e\}$ is the homomorphism to the trivial group, then $p$ is a covering homomorphism but $\ker p=\widetilde G$ is not central. Thus the theorem is not a statement about arbitrary covering homomorphisms; it is a statement about connected covering groups, where a continuous map into a discrete group must be constant. It gives one direction of the correspondence: any connected group with a fixed Lie algebra is obtained from the simply connected one by quotienting out the kernel of a covering homomorphism; what remains is to state the full classification and its converse construction. [quotetheorem:8820] [citeproof:8820] This classification is often the cleanest way to organise examples, but its hypotheses prevent two common mistakes. Centrality cannot be dropped: in the discrete Lie group $S_3$, the subgroup generated by a transposition is discrete but not normal, hence the coset space is not a quotient group with inherited multiplication. 
Connectedness also cannot be dropped: $O(2)$ and $SO(2)$ have the same Lie algebra, but $O(2)$ is disconnected and is not obtained from the simply connected group $\mathbb R$ by quotienting by a discrete central subgroup alone. Within the connected category, the local multiplication, bracket, and exponential behaviour live in $\mathfrak g$ and $\widetilde G$; the choice of a discrete central subgroup records the remaining global topology. This is also the covering-space classification in Lie-theoretic form: the fundamental group appears as the deck transformation group of the universal covering homomorphism. [example: The Groups with Lie Algebra u One] By *Classification by Discrete Central Subgroups*, the connected Lie groups with Lie algebra $\mathfrak u(1)\cong \mathbb R$ are the quotients of the simply connected group integrating this Lie algebra by discrete central subgroups. The simply connected integrating group is the additive Lie group $\mathbb R$, whose centre is all of $\mathbb R$ because addition is commutative: \begin{align*} x+y=y+x. \end{align*} Let $\Gamma\le \mathbb R$ be a discrete subgroup. If $\Gamma=\{0\}$, it is $0\mathbb Z$. If $\Gamma\ne\{0\}$, then $\Gamma$ contains a positive element, because it is closed under negation; choose the smallest positive element $a\in \Gamma$. Such an $a$ exists because discreteness gives $\varepsilon>0$ with $\Gamma\cap(-\varepsilon,\varepsilon)=\{0\}$, so any two distinct elements of $\Gamma$ differ by at least $\varepsilon$. Hence every bounded interval contains only finitely many elements of $\Gamma$, and the set of positive elements has a minimum. For any $\gamma\in\Gamma$, choose $n\in\mathbb Z$ with \begin{align*} na\le \gamma < (n+1)a. \end{align*} Then $\gamma-na\in\Gamma$ and \begin{align*} 0\le \gamma-na<a. \end{align*} By minimality of $a$, this forces $\gamma-na=0$, so $\gamma=na$. Hence $\Gamma=a\mathbb Z$. For $\Gamma=\{0\}$, the quotient is $\mathbb R/\{0\}\cong\mathbb R$. For $a>0$, define \begin{align*} \phi_a:\mathbb R/a\mathbb Z\to U(1),\qquad \phi_a([t])=e^{2\pi i t/a}.
\end{align*} This is well-defined because if $t'=t+ka$ with $k\in\mathbb Z$, then \begin{align*} e^{2\pi i t'/a}=e^{2\pi i(t+ka)/a}=e^{2\pi i t/a}e^{2\pi i k}=e^{2\pi i t/a}. \end{align*} It is a homomorphism since \begin{align*} \phi_a([s]+[t])=e^{2\pi i(s+t)/a}=e^{2\pi i s/a}e^{2\pi i t/a}=\phi_a([s])\phi_a([t]). \end{align*} Its kernel consists of classes $[t]$ with $e^{2\pi i t/a}=1$, equivalently $t/a\in\mathbb Z$, so $t\in a\mathbb Z$ and $[t]=[0]$. It is surjective because every element of $U(1)$ has the form $e^{i\theta}$ and \begin{align*} e^{i\theta}=e^{2\pi i(a\theta/2\pi)/a}. \end{align*} Thus $\mathbb R/a\mathbb Z\cong U(1)$ for every $a>0$. Therefore, up to Lie group isomorphism, the connected Lie groups with Lie algebra $\mathfrak u(1)$ are exactly $\mathbb R$ and $U(1)$. [/example] The same logic explains why a finite-dimensional Lie algebra may have a unique connected group, finitely many, or an infinite family, depending on the discrete subgroups in the centre of its simply connected group. [remark: What the Fundamental Group Remembers] If $G\cong \widetilde G/\Gamma$ with $\widetilde G$ simply connected, then $\pi_1(G)\cong \Gamma$. Thus the fundamental group of a connected Lie group is naturally realised as a discrete central subgroup of its universal cover. [/remark] The chapter therefore closes the first global loop in the course. Connected components reduce disconnected groups to connected ones plus a discrete component group, while universal covers reduce connected groups to simply connected groups plus a discrete central subgroup. This is the global counterpart to Chapter 5's local statement that the Lie algebra determines the multiplication only near the identity. In later topics, this viewpoint lets us move between Lie algebra calculations and global group statements without losing track of the topological information. 
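The double cover $SU(2)\to SO(3)$ from this chapter can be spot-checked numerically using the unit-quaternion model and the formula $\rho_q(v)=qvq^{-1}$. The sketch below is illustrative only: the helper names (`qmul`, `rho`, `lift`) and the sample quaternions are hypothetical choices, and the checks are floating-point sanity tests, not proofs.

```python
import math

def qmul(p, q):
    """Hamilton product of quaternions stored as (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def conj(q):
    a, b, c, d = q
    return (a, -b, -c, -d)

def normalize(q):
    norm = math.sqrt(sum(x * x for x in q))
    return tuple(x / norm for x in q)

def rho(q):
    """3x3 matrix of v -> q v q^{-1} on imaginary quaternions, for unit q."""
    columns = []
    for v in [(0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]:
        w = qmul(qmul(q, v), conj(q))   # q^{-1} = conj(q) because |q| = 1
        columns.append(w[1:])           # image of i, j, k in the basis (i, j, k)
    return [[columns[col][row] for col in range(3)] for row in range(3)]

def mat_mul3(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def det3(A):
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

def max_abs_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(3) for j in range(3))

I3 = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
q = normalize((1.0, 2.0, 3.0, 4.0))     # arbitrary sample unit quaternions
r = normalize((0.5, -1.0, 0.25, 2.0))
R = rho(q)
Rt = [[R[j][i] for j in range(3)] for i in range(3)]

same_rotation = max_abs_diff(rho(q), rho(tuple(-x for x in q)))   # rho_q = rho_{-q}
orthogonal_error = max_abs_diff(mat_mul3(Rt, R), I3)              # R^T R = I
homomorphism_error = max_abs_diff(rho(qmul(q, r)), mat_mul3(rho(q), rho(r)))

def lift(theta):
    """Unit quaternion cos(theta/2) + k sin(theta/2) lifting rotation about k."""
    return (math.cos(theta / 2), 0.0, 0.0, math.sin(theta / 2))

full_turn = lift(2 * math.pi)   # the rotation is the identity, the lift is -1
print(same_rotation, orthogonal_error, det3(R), homomorphism_error, full_turn[0])
```

The last printed value is $-1$ up to rounding: the loop of rotations through angles $0$ to $2\pi$ closes up in $SO(3)$, but its lift to $SU(2)$ ends at the other point of the fibre $\{\pm 1\}$, which is the discrete central subgroup recording $\pi_1(SO(3))$.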
The topological classification now leads back to algebraic structure through the action of the group on itself by conjugation. The adjoint representation and Killing form convert commutator information into linear representation-theoretic data, giving a new way to measure the size and type of a Lie algebra. # 9. The Adjoint Representation and the Killing Form In Chapters 3 through 5 the Lie bracket was introduced as the infinitesimal shadow of the group law: it measures the first non-commutativity seen by flows, commutators, and the Baker--Campbell--Hausdorff product near the identity. This chapter turns that idea into a representation-theoretic tool. We assume the earlier material on Lie groups, tangent spaces, left-invariant vector fields, one-parameter subgroups, and the definition of the Lie bracket on $T_eG$. With those prerequisites in place, we study how a Lie group acts on its own Lie algebra by conjugation, then differentiate that action to recover the bracket, and finally use traces of these infinitesimal operators to define the Killing form. The main theme is that internal structure of a Lie group can be measured by linear algebra on its Lie algebra. The adjoint representation records conjugation, the map $\operatorname{ad}$ records the bracket, and the Killing form detects important structural features such as semisimplicity and compactness. ## Conjugation and the Adjoint Representation Conjugation is the most intrinsic action a group has on itself, but it does not fix every point. For a Lie group $G$, the identity element is fixed by every conjugation map, so conjugation has a differential at the identity that acts on the tangent space $T_eG=\mathfrak g$. This is the first bridge from the global group operation to a linear representation on the Lie algebra. [definition: Inner Automorphism] Let $G$ be a Lie group. For $g \in G$, the inner automorphism determined by $g$ is the smooth group automorphism \begin{align*} C_g : G &\to G, & C_g(h) &= ghg^{-1}. 
\end{align*} [/definition] The previous definition packages conjugation by a fixed group element as a smooth map. Differentiating the multiplication map itself would depend on a chosen input direction and would not produce an action of $G$ on a fixed vector space; the obstruction is that most points of $G$ move under conjugation. The identity is the exception: $C_g(e)=e$ for every $g\in G$, so $(dC_g)_e$ is a linear map from the single vector space $\mathfrak g$ to itself. The next definition names this fixed-point differential, which is the only canonical way to turn global conjugation into a linear representation without choosing coordinates or a basis. [definition: Adjoint Representation] Let $G$ be a Lie group with Lie algebra $\mathfrak g=T_eG$. The adjoint representation of $G$ is the map \begin{align*} \operatorname{Ad}:G&\to GL(\mathfrak g), & \operatorname{Ad}_g&=(dC_g)_e. \end{align*} [/definition] Calling this construction a representation requires a compatibility check: differentiating conjugation at the fixed point $e$ must still remember the multiplication law in $G$. The possible obstruction is that taking differentials can destroy algebraic structure unless the underlying maps compose in the right order. Here the conjugation maps compose exactly as group elements multiply, so their differentials should form a homomorphism into $GL(\mathfrak g)$. [quotetheorem:8821] [citeproof:8821] Each hypothesis is doing work here. The group law is needed because the proof uses the identity $C_{g_1g_2}=C_{g_1}\circ C_{g_2}$; for a general smooth family of diffeomorphisms fixing a point, the induced differentials need not respect any multiplication law. The Lie group smoothness is also essential, since without smooth multiplication and inversion there is no well-defined smooth map into $GL(\mathfrak g)$. 
The theorem does not say that $\operatorname{Ad}$ is faithful: every abelian Lie group has $\operatorname{Ad}_g=\operatorname{id}_{\mathfrak g}$ for all $g$, so the representation can lose all group-level information. What it gives is a canonical representation whose derivative can now be compared with the Lie bracket. For matrix Lie groups, this construction has the expected concrete form. The abstract differential of conjugation reduces to ordinary matrix conjugation on tangent vectors, giving a reusable way to compute $\operatorname{Ad}$ without returning to charts. [example: Adjoint Representation Of A Matrix Lie Group] Let $G\le GL(n,\mathbb C)$ be a matrix Lie group with Lie algebra $\mathfrak g\subset M(n,\mathbb C)$. Fix $g\in G$ and $X\in\mathfrak g$, and choose a smooth curve $\gamma:(-\varepsilon,\varepsilon)\to G$ with $\gamma(0)=I$ and $\gamma'(0)=X$. Since $C_g(h)=ghg^{-1}$, the definition of $\operatorname{Ad}_g=(dC_g)_I$ gives \begin{align*} \operatorname{Ad}_gX=\frac{d}{dt}\bigg|_{t=0}C_g(\gamma(t)). \end{align*} Substituting the formula for $C_g$, \begin{align*} \operatorname{Ad}_gX=\frac{d}{dt}\bigg|_{t=0}g\gamma(t)g^{-1}. \end{align*} The matrices $g$ and $g^{-1}$ are constant in $t$, so differentiating entry by entry gives \begin{align*} \frac{d}{dt}\bigg|_{t=0}g\gamma(t)g^{-1}=g\gamma'(0)g^{-1}. \end{align*} Using $\gamma'(0)=X$, we obtain \begin{align*} \operatorname{Ad}_gX=gXg^{-1}. \end{align*} Thus, for matrix Lie groups, the abstract differential of conjugation is exactly the familiar operation of conjugating tangent matrices. [/example] The example also explains why $\operatorname{Ad}$ preserves the Lie algebra. If $X$ is tangent to $G$ at the identity, then $gXg^{-1}$ is tangent to the conjugate copy $gGg^{-1}=G$ at the identity. ## Differentiating the Adjoint Representation A representation of a Lie group should have an infinitesimal representation of its Lie algebra. 
Applying this principle to $\operatorname{Ad}:G\to GL(\mathfrak g)$ gives a map from $\mathfrak g$ to the Lie algebra of $GL(\mathfrak g)$, namely \begin{align*} \mathfrak{gl}(\mathfrak g)=\operatorname{End}(\mathfrak g), \end{align*} the vector space of linear endomorphisms of $\mathfrak g$ with commutator bracket. The central question is how this derivative relates to the bracket already defined on $\mathfrak g$. [definition: Infinitesimal Adjoint Representation] Let $G$ be a Lie group with Lie algebra $\mathfrak g$. The infinitesimal adjoint representation is \begin{align*} \operatorname{ad}:\mathfrak g&\to\mathfrak{gl}(\mathfrak g)=\operatorname{End}(\mathfrak g), & \operatorname{ad}_X&=(d\operatorname{Ad})_e(X). \end{align*} [/definition] The notation suggests that $\operatorname{ad}_X$ is an endomorphism of $\mathfrak g$, but its definition still comes from differentiating a group-level representation. The obstruction is that the Lie bracket was introduced intrinsically on tangent vectors, while $d\operatorname{Ad}$ is obtained from conjugation in the group. These two constructions must agree if the bracket is really the infinitesimal shadow of conjugation. [quotetheorem:8822] [citeproof:8822] The hypotheses locate the statement in a setting where both sides are defined from the same smooth group law. Smoothness is needed to differentiate $\operatorname{Ad}$, and the Lie algebra must be the tangent Lie algebra of $G$ for the bracket convention to match conjugation; an abstract vector space with an arbitrary bilinear bracket has no group-level adjoint map to differentiate. The theorem does not say that every Lie-algebra homomorphism into $\mathfrak{gl}(\mathfrak g)$ arises this way, nor that $\operatorname{ad}$ is injective: if $Z$ lies in the centre \begin{align*} Z(\mathfrak g)=\{W\in\mathfrak g:[W,V]=0\text{ for all }V\in\mathfrak g\}, \end{align*} then $\operatorname{ad}_Z=0$. 
Its role is to identify the bracket as the infinitesimal content of conjugation, which makes the next operator identity a reformulation of Jacobi. [example: Infinitesimal Adjoint Action For Matrices] Let $G\le GL(n,\mathbb C)$ be a matrix Lie group and let $X,Y\in\mathfrak g$. Along the curve $g(t)=\exp(tX)$, the matrix formula for the adjoint action gives \begin{align*} \operatorname{Ad}_{\exp(tX)}Y=\exp(tX)Y\exp(-tX). \end{align*} By the definition of $\operatorname{ad}$ as the derivative of $\operatorname{Ad}$ at the identity, \begin{align*} \operatorname{ad}_X(Y)=\frac{d}{dt}\bigg|_{t=0}\exp(tX)Y\exp(-tX). \end{align*} The exponential series gives $\exp(tX)=I+tX+t^2X^2/2+\cdots$, so \begin{align*} \frac{d}{dt}\bigg|_{t=0}\exp(tX)=X. \end{align*} Similarly, $\exp(-tX)=I-tX+t^2X^2/2-\cdots$, so \begin{align*} \frac{d}{dt}\bigg|_{t=0}\exp(-tX)=-X. \end{align*} Using the product rule, with $Y$ constant in $t$, \begin{align*} \frac{d}{dt}\bigg|_{t=0}\exp(tX)Y\exp(-tX)=XYI+IY(-X). \end{align*} Since $I$ is the identity matrix, this becomes \begin{align*} \operatorname{ad}_X(Y)=XY-YX. \end{align*} Thus, for matrix groups, the abstract infinitesimal adjoint action is exactly the commutator operation on matrices. [/example] Since $\operatorname{ad}_X$ is bracketing by $X$, it is natural to ask whether the assignment $X\mapsto\operatorname{ad}_X$ preserves Lie brackets. The answer is exactly the Jacobi identity written in operator form: for every Lie algebra $\mathfrak g$, the adjoint map $\operatorname{ad}:\mathfrak g\to\mathfrak{gl}(\mathfrak g)$ is defined by \begin{align*} X\mapsto\operatorname{ad}_X. \end{align*} This map is a Lie algebra homomorphism, where $\operatorname{ad}_X(Y)=[X,Y]$ for $Y\in\mathfrak g$. The Lie-algebra hypothesis is essential because Jacobi is exactly what makes the commutator of two bracketing operators again a bracketing operator. 
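The operator form mentioned here can be written out as a short worked check. It is a sketch that uses only skew-symmetry and the Jacobi identity, with $X,Y,Z$ denoting arbitrary elements of $\mathfrak g$. Applying the commutator of the two bracketing operators to $Z$ gives \begin{align*} [\operatorname{ad}_X,\operatorname{ad}_Y](Z)&=[X,[Y,Z]]-[Y,[X,Z]]\\ &=[X,[Y,Z]]+[Y,[Z,X]]\\ &=-[Z,[X,Y]]\\ &=[[X,Y],Z]=\operatorname{ad}_{[X,Y]}(Z). \end{align*} The second line uses skew-symmetry, the third uses Jacobi in the form $[X,[Y,Z]]+[Y,[Z,X]]+[Z,[X,Y]]=0$, and the last uses skew-symmetry again. Since $Z$ was arbitrary, $\operatorname{ad}_{[X,Y]}=[\operatorname{ad}_X,\operatorname{ad}_Y]$ as endomorphisms of $\mathfrak g$.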
If a skew-symmetric bilinear product fails Jacobi, the homomorphism identity $\operatorname{ad}_{[X,Y]}=[\operatorname{ad}_X,\operatorname{ad}_Y]$ can fail after applying both sides to a third element. The statement does not imply that $\operatorname{ad}:\mathfrak g\to\mathfrak{gl}(\mathfrak g)$ is injective, let alone an isomorphism; abelian Lie algebras again give the extreme case where the map is zero. What it provides is a bridge from bracket identities to trace identities, and traces of products of these maps now give a canonical bilinear form. ## The Killing Form A Lie algebra has many possible bilinear forms, but choosing one arbitrarily imports external data. For example, after choosing a basis of $\mathfrak g$ one can put the Euclidean dot product on coordinate vectors, but a different basis usually gives a different form and there is no reason for either form to respect the bracket. The obstruction is that structure theory needs a bilinear form intrinsic to the Lie algebra, not to a coordinate presentation. The adjoint action supplies canonical endomorphisms $\operatorname{ad}_X$, and traces of their products are invariant under change of basis, so they produce a bilinear form built from the internal action of the Lie algebra on itself. [definition: Killing Form] Let $\mathfrak g$ be a finite-dimensional Lie algebra over $\mathbb R$ or $\mathbb C$. The Killing form of $\mathfrak g$ is the bilinear form \begin{align*} B:\mathfrak g\times\mathfrak g&\to\mathbb F, & B(X,Y)&=\operatorname{tr}(\operatorname{ad}_X\circ\operatorname{ad}_Y), \end{align*} where $\mathbb F$ is the ground field. [/definition] The definition uses only the Lie bracket, since $\operatorname{ad}_X(Y)=[X,Y]$, and the trace makes the construction independent of the basis used to compute matrices. Before using $B$ as a structural invariant, we need to know that it has the expected formal properties of a bilinear form and that its two arguments play symmetric roles.
[quotetheorem:8823] [citeproof:8823] Finite-dimensionality is needed because the ordinary trace of $\operatorname{ad}_X\circ\operatorname{ad}_Y$ is being used; in infinite-dimensional Lie algebras this trace may be undefined. Bilinearity depends on the linearity of $X\mapsto\operatorname{ad}_X$, and symmetry uses the finite-dimensional trace identity $\operatorname{tr}(AB)=\operatorname{tr}(BA)$. These hypotheses give symmetry, not non-degeneracy or positivity. A non-zero abelian Lie algebra satisfies $\operatorname{ad}_X=0$ for every $X$, so its Killing form is identically zero; this shows that the formal properties in the theorem are far weaker than the structural properties needed later. What the theorem supplies is the minimum control needed before asking whether $B$ is invariant under the adjoint action. Symmetry makes $B$ a manageable bilinear form, but the reason it interacts with Lie theory is its compatibility with conjugation. Since $B$ is built from the adjoint representation, we next check that applying the group-level adjoint action to both entries leaves it unchanged. [quotetheorem:8824] [citeproof:8824] The Lie group hypothesis matters because the statement uses the global maps $\operatorname{Ad}_g$; for an abstract Lie algebra there may be no specified group element $g$ acting on it. The trace construction is also essential: a randomly chosen symmetric bilinear form on $\mathfrak g$ need not be preserved by $\operatorname{Ad}$. For instance, in $\mathfrak{sl}(2,\mathbb R)$ use the basis $H,E,F$, where $H_{11}=1$, $H_{22}=-1$, $E_{12}=1$, $F_{21}=1$, and all other entries are zero. The inner automorphism coming from $g=\operatorname{diag}(a,a^{-1})$ fixes $H$ and sends $E$ to $a^2E$, $F$ to $a^{-2}F$. The coordinate dot product for this basis is not preserved when $a^2\ne 1$, while the Killing form is preserved by the theorem. The theorem does not say that $\operatorname{Ad}$ preserves every bilinear form, only this canonical one. 
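The contrast in the $\mathfrak{sl}(2,\mathbb R)$ illustration can be checked by direct computation. The following is a minimal sketch, not a general implementation: it works in coordinates for the basis $H,E,F$, takes the bracket relations $[H,E]=2E$, $[H,F]=-2F$, $[E,F]=H$ as input, and uses the action of $g=\operatorname{diag}(a,a^{-1})$ stated above. The helper names (`ad_matrix`, `killing`, `dot`, `adjoint_diag`) and the sample vectors are hypothetical choices made for this illustration.

```python
from fractions import Fraction

# Coordinates are taken in the basis (H, E, F) of sl(2, R).
# Nonzero brackets of basis vectors: [H,E]=2E, [H,F]=-2F, [E,F]=H.
BRACKETS = {(0, 1): (0, 2, 0), (0, 2): (0, 0, -2), (1, 2): (1, 0, 0)}


def basis_bracket(i, j):
    """Coordinates of [basis_i, basis_j], extended by skew-symmetry."""
    if (i, j) in BRACKETS:
        return BRACKETS[(i, j)]
    if (j, i) in BRACKETS:
        return tuple(-c for c in BRACKETS[(j, i)])
    return (0, 0, 0)


def ad_matrix(x):
    """Matrix of ad_x: column j holds the coordinates of [x, basis_j]."""
    m = [[Fraction(0)] * 3 for _ in range(3)]
    for j in range(3):
        for i in range(3):
            for k, c in enumerate(basis_bracket(i, j)):
                m[k][j] += x[i] * c
    return m


def killing(x, y):
    """B(x, y) = tr(ad_x ad_y)."""
    ax, ay = ad_matrix(x), ad_matrix(y)
    return sum(ax[i][k] * ay[k][i] for i in range(3) for k in range(3))


def dot(x, y):
    """Coordinate dot product for the basis (H, E, F)."""
    return sum(p * q for p, q in zip(x, y))


def adjoint_diag(a, x):
    """Ad_g for g = diag(a, 1/a): H -> H, E -> a^2 E, F -> a^-2 F."""
    h, e, f = x
    return (h, a * a * e, f / (a * a))


# Sample data (arbitrary illustrative choices with a^2 != 1).
a = Fraction(2)
X = (Fraction(1), Fraction(3), Fraction(-2))
Y = (Fraction(5), Fraction(1), Fraction(4))
gX, gY = adjoint_diag(a, X), adjoint_diag(a, Y)

print("Killing form before and after:", killing(X, Y), killing(gX, gY))
print("dot product before and after: ", dot(X, Y), dot(gX, gY))
```

Exact rational arithmetic is used so that equality of the two Killing-form values is a genuine identity rather than a floating-point coincidence. The two dot-product values differ for this choice of $a$, which is the failure of invariance described in the text.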
The group-level invariance is often too global for computations involving brackets, so we next use the infinitesimal version: if $\mathfrak g$ is finite-dimensional and $B$ is its Killing form, then for all $X,Y,Z\in\mathfrak g$, \begin{align*} B([Z,X],Y)+B(X,[Z,Y])=0. \end{align*} Equivalently, $\operatorname{ad}_Z$ is skew-symmetric with respect to $B$ in the sense appropriate to this possibly degenerate bilinear form. Finite-dimensionality is still needed because $B$ was defined by an ordinary trace, and the Lie-algebra structure is needed because the terms $[Z,X]$ and $[Z,Y]$ come from differentiating the adjoint action. The identity does not imply that $\operatorname{ad}_Z$ is skew-adjoint for a positive definite inner product; $B$ may be indefinite or degenerate, so this is skewness relative to the Killing form itself. The point is that bracket operations can now be moved from one argument of $B$ to the other, which is the algebraic mechanism behind the structural criteria that follow. [example: Killing Form Of An Abelian Lie Algebra] Let $\mathfrak g$ be abelian, and let $X,Y\in\mathfrak g$. Since $\mathfrak g$ is abelian, every bracket is zero: \begin{align*} [X,Z]=0 \end{align*} for all $Z\in\mathfrak g$. Therefore the endomorphism $\operatorname{ad}_X:\mathfrak g\to\mathfrak g$ satisfies \begin{align*} \operatorname{ad}_X(Z)=[X,Z]=0 \end{align*} for every $Z\in\mathfrak g$, so $\operatorname{ad}_X$ is the zero endomorphism. By the definition of the Killing form, \begin{align*} B(X,Y)=\operatorname{tr}(\operatorname{ad}_X\circ\operatorname{ad}_Y). \end{align*} Because $\operatorname{ad}_X=0$, the composite sends every $Z\in\mathfrak g$ to \begin{align*} (\operatorname{ad}_X\circ\operatorname{ad}_Y)(Z)=\operatorname{ad}_X(\operatorname{ad}_Y(Z))=0. \end{align*} Thus $\operatorname{ad}_X\circ\operatorname{ad}_Y=0$, and hence \begin{align*} B(X,Y)=\operatorname{tr}(0)=0. 
\end{align*} Since this holds for every pair $X,Y\in\mathfrak g$, the Killing form is identically zero; in an abelian Lie algebra there is no non-trivial adjoint action for the trace to measure. [/example] The abelian example shows that the Killing form can be degenerate. The next criterion explains that for finite-dimensional Lie algebras in characteristic zero, this degeneracy is exactly what prevents semisimplicity. ## Semisimplicity and Cartan's Criterion Ideals measure internal decomposability of Lie algebras, and solvable ideals are the part that behaves like repeated upper-triangular commutators. The obstruction to a clean structure theory is that a solvable ideal can sit inside $\mathfrak g$ while being hard to see from the bracket table alone. The Killing form exposes this obstruction because directions lying in large solvable pieces tend to pair degenerately with the rest of the algebra. Semisimple Lie algebras are precisely those where this solvable obstruction has disappeared. [definition: Semisimple Lie Algebra] A finite-dimensional Lie algebra $\mathfrak g$ over a field of characteristic zero is semisimple if it has no non-zero solvable ideals. [/definition] This definition is structural, but Cartan's criterion turns it into a computation. In this course we use the criterion as a structure theorem: for a finite-dimensional Lie algebra $\mathfrak g$ over a field of characteristic zero, $\mathfrak g$ is semisimple if and only if its Killing form $B:\mathfrak g\times\mathfrak g\to\mathbb F$, given by \begin{align*} B(X,Y)=\operatorname{tr}(\operatorname{ad}_X\circ\operatorname{ad}_Y), \end{align*} is non-degenerate. Cartan's criterion is a non-degeneracy test, not a signature theorem. The hypothesis that the ground field has characteristic zero is essential for the usual trace and radical arguments behind the criterion, and finite-dimensionality is needed because the Killing form is defined by an ordinary trace of endomorphisms. 
The theorem also separates semisimplicity from abelian behaviour: a non-zero abelian Lie algebra has zero Killing form, hence a degenerate one, so it cannot be semisimple under this definition. What the criterion gives is a computable bridge from ideals to linear algebra: instead of finding all solvable ideals directly, one can test whether any non-zero vector pairs to zero with all of $\mathfrak g$ under $B$. The criterion detects semisimplicity over the chosen field, but it does not by itself classify real forms or decide compactness. For example, $\mathfrak{su}(2)$ and $\mathfrak{sl}(2,\mathbb R)$ are both semisimple, so both have non-degenerate Killing forms, even though their Killing forms have different signatures. [example: Killing Form On Su Two] Let $E_{ij}$ denote the matrix with a $1$ in the $(i,j)$ entry and $0$ elsewhere, and use the real basis \begin{align*} A=i(E_{11}-E_{22}),\quad B=E_{12}-E_{21},\quad C=i(E_{12}+E_{21}) \end{align*} of $\mathfrak{su}(2)$. In this example the bare letter $B$ denotes the basis matrix, while $B(\cdot,\cdot)$ written with two arguments denotes the Killing form. Since $E_{ij}E_{kl}=\delta_{jk}E_{il}$, the commutators are \begin{align*} [A,B]=AB-BA=C-(-C)=2C. \end{align*} \begin{align*} [A,C]=AC-CA=(-B)-B=-2B. \end{align*} \begin{align*} [B,C]=BC-CB=A-(-A)=2A. \end{align*} Thus $\operatorname{ad}_A$ sends $A$ to $0$, $B$ to $2C$, and $C$ to $-2B$, so $\operatorname{ad}_A^2$ is diagonal in this basis with entries $0,-4,-4$ and \begin{align*} B(A,A)=\operatorname{tr}(\operatorname{ad}_A^2)=-8. \end{align*} The same bracket table gives $B(B,B)=-8$ and $B(C,C)=-8$. For mixed products, each operator $\operatorname{ad}_A\operatorname{ad}_B$, $\operatorname{ad}_A\operatorname{ad}_C$, and $\operatorname{ad}_B\operatorname{ad}_C$ sends every basis vector to either $0$ or a multiple of a different basis vector, so each has trace $0$. On the other hand, $A^2=B^2=C^2=-(E_{11}+E_{22})$, so \begin{align*} 4\operatorname{tr}(A^2)=4\operatorname{tr}(B^2)=4\operatorname{tr}(C^2)=-8.
\end{align*} Also $AB=C$, $AC=-B$, and $BC=A$, and each of $A,B,C$ has trace $0$, so the mixed pairings $4\operatorname{tr}(AB)$, $4\operatorname{tr}(AC)$, and $4\operatorname{tr}(BC)$ are all $0$. Since both sides are symmetric bilinear forms and they agree on the basis $A,B,C$, for all $X,Y\in\mathfrak{su}(2)$, \begin{align*} B(X,Y)=4\operatorname{tr}(XY). \end{align*} Now take $X\ne 0$ in $\mathfrak{su}(2)$. Skew-Hermitian means $X^*=-X$, hence $X^2=-X^*X$. Therefore \begin{align*} B(X,X)=4\operatorname{tr}(X^2)=-4\operatorname{tr}(X^*X). \end{align*} If $X=(x_{ij})$, then \begin{align*} \operatorname{tr}(X^*X)=\sum_{i,j}|x_{ij}|^2. \end{align*} This sum is positive for $X\ne 0$, so $B(X,X)<0$. Thus the Killing form on $\mathfrak{su}(2)$ is negative definite. [/example] The compact example should be compared with a split real form. The algebra $\mathfrak{sl}(2,\mathbb R)$ is semisimple, so its Killing form is non-degenerate, but it is not negative definite. [example: Killing Form On Sl Two R] Let $E_{ij}$ denote the matrix with a $1$ in the $(i,j)$ entry and $0$ elsewhere. Use the standard basis \begin{align*} H=E_{11}-E_{22},\quad E=E_{12},\quad F=E_{21} \end{align*} of $\mathfrak{sl}(2,\mathbb R)$. From $E_{ij}E_{kl}=\delta_{jk}E_{il}$, we get \begin{align*} [H,E]=HE-EH=E-(-E)=2E. \end{align*} \begin{align*} [H,F]=HF-FH=(-F)-F=-2F. \end{align*} \begin{align*} [E,F]=EF-FE=E_{11}-E_{22}=H. \end{align*} Write \begin{align*} X=aH+bE+cF,\quad Y=pH+qE+rF. \end{align*} Using the three bracket relations above, \begin{align*} \operatorname{ad}_X(H)=[X,H]=-2bE+2cF. \end{align*} \begin{align*} \operatorname{ad}_X(E)=[X,E]=2aE-cH. \end{align*} \begin{align*} \operatorname{ad}_X(F)=[X,F]=bH-2aF. \end{align*} Similarly, \begin{align*} \operatorname{ad}_Y(H)=-2qE+2rF,\quad \operatorname{ad}_Y(E)=2pE-rH,\quad \operatorname{ad}_Y(F)=qH-2pF. 
\end{align*} The trace of $\operatorname{ad}_X\operatorname{ad}_Y$ is the sum of the coefficients of $H,E,F$ in the images of $H,E,F$, respectively. First, \begin{align*} \operatorname{ad}_X(\operatorname{ad}_Y(H))=-2q(2aE-cH)+2r(bH-2aF). \end{align*} So the $H$-coefficient is $2cq+2br$. Next, \begin{align*} \operatorname{ad}_X(\operatorname{ad}_Y(E))=2p(2aE-cH)-r(-2bE+2cF). \end{align*} So the $E$-coefficient is $4ap+2br$. Finally, \begin{align*} \operatorname{ad}_X(\operatorname{ad}_Y(F))=q(-2bE+2cF)-2p(bH-2aF). \end{align*} So the $F$-coefficient is $2cq+4ap$. Therefore \begin{align*} B(X,Y)=\operatorname{tr}(\operatorname{ad}_X\operatorname{ad}_Y)=8ap+4br+4cq. \end{align*} On the other hand, $H^2=E_{11}+E_{22}$, $EF=E_{11}$, and $FE=E_{22}$, while the remaining basis products have trace $0$. Hence \begin{align*} \operatorname{tr}(XY)=2ap+br+cq. \end{align*} Thus, for all $X,Y\in\mathfrak{sl}(2,\mathbb R)$, \begin{align*} B(X,Y)=4\operatorname{tr}(XY). \end{align*} Now take $H=E_{11}-E_{22}$. Since $H^2=E_{11}+E_{22}$, \begin{align*} B(H,H)=4\operatorname{tr}(H^2)=4\operatorname{tr}(E_{11}+E_{22})=4(1+1)=8. \end{align*} Let $K=E-F$, so $K_{12}=1$, $K_{21}=-1$, and $K_{11}=K_{22}=0$. Then \begin{align*} K^2=(E-F)^2=E^2-EF-FE+F^2. \end{align*} Since $E^2=0$, $F^2=0$, $EF=E_{11}$, and $FE=E_{22}$, \begin{align*} K^2=-(E_{11}+E_{22}). \end{align*} Therefore \begin{align*} B(K,K)=4\operatorname{tr}(K^2)=4\operatorname{tr}(-(E_{11}+E_{22}))=4(-1-1)=-8. \end{align*} The same Killing form has a positive value on $H$ and a negative value on $K$, so it is indefinite; this is the signature contrast with the compact algebra $\mathfrak{su}(2)$. [/example] The chapter's constructions form a chain: conjugation gives $\operatorname{Ad}$, differentiation gives $\operatorname{ad}$, traces of products give $B$, and the signature or degeneracy of $B$ reveals structural information. 
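For the two worked examples, the last link of this chain can be recorded as a pair of Gram matrices. This is a summary sketch assembled from the values computed above, with the basis orderings $(A,B,C)$ for $\mathfrak{su}(2)$ and $(H,E,F)$ for $\mathfrak{sl}(2,\mathbb R)$ chosen here for illustration: \begin{align*} \big(B(\cdot,\cdot)\big)_{\mathfrak{su}(2)}=\begin{pmatrix}-8&0&0\\0&-8&0\\0&0&-8\end{pmatrix},\qquad \big(B(\cdot,\cdot)\big)_{\mathfrak{sl}(2,\mathbb R)}=\begin{pmatrix}8&0&0\\0&0&4\\0&4&0\end{pmatrix}. \end{align*} The entries of the second matrix come from $B(X,Y)=8ap+4br+4cq$, which gives $B(H,H)=8$, $B(E,F)=B(F,E)=4$, and zero for the remaining pairs. The first matrix has determinant $-512$ and is negative definite. The second has determinant $8\cdot(0\cdot 0-4\cdot 4)=-128$ and eigenvalues $8,4,-4$, so it is non-degenerate but indefinite. Both determinants are non-zero, as Cartan's criterion requires for semisimple algebras, while the signatures separate the compact form from the split form.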
This chain is one of the main ways finite-dimensional Lie theory turns local differential data into global classification results. The adjoint representation has made clear that linear actions can encode deep structure inside a Lie group. We now turn to representations in general, where the same ideas are applied to arbitrary vector spaces and Schur's lemma becomes the basic tool for understanding irreducible modules. # 10. Representations: Basics and Schur's Lemma Representations let a Lie group act by linear transformations, so that nonlinear group structure can be studied through vector spaces and matrices. Chapters 6 and 9 built the bridge between a Lie group $G$ and its Lie algebra $\mathfrak g$ through differentials and the adjoint representation; this chapter adds the parallel bridge for actions, sending a group representation $\rho$ to a Lie algebra representation $d\rho$. The basic questions are how representations decompose, what maps between them can exist, and why irreducible complex representations have so few symmetries. ## Linear Actions of Lie Groups The first problem is to make precise what it means for a Lie group to act linearly on a finite-dimensional vector space while respecting the smooth structure of the group. Since $GL(V)$ is itself a Lie group, the answer is a smooth homomorphism into $GL(V)$. [definition: Finite Dimensional Representation] Let $G$ be a Lie group and let $V$ be a finite-dimensional real or complex vector space. A finite-dimensional representation of $G$ on $V$ is a smooth group homomorphism \begin{align*} \rho : G \to GL(V). \end{align*} The vector space $V$ is called the representation space. [/definition] The homomorphism condition says that $\rho(gh)=\rho(g)\rho(h)$ and $\rho(e)=\operatorname{id}_V$. Thus each group element acts by an invertible linear map, and multiplication in $G$ is transported into composition of linear maps. [example: Standard Representation Of SO Three] Let $G=SO(3)$ and $V=\mathbb R^3$. 
For each $A\in SO(3)$, define $\rho(A):\mathbb R^3\to\mathbb R^3$ by $\rho(A)v=Av$. Since $A$ is an invertible $3\times 3$ real matrix, $\rho(A)\in GL(\mathbb R^3)$. For $A,B\in SO(3)$ and $v\in\mathbb R^3$, \begin{align*} \rho(AB)v=(AB)v=A(Bv)=\rho(A)(\rho(B)v)=(\rho(A)\rho(B))v. \end{align*} Thus $\rho(AB)=\rho(A)\rho(B)$, and also \begin{align*} \rho(I)v=Iv=v=\operatorname{id}_{\mathbb R^3}(v). \end{align*} So $\rho:A\mapsto A$ is the inclusion homomorphism $SO(3)\to GL(\mathbb R^3)$, which is smooth because $SO(3)$ is a matrix Lie group inside $GL(\mathbb R^3)$; it is called the standard representation. It preserves the usual dot product because, for $A\in SO(3)$, the defining condition is $A^\top A=I$, and hence \begin{align*} (Av)\cdot(Aw)=(Av)^\top(Aw)=v^\top A^\top A w=v^\top Iw=v^\top w=v\cdot w. \end{align*} This representation is the defining action of rotations on Euclidean space, with the ordinary dot product as its invariant inner product. [/example] This example is the model case: a matrix Lie group often comes with a preferred representation by the matrices used to define it. The next construction asks what remains of a linear [group action](/page/Group%20Action) after passing from the group to its tangent space at the identity. [definition: Associated Lie Algebra Representation] Let $\rho:G\to GL(V)$ be a finite-dimensional representation of a Lie group $G$. The associated Lie algebra representation is the linear map \begin{align*} d\rho: \mathfrak g \to \mathfrak{gl}(V) \end{align*} given by the differential of $\rho$ at the identity element $e\in G$. [/definition] For $X\in\mathfrak g$, the operator $d\rho(X)$ is the infinitesimal generator of the one-parameter family $t\mapsto \rho(\exp(tX))$. The central compatibility question is whether this infinitesimal action respects the Lie bracket, since without bracket preservation it would lose the noncommutative information carried by $G$ near the identity.
Differentiating the one-parameter family $t\mapsto\rho(\exp(tX))$ gives the useful formula \begin{align*} d\rho(X)v=\frac{d}{dt}\bigg|_{t=0}\rho(\exp(tX))v, \end{align*} for every $v\in V$, and leads to the following structural result. [quotetheorem:8825] [citeproof:8825] The theorem says that a representation of $G$ always produces a representation of $\mathfrak g$, and the bracket condition is the point that prevents the differential from being merely a collection of unrelated infinitesimal operators. The smooth homomorphism hypothesis is essential. For a concrete failure, take the abelian Lie group $\mathbb R^2$ and define a smooth map $F:\mathbb R^2\to GL(\mathbb R^2)$ by $F(s,t)=\exp(sA)\exp(tB)$, where $A(e_2)=e_1$, $A(e_1)=0$, $B(e_1)=e_2$, and $B(e_2)=0$. Then $dF_0(e_1)=A$ and $dF_0(e_2)=B$, while $[e_1,e_2]=0$ in the abelian Lie algebra. On the other hand, $[A,B]=AB-BA$ sends $e_1$ to $e_1$ and $e_2$ to $-e_2$, so $[A,B]\ne 0$. Thus the derivative of a smooth non-homomorphism can fail to be a Lie algebra homomorphism. Even among homomorphisms, $d\rho$ captures only information near the identity, and hence at most information about the identity component; for example, representations of a disconnected group may differ on components away from the identity while having the same infinitesimal action. For connected groups the remaining ambiguity is global, governed by how Lie algebra data integrates through the exponential map and by the topology of the group, especially its fundamental group. This is why covering groups matter: a finite-dimensional Lie algebra representation integrates to the simply connected cover, but the resulting group representation descends to a quotient only when it is trivial on the kernel of the covering homomorphism. The next examples use $d\rho$ computationally while the later structure theory keeps track of global representation data separately. [example: Differentiating The Standard SO Three Representation] For the standard representation $\rho:SO(3)\to GL(\mathbb R^3)$ and $X\in\mathfrak{so}(3)$, we compute the differential at the identity using the curve $t\mapsto \exp(tX)$ through $I$.
Since $\rho(A)$ is just the linear map $v\mapsto Av$, for every $v\in\mathbb R^3$ we have \begin{align*} \rho(\exp(tX))v=\exp(tX)v. \end{align*} By the defining formula for the associated Lie algebra representation, \begin{align*} d\rho(X)v=\frac{d}{dt}\bigg|_{t=0}\rho(\exp(tX))v. \end{align*} Substituting the standard action gives \begin{align*} d\rho(X)v=\frac{d}{dt}\bigg|_{t=0}\exp(tX)v. \end{align*} Using the matrix exponential series $\exp(tX)=I+tX+\frac{t^2}{2}X^2+\frac{t^3}{6}X^3+\cdots$, we get \begin{align*} \exp(tX)v=v+tXv+\frac{t^2}{2}X^2v+\frac{t^3}{6}X^3v+\cdots. \end{align*} Differentiating term by term at $t=0$ leaves only the coefficient of $t$: \begin{align*} \frac{d}{dt}\bigg|_{t=0}\exp(tX)v=Xv. \end{align*} Thus $d\rho(X)v=Xv$ for every $v\in\mathbb R^3$, so $d\rho(X)=X$ as an element of $\mathfrak{gl}(\mathbb R^3)$. Therefore $d\rho:\mathfrak{so}(3)\to\mathfrak{gl}(\mathbb R^3)$ is exactly the inclusion map, sending each skew-symmetric infinitesimal rotation matrix to the same matrix acting on $\mathbb R^3$. [/example] ## Subrepresentations and Irreducibility Once a group acts linearly on $V$, the structural problem is whether the action can be understood on smaller invariant pieces. This is the representation-theoretic version of diagonalising a single operator, but now every operator $\rho(g)$ must preserve the same subspaces. [definition: Subrepresentation] Let $\rho:G\to GL(V)$ be a finite-dimensional representation. A subrepresentation is a vector subspace $W\subset V$ such that \begin{align*} \rho(g)W\subset W \end{align*} for every $g\in G$. [/definition] Since each $\rho(g)$ is invertible, the condition implies $\rho(g)W=W$ for every $g\in G$. Thus the restriction $\rho|_W:G\to GL(W)$ is itself a representation. [example: Coordinate Axis For A Circle Action] Let $G=S^1$ and let $e_1=(1,0)$, $e_2=(0,1)$ in $\mathbb C^2$. The action is \begin{align*} \rho(z)(v_1,v_2)=(zv_1,z^2v_2) \end{align*} for $z\in S^1$. 
We show that the coordinate lines $\mathbb C e_1$ and $\mathbb C e_2$ are subrepresentations by checking that each is preserved by every $\rho(z)$. Take a vector in the first coordinate line, say $a e_1=(a,0)$ with $a\in\mathbb C$. Then \begin{align*} \rho(z)(a e_1)=\rho(z)(a,0)=(za,z^2\cdot 0)=(za,0)=(za)e_1. \end{align*} Since $za\in\mathbb C$, this lies in $\mathbb C e_1$. Similarly, for a vector $b e_2=(0,b)$ in the second coordinate line, \begin{align*} \rho(z)(b e_2)=\rho(z)(0,b)=(z\cdot 0,z^2b)=(0,z^2b)=(z^2b)e_2. \end{align*} Since $z^2b\in\mathbb C$, this lies in $\mathbb C e_2$. Therefore both coordinate lines are invariant subspaces, hence subrepresentations. On $\mathbb C e_1$ the action is multiplication by the character $z\mapsto z$, while on $\mathbb C e_2$ it is multiplication by the character $z\mapsto z^2$. Thus this representation splits into two one-dimensional weight spaces. [/example] A representation with no nonzero proper invariant subspace is the basic indivisible object in the theory. These objects play the role of prime factors, although decomposition into them requires hypotheses. [definition: Irreducible Representation] A finite-dimensional representation $\rho:G\to GL(V)$ is irreducible if its only subrepresentations are $\{0\}$ and $V$. [/definition] Irreducibility depends on the ground field. A real representation can become reducible after extending scalars to $\mathbb C$, because complex eigenvectors may reveal invariant lines that do not exist over $\mathbb R$. [example: Real Rotation Representation Of The Circle] Let $S^1$ act on $\mathbb R^2$ by rotations: for $z=e^{i\theta}$, write \begin{align*} R_\theta(a,b)=(a\cos\theta-b\sin\theta,a\sin\theta+b\cos\theta). \end{align*} We first show that this real representation is irreducible. If a real line $L=\mathbb R v$ with $v\ne 0$ were preserved by every rotation, then it would be preserved by rotation through $\pi/2$. Writing $v=(a,b)$, we have \begin{align*} R_{\pi/2}v=(-b,a). 
\end{align*} The vectors $v$ and $R_{\pi/2}v$ are perpendicular because \begin{align*} v\cdot R_{\pi/2}v=(a,b)\cdot(-b,a)=-ab+ba=0. \end{align*} If $R_{\pi/2}v\in \mathbb R v$, then $R_{\pi/2}v=\lambda v$ for some $\lambda\in\mathbb R$, so \begin{align*} 0=v\cdot R_{\pi/2}v=v\cdot(\lambda v)=\lambda(a^2+b^2). \end{align*} Since $v\ne 0$, $a^2+b^2\ne 0$, hence $\lambda=0$. That would give $R_{\pi/2}v=0$, contradicting invertibility of a rotation. Thus no nonzero proper real line is invariant, so the real representation is irreducible. After complexifying, extend the same rotation operators complex-linearly to $\mathbb C^2$. For $u_+=(1,-i)$, we compute \begin{align*} R_\theta u_+=(\cos\theta+i\sin\theta,\sin\theta-i\cos\theta). \end{align*} Since $z=e^{i\theta}=\cos\theta+i\sin\theta$, the first coordinate is $z$, and \begin{align*} -i z=-i(\cos\theta+i\sin\theta)=\sin\theta-i\cos\theta. \end{align*} Therefore \begin{align*} R_\theta u_+=z(1,-i)=z u_+. \end{align*} Similarly, for $u_-=(1,i)$, \begin{align*} R_\theta u_-=(\cos\theta-i\sin\theta,\sin\theta+i\cos\theta). \end{align*} Since $z^{-1}=e^{-i\theta}=\cos\theta-i\sin\theta$, the first coordinate is $z^{-1}$, and \begin{align*} i z^{-1}=i(\cos\theta-i\sin\theta)=\sin\theta+i\cos\theta. \end{align*} Therefore \begin{align*} R_\theta u_-=z^{-1}(1,i)=z^{-1}u_-. \end{align*} The two eigenvectors are linearly independent: if $\alpha u_++\beta u_-=0$, then the two coordinates give $\alpha+\beta=0$ and $-i\alpha+i\beta=0$. The first equation gives $\beta=-\alpha$, while the second gives $\beta=\alpha$, so $\alpha=\beta=0$. Hence \begin{align*} \mathbb C^2=\mathbb C u_+\oplus \mathbb C u_-. \end{align*} The real rotation representation is irreducible over $\mathbb R$, but its complexification splits into two one-dimensional subrepresentations with characters $z\mapsto z$ and $z\mapsto z^{-1}$. 
[/example] The previous example shows that irreducibility is sensitive to both the field and the available invariant subspaces. To express a positive decomposition result, we need a construction that combines several representations while keeping their actions independent on separate summands. [definition: Direct Sum Of Representations] Let $\rho_i:G\to GL(V_i)$ be finite-dimensional representations for $i=1,\dots,n$. Their direct sum is the representation \begin{align*} \rho_1\oplus\cdots\oplus\rho_n:G\to GL(V_1\oplus\cdots\oplus V_n) \end{align*} defined by \begin{align*} (\rho_1\oplus\cdots\oplus\rho_n)(g)(v_1,\dots,v_n)=(\rho_1(g)v_1, \dots, \rho_n(g)v_n). \end{align*} [/definition] A representation that is a direct sum of irreducibles is easier to analyse, since questions reduce to the summands and to maps between them. The natural question is which Lie groups guarantee such decompositions for every finite-dimensional representation. Compact Lie groups have the decisive answer, stated here and proved in the sequel course. [quotetheorem:8826] [citeproof:8826] Compactness is the structural hypothesis that makes complete reducibility possible. Averaging over a compact group produces invariant inner products, and invariant orthogonal complements let one split off irreducible summands one at a time. The theorem does not say that every representation is irreducible, nor that the irreducible summands are unique as chosen subspaces; it says that finite-dimensional representations can be decomposed into irreducible pieces. This is the representation-theoretic analogue of diagonalising a unitary action: compactness supplies enough invariant geometry to prevent indecomposable but reducible behaviour. Once representations have been decomposed into irreducibles, the next question is how rigid maps between those pieces can be. That is the role of Schur's lemma. [quotetheorem:2414] [citeproof:2414] Each hypothesis is doing work here. 
Finite-dimensionality over $\mathbb C$ guarantees an eigenvalue of $T$; without finite-dimensionality, a complex linear operator can fail to have eigenvectors. The complex scalar field is also essential, since over $\mathbb R$ an irreducible representation can have endomorphism algebra $\mathbb R$, $\mathbb C$, or the quaternions rather than only scalar multiples of the identity. Irreducibility is what promotes one nonzero eigenspace of $T$ into all of $V$; if $V=V_1\oplus V_2$ is reducible, an intertwiner may act by different scalars on the two summands. The lemma therefore classifies endomorphisms only after irreducibility has already been established; it does not classify irreducible representations themselves. [example: Schur Lemma For Circle Characters] Let $\chi_m,\chi_n:S^1\to GL(\mathbb C)$ be the characters $\chi_m(z)v=z^m v$ and $\chi_n(z)v=z^n v$, where $m,n\in\mathbb Z$. Every complex-linear map $T:\mathbb C\to\mathbb C$ is determined by $T(1)$; if $a=T(1)$, then for every $v\in\mathbb C$, \begin{align*} T(v)=T(v\cdot 1)=vT(1)=av. \end{align*} Thus $T$ is multiplication by a scalar $a\in\mathbb C$. The intertwining condition $T\chi_m(z)=\chi_n(z)T$ means that the two sides agree on every $v\in\mathbb C$. For $z\in S^1$, \begin{align*} (T\chi_m(z))(v)=T(z^m v)=a z^m v. \end{align*} On the other hand, \begin{align*} (\chi_n(z)T)(v)=\chi_n(z)(av)=z^n av=az^n v. \end{align*} Therefore $T$ is an intertwiner exactly when \begin{align*} a z^m v=a z^n v \end{align*} for every $z\in S^1$ and every $v\in\mathbb C$. Taking $v=1$, this is equivalent to \begin{align*} a(z^m-z^n)=0 \end{align*} for every $z\in S^1$. If $m=n$, then $z^m-z^n=0$ for every $z$, so every scalar $a$ gives an intertwiner. If $m\ne n$ and $a\ne 0$, then the condition would force $z^m=z^n$ for every $z\in S^1$, hence $z^{m-n}=1$ for every $z\in S^1$. Choosing $z=e^{i\theta}$ with $\theta$ not an integer multiple of $2\pi/(m-n)$ contradicts $e^{i(m-n)\theta}=1$. Hence $a=0$. 
Thus the only intertwiner between distinct circle characters is the zero map, while the endomorphisms of a fixed character are exactly the scalar multiplications. [/example] ## Basic Constructions of Representations Representations can be built from existing ones using standard linear algebra operations. These constructions are essential because many natural representations arise from products, dual spaces, and conjugation actions rather than from a defining matrix inclusion. [definition: Trivial Representation] Let $G$ be a Lie group and let $V$ be a finite-dimensional vector space. The trivial representation on $V$ is the representation $\rho:G\to GL(V)$ defined by \begin{align*} \rho(g)=\operatorname{id}_V \end{align*} for every $g\in G$. [/definition] The trivial representation records invariant vectors: a vector $v\in V$ fixed by every $\rho(g)$ spans a trivial subrepresentation. To study products of systems with independent symmetries, however, invariant vectors are not enough; we need a way for $G$ to act on bilinear expressions built from two representation spaces. [definition: Tensor Product Representation] Let $\rho:G\to GL(V)$ and $\sigma:G\to GL(W)$ be finite-dimensional representations. Their [tensor product](/page/Tensor%20Product) is the representation \begin{align*} \rho\otimes\sigma:G\to GL(V\otimes W) \end{align*} defined on pure tensors by \begin{align*} (\rho\otimes\sigma)(g)(v\otimes w)=\rho(g)v\otimes\sigma(g)w. \end{align*} [/definition] Tensor products encode how two independent symmetries act together. They also produce new irreducible representations after decomposition, especially for compact groups. A second linear algebra operation reverses variance: it asks how the group should act on linear functionals so that evaluation remains compatible with the original action. [definition: Dual Representation] Let $\rho:G\to GL(V)$ be a finite-dimensional representation. 
The dual representation is the representation $\rho^*:G\to GL(V^*)$ defined by \begin{align*} (\rho^*(g)f)(v)=f(\rho(g^{-1})v) \end{align*} for $f\in V^*$ and $v\in V$. [/definition] The inverse in the formula is forced by the homomorphism law. Without it, the order of multiplication would reverse, giving an anti-representation. [example: Dual Of A Character] Let $\chi_m:S^1\to GL(\mathbb C)$ be the character defined by $\chi_m(z)v=z^m v$. We compute its dual action and show that it is the character $\chi_{-m}$ on the one-dimensional [dual space](/page/Dual%20Space) $\mathbb C^*$. By the definition of the dual representation, for $f\in\mathbb C^*$ and $v\in\mathbb C$, \begin{align*} (\chi_m^*(z)f)(v)=f(\chi_m(z^{-1})v). \end{align*} Since $\chi_m(z^{-1})v=(z^{-1})^m v=z^{-m}v$, this becomes \begin{align*} (\chi_m^*(z)f)(v)=f(z^{-m}v). \end{align*} Because $f$ is complex-linear and $z^{-m}\in\mathbb C$, we have \begin{align*} f(z^{-m}v)=z^{-m}f(v). \end{align*} Therefore \begin{align*} (\chi_m^*(z)f)(v)=z^{-m}f(v). \end{align*} This equality holds for every $v\in\mathbb C$, so $\chi_m^*(z)f=z^{-m}f$. Hence the dual character is $\chi_{-m}$: passing to the dual negates the exponent of a one-dimensional circle character. [/example] The dual construction turns representations into actions on functionals. The next construction is different because it is intrinsic to the group itself: every Lie group acts on its own tangent space at the identity by conjugating nearby group elements. This action packages the way the group moves its infinitesimal symmetries. [definition: Adjoint Representation] Let $G$ be a Lie group with Lie algebra $\mathfrak g$. The adjoint representation is the representation \begin{align*} \operatorname{Ad}:G\to GL(\mathfrak g) \end{align*} obtained by differentiating the conjugation map $C_g:G\to G$, $C_g(h)=ghg^{-1}$, at the identity. 
[/definition] Its derivative at the identity of $G$ is the adjoint Lie algebra representation $\operatorname{ad}:\mathfrak g\to\mathfrak{gl}(\mathfrak g)$, given by $\operatorname{ad}_X(Y)=[X,Y]$. Thus the Lie bracket is itself the infinitesimal form of conjugation. [example: Adjoint Representation Of SU Two] Write $U\in SU(2)$ and $X,Y\in\mathfrak{su}(2)$. The adjoint action is conjugation: \begin{align*} \operatorname{Ad}_U(X)=UXU^{-1}. \end{align*} This matrix is again in $\mathfrak{su}(2)$: since $U^{-1}=U^*$ and $X^*=-X$, \begin{align*} (UXU^{-1})^*=(U^{-1})^*X^*U^*=U(-X)U^{-1}=-UXU^{-1}. \end{align*} Also, by cyclic invariance of trace, \begin{align*} \operatorname{tr}(UXU^{-1})=\operatorname{tr}(XU^{-1}U)=\operatorname{tr}(X)=0. \end{align*} On the real vector space $\mathfrak{su}(2)$, use the inner product \begin{align*} \langle X,Y\rangle=-\frac{1}{2}\operatorname{tr}(XY). \end{align*} Then $\operatorname{Ad}_U$ preserves it, because $U^{-1}U=I$ gives \begin{align*} \langle \operatorname{Ad}_U X,\operatorname{Ad}_U Y\rangle=-\frac{1}{2}\operatorname{tr}(UXU^{-1}UYU^{-1})=-\frac{1}{2}\operatorname{tr}(UXYU^{-1}). \end{align*} Applying cyclic invariance of trace once more gives \begin{align*} -\frac{1}{2}\operatorname{tr}(UXYU^{-1})=-\frac{1}{2}\operatorname{tr}(XYU^{-1}U)=-\frac{1}{2}\operatorname{tr}(XY)=\langle X,Y\rangle. \end{align*} Thus $\operatorname{Ad}_U$ is an orthogonal linear map of the three-dimensional real [inner product space](/page/Inner%20Product%20Space) $\mathfrak{su}(2)$. Since $SU(2)$ is connected and $\det(\operatorname{Ad}_I)=1$, the [continuous function](/page/Continuous%20Function) $U\mapsto\det(\operatorname{Ad}_U)$ is constantly $1$, so the image lies in $SO(\mathfrak{su}(2))\cong SO(3)$. It remains to identify the kernel. If $U\in\ker(\operatorname{Ad})$, then $UXU^{-1}=X$ for every $X\in\mathfrak{su}(2)$, so $UX=XU$ for every $X\in\mathfrak{su}(2)$. 
Write $U$ in the standard $SU(2)$ form whose first row is $(a,b)$ and whose second row is $(-\overline b,\overline a)$, with $|a|^2+|b|^2=1$. Let $E_3$ be the element of $\mathfrak{su}(2)$ with diagonal entries $i$ and $-i$ and off-diagonal entries $0$. Comparing the $(1,2)$ entries in $UE_3=E_3U$ gives \begin{align*} -ib=ib. \end{align*} Hence $2ib=0$, so $b=0$. Therefore $U$ is diagonal, with diagonal entries $a$ and $\overline a$. Now let $E_2$ be the element of $\mathfrak{su}(2)$ with entries $(E_2)_{12}=1$, $(E_2)_{21}=-1$, and diagonal entries $0$. Comparing the $(1,2)$ entries in $UE_2=E_2U$ gives \begin{align*} a=\overline a. \end{align*} Thus $a$ is real. Since $|a|^2+|b|^2=1$ and $b=0$, we have $|a|=1$, so $a=\pm 1$. Hence \begin{align*} \ker(\operatorname{Ad})=\{\pm I\}. \end{align*} Therefore the adjoint representation factors through a faithful orthogonal action of $SU(2)/\{\pm I\}$ on $\mathfrak{su}(2)$. Under the usual identification $SU(2)/\{\pm I\}\cong SO(3)$, this is the standard three-dimensional rotation representation. [/example] ## The First Glimpse of SU Two The representation theory of $SU(2)$ is the central testing ground for compact Lie groups. The full classification belongs later in the course sequence, but the basic picture gives useful orientation for the examples in this chapter. The guiding question is how many irreducible finite-dimensional representations there are, and whether they can be organized by a simple invariant rather than by writing down unrelated matrices one at a time. [quotetheorem:6952] [citeproof:6952] This classification is being used here as orientation rather than as a full highest-weight theory. The representations are finite-dimensional complex representations, and the normalization is the usual angular-momentum one: a nonnegative highest weight labels the irreducible module, and the corresponding $SU(2)$ representation has dimension one more than that highest weight. 
Differentiating an $SU(2)$ representation gives a representation of the real Lie algebra $\mathfrak{su}(2)$, while the simple connectedness of $SU(2)$ is what lets the Lie algebra classification integrate back to the group. Thus the theorem belongs in this introductory chapter as a model example of the general strategy: compactness gives complete reducibility, Schur's lemma controls maps between irreducibles, and the special rank-one Lie algebra structure makes the irreducible pieces explicit. Symmetric powers of the standard two-dimensional representation are therefore not merely convenient examples; they exhaust the finite-dimensional irreducible building blocks. ## Constructing the Universal Enveloping Algebra The representation theory of Lie groups has now led to Lie algebra representations, where each $X\in\mathfrak g$ acts as a linear operator and the bracket relation becomes a commutator identity. To package all iterated infinitesimal actions into one algebraic object, we first build the free associative algebra on the vector space $\mathfrak g$ and then impose exactly the Lie commutator relations. [definition: Tensor Algebra] Let $V$ be a vector space over a field $k$. The tensor algebra of $V$ is \begin{align*} T(V)=\bigoplus_{m=0}^{\infty} V^{\otimes m}, \end{align*} with multiplication given by concatenation of tensors and with $V^{\otimes 0}=k$. [/definition] The tensor algebra contains every non-commutative word in vectors of $V$. For a Lie algebra, it is too large because it does not yet know that the Lie bracket should be represented by a commutator. We now impose that commutator relation. The universal enveloping algebra is the associative algebra in which products of elements of $\mathfrak g$ satisfy exactly the identities required by the Lie bracket. [definition: Universal Enveloping Algebra] Let $\mathfrak{g}$ be a Lie algebra over a field $k$. 
Its universal enveloping algebra is the associative unital algebra \begin{align*} U(\mathfrak{g})=T(\mathfrak{g})/I, \end{align*} where $I$ is the two-sided ideal generated by all elements \begin{align*} X\otimes Y-Y\otimes X-[X,Y] \end{align*} for $X,Y\in\mathfrak{g}$. The canonical map is the linear map \begin{align*} \iota:\mathfrak{g}\to U(\mathfrak{g}) \end{align*} obtained by composing the inclusion $\mathfrak{g}\hookrightarrow T(\mathfrak{g})$ with the quotient map. [/definition] After passing to the quotient, we usually write $XY$ for the product of $\iota(X)$ and $\iota(Y)$ in $U(\mathfrak{g})$. The defining relation becomes \begin{align*} XY-YX=\iota([X,Y]), \end{align*} which is the associative-algebra version of the Lie bracket. The construction is justified by a universal property: if $A$ is a unital associative algebra over $k$, regarded as a Lie algebra by the commutator bracket $[a,b]=ab-ba$, then every Lie algebra homomorphism $\phi:\mathfrak g\to A_{\mathrm{Lie}}$ extends uniquely to a unital algebra homomorphism $\Phi:U(\mathfrak g)\to A$ satisfying $\Phi\circ\iota=\phi$. This universal property is the reason for the name. It says that every unital associative-algebra realisation of the Lie bracket factors through $U(\mathfrak{g})$. The word "unital" matters because the tensor algebra is free in the category of unital associative algebras, and the compatibility hypothesis matters because an arbitrary linear map $\mathfrak g\to A$ need not kill the elements $X\otimes Y-Y\otimes X-[X,Y]$. The universal property does not yet say that $\iota$ is injective, nor does it give a basis for $U(\mathfrak{g})$; it only identifies which algebra maps out of $U(\mathfrak{g})$ are forced by Lie algebra maps out of $\mathfrak g$. The remaining question is whether the quotient has destroyed information about $\mathfrak{g}$ itself. [example: The Abelian Case] Let $\mathfrak{g}$ be an abelian Lie algebra with basis $e_1,\dots,e_n$, so $[e_i,e_j]=0$ for all $i,j$. 
In the tensor algebra $T(\mathfrak{g})$, the defining generators of the ideal for $U(\mathfrak{g})$ are \begin{align*} e_i\otimes e_j-e_j\otimes e_i-[e_i,e_j]=e_i\otimes e_j-e_j\otimes e_i. \end{align*} After quotienting by this ideal and writing products without tensor symbols, each pair of basis elements satisfies \begin{align*} e_ie_j-e_je_i=0. \end{align*} Equivalently, \begin{align*} e_ie_j=e_je_i. \end{align*} Thus every word in the generators can be reordered without creating any lower-degree bracket term. For example, \begin{align*} e_3e_1e_2=e_1e_3e_2 \end{align*} because $e_3e_1=e_1e_3$, and then \begin{align*} e_1e_3e_2=e_1e_2e_3 \end{align*} because $e_3e_2=e_2e_3$. Hence the quotient is generated by the commuting elements $e_1,\dots,e_n$ and has exactly the same defining commutation relations as the polynomial algebra $k[e_1,\dots,e_n]$. Therefore \begin{align*} U(\mathfrak{g})\cong k[e_1,\dots,e_n]. \end{align*} This is the symmetric-algebra case of the enveloping algebra: when the Lie bracket vanishes, sorting words produces no correction terms. [/example] The non-abelian case is more subtle because reordering generators creates lower-degree correction terms involving brackets. PBW says that these correction terms are controlled enough that ordered monomials still form a basis. ## The Poincare-Birkhoff-Witt Theorem The central structural question is whether the quotient relations introduce unexpected linear dependences among ordered words. If they did, then the canonical map $\iota:\mathfrak{g}\to U(\mathfrak{g})$ could fail to be injective. The PBW theorem rules this out and gives a concrete normal form for elements of $U(\mathfrak{g})$. [quotetheorem:8827] [citeproof:8827] PBW is the algebraic form of the idea that non-commutativity in $U(\mathfrak{g})$ is controlled by the Lie bracket and does not create hidden relations. 
Its hypotheses and conclusion should be read carefully: the theorem is a vector-space basis theorem, not a statement that $U(\mathfrak{g})$ is commutative or naturally equal to $S(\mathfrak{g})$ as an algebra. Finite-dimensionality is used here only to state the basis with finitely many ordered generators; an infinite-dimensional version uses ordered monomials with finite support. By contrast, if one imposed any extra relation beyond the Lie commutator relation, for example forcing $e_1^2=0$ in addition to the defining relations, the PBW monomials would no longer be linearly independent. The theorem also explains why computations in enveloping algebras resemble computations in polynomial rings, except that reordering terms produces lower-degree bracket corrections. [example: Reordering in a Two-Generator Lie Algebra] Suppose $\mathfrak{g}$ has basis $e_1,e_2$ with $[e_1,e_2]=e_1$, and order the basis by $e_1<e_2$. By the *[Poincare--Birkhoff--Witt Theorem](/theorems/8827)*, the ordered monomials $e_1^ae_2^b$ form a vector-space basis of $U(\mathfrak{g})$. The defining relation in $U(\mathfrak{g})$ is \begin{align*} e_1e_2-e_2e_1=[e_1,e_2]=e_1. \end{align*} Solving this relation for $e_2e_1$ gives the rewriting rule \begin{align*} e_2e_1=e_1e_2-e_1. \end{align*} We now sort the word $e_2e_1^2$ into PBW order. By associativity, \begin{align*} e_2e_1^2=(e_2e_1)e_1. \end{align*} Using $e_2e_1=e_1e_2-e_1$, \begin{align*} (e_2e_1)e_1=(e_1e_2-e_1)e_1. \end{align*} Distributing the product on the right, \begin{align*} (e_1e_2-e_1)e_1=e_1e_2e_1-e_1^2. \end{align*} The remaining unsorted adjacent pair is again $e_2e_1$, so associativity and the same rewriting rule give \begin{align*} e_1e_2e_1=e_1(e_2e_1)=e_1(e_1e_2-e_1). \end{align*} Distributing once more, \begin{align*} e_1(e_1e_2-e_1)=e_1^2e_2-e_1^2. \end{align*} Substituting this into the previous expression yields \begin{align*} e_2e_1^2=(e_1^2e_2-e_1^2)-e_1^2=e_1^2e_2-2e_1^2. 
\end{align*} Thus sorting the word produces the ordered PBW term $e_1^2e_2$ together with the lower-degree correction term $-2e_1^2$, and that correction comes exactly from the bracket $[e_1,e_2]=e_1$. [/example] The proof also reveals a useful slogan: $U(\mathfrak{g})$ is a filtered deformation of $S(\mathfrak{g})$. The associated graded object is commutative, while the original algebra keeps the Lie bracket in its commutators. ## Modules over the Enveloping Algebra Once PBW guarantees that $\mathfrak{g}$ survives inside $U(\mathfrak{g})$, the next question is whether representations of $\mathfrak{g}$ have really become ordinary modules. The universal property gives the answer, and PBW ensures that no information is lost at the level of generators. Concretely, for a Lie algebra representation \begin{align*} \rho:\mathfrak g\to\mathfrak{gl}(V) \end{align*} there is a unique unital algebra action of $U(\mathfrak g)$ on $V$ extending $\rho$. Conversely, any unital left $U(\mathfrak g)$-module restricts along the canonical map $\mathfrak g\to U(\mathfrak g)$ to a representation of $\mathfrak g$. These two constructions are inverse to each other. This result is the practical payoff of the enveloping algebra. The defining commutator relations are essential: if an associative algebra generated by symbols from $\mathfrak{g}$ failed to impose $XY-YX=[X,Y]$, then restricting a module action would not necessarily give a Lie algebra representation. The unit is also part of the comparison, since a unital $U(\mathfrak{g})$-module requires $1$ to act as the identity on $V$; allowing non-unital actions would enlarge the module category artificially. The equivalence does not classify representations, prove complete reducibility, or determine irreducible modules; it only translates the representation problem into module theory over a particular associative algebra. 
Instead of repeatedly checking Lie bracket identities, we can use ideals, quotients, central characters, and module-theoretic constructions inside $U(\mathfrak{g})$. [example: Highest-Weight Computations for $\mathfrak{sl}(2)$] Let $\mathfrak{sl}(2)$ have basis $E,F,H$ with \begin{align*} [H,E]=2E, \qquad [H,F]=-2F, \qquad [E,F]=H. \end{align*} Order the basis as $F<H<E$. By the *Poincare--Birkhoff--Witt Theorem*, every element of $U(\mathfrak{sl}(2))$ is a unique finite linear combination of monomials \begin{align*} F^aH^bE^c, \qquad a,b,c\ge 0. \end{align*} Let $V$ be a highest-weight module generated by $v$, with $Ev=0$ and $Hv=\lambda v$. Since $V$ is generated by $v$, every vector in $V$ is a finite linear combination of vectors of the form $uv$ with $u\in U(\mathfrak{sl}(2))$. It is therefore enough to compute the action of one PBW monomial: \begin{align*} F^aH^bE^c v=F^aH^b(E^c v). \end{align*} If $c>0$, then $E^c v=E^{c-1}(Ev)=E^{c-1}0=0$, so \begin{align*} F^aH^bE^c v=0. \end{align*} If $c=0$, then $E^0=1$ and repeated use of $Hv=\lambda v$ gives \begin{align*} H^b v=\lambda^b v. \end{align*} Thus \begin{align*} F^aH^bE^0v=F^a(H^b v)=F^a(\lambda^b v)=\lambda^b F^a v. \end{align*} So every vector generated from $v$ is a linear combination of the lowering-string vectors $F^a v$. PBW is what makes this reduction precise: all raising operators $E$ can be placed on the right, where they kill the highest-weight vector. [/example] The example shows how PBW turns an abstract module into a computable object: choose an order, move raising operators to one side, and keep track of the bracket terms created during reordering. The centre of $U(\mathfrak{g})$ then supplies operators that commute with every Lie algebra action. ## The Centre and Casimir Elements Which elements of $U(\mathfrak{g})$ act uniformly across a representation, in the sense that they commute with all infinitesimal symmetries? These are the central elements of the enveloping algebra. 
For semisimple Lie algebras, the most important first example is the quadratic Casimir element. [definition: Centre of the Universal Enveloping Algebra] Let $\mathfrak{g}$ be a Lie algebra over $k$. The centre of $U(\mathfrak{g})$ is \begin{align*} Z(U(\mathfrak{g}))=\{u\in U(\mathfrak{g}) : uv=vu \text{ for all } v\in U(\mathfrak{g})\}. \end{align*} [/definition] Because $U(\mathfrak{g})$ is generated by $\mathfrak{g}$, an element is central exactly when it commutes with every $X\in\mathfrak{g}$. This makes central elements useful in representation theory: they define endomorphisms of every $\mathfrak{g}$-module. The next theorem constructs a central element from the invariant bilinear form available for semisimple Lie algebras. [quotetheorem:8828] [citeproof:8828] The Casimir packages the invariant bilinear form into a canonical second-order element of the enveloping algebra, and the hypotheses explain why this construction is special. Semisimplicity in characteristic $0$ ensures that the Killing form is non-degenerate; for an abelian non-zero Lie algebra the Killing form is identically zero, so there is no Killing-[dual basis](/theorems/414) and this particular formula cannot even be formed. Finite-dimensionality is also built into the displayed sum, since it uses a finite basis and a finite inverse matrix for $B$. In any irreducible representation over an [algebraically closed field](/page/Algebraically%20Closed%20Field) satisfying Schur's lemma hypotheses, the Casimir acts by a scalar, so it becomes a powerful label for representations. [example: The Casimir for $\mathfrak{sl}(2)$] For $\mathfrak{sl}(2)$ with generators $E,F,H$ satisfying \begin{align*} [H,E]=2E, \qquad [H,F]=-2F, \qquad [E,F]=H, \end{align*} take the irreducible spin-$j$ representation with highest-weight vector $v_0$ satisfying \begin{align*} Ev_0=0, \qquad Hv_0=2jv_0. \end{align*} Set $v_r=F^rv_0$ for $0\le r\le 2j$. 
From $[H,F]=-2F$, we have $HF=FH-2F$, so induction gives \begin{align*} HF^r=F^rH-2rF^r. \end{align*} Applying this to $v_0$ gives \begin{align*} Hv_r=HF^rv_0=(F^rH-2rF^r)v_0=(2j-2r)v_r. \end{align*} We next compute the action of $E$ on this basis. Since $[E,F]=H$, the commutator identity $[E,F^r]=\sum_{a=0}^{r-1}F^a[E,F]F^{r-1-a}$ gives \begin{align*} [E,F^r]=\sum_{a=0}^{r-1}F^aHF^{r-1-a}. \end{align*} Because $Ev_0=0$, this yields \begin{align*} EF^rv_0=[E,F^r]v_0=\sum_{a=0}^{r-1}F^aHF^{r-1-a}v_0. \end{align*} For each summand, \begin{align*} HF^{r-1-a}v_0=(2j-2(r-1-a))F^{r-1-a}v_0. \end{align*} Therefore \begin{align*} EF^rv_0=\sum_{a=0}^{r-1}(2j-2r+2+2a)F^{r-1}v_0. \end{align*} The scalar sum is \begin{align*} \sum_{a=0}^{r-1}(2j-2r+2+2a)=r(2j-2r+2)+2\frac{r(r-1)}{2}=r(2j-r+1), \end{align*} so \begin{align*} Ev_r=r(2j-r+1)v_{r-1}. \end{align*} Now let \begin{align*} C=\frac{1}{2}H^2+EF+FE. \end{align*} On $v_r$, the three terms are \begin{align*} \frac{1}{2}H^2v_r=\frac{1}{2}(2j-2r)^2v_r, \end{align*} \begin{align*} EFv_r=Ev_{r+1}=(r+1)(2j-r)v_r, \end{align*} and \begin{align*} FEv_r=F\bigl(r(2j-r+1)v_{r-1}\bigr)=r(2j-r+1)v_r. \end{align*} Adding the coefficients gives \begin{align*} \frac{1}{2}(2j-2r)^2+(r+1)(2j-r)+r(2j-r+1)=2j(j+1). \end{align*} Hence \begin{align*} Cv_r=2j(j+1)v_r \end{align*} for every basis vector $v_r$. Thus $C$ acts on the whole irreducible spin-$j$ representation as the scalar operator \begin{align*} C=2j(j+1)I. \end{align*} The calculation shows concretely how the central Casimir collapses to one scalar on an irreducible representation while still being built from the non-commuting operators $E,F,H$. [/example] This final example ties together all parts of the chapter. 
PBW gives a basis for doing algebra in $U(\mathfrak{sl}(2))$, the module equivalence turns Lie algebra representations into module actions, and the Casimir element produces a central operator whose scalar value distinguishes irreducible representations in the basic family. The algebraic machinery is now in place to study the compact case through integration and harmonic analysis. Haar measure, the modular function, and the [Peter--Weyl theorem](/theorems/8833) will show how the representation theory of compact Lie groups is reflected in their spaces of functions. # 11. Integration on Lie Groups and the Peter–Weyl Theorem (Overview) ## Translation-Invariant Integration This chapter introduces the integration and representation-theoretic tools needed for the compact part of the course: Haar measure, the modular function, regular representations, and the Peter--Weyl theorem. The prerequisites are the earlier chapters on Lie groups, Lie algebras, compactness, and finite-dimensional complex representations; the smooth densities used here provide the integration language needed to define Haar measure. A Lie group has no preferred coordinates, so ordinary coordinate integration cannot be the invariant object. The right question is whether there is a measure on $G$ for which translating a set by a group element does not change its size. Such a measure is the analytic substitute for volume in the absence of a Riemannian metric chosen in advance. The measure-theoretic terminology in this chapter is standard but worth fixing at first use. The Borel $\sigma$-algebra $\mathcal B(G)$ is the collection of subsets generated from open sets by countable unions, countable intersections, and complements. A Radon Borel measure is a measure on $\mathcal B(G)$ that is finite on compact sets and regular, meaning that its values can be approximated from compact subsets inside a set and open sets containing it. 
If $f:G\to H$ is continuous and $\mu$ is a measure on $G$, the pushforward $f_*\mu$ is defined by $(f_*\mu)(E)=\mu(f^{-1}(E))$ for Borel sets $E\subseteq H$. Later, $L^2(G,\nu)$ will mean square-integrable complex functions modulo equality outside a $\nu$-measure-zero set; strong continuity of a Hilbert-space representation means that every orbit map is norm-continuous; and a Hilbert direct sum $\bigoplus_i H_i$ consists of vectors $(v_i)_i$ with $\sum_i\|v_i\|^2<\infty$. [definition: Left Haar Measure] Let $G$ be a Lie group and let $\mathcal B(G)$ be its Borel $\sigma$-algebra. A left Haar measure on $G$ is a nonzero Radon Borel measure $\nu:\mathcal B(G)\to[0,\infty]$, finite on compact sets and regular on Borel sets, such that $\nu(gA)=\nu(A)$ for every $g \in G$ and every Borel set $A \subset G$. [/definition] The condition says that integration satisfies \begin{align*} \int_G f(gx)\,d\nu(x)=\int_G f(x)\,d\nu(x) \end{align*} for all compactly supported continuous functions $f:G\to \mathbb C$. The definition is useful only if such measures exist and are essentially unique, so the first theorem supplies the invariant integration theory used throughout the chapter. [quotetheorem:1063] [citeproof:1063] This theorem is the entry point for harmonic analysis on groups: it supplies the integration functional that representation theory will use. The Lie group hypothesis is more than decorative: smooth local coordinates give locally compact Hausdorff topology and smooth densities, which are the analytic inputs behind existence. Uniqueness up to scalar is also deliberately weaker than a canonical normalisation; on a noncompact group there is usually no finite total mass condition that singles out one measure. Most importantly, left invariance alone does not imply right invariance, so the theorem gives an invariant integration theory for one side of the group operation but leaves open whether the opposite translations preserve the same integral. 
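As a concrete check of the definition, consider the multiplicative group $\mathbb R_{>0}$. This group and the density $dx/x$ are illustrative choices that do not appear in the text above, and the verification is a sketch carried out only on compact intervals rather than on all Borel sets. Define $\nu$ on an interval $[\alpha,\beta]\subset\mathbb R_{>0}$ with $\alpha<\beta$ by \begin{align*} \nu([\alpha,\beta])=\int_\alpha^\beta\frac{dx}{x}=\log\beta-\log\alpha. \end{align*} For $g\in\mathbb R_{>0}$ the translate is $g[\alpha,\beta]=[g\alpha,g\beta]$, and \begin{align*} \nu([g\alpha,g\beta])=\log(g\beta)-\log(g\alpha)=\log\beta-\log\alpha=\nu([\alpha,\beta]). \end{align*} Lebesgue measure $\lambda$ fails the same test, since \begin{align*} \lambda([g\alpha,g\beta])=g(\beta-\alpha)=g\,\lambda([\alpha,\beta]), \end{align*} which equals $\lambda([\alpha,\beta])$ only when $g=1$. In integral form, the substitution $y=gx$ gives $dy/y=dx/x$, so for compactly supported continuous $f:\mathbb R_{>0}\to\mathbb C$, \begin{align*} \int_0^\infty f(gx)\,\frac{dx}{x}=\int_0^\infty f(y)\,\frac{dy}{y}. \end{align*} The same measure is the pushforward of Lebesgue measure on $\mathbb R$ under $\exp:\mathbb R\to\mathbb R_{>0}$, because $\exp^{-1}([\alpha,\beta])=[\log\alpha,\log\beta]$ has length $\log\beta-\log\alpha$.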
[definition: Right Haar Measure] Let $G$ be a Lie group and let $\mathcal B(G)$ be its Borel $\sigma$-algebra. A right Haar measure on $G$ is a nonzero Radon Borel measure $\nu:\mathcal B(G)\to[0,\infty]$, finite on compact sets and regular on Borel sets, such that $\nu(Ag)=\nu(A)$ for every $g \in G$ and every Borel set $A \subset G$. [/definition] Left and right invariance agree for abelian groups, but noncommutative groups need not have a measure with both symmetries. The obstruction is measured by the modular function, but before introducing it we record the compact case where the obstruction disappears. [example: Haar Measure on Euclidean Space] Regard $\mathbb R^n$ as a Lie group under addition. For $u\in\mathbb R^n$ and a Borel set $A\subset\mathbb R^n$, left translation and right translation are the same map because \begin{align*} u+A=\{u+x:x\in A\}=\{x+u:x\in A\}=A+u. \end{align*} [Lebesgue measure](/page/Lebesgue%20Measure) is translation-invariant, so \begin{align*} \mathcal L^n(u+A)=\mathcal L^n(A)=\mathcal L^n(A+u). \end{align*} It is nonzero, Radon, finite on compact sets, and regular on Borel sets; hence $\mathcal L^n$ is both a left and a right Haar measure on $\mathbb R^n$. If $\mu$ is any translation-invariant regular Borel measure on $\mathbb R^n$ that is finite on compact sets and not identically zero, then $\mu$ is a left Haar measure. By *Existence and Uniqueness of Left Haar Measure*, there is a constant $c>0$ such that \begin{align*} \mu=c\mathcal L^n. \end{align*} Thus ordinary Lebesgue integration is exactly Haar integration on the additive Lie group $\mathbb R^n$, with the only freedom being multiplication by a positive scalar. [/example] This example recovers the familiar [translation invariance](/theorems/4911) of Lebesgue integration as the first case of Haar theory. Compact groups are the next major case because total mass can be normalised, and that normalisation forces the left Haar measure to respect right translations as well. 
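The normalisation can be made explicit on the circle group $S^1$. As a sketch, assume the angular coordinate $z=e^{i\theta}$ with $\theta\in[0,2\pi)$ and take $\nu$ to be the measure with density $d\theta/2\pi$; this choice of coordinate and density is an illustration rather than something fixed by the text above. The total mass is \begin{align*} \nu(S^1)=\int_0^{2\pi}\frac{d\theta}{2\pi}=1. \end{align*} For $w=e^{i\varphi}$ and a continuous function $f:S^1\to\mathbb C$, the integrand $\theta\mapsto f(e^{i\theta})$ is $2\pi$-periodic, so shifting the angle does not change the integral over a full period: \begin{align*} \int_0^{2\pi} f(e^{i(\theta+\varphi)})\,\frac{d\theta}{2\pi}=\int_0^{2\pi} f(e^{i\theta})\,\frac{d\theta}{2\pi}. \end{align*} Since $S^1$ is abelian, left and right translation coincide, so this example illustrates only the normalisation, not the passage from left to right invariance. For the character $\chi_m(z)=z^m$ with $m\ne 0$, \begin{align*} \int_0^{2\pi} e^{im\theta}\,\frac{d\theta}{2\pi}=\frac{e^{2\pi i m}-1}{2\pi i m}=0, \end{align*} while for $m=0$ the integral is $1$.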
[quotetheorem:8829] [citeproof:8829]

Compactness enters through the finite total mass normalisation: the proof compares $\nu$ and its right translate by evaluating both on the whole group. The theorem does not say that left Haar measure on a noncompact group must fail to be right invariant; for example $\mathbb R^n$ and many semisimple Lie groups are noncompact but still unimodular. What it does say is that compactness removes the possible rescaling mechanism. In noncompact solvable examples, such as the affine group of the line below, right translation can multiply a left Haar measure by a nontrivial factor, so uniqueness of left Haar measure by itself is not enough to give a bi-invariant measure.

## The Modular Function and Unimodularity

The preceding theorem leaves a question: when $G$ is not compact, how far can left and right Haar measures differ? The answer is encoded by a continuous homomorphism from $G$ to the positive real numbers. This homomorphism is invisible for compact groups but important for solvable and noncompact examples.

[definition: Modular Function]
Let $G$ be a Lie group with left Haar measure $\nu$. The modular function is the map $\Delta:G\to \mathbb R_+$ determined by \begin{align*} \nu(Ag)=\Delta(g)^{-1}\nu(A) \end{align*} for every $g\in G$ and every Borel set $A\subset G$.
[/definition]

The definition is independent of the choice of left Haar measure because two left Haar measures differ by a positive scalar. To use $\Delta$ as a Lie-theoretic invariant, we need to know that it respects multiplication and that its triviality, $\Delta\equiv 1$, is exactly bi-invariance of Haar measure.

[quotetheorem:8830] [citeproof:8830]

This theorem turns the analytic question of right invariance into the algebraic question of whether $G$ admits a nontrivial continuous character into $\mathbb R_+$.
The hypotheses matter because $\Delta$ is built from comparing right translates of a left Haar measure; without uniqueness up to scalar, there would be no single rescaling factor to record. The obstruction can genuinely be nontrivial: in the affine group $(a,b):x\mapsto ax+b$, dilation changes the left-invariant volume by a power of $a$, so $\Delta$ detects the expansion direction. Thus $\Delta=1$ is not a formal consequence of being a Lie group, and verifying unimodularity is a real step before using the same measure for left and right regular actions. The main families in this course are compact and semisimple groups, so we next identify why both families are safe from modular corrections.

[definition: Unimodular Lie Group]
A Lie group $G$ is unimodular if its modular function satisfies $\Delta(g)=1$ for every $g\in G$.
[/definition]

Unimodularity is the condition that left Haar integration can also be used without correction under right translations. The obstruction is that right translation may rescale a left Haar measure by the modular character, so regular representation formulas can acquire unwanted correction factors. For the compact theory developed later, we need structural criteria ensuring this obstruction disappears; compactness removes it through finite invariant mass, while connected semisimplicity removes it through the trace behaviour of the adjoint representation.

[quotetheorem:8831] [citeproof:8831]

The modular function will not play a major role in the compact Peter--Weyl theorem, but it explains why compactness is analytically convenient. The connectedness hypothesis in the semisimple case is doing work: proving that the differential of $\Delta$ vanishes controls the identity component, and disconnected extensions can require a separate check on the component group.
Semisimplicity also gives only unimodularity, not compactness or finite Haar volume; groups such as $SL(2,\mathbb R)$ are semisimple and unimodular but still noncompact, so their representation theory is not governed by the compact Peter--Weyl decomposition. On a compact group, by contrast, the single normalised Haar measure supports both left and right regular actions.

[example: A Nonunimodular Affine Group]
Let $G=\{(a,b):a>0,\ b\in\mathbb R\}$ with multiplication \begin{align*} (a,b)(a',b')=(aa',b+ab'). \end{align*} This is the group of orientation-preserving affine transformations $x\mapsto ax+b$. We verify the behaviour of the measure \begin{align*} d\nu(a,b)=a^{-2}\,da\,db. \end{align*}

First fix $(\alpha,\beta)\in G$. Left translation sends $(a,b)$ to \begin{align*} L_{(\alpha,\beta)}(a,b)=(\alpha,\beta)(a,b)=(\alpha a,\beta+\alpha b). \end{align*} In coordinates $(a,b)$, its derivative matrix is \begin{align*} \begin{pmatrix}\alpha&0\cr 0&\alpha\end{pmatrix}, \end{align*} so the Jacobian determinant is $\alpha^2$. If $a'=\alpha a$ and $b'=\beta+\alpha b$, then \begin{align*} da'\,db'=\alpha^2\,da\,db. \end{align*} Also $a'^{-2}=(\alpha a)^{-2}=\alpha^{-2}a^{-2}$, hence \begin{align*} a'^{-2}\,da'\,db'=(\alpha^{-2}a^{-2})(\alpha^2\,da\,db)=a^{-2}\,da\,db. \end{align*} Thus $a^{-2}\,da\,db$ is invariant under left translation, so it is a left Haar measure on $G$.

Now compute the effect of right translation. Right multiplication by $(\alpha,\beta)$ sends \begin{align*} R_{(\alpha,\beta)}(a,b)=(a,b)(\alpha,\beta)=(a\alpha,b+a\beta). \end{align*} Its derivative matrix in the coordinates $(a,b)$ is \begin{align*} \begin{pmatrix}\alpha&0\cr \beta&1\end{pmatrix}, \end{align*} so the Jacobian determinant is $\alpha$. If $a'=a\alpha$ and $b'=b+a\beta$, then \begin{align*} da'\,db'=\alpha\,da\,db. \end{align*} Also $a'^{-2}=(a\alpha)^{-2}=a^{-2}\alpha^{-2}$, so \begin{align*} a'^{-2}\,da'\,db'=(a^{-2}\alpha^{-2})(\alpha\,da\,db)=\alpha^{-1}a^{-2}\,da\,db. \end{align*}

Therefore, for every Borel set $A\subset G$, \begin{align*} \nu(A(\alpha,\beta))=\alpha^{-1}\nu(A). \end{align*} Since the modular function is defined by $\nu(Ag)=\Delta(g)^{-1}\nu(A)$, this gives \begin{align*} \Delta(\alpha,\beta)=\alpha. \end{align*} For instance, $(2,0)$ has $\Delta(2,0)=2$, so this affine group is not unimodular; the example shows how a solvable noncompact group can fail to have the same Haar measure on the left and on the right.
[/example]

This affine example also warns that left-invariant volume forms and right-invariant volume forms can encode different global geometry. For compact Lie groups the distinction vanishes, and the representation theory can be expressed using a single [Hilbert space](/page/Hilbert%20Space).

## Regular Representations on Square-Integrable Functions

Once $G$ has a Haar measure, functions on $G$ form natural representation spaces. For compact $G$, the normalised Haar measure gives the Hilbert space $L^2(G)$, and both left and right translation act unitarily. The question is how this large representation decomposes into irreducible finite-dimensional pieces.

[definition: Left and Right Regular Representations]
Let $G$ be a compact Lie group with normalised Haar measure $\nu$. The left and right regular representation maps are \begin{align*} L &: G\to U(L^2(G,\nu)), & R &: G\to U(L^2(G,\nu)), \end{align*} where for each $g\in G$ the operators $L_g,R_g:L^2(G,\nu)\to L^2(G,\nu)$ are defined by \begin{align*} (L_g f)(x)=f(g^{-1}x), \qquad (R_g f)(x)=f(xg) \end{align*} for $x\in G$ and $f\in L^2(G,\nu)$.
[/definition]

The inverse in the formula for $L_g$ makes $g\mapsto L_g$ a representation rather than an anti-representation. To make $L^2(G)$ into a representation of $G\times G$, we must check that the left and right actions are unitary and commute.
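These properties can be tested by brute force on a finite group, which is a compact Lie group of dimension zero whose normalised Haar measure is normalised counting measure. The sketch below uses the symmetric group $S_3$ as a stand-in; the choice of group, the test function `f`, and all helper names are assumptions made for illustration, and a check on one function is evidence rather than a proof.

```python
from itertools import permutations

GROUP = list(permutations(range(3)))  # the six elements of S_3 as tuples


def compose(p, q):
    """Group product pq: apply q first, then p."""
    return tuple(p[q[i]] for i in range(3))


def inverse(p):
    """Inverse permutation."""
    inv = [0] * 3
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def left(g, f):
    """(L_g f)(x) = f(g^{-1} x)."""
    g_inv = inverse(g)
    return {x: f[compose(g_inv, x)] for x in GROUP}


def right(g, f):
    """(R_g f)(x) = f(x g)."""
    return {x: f[compose(x, g)] for x in GROUP}


def naive_left(g, f):
    """Left translation written without the inverse: x -> f(g x)."""
    return {x: f[compose(g, x)] for x in GROUP}


def norm_sq(f):
    """Squared L^2 norm for normalised counting measure on S_3."""
    return sum(v.real ** 2 + v.imag ** 2 for v in f.values()) / len(GROUP)


# An arbitrary test function with six distinct values (illustrative choice).
f = {x: complex(k + 1, k * k) for k, x in enumerate(GROUP)}

pairs = [(g, h) for g in GROUP for h in GROUP]
is_representation = all(
    left(g, left(h, f)) == left(compose(g, h), f) for g, h in pairs
)
actions_commute = all(
    left(g, right(h, f)) == right(h, left(g, f)) for g, h in pairs
)
norms_preserved = all(
    abs(norm_sq(left(g, f)) - norm_sq(f)) < 1e-9
    and abs(norm_sq(right(g, f)) - norm_sq(f)) < 1e-9
    for g in GROUP
)
naive_is_representation = all(
    naive_left(g, naive_left(h, f)) == naive_left(compose(g, h), f)
    for g, h in pairs
)

print(is_representation, actions_commute, norms_preserved, naive_is_representation)
```

The script prints `True True True False`. The first three flags reflect $L_gL_h=L_{gh}$, $L_gR_h=R_hL_g$, and norm preservation; the last shows that dropping the inverse reverses the order of multiplication, which fails to be a homomorphism as soon as the group is nonabelian.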
[quotetheorem:8832] [citeproof:8832]

The commutation result gives a legitimate $G\times G$-action, but it does not yet decompose $L^2(G)$ or produce any finite-dimensional summands. Compactness and unimodularity are essential for the statement as written: left and right Haar invariance make both translations unitary for the same Hilbert norm. On a nonunimodular group, right translation changes a left Haar measure by the modular factor, so the uncorrected right regular operators are not unitary on $L^2(G,\nu)$; one must insert modular weights or change the representation-theoretic framework.

The regular representation is also too large to understand directly, so we need explicit finite-dimensional functions inside it. Matrix coefficients provide those functions by converting vectors and covectors in a representation into scalar functions on the group.

[definition: Matrix Coefficient]
Let $\pi:G\to GL(V_\pi)$ be a finite-dimensional complex representation of a compact Lie group. For $v\in V_\pi$ and $\lambda\in V_\pi^*$, the associated matrix coefficient is the function $c_{\lambda,v}:G\to\mathbb C$ defined by \begin{align*} c_{\lambda,v}(g)=\lambda(\pi(g)v). \end{align*}
[/definition]

Matrix coefficients are continuous, and for smooth representations of Lie groups they are smooth. The decisive question is whether these finite-dimensional pieces merely give examples inside $L^2(G)$ or whether they recover the entire Hilbert space.

[quotetheorem:8833] [citeproof:8833]

The Peter--Weyl theorem is the point where compactness turns representation theory into a Fourier theory on $G$. Matrix coefficients are not just isolated examples of functions on the group: after completing with respect to Haar measure, they supply orthogonal finite-dimensional building blocks for all of $L^2(G,\nu)$. The theorem also records the two-sided regular action, so the decomposition remembers both left and right translation rather than treating $L^2(G)$ as an abstract Hilbert space.
This is why compact groups can be studied through irreducible unitary representations in a way that directly generalises [Fourier series](/page/Fourier%20Series) on the circle and finite Fourier analysis on finite groups.

[remark: Overview Rather Than Full Harmonic Analysis]
The full Peter--Weyl theorem includes stronger uniform approximation statements for continuous functions and precise Schur orthogonality relations for matrix coefficients. In this course, the theorem is used as a bridge from foundational Lie theory to compact representation theory. The main takeaway is that compactness converts the nonlinear object $G$ into a Hilbert-space direct sum of finite-dimensional representation data.
[/remark]

The chapter closes the foundational arc: local Lie algebra methods describe the infinitesimal behaviour, covering theory controls global topology beyond a neighbourhood of the identity, Haar measure enables invariant integration, and Peter--Weyl turns compact groups into unitary matrix groups. The next natural course would develop highest weights, roots, Weyl groups, and character formulae from this analytic starting point.

## Beyond and Connections

The main bridge out of this course is from compact matrix groups to the structure theory of compact Lie groups. The examples of $SU(2)$ and $SO(3)$ point toward root systems, highest-weight theory, and the classification of irreducible representations. The covering-space material connects the local Lie algebra to global topology, while Haar measure and Peter--Weyl connect Lie theory to harmonic analysis, Fourier series on groups, and unitary representation theory.
Related public course notes include [Cambridge II Representation Theory](/page/Cambridge%20II%20Representation%20Theory), [Cambridge III Differential Geometry](/page/Cambridge%20III%20Differential%20Geometry), [Cambridge II Algebraic Topology](/page/Cambridge%20II%20Algebraic%20Topology), and [Androma Graduate Harmonic Analysis](/page/Androma%20Graduate%20Harmonic%20Analysis). Useful theorem-level routes through the same material include [Cartan's semisimplicity criterion](/theorems/3813), [Schur's lemma](/theorems/2414), the [$\mathfrak{su}(2)$ irreducible classification](/theorems/6952), and the [universal enveloping algebra representation correspondence](/theorems/3777). These themes also feed into differential geometry through homogeneous spaces and into mathematical physics through symmetry, angular momentum, and spin representations.

Created by admin on 6/21/2026 | Last updated on 6/21/2026
