[proofplan]
We work in the quotient algebra $\mathbb{R}[z]/(p_A)$, where $p_A$ is the characteristic polynomial of $A$. Since $p_A$ is monic of degree $n$, every element of this quotient has a unique representative of degree at most $n-1$, so the quotient exponential of the class of $tz$ has unique coordinate functions $\alpha_0(t),\dots,\alpha_{n-1}(t)$ with respect to the classes of $1,z,\dots,z^{n-1}$. The [Cayley-Hamilton Theorem](/theorems/865) makes evaluation at $A$ descend from $\mathbb{R}[z]$ to the quotient, and applying the descended evaluation map to the quotient exponential gives the ordinary matrix exponential.
[/proofplan]
[step:Build the quotient algebra where all powers reduce to degree less than $n$]
Let $I_n \in \mathbb{R}^{n \times n}$ denote the identity matrix. Let $p_A \in \mathbb{R}[z]$ denote the characteristic polynomial of $A$, defined by
\begin{align*}
p_A(z) = \det(z I_n - A).
\end{align*}
Then $p_A$ is monic of degree $n$. Define the real quotient algebra
\begin{align*}
Q_A := \mathbb{R}[z]/(p_A),
\end{align*}
and let
\begin{align*}
\pi_A: \mathbb{R}[z] \to Q_A
\end{align*}
be the quotient map. Define
\begin{align*}
\xi := \pi_A(z) \in Q_A.
\end{align*}
Because $p_A$ is monic of degree $n$, polynomial division gives the following existence and uniqueness statement: for every $f \in \mathbb{R}[z]$, there are unique polynomials $q_f,r_f \in \mathbb{R}[z]$ such that $\deg r_f < n$ and
\begin{align*}
f = q_f p_A + r_f.
\end{align*}
Therefore every class in $Q_A$ has a unique representative of the form
\begin{align*}
c_0 + c_1 z + \cdots + c_{n-1} z^{n-1},
\end{align*}
with $c_0,\dots,c_{n-1} \in \mathbb{R}$. Hence
\begin{align*}
\pi_A(1), \pi_A(z), \dots, \pi_A(z^{n-1})
\end{align*}
is a basis of the finite-dimensional real [vector space](/page/Vector%20Space) $Q_A$.
[guided]
The goal is to make the phrase “reduce powers of $A$ using the characteristic polynomial” into a precise algebraic construction. We first perform the reduction before substituting $A$.
Let $I_n \in \mathbb{R}^{n \times n}$ denote the identity matrix. Let $p_A \in \mathbb{R}[z]$ be the characteristic polynomial of $A$:
\begin{align*}
p_A(z) = \det(z I_n - A).
\end{align*}
This polynomial is monic of degree $n$. We form the quotient algebra
\begin{align*}
Q_A := \mathbb{R}[z]/(p_A),
\end{align*}
and write
\begin{align*}
\pi_A: \mathbb{R}[z] \to Q_A
\end{align*}
for the quotient map. The element that plays the role of the variable inside this quotient is
\begin{align*}
\xi := \pi_A(z).
\end{align*}
Why use this quotient? In $Q_A$, the polynomial $p_A$ is declared to be zero. Since $p_A$ is monic of degree $n$, the relation $p_A(\xi)=0$ expresses $\xi^n$ as a linear combination of $1,\xi,\dots,\xi^{n-1}$, and by induction the same holds for every power $\xi^m$ with $m \geq n$.
We now justify that this reduction is unique. Since $p_A$ is monic of degree $n$, the Euclidean division theorem in $\mathbb{R}[z]$ says that for every polynomial $f \in \mathbb{R}[z]$, there exist unique polynomials $q_f,r_f \in \mathbb{R}[z]$ with $\deg r_f < n$ such that
\begin{align*}
f = q_f p_A + r_f.
\end{align*}
Passing to the quotient gives
\begin{align*}
\pi_A(f) = \pi_A(r_f),
\end{align*}
because $\pi_A(q_f p_A)=0$. Thus every class has a representative of degree less than $n$.
Uniqueness also matters. If two polynomials $r,s \in \mathbb{R}[z]$ both have degree less than $n$ and represent the same class in $Q_A$, then $r-s$ is divisible by $p_A$. But $r-s$ is either zero or of degree less than $n$, whereas every nonzero multiple of the monic degree-$n$ polynomial $p_A$ has degree at least $n$. Therefore $r-s=0$, that is, $r=s$. Existence shows that the classes below span $Q_A$, and uniqueness shows that they are linearly independent. Hence
\begin{align*}
\pi_A(1), \pi_A(z), \dots, \pi_A(z^{n-1})
\end{align*}
is a basis of $Q_A$ as a real vector space.
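As a concrete illustration, which is not needed for the proof, consider the hypothetical case $n=2$ with a matrix chosen here purely as an example:
\begin{align*}
A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix},
\qquad
p_A(z) = \det\begin{pmatrix} z & 1 \\ -1 & z \end{pmatrix} = z^2+1.
\end{align*}
In $Q_A=\mathbb{R}[z]/(z^2+1)$ we have $\xi^2=-1$, so every class can be written uniquely as $c_0+c_1\xi$ with $c_0,c_1 \in \mathbb{R}$. For instance, dividing $f=z^3+2z$ by $p_A$ gives
\begin{align*}
z^3+2z = z\,(z^2+1) + z,
\qquad\text{so}\qquad
\pi_A(z^3+2z) = \xi,
\end{align*}
with quotient $q_f=z$ and remainder $r_f=z$.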
[/guided]
[/step]
[step:Define the scalar coefficient functions from the quotient exponential]
For each $t \in \mathbb{R}$, define the quotient-algebra exponential
\begin{align*}
\operatorname{Exp}_{Q_A}(t\xi) := \sum_{m=0}^{\infty} \frac{t^m \xi^m}{m!} \in Q_A.
\end{align*}
Equip the finite-dimensional real vector space $Q_A$ with the norm $\|\cdot\|_{Q_A}$ obtained from the basis $\pi_A(1),\pi_A(z),\dots,\pi_A(z^{n-1})$ by identifying $Q_A$ with $\mathbb{R}^n$ and using the Euclidean norm. Let $M_\xi: Q_A \to Q_A$ be the [linear map](/page/Linear%20Map) $u \mapsto \xi u$. Since $M_\xi$ is a linear map on a finite-dimensional normed space, its operator norm $\|M_\xi\|_{\mathrm{op}}$ is finite. Writing $1=\pi_A(1)$ for the unit of $Q_A$, we have $\xi^m = M_\xi^m(1)$, and therefore
\begin{align*}
\|\xi^m\|_{Q_A} \leq \|M_\xi\|_{\mathrm{op}}^m \|1\|_{Q_A}
\end{align*}
for every integer $m \geq 0$, so the terms of the series defining $\operatorname{Exp}_{Q_A}(t\xi)$ are dominated in norm by the terms of the convergent real exponential series
\begin{align*}
\|1\|_{Q_A}\sum_{m=0}^{\infty}\frac{(|t|\|M_\xi\|_{\mathrm{op}})^m}{m!}.
\end{align*}
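Since the finite-dimensional normed space $Q_A$ is complete, the series converges absolutely and hence converges. Thus the quotient exponential is well-defined in $Q_A$.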
Since $\pi_A(1),\pi_A(z),\dots,\pi_A(z^{n-1})$ is a basis of $Q_A$, for each $t \in \mathbb{R}$ there exist unique [real numbers](/page/Real%20Numbers) $\alpha_0(t),\dots,\alpha_{n-1}(t)$ such that
\begin{align*}
\operatorname{Exp}_{Q_A}(t\xi)
=
\sum_{k=0}^{n-1} \alpha_k(t)\xi^k.
\end{align*}
This defines functions
\begin{align*}
\alpha_k: \mathbb{R} \to \mathbb{R}
\end{align*}
for $k=0,\dots,n-1$.
[guided]
Now that every element of $Q_A$ has unique coordinates in the basis $\pi_A(1),\pi_A(z),\dots,\pi_A(z^{n-1})$, we define the coefficients by taking the exponential inside this finite-dimensional algebra. For $t \in \mathbb{R}$, set
\begin{align*}
\operatorname{Exp}_{Q_A}(t\xi) := \sum_{m=0}^{\infty} \frac{t^m \xi^m}{m!} \in Q_A.
\end{align*}
We must justify that this infinite series converges in $Q_A$, because $Q_A$ is not merely a formal quotient at this point; it is being used as a finite-dimensional [normed vector space](/page/Normed%20Vector%20Space).
Define $\|\cdot\|_{Q_A}$ by identifying $Q_A$ with $\mathbb{R}^n$ through the ordered basis $\pi_A(1),\pi_A(z),\dots,\pi_A(z^{n-1})$ and using the Euclidean norm on $\mathbb{R}^n$. Let $M_\xi: Q_A \to Q_A$ be the multiplication map $u \mapsto \xi u$. This is a linear map on the finite-dimensional normed real vector space $Q_A$, so its operator norm $\|M_\xi\|_{\mathrm{op}}$ is finite. Since $\xi^m = M_\xi^m(1)$, repeated use of the operator norm gives
\begin{align*}
\|\xi^m\|_{Q_A} \leq \|M_\xi\|_{\mathrm{op}}^m \|1\|_{Q_A}
\end{align*}
for every integer $m \geq 0$. Therefore
\begin{align*}
\left\|\frac{t^m\xi^m}{m!}\right\|_{Q_A} \leq \|1\|_{Q_A}\frac{(|t|\|M_\xi\|_{\mathrm{op}})^m}{m!}.
\end{align*}
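The scalar majorant series is the ordinary exponential series evaluated at $|t|\|M_\xi\|_{\mathrm{op}}$, scaled by $\|1\|_{Q_A}$, hence it converges. Because the finite-dimensional normed space $Q_A$ is complete, every absolutely convergent series in $Q_A$ converges, so the series defining $\operatorname{Exp}_{Q_A}(t\xi)$ converges.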
The reason for introducing this exponential is that it has coordinates in the reduced-power basis. Since $\pi_A(1),\pi_A(z),\dots,\pi_A(z^{n-1})$ is a basis of $Q_A$, each element of $Q_A$ has a unique coordinate expansion. Applying this to $\operatorname{Exp}_{Q_A}(t\xi)$ gives unique real numbers $\alpha_0(t),\dots,\alpha_{n-1}(t)$ such that
\begin{align*}
\operatorname{Exp}_{Q_A}(t\xi)
=
\sum_{k=0}^{n-1} \alpha_k(t)\xi^k.
\end{align*}
Thus, for each $k=0,\dots,n-1$, we have defined a scalar function
\begin{align*}
\alpha_k: \mathbb{R} \to \mathbb{R}.
\end{align*}
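To see what these functions can look like, assume for illustration that $n=2$ and $p_A(z)=z^2+1$; this is the characteristic polynomial of, for example, $\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$, a choice made here only as an example. Then $\xi^2=-1$, so $\xi^{2j}=(-1)^j$ and $\xi^{2j+1}=(-1)^j\xi$ for every integer $j \geq 0$. Because the series converges absolutely, the even and odd terms may be grouped:
\begin{align*}
\operatorname{Exp}_{Q_A}(t\xi)
=
\sum_{j=0}^{\infty}\frac{(-1)^j t^{2j}}{(2j)!}
+
\left(\sum_{j=0}^{\infty}\frac{(-1)^j t^{2j+1}}{(2j+1)!}\right)\xi
=
\cos t + (\sin t)\,\xi.
\end{align*}
In this example the coefficient functions are therefore $\alpha_0(t)=\cos t$ and $\alpha_1(t)=\sin t$.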
[/guided]
[/step]
[step:Descend polynomial evaluation at $A$ to the quotient]
Define the polynomial evaluation map
\begin{align*}
\operatorname{ev}_A: \mathbb{R}[z] \to \mathbb{R}^{n \times n}
\end{align*}
by
\begin{align*}
\operatorname{ev}_A\left(\sum_{j=0}^{m} c_j z^j\right)
=
\sum_{j=0}^{m} c_j A^j.
\end{align*}
By the [Cayley-Hamilton Theorem](/theorems/865), applied to the matrix $A \in \mathbb{R}^{n \times n}$ with characteristic polynomial $p_A$, we have $p_A(A)=0$. Hence, if $f-g \in (p_A)$, then $f-g = h p_A$ for some $h \in \mathbb{R}[z]$, and multiplicativity of polynomial evaluation gives
\begin{align*}
\operatorname{ev}_A(f)-\operatorname{ev}_A(g)
=
\operatorname{ev}_A(h)\operatorname{ev}_A(p_A)
=
\operatorname{ev}_A(h)p_A(A)
=
0.
\end{align*}
Therefore evaluation at $A$ is constant on congruence classes modulo $(p_A)$, so it induces a well-defined real algebra homomorphism
\begin{align*}
\widetilde{\operatorname{ev}}_A: Q_A \to \mathbb{R}^{n \times n}
\end{align*}
given by
\begin{align*}
\widetilde{\operatorname{ev}}_A(\pi_A(f)) = f(A)
\end{align*}
for every $f \in \mathbb{R}[z]$. In particular,
\begin{align*}
\widetilde{\operatorname{ev}}_A(\xi^m)=A^m
\end{align*}
for every integer $m \geq 0$.
[guided]
We now explain why substituting $A$ is compatible with the quotient. Define the polynomial evaluation map
\begin{align*}
\operatorname{ev}_A: \mathbb{R}[z] \to \mathbb{R}^{n \times n}
\end{align*}
by
\begin{align*}
\operatorname{ev}_A\left(\sum_{j=0}^{m} c_j z^j\right)
=
\sum_{j=0}^{m} c_j A^j.
\end{align*}
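This is a real algebra homomorphism: it is linear, it sends the constant polynomial $1$ to $I_n$, and it respects products because powers of the single matrix $A$ commute with one another, so that $z^j z^k = z^{j+k}$ is sent to $A^j A^k = A^{j+k}$.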
To descend this map to $Q_A=\mathbb{R}[z]/(p_A)$, we must prove that two representatives of the same quotient class have the same value at $A$. Suppose $f,g \in \mathbb{R}[z]$ represent the same class. Then $f-g \in (p_A)$, so there exists $h \in \mathbb{R}[z]$ such that
\begin{align*}
f-g = h p_A.
\end{align*}
The [Cayley-Hamilton Theorem](/theorems/865), applied to the matrix $A$ and its characteristic polynomial $p_A$, gives
\begin{align*}
p_A(A)=0.
\end{align*}
Using multiplicativity of polynomial evaluation, we get
\begin{align*}
\operatorname{ev}_A(f)-\operatorname{ev}_A(g)
=
\operatorname{ev}_A(h)\operatorname{ev}_A(p_A)
=
\operatorname{ev}_A(h)p_A(A)
=
0.
\end{align*}
Therefore evaluation at $A$ depends only on the quotient class. Hence there is a well-defined real algebra homomorphism
\begin{align*}
\widetilde{\operatorname{ev}}_A: Q_A \to \mathbb{R}^{n \times n}
\end{align*}
satisfying
\begin{align*}
\widetilde{\operatorname{ev}}_A(\pi_A(f)) = f(A)
\end{align*}
for every $f \in \mathbb{R}[z]$. Since $\xi=\pi_A(z)$, this gives
\begin{align*}
\widetilde{\operatorname{ev}}_A(\xi^m)=A^m
\end{align*}
for every integer $m \geq 0$.
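As a sanity check in a hypothetical example, not part of the proof, take
\begin{align*}
A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix},
\qquad
p_A(z)=z^2+1.
\end{align*}
Direct multiplication gives $A^2=-I_2$, which is the statement $p_A(A)=0$ in this case. The polynomials $f=z^3+2z$ and $g=z$ satisfy $f-g=z\,(z^2+1)\in(p_A)$, and indeed
\begin{align*}
f(A) = A^3+2A = -A+2A = A = g(A),
\end{align*}
so both representatives of the class $\xi$ evaluate to the same matrix.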
[/guided]
[/step]
[step:Evaluate the quotient exponential to obtain the matrix exponential]
Fix $t \in \mathbb{R}$, and fix any norm on $\mathbb{R}^{n \times n}$. Since $\widetilde{\operatorname{ev}}_A: Q_A \to \mathbb{R}^{n \times n}$ is a linear map between finite-dimensional normed real vector spaces, it is continuous and may be applied term-by-term to convergent series. Therefore
\begin{align*}
\widetilde{\operatorname{ev}}_A\left(\operatorname{Exp}_{Q_A}(t\xi)\right)
=
\widetilde{\operatorname{ev}}_A\left(\sum_{m=0}^{\infty} \frac{t^m\xi^m}{m!}\right)
=
\sum_{m=0}^{\infty} \frac{t^m\widetilde{\operatorname{ev}}_A(\xi^m)}{m!}.
\end{align*}
Using $\widetilde{\operatorname{ev}}_A(\xi^m)=A^m$, this becomes
\begin{align*}
\widetilde{\operatorname{ev}}_A\left(\operatorname{Exp}_{Q_A}(t\xi)\right)
=
\sum_{m=0}^{\infty} \frac{t^m A^m}{m!}
=
e^{tA},
\end{align*}
where the final equality is the definition of the matrix exponential.
On the other hand, applying $\widetilde{\operatorname{ev}}_A$ to the coordinate expansion of $\operatorname{Exp}_{Q_A}(t\xi)$ gives
\begin{align*}
\widetilde{\operatorname{ev}}_A\left(\operatorname{Exp}_{Q_A}(t\xi)\right)
=
\widetilde{\operatorname{ev}}_A\left(\sum_{k=0}^{n-1}\alpha_k(t)\xi^k\right)
=
\sum_{k=0}^{n-1}\alpha_k(t)A^k.
\end{align*}
Combining the two displayed identities yields
\begin{align*}
e^{tA}
=
\sum_{k=0}^{n-1}\alpha_k(t)A^k.
\end{align*}
Since $t \in \mathbb{R}$ was arbitrary, the identity holds for every $t \in \mathbb{R}$.
[guided]
It remains to connect the quotient construction back to the matrix exponential. Fix $t \in \mathbb{R}$. The map
\begin{align*}
\widetilde{\operatorname{ev}}_A: Q_A \to \mathbb{R}^{n \times n}
\end{align*}
is linear between finite-dimensional normed real vector spaces, hence continuous. Therefore it may be applied term-by-term to the convergent series defining $\operatorname{Exp}_{Q_A}(t\xi)$. We obtain
\begin{align*}
\widetilde{\operatorname{ev}}_A\left(\operatorname{Exp}_{Q_A}(t\xi)\right)
=
\widetilde{\operatorname{ev}}_A\left(\sum_{m=0}^{\infty} \frac{t^m\xi^m}{m!}\right)
=
\sum_{m=0}^{\infty} \frac{t^m\widetilde{\operatorname{ev}}_A(\xi^m)}{m!}.
\end{align*}
The previous step proved that $\widetilde{\operatorname{ev}}_A(\xi^m)=A^m$ for every integer $m \geq 0$, so
\begin{align*}
\widetilde{\operatorname{ev}}_A\left(\operatorname{Exp}_{Q_A}(t\xi)\right)
=
\sum_{m=0}^{\infty} \frac{t^m A^m}{m!}
=
e^{tA}.
\end{align*}
The last equality is the definition of the matrix exponential.
We also evaluate the coordinate expansion that defined the coefficient functions. Since
\begin{align*}
\operatorname{Exp}_{Q_A}(t\xi)
=
\sum_{k=0}^{n-1}\alpha_k(t)\xi^k,
\end{align*}
linearity of $\widetilde{\operatorname{ev}}_A$ gives
\begin{align*}
\widetilde{\operatorname{ev}}_A\left(\operatorname{Exp}_{Q_A}(t\xi)\right)
=
\sum_{k=0}^{n-1}\alpha_k(t)\widetilde{\operatorname{ev}}_A(\xi^k)
=
\sum_{k=0}^{n-1}\alpha_k(t)A^k.
\end{align*}
Both displayed formulas compute the same matrix $\widetilde{\operatorname{ev}}_A(\operatorname{Exp}_{Q_A}(t\xi))$. Therefore
\begin{align*}
e^{tA}
=
\sum_{k=0}^{n-1}\alpha_k(t)A^k.
\end{align*}
Since $t \in \mathbb{R}$ was arbitrary, the identity holds for all real $t$.
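For illustration, consider the hypothetical example
\begin{align*}
A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix},
\qquad
p_A(z)=z^2+1.
\end{align*}
Here $\xi^2=-1$, so grouping even and odd powers in the absolutely convergent series gives $\operatorname{Exp}_{Q_A}(t\xi)=\cos t+(\sin t)\,\xi$, that is, $\alpha_0(t)=\cos t$ and $\alpha_1(t)=\sin t$. The identity just proved then reads
\begin{align*}
e^{tA}
=
(\cos t)\,I_2 + (\sin t)\,A
=
\begin{pmatrix} \cos t & -\sin t \\ \sin t & \cos t \end{pmatrix},
\end{align*}
which is the rotation matrix through the angle $t$.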
[/guided]
[/step]