[proofplan]
We compute along the affine path $A_t=A_0+ta$. For each $t\in[0,1]$, let $d_t$ denote the covariant [exterior derivative](/theorems/1525) induced by $A_t$; differentiating the curvature gives $\dot F_t=d_ta$, and hence $\frac{d}{dt}P(F_t)=kP(d_ta,F_t,\dots,F_t)$ by the multilinearity and symmetry of $P$. The $\operatorname{Ad}$-invariance of $P$ allows $d$ to be replaced by $d_t$ in every slot when differentiating $P(a,F_t,\dots,F_t)$, and the resulting terms containing $d_tF_t$ vanish by the Bianchi identity, so $dP(a,F_t,\dots,F_t)=P(d_ta,F_t,\dots,F_t)$. Thus
\begin{align*}
\frac{d}{dt}P(F_t)=d\bigl(kP(a,F_t,\dots,F_t)\bigr)
\end{align*}
and integration over $t\in[0,1]$ gives the transgression formula.
[/proofplan]
[step:Differentiate the curvature along the affine path]
For each $t\in[0,1]$, let
\begin{align*}
d_t:\Omega^q(M;\operatorname{ad}E)\to\Omega^{q+1}(M;\operatorname{ad}E)
\end{align*}
denote the covariant exterior derivative induced by the connection $A_t$. We claim that
\begin{align*}
\frac{dF_t}{dt}=d_ta.
\end{align*}
This is a local identity, so fix an [open set](/page/Open%20Set) $U\subset M$ over which $E$ is trivialized by a smooth local section
\begin{align*}
s:U\to E.
\end{align*}
Let
\begin{align*}
\omega_t:=s^*A_t\in\Omega^1(U;\mathfrak g), \qquad \alpha:=s^*a\in\Omega^1(U;\mathfrak g), \qquad \Omega_t:=s^*F_t\in\Omega^2(U;\mathfrak g).
\end{align*}
Then $\omega_t=\omega_0+t\alpha$ and
\begin{align*}
\Omega_t=d\omega_t+\frac{1}{2}[\omega_t\wedge\omega_t].
\end{align*}
Differentiating this expression with respect to $t$ gives
\begin{align*}
\frac{d\Omega_t}{dt}=d\alpha+[\omega_t\wedge\alpha].
\end{align*}
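By the local formula $d_t\beta=d\beta+[\omega_t\wedge\beta]$ for the covariant exterior derivative induced by $\omega_t$, the right-hand side is the local representative $s^*(d_ta)$ of $d_ta$ on $U$. Since such trivializing open sets cover $M$, the identity $\frac{dF_t}{dt}=d_ta$ holds globally.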
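[guided]
The differentiation can be written out as follows. This is a sketch, assuming the bracket of $\mathfrak g$-valued forms is defined componentwise by $[\beta\wedge\gamma]=\sum_{i,j}\beta^i\wedge\gamma^j[e_i,e_j]$ in a basis $e_1,\dots,e_n$ of $\mathfrak g$ (notation introduced here only for illustration). Under that convention $[\beta\wedge\gamma]=(-1)^{pq+1}[\gamma\wedge\beta]$ for $\beta$ of degree $p$ and $\gamma$ of degree $q$, so for two $1$-forms $[\alpha\wedge\omega_t]=[\omega_t\wedge\alpha]$. Since $\frac{d\omega_t}{dt}=\alpha$ and the bracket is bilinear, the product rule gives
\begin{align*}
\frac{d}{dt}\left(d\omega_t+\frac{1}{2}[\omega_t\wedge\omega_t]\right)
&=d\alpha+\frac{1}{2}\bigl([\alpha\wedge\omega_t]+[\omega_t\wedge\alpha]\bigr)\\
&=d\alpha+[\omega_t\wedge\alpha].
\end{align*}
The same result can be read off by expanding $\Omega_t$ as a polynomial in $t$:
\begin{align*}
\Omega_t=\Omega_0+t\bigl(d\alpha+[\omega_0\wedge\alpha]\bigr)+\frac{t^2}{2}[\alpha\wedge\alpha].
\end{align*}
Differentiating in $t$ gives $d\alpha+[\omega_0\wedge\alpha]+t[\alpha\wedge\alpha]=d\alpha+[\omega_t\wedge\alpha]$, in agreement with the display above.
[/guided]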
[/step]
[step:Convert the variation of $P(F_t)$ into a covariant derivative term]
Since $P$ is symmetric $k$-linear and each $F_t$ has even degree $2$, no Koszul sign is produced by moving one occurrence of $\frac{dF_t}{dt}$ among the curvature slots. Therefore
\begin{align*}
\frac{d}{dt}P(F_t)=kP\left(\frac{dF_t}{dt},F_t,\dots,F_t\right).
\end{align*}
Using the identity from the previous step,
\begin{align*}
\frac{d}{dt}P(F_t)=kP(d_ta,F_t,\dots,F_t),
\end{align*}
where there are $k-1$ copies of $F_t$. For $k=1$, this reads simply
\begin{align*}
\frac{d}{dt}P(F_t)=P(d_ta).
\end{align*}
[guided]
The point of this step is only multilinearity. The expression $P(F_t)$ means $P(F_t,\dots,F_t)$ with $k$ identical curvature entries. Differentiating a $k$-linear expression gives one term for each slot:
\begin{align*}
\frac{d}{dt}P(F_t)=\sum_{j=1}^k P\Bigl(F_t,\dots,F_t,\underbrace{\frac{dF_t}{dt}}_{j\text{th slot}},F_t,\dots,F_t\Bigr).
\end{align*}
Because $F_t$ is a $2$-form, it has even degree. Moving the $2$-form $\frac{dF_t}{dt}$ past $m$ copies of $F_t$ therefore produces the Koszul sign $(-1)^{2\cdot 2m}=1$ for every $m\ge 0$. Since $P$ is symmetric in its $\mathfrak g$-arguments, all $k$ summands are equal. Thus
\begin{align*}
\frac{d}{dt}P(F_t)=kP\left(\frac{dF_t}{dt},F_t,\dots,F_t\right).
\end{align*}
The curvature-variation identity proved in the preceding step states that $\frac{dF_t}{dt}=d_ta$, where $d_t:\Omega^q(M;\operatorname{ad}E)\to\Omega^{q+1}(M;\operatorname{ad}E)$ is the covariant exterior derivative induced by $A_t$. Substitution gives
\begin{align*}
\frac{d}{dt}P(F_t)=kP(d_ta,F_t,\dots,F_t).
\end{align*}
When $k=1$, there are no remaining curvature slots, and this formula is exactly $\frac{d}{dt}P(F_t)=P(d_ta)$.
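For a concrete instance, take $k=2$ (an illustrative special case, with $P(X,Y)$ a symmetric bilinear invariant form). The sum then has two terms:
\begin{align*}
\frac{d}{dt}P(F_t,F_t)
&=P\left(\frac{dF_t}{dt},F_t\right)+P\left(F_t,\frac{dF_t}{dt}\right)\\
&=2P\left(\frac{dF_t}{dt},F_t\right)=2P(d_ta,F_t).
\end{align*}
If one assumes, purely for illustration, that $G$ is a matrix group and $P(X,Y)=\operatorname{tr}(XY)$, this reads $\frac{d}{dt}\operatorname{tr}(F_t\wedge F_t)=2\operatorname{tr}(d_ta\wedge F_t)$.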
[/guided]
[/step]
[step:Use invariance and Bianchi to identify the exterior derivative of the transgression integrand]
We next prove that, for each $t\in[0,1]$,
\begin{align*}
dP(a,F_t,\dots,F_t)=P(d_ta,F_t,\dots,F_t).
\end{align*}
For $k=1$, there are no curvature factors, and the identity reads $dP(a)=P(d_ta)$.
Again work on a trivializing open set $U\subset M$ with local connection form $\omega_t\in\Omega^1(U;\mathfrak g)$, local curvature form $\Omega_t\in\Omega^2(U;\mathfrak g)$, and local representative $\alpha\in\Omega^1(U;\mathfrak g)$ of $a$. The covariant derivative is locally
\begin{align*}
d_t\beta=d\beta+[\omega_t\wedge\beta]
\end{align*}
for every local $\mathfrak g$-valued form $\beta$. The Bianchi identity for the curvature of the principal connection $A_t$ follows locally from the curvature formula:
\begin{align*}
d_t\Omega_t=d\Omega_t+[\omega_t\wedge\Omega_t]=0.
\end{align*}
The graded Leibniz rule for $d$ applied to $P(\alpha,\Omega_t,\dots,\Omega_t)$ gives
\begin{align*}
dP(\alpha,\Omega_t,\dots,\Omega_t)=P(d\alpha,\Omega_t,\dots,\Omega_t)-\sum_{j=2}^{k}P(\alpha,\Omega_t,\dots,d\Omega_t,\dots,\Omega_t),
\end{align*}
because $\alpha$ has degree $1$ and every preceding curvature factor has even degree. Replacing ordinary exterior derivatives by covariant exterior derivatives adds the connection-bracket terms
\begin{align*}
P([\omega_t\wedge\alpha],\Omega_t,\dots,\Omega_t)-\sum_{j=2}^{k}P(\alpha,\Omega_t,\dots,[\omega_t\wedge\Omega_t],\dots,\Omega_t).
\end{align*}
Expand $\omega_t$, $\alpha$, and $\Omega_t$ in a basis of $\mathfrak g$, so that each is a sum of scalar-valued differential forms multiplied by basis vectors. The infinitesimal $\operatorname{Ad}$-invariance identity for the symmetric invariant form $P$ says that, for every $X,Y_1,\dots,Y_k\in\mathfrak g$,
\begin{align*}
\sum_{j=1}^{k}P(Y_1,\dots,[X,Y_j],\dots,Y_k)=0.
\end{align*}
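This identity follows by differentiating $P(\operatorname{Ad}_{\exp(sX)}Y_1,\dots,\operatorname{Ad}_{\exp(sX)}Y_k)=P(Y_1,\dots,Y_k)$ at $s=0$. Apply it componentwise, with $X$ running over the basis vectors in the expansion of $\omega_t$. Moving the $1$-form component of $\omega_t$ to the front past $\alpha$, which has odd degree, produces the Koszul sign $-1$, while moving it past curvature factors, which have even degree, produces no sign; this accounts for the minus sign in front of the sum. The displayed connection-bracket contribution therefore vanishes as a differential form. Hence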
\begin{align*}
dP(\alpha,\Omega_t,\dots,\Omega_t)=P(d_t\alpha,\Omega_t,\dots,\Omega_t)-\sum_{j=2}^{k}P(\alpha,\Omega_t,\dots,d_t\Omega_t,\dots,\Omega_t).
\end{align*}
Since $d_t\Omega_t=0$, all terms in the sum vanish, and therefore
\begin{align*}
dP(\alpha,\Omega_t,\dots,\Omega_t)=P(d_t\alpha,\Omega_t,\dots,\Omega_t).
\end{align*}
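Both sides of this identity are restrictions to $U$ of globally defined differential forms on $M$: because $P$ is $\operatorname{Ad}$-invariant, $P(a,F_t,\dots,F_t)$ and $P(d_ta,F_t,\dots,F_t)$ do not depend on the choice of local section. Since trivializing open sets cover $M$, the desired global identity holds.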
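[guided]
The two inputs of this step can be checked by hand. What follows is a sketch under the assumption that forms are expanded in a basis $e_1,\dots,e_n$ of $\mathfrak g$ (notation introduced here only for illustration), with $[\beta\wedge\gamma]=\sum_{i,j}\beta^i\wedge\gamma^j[e_i,e_j]$ and $P(\beta_1,\dots,\beta_k)=\sum\beta_1^{i_1}\wedge\dots\wedge\beta_k^{i_k}\,P(e_{i_1},\dots,e_{i_k})$.

For the Bianchi identity, the graded symmetry $[\omega_t\wedge d\omega_t]=-[d\omega_t\wedge\omega_t]$ gives $d[\omega_t\wedge\omega_t]=[d\omega_t\wedge\omega_t]-[\omega_t\wedge d\omega_t]=-2[\omega_t\wedge d\omega_t]$, and the Jacobi identity gives $[\omega_t\wedge[\omega_t\wedge\omega_t]]=0$. Hence
\begin{align*}
d\Omega_t+[\omega_t\wedge\Omega_t]
&=\frac{1}{2}d[\omega_t\wedge\omega_t]+[\omega_t\wedge d\omega_t]+\frac{1}{2}[\omega_t\wedge[\omega_t\wedge\omega_t]]\\
&=-[\omega_t\wedge d\omega_t]+[\omega_t\wedge d\omega_t]+0=0.
\end{align*}
For the invariance cancellation, take $k=2$ as an illustrative case. Using $\alpha^j\wedge\omega_t^i=-\omega_t^i\wedge\alpha^j$ for the scalar $1$-form components,
\begin{align*}
P([\omega_t\wedge\alpha],\Omega_t)-P(\alpha,[\omega_t\wedge\Omega_t])
&=\sum_{i,j,l}\omega_t^i\wedge\alpha^j\wedge\Omega_t^l\,P([e_i,e_j],e_l)-\sum_{i,j,l}\alpha^j\wedge\omega_t^i\wedge\Omega_t^l\,P(e_j,[e_i,e_l])\\
&=\sum_{i,j,l}\omega_t^i\wedge\alpha^j\wedge\Omega_t^l\,\bigl(P([e_i,e_j],e_l)+P(e_j,[e_i,e_l])\bigr)=0.
\end{align*}
The last equality is the infinitesimal invariance identity with $X=e_i$, $Y_1=e_j$, and $Y_2=e_l$.
[/guided]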
[/step]
[step:Integrate the pointwise identity in the path parameter]
Combining the previous two steps gives, for every $t\in[0,1]$,
\begin{align*}
\frac{d}{dt}P(F_t)=k\,dP(a,F_t,\dots,F_t).
\end{align*}
The map
\begin{align*}
[0,1]\to\Omega^{2k}(M), \qquad t\mapsto P(F_t)
\end{align*}
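is smooth: in every local trivialization $\omega_t=\omega_0+t\alpha$ is affine in $t$, the curvature $\Omega_t$ is a polynomial in the coefficients of $\omega_t$ and their first derivatives, and $P$ is multilinear, so $P(F_t)$ is a polynomial in $t$ with coefficients in $\Omega^{2k}(M)$. Therefore the [fundamental theorem of calculus](/theorems/632) in the parameter $t$ gives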
\begin{align*}
P(F_{A_1})-P(F_{A_0})=\int_{[0,1]}\frac{d}{dt}P(F_t)\,d\mathcal L^1(t).
\end{align*}
Substituting the differential identity yields
\begin{align*}
P(F_{A_1})-P(F_{A_0})=k\int_{[0,1]}dP(a,F_t,\dots,F_t)\,d\mathcal L^1(t).
\end{align*}
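Since $t\mapsto P(a,F_t,\dots,F_t)$ is a smooth family of differential forms on the compact interval $[0,1]$, its coefficient functions in local coordinates are smooth in $(t,x)$, so differentiation under the integral sign shows that exterior differentiation in the $M$-variables commutes with integration in the parameter $t$. Hence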
\begin{align*}
P(F_{A_1})-P(F_{A_0})=d\left(k\int_{[0,1]}P(a,F_t,\dots,F_t)\,d\mathcal L^1(t)\right).
\end{align*}
By the definition of $\operatorname{CS}_P(A_0,A_1)$, this is exactly
\begin{align*}
P(F_{A_1})-P(F_{A_0})=d\operatorname{CS}_P(A_0,A_1).
\end{align*}
This proves the claimed Chern-Simons transgression formula.
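[guided]
As an illustration, consider $k=2$ under assumptions that are not part of the statement: $G$ is a matrix group, $P(X,Y)=\operatorname{tr}(XY)$, and we work in a trivialization in which $A_0$ is the product connection, so that $\omega_0=0$ and $\omega_1=\alpha$. Take $\operatorname{CS}_P(A_0,A_1)=k\int_{[0,1]}P(a,F_t,\dots,F_t)\,d\mathcal L^1(t)$, as the last two displays indicate. For matrix-valued $1$-forms $\frac{1}{2}[\alpha\wedge\alpha]=\alpha\wedge\alpha$, so $\Omega_t=t\,d\alpha+t^2\,\alpha\wedge\alpha$ and
\begin{align*}
\operatorname{CS}_P(A_0,A_1)
&=2\int_{[0,1]}\operatorname{tr}\bigl(\alpha\wedge(t\,d\alpha+t^2\,\alpha\wedge\alpha)\bigr)\,d\mathcal L^1(t)\\
&=\operatorname{tr}\left(\alpha\wedge d\alpha+\frac{2}{3}\,\alpha\wedge\alpha\wedge\alpha\right).
\end{align*}
A direct check is possible in this case. Graded cyclicity of the trace gives $d\operatorname{tr}(\alpha\wedge\alpha\wedge\alpha)=3\operatorname{tr}(d\alpha\wedge\alpha\wedge\alpha)$ and $\operatorname{tr}(\alpha\wedge\alpha\wedge\alpha\wedge\alpha)=0$, so
\begin{align*}
d\operatorname{tr}\left(\alpha\wedge d\alpha+\frac{2}{3}\,\alpha\wedge\alpha\wedge\alpha\right)
=\operatorname{tr}(d\alpha\wedge d\alpha)+2\operatorname{tr}(d\alpha\wedge\alpha\wedge\alpha)
=\operatorname{tr}(\Omega_1\wedge\Omega_1),
\end{align*}
which agrees with $P(F_{A_1})-P(F_{A_0})$ because $\Omega_0=0$.
[/guided]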
[/step]