[proofplan]
Set $B:=g^{-1}Ag$, so that $A^g=B+\theta$ with $\theta=g^{-1}dg$. The proof is a direct expansion of the Chern-Simons form of $B+\theta$, using the Maurer-Cartan identity $d\theta=-\theta\wedge\theta$ and the differential identity $dB=g^{-1}(dA)g-\theta\wedge B-B\wedge\theta$. After expanding, cyclicity of the matrix trace cancels the mixed terms that are quadratic in $B$ and identifies the conjugation-invariant copy of $\operatorname{CS}(A)$, leaving one mixed expression that is linear in $B$ and a pure Maurer-Cartan term. The mixed expression is exactly $-d\operatorname{tr}(\theta\wedge B)$, while the pure Maurer-Cartan term contributes $-\frac{1}{3}\operatorname{tr}(\theta\wedge\theta\wedge\theta)$.
[/proofplan]
[step:Record the graded trace rule for matrix-valued forms]
Let $\operatorname{Mat}_n(\mathbb C)$ denote the algebra of complex $n\times n$ matrices. If $\alpha\in\Omega^p(M;\operatorname{Mat}_n(\mathbb C))$ and $\beta\in\Omega^q(M;\operatorname{Mat}_n(\mathbb C))$ are homogeneous matrix-valued forms, then
\begin{align*}
\operatorname{tr}(\alpha\wedge\beta)=(-1)^{pq}\operatorname{tr}(\beta\wedge\alpha).
\end{align*}
Indeed, writing $\alpha=(\alpha_{ij})$ and $\beta=(\beta_{ij})$, the left-hand side is
\begin{align*}
\operatorname{tr}(\alpha\wedge\beta)=\sum_{i=1}^n\sum_{j=1}^n \alpha_{ij}\wedge\beta_{ji}.
\end{align*}
Since scalar differential forms satisfy $\alpha_{ij}\wedge\beta_{ji}=(-1)^{pq}\beta_{ji}\wedge\alpha_{ij}$, relabelling the summation indices gives the claimed identity. More generally, cyclically moving a homogeneous form of degree $p$ past a product of total degree $q$ introduces the sign $(-1)^{pq}$.
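As a sketch of how this rule is used later in the proof, consider the two special cases that occur below. The symbols $\alpha,\beta,\gamma$ here stand for arbitrary matrix-valued one-forms and are introduced only for illustration. For two one-forms we have $p=q=1$, and for three one-forms a single factor of degree $1$ moves past a product of degree $2$:
\begin{align*}
\operatorname{tr}(\alpha\wedge\beta)&=-\operatorname{tr}(\beta\wedge\alpha),\\
\operatorname{tr}(\alpha\wedge\beta\wedge\gamma)&=(-1)^{1\cdot 2}\operatorname{tr}(\beta\wedge\gamma\wedge\alpha)=\operatorname{tr}(\beta\wedge\gamma\wedge\alpha).
\end{align*}
So the trace of a product of three one-forms is invariant under cyclic permutations without any sign, while swapping two one-forms costs a sign.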
[/step]
[step:Derive the Maurer-Cartan and conjugation differential identities]Define
\begin{align*}
B:=g^{-1}Ag\in\Omega^1(M;\mathfrak g)
\end{align*}
and
\begin{align*}
D:=g^{-1}(dA)g\in\Omega^2(M;\mathfrak{gl}(n,\mathbb C)).
\end{align*}
From $g^{-1}g=I_n$, where $I_n\in GL(n,\mathbb C)$ denotes the $n\times n$ identity matrix, applying the [exterior derivative](/theorems/1525) entrywise and using the Leibniz rule for matrix-valued forms gives
\begin{align*}d(g^{-1})\,g+g^{-1}dg=0.\end{align*}
Multiplying on the right by $g^{-1}$ gives
\begin{align*}d(g^{-1})=-g^{-1}dg\,g^{-1}=-\theta g^{-1}.\end{align*}
Since $d(dg)=0$, the graded Leibniz rule therefore gives
\begin{align*}d\theta=d(g^{-1}dg)=d(g^{-1})\wedge dg=-g^{-1}dg\,g^{-1}\wedge dg=-\theta\wedge\theta.\end{align*}
Using the graded Leibniz rule on $B=g^{-1}Ag$, with $g$ and $g^{-1}$ of degree $0$ and $A$ of degree $1$, gives
\begin{align*}dB=d(g^{-1})\wedge A g+g^{-1}(dA)g-g^{-1}A\wedge dg.\end{align*}
Substituting $d(g^{-1})=-\theta g^{-1}$ and $dg=g\theta$ yields
\begin{align*}dB=D-\theta\wedge B-B\wedge\theta.\end{align*}[/step]
[guided]The two identities needed for the expansion are the Maurer-Cartan equation for $\theta$ and the differential of the conjugated one-form $B$. We define
\begin{align*}
B:=g^{-1}Ag\in\Omega^1(M;\mathfrak g)
\end{align*}
and
\begin{align*}
D:=g^{-1}(dA)g\in\Omega^2(M;\mathfrak{gl}(n,\mathbb C)).
\end{align*}
The notation $D$ is temporary: it abbreviates the conjugate $g^{-1}(dA)g$ of the two-form $dA$ so that the expansion stays readable.
First compute $d(g^{-1})$. Since $g^{-1}g=I_n$ as a matrix-valued function on $M$, where $I_n\in GL(n,\mathbb C)$ is the $n\times n$ identity matrix, differentiating entrywise gives
\begin{align*}
d(g^{-1})\,g+g^{-1}dg=0.
\end{align*}
Right multiplication by $g^{-1}$ gives
\begin{align*}
d(g^{-1})=-g^{-1}dg\,g^{-1}.
\end{align*}
Because $\theta=g^{-1}dg$, this becomes
\begin{align*}
d(g^{-1})=-\theta g^{-1}.
\end{align*}
Now apply the exterior derivative to $\theta=g^{-1}dg$. The term $d(dg)$ vanishes because $d^2=0$, so
\begin{align*}
d\theta=d(g^{-1})\wedge dg.
\end{align*}
Substituting $d(g^{-1})=-g^{-1}dg\,g^{-1}$ and grouping the degree-zero matrix factors with their neighbours gives
\begin{align*}
d\theta=-g^{-1}dg\,g^{-1}\wedge dg=-\theta\wedge\theta.
\end{align*}
This is the Maurer-Cartan equation in the present matrix notation.
It remains to differentiate $B=g^{-1}Ag$. The graded Leibniz rule gives a minus sign when $d$ passes the one-form $A$:
\begin{align*}
dB=d(g^{-1})\wedge A g+g^{-1}(dA)g-g^{-1}A\wedge dg.
\end{align*}
The first term is
\begin{align*}
d(g^{-1})\wedge A g=-\theta g^{-1}\wedge A g=-\theta\wedge B.
\end{align*}
The middle term is exactly $D$. Since $dg=g\theta$, the last term is
\begin{align*}
-g^{-1}A\wedge dg=-g^{-1}Ag\wedge\theta=-B\wedge\theta.
\end{align*}
Therefore
\begin{align*}
dB=D-\theta\wedge B-B\wedge\theta.
\end{align*}
The signs in this identity are the sensitive point of the proof: both mixed terms have minus signs.[/guided]
[step:Expand the Chern-Simons form of $B+\theta$]
Since $A^g=B+\theta$, we have
\begin{align*}\operatorname{CS}(A^g)=\operatorname{tr}\left((B+\theta)\wedge d(B+\theta)+\frac{2}{3}(B+\theta)\wedge(B+\theta)\wedge(B+\theta)\right).\end{align*}
Using $dB=D-\theta\wedge B-B\wedge\theta$ and $d\theta=-\theta\wedge\theta$, expansion gives
\begin{align*}\operatorname{CS}(A^g)=\operatorname{tr}(B\wedge D)+\frac{2}{3}\operatorname{tr}(B\wedge B\wedge B)+\operatorname{tr}(\theta\wedge D)-\operatorname{tr}(B\wedge\theta\wedge\theta)-\frac{1}{3}\operatorname{tr}(\theta\wedge\theta\wedge\theta).\end{align*}
Here the terms with two copies of $B$ and one copy of $\theta$ cancel because
\begin{align*}-\operatorname{tr}(B\wedge\theta\wedge B)-\operatorname{tr}(B\wedge B\wedge\theta)+\frac{2}{3}\operatorname{tr}(B\wedge B\wedge\theta)+\frac{2}{3}\operatorname{tr}(B\wedge\theta\wedge B)+\frac{2}{3}\operatorname{tr}(\theta\wedge B\wedge B)=0,\end{align*}
using cyclicity for three one-forms. The terms with one copy of $B$ and two copies of $\theta$ combine to $-\operatorname{tr}(B\wedge\theta\wedge\theta)$ for the same reason.
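For completeness, the remaining bookkeeping can be spelled out as follows; this is a sketch of the same expansion, using only the identities already stated. Collecting first the terms with one copy of $B$ and two copies of $\theta$ (three from $\operatorname{tr}((B+\theta)\wedge d(B+\theta))$ and three from the cubic term), and then the pure $\theta$ terms, gives
\begin{align*}
-\operatorname{tr}(B\wedge\theta\wedge\theta)-\operatorname{tr}(\theta\wedge\theta\wedge B)-\operatorname{tr}(\theta\wedge B\wedge\theta)+\frac{2}{3}\cdot 3\operatorname{tr}(B\wedge\theta\wedge\theta)&=-\operatorname{tr}(B\wedge\theta\wedge\theta),\\
-\operatorname{tr}(\theta\wedge\theta\wedge\theta)+\frac{2}{3}\operatorname{tr}(\theta\wedge\theta\wedge\theta)&=-\frac{1}{3}\operatorname{tr}(\theta\wedge\theta\wedge\theta),
\end{align*}
where the three cyclic rearrangements of $B\wedge\theta\wedge\theta$ have equal trace.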
[/step]
[step:Identify the conjugation-invariant copy of $\operatorname{CS}(A)$]
Because $g$ and $g^{-1}$ are degree-zero matrix-valued functions, ordinary cyclicity of the matrix trace gives
\begin{align*}\operatorname{tr}(B\wedge D)=\operatorname{tr}\left(g^{-1}Ag\wedge g^{-1}(dA)g\right)=\operatorname{tr}(A\wedge dA).\end{align*}
Similarly,
\begin{align*}\operatorname{tr}(B\wedge B\wedge B)=\operatorname{tr}\left(g^{-1}Ag\wedge g^{-1}Ag\wedge g^{-1}Ag\right)=\operatorname{tr}(A\wedge A\wedge A).\end{align*}
Thus
\begin{align*}\operatorname{tr}(B\wedge D)+\frac{2}{3}\operatorname{tr}(B\wedge B\wedge B)=\operatorname{CS}(A).\end{align*}
Therefore
\begin{align*}\operatorname{CS}(A^g)-\operatorname{CS}(A)=\operatorname{tr}(\theta\wedge D)-\operatorname{tr}(B\wedge\theta\wedge\theta)-\frac{1}{3}\operatorname{tr}(\theta\wedge\theta\wedge\theta).\end{align*}
[/step]
[step:Rewrite the mixed terms as an exact form]
Using the graded Leibniz rule for the scalar-valued two-form $\operatorname{tr}(\theta\wedge B)$, we obtain
\begin{align*}
d\operatorname{tr}(\theta\wedge B)=\operatorname{tr}(d\theta\wedge B-\theta\wedge dB).
\end{align*}
Substituting $d\theta=-\theta\wedge\theta$ and $dB=D-\theta\wedge B-B\wedge\theta$ gives
\begin{align*}
d\operatorname{tr}(\theta\wedge B)=-\operatorname{tr}(\theta\wedge D)+\operatorname{tr}(\theta\wedge B\wedge\theta).
\end{align*}
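To make the substitution explicit, the two summands can be expanded separately (a sketch using only the two identities just quoted):
\begin{align*}
\operatorname{tr}(d\theta\wedge B)&=-\operatorname{tr}(\theta\wedge\theta\wedge B),\\
-\operatorname{tr}(\theta\wedge dB)&=-\operatorname{tr}(\theta\wedge D)+\operatorname{tr}(\theta\wedge\theta\wedge B)+\operatorname{tr}(\theta\wedge B\wedge\theta).
\end{align*}
The two copies of $\operatorname{tr}(\theta\wedge\theta\wedge B)$ cancel, which leaves exactly $-\operatorname{tr}(\theta\wedge D)+\operatorname{tr}(\theta\wedge B\wedge\theta)$.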
By cyclicity for three one-forms,
\begin{align*}
\operatorname{tr}(\theta\wedge B\wedge\theta)=\operatorname{tr}(B\wedge\theta\wedge\theta).
\end{align*}
Hence
\begin{align*}
\operatorname{tr}(\theta\wedge D)-\operatorname{tr}(B\wedge\theta\wedge\theta)=-d\operatorname{tr}(\theta\wedge B).
\end{align*}
Since $B=g^{-1}Ag$, the previous step yields
\begin{align*}
\operatorname{CS}(A^g)-\operatorname{CS}(A)=-d\operatorname{tr}\left(\theta\wedge g^{-1}Ag\right)-\frac{1}{3}\operatorname{tr}(\theta\wedge\theta\wedge\theta).
\end{align*}
Finally,
\begin{align*}
\operatorname{tr}(A\wedge dg\,g^{-1})=\operatorname{tr}(g^{-1}Ag\wedge g^{-1}dg)=\operatorname{tr}(B\wedge\theta)=-\operatorname{tr}(\theta\wedge B),
\end{align*}
where the last sign comes from swapping two one-forms. Substituting this into the previous display gives the equivalent form
\begin{align*}
\operatorname{CS}(A^g)-\operatorname{CS}(A)=d\operatorname{tr}(A\wedge dg\,g^{-1})-\frac{1}{3}\operatorname{tr}(\theta\wedge\theta\wedge\theta).
\end{align*}
This proves the stated gauge transformation formula.
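As a sanity check, consider the special case $n=1$; this restriction is an assumption made only for illustration and is not part of the statement. All forms are then scalar-valued, so $B=A$, $\theta\wedge\theta=0$, $d\theta=0$, and every cubic term vanishes. Expanding directly,
\begin{align*}
\operatorname{CS}(A^g)=(A+\theta)\wedge d(A+\theta)=A\wedge dA+\theta\wedge dA,
\end{align*}
while the formula predicts
\begin{align*}
\operatorname{CS}(A)-d(\theta\wedge A)=A\wedge dA-d\theta\wedge A+\theta\wedge dA=A\wedge dA+\theta\wedge dA,
\end{align*}
in agreement with the direct expansion.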
[/step]