[proofplan]
The proof is a pointwise Hodge-theoretic computation followed by integration. The curvature decomposes orthogonally into self-dual and anti-self-dual parts, so the Yang-Mills energy is the sum of the two squared $L^2$ norms. With the chosen trace convention, $\operatorname{tr}(F_A\wedge F_A)$ equals minus the difference of the pointwise squared norms times the volume form, and after integration the stated Chern-Weil normalization identifies the difference of the squared $L^2$ norms with $8\pi^2 k$. The inequality and the equality cases then follow by writing the energy as a topological term plus a nonnegative remainder.
[/proofplan]
[step:Decompose the Yang-Mills energy into self-dual and anti-self-dual norms]
Let $*$ denote the Hodge star operator on two-forms determined by $g$ and the orientation, extended to $\mathfrak{su}(E)$-valued two-forms by acting on the form factor. The self-dual and anti-self-dual parts of $F_A$ are
\begin{align*}
F_A^+:=\frac{1}{2}(F_A+*F_A)
\end{align*}
and
\begin{align*}
F_A^-:=\frac{1}{2}(F_A-*F_A).
\end{align*}
Since $*^2=1$ on two-forms in dimension $4$, it follows that $*F_A^+=F_A^+$ and $*F_A^-=-F_A^-$. The [Hodge decomposition](/theorems/2745) of two-forms in dimension $4$ into self-dual and anti-self-dual parts is orthogonal pointwise, hence
\begin{align*}
|F_A|^2=|F_A^+|^2+|F_A^-|^2.
\end{align*}
Define
\begin{align*}
\|F_A^\pm\|_{L^2}^2:=\int_X |F_A^\pm|^2\,d\operatorname{vol}_g(x).
\end{align*}
Integrating the pointwise [orthogonal decomposition](/theorems/436) gives
\begin{align*}
YM(A)=\|F_A^+\|_{L^2}^2+\|F_A^-\|_{L^2}^2.
\end{align*}
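As a brief check of the pointwise orthogonality used in this step, here is a sketch that assumes the standard facts that $*$ is a pointwise isometry on two-forms and acts only on the form factor of $F_A$:
\begin{align*}
\langle F_A^+,F_A^-\rangle=\langle *F_A^+,*F_A^-\rangle=\langle F_A^+,-F_A^-\rangle=-\langle F_A^+,F_A^-\rangle.
\end{align*}
Hence $\langle F_A^+,F_A^-\rangle=0$, and the cross term in $|F_A^++F_A^-|^2$ vanishes.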
[/step]
[step:Compute the trace wedge form from the Hodge decomposition]
We claim that the following pointwise identity of $4$-forms holds:
\begin{align*}
\operatorname{tr}(F_A\wedge F_A)=-(|F_A^+|^2-|F_A^-|^2)\,d\operatorname{vol}_g.
\end{align*}
Fix $p\in X$. Choose an oriented $g_p$-[orthonormal basis](/page/Orthonormal%20Basis) of $T_p^*X$, and choose an orthonormal basis $(\tau_1,\tau_2,\tau_3)$ of $\mathfrak{su}(E)_p$ with respect to $\langle \xi,\eta\rangle=-\operatorname{tr}(\xi\eta)$. Write
\begin{align*}
F_A^+(p)=\sum_{a=1}^{3}\alpha_a^+\otimes \tau_a
\end{align*}
and
\begin{align*}
F_A^-(p)=\sum_{a=1}^{3}\alpha_a^-\otimes \tau_a,
\end{align*}
where $\alpha_a^+\in \Lambda^2_+T_p^*X$ and $\alpha_a^-\in \Lambda^2_-T_p^*X$ are scalar two-forms.
Since $\operatorname{tr}(\tau_a\tau_b)=-\delta_{ab}$, expansion of the trace gives
\begin{align*}
\operatorname{tr}(F_A(p)\wedge F_A(p))=-\sum_{a=1}^{3}(\alpha_a^++\alpha_a^-)\wedge(\alpha_a^++\alpha_a^-).
\end{align*}
For scalar two-forms $\beta,\gamma\in \Lambda^2T_p^*X$, the wedge-inner-product identity is
\begin{align*}
\beta\wedge\gamma=\langle \beta,*\gamma\rangle\,d\operatorname{vol}_g(p).
\end{align*}
Applying this identity with $*\alpha_a^+=\alpha_a^+$ and $*\alpha_a^-=-\alpha_a^-$ gives
\begin{align*}
\alpha_a^+\wedge\alpha_a^+=|\alpha_a^+|^2\,d\operatorname{vol}_g(p)
\end{align*}
and
\begin{align*}
\alpha_a^-\wedge\alpha_a^-=-|\alpha_a^-|^2\,d\operatorname{vol}_g(p).
\end{align*}
Also,
\begin{align*}
\alpha_a^+\wedge\alpha_a^-=\langle \alpha_a^+,*\alpha_a^-\rangle\,d\operatorname{vol}_g(p)=-\langle \alpha_a^+,\alpha_a^-\rangle\,d\operatorname{vol}_g(p)=0,
\end{align*}
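because $\Lambda^2_+T_p^*X$ and $\Lambda^2_-T_p^*X$ are orthogonal; the term $\alpha_a^-\wedge\alpha_a^+$ vanishes as well, since two-forms commute under the wedge product. Because $(\tau_1,\tau_2,\tau_3)$ is orthonormal, $|F_A^\pm(p)|^2=\sum_{a=1}^{3}|\alpha_a^\pm|^2$, so summing over $a$ yields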
\begin{align*}
\operatorname{tr}(F_A(p)\wedge F_A(p))=-(|F_A^+(p)|^2-|F_A^-(p)|^2)\,d\operatorname{vol}_g(p).
\end{align*}
Since $p$ was arbitrary, the claimed identity holds on $X$.
[guided]
The sign in this theorem is controlled by two conventions: the [inner product](/page/Inner%20Product) on the fibres $\mathfrak{su}(E)_p\cong\mathfrak{su}(2)$ is $\langle \xi,\eta\rangle=-\operatorname{tr}(\xi\eta)$, and the Hodge star satisfies $*\omega^+=\omega^+$ and $*\omega^-=-\omega^-$. We verify the resulting $4$-form identity pointwise.
Fix $p\in X$. Choose an oriented orthonormal basis of $T_p^*X$ and an orthonormal basis $(\tau_1,\tau_2,\tau_3)$ of $\mathfrak{su}(E)_p$ for the inner product $-\operatorname{tr}(\xi\eta)$. The orthonormality condition means exactly
\begin{align*}
\operatorname{tr}(\tau_a\tau_b)=-\delta_{ab}.
\end{align*}
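For illustration only, suppose a unitary frame of $E_p$ identifies $\mathfrak{su}(E)_p$ with the trace-free skew-Hermitian $2\times 2$ matrices; this identification is an assumption made for the example and is not fixed by the argument. One admissible orthonormal basis is then given by rescaled Pauli matrices:
\begin{align*}
\tau_1=\frac{i}{\sqrt{2}}\begin{pmatrix}0&1\\1&0\end{pmatrix},\qquad
\tau_2=\frac{1}{\sqrt{2}}\begin{pmatrix}0&1\\-1&0\end{pmatrix},\qquad
\tau_3=\frac{i}{\sqrt{2}}\begin{pmatrix}1&0\\0&-1\end{pmatrix}.
\end{align*}
Each $\tau_a$ satisfies $\tau_a^2=-\tfrac{1}{2}I$, so $\operatorname{tr}(\tau_a^2)=-1$, and the products $\tau_a\tau_b$ with $a\neq b$ are trace-free, which is the stated orthonormality condition.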
Write the self-dual and anti-self-dual parts as
\begin{align*}
F_A^+(p)=\sum_{a=1}^{3}\alpha_a^+\otimes \tau_a
\end{align*}
and
\begin{align*}
F_A^-(p)=\sum_{a=1}^{3}\alpha_a^-\otimes \tau_a,
\end{align*}
where $\alpha_a^+\in \Lambda^2_+T_p^*X$ and $\alpha_a^-\in \Lambda^2_-T_p^*X$. Expanding $F_A(p)=F_A^+(p)+F_A^-(p)$, writing $\alpha_a:=\alpha_a^++\alpha_a^-$, and applying the trace in the $\mathfrak{su}(E)_p$ factor gives
\begin{align*}
\operatorname{tr}(F_A(p)\wedge F_A(p))=\sum_{a,b=1}^{3}(\alpha_a\wedge\alpha_b)\operatorname{tr}(\tau_a\tau_b)=-\sum_{a=1}^{3}(\alpha_a^++\alpha_a^-)\wedge(\alpha_a^++\alpha_a^-).
\end{align*}
Now use the defining identity between wedge product and the Hodge star on scalar two-forms:
\begin{align*}
\beta\wedge\gamma=\langle \beta,*\gamma\rangle\,d\operatorname{vol}_g(p).
\end{align*}
For a self-dual form $\alpha_a^+$ this gives
\begin{align*}
\alpha_a^+\wedge\alpha_a^+=|\alpha_a^+|^2\,d\operatorname{vol}_g(p).
\end{align*}
For an anti-self-dual form $\alpha_a^-$ it gives the opposite sign:
\begin{align*}
\alpha_a^-\wedge\alpha_a^-=-|\alpha_a^-|^2\,d\operatorname{vol}_g(p).
\end{align*}
The mixed term vanishes because self-dual and anti-self-dual two-forms are orthogonal:
\begin{align*}
\alpha_a^+\wedge\alpha_a^-=\langle \alpha_a^+,*\alpha_a^-\rangle\,d\operatorname{vol}_g(p)=-\langle \alpha_a^+,\alpha_a^-\rangle\,d\operatorname{vol}_g(p)=0.
\end{align*}
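Substituting these three identities into the trace expansion, and using $|F_A^\pm(p)|^2=\sum_{a=1}^{3}|\alpha_a^\pm|^2$ (which holds because the $\tau_a$ are orthonormal), gives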
\begin{align*}
\operatorname{tr}(F_A(p)\wedge F_A(p))=-(|F_A^+(p)|^2-|F_A^-(p)|^2)\,d\operatorname{vol}_g(p).
\end{align*}
This is the desired pointwise identity, and it holds at every $p\in X$.
[/guided]
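To see the signs concretely, here is a small worked case. The notation is introduced only for this example: $(e^1,e^2,e^3,e^4)$ is an oriented orthonormal coframe at $p$, $e^{ij}:=e^i\wedge e^j$, and $d\operatorname{vol}_g(p)=e^{1234}$. It is assumed, as is standard, that the $e^{ij}$ with $i<j$ are orthonormal in $\Lambda^2T_p^*X$, so that $*e^{12}=e^{34}$ and $*e^{34}=e^{12}$. Then $e^{12}+e^{34}$ is self-dual, $e^{12}-e^{34}$ is anti-self-dual, both have squared norm $2$, and
\begin{align*}
(e^{12}+e^{34})\wedge(e^{12}+e^{34})&=2\,e^{1234},\\
(e^{12}-e^{34})\wedge(e^{12}-e^{34})&=-2\,e^{1234},\\
(e^{12}+e^{34})\wedge(e^{12}-e^{34})&=-e^{1234}+e^{1234}=0.
\end{align*}
These match the three scalar identities used above: plus the squared norm for a self-dual form, minus the squared norm for an anti-self-dual form, and zero for the mixed term.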
[/step]
[step:Identify the norm difference with the second Chern number]
By the Chern-Weil convention in the statement,
\begin{align*}
k=\left\langle -\frac{1}{8\pi^2}[\operatorname{tr}(F_A\wedge F_A)],[X]\right\rangle.
\end{align*}
Since $X$ is closed and oriented, pairing the de Rham cohomology class of a top-degree form with $[X]$ is integration over $X$ with respect to the given orientation. Therefore
\begin{align*}
k=-\frac{1}{8\pi^2}\int_X \operatorname{tr}(F_A\wedge F_A).
\end{align*}
Using the trace wedge identity from the previous step gives
\begin{align*}
k=\frac{1}{8\pi^2}\int_X (|F_A^+|^2-|F_A^-|^2)\,d\operatorname{vol}_g(x).
\end{align*}
Hence
\begin{align*}
8\pi^2k=\|F_A^+\|_{L^2}^2-\|F_A^-\|_{L^2}^2.
\end{align*}
[/step]
[step:Rewrite the energy as a topological term plus a nonnegative term]
Combining
\begin{align*}
YM(A)=\|F_A^+\|_{L^2}^2+\|F_A^-\|_{L^2}^2
\end{align*}
with
\begin{align*}
8\pi^2k=\|F_A^+\|_{L^2}^2-\|F_A^-\|_{L^2}^2
\end{align*}
gives
\begin{align*}
YM(A)=8\pi^2k+2\|F_A^-\|_{L^2}^2.
\end{align*}
The same two identities also give
\begin{align*}
YM(A)=-8\pi^2k+2\|F_A^+\|_{L^2}^2.
\end{align*}
Both squared $L^2$ norms are nonnegative, so the first formula gives $YM(A)\ge 8\pi^2k$ and the second gives $YM(A)\ge -8\pi^2k$. Therefore
\begin{align*}
YM(A)\ge 8\pi^2|k|.
\end{align*}
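As a purely arithmetic illustration, take hypothetical values $\|F_A^+\|_{L^2}^2=24\pi^2$ and $\|F_A^-\|_{L^2}^2=8\pi^2$. These numbers are chosen only to exercise the formulas and are not claimed to arise from any particular connection. Then
\begin{align*}
8\pi^2k&=24\pi^2-8\pi^2=16\pi^2,\qquad k=2,\\
YM(A)&=24\pi^2+8\pi^2=32\pi^2=8\pi^2k+2\|F_A^-\|_{L^2}^2=-8\pi^2k+2\|F_A^+\|_{L^2}^2,
\end{align*}
and indeed $32\pi^2\ge 16\pi^2=8\pi^2|k|$, with the gap $16\pi^2$ equal to $2\|F_A^-\|_{L^2}^2$.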
[/step]
[step:Read off the equality cases from the nonnegative summands]
If $k>0$, then $|k|=k$, and the identity
\begin{align*}
YM(A)=8\pi^2k+2\|F_A^-\|_{L^2}^2
\end{align*}
shows that equality $YM(A)=8\pi^2|k|$ holds exactly when $\|F_A^-\|_{L^2}^2=0$. Since $|F_A^-|^2$ is continuous and nonnegative, its integral vanishes only if it vanishes identically, so this is equivalent to $F_A^-=0$ on $X$.
If $k<0$, then $|k|=-k$, and the identity
\begin{align*}
YM(A)=-8\pi^2k+2\|F_A^+\|_{L^2}^2
\end{align*}
shows that equality holds exactly when $\|F_A^+\|_{L^2}^2=0$, equivalently $F_A^+=0$ on $X$.
If $k=0$, then the bound is $YM(A)\ge 0$. Equality holds exactly when
\begin{align*}
\int_X |F_A|^2\,d\operatorname{vol}_g(x)=0.
\end{align*}
Because $|F_A|^2$ is continuous and nonnegative, this is equivalent to $F_A=0$ on $X$, that is, $A$ is flat. Thus equality occurs precisely when $A$ is self-dual for $k>0$, anti-self-dual for $k<0$, and flat for $k=0$.
[/step]