[proofplan]
For $k=0$ the assertion is immediate because the Chern-Weil form is the same constant function for every connection. For $k\ge 1$, join the two connections by the affine path $A_t=A_0+t(A_1-A_0)$ and compute the variation of the Chern-Weil form along this path. The curvature variation formula, the principal Bianchi identity, and the infinitesimal $\operatorname{Ad}$-invariance identity combine to show that the $t$-derivative of the Chern-Weil form is the [exterior derivative](/theorems/1525) of an explicit $(2k-1)$-form. Integrating this identity over $t\in[0,1]$ gives a transgression form whose exterior derivative is the difference of the two Chern-Weil forms, and exactness of that difference gives equality of de Rham classes.
[/proofplan]
[step:Separate the constant polynomial case]
If $k=0$, then $P_0\in I^0(G)=\mathbb R$ is a constant. By the Chern-Weil convention, $P_0(F_{A_0})_M=P_0(F_{A_1})_M=P_0$ as $0$-forms on $M$. Hence
\begin{align*}
P_0(F_{A_1})_M-P_0(F_{A_0})_M=0.
\end{align*}
The only exact $0$-form is the zero form, and the difference above is zero, so it is exact and the de Rham cohomology classes are equal. For the rest of the proof, assume $k\ge 1$.
[/step]
[step:Build the affine path of principal connections]
Define the $\mathfrak g$-valued $1$-form
\begin{align*}
a:=A_1-A_0\in\Omega^1(P;\mathfrak g).
\end{align*}
For each $t\in[0,1]$, define
\begin{align*}
A_t:=A_0+t a\in\Omega^1(P;\mathfrak g).
\end{align*}
Because $A_0$ and $A_1$ are principal connection forms, $a$ is horizontal and $\operatorname{Ad}$-equivariant, and therefore $A_t$ again satisfies the two defining conditions for a principal connection form: it reproduces fundamental vector fields and is equivariant under the right action of $G$. Let
\begin{align*}
F_t:=F_{A_t}=dA_t+\frac{1}{2}[A_t\wedge A_t]\in\Omega^2(P;\mathfrak g)
\end{align*}
denote the curvature form of $A_t$.
[guided]
The space of principal connections on a fixed principal bundle is affine, and the difference of two connection forms is not another connection form but rather a tensorial $\mathfrak g$-valued $1$-form. We name that difference
\begin{align*}
a:=A_1-A_0\in\Omega^1(P;\mathfrak g).
\end{align*}
For $t\in[0,1]$, set
\begin{align*}
A_t:=A_0+t a.
\end{align*}
We verify that $A_t$ is a principal connection. Let $X\in\mathfrak g$, and let $X^\#\in\mathfrak X(P)$ be the fundamental vector field generated by the right action. Since $A_0(X^\#)=X$ and $A_1(X^\#)=X$, we get
\begin{align*}
a(X^\#)=A_1(X^\#)-A_0(X^\#)=0.
\end{align*}
Thus
\begin{align*}
A_t(X^\#)=A_0(X^\#)+t a(X^\#)=X.
\end{align*}
Next let $g\in G$, and let $R_g:P\to P$ be the right action map. Since both $A_0$ and $A_1$ are equivariant,
\begin{align*}
R_g^*A_i=\operatorname{Ad}_{g^{-1}}A_i
\end{align*}
for $i\in\{0,1\}$. Subtracting gives
\begin{align*}
R_g^*a=\operatorname{Ad}_{g^{-1}}a.
\end{align*}
Therefore
\begin{align*}
R_g^*A_t=R_g^*A_0+tR_g^*a=\operatorname{Ad}_{g^{-1}}A_0+t\operatorname{Ad}_{g^{-1}}a=\operatorname{Ad}_{g^{-1}}A_t.
\end{align*}
So $A_t$ is a principal connection form for every $t\in[0,1]$. We denote its curvature by
\begin{align*}
F_t:=F_{A_t}=dA_t+\frac{1}{2}[A_t\wedge A_t]\in\Omega^2(P;\mathfrak g).
\end{align*}
[/guided]
[/step]
[step:Differentiate the curvature along the path]
Let
\begin{align*}
d_{A_t}:\Omega^j(P;\mathfrak g)\to\Omega^{j+1}(P;\mathfrak g)
\end{align*}
denote the covariant exterior derivative associated to $A_t$, defined by
\begin{align*}
d_{A_t}\beta:=d\beta+[A_t\wedge\beta]
\end{align*}
for every $\beta\in\Omega^j(P;\mathfrak g)$, with the usual graded bracket convention. Differentiating the curvature formula gives
\begin{align*}
\frac{dF_t}{dt}=da+[A_t\wedge a]=d_{A_t}a.
\end{align*}
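Indeed, $\frac{dA_t}{dt}=a$, so differentiating $F_t=dA_t+\frac{1}{2}[A_t\wedge A_t]$ gives $da+\frac{1}{2}\bigl([a\wedge A_t]+[A_t\wedge a]\bigr)$, and $[a\wedge A_t]=[A_t\wedge a]$ because both entries are $\mathfrak g$-valued $1$-forms.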
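As a consistency check, the curvature along the affine path can be expanded in powers of $t$. The following worked computation is a sketch that uses only the definitions above and the symmetry $[a\wedge A_0]=[A_0\wedge a]$ for $\mathfrak g$-valued $1$-forms:
\begin{align*}
F_t&=d(A_0+ta)+\frac{1}{2}[(A_0+ta)\wedge(A_0+ta)]\\
&=F_{A_0}+t\bigl(da+[A_0\wedge a]\bigr)+\frac{t^2}{2}[a\wedge a]\\
&=F_{A_0}+t\,d_{A_0}a+\frac{t^2}{2}[a\wedge a],\\
\frac{dF_t}{dt}&=d_{A_0}a+t[a\wedge a]=da+[(A_0+ta)\wedge a]=d_{A_t}a.
\end{align*}
At $t=1$ the expansion reads $F_{A_1}=F_{A_0}+d_{A_0}a+\frac{1}{2}[a\wedge a]$.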
[/step]
[step:Convert the variation of the invariant polynomial into an exterior derivative]
Extend $P_0$ to $\mathfrak g$-valued differential forms by the standard Chern-Weil wedge convention. Since $P_0$ is symmetric and $F_t$ has even degree, differentiating by multilinearity gives
\begin{align*}
\frac{d}{dt}P_0(F_t,\dots,F_t)=kP_0(d_{A_t}a,F_t,\dots,F_t).
\end{align*}
Here the right-hand side has one entry $d_{A_t}a$ and $k-1$ entries $F_t$.
The principal Bianchi identity gives $d_{A_t}F_t=0$ for every $t$. The infinitesimal $\operatorname{Ad}$-invariance identity for $P_0$ gives the Chern-Weil Leibniz identity
\begin{align*}
dP_0(a,F_t,\dots,F_t)=P_0(d_{A_t}a,F_t,\dots,F_t),
\end{align*}
because the graded Leibniz expansion has the first term displayed above and every remaining term contains one factor $d_{A_t}F_t$, hence vanishes by the Bianchi identity; the possible graded signs on those remaining terms are therefore irrelevant. Therefore
\begin{align*}
\frac{d}{dt}P_0(F_t,\dots,F_t)=d\bigl(kP_0(a,F_t,\dots,F_t)\bigr).
\end{align*}
Equivalently, this is the [infinitesimal Chern-Weil variation formula](/theorems/9790) [citetheorem:9790] applied to the smooth path $(A_t)_{t\in[0,1]}$.
[guided]
The goal of this step is to show that the change in the Chern-Weil form is already an exterior derivative at the infinitesimal level. Since $P_0$ is $k$-linear and symmetric, the ordinary product rule gives one term for each curvature slot:
\begin{align*}
\frac{d}{dt}P_0(F_t,\dots,F_t)=\sum_{j=1}^{k}P_0(F_t,\dots,F_t,\frac{dF_t}{dt},F_t,\dots,F_t).
\end{align*}
Every $F_t$ is a $2$-form, so moving $\frac{dF_t}{dt}$ past the other curvature entries introduces no sign. Symmetry of $P_0$ then makes all $k$ summands equal. Using the curvature variation identity $\frac{dF_t}{dt}=d_{A_t}a$, we obtain
\begin{align*}
\frac{d}{dt}P_0(F_t,\dots,F_t)=kP_0(d_{A_t}a,F_t,\dots,F_t).
\end{align*}
Now we explain why this expression is an exterior derivative. The ordinary exterior derivative of $P_0(a,F_t,\dots,F_t)$ can be rewritten using the covariant exterior derivative $d_{A_t}$ because $P_0$ is infinitesimally $\operatorname{Ad}$-invariant. This is exactly the cancellation expressed by the infinitesimal $\operatorname{Ad}$-invariance identity [citetheorem:9759]. Applying the graded Leibniz rule gives
\begin{align*}
dP_0(a,F_t,\dots,F_t)=P_0(d_{A_t}a,F_t,\dots,F_t)-\sum_{j=2}^{k}P_0(a,F_t,\dots,d_{A_t}F_t,\dots,F_t).
\end{align*}
The minus sign arises because $d_{A_t}$ passes the $1$-form $a$ before reaching a curvature slot; the curvature entries have even degree and contribute no further signs. The principal Bianchi identity says
\begin{align*}
d_{A_t}F_t=0.
\end{align*}
Since each summand in the displayed sum contains a factor $d_{A_t}F_t$, every summand vanishes regardless of its sign, leaving
\begin{align*}
dP_0(a,F_t,\dots,F_t)=P_0(d_{A_t}a,F_t,\dots,F_t).
\end{align*}
Substituting this into the differentiated Chern-Weil expression gives
\begin{align*}
\frac{d}{dt}P_0(F_t,\dots,F_t)=d\bigl(kP_0(a,F_t,\dots,F_t)\bigr).
\end{align*}
This is the infinitesimal Chern-Weil variation formula [citetheorem:9790] in this affine-path situation.
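To see the cancellation concretely, consider the following sketch under assumptions that are not part of the proof: $G$ is a matrix Lie group, $k=2$, and $P_0(X,Y)=\operatorname{tr}(XY)$. For matrix-valued forms one then has $[\alpha\wedge\beta]=\alpha\wedge\beta-(-1)^{|\alpha||\beta|}\beta\wedge\alpha$ and the graded cyclicity $\operatorname{tr}(\alpha\wedge\beta)=(-1)^{|\alpha||\beta|}\operatorname{tr}(\beta\wedge\alpha)$. Writing the Bianchi identity as $dF_t=F_t\wedge A_t-A_t\wedge F_t$, one finds
\begin{align*}
d\operatorname{tr}(a\wedge F_t)&=\operatorname{tr}(da\wedge F_t)-\operatorname{tr}(a\wedge dF_t)\\
&=\operatorname{tr}(da\wedge F_t)+\operatorname{tr}(a\wedge A_t\wedge F_t)-\operatorname{tr}(a\wedge F_t\wedge A_t)\\
&=\operatorname{tr}(da\wedge F_t)+\operatorname{tr}(a\wedge A_t\wedge F_t)+\operatorname{tr}(A_t\wedge a\wedge F_t)\\
&=\operatorname{tr}(d_{A_t}a\wedge F_t).
\end{align*}
The third line uses graded cyclicity to move the $1$-form $A_t$ past the $3$-form $a\wedge F_t$, and the last line uses $d_{A_t}a=da+A_t\wedge a+a\wedge A_t$. Consequently $\frac{d}{dt}\operatorname{tr}(F_t\wedge F_t)=2\operatorname{tr}(d_{A_t}a\wedge F_t)=d\bigl(2\operatorname{tr}(a\wedge F_t)\bigr)$, which is the general identity specialized to $k=2$.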
[/guided]
[/step]
[step:Integrate the infinitesimal identity to obtain a transgression form]
Let $\mathcal L^1$ denote one-dimensional [Lebesgue measure](/page/Lebesgue%20Measure) on $[0,1]$. Define the transgression form on $P$ by
\begin{align*}
\operatorname{TP}_0(A_0,A_1):=\int_{[0,1]}kP_0(a,F_t,\dots,F_t)\,d\mathcal L^1(t)\in\Omega^{2k-1}(P),
\end{align*}
where the integrand has one entry $a$ and $k-1$ entries $F_t$. This integral is taken coefficientwise for the smooth $t$-dependent differential form $kP_0(a,F_t,\dots,F_t)$. The form $a$ is horizontal and equivariant, each $F_t$ is horizontal and equivariant, and $P_0$ is $\operatorname{Ad}$-invariant, so the integrand is basic for every $t$. Hence $\operatorname{TP}_0(A_0,A_1)$ is basic and descends to a unique form
\begin{align*}
\operatorname{TP}_0(A_0,A_1)_M\in\Omega^{2k-1}(M)
\end{align*}
satisfying
\begin{align*}
\pi^*\operatorname{TP}_0(A_0,A_1)_M=\operatorname{TP}_0(A_0,A_1).
\end{align*}
Integrating the identity from the previous step over $[0,1]$ with respect to $\mathcal L^1$ gives
\begin{align*}
P_0(F_{A_1})-P_0(F_{A_0})=d\operatorname{TP}_0(A_0,A_1).
\end{align*}
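Here exterior differentiation was exchanged with integration over the compact parameter interval $[0,1]$, which is justified because the integrand depends smoothly on $t$. Exterior differentiation also commutes with pullback by $\pi$, and $\pi^*$ is injective on differential forms because $\pi$ is a surjective submersion, so the identity descends to $M$: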
\begin{align*}
P_0(F_{A_1})_M-P_0(F_{A_0})_M=d\operatorname{TP}_0(A_0,A_1)_M.
\end{align*}
This is the [Chern-Weil transgression formula for a smooth path of connections](/theorems/9791) [citetheorem:9791].
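For orientation, the transgression form can be evaluated in closed form in low degree. The following sketch uses the expansion $F_t=F_{A_0}+t\,d_{A_0}a+\frac{t^2}{2}[a\wedge a]$, which follows from the definition of $F_t$, together with $\int_0^1 t\,dt=\frac{1}{2}$ and $\int_0^1 t^2\,dt=\frac{1}{3}$:
\begin{align*}
k=1:\quad \operatorname{TP}_0(A_0,A_1)&=\int_{[0,1]}P_0(a)\,d\mathcal L^1(t)=P_0(a),\\
k=2:\quad \operatorname{TP}_0(A_0,A_1)&=\int_{[0,1]}2P_0\Bigl(a,F_{A_0}+t\,d_{A_0}a+\frac{t^2}{2}[a\wedge a]\Bigr)\,d\mathcal L^1(t)\\
&=2P_0(a,F_{A_0})+P_0(a,d_{A_0}a)+\frac{1}{3}P_0(a,[a\wedge a]).
\end{align*}
In the case $k=1$, invariance of the linear functional $P_0$ means that it vanishes on brackets, so $P_0(F_{A_1})-P_0(F_{A_0})=P_0\bigl(d_{A_0}a+\frac{1}{2}[a\wedge a]\bigr)=P_0(da)=dP_0(a)$, in agreement with the transgression identity.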
[/step]
[step:Pass to de Rham cohomology]
By the previous step,
\begin{align*}
P_0(F_{A_1})_M-P_0(F_{A_0})_M=d\operatorname{TP}_0(A_0,A_1)_M.
\end{align*}
Thus the difference of the two Chern-Weil forms is exact. Since the descended Chern-Weil forms are closed by the [closedness of descended Chern-Weil forms](/theorems/9763) [citetheorem:9763], they determine de Rham cohomology classes. Exact forms represent the zero class in de Rham cohomology, so
\begin{align*}
[P_0(F_{A_1})_M]-[P_0(F_{A_0})_M]=0\in H_{\mathrm{dR}}^{2k}(M).
\end{align*}
Therefore
\begin{align*}
[P_0(F_{A_0})_M]=[P_0(F_{A_1})_M]\in H_{\mathrm{dR}}^{2k}(M),
\end{align*}
which is the desired connection independence.
[/step]