[proofplan]
We first check that every product appearing in the covariance expansion is integrable. Then we rewrite each empirical process as a normalized sum of centered i.i.d. variables and expand the covariance into a double sum. The off-diagonal terms vanish by independence and centering, while the diagonal terms all equal $P(fg)-Pf\,Pg$. Finally, applying the same computation to $h=f-g$ gives the variance identity defining the covariance semimetric.
[/proofplan]
[step:Verify integrability of the products used in the covariance expansion]
Since $f,g:E\to\mathbb R$ are measurable and square-integrable with respect to $P$, the function $fg:E\to\mathbb R$ is measurable. For every $x\in E$,
\begin{align*}
2|f(x)g(x)|\le f(x)^2+g(x)^2.
\end{align*}
Integrating with respect to $P$ gives
\begin{align*}
\int_E |f(x)g(x)|\,dP(x)\le \frac{1}{2}\int_E f(x)^2\,dP(x)+\frac{1}{2}\int_E g(x)^2\,dP(x)<\infty.
\end{align*}
Thus $P(fg)$ is finite. Also $Pf$ and $Pg$ are finite because
\begin{align*}
|f(x)|\le \frac{1}{2}(1+f(x)^2) \qquad \text{and} \qquad |g(x)|\le \frac{1}{2}(1+g(x)^2)
\end{align*}
for every $x\in E$, and $P(E)=1$. Hence $G_n(f)$ and $G_n(g)$ are square-integrable real-valued random variables, so their covariance is well-defined.
[/step]
[step:Rewrite the empirical processes as sums of centered random variables]
For each $i\in\{1,\dots,n\}$, define real-valued random variables
\begin{align*}
Y_i:=f(X_i)-Pf \qquad \text{and} \qquad Z_i:=g(X_i)-Pg.
\end{align*}
By the definitions of $P_n$ and $G_n$,
\begin{align*}
G_n(f)=\frac{1}{\sqrt n}\sum_{i=1}^{n}Y_i \qquad \text{and} \qquad G_n(g)=\frac{1}{\sqrt n}\sum_{i=1}^{n}Z_i.
\end{align*}
Moreover,
\begin{align*}
\mathbb E[Y_i]=0 \qquad \text{and} \qquad \mathbb E[Z_i]=0
\end{align*}
for every $i\in\{1,\dots,n\}$, because each $X_i$ has distribution $P$.
[/step]
[step:Expand the covariance and eliminate the off-diagonal terms]
Using the definition of covariance for square-integrable real-valued random variables and the centering from the preceding step,
\begin{align*}
\operatorname{Cov}(G_n(f),G_n(g))=\mathbb E[G_n(f)G_n(g)].
\end{align*}
Substituting the sum representations and using linearity of expectation over the finite double sum gives
\begin{align*}
\operatorname{Cov}(G_n(f),G_n(g))=\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n}\mathbb E[Y_iZ_j].
\end{align*}
If $i\ne j$, then $X_i$ and $X_j$ are independent, so $Y_i=f(X_i)-Pf$ and $Z_j=g(X_j)-Pg$ are independent. Therefore
\begin{align*}
\mathbb E[Y_iZ_j]=\mathbb E[Y_i]\mathbb E[Z_j]=0.
\end{align*}
[guided]
The point of introducing $Y_i$ and $Z_i$ is that they isolate the two facts that drive the covariance computation: independence across different sample indices and mean zero at each index. Since
\begin{align*}
G_n(f)=\frac{1}{\sqrt n}\sum_{i=1}^{n}Y_i \qquad \text{and} \qquad G_n(g)=\frac{1}{\sqrt n}\sum_{j=1}^{n}Z_j,
\end{align*}
their covariance is the expectation of the product of these two centered sums:
\begin{align*}
\operatorname{Cov}(G_n(f),G_n(g))=\mathbb E[G_n(f)G_n(g)].
\end{align*}
Expanding the product gives a finite double sum, so linearity of expectation applies without any limiting argument:
\begin{align*}
\operatorname{Cov}(G_n(f),G_n(g))=\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n}\mathbb E[Y_iZ_j].
\end{align*}
Now consider an off-diagonal term with $i\ne j$. The random variables $X_i$ and $X_j$ are independent by the i.i.d. hypothesis. Since $Y_i$ is a [measurable function](/page/Measurable%20Function) of $X_i$ and $Z_j$ is a measurable function of $X_j$, the real-valued random variables $Y_i$ and $Z_j$ are independent. Their expectations are zero:
\begin{align*}
\mathbb E[Y_i]=\mathbb E[f(X_i)]-Pf=Pf-Pf=0,
\end{align*}
and
\begin{align*}
\mathbb E[Z_j]=\mathbb E[g(X_j)]-Pg=Pg-Pg=0.
\end{align*}
Independence therefore factors the expectation of the product:
\begin{align*}
\mathbb E[Y_iZ_j]=\mathbb E[Y_i]\mathbb E[Z_j]=0.
\end{align*}
Thus every off-diagonal contribution in the double sum vanishes.
[/guided]
[/step]
[step:Identify the diagonal terms and obtain the covariance formula]
For each $i\in\{1,\dots,n\}$, since $X_i$ has distribution $P$,
\begin{align*}
\mathbb E[Y_iZ_i]=\mathbb E[(f(X_i)-Pf)(g(X_i)-Pg)].
\end{align*}
Expanding the product and using $\mathbb E[f(X_i)]=Pf$ and $\mathbb E[g(X_i)]=Pg$ gives
\begin{align*}
\mathbb E[Y_iZ_i]=P(fg)-Pf\,Pg.
\end{align*}
The double sum has exactly $n$ diagonal terms and all off-diagonal terms are zero, so
\begin{align*}
\operatorname{Cov}(G_n(f),G_n(g))=\frac{1}{n}\sum_{i=1}^{n}\bigl(P(fg)-Pf\,Pg\bigr)=P(fg)-Pf\,Pg.
\end{align*}
[/step]
[step:Apply the covariance formula to the difference $f-g$]
Define the measurable function $h:E\to\mathbb R$ by $h(x):=f(x)-g(x)$. Since
\begin{align*}
h(x)^2\le 2f(x)^2+2g(x)^2
\end{align*}
for every $x\in E$, we have $Ph^2<\infty$. By linearity of $P_n$ and $P$,
\begin{align*}
G_n(h)=G_n(f)-G_n(g).
\end{align*}
The preceding centering computation applied to $h$ gives $\mathbb E[G_n(h)]=0$. Hence $\mathbb E[G_n(h)^2]=\operatorname{Cov}(G_n(h),G_n(h))$. Applying the already proved covariance formula with $h$ in both arguments gives
\begin{align*}
\mathbb E[(G_n(f)-G_n(g))^2]=\mathbb E[G_n(h)^2]=P(h^2)-(Ph)^2.
\end{align*}
Substituting $h=f-g$ yields
\begin{align*}
\mathbb E[(G_n(f)-G_n(g))^2]=P((f-g)^2)-\bigl(P(f-g)\bigr)^2=d_P(f,g)^2.
\end{align*}
This proves both asserted identities.
[/step]