[proofplan]
Fix a point $x\in\mathbb{R}$ and rewrite the empirical distribution function at $x$ as the average of indicator random variables. Each indicator records whether $X_i$ falls in the Borel set $(-\infty,x]$, hence is a Bernoulli [random variable](/page/Random%20Variable) with success probability $F(x)$. Linearity of expectation gives the mean, and independence makes the cross-covariances vanish, so the [variance](/page/Variance) of the average is the average of the Bernoulli variances divided by $n$.
[/proofplan]
[step:Convert the empirical distribution function at a fixed point into Bernoulli variables]
Fix $x\in\mathbb{R}$. Define the Borel measurable map $h_x:\mathbb{R}\to\{0,1\}$ by $h_x(t)=\mathbb{1}_{(-\infty,x]}(t)$ for every $t\in\mathbb{R}$.
For each $i\in\{1,\dots,n\}$, define the random variable $Y_i:\Omega\to\{0,1\}$ by
\begin{align*}
Y_i=h_x\circ X_i=\mathbb{1}_{\{X_i\le x\}}.
\end{align*}
Since $(-\infty,x]\in\mathcal{B}(\mathbb{R})$, each $Y_i$ is $\mathcal{F}$-measurable. Also
\begin{align*}
\mathbb{P}(Y_i=1)=\mathbb{P}(X_i\le x)=F(x),
\end{align*}
because the $X_i$ have common distribution function $F$. Thus $Y_1,\dots,Y_n$ are identically distributed Bernoulli random variables with success probability $F(x)$. Since $Y_i=h_x\circ X_i$ and $X_1,\dots,X_n$ are independent, the random variables $Y_1,\dots,Y_n$ are independent. Finally,
\begin{align*}
F_n(x)=\frac{1}{n}\sum_{i=1}^n Y_i.
\end{align*}
[guided]
Fix $x\in\mathbb{R}$. Once $x$ is fixed, $F_n(x)$ is no longer a function of $x$; it is a single random variable on $\Omega$. To isolate its elementary structure, define the map $h_x:\mathbb{R}\to\{0,1\}$ by $h_x(t)=\mathbb{1}_{(-\infty,x]}(t)$ for every $t\in\mathbb{R}$.
The set $(-\infty,x]$ is Borel, so $h_x$ is measurable as a map from $(\mathbb{R},\mathcal{B}(\mathbb{R}))$ to $\{0,1\}$ with the discrete $\sigma$-algebra. For each $i\in\{1,\dots,n\}$, define the random variable $Y_i:\Omega\to\{0,1\}$ by $Y_i(\omega)=h_x(X_i(\omega))$ for every $\omega\in\Omega$.
Equivalently,
\begin{align*}
Y_i=\mathbb{1}_{\{X_i\le x\}}.
\end{align*}
Because $X_i$ is measurable and $h_x$ is measurable, the composition $Y_i=h_x\circ X_i$ is a random variable.
Now compute its success probability. The event $\{Y_i=1\}$ is exactly the event $\{X_i\le x\}$, so
\begin{align*}
\mathbb{P}(Y_i=1)=\mathbb{P}(X_i\le x)=F(x).
\end{align*}
Thus each $Y_i$ is a Bernoulli random variable with success probability $F(x)$. Independence is preserved when a measurable map is applied to each of the independent variables separately; here the same map $h_x$ is applied to every $X_i$, so $Y_1,\dots,Y_n$ are independent. With this notation, the empirical distribution function at the fixed point $x$ becomes
\begin{align*}
F_n(x)=\frac{1}{n}\sum_{i=1}^n \mathbb{1}_{\{X_i\le x\}}=\frac{1}{n}\sum_{i=1}^n Y_i.
\end{align*}
This reduction is the whole probabilistic content of the theorem: after fixing $x$, the problem is just about the average of i.i.d. Bernoulli random variables.
[/guided]
[/step]
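The reduction above can be illustrated on a concrete sample. The following is a minimal sketch, not part of the proof: the function names `indicators` and `ecdf_from_indicators` and the sample values are hypothetical and chosen only for illustration. It computes the indicators $Y_i=\mathbb{1}_{\{X_i\le x\}}$ and recovers $F_n(x)$ as their average.

```python
def indicators(sample, x):
    """Return Y_i = 1 if X_i <= x, else 0, for each sample point X_i."""
    return [1 if value <= x else 0 for value in sample]


def ecdf_from_indicators(sample, x):
    """F_n(x) computed as the average of the indicators Y_1, ..., Y_n."""
    ys = indicators(sample, x)
    return sum(ys) / len(ys)


# Hypothetical observed sample, chosen only for illustration.
sample = [2.5, -1.0, 0.7, 3.2, 0.7]
ys = indicators(sample, 1.0)
print(ys)                                  # [0, 1, 1, 0, 1]
print(ecdf_from_indicators(sample, 1.0))   # 0.6
```

Three of the five sample points are at most $1.0$, so the indicators sum to $3$ and $F_5(1.0)=3/5$.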
[step:Compute the pointwise expectation by linearity]
Since each $Y_i$ is bounded, all expectations below are finite. For every $i\in\{1,\dots,n\}$,
\begin{align*}
\mathbb{E}[Y_i]=1\cdot\mathbb{P}(Y_i=1)+0\cdot\mathbb{P}(Y_i=0)=F(x).
\end{align*}
Substituting the representation of $F_n(x)$ as an average of the $Y_i$, we first have
\begin{align*}
\mathbb{E}[F_n(x)]=\mathbb{E}\left[\frac{1}{n}\sum_{i=1}^n Y_i\right].
\end{align*}
Linearity and the scalar factor give
\begin{align*}
\mathbb{E}[F_n(x)]=\frac{1}{n}\sum_{i=1}^n \mathbb{E}[Y_i].
\end{align*}
Since $\mathbb{E}[Y_i]=F(x)$ for every $i\in\{1,\dots,n\}$,
\begin{align*}
\mathbb{E}[F_n(x)]=\frac{1}{n}\sum_{i=1}^n F(x)=F(x).
\end{align*}
[/step]
[step:Compute the Bernoulli variance at the fixed point]
For every $i\in\{1,\dots,n\}$, the identity $Y_i^2=Y_i$ holds pointwise because $Y_i$ takes only the values $0$ and $1$. Since $Y_i$ is bounded, its [variance](/page/Variance) is finite and satisfies
\begin{align*}
\operatorname{Var}(Y_i)=\mathbb{E}[Y_i^2]-(\mathbb{E}[Y_i])^2.
\end{align*}
Using $Y_i^2=Y_i$ and $\mathbb{E}[Y_i]=F(x)$, this becomes
\begin{align*}
\operatorname{Var}(Y_i)=\mathbb{E}[Y_i]-(F(x))^2=F(x)(1-F(x)).
\end{align*}
[/step]
[step:Use independence to compute the variance of the average]
For square-integrable real-valued random variables $A:\Omega\to\mathbb{R}$ and $B:\Omega\to\mathbb{R}$, define their covariance by
\begin{align*}
\operatorname{Cov}(A,B)=\mathbb{E}[AB]-\mathbb{E}[A]\mathbb{E}[B].
\end{align*}
This definition applies to the variables $Y_i$ because they are bounded. For distinct indices $i,j\in\{1,\dots,n\}$, independence of $Y_i$ and $Y_j$ gives
\begin{align*}
\mathbb{E}[Y_iY_j]=\mathbb{E}[Y_i]\mathbb{E}[Y_j],
\end{align*}
so
\begin{align*}
\operatorname{Cov}(Y_i,Y_j)
=\mathbb{E}[Y_iY_j]-\mathbb{E}[Y_i]\mathbb{E}[Y_j]
=0.
\end{align*}
We now derive the [variance](/page/Variance) identity for the finite sum from the bilinearity of covariance. Since the $Y_i$ are bounded, all second moments are finite, and
\begin{align*}
\operatorname{Var}\left(\sum_{i=1}^n Y_i\right)=\operatorname{Cov}\left(\sum_{i=1}^n Y_i,\sum_{j=1}^n Y_j\right).
\end{align*}
Bilinearity of covariance for finite sums gives
\begin{align*}
\operatorname{Var}\left(\sum_{i=1}^n Y_i\right)=\sum_{i=1}^n\sum_{j=1}^n \operatorname{Cov}(Y_i,Y_j).
\end{align*}
Splitting the diagonal terms from the off-diagonal terms and using $\operatorname{Cov}(Y_i,Y_j)=0$ for $i\ne j$ yields
\begin{align*}
\operatorname{Var}\left(\sum_{i=1}^n Y_i\right)=\sum_{i=1}^n \operatorname{Var}(Y_i).
\end{align*}
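Therefore, since $\operatorname{Var}(cZ)=c^2\operatorname{Var}(Z)$ for every constant $c\in\mathbb{R}$ and every square-integrable random variable $Z$, and since $\operatorname{Var}(Y_i)=F(x)(1-F(x))$ for every $i\in\{1,\dots,n\}$,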
\begin{align*}
\operatorname{Var}(F_n(x))=\operatorname{Var}\left(\frac{1}{n}\sum_{i=1}^n Y_i\right)=\frac{1}{n^2}\sum_{i=1}^n F(x)(1-F(x))=\frac{F(x)(1-F(x))}{n}.
\end{align*}
Since $x\in\mathbb{R}$ was arbitrary, both formulas hold for every $x\in\mathbb{R}$.
[/step]
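The two formulas can be checked numerically. The following is a rough Monte Carlo sketch, not part of the proof. It assumes, for illustration only, that the $X_i$ are i.i.d. Uniform$(0,1)$, so that $F(x)=x$ for $0\le x\le 1$; the function names `ecdf_at` and `simulate_ecdf_moments` and the parameter values are hypothetical.

```python
import random


def ecdf_at(sample, x):
    """Empirical distribution function at x: the fraction of points <= x."""
    return sum(1 for value in sample if value <= x) / len(sample)


def simulate_ecdf_moments(n, x, trials, seed=0):
    """Estimate the mean and variance of F_n(x) by repeated sampling.

    Assumption for illustration only: the X_i are i.i.d. Uniform(0, 1),
    so that F(x) = x for 0 <= x <= 1.
    """
    rng = random.Random(seed)
    values = []
    for _ in range(trials):
        sample = [rng.random() for _ in range(n)]
        values.append(ecdf_at(sample, x))
    mean = sum(values) / trials
    # Variance of the simulated replicates around their own mean.
    variance = sum((v - mean) ** 2 for v in values) / trials
    return mean, variance


if __name__ == "__main__":
    n, x = 50, 0.3
    mean, variance = simulate_ecdf_moments(n, x, trials=20000)
    print("simulated mean:    ", mean, " theory:", x)
    print("simulated variance:", variance, " theory:", x * (1 - x) / n)
```

With $n=50$ and $x=0.3$, the theoretical values are $F(x)=0.3$ and $F(x)(1-F(x))/n=0.0042$; the simulated values should be close to these, up to Monte Carlo error.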