[guided]The purpose of this step is to make the conditional expectations concrete. A conditional expectation with respect to the first $k$ coordinates should be obtained by freezing those coordinates and averaging over the remaining independent coordinates.
For each $j\in\{1,\dots,n\}$, define the law of $Y_j$ by
\begin{align*}
\mu_j:=\mathbb P\circ Y_j^{-1}.
\end{align*}
This is a probability measure on $(E_j,\mathcal E_j)$. Since $Y_1,\dots,Y_n$ are independent, the joint law of $(Y_1,\dots,Y_n)$ is
\begin{align*}
\mu:=\mu_1\otimes\cdots\otimes\mu_n.
\end{align*}
Now fix $k\in\{0,1,\dots,n\}$. First define the extended-real partial integral $\tilde h_k:E_1\times\cdots\times E_k\to[-\infty,\infty]$ by
\begin{align*}
\tilde h_k(y_1,\dots,y_k)=\int_{E_{k+1}\times\cdots\times E_n} f(y_1,\dots,y_k,u_{k+1},\dots,u_n)\, d(\mu_{k+1}\otimes\cdots\otimes\mu_n)(u_{k+1},\dots,u_n).
\end{align*}
The hypothesis $Z\in L^1(\Omega,\mathcal F,\mathbb P)$ means that $f$ is integrable with respect to the joint law $\mu$. Therefore Fubini's theorem gives a full-measure measurable set $A_k\subset E_1\times\cdots\times E_k$ on which $\tilde h_k$ is finite. We define the real-valued measurable version $h_k:E_1\times\cdots\times E_k\to\mathbb R$ by $h_k=\tilde h_k$ on $A_k$ and $h_k=0$ outside $A_k$. When $k=n$, no variables remain to average over, so $h_n=f$ with the original everywhere-defined function from the theorem statement. When $k=0$, no coordinates have been revealed, and the definition becomes
\begin{align*}
h_0=\int_{E_1\times\cdots\times E_n} f(u_1,\dots,u_n)\, d\mu(u_1,\dots,u_n)=\mathbb E[Z].
\end{align*}
We verify that $h_k(Y_1,\dots,Y_k)$ is a version of $\mathbb E[Z\mid\mathcal F_k]$. Let $G:\Omega\to\mathbb R$ be any bounded $\mathcal F_k$-measurable random variable. Since $\mathcal F_k=\sigma(Y_1,\dots,Y_k)$, there is a bounded measurable map $\varphi:E_1\times\cdots\times E_k\to\mathbb R$ such that $G=\varphi(Y_1,\dots,Y_k)$ almost surely. The product $\varphi f$ is integrable because $\varphi$ is bounded and $f\in L^1(E_1\times\cdots\times E_n,\mu)$. Hence Fubini's theorem gives
\begin{align*}
\mathbb E[GZ]=\int_{E_1\times\cdots\times E_n}\varphi(y_1,\dots,y_k)f(y_1,\dots,y_n)\,d\mu(y_1,\dots,y_n).
\end{align*}
Averaging first over the unrevealed coordinates yields
\begin{align*}
\mathbb E[GZ]=\int_{E_1\times\cdots\times E_k}\varphi(y_1,\dots,y_k)h_k(y_1,\dots,y_k)\,d(\mu_1\otimes\cdots\otimes\mu_k)(y_1,\dots,y_k).
\end{align*}
Since $(Y_1,\dots,Y_k)$ has law $\mu_1\otimes\cdots\otimes\mu_k$, the last integral is
\begin{align*}
\mathbb E\left[G\,h_k(Y_1,\dots,Y_k)\right].
\end{align*}
This identity for every bounded $\mathcal F_k$-measurable $G$ is the defining property of conditional expectation. Hence
\begin{align*}
M_k=\mathbb E[Z\mid\mathcal F_k]=h_k(Y_1,\dots,Y_k)
\end{align*}
almost surely.[/guided]