[proofplan]
We expose the independent variables one at a time and form the Doob martingale of $Z$ with respect to the filtration generated by the revealed coordinates. The bounded differences hypothesis implies that, after the first $k-1$ coordinates are fixed, changing only the $k$-th coordinate changes the [conditional expectation](/page/Conditional%20Expectation) of $Z$ by at most $c_k$. Thus the $k$-th martingale increment is almost surely bounded by $c_k$. The desired tail estimate then follows from the [Azuma-Hoeffding inequality](/theorems/6071) for martingales with bounded increments.
[/proofplan]
[step:Build the coordinate filtration and the Doob martingale]
For each $k\in\{0,1,\dots,n\}$, define the sub-$\sigma$-algebra
\begin{align*}
\mathcal F_k:=\sigma(Y_1,\dots,Y_k),
\end{align*}
with $\mathcal F_0:=\{\varnothing,\Omega\}$. Define the process $(M_k)_{k=0}^n$ by
\begin{align*}
M_k:=\mathbb E[Z\mid \mathcal F_k].
\end{align*}
Since $Z\in L^1(\Omega,\mathcal F,\mathbb P)$, each conditional expectation $M_k$ is integrable and $\mathcal F_k$-measurable. Since $(\mathcal F_k)_{k=0}^n$ is increasing, the tower property gives, for $0\le k\le n-1$,
\begin{align*}
\mathbb E[M_{k+1}\mid \mathcal F_k]=\mathbb E[\mathbb E[Z\mid \mathcal F_{k+1}]\mid \mathcal F_k]=\mathbb E[Z\mid \mathcal F_k]=M_k.
\end{align*}
Thus $(M_k,\mathcal F_k)_{k=0}^n$ is a martingale. Moreover $M_0=\mathbb E[Z]$ and $M_n=Z$, because $Z=f(Y_1,\dots,Y_n)$ is $\mathcal F_n$-measurable.
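For illustration only, consider the special case (an assumption made here, not part of the theorem) in which each $E_j=[0,c_j]\subset\mathbb R$ and $f(y_1,\dots,y_n)=y_1+\cdots+y_n$, and write $m_j:=\mathbb E[Y_j]$. Independence gives $\mathbb E[Y_j\mid\mathcal F_k]=m_j$ almost surely for $j>k$, so in this case the Doob martingale is the partially centered sum
\begin{align*}
M_k=\sum_{j=1}^k Y_j+\sum_{j=k+1}^n m_j,\qquad 0\le k\le n,
\end{align*}
almost surely, which indeed satisfies $M_0=\sum_{j=1}^n m_j=\mathbb E[Z]$ and $M_n=\sum_{j=1}^n Y_j=Z$.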
[/step]
[step:Represent each conditional expectation by integrating over unrevealed coordinates]
For each $j\in\{1,\dots,n\}$, let $\mu_j:=\mathbb P\circ Y_j^{-1}$ denote the law of $Y_j$ on $(E_j,\mathcal E_j)$. Independence gives that the law of $(Y_1,\dots,Y_n)$ is the product measure
\begin{align*}
\mu:=\mu_1\otimes\cdots\otimes\mu_n
\end{align*}
on $E_1\times\cdots\times E_n$.
For each $k\in\{0,1,\dots,n\}$, define an extended-real measurable partial integral $\tilde h_k:E_1\times\cdots\times E_k\to[-\infty,\infty]$ by
\begin{align*}
\tilde h_k(y_1,\dots,y_k)=\int_{E_{k+1}\times\cdots\times E_n} f(y_1,\dots,y_k,u_{k+1},\dots,u_n)\, d(\mu_{k+1}\otimes\cdots\otimes\mu_n)(u_{k+1},\dots,u_n).
\end{align*}
Since $Z\in L^1$ and the joint law is $\mu$, [Fubini's theorem](/theorems/2961) gives a measurable set $A_k\subset E_1\times\cdots\times E_k$ with $(\mu_1\otimes\cdots\otimes\mu_k)(A_k)=1$ such that $\tilde h_k$ is finite on $A_k$. Define the real-valued measurable version $h_k:E_1\times\cdots\times E_k\to\mathbb R$ by $h_k=\tilde h_k$ on $A_k$ and $h_k=0$ on $(E_1\times\cdots\times E_k)\setminus A_k$. For $k=n$ no variables remain to be integrated, so we take $h_n=f$, the everywhere-defined function from the theorem statement; for $k=0$ the definition reads
\begin{align*}
h_0=\int_{E_1\times\cdots\times E_n} f(u_1,\dots,u_n)\, d\mu(u_1,\dots,u_n)=\mathbb E[Z].
\end{align*}
We claim that, after choosing these versions,
\begin{align*}
M_k=h_k(Y_1,\dots,Y_k)
\end{align*}
almost surely for each $k$. Indeed, for every bounded $\mathcal F_k$-measurable [random variable](/page/Random%20Variable) $G:\Omega\to\mathbb R$, there exists a bounded measurable map $\varphi:E_1\times\cdots\times E_k\to\mathbb R$ such that $G=\varphi(Y_1,\dots,Y_k)$ almost surely. Using independence and the product-measure representation of the joint law, Fubini's theorem applies to the integrable function $\varphi f$. First,
\begin{align*}
\mathbb E[GZ]=\int_{E_1\times\cdots\times E_n}\varphi(y_1,\dots,y_k)f(y_1,\dots,y_n)\,d\mu(y_1,\dots,y_n).
\end{align*}
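Integrating first over the unrevealed coordinates by Fubini's theorem, and using that $h_k=\tilde h_k$ outside a $(\mu_1\otimes\cdots\otimes\mu_k)$-null set, gives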
\begin{align*}
\mathbb E[GZ]=\int_{E_1\times\cdots\times E_k}\varphi(y_1,\dots,y_k)h_k(y_1,\dots,y_k)\,d(\mu_1\otimes\cdots\otimes\mu_k)(y_1,\dots,y_k).
\end{align*}
By the law of $(Y_1,\dots,Y_k)$, this last integral is
\begin{align*}
\mathbb E\left[G\,h_k(Y_1,\dots,Y_k)\right].
\end{align*}
This is precisely the defining property of $\mathbb E[Z\mid\mathcal F_k]$.
[guided]
The purpose of this step is to make the conditional expectations concrete. A conditional expectation with respect to the first $k$ coordinates should be obtained by freezing those coordinates and averaging over the remaining independent coordinates.
For each $j\in\{1,\dots,n\}$, define the law of $Y_j$ by
\begin{align*}
\mu_j:=\mathbb P\circ Y_j^{-1}.
\end{align*}
This is a probability measure on $(E_j,\mathcal E_j)$. Since $Y_1,\dots,Y_n$ are independent, the joint law of $(Y_1,\dots,Y_n)$ is
\begin{align*}
\mu:=\mu_1\otimes\cdots\otimes\mu_n.
\end{align*}
Now fix $k\in\{0,1,\dots,n\}$. First define the extended-real partial integral $\tilde h_k:E_1\times\cdots\times E_k\to[-\infty,\infty]$ by
\begin{align*}
\tilde h_k(y_1,\dots,y_k)=\int_{E_{k+1}\times\cdots\times E_n} f(y_1,\dots,y_k,u_{k+1},\dots,u_n)\, d(\mu_{k+1}\otimes\cdots\otimes\mu_n)(u_{k+1},\dots,u_n).
\end{align*}
The hypothesis $Z\in L^1(\Omega,\mathcal F,\mathbb P)$ means that $f$ is integrable with respect to the joint law $\mu$. Therefore Fubini's theorem gives a full-measure measurable set $A_k\subset E_1\times\cdots\times E_k$ on which $\tilde h_k$ is finite. We define the real-valued measurable version $h_k:E_1\times\cdots\times E_k\to\mathbb R$ by $h_k=\tilde h_k$ on $A_k$ and $h_k=0$ outside $A_k$. When $k=n$, no variables remain to average over, so $h_n=f$, the everywhere-defined function from the theorem statement. When $k=0$, no coordinates have been revealed, and the definition becomes
\begin{align*}
h_0=\int_{E_1\times\cdots\times E_n} f(u_1,\dots,u_n)\, d\mu(u_1,\dots,u_n)=\mathbb E[Z].
\end{align*}
We verify that $h_k(Y_1,\dots,Y_k)$ is a version of $\mathbb E[Z\mid\mathcal F_k]$. Let $G:\Omega\to\mathbb R$ be any bounded $\mathcal F_k$-measurable random variable. Since $\mathcal F_k=\sigma(Y_1,\dots,Y_k)$, there is a bounded measurable map $\varphi:E_1\times\cdots\times E_k\to\mathbb R$ such that $G=\varphi(Y_1,\dots,Y_k)$ almost surely. The product $\varphi f$ is integrable because $\varphi$ is bounded and $f\in L^1(E_1\times\cdots\times E_n,\mu)$. Since $(Y_1,\dots,Y_n)$ has law $\mu$, the change-of-variables formula gives
\begin{align*}
\mathbb E[GZ]=\int_{E_1\times\cdots\times E_n}\varphi(y_1,\dots,y_k)f(y_1,\dots,y_n)\,d\mu(y_1,\dots,y_n).
\end{align*}
By Fubini's theorem, averaging first over the unrevealed coordinates and using that $h_k=\tilde h_k$ on the full-measure set $A_k$ yields
\begin{align*}
\mathbb E[GZ]=\int_{E_1\times\cdots\times E_k}\varphi(y_1,\dots,y_k)h_k(y_1,\dots,y_k)\,d(\mu_1\otimes\cdots\otimes\mu_k)(y_1,\dots,y_k).
\end{align*}
Since $(Y_1,\dots,Y_k)$ has law $\mu_1\otimes\cdots\otimes\mu_k$, the last integral is
\begin{align*}
\mathbb E\left[G\,h_k(Y_1,\dots,Y_k)\right].
\end{align*}
This identity for every bounded $\mathcal F_k$-measurable $G$ is the defining property of conditional expectation. Hence
\begin{align*}
M_k=\mathbb E[Z\mid\mathcal F_k]=h_k(Y_1,\dots,Y_k)
\end{align*}
almost surely.
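As a sanity check under an illustrative assumption that is not part of the theorem, take $n=2$, $E_1=E_2=[0,1]$, and $f(y_1,y_2)=y_1y_2$, and write $m_j:=\int_{[0,1]}u\,d\mu_j(u)$. Freezing $y_1$ and averaging over the second coordinate gives
\begin{align*}
h_1(y_1)=\int_{[0,1]}y_1u_2\,d\mu_2(u_2)=y_1m_2,\qquad h_0=m_1m_2,\qquad h_2=f,
\end{align*}
which matches the familiar identity $\mathbb E[Y_1Y_2\mid Y_1]=Y_1\,\mathbb E[Y_2]$ for independent $Y_1,Y_2$. Here every partial integral is finite everywhere, so no modification on a null set is needed.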
[/guided]
[/step]
[step:Bound each martingale increment by the corresponding coordinate oscillation]
Fix $k\in\{1,\dots,n\}$. Let $\nu_{k-1}:=\mu_1\otimes\cdots\otimes\mu_{k-1}$, with $\nu_0$ the unit measure on a one-point space, and let $\lambda_k:=\mu_{k+1}\otimes\cdots\otimes\mu_n$, with $\lambda_n$ the unit measure on a one-point space. By Fubini's theorem applied to the integrable function $f$, there is a measurable set $B_{k-1}\subset E_1\times\cdots\times E_{k-1}$ with $\nu_{k-1}(B_{k-1})=1$ such that for each prefix $a=(a_1,\dots,a_{k-1})\in B_{k-1}$ the extended-real measurable map $\tilde g_a:E_k\to[-\infty,\infty]$ defined by
\begin{align*}
\tilde g_a(y):=\int_{E_{k+1}\times\cdots\times E_n}f(a_1,\dots,a_{k-1},y,u_{k+1},\dots,u_n)\,d\lambda_k(u_{k+1},\dots,u_n)
\end{align*}
is finite for $\mu_k$-almost every $y\in E_k$, agrees with $h_k(a_1,\dots,a_{k-1},y)$ for $\mu_k$-almost every $y\in E_k$, and satisfies
\begin{align*}
h_{k-1}(a)=\int_{E_k}\tilde g_a(y)\,d\mu_k(y).
\end{align*}
For $a\in B_{k-1}$, let $C_a\subset E_k$ be a measurable set with $\mu_k(C_a)=1$ on which $\tilde g_a$ is finite and agrees with $h_k(a_1,\dots,a_{k-1},\cdot)$. For any $y,y'\in C_a$, both tail integrals are finite, so linearity of the integral gives
\begin{align*}
|\tilde g_a(y)-\tilde g_a(y')|=\left|\int_{E_{k+1}\times\cdots\times E_n} [f(a_1,\dots,a_{k-1},y,u_{k+1},\dots,u_n)-f(a_1,\dots,a_{k-1},y',u_{k+1},\dots,u_n)]\, d(\mu_{k+1}\otimes\cdots\otimes\mu_n)(u_{k+1},\dots,u_n)\right|.
\end{align*}
Taking the absolute value inside the integral and using the bounded differences hypothesis for the $k$-th coordinate gives
\begin{align*}
|\tilde g_a(y)-\tilde g_a(y')|\le \int_{E_{k+1}\times\cdots\times E_n} c_k\, d(\mu_{k+1}\otimes\cdots\otimes\mu_n)(u_{k+1},\dots,u_n)=c_k.
\end{align*}
Thus the $\mu_k$-essential range of $\tilde g_a$ has diameter at most $c_k$. Moreover, for every $y\in C_a$,
\begin{align*}
\left|\tilde g_a(y)-\int_{E_k}\tilde g_a(y')\,d\mu_k(y')\right|\le \int_{E_k}|\tilde g_a(y)-\tilde g_a(y')|\,d\mu_k(y')\le c_k.
\end{align*}
Since $Y_k$ is independent of $\mathcal F_{k-1}$, the previous representation gives
\begin{align*}
M_{k-1}=\int_{E_k}\tilde g_{(Y_1,\dots,Y_{k-1})}(y)\,d\mu_k(y)
\end{align*}
almost surely, while
\begin{align*}
M_k=\tilde g_{(Y_1,\dots,Y_{k-1})}(Y_k)
\end{align*}
almost surely on the event where $(Y_1,\dots,Y_{k-1})\in B_{k-1}$ and $Y_k\in C_{(Y_1,\dots,Y_{k-1})}$. This event has probability one by independence and Fubini's theorem. Therefore
\begin{align*}
|M_k-M_{k-1}|\le c_k
\end{align*}
almost surely.
[guided]
The delicate point is that conditional expectations are only defined up to null sets, so we must not argue using the ordinary pointwise range of an arbitrarily modified version of $h_k$. We instead prove an essential-range statement on the full-measure set where the partial-integral formula is valid.
Fix $k\in\{1,\dots,n\}$. Define $\nu_{k-1}:=\mu_1\otimes\cdots\otimes\mu_{k-1}$, with $\nu_0$ the unit measure on a one-point space, and define $\lambda_k:=\mu_{k+1}\otimes\cdots\otimes\mu_n$, with $\lambda_n$ the unit measure on a one-point space. Fubini's theorem applies because $f$ is integrable with respect to $\mu_1\otimes\cdots\otimes\mu_n$. Hence there is a measurable set $B_{k-1}\subset E_1\times\cdots\times E_{k-1}$ with $\nu_{k-1}(B_{k-1})=1$ such that, for each $a=(a_1,\dots,a_{k-1})\in B_{k-1}$, the extended-real measurable map $\tilde g_a:E_k\to[-\infty,\infty]$ defined by
\begin{align*}
\tilde g_a(y):=\int_{E_{k+1}\times\cdots\times E_n}f(a_1,\dots,a_{k-1},y,u_{k+1},\dots,u_n)\,d\lambda_k(u_{k+1},\dots,u_n)
\end{align*}
is finite for $\mu_k$-almost every $y$, agrees with $h_k(a_1,\dots,a_{k-1},y)$ for $\mu_k$-almost every $y$, and satisfies
\begin{align*}
h_{k-1}(a)=\int_{E_k}\tilde g_a(y)\,d\mu_k(y).
\end{align*}
Choose a measurable full-measure set $C_a\subset E_k$ on which these identities hold. For $y,y'\in C_a$, the two points differ only in the $k$-th coordinate, while the tail variables are the same. The bounded differences hypothesis gives a pointwise bound by $c_k$ inside the tail integral, so
\begin{align*}
|\tilde g_a(y)-\tilde g_a(y')|\le \int_{E_{k+1}\times\cdots\times E_n}c_k\,d\lambda_k(u_{k+1},\dots,u_n)=c_k.
\end{align*}
Now fix $y\in C_a$ and average this inequality over $y'\in E_k$. Since $C_a$ has full $\mu_k$-measure,
\begin{align*}
\left|\tilde g_a(y)-\int_{E_k}\tilde g_a(y')\,d\mu_k(y')\right|\le \int_{E_k}|\tilde g_a(y)-\tilde g_a(y')|\,d\mu_k(y')\le c_k.
\end{align*}
Finally take $a=(Y_1,\dots,Y_{k-1})$. The event $a\in B_{k-1}$ has probability one, and by independence and Fubini's theorem the event $Y_k\in C_a$ also has probability one. On the intersection of these two events,
\begin{align*}
M_k=\tilde g_a(Y_k)
\end{align*}
and
\begin{align*}
M_{k-1}=\int_{E_k}\tilde g_a(y)\,d\mu_k(y).
\end{align*}
Therefore $|M_k-M_{k-1}|\le c_k$ almost surely.
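To see the bound in a concrete case, assume for illustration only (this choice is not part of the theorem) that $E_j=[0,c_j]\subset\mathbb R$ and $f(y_1,\dots,y_n)=y_1+\cdots+y_n$, and write $m_j:=\int_{E_j}u\,d\mu_j(u)$. Then
\begin{align*}
\tilde g_a(y)=\sum_{j=1}^{k-1}a_j+y+\sum_{j=k+1}^n m_j,\qquad h_{k-1}(a)=\sum_{j=1}^{k-1}a_j+m_k+\sum_{j=k+1}^n m_j,
\end{align*}
so that $M_k-M_{k-1}=Y_k-m_k$ almost surely. Since both $Y_k$ and $m_k$ lie in $[0,c_k]$, this gives $|M_k-M_{k-1}|\le c_k$, in agreement with the general argument.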
[/guided]
[/step]
[step:Apply the martingale bounded differences inequality]
The martingale $(M_k,\mathcal F_k)_{k=0}^n$ is a finite real-valued martingale on the probability space $(\Omega,\mathcal F,\mathbb P)$. It is integrable, satisfies $M_0=\mathbb E[Z]$ and $M_n=Z$, and has deterministic increment bounds
\begin{align*}
|M_k-M_{k-1}|\le c_k
\end{align*}
almost surely for every $k\in\{1,\dots,n\}$. These are exactly the hypotheses of the [Azuma-Hoeffding inequality](/theorems/6071). Applying that inequality gives, for every $t\ge 0$,
\begin{align*}
\mathbb P(M_n-M_0\ge t)\le\exp\left(-\frac{t^2}{2\sum_{k=1}^n c_k^2}\right).
\end{align*}
Substituting $M_n=Z$ and $M_0=\mathbb E[Z]$ yields
\begin{align*}
\mathbb P\left(Z-\mathbb E[Z]\ge t\right)
\le
\exp\left(-\frac{t^2}{2\sum_{k=1}^n c_k^2}\right).
\end{align*}
If $\sum_{k=1}^n c_k^2=0$, then $c_k=0$ for every $k$, so the increment bound gives $M_k=M_{k-1}$ almost surely for every $k$. Hence $Z=M_n=M_0=\mathbb E[Z]$ almost surely, and the stated convention for the right-hand side gives the same conclusion. This completes the proof.
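For a sense of scale, suppose (as an illustrative assumption only) that $c_k=1$ for every $k$, so that $\sum_{k=1}^n c_k^2=n$. Taking $t=s\sqrt n$ with $s\ge 0$ in the bound just proved gives
\begin{align*}
\mathbb P\left(Z-\mathbb E[Z]\ge s\sqrt n\right)\le\exp\left(-\frac{(s\sqrt n)^2}{2n}\right)=\exp\left(-\frac{s^2}{2}\right),
\end{align*}
so upward deviations of order $\sqrt n$ already have Gaussian-type tails. For instance, $s=3$ gives the bound $e^{-9/2}\approx 0.011$, independently of $n$.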
[/step]