[proofplan]
Both parts rest on a single key lemma: an integrable random variable that is non-negative $\mathbb{P}$-almost surely has non-negative expectation. We prove this by decomposing into positive and negative parts: the hypothesis forces the negative part $Z^-$ to vanish outside a $\mathbb{P}$-null set, and a simple-function argument shows that any non-negative measurable function vanishing outside a null set integrates to zero, leaving $\mathbb{E}[Z] = \mathbb{E}[Z^+] \geq 0$. Monotonicity (i) follows by applying the lemma to $Z := Y - X$, which is non-negative $\mathbb{P}$-a.s. by hypothesis, together with linearity of expectation. The triangle inequality (ii) follows from two applications of (i) to the pointwise chain $-|X| \leq X \leq |X|$, valid at every $\omega \in \Omega$, which sandwiches $\mathbb{E}[X]$ between $-\mathbb{E}[|X|]$ and $\mathbb{E}[|X|]$.
[/proofplan]
[step:Prove that any non-negative a.s. integrable random variable has non-negative expectation]
Let $Z \in L^1(\Omega, \mathcal{F}, \mathbb{P})$ satisfy $Z \geq 0$ $\mathbb{P}$-almost surely. Define the positive and negative parts
\begin{align*}
Z^+ &:= \max(Z, 0) : \Omega \to [0, \infty), \\
Z^- &:= \max(-Z, 0) : \Omega \to [0, \infty),
\end{align*}
both $\mathcal{F}$-measurable, satisfying $Z = Z^+ - Z^-$ and $\mathbb{E}[Z] = \int_\Omega Z^+ \, d\mathbb{P} - \int_\Omega Z^- \, d\mathbb{P}$. Let $N := \{Z < 0\} \in \mathcal{F}$. The hypothesis $Z \geq 0$ $\mathbb{P}$-a.s. gives $\mathbb{P}(N) = 0$. Since $\{Z^- > 0\} = \{Z < 0\} = N$, the function $Z^-$ is strictly positive only on $N$.
We show $\int_\Omega Z^- \, d\mathbb{P} = 0$. Let $\phi = \sum_{k=1}^m c_k \mathbf{1}_{E_k}$ be any $\mathcal{F}$-simple function with $E_k \in \mathcal{F}$ pairwise disjoint, $c_k \geq 0$, and $0 \leq \phi \leq Z^-$. Fix $k$ with $c_k > 0$. For $\omega \in E_k$, disjointness gives $\phi(\omega) = c_k$, so $Z^-(\omega) \geq c_k > 0$ and hence $\omega \in N$; thus $E_k \subseteq N$, and monotonicity of $\mathbb{P}$ gives $\mathbb{P}(E_k) \leq \mathbb{P}(N) = 0$. Terms with $c_k = 0$ contribute nothing to the sum, so
\begin{align*}
\int_\Omega \phi \, d\mathbb{P} = \sum_{k=1}^m c_k \mathbb{P}(E_k) = 0.
\end{align*}
Taking the supremum over all such $\phi$ in the definition of the Lebesgue integral gives $\int_\Omega Z^- \, d\mathbb{P} = 0$. Therefore
\begin{align*}
\mathbb{E}[Z] = \int_\Omega Z^+ \, d\mathbb{P} - \int_\Omega Z^- \, d\mathbb{P} = \int_\Omega Z^+ \, d\mathbb{P} \geq 0,
\end{align*}
since $Z^+ \geq 0$ everywhere and the Lebesgue integral of a non-negative function is, by definition, a supremum of non-negative quantities.
[guided]
The conceptual heart of the proof is that the Lebesgue integral is insensitive to what happens on null sets. The hypothesis "$Z \geq 0$ $\mathbb{P}$-a.s." says that $Z$ is non-negative everywhere except on a set of probability zero, and we want to conclude $\mathbb{E}[Z] \geq 0$. The challenge is that the Lebesgue integral is defined over all of $\Omega$, so we must show the "bad" set $N = \{Z < 0\}$ contributes nothing.
**Setting up the decomposition.** Define
\begin{align*}
Z^+ := \max(Z, 0) : \Omega \to [0, \infty), \qquad Z^- := \max(-Z, 0) : \Omega \to [0, \infty).
\end{align*}
Both are $\mathcal{F}$-measurable (compositions of the measurable function $Z$ with the continuous maps $t \mapsto \max(t,0)$ and $t \mapsto \max(-t,0)$). They satisfy $Z = Z^+ - Z^-$, and by the definition of the Lebesgue integral for signed integrable functions:
\begin{align*}
\mathbb{E}[Z] = \int_\Omega Z^+ \, d\mathbb{P} - \int_\Omega Z^- \, d\mathbb{P}.
\end{align*}
The hypothesis $Z \geq 0$ $\mathbb{P}$-a.s. means that the set $N := \{Z < 0\} \in \mathcal{F}$ satisfies $\mathbb{P}(N) = 0$. Crucially, $\{Z^- > 0\} = \{-Z > 0\} = \{Z < 0\} = N$, so $Z^-$ is strictly positive only inside $N$; outside $N$, $Z^- = 0$ identically.
**Why does a function supported on a null set integrate to zero?** The Lebesgue integral $\int_\Omega Z^- \, d\mathbb{P}$ is defined as
\begin{align*}
\int_\Omega Z^- \, d\mathbb{P} = \sup\!\left\{\int_\Omega \phi \, d\mathbb{P} : \phi \text{ is } \mathcal{F}\text{-simple}, \; 0 \leq \phi \leq Z^-\right\}.
\end{align*}
Take any such simple function $\phi = \sum_{k=1}^m c_k \mathbf{1}_{E_k}$ with $E_k \in \mathcal{F}$ pairwise disjoint and $c_k \geq 0$. For any $\omega \notin N$, we have $Z^-(\omega) = 0$, so the bound $\phi(\omega) \leq Z^-(\omega) = 0$ together with $\phi \geq 0$ forces $\phi(\omega) = 0$. Consequently, whenever $c_k > 0$ the indicator $\mathbf{1}_{E_k}$ must vanish outside $N$, i.e., $E_k \subseteq N$. (If $c_k = 0$, the set $E_k$ may be arbitrary, but the term $c_k \mathbb{P}(E_k)$ is zero anyway.) By monotonicity of the measure $\mathbb{P}$, for each $k$ with $c_k > 0$:
\begin{align*}
\mathbb{P}(E_k) \leq \mathbb{P}(N) = 0.
\end{align*}
Therefore every simple function $\phi \leq Z^-$ integrates to
\begin{align*}
\int_\Omega \phi \, d\mathbb{P} = \sum_{k=1}^m c_k \mathbb{P}(E_k) = 0.
\end{align*}
Since every admissible simple function has integral $0$, and $\phi \equiv 0$ is admissible, the supremum is $0$: $\int_\Omega Z^- \, d\mathbb{P} = 0$.
**Why is $\int_\Omega Z^+ \, d\mathbb{P} \geq 0$?** The integral is the supremum of values $\sum_k c_k \mathbb{P}(E_k)$, each of which is $\geq 0$ (since $c_k \geq 0$ and $\mathbb{P}(E_k) \geq 0$). This set of values is non-empty, because the zero function is an admissible simple function, and the supremum of a non-empty set of non-negative numbers is non-negative.
**Conclusion.** Combining:
\begin{align*}
\mathbb{E}[Z] = \int_\Omega Z^+ \, d\mathbb{P} - \underbrace{\int_\Omega Z^- \, d\mathbb{P}}_{=\,0} = \int_\Omega Z^+ \, d\mathbb{P} \geq 0.
\end{align*}
The lesson: null sets are invisible to the Lebesgue integral, so "$Z \geq 0$ a.s." is just as good as "$Z \geq 0$ everywhere" for the purpose of integration.
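**A concrete instance (illustration only).** The following small example is not part of the proof; the probability space and the choice of $Z$ are assumptions made purely to show the decomposition at work. Take $\Omega = [0,1]$ with its Borel $\sigma$-algebra and Lebesgue measure as $\mathbb{P}$, and let $Z := \mathbf{1}_{(0,1]} - \mathbf{1}_{\{0\}}$, so that $Z(0) = -1$ and $Z = 1$ elsewhere. Then $N = \{Z < 0\} = \{0\}$ is a $\mathbb{P}$-null set, and
\begin{align*}
Z^+ &= \mathbf{1}_{(0,1]}, & \int_\Omega Z^+ \, d\mathbb{P} &= \mathbb{P}((0,1]) = 1, \\
Z^- &= \mathbf{1}_{\{0\}}, & \int_\Omega Z^- \, d\mathbb{P} &= \mathbb{P}(\{0\}) = 0, \\
\mathbb{E}[Z] &= 1 - 0 = 1 \geq 0.
\end{align*}
Replacing the value $Z(0) = -1$ by any other negative number changes $Z^-$ only on $N$ and leaves $\mathbb{E}[Z]$ unchanged.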
[/guided]
[/step]
[step:Prove monotonicity by reducing to the non-negative case via the difference $Z = Y - X$]
Define
\begin{align*}
Z : \Omega &\to \mathbb{R} \\
\omega &\mapsto Y(\omega) - X(\omega).
\end{align*}
Since $X, Y \in L^1(\Omega, \mathcal{F}, \mathbb{P})$, the function $Z$ is $\mathcal{F}$-measurable (as the difference of two measurable functions) and satisfies
\begin{align*}
\mathbb{E}[|Z|] \leq \mathbb{E}[|Y| + |X|] = \mathbb{E}[|Y|] + \mathbb{E}[|X|] < \infty,
\end{align*}
so $Z \in L^1(\Omega, \mathcal{F}, \mathbb{P})$. The hypothesis $X \leq Y$ $\mathbb{P}$-a.s. gives $Z = Y - X \geq 0$ $\mathbb{P}$-a.s. By the result of the preceding step, $\mathbb{E}[Z] \geq 0$. Applying linearity of expectation (see [Properties of Expectation](/theorems/1117)):
\begin{align*}
\mathbb{E}[Y] - \mathbb{E}[X] = \mathbb{E}[Y - X] = \mathbb{E}[Z] \geq 0,
\end{align*}
hence $\mathbb{E}[X] \leq \mathbb{E}[Y]$.
[guided]
Rather than comparing $\mathbb{E}[X]$ and $\mathbb{E}[Y]$ directly, we manufacture a single non-negative random variable whose expectation encodes the comparison.
**Defining $Z$ and verifying integrability.** Set
\begin{align*}
Z : \Omega &\to \mathbb{R}, \\
\omega &\mapsto Y(\omega) - X(\omega).
\end{align*}
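Before taking its expectation, we must confirm $Z \in L^1(\Omega, \mathcal{F}, \mathbb{P})$. Measurability: $Z$ is the pointwise difference of the $\mathcal{F}$-measurable functions $Y$ and $X$, hence $\mathcal{F}$-measurable. Integrability: the triangle inequality for $|\cdot|$ gives $|Z| = |Y - X| \leq |Y| + |X|$ pointwise. Monotonicity of the integral for non-negative measurable functions (immediate from the supremum definition, since every simple function below $|Z|$ is also below $|Y| + |X|$, so no circularity with part (i) arises) gives the inequality below, and linearity of integration gives the equality: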
\begin{align*}
\mathbb{E}[|Z|] \leq \mathbb{E}[|Y| + |X|] = \mathbb{E}[|Y|] + \mathbb{E}[|X|] < \infty,
\end{align*}
where finiteness uses $X, Y \in L^1$. Hence $Z \in L^1(\Omega, \mathcal{F}, \mathbb{P})$.
**Establishing non-negativity of $Z$.** There exists $A \in \mathcal{F}$ with $\mathbb{P}(A) = 1$ on which $X(\omega) \leq Y(\omega)$. On $A$, we have $Z(\omega) = Y(\omega) - X(\omega) \geq 0$. Since $\{Z < 0\} \subseteq \Omega \setminus A$, monotonicity of $\mathbb{P}$ gives $\mathbb{P}(\{Z < 0\}) \leq \mathbb{P}(\Omega \setminus A) = 0$. So $Z \geq 0$ $\mathbb{P}$-a.s.
**Applying the lemma and linearity.** The result of the preceding step gives $\mathbb{E}[Z] \geq 0$. By linearity of expectation ([Properties of Expectation](/theorems/1117)), applied to $Y$ and $-X$ in $L^1$:
\begin{align*}
\mathbb{E}[Y] - \mathbb{E}[X] = \mathbb{E}[Y - X] = \mathbb{E}[Z] \geq 0.
\end{align*}
Rearranging: $\mathbb{E}[X] \leq \mathbb{E}[Y]$.
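**A concrete instance (illustration only).** As a sanity check, here is a small example in which $X \leq Y$ fails at exactly one point; the space and the functions are assumptions chosen for illustration and do not come from the theorem. Take $\Omega = [0,1]$ with Lebesgue measure as $\mathbb{P}$, let $X(\omega) := \omega^2$, and let $Y(\omega) := \omega$ for $\omega \neq \tfrac12$ with $Y(\tfrac12) := 0$. Then $Z = Y - X$ equals $\omega(1 - \omega) \geq 0$ for $\omega \neq \tfrac12$ and $Z(\tfrac12) = -\tfrac14$, so $\{Z < 0\} = \{\tfrac12\}$ is a null set and
\begin{align*}
\mathbb{E}[X] &= \int_0^1 \omega^2 \, d\omega = \tfrac13, \\
\mathbb{E}[Y] &= \int_0^1 \omega \, d\omega = \tfrac12, \\
\mathbb{E}[Z] &= \mathbb{E}[Y] - \mathbb{E}[X] = \tfrac12 - \tfrac13 = \tfrac16 \geq 0.
\end{align*}
The modified value $Y(\tfrac12) = 0$ does not affect $\mathbb{E}[Y]$, in line with the lemma of the preceding step.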
[/guided]
[/step]
[step:Deduce the triangle inequality from the pointwise sandwich $-|X| \leq X \leq |X|$ and monotonicity]
For every $\omega \in \Omega$, the elementary inequality $-|t| \leq t \leq |t|$ (valid for all $t \in \mathbb{R}$), applied to $t = X(\omega)$, gives
\begin{align*}
-|X(\omega)| \leq X(\omega) \leq |X(\omega)|.
\end{align*}
Since this holds at every point of $\Omega$, it holds $\mathbb{P}$-almost surely.
We verify that all three functions $-|X|$, $X$, and $|X|$ lie in $L^1(\Omega, \mathcal{F}, \mathbb{P})$. The function $X \in L^1$ by hypothesis. The map $|X|: \omega \mapsto |X(\omega)|$ is $\mathcal{F}$-measurable and satisfies $\mathbb{E}[|X|] = \|X\|_{L^1(\Omega,\mathcal{F},\mathbb{P})} < \infty$, so $|X| \in L^1$. By scalar multiplication, $-|X| \in L^1$.
Applying monotonicity from part (i) twice:
- To the pair $(-|X|,\, X)$: since $-|X| \leq X$ $\mathbb{P}$-a.s., part (i) gives $\mathbb{E}[-|X|] \leq \mathbb{E}[X]$, and by linearity ([Properties of Expectation](/theorems/1117)), $-\mathbb{E}[|X|] \leq \mathbb{E}[X]$.
- To the pair $(X,\, |X|)$: since $X \leq |X|$ $\mathbb{P}$-a.s., part (i) gives $\mathbb{E}[X] \leq \mathbb{E}[|X|]$.
Combining:
\begin{align*}
-\mathbb{E}[|X|] \leq \mathbb{E}[X] \leq \mathbb{E}[|X|].
\end{align*}
Since $|X| \geq 0$ everywhere, the result of Step 1 (with $Z := |X|$) gives $\mathbb{E}[|X|] \geq 0$. For any $r \in \mathbb{R}$ and $s \geq 0$, the equivalence $-s \leq r \leq s \iff |r| \leq s$ applied with $r = \mathbb{E}[X]$ and $s = \mathbb{E}[|X|]$ yields
\begin{align*}
|\mathbb{E}[X]| \leq \mathbb{E}[|X|].
\end{align*}
[guided]
The triangle inequality $|\mathbb{E}[X]| \leq \mathbb{E}[|X|]$ is, for a real number, equivalent to the two-sided bound $-\mathbb{E}[|X|] \leq \mathbb{E}[X] \leq \mathbb{E}[|X|]$. Our strategy is to produce exactly this sandwich by applying the monotonicity we just proved.
**The pointwise sandwich.** Every real number $t$ satisfies $-|t| \leq t \leq |t|$, as one checks separately for $t \geq 0$ and $t < 0$. Setting $t = X(\omega)$ for an arbitrary $\omega \in \Omega$:
\begin{align*}
-|X(\omega)| \leq X(\omega) \leq |X(\omega)| \quad \text{for all } \omega \in \Omega.
\end{align*}
This holds pointwise everywhere, hence $\mathbb{P}$-almost surely.
**Checking integrability.** To invoke part (i) for each inequality in the sandwich, we need $-|X|$, $X$, and $|X|$ all in $L^1(\Omega, \mathcal{F}, \mathbb{P})$. The function $X \in L^1$ by hypothesis. The function $|X|: \omega \mapsto |X(\omega)|$ is $\mathcal{F}$-measurable (composition of measurable $X$ with the continuous map $t \mapsto |t|$), and $\mathbb{E}[|X|] = \|X\|_{L^1(\Omega, \mathcal{F}, \mathbb{P})} < \infty$ by the membership $X \in L^1$. So $|X| \in L^1$, and consequently $-|X| = (-1) \cdot |X| \in L^1$ by scalar multiplication.
**Two applications of monotonicity.**
*Lower bound.* Since $-|X| \leq X$ $\mathbb{P}$-a.s. and $-|X|, X \in L^1$, part (i) gives $\mathbb{E}[-|X|] \leq \mathbb{E}[X]$. By linearity of expectation ([Properties of Expectation](/theorems/1117)):
\begin{align*}
-\mathbb{E}[|X|] = \mathbb{E}[-|X|] \leq \mathbb{E}[X].
\end{align*}
*Upper bound.* Since $X \leq |X|$ $\mathbb{P}$-a.s. and $X, |X| \in L^1$, part (i) gives:
\begin{align*}
\mathbb{E}[X] \leq \mathbb{E}[|X|].
\end{align*}
Combining: $-\mathbb{E}[|X|] \leq \mathbb{E}[X] \leq \mathbb{E}[|X|]$.
**Non-negativity of $\mathbb{E}[|X|]$.** Since $|X(\omega)| \geq 0$ for every $\omega \in \Omega$, the result of Step 1 applied to $Z := |X|$ (which lies in $L^1$ and is non-negative everywhere, hence $\mathbb{P}$-a.s.) gives $\mathbb{E}[|X|] \geq 0$.
**Extracting $|\mathbb{E}[X]|$.** For any $r \in \mathbb{R}$ and $s \geq 0$, the equivalence $-s \leq r \leq s \iff |r| \leq s$ holds. With $r := \mathbb{E}[X] \in \mathbb{R}$ and $s := \mathbb{E}[|X|] \geq 0$, the sandwich gives:
\begin{align*}
|\mathbb{E}[X]| \leq \mathbb{E}[|X|]. \qquad \square
\end{align*}
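**A concrete instance (illustration only).** The following example, whose space and random variable are assumptions chosen for illustration, shows that the inequality can be strict when $X$ changes sign. Take $\Omega = [0,1]$ with Lebesgue measure as $\mathbb{P}$ and $X(\omega) := \omega - \tfrac14$. Then
\begin{align*}
\mathbb{E}[X] &= \int_0^1 \left(\omega - \tfrac14\right) d\omega = \tfrac12 - \tfrac14 = \tfrac14, \\
\mathbb{E}[|X|] &= \int_0^{1/4} \left(\tfrac14 - \omega\right) d\omega + \int_{1/4}^1 \left(\omega - \tfrac14\right) d\omega = \tfrac{1}{32} + \tfrac{9}{32} = \tfrac{5}{16}, \\
|\mathbb{E}[X]| &= \tfrac{4}{16} \leq \tfrac{5}{16} = \mathbb{E}[|X|].
\end{align*}
The gap of $\tfrac{1}{16}$ is exactly $2\,\mathbb{E}[X^-]$, the cancellation between positive and negative parts that $|\mathbb{E}[X]|$ allows and $\mathbb{E}[|X|]$ does not.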
[/guided]
[/step]