[proofplan]
The forward direction ($\Rightarrow$) extracts a rapidly converging subsequence: from any subsequence, select a further subsequence $(f_{m_k})$ with $\mu(\{|f_{m_k} - f| \ge 2^{-k}\}) < 2^{-k}$, then apply Borel-Cantelli to get pointwise a.e. convergence. The converse ($\Leftarrow$) is by contrapositive: failure of convergence in measure produces a subsequence $(f_{n_j})$ with $\mu(\{|f_{n_j} - f| \ge \varepsilon_0\}) \ge \delta_0$ for all $j$. If a further subsequence $(f_{n_{j_i}})$ converged a.e., continuity from above applied to the decreasing tail sets would force $\mu(\{|f_{n_{j_i}} - f| \ge \varepsilon_0\}) \to 0$, contradicting the persistent lower bound. This last step requires one of the tail sets to have finite measure, which holds in particular when $\mu(X) < \infty$.
[/proofplan]
[step:Forward direction — extract a rapidly convergent further subsequence]
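Assume $f_n \xrightarrow{\mu} f$. Let $(f_{n_j})_{j=1}^\infty$ be an arbitrary subsequence; then $f_{n_j} \xrightarrow{\mu} f$ as well, since for each $\varepsilon > 0$ the sequence $\mu(\{|f_{n_j} - f| \ge \varepsilon\})$ is a subsequence of a sequence tending to $0$. Set $j_0 := 0$ and, for each $k \in \mathbb{N}$, choose $j_k$ inductively with $j_k > j_{k-1}$ so that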
\begin{align*}
\mu\!\left(\left\{x \in X : |f_{n_{j_k}}(x) - f(x)| \ge 2^{-k}\right\}\right) < 2^{-k}.
\end{align*}
Set $m_k := n_{j_k}$.
[guided]
We need thresholds tending to zero (for pointwise convergence) and summable measure bounds (for Borel-Cantelli). The choice $\varepsilon_k = 2^{-k}$ with bound $2^{-k}$ achieves both, since $\sum_{k=1}^\infty 2^{-k} = 1 < \infty$. The inductive selection works because, for fixed $k$, $\mu(\{|f_{n_j} - f| \ge 2^{-k}\}) \to 0$ as $j \to \infty$, so the bound $< 2^{-k}$ holds for all sufficiently large $j$, in particular for some $j > j_{k-1}$.
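One concrete way to make the selection explicit, as a sketch: the convention $j_0 := 0$ and the choice of the smallest admissible index are assumptions added here for definiteness, and any admissible index would serve equally well.
\begin{align*}
j_k := \min\left\{ j > j_{k-1} : \mu\!\left(\left\{x \in X : |f_{n_j}(x) - f(x)| \ge 2^{-k}\right\}\right) < 2^{-k} \right\}, \qquad k = 1, 2, \dots
\end{align*}
The set on the right is nonempty because, for fixed $k$, these measures tend to $0$ as $j \to \infty$. Since $j_1 < j_2 < \cdots$ by construction, $(f_{n_{j_k}})_k$ is a genuine subsequence.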
[/guided]
[/step]
[step:Apply Borel-Cantelli to get pointwise a.e. convergence]
Define $A_k := \{x \in X : |f_{m_k}(x) - f(x)| \ge 2^{-k}\}$, so that $\mu(A_k) < 2^{-k}$ and $\sum_{k=1}^\infty \mu(A_k) < 1 < \infty$. The limsup set $A := \bigcap_{K=1}^\infty \bigcup_{k \ge K} A_k$ is contained in $\bigcup_{k \ge K} A_k$ for every $K$, hence
\begin{align*}
\mu(A) \le \mu\!\left(\bigcup_{k \ge K} A_k\right) \le \sum_{k=K}^\infty \mu(A_k) < 2^{-(K-1)} \quad \text{for all } K,
\end{align*}
by [monotonicity and countable subadditivity](/theorems/1081). Sending $K \to \infty$ gives $\mu(A) = 0$. For $x \notin A$, there exists $K_0(x)$ with $x \notin A_k$, i.e. $|f_{m_k}(x) - f(x)| < 2^{-k}$, for all $k \ge K_0(x)$; since $2^{-k} \to 0$, this gives $f_{m_k}(x) \to f(x)$.
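The bound $2^{-(K-1)}$ is a geometric tail. Spelled out as a short worked computation (using only $\mu(A_k) < 2^{-k}$):
\begin{align*}
\sum_{k=K}^\infty \mu(A_k) < \sum_{k=K}^\infty 2^{-k} = 2^{-K} \sum_{l=0}^\infty 2^{-l} = 2^{-K} \cdot 2 = 2^{-(K-1)}.
\end{align*}
For instance, $K = 1$ recovers the total bound $\sum_{k=1}^\infty \mu(A_k) < 1$, while $K = 11$ already gives $\mu(A) < 2^{-10}$.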
[/step]
[step:Converse (contrapositive) — extract a bad subsequence]
Suppose $f_n \not\xrightarrow{\mu} f$. There exist $\varepsilon_0 > 0$, $\delta_0 > 0$, and indices $n_1 < n_2 < \cdots$ with
\begin{align*}
\mu(G_j) \ge \delta_0 \quad \text{for all } j, \quad G_j := \{x : |f_{n_j}(x) - f(x)| \ge \varepsilon_0\}.
\end{align*}
[guided]
The negation of convergence in measure: there exists $\varepsilon_0 > 0$ for which $\mu(\{|f_n - f| \ge \varepsilon_0\}) \not\to 0$. A non-negative sequence failing to converge to zero has a subsequence bounded below by some $\delta_0 > 0$.
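Written out with quantifiers, as a sketch (the symbol $N$ and the recursive choice of indices are notation introduced here for illustration):
\begin{align*}
f_n \xrightarrow{\mu} f \quad &\Longleftrightarrow \quad \forall \varepsilon > 0 \;\; \forall \delta > 0 \;\; \exists N \;\; \forall n \ge N : \; \mu(\{|f_n - f| \ge \varepsilon\}) < \delta, \\
f_n \not\xrightarrow{\mu} f \quad &\Longleftrightarrow \quad \exists \varepsilon_0 > 0 \;\; \exists \delta_0 > 0 \;\; \forall N \;\; \exists n \ge N : \; \mu(\{|f_n - f| \ge \varepsilon_0\}) \ge \delta_0.
\end{align*}
Applying the second line with $N = 1$ yields $n_1$, and applying it with $N = n_j + 1$ yields $n_{j+1} > n_j$, which produces the strictly increasing indices $n_1 < n_2 < \cdots$.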
[/guided]
[/step]
[step:Show no further subsequence converges $\mu$-a.e. to $f$]
Let $(f_{n_{j_i}})$ be any further subsequence. Relabel $G_i := \{x : |f_{n_{j_i}}(x) - f(x)| \ge \varepsilon_0\}$ (this is the set $G_{j_i}$ of the previous step), so that $\mu(G_i) \ge \delta_0$ for all $i$. Suppose for contradiction that $f_{n_{j_i}} \to f$ $\mu$-a.e., and set $B := \{x : f_{n_{j_i}}(x) \not\to f(x)\}$, so that $\mu(B) = 0$.
For $x \in X \setminus B$, $f_{n_{j_i}}(x) \to f(x)$, so $|f_{n_{j_i}}(x) - f(x)| < \varepsilon_0$ for all sufficiently large $i$. Define
\begin{align*}
E_I := \{x \in X \setminus B : |f_{n_{j_i}}(x) - f(x)| < \varepsilon_0 \;\text{for all}\; i \ge I\}.
\end{align*}
Then $E_I$ increases with $\bigcup_{I=1}^\infty E_I = X \setminus B$. For $i \ge I$, $G_i \cap E_I = \varnothing$ (by definition of $E_I$), so $G_i \setminus B \subset (X \setminus B) \setminus E_I$. Since $\mu(B) = 0$:
\begin{align*}
\delta_0 \le \mu(G_i) = \mu(G_i \setminus B) \le \mu\!\left((X \setminus B) \setminus E_I\right) \quad \text{for all } i \ge I.
\end{align*}
Define $D_I := (X \setminus B) \setminus E_I$. This is a decreasing sequence with $D_I \downarrow \varnothing$ (since $E_I \uparrow X \setminus B$), and $\mu(D_I) \ge \delta_0 > 0$ for all $I$.
Since $D_1 = (X \setminus B) \setminus E_1 = \bigcup_{i=1}^\infty (G_i \setminus B)$ and $\mu(B) = 0$, we have $\mu(D_1) = \mu(\bigcup_{i=1}^\infty G_i)$. To apply [continuity from above](/theorems/1082) to $D_I \downarrow \varnothing$, some set in the sequence, say $D_1$, must have finite measure. This holds if $\mu(X) < \infty$ and, more generally, whenever $\mu(\bigcup_{i=1}^\infty G_i) < \infty$; the two conditions are not equivalent. (Alternatively, [continuity from below](/theorems/1082) gives $\mu(E_I) \uparrow \mu(X \setminus B)$, and $\mu(D_I) = \mu(X \setminus B) - \mu(E_I)$ provided $\mu(X \setminus B) < \infty$.)
When $\mu(X) < \infty$: $\mu(D_1) \le \mu(X) < \infty$, and [continuity from above](/theorems/1082) gives $\mu(D_I) \to \mu(\varnothing) = 0$. But $\mu(D_I) \ge \delta_0 > 0$ for all $I$, a contradiction.
When $\mu(X) = \infty$ but $\mu(D_1) < \infty$ (which holds whenever $\bigcup_i G_i$ has finite measure): the same argument gives $\mu(D_I) \to 0$, contradiction.
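Therefore, whenever $\mu(D_1) < \infty$ for the further subsequence in question (in particular whenever $\mu(X) < \infty$), no further subsequence of $(f_{n_j})$ converges to $f$ $\mu$-a.e.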
[guided]
The key sets are $E_I$ (the "safe zone" where $|f_{n_{j_i}} - f| < \varepsilon_0$ for all $i \ge I$) and $D_I = (X \setminus B) \setminus E_I$ (the "danger zone"). The a.e. convergence forces $E_I \uparrow X \setminus B$, so $D_I \downarrow \varnothing$. Meanwhile, the persistent superlevel-set bound forces $\mu(D_I) \ge \delta_0 > 0$ for every $I$.
Continuity of measures from above requires one set in the decreasing sequence to have finite measure. When $\mu(D_1) < \infty$, we get $\mu(D_I) \to 0$, directly contradicting $\mu(D_I) \ge \delta_0$.
The condition $\mu(D_1) < \infty$ holds automatically when $\mu(X) < \infty$ (i.e., the measure space is finite): then $D_1 \subset X$ gives $\mu(D_1) \le \mu(X) < \infty$, and the contradiction follows in one line. More generally, it holds whenever $\mu(\bigcup_i G_i) < \infty$. Membership in $L^p$ for some $p < \infty$ is not enough on its own: the [Markov inequality](/theorems/514) gives $\mu(G_i) \le \varepsilon_0^{-p}\|f_{n_{j_i}} - f\|_{L^p}^p < \infty$ for each $i$, so every finite union of the $G_i$ has finite measure, but the countable union may still have infinite measure without further assumptions.
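A standard example, not taken from the argument above, suggests that the finiteness hypothesis cannot simply be dropped. Take $X = \mathbb{R}$ with Lebesgue measure $\lambda$ (an assumed setting, chosen only for illustration), $f \equiv 0$, and $f_n := \mathbf{1}_{[n, n+1]}$:
\begin{align*}
f_n(x) = 0 \;\text{ for all } n > x \quad &\Longrightarrow \quad f_n(x) \to 0 \;\text{ for every } x \in \mathbb{R}, \\
\lambda\!\left(\left\{|f_n - 0| \ge \tfrac12\right\}\right) = \lambda([n, n+1]) = 1 \quad &\Longrightarrow \quad f_n \not\xrightarrow{\lambda} 0.
\end{align*}
Here every subsequence converges pointwise everywhere, yet convergence in measure fails. In the notation of this step with $\varepsilon_0 = \tfrac12$ and $B = \varnothing$, the sets $D_I = \bigcup_{i \ge I} [n_{j_i}, n_{j_i} + 1]$ decrease to $\varnothing$ while $\lambda(D_I) = \infty$ for every $I$, so continuity from above does not apply.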
[/guided]
[/step]