[proofplan]
The proof proceeds by expressing the set $\{|f_k - f| > \varepsilon\}$ as a subset of the "tail" sets $\bigcup_{j \ge k}\{|f_j - f| > \varepsilon\}$, whose measures decrease to the measure of the limsup. Almost everywhere convergence forces the limsup to be a null set (up to modification on a null set), and the finiteness of $\mu(X)$ ensures that the continuity-from-above property of measures applies to the nested intersection.
[/proofplan]
[step:Introduce the tail sets and express the failure set of a.e. convergence]
Fix $\varepsilon > 0$. For each $k \ge 1$, define
\begin{align*}
B_k := \{x \in X : |f_k(x) - f(x)| > \varepsilon\}.
\end{align*}
We must show $\mu(B_k) \to 0$ as $k \to \infty$.
Define the tail sets
\begin{align*}
G_m := \bigcup_{k=m}^{\infty} B_k = \{x \in X : |f_k(x) - f(x)| > \varepsilon \text{ for some } k \ge m\}.
\end{align*}
The sequence $(G_m)_{m \ge 1}$ is nested: $G_1 \supset G_2 \supset G_3 \supset \cdots$, since removing the first term from a union can only shrink it.
[/step]
[step:Identify the intersection $\bigcap_{m} G_m$ with the failure set of pointwise convergence]
The intersection of all tail sets is
\begin{align*}
\bigcap_{m=1}^{\infty} G_m = \limsup_{k \to \infty} B_k = \{x \in X : x \in B_k \text{ for infinitely many } k\}.
\end{align*}
If $x \in \bigcap_{m=1}^{\infty} G_m$, then for every $m \ge 1$ there exists $k \ge m$ with $|f_k(x) - f(x)| > \varepsilon$. This means $f_k(x) \not\to f(x)$, i.e., $x$ belongs to the set where the sequence does not converge to $f$.
By hypothesis, $f_k \to f$ $\mu$-a.e., so the set $N := \{x \in X : f_k(x) \not\to f(x)\}$ satisfies $\mu(N) = 0$. Since $\bigcap_{m=1}^{\infty} G_m \subset N$, monotonicity of $\mu$ gives
\begin{align*}
\mu\!\left(\bigcap_{m=1}^{\infty} G_m\right) \le \mu(N) = 0.
\end{align*}
[guided]
The set $\bigcap_m G_m$ captures all points where the deviation $|f_k - f|$ exceeds $\varepsilon$ infinitely often. Any such point is a point of non-convergence of the sequence. Almost everywhere convergence says the set of non-convergent points is $\mu$-null, so $\bigcap_m G_m$ is contained in a null set.
Note that the containment $\bigcap_m G_m \subset N$ is the direction that matters. The reverse containment can fail: a point might fail to converge by oscillating with amplitude less than $\varepsilon$, and such a point would be in $N$ but not in $\bigcap_m G_m$. This does not affect the argument.
[/guided]
[/step]
[step:Apply continuity from above to conclude $\mu(B_k) \to 0$]
The sequence $(G_m)_{m \ge 1}$ is a decreasing sequence of $\mathcal{F}$-measurable sets. By the continuity-from-above property of measures, if $\mu(G_1) < \infty$, then
\begin{align*}
\mu\!\left(\bigcap_{m=1}^{\infty} G_m\right) = \lim_{m \to \infty} \mu(G_m).
\end{align*}
The finiteness condition $\mu(G_1) < \infty$ holds because $G_1 \subset X$ and $\mu(X) < \infty$ by hypothesis. Combined with the previous step:
\begin{align*}
\lim_{m \to \infty} \mu(G_m) = \mu\!\left(\bigcap_{m=1}^{\infty} G_m\right) = 0.
\end{align*}
Since $B_m \subset G_m$ for every $m$ (as $B_m$ is one term in the union defining $G_m$), monotonicity of $\mu$ gives $\mu(B_m) \le \mu(G_m)$, so
\begin{align*}
0 \le \mu(B_m) \le \mu(G_m) \to 0 \quad \text{as } m \to \infty.
\end{align*}
By the squeeze principle, $\mu(B_m) \to 0$. Since $\varepsilon > 0$ was arbitrary, this establishes $f_k \xrightarrow{\mu} f$.
[guided]
The hypothesis $\mu(X) < \infty$ is consumed exactly here — in the continuity-from-above step. Continuity from above requires $\mu(G_1) < \infty$, which is guaranteed by $G_1 \subset X$ and $\mu(X) < \infty$.
This hypothesis cannot be dropped. The standard counterexample on $(\mathbb{R}, \mathcal{B}(\mathbb{R}), \mathcal{L}^1)$ is $f_k := \mathbb{1}_{[k, k+1]}$. Then $f_k(x) \to 0$ for every $x \in \mathbb{R}$ (since each fixed $x$ eventually lies outside $[k, k+1]$), so $f_k \to 0$ everywhere, hence $\mathcal{L}^1$-a.e. However, $\mathcal{L}^1(\{|f_k| > 1/2\}) = \mathcal{L}^1([k, k+1]) = 1$ for every $k$, so convergence in measure fails. The measure space $(\mathbb{R}, \mathcal{B}(\mathbb{R}), \mathcal{L}^1)$ has $\mathcal{L}^1(\mathbb{R}) = \infty$, confirming that finiteness is essential.
[/guided]
[/step]