[guided]Assume Zorn's Lemma. Let $\{A_i : i \in I\}$ be a family of non-empty sets indexed by a set $I$. Our task is to produce a function $f: I \to \bigcup_i A_i$ with $f(i) \in A_i$ for every $i$.
The strategy is to construct $f$ one coordinate at a time, using Zorn's Lemma to guarantee we can extend indefinitely. The natural object to put into Zorn's Lemma is not a choice function itself but a **partial** choice function — a function defined only on a subset of $I$ — because partial choice functions can be non-trivially compared and extended. Define
\begin{align*}
X := \left\{ (J, f) \;:\; J \subseteq I,\ f: J \to \bigcup_{i \in I} A_i,\ f(j) \in A_j \text{ for all } j \in J \right\},
\end{align*}
the set of all partial choice functions. We want Zorn's Lemma to hand us a maximal partial choice function; then we argue that a maximal partial choice function must actually be total.
Is $X$ non-empty? If $I = \varnothing$, the empty function $\varnothing \to \varnothing$ is vacuously a choice function and we are done. Otherwise pick any $i_0 \in I$ and any $a_0 \in A_{i_0}$ (which exists because $A_{i_0}$ is non-empty). Then $(\{i_0\}, i_0 \mapsto a_0) \in X$.
We order $X$ by extension:
\begin{align*}
(J, f) \leq (J', f') \iff J \subseteq J' \text{ and } f'\big|_J = f.
\end{align*}
A larger partial choice function has a larger domain and agrees with the smaller one where they overlap. This is a partial order: reflexivity and antisymmetry are direct, and transitivity chains through nested domains.
Zorn's Lemma requires that every chain in $X$ have an upper bound. Let $\mathcal{C} = \{(J_k, f_k) : k \in K\}$ be a chain in $X$. The obvious candidate upper bound is the "union" of $\mathcal{C}$:
\begin{align*}
J^* := \bigcup_{k \in K} J_k, \qquad f^*: J^* \to \bigcup_{i \in I} A_i, \qquad f^*(j) := f_k(j) \text{ for any } k \text{ with } j \in J_k.
\end{align*}
Is $f^*$ well-defined? The concern is that if $j \in J_k \cap J_{k'}$ for two different $k$, the values $f_k(j)$ and $f_{k'}(j)$ might disagree. This is where the chain property saves us: either $(J_k, f_k) \leq (J_{k'}, f_{k'})$ or $(J_{k'}, f_{k'}) \leq (J_k, f_k)$, and in either case the larger function restricts to the smaller one on the shared domain, so $f_k(j) = f_{k'}(j)$. And $f^*(j) \in A_j$ by inheritance from $f_k$. So $(J^*, f^*) \in X$, and it dominates every element of $\mathcal{C}$.
Zorn's Lemma now produces a maximal element $(J, f) \in X$. The final claim is that $J = I$. Suppose not — suppose $J \subsetneq I$. Then there is some $i_0 \in I \setminus J$. Because $A_{i_0}$ is non-empty, we can pick $a_0 \in A_{i_0}$. Now define a strictly larger partial choice function:
\begin{align*}
J' := J \cup \{i_0\}, \qquad f'(j) := \begin{cases} f(j) & \text{if } j \in J, \\ a_0 & \text{if } j = i_0. \end{cases}
\end{align*}
Then $(J', f') \in X$ strictly extends $(J, f)$, contradicting maximality. Therefore $J = I$ and $f: I \to \bigcup_i A_i$ is a choice function.
Notice that the only place we "chose" anything in this proof was the final step, where we picked $a_0 \in A_{i_0}$. This is *one* choice from *one* non-empty set — which ZF allows without AC. All the heavy lifting of making infinitely many choices simultaneously has been delegated to Zorn's Lemma.[/guided]