[guided]The concatenation case is the only one that requires a substantive idea beyond Boolean combination, so we elaborate.
*Why do all $n + 1$ splits suffice?* By definition, $w \in AB$ means there exist $u \in A$ and $v \in B$ with $w = uv$. The concatenation $uv$ has length $|u| + |v| = |w|$, so $|u| \in \{0, 1, \ldots, |w|\}$. Once $|u| = j$ is fixed, the decomposition $w = uv$ determines $u$ uniquely (the first $j$ letters of $w$) and $v$ uniquely (the last $|w| - j$ letters of $w$). So the finite set of splits — one for each $j \in \{0, 1, \ldots, |w|\}$ — is exhaustive.
*Why does this give a* total *algorithm?* The key point is that the search is over a *finite* set whose size ($|w| + 1$) is a computable function of the input. This is what distinguishes the computable case from the computably-enumerable case (where the search would be over an unbounded candidate space and we would need to dovetail). Here, the indicator $\mathbb{1}_{AB}$ is constructed as a bounded loop; a bounded loop over total sub-routines always halts.
*What if $A$ or $B$ contains $\varepsilon$?* If $\varepsilon \in A$, the split $(j = 0)$ has $u = \varepsilon \in A$ and $v = w$, so $w \in AB$ whenever $w \in B$. This is consistent: $AB \supseteq \{\varepsilon\} B = B$. If $\varepsilon \in B$, the split $(j = n)$ has $v = \varepsilon \in B$ and $u = w$, so $AB \supseteq A \{\varepsilon\} = A$.
The register-machine pseudocode in the claim uses bounded iteration (for $j = 0, 1, \ldots, n$). While the course's formal model sometimes prefers primitive recursion or explicit register transitions, bounded iteration is expressible in both, because one can maintain a countdown register that is decremented from $n$ to $0$ and test for $\varepsilon$ at each step. Each loop body is a finite composition of: prefix/suffix extraction (a word-surgery primitive, implementable by bounded letter-by-letter copying), calls to the given machines $M_A, M_B$ (total by hypothesis), and constant-time tests of register contents against the single letters $a$ and $\varepsilon$. The resulting register machine is total.
*Running total computable functions in sequence.* One subtle point: $M_A$ may clobber scratch registers during its computation. Before calling $M_A$, we save the full input $w$ (and the index $j$ if needed) to fresh scratch registers not used by $M_A$; after $M_A$ halts, we restore them. This is the same save/restore discipline used in Step 2's proof for intersection, and it is needed whenever a computation is composed of multiple sub-computations sharing register space.[/guided]