[proofplan]
We eliminate cuts by a double induction on the complexity of the cut formula and on the heights of the two derivations immediately above the cut. The proof first establishes a local reduction lemma: every topmost cut can either be commuted upward past a non-principal last inference or, when the cut formula is principal on both sides, replaced by cuts on strictly simpler formulas. The reductions are checked rule-by-rule, with special attention to implication because the right rule discharges an antecedent assumption while the left rule consumes an implication in a single-conclusion sequent. Since every reduction strictly decreases the lexicographic cut measure and all reductions preserve the LJ restriction that the succedent has at most one formula, repeated reduction terminates in a cut-free derivation of the original end-sequent.
[/proofplan]
[step:Fix the sequent calculus and the induction measure]
We work in the standard first-order intuitionistic sequent calculus $LJ_{\mathcal{L}}$ with structural rules on antecedents, initial sequents $P \Rightarrow P$ for atomic formulas $P$, the usual left and right rules for $\land,\lor,\to,\forall,\exists$, and the falsity constant $\bot$, whose left rule permits the inference from no premises to $\Gamma,\bot \Rightarrow \Delta$. If negation is present as primitive notation, its rules are understood as the corresponding rules for $\neg B := B \to \bot$. A sequent is an ordered expression $\Gamma \Rightarrow \Delta$, where $\Gamma$ is a finite multiset of formulas and $\Delta$ is either empty or a singleton multiset. The cut rule is
\begin{align*}
\frac{\Gamma \Rightarrow A \qquad \Pi, A \Rightarrow \Delta}{\Gamma, \Pi \Rightarrow \Delta}.
\end{align*}
For a formula $A$, define its degree $d(A) \in \mathbb{N}$ recursively by declaring atomic formulas to have degree $0$ and setting
\begin{align*}
d(B \circ C) &= 1 + d(B) + d(C)
\end{align*}
for each binary connective $\circ \in \{\land,\lor,\to\}$, and
\begin{align*}
d(\neg B) &= 1 + d(B), \\
d(\forall x\,B) &= 1 + d(B), \\
d(\exists x\,B) &= 1 + d(B).
\end{align*}
For a derivation $\pi$, let $h(\pi) \in \mathbb{N}$ be its height. For a cut whose immediate subderivations are
\begin{align*}
\pi_1 &: \Gamma \Rightarrow A, \\
\pi_2 &: \Pi, A \Rightarrow \Delta,
\end{align*}
define its cut measure by
\begin{align*}
m(\pi_1,\pi_2,A) := \bigl(d(A), h(\pi_1)+h(\pi_2)\bigr),
\end{align*}
ordered lexicographically in $\mathbb{N}\times\mathbb{N}$.
A cut is called topmost if neither $\pi_1$ nor $\pi_2$ contains a cut. It is enough to show that every topmost cut can be replaced by a derivation of the same end-sequent whose cuts all have strictly smaller cut measure. Once this is proved, any derivation is transformed by choosing a cut maximal with respect to being closest to the leaves, reducing it, and repeating. The lexicographic order on $\mathbb{N}\times\mathbb{N}$ is well-founded, so the process terminates.
[/step]
[step:Commute a topmost cut above non-principal inferences]
Assume a topmost cut
\begin{align*}
\frac{\pi_1:\Gamma \Rightarrow A \qquad \pi_2:\Pi,A \Rightarrow \Delta}{\Gamma,\Pi \Rightarrow \Delta}
\end{align*}
has cut formula $A$. We prove the permutation statement for every LJ rule whose displayed occurrence of $A$ is not principal in the last inference.
[claim:Non-principal permutation]
Let $R$ be the last inference of one immediate premise of the topmost cut, and suppose the displayed occurrence of $A$ is not principal for $R$. Then the cut can be replaced by derivations obtained by cutting $\Gamma \Rightarrow A$ with each premise of $R$, or cutting each premise of $R$ with $\Pi,A \Rightarrow \Delta$, and then reapplying $R$. Every new cut has cut formula $A$ and strictly smaller height sum.
[/claim]
[proof]
First suppose $R$ is applied in the left premise and has premises $S_1,\dots,S_k$ and conclusion $\Gamma \Rightarrow A$, where $k \in \{0,1,2\}$. Because $A$ is not principal, the succedent occurrence $A$ occurs unchanged in every premise of $R$ that has a succedent; write those premises as $\Gamma_i \Rightarrow A$ with derivations $\pi_i$ for $1 \le i \le k$. Replace the original cut by the $k$ cuts
\begin{align*}
\frac{\pi_i:\Gamma_i \Rightarrow A \qquad \pi_2:\Pi,A \Rightarrow \Delta}{\Gamma_i,\Pi \Rightarrow \Delta}
\end{align*}
and then apply the same rule $R$ to the antecedent contexts $\Gamma_i,\Pi$. For one-premise left rules this is exactly the displayed permutation; for two-premise rules, such as $\land$-right or $\lor$-left when the cut formula is carried passively, both premises are cut before the rule is reapplied. The height of each $\pi_i$ is strictly smaller than $h(\pi_1)$, hence each new cut has measure $(d(A),h(\pi_i)+h(\pi_2)) < (d(A),h(\pi_1)+h(\pi_2))$.
Now suppose $R$ is applied in the right premise and has premises $T_1,\dots,T_k$ and conclusion $\Pi,A \Rightarrow \Delta$, with $A$ not principal. Then the antecedent occurrence $A$ is carried unchanged into every premise in which it occurs. Write those premises as $\Pi_i,A \Rightarrow \Delta_i$ with derivations $\rho_i$. Replace the original cut by the cuts
\begin{align*}
\frac{\pi_1:\Gamma \Rightarrow A \qquad \rho_i:\Pi_i,A \Rightarrow \Delta_i}{\Gamma,\Pi_i \Rightarrow \Delta_i}
\end{align*}
and reapply the same rule $R$ to obtain $\Gamma,\Pi \Rightarrow \Delta$. This covers all right logical rules, all left logical rules in which the cut occurrence is merely part of the context, and the quantifier rules. In the quantifier cases, if the eigenvariable of $R$ occurs free in the opposite cut premise, first rename it to a variable fresh for the two derivations and the end-sequent; eigenvariable renaming preserves derivability and restores the side condition before the rule is reapplied.
For exchange, the same exchange is reapplied after the cut. For weakening on a formula different from the displayed cut occurrence, the weakening is reapplied after the cut. For contraction on formulas different from the displayed cut occurrence, contract the duplicated non-cut formula after performing the cut above the contraction.
It remains to record the structural cases involving the cut formula itself. If the right premise ends by weakening that introduces the displayed $A$, then its premise is already a derivation of $\Pi \Rightarrow \Delta$, so the whole cut is deleted and no new cut is introduced. If the right premise ends by contraction of two displayed occurrences of $A$,
\begin{align*}
\frac{\rho:\Pi,A,A \Rightarrow \Delta}{\Pi,A \Rightarrow \Delta},
\end{align*}
replace the original cut by two successive cuts on $A$:
\begin{align*}
\frac{\pi_1:\Gamma \Rightarrow A \qquad \rho:\Pi,A,A \Rightarrow \Delta}{\Gamma,\Pi,A \Rightarrow \Delta}
\end{align*}
followed by
\begin{align*}
\frac{\pi_1:\Gamma \Rightarrow A \qquad \Gamma,\Pi,A \Rightarrow \Delta}{\Gamma,\Gamma,\Pi \Rightarrow \Delta}.
\end{align*}
A final contraction on the two copies of $\Gamma$ gives $\Gamma,\Pi \Rightarrow \Delta$. The first new cut has smaller right-premise height because it uses the premise of the contraction. The second new cut has the same cut formula, but its right premise is the result of the first reduced cut and is handled by the inner induction on the height sum; equivalently, in the standard Gentzen double induction this case is reduced by first applying the induction hypothesis to the smaller first cut and then cutting the resulting premise, whose measure is smaller in the multiset ordering. No rule creates a second succedent formula, since LJ right rules have singleton succedents and all structural rules are antecedent rules. This proves the claim.
[/proof]
[/step]
[step:Replace principal propositional cuts by cuts on simpler formulas]
It remains to consider a topmost cut whose cut formula is principal in both immediate last inferences. We check the propositional connectives.
For conjunction, suppose the cut formula is $B\land C$, the left derivation ends with conjunction-right,
\begin{align*}
\frac{\pi_B:\Gamma \Rightarrow B \qquad \pi_C:\Gamma \Rightarrow C}{\Gamma \Rightarrow B\land C},
\end{align*}
and the right derivation ends with conjunction-left,
\begin{align*}
\frac{\rho:\Pi,B,C \Rightarrow \Delta}{\Pi,B\land C \Rightarrow \Delta}.
\end{align*}
Replace the cut by two cuts:
\begin{align*}
\frac{\pi_B:\Gamma \Rightarrow B \qquad \rho:\Pi,B,C \Rightarrow \Delta}{\Gamma,\Pi,C \Rightarrow \Delta}
\end{align*}
and then
\begin{align*}
\frac{\pi_C:\Gamma \Rightarrow C \qquad \Gamma,\Pi,C \Rightarrow \Delta}{\Gamma,\Gamma,\Pi \Rightarrow \Delta},
\end{align*}
followed by contraction on the antecedent copies of $\Gamma$. The new cut formulas are $B$ and $C$, and
\begin{align*}
d(B) < d(B\land C), \qquad d(C) < d(B\land C).
\end{align*}
For disjunction, if the left derivation ends with disjunction-right, say
\begin{align*}
\frac{\pi_B:\Gamma \Rightarrow B}{\Gamma \Rightarrow B\lor C},
\end{align*}
and the right derivation ends with disjunction-left,
\begin{align*}
\frac{\rho_B:\Pi,B \Rightarrow \Delta \qquad \rho_C:\Pi,C \Rightarrow \Delta}{\Pi,B\lor C \Rightarrow \Delta},
\end{align*}
replace the cut by
\begin{align*}
\frac{\pi_B:\Gamma \Rightarrow B \qquad \rho_B:\Pi,B \Rightarrow \Delta}{\Gamma,\Pi \Rightarrow \Delta}.
\end{align*}
The case where the left derivation ends with the other disjunction-right rule uses $\rho_C$ instead. The new cut formula is either $B$ or $C$, hence has smaller degree.
For implication, suppose the cut formula is $B\to C$, the left derivation ends with implication-right,
\begin{align*}
\frac{\pi_C:\Gamma,B \Rightarrow C}{\Gamma \Rightarrow B\to C},
\end{align*}
and the right derivation ends with implication-left,
\begin{align*}
\frac{\rho_B:\Pi \Rightarrow B \qquad \rho_C:\Sigma,C \Rightarrow \Delta}{\Pi,\Sigma,B\to C \Rightarrow \Delta}.
\end{align*}
Replace the original cut by first cutting $\rho_B$ with $\pi_C$ on $B$:
\begin{align*}
\frac{\rho_B:\Pi \Rightarrow B \qquad \pi_C:\Gamma,B \Rightarrow C}{\Pi,\Gamma \Rightarrow C},
\end{align*}
and then cutting the result with $\rho_C$ on $C$:
\begin{align*}
\frac{\Pi,\Gamma \Rightarrow C \qquad \rho_C:\Sigma,C \Rightarrow \Delta}{\Pi,\Gamma,\Sigma \Rightarrow \Delta}.
\end{align*}
The new cut formulas are $B$ and $C$, both of strictly smaller degree than $B\to C$. The intermediate succedent is the singleton $C$, and the final succedent is exactly $\Delta$, so the LJ single-conclusion condition is preserved.
Atomic cuts are disposed of at the initial-sequent boundary. If the left premise is the initial derivation $A \Rightarrow A$, then the cut conclusion is obtained from the right premise by adding the remaining antecedent context from the initial sequent and then applying exchange and weakening as needed. If the right premise is the initial derivation $A \Rightarrow A$, then the cut conclusion is obtained from the left premise in the same way. Thus an atomic topmost cut either disappears or is reduced to structural rules only.
For falsity, a cut on $\bot$ cannot have $\bot$ principal on the right of the left premise, because LJ has no right introduction rule for $\bot$. If the right premise uses the $\bot$-left rule with displayed principal formula $\bot$, then the conclusion $\Gamma,\Pi \Rightarrow \Delta$ is obtained directly by the same $\bot$-left rule after weakening the antecedent context; no new cut is introduced.
For primitive negation, use the convention fixed at the start: $\neg B$ abbreviates $B\to\bot$. Its principal reduction is therefore exactly the implication reduction above with $C=\bot$. The new cut formula is $B$, and the possible cut on $\bot$ is eliminated by the falsity case just described.
[/step]
[step:Replace principal quantifier cuts using fresh eigenvariables]
We first record the substitution fact needed for the quantifier reductions.
[claim:Substitution through derivations]
Let $z$ be a variable, let $t$ be a term free for $z$ in every formula of a sequent $S$, and let $\sigma$ be a cut-free $LJ_{\mathcal{L}}$ derivation of $S$. After renaming bound variables and eigenvariables in $\sigma$ so that no eigenvariable is free in $t$ and no bound variable of $\sigma$ occurs in $t$, uniform substitution of $t$ for the free variable $z$ in every formula of $\sigma$ yields a cut-free derivation $\sigma[t/z]$ of the substituted sequent $S[t/z]$.
[/claim]
[proof]
The proof is by induction on $h(\sigma)$. Initial sequents are preserved because substitution sends $P \Rightarrow P$ to $P[t/z] \Rightarrow P[t/z]$, and the $\bot$-left axiom is preserved because substitution does not change the rule form. Structural rules are preserved because substitution acts formula-by-formula on antecedent multisets.
For propositional rules, substitution commutes with the displayed connective and therefore sends each instance of a rule to the same rule applied to the substituted premises. For quantifier rules, the preliminary renaming ensures the eigenvariable is fresh for $t$ and for the substituted side context. Thus the side condition in $\forall$-right and $\exists$-left remains valid after substitution. Since $t$ is free for $z$ in the relevant formulas, no variable capture occurs. The induction hypothesis supplies substituted derivations of the premises, and applying the same last rule gives the substituted conclusion. This completes the induction.
[/proof]
Consider a principal cut on a universal formula $\forall x\,B(x)$. The left derivation ends with universal-right,
\begin{align*}
\frac{\pi:\Gamma \Rightarrow B(y)}{\Gamma \Rightarrow \forall x\,B(x)},
\end{align*}
where $y$ is not free in $\Gamma$, and the right derivation ends with universal-left,
\begin{align*}
\frac{\rho:\Pi,B(t) \Rightarrow \Delta}{\Pi,\forall x\,B(x) \Rightarrow \Delta},
\end{align*}
where $t$ is a term free for $x$ in $B(x)$. Rename eigenvariables in $\pi$ so that they are fresh for $t$, $\rho$, and the end-sequent. By the substitution claim, substituting $t$ for $y$ throughout $\pi$ gives a cut-free derivation
\begin{align*}
\pi[t/y]:\Gamma \Rightarrow B(t),
\end{align*}
because $y$ is not free in $\Gamma$. Replace the original cut by
\begin{align*}
\frac{\pi[t/y]:\Gamma \Rightarrow B(t) \qquad \rho:\Pi,B(t) \Rightarrow \Delta}{\Gamma,\Pi \Rightarrow \Delta}.
\end{align*}
The new cut formula $B(t)$ has smaller degree than $\forall x\,B(x)$.
For an existential formula $\exists x\,B(x)$, the principal case has the left derivation ending with existential-right,
\begin{align*}
\frac{\pi:\Gamma \Rightarrow B(t)}{\Gamma \Rightarrow \exists x\,B(x)},
\end{align*}
and the right derivation ending with existential-left,
\begin{align*}
\frac{\rho:\Pi,B(y) \Rightarrow \Delta}{\Pi,\exists x\,B(x) \Rightarrow \Delta},
\end{align*}
where $y$ is not free in $\Pi$ or $\Delta$. Rename eigenvariables in $\rho$ so that they are fresh for $t$, $\pi$, and the end-sequent. By the substitution claim, substituting $t$ for $y$ throughout $\rho$ gives a cut-free derivation
\begin{align*}
\rho[t/y]:\Pi,B(t) \Rightarrow \Delta,
\end{align*}
because $y$ is not free in $\Pi$ or $\Delta$. Replace the original cut by
\begin{align*}
\frac{\pi:\Gamma \Rightarrow B(t) \qquad \rho[t/y]:\Pi,B(t) \Rightarrow \Delta}{\Gamma,\Pi \Rightarrow \Delta}.
\end{align*}
The new cut formula $B(t)$ has strictly smaller degree than $\exists x\,B(x)$. The renaming and substitution claim verify all eigenvariable side conditions and ensure that the end-sequent is unchanged.
[/step]
[step:Eliminate all cuts by well-founded induction]
We now prove the theorem. Let $\pi$ be an $LJ_{\mathcal{L}}$-derivation of $\Gamma \Rightarrow \Delta$ with cuts. If $\pi$ has no cuts, there is nothing to prove.
Otherwise choose a cut in $\pi$ such that no cut occurs above it. Its immediate premises are cut-free subderivations, so it is a topmost cut. By the preceding reductions, this cut can be replaced by a derivation of the same local end-sequent in which every newly introduced cut has strictly smaller cut measure: non-principal permutations lower the height-sum coordinate, principal propositional and quantifier reductions lower the formula-degree coordinate, initial and falsity cases delete the cut, and contraction on a displayed cut formula is handled by the inner induction described in the structural case. The replacement does not alter the surrounding part of $\pi$, so it gives a derivation of the original end-sequent $\Gamma \Rightarrow \Delta$ with a strictly smaller multiset of cut measures in the standard multiset extension of the lexicographic order on $\mathbb{N}\times\mathbb{N}$.
The lexicographic order on $\mathbb{N}\times\mathbb{N}$ is well-founded, and its finite multiset extension is well-founded. Therefore this repeated replacement process cannot continue indefinitely. It terminates in a derivation of $\Gamma \Rightarrow \Delta$ containing no occurrence of the cut rule.
Every reduction used above is an $LJ_{\mathcal{L}}$ derivation transformation: antecedents remain finite multisets, succedents remain empty or singleton, and the quantifier cases preserve the eigenvariable conditions. Hence the final derivation is a cut-free $LJ_{\mathcal{L}}$ derivation of the same end-sequent. This proves Gentzen cut elimination for LJ.
[/step]