[proofplan]
We translate propositional intuitionistic natural deduction derivations into the corresponding typed terms for the simply typed lambda calculus with products, sums, and the empty type. Principal natural-deduction reductions are exactly the beta reductions for these typed terms, and normal derivations correspond to beta-normal terms. We then use the strong normalization theorem for this typed calculus: every well-typed term has no infinite reduction sequence. Starting from the translated term, repeatedly reduce any beta-redex until no redex remains; finiteness is guaranteed by strong normalization, and translating the resulting beta-normal term back gives a normal derivation with the same open assumptions and conclusion.
[/proofplan]
[step:Translate derivations into typed terms and identify detours with beta-redexes]
Let $\mathcal{F}$ denote the set of formulas of propositional intuitionistic logic generated from propositional variables and $\bot$ by $\to$, $\wedge$, and $\vee$. Let $\mathcal{T}$ denote the set of simple types generated from atomic types and the empty type by the type constructors $\to$, $\times$, and $+$. Define the type assignment map $T: \mathcal{F} \to \mathcal{T}$ recursively by sending propositional variables to atomic types, $\bot$ to the empty type, $B \to C$ to the function type $T(B) \to T(C)$, $B \wedge C$ to the product type $T(B) \times T(C)$, and $B \vee C$ to the sum type $T(B) + T(C)$.
For a finite natural-deduction derivation $\Pi$ of conclusion $A$ from open assumptions $\Gamma$, let $T(\Gamma)$ denote the finite typed context obtained by assigning each open assumption occurrence of formula $B$ a variable of type $T(B)$. Define its term translation $\tau(\Pi)$ to be a term of type $T(A)$ in the context $T(\Gamma)$:
\begin{align*}
\tau(\Pi) &\in \mathsf{Term}(\Gamma,A).
\end{align*}
The translation is defined by the Curry-Howard recursion on the last rule of $\Pi$: assumptions translate to variables, $\to I$ to lambda abstraction, $\to E$ to application, $\wedge I$ to pairing, $\wedge E$ to projection, $\vee I_1$ and $\vee I_2$ to the two injections, $\vee E$ to case analysis, and $\bot E$ to the eliminator from the empty type.
Under this translation, a detour in $\Pi$ is exactly a beta-redex in $\tau(\Pi)$. In the following schematic reductions, $x$ and $y$ denote term variables, while $s$, $t$, $u$, and $v$ denote well-typed terms of the indicated types in the ambient typing contexts. The three principal correspondences are
\begin{align*}
(\lambda x:T(B).\,s)\,t &\longrightarrow_\beta s[t/x],\\
\pi_1(\langle s,t\rangle) &\longrightarrow_\beta s,\\
\pi_2(\langle s,t\rangle) &\longrightarrow_\beta t,
\end{align*}
and
\begin{align*}
\operatorname{case}(\operatorname{inl}(s);x.u;y.v) &\longrightarrow_\beta u[s/x],\\
\operatorname{case}(\operatorname{inr}(t);x.u;y.v) &\longrightarrow_\beta v[t/y].
\end{align*}
Thus reducing a natural-deduction detour is the derivation-level image of one beta reduction of the translated term.
[guided]
The purpose of the translation is to use a normalization theorem whose substitution bookkeeping already handles duplication. Let $\mathcal{F}$ be the set of formulas generated from propositional variables and $\bot$ by $\to$, $\wedge$, and $\vee$. Let $\mathcal{T}$ be the set of simple types generated from atomic types and the empty type by the type constructors $\to$, $\times$, and $+$. Define the map $T: \mathcal{F} \to \mathcal{T}$ from formulas to simple types recursively: propositional variables become atomic types, $\bot$ becomes the empty type, $B \to C$ becomes $T(B) \to T(C)$, $B \wedge C$ becomes $T(B) \times T(C)$, and $B \vee C$ becomes $T(B) + T(C)$.
Now let $\Pi$ be a finite derivation of conclusion $A$ from open assumptions $\Gamma$. Let $T(\Gamma)$ be the finite context assigning a variable of type $T(B)$ to each open assumption occurrence of formula $B$. Its Curry-Howard translation is a typed term of type $T(A)$ in that context:
\begin{align*}
\tau(\Pi) &\in \mathsf{Term}(\Gamma,A).
\end{align*} The recursion follows the last inference rule: an assumption is translated to its variable; implication introduction is translated to $\lambda$-abstraction; implication elimination is translated to application; conjunction introduction is translated to pairing; conjunction eliminations are translated to projections; disjunction introductions are translated to injections; [disjunction elimination](/theorems/4625) is translated to case analysis; and ex falso is translated to the eliminator from the empty type.
With this translation, the local natural-deduction detours become exactly the beta-redexes of the typed term calculus. In the schematic reductions below, $x$ and $y$ are term variables, and $s$, $t$, $u$, and $v$ are well-typed terms in the ambient contexts needed to make the displayed terms well typed. The implication detour becomes
\begin{align*}
(\lambda x:T(B).\,s)\,t &\longrightarrow_\beta s[t/x].
\end{align*}
The conjunction detours become
\begin{align*}
\pi_1(\langle s,t\rangle) &\longrightarrow_\beta s,\\
\pi_2(\langle s,t\rangle) &\longrightarrow_\beta t.
\end{align*}
The disjunction detours become
\begin{align*}
\operatorname{case}(\operatorname{inl}(s);x.u;y.v) &\longrightarrow_\beta u[s/x],\\
\operatorname{case}(\operatorname{inr}(t);x.u;y.v) &\longrightarrow_\beta v[t/y].
\end{align*}
This is the point at which the earlier naive counting argument is avoided: beta substitution may duplicate a term, but the strong normalization theorem for typed terms is designed to control exactly that phenomenon.
[/guided]
[/step]
[step:Record the principal reductions and their preservation properties]
The standard principal reductions are the following local transformations. In this step, $\Sigma$, $\Sigma_1$, $\Sigma_2$, $\Theta$, $\Theta_1$, and $\Theta_2$ denote finite natural-deduction derivations of the formulas specified below.
For implication, suppose a derivation of $B \to C$ ends by $\to I$ from a subderivation $\Sigma$ of $C$ with discharged assumption occurrence $B$, and this formula is immediately eliminated by $\to E$ against a derivation $\Theta$ of $B$:
\begin{align*}
\frac{
\frac{\Sigma}{B \to C}\ \to I
\qquad
\Theta
}{
C
}\ \to E.
\end{align*}
The reduction replaces the whole displayed derivation by the result of substituting $\Theta$ for every discharged occurrence of the assumption $B$ in $\Sigma$.
For conjunction, if
\begin{align*}
\frac{
\Theta_1 \qquad \Theta_2
}{
B \wedge C
}\ \wedge I
\end{align*}
is immediately followed by the first projection, the reduction replaces the whole derivation by $\Theta_1$; if it is immediately followed by the second projection, the reduction replaces the whole derivation by $\Theta_2$.
For disjunction, if
\begin{align*}
\frac{\Theta}{B}\ \vee I_1
\end{align*}
introduces $B \vee C$ and this occurrence is immediately used as the principal premise of a $\vee E$ with branch derivations $\Sigma_1$ from assumption $B$ to conclusion $D$ and $\Sigma_2$ from assumption $C$ to conclusion $D$, then the reduction replaces the whole derivation by the result of substituting $\Theta$ for the discharged assumption $B$ in $\Sigma_1$. The case of $\vee I_2$ is symmetric: substitute the derivation of $C$ into the second branch.
Each principal reduction preserves the final conclusion and the open assumptions. In the implication and disjunction cases, substitution removes exactly the assumption occurrences discharged by the eliminated introduction or elimination rule and imports precisely the open assumptions of the substituted derivation. In the conjunction case, projection simply selects one of the two already available component derivations.
[guided]
The reduction rules are local: they replace a small piece of a derivation by another derivation with the same local conclusion. In this guided explanation, $\Sigma$, $\Sigma_1$, $\Sigma_2$, $\Theta$, $\Theta_1$, and $\Theta_2$ denote finite natural-deduction derivations of the formulas specified in the corresponding displayed rules.
For implication, suppose a derivation of $B \to C$ ends by $\to I$ from a subderivation $\Sigma$ of $C$ with discharged assumption occurrence $B$, and this formula is immediately eliminated by $\to E$ against a derivation $\Theta$ of $B$:
\begin{align*}
\frac{
\frac{\Sigma}{B \to C}\ \to I
\qquad
\Theta
}{
C
}\ \to E.
\end{align*}
The reduction replaces the whole displayed derivation by the result of substituting $\Theta$ for every discharged occurrence of the assumption $B$ in $\Sigma$. The reason this preserves open assumptions is that the temporary occurrences of $B$ were discharged by the implication introduction and therefore are not open assumptions of the whole detour; after substitution, those discharged occurrences are removed and the open assumptions of $\Theta$ are imported exactly where the proof of $B$ is used. The local conclusion remains $C$.
For conjunction, suppose the derivation first introduces a conjunction from derivations $\Theta_1$ of $B$ and $\Theta_2$ of $C$:
\begin{align*}
\frac{
\Theta_1 \qquad \Theta_2
}{
B \wedge C
}\ \wedge I.
\end{align*}
If this occurrence of $B \wedge C$ is immediately followed by the first projection, the reduction replaces the whole derivation by $\Theta_1$; if it is immediately followed by the second projection, the reduction replaces the whole derivation by $\Theta_2$. No discharged assumptions are involved in this case: projection simply selects one of the component derivations already used to build the conjunction, so the selected component has the required local conclusion and its open assumptions are among the open assumptions already present in the detour.
For disjunction, suppose a derivation $\Theta$ of $B$ is injected into the left side:
\begin{align*}
\frac{\Theta}{B}\ \vee I_1.
\end{align*}
If this occurrence of $B \vee C$ is immediately used as the principal premise of a $\vee E$ with branch derivations $\Sigma_1$ from assumption $B$ to conclusion $D$ and $\Sigma_2$ from assumption $C$ to conclusion $D$, then the reduction replaces the whole derivation by the result of substituting $\Theta$ for the discharged assumption $B$ in $\Sigma_1$. The right branch $\Sigma_2$ disappears because the actual introduced disjunct was the left one. The case of $\vee I_2$ is the matching right-hand version: a derivation of $C$ is substituted into the second branch. In both disjunction reductions, the local conclusion remains $D$, the discharged case assumption is removed, and the open assumptions of the substituted derivation are imported precisely at the uses of that discharged assumption.
[/guided]
[/step]
[step:Invoke strong normalization to obtain a finite beta-reduction sequence]
Let $\mathsf{Term}(\Gamma,A)$ be the set of well-typed terms of type $T(A)$ in the finite context $T(\Gamma)$. We use the strong normalization theorem for the simply typed lambda calculus with products, sums, and empty type, where the beta-reduction relation includes function beta reduction, product projection beta reductions, and sum case beta reductions: every term in $\mathsf{Term}(\Gamma,A)$ admits no infinite sequence for this full beta-reduction relation.
Applying this theorem to $\tau(\Pi) \in \mathsf{Term}(\Gamma,A)$ gives finiteness of every beta-reduction sequence beginning at $\tau(\Pi)$. Therefore the deterministic procedure which repeatedly chooses any beta-redex in the current term must stop after finitely many steps. Thus there exist an integer $m \geq 0$ and terms $t_0,t_1,\dots,t_m$ such that
\begin{align*}
t_0 &= \tau(\Pi),\\
t_k &\longrightarrow_\beta t_{k+1} \quad \text{for } 0 \leq k < m,
\end{align*}
and $t_m$ contains no beta-redex.
[guided]
The previous step replaced derivations by typed terms, so the hard termination question becomes the standard termination question for well-typed beta reduction. Let $\mathsf{Term}(\Gamma,A)$ denote the set of terms of type $T(A)$ in the finite context $T(\Gamma)$. The translated derivation satisfies
\begin{align*}
\tau(\Pi) \in \mathsf{Term}(\Gamma,A).
\end{align*}
We invoke the strong normalization theorem for the simply typed lambda calculus with products, sums, and empty type, using the beta-reduction relation that includes function beta reduction, product projection beta reductions, and sum case beta reductions. Its hypothesis is well-typedness in a finite context, which holds because $\Pi$ is a finite natural-deduction derivation from the finite list of open assumption occurrences $\Gamma$, and the Curry-Howard translation assigns each inference rule a well-typed term constructor. Its conclusion is that no infinite sequence of these beta reductions can begin at $\tau(\Pi)$.
Now repeatedly choose a beta-redex in the current term and reduce it. Strong normalization implies this process cannot continue indefinitely. Hence there are $m \geq 0$ and well-typed terms $t_0,t_1,\dots,t_m$ with
\begin{align*}
t_0 &= \tau(\Pi),\\
t_k &\longrightarrow_\beta t_{k+1} \quad \text{for } 0 \leq k < m,
\end{align*}
and with $t_m$ beta-normal. This argument remains valid even when a beta step duplicates a subterm, because strong normalization is a theorem about all beta-reduction sequences, including those involving substitution and duplication.
[/guided]
[/step]
[step:Translate the beta-normal term back to a normal derivation]
The inverse Curry-Howard reading sends every well-typed term constructor to the corresponding natural-deduction rule: variables to assumptions, lambda abstraction to $\to I$, application to $\to E$, pairing to $\wedge I$, projections to $\wedge E$, injections to $\vee I_1$ and $\vee I_2$, case analysis to $\vee E$, and the empty-type eliminator to $\bot E$. Therefore each term $t_k$ determines a natural-deduction derivation $\Pi_k$ of conclusion $A$ from the same open assumptions $\Gamma$. Each beta step $t_k \to_\beta t_{k+1}$ is one of the function, product, or sum beta reductions listed above, so it translates to the matching principal natural-deduction reduction. Hence
\begin{align*}
\Pi = \Pi_0 \longrightarrow \Pi_1 \longrightarrow \cdots \longrightarrow \Pi_m
\end{align*}
is a finite sequence of reductions.
Because $t_m$ has no beta-redex, the derivation $\Pi_m$ has no formula occurrence that is both the conclusion of an introduction rule and the principal premise of the matching elimination rule. Hence $\Pi_m$ is normal. The typing context and final type of every $t_k$ are $T(\Gamma)$ and $T(A)$, so every $\Pi_k$ has open assumptions $\Gamma$ and conclusion $A$. Therefore $\Pi$ reduces in finitely many steps to a normal derivation with the same open assumptions and the same conclusion.
[guided]
It remains to return from typed terms to natural-deduction derivations. Each term $t_k$ is well typed in the same context and at the same type as the original translated term:
\begin{align*}
t_k &\in \mathsf{Term}(\Gamma,A) \quad \text{for } 0 \leq k \leq m.
\end{align*}
The inverse Curry-Howard reading is defined constructor by constructor: variables become assumptions, lambda abstractions become $\to I$, applications become $\to E$, pairs become $\wedge I$, projections become $\wedge E$, injections become $\vee I_1$ and $\vee I_2$, case analyses become $\vee E$, and empty-type eliminators become $\bot E$. Thus every $t_k$ gives a derivation $\Pi_k$ of $A$ from the open assumptions $\Gamma$.
For each beta step $t_k \to_\beta t_{k+1}$, the shape of the contracted redex determines the corresponding standard natural-deduction reduction. A function beta step $(\lambda x:T(B).\,s)\,t \to_\beta s[t/x]$ is substitution of the derivation for $B$ into the subderivation whose temporary assumption $B$ was discharged by $\to I$. A product beta step $\pi_1(\langle s,t\rangle) \to_\beta s$ or $\pi_2(\langle s,t\rangle) \to_\beta t$ is the removal of a conjunction introduction immediately followed by the matching projection. A sum beta step for $\operatorname{case}(\operatorname{inl}(s);x.u;y.v)$ or $\operatorname{case}(\operatorname{inr}(t);x.u;y.v)$ is substitution into the selected branch of the disjunction-elimination derivation. Hence the beta-reduction sequence translates back to a finite derivation-reduction sequence
\begin{align*}
\Pi = \Pi_0 \longrightarrow \Pi_1 \longrightarrow \cdots \longrightarrow \Pi_m.
\end{align*}
Since $t_m$ is beta-normal, none of the term patterns corresponding to an introduction immediately followed by its matching elimination remains. Therefore $\Pi_m$ contains no such detour and is normal. Because beta reduction preserves the typing context and the final type at every step, the corresponding derivations preserve the open assumptions $\Gamma$ and the conclusion $A$. This proves weak normalization.
[/guided]
[/step]