[proofplan]
We prove confluence by introducing parallel beta reduction, a one-step relation that contracts any finite family of disjoint beta-redexes at once. The key point is that parallel reduction has the diamond property, proved using substitution compatibility and complete developments. Since ordinary beta reduction is included in parallel reduction and every parallel step is simulated by finitely many ordinary beta steps, the diamond property transfers to the reflexive [transitive closure](/theorems/1493) of ordinary beta reduction.
[/proofplan]
[step:Define parallel beta reduction and complete developments]
Write $r \Rightarrow s$ for parallel beta reduction. It is the least relation on $\textsf{Tm}$ generated by the following rules:
if $x$ is a variable, then $x \Rightarrow x$;
if $r \Rightarrow r'$, then $\lambda x.r \Rightarrow \lambda x.r'$;
if $r \Rightarrow r'$ and $s \Rightarrow s'$, then $r\,s \Rightarrow r'\,s'$;
if $r \Rightarrow r'$ and $s \Rightarrow s'$, then $(\lambda x.r)\,s \Rightarrow r'[s'/x]$.
Here $r'[s'/x]$ denotes capture-avoiding substitution, and terms are considered up to alpha-equivalence, so bound variables may be renamed before substitution.
Define the complete development map
\begin{align*}
(-)^\star : \textsf{Tm} \to \textsf{Tm}
\end{align*}
recursively as follows. For a variable $x$, set $x^\star := x$. For an abstraction, set
\begin{align*}
(\lambda x.r)^\star := \lambda x.r^\star.
\end{align*}
For an application, define
\begin{align*}
((\lambda x.r)\,s)^\star := r^\star[s^\star/x]
\end{align*}
when the left-hand term is an abstraction, and otherwise define
\begin{align*}
(r\,s)^\star := r^\star\,s^\star.
\end{align*}
This definition is well-defined on alpha-equivalence classes because all substitutions are capture-avoiding.
[/step]
[step:Prove substitution compatibility for parallel reduction]
[claim:Parallel reduction is compatible with substitution]
If $r \Rightarrow r'$ and $s \Rightarrow s'$, then
\begin{align*}
r[s/x] \Rightarrow r'[s'/x].
\end{align*}
[/claim]
[proof]
We prove the claim by induction on the derivation of $r \Rightarrow r'$.
If $r$ is a variable $y$, then $r'=y$. If $y=x$, then $r[s/x]=s$ and $r'[s'/x]=s'$, so the conclusion is $s \Rightarrow s'$, which is assumed. If $y \ne x$, then $r[s/x]=y=r'[s'/x]$, and the variable rule gives $y \Rightarrow y$.
Suppose $r=\lambda y.a$ and $r'=\lambda y.a'$ with $a \Rightarrow a'$. By alpha-renaming, choose $y$ fresh for $x$, $s$, and $s'$. Then
\begin{align*}
(\lambda y.a)[s/x] = \lambda y.(a[s/x])
\end{align*}
and
\begin{align*}
(\lambda y.a')[s'/x] = \lambda y.(a'[s'/x]).
\end{align*}
The induction hypothesis gives $a[s/x] \Rightarrow a'[s'/x]$, and the abstraction rule gives
\begin{align*}
\lambda y.(a[s/x]) \Rightarrow \lambda y.(a'[s'/x]).
\end{align*}
Suppose $r=a\,b$ and $r'=a'\,b'$ is obtained from $a \Rightarrow a'$ and $b \Rightarrow b'$ by the application rule. The induction hypothesis gives
\begin{align*}
a[s/x] \Rightarrow a'[s'/x]
\end{align*}
and
\begin{align*}
b[s/x] \Rightarrow b'[s'/x].
\end{align*}
Applying the application rule gives
\begin{align*}
(a\,b)[s/x] \Rightarrow (a'\,b')[s'/x].
\end{align*}
Finally suppose $r=(\lambda y.a)\,b$ and $r'=a'[b'/y]$ is obtained by the parallel beta rule from $a \Rightarrow a'$ and $b \Rightarrow b'$. By alpha-renaming, choose $y$ fresh for $x$, $s$, and $s'$. The induction hypothesis gives
\begin{align*}
a[s/x] \Rightarrow a'[s'/x]
\end{align*}
and
\begin{align*}
b[s/x] \Rightarrow b'[s'/x].
\end{align*}
Using the parallel beta rule,
\begin{align*}
((\lambda y.a)\,b)[s/x] \Rightarrow a'[s'/x][b'[s'/x]/y].
\end{align*}
By the capture-avoiding substitution composition law, valid because $y$ was chosen fresh,
\begin{align*}
a'[s'/x][b'[s'/x]/y] = a'[b'/y][s'/x].
\end{align*}
Thus
\begin{align*}
r[s/x] \Rightarrow r'[s'/x].
\end{align*}
This completes the induction.
[/proof]
[/step]
[step:Show every parallel reduct reduces to the complete development]
We prove that if $r \Rightarrow s$, then $s \Rightarrow r^\star$, by induction on the derivation of $r \Rightarrow s$.
For the variable rule, $r=s=x$ and $x^\star=x$, so $s \Rightarrow r^\star$ is $x \Rightarrow x$.
For abstraction, write $r=\lambda x.a$ and $s=\lambda x.a'$ with $a \Rightarrow a'$. By induction, $a' \Rightarrow a^\star$. Hence
\begin{align*}
\lambda x.a' \Rightarrow \lambda x.a^\star = (\lambda x.a)^\star.
\end{align*}
For application, write $r=a\,b$ and $s=a'\,b'$ with $a \Rightarrow a'$ and $b \Rightarrow b'$. By induction,
\begin{align*}
a' \Rightarrow a^\star
\end{align*}
and
\begin{align*}
b' \Rightarrow b^\star.
\end{align*}
If $a$ is not an abstraction, the application rule gives
\begin{align*}
a'\,b' \Rightarrow a^\star\,b^\star = (a\,b)^\star.
\end{align*}
If $a=\lambda x.c$, then the parallel beta rule gives
\begin{align*}
a'\,b' \Rightarrow c^\star[b^\star/x] = ((\lambda x.c)\,b)^\star,
\end{align*}
because $a' \Rightarrow \lambda x.c^\star$ follows from the abstraction case of the induction.
For the parallel beta rule, write $r=(\lambda x.a)\,b$ and $s=a'[b'/x]$, with $a \Rightarrow a'$ and $b \Rightarrow b'$. By induction,
\begin{align*}
a' \Rightarrow a^\star
\end{align*}
and
\begin{align*}
b' \Rightarrow b^\star.
\end{align*}
By substitution compatibility,
\begin{align*}
a'[b'/x] \Rightarrow a^\star[b^\star/x].
\end{align*}
Since $((\lambda x.a)\,b)^\star=a^\star[b^\star/x]$, this proves $s \Rightarrow r^\star$.
[guided]
The purpose of the complete development $r^\star$ is to give a canonical target after one parallel step: no matter which finite collection of redexes was contracted in $r \Rightarrow s$, the reduct $s$ can still contract in one more parallel step to the term obtained by contracting all redexes available in $r$.
We prove the assertion by induction on the derivation of $r \Rightarrow s$. In the variable case, the only derivation is $x \Rightarrow x$, and the complete development satisfies $x^\star=x$, so the conclusion is exactly the variable rule.
In the abstraction case, $r=\lambda x.a$ and $s=\lambda x.a'$ for some derivation $a \Rightarrow a'$. The induction hypothesis gives
\begin{align*}
a' \Rightarrow a^\star.
\end{align*}
Applying the abstraction rule for parallel reduction gives
\begin{align*}
\lambda x.a' \Rightarrow \lambda x.a^\star.
\end{align*}
By definition of complete development,
\begin{align*}
\lambda x.a^\star = (\lambda x.a)^\star.
\end{align*}
Thus $s \Rightarrow r^\star$.
In the application case, $r=a\,b$ and $s=a'\,b'$ are obtained from $a \Rightarrow a'$ and $b \Rightarrow b'$. The induction hypotheses give
\begin{align*}
a' \Rightarrow a^\star
\end{align*}
and
\begin{align*}
b' \Rightarrow b^\star.
\end{align*}
If $a$ is not an abstraction, then the complete development of the application is $a^\star b^\star$, and the application rule gives
\begin{align*}
a'\,b' \Rightarrow a^\star\,b^\star = (a\,b)^\star.
\end{align*}
If $a=\lambda x.c$, then the application itself is a beta-redex in the source term. In that case the complete development contracts this outer redex after developing its body and argument, so
\begin{align*}
((\lambda x.c)\,b)^\star = c^\star[b^\star/x].
\end{align*}
The induction hypothesis applied to the left component gives $a' \Rightarrow \lambda x.c^\star$, and the induction hypothesis applied to the right component gives $b' \Rightarrow b^\star$. Therefore the parallel beta rule gives
\begin{align*}
a'\,b' \Rightarrow c^\star[b^\star/x] = ((\lambda x.c)\,b)^\star.
\end{align*}
In the beta case, $r=(\lambda x.a)\,b$ and $s=a'[b'/x]$, where $a \Rightarrow a'$ and $b \Rightarrow b'$. The induction hypotheses give
\begin{align*}
a' \Rightarrow a^\star
\end{align*}
and
\begin{align*}
b' \Rightarrow b^\star.
\end{align*}
Substitution compatibility then gives
\begin{align*}
a'[b'/x] \Rightarrow a^\star[b^\star/x].
\end{align*}
But $a^\star[b^\star/x]$ is exactly the complete development of $(\lambda x.a)\,b$. Hence $s \Rightarrow r^\star$.
[/guided]
[/step]
[step:Deduce the diamond property for parallel beta reduction]
Let $r,s,p \in \textsf{Tm}$ satisfy $r \Rightarrow s$ and $r \Rightarrow p$. By the previous step applied to $r \Rightarrow s$,
\begin{align*}
s \Rightarrow r^\star.
\end{align*}
Applying the same result to $r \Rightarrow p$ gives
\begin{align*}
p \Rightarrow r^\star.
\end{align*}
Thus the term $q:=r^\star$ satisfies $s \Rightarrow q$ and $p \Rightarrow q$. Therefore $\Rightarrow$ has the diamond property.
[/step]
[step:Relate ordinary beta reduction and parallel beta reduction]
First, every ordinary beta step is a parallel step. Indeed, an ordinary beta step contracts one redex
\begin{align*}
(\lambda x.a)\,b \to_\beta a[b/x]
\end{align*}
inside some term context. The parallel beta rule contracts this same redex, and the variable, abstraction, and application rules propagate the step through the surrounding context. Hence
\begin{align*}
r \to_\beta s \implies r \Rightarrow s.
\end{align*}
Second, every parallel beta step is simulated by finitely many ordinary beta steps:
\begin{align*}
r \Rightarrow s \implies r \to_\beta^* s.
\end{align*}
This is proved by induction on the derivation of $r \Rightarrow s$. The variable case is reflexivity. The abstraction and application cases follow by applying the induction hypotheses inside the corresponding syntactic constructor. In the parallel beta case, $r=(\lambda x.a)\,b$ and $s=a'[b'/x]$ with $a \Rightarrow a'$ and $b \Rightarrow b'$. By induction,
\begin{align*}
a \to_\beta^* a'
\end{align*}
and
\begin{align*}
b \to_\beta^* b'.
\end{align*}
Reducing first inside the abstraction body and then inside the argument gives
\begin{align*}
(\lambda x.a)\,b \to_\beta^* (\lambda x.a')\,b'.
\end{align*}
One ordinary beta contraction then gives
\begin{align*}
(\lambda x.a')\,b' \to_\beta a'[b'/x].
\end{align*}
Therefore $(\lambda x.a)\,b \to_\beta^* a'[b'/x]$.
[/step]
[step:Transfer the diamond property to ordinary beta confluence]
Since $\to_\beta$ is included in $\Rightarrow$, every sequence $t \to_\beta^* u$ is also a sequence $t \Rightarrow^* u$. Since $\Rightarrow$ has the diamond property, its reflexive transitive closure is confluent: whenever $t \Rightarrow^* u$ and $t \Rightarrow^* v$, there exists $z \in \textsf{Tm}$ such that
\begin{align*}
u \Rightarrow^* z
\end{align*}
and
\begin{align*}
v \Rightarrow^* z.
\end{align*}
This follows by induction on the length of one branch, repeatedly using the one-step diamond property to commute the first step past the other branch.
Applying this to the two sequences obtained from $t \to_\beta^* u$ and $t \to_\beta^* v$, choose $z \in \textsf{Tm}$ with $u \Rightarrow^* z$ and $v \Rightarrow^* z$. By the simulation result from the previous step, each parallel step is a finite ordinary beta-reduction sequence. Concatenating these finite simulations gives
\begin{align*}
u \to_\beta^* z
\end{align*}
and
\begin{align*}
v \to_\beta^* z.
\end{align*}
Taking $w:=z$ proves that there exists $w \in \textsf{Tm}$ such that $u \to_\beta^* w$ and $v \to_\beta^* w$. This is exactly the Church-Rosser property for beta reduction.
[/step]