[proofplan]
We prove the result by induction on the height of a derivation tree. The induction hypothesis says that all derivations of smaller height satisfy $P$. Since every immediate subderivation of a derivation has strictly smaller height, the local closure assumption for the final inference promotes the property from the immediate subderivations to the whole derivation.
[/proofplan]
[step:Prove the height-indexed induction statement]
For each $n \in \mathbb{N}_0$, define the assertion $Q(n)$ by
\begin{align*}
Q(n) \quad \Longleftrightarrow \quad \text{for every } D \in \mathcal{D}, \ h(D) \leq n \implies P(D).
\end{align*}
We prove $Q(n)$ for every $n \in \mathbb{N}_0$ by induction on $n$.
For $n = 0$, let $D \in \mathcal{D}$ satisfy $h(D) \leq 0$. Since $h(D) \in \mathbb{N}_0$, this means $h(D)=0$. A derivation tree of height $0$ has no immediate subderivations above its final inference, so the list of immediate subderivations is empty. The hypothesis of the theorem therefore applies vacuously and gives $P(D)$. Hence $Q(0)$ holds.
Now let $n \in \mathbb{N}_0$ and assume $Q(n)$. Let $D \in \mathcal{D}$ satisfy $h(D) \leq n+1$. If $h(D) \leq n$, then $P(D)$ follows directly from $Q(n)$. It remains to consider the case $h(D)=n+1$.
Let $D_1,\dots,D_m \in \mathcal{D}$ be the immediate subderivations occurring above the final inference of $D$, where $m \in \mathbb{N}_0$. By the defining property of the height function, each immediate subderivation has strictly smaller height than $D$, so
\begin{align*}
h(D_i) < h(D) = n+1
\end{align*}
for every $i \in \{1,\dots,m\}$. Since $h(D_i) \in \mathbb{N}_0$, this implies $h(D_i) \leq n$. By $Q(n)$, $P(D_i)$ holds for every $i \in \{1,\dots,m\}$. Applying the closure hypothesis to the final inference of $D$ gives $P(D)$. Thus $Q(n+1)$ holds.
By induction, $Q(n)$ holds for every $n \in \mathbb{N}_0$.
[/step]
[step:Apply the height-indexed statement to an arbitrary derivation]
Let $D \in \mathcal{D}$ be arbitrary. Its height $h(D)$ is an element of $\mathbb{N}_0$. Applying $Q(h(D))$ to this derivation gives
\begin{align*}
h(D) \leq h(D) \implies P(D).
\end{align*}
Therefore $P(D)$ holds. Since $D \in \mathcal{D}$ was arbitrary, $P(D)$ holds for every finite derivation tree $D$ in the proof system $\mathcal{S}$.
[/step]