[proofplan]
We start with a fixed NP-complete language $A$ and construct a delayed diagonal subset $L$ of $A$. A computable control function $h$ alternates between long even stages, which diagonalize against polynomial-time deciders for $L$, and long odd stages, which diagonalize against polynomial-time many-one reductions from $A$ to $L$. The delays make $h$ polynomial-time computable and make membership in $L$ nondeterministically polynomial-time decidable, while the two diagonalization clauses force $L \notin \mathrm{P}$ and prevent $L$ from being NP-complete.
[/proofplan]
[step:Choose a complete language and enumerate all polynomial-time algorithms]
Assume $\mathrm{P} \neq \mathrm{NP}$. By the standard Cook-Levin existence theorem for NP-complete languages, let $A \subseteq \{0,1\}^*$ be a fixed NP-complete language under polynomial-time many-one reductions. Since $\mathrm{P} \neq \mathrm{NP}$ and $A$ is NP-complete, $A \notin \mathrm{P}$.
Let $(M_i)_{i \geq 0}$ be an effective enumeration of deterministic Turing machines with polynomial clocks, where $M_i$ is interpreted as a decider running in time at most $n^{i+1}+i+1$ on inputs of length $n$. Let $(R_i)_{i \geq 0}$ be an effective enumeration of deterministic polynomial-time transducers, where $R_i$ runs in time at most $n^{i+1}+i+1$ and outputs a string. These enumerations contain every polynomial-time decider and every polynomial-time many-one reduction at least once.
[/step]
[step:Build the delayed control function by an explicit bounded search]
Fix an NP verifier $V_A$ for $A$ and a constant $c \geq 1$ such that $x \in A$ iff there exists a certificate $w \in \{0,1\}^{\leq |x|^c}$ accepted by $V_A(x,w)$ in time at most $|x|^c+c$. Define the exhaustive-decision time function
\begin{align*}
T_A: \mathbb{N} &\to \mathbb{N}
\end{align*}
by letting $T_A(r)$ be the deterministic time needed to decide membership in $A$ for every string of length at most $r$ by exhaustive certificate search. Thus $T_A(r) \leq 2^{C_A r^c}$ for a constant $C_A \geq 1$ depending only on $V_A$. Choose a constant $C \geq C_A$ large enough that, for every $r \in \mathbb{N}$,
\begin{align*}
2^{r+1}2^{C_A(r+1)^c}\leq 2^{C(r+1)^c}.
\end{align*}
Define a computable threshold function $s: \mathbb{N} \to \mathbb{N}$ by
\begin{align*}
s(n) := \max\{r \in \mathbb{N} : 2^{C(r+1)^c}(r+1)^{r+1} \leq n\},
\end{align*}
with $s(n)=0$ if the set is empty. Then $s(n) \to \infty$ and every computation used below on strings of length at most $s(n)$ takes time polynomial in $n$.
We define finite approximations $h_n: \{0,1,\dots,n\} \to \mathbb{N}$ recursively. Set $h_0(0)=0$. Suppose $h_{n-1}$ has been defined on $\{0,1,\dots,n-1\}$ and let $m=h_{n-1}(n-1)$. Extend it temporarily by $h_{n,0}(k)=h_{n-1}(k)$ for $k<n$ and $h_{n,0}(n)=m$. For any such finite map $g: \{0,1,\dots,n\}\to \mathbb{N}$, define the finite language
\begin{align*}
L_g := \{x \in \{0,1\}^{\leq s(n)} : x \in A \text{ and } g(|x|) \text{ is even}\}.
\end{align*}
Membership in $L_g$ is decidable during the construction because every queried string has length at most $s(n)$ and membership in $A$ is decided by exhaustive search.
If $m>s(n)$, set $h_n(n)=m$ and perform no diagonal search at stage $n$. This pause is part of the delay: it ensures that whenever a machine or transducer is simulated at stage $n$, its index is at most $s(n)$.
If $m\leq s(n)$ and $m=2i$, declare the even requirement witnessed at length $n$ when there exists $x \in \{0,1\}^{\leq s(n)}$ such that $M_i(x)$ halts within its clock and
\begin{align*}
M_i(x) \neq \mathbb{1}_{L_{h_{n,0}}}(x).
\end{align*}
If such an $x$ exists, set $h_n(n)=m+1$; otherwise set $h_n(n)=m$. If $m\leq s(n)$ and $m=2i+1$, declare the odd requirement witnessed at length $n$ when there exists $x \in \{0,1\}^{\leq s(n)}$ such that $R_i(x)$ halts within its clock, $|R_i(x)|\leq s(n)$, and
\begin{align*}
\mathbb{1}_{A}(x) \neq \mathbb{1}_{L_{h_{n,0}}}(R_i(x)).
\end{align*}
If such an $x$ exists, set $h_n(n)=m+1$; otherwise set $h_n(n)=m$. In all cases keep $h_n(k)=h_{n-1}(k)$ for $k<n$. Finally define $h: \mathbb{N}\to\mathbb{N}$ by $h(n)=h_n(n)$.
This is a genuine recursive definition: at stage $n$ only the already fixed values $h_{n-1}(0),\dots,h_{n-1}(n-1)$ and the temporary value $h_n^0(n)=h_{n-1}(n-1)$ are used. Later stages never change earlier values. The function $h$ is nondecreasing and increases by at most one at each length. To compute $h(n)$, simulate the above recursion for $0,1,\dots,n$. At stage $k$, if $h(k-1)>s(k)$, the construction only copies the previous value and does no machine simulation. Otherwise the searched requirement has index at most $s(k)$, the search uses at most $2^{s(k)+1}$ input strings, exhaustive decisions for $A$ occur only on strings of length at most $s(k)$, and each bounded machine or transducer simulation on such an input takes time at most $(s(k)+1)^{s(k)+1}+s(k)+1$. The output length test prevents any query to $L_{h_{k,0}}$ outside lengths at most $s(k)$. Hence the work at stage $k$ is bounded by a constant multiple of
\begin{align*}
2^{s(k)+1} 2^{C(s(k)+1)^c}(s(k)+1)^{s(k)+1}.
\end{align*}
By the choice of the constant $C$ in the definition of $s$, the input-count factor is absorbed into the exponential exhaustive-search bound, and the defining inequality for $s(k)$ gives
\begin{align*}
2^{s(k)+1} 2^{C_A(s(k)+1)^c}(s(k)+1)^{s(k)+1}\leq k.
\end{align*}
Thus each stage costs at most a constant multiple of $k$, and summing over $k\leq n$ gives a polynomial bound in $n$. Hence $h$ is total and polynomial-time computable.
[guided]
We now construct the delay function rather than invoking it. The difficulty is that $h$ will define the final language, while the construction of $h$ must still be able to test finite pieces of that language. The way out is to use only a finite approximation at each length and to make sure all undefined future values of $h$ are irrelevant to the current test.
Fix an NP verifier $V_A$ for $A$ and a constant $c \geq 1$ such that $x \in A$ exactly when some certificate $w \in \{0,1\}^{\leq |x|^c}$ makes $V_A(x,w)$ accept within time at most $|x|^c+c$. Define the map $T_A: \mathbb{N}\to\mathbb{N}$ so that $T_A(r)$ is the deterministic time required to decide $A$ on all strings of length at most $r$ by trying every certificate. There is a constant $C_A \geq 1$ such that
\begin{align*}
T_A(r) \leq 2^{C_A r^c}.
\end{align*}
Choose a constant $C \geq C_A$ large enough that, for every $r \in \mathbb{N}$,
\begin{align*}
2^{r+1}2^{C_A(r+1)^c}\leq 2^{C(r+1)^c}.
\end{align*}
Define
\begin{align*}
s(n) := \max\{r \in \mathbb{N} : 2^{C(r+1)^c}(r+1)^{r+1} \leq n\},
\end{align*}
with $s(n)=0$ if no positive $r$ satisfies the bound. This threshold tends to infinity, but slowly enough that exhaustive decisions of $A$ on strings of length at most $s(n)$ and the relevant bounded simulations are polynomial in $n$.
We define finite maps $h_n: \{0,1,\dots,n\}\to\mathbb{N}$ recursively. Start with $h_0(0)=0$. Suppose $h_{n-1}$ is already fixed and let $m=h_{n-1}(n-1)$. Before deciding whether to move past requirement $m$, form the temporary extension $h_{n,0}$ by keeping all old values and setting $h_{n,0}(n)=m$. For any finite map $g: \{0,1,\dots,n\}\to\mathbb{N}$, define
\begin{align*}
L_g := \{x \in \{0,1\}^{\leq s(n)} : x \in A \text{ and } g(|x|) \text{ is even}\}.
\end{align*}
This definition is meaningful because $s(n)\leq n$, so $g(|x|)$ is defined for every string under consideration. It also avoids circularity: at stage $n$, the construction only asks about lengths at most $s(n)$, where $h_{n,0}$ has already assigned a value.
If $m>s(n)$, we pause and set $h_n(n)=m$. This extra delay is essential for the runtime estimate: no machine of index larger than $s(n)$ is simulated at stage $n$.
If $m\leq s(n)$ and $m=2i$, the current requirement is to defeat the decider $M_i$. We search over all $x \in \{0,1\}^{\leq s(n)}$ and ask whether
\begin{align*}
M_i(x) \neq \mathbb{1}_{L_{h_{n,0}}}(x).
\end{align*}
If such an $x$ exists, set $h_n(n)=m+1$; otherwise set $h_n(n)=m$. If $m\leq s(n)$ and $m=2i+1$, the current requirement is to defeat the transducer $R_i$ as a many-one reduction from $A$ to the delayed language. We search over all $x \in \{0,1\}^{\leq s(n)}$ for which $R_i(x)$ halts within its clock and $|R_i(x)|\leq s(n)$, and we ask whether
\begin{align*}
\mathbb{1}_A(x) \neq \mathbb{1}_{L_{h_{n,0}}}(R_i(x)).
\end{align*}
If such an $x$ exists, set $h_n(n)=m+1$; otherwise set $h_n(n)=m$. In all cases, all older values are preserved: $h_n(k)=h_{n-1}(k)$ for $k<n$.
Finally set $h(n)=h_n(n)$. This gives a total nondecreasing function $h: \mathbb{N}\to\mathbb{N}$, increasing by at most one at a time. The construction is polynomial-time computable because, to compute $h(n)$, one repeats the finite recursion through length $n$, and at each intermediate length $k$ the search is restricted to strings of length at most $s(k)$. If $h(k-1)>s(k)$, the stage only copies the previous value and no machine is simulated. Otherwise the current requirement has index at most $s(k)$, so on inputs of length at most $s(k)$ every simulated clock is bounded by $(s(k)+1)^{s(k)+1}+s(k)+1$. At stage $k$ there are at most $2^{s(k)+1}$ tested inputs. The construction decides $A$ only for strings of length at most $s(k)$, costing at most $2^{C_A(s(k)+1)^c}$ by exhaustive certificate search. The condition $|R_i(x)|\leq s(k)$ is essential here: it ensures that evaluating $\mathbb{1}_{L_{h_{k,0}}}(R_i(x))$ never asks for membership in $A$ at a length larger than $s(k)$.
Therefore the cost of stage $k$ is bounded by a constant multiple of
\begin{align*}
2^{s(k)+1} 2^{C(s(k)+1)^c}(s(k)+1)^{s(k)+1}.
\end{align*}
By the initial choice of the fixed constant $C$, the factor $2^{s(k)+1}$ is included in the exponential term, and the defining inequality gives that this displayed quantity is at most $k$. Thus every stage costs at most a constant multiple of $k$, so the total cost through stage $n$ is bounded by a constant multiple of $\sum_{k=1}^n k$, hence by a polynomial in $n$. Thus the delayed control function exists, is explicitly defined, and is polynomial-time computable.
[/guided]
[/step]
[step:Show that the control function reaches every requirement]
We prove that $h$ is unbounded. Suppose instead that $h$ is eventually constant, and let $m$ be its eventual value. Since $h$ is nondecreasing and increases by at most one at each stage, this means that requirement $m$ is never witnessed after the first stage at which $h$ reaches $m$.
If $m=2j$, then $h(r)=2j$ for all sufficiently large lengths $r$. Hence, outside a finite set of lengths, the final language $L$ agrees with $A$. Since no even witness is ever found for $M_j$ and $s(n)\to\infty$, for every fixed string $x$ all sufficiently large stages test $x$ and obtain
\begin{align*}
M_j(x)=\mathbb{1}_{L}(x).
\end{align*}
Thus $M_j$ decides $L$. A deterministic polynomial-time decision procedure for $A$ is obtained by using a finite lookup table on the finitely many exceptional lengths and running $M_j$ on all remaining inputs. This contradicts $A\notin\mathrm{P}$.
If $m=2j+1$, then $h(r)=2j+1$ for all sufficiently large lengths $r$, so $L$ is finite. Since no odd witness is ever found for $R_j$ and $s(n)\to\infty$, for every fixed string $x$ all sufficiently large stages test $x$, the output length $|R_j(x)|$ is eventually at most $s(n)$ because $R_j(x)$ is fixed, and the finite approximation agrees with the final language on that output length. Therefore
\begin{align*}
\mathbb{1}_A(x)=\mathbb{1}_{L}(R_j(x)).
\end{align*}
Because $L$ is finite, membership in $L$ is decidable by a finite lookup table. Computing $R_j(x)$ and then looking up membership in $L$ gives a deterministic polynomial-time decision procedure for $A$, again contradicting $A\notin\mathrm{P}$.
Both cases are impossible. Hence $h$ is unbounded, and since it starts at $0$ and increases by at most one, it reaches every requirement $m\in\mathbb{N}$.
[/step]
[step:Define the candidate intermediate language and place it in NP]
Define
\begin{align*}
L := \{x \in \{0,1\}^* : x \in A \text{ and } h(|x|) \text{ is even}\}.
\end{align*}
Since $h$ is polynomial-time computable and $A \in \mathrm{NP}$, membership in $L$ is verified by first computing $h(|x|)$ deterministically and then using the NP verifier for $A$ when $h(|x|)$ is even. Hence $L \in \mathrm{NP}$.
[/step]
[step:Use the even stages to prove that the language is not in P]
Suppose toward a contradiction that $L \in \mathrm{P}$. Then some decider $M_i$ in the enumeration decides $L$. By the previous step, $h$ reaches the even requirement $2i$. If that requirement were never witnessed after being reached, $h$ would remain equal to $2i$ forever, contradicting the unboundedness proved in the previous step. Thus the even requirement $2i$ is witnessed at some finite length $n$. Let $x$ be a witnessing string. The construction then gives
\begin{align*}
M_i(x) \neq \mathbb{1}_{L_{h_{n,0}}}(x).
\end{align*}
Later stages never change values of $h$ at lengths at most $n$, and $|x|\leq s(n)\leq n$, so $\mathbb{1}_{L_{h_{n,0}}}(x)=\mathbb{1}_{L}(x)$. Hence $M_i$ disagrees with $L$ on $x$, contradicting that $M_i$ decides $L$. Therefore $L \notin \mathrm{P}$.
[/step]
[step:Use the odd stages to rule out NP-completeness]
Suppose toward a contradiction that $L$ is NP-complete under polynomial-time many-one reductions. Since $A$ is NP-complete and $L \in \mathrm{NP}$, there is a polynomial-time many-one reduction from $A$ to $L$; hence some transducer $R_i$ in the enumeration computes such a reduction. We prove that the odd requirement $2i+1$ is eventually witnessed. If not, then after $h$ first reaches $2i+1$, it remains equal to $2i+1$ forever. Therefore there is a length $N$ such that, for every string $y$ with $|y|\geq N$, the final language satisfies $y\notin L$.
Assume no witness is ever found after this stage. For all sufficiently large tested inputs $x$, the construction verifies
\begin{align*}
\mathbb{1}_A(x)=\mathbb{1}_{L}(R_i(x)),
\end{align*}
because $s(n)\to\infty$ and, for each fixed $x$, all large enough stages test that $x$ and include the output length $|R_i(x)|$. Since $R_i$ runs in polynomial time, $|R_i(x)|$ is polynomially bounded in $|x|$. The language $L$ is finite on lengths at least $N$ because all such lengths are suppressed, and strings of length below $N$ form a finite set. Hence $L$ is decidable by a finite lookup table, and the equivalence above gives a polynomial-time decision procedure for $A$ by computing $R_i(x)$ and looking up whether $R_i(x)\in L$. This contradicts $A\notin\mathrm{P}$.
Thus the odd requirement $2i+1$ is witnessed at some finite length $n$. Let $x$ be a witnessing string. The construction gives
\begin{align*}
\mathbb{1}_{A}(x) \neq \mathbb{1}_{L_{h_{n,0}}}(R_i(x)).
\end{align*}
Since the witness condition includes $|R_i(x)|\leq s(n)\leq n$ and later stages never alter $h$ at lengths at most $n$, we have $\mathbb{1}_{L_{h_{n,0}}}(R_i(x))=\mathbb{1}_{L}(R_i(x))$. Therefore $R_i$ does not reduce $A$ to $L$, contradicting the choice of $R_i$ as a many-one reduction from $A$ to $L$. Hence no polynomial-time many-one reduction from $A$ to $L$ exists. Since $A\in\mathrm{NP}$, this contradicts the defining property that an NP-complete language receives polynomial-time many-one reductions from every language in $\mathrm{NP}$. Therefore $L$ is not NP-complete.
[/step]
[step:Conclude that the constructed language is NP-intermediate]
We have shown $L \in \mathrm{NP}$, $L \notin \mathrm{P}$, and $L$ is not NP-complete under polynomial-time many-one reductions. Therefore, under the assumption $\mathrm{P} \neq \mathrm{NP}$, the language $L$ is NP-intermediate. This proves Ladner's theorem.
[/step]