[proofplan]
Fix a bipartite graph $G$ with parts $X, Y$ of size $n$ and no $K_{t,t}$ subgraph. The argument is a double-count on ordered pairs $(T, y)$ where $T \subseteq X$ is a $t$-subset and $y \in Y$ is a common neighbour of $T$: the no-$K_{t,t}$ hypothesis gives a linear upper bound, while a convexity argument via Jensen's inequality gives a lower bound in terms of the average degree $d = e(G)/n$ in $Y$. Comparing the two bounds and solving for $d$ yields $d \leq t^{1/t} n^{1-1/t} + t$, from which the edge bound $e(G) = nd \leq t^{1/t} n^{2-1/t} + tn$ follows. A preliminary reduction ensures $\deg y \geq t - 1$ for all $y \in Y$, which is needed to apply the convexity of $x \mapsto \binom{x}{t}$.
[/proofplan]
[step:Reduce to the case where every vertex of $Y$ has degree at least $t - 1$]
Let $G = (X \cup Y, E)$ be a bipartite graph with $|X| = |Y| = n$ and $e(G) = Z(n, t)$, so $G$ contains no $K_{t,t}$ subgraph and is extremal. Suppose some $y \in Y$ has $\deg y < t - 1$. Adding any edge incident to $y$ cannot create a $K_{t,t}$ subgraph containing $y$: such a subgraph would require $y$ to have at least $t$ neighbours in $X$, but even after adding one edge $\deg y \leq t - 1 < t$. Hence we may add edges incident to low-degree vertices one at a time until every $y \in Y$ satisfies $\deg y \geq t - 1$, without creating a $K_{t,t}$ and without decreasing the edge count. It therefore suffices to bound $e(G)$ under the additional hypothesis $\deg y \geq t - 1$ for all $y \in Y$.
[guided]
Our target bound is $e(G) \leq t^{1/t} n^{2 - 1/t} + tn$. A double-count over $t$-subsets of $X$ works best when the degrees in $Y$ are not too small, because we will apply Jensen's inequality to the convex function $x \mapsto \binom{x}{t}$, which is convex on $x \geq t - 1$ but fails convexity for smaller $x$.
So consider the extremal question: how much can $e(G)$ be if $G$ is $K_{t,t}$-free? Fix $G$ extremal — meaning $e(G) = Z(n,t)$ — and suppose some $y \in Y$ has $\deg y < t - 1$. We claim we can add an edge at $y$ without introducing a $K_{t,t}$. A $K_{t,t}$ containing $y$ requires $y$ to have at least $t$ neighbours in $X$; after adding a single edge, $\deg y \leq (t - 2) + 1 = t - 1 < t$, so no such $K_{t,t}$ is created. This contradicts extremality — we could add the edge and stay $K_{t,t}$-free, increasing $e(G)$.
Therefore in any extremal $G$ we have $\deg y \geq t - 1$ for all $y \in Y$. Passing to this case does not weaken our bound: if the bound holds for extremal $G$ it holds for all $K_{t,t}$-free graphs. From now on we assume $\deg y \geq t - 1$ for every $y \in Y$.
[/guided]
[/step]
[step:Bound the common-neighbourhood count using the no-$K_{t,t}$ hypothesis]
For distinct $x_1, \ldots, x_t \in X$, the intersection of neighbourhoods $N(x_1) \cap \cdots \cap N(x_t) \subseteq Y$ satisfies
\begin{align*}
|N(x_1) \cap \cdots \cap N(x_t)| \leq t - 1.
\end{align*}
Indeed, if there were $t$ distinct vertices $y_1, \ldots, y_t \in N(x_1) \cap \cdots \cap N(x_t)$, then $\{x_1, \ldots, x_t\} \cup \{y_1, \ldots, y_t\}$ would span a $K_{t,t}$ subgraph, contradicting the hypothesis.
[guided]
We want to count configurations that would create a $K_{t,t}$ and use the assumption that none exist. A $K_{t,t}$ subgraph is a bipartite subgraph with $t$ vertices on each side and all $t^2$ edges between them present.
Fix $t$ distinct vertices $x_1, \ldots, x_t \in X$. A common neighbour of all of them is a vertex $y \in Y$ adjacent to every $x_i$. If we had $t$ distinct common neighbours $y_1, \ldots, y_t$, then between $\{x_1, \ldots, x_t\}$ and $\{y_1, \ldots, y_t\}$ every edge is present (each $y_j$ is adjacent to each $x_i$ by definition of common neighbour), and this is precisely a $K_{t,t}$.
Since $G$ contains no $K_{t,t}$, the common neighbourhood has strictly fewer than $t$ vertices, hence at most $t - 1$:
\begin{align*}
|N(x_1) \cap \cdots \cap N(x_t)| \leq t - 1.
\end{align*}
[/guided]
[/step]
[step:Double-count the pairs $(T, y)$ with $T \in \binom{X}{t}$ and $y$ a common neighbour of $T$]
Let $\mathcal{P} = \{(T, y) : T \in \binom{X}{t},\ y \in Y,\ T \subseteq N(y)\}$. We count $|\mathcal{P}|$ two ways.
Counting by $T$ first: for each $T = \{x_1, \ldots, x_t\}$, the set of $y$ paired with $T$ is exactly $N(x_1) \cap \cdots \cap N(x_t)$, which has size at most $t - 1$ by the previous step. Hence
\begin{align*}
|\mathcal{P}| \leq (t - 1) \binom{n}{t}.
\end{align*}
Counting by $y$ first: for each $y \in Y$, the set of $T$ paired with $y$ is $\binom{N(y)}{t}$, which has size $\binom{\deg y}{t}$. Hence
\begin{align*}
|\mathcal{P}| = \sum_{y \in Y} \binom{\deg y}{t}.
\end{align*}
Combining,
\begin{align*}
\sum_{y \in Y} \binom{\deg y}{t} \leq (t - 1) \binom{n}{t}.
\end{align*}
[guided]
The heart of the argument is a double-count. We need to find a bilinear quantity that, when counted one way uses the $K_{t,t}$-free hypothesis, and when counted the other way involves the degrees we want to control.
The natural candidate is the set of pairs $(T, y)$ where $T$ is a $t$-subset of $X$ and $y \in Y$ is a common neighbour of all vertices in $T$. Why this choice? Because "$y$ is a common neighbour of a $t$-subset" is precisely the structure that creates a $K_{t,t}$ when there are too many such $y$ for a single $T$.
**Counting by $T$.** For each $T = \{x_1, \ldots, x_t\}$, the $y$ that pair with $T$ are exactly the elements of $N(x_1) \cap \cdots \cap N(x_t)$. By the previous step, this intersection has size at most $t - 1$. There are $\binom{n}{t}$ choices of $T$, so
\begin{align*}
|\mathcal{P}| \leq (t - 1) \binom{n}{t}.
\end{align*}
**Counting by $y$.** For each $y \in Y$, a $t$-subset $T$ pairs with $y$ iff every element of $T$ is a neighbour of $y$, iff $T \subseteq N(y)$. There are $\binom{\deg y}{t}$ such $t$-subsets (where we interpret $\binom{k}{t} = 0$ for $k < t$). So
\begin{align*}
|\mathcal{P}| = \sum_{y \in Y} \binom{\deg y}{t}.
\end{align*}
Equating gives the key inequality
\begin{align*}
\sum_{y \in Y} \binom{\deg y}{t} \leq (t - 1) \binom{n}{t}.
\end{align*}
[/guided]
[/step]
[step:Apply Jensen's inequality with the convex function $x \mapsto \binom{x}{t}$]
Define $f : [t - 1, \infty) \to \mathbb{R}$ by
\begin{align*}
f(x) = \binom{x}{t} = \frac{x(x-1)(x-2) \cdots (x - t + 1)}{t!}.
\end{align*}
Substituting $s = x - t + 1 \geq 0$,
\begin{align*}
f(x) = \frac{1}{t!} \prod_{j=0}^{t-1}(s + j) = \frac{1}{t!} \left(s^t + e_1 s^{t-1} + \cdots + e_{t-1} s\right),
\end{align*}
where each $e_k$ is an elementary symmetric polynomial in $\{0, 1, \ldots, t - 1\}$ and hence non-negative. Thus $f$ is a polynomial in $s = x - t + 1$ with non-negative coefficients, so its second derivative is non-negative for $s \geq 0$, i.e. $f$ is convex on $[t - 1, \infty)$.
Let $d = \frac{e(G)}{n}$ be the average degree over $y \in Y$. By the reduction in the first step, $\deg y \geq t - 1$ for all $y \in Y$, so Jensen's inequality (in its finite-average form) applies:
\begin{align*}
\frac{1}{n} \sum_{y \in Y} \binom{\deg y}{t} \geq \binom{d}{t}.
\end{align*}
Multiplying by $n$ and combining with the previous step,
\begin{align*}
n \binom{d}{t} \leq \sum_{y \in Y} \binom{\deg y}{t} \leq (t - 1) \binom{n}{t}.
\end{align*}
[guided]
The previous step gave us $\sum_y \binom{\deg y}{t} \leq (t-1)\binom{n}{t}$. We now want to extract a bound on the total edge count $e(G) = \sum_y \deg y$. Since $\binom{\deg y}{t}$ is a convex function of $\deg y$, Jensen's inequality lets us replace the sum with $n$ copies of $\binom{d}{t}$ where $d$ is the average.
**Convexity of $f(x) = \binom{x}{t}$ on $[t - 1, \infty)$.** The binomial coefficient extends to real $x$ via
\begin{align*}
\binom{x}{t} = \frac{x(x-1) \cdots (x - t + 1)}{t!}.
\end{align*}
To check convexity it is cleanest to shift coordinates. Let $s = x - t + 1$, so $x = s + t - 1$, and
\begin{align*}
\binom{x}{t} = \frac{(s + t - 1)(s + t - 2) \cdots s}{t!} = \frac{1}{t!} \prod_{j=0}^{t-1}(s + j).
\end{align*}
Expanding this product, every coefficient (in powers of $s$) is a sum of products of non-negative integers from $\{0, 1, \ldots, t-1\}$, hence non-negative. So $f$, expressed in $s$, is a polynomial with non-negative coefficients. For $s \geq 0$ the second derivative $f''(x) = \frac{d^2 f}{ds^2}$ is also a polynomial with non-negative coefficients, hence non-negative — so $f$ is convex on $s \geq 0$, i.e. on $x \geq t - 1$.
**Applying Jensen.** Why was the reduction $\deg y \geq t - 1$ essential? Because Jensen's inequality for a convex $f$ requires the arguments to lie in the domain of convexity. Since every $\deg y \geq t - 1$, Jensen gives
\begin{align*}
\frac{1}{n} \sum_{y \in Y} \binom{\deg y}{t} \geq \binom{\overline{\deg y}}{t} = \binom{d}{t}
\end{align*}
where $d = \frac{1}{n} \sum_y \deg y = \frac{e(G)}{n}$ is the average degree in $Y$.
Multiplying by $n$ and chaining with the double-count bound,
\begin{align*}
n \binom{d}{t} \leq \sum_{y \in Y} \binom{\deg y}{t} \leq (t - 1) \binom{n}{t}.
\end{align*}
[/guided]
[/step]
[step:Estimate both binomials and solve for the average degree $d$]
We next replace both binomial coefficients by clean polynomial bounds. For the upper binomial,
\begin{align*}
\binom{n}{t} = \frac{n(n-1) \cdots (n - t + 1)}{t!} \leq \frac{n^t}{t!}.
\end{align*}
For the lower binomial, we use that for $d \geq t - 1$ (hence $d \geq t$ or $d - t \geq 0$, with the trivial case $d < t$ handled below),
\begin{align*}
\binom{d}{t} = \frac{d(d-1) \cdots (d - t + 1)}{t!} \geq \frac{(d-t+1)(d-t+1) \cdots (d-t+1)}{t!} \cdot \frac{d(d-1)\cdots(d-t+1)}{(d-t+1)^t},
\end{align*}
which is messy; we use instead the simpler and sufficient bound
\begin{align*}
\binom{d}{t} \geq \frac{(d - t)^t}{t!},
\end{align*}
valid whenever $d \geq t$. (For $d < t$ the desired final bound $d \leq t^{1/t} n^{1-1/t} + t$ already holds since $d < t \leq t^{1/t} n^{1-1/t} + t$ for $n \geq 1$, so we may assume $d \geq t$.) This inequality follows since each factor $d - j \geq d - t$ for $0 \leq j \leq t - 1$ when $d \geq t$.
Substituting into the chained inequality from the previous step,
\begin{align*}
n \cdot \frac{(d-t)^t}{t!} \leq (t - 1) \cdot \frac{n^t}{t!},
\end{align*}
so $n(d-t)^t \leq (t-1) n^t \leq t \cdot n^t$, giving
\begin{align*}
(d - t)^t \leq t \cdot n^{t-1}.
\end{align*}
Taking $t$-th roots,
\begin{align*}
d - t \leq t^{1/t} n^{1 - 1/t}, \qquad \text{i.e. } d \leq t^{1/t} n^{1-1/t} + t.
\end{align*}
[guided]
We have $n \binom{d}{t} \leq (t-1)\binom{n}{t}$ and want $d \leq t^{1/t} n^{1-1/t} + t$. The plan is to bound each binomial by a clean power — upper-bound $\binom{n}{t}$ by $n^t/t!$, lower-bound $\binom{d}{t}$ by $(d-t)^t/t!$ — then solve.
**Upper bound $\binom{n}{t} \leq n^t/t!$.** Each of the $t$ factors in the numerator is at most $n$, so $n(n-1)\cdots(n-t+1) \leq n^t$.
**Lower bound $\binom{d}{t} \geq (d-t)^t/t!$ when $d \geq t$.** Each of the $t$ factors $d, d-1, \ldots, d-t+1$ is at least $d - t + 1 > d - t$; more coarsely, each is at least $d - t$. So the product is at least $(d - t)^t$.
**Handling the case $d < t$.** We want to show $d \leq t^{1/t}n^{1-1/t} + t$. If $d < t$, this is automatic since $t^{1/t}n^{1-1/t} \geq 0$. So we may assume $d \geq t$, validating the lower bound.
**Combining.** The chained inequality becomes
\begin{align*}
n \cdot \frac{(d-t)^t}{t!} \leq (t-1) \cdot \frac{n^t}{t!}.
\end{align*}
The $t!$'s cancel. Using $t - 1 \leq t$ (this loses essentially nothing and gives the cleaner bound in the theorem statement):
\begin{align*}
n(d-t)^t \leq t \cdot n^t, \quad \text{so} \quad (d-t)^t \leq t n^{t-1}.
\end{align*}
Taking $t$-th roots (both sides non-negative):
\begin{align*}
d - t \leq t^{1/t} n^{1 - 1/t},
\end{align*}
giving $d \leq t^{1/t} n^{1-1/t} + t$.
[/guided]
[/step]
[step:Convert the average-degree bound into the edge bound and conclude]
Since $d = e(G)/n$, the bound $d \leq t^{1/t} n^{1-1/t} + t$ gives
\begin{align*}
e(G) = n d \leq n \left(t^{1/t} n^{1-1/t} + t\right) = t^{1/t} n^{2 - 1/t} + tn.
\end{align*}
This bound holds for every $K_{t,t}$-free bipartite graph with parts of size $n$, hence $Z(n, t) \leq t^{1/t} n^{2-1/t} + tn$, completing the proof.
[/step]