[proofplan]
We build a straight-line graph whose vertices are the points of $P$ and whose edges join consecutive points on each line of $L$. The number of edges is controlled from below by the number of incidences, while the number of crossings is controlled from above by the number of pairs of lines. A self-contained crossing-number estimate then converts the crossing bound into the desired incidence estimate. The small-edge case gives the linear terms, and the large-edge case gives the mixed term $|P|^{2/3}|L|^{2/3}$.
[/proofplan]
[step:Discard lines with at most one incident point]
Let
\begin{align*}
m := |P|, \qquad n := |L|, \qquad I := I(P,L).
\end{align*}
If $m = 0$ or $n = 0$, then $I = 0$ and the estimate holds. Hence assume $m,n \geq 1$.
For each line $\ell \in L$, define
\begin{align*}
P_\ell := P \cap \ell, \qquad k_\ell := |P_\ell|.
\end{align*}
Let
\begin{align*}
L' := \{\ell \in L : k_\ell \geq 2\}
\end{align*}
and let $n' := |L'|$. Lines in $L \setminus L'$ contribute at most one incidence each, so
\begin{align*}
I(P,L') \geq I - n.
\end{align*}
It is therefore enough to estimate $I(P,L')$, since adding $n$ is absorbed by the final $|L|$ term.
[/step]
[step:Construct the incidence graph from consecutive points on each line]
For each $\ell \in L'$, order the finite set $P_\ell$ along the line $\ell$. If
\begin{align*}
P_\ell = \{p_{\ell,1},\dots,p_{\ell,k_\ell}\}
\end{align*}
in this linear order, draw the closed line segment joining $p_{\ell,j}$ to $p_{\ell,j+1}$ for each $1 \leq j \leq k_\ell - 1$.
Let $G$ be the graph whose vertex set is $P$ and whose edge set consists of these consecutive-pair segments. Thus
\begin{align*}
V(G) = P, \qquad e := |E(G)| = \sum_{\ell \in L'}(k_\ell - 1).
\end{align*}
Since $\sum_{\ell \in L'} k_\ell = I(P,L')$, we have
\begin{align*}
e = I(P,L') - n' \geq I(P,L') - n \geq I - 2n.
\end{align*}
The graph $G$ is simple. Indeed, if two distinct vertices $p,q \in P$ are joined by an edge, then the edge lies on the unique affine line through $p$ and $q$; hence no second line in $L'$ can produce another edge with the same endpoints.
[guided]
We convert incidences into graph edges by using only consecutive points on each line. This avoids drawing many overlapping segments on the same line.
For a fixed line $\ell \in L'$, the set $P_\ell = P \cap \ell$ is finite and has cardinality $k_\ell \geq 2$. Ordering these points along $\ell$ gives
\begin{align*}
P_\ell = \{p_{\ell,1},\dots,p_{\ell,k_\ell}\}.
\end{align*}
We draw exactly the $k_\ell - 1$ segments
\begin{align*}
[p_{\ell,1},p_{\ell,2}],\ [p_{\ell,2},p_{\ell,3}],\dots,[p_{\ell,k_\ell-1},p_{\ell,k_\ell}].
\end{align*}
Thus each line with $k_\ell$ incident points contributes $k_\ell - 1$ graph edges. Summing over all retained lines gives
\begin{align*}
e = \sum_{\ell \in L'}(k_\ell - 1)
= \sum_{\ell \in L'} k_\ell - |L'|
= I(P,L') - n'.
\end{align*}
Because $n' \leq n$ and $I(P,L') \geq I-n$, this implies
\begin{align*}
e \geq I - 2n.
\end{align*}
The graph is simple because two distinct points in $\mathbb{R}^2$ determine exactly one affine line. Therefore a fixed unordered pair $\{p,q\}$ cannot be produced as an edge from two different lines of $L'$.
[/guided]
[/step]
[step:Bound the number of graph crossings by pairs of lines]
Let $\operatorname{cr}(G)$ denote the number of crossings between interiors of distinct drawn edges of $G$. Edges lying on the same line $\ell \in L'$ have disjoint interiors, because only consecutive points on $\ell$ were connected. Edges lying on two distinct lines $\ell_1,\ell_2 \in L'$ can have at most one common interior point, since two distinct affine lines in $\mathbb{R}^2$ meet in at most one point.
Therefore each unordered pair of distinct lines in $L'$ contributes at most one crossing, and hence
\begin{align*}
\operatorname{cr}(G) \leq \binom{n'}{2} \leq \frac{n^2}{2}.
\end{align*}
[guided]
We now control the geometric complexity of the drawing. The quantity $\operatorname{cr}(G)$ denotes the number of intersection points between interiors of distinct drawn edges of $G$; intersections at common endpoints are not counted as crossings.
First consider two edges coming from the same line $\ell \in L'$. Because we connected only consecutive points of $P_\ell$ in the order along $\ell$, the interiors of those consecutive-pair segments are disjoint. Thus same-line edges contribute no crossings.
Next consider edges lying on two distinct lines $\ell_1,\ell_2 \in L'$. Two distinct affine lines in $\mathbb{R}^2$ meet in at most one point. Therefore all edges supported on $\ell_1$ and all edges supported on $\ell_2$ can create at most one interior-intersection crossing in total. Since there are $\binom{n'}{2}$ unordered pairs of distinct lines in $L'$, we obtain
\begin{align*}
\operatorname{cr}(G) \leq \binom{n'}{2}.
\end{align*}
Finally $n' \leq n$, so
\begin{align*}
\operatorname{cr}(G) \leq \binom{n'}{2} \leq \frac{n^2}{2}.
\end{align*}
[/guided]
[/step]
[step:Prove the crossing-number estimate needed for the graph]
[claim:Crossing-number estimate]
Let $H$ be a simple graph drawn in the plane with $v \geq 1$ vertices and $a$ edges, in a drawing where every crossing counted by $\operatorname{cr}(H)$ is an intersection point of the interiors of two edges with four distinct endpoints. If $a \geq 4v$, then
\begin{align*}
\operatorname{cr}(H) \geq \frac{a^3}{64v^2}.
\end{align*}
[/claim]
[proof]
First, any simple drawing satisfies
\begin{align*}
a \leq 3v + \operatorname{cr}(H).
\end{align*}
Indeed, remove one edge from each crossing. After at most $\operatorname{cr}(H)$ removals, the remaining graph is planar and simple, with $v$ vertices and at least $a-\operatorname{cr}(H)$ edges. We use the standard planar graph edge bound: every simple planar graph with $v \geq 3$ vertices has at most $3v-6$ edges, and hence at most $3v$ edges; for $v \leq 2$, the weaker bound $a \leq 3v$ is immediate for a simple graph. Hence
\begin{align*}
a-\operatorname{cr}(H) \leq 3v,
\end{align*}
which gives the displayed inequality.
Now choose each vertex of $H$ independently with probability
\begin{align*}
p := \frac{4v}{a}.
\end{align*}
Since $a \geq 4v$, we have $0 < p \leq 1$. Let $H_p$ be the induced random subgraph. Define the random variables $V_p$, $A_p$, and $X_p$ to be the number of vertices, edges, and crossings of $H_p$, respectively. Linearity of expectation gives
\begin{align*}
\mathbb{E}[V_p] &= pv,\\
\mathbb{E}[A_p] &= p^2 a,\\
\mathbb{E}[X_p] &= p^4 \operatorname{cr}(H),
\end{align*}
because an edge survives exactly when both its endpoints are chosen, and a crossing between two edges survives exactly when the four endpoints involved are chosen. By the drawing assumption in the claim, every crossing counted by $\operatorname{cr}(H)$ is an interior intersection of two edges with four distinct endpoints, so such a crossing survives exactly when those four endpoints are chosen.
Applying $A_p \leq 3V_p + X_p$ to each realized induced drawing and taking expectations yields
\begin{align*}
p^2 a \leq 3pv + p^4 \operatorname{cr}(H).
\end{align*}
Rearranging,
\begin{align*}
\operatorname{cr}(H)
\geq \frac{p^2a - 3pv}{p^4}.
\end{align*}
With $p = 4v/a$,
\begin{align*}
p^2a - 3pv
= \frac{16v^2}{a} - \frac{12v^2}{a}
= \frac{4v^2}{a}.
\end{align*}
Also,
\begin{align*}
p^4 = \frac{256v^4}{a^4}.
\end{align*}
Thus
\begin{align*}
\operatorname{cr}(H)
\geq \frac{4v^2/a}{256v^4/a^4}
= \frac{a^3}{64v^2}.
\end{align*}
This proves the claim.
[/proof]
[guided]
We prove the crossing-number estimate by the standard random-sampling argument, but we first record the deterministic inequality on which it rests. Let $H$ be a simple graph with $v \geq 1$ vertices and $a$ edges, drawn so that every crossing counted by $\operatorname{cr}(H)$ is an interior intersection of two edges with four distinct endpoints.
First we show
\begin{align*}
a \leq 3v + \operatorname{cr}(H).
\end{align*}
Remove one edge from the drawing for each crossing point. This requires at most $\operatorname{cr}(H)$ edge removals, and the remaining drawing is planar. The remaining graph is still simple, has the same $v$ vertices, and has at least $a-\operatorname{cr}(H)$ edges. We use the standard planar graph edge bound: every simple planar graph with $v \geq 3$ vertices has at most $3v-6$ edges, and hence at most $3v$ edges; when $v \leq 2$, the weaker bound $a \leq 3v$ is immediate for a simple graph. Therefore
\begin{align*}
a-\operatorname{cr}(H) \leq 3v,
\end{align*}
which gives
\begin{align*}
a \leq 3v + \operatorname{cr}(H).
\end{align*}
Now assume $a \geq 4v$ and choose each vertex independently with probability
\begin{align*}
p := \frac{4v}{a}.
\end{align*}
The hypothesis $a \geq 4v$ gives $0 < p \leq 1$, so this is a valid probability. Let $H_p$ be the induced random subgraph on the chosen vertices. Define $V_p$, $A_p$, and $X_p$ to be the random variables counting the vertices, edges, and crossings of $H_p$, respectively.
Linearity of expectation gives
\begin{align*}
\mathbb{E}[V_p] &= pv,\\
\mathbb{E}[A_p] &= p^2 a,\\
\mathbb{E}[X_p] &= p^4 \operatorname{cr}(H).
\end{align*}
The first identity holds because each vertex is chosen with probability $p$. The second holds because an edge survives exactly when both of its endpoints are chosen, an event of probability $p^2$. The third holds because each counted crossing has four distinct endpoints by the drawing assumption, and it survives exactly when all four of those endpoints are chosen, an event of probability $p^4$.
For every realized induced drawing, the deterministic inequality just proved gives
\begin{align*}
A_p \leq 3V_p + X_p.
\end{align*}
Taking expectations and substituting the three expectation identities yields
\begin{align*}
p^2 a \leq 3pv + p^4 \operatorname{cr}(H).
\end{align*}
Rearranging gives
\begin{align*}
\operatorname{cr}(H)
\geq \frac{p^2a - 3pv}{p^4}.
\end{align*}
With $p = 4v/a$, the numerator is
\begin{align*}
p^2a - 3pv
= \frac{16v^2}{a} - \frac{12v^2}{a}
= \frac{4v^2}{a},
\end{align*}
and the denominator is
\begin{align*}
p^4 = \frac{256v^4}{a^4}.
\end{align*}
Therefore
\begin{align*}
\operatorname{cr}(H)
\geq \frac{4v^2/a}{256v^4/a^4}
= \frac{a^3}{64v^2}.
\end{align*}
This proves the crossing-number estimate.
[/guided]
[/step]
[step:Apply the crossing estimate in the large-edge case]
Assume first that
\begin{align*}
e \geq 4m.
\end{align*}
The crossing-number estimate applied to $G$ gives
\begin{align*}
\frac{e^3}{64m^2} \leq \operatorname{cr}(G).
\end{align*}
Using the crossing bound from the previous step,
\begin{align*}
\frac{e^3}{64m^2} \leq \frac{n^2}{2}.
\end{align*}
Hence
\begin{align*}
e^3 \leq 32m^2n^2,
\end{align*}
and therefore
\begin{align*}
e \leq 32^{1/3}m^{2/3}n^{2/3}.
\end{align*}
Since $I \leq e + 2n$, we obtain
\begin{align*}
I \leq 32^{1/3}m^{2/3}n^{2/3} + 2n.
\end{align*}
[guided]
We now handle the case where the graph has many edges relative to its vertices. Assume
\begin{align*}
e \geq 4m.
\end{align*}
The graph $G$ is simple by the construction step, has $m$ vertices and $e$ edges, and its crossings are interior crossings between edges supported on two distinct lines. Such crossing edges have four distinct endpoints, because if two straight-line segments share an endpoint then their intersection at that endpoint is not an interior crossing. Therefore the crossing-number estimate applies to $G$ and gives
\begin{align*}
\frac{e^3}{64m^2} \leq \operatorname{cr}(G).
\end{align*}
The previous crossing bound gives
\begin{align*}
\operatorname{cr}(G) \leq \frac{n^2}{2}.
\end{align*}
Combining the two inequalities yields
\begin{align*}
\frac{e^3}{64m^2} \leq \frac{n^2}{2}.
\end{align*}
Multiplying by $64m^2$ gives
\begin{align*}
e^3 \leq 32m^2n^2,
\end{align*}
and taking cube roots gives
\begin{align*}
e \leq 32^{1/3}m^{2/3}n^{2/3}.
\end{align*}
Finally, from the edge-count construction we have $e \geq I-2n$, equivalently $I \leq e+2n$. Hence
\begin{align*}
I \leq 32^{1/3}m^{2/3}n^{2/3} + 2n.
\end{align*}
[/guided]
[/step]
[step:Handle the small-edge case and combine the estimates]
If
\begin{align*}
e < 4m,
\end{align*}
then $I \leq e + 2n$ gives
\begin{align*}
I < 4m + 2n.
\end{align*}
Combining this with the large-edge estimate, we have in all cases
\begin{align*}
I \leq 32^{1/3}m^{2/3}n^{2/3} + 4m + 2n.
\end{align*}
Therefore the theorem holds with the absolute constant
\begin{align*}
C := 4,
\end{align*}
since $32^{1/3} < 4$ and $2 \leq 4$. Recalling $m = |P|$ and $n = |L|$, this is exactly
\begin{align*}
I(P,L) \leq C\left(|P|^{2/3}|L|^{2/3} + |P| + |L|\right).
\end{align*}
[guided]
It remains to cover the complementary case and then choose one absolute constant valid in both cases. Suppose
\begin{align*}
e < 4m.
\end{align*}
From the construction step we know $e \geq I-2n$, so $I \leq e+2n$. Therefore
\begin{align*}
I < 4m + 2n.
\end{align*}
In the large-edge case proved in the preceding step, we obtained
\begin{align*}
I \leq 32^{1/3}m^{2/3}n^{2/3} + 2n.
\end{align*}
Thus, in every case,
\begin{align*}
I \leq 32^{1/3}m^{2/3}n^{2/3} + 4m + 2n.
\end{align*}
Choose the absolute constant
\begin{align*}
C := 4.
\end{align*}
Since $32^{1/3} < 4$ and $2 \leq 4$, the preceding inequality implies
\begin{align*}
I \leq C\left(m^{2/3}n^{2/3}+m+n\right).
\end{align*}
Finally, the quantities were defined at the start as $m = |P|$, $n = |L|$, and $I = I(P,L)$. Substituting these definitions gives
\begin{align*}
I(P,L) \leq C\left(|P|^{2/3}|L|^{2/3} + |P| + |L|\right),
\end{align*}
which is the asserted Szemerédi-Trotter bound.
[/guided]
[/step]