[proofplan]
We first prove the common-eigenvector form of Lie's theorem: over an algebraically closed field of characteristic zero, every nonzero finite-dimensional module over a finite-dimensional solvable Lie algebra has a one-dimensional submodule. The proof is by induction on $\dim_F L$, using a codimension-one ideal and a trace computation to show that an eigenspace for the ideal is stable under the remaining generator. We then induct on $\dim_F V$: the common eigenvector spans the first line of the flag, and the induction hypothesis applied to the quotient gives the rest. Finally, we verify that complete invariant flags and simultaneous upper triangular forms are equivalent.
[/proofplan]
[step:Produce a one-dimensional submodule for every nonzero module]
We prove the following auxiliary statement.
[claim:Common eigenvector for solvable representations]
Let $F$ be algebraically closed of characteristic zero. Let $L$ be a finite-dimensional solvable Lie algebra over $F$, and let
\begin{align*}
\rho: L &\to \mathfrak{gl}_F(W)
\end{align*}
be a representation on a nonzero finite-dimensional $F$-[vector space](/page/Vector%20Space) $W$. Then there exist a nonzero vector $w \in W$ and a linear functional $\mu: L \to F$ such that
\begin{align*}
\rho(x)w = \mu(x)w
\end{align*}
for every $x \in L$.
[/claim]
[proof]
We argue by induction on $d = \dim_F L$.
If $d = 0$, then $L = 0$, and any nonzero $w \in W$ together with the zero functional $\mu: L \to F$ satisfies the claim. If $d = 1$, choose $x_0 \in L$ with $L = F x_0$. Since $F$ is algebraically closed and $W$ is nonzero and finite-dimensional, the operator $\rho(x_0): W \to W$ has an eigenvector $w \neq 0$ with eigenvalue $a \in F$. Define $\mu: L \to F$ by $\mu(c x_0) = ca$ for $c \in F$. Then $\rho(x)w = \mu(x)w$ for every $x \in L$.
Assume now that $d \geq 2$ and that the statement is known for all solvable Lie algebras of dimension strictly smaller than $d$. Since $L$ is solvable and nonzero, its derived algebra $[L,L]$ is a proper subspace of $L$, so $L/[L,L]$ is nonzero. Choose a codimension-one subspace $H \subset L/[L,L]$, and define $I \subset L$ to be the inverse image of $H$ under the quotient map $L \to L/[L,L]$. Then $I$ is an ideal of $L$, because $[L,I] \subset [L,L] \subset I$. Moreover, $\dim_F I = d - 1$, since $L/I \cong (L/[L,L])/H$ is one-dimensional, and $I$ is solvable because it is a Lie subalgebra of the solvable Lie algebra $L$.
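As an illustration of this choice, assume (for this example only) that $L$ has a basis $s, t$ with bracket $[s,t] = t$; the names $s$ and $t$ are not part of the argument. Then
\begin{align*}
[L,L] &= Ft, &
\dim_F L/[L,L] &= 1, &
H &= 0, &
I &= Ft.
\end{align*}
The subspace $I = Ft$ is an ideal because $[s,t] = t \in I$, and it has dimension $d - 1 = 1$. If instead $L$ is abelian, then $[L,L] = 0$ and every codimension-one subspace of $L$ is a valid choice of $I$.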
By the induction hypothesis applied to the $I$-module $W$, there exist a nonzero vector $v \in W$ and a linear functional $\lambda: I \to F$ such that
\begin{align*}
\rho(y)v = \lambda(y)v
\end{align*}
for every $y \in I$. Choose $z \in L \setminus I$, so that $L = I \oplus Fz$ as an $F$-vector space.
For each integer $k \geq 0$, define $v_k := \rho(z)^k v \in W$. Since $W$ is finite-dimensional and $v_0 = v \neq 0$, there is a smallest integer $m \geq 1$ such that $v_m \in \operatorname{span}_F\{v_0,\dots,v_{m-1}\}$. By minimality of $m$, the vectors $v_0,\dots,v_{m-1}$ are linearly independent. Define
\begin{align*}
S := \operatorname{span}_F\{v_0,\dots,v_{m-1}\} \subset W.
\end{align*}
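Since $v_0 = v \neq 0$, the subspace $S$ is nonzero. Moreover, $\rho(z)(S) \subset S$, because $\rho(z)v_k = v_{k+1}$ for $0 \leq k < m-1$ and $\rho(z)v_{m-1} = v_m \in S$.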
We next show that $\rho(y)(S) \subset S$ for every $y \in I$, and that the trace of $\rho(y)|_S$ is $m\lambda(y)$. We prove by induction on $k$ that
\begin{align*}
\rho(y)v_k - \lambda(y)v_k \in \operatorname{span}_F\{v_0,\dots,v_{k-1}\}
\end{align*}
for every $y \in I$ and every $0 \leq k \leq m-1$, with the convention that the span is $\{0\}$ when $k=0$. The case $k=0$ is the defining property of $v$. Suppose $1 \leq k \leq m-1$ and the assertion holds for $k-1$. The representation identity $\rho([y,z]) = \rho(y)\rho(z)-\rho(z)\rho(y)$ gives
\begin{align*}
\rho(y)v_k
&= \rho(y)\rho(z)v_{k-1} \\
&= \rho(z)\rho(y)v_{k-1} + \rho([y,z])v_{k-1}.
\end{align*}
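Since $I$ is an ideal, $[y,z] \in I$. By the assertion for $k-1$ applied to $y$, we have $\rho(y)v_{k-1} \in \lambda(y)v_{k-1} + \operatorname{span}_F\{v_0,\dots,v_{k-2}\}$, and applying $\rho(z)$ gives $\rho(z)\rho(y)v_{k-1} \in \lambda(y)v_k + \operatorname{span}_F\{v_1,\dots,v_{k-1}\}$. By the assertion for $k-1$ applied to $[y,z]$, we have $\rho([y,z])v_{k-1} \in \operatorname{span}_F\{v_0,\dots,v_{k-1}\}$. Hence the right-hand side belongs to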
\begin{align*}
\lambda(y)v_k + \operatorname{span}_F\{v_0,\dots,v_{k-1}\}.
\end{align*}
Thus the assertion holds for every $0 \leq k \leq m-1$. In particular, $\rho(y)(S) \subset S$, and in the ordered basis $(v_0,\dots,v_{m-1})$ of $S$, the operator $\rho(y)|_S$ is upper triangular with every diagonal entry equal to $\lambda(y)$. Hence
\begin{align*}
\operatorname{tr}(\rho(y)|_S) = m\lambda(y)
\end{align*}
for every $y \in I$.
Now fix $y \in I$. Since $I$ is an ideal, $[z,y] \in I$, and the preceding paragraph applies to $[z,y]$. Because $S$ is invariant under $\rho(z)$, $\rho(y)$, and $\rho([z,y])$, the representation identity restricted to $S$ gives
\begin{align*}
\rho([z,y])|_S
=
\rho(z)|_S \rho(y)|_S - \rho(y)|_S \rho(z)|_S.
\end{align*}
The trace of a commutator of endomorphisms of a finite-dimensional vector space is zero, so
\begin{align*}
0
=
\operatorname{tr}(\rho([z,y])|_S)
=
m\lambda([z,y]).
\end{align*}
Since $F$ has characteristic zero and $m \geq 1$, the scalar $m \cdot 1_F$ is nonzero. Therefore
\begin{align*}
\lambda([z,y]) = 0
\end{align*}
for every $y \in I$.
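The characteristic-zero hypothesis is essential at this step. As an illustration, assume (for this example only) that $F$ is algebraically closed of characteristic $2$, that $e_1, e_2$ is the standard basis of $F^2$, and that $L \subset \mathfrak{gl}_2(F)$ is spanned by the matrices $s$ and $t$ below; these names are not part of the argument. Then
\begin{align*}
s = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix},
\qquad
t = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix},
\qquad
[s,t] = st - ts = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} = t,
\end{align*}
where the last equality uses $-1 = 1$ in $F$. Hence $L$ is solvable with $[L,L] = Ft$. Taking $I = Ft$, $z = s$, and $v = e_1 + e_2$, we have $tv = v$, so $\lambda(t) = 1$, and
\begin{align*}
v_0 &= e_1 + e_2, &
v_1 &= s v_0 = e_2, &
v_2 &= s v_1 = e_2 = v_1,
\end{align*}
so $m = 2$ and $S = F^2$. The trace identity reads $0 = \operatorname{tr}([s,t]) = 2\lambda([s,t])$, which holds because $2 = 0$ in $F$, yet $\lambda([s,t]) = \lambda(t) = 1 \neq 0$. Indeed, every eigenvector of $t$ is a multiple of $e_1 + e_2$, and $s(e_1 + e_2) = e_2$ is not such a multiple, so $s$ and $t$ have no common eigenvector.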
Define the $\lambda$-eigenspace for the ideal $I$ by
\begin{align*}
E_\lambda := \{u \in W : \rho(y)u = \lambda(y)u \text{ for every } y \in I\}.
\end{align*}
This subspace is nonzero because $v \in E_\lambda$. We claim that $E_\lambda$ is invariant under $\rho(z)$. If $u \in E_\lambda$ and $y \in I$, then
\begin{align*}
\rho(y)\rho(z)u
&= \rho(z)\rho(y)u + \rho([y,z])u \\
&= \lambda(y)\rho(z)u + \lambda([y,z])u.
\end{align*}
Since $[y,z] = -[z,y]$, the trace computation gives $\lambda([y,z]) = 0$. Hence
\begin{align*}
\rho(y)\rho(z)u = \lambda(y)\rho(z)u
\end{align*}
for every $y \in I$, so $\rho(z)u \in E_\lambda$.
The vector space $E_\lambda$ is nonzero and finite-dimensional. Since $F$ is algebraically closed, the operator $\rho(z)|_{E_\lambda}: E_\lambda \to E_\lambda$ has an eigenvector $w \in E_\lambda$ with $w \neq 0$. Let $a \in F$ be its eigenvalue, so
\begin{align*}
\rho(z)w = aw.
\end{align*}
Define $\mu: L \to F$ by
\begin{align*}
\mu(y + cz) := \lambda(y) + ca
\end{align*}
for $y \in I$ and $c \in F$. Since $L = I \oplus Fz$, this is a well-defined linear functional. For $x = y + cz \in L$, we have
\begin{align*}
\rho(x)w
&= \rho(y)w + c\rho(z)w \\
&= \lambda(y)w + caw \\
&= \mu(x)w.
\end{align*}
This proves the claim.
[/proof]
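The construction in the proof can be traced in a small case. The following is a sketch under assumptions made only for this example: $W = F^3$ with standard basis $e_1, e_2, e_3$, the symbol $e_{ij}$ denotes the matrix unit with entry $1$ in position $(i,j)$, and $L \subset \mathfrak{gl}_3(F)$ is spanned by $s = e_{11} + e_{12}$ and $t = e_{13}$, with $\rho$ the inclusion. Then
\begin{align*}
st &= e_{13}, &
ts &= 0, &
[s,t] &= t,
\end{align*}
so $L$ is solvable with $[L,L] = Ft$, and one may take $I = Ft$ and $z = s$. Since $t e_2 = 0$, the vector $v = e_2$ is an eigenvector for $I$ with $\lambda = 0$, and
\begin{align*}
v_0 &= e_2, &
v_1 &= s e_2 = e_1, &
v_2 &= s e_1 = e_1 = v_1.
\end{align*}
Hence $m = 2$ and $S = \operatorname{span}_F\{e_1, e_2\}$. Here $\rho(t)|_S = 0$, so $\operatorname{tr}(\rho(t)|_S) = 0 = m\lambda(t)$. The eigenspace is $E_\lambda = \ker t = \operatorname{span}_F\{e_1, e_2\}$, which is stable under $s$. The vector $w = e_1$ satisfies $sw = w$, so the eigenvalue of $\rho(z)$ on $w$ is $1$, and
\begin{align*}
\mu(\beta t + \gamma s) &= \gamma, &
(\beta t + \gamma s)e_1 &= \gamma e_1
\end{align*}
for all $\beta, \gamma \in F$.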
[/step]
[step:Start the invariant flag with a common eigenline]
Let $n = \dim_F V$. We prove the existence of a complete $L$-invariant flag in $V$ by induction on $n$.
If $n = 0$, then $V = 0$, and the flag $V_0 = 0$ is complete. Suppose $n \geq 1$ and assume the result is known for all modules of dimension strictly smaller than $n$.
Apply the common-eigenvector claim to the nonzero $L$-module $V$. There exist $v_1 \in V \setminus \{0\}$ and a linear functional $\mu: L \to F$ such that
\begin{align*}
\rho(x)v_1 = \mu(x)v_1
\end{align*}
for every $x \in L$. Define
\begin{align*}
V_1 := Fv_1 \subset V.
\end{align*}
Then $\dim_F V_1 = 1$, and the displayed identity gives $\rho(x)(V_1) \subset V_1$ for every $x \in L$.
[/step]
[step:Lift a complete invariant flag from the quotient]
Since $V_1$ is $L$-invariant, the quotient vector space $V/V_1$ carries the quotient representation
\begin{align*}
\bar{\rho}: L &\to \mathfrak{gl}_F(V/V_1), \\
x &\mapsto \bar{\rho}(x),
\end{align*}
where
\begin{align*}
\bar{\rho}(x)(u + V_1) := \rho(x)u + V_1
\end{align*}
for $x \in L$ and $u \in V$. This is well-defined because $\rho(x)(V_1) \subset V_1$ for every $x \in L$.
The quotient $V/V_1$ has dimension $n-1$. By the induction hypothesis, there is a complete $L$-invariant flag
\begin{align*}
0 = \bar{V}_0 \subset \bar{V}_1 \subset \cdots \subset \bar{V}_{n-1} = V/V_1
\end{align*}
with $\dim_F \bar{V}_i = i$ and $\bar{\rho}(x)(\bar{V}_i) \subset \bar{V}_i$ for every $x \in L$.
Let
\begin{align*}
\pi: V &\to V/V_1
\end{align*}
be the quotient map, and define
\begin{align*}
V_i := \pi^{-1}(\bar{V}_{i-1})
\end{align*}
for $2 \leq i \leq n$. Since $V_1 = \ker \pi = \pi^{-1}(\bar{V}_0)$ and taking inverse images preserves inclusions, together with $V_0 := 0$ this gives a chain
\begin{align*}
0 = V_0 \subset V_1 \subset V_2 \subset \cdots \subset V_n = V.
\end{align*}
For $2 \leq i \leq n$, the map $\pi$ restricts to a surjection $V_i \to \bar{V}_{i-1}$ with kernel $V_1$, so the rank-nullity theorem gives
\begin{align*}
\dim_F V_i
=
\dim_F V_1 + \dim_F \bar{V}_{i-1}
=
1 + (i-1)
=
i.
\end{align*}
Finally, if $u \in V_i$ and $x \in L$, then $\pi(u) \in \bar{V}_{i-1}$, so
\begin{align*}
\pi(\rho(x)u)
=
\bar{\rho}(x)\pi(u)
\in \bar{V}_{i-1}.
\end{align*}
Thus $\rho(x)u \in \pi^{-1}(\bar{V}_{i-1}) = V_i$. Hence every $V_i$ is $L$-invariant, and the displayed chain is a complete $L$-invariant flag in $V$.
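To illustrate the lifting, assume (for this example only) that $V = F^3$ with standard basis $e_1, e_2, e_3$ and that $L$ is the Lie algebra of lower triangular matrices in $\mathfrak{gl}_3(F)$, which is solvable, acting by matrix multiplication. For $A = (A_{ij}) \in L$,
\begin{align*}
Ae_3 &= A_{33}e_3, &
Ae_2 &= A_{22}e_2 + A_{32}e_3,
\end{align*}
so $e_3$ is a common eigenvector and one may take $V_1 = Fe_3$. In $V/V_1$ the second identity becomes $\bar{\rho}(A)(e_2 + V_1) = A_{22}(e_2 + V_1)$, so $\bar{V}_1 := F(e_2 + V_1)$ is an invariant line that begins a complete invariant flag of the quotient. Lifting gives
\begin{align*}
V_2 &= \pi^{-1}(\bar{V}_1) = \operatorname{span}_F\{e_2, e_3\}, &
V_3 &= \pi^{-1}(V/V_1) = F^3,
\end{align*}
with dimensions $2$ and $3$, in agreement with the formula $\dim_F V_i = 1 + (i-1)$.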
[/step]
[step:Translate complete invariant flags into upper triangular matrices]
Assume first that
\begin{align*}
0 = V_0 \subset V_1 \subset \cdots \subset V_n = V
\end{align*}
is a complete $L$-invariant flag. Choose a basis $(v_1,\dots,v_n)$ of $V$ adapted to the flag, meaning
\begin{align*}
V_i = \operatorname{span}_F\{v_1,\dots,v_i\}
\end{align*}
for every $1 \leq i \leq n$. For any $x \in L$ and any $1 \leq j \leq n$, invariance gives
\begin{align*}
\rho(x)v_j \in \rho(x)(V_j) \subset V_j = \operatorname{span}_F\{v_1,\dots,v_j\}.
\end{align*}
Therefore the matrix of $\rho(x)$ in the basis $(v_1,\dots,v_n)$ has zero entries below the diagonal, so it is upper triangular.
Conversely, suppose that $(v_1,\dots,v_n)$ is a basis of $V$ such that every matrix of $\rho(x)$ in this basis is upper triangular. Define
\begin{align*}
V_i := \operatorname{span}_F\{v_1,\dots,v_i\}
\end{align*}
for $0 \leq i \leq n$, where $V_0 := 0$. Then $\dim_F V_i = i$, and upper triangularity means that $\rho(x)v_j \in V_j \subset V_i$ whenever $1 \leq j \leq i$. Hence $\rho(x)(V_i) \subset V_i$ for every $x \in L$ and every $i$. Thus the coordinate spans form a complete $L$-invariant flag.
This proves both formulations of the theorem.
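As a small check of the correspondence, assume (for this example only) that $V = F^2$ with standard basis $e_1, e_2$, that $e_{ij}$ denotes a matrix unit, and that $L \subset \mathfrak{gl}_2(F)$ is spanned by $h = e_{11} - e_{22}$ and $f = e_{21}$, so that $[h,f] = -2f$ and $L$ is solvable. The flag $0 \subset Fe_2 \subset F^2$ is $L$-invariant because $he_2 = -e_2$ and $fe_2 = 0$. In the adapted basis $(v_1, v_2) = (e_2, e_1)$,
\begin{align*}
hv_1 &= -v_1, &
hv_2 &= v_2, &
fv_1 &= 0, &
fv_2 &= v_1,
\end{align*}
so the matrices of $h$ and $f$ are
\begin{align*}
\begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix},
\qquad
\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix},
\end{align*}
both upper triangular. In the basis $(e_1, e_2)$ the matrix of $f$ is strictly lower triangular, which shows that the order of the adapted basis matters.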
[/step]