[proofplan]
We first prove the Engel invariant-vector lemma: if a finite-dimensional Lie algebra consists of nilpotent endomorphisms of a nonzero finite-dimensional vector space, then some nonzero vector is killed by every element of the Lie algebra. The proof of the lemma is by induction on the dimension of the Lie algebra and uses the fact that a nilpotent endomorphism $y$ has a nilpotent adjoint action $\operatorname{ad}_y$. Once a common zero vector $v_1$ is found, the line $Fv_1$ is invariant, the induced action on the quotient $V/Fv_1$ is again by nilpotent endomorphisms, and induction on $\dim_F V$ gives a basis of the quotient in which the induced action is strictly upper triangular. Lifting that basis to $V$ and prepending $v_1$ gives the desired strictly upper triangular form.
[/proofplan]
[step:Find a nonzero vector killed by the whole Lie algebra]
We prove the following auxiliary statement.
[claim:Engel invariant-vector lemma]
Let $W$ be a nonzero finite-dimensional [vector space](/page/Vector%20Space) over $F$, and let $\mathfrak a \subseteq \mathfrak{gl}(W)$ be a finite-dimensional Lie subalgebra. Suppose every $u \in \mathfrak a$ is nilpotent as an endomorphism $u: W \to W$. Then there exists $0 \neq w \in W$ such that $u(w) = 0$ for every $u \in \mathfrak a$.
[/claim]
[proof]
We argue by induction on $d = \dim_F \mathfrak a$.
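If $d = 0$, then $\mathfrak a = \{0\}$ and every nonzero $w \in W$ is killed by all elements of $\mathfrak a$; such a $w$ exists because $W \neq \{0\}$. If $d = 1$, choose a basis element $z$ of $\mathfrak a$, so that $\mathfrak a = Fz$. Since $z: W \to W$ is nilpotent and $W \neq \{0\}$, the map $z$ is not injective, so the kernel $\ker z$ is nonzero. Any $0 \neq w \in \ker z$ satisfies $(\lambda z)(w) = 0$ for every $\lambda \in F$ and is therefore killed by all elements of $\mathfrak a$.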
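Assume $d \geq 2$ and assume the lemma is known for every Lie algebra of dimension smaller than $d$ acting by nilpotent endomorphisms on any nonzero finite-dimensional vector space over $F$. Choose a proper Lie subalgebra $\mathfrak h \subsetneq \mathfrak a$ of maximal dimension; one exists because $\{0\}$ is a proper Lie subalgebra and $\mathfrak a$ is finite-dimensional. In particular, $\mathfrak h$ is a maximal proper Lie subalgebra of $\mathfrak a$.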
We first show that $\mathfrak h$ is an ideal of codimension $1$ in $\mathfrak a$. For each $y \in \mathfrak h$, define the adjoint endomorphism
\begin{align*}
\operatorname{ad}_y: \mathfrak a &\to \mathfrak a, \\
z &\mapsto [y,z].
\end{align*}
Since $y: W \to W$ is nilpotent, choose $r \in \mathbb N$ such that $y^r = 0$. Define
\begin{align*}
L_y: \operatorname{End}_F(W) &\to \operatorname{End}_F(W), &
T &\mapsto y \circ T, \\
R_y: \operatorname{End}_F(W) &\to \operatorname{End}_F(W), &
T &\mapsto T \circ y.
\end{align*}
The endomorphisms $L_y$ and $R_y$ commute, and the formula $T \mapsto [y,T] = (L_y - R_y)(T)$ extends $\operatorname{ad}_y$ to all of $\operatorname{End}_F(W)$. By the binomial theorem for commuting operators, for every $k \geq 2r - 1$ and every $T \in \operatorname{End}_F(W)$,
\begin{align*}
(\operatorname{ad}_y)^k(T)
=
\sum_{i=0}^{k} (-1)^i \binom{k}{i} y^{k-i} \circ T \circ y^i
=
0,
\end{align*}
because for each $i$ either $i \geq r$ or $k-i \geq r$, so one of the factors $y^i$, $y^{k-i}$ vanishes. Hence $\operatorname{ad}_y$ is nilpotent on $\operatorname{End}_F(W)$, and its restriction to the invariant subspace $\mathfrak a$ is nilpotent.
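As a worked instance of this bound (an illustration with values chosen here, not part of the proof), take $r = 2$, so that $y^2 = 0$ and $k = 2r - 1 = 3$. Expanding gives
\begin{align*}
(\operatorname{ad}_y)^3(T)
=
y^3 \circ T - 3\, y^2 \circ T \circ y + 3\, y \circ T \circ y^2 - T \circ y^3
=
0,
\end{align*}
since every term contains a factor $y^2$. The exponent cannot be lowered in general. Writing $E_{ij}$ for the matrix unit with a $1$ in position $(i,j)$ and zeros elsewhere (a notation introduced only for this example), take $W = F^2$, $y = E_{12}$, and $T = E_{21}$. Then
\begin{align*}
(\operatorname{ad}_y)^2(T)
=
y^2 \circ T - 2\, y \circ T \circ y + T \circ y^2
=
-2\, E_{12} E_{21} E_{12}
=
-2\, E_{12},
\end{align*}
which is nonzero whenever the characteristic of $F$ is not $2$.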
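Let $\mathfrak a/\mathfrak h$ denote the quotient vector space; it is nonzero because $\mathfrak h \neq \mathfrak a$. Since $\mathfrak h$ is a Lie subalgebra, $\operatorname{ad}_y(\mathfrak h) \subseteq \mathfrak h$ for every $y \in \mathfrak h$, so $\operatorname{ad}_y$ induces a well-defined [linear map](/page/Linear%20Map) $\rho_y: \mathfrak a/\mathfrak h \to \mathfrak a/\mathfrak h$, $z + \mathfrak h \mapsto [y,z] + \mathfrak h$. Each $\rho_y$ is nilpotent because $\operatorname{ad}_y$ is, and the Jacobi identity gives $\rho_{[y,y']} = [\rho_y,\rho_{y'}]$ for $y, y' \in \mathfrak h$. Hence $\{\rho_y : y \in \mathfrak h\}$ is a Lie subalgebra of $\mathfrak{gl}(\mathfrak a/\mathfrak h)$ of dimension at most $\dim_F \mathfrak h < d$ consisting of nilpotent endomorphisms; this is the natural action of $\mathfrak h$ on $\mathfrak a/\mathfrak h$. To state what the induction hypothesis yields for this action, define the normalizer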
\begin{align*}
N_{\mathfrak a}(\mathfrak h)
=
\{z \in \mathfrak a : [z,\mathfrak h] \subseteq \mathfrak h\}.
\end{align*}
Applying the induction hypothesis to the natural action of $\mathfrak h$ on the nonzero space $\mathfrak a/\mathfrak h$, which is by nilpotent maps because every $\operatorname{ad}_y$ is nilpotent, gives a nonzero coset $z + \mathfrak h$ killed by every $y \in \mathfrak h$. In other words, there is an element $z \in \mathfrak a \setminus \mathfrak h$ such that $[y,z] \in \mathfrak h$ for every $y \in \mathfrak h$, so $z \in N_{\mathfrak a}(\mathfrak h)$. The normalizer is a Lie subalgebra of $\mathfrak a$ by the Jacobi identity, and it contains $\mathfrak h$ because $\mathfrak h$ is a Lie subalgebra; hence $N_{\mathfrak a}(\mathfrak h)$ properly contains $\mathfrak h$. By maximality of $\mathfrak h$, we get $N_{\mathfrak a}(\mathfrak h) = \mathfrak a$, which means $[\mathfrak a,\mathfrak h] \subseteq \mathfrak h$. Therefore $\mathfrak h$ is an ideal of $\mathfrak a$.
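Since $\mathfrak h$ is an ideal, $\mathfrak a/\mathfrak h$ is a Lie algebra, and the preimage under the projection $\mathfrak a \to \mathfrak a/\mathfrak h$ of any Lie subalgebra of $\mathfrak a/\mathfrak h$ is a Lie subalgebra of $\mathfrak a$ containing $\mathfrak h$. If $\dim_F(\mathfrak a/\mathfrak h) \geq 2$, then any one-dimensional subspace of $\mathfrak a/\mathfrak h$ is a Lie subalgebra, because the bracket of an element with itself is zero, and its preimage is a Lie subalgebra lying strictly between $\mathfrak h$ and $\mathfrak a$. This contradicts the maximality of $\mathfrak h$. Hence $\dim_F(\mathfrak a/\mathfrak h) = 1$.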
Since $\dim_F \mathfrak h = d - 1 < d$, the induction hypothesis applied to $\mathfrak h \subseteq \mathfrak{gl}(W)$ shows that the subspace
\begin{align*}
W_0
=
\{w \in W : y(w) = 0 \text{ for every } y \in \mathfrak h\}
\end{align*}
is nonzero. Choose $z \in \mathfrak a \setminus \mathfrak h$, so that $\mathfrak a = \mathfrak h \oplus Fz$ as vector spaces. We show that $W_0$ is stable under $z$. If $w \in W_0$ and $y \in \mathfrak h$, then
\begin{align*}
y(z(w))
=
z(y(w)) + [y,z](w)
=
z(0) + 0
=
0,
\end{align*}
because $[y,z] \in \mathfrak h$ and every element of $\mathfrak h$ kills $w$. Therefore $z(w) \in W_0$.
The restriction $z|_{W_0}: W_0 \to W_0$ is nilpotent because $z: W \to W$ is nilpotent. Since $W_0 \neq \{0\}$, there exists $0 \neq w \in W_0$ with $z(w) = 0$. Every element $u \in \mathfrak a$ has the form $u = y + \lambda z$ for some $y \in \mathfrak h$ and $\lambda \in F$, so
\begin{align*}
u(w) = y(w) + \lambda z(w) = 0.
\end{align*}
This proves the lemma.
[/proof]
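The following small example, chosen here for illustration and not taken from the argument above, traces the inductive step. Assume $W = F^3$ with standard basis $(e_1,e_2,e_3)$, and write $E_{ij}$ for the matrix unit with $E_{ij}(e_j) = e_i$ and $E_{ij}(e_k) = 0$ for $k \neq j$. Let $\mathfrak a = \operatorname{span}_F\{E_{12},E_{13},E_{23}\}$ be the strictly upper triangular $3 \times 3$ matrices, and take $\mathfrak h = \operatorname{span}_F\{E_{13},E_{23}\}$ and $z = E_{12}$. The brackets
\begin{align*}
[E_{13},E_{23}] &= 0, &
[E_{12},E_{13}] &= 0, &
[E_{12},E_{23}] &= E_{13}
\end{align*}
show that $\mathfrak h$ is an ideal of codimension $1$ in $\mathfrak a$. For $w = c_1e_1 + c_2e_2 + c_3e_3$ we have $E_{13}(w) = c_3e_1$ and $E_{23}(w) = c_3e_2$, so
\begin{align*}
W_0 = \operatorname{span}_F\{e_1,e_2\}.
\end{align*}
This subspace is stable under $z$, since $z(e_1) = 0$ and $z(e_2) = e_1$, and the vector $w = e_1 \in \ker(z|_{W_0})$ is killed by every element of $\mathfrak a$.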
[/step]
[step:Start the induction on the dimension of the representation space]
We now prove the theorem by induction on $n = \dim_F V$.
If $n = 0$, the empty basis satisfies the conclusion. If $n = 1$, every nilpotent endomorphism of the one-dimensional vector space $V$ is the zero endomorphism, so any basis $(v_1)$ satisfies $x(v_1)=0$ for all $x \in \mathfrak g$.
Assume $n \geq 2$ and assume the theorem holds for all vector spaces over $F$ of dimension smaller than $n$. Applying the Engel invariant-vector lemma to $W = V$ and $\mathfrak a = \mathfrak g$, we obtain $0 \neq v_1 \in V$ such that
\begin{align*}
x(v_1)=0
\end{align*}
for every $x \in \mathfrak g$. Define the line
\begin{align*}
L = Fv_1 \subseteq V.
\end{align*}
Since every $x \in \mathfrak g$ kills $v_1$, we have $x(L) = \{0\} \subseteq L$, so $L$ is invariant under every $x \in \mathfrak g$.
[/step]
[step:Pass the nilpotent action to the quotient by the common zero line]
Let
\begin{align*}
\pi: V &\to V/L, \\
v &\mapsto v + L
\end{align*}
be the quotient map. For each $x \in \mathfrak g$, define the induced endomorphism
\begin{align*}
\bar{x}: V/L &\to V/L, \\
\pi(v) &\mapsto \pi(x(v)).
\end{align*}
This map is well-defined: if $\pi(v)=\pi(v')$, then $v-v' \in L$, and since $x(L) \subseteq L$, we have $x(v)-x(v') \in L$, so $\pi(x(v))=\pi(x(v'))$.
Define
\begin{align*}
\bar{\mathfrak g}
=
\{\bar{x} : x \in \mathfrak g\}
\subseteq \mathfrak{gl}(V/L).
\end{align*}
The assignment $x \mapsto \bar{x}$ is linear, and it is a Lie algebra homomorphism because, for all $x, y \in \mathfrak g$ and every $v \in V$,
\begin{align*}
[\bar{x},\bar{y}](\pi(v))
&=
\bar{x}(\pi(y(v))) - \bar{y}(\pi(x(v))) \\
&=
\pi(x(y(v))) - \pi(y(x(v))) \\
&=
\pi([x,y](v)) \\
&=
\overline{[x,y]}(\pi(v)).
\end{align*}
Hence $\bar{\mathfrak g}$ is a Lie subalgebra of $\mathfrak{gl}(V/L)$. If $x \in \mathfrak g$ and $x^m = 0$ for some $m \in \mathbb N$, then
\begin{align*}
\bar{x}^{\,m}(\pi(v)) = \pi(x^m(v)) = \pi(0)=0
\end{align*}
for every $v \in V$, so every element of $\bar{\mathfrak g}$ is nilpotent.
[/step]
[step:Lift a strictly upper triangular quotient basis to the original space]
Since $\dim_F(V/L)=n-1$, the induction hypothesis applied to $\bar{\mathfrak g} \subseteq \mathfrak{gl}(V/L)$ gives a basis
\begin{align*}
(\bar{v}_2,\dots,\bar{v}_n)
\end{align*}
of $V/L$ such that for every $\bar{x} \in \bar{\mathfrak g}$ and every $j \in \{2,\dots,n\}$,
\begin{align*}
\bar{x}(\bar{v}_j)
\in
\operatorname{span}_F\{\bar{v}_2,\dots,\bar{v}_{j-1}\},
\end{align*}
where the span is $\{0\}$ when $j=2$.
For each $j \in \{2,\dots,n\}$, choose $v_j \in V$ such that $\pi(v_j)=\bar{v}_j$. We claim that $(v_1,\dots,v_n)$ is a basis of $V$. If
\begin{align*}
a_1v_1 + a_2v_2 + \cdots + a_nv_n = 0
\end{align*}
with $a_1,\dots,a_n \in F$, applying $\pi$ gives
\begin{align*}
a_2\bar{v}_2 + \cdots + a_n\bar{v}_n = 0.
\end{align*}
Since $(\bar{v}_2,\dots,\bar{v}_n)$ is a basis of $V/L$, we get $a_2=\cdots=a_n=0$. Then $a_1v_1=0$, and since $v_1 \neq 0$, also $a_1=0$. Thus the vectors are linearly independent; there are $n=\dim_F V$ of them, so they form a basis of $V$.
[/step]
[step:Read the lifted containment as strict upper triangularity]
Let $x \in \mathfrak g$. For $j=1$, we already have
\begin{align*}
x(v_1)=0 \in \operatorname{span}_F\varnothing.
\end{align*}
For $j \in \{2,\dots,n\}$, we have $\pi(x(v_j)) = \bar{x}(\pi(v_j)) = \bar{x}(\bar{v}_j)$, so the quotient-basis property gives scalars $a_{2j},\dots,a_{j-1,j} \in F$, depending on $x$, such that
\begin{align*}
\pi(x(v_j))
=
\sum_{i=2}^{j-1} a_{ij}\bar{v}_i.
\end{align*}
Equivalently,
\begin{align*}
\pi\left(x(v_j)-\sum_{i=2}^{j-1} a_{ij}v_i\right)=0.
\end{align*}
The kernel of $\pi$ is $L=Fv_1$, so there exists $a_{1j} \in F$ such that
\begin{align*}
x(v_j)-\sum_{i=2}^{j-1} a_{ij}v_i = a_{1j}v_1.
\end{align*}
Therefore
\begin{align*}
x(v_j)
=
a_{1j}v_1 + \sum_{i=2}^{j-1} a_{ij}v_i
\in
\operatorname{span}_F\{v_1,\dots,v_{j-1}\}.
\end{align*}
Thus for every $x \in \mathfrak g$ and every $j \in \{1,\dots,n\}$, $x(v_j)$ lies in the span of the preceding basis vectors. With respect to the basis $(v_1,\dots,v_n)$ and the standard column convention for matrices of linear maps, this is exactly the statement that every $x \in \mathfrak g$ is represented by a strictly upper triangular matrix.
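A two-dimensional instance, with data chosen here purely for illustration, shows the construction end to end. Assume $V = F^2$ with standard basis $(e_1,e_2)$ and $\mathfrak g = Fx$, where
\begin{align*}
x = \begin{pmatrix} 1 & -1 \\ 1 & -1 \end{pmatrix},
\qquad
x^2 = 0.
\end{align*}
Since $x(c_1e_1 + c_2e_2) = (c_1 - c_2)(e_1 + e_2)$, the common zero vector can be taken to be $v_1 = e_1 + e_2$. The quotient $V/L$ is one-dimensional, so $\bar{x} = 0$, and any lift of a nonzero class works, for example $v_2 = e_1$. Then
\begin{align*}
x(v_1) = 0,
\qquad
x(v_2) = e_1 + e_2 = v_1,
\end{align*}
so the matrix of $x$ with respect to the basis $(v_1,v_2)$ is
\begin{align*}
\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix},
\end{align*}
which is strictly upper triangular.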
[/step]