[proofplan]
We prove the result by induction on the dimension of the Lie algebra. The main structural point is that a maximal proper Lie subalgebra can be shown to be an ideal of codimension one by applying the induction hypothesis to its induced action on the corresponding quotient. The induction hypothesis then gives a non-zero common kernel for that ideal inside the original representation space. Stability of this kernel under a complementary direction, together with nilpotence in that direction, produces a non-zero vector killed by the whole Lie algebra.
[/proofplan]
[step:Begin the induction on the dimension of the Lie algebra]
Let $n := \dim_F \mathfrak g$. Since $V$ is finite-dimensional and $\mathfrak g \subseteq \mathfrak{gl}(V)$, the [vector space](/page/Vector%20Space) $\mathfrak g$ is finite-dimensional.
We argue by induction on $n$. If $n = 0$, then $\mathfrak g = \{0\}$, and any non-zero vector $v \in V$ (one exists because $V \ne \{0\}$) satisfies $x(v) = 0$ for every $x \in \mathfrak g$.
Assume now that $n > 0$ and that the theorem holds for every Lie subalgebra of $\mathfrak{gl}(U)$ that consists of nilpotent endomorphisms and has dimension strictly less than $n$, where $U$ is any non-zero finite-dimensional vector space over $F$.
[/step]
[step:Show that nilpotence on $V$ implies nilpotence of the adjoint action]
For $a \in \mathfrak g$, define the adjoint map
\begin{align*}
\operatorname{ad}_a: \mathfrak g &\to \mathfrak g \\
y &\mapsto [a,y].
\end{align*}
We claim that $\operatorname{ad}_a$ is nilpotent for every $a \in \mathfrak g$.
Let $A: \mathfrak{gl}(V) \to \mathfrak{gl}(V)$ and $B: \mathfrak{gl}(V) \to \mathfrak{gl}(V)$ be the linear maps
\begin{align*}
A(T) &= aT, \\
B(T) &= Ta.
\end{align*}
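Then $A$ and $B$ commute, and $A - B$ is the map $T \mapsto aT - Ta$ on $\mathfrak{gl}(V)$, whose restriction to $\mathfrak g$ is $\operatorname{ad}_a$. Since $a: V \to V$ is nilpotent, choose an integer $m \ge 1$ such that $a^m = 0$. Then $A^m(T) = a^mT = 0$ and $B^m(T) = Ta^m = 0$ for every $T \in \mathfrak{gl}(V)$, so $A^m = 0$ and $B^m = 0$. By the binomial formula for the commuting endomorphisms $A$ and $B$,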
\begin{align*}
(A-B)^{2m-1}
=
\sum_{j=0}^{2m-1} (-1)^j \binom{2m-1}{j} A^{2m-1-j}B^j.
\end{align*}
For each index $j$, either $j \ge m$ or $2m-1-j \ge m$, so every summand is zero. Hence $(A-B)^{2m-1}=0$ on $\mathfrak{gl}(V)$. Restricting to the invariant subspace $\mathfrak g \subseteq \mathfrak{gl}(V)$ gives $(\operatorname{ad}_a)^{2m-1}=0$ on $\mathfrak g$.
[guided]
Fix $a \in \mathfrak g$. We need nilpotence not only of $a$ acting on $V$, but also of the commutator map $y \mapsto [a,y]$ acting on the Lie algebra. Define two linear maps on the associative algebra $\mathfrak{gl}(V)$:
\begin{align*}
A: \mathfrak{gl}(V) &\to \mathfrak{gl}(V), & A(T) &= aT, \\
B: \mathfrak{gl}(V) &\to \mathfrak{gl}(V), & B(T) &= Ta.
\end{align*}
These maps commute because
\begin{align*}
A(B(T)) = a(Ta) = (aT)a = B(A(T)).
\end{align*}
Moreover, for every $T\in\mathfrak{gl}(V)$ we have $[a,T] = aT - Ta = (A-B)(T)$, so $A-B$ agrees with $\operatorname{ad}_a$ on $\mathfrak g$.
Since $a$ is nilpotent on $V$, there exists $m \in \mathbb N$ such that $a^m = 0$. Therefore left multiplication by $a$ satisfies $A^m=0$, and right multiplication by $a$ satisfies $B^m=0$. Because $A$ and $B$ commute, the ordinary binomial expansion applies:
\begin{align*}
(A-B)^{2m-1}
=
\sum_{j=0}^{2m-1} (-1)^j \binom{2m-1}{j} A^{2m-1-j}B^j.
\end{align*}
For every $j$, at least one of the exponents $j$ and $2m-1-j$ is at least $m$. Thus each term contains either $A^m$ or $B^m$ as a factor, hence is zero. Consequently $(A-B)^{2m-1}=0$ on $\mathfrak{gl}(V)$.
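Finally, because $\mathfrak g$ is a Lie subalgebra, $[a,y]\in \mathfrak g$ for every $y\in\mathfrak g$, so $\mathfrak g$ is invariant under $A-B$. Restricting the identity $(A-B)^{2m-1}=0$ to $\mathfrak g$ gives $(\operatorname{ad}_a)^{2m-1}=0$, so $\operatorname{ad}_a$ is a nilpotent endomorphism of $\mathfrak g$.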
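As an illustration that is not needed for the proof, take $V=F^2$ and $a=E_{12}$, where $E_{ij}$ denotes the matrix with entry $1$ in position $(i,j)$ and $0$ elsewhere (notation introduced only for this example). Then $a^2=0$, so $m=2$ and the argument above predicts $(A-B)^3=0$. Applying the map $A-B$, that is $T\mapsto aT-Ta$, repeatedly to $T=E_{21}$ gives
\begin{align*}
(A-B)(E_{21}) &= E_{12}E_{21}-E_{21}E_{12} = E_{11}-E_{22}, \\
(A-B)^2(E_{21}) &= E_{12}(E_{11}-E_{22})-(E_{11}-E_{22})E_{12} = -2E_{12}, \\
(A-B)^3(E_{21}) &= -2(E_{12}E_{12}-E_{12}E_{12}) = 0.
\end{align*}
If the characteristic of $F$ is not $2$, then $(A-B)^2\ne 0$, so in this example the exponent $2m-1=3$ cannot be lowered.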
[/guided]
[/step]
[step:Find a maximal proper subalgebra that is an ideal of codimension one]
If $n=1$, set $\mathfrak h := \{0\}$. Then $\mathfrak h$ is an ideal of $\mathfrak g$ and $\dim_F(\mathfrak g/\mathfrak h)=1$.
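Assume now that $n>1$. Choose a proper Lie subalgebra $\mathfrak h \subsetneq \mathfrak g$ maximal among proper Lie subalgebras. Such a subalgebra exists because $\{0\}$ is a proper Lie subalgebra and $\mathfrak g$ is finite-dimensional: any proper Lie subalgebra of largest possible dimension is maximal.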
Define the normaliser of $\mathfrak h$ in $\mathfrak g$ by
\begin{align*}
N_{\mathfrak g}(\mathfrak h)
:=
\{y \in \mathfrak g : [y,h]\in \mathfrak h \text{ for every } h \in \mathfrak h\}.
\end{align*}
We show that $N_{\mathfrak g}(\mathfrak h) \ne \mathfrak h$.
Let
\begin{align*}
\rho: \mathfrak h &\to \mathfrak{gl}(\mathfrak g/\mathfrak h) \\
a &\mapsto \rho(a),
\end{align*}
where
\begin{align*}
\rho(a)(y+\mathfrak h) := [a,y]+\mathfrak h
\end{align*}
for $a \in \mathfrak h$ and $y \in \mathfrak g$. This map is well-defined because if $y-y' \in \mathfrak h$, then $[a,y-y'] \in \mathfrak h$. It is a Lie algebra homomorphism: for $a,b\in\mathfrak h$ and $y\in\mathfrak g$, the commutator of endomorphisms satisfies
\begin{align*}
[\rho(a),\rho(b)](y+\mathfrak h)
&= [a,[b,y]]-[b,[a,y]]+\mathfrak h \\
&= [[a,b],y]+\mathfrak h \\
&= \rho([a,b])(y+\mathfrak h),
\end{align*}
where the second equality is the Jacobi identity. By the previous step, each $\rho(a)$ is nilpotent: it is induced by the nilpotent map $\operatorname{ad}_a$ on $\mathfrak g$, so $\rho(a)^k(y+\mathfrak h) = (\operatorname{ad}_a)^k(y)+\mathfrak h$ vanishes for $k$ large enough.
The vector space $\mathfrak g/\mathfrak h$ is non-zero and finite-dimensional. Also $\dim_F \rho(\mathfrak h) \le \dim_F \mathfrak h < n$. By the induction hypothesis applied to the Lie subalgebra $\rho(\mathfrak h)\subseteq \mathfrak{gl}(\mathfrak g/\mathfrak h)$, there exists a non-zero coset $y+\mathfrak h \in \mathfrak g/\mathfrak h$ such that
\begin{align*}
\rho(a)(y+\mathfrak h)=0
\end{align*}
for every $a\in\mathfrak h$. Hence $[a,y]\in\mathfrak h$ for every $a\in\mathfrak h$, and therefore $[y,a]\in\mathfrak h$ for every $a\in\mathfrak h$. Thus $y\in N_{\mathfrak g}(\mathfrak h)$. Since $y+\mathfrak h\ne 0$, we have $y\notin\mathfrak h$, so $N_{\mathfrak g}(\mathfrak h)\ne \mathfrak h$.
The normaliser $N_{\mathfrak g}(\mathfrak h)$ is a Lie subalgebra of $\mathfrak g$ containing $\mathfrak h$. Indeed, it is closed under scalar multiplication and addition by bilinearity of the Lie bracket. If $u,v\in N_{\mathfrak g}(\mathfrak h)$ and $h\in\mathfrak h$, then the Jacobi identity gives
\begin{align*}
[[u,v],h]
&= [u,[v,h]]-[v,[u,h]].
\end{align*}
Since $u$ and $v$ normalise $\mathfrak h$, we have $[v,h]\in\mathfrak h$ and $[u,h]\in\mathfrak h$, and then $[u,[v,h]]\in\mathfrak h$ and $[v,[u,h]]\in\mathfrak h$. Hence $[[u,v],h]\in\mathfrak h$, that is, $[u,v]\in N_{\mathfrak g}(\mathfrak h)$. Since $N_{\mathfrak g}(\mathfrak h)$ is a Lie subalgebra that strictly contains $\mathfrak h$, maximality of $\mathfrak h$ gives $N_{\mathfrak g}(\mathfrak h)=\mathfrak g$. Therefore $[y,h]\in\mathfrak h$ for every $y\in\mathfrak g$ and every $h\in\mathfrak h$, so $\mathfrak h$ is an ideal of $\mathfrak g$.
Since $\mathfrak h$ is an ideal, the quotient $\mathfrak g/\mathfrak h$ is a Lie algebra. If $\dim_F(\mathfrak g/\mathfrak h)>1$, then any non-zero $z\in\mathfrak g/\mathfrak h$ spans a one-dimensional proper Lie subalgebra $Fz$ (it is closed under the bracket because $[z,z]=0$), whose inverse image under the quotient map would be a proper Lie subalgebra of $\mathfrak g$ strictly containing $\mathfrak h$, contradicting maximality. Hence $\dim_F(\mathfrak g/\mathfrak h)=1$.
[guided]
The goal of this step is to isolate a large ideal $\mathfrak h$ so that $\mathfrak g$ differs from $\mathfrak h$ by only one direction. If $\dim_F\mathfrak g=1$, then $\mathfrak h=\{0\}$ has exactly this property. So assume $\dim_F\mathfrak g>1$ and choose a maximal proper Lie subalgebra $\mathfrak h\subsetneq\mathfrak g$.
We define
\begin{align*}
N_{\mathfrak g}(\mathfrak h)
:=
\{y \in \mathfrak g : [y,h]\in \mathfrak h \text{ for every } h \in \mathfrak h\}.
\end{align*}
This is the set of elements of $\mathfrak g$ whose adjoint action preserves $\mathfrak h$. We want to show that the normaliser is larger than $\mathfrak h$.
Consider the quotient vector space $\mathfrak g/\mathfrak h$. For each $a\in\mathfrak h$, define
\begin{align*}
\rho(a): \mathfrak g/\mathfrak h &\to \mathfrak g/\mathfrak h \\
y+\mathfrak h &\mapsto [a,y]+\mathfrak h.
\end{align*}
This is well-defined: if $y+\mathfrak h=y'+\mathfrak h$, then $y-y'\in\mathfrak h$, and since $\mathfrak h$ is a Lie subalgebra, $[a,y-y']\in\mathfrak h$. Hence $[a,y]+\mathfrak h=[a,y']+\mathfrak h$.
The map
\begin{align*}
\rho: \mathfrak h &\to \mathfrak{gl}(\mathfrak g/\mathfrak h) \\
a &\mapsto \rho(a)
\end{align*}
is a Lie algebra homomorphism. To verify the sign convention, fix $a,b\in\mathfrak h$ and $y\in\mathfrak g$. The commutator in $\mathfrak{gl}(\mathfrak g/\mathfrak h)$ gives
\begin{align*}
[\rho(a),\rho(b)](y+\mathfrak h)
&= \rho(a)([b,y]+\mathfrak h)-\rho(b)([a,y]+\mathfrak h) \\
&= [a,[b,y]]-[b,[a,y]]+\mathfrak h \\
&= [[a,b],y]+\mathfrak h \\
&= \rho([a,b])(y+\mathfrak h),
\end{align*}
where the third equality is the Jacobi identity. Thus $[\rho(a),\rho(b)]=\rho([a,b])$. Each $\rho(a)$ is nilpotent: indeed, $\rho(a)$ is induced by $\operatorname{ad}_a$, and the previous step proved that $\operatorname{ad}_a$ is nilpotent on $\mathfrak g$.
Now $\mathfrak g/\mathfrak h$ is a non-zero finite-dimensional vector space, and $\rho(\mathfrak h)$ is a Lie subalgebra of $\mathfrak{gl}(\mathfrak g/\mathfrak h)$ of dimension at most $\dim_F\mathfrak h<n$. Therefore the induction hypothesis applies to the action of $\rho(\mathfrak h)$ on $\mathfrak g/\mathfrak h$. It gives a non-zero coset $y+\mathfrak h$ such that
\begin{align*}
\rho(a)(y+\mathfrak h)=0
\end{align*}
for every $a\in\mathfrak h$. Unpacking the definition of $\rho$, this means
\begin{align*}
[a,y]\in\mathfrak h
\end{align*}
for every $a\in\mathfrak h$. Since $[y,a]=-[a,y]$, this is equivalent to $[y,a]\in\mathfrak h$ for every $a\in\mathfrak h$, so $y\in N_{\mathfrak g}(\mathfrak h)$. The coset $y+\mathfrak h$ is non-zero, hence $y\notin\mathfrak h$. Thus the normaliser is strictly larger than $\mathfrak h$.
The normaliser is itself a Lie subalgebra. Closure under scalar multiplication and addition follows from bilinearity of the bracket. For closure under the Lie bracket, take $u,v\in N_{\mathfrak g}(\mathfrak h)$ and $h\in\mathfrak h$. By the Jacobi identity,
\begin{align*}
[[u,v],h]
&= [u,[v,h]]-[v,[u,h]].
\end{align*}
Because $v$ normalises $\mathfrak h$, we have $[v,h]\in\mathfrak h$; because $u$ normalises $\mathfrak h$, this implies $[u,[v,h]]\in\mathfrak h$. Similarly, $[u,h]\in\mathfrak h$ and then $[v,[u,h]]\in\mathfrak h$. Since $\mathfrak h$ is a vector subspace, the difference also lies in $\mathfrak h$. Therefore $[[u,v],h]\in\mathfrak h$ for every $h\in\mathfrak h$, so $[u,v]\in N_{\mathfrak g}(\mathfrak h)$. Thus $N_{\mathfrak g}(\mathfrak h)$ is a Lie subalgebra. It contains $\mathfrak h$ and is not equal to $\mathfrak h$, so maximality of $\mathfrak h$ forces $N_{\mathfrak g}(\mathfrak h)=\mathfrak g$. This says exactly that $[y,h]\in\mathfrak h$ for all $y\in\mathfrak g$ and $h\in\mathfrak h$, which is the definition that $\mathfrak h$ is an ideal.
Finally, maximality gives codimension one. Since $\mathfrak h$ is an ideal, $\mathfrak g/\mathfrak h$ is a Lie algebra. If its dimension were larger than one, any non-zero element $z$ of it would span a one-dimensional proper Lie subalgebra $Fz$, which is closed under the bracket because $[z,z]=0$; taking its inverse image in $\mathfrak g$ would produce a proper Lie subalgebra strictly between $\mathfrak h$ and $\mathfrak g$, contradicting maximality. Hence
\begin{align*}
\dim_F(\mathfrak g/\mathfrak h)=1.
\end{align*}
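A small example may help; it is only an illustration under the stated choices, and $E_{ij}$ denotes the matrix with entry $1$ in position $(i,j)$ and $0$ elsewhere (notation introduced for this example). Let $\mathfrak g\subseteq\mathfrak{gl}(F^3)$ be the Lie algebra of strictly upper triangular matrices, with basis $E_{12},E_{23},E_{13}$, and let $\mathfrak h$ be the span of $E_{23}$ and $E_{13}$. The brackets of the basis elements are
\begin{align*}
[E_{12},E_{23}] &= E_{13}, & [E_{12},E_{13}] &= 0, & [E_{23},E_{13}] &= 0.
\end{align*}
So $\mathfrak h$ is a Lie subalgebra of dimension $2$, hence maximal among proper Lie subalgebras of the $3$-dimensional $\mathfrak g$. The quotient $\mathfrak g/\mathfrak h$ is spanned by $E_{12}+\mathfrak h$, and
\begin{align*}
\rho(E_{23})(E_{12}+\mathfrak h) &= [E_{23},E_{12}]+\mathfrak h = -E_{13}+\mathfrak h = 0, \\
\rho(E_{13})(E_{12}+\mathfrak h) &= [E_{13},E_{12}]+\mathfrak h = 0.
\end{align*}
Thus $E_{12}+\mathfrak h$ is a non-zero coset killed by $\rho(\mathfrak h)$, so $E_{12}\in N_{\mathfrak g}(\mathfrak h)\setminus\mathfrak h$, the normaliser is all of $\mathfrak g$, and $\mathfrak h$ is an ideal of codimension one, in agreement with the argument above.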
[/guided]
[/step]
[step:Use induction to obtain a non-zero vector killed by the ideal]
Define
\begin{align*}
W := \{v\in V : h(v)=0 \text{ for every } h\in\mathfrak h\}.
\end{align*}
Since $\dim_F\mathfrak h<n$ and every $h\in\mathfrak h$ is nilpotent on $V$, the induction hypothesis applied to $\mathfrak h\subseteq\mathfrak{gl}(V)$ gives $W\ne\{0\}$.
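For instance (an illustrative choice, not part of the proof), let $V=F^3$ with standard basis $e_1,e_2,e_3$, let $E_{ij}$ denote the matrix with entry $1$ in position $(i,j)$ and $0$ elsewhere, and let $\mathfrak h$ be the span of $E_{23}$ and $E_{13}$ inside the Lie algebra of strictly upper triangular $3\times 3$ matrices. For $v=v_1e_1+v_2e_2+v_3e_3$ one computes
\begin{align*}
E_{23}(v) &= v_3e_2, & E_{13}(v) &= v_3e_1,
\end{align*}
so in this example $W=\{v\in V : v_3=0\}$ is the span of $e_1$ and $e_2$, which is indeed non-zero.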
[/step]
[step:Prove that the common kernel of the ideal is stable under a complementary element]
Choose $x\in\mathfrak g\setminus\mathfrak h$. Since $\dim_F(\mathfrak g/\mathfrak h)=1$, every element of $\mathfrak g$ can be written as $a+\lambda x$ with $a\in\mathfrak h$ and $\lambda\in F$.
We claim that $W$ is stable under $x$. Let $w\in W$ and $h\in\mathfrak h$. Since $\mathfrak h$ is an ideal, $[h,x]\in\mathfrak h$. Therefore
\begin{align*}
h(x(w))
=
x(h(w)) + [h,x](w)
=
x(0)+0
=
0.
\end{align*}
Thus $x(w)\in W$, and the restriction
\begin{align*}
x|_W: W &\to W \\
w &\mapsto x(w)
\end{align*}
is a well-defined linear endomorphism of $W$.
[guided]
Pick an element $x\in\mathfrak g\setminus\mathfrak h$. Because $\mathfrak h$ has codimension one, the coset $x+\mathfrak h$ spans $\mathfrak g/\mathfrak h$. Equivalently, every element of $\mathfrak g$ has the form $a+\lambda x$ with $a\in\mathfrak h$ and $\lambda\in F$.
We now show that $x$ maps the common kernel of $\mathfrak h$ back into itself. Take $w\in W$. To prove $x(w)\in W$, we must prove that every $h\in\mathfrak h$ kills $x(w)$. Fix such an $h$. The commutator identity gives
\begin{align*}
[h,x](w)=h(x(w))-x(h(w)).
\end{align*}
Rearranging,
\begin{align*}
h(x(w))=x(h(w))+[h,x](w).
\end{align*}
Since $w\in W$, we have $h(w)=0$. Since $\mathfrak h$ is an ideal of $\mathfrak g$, the commutator $[h,x]$ lies in $\mathfrak h$, and therefore $[h,x](w)=0$ by the definition of $W$. Hence
\begin{align*}
h(x(w))=x(0)+0=0.
\end{align*}
This holds for every $h\in\mathfrak h$, so $x(w)\in W$. Thus the restriction
\begin{align*}
x|_W: W &\to W \\
w &\mapsto x(w)
\end{align*}
is a well-defined linear endomorphism.
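To see the computation in a concrete case (an illustration under assumed choices, with $E_{ij}$ the matrix with entry $1$ in position $(i,j)$ and $0$ elsewhere), let $\mathfrak g$ be the strictly upper triangular $3\times 3$ matrices acting on $V=F^3$ with standard basis $e_1,e_2,e_3$, let $\mathfrak h$ be the span of $E_{23}$ and $E_{13}$, and let $x=E_{12}$. Then $W$ is the span of $e_1$ and $e_2$. Taking $h=E_{23}$ and $w=e_2$, both sides of the rearranged identity can be checked directly:
\begin{align*}
[h,x] &= [E_{23},E_{12}] = -E_{13}, \\
h(x(w)) &= E_{23}(E_{12}(e_2)) = E_{23}(e_1) = 0, \\
x(h(w))+[h,x](w) &= E_{12}(E_{23}(e_2)) - E_{13}(e_2) = 0 - 0 = 0.
\end{align*}
Moreover $x(e_1)=0$ and $x(e_2)=e_1$, so $x$ maps $W$ into $W$ in this example.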
[/guided]
[/step]
[step:Choose a non-zero vector killed by the complementary element]
The endomorphism $x:V\to V$ is nilpotent by hypothesis, so its restriction $x|_W:W\to W$ is nilpotent. Since $W\ne\{0\}$ and a nilpotent endomorphism of a non-zero vector space is not injective, the kernel of $x|_W$ is non-zero. Choose $v\in W$ with $v\ne 0$ and
\begin{align*}
x(v)=0.
\end{align*}
Since $v\in W$, we have $h(v)=0$ for every $h\in\mathfrak h$. Now let $y\in\mathfrak g$ be arbitrary and write $y=h+\lambda x$ with $h\in\mathfrak h$ and $\lambda\in F$. Then
\begin{align*}
y(v)=h(v)+\lambda x(v)=0+\lambda 0=0.
\end{align*}
Thus $v\ne 0$ and $y(v)=0$ for every $y\in\mathfrak g$, which proves the theorem.
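As a sanity check in a concrete case (an illustration only, with $E_{ij}$ the matrix with entry $1$ in position $(i,j)$ and $0$ elsewhere), let $\mathfrak g$ be the strictly upper triangular $3\times 3$ matrices acting on $V=F^3$ with standard basis $e_1,e_2,e_3$, let $\mathfrak h$ be the span of $E_{23}$ and $E_{13}$, and let $x=E_{12}$. Here $W$ is the span of $e_1$ and $e_2$, and $x|_W$ sends $e_2\mapsto e_1$ and $e_1\mapsto 0$, so its kernel is spanned by $e_1$. Choosing $v=e_1$ and writing an arbitrary $y\in\mathfrak g$ as $y=\alpha E_{12}+\beta E_{23}+\gamma E_{13}$ with $\alpha,\beta,\gamma\in F$, one finds
\begin{align*}
y(e_1)=\alpha E_{12}(e_1)+\beta E_{23}(e_1)+\gamma E_{13}(e_1)=0,
\end{align*}
so $e_1$ is a non-zero vector killed by all of $\mathfrak g$ in this example.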
[/step]