[proofplan]
The proof has two parts: existence and uniqueness. For existence, we induct on $\dim V$. The key step is to show that the image $\operatorname{im} N$ can be decomposed into cyclic subspaces by the inductive hypothesis applied to the restriction $N|_{\operatorname{im} N}$, and then each cyclic chain in $\operatorname{im} N$ can be extended by one vector to produce a longer chain in $V$. Any remaining kernel directions not covered by these extensions contribute chains of length $1$. For uniqueness, we derive a formula for the number of blocks of each size purely in terms of the rank sequence $\operatorname{rank} N^k$.
[/proofplan]
[step:Establish the base case and induction framework]
We proceed by strong induction on $n = \dim V$.
**Base case:** If $n = 0$ the decomposition is vacuous, and if $n = 1$ then $N = 0$ (since $N$ is nilpotent on a $1$-dimensional space), so $V = C_1$ with $C_1 = \operatorname{span}(v_1)$ for any non-zero $v_1 \in V$. This is a single Jordan chain of length $1$.
**Inductive hypothesis:** Assume the theorem holds for all nilpotent operators on vector spaces of dimension strictly less than $n$.
[guided]
We proceed by strong induction on $n = \dim V$. Why strong induction rather than ordinary induction? Because when we pass to $\operatorname{im} N$ in the inductive step, the dimension of $\operatorname{im} N$ may be much smaller than $n - 1$ (for instance, if $N^2 = 0$ then $\operatorname{im} N \subseteq \ker N$, so $\dim \operatorname{im} N \leq n/2$). We therefore need the inductive hypothesis for all dimensions strictly less than $n$, not just $n - 1$.
**Base case ($n = 0$):** The space $V = \{0\}$ has the empty decomposition — there are no cyclic subspaces and the direct sum is vacuous. The multiset of chain lengths is empty, which is uniquely determined.
**Base case ($n = 1$):** The space $V$ is one-dimensional. Since $N$ is nilpotent, $N^k = 0$ for some $k \geq 1$. But the only nilpotent $1 \times 1$ matrix is the zero matrix, so $N = 0$. Choose any non-zero vector $v_1 \in V$. Then $V = \operatorname{span}(v_1) = C_1$, a single cyclic subspace with chain length $m_1 = 1$. The chain is $\{v_1\}$ with $Nv_1 = 0$. The decomposition is $V = C_1$ with the single chain length $m_1 = 1$, which is unique since $\dim V = 1$ forces exactly one chain of length $1$.
**Inductive step ($n \geq 2$):** We now assume $n \geq 2$ and that the theorem holds for all nilpotent operators on spaces of dimension strictly less than $n$. The strategy is to pass to $W = \operatorname{im} N$, apply the inductive hypothesis there, and lift the resulting cyclic decomposition of $W$ back to $V$.
**Inductive hypothesis (formal):** For every vector space $U$ with $\dim U < n$ and every nilpotent operator $M: U \to U$, the space $U$ decomposes as a direct sum of cyclic $M$-invariant subspaces, and the multiset of chain lengths is uniquely determined by $M$.
[/guided]
[/step]
[step:Apply the inductive hypothesis to the restriction $N|_{\operatorname{im} N}$]
Define $W := \operatorname{im} N$. Since $N$ is nilpotent and non-zero (otherwise the result is immediate with $r = n$ chains of length $1$), we have $W \subsetneq V$, so $\dim W < n$.
The subspace $W$ is $N$-invariant: for any $w \in W$, write $w = Nu$ for some $u \in V$, then $Nw = N^2 u \in \operatorname{im} N = W$. Moreover, $N|_W: W \to W$ is nilpotent (since $N$ is). By the inductive hypothesis, $W$ admits a cyclic decomposition
\begin{align*}
W &= D_1 \oplus D_2 \oplus \cdots \oplus D_s,
\end{align*}
where $D_i = \operatorname{span}(w_i, Nw_i, \ldots, N^{d_i - 1} w_i)$ is a cyclic subspace with chain length $d_i$, and $d_1 \geq d_2 \geq \cdots \geq d_s$.
[guided]
The key observation is that the image $W = \operatorname{im} N$ is a strictly smaller $N$-invariant subspace of $V$, so the inductive hypothesis applies to it. Let us verify both claims.
**Why $W \subsetneq V$:** If $\operatorname{im} N = V$, then $N$ is surjective. On a finite-dimensional space, a surjective linear map is also injective (by the rank-nullity theorem: $\dim \ker N = \dim V - \dim \operatorname{im} N = 0$), so $N$ would be an isomorphism. But an isomorphism satisfies $N^k \neq 0$ for all $k \geq 1$, contradicting the nilpotency of $N$ (unless $V = \{0\}$, which we have already handled). Therefore $\dim W < n$.
**$N$-invariance of $W$:** Take any $w \in W$. By definition of $W = \operatorname{im} N$, we can write $w = Nu$ for some $u \in V$. Then $Nw = N(Nu) = N^2 u$, and $N^2 u = N(Nu) \in \operatorname{im} N = W$. So $Nw \in W$ for every $w \in W$, confirming $N$-invariance.
**Nilpotency of $N|_W$:** The restriction $N|_W: W \to W$ is well-defined (by invariance) and nilpotent because $(N|_W)^k = N^k|_W$ for all $k$, and $N^k = 0$ for some $k$ (since $N$ is nilpotent on $V$).
Since $\dim W < n$, the strong inductive hypothesis applies to the nilpotent operator $N|_W$ on the space $W$. This gives a cyclic decomposition $W = D_1 \oplus D_2 \oplus \cdots \oplus D_s$, where $D_i = \operatorname{span}(w_i, Nw_i, \ldots, N^{d_i - 1}w_i)$ is a cyclic subspace with chain length $d_i$, and the sizes satisfy $d_1 \geq d_2 \geq \cdots \geq d_s$.
[/guided]
[/step]
[step:Lift each chain generator $w_i$ to a preimage $v_i$ with $Nv_i = w_i$]
For each $i = 1, \ldots, s$, the vector $w_i \in W = \operatorname{im} N$, so there exists $v_i \in V$ with $Nv_i = w_i$. Consider the chain starting from $v_i$:
\begin{align*}
v_i, \; Nv_i = w_i, \; N^2 v_i = Nw_i, \; \ldots, \; N^{d_i} v_i = N^{d_i - 1} w_i.
\end{align*}
This chain has length $d_i + 1$: it consists of $v_i$ followed by the entire $D_i$-chain $w_i, Nw_i, \ldots, N^{d_i - 1}w_i$. Moreover $N^{d_i + 1} v_i = N^{d_i} w_i = 0$ since the chain in $D_i$ has length $d_i$.
Define the cyclic subspace $C_i := \operatorname{span}(v_i, Nv_i, \ldots, N^{d_i} v_i)$. By the [Jordan Chains Are Linearly Independent](/theorems/3282) theorem, these $d_i + 1$ vectors are linearly independent (they form a Jordan chain of length $d_i + 1$), so $\dim C_i = d_i + 1$.
[guided]
Each chain generator $w_i$ lives in $W = \operatorname{im} N$, so it has a preimage: there exists $v_i \in V$ with $Nv_i = w_i$. By prepending $v_i$ to the chain $w_i, Nw_i, \ldots, N^{d_i - 1}w_i$, we extend the chain by one step.
Why does this give a valid Jordan chain of length $d_i + 1$? The chain is $v_i, Nv_i, N^2 v_i, \ldots, N^{d_i} v_i$. We need $N^{d_i} v_i \neq 0$ and $N^{d_i + 1} v_i = 0$. Indeed:
- $N^{d_i} v_i = N^{d_i - 1}(Nv_i) = N^{d_i - 1} w_i \neq 0$ because $w_i, Nw_i, \ldots, N^{d_i - 1}w_i$ is a Jordan chain of length $d_i$, so its last element is non-zero.
- $N^{d_i + 1} v_i = N^{d_i} w_i = 0$ because the chain in $D_i$ has length exactly $d_i$.
By the [Jordan Chains Are Linearly Independent](/theorems/3282) theorem, the vectors $v_i, Nv_i, \ldots, N^{d_i} v_i$ are linearly independent, giving $\dim C_i = d_i + 1$.
The crucial point is that we must choose the preimages $v_i$ carefully so that the resulting subspaces $C_i$ are in direct sum. We address this next.
[/guided]
[/step]
[step:Verify the lifted chains are in direct sum by choosing preimages from complementary subspaces]
We must choose the preimages $v_1, \ldots, v_s$ so that $C_1 + C_2 + \cdots + C_s$ is a direct sum. Since $W = D_1 \oplus \cdots \oplus D_s$, the restriction of $N$ to each $C_i$ maps $C_i$ into $W$ in a controlled way: $N(C_i) = D_i$.
Consider the map $N: V \to W$. Pick any complement $K$ to $\ker N \cap N^{-1}(D_1 \oplus \cdots \oplus D_s)$ that we need. More concretely, since $D_1, \ldots, D_s$ are in direct sum inside $W$, we can choose $v_1, \ldots, v_s$ so that $v_1, \ldots, v_s$ together with $\ker N$ span $V$ and the $v_i$ project independently onto the chain generators $w_i$.
[claim:The subspaces $C_1, \ldots, C_s$ are in direct sum]
Suppose $u_1 + u_2 + \cdots + u_s = 0$ where $u_i \in C_i$ for each $i$. Write $u_i = a_i^{(0)} v_i + a_i^{(1)} Nv_i + \cdots + a_i^{(d_i)} N^{d_i} v_i$. Applying $N$ to both sides:
\begin{align*}
\sum_{i=1}^s N u_i &= 0.
\end{align*}
Now $Nu_i = a_i^{(0)} w_i + a_i^{(1)} Nw_i + \cdots + a_i^{(d_i - 1)} N^{d_i - 1} w_i \in D_i$ (the last term $a_i^{(d_i)} N^{d_i} w_i = 0$ drops out). Since $W = D_1 \oplus \cdots \oplus D_s$ is a direct sum, $Nu_i = 0$ for each $i$. This means $a_i^{(0)} = 0$ for each $i$ (since $w_i, Nw_i, \ldots, N^{d_i-1}w_i$ are linearly independent, and $a_i^{(0)}$ is the coefficient of $w_i$; applying $N$ repeatedly forces each $a_i^{(j)} = 0$ for $j \leq d_i - 1$).
With $a_i^{(0)} = 0$, we have $u_i \in D_i \subset W$, so the original relation $u_1 + \cdots + u_s = 0$ with $u_i \in D_i$ gives $u_i = 0$ by the direct sum decomposition of $W$. Hence the sum $C_1 + \cdots + C_s$ is direct.
[/claim]
[proof]
The argument is given inline above: applying $N$ projects each $u_i$ into $D_i$, and the direct sum property of $W = \bigoplus D_i$ forces each projection to be zero, which then forces the leading coefficient $a_i^{(0)} = 0$. With the leading coefficient removed, $u_i \in D_i$, and the direct sum of $W$ gives $u_i = 0$.
[/proof]
[/step]
[step:Supplement with kernel vectors to complete the decomposition of $V$]
The subspace $C_1 \oplus \cdots \oplus C_s$ has dimension $\sum_{i=1}^s (d_i + 1) = \dim W + s$. Now $\dim V = \dim W + \dim \ker N$ by the rank-nullity theorem applied to $N: V \to V$. So the remaining dimension is
\begin{align*}
\dim V - \sum_{i=1}^s (d_i + 1) &= \dim \ker N - s.
\end{align*}
Each subspace $C_i$ contributes one vector to $\ker N$, namely $N^{d_i} v_i$. These $s$ vectors are linearly independent within $\ker N$ (since they are part of the linearly independent system across all $C_i$). Choose vectors $v_{s+1}, \ldots, v_{s+p}$ in $\ker N$ so that $\{N^{d_1} v_1, \ldots, N^{d_s} v_s, v_{s+1}, \ldots, v_{s+p}\}$ forms a basis of $\ker N$. Here $p = \dim \ker N - s$.
For each $j = s+1, \ldots, s+p$, define $C_j := \operatorname{span}(v_j)$, a cyclic subspace of chain length $1$ (since $Nv_j = 0$).
Then $V = C_1 \oplus C_2 \oplus \cdots \oplus C_s \oplus C_{s+1} \oplus \cdots \oplus C_{s+p}$, and each $C_i$ is a cyclic $N$-invariant subspace. Setting $r = s + p$ and $m_i = d_i + 1$ for $i \leq s$, $m_i = 1$ for $i > s$, we obtain the desired decomposition with chain lengths $m_1 \geq m_2 \geq \cdots \geq m_r$.
[guided]
After constructing the direct sum $C_1 \oplus \cdots \oplus C_s$, we need to account for the rest of $V$. By the rank-nullity theorem for $N: V \to V$:
\begin{align*}
\dim V &= \dim \ker N + \dim \operatorname{im} N = \dim \ker N + \dim W.
\end{align*}
The subspaces $C_1, \ldots, C_s$ together have dimension $\sum_{i=1}^s (d_i + 1)$. Since $\dim W = \sum_{i=1}^s d_i$ (the $D_i$ decompose $W$), this equals $\dim W + s$. So we still need $\dim V - \dim W - s = \dim \ker N - s$ additional dimensions.
Where do these come from? Each chain $C_i$ already "uses up" one kernel direction: the terminal vector $N^{d_i} v_i$ lies in $\ker N$. So the chain decomposition accounts for $s$ out of $\dim \ker N$ kernel dimensions. The remaining $\dim \ker N - s$ kernel directions give rise to trivial chains of length $1$ — vectors $v_j$ with $Nv_j = 0$.
We choose $v_{s+1}, \ldots, v_{s+p}$ to complement the terminal vectors within $\ker N$, set $C_j = \operatorname{span}(v_j)$, and verify that $V = C_1 \oplus \cdots \oplus C_{s+p}$ by a dimension count: $\sum_{i=1}^{s+p} \dim C_i = (\dim W + s) + (\dim \ker N - s) = \dim W + \dim \ker N = \dim V$.
[/guided]
[/step]
[step:Prove uniqueness of the block sizes via the rank sequence]
Let $m_1 \geq m_2 \geq \cdots \geq m_r$ be the chain lengths in any cyclic decomposition of $V$. We show these are determined by $N$ alone.
For each $k \geq 0$, consider the subspace $N^k(C_i)$. If $k < m_i$, then $N^k(C_i) = \operatorname{span}(N^k v_i, N^{k+1} v_i, \ldots, N^{m_i - 1} v_i)$, which has dimension $m_i - k$. If $k \geq m_i$, then $N^k(C_i) = \{0\}$.
Since $V = \bigoplus_{i=1}^r C_i$ and each $C_i$ is $N$-invariant, we have $\operatorname{im} N^k = \bigoplus_{i=1}^r N^k(C_i)$, so
\begin{align*}
\operatorname{rank} N^k &= \sum_{i=1}^r \max(m_i - k, 0).
\end{align*}
Define $r_k = \operatorname{rank} N^k$. The first difference is $r_k - r_{k+1} = |\{i : m_i > k\}|$, the number of chains of length greater than $k$. The second difference is
\begin{align*}
(r_{k-1} - r_k) - (r_k - r_{k+1}) &= |\{i : m_i > k-1\}| - |\{i : m_i > k\}| = |\{i : m_i = k\}|,
\end{align*}
which counts the number of chains of size exactly $k$. Since the right-hand side is determined entirely by $\operatorname{rank} N^k$ for $k = 0, 1, 2, \ldots$, and these ranks depend only on $N$ (not on the choice of decomposition), the multiset of chain lengths $\{m_1, \ldots, m_r\}$ is uniquely determined by $N$.
[guided]
For uniqueness, we need to show that the block sizes cannot depend on the particular choice of chain generators $v_i$ — they are intrinsic to $N$.
The idea is to express the number of blocks of each size as a function of the rank sequence $r_k = \operatorname{rank} N^k$, which depends only on $N$. In any cyclic decomposition $V = \bigoplus C_i$, the $N$-invariance of each $C_i$ means $\operatorname{im} N^k = \bigoplus N^k(C_i)$, so ranks add.
For a single chain $C_i$ of length $m_i$, the image $N^k(C_i)$ has dimension $\max(m_i - k, 0)$: applying $N^k$ shifts the chain by $k$ positions, losing the first $k$ vectors. Summing over all chains:
\begin{align*}
r_k &= \operatorname{rank} N^k = \sum_{i=1}^r \max(m_i - k, 0).
\end{align*}
The successive differences $r_k - r_{k+1}$ count how many chains survive the $k$-th application of $N$, i.e., how many chains have length $> k$. Taking the second difference isolates the chains of exactly length $k$:
\begin{align*}
(r_{k-1} - r_k) - (r_k - r_{k+1}) &= |\{i : m_i = k\}|.
\end{align*}
Since the ranks $r_k$ are determined by $N$ alone (they are $\operatorname{rank} N^k$), the number of chains of each size is uniquely determined. This completes the uniqueness argument.
[/guided]
[/step]