[proofplan]
Each part invokes a different characterisation. Part (i) uses the classical column-independence criterion for the minimum distance of a linear code defined by a parity-check matrix: the minimum distance equals the smallest number of linearly dependent columns. We show no two columns of $H$ are dependent but three of them are dependent, giving $d(C) = 3$. Part (ii) is immediate from the decoding-radius formula $e = \lfloor (d(C) - 1)/2 \rfloor$ applied to $d(C) = 3$. Part (iii) compares the sphere-packing count $|C| \cdot V_2(n, 1)$ to $2^n$ and shows equality, which is the definition of being perfect.
[/proofplan]
[step:Set up the column-independence criterion for the minimum distance]
Because $C$ is defined as the null space $\{x \in \mathbb{F}_2^n : H x^\top = 0\}$ of the parity-check matrix $H$, the [Column Independence Criterion for Minimum Distance](/theorems/???) applies: the minimum distance $d(C)$ equals the least integer $w \geq 1$ such that some $w$ columns of $H$ are linearly dependent over $\mathbb{F}_2$.
The columns of $H$ are, by construction, the $2^d - 1$ distinct nonzero vectors of $\mathbb{F}_2^d$. Denote them $h_1, \dots, h_n \in \mathbb{F}_2^d \setminus \{0\}$, with $h_i \neq h_j$ for $i \neq j$. We must determine the smallest size of a linearly dependent subfamily.
[/step]
[step:Rule out dependence among any single column or any pair of columns]
No single column is dependent: by construction each column $h_i$ is nonzero, so the one-element set $\{h_i\}$ is linearly independent.
No two columns are dependent: suppose $h_i, h_j$ are distinct columns and $\lambda_i h_i + \lambda_j h_j = 0$ with $(\lambda_i, \lambda_j) \neq (0, 0)$ in $\mathbb{F}_2$. If $\lambda_i = 0$ then $\lambda_j h_j = 0$; since $h_j \neq 0$, this forces $\lambda_j = 0$, contradicting $(\lambda_i, \lambda_j) \neq (0, 0)$. By symmetry $\lambda_j \neq 0$ leads to $\lambda_i \neq 0$, so both are $1$ in $\mathbb{F}_2$. Then $h_i + h_j = 0$, i.e., $h_i = -h_j = h_j$ in $\mathbb{F}_2$, contradicting $h_i \neq h_j$. Hence any two distinct columns of $H$ are linearly independent.
Combining, no subfamily of $\leq 2$ columns is linearly dependent, so $d(C) \geq 3$.
[guided]
Why does the column-independence criterion reduce the minimum-distance question to counting small linearly dependent column families? Because for $x \in \mathbb{F}_2^n$ the product $H x^\top$ equals the linear combination of columns
\begin{align*}
H x^\top = \sum_{i=1}^n x_i h_i \in \mathbb{F}_2^d,
\end{align*}
and this is zero if and only if the columns $\{h_i : x_i = 1\}$ sum to zero, i.e., are linearly dependent over $\mathbb{F}_2$. The Hamming weight $w(x)$ is the number of nonzero coordinates of $x$, which is exactly the size of $\{i : x_i = 1\}$. So the minimum weight nonzero codeword corresponds to the smallest linearly dependent column family.
We now rule out families of size 1 and 2. Every column $h_i$ is nonzero by construction (the Hamming parity-check matrix lists the $2^d - 1$ nonzero vectors of $\mathbb{F}_2^d$), so $\{h_i\}$ is independent.
For two columns $h_i \neq h_j$: a nontrivial $\mathbb{F}_2$-relation has the form $h_i + h_j = 0$, i.e., $h_i = -h_j = h_j$ (since $-1 = 1$ in $\mathbb{F}_2$). That contradicts distinctness. So no size-2 family is dependent.
The failure mode to notice: over $\mathbb{F}_p$ with $p > 2$ there are multiple nonzero scalars, and $h_i, h_j$ could be dependent without being equal (if $h_i = \lambda h_j$ for some $\lambda \neq 0, 1$). That is why the Hamming construction as stated produces minimum distance $3$ only in the binary case; over larger fields one requires projectively distinct columns.
[/guided]
[/step]
[step:Exhibit three columns that sum to zero]
The vectors $e_1, e_2, e_1 + e_2 \in \mathbb{F}_2^d$, where $e_1, e_2$ are the first two standard basis vectors of $\mathbb{F}_2^d$, are all nonzero (note $e_1 + e_2 \neq 0$ because $e_1 \neq e_2$ implies $e_1 - e_2 \neq 0$ and $-e_2 = e_2$). They are pairwise distinct: $e_1 \neq e_2$, $e_1 \neq e_1 + e_2$ (else $e_2 = 0$), and $e_2 \neq e_1 + e_2$ (else $e_1 = 0$). Hence all three appear among the columns of $H$; say they are $h_a, h_b, h_c$ for some indices $a, b, c \in \{1, \dots, n\}$.
They satisfy the linear relation
\begin{align*}
h_a + h_b + h_c = e_1 + e_2 + (e_1 + e_2) = 2 e_1 + 2 e_2 = 0 \in \mathbb{F}_2^d,
\end{align*}
exhibiting a linearly dependent triple. Therefore $d(C) \leq 3$, and combined with the previous step,
\begin{align*}
d(C) = 3.
\end{align*}
[/step]
[step:Deduce the 1-error-correcting property from the minimum distance]
For a linear code with minimum distance $d(C)$, the [Error-Correcting Radius Formula](/theorems/???) states that the code is $e$-error correcting with
\begin{align*}
e = \left\lfloor \frac{d(C) - 1}{2} \right\rfloor.
\end{align*}
Substituting $d(C) = 3$,
\begin{align*}
e = \left\lfloor \frac{3 - 1}{2} \right\rfloor = \left\lfloor 1 \right\rfloor = 1.
\end{align*}
Hence $C$ is 1-error correcting, proving (ii).
[/step]
[step:Count sphere volumes and compare to $|\mathbb{F}_2^n|$ for the perfect-code test]
The [Hamming Bound](/theorems/???) states that any $e$-error-correcting code $C' \subseteq \mathbb{F}_2^n$ satisfies
\begin{align*}
|C'| \cdot V_2(n, e) \leq 2^n, \qquad V_2(n, e) = \sum_{j=0}^{e} \binom{n}{j},
\end{align*}
where $V_2(n, e)$ is the cardinality of the Hamming ball of radius $e$ around any point. A code is defined to be **perfect** when equality holds.
For our code we have $n = 2^d - 1$ and $e = 1$, so
\begin{align*}
V_2(n, 1) = \binom{n}{0} + \binom{n}{1} = 1 + n = 1 + (2^d - 1) = 2^d.
\end{align*}
The cardinality of $C$ is $|C| = 2^{\dim C} = 2^{n - d}$ because $C$ has dimension $n - d$ by hypothesis. Therefore
\begin{align*}
|C| \cdot V_2(n, 1) = 2^{n - d} \cdot 2^d = 2^n.
\end{align*}
This is precisely the Hamming bound with equality, so $C$ is perfect, proving (iii).
[guided]
A **perfect** $e$-error-correcting code is one whose closed Hamming balls of radius $e$ around the codewords exactly partition the ambient space $\mathbb{F}_2^n$ — no overlaps and no leftover points. The quantity $|C| \cdot V_2(n, e)$ counts the total number of ambient-space points covered by these balls (with multiplicity if balls overlap); it cannot exceed $2^n$ because the balls are pairwise disjoint (this uses $d(C) \geq 2e + 1$), and it equals $2^n$ iff the balls cover everything.
We compute both sides:
- The ball volume $V_2(n, 1)$ counts the center plus the $n$ single-error patterns, giving $1 + n$. For our $n = 2^d - 1$, this is exactly $2^d$.
- The codeword count is $|C| = 2^{\dim C} = 2^{n-d}$ since $C$ is a $\mathbb{F}_2$-linear code of dimension $n - d$.
Their product is $2^{n-d} \cdot 2^d = 2^n$. The cancellation is clean precisely because the Hamming code parameters are designed so that $V_2(n, 1)$ is a power of $2$ — namely $2^d$ — which complements the codimension. Had we chosen $n = 2^d$, we would have $V_2(n, 1) = 2^d + 1$, which is not a power of $2$, and no binary linear code with these parameters could be perfect.
[/guided]
[/step]
[step:Assemble the three parts]
Parts (i), (ii), (iii) have been established separately: $d(C) = 3$ by the column-independence criterion, $e = 1$ by the decoding-radius formula, and $|C| \cdot V_2(n, 1) = 2^n$ by direct computation. This completes the proof.
[/step]