Consider the following problem. You want to describe the set of all solutions to the linear system
\begin{align*}
2x_1 - x_2 + 3x_3 &= 0 \\
x_1 + x_2 - x_3 &= 0.
\end{align*}
You find one solution $v = (-2, 5, 3)$ and another $w = (4, -10, -6)$. You notice immediately that $v + w = (2, -5, -3)$ is also a solution, and that $3v = (-6, 15, 9)$ is also a solution. You could verify each by substituting back, but you also sense that something deeper is at work: the set of solutions is *closed under addition and scalar multiplication*, and this closure is not an accident but a consequence of linearity. The set of solutions forms a structure that mathematicians have isolated and named, because it appears everywhere: in systems of differential equations, in quantum mechanics, in signal processing, in geometry. That structure is a **vector space**.
The power of abstraction here is immense. Once you know you are working inside a vector space, an enormous library of results becomes available to you — about dimension, about linear maps, about decompositions, about geometry — without ever returning to the particular system that generated the space. The point of the definition is not to describe one example but to capture the common skeleton shared by $\mathbb{R}^n$, spaces of polynomials, spaces of continuous functions, spaces of matrices, and countless others.
[example: Solution Sets Are Closed Under Linear Combinations]
Let $A \in \mathbb{R}^{m \times n}$ and consider the homogeneous system $Ax = 0$. If $v, w \in \mathbb{R}^n$ satisfy $Av = 0$ and $Aw = 0$, then for any scalars $\alpha, \beta \in \mathbb{R}$,
\begin{align*}
A(\alpha v + \beta w) &= \alpha Av + \beta Aw = \alpha \cdot 0 + \beta \cdot 0 = 0.
\end{align*}
So $\alpha v + \beta w$ is also a solution. This computation uses only two properties of matrix multiplication: it distributes over addition, and it commutes with scalar multiplication. These are precisely the two properties that define a linear map. The set of solutions $\ker(A) := \{x \in \mathbb{R}^n : Ax = 0\}$ is not just a set; it is a vector space in its own right.
[/example]
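The closure computation above can be checked numerically on a small case. The following is a minimal sketch: the matrix `A` and the solutions `p` and `q` are illustrative choices, not taken from the text, and the helper names `matvec` and `lincomb` are hypothetical. Exact rational arithmetic is used so that "equals zero" means exactly zero.

```python
from fractions import Fraction

def matvec(A, x):
    """Multiply the matrix A (a list of rows) by the vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def lincomb(alpha, v, beta, w):
    """Return alpha*v + beta*w, computed componentwise."""
    return [alpha * vi + beta * wi for vi, wi in zip(v, w)]

# Hypothetical 2x4 homogeneous system, chosen only for illustration.
A = [[1, 0, -1, 2],
     [0, 1, 1, -1]]
p = [1, -1, 1, 0]   # satisfies A p = 0
q = [-2, 1, 0, 1]   # satisfies A q = 0

# An arbitrary linear combination of the two solutions.
combo = lincomb(Fraction(3, 2), p, -7, q)
print(matvec(A, p), matvec(A, q), matvec(A, combo))
```

Any other choice of scalars in `lincomb` should give the zero vector as well, which is exactly what the one-line proof predicts.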
## Definition
To isolate what matters — what makes the solution-space example work, and what makes $\mathbb{R}^n$, polynomials, and continuous functions all behave the same way — we abstract away everything except the two operations and the rules they must obey.
[definition: Vector Space]
Let $F$ be a field. A **vector space** over $F$ is a set $V$ together with two operations,
\begin{align*}
+: V \times V &\to V & &\text{(vector addition)} \\
\cdot: F \times V &\to V & &\text{(scalar multiplication),}
\end{align*}
satisfying the following eight axioms. For all $u, v, w \in V$ and all $\alpha, \beta \in F$:
1. **(A1) Associativity of addition:** $(u + v) + w = u + (v + w)$.
2. **(A2) Commutativity of addition:** $u + v = v + u$.
3. **(A3) Additive identity:** There exists $0 \in V$ such that $v + 0 = v$ for all $v \in V$.
4. **(A4) Additive inverses:** For each $v \in V$ there exists $-v \in V$ with $v + (-v) = 0$.
5. **(S1) Compatibility:** $\alpha(\beta v) = (\alpha\beta)v$.
6. **(S2) Identity scalar:** $1 \cdot v = v$.
7. **(D1) Distributivity over vector addition:** $\alpha(u + v) = \alpha u + \alpha v$.
8. **(D2) Distributivity over scalar addition:** $(\alpha + \beta)v = \alpha v + \beta v$.
Elements of $V$ are called **vectors**; elements of $F$ are called **scalars**.
[/definition]
The word "field" here means a set equipped with addition and multiplication satisfying the usual rules — think $\mathbb{R}$, $\mathbb{C}$, $\mathbb{Q}$, or $\mathbb{F}_p$ for a prime $p$. In most applications in linear algebra the field is $\mathbb{R}$ or $\mathbb{C}$, and we speak of a **real vector space** or **complex vector space** respectively.
It is worth pausing on what this definition does *not* say. There is no mention of coordinates, of arrows, of length, or of angles. Addition and scalar multiplication are abstract operations whose output is required only to satisfy eight equations. This generality is the whole point: it lets us treat spaces of functions, spaces of matrices, and $\mathbb{R}^n$ all by the same machinery.
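Because the axioms are just equations, they can be checked exhaustively when the space is finite. The sketch below does this for $\mathbb{F}_3^2$, the plane over the field with three elements. The choice of $p = 3$ and the function names are assumptions made for illustration; the check is a brute-force confirmation, not a proof technique for general spaces.

```python
from itertools import product

P = 3                                          # a small prime, so Z/P is a field
SCALARS = range(P)
VECTORS = list(product(range(P), repeat=2))    # all 9 vectors of (Z/3)^2
ZERO = (0, 0)

def add(u, v):
    """Componentwise addition modulo P."""
    return tuple((a + b) % P for a, b in zip(u, v))

def scale(alpha, v):
    """Componentwise scalar multiplication modulo P."""
    return tuple((alpha * a) % P for a in v)

def axioms_hold():
    """Check all eight axioms by brute force over every choice of inputs."""
    for u, v, w in product(VECTORS, repeat=3):
        if add(add(u, v), w) != add(u, add(v, w)):                      # A1
            return False
    for u, v in product(VECTORS, repeat=2):
        if add(u, v) != add(v, u):                                      # A2
            return False
    for v in VECTORS:
        if add(v, ZERO) != v:                                           # A3
            return False
        if not any(add(v, x) == ZERO for x in VECTORS):                 # A4
            return False
        if scale(1, v) != v:                                            # S2
            return False
    for a, b in product(SCALARS, repeat=2):
        for v in VECTORS:
            if scale(a, scale(b, v)) != scale((a * b) % P, v):          # S1
                return False
            if scale((a + b) % P, v) != add(scale(a, v), scale(b, v)):  # D2
                return False
    for a in SCALARS:
        for u, v in product(VECTORS, repeat=2):
            if scale(a, add(u, v)) != add(scale(a, u), scale(a, v)):    # D1
                return False
    return True

print(axioms_hold())
```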
[remark: Uniqueness of the Zero Vector and Additive Inverses]
The axioms guarantee existence but not uniqueness of the zero vector and additive inverses. Uniqueness follows at once: if $0$ and $0'$ both satisfy (A3), then $0 = 0 + 0' = 0' + 0 = 0'$, using (A3) for $0'$, then (A2), then (A3) for $0$. Similarly, if both $-v$ and $-'v$ are additive inverses of $v$, then
\begin{align*}
-v &= -v + 0 = -v + (v + (-'v)) = (-v + v) + (-'v) = 0 + (-'v) = -'v.
\end{align*}
These are not separate axioms to impose — they are theorems that come for free.
[/remark]
[example: Continuous Functions on an Interval]
The space $C([0, 1])$ of continuous real-valued functions on $[0, 1]$, with pointwise addition and scalar multiplication
\begin{align*}
(f + g)(t) &:= f(t) + g(t), \qquad (\alpha f)(t) := \alpha f(t),
\end{align*}
is a real vector space. The operations stay inside $C([0,1])$ because sums and scalar multiples of continuous functions are continuous. The zero vector is the constant function $\mathbf{0}(t) = 0$, and for each $f \in C([0,1])$ the additive inverse is $(-f)(t) = -f(t)$, which is continuous whenever $f$ is. All eight axioms are inherited from the pointwise arithmetic of $\mathbb{R}$.
What makes this example worthwhile is that here the "vectors" are functions, objects with infinitely many degrees of freedom. The dimension of $C([0,1])$ is infinite — we will see why when we discuss bases — yet every element obeys exactly the same algebraic rules as an $n$-tuple. The same abstract framework covers both.
More concretely: the differential equation $f'' + f = 0$ has the solution set $\{A\sin t + B\cos t : A, B \in \mathbb{R}\}$, a two-dimensional subspace of $C([0, 2\pi])$. The fact that this set of solutions is a vector space is not a coincidence — it is a consequence of the linearity of differentiation, and it is precisely the kind of structure the vector space axioms are designed to capture.
[/example]
[example: Spaces of Functions]
Let $X$ be any nonempty set and $F$ a field. Define $F^X$ to be the set of all functions $f: X \to F$. With pointwise operations,
\begin{align*}
(f + g)(x) &:= f(x) + g(x), \qquad (\alpha f)(x) := \alpha f(x),
\end{align*}
$F^X$ is a vector space over $F$. The zero vector is the constant function $0: x \mapsto 0_F$. Each $f$ has additive inverse $(-f): x \mapsto -f(x)$. All eight axioms follow from the corresponding axioms in $F$.
When $X = \{1, 2, \ldots, n\}$, we recover $F^n$: a function $f: \{1, \ldots, n\} \to F$ is exactly an $n$-tuple $(f(1), f(2), \ldots, f(n))$. When $X = [0, 1]$ and $F = \mathbb{R}$, this gives the space of all real-valued functions on $[0, 1]$, which contains $C([0, 1])$, $C^\infty([0, 1])$, and many other important function spaces as subspaces. (Spaces such as $L^2([0, 1])$ are not literally subsets, since their elements are equivalence classes of functions that agree almost everywhere.)
[/example]
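For a finite set $X$, the pointwise operations are easy to write down explicitly. The sketch below stores a function $X \to F$ as a Python dictionary and uses integers as stand-ins for scalars; the representation and the helper names are assumptions made for illustration.

```python
def add(f, g):
    """Pointwise sum of two functions X -> F stored as dicts with the same keys."""
    return {x: f[x] + g[x] for x in f}

def scale(alpha, f):
    """Pointwise scalar multiple of a function stored as a dict."""
    return {x: alpha * f[x] for x in f}

def as_tuple(f, n):
    """Read a function on {1, ..., n} as the n-tuple (f(1), ..., f(n))."""
    return tuple(f[i] for i in range(1, n + 1))

# Two functions on X = {1, 2, 3}.
f = {1: 2, 2: 0, 3: -1}
g = {1: 1, 2: 5, 3: 4}

h = add(scale(3, f), g)             # the function 3f + g
zero = add(f, scale(-1, f))         # f + (-f), the zero function
print(as_tuple(h, 3), as_tuple(zero, 3))
```

Reading the dictionaries back as tuples shows the identification of $F^X$ with $F^n$ when $X = \{1, \ldots, n\}$: the pointwise operations become the usual componentwise ones.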
[example: Polynomial Spaces]
Fix a field $F$ and let $\mathcal{P}(F)$ denote the set of all polynomials with coefficients in $F$,
\begin{align*}
\mathcal{P}(F) &= \left\{ a_0 + a_1 t + a_2 t^2 + \cdots + a_n t^n : n \ge 0,\, a_i \in F \right\}.
\end{align*}
Addition and scalar multiplication are the familiar polynomial operations. This is a vector space. The zero vector is the zero polynomial. For each $n \ge 0$, the subset $\mathcal{P}_n(F) \subset \mathcal{P}(F)$ of polynomials of degree at most $n$ is also a vector space. The space $\mathcal{P}_n(F)$ has dimension $n+1$ — something we will make precise shortly.
[/example]
## Subspaces
Not every subset of a vector space is again a vector space, but some are, and they inherit all the structure from the ambient space. Identifying subspaces is often the first step in understanding the geometry of a space.
The naive criterion for checking that a subset $W \subset V$ is a vector space in its own right would be to verify all eight axioms. But axioms (A1), (A2), (S1), (S2), (D1), (D2) are inherited automatically from $V$: they hold for all vectors in $V$, so in particular they hold for vectors in $W$. What can fail is closure (the sum of two vectors in $W$, or a scalar multiple of a vector in $W$, might escape $W$), existence of a zero vector, and existence of additive inverses. The subspace criterion packages exactly the minimal checks needed.
[definition: Subspace]
Let $V$ be a vector space over a field $F$. A nonempty subset $W \subset V$ is a **subspace** of $V$ if it is closed under addition and scalar multiplication:
\begin{align*}
u, v \in W,\, \alpha \in F \implies u + v \in W \text{ and } \alpha u \in W.
\end{align*}
[/definition]
Equivalently, $W$ is a subspace if for all $u, v \in W$ and $\alpha, \beta \in F$,
\begin{align*}
\alpha u + \beta v \in W.
\end{align*}
[explanation: Why the Subspace Criterion Works]
Given a nonempty subset $W$ satisfying the closure conditions, we must confirm it is a vector space. Since $W$ is nonempty, pick any $w \in W$; then $0 \cdot w = 0 \in W$, so the zero vector lies in $W$. For additive inverses: given $v \in W$, we have $(-1) \cdot v = -v \in W$ by scalar closure. The eight axioms all hold because they hold in $V$ and $W \subset V$. So the two closure conditions are both necessary and sufficient.
Notice what would happen without the closure conditions. The set $\{(x_1, x_2) \in \mathbb{R}^2 : x_1, x_2 \ge 0\}$ (the positive quadrant) is nonempty and contains the zero vector, but $-(1, 1) = (-1, -1)$ is not in it. It is not a subspace. The set $\{(x_1, x_2) \in \mathbb{R}^2 : x_1 + x_2 = 1\}$ is nonempty, but $(1, 0) + (0, 1) = (1, 1)$ does not satisfy $x_1 + x_2 = 1$. It is not a subspace either (and it does not contain the zero vector). The condition $x_1 + x_2 = 0$ gives a subspace; the condition $x_1 + x_2 = 1$ does not. Homogeneous constraints preserve the zero and are compatible with scalar multiplication; inhomogeneous ones are not.
[/explanation]
[example: Two Sets That Fail to Be Subspaces]
The subspace criterion is not automatic — both closure conditions must be checked, and either can fail independently.
**Failure of scalar closure (the positive quadrant).** Let $S = \{(x_1, x_2) \in \mathbb{R}^2 : x_1 \ge 0 \text{ and } x_2 \ge 0\}$. This set contains the zero vector $(0,0)$ and is closed under addition (the sum of two vectors with nonnegative components again has nonnegative components). However, it fails scalar closure: take $v = (1, 1) \in S$ and $\alpha = -1$. Then $\alpha v = (-1, -1) \notin S$. So the positive quadrant is not a subspace of $\mathbb{R}^2$, even though it "looks" like a natural geometric region.
**Failure of addition closure (an affine hyperplane).** Let $H = \{(x_1, x_2) \in \mathbb{R}^2 : x_1 + x_2 = 1\}$. Take $u = (1, 0) \in H$ and $v = (0, 1) \in H$. Then $u + v = (1, 1)$, and $1 + 1 = 2 \ne 1$, so $u + v \notin H$. Moreover $H$ does not contain the zero vector (since $0 + 0 = 0 \ne 1$). The set $H$ is an affine subspace — a translate of the line $x_1 + x_2 = 0$ — but it is not a vector subspace. The lesson: inhomogeneous linear constraints ($= 1$) destroy the zero vector and addition closure; homogeneous ones ($= 0$) preserve both.
[/example]
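Both failures can be confirmed mechanically from the membership conditions. This is a minimal sketch; the function names `in_quadrant` and `in_line` are hypothetical, and the test vectors are the ones used in the example.

```python
def in_quadrant(v):
    """Membership in S = {x1 >= 0 and x2 >= 0}."""
    return v[0] >= 0 and v[1] >= 0

def in_line(v):
    """Membership in H = {x1 + x2 = 1}."""
    return v[0] + v[1] == 1

def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

def scale(alpha, v):
    return (alpha * v[0], alpha * v[1])

# S: the vector v lies in S, but its multiple by -1 does not.
v = (1, 1)
print(in_quadrant(v), in_quadrant(scale(-1, v)))

# H: both summands lie in H; their sum does not, and neither does the zero vector.
u, w = (1, 0), (0, 1)
print(in_line(u), in_line(w), in_line(add(u, w)), in_line((0, 0)))
```

A single counterexample is enough to rule out a subspace, whereas confirming that a set is a subspace requires an argument covering all vectors and scalars.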
[example: The Kernel Is a Subspace]
Let $T: V \to W$ be a linear map between vector spaces over a field $F$ (a map satisfying $T(u + v) = T(u) + T(v)$ and $T(\alpha v) = \alpha T(v)$). The **kernel** of $T$ is
\begin{align*}
\ker(T) &:= \{v \in V : T(v) = 0\}.
\end{align*}
This is a subspace of $V$. To check: if $u, v \in \ker(T)$ and $\alpha, \beta \in F$, then
\begin{align*}
T(\alpha u + \beta v) &= \alpha T(u) + \beta T(v) = \alpha \cdot 0 + \beta \cdot 0 = 0,
\end{align*}
so $\alpha u + \beta v \in \ker(T)$.
Similarly, the **image** $\operatorname{im}(T) := T(V) = \{T(v) : v \in V\}$ is a subspace of $W$: if $T(u), T(v) \in \operatorname{im}(T)$, then $\alpha T(u) + \beta T(v) = T(\alpha u + \beta v) \in \operatorname{im}(T)$.
[/example]
## Span and Linear Independence
Given vectors $v_1, \ldots, v_k$ in a vector space $V$, the most basic question is: what can we build from them using addition and scalar multiplication? The answer is the *span*, and it is always a subspace. The deeper question — how many of the $v_i$ are truly essential, with none being redundant — leads to linear independence.
What goes wrong when vectors are redundant? Suppose $v_3 = 2v_1 - v_2$. Then every linear combination $\alpha_1 v_1 + \alpha_2 v_2 + \alpha_3 v_3$ can be rewritten as $(\alpha_1 + 2\alpha_3)v_1 + (\alpha_2 - \alpha_3)v_2$: the third vector adds no new directions to the span. Worse, coefficients in a linear combination are no longer uniquely determined by the vector they produce — this non-uniqueness is exactly what linear dependence means.
[definition: Span]
Let $V$ be a vector space over $F$ and let $S \subset V$ be a nonempty subset. The **span** of $S$ is
\begin{align*}
\operatorname{span}(S) &:= \left\{ \sum_{i=1}^{k} \alpha_i v_i : k \ge 1,\, \alpha_i \in F,\, v_i \in S \right\},
\end{align*}
the set of all finite linear combinations of elements of $S$. By convention, $\operatorname{span}(\varnothing) = \{0\}$. If $S = \{v_1, \ldots, v_n\}$ is finite, we write $\operatorname{span}(v_1, \ldots, v_n)$.
[/definition]
The span of $S$ is the smallest subspace of $V$ containing $S$: it contains $S$, it is a subspace, and any subspace containing $S$ must contain all linear combinations and hence all of $\operatorname{span}(S)$.
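For finitely many vectors in $\mathbb{Q}^n$, membership in a span can be tested by row reduction. The sketch below relies on one fact assumed here without proof: a vector $v$ lies in $\operatorname{span}(S)$ exactly when adjoining $v$ to $S$ does not increase the rank, that is, the number of pivots found by Gaussian elimination. The helper names `rank` and `in_span` are hypothetical.

```python
from fractions import Fraction

def rank(rows):
    """Number of pivots found by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0                                   # index of the next pivot row
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        # Find a row at or below r with a nonzero entry in column c.
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        # Clear column c in every row below the pivot row.
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def in_span(v, S):
    """v lies in span(S) exactly when adjoining v does not raise the rank."""
    return rank(S + [v]) == rank(S)

S = [[1, 0, 0], [0, 1, 0]]                  # spans the x1 x2-plane in Q^3
print(in_span([3, -2, 0], S), in_span([0, 0, 1], S))
```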
[definition: Linear Independence]
Vectors $v_1, \ldots, v_n \in V$ are **linearly independent** if the only scalars $\alpha_1, \ldots, \alpha_n \in F$ satisfying
\begin{align*}
\alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_n v_n &= 0
\end{align*}
are $\alpha_1 = \alpha_2 = \cdots = \alpha_n = 0$. If a nontrivial solution exists, that is, if the equation holds with some $\alpha_i \ne 0$, then $v_1, \ldots, v_n$ are **linearly dependent**.
An infinite subset $S \subset V$ is linearly independent if every finite subset of $S$ is linearly independent.
[/definition]
[explanation: Linear Dependence Means Redundancy]
Linear dependence of $v_1, \ldots, v_n$ is equivalent to one of the $v_i$ being expressible as a linear combination of the others. Precisely: if $\alpha_1 v_1 + \cdots + \alpha_n v_n = 0$ with some $\alpha_j \ne 0$, then
\begin{align*}
v_j &= -\frac{1}{\alpha_j}\sum_{i \ne j} \alpha_i v_i \in \operatorname{span}(v_1, \ldots, \hat{v}_j, \ldots, v_n),
\end{align*}
where $\hat{v}_j$ denotes that $v_j$ is omitted. So $v_j$ is redundant: $\operatorname{span}(v_1, \ldots, v_n) = \operatorname{span}(v_1, \ldots, \hat{v}_j, \ldots, v_n)$.
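Any list containing the zero vector is linearly dependent: if $v_j = 0$, then taking $\alpha_j = 1$ and all other coefficients zero gives a nontrivial relation. A single nonzero vector is always linearly independent: if $\alpha v = 0$ with $\alpha \ne 0$, then multiplying by $\alpha^{-1}$ gives $v = 0$, so for $v \ne 0$ only $\alpha = 0$ is possible.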
[/explanation]
[example: Independence in $\mathbb{R}^3$]
Consider the vectors
\begin{align*}
v_1 &= (1, 0, 0), \quad v_2 = (0, 1, 0), \quad v_3 = (1, 1, 0)
\end{align*}
in $\mathbb{R}^3$. We ask whether these are linearly independent. Form $\alpha_1 v_1 + \alpha_2 v_2 + \alpha_3 v_3 = 0$:
\begin{align*}
(\alpha_1 + \alpha_3, \alpha_2 + \alpha_3, 0) &= (0, 0, 0).
\end{align*}
This gives $\alpha_1 = -\alpha_3$ and $\alpha_2 = -\alpha_3$, with $\alpha_3$ free. Taking $\alpha_3 = 1$, we get $\alpha_1 = -1$, $\alpha_2 = -1$, $\alpha_3 = 1$: so $-v_1 - v_2 + v_3 = 0$, i.e., $v_3 = v_1 + v_2$. The vectors are linearly dependent, and $v_3$ is redundant: $\operatorname{span}(v_1, v_2, v_3) = \operatorname{span}(v_1, v_2)$, which is the $x_1 x_2$-plane in $\mathbb{R}^3$.
Now replace $v_3$ with $w = (0, 0, 1)$. The system $\alpha_1 v_1 + \alpha_2 v_2 + \alpha_3 w = 0$ gives $(\alpha_1, \alpha_2, \alpha_3) = (0, 0, 0)$: the vectors are linearly independent. Together they span all of $\mathbb{R}^3$.
[/example]
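The two computations in this example can be reproduced by row reduction. The sketch below assumes the standard fact that a finite list of vectors in $\mathbb{Q}^n$ is linearly independent exactly when Gaussian elimination produces as many pivots as there are vectors; the helper names are hypothetical.

```python
from fractions import Fraction

def rank(rows):
    """Number of pivots found by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0                                   # index of the next pivot row
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def independent(vectors):
    """A finite list is independent iff its rank equals its length."""
    return rank(vectors) == len(vectors)

v1, v2, v3, w = [1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 1]
print(independent([v1, v2, v3]))    # v3 = v1 + v2, so this list is dependent
print(independent([v1, v2, w]))     # replacing v3 by w restores independence
```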
## Bases and Dimension
A basis is a linearly independent spanning set — the most efficient description of a vector space. The central theorem of linear algebra, towards which everything so far has been building, is that all bases of a vector space have the same cardinality. This common cardinality is the dimension of the space, and it is an intrinsic invariant.
Without this theorem, the notion of dimension would be ambiguous: you might describe $\mathbb{R}^3$ with three standard basis vectors and I might claim to describe the same space with five vectors. The theorem says that if both of our sets are bases, they must both have exactly three elements.
[definition: Basis]
A **basis** of a vector space $V$ is a linearly independent subset $\mathcal{B} \subset V$ that spans $V$, i.e., $\operatorname{span}(\mathcal{B}) = V$.
[/definition]
The two conditions, independence and spanning, are in tension and together pin down the "right size" for describing the space. A spanning set with more vectors than needed is linearly dependent; a linearly independent set with too few vectors fails to span. A basis achieves both simultaneously.
[quotetheorem:3308]
The uniqueness in condition (2) is the key content: a basis gives a *coordinate system* on $V$, and different bases give different coordinate systems, but each one is complete and non-redundant. The scalars $\alpha_i$ expressing $v$ in terms of $\mathcal{B}$ are called the **coordinates of $v$ with respect to $\mathcal{B}$**.
[quotetheorem:1229]
The proof that every vector space has a basis uses Zorn's lemma (equivalently, the axiom of choice) in the infinite-dimensional case. For finite-dimensional spaces, explicit constructions suffice: start with any spanning set and remove redundant vectors one by one until the remaining set is independent.
[definition: Dimension]
The **dimension** of a vector space $V$ over $F$, written $\dim_F V$ or $\dim V$, is the cardinality of any basis of $V$. If $V$ has a finite basis of $n$ elements, we say $V$ is **$n$-dimensional** or **finite-dimensional**. If no finite basis exists, $V$ is **infinite-dimensional**.
[/definition]
That all bases have the same cardinality — so this definition is well-posed — is the content of the following fundamental theorem.
[quotetheorem:915]
[explanation: Why Invariance of Dimension Is Nontrivial]
It might seem obvious that two bases of $\mathbb{R}^3$ must each have three elements — after all, $\mathbb{R}^3$ "looks" three-dimensional. But the definition of a vector space involves no geometry, no notion of size, and no coordinates. The invariance theorem says that the algebraic structure alone forces a unique count.
The key step in the finite-dimensional case is the **Exchange Lemma**: if $v_1, \ldots, v_n$ span $V$ and $u_1, \ldots, u_m$ are linearly independent in $V$, then $m \le n$. Applying this twice — once with the first basis as the spanning set and the second as the independent set, then vice versa — gives $|\mathcal{B}| \le |\mathcal{B}'|$ and $|\mathcal{B}'| \le |\mathcal{B}|$, hence equality.
For infinite-dimensional spaces this counting argument is no longer enough: the proof requires cardinal arithmetic and the axiom of choice. The conclusion still holds (any two bases have the same cardinality), but the proof is genuinely harder.
[/explanation]
[example: Dimension of Standard Spaces]
The standard bases and dimensions of common vector spaces:
- $\mathbb{R}^n$ has the **standard basis** $e_1, \ldots, e_n$ where $e_i = (0, \ldots, 1, \ldots, 0)$ with $1$ in position $i$. Thus $\dim \mathbb{R}^n = n$.
- $\mathcal{P}_n(F)$ (polynomials of degree $\le n$) has basis $\{1, t, t^2, \ldots, t^n\}$, so $\dim \mathcal{P}_n(F) = n + 1$.
- The space of $m \times n$ matrices over $F$, denoted $M_{m \times n}(F)$, has the matrix units $E_{ij}$ (with $1$ in position $(i,j)$ and $0$ elsewhere) as a basis, giving $\dim M_{m \times n}(F) = mn$.
- The space $C([0, 1])$ of continuous functions on $[0,1]$ over $\mathbb{R}$ is infinite-dimensional. To see this it suffices to show that for each $n \ge 1$, the functions $1, t, t^2, \ldots, t^n$ are linearly independent in $C([0,1])$. Suppose $\alpha_0 + \alpha_1 t + \alpha_2 t^2 + \cdots + \alpha_n t^n = 0$ as a function on $[0,1]$, meaning the polynomial $p(t) = \sum_{k=0}^n \alpha_k t^k$ is identically zero. Differentiating $k$ times and evaluating at $t = 0$ gives $k!\,\alpha_k = 0$, so $\alpha_k = 0$ for every $k$. Thus the only relation among $1, t, \ldots, t^n$ is the zero relation, and these $n+1$ functions are linearly independent. If a finite set of $m$ functions spanned $C([0,1])$, every linearly independent list would have at most $m$ elements, contradicting the independence of the $m+1$ functions $1, t, \ldots, t^m$. So no finite set can span $C([0,1])$, and the space is infinite-dimensional.
[/example]
[illustration:span-of-vectors-r3]
## Coordinates and Change of Basis
A basis does more than prove existence of a coordinate system — it provides one explicitly, and different bases provide different coordinate systems for the same abstract space. Understanding how coordinates transform when the basis changes is one of the most practical and conceptually important topics in linear algebra. It is the bridge between abstract vector spaces and explicit matrix computations.
Once you fix a basis $\mathcal{B} = \{b_1, \ldots, b_n\}$ of an $n$-dimensional vector space $V$, every vector $v \in V$ is uniquely determined by its coordinate vector.
[definition: Coordinate Vector]
Let $V$ be an $n$-dimensional vector space over $F$ with ordered basis $\mathcal{B} = (b_1, \ldots, b_n)$. For each $v \in V$, write $v = \sum_{i=1}^n \alpha_i b_i$ uniquely. The **coordinate vector of $v$ with respect to $\mathcal{B}$** is
\begin{align*}
[v]_{\mathcal{B}} &:= (\alpha_1, \alpha_2, \ldots, \alpha_n) \in F^n.
\end{align*}
The map $v \mapsto [v]_{\mathcal{B}}$ is an isomorphism $V \to F^n$ called the **coordinate isomorphism** associated to $\mathcal{B}$.
[/definition]
The coordinate isomorphism lets us compute in $F^n$ and transfer the results back to $V$. But coordinates depend on the choice of basis, and in practice we often have two natural bases — say, the standard basis and an eigenbasis — and need to translate between them. This is the change-of-basis problem.
[definition: Change-of-Basis Matrix]
Let $V$ be an $n$-dimensional vector space over $F$ and let $\mathcal{B} = (b_1, \ldots, b_n)$ and $\mathcal{C} = (c_1, \ldots, c_n)$ be two ordered bases of $V$. The **change-of-basis matrix from $\mathcal{B}$ to $\mathcal{C}$** is the $n \times n$ matrix $P_{\mathcal{B} \to \mathcal{C}}$ whose $j$-th column is the coordinate vector of $b_j$ in the basis $\mathcal{C}$:
\begin{align*}
(P_{\mathcal{B} \to \mathcal{C}})_{ij} &= \alpha_{ij}, \quad \text{where } b_j = \sum_{i=1}^n \alpha_{ij}\, c_i.
\end{align*}
For any $v \in V$,
\begin{align*}
[v]_{\mathcal{C}} &= P_{\mathcal{B} \to \mathcal{C}}\, [v]_{\mathcal{B}}.
\end{align*}
[/definition]
The matrix $P_{\mathcal{B} \to \mathcal{C}}$ is always invertible, with $(P_{\mathcal{B} \to \mathcal{C}})^{-1} = P_{\mathcal{C} \to \mathcal{B}}$. This reflects the fact that both coordinate isomorphisms are bijective, so passing from $\mathcal{B}$-coordinates to $\mathcal{C}$-coordinates and back is the identity.
[example: Change of Basis in $\mathbb{R}^2$]
Let $V = \mathbb{R}^2$. Take the standard basis $\mathcal{E} = (e_1, e_2) = ((1,0),(0,1))$ and the basis $\mathcal{B} = (b_1, b_2)$ where $b_1 = (1, 1)$ and $b_2 = (1, -1)$. We compute the change-of-basis matrix $P_{\mathcal{B} \to \mathcal{E}}$ by expressing each $b_j$ in the standard basis $\mathcal{E}$:
\begin{align*}
b_1 = 1 \cdot e_1 + 1 \cdot e_2, \qquad b_2 = 1 \cdot e_1 + (-1) \cdot e_2,
\end{align*}
so
\begin{align*}
P_{\mathcal{B} \to \mathcal{E}} &= \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}.
\end{align*}
Now take a specific vector $v = (3, 1)$. In $\mathcal{E}$-coordinates, $[v]_{\mathcal{E}} = (3, 1)$. To find $[v]_{\mathcal{B}}$, we solve $P_{\mathcal{B} \to \mathcal{E}} [v]_{\mathcal{B}} = [v]_{\mathcal{E}}$:
\begin{align*}
\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} \alpha_1 \\ \alpha_2 \end{pmatrix} &= \begin{pmatrix} 3 \\ 1 \end{pmatrix}.
\end{align*}
Row reduction gives $\alpha_1 = 2$ and $\alpha_2 = 1$, so $[v]_{\mathcal{B}} = (2, 1)$. We verify: $2b_1 + 1 \cdot b_2 = 2(1,1) + (1,-1) = (3, 1) = v$. The inverse matrix is $P_{\mathcal{E} \to \mathcal{B}} = \frac{1}{2}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$, which converts $\mathcal{E}$-coordinates to $\mathcal{B}$-coordinates directly: $\frac{1}{2}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}\begin{pmatrix} 3 \\ 1 \end{pmatrix} = \begin{pmatrix} 2 \\ 1 \end{pmatrix}$, confirming the computation.
[/example]
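The coordinate computation in this example amounts to solving a $2 \times 2$ linear system. The sketch below does so with Cramer's rule in exact arithmetic; the helper `solve_2x2` is a hypothetical name introduced for illustration, and for larger systems row reduction would be the usual choice.

```python
from fractions import Fraction

def solve_2x2(P, y):
    """Solve P x = y for an invertible 2x2 matrix P using Cramer's rule."""
    (a, b), (c, d) = P
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is not invertible")
    return ((d * y[0] - b * y[1]) / det, (a * y[1] - c * y[0]) / det)

# Columns of P are b1 = (1, 1) and b2 = (1, -1) in standard coordinates.
P_B_to_E = [[1, 1],
            [1, -1]]
v_E = (3, 1)

v_B = solve_2x2(P_B_to_E, v_E)      # coordinates of v in the basis B

# Rebuild v from its B-coordinates as a check: alpha1*b1 + alpha2*b2.
b1, b2 = (1, 1), (1, -1)
rebuilt = tuple(v_B[0] * s + v_B[1] * t for s, t in zip(b1, b2))
print(v_B, rebuilt)
```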
## Linear Maps and Isomorphisms
Two vector spaces can look completely different as objects — one might consist of polynomials, another of matrices — yet be structurally identical as vector spaces. The right notion of structure-preserving map is a **linear map**, and two spaces are considered the same if there is a bijective linear map between them.
This matters because it tells us that the structure of a finite-dimensional vector space is captured entirely by its dimension and its field of scalars. All $n$-dimensional real vector spaces are, from the perspective of linear algebra, copies of $\mathbb{R}^n$. Every theorem about $\mathbb{R}^n$ that uses only its vector space structure applies to all of them.
[definition: Linear Map]
Let $V$ and $W$ be vector spaces over the same field $F$. A map $T: V \to W$ is **linear** (or a **linear transformation**, or an **$F$-linear map**) if for all $u, v \in V$ and $\alpha, \beta \in F$,
\begin{align*}
T(\alpha u + \beta v) &= \alpha T(u) + \beta T(v).
\end{align*}
The set of all linear maps from $V$ to $W$ is denoted $\mathcal{L}(V, W)$.
[/definition]
[definition: Isomorphism of Vector Spaces]
A linear map $T: V \to W$ is an **isomorphism** if it is bijective (injective and surjective). Two vector spaces $V$ and $W$ are **isomorphic**, written $V \cong W$, if there exists an isomorphism between them.
[/definition]
If $T: V \to W$ is an isomorphism, then $T^{-1}: W \to V$ exists and is also linear (and hence also an isomorphism). Isomorphism is an equivalence relation on vector spaces.
[quotetheorem:3309]
The "if" direction is explicit: if both spaces have dimension $n$, choose bases $\mathcal{B} = \{b_1, \ldots, b_n\}$ of $V$ and $\mathcal{C} = \{c_1, \ldots, c_n\}$ of $W$, and define $T: V \to W$ by extending linearly from $T(b_i) = c_i$. By the Basis Criterion, every vector in $V$ has unique coordinates in $\mathcal{B}$, so $T$ is well-defined; it is injective (same argument: if $T(v) = 0$ then all coordinates of $v$ are zero, so $v = 0$) and surjective (every element of $W$ is a linear combination of $c_i = T(b_i)$). The "only if" direction uses the fact that isomorphisms preserve linear independence and spanning, so they must send bases to bases.
[quotetheorem:385]
The Rank-Nullity theorem is one of the most useful results in all of linear algebra. It says that the "information lost" by $T$ (measured by the dimension of the kernel) plus the "information transmitted" (measured by the dimension of the image) must account for all of $V$. As a consequence: a linear map $T: V \to W$ between finite-dimensional spaces with $\dim V = \dim W$ is injective iff it is surjective iff it is an isomorphism.
[example: Rank-Nullity for a Projection]
Let $V = \mathbb{R}^3$ and define $T: \mathbb{R}^3 \to \mathbb{R}^3$ by $T(x_1, x_2, x_3) = (x_1, x_2, 0)$ (projection onto the $x_1 x_2$-plane). Then:
- $\ker(T) = \{(0, 0, x_3) : x_3 \in \mathbb{R}\}$, which has dimension $1$.
- $\operatorname{im}(T) = \{(x_1, x_2, 0) : x_1, x_2 \in \mathbb{R}\}$, which has dimension $2$.
- Check: $\dim \ker(T) + \dim \operatorname{im}(T) = 1 + 2 = 3 = \dim V$. The theorem holds.
The kernel is the $x_3$-axis; the image is the $x_1 x_2$-plane. These are complementary subspaces of $\mathbb{R}^3$, illustrating that projections split a vector space into two complementary pieces. This is the simplest instance of a general decomposition theorem.
[/example]
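The dimension count for this projection can be reproduced numerically. In the sketch below, the image dimension is computed as the rank of the images of the standard basis vectors, while the kernel basis is taken from the example rather than computed; the helper names are hypothetical.

```python
from fractions import Fraction

def rank(rows):
    """Number of pivots found by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0                                   # index of the next pivot row
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def T(x):
    """Projection of R^3 onto the x1 x2-plane."""
    return [x[0], x[1], 0]

basis = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
images = [T(e) for e in basis]              # these vectors span im(T)
dim_image = rank(images)

# Kernel basis as stated in the example (an input here, not a computed result).
kernel_basis = [[0, 0, 1]]
kernel_ok = all(T(k) == [0, 0, 0] for k in kernel_basis)
dim_kernel = rank(kernel_basis)

print(kernel_ok, dim_kernel, dim_image, dim_kernel + dim_image == len(basis))
```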
## Direct Sums and Quotients
The notion of dimension allows us to decompose vector spaces into simpler pieces. Given two subspaces $U$ and $W$ of $V$, we can ask: does every vector in $V$ split uniquely into a part in $U$ and a part in $W$? When yes, we say $V$ is the direct sum of $U$ and $W$.
[definition: Direct Sum of Subspaces]
Let $U$ and $W$ be subspaces of a vector space $V$. We say $V$ is the **direct sum** of $U$ and $W$, written $V = U \oplus W$, if every vector $v \in V$ can be written uniquely as $v = u + w$ with $u \in U$ and $w \in W$.
[/definition]
Equivalently, $V = U \oplus W$ if and only if $V = U + W$ (every vector is a sum) and $U \cap W = \{0\}$ (the only overlap is the zero vector).
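As a concrete instance, take $V = \mathbb{R}^3$, $U = \{x_3 = 0\}$ and $W = \operatorname{span}((1, 1, 1))$; this particular pair is an illustrative choice, not one fixed by the text. The sketch below computes the unique splitting $v = u + w$: the third coordinate of $v$ forces the $W$-component, and the remainder lies in $U$.

```python
def decompose(v):
    """Split v in R^3 as u + w with u in U = {x3 = 0} and w in W = span((1, 1, 1))."""
    t = v[2]                        # w = t*(1, 1, 1) must carry the whole third coordinate
    w = (t, t, t)
    u = (v[0] - t, v[1] - t, 0)     # what remains has third coordinate 0, so it lies in U
    return u, w

v = (4, -1, 3)
u, w = decompose(v)
print(u, w)
```

Uniqueness is visible in the code: there is no freedom in choosing `t`, which reflects $U \cap W = \{0\}$.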
[quotetheorem:3273]
This is immediate from the definition: a basis for $U$ together with a basis for $W$ forms a basis for $V$, and the two bases are disjoint (since $U \cap W = \{0\}$, any linear relation between basis vectors of $U$ and basis vectors of $W$ forces all coefficients to be zero).
Every finite-dimensional vector space admits complementary pairs. Given any subspace $U \subset V$, there exists a subspace $W \subset V$ with $V = U \oplus W$. The complement $W$ is not unique — many choices work — but its dimension is fixed: $\dim W = \dim V - \dim U$.
The direct sum splits $V$ into two complementary pieces; the quotient space takes the opposite perspective. Suppose you have a linear map $T: V \to W$ and you want to understand its image. The kernel $\ker(T)$ is the "noise", that is, the part of $V$ that $T$ collapses to zero. If you want to isolate the "signal", you need to treat all vectors that differ by an element of $\ker(T)$ as equivalent. You are not collapsing $V$ to a subspace; you are folding it, identifying directions you no longer wish to distinguish. The object that results from this identification is the quotient space. Without it, you cannot state the First Isomorphism Theorem, factor a linear map through its kernel, or even define the cokernel of a map, each of which appears constantly in linear algebra, homological algebra, and beyond.
[definition: Quotient Space]
Let $V$ be a vector space over $F$ and $U \subset V$ a subspace. The **quotient space** $V / U$ is the set of cosets
\begin{align*}
V / U &:= \{ v + U : v \in V \}, \quad \text{where } v + U := \{v + u : u \in U\}.
\end{align*}
Two cosets $v + U$ and $v' + U$ are equal iff $v - v' \in U$. Addition and scalar multiplication on $V/U$ are defined by
\begin{align*}
(v + U) + (v' + U) &:= (v + v') + U, \qquad \alpha(v + U) := (\alpha v) + U.
\end{align*}
These operations are well-defined (independent of the choice of coset representative), and $V/U$ is a vector space over $F$ with zero element $0 + U = U$.
[/definition]
[explanation: Why Quotient Spaces Arise]
The quotient $V/U$ is the vector space you get by "declaring all vectors in $U$ to be zero." Formally, it is the codomain of the natural projection
\begin{align*}
\pi: V &\to V/U \\
v &\mapsto v + U,
\end{align*}
which is linear, surjective, and has $\ker(\pi) = U$. When $V$ is finite-dimensional, Rank-Nullity gives $\dim(V/U) = \dim V - \dim U$.
This construction appears when you want to understand $V$ "modulo" some subspace of information you do not care about. In the study of linear maps, quotients let you factor a map through its kernel: any $T: V \to W$ with $U = \ker(T)$ induces an injective linear map $\tilde{T}: V/U \to W$ defined by $\tilde{T}(v + U) = T(v)$.
[/explanation]
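Well-definedness of the coset operations can be seen in a small case. The sketch below takes $V = \mathbb{R}^2$ and $U = \operatorname{span}((1, 1))$, an illustrative choice, and tests coset equality through the criterion $v - v' \in U$. The function names are hypothetical.

```python
def same_coset(v, w):
    """v + U == w + U exactly when v - w lies in U = span((1, 1)) in R^2."""
    d = (v[0] - w[0], v[1] - w[1])
    return d[0] == d[1]             # (a, b) is a multiple of (1, 1) iff a == b

def add(v, w):
    return (v[0] + w[0], v[1] + w[1])

# Two representatives of one coset, and two representatives of another.
v, v2 = (3, 1), (5, 3)              # v2 - v = (2, 2) lies in U
w, w2 = (0, 4), (-1, 3)             # w2 - w = (-1, -1) lies in U

# Adding different representatives lands in the same coset.
print(same_coset(v, v2), same_coset(w, w2), same_coset(add(v, w), add(v2, w2)))

# Vectors in different cosets stay distinguishable.
print(same_coset(v, w))
```

In this example the single number $x_1 - x_2$ labels the coset of $(x_1, x_2)$, consistent with $\dim(V/U) = 2 - 1 = 1$.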
[quotetheorem:384]
This theorem packages rank-nullity and the injectivity of $\tilde{T}$ into one clean statement. It says that to understand the image of $T$, it suffices to understand $V$ modulo the directions that $T$ collapses.
## Applications
The opening promised that vector spaces appear in differential equations, quantum mechanics, and signal processing. Having now built up the full apparatus, we can see precisely how.
**Differential equations.** The set of solutions to a homogeneous linear ODE on an interval $I \subset \mathbb{R}$ is a vector space. For instance, the equation $y'' + y = 0$ on $\mathbb{R}$ has solution space spanned by $\sin t$ and $\cos t$. More generally, the $n$-th order homogeneous linear ODE
\begin{align*}
y^{(n)} + a_{n-1}(t)\,y^{(n-1)} + \cdots + a_1(t)\,y' + a_0(t)\,y &= 0
\end{align*}
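with continuous coefficients $a_i: I \to \mathbb{R}$ has a solution space that is an $n$-dimensional subspace of $C^n(I)$. The existence and uniqueness theorem for initial value problems says that, for a fixed $t_0 \in I$, the map sending a solution $y$ to its initial data $(y(t_0), y'(t_0), \ldots, y^{(n-1)}(t_0)) \in \mathbb{R}^n$ is a linear bijection. In vector space language this is an isomorphism onto $\mathbb{R}^n$, so there are exactly $n$ linearly independent solutions, no more and no less. The fact that $n$ initial conditions determine the solution uniquely thereby becomes a statement about dimension.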
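For the second-order example $y'' + y = 0$, closure under linear combinations can be checked numerically. The sketch below approximates $y''$ by a central difference and evaluates the residual $y'' + y$ for one arbitrary combination $A \sin t + B \cos t$. The step size, the sample points, and the tolerance are assumptions chosen for illustration, and a small residual is only numerical evidence, not a proof.

```python
import math

def y(t, A, B):
    """The candidate solution A sin t + B cos t."""
    return A * math.sin(t) + B * math.cos(t)

def residual(t, A, B, h=1e-3):
    """Approximate y''(t) + y(t), using a central difference for y''."""
    second = (y(t + h, A, B) - 2 * y(t, A, B) + y(t - h, A, B)) / h**2
    return second + y(t, A, B)

# Sample the residual on a grid covering roughly [0, 2*pi].
worst = max(abs(residual(k / 10, 2.0, -3.0)) for k in range(63))
print(worst < 1e-5)
```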
**Quantum mechanics.** In quantum mechanics, the state of a physical system is a unit vector in a (possibly infinite-dimensional) complex Hilbert space $\mathcal{H}$. Observable quantities such as position, momentum, and energy are represented by self-adjoint linear operators on $\mathcal{H}$. The superposition principle says that if $\psi_1$ and $\psi_2$ are possible states, then so is any nonzero combination $\alpha \psi_1 + \beta \psi_2$ with $\alpha, \beta \in \mathbb{C}$, once it is rescaled to unit length; this reflects the fact that $\mathcal{H}$ is a vector space. (The condition $|\alpha|^2 + |\beta|^2 = 1$ already yields a unit vector when $\psi_1$ and $\psi_2$ are orthonormal.) Measurement and the probabilistic interpretation arise from the inner product structure, but the algebraic backbone is vector space linearity.
**Signal processing.** A time-domain signal can be modelled as a function $f: \mathbb{R} \to \mathbb{C}$ in some function space (e.g., the complex space $L^2(\mathbb{R})$; real signals are the special case of real-valued $f$). The Fourier transform is a linear map $\mathcal{F}: L^2(\mathbb{R}) \to L^2(\mathbb{R})$. Filtering a signal by multiplying its Fourier transform by a bounded function $m(\xi)$ is a linear operation on the vector space $L^2(\mathbb{R})$; the fact that the inverse Fourier transform recovers the original signal is the statement that $\mathcal{F}$ is an isomorphism. Convolution, the mathematical operation underlying blur, noise reduction, and edge detection, is a bilinear operation on function spaces; again, the vector space structure is the foundation on which the entire theory rests.