A matrix is a way of storing numbers with two indices, but the real mathematical question is what kind of object the whole collection of such arrays forms. If we only look at one matrix at a time, we can solve a linear system or compute a determinant. If we look at all matrices of a fixed size at once, we get a [vector space](/page/Vector%20Space) whose coordinates are entries, whose points can represent linear maps, and whose subspaces encode meaningful constraints.
The first surprise is that a matrix has several lives. A single $m \times n$ array may be a coefficient table, a vector in an $mn$-dimensional space, or the coordinate representative of a [linear map](/page/Linear%20Map) from $k^n$ to $k^m$. Confusing these roles leads to mistakes: rectangular matrices form vector spaces even when they cannot be multiplied among themselves, while square matrices have an additional multiplication that is usually noncommutative.
[example: A Linear System as a Point]
Let $k$ be a field, and consider the coefficient matrix $A\in M_{2\times 3}(k)$ with entries $A_{11}=1$, $A_{12}=-2$, $A_{13}=0$, $A_{21}=3$, $A_{22}=1$, and $A_{23}=4$. The corresponding system in unknowns $x_1,x_2,x_3$ has left-hand sides
\begin{align*}
A_{11}x_1+A_{12}x_2+A_{13}x_3=x_1-2x_2+0x_3
\end{align*}
and
\begin{align*}
A_{21}x_1+A_{22}x_2+A_{23}x_3=3x_1+x_2+4x_3.
\end{align*}
As a point of the matrix space $M_{2\times 3}(k)$, the same matrix is assembled entry by entry from the matrix units:
\begin{align*}
A=A_{11}E_{11}+A_{12}E_{12}+A_{13}E_{13}+A_{21}E_{21}+A_{22}E_{22}+A_{23}E_{23}.
\end{align*}
Substituting the six entries gives
\begin{align*}
A=1E_{11}+(-2)E_{12}+0E_{13}+3E_{21}+1E_{22}+4E_{23}.
\end{align*}
Dropping the zero term and suppressing the coefficients equal to $1$ gives
\begin{align*}
A=E_{11}-2E_{12}+3E_{21}+E_{22}+4E_{23}.
\end{align*}
It also defines a linear map $T_A:k^3\to k^2$ by $T_A(x)=Ax$. For $x=(x_1,x_2,x_3)\in k^3$, matrix multiplication gives
\begin{align*}
T_A(x)=(x_1-2x_2,\;3x_1+x_2+4x_3).
\end{align*}
Thus the same object is a coefficient table, a coordinate vector in the six-dimensional space $M_{2\times 3}(k)$, and the coordinate representative of a map from $k^3$ to $k^2$.
[/example]
The purpose of matrix space is to keep those roles organized. We first build the vector space of all matrices of a fixed size, then connect it to linear maps, then examine important subspaces and nonlinear rank conditions. The chapter ends by explaining what extra structure appears over $\mathbb{R}$ and $\mathbb{C}$.
## Definition
The basic construction answers a simple coordinate question: if each entry of an $m$ by $n$ array may vary independently in a field $k$, what vector space do those arrays form? We need this ambient space before discussing special matrices, matrix equations, or matrices as representatives of maps.
[definition: Matrix Space]
Let $k$ be a field, and let $m,n \in \mathbb{N}$. The matrix space $M_{m \times n}(k)$ is the set of all $m \times n$ matrices $A=(A_{ij})$ with entries $A_{ij}\in k$ for $1\le i\le m$ and $1\le j\le n$. Addition and scalar multiplication are defined entrywise by $(A+B)_{ij}=A_{ij}+B_{ij}$ and $(\lambda A)_{ij}=\lambda A_{ij}$.
[/definition]
This definition deliberately does not mention matrix multiplication. Rectangular matrices still form a vector space, even though the product of two elements of $M_{m\times n}(k)$ is defined only when $m=n$. What is available in general is compatible multiplication across shapes: a matrix in $M_{m\times n}(k)$ can multiply a matrix in $M_{n\times p}(k)$ to produce an element of $M_{m\times p}(k)$. Later sections use the shorter notation $M_n(k)$ for the square case, where products and powers stay inside one fixed space and where traces, determinants, and (when they exist) inverses are defined.
## Coordinate Structure
To compute in matrix space, we need coordinate axes. The natural axes are the matrices that switch on exactly one entry and set all other entries to zero, because every matrix can then be assembled entry by entry.
[definition: Matrix Unit]
Let $k$ be a field and let $m,n\in\mathbb{N}$. For $1\le i\le m$ and $1\le j\le n$, the matrix unit $E_{ij}\in M_{m\times n}(k)$ is the matrix whose $(i,j)$-entry is $1$ and whose other entries are $0$.
[/definition]
Matrix units are the standard coordinate vectors of matrix space. They allow us to write every matrix as a finite coordinate expansion:
\begin{align*}
A=\sum_{i=1}^{m}\sum_{j=1}^{n}A_{ij}E_{ij}.
\end{align*}
This is the same idea as writing a vector in $k^d$ as a linear combination of standard basis vectors, but now the single coordinate index has been replaced by a pair of indices.
The expansion above would be misleading if there were hidden linear relations among the matrix units, or if some matrices could not be reached by entry-by-entry assembly. What must be checked is that the $mn$ matrix units form a genuine basis of $M_{m\times n}(k)$: spanning says that entries really do assemble every matrix, and independence says that no entry position is forced by the others. The next theorem records both facts, which turns the visual grid of coordinate positions into a reliable coordinate system for $M_{m\times n}(k)$.
[quotetheorem:8369]
The theorem is the main reason matrix spaces behave like familiar finite-dimensional coordinate spaces. It also tells us exactly how many independent parameters an $m\times n$ matrix has.
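The entry-by-entry assembly can be checked mechanically in a small case. The following is a minimal sketch, assuming Python integers as a stand-in for elements of $k$; the helper names `matrix_unit`, `add`, `scale`, and `expand` are illustrative and do not come from the text.

```python
def matrix_unit(i, j, m, n):
    """Return E_ij in M_{m x n}: 1 in position (i, j), 0 elsewhere (1-based indices)."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(1, n + 1)]
            for r in range(1, m + 1)]

def add(A, B):
    """Entrywise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

def scale(lam, A):
    """Entrywise scalar multiple."""
    return [[lam * a for a in row] for row in A]

def expand(A):
    """Rebuild A as the sum of A_ij * E_ij over all positions (i, j)."""
    m, n = len(A), len(A[0])
    total = [[0] * n for _ in range(m)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            total = add(total, scale(A[i - 1][j - 1], matrix_unit(i, j, m, n)))
    return total

# The 2 x 3 matrix from the opening example.
A = [[1, -2, 0], [3, 1, 4]]
units = [matrix_unit(i, j, 2, 3) for i in (1, 2) for j in (1, 2, 3)]
reassembled = expand(A)
```

Here `reassembled` agrees with `A`, and `units` has $2\cdot 3=6$ elements, one for each entry position. A finite check of this kind illustrates the spanning half of the statement; it is not a proof of independence.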
## Coordinates and Entry Functionals
### Entries as Coordinates
A matrix entry is not merely part of notation; it is a coordinate function on the vector space of all matrices. This viewpoint matters because many matrix conditions are linear equations in entries, and linear equations define subspaces.
When we want to test a matrix by inspecting one coordinate, we use an entry functional. This isolates a single position in the array without changing the matrix or choosing a new basis.
[definition: Entry Functional]
Let $k$ be a field and let $m,n\in\mathbb{N}$. For $1\le i\le m$ and $1\le j\le n$, the entry functional is the function
\begin{align*}
\varepsilon_{ij}:M_{m\times n}(k)\to k
\end{align*}
defined by
\begin{align*}
\varepsilon_{ij}(A)=A_{ij}.
\end{align*}
[/definition]
The entry functionals are linear maps because the vector space operations in $M_{m\times n}(k)$ are entrywise. They also separate points: two matrices are equal exactly when every entry functional gives the same value on them.
[example: Entry Functionals Detect Equality]
Let $A,B\in M_{m\times n}(k)$, and suppose that $\varepsilon_{ij}(A)=\varepsilon_{ij}(B)$ for every $1\le i\le m$ and $1\le j\le n$. By the definition of the entry functional,
\begin{align*}
\varepsilon_{ij}(A)=A_{ij}
\end{align*}
and
\begin{align*}
\varepsilon_{ij}(B)=B_{ij}.
\end{align*}
Therefore the assumed equality $\varepsilon_{ij}(A)=\varepsilon_{ij}(B)$ gives
\begin{align*}
A_{ij}=B_{ij}
\end{align*}
for every position $(i,j)$. Since two matrices in $M_{m\times n}(k)$ are equal exactly when all corresponding entries are equal, it follows that $A=B$. Thus the full family of entry functionals records every coordinate of a matrix, so it determines the matrix completely.
[/example]
### Flattening a Matrix
Sometimes a calculation wants a single coordinate index rather than a row index and a column index. Flattening a matrix supplies such an indexing, but it is a chosen convention rather than intrinsic structure.
[definition: Vectorization]
Let $k$ be a field and let $m,n\in\mathbb{N}$. The row-major vectorization map is the function
\begin{align*}
\operatorname{vec}:M_{m\times n}(k)\to k^{mn}
\end{align*}
defined by
\begin{align*}
\operatorname{vec}(A)=(A_{11},\ldots,A_{1n},A_{21},\ldots,A_{2n},\ldots,A_{m1},\ldots,A_{mn}).
\end{align*}
[/definition]
Vectorization is useful for reducing matrix-space statements to statements about $k^{mn}$. The cost is that row and column structure becomes hidden inside the chosen ordering of the coordinates.
Flattening a matrix into a vector is only useful if addition and scalar multiplication survive the change of notation. Bijectivity alone is not enough: sums of matrices must become sums of vectors, and scalar multiples must become scalar multiples after the entries are ordered. Once that compatibility is established, vectorization is a linear identification of $M_{m\times n}(k)$ with $k^{mn}$, so linear arguments can move back and forth without changing their content.
[quotetheorem:8370]
This theorem explains why finite-dimensional vector-space results apply to matrix spaces. It also reminds us that the matrix shape is meaningful extra bookkeeping, even though the underlying vector space is isomorphic to ordinary coordinate space.
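The compatibility of row-major vectorization with the vector-space operations can be spot-checked on sample data. This is a sketch under the assumption that Python integers stand in for $k$; `vec` follows the row-major convention defined above, while `unvec` and the variable names are hypothetical helpers introduced only for illustration.

```python
def vec(A):
    """Row-major vectorization: concatenate the rows of A into one list."""
    return [entry for row in A for entry in row]

def unvec(v, m, n):
    """Inverse of vec for a fixed shape m x n."""
    return [v[r * n:(r + 1) * n] for r in range(m)]

A = [[1, -2, 0], [3, 1, 4]]
B = [[0, 5, 1], [2, 2, -1]]
lam = 3

A_plus_B = [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]
lam_A = [[lam * a for a in row] for row in A]

# vec(A + B) = vec(A) + vec(B) and vec(lam * A) = lam * vec(A)
additive = vec(A_plus_B) == [x + y for x, y in zip(vec(A), vec(B))]
homogeneous = vec(lam_A) == [lam * x for x in vec(A)]

# unvec undoes vec once the shape is remembered
round_trip = unvec(vec(A), 2, 3) == A
```

The shape arguments to `unvec` make the extra bookkeeping explicit: the flat list alone does not remember whether it came from a $2\times 3$ or a $3\times 2$ matrix.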
## Matrices as Linear Maps
### From Arrays to Functions
The most important use of matrix space is that its elements represent linear maps once standard bases are chosen. The shape $m\times n$ records the direction of the map: vectors in $k^n$ are sent to vectors in $k^m$.
To make this role explicit, we associate to each matrix a function with a specified domain and codomain. This prevents the common mistake of treating a matrix as an abstract map without saying which coordinate spaces are being used.
[definition: Matrix-Induced Linear Map]
Let $k$ be a field, let $m,n\in\mathbb{N}$, and let $A\in M_{m\times n}(k)$. The matrix-induced linear map associated to $A$ is the function
\begin{align*}
T_A:k^n\to k^m
\end{align*}
defined by
\begin{align*}
T_A(x)=Ax.
\end{align*}
[/definition]
This definition uses the standard ordered bases of $k^n$ and $k^m$. With those bases fixed, every matrix gives a linear map, and every linear map between coordinate spaces comes from exactly one matrix.
If matrices are to represent functions rather than merely arrays, the assignment $A\mapsto T_A$ must be compatible with the linear structure: each $T_A$ must itself be linear, and addition and scalar multiplication in $M_{m\times n}(k)$ must match the corresponding operations on linear maps $k^n\to k^m$. Otherwise the matrix operations would be only notational, disconnected from the operations on transformations. The assignment must also be unambiguous in both directions: the entries of a matrix must determine exactly one transformation, and every linear map between coordinate spaces must be recoverable from a matrix. The next theorem supplies this bridge, which lets later arguments move freely between arrays and maps.
[quotetheorem:382]
The theorem justifies the common practice of identifying matrices with linear transformations between coordinate spaces. The identification is basis-dependent in more general vector spaces, so the next step is to record how bases enter.
### Matrices of Abstract Linear Maps
An abstract linear map $T:V\to W$ need not be a matrix by itself. It becomes a matrix only after choosing an ordered basis of the domain and an ordered basis of the codomain, because those bases provide coordinates for inputs and outputs.
[definition: Matrix of a Linear Map]
Let $V$ and $W$ be finite-dimensional vector spaces over a field $k$. Let $\mathcal{B}=(v_1,\ldots,v_n)$ be an ordered basis of $V$, and let $\mathcal{C}=(w_1,\ldots,w_m)$ be an ordered basis of $W$. For a linear map $T:V\to W$, the matrix of $T$ with respect to $\mathcal{B}$ and $\mathcal{C}$ is the matrix $[T]_{\mathcal{C}\leftarrow\mathcal{B}}\in M_{m\times n}(k)$ whose $j$-th column is $[T(v_j)]_{\mathcal{C}}$.
[/definition]
The arrow in $[T]_{\mathcal{C}\leftarrow\mathcal{B}}$ records the direction of coordinate conversion: start with $\mathcal{B}$-coordinates and end with $\mathcal{C}$-coordinates. This notation keeps the map separate from its representation.
After bases have been chosen, there are two ways to process a vector: apply the abstract map first and then take coordinates, or take coordinates first and multiply by the representing matrix. The representation is meaningful only when these two procedures give the same coordinate column, that is, when the coordinate diagram commutes. The columns of $[T]_{\mathcal{C}\leftarrow\mathcal{B}}$ were fixed using only the basis images, so what needs proof is that, by linearity, they control the image of every vector and not only of the basis vectors. The theorem below supplies that step from the column definition to a usable coordinate formula.
[quotetheorem:382]
This formula is the operational meaning of a matrix representation. The matrix does not replace the abstract map; it tells us how the map acts after coordinates have been chosen.
Changing coordinates should not change the underlying linear map, but it does change the matrix that represents it. To compare two representatives without mixing up the direction of conversion, we must name the coordinate-change maps before writing the formula. Let $P_{\mathcal{C}'\leftarrow\mathcal{C}}$ denote the change-of-basis matrix satisfying $[y]_{\mathcal{C}'}=P_{\mathcal{C}'\leftarrow\mathcal{C}}[y]_{\mathcal{C}}$ for every $y\in W$, and let $P_{\mathcal{B}\leftarrow\mathcal{B}'}$ denote the change-of-basis matrix satisfying $[x]_{\mathcal{B}}=P_{\mathcal{B}\leftarrow\mathcal{B}'}[x]_{\mathcal{B}'}$ for every $x\in V$. The next theorem is needed because the old matrix representative can only be reused after the input coordinates and output coordinates have been converted in the correct order.
[quotetheorem:387]
The formula says to first convert the input from $\mathcal{B}'$-coordinates to $\mathcal{B}$-coordinates, then apply the old representative of $T$, then convert the output from $\mathcal{C}$-coordinates to $\mathcal{C}'$-coordinates. This is the cleanest way to see that matrices that look quite different may represent the same abstract map in different coordinates.
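The three-step recipe (convert the input, apply the old representative, convert the output) can be traced on a small example. The sketch below assumes $V=W=\mathbb{Q}^2$ with $\mathcal{B}$ and $\mathcal{C}$ the standard bases, and it uses exact rationals from Python's `fractions` module; the particular matrices and the helper names `matmul` and `inverse_2x2` are hypothetical choices made for illustration.

```python
from fractions import Fraction

def matmul(A, B):
    """Matrix product of compatible shapes."""
    return [[sum(A[i][r] * B[r][j] for r in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def inverse_2x2(M):
    """Inverse of an invertible 2 x 2 matrix, computed exactly."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]

T_old = [[2, 1], [0, 3]]   # [T]_{C<-B} in the standard bases (assumed example)
B_new = [[1, 1], [1, 2]]   # columns are the new domain basis vectors (1,1), (1,2)
C_new = [[1, 0], [1, 1]]   # columns are the new codomain basis vectors (1,1), (0,1)

P_in = B_new                # P_{B<-B'}: turns B'-coordinates into B-coordinates
P_out = inverse_2x2(C_new)  # P_{C'<-C}: turns C-coordinates into C'-coordinates

# Convert input, apply the old representative, convert output.
T_new = matmul(P_out, matmul(T_old, P_in))
```

As a consistency check, the $j$-th column of `T_new` should hold the $\mathcal{C}'$-coordinates of $T$ applied to the $j$-th new basis vector. Equivalently, `matmul(C_new, T_new)` should equal `matmul(T_old, B_new)`, which holds for this data.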
[example: Recovering Columns from Basis Images]
Let $A\in M_{3\times 2}(k)$, and write its entries as $A_{11},A_{12},A_{21},A_{22},A_{31},A_{32}$. Let $e_1=(1,0)$ and $e_2=(0,1)$ be the standard basis vectors of $k^2$. Suppose we are told only that $T_A(e_1)=(2,-1,0)$ and $T_A(e_2)=(5,3,4)$; the goal is to recover $A$ from these two images. By the definition of the matrix-induced map, $T_A(e_j)=Ae_j$. Multiplying by $e_1$ gives
\begin{align*}
Ae_1=(A_{11}\cdot 1+A_{12}\cdot 0,\;A_{21}\cdot 1+A_{22}\cdot 0,\;A_{31}\cdot 1+A_{32}\cdot 0).
\end{align*}
Since $0$ is the additive identity and $1$ is the multiplicative identity in $k$, this is
\begin{align*}
Ae_1=(A_{11},A_{21},A_{31}).
\end{align*}
The hypothesis $T_A(e_1)=(2,-1,0)$ therefore gives
\begin{align*}
(A_{11},A_{21},A_{31})=(2,-1,0).
\end{align*}
Equality in $k^3$ is coordinatewise, so $A_{11}=2$, $A_{21}=-1$, and $A_{31}=0$. Thus the first column of $A$ is $(2,-1,0)$.
Similarly,
\begin{align*}
Ae_2=(A_{11}\cdot 0+A_{12}\cdot 1,\;A_{21}\cdot 0+A_{22}\cdot 1,\;A_{31}\cdot 0+A_{32}\cdot 1).
\end{align*}
Using the same identities in $k$, this becomes
\begin{align*}
Ae_2=(A_{12},A_{22},A_{32}).
\end{align*}
The hypothesis $T_A(e_2)=(5,3,4)$ gives
\begin{align*}
(A_{12},A_{22},A_{32})=(5,3,4),
\end{align*}
so $A_{12}=5$, $A_{22}=3$, and $A_{32}=4$. Hence the second column of $A$ is $(5,3,4)$, and the matrix is determined by these two column vectors:
\begin{align*}
A_{11}=2,\quad A_{21}=-1,\quad A_{31}=0,\quad A_{12}=5,\quad A_{22}=3,\quad A_{32}=4.
\end{align*}
The images of the standard basis vectors are exactly the columns of the matrix, which is the column interpretation of matrix multiplication.
[/example]
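The column interpretation also runs in the other direction: given the images of the standard basis vectors, one can assemble the matrix and then evaluate the map anywhere by linearity. A minimal sketch follows, with integers standing in for $k$ and with `from_columns` and `apply` as hypothetical helper names.

```python
def from_columns(columns):
    """Assemble the matrix whose j-th column is columns[j]."""
    return [list(row) for row in zip(*columns)]

def apply(A, x):
    """Compute A x for a matrix A and a vector x given as a list."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

# Images of e_1 and e_2 from the example above.
A = from_columns([(2, -1, 0), (5, 3, 4)])

image_e1 = apply(A, [1, 0])
image_e2 = apply(A, [0, 1])

# By linearity, T_A(3 e_1 - 2 e_2) = 3 T_A(e_1) - 2 T_A(e_2).
image_mix = apply(A, [3, -2])
expected_mix = [3 * u - 2 * v for u, v in zip(image_e1, image_e2)]
```

The two basis images determine `A` completely, and `image_mix` agrees with the combination `expected_mix` of the columns.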
## Linear Subspaces of Matrix Space
Before studying special square matrices, we isolate the square case as a named matrix space. The abbreviation matters because many constraints in this section refer to transposition, diagonals, and products, all of which naturally live in $n\times n$ arrays.
[definition: Square Matrix Space]
Let $k$ be a field and let $n\in\mathbb{N}$. The square matrix space $M_n(k)$ is the matrix space $M_{n\times n}(k)$.
[/definition]
Square matrices are still vectors under entrywise operations. What changes is that square shape also supports operations, such as transposition and multiplication, that preserve the same ambient matrix space.
### Transpose and Trace
Two of the most useful operations on square matrix space are linear before they are multiplicative. Transposition reorganizes entries across the diagonal, while trace extracts the diagonal sum. Isolating them as maps on $M_n(k)$ keeps their vector-space role visible.
[definition: Transpose Map]
Let $k$ be a field and let $n\in\mathbb{N}$. The transpose map is the function $\tau:M_n(k)\to M_n(k)$ defined by $\tau(A)=A^\top$, where
\begin{align*}
(A^\top)_{ij}=A_{ji}
\end{align*}
for all $1\le i,j\le n$.
[/definition]
The transpose map lets us express symmetry conditions as fixed-point or sign-reversing conditions for a linear operator on matrix space. It does not answer a different basic question: how can a square matrix be measured by a scalar in a way that respects addition, scalar multiplication, and change of coordinates? The diagonal sum is the basic answer. It packages a square matrix into one linear scalar measurement, and it is the quantity that later survives conjugation when matrices represent the same endomorphism in different bases.
[definition: Trace Functional]
Let $k$ be a field and let $n\in\mathbb{N}$. The trace functional is the function $\operatorname{tr}:M_n(k)\to k$ defined by
\begin{align*}
\operatorname{tr}(A)=\sum_{i=1}^{n}A_{ii}.
\end{align*}
[/definition]
Both maps are linear: transposition preserves entrywise sums and scalar multiples, and trace is a sum of coordinate functionals. This places trace alongside the entry functionals as a natural linear measurement on matrix space.
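Linearity of the two maps can be illustrated on sample matrices. The sketch below uses integers in place of $k$; the sample matrices and the helper names `transpose`, `trace`, `add`, and `scale` are assumptions introduced for illustration.

```python
def transpose(A):
    """Return the transpose: entry (i, j) of the result is A[j][i]."""
    return [list(col) for col in zip(*A)]

def trace(A):
    """Sum of the diagonal entries of a square matrix."""
    return sum(A[i][i] for i in range(len(A)))

def add(A, B):
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

def scale(lam, A):
    return [[lam * a for a in row] for row in A]

A = [[1, 4], [2, 3]]
B = [[0, -1], [5, 2]]
lam = 7

transpose_additive = transpose(add(A, B)) == add(transpose(A), transpose(B))
transpose_homogeneous = transpose(scale(lam, A)) == scale(lam, transpose(A))
trace_additive = trace(add(A, B)) == trace(A) + trace(B)
trace_homogeneous = trace(scale(lam, A)) == lam * trace(A)
```

All four flags evaluate to `True` for this data. The checks only sample the identities; the general statements follow from the entrywise definitions.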
### Entry Conditions
Many natural classes of matrices are obtained by imposing linear conditions on entries. The vector-space perspective is useful because it distinguishes conditions that define subspaces from conditions that do not.
The easiest linear condition is to require all off-diagonal entries to vanish. This isolates matrices that scale each coordinate axis independently.
[definition: Diagonal Matrix]
Let $k$ be a field and let $n\in\mathbb{N}$. A matrix $A\in M_n(k)$ is diagonal if $A_{ij}=0$ for all $1\le i,j\le n$ with $i\ne j$.
[/definition]
Diagonal matrices form a small coordinate subspace of $M_n(k)$: only the $n$ diagonal entries are free. The next natural question is what happens when entries are not forced to vanish but are paired across the main diagonal. Such pairings arise when a matrix records a bilinear expression whose value is unchanged after interchanging two inputs.
[definition: Symmetric Matrix]
Let $k$ be a field and let $n\in\mathbb{N}$. A matrix $A\in M_n(k)$ is symmetric if $A_{ij}=A_{ji}$ for all $1\le i,j\le n$.
[/definition]
Symmetric matrices record entry-pairing that is unchanged by transposition, but many bilinear constructions require the opposite behaviour: swapping two inputs should reverse the sign. Away from characteristic $2$, alternating bilinear forms have coordinate matrices with entries paired by $A_{ij}=-A_{ji}$, while characteristic $2$ requires extra care about diagonal entries. Isolating the sign-changing condition gives the second natural transpose-controlled subspace of square matrix space.
[definition: Skew-Symmetric Matrix]
Let $k$ be a field and let $n\in\mathbb{N}$. A matrix $A\in M_n(k)$ is skew-symmetric if $A_{ij}=-A_{ji}$ for all $1\le i,j\le n$.
[/definition]
These three families illustrate a general technique: count the independent entries left after the constraints are imposed. The subtleties are that paired entries should be counted once, that diagonal entries may behave differently from off-diagonal pairs, and that the skew-symmetric diagonal depends on the characteristic of the field.
The dimension count should now be made precise for each family, because the number of free coordinates is what distinguishes diagonal, symmetric, and skew-symmetric subspaces inside $M_n(k)$. Diagonal matrices keep only $n$ independent positions, symmetric matrices keep one free choice for each unordered pair of indices, and skew-symmetric matrices keep the off-diagonal pairs with a sign relation. Stating the dimensions together makes the comparison explicit and separates the ordinary counting argument from the characteristic-dependent caveat.
[quotetheorem:8371]
The characteristic hypothesis for skew-symmetric matrices matters. When $\operatorname{char}(k)=2$, the equations $A_{ij}=A_{ji}$ and $A_{ij}=-A_{ji}$ are the same equation, so symmetric and skew-symmetric conditions no longer behave as complementary constraints.
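The counting argument can be made concrete by listing the free positions for each family. The sketch below assumes $\operatorname{char}(k)\ne 2$ for the skew-symmetric count and compares the counts with the familiar closed forms $n$, $n(n+1)/2$, and $n(n-1)/2$; the function name `free_positions` is hypothetical.

```python
def free_positions(n):
    """Count independent entry positions for three families in M_n(k).

    Assumes char(k) != 2 for the skew-symmetric count, so that the
    diagonal of a skew-symmetric matrix is forced to vanish.
    """
    diagonal = [(i, i) for i in range(n)]
    # One free choice for each pair i <= j; the entry (j, i) is then forced.
    symmetric = [(i, j) for i in range(n) for j in range(i, n)]
    # One free choice for each pair i < j; the diagonal is zero.
    skew = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return len(diagonal), len(symmetric), len(skew)

counts = {n: free_positions(n) for n in range(1, 6)}
```

For each $n$ the symmetric and skew-symmetric counts add up to $n^2$, which is consistent with the two subspaces being complementary when $\operatorname{char}(k)\ne 2$.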
### Decomposition by Transpose
If $\operatorname{char}(k)\ne 2$, every square matrix can be split into a symmetric part and a skew-symmetric part. This decomposition explains why the two subspaces above are not merely examples but natural building blocks of $M_n(k)$.
Write $\operatorname{Sym}_n(k)$ for the subspace of symmetric matrices in $M_n(k)$, and write $\operatorname{Skew}_n(k)$ for the subspace of skew-symmetric matrices in $M_n(k)$. With this notation, the decomposition can be stated as an internal direct sum.
[quotetheorem:442]
The decomposition separates the part of $A$ fixed by transpose from the part negated by transpose. It is a prototype for many later decompositions into eigenspaces of a linear operator.
[example: Splitting a Two by Two Matrix]
Assume $\operatorname{char}(k)\ne 2$, so $2\ne 0$ in $k$ and division by $2$ means multiplication by the inverse of $2$. Let $A\in M_2(k)$ have entries $A_{11}=1$, $A_{12}=4$, $A_{21}=2$, and $A_{22}=3$. Since $(A^\top)_{ij}=A_{ji}$, the transpose has entries $(A^\top)_{11}=1$, $(A^\top)_{12}=2$, $(A^\top)_{21}=4$, and $(A^\top)_{22}=3$.
Define
\begin{align*}
S=\frac{A+A^\top}{2}.
\end{align*}
Then each entry is obtained entrywise:
\begin{align*}
S_{11}=\frac{A_{11}+(A^\top)_{11}}{2}=\frac{1+1}{2}=1.
\end{align*}
\begin{align*}
S_{12}=\frac{A_{12}+(A^\top)_{12}}{2}=\frac{4+2}{2}=3.
\end{align*}
\begin{align*}
S_{21}=\frac{A_{21}+(A^\top)_{21}}{2}=\frac{2+4}{2}=3.
\end{align*}
\begin{align*}
S_{22}=\frac{A_{22}+(A^\top)_{22}}{2}=\frac{3+3}{2}=3.
\end{align*}
Thus $S_{12}=S_{21}$, and the diagonal entries are unchanged by transposition, so $S$ is symmetric.
Define
\begin{align*}
K=\frac{A-A^\top}{2}.
\end{align*}
Again computing entrywise,
\begin{align*}
K_{11}=\frac{A_{11}-(A^\top)_{11}}{2}=\frac{1-1}{2}=0.
\end{align*}
\begin{align*}
K_{12}=\frac{A_{12}-(A^\top)_{12}}{2}=\frac{4-2}{2}=1.
\end{align*}
\begin{align*}
K_{21}=\frac{A_{21}-(A^\top)_{21}}{2}=\frac{2-4}{2}=-1.
\end{align*}
\begin{align*}
K_{22}=\frac{A_{22}-(A^\top)_{22}}{2}=\frac{3-3}{2}=0.
\end{align*}
So $K_{12}=-K_{21}$ and $K_{11}=K_{22}=0$, hence $K$ is skew-symmetric.
Finally,
\begin{align*}
(S+K)_{11}=S_{11}+K_{11}=1+0=1=A_{11}.
\end{align*}
\begin{align*}
(S+K)_{12}=S_{12}+K_{12}=3+1=4=A_{12}.
\end{align*}
\begin{align*}
(S+K)_{21}=S_{21}+K_{21}=3+(-1)=2=A_{21}.
\end{align*}
\begin{align*}
(S+K)_{22}=S_{22}+K_{22}=3+0=3=A_{22}.
\end{align*}
Therefore $A=S+K$: the matrix has been split into its symmetric part and its skew-symmetric part.
[/example]
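The same splitting works for any size once $2$ is invertible. The following sketch uses exact rationals (`fractions.Fraction`) as a stand-in for a field of characteristic different from $2$; the helper names `symmetric_part` and `skew_part` are illustrative.

```python
from fractions import Fraction

def transpose(A):
    return [list(col) for col in zip(*A)]

def symmetric_part(A):
    """Return (A + A^T) / 2, computed entrywise with exact arithmetic."""
    At = transpose(A)
    n = len(A)
    return [[Fraction(A[i][j] + At[i][j]) / 2 for j in range(n)] for i in range(n)]

def skew_part(A):
    """Return (A - A^T) / 2, computed entrywise with exact arithmetic."""
    At = transpose(A)
    n = len(A)
    return [[Fraction(A[i][j] - At[i][j]) / 2 for j in range(n)] for i in range(n)]

# The matrix from the example above.
A = [[1, 4], [2, 3]]
S = symmetric_part(A)
K = skew_part(A)
recombined = [[s + k for s, k in zip(row_s, row_k)] for row_s, row_k in zip(S, K)]
```

For this input `S` has entries $1,3,3,3$ and `K` has entries $0,1,-1,0$, matching the hand computation, and `recombined` equals `A`.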
## Matrix Algebra
### Multiplication in the Square Case
The vector space $M_n(k)$ has more structure than a general matrix space because square matrices can be multiplied. This operation corresponds to composition of linear maps, so it is essential for understanding powers, inverses, eigenvalues, and commutators.
[definition: Matrix Algebra]
Let $k$ be a field and let $n\in\mathbb{N}$. The matrix algebra $M_n(k)$ is the vector space $M_n(k)$ equipped with the multiplication map
\begin{align*}
\mu:M_n(k)\times M_n(k)\to M_n(k)
\end{align*}
defined by $\mu(A,B)=AB$, where
\begin{align*}
(AB)_{ij}=\sum_{r=1}^{n}A_{ir}B_{rj}.
\end{align*}
[/definition]
Matrix multiplication is not an extra vector-space operation like addition; it is a bilinear product. We need to know that this product is compatible with the vector-space operations and behaves associatively.
[quotetheorem:8372]
This theorem turns square matrix space into a basic example of a noncommutative algebra. Noncommutativity is not a defect: it reflects the order dependence of composing linear maps.
[example: Noncommutativity]
Let $A=E_{12}$ and $B=E_{21}$ in $M_2(k)$. By the definition of matrix units, the only nonzero entry of $A$ is $A_{12}=1$, and the only nonzero entry of $B$ is $B_{21}=1$.
Using the matrix product formula $(AB)_{ij}=A_{i1}B_{1j}+A_{i2}B_{2j}$ in $M_2(k)$, the entries of $AB$ are
\begin{align*}
(AB)_{11}=A_{11}B_{11}+A_{12}B_{21}=0\cdot 0+1\cdot 1=1.
\end{align*}
\begin{align*}
(AB)_{12}=A_{11}B_{12}+A_{12}B_{22}=0\cdot 0+1\cdot 0=0.
\end{align*}
\begin{align*}
(AB)_{21}=A_{21}B_{11}+A_{22}B_{21}=0\cdot 0+0\cdot 1=0.
\end{align*}
\begin{align*}
(AB)_{22}=A_{21}B_{12}+A_{22}B_{22}=0\cdot 0+0\cdot 0=0.
\end{align*}
Thus $AB$ has a single nonzero entry, equal to $1$, in position $(1,1)$, so $AB=E_{11}$.
For the reverse product,
\begin{align*}
(BA)_{11}=B_{11}A_{11}+B_{12}A_{21}=0\cdot 0+0\cdot 0=0.
\end{align*}
\begin{align*}
(BA)_{12}=B_{11}A_{12}+B_{12}A_{22}=0\cdot 1+0\cdot 0=0.
\end{align*}
\begin{align*}
(BA)_{21}=B_{21}A_{11}+B_{22}A_{21}=1\cdot 0+0\cdot 0=0.
\end{align*}
\begin{align*}
(BA)_{22}=B_{21}A_{12}+B_{22}A_{22}=1\cdot 1+0\cdot 0=1.
\end{align*}
Thus $BA$ has a single nonzero entry, equal to $1$, in position $(2,2)$, so $BA=E_{22}$.
Since $(E_{11})_{11}=1$ but $(E_{22})_{11}=0$, the matrices $E_{11}$ and $E_{22}$ are not equal. Therefore $AB\ne BA$: the product in a matrix algebra depends on the order of the factors, unlike entrywise multiplication, which would be commutative.
[/example]
### Subalgebras
A vector subspace of $M_n(k)$ may or may not be compatible with multiplication. To study smaller matrix systems closed under both linear combinations and products, we need the notion of a matrix subalgebra.
[definition: Matrix Subalgebra]
Let $k$ be a field and let $n\in\mathbb{N}$. A matrix subalgebra of $M_n(k)$ is a vector subspace $\mathcal{A}\subset M_n(k)$ such that $I_n\in\mathcal{A}$ and $AB\in\mathcal{A}$ for all $A,B\in\mathcal{A}$.
[/definition]
Diagonal matrices form a matrix subalgebra because the identity matrix is diagonal and products of diagonal matrices remain diagonal. Symmetric matrices, by contrast, form a vector subspace but fail to be closed under multiplication once $n\ge 2$.
[example: Symmetric Matrices Need Not Form a Subalgebra]
In $M_2(\mathbb{R})$, let $S$ have entries $S_{11}=1$, $S_{12}=1$, $S_{21}=1$, and $S_{22}=0$, and let $T$ have entries $T_{11}=0$, $T_{12}=1$, $T_{21}=1$, and $T_{22}=1$. Since $S_{12}=1=S_{21}$, the off-diagonal entries of $S$ agree, so $S$ is symmetric. Since $T_{12}=1=T_{21}$, the off-diagonal entries of $T$ agree, so $T$ is symmetric.
Using the product formula $(ST)_{ij}=S_{i1}T_{1j}+S_{i2}T_{2j}$ for $2\times 2$ matrices, the first entry is
\begin{align*}
(ST)_{11}=S_{11}T_{11}+S_{12}T_{21}=1\cdot 0+1\cdot 1=1.
\end{align*}
The second entry in the first row is
\begin{align*}
(ST)_{12}=S_{11}T_{12}+S_{12}T_{22}=1\cdot 1+1\cdot 1=2.
\end{align*}
The first entry in the second row is
\begin{align*}
(ST)_{21}=S_{21}T_{11}+S_{22}T_{21}=1\cdot 0+0\cdot 1=0.
\end{align*}
The last entry is
\begin{align*}
(ST)_{22}=S_{21}T_{12}+S_{22}T_{22}=1\cdot 1+0\cdot 1=1.
\end{align*}
Thus $(ST)_{12}=2$ while $(ST)_{21}=0$, and these are unequal in $\mathbb{R}$. Therefore $ST$ is not symmetric, even though both $S$ and $T$ are symmetric. The symmetric matrices are closed under addition and scalar multiplication, but this example shows they are not closed under matrix multiplication, so they do not form a matrix subalgebra of $M_2(\mathbb{R})$.
[/example]
This failure is a useful diagnostic. Linear equations in entries often define subspaces, but multiplicative closure is a separate and stronger requirement.
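The contrast between the two closure properties can be tested directly. This is a small sketch with integer entries; `S` and `T` are the matrices from the example above, while `D1`, `D2`, and the helper names are hypothetical.

```python
def matmul(A, B):
    """Matrix product of compatible shapes."""
    return [[sum(A[i][r] * B[r][j] for r in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def is_symmetric(A):
    n = len(A)
    return all(A[i][j] == A[j][i] for i in range(n) for j in range(n))

def is_diagonal(A):
    n = len(A)
    return all(A[i][j] == 0 for i in range(n) for j in range(n) if i != j)

S = [[1, 1], [1, 0]]
T = [[0, 1], [1, 1]]
ST = matmul(S, T)      # product of two symmetric matrices

D1 = [[2, 0], [0, 3]]
D2 = [[5, 0], [0, -1]]
D1D2 = matmul(D1, D2)  # product of two diagonal matrices
```

Here `is_symmetric(ST)` is `False` although both factors are symmetric, while `is_diagonal(D1D2)` is `True`. A single pair of diagonal matrices does not prove closure, but it shows the pattern that the entrywise argument establishes in general.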
## Rank and Determinantal Conditions
### Rank as a Nonlinear Invariant
Not every important family inside matrix space is a vector subspace. Rank is the central example: it measures the dimension of the image of the associated linear map, and it controls solvability of linear systems.
[definition: Rank of a Matrix]
Let $k$ be a field and let $m,n\in\mathbb{N}$. The rank function is the map $\operatorname{rank}:M_{m\times n}(k)\to\{0,\ldots,\min\{m,n\}\}$ defined by $\operatorname{rank}(A)=\dim_k\operatorname{Range}(T_A)$, where $T_A:k^n\to k^m$ is the matrix-induced linear map associated to $A$.
[/definition]
Rank is not a linear coordinate function. It stratifies the matrix space into pieces with different image dimensions, and apart from the rank-zero piece $\{0\}$ those pieces are never subspaces, because they do not contain the zero matrix. To study matrices of a fixed image dimension, we need a name for the corresponding layer of matrix space rather than only the numerical invariant.
[definition: Rank Stratum]
Let $k$ be a field and let $m,n\in\mathbb{N}$. For $0\le r\le \min\{m,n\}$, the rank $r$ stratum in $M_{m\times n}(k)$ is $\mathcal{R}_r=\{A\in M_{m\times n}(k):\operatorname{rank}(A)=r\}$.
[/definition]
Rank strata are geometrically meaningful but not generally linear. The next example shows that even the rank-one stratum is not closed under addition.
[example: Rank-One Matrices Do Not Form a Subspace]
In $M_2(\mathbb{R})$, let $E_{11}$ be the matrix with entry $1$ in position $(1,1)$ and $0$ elsewhere, and let $E_{22}$ be the matrix with entry $1$ in position $(2,2)$ and $0$ elsewhere. For $(x,y)\in\mathbb{R}^2$, multiplication by $E_{11}$ gives
\begin{align*}
E_{11}(x,y)=(1\cdot x+0\cdot y,\;0\cdot x+0\cdot y)=(x,0).
\end{align*}
Thus $\operatorname{Range}(T_{E_{11}})=\{(x,0):x\in\mathbb{R}\}=\operatorname{span}\{(1,0)\}$, so $\operatorname{rank}(E_{11})=1$.
Similarly,
\begin{align*}
E_{22}(x,y)=(0\cdot x+0\cdot y,\;0\cdot x+1\cdot y)=(0,y).
\end{align*}
Hence $\operatorname{Range}(T_{E_{22}})=\{(0,y):y\in\mathbb{R}\}=\operatorname{span}\{(0,1)\}$, so $\operatorname{rank}(E_{22})=1$.
Now compute the sum entrywise:
\begin{align*}
E_{11}+E_{22}=\begin{pmatrix}1&0\\0&1\end{pmatrix}=I_2.
\end{align*}
For every $(x,y)\in\mathbb{R}^2$,
\begin{align*}
I_2(x,y)=(1\cdot x+0\cdot y,\;0\cdot x+1\cdot y)=(x,y).
\end{align*}
Therefore $\operatorname{Range}(T_{I_2})=\mathbb{R}^2$, so $\operatorname{rank}(I_2)=\dim_{\mathbb{R}}\mathbb{R}^2=2$. The sum of two rank-one matrices can have rank $2$, so the rank-one matrices are not closed under addition and therefore do not form a vector subspace of $M_2(\mathbb{R})$.
[/example]
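Rank can also be computed mechanically. The sketch below counts pivots in Gaussian elimination over the rationals, which gives the row rank; it relies on the standard fact that row rank equals the dimension of the range of $T_A$. The function name `rank` and the use of `fractions.Fraction` as the field are assumptions made for illustration.

```python
from fractions import Fraction

def rank(A):
    """Count pivots in a row echelon form of A, using exact rational arithmetic."""
    M = [[Fraction(x) for x in row] for row in A]
    rows, cols = len(M), len(M[0])
    r = 0  # number of pivots found so far
    for c in range(cols):
        # Look for a nonzero entry in column c at or below row r.
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        # Clear the entries below the pivot.
        for i in range(r + 1, rows):
            factor = M[i][c] / M[r][c]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == rows:
            break
    return r

E11 = [[1, 0], [0, 0]]
E22 = [[0, 0], [0, 1]]
I2 = [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(E11, E22)]
ranks = (rank(E11), rank(E22), rank(I2))
```

For these inputs `ranks` is `(1, 1, 2)`, reproducing the failure of closure under addition found by hand.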
### Minors and Polynomial Equations
Although exact rank strata are not linear, rank bounds are controlled by polynomial equations in the entries. This is the point where matrix space becomes a source of examples for commutative algebra and algebraic geometry.
[definition: Determinantal Variety]
Let $k$ be a field, let $m,n\in\mathbb{N}$, and let $0\le r\le \min\{m,n\}$. The determinantal variety of matrices of rank at most $r$, at the level of $k$-points, is
\begin{align*}
\mathcal{D}_r=\{A\in M_{m\times n}(k):\operatorname{rank}(A)\le r\}.
\end{align*}
[/definition]
This definition names a $k$-point set. Over a field that is not algebraically closed, the full algebraic-geometric object requires specifying the polynomial equations, base field, and base-change convention, so the notation here should be read as the visible matrix set over $k$.
As written, membership in $\mathcal{D}_r$ is still tested through the rank, that is, through the dimension of the image of the associated linear map. To use $\mathcal{D}_r$ as an algebraic object, we need a test written only in the entries of $A$. Minors provide exactly that test: they turn the geometric condition "the image has dimension at most $r$" into determinant equations inside the coordinate ring of the matrix space.
This entrywise description is also the bridge to algebraic geometry. There, the same object is usually treated through the ideal generated by the relevant minors, often after specifying base-change or algebraic-closure conventions. The next theorem is therefore the point where the rank condition becomes a system of polynomial equations rather than only a statement about a linear map.
[quotetheorem:8373]
This theorem explains the word determinantal: the defining equations are determinants of submatrices. It also gives a direct bridge from matrix spaces to ideals generated by minors.
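The entrywise test can be sketched in code. The quoted theorem is not reproduced here, so the sketch assumes the standard criterion that $\operatorname{rank}(A)\le r$ exactly when every $(r+1)\times(r+1)$ minor of $A$ vanishes. The function names `det`, `minors`, and `rank_at_most` are illustrative, and integer entries are used so that the vanishing test is exact.

```python
from itertools import combinations


def det(square):
    """Determinant by Laplace expansion along the first row (small sizes only)."""
    if not square:
        return 1  # determinant of the empty 0x0 matrix
    total = 0
    for j, entry in enumerate(square[0]):
        minor = [row[:j] + row[j + 1:] for row in square[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def minors(matrix, size):
    """Yield the determinant of every size-by-size submatrix."""
    m, n = len(matrix), len(matrix[0])
    for rows in combinations(range(m), size):
        for cols in combinations(range(n), size):
            yield det([[matrix[i][j] for j in cols] for i in rows])


def rank_at_most(matrix, r):
    """True when all (r+1)x(r+1) minors vanish (assumed criterion for rank <= r)."""
    return all(value == 0 for value in minors(matrix, r + 1))


A = [[1, 2, 3], [2, 4, 6]]    # second row is twice the first, so rank 1
B = [[1, -2, 0], [3, 1, 4]]   # the 2x3 coefficient matrix from the opening example

print(rank_at_most(A, 1), rank_at_most(B, 1))
```

For `A` all three $2\times 2$ minors are zero, so `A` lies in $\mathcal{D}_1$. For `B` the minor on the first two columns is $1\cdot 1-(-2)\cdot 3=7\ne 0$, so `B` does not.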
## Norms on Real and Complex Matrix Spaces
### Entrywise Size
Over an arbitrary field, matrix space is an algebraic vector space. Over $\mathbb{R}$ or $\mathbb{C}$, it also carries norms and a topology, so it makes sense to discuss convergence and perturbation of matrices.
The most direct norm measures the Euclidean size of the list of all entries. It treats the matrix as a vector after vectorization.
[definition: Frobenius Norm]
Let $m,n\in\mathbb{N}$ and let $k$ be $\mathbb{R}$ or $\mathbb{C}$. The Frobenius norm is the map
\begin{align*}
\|\cdot\|_F:M_{m\times n}(k)\to\mathbb{R}_{\ge 0}
\end{align*}
defined by
\begin{align*}
\|A\|_F=\left(\sum_{i=1}^{m}\sum_{j=1}^{n}|A_{ij}|^2\right)^{1/2}.
\end{align*}
[/definition]
The Frobenius norm is well suited to entrywise measurement. It is the norm obtained from the usual Euclidean [inner product](/page/Inner%20Product) on the coordinate space of entries.
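As a small illustration of the definition, the following sketch computes the Frobenius norm by flattening the matrix into its list of entries. The names `vectorize` and `frobenius_norm` are illustrative. Python's built-in `abs` returns the modulus of a complex number, so the same code covers real and complex entries.

```python
import math


def vectorize(matrix):
    """Flatten a matrix (list of rows) into the list of all its entries."""
    return [entry for row in matrix for entry in row]


def frobenius_norm(matrix):
    """Euclidean length of the vectorized matrix."""
    return math.sqrt(sum(abs(entry) ** 2 for entry in vectorize(matrix)))


A = [[1, -2, 0], [3, 1, 4]]   # the 2x3 matrix from the opening example
print(frobenius_norm(A))       # square root of 1 + 4 + 0 + 9 + 1 + 16 = 31
print(frobenius_norm([[3 + 4j]]))
```

The norm does not depend on the row and column arrangement, only on the multiset of absolute values of the entries.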
### Operator Size
A matrix can also be measured by how much it stretches vectors. This gives a norm tied to the associated linear map rather than to the entries considered independently.
[definition: Operator Norm of a Matrix]
Let $m,n\in\mathbb{N}$ and let $k$ be $\mathbb{R}$ or $\mathbb{C}$. The operator norm induced by the Euclidean norm is the map
\begin{align*}
\|\cdot\|_{\mathrm{op}}:M_{m\times n}(k)\to\mathbb{R}_{\ge 0}
\end{align*}
defined by
\begin{align*}
\|A\|_{\mathrm{op}}=\sup\{|Ax|:x\in k^n, |x|=1\}.
\end{align*}
[/definition]
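Unlike the Frobenius norm, the supremum in this definition is not a closed formula in the entries. One way to approximate it for real matrices is power iteration on $A^{\mathsf{T}}A$, sketched below with illustrative names. This is a heuristic numerical estimate under stated assumptions, not a certified computation: each run returns $|Ax|$ for some unit vector $x$, hence a lower bound for $\|A\|_{\mathrm{op}}$, and the iteration is restarted from every standard basis vector so that at least one start has a nonzero component in the most-stretched direction.

```python
import math


def mat_vec(matrix, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in matrix]


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def euclid(x):
    """Euclidean norm of a real vector."""
    return math.sqrt(sum(v * v for v in x))


def operator_norm(matrix, iterations=200):
    """Estimate sup{|Ax| : |x| = 1} for a real matrix by power iteration."""
    at = transpose(matrix)
    n = len(matrix[0])
    best = 0.0
    for j in range(n):
        # Start from the j-th standard basis vector.
        x = [1.0 if i == j else 0.0 for i in range(n)]
        for _ in range(iterations):
            y = mat_vec(at, mat_vec(matrix, x))
            length = euclid(y)
            if length == 0.0:
                break  # x lies in the kernel of A, so this start contributes 0
            x = [v / length for v in y]
        best = max(best, euclid(mat_vec(matrix, x)))
    return best


print(operator_norm([[3, 0], [0, 4]]))
print(operator_norm([[1, -1]]))
```

For the diagonal matrix the estimate is the larger diagonal entry in absolute value, and for the row matrix $(1\ \ {-1})$ it approximates $\sqrt{2}$, the Euclidean length of the row.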
To compare entrywise size with operator size, we need inequalities rather than only definitions. Such inequalities tell us when convergence in one norm forces convergence in the other and show exactly where the number of columns enters the estimate.
[quotetheorem:8374]
The constants in the comparison depend on the number of columns. For matrices of a fixed size, the two norms therefore have the same convergent sequences, whereas in a sequence of spaces whose dimensions grow no single constant works for all sizes at once.
[example: Norms Behave Differently as Dimension Grows]
Let $A_n\in M_n(\mathbb{R})$ be the diagonal matrix with $(A_n)_{ii}=1/\sqrt{n}$ for $1\le i\le n$ and $(A_n)_{ij}=0$ when $i\ne j$. For $x=(x_1,\ldots,x_n)\in\mathbb{R}^n$, matrix multiplication gives
\begin{align*}
A_nx=\left(\frac{x_1}{\sqrt{n}},\ldots,\frac{x_n}{\sqrt{n}}\right).
\end{align*}
Therefore, using the Euclidean norm,
\begin{align*}
|A_nx|^2=\left|\left(\frac{x_1}{\sqrt{n}},\ldots,\frac{x_n}{\sqrt{n}}\right)\right|^2=\frac{x_1^2}{n}+\cdots+\frac{x_n^2}{n}=\frac{1}{n}|x|^2.
\end{align*}
If $|x|=1$, then $|A_nx|^2=1/n$, so $|A_nx|=1/\sqrt{n}$. Hence every unit vector is stretched by exactly $1/\sqrt{n}$, and the definition of the operator norm gives
\begin{align*}
\|A_n\|_{\mathrm{op}}=\sup\{|A_nx|:|x|=1\}=\frac{1}{\sqrt{n}}.
\end{align*}
For the Frobenius norm, only the $n$ diagonal entries are nonzero. Thus
\begin{align*}
\|A_n\|_F=\left(\sum_{i=1}^{n}\sum_{j=1}^{n}|(A_n)_{ij}|^2\right)^{1/2}=\left(\sum_{i=1}^{n}\left|\frac{1}{\sqrt{n}}\right|^2\right)^{1/2}.
\end{align*}
Since $\left|1/\sqrt{n}\right|^2=1/n$, this becomes
\begin{align*}
\|A_n\|_F=\left(\sum_{i=1}^{n}\frac{1}{n}\right)^{1/2}=\left(n\cdot\frac{1}{n}\right)^{1/2}=1.
\end{align*}
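As $n\to\infty$, the numbers $1/\sqrt{n}$ tend to $0$, so $\|A_n\|_{\mathrm{op}}\to 0$, while $\|A_n\|_F=1$ for every $n$. The same sequence of matrices therefore becomes small as an operator but not as a list of entries. Since the ratio $\|A_n\|_F/\|A_n\|_{\mathrm{op}}=\sqrt{n}$ is unbounded, no constant $C$ independent of $n$ can satisfy $\|A\|_F\le C\|A\|_{\mathrm{op}}$ for square matrices of every size, which is exactly why the constants in norm comparisons must depend on the dimension.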
[/example]
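The two computations in this example can be spot-checked numerically. The sketch below uses illustrative names and a fixed random seed. It samples a few random unit vectors, measures how much $A_n$ stretches each of them, and compares the result with the Frobenius norm. Sampling does not prove the supremum, but here every sampled stretch factor should agree with $1/\sqrt{n}$ up to rounding.

```python
import math
import random


def scaled_identity(n):
    """The diagonal matrix A_n with every diagonal entry equal to 1/sqrt(n)."""
    c = 1.0 / math.sqrt(n)
    return [[c if i == j else 0.0 for j in range(n)] for i in range(n)]


def mat_vec(matrix, x):
    return [sum(a * b for a, b in zip(row, x)) for row in matrix]


def euclid(x):
    return math.sqrt(sum(v * v for v in x))


def frobenius_norm(matrix):
    return math.sqrt(sum(abs(e) ** 2 for row in matrix for e in row))


def stretch_factors(n, samples=5, seed=0):
    """Return |A_n x| for several random unit vectors x."""
    rng = random.Random(seed)
    a = scaled_identity(n)
    factors = []
    for _ in range(samples):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        length = euclid(x)
        unit = [v / length for v in x]
        factors.append(euclid(mat_vec(a, unit)))
    return factors


for n in (1, 4, 25, 100):
    # Columns: n, largest sampled stretch factor, Frobenius norm of A_n.
    print(n, max(stretch_factors(n)), frobenius_norm(scaled_identity(n)))
```

The middle column shrinks like $1/\sqrt{n}$ while the last column stays at $1$ up to floating-point rounding.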
## Beyond and Connected Topics
Matrix spaces are the coordinate model for finite-dimensional linear algebra. The page [Cambridge IA Vectors and Matrices](/page/Cambridge%20IA%20Vectors%20and%20Matrices) develops the computational foundations: systems of linear equations, determinants, matrix multiplication, and eigenvalue calculations.
The basis-dependent relationship between matrix spaces and linear maps is developed further in [Cambridge IB Linear Algebra](/page/Cambridge%20IB%20Linear%20Algebra). That continuation explains how abstract vector spaces become coordinate spaces after choosing ordered bases, and why change-of-basis matrices alter matrix representatives without changing the underlying linear maps.
Matrix algebras connect linear algebra to noncommutative algebra. The commutator $[A,B]=AB-BA$ turns the square matrix space $M_n(k)$, and every subspace of it that is closed under the commutator, into a Lie algebra, which is one reason [Lie Algebras I: Foundations](/page/Lie%20Algebras%20I%3A%20Foundations) begins naturally from matrix examples.
Rank conditions connect matrix spaces to polynomial algebra. Determinantal varieties are cut out by minors, so their coordinate rings and defining ideals are natural examples for [Cambridge III Commutative Algebra](/page/Cambridge%20III%20Commutative%20Algebra).
Over $\mathbb{R}$ and $\mathbb{C}$, matrix spaces also belong to analysis and numerical mathematics. Norms, conditioning, approximation, and stability all depend on deciding whether a matrix is being treated as an array of entries or as an operator on vectors.
## References
Androma, [Cambridge IA Vectors and Matrices](/page/Cambridge%20IA%20Vectors%20and%20Matrices).
Androma, [Cambridge IB Linear Algebra](/page/Cambridge%20IB%20Linear%20Algebra).
Androma, [Lie Algebras I: Foundations](/page/Lie%20Algebras%20I%3A%20Foundations).
Androma, [Cambridge III Commutative Algebra](/page/Cambridge%20III%20Commutative%20Algebra).
Sheldon Axler, *Linear Algebra Done Right* (2015).
Kenneth Hoffman and Ray Kunze, *Linear Algebra* (1971).
Roger A. Horn and Charles R. Johnson, *Matrix Analysis* (1985).