Category theory provides a language for describing mathematical structures by the relationships between them rather than by their internal construction. This course introduces the basic objects and ideas of the subject: categories, morphisms, functors, and natural transformations. The goal is to show how a small set of abstract principles can organize examples from algebra, topology, logic, and beyond, and how categorical methods reveal common patterns that are often hidden in more concrete treatments.
The early chapters build the foundation by defining categories and exploring familiar examples, then treating morphisms themselves as carriers of structure. From there, the course develops functors as structure-preserving maps between categories and natural transformations as the correct notion of comparison between functors. Once these tools are in place, the course turns to equivalence of categories, universal properties, Hom-functors, representability, and the [Yoneda lemma](/theorems/3998), which serves as a central organizing result. The later chapters explain the consequences of Yoneda and show how categorical thinking applies across mathematics, making clear how the abstract formalism reflects and unifies concrete mathematical practice.
# Introduction
Category theory begins from a practical difficulty: the same pattern appears in many branches of mathematics, but each branch names it in its own language. Groups have homomorphisms, topological spaces have continuous maps, vector spaces have linear maps, partially ordered sets have order-preserving maps, and Hilbert spaces have bounded linear maps. This course develops a common language for these situations, with the goal of identifying constructions by how they interact with all maps into or out of them.
The course is not a survey of every categorical construction. It focuses on categories, functors, natural transformations, representable functors, and the Yoneda lemma. Adjunctions, general limits and colimits, and abelian categories belong to the next course; here they appear only as motivation when they clarify why the first layer of category theory is organised as it is.
## What Category Theory Is For
When two mathematical constructions look similar, what kind of statement would certify that they are instances of the same pattern? A category packages a class of objects together with the maps that preserve the structure under discussion. Once the maps are part of the data, an object is studied through the ways other objects map to it and the ways it maps to other objects.
The central shift is from elements to morphisms. In $\textbf{Set}$, a set may be probed by its elements, but an element of $X$ is the same data as a function $\textbf{1} \to X$ from a one-point set. This observation is small, yet it points toward a general method: replace elementwise descriptions by mapping properties that still make sense in categories where there are no underlying elements.
[example: Elements As Maps]
Let $\textbf{1}=\{*\}$ be a one-point set, and let $X$ be any set. We construct an explicit bijection between the elements of $X$ and the maps $\textbf{1}\to X$ by sending each function to its only value:
\begin{align*}
\Phi:\textbf{Set}(\textbf{1},X)&\to X,\\
\Phi(f)&=f(*).
\end{align*}
This is well-defined because $*\in \textbf{1}$, so every function $f:\textbf{1}\to X$ has a value $f(*)\in X$.
Conversely, for each element $x\in X$, define a function
\begin{align*}
\Psi(x):\textbf{1}&\to X,\\
*&\mapsto x.
\end{align*}
Then
\begin{align*}
(\Phi\circ \Psi)(x)
&=\Phi(\Psi(x))\\
&=\Psi(x)(*)\\
&=x,
\end{align*}
so $\Phi\circ\Psi=\operatorname{id}_X$. If $f:\textbf{1}\to X$, then $\Psi(\Phi(f))$ is the function $\textbf{1}\to X$ sending $*$ to $\Phi(f)=f(*)$. Since $*$ is the only element of $\textbf{1}$, the functions $\Psi(\Phi(f))$ and $f$ have the same value at every element of their domain, hence
\begin{align*}
\Psi(\Phi(f))=f.
\end{align*}
Thus $\Psi\circ\Phi=\operatorname{id}_{\textbf{Set}(\textbf{1},X)}$, and $\Phi$ is a bijection. Therefore the elements of $X$ are exactly the same data as morphisms $\textbf{1}\to X$ in $\textbf{Set}$, showing how an elementwise notion can be rewritten as a morphism-based one.
[/example]
This perspective becomes more valuable when the same construction is repeated in different settings. Direct products of sets, direct products of groups, product topologies, and products of partially ordered sets have different internal descriptions, but their defining behaviour with respect to maps has the same shape.
[explanation: Universal Properties]
A universal property describes an object by the maps it admits, not by a chosen construction of that object. For instance, a product object should receive a unique map from any object carrying compatible maps to the proposed factors. This description avoids committing to ordered pairs, Cartesian products, coordinate projections, or any other presentation.
Universal properties are the main reason category theory is useful in algebra and topology. They explain why constructions are unique up to a unique isomorphism, why different concrete models of the same construction are interchangeable, and why a construction can often be transported across different mathematical settings.
[/explanation]
## The Scope of This First Course
Which categorical ideas are needed before one can state and use the Yoneda lemma? The answer is surprisingly compact: categories, functors, natural transformations, and representability. These notes build those ideas in the order in which they are used.
The first chapter introduces categories and basic examples. The examples are not decorative: $\textbf{Set}$, $\textbf{Grp}$, $\textbf{Ring}$, $R\text{-}\textbf{Mod}$, $\textbf{Top}$, poset categories, metric spaces with suitable maps, and Hilbert spaces with bounded linear maps test the definition from different directions. They also expose size issues, opposite categories, and the habit of treating morphisms as the primary structure.
The second chapter studies special morphisms. Isomorphisms, automorphisms, sections, retractions, monomorphisms, and epimorphisms are categorical versions of familiar ideas, but they do not always coincide with bijections, injections, or surjections from set-based mathematics. Initial and terminal objects then provide the first universal properties.
The middle of the course introduces functors and natural transformations. A functor compares categories while respecting identities and composition. A natural transformation compares functors in a way that is compatible with every morphism in the source category. Together, these ideas let us state when a construction is functorial and when two functorial constructions agree for structural reasons rather than by a collection of unrelated choices.
[definition: Category Theory I]
Category Theory I is the part of this course concerned with categories, functors, natural transformations, representable functors, and the Yoneda lemma, with examples drawn from algebra, topology, order theory, metric spaces, and analysis.
[/definition]
The name is only a course label, but it marks a real boundary. This course develops the language needed to recognise objects by their mapping behaviour. It does not develop the full calculus of adjunctions, limits, colimits, monads, or homological algebra.
## Mathematical Background and Conventions
What background should a reader bring in order to treat the examples as mathematics rather than as formal symbols? The course assumes basic abstract algebra, point-set topology, linear algebra, and enough analysis to recognise normed spaces and bounded linear maps. It also assumes comfort with proofs that define a structure, verify axioms, and compare two objects by a uniquely determined map.
We use standard algebraic notation throughout. A group is denoted $G$, a ring is denoted $R$, an $R$-module is denoted $M$, and a group homomorphism or ring homomorphism is written as a function with named domain and codomain. For a category $\mathcal C$, the objects are written $A,B,C$ when no special structure is intended, and the set of morphisms from $A$ to $B$ is written $\mathcal C(A,B)$ once local smallness is available.
Set-theoretic care is part of the subject. Some collections, such as the collection of all sets, are too large to be sets in the background foundation. Rather than resolving foundations at the start, the course distinguishes small categories, large categories, and locally small categories when those distinctions first matter.
[remark: Foundations]
The notes use ordinary informal set-theoretic foundations at the level standard in graduate algebra and topology. Size distinctions are included because they affect statements involving all objects of a category or all functors between two categories. No specialised set theory is required beyond recognising the difference between a set-sized collection and a proper class-sized collection.
[/remark]
## Running Examples
How should a new definition be tested once it is introduced? The method of the course is to return repeatedly to a fixed list of examples. Each example highlights a different feature of the definitions.
In $\textbf{Set}$, objects are sets and morphisms are functions. This is the reference example for intuition, but it is also the example most likely to mislead if every categorical notion is interpreted elementwise. In $\textbf{Grp}$, objects are groups and morphisms are group homomorphisms; here algebraic structure restricts which functions count as morphisms. In $\textbf{Top}$, objects are topological spaces and morphisms are continuous maps; here homeomorphisms, quotient maps, and inclusions give useful tests for categorical language.
Posets provide a compressed example. A partially ordered set $(P,\le)$ can be regarded as a category with one morphism $p\to q$ precisely when $p\le q$. In such a category, associativity and identity morphisms are encoded by transitivity and reflexivity of the order relation. This example is a reliable way to translate categorical definitions into order-theoretic statements.
[example: A Poset As A Category]
Let $(P,\leq)$ be a partially ordered set. Define a category $\mathcal P$ as follows: the objects are the elements of $P$, and for $p,q\in P$ the morphism set is
\begin{align*}
\mathcal P(p,q)
=
\begin{cases}
\{\ast_{p,q}\}, & p\leq q,\\
\varnothing, & p\nleq q.
\end{cases}
\end{align*}
Thus there is exactly one morphism $p\to q$ when $p\leq q$, and no morphism otherwise.
For each object $p$, reflexivity gives $p\leq p$, so $\mathcal P(p,p)=\{\ast_{p,p}\}$. Define
\begin{align*}
\operatorname{id}_p=\ast_{p,p}.
\end{align*}
If $p\leq q$ and $q\leq r$, then transitivity gives $p\leq r$, so there is a unique morphism $\ast_{p,r}:p\to r$. Define composition by
\begin{align*}
\ast_{q,r}\circ \ast_{p,q}=\ast_{p,r}.
\end{align*}
This is the only possible definition, since $\mathcal P(p,r)$ has exactly one element.
The identity laws hold because, when $p\leq q$,
\begin{align*}
\ast_{p,q}\circ \operatorname{id}_p
&=\ast_{p,q}\circ \ast_{p,p}\\
&=\ast_{p,q},
\end{align*}
and
\begin{align*}
\operatorname{id}_q\circ \ast_{p,q}
&=\ast_{q,q}\circ \ast_{p,q}\\
&=\ast_{p,q}.
\end{align*}
Associativity also holds. If $p\leq q\leq r\leq s$, then both composites are the unique morphism $p\to s$:
\begin{align*}
(\ast_{r,s}\circ \ast_{q,r})\circ \ast_{p,q}
&=\ast_{q,s}\circ \ast_{p,q}\\
&=\ast_{p,s},
\end{align*}
while
\begin{align*}
\ast_{r,s}\circ(\ast_{q,r}\circ \ast_{p,q})
&=\ast_{r,s}\circ \ast_{p,r}\\
&=\ast_{p,s}.
\end{align*}
Therefore the elements of $P$, together with the order relation encoded as morphisms, form a category. In this category, categorical statements are precisely order-theoretic statements written in the language of objects and morphisms.
[/example]
Functional analysis contributes another family of examples. Hilbert spaces and bounded linear maps form a category, and the boundedness condition matters because it is what makes composition stay inside the same class of morphisms. This example reminds us that the morphisms chosen for a category encode the structure we want to preserve.
## What Theorems in This Course Will Look Like
What does it mean to prove a theorem in category theory when many objects have no elements? Most proofs in this course are diagrammatic or universal: they define a candidate morphism by a universal property, prove it is the only possible morphism with a specified behaviour, and then use uniqueness to prove equations. This style is different from element chasing, but it is not less concrete; the concrete data are morphisms and commutative diagrams.
A recurring theorem form says that a universal object is unique up to unique isomorphism. The cleanest first instance is the uniqueness of initial objects, where the universal property has no extra structure beyond a unique outgoing morphism to every object. Later examples add projections, inclusions, or representing elements, but the proof mechanism is already visible here.
[quotetheorem:3943]
[citeproof:3943]
The uniqueness hypothesis is essential, not cosmetic. If we weakened the definition by asking only for the existence of a morphism to every object, then in the category of nonempty sets and functions every nonempty set would satisfy the weakened condition, while a singleton and a two-element set are not isomorphic. The theorem also does not say that two initial objects are literally equal or that a preferred concrete model has been chosen; it says that the category itself supplies exactly one comparison isomorphism. This is the prototype for products, coproducts, free objects, and representable functors: the data defining the universal property determine the comparison maps, and the uniqueness part proves the required equations between them.
## The Main Destination
What is the structural theorem toward which the first course is moving? The destination is the Yoneda lemma, which says that an object of a locally small category is completely detected by the functor of maps out of it, or dually by the functor of maps into it. Informally, an object is determined by how every other object maps to it.
Representable functors are the bridge to that result. A functor is representable when it is naturally isomorphic to a hom-functor of the form $\mathcal C(A,-)$ or $\mathcal C(-,A)$. Representability turns questions about elements of a functor value into questions about morphisms in the category.
[explanation: Why Yoneda Matters]
The Yoneda lemma explains why universal properties are so strong. If two objects represent naturally isomorphic functors, then they are canonically isomorphic. Many constructions in algebra and topology are therefore best understood as representing a functor: free objects, products, coproducts, tensor products, and spaces of maps all fit this pattern once the surrounding category is chosen.
The lemma also changes how we think about sameness. Rather than asking for a preferred internal description of an object, we ask whether all mapping tests against that object give the same answer. This is the categorical form of understanding an object by its relationships.
[/explanation]
Chapters 7 through 9 make this precise through Hom-functors, representability, the Yoneda lemma, and its consequences. By the end of the course, the reader should be able to recognise common universal properties, construct functors between familiar categories, verify naturality, identify representable functors in examples, and use the Yoneda lemma as a practical theorem rather than as a slogan.
The opening chapter has now set the course's goal: to treat universal properties, functors, naturality, representability, and Yoneda as working tools rather than isolated slogans. With that perspective in place, we begin where category theory itself begins, by making precise what it means to have objects, arrows, and composition.
# 1. Categories and Basic Examples
Category theory begins with the observation that many mathematical subjects have the same outer shape: there are objects, there are structure-preserving maps between them, and maps can be composed. This chapter makes that pattern precise. We first isolate the axioms of a category, then discuss size issues and opposite categories, and finally verify that the familiar worlds of sets, groups, rings, modules, topological spaces, ordered sets, metric spaces, and Hilbert spaces fit the definition.
## The Data and Axioms of a Category
What is the least structure needed in order to speak about maps, identity maps, and composition without mentioning the internal elements of the objects? A category records exactly the information needed for composable arrows to behave like functions do.
[definition: Category]
A category $\mathcal C$ consists of the following data:
1. A class $\operatorname{Ob}(\mathcal C)$ whose elements are called objects.
2. For every pair of objects $A,B \in \operatorname{Ob}(\mathcal C)$, a class $\mathcal C(A,B)$ whose elements are called morphisms from $A$ to $B$.
3. For every triple of objects $A,B,C \in \operatorname{Ob}(\mathcal C)$, a composition operation
\begin{align*}
\mathcal C(B,C) \times \mathcal C(A,B) &\longrightarrow \mathcal C(A,C),\\
(g,f) &\longmapsto g \circ f.
\end{align*}
4. For every object $A \in \operatorname{Ob}(\mathcal C)$, an identity morphism $\operatorname{id}_A \in \mathcal C(A,A)$.
These data satisfy:
1. Associativity: if $f \in \mathcal C(A,B)$, $g \in \mathcal C(B,C)$, and $h \in \mathcal C(C,D)$, then
\begin{align*}
h \circ (g \circ f) = (h \circ g) \circ f.
\end{align*}
2. Identity laws: if $f \in \mathcal C(A,B)$, then
\begin{align*}
f \circ \operatorname{id}_A = f, \qquad \operatorname{id}_B \circ f = f.
\end{align*}
[/definition]
We write $f:A \to B$ to mean $f \in \mathcal C(A,B)$. The object $A$ is the domain of $f$ and $B$ is the codomain of $f$. The notation deliberately resembles ordinary functions, but a morphism need not be a function between underlying sets.
[remark: Hom-Class Notation]
The class $\mathcal C(A,B)$ is often called a hom-class, and in locally small categories it is a set called a hom-set. The direction convention is important: $\mathcal C(A,B)$ contains arrows whose source is $A$ and whose target is $B$.
[/remark]
The notation $\operatorname{id}_A$ will be used constantly, so it has to name a morphism determined by the object rather than by an arbitrary choice. The category axioms ask for an identity arrow at each object, but later cancellation arguments and examples require that there cannot be two competing identity arrows with the same role. The following result isolates that well-definedness point before we start using identities silently.
[quotetheorem:3944]
[citeproof:3944]
This theorem is usually used silently: there is no ambiguity in writing $\operatorname{id}_A$. The next examples show that the axioms have been chosen to match the behaviour of maps in ordinary mathematics.
[example: Sets and Functions]
The category $\mathbf{Set}$ has sets as objects and functions as morphisms. For sets $A$ and $B$, the hom-class $\mathbf{Set}(A,B)$ is the set of all functions $f:A \to B$. If $f:A \to B$ and $g:B \to C$, their composite is the ordinary function
\begin{align*}
g \circ f:A &\longrightarrow C,\\
a &\longmapsto g(f(a)).
\end{align*}
For each set $A$, the identity morphism is the identity function
\begin{align*}
\operatorname{id}_A:A &\longrightarrow A,\\
a &\longmapsto a.
\end{align*}
We verify the category axioms. Let $f:A \to B$, $g:B \to C$, and $h:C \to D$ be functions. For every $a \in A$,
\begin{align*}
\bigl(h \circ (g \circ f)\bigr)(a)
&= h\bigl((g \circ f)(a)\bigr)\\
&= h(g(f(a))),
\end{align*}
while
\begin{align*}
\bigl((h \circ g) \circ f\bigr)(a)
&= (h \circ g)(f(a))\\
&= h(g(f(a))).
\end{align*}
Thus $h \circ (g \circ f)$ and $(h \circ g) \circ f$ have the same value at every $a \in A$, so they are the same function $A \to D$.
For the identity laws, let $f:A \to B$. For every $a \in A$,
\begin{align*}
(f \circ \operatorname{id}_A)(a)
&= f(\operatorname{id}_A(a))\\
&= f(a),
\end{align*}
so $f \circ \operatorname{id}_A=f$. Also, for every $a \in A$,
\begin{align*}
(\operatorname{id}_B \circ f)(a)
&= \operatorname{id}_B(f(a))\\
&= f(a),
\end{align*}
so $\operatorname{id}_B \circ f=f$. Therefore ordinary sets and functions satisfy exactly the categorical axioms: composition is associative, and identity functions act as identity morphisms.
[/example]
The set example verifies the axioms in a familiar setting. A one-object category shows the opposite extreme: all information can live in the morphisms, so composition becomes an abstract multiplication law.
[example: A One-Object Category from a Monoid]
Let $(M,\cdot,e)$ be a monoid. Define $\mathcal C_M$ to have one object $*$, hom-class $\mathcal C_M(*,*)=M$, composition
\begin{align*}
m \circ n := m\cdot n
\end{align*}
for $m,n \in M$, and identity morphism $\operatorname{id}_*:=e$. Since there is only one object, every pair of morphisms in $M$ is composable.
We verify the category axioms. Let $\ell,m,n \in \mathcal C_M(*,*)=M$. Then
\begin{align*}
\ell \circ (m \circ n)
&= \ell \circ (m\cdot n)\\
&= \ell \cdot (m\cdot n)\\
&= (\ell\cdot m)\cdot n\\
&= (\ell \circ m)\cdot n\\
&= (\ell \circ m)\circ n,
\end{align*}
where the middle equality is associativity in the monoid $M$. Thus composition in $\mathcal C_M$ is associative.
For the identity laws, let $m \in \mathcal C_M(*,*)=M$. Since $\operatorname{id}_*=e$,
\begin{align*}
m \circ \operatorname{id}_*
&= m \circ e\\
&= m\cdot e\\
&= m
\end{align*}
and
\begin{align*}
\operatorname{id}_* \circ m
&= e \circ m\\
&= e\cdot m\\
&= m,
\end{align*}
using the right and left identity laws in the monoid. Therefore the category axioms for $\mathcal C_M$ are precisely the monoid axioms for $(M,\cdot,e)$, so a monoid is the same kind of composition structure as a category with only one object.
[/example]
## Size, Local Smallness, and Opposite Categories
If categories may have all sets, all groups, or all topological spaces as objects, in what sense are their collections of objects themselves legitimate mathematical things? The first size distinction separates categories whose objects form a set from categories whose objects are too large to be treated as a set in the same universe.
[definition: Small Category]
A category $\mathcal C$ is small if $\operatorname{Ob}(\mathcal C)$ is a set and each $\mathcal C(A,B)$ is a set.
[/definition]
Not every useful category is small: the familiar categories of all sets, groups, rings, or spaces have too many objects to fit into a single set. The obstruction is not that the morphisms are mysterious, but that collecting every object of the given kind would reproduce a universe-sized collection. We therefore need a separate size term for categories whose objects cannot themselves be treated as one set.
[definition: Large Category]
A category $\mathcal C$ is large if $\operatorname{Ob}(\mathcal C)$ is a proper class.
[/definition]
The categories $\mathbf{Set}$, $\mathbf{Grp}$, $\mathbf{Ring}$, and $\mathbf{Top}$ are large in the usual foundations because their objects range over all sets of a given kind. Many categories used in algebra and topology are large but still have set-sized morphism collections between any two fixed objects.
This motivates a finer distinction than small versus large. Even when the objects form a proper class, category theory often only needs the arrows between two fixed objects to form a set, so that hom-collections can be used as ordinary mathematical objects.
[definition: Locally Small Category]
A category $\mathcal C$ is locally small if for every pair of objects $A,B \in \operatorname{Ob}(\mathcal C)$, the class $\mathcal C(A,B)$ is a set.
[/definition]
Local smallness is the size hypothesis needed for hom-sets to behave like ordinary sets. It lets us count morphisms, form functions out of hom-sets, and later define functors such as $\mathcal C(A,-):\mathcal C \to \mathbf{Set}$ when the surrounding foundations permit it.
[example: A Large but Locally Small Category]
In the usual set-theoretic foundations, $\mathbf{Set}$ is large because its objects would have to form a set containing every set, and there is no set of all sets. It is locally small because, for fixed sets $A$ and $B$, every function $f:A \to B$ is its graph
\begin{align*}
\Gamma_f=\{(a,f(a)) : a \in A\}\subseteq A\times B.
\end{align*}
Thus the collection of all functions $A \to B$ is a subcollection of $\mathcal P(A\times B)$ consisting exactly of those subsets $\Gamma\subseteq A\times B$ such that for every $a\in A$ there is a unique $b\in B$ with $(a,b)\in \Gamma$. Since $A\times B$ is a set and $\mathcal P(A\times B)$ is a set, this hom-class is a set.
If $A$ has $m$ elements and $B$ has $n$ elements, write
\begin{align*}
A=\{a_1,\dots,a_m\}.
\end{align*}
A function $f:A\to B$ is determined by the ordered list
\begin{align*}
(f(a_1),\dots,f(a_m))\in B^m,
\end{align*}
and every ordered list $(b_1,\dots,b_m)\in B^m$ defines exactly one function by $f(a_i)=b_i$ for each $i$. Therefore
\begin{align*}
|\mathbf{Set}(A,B)|
&= |B^m|\\
&= \underbrace{|B|\cdot |B|\cdots |B|}_{m\text{ factors}}\\
&= \underbrace{n\cdot n\cdots n}_{m\text{ factors}}\\
&= n^m.
\end{align*}
So $\mathbf{Set}$ is large at the level of objects, but its morphisms between any two fixed objects still form an ordinary set.
[/example]
The second construction in this section reverses every arrow while leaving the objects unchanged. It is the formal device behind dual statements throughout category theory.
To make arrow reversal into a genuine category, we must also specify how reversed arrows compose. The only possible rule is determined by the original composition law, but the order has to be handled carefully.
[definition: Opposite Category]
Let $\mathcal C$ be a category. The opposite category $\mathcal C^{\mathrm{op}}$ is the category defined as follows:
1. $\operatorname{Ob}(\mathcal C^{\mathrm{op}})=\operatorname{Ob}(\mathcal C)$.
2. For objects $A,B$, define $\mathcal C^{\mathrm{op}}(A,B)=\mathcal C(B,A)$.
3. If $f \in \mathcal C^{\mathrm{op}}(A,B)$ and $g \in \mathcal C^{\mathrm{op}}(B,C)$, their composite in $\mathcal C^{\mathrm{op}}$ is the morphism corresponding to $f \circ g$ in $\mathcal C$.
4. The identity morphism of $A$ in $\mathcal C^{\mathrm{op}}$ is $\operatorname{id}_A$.
[/definition]
The order in the composition rule is forced by types. An arrow $f:A \to B$ in $\mathcal C^{\mathrm{op}}$ is an arrow $f:B \to A$ in $\mathcal C$, and an arrow $g:B \to C$ in $\mathcal C^{\mathrm{op}}$ is an arrow $g:C \to B$ in $\mathcal C$, so the corresponding arrow $A \to C$ in $\mathcal C^{\mathrm{op}}$ must come from $f \circ g:C \to A$ in $\mathcal C$. The next result checks that this type-correct recipe satisfies the category axioms, so that opposite categories can be used without rechecking associativity and identities each time.
[quotetheorem:3945]
[citeproof:3945]
The point of the opposite category is not merely that another category can be manufactured from an old one. It gives a controlled test for which arguments depend only on the formal direction of arrows and which arguments use extra structure not present in the categorical language. Once the construction is legitimate, a statement about arrows can be reversed by passing to $\mathcal C^{\mathrm{op}}$, provided every occurrence of source, target, and composition is reversed coherently.
This observation needs a precise principle because informal arrow-reversal can easily miss a hypothesis or reverse only part of a diagram. The duality principle packages the safe version of this operation: it tells us when a proved categorical statement automatically supplies its reversed counterpart.
[quotetheorem:3946]
[citeproof:3946]
The duality principle is a labour-saving device: once a theorem has been proved, the dual theorem comes for free. Its value is conceptual as well as economical: it tells us which definitions are paired by reversing arrows and which arguments are really the same argument viewed from the opposite direction. Later, monomorphisms and epimorphisms, initial and terminal objects, and products and coproducts will appear in dual pairs.
## Standard Mathematical Categories
Which familiar mathematical universes become categories once we choose the correct structure-preserving maps? The guiding rule is that objects carry structure and morphisms preserve that structure, while composition is inherited from ordinary function composition.
[quotetheorem:3947]
[citeproof:3947]
[example: Module Homomorphisms]
Let $R$ be a unital ring and let $M,N,P$ be left $R$-modules. A morphism $f:M \to N$ in $\mathbf{Mod}_R$ is an abelian group homomorphism satisfying
\begin{align*}
f(rm)=r f(m)
\end{align*}
for all $r \in R$ and $m \in M$. If $g:N \to P$ is another $R$-module homomorphism, then $g \circ f:M \to P$ is additive because, for $m_1,m_2 \in M$,
\begin{align*}
(g\circ f)(m_1+m_2)
&=g(f(m_1+m_2))\\
&=g(f(m_1)+f(m_2))\\
&=g(f(m_1))+g(f(m_2))\\
&=(g\circ f)(m_1)+(g\circ f)(m_2),
\end{align*}
where the second equality uses additivity of $f$ and the third uses additivity of $g$.
It also respects scalar multiplication: for $r\in R$ and $m\in M$,
\begin{align*}
(g\circ f)(rm)
&=g(f(rm))\\
&=g(rf(m))\\
&=r g(f(m))\\
&=r(g\circ f)(m),
\end{align*}
where the second equality uses $R$-linearity of $f$ and the third uses $R$-linearity of $g$. Thus the composite of two $R$-module homomorphisms is again an $R$-module homomorphism, which is exactly the closure under composition needed for the category $R\text{-}\mathbf{Mod}$.
[/example]
Topological spaces require a different preservation condition: morphisms preserve open-set structure through continuity. The category axioms then depend on two familiar facts about continuous maps: identity maps are continuous, and composites of continuous maps are continuous.
[quotetheorem:3948]
[citeproof:3948]
[example: Continuous Maps in Top]
Let $X=\mathbb R$ with the usual topology and let $Y=S^1\subset \mathbb C$ with the [subspace topology](/page/Subspace%20Topology). Define $f:X\to Y$ by $f(t)=e^{it}$. Writing
\begin{align*}
e^{it}=\cos t+i\sin t,
\end{align*}
we first check that the codomain is correct:
\begin{align*}
|f(t)|^2
&=|e^{it}|^2\\
&=(\cos t)^2+(\sin t)^2\\
&=1.
\end{align*}
Thus $f(t)\in S^1$ for every $t\in \mathbb R$.
To see that $f$ is continuous, view $\mathbb C$ as $\mathbb R^2$. The map
\begin{align*}
\widetilde f:\mathbb R&\longrightarrow \mathbb C,\\
t&\longmapsto \cos t+i\sin t
\end{align*}
is continuous because its two coordinate functions $\cos t$ and $\sin t$ are continuous. If $U\subseteq S^1$ is open in the subspace topology, then $U=V\cap S^1$ for some [open set](/page/Open%20Set) $V\subseteq \mathbb C$. Since the image of $f$ lies in $S^1$,
\begin{align*}
f^{-1}(U)
&=\{t\in \mathbb R:f(t)\in V\cap S^1\}\\
&=\{t\in \mathbb R:f(t)\in V\}\\
&=\widetilde f^{-1}(V),
\end{align*}
which is open in $\mathbb R$ because $\widetilde f$ is continuous. Hence $f$ is a morphism in $\mathbf{Top}$.
The map is not injective. For every $t\in\mathbb R$,
\begin{align*}
f(t+2\pi)
&=e^{i(t+2\pi)}\\
&=\cos(t+2\pi)+i\sin(t+2\pi)\\
&=\cos t+i\sin t\\
&=e^{it}\\
&=f(t),
\end{align*}
while $t+2\pi\ne t$. Thus a morphism in $\mathbf{Top}$ must preserve topological structure through continuity, but it need not preserve set-theoretic properties such as injectivity.
[/example]
The standard categories above all have many different maps between the same two objects. It is also important to see categories where a hom-class contains at most one morphism, because then equations of morphisms reduce to questions about whether arrows exist at all.
To make that idea precise, we need a construction that turns order into arrows without adding any extra choices. A partially ordered set provides exactly this sparse kind of category: comparability gives existence of an arrow, reflexivity gives identities, and transitivity gives composition. The next definition packages order-theoretic information as categorical data.
[definition: Poset Category]
Let $(P,\le)$ be a partially ordered set. The associated category $\mathcal C_P$ has objects the elements of $P$ and has a unique morphism $x \to y$ exactly when $x \le y$.
[/definition]
The identity morphism at $x$ comes from reflexivity $x \le x$, and composition comes from transitivity: if $x \le y$ and $y \le z$, then $x \le z$. Antisymmetry is not needed for the category axioms, but it controls when two objects are isomorphic in this category.
[example: Order-Preserving Maps as Morphisms]
There is also a category $\mathbf{Pos}$ whose objects are partially ordered sets and whose morphisms are order-preserving maps. For partially ordered sets $(P,\le_P)$ and $(Q,\le_Q)$, a function $f:P\to Q$ is a morphism when, for all $x,y\in P$,
\begin{align*}
x\le_P y \quad \Longrightarrow \quad f(x)\le_Q f(y).
\end{align*}
We check that these morphisms are closed under identities and composition. For any partially ordered set $(P,\le_P)$, the identity function $\operatorname{id}_P:P\to P$ is order-preserving: if $x\le_P y$, then
\begin{align*}
\operatorname{id}_P(x)&=x,\\
\operatorname{id}_P(y)&=y,
\end{align*}
so the same hypothesis $x\le_P y$ gives
\begin{align*}
\operatorname{id}_P(x)\le_P \operatorname{id}_P(y).
\end{align*}
Now let $f:P\to Q$ and $g:Q\to R$ be order-preserving maps, where $P,Q,R$ are partially ordered sets. If $x,y\in P$ and $x\le_P y$, then order-preservation of $f$ gives
\begin{align*}
f(x)\le_Q f(y).
\end{align*}
Since $g$ is order-preserving and $f(x),f(y)\in Q$, this implies
\begin{align*}
g(f(x))\le_R g(f(y)).
\end{align*}
By the definition of composition,
\begin{align*}
(g\circ f)(x)&=g(f(x)),\\
(g\circ f)(y)&=g(f(y)),
\end{align*}
and therefore
\begin{align*}
(g\circ f)(x)\le_R (g\circ f)(y).
\end{align*}
Thus $g\circ f:P\to R$ is order-preserving. The category $\mathbf{Pos}$ records partially ordered sets together with the maps that preserve their order structure.
[/example]
Order-preserving maps illustrate morphisms defined by preserving a relation. In analysis, the analogous question is which maps preserve quantitative structure strongly enough for composition to remain controlled. For metric spaces, one useful choice is to allow only maps that never increase distance, because that condition is expressed directly in terms of the metric and is stable under composition.
[definition: Category of Metric Spaces]
The category $\mathbf{Met}$ has metric spaces as objects and non-expansive maps as morphisms, where a map $f:(X,d_X) \to (Y,d_Y)$ is non-expansive if
\begin{align*}
d_Y(f(x),f(y)) \le d_X(x,y)
\end{align*}
for all $x,y \in X$.
[/definition]
Non-expansive maps are chosen because they are stable under composition with the same Lipschitz constant bound. Some authors use continuous maps or Lipschitz maps instead; those choices produce different categories with the same objects but different morphisms.
[example: A Non-Expansive Map]
Let $X=Y=\mathbb R$ with the usual metric $d(x,y)=|x-y|$, and define $f:\mathbb R\to\mathbb R$ by $f(x)=\sin x$. We show that $f$ is non-expansive. If $x=y$, then
\begin{align*}
|f(x)-f(y)|
&=|\sin x-\sin x|\\
&=0\\
&=|x-x|\\
&=|x-y|.
\end{align*}
If $x\ne y$, then by the *[Mean Value Theorem](/theorems/186)* applied to $\sin$ on the closed interval with endpoints $x$ and $y$, there is a point $c$ between $x$ and $y$ such that
\begin{align*}
\frac{\sin x-\sin y}{x-y}=\cos c.
\end{align*}
Multiplying by $x-y$ and taking absolute values gives
\begin{align*}
|\sin x-\sin y|
&=|\cos c|\cdot |x-y|\\
&\le 1\cdot |x-y|\\
&=|x-y|.
\end{align*}
Thus $d(f(x),f(y))\le d(x,y)$ for all $x,y\in\mathbb R$, so $f$ is a morphism in $\mathbf{Met}$.
By contrast, define $g:\mathbb R\to\mathbb R$ by $g(x)=2x$. Then
\begin{align*}
d(g(1),g(0))
&=|g(1)-g(0)|\\
&=|2-0|\\
&=2,
\end{align*}
while
\begin{align*}
d(1,0)=|1-0|=1.
\end{align*}
Since $2\le 1$ is false, $g$ does not satisfy the non-expansive inequality. This shows that $\mathbf{Met}$ allows maps that do not increase distances, not arbitrary continuous or Lipschitz maps.
[/example]
The metric example used a numerical inequality to control morphisms. Hilbert spaces add another obstruction: a useful map should preserve linear combinations while also remaining compatible with the topology coming from the norm. Bounded linear maps are exactly the standard analytic maps with these two features, and they compose without leaving the same class.
[definition: Category of Hilbert Spaces]
The category $\mathbf{Hilb}$ has Hilbert spaces as objects and bounded linear maps as morphisms.
[/definition]
Boundedness is the analytic condition that makes composition behave well. If $T:H \to K$ and $S:K \to L$ are bounded linear maps, then $S \circ T$ is linear and satisfies
\begin{align*}
\|S(Tx)\|_L \le \|S\|_{\mathcal L(K,L)}\|T\|_{\mathcal L(H,K)}\|x\|_H.
\end{align*}
Thus the composite is again bounded.
[example: Bounded Linear Maps Between Hilbert Spaces]
Let $H=L^2([0,1])$ and define $T:H\to H$ by
\begin{align*}
(Tf)(x)=x f(x).
\end{align*}
This formula is well-defined on $L^2$-classes: if $f=g$ almost everywhere, then $xf(x)=xg(x)$ almost everywhere, so $Tf=Tg$ in $L^2([0,1])$.
We first check linearity. Let $f,g\in L^2([0,1])$ and let $\alpha,\beta$ be scalars. For almost every $x\in[0,1]$,
\begin{align*}
T(\alpha f+\beta g)(x)
&=x(\alpha f+\beta g)(x)\\
&=x(\alpha f(x)+\beta g(x))\\
&=\alpha x f(x)+\beta x g(x)\\
&=\alpha (Tf)(x)+\beta (Tg)(x).
\end{align*}
Hence
\begin{align*}
T(\alpha f+\beta g)=\alpha Tf+\beta Tg
\end{align*}
in $L^2([0,1])$.
We now check boundedness. Since $0\le x\le 1$ on $[0,1]$, we have $x^2\le 1$. Therefore, for every $f\in L^2([0,1])$,
\begin{align*}
\|Tf\|_{L^2}^2
&=\int_0^1 |(Tf)(x)|^2\,d\mathcal L^1(x)\\
&=\int_0^1 |x f(x)|^2\,d\mathcal L^1(x)\\
&=\int_0^1 x^2 |f(x)|^2\,d\mathcal L^1(x)\\
&\le \int_0^1 |f(x)|^2\,d\mathcal L^1(x)\\
&=\|f\|_{L^2}^2.
\end{align*}
Taking square roots gives
\begin{align*}
\|Tf\|_{L^2}\le \|f\|_{L^2}.
\end{align*}
Thus $T$ is a bounded [linear map](/page/Linear%20Map), so it is a morphism in $\mathbf{Hilb}$. This example is the basic multiplication-operator pattern: multiplication by a bounded scalar function produces a [bounded linear operator](/page/Bounded%20Linear%20Operator) on $L^2$.
[/example]
## What the First Examples Teach
The examples above are not a catalogue for its own sake; they teach how category theory changes the viewpoint. Instead of asking what the elements of an object are, we ask which morphisms into and out of that object are available and how they compose.
[explanation: Same Objects, Different Categories]
A single class of objects can support several useful categories depending on the chosen morphisms. Metric spaces with non-expansive maps, Lipschitz maps, uniformly continuous maps, or continuous maps give different categories. The objects are the same in each case, but the categorical structure changes because the hom-sets and isomorphisms change.
This sensitivity to morphisms is a feature of the language. A construction that is natural for continuous maps may fail to be natural for arbitrary functions, and a theorem about non-expansive maps may encode metric information that is invisible in $\mathbf{Top}$.
[/explanation]
Because examples can use the same objects with different morphisms, it is important to know what data really determines a category. The examples above suggest a common checklist: specify objects, specify each hom-class, specify identities, and specify composition in a way that satisfies the axioms. The following summary isolates that checklist so it can be reused whenever a new mathematical setting is organized categorically.
[quotetheorem:3949]
[citeproof:3949]
The chapter ends with the central habit of the subject: when meeting a new mathematical environment, identify the objects, identify the morphisms, and check identities and composition. The rest of the course studies what can be recovered from that data alone.
Having introduced categories and a range of basic examples, we are ready to ask what features of familiar maps can be expressed purely in terms of composition. The next chapter shifts from the ambient structure to the morphisms themselves, so that we can recognize important mathematical notions categorically.
# 2. Morphisms as Structure
This chapter shifts attention from objects to the arrows between them. In the previous chapter, a category was introduced as a system of objects and composable morphisms; now we ask what properties of familiar maps can be expressed using only composition. The guiding theme is that category theory treats a morphism not as a function with elements, but as a structural position inside all possible diagrams around it.
The first tests are familiar: isomorphisms, one-sided inverses, injective maps, and surjective maps. The surprise is that their categorical versions are often more flexible than their set-theoretic models. The chapter ends with initial and terminal objects, the first universal properties already foreshadowed in Chapter 0: instead of constructing an object by its elements, we specify it by the unique arrows into or out of it.
## Invertible Morphisms and One-Sided Inverses
When should two objects in a category count as the same for the purposes of the category? In algebra, groups are the same when there is an isomorphism between them; in topology, spaces are the same when there is a homeomorphism; in linear algebra, vector spaces are the same when there is an invertible linear map. Category theory packages all of these as the existence of a morphism with a two-sided inverse.
[definition: Isomorphism]
Let $\mathcal C$ be a category. A morphism $f: A \to B$ in $\mathcal C$ is an isomorphism if there exists a morphism $g: B \to A$ such that
\begin{align*}
g \circ f &= \operatorname{id}_A, & f \circ g &= \operatorname{id}_B.
\end{align*}
[/definition]
The definition of isomorphism asks only for the existence of some two-sided inverse. To use inverse notation coherently, we must know that two different choices cannot satisfy the same equations. This is the first place where the category axioms do real work after the definition: associativity and identities alone force the inverse, if it exists, to be unique.
[quotetheorem:3950]
[citeproof:3950]
The theorem uses the two-sided inverse equations, not merely the existence of a one-sided inverse. For instance, the unique function $f:\{0,1\}\to \{*\}$ in $\mathbf{Set}$ has two different right inverses $\{*\}\to \{0,1\}$, so one-sided inverses need not be unique. The theorem also does not say that every morphism has an inverse; it says that once invertibility exists, the inverse is forced by the categorical structure. This prepares the distinction between genuine isomorphisms and the one-sided inverse notions introduced next.
[example: Isomorphisms in Standard Categories]
In $\mathbf{Set}$, an isomorphism $f:A\to B$ is exactly a bijection. If $g:B\to A$ satisfies $g\circ f=\operatorname{id}_A$ and $f\circ g=\operatorname{id}_B$, then for $a,a'\in A$,
\begin{align*}
f(a)=f(a') &\implies g(f(a))=g(f(a')) \\
&\implies a=a',
\end{align*}
so $f$ is injective. For every $b\in B$,
\begin{align*}
f(g(b))=(f\circ g)(b)=\operatorname{id}_B(b)=b,
\end{align*}
so $f$ is surjective. Conversely, if $f$ is bijective, define $g:B\to A$ by taking $g(b)$ to be the unique element $a\in A$ with $f(a)=b$. Then
\begin{align*}
(g\circ f)(a)&=g(f(a))=a,\\
(f\circ g)(b)&=b,
\end{align*}
so $f$ is an isomorphism in $\mathbf{Set}$.
In $\mathbf{Grp}$, the same two-sided inverse equations must hold inside the category of groups, so both $f$ and its inverse must be group homomorphisms. If $f:G\to H$ is a bijective group homomorphism and $g=f^{-1}$ is its set-theoretic inverse, then for $x,y\in H$ choose $a,b\in G$ with $x=f(a)$ and $y=f(b)$. Since $f$ is a homomorphism,
\begin{align*}
g(xy)&=g(f(a)f(b))\\
&=g(f(ab))\\
&=ab\\
&=g(x)g(y),
\end{align*}
so $g$ is also a group homomorphism. Thus the categorical isomorphisms in $\mathbf{Grp}$ are exactly the usual group isomorphisms.
In $\mathbf{Top}$, an isomorphism is a continuous map $f:X\to Y$ with a continuous inverse $g:Y\to X$, so the isomorphisms are homeomorphisms. A bijective continuous map alone is not enough: let $X=\{0,1\}$ with the discrete topology and let $Y=\{0,1\}$ with the indiscrete topology $\{\varnothing,Y\}$. The identity function $i:X\to Y$ is continuous because the only open subsets of $Y$ are $\varnothing$ and $Y$, whose preimages are open in $X$. Its inverse identity function $Y\to X$ is not continuous, since $\{0\}$ is open in $X$ but its preimage $\{0\}$ is not open in $Y$. Thus invertibility depends on the morphisms allowed in the category, not only on the underlying set function.
[/example]
Once isomorphisms identify when two objects are structurally the same, the next useful specialization is to keep the object fixed and study its self-isomorphisms. These arrows measure the symmetries visible inside the chosen category, so changing the allowed morphisms changes the resulting symmetry group.
[definition: Automorphism]
Let $\mathcal C$ be a category and let $A$ be an object of $\mathcal C$. An automorphism of $A$ is an isomorphism $f: A \to A$.
[/definition]
The automorphisms of $A$ form a group under composition, denoted $\operatorname{Aut}_{\mathcal C}(A)$. The identity is $\operatorname{id}_A$, inverses are categorical inverses, and associativity is inherited from the category.
[example: Automorphism Groups]
For a group $G$ regarded as an object of $\mathbf{Grp}$, an element of $\operatorname{Aut}_{\mathbf{Grp}}(G)$ is a morphism $f:G\to G$ in $\mathbf{Grp}$ with an inverse morphism $g:G\to G$. Since the morphisms in $\mathbf{Grp}$ are group homomorphisms, this means
\begin{align*}
g\circ f&=\operatorname{id}_G, &
f\circ g&=\operatorname{id}_G,
\end{align*}
with both $f$ and $g$ homomorphisms. Thus $f$ is exactly a bijective homomorphism whose inverse is also a homomorphism, which is the usual notion of a group automorphism.
For a set $X$, the morphisms in $\mathbf{Set}$ are functions. Hence an automorphism of $X$ is a function $f:X\to X$ for which there is a function $g:X\to X$ satisfying
\begin{align*}
g(f(x))&=x &&\text{for every }x\in X,\\
f(g(x))&=x &&\text{for every }x\in X.
\end{align*}
The first equation makes $f$ injective, and the second makes $f$ surjective. Conversely, any bijection $f:X\to X$ has an inverse function $f^{-1}:X\to X$, so the automorphisms in $\mathbf{Set}$ are precisely the permutations of $X$, forming the symmetric group on $X$.
For a topological space $X$, the morphisms in $\mathbf{Top}$ are continuous maps. Therefore an automorphism of $X$ is a continuous map $f:X\to X$ with a continuous inverse $g:X\to X$ satisfying
\begin{align*}
g\circ f&=\operatorname{id}_X, &
f\circ g&=\operatorname{id}_X.
\end{align*}
This is exactly a homeomorphism $X\to X$. Thus $\operatorname{Aut}_{\mathbf{Grp}}(G)$, $\operatorname{Aut}_{\mathbf{Set}}(X)$, and $\operatorname{Aut}_{\mathbf{Top}}(X)$ recover the familiar symmetry groups appropriate to the category.
[/example]
Automorphisms and isomorphisms require two-sided inverse data, but many constructions naturally provide only one direction of inverse control. For example, one map may choose representatives for another map without making both composites identities. To separate this weaker situation from genuine invertibility, we name the arrow that has a left inverse.
[definition: Section]
Let $\mathcal C$ be a category. A morphism $s: A \to B$ is a section if there exists a morphism $r: B \to A$ such that
\begin{align*}
r \circ s = \operatorname{id}_A.
\end{align*}
[/definition]
In this situation, $r$ is called a retraction of $s$. The word section suggests choosing representatives: in $\mathbf{Set}$, a section of a function $r: B \to A$ chooses for each $a \in A$ an element of the fibre $r^{-1}(\{a\})$.
The same equation can also be read from the other arrow's point of view. Instead of emphasizing the representative-choosing map $s$, we may need to emphasize the map $r$ that collapses back onto $A$ and undoes $s$ after composition.
[definition: Retraction]
Let $\mathcal C$ be a category. A morphism $r: B \to A$ is a retraction if there exists a morphism $s: A \to B$ such that
\begin{align*}
r \circ s = \operatorname{id}_A.
\end{align*}
[/definition]
Thus the same equation says that $s$ is a section and $r$ is a retraction. The important categorical question is what this explicit one-sided inverse data forces. Since monomorphisms and epimorphisms will be defined by cancellation, we first record that a split arrow automatically satisfies the corresponding cancellation law.
[quotetheorem:3951]
[citeproof:3951]
The splitting hypothesis is stronger than the cancellation conclusion. In $\mathbf{Set}$, the inclusion $\varnothing\to \{*\}$ is monic, but it is not a section because there is no function $\{*\}\to\varnothing$ to serve as a retraction. Dually, the theorem does not say that every epimorphism is split; outside categories with enough choice or projectivity, right-cancellation can exist without a chosen section. The result is useful because it gives an efficient route from explicit inverse data to the cancellation laws that define monomorphisms and epimorphisms.
[example: A Projection and an Inclusion]
In $\mathbf{Set}$, let $A$ and $C$ be sets, and let $\pi_A:A\times C\to A$ be the projection defined by $\pi_A(a,c)=a$. Suppose $C$ contains an element $c_0$. Define $s:A\to A\times C$ by
\begin{align*}
s(a)=(a,c_0).
\end{align*}
Then for every $a\in A$,
\begin{align*}
(\pi_A\circ s)(a)
&=\pi_A(s(a))\\
&=\pi_A(a,c_0)\\
&=a\\
&=\operatorname{id}_A(a).
\end{align*}
Thus $\pi_A\circ s=\operatorname{id}_A$ as functions $A\to A$. Therefore $s$ is a section of $\pi_A$, and $\pi_A$ is a retraction of $s$.
The chosen element $c_0\in C$ supplies a specific lift of each $a\in A$ to the point $(a,c_0)\in A\times C$, so this split epimorphism records not only that $\pi_A$ lands back on every element of $A$, but also a chosen way to return to the product over each element.
[/example]
## Monomorphisms and Epimorphisms as Cancellation Laws
How can a category recognize that a map behaves like an injection or a surjection without mentioning elements? The answer is to test whether the map can be cancelled from equations of morphisms. Injective functions can be cancelled after composition on the left, while surjective functions can be cancelled after composition on the right.
[definition: Monomorphism]
Let $\mathcal C$ be a category. A morphism $m: A \to B$ is a monomorphism if for every object $X$ of $\mathcal C$ and every pair of morphisms $u,v: X \to A$, the equality
\begin{align*}
m\circ u=m\circ v
\end{align*}
implies $u=v$.
[/definition]
A monomorphism is often called a mono. The definition says that $m$ is left-cancellative with respect to composition. To express the dual idea, we test whether a morphism can be cancelled when it is precomposed into maps leaving its codomain. This gives the categorical version of being onto, but the definition is stated purely in terms of arrows and equations.
[definition: Epimorphism]
Let $\mathcal C$ be a category. A morphism $e: A \to B$ is an epimorphism if for every object $Y$ of $\mathcal C$ and every pair of morphisms $u,v: B \to Y$, the equality
\begin{align*}
u\circ e=v\circ e
\end{align*}
implies $u=v$.
[/definition]
An epimorphism is often called an epi. This is the categorical dual of a monomorphism: reverse all arrows in the definition of mono and the definition of epi appears. These cancellation definitions are abstract, so they should be checked against the motivating case of ordinary functions. For monomorphisms in $\mathbf{Set}$, the question is whether left-cancellation detects exactly the usual elementwise condition of injectivity.
[quotetheorem:3952]
[citeproof:3952]
The singleton test in the converse is essential because elements of a set can be represented categorically as maps from $1=\{*\}$. Without enough such test objects, monomorphisms need not behave like literal subset inclusions in the same way. The theorem does not say that every monomorphism is literally an inclusion; it says that in $\mathbf{Set}$ every monomorphism is isomorphic to the inclusion of its image. This interpretation motivates the common practice of treating monomorphisms as generalized subobjects.
[example: Subsets as Monomorphisms]
Let $A\subset B$, and let $i:A\to B$ be the inclusion, so $i(a)=a$ for every $a\in A$. To show that $i$ is monic in $\mathbf{Set}$, take any set $X$ and any functions $u,v:X\to A$ such that
\begin{align*}
i\circ u=i\circ v.
\end{align*}
Then for each $x\in X$,
\begin{align*}
i(u(x))&=(i\circ u)(x)\\
&=(i\circ v)(x)\\
&=i(v(x)).
\end{align*}
Since $i$ is the inclusion, this says $u(x)=v(x)$ as elements of $B$. Both $u(x)$ and $v(x)$ already lie in the subset $A$, so they are equal as elements of $A$. Hence $u=v$, and $i$ is a monomorphism.
Conversely, if $m:A\to B$ is a monomorphism in $\mathbf{Set}$, then $m$ is injective by the cancellation argument for functions. Let $m(A)\subset B$ be its image, and define
\begin{align*}
\overline m:A&\to m(A),\\
\overline m(a)&=m(a).
\end{align*}
This map is surjective by the definition of $m(A)$: every element of $m(A)$ has the form $m(a)$ for some $a\in A$. It is injective because if
\begin{align*}
\overline m(a)=\overline m(a'),
\end{align*}
then
\begin{align*}
m(a)=m(a'),
\end{align*}
and injectivity of $m$ gives $a=a'$. Thus $\overline m$ is a bijection $A\to m(A)$, so $m$ factors as
\begin{align*}
A \xrightarrow{\ \overline m\ } m(A) \xrightarrow{\ j\ } B,
\end{align*}
where $j:m(A)\hookrightarrow B$ is the inclusion and
\begin{align*}
(j\circ \overline m)(a)=j(m(a))=m(a)
\end{align*}
for every $a\in A$. Therefore a monomorphism in $\mathbf{Set}$ is literally an inclusion after replacing its domain by the isomorphic copy $m(A)$, which is why monomorphisms are treated as categorical inclusions.
[/example]
The epimorphism story should now be compared with ordinary surjectivity. In $\mathbf{Set}$, right-cancellation can be tested by asking whether two functions out of the codomain agree on every element hit by the original map. The possible obstruction is a point outside the image, where two target functions could differ without being detected by precomposition.
[quotetheorem:3953]
[citeproof:3953]
The construction of the two functions $u$ and $v$ shows why surjectivity is exactly the missing ingredient: a point outside the image cannot be tested by precomposition with $e$. The argument uses the category $\mathbf{Set}$ strongly, because arbitrary functions to $\{0,1\}$ are available and can separate points outside the image. The theorem does not imply that epimorphisms are surjective in every category; the dense-inclusion example below is the standard warning. It only identifies the categorical cancellation law with surjectivity in this particular category.
[example: A Split Epimorphism in Set but Not in Grp]
Let $q:\mathbb Z\to \mathbb Z/2\mathbb Z$ be the quotient homomorphism, written additively, so $q(n)=\overline n$. As a function, $q$ is surjective: every element of $\mathbb Z/2\mathbb Z$ is either $\overline 0=q(0)$ or $\overline 1=q(1)$. Hence if $u,v:\mathbb Z/2\mathbb Z\to Y$ are functions with $u\circ q=v\circ q$, then
\begin{align*}
u(\overline 0)&=u(q(0))=(u\circ q)(0)=(v\circ q)(0)=v(q(0))=v(\overline 0),\\
u(\overline 1)&=u(q(1))=(u\circ q)(1)=(v\circ q)(1)=v(q(1))=v(\overline 1).
\end{align*}
Since $\overline 0$ and $\overline 1$ are the only elements of $\mathbb Z/2\mathbb Z$, this gives $u=v$, so $q$ is an epimorphism in $\mathbf{Set}$.
It is also split in $\mathbf{Set}$. Define a function $s:\mathbb Z/2\mathbb Z\to\mathbb Z$ by
\begin{align*}
s(\overline 0)&=0, &
s(\overline 1)&=1.
\end{align*}
Then
\begin{align*}
(q\circ s)(\overline 0)&=q(0)=\overline 0,\\
(q\circ s)(\overline 1)&=q(1)=\overline 1,
\end{align*}
so $q\circ s=\operatorname{id}_{\mathbb Z/2\mathbb Z}$ as functions.
In $\mathbf{Grp}$, however, no such section can be a group homomorphism. Suppose for contradiction that $s:\mathbb Z/2\mathbb Z\to\mathbb Z$ is a homomorphism with $q\circ s=\operatorname{id}_{\mathbb Z/2\mathbb Z}$. Since $\overline 1+\overline 1=\overline 0$, homomorphism preservation gives
\begin{align*}
s(\overline 1)+s(\overline 1)
&=s(\overline 1+\overline 1)\\
&=s(\overline 0)\\
&=0.
\end{align*}
Thus $2s(\overline 1)=0$ in $\mathbb Z$, which forces $s(\overline 1)=0$. Applying $q$ gives
\begin{align*}
(q\circ s)(\overline 1)=q(0)=\overline 0,
\end{align*}
contradicting $(q\circ s)(\overline 1)=\operatorname{id}_{\mathbb Z/2\mathbb Z}(\overline 1)=\overline 1$. Therefore $q$ is split as a function but not split as a group homomorphism, showing that splitting depends on the morphisms allowed in the category.
[/example]
Monomorphisms and epimorphisms are not synonyms for injective and surjective maps in every category. They are cancellation laws, and cancellation depends on the available test morphisms. Topology provides a useful obstruction to the set-based intuition: a map can fail to hit every point but still have a dense image, so continuous maps into a sufficiently separated target may be determined by their values on that image.
[quotetheorem:3954]
[citeproof:3954]
The Hausdorff hypothesis on the codomain is doing real work: it makes the diagonal closed, which turns equality on a dense subspace into equality everywhere. If the target is not Hausdorff, the equalizer of two continuous maps need not be closed, so this proof no longer applies. The theorem does not say that every epimorphism in $\mathbf{Haus}$ is a dense inclusion; it only supplies an important family of non-surjective epimorphisms. This is the first serious sign that categorical epi means right-cancellative, not onto.
[example: An Epimorphism That Is Not Surjective]
Let $i:\mathbb Q\hookrightarrow \mathbb R$ be the inclusion. It is not surjective as a function because $\sqrt 2\in \mathbb R$ and $\sqrt 2\notin \mathbb Q$.
We show that $i$ is nevertheless an epimorphism in $\mathbf{Haus}$. Let $Y$ be a [Hausdorff space](/page/Hausdorff%20Space), and let $f,g:\mathbb R\to Y$ be continuous maps such that
\begin{align*}
f\circ i=g\circ i.
\end{align*}
For every $q\in\mathbb Q$,
\begin{align*}
f(q)
&=f(i(q))\\
&=(f\circ i)(q)\\
&=(g\circ i)(q)\\
&=g(i(q))\\
&=g(q).
\end{align*}
Thus $\mathbb Q\subseteq E$, where
\begin{align*}
E=\{x\in\mathbb R:f(x)=g(x)\}.
\end{align*}
Since $Y$ is Hausdorff, the diagonal $\Delta_Y=\{(y,y):y\in Y\}\subseteq Y\times Y$ is closed. The map
\begin{align*}
(f,g):\mathbb R&\to Y\times Y,\\
x&\mapsto (f(x),g(x))
\end{align*}
is continuous because $f$ and $g$ are continuous, and
\begin{align*}
E=(f,g)^{-1}(\Delta_Y).
\end{align*}
Therefore $E$ is closed in $\mathbb R$. Since $\mathbb Q$ is dense in $\mathbb R$ and $\mathbb Q\subseteq E$, every closed subset of $\mathbb R$ containing $\mathbb Q$ must be all of $\mathbb R$, so
\begin{align*}
E=\mathbb R.
\end{align*}
Hence $f(x)=g(x)$ for every $x\in\mathbb R$, so $f=g$. This proves that precomposition with $i$ is right-cancellative, even though $i$ is not onto; categorical epimorphisms measure cancellation, not literal surjectivity.
[/example]
There is also a useful relation between isomorphisms and the two cancellation properties. In many concrete categories a bijective homomorphism is an isomorphism, but the categorical statement needs an actual inverse morphism, not only mono and epi behaviour.
[remark: Mono and Epi Need Not Force Isomorphism]
In $\mathbf{Set}$, a function that is both monic and epic is bijective, hence an isomorphism. In $\mathbf{Haus}$, the dense inclusion $\mathbb Q\hookrightarrow \mathbb R$ is both monic and epic, but it is not an isomorphism because there is no continuous inverse $\mathbb R\to \mathbb Q$. Thus a morphism can be both mono and epi without being an isomorphism in a general category.
[/remark]
## Initial and Terminal Objects as Universal Properties
Can an object be characterized by the arrows connected to it rather than by its internal construction? Initial and terminal objects are the first examples of this method. An initial object has exactly one arrow out to every object, while a terminal object has exactly one arrow in from every object.
[definition: Initial Object]
Let $\mathcal C$ be a category. An object $I$ of $\mathcal C$ is initial if for every object $A$ of $\mathcal C$ there exists a unique morphism $I\to A$.
[/definition]
The initial object is a universal source. It supplies a canonical map into every object of the category.
[example: Initial Objects]
In $\mathbf{Set}$, the empty set $\varnothing$ is initial. For any set $A$, there is a function
\begin{align*}
!_A:\varnothing\to A,
\end{align*}
because there is no element $x\in\varnothing$ on which a value must be chosen. If $f,g:\varnothing\to A$ are two functions, then for every $x\in\varnothing$ we have $f(x)=g(x)$ vacuously. Hence $f=g$ by equality of functions, so the function $\varnothing\to A$ is unique.
In $\mathbf{Grp}$, let $0$ denote the one-element group $\{e\}$. For any group $G$, define $\phi:0\to G$ by
\begin{align*}
\phi(e)=e_G.
\end{align*}
This is a homomorphism because
\begin{align*}
\phi(e\cdot e)
&=\phi(e)\\
&=e_G\\
&=e_Ge_G\\
&=\phi(e)\phi(e).
\end{align*}
If $\psi:0\to G$ is any group homomorphism, then
\begin{align*}
\psi(e)
&=\psi(e\cdot e)\\
&=\psi(e)\psi(e).
\end{align*}
Multiplying on the left by $\psi(e)^{-1}$ gives $e_G=\psi(e)$, so $\psi=\phi$. Thus the zero group is initial in $\mathbf{Grp}$.
In the category of unital rings and unital ring homomorphisms, $\mathbb Z$ is initial. For a unital ring $R$, define $\eta_R:\mathbb Z\to R$ by
\begin{align*}
\eta_R(n)=n\cdot 1_R,
\end{align*}
where positive $n$ means an $n$-fold sum of $1_R$, $0\cdot 1_R=0_R$, and negative $n$ means the additive inverse of $(-n)\cdot 1_R$. Then
\begin{align*}
\eta_R(n+m)
&=(n+m)\cdot 1_R\\
&=n\cdot 1_R+m\cdot 1_R\\
&=\eta_R(n)+\eta_R(m),
\end{align*}
and, by distributivity of multiplication over repeated addition,
\begin{align*}
\eta_R(nm)
&=(nm)\cdot 1_R\\
&=(n\cdot 1_R)(m\cdot 1_R)\\
&=\eta_R(n)\eta_R(m).
\end{align*}
Also
\begin{align*}
\eta_R(1)=1\cdot 1_R=1_R,
\end{align*}
so $\eta_R$ is a unital ring homomorphism. If $\psi:\mathbb Z\to R$ is any unital ring homomorphism, then $\psi(1)=1_R$, and for $n>0$,
\begin{align*}
\psi(n)
&=\psi(\underbrace{1+\cdots+1}_{n\text{ times}})\\
&=\underbrace{\psi(1)+\cdots+\psi(1)}_{n\text{ times}}\\
&=n\cdot 1_R.
\end{align*}
Also $\psi(0)=0_R=0\cdot 1_R$, and for $n<0$,
\begin{align*}
\psi(n)
&=\psi(-(-n))\\
&=-\psi(-n)\\
&=-((-n)\cdot 1_R)\\
&=n\cdot 1_R.
\end{align*}
Thus every unital ring homomorphism $\mathbb Z\to R$ equals $\eta_R$, so $\mathbb Z$ is initial. Initiality is therefore not about being empty; it is about having exactly one structure-preserving arrow into every object of the category.
[/example]
The ring example is often the first sign that initial does not mean empty. It means freely generated by no choices compatible with the structure of the category.
The dual universal role reverses the direction of the unique arrows. Instead of asking for an object that maps uniquely into every other object, we ask for an object that receives a unique map from every other object. This captures canonical collapse maps, such as the unique function from any set to a singleton, and motivates the following definition.
[definition: Terminal Object]
Let $\mathcal C$ be a category. An object $T$ of $\mathcal C$ is terminal if for every object $A$ of $\mathcal C$ there exists a unique morphism $A\to T$.
[/definition]
The terminal object is a universal target. Every object admits a canonical map into it.
[example: Terminal Objects]
In $\mathbf{Set}$, let $1=\{*\}$ be a singleton set. For any set $A$, define
\begin{align*}
!_A:A&\to 1,\\
a&\mapsto *.
\end{align*}
This is a function because each $a\in A$ is assigned the element $*\in 1$. If $f:A\to 1$ is any function, then for every $a\in A$ the value $f(a)$ must be an element of the singleton set $\{*\}$, so
\begin{align*}
f(a)=*={!}_A(a).
\end{align*}
Thus $f={!}_A$ by equality of functions, and there is exactly one function $A\to 1$. Hence any singleton set is terminal in $\mathbf{Set}$.
In $\mathbf{Grp}$, let $0=\{e\}$ be the one-element group. For any group $G$, define
\begin{align*}
\epsilon_G:G&\to 0,\\
g&\mapsto e.
\end{align*}
This is a group homomorphism because for all $g,h\in G$,
\begin{align*}
\epsilon_G(gh)
&=e\\
&=ee\\
&=\epsilon_G(g)\epsilon_G(h).
\end{align*}
If $\phi:G\to 0$ is any group homomorphism, then for every $g\in G$ the value $\phi(g)$ lies in the singleton set $\{e\}$, so
\begin{align*}
\phi(g)=e=\epsilon_G(g).
\end{align*}
Therefore $\phi=\epsilon_G$, and the zero group is terminal in $\mathbf{Grp}$.
In $\mathbf{Top}$, let $1=\{*\}$ be any one-point topological space. For a topological space $X$, there is only one underlying function
\begin{align*}
!_X:X&\to 1,\\
x&\mapsto *.
\end{align*}
To check continuity, let $U\subseteq 1$ be open. Since $1$ has one point, either $U=\varnothing$ or $U=1$. Then
\begin{align*}
!_X^{-1}(\varnothing)&=\varnothing,\\
!_X^{-1}(1)&=X,
\end{align*}
and both $\varnothing$ and $X$ are open in $X$. Hence $!_X$ is continuous. Since the underlying function is unique and it is continuous, every space has exactly one continuous map into a one-point space, so any one-point space is terminal in $\mathbf{Top}$.
[/example]
Initial and terminal objects are not expected to be equal as literal objects, and even two initial objects may be presented in different ways. The relevant issue is whether the universal property leaves any freedom in comparing them. Because each initial object has a unique arrow to every object, the comparison maps between two initial objects should be forced, and the next result makes that uniqueness precise.
[quotetheorem:3943]
[citeproof:3943]
Initiality is essential here because it gives both the comparison arrows and the uniqueness of the endomorphisms $I\to I$ and $I'\to I'$. If an object merely has at least one arrow to every object, uniqueness up to unique isomorphism can fail; in a category with two parallel morphisms between the same objects, existence alone does not control the comparison maps. The theorem does not choose a preferred literal initial object among all models. Instead, it justifies treating all initial objects as the same construction inside the category.
The same question arises for terminal objects, but all arrows now point into the universal object. If two terminal objects both receive a unique map from every object, then the maps between them are again forced by the universal property. This is the arrow-reversed version of uniqueness for initial objects, and it explains why all choices of terminal object carry the same categorical information.
[quotetheorem:3956]
[citeproof:3956]
This is the terminal analogue of the initial-object uniqueness principle. Mere existence of maps into an object would not be enough; the force of the result comes from the uniqueness built into the universal property. The theorem does not assert that terminal objects are equal as literal objects: any singleton set is terminal in $\mathbf{Set}$, but different singleton sets need not be equal. What matters is that there is a unique isomorphism between them, which is the categorical form of uniqueness.
[example: The Zero Group as a Zero Object]
Let $0=\{e\}$ be the one-element group. We first show that $0$ is initial in $\mathbf{Grp}$. For any group $G$, define
\begin{align*}
\eta_G:0&\to G,\\
e&\mapsto e_G.
\end{align*}
This is a homomorphism because
\begin{align*}
\eta_G(e\cdot e)
&=\eta_G(e)\\
&=e_G\\
&=e_Ge_G\\
&=\eta_G(e)\eta_G(e).
\end{align*}
If $\phi:0\to G$ is any homomorphism, then
\begin{align*}
\phi(e)
&=\phi(e\cdot e)\\
&=\phi(e)\phi(e).
\end{align*}
Multiplying on the left by $\phi(e)^{-1}$ in $G$ gives
\begin{align*}
e_G=\phi(e),
\end{align*}
so $\phi(e)=\eta_G(e)$, and therefore $\phi=\eta_G$. Hence there is exactly one homomorphism $0\to G$.
We next show that $0$ is terminal. For any group $G$, define
\begin{align*}
\epsilon_G:G&\to 0,\\
g&\mapsto e.
\end{align*}
For all $g,h\in G$,
\begin{align*}
\epsilon_G(gh)
&=e\\
&=e\cdot e\\
&=\epsilon_G(g)\epsilon_G(h),
\end{align*}
so $\epsilon_G$ is a homomorphism. If $\psi:G\to 0$ is any homomorphism, then $\psi(g)$ is an element of the one-element set $\{e\}$, so
\begin{align*}
\psi(g)=e=\epsilon_G(g)
\end{align*}
for every $g\in G$. Thus $\psi=\epsilon_G$, and there is exactly one homomorphism $G\to 0$.
Therefore the zero group is both initial and terminal in $\mathbf{Grp}$, so it is a zero object. Consequently, for any groups $A$ and $B$, the composite
\begin{align*}
A\xrightarrow{\ \epsilon_A\ }0\xrightarrow{\ \eta_B\ }B
\end{align*}
is the canonical zero morphism $A\to B$; explicitly, it sends every $a\in A$ to $e_B$.
[/example]
The definitions of initial and terminal objects are the first universal properties in the course. They identify an object by a pattern of unique arrows, and the uniqueness theorem explains why universal constructions are usually described only up to unique isomorphism.
[remark: Universal Properties as Definitions by Arrows]
A universal property does not merely prove that a construction has a useful feature; it can serve as the definition of the construction within a category. The empty set, singleton set, zero group, and initial ring $\mathbb Z$ are different mathematical objects, but their roles are expressed by the same arrow-counting patterns. Later chapters will use the same idea for products, coproducts, free objects, and representable functors.
[/remark]
Once morphisms have been isolated as structural features, the next step is to compare whole categories rather than individual arrows. Functors provide exactly that level of comparison, carrying the patterns identified in the previous chapter from one category to another.
# 3. Functors
Functors are the maps between categories: they send objects to objects, morphisms to morphisms, and preserve the two structural operations that make categories work. The previous chapter studied morphisms inside a category; this chapter studies comparisons between whole categories. The guiding question is how much mathematical structure survives when we move from one category to another, and how to measure whether such a comparison forgets information.
The examples from Chapter 1 and the morphism tests from Chapter 2 should be kept in view throughout. A forgetful functor such as $\mathbf{Grp} \to \mathbf{Set}$ keeps the underlying set of a group but discards multiplication as part of the output. The fundamental group turns pointed topological spaces into groups. The dual [vector space](/page/Vector%20Space) construction reverses arrows, so it is naturally contravariant rather than covariant.
## Covariant Functors
How can a construction that assigns one mathematical object to another also act on maps between those objects? A functor is designed to answer exactly this question. It is not enough to send objects of one category to objects of another: morphisms must also be transported in a way compatible with identity morphisms and composition.
[definition: Covariant Functor]
Let $\mathcal C$ and $\mathcal D$ be categories. A covariant functor $F: \mathcal C \to \mathcal D$ consists of:
1. for every object $X \in \mathcal C$, an object $F(X) \in \mathcal D$;
2. for every morphism $f: X \to Y$ in $\mathcal C$, a morphism $F(f): F(X) \to F(Y)$ in $\mathcal D$;
3. for every object $X \in \mathcal C$, the identity condition
\begin{align*}
F(\operatorname{id}_X)=\operatorname{id}_{F(X)};
\end{align*}
4. for every pair of composable morphisms $X \xrightarrow{f} Y \xrightarrow{g} Z$ in $\mathcal C$, the composition condition
\begin{align*}
F(g \circ f)=F(g)\circ F(f).
\end{align*}
[/definition]
The two equations say that a functor respects the category structure, not merely the objects. Once the object part is known, the morphism part is often the part where the real mathematics lies.
[example: Forgetful Functor From Groups]
Let $U: \mathbf{Grp}\to\mathbf{Set}$ send a group $G$ to its underlying set $U(G)$, and send a group homomorphism $\varphi:G\to H$ to the same function on elements,
\begin{align*}
U(\varphi):U(G)\to U(H),\qquad g\mapsto \varphi(g).
\end{align*}
For a group $G$, the identity morphism in $\mathbf{Grp}$ is the homomorphism $\operatorname{id}_G:G\to G$ with $\operatorname{id}_G(g)=g$ for every $g\in G$. Therefore the underlying function satisfies
\begin{align*}
U(\operatorname{id}_G)(g)
&=\operatorname{id}_G(g)\\
&=g\\
&=\operatorname{id}_{U(G)}(g)
\end{align*}
for every $g\in U(G)$, so $U(\operatorname{id}_G)=\operatorname{id}_{U(G)}$.
Now let $G\xrightarrow{\varphi}H\xrightarrow{\psi}K$ be group homomorphisms. For every $g\in U(G)$,
\begin{align*}
U(\psi\circ\varphi)(g)
&=(\psi\circ\varphi)(g)\\
&=\psi(\varphi(g))\\
&=U(\psi)(U(\varphi)(g))\\
&=(U(\psi)\circ U(\varphi))(g).
\end{align*}
Thus $U(\psi\circ\varphi)=U(\psi)\circ U(\varphi)$ as functions $U(G)\to U(K)$. Hence $U$ preserves identities and composition, so it is a covariant functor. This example shows that a functor can discard the group operation from each object while still retaining the homomorphisms as structure-respecting functions.
[/example]
The same forgetting pattern appears in topology, where the discarded structure is not an operation but a collection of open sets. Checking this case helps separate object-level loss of structure from arrow-level preservation of composition.
[example: Forgetful Functor From Topological Spaces]
Let $U: \mathbf{Top} \to \mathbf{Set}$ send a topological space $X$ to its underlying set $U(X)$, and send a continuous map $f:X\to Y$ to the same function on points,
\begin{align*}
U(f):U(X)\to U(Y),\qquad x\mapsto f(x).
\end{align*}
For a topological space $X$, the identity morphism in $\mathbf{Top}$ is the continuous map $\operatorname{id}_X:X\to X$ with $\operatorname{id}_X(x)=x$ for every $x\in X$. Hence, for every $x\in U(X)$,
\begin{align*}
U(\operatorname{id}_X)(x)
&=\operatorname{id}_X(x)\\
&=x\\
&=\operatorname{id}_{U(X)}(x).
\end{align*}
Thus $U(\operatorname{id}_X)=\operatorname{id}_{U(X)}$ as functions $U(X)\to U(X)$.
Now let $X\xrightarrow{f}Y\xrightarrow{g}Z$ be continuous maps. Their composite $g\circ f:X\to Z$ is continuous, so it is a morphism in $\mathbf{Top}$. For every $x\in U(X)$,
\begin{align*}
U(g\circ f)(x)
&=(g\circ f)(x)\\
&=g(f(x))\\
&=U(g)(U(f)(x))\\
&=(U(g)\circ U(f))(x).
\end{align*}
Therefore $U(g\circ f)=U(g)\circ U(f)$ as functions $U(X)\to U(Z)$. The assignment $U$ preserves identities and composition, so it is a covariant functor. It keeps the points and the underlying functions, while the open subsets and the continuity condition are no longer part of the objects and morphisms in the target category $\mathbf{Set}$.
[/example]
A functor preserves identities and composition inside a category, but a second functor may then be applied to its outputs. The possible obstruction is coherence: after applying two functors in succession, identities must still go to identities and composites must still be respected in the final category. The following result verifies that no extra compatibility condition is needed.
[quotetheorem:3957]
[citeproof:3957]
The theorem is used constantly, often without comment: a complicated construction may be functorial because it is built as a composite of simpler functorial constructions. Its force is that functoriality is stable under iteration, so one can verify a construction in stages instead of checking the identity and composition laws from scratch each time. This is especially useful for forgetful constructions, quotient constructions, and algebraic operations that pass through intermediate categories before landing in the desired target.
Composition of functors still preserves the direction of arrows. The next issue is different: some important constructions systematically reverse arrows, so they cannot be described as covariant functors out of the original category.
## Contravariant Functors And Opposite Categories
What happens when a construction reverses the direction of every map? Dual spaces, inverse image operations, and pullbacks all behave this way. Rather than treating this as an exception, category theory packages it as an ordinary functor out of an opposite category.
[definition: Contravariant Functor]
Let $\mathcal C$ and $\mathcal D$ be categories. A contravariant functor from $\mathcal C$ to $\mathcal D$ is a covariant functor
\begin{align*}
F: \mathcal C^{\operatorname{op}} \to \mathcal D.
\end{align*}
[/definition]
Equivalently, it assigns to each object $X \in \mathcal C$ an object $F(X) \in \mathcal D$, and to each morphism $f: X \to Y$ in $\mathcal C$ a morphism $F(f): F(Y) \to F(X)$ in $\mathcal D$, preserving identities and reversing composition:
\begin{align*}
F(\operatorname{id}_X)&=\operatorname{id}_{F(X)}, & F(g\circ f)&=F(f)\circ F(g).
\end{align*}
The phrase "contravariant functor $\mathcal C \to \mathcal D$" is common shorthand, but the precise categorical object is a covariant functor $\mathcal C^{\operatorname{op}} \to \mathcal D$. The reversed composition law is forced by the typing of the arrows.
[example: Dual Vector Space Functor]
Fix a field $k$. Let $(-)^*: \mathbf{Vect}_k^{\operatorname{op}}\to \mathbf{Vect}_k$ send a vector space $V$ to its dual vector space $V^*=\operatorname{Hom}_k(V,k)$. If $T:V\to W$ is a $k$-linear map, define
\begin{align*}
T^*:W^*\to V^*,\qquad T^*(\lambda)=\lambda\circ T.
\end{align*}
This is well-defined because if $\lambda:W\to k$ is $k$-linear, then $\lambda\circ T:V\to k$ is $k$-linear: for $v_1,v_2\in V$ and $a,b\in k$,
\begin{align*}
(\lambda\circ T)(av_1+bv_2)
&=\lambda(T(av_1+bv_2))\\
&=\lambda(aT(v_1)+bT(v_2))\\
&=a\lambda(T(v_1))+b\lambda(T(v_2))\\
&=a(\lambda\circ T)(v_1)+b(\lambda\circ T)(v_2).
\end{align*}
For the identity map $\operatorname{id}_V:V\to V$ and every $\lambda\in V^*$,
\begin{align*}
(\operatorname{id}_V)^*(\lambda)
&=\lambda\circ \operatorname{id}_V\\
&=\lambda\\
&=\operatorname{id}_{V^*}(\lambda),
\end{align*}
so $(\operatorname{id}_V)^*=\operatorname{id}_{V^*}$.
Now let $V\xrightarrow{T}W\xrightarrow{S}Z$ be composable $k$-linear maps. For every $\mu\in Z^*$ and every $v\in V$,
\begin{align*}
((S\circ T)^*(\mu))(v)
&=(\mu\circ(S\circ T))(v)\\
&=\mu((S\circ T)(v))\\
&=\mu(S(T(v)))\\
&=(\mu\circ S)(T(v))\\
&=(S^*(\mu))(T(v))\\
&=(T^*(S^*(\mu)))(v)\\
&=((T^*\circ S^*)(\mu))(v).
\end{align*}
Thus $(S\circ T)^*(\mu)= (T^*\circ S^*)(\mu)$ for every $\mu\in Z^*$, so
\begin{align*}
(S\circ T)^*=T^*\circ S^*.
\end{align*}
The arrow direction is reversed: a map $T:V\to W$ produces $T^*:W^*\to V^*$, and composition is reversed in exactly the way required for a contravariant functor.
[/example]
Dual spaces give an algebraic contravariant functor. The fundamental group gives a topological example where maps preserve basepoints and carry loops forward, producing a covariant functor into groups.
[example: Fundamental Group As A Functor]
Let $\mathbf{Top}_*$ be the category of pointed topological spaces $(X,x_0)$ and basepoint-preserving continuous maps. Define
\begin{align*}
\pi_1(f):\pi_1(X,x_0)\to \pi_1(Y,y_0),\qquad [\gamma]\mapsto [f\circ \gamma]
\end{align*}
for a pointed map $f:(X,x_0)\to (Y,y_0)$. This is well-defined: if $H:I\times I\to X$ is a based homotopy from $\gamma$ to $\gamma'$, then $f\circ H:I\times I\to Y$ is a based homotopy from $f\circ\gamma$ to $f\circ\gamma'$, since
\begin{align*}
(f\circ H)(s,0)&=f(\gamma(s)), &
(f\circ H)(s,1)&=f(\gamma'(s)),\\
(f\circ H)(0,t)&=f(x_0)=y_0, &
(f\circ H)(1,t)&=f(x_0)=y_0.
\end{align*}
It is a group homomorphism because for loops $\gamma,\delta:I\to X$ based at $x_0$,
\begin{align*}
\pi_1(f)([\gamma][\delta])
&=\pi_1(f)([\gamma*\delta])\\
&=[f\circ(\gamma*\delta)]\\
&=[(f\circ\gamma)*(f\circ\delta)]\\
&=[f\circ\gamma][f\circ\delta]\\
&=\pi_1(f)([\gamma])\,\pi_1(f)([\delta]).
\end{align*}
For the identity pointed map $\operatorname{id}_{(X,x_0)}:(X,x_0)\to (X,x_0)$ and every loop class $[\gamma]\in\pi_1(X,x_0)$,
\begin{align*}
\pi_1(\operatorname{id}_{(X,x_0)})([\gamma])
&=[\operatorname{id}_X\circ\gamma]\\
&=[\gamma]\\
&=\operatorname{id}_{\pi_1(X,x_0)}([\gamma]).
\end{align*}
Thus $\pi_1(\operatorname{id}_{(X,x_0)})=\operatorname{id}_{\pi_1(X,x_0)}$.
Now let
\begin{align*}
(X,x_0)\xrightarrow{f}(Y,y_0)\xrightarrow{g}(Z,z_0)
\end{align*}
be pointed maps. For every $[\gamma]\in\pi_1(X,x_0)$,
\begin{align*}
\pi_1(g\circ f)([\gamma])
&=[(g\circ f)\circ\gamma]\\
&=[g\circ(f\circ\gamma)]\\
&=\pi_1(g)([f\circ\gamma])\\
&=\pi_1(g)(\pi_1(f)([\gamma]))\\
&=(\pi_1(g)\circ\pi_1(f))([\gamma]).
\end{align*}
Therefore $\pi_1(g\circ f)=\pi_1(g)\circ\pi_1(f)$. The fundamental group assignment preserves identities and composition, so it defines a covariant functor $\pi_1:\mathbf{Top}_*\to\mathbf{Grp}$. It converts pointed topological spaces into groups while keeping the effect of pointed continuous maps through the induced homomorphisms.
[/example]
The fundamental group example is important because the functor is not forgetful: it extracts algebraic information from topology. Functoriality says that this extraction is compatible with continuous maps, so topological constructions can be studied through group homomorphisms.
## Functors And Isomorphisms
Which features of morphisms must a functor preserve? Since a functor preserves identities and composition, it must preserve invertibility. This is one of the first signs that functorial constructions respect categorical structure rather than incidental presentations.
[quotetheorem:3958]
[citeproof:3958]
The converse need not hold. A forgetful functor can send a non-isomorphism of structured objects to a bijection of underlying sets if the inverse function does not preserve the extra structure.
[example: A Bijection Need Not Be A Homeomorphism]
Let $X$ be a set with at least two points. Write $X_{\mathrm{disc}}$ for $X$ with the discrete topology and $X_{\mathrm{ind}}$ for $X$ with the indiscrete topology. Consider the identity function
\begin{align*}
i:X_{\mathrm{disc}}\to X_{\mathrm{ind}},\qquad i(x)=x.
\end{align*}
This function is bijective because if $i(x)=i(y)$, then $x=y$, and for every $y\in X$ we have $i(y)=y$. It is continuous because the only open subsets of $X_{\mathrm{ind}}$ are $\varnothing$ and $X$, and their preimages are
\begin{align*}
i^{-1}(\varnothing)&=\varnothing,\\
i^{-1}(X)&=X,
\end{align*}
both of which are open in the discrete topology.
Under the forgetful functor $U:\mathbf{Top}\to\mathbf{Set}$, the map $i$ becomes the identity bijection $U(i):X\to X$, so it is an isomorphism in $\mathbf{Set}$. However, $i$ is not an isomorphism in $\mathbf{Top}$. Its inverse as a function is again the identity
\begin{align*}
i^{-1}:X_{\mathrm{ind}}\to X_{\mathrm{disc}},\qquad i^{-1}(x)=x.
\end{align*}
Choose $a\in X$. Since $X$ has at least two points, the singleton $\{a\}$ is neither $\varnothing$ nor $X$. It is open in $X_{\mathrm{disc}}$, but
\begin{align*}
(i^{-1})^{-1}(\{a\})=\{a\},
\end{align*}
which is not open in $X_{\mathrm{ind}}$. Therefore $i^{-1}$ is not continuous, so the continuous bijection $i$ is not a homeomorphism. The forgetful functor remembers the underlying bijection but loses the topological information needed to detect whether its inverse is continuous.
[/example]
This example motivates the next distinctions. We need vocabulary for whether a functor loses morphisms, identifies distinct morphisms, or misses objects up to isomorphism.
## Faithful, Full, And Essentially Surjective Functors
When does a functor give an accurate comparison between two categories? The answer has several layers. A functor may remember distinct morphisms, it may capture all morphisms between images, and it may reach every object of the target up to isomorphism.
[definition: Faithful Functor]
Let $F: \mathcal C \to \mathcal D$ be a functor. The functor $F$ is faithful if for every pair of objects $X,Y\in\mathcal C$, the function
\begin{align*}
F_{X,Y}:\operatorname{Hom}_{\mathcal C}(X,Y)\to \operatorname{Hom}_{\mathcal D}(F(X),F(Y)), \qquad f\mapsto F(f)
\end{align*}
is injective.
[/definition]
Faithfulness says that distinct morphisms in $\mathcal C$ remain distinct after applying $F$. The forgetful functors $\mathbf{Grp}\to\mathbf{Set}$ and $\mathbf{Top}\to\mathbf{Set}$ are faithful, since a homomorphism or continuous map is determined by its underlying function.
Faithfulness only says that $F$ does not identify two different arrows; it does not say that the target has no extra arrows between the images. To measure whether every arrow visible between $F(X)$ and $F(Y)$ really comes from an arrow $X\to Y$, we need the complementary surjectivity condition on each hom-set map.
[definition: Full Functor]
Let $F: \mathcal C \to \mathcal D$ be a functor. The functor $F$ is full if for every pair of objects $X,Y\in\mathcal C$, the function
\begin{align*}
F_{X,Y}:\operatorname{Hom}_{\mathcal C}(X,Y)\to \operatorname{Hom}_{\mathcal D}(F(X),F(Y)), \qquad f\mapsto F(f)
\end{align*}
is surjective.
[/definition]
Fullness says that every morphism between objects in the image already came from a morphism upstairs. The forgetful functor $\mathbf{Grp}\to\mathbf{Set}$ is not full, because most functions between underlying sets of groups are not group homomorphisms.
Fullness and faithfulness control morphisms between objects that are already in the image of $F$. They do not address whether the target category contains further objects that are missing from the comparison.
The obstruction is that literal surjectivity on objects is too rigid: a functor may miss an object only because the target category presents an isomorphic copy under a different name. The object-level condition used in category theory is therefore not literal surjectivity, but surjectivity up to isomorphism.
[definition: Essentially Surjective Functor]
Let $F: \mathcal C \to \mathcal D$ be a functor. The functor $F$ is essentially surjective if for every object $D\in\mathcal D$, there exists an object $C\in\mathcal C$ and an isomorphism
\begin{align*}
F(C)\cong D
\end{align*}
in $\mathcal D$.
[/definition]
Essential surjectivity is weaker than literal surjectivity on objects. Category theory usually regards isomorphic objects as interchangeable for structural purposes, so reaching every object up to isomorphism is the relevant condition.
[example: Inclusion Of A Full Subcategory]
Let $\mathcal C$ be the category of finite-dimensional $k$-vector spaces, and let $\mathcal D=\mathbf{Vect}_k$. The inclusion functor
\begin{align*}
I:\mathcal C\to\mathcal D
\end{align*}
sends each finite-dimensional vector space to the same vector space, now regarded as an object of $\mathbf{Vect}_k$, and sends each linear map to the same linear map.
We show that $I$ is faithful. Let $V,W\in\mathcal C$, and let $f,g:V\to W$ be morphisms in $\mathcal C$. If $I(f)=I(g)$ in $\mathbf{Vect}_k$, then the two underlying functions $V\to W$ are equal. Hence, for every $v\in V$,
\begin{align*}
f(v)&=I(f)(v)\\
&=I(g)(v)\\
&=g(v).
\end{align*}
Therefore $f=g$ as linear maps, so the map
\begin{align*}
I_{V,W}:\operatorname{Hom}_{\mathcal C}(V,W)\to \operatorname{Hom}_{\mathcal D}(I(V),I(W))
\end{align*}
is injective.
We show that $I$ is full. Let $V,W\in\mathcal C$, and let
\begin{align*}
h:I(V)\to I(W)
\end{align*}
be a morphism in $\mathbf{Vect}_k$. Since morphisms in $\mathbf{Vect}_k$ are $k$-linear maps, $h:V\to W$ is a $k$-linear map. Since $V$ and $W$ are finite-dimensional, they are objects of $\mathcal C$, so the same linear map $h:V\to W$ is a morphism in $\mathcal C$. Moreover,
\begin{align*}
I(h)=h
\end{align*}
because the inclusion functor does not change morphisms. Thus every morphism $I(V)\to I(W)$ in $\mathbf{Vect}_k$ comes from a morphism $V\to W$ in $\mathcal C$, so $I_{V,W}$ is surjective.
The functor is not essentially surjective. Choose an infinite-dimensional $k$-vector space $E$, such as the vector space $k[x]$ of polynomials over $k$. If $E$ were isomorphic in $\mathbf{Vect}_k$ to an object in the image of $I$, then there would be a finite-dimensional vector space $V$ and a linear isomorphism
\begin{align*}
\phi:V\to E.
\end{align*}
If $(v_1,\dots,v_n)$ is a basis of $V$, then every $e\in E$ has the form $e=\phi(v)$ for some $v\in V$, and
\begin{align*}
v&=a_1v_1+\cdots+a_nv_n,\\
e&=\phi(v)\\
&=\phi(a_1v_1+\cdots+a_nv_n)\\
&=a_1\phi(v_1)+\cdots+a_n\phi(v_n).
\end{align*}
Thus $\phi(v_1),\dots,\phi(v_n)$ span $E$, making $E$ finite-dimensional, a contradiction. Hence infinite-dimensional vector spaces are not isomorphic to objects in the image of $I$. The inclusion is therefore full and faithful, but it reaches only the finite-dimensional part of $\mathbf{Vect}_k$ up to isomorphism.
[/example]
The phrase "embedding" is used for functors that identify the source category with a subcategory of the target. The categorical test for this is fullness plus faithfulness, but this test needs a precise target subcategory when the original target contains objects unrelated to the comparison.
The obstruction is that the literal image of a functor can be too small or badly behaved: it may omit objects merely isomorphic to reached ones, and it may fail to contain all target morphisms between reached objects. The right subcategory should include exactly the objects reached by the functor up to isomorphism, and it should keep all target morphisms between those objects.
[definition: Essential Image]
Let $F:\mathcal C\to\mathcal D$ be a functor. The essential image of $F$ is the full subcategory of $\mathcal D$ whose objects are those objects $D\in\mathcal D$ for which there exists $C\in\mathcal C$ with $D\cong F(C)$.
[/definition]
The essential image records exactly what $F$ reaches up to isomorphism, while keeping all morphisms between such objects in the target. This convention matters because the literal image of a functor on morphisms need not be full.
[quotetheorem:3959]
[citeproof:3959]
Full and faithful functors need not be essentially surjective. They may embed a category as a proper full subcategory of a larger one, as finite-dimensional vector spaces sit inside all vector spaces. The theorem identifies the exact subcategory on which such a functor becomes a perfect comparison: its essential image. This distinction matters because a functor can be completely accurate about the objects it reaches while still failing to describe the whole ambient category.
The next structural question is whether a full faithful functor merely preserves isomorphisms or also detects them. Detection is stronger: it lets one infer that an arrow in the source was already an isomorphism from the fact that its image is an isomorphism.
[quotetheorem:3960]
[citeproof:3960]
This reflection property explains why full faithful functors are strong embeddings: they preserve isomorphisms and also detect them. Essential surjectivity then says that the embedding reaches the whole target category up to isomorphism, the first step toward the later notion of equivalence of categories.
With functors in hand, we can ask when two such comparisons should be regarded as the same in substance rather than just in notation. Natural transformations answer that question by comparing functors componentwise while preserving the categorical structure that makes the comparison meaningful.
# 4. Natural Transformations
Natural transformations answer the question: when should two functors be regarded as doing the same construction in different notation? A functor compares categories by sending objects and morphisms to objects and morphisms, but many constructions in mathematics come in families, with one map for every object under consideration. The central point of this chapter is that such a family deserves to be called compatible only when it respects every morphism in the source category.
This chapter follows the progression from components and naturality squares, through natural isomorphisms and functor categories, to the interpretation of naturality as coordinate-free compatibility. The examples come from linear algebra, algebraic topology, and group-valued constructions, and they show why natural transformations are not additional decoration on functors but the morphisms between functors.
## Comparing Functors Object by Object
Suppose $F,G:\mathcal C\to\mathcal D$ are functors. For each object $X\in\mathcal C$, it may be possible to choose a morphism $\eta_X:F(X)\to G(X)$ in $\mathcal D$. The problem is that arbitrary choices of such morphisms do not interact with maps $f:X\to Y$ in $\mathcal C$, so they need not express a construction internal to the category.
[definition: Natural Transformation]
Let $\mathcal C$ and $\mathcal D$ be categories, and let $F,G:\mathcal C\to\mathcal D$ be functors. A natural transformation $\eta:F\Rightarrow G$ consists of a morphism
\begin{align*}
\eta_X:F(X)\to G(X)
\end{align*}
in $\mathcal D$ for every object $X\in\mathcal C$, such that for every morphism $f:X\to Y$ in $\mathcal C$,
\begin{align*}
G(f)\circ \eta_X=\eta_Y\circ F(f).
\end{align*}
The morphism $\eta_X$ is called the component of $\eta$ at $X$.
[/definition]
The equation in the definition says that applying the component first and then transporting along $G(f)$ gives the same result as transporting along $F(f)$ first and then applying the component at the target. This is the formal version of saying that the family $\eta_X$ is independent of the particular object name chosen.
Because this compatibility condition involves four objects and four arrows, it is usually easier to read as a commutative square. Naming that square gives a compact way to state and check naturality throughout the rest of the page.
[definition: Naturality Square]
Let $\eta:F\Rightarrow G$ be a natural transformation and let $f:X\to Y$ be a morphism in $\mathcal C$. The naturality square of $\eta$ at $f$ is the commutative square in $\mathcal D$ whose top arrow is $F(f):F(X)\to F(Y)$, whose bottom arrow is $G(f):G(X)\to G(Y)$, and whose vertical arrows are the components $\eta_X:F(X)\to G(X)$ and $\eta_Y:F(Y)\to G(Y)$:
\begin{align*}
F(X) &\xrightarrow{F(f)} F(Y)\\
\eta_X\downarrow\; & \;\downarrow\eta_Y\\
G(X) &\xrightarrow{G(f)} G(Y).
\end{align*}
[/definition]
A naturality square is not extra data; it is a way to display the single compatibility equation attached to a morphism $f$. In practice, most checks of naturality amount to proving that these squares commute for the generating morphisms or for an arbitrary morphism of the source category.
[example: Determinant on Automorphism Groups]
Let $\operatorname{Vect}^{\mathrm{fd}}_k$ be the category of finite-dimensional $k$-vector spaces and $k$-linear isomorphisms. Define $\operatorname{Aut}:\operatorname{Vect}^{\mathrm{fd}}_k\to \operatorname{Grp}$ by $\operatorname{Aut}(V)=\operatorname{Aut}_k(V)$ and, for an isomorphism $u:V\to W$, by
\begin{align*}
\operatorname{Aut}(u)(a)=u\circ a\circ u^{-1}
\end{align*}
for $a\in \operatorname{Aut}_k(V)$. Define $k^\times:\operatorname{Vect}^{\mathrm{fd}}_k\to \operatorname{Grp}$ to send every object to $k^\times$ and every isomorphism to $\operatorname{id}_{k^\times}$. For each $V$, the component $\det_V:\operatorname{Aut}_k(V)\to k^\times$ is a group homomorphism because
\begin{align*}
\det_V(a\circ b)=\det_V(a)\det_V(b)
\end{align*}
for $a,b\in\operatorname{Aut}_k(V)$, and $\det_V(a)\ne 0$ since $a$ is invertible.
We show that the family $(\det_V)_V$ is natural. Let $u:V\to W$ be an isomorphism and let $a\in\operatorname{Aut}_k(V)$. The two sides of the naturality equation are homomorphisms $\operatorname{Aut}_k(V)\to k^\times$. Evaluating the bottom-then-right composite gives
\begin{align*}
(k^\times(u)\circ \det_V)(a)
&=k^\times(u)(\det_V(a))\\
&=\operatorname{id}_{k^\times}(\det_V(a))\\
&=\det_V(a).
\end{align*}
Evaluating the right-then-bottom composite gives
\begin{align*}
(\det_W\circ \operatorname{Aut}(u))(a)
&=\det_W(u\circ a\circ u^{-1})\\
&=\det_W(u)\det_V(a)\det_W(u^{-1})\\
&=\det_W(u)\det_V(a)\det_W(u)^{-1}\\
&=\det_V(a).
\end{align*}
Here multiplicativity of determinant is used in the second line, and $\det_W(u^{-1})=\det_W(u)^{-1}$ follows from
\begin{align*}
1=\det_W(\operatorname{id}_W)=\det_W(u\circ u^{-1})=\det_W(u)\det_W(u^{-1}).
\end{align*}
Thus
\begin{align*}
k^\times(u)\circ \det_V=\det_W\circ \operatorname{Aut}(u),
\end{align*}
so the determinant maps form a natural transformation $\det:\operatorname{Aut}\Rightarrow k^\times$. This says precisely that determinant is unchanged when an automorphism is transported across an isomorphism by conjugation.
[/example]
This example explains the word natural in a concrete case: the determinant of an automorphism does not depend on the basis or on the chosen isomorphic copy of the vector space. Naturality records invariance under change of coordinates.
[example: Double Dual Map]
Let $\operatorname{Vect}^{\mathrm{fd}}_k$ be the category of finite-dimensional $k$-vector spaces and $k$-linear maps. Let $I$ be the identity functor and let $(-)^{**}$ be the double-dual functor. For each $V$, define $\eta_V:V\to V^{**}$ by
\begin{align*}
\eta_V(v)(\lambda)=\lambda(v)
\end{align*}
for $v\in V$ and $\lambda\in V^*$. This is linear: for $\alpha,\beta\in k$ and $v_1,v_2\in V$,
\begin{align*}
\eta_V(\alpha v_1+\beta v_2)(\lambda)
&=\lambda(\alpha v_1+\beta v_2)\\
&=\alpha\lambda(v_1)+\beta\lambda(v_2)\\
&=(\alpha\eta_V(v_1)+\beta\eta_V(v_2))(\lambda)
\end{align*}
for every $\lambda\in V^*$, so $\eta_V(\alpha v_1+\beta v_2)=\alpha\eta_V(v_1)+\beta\eta_V(v_2)$.
We show that the maps $\eta_V$ are natural. Let $f:V\to W$ be linear. The dual map $f^*:W^*\to V^*$ is given by
\begin{align*}
f^*(\mu)=\mu\circ f,
\end{align*}
and the double-dual map $f^{**}:V^{**}\to W^{**}$ satisfies
\begin{align*}
f^{**}(\Phi)(\mu)=\Phi(f^*(\mu))
\end{align*}
for $\Phi\in V^{**}$ and $\mu\in W^*$. For $v\in V$ and $\mu\in W^*$,
\begin{align*}
(f^{**}\circ \eta_V)(v)(\mu)
&=f^{**}(\eta_V(v))(\mu)\\
&=\eta_V(v)(f^*(\mu))\\
&=f^*(\mu)(v)\\
&=(\mu\circ f)(v)\\
&=\mu(f(v)),
\end{align*}
while
\begin{align*}
(\eta_W\circ f)(v)(\mu)
&=\eta_W(f(v))(\mu)\\
&=\mu(f(v)).
\end{align*}
Thus $(f^{**}\circ\eta_V)(v)(\mu)=(\eta_W\circ f)(v)(\mu)$ for every $v\in V$ and every $\mu\in W^*$, hence
\begin{align*}
f^{**}\circ\eta_V=\eta_W\circ f.
\end{align*}
Therefore $\eta:I\Rightarrow (-)^{**}$ is a natural transformation.
Finally, each component is an isomorphism in finite dimension. If $\eta_V(v)=0$, then $\lambda(v)=0$ for every $\lambda\in V^*$. If $v\ne 0$, extend $v$ to a basis $(v,e_2,\ldots,e_n)$ of $V$ and define $\lambda\in V^*$ by $\lambda(v)=1$ and $\lambda(e_i)=0$ for $i\ge 2$, contradicting $\lambda(v)=0$. Hence $\eta_V$ is injective. If $\dim V=n$, then a basis of $V$ has a [dual basis](/theorems/414) of $V^*$ with $n$ elements, and that dual basis has a dual basis of $V^{**}$ with $n$ elements, so $\dim V^{**}=n=\dim V$. An injective linear map between finite-dimensional vector spaces of the same dimension is surjective, so $\eta_V$ is an isomorphism. This natural isomorphism says that the evaluation map into the double dual is compatible with every linear map, not with any chosen basis.
[/example]
The finite-dimensional hypothesis matters for the last sentence, not for the definition of the component. The map $V\to V^{**}$ exists for every vector space, but it is an isomorphism precisely in the finite-dimensional setting considered here.
[example: Singular Cycles Inside Singular Chains]
Fix $n\ge 0$, with the convention $C_{-1}(X)=0$ when $n=0$. For a topological space $X$, let $C_n(X)$ be the free abelian group on singular $n$-simplices $\sigma:\Delta^n\to X$, and let
\begin{align*}
Z_n(X)=\ker(\partial_n:C_n(X)\to C_{n-1}(X)).
\end{align*}
If $f:X\to Y$ is continuous, then $C_n(f):C_n(X)\to C_n(Y)$ is determined on generators by
\begin{align*}
C_n(f)(\sigma)=f\circ \sigma
\end{align*}
and extended linearly.
For a generator $\sigma:\Delta^n\to X$, write $\delta_j:\Delta^{n-1}\to\Delta^n$ for the $j$th face map. Then
\begin{align*}
C_{n-1}(f)(\partial_n\sigma)
&=C_{n-1}(f)\left(\sum_{j=0}^n(-1)^j\sigma\circ\delta_j\right)\\
&=\sum_{j=0}^n(-1)^j C_{n-1}(f)(\sigma\circ\delta_j)\\
&=\sum_{j=0}^n(-1)^j f\circ\sigma\circ\delta_j\\
&=\partial_n(f\circ\sigma)\\
&=\partial_n(C_n(f)(\sigma)).
\end{align*}
By linearity, $C_{n-1}(f)\circ\partial_n=\partial_n\circ C_n(f)$ on all of $C_n(X)$. Hence if $c\in Z_n(X)$, then
\begin{align*}
\partial_n(C_n(f)(c))
&=C_{n-1}(f)(\partial_n c)\\
&=C_{n-1}(f)(0)\\
&=0,
\end{align*}
so $C_n(f)(c)\in Z_n(Y)$. Therefore the restricted map
\begin{align*}
Z_n(f):Z_n(X)\to Z_n(Y),\qquad Z_n(f)(c)=C_n(f)(c),
\end{align*}
is well-defined.
Let $i_X:Z_n(X)\hookrightarrow C_n(X)$ be the inclusion. For $c\in Z_n(X)$,
\begin{align*}
(C_n(f)\circ i_X)(c)
&=C_n(f)(c),\\
(i_Y\circ Z_n(f))(c)
&=i_Y(C_n(f)(c))\\
&=C_n(f)(c).
\end{align*}
Thus $C_n(f)\circ i_X=i_Y\circ Z_n(f)$ for every continuous map $f:X\to Y$, so the inclusions form a natural transformation $i:Z_n\Rightarrow C_n$. This says that passing from cycles to chains is compatible with pushing singular simplices forward along continuous maps.
[/example]
The naturality square for this inclusion says that pushing a cycle forward and then regarding it as a chain agrees with regarding it as a chain first and then pushing it forward. This compatibility is the categorical expression of the functoriality of homology at the chain level.
## Composing Natural Transformations
If natural transformations are to be the morphisms between functors, they must compose. There are two different compositions to understand: vertical composition, where the source and target functors are stacked in the same functor category, and horizontal composition, where natural transformations are composed across different source and target categories.
[quotetheorem:3961]
[citeproof:3961]
Vertical composition is the composition one sees inside a fixed diagram of functors $C\to D$. The hypotheses that all three functors have the same source and target are essential: without them the componentwise composite $\theta_X\circ\eta_X$ may not even be defined in $D$. The theorem also does not say that arbitrary objectwise composites are natural; it uses the naturality of both $\eta$ and $\theta$ to make the middle terms $G(f)$ line up. Associativity then follows because the calculation is performed component by component in $D$, so vertical composition gives the ordinary categorical composition in the functor category introduced below.
[definition: Identity Natural Transformation]
Let $F:C\to D$ be a functor. The identity natural transformation $\operatorname{id}_F:F\Rightarrow F$ has component
\begin{align*}
(\operatorname{id}_F)_X=\operatorname{id}_{F(X)}
\end{align*}
for every object $X\in C$.
[/definition]
The identity natural transformation acts as the identity for vertical composition. This is the first indication that, for fixed $C$ and $D$, functors $C\to D$ themselves form a category.
Before building that category, we also need to understand how natural transformations interact with composition of functors across different categories. Vertical composition stacks transformations with the same source and target functors; the next result addresses the separate operation that combines transformations along a chain of functors.
[quotetheorem:3962]
[citeproof:3962]
Horizontal composition is what allows natural transformations to be transported through other functors. The middle category $D$ is essential: the components $\eta_X:F(X)\to G(X)$ must be morphisms in exactly the category on which $F'$ and $G'$ act. A common confusion is to treat horizontal composition like vertical composition, but vertical composition stacks transformations between parallel functors $C\to D$, while horizontal composition combines transformations living on different stages $C\to D\to E$. The special cases where only one side is transformed are called whiskering, and the compatibility between horizontal and vertical composition is the interchange law; these are the first signs that categories, functors, and natural transformations form a two-dimensional calculus.
## Natural Isomorphisms and Functor Categories
When are two functors the same for categorical purposes? Equality of functors is usually too strict, since it requires equality on every object and morphism. Objectwise comparison alone is also too weak: two functors may send each object to isomorphic objects, but if those isomorphisms do not commute with the morphisms of the source category, they cannot be used as morphisms between functors. The useful replacement is isomorphism in the category of functors.
[definition: Natural Isomorphism]
Let $F,G:C\to D$ be functors. A natural transformation $\eta:F\Rightarrow G$ is a natural isomorphism if every component $\eta_X:F(X)\to G(X)$ is an isomorphism in $D$.
[/definition]
If $\eta$ is a natural isomorphism, then the inverse maps $\eta_X^{-1}:G(X)\to F(X)$ form a natural transformation $G\Rightarrow F$. The point is that the naturality square for $\eta^{-1}$ is obtained by rearranging the naturality square for $\eta$ using the inverse components.
This observation needs a formal guarantee because objectwise inverse maps are useful only if they assemble into a natural transformation. The obstruction is coherence: inverses chosen separately at each object could fail to commute with the source morphisms, so they would not define a morphism of functors. Componentwise invertibility must therefore be checked against naturality, and the result below identifies exactly when the categorical inverse exists.
[quotetheorem:3963]
[citeproof:3963]
Natural isomorphism is the correct categorical substitute for equality of constructions. Componentwise invertibility is essential here: if some $\eta_X$ is not an isomorphism in $D$, then there is no possible inverse component at $X$, so the family cannot define an isomorphism in the functor category. Even when every component is merely a monomorphism or epimorphism, the natural transformation may express an inclusion or quotient construction rather than sameness of functors. For instance, the identity functor on finite-dimensional vector spaces is naturally isomorphic to the double-dual functor, but not equal to it as a literal rule on objects.
To make this replacement for equality precise, we now organize functors themselves as objects and natural transformations as arrows. Vertical composition and identity natural transformations then supply the categorical structure needed to speak about isomorphisms of functors internally.
[definition: Functor Category]
Let $C$ be a small category and let $D$ be a category. The functor category $[C,D]$ has functors $C\to D$ as objects and natural transformations between such functors as morphisms. Composition is [vertical composition of natural transformations](/theorems/3961), and identity morphisms are identity natural transformations.
[/definition]
The smallness assumption ensures that the collection of functors $C\to D$ and the collection of natural transformations between two such functors form legitimate collections in the chosen set-theoretic universe. Without a size convention, one may still reason formally, but the phrase category of all functors can hide set-theoretic difficulties.
After the definition, one still has to verify the category axioms: identity natural transformations must act as identities, and vertical composition must be associative. Those facts were established locally for natural transformations, and the next theorem packages them into the statement that the displayed data really forms a category.
[quotetheorem:3964]
[citeproof:3964]
Functor categories turn the study of natural transformations into ordinary category theory. The smallness hypothesis is not cosmetic: if $C$ is a large category such as the category of all sets, then the collection of all functors $C\to D$ may be too large to be an object of the ambient set theory. The theorem therefore does not settle every possible ``category of all functors'' without choosing universe conventions or restricting the source. Within the small-source setting, a natural isomorphism between functors is exactly an isomorphism in $[C,D]$, and later constructions such as representable functors and the Yoneda embedding are naturally phrased inside such functor categories.
## Naturality Without Coordinates
The final question is conceptual: why does naturality so often mean independent of choices? The answer is that a natural transformation commutes with every allowed change of object in the source category, so it cannot depend on data that is destroyed by such changes.
[explanation: Coordinate-Free Compatibility]
Many constructions in linear algebra can be described after choosing a basis, but the resulting formula may depend on that basis. A natural transformation packages a construction so that every change of basis, every isomorphism, or more generally every morphism in the source category produces a commuting square. When the source category is chosen correctly, this commuting-square condition is exactly the assertion that the construction is coordinate-free.
The determinant example uses the category whose morphisms are vector-space isomorphisms. Naturality then says invariance under conjugation: replacing an automorphism by its expression in another vector space leaves the determinant unchanged. The double-dual map uses all linear maps, not only isomorphisms, so it states a stronger compatibility: every linear map intertwines the canonical evaluation maps into the double dual.
[/explanation]
The source category controls the strength of the word natural. If the source contains fewer morphisms, fewer squares must commute, so more families of components qualify as natural. If the source contains more morphisms, naturality becomes a stronger rigidity condition.
[example: A Non-Natural Choice of Basis]
Let $k$ be a field with $2\ne 0$, and consider the identity functor $I:\operatorname{Vect}^{\mathrm{fd}}_k\to\operatorname{Vect}^{\mathrm{fd}}_k$. Suppose that for each non-zero finite-dimensional vector space $V$ we choose an ordered basis $(e_1,\ldots,e_n)$ and define $s_V:V\to V$ by
\begin{align*}
s_V(e_1)=2e_1,\qquad s_V(e_i)=e_i\quad\text{for }2\le i\le n,
\end{align*}
extended linearly; set $s_0=\operatorname{id}_0$. Since $2\in k^\times$, the inverse of $s_V$ sends $e_1$ to $\frac{1}{2}e_1$ and sends $e_i$ to $e_i$ for $i\ge 2$, so each $s_V$ is an isomorphism.
The family $(s_V)_V$ is nevertheless not natural. Naturality for a transformation $I\Rightarrow I$ would require
\begin{align*}
f\circ s_V=s_W\circ f
\end{align*}
for every linear map $f:V\to W$. In one dimension this particular obstruction is invisible: if $V=kv$, $W=kw$, the chosen bases are $(v)$ and $(w)$, and $f(v)=3w$, then
\begin{align*}
(f\circ s_V)(v)
&=f(2v)\\
&=2f(v)\\
&=2(3w)\\
&=6w,
\end{align*}
while
\begin{align*}
(s_W\circ f)(v)
&=s_W(3w)\\
&=3s_W(w)\\
&=3(2w)\\
&=6w.
\end{align*}
Now take two-dimensional spaces $V$ and $W$ with chosen bases $(e_1,e_2)$ and $(w_1,w_2)$. Define $f:V\to W$ by
\begin{align*}
f(e_1)=w_2,\qquad f(e_2)=0.
\end{align*}
Then
\begin{align*}
(f\circ s_V)(e_1)
&=f(2e_1)\\
&=2f(e_1)\\
&=2w_2,
\end{align*}
but
\begin{align*}
(s_W\circ f)(e_1)
&=s_W(w_2)\\
&=w_2.
\end{align*}
Since $2w_2\ne w_2$, the two composites are different, so $(s_V)_V$ is not a natural transformation $I\Rightarrow I$. Objectwise choices of coordinates can therefore produce objectwise automorphisms without producing a natural automorphism of the identity functor.
[/example]
This example is a useful warning. A construction may exist for every object separately and still fail to be natural because the choices do not respect morphisms.
[remark: Naturality as a Test]
To prove that a proposed family $\eta_X:F(X)\to G(X)$ is natural, start with an arbitrary morphism $f:X\to Y$ in the source category and compute the two composites $G(f)\circ\eta_X$ and $\eta_Y\circ F(f)$. Check first that both composites have the same source $F(X)$ and target $G(Y)$, then use the definitions of $F(f)$, $G(f)$, and the components $\eta_X,\eta_Y$ to compare their actions. To disprove naturality, find one morphism for which the two composites differ; isomorphisms often test coordinate dependence, while non-invertible maps often detect hidden choices such as preferred subspaces or bases. The entire condition is contained in this comparison.
[/remark]
Natural transformations complete the basic language for talking about categories, functors, and structured families of maps. The next chapter asks a deeper question: when do two categories have the same categorical content, even if they are not literally identical?
# 5. Equivalence of Categories
The previous chapters developed categories, functors, and natural transformations as ways of organizing mathematical structures and comparing them. This chapter asks when two categories should count as having the same categorical content. A literal isomorphism of categories is often too rigid, because it asks for equality after applying functors in both directions, while ordinary mathematics usually identifies objects only up to isomorphism. The right replacement is equivalence of categories: a comparison that preserves all morphism-level structure while allowing objects to be represented by isomorphic substitutes.
## Isomorphism of Categories and the Need for Equivalence
When two categories model the same kind of mathematics, should we demand a perfect bijective match between their objects, or should we allow different representatives of the same object? The distinction matters because many mathematical constructions make non-canonical choices, such as choosing a basis of a vector space, and those choices should not change the categorical information.
[definition: Isomorphism of Categories]
Let $\mathcal C$ and $\mathcal D$ be categories. An isomorphism of categories from $\mathcal C$ to $\mathcal D$ is a functor $F:\mathcal C\to\mathcal D$ for which there exists a functor $G:\mathcal D\to\mathcal C$ such that
\begin{align*}
G\circ F &= \operatorname{id}_{\mathcal C}, & F\circ G &= \operatorname{id}_{\mathcal D}.
\end{align*}
[/definition]
This definition treats categories as algebraic structures with literal equality of objects, morphisms, and composition. It is useful when categories have been presented with exactly the same data, but it is too strict for most mathematical comparisons.
[example: Ordered Two-Element Sets]
Let $\mathcal C$ have objects $0,1$, identity morphisms $\operatorname{id}_0,\operatorname{id}_1$, and one non-identity morphism $u:0\to 1$. Let $\mathcal D$ have objects $a,b$, identity morphisms $\operatorname{id}_a,\operatorname{id}_b$, and one non-identity morphism $v:a\to b$. Define a functor $F:\mathcal C\to\mathcal D$ by
\begin{align*}
F(0)&=a, & F(1)&=b, &
F(\operatorname{id}_0)&=\operatorname{id}_a, &
F(\operatorname{id}_1)&=\operatorname{id}_b, &
F(u)&=v.
\end{align*}
Define a functor $G:\mathcal D\to\mathcal C$ in the opposite direction by
\begin{align*}
G(a)&=0, & G(b)&=1, &
G(\operatorname{id}_a)&=\operatorname{id}_0, &
G(\operatorname{id}_b)&=\operatorname{id}_1, &
G(v)&=u.
\end{align*}
These assignments preserve identities by definition. They also preserve composition: the only non-identity morphisms are $u$ and $v$, and neither can be composed with itself, so every defined composite involving $u$ or $v$ is forced by an identity, for example
\begin{align*}
F(u\circ \operatorname{id}_0)&=F(u)=v=v\circ \operatorname{id}_a=F(u)\circ F(\operatorname{id}_0),\\
F(\operatorname{id}_1\circ u)&=F(u)=v=\operatorname{id}_b\circ v=F(\operatorname{id}_1)\circ F(u),
\end{align*}
and the same verification for $G$ replaces $v,a,b$ by $u,0,1$.
Now compute the two composites on every object and morphism of the categories:
\begin{align*}
(GF)(0)&=G(a)=0, & (GF)(1)&=G(b)=1,\\
(GF)(\operatorname{id}_0)&=G(\operatorname{id}_a)=\operatorname{id}_0, &
(GF)(\operatorname{id}_1)&=G(\operatorname{id}_b)=\operatorname{id}_1, &
(GF)(u)&=G(v)=u,
\end{align*}
so $G\circ F=\operatorname{id}_{\mathcal C}$. Similarly,
\begin{align*}
(FG)(a)&=F(0)=a, & (FG)(b)&=F(1)=b,\\
(FG)(\operatorname{id}_a)&=F(\operatorname{id}_0)=\operatorname{id}_a, &
(FG)(\operatorname{id}_b)&=F(\operatorname{id}_1)=\operatorname{id}_b, &
(FG)(v)&=F(u)=v,
\end{align*}
so $F\circ G=\operatorname{id}_{\mathcal D}$. Hence $F$ is an isomorphism of categories with inverse $G$: the object names changed, but the objects, morphisms, identities, and composition have an exact bijective match.
[/example]
The next example shows the limitation of literal isomorphism. The category of finite-dimensional vector spaces contains many distinct objects of the same dimension, while a skeletal matrix category has only one object for each dimension.
[definition: Equivalence of Categories]
Let $\mathcal C$ and $\mathcal D$ be categories. An equivalence of categories consists of functors
\begin{align*}
F &: \mathcal C\to\mathcal D, & G &: \mathcal D\to\mathcal C
\end{align*}
and natural isomorphisms
\begin{align*}
\eta &: \operatorname{id}_{\mathcal C} \Rightarrow G\circ F, & \varepsilon &: F\circ G \Rightarrow \operatorname{id}_{\mathcal D}.
\end{align*}
[/definition]
Here the composites need not be equal to the identity functors. They only need to be naturally isomorphic to the identity functors, meaning that every object returns to an isomorphic object in a way compatible with all morphisms.
[remark: Direction of the Natural Isomorphisms]
Some authors define an equivalence using natural isomorphisms $G F\Rightarrow \operatorname{id}_{\mathcal C}$ and $\operatorname{id}_{\mathcal D}\Rightarrow F G$. Since the components are isomorphisms, either orientation gives the same notion after inversion. The essential point is not the direction but naturality across every morphism.
[/remark]
With the definition fixed, we need examples that are equivalent without being literally the same category. Finite-dimensional vector spaces and matrices provide the cleanest test case because bases turn linear maps into matrices, while change of basis accounts for the non-literal equality.
[example: Vector Spaces and Matrix Categories]
Fix a field $k$. In $\mathbf{Mat}_k$, a morphism $m\to n$ is an $n\times m$ matrix $A=(a_{ij})$, and $S(A):k^m\to k^n$ is the linear map
\begin{align*}
S(A)(x_1,\ldots,x_m)
&=\left(\sum_{j=1}^m a_{1j}x_j,\ldots,\sum_{j=1}^m a_{nj}x_j\right).
\end{align*}
The identity morphism $n\to n$ is the identity matrix $I_n=(\delta_{ij})$, and for $x=(x_1,\ldots,x_n)$,
\begin{align*}
S(I_n)(x)_i
&=\sum_{j=1}^n \delta_{ij}x_j
=x_i,
\end{align*}
so $S(I_n)=\operatorname{id}_{k^n}$. If $A:m\to n$ and $B:n\to p$, then $B A$ has entries
\begin{align*}
(BA)_{\ell j}=\sum_{i=1}^n b_{\ell i}a_{ij}.
\end{align*}
For $x=(x_1,\ldots,x_m)$, the $\ell$th coordinate of $S(B A)(x)$ is
\begin{align*}
S(BA)(x)_\ell
&=\sum_{j=1}^m (BA)_{\ell j}x_j\\
&=\sum_{j=1}^m\left(\sum_{i=1}^n b_{\ell i}a_{ij}\right)x_j\\
&=\sum_{i=1}^n b_{\ell i}\left(\sum_{j=1}^m a_{ij}x_j\right)\\
&=S(B)(S(A)(x))_\ell,
\end{align*}
so $S(BA)=S(B)\circ S(A)$. Thus $S:\mathbf{Mat}_k\to\mathbf{FdVect}_k$ is a functor.
For every finite-dimensional vector space $V$, choose an ordered basis $\beta_V=(e_1,\ldots,e_d)$, where $d=\dim_k V$, choosing the standard basis when $V=k^d$. Define $T(V)=d$. If $f:V\to W$ and $\beta_W=(w_1,\ldots,w_r)$, write
\begin{align*}
f(e_j)=\sum_{i=1}^r a_{ij}w_i
\end{align*}
and define $T(f)$ to be the $r\times d$ matrix $(a_{ij})$. The coordinate isomorphism
\begin{align*}
\phi_V:V&\to k^d, &
\phi_V\left(\sum_{j=1}^d x_j e_j\right)&=(x_1,\ldots,x_d)
\end{align*}
satisfies, for $v=\sum_{j=1}^d x_j e_j$,
\begin{align*}
\phi_W(f(v))
&=\phi_W\left(f\left(\sum_{j=1}^d x_j e_j\right)\right)\\
&=\phi_W\left(\sum_{j=1}^d x_j f(e_j)\right)\\
&=\phi_W\left(\sum_{j=1}^d x_j\sum_{i=1}^r a_{ij}w_i\right)\\
&=\left(\sum_{j=1}^d a_{1j}x_j,\ldots,\sum_{j=1}^d a_{rj}x_j\right)\\
&=S(T(f))(\phi_V(v)).
\end{align*}
Hence $\phi_W\circ f=S(T(f))\circ\phi_V$, so the maps $\phi_V$ form a natural isomorphism $\operatorname{id}_{\mathbf{FdVect}_k}\Rightarrow S T$. On the standard spaces, the chosen bases are standard, so $T(S(n))=n$ and $T(S(A))=A$ for every matrix $A$. Therefore $T S=\operatorname{id}_{\mathbf{Mat}_k}$, and $S$ is an equivalence of categories.
This equivalence is not canonical: the construction of $T$ depends on the chosen bases $\beta_V$, and changing a basis changes the coordinate matrix of the same linear map.
[/example]
This example is the motivating pattern for the chapter. Equivalence keeps the morphisms exactly under control but allows objects to be replaced by isomorphic standard representatives.
The obstruction to replacing equivalence by isomorphism is object equality. A category may contain many isomorphic copies of the same structure, while a skeletal or coordinate category may keep only one representative. The next criterion isolates the amount of sameness that equivalence requires without forcing the two object collections to match literally.
[quotetheorem:3965]
[citeproof:3965]
The converse fails in the situations that matter most: equivalent categories can have different numbers of objects, because equivalence identifies isomorphic objects rather than demanding equality. This is why equivalence is the right comparison for categories of mathematical structures: it preserves the available morphisms and the classification of objects up to isomorphism, but it ignores duplicate presentations. Thus a category of all finite-dimensional vector spaces and a category of standard coordinate spaces can describe the same mathematics even though their objects are not literally the same.
## Full, Faithful, and Essentially Surjective Functors
Using the full, faithful, and essentially surjective notions introduced in Chapter 3, how can we recognize an equivalence without first constructing a quasi-inverse functor? The practical criterion separates the problem into morphisms and objects: preserve all hom-set information, and make sure every object on the target side is represented up to isomorphism.
[definition: Full Functor]
Let $F:\mathcal C\to\mathcal D$ be a functor. The functor $F$ is full if, for every pair of objects $X,Y\in\mathcal C$, the function
\begin{align*}
F_{X,Y}:\operatorname{Hom}_{\mathcal C}(X,Y)\to\operatorname{Hom}_{\mathcal D}(F X,F Y)
\end{align*}
is surjective.
[/definition]
Fullness says that every morphism between objects in the image of $F$ comes from a morphism before applying $F$. This handles surjectivity on each hom-set, but it leaves a serious obstruction: a functor could still collapse different source morphisms into the same target morphism, losing equations and distinctions that existed in the original category.
The next property isolates the complementary requirement on each hom-set map. To compare categories without losing morphism-level information, we must require that distinct parallel morphisms remain distinct after applying the functor.
[definition: Faithful Functor]
Let $F:\mathcal C\to\mathcal D$ be a functor. The functor $F$ is faithful if, for every pair of objects $X,Y\in\mathcal C$, the function
\begin{align*}
F_{X,Y}:\operatorname{Hom}_{\mathcal C}(X,Y)\to\operatorname{Hom}_{\mathcal D}(F X,F Y)
\end{align*}
is injective.
[/definition]
Faithfulness says that $F$ does not identify two different morphisms with the same source and target. Fullness and faithfulness together make the morphism theory accurate on the part of the target that is reached, but they still say nothing about target objects outside that part.
The remaining obstruction to equivalence is object-level coverage. Since categorical sameness treats isomorphic objects as interchangeable, the right condition asks whether every target object is represented by the functor up to isomorphism, not whether it literally lies in the image.
[definition: Essentially Surjective Functor]
Let $F:\mathcal C\to\mathcal D$ be a functor. The functor $F$ is essentially surjective if for every object $D\in\mathcal D$, there exists an object $C\in\mathcal C$ such that $D$ is isomorphic to $F C$ in $\mathcal D$.
[/definition]
The word "essentially" records the central categorical convention: objects are reached up to isomorphism, not necessarily on the nose.
[example: Inclusion of Standard Vector Spaces]
Let $S:\mathbf{Mat}_k\to\mathbf{FdVect}_k$ send $n$ to $k^n$ and send an $n\times m$ matrix $A=(a_{ij})$ to the linear map
\begin{align*}
S(A)(x_1,\ldots,x_m)
=\left(\sum_{j=1}^m a_{1j}x_j,\ldots,\sum_{j=1}^m a_{nj}x_j\right).
\end{align*}
To see that $S$ is full, let $f:k^m\to k^n$ be any linear map. Write $e_1,\ldots,e_m$ for the standard basis of $k^m$ and write each image in coordinates:
\begin{align*}
f(e_j)=(a_{1j},\ldots,a_{nj})\in k^n.
\end{align*}
For $x=(x_1,\ldots,x_m)=\sum_{j=1}^m x_j e_j$, linearity gives
\begin{align*}
f(x)
&=f\left(\sum_{j=1}^m x_j e_j\right)\\
&=\sum_{j=1}^m x_j f(e_j)\\
&=\sum_{j=1}^m x_j(a_{1j},\ldots,a_{nj})\\
&=\left(\sum_{j=1}^m a_{1j}x_j,\ldots,\sum_{j=1}^m a_{nj}x_j\right)\\
&=S(A)(x),
\end{align*}
where $A=(a_{ij})$. Hence every linear map $k^m\to k^n$ is $S(A)$ for some matrix $A$.
To see that $S$ is faithful, suppose $A=(a_{ij})$ and $B=(b_{ij})$ are $n\times m$ matrices with $S(A)=S(B)$. For the standard basis vector $e_j\in k^m$,
\begin{align*}
S(A)(e_j)&=(a_{1j},\ldots,a_{nj}),\\
S(B)(e_j)&=(b_{1j},\ldots,b_{nj}).
\end{align*}
Since $S(A)=S(B)$, these two vectors are equal for every $j$, so $a_{ij}=b_{ij}$ for every $i$ and $j$. Thus $A=B$.
The functor is essentially surjective because if $V$ is finite-dimensional and $\dim_k V=n$, then choosing a basis $v_1,\ldots,v_n$ defines an isomorphism
\begin{align*}
k^n&\to V, &
(x_1,\ldots,x_n)&\mapsto \sum_{i=1}^n x_i v_i.
\end{align*}
This map is linear by construction, surjective because the basis spans $V$, and injective because a linear relation among the $v_i$ has all coefficients zero. Therefore every object of $\mathbf{FdVect}_k$ is isomorphic to some $S(n)=k^n$. The functor need not be literally surjective on objects: a finite-dimensional vector space $V$ with underlying set not equal to $k^n$ is still only isomorphic to $k^n$, not equal to it, unless $\mathbf{FdVect}_k$ has been defined to contain only the standard spaces.
[/example]
The example shows why the recognition criterion needs all three hypotheses. Fullness supplies enough morphisms, faithfulness prevents collapse of distinct morphisms, and essential surjectivity ensures that no target object is left outside the comparison up to isomorphism. Without any one of these conditions, a functor may resemble an equivalence locally while still failing to describe the whole target category.
[quotetheorem:3966]
[citeproof:3966]
This theorem is the main working test for equivalence. In examples, checking full and faithful usually means understanding the morphisms, while checking essential surjectivity usually means proving a classification up to isomorphism.
[example: A Non-Full Embedding]
Let $U:\mathbf{Grp}\to\mathbf{Set}$ be the forgetful functor. To see that $U$ is faithful, let $f,g:G\to H$ be group homomorphisms and suppose $U(f)=U(g)$ as functions $U(G)\to U(H)$. Then for every $x\in G$,
\begin{align*}
f(x)&=U(f)(x)\\
&=U(g)(x)\\
&=g(x).
\end{align*}
Since $f$ and $g$ have the same value on every element of $G$, they are the same homomorphism. Thus each map
\begin{align*}
U_{G,H}:\operatorname{Hom}_{\mathbf{Grp}}(G,H)\to \operatorname{Hom}_{\mathbf{Set}}(U(G),U(H))
\end{align*}
is injective.
The functor is not full. Let $C_2=\{0,1\}$ with addition modulo $2$, and define a function $\tau:U(C_2)\to U(C_2)$ by
\begin{align*}
\tau(0)&=1, & \tau(1)&=0.
\end{align*}
If $\tau$ were the underlying function of a group homomorphism $C_2\to C_2$, then it would preserve addition. But
\begin{align*}
\tau(0+0)&=\tau(0)=1,\\
\tau(0)+\tau(0)&=1+1=0
\end{align*}
in $C_2$, so $\tau(0+0)\ne \tau(0)+\tau(0)$. Hence $\tau$ is a set map between underlying sets that is not in the image of $U_{C_2,C_2}$, so $U$ is not full. Therefore the forgetful functor preserves distinct homomorphisms but does not capture all set maps, and it cannot be an equivalence of categories.
[/example]
Full faithfulness alone still does not guarantee equivalence. To see why essential surjectivity is a separate condition, we need a functor that preserves every Hom-set perfectly but misses some objects up to isomorphism.
[example: A Fully Faithful Functor That Is Not Essentially Surjective]
Let $I:\mathbf{FdVect}_k\to\mathbf{Vect}_k$ be the inclusion functor. Thus for each finite-dimensional vector space $V$, $I(V)=V$, and for each linear map $f:V\to W$, $I(f)=f$ regarded as the same function in the larger category $\mathbf{Vect}_k$.
For finite-dimensional vector spaces $V$ and $W$, the hom-set map induced by $I$ is
\begin{align*}
I_{V,W}:\operatorname{Hom}_{\mathbf{FdVect}_k}(V,W)
&\to \operatorname{Hom}_{\mathbf{Vect}_k}(I(V),I(W)),\\
f&\mapsto f.
\end{align*}
Both sides consist of exactly the linear maps $V\to W$. Therefore if $g:I(V)\to I(W)$ is a morphism in $\mathbf{Vect}_k$, then $g$ is a linear map $V\to W$, so $g=I(g)$ with $g\in\operatorname{Hom}_{\mathbf{FdVect}_k}(V,W)$; this proves fullness. If $f,g:V\to W$ satisfy $I(f)=I(g)$, then the two underlying functions $V\to W$ are equal, so for every $v\in V$,
\begin{align*}
f(v)=I(f)(v)=I(g)(v)=g(v).
\end{align*}
Hence $f=g$, so $I$ is faithful.
The functor is not essentially surjective. Let $E$ be an infinite-dimensional $k$-vector space, for example the vector space $k[x]$ of polynomials in one variable over $k$. The set
\begin{align*}
\{1,x,x^2,x^3,\ldots\}
\end{align*}
is linearly independent because if
\begin{align*}
a_0+a_1x+\cdots+a_nx^n=0
\end{align*}
as a polynomial, then equality of coefficients gives
\begin{align*}
a_0=a_1=\cdots=a_n=0.
\end{align*}
Thus $k[x]$ is not finite-dimensional. If $k[x]$ were isomorphic to $I(V)=V$ for some finite-dimensional vector space $V$, then choosing a finite basis $v_1,\ldots,v_m$ of $V$ and an isomorphism $\alpha:V\to k[x]$ would make
\begin{align*}
\alpha(v_1),\ldots,\alpha(v_m)
\end{align*}
a basis of $k[x]$: every $p\in k[x]$ has $\alpha^{-1}(p)=\sum_{i=1}^m c_i v_i$, so
\begin{align*}
p=\alpha(\alpha^{-1}(p))
=\alpha\left(\sum_{i=1}^m c_i v_i\right)
=\sum_{i=1}^m c_i\alpha(v_i),
\end{align*}
and if $\sum_{i=1}^m c_i\alpha(v_i)=0$, then applying $\alpha^{-1}$ gives
\begin{align*}
0=\alpha^{-1}(0)
=\alpha^{-1}\left(\sum_{i=1}^m c_i\alpha(v_i)\right)
=\sum_{i=1}^m c_i v_i,
\end{align*}
so $c_1=\cdots=c_m=0$. This would make $k[x]$ finite-dimensional, contradicting the infinite linearly independent set above. Therefore $I$ preserves all hom-set information among finite-dimensional vector spaces, but it does not reach every vector space up to isomorphism.
[/example]
## Skeletons and Categorical Invariants
If equivalence ignores repeated isomorphic copies of the same object, can every category be compressed by choosing one representative from each isomorphism class? Skeletons formalize this compression and explain why equivalence is often the right notion of sameness.
[definition: Skeleton]
A skeleton of a category $\mathcal C$ is a full subcategory $\mathcal S\subset\mathcal C$ such that every object of $\mathcal C$ is isomorphic to exactly one object of $\mathcal S$.
[/definition]
The word "exactly" refers to equality of objects inside $\mathcal S$: no two distinct objects of the skeleton are isomorphic in $\mathcal C$. A skeleton keeps all morphisms between the chosen representatives.
[example: A Skeleton of Finite-Dimensional Vector Spaces]
In $\mathbf{FdVect}_k$, choose one representative for each finite dimension, namely the standard vector space $k^n$ for $n\in\mathbb N\cup\{0\}$, and let $\mathcal S$ be the full subcategory of $\mathbf{FdVect}_k$ whose objects are these $k^n$. If $V$ is finite-dimensional and $\dim_k V=n$, choose a basis $v_1,\ldots,v_n$ of $V$. The map
\begin{align*}
\alpha:k^n&\to V, &
\alpha(x_1,\ldots,x_n)&=\sum_{i=1}^n x_i v_i
\end{align*}
is linear because
\begin{align*}
\alpha((x_1,\ldots,x_n)+(y_1,\ldots,y_n))
&=\alpha(x_1+y_1,\ldots,x_n+y_n)\\
&=\sum_{i=1}^n (x_i+y_i)v_i\\
&=\sum_{i=1}^n x_i v_i+\sum_{i=1}^n y_i v_i\\
&=\alpha(x_1,\ldots,x_n)+\alpha(y_1,\ldots,y_n),
\end{align*}
and similarly $\alpha(c x)=c\alpha(x)$. It is surjective because the $v_i$ span $V$, and it is injective because
\begin{align*}
\alpha(x_1,\ldots,x_n)=0
&\Longrightarrow \sum_{i=1}^n x_i v_i=0\\
&\Longrightarrow x_1=\cdots=x_n=0
\end{align*}
by [linear independence](/page/Linear%20Independence) of the basis. Thus every finite-dimensional vector space is isomorphic to some object of $\mathcal S$.
No two distinct objects of $\mathcal S$ are isomorphic. Indeed, if $k^m\cong k^n$, then an isomorphism sends a basis of $k^m$ to a basis of $k^n$, so both vector spaces have the same number of basis vectors, hence $m=n$. Therefore $\mathcal S$ contains exactly one object from each isomorphism class in $\mathbf{FdVect}_k$, so it is a skeleton.
Finally, $\mathcal S$ is isomorphic to $\mathbf{Mat}_k$. Send the object $n$ of $\mathbf{Mat}_k$ to $k^n$, and send an $n\times m$ matrix $A=(a_{ij})$ to the linear map
\begin{align*}
S(A)(x_1,\ldots,x_m)
=\left(\sum_{j=1}^m a_{1j}x_j,\ldots,\sum_{j=1}^m a_{nj}x_j\right).
\end{align*}
Every linear map $f:k^m\to k^n$ is obtained this way: if $e_1,\ldots,e_m$ is the standard basis and
\begin{align*}
f(e_j)=(a_{1j},\ldots,a_{nj}),
\end{align*}
then for $x=\sum_{j=1}^m x_j e_j$,
\begin{align*}
f(x)
&=\sum_{j=1}^m x_j f(e_j)\\
&=\sum_{j=1}^m x_j(a_{1j},\ldots,a_{nj})\\
&=\left(\sum_{j=1}^m a_{1j}x_j,\ldots,\sum_{j=1}^m a_{nj}x_j\right)\\
&=S(A)(x).
\end{align*}
The matrix $A$ is unique because its $j$th column is $f(e_j)$. Hence the skeleton records exactly the invariant $\dim_k V$: finite-dimensional vector spaces are isomorphic precisely when they have the same dimension.
[/example]
Skeletons are useful only if replacing a category by one representative from each isomorphism class does not change the categorical information. The possible obstruction is that deleting duplicate objects might also delete morphism data or change composition. The theorem below confirms that a skeleton preserves the category up to equivalence, so it is a legitimate way to remove redundant isomorphic copies.
[quotetheorem:3967]
[citeproof:3967]
Skeletons are not usually canonical. Choosing one representative from each isomorphism class often requires arbitrary choices, and equivalent categories need not come with a preferred equivalence.
[example: Equivalent Choices of Bases]
Let $V$ be finite-dimensional with $\dim_k V=n$. Choose an ordered basis $\beta=(e_1,\ldots,e_n)$ and define
\begin{align*}
\phi_\beta:V&\to k^n, &
\phi_\beta\left(\sum_{j=1}^n x_j e_j\right)&=(x_1,\ldots,x_n).
\end{align*}
If $\gamma=(e'_1,\ldots,e'_n)$ is another ordered basis, write each new basis vector in the old basis:
\begin{align*}
e'_j=\sum_{i=1}^n p_{ij}e_i.
\end{align*}
Let $P=(p_{ij})$. Since $\gamma$ is a basis, the linear map $k^n\to V$ sending the $j$th standard basis vector to $e'_j$ is an isomorphism, so its matrix $P$ in the basis $\beta$ is invertible; hence $P\in GL(n,k)$.
For $v=\sum_{j=1}^n y_j e'_j$, its $\gamma$-coordinates are
\begin{align*}
\phi_\gamma(v)=(y_1,\ldots,y_n).
\end{align*}
Expanding the same vector in the basis $\beta$ gives
\begin{align*}
v
&=\sum_{j=1}^n y_j e'_j\\
&=\sum_{j=1}^n y_j\left(\sum_{i=1}^n p_{ij}e_i\right)\\
&=\sum_{i=1}^n\left(\sum_{j=1}^n p_{ij}y_j\right)e_i.
\end{align*}
Therefore
\begin{align*}
\phi_\beta(v)
&=\left(\sum_{j=1}^n p_{1j}y_j,\ldots,\sum_{j=1}^n p_{nj}y_j\right)\\
&=P\phi_\gamma(v).
\end{align*}
Thus $\phi_\beta=P\phi_\gamma$, equivalently $\phi_\gamma=P^{-1}\phi_\beta$.
Now let $f:V\to W$ be linear, and choose two bases $\beta,\gamma$ of $V$ with change-of-basis matrix $P$, and two bases $\alpha,\delta$ of $W$ with change-of-basis matrix $Q$, so
\begin{align*}
\phi_\beta&=P\phi_\gamma, &
\psi_\alpha&=Q\psi_\delta.
\end{align*}
If $A$ is the matrix of $f$ using $\beta$ on $V$ and $\alpha$ on $W$, and $A'$ is the matrix of $f$ using $\gamma$ on $V$ and $\delta$ on $W$, then
\begin{align*}
\psi_\alpha f&=A\phi_\beta, &
\psi_\delta f&=A'\phi_\gamma.
\end{align*}
Using $\psi_\alpha=Q\psi_\delta$ and $\phi_\beta=P\phi_\gamma$,
\begin{align*}
Q A'\phi_\gamma
&=Q\psi_\delta f\\
&=\psi_\alpha f\\
&=A\phi_\beta\\
&=A P\phi_\gamma.
\end{align*}
Since $\phi_\gamma:V\to k^n$ is an isomorphism, this gives
\begin{align*}
Q A'&=A P, &
A'&=Q^{-1}A P.
\end{align*}
So different basis choices give equivalences with the matrix category that classify objects by the same dimension, but they assign conjugated coordinate matrices to the same linear maps. The equivalence is therefore structural: it records the invariant dimension and the linear maps up to chosen coordinates, not a canonical coordinate system.
[/example]
This raises the question of which features survive after changing coordinates, choosing different representatives, or replacing a category by an equivalent one. Features that depend on a particular presentation should not count as intrinsic. The intrinsic features are those that equivalences preserve.
[definition: Categorical Invariant]
A categorical invariant is a property or construction of a category that is preserved under equivalence of categories.
[/definition]
This definition is intentionally broad. In practice, categorical invariants are those features expressible using objects, morphisms, composition, identity morphisms, and natural isomorphism, rather than features depending on the accidental presentation of objects.
[quotetheorem:3968]
[citeproof:3968]
The same reasoning applies to many familiar categorical properties: being initial, terminal, a monomorphism, an epimorphism, a section, or a retraction is preserved under equivalence. Each of these notions is stated entirely in terms of morphisms and their equations.
[example: Preorders as Thin Categories]
A preorder $(P,\le)$ determines a category $\mathcal P$ as follows: the objects are the elements of $P$, and
\begin{align*}
\operatorname{Hom}_{\mathcal P}(p,q)
=
\begin{cases}
\{\ast_{p,q}\}, & p\le q,\\
\varnothing, & p\nleq q.
\end{cases}
\end{align*}
The identity morphism on $p$ is $\ast_{p,p}$, which exists because reflexivity gives $p\le p$. If $\ast_{p,q}:p\to q$ and $\ast_{q,r}:q\to r$ are morphisms, then $p\le q$ and $q\le r$, so transitivity gives $p\le r$; define
\begin{align*}
\ast_{q,r}\circ \ast_{p,q}=\ast_{p,r}.
\end{align*}
The identity laws hold because each relevant hom-set has at most one element:
\begin{align*}
\ast_{p,q}\circ \ast_{p,p}&=\ast_{p,q},&
\ast_{q,q}\circ \ast_{p,q}&=\ast_{p,q}.
\end{align*}
Associativity holds for the same reason: if $p\le q\le r\le s$, then both composites
\begin{align*}
(\ast_{r,s}\circ \ast_{q,r})\circ \ast_{p,q}
\quad\text{and}\quad
\ast_{r,s}\circ(\ast_{q,r}\circ \ast_{p,q})
\end{align*}
are the unique morphism $\ast_{p,s}:p\to s$.
A functor $F:\mathcal P\to\mathcal Q$ is exactly an order-preserving map $P\to Q$. Indeed, if $p\le q$, then there is a morphism $\ast_{p,q}:p\to q$ in $\mathcal P$, so $F(\ast_{p,q})$ is a morphism $F(p)\to F(q)$ in $\mathcal Q$, which means $F(p)\le F(q)$. Conversely, if $F:P\to Q$ is order-preserving, define the functor on objects by $p\mapsto F(p)$; whenever $p\le q$, order preservation gives $F(p)\le F(q)$, so there is a unique morphism $\ast_{F(p),F(q)}:F(p)\to F(q)$. This unique choice preserves identities and composition because the target category is thin, meaning each hom-set has at most one morphism.
Two elements $p,q\in P$ are isomorphic in $\mathcal P$ exactly when $p\le q$ and $q\le p$. If $p\cong q$, then there are morphisms $p\to q$ and $q\to p$, so by the definition of $\mathcal P$ we have $p\le q$ and $q\le p$. Conversely, if $p\le q$ and $q\le p$, then the morphisms $\ast_{p,q}$ and $\ast_{q,p}$ exist, and their composites satisfy
\begin{align*}
\ast_{q,p}\circ \ast_{p,q}&=\ast_{p,p}=\operatorname{id}_p,\\
\ast_{p,q}\circ \ast_{q,p}&=\ast_{q,q}=\operatorname{id}_q.
\end{align*}
Thus they are inverse isomorphisms.
Therefore a skeleton of $\mathcal P$ is obtained by choosing one representative from each equivalence class for the relation
\begin{align*}
p\sim q \quad\Longleftrightarrow\quad p\le q\text{ and }q\le p.
\end{align*}
Passing from a preorder to its skeleton collapses exactly the elements that are mutually comparable in both directions, so equivalence of preorder categories records the associated quotient posets rather than the original presentation with all duplicate representatives.
[/example]
For genuine posets there are no duplicate representatives to collapse. This makes posets a useful boundary case where categorical equivalence tightens to the ordinary order-isomorphism notion.
[example: Posets and Order Isomorphism]
Let $(P,\le_P)$ and $(Q,\le_Q)$ be posets, and write $\mathcal P$ and $\mathcal Q$ for their associated thin categories. The category $\mathcal P$ is skeletal: if $p,p'\in P$ are isomorphic in $\mathcal P$, then there are morphisms $p\to p'$ and $p'\to p$, so
\begin{align*}
p\le_P p'
\quad\text{and}\quad
p'\le_P p.
\end{align*}
Since $P$ is a poset, antisymmetry gives $p=p'$. The same argument shows that $\mathcal Q$ is skeletal.
Now suppose $F:\mathcal P\to\mathcal Q$ is an equivalence, with quasi-inverse $G:\mathcal Q\to\mathcal P$. Since a functor between preorder categories is order-preserving, $F$ and $G$ are order-preserving maps. The natural isomorphism $\operatorname{id}_{\mathcal P}\Rightarrow G F$ has, at each $p\in P$, an isomorphism
\begin{align*}
p\cong G(F(p)).
\end{align*}
Because $\mathcal P$ is skeletal, this forces
\begin{align*}
G(F(p))=p.
\end{align*}
Similarly, the natural isomorphism $F G\Rightarrow\operatorname{id}_{\mathcal Q}$ gives
\begin{align*}
F(G(q))=q
\end{align*}
for every $q\in Q$. Thus $F:P\to Q$ and $G:Q\to P$ are inverse functions.
It remains to check that $F$ reflects the order. If $F(p)\le_Q F(p')$, then applying the order-preserving map $G$ gives
\begin{align*}
G(F(p))\le_P G(F(p')).
\end{align*}
Using $G(F(p))=p$ and $G(F(p'))=p'$, this becomes
\begin{align*}
p\le_P p'.
\end{align*}
Therefore
\begin{align*}
p\le_P p'
\quad\Longleftrightarrow\quad
F(p)\le_Q F(p'),
\end{align*}
so $F$ is an order isomorphism.
Conversely, if $h:P\to Q$ is an order isomorphism, define a functor $H:\mathcal P\to\mathcal Q$ by $H(p)=h(p)$ on objects and by sending the unique morphism $p\to p'$ to the unique morphism $h(p)\to h(p')$. Its inverse order isomorphism defines an inverse functor, so the associated categories are isomorphic and hence equivalent. Thus for posets, equivalence of the associated categories is exactly order isomorphism; unlike preorders, there are no distinct mutually comparable elements for equivalence to collapse.
[/example]
Equivalence of categories is therefore the category-theoretic version of sameness up to isomorphism of objects. It is weaker than isomorphism of categories, strong enough to preserve all morphism-level structure, and flexible enough to cover the comparisons that arise throughout algebra, topology, and analysis. In later chapters, representable functors and the Yoneda lemma will refine this theme: an object is determined, up to unique isomorphism, by how all other objects map into it.
Equivalence shows that category theory is concerned with structure up to the right notion of sameness, not with rigid equality. The next chapter uses that flexible viewpoint to develop universal properties as the standard way to specify constructions by their mapping behavior.
# 6. Universal Properties in Practice
Universal properties turn constructions into answers to mapping problems. Earlier chapters introduced initial and terminal objects as the first cases: an object is determined by how maps into or out of it behave. This chapter develops that principle in the examples most algebraists use constantly: free groups, polynomial rings, quotient groups, and tensor products. The central theme is that a construction should be recognised not by its presentation, but by the maps it represents.
## Universal Mapping Properties and Uniqueness
Suppose two different constructions claim to solve the same mapping problem. What should it mean for them to be the same, and why should the comparison map be forced rather than chosen?
[definition: Universal Mapping Property]
Let $C$ be a category and let $F:C\to \mathbf{Set}$ be a functor. A representation of $F$ is an object $U\in C$ together with a natural bijection
\begin{align*}
\Phi_X:\operatorname{Hom}_C(U,X) \to F(X)
\end{align*}
for every object $X\in C$.
[/definition]
Dually, a contravariant functor $F:C^{\mathrm{op}}\to\mathbf{Set}$ is represented by an object $U\in C$ when there are natural bijections $\operatorname{Hom}_C(X,U)\cong F(X)$. The element of $F(U)$ corresponding to $\operatorname{id}_U$ is the universal datum; every other datum in $F(X)$ is obtained from it by a unique morphism. This formulation makes precise what the informal language of "the object solving a mapping problem" means.
The direction of the unique morphism is part of the property. Free objects usually produce maps out of the universal object, while quotient objects usually receive maps from the object being quotiented and then factor through the quotient.
[example: Initial Object as a Universal Property]
In a category $C$, an initial object $0$ has exactly one morphism $0\to X$ for every object $X\in C$. Thus it represents the constant one-point mapping problem: if $F(X)=\{\ast\}$ for every $X$, then the map
\begin{align*}
\operatorname{Hom}_C(0,X) &\longrightarrow F(X)\\
u &\longmapsto \ast
\end{align*}
is a bijection because both sets have exactly one element.
The universal datum is the unique element $\ast\in F(0)$, so there is no extra structure for a morphism $0\to X$ to preserve. This is the same pattern as a free construction: the datum may later be a function from a set of generators, but the universal property still says that the compatible morphism is forced by that datum.
[/example]
A universal property should determine an object by its role, not by the details of a chosen construction. The obstruction is that two constructions may look different as objects while solving exactly the same mapping problem. In each concrete universal property, the same pattern resolves the ambiguity: the universal mapping property gives a comparison morphism in each direction, and uniqueness forces the two composites to be identity morphisms.
[remark: Why Universal Objects Are Treated As Canonical]
A construction may require choices, but any two successful outcomes for the same specified universal property are compared by exactly one isomorphism compatible with the universal data. The compatibility condition is essential: two abstractly isomorphic groups may carry different chosen generators, and an isomorphism that ignores the chosen generator need not be the universal comparison map. Thus "canonical" does not mean literally equal or choice-free; it means that the comparison map is forced once the universal data are fixed.
[/remark]
## Free Objects Described By Maps Out Of A Set
How can a bare set of symbols generate an algebraic object without imposing unintended equations? If we start with the subgroup generated by chosen elements inside a particular group, accidental relations may appear: in a cyclic group of order $2$, a chosen generator satisfies $s^2=e$, while in $\mathbb Z$ the corresponding generator has infinite order. A free object avoids building in any such relation by being defined through maps out of it. The answer is to require that every function from the generating set into an algebraic object extend uniquely to a structure-preserving map.
[definition: Free Group On A Set]
Let $S$ be a set. A free group on $S$ is a group $F(S)$ together with a function $\iota:S \to F(S)$ such that for every group $G$ and every function $f:S \to G$, there exists a unique group homomorphism $\bar f:F(S) \to G$ satisfying $\bar f \circ \iota = f$.
[/definition]
The map $\iota$ names the generators. The universal property says that a group homomorphism from $F(S)$ is determined exactly by the images of those generators, with no further relations beyond the group axioms.
It remains to know that such a group actually exists. The obstruction is concrete: words in generators must include inverses, but different written words can represent the same element after cancellation. The construction theorem supplies a model whose elements are reduced words, so the universal property is not merely a specification but an attainable object.
[quotetheorem:3970]
[citeproof:3970]
Reduced words are needed because unreduced words give many names for the same group element, such as $s$ and $stt^{-1}$, so equality would not be controlled by the generators alone. The formal inverse symbols are also necessary: a monoid of positive words cannot represent maps into arbitrary groups, since a group homomorphism must know where $s^{-1}$ goes as well as where $s$ goes. The theorem proves existence of an object with the universal property, but it does not make every concrete model of the free group literally identical; uniqueness up to unique compatible isomorphism is the correct conclusion. This construction is the model for later free constructions, including polynomial rings and free modules.
[example: The Free Group On One Generator]
Let $F(\{s\})$ be modeled by reduced words in the one symbol $s$ and its inverse $s^{-1}$. Every reduced word has one of the forms
\begin{align*}
s^n &= \underbrace{s\cdots s}_{n\text{ times}} \quad &&(n>0),\\
1 &&&(n=0),\\
s^n &= \underbrace{s^{-1}\cdots s^{-1}}_{-n\text{ times}} \quad &&(n<0),
\end{align*}
because a reduced word cannot contain both an $s$ and an $s^{-1}$ adjacent after all cancellations have been made. Define
\begin{align*}
\theta:\mathbb Z &\longrightarrow F(\{s\})\\
n &\longmapsto s^n.
\end{align*}
For integers $m,n$, concatenating the word $s^m$ with the word $s^n$ and then cancelling adjacent pairs $ss^{-1}$ or $s^{-1}s$ leaves exactly $s^{m+n}$, so
\begin{align*}
\theta(m+n)=s^{m+n}=s^m s^n=\theta(m)\theta(n).
\end{align*}
Thus $\theta$ is a group homomorphism from $(\mathbb Z,+)$ to $F(\{s\})$, and it is bijective because every reduced word is exactly one of the words $s^n$ above.
Under this identification, the universal map sends $s$ to $1\in\mathbb Z$. Given a group $G$ and a function $\{s\}\to G$ with $s\mapsto g$, define
\begin{align*}
\varphi:\mathbb Z &\longrightarrow G\\
n &\longmapsto
\begin{cases}
g^n, & n>0,\\
e, & n=0,\\
(g^{-1})^{-n}, & n<0.
\end{cases}
\end{align*}
For all integers $m,n$, the laws $g^m g^n=g^{m+n}$ and $g^0=e$ show that $\varphi(m+n)=\varphi(m)\varphi(n)$, so $\varphi$ is a homomorphism and $\varphi(1)=g$. If $\psi:\mathbb Z\to G$ is any homomorphism with $\psi(1)=g$, then for $n>0$,
\begin{align*}
\psi(n)=\psi(\underbrace{1+\cdots+1}_{n\text{ times}})=\underbrace{\psi(1)\cdots\psi(1)}_{n\text{ times}}=g^n,
\end{align*}
while $\psi(0)=e$ and, for $n<0$,
\begin{align*}
\psi(n)=\psi(-(-n))=\psi(-n)^{-1}=g^n.
\end{align*}
Hence the homomorphism is forced by the image of $1$, which is exactly the [universal property of the free group](/theorems/1900) on one generator.
[/example]
The same pattern appears throughout algebra: start with underlying set data, then freely add the operations required by the target category.
[example: Polynomial Ring As Freely Adjoining Variables]
Let $R$ be a commutative ring, let $A$ be a commutative $R$-algebra, and choose elements $a_1,\dots,a_n\in A$. For a multi-index $\alpha=(\alpha_1,\dots,\alpha_n)\in\mathbb N^n$, write
\begin{align*}
x^\alpha=x_1^{\alpha_1}\cdots x_n^{\alpha_n}
\qquad\text{and}\qquad
a^\alpha=a_1^{\alpha_1}\cdots a_n^{\alpha_n}.
\end{align*}
Define
\begin{align*}
\varphi:R[x_1,\dots,x_n]&\longrightarrow A\\
\sum_{\alpha} r_\alpha x^\alpha&\longmapsto \sum_{\alpha} r_\alpha a^\alpha,
\end{align*}
where only finitely many coefficients $r_\alpha\in R$ are nonzero, and where $r_\alpha a^\alpha$ means the scalar action of $r_\alpha$ on $a^\alpha$ through the given $R$-algebra structure on $A$.
This map sends constants to their images in $A$ and sends each variable to the chosen element, since
\begin{align*}
\varphi(x_i)
=\varphi(1\cdot x_i)
=1\cdot a_i
=a_i.
\end{align*}
It is additive because, for polynomials $p=\sum_\alpha r_\alpha x^\alpha$ and $q=\sum_\alpha s_\alpha x^\alpha$,
\begin{align*}
\varphi(p+q)
&=\varphi\left(\sum_\alpha (r_\alpha+s_\alpha)x^\alpha\right)\\
&=\sum_\alpha (r_\alpha+s_\alpha)a^\alpha\\
&=\sum_\alpha r_\alpha a^\alpha+\sum_\alpha s_\alpha a^\alpha\\
&=\varphi(p)+\varphi(q).
\end{align*}
It is multiplicative because polynomial multiplication is given by the Cauchy product:
\begin{align*}
pq
&=\left(\sum_\alpha r_\alpha x^\alpha\right)\left(\sum_\beta s_\beta x^\beta\right)\\
&=\sum_{\alpha,\beta} r_\alpha s_\beta x^{\alpha+\beta},
\end{align*}
so, using commutativity in the $R$-algebra $A$,
\begin{align*}
\varphi(pq)
&=\sum_{\alpha,\beta} r_\alpha s_\beta a^{\alpha+\beta}\\
&=\sum_{\alpha,\beta} r_\alpha s_\beta a^\alpha a^\beta\\
&=\left(\sum_\alpha r_\alpha a^\alpha\right)\left(\sum_\beta s_\beta a^\beta\right)\\
&=\varphi(p)\varphi(q).
\end{align*}
Also $\varphi(1)=1_A$, so $\varphi$ is an $R$-algebra homomorphism.
Now suppose $\psi:R[x_1,\dots,x_n]\to A$ is any $R$-algebra homomorphism with $\psi(x_i)=a_i$ for every $i$. Since $\psi$ fixes $R$ and preserves products,
\begin{align*}
\psi(r_\alpha x^\alpha)
&=r_\alpha\psi(x_1)^{\alpha_1}\cdots\psi(x_n)^{\alpha_n}\\
&=r_\alpha a_1^{\alpha_1}\cdots a_n^{\alpha_n}\\
&=r_\alpha a^\alpha.
\end{align*}
Therefore, by additivity,
\begin{align*}
\psi\left(\sum_\alpha r_\alpha x^\alpha\right)
&=\sum_\alpha \psi(r_\alpha x^\alpha)\\
&=\sum_\alpha r_\alpha a^\alpha\\
&=\varphi\left(\sum_\alpha r_\alpha x^\alpha\right).
\end{align*}
Thus $\psi=\varphi$, so choosing an $R$-algebra map out of $R[x_1,\dots,x_n]$ is exactly the same as choosing the images of the variables $x_1,\dots,x_n$.
[/example]
This example is often more useful than the coefficient formula for polynomials. It explains why substitution into polynomials is a homomorphism and why maps out of a [polynomial ring](/page/Polynomial%20Ring) are controlled by the target elements assigned to the variables.
## Quotient-Type Constructions and Factorization Properties
Free objects create maps out of a constructed object. What is the dual-looking problem when we want to impose equations on an existing object? If we try to force a subset of a group to become the identity without checking compatibility with multiplication, the cosets may not multiply consistently; the product of representatives can depend on the representative chosen. A quotient is the universal target through which precisely the maps respecting those equations must factor.
[definition: Quotient Group By A Normal Subgroup]
Let $G$ be a group and let $N \trianglelefteq G$. The [quotient group](/page/Quotient%20Group) $G/N$ is a group together with a homomorphism $q:G \to G/N$ such that $q(n)=e$ for all $n \in N$, and such that for every group $H$ and every homomorphism $f:G \to H$ with $N \subseteq \ker f$, there exists a unique homomorphism $\bar f:G/N \to H$ satisfying $\bar f \circ q=f$.
[/definition]
The normality condition is what allows cosets to carry a group structure. The universal property says that $G/N$ is the most efficient way to force all elements of $N$ to become the identity.
After defining the quotient by its mapping property, we still need the standard factorization principle that makes the definition usable: maps out of $G$ whose kernels contain $N$ should be exactly the maps that descend uniquely through the quotient map. This is the practical test for whether a homomorphism respects the imposed relations.
[quotetheorem:2701]
[citeproof:2701]
Normality is necessary because kernels of group homomorphisms are normal, so a subgroup that is not normal cannot be exactly the collection of elements killed by a quotient homomorphism. For example, a subgroup of order $2$ generated by a transposition in $S_3$ is not normal, and its left cosets do not support a well-defined [quotient group](/theorems/790) structure. The theorem does not say that every map $G\to H$ factors through $G/N$; it characterises precisely those maps that kill $N$. This is the factorization pattern behind abelianization and, later, presentations by generators and relations.
[example: Abelianization As A Quotient]
For a group $G$, let $[G,G]$ be the subgroup generated by all commutators $ghg^{-1}h^{-1}$. Write $q:G\to G/[G,G]$ for the quotient map. We first show that $G/[G,G]$ is abelian. For $g,h\in G$,
\begin{align*}
gh(hg)^{-1}
&=ghg^{-1}h^{-1}\in [G,G],
\end{align*}
so $gh[G,G]=hg[G,G]$. Hence
\begin{align*}
(g[G,G])(h[G,G])
&=gh[G,G]\\
&=hg[G,G]\\
&=(h[G,G])(g[G,G]),
\end{align*}
and therefore $G/[G,G]$ is abelian.
Now let $A$ be an abelian group and let $f:G\to A$ be a homomorphism. For every commutator $ghg^{-1}h^{-1}$,
\begin{align*}
f(ghg^{-1}h^{-1})
&=f(g)f(h)f(g^{-1})f(h^{-1})\\
&=f(g)f(h)f(g)^{-1}f(h)^{-1}\\
&=f(g)f(g)^{-1}f(h)f(h)^{-1}\\
&=e_A,
\end{align*}
where the third equality uses commutativity of $A$. Thus every commutator lies in $\ker f$, and since $\ker f$ is a subgroup, it contains the subgroup generated by all commutators:
\begin{align*}
[G,G]\subseteq \ker f.
\end{align*}
Define
\begin{align*}
\bar f:G/[G,G]&\longrightarrow A\\
g[G,G]&\longmapsto f(g).
\end{align*}
If $g[G,G]=h[G,G]$, then $h^{-1}g\in [G,G]\subseteq\ker f$, so
\begin{align*}
e_A=f(h^{-1}g)=f(h)^{-1}f(g),
\end{align*}
and hence $f(g)=f(h)$; therefore $\bar f$ is well-defined. It is a homomorphism because
\begin{align*}
\bar f((g[G,G])(h[G,G]))
&=\bar f(gh[G,G])\\
&=f(gh)\\
&=f(g)f(h)\\
&=\bar f(g[G,G])\bar f(h[G,G]).
\end{align*}
Also
\begin{align*}
(\bar f\circ q)(g)=\bar f(g[G,G])=f(g),
\end{align*}
so $f$ factors through $q$. If $\psi:G/[G,G]\to A$ is another homomorphism with $\psi\circ q=f$, then for every coset $g[G,G]$,
\begin{align*}
\psi(g[G,G])
&=\psi(q(g))\\
&=f(g)\\
&=\bar f(g[G,G]),
\end{align*}
so $\psi=\bar f$. Thus abelianization is the universal quotient of $G$ through which all homomorphisms from $G$ to abelian groups factor.
[/example]
This example shows how a quotient can be universal for changing category. Abelianization takes a group and produces the universal abelian group receiving a homomorphism from it.
[remark: Quotients As Imposing Relations]
In many algebraic categories, quotient objects are described by generators and relations. The quotient map records the imposed relations, while the factorization condition identifies exactly which maps out of the original object respect those relations. The universal property is usually the most stable formulation because it remains meaningful even when representatives are inconvenient.
[/remark]
## Tensor Products As Representing Bilinear Maps
Bilinear maps are not homomorphisms out of a cartesian product of modules, because they are linear in each variable separately rather than jointly linear in pairs. Can we replace a pair of inputs by a single universal module whose linear maps encode bilinear maps?
[definition: Tensor Product Of Modules]
Let $R$ be a commutative ring and let $M,N$ be $R$-modules. A [tensor product](/page/Tensor%20Product) of $M$ and $N$ is an $R$-module $M\otimes_R N$ together with an $R$-bilinear map $\tau:M\times N\to M\otimes_R N$ such that for every $R$-module $P$ and every $R$-bilinear map $b:M\times N\to P$, there exists a unique $R$-linear map $\bar b:M\otimes_R N\to P$ satisfying $\bar b\circ\tau=b$.
[/definition]
The notation $m\otimes n$ means $\tau(m,n)$. The defining property is not the list of tensor identities; those identities are the relations needed so that bilinear maps become ordinary linear maps out of $M\otimes_R N$.
[illustration:tensor-product-universal-property]
The universal property should now be turned into a usable correspondence. For tensor products to serve as a representing object, every bilinear map out of $M\times N$ must be recovered uniquely from an ordinary linear map out of $M\otimes_R N$, and this recovery must work for every target module $P$. The theorem packages that correspondence so later arguments can replace bilinear constructions by ordinary linear maps without rebuilding the quotient presentation each time.
[quotetheorem:3971]
[citeproof:3971]
The commutativity of $R$ keeps left and right scalar actions from competing, so $M\otimes_R N$ is again an $R$-module in the form stated here. Over a noncommutative ring one must distinguish right modules, left modules, and bimodules; using two arbitrary left modules would not give the same universal property. The theorem assumes existence and then identifies what the object represents; a separate construction, usually by quotienting the free module on $M\times N$ by bilinearity relations, proves existence. This representative viewpoint is the bridge to homological algebra, where tensor product becomes a functor studied through exactness and derived functors.
[example: Tensoring Free Modules]
Let $R^m$ and $R^n$ have standard bases $e_1,\dots,e_m$ and $f_1,\dots,f_n$. We show that $R^m\otimes_R R^n$ is naturally the free $R$-module with basis indexed by pairs $(i,j)$, namely the tensors $e_i\otimes f_j$.
Every element $x\in R^m$ and $y\in R^n$ has a unique expansion
\begin{align*}
x=\sum_{i=1}^m r_i e_i
\qquad\text{and}\qquad
y=\sum_{j=1}^n s_j f_j
\end{align*}
with $r_i,s_j\in R$. By bilinearity of the tensor map,
\begin{align*}
x\otimes y
&=\left(\sum_{i=1}^m r_i e_i\right)\otimes\left(\sum_{j=1}^n s_j f_j\right)\\
&=\sum_{i=1}^m r_i\left(e_i\otimes\sum_{j=1}^n s_j f_j\right)\\
&=\sum_{i=1}^m r_i\sum_{j=1}^n s_j(e_i\otimes f_j)\\
&=\sum_{i=1}^m\sum_{j=1}^n r_i s_j(e_i\otimes f_j).
\end{align*}
Thus the tensors $e_i\otimes f_j$ span $R^m\otimes_R R^n$.
Now let $P$ be any $R$-module and choose elements $p_{ij}\in P$ for all $1\le i\le m$ and $1\le j\le n$. Define a function
\begin{align*}
b:R^m\times R^n&\longrightarrow P\\
\left(\sum_{i=1}^m r_i e_i,\sum_{j=1}^n s_j f_j\right)&\longmapsto \sum_{i=1}^m\sum_{j=1}^n r_i s_j p_{ij}.
\end{align*}
For fixed $y=\sum_j s_j f_j$, this map is linear in $x$ because
\begin{align*}
b\left(\sum_i r_i e_i+\sum_i r'_i e_i,y\right)
&=\sum_{i,j}(r_i+r'_i)s_jp_{ij}\\
&=\sum_{i,j}r_is_jp_{ij}+\sum_{i,j}r'_is_jp_{ij},
\end{align*}
and scalar multiplication works similarly:
\begin{align*}
b\left(c\sum_i r_i e_i,y\right)
&=\sum_{i,j}(cr_i)s_jp_{ij}\\
&=c\sum_{i,j}r_is_jp_{ij}.
\end{align*}
The same calculation in the second variable shows that $b$ is bilinear, and $b(e_i,f_j)=p_{ij}$.
Conversely, if $b:R^m\times R^n\to P$ is bilinear, then for $x=\sum_i r_i e_i$ and $y=\sum_j s_j f_j$,
\begin{align*}
b(x,y)
&=b\left(\sum_{i=1}^m r_i e_i,\sum_{j=1}^n s_j f_j\right)\\
&=\sum_{i=1}^m r_i b\left(e_i,\sum_{j=1}^n s_j f_j\right)\\
&=\sum_{i=1}^m r_i\sum_{j=1}^n s_j b(e_i,f_j)\\
&=\sum_{i=1}^m\sum_{j=1}^n r_i s_j b(e_i,f_j).
\end{align*}
So $b$ is completely determined by the $mn$ values $b(e_i,f_j)$.
By the tensor product universal property, this bilinear map corresponds to a unique linear map
\begin{align*}
\bar b:R^m\otimes_R R^n\longrightarrow P
\end{align*}
with $\bar b(e_i\otimes f_j)=b(e_i,f_j)$. Therefore choosing a linear map out of $R^m\otimes_R R^n$ is exactly the same as choosing the images of the $mn$ tensors $e_i\otimes f_j$, so those tensors form the expected free basis.
[/example]
The tensor product is a representative for a functor of bilinear maps. This is the bridge from the concrete examples in this chapter to Chapter 7 on representable functors: instead of treating each construction separately, we will recognise when a functor is encoded by maps out of a single object.
Universal properties turn constructions into solutions to mapping problems, and the examples in this chapter make that principle concrete. That viewpoint naturally leads to Hom-functors, which package all maps into or out of a fixed object and prepare the way for representability.
# 7. Hom-Functors and Representability
This chapter turns the language of functors back onto the morphism sets of a category. Earlier chapters used arrows to define special objects and universal constructions; here we package all arrows out of, or into, a fixed object as functors. The guiding question is when an abstract set-valued functor is actually a disguised Hom-functor, because that is the point at which its elements can be treated as maps from a representing object.
The main theme is that representability is a precise form of universal property. A representing object is not merely an object with many useful maps: it is an object from which every element of the functor is obtained uniquely by composition. This perspective explains familiar constructions such as free groups, points of spaces, dual vector spaces, and tensor products in the same language.
## Hom-Functors from a Fixed Object
Given a locally small category, how can we turn all morphisms out of a fixed object into a single functor? The answer is to let a morphism in the category act by post-composition. This is the covariant Hom-functor, and it records how maps with a common domain vary when the codomain changes.
[definition: Covariant Hom-Functor]
Let $\mathcal C$ be a locally small category and let $A \in \mathcal C$. The covariant Hom-functor represented by $A$ is the functor
\begin{align*}
\operatorname{Hom}_{\mathcal C}(A,-):\mathcal C \to \mathbf{Set}
\end{align*}
defined on objects by
\begin{align*}
X \mapsto \operatorname{Hom}_{\mathcal C}(A,X)
\end{align*}
and on a morphism $f:X \to Y$ by the function
\begin{align*}
\operatorname{Hom}_{\mathcal C}(A,f):\operatorname{Hom}_{\mathcal C}(A,X) \to \operatorname{Hom}_{\mathcal C}(A,Y), \qquad g \mapsto f \circ g.
\end{align*}
[/definition]
The word covariant reflects that arrows $X \to Y$ in $\mathcal C$ become functions from the set attached to $X$ to the set attached to $Y$. The fixed object $A$ is the source of every map being counted.
Before this Hom-set assignment can be used as an example, there is a specific functoriality problem to solve. Postcomposition by a morphism has to turn identities into identity functions and composites into composite functions, otherwise the assignment would only be a collection of sets rather than a functor into $\mathbf{Set}$.
[quotetheorem:3972]
[citeproof:3972]
This proof is short because the functor laws are exactly the category axioms applied with one extra morphism attached on the right. Local smallness is the hypothesis that makes the target really be $\mathbf{Set}$: without it, $\operatorname{Hom}_{\mathcal C}(A,X)$ might be a proper class, so the displayed assignment would not define a set-valued functor. The theorem does not say that every set-valued functor arises in this way; representability will be the extra condition that detects when a functor is genuinely a Hom-functor rather than only formally similar to one. The next examples show how to recognise this situation by identifying an element of the functor with a map out of a fixed object.
[example: Points Of A Topological Space]
In $\mathbf{Top}$, let $1=\{\ast\}$ with its unique topology. For every space $X$, define
\begin{align*}
\Phi_X:\operatorname{Hom}_{\mathbf{Top}}(1,X)\to |X|,\qquad \Phi_X(p)=p(\ast).
\end{align*}
This function is injective because if $p,q:1\to X$ satisfy $\Phi_X(p)=\Phi_X(q)$, then
\begin{align*}
p(\ast)=q(\ast),
\end{align*}
and $\ast$ is the only point of $1$, so $p=q$ as functions. It is surjective because for any $x\in |X|$, the function $p_x:1\to X$ defined by
\begin{align*}
p_x(\ast)=x
\end{align*}
is continuous: for every open set $U\subseteq X$,
\begin{align*}
p_x^{-1}(U)=
\begin{cases}
\{\ast\}, & x\in U,\\
\varnothing, & x\notin U,
\end{cases}
\end{align*}
and both $\{\ast\}$ and $\varnothing$ are open in $1$. Hence
\begin{align*}
\operatorname{Hom}_{\mathbf{Top}}(1,X)\cong |X|.
\end{align*}
This bijection is natural in $X$. If $f:X\to Y$ is continuous and $p:1\to X$, then post-composition sends $p$ to $f\circ p$, and under the displayed bijections,
\begin{align*}
\Phi_Y(f\circ p)
&=(f\circ p)(\ast)\\
&=f(p(\ast))\\
&=f(\Phi_X(p)).
\end{align*}
Thus the Hom-functor $\operatorname{Hom}_{\mathbf{Top}}(1,-)$ agrees with the underlying-set functor by sending a map out of the one-point space to its selected point. The reusable recognition principle is that maps from a terminal object pick out global elements; the boundary is that this works here because maps $1\to X$ encode exactly one point of $X$, while an arbitrary forgetful functor need not be represented by its terminal object unless maps from that object encode precisely the forgotten data.
[/example]
The same pattern appears in algebra: a suitable object can encode the data of choosing one distinguished element.
[example: The Forgetful Functor From Groups]
Let $U:\mathbf{Grp}\to\mathbf{Set}$ be the forgetful functor. For each group $G$, define
\begin{align*}
\Phi_G:\operatorname{Hom}_{\mathbf{Grp}}(\mathbb Z,G)\to U(G),
\qquad
\Phi_G(\varphi)=\varphi(1).
\end{align*}
We show that $\Phi_G$ is a bijection. If $g\in G$, define $\varphi_g:\mathbb Z\to G$ by
\begin{align*}
\varphi_g(n)=
\begin{cases}
g^n, & n>0,\\
e_G, & n=0,\\
(g^{-1})^{-n}, & n<0.
\end{cases}
\end{align*}
For all $m,n\in\mathbb Z$, the usual laws of integer powers in a group give
\begin{align*}
\varphi_g(m+n)=g^{m+n}=g^m g^n=\varphi_g(m)\varphi_g(n),
\end{align*}
with the displayed formula interpreted by the same definition when one of $m,n,m+n$ is nonpositive. Thus $\varphi_g$ is a group homomorphism, and
\begin{align*}
\Phi_G(\varphi_g)=\varphi_g(1)=g.
\end{align*}
So $\Phi_G$ is surjective.
For injectivity, let $\varphi,\psi:\mathbb Z\to G$ be group homomorphisms with $\varphi(1)=\psi(1)$. If $n>0$, then
\begin{align*}
\varphi(n)
&=\varphi(\underbrace{1+\cdots+1}_{n\text{ times}})\\
&=\underbrace{\varphi(1)\cdots\varphi(1)}_{n\text{ times}}\\
&=\underbrace{\psi(1)\cdots\psi(1)}_{n\text{ times}}\\
&=\psi(n).
\end{align*}
Also
\begin{align*}
\varphi(0)=e_G=\psi(0),
\end{align*}
and if $n<0$, then $-n>0$, so
\begin{align*}
\varphi(n)
&=\varphi(-(-n))\\
&=\varphi(-n)^{-1}\\
&=\psi(-n)^{-1}\\
&=\psi(n).
\end{align*}
Hence $\varphi=\psi$, so $\Phi_G$ is injective. Therefore
\begin{align*}
\operatorname{Hom}_{\mathbf{Grp}}(\mathbb Z,G)\cong U(G)
\end{align*}
by sending $\varphi$ to $\varphi(1)$.
This bijection is natural in $G$. If $h:G\to H$ is a group homomorphism and $\varphi:\mathbb Z\to G$, then
\begin{align*}
\Phi_H(h\circ\varphi)
&=(h\circ\varphi)(1)\\
&=h(\varphi(1))\\
&=U(h)(\Phi_G(\varphi)).
\end{align*}
Thus the forgetful functor $U$ is represented by $\mathbb Z$: a map out of $\mathbb Z$ is exactly the choice of one element of the target group.
[/example]
## Hom-Functors into a Fixed Object
What changes when the fixed object is the codomain rather than the domain? A morphism $f:X \to Y$ then turns a map $Y \to A$ into a map $X \to A$ by pre-composition. Thus the direction of arrows reverses.
[definition: Contravariant Hom-Functor]
Let $\mathcal C$ be a locally small category and let $A \in \mathcal C$. The contravariant Hom-functor corepresented by $A$ is the functor
\begin{align*}
\operatorname{Hom}_{\mathcal C}(-,A):\mathcal C^{\operatorname{op}} \to \mathbf{Set}
\end{align*}
defined on objects by
\begin{align*}
X \mapsto \operatorname{Hom}_{\mathcal C}(X,A)
\end{align*}
and on a morphism $f:X \to Y$ in $\mathcal C$ by the function
\begin{align*}
\operatorname{Hom}_{\mathcal C}(f,A):\operatorname{Hom}_{\mathcal C}(Y,A) \to \operatorname{Hom}_{\mathcal C}(X,A), \qquad g \mapsto g \circ f.
\end{align*}
[/definition]
This functor is covariant as a functor out of $\mathcal C^{\operatorname{op}}$, but contravariant when viewed as depending on morphisms of $\mathcal C$. The distinction is often only notational, yet it prevents mistakes in naturality squares.
The reversed direction creates a real verification problem: precomposition must turn arrows of $\mathcal C$ into functions in the opposite direction while still respecting identities and composites. Establishing this functoriality is what allows contravariant Hom to serve as the basic model for pullback-style constructions and later presheaves.
[quotetheorem:3973]
[citeproof:3973]
Contravariant Hom-functors are the prototype for functors that assign to an object its functions, functionals, predicates, or probes into a fixed target. Again, local smallness is needed so that each Hom-collection is an actual set. The theorem does not make $\operatorname{Hom}_{\mathcal C}(-,A)$ a covariant functor on $\mathcal C$; it is covariant only after passing to $\mathcal C^{\operatorname{op}}$, and forgetting this reversal gives naturality squares with the arrows pointing the wrong way. This contravariant pattern is the one used whenever structure is pulled back along a map, as in dual spaces.
[example: Dual Vector Spaces]
Let $k$ be a field and let $\mathbf{Vect}_k$ be the category of $k$-vector spaces and $k$-linear maps. For each vector space $V$,
\begin{align*}
V^*=\operatorname{Hom}_k(V,k),
\end{align*}
so the [dual space](/page/Dual%20Space) construction is the contravariant Hom-functor $\operatorname{Hom}_{\mathbf{Vect}_k}(-,k)$.
If $T:V\to W$ is linear, the induced function
\begin{align*}
T^*:W^*\to V^*
\end{align*}
is defined by
\begin{align*}
T^*(\lambda)=\lambda\circ T.
\end{align*}
To see that $T^*(\lambda)$ is a $k$-linear functional on $V$, let $v_1,v_2\in V$ and $a,b\in k$. Since $T$ and $\lambda$ are linear,
\begin{align*}
(\lambda\circ T)(av_1+bv_2)
&=\lambda(T(av_1+bv_2))\\
&=\lambda(aT(v_1)+bT(v_2))\\
&=a\lambda(T(v_1))+b\lambda(T(v_2))\\
&=a(\lambda\circ T)(v_1)+b(\lambda\circ T)(v_2).
\end{align*}
Thus pre-composition with $T$ sends linear functionals on $W$ to linear functionals on $V$.
This assignment reverses composition. If $V\xrightarrow{T}W\xrightarrow{S}X$ are linear maps and $\mu\in X^*$, then
\begin{align*}
((S\circ T)^*\mu)(v)
&=(\mu\circ(S\circ T))(v)\\
&=\mu(S(T(v)))\\
&=((\mu\circ S)\circ T)(v)\\
&=(T^*(S^*\mu))(v)
\end{align*}
for every $v\in V$, so
\begin{align*}
(S\circ T)^*=T^*\circ S^*.
\end{align*}
Also, for $\lambda\in V^*$ and $v\in V$,
\begin{align*}
((\operatorname{id}_V)^*\lambda)(v)
&=(\lambda\circ \operatorname{id}_V)(v)\\
&=\lambda(v),
\end{align*}
so $(\operatorname{id}_V)^*=\operatorname{id}_{V^*}$. Hence the usual pullback of linear functionals is exactly the action of the contravariant Hom-functor with fixed target $k$.
The recognition technique is to ask whether the data assigned to $V$ are maps from $V$ into one fixed target. The boundary is that the double-dual functor $V\mapsto V^{**}$ is naturally isomorphic to the identity on finite-dimensional vector spaces, but not on all vector spaces; in infinite dimension, the evaluation map $V\to V^{**}$ need not be surjective, so pointwise similarity does not by itself identify the expected representing object.
[/example]
The covariant and contravariant Hom-functors can be combined into a two-variable functor.
[remark: Hom Is Bifunctorial]
For a locally small category $\mathcal C$, the assignment
\begin{align*}
(A,X) \mapsto \operatorname{Hom}_{\mathcal C}(A,X)
\end{align*}
extends to a functor $\mathcal C^{\operatorname{op}} \times \mathcal C \to \mathbf{Set}$. A pair of morphisms $A' \xrightarrow{u} A$ and $X \xrightarrow{v} X'$ acts on $f:A \to X$ by sending it to $v\circ f\circ u:A' \to X'$.
[/remark]
## Representable Functors
When should a set-valued functor be regarded as a Hom-functor in disguise? The answer is not equality of functors but natural isomorphism. A functor is representable when its elements can be parametrised naturally by arrows out of, or into, a single object.
The word naturally is doing real work. A functor may assign sets whose cardinalities match Hom-sets object by object without being represented, because arbitrary bijections need not commute with maps in the category. For example, on $\mathbf{Set}$ the powerset functor $\mathcal P:\mathbf{Set}^{\operatorname{op}}\to\mathbf{Set}$ is represented by the two-point set through characteristic functions, while a choice of unrelated bijections $\mathcal P(X)\cong\operatorname{Hom}_{\mathbf{Set}}(X,2)$ would not by itself express the pullback behaviour of subsets. Representability is therefore a natural compatibility condition, not merely a counting statement.
[definition: Representable Covariant Functor]
Let $\mathcal C$ be a locally small category. A functor $F:\mathcal C \to \mathbf{Set}$ is representable if there exists an object $A \in \mathcal C$ and a natural isomorphism
\begin{align*}
\alpha:\operatorname{Hom}_{\mathcal C}(A,-) \Rightarrow F.
\end{align*}
The pair $(A,\alpha)$ is called a representation of $F$, and $A$ is called a representing object.
[/definition]
The representing object is the object whose outgoing maps classify elements of $F$. Different choices of natural isomorphism can encode different choices of a universal element, but the representing object itself is rigid up to unique isomorphism once the representation data are fixed.
Many important functors vary in the opposite direction: a map $X\to Y$ pulls structures on $Y$ back to structures on $X$. For such functors, elements are classified by maps into a fixed object rather than maps out of it, so representability needs a contravariant version.
[definition: Representable Contravariant Functor]
Let $\mathcal C$ be a locally small category. A functor $F:\mathcal C^{\operatorname{op}} \to \mathbf{Set}$ is representable if there exists an object $A \in \mathcal C$ and a natural isomorphism
\begin{align*}
\alpha:\operatorname{Hom}_{\mathcal C}(-,A) \Rightarrow F.
\end{align*}
The pair $(A,\alpha)$ is called a representation of $F$.
[/definition]
For contravariant functors, maps into $A$ classify the elements of $F$. This is the form used for dual spaces and for many moduli-type functors in geometry.
Once a functor has a representation, the next issue is whether different representing objects can give genuinely different answers. The expected uniqueness cannot be plain equality of objects, so the theorem formulates the precise canonical isomorphism forced by the natural isomorphism data.
[quotetheorem:3974]
[citeproof:3974]
The theorem is the categorical reason universal constructions are canonical. The hypothesis that both representations are natural is essential: objectwise bijections between $\operatorname{Hom}_{\mathcal C}(A,-)$ and $\operatorname{Hom}_{\mathcal C}(B,-)$ do not have to come from an isomorphism $A\cong B$ if they are chosen independently at each object and ignore composition. The theorem also does not say that an unadorned representing object is literally unique; it says that once the representation data are fixed, there is a unique isomorphism compatible with those data. A free group on one generator, if it exists, is not canonical because of a chosen model such as $\mathbb Z$; it is canonical because every other object with the same universal property is uniquely isomorphic to it in the required way. This prepares the universal-element formulation, where the chosen data are compressed to a single distinguished element.
[example: Representing The Underlying Set Of A Group]
For the forgetful functor $U:\mathbf{Grp}\to\mathbf{Set}$, the representation by $\mathbb Z$ has components
\begin{align*}
\alpha_G:\operatorname{Hom}_{\mathbf{Grp}}(\mathbb Z,G)\to U(G),
\qquad
\alpha_G(\varphi)=\varphi(1).
\end{align*}
The element corresponding to $\operatorname{id}_{\mathbb Z}$ is
\begin{align*}
\alpha_{\mathbb Z}(\operatorname{id}_{\mathbb Z})
&=\operatorname{id}_{\mathbb Z}(1)\\
&=1,
\end{align*}
so the chosen universal element is the integer generator $1\in \mathbb Z$.
Suppose another group $A$ also represents $U$, with natural isomorphism
\begin{align*}
\beta:\operatorname{Hom}_{\mathbf{Grp}}(A,-)\Rightarrow U.
\end{align*}
Let
\begin{align*}
a=\beta_A(\operatorname{id}_A)\in U(A)
\end{align*}
be its chosen universal element. By *Representing Objects Are Unique Up To Unique Isomorphism*, there is a unique isomorphism
\begin{align*}
u:\mathbb Z\to A
\end{align*}
such that for every group $G$ and every homomorphism $f:A\to G$,
\begin{align*}
\beta_G(f)=\alpha_G(f\circ u).
\end{align*}
Taking $G=A$ and $f=\operatorname{id}_A$ gives
\begin{align*}
a
&=\beta_A(\operatorname{id}_A)\\
&=\alpha_A(\operatorname{id}_A\circ u)\\
&=\alpha_A(u)\\
&=u(1).
\end{align*}
Thus the unique isomorphism sends the generator $1\in\mathbb Z$ to the chosen universal element $a\in A$.
This is why the universal property determines the infinite cyclic group: any other representing group is forced to be isomorphic to $\mathbb Z$, and the compatible isomorphism is fixed by where it sends the universal generator.
[/example]
## Universal Elements
A natural isomorphism from a Hom-functor is a large amount of data: it gives a bijection for every object and compatibility for every morphism. Can this data be compressed into a single element? The universal element criterion says yes.
[definition: Universal Element Of A Covariant Functor]
Let $F:\mathcal C \to \mathbf{Set}$ be a functor. A universal element of $F$ is a pair $(A,u)$ where $A \in \mathcal C$ and $u \in F(A)$ such that for every object $X \in \mathcal C$ and every element $x \in F(X)$, there exists a unique morphism $f:A \to X$ satisfying
\begin{align*}
F(f)(u)=x.
\end{align*}
[/definition]
The element $u$ is the element from which all other elements of the functor are generated by pushing forward along a unique map.
Contravariant representability needs the same compression of data, but with the arrows reversed. Since a morphism $X\to A$ induces a function $F(A)\to F(X)$, the corresponding definition must describe one element at the representing object whose pullbacks, along unique maps into that object, produce all elements elsewhere.
[definition: Universal Element Of A Contravariant Functor]
Let $F:\mathcal C^{\operatorname{op}} \to \mathbf{Set}$ be a functor. A universal element of $F$ is a pair $(A,u)$ where $A \in \mathcal C$ and $u \in F(A)$ such that for every object $X \in \mathcal C$ and every element $x \in F(X)$, there exists a unique morphism $f:X \to A$ satisfying
\begin{align*}
F(f)(u)=x.
\end{align*}
[/definition]
Here the element $u$ is pulled back along a unique map into the representing object. This reversal matches the reversal in the contravariant Hom-functor.
Universal elements are meant to be more than suggestive notation: they should exactly encode representability. The next result gives the promised criterion by showing when one distinguished element, together with its universal mapping property, determines the same data as a natural isomorphism from a Hom-functor.
[quotetheorem:3975]
[citeproof:3975]
This theorem turns representability into the familiar task of finding a universal object with a universal element. Local smallness is still needed because the comparison functor is $\operatorname{Hom}_{\mathcal C}(A,-):\mathcal C\to\mathbf{Set}$; without set-sized Hom-values, the displayed natural isomorphism would not live in $\mathbf{Set}$. The criterion does not say that pointwise bijections $\operatorname{Hom}_{\mathcal C}(A,X)\cong F(X)$ are enough: the bijections must be the maps $f\mapsto F(f)(u)$ for one fixed $u\in F(A)$, so they automatically respect every morphism $X\to Y$. This is why the identity morphism on the representing object is the hidden source of the entire natural isomorphism, and it is also why the next section can rephrase the criterion as an initial-object statement in the category of elements.
[example: The Universal Generator Of The Infinite Cyclic Group]
For $U:\mathbf{Grp}\to\mathbf{Set}$, we show that $(\mathbb Z,1)$ is a universal element, where $\mathbb Z$ is written additively and $1\in U(\mathbb Z)$ is the integer generator. Let $G$ be a group and let $g\in G$. Define $\varphi_g:\mathbb Z\to G$ by
\begin{align*}
\varphi_g(n)=
\begin{cases}
g^n, & n>0,\\
e_G, & n=0,\\
(g^{-1})^{-n}, & n<0.
\end{cases}
\end{align*}
Then $\varphi_g(1)=g$. To check that $\varphi_g$ is a homomorphism, let $m,n\in\mathbb Z$. If $m,n\geq 0$, then
\begin{align*}
\varphi_g(m+n)
&=g^{m+n}\\
&=\underbrace{g\cdots g}_{m\text{ times}}\underbrace{g\cdots g}_{n\text{ times}}\\
&=g^m g^n\\
&=\varphi_g(m)\varphi_g(n).
\end{align*}
If $m,n<0$, write $m=-a$ and $n=-b$ with $a,b>0$. Then
\begin{align*}
\varphi_g(m+n)
&=\varphi_g(-(a+b))\\
&=(g^{-1})^{a+b}\\
&=(g^{-1})^a(g^{-1})^b\\
&=\varphi_g(m)\varphi_g(n).
\end{align*}
If $m\geq 0$ and $n=-a<0$, then when $m\geq a$,
\begin{align*}
\varphi_g(m+n)
&=\varphi_g(m-a)\\
&=g^{m-a}\\
&=g^m(g^{-1})^a\\
&=\varphi_g(m)\varphi_g(n),
\end{align*}
and when $m<a$,
\begin{align*}
\varphi_g(m+n)
&=\varphi_g(-(a-m))\\
&=(g^{-1})^{a-m}\\
&=g^m(g^{-1})^a\\
&=\varphi_g(m)\varphi_g(n).
\end{align*}
The case $m<0$ and $n\geq 0$ is the same cancellation calculation with the two integers interchanged, since all factors are powers of the same element $g$. Hence $\varphi_g(m+n)=\varphi_g(m)\varphi_g(n)$ for all $m,n$, so $\varphi_g$ is a group homomorphism.
It remains to prove uniqueness. If $\psi:\mathbb Z\to G$ is a group homomorphism with $\psi(1)=g$, then for $n>0$,
\begin{align*}
\psi(n)
&=\psi(\underbrace{1+\cdots+1}_{n\text{ times}})\\
&=\underbrace{\psi(1)\cdots\psi(1)}_{n\text{ times}}\\
&=g^n\\
&=\varphi_g(n).
\end{align*}
Also
\begin{align*}
\psi(0)=e_G=\varphi_g(0).
\end{align*}
If $n<0$, then $-n>0$, and since $0=n+(-n)$,
\begin{align*}
e_G
&=\psi(0)\\
&=\psi(n+(-n))\\
&=\psi(n)\psi(-n),
\end{align*}
so $\psi(n)=\psi(-n)^{-1}$. Therefore
\begin{align*}
\psi(n)
&=\psi(-n)^{-1}\\
&=(g^{-n})^{-1}\\
&=(g^{-1})^{-n}\\
&=\varphi_g(n).
\end{align*}
Thus every $g\in U(G)$ is obtained from the single element $1\in U(\mathbb Z)$ by a unique homomorphism $\mathbb Z\to G$. This is precisely the universal-element form of the universal property of the free group on one generator.
[/example]
The free-group example represents a functor by maps out of a universal source. Tensor products show the same representability idea in a multilinear setting, where the data to be represented are bilinear maps.
[example: Tensor Product As A Representing Object]
Let $V$ and $W$ be vector spaces over a field $k$, and define $B_{V,W}:\mathbf{Vect}_k \to \mathbf{Set}$ on objects by
\begin{align*}
B_{V,W}(E)=\{\text{$k$-bilinear maps } V\times W\to E\}.
\end{align*}
On a linear map $T:E\to E'$, define
\begin{align*}
B_{V,W}(T):B_{V,W}(E)\to B_{V,W}(E'),
\qquad
\beta\mapsto T\circ \beta.
\end{align*}
If $\beta$ is bilinear, then for $v_1,v_2\in V$, $w\in W$, and $a,c\in k$,
\begin{align*}
(T\circ \beta)(av_1+cv_2,w)
&=T(\beta(av_1+cv_2,w))\\
&=T(a\beta(v_1,w)+c\beta(v_2,w))\\
&=aT(\beta(v_1,w))+cT(\beta(v_2,w))\\
&=a(T\circ\beta)(v_1,w)+c(T\circ\beta)(v_2,w),
\end{align*}
and the same calculation in the second variable shows that $T\circ\beta$ is bilinear.
Let
\begin{align*}
b:V\times W\to V\otimes_k W
\end{align*}
be the universal bilinear map of the tensor product. For every vector space $E$, define
\begin{align*}
\alpha_E:\operatorname{Hom}_{\mathbf{Vect}_k}(V\otimes_k W,E)\to B_{V,W}(E),
\qquad
\alpha_E(f)=f\circ b.
\end{align*}
Since $f$ is linear and $b$ is bilinear, $f\circ b$ is bilinear by the calculation above with $T=f$. Conversely, if $\beta:V\times W\to E$ is bilinear, the universal property of $V\otimes_k W$ gives a unique linear map
\begin{align*}
\bar{\beta}:V\otimes_k W\to E
\end{align*}
such that
\begin{align*}
\beta=\bar{\beta}\circ b.
\end{align*}
Thus $\alpha_E(\bar{\beta})=\beta$, so $\alpha_E$ is surjective. If $f,g:V\otimes_k W\to E$ satisfy $\alpha_E(f)=\alpha_E(g)$, then
\begin{align*}
f\circ b=g\circ b.
\end{align*}
Both $f$ and $g$ are linear maps whose composition with $b$ is the same bilinear map $V\times W\to E$, so uniqueness in the universal property gives $f=g$. Hence $\alpha_E$ is injective.
The bijections are natural in $E$. If $T:E\to E'$ is linear and $f:V\otimes_k W\to E$, then
\begin{align*}
\alpha_{E'}(T\circ f)
&=(T\circ f)\circ b\\
&=T\circ(f\circ b)\\
&=B_{V,W}(T)(\alpha_E(f)).
\end{align*}
Therefore
\begin{align*}
\operatorname{Hom}_{\mathbf{Vect}_k}(V\otimes_k W,E)\cong B_{V,W}(E)
\end{align*}
naturally in $E$. The tensor product represents the functor of bilinear maps out of $V\times W$: replacing a bilinear map by its unique linear factor through $V\otimes_k W$ is exactly the representing bijection.
[/example]
The tensor product example is often the first place where representability improves an old construction: instead of building $V\otimes_k W$ from generators and relations, the universal property says what it does.
## The Category of Elements
How can the universal element criterion be expressed as an ordinary initial-object statement? The category of elements turns all elements of all sets $F(X)$ into objects of a new category. Morphisms record when one element is transported to another by the functor.
[definition: Category Of Elements Of A Covariant Functor]
Let $F:\mathcal C \to \mathbf{Set}$ be a functor. The category of elements $\int_{\mathcal C} F$ has objects pairs $(X,x)$ with $X\in\mathcal C$ and $x\in F(X)$. A morphism
\begin{align*}
(X,x)\to (Y,y)
\end{align*}
is a morphism $f:X\to Y$ in $\mathcal C$ such that $F(f)(x)=y$.
[/definition]
Composition and identities are inherited from $\mathcal C$, with the functoriality of $F$ ensuring that the element condition is preserved.
To use categories of elements for presheaf-like functors, the construction must also handle contravariant transport. Since a morphism $X\to Y$ gives a function $F(Y)\to F(X)$, the compatibility condition in the category of elements has to record pullback rather than pushforward.
[definition: Category Of Elements Of A Contravariant Functor]
Let $F:\mathcal C^{\operatorname{op}} \to \mathbf{Set}$ be a functor. The category of elements $\int_{\mathcal C} F$ has objects pairs $(X,x)$ with $X\in\mathcal C$ and $x\in F(X)$. A morphism
\begin{align*}
(X,x)\to (Y,y)
\end{align*}
is a morphism $f:X\to Y$ in $\mathcal C$ such that $F(f)(y)=x$.
[/definition]
The condition is reversed because $F(f)$ is a function $F(Y)\to F(X)$. Thus a morphism in the category of elements says that the element at the target pulls back to the element at the source.
For covariant functors, the category of elements turns the question of representability into an ordinary question about a special object. A representing element should have a unique morphism to every other element of the functor, so representability should be detected by initiality in the corresponding category of elements.
[quotetheorem:3976]
[citeproof:3976]
This reformulation links representability to the earlier uniqueness theorem for initial objects. The theorem is not saying that every category of elements has such an object: for many functors there are elements, but no single element maps uniquely to all the others. For instance, a constant functor with a two-element value has a category of elements split into two unrelated copies of $\mathcal C$, so no object can be initial unless the two elements are forced to coincide. The hypothesis that $F$ is covariant fixes the direction of the initiality statement; for a contravariant functor the corresponding category of elements uses the reversed element condition from the preceding definition. Representable functors are precisely those whose category of elements has a specially placed object from which all other elements are reached uniquely, and the next example shows this initial object in the familiar case of groups with a chosen element.
[example: Elements Of The Underlying-Set Functor On Groups]
For $U:\mathbf{Grp}\to\mathbf{Set}$, the category of elements has objects $(G,g)$, where $G$ is a group and $g\in U(G)$ is a chosen element. A morphism
\begin{align*}
(G,g)\to(H,h)
\end{align*}
is a group homomorphism $\varphi:G\to H$ satisfying
\begin{align*}
U(\varphi)(g)=h,
\end{align*}
which means exactly that
\begin{align*}
\varphi(g)=h.
\end{align*}
We show that $(\mathbb Z,1)$ is initial. Let $(H,h)$ be any object of the category of elements, so $H$ is a group and $h\in H$. Define $\varphi_h:\mathbb Z\to H$ by
\begin{align*}
\varphi_h(n)=
\begin{cases}
h^n, & n>0,\\
e_H, & n=0,\\
(h^{-1})^{-n}, & n<0.
\end{cases}
\end{align*}
Then
\begin{align*}
\varphi_h(1)=h,
\end{align*}
so $\varphi_h$ has the required element condition.
It remains to check that $\varphi_h$ is a group homomorphism. If $m,n\geq 0$, then
\begin{align*}
\varphi_h(m+n)
&=h^{m+n}\\
&=\underbrace{h\cdots h}_{m\text{ times}}\underbrace{h\cdots h}_{n\text{ times}}\\
&=h^m h^n\\
&=\varphi_h(m)\varphi_h(n).
\end{align*}
If $m=-a$ and $n=-b$ with $a,b>0$, then
\begin{align*}
\varphi_h(m+n)
&=\varphi_h(-(a+b))\\
&=(h^{-1})^{a+b}\\
&=(h^{-1})^a(h^{-1})^b\\
&=\varphi_h(m)\varphi_h(n).
\end{align*}
If $m\geq 0$ and $n=-a<0$, then when $m\geq a$,
\begin{align*}
\varphi_h(m+n)
&=\varphi_h(m-a)\\
&=h^{m-a}\\
&=h^m(h^{-1})^a\\
&=\varphi_h(m)\varphi_h(n),
\end{align*}
and when $m<a$,
\begin{align*}
\varphi_h(m+n)
&=\varphi_h(-(a-m))\\
&=(h^{-1})^{a-m}\\
&=h^m(h^{-1})^a\\
&=\varphi_h(m)\varphi_h(n).
\end{align*}
The case $m<0$ and $n\geq 0$ is the same cancellation calculation with the roles of $m$ and $n$ interchanged, since all factors are powers of the same element $h$. Hence $\varphi_h(m+n)=\varphi_h(m)\varphi_h(n)$ for all $m,n\in\mathbb Z$, so $\varphi_h$ is a group homomorphism.
Now let $\psi:\mathbb Z\to H$ be any group homomorphism with $\psi(1)=h$. For $n>0$,
\begin{align*}
\psi(n)
&=\psi(\underbrace{1+\cdots+1}_{n\text{ times}})\\
&=\underbrace{\psi(1)\cdots\psi(1)}_{n\text{ times}}\\
&=h^n\\
&=\varphi_h(n).
\end{align*}
Also
\begin{align*}
\psi(0)=e_H=\varphi_h(0).
\end{align*}
If $n<0$, then $-n>0$, and since $0=n+(-n)$,
\begin{align*}
e_H
&=\psi(0)\\
&=\psi(n+(-n))\\
&=\psi(n)\psi(-n),
\end{align*}
so $\psi(n)=\psi(-n)^{-1}$. Therefore
\begin{align*}
\psi(n)
&=\psi(-n)^{-1}\\
&=(h^{-n})^{-1}\\
&=(h^{-1})^{-n}\\
&=\varphi_h(n).
\end{align*}
Thus $\psi=\varphi_h$. For every $(H,h)$ there is exactly one morphism
\begin{align*}
(\mathbb Z,1)\to(H,h)
\end{align*}
in the category of elements, so $(\mathbb Z,1)$ is initial.
[/example]
The category of elements therefore makes visible the difference between having many elements and having a universal one. The universal element is not special because it is large or complicated, but because every other element receives a unique explanation from it.
## What Representability Buys
Why is representability such a persistent idea across mathematics? It converts external data into morphisms, and morphisms are the native language of a category. Once a functor is represented, naturality questions become composition questions.
[explanation: Practical Meaning Of Representability]
Representability packages a construction by its behaviour under maps. For a covariant functor $F:\mathcal C\to\mathbf{Set}$ represented by $A$, an element of $F(X)$ is the same data as a morphism $A\to X$. The object $A$ is therefore a universal source of elements.
For a contravariant functor $F:\mathcal C^{\operatorname{op}}\to\mathbf{Set}$ represented by $A$, an element of $F(X)$ is the same data as a morphism $X\to A$. The object $A$ is then a universal target or parameter object. This is why affine schemes, dual spaces, and moduli problems are naturally expressed through contravariant functors.
The uniqueness theorem means that a representing object can be constructed in any convenient way. After representability is proved, all constructions satisfying the same universal property are canonically interchangeable through the unique isomorphism compatible with the representation.
[/explanation]
This interchangeability is powerful, but it depends on remembering the chosen representing data. The next remark separates the bare fact of representability from the particular universal element or natural isomorphism used in computations.
[remark: Representability Is A Property With Chosen Data]
A functor may be representable without a particular representation being specified. In computations, however, the chosen natural isomorphism matters because it identifies elements of $F(X)$ with concrete morphisms from or to the representing object. Changing the universal element changes the representation, even though the representing object remains unique up to the appropriate unique isomorphism.
[/remark]
Representable functors show that universal properties can be encoded by ordinary sets of maps. The next chapter turns this observation into the Yoneda lemma, which explains why Hom-functors are powerful probes capable of detecting objects, morphisms, and natural transformations.
# 8. The Yoneda Lemma
Chapter 7 developed representable functors as a way of encoding the universal properties studied in Chapter 6. This chapter explains why representable functors are not merely examples among functors: they are probes that recover objects, morphisms, and elements of arbitrary functors. The Yoneda lemma is the central result, and its proof is short enough that the mechanism should be learned as a reusable calculation rather than treated as a black box.
The guiding idea is that a natural transformation out of a Hom-functor is determined by what it does to the identity morphism. Naturality then forces its value on every other morphism. This principle gives a precise sense in which an object $A$ is represented by the functor of maps out of $A$, and it leads to the Yoneda embedding of a category into a functor category.
## Natural Transformations Out of a Representable Functor
Suppose $F:\mathcal C\to \mathbf{Set}$ is a functor and $A\in \mathcal C$ is an object. A natural transformation
$\eta:\operatorname{Hom}_{\mathcal C}(A,-)\Rightarrow F$ assigns to every object $X$ a function
\begin{align*}
\eta_X:\operatorname{Hom}_{\mathcal C}(A,X)\to F(X).
\end{align*}
The question is: how much data is needed to specify all these functions compatibly with every morphism in $\mathcal C$? Without naturality, the components $\eta_X$ could be chosen independently and no value at $A$ would control them. Without the Hom-functor viewpoint, the special role of the identity morphism would be invisible.
[definition: Covariant Hom Functor]
Let $\mathcal C$ be a locally small category and let $A\in \mathcal C$. The [covariant Hom functor](/theorems/3972) represented by $A$ is the functor
\begin{align*}
\operatorname{Hom}_{\mathcal C}(A,-):\mathcal C\to \mathbf{Set}
\end{align*}
defined on objects by $X\mapsto \operatorname{Hom}_{\mathcal C}(A,X)$ and on morphisms $f:X\to Y$ by
\begin{align*}
\operatorname{Hom}_{\mathcal C}(A,f)(u)=f\circ u
\end{align*}
for $u:A\to X$.
[/definition]
The functor records all ways of mapping from the fixed object $A$ into a variable object. This raises a sharp classification question: how much data is needed to specify a natural transformation from this represented functor to an arbitrary set-valued functor $F$? The Yoneda lemma answers that the entire transformation is controlled by what happens to the identity map of the representing object.
[quotetheorem:3977]
[citeproof:3977]
The conceptual point is that the identity morphism of $A$ acts as the universal generalized element of $A$. Once the value of a natural transformation is known there, naturality forces its value on every map out of $A$, so a family of functions collapses to a single element of $F(A)$.
The hypotheses are doing real work. Local smallness ensures that each $\operatorname{Hom}_{\mathcal C}(A,X)$ is a set, so the represented functor genuinely lands in $\mathbf{Set}$; without it, the displayed Hom-set may be too large for the stated functor category. The Set-valued target matters because the conclusion identifies natural transformations with ordinary elements of $F(A)$; enriched and higher-categorical versions need different statements. The lemma also does not say that every functor is representable: it describes natural transformations out of a representable functor once such a representing object has been fixed. This limitation is exactly why the rest of this chapter separates the covariant form, the contravariant form, and the representable-specialized consequences.
[example: Natural Transformations Determined by an Element]
Let $\mathcal C=\mathbf{Grp}$, let $A=\mathbb Z$, and let $F:\mathbf{Grp}\to \mathbf{Set}$ be the forgetful functor. For each group $G$, define
\begin{align*}
\Phi_G:\operatorname{Hom}_{\mathbf{Grp}}(\mathbb Z,G)&\to F(G),\\
u&\mapsto u(1).
\end{align*}
If $g\in G$, define $u_g:\mathbb Z\to G$ by $u_g(k)=g^k$. Then
\begin{align*}
u_g(k+\ell)=g^{k+\ell}=g^k g^\ell=u_g(k)u_g(\ell),
\end{align*}
so $u_g$ is a group homomorphism. Also $\Phi_G(u_g)=u_g(1)=g$. Conversely, if $u:\mathbb Z\to G$ is a group homomorphism, then
\begin{align*}
u(k)=u(\underbrace{1+\cdots+1}_{k\text{ times}})=u(1)^k
\end{align*}
for $k>0$, while $u(0)=e_G=u(1)^0$ and $u(-k)=u(k)^{-1}=u(1)^{-k}$ for $k>0$. Hence $u=u_{u(1)}$, so $\Phi_G$ is a bijection.
By *[Covariant Yoneda Lemma](/theorems/3977)*, the element $n\in F(\mathbb Z)=\mathbb Z$ determines the natural transformation $\eta^n$ with
\begin{align*}
\eta^n_G(u)=F(u)(n)=u(n).
\end{align*}
Under the identification above, if $u=u_g$, then
\begin{align*}
\eta^n_G(u_g)=u_g(n)=g^n.
\end{align*}
Naturality says that for every homomorphism $\varphi:G\to H$,
\begin{align*}
\eta^n_H(\varphi\circ u)
&=(\varphi\circ u)(n)\\
&=\varphi(u(n))\\
&=F(\varphi)(\eta^n_G(u)).
\end{align*}
Thus choosing the single integer $n$ fixes the whole natural operation: on each group $G$, it sends an element $g\in G$ to its $n$th power $g^n$.
[/example]
This example explains why the lemma is stronger than a cardinality statement. It does not merely count natural transformations; it identifies the unique formula that naturality permits.
## The Contravariant Form
Many familiar functors in algebra and geometry are contravariant: functions pull back, open sets restrict, and presheaves reverse arrows. The same question therefore has a dual form. If a functor is defined on $\mathcal C^{\mathrm{op}}$, what does a natural transformation out of $\operatorname{Hom}_{\mathcal C}(-,A)$ remember? If we tried to treat this Hom construction as a covariant functor on $\mathcal C$, a morphism $f:X\to Y$ would have to turn maps $X\to A$ into maps $Y\to A$, but composition supplies the opposite direction.
[definition: Contravariant Hom Functor]
Let $\mathcal C$ be a locally small category and let $A\in \mathcal C$. The [contravariant Hom functor](/theorems/3973) represented by $A$ is the functor
\begin{align*}
\operatorname{Hom}_{\mathcal C}(-,A):\mathcal C^{\mathrm{op}}\to \mathbf{Set}
\end{align*}
defined on objects by $X\mapsto \operatorname{Hom}_{\mathcal C}(X,A)$ and on a morphism $f:X\to Y$ in $\mathcal C$ by
\begin{align*}
\operatorname{Hom}_{\mathcal C}(f,A)(v)=v\circ f
\end{align*}
for $v:Y\to A$.
[/definition]
Here arrows into $A$ are transported by precomposition. This reversal is the reason the functor has domain $\mathcal C^{\mathrm{op}}$.
The contravariant version of Yoneda asks the same classification question for probes into a fixed object rather than generalized elements out of it. A natural transformation from $\operatorname{Hom}_{\mathcal C}(-,A)$ to a presheaf $F$ should be determined by a single element of $F(A)$, with precomposition forcing all other components. Making this precise is what lets presheaves be studied by how they respond to represented probes.
[quotetheorem:3978]
[citeproof:3978]
The two forms are the same theorem viewed through opposite categories, but in practice both are worth having at hand. Covariant Yoneda is used for covariant representable functors, while the contravariant version is the form underlying presheaves.
The same size and codomain cautions apply here. The category must be locally small so that $\operatorname{Hom}_{\mathcal C}(X,A)$ is a set for each probe $X$, and the functor $F$ is Set-valued so that $F(A)$ is literally a set of elements. The theorem does not identify arbitrary presheaves with representable presheaves; it identifies elements of any presheaf with natural transformations from a represented presheaf. This distinction becomes essential when the Yoneda embedding is introduced: representables form a special full subcategory of the presheaf category, not the whole presheaf category.
[example: Generalized Elements of a Presheaf]
Let $F:\mathcal C^{\mathrm{op}}\to \mathbf{Set}$ be a presheaf and let $a\in F(A)$. By the *[Contravariant Yoneda Lemma](/theorems/3978)*, $a$ determines a natural transformation
\begin{align*}
\eta^a:\operatorname{Hom}_{\mathcal C}(-,A)\Rightarrow F
\end{align*}
whose component at an object $X$ is
\begin{align*}
\eta^a_X:\operatorname{Hom}_{\mathcal C}(X,A)&\to F(X),\\
v&\mapsto F(v)(a).
\end{align*}
Here $v:X\to A$ is a morphism in $\mathcal C$, so contravariance gives a function $F(v):F(A)\to F(X)$.
To see the compatibility explicitly, let $f:Y\to X$ be a morphism in $\mathcal C$ and let $v:X\to A$. The represented presheaf sends $v$ along $f$ by precomposition:
\begin{align*}
\operatorname{Hom}_{\mathcal C}(f,A)(v)=v\circ f.
\end{align*}
Then
\begin{align*}
F(f)(\eta^a_X(v))
&=F(f)(F(v)(a))\\
&=(F(f)\circ F(v))(a)\\
&=F(v\circ f)(a)\\
&=\eta^a_Y(v\circ f)\\
&=\eta^a_Y(\operatorname{Hom}_{\mathcal C}(f,A)(v)).
\end{align*}
Thus the single element $a\in F(A)$ determines a compatible family of elements over every object mapping to $A$: in geometric language, $a$ is observed through all generalized points $X\to A$.
[/example]
This viewpoint is especially useful because it replaces hidden element-chasing by naturality. An element of $F(A)$ is the same thing as a coherent method of evaluating $F$ on every probe into $A$.
## Evaluating at the Identity Morphism
The proof of Yoneda is short, but it deserves a separate conceptual reading. The problem is to understand why the identity morphism carries enough information to reconstruct every natural transformation from a represented functor.
[illustration:yoneda-naturality-square]
[explanation: The Identity Controls the Components]
Let $\eta:\operatorname{Hom}_{\mathcal C}(A,-)\Rightarrow F$ be natural. For a morphism $u:A\to X$, naturality for $u$ gives the commutative square
\begin{align*}
\operatorname{Hom}_{\mathcal C}(A,A) &\xrightarrow{\eta_A} F(A)\\
\operatorname{Hom}_{\mathcal C}(A,u)\downarrow\phantom{\operatorname{Hom}_{\mathcal C}(A,u)} & \phantom{F(u)}\downarrow F(u)\\
\operatorname{Hom}_{\mathcal C}(A,X) &\xrightarrow{\eta_X} F(X).
\end{align*}
Following $\operatorname{id}_A$ down the left side gives $u$, while following it across and then down gives $F(u)(\eta_A(\operatorname{id}_A))$. Therefore
\begin{align*}
\eta_X(u)=F(u)(\eta_A(\operatorname{id}_A)).
\end{align*}
The component $\eta_X$ is not independent data; it is evaluation of $F$ on $u$ applied to one element of $F(A)$.
[/explanation]
This square is often the fastest way to rederive Yoneda during a calculation. If a proposed construction depends on more information than $\eta_A(\operatorname{id}_A)$, then it cannot be a natural transformation out of the represented functor.
[example: Natural Endomorphisms of a Hom Functor]
Fix $A\in \mathcal C$ and consider natural transformations
\begin{align*}
\alpha:\operatorname{Hom}_{\mathcal C}(A,-)\Rightarrow \operatorname{Hom}_{\mathcal C}(A,-).
\end{align*}
By the *[Covariant Yoneda Lemma](/theorems/3977)*, such an $\alpha$ is determined by the element
\begin{align*}
r:=\alpha_A(\operatorname{id}_A)\in \operatorname{Hom}_{\mathcal C}(A,A).
\end{align*}
For any object $X$ and any morphism $u:A\to X$, the Yoneda formula gives
\begin{align*}
\alpha_X(u)
&=\operatorname{Hom}_{\mathcal C}(A,u)(r)\\
&=u\circ r,
\end{align*}
because the covariant Hom functor sends $r:A\to A$ along $u:A\to X$ by postcomposition.
Conversely, given an endomorphism $r:A\to A$, define
\begin{align*}
\alpha^r_X:\operatorname{Hom}_{\mathcal C}(A,X)&\to \operatorname{Hom}_{\mathcal C}(A,X),\\
u&\mapsto u\circ r.
\end{align*}
For a morphism $f:X\to Y$ and $u:A\to X$, naturality follows from associativity of composition:
\begin{align*}
\operatorname{Hom}_{\mathcal C}(A,f)(\alpha^r_X(u))
&=f\circ (u\circ r)\\
&=(f\circ u)\circ r\\
&=\alpha^r_Y(f\circ u)\\
&=\alpha^r_Y(\operatorname{Hom}_{\mathcal C}(A,f)(u)).
\end{align*}
Finally,
\begin{align*}
\alpha^r_A(\operatorname{id}_A)=\operatorname{id}_A\circ r=r,
\end{align*}
so the endomorphism recovered by evaluation at $\operatorname{id}_A$ is exactly the original $r$. Thus natural endomorphisms of the covariant represented functor are precisely precomposition by endomorphisms of $A$.
[/example]
There is a useful warning here about order. Although the functor is covariant in the variable object, endomorphisms of $A$ act by precomposition on maps out of $A$.
The represented hypothesis is again decisive. If the source functor were not $\operatorname{Hom}_{\mathcal C}(A,-)$, evaluating at $\operatorname{id}_A$ would not even be available, and there would be no reason for a natural endomorphism to be controlled by a single endomorphism of $A$. Nor does this statement make the endomorphism monoid commutative: composition order is remembered by precomposition. This order issue is the bridge to the next section, where changing the representing object reverses the direction of morphisms for covariant Hom-functors.
## Natural Transformations Between Hom-Functors
The next question is what Yoneda says when the target functor is itself representable. This specializes the lemma to a statement about natural transformations between Hom-functors and morphisms in the original category. Without this specialization, a natural transformation between two functors can look like extra structure living in the functor category; Yoneda shows that for represented Hom-functors this apparent extra structure is exactly an ordinary morphism of representing objects.
[quotetheorem:3979]
[citeproof:3979]
The reversal of $A$ and $B$ is important: maps between covariant Hom-functors are induced by morphisms in the opposite direction between the representing objects. This is the first appearance of the contravariance of the assignment $A\mapsto \operatorname{Hom}_{\mathcal C}(A,-)$.
The hypothesis that both functors are represented cannot be dropped from this conclusion. For arbitrary functors $G,H:\mathcal C\to\mathbf{Set}$, a natural transformation $G\Rightarrow H$ need not arise from a morphism inside $\mathcal C$ at all; the representing objects are what translate naturality into composition. The local smallness assumption still guarantees that both Hom-functors are Set-valued. This theorem also does not identify $A$ with $B$ when the Hom-functors are merely connected by a natural transformation; isomorphism of represented functors corresponds to isomorphism of representing objects, while a general transformation corresponds to a general morphism $B\to A$.
[quotetheorem:3980]
[citeproof:3980]
For contravariant Hom-functors, the direction of morphisms between representing objects is preserved. This fact is the basis for embedding a category into a presheaf category.
Here the variance is not a cosmetic choice. If the represented presheaves were replaced by covariant Hom-functors, the induced map on representing objects would point the other way, so it would not define the usual Yoneda embedding $\mathcal C\to\widehat{\mathcal C}$. The theorem says only that transformations between represented presheaves come from morphisms of $\mathcal C$; transformations involving non-representable presheaves may encode genuinely new data. This is precisely why the presheaf category is larger than the original category.
[example: Induced Maps Between Represented Presheaves]
Let $\mathcal C=\mathbf{Top}$ and let $i:A\to B$ be a continuous map. For each test space $X$, define
\begin{align*}
\beta^i_X:\operatorname{Hom}_{\mathbf{Top}}(X,A)&\to \operatorname{Hom}_{\mathbf{Top}}(X,B),\\
v&\mapsto i\circ v.
\end{align*}
The composite $i\circ v$ is continuous because it is a composite of continuous maps.
To check naturality, let $f:Y\to X$ be continuous and let $v:X\to A$. The represented presheaves act by precomposition, so
\begin{align*}
\operatorname{Hom}_{\mathbf{Top}}(f,A)(v)&=v\circ f,\\
\operatorname{Hom}_{\mathbf{Top}}(f,B)(i\circ v)&=(i\circ v)\circ f.
\end{align*}
Then associativity of composition gives
\begin{align*}
\operatorname{Hom}_{\mathbf{Top}}(f,B)(\beta^i_X(v))
&=\operatorname{Hom}_{\mathbf{Top}}(f,B)(i\circ v)\\
&=(i\circ v)\circ f\\
&=i\circ (v\circ f)\\
&=\beta^i_Y(v\circ f)\\
&=\beta^i_Y(\operatorname{Hom}_{\mathbf{Top}}(f,A)(v)).
\end{align*}
Thus $i$ determines a natural transformation
\begin{align*}
\beta^i:\operatorname{Hom}_{\mathbf{Top}}(-,A)\Rightarrow \operatorname{Hom}_{\mathbf{Top}}(-,B).
\end{align*}
Conversely, let
\begin{align*}
\beta:\operatorname{Hom}_{\mathbf{Top}}(-,A)\Rightarrow \operatorname{Hom}_{\mathbf{Top}}(-,B)
\end{align*}
be any natural transformation. By *Natural Transformations Between Contravariant Hom Functors*, it is determined by the continuous map
\begin{align*}
i:=\beta_A(\operatorname{id}_A):A\to B.
\end{align*}
For any $v:X\to A$, naturality applied to $v:X\to A$ gives
\begin{align*}
\beta_X(v)
&=\beta_X(\operatorname{id}_A\circ v)\\
&=\beta_X(\operatorname{Hom}_{\mathbf{Top}}(v,A)(\operatorname{id}_A))\\
&=\operatorname{Hom}_{\mathbf{Top}}(v,B)(\beta_A(\operatorname{id}_A))\\
&=\operatorname{Hom}_{\mathbf{Top}}(v,B)(i)\\
&=i\circ v.
\end{align*}
If two maps $i,j:A\to B$ induced the same natural transformation, evaluating the component at $A$ on $\operatorname{id}_A$ would give
\begin{align*}
i=i\circ \operatorname{id}_A=j\circ \operatorname{id}_A=j.
\end{align*}
So every natural transformation between these represented presheaves is composition with a unique continuous map $A\to B$.
[/example]
This is the categorical version of a familiar principle: a map is determined by what it does to all generalized points, and natural operations on generalized points come from actual maps.
## The Yoneda Embedding
If objects are completely detected by their represented presheaves, the next problem is to package this detection as a functor. The result is the Yoneda embedding, which sends each object to the presheaf of maps into it. The smallness assumption enters because the presheaf category $\mathbf{Set}^{\mathcal C^{\mathrm{op}}}$ should itself be an ordinary functor category; without a size convention, the collection of all presheaves may be too large for the ambient foundations.
[definition: Presheaf Category]
Let $\mathcal C$ be a small category. The presheaf category on $\mathcal C$ is the functor category
\begin{align*}
\widehat{\mathcal C}:=\mathbf{Set}^{\mathcal C^{\mathrm{op}}}.
\end{align*}
[/definition]
The objects of $\widehat{\mathcal C}$ are contravariant Set-valued functors on $\mathcal C$, and its morphisms are natural transformations. Smallness of $\mathcal C$ is used here so that the functor category is an ordinary category in the chosen universe.
Once the presheaf category is available, each object of $\mathcal C$ has a canonical shadow inside it: the presheaf of all maps into that object. To compare objects and morphisms of $\mathcal C$ using these shadows, we need a functor that sends objects to their represented presheaves and sends arrows to the corresponding natural transformations.
[definition: Yoneda Embedding]
Let $\mathcal C$ be a small category. The Yoneda embedding is the functor
\begin{align*}
y:\mathcal C\to \widehat{\mathcal C}
\end{align*}
defined on objects by
\begin{align*}
y(A)=\operatorname{Hom}_{\mathcal C}(-,A)
\end{align*}
and on morphisms $s:A\to B$ by the natural transformation $y(s):y(A)\Rightarrow y(B)$ whose component at $X$ is
\begin{align*}
y(s)_X:\operatorname{Hom}_{\mathcal C}(X,A)\to \operatorname{Hom}_{\mathcal C}(X,B),\qquad v\mapsto s\circ v.
\end{align*}
[/definition]
This is a genuine functor because identity morphisms and composites are respected by associativity in $\mathcal C$. The notation $y$ is common; some texts write $h_A$ for the presheaf represented by $A$.
The important question is whether this passage to presheaves loses information. If natural transformations between represented presheaves are exactly the original morphisms between representing objects, then $\mathcal C$ has been embedded into a larger presheaf category without distorting its Hom-sets.
[quotetheorem:3981]
[citeproof:3981]
The word embedding means more than injectivity on objects. It says that all morphisms of $\mathcal C$ can be recovered inside the presheaf category, so $\mathcal C$ sits inside $\widehat{\mathcal C}$ as a full subcategory up to the usual identification of isomorphic images.
Full faithfulness is the exact strength of the conclusion. It does not assert that $y$ is essentially surjective: most presheaves are not represented by an object of $\mathcal C$. It also does not require $y$ to be literally injective on objects; if two objects have isomorphic represented presheaves, Yoneda recovers an isomorphism between the objects rather than equality of their names. Smallness is used to keep $\widehat{\mathcal C}$ available as a category, while local smallness is already built into the Hom-sets used to define each presheaf. These distinctions prepare the later use of presheaf categories as completions that contain many new objects beyond the representables.
[example: Cayley Theorem for Categories]
Let $G$ be a group, and regard it as the one-object category $\mathcal C_G$ with unique object $*$, with
\begin{align*}
\operatorname{Hom}_{\mathcal C_G}(*,*)=G,
\end{align*}
where identity is $e\in G$ and composition is group multiplication: if $h,k:*\to *$, then $h\circ k=hk$. The Yoneda embedding sends $*$ to the presheaf
\begin{align*}
y(*)=\operatorname{Hom}_{\mathcal C_G}(-,*),
\end{align*}
whose value at the only object is the set
\begin{align*}
y(*)(*)=\operatorname{Hom}_{\mathcal C_G}(*,*)=G.
\end{align*}
For a morphism $g:*\to *$, the component of $y(g):y(*)\Rightarrow y(*)$ at $*$ sends a generalized element $h:*\to *$ to
\begin{align*}
y(g)_*(h)=g\circ h=gh.
\end{align*}
Thus $y(g)_*$ is the left-translation function
\begin{align*}
L_g:G&\to G,\\
h&\mapsto gh.
\end{align*}
This function is a bijection, since $L_{g^{-1}}$ is its inverse:
\begin{align*}
(L_{g^{-1}}\circ L_g)(h)&=L_{g^{-1}}(gh)=g^{-1}(gh)=(g^{-1}g)h=eh=h,\\
(L_g\circ L_{g^{-1}})(h)&=L_g(g^{-1}h)=g(g^{-1}h)=(gg^{-1})h=eh=h.
\end{align*}
Moreover, the assignment $g\mapsto L_g$ preserves multiplication:
\begin{align*}
(L_g\circ L_k)(h)&=L_g(kh)=g(kh)=(gk)h=L_{gk}(h),
\end{align*}
and it preserves the identity because
\begin{align*}
L_e(h)=eh=h.
\end{align*}
It is injective because, if $L_g=L_k$, then evaluating both functions at $e$ gives
\begin{align*}
g=ge=L_g(e)=L_k(e)=ke=k.
\end{align*}
By *[Yoneda Embedding Is Fully Faithful](/theorems/3981)*, distinct morphisms of $\mathcal C_G$ give distinct natural transformations of the represented presheaf, and every such natural transformation comes from a unique morphism $g:*\to *$. Therefore the group $G$ embeds into the permutation group of its underlying set by the left regular action $g\mapsto L_g$, which is exactly [Cayley's theorem](/theorems/846) in the one-object categorical case.
[/example]
This example is worth remembering because it shows the size of the theorem. Yoneda is not an analogy with [Cayley's theorem](/theorems/795); it is a categorified form of the same representation principle.
## Generalized Elements and the Meaning of Representation
The final question is interpretive: what does Yoneda teach us to do in later category theory? It tells us to study an object $A$ through all morphisms into or out of it, and to treat elements of a functor as natural families of observations along those morphisms.
[definition: Generalized Element]
Let $\mathcal C$ be a category and let $A,X\in \mathcal C$. A generalized element of $X$ of shape $A$ is a morphism
\begin{align*}
A\to X.
\end{align*}
[/definition]
In $\mathbf{Set}$, ordinary elements of a set $X$ are the same as functions $1\to X$ from a singleton. In a general category there may be no preferred singleton-like object, so all shapes $A$ are allowed as probes.
[example: Ordinary and Generalized Elements]
In $\mathbf{Set}$, a morphism $p:A\to X$ assigns to each $a\in A$ an element $p(a)\in X$, so it is exactly an $A$-indexed family
\begin{align*}
(p(a))_{a\in A}
\end{align*}
of elements of $X$.
If $A=1=\{*\}$, then a function $p:1\to X$ is determined by the single value $p(*)\in X$. Conversely, every element $x\in X$ defines a function
\begin{align*}
p_x:1&\to X,\\
*&\mapsto x.
\end{align*}
These constructions are inverse because $p_{p(*)}(*)=p(*)$ and, for every $x\in X$, $p_x(*)=x$.
If $A=\{1,2\}$, then a function $p:\{1,2\}\to X$ is determined by the ordered pair
\begin{align*}
(p(1),p(2))\in X\times X.
\end{align*}
Conversely, every pair $(x_1,x_2)\in X\times X$ defines a function
\begin{align*}
p_{(x_1,x_2)}:\{1,2\}&\to X,\\
1&\mapsto x_1,\\
2&\mapsto x_2.
\end{align*}
Again the constructions are inverse, since
\begin{align*}
(p_{(p(1),p(2))}(1),p_{(p(1),p(2))}(2))=(p(1),p(2))
\end{align*}
and $p_{(x_1,x_2)}$ has values $x_1$ and $x_2$ at $1$ and $2$.
Thus ordinary elements are the special case of generalized elements with singleton shape, while two-point shapes record ordered pairs; Yoneda treats all maps $A\to X$ as probes of $X$, with naturality requiring these probes to behave compatibly under functions between shapes.
[/example]
The language of generalized elements becomes powerful in categories where ordinary element notation is misleading or unavailable. For example, in a category of spaces, a morphism $A\to X$ may describe a path, a parametrized family, or a local chart into $X$.
[remark: Yoneda Philosophy]
To understand an object $X$, study the functor $\operatorname{Hom}_{\mathcal C}(-,X)$ of generalized elements. To understand an element $a\in F(A)$ of a presheaf, study the natural transformation $\operatorname{Hom}_{\mathcal C}(-,A)\Rightarrow F$ that it determines. To understand a morphism $A\to B$, study the induced natural transformation between represented presheaves.
[/remark]
This philosophy will recur whenever objects are characterized by maps into test objects or maps out of free objects. It is the reason representable functors are central in algebra, topology, algebraic geometry, and homological algebra.
[example: Universal Elements as Yoneda Data]
Let $F:\mathcal C\to \mathbf{Set}$ be a functor, and suppose $F$ is represented by an object $A$ through a natural isomorphism
\begin{align*}
\theta:\operatorname{Hom}_{\mathcal C}(A,-)\Rightarrow F.
\end{align*}
Let
\begin{align*}
u:=\theta_A(\operatorname{id}_A)\in F(A).
\end{align*}
For any morphism $f:A\to X$, naturality of $\theta$ for $f$ gives
\begin{align*}
\theta_X(f)
&=\theta_X(\operatorname{Hom}_{\mathcal C}(A,f)(\operatorname{id}_A))\\
&=F(f)(\theta_A(\operatorname{id}_A))\\
&=F(f)(u).
\end{align*}
Since $\theta$ is an isomorphism of functors, each component
\begin{align*}
\theta_X:\operatorname{Hom}_{\mathcal C}(A,X)\to F(X)
\end{align*}
is a bijection. Therefore every $x\in F(X)$ has a unique preimage $f:A\to X$ under $\theta_X$, and the displayed formula says exactly that
\begin{align*}
x=\theta_X(f)=F(f)(u).
\end{align*}
Conversely, suppose $u\in F(A)$ has the property that for every object $X$ and every $x\in F(X)$, there is a unique morphism $f:A\to X$ such that $x=F(f)(u)$. Define
\begin{align*}
\theta^u_X:\operatorname{Hom}_{\mathcal C}(A,X)&\to F(X),\\
f&\mapsto F(f)(u).
\end{align*}
Each $\theta^u_X$ is bijective by the stated existence and uniqueness property. For a morphism $h:X\to Y$ and $f:A\to X$, functoriality of $F$ gives
\begin{align*}
F(h)(\theta^u_X(f))
&=F(h)(F(f)(u))\\
&=(F(h)\circ F(f))(u)\\
&=F(h\circ f)(u)\\
&=\theta^u_Y(h\circ f)\\
&=\theta^u_Y(\operatorname{Hom}_{\mathcal C}(A,h)(f)).
\end{align*}
Thus $\theta^u$ is a natural isomorphism $\operatorname{Hom}_{\mathcal C}(A,-)\cong F$.
This is the content of the *Covariant Yoneda Lemma* in representation form: a universal element is precisely the element $u\in F(A)$ obtained by evaluating the representing isomorphism at $\operatorname{id}_A$, and its universality is exactly the bijectivity of the representing natural transformation.
[/example]
Yoneda therefore completes the course's first arc. Categories provide the setting, functors transport structure, natural transformations compare functors, and representable functors convert universal properties into ordinary elements of sets. The Yoneda lemma explains why this conversion loses no information.
The Yoneda lemma closes the first arc of the course by showing that objects and natural transformations can be recovered from their Hom-behaviour. The next chapter uses that insight as a working principle, turning Yoneda into a tool for reconstructing and comparing categorical constructions.
# 9. Consequences of Yoneda
The Yoneda lemma does more than identify natural transformations out of a representable functor. It explains why categorical constructions can be reconstructed from the way other objects map into them. In this chapter we use Yoneda as a working principle: to understand an object, understand its Hom-functor; to prove uniqueness of a construction, prove that two representing objects represent the same functor; to classify natural operations, look at the element selected by the identity map.
Chapter 8 established the Yoneda lemma as a theorem about natural transformations. This chapter turns it into a method. The examples are deliberately concrete: products of sets, free objects, tensor products, natural endomorphisms of the identity functor on $\mathbf{Set}$, and real-valued continuous functions.
## Categorical Reconstruction From Hom-Sets
How much of an object is visible if we only know all maps into it? Yoneda answers this question: in a locally small category, the contravariant Hom-functor $\operatorname{Hom}_{\mathcal C}(-, A)$ determines $A$ up to unique isomorphism once the representing structure is fixed.
[quotetheorem:3982]
[citeproof:3982]
This theorem is often the cleanest way to prove that two constructions with the same universal mapping property are the same up to unique isomorphism. The locally small hypothesis is needed so that the Hom-assignments really land in $\mathbf{Set}$; without it, the displayed natural isomorphism is not a set-valued natural isomorphism of the kind used by Yoneda. The theorem does not say that any two objects with Hom-sets of the same cardinalities are isomorphic: the bijections must be natural in the test object. The proof also shows how to find the isomorphism: evaluate the natural isomorphism at the representing object and apply it to the identity morphism.
[example: Reconstructing A Set From Its Incoming Maps]
Let $\alpha_X:\operatorname{Hom}_{\mathbf{Set}}(X,A)\to\operatorname{Hom}_{\mathbf{Set}}(X,B)$ be the given natural bijection, and let $\beta_X$ denote its inverse. Define
\begin{align*}
f&:=\alpha_A(1_A):A\to B,\\
g&:=\beta_B(1_B):B\to A.
\end{align*}
We show that $f$ and $g$ are inverse functions.
Because $\beta_A$ is inverse to $\alpha_A$, we have
\begin{align*}
\beta_A(f)
&=\beta_A(\alpha_A(1_A))\\
&=1_A.
\end{align*}
On the other hand, naturality of $\beta$ with respect to $f:A\to B$ gives
\begin{align*}
\beta_A(f)
&=\beta_A(1_B\circ f)\\
&=\beta_B(1_B)\circ f\\
&=g\circ f.
\end{align*}
Hence $g\circ f=1_A$.
Similarly, since $\alpha_B$ is inverse to $\beta_B$,
\begin{align*}
\alpha_B(g)
&=\alpha_B(\beta_B(1_B))\\
&=1_B.
\end{align*}
Naturality of $\alpha$ with respect to $g:B\to A$ gives
\begin{align*}
\alpha_B(g)
&=\alpha_B(1_A\circ g)\\
&=\alpha_A(1_A)\circ g\\
&=f\circ g.
\end{align*}
Therefore $f\circ g=1_B$. Thus $f:A\to B$ is a bijection with inverse $g:B\to A$, so the natural bijection of Hom-sets is induced by an actual bijection of sets.
[/example]
The set-level case is a useful warning because it separates pointwise counting from functorial structure. The maps $f$ and $g$ are not chosen independently; they are forced by the same natural bijection evaluated at different test sets.
[remark: Why Naturality Matters]
A bijection between the sets $\operatorname{Hom}_{\mathcal C}(X,A)$ and $\operatorname{Hom}_{\mathcal C}(X,B)$ for each separate object $X$ is not enough. The bijections must commute with precomposition by every morphism $Y\to X$. That compatibility is what prevents arbitrary counting coincidences from being mistaken for categorical structure.
[/remark]
The same idea applies covariantly, but the direction changes the kind of universal property it controls. Instead of testing maps into a candidate object, the covariant form tests maps out of it, which is the pattern behind free objects, tensor products, and other constructions specified by outgoing maps.
For that reason the covariant form needs its own statement. The issue is whether a natural comparison between functors of the form $\operatorname{Hom}_{\mathcal C}(A,-)$ and $\operatorname{Hom}_{\mathcal C}(B,-)$ is still forced by an isomorphism between $A$ and $B$.
[quotetheorem:3983]
[citeproof:3983]
This covariant version has the same limitations as the contravariant one. The isomorphism must be natural in the output object, not merely a collection of unrelated bijections. It prepares the ground for universal properties where the representing object is the source of maps rather than the target.
[quotetheorem:3984]
[citeproof:3984]
The phrase "unique up to unique isomorphism" should therefore be read literally. Universal properties do not merely specify an object up to some non-canonical bijection; they specify the unique isomorphism that preserves the universal data.
This is the bridge from Yoneda-style recognition to ordinary mathematical constructions. When a construction is characterized by a universal property, its uniqueness is not an added convention but a consequence of representability. The remaining task is to express familiar universal data as functors whose representing objects are the constructions themselves.
## Universal Properties As Representability Statements
What is a universal property, stripped of its familiar notation? It is a claim that a certain functor is representable. The representing object is the construction, and the representing element is the universal arrow or universal datum.
[definition: Representation Of A Contravariant Set-Valued Functor]
Let $\mathcal C$ be a locally small category, and let $F:\mathcal C^{\mathrm{op}}\to\mathbf{Set}$ be a functor. A representation of $F$ is an object $A\in\mathcal C$ together with a natural isomorphism
\begin{align*}
\Phi_X:\operatorname{Hom}_{\mathcal C}(X,A)\longrightarrow F(X)
\end{align*}
for all $X\in\mathcal C$.
[/definition]
The representing object $A$ is not isolated from the isomorphism $\Phi$. The element $u:=\Phi_A(1_A)\in F(A)$ is the universal element, and every element of every $F(X)$ is obtained uniquely by pulling back $u$ along a morphism $X\to A$.
This observation needs a precise converse: if a functor comes equipped with one element whose pullbacks classify all other elements uniquely, then that element should determine a representation. The theorem turns the informal language of universal elements into an exact representability criterion.
[quotetheorem:3985]
[citeproof:3985]
This formulation is useful because many constructions are introduced by naming a special element and saying that all other data factor uniquely through it. The word "unique" is doing essential work: if two different maps $X\to A$ could pull back $u$ to the same datum, then $A$ would not classify the data represented by $F$. The theorem also does not assert that every functor has such an element; non-representable functors are common, and detecting non-representability often means finding data that cannot be pulled back from a single universal object. Yoneda packages the successful cases into representability.
[example: Products Of Sets As A Representability Statement]
Fix sets $A$ and $B$. Consider the functor $F:\mathbf{Set}^{\mathrm{op}}\to\mathbf{Set}$ defined by
\begin{align*}
F(X)=\operatorname{Hom}_{\mathbf{Set}}(X,A)\times\operatorname{Hom}_{\mathbf{Set}}(X,B).
\end{align*}
We show that the cartesian product $A\times B$, with projections
\begin{align*}
\pi_A:A\times B\to A,\qquad \pi_A(a,b)=a,
\qquad
\pi_B:A\times B\to B,\qquad \pi_B(a,b)=b,
\end{align*}
represents $F$.
For each set $X$, define
\begin{align*}
\Phi_X:\operatorname{Hom}_{\mathbf{Set}}(X,A\times B)
&\longrightarrow
\operatorname{Hom}_{\mathbf{Set}}(X,A)\times\operatorname{Hom}_{\mathbf{Set}}(X,B),\\
h&\longmapsto (\pi_A\circ h,\pi_B\circ h).
\end{align*}
Given functions $f:X\to A$ and $g:X\to B$, define $h:X\to A\times B$ by
\begin{align*}
h(x)=(f(x),g(x)).
\end{align*}
Then
\begin{align*}
(\pi_A\circ h)(x)
&=\pi_A(h(x))\\
&=\pi_A(f(x),g(x))\\
&=f(x),
\end{align*}
and similarly
\begin{align*}
(\pi_B\circ h)(x)
&=\pi_B(h(x))\\
&=\pi_B(f(x),g(x))\\
&=g(x).
\end{align*}
Thus $\Phi_X(h)=(f,g)$.
The function $h$ is unique with this property. Indeed, if $k:X\to A\times B$ satisfies $\pi_A\circ k=f$ and $\pi_B\circ k=g$, then for each $x\in X$ we can write $k(x)=(a_x,b_x)$ with $a_x\in A$ and $b_x\in B$. The two projection equations give
\begin{align*}
a_x
&=\pi_A(k(x))\\
&=(\pi_A\circ k)(x)\\
&=f(x),
\end{align*}
and
\begin{align*}
b_x
&=\pi_B(k(x))\\
&=(\pi_B\circ k)(x)\\
&=g(x).
\end{align*}
Therefore
\begin{align*}
k(x)=(a_x,b_x)=(f(x),g(x))=h(x)
\end{align*}
for every $x\in X$, so $k=h$. Hence each $\Phi_X$ is a bijection.
The universal element is the pair of projections
\begin{align*}
(\pi_A,\pi_B)\in F(A\times B).
\end{align*}
For a function $h:X\to A\times B$, pulling this element back along $h$ gives
\begin{align*}
F(h)(\pi_A,\pi_B)=(\pi_A\circ h,\pi_B\circ h),
\end{align*}
which is exactly $\Phi_X(h)$. Thus $A\times B$ represents the functor assigning to $X$ the set of pairs of functions from $X$ into $A$ and $B$.
[/example]
This example is the product universal property without developing the full theory of limits. The functor records the data that a candidate object must classify: pairs of maps into $A$ and $B$.
The example suggests a general principle: whenever such a product object exists, representing the pair-of-maps functor should force its projections and its uniqueness properties. The theorem states this representability formulation of binary products, separating the classification property from the separate question of existence.
[quotetheorem:3986]
[citeproof:3986]
The product example also shows what the theorem does not guarantee: an arbitrary category need not have an object representing the product functor. If such an object exists, Yoneda gives its uniqueness and the compatibility of its projections, but existence must be proved separately. Other limits follow the same pattern after replacing the pair of projection maps by the corresponding cone data.
Free objects fit the same pattern, but now the represented functor remembers underlying functions out of a set. The construction is familiar from algebra: a free group on a set, a free module on a set, or a polynomial ring generated by a set.
[example: Free Groups As Representing Objects]
Let $U:\mathbf{Grp}\to\mathbf{Set}$ be the forgetful functor, and let $\eta:S\to U(F(S))$ be the inclusion of the generators of the free group on $S$. We show that $F(S)$ represents the functor
\begin{align*}
G\longmapsto \operatorname{Hom}_{\mathbf{Set}}(S,U(G)).
\end{align*}
For each group $G$, define
\begin{align*}
\Phi_G:\operatorname{Hom}_{\mathbf{Grp}}(F(S),G)
&\longrightarrow \operatorname{Hom}_{\mathbf{Set}}(S,U(G)),\\
\varphi&\longmapsto U(\varphi)\circ \eta.
\end{align*}
Thus $\Phi_G(\varphi)$ sends a generator $s\in S$ to the element $\varphi(\eta(s))\in U(G)$.
Given a function $a:S\to U(G)$, the defining property of the free group gives a unique group homomorphism $\widetilde a:F(S)\to G$ satisfying
\begin{align*}
U(\widetilde a)\circ \eta=a.
\end{align*}
Equivalently, for every $s\in S$,
\begin{align*}
\Phi_G(\widetilde a)(s)
&=(U(\widetilde a)\circ \eta)(s)\\
&=U(\widetilde a)(\eta(s))\\
&=\widetilde a(\eta(s))\\
&=a(s).
\end{align*}
Hence $\Phi_G(\widetilde a)=a$, so $\Phi_G$ is surjective.
If $\varphi,\psi:F(S)\to G$ are group homomorphisms with $\Phi_G(\varphi)=\Phi_G(\psi)$, then for every $s\in S$,
\begin{align*}
\varphi(\eta(s))
&=(U(\varphi)\circ \eta)(s)\\
&=\Phi_G(\varphi)(s)\\
&=\Phi_G(\psi)(s)\\
&=(U(\psi)\circ \eta)(s)\\
&=\psi(\eta(s)).
\end{align*}
Both $\varphi$ and $\psi$ are homomorphisms out of $F(S)$ with the same values on all generators, so the uniqueness part of the free group property gives $\varphi=\psi$. Thus $\Phi_G$ is injective.
The bijection is natural in $G$. If $\theta:G\to H$ is a group homomorphism and $\varphi:F(S)\to G$, then for every $s\in S$,
\begin{align*}
\Phi_H(\theta\circ \varphi)(s)
&=(U(\theta\circ \varphi)\circ \eta)(s)\\
&=(\theta\circ \varphi)(\eta(s))\\
&=\theta(\varphi(\eta(s)))\\
&=U(\theta)\bigl((U(\varphi)\circ \eta)(s)\bigr)\\
&=(U(\theta)\circ \Phi_G(\varphi))(s).
\end{align*}
Therefore $\Phi_H(\theta\circ\varphi)=U(\theta)\circ\Phi_G(\varphi)$. The universal element is $\eta:S\to U(F(S))$, and the representing bijection says precisely that choosing elements of $G$ indexed by $S$ is the same as choosing a unique homomorphism $F(S)\to G$.
[/example]
The direction of variance has changed, but the logic has not. A free object represents a functor of outgoing maps from the free object, while products represent a functor of incoming maps into the product.
[example: Tensor Products As Representing Objects]
Let $R$ be a commutative ring, and let $M,N$ be $R$-modules. Write
\begin{align*}
\tau:M\times N\to M\otimes_R N,
\qquad
\tau(m,n)=m\otimes n.
\end{align*}
We show that $M\otimes_R N$ represents the functor sending an $R$-module $P$ to the set $\operatorname{Bilin}_R(M,N;P)$ of $R$-bilinear maps $M\times N\to P$.
For each $R$-module $P$, define
\begin{align*}
\Phi_P:\operatorname{Hom}_R(M\otimes_R N,P)
&\longrightarrow
\operatorname{Bilin}_R(M,N;P),\\
\ell&\longmapsto \ell\circ\tau.
\end{align*}
If $\ell:M\otimes_R N\to P$ is $R$-linear, then $\ell\circ\tau$ is bilinear. Indeed, for $m,m'\in M$, $n,n'\in N$, and $r\in R$,
\begin{align*}
(\ell\circ\tau)(m+m',n)
&=\ell((m+m')\otimes n)\\
&=\ell(m\otimes n+m'\otimes n)\\
&=\ell(m\otimes n)+\ell(m'\otimes n)\\
&=(\ell\circ\tau)(m,n)+(\ell\circ\tau)(m',n),
\end{align*}
\begin{align*}
(\ell\circ\tau)(rm,n)
&=\ell((rm)\otimes n)\\
&=\ell(r(m\otimes n))\\
&=r\ell(m\otimes n)\\
&=r(\ell\circ\tau)(m,n),
\end{align*}
and the same calculation in the second variable gives
\begin{align*}
(\ell\circ\tau)(m,n+n')
&=(\ell\circ\tau)(m,n)+(\ell\circ\tau)(m,n'),\\
(\ell\circ\tau)(m,rn)
&=r(\ell\circ\tau)(m,n).
\end{align*}
Conversely, let $b:M\times N\to P$ be $R$-bilinear. By the defining [universal property of the tensor product](/theorems/3971), there is a unique $R$-linear map
\begin{align*}
\bar b:M\otimes_R N\to P
\end{align*}
such that
\begin{align*}
\bar b\circ\tau=b.
\end{align*}
Equivalently, for every $m\in M$ and $n\in N$,
\begin{align*}
\Phi_P(\bar b)(m,n)
&=(\bar b\circ\tau)(m,n)\\
&=\bar b(m\otimes n)\\
&=b(m,n).
\end{align*}
Thus $\Phi_P(\bar b)=b$, so $\Phi_P$ is surjective.
If $\ell_1,\ell_2:M\otimes_R N\to P$ satisfy $\Phi_P(\ell_1)=\Phi_P(\ell_2)$, then
\begin{align*}
\ell_1\circ\tau=\ell_2\circ\tau.
\end{align*}
Both $\ell_1$ and $\ell_2$ are $R$-linear maps out of $M\otimes_R N$ inducing the same bilinear map $M\times N\to P$, so the uniqueness part of the tensor product universal property gives
\begin{align*}
\ell_1=\ell_2.
\end{align*}
Hence $\Phi_P$ is injective, and therefore
\begin{align*}
\operatorname{Hom}_R(M\otimes_R N,P)
\cong
\operatorname{Bilin}_R(M,N;P).
\end{align*}
This bijection is natural in $P$. If $q:P\to Q$ is an $R$-linear map and $\ell:M\otimes_R N\to P$, then for every $(m,n)\in M\times N$,
\begin{align*}
\Phi_Q(q\circ\ell)(m,n)
&=((q\circ\ell)\circ\tau)(m,n)\\
&=q(\ell(\tau(m,n)))\\
&=q(\ell(m\otimes n))\\
&=q((\ell\circ\tau)(m,n))\\
&=(q\circ\Phi_P(\ell))(m,n).
\end{align*}
Thus $\Phi_Q(q\circ\ell)=q\circ\Phi_P(\ell)$. The universal element is $\tau$, and the representing bijection says that giving a linear map out of $M\otimes_R N$ is exactly the same as giving a bilinear map out of $M\times N$.
[/example]
Representability therefore gives a common grammar for products, free objects, tensor products, and many constructions that appear in algebra and topology. The construction is the object that makes a data-assigning functor into a Hom-functor.
## Natural Operations Classified By Elements
What does it mean to have a natural operation on all objects of a category? Yoneda says that natural transformations out of a representable functor are elements of the target functor evaluated at the representing object. In practice, this turns a family of compatible operations into a single piece of data.
[quotetheorem:3987]
[citeproof:3987]
The point is that naturality removes arbitrary choices. The representable source hypothesis is essential: Yoneda classifies transformations whose source is a Hom-functor, not arbitrary natural transformations between unrelated functors. The theorem does not say that every family of operations is natural; for example, choosing a distinguished element of each set fails because functions need not preserve such choices. A natural operation cannot inspect the internal presentation of an object unless that information is available through morphisms in the category.
[example: Natural Endomorphisms Of The Identity Functor On Set]
Let $\operatorname{Id}_{\mathbf{Set}}:\mathbf{Set}\to\mathbf{Set}$ be the identity functor. A natural endomorphism $\alpha:\operatorname{Id}_{\mathbf{Set}}\Rightarrow\operatorname{Id}_{\mathbf{Set}}$ assigns to each set $X$ a function $\alpha_X:X\to X$ such that, for every function $f:X\to Y$,
\begin{align*}
f\circ \alpha_X=\alpha_Y\circ f.
\end{align*}
We show that every component $\alpha_X$ is the identity function on $X$.
Fix a set $X$ and an element $x\in X$. Let $1=\{*\}$, and define $f:1\to X$ by $f(*)=x$. Naturality of $\alpha$ with respect to $f$ says
\begin{align*}
f\circ \alpha_1=\alpha_X\circ f.
\end{align*}
Evaluating both sides at $*$ gives
\begin{align*}
(f\circ \alpha_1)(*)
&=(\alpha_X\circ f)(*).
\end{align*}
Expanding the two compositions,
\begin{align*}
f(\alpha_1(*))
&=\alpha_X(f(*)).
\end{align*}
Since $\alpha_1:1\to 1$ is a function from the singleton set to itself, its only possible value is
\begin{align*}
\alpha_1(*)=*.
\end{align*}
Using also $f(*)=x$, we get
\begin{align*}
\alpha_X(x)
&=\alpha_X(f(*))\\
&=f(\alpha_1(*))\\
&=f(*)\\
&=x.
\end{align*}
Thus $\alpha_X$ fixes every element $x\in X$, so $\alpha_X=1_X$ for every set $X$. Therefore the only natural endomorphism of the identity functor on $\mathbf{Set}$ is the identity natural transformation.
[/example]
This is a small but revealing calculation. There is no natural way to choose a different element of every set, because the singleton already tests every element through maps out of it.
[example: Natural Operations On Real-Valued Continuous Functions]
Let $\mathbf{Top}$ be the [category of topological spaces](/theorems/3948) and continuous maps, and write
\begin{align*}
C(X,\mathbb R)=\operatorname{Hom}_{\mathbf{Top}}(X,\mathbb R).
\end{align*}
A natural operation on continuous real-valued functions is a natural transformation
\begin{align*}
\alpha:C(-,\mathbb R)\Rightarrow C(-,\mathbb R),
\end{align*}
so each component $\alpha_X:C(X,\mathbb R)\to C(X,\mathbb R)$ sends a continuous function $f:X\to\mathbb R$ to another continuous function $\alpha_X(f):X\to\mathbb R$.
Let
\begin{align*}
\varphi:=\alpha_{\mathbb R}(1_{\mathbb R}):\mathbb R\to\mathbb R.
\end{align*}
Since $\alpha_{\mathbb R}$ has codomain $C(\mathbb R,\mathbb R)$, the map $\varphi$ is continuous. We show that $\alpha$ is completely determined by this single continuous map. For any space $X$ and any $f:X\to\mathbb R$, naturality of $\alpha$ with respect to $f:X\to\mathbb R$ gives
\begin{align*}
C(f,\mathbb R)\bigl(\alpha_{\mathbb R}(1_{\mathbb R})\bigr)
&=\alpha_X\bigl(C(f,\mathbb R)(1_{\mathbb R})\bigr).
\end{align*}
Expanding the two precomposition maps,
\begin{align*}
C(f,\mathbb R)\bigl(\alpha_{\mathbb R}(1_{\mathbb R})\bigr)
&=\alpha_{\mathbb R}(1_{\mathbb R})\circ f\\
&=\varphi\circ f,
\end{align*}
and
\begin{align*}
\alpha_X\bigl(C(f,\mathbb R)(1_{\mathbb R})\bigr)
&=\alpha_X(1_{\mathbb R}\circ f)\\
&=\alpha_X(f).
\end{align*}
Therefore
\begin{align*}
\alpha_X(f)=\varphi\circ f.
\end{align*}
Conversely, every continuous map $\varphi:\mathbb R\to\mathbb R$ defines a natural operation by
\begin{align*}
\alpha_X(f)=\varphi\circ f.
\end{align*}
If $u:Y\to X$ is continuous, then for every $f:X\to\mathbb R$,
\begin{align*}
\alpha_Y(C(u,\mathbb R)(f))
&=\alpha_Y(f\circ u)\\
&=\varphi\circ(f\circ u)\\
&=(\varphi\circ f)\circ u\\
&=C(u,\mathbb R)(\varphi\circ f)\\
&=C(u,\mathbb R)(\alpha_X(f)).
\end{align*}
Thus the components are natural in $X$. Hence natural operations $C(-,\mathbb R)\Rightarrow C(-,\mathbb R)$ are exactly postcomposition operations $f\mapsto\varphi\circ f$ for continuous maps $\varphi:\mathbb R\to\mathbb R$, as predicted by the *contravariant Yoneda lemma*.
[/example]
This example explains why operations such as $f\mapsto f^2$, $f\mapsto |f|$, and $f\mapsto \sin\circ f$ are natural on continuous real-valued functions: they arise from continuous functions $\mathbb R\to\mathbb R$. The same reasoning excludes operations that depend on a coordinate chart, a measure, or a chosen point of $X$, since those data are not part of the functor $C(-,\mathbb R)$.
[remark: Elements As Generalised Points]
For any object $A$, a morphism $X\to A$ may be read as an $X$-indexed family of points of $A$. Yoneda makes this slogan precise by saying that all such generalised points, functorially in $X$, determine $A$. The case $X=1$ in a category with a terminal object recovers ordinary global elements, but Yoneda uses every test object rather than only the terminal one.
[/remark]
## Representability As A Habit Of Thought
How should Yoneda change the way we read later mathematics? Whenever a construction is introduced by saying that maps to or from it correspond naturally to some structured data, we should ask which functor is being represented.
[explanation: Reading Universal Properties]
A universal property has three parts. First, there is a category in which the representing object lives. Second, there is a functor assigning to each test object the set of data we want to classify. Third, there is a natural bijection between morphisms involving the representing object and that set of data.
This perspective separates the construction from its implementation. A product of sets can be built as ordered pairs, but its categorical content is the natural bijection $\operatorname{Hom}(X,A\times B)\cong\operatorname{Hom}(X,A)\times\operatorname{Hom}(X,B)$. A tensor product can be built by generators and relations, but its categorical content is that linear maps out of it are bilinear maps out of the original pair.
[/explanation]
This distinction between construction and implementation becomes sharper in later categorical language. Adjunctions package whole families of representability statements, limits and colimits organise diagram-shaped universal properties, and moduli problems often begin by asking whether a functor of points is representable.
The remaining question is why two different implementations of the same universal property should count as the same construction. Representability answers this by showing that two representing objects for one functor are connected by a uniquely determined isomorphism compatible with the representing data.
[quotetheorem:3988]
[citeproof:3988]
The word "canonically" in this context means that the isomorphism is forced by the universal property. The hypotheses do not assert that the functor is representable, only that once two representations have been supplied they determine the same object up to the unique compatible isomorphism. If the natural isomorphisms to $F$ are changed, the compatible isomorphism may change as well; the canonical map is canonical relative to the universal data. This is the precise sense in which later constructions such as adjoints, limits, and moduli objects are independent of a chosen model.
[example: Two Models Of A Free Module]
Let $S=\{s_1,\dots,s_n\}$ be a chosen numbering of the finite set $S$. Let $L$ be the formal free $R$-module with basis symbols $[s]$ for $s\in S$, so every element of $L$ is uniquely written as
\begin{align*}
\sum_{i=1}^n r_i[s_i].
\end{align*}
Define
\begin{align*}
T:L\to R^n,\qquad
T\left(\sum_{i=1}^n r_i[s_i]\right)=(r_1,\dots,r_n).
\end{align*}
If $x=\sum_i r_i[s_i]$, $y=\sum_i q_i[s_i]$, and $a\in R$, then
\begin{align*}
T(x+y)
&=T\left(\sum_{i=1}^n (r_i+q_i)[s_i]\right)\\
&=(r_1+q_1,\dots,r_n+q_n)\\
&=(r_1,\dots,r_n)+(q_1,\dots,q_n)\\
&=T(x)+T(y),
\end{align*}
and
\begin{align*}
T(ax)
&=T\left(\sum_{i=1}^n (ar_i)[s_i]\right)\\
&=(ar_1,\dots,ar_n)\\
&=a(r_1,\dots,r_n)\\
&=aT(x).
\end{align*}
Thus $T$ is $R$-linear.
Let $e_i\in R^n$ be the standard basis vector with $1$ in the $i$th coordinate and $0$ elsewhere. Define
\begin{align*}
V:R^n\to L,\qquad
V(r_1,\dots,r_n)=\sum_{i=1}^n r_i[s_i].
\end{align*}
Then
\begin{align*}
(V\circ T)\left(\sum_{i=1}^n r_i[s_i]\right)
&=V(r_1,\dots,r_n)\\
&=\sum_{i=1}^n r_i[s_i],
\end{align*}
and
\begin{align*}
(T\circ V)(r_1,\dots,r_n)
&=T\left(\sum_{i=1}^n r_i[s_i]\right)\\
&=(r_1,\dots,r_n).
\end{align*}
Hence $V\circ T=1_L$ and $T\circ V=1_{R^n}$, so $T$ is an $R$-linear isomorphism.
Both models represent the same functor
\begin{align*}
M\longmapsto \operatorname{Hom}_{\mathbf{Set}}(S,U(M)).
\end{align*}
For $L$, a homomorphism $\varphi:L\to M$ is sent to the function
\begin{align*}
s\longmapsto \varphi([s]).
\end{align*}
Given any function $a:S\to U(M)$, the corresponding homomorphism is
\begin{align*}
\widetilde a\left(\sum_{i=1}^n r_i[s_i]\right)
=\sum_{i=1}^n r_i a(s_i),
\end{align*}
and for each $j$,
\begin{align*}
\widetilde a([s_j])
&=\widetilde a\left(\sum_{i=1}^n \delta_{ij}[s_i]\right)\\
&=\sum_{i=1}^n \delta_{ij}a(s_i)\\
&=a(s_j).
\end{align*}
For $R^n$, a homomorphism $\psi:R^n\to M$ is sent to the function $s_i\mapsto \psi(e_i)$. Given the same $a:S\to U(M)$, the corresponding homomorphism is
\begin{align*}
\widehat a(r_1,\dots,r_n)=\sum_{i=1}^n r_i a(s_i),
\end{align*}
and
\begin{align*}
\widehat a(e_j)
&=\widehat a(0,\dots,0,1,0,\dots,0)\\
&=\sum_{i=1}^n \delta_{ij}a(s_i)\\
&=a(s_j).
\end{align*}
Thus $L$ and $R^n$ classify exactly the same choice of elements of $M$ indexed by $S$.
The isomorphism $T$ is the unique compatible one: for each generator $[s_i]$,
\begin{align*}
T([s_i])
&=T(0[s_1]+\cdots+1[s_i]+\cdots+0[s_n])\\
&=e_i.
\end{align*}
So the formal linear-combination model and the coordinate model $R^n$ differ only by the chosen numbering of $S$; their shared representing property forces the map sending each generator to the corresponding standard basis vector.
[/example]
The example illustrates why a coordinate description such as $R^n$ is secondary. The numbering of $S$ helps build a convenient model, but the representing property is what identifies the model with every other construction of the free module.
[remark: What This Chapter Does Not Use]
We have used products and tensor products as examples of universal properties, but we have not developed general limits, colimits, or adjunctions. Those belong to the next stage of category theory. The present point is narrower: Yoneda already explains why representable universal properties determine their objects and classify natural operations.
[/remark]
The practical lesson is simple: whenever a construction is defined by a natural family of bijections, the representing object is determined by Yoneda. This turns many uniqueness proofs into a single reusable argument and makes naturality the main condition to check.
With Yoneda in hand, category theory becomes a method for organizing familiar mathematics rather than an isolated abstraction. The final chapter uses that method to show how the same categorical ideas connect algebra, topology, and other areas through common patterns of naturality and representation.
# 10. Categorical Thinking Across Mathematics
Category theory becomes most useful when it stops being a separate subject and starts functioning as a way to compare familiar mathematical worlds. Chapters 1 through 9 introduced categories, functors, natural transformations, equivalences, representability, and the Yoneda lemma inside an abstract language. This final chapter revisits algebra, topology, and analysis to show how the same categorical questions organise constructions that students already know: which maps preserve structure, which objects are determined only up to isomorphism, and which assignments are functorial rather than merely notational.
The guiding test is simple: if a construction is genuinely intrinsic, it should behave well under the correct notion of sameness. For objects inside one category, that means invariance under isomorphism. For whole categories, it means invariance under equivalence. For assignments between mathematical settings, it means functoriality and, when choices are involved, naturality. The recurring obstruction is that informal constructions often depend on representatives, coordinates, generators, basepoints, or choices; categorical thinking is the discipline of detecting and removing that dependence.
## Algebraic Categories and Structure-Preserving Maps
What changes when groups, rings, modules, and algebras are viewed as categories rather than as separate lists of definitions? The categorical viewpoint asks first for the morphisms, since the morphisms encode which features of the objects are visible to the theory. Algebra supplies the most direct examples: the objects carry operations, and the morphisms preserve those operations.
[definition: Category of Groups]
The category $\mathbf{Grp}$ has groups as objects and group homomorphisms as morphisms. Composition is ordinary composition of functions, and the identity morphism on a group $G$ is $\operatorname{id}_G$.
[/definition]
Ring-theoretic examples require one extra convention before the category is meaningful. If rings have multiplicative identities, then the morphisms should preserve those identities; otherwise universal properties involving units, polynomial rings, and spectra would be stated in the wrong ambient category.
[definition: Category of Rings]
The category $\mathbf{Ring}$ has unital rings as objects and unital ring homomorphisms as morphisms. Composition is ordinary composition of functions, and the identity morphism on a ring $R$ is $\operatorname{id}_R$.
[/definition]
Modules introduce a different obstruction: the scalar action is part of the structure, so ordinary group homomorphisms are too weak. To make tensor products, linear duals, and exact algebraic constructions categorical, the ambient category must remember a fixed base ring and allow only maps compatible with its action.
[definition: Category of Modules]
Let $R$ be a ring. The category $R\text{-}\mathbf{Mod}$ has left $R$-modules as objects. For left $R$-modules $M$ and $N$, the hom-set $\operatorname{Hom}_{R\text{-}\mathbf{Mod}}(M,N)$ is the set of $R$-linear maps $M\to N$. Composition is ordinary composition of $R$-linear maps, and the identity morphism on an $R$-module $M$ is $\operatorname{id}_M$.
[/definition]
The difference between module maps and arbitrary functions is not cosmetic; it changes which arrows exist. A small module over $\mathbb Z/2\mathbb Z$ makes this visible by letting us count both classes of maps explicitly.
[example: Forgetting Structure Changes The Category]
Let $R=\mathbb{Z}/2\mathbb{Z}=\{0,1\}$ and view $M=R$ as a left $R$-module over itself. If $f:M\to M$ is $R$-linear, then for each $a\in R$,
\begin{align*}
f(a)&=f(a\cdot 1)\\
&=a\cdot f(1),
\end{align*}
so $f$ is determined by the single value $f(1)$. Since $f(1)\in R=\{0,1\}$, there are only two possibilities. If $f(1)=0$, then
\begin{align*}
f(0)&=0\cdot f(1)=0\cdot 0=0,\\
f(1)&=1\cdot f(1)=1\cdot 0=0,
\end{align*}
so $f$ is multiplication by $0$. If $f(1)=1$, then
\begin{align*}
f(0)&=0\cdot f(1)=0\cdot 1=0,\\
f(1)&=1\cdot f(1)=1\cdot 1=1,
\end{align*}
so $f$ is multiplication by $1$, the identity map.
If we forget the module structure and keep only the underlying set $\{0,1\}$, then a function $R\to R$ is obtained by choosing independently the two values $h(0)$ and $h(1)$. Each has two choices, so there are $2\cdot 2=4$ set maps:
\begin{align*}
0\mapsto 0,\ 1\mapsto 0;\qquad
0\mapsto 0,\ 1\mapsto 1;\qquad
0\mapsto 1,\ 1\mapsto 0;\qquad
0\mapsto 1,\ 1\mapsto 1.
\end{align*}
Only the first two are $R$-linear, because every $R$-linear map must satisfy $f(0)=0\cdot f(1)=0$. Thus changing from $R$-linear maps to arbitrary set maps changes the category: the same object has more morphisms once the scalar action is no longer required to be preserved.
[/example]
A recurring lesson is that categorical properties may agree with set-theoretic ones in familiar categories, but the agreement is a theorem rather than a definition. Monomorphisms and epimorphisms are cancellation properties of morphisms, so they make sense in every category.
In $\mathbf{Set}$, the question is whether left cancellation by a function is exactly the same as elementwise injectivity. Elements can be tested by maps from a singleton, so the categorical definition should recover the familiar condition if those tests are sufficient.
[quotetheorem:3952]
[citeproof:3952]
The theorem works because elements of a set can be represented by maps from a singleton, so elementwise equality can be tested by morphisms. The hypothesis that the ambient category is $\mathbf{Set}$ is essential: in a general category there may be no singleton-like object that probes points in this way. The result therefore illustrates a method rather than a definition: translate an elementwise property into a cancellation property, then check whether the chosen category has enough morphisms to recover the familiar notion.
The corresponding statement for epimorphisms holds in $\mathbf{Set}$, but not in every algebraic category. This difference is one reason categorical definitions should not be replaced by elementwise analogues.
[example: A Non-Surjective Epimorphism]
Let $i:\mathbb{Z}\hookrightarrow\mathbb{Q}$ be the usual inclusion. This map is not surjective because, for example, $1/2\in\mathbb{Q}$ is not an integer. We show that $i$ is nevertheless an epimorphism in the category of unital commutative rings. Let $f,g:\mathbb{Q}\to R$ be unital ring homomorphisms such that $f\circ i=g\circ i$, so $f(m)=g(m)$ for every $m\in\mathbb{Z}$.
Fix a nonzero integer $n$. Since $n\cdot (1/n)=1$ in $\mathbb{Q}$, applying $f$ gives
\begin{align*}
f(n)f(1/n)&=f(n\cdot 1/n)\\
&=f(1)\\
&=1.
\end{align*}
Similarly,
\begin{align*}
g(n)g(1/n)&=g(n\cdot 1/n)\\
&=g(1)\\
&=1.
\end{align*}
Because $f(n)=g(n)$, both $f(1/n)$ and $g(1/n)$ are inverses of the same element $f(n)=g(n)$. Inverses are unique: if $rx=1$ and $ry=1$, then in the commutative ring $R$,
\begin{align*}
x&=x\cdot 1\\
&=x(ry)\\
&=(xr)y\\
&=(rx)y\\
&=1\cdot y\\
&=y.
\end{align*}
Hence $f(1/n)=g(1/n)$.
Now let $a/n\in\mathbb{Q}$ with $a\in\mathbb{Z}$ and $n\ne 0$. Using multiplicativity and the agreement on integers,
\begin{align*}
f(a/n)&=f(a\cdot 1/n)\\
&=f(a)f(1/n)\\
&=g(a)g(1/n)\\
&=g(a\cdot 1/n)\\
&=g(a/n).
\end{align*}
Thus $f$ and $g$ agree on every rational number, so $f=g$. Therefore precomposition with $\mathbb{Z}\hookrightarrow\mathbb{Q}$ is cancellative on the right, which is exactly why the inclusion is an epimorphism despite not being surjective.
[/example]
Algebraic constructions are often best identified by universal properties. The tensor product is a central example: it converts bilinear maps into linear maps in a universal way.
The obstruction is that a concrete construction of $M\otimes_R N$ may depend on generators and relations, while the object itself should be independent of that presentation. The universal property solves this by specifying the object through how every bilinear map out of $M\times N$ factors through it.
[quotetheorem:3990]
[citeproof:3990]
This theorem does more than construct an object. The bilinearity hypothesis is essential: without it, there is no reason a map out of $M\times N$ should factor through a linear map from a module. The universal quantifier over all target modules $P$ is also essential, because the tensor product is characterised by its behaviour against every possible recipient of a bilinear map, not by a single chosen formula. The theorem does not say that a tensor product has a preferred basis or a preferred concrete presentation; it says that every use of bilinear data through a linear map factors through the same object. That is why the next step is uniqueness up to unique isomorphism.
[quotetheorem:3991]
[citeproof:3991]
The word unique matters, since an object determined only up to some unspecified isomorphism would still carry hidden choices. For tensor products, the categorical content is not a preferred quotient presentation of $M\otimes_R N$ but the stable mapping property shared by every valid construction.
This uniqueness result is what lets mathematicians write "the" tensor product after choosing any construction. Different models may have different underlying sets or presentations, but the universal property identifies them by a unique isomorphism compatible with the bilinear maps.
## Contravariant Algebraic Constructions
How can a map of rings produce a map of spaces in the opposite direction? The obstruction is that many algebraic maps behave like pullback maps on functions: a function on the target can be composed with a map from the source, so the induced operation reverses direction. Category theory gives a precise vocabulary for this reversal: a contravariant construction is a functor out of the opposite category, or equivalently a functor whose induced maps reverse arrows.
[definition: Contravariant Functor]
Let $\mathcal{C}$ and $\mathcal{D}$ be categories. A contravariant functor from $\mathcal{C}$ to $\mathcal{D}$ is a functor $F:\mathcal{C}^{\mathrm{op}}\to\mathcal{D}$.
[/definition]
After the definition, one may say that such a functor assigns to each morphism $f:X\to Y$ in $\mathcal{C}$ a morphism $F(f):F(Y)\to F(X)$ in $\mathcal{D}$, while preserving identities and composition with the order reversed. The opposite category is not a notational trick; it is what makes the functor axioms match the reversed direction of the induced maps. Without recording the variance, formulas that look compatible object-by-object can fail to compose in the right order.
[example: Spectrum as a Contravariant Construction]
Let $\mathbf{CRing}$ denote the category of commutative unital rings. For a unital ring homomorphism $\varphi:R\to S$, define
\begin{align*}
\varphi^*:\operatorname{Spec}(S)&\to \operatorname{Spec}(R),\\
\mathfrak q&\mapsto \varphi^{-1}(\mathfrak q).
\end{align*}
We first check that $\varphi^{-1}(\mathfrak q)$ is a prime ideal of $R$ whenever $\mathfrak q$ is a prime ideal of $S$. If $a,b\in \varphi^{-1}(\mathfrak q)$ and $r\in R$, then
\begin{align*}
\varphi(a+b)&=\varphi(a)+\varphi(b)\in \mathfrak q,\\
\varphi(ra)&=\varphi(r)\varphi(a)\in \mathfrak q,
\end{align*}
so $\varphi^{-1}(\mathfrak q)$ is an ideal. It is proper because if $1_R\in\varphi^{-1}(\mathfrak q)$, then
\begin{align*}
1_S=\varphi(1_R)\in \mathfrak q,
\end{align*}
contradicting that a prime ideal is proper. Finally, if $ab\in\varphi^{-1}(\mathfrak q)$, then
\begin{align*}
\varphi(a)\varphi(b)=\varphi(ab)\in\mathfrak q.
\end{align*}
Since $\mathfrak q$ is prime, $\varphi(a)\in\mathfrak q$ or $\varphi(b)\in\mathfrak q$, so $a\in\varphi^{-1}(\mathfrak q)$ or $b\in\varphi^{-1}(\mathfrak q)$. Thus $\varphi^{-1}(\mathfrak q)$ is prime, and $\varphi^*$ is well-defined.
The direction is reversed because inverse images compose in the opposite order. If $R\xrightarrow{\varphi}S\xrightarrow{\psi}T$ are unital ring homomorphisms and $\mathfrak p\in\operatorname{Spec}(T)$, then
\begin{align*}
(\psi\circ\varphi)^*(\mathfrak p)
&=(\psi\circ\varphi)^{-1}(\mathfrak p)\\
&=\{r\in R:\psi(\varphi(r))\in\mathfrak p\}\\
&=\{r\in R:\varphi(r)\in\psi^{-1}(\mathfrak p)\}\\
&=\varphi^{-1}(\psi^{-1}(\mathfrak p))\\
&=\varphi^*(\psi^*(\mathfrak p)).
\end{align*}
Also $\operatorname{id}_R^*(\mathfrak p)=\operatorname{id}_R^{-1}(\mathfrak p)=\mathfrak p$, so identities are preserved.
With the Zariski topology, continuity follows on closed sets. For an ideal $I\subset R$, let $V(I)=\{\mathfrak p\in\operatorname{Spec}(R):I\subset\mathfrak p\}$. Then
\begin{align*}
(\varphi^*)^{-1}(V(I))
&=\{\mathfrak q\in\operatorname{Spec}(S):\varphi^*(\mathfrak q)\in V(I)\}\\
&=\{\mathfrak q\in\operatorname{Spec}(S):I\subset \varphi^{-1}(\mathfrak q)\}\\
&=\{\mathfrak q\in\operatorname{Spec}(S):\varphi(I)\subset \mathfrak q\}\\
&=V((\varphi(I))),
\end{align*}
where $(\varphi(I))$ denotes the ideal of $S$ generated by $\varphi(I)$. This is closed in $\operatorname{Spec}(S)$, so $\varphi^*$ is continuous. Therefore $\operatorname{Spec}$ sends rings to topological spaces and ring maps to continuous maps in the opposite direction, giving a contravariant functor $\operatorname{Spec}:\mathbf{CRing}^{\mathrm{op}}\to\mathbf{Top}$.
[/example]
The prime-spectrum construction motivates a general verification problem. When an assignment reverses arrows, we must check the contravariant identity and composition laws, otherwise the construction is only a rule on objects and maps rather than a functor.
[quotetheorem:3992]
[citeproof:3992]
The theorem is deliberately elementary, but it is a practical test used throughout mathematics. The object assignment alone is never enough: a construction that assigns reasonable objects may still fail to respect identities or composition. The composition condition is the stronger check in most examples, because it detects whether choices made while defining $F(f)$ are compatible across successive morphisms. This is why functoriality is treated as evidence that a construction is intrinsic rather than presentation-dependent.
## Topological Examples and Homotopical Motivation
What does category theory see in topology beyond continuous maps between spaces? It sees that many topological invariants are not just numbers or groups attached to spaces, but functors that transport information along maps. This distinction is essential when one wants invariants to compare spaces systematically.
[definition: Category of Topological Spaces]
The category $\mathbf{Top}$ has topological spaces as objects and continuous maps as morphisms. Composition is ordinary composition of continuous maps, and the identity morphism on a space $X$ is $\operatorname{id}_X$.
[/definition]
Pointed invariants force us to remember more than the underlying space. A continuous map $f:X\to Y$ does not determine a homomorphism $\pi_1(X,x_0)\to\pi_1(Y,y_0)$ unless the target basepoint is tied to $f(x_0)$, so the category must build the marked point into the objects and require maps to preserve it.
[definition: Category of Pointed Topological Spaces]
The category $\mathbf{Top}_*$ has pairs $(X,x_0)$, where $X$ is a topological space and $x_0\in X$, as objects. A morphism $f:(X,x_0)\to(Y,y_0)$ is a continuous map $f:X\to Y$ such that $f(x_0)=y_0$.
[/definition]
The next problem is to verify that the fundamental group respects this pointed category. A based map sends loops based at $x_0$ to loops based at $y_0$, and functoriality asks that this assignment preserve identity maps and composition of pointed maps.
[quotetheorem:3993]
[citeproof:3993]
The based-map hypothesis is essential because it guarantees that image loops begin and end at the chosen basepoint $y_0$. The theorem does not say that the fundamental group is a functor on all of $\mathbf{Top}$ without choices; changing basepoints usually requires extra path data. Its importance is that a topological map now produces algebraic information in a way compatible with composition, so topological questions can be transported into group-theoretic ones.
[example: Homology Detects A Boundary That Homotopy-Invariant Numbers Miss]
For each $n\ge 0$, singular homology assigns to a topological space $X$ an abelian group $H_n(X)$ and to a continuous map $f:X\to Y$ a group homomorphism $H_n(f):H_n(X)\to H_n(Y)$, with
\begin{align*}
H_n(\operatorname{id}_X)&=\operatorname{id}_{H_n(X)},\\
H_n(g\circ f)&=H_n(g)\circ H_n(f)
\end{align*}
for composable continuous maps $X\xrightarrow{f}Y\xrightarrow{g}Z$. Suppose, toward a contradiction, that there were a continuous retraction $r:D^2\to S^1$ of the closed disc onto its boundary. If $i:S^1\hookrightarrow D^2$ is the inclusion, the retraction condition says
\begin{align*}
r\circ i=\operatorname{id}_{S^1}.
\end{align*}
Applying $H_1$ and using functoriality gives
\begin{align*}
H_1(r)\circ H_1(i)
&=H_1(r\circ i)\\
&=H_1(\operatorname{id}_{S^1})\\
&=\operatorname{id}_{H_1(S^1)}.
\end{align*}
Now $H_1(S^1)\cong \mathbb Z$ and $H_1(D^2)=0$. Therefore $H_1(i):H_1(S^1)\to H_1(D^2)$ has codomain the zero group, so for every $a\in H_1(S^1)$,
\begin{align*}
H_1(i)(a)=0.
\end{align*}
Then for every $a\in H_1(S^1)$,
\begin{align*}
(H_1(r)\circ H_1(i))(a)
&=H_1(r)(H_1(i)(a))\\
&=H_1(r)(0)\\
&=0,
\end{align*}
because a group homomorphism sends $0$ to $0$. Thus $H_1(r)\circ H_1(i)$ is the zero homomorphism on $H_1(S^1)$. Under the identification $H_1(S^1)\cong\mathbb Z$, the identity sends $1$ to $1$, while the zero homomorphism sends $1$ to $0$, so the composite cannot equal $\operatorname{id}_{H_1(S^1)}$. This contradiction shows that no such retraction exists; the functorial shape of homology turns a proposed topological map into an impossible algebraic factorization.
[/example]
The homology example shows that functorial invariants often care only about maps up to homotopy. To make that insensitivity part of the category itself, we replace individual continuous maps by homotopy classes of maps.
[definition: Homotopy Category of Spaces]
The homotopy category $\mathbf{hTop}$ has topological spaces as objects and homotopy classes of continuous maps as morphisms. Composition is induced by composition of representatives.
[/definition]
The definition relies on the fact that composing homotopic maps with fixed maps preserves homotopy classes, so composition of representatives is independent of the chosen representatives. The resulting category is a first glimpse of a broader theme: sometimes the right category is obtained by identifying morphisms that the intended invariants already treat as equivalent.
## Analytic Categories and Controlled Maps
What is the correct category for analytic objects, where functions may need to be continuous, linear, bounded, isometric, or compact? The obstruction is that a merely algebraic linear map between infinite-dimensional normed spaces can ignore convergence, Cauchy sequences, and limits. Analysis forces the morphism question into the foreground. The same underlying vector spaces can form different categories depending on which maps preserve the structure relevant to the problem.
[definition: Category of Normed Spaces]
The category $\mathbf{Norm}$ has normed vector spaces over a fixed field $k\in\{\mathbb{R},\mathbb{C}\}$ as objects and bounded linear maps as morphisms.
[/definition]
Many analytic arguments require limits of Cauchy sequences, and those limits may not exist in a merely normed space. To make completeness part of the categorical setting, we restrict the objects to Banach spaces while keeping bounded linear maps as the morphisms.
[definition: Category of Banach Spaces]
The category $\mathbf{Ban}$ has Banach spaces over a fixed field $k\in\{\mathbb{R},\mathbb{C}\}$ as objects and bounded linear maps as morphisms.
[/definition]
After Banach spaces, the next analytic setting is one where the norm comes from an inner product. Orthogonality, projections, and adjoints are Hilbert-space phenomena, but the morphisms still need boundedness so that maps preserve the analytic topology as well as the linear structure.
[definition: Category of Hilbert Spaces]
The category $\mathbf{Hilb}$ has Hilbert spaces over a fixed field $k\in\{\mathbb{R},\mathbb{C}\}$ as objects and bounded linear maps as morphisms.
[/definition]
Boundedness is the categorical form of continuity for linear maps between normed spaces. It ensures that composition remains inside the category and that analytic structure is respected by morphisms.
[example: Why Bounded Maps Are The Morphisms]
Let $T:X\to Y$ and $S:Y\to Z$ be bounded linear maps between normed spaces. By boundedness, there are constants $C_T>0$ and $C_S>0$ such that for every $x\in X$ and every $y\in Y$,
\begin{align*}
|T x|_Y&\le C_T|x|_X,\\
|S y|_Z&\le C_S|y|_Y.
\end{align*}
Then $S\circ T:X\to Z$ is linear because, for $x,x'\in X$ and $\lambda\in k$,
\begin{align*}
(S\circ T)(x+\lambda x')
&=S(T(x+\lambda x'))\\
&=S(Tx+\lambda Tx')\\
&=S(Tx)+\lambda S(Tx')\\
&=(S\circ T)(x)+\lambda(S\circ T)(x').
\end{align*}
It is also bounded, since for every $x\in X$,
\begin{align*}
|(S\circ T)(x)|_Z
&=|S(Tx)|_Z\\
&\le C_S|Tx|_Y\\
&\le C_S C_T |x|_X.
\end{align*}
Thus bounded linear maps are closed under composition, so they form a viable class of morphisms for normed and Banach spaces.
The boundedness condition also records the topology induced by the norm. If $x_n\to 0$ in $X$, then
\begin{align*}
|T x_n|_Y\le C_T|x_n|_X\to 0,
\end{align*}
so $T x_n\to 0$ in $Y$. Arbitrary linear maps need not satisfy any inequality of this form, and therefore need not preserve convergence. The category $\mathbf{Ban}$ uses bounded linear maps because its morphisms should respect both the vector-space operations and the analytic structure carried by the norm.
[/example]
This example identifies the analytic reason for the morphism convention. Boundedness is not an extra decoration on a linear map; it is the condition that makes the map compatible with the topology induced by the norm. The next theorem explains why the terms bounded linear map and continuous linear map are interchangeable in this setting.
[quotetheorem:3994]
[citeproof:3994]
Linearity is essential in the theorem: for nonlinear maps, continuity and a global bound of the form $|T x|_Y\le C|x|_X$ are different conditions. The normed-space hypotheses are also essential because boundedness is measured using the two norms. The theorem justifies the morphisms in $\mathbf{Norm}$ and $\mathbf{Ban}$: requiring boundedness is exactly requiring continuity while staying within linear maps. This connection prepares the universal-property view of completion, where the relevant maps are uniformly continuous rather than merely continuous.
Completion raises a universal extension problem. Its usual construction uses Cauchy sequences, but its categorical meaning is that maps from the original space into complete targets should extend uniquely across the completed space.
[quotetheorem:3995]
[citeproof:3995]
The theorem is intentionally phrased as a mapping property. Even when technical hypotheses vary between versions, the categorical idea remains: completion is determined by the extension behaviour of maps into complete spaces.
This is the analytic analogue of the universal properties from algebra. A completion is not valuable merely because it is a larger space; it is valuable because every compatible map from the original space into a complete target extends uniquely through it.
## Invariance Under Isomorphism and Equivalence
Which features of a category are intrinsic, and which depend on its presentation? The obstruction is that categories often contain redundant copies of the same object, or encode the same theory using different-looking objects. Earlier chapters showed that isomorphism is the right sameness relation between objects of a category, while equivalence is the right sameness relation between categories. This distinction lets us compare mathematical theories that have different literal objects but the same categorical content.
[definition: Categorical Property of Objects]
A property $P$ of objects in a category $\mathcal{C}$ is categorical if whenever $X\cong Y$ in $\mathcal{C}$, the object $X$ has property $P$ if and only if $Y$ has property $P$.
[/definition]
The object-level notion is not enough when comparing entire categories. Two categories can present the same mathematics with different literal objects, so we need a stronger standard for properties or constructions that should survive replacing a category by an equivalent one.
[definition: Equivalence-Invariant Property]
A property or construction of categories is invariant under equivalence if every equivalence of categories $F:\mathcal{C}\to\mathcal{D}$ transports the property or construction from $\mathcal{C}$ to the corresponding property or construction in $\mathcal{D}$.
[/definition]
To use equivalence as sameness, we must know what it preserves internally. The first check is object-level isomorphism: if two objects are isomorphic in $\mathcal C$, an equivalence should carry them to isomorphic objects in $\mathcal D$, and every isomorphism relation in the target should be reflected up to the inverse equivalence.
[quotetheorem:3996]
[citeproof:3996]
The theorem says that equivalences preserve the object-level sameness relation relevant inside a category. Fullness and faithfulness are the hypotheses that prevent morphism information from being lost, while essential surjectivity prevents objects of the target from being missed up to isomorphism. The statement does not claim that equivalent categories have literally equal objects or equal hom-sets. The next theorem packages these three operational conditions into a criterion for recognising equivalences.
[quotetheorem:3997]
[citeproof:3997]
This theorem explains why equivalence is the operational notion of sameness for categories. Fullness and faithfulness say that every hom-set is represented exactly, so no morphisms are added, removed, or identified. Essential surjectivity is needed because a functor can be perfectly faithful on morphisms while still describing only a proper part of the target category. The theorem does not construct a unique quasi-inverse; it constructs one after making choices of representatives and isomorphisms, and different choices lead to naturally isomorphic results. Thus a functor loses no morphism information, hits every object up to isomorphism, and recovers the target category up to categorical sameness.
[example: Skeletons and Presentation]
Let $\mathcal S$ be a skeleton of a category $\mathcal C$, viewed as the full subcategory whose objects are the chosen representatives. Let
\begin{align*}
J:\mathcal S\to\mathcal C
\end{align*}
be the inclusion functor. For objects $A,B\in\mathcal S$, fullness of the subcategory means
\begin{align*}
\operatorname{Hom}_{\mathcal S}(A,B)=\operatorname{Hom}_{\mathcal C}(A,B).
\end{align*}
Therefore the induced map
\begin{align*}
J_{A,B}:\operatorname{Hom}_{\mathcal S}(A,B)\to \operatorname{Hom}_{\mathcal C}(J(A),J(B))
\end{align*}
is the identity function on this hom-set. Hence $J_{A,B}$ is both injective and surjective, so $J$ is faithful and full.
Now take any object $X\in\mathcal C$. Since $\mathcal S$ contains exactly one representative from each isomorphism class, there is an object $S_X\in\mathcal S$ such that $S_X\cong X$ in $\mathcal C$. Because $J(S_X)=S_X$ as an object of $\mathcal C$, this says
\begin{align*}
J(S_X)\cong X.
\end{align*}
Thus every object of $\mathcal C$ is isomorphic to one in the image of $J$, so $J$ is essentially surjective. By the equivalence criterion above, $J$ is an equivalence of categories.
The point is that passing to a skeleton removes duplicate isomorphic copies of objects without changing the category up to equivalence. It may still change the category strictly: if $\mathcal C$ contains two distinct isomorphic objects while $\mathcal S$ contains only one representative of their common isomorphism class, then the inclusion cannot be an isomorphism of categories, because it does not give a one-to-one equality-level correspondence of objects.
[/example]
## Yoneda As The Final Test Case
How much of an object is remembered by all maps out of it or all maps into it? The obstruction is that objects may have no preferred elements, coordinates, or presentations, so direct comparison can be misleading. The Yoneda lemma gives the sharp answer: an object is completely detected by the functor it represents. This result turns categorical thinking into a powerful method for computing natural transformations.
[quotetheorem:3998]
[citeproof:3998]
The locally small hypothesis ensures that the hom-collections appearing in the statement are actual sets, so that the functor $\operatorname{Hom}_{\mathcal{C}}(A,-)$ lands in $\mathbf{Set}$. The proof shows why the identity morphism $\operatorname{id}_A$ is decisive: naturality transports its value along every map $A\to X$. The lemma does not merely count natural transformations; it identifies them with elements of $F(A)$ in a way compatible with changing both the representing object and the target functor. The final example specialises this principle to representable functors on both sides.
[example: Natural Transformations Between Representables]
Take $F=\operatorname{Hom}_{\mathcal{C}}(B,-)$ in the *Yoneda Lemma*. It gives a bijection
\begin{align*}
\operatorname{Nat}(\operatorname{Hom}_{\mathcal{C}}(A,-),\operatorname{Hom}_{\mathcal{C}}(B,-))
&\cong \operatorname{Hom}_{\mathcal{C}}(B,A),
\end{align*}
and we now spell out what this bijection does.
Given a morphism $u:B\to A$, define a family of functions
\begin{align*}
\eta^u_X:\operatorname{Hom}_{\mathcal{C}}(A,X)&\to \operatorname{Hom}_{\mathcal{C}}(B,X),\\
f&\mapsto f\circ u.
\end{align*}
To check naturality, let $g:X\to Y$ be a morphism in $\mathcal{C}$ and let $f:A\to X$. The covariant hom-functors send $g$ to postcomposition by $g$, so
\begin{align*}
\operatorname{Hom}_{\mathcal{C}}(B,g)(\eta^u_X(f))
&=\operatorname{Hom}_{\mathcal{C}}(B,g)(f\circ u)\\
&=g\circ(f\circ u)\\
&=(g\circ f)\circ u\\
&=\eta^u_Y(g\circ f)\\
&=\eta^u_Y(\operatorname{Hom}_{\mathcal{C}}(A,g)(f)).
\end{align*}
Thus $\eta^u$ is a natural transformation.
Conversely, let
\begin{align*}
\eta:\operatorname{Hom}_{\mathcal{C}}(A,-)\to \operatorname{Hom}_{\mathcal{C}}(B,-)
\end{align*}
be a natural transformation. Its component at $A$ sends $\operatorname{id}_A:A\to A$ to a morphism
\begin{align*}
u:=\eta_A(\operatorname{id}_A):B\to A.
\end{align*}
For any morphism $f:A\to X$, naturality of $\eta$ with respect to $f:A\to X$ gives
\begin{align*}
\eta_X(\operatorname{Hom}_{\mathcal{C}}(A,f)(\operatorname{id}_A))
&=\operatorname{Hom}_{\mathcal{C}}(B,f)(\eta_A(\operatorname{id}_A)).
\end{align*}
The two sides expand as
\begin{align*}
\operatorname{Hom}_{\mathcal{C}}(A,f)(\operatorname{id}_A)&=f\circ \operatorname{id}_A=f,\\
\operatorname{Hom}_{\mathcal{C}}(B,f)(\eta_A(\operatorname{id}_A))&=f\circ u.
\end{align*}
Therefore
\begin{align*}
\eta_X(f)=f\circ u.
\end{align*}
So every natural transformation is exactly precomposition by the single morphism $u:B\to A$, and different choices of $u$ give different transformations because evaluating at $X=A$ and $f=\operatorname{id}_A$ recovers $u$. Natural transformations between covariant representable functors therefore run in the opposite direction from the representing objects.
[/example]
The Yoneda lemma is the endpoint of this first course because it unifies the main themes. Objects are studied through morphisms, constructions are tested by functoriality, and equality of constructions is replaced by natural isomorphism. Across algebra, topology, and analysis, category theory supplies a disciplined way to ask whether a definition depends on presentation or expresses a genuine mathematical structure.
Contents
- Introduction
- What Category Theory Is For
- The Scope of This First Course
- Mathematical Background and Conventions
- Running Examples
- What Theorems in This Course Will Look Like
- The Main Destination
- 1. Categories and Basic Examples
- The Data and Axioms of a Category
- Size, Local Smallness, and Opposite Categories
- Standard Mathematical Categories
- What the First Examples Teach
- 2. Morphisms as Structure
- Invertible Morphisms and One-Sided Inverses
- Monomorphisms and Epimorphisms as Cancellation Laws
- Initial and Terminal Objects as Universal Properties
- 3. Functors
- Covariant Functors
- Contravariant Functors And Opposite Categories
- Functors And Isomorphisms
- Faithful, Full, And Essentially Surjective Functors
- 4. Natural Transformations
- Comparing Functors Object by Object
- Composing Natural Transformations
- Natural Isomorphisms and Functor Categories
- Naturality Without Coordinates
- 5. Equivalence of Categories
- Isomorphism of Categories and the Need for Equivalence
- Full, Faithful, and Essentially Surjective Functors
- Skeletons and Categorical Invariants
- 6. Universal Properties in Practice
- Universal Mapping Properties and Uniqueness
- Free Objects Described By Maps Out Of A Set
- Quotient-Type Constructions and Factorization Properties
- Tensor Products As Representing Bilinear Maps
- 7. Hom-Functors and Representability
- Hom-Functors from a Fixed Object
- Hom-Functors into a Fixed Object
- Representable Functors
- Universal Elements
- The Category of Elements
- What Representability Buys
- 8. The Yoneda Lemma
- Natural Transformations Out of a Representable Functor
- The Contravariant Form
- Evaluating at the Identity Morphism
- Natural Transformations Between Hom-Functors
- The Yoneda Embedding
- Generalized Elements and the Meaning of Representation
- 9. Consequences of Yoneda
- Categorical Reconstruction From Hom-Sets
- Universal Properties As Representability Statements
- Natural Operations Classified By Elements
- Representability As A Habit Of Thought
- 10. Categorical Thinking Across Mathematics
- Algebraic Categories and Structure-Preserving Maps
- Contravariant Algebraic Constructions
- Topological Examples and Homotopical Motivation
- Analytic Categories and Controlled Maps
- Invariance Under Isomorphism and Equivalence
- Yoneda As The Final Test Case
Category Theory I: Categories, Functors, and Natural Transformations
Content
Problems
History
Created by admin on 5/28/2026 | Last updated on 6/1/2026
Prerequisites (0/1 completed)
Log in to track your prerequisite progress.
Prerequisites Graph
Interactive dependency map showing prerequisite concepts
Loading dependency graph...
Theorem
Definition
Current
Requires
Rate this page
★
★
★
★
★
Poor
Excellent