This course develops the categorical language needed to describe algebraic structures through universal properties, then pushes that language into the setting of limits, exactness, and abelian categories. It begins with adjunctions as the organizing principle behind many familiar constructions, such as free objects, forgetful functors, and algebraic examples from groups, rings, modules, and related structures. From there, it studies how adjunctions connect to equivalences and monads, and how universal properties reappear in the theory of limits and colimits.
The later chapters build the machinery needed for modern homological algebra. After limits and colimits, the course turns to completeness, cocontinuity, creation of limits, and Kan extensions, culminating in a preview of adjoint functor theorems. It then shifts to additive and preadditive categories, introduces kernels, cokernels, and abelian categories, and uses exact sequences and diagram lemmas to formalize the behavior of algebraic objects under functors. The final chapters treat projectives, injectives, exact functors, and a preview of derived functors, so the course moves from structural category theory to the tools that underpin much of algebra and homological algebra.
# 1. Adjunctions From Universal Properties
Adjunctions organise universal constructions by recording not just an object with a property, but a coherent way of transporting maps through that property. The chapter moves from external data, namely natural bijections of hom-sets, to internal data, namely units and counits, and then to the universal-arrow formulation that appears in concrete constructions. The main obstruction throughout is coherence: pointwise bijections or isolated universal objects are not enough unless they vary correctly with morphisms.
## Hom-Set Adjunctions
Suppose a construction assigns to each object $c$ of a category $\mathcal C$ an object $Fc$ of a category $\mathcal D$. When should $Fc$ deserve to be called the object freely generated by $c$, or the best $\mathcal D$-object associated to $c$? The answer is that maps out of $Fc$ should be controlled exactly by maps out of $c$ after applying a comparison functor back to $\mathcal C$.
A pointwise bijection of hom-sets is not enough. If the chosen bijections are unrelated to one another as $c$ and $d$ vary along morphisms, then they do not describe a construction in the categories; they describe unrelated set-theoretic coincidences. Naturality is the condition that rules out this failure.
[definition: Hom-Set Adjunction]
Let $F: \mathcal C \to \mathcal D$ and $G: \mathcal D \to \mathcal C$ be functors. An adjunction $F \dashv G$ is a family of bijections
\begin{align*}
\Phi_{c,d}: \mathcal D(Fc,d) \longrightarrow \mathcal C(c,Gd)
\end{align*}
for all objects $c \in \mathcal C$ and $d \in \mathcal D$, natural in both $c$ and $d$.
[/definition]
The functor $F$ is called the left adjoint and $G$ is called the right adjoint. Naturality means that the bijection is compatible with changing $c$ by a morphism in $\mathcal C$ and changing $d$ by a morphism in $\mathcal D$. Thus an adjunction is a coherent identification of two bifunctors
\begin{align*}
\mathcal D(F-, -), \mathcal C(-,G-) : \mathcal C^{\mathrm{op}} \times \mathcal D \to \operatorname{Set}.
\end{align*}
To use this coherence in calculations, we need the explicit two-variable naturality equation. It is the rule that lets one move morphisms in the source and target through the adjunction bijection without changing the represented map.
[definition: Natural In Both Variables]
For a family $\Phi_{c,d}: \mathcal D(Fc,d) \to \mathcal C(c,Gd)$, naturality in both variables means that for every morphism $u:c' \to c$ in $\mathcal C$, every morphism $v:d \to d'$ in $\mathcal D$, and every morphism $f:Fc \to d$, one has
\begin{align*}
\Phi_{c',d'}(v \circ f \circ Fu) = Gv \circ \Phi_{c,d}(f) \circ u.
\end{align*}
[/definition]
This formula is often the most efficient way to use an adjunction. It says that the adjunct of a composite is the corresponding composite of adjuncts after applying the functors on the correct sides.
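As a small worked instance of the formula, using only the definition above, take $c'=c$, $u=\operatorname{id}_c$, $d=Fc$, and $f=\operatorname{id}_{Fc}$. For any morphism $v:Fc\to d'$, the naturality equation then reads
\begin{align*}
\Phi_{c,d'}(v)
&=\Phi_{c,d'}(v\circ \operatorname{id}_{Fc}\circ F\operatorname{id}_c)\\
&=Gv\circ \Phi_{c,Fc}(\operatorname{id}_{Fc})\circ \operatorname{id}_c\\
&=Gv\circ \Phi_{c,Fc}(\operatorname{id}_{Fc}).
\end{align*}
So every adjunct out of $Fc$ is determined by the single morphism $\Phi_{c,Fc}(\operatorname{id}_{Fc}):c\to GFc$, an observation that the section on units and counits makes systematic.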
[example: Free Group And Underlying Set]
Let $U:\operatorname{Grp}\to\operatorname{Set}$ be the forgetful functor, and let $F(S)$ be the free group on a set $S$. Its elements are reduced words
\begin{align*}
s_1^{\epsilon_1}s_2^{\epsilon_2}\cdots s_n^{\epsilon_n},
\end{align*}
where $s_i\in S$, $\epsilon_i\in\{1,-1\}$, and adjacent inverse pairs have been cancelled.
We compute the adjunction bijection
\begin{align*}
\operatorname{Grp}(F(S),H)\cong \operatorname{Set}(S,UH).
\end{align*}
Given a group homomorphism $\varphi:F(S)\to H$, define its underlying function on generators by
\begin{align*}
\Phi(\varphi):S&\longrightarrow U(H),\\
s&\longmapsto \varphi(s).
\end{align*}
Conversely, given a function $a:S\to U(H)$, define $\widetilde a:F(S)\to H$ on a reduced word by
\begin{align*}
\widetilde a(s_1^{\epsilon_1}s_2^{\epsilon_2}\cdots s_n^{\epsilon_n})
=
a(s_1)^{\epsilon_1}a(s_2)^{\epsilon_2}\cdots a(s_n)^{\epsilon_n},
\end{align*}
where $a(s)^1=a(s)$ and $a(s)^{-1}$ is the inverse of $a(s)$ in $H$. This respects cancellation because if a word contains an adjacent pair $ss^{-1}$, its image contains
\begin{align*}
a(s)a(s)^{-1}=e_H,
\end{align*}
and if it contains $s^{-1}s$, its image contains
\begin{align*}
a(s)^{-1}a(s)=e_H.
\end{align*}
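Thus deleting either adjacent inverse pair does not change the product in $H$. Since multiplication in $F(S)$ is concatenation followed by such deletions, it follows that $\widetilde a(uv)=\widetilde a(u)\,\widetilde a(v)$ for all reduced words $u$ and $v$, so $\widetilde a$ is a group homomorphism.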
The two constructions are inverse. Starting with $a:S\to U(H)$ and restricting $\widetilde a$ to generators gives
\begin{align*}
\Phi(\widetilde a)(s)=\widetilde a(s)=a(s)
\end{align*}
for every $s\in S$. Starting with a homomorphism $\varphi:F(S)\to H$, the extension of its restriction to $S$ sends a reduced word to
\begin{align*}
\widetilde{\Phi(\varphi)}(s_1^{\epsilon_1}\cdots s_n^{\epsilon_n})
&=
\Phi(\varphi)(s_1)^{\epsilon_1}\cdots \Phi(\varphi)(s_n)^{\epsilon_n}\\
&=
\varphi(s_1)^{\epsilon_1}\cdots \varphi(s_n)^{\epsilon_n}\\
&=
\varphi(s_1^{\epsilon_1}\cdots s_n^{\epsilon_n}),
\end{align*}
where the last equality uses preservation of multiplication and inverses by the group homomorphism $\varphi$. Hence a homomorphism out of $F(S)$ is exactly a choice of elements of $H$ indexed by $S$, which is the universal property encoded by $F\dashv U$.
[/example]
The example shows the slogan: left adjoints build freely, right adjoints forget structure or record a space of observations. The word "free" is not part of the definition, but many familiar free constructions are left adjoints because their universal property is a hom-set bijection.
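The bijection in the free-group example can be sketched in a few lines of Python. This is a minimal illustration, not a general-purpose implementation: reduced words are modelled as tuples of `(generator, exponent)` pairs, the target group $H$ is assumed to be the permutations of three points, and the names `reduce_word`, `multiply`, `extend`, and `restrict` are hypothetical helpers chosen for this sketch.

```python
# Letters are pairs (generator, exponent) with exponent +1 or -1.
# A reduced word is a tuple of letters with no adjacent inverse pair.

def reduce_word(letters):
    """Cancel adjacent inverse pairs using a stack."""
    out = []
    for gen, exp in letters:
        if out and out[-1] == (gen, -exp):
            out.pop()                     # s s^-1 or s^-1 s cancels
        else:
            out.append((gen, exp))
    return tuple(out)

def multiply(u, v):
    """Product in F(S): concatenate, then reduce."""
    return reduce_word(u + v)

def extend(a, op, inv, unit):
    """Given a function a: S -> U(H) as a dict, return a~: F(S) -> H."""
    def a_tilde(word):
        result = unit
        for gen, exp in word:
            h = a[gen] if exp == 1 else inv(a[gen])
            result = op(result, h)
        return result
    return a_tilde

def restrict(phi, generators):
    """Phi: read off the values of a homomorphism on one-letter words."""
    return {s: phi(((s, 1),)) for s in generators}

# Assumed target group H: permutations of {0, 1, 2} written as tuples.
def compose(p, q):
    """(p after q)(i) = p[q[i]]."""
    return tuple(p[i] for i in q)

def inverse(p):
    out = [0] * len(p)
    for i, j in enumerate(p):
        out[j] = i
    return tuple(out)

identity = (0, 1, 2)
a = {"x": (1, 0, 2), "y": (0, 2, 1)}      # a sample function S -> U(H)
a_tilde = extend(a, compose, inverse, identity)

u = (("x", 1), ("y", -1))
v = (("y", 1), ("x", 1))
```

Under these assumptions, `a_tilde(multiply(u, v))` agrees with `compose(a_tilde(u), a_tilde(v))`, and `restrict(a_tilde, a)` returns the original dictionary `a`, which is the equation $\Phi(\widetilde a)=a$ on this sample.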
[example: Tensor Hom For Modules]
Let $R$ be a commutative ring and let $M$ be an $R$-module. We compute the adjunction
\begin{align*}
\operatorname{Hom}_R(M \otimes_R N, P)
\cong
\operatorname{Hom}_R(N,\operatorname{Hom}_R(M,P))
\end{align*}
for $R$-modules $N$ and $P$.
Given an $R$-linear map $b:M\otimes_R N\to P$, define
\begin{align*}
\Phi(b):N&\longrightarrow \operatorname{Hom}_R(M,P),\\
n&\longmapsto \bigl(m\longmapsto b(m\otimes n)\bigr).
\end{align*}
For each fixed $n\in N$, the map $m\mapsto b(m\otimes n)$ is $R$-linear because
\begin{align*}
b((m_1+m_2)\otimes n)
&=b(m_1\otimes n+m_2\otimes n)\\
&=b(m_1\otimes n)+b(m_2\otimes n),
\end{align*}
and
\begin{align*}
b((r m)\otimes n)
&=b(r(m\otimes n))\\
&=r\,b(m\otimes n).
\end{align*}
The assignment $n\mapsto \Phi(b)(n)$ is also $R$-linear, since for $m\in M$,
\begin{align*}
\Phi(b)(n_1+n_2)(m)
&=b(m\otimes(n_1+n_2))\\
&=b(m\otimes n_1+m\otimes n_2)\\
&=b(m\otimes n_1)+b(m\otimes n_2)\\
&=\bigl(\Phi(b)(n_1)+\Phi(b)(n_2)\bigr)(m),
\end{align*}
and
\begin{align*}
\Phi(b)(r n)(m)
&=b(m\otimes r n)\\
&=b(r(m\otimes n))\\
&=r\,b(m\otimes n)\\
&=\bigl(r\,\Phi(b)(n)\bigr)(m).
\end{align*}
Conversely, given an $R$-linear map $a:N\to \operatorname{Hom}_R(M,P)$, define
\begin{align*}
\beta_a:M\times N&\longrightarrow P,\\
(m,n)&\longmapsto a(n)(m).
\end{align*}
This rule is $R$-bilinear. In the first variable,
\begin{align*}
\beta_a(m_1+m_2,n)
&=a(n)(m_1+m_2)\\
&=a(n)(m_1)+a(n)(m_2),
\end{align*}
and
\begin{align*}
\beta_a(rm,n)
&=a(n)(rm)\\
&=r\,a(n)(m).
\end{align*}
In the second variable,
\begin{align*}
\beta_a(m,n_1+n_2)
&=a(n_1+n_2)(m)\\
&=(a(n_1)+a(n_2))(m)\\
&=a(n_1)(m)+a(n_2)(m),
\end{align*}
and
\begin{align*}
\beta_a(m,rn)
&=a(rn)(m)\\
&=(r\,a(n))(m)\\
&=r\,a(n)(m).
\end{align*}
Hence $\beta_a$ is $R$-bilinear, so by the universal property of the tensor product there is a unique $R$-linear map
\begin{align*}
\widetilde a:M\otimes_R N\longrightarrow P
\end{align*}
such that
\begin{align*}
\widetilde a(m\otimes n)=a(n)(m)
\end{align*}
for all $m\in M$ and $n\in N$.
The two constructions are inverse. Starting with $a:N\to \operatorname{Hom}_R(M,P)$, for every $n\in N$ and $m\in M$,
\begin{align*}
\Phi(\widetilde a)(n)(m)
&=\widetilde a(m\otimes n)\\
&=a(n)(m),
\end{align*}
so $\Phi(\widetilde a)=a$. Starting with $b:M\otimes_R N\to P$, the map $\widetilde{\Phi(b)}$ satisfies
\begin{align*}
\widetilde{\Phi(b)}(m\otimes n)
&=\Phi(b)(n)(m)\\
&=b(m\otimes n)
\end{align*}
on every pure tensor $m\otimes n$; by the uniqueness part of the tensor product universal property, $\widetilde{\Phi(b)}=b$. Thus maps out of $M\otimes_R N$ are exactly $R$-linear families of maps out of $M$ indexed by $N$, which is the adjunction $M\otimes_R -\dashv \operatorname{Hom}_R(M,-)$.
[/example]
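This second example is central because it converts bilinear maps out of $M\times N$ into ordinary linear maps: first maps out of $M\otimes_R N$, and then maps from $N$ into the hom-module $\operatorname{Hom}_R(M,P)$. Many later exactness arguments in homological algebra come from knowing which side of this adjunction preserves limits and which preserves colimits.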
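For a concrete feel for the tensor-hom bijection, here is a minimal Python sketch under strong simplifying assumptions: $R=\mathbb Z$, and $M$, $N$, $P$ are free modules $\mathbb Z^2$, $\mathbb Z^3$, $\mathbb Z^2$ with vectors written as tuples. A linear map $b:M\otimes_R N\to P$ is stored as the table of its values on basis tensors $e_i\otimes e_j$. The names `transpose` and `untranspose` are hypothetical and stand for $\Phi$ and its inverse.

```python
def transpose(b, p):
    """Phi: from the table b[i][j] = b(e_i (x) e_j) in Z^p,
    build the map n |-> (m |-> b(m (x) n))."""
    def phi_b(n):
        def at_n(m):
            return tuple(
                sum(m[i] * n[j] * b[i][j][k]
                    for i in range(len(m)) for j in range(len(n)))
                for k in range(p)
            )
        return at_n
    return phi_b

def untranspose(a, dim_m, dim_n):
    """Inverse direction: recover the table of values a(e_j)(e_i)
    on the basis tensors e_i (x) e_j."""
    def basis(dim, idx):
        return tuple(1 if t == idx else 0 for t in range(dim))
    return [[a(basis(dim_n, j))(basis(dim_m, i)) for j in range(dim_n)]
            for i in range(dim_m)]

# Sample data: dim M = 2, dim N = 3, P = Z^2.
b = [[(1, 0), (2, -1), (0, 3)],
     [(0, 1), (1, 1), (-2, 0)]]

phi_b = transpose(b, 2)
round_trip = untranspose(phi_b, 2, 3)
```

In this finite-rank setting, `round_trip` reproduces the table `b`, which mirrors the argument that $\widetilde{\Phi(b)}$ and $b$ agree on pure tensors and are therefore equal.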
[example: Product And Mapping Space]
Let $\mathcal E$ be a cartesian closed category, fix an object $X$, and write $Z^X$ for the exponential object. By definition of the exponential, there is an evaluation morphism
\begin{align*}
\operatorname{ev}_{X,Z}:Z^X\times X\longrightarrow Z
\end{align*}
such that for every object $Y$ and every morphism $f:Y\times X\to Z$, there is a unique morphism $\lambda f:Y\to Z^X$ satisfying
\begin{align*}
\operatorname{ev}_{X,Z}\circ(\lambda f\times \operatorname{id}_X)=f.
\end{align*}
Thus the transposition map is
\begin{align*}
\Phi_{Y,Z}:\mathcal E(Y\times X,Z)&\longrightarrow \mathcal E(Y,Z^X),\\
f&\longmapsto \lambda f.
\end{align*}
Conversely, from a morphism $g:Y\to Z^X$, recover a morphism $Y\times X\to Z$ by
\begin{align*}
\Psi_{Y,Z}(g)
&=\operatorname{ev}_{X,Z}\circ(g\times \operatorname{id}_X).
\end{align*}
These two operations are inverse. Starting with $f:Y\times X\to Z$, the defining equation for $\lambda f$ gives
\begin{align*}
\Psi_{Y,Z}(\Phi_{Y,Z}(f))
&=\Psi_{Y,Z}(\lambda f)\\
&=\operatorname{ev}_{X,Z}\circ(\lambda f\times \operatorname{id}_X)\\
&=f.
\end{align*}
Starting with $g:Y\to Z^X$, the morphism $g$ itself satisfies
\begin{align*}
\operatorname{ev}_{X,Z}\circ(g\times \operatorname{id}_X)=\Psi_{Y,Z}(g),
\end{align*}
so by the uniqueness clause in the exponential universal property,
\begin{align*}
\Phi_{Y,Z}(\Psi_{Y,Z}(g))=g.
\end{align*}
Therefore
\begin{align*}
\mathcal E(Y\times X,Z)\cong \mathcal E(Y,Z^X),
\end{align*}
naturally in $Y$ and $Z$. In compactly generated weak Hausdorff spaces, this says that a continuous map $Y\times X\to Z$ is equivalently a continuous map $Y\to Z^X$, whose value at $y$ is the function $x\mapsto f(y,x)$.
[/example]
This example, together with its topological instance, explains why adjunctions are not confined to algebraic freeness. They also encode exponential objects, currying, and internal function spaces whenever the ambient category supports them.
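In the category of sets, the transposition $\Phi$ and its inverse $\Psi$ are ordinary currying and uncurrying. The following Python sketch illustrates this on finite sample sets; the names `curry`, `uncurry`, and `ev` are chosen for this illustration, and Python functions only approximate morphisms in a cartesian closed category.

```python
def curry(f):
    """Phi: send f: Y x X -> Z to its transpose Y -> Z^X."""
    return lambda y: (lambda x: f(y, x))

def ev(g_y, x):
    """Evaluation Z^X x X -> Z: apply a function to an argument."""
    return g_y(x)

def uncurry(g):
    """Psi: send g: Y -> Z^X to ev composed with (g x id_X)."""
    return lambda y, x: ev(g(y), x)

# Sample morphisms on small finite sets Y and X.
f = lambda y, x: 10 * y + x
g = lambda y: (lambda x: y - x)
Y, X = range(4), range(3)
```

Checking `uncurry(curry(f))` against `f` and `curry(uncurry(g))` against `g` pointwise on `Y` and `X` is the finite shadow of the two inverse computations in the example.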
## Units And Counits
The hom-set definition is powerful, but it mentions all pairs of objects at once. Can the same adjunction be recovered from a small amount of structural data inside the two categories? The answer is yes: the identity maps of $Fc$ and $Gd$ transpose to two natural transformations called the unit and counit.
The danger is that arbitrary maps $c \to GFc$ and $FGd \to d$ need not encode inverse transposition procedures. A unit can insert data and a counit can evaluate data, but without compatibility equations the process of inserting and then evaluating may change the original map. The triangle identities are exactly the conditions excluding that defect.
[definition: Unit And Counit]
Let $F: \mathcal C \to \mathcal D$ and $G: \mathcal D \to \mathcal C$ be functors. A unit-counit adjunction consists of natural transformations
\begin{align*}
\eta &: \operatorname{id}_{\mathcal C} \Rightarrow GF, & \varepsilon &: FG \Rightarrow \operatorname{id}_{\mathcal D}
\end{align*}
satisfying the triangle identities
\begin{align*}
\varepsilon_{Fc} \circ F\eta_c &= \operatorname{id}_{Fc}, \\
G\varepsilon_d \circ \eta_{Gd} &= \operatorname{id}_{Gd}
\end{align*}
for all $c \in \mathcal C$ and $d \in \mathcal D$.
[/definition]
The unit $\eta_c:c \to GFc$ is the universal map from $c$ into the underlying object of its free image. The counit $\varepsilon_d:FGd \to d$ evaluates the free object built on the underlying data of $d$ back in $d$.
[remark: Triangle Identities As No Loss Conditions]
The first triangle identity says that applying $F$ to the insertion $\eta_c:c\to GFc$ and then evaluating with $\varepsilon_{Fc}$ changes nothing on $Fc$. The second says dually that inserting $Gd$ into $GFGd$ by $\eta_{Gd}$ and then applying $G\varepsilon_d$ returns $Gd$ unchanged. These equations prevent the unit and counit from merely producing comparison maps; they force the two transposition procedures to be inverse to each other.
[/remark]
The internal maps are not extra decorations. They are forced by the hom-set adjunction itself, because identity morphisms are among the morphisms being transposed.
This raises the structural question of whether the whole hom-set adjunction can be recovered from these two internal maps. The next result isolates the unit and counit as the canonical data extracted from an adjunction, preparing the later converse in which the triangle identities become the obstruction to reconstruction.
[quotetheorem:4143]
[citeproof:4143]
The construction tells us how to remember an adjunction in practice: keep track of the universal insertion map $\eta$ and the evaluation map $\varepsilon$. Naturality is essential here, not cosmetic. If the bijections were merely chosen separately for each pair $(c,d)$, the maps obtained by transposing identity morphisms could fail to commute with morphisms in $\mathcal C$ or $\mathcal D$, so they would not be natural transformations. The theorem extracts unit and counit data from an adjunction already known to be coherent; it does not say that arbitrary bijections of sets can be upgraded to an adjunction.
The converse problem is now the useful one. If we start only with natural transformations $\eta:\operatorname{id}_{\mathcal C}\Rightarrow GF$ and $\varepsilon:FG\Rightarrow \operatorname{id}_{\mathcal D}$, with components $\eta_c:c \to GFc$ and $\varepsilon_d:FGd \to d$, when do they recover all the hom-set bijections rather than merely giving two comparison maps? The obstruction is possible loss of information when we insert and then evaluate, so the triangle identities must be strong enough to force the two transposition formulas to undo each other.
[quotetheorem:4144]
[citeproof:4144]
Together the last two theorems show that the hom-set and unit-counit definitions carry the same mathematical content. The triangle identities are the decisive hypotheses: without them, the formulas $f \mapsto Gf \circ \eta_c$ and $g \mapsto \varepsilon_d \circ Fg$ may be well-typed but need not be inverse. Already in $\operatorname{Ab}$, take $F=G=\operatorname{id}_{\operatorname{Ab}}$, let $\eta_A:A\to A$ be multiplication by $2$, and let $\varepsilon_A=\operatorname{id}_A$. These are natural transformations, but $\varepsilon_A\circ \eta_A=2\operatorname{id}_A$, not $\operatorname{id}_A$, so the proposed transpose-and-return operation doubles a homomorphism rather than recovering it. Thus the theorem characterises exactly those unit-counit pairs that come from adjunctions, and it prepares the later monad construction $GF$ where the same identities control associativity and units.
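The doubling counterexample can be replayed numerically. The sketch below assumes $A=\mathbb Z$ and models homomorphisms $\mathbb Z\to\mathbb Z$ as Python functions on integers; `transpose` and `transpose_back` are hypothetical names for the two formulas $f\mapsto Gf\circ\eta_c$ and $g\mapsto\varepsilon_d\circ Fg$ with $F=G=\operatorname{id}$.

```python
eta = lambda x: 2 * x        # proposed unit component: multiplication by 2
eps = lambda x: x            # proposed counit component: the identity

def transpose(f):
    """f |-> Gf composed with eta, where G is the identity functor."""
    return lambda x: f(eta(x))

def transpose_back(g):
    """g |-> eps composed with Fg, where F is the identity functor."""
    return lambda x: eps(g(x))

f = lambda x: 3 * x          # a sample homomorphism Z -> Z
round_trip = transpose_back(transpose(f))
```

Here `round_trip(1)` is `6` while `f(1)` is `3`: the round trip returns $2f$ rather than $f$, exactly because $\varepsilon_A\circ\eta_A=2\operatorname{id}_A$ violates the triangle identity.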
It is worth seeing these equations in a familiar case before moving on. The free-group adjunction makes the unit and counit concrete: the unit inserts a generator, while the counit evaluates a formal word in actual group elements. In that setting the triangle identities say exactly that "make a one-letter word and evaluate it" and "freely relabel generators and evaluate" lose no information.
[example: Unit And Counit For Free Groups]
For the adjunction $F\dashv U$ from sets to groups, write $\eta_S:S\to U(F(S))$ for the function
\begin{align*}
\eta_S(s)=s,
\end{align*}
where the right-hand side means the generator of the free group $F(S)$ labelled by $s$. For a group $H$, write $[h]$ for the generator of $F(U(H))$ labelled by the element $h\in H$. The counit
\begin{align*}
\varepsilon_H:F(U(H))\longrightarrow H
\end{align*}
is the homomorphism determined on generators by
\begin{align*}
\varepsilon_H([h])=h.
\end{align*}
Thus on a reduced word in the free group on the underlying set of $H$,
\begin{align*}
\varepsilon_H([h_1]^{\epsilon_1}[h_2]^{\epsilon_2}\cdots [h_n]^{\epsilon_n})
=
h_1^{\epsilon_1}h_2^{\epsilon_2}\cdots h_n^{\epsilon_n},
\end{align*}
with the product taken in $H$.
We verify the two triangle identities. Let $w=s_1^{\epsilon_1}\cdots s_n^{\epsilon_n}$ be a reduced word in $F(S)$. The homomorphism $F\eta_S:F(S)\to F(U(F(S)))$ sends each generator $s$ to the generator $[\eta_S(s)]$ labelled by the element $\eta_S(s)\in U(F(S))$, so
\begin{align*}
(\varepsilon_{F(S)}\circ F\eta_S)(w)
&=\varepsilon_{F(S)}\bigl([\eta_S(s_1)]^{\epsilon_1}\cdots[\eta_S(s_n)]^{\epsilon_n}\bigr)\\
&=\eta_S(s_1)^{\epsilon_1}\cdots \eta_S(s_n)^{\epsilon_n}\\
&=s_1^{\epsilon_1}\cdots s_n^{\epsilon_n}\\
&=w.
\end{align*}
Hence $\varepsilon_{F(S)}\circ F\eta_S=\operatorname{id}_{F(S)}$.
For the second triangle identity, take $h\in U(H)$. The unit at the underlying set $U(H)$ sends $h$ to the generator $[h]\in F(U(H))$, and then $U\varepsilon_H$ applies the group homomorphism $\varepsilon_H$ while forgetting that the result lies in a group. Therefore
\begin{align*}
(U\varepsilon_H\circ \eta_{U(H)})(h)
&=U\varepsilon_H([h])\\
&=\varepsilon_H([h])\\
&=h.
\end{align*}
Thus $U\varepsilon_H\circ \eta_{U(H)}=\operatorname{id}_{U(H)}$. The identities say exactly that inserting generators into a free group and then evaluating loses no information, and that an element of $H$ viewed as a one-letter word evaluates back to itself.
[/example]
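The two triangle identities for free groups can also be checked mechanically on a sample word. The sketch below is illustrative only: words are tuples of `(label, exponent)` pairs, the group $H$ in the second identity is assumed to be $F(S)$ itself, and the helper names (`reduce_word`, `invert`, `F_eta`, `eps_free`) are hypothetical. The single function `eta` plays the role of both $\eta_S$ and $\eta_{U(H)}$, since each simply forms a one-letter word.

```python
def reduce_word(letters):
    """Cancel adjacent inverse pairs using a stack."""
    out = []
    for label, exp in letters:
        if out and out[-1] == (label, -exp):
            out.pop()
        else:
            out.append((label, exp))
    return tuple(out)

def invert(word):
    """Inverse in a free group: reverse the word and flip exponents."""
    return tuple((label, -exp) for label, exp in reversed(word))

def eta(s):
    """Unit: the one-letter word on the generator labelled s."""
    return ((s, 1),)

def F_eta(word):
    """F(eta_S): relabel each generator s of F(S) by the element eta_S(s)."""
    return tuple((eta(s), exp) for s, exp in word)

def eps_free(word_of_words):
    """Counit at H = F(S): labels are themselves words in F(S),
    so multiply them out (with inverses) inside F(S)."""
    result = ()
    for label, exp in word_of_words:
        factor = label if exp == 1 else invert(label)
        result = reduce_word(result + factor)
    return result

# A sample reduced word w = s t^-1 t^-1 in F({s, t}).
w = reduce_word((("s", 1), ("t", -1), ("t", -1), ("s", 1), ("s", -1)))
```

On this sample, `eps_free(F_eta(w))` returns `w`, matching the first triangle identity, and `eps_free(eta(w))` returns `w`, matching the second identity with $H=F(S)$ and $h=w$.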
## Universal Arrows And Comma Categories
Adjunctions are often first met through universal properties rather than through named functor pairs. Given a functor $G:\mathcal D \to \mathcal C$ and an object $c \in \mathcal C$, what does it mean to choose the best approximation to $c$ by an object in the image of $G$? The categorical answer is a universal arrow from $c$ to $G$.
The obstruction is that a good object for one input $c$ does not by itself produce a functor. If no universal object exists for some $c$, there is no possible value $Fc$ for a left adjoint. If universal objects exist but are chosen without using their factorisation property to define maps, the resulting assignment need not respect identity morphisms or composition.
[definition: Universal Arrow From An Object To A Functor]
Let $G:\mathcal D \to \mathcal C$ be a functor and let $c \in \mathcal C$. A universal arrow from $c$ to $G$ is an object $d_c \in \mathcal D$ together with a morphism $\eta_c:c \to Gd_c$ such that for every $d \in \mathcal D$ and every morphism $h:c \to Gd$, there exists a unique morphism $\bar h:d_c \to d$ in $\mathcal D$ satisfying
\begin{align*}
G\bar h \circ \eta_c = h.
\end{align*}
[/definition]
This is the usual universal-property pattern: every candidate map from $c$ into a $G$-object factors uniquely through the chosen universal one. If $d_c$ is written $Fc$, the universal arrow says precisely that maps $Fc \to d$ correspond bijectively to maps $c \to Gd$.
To make this objectwise universal property usable in an adjunction construction, we need a setting that contains all possible arrows $c \to Gd$ at once. The right setting is a comma category: its objects remember both the candidate target $d$ and the map from $c$, and its morphisms record exactly the allowed comparisons between candidates.
[definition: Comma Category For A Universal Arrow]
Let $G:\mathcal D \to \mathcal C$ be a functor and let $c \in \mathcal C$. The comma category $(c \downarrow G)$ has objects pairs $(d,h)$ with $d \in \mathcal D$ and $h:c \to Gd$ in $\mathcal C$. A morphism $(d,h) \to (d',h')$ is a morphism $a:d \to d'$ in $\mathcal D$ such that
\begin{align*}
Ga \circ h = h'.
\end{align*}
[/definition]
With this language, a universal arrow from $c$ to $G$ is exactly an initial object of $(c \downarrow G)$. This viewpoint separates the existence of a universal object for each $c$ from the additional work of making those objects functorial in $c$.
The remaining problem is global: objectwise universal arrows must assemble into a functor and then into an adjunction. The following criterion explains why the uniqueness built into each comma-category initial object is strong enough to force the morphism part of the left adjoint.
[quotetheorem:4145]
[citeproof:4145]
This theorem is the main bridge from concrete universal constructions to adjunctions. The hypothesis must hold for every object $c \in \mathcal C$: if free objects exist only for a subcategory of inputs, the construction may define a partial operation but not a left adjoint on all of $\mathcal C$. The morphism part of $F$ is also forced by the chosen universal arrows: for $u:c\to c'$, the morphism $Fu:Fc\to Fc'$ is the unique one satisfying $G(Fu) \circ \eta_c = \eta_{c'} \circ u$, and choosing unrelated representatives without this defining equation can destroy functoriality. The limitation is that the theorem proves existence of a left adjoint only after the universal arrows have been supplied, so later existence theorems still have to construct those initial objects.
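To see how universal arrows force the action of $F$ on morphisms, consider the following Python sketch. It assumes a simple model that is treated properly in Chapter 2: finite tuples under concatenation as the free monoid on a set, with `eta` sending a letter to a one-letter tuple. The names `extend` and `F` are hypothetical helpers for this illustration.

```python
def eta(s):
    """Universal arrow: a letter viewed as a one-letter word."""
    return (s,)

def extend(h):
    """The unique concatenation-preserving map h_bar on words
    with h_bar(eta(s)) == h(s), where h takes values in words."""
    def h_bar(word):
        out = ()
        for s in word:
            out = out + h(s)
        return out
    return h_bar

def F(u):
    """Action on morphisms forced by universality:
    F(u) is the extension of eta composed with u."""
    return extend(lambda s: eta(u(s)))

u = str.upper          # a sample function S -> S'
v = ord                # a sample function S' -> S''
word = ("a", "b", "a")
```

With these definitions, `F(u)(eta("a"))` equals `eta(u("a"))`, which is the defining equation on one letter, and `F` applied to the composite of `u` and `v` agrees on `word` with `F(v)` applied after `F(u)`. Nothing beyond the extension property was used to define `F`.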
[example: Free Group As An Initial Object]
For a set $S$, consider the object $(F(S),\eta_S)$ of $(S \downarrow U)$, where $\eta_S(s)$ is the generator of the free group $F(S)$ labelled by $s$. We show that this object is initial: for every object $(H,h)$, with $H$ a group and $h:S\to U(H)$ a function, there is exactly one morphism
\begin{align*}
(F(S),\eta_S)\longrightarrow (H,h)
\end{align*}
in the comma category.
Define a homomorphism $\widetilde h:F(S)\to H$ on a reduced word by
\begin{align*}
\widetilde h(s_1^{\epsilon_1}\cdots s_n^{\epsilon_n})
=
h(s_1)^{\epsilon_1}\cdots h(s_n)^{\epsilon_n},
\end{align*}
where $\epsilon_i\in\{1,-1\}$. This is compatible with cancellation: if a word contains $ss^{-1}$, its image contains
\begin{align*}
h(s)h(s)^{-1}=e_H,
\end{align*}
and if it contains $s^{-1}s$, its image contains
\begin{align*}
h(s)^{-1}h(s)=e_H.
\end{align*}
Thus deleting an adjacent inverse pair in the word does not change the resulting product in $H$.
The comma-category condition for a morphism $(F(S),\eta_S)\to(H,h)$ is
\begin{align*}
U\widetilde h\circ \eta_S=h.
\end{align*}
For each $s\in S$,
\begin{align*}
(U\widetilde h\circ \eta_S)(s)
&=U\widetilde h(\eta_S(s))\\
&=\widetilde h(s)\\
&=h(s),
\end{align*}
so $\widetilde h$ is a morphism in $(S\downarrow U)$.
It is the only such morphism. If $\alpha:F(S)\to H$ is any group homomorphism satisfying $U\alpha\circ\eta_S=h$, then for each generator $s\in S$,
\begin{align*}
\alpha(s)
&=\alpha(\eta_S(s))\\
&=(U\alpha\circ\eta_S)(s)\\
&=h(s).
\end{align*}
Therefore, on any reduced word,
\begin{align*}
\alpha(s_1^{\epsilon_1}\cdots s_n^{\epsilon_n})
&=\alpha(s_1)^{\epsilon_1}\cdots \alpha(s_n)^{\epsilon_n}\\
&=h(s_1)^{\epsilon_1}\cdots h(s_n)^{\epsilon_n}\\
&=\widetilde h(s_1^{\epsilon_1}\cdots s_n^{\epsilon_n}),
\end{align*}
using preservation of multiplication and inverses by the homomorphism $\alpha$. Hence $\alpha=\widetilde h$, so $(F(S),\eta_S)$ is initial in $(S\downarrow U)$. This restates the free group universal property as an initial-object property in a comma category.
[/example]
There is a dual notion of universal arrow from a functor to an object, yielding right adjoints. In this chapter we focus on left adjoints because the main examples are free or tensor-like constructions, but the dual statements are obtained by reversing arrows.
## Equivalence And Uniqueness
The three descriptions above look different: hom-set bijections, unit-counit data, and universal arrows. Why are they treated as the same concept rather than as related notions? The point is that each construction recovers the others without arbitrary choices, up to the unique isomorphisms forced by universal properties.
[quotetheorem:4146]
[citeproof:4146]
This equivalence lets us switch definitions depending on the problem. Each item includes coherence hypotheses, and omitting any of them changes the statement: non-natural hom-set bijections, unit-counit maps without triangle identities, or universal arrows that do not vary functorially do not determine the same notion. The theorem is therefore not a loose analogy between three perspectives, but a precise equivalence of structured data. In examples, the universal arrow is often easiest to construct; in proofs about preservation of limits and colimits, the hom-set definition is usually the most compact; in calculations with monads and comonads, units and counits carry the structure.
One ambiguity still remains after the definitions have been identified. Universal constructions are rarely built uniquely on the nose: a free group can be realised by many different sets of reduced words, and a tensor product can be presented by different quotients. The next theorem explains why this non-uniqueness is harmless at the categorical level.
[quotetheorem:4147]
[citeproof:4147]
The uniqueness theorem is why category theory speaks of the left adjoint to a functor when one exists. The uniqueness is not literal equality of functors; it is uniqueness up to a unique natural isomorphism compatible with the specified adjunction data. Compatibility is necessary because two functors may be naturally isomorphic in many ways, while the universal arrows single out the isomorphism that respects the adjunction. Without the assumption that both functors are left adjoint to the same $G$, there is no reason for such an isomorphism to exist, and later constructions use this theorem precisely to identify different models of the same universal construction.
This also gives an important warning about language. Saying that two functors form an adjoint pair is not just saying that some hom-sets happen to have the same cardinality; the particular natural bijection, or equivalently the particular unit and counit, is part of the data being used.
[remark: Adjunctions Are Structure]
An adjunction is more than the assertion that two functors happen to determine isomorphic hom-sets. The specified natural bijection, or equivalently the specified unit and counit, is part of the structure. In many examples there is a canonical choice, and the uniqueness theorem explains why different constructions of the same universal object agree once the universal maps are fixed.
[/remark]
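The uniqueness statement can be illustrated with two models of the same free construction. The sketch below assumes an alphabet of single characters and compares two hypothetical models of the free monoid: tuples of letters and Python strings, both under concatenation. The comparison maps are not chosen by hand; each is obtained by extending the other model's unit, which is how the universal arrows single out the isomorphism.

```python
def eta_tuple(s):
    """Unit of the tuple model: a one-letter tuple."""
    return (s,)

def eta_str(s):
    """Unit of the string model: a one-character string."""
    return s

def extend(h, op, unit):
    """Fold h over the letters of a word. This works for both models,
    because tuples and strings are both iterated letter by letter."""
    def h_bar(word):
        result = unit
        for s in word:
            result = op(result, h(s))
        return result
    return h_bar

concat = lambda x, y: x + y

# theta: tuple model -> string model, the extension of eta_str.
theta = extend(eta_str, concat, "")
# theta_inv: string model -> tuple model, the extension of eta_tuple.
theta_inv = extend(eta_tuple, concat, ())

w = ("a", "b", "c")
```

On this sample, `theta` and `theta_inv` are mutually inverse, and `theta(eta_tuple(s))` equals `eta_str(s)`, so the isomorphism is compatible with the two units, as the uniqueness theorem requires.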
The chapter has established the basic language of adjunctions in its three standard forms. Chapter 2 uses universal arrows to recognise free and reflective constructions, Chapter 3 uses units and counits to study equivalences and monads, and Chapters 4 through 7 use hom-set adjunctions to prove preservation and existence results for limits, colimits, and Kan extensions.
Universal arrows already show that many constructions are not isolated tricks but instances of the same mechanism. Free objects, reflective subcategories, and other algebraic examples make the adjoint pattern visible in computations, where abstract definitions turn into familiar generators and relations.
# 2. Free-Forgetful Adjunctions and Algebraic Examples
Free constructions are often the first place where adjunctions become computational. The question is not merely whether a group, module, or algebra can be generated by a set, but why every map out of the generating data extends in exactly the way required by the target structure. This chapter turns the abstract definitions of Chapter 1 into a working toolkit: free objects are left adjoints, forgetful functors are right adjoints, and many familiar algebraic constructions are reflections into better-behaved subcategories.
Building on the universal-arrow formulation from Chapter 1, the guiding pattern is that a left adjoint builds the most economical structured object from raw data, while a right adjoint forgets structure without changing the underlying object. Without the universal property, a construction may still look natural but give no reliable test for maps out of it, and without the adjunction, preservation claims about limits and colimits become separate calculations. Once this is recognised, preservation theorems explain why forgetful functors that have left adjoints preserve products, pullbacks, and equalizers, while free constructions preserve coproducts and coequalizers.
## Free Constructions as Left Adjoints
What does it mean to build an algebraic object from generators without imposing relations beyond those forced by the axioms? A presentation by generators can fail if it accidentally identifies two distinct formal expressions, or if it omits relations forced by the algebraic laws. Category theory answers this by asking for a universal map from the raw data into the underlying object of the algebraic category: every proposed interpretation of the generators must extend uniquely to a genuine structured map. The result is a left adjoint to a forgetful functor.
[definition: Free Object on a Set]
Let $U: \mathcal A \to \mathbf{Set}$ be a functor. A free object on a set $S$ is an object $F(S) \in \mathcal A$ together with a function $\eta_S: S \to U(F(S))$ such that for every object $A \in \mathcal A$ and every function $f: S \to U(A)$, there exists a unique morphism $\bar f: F(S) \to A$ in $\mathcal A$ satisfying $U(\bar f) \circ \eta_S = f$.
[/definition]
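This definition packages the extension property of generators. If such free objects exist for all sets $S$, the universal property extends the assignment $S \mapsto F(S)$ to a functor $F: \mathbf{Set} \to \mathcal A$: for a function $u:S\to S'$, the morphism $F(u)$ is the unique one satisfying $U(F(u))\circ\eta_S=\eta_{S'}\circ u$.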
The key issue is whether these extension properties are merely object-by-object recipes or whether they produce the adjunction expected of a free construction. The theorem below turns the universal mapping property into the hom-set bijection that characterises a left adjoint.
[quotetheorem:4148]
[citeproof:4148]
The theorem says that free constructions are not isolated recipes. They are instances of the same representability phenomenon: maps out of a free object are controlled by functions out of its generators. The hypotheses matter because free objects may fail to exist in some categories, and even objectwise choices do not define an adjunction unless they are compatible with maps between generating sets. The forward use is computational: to define a morphism out of a free object, it is enough to define it on generators and then invoke uniqueness.
[example: Free Monoid]
Let $S$ be a set. Write $S^*$ for the set of all finite words $s_1\cdots s_n$ with letters in $S$, together with the empty word $\varepsilon$. Multiplication is concatenation:
\begin{align*}
(s_1\cdots s_m)(t_1\cdots t_n)=s_1\cdots s_m t_1\cdots t_n,
\end{align*}
and $\varepsilon$ is the unit because $\varepsilon w=w=w\varepsilon$ for every word $w\in S^*$. Thus $S^*$ is a monoid.
Given a monoid $M$ and a function $f:S\to U(M)$, define $\bar f:S^*\to M$ by
\begin{align*}
\bar f(\varepsilon)&=1_M,\\
\bar f(s_1\cdots s_n)&=f(s_1)\cdots f(s_n)\qquad(n\geq 1).
\end{align*}
For words $u=s_1\cdots s_m$ and $v=t_1\cdots t_n$, this map respects multiplication:
\begin{align*}
\bar f(uv)
&=\bar f(s_1\cdots s_m t_1\cdots t_n)\\
&=f(s_1)\cdots f(s_m)f(t_1)\cdots f(t_n)\\
&=\bar f(u)\bar f(v).
\end{align*}
It also respects the unit since $\bar f(\varepsilon)=1_M$, so $\bar f$ is a monoid homomorphism.
If $h:S^*\to M$ is any monoid homomorphism with $h(s)=f(s)$ for every one-letter word $s\in S$, then
\begin{align*}
h(s_1\cdots s_n)
&=h(s_1\cdots s_{n-1})h(s_n)\\
&=\cdots\\
&=h(s_1)\cdots h(s_n)\\
&=f(s_1)\cdots f(s_n)\\
&=\bar f(s_1\cdots s_n),
\end{align*}
and also $h(\varepsilon)=1_M=\bar f(\varepsilon)$ because monoid homomorphisms preserve units. Hence $h=\bar f$, so the extension is unique. Therefore maps $S\to U(M)$ are in natural bijection with monoid homomorphisms $S^*\to M$, giving the adjunction $(-)^*:\mathbf{Set}\rightleftarrows \mathbf{Mon}:U$.
[/example]
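The extension $\bar f$ can also be sketched computationally. The following is a minimal illustration, assuming words are modelled as Python tuples and the target monoid is supplied as a unit together with a binary operation. The names `extend`, `letters`, and `f_bar` are hypothetical and chosen only for this sketch.

```python
from functools import reduce

def extend(f, unit, op):
    """Extend f : S -> M to the monoid homomorphism S* -> M (sketch)."""
    def f_bar(word):
        # The empty word goes to the unit; otherwise multiply the images
        # of the letters from left to right.
        return reduce(op, (f(s) for s in word), unit)
    return f_bar

# Assumed target monoid: Python strings under concatenation, with unit "".
letters = {"a": "x", "b": "yz"}
f_bar = extend(letters.__getitem__, "", lambda m, n: m + n)

u, v = ("a", "b"), ("b", "a")
assert f_bar(()) == ""                      # the unit is preserved
assert f_bar(("a",)) == "x"                 # f_bar extends f on letters
assert f_bar(u + v) == f_bar(u) + f_bar(v)  # concatenation goes to the product
```

The target here is deliberately noncommutative, so the check on `u + v` depends on the order of the letters and not only on which letters occur.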
The free monoid example is the cleanest case because the only algebraic operation to respect is concatenation with a unit. It also shows a useful diagnostic: a proposed construction is free only if every function on letters determines exactly one structure-preserving map, with no extra compatibility condition imposed on the function. Free groups add formal inverses, so the same universal property must now record exactly how words reduce under a target homomorphism.
[example: Free Group]
Let $S^{-1}=\{s^{-1}:s\in S\}$ be a formal copy of $S$. Elements of $F(S)$ are reduced words in the alphabet $S\sqcup S^{-1}$, meaning finite words with no adjacent pair of the form $ss^{-1}$ or $s^{-1}s$. The empty word is the identity. Multiplication is concatenation followed by repeatedly deleting adjacent inverse pairs.
Given a group $G$ and a function $f:S\to U(G)$, define the value of a letter by
\begin{align*}
\widetilde f(s)&=f(s),\\
\widetilde f(s^{-1})&=f(s)^{-1}.
\end{align*}
For a word $w=a_1\cdots a_n$ with each $a_i\in S\sqcup S^{-1}$, define
\begin{align*}
\bar f(w)=\widetilde f(a_1)\cdots \widetilde f(a_n),
\end{align*}
and set $\bar f(\varepsilon)=1_G$. This value is unchanged by deleting an adjacent inverse pair. Indeed, if the adjacent pair is $ss^{-1}$, then
\begin{align*}
\widetilde f(s)\widetilde f(s^{-1})
=f(s)f(s)^{-1}
=1_G,
\end{align*}
and if the adjacent pair is $s^{-1}s$, then
\begin{align*}
\widetilde f(s^{-1})\widetilde f(s)
=f(s)^{-1}f(s)
=1_G.
\end{align*}
Thus deleting such a pair removes a factor equal to $1_G$, so the evaluated product in $G$ is unchanged.
For reduced words $u=a_1\cdots a_m$ and $v=b_1\cdots b_n$, their product in $F(S)$ is the reduced word obtained from $a_1\cdots a_m b_1\cdots b_n$ by deleting adjacent inverse pairs. Since each deletion leaves the value unchanged,
\begin{align*}
\bar f(uv)
&=\bar f(a_1\cdots a_m b_1\cdots b_n)\\
&=\widetilde f(a_1)\cdots \widetilde f(a_m)\widetilde f(b_1)\cdots \widetilde f(b_n)\\
&=\bar f(u)\bar f(v).
\end{align*}
Also $\bar f(\varepsilon)=1_G$, so $\bar f:F(S)\to G$ is a group homomorphism.
It remains to see uniqueness. If $h:F(S)\to G$ is a group homomorphism with $h(s)=f(s)$ for every $s\in S$, then
\begin{align*}
h(s^{-1})=h(s)^{-1}=f(s)^{-1}
\end{align*}
because homomorphisms preserve inverses. Hence for every reduced word $a_1\cdots a_n$,
\begin{align*}
h(a_1\cdots a_n)
&=h(a_1)\cdots h(a_n)\\
&=\widetilde f(a_1)\cdots \widetilde f(a_n)\\
&=\bar f(a_1\cdots a_n),
\end{align*}
and $h(\varepsilon)=1_G=\bar f(\varepsilon)$. Therefore $h=\bar f$. Thus every function $S\to U(G)$ extends uniquely to a group homomorphism $F(S)\to G$, so $F(S)$ is the free group on $S$.
[/example]
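The reduced-word model and the extension $\bar f$ can be sketched as follows. This is an illustration under stated assumptions: a letter is modelled as a pair `(generator, sign)` with sign `+1` or `-1`, and the target group is the symmetric group on three points, encoded as tuples. All function names are hypothetical.

```python
def reduce_word(word):
    """Delete adjacent inverse pairs, using a stack."""
    out = []
    for gen, sign in word:
        if out and out[-1] == (gen, -sign):
            out.pop()                  # cancels s s^-1 or s^-1 s
        else:
            out.append((gen, sign))
    return tuple(out)

def multiply(u, v):
    """Product in F(S): concatenate, then reduce."""
    return reduce_word(u + v)

def compose(p, q):
    """Composite permutation i -> p[q[i]]."""
    return tuple(p[i] for i in q)

def inverse(p):
    out = [0] * len(p)
    for i, j in enumerate(p):
        out[j] = i
    return tuple(out)

def extend(f):
    """Extend f : S -> S_3 to words by multiplying the values of the letters."""
    def f_bar(word):
        value = (0, 1, 2)              # identity permutation
        for gen, sign in word:
            g = f(gen)
            value = compose(value, g if sign == 1 else inverse(g))
        return value
    return f_bar

# Assumed images of the generators: two transpositions.
f_bar = extend({"s": (1, 0, 2), "t": (0, 2, 1)}.__getitem__)

u = (("s", 1), ("t", 1))
v = (("t", -1), ("s", 1))
assert multiply(u, v) == (("s", 1), ("s", 1))               # t t^-1 cancels
assert f_bar(multiply(u, v)) == compose(f_bar(u), f_bar(v))
```

The last assertion is one instance of the homomorphism property: deleting the pair $tt^{-1}$ changes the word but not its value in the target group.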
The reduced-word construction also marks a boundary: the free group is not obtained by merely taking arbitrary strings, because cancellation is forced by the group axioms. In linear settings, the forced operations are addition and scalar multiplication rather than multiplication and inverse. Modules and algebras therefore show the same pattern with linear combinations replacing words, and the universal property becomes a finite-support computation.
[example: Free Module]
Let $R$ be a ring and let $S$ be a set. The free left $R$-module on $S$ is
\begin{align*}
R^{(S)}=\{a:S\to R : a(s)=0 \text{ for all but finitely many } s\},
\end{align*}
with pointwise addition and scalar multiplication:
\begin{align*}
(a+b)(s)&=a(s)+b(s),\\
(r a)(s)&=r a(s).
\end{align*}
For each $s\in S$, let $e_s\in R^{(S)}$ be the function with $e_s(s)=1_R$ and $e_s(t)=0_R$ for $t\neq s$. The generator map $\eta_S:S\to U(R^{(S)})$ is $\eta_S(s)=e_s$.
Let $M$ be a left $R$-module and let $f:S\to U(M)$ be a function. For $a\in R^{(S)}$, define
\begin{align*}
\bar f(a)=\sum_{s\in S} a(s)f(s),
\end{align*}
where the sum is finite because $a(s)=0$ for all but finitely many $s$. This extends $f$ on generators, since for each $t\in S$,
\begin{align*}
\bar f(e_t)
&=\sum_{s\in S} e_t(s)f(s)\\
&=e_t(t)f(t)+\sum_{s\neq t} e_t(s)f(s)\\
&=1_R f(t)+\sum_{s\neq t} 0_R f(s)\\
&=f(t)+0\\
&=f(t).
\end{align*}
The map $\bar f$ is $R$-linear. If $a,b\in R^{(S)}$, then
\begin{align*}
\bar f(a+b)
&=\sum_{s\in S} (a+b)(s)f(s)\\
&=\sum_{s\in S} (a(s)+b(s))f(s)\\
&=\sum_{s\in S} \bigl(a(s)f(s)+b(s)f(s)\bigr)\\
&=\sum_{s\in S} a(s)f(s)+\sum_{s\in S} b(s)f(s)\\
&=\bar f(a)+\bar f(b).
\end{align*}
If $r\in R$, then
\begin{align*}
\bar f(ra)
&=\sum_{s\in S} (ra)(s)f(s)\\
&=\sum_{s\in S} (r a(s))f(s)\\
&=\sum_{s\in S} r\bigl(a(s)f(s)\bigr)\\
&=r\sum_{s\in S} a(s)f(s)\\
&=r\bar f(a).
\end{align*}
Finally, suppose $h:R^{(S)}\to M$ is any $R$-linear map satisfying $h(e_s)=f(s)$ for every $s\in S$. Since $a$ has finite support,
\begin{align*}
a
&=\sum_{s\in S} a(s)e_s,
\end{align*}
because at each $t\in S$ the right-hand side has value
\begin{align*}
\left(\sum_{s\in S} a(s)e_s\right)(t)
&=\sum_{s\in S} a(s)e_s(t)\\
&=a(t)e_t(t)+\sum_{s\neq t} a(s)e_s(t)\\
&=a(t)1_R+\sum_{s\neq t} a(s)0_R\\
&=a(t).
\end{align*}
Therefore
\begin{align*}
h(a)
&=h\left(\sum_{s\in S} a(s)e_s\right)\\
&=\sum_{s\in S} a(s)h(e_s)\\
&=\sum_{s\in S} a(s)f(s)\\
&=\bar f(a).
\end{align*}
Thus $h=\bar f$, so every function $S\to U(M)$ extends uniquely to an $R$-linear map $R^{(S)}\to M$. This is exactly why $R^{(S)}$ is the free left $R$-module on $S$: maps out of it are determined by the chosen images of the basis elements $e_s$.
[/example]
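The finite-support formula for $\bar f$ can be checked on a small case. The sketch below assumes $R=\mathbb Z$ and $M=\mathbb Z^2$, and models an element of $R^{(S)}$ as a dictionary that records only its nonzero coefficients. The names `vec_add`, `vec_scale`, `extend`, and `images` are hypothetical.

```python
def vec_add(a, b):
    """Pointwise sum of two finitely supported functions S -> Z."""
    out = dict(a)
    for s, coeff in b.items():
        out[s] = out.get(s, 0) + coeff
    return {s: c for s, c in out.items() if c != 0}

def vec_scale(r, a):
    """Pointwise scalar multiple r * a."""
    return {s: r * c for s, c in a.items() if r * c != 0}

def extend(f):
    """Extend f : S -> Z^2 to the Z-linear map Z^(S) -> Z^2 (sketch)."""
    def f_bar(a):
        x, y = 0, 0
        for s, coeff in a.items():     # finite sum over the support of a
            fx, fy = f(s)
            x, y = x + coeff * fx, y + coeff * fy
        return (x, y)
    return f_bar

# Assumed images of the basis elements e_s, e_t, e_u.
images = {"s": (1, 0), "t": (1, 1), "u": (0, 2)}
f_bar = extend(images.__getitem__)

a = {"s": 2, "t": -1}
b = {"t": 1, "u": 3}
assert f_bar({"t": 1}) == (1, 1)               # f_bar(e_t) = f(t)
assert f_bar(vec_add(a, b)) == (2, 6)          # equals f_bar(a) + f_bar(b)
assert f_bar(vec_scale(3, a)) == (3, -3)       # equals 3 * f_bar(a)
```

Storing only nonzero coefficients is one way to make the finite-support condition part of the data structure itself.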
Free modules illustrate how a basis is best understood by its extension property, not merely by coordinates. The finite-support condition is essential: allowing arbitrary functions $S \to R$ produces the product module $R^S$, and for infinite $S$ an $R$-linear map out of $R^S$ is generally not determined by its values on the $e_s$, because the formula $\sum_{s\in S} a(s)f(s)$ no longer makes sense when infinitely many terms are nonzero. Polynomial algebras add multiplication to the same finite-formal-expression principle.
[example: Free Algebra]
Let $k$ be a commutative ring, let $S$ be a set, and let $A$ be a commutative $k$-algebra. Write an element of the polynomial algebra $k[x_s:s\in S]$ as a finite sum
\begin{align*}
p=\sum_{\alpha} c_\alpha x^\alpha,
\end{align*}
where each $\alpha:S\to \mathbb N$ has finite support, only finitely many coefficients $c_\alpha\in k$ are nonzero, and
\begin{align*}
x^\alpha=\prod_{s\in S}x_s^{\alpha(s)}.
\end{align*}
The product is finite because $\alpha(s)=0$ except for finitely many $s$.
Given a function $f:S\to U(A)$, define
\begin{align*}
\bar f\left(\sum_{\alpha}c_\alpha x^\alpha\right)
=
\sum_{\alpha} c_\alpha \prod_{s\in S} f(s)^{\alpha(s)}.
\end{align*}
This extends $f$ on generators: for $t\in S$, the variable $x_t$ is the monomial with exponent $1$ at $t$ and $0$ elsewhere, so
\begin{align*}
\bar f(x_t)
&=1_k f(t)^1\prod_{s\neq t} f(s)^0\\
&=f(t).
\end{align*}
The map $\bar f$ is a $k$-algebra homomorphism. For addition,
\begin{align*}
\bar f\left(\sum_\alpha c_\alpha x^\alpha+\sum_\alpha d_\alpha x^\alpha\right)
&=\bar f\left(\sum_\alpha (c_\alpha+d_\alpha)x^\alpha\right)\\
&=\sum_\alpha (c_\alpha+d_\alpha)\prod_s f(s)^{\alpha(s)}\\
&=\sum_\alpha c_\alpha\prod_s f(s)^{\alpha(s)}
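+\sum_\alpha d_\alpha\prod_s f(s)^{\alpha(s)}\\
&=\bar f\left(\sum_\alpha c_\alpha x^\alpha\right)+\bar f\left(\sum_\alpha d_\alpha x^\alpha\right).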
\end{align*}
For multiplication, using $x^\alpha x^\beta=x^{\alpha+\beta}$ and commutativity in $A$,
\begin{align*}
\bar f\left(\left(\sum_\alpha c_\alpha x^\alpha\right)\left(\sum_\beta d_\beta x^\beta\right)\right)
&=\bar f\left(\sum_{\alpha,\beta} c_\alpha d_\beta x^{\alpha+\beta}\right)\\
&=\sum_{\alpha,\beta} c_\alpha d_\beta \prod_s f(s)^{\alpha(s)+\beta(s)}\\
&=\sum_{\alpha,\beta} c_\alpha d_\beta
\left(\prod_s f(s)^{\alpha(s)}\right)
\left(\prod_s f(s)^{\beta(s)}\right)\\
&=\left(\sum_\alpha c_\alpha\prod_s f(s)^{\alpha(s)}\right)
\left(\sum_\beta d_\beta\prod_s f(s)^{\beta(s)}\right).
\end{align*}
Also $\bar f(c)=c\cdot 1_A$ for $c\in k$, so $\bar f$ respects the $k$-algebra structure.
Finally, if $h:k[x_s:s\in S]\to A$ is any $k$-algebra homomorphism with $h(x_s)=f(s)$ for every $s\in S$, then for every monomial $c_\alpha x^\alpha$,
\begin{align*}
h(c_\alpha x^\alpha)
&=c_\alpha h\left(\prod_s x_s^{\alpha(s)}\right)\\
&=c_\alpha \prod_s h(x_s)^{\alpha(s)}\\
&=c_\alpha \prod_s f(s)^{\alpha(s)}.
\end{align*}
Therefore, for every polynomial $p=\sum_\alpha c_\alpha x^\alpha$,
\begin{align*}
h(p)
&=h\left(\sum_\alpha c_\alpha x^\alpha\right)\\
&=\sum_\alpha h(c_\alpha x^\alpha)\\
&=\sum_\alpha c_\alpha\prod_s f(s)^{\alpha(s)}\\
&=\bar f(p).
\end{align*}
Thus $h=\bar f$, so every function $S\to U(A)$ extends uniquely to a commutative $k$-algebra homomorphism $k[x_s:s\in S]\to A$. The polynomial algebra is therefore free because its maps out are determined exactly by the chosen images of the variables.
[/example]
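The evaluation homomorphism $\bar f$ can be tested on a small polynomial ring. The sketch below assumes $k=A=\mathbb Z$, models a monomial $x^\alpha$ as a sorted tuple of `(variable, exponent)` pairs, and models a polynomial as a dictionary from monomials to nonzero coefficients. The names `mono_mul`, `poly_mul`, and `evaluate` are hypothetical.

```python
def mono_mul(alpha, beta):
    """x^alpha * x^beta = x^(alpha + beta), adding exponents variable by variable."""
    exps = dict(alpha)
    for s, e in beta:
        exps[s] = exps.get(s, 0) + e
    return tuple(sorted(exps.items()))

def poly_mul(p, q):
    """Product of two polynomials, collecting equal monomials."""
    out = {}
    for alpha, c in p.items():
        for beta, d in q.items():
            m = mono_mul(alpha, beta)
            out[m] = out.get(m, 0) + c * d
    return {m: c for m, c in out.items() if c != 0}

def evaluate(f, p):
    """f_bar(p): replace each variable x_s by f(s) and compute in Z."""
    total = 0
    for alpha, c in p.items():
        term = c
        for s, e in alpha:
            term *= f(s) ** e
        total += term
    return total

# p = x + 2y and q = xy - 3, with assumed values f(x) = 2 and f(y) = 5.
p = {(("x", 1),): 1, (("y", 1),): 2}
q = {(("x", 1), ("y", 1)): 1, (): -3}
f = {"x": 2, "y": 5}.__getitem__

assert evaluate(f, p) == 12
assert evaluate(f, q) == 7
assert evaluate(f, poly_mul(p, q)) == evaluate(f, p) * evaluate(f, q)
```

The final assertion is a single instance of multiplicativity; it illustrates the calculation in the example but does not replace it.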
The examples differ in syntax, but their universal properties have the same form. This is the reason algebraic categories often come with a forgetful functor to sets and a free functor back.
## Forgetful Functors and Preservation of Limits
Why do products, pullbacks, and equalizers in algebraic categories often look as though they are computed on underlying sets? The explanation is that many forgetful functors are right adjoints, and right adjoints preserve limits. The slogan is useful, but the proof is a direct calculation with universal properties.
Recall that a cone from an object $c \in \mathcal C$ to a diagram $D:J\to\mathcal C$ is a compatible family of maps $c\to D(j)$. A limit is the terminal cone over a diagram. Dually, a colimit is the initial cocone under a diagram.
[quotetheorem:4149]
[citeproof:4149]
For algebraic categories, this theorem explains why the forgetful functor sends many limits to their set-theoretic versions. The hypothesis that the functor is a right adjoint is essential: a general forgetful-looking functor need not preserve arbitrary limits unless it has the representing bijections supplied by an adjunction. The theorem also has a limitation that is worth keeping in mind in practice: it only says that limits which already exist in the structured category are sent to limits, and it does not construct them. When it applies, the extra algebraic structure is forced coordinatewise by the universal property, so many limit computations reduce to checking closure of the set-theoretic construction.
[example: Products of Groups]
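Let $(G_i)_{i\in I}$ be a family of groups. Since the forgetful functor $U:\mathbf{Grp}\to\mathbf{Set}$ is a right adjoint, *Right Adjoints Preserve Limits* implies that it preserves products, so the underlying set of the product group must be the product of the underlying sets. Concretely, let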
\begin{align*}
P=\prod_{i\in I} U(G_i)
\end{align*}
be the set of tuples $(g_i)_{i\in I}$ with $g_i\in G_i$. Define multiplication, identity, and inverses coordinatewise:
\begin{align*}
(g_i)_{i\in I}(h_i)_{i\in I}&=(g_i h_i)_{i\in I},\\
1_P&=(1_{G_i})_{i\in I},\\
(g_i)_{i\in I}^{-1}&=(g_i^{-1})_{i\in I}.
\end{align*}
For associativity, if $(g_i),(h_i),(k_i)\in P$, then the $i$th coordinate of $((g_i)(h_i))(k_i)$ is
\begin{align*}
(g_i h_i)k_i=g_i(h_i k_i),
\end{align*}
because multiplication in $G_i$ is associative. This is the $i$th coordinate of $(g_i)((h_i)(k_i))$, so multiplication on $P$ is associative. Similarly,
\begin{align*}
(1_{G_i})_{i\in I}(g_i)_{i\in I}
&=(1_{G_i}g_i)_{i\in I}
=(g_i)_{i\in I},\\
(g_i)_{i\in I}(1_{G_i})_{i\in I}
&=(g_i1_{G_i})_{i\in I}
=(g_i)_{i\in I},
\end{align*}
and
\begin{align*}
(g_i)_{i\in I}(g_i^{-1})_{i\in I}
&=(g_i g_i^{-1})_{i\in I}
=(1_{G_i})_{i\in I},\\
(g_i^{-1})_{i\in I}(g_i)_{i\in I}
&=(g_i^{-1} g_i)_{i\in I}
=(1_{G_i})_{i\in I}.
\end{align*}
Thus $P$ is a group.
For each $j\in I$, the projection
\begin{align*}
\pi_j:P\to G_j,\qquad \pi_j((g_i)_{i\in I})=g_j
\end{align*}
is a group homomorphism, since
\begin{align*}
\pi_j((g_i)(h_i))
&=\pi_j((g_i h_i)_{i\in I})\\
&=g_jh_j\\
&=\pi_j((g_i))\pi_j((h_i)),
\end{align*}
and
\begin{align*}
\pi_j(1_P)=\pi_j((1_{G_i})_{i\in I})=1_{G_j}.
\end{align*}
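Finally, given a group $T$ and homomorphisms $h_i:T\to G_i$, the map $h:T\to P$ defined by $h(t)=(h_i(t))_{i\in I}$ is a homomorphism because each coordinate is, and it is the only function satisfying $\pi_i\circ h=h_i$ for all $i$. Therefore $P$ with the projections $\pi_j$ is the product in $\mathbf{Grp}$, and it has exactly the set-theoretic product as its underlying set, with all structure forced coordinate by coordinate.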
[/example]
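The coordinatewise construction can be sketched for a finite family. The code below assumes each group is given as a tuple `(elements, unit, op, inv)` and uses the cyclic groups $\mathbb Z/2$ and $\mathbb Z/3$ as hypothetical inputs. The names `cyclic`, `product_group`, and `pair` are illustrative only.

```python
from itertools import product as cartesian

def cyclic(n):
    """The group Z/n as (elements, unit, multiplication, inverse)."""
    return list(range(n)), 0, (lambda a, b: (a + b) % n), (lambda a: (-a) % n)

def product_group(groups):
    """Product group with all structure defined coordinate by coordinate."""
    elements = list(cartesian(*(g[0] for g in groups)))
    unit = tuple(g[1] for g in groups)
    def op(x, y):
        return tuple(g[2](a, b) for g, a, b in zip(groups, x, y))
    def inv(x):
        return tuple(g[3](a) for g, a in zip(groups, x))
    return elements, unit, op, inv

elements, unit, op, inv = product_group([cyclic(2), cyclic(3)])

# The underlying set is the set-theoretic product, and inverses are coordinatewise.
assert len(elements) == 6
assert all(op(x, inv(x)) == unit for x in elements)

# The second projection is a homomorphism to Z/3.
assert all(op(x, y)[1] == (x[1] + y[1]) % 3 for x in elements for y in elements)

# Pairing the reductions Z/6 -> Z/2 and Z/6 -> Z/3 gives a homomorphism Z/6 -> P.
pair = lambda t: (t % 2, t % 3)
assert all(pair((s + t) % 6) == op(pair(s), pair(t))
           for s in range(6) for t in range(6))
```

The map `pair` is the tuple map from the universal property: its coordinates are the two given homomorphisms, and no other choice is possible once the coordinates are fixed.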
Products show how forgetful functors preserve objects built by independent coordinates. Equalizers test a different kind of limit: instead of assembling coordinates, they carve out the part of an object where two maps agree.
[example: Equalizers of Rings]
Let $f,g:R\to S$ be ring homomorphisms. We compute the equalizer by first taking the set
\begin{align*}
E=\{r\in R:f(r)=g(r)\}.
\end{align*}
If $r,r'\in E$, then $f(r)=g(r)$ and $f(r')=g(r')$, so
\begin{align*}
f(r+r')
&=f(r)+f(r')\\
&=g(r)+g(r')\\
&=g(r+r'),
\end{align*}
hence $r+r'\in E$. Also
\begin{align*}
f(-r)
&=-f(r)\\
&=-g(r)\\
&=g(-r),
\end{align*}
so $-r\in E$, and
\begin{align*}
f(rr')
&=f(r)f(r')\\
&=g(r)g(r')\\
&=g(rr'),
\end{align*}
so $rr'\in E$. Finally,
\begin{align*}
f(1_R)=1_S=g(1_R),
\end{align*}
because ring homomorphisms preserve units. Therefore $1_R\in E$, and $E$ is a subring of $R$.
Let $i:E\hookrightarrow R$ be the inclusion. For every $e\in E$,
\begin{align*}
(f\circ i)(e)=f(e)=g(e)=(g\circ i)(e),
\end{align*}
so $f\circ i=g\circ i$. Now suppose $T$ is a ring and $h:T\to R$ is a ring homomorphism satisfying $f\circ h=g\circ h$. For each $t\in T$,
\begin{align*}
f(h(t))=(f\circ h)(t)=(g\circ h)(t)=g(h(t)),
\end{align*}
so $h(t)\in E$. Hence there is a function $\bar h:T\to E$ defined by $\bar h(t)=h(t)$, viewed as an element of $E$. This map is a ring homomorphism because, for $t,t'\in T$,
\begin{align*}
\bar h(t+t')
&=h(t+t')\\
&=h(t)+h(t')\\
&=\bar h(t)+\bar h(t'),\\
\bar h(tt')
&=h(tt')\\
&=h(t)h(t')\\
&=\bar h(t)\bar h(t'),
\end{align*}
and also
\begin{align*}
\bar h(1_T)=h(1_T)=1_R=1_E.
\end{align*}
By construction $i\circ \bar h=h$. If $k:T\to E$ is another ring homomorphism with $i\circ k=h$, then for every $t\in T$,
\begin{align*}
k(t)=i(k(t))=h(t)=\bar h(t),
\end{align*}
so $k=\bar h$. Thus $i:E\hookrightarrow R$ is the equalizer of $f$ and $g$ in $\mathbf{Ring}$, and its underlying set is exactly the set-theoretic equalizer. This is the concrete form of *Right Adjoints Preserve Limits* for the forgetful functor $U:\mathbf{Ring}\to\mathbf{Set}$, which is a right adjoint because every set generates a free ring.
[/example]
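A finite instance makes the closure checks concrete. The sketch below assumes the ring $R=(\mathbb Z/3)[i]$ with $i^2=-1$, encoded as pairs `(a, b)` standing for $a+bi$, and takes the two homomorphisms $R\to R$ to be the identity and conjugation. These choices and all names are hypothetical illustrations, not part of the example above.

```python
from itertools import product as cartesian

P = 3
R = list(cartesian(range(P), repeat=2))      # (a, b) stands for a + b*i

def add(x, y):
    return ((x[0] + y[0]) % P, (x[1] + y[1]) % P)

def mul(x, y):
    # (a + bi)(c + di) = (ac - bd) + (ad + bc)i, reduced mod 3
    return ((x[0] * y[0] - x[1] * y[1]) % P, (x[0] * y[1] + x[1] * y[0]) % P)

def identity(x):
    return x

def conjugate(x):
    return (x[0], (-x[1]) % P)

# Conjugation respects addition and multiplication on this ring.
assert all(conjugate(add(x, y)) == add(conjugate(x), conjugate(y)) for x in R for y in R)
assert all(conjugate(mul(x, y)) == mul(conjugate(x), conjugate(y)) for x in R for y in R)

# The set-theoretic equalizer of the two maps.
E = [r for r in R if identity(r) == conjugate(r)]
assert E == [(0, 0), (1, 0), (2, 0)]

# E contains 1 and is closed under the ring operations, so it is a subring.
assert (1, 0) in E
assert all(add(x, y) in E and mul(x, y) in E for x in E for y in E)
```

Here the equalizer is the copy of $\mathbb Z/3$ inside $R$, namely the elements fixed by conjugation.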
The dual preservation theorem is just as important, especially when working with free constructions. Limits are controlled by maps into a candidate object, while colimits are controlled by maps out of it; adjunctions convert one kind of universal mapping problem into the other. This is why left adjoints are the natural home for constructions generated freely from data: they preserve the operations that assemble data by coproducts and quotients.
[quotetheorem:4150]
[citeproof:4150]
The theorem is the colimit analogue of the previous preservation result. Its hypotheses matter for the same reason: without an adjunction, a functor can preserve some colimits accidentally but gives no general method for transporting universal properties. The theorem does not say that left adjoints preserve limits, and free constructions often fail to do so: the free group on a one-point set is $\mathbb Z$, and a product of two one-point sets is again a one-point set, yet $\mathbb Z\times\mathbb Z\not\cong\mathbb Z$. What the theorem does give is a shortcut for colimits. For example, the free group functor sends coproducts of sets, namely disjoint unions, to coproducts of groups, namely free products, which is much more efficient than constructing the free product from scratch.
[example: Free Products from Disjoint Unions]
Let $S$ and $T$ be sets, with coproduct injections $\iota_S:S\to S\sqcup T$ and $\iota_T:T\to S\sqcup T$. Write $\eta_X:X\to U(F_{\mathbf{Grp}}(X))$ for the generator map of the free group on a set $X$. By the free group universal property, the function
\begin{align*}
S&\to U(F_{\mathbf{Grp}}(S\sqcup T)),\\
s&\mapsto \eta_{S\sqcup T}(\iota_S(s))
\end{align*}
extends uniquely to a group homomorphism
\begin{align*}
a:F_{\mathbf{Grp}}(S)\to F_{\mathbf{Grp}}(S\sqcup T),
\end{align*}
and similarly the function
\begin{align*}
T&\to U(F_{\mathbf{Grp}}(S\sqcup T)),\\
t&\mapsto \eta_{S\sqcup T}(\iota_T(t))
\end{align*}
extends uniquely to a group homomorphism
\begin{align*}
b:F_{\mathbf{Grp}}(T)\to F_{\mathbf{Grp}}(S\sqcup T).
\end{align*}
We show that $F_{\mathbf{Grp}}(S\sqcup T)$ has the universal property of the free product $F_{\mathbf{Grp}}(S)*F_{\mathbf{Grp}}(T)$. Let $G$ be a group, and let
\begin{align*}
p:F_{\mathbf{Grp}}(S)\to G,\qquad q:F_{\mathbf{Grp}}(T)\to G
\end{align*}
be group homomorphisms. Define a function $r:S\sqcup T\to U(G)$ by
\begin{align*}
r(\iota_S(s))&=p(\eta_S(s)),\\
r(\iota_T(t))&=q(\eta_T(t)).
\end{align*}
Since $S\sqcup T$ is a disjoint union, these two formulas define exactly one function on all of $S\sqcup T$. By freeness, $r$ extends uniquely to a group homomorphism
\begin{align*}
\bar r:F_{\mathbf{Grp}}(S\sqcup T)\to G
\end{align*}
satisfying $\bar r\circ \eta_{S\sqcup T}=r$.
Now $\bar r\circ a=p$. Indeed, both maps $F_{\mathbf{Grp}}(S)\to G$ are group homomorphisms, and for each generator $s\in S$,
\begin{align*}
(\bar r\circ a)(\eta_S(s))
&=\bar r(a(\eta_S(s)))\\
&=\bar r(\eta_{S\sqcup T}(\iota_S(s)))\\
&=r(\iota_S(s))\\
&=p(\eta_S(s)).
\end{align*}
Since maps out of $F_{\mathbf{Grp}}(S)$ are determined by their values on generators, $\bar r\circ a=p$. The same calculation gives $\bar r\circ b=q$, because for $t\in T$,
\begin{align*}
(\bar r\circ b)(\eta_T(t))
&=\bar r(b(\eta_T(t)))\\
&=\bar r(\eta_{S\sqcup T}(\iota_T(t)))\\
&=r(\iota_T(t))\\
&=q(\eta_T(t)).
\end{align*}
If $h:F_{\mathbf{Grp}}(S\sqcup T)\to G$ is another group homomorphism with $h\circ a=p$ and $h\circ b=q$, then for every $s\in S$,
\begin{align*}
h(\eta_{S\sqcup T}(\iota_S(s)))
&=h(a(\eta_S(s)))\\
&=(h\circ a)(\eta_S(s))\\
&=p(\eta_S(s))\\
&=r(\iota_S(s)),
\end{align*}
and for every $t\in T$,
\begin{align*}
h(\eta_{S\sqcup T}(\iota_T(t)))
&=h(b(\eta_T(t)))\\
&=(h\circ b)(\eta_T(t))\\
&=q(\eta_T(t))\\
&=r(\iota_T(t)).
\end{align*}
Thus $h\circ \eta_{S\sqcup T}=r$, so the uniqueness part of the free group universal property gives $h=\bar r$.
Therefore $F_{\mathbf{Grp}}(S\sqcup T)$ is a coproduct of $F_{\mathbf{Grp}}(S)$ and $F_{\mathbf{Grp}}(T)$ in $\mathbf{Grp}$. Since the coproduct of groups is the free product, there is a canonical isomorphism
\begin{align*}
F_{\mathbf{Grp}}(S\sqcup T)\cong F_{\mathbf{Grp}}(S)*F_{\mathbf{Grp}}(T).
\end{align*}
This is the concrete instance of *Left Adjoints Preserve Colimits*: the free group on a disjoint union is obtained by freely combining the two free groups, with no relations imposed between the $S$-generators and the $T$-generators.
[/example]
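The casewise definition of $r$ and the identities $\bar r\circ a=p$ and $\bar r\circ b=q$ can be sketched in code. As simplifying assumptions, the target group is $(\mathbb Z,+)$, a homomorphism out of a free group is recorded by its values on generators, and elements of the disjoint union are tagged pairs. An abelian target cannot detect the absence of relations between the two families of generators, so this sketch illustrates only the factorization. All names are hypothetical.

```python
def evaluate(values, word):
    """Extend a map on generators to words of (generator, sign) letters in (Z, +)."""
    return sum(sign * values[gen] for gen, sign in word)

# p and q, recorded by their values on the generators of F(S) and F(T).
p_on_S = {"s1": 2, "s2": -1}
q_on_T = {"t1": 7}

# r on the disjoint union, defined casewise on tagged elements.
r = {("S", s): value for s, value in p_on_S.items()}
r.update({("T", t): value for t, value in q_on_T.items()})

def a(word):
    """F(S) -> F(S + T): relabel each generator through the S-injection."""
    return tuple((("S", gen), sign) for gen, sign in word)

def b(word):
    """F(T) -> F(S + T): relabel each generator through the T-injection."""
    return tuple((("T", gen), sign) for gen, sign in word)

w = (("s1", 1), ("s2", -1), ("s1", 1))
x = (("t1", -1), ("t1", -1))

assert evaluate(r, a(w)) == evaluate(p_on_S, w)     # r_bar . a = p on this word
assert evaluate(r, b(x)) == evaluate(q_on_T, x)     # r_bar . b = q on this word
assert evaluate(r, a(w) + b(x)) == evaluate(p_on_S, w) + evaluate(q_on_T, x)
```

Tagging keeps the two copies apart even if $S$ and $T$ share element names, which is exactly what makes the casewise definition of $r$ unambiguous.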
## Reflections and Coreflections in Full Subcategories
When does a subcategory inherit enough structure that every ambient object has a best approximation inside it? A naive approximation can fail in two ways: it may depend on arbitrary choices, or it may not receive all maps from the original object into objects of the subcategory. For example, every group has a best abelian approximation, and every completely regular Hausdorff space has a best compact Hausdorff enlargement. These are not merely constructions; they are adjoints to inclusions of full subcategories.
[definition: Reflective Subcategory]
Let $\mathcal A$ be a full subcategory of $\mathcal C$, and let $I: \mathcal A \hookrightarrow \mathcal C$ be the inclusion functor. The subcategory $\mathcal A$ is reflective in $\mathcal C$ if $I$ has a left adjoint $L: \mathcal C \to \mathcal A$.
[/definition]
The object $L(c)$ is called the reflection of $c$ in $\mathcal A$. The unit $\eta_c: c \to I(L(c))$ is the universal arrow from $c$ to the inclusion.
There is also a dual approximation problem: instead of mapping an ambient object into the subcategory, one may ask for the best object of the subcategory mapping out to it. This reverses the universal arrow and gives the companion notion of coreflectivity.
[definition: Coreflective Subcategory]
Let $\mathcal A$ be a full subcategory of $\mathcal C$, and let $I: \mathcal A \hookrightarrow \mathcal C$ be the inclusion functor. The subcategory $\mathcal A$ is coreflective in $\mathcal C$ if $I$ has a right adjoint $R: \mathcal C \to \mathcal A$.
[/definition]
Coreflections reverse the direction of the universal arrow: the counit has the form $I(R(c)) \to c$. Many algebraic examples in this chapter are reflections rather than coreflections because they impose relations or completion conditions.
For reflective subcategories, the practical question is how to prove that an inclusion has a left adjoint without constructing the whole functor in advance. The next criterion reduces that task to producing the correct universal arrow for each ambient object.
[quotetheorem:4151]
[citeproof:4151]
The criterion is the working test for reflectivity: construct the universal map into the subcategory, then functoriality follows from uniqueness. Fullness of the subcategory is part of what makes the statement clean, because morphisms between reflected objects are exactly the ambient morphisms between them. The limitation is that the criterion does not construct $L(c)$ for us; it only says that once the universal arrows exist objectwise, they automatically assemble into a left adjoint. In examples, the main work is therefore to identify the smallest obstruction that must be killed or the minimal enlargement that must be added.
[example: Abelianization as Reflection]
Let $\mathbf{Ab}$ be the full subcategory of $\mathbf{Grp}$ consisting of abelian groups. For a group $G$, let $[G,G]$ be the subgroup generated by all commutators
\begin{align*}
[g,h]=ghg^{-1}h^{-1}.
\end{align*}
This subgroup is normal: if $x,g,h\in G$, then
\begin{align*}
x[g,h]x^{-1}
&=xghg^{-1}h^{-1}x^{-1}\\
&=(xgx^{-1})(xhx^{-1})(xg^{-1}x^{-1})(xh^{-1}x^{-1})\\
&=(xgx^{-1})(xhx^{-1})(xgx^{-1})^{-1}(xhx^{-1})^{-1}\\
&=[xgx^{-1},xhx^{-1}],
\end{align*}
so conjugating a commutator gives another commutator, and therefore conjugating any product of commutators and their inverses keeps it inside $[G,G]$.
Set
\begin{align*}
G^{\mathrm{ab}}=G/[G,G],
\end{align*}
and let $\eta_G:G\to G^{\mathrm{ab}}$ be the quotient homomorphism. The quotient is abelian because for $g,h\in G$,
\begin{align*}
(hg)^{-1}(gh)
&=g^{-1}h^{-1}gh\\
&=[g^{-1},h^{-1}]
\in [G,G],
\end{align*}
so $gh[G,G]=hg[G,G]$. Hence
\begin{align*}
(g[G,G])(h[G,G])
&=gh[G,G]\\
&=hg[G,G]\\
&=(h[G,G])(g[G,G]).
\end{align*}
Now let $A$ be an abelian group and let $f:G\to A$ be a group homomorphism. For every commutator $[g,h]\in G$,
\begin{align*}
f([g,h])
&=f(ghg^{-1}h^{-1})\\
&=f(g)f(h)f(g)^{-1}f(h)^{-1}\\
&=f(g)f(g)^{-1}f(h)f(h)^{-1}\\
&=1_A,
\end{align*}
where the middle equality uses commutativity in $A$. Since $[G,G]$ is generated by commutators, every element of $[G,G]$ is a finite product of commutators and their inverses, and $f$ sends each such factor to $1_A$. Thus $[G,G]\subseteq \ker(f)$.
Define
\begin{align*}
\bar f:G^{\mathrm{ab}}\to A,\qquad \bar f(g[G,G])=f(g).
\end{align*}
This is well-defined: if $g[G,G]=g'[G,G]$, then $g'^{-1}g\in [G,G]\subseteq \ker(f)$, so
\begin{align*}
f(g')^{-1}f(g)
&=f(g'^{-1}g)\\
&=1_A,
\end{align*}
and hence $f(g)=f(g')$. It is a group homomorphism because
\begin{align*}
\bar f\bigl((g[G,G])(h[G,G])\bigr)
&=\bar f(gh[G,G])\\
&=f(gh)\\
&=f(g)f(h)\\
&=\bar f(g[G,G])\bar f(h[G,G]).
\end{align*}
Also
\begin{align*}
(\bar f\circ \eta_G)(g)=\bar f(g[G,G])=f(g),
\end{align*}
so $f$ factors through $\eta_G$.
The factorization is unique. If $u:G^{\mathrm{ab}}\to A$ is another group homomorphism with $u\circ \eta_G=f$, then for every coset $g[G,G]\in G^{\mathrm{ab}}$,
\begin{align*}
u(g[G,G])
&=u(\eta_G(g))\\
&=(u\circ \eta_G)(g)\\
&=f(g)\\
&=\bar f(g[G,G]).
\end{align*}
Thus $u=\bar f$. Therefore every homomorphism from $G$ to an abelian group factors uniquely through $G^{\mathrm{ab}}$, so $(-)^{\mathrm{ab}}:\mathbf{Grp}\to\mathbf{Ab}$ is left adjoint to the inclusion $\mathbf{Ab}\hookrightarrow\mathbf{Grp}$.
[/example]
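For a finite group the commutator subgroup and the quotient can be computed directly. The sketch below assumes $G=S_3$, encoded as permutation tuples, and checks that the sign homomorphism to the abelian group $\{\pm 1\}$ is constant on cosets of $[G,G]$, which is what the factorization through $G^{\mathrm{ab}}$ requires. All function names are hypothetical.

```python
from itertools import permutations

def compose(p, q):
    """Composite permutation i -> p[q[i]]."""
    return tuple(p[i] for i in q)

def inverse(p):
    out = [0] * len(p)
    for i, j in enumerate(p):
        out[j] = i
    return tuple(out)

def generated_subgroup(gens, identity):
    """Closure under multiplication, which suffices in a finite group."""
    subgroup, frontier = {identity}, [identity]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = compose(x, g)
            if y not in subgroup:
                subgroup.add(y)
                frontier.append(y)
    return subgroup

def sign(p):
    """Sign of a permutation, computed by counting inversions."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return -1 if inversions % 2 else 1

G = list(permutations(range(3)))
e = tuple(range(3))

# All commutators g h g^-1 h^-1, then the subgroup they generate.
commutators = {compose(compose(g, h), compose(inverse(g), inverse(h)))
               for g in G for h in G}
derived = generated_subgroup(commutators, e)
cosets = {frozenset(compose(g, n) for n in derived) for g in G}

assert len(derived) == 3                 # [G, G] is the alternating group A_3
assert len(cosets) == 2                  # so the abelianization has order 2
assert all(len({sign(g) for g in coset}) == 1 for coset in cosets)
```

Because `sign` takes a single value on each coset, it descends to a well-defined map on the quotient, matching the well-definedness step for $\bar f$.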
This reflection is a model for many algebraic quotients: force a desired property by dividing out the smallest obstruction to that property. In abelianization, the obstruction is measured by commutators, and quotienting by $[G,G]$ is exactly what makes every map to an abelian group compatible with the target structure. The next example has a different flavour: instead of killing elements, it formally inverts nonzero elements, and the universal property only works under hypotheses that prevent denominators from collapsing to zero.
[example: Field of Fractions as a Partial Left Adjoint]
Let $\mathbf{Dom}$ be the category of integral domains and injective unital ring homomorphisms, and let $\mathbf{Field}$ be the full subcategory whose objects are fields. For an integral domain $R$, write
\begin{align*}
\operatorname{Frac}(R)=\{a/b:a,b\in R,\ b\neq 0\},
\end{align*}
where $a/b=a'/b'$ means $ab'=a'b$. The map
\begin{align*}
\eta_R:R\to \operatorname{Frac}(R),\qquad \eta_R(r)=r/1
\end{align*}
is injective, since $\eta_R(r)=\eta_R(r')$ means $r\cdot 1=r'\cdot 1$, hence $r=r'$.
Let $K$ be a field and let $f:R\to K$ be an injective ring homomorphism. Since $f$ is injective, if $b\neq 0$ in $R$, then $f(b)\neq 0$ in $K$, so $f(b)^{-1}$ exists. Define
\begin{align*}
\bar f:\operatorname{Frac}(R)\to K,\qquad \bar f(a/b)=f(a)f(b)^{-1}.
\end{align*}
This is well-defined. If $a/b=a'/b'$, then $ab'=a'b$, so applying $f$ gives
\begin{align*}
f(a)f(b')=f(a')f(b).
\end{align*}
Multiplying both sides in $K$ by $f(b)^{-1}f(b')^{-1}$ gives
\begin{align*}
f(a)f(b')f(b)^{-1}f(b')^{-1}
&=f(a')f(b)f(b)^{-1}f(b')^{-1},\\
f(a)f(b)^{-1}
&=f(a')f(b')^{-1},
\end{align*}
using commutativity of the field $K$. Hence the value of $\bar f(a/b)$ does not depend on the chosen representative.
The map $\bar f$ is a ring homomorphism. For addition,
\begin{align*}
\bar f(a/b+a'/b')
&=\bar f((ab'+a'b)/(bb'))\\
&=f(ab'+a'b)f(bb')^{-1}\\
&=(f(a)f(b')+f(a')f(b))(f(b)f(b'))^{-1}\\
&=(f(a)f(b')+f(a')f(b))f(b)^{-1}f(b')^{-1}\\
&=f(a)f(b)^{-1}+f(a')f(b')^{-1}\\
&=\bar f(a/b)+\bar f(a'/b').
\end{align*}
For multiplication,
\begin{align*}
\bar f((a/b)(a'/b'))
&=\bar f((aa')/(bb'))\\
&=f(aa')f(bb')^{-1}\\
&=f(a)f(a')f(b)^{-1}f(b')^{-1}\\
&=(f(a)f(b)^{-1})(f(a')f(b')^{-1})\\
&=\bar f(a/b)\bar f(a'/b').
\end{align*}
Also
\begin{align*}
\bar f(1_{\operatorname{Frac}(R)})
&=\bar f(1/1)\\
&=f(1_R)f(1_R)^{-1}\\
&=1_K,
\end{align*}
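so $\bar f$ is a unital ring homomorphism between fields. Such a map is automatically injective, because its kernel is a proper ideal of the field $\operatorname{Frac}(R)$ and hence zero, so $\bar f$ is a morphism of $\mathbf{Field}$. It extends $f$ because for every $r\in R$,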
\begin{align*}
(\bar f\circ \eta_R)(r)
&=\bar f(r/1)\\
&=f(r)f(1_R)^{-1}\\
&=f(r).
\end{align*}
The extension is unique. If $h:\operatorname{Frac}(R)\to K$ is a field homomorphism with $h\circ \eta_R=f$, then for $a,b\in R$ with $b\neq 0$,
\begin{align*}
h(a/b)
&=h((a/1)(b/1)^{-1})\\
&=h(a/1)h(b/1)^{-1}\\
&=f(a)f(b)^{-1}\\
&=\bar f(a/b).
\end{align*}
Thus $h=\bar f$. Therefore $\operatorname{Frac}(R)$ is the reflection of $R$ into fields inside the category of integral domains with injective maps. The construction is only a partial left adjoint from the viewpoint of all commutative rings and all ring homomorphisms: a homomorphism to a field that is not injective, such as the reduction map $\mathbb Z\to\mathbb F_p$, sends some nonzero element to $0$, and if $R$ has zero divisors then every homomorphism to a field does so. In either case the formula $a/b\mapsto f(a)f(b)^{-1}$ is undefined for some nonzero denominator, so it cannot be made universal.
[/example]
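The role of injectivity can be seen in a small computation. The sketch below assumes $R=\mathbb Z$ and compares the inclusion $\mathbb Z\to\mathbb Q$, modelled with Python's `fractions.Fraction`, against the non-injective reduction $\mathbb Z\to\mathbb F_5$. The helper `extend` and the names `to_Q` and `to_F5` are hypothetical.

```python
from fractions import Fraction

def extend(f, mul, inv):
    """Send a/b to f(a) * f(b)^{-1} in the target field (sketch)."""
    def f_bar(a, b):
        return mul(f(a), inv(f(b)))
    return f_bar

# Injective case: the inclusion Z -> Q.
to_Q = extend(Fraction, lambda x, y: x * y, lambda x: 1 / x)
# The value does not depend on the representative: 2/4 = 1/2.
assert to_Q(2, 4) == to_Q(1, 2) == Fraction(1, 2)

# Non-injective case: reduction Z -> F_5, where 5 maps to 0.
to_F5 = extend(lambda n: n % 5,
               lambda x, y: (x * y) % 5,
               lambda x: pow(x, -1, 5))      # modular inverse (Python 3.8+)
assert to_F5(1, 2) == 3                       # 2 * 3 = 6 = 1 in F_5

try:
    to_F5(1, 5)                               # the denominator 5 maps to 0
    extended = True
except ValueError:
    extended = False
assert not extended                           # the formula has no value at 1/5
```

The failure at the fraction $1/5$ is the computational form of the obstruction: the formula needs $f(b)$ to be invertible for every nonzero $b$, and that is exactly what injectivity of $f$ guarantees.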
The topological analogue has a similar shape but uses compactness rather than commutativity. Here the obstruction is not an algebraic relation but a failure of compactness: continuous maps from $X$ into compact Hausdorff spaces may behave as though $X$ has limiting points that are not actually present. The Stone-Čech compactification adds precisely enough ideal limiting information to make all such maps extend, while preserving the same universal-arrow pattern as the algebraic reflections.
[quotetheorem:4152]
[citeproof:4152]
The construction of this theorem belongs to topology and uses $\beta X$ from bounded continuous functions or from ultrafilters. The completely regular Hausdorff hypothesis is not decorative: it is the condition that gives enough continuous maps into compact Hausdorff spaces for the construction to separate points and closed sets in the required way. The limitation is categorical as well as topological, since the reflection statement depends on choosing the correct ambient category of spaces and morphisms. In this course it is used as a quoted example of a reflection: compact Hausdorff spaces sit reflectively inside completely regular Hausdorff spaces when morphisms are continuous maps and the ambient category is chosen appropriately.
The main lesson of this chapter is that adjunctions are structure detectors. Free functors detect generators, forgetful functors explain limit computations, and reflective subcategories turn familiar quotients or completions into universal arrows.
The free-forgetful examples show that an adjunction does more than package a universal property: it measures exactly what structure is created and what is remembered. That perspective leads naturally to units, counits, equivalences, and monads, which record the internal dynamics of an adjunction rather than just its external hom-set description.
# 3. Adjunctions, Equivalences, and Monads
After Chapter 1's unit-counit description and Chapter 2's free-forgetful examples, adjunctions are not only a way to package universal properties; they also measure how much information is lost when passing between categories. In this chapter we study three increasingly structured outcomes of an adjunction: fully faithful adjoints, equivalences of categories, and the monads and comonads generated by adjunctions. The guiding theme is that the unit and counit record the exact comparison between an object and the result of moving it across the adjunction and back.
## Fully Faithful Adjoints and the Unit-Counit Test
When does an adjoint functor embed one category inside another without identifying or losing morphisms? The answer is encoded in only half of the adjunction data: for a left adjoint it is the unit, and for a right adjoint it is the counit.
[definition: Fully Faithful Functor]
A functor $F: \mathcal C \to \mathcal D$ is fully faithful if, for every pair of objects $c,c' \in \mathcal C$, the function
\begin{align*}
\mathcal C(c,c') \longrightarrow \mathcal D(Fc,Fc'), \qquad f \longmapsto Ff
\end{align*}
is a bijection.
[/definition]
A fully faithful functor identifies $\mathcal C$ with a full subcategory of $\mathcal D$, up to equivalence of categories. In an adjunction, this condition has a strong universal-property interpretation: every object in the source is recovered exactly after applying the other adjoint and returning. The failure mode is concrete: a left adjoint may identify distinct maps, or its images may acquire maps in $\mathcal D$ that do not come from $\mathcal C$, and then the unit cannot be invertible because $c$ is not recoverable from $GFc$. For example, the free group functor $F: \mathbf{Set} \to \mathbf{Grp}$ is faithful but not full, and its unit $X \to UFX$ is never a bijection, because the empty word lies in $UFX$ but is not the image of any generator.
This suggests a sharp test: in the presence of an adjunction, full faithfulness should be detectable by whether the appropriate comparison map is an isomorphism. The result below makes that test precise for a left adjoint by using the unit components.
[quotetheorem:4153]
[citeproof:4153]
The theorem says that a fully faithful left adjoint presents $\mathcal C$ as a category of objects already satisfying the universal property imposed by $G$. The hypothesis that $F$ is a left adjoint matters: a fully faithful functor without a specified right adjoint still embeds hom-sets, but there is no unit $c \to GFc$ to test. The theorem also does not say that the counit is an isomorphism on every object of $\mathcal D$; that stronger conclusion would force an equivalence. This distinction leads to the dual test, where the right adjoint rather than the left adjoint is the embedding.
[quotetheorem:4154]
[citeproof:4154]
The right-adjoint version is the form most often used for reflective subcategories, because inclusions are commonly right adjoints. The condition cannot be weakened to saying that some counit components are isomorphisms: for an adjunction $F \dashv G$, failure at even one object $d$ means morphisms into or out of $Gd$ may not be faithfully represented after applying $G$. The theorem does not claim that $F$ is fully faithful; in a reflection, the reflector usually identifies many objects with the same reflected object. In examples, the practical test is to compute $FGd \to d$ on objects already in the proposed subcategory.
[example: Abelianization as a Reflector]
Let $I: \mathrm{Ab} \to \mathrm{Grp}$ be the inclusion, and let $\operatorname{ab}: \mathrm{Grp} \to \mathrm{Ab}$ send a group $G$ to $G/[G,G]$. For an abelian group $A$, regarded as a group $IA$, every commutator is trivial:
\begin{align*}
aba^{-1}b^{-1}
&= aa^{-1}bb^{-1} \\
&= e.
\end{align*}
Hence $[IA,IA] = \{e\}$, so
\begin{align*}
\operatorname{ab}(IA)
&= IA/[IA,IA] \\
&= IA/\{e\}.
\end{align*}
The counit $\varepsilon_A: \operatorname{ab}(IA) \to A$ is the map
\begin{align*}
\varepsilon_A(a\{e\}) = a.
\end{align*}
It is well-defined because $a\{e\} = a'\{e\}$ implies $a = a'$. Its inverse sends $a \in A$ to $a\{e\}$, and the two composites are
\begin{align*}
a &\longmapsto a\{e\} \longmapsto a, \\
a\{e\} &\longmapsto a \longmapsto a\{e\}.
\end{align*}
Thus $\varepsilon_A$ is an isomorphism. The computation shows that abelianization imposes no new relation on an already abelian group: all commutators were already equal to the identity.
[/example]
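The claim that abelianization imposes no new relation on an abelian group can be checked by brute force on small finite groups. The following Python sketch is only an illustration under stated assumptions: groups are modelled as sets of permutation tuples, and the helper names `compose`, `inverse`, `generated_subgroup`, and `commutator_subgroup` are hypothetical names introduced here.

```python
from itertools import permutations


def compose(p, q):
    """Return the permutation p∘q: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))


def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def generated_subgroup(generators, n):
    """Subgroup of S_n generated by the given permutations.

    In a finite group, closing under right multiplication by the generators
    already yields a subgroup, because inverses are positive powers.
    """
    identity = tuple(range(n))
    elements = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for g in generators:
            y = compose(x, g)
            if y not in elements:
                elements.add(y)
                frontier.append(y)
    return elements


def commutator_subgroup(group, n):
    """Subgroup generated by all commutators a b a^{-1} b^{-1} of the group."""
    commutators = {
        compose(compose(a, b), compose(inverse(a), inverse(b)))
        for a in group
        for b in group
    }
    return generated_subgroup(commutators, n)


# Non-abelian example (assumption: S_3 as all permutations of {0, 1, 2}).
S3 = set(permutations(range(3)))
# Abelian example (assumption: the cyclic group of order 4, generated by a 4-cycle).
C4 = generated_subgroup({(1, 2, 3, 0)}, 4)

# The order of the abelianization G/[G,G] is |G| / |[G,G]|.
order_ab_S3 = len(S3) // len(commutator_subgroup(S3, 3))
order_ab_C4 = len(C4) // len(commutator_subgroup(C4, 4))
```

For `C4` the commutator subgroup is trivial, so the quotient map is a bijection, in line with the counit computation above. For `S3` the commutator subgroup has three elements, so the reflector genuinely collapses elements.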
Reflective subcategories are the standard setting in which the right adjoint is a fully faithful inclusion. Coreflective subcategories are the dual setting, in which the inclusion is a fully faithful left adjoint.
Because this pattern will recur in examples, it is worth restating the definition in the language of inclusions and reflectors. The emphasis is now on the adjunction to a full subcategory, not on the particular abelianization example.
[definition: Reflective Subcategory]
A full subcategory $\mathcal A \subset \mathcal C$ is reflective if the inclusion functor $I: \mathcal A \to \mathcal C$ has a left adjoint $L: \mathcal C \to \mathcal A$.
[/definition]
For each object $c$ of $\mathcal C$, the unit $c \to ILc$ is the universal map from $c$ to an object of the full subcategory. The counit $LIA \to A$ is an isomorphism because the inclusion is fully faithful.
[example: Localization as Reflection]
Let $S$ be a multiplicative subset of a commutative ring $R$, and write $U: S^{-1}R\text{-}\mathrm{Mod}\to R\text{-}\mathrm{Mod}$ for restriction of scalars. If $N$ and $N'$ are $S^{-1}R$-modules, every $S^{-1}R$-linear map is $R$-linear. Conversely, let $f: UN \to UN'$ be $R$-linear. For $s \in S$ and $n \in N$,
\begin{align*}
s \cdot f\bigl((1/s)n\bigr)
&= f\bigl(s\cdot (1/s)n\bigr) \\
&= f(n).
\end{align*}
Since multiplication by $s$ is invertible on $N'$, this gives
\begin{align*}
f\bigl((1/s)n\bigr)=(1/s)f(n).
\end{align*}
Hence for $r \in R$,
\begin{align*}
f\bigl((r/s)n\bigr)
&= f\bigl(r\cdot (1/s)n\bigr) \\
&= r\cdot f\bigl((1/s)n\bigr) \\
&= r\cdot (1/s)f(n) \\
&= (r/s)f(n).
\end{align*}
Thus $f$ is $S^{-1}R$-linear, so $U$ is fully faithful.
The left adjoint sends an $R$-module $M$ to
\begin{align*}
S^{-1}M = S^{-1}R \otimes_R M.
\end{align*}
For an $S^{-1}R$-module $N$, the counit is
\begin{align*}
\varepsilon_N: S^{-1}R \otimes_R UN \longrightarrow N,
\qquad
(r/s)\otimes n \longmapsto (r/s)n.
\end{align*}
Define $\nu_N: N \to S^{-1}R \otimes_R UN$ by $\nu_N(n)=1\otimes n$. Then
\begin{align*}
\varepsilon_N(\nu_N(n))
&= \varepsilon_N(1\otimes n) \\
&= n,
\end{align*}
and, for a pure tensor $(r/s)\otimes n$,
\begin{align*}
\nu_N\bigl(\varepsilon_N((r/s)\otimes n)\bigr)
&= \nu_N((r/s)n) \\
&= 1\otimes (r/s)n \\
&= 1\otimes r(1/s)n \\
&= r\otimes (1/s)n \\
&= (r/s)s\otimes (1/s)n \\
&= (r/s)\otimes s(1/s)n \\
&= (r/s)\otimes n.
\end{align*}
Therefore $\varepsilon_N$ is an isomorphism. The unit $M \to S^{-1}M$ is $m \mapsto 1\otimes m$, and localization is reflective because it freely forces every $s\in S$ to act invertibly while changing nothing on modules where this is already true.
[/example]
## Equivalences as Adjunctions with Invertible Unit and Counit
When should two categories count as having the same mathematical content? Isomorphism of categories is too strict, because it asks for equality after applying inverse functors, while most natural comparisons are inverse only up to coherent isomorphism.
[definition: Equivalence of Categories]
An equivalence of categories between $\mathcal C$ and $\mathcal D$ consists of functors $F: \mathcal C \to \mathcal D$ and $G: \mathcal D \to \mathcal C$ together with natural isomorphisms
\begin{align*}
\eta &: \operatorname{id}_{\mathcal C} \Rightarrow GF, & \varepsilon &: FG \Rightarrow \operatorname{id}_{\mathcal D}.
\end{align*}
[/definition]
The two natural isomorphisms play the same formal roles as the unit and counit of an adjunction. The extra content in an adjunction is the triangle identities, and these can always be arranged by replacing one of the two natural isomorphisms while keeping the functors $F$ and $G$ fixed. Without the triangle identities, the two inverse-up-to-isomorphism comparisons need not compose coherently, so the hom-set bijections used in an adjunction may fail to be inverse in the required natural way. This is the first place where an equivalence of categories is seen to be not merely a pair of weak inverses, but a coherent adjunction whose unit and counit are both invertible.
[quotetheorem:4155]
[citeproof:4155]
This result explains why adjunctions are the correct language for equivalence: an equivalence is an adjunction in which neither approximation loses any information. The invertibility of both unit and counit is essential; if only the counit is invertible, the right adjoint identifies $\mathcal D$ with a reflective subcategory of $\mathcal C$ rather than with all of $\mathcal C$, as the inclusion of abelian groups into groups does. The theorem does not assert equality of categories or equality of objects, only equivalence through coherent isomorphism. The next criterion translates this into the two tests most often checked in practice: hom-set bijectivity and coverage of objects up to isomorphism.
[quotetheorem:3966]
[citeproof:3966]
Both hypotheses are necessary. A fully faithful inclusion of a proper full subcategory need not be essentially surjective, while a functor that hits every isomorphism class may still fail to induce bijections on hom-sets. The criterion does not provide a canonical quasi-inverse; the construction depends on choosing, for each object $d$, a representative $Gd$ and an isomorphism $FGd \cong d$. In computations, one therefore first proves full faithfulness on morphisms, then separately checks that every target object is isomorphic to one in the image.
[example: Finite-Dimensional Vector Spaces and Matrices]
Let $k$ be a field and let $\mathrm{Mat}_k$ have objects natural numbers and morphisms $m \to n$ given by $n \times m$ matrices over $k$. Define $F: \mathrm{Mat}_k \to \mathrm{FinVect}_k$ by $F(n)=k^n$, and send a matrix $A=(a_{ij})$ to the linear map
\begin{align*}
k^m &\longrightarrow k^n, \\
(x_1,\ldots,x_m) &\longmapsto
\left(\sum_{j=1}^m a_{1j}x_j,\ldots,\sum_{j=1}^m a_{nj}x_j\right).
\end{align*}
We first check full faithfulness. For fixed $m,n$, the map
\begin{align*}
\mathrm{Mat}_k(m,n) \longrightarrow \mathrm{FinVect}_k(k^m,k^n)
\end{align*}
sends an $n \times m$ matrix to its associated linear map. If two matrices $A=(a_{ij})$ and $B=(b_{ij})$ induce the same linear map, then for the $j$th standard basis vector $e_j \in k^m$,
\begin{align*}
A e_j
&=
(a_{1j},\ldots,a_{nj}), \\
B e_j
&=
(b_{1j},\ldots,b_{nj}).
\end{align*}
Equality of the induced maps gives $Ae_j=Be_j$ for every $j$, so $a_{ij}=b_{ij}$ for every $i,j$; hence $A=B$. Thus the map on hom-sets is injective.
For surjectivity, let $T:k^m \to k^n$ be a linear map. Write
\begin{align*}
T(e_j)=(a_{1j},\ldots,a_{nj})
\end{align*}
for each $1 \leq j \leq m$, and let $A=(a_{ij})$. Every vector $x=(x_1,\ldots,x_m)$ satisfies
\begin{align*}
x
&= \sum_{j=1}^m x_j e_j,
\end{align*}
so by linearity,
\begin{align*}
T(x)
&= T\left(\sum_{j=1}^m x_j e_j\right) \\
&= \sum_{j=1}^m x_j T(e_j) \\
&= \sum_{j=1}^m x_j(a_{1j},\ldots,a_{nj}) \\
&= \left(\sum_{j=1}^m a_{1j}x_j,\ldots,\sum_{j=1}^m a_{nj}x_j\right).
\end{align*}
This is exactly the linear map associated to $A$, so every linear map comes from a unique matrix. Therefore $F$ is fully faithful.
The functor is also essentially surjective: if $V$ is finite-dimensional, choose a basis $(v_1,\ldots,v_n)$ of $V$. The map
\begin{align*}
k^n &\longrightarrow V, \\
(\lambda_1,\ldots,\lambda_n) &\longmapsto \sum_{i=1}^n \lambda_i v_i
\end{align*}
is an isomorphism, because the basis condition says every vector of $V$ has a unique expression in that form. Hence every object of $\mathrm{FinVect}_k$ is isomorphic to one in the image of $F$. By the *Fully Faithful and Essentially Surjective Criterion*, $F$ is an equivalence of categories.
This equivalence is not an isomorphism of categories: $\mathrm{Mat}_k$ has one chosen object $n$ for each dimension, while $\mathrm{FinVect}_k$ contains many distinct vector spaces isomorphic to $k^n$.
[/example]
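The two halves of the full-faithfulness argument, and the functoriality of $F$, can be tried on concrete matrices. The Python sketch below is a minimal illustration, assuming integer entries regarded as elements of $\mathbb Q$ and assuming that composition in $\mathrm{Mat}_k$ is matrix multiplication; the function names `apply`, `matmul`, and `matrix_of` are hypothetical.

```python
def apply(A, x):
    """Apply the n×m matrix A (a list of n rows) to a vector x of length m."""
    return tuple(sum(a * xj for a, xj in zip(row, x)) for row in A)


def matmul(B, A):
    """Matrix product BA, i.e. the composite m -> n -> p in Mat_k."""
    return [
        [sum(B[i][l] * A[l][j] for l in range(len(A))) for j in range(len(A[0]))]
        for i in range(len(B))
    ]


def matrix_of(T, m):
    """Recover the matrix of a linear map T on k^m from T(e_1), ..., T(e_m).

    Assumes m >= 1 so that the number of rows can be read off T(e_1).
    """
    columns = [T(tuple(1 if i == j else 0 for i in range(m))) for j in range(m)]
    n = len(columns[0])
    return [[columns[j][i] for j in range(m)] for i in range(n)]


A = [[1, 2, 3], [4, 5, 6]]      # a morphism 3 -> 2 in Mat_k
B = [[1, 0], [2, 1], [0, 3]]    # a morphism 2 -> 3 in Mat_k
x = (1, -1, 2)

# Functoriality: the map of BA agrees with the composite of the two maps.
composite_agrees = apply(matmul(B, A), x) == apply(B, apply(A, x))

# Surjectivity and injectivity on hom-sets: the matrix is recovered from its map.
recovered = matrix_of(lambda v: apply(A, v), 3)
```

Here `matrix_of` mirrors the surjectivity step of the example: the columns of the recovered matrix are the images of the standard basis vectors.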
A skeleton of a category is a full subcategory containing exactly one object from each isomorphism class. The criterion shows that the inclusion of a skeleton is always an equivalence, which is why category theory usually treats equivalent categories as having the same structure.
## Monads and Comonads from Adjunctions
What remains if we move across an adjunction and then return to the category we started from? The composite endofunctor carries extra structure: multiplication comes from the counit, and the unit comes from the unit of the adjunction.
[definition: Monad]
A monad on a category $\mathcal C$ is a triple $(T,\eta,\mu)$ consisting of a functor $T: \mathcal C \to \mathcal C$, a natural transformation $\eta: \operatorname{id}_{\mathcal C} \Rightarrow T$, and a natural transformation $\mu: T^2 \Rightarrow T$ such that
\begin{align*}
\mu \circ T\eta &= \operatorname{id}_T, & \mu \circ \eta T &= \operatorname{id}_T, & \mu \circ T\mu &= \mu \circ \mu T.
\end{align*}
[/definition]
The unit inserts a plain object into the structured world described by $T$, while multiplication flattens two layers of that structure into one. The obstruction to being a monad is coherence: an endofunctor with a unit-like map but no associative multiplication cannot support a well-defined algebra theory. In an adjunction, the only possible source of multiplication is the counit $FG \Rightarrow \operatorname{id}_{\mathcal D}$, because two crossings $GFGF$ are shortened by cancelling the middle $FG$.
[quotetheorem:4156]
[citeproof:4156]
The theorem is useful because it gives a recipe rather than merely an existence statement: compute $T=GF$, use the adjunction unit as $\eta$, and obtain $\mu$ by applying $G$ to the counit after $F$. The hypotheses cannot be dropped to an arbitrary pair of functors $F,G$, since without the triangle identities the unit laws for $T$ may fail. The theorem does not say that every monad arises from a unique adjunction; many different adjunctions can induce the same monad, and later comparison functors measure this ambiguity.
There is a second structure to isolate before returning to examples: the same adjunction can be read from the $\mathcal D$-side instead of the $\mathcal C$-side. From that viewpoint the issue is not how to collapse two layers of algebraic structure, but how to compare an object with a canonical expansion of it and how to repeat that expansion coherently.
The dual structure does not describe how to evaluate freely generated operations; it describes how an object can be expanded into a canonical approximation and then consistently expanded again. Thus the formal definition must reverse the directions of the monad structure maps: counit replaces unit, and comultiplication replaces multiplication.
To use the target-side construction independently of its adjunction, we need a name for exactly this reversed package of data. The definition below isolates the endofunctor, the map back to the identity, and the coherent rule for duplicating the approximation, so that later results can recognize the same pattern outside a specific adjunction.
[definition: Comonad]
A comonad on a category $\mathcal D$ is a triple $(Q,\varepsilon,\delta)$ consisting of a functor $Q: \mathcal D \to \mathcal D$, a natural transformation $\varepsilon: Q \Rightarrow \operatorname{id}_{\mathcal D}$, and a natural transformation $\delta: Q \Rightarrow Q^2$ satisfying the duals of the monad unit and associativity axioms.
[/definition]
The monad construction used $GF$ to describe what the adjunction does after returning to the source category. On the target side there is a dual question: if an object of $\mathcal D$ is moved across the adjunction by $G$ and then sent back by $F$, what universal structure controls the resulting approximation? The answer should use the counit to compare $FGd$ with $d$, and the triangle identities should provide the coassociativity and counit laws needed for a comonad.
[quotetheorem:4157]
[citeproof:4157]
The comonad records what happens on the codomain side of the adjunction, where objects are first moved back by $G$ and then freely returned by $F$. The counit is essential: without it there is no canonical map from the free approximation $FGd$ back to $d$. This theorem does not make $FG$ idempotent in general; idempotence occurs only in special reflective or coreflective situations. The list monad below illustrates the non-idempotent case on the monad side.
[example: The List Monad from Free Monoids]
Let $F: \mathrm{Set} \to \mathrm{Mon}$ send a set $X$ to the free monoid $X^*$ of finite words in elements of $X$, and let $U: \mathrm{Mon} \to \mathrm{Set}$ be the forgetful functor. The induced endofunctor is
\begin{align*}
T(X)
&= U(FX) \\
&= U(X^*) \\
&= X^*,
\end{align*}
so $T(X)$ is the set of finite lists in $X$.
The unit $\eta_X: X \to X^*$ sends an element to the corresponding one-letter word:
\begin{align*}
\eta_X(x)=[x].
\end{align*}
Applying $T$ twice gives
\begin{align*}
T^2(X)
&= T(TX) \\
&= T(X^*) \\
&= (X^*)^*,
\end{align*}
whose elements are finite lists of finite lists, such as
\begin{align*}
\bigl[[x_{11},\ldots,x_{1k_1}],\ldots,[x_{n1},\ldots,x_{nk_n}]\bigr].
\end{align*}
The counit of the free-forgetful adjunction at a monoid $M$ evaluates a word in elements of $M$ by multiplying its letters in $M$. Taking $M=X^*$, where multiplication is concatenation of words, the monad multiplication
\begin{align*}
\mu_X:(X^*)^* \longrightarrow X^*
\end{align*}
is therefore
\begin{align*}
\mu_X\bigl([[x_{11},\ldots,x_{1k_1}],\ldots,[x_{n1},\ldots,x_{nk_n}]]\bigr)
&=
[x_{11},\ldots,x_{1k_1}]\cdots [x_{n1},\ldots,x_{nk_n}] \\
&=
[x_{11},\ldots,x_{1k_1},\ldots,x_{n1},\ldots,x_{nk_n}].
\end{align*}
For a word $w=[x_1,\ldots,x_n]\in X^*$, the two unit laws become
\begin{align*}
\mu_X(T\eta_X(w))
&= \mu_X([[x_1],\ldots,[x_n]]) \\
&= [x_1,\ldots,x_n] \\
&= w,
\end{align*}
and
\begin{align*}
\mu_X(\eta_{T X}(w))
&= \mu_X([w]) \\
&= w.
\end{align*}
Thus the list monad inserts elements as one-letter lists and removes one layer of nesting by concatenating the inner lists.
[/example]
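The unit, the multiplication, and the three monad axioms for lists can be spot-checked directly. The following Python sketch is an illustration only: it assumes that finite words are modelled by Python lists, and the names `unit`, `fmap`, and `mu` are hypothetical stand-ins for $\eta_X$, the action of $T$ on functions, and $\mu_X$.

```python
def unit(x):
    """eta_X: send an element to the one-letter word [x]."""
    return [x]


def fmap(f, xs):
    """T(f): apply f to every letter of a word."""
    return [f(x) for x in xs]


def mu(xss):
    """mu_X: concatenate a list of lists into a single list."""
    return [x for xs in xss for x in xs]


w = [1, 2, 3]

# Unit laws: mu ∘ T(eta) = id and mu ∘ eta_T = id.
left_unit = mu(fmap(unit, w))
right_unit = mu(unit(w))

# Associativity: mu ∘ T(mu) = mu ∘ mu_T on a list of lists of lists.
triple = [[[1], [2, 3]], [[4]], []]
flatten_inner_first = mu(fmap(mu, triple))
flatten_outer_first = mu(mu(triple))
```

A finite check on sample inputs is of course not a proof; it only makes the shape of each axiom concrete.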
This example also shows how to compute an induced monad in practice: identify the free object, forget back to the original category, then translate the counit into the operation that removes one layer of freeness. For lists, removing one layer means concatenating nested lists. The next example has a different character because its multiplication is an isomorphism rather than a genuinely many-to-one flattening map.
[example: Localization Monad]
For the localization adjunction $S^{-1}(-) \dashv U$ between $R$-modules and $S^{-1}R$-modules, the induced monad on $R\text{-}\mathrm{Mod}$ is
\begin{align*}
T(M)
&= U(S^{-1}M),
\end{align*}
so $T(M)$ is $S^{-1}M$ regarded again as an $R$-module. Applying $T$ twice gives
\begin{align*}
T^2(M)
&= T(T(M)) \\
&= S^{-1}(S^{-1}M).
\end{align*}
An element of $S^{-1}(S^{-1}M)$ may be written as $(m/t)/s$, with $m\in M$ and $s,t\in S$. The monad multiplication is
\begin{align*}
\mu_M:S^{-1}(S^{-1}M)&\longrightarrow S^{-1}M, \\
(m/t)/s&\longmapsto m/(st).
\end{align*}
We verify explicitly that this map is an isomorphism. Define
\begin{align*}
\nu_M:S^{-1}M&\longrightarrow S^{-1}(S^{-1}M), \\
m/t&\longmapsto (m/t)/1.
\end{align*}
Then
\begin{align*}
\mu_M(\nu_M(m/t))
&= \mu_M((m/t)/1) \\
&= m/(1t) \\
&= m/t.
\end{align*}
In the other direction, for $(m/t)/s\in S^{-1}(S^{-1}M)$,
\begin{align*}
\nu_M(\mu_M((m/t)/s))
&= \nu_M(m/(st)) \\
&= (m/(st))/1.
\end{align*}
This equals $(m/t)/s$ in $S^{-1}(S^{-1}M)$: two fractions $x/1$ and $y/s$ agree whenever $s\cdot x=y$, and here
\begin{align*}
s\cdot (m/(st))
&= sm/(st) \\
&= m/t
\end{align*}
in $S^{-1}M$, where the last equality holds because cross-multiplying the fractions $sm/(st)$ and $m/t$ gives the same element of $M$:
\begin{align*}
t(sm)&=st\,m.
\end{align*}
Thus $\nu_M\mu_M=\operatorname{id}$ and $\mu_M\nu_M=\operatorname{id}$, so localizing twice is canonically the same as localizing once. The monad is therefore idempotent: it changes an $R$-module only until every element of $S$ acts invertibly, and a second localization adds no new effect.
[/example]
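The inverse pair $\mu_M,\nu_M$ can be tried out in one concrete case. The Python sketch below is a hedged illustration under strong assumptions that are not part of the example: it takes $R=M=\mathbb Z$ and $S=\{1,2,4,\ldots\}$, so that $S^{-1}M$ is the group of dyadic rationals, and it models an element $(m/t)/s$ of the double localization as a pair `(x, s)`. Because $S$ acts injectively on the dyadic rationals, equality of such pairs reduces to cross-multiplication. The names `mu`, `nu`, and `same_fraction` are hypothetical.

```python
from fractions import Fraction


def is_power_of_two(s):
    """Membership test for the assumed multiplicative set S = {1, 2, 4, ...}."""
    return s >= 1 and s & (s - 1) == 0


def mu(element):
    """Monad multiplication: the pair (m/t, s) is sent to m/(st)."""
    x, s = element
    assert is_power_of_two(s)
    return x / s


def nu(y):
    """Candidate inverse: m/t is sent to the pair (m/t, 1)."""
    return (y, 1)


def same_fraction(p, q):
    """Equality of x/s and y/t in the double localization.

    Valid in this torsion-free setting: x/s = y/t exactly when t*x == s*y.
    """
    (x, s), (y, t) = p, q
    return t * x == s * y


double_fraction = (Fraction(3, 4), 8)   # the element (3/4)/8

flattened = mu(double_fraction)
round_trip = nu(flattened)
```

In a module with $S$-torsion the equality test would need the extra witness $u\in S$ from the definition of localization, so `same_fraction` should not be reused outside this assumption.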
## Algebras over a Monad and the Comparison Functor
If a monad is the shadow of an adjunction, can we reconstruct the category on the other side of the adjunction from the monad alone? The Eilenberg-Moore category answers this by treating a $T$-algebra as an object equipped with an action of the monad.
[definition: Algebra over a Monad]
Let $(T,\eta,\mu)$ be a monad on a category $\mathcal C$. A $T$-algebra is a pair $(c,a)$ where $c \in \mathcal C$ and $a: Tc \to c$ is a morphism such that
\begin{align*}
a \circ \eta_c &= \operatorname{id}_c, & a \circ T a &= a \circ \mu_c.
\end{align*}
[/definition]
The first axiom says that freely inserted structure acts as doing nothing. The second says that acting after flattening two layers of structure agrees with acting layer by layer. If either axiom is omitted, the action map $Tc \to c$ may not describe a genuine algebraic structure: the unit might not act as an identity, or nested formal operations might evaluate differently depending on the order of evaluation. These are exactly the two coherence failures that the monad axioms were designed to prevent.
Once the objects carrying a monad action have been identified, we also need the maps that preserve those actions. The Eilenberg-Moore category collects the coherent algebras and the action-preserving morphisms into the category that the monad itself can reconstruct.
[definition: Eilenberg-Moore Category]
Let $(T,\eta,\mu)$ be a monad on $\mathcal C$. The Eilenberg-Moore category $\mathcal C^T$ has $T$-algebras as objects. A morphism $f: (c,a) \to (c',a')$ is a morphism $f: c \to c'$ in $\mathcal C$ such that
\begin{align*}
f \circ a = a' \circ Tf.
\end{align*}
[/definition]
This category packages all objects that genuinely carry the algebraic structure described by the monad.
[example: Algebras for the List Monad]
For the list monad $T(X)=X^*$ on $\mathrm{Set}$, a $T$-algebra is a set $M$ with a map
\begin{align*}
\alpha:M^*\longrightarrow M
\end{align*}
such that
\begin{align*}
\alpha\circ \eta_M&=\operatorname{id}_M, &
\alpha\circ T\alpha&=\alpha\circ \mu_M.
\end{align*}
The first axiom says that, for every $m\in M$,
\begin{align*}
\alpha([m])
&=(\alpha\circ \eta_M)(m) \\
&=m.
\end{align*}
Given such an algebra, define
\begin{align*}
e&=\alpha([]), &
m\cdot n&=\alpha([m,n]).
\end{align*}
We show that this is a monoid. For the left unit law, apply $\alpha\circ T\alpha=\alpha\circ\mu_M$ to the list of lists $[[],[m]]$:
\begin{align*}
(\alpha\circ T\alpha)([[],[m]])
&=\alpha([\alpha([]),\alpha([m])]) \\
&=\alpha([e,m]) \\
&=e\cdot m,
\end{align*}
while
\begin{align*}
(\alpha\circ\mu_M)([[],[m]])
&=\alpha([m]) \\
&=m.
\end{align*}
Hence $e\cdot m=m$. Similarly, applying the same axiom to $[[m],[]]$ gives
\begin{align*}
m\cdot e
&=\alpha([m,e]) \\
&=\alpha([\alpha([m]),\alpha([])]) \\
&=(\alpha\circ T\alpha)([[m],[]]) \\
&=(\alpha\circ\mu_M)([[m],[]]) \\
&=\alpha([m]) \\
&=m.
\end{align*}
For associativity, apply the algebra axiom to $[[m,n],[p]]$:
\begin{align*}
(m\cdot n)\cdot p
&=\alpha([\alpha([m,n]),\alpha([p])]) \\
&=(\alpha\circ T\alpha)([[m,n],[p]]) \\
&=(\alpha\circ\mu_M)([[m,n],[p]]) \\
&=\alpha([m,n,p]).
\end{align*}
Applying the same axiom to $[[m],[n,p]]$ gives
\begin{align*}
m\cdot(n\cdot p)
&=\alpha([\alpha([m]),\alpha([n,p])]) \\
&=(\alpha\circ T\alpha)([[m],[n,p]]) \\
&=(\alpha\circ\mu_M)([[m],[n,p]]) \\
&=\alpha([m,n,p]).
\end{align*}
Thus $(m\cdot n)\cdot p=m\cdot(n\cdot p)$.
Conversely, if $(M,e,\cdot)$ is a monoid, define $\alpha:M^*\to M$ by
\begin{align*}
\alpha([])
&=e,\\
\alpha([m_1,\ldots,m_n])
&=m_1\cdot m_2\cdots m_n
\end{align*}
where the product may be bracketed in any way, since associativity makes the result independent of the bracketing. Then
\begin{align*}
(\alpha\circ\eta_M)(m)
&=\alpha([m]) \\
&=m,
\end{align*}
because a one-term product is its single entry. For a list of lists $[w_1,\ldots,w_r]$, write
\begin{align*}
w_i=[m_{i1},\ldots,m_{ik_i}].
\end{align*}
Then
\begin{align*}
(\alpha\circ T\alpha)([w_1,\ldots,w_r])
&=\alpha([\alpha(w_1),\ldots,\alpha(w_r)]) \\
&=\alpha(w_1)\cdots \alpha(w_r) \\
&=(m_{11}\cdots m_{1k_1})\cdots(m_{r1}\cdots m_{rk_r}),
\end{align*}
while
\begin{align*}
(\alpha\circ\mu_M)([w_1,\ldots,w_r])
&=\alpha([m_{11},\ldots,m_{1k_1},\ldots,m_{r1},\ldots,m_{rk_r}]) \\
&=m_{11}\cdots m_{1k_1}\cdots m_{r1}\cdots m_{rk_r}.
\end{align*}
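These are equal by associativity of the monoid product, with empty inner lists contributing the identity element $e$. The two constructions are mutually inverse: for any algebra $\alpha$, applying the algebra axiom to $[[m_1,\ldots,m_{n-1}],[m_n]]$ gives $\alpha([m_1,\ldots,m_n])=\alpha([m_1,\ldots,m_{n-1}])\cdot m_n$, so by induction $\alpha$ evaluates every nonempty list as the iterated product. Therefore algebras for the list monad are exactly monoids: the algebra map evaluates a finite list by multiplying its entries.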
[/example]
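The algebra axioms, and the morphism condition $f\circ a=a'\circ Tf$ of the Eilenberg-Moore category, can be tested on sample data. The Python sketch below is illustrative only: it assumes lists model finite words, it folds a list with a binary operation starting from a chosen element, and all names (`algebra_from`, `satisfies_axioms`, `is_algebra_morphism`) are hypothetical.

```python
from functools import reduce


def unit(x):
    return [x]


def fmap(f, xs):
    return [f(x) for x in xs]


def mu(xss):
    return [x for xs in xss for x in xs]


def algebra_from(e, op):
    """Candidate structure map alpha: M* -> M, folding a word with op from e."""
    return lambda word: reduce(op, word, e)


def satisfies_axioms(alpha, elements, nested_words):
    """Spot-check alpha∘eta = id and alpha∘T(alpha) = alpha∘mu on samples."""
    unit_law = all(alpha(unit(m)) == m for m in elements)
    assoc_law = all(alpha(fmap(alpha, w)) == alpha(mu(w)) for w in nested_words)
    return unit_law and assoc_law


def is_algebra_morphism(f, alpha, alpha_prime, words):
    """Spot-check f∘alpha = alpha'∘T(f) on sample words."""
    return all(f(alpha(w)) == alpha_prime(fmap(f, w)) for w in words)


add = algebra_from(0, lambda a, b: a + b)        # integers under addition
concat = algebra_from("", lambda a, b: a + b)    # strings under concatenation
subtract = algebra_from(0, lambda a, b: a - b)   # not a monoid: the unit law fails

elements = [-2, 0, 3]
nested_words = [[[1, 2], [3]], [[], [4]], [[5], [], [6, 7]], []]
string_words = [["ab", "", "cde"], [], ["x"]]
```

With these samples, `len` passes the morphism check from `concat` to `add`, reflecting that length is a monoid homomorphism from strings to integers, while `subtract` already fails the unit axiom because folding `[m]` from `0` gives `-m`.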
This example is the model case for the comparison construction: an object on the structured side of an adjunction should become its underlying object together with the action that evaluates free structure. The possible failure is that the monad may remember only the operations visible after forgetting to $\mathcal C$, while the category $\mathcal D$ might contain extra structure or extra morphism conditions not recoverable from those operations. The comparison functor is the precise device for testing whether such information has been lost.
The construction must therefore assign to every object of $\mathcal D$ a canonical algebra for the induced monad, and it must do so functorially on morphisms. The next result gives this comparison functor and identifies the algebra action supplied by the counit of the adjunction.
[quotetheorem:4158]
[citeproof:4158]
The comparison functor measures how much of $\mathcal D$ is visible from the induced monad on $\mathcal C$. The construction uses the counit essentially: without $G\varepsilon_d$, the object $Gd$ would have no canonical $T$-algebra action. The theorem does not claim that $K$ is always full, faithful, or essentially surjective; those properties are additional monadicity conditions, and they can fail when $G$ forgets structure not encoded by the induced operations. When $K$ is an equivalence, the adjunction is called monadic, and $\mathcal D$ is recovered as the category of algebras for the monad.
[remark: Monadicity as Reconstruction]
The free-forgetful adjunction $\mathrm{Set} \leftrightarrows \mathrm{Mon}$ is monadic: monoids are exactly algebras for the list monad. Reflective localizations are also monadic in an idempotent sense: the algebras are the objects already local with respect to the reflector. General monadicity theorems, such as Beck's theorem, give precise hypotheses under which the comparison functor is an equivalence; those criteria belong to the later study of limits and coequalizers.
[/remark]
Reflective examples are especially good tests for the comparison functor because the induced monad is often idempotent: applying the reflector twice gives no new information. In that situation, the algebras should be understood as objects already satisfying the local condition enforced by the reflector. Sheafification is the geometric example of this pattern.
[example: Sheafification as Reflective Localization]
Let $X$ be a topological space, let $\mathrm{PSh}(X)$ be the category of presheaves of sets on $X$, and let $I:\mathrm{Sh}(X)\to \mathrm{PSh}(X)$ be the inclusion. The sheafification adjunction
\begin{align*}
a:\mathrm{PSh}(X) \rightleftarrows \mathrm{Sh}(X):I
\end{align*}
has unit
\begin{align*}
\eta_P:P\longrightarrow IaP
\end{align*}
for a presheaf $P$ and counit
\begin{align*}
\varepsilon_F:aIF\longrightarrow F
\end{align*}
for a sheaf $F$.
For a sheaf $F$, the presheaf $IF$ already satisfies locality and gluing, so applying sheafification should not change it. Categorically, this is exactly the counit test for a fully faithful right adjoint: since $I$ is the inclusion of a full subcategory, $I$ is fully faithful, and by *Fully Faithful Right Adjoint* the counit
\begin{align*}
\varepsilon_F:aIF\longrightarrow F
\end{align*}
is an isomorphism for every sheaf $F$.
The induced monad on $\mathrm{PSh}(X)$ is
\begin{align*}
T
&= Ia,
\end{align*}
so for a presheaf $P$,
\begin{align*}
T(P)
&= Ia(P),
\end{align*}
which is the associated sheaf $aP$ viewed again as a presheaf. Applying the monad twice gives
\begin{align*}
T^2(P)
&= T(TP) \\
&= Ia(IaP).
\end{align*}
Since $aP$ is already a sheaf, the counit at $aP$ is an isomorphism
\begin{align*}
\varepsilon_{aP}:aI(aP)\longrightarrow aP.
\end{align*}
Therefore the monad multiplication
\begin{align*}
\mu_P
&= I\varepsilon_{aP}:IaI(aP)\longrightarrow IaP
\end{align*}
is an isomorphism of presheaves.
Thus the sheafification monad is idempotent: once a presheaf has been sheafified, applying sheafification again adds no new sections or identifications. Its algebras are precisely presheaves already satisfying the sheaf locality and gluing conditions, because those are exactly the presheaves for which the unit $P\to IaP$ is an isomorphism.
[/example]
The progression of the chapter is therefore: fully faithful adjoints are detected by one side of the unit-counit data; equivalences are adjunctions where both sides are invertible; monads record the effect of crossing an adjunction and returning. This viewpoint prepares the next part of the course, where limits and colimits provide the exact hypotheses under which such reconstructions exist and behave well.
Once adjunctions are understood as a way to reconstruct or encode structure, limits provide the next universal problem to solve. Instead of focusing on individual objects, we now ask for objects determined by entire diagrams, and cones become the basic language for expressing that requirement.
# 4. Limits as Universal Cones
This chapter turns universal properties from individual objects into a systematic calculus. Instead of treating products, equalizers, and pullbacks as separate constructions, we package each of them as a terminal cone over a diagram. This point of view explains why limits are unique up to unique isomorphism, why finite limits can be built from products and equalizers, and why pullbacks compose by pasting.
## Diagrams, Cones, and Terminal Cones
What does it mean to ask for an object that maps compatibly into an entire pattern of objects at once? A diagram records the pattern, while a cone records a candidate object equipped with compatible maps into that pattern. The limit is the universal such candidate.
[definition: Diagram]
Let $J$ and $\mathcal C$ be categories. A diagram of shape $J$ in $\mathcal C$ is a functor $D:J\to \mathcal C$.
[/definition]
The category $J$ is often small and is called the indexing category. Its objects label the objects of the diagram, and its morphisms label the structure maps that must be respected. Without this indexing category, each construction has to carry its own separate list of compatibility equations; the diagram is the device that records all of those equations in one place.
[example: Parallel Pair Diagram]
Let $J$ be the category with objects $0,1$, identity morphisms $\operatorname{id}_0,\operatorname{id}_1$, and two distinct non-identity morphisms $\alpha,\beta:0\to 1$. A diagram $D:J\to\mathcal C$ assigns
\begin{align*}
A&=D(0),&
B&=D(1),&
f&=D(\alpha):A\to B,&
g&=D(\beta):A\to B.
\end{align*}
Thus the shape records a parallel pair $A\rightrightarrows B$. A cone from an object $N$ to this diagram consists of morphisms $\pi_0:N\to A$ and $\pi_1:N\to B$ such that the two non-identity morphisms of $J$ give the equations
\begin{align*}
D(\alpha)\circ \pi_0&=\pi_1,\\
D(\beta)\circ \pi_0&=\pi_1.
\end{align*}
Substituting $D(\alpha)=f$ and $D(\beta)=g$, these become
\begin{align*}
f\circ \pi_0&=\pi_1,\\
g\circ \pi_0&=\pi_1.
\end{align*}
Therefore $f\circ \pi_0=g\circ \pi_0$, since both composites are equal to $\pi_1$. This is exactly why this indexing shape produces equalizers: the cone data is a map into $A$ whose two composites to $B$ agree.
[/example]
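The cone equations for a parallel pair can be tested on finite sets. The Python sketch below is a small illustration with made-up data: the sets, the maps `f` and `g`, and the helper `is_cone` are all assumptions introduced here, with maps out of the vertex stored as dictionaries.

```python
def is_cone(N, pi0, pi1, f, g):
    """Check f∘pi0 = pi1 and g∘pi0 = pi1 on every element of N."""
    return all(f(pi0[n]) == pi1[n] and g(pi0[n]) == pi1[n] for n in N)


A = range(-3, 4)          # source object of the parallel pair


def f(a):
    return a * a


def g(a):
    return a + 2


N = ["p", "q"]            # vertex of a candidate cone
pi0 = {"p": 2, "q": -1}
pi1 = {n: f(pi0[n]) for n in N}          # pi1 is forced by the first cone equation

bad_pi0 = {"p": 0, "q": -1}
bad_pi1 = {n: f(bad_pi0[n]) for n in N}  # fails the second equation at "p"

# Elements of A on which the two maps agree.
agreement_set = [a for a in A if f(a) == g(a)]
```

Every valid `pi0` takes values in `agreement_set`, which is the concrete form of the statement that cone data is a map into $A$ whose two composites to $B$ agree.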
The example suggests what must be abstracted away from the special equalizer shape. Instead of naming two particular arrows, we need a general device that describes a single object mapping compatibly into every object of an arbitrary diagram.
[definition: Cone]
Let $D:J\to \mathcal C$ be a diagram and let $N\in \mathcal C$. A cone from $N$ to $D$ consists of morphisms $\pi_j:N\to D(j)$ for each object $j\in J$ such that for every morphism $u:j\to k$ in $J$,
\begin{align*}
D(u)\circ \pi_j=\pi_k.
\end{align*}
[/definition]
Once cones are objects of comparison, we also need to know when one cone factors through another. The relevant morphism should be a map between the vertices that preserves every projection, because only such maps keep the cone equations intact.
[definition: Morphism of Cones]
Let $(N,(\pi_j)_{j\in J})$ and $(N',(\pi'_j)_{j\in J})$ be cones over $D:J\to \mathcal C$. A morphism of cones $h:(N,(\pi_j))\to (N',(\pi'_j))$ is a morphism $h:N\to N'$ in $\mathcal C$ such that
\begin{align*}
\pi'_j\circ h=\pi_j
\end{align*}
for every object $j\in J$.
[/definition]
With cone morphisms available, all cones over a fixed diagram can be compared inside a category $\operatorname{Cone}(D)$.
The remaining problem is to single out the cone that represents all compatible comparisons at once. Such a cone should be the most efficient one: every other compatible cone maps to it uniquely, so the definition of a limit is a terminal-object condition in this cone category.
[definition: Limit]
Let $D:J\to \mathcal C$ be a diagram. A limit of $D$ is a terminal object in $\operatorname{Cone}(D)$.
[/definition]
If a limit exists, we write its vertex as $\lim D$ and its structure maps as $p_j:\lim D\to D(j)$. The universal property says that for every cone $(N,(\pi_j))$ there is a unique morphism $u:N\to \lim D$ such that $p_j\circ u=\pi_j$ for all $j\in J$.
Because the definition permits many possible models for the same limiting cone, the next question is whether those choices are genuinely interchangeable. The universal property should force any two limits of the same diagram to be uniquely isomorphic in a way that respects all projection maps.
[quotetheorem:4159]
[citeproof:4159]
This theorem justifies speaking of the limit when it exists, because any two choices are canonically isomorphic in a way compatible with all projections. The hypothesis that both cones are already limits is essential: the theorem gives uniqueness, not existence. For instance, in the category of fields, the product of two fields need not exist. The cartesian product ring always has zero divisors, since $(1,0)\cdot(0,1)=(0,0)$, so it is not a field; and no field admits homomorphisms to both $\mathbb F_2$ and $\mathbb F_3$, because field homomorphisms preserve the characteristic, so that pair has no cones at all. Thus the universal property can identify all possible answers, but it cannot by itself guarantee that an answer is present in the category.
## Products, Equalizers, Pullbacks, and Inverse Limits
How do familiar constructions fit into the cone picture? Each arises from a particular indexing category. Changing the shape $J$ changes the compatibility conditions imposed on the cone.
[definition: Product]
Let $(X_i)_{i\in I}$ be a family of objects of $\mathcal C$. A product of this family is a limit of the diagram $D:I\to \mathcal C$ from the discrete category on $I$ with $D(i)=X_i$.
[/definition]
For a discrete indexing category there are no non-identity morphisms, so a cone is only a family of maps into the objects $X_i$. The universal morphism into the product is determined by its components.
[example: Product of Sets]
For a family of sets $(X_i)_{i\in I}$, consider the Cartesian product
\begin{align*}
\prod_{i\in I}X_i=\{(x_i)_{i\in I}:x_i\in X_i\text{ for every }i\in I\}
\end{align*}
with projections $p_i:\prod_{k\in I}X_k\to X_i$ defined by
\begin{align*}
p_i((x_k)_{k\in I})=x_i.
\end{align*}
Given a set $N$ and maps $f_i:N\to X_i$ for every $i\in I$, define
\begin{align*}
u:N&\longrightarrow \prod_{i\in I}X_i,\\
u(n)&=(f_i(n))_{i\in I}.
\end{align*}
This is well-defined because $f_i(n)\in X_i$ for each $i$, so $(f_i(n))_{i\in I}$ is an element of the Cartesian product.
For each $j\in I$ and each $n\in N$,
\begin{align*}
(p_j\circ u)(n)
&=p_j(u(n))\\
&=p_j((f_i(n))_{i\in I})\\
&=f_j(n).
\end{align*}
Thus $p_j\circ u=f_j$ for every $j\in I$. If $v:N\to\prod_{i\in I}X_i$ is another map with $p_j\circ v=f_j$ for every $j$, then for each $n\in N$ write $v(n)=(x_i)_{i\in I}$. For every $j\in I$,
\begin{align*}
x_j
&=p_j((x_i)_{i\in I})\\
&=p_j(v(n))\\
&=(p_j\circ v)(n)\\
&=f_j(n).
\end{align*}
Therefore every component of $v(n)$ equals the corresponding component of $(f_i(n))_{i\in I}$, so
\begin{align*}
v(n)=(f_i(n))_{i\in I}=u(n).
\end{align*}
Since this holds for every $n\in N$, $v=u$. Hence the Cartesian product with its coordinate projections satisfies the product universal property in $\mathrm{Set}$.
[/example]
Products impose no equations between the components, so they only organize independent choices. To express the common categorical task of selecting the part of an object on which two parallel maps become indistinguishable, the indexing shape must force an equality between two composites.
[definition: Equalizer]
Let $f,g:A\to B$ be parallel morphisms in $\mathcal C$. An equalizer of $f$ and $g$ is a limit of the parallel pair diagram $A\rightrightarrows B$.
[/definition]
Concretely, an equalizer is a morphism $e:E\to A$ satisfying $f\circ e=g\circ e$, universal among such morphisms. It extracts the largest subobject of $A$ on which the two maps agree, in whatever sense the ambient category supports.
[example: Equalizer of Group Homomorphisms]
Let $f,g:G\to H$ be group homomorphisms. We show that their equalizer in $\mathrm{Grp}$ is
\begin{align*}
E=\{x\in G:f(x)=g(x)\}
\end{align*}
with the inclusion homomorphism $e:E\to G$.
First $E$ is a subgroup of $G$. Since $f$ and $g$ are homomorphisms,
\begin{align*}
f(1_G)&=1_H,\\
g(1_G)&=1_H,
\end{align*}
so $f(1_G)=g(1_G)$ and hence $1_G\in E$. If $x,y\in E$, then $f(x)=g(x)$ and $f(y)=g(y)$. Therefore
\begin{align*}
f(xy^{-1})
&=f(x)f(y^{-1})\\
&=f(x)f(y)^{-1}\\
&=g(x)g(y)^{-1}\\
&=g(x)g(y^{-1})\\
&=g(xy^{-1}).
\end{align*}
Thus $xy^{-1}\in E$, so $E$ is a subgroup of $G$. The inclusion $e:E\to G$ is then a group homomorphism, and for every $x\in E$,
\begin{align*}
(f\circ e)(x)
&=f(e(x))\\
&=f(x)\\
&=g(x)\\
&=g(e(x))\\
&=(g\circ e)(x).
\end{align*}
Hence $f\circ e=g\circ e$.
Now let $u:K\to G$ be any group homomorphism such that $f\circ u=g\circ u$. For each $k\in K$,
\begin{align*}
f(u(k))
&=(f\circ u)(k)\\
&=(g\circ u)(k)\\
&=g(u(k)),
\end{align*}
so $u(k)\in E$. Define $\bar u:K\to E$ by
\begin{align*}
\bar u(k)=u(k).
\end{align*}
This is well-defined because $u(k)\in E$ for every $k\in K$. For $k,\ell\in K$,
\begin{align*}
\bar u(k\ell)
&=u(k\ell)\\
&=u(k)u(\ell)\\
&=\bar u(k)\bar u(\ell),
\end{align*}
so $\bar u$ is a group homomorphism. Also, for every $k\in K$,
\begin{align*}
(e\circ \bar u)(k)
&=e(\bar u(k))\\
&=e(u(k))\\
&=u(k),
\end{align*}
so $e\circ \bar u=u$. If $v:K\to E$ is another homomorphism with $e\circ v=u$, then for every $k\in K$,
\begin{align*}
v(k)
&=e(v(k))\\
&=(e\circ v)(k)\\
&=u(k)\\
&=\bar u(k),
\end{align*}
where the first equality holds because $e$ is the inclusion. Thus $v=\bar u$. Therefore $e:E\to G$ is universal among homomorphisms into $G$ on which $f$ and $g$ agree, so it is the equalizer of $f$ and $g$ in $\mathrm{Grp}$.
[/example]
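[example: A Small Equalizer Computation]
To see the subgroup formula of the preceding example in a concrete case, take $G=H=\mathbb Z/6$ written additively, let $f$ be multiplication by $4$, and let $g$ be the identity. These particular choices are made only for illustration. Both maps are group homomorphisms, and for an integer $x$ representing a class in $\mathbb Z/6$,
\begin{align*}
f(x)=g(x)
&\Longleftrightarrow 4x\equiv x\pmod 6\\
&\Longleftrightarrow 3x\equiv 0\pmod 6\\
&\Longleftrightarrow x\equiv 0\pmod 2.
\end{align*}
Hence
\begin{align*}
E=\{0,2,4\}\subseteq \mathbb Z/6,
\end{align*}
a subgroup of order $3$, and the equalizer is the inclusion $e:E\to\mathbb Z/6$. Any homomorphism $u:K\to\mathbb Z/6$ with $4u(k)=u(k)$ for all $k\in K$ takes values in $E$, which is exactly the factorisation required by the universal property.
[/example]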
Equalizers compare two maps with the same domain and codomain. A different compatibility problem occurs when two objects map into a common target: one wants pairs of inputs whose images agree, but without a universal construction there is no canonical object representing those compatible pairs.
[definition: Pullback]
Let $f:X\to Z$ and $g:Y\to Z$ be morphisms in $\mathcal C$. A pullback of $f$ and $g$ is a limit of the cospan diagram $X\xrightarrow{f} Z \xleftarrow{g}Y$.
[/definition]
A pullback is usually drawn as a commutative square
\begin{align*}
\begin{array}{ccc}
P & \xrightarrow{\ q\ } & Y\\
{\scriptstyle p}\big\downarrow & & \big\downarrow{\scriptstyle g}\\
X & \xrightarrow{\ f\ } & Z,
\end{array}
\end{align*}
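where $f\circ p=g\circ q$, and where the square is universal among commutative squares over the cospan: for any $a:N\to X$ and $b:N\to Y$ with $f\circ a=g\circ b$, there is a unique $u:N\to P$ with $p\circ u=a$ and $q\circ u=b$.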
[illustration:category-theory-ii-pullback-square]
[example: Pullback of Sets]
For functions $f:X\to Z$ and $g:Y\to Z$, define
\begin{align*}
X\times_ZY=\{(x,y)\in X\times Y:f(x)=g(y)\}.
\end{align*}
Let $p:X\times_ZY\to X$ and $q:X\times_ZY\to Y$ be the restrictions of the Cartesian product projections:
\begin{align*}
p(x,y)&=x,&
q(x,y)&=y.
\end{align*}
For every $(x,y)\in X\times_ZY$,
\begin{align*}
(f\circ p)(x,y)
&=f(p(x,y))\\
&=f(x)\\
&=g(y)\\
&=g(q(x,y))\\
&=(g\circ q)(x,y),
\end{align*}
so $f\circ p=g\circ q$.
Now let $N$ be a set with maps $a:N\to X$ and $b:N\to Y$ satisfying $f\circ a=g\circ b$. Define
\begin{align*}
u:N&\longrightarrow X\times_ZY,\\
u(n)&=(a(n),b(n)).
\end{align*}
This is well-defined because, for each $n\in N$,
\begin{align*}
f(a(n))
&=(f\circ a)(n)\\
&=(g\circ b)(n)\\
&=g(b(n)),
\end{align*}
so $(a(n),b(n))\in X\times_ZY$. The projection equations hold pointwise:
\begin{align*}
(p\circ u)(n)
&=p(u(n))\\
&=p(a(n),b(n))\\
&=a(n),
\end{align*}
and
\begin{align*}
(q\circ u)(n)
&=q(u(n))\\
&=q(a(n),b(n))\\
&=b(n).
\end{align*}
Thus $p\circ u=a$ and $q\circ u=b$.
If $v:N\to X\times_ZY$ is another map with $p\circ v=a$ and $q\circ v=b$, then for each $n\in N$ write $v(n)=(x_n,y_n)$. Then
\begin{align*}
x_n
&=p(x_n,y_n)\\
&=p(v(n))\\
&=(p\circ v)(n)\\
&=a(n),
\end{align*}
and
\begin{align*}
y_n
&=q(x_n,y_n)\\
&=q(v(n))\\
&=(q\circ v)(n)\\
&=b(n).
\end{align*}
Therefore
\begin{align*}
v(n)=(x_n,y_n)=(a(n),b(n))=u(n)
\end{align*}
for every $n\in N$, so $v=u$. Hence $X\times_ZY$ with projections $p$ and $q$ is the pullback of $f$ and $g$ in $\mathrm{Set}$: its elements are exactly the compatible pairs over $Z$.
[/example]
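[example: Preimages and Products as Pullbacks of Sets]
Two special cases of the formula for $X\times_ZY$ are worth recording; the notation $S$ and $\{*\}$ is introduced here only for this illustration. First suppose $Y=S\subseteq Z$ and $g:S\hookrightarrow Z$ is the inclusion. Then
\begin{align*}
X\times_ZS
&=\{(x,s)\in X\times S:f(x)=s\},
\end{align*}
and the projection $p$ restricts to a bijection onto the preimage,
\begin{align*}
X\times_ZS&\longrightarrow f^{-1}(S),&
(x,s)&\longmapsto x,
\end{align*}
with inverse $x\mapsto (x,f(x))$. Thus pulling back along an inclusion computes a preimage.
Second, suppose $Z=\{*\}$ is a one-point set. Then $f(x)=g(y)$ holds for every pair $(x,y)$, so
\begin{align*}
X\times_{\{*\}}Y=X\times Y.
\end{align*}
Thus the binary product is the pullback over a one-point set.
[/example]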
The same formula often works in concrete categories, but the surrounding structure matters. In topological spaces, the set of compatible pairs must carry the topology that makes the universal property true.
[example: Pullback of Topological Spaces]
For continuous maps $f:X\to Z$ and $g:Y\to Z$, define
\begin{align*}
X\times_ZY=\{(x,y)\in X\times Y:f(x)=g(y)\},
\end{align*}
and give $X\times_ZY$ the subspace topology inherited from $X\times Y$. Let $i:X\times_ZY\hookrightarrow X\times Y$ be the inclusion, and define $p:X\times_ZY\to X$ and $q:X\times_ZY\to Y$ by
\begin{align*}
p(x,y)&=x,&
q(x,y)&=y.
\end{align*}
Since $p=\operatorname{pr}_X\circ i$ and $q=\operatorname{pr}_Y\circ i$, the maps $p$ and $q$ are continuous. For every $(x,y)\in X\times_ZY$,
\begin{align*}
(f\circ p)(x,y)
&=f(p(x,y))\\
&=f(x)\\
&=g(y)\\
&=g(q(x,y))\\
&=(g\circ q)(x,y),
\end{align*}
so $f\circ p=g\circ q$.
Now let $N$ be a topological space with continuous maps $a:N\to X$ and $b:N\to Y$ satisfying $f\circ a=g\circ b$. Define
\begin{align*}
u:N&\longrightarrow X\times_ZY,\\
u(n)&=(a(n),b(n)).
\end{align*}
This is well-defined because, for each $n\in N$,
\begin{align*}
f(a(n))
&=(f\circ a)(n)\\
&=(g\circ b)(n)\\
&=g(b(n)),
\end{align*}
so $(a(n),b(n))\in X\times_ZY$. To prove continuity, consider the composite $i\circ u:N\to X\times Y$. Its coordinate projections are
\begin{align*}
\operatorname{pr}_X\circ i\circ u
&=p\circ u
=a,\\
\operatorname{pr}_Y\circ i\circ u
&=q\circ u
=b.
\end{align*}
Since $a$ and $b$ are continuous, the map $i\circ u:N\to X\times Y$ is continuous by the defining property of the product topology. Since $X\times_ZY$ has the subspace topology and $i\circ u$ has image contained in $X\times_ZY$, the map $u:N\to X\times_ZY$ is continuous.
The projection equations hold pointwise:
\begin{align*}
(p\circ u)(n)
&=p(u(n))\\
&=p(a(n),b(n))\\
&=a(n),
\end{align*}
and
\begin{align*}
(q\circ u)(n)
&=q(u(n))\\
&=q(a(n),b(n))\\
&=b(n).
\end{align*}
Thus $p\circ u=a$ and $q\circ u=b$.
If $v:N\to X\times_ZY$ is another continuous map with $p\circ v=a$ and $q\circ v=b$, then for each $n\in N$ write
\begin{align*}
v(n)=(x_n,y_n).
\end{align*}
Then
\begin{align*}
x_n
&=p(x_n,y_n)\\
&=p(v(n))\\
&=(p\circ v)(n)\\
&=a(n),
\end{align*}
and
\begin{align*}
y_n
&=q(x_n,y_n)\\
&=q(v(n))\\
&=(q\circ v)(n)\\
&=b(n).
\end{align*}
Therefore
\begin{align*}
v(n)=(x_n,y_n)=(a(n),b(n))=u(n)
\end{align*}
for every $n\in N$, so $v=u$. Hence $X\times_ZY$ with the subspace topology is the pullback in $\mathrm{Top}$: its points are the compatible pairs, and its topology is exactly the one making maps into it equivalent to compatible continuous maps into $X$ and $Y$.
[/example]
Finite pullbacks impose compatibility over one stage. Many constructions in algebra and topology require the same compatibility across a whole ordered tower of approximations, where a later stage must reduce consistently to every earlier stage.
[definition: Inverse Limit]
Let $(I,\leq)$ be a partially ordered set regarded as a category with a unique morphism $i\to j$ when $i\leq j$. An inverse system in $\mathcal C$ is a diagram $D:I^{\mathrm{op}}\to \mathcal C$. An inverse limit is a limit of this diagram.
[/definition]
Inverse limits organise compatible families over a system of approximations; in practice the index poset is usually directed, as with $\mathbb N$ in a tower. In algebra and topology they often appear as spaces of coherent choices across all finite stages.
[example: Inverse Limit of Quotient Rings]
Let $R$ be a commutative ring and let $I\trianglelefteq R$ be an ideal. For $n\geq 1$, since $I^{n+1}\subseteq I^n$, there is a quotient map
\begin{align*}
\rho_{n+1,n}:R/I^{n+1}&\longrightarrow R/I^n,\\
r+I^{n+1}&\longmapsto r+I^n.
\end{align*}
This is well-defined: if $r+I^{n+1}=s+I^{n+1}$, then $r-s\in I^{n+1}\subseteq I^n$, so $r+I^n=s+I^n$.
The inverse limit is the subset of the product $\prod_{n\geq 1}R/I^n$ consisting of sequences whose coordinates agree under all transition maps:
\begin{align*}
\varprojlim_n R/I^n
=
\left\{(a_n)_{n\geq 1}:a_n\in R/I^n\text{ and }\rho_{n+1,n}(a_{n+1})=a_n\text{ for every }n\geq 1\right\}.
\end{align*}
Writing $a_n=r_n+I^n$, the compatibility condition is
\begin{align*}
\rho_{n+1,n}(a_{n+1})=a_n
&\Longleftrightarrow
\rho_{n+1,n}(r_{n+1}+I^{n+1})=r_n+I^n\\
&\Longleftrightarrow
r_{n+1}+I^n=r_n+I^n\\
&\Longleftrightarrow
r_{n+1}-r_n\in I^n\\
&\Longleftrightarrow
r_{n+1}\equiv r_n\pmod{I^n}.
\end{align*}
Thus an element is exactly a compatible sequence of residues modulo the powers of $I$.
The ring operations are coordinatewise. If $(a_n)$ and $(b_n)$ are compatible, then
\begin{align*}
\rho_{n+1,n}(a_{n+1}+b_{n+1})
&=\rho_{n+1,n}(a_{n+1})+\rho_{n+1,n}(b_{n+1})\\
&=a_n+b_n,
\end{align*}
and
\begin{align*}
\rho_{n+1,n}(a_{n+1}b_{n+1})
&=\rho_{n+1,n}(a_{n+1})\rho_{n+1,n}(b_{n+1})\\
&=a_nb_n,
\end{align*}
because $\rho_{n+1,n}$ is a ring homomorphism. Hence sums and products of compatible sequences are again compatible. With these coordinatewise operations, $\varprojlim_n R/I^n$ is the $I$-adic completion of $R$: it records one residue modulo each $I^n$, with each residue reducing to the previous one.
[/example]
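[example: A Compatible Sequence in the $p$-adic Integers]
As a sketch of how the compatibility condition is used, take $R=\mathbb Z$ and $I=(p)$ for a prime $p\geq 3$, so that $I^n=(p^n)$ and the inverse limit is the ring of $p$-adic integers. The representatives $r_n$ below are chosen only for this illustration. Put
\begin{align*}
r_n=1+p+p^2+\cdots+p^{n-1},
\qquad
a_n=r_n+(p^n).
\end{align*}
The sequence is compatible because
\begin{align*}
r_{n+1}-r_n=p^n\in (p^n)=I^n.
\end{align*}
Multiplying by $1-p$ in each coordinate gives
\begin{align*}
(1-p)r_n
&=1-p^n\\
&\equiv 1\pmod{p^n}.
\end{align*}
Hence $(a_n)_{n\geq 1}$ is a multiplicative inverse of the image of $1-p$ in $\varprojlim_n\mathbb Z/(p^n)$, although $1-p$ is not a unit in $\mathbb Z$. The inverse limit can therefore contain elements that no single integer represents at every stage.
[/example]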
## Limits as Representable Functors of Cones
Can the universal property of a limit be expressed as a representability statement? Yes: a limit represents the functor that assigns to an object the set of cones from that object into the diagram. This reformulation connects limits directly to the adjunction language from earlier chapters.
[definition: Cone Functor]
Let $D:J\to \mathcal C$ be a diagram. The cone functor associated to $D$ is the functor
\begin{align*}
\operatorname{Cone}(-,D):\mathcal C^{\mathrm{op}}\to \mathrm{Set}
\end{align*}
that sends $N\in\mathcal C$ to the set of cones from $N$ to $D$.
[/definition]
For a morphism $h:M\to N$, the functor sends a cone $(N,(\pi_j))$ to the cone $(M,(\pi_j\circ h))$. Thus it is contravariant in the vertex. The universal property of a limit says that every cone from $N$ factors uniquely through the limiting cone, so the set of cones from $N$ should be naturally the same as the set of maps from $N$ to the limiting object.
This raises a precise representability question: when does the cone functor have the form $\mathcal C(-,L)$ for some object $L$? If it does, the representing object should be exactly the limit vertex, and the representing natural bijection should encode all cone factorisations at once.
[quotetheorem:4160]
[citeproof:4160]
This theorem says that limits are representable objects for a specific functor. The statement is conditional: it identifies a limiting cone with a representation of $\operatorname{Cone}(-,D)$, but it does not prove that such a representing object exists. In a category where the relevant product or equalizer is missing, the cone functor may simply be non-representable.
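[remark: The Representing Bijection Written Out]
In the notation of this section, with limiting cone $(L,(p_j))$, the bijection behind the representability statement can be sketched as follows; the symbol $\Phi$ is introduced only for this remark. For each object $N$ put
\begin{align*}
\Phi_N:\mathcal C(N,L)&\longrightarrow \operatorname{Cone}(N,D),\\
u&\longmapsto (p_j\circ u)_{j\in J}.
\end{align*}
The family $(p_j\circ u)$ is a cone because $D(\alpha)\circ p_j\circ u=p_k\circ u$ for every $\alpha:j\to k$, and $\Phi_N$ is a bijection precisely because every cone from $N$ factors uniquely through $(L,(p_j))$. For a morphism $h:M\to N$,
\begin{align*}
\Phi_M(u\circ h)
&=(p_j\circ u\circ h)_{j\in J}\\
&=\operatorname{Cone}(h,D)\bigl(\Phi_N(u)\bigr),
\end{align*}
which is the naturality of $\Phi$ in the vertex.
[/remark]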
[quotetheorem:4149]
[citeproof:4149]
This preservation theorem also has a boundary: it says that $G$ carries limits that already exist in $\mathcal D$ to limits in $\mathcal C$; it neither creates limits in $\mathcal D$ nor guarantees that the left adjoint $F$ preserves limits. Dually, left adjoints preserve colimits. Conceptually, the right adjoint preserves the representing behavior of cones because maps $c\to Gd$ correspond naturally to maps $Fc\to d$. The result is therefore a transfer principle for already-formed universal objects, not an existence theorem.
The representability viewpoint explains what a limit does once it exists, but it still leaves the construction problem open. For finite diagrams, one wants a small toolkit that builds all finite limits from simpler universal objects. Products provide the ambient space of all possible coordinate choices, while equalizers cut out the compatibility equations imposed by the arrows of the diagram.
[quotetheorem:4162]
[citeproof:4162]
The construction is important because it reduces finite-limit existence to two elementary cases. Both ingredients matter: products assemble all candidate components, while equalizers impose all arrow-compatibility equations. The result is finite because it uses only finite products; it says nothing about infinite diagrams unless the required infinite products also exist. As a boundary example, the category of fields lacks finite products in general, so this theorem cannot be applied there, even though equalizers of parallel field homomorphisms do exist: the set on which two field homomorphisms agree is a subfield of the domain. In many algebraic categories, products and equalizers can be constructed directly on underlying sets and then equipped with the inherited algebraic structure.
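[remark: The Equalizer Presentation of a Finite Limit]
One standard way to organise the construction is sketched here; the quoted proof may arrange the details differently, and the names $P$, $Q$, $s$, $t$ are introduced only for this remark. For a finite diagram $D:J\to\mathcal C$ form the finite products
\begin{align*}
P=\prod_{j\in\operatorname{Ob}J}D(j),
\qquad
Q=\prod_{\alpha\in\operatorname{Mor}J}D(\operatorname{cod}\alpha),
\end{align*}
with projections $p_j:P\to D(j)$ and $q_\alpha:Q\to D(\operatorname{cod}\alpha)$. Define $s,t:P\rightrightarrows Q$ by their components:
\begin{align*}
q_\alpha\circ s&=p_{\operatorname{cod}\alpha},\\
q_\alpha\circ t&=D(\alpha)\circ p_{\operatorname{dom}\alpha}.
\end{align*}
For a morphism $e:L\to P$,
\begin{align*}
s\circ e=t\circ e
&\Longleftrightarrow
q_\alpha\circ s\circ e=q_\alpha\circ t\circ e\text{ for every }\alpha\\
&\Longleftrightarrow
p_{\operatorname{cod}\alpha}\circ e=D(\alpha)\circ p_{\operatorname{dom}\alpha}\circ e\text{ for every }\alpha,
\end{align*}
which says exactly that $(p_j\circ e)_{j}$ is a cone over $D$. An equalizer $e:L\to P$ of $s$ and $t$, with legs $p_j\circ e$, is then a limit of $D$.
[/remark]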
[example: Finite Limits in Groups]
Let $D:J\to \mathrm{Grp}$ be a finite diagram. Define
\begin{align*}
L=\left\{(x_j)_{j\in J}\in \prod_{j\in J}D(j):D(u)(x_j)=x_k\text{ for every }u:j\to k\text{ in }J\right\}.
\end{align*}
We show that $L$ is the limit of $D$, with projections $p_j:L\to D(j)$ given by $p_j((x_i)_{i\in J})=x_j$.
First $L$ is a subgroup of the finite direct product. The identity element $(1_{D(j)})_{j\in J}$ lies in $L$ because for every arrow $u:j\to k$,
\begin{align*}
D(u)(1_{D(j)})=1_{D(k)}.
\end{align*}
If $x=(x_j)_{j\in J}$ and $y=(y_j)_{j\in J}$ lie in $L$, then for every $u:j\to k$,
\begin{align*}
D(u)(x_jy_j^{-1})
&=D(u)(x_j)D(u)(y_j^{-1})\\
&=D(u)(x_j)D(u)(y_j)^{-1}\\
&=x_ky_k^{-1}.
\end{align*}
Thus $xy^{-1}\in L$, so $L$ is a subgroup. Each projection $p_j:L\to D(j)$ is a group homomorphism because it is the restriction of the coordinate projection:
\begin{align*}
p_j(xy)&=(xy)_j=x_jy_j=p_j(x)p_j(y).
\end{align*}
For every arrow $u:j\to k$ and every $x=(x_i)_{i\in J}\in L$,
\begin{align*}
(D(u)\circ p_j)(x)
&=D(u)(p_j(x))\\
&=D(u)(x_j)\\
&=x_k\\
&=p_k(x),
\end{align*}
so $(L,(p_j))$ is a cone over $D$.
Now let $N$ be a group with a cone $(\pi_j:N\to D(j))_{j\in J}$, so $D(u)\circ \pi_j=\pi_k$ for every $u:j\to k$. Define
\begin{align*}
v:N&\longrightarrow L,\\
v(n)&=(\pi_j(n))_{j\in J}.
\end{align*}
This is well-defined because for every arrow $u:j\to k$,
\begin{align*}
D(u)(\pi_j(n))
&=(D(u)\circ \pi_j)(n)\\
&=\pi_k(n).
\end{align*}
It is a homomorphism since
\begin{align*}
v(nm)
&=(\pi_j(nm))_{j\in J}\\
&=(\pi_j(n)\pi_j(m))_{j\in J}\\
&=(\pi_j(n))_{j\in J}(\pi_j(m))_{j\in J}\\
&=v(n)v(m).
\end{align*}
For every $j\in J$ and every $n\in N$,
\begin{align*}
(p_j\circ v)(n)
&=p_j(v(n))\\
&=p_j((\pi_i(n))_{i\in J})\\
&=\pi_j(n),
\end{align*}
so $p_j\circ v=\pi_j$.
If $w:N\to L$ is another homomorphism with $p_j\circ w=\pi_j$ for every $j$, then for each $n\in N$ write $w(n)=(x_j)_{j\in J}$. For every $j$,
\begin{align*}
x_j
&=p_j(w(n))\\
&=(p_j\circ w)(n)\\
&=\pi_j(n).
\end{align*}
Hence
\begin{align*}
w(n)=(\pi_j(n))_{j\in J}=v(n)
\end{align*}
for every $n\in N$, so $w=v$. Therefore every finite diagram of groups has a limit: it is the subgroup of the finite direct product consisting exactly of the compatible tuples.
[/example]
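[example: Recovering the Pullback of Groups]
As a check on the compatible-tuple formula, specialise it to the cospan shape; the names $G$, $H$, $K$ are chosen here only for illustration. For homomorphisms $f:G\to K$ and $g:H\to K$, the diagram has three objects and two non-identity arrows, and identity arrows impose no condition. The formula gives
\begin{align*}
L=\{(x,z,y)\in G\times K\times H:f(x)=z\text{ and }g(y)=z\}.
\end{align*}
The middle coordinate is determined by either of the others, so
\begin{align*}
L&\longrightarrow \{(x,y)\in G\times H:f(x)=g(y)\},&
(x,z,y)&\longmapsto (x,y),
\end{align*}
is a group isomorphism with inverse $(x,y)\mapsto (x,f(x),y)$. Thus the general construction reproduces the familiar fibre product $G\times_KH$.
[/example]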
## Pullback Calculus and Opposite Categories
How can we compute with universal squares without rechecking the full universal property every time? The main tool is the pullback pasting lemma, which says that adjacent pullback squares can be composed and, under a mild hypothesis, decomposed.
[quotetheorem:4163]
[citeproof:4163]
This lemma is the categorical version of substituting fibre conditions. It allows pullbacks to be assembled in stages, which is essential in algebraic geometry, topology, and homological algebra. The assumptions are not decorative: if the right square is not a pullback, the outer rectangle may be universal while the left square fails to be, because the missing universal property on the right prevents recovering the intermediate map uniquely. In $\mathrm{Set}$, this can happen when the right square is merely a commutative square with too many points in $B$ over the same compatible pair in $C\times_F E$; the outer fibre condition cannot detect that extra ambiguity. Thus pasting is a theorem about universal squares, not about arbitrary commutative rectangles.
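[example: Pasting Pullbacks of Sets]
The composition half of the lemma can be checked by hand in $\mathrm{Set}$. The notation below is chosen for this illustration and is independent of the labels in the quoted statement. Let $f:X\to Y$, $g:Y\to Z$, and $h:W\to Z$ be functions. The right-hand pullback is
\begin{align*}
Y\times_ZW=\{(y,w)\in Y\times W:g(y)=h(w)\},
\end{align*}
and pulling it back along $f$ gives
\begin{align*}
X\times_Y(Y\times_ZW)
=\{(x,(y,w)):f(x)=y\text{ and }g(y)=h(w)\}.
\end{align*}
Since $y$ is forced to equal $f(x)$, the assignment
\begin{align*}
(x,(y,w))\longmapsto (x,w)
\end{align*}
is a bijection onto
\begin{align*}
X\times_ZW=\{(x,w)\in X\times W:g(f(x))=h(w)\},
\end{align*}
with inverse $(x,w)\mapsto (x,(f(x),w))$. So the two-stage pullback agrees with the pullback of the outer rectangle along $g\circ f$.
[/example]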
[example: Fibre Products of Commutative Rings]
Let $\alpha_B:A\to B$ and $\alpha_C:A\to C$ be homomorphisms of commutative rings with identity. Put
\begin{align*}
P=B\otimes_A C.
\end{align*}
Define the two canonical maps
\begin{align*}
\iota_B:B&\longrightarrow P,&
\iota_B(b)&=b\otimes 1,\\
\iota_C:C&\longrightarrow P,&
\iota_C(c)&=1\otimes c.
\end{align*}
They agree on $A$: for every $a\in A$,
\begin{align*}
\iota_B(\alpha_B(a))
&=\alpha_B(a)\otimes 1\\
&=(\alpha_B(a)\cdot 1_B)\otimes 1_C\\
&=1_B\otimes (\alpha_C(a)\cdot 1_C)\\
&=1\otimes \alpha_C(a)\\
&=\iota_C(\alpha_C(a)),
\end{align*}
where the middle equality is the $A$-balancing relation in $B\otimes_A C$.
Now let $R$ be a commutative ring with homomorphisms $\varphi:B\to R$ and $\psi:C\to R$ satisfying
\begin{align*}
\varphi\circ \alpha_B=\psi\circ \alpha_C.
\end{align*}
Define on pure tensors
\begin{align*}
u(b\otimes c)=\varphi(b)\psi(c).
\end{align*}
The assignment $(b,c)\mapsto\varphi(b)\psi(c)$ is biadditive by distributivity in $R$, and it is compatible with the $A$-balancing relation, since for $a\in A$,
\begin{align*}
u((\alpha_B(a)b)\otimes c)
&=\varphi(\alpha_B(a)b)\psi(c)\\
&=\varphi(\alpha_B(a))\varphi(b)\psi(c)\\
&=\psi(\alpha_C(a))\varphi(b)\psi(c)\\
&=\varphi(b)\psi(\alpha_C(a))\psi(c)\\
&=\varphi(b)\psi(\alpha_C(a)c)\\
&=u(b\otimes \alpha_C(a)c).
\end{align*}
Hence it induces a unique additive map $u:B\otimes_A C\to R$. It is a ring homomorphism because on pure tensors,
\begin{align*}
u((b\otimes c)(b'\otimes c'))
&=u(bb'\otimes cc')\\
&=\varphi(bb')\psi(cc')\\
&=\varphi(b)\varphi(b')\psi(c)\psi(c')\\
&=\varphi(b)\psi(c)\varphi(b')\psi(c')\\
&=u(b\otimes c)u(b'\otimes c'),
\end{align*}
and
\begin{align*}
u(1\otimes 1)=\varphi(1)\psi(1)=1\cdot 1=1.
\end{align*}
Moreover,
\begin{align*}
(u\circ \iota_B)(b)
&=u(b\otimes 1)
=\varphi(b)\psi(1)
=\varphi(b),\\
(u\circ \iota_C)(c)
&=u(1\otimes c)
=\varphi(1)\psi(c)
=\psi(c).
\end{align*}
If $v:B\otimes_A C\to R$ is another ring homomorphism with $v\circ\iota_B=\varphi$ and $v\circ\iota_C=\psi$, then for every pure tensor,
\begin{align*}
v(b\otimes c)
&=v((b\otimes 1)(1\otimes c))\\
&=v(b\otimes 1)v(1\otimes c)\\
&=\varphi(b)\psi(c)\\
&=u(b\otimes c).
\end{align*}
Since pure tensors generate $B\otimes_A C$ additively, $v=u$. Thus $B\otimes_A C$ is the pushout of $B\leftarrow A\to C$ in $\mathrm{CRing}$.
Passing to affine schemes reverses arrows:
\begin{align*}
\operatorname{Spec}(B\otimes_A C)&\longrightarrow \operatorname{Spec}(B),\\
\operatorname{Spec}(B\otimes_A C)&\longrightarrow \operatorname{Spec}(C),
\end{align*}
and the equality $\iota_B\circ\alpha_B=\iota_C\circ\alpha_C$ becomes a commutative square over $\operatorname{Spec}(A)$. Therefore
\begin{align*}
\operatorname{Spec}(B\otimes_A C)\cong \operatorname{Spec}(B)\times_{\operatorname{Spec}(A)}\operatorname{Spec}(C).
\end{align*}
The tensor product is a pushout of rings, but the corresponding affine scheme is a pullback in the category of affine schemes because $\operatorname{Spec}$ is a contravariant equivalence between commutative rings and affine schemes.
[/example]
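[example: Polynomial Rings and the Affine Plane]
A standard instance of the pushout property, sketched here for a commutative ring $A$ with the inclusions $A\to A[x]$ and $A\to A[y]$, identifies the tensor product explicitly. Let $\varphi:A[x]\to R$ and $\psi:A[y]\to R$ agree on $A$, with common restriction $\theta:A\to R$. Then $\varphi$ is determined by $\theta$ and $r=\varphi(x)$, and $\psi$ by $\theta$ and $s=\psi(y)$. There is a unique ring homomorphism
\begin{align*}
w:A[x,y]&\longrightarrow R,&
w|_A&=\theta,&
w(x)&=r,&
w(y)&=s,
\end{align*}
and it restricts to $\varphi$ on $A[x]$ and to $\psi$ on $A[y]$. Hence $A[x,y]$ with its two inclusions satisfies the same universal property as $A[x]\otimes_AA[y]$, and by uniqueness of universal objects
\begin{align*}
A[x]\otimes_AA[y]&\cong A[x,y],&
x\otimes 1&\mapsto x,&
1\otimes y&\mapsto y.
\end{align*}
Applying $\operatorname{Spec}$ gives
\begin{align*}
\operatorname{Spec}A[x,y]\cong \operatorname{Spec}A[x]\times_{\operatorname{Spec}A}\operatorname{Spec}A[y],
\end{align*}
so the affine plane over $A$ is the fibre product of two affine lines over $A$.
[/example]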
This example is a reminder that variance changes the direction of the construction. The same universal diagram may be a limit in one category and a colimit after passing to the opposite category.
[remark: Limits and Colimits by Duality]
A limit in $\mathcal C$ is a colimit in $\mathcal C^{\mathrm{op}}$, and a colimit in $\mathcal C$ is a limit in $\mathcal C^{\mathrm{op}}$. Products dualise to coproducts, equalizers to coequalizers, and pullbacks to pushouts. This duality is often the fastest way to remember which construction belongs to which variance.
[/remark]
The chapter's main point is that limits are not a list of constructions but a single universal idea. Once cones are organised into a category, the limit is simply the terminal cone; once cones are organised functorially, the limit is the representing object of the cone functor. These two views will be used repeatedly when studying completeness, adjoint functor theorems, and exactness in abelian categories.
After limits, the dual story of colimits completes the universal picture. Mapping out of a diagram leads to coproducts, coequalizers, and pushouts, and together these constructions set up the later discussion of completeness, cocontinuity, and exactness.
# 5. Colimits and Universal Cocones
This chapter turns the theory of limits around. Instead of asking for a universal way of mapping into a diagram, we ask for a universal way of mapping out of a diagram. The resulting objects are colimits: coproducts, coequalizers, pushouts, direct limits, and many familiar quotient constructions. The guiding theme is that a colimit is a canonical object obtained by freely gluing together the data in a diagram and imposing exactly the relations demanded by its arrows.
## Cocones and Initial Cocones
What does it mean to combine all objects in a diagram into a single object without losing the information carried by the arrows of the diagram? For limits, a cone maps into each object of the diagram compatibly. For colimits, the arrows point the other way: every object of the diagram maps into a proposed receiving object, and compatibility says that the receiving object cannot distinguish points already identified by the diagram.
[definition: Cocone]
Let $J$ be a small category, let $F:J \to \mathcal C$ be a functor, and let $c \in \mathcal C$. A cocone from $F$ to $c$ is a family of morphisms
\begin{align*}
\lambda_j:F(j) \to c
\end{align*}
for each object $j \in J$, such that for every morphism $\alpha:j \to k$ in $J$ the equation
\begin{align*}
\lambda_k \circ F(\alpha)=\lambda_j
\end{align*}
holds in $\mathcal C$.
[/definition]
Thus a cocone is a compatible family of maps out of the diagram. The condition says that going along the diagram and then into $c$ gives the same morphism as going directly into $c$.
[example: Cocone Over A Parallel Pair]
Let the parallel-pair diagram have objects $A$ and $B$ and arrows $f,g:A\rightrightarrows B$. A cocone with vertex $q$ consists of morphisms
\begin{align*}
\lambda_A:A\to q,
\qquad
\lambda_B:B\to q
\end{align*}
such that the cocone compatibility condition holds for both arrows:
\begin{align*}
\lambda_B\circ f&=\lambda_A,\\
\lambda_B\circ g&=\lambda_A.
\end{align*}
Therefore
\begin{align*}
\lambda_B\circ f=\lambda_A=\lambda_B\circ g,
\end{align*}
so if we write $u=\lambda_B$, the cocone determines a morphism $u:B\to q$ satisfying
\begin{align*}
u\circ f=u\circ g.
\end{align*}
Conversely, any morphism $u:B\to q$ with $u\circ f=u\circ g$ defines a cocone by setting
\begin{align*}
\lambda_B&=u,\\
\lambda_A&=u\circ f=u\circ g.
\end{align*}
Thus a cocone over a parallel pair is exactly a morphism out of $B$ that identifies the two composites from $A$, which is the compatibility pattern used in coequalizers.
[/example]
The universal cocone should be the most economical compatible receiving object: every other compatible receiving object factors through it in exactly one way. The obstruction is that merely having many maps out of the diagram does not identify which target is freely generated by the diagram rather than carrying extra accidental structure. The definition therefore requires an initial receiving object among all compatible cocones.
[definition: Colimit]
Let $F:J \to \mathcal C$ be a functor. A colimit of $F$ is a cocone $(\lambda_j:F(j)\to L)_{j\in J}$ such that for every cocone $(\mu_j:F(j)\to c)_{j\in J}$ there exists a unique morphism $u:L\to c$ satisfying
\begin{align*}
u\circ \lambda_j=\mu_j
\end{align*}
for all $j\in J$.
[/definition]
A colimit is therefore an initial object in the category of cocones. This statement is not only a slogan; it gives a compact way to transport facts about initial objects to colimits.
[definition: Category Of Cocones]
Let $F:J\to\mathcal C$ be a functor. The category $\operatorname{Cocone}(F)$ has as objects cocones $(c,\lambda_j:F(j)\to c)$, and a morphism
\begin{align*}
\nu:(c,\lambda_j)\to(d,\mu_j)
\end{align*}
is a morphism $\nu:c\to d$ in $\mathcal C$ such that $\nu\circ\lambda_j=\mu_j$ for all $j\in J$.
[/definition]
With this category in hand, a colimit of $F$ is an initial object of $\operatorname{Cocone}(F)$. This also explains the variance: limits are terminal cones, while colimits are initial cocones.
Since colimits are defined by a universal property rather than by a preferred construction, one must check that different constructions of the same colimit cannot disagree in any essential way. The expected uniqueness statement is the dual of the corresponding result for limits: any two universal cocones over the same diagram should be connected by a unique compatible isomorphism.
[quotetheorem:4164]
[citeproof:4164]
This theorem justifies speaking of "the" colimit when it exists, while remembering that the actual object is determined only up to unique isomorphism compatible with the cocone maps. The hypothesis that both cocones are colimits is essential: two arbitrary cocones over the same diagram may have no morphism between their vertices, and even when such a morphism exists it need not be unique or invertible. The theorem also does not say that a preferred underlying object has been chosen; it says that every construction satisfying the same universal property is interchangeable for all categorical purposes. Later computations use this principle constantly: once a coproduct, quotient, or glued object is shown to satisfy the colimit property, it may be identified with any other model of the same colimit by the unique compatible isomorphism.
[remark: Duality With Limits]
A colimit of $F:J\to\mathcal C$ is the same as a limit of $F^{\operatorname{op}}:J^{\operatorname{op}}\to\mathcal C^{\operatorname{op}}$ in the opposite category. Many formal statements about colimits can be obtained from the corresponding statements about limits by reversing arrows.
[/remark]
## Coproducts, Coequalizers, Pushouts, and Direct Limits
Which familiar constructions are special cases of colimits? The answer is that colimits package several operations that are usually taught separately: disjoint union, quotient by an equivalence relation, gluing along a common subobject, and passage to a union over a directed system. Their common feature is that they freely assemble data subject to compatibility relations.
[definition: Coproduct]
Let $(x_i)_{i\in I}$ be a family of objects of a category $\mathcal C$. A coproduct of this family is an object $\coprod_{i\in I}x_i$ together with morphisms
\begin{align*}
\iota_i:x_i\to \coprod_{i\in I}x_i
\end{align*}
such that for every object $c$ and every family of morphisms $f_i:x_i\to c$, there exists a unique morphism $f:\coprod_{i\in I}x_i\to c$ satisfying $f\circ\iota_i=f_i$ for all $i\in I$.
[/definition]
The coproduct is the colimit of a diagram indexed by a discrete category. There are no non-identity arrows in the indexing category, so there are no compatibility relations to impose.
[example: Coproducts In Algebra And Topology]
In $\mathrm{Set}$, the coproduct of sets $(X_i)_{i\in I}$ is the disjoint union
\begin{align*}
\bigsqcup_{i\in I}X_i=\{(i,x):i\in I,\ x\in X_i\},
\end{align*}
with inclusion maps $\iota_i:X_i\to\bigsqcup_i X_i$ given by $\iota_i(x)=(i,x)$. Given functions $f_i:X_i\to Y$, define $f:\bigsqcup_iX_i\to Y$ by
\begin{align*}
f(i,x)=f_i(x).
\end{align*}
Then
\begin{align*}
(f\circ\iota_i)(x)=f(i,x)=f_i(x),
\end{align*}
so $f\circ\iota_i=f_i$. If $g:\bigsqcup_iX_i\to Y$ also satisfies $g\circ\iota_i=f_i$ for every $i$, then for each element $(i,x)$,
\begin{align*}
g(i,x)=g(\iota_i(x))=(g\circ\iota_i)(x)=f_i(x)=f(i,x),
\end{align*}
so $g=f$. Thus disjoint union has the coproduct universal property in $\mathrm{Set}$.
In $\mathrm{Ab}$ and in $R$-$\mathrm{Mod}$, the coproduct of a family $(M_i)_{i\in I}$ is the direct sum
\begin{align*}
\bigoplus_{i\in I}M_i
=\{(m_i)_{i\in I}:m_i\in M_i,\ m_i=0\text{ for all but finitely many }i\}.
\end{align*}
The inclusion $\iota_i:M_i\to\bigoplus_iM_i$ places $m$ in the $i$-th coordinate and $0$ elsewhere. Given homomorphisms $f_i:M_i\to N$, define
\begin{align*}
f((m_i)_{i\in I})=\sum_{i\in I}f_i(m_i),
\end{align*}
where the sum is finite because $(m_i)$ has finite support; $f$ is a homomorphism because each $f_i$ is additive (and $R$-linear in the module case). Then
\begin{align*}
(f\circ\iota_i)(m)=f_i(m),
\end{align*}
and if $g$ is another homomorphism with $g\circ\iota_i=f_i$, then for every finitely supported $(m_i)$,
\begin{align*}
g((m_i)_{i\in I})
&=g\left(\sum_{i\in I}\iota_i(m_i)\right)\\
&=\sum_{i\in I}g(\iota_i(m_i))\\
&=\sum_{i\in I}(g\circ\iota_i)(m_i)\\
&=\sum_{i\in I}f_i(m_i)\\
&=f((m_i)_{i\in I}).
\end{align*}
Hence $g=f$, so the direct sum is the coproduct.
In $\mathrm{Grp}$, the coproduct of two groups $G$ and $H$ is the free product $G*H$. Its defining universal property says that for every group $K$ and homomorphisms $\alpha:G\to K$ and $\beta:H\to K$, there is a unique homomorphism $\varphi:G*H\to K$ such that
\begin{align*}
\varphi\circ\iota_G=\alpha,
\qquad
\varphi\circ\iota_H=\beta.
\end{align*}
This is exactly the coproduct universal property for the two-object family $(G,H)$, so $G*H$ is the group-theoretic coproduct.
[/example]
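[example: The Same Pair in Two Categories]
To see that the coproduct depends on the ambient category, compare the pair $(\mathbb Z,\mathbb Z)$ in $\mathrm{Ab}$ and in $\mathrm{Grp}$; the maps below are chosen only for illustration. Let $f_1,f_2:\mathbb Z\to\mathbb Z$ be multiplication by $2$ and by $3$. In $\mathrm{Ab}$ the coproduct is $\mathbb Z\oplus\mathbb Z$, and the induced map is
\begin{align*}
f(m,n)=f_1(m)+f_2(n)=2m+3n,
\end{align*}
with $f(\iota_1(m))=f(m,0)=2m$ and $f(\iota_2(n))=f(0,n)=3n$.
In $\mathrm{Grp}$ the coproduct is the free product $\mathbb Z*\mathbb Z$, the free group on $a=\iota_1(1)$ and $b=\iota_2(1)$. The induced homomorphism $\varphi:\mathbb Z*\mathbb Z\to\mathbb Z$ satisfies $\varphi(a)=2$ and $\varphi(b)=3$, so
\begin{align*}
\varphi(aba^{-1}b^{-1})=2+3-2-3=0,
\end{align*}
although $aba^{-1}b^{-1}$ is a non-trivial element of the free group. In particular $\mathbb Z*\mathbb Z$ is non-abelian and is not isomorphic to $\mathbb Z\oplus\mathbb Z$.
[/example]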
A coproduct freely puts objects side by side. A coequalizer starts with two competing ways of mapping one object into another and imposes the relation that those two maps become equal.
The universal problem is not merely to form some quotient of the target, but to form the most general quotient through which every map that already equalizes the pair must factor. This makes coequalizers the categorical mechanism for forcing equations while losing no information irrelevant to those equations.
[definition: Coequalizer]
Let $f,g:a\rightrightarrows b$ be parallel morphisms in $\mathcal C$. A coequalizer of $f$ and $g$ is a morphism $q:b\to c$ such that $q\circ f=q\circ g$, and such that for every morphism $h:b\to d$ with $h\circ f=h\circ g$, there exists a unique morphism $u:c\to d$ satisfying $u\circ q=h$.
[/definition]
A coequalizer is the colimit of the parallel-pair diagram. In concrete categories it is the categorical form of quotienting $b$ by the smallest equivalence relation (or congruence) that identifies $f(x)$ with $g(x)$ for every $x$ in $a$.
[example: Quotient Spaces As Coequalizers]
Let $X$ be a topological space, let $R\subset X\times X$ be an equivalence relation, regarded as a subspace of the product $X\times X$, and let $p_1,p_2:R\rightrightarrows X$ be the (continuous) projections
\begin{align*}
p_1(x,x')=x,
\qquad
p_2(x,x')=x'.
\end{align*}
Write $q:X\to X/R$ for the quotient map, so $q(x)=[x]$. For every $(x,x')\in R$, the points $x$ and $x'$ have the same equivalence class, hence
\begin{align*}
(q\circ p_1)(x,x')
&=q(p_1(x,x'))\\
&=q(x)\\
&=[x]\\
&=[x']\\
&=q(x')\\
&=q(p_2(x,x'))\\
&=(q\circ p_2)(x,x').
\end{align*}
Thus $q\circ p_1=q\circ p_2$.
Now let $h:X\to Y$ be a continuous map such that $h\circ p_1=h\circ p_2$. Then for every $(x,x')\in R$,
\begin{align*}
h(x)
&=h(p_1(x,x'))\\
&=(h\circ p_1)(x,x')\\
&=(h\circ p_2)(x,x')\\
&=h(p_2(x,x'))\\
&=h(x').
\end{align*}
So $h$ is constant on equivalence classes. Define $\bar h:X/R\to Y$ by
\begin{align*}
\bar h([x])=h(x).
\end{align*}
This is well-defined because if $[x]=[x']$, then $(x,x')\in R$, and the displayed calculation gives $h(x)=h(x')$. For every $x\in X$,
\begin{align*}
(\bar h\circ q)(x)
&=\bar h(q(x))\\
&=\bar h([x])\\
&=h(x),
\end{align*}
so $\bar h\circ q=h$. By the definition of the quotient topology, a map $\bar h:X/R\to Y$ is continuous exactly when $\bar h\circ q:X\to Y$ is continuous; since $\bar h\circ q=h$ is continuous, $\bar h$ is continuous.
Finally, if $u:X/R\to Y$ is another continuous map with $u\circ q=h$, then for every equivalence class $[x]\in X/R$,
\begin{align*}
u([x])
&=u(q(x))\\
&=(u\circ q)(x)\\
&=h(x)\\
&=\bar h([x]).
\end{align*}
Hence $u=\bar h$. Therefore $q:X\to X/R$ is the coequalizer of $p_1$ and $p_2$ in $\mathrm{Top}$: maps out of $X/R$ are exactly continuous maps out of $X$ that identify $R$-equivalent points.
[/example]
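Forgetting the topology, the same quotient can be computed for finite sets. The sketch below is an illustration under stated assumptions: it works in finite sets rather than in $\mathrm{Top}$, the function name `coequalizer` and the parity relation are hypothetical, and the classes are built with a simple union-find structure rather than by any method from the text.

```python
def coequalizer(domain, codomain, f, g):
    """Return the quotient map q : codomain -> classes coequalizing f and g.

    Classes are frozensets of elements of `codomain`, formed by the smallest
    equivalence relation that identifies f(a) with g(a) for every a.
    """
    parent = {b: b for b in codomain}

    def find(b):
        while parent[b] != b:
            parent[b] = parent[parent[b]]  # path halving
            b = parent[b]
        return b

    for a in domain:
        root_f, root_g = find(f(a)), find(g(a))
        if root_f != root_g:
            parent[root_f] = root_g

    classes = {}
    for b in codomain:
        classes.setdefault(find(b), set()).add(b)
    return {b: frozenset(classes[find(b)]) for b in codomain}


# Hypothetical data: X = {0,...,5} with x ~ x' when x and x' have equal parity.
X = range(6)
R = [(x, y) for x in X for y in X if (x - y) % 2 == 0]
q = coequalizer(R, X, lambda p: p[0], lambda p: p[1])

# A map h that is constant on classes factors through q.
h = lambda x: "even" if x % 2 == 0 else "odd"
h_bar = {q[x]: h(x) for x in X}
```

Here `h_bar` plays the role of $\bar h$: it is defined on classes and satisfies $\bar h\circ q=h$ on every element of $X$.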
Pushouts combine the two previous constructions: take two objects receiving maps from a common source, and glue the two targets along the images of that source. The obstruction is that the two targets may contain overlapping information coming from the source, so maps out of the glued object should be exactly pairs of maps out of the targets that agree on that common part. The definition therefore asks for the most general object in which the two copies of the source have been identified.
[definition: Pushout]
Let $f:a\to b$ and $g:a\to c$ be morphisms in $\mathcal C$. A pushout of $f$ and $g$ is an object $p$ together with morphisms $i:b\to p$ and $j:c\to p$ such that
\begin{align*}
i\circ f=j\circ g,
\end{align*}
and such that for every object $x$ with morphisms $r:b\to x$ and $s:c\to x$ satisfying $r\circ f=s\circ g$, there exists a unique morphism $u:p\to x$ satisfying $u\circ i=r$ and $u\circ j=s$.
[/definition]
A pushout is the colimit of a span $b\leftarrow a\to c$. It is often written $b\coprod_a c$ when the maps from $a$ are understood.
This notation becomes usable only after it is converted into a mapping rule. If $b\coprod_a c$ really is the object obtained by gluing $b$ and $c$ along $a$, then a map out of it should be exactly a pair of maps out of $b$ and $c$ that agree on the common source. The following result records this test as the working criterion for using pushouts in computations.
[quotetheorem:4165]
[citeproof:4165]
This is the operational meaning of gluing: to define a map out of the glued object, define maps on the two pieces and verify agreement on the overlap. The pushout hypothesis is essential here. An arbitrary commutative square with the same shape gives a function from maps out of $p$ to compatible pairs, but it may fail to be surjective if some compatible pairs do not extend across $p$, or fail to be injective if two different maps $p\to x$ agree on the images of $b$ and $c$. Thus the gluing property is not a consequence of commutativity alone; it is the universal property that distinguishes the actual pushout from a merely compatible square. This is why concrete pushout computations below always build an object and then verify the universal mapping property, rather than only checking that the square commutes.
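The failure modes just described can be seen in finite sets. The following is a hedged sketch: the helpers `all_maps` and `gluing_test` and the three candidate squares are hypothetical. It compares maps $p\to x$ with compatible pairs for a single test set $x$ only, so a positive answer is evidence for the pushout property rather than a proof of it.

```python
from itertools import product


def all_maps(src, tgt):
    """Every function src -> tgt between finite sets, as a dict."""
    src = list(src)
    return [dict(zip(src, vals)) for vals in product(tgt, repeat=len(src))]


def gluing_test(a, b, c, f, g, p, i, j, x):
    """Compare maps p -> x with compatible pairs (r, s) for one test set x.

    Returns (injective, surjective) for the restriction u |-> (u.i, u.j).
    """
    pairs = {
        (tuple(r[e] for e in b), tuple(s[e] for e in c))
        for r in all_maps(b, x)
        for s in all_maps(c, x)
        if all(r[f[e]] == s[g[e]] for e in a)
    }
    images = [
        (tuple(u[i[e]] for e in b), tuple(u[j[e]] for e in c))
        for u in all_maps(p, x)
    ]
    return len(set(images)) == len(images), set(images) == pairs


# Hypothetical span: b0 and c0 are glued along the single point of a.
a, b, c = [0], ["b0", "b1"], ["c0"]
f, g = {0: "b0"}, {0: "c0"}
x = [0, 1]

i_glue, j_glue = {"b0": "g", "b1": "b1"}, {"c0": "g"}
results = {
    # The actual pushout: b0 and c0 merged into one point "g".
    "glued": gluing_test(a, b, c, f, g, ["g", "b1"], i_glue, j_glue, x),
    # Everything collapsed to one point: commutes, but identifies too much.
    "collapsed": gluing_test(
        a, b, c, f, g, ["*"], {"b0": "*", "b1": "*"}, {"c0": "*"}, x
    ),
    # An extra point not reached from b or c: commutes, but is too large.
    "padded": gluing_test(a, b, c, f, g, ["g", "b1", "extra"], i_glue, j_glue, x),
}
```

All three squares commute, but only the first yields a bijection: the collapsed square misses some compatible pairs, and the padded square lets distinct maps restrict to the same pair.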
[example: Amalgamated Free Products Of Groups]
Let $A\xrightarrow{f}G$ and $A\xrightarrow{g}H$ be group homomorphisms, and let
\begin{align*}
\iota_G:G\to G*H,
\qquad
\iota_H:H\to G*H
\end{align*}
be the canonical homomorphisms into the free product. Let $N\trianglelefteq G*H$ be the normal subgroup generated by the elements
\begin{align*}
\iota_G(f(a))\iota_H(g(a))^{-1}
\end{align*}
for $a\in A$, and define
\begin{align*}
G*_A H=(G*H)/N.
\end{align*}
Write $\pi:G*H\to (G*H)/N$ for the quotient map, and set
\begin{align*}
j_G=\pi\circ\iota_G,
\qquad
j_H=\pi\circ\iota_H.
\end{align*}
For every $a\in A$, the generator $\iota_G(f(a))\iota_H(g(a))^{-1}$ lies in $N$, so its image under $\pi$ is the identity. Hence
\begin{align*}
e
&=\pi\bigl(\iota_G(f(a))\iota_H(g(a))^{-1}\bigr)\\
&=\pi(\iota_G(f(a)))\pi(\iota_H(g(a))^{-1})\\
&=\pi(\iota_G(f(a)))\pi(\iota_H(g(a)))^{-1}.
\end{align*}
Multiplying on the right by $\pi(\iota_H(g(a)))$ gives
\begin{align*}
\pi(\iota_G(f(a)))=\pi(\iota_H(g(a))),
\end{align*}
which is exactly
\begin{align*}
(j_G\circ f)(a)=(j_H\circ g)(a).
\end{align*}
Thus $j_G\circ f=j_H\circ g$.
Now let $K$ be a group, and let $r:G\to K$ and $s:H\to K$ be homomorphisms satisfying
\begin{align*}
r\circ f=s\circ g.
\end{align*}
By the universal property of the free product, there is a unique homomorphism $\varphi:G*H\to K$ such that
\begin{align*}
\varphi\circ\iota_G=r,
\qquad
\varphi\circ\iota_H=s.
\end{align*}
For every $a\in A$,
\begin{align*}
\varphi\bigl(\iota_G(f(a))\iota_H(g(a))^{-1}\bigr)
&=\varphi(\iota_G(f(a)))\varphi(\iota_H(g(a))^{-1})\\
&=(\varphi\circ\iota_G)(f(a))(\varphi\circ\iota_H)(g(a))^{-1}\\
&=r(f(a))s(g(a))^{-1}\\
&=(r\circ f)(a)(s\circ g)(a)^{-1}\\
&=e.
\end{align*}
Since $\ker(\varphi)$ is a normal subgroup of $G*H$ containing all the displayed generators, it contains $N$. Therefore $\varphi$ factors through the quotient by a unique homomorphism $\bar\varphi:G*_A H\to K$ satisfying
\begin{align*}
\bar\varphi\circ\pi=\varphi.
\end{align*}
Then
\begin{align*}
\bar\varphi\circ j_G
&=\bar\varphi\circ\pi\circ\iota_G\\
&=\varphi\circ\iota_G\\
&=r,
\end{align*}
and similarly
\begin{align*}
\bar\varphi\circ j_H
&=\bar\varphi\circ\pi\circ\iota_H\\
&=\varphi\circ\iota_H\\
&=s.
\end{align*}
If $u:G*_A H\to K$ is another homomorphism with $u\circ j_G=r$ and $u\circ j_H=s$, then $u\circ\pi:G*H\to K$ satisfies
\begin{align*}
(u\circ\pi)\circ\iota_G=u\circ j_G=r,
\qquad
(u\circ\pi)\circ\iota_H=u\circ j_H=s.
\end{align*}
By uniqueness for the free product, $u\circ\pi=\varphi$. Since $\pi$ is surjective, $u=\bar\varphi$. Hence $G*_A H$ with the maps $j_G$ and $j_H$ is the pushout of $G\xleftarrow{f}A\xrightarrow{g}H$ in $\mathrm{Grp}$: maps out of it are exactly pairs of group homomorphisms out of $G$ and $H$ that agree on $A$.
[/example]
Directed systems are diagrams indexed by a directed poset. Their colimits record the idea of taking a growing union, but in categories where the connecting maps need not be inclusions.
[definition: Directed System]
Let $(I,\le)$ be a directed poset, that is, a nonempty poset in which every two elements have a common upper bound, regarded as a category with one morphism $i\to j$ whenever $i\le j$. A directed system in $\mathcal C$ is a functor $F:I\to\mathcal C$.
[/definition]
The colimit of such a system is traditionally given a special name because it behaves like a union of successively larger stages when the transition maps are inclusions.
The point of the next definition is to name the categorical object that records eventual agreement among stages, even when the transition maps are arbitrary and the stages are not literal subobjects of one another. This is the categorical version of passing from finite or local approximations to a single object that receives all stages compatibly.
[definition: Direct Limit]
A direct limit of a directed system $F:I\to\mathcal C$ is the colimit $\varinjlim_{i\in I}F(i)$, when this colimit exists.
[/definition]
The notation $\varinjlim$ emphasizes direction: maps go from earlier stages to later stages and then into the limiting object.
[example: Direct Limit Of Modules]
Let $(M_i,\phi_{ij})_{i\le j}$ be a directed system of $R$-modules, and write
\begin{align*}
\iota_i:M_i\to \bigoplus_{k\in I}M_k
\end{align*}
for the canonical inclusion into the $i$-th summand. Let $S$ be the submodule of $\bigoplus_k M_k$ generated by all elements
\begin{align*}
\iota_i(m)-\iota_j(\phi_{ij}(m)),
\qquad i\le j,\ m\in M_i.
\end{align*}
Set
\begin{align*}
L=\left(\bigoplus_{k\in I}M_k\right)/S,
\end{align*}
let $\pi:\bigoplus_kM_k\to L$ be the quotient map, and define
\begin{align*}
\lambda_i=\pi\circ\iota_i:M_i\to L.
\end{align*}
For $i\le j$ and $m\in M_i$, the generator $\iota_i(m)-\iota_j(\phi_{ij}(m))$ lies in $S$, so its image in the quotient is $0$. Hence
\begin{align*}
0
&=\pi\bigl(\iota_i(m)-\iota_j(\phi_{ij}(m))\bigr)\\
&=\pi(\iota_i(m))-\pi(\iota_j(\phi_{ij}(m)))\\
&=\lambda_i(m)-\lambda_j(\phi_{ij}(m)).
\end{align*}
Therefore
\begin{align*}
\lambda_i(m)=\lambda_j(\phi_{ij}(m)),
\end{align*}
so $\lambda_i=\lambda_j\circ\phi_{ij}$. Thus the maps $\lambda_i:M_i\to L$ form a compatible cocone.
Now let $N$ be an $R$-module, and let $\mu_i:M_i\to N$ be a compatible family, meaning that
\begin{align*}
\mu_i=\mu_j\circ\phi_{ij}
\end{align*}
whenever $i\le j$. By the universal property of the direct sum, there is a unique homomorphism
\begin{align*}
\widetilde\mu:\bigoplus_{k\in I}M_k\to N
\end{align*}
such that $\widetilde\mu\circ\iota_i=\mu_i$ for every $i$. For every generator $\iota_i(m)-\iota_j(\phi_{ij}(m))$ of $S$,
\begin{align*}
\widetilde\mu\bigl(\iota_i(m)-\iota_j(\phi_{ij}(m))\bigr)
&=\widetilde\mu(\iota_i(m))-\widetilde\mu(\iota_j(\phi_{ij}(m)))\\
&=(\widetilde\mu\circ\iota_i)(m)-(\widetilde\mu\circ\iota_j)(\phi_{ij}(m))\\
&=\mu_i(m)-\mu_j(\phi_{ij}(m))\\
&=\mu_i(m)-(\mu_j\circ\phi_{ij})(m)\\
&=\mu_i(m)-\mu_i(m)\\
&=0.
\end{align*}
Since $S$ is generated by these elements and $\widetilde\mu$ is a homomorphism, $S\subseteq\ker(\widetilde\mu)$. Hence $\widetilde\mu$ factors uniquely through the quotient by a homomorphism
\begin{align*}
u:L\to N
\end{align*}
satisfying $u\circ\pi=\widetilde\mu$. For every $i$,
\begin{align*}
u\circ\lambda_i
&=u\circ\pi\circ\iota_i\\
&=\widetilde\mu\circ\iota_i\\
&=\mu_i.
\end{align*}
Finally, if $v:L\to N$ is another homomorphism with $v\circ\lambda_i=\mu_i$ for every $i$, then for each $i$,
\begin{align*}
(v\circ\pi)\circ\iota_i
&=v\circ\pi\circ\iota_i\\
&=v\circ\lambda_i\\
&=\mu_i\\
&=\widetilde\mu\circ\iota_i.
\end{align*}
By uniqueness for maps out of the direct sum, $v\circ\pi=\widetilde\mu$. Since $\pi$ is surjective, $v=u$. Therefore $L$ is the direct limit: maps out of $L$ are exactly compatible families of maps out of the modules $M_i$.
[/example]
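A concrete directed system makes the cocone condition tangible. The sketch below assumes a specific example not taken from the text: $I=\mathbb N$, every $M_i=\mathbb Z$, and $\phi_{ij}$ is multiplication by $2^{j-i}$. For this system the direct limit is known to be the dyadic rationals $\mathbb Z[1/2]$, with $\lambda_i(m)=m/2^i$. The code checks only the compatibility $\lambda_i=\lambda_j\circ\phi_{ij}$ on a finite sample. The function names are hypothetical.

```python
from fractions import Fraction


def phi(i, j, m):
    """Transition map M_i -> M_j for i <= j: multiplication by 2**(j - i)."""
    assert i <= j
    return m * 2 ** (j - i)


def lam(i, m):
    """Cocone leg M_i -> Z[1/2], sending m to m / 2**i."""
    return Fraction(m, 2 ** i)


def same_in_limit(i, m, j, n):
    """Whether (i, m) and (j, n) become equal at the later of the two stages.

    Comparing at max(i, j) suffices here because every phi is injective;
    a general system would need to search over all later stages.
    """
    k = max(i, j)
    return phi(i, k, m) == phi(j, k, n)


# Compatibility lam_i = lam_j . phi_ij on a sample of stages and elements.
compatible = all(
    lam(i, m) == lam(j, phi(i, j, m))
    for i in range(4)
    for j in range(i, 6)
    for m in range(-5, 6)
)
```

For instance, the element $3$ at stage $0$ and the element $12$ at stage $2$ both represent $3$ in the limit, while $1$ at stage $1$ represents $1/2$.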
## Constructing Finite Colimits From Coproducts and Coequalizers
How many colimit shapes must a category have before it has all finite colimits? A useful answer is that finite coproducts and coequalizers suffice. The proof is important because it reveals the generators-and-relations nature of finite colimits.
[quotetheorem:4166]
[citeproof:4166]
The construction says: start with the formal disjoint sum of all objects and impose one relation for each arrow of the indexing category. Both hypotheses are doing real work. Without the finite coproducts there may be no ambient object $P$ in which all pieces of the diagram have first been placed side by side; without coequalizers there may be no categorical quotient imposing the arrow relations. The theorem is also finite in scope: it does not imply the existence of filtered colimits, countable coproducts, or arbitrary colimits, because the same construction applied to an infinite diagram requires coproducts indexed by infinite sets of objects and arrows. This is the categorical analogue of presenting a group, module, or topological space by finitely many layers of generators and relations.
[example: Finite Colimit Of A Small Diagram]
Let the span be $x\xleftarrow{f}a\xrightarrow{g}y$. In the coproduct-coequalizer construction, set
\begin{align*}
P=x\coprod a\coprod y,
\qquad
Q=a\coprod a.
\end{align*}
Write the inclusions into $P$ as $\iota_x,\iota_a,\iota_y$, and the two inclusions into $Q$ as $\kappa_f,\kappa_g$. Define two morphisms $d_0,d_1:Q\rightrightarrows P$ by
\begin{align*}
d_0\circ\kappa_f&=\iota_a,
&
d_1\circ\kappa_f&=\iota_x\circ f,\\
d_0\circ\kappa_g&=\iota_a,
&
d_1\circ\kappa_g&=\iota_y\circ g.
\end{align*}
Let $q:P\to L$ be the coequalizer of $d_0$ and $d_1$, and define
\begin{align*}
i=q\circ\iota_x:x\to L,
\qquad
j=q\circ\iota_y:y\to L.
\end{align*}
Since $q\circ d_0=q\circ d_1$, we have
\begin{align*}
q\circ\iota_a
&=q\circ d_0\circ\kappa_f\\
&=q\circ d_1\circ\kappa_f\\
&=q\circ\iota_x\circ f\\
&=i\circ f,
\end{align*}
and also
\begin{align*}
q\circ\iota_a
&=q\circ d_0\circ\kappa_g\\
&=q\circ d_1\circ\kappa_g\\
&=q\circ\iota_y\circ g\\
&=j\circ g.
\end{align*}
Thus $i\circ f=j\circ g$.
Now suppose $r:x\to z$ and $s:y\to z$ satisfy $r\circ f=s\circ g$. By the coproduct universal property, there is a unique morphism $t:P\to z$ such that
\begin{align*}
t\circ\iota_x&=r,\\
t\circ\iota_y&=s,\\
t\circ\iota_a&=r\circ f=s\circ g.
\end{align*}
On the $\kappa_f$-summand of $Q$,
\begin{align*}
t\circ d_0\circ\kappa_f
&=t\circ\iota_a\\
&=r\circ f\\
&=t\circ\iota_x\circ f\\
&=t\circ d_1\circ\kappa_f,
\end{align*}
and on the $\kappa_g$-summand,
\begin{align*}
t\circ d_0\circ\kappa_g
&=t\circ\iota_a\\
&=s\circ g\\
&=t\circ\iota_y\circ g\\
&=t\circ d_1\circ\kappa_g.
\end{align*}
Hence $t\circ d_0=t\circ d_1$, so the coequalizer property gives a unique morphism $u:L\to z$ with $u\circ q=t$. Then
\begin{align*}
u\circ i
&=u\circ q\circ\iota_x\\
&=t\circ\iota_x\\
&=r,
\end{align*}
and similarly $u\circ j=s$. If $v:L\to z$ also satisfies $v\circ i=r$ and $v\circ j=s$, then $v\circ q$ agrees with $t$ on the three coproduct summands $x$, $a$, and $y$, because
\begin{align*}
v\circ q\circ\iota_x&=r=t\circ\iota_x,\\
v\circ q\circ\iota_y&=s=t\circ\iota_y,\\
v\circ q\circ\iota_a&=v\circ i\circ f=r\circ f=t\circ\iota_a.
\end{align*}
By uniqueness for the coproduct $P$, $v\circ q=t=u\circ q$, and by uniqueness in the coequalizer factorization, $v=u$. Therefore $L$ has the pushout universal property, so it is canonically isomorphic to $x\coprod_a y$. Pushouts are thus the first non-discrete finite colimits produced by the coproduct-coequalizer method.
[/example]
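In finite sets the coproduct-coequalizer recipe for the span can be carried out directly. This is a minimal sketch under stated assumptions: it works in finite sets, the function name `pushout` and the sample data are hypothetical, and a class representative chosen by union-find stands in for the coequalizer map $q$.

```python
def pushout(a, x, y, f, g):
    """Pushout of x <-f- a -g-> y in finite sets.

    Follows the coproduct-coequalizer recipe: P = x + a + y with tagged
    elements, and one relation per arrow of the span.
    """
    P = [("x", e) for e in x] + [("a", e) for e in a] + [("y", e) for e in y]
    parent = {p: p for p in P}

    def find(p):
        while parent[p] != p:
            p = parent[p]
        return p

    def identify(p1, p2):
        r1, r2 = find(p1), find(p2)
        if r1 != r2:
            parent[r1] = r2

    for e in a:
        identify(("a", e), ("x", f(e)))  # d_0 and d_1 on the kappa_f summand
        identify(("a", e), ("y", g(e)))  # d_0 and d_1 on the kappa_g summand

    q = {p: find(p) for p in P}  # class representative plays the role of q
    i = {e: q[("x", e)] for e in x}
    j = {e: q[("y", e)] for e in y}
    return set(q.values()), i, j


# Hypothetical data: glue two three-element sets along one shared point.
a, x, y = [0], ["p", "q", "r"], ["u", "v", "w"]
f = lambda e: "p"
g = lambda e: "u"
L, i, j = pushout(a, x, y, f, g)
```

The seven tagged elements of $P$ collapse to five classes, and the square commutes because $i(f(0))$ and $j(g(0))$ are the same class.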
A particularly useful special case is the existence of pushouts from binary coproducts and coequalizers. The general coproduct-coequalizer construction can look abstract because it refers to an entire finite diagram at once. Pushouts isolate the first genuinely gluing-shaped case: two objects are placed side by side, and the two images of a common source are then identified. This makes the two required ingredients visible in the smallest nontrivial colimit problem.
[quotetheorem:4167]
[citeproof:4167]
This proof is often the easiest way to compute pushouts concretely: build the coproduct and then impose the identifications forced by the common source. The two hypotheses cannot be separated in this argument. Binary coproducts alone provide the ambient object $b\coprod c$ but do not provide the quotient that forces $\iota_b\circ f$ and $\iota_c\circ g$ to agree. Coequalizers alone can impose equality between parallel arrows once those arrows exist, but they do not supply the object $b\coprod c$ into which the two targets have been jointly placed. Thus the construction isolates the two stages of gluing: juxtaposition first, identification second.
## Quotients and Generators-And-Relations Constructions
Why do colimits appear so often in algebra? Algebraic objects are commonly built by freely generating a large object and then quotienting by relations. Colimits provide the invariant language for this process, independent of the chosen presentation.
[explanation: Generators And Relations As Colimits]
A coproduct supplies generators by placing independent pieces next to each other. A coequalizer supplies relations by forcing two specified maps to become equal. Combining them lets us build objects from formal pieces and imposed equations. In varieties of algebraic structures, such as groups, rings, and modules, many finite colimits can therefore be computed as a free construction followed by a quotient by the congruence generated by the diagram relations.
[/explanation]
The most familiar quotient construction already has exactly this shape. A normal subgroup is not merely a subset to collapse; it is the relation that says each element of $N$ should become indistinguishable from the identity.
[example: Quotient Groups As Coequalizers]
Let $G$ be a group and let $N\trianglelefteq G$. Let $i:N\to G$ be the inclusion, let $e_N:N\to G$ be the homomorphism $e_N(n)=e_G$, and let $q:G\to G/N$ be the quotient homomorphism $q(g)=gN$. For every $n\in N$,
\begin{align*}
(q\circ i)(n)
&=q(i(n))\\
&=q(n)\\
&=nN\\
&=N\\
&=e_GN\\
&=q(e_G)\\
&=q(e_N(n))\\
&=(q\circ e_N)(n).
\end{align*}
Thus $q\circ i=q\circ e_N$.
Now let $h:G\to K$ be a group homomorphism such that $h\circ i=h\circ e_N$. For every $n\in N$,
\begin{align*}
h(n)
&=h(i(n))\\
&=(h\circ i)(n)\\
&=(h\circ e_N)(n)\\
&=h(e_G)\\
&=e_K,
\end{align*}
so $N\subseteq\ker(h)$. Define $\bar h:G/N\to K$ by
\begin{align*}
\bar h(gN)=h(g).
\end{align*}
This is well-defined: if $gN=g'N$, then $g^{-1}g'\in N$, and hence
\begin{align*}
h(g)^{-1}h(g')
&=h(g^{-1})h(g')\\
&=h(g^{-1}g')\\
&=e_K,
\end{align*}
so $h(g)=h(g')$. It is a homomorphism because for all $g,g'\in G$,
\begin{align*}
\bar h((gN)(g'N))
&=\bar h(gg'N)\\
&=h(gg')\\
&=h(g)h(g')\\
&=\bar h(gN)\bar h(g'N).
\end{align*}
For every $g\in G$,
\begin{align*}
(\bar h\circ q)(g)
&=\bar h(q(g))\\
&=\bar h(gN)\\
&=h(g),
\end{align*}
so $\bar h\circ q=h$.
If $u:G/N\to K$ is another homomorphism satisfying $u\circ q=h$, then for every coset $gN\in G/N$,
\begin{align*}
u(gN)
&=u(q(g))\\
&=(u\circ q)(g)\\
&=h(g)\\
&=\bar h(gN).
\end{align*}
Hence $u=\bar h$. Therefore $q:G\to G/N$ is the coequalizer of $i$ and $e_N$ in $\mathrm{Grp}$: maps out of $G/N$ are exactly homomorphisms out of $G$ that send every element of $N$ to the identity.
[/example]
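A small finite case shows the factorization at work. The sketch below assumes the hypothetical example $G=\mathbb Z/6$ written additively, $N=\{0,3\}$, and $h:\mathbb Z/6\to\mathbb Z/3$ given by reduction mod $3$. Cosets are modelled as frozensets, which is an implementation choice rather than anything prescribed by the text.

```python
n = 6
G = range(n)
N = [0, 3]  # subgroup of Z/6; every subgroup of an abelian group is normal


def q(g):
    """Quotient map G -> G/N, sending g to its coset g + N."""
    return frozenset((g + k) % n for k in N)


i = lambda k: k      # inclusion N -> G
e_N = lambda k: 0    # trivial homomorphism N -> G

# A homomorphism h : Z/6 -> Z/3 that kills N, so it should factor through q.
h = lambda g: g % 3
h_bar = {q(g): h(g) for g in G}

cosets = {q(g) for g in G}
```

The dictionary `h_bar` is well defined precisely because $h$ takes the same value on both elements of each coset, which is the condition $N\subseteq\ker(h)$.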
Tensor products use the same colimit mechanism, but the imposed relations encode bilinearity rather than normal-subgroup collapse. This makes the construction a useful second test of how generators and relations are packaged categorically.
[example: Tensor Product Of Modules As A Coequalizer]
Let $R$ be a ring, let $M$ be a right $R$-module, and let $N$ be a left $R$-module. Write $F$ for the free abelian group on the set $M\times N$, and denote the basis element corresponding to $(m,n)$ by $[m,n]$. Let $S$ be the free abelian group on symbols
\begin{align*}
\rho_{m,m',n},
\qquad
\sigma_{m,n,n'},
\qquad
\tau_{m,r,n},
\end{align*}
where $m,m'\in M$, $n,n'\in N$, and $r\in R$. Define homomorphisms $d_0,d_1:S\rightrightarrows F$ on these basis symbols by
\begin{align*}
d_0(\rho_{m,m',n})&=[m+m',n],
&
d_1(\rho_{m,m',n})&=[m,n]+[m',n],\\
d_0(\sigma_{m,n,n'})&=[m,n+n'],
&
d_1(\sigma_{m,n,n'})&=[m,n]+[m,n'],\\
d_0(\tau_{m,r,n})&=[mr,n],
&
d_1(\tau_{m,r,n})&=[m,rn].
\end{align*}
Let $H\le F$ be the subgroup generated by the elements
\begin{align*}
[m+m',n]-[m,n]-[m',n],\\
[m,n+n']-[m,n]-[m,n'],\\
[mr,n]-[m,rn],
\end{align*}
and set
\begin{align*}
T=F/H.
\end{align*}
Let $q:F\to T$ be the quotient homomorphism. For each generator of $S$,
\begin{align*}
q(d_0(\rho_{m,m',n}))-q(d_1(\rho_{m,m',n}))
&=q([m+m',n])-q([m,n]+[m',n])\\
&=q([m+m',n]-[m,n]-[m',n])\\
&=0,
\end{align*}
and similarly
\begin{align*}
q(d_0(\sigma_{m,n,n'}))-q(d_1(\sigma_{m,n,n'}))
&=q([m,n+n']-[m,n]-[m,n'])\\
&=0,\\
q(d_0(\tau_{m,r,n}))-q(d_1(\tau_{m,r,n}))
&=q([mr,n]-[m,rn])\\
&=0.
\end{align*}
Since $S$ is free on these symbols, the equalities on basis elements give
\begin{align*}
q\circ d_0=q\circ d_1.
\end{align*}
Now let $A$ be an abelian group, and let $h:F\to A$ be a homomorphism satisfying $h\circ d_0=h\circ d_1$. Define
\begin{align*}
b:M\times N\to A,
\qquad
b(m,n)=h([m,n]).
\end{align*}
For $m,m'\in M$ and $n\in N$,
\begin{align*}
b(m+m',n)
&=h([m+m',n])\\
&=h(d_0(\rho_{m,m',n}))\\
&=h(d_1(\rho_{m,m',n}))\\
&=h([m,n]+[m',n])\\
&=h([m,n])+h([m',n])\\
&=b(m,n)+b(m',n).
\end{align*}
The same argument with $\sigma_{m,n,n'}$ gives
\begin{align*}
b(m,n+n')=b(m,n)+b(m,n'),
\end{align*}
and the argument with $\tau_{m,r,n}$ gives
\begin{align*}
b(mr,n)=b(m,rn).
\end{align*}
Thus $b$ is biadditive and balanced.
Conversely, suppose $b:M\times N\to A$ is biadditive and satisfies $b(mr,n)=b(m,rn)$. By freeness of $F$, there is a unique homomorphism $h:F\to A$ such that
\begin{align*}
h([m,n])=b(m,n)
\end{align*}
for all $m,n$. For the generators of $S$,
\begin{align*}
h(d_0(\rho_{m,m',n}))
&=h([m+m',n])\\
&=b(m+m',n)\\
&=b(m,n)+b(m',n)\\
&=h([m,n])+h([m',n])\\
&=h([m,n]+[m',n])\\
&=h(d_1(\rho_{m,m',n})),
\end{align*}
and similarly
\begin{align*}
h(d_0(\sigma_{m,n,n'}))&=h(d_1(\sigma_{m,n,n'})),\\
h(d_0(\tau_{m,r,n}))&=h(d_1(\tau_{m,r,n})).
\end{align*}
Therefore $h\circ d_0=h\circ d_1$.
Since $h\circ d_0=h\circ d_1$, every generator of $H$ is sent to $0$ by $h$, so $H\subseteq\ker(h)$. Hence $h$ factors uniquely through the quotient as a homomorphism
\begin{align*}
\bar h:T\to A
\end{align*}
with $\bar h\circ q=h$. In particular,
\begin{align*}
\bar h(q([m,n]))=h([m,n])=b(m,n).
\end{align*}
Thus the coequalizer of $d_0$ and $d_1$ is precisely the quotient $F/H$, which is the tensor product $M\otimes_R N$: homomorphisms $M\otimes_R N\to A$ are exactly balanced biadditive maps $M\times N\to A$.
[/example]
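The universal property can be exercised on a small case by exhibiting a balanced biadditive map. The sketch below assumes the hypothetical example $R=\mathbb Z$, $M=\mathbb Z/4$, $N=\mathbb Z/6$, $A=\mathbb Z/2$, and $b(m,n)=mn \bmod 2$. It checks the three defining identities by brute force and does not construct the quotient $F/H$ itself.

```python
from math import gcd

a, c = 4, 6            # M = Z/4 and N = Z/6, viewed as Z-modules
d = gcd(a, c)          # target group A = Z/d


def b(m, n):
    """Candidate balanced biadditive map M x N -> Z/d."""
    return (m * n) % d


M, N = range(a), range(c)

additive_left = all(
    b((m + m2) % a, n) == (b(m, n) + b(m2, n)) % d
    for m in M for m2 in M for n in N
)
additive_right = all(
    b(m, (n + n2) % c) == (b(m, n) + b(m, n2)) % d
    for m in M for n in N for n2 in N
)
balanced = all(
    b((m * r) % a, n) == b(m, (r * n) % c)
    for m in M for n in N for r in range(-3, 4)
)
```

Since all three checks pass, the universal property gives a homomorphism $\mathbb Z/4\otimes_{\mathbb Z}\mathbb Z/6\to\mathbb Z/2$ sending $q([m,n])$ to $mn \bmod 2$. It is nonzero because $b(1,1)=1$, consistent with the standard fact that $\mathbb Z/a\otimes_{\mathbb Z}\mathbb Z/c\cong\mathbb Z/\gcd(a,c)$.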
The tensor product example shows that colimits do not merely describe quotients after they are known; they are a systematic recipe for constructing quotients with prescribed universal mapping properties. The categorical issue is that a quotient should identify exactly the pairs forced by the diagram and no more, while still remaining an object of the same category. Coequalizers express this by turning two competing maps into one common map, and the universal property records precisely which later maps respect the imposed relation.
[quotetheorem:4168]
[citeproof:4168]
This theorem is the algebraic template behind quotient groups, quotient rings, quotient modules, and presentations by generators and relations. Its hypotheses should not be ignored. The category must have quotient objects by the relevant generated congruences, and the quotient map must remain inside the category under consideration. In an arbitrary concrete category, a set-theoretic quotient may fail to carry the required structure, or may carry it but fail to satisfy the correct universal property among the allowed morphisms. For instance, subcategories defined by extra separation, finiteness, or completeness conditions often do not contain the naive quotient. Coequalizer presentations are therefore algebraic facts about categories with suitable congruence quotients, not automatic consequences of having underlying sets.
[remark: Colimits As Free Gluing]
The phrase "free gluing" means that the colimit imposes the relations demanded by the diagram and no extra relations detectable by maps out of the result. The universal property records this precisely: maps out of the colimit are exactly compatible families of maps out of the diagram.
[/remark]
Chapter 6 uses this colimit perspective alongside the limit theory of Chapter 4. Since left adjoints preserve colimits, many colimits can be computed by moving them through a left adjoint, while right adjoints preserve the dual constructions, namely limits.
With both limits and colimits in hand, the next question is not just whether individual constructions exist, but whether a category can support all the diagram-shaped problems we want to solve. Completeness and cocontinuity turn those existence questions into a systematic framework, especially when combined with the behavior of adjoints.
# 6. Completeness, Cocontinuity, and Creation of Limits
Completeness asks whether a category has enough universal objects to solve all diagram-shaped matching problems. In Chapters 4 and 5, limits and colimits appeared one shape at a time: products, equalizers, pullbacks, coproducts, coequalizers, and pushouts. This chapter packages that behaviour into completeness, studies how functors interact with these constructions, and isolates a size phenomenon that prevents small complete categories from being rich unless set-theoretic hypotheses are stated with care.
## Complete and Cocomplete Categories
When a category is used as a universe of mathematical objects, the first structural question is whether every small diagram has both a universal cone and a universal cocone. This turns isolated universal properties into a systematic calculus: once products and equalizers exist in the right sizes, all small limits can be assembled from them.
[definition: Small Diagram]
Let $\mathcal C$ be a category. A small diagram in $\mathcal C$ is a functor $D:J\to \mathcal C$ whose domain category $J$ is small.
[/definition]
The word small is doing work: the objects of $J$ and the arrows of $J$ must each form a set rather than a proper class.
Once the allowed diagrams have been bounded in this way, one can ask whether every set-sized system of compatibility equations can be solved by a universal object inside the same category. Completeness is the condition that no small limiting problem forces us to leave the category.
[definition: Complete Category]
A category $\mathcal C$ is complete if every small diagram $D:J\to \mathcal C$ has a limit in $\mathcal C$.
[/definition]
The dual existence problem is about building universal receivers rather than universal sources.
Colimit constructions can fail for the same size and closure reasons as limits: a diagram may require a quotient, gluing, or freely assembled object that the category does not contain. Cocompleteness rules out this obstruction by requiring every small diagram to have its universal receiver inside the category.
[definition: Cocomplete Category]
A category $\mathcal C$ is cocomplete if every small diagram $D:J\to \mathcal C$ has a colimit in $\mathcal C$.
[/definition]
For a fixed supply of small diagrams, completeness says that arbitrary compatible families can be represented by a single universal object. Cocompleteness says that arbitrary systems of identifications and generators can be represented by a single universal quotient or gluing.
These definitions are broad, but in practice one wants smaller checklists. The following criterion explains why products and equalizers are enough to build every small limit: products collect all candidate components, while equalizers impose the equations forced by the arrows of the diagram.
[quotetheorem:4169]
[citeproof:4169]
The dual construction gives colimits from coproducts and coequalizers. In practice this theorem is the main reason categories of algebraic structures are complete: products are coordinatewise, and equalizers are subobjects cut out by equations. The theorem is not merely a checklist of ingredients; it says that every small diagram can be reduced to one large product together with equations enforcing compatibility along arrows. Its usefulness depends on small products and equalizers existing inside the category, so size restrictions and closure under equation-defined subobjects both matter.
Cocompleteness is the dual existence problem. Instead of selecting compatible tuples inside a product, one starts with a coproduct of all pieces and then identifies elements that the diagram forces to agree. This changes the construction from solving equations in a product to imposing relations in a quotient, and it is the colimit counterpart to the preceding completeness criterion.
[quotetheorem:4170]
[citeproof:4170]
This dual criterion is useful for the same reason as the completeness criterion, but the construction runs in the opposite direction. Coproducts freely assemble all objects appearing in the diagram, and coequalizers impose the relations forced by the arrows. The hypotheses are not cosmetic: without enough coproducts there is no single ambient object containing all generators, and without coequalizers there is no canonical way to quotient by the diagram relations. Algebraic categories usually satisfy the criterion because free sums and quotients by congruences exist, while categories lacking the relevant quotients can fail to be cocomplete even when many individual colimits exist.
[example: Limits in Sets]
Let $D:J\to\mathsf{Set}$ be a small diagram. Define
\begin{align*}
L=\{(x_j)_{j\in\operatorname{Ob}J}\in\prod_{j\in\operatorname{Ob}J}D(j):D(\alpha)(x_j)=x_k\text{ for every arrow }\alpha:j\to k\}.
\end{align*}
For each object $j\in J$, let $\pi_j:L\to D(j)$ be the restriction of the $j$th product projection, so $\pi_j((x_i)_i)=x_j$. If $\alpha:j\to k$ is an arrow in $J$, then every $(x_i)_i\in L$ satisfies
\begin{align*}
(D(\alpha)\circ \pi_j)((x_i)_i)
&=D(\alpha)(\pi_j((x_i)_i))\\
&=D(\alpha)(x_j)\\
&=x_k\\
&=\pi_k((x_i)_i),
\end{align*}
so $D(\alpha)\circ\pi_j=\pi_k$. Thus the maps $\pi_j$ form a cone $L\to D$.
Now let $S$ be a set with a cone $(f_j:S\to D(j))_{j\in\operatorname{Ob}J}$, so for every arrow $\alpha:j\to k$ we have $D(\alpha)\circ f_j=f_k$. Define $u:S\to L$ by
\begin{align*}
u(s)=(f_j(s))_{j\in\operatorname{Ob}J}.
\end{align*}
This tuple lies in $L$ because, for every $\alpha:j\to k$,
\begin{align*}
D(\alpha)(f_j(s))
&=(D(\alpha)\circ f_j)(s)\\
&=f_k(s).
\end{align*}
Moreover, for each $j$ and each $s\in S$,
\begin{align*}
(\pi_j\circ u)(s)=\pi_j((f_i(s))_i)=f_j(s),
\end{align*}
so $\pi_j\circ u=f_j$. If $v:S\to L$ is another function with $\pi_j\circ v=f_j$ for every $j$, then for each $s\in S$ the tuple $v(s)$ has $j$th coordinate
\begin{align*}
\pi_j(v(s))=(\pi_j\circ v)(s)=f_j(s),
\end{align*}
so $v(s)=(f_j(s))_j=u(s)$. Hence $v=u$, and $L$ with its coordinate projections is the limit of $D$ in $\mathsf{Set}$.
[/example]
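For finite diagrams of finite sets, the compatible-tuple description is directly computable. The following is a brute-force sketch: the function name `limit`, the encoding of a diagram as a dictionary of sets plus a list of arrows, and the sample cospan are all hypothetical choices for illustration.

```python
from itertools import product


def limit(objects, arrows):
    """Limit of a finite diagram of finite sets.

    objects: dict mapping each vertex j to the finite set D(j).
    arrows:  list of triples (j, k, func) standing for D(alpha) : D(j) -> D(k).
    Returns the compatible tuples, each stored as a dict {vertex: element}.
    """
    vertices = list(objects)
    result = []
    for values in product(*(objects[j] for j in vertices)):
        tup = dict(zip(vertices, values))
        if all(func(tup[j]) == tup[k] for j, k, func in arrows):
            result.append(tup)
    return result


# Hypothetical diagram: the cospan X -> Z <- Y, whose limit is a pullback.
objects = {"X": range(4), "Y": range(6), "Z": range(2)}
arrows = [("X", "Z", lambda x: x % 2), ("Y", "Z", lambda y: y % 2)]
L = limit(objects, arrows)

# A cone with vertex S = {0, 1, 2}; u is the induced map into the limit.
cone = {"X": lambda s: s, "Y": lambda s: s + 2, "Z": lambda s: s % 2}
u = {s: {j: cone[j](s) for j in objects} for s in range(3)}
```

Each value `u[s]` is the tuple $(f_j(s))_j$ from the example, and it lands in `L` because the cone legs are compatible with the two arrows.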
The set example is more than a sanity check: it is the model for how many structured limits are computed. A limit can often be formed by first taking the ordinary set of compatible tuples and then checking that the relevant operations preserve compatibility. This raises a general preservation question for forgetful functors: when does the underlying set of a limit in a structured category coincide with the limit of the underlying sets?
[quotetheorem:4171]
[citeproof:4171]
The set-theoretic construction is also a template for algebraic categories. The difference is that the subset of compatible tuples must inherit operations, and quotient sets must carry well-defined operations. For limits, this inheritance is often straightforward because operations are defined coordinatewise and the compatibility equations are preserved by homomorphisms. For colimits, the corresponding quotient construction is usually more delicate because one must prove that the generated relation is a congruence for all operations.
Groups provide the first important test case for this principle. Their products are computed on underlying sets with coordinatewise multiplication, and equalizers are subgroups determined by equations between homomorphisms. If arbitrary limits are assembled from products and equalizers, then the forgetful functor to sets should preserve those assembled limits as well.
The point is not only that particular products or equalizers of groups have familiar descriptions. We need a uniform statement saying that every small limiting construction in groups can be checked after forgetting to sets, so that compatibility equations and coordinatewise operations are enough to recover the universal group cone.
[quotetheorem:4172]
[citeproof:4172]
[example: Limits of Groups on Underlying Sets]
Let $D:J\to\mathsf{Grp}$ be a small diagram. Its group limit can be written as the compatible-tuple subset
\begin{align*}
L=\{(x_j)_{j\in\operatorname{Ob}J}\in\prod_{j\in\operatorname{Ob}J}D(j):D(\alpha)(x_j)=x_k\text{ for every arrow }\alpha:j\to k\}.
\end{align*}
The multiplication, identity, and inverses are inherited coordinatewise from the product group:
\begin{align*}
(x_j)_{j}(y_j)_{j}&=(x_jy_j)_{j},\\
1_L&=(1_{D(j)})_{j},\\
(x_j)_{j}^{-1}&=(x_j^{-1})_{j}.
\end{align*}
To see that these operations stay inside $L$, let $\alpha:j\to k$ be an arrow. If $(x_i)_i,(y_i)_i\in L$, then
\begin{align*}
D(\alpha)(x_jy_j)
&=D(\alpha)(x_j)D(\alpha)(y_j)\\
&=x_ky_k,
\end{align*}
because $D(\alpha):D(j)\to D(k)$ is a group homomorphism and both tuples are compatible. Also
\begin{align*}
D(\alpha)(1_{D(j)})&=1_{D(k)},\\
D(\alpha)(x_j^{-1})
&=D(\alpha)(x_j)^{-1}\\
&=x_k^{-1},
\end{align*}
so the identity tuple and inverse tuples are compatible as well. Thus $L$ is a subgroup of the product group $\prod_j D(j)$.
For each $j$, the projection $\pi_j:L\to D(j)$ is a group homomorphism, since
\begin{align*}
\pi_j((x_i)_i(y_i)_i)
&=\pi_j((x_iy_i)_i)\\
&=x_jy_j\\
&=\pi_j((x_i)_i)\pi_j((y_i)_i).
\end{align*}
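The defining compatibility condition of $L$ gives the cone condition $D(\alpha)\circ\pi_j=\pi_k$ for every arrow $\alpha:j\to k$. If $(f_j:G\to D(j))_j$ is a cone of group homomorphisms, the mediating function $g\mapsto(f_j(g))_j$ from the set-level construction is a homomorphism because each $f_j$ is, so $L$ is the limit of $D$ in $\mathsf{Grp}$. Hence the underlying set $U(L)$ is exactly the set-theoretic limit of the underlying diagram $UD:J\to\mathsf{Set}$, and the forgetful functor $U:\mathsf{Grp}\to\mathsf{Set}$ sends this group limit to the corresponding set limit.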
[/example]
The group calculation illustrates a broader pattern for algebraic categories whose operations are defined by equations. Limits are usually created on underlying sets because compatible tuples remain compatible after applying the operations coordinatewise. This suggests that the forgetful functor does not merely preserve a few particular limits, but actually creates them: the set-level limiting object carries a unique algebraic structure making the projections homomorphisms.
[example: Products and Equalizers in Modules]
Let $(M_i)_{i\in I}$ be a set-indexed family of left $R$-modules. The product is the Cartesian product
\begin{align*}
\prod_{i\in I}M_i
=\{(x_i)_{i\in I}:x_i\in M_i\text{ for every }i\in I\},
\end{align*}
with coordinatewise operations
\begin{align*}
(x_i)_i+(y_i)_i&=(x_i+y_i)_i,\\
r(x_i)_i&=(rx_i)_i.
\end{align*}
The coproduct is the direct sum
\begin{align*}
\bigoplus_{i\in I}M_i
=\{(x_i)_{i\in I}\in\prod_{i\in I}M_i:x_i=0\text{ for all but finitely many }i\}.
\end{align*}
If $(x_i)_i$ and $(y_i)_i$ are finitely supported, then the support of $(x_i+y_i)_i$ is contained in the union of their two finite supports, and the support of $r(x_i)_i$ is contained in the support of $(x_i)_i$, so the direct sum is closed under addition and scalar multiplication.
Now let $f,g:M\to N$ be $R$-linear maps. The categorical equalizer is the subset
\begin{align*}
\operatorname{Eq}(f,g)
=\{x\in M:f(x)=g(x)\}.
\end{align*}
The kernel of $f-g$ is
\begin{align*}
\ker(f-g)
&=\{x\in M:(f-g)(x)=0\}\\
&=\{x\in M:f(x)-g(x)=0\}\\
&=\{x\in M:f(x)=g(x)\}.
\end{align*}
Thus
\begin{align*}
\operatorname{Eq}(f,g)=\ker(f-g).
\end{align*}
This identifies equalizers in $R\text{-}\mathsf{Mod}$ with ordinary kernels of $R$-linear maps, so the categorical construction is the familiar linear-algebraic one.
[/example]
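The identity $\operatorname{Eq}(f,g)=\ker(f-g)$ can be checked in a concrete case; the maps below are chosen only for illustration. Take $R=\mathbb Z$, $M=\mathbb Z^2$, $N=\mathbb Z$, and
\begin{align*}
f(x,y)&=x+y,\\
g(x,y)&=2x,\\
(f-g)(x,y)&=y-x.
\end{align*}
Then
\begin{align*}
\operatorname{Eq}(f,g)
&=\{(x,y)\in\mathbb Z^2:x+y=2x\}\\
&=\{(x,y)\in\mathbb Z^2:y-x=0\}\\
&=\ker(f-g),
\end{align*}
which is the diagonal copy of $\mathbb Z$ inside $\mathbb Z^2$.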
Kernels explain finite equalizer-shaped limits in module categories. Inverse systems ask for the same compatibility idea across infinitely many stages, so the next computation shows how products and equalizers combine to produce a limit.
[example: Inverse Systems of Modules]
Let $(M_n,p_n)_{n\ge 1}$ be an inverse system of left $R$-modules, with $p_n:M_{n+1}\to M_n$ an $R$-linear map for each $n\ge 1$. Put
\begin{align*}
P=\prod_{n\ge 1}M_n.
\end{align*}
Define two $R$-linear maps $a,b:P\to P$ by their $n$th coordinates:
\begin{align*}
a((x_m)_{m\ge 1})_n&=x_n,\\
b((x_m)_{m\ge 1})_n&=p_n(x_{n+1}).
\end{align*}
The map $a$ is the identity on $P$. The map $b$ is $R$-linear because, for $(x_m)_m,(y_m)_m\in P$ and $r\in R$,
\begin{align*}
b((x_m)_m+(y_m)_m)_n
&=b((x_m+y_m)_m)_n\\
&=p_n(x_{n+1}+y_{n+1})\\
&=p_n(x_{n+1})+p_n(y_{n+1})\\
&=b((x_m)_m)_n+b((y_m)_m)_n,
\end{align*}
and
\begin{align*}
b(r(x_m)_m)_n
&=b((rx_m)_m)_n\\
&=p_n(rx_{n+1})\\
&=r\,p_n(x_{n+1})\\
&=r\,b((x_m)_m)_n.
\end{align*}
Thus the equalizer of $a$ and $b$ is
\begin{align*}
\operatorname{Eq}(a,b)
&=\{(x_n)_{n\ge 1}\in P:a((x_n)_n)=b((x_n)_n)\}\\
&=\{(x_n)_{n\ge 1}\in P:x_n=p_n(x_{n+1})\text{ for all }n\ge 1\}.
\end{align*}
Therefore
\begin{align*}
\varprojlim M_n
=\{(x_n)\in\prod_{n\ge 1}M_n:p_n(x_{n+1})=x_n\text{ for all }n\ge 1\}.
\end{align*}
The condition says that every coordinate is determined compatibly with the transition map from the next coordinate, which is why inverse limits record coherent infinite systems of approximations in algebra and homological algebra.
[/example]
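A standard instance, given here as an illustration rather than as part of the development: fix a prime $\ell$, let $M_n=\mathbb Z/\ell^n\mathbb Z$ as a $\mathbb Z$-module, and let $p_n:\mathbb Z/\ell^{n+1}\mathbb Z\to\mathbb Z/\ell^n\mathbb Z$ be reduction modulo $\ell^n$. The inverse limit is the group of $\ell$-adic integers:
\begin{align*}
\varprojlim \mathbb Z/\ell^n\mathbb Z=\mathbb Z_\ell.
\end{align*}
The tuple with coordinates $x_n=\ell^n-1 \bmod \ell^n$ is compatible, because
\begin{align*}
(\ell^{n+1}-1)-(\ell^n-1)&=\ell^n(\ell-1)\equiv 0 \pmod{\ell^n},\\
p_n(x_{n+1})&=x_n.
\end{align*}
Each coordinate is the class of $-1$, so this tuple is the element $-1$ of $\mathbb Z_\ell$.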
The inverse-limit formula for modules suggests a broader completeness principle: module categories should have all small limits, not just the sequential ones that can be written as an equalizer of two maps. The remaining issue is whether arbitrary diagrams can still be reduced to products and compatibility equations while preserving the module structure coordinate by coordinate.
[quotetheorem:4173]
[citeproof:4173]
This result supplies the existence backdrop for the functorial language that follows. It says that the coordinatewise construction used for inverse limits is not an accident of sequences: arbitrary small diagrams of modules can be solved by taking a product and imposing the appropriate compatibility equations. The abelian-group operations and scalar multiplication survive those equations, so the limiting object remains a module rather than merely a set with extra data. Once limits are available in the categories being compared, it becomes meaningful to ask whether a functor preserves a chosen limit, detects that a cone is already limiting, or creates the limiting object from an underlying construction.
## Functors Preserving, Reflecting, and Creating Limits
Once limits exist in several categories, the next question is whether a functor transports, detects, or constructs them. These are different levels of compatibility. Preservation says a known universal object remains universal after applying the functor; reflection says universality can be detected after applying the functor; creation says the target-category limit can be lifted uniquely back to the source.
[definition: Preservation of Limits]
Let $F:\mathcal C\to\mathcal D$ be a functor and let $D:J\to\mathcal C$ be a small diagram. The functor $F$ preserves the limit of $D$ if, whenever $L$ with cone $\lambda:L\to D$ is a limit of $D$, the cone $F\lambda:FL\to FD$ is a limit of $FD:J\to\mathcal D$.
[/definition]
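Preservation is a statement about carrying an existing construction forward. Right adjoints preserve all limits, so any forgetful functor that is a right adjoint, such as $U:\mathsf{Grp}\to\mathsf{Set}$ with the free-group functor as its left adjoint, preserves every limit that exists in its source.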
There is a different question when a functor forgets structure: if the image of a cone is already universal after applying the functor, was the original cone universal before anything was forgotten? The definition of reflection isolates this detection property, separating it from the transport property of preservation.
[definition: Reflection of Limits]
Let $F:\mathcal C\to\mathcal D$ be a functor and let $D:J\to\mathcal C$ be a small diagram. The functor $F$ reflects the limit of $D$ if any cone $\lambda:L\to D$ whose image $F\lambda:FL\to FD$ is a limit cone in $\mathcal D$ is already a limit cone in $\mathcal C$.
[/definition]
Reflection is a detection property. It does not say that limits exist; it says that a candidate whose image is universal was already universal before applying $F$.
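A minimal example, with notation chosen for illustration, separates reflection from preservation. Let $\mathbf 1$ be the category with one object $\ast$ and only its identity arrow, and let $F:\mathcal C\to\mathbf 1$ be the unique functor. Every cone in $\mathbf 1$ is a limit cone, since any two parallel arrows in $\mathbf 1$ agree and every mediating arrow exists:
\begin{align*}
\mathbf 1(\ast,\ast)=\{\operatorname{id}_\ast\}.
\end{align*}
Hence $F$ preserves every limit that exists in $\mathcal C$. But $F$ reflects limits only if every cone in $\mathcal C$ is already a limit cone. This fails for $\mathcal C=\mathsf{Set}$: the cone over the empty diagram with apex $\{0,1\}$ is not limiting, because the terminal object of $\mathsf{Set}$ is a one-point set.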
For forgetful functors, there is a stronger and more constructive question: can a limit formed after forgetting structure be lifted back uniquely with the missing structure restored? The next definition captures this situation, where the functor does not merely detect a limit but supplies the source-category limit from the target-category one.
[definition: Creation of Limits]
Let $F:\mathcal C\to\mathcal D$ be a functor and let $D:J\to\mathcal C$ be a small diagram. The functor $F$ creates the limit of $D$ if every limit cone $\mu:X\to FD$ in $\mathcal D$ has a unique lift to a cone $\lambda:L\to D$ in $\mathcal C$ with $FL=X$, $F\lambda=\mu$, and this lifted cone is a limit cone in $\mathcal C$.
[/definition]
Creation is stronger than preservation for the diagrams it applies to: the functor does not merely respect the limit, but allows it to be built in the source from the underlying limit in the target. The strongest examples come from algebraic forgetful functors.
The three notions are related but not interchangeable. The obstruction is that a functor may transport known limits without detecting false candidates, or detect universal cones without giving any way to lift target-side limits back to the source. The formal implications below isolate exactly what creation automatically supplies.
[quotetheorem:4175]
[citeproof:4175]
The theorem packages the relation between the three notions: creation is the most constructive condition, while preservation and reflection capture what remains visible after applying the functor. This distinction is useful because many forgetful functors remember enough underlying data to reconstruct algebraic limits, even when arbitrary functors only transport already existing universal cones.
For algebraic categories, the central obstruction is uniqueness of structure on the underlying limit. A set-theoretic limit of the underlying diagram is only useful if there is exactly one compatible algebraic structure making the projections homomorphisms; the next result records this stronger creation property for the standard forgetful functors.
[quotetheorem:4176]
[citeproof:4176]
This theorem explains why many computations in algebra begin by writing down a set of compatible tuples and only afterwards checking algebraic structure. The uniqueness clause is essential: the algebraic structure on the underlying set limit is forced by the requirement that the projections be homomorphisms.
[example: The Forgetful Functor from Modules Creates Limits]
Let $U:R\text{-}\mathsf{Mod}\to\mathsf{Set}$ be the forgetful functor and let $D:J\to R\text{-}\mathsf{Mod}$ be a small diagram. The underlying set limit of $UD:J\to\mathsf{Set}$ is
\begin{align*}
L=\{(x_j)_{j\in\operatorname{Ob}J}\in\prod_{j\in\operatorname{Ob}J}UD(j):D(\alpha)(x_j)=x_k\text{ for every arrow }\alpha:j\to k\}.
\end{align*}
Define addition and scalar multiplication coordinatewise by
\begin{align*}
(x_j)_j+(y_j)_j&=(x_j+y_j)_j,\\
r(x_j)_j&=(rx_j)_j.
\end{align*}
These formulas stay inside $L$. Indeed, if $(x_i)_i,(y_i)_i\in L$ and $\alpha:j\to k$ is an arrow of $J$, then $D(\alpha):D(j)\to D(k)$ is $R$-linear, so
\begin{align*}
D(\alpha)(x_j+y_j)
&=D(\alpha)(x_j)+D(\alpha)(y_j)\\
&=x_k+y_k.
\end{align*}
Similarly, for $r\in R$,
\begin{align*}
D(\alpha)(rx_j)
&=rD(\alpha)(x_j)\\
&=rx_k.
\end{align*}
The zero tuple also lies in $L$, since
\begin{align*}
D(\alpha)(0_{D(j)})=0_{D(k)}.
\end{align*}
Thus $L$ is closed under addition, scalar multiplication, and zero.
The module axioms hold coordinatewise. For example, if $(x_j)_j,(y_j)_j,(z_j)_j\in L$ and $r,s\in R$, then
\begin{align*}
((x_j)_j+(y_j)_j)+(z_j)_j
&=(x_j+y_j)_j+(z_j)_j\\
&=((x_j+y_j)+z_j)_j\\
&=(x_j+(y_j+z_j))_j\\
&=(x_j)_j+(y_j+z_j)_j\\
&=(x_j)_j+((y_j)_j+(z_j)_j),
\end{align*}
and
\begin{align*}
r((x_j)_j+(y_j)_j)
&=r(x_j+y_j)_j\\
&=(r(x_j+y_j))_j\\
&=(rx_j+ry_j)_j\\
&=(rx_j)_j+(ry_j)_j\\
&=r(x_j)_j+r(y_j)_j.
\end{align*}
The remaining module axioms are checked in the same coordinatewise way from the corresponding axioms in each module $D(j)$.
For each $j$, let $\pi_j:L\to D(j)$ be the $j$th projection. It is $R$-linear because
\begin{align*}
\pi_j((x_i)_i+(y_i)_i)
&=\pi_j((x_i+y_i)_i)\\
&=x_j+y_j\\
&=\pi_j((x_i)_i)+\pi_j((y_i)_i),
\end{align*}
and
\begin{align*}
\pi_j(r(x_i)_i)
&=\pi_j((rx_i)_i)\\
&=rx_j\\
&=r\pi_j((x_i)_i).
\end{align*}
If $\alpha:j\to k$, then for every $(x_i)_i\in L$,
\begin{align*}
(D(\alpha)\circ\pi_j)((x_i)_i)
&=D(\alpha)(x_j)\\
&=x_k\\
&=\pi_k((x_i)_i),
\end{align*}
so the projections form a cone in $R\text{-}\mathsf{Mod}$.
Now let $M$ be an $R$-module with a cone of $R$-linear maps $(f_j:M\to D(j))_j$. Define $u:M\to L$ by
\begin{align*}
u(m)=(f_j(m))_j.
\end{align*}
This tuple lies in $L$ because for every arrow $\alpha:j\to k$,
\begin{align*}
D(\alpha)(f_j(m))
&=(D(\alpha)\circ f_j)(m)\\
&=f_k(m).
\end{align*}
The map $u$ is $R$-linear, since
\begin{align*}
u(m+n)
&=(f_j(m+n))_j\\
&=(f_j(m)+f_j(n))_j\\
&=(f_j(m))_j+(f_j(n))_j\\
&=u(m)+u(n),
\end{align*}
and
\begin{align*}
u(rm)
&=(f_j(rm))_j\\
&=(rf_j(m))_j\\
&=r(f_j(m))_j\\
&=ru(m).
\end{align*}
Also $\pi_j\circ u=f_j$ for every $j$. If $v:M\to L$ is another $R$-linear map with $\pi_j\circ v=f_j$ for every $j$, then for each $m\in M$ the $j$th coordinate of $v(m)$ is
\begin{align*}
\pi_j(v(m))=(\pi_j\circ v)(m)=f_j(m),
\end{align*}
so $v(m)=(f_j(m))_j=u(m)$ for every $m$. Hence $v=u$.
Thus the module structure on the underlying set limit is forced by the requirement that all projections be $R$-linear, and $U$ creates the limit of $D$.
[/example]
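The forcing claim can be spelled out in one step. Suppose, as a hypothetical, that $+'$ and $\cdot'$ denote any $R$-module structure on the set $L$ for which every projection $\pi_j$ is $R$-linear. Then for $x=(x_j)_j$ and $y=(y_j)_j$ in $L$ and $r\in R$,
\begin{align*}
\pi_j(x+'y)&=\pi_j(x)+\pi_j(y)=x_j+y_j,\\
\pi_j(r\cdot' x)&=r\,\pi_j(x)=rx_j.
\end{align*}
Since an element of $L$ is determined by its coordinates, this gives
\begin{align*}
x+'y&=(x_j+y_j)_j,\\
r\cdot' x&=(rx_j)_j,
\end{align*}
so any such structure coincides with the coordinatewise one.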
This creation result is deliberately one-sided. Forgetful functors often preserve and create limits because equations can be checked on underlying sets, but colimits usually require new formal expressions or quotienting that changes the underlying set.
[remark: Colimits Are Different for Forgetful Functors]
Algebraic forgetful functors usually do not create colimits, and often do not even preserve them. For example, the coproduct of two groups is their free product, whose elements are reduced words, so its underlying set is not the disjoint union of the underlying sets. This contrast is one reason adjunctions and monadicity become central in later treatments of algebraic categories.
[/remark]
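For a concrete sense of the gap, take two copies of the cyclic group of order two, with generators named $a$ and $b$ for illustration. The disjoint union of the underlying sets has four elements, while the coproduct in $\mathsf{Grp}$ is the free product, whose elements are the reduced alternating words:
\begin{align*}
\mathbb Z/2*\mathbb Z/2=\{1,\ a,\ b,\ ab,\ ba,\ aba,\ bab,\ abab,\ \dots\}.
\end{align*}
This group is infinite, so the underlying set of the coproduct is not the coproduct of the underlying sets, and $U:\mathsf{Grp}\to\mathsf{Set}$ does not preserve this colimit.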
## Small Complete Categories and Size Issues
The final question in this chapter is whether a category can be both small and complete without collapsing. If all objects and all morphisms form sets, and if all set-indexed limits exist, then the category has products indexed by its own set of morphisms, and mapping into such a product yields more morphisms than the category contains unless parallel arrows coincide. Under the usual size hypotheses this forces the category to be a preorder.
[definition: Preorder Category]
A category $\mathcal C$ is a preorder if for every pair of objects $A,B\in\mathcal C$ the hom-set $\mathcal C(A,B)$ has at most one element.
[/definition]
A preorder category may still have many objects, but there is no nontrivial parallel morphism data.
The obstruction is that a small complete category has enough internally indexed products to compare all of its own morphisms at once. If two parallel arrows were distinct, these products would give a universal comparison object that detects the difference and forces a contradiction. The collapse result identifies the precise consequence of combining smallness with all small limits.
[quotetheorem:4177]
[citeproof:4177]
The theorem is often quoted in a sharper-sounding form as: every small complete category is a preorder. The size hypotheses are part of the statement, not decoration. Large complete categories such as $\mathsf{Set}$, $\mathsf{Grp}$, $R\text{-}\mathsf{Mod}$, and $\mathsf{Top}$ can have many parallel arrows because they are not small: their collections of objects and morphisms are proper classes, while the diagrams whose limits are required remain small.
[remark: Why Completeness Is Not the Same as Having a Largest Product]
The collapse theorem is not saying that a small category cannot have finite products or many useful limits. It says that asking for products indexed by the full morphism set of the category is already strong enough to force preorder behaviour. Many small categories used in algebra have finite limits, or limits of a restricted shape, without being preorders.
[/remark]
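The role of the morphism-indexed product can be seen in a short counting sketch, following the standard argument usually attributed to Freyd; the symbols $\kappa$, $A$, $B$, $P$, $f$, $g$ are introduced here only for this sketch. Let $\kappa$ be the cardinality of the set of all morphisms of a small category $\mathcal C$, and suppose $f\ne g:A\to B$. If the $\kappa$-fold product $P=\prod_{\kappa}B$ exists, its universal property gives
\begin{align*}
\mathcal C(A,P)&\cong\prod_{\kappa}\mathcal C(A,B),\\
|\mathcal C(A,P)|&=|\mathcal C(A,B)|^{\kappa}\ge 2^{\kappa}>\kappa.
\end{align*}
But $\mathcal C(A,P)$ is a subset of the set of all morphisms of $\mathcal C$, so $|\mathcal C(A,P)|\le\kappa$, a contradiction. Hence no two distinct parallel arrows can exist once such products are available.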
The boundary is easiest to see in a one-object category. Such a category can be very small and still have many parallel endomorphisms, so it separates smallness from the much stronger assumption of all small products.
[example: A Small One-Object Category Need Not Be a Preorder]
Let $M$ be a nontrivial monoid and let $\mathcal C_M$ be the one-object category whose unique object is $*$ and whose endomorphism monoid is
\begin{align*}
\mathcal C_M(*,*)=M.
\end{align*}
Composition in $\mathcal C_M$ is multiplication in $M$, and the identity morphism of $*$ is the identity element $1_M$. Since $\mathcal C_M$ has one object and the morphisms form the set $M$, it is small. If $a,b\in M$ with $a\ne b$, then $a,b:* \to *$ are two distinct parallel morphisms, so $\mathcal C_M$ is not a preorder.
The binary product of $*$ with itself, if it exists, must again have underlying object $*$, because there is no other object. Thus it consists of two chosen morphisms $p_1,p_2:* \to *$, equivalently two elements $p_1,p_2\in M$, such that for every pair $f,g\in M$ there is a unique $h\in M$ satisfying
\begin{align*}
p_1h&=f,\\
p_2h&=g.
\end{align*}
Equivalently, the function
\begin{align*}
M&\longrightarrow M\times M,\\
h&\longmapsto (p_1h,p_2h)
\end{align*}
must be a bijection. More generally, a chosen $n$-fold product of $*$ with itself is exactly a tuple $p_1,\dots,p_n\in M$ such that
\begin{align*}
M&\longrightarrow M^n,\\
h&\longmapsto (p_1h,\dots,p_nh)
\end{align*}
is a bijection.
Thus smallness alone does not remove parallel morphisms: every nontrivial monoid gives a small one-object category that is not a preorder. This example is not claiming that every such one-object category has finite products; rather, the displayed calculation shows exactly what finite products would demand inside the monoid. The collapse theorem needs the much stronger hypothesis of all small products, not merely smallness or the presence of a few chosen finite products.
[/example]
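For finite monoids the product condition in the example can be tested by counting. As an illustration, if $M$ is finite with $|M|=m\ge 2$, then
\begin{align*}
|M\times M|=m^2>m=|M|,
\end{align*}
so no function $M\to M\times M$ is a bijection and $\mathcal C_M$ has no binary product $*\times *$. For instance, $M=\mathbb Z/2\mathbb Z$ under addition has $|M|=2$ and $|M\times M|=4$. Counting alone does not settle the infinite case, where $|M\times M|=|M|$.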
The other way to avoid the collapse is to leave the small world. Standard categories can be complete and still have many parallel arrows because they are large: the products needed for small limits are indexed by sets, never by the whole collection of morphisms.
[example: Large Complete Categories Avoid the Collapse]
The category $\mathsf{Set}$ has nontrivial parallel morphisms. For example, define two functions
\begin{align*}
f,g:\{0\}&\longrightarrow \{0,1\}
\end{align*}
by
\begin{align*}
f(0)&=0,\\
g(0)&=1.
\end{align*}
Since $0\ne 1$ in $\{0,1\}$, the two functions have different values at the same element $0\in\{0\}$, so $f\ne g$. Thus $\mathsf{Set}$ is not a preorder.
This does not contradict the small-category collapse theorem, because $\mathsf{Set}$ is not small: there is no set whose elements are all sets. Indeed, if $V$ were a set containing every set as an element, then every subset of $V$ would be an element of $V$, so $\mathcal P(V)\subseteq V$ and $|\mathcal P(V)|\le|V|$; but *Cantor's theorem* gives $|\mathcal P(V)|>|V|$, a contradiction.
The limits required for completeness of $\mathsf{Set}$ are still small limits: the indexing category $J$ must have a set of objects and a set of arrows. For such a diagram $D:J\to\mathsf{Set}$, the product $\prod_{j\in\operatorname{Ob}J}D(j)$ is indexed by the set $\operatorname{Ob}J$, and the compatibility conditions are indexed by the set of arrows of $J$. Thus $\mathsf{Set}$ can have all small limits while avoiding the collapse theorem, because the ambient category is large even though each diagram used to form a limit is small.
[/example]
Size distinctions therefore separate two roles of the word small. A diagram is small when it is indexed by a set-sized category; a category is small when its whole object and morphism collections are sets. Completeness permits all small diagrams, but the categories in which this is most useful are usually large.
When limits and colimits are viewed as universal solutions to diagrammatic extension problems, Kan extensions become the natural next step. They express the best possible approximation to transport along a functor, and the adjoint functor theorems explain when such best approximations exist.
# 7. Kan Extensions and Adjoint Functor Theorems
Kan extensions answer the following problem: given a functor already defined on one category, how should it be extended or compared along another functor when no literal extension is available? The point is not to choose values object by object, but to choose the best approximation measured by a universal property. This chapter connects that idea to adjunctions: Kan extensions generalise adjoint functors, and adjoint functor theorems give practical hypotheses under which adjoints exist.
## Universal Approximations Along a Functor
Suppose $K: \mathcal C \to \mathcal D$ and $F: \mathcal C \to \mathcal E$ are functors. When can $F$ be extended from $\mathcal C$ to $\mathcal D$ along $K$? If there is no genuine extension $L: \mathcal D \to \mathcal E$ with $LK = F$, the next best object is a functor $L$ equipped with a natural transformation relating $LK$ and $F$, universal among all such choices.
[definition: Left Kan Extension]
Let $K: \mathcal C \to \mathcal D$ and $F: \mathcal C \to \mathcal E$ be functors. A left Kan extension of $F$ along $K$ is a functor $\operatorname{Lan}_K F: \mathcal D \to \mathcal E$ together with a natural transformation
\begin{align*}
\eta: F \to (\operatorname{Lan}_K F)K
\end{align*}
such that for every functor $G: \mathcal D \to \mathcal E$ and every natural transformation $\alpha: F \to GK$, there exists a unique natural transformation $\bar{\alpha}: \operatorname{Lan}_K F \to G$ satisfying $\alpha = (\bar{\alpha}K)\eta$.
[/definition]
Thus $\operatorname{Lan}_K F$ is the initial way of mapping $F$ into a functor that factors through $K$. The arrow direction is important: it says that left Kan extension is a universal approximation to $F$ from above, in the same variance as a left adjoint.
There is also a dual approximation problem in which functors factoring through $K$ map into $F$ rather than receive maps from it. Reversing the natural transformation reverses the universal property and leads to the right Kan extension.
[definition: Right Kan Extension]
Let $K: \mathcal C \to \mathcal D$ and $F: \mathcal C \to \mathcal E$ be functors. A right Kan extension of $F$ along $K$ is a functor $\operatorname{Ran}_K F: \mathcal D \to \mathcal E$ together with a natural transformation
\begin{align*}
\varepsilon: (\operatorname{Ran}_K F)K \to F
\end{align*}
such that for every functor $G: \mathcal D \to \mathcal E$ and every natural transformation $\beta: GK \to F$, there exists a unique natural transformation $\bar{\beta}: G \to \operatorname{Ran}_K F$ satisfying $\beta = \varepsilon(\bar{\beta}K)$.
[/definition]
Right Kan extension is the terminal way of mapping a functor that factors through $K$ into $F$. The two notions are formally dual, but in examples they often compute different objects: colimit-like constructions for left Kan extensions and limit-like constructions for right Kan extensions.
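A basic test case, sketched under the assumption that the relevant colimit and limit exist in $\mathcal E$: take $\mathcal D=\mathbf 1$, the category with one object $\ast$ and only its identity arrow, and let $K:\mathcal C\to\mathbf 1$ be the unique functor. A functor $G:\mathbf 1\to\mathcal E$ is an object $e\in\mathcal E$, and $GK$ is the constant functor $\Delta e$. A natural transformation $F\to\Delta e$ is a cocone under $F$ with vertex $e$, and a natural transformation $\Delta e\to F$ is a cone over $F$ with apex $e$. The two universal properties therefore read
\begin{align*}
(\operatorname{Lan}_K F)(\ast)&\cong\operatorname{colim}F,\\
(\operatorname{Ran}_K F)(\ast)&\cong\lim F.
\end{align*}
In this sense colimits and limits are the special cases of left and right Kan extension along the functor to the one-object category.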
The next structural question is how these universal approximations relate to ordinary restriction of functors. Precomposition along $K$ forgets information by viewing a functor on $\mathcal D$ only on the image of $\mathcal C$. When a Kan extension exists, it should be the universal way to undo that restriction on the left or on the right, which is exactly an adjunction statement.
[quotetheorem:4178]
[citeproof:4178]
This theorem explains the slogan that Kan extensions are adjoints to restriction. Let $K^*: [\mathcal D,\mathcal E] \to [\mathcal C,\mathcal E]$ denote precomposition with $K$, so $K^*(G)=GK$. The existence hypothesis is essential: if the target category lacks the relevant colimits or limits, the representing object in the displayed bijection may not exist, as happens when trying to form a left Kan extension into a category with too few coproducts. The theorem does not say that every functor has a Kan extension, nor that a pointwise formula is available; it only identifies the universal property once the representing functor exists. When $\operatorname{Lan}_K F$ exists for all $F$, these bijections assemble into an adjunction $\operatorname{Lan}_K \dashv K^*$, and when $\operatorname{Ran}_K F$ exists for all $F$, they assemble into $K^* \dashv \operatorname{Ran}_K$.
[example: Inclusion Of A Subcategory]
Let $i:\mathcal A\hookrightarrow \mathcal C$ be the inclusion of a full subcategory and let $F:\mathcal A\to\mathcal E$ be a functor. For an object $c\in\mathcal C$, the pointwise left Kan extension, when it exists, is computed from the comma category $(i\downarrow c)$:
\begin{align*}
(\operatorname{Lan}_iF)(c)\cong \operatorname{colim}_{(a,u)\in(i\downarrow c)}F(a).
\end{align*}
An object of $(i\downarrow c)$ is an arrow $u:i(a)\to c$ in $\mathcal C$, so this colimit assembles all values $F(a)$ from objects of $\mathcal A$ that map into $c$.
Now suppose $c$ itself lies in $\mathcal A$. Then $(c,\operatorname{id}_c)$ is a terminal object of $(i\downarrow c)$. Indeed, for any object $(a,u:i(a)\to c)$, fullness of $i$ gives a morphism $f:a\to c$ in $\mathcal A$ with
\begin{align*}
i(f)=u.
\end{align*}
The condition for $f$ to define a morphism $(a,u)\to(c,\operatorname{id}_c)$ is
\begin{align*}
\operatorname{id}_c\circ i(f)=u,
\end{align*}
and this holds because $\operatorname{id}_c\circ i(f)=i(f)=u$. Uniqueness follows because $i$ is an inclusion, hence faithful. Therefore the colimit over $(i\downarrow c)$ collapses to the value at this terminal object:
\begin{align*}
(\operatorname{Lan}_iF)(c)
&\cong \operatorname{colim}_{(a,u)\in(i\downarrow c)}F(a)\\
&\cong F(c).
\end{align*}
Thus, on objects already belonging to the full subcategory, the pointwise left Kan extension recovers the original functor; outside $\mathcal A$, it gives the universal value assembled from all maps out of objects of $\mathcal A$ into the chosen object of $\mathcal C$.
[/example]
## Pointwise Kan Extensions
The universal property defines a Kan extension globally in the functor category $[\mathcal D,\mathcal E]$. How can we compute its value at a single object $d \in \mathcal D$? Pointwise Kan extensions answer this by replacing the global problem with a colimit or limit over a comma category of objects of $\mathcal C$ mapping toward or away from $d$.
[definition: Comma Category For Left Kan Extension]
Let $K: \mathcal C \to \mathcal D$ be a functor and let $d \in \mathcal D$. The comma category $(K \downarrow d)$ has objects pairs $(c,u)$ with $c \in \mathcal C$ and $u: Kc \to d$ in $\mathcal D$. A morphism $(c,u) \to (c',u')$ is a morphism $f: c \to c'$ in $\mathcal C$ such that $u'Kf = u$.
[/definition]
The category $(K \downarrow d)$ records all ways in which $d$ receives information from objects in the image of $K$. It is therefore the indexing category for assembling the left Kan extension at $d$.
For a right Kan extension, the value at $d$ is constrained by all ways that $d$ maps into objects of the form $Kc$. The indexing category must therefore reverse the incidence data so that compatible families form a limit rather than a colimit. This motivates the dual comma category used in the pointwise formula.
[definition: Comma Category For Right Kan Extension]
Let $K: \mathcal C \to \mathcal D$ be a functor and let $d \in \mathcal D$. The comma category $(d \downarrow K)$ has objects pairs $(c,u)$ with $c \in \mathcal C$ and $u: d \to Kc$ in $\mathcal D$. A morphism $(c,u) \to (c',u')$ is a morphism $f: c \to c'$ in $\mathcal C$ such that $Kf\,u = u'$.
[/definition]
This category records all ways in which $d$ maps into objects coming from $\mathcal C$. It indexes the compatible cones used to compute the right Kan extension at $d$.
The remaining computational question is whether these comma categories really recover the global Kan extension one object at a time. The pointwise theorem answers this by identifying the left case with a colimit over $(K\downarrow d)$ and the right case with a limit over $(d\downarrow K)$, under the needed existence hypotheses.
[quotetheorem:4179]
[citeproof:4179]
Pointwise formulas make Kan extensions usable. The colimit and limit hypotheses are needed object by object: for example, a left Kan extension into a category without the required coequalizers can fail even when all the comma categories are small. Functoriality in $d$ is also part of the theorem; separate colimits at each object do not by themselves produce a functor $\mathcal D \to \mathcal E$. The theorem does not identify all Kan extensions with pointwise ones in arbitrary enriched or large settings, but in ordinary locally small categories with the required small limits or colimits it gives the main computational tool used below.
[example: Presheaf Extension From A Basis]
Let $X$ be a topological space, let $\mathcal B$ be a basis ordered by inclusion, and let $i:\mathcal B\hookrightarrow \operatorname{Open}(X)$ be the inclusion. A presheaf on the basis is a functor $F:\mathcal B^{\mathrm{op}}\to \mathsf{Set}$, so an inclusion $B'\subseteq B$ of basic opens gives a restriction map
\begin{align*}
F(B)\to F(B').
\end{align*}
We compute the right Kan extension of $F$ along $i^{\mathrm{op}}:\mathcal B^{\mathrm{op}}\to \operatorname{Open}(X)^{\mathrm{op}}$. For an open set $U$, an object of the comma category $(U\downarrow i^{\mathrm{op}})$ is a basic open $B\in\mathcal B$ together with a morphism
\begin{align*}
U\to i^{\mathrm{op}}(B)
\end{align*}
in $\operatorname{Open}(X)^{\mathrm{op}}$. Reversing arrows, this is exactly an inclusion
\begin{align*}
B\subseteq U
\end{align*}
in $\operatorname{Open}(X)$. A morphism $(B\subseteq U)\to (B'\subseteq U)$ in $(U\downarrow i^{\mathrm{op}})$ is a morphism $B\to B'$ in $\mathcal B^{\mathrm{op}}$, equivalently an inclusion $B'\subseteq B$ in $\mathcal B$, such that the composite
\begin{align*}
U\to B\to B'
\end{align*}
in $\operatorname{Open}(X)^{\mathrm{op}}$ equals the given arrow $U\to B'$. In $\operatorname{Open}(X)$ this says that the composite inclusion
\begin{align*}
B'\subseteq B\subseteq U
\end{align*}
is the inclusion $B'\subseteq U$.
Therefore the pointwise value is the limit
\begin{align*}
(\operatorname{Ran}_{i^{\mathrm{op}}}F)(U)
&\cong \lim_{(B\subseteq U)\in (U\downarrow i^{\mathrm{op}})} F(B).
\end{align*}
Concretely, this limit is the set of families
\begin{align*}
(s_B)_{B\subseteq U,\ B\in\mathcal B}
\end{align*}
with $s_B\in F(B)$ such that whenever $B'\subseteq B\subseteq U$, the restriction of $s_B$ to $B'$ equals $s_{B'}$:
\begin{align*}
F(B\supseteq B')(s_B)=s_{B'}.
\end{align*}
Thus the right Kan extension assigns to $U$ precisely the compatible families of sections on all basis elements inside $U$, which is the categorical form of extending local presheaf data from a basis to all open sets.
[/example]
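A small instance, with the space chosen for illustration: let $X=\{a,b\}$ carry the discrete topology and let $\mathcal B=\{\{a\},\{b\}\}$, which is a basis because every open set is a union of members of $\mathcal B$ (the empty set being the empty union). There are no proper inclusions between basic opens, so the compatibility conditions are vacuous and
\begin{align*}
(\operatorname{Ran}_{i^{\mathrm{op}}}F)(X)&\cong F(\{a\})\times F(\{b\}),\\
(\operatorname{Ran}_{i^{\mathrm{op}}}F)(\{a\})&\cong F(\{a\}),\\
(\operatorname{Ran}_{i^{\mathrm{op}}}F)(\varnothing)&\cong\{\ast\}.
\end{align*}
The last value is the limit of the empty diagram, a one-point set, since no basic open is contained in $\varnothing$.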
The next remark addresses a common source of direction errors: colimits collapse at terminal objects, while limits collapse at initial objects.
[remark: Initial And Final Objects In Comma Categories]
If $(K \downarrow d)$ has a terminal object $(c_0,u_0)$, then the pointwise left Kan extension at $d$ is isomorphic to $F(c_0)$. If $(d \downarrow K)$ has an initial object $(c_0,u_0)$, then the pointwise right Kan extension at $d$ is isomorphic to $F(c_0)$. These cases explain why some Kan extension computations collapse to evaluation at a single best approximation.
[/remark]
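As a quick illustration of the second case, return to the basis example above and assume, hypothetically, that the open set $U$ is itself a member of $\mathcal B$. Then $(U\subseteq U)$ is an object of $(U\downarrow i^{\mathrm{op}})$, and it is initial: every object $(B\subseteq U)$ receives exactly one morphism from it, namely the one given by the inclusion $B\subseteq U$. Under this assumption the limit collapses to evaluation at that object:
\begin{align*}
(\operatorname{Ran}_{i^{\mathrm{op}}}F)(U)
&\cong \lim_{(B\subseteq U)\in (U\downarrow i^{\mathrm{op}})} F(B)\\
&\cong F(U),
\end{align*}
with a compatible family $(s_B)$ corresponding to its component $s_U$ and recovered from it by $s_B=F(U\supseteq B)(s_U)$. In this sketch the extension agrees with $F$ on basic opens.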
## Kan Extensions And Algebraic Change Of Base
Many algebraic constructions are described as changing the category in which an object is viewed. How does a module over one ring become a module over another ring, and which constructions are universal? Kan extensions provide the functor-category template behind restriction, induction, and coinduction.
Let $\varphi: R \to S$ be a ring homomorphism. Restriction of scalars sends a left $S$-module $N$ to the left $R$-module obtained by defining $r \cdot n = \varphi(r)n$. This gives a functor
\begin{align*}
\varphi^*: S\operatorname{-Mod} \to R\operatorname{-Mod}.
\end{align*}
[quotetheorem:4180]
[citeproof:4180]
This theorem is a familiar adjunction, but it also reflects the Kan extension pattern. The ring homomorphism hypothesis supplies the bimodule structures on $S$: it is a left $S$-module and right $R$-module for the tensor product $S \otimes_R M$, and a left $R$-module and right $S$-module for $\operatorname{Hom}_R(S,M)$. Without the homomorphism, neither the tensor product nor the displayed $S$-action on $\operatorname{Hom}_R(S,M)$ is defined. The theorem does not say that extension and coextension of scalars are equivalent: for instance, over $\mathbb Z \to \mathbb Q$, tensoring with $\mathbb Q$ kills torsion while $\operatorname{Hom}_{\mathbb Z}(\mathbb Q,-)$ detects divisible structure instead. This distinction is the algebraic shadow of the general difference between left and right Kan extensions.
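A short computation makes the contrast over $\mathbb Z\to\mathbb Q$ concrete. The sketch below takes $M=\mathbb Z$ and $M=\mathbb Z/n$ with $n\ge 1$ as test modules; these choices are illustrative and are not part of the theorem. For extension of scalars,
\begin{align*}
\mathbb Q\otimes_{\mathbb Z}\mathbb Z &\cong \mathbb Q,\\
q\otimes x &= \tfrac{q}{n}\otimes nx = \tfrac{q}{n}\otimes 0 = 0 \quad\text{in } \mathbb Q\otimes_{\mathbb Z}\mathbb Z/n,
\end{align*}
so $\mathbb Q\otimes_{\mathbb Z}\mathbb Z/n=0$. For coextension, the image of any homomorphism $\mathbb Q\to M$ is a divisible subgroup of $M$, and the only divisible subgroup of $\mathbb Z$ or of $\mathbb Z/n$ is $0$. Hence
\begin{align*}
\operatorname{Hom}_{\mathbb Z}(\mathbb Q,\mathbb Z)&=0, & \operatorname{Hom}_{\mathbb Z}(\mathbb Q,\mathbb Z/n)&=0.
\end{align*}
The two functors therefore already disagree on $M=\mathbb Z$: one returns $\mathbb Q$ and the other returns $0$.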
[example: Group Actions As Functors]
Let $G$ and $H$ be groups, let $\theta:G\to H$ be a group homomorphism, and regard each group as a one-object category, so that a functor $X:G\to\operatorname{Set}$ is a left $G$-set. Restriction along $\theta$ sends a left $H$-set $Y$ to the left $G$-set with the same underlying set and action
\begin{align*}
g\cdot y=\theta(g)\cdot y.
\end{align*}
We compute the left Kan extension of $X$ along $\theta$. In the comma category $(\theta\downarrow *)$, an object is an arrow $* \to *$ in the one-object category $H$, hence an element $h\in H$. A morphism $h\to h'$ is an element $g\in G$ satisfying
\begin{align*}
h'\theta(g)=h.
\end{align*}
The diagram indexed by $(\theta\downarrow *)$ sends every object $h$ to $X$, and sends such a morphism $g:h\to h'$ to the action map
\begin{align*}
X&\to X,\\
x&\mapsto g\cdot x.
\end{align*}
Therefore the colimit is the quotient of $H\times X$ by the relations
\begin{align*}
(h,x)\sim(h',g\cdot x)
\end{align*}
whenever $h'\theta(g)=h$. Writing $h=h'\theta(g)$, this is equivalently
\begin{align*}
(h'\theta(g),x)\sim(h',g\cdot x).
\end{align*}
Thus
\begin{align*}
(\operatorname{Lan}_{\theta}X)(*)\cong H\times_G X,
\end{align*}
where $H\times_G X$ denotes the quotient of $H\times X$ by
\begin{align*}
(h\theta(g),x)\sim(h,g\cdot x).
\end{align*}
The left $H$-action is induced by left multiplication:
\begin{align*}
k\cdot[h,x]=[kh,x].
\end{align*}
This is well-defined because
\begin{align*}
k\cdot[h\theta(g),x]
&=[kh\theta(g),x]\\
&=[kh,g\cdot x]\\
&=k\cdot[h,g\cdot x].
\end{align*}
For the right Kan extension, the comma category $(*\downarrow\theta)$ again has objects $h\in H$, and a morphism $h\to h'$ is an element $g\in G$ satisfying
\begin{align*}
\theta(g)h=h'.
\end{align*}
Its limit consists of families $(x_h)_{h\in H}$ with $x_h\in X$ such that for every $g\in G$ and $h\in H$,
\begin{align*}
x_{\theta(g)h}=g\cdot x_h.
\end{align*}
Equivalently, this is the set of functions $f:H\to X$ satisfying
\begin{align*}
f(\theta(g)h)=g\cdot f(h)
\end{align*}
for all $g\in G$ and $h\in H$. Hence the right Kan extension is the coinduced $H$-set of $G$-equivariant functions from $H$ to $X$, with $H$ acting by right translation on the input:
\begin{align*}
(k\cdot f)(h)=f(hk).
\end{align*}
Induction therefore forms orbits of pairs $(h,x)$, while coinduction forms compatible families indexed by $H$.
[/example]
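A useful sanity check on the two formulas of this example, offered as a sketch, is the case where $H=1$ is the trivial group and $\theta:G\to 1$ is the unique homomorphism. Then $\theta(g)=1$ for every $g\in G$, and the relations $(h\theta(g),x)\sim(h,g\cdot x)$ and $f(\theta(g)h)=g\cdot f(h)$ specialise to
\begin{align*}
(1,x)&\sim(1,g\cdot x), & f(1)&=g\cdot f(1)
\end{align*}
for all $g\in G$. Consequently
\begin{align*}
(\operatorname{Lan}_\theta X)(*)&\cong X/G, & (\operatorname{Ran}_\theta X)(*)&\cong X^G,
\end{align*}
where $X/G$ denotes the set of $G$-orbits and $X^G$ the set of $G$-fixed points. In this special case induction and coinduction recover the two classical ways of removing a group action, and they visibly differ whenever the action is nontrivial.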
## The Special Adjoint Functor Theorem
Adjoints are often recognised by constructing units or counits directly, but in large algebraic categories such constructions can be hard to find. What structural hypotheses guarantee that a functor preserving limits has a left adjoint, or that a functor preserving colimits has a right adjoint? The adjoint functor theorems answer this by combining completeness, size control, and a solution-set condition.
[definition: Well-Powered Category]
A category $\mathcal C$ is well-powered if for every object $c \in \mathcal C$, the collection of subobjects of $c$ is a set.
[/definition]
Well-poweredness is a size condition. It prevents a single object from having a proper class of distinct subobjects, which would make certain universal constructions too large to form.
The adjoint functor theorem also needs a way to test morphisms using a controlled amount of data. A cogenerator provides such a test object: distinct arrows can be separated after mapping their codomain into one fixed object.
[definition: Cogenerator]
An object $Q$ of a category $\mathcal C$ is a cogenerator if for every pair of distinct morphisms $f,g: x \to y$ in $\mathcal C$, there exists a morphism $h:y \to Q$ such that $hf \ne hg$.
[/definition]
A cogenerator lets morphisms be tested by maps into one object. In algebra, injective objects or products of simple objects often play this role, depending on the category.
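For a concrete instance outside algebra, the two-element set $2=\{0,1\}$ is a cogenerator in $\operatorname{Set}$. The following sketch uses an element $a$ and a map $h$ introduced only for illustration. Given distinct functions $f,g:x\to y$, choose $a\in x$ with $f(a)\ne g(a)$ and define $h:y\to 2$ by
\begin{align*}
h(b)=
\begin{cases}
1 & \text{if } b=f(a),\\
0 & \text{otherwise.}
\end{cases}
\end{align*}
Then
\begin{align*}
(hf)(a)&=1, & (hg)(a)&=0,
\end{align*}
so $hf\ne hg$, as the definition requires.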
The right-adjoint version of the theorem uses the dual size controls. A category is co-well-powered if each object has only a set of quotient objects up to isomorphism. An object $P$ is a generator if distinct morphisms $f,g:x\to y$ can be separated by precomposing with a map out of one fixed object: there is a morphism $h:P\to x$ with $fh\ne gh$. Thus well-poweredness and cogenerators control subobjects and maps into test objects, while co-well-poweredness and generators control quotients and maps out of test objects.
Even with such tests, an adjoint may fail to exist if the relevant comma category contains a proper class of unrelated candidates. The solution-set condition is the additional smallness requirement that replaces this uncontrolled class by a set of representative arrows through which every candidate factors.
This is the last size hypothesis needed before the adjoint functor theorem, but it is not just another named condition. The obstruction is that an object $c$ may admit too many arrows into objects of the form $Gd$ to search through directly. The solution-set condition asks for a set-sized list of test arrows that is large enough to dominate all the others by factorisation, so the later existence proof can work with an actual set rather than a proper class.
[definition: Solution-Set Condition]
Let $G: \mathcal D \to \mathcal C$ be a functor and let $c \in \mathcal C$. The solution-set condition at $c$ says that there is a set-indexed family of arrows
\begin{align*}
c \to Gd_i
\end{align*}
with $d_i \in \mathcal D$ such that every arrow $c \to Gd$ factors as
\begin{align*}
c \to Gd_i \to Gd
\end{align*}
for some $i$, where the second arrow is $G(k)$ for a morphism $k:d_i\to d$ in $\mathcal D$.
[/definition]
The solution-set condition says that the comma category $(c \downarrow G)$ may be large, but it has a set of candidates through which every object is reached. This is the size hypothesis that replaces having a small comma category.
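The simplest solution sets have one element. As a sketch, suppose $(c\downarrow G)$ happens to have an initial object $\eta:c\to Gd_0$ (the names $\eta$ and $d_0$ are introduced here for illustration). Initiality says that for every arrow $u:c\to Gd$ there is a unique $k:d_0\to d$ in $\mathcal D$ with
\begin{align*}
u=G(k)\circ\eta, \qquad c \xrightarrow{\ \eta\ } Gd_0 \xrightarrow{\ G(k)\ } Gd .
\end{align*}
So the one-element family $\{\eta\}$ satisfies the solution-set condition at $c$. The condition is therefore necessary whenever a left adjoint exists; the substantive question is when a set of weaker candidates, with factorisations that need not be unique, can be refined to such an initial object.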
The hypotheses assembled so far (completeness, preservation of limits, and size control in the form of solution sets or of well-poweredness with a cogenerator) serve one purpose: to force every comma category $(c \downarrow G)$ to contain an initial object. The desired left adjoint is usually not visible from an explicit formula, so instead of constructing it object by object, one checks how $G$ interacts with limits and whether the comma categories are sufficiently controlled in size. The adjoint functor theorem turns those verifiable hypotheses into the conclusion that $G$ is a right adjoint.
[quotetheorem:4181]
[citeproof:4181]
The theorem is powerful because it turns preservation of limits plus size hypotheses into existence of adjoints. Completeness is needed to build the candidate universal object from products and equalisers, while limit preservation is needed so that applying $G$ still remembers those constructions in the comma category. The size hypotheses are not cosmetic: without well-poweredness or a solution set, the category $(c \downarrow G)$ may contain a proper class of incomparable candidates, so there need not be a set-sized construction from which an initial object can be carved out. The theorem does not construct the left adjoint by a simple formula in general; it proves existence, and the next dual theorem gives the corresponding criterion for right adjoints.
[quotetheorem:4182]
[citeproof:4182]
The dual theorem has the same size-sensitive content with all arrows reversed. Cocompleteness and colimit preservation are needed because the desired right adjoint is detected by terminal objects in dual comma categories; without the needed coproducts or coequalisers, those terminal objects may fail to exist. Co-well-poweredness and a generator prevent the dual construction from ranging over a proper class of quotient objects, just as well-poweredness and a cogenerator controlled subobjects above. The theorem does not give an explicit right adjoint in every example, but it tells us which categorical checks must precede any claim that such an adjoint exists.
[example: Free Algebra Constructions]
Let $U:\operatorname{Alg}_\Sigma\to\operatorname{Set}$ be the forgetful functor from algebras for a finitary algebraic signature $\Sigma$. Limits in $\operatorname{Alg}_\Sigma$ are formed on underlying sets: for a diagram $(A_j)$, the product has underlying set $\prod_j U(A_j)$, and each $n$-ary operation $\sigma$ is defined coordinatewise by
\begin{align*}
\sigma_{\prod A_j}((a_{1j})_j,\ldots,(a_{nj})_j)
=
(\sigma_{A_j}(a_{1j},\ldots,a_{nj}))_j.
\end{align*}
Equalisers are likewise computed on underlying sets: if $f,g:A\to B$ are homomorphisms, then
\begin{align*}
E=\{a\in U(A):f(a)=g(a)\}
\end{align*}
is closed under every operation, because for $a_1,\ldots,a_n\in E$,
\begin{align*}
f(\sigma_A(a_1,\ldots,a_n))
&=\sigma_B(f(a_1),\ldots,f(a_n))\\
&=\sigma_B(g(a_1),\ldots,g(a_n))\\
&=g(\sigma_A(a_1,\ldots,a_n)).
\end{align*}
Hence $U$ preserves limits: every limit can be built from products and equalisers, and $U$ sends these to the corresponding products and equalisers in $\operatorname{Set}$.
The solution-set condition at a set $X$ is checked as follows. Given any function $\alpha:X\to U(A)$, let $B\subseteq A$ be the subalgebra generated by $\alpha(X)$. Every element of $U(B)$ is obtained by evaluating a formal $\Sigma$-term in variables from $X$, so the evaluation map
\begin{align*}
T_\Sigma(X)&\to U(B),\\
t&\mapsto t^A(\alpha(x_1),\ldots,\alpha(x_n))
\end{align*}
is surjective, where $T_\Sigma(X)$ is the set of formal $\Sigma$-terms over $X$ and $x_1,\ldots,x_n$ are the variables occurring in $t$. Thus
\begin{align*}
|U(B)|\le |T_\Sigma(X)|.
\end{align*}
Up to isomorphism, there are only set-many $\Sigma$-algebras whose underlying set has cardinality at most $|T_\Sigma(X)|$, and only set-many functions from $X$ into the underlying set of each. Choosing one representative for each isomorphism class of such pairs gives a set of arrows $X\to U(B_i)$. The original map $\alpha:X\to U(A)$ factors as
\begin{align*}
X\to U(B)\to U(A),
\end{align*}
where the second arrow is $U$ applied to the generated subalgebra inclusion $B\hookrightarrow A$, and the first arrow $X\to U(B)$ is isomorphic to one of the chosen representatives $X\to U(B_i)$. Therefore $U$ satisfies the solution-set condition. Since $\operatorname{Alg}_\Sigma$ is complete and $U$ preserves limits, the adjoint functor theorem in its solution-set form applies and gives a left adjoint to $U$; this left adjoint is the free $\Sigma$-algebra functor. No cogenerator hypothesis is needed for this argument.
[/example]
## Kan Extensions As A Bridge To Existence Theorems
The chapter began with Kan extensions as universal approximations along a functor and ends with adjoint functor theorems as existence machines for the adjoints studied in Chapters 1 through 3. How do these ideas fit together? Both organise mathematics around universal arrows in comma categories.
For a left Kan extension, the value at $d$ is built from the comma category $(K \downarrow d)$. For a left adjoint to $G: \mathcal D \to \mathcal C$, the value at $c$ is obtained from an initial object of $(c \downarrow G)$. The constructions differ in shape, but the underlying method is the same: turn a desired object into an initial or terminal object problem, then use limits, colimits, and size hypotheses to prove existence.
[remark: Why Pointwise Formulas Matter]
Pointwise Kan extensions are often the computational form of abstract adjunctions. When a precomposition functor $K^*$ has a left adjoint, the adjoint is $\operatorname{Lan}_K$; when it has a right adjoint, the adjoint is $\operatorname{Ran}_K$. Thus the pointwise formulas show what these adjoints do at each object, while the adjoint functor theorem gives conditions under which such adjoints must exist.
[/remark]
The final example packages this principle in the setting where a small subcategory acts as a collection of generators.
[example: Extending Data From Generators]
Let $i:\mathcal A\hookrightarrow\mathcal C$ be a small full subcategory, and suppose an object $c\in\mathcal C$ is presented as a colimit of objects from $\mathcal A$ by a diagram $P:J\to\mathcal A$ with colimit cocone
\begin{align*}
\lambda_j:iP(j)\to c.
\end{align*}
Whenever the required colimits exist in $\mathcal E$, the pointwise left Kan extension of $F:\mathcal A\to\mathcal E$ along $i$ is computed by
\begin{align*}
(\operatorname{Lan}_iF)(c)
\cong
\operatorname{colim}_{(a,u)\in(i\downarrow c)}F(a),
\end{align*}
where an object of $(i\downarrow c)$ is an arrow $u:i(a)\to c$.
The chosen presentation of $c$ gives a functor
\begin{align*}
\Phi:J\to(i\downarrow c)
\end{align*}
defined by
\begin{align*}
\Phi(j)&=(P(j),\lambda_j).
\end{align*}
If $\theta:j\to k$ is a morphism in $J$, then $\Phi(\theta)=P(\theta):P(j)\to P(k)$ is a morphism in $(i\downarrow c)$ because the cocone condition gives
\begin{align*}
\lambda_k\, i(P(\theta))=\lambda_j.
\end{align*}
Thus the presentation diagram contributes the subdiagram
\begin{align*}
F P:J\to\mathcal E
\end{align*}
inside the comma-category diagram. If
\begin{align*}
q_{(a,u)}:F(a)\to \operatorname{colim}_{(i\downarrow c)}F(a)
\end{align*}
denotes the colimit cocone, then for every $\theta:j\to k$,
\begin{align*}
q_{\Phi(k)}\,F(P(\theta))
&=
q_{(P(k),\lambda_k)}\,F(P(\theta))\\
&=
q_{(P(j),\lambda_j)}\\
&=
q_{\Phi(j)},
\end{align*}
because $P(\theta)$ is a morphism $\Phi(j)\to\Phi(k)$ in $(i\downarrow c)$. Hence the maps $q_{\Phi(j)}:F(P(j))\to(\operatorname{Lan}_iF)(c)$ form a cocone on $FP$.
So the left Kan extension does not choose one presentation of $c$ arbitrarily; it takes the universal colimit over all arrows from generators $a\in\mathcal A$ into $c$. When the comparison functor $\Phi:J\to(i\downarrow c)$ is final, this universal colimit is exactly the colimit of the displayed diagram $FP$, which is why this construction underlies presentations of algebraic objects, extensions from bases, and induction-style constructions.
[/example]
Kan extensions and adjoint functor theorems show that universal constructions can be forced by the right hypotheses, not merely observed in examples. From there, the course moves to additive structure, where Hom-sets themselves acquire algebraic form and the categorical language begins to resemble linear algebra.
# 8. Additive and Preadditive Categories
This chapter begins the passage from ordinary category theory to the categorical setting of homological algebra. Chapters 1 through 7 treated adjunctions, limits, colimits, Kan extensions, and representability in categories whose Hom-sets were only sets. Here we add linear structure to morphisms: Hom-sets become abelian groups, composition becomes bilinear, and finite products and coproducts merge into direct sums.
The guiding question is how much of the familiar algebra of abelian groups, modules, vector spaces, and chain complexes can be expressed without referring to elements. Preadditive categories supply addition of morphisms, while additive categories add zero objects and finite biproducts. This structure is the entry point to kernels, cokernels, exactness, and abelian categories in the next part of the course.
## Preadditive Categories and Hom Abelian Groups
What must be added to a category so that morphisms can be added, subtracted, and multiplied by integers in a way compatible with composition? It is not enough to put an abelian-group structure on each Hom-set arbitrarily. If composition does not distribute over addition, then expressions such as $g(f_1-f_2)$ no longer behave like differences of composites, and later notions such as kernels of differences or chain homotopies lose their algebraic meaning. The answer is therefore to enrich each Hom-set separately and require composition to be bilinear. This is the categorical abstraction behind the formulae used every day for homomorphisms of abelian groups and modules.
[definition: Preadditive Category]
A preadditive category is a category $\mathcal C$ such that, for every pair of objects $A,B \in \mathcal C$, the set $\mathcal C(A,B)$ is an abelian group, and for every triple of objects $A,B,C \in \mathcal C$, the composition map
\begin{align*}
\mathcal C(B,C) \times \mathcal C(A,B) &\longrightarrow \mathcal C(A,C),\\
(g,f) &\longmapsto g \circ f
\end{align*}
is bilinear.
[/definition]
Bilinearity means that for composable morphisms the identities
\begin{align*}
(g_1+g_2)\circ f &= g_1\circ f + g_2\circ f, & g\circ(f_1+f_2) &= g\circ f_1 + g\circ f_2
\end{align*}
hold. As a consequence, composition with a zero morphism gives a zero morphism. Thus each Hom-set is an abelian group, and composition behaves like a multiplication that distributes over addition.
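As a small worked consequence, bilinearity already handles differences. Assuming $f_1,f_2:A\to B$ and $g:B\to C$ in a preadditive category, additivity in the second variable gives
\begin{align*}
g\circ(f_1-f_2)+g\circ f_2
&=g\circ\bigl((f_1-f_2)+f_2\bigr)\\
&=g\circ f_1,
\end{align*}
and subtracting $g\circ f_2$ in the abelian group $\mathcal C(A,C)$ yields
\begin{align*}
g\circ(f_1-f_2)=g\circ f_1-g\circ f_2.
\end{align*}
In the special case $A=B=C$, the same identities say that $\mathcal C(A,A)$ is a ring with composition as multiplication and $1_A$ as unit.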
[example: Abelian Groups Are Preadditive]
In $\mathrm{Ab}$, let $A$ and $B$ be abelian groups. For homomorphisms $f,g:A\to B$, define their sum pointwise by
\begin{align*}
(f+g)(a)=f(a)+g(a).
\end{align*}
This is again a homomorphism: for $a,a'\in A$,
\begin{align*}
(f+g)(a+a')
&=f(a+a')+g(a+a')\\
&=(f(a)+f(a'))+(g(a)+g(a'))\\
&=(f(a)+g(a))+(f(a')+g(a'))\\
&=(f+g)(a)+(f+g)(a'),
\end{align*}
where the third equality uses commutativity in the abelian group $B$. The zero element of $\mathrm{Ab}(A,B)$ is the zero homomorphism $0(a)=0_B$, and the inverse of $f$ is the homomorphism $-f$ given by $(-f)(a)=-f(a)$.
Composition is bilinear. If $f_1,f_2:A\to B$ and $g:B\to C$, then for every $a\in A$,
\begin{align*}
\bigl(g\circ(f_1+f_2)\bigr)(a)
&=g(f_1(a)+f_2(a))\\
&=g(f_1(a))+g(f_2(a))\\
&=(g\circ f_1)(a)+(g\circ f_2)(a)\\
&=\bigl(g\circ f_1+g\circ f_2\bigr)(a).
\end{align*}
If $f:A\to B$ and $g_1,g_2:B\to C$, then for every $a\in A$,
\begin{align*}
\bigl((g_1+g_2)\circ f\bigr)(a)
&=(g_1+g_2)(f(a))\\
&=g_1(f(a))+g_2(f(a))\\
&=(g_1\circ f)(a)+(g_2\circ f)(a)\\
&=\bigl(g_1\circ f+g_2\circ f\bigr)(a).
\end{align*}
Thus $\mathrm{Ab}$ is preadditive.
This example also gives a diagnostic: pointwise addition of homomorphisms works because the target object is abelian. In the category of groups, if $f,g:G\to H$ are homomorphisms and one defines $(fg)(x)=f(x)g(x)$ pointwise, then
\begin{align*}
(fg)(xy)&=f(xy)g(xy)=f(x)f(y)g(x)g(y),\\
(fg)(x)(fg)(y)&=f(x)g(x)f(y)g(y).
\end{align*}
Cancelling $f(x)$ on the left and $g(y)$ on the right, these two expressions are equal exactly when $f(y)g(x)=g(x)f(y)$. So the pointwise product is a homomorphism only when the images of $f$ and $g$ commute elementwise in $H$, which can fail when $H$ is nonabelian.
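For a concrete failure, here is a sketch with the illustrative choice $G=H=S_3$ and $f=g=1_{S_3}$, so that $(fg)(x)=x^2$. Take the transpositions $x=(1\,2)$ and $y=(1\,3)$. Their product $xy$ is a $3$-cycle, hence has order $3$, and
\begin{align*}
(fg)(xy)&=(xy)^2\ne e,\\
(fg)(x)\,(fg)(y)&=x^2y^2=e\cdot e=e.
\end{align*}
So the squaring map on $S_3$, which is the pointwise product of the identity with itself, is not a homomorphism.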
[/example]
Modules restore the missing commutativity at the level of Hom-sets while also respecting scalar multiplication. This makes module categories the guiding algebraic example of preadditivity.
[example: Modules Over A Ring]
Let $R$ be a ring, and let $R\text{-}\mathrm{Mod}$ be the category of left $R$-modules and $R$-linear maps. For left $R$-modules $M,N$, define addition in $\operatorname{Hom}_R(M,N)$ pointwise:
\begin{align*}
(f+h)(m)=f(m)+h(m).
\end{align*}
If $f,h:M\to N$ are $R$-linear, then for $m,m'\in M$ and $r\in R$,
\begin{align*}
(f+h)(m+m')
&=f(m+m')+h(m+m')\\
&=\bigl(f(m)+f(m')\bigr)+\bigl(h(m)+h(m')\bigr)\\
&=\bigl(f(m)+h(m)\bigr)+\bigl(f(m')+h(m')\bigr)\\
&=(f+h)(m)+(f+h)(m'),
\end{align*}
where the third equality uses commutativity of the abelian group underlying $N$. Also,
\begin{align*}
(f+h)(rm)
&=f(rm)+h(rm)\\
&=r f(m)+r h(m)\\
&=r\bigl(f(m)+h(m)\bigr)\\
&=r(f+h)(m).
\end{align*}
Thus $f+h$ is again $R$-linear. The zero element is the zero map $0(m)=0_N$, and the inverse of $f$ is the map $-f$ given by $(-f)(m)=-f(m)$; the abelian-group laws hold pointwise because they hold in the abelian group $N$.
Composition is bilinear. If $f_1,f_2:M\to N$ and $g:N\to P$ are $R$-linear, then for every $m\in M$,
\begin{align*}
\bigl(g\circ(f_1+f_2)\bigr)(m)
&=g\bigl((f_1+f_2)(m)\bigr)\\
&=g\bigl(f_1(m)+f_2(m)\bigr)\\
&=g(f_1(m))+g(f_2(m))\\
&=(g\circ f_1)(m)+(g\circ f_2)(m)\\
&=\bigl(g\circ f_1+g\circ f_2\bigr)(m).
\end{align*}
If $f:M\to N$ and $g_1,g_2:N\to P$ are $R$-linear, then for every $m\in M$,
\begin{align*}
\bigl((g_1+g_2)\circ f\bigr)(m)
&=(g_1+g_2)(f(m))\\
&=g_1(f(m))+g_2(f(m))\\
&=(g_1\circ f)(m)+(g_2\circ f)(m)\\
&=\bigl(g_1\circ f+g_2\circ f\bigr)(m).
\end{align*}
Therefore the Hom-sets in $R\text{-}\mathrm{Mod}$ are abelian groups under pointwise addition, and composition distributes over that addition in both variables, so $R\text{-}\mathrm{Mod}$ is preadditive.
[/example]
This calculation has a compact conceptual translation. Instead of saying separately that every Hom-set is an abelian group and composition is additive, we can say that the category is enriched over abelian groups.
[remark: Enrichment Viewpoint]
A preadditive category is the same thing as a category enriched over the monoidal category of abelian groups. The phrase "enriched Hom abelian groups" means that the Hom-objects are abelian groups rather than bare sets, and that categorical composition is a homomorphism in each variable.
[/remark]
A preadditive category automatically has a distinguished zero morphism between any two objects: the identity element of the abelian group $\mathcal C(A,B)$. We denote it by $0_{A,B}:A\to B$, or just $0$ when the source and target are fixed by context.
For these zero elements to behave like zero maps, composition must carry them to zero in every Hom-group: the zero element of one Hom-group must remain zero after composing on either side, otherwise the notation $0$ would be misleading. This is not merely notation. Later matrix and biproduct calculations with direct sums use identities such as $f0=0$ and $0g=0$ to eliminate off-diagonal terms. The result below isolates the consequence of bilinearity that supplies exactly this compatibility.
[quotetheorem:4183]
[citeproof:4183]
This theorem is often used silently, but its hypotheses matter. The additive identity in each Hom-group gives a candidate zero morphism, while bilinearity forces composition to preserve that identity on both sides. Without bilinearity, the element called $0$ in $\mathcal C(X,A)$ need not be sent to the element called $0$ in $\mathcal C(X,B)$ by postcomposition. The result does not assert the existence of a zero object; it only says that the zero elements already present in Hom-groups compose like zero maps. The next step is to ask when these Hom-wise zero maps come from an actual object $0$ inside the category.
## Zero Objects and Biproducts
How can a category express the direct sum $A\oplus B$ without referring to elements of $A$ and $B$? Products alone give projections out of an object, and coproducts alone give inclusions into an object, but neither structure by itself says that the two descriptions are compatible. In $\mathrm{Set}$, for instance, the binary product $A\times B$ and binary coproduct $A\sqcup B$ are usually not the same object, so there is no direct-sum calculus of inclusions and projections. In ordinary algebra, a direct sum is both a product and a coproduct, with projections and inclusions satisfying matrix-like identities. In a preadditive category, these identities can be written purely in terms of morphism addition.
[definition: Zero Object]
A zero object in a category $\mathcal C$ is an object $0$ that is both initial and terminal.
[/definition]
If a zero object exists, then for every pair of objects $A,B$ there is a canonical morphism $A\to B$, namely the composite $A\to 0\to B$. In a preadditive category this canonical morphism always agrees with the additive zero morphism $0_{A,B}$: the group $\mathcal C(A,0)$ has exactly one element, so the unique map $A\to 0$ is its zero element, and composing any morphism with a zero element gives zero.
[example: Zero Objects In Algebraic Categories]
In $\mathrm{Ab}$, let $0$ denote the one-element abelian group $\{0\}$. For any abelian group $A$, there is exactly one homomorphism $u:0\to A$: a homomorphism must preserve the identity element, so $u$ sends the only element $0\in 0$ to $0_A\in A$. This function is indeed a homomorphism because
\begin{align*}
u(0+0)&=u(0)=0_A,\\
u(0)+u(0)&=0_A+0_A=0_A.
\end{align*}
There is also exactly one function $v:A\to 0$, since every element of $A$ must be sent to the only element of $0$. It is a homomorphism because for $a,a'\in A$,
\begin{align*}
v(a+a')&=0,\\
v(a)+v(a')&=0+0=0.
\end{align*}
Thus the one-element abelian group is both initial and terminal in $\mathrm{Ab}$.
The same verification works in $R\text{-}\mathrm{Mod}$. The zero module has one element $0$, and the unique $R$-linear map $u:0\to M$ sends $0$ to $0_M$; it preserves addition by the calculation above and preserves scalar multiplication because, for $r\in R$,
\begin{align*}
u(r0)&=u(0)=0_M,\\
r\,u(0)&=r0_M=0_M.
\end{align*}
The unique map $v:M\to 0$ is also $R$-linear, since for $m,m'\in M$ and $r\in R$,
\begin{align*}
v(m+m')&=0=v(m)+v(m'),\\
v(rm)&=0=r0=r\,v(m).
\end{align*}
For vector spaces over a field $k$, the zero vector space is the same argument with $R=k$. Hence the usual zero group, zero module, and zero vector space are zero objects, and this is why the zero object itself is conventionally denoted by $0$.
[/example]
The zero object supplies the empty finite direct sum. The next step is to understand binary direct sums categorically, where the same object must behave as both a product and a coproduct.
[definition: Binary Biproduct]
Let $\mathcal C$ be a preadditive category. A binary biproduct of objects $A$ and $B$ is an object $A\oplus B$ together with morphisms
\begin{align*}
i_A &: A\to A\oplus B, & i_B &: B\to A\oplus B,\\
p_A &: A\oplus B\to A, & p_B &: A\oplus B\to B
\end{align*}
such that
\begin{align*}
p_A i_A &= 1_A, & p_B i_B &= 1_B,\\
p_A i_B &= 0, & p_B i_A &= 0,\\
i_A p_A + i_B p_B &= 1_{A\oplus B}.
\end{align*}
[/definition]
The four maps $i_A,i_B,p_A,p_B$ are the categorical versions of inclusions and projections. The first two equations say that each summand is recovered by projecting after including. The middle equations say the two summands do not overlap. The final equation says every morphism through $A\oplus B$ decomposes into its $A$-part plus its $B$-part.
The definition is useful only if those equations really recover the universal properties expected of a direct sum. In particular, one should be able to build a unique map into $A\oplus B$ from two component maps, and similarly decompose maps out of it. The following result turns the biproduct identities into that product-and-coproduct behavior.
[quotetheorem:4184]
[citeproof:4184]
The computation displays the main advantage of biproducts: maps into and out of a direct sum are assembled by adding composites with inclusions and projections. The preadditive hypothesis is essential because the candidate map $i_Af+i_Bg$ uses addition in the Hom-group $\mathcal C(X,A\oplus B)$. The equations $p_Ai_B=0$ and $p_Bi_A=0$ are also essential: without them, projecting the assembled map would pick up unwanted cross-terms. The theorem does not say that every object that is separately a product and a coproduct is automatically a biproduct with arbitrary structure maps; compatibility of the structure maps is the point. This is why the next criterion isolates exactly which compatibility equations must be checked in practice.
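The cross-term remark can be checked in two lines. Assuming $f:X\to A$ and $g:X\to B$ as in the discussion above, write $h=i_Af+i_Bg$ for the assembled map (the name $h$ is introduced here only for illustration). Bilinearity and the biproduct equations give
\begin{align*}
p_Ah&=p_Ai_Af+p_Ai_Bg=1_Af+0\,g=f,\\
p_Bh&=p_Bi_Af+p_Bi_Bg=0\,f+1_Bg=g.
\end{align*}
Uniqueness uses the remaining identity: if $h':X\to A\oplus B$ also satisfies $p_Ah'=f$ and $p_Bh'=g$, then
\begin{align*}
h'&=(i_Ap_A+i_Bp_B)h'\\
&=i_Ap_Ah'+i_Bp_Bh'\\
&=i_Af+i_Bg=h.
\end{align*}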
[quotetheorem:4185]
[citeproof:4185]
In practice, this criterion says that once a product and coproduct have been identified on the same object with compatible structure maps, the additive identity $i_Ap_A+i_Bp_B=1$ supplies the missing direct-sum decomposition. The product hypothesis is used to prove equality of endomorphisms by checking projections; without it, the two composites with $p_A$ and $p_B$ would not determine a map $P\to P$. The off-diagonal zero equations are also not cosmetic: if $p_Ai_B$ were nonzero, then the alleged $B$-summand would leak into the $A$-coordinate. The criterion is therefore a practical test rather than a definition-free coincidence theorem. It prepares the standard algebraic examples, where the compatible product and coproduct structures are carried by the same object.
[example: Direct Sums Of Modules]
For $R$-modules $M$ and $N$, the usual direct sum $M\oplus N$ consists of pairs $(m,n)$ with componentwise addition and scalar multiplication. Define
\begin{align*}
i_M(m)&=(m,0), & i_N(n)&=(0,n),\\
p_M(m,n)&=m, & p_N(m,n)&=n.
\end{align*}
These maps are $R$-linear. For example,
\begin{align*}
i_M(m+m')&=(m+m',0)=(m,0)+(m',0)=i_M(m)+i_M(m'),\\
i_M(rm)&=(rm,0)=r(m,0)=r\,i_M(m),
\end{align*}
and the same calculation proves $R$-linearity of $i_N$. For $p_M$,
\begin{align*}
p_M\bigl((m,n)+(m',n')\bigr)
&=p_M(m+m',n+n')\\
&=m+m'\\
&=p_M(m,n)+p_M(m',n'),\\
p_M\bigl(r(m,n)\bigr)
&=p_M(rm,rn)\\
&=rm\\
&=r\,p_M(m,n),
\end{align*}
and the proof for $p_N$ is identical in the second coordinate.
The diagonal composites are identities because, for $m\in M$ and $n\in N$,
\begin{align*}
(p_Mi_M)(m)&=p_M(m,0)=m=1_M(m),\\
(p_Ni_N)(n)&=p_N(0,n)=n=1_N(n).
\end{align*}
The off-diagonal composites are zero maps because
\begin{align*}
(p_Mi_N)(n)&=p_M(0,n)=0,\\
(p_Ni_M)(m)&=p_N(m,0)=0.
\end{align*}
Finally, for every $(m,n)\in M\oplus N$,
\begin{align*}
\bigl(i_Mp_M+i_Np_N\bigr)(m,n)
&=i_M\bigl(p_M(m,n)\bigr)+i_N\bigl(p_N(m,n)\bigr)\\
&=i_M(m)+i_N(n)\\
&=(m,0)+(0,n)\\
&=(m+0,0+n)\\
&=(m,n)\\
&=1_{M\oplus N}(m,n).
\end{align*}
Thus the maps $i_M,i_N,p_M,p_N$ satisfy exactly the biproduct equations, so the usual algebraic direct sum is the categorical biproduct of $M$ and $N$. The finite nature is essential here: for an infinite family of modules, the direct sum contains only finitely supported families, while the direct product contains all families, so the two constructions usually differ.
[/example]
The module example shows the pattern, but additive algebra needs an abstract setting where the same finite direct-sum calculus is always available. A preadditive category alone has addition of morphisms, yet it may lack a zero object or the finite biproducts needed to write matrices and split objects into finite sums. The next definition packages exactly those missing structural requirements.
[definition: Additive Category]
An additive category is a preadditive category with a zero object and all finite biproducts.
[/definition]
The definition requires all finite biproducts, but in practice one wants a smaller checklist. Since the empty biproduct is the zero object and larger finite sums should be built by repeatedly adjoining one summand, binary biproducts ought to generate the whole finite theory. The following result makes that reduction precise and justifies the standard notation $A_1\oplus\cdots\oplus A_n$.
[quotetheorem:4186]
[citeproof:4186]
This theorem is what allows additive categories to support the same finite direct-sum notation used in module theory. The finiteness hypothesis is important: an additive category need not have countable products, countable coproducts, or infinite biproducts. The zero object supplies the empty direct sum, while binary biproducts supply the induction step; without either part of the definition, the notation $\bigoplus_{j=1}^n A_j$ would fail at $n=0$ or could not be built uniformly. The theorem also does not choose a literal equality between different parenthesisations of direct sums; it gives canonical isomorphisms compatible with projections and inclusions. Later, kernels and cokernels will interact with these finite direct sums exactly as expected in abelian categories, and the same finite matrix calculus appears in representation theory, sheaf categories, and triangulated or derived settings.
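To see the induction step in the smallest nontrivial case, here is a sketch of a ternary biproduct built as $(A\oplus B)\oplus C$. Write $D=A\oplus B$ with structure maps $i_A,i_B,p_A,p_B$, and write $j_D,j_C,q_D,q_C$ for the structure maps of $D\oplus C$; all primed names below are introduced only for this illustration. Define
\begin{align*}
i'_A&=j_Di_A, & i'_B&=j_Di_B, & i'_C&=j_C,\\
p'_A&=p_Aq_D, & p'_B&=p_Bq_D, & p'_C&=q_C.
\end{align*}
Typical diagonal and off-diagonal checks are
\begin{align*}
p'_Ai'_A&=p_Aq_Dj_Di_A=p_Ai_A=1_A,\\
p'_Ai'_B&=p_Aq_Dj_Di_B=p_Ai_B=0,\\
p'_Ai'_C&=p_Aq_Dj_C=p_A\,0=0,
\end{align*}
and the decomposition of the identity follows from the two binary ones:
\begin{align*}
i'_Ap'_A+i'_Bp'_B+i'_Cp'_C
&=j_D(i_Ap_A+i_Bp_B)q_D+j_Cq_C\\
&=j_Dq_D+j_Cq_C\\
&=1_{D\oplus C}.
\end{align*}
The remaining equations are obtained in the same way, and bracketing as $A\oplus(B\oplus C)$ leads to an isomorphic object by the same kind of computation.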
## Additive Functors and Matrices of Morphisms
Once objects can be added by biproducts and morphisms can be added in Hom-groups, how should functors respect this structure? An ordinary functor preserves identities and composition, but it may ignore addition of morphisms. If $F(f+g)$ differs from $F(f)+F(g)$, then $F$ cannot be expected to preserve equations such as $i_Ap_A+i_Bp_B=1$, even if it preserves each composite separately. The right notion is therefore that a functor should be a group homomorphism on every Hom-set. Such functors preserve the linear algebra of morphisms and allow direct-sum decompositions to be transported between additive categories.
[definition: Additive Functor]
Let $\mathcal C$ and $\mathcal D$ be preadditive categories. A functor $F:\mathcal C\to\mathcal D$ is additive if, for every pair of objects $A,B\in\mathcal C$, the induced map
\begin{align*}
F_{A,B}:\mathcal C(A,B)&\longrightarrow \mathcal D(F(A),F(B)),\\
f&\longmapsto F(f)
\end{align*}
is a homomorphism of abelian groups.
[/definition]
Thus an additive functor satisfies $F(f+g)=F(f)+F(g)$ and $F(0)=0$ on every Hom-group. Since every functor already preserves identities and composition, additivity says exactly that the functor respects the extra enriched structure.
The remaining compatibility to check is whether this Hom-wise condition also preserves the object-level direct-sum structure. Biproducts are characterized by equations involving composition, zero maps, and sums of morphisms, so an additive functor has exactly the operations needed to transport those equations to the target category.
[quotetheorem:4187]
[citeproof:4187]
This preservation result explains why additive functors are the correct morphisms between additive categories: they preserve direct sums because direct sums are encoded by additive equations. The proof uses both functoriality and additivity. Functoriality alone preserves composites such as $p_Ai_A$, but the final biproduct equation involves a sum of endomorphisms, so additivity is what allows $F(i_Ap_A+i_Bp_B)$ to split into the required sum. The theorem does not say that every functor between additive categories is additive, nor that a non-additive functor preserving the underlying object $A\oplus B$ preserves its biproduct structure. This distinction becomes important for derived and homological constructions, where functors are useful only when they respect the additive structure that chain maps and exact sequences use.
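The central computation can be sketched as follows, assuming $A\oplus B$ carries inclusions $i_A,i_B$ and projections $p_A,p_B$ satisfying the biproduct equations and that $F$ is additive:
\begin{align*}
F(i_A)F(p_A)+F(i_B)F(p_B)
&=F(i_Ap_A)+F(i_Bp_B)\\
&=F(i_Ap_A+i_Bp_B)\\
&=F(1_{A\oplus B})\\
&=1_{F(A\oplus B)},\\
F(p_A)F(i_A)&=F(p_Ai_A)=F(1_A)=1_{F(A)},\\
F(p_A)F(i_B)&=F(p_Ai_B)=F(0)=0.
\end{align*}
Functoriality accounts for every step except two: the passage from $F(i_Ap_A)+F(i_Bp_B)$ to $F(i_Ap_A+i_Bp_B)$ and the equality $F(0)=0$, both of which use that $F$ is a group homomorphism on Hom-groups.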
[example: Forgetful Functors Need Not Be Additive]
The forgetful functor $U:\mathrm{Ab}\to\mathrm{Set}$ sends an abelian group to its underlying set and a homomorphism to its underlying function. It preserves identities and composition as an ordinary functor, but it is not an additive functor in the stated sense because the codomain $\mathrm{Set}$ is not preadditive in its usual structure: for sets $X,Y$, the Hom-set $\mathrm{Set}(X,Y)$ is just a set of functions, with no specified abelian-group operation. For instance, if $Y=\{0,1\}$, there is no addition operation on $Y$ supplied by the category $\mathrm{Set}$, so for functions $u,v:X\to Y$ the formula
\begin{align*}
(u+v)(x)=u(x)+v(x)
\end{align*}
has no meaning in ordinary $\mathrm{Set}$. Thus the expression
\begin{align*}
U(f+h)=U(f)+U(h)
\end{align*}
is not an equation in a Hom abelian group of $\mathrm{Set}$, so $U$ is not additive as a functor between preadditive categories.
By contrast, restriction of scalars is additive. If $\varphi:R\to S$ is a ring homomorphism and $\operatorname{Res}:S\text{-}\mathrm{Mod}\to R\text{-}\mathrm{Mod}$ is restriction of scalars, then for $S$-linear maps $f,h:M\to N$ and $m\in M$,
\begin{align*}
\operatorname{Res}(f+h)(m)
&=(f+h)(m)\\
&=f(m)+h(m)\\
&=\operatorname{Res}(f)(m)+\operatorname{Res}(h)(m)\\
&=\bigl(\operatorname{Res}(f)+\operatorname{Res}(h)\bigr)(m).
\end{align*}
Therefore $\operatorname{Res}(f+h)=\operatorname{Res}(f)+\operatorname{Res}(h)$.
Extension of scalars is additive for the same reason, with the tensor product making the calculation explicit. For $R$-linear maps $f,h:M\to N$, extension of scalars sends them to
\begin{align*}
S\otimes_R f,\; S\otimes_R h:S\otimes_R M\to S\otimes_R N.
\end{align*}
On a pure tensor $s\otimes m$,
\begin{align*}
\bigl(S\otimes_R(f+h)\bigr)(s\otimes m)
&=s\otimes (f+h)(m)\\
&=s\otimes \bigl(f(m)+h(m)\bigr)\\
&=s\otimes f(m)+s\otimes h(m)\\
&=\bigl(S\otimes_R f\bigr)(s\otimes m)+\bigl(S\otimes_R h\bigr)(s\otimes m)\\
&=\bigl(S\otimes_R f+S\otimes_R h\bigr)(s\otimes m),
\end{align*}
where the third equality is additivity of the tensor product in its second variable. Hence extension and restriction of scalars respect addition on Hom-groups, while the ordinary forgetful functor to $\mathrm{Set}$ does not even have Hom-group addition available in its target.
[/example]
Biproducts also make morphisms behave like matrices. This is more than an analogy: without the identity decompositions $1_A=\sum_j i_jp_j$ and $1_B=\sum_i i_ip_i$, a list of components $p_ifi_j$ would not determine the original morphism. Products alone let us read off columns of data, and coproducts alone let us assemble rows of data, but biproducts make both operations inverse to each other. If $A=A_1\oplus\cdots\oplus A_m$ and $B=B_1\oplus\cdots\oplus B_n$, then a morphism $f:A\to B$ has components
\begin{align*}
f_{ij}=p_i f i_j:A_j\to B_i.
\end{align*}
The products of Hom-sets in the matrix notation are external finite cartesian products of abelian groups. They are not categorical products inside $\mathcal C$; the categorical finite products have already been supplied by the biproduct objects $A$ and $B$. Thus the statement below compares the abelian group $\mathcal C(A,B)$ with an ordinary finite product of abelian groups whose factors are $\mathcal C(A_j,B_i)$. Conversely, a family of morphisms $f_{ij}:A_j\to B_i$ determines
\begin{align*}
f=\sum_{i=1}^{n}\sum_{j=1}^{m} i_i f_{ij} p_j.
\end{align*}
The ordering of the indices matches the usual convention that rows correspond to target summands and columns correspond to source summands.
To use this notation reliably, one must know that passing between a morphism and its matrix of components loses no information. The key issue is reconstruction: the component maps must determine the original morphism, and every compatible finite matrix of components must assemble back into a morphism.
[quotetheorem:4188]
[citeproof:4188]
This matrix calculus is not extra notation imposed on the category; it is forced by the biproduct equations. The finite-biproduct hypothesis is again essential, since the reconstruction formula uses a finite sum of morphisms. The ordering of the indices matters because composition in a category is generally not commutative: in each term of the row-by-column formula, the entry coming from the later morphism is composed after the entry coming from the earlier one, and the two cannot be swapped. The theorem does not turn Hom-groups into rings except in the case of endomorphisms of a single biproduct, where composition supplies the multiplication; in general it gives rectangular matrices with entries in different Hom-groups. This is the mechanism behind block decompositions of representations, sheaves, chain complexes, and later exact triangles in triangulated categories.
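As a sketch of where the row-by-column rule comes from, suppose in addition that $C=C_1\oplus\cdots\oplus C_r$ and $g:B\to C$ is a morphism with components $g_{kl}=p_kgi_l:B_l\to C_k$. The object $C$, the morphism $g$, and the index letters $k,l$ are notation assumed only for this illustration. Inserting the identity decomposition of $1_B$ between $g$ and $f$ gives
\begin{align*}
(gf)_{kj}
&=p_k\,g\,f\,i_j\\
&=p_k\,g\,1_B\,f\,i_j\\
&=p_k\,g\Bigl(\sum_{l=1}^{n} i_lp_l\Bigr)f\,i_j\\
&=\sum_{l=1}^{n}(p_kgi_l)(p_lfi_j)\\
&=\sum_{l=1}^{n} g_{kl}f_{lj}.
\end{align*}
The fourth equality uses bilinearity of composition. Each summand keeps the entry of $g$ on the left of the entry of $f$, which is why the formula cannot be rearranged when the entries do not commute.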
After matrices, the next structural test is how Hom itself behaves as a functor. In an additive setting, representable functors should remember the abelian-group structure on Hom-sets, but this requires checking both variance directions: postcomposition and precomposition must respect addition in the appropriate Hom-groups.
[quotetheorem:4189]
[citeproof:4189]
The theorem is the enriched version of representability: representable functors do not merely land in sets; in a preadditive category they land naturally in abelian groups and preserve the addition of morphisms. The preadditive hypothesis is exactly what makes postcomposition and precomposition group homomorphisms; in an ordinary category there is no addition on Hom-sets for a representable functor to preserve. The result is also deliberately stated one variable at a time: Hom is covariant in the second variable and contravariant in the first, so the two statements concern postcomposition and precomposition separately, even though both are additive on Hom-groups. This observation is a bridge to Yoneda-style arguments in additive and abelian categories, where representable functors detect morphism equations while retaining the ambient abelian-group structure.
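Both additivity statements reduce to bilinearity of composition. As a sketch, with $f,g:A\to B$, $u:X\to A$, and $v:B\to Y$ chosen as illustrative names, postcomposition and precomposition satisfy
\begin{align*}
\mathcal C(X,f+g)(u)&=(f+g)u=fu+gu=\mathcal C(X,f)(u)+\mathcal C(X,g)(u),\\
\mathcal C(f+g,Y)(v)&=v(f+g)=vf+vg=\mathcal C(f,Y)(v)+\mathcal C(g,Y)(v).
\end{align*}
The first line is additivity of the covariant functor $\mathcal C(X,-)$ on morphisms, and the second is additivity of the contravariant functor $\mathcal C(-,Y)$.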
[example: Chain Complexes In An Additive Category]
Let $\mathcal A$ be an additive category. A chain complex in $\mathcal A$ is a sequence of objects $(C_n)_{n\in\mathbb Z}$ and morphisms $d_n:C_n\to C_{n-1}$ such that
\begin{align*}
d_{n-1}d_n=0
\end{align*}
for every $n$. A morphism of chain complexes $f:C\to D$ is a family of morphisms $f_n:C_n\to D_n$ satisfying
\begin{align*}
f_{n-1}d_n^C=d_n^D f_n
\end{align*}
for every $n$.
If $f,g:C\to D$ are chain maps, define their sum degreewise by
\begin{align*}
(f+g)_n=f_n+g_n.
\end{align*}
This is again a chain map because, for every $n$,
\begin{align*}
(f+g)_{n-1}d_n^C
&=(f_{n-1}+g_{n-1})d_n^C\\
&=f_{n-1}d_n^C+g_{n-1}d_n^C\\
&=d_n^D f_n+d_n^D g_n\\
&=d_n^D(f_n+g_n)\\
&=d_n^D(f+g)_n,
\end{align*}
where the second and fourth equalities use bilinearity of composition in $\mathcal A$, and the third uses that $f$ and $g$ commute with the differentials. The zero chain map has components $0_n=0_{C_n,D_n}$, and
\begin{align*}
0_{n-1}d_n^C=0=d_n^D0_n,
\end{align*}
so it is a chain map. The additive inverse of $f$ is given by $(-f)_n=-f_n$, and it is a chain map since
\begin{align*}
(-f)_{n-1}d_n^C
&=(-f_{n-1})d_n^C\\
&=-(f_{n-1}d_n^C)\\
&=-(d_n^Df_n)\\
&=d_n^D(-f_n)\\
&=d_n^D(-f)_n.
\end{align*}
Thus the morphisms $C\to D$ form an abelian group under degreewise addition.
Composition is also degreewise. If $f:C\to D$ and $g:D\to E$ are chain maps, then $(gf)_n=g_nf_n$, and this family is a chain map because
\begin{align*}
(gf)_{n-1}d_n^C
&=g_{n-1}f_{n-1}d_n^C\\
&=g_{n-1}d_n^Df_n\\
&=d_n^Eg_nf_n\\
&=d_n^E(gf)_n.
\end{align*}
Bilinearity follows degreewise: if $f_1,f_2:C\to D$ and $g:D\to E$, then
\begin{align*}
\bigl(g(f_1+f_2)\bigr)_n
&=g_n(f_{1,n}+f_{2,n})\\
&=g_nf_{1,n}+g_nf_{2,n}\\
&=(gf_1)_n+(gf_2)_n,
\end{align*}
and if $f:C\to D$ and $g_1,g_2:D\to E$, then
\begin{align*}
\bigl((g_1+g_2)f\bigr)_n
&=(g_{1,n}+g_{2,n})f_n\\
&=g_{1,n}f_n+g_{2,n}f_n\\
&=(g_1f)_n+(g_2f)_n.
\end{align*}
The zero complex has the zero object of $\mathcal A$ in every degree and zero differentials. For two complexes $C$ and $D$, their finite biproduct is formed degreewise:
\begin{align*}
(C\oplus D)_n=C_n\oplus D_n,
\end{align*}
with differential
\begin{align*}
d_n^{C\oplus D}=i_{C_{n-1}}d_n^Cp_{C_n}+i_{D_{n-1}}d_n^Dp_{D_n}.
\end{align*}
Then
\begin{align*}
d_{n-1}^{C\oplus D}d_n^{C\oplus D}
&=(i_Cd_{n-1}^Cp_C+i_Dd_{n-1}^Dp_D)(i_Cd_n^Cp_C+i_Dd_n^Dp_D)\\
&=i_Cd_{n-1}^Cp_Ci_Cd_n^Cp_C+i_Cd_{n-1}^Cp_Ci_Dd_n^Dp_D\\
&\quad+i_Dd_{n-1}^Dp_Di_Cd_n^Cp_C+i_Dd_{n-1}^Dp_Di_Dd_n^Dp_D\\
&=i_Cd_{n-1}^Cd_n^Cp_C+0+0+i_Dd_{n-1}^Dd_n^Dp_D\\
&=0+0\\
&=0,
\end{align*}
where the subscripts on the inclusions and projections are suppressed in the middle lines, the second equality uses bilinearity of composition, the third uses $p_Ci_C=1$, $p_Di_D=1$, $p_Ci_D=0$, and $p_Di_C=0$, and the fourth uses $d_{n-1}^Cd_n^C=0$ and $d_{n-1}^Dd_n^D=0$. Hence chain complexes in an additive category again form an additive category: addition, zero morphisms, composition, and finite biproducts are all inherited degree by degree from $\mathcal A$.
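One check left implicit above is that the degreewise inclusions and projections are themselves chain maps, so that the biproduct equations hold in the category of complexes and not only degree by degree. A sketch for the inclusion of $C$, using the formula for $d_n^{C\oplus D}$:
\begin{align*}
d_n^{C\oplus D}\,i_{C_n}
&=\bigl(i_{C_{n-1}}d_n^Cp_{C_n}+i_{D_{n-1}}d_n^Dp_{D_n}\bigr)i_{C_n}\\
&=i_{C_{n-1}}d_n^Cp_{C_n}i_{C_n}+i_{D_{n-1}}d_n^Dp_{D_n}i_{C_n}\\
&=i_{C_{n-1}}d_n^C+0\\
&=i_{C_{n-1}}d_n^C.
\end{align*}
This is the chain-map condition for the family $(i_{C_n})_n$. The analogous computation $p_{C_{n-1}}d_n^{C\oplus D}=d_n^Cp_{C_n}$ handles the projection, and the cases for $D$ are symmetric.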
[/example]
This stability result explains why additivity is introduced before abelian categories. Once chain complexes and matrix-style constructions live comfortably in the additive world, kernels and cokernels can be added without changing the ambient algebraic language.
[remark: Why Additivity Comes Before Abelian Categories]
Abelian categories will add kernels, cokernels, and exactness axioms to the additive setting. Without Hom abelian groups, the phrase "kernel of a difference of maps" has no categorical meaning; without biproducts, exact sequences cannot be manipulated by direct-sum and matrix arguments. Additive categories therefore provide the algebraic language on which the exactness theory rests.
[/remark]
Additive structure supplies the setting in which kernels, cokernels, and exactness can be formulated cleanly. Once morphisms form abelian groups and biproducts exist, the homological constructions of module theory can be expressed categorically rather than elementwise.
# 9. Kernels, Cokernels, and Abelian Categories
Chapters 4 through 8 developed limits, colimits, and additive structure as general categorical tools. This chapter explains how those tools combine to recover the exactness language familiar from modules: kernels, cokernels, images, quotients, and short exact sequences. The guiding question is which categorical hypotheses are strong enough to make homological algebra possible without referring to elements.
## Kernels and Cokernels by Universal Properties
For a module homomorphism $f:M\to N$, how can the submodule $\{x \in M : f(x)=0\}$ be described without mentioning elements of $M$? The categorical answer is that it is the universal way of mapping into the domain of $f$ so that the composite with $f$ vanishes. The dual construction, the cokernel, is the universal way of mapping out of the codomain of $f$ after forcing $f$ to become zero.
[definition: Zero Object]
An object $0$ of a category $\mathcal A$ is a zero object if it is both initial and terminal.
[/definition]
When a zero object exists, any two objects $X$ and $Y$ are connected by the composite $X \to 0 \to Y$ of the unique morphism into $0$ with the unique morphism out of $0$. This composite is the zero morphism from $X$ to $Y$; it is denoted $0_{X,Y}$, or simply $0$ when the source and target are understood.
With zero morphisms available, the equation $fu=0$ becomes meaningful in any category with a zero object. The point is not to list the individual solutions to this equation, since a general category may have no elements to list. Instead, one asks for a single object through which every such solution factors uniquely; that universal object is what replaces the usual subgroup of elements killed by $f$.
[definition: Kernel]
Let $\mathcal A$ be a category with a zero object, and let $f:A \to B$ be a morphism. A kernel of $f$ is a morphism $k:K \to A$ such that $fk=0$ and such that for every morphism $u:X \to A$ with $fu=0$, there exists a unique morphism $\bar{u}:X \to K$ satisfying $k\bar{u}=u$.
[/definition]
Equivalently, $k:K \to A$ is a terminal object in the category of morphisms into $A$ annihilated by $f$. This reformulation is often the best way to remember the direction of the universal property. The word "terminal" matters: every other solution to the equation $fu=0$ must pass uniquely through the kernel, so the kernel records all solutions at once rather than a chosen list of elements.
The dual problem asks for the universal way to make $f$ vanish after leaving its codomain. This is the categorical version of quotienting by the image of a map.
[definition: Cokernel]
Let $\mathcal A$ be a category with a zero object, and let $f:A \to B$ be a morphism. A cokernel of $f$ is a morphism $q:B \to Q$ such that $qf=0$ and such that for every morphism $v:B \to Y$ with $vf=0$, there exists a unique morphism $\bar{v}:Q \to Y$ satisfying $\bar{v}q=v$.
[/definition]
The cokernel is the dual construction to the kernel. It is an initial object in the category of morphisms out of $B$ annihilating $f$.
Because kernels and cokernels are defined by universal properties, the first structural question is uniqueness. A construction defined only by a mapping property would be unusable if different choices led to unrelated objects or incompatible structure maps. The theorem below verifies that any two choices are uniquely comparable in the expected sense, so later arguments may choose convenient representatives without changing the categorical content.
[quotetheorem:3969]
[citeproof:3969]
This theorem justifies speaking of "the" kernel or cokernel when only its isomorphism class matters. The hypothesis that the object satisfies the universal property is essential: two arbitrary subobjects of $A$ killed by $f$ need not be comparable, and without the terminal property there is no reason for them to be isomorphic. The theorem also does not say that the underlying object alone is canonical; the structure morphism $K \to A$ or $B \to Q$ is part of what is unique. In diagram arguments, the practical rule is to construct maps into a kernel or out of a cokernel by checking the relevant composite is zero, then invoke uniqueness to prove commutativity.
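The practical rule can be illustrated by the induced map on kernels. Assume a commutative square $bf=f'a$ with $f:A\to B$, $f':A'\to B'$, $a:A\to A'$, $b:B\to B'$, and kernels $k:K\to A$ of $f$ and $k':K'\to A'$ of $f'$; all of these names are hypothetical and chosen only for this sketch. Then
\begin{align*}
f'(ak)=(f'a)k=(bf)k=b(fk)=b\,0=0,
\end{align*}
so $ak$ is annihilated by $f'$ and the universal property of $k'$ yields a unique $c:K\to K'$ with
\begin{align*}
k'c=ak.
\end{align*}
Any other morphism $c':K\to K'$ with $k'c'=ak$ must equal $c$ by the uniqueness clause, which is how commutativity of diagrams involving $c$ is usually proved.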
[example: Kernels and Cokernels in Modules]
Let $R$ be a ring and let $f:M \to N$ be an $R$-module homomorphism. We show first that the inclusion
\begin{align*}
i:\ker f \hookrightarrow M
\end{align*}
is the categorical kernel of $f$. For every $x \in \ker f$, the definition of $\ker f$ gives $f(x)=0$, so
\begin{align*}
(f i)(x)=f(i(x))=f(x)=0.
\end{align*}
Thus $fi=0$. Now let $u:X \to M$ be an $R$-module homomorphism with $fu=0$. For each $x \in X$,
\begin{align*}
f(u(x))=(fu)(x)=0,
\end{align*}
so $u(x)\in \ker f$. Define $\bar u:X \to \ker f$ by $\bar u(x)=u(x)$. This is $R$-linear because, for $x,y\in X$ and $r\in R$,
\begin{align*}
\bar u(x+y)&=u(x+y)=u(x)+u(y)=\bar u(x)+\bar u(y),\\
\bar u(rx)&=u(rx)=r u(x)=r\bar u(x).
\end{align*}
Also
\begin{align*}
(i\bar u)(x)=i(\bar u(x))=i(u(x))=u(x),
\end{align*}
so $i\bar u=u$. If $w:X\to \ker f$ also satisfies $iw=u$, then for every $x\in X$,
\begin{align*}
w(x)=i(w(x))=u(x)=\bar u(x),
\end{align*}
where the first equality uses that $i$ is the inclusion. Hence $w=\bar u$, proving uniqueness.
Now let
\begin{align*}
q:N \to N/\operatorname{im} f
\end{align*}
be the quotient map. For every $m\in M$,
\begin{align*}
(qf)(m)=q(f(m))=f(m)+\operatorname{im} f=\operatorname{im} f=0
\end{align*}
in the quotient module, so $qf=0$. If $v:N\to P$ is an $R$-module homomorphism with $vf=0$, then $v$ vanishes on $\operatorname{im} f$: for every element $y\in \operatorname{im} f$, choose $m\in M$ with $y=f(m)$, and compute
\begin{align*}
v(y)=v(f(m))=(vf)(m)=0.
\end{align*}
Define $\bar v:N/\operatorname{im} f\to P$ by
\begin{align*}
\bar v(n+\operatorname{im} f)=v(n).
\end{align*}
This is well-defined: if $n+\operatorname{im} f=n'+\operatorname{im} f$, then $n-n'\in \operatorname{im} f$, so
\begin{align*}
v(n)-v(n')=v(n-n')=0,
\end{align*}
and hence $v(n)=v(n')$. It is $R$-linear because
\begin{align*}
\bar v((n+\operatorname{im} f)+(n'+\operatorname{im} f))
&=\bar v(n+n'+\operatorname{im} f)\\
&=v(n+n')\\
&=v(n)+v(n')\\
&=\bar v(n+\operatorname{im} f)+\bar v(n'+\operatorname{im} f),
\end{align*}
and
\begin{align*}
\bar v(r(n+\operatorname{im} f))
&=\bar v(rn+\operatorname{im} f)\\
&=v(rn)\\
&=r v(n)\\
&=r\bar v(n+\operatorname{im} f).
\end{align*}
Finally,
\begin{align*}
(\bar v q)(n)=\bar v(n+\operatorname{im} f)=v(n),
\end{align*}
so $\bar v q=v$. If $w:N/\operatorname{im} f\to P$ also satisfies $wq=v$, then for every coset $n+\operatorname{im} f$,
\begin{align*}
w(n+\operatorname{im} f)=w(q(n))=(wq)(n)=v(n)=\bar v(n+\operatorname{im} f),
\end{align*}
so $w=\bar v$. Thus the ordinary kernel submodule and quotient by the image satisfy exactly the categorical universal properties.
[/example]
The example shows that the universal property has not changed the underlying object in familiar algebra. What changes is that the same definition also applies in categories where elements, subgroups, and quotient sets are not available.
The next question is what cancellation properties these universal maps automatically have. Kernels should behave like inclusions, and cokernels should behave like quotient maps.
[quotetheorem:4190]
[citeproof:4190]
The converse is not true in an arbitrary category: a monomorphism need not be the kernel of any morphism unless the category has enough exactness structure. For instance, in categories without a zero object the phrase "kernel of a morphism" is not even available, while monomorphisms may still exist. The theorem therefore gives only one direction: kernels and cokernels have cancellation properties, but a cancellation property alone does not exhibit a morphism as the universal solution of an equation $fu=0$ or as a universal quotient. Abelian categories are designed so that, after adding additive and exactness hypotheses, monomorphisms and epimorphisms are controlled by kernels and cokernels.
## Images, Coimages, Monomorphisms, and Epimorphisms
Given a module homomorphism $f:M \to N$, the image $\operatorname{im} f \subset N$ and the quotient $M/\ker f$ carry the same information: the first is a subobject of the codomain, while the second is a quotient of the domain. In a general category with kernels and cokernels, these two constructions need not agree automatically. The discrepancy is measured by the comparison morphism from the coimage to the image.
[definition: Image]
Let $\mathcal A$ be a category with kernels and cokernels, and let $f:A \to B$ be a morphism. The image of $f$ is the kernel of the cokernel of $f$.
[/definition]
Thus if $q:B \to \operatorname{coker} f$ is a cokernel of $f$, the image is a morphism $i:\operatorname{im} f \to B$ satisfying $qi=0$ and universal for maps into $B$ killed by $q$. This definition deliberately avoids saying that the image is a set of values of $f$; instead, it characterises the image as the largest subobject of $B$ on which the quotient by $f$ vanishes.
To compare this target-side subobject with information coming from the source, we also need the dual construction. The source may contain directions that $f$ sends to zero, and those directions should be collapsed before comparing with the image in the codomain. The coimage is the universal quotient that performs exactly this collapse, making it the source-side candidate for the middle object in a factorisation of $f$.
[definition: Coimage]
Let $\mathcal A$ be a category with kernels and cokernels, and let $f:A \to B$ be a morphism. The coimage of $f$ is the cokernel of the kernel of $f$.
[/definition]
Thus if $k:\ker f \to A$ is a kernel of $f$, the coimage is a morphism $p:A \to \operatorname{coim} f$ satisfying $pk=0$ and universal for maps out of $A$ killing $k$.
The two universal constructions are attached to opposite ends of the same morphism, so one still has to explain how the original map passes through them. The needed statement is a factorisation through the coimage and image, together with the canonical comparison map that measures whether quotienting the source and taking the subobject of the target produce the same middle object.
[quotetheorem:4191]
[citeproof:4191]
This factorisation exists before we assume the category is abelian, and it is the basic computational template for a morphism in any exactness-like setting. The hypotheses used so far only require kernels and cokernels; they do not force the comparison map $\bar{f}:\operatorname{coim} f \to \operatorname{im} f$ to be invertible. In a non-abelian or insufficiently exact category, the coimage can remember a quotient of the source while the image remembers a subobject of the target, and these may fail to match. The abelian axiom will say that the middle arrow is always an isomorphism.
[illustration:category-theory-ii-coimage-image-factorisation]
[example: Coimage and Image for a Module Homomorphism]
Let $f:M\to N$ be an $R$-module homomorphism, and let $k:\ker f\hookrightarrow M$ be the ordinary kernel inclusion. The coimage is the cokernel of $k$, so in $R\text{-}\operatorname{Mod}$ it is the quotient map
\begin{align*}
p:M\to M/\ker f,
\qquad
p(m)=m+\ker f.
\end{align*}
Indeed, for $x\in \ker f$,
\begin{align*}
(pk)(x)=p(x)=x+\ker f=\ker f=0
\end{align*}
in $M/\ker f$. The image is the kernel of the quotient map
\begin{align*}
q:N\to N/\operatorname{im} f,
\qquad
q(n)=n+\operatorname{im} f,
\end{align*}
so it is the inclusion $i:\operatorname{im} f\hookrightarrow N$, because for every $n\in N$,
\begin{align*}
q(n)=0
\iff n+\operatorname{im} f=\operatorname{im} f
\iff n\in \operatorname{im} f.
\end{align*}
The canonical comparison map $\theta:M/\ker f\to \operatorname{im} f$ is
\begin{align*}
\theta(m+\ker f)=f(m).
\end{align*}
This is well-defined: if $m+\ker f=m'+\ker f$, then $m-m'\in \ker f$, so
\begin{align*}
f(m)-f(m')=f(m-m')=0,
\end{align*}
and hence $f(m)=f(m')$. It is $R$-linear because
\begin{align*}
\theta((m+\ker f)+(m'+\ker f))
&=\theta(m+m'+\ker f)\\
&=f(m+m')\\
&=f(m)+f(m')\\
&=\theta(m+\ker f)+\theta(m'+\ker f),
\end{align*}
and
\begin{align*}
\theta(r(m+\ker f))
&=\theta(rm+\ker f)\\
&=f(rm)\\
&=r f(m)\\
&=r\theta(m+\ker f).
\end{align*}
Define $\psi:\operatorname{im} f\to M/\ker f$ as follows: for $y\in \operatorname{im} f$, choose $m\in M$ with $y=f(m)$ and set
\begin{align*}
\psi(y)=m+\ker f.
\end{align*}
This is well-defined: if $f(m)=f(m')$, then
\begin{align*}
f(m-m')=f(m)-f(m')=0,
\end{align*}
so $m-m'\in \ker f$, and therefore $m+\ker f=m'+\ker f$. For every $m\in M$,
\begin{align*}
(\psi\theta)(m+\ker f)=\psi(f(m))=m+\ker f,
\end{align*}
while for every $y\in \operatorname{im} f$, writing $y=f(m)$ gives
\begin{align*}
(\theta\psi)(y)=\theta(m+\ker f)=f(m)=y.
\end{align*}
Thus $\psi$ is the inverse of $\theta$, so the categorical comparison $M/\ker f\to \operatorname{im} f$ is an isomorphism in $R\text{-}\operatorname{Mod}$.
[/example]
The preceding example is the model for the abelian-category definition. It isolates the exact point at which ordinary module theory uses the first isomorphism theorem.
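A small numerical instance may help fix the pattern. Take $R=\mathbb Z$, $M=N=\mathbb Z/4$, and $f(m)=2m$; this particular choice is an illustration and does not appear elsewhere in the text. Then
\begin{align*}
\ker f&=\{0,2\},
&
\operatorname{im} f&=\{0,2\},
&
M/\ker f&=\{0+\ker f,\;1+\ker f\},
\end{align*}
and the comparison map $\theta(m+\ker f)=f(m)$ takes the values
\begin{align*}
\theta(0+\ker f)&=0,
&
\theta(1+\ker f)&=2.
\end{align*}
So $\theta$ is a bijective homomorphism between two groups of order $2$, hence an isomorphism, even though $f$ itself is neither injective nor surjective.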
Outside module categories, the comparison from coimage to image may fail to be an isomorphism. This failure marks the point where a morphism no longer factors as a quotient of its source followed by a subobject of its target in the familiar way. To use image-factorisation arguments in settings weaker than abelian categories, we therefore need a name for the morphisms whose comparison map still behaves as in the first isomorphism theorem, so that later exactness statements can distinguish a genuine quotient-then-subobject factorisation from a merely formal kernel-cokernel construction.
[definition: Strict Morphism]
Let $\mathcal A$ be a category with kernels and cokernels. A morphism $f:A \to B$ is strict if the canonical morphism $\operatorname{coim} f \to \operatorname{im} f$ is an isomorphism.
[/definition]
In an abelian category every morphism is strict. In quasi-abelian or merely additive settings, strictness becomes a property that must be checked rather than a background theorem.
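A standard illustration of a non-strict morphism, sketched here under assumptions not developed in this section, takes place in the category of torsion-free abelian groups. There kernels are computed as in $\mathrm{Ab}$, while the cokernel of $h:A\to B$ is the ordinary quotient $B/\operatorname{im} h$ divided further by its torsion subgroup. For $f:\mathbb Z\to\mathbb Z$ given by $f(n)=2n$:
\begin{align*}
\ker f&=0, & \operatorname{coim} f&=\operatorname{coker}(0\to\mathbb Z)=\mathbb Z,\\
\operatorname{coker} f&=(\mathbb Z/2\mathbb Z)/\text{torsion}=0, & \operatorname{im} f&=\ker(\mathbb Z\to 0)=\mathbb Z.
\end{align*}
Both the coimage projection and the image inclusion are $1_{\mathbb Z}$, so the canonical comparison $\operatorname{coim} f\to\operatorname{im} f$ is $f$ itself, namely multiplication by $2$. This map is not an isomorphism because $1$ is not in its image, so $f$ is not strict in this category.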
The exactness axioms will also relate kernels and cokernels to categorical cancellation. To make that relation meaningful, we first need definitions of injective-like and surjective-like morphisms that do not mention elements. Monomorphisms and epimorphisms provide those intrinsic cancellation notions, and the later abelian axioms will say when they are actually controlled by kernels and cokernels.
[definition: Monomorphism and Epimorphism]
A morphism $m:A \to B$ is a monomorphism if for every object $X$ and every pair of morphisms $u,v:X \to A$, the equality $mu=mv$ implies $u=v$. A morphism $e:A \to B$ is an epimorphism if for every object $Y$ and every pair of morphisms $u,v:B \to Y$, the equality $ue=ve$ implies $u=v$.
[/definition]
These are cancellation properties, not elementwise injectivity or surjectivity. In concrete algebraic categories they often coincide with familiar injective or surjective maps, but the categorical definitions are the cancellation conditions.
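When the category is preadditive, the cancellation condition can be tested against zero alone. As a sketch, for $m:A\to B$ and $u,v:X\to A$, bilinearity of composition gives
\begin{align*}
mu=mv
\iff mu-mv=0
\iff m(u-v)=0.
\end{align*}
Hence $m$ is a monomorphism if and only if $mw=0$ implies $w=0$ for every $w:X\to A$: one direction applies the condition to $w=u-v$, and the other compares $mw=0=m\,0$. Dually, $e$ is an epimorphism if and only if $we=0$ implies $w=0$. This is the form in which monomorphisms and epimorphisms are usually compared with kernels and cokernels.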
[example: Epimorphisms Need Not Be Surjective]
In the category of rings with identity and unital ring homomorphisms, let $i:\mathbb Z\hookrightarrow \mathbb Q$ be the usual inclusion. We show that $i$ is an epimorphism: suppose $u,v:\mathbb Q\to S$ are ring homomorphisms with $ui=vi$. This means that for every $a\in \mathbb Z$,
\begin{align*}
u(a)=u(i(a))=(ui)(a)=(vi)(a)=v(i(a))=v(a).
\end{align*}
Let $r\in \mathbb Q$. Write $r=n/m$ with $n,m\in \mathbb Z$ and $m\neq 0$. Since $m\cdot (1/m)=1$ in $\mathbb Q$, applying $u$ gives
\begin{align*}
u(m)u(1/m)=u(m\cdot (1/m))=u(1)=1_S,
\end{align*}
and similarly
\begin{align*}
u(1/m)u(m)=u((1/m)\cdot m)=u(1)=1_S.
\end{align*}
Thus $u(1/m)$ is the inverse of $u(m)$. The same argument shows that $v(1/m)$ is the inverse of $v(m)$. Because $u(m)=v(m)$, uniqueness of inverses in a ring gives
\begin{align*}
u(1/m)=v(1/m).
\end{align*}
Therefore
\begin{align*}
u(n/m)
&=u(n\cdot (1/m))\\
&=u(n)u(1/m)\\
&=v(n)v(1/m)\\
&=v(n\cdot (1/m))\\
&=v(n/m).
\end{align*}
So $u(r)=v(r)$ for every $r\in \mathbb Q$, hence $u=v$. This proves that $\mathbb Z\hookrightarrow \mathbb Q$ is an epimorphism. It is not surjective as a function, since $1/2\in \mathbb Q$ is not in the image of $\mathbb Z\to\mathbb Q$. Thus epimorphism is a categorical cancellation property, not a synonym for surjectivity.
[/example]
For abelian categories, this pathology disappears in the exactness formalism: epimorphisms are precisely cokernels, and monomorphisms are precisely kernels.
## Definition and First Consequences of Abelian Categories
Which categorical assumptions make kernels, cokernels, images, and quotients behave like they do for modules? Additive structure alone gives sums of morphisms and zero maps, but it does not force the first isomorphism theorem. Abelian categories are the setting where additive categorical algebra supports exact sequences.
[definition: Preadditive Category]
A category $\mathcal A$ is preadditive if each hom-set $\mathcal A(X,Y)$ is an abelian group and composition is bilinear in both variables.
[/definition]
The bilinearity condition says that for composable morphisms, composition distributes over addition in each variable. It gives a meaningful zero morphism between any two objects.
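As a short check of how bilinearity controls the zero morphisms, let $f:B\to C$ be any morphism and let $0$ denote the zero element of $\mathcal A(A,B)$; the object names are chosen only for this sketch. Then
\begin{align*}
f\,0=f(0+0)=f\,0+f\,0,
\end{align*}
and subtracting $f\,0$ in the abelian group $\mathcal A(A,C)$ gives $f\,0=0$. The same argument in the other variable gives $0\,g=0$ for every $g:Z\to A$, so composing with a zero element on either side always yields a zero element.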
To support kernel and cokernel calculus, Hom-group addition must be paired with finite direct sums inside the category. Without a zero object and finite biproducts, one can add parallel morphisms but still lack the categorical direct sums needed for matrix arguments and exact sequence diagrams. Once both are available, morphisms can be arranged in matrix-like diagrams, and kernels and cokernels can interact with addition. This extra structure is what turns a preadditive category into the additive setting used by abelian categories.
[definition: Additive Category]
A category $\mathcal A$ is additive if it is preadditive, has a zero object, and has finite biproducts.
[/definition]
A biproduct $A \oplus B$ is simultaneously a product and a coproduct, with the compatibility equations familiar from direct sums of modules. Additive categories are the natural categorical home for matrices of morphisms, but additive structure alone still does not force images, kernels, and quotients to match module theory. The abelian condition adds the exactness requirements that make every morphism admit kernel-cokernel calculus and make cancellation morphisms arise from those universal constructions.
The obstruction now is that an additive category may still have monomorphisms and epimorphisms unrelated to kernels and cokernels. The definition of an abelian category imposes the missing compatibility so that cancellation, quotients, and images become part of one exactness calculus.
[definition: Abelian Category]
An abelian category is an additive category $\mathcal A$ such that every morphism has a kernel and a cokernel, every monomorphism is a kernel, and every epimorphism is a cokernel.
[/definition]
Many texts use an equivalent definition: an abelian category is an additive category in which every morphism has a kernel and a cokernel and the canonical map from coimage to image is an isomorphism. The point of that formulation is that kernels and cokernels alone do not automatically make images behave as they do in module categories. The theorem below proves that in an abelian category, as defined here, the coimage-image comparison is an isomorphism, restoring the familiar factorisation of a morphism through its image.
[quotetheorem:4192]
[citeproof:4192]
This result is the technical heart of the chapter. Its hypotheses are doing real work: outside the abelian setting, the canonical comparison can fail to be an isomorphism, so a quotient of the source need not coincide with a subobject of the target. The theorem also does not say that images are elementwise sets of values; it says that two universal constructions become canonically interchangeable. From this point onward, diagram arguments can factor any morphism as an epimorphism onto its image followed by a monomorphism into its codomain, exactly as in module theory.
[quotetheorem:4193]
[citeproof:4193]
This theorem is the reason exact sequences can be drawn and manipulated in an abelian category. The additive and kernel-cokernel hypotheses are necessary: in a general category a monomorphism may only express left cancellation, and an epimorphism may only express right cancellation, with no quotient object attached. The theorem does not identify monomorphisms with literal subset inclusions or epimorphisms with literal surjections; it identifies them with the universal roles played by kernels and cokernels. This is exactly what is needed to define exactness without elements.
[definition: Exact Pair]
Let $\mathcal A$ be an abelian category. A composable pair
\begin{align*}
A \xrightarrow{f} B \xrightarrow{g} C
\end{align*}
is exact at $B$ if $gf=0$ and the induced morphism $\operatorname{im} f \to \ker g$ is an isomorphism.
[/definition]
Because image and coimage agree in an abelian category, this definition matches the module-theoretic condition $\operatorname{im} f=\ker g$. It is formulated only using universal properties.
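A quick check in $\operatorname{Ab}$, offered only as an illustration, shows that the condition $gf=0$ alone does not give exactness. Let $g:\mathbb Z\to\mathbb Z/2\mathbb Z$ be reduction modulo $2$ and let $f_n:\mathbb Z\to\mathbb Z$ be multiplication by $n$, for $n=2$ and $n=4$. In both cases $gf_n=0$ and $\ker g=2\mathbb Z$, but
\begin{align*}
n=2:&\quad \operatorname{im} f_2=2\mathbb Z \xrightarrow{\ \cong\ } 2\mathbb Z=\ker g,\\
n=4:&\quad \operatorname{im} f_4=4\mathbb Z \hookrightarrow 2\mathbb Z=\ker g,
\qquad 2\mathbb Z/4\mathbb Z\cong \mathbb Z/2\mathbb Z\neq 0.
\end{align*}
The pair is exact at the middle term for $n=2$ and fails to be exact for $n=4$, where the induced morphism $\operatorname{im} f_4\to\ker g$ is monic but not an isomorphism.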
Many arguments need the three-term case, in which an object is presented as an extension of a quotient object by a subobject. Naming this three-term pattern separates the data of the two maps from the exactness condition that there is no leftover homology in the middle.
[definition: Short Exact Sequence]
A short exact sequence in an abelian category is a diagram
\begin{align*}
0 \longrightarrow A \xrightarrow{f} B \xrightarrow{g} C \longrightarrow 0
\end{align*}
that is exact at $A$, $B$, and $C$.
[/definition]
Equivalently, $f$ is a kernel of $g$ and $g$ is a cokernel of $f$. This is the categorical form of saying that $A$ is a subobject of $B$ and $C$ is the corresponding quotient. In practice, to prove a sequence is short exact one usually verifies three pieces of data: $f$ is monic, $g$ is epic, and the image of $f$ agrees with the kernel of $g$. The abelian-category axioms then convert those checks into the universal kernel-cokernel statement.
[illustration:category-theory-ii-short-exact-sequence]
[example: The Category of Modules Is Abelian]
Let $R$ be a ring. For left $R$-modules $M,N$, the set $\operatorname{Hom}_R(M,N)$ is an abelian group under pointwise addition:
\begin{align*}
(\alpha+\beta)(m)&=\alpha(m)+\beta(m),\\
0(m)&=0_N,\\
(-\alpha)(m)&=-\alpha(m).
\end{align*}
Composition is bilinear because for $\alpha,\beta:M\to N$ and $\gamma:N\to P$,
\begin{align*}
(\gamma(\alpha+\beta))(m)
&=\gamma(\alpha(m)+\beta(m))\\
&=\gamma(\alpha(m))+\gamma(\beta(m))\\
&=(\gamma\alpha+\gamma\beta)(m),
\end{align*}
and for $\delta,\epsilon:N\to P$,
\begin{align*}
((\delta+\epsilon)\alpha)(m)
&=(\delta+\epsilon)(\alpha(m))\\
&=\delta(\alpha(m))+\epsilon(\alpha(m))\\
&=(\delta\alpha+\epsilon\alpha)(m).
\end{align*}
Thus $R\text{-}\operatorname{Mod}$ is preadditive.
The zero module is both initial and terminal, since for every module $M$ there is exactly one homomorphism $0\to M$ and exactly one homomorphism $M\to 0$. Finite biproducts are direct sums. For $M\oplus N$, define
\begin{align*}
\iota_M(m)&=(m,0),&
\iota_N(n)&=(0,n),\\
\pi_M(m,n)&=m,&
\pi_N(m,n)&=n.
\end{align*}
Then
\begin{align*}
\pi_M\iota_M&=\operatorname{id}_M,&
\pi_N\iota_N&=\operatorname{id}_N,\\
\pi_M\iota_N&=0,&
\pi_N\iota_M&=0,
\end{align*}
and for every $(m,n)\in M\oplus N$,
\begin{align*}
(\iota_M\pi_M+\iota_N\pi_N)(m,n)
&=\iota_M(m)+\iota_N(n)\\
&=(m,0)+(0,n)\\
&=(m,n).
\end{align*}
So $M\oplus N$ is simultaneously a product and a coproduct.
For an $R$-linear map $f:M\to N$, its categorical kernel is the inclusion $\ker f\hookrightarrow M$, and its categorical cokernel is the quotient map $N\to N/\operatorname{im} f$, as verified by the kernel and cokernel universal properties. It remains to check the exactness axioms for monomorphisms and epimorphisms.
If $m:A\to B$ is monic, then $m$ is injective: if $m(a)=m(a')$, define $u,v:R\to A$ by $u(r)=ra$ and $v(r)=ra'$. Then $mu=mv$, so $u=v$, hence $a=u(1)=v(1)=a'$. Let
\begin{align*}
q:B\to B/m(A)
\end{align*}
be the quotient map. Since $q(m(a))=m(a)+m(A)=m(A)=0$, we have $qm=0$. If $u:X\to B$ satisfies $qu=0$, then for every $x\in X$,
\begin{align*}
u(x)+m(A)=q(u(x))=(qu)(x)=0,
\end{align*}
so $u(x)\in m(A)$. Thus there is a unique element $\bar u(x)\in A$ with $m(\bar u(x))=u(x)$, uniqueness using injectivity of $m$. This defines a homomorphism $\bar u:X\to A$: for $x,y\in X$,
\begin{align*}
m(\bar u(x+y))
&=u(x+y)\\
&=u(x)+u(y)\\
&=m(\bar u(x))+m(\bar u(y))\\
&=m(\bar u(x)+\bar u(y)),
\end{align*}
so $\bar u(x+y)=\bar u(x)+\bar u(y)$, and similarly
\begin{align*}
m(\bar u(rx))
&=u(rx)\\
&=r u(x)\\
&=r m(\bar u(x))\\
&=m(r\bar u(x)),
\end{align*}
so $\bar u(rx)=r\bar u(x)$. Also $m\bar u=u$, and any other factorization through $m$ agrees with $\bar u$ after applying the injective map $m$. Hence $m$ is the kernel of $q$.
Dually, let $e:A\to B$ be epic. Then $e$ is surjective: if $\operatorname{im}e\neq B$, the quotient map $q:B\to B/\operatorname{im}e$ is nonzero, while
\begin{align*}
(qe)(a)=q(e(a))=e(a)+\operatorname{im}e=\operatorname{im}e=0.
\end{align*}
Thus $qe=0=0e$, so epimorphy forces $q=0$; but $q=0$ means that every $b\in B$ lies in $\operatorname{im}e$, a contradiction. Let $k:\ker e\hookrightarrow A$ be the kernel inclusion, so that $ek=0$. To show that $e$ is a cokernel of $k$, suppose $v:A\to P$ satisfies $vk=0$. Define $\bar v:B\to P$ by choosing, for each $b\in B$, an element $a\in A$ with $e(a)=b$ and setting
\begin{align*}
\bar v(b)=v(a).
\end{align*}
This is well-defined: if $e(a)=e(a')$, then
\begin{align*}
e(a-a')=e(a)-e(a')=0,
\end{align*}
so $a-a'\in\ker e$, and therefore
\begin{align*}
v(a)-v(a')=v(a-a')=(vk)(a-a')=0.
\end{align*}
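The map $\bar v$ is $R$-linear: if $e(a)=b$ and $e(a')=b'$, then $e(a+a')=b+b'$ and $e(ra)=rb$, so $\bar v(b+b')=v(a+a')=\bar v(b)+\bar v(b')$ and $\bar v(rb)=v(ra)=r\bar v(b)$. Also $\bar v e=v$ by construction. Since $e$ is surjective, any map $B\to P$ satisfying this equation is forced to agree with $\bar v$ on every element of $B$. Hence $e$ is the cokernel of $k$.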
Therefore $R\text{-}\operatorname{Mod}$ is additive, has kernels and cokernels, every monomorphism is a kernel, and every epimorphism is a cokernel. Equivalently, the *First Isomorphism Theorem* identifies $M/\ker f$ with $\operatorname{im}f$ for every homomorphism $f:M\to N$, so the categorical image-coimage comparison is an isomorphism. Thus $R\text{-}\operatorname{Mod}$ is an abelian category.
[/example]
Module categories are the reference examples throughout homological algebra. Most diagram lemmas, including the snake lemma and five lemma, are formal consequences of the abelian axioms rather than special facts about elements of modules.
[example: Sheaves of Abelian Groups]
Let $X$ be a topological space and let $\varphi:\mathcal F\to \mathcal G$ be a morphism of sheaves of abelian groups. The kernel sheaf is computed on each open set by
\begin{align*}
(\ker \varphi)(U)=\ker(\varphi_U:\mathcal F(U)\to \mathcal G(U)).
\end{align*}
This assignment is a sheaf: if $s_i\in \ker(\varphi)(U_i)$ agree on overlaps, the sheaf axiom for $\mathcal F$ gives a unique $s\in \mathcal F(U)$ with $s|_{U_i}=s_i$, and
\begin{align*}
(\varphi_U(s))|_{U_i}
=\varphi_{U_i}(s|_{U_i})
=\varphi_{U_i}(s_i)
=0.
\end{align*}
Since $\mathcal G$ is separated, $\varphi_U(s)=0$, so $s\in \ker(\varphi)(U)$.
The cokernel starts with the presheaf
\begin{align*}
U\longmapsto \mathcal G(U)/\varphi_U(\mathcal F(U)).
\end{align*}
For an open inclusion $V\subseteq U$, restriction is well-defined because if $g-g'=\varphi_U(f)$, then
\begin{align*}
g|_V-g'|_V
=(g-g')|_V
=\varphi_U(f)|_V
=\varphi_V(f|_V).
\end{align*}
This presheaf need not be a sheaf, so the cokernel sheaf is its sheafification. If $q:\mathcal G\to \operatorname{coker}\varphi$ is the composite of the presheaf quotient map with the canonical map to the sheafification, then for every open $U$ and every $f\in \mathcal F(U)$,
\begin{align*}
q_U(\varphi_U(f))=0,
\end{align*}
so $q\varphi=0$.
Now suppose $v:\mathcal G\to \mathcal H$ is a sheaf morphism with $v\varphi=0$. On the presheaf cokernel define
\begin{align*}
\widetilde v_U(g+\varphi_U(\mathcal F(U)))=v_U(g).
\end{align*}
This is well-defined: if $g-g'=\varphi_U(f)$, then
\begin{align*}
v_U(g)-v_U(g')
=v_U(g-g')
=v_U(\varphi_U(f))
=(v\varphi)_U(f)
=0.
\end{align*}
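The maps $\widetilde v_U$ commute with restrictions because $v$ is a sheaf morphism, so $\widetilde v$ is a morphism of presheaves. Because $\mathcal H$ is a sheaf, the universal property of sheafification shows that $\widetilde v$ factors uniquely through the sheafification, giving a unique morphism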
\begin{align*}
\bar v:\operatorname{coker}\varphi\to \mathcal H
\end{align*}
with $\bar v q=v$. Thus cokernels are obtained by sheafifying the presheaf cokernel.
The remaining abelian-category axioms are checked locally on stalks: kernels, cokernels, monomorphisms, epimorphisms, and exactness of sheaves of abelian groups are detected on the stalks $\mathcal F_x$. Since each stalk is an abelian group, exactness reduces pointwise to ordinary exactness in $\operatorname{Ab}$. Hence $\operatorname{Sh}(X,\operatorname{Ab})$ is abelian, and sheaf cohomology is homological algebra inside this abelian category.
[/example]
The sheaf example shows why abelian categories are not merely a rephrasing of module theory. They provide the exactness framework needed for geometry, topology, and representation theory.
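A standard illustration of why sheafification is needed in the cokernel construction is the exponential map. The setting is an assumption chosen for the example: $X=\mathbb C\setminus\{0\}$, $\mathcal O$ is the sheaf of holomorphic functions under addition, $\mathcal O^*$ is the sheaf of nowhere-vanishing holomorphic functions under multiplication, and $\exp:\mathcal O\to\mathcal O^*$ sends $f$ to $e^{f}$. Writing $P$ for the presheaf cokernel,
\begin{align*}
P(U)=\mathcal O^*(U)/\exp(\mathcal O(U)),
\qquad
P(D)=0\ \text{for every open disc } D\subseteq X,
\qquad
[z]\neq 0\in P(X).
\end{align*}
The middle statement holds because every nowhere-vanishing holomorphic function on a disc has a logarithm. The last statement holds because $e^{f}=z$ on $X$ would give $f'=1/z$, whereas the integral of $1/z$ around the unit circle is $2\pi i$ and the integral of a derivative around a closed curve is $0$. So the nonzero section $[z]$ restricts to zero on every member of a cover of $X$ by discs: $P$ is not separated, hence not a sheaf. Its sheafification is zero, so $\exp$ is an epimorphism of sheaves even though $\exp_X$ is not surjective.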
[example: Groups Are Not an Abelian Category]
The category $\operatorname{Grp}$ is not abelian because it is not additive. In an additive category, the biproduct of two objects is both their product and their coproduct. In $\operatorname{Grp}$, the product of two copies of $\mathbb Z$ is $\mathbb Z\times \mathbb Z$, with elements
\begin{align*}
e_1=(1,0),
\qquad
e_2=(0,1).
\end{align*}
These commute, since the group operation is componentwise addition:
\begin{align*}
e_1e_2
&=(1,0)+(0,1)\\
&=(1,1)\\
&=(0,1)+(1,0)\\
&=e_2e_1.
\end{align*}
Now define homomorphisms $\alpha,\beta:\mathbb Z\to S_3$ by
\begin{align*}
\alpha(1)=(12),
\qquad
\beta(1)=(23).
\end{align*}
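If $\mathbb Z\times\mathbb Z$, with the inclusions $n\mapsto ne_1$ and $n\mapsto ne_2$ required of a biproduct, were also the coproduct of two copies of $\mathbb Z$, then the universal property applied to $\alpha$ and $\beta$ would give a homomorphism $h:\mathbb Z\times\mathbb Z\to S_3$ satisfying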
\begin{align*}
h(e_1)=(12),
\qquad
h(e_2)=(23).
\end{align*}
Applying $h$ to the equality $e_1e_2=e_2e_1$ gives
\begin{align*}
(12)(23)
&=h(e_1)h(e_2)\\
&=h(e_1e_2)\\
&=h(e_2e_1)\\
&=h(e_2)h(e_1)\\
&=(23)(12).
\end{align*}
But the two permutations are different:
\begin{align*}
(12)(23)&=(123),\\
(23)(12)&=(132).
\end{align*}
Thus the product $\mathbb Z\times\mathbb Z$ cannot satisfy the coproduct universal property, so $\operatorname{Grp}$ has no finite biproducts of the kind required for additivity.
The failure also appears in exactness language. Let $i:H\hookrightarrow S_3$ be the inclusion of the subgroup
\begin{align*}
H=\{e,(12)\}.
\end{align*}
The cokernel of $i$ in $\operatorname{Grp}$ is the quotient of $S_3$ by the normal closure of $H$, because a homomorphism out of $S_3$ kills $H$ exactly when it kills every conjugate of every element of $H$. In $S_3$,
\begin{align*}
(123)(12)(123)^{-1}&=(23),\\
(132)(12)(132)^{-1}&=(13).
\end{align*}
So the normal closure of $H$ contains $(12)$, $(23)$, and $(13)$. These transpositions generate $S_3$: for example,
\begin{align*}
(12)(23)&=(123),\\
(23)(12)&=(132),
\end{align*}
and the elements
\begin{align*}
e,\ (12),\ (23),\ (13),\ (123),\ (132)
\end{align*}
are all the elements of $S_3$. Hence the cokernel of $i$ is the quotient map $S_3\to 1$, whose kernel is all of $S_3$, while the ordinary image of $i$ is only $H$. Thus the image of a group homomorphism need not be recovered as the kernel of its cokernel, whereas in an abelian category the image of every morphism is the kernel of its cokernel. Equivalently, $i$ is a monomorphism that is not a kernel, because $H$ is not normal in $S_3$.
[/example]
This failure is structural rather than cosmetic. Nonabelian group theory has kernels and quotients, but it does not support the linear exact-sequence calculus of abelian categories.
[example: Finitely Generated Modules over a Non-Noetherian Ring]
Let $R$ be a ring that is not left Noetherian, so there is a left ideal $I\subseteq R$ that is not finitely generated. Consider the full subcategory $R\text{-}\operatorname{Mod}_{\mathrm{fg}}$ of finitely generated left $R$-modules. It is additive: if $M$ and $N$ are finitely generated, then $M\oplus N$ is generated by the finite list
\begin{align*}
(m_1,0),\ldots,(m_a,0),(0,n_1),\ldots,(0,n_b),
\end{align*}
whenever $m_1,\ldots,m_a$ generate $M$ and $n_1,\ldots,n_b$ generate $N$.
Cokernels stay inside this category. If $\alpha:M\to N$ is a homomorphism between finitely generated modules and $n_1,\ldots,n_b$ generate $N$, then the quotient module $N/\operatorname{im}\alpha$ is generated by
\begin{align*}
n_1+\operatorname{im}\alpha,\ldots,n_b+\operatorname{im}\alpha,
\end{align*}
because every element of $N/\operatorname{im}\alpha$ has the form $n+\operatorname{im}\alpha$, and if
\begin{align*}
n=r_1n_1+\cdots+r_bn_b,
\end{align*}
then
\begin{align*}
n+\operatorname{im}\alpha
&=(r_1n_1+\cdots+r_bn_b)+\operatorname{im}\alpha\\
&=r_1(n_1+\operatorname{im}\alpha)+\cdots+r_b(n_b+\operatorname{im}\alpha).
\end{align*}
Now take the quotient map
\begin{align*}
\pi:R\to R/I,
\qquad
\pi(r)=r+I.
\end{align*}
Both $R$ and $R/I$ are finitely generated: $R$ is generated by $1$, and $R/I$ is generated by $1+I$. In the full module category,
\begin{align*}
\ker \pi
&=\{r\in R:\pi(r)=0\}\\
&=\{r\in R:r+I=I\}\\
&=I.
\end{align*}
Suppose $\pi$ had a kernel $k:K\to R$ inside $R\text{-}\operatorname{Mod}_{\mathrm{fg}}$. Since $\pi k=0$, every element of $\operatorname{im}k$ lies in $I$. Conversely, for any $a\in I$, define $u_a:R\to R$ by
\begin{align*}
u_a(r)=ra.
\end{align*}
This map is $R$-linear, and
\begin{align*}
(\pi u_a)(r)=\pi(ra)=ra+I=I=0
\end{align*}
because $I$ is a left ideal. By the kernel universal property in $R\text{-}\operatorname{Mod}_{\mathrm{fg}}$, there is a morphism $\bar u_a:R\to K$ such that $k\bar u_a=u_a$. Evaluating at $1$ gives
\begin{align*}
a=u_a(1)=(k\bar u_a)(1)=k(\bar u_a(1)),
\end{align*}
so $a\in\operatorname{im}k$. Hence $\operatorname{im}k=I$.
But $K$ is finitely generated, say by $x_1,\ldots,x_n$. Then $\operatorname{im}k$ is generated by $k(x_1),\ldots,k(x_n)$, because if
\begin{align*}
x=r_1x_1+\cdots+r_nx_n,
\end{align*}
then
\begin{align*}
k(x)
&=k(r_1x_1+\cdots+r_nx_n)\\
&=r_1k(x_1)+\cdots+r_nk(x_n).
\end{align*}
Thus $I=\operatorname{im}k$ would be finitely generated, contradicting the choice of $I$. Therefore $R\text{-}\operatorname{Mod}_{\mathrm{fg}}$ need not have kernels, and so it need not be abelian. The failure is that submodules of finitely generated modules can leave the chosen finiteness class.
[/example]
This warning is important in algebraic geometry and representation theory, where finiteness conditions are often imposed. Noetherian hypotheses are precisely what prevent kernels of maps between finitely generated modules from escaping the category.
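For a concrete instance of such a ring, one may take (as an assumption made only for this illustration) a field $k$, the polynomial ring $R=k[x_1,x_2,\ldots]$ in countably many variables, and the ideal $I=(x_1,x_2,\ldots)$, so that $R/I\cong k$. Suppose $I$ were generated by $g_1,\ldots,g_m$, and choose $N$ so that every $g_j$ involves only $x_1,\ldots,x_N$. Let $\varepsilon:R\to R$ be the $k$-algebra homomorphism with $\varepsilon(x_i)=0$ for $i\le N$ and $\varepsilon(x_i)=x_i$ for $i>N$. Each $g_j$ lies in $I$, so it has zero constant term, and therefore $\varepsilon(g_j)=0$. If $x_{N+1}=r_1g_1+\cdots+r_mg_m$, then
\begin{align*}
x_{N+1}
&=\varepsilon(x_{N+1})\\
&=\varepsilon(r_1)\varepsilon(g_1)+\cdots+\varepsilon(r_m)\varepsilon(g_m)\\
&=0,
\end{align*}
a contradiction. Hence $I$ is not finitely generated, and the quotient map $R\to R/I$ is a morphism of finitely generated $R$-modules with no kernel in $R\text{-}\operatorname{Mod}_{\mathrm{fg}}$.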
[remark: Exactness Is Element-Free]
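In an abelian category, the phrase $\operatorname{im} f=\ker g$ means that the two monomorphisms $\operatorname{im} f\to B$ and $\ker g\to B$ represent the same subobject of $B$: each factors through the other, so there is a unique isomorphism $\operatorname{im} f\cong\ker g$ compatible with the maps to $B$. It does not mean that two subsets have the same elements. The notation is inherited from modules, but the content is universal. This is why diagram chasing can often be replaced by universal-property arguments.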
[/remark]
The chapter therefore completes the transition from universal constructions to homological algebra. Kernels and cokernels encode equations and quotients; images and coimages encode the first isomorphism theorem; abelian categories impose exactly the axioms needed for exact sequences to behave as they do in module categories.
Kernels and cokernels turn the abstract additive language into the exactness calculus used throughout homological algebra. The next chapter develops that calculus in detail, showing how short exact sequences and diagram lemmas organize local data into global information.
# 10. Exact Sequences and Diagram Lemmas
Exact sequences are the point at which the abstract language of kernels and cokernels starts to behave like familiar algebra. In Chapters 8 and 9, additive and abelian categories supplied enough structure to speak about images, coimages, kernels, and cokernels without mentioning elements. This chapter asks how much of ordinary homological algebra survives in that setting: how exactness is defined, how short exact sequences encode extensions, and how information can be extracted from commutative diagrams.
The guiding principle is that the standard diagram lemmas are not statements about elements of modules, but about the interaction of universal properties in an abelian category. We first define exactness at an object, then interpret pullbacks and pushouts as ways of transporting extensions, and finally prove the snake lemma, five lemma, and nine lemma as the basic computational tools for abelian categories.
## Exactness at an Object and Short Exact Sequences
What should it mean for a sequence of morphisms to have no homology at a middle object when the category may not have elements? In modules, exactness of
\begin{align*}
A \xrightarrow{f} B \xrightarrow{g} C
\end{align*}
means that the elements killed by $g$ are exactly those coming from $A$. The concrete failure that exactness rules out is an element of $\ker g$ with no preimage under $f$; categorically this is a nonzero quotient of $\ker g$ by $\operatorname{im} f$. In an abelian category, the same idea is expressed by comparing the image of $f$ with the kernel of $g$.
[definition: Exactness at an Object]
Let $\mathcal A$ be an abelian category. A pair of composable morphisms
\begin{align*}
A \xrightarrow{f} B \xrightarrow{g} C
\end{align*}
is exact at $B$ if $g \circ f = 0$ and the canonical monomorphism $\operatorname{im} f \to B$ identifies $\operatorname{im} f$ with $\ker g$ as subobjects of $B$.
[/definition]
This definition uses the image subobject rather than an elementwise subset. Since abelian categories identify image and coimage canonically, it is equivalent to saying that $f$ factors through $\ker g$ and the induced morphism $\operatorname{coim} f \to \ker g$ is an isomorphism. To study longer chains of maps, the same local condition is imposed at every object, so each failure to move forward is accounted for by the previous map.
Longer diagrams occur constantly in homological algebra, so the local condition needs a compact global name. The definition below turns exactness at a single object into a property of an entire chain, requiring every kernel in the chain to be supplied by the preceding image.
[definition: Exact Sequence]
A sequence of objects and morphisms
\begin{align*}
\cdots \longrightarrow A_{i-1} \xrightarrow{d_{i-1}} A_i \xrightarrow{d_i} A_{i+1} \longrightarrow \cdots
\end{align*}
in an abelian category $\mathcal A$ is exact if it is exact at every object $A_i$.
[/definition]
The maps in an exact sequence compose to zero, but that condition alone is weaker than exactness. The condition $d_i d_{i-1}=0$ only gives an inclusion of subobjects $\operatorname{im} d_{i-1}\to\ker d_i$ of $A_i$; exactness says that this inclusion is an isomorphism, so everything killed by $d_i$ comes from $A_{i-1}$.
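Two small sequences of abelian groups, given here purely as an illustration, separate the two conditions. In both, every map is multiplication by the indicated integer and consecutive maps compose to zero:
\begin{align*}
\mathbb Z/4\mathbb Z\xrightarrow{2}\mathbb Z/4\mathbb Z\xrightarrow{2}\mathbb Z/4\mathbb Z:
&\quad \operatorname{im}=\{0,2\}=\ker,\\
\mathbb Z/8\mathbb Z\xrightarrow{4}\mathbb Z/8\mathbb Z\xrightarrow{4}\mathbb Z/8\mathbb Z:
&\quad \operatorname{im}=\{0,4\}\subsetneq\{0,2,4,6\}=\ker.
\end{align*}
The first sequence is exact at the middle term. The second is not: the quotient $\ker/\operatorname{im}$ has two elements, represented by $0$ and $2$.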
[example: Exact Sequence of Modules]
Let $R$ be a ring, let $M$ be an $R$-module, and let $N\subset M$ be a submodule. Define $i:N\to M$ by $i(n)=n$ and define $q:M\to M/N$ by $q(m)=m+N$. We show that
\begin{align*}
0 \longrightarrow N \xrightarrow{i} M \xrightarrow{q} M/N \longrightarrow 0
\end{align*}
is exact in $R\text{-}\mathrm{Mod}$.
At $N$, the incoming map $0\to N$ has image $\{0\}$. Also,
\begin{align*}
\ker i
&=\{n\in N:i(n)=0\}\\
&=\{n\in N:n=0\}\\
&=\{0\},
\end{align*}
so the image of $0\to N$ equals $\ker i$. At $M$, the image of $i$ is
\begin{align*}
\operatorname{im} i
&=\{i(n):n\in N\}\\
&=\{n:n\in N\}\\
&=N,
\end{align*}
while the kernel of $q$ is
\begin{align*}
\ker q
&=\{m\in M:q(m)=0+N\}\\
&=\{m\in M:m+N=N\}\\
&=\{m\in M:m\in N\}\\
&=N.
\end{align*}
Thus $\operatorname{im} i=\ker q$. At $M/N$, every coset $m+N\in M/N$ satisfies
\begin{align*}
m+N=q(m),
\end{align*}
so $q$ is surjective and $\operatorname{im}q=M/N$, which is the kernel of the unique map $M/N\to 0$. Therefore the sequence is exact. This is the model for the categorical definition: the subobject $N$ is recovered as the kernel of the quotient map $q:M\to M/N$.
[/example]
Short exact sequences isolate the smallest nontrivial exact pattern. They encode an object $B$ built from a subobject $A$ and a quotient object $C$.
The module example shows the prototype, but the category-theoretic definition must avoid choosing elements or representatives. It instead asks for an exact chain beginning and ending at zero, which forces the first map to identify a subobject and the second map to identify the corresponding quotient.
[definition: Short Exact Sequence]
A short exact sequence in an abelian category $\mathcal A$ is an exact sequence
\begin{align*}
0 \longrightarrow A \xrightarrow{i} B \xrightarrow{p} C \longrightarrow 0.
\end{align*}
[/definition]
In such a sequence, $i$ is a monomorphism, $p$ is an epimorphism, and $i$ identifies $A$ with $\ker p$. Dually, $p$ identifies $C$ with $\operatorname{coker} i$. Thus a short exact sequence is more than a pair of maps: it is a way of realising $C$ as a quotient of $B$ by $A$.
For applications, one often wants to recognise short exactness from the two universal properties rather than checking exactness at each position separately. The following criterion makes that recognition precise: it identifies the kernel-cokernel data that are exactly equivalent to the four-term sequence being short exact.
[quotetheorem:4194]
[citeproof:4194]
The hypothesis that both universal properties hold is essential: knowing only that $i$ is monic and $p$ is epic does not determine the middle homology. For example, in abelian groups the sequence $0\to 2\mathbb Z\to \mathbb Z\to \mathbb Z/3\mathbb Z\to 0$ has a monomorphism and an epimorphism but is not exact at $\mathbb Z$: the composite sends $2$ to $2+3\mathbb Z\neq 0$, and the kernel of the second map is $3\mathbb Z$ rather than $2\mathbb Z$. The theorem does not classify all middle objects $B$; it only gives the intrinsic test that a proposed four-term sequence is genuinely short exact. This test is what later allows pullbacks, pushouts, and diagram lemmas to recognise short exact sequences without returning to elements.
[remark: Split Short Exact Sequences]
A short exact sequence
\begin{align*}
0 \longrightarrow A \xrightarrow{i} B \xrightarrow{p} C \longrightarrow 0
\end{align*}
is split if there exists $s:C\to B$ with $p s=\operatorname{id}_C$, or equivalently if there exists $r:B\to A$ with $r i=\operatorname{id}_A$. In that case $B \cong A \oplus C$ in a way compatible with $i$ and $p$.
[/remark]
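The passage from a section to a retraction and a direct sum decomposition can be sketched as follows, using only the kernel property of $i$. Given $s$ with $ps=\operatorname{id}_C$, we have $p(\operatorname{id}_B-sp)=p-p=0$, so there is a unique $r:B\to A$ with $ir=\operatorname{id}_B-sp$. Then
\begin{align*}
i(ri)&=(\operatorname{id}_B-sp)i=i-s(pi)=i,\\
i(rs)&=(\operatorname{id}_B-sp)s=s-s(ps)=0.
\end{align*}
Since $i$ is monic, $ri=\operatorname{id}_A$ and $rs=0$. Together with $pi=0$, $ps=\operatorname{id}_C$, and $ir+sp=\operatorname{id}_B$, these are the biproduct equations, so $B\cong A\oplus C$. For a sequence that does not split, consider (as an illustrative example) $0\to\mathbb Z/2\mathbb Z\xrightarrow{2}\mathbb Z/4\mathbb Z\to\mathbb Z/2\mathbb Z\to 0$ in abelian groups: a section would send the generator of $\mathbb Z/2\mathbb Z$ to $1$ or $3$ in $\mathbb Z/4\mathbb Z$, and neither is killed by $2$.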
The failure of a short exact sequence to split is often the mathematical content of the sequence. In later homological algebra, extensions are organised into groups such as $\operatorname{Ext}^1(C,A)$; here we only need the categorical mechanism by which extensions can be moved along morphisms.
## Pullbacks, Pushouts, and Extensions in Abelian Categories
How can an extension be transported when either the subobject or the quotient is changed? Pullbacks and pushouts answer this question by imposing the required universal property on a square. In abelian categories, these constructions preserve enough exactness to make extensions functorial in both variables.
[definition: Extension]
Let $\mathcal A$ be an abelian category. An extension of $C$ by $A$ is a short exact sequence
\begin{align*}
0 \longrightarrow A \xrightarrow{i} E \xrightarrow{p} C \longrightarrow 0.
\end{align*}
[/definition]
The object $E$ is the middle term, but the maps are part of the data. Two isomorphic middle objects can represent different extensions if the identifications with $A$ and $C$ differ.
Suppose $u:C'\to C$ is a morphism and
\begin{align*}
0 \longrightarrow A \longrightarrow E \longrightarrow C \longrightarrow 0
\end{align*}
is an extension. A naive inverse image of the middle object is not available in an arbitrary abelian category, and even for modules one must remember the compatibility condition with the quotient map. The pullback $E' = E \times_C C'$ imposes exactly that compatibility and gives a new extension of $C'$ by $A$.
The issue is whether this construction preserves the exactness data, not merely whether the pullback object exists. The theorem below verifies that changing the quotient object along $u$ still leaves $A$ as the kernel and produces a genuine extension with quotient $C'$.
[quotetheorem:4195]
[citeproof:4195]
The epimorphism hypothesis on $p$ matters: pulling back a non-epimorphic map need not produce a quotient map onto $C'$. For instance, in abelian groups, pulling back $2\mathbb Z\hookrightarrow \mathbb Z$ along the identity $\mathbb Z\to \mathbb Z$ returns the inclusion $2\mathbb Z\hookrightarrow\mathbb Z$ as the projection to the new base, which is not a surjection. The theorem does not say that the middle object is unchanged; it says that the same subobject $A$ remains the kernel after the quotient object is changed. This is the first instance of the contravariant behaviour of extension classes in the quotient variable.
[example: Pulling Back a Module Extension]
Let $R$ be a ring and consider a short exact sequence
\begin{align*}
0\longrightarrow A\xrightarrow{i}E\xrightarrow{p}C\longrightarrow 0
\end{align*}
of $R$-modules. Given an $R$-linear map $u:C'\to C$, define
\begin{align*}
E'=\{(e,c')\in E\oplus C':p(e)=u(c')\}.
\end{align*}
This is an $R$-submodule of $E\oplus C'$: if $(e_1,c_1'),(e_2,c_2')\in E'$ and $r\in R$, then
\begin{align*}
p(e_1+e_2)&=p(e_1)+p(e_2)=u(c_1')+u(c_2')=u(c_1'+c_2'),\\
p(r e_1)&=r p(e_1)=r u(c_1')=u(r c_1'),
\end{align*}
so $(e_1+e_2,c_1'+c_2')\in E'$ and $(r e_1,r c_1')\in E'$.
Let $p':E'\to C'$ be the projection $p'(e,c')=c'$. Since $p:E\to C$ is surjective, for each $c'\in C'$ there exists $e\in E$ with
\begin{align*}
p(e)=u(c'),
\end{align*}
so $(e,c')\in E'$ and $p'(e,c')=c'$. Hence $p'$ is surjective. Its kernel is
\begin{align*}
\ker p'
&=\{(e,c')\in E':p'(e,c')=0\}\\
&=\{(e,c')\in E':c'=0\}\\
&=\{(e,0)\in E\oplus C':p(e)=u(0)\}\\
&=\{(e,0)\in E\oplus C':p(e)=0\}\\
&=\{(e,0):e\in\ker p\}.
\end{align*}
Because the original sequence is exact, $\ker p=\operatorname{im}i$, so
\begin{align*}
\ker p'
&=\{(i(a),0):a\in A\}.
\end{align*}
Thus the map $i':A\to E'$ defined by
\begin{align*}
i'(a)=(i(a),0)
\end{align*}
identifies $A$ with $\ker p'$, and the pulled-back sequence
\begin{align*}
0\longrightarrow A\xrightarrow{i'}E'\xrightarrow{p'}C'\longrightarrow 0
\end{align*}
is exact. The pullback keeps the same kernel $A$ while replacing the quotient $C$ by the object $C'$ mapping into it.
[/example]
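As a concrete instance of this construction, chosen only for illustration, take the extension $0\to\mathbb Z\xrightarrow{2}\mathbb Z\xrightarrow{p}\mathbb Z/2\mathbb Z\to 0$ of abelian groups and pull it back along the reduction map $u:\mathbb Z\to\mathbb Z/2\mathbb Z$. The formulas of the example give
\begin{align*}
E'&=\{(e,c')\in\mathbb Z\oplus\mathbb Z: e\equiv c' \pmod 2\},\\
i'(a)&=(2a,0),\\
p'(e,c')&=c'.
\end{align*}
The map $s:\mathbb Z\to E'$ with $s(c')=(c',c')$ satisfies $p's=\operatorname{id}_{\mathbb Z}$, and every element decomposes as $(e,c')=(e-c',0)+(c',c')$ with $e-c'$ even. So the pulled-back extension splits, although the original one does not: any homomorphism $\mathbb Z/2\mathbb Z\to\mathbb Z$ is zero, so $p$ has no section. Pulling back can therefore change the extension substantially while keeping the kernel $A=\mathbb Z$ fixed.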
Dually, if $v:A\to A'$ is a morphism, the pushout of $A\to E$ along $v$ gives a new extension of $C$ by $A'$. The obstruction it removes is that a morphism out of $A$ need not extend across $E$; the pushout forces such an extension by quotienting out the incompatibility between the old copy of $A$ and the new object $A'$.
[quotetheorem:4196]
[citeproof:4196]
The monomorphism hypothesis on $i$ is essential because the pushed-out copy of $A'$ is meant to remain the kernel of the new quotient map. If $i$ is not a kernel, pushing out can preserve the formal square while losing exactness at the new middle object; for modules, a non-injective map $A\to E$ already has elements of $A$ that vanish before any pushout is taken. The theorem does not make extensions covariant in both variables: it changes the subobject while keeping the quotient fixed. Together with pullback, this is the categorical mechanism behind the two-variable functoriality of $\operatorname{Ext}^1$.
[example: Pushing Out a Module Extension]
Let
\begin{align*}
0\longrightarrow A\xrightarrow{i}E\xrightarrow{p}C\longrightarrow 0
\end{align*}
be a short exact sequence of $R$-modules, and let $v:A\to A'$ be $R$-linear. Define
\begin{align*}
S=\{(i(a),-v(a)):a\in A\}\subset E\oplus A',
\qquad
E'=(E\oplus A')/S.
\end{align*}
The submodule property follows from
\begin{align*}
(i(a_1),-v(a_1))+(i(a_2),-v(a_2))
&=(i(a_1+a_2),-v(a_1+a_2)),\\
r(i(a),-v(a))
&=(i(ra),-v(ra)),
\end{align*}
using $R$-linearity of $i$ and $v$.
Define $i':A'\to E'$ and $p':E'\to C$ by
\begin{align*}
i'(a')&=[(0,a')],\\
p'([(e,a')])&=p(e).
\end{align*}
The map $p'$ is well-defined because if $(e,a')-(\tilde e,\tilde a')\in S$, then for some $a\in A$,
\begin{align*}
(e-\tilde e,a'-\tilde a')=(i(a),-v(a)),
\end{align*}
so $e-\tilde e=i(a)$, and hence
\begin{align*}
p(e)-p(\tilde e)=p(e-\tilde e)=p(i(a))=0
\end{align*}
because $p i=0$. Also
\begin{align*}
p'i'(a')=p'([(0,a')])=p(0)=0.
\end{align*}
We compute the kernel of $p'$. If $[(e,a')]\in\ker p'$, then
\begin{align*}
0=p'([(e,a')])=p(e),
\end{align*}
so $e\in\ker p$. Since the original sequence is exact, $\ker p=\operatorname{im}i$, so there exists $a\in A$ with $e=i(a)$. In $E'$,
\begin{align*}
[(e,a')]
&=[(i(a),a')]\\
&=[(i(a),a')-(i(a),-v(a))]\\
&=[(0,a'+v(a))]\\
&=i'(a'+v(a)).
\end{align*}
Thus $\ker p'\subseteq\operatorname{im}i'$. The reverse inclusion follows from $p'i'=0$, so
\begin{align*}
\ker p'=\operatorname{im}i'.
\end{align*}
Finally, $i'$ is injective. If $i'(a')=0$, then $[(0,a')]=0$, so $(0,a')\in S$. Hence for some $a\in A$,
\begin{align*}
(0,a')=(i(a),-v(a)).
\end{align*}
Thus $i(a)=0$. Since the original sequence is exact at $A$, $i$ is injective, so $a=0$, and therefore
\begin{align*}
a'=-v(a)=-v(0)=0.
\end{align*}
Also $p'$ is surjective because for every $c\in C$, surjectivity of $p$ gives some $e\in E$ with $p(e)=c$, and then
\begin{align*}
p'([(e,0)])=p(e)=c.
\end{align*}
Therefore
\begin{align*}
0\longrightarrow A'\xrightarrow{i'}E'\xrightarrow{p'}C\longrightarrow 0
\end{align*}
is exact. The quotient relation $[(i(a),0)]=[(0,v(a))]$ is precisely what replaces the old embedded copy of $A$ in $E$ by the module $A'$.
[/example]
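As a concrete instance, chosen only for illustration, push out the extension $0\to\mathbb Z\xrightarrow{2}\mathbb Z\to\mathbb Z/2\mathbb Z\to 0$ of abelian groups along the reduction map $v:\mathbb Z\to\mathbb Z/2\mathbb Z$, written $v(a)=\bar a$. Here $S=\{(2a,-\bar a):a\in\mathbb Z\}$ and $E'=(\mathbb Z\oplus\mathbb Z/2\mathbb Z)/S$. For $x=[(1,\bar 0)]$,
\begin{align*}
2x&=[(2,\bar 0)]=[(2,\bar 0)-(2,-\bar 1)]=[(0,\bar 1)]=i'(\bar 1),\\
4x&=2\,i'(\bar 1)=i'(\bar 0)=0.
\end{align*}
Since $i'$ is injective, $2x\neq 0$, so $x$ has order $4$. The group $E'$ is an extension of $\mathbb Z/2\mathbb Z$ by $\mathbb Z/2\mathbb Z$ and so has four elements, hence $E'\cong\mathbb Z/4\mathbb Z$ with generator $x$. The pushed-out extension is $0\to\mathbb Z/2\mathbb Z\to\mathbb Z/4\mathbb Z\to\mathbb Z/2\mathbb Z\to 0$, with $i'(\bar 1)=2x$ and $p'(x)=\bar 1$.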
These two constructions explain why extensions are contravariant in the quotient variable and covariant in the subobject variable. They are also the categorical background for connecting morphisms in long exact sequences.
## Snake Lemma, Five Lemma, and Nine Lemma
How does exactness propagate through a commutative diagram? The classical answer is that certain diagram shapes force new exact sequences or force a middle map to be an isomorphism. We first state the module version, where the diagram chase is concrete; the same pattern is the model for the abelian-category form used later with complexes and derived functors.
[quotetheorem:4533]
[citeproof:4533]
In this module form, the exactness of both rows is essential: if either row is only a chain complex rather than an exact sequence, the displayed kernel-cokernel sequence can fail at the connecting term. The lemma does not identify kernels with cokernels; it produces one exact sequence linking the kernels of the vertical maps to their cokernels through the connecting morphism. We will use it as a black-box source of connecting homomorphisms in long exact sequences, where the connecting map records an obstruction to passing local or degreewise information through an exact sequence.
[example: Long Exact Sequence from a Snake-Lemma Diagram]
Let
\begin{align*}
0\longrightarrow X'\xrightarrow{i}X\xrightarrow{p}X''\longrightarrow 0
\end{align*}
be a short exact sequence of cochain complexes, with differentials $d'$, $d$, and $d''$. For each $n$,
\begin{align*}
H^n(X')=\ker(d'^n)/\operatorname{im}(d'^{\,n-1}),
\qquad
H^n(X)=\ker(d^n)/\operatorname{im}(d^{n-1}),
\qquad
H^n(X'')=\ker(d''^n)/\operatorname{im}(d''^{\,n-1}).
\end{align*}
We construct
\begin{align*}
\partial:H^n(X'')\longrightarrow H^{n+1}(X')
\end{align*}
by lifting a cocycle and measuring the differential of the lift.
Let $[z'']\in H^n(X'')$, with $z''\in X''^n$ and
\begin{align*}
d''^n z''=0.
\end{align*}
Since $p^n:X^n\to X''^n$ is epic, choose $x\in X^n$ with
\begin{align*}
p^n x=z''.
\end{align*}
Because $p$ is a cochain map,
\begin{align*}
p^{n+1}(d^n x)
&=d''^n(p^n x)\\
&=d''^n z''\\
&=0.
\end{align*}
Thus $d^n x\in\ker p^{n+1}$. Exactness at $X^{n+1}$ gives
\begin{align*}
\ker p^{n+1}=\operatorname{im} i^{n+1},
\end{align*}
so there is $z'\in X'^{n+1}$ such that
\begin{align*}
i^{n+1}z'=d^n x.
\end{align*}
This $z'$ is a cocycle, because
\begin{align*}
i^{n+2}(d'^{\,n+1}z')
&=d^{n+1}(i^{n+1}z')\\
&=d^{n+1}(d^n x)\\
&=0,
\end{align*}
and $i^{n+2}$ is monic. Hence
\begin{align*}
d'^{\,n+1}z'=0.
\end{align*}
Define
\begin{align*}
\partial([z''])=[z']\in H^{n+1}(X').
\end{align*}
This class does not depend on the chosen lift $x$. If $x_1,x_2\in X^n$ both satisfy $p^n x_1=p^n x_2=z''$, then
\begin{align*}
p^n(x_1-x_2)
&=p^n x_1-p^n x_2\\
&=z''-z''\\
&=0.
\end{align*}
Thus $x_1-x_2\in\ker p^n=\operatorname{im}i^n$, so $x_1-x_2=i^n y'$ for some $y'\in X'^n$. If $i^{n+1}z_1'=d^n x_1$ and $i^{n+1}z_2'=d^n x_2$, then
\begin{align*}
i^{n+1}(z_1'-z_2')
&=d^n x_1-d^n x_2\\
&=d^n(x_1-x_2)\\
&=d^n(i^n y')\\
&=i^{n+1}(d'^{\,n}y').
\end{align*}
Since $i^{n+1}$ is monic,
\begin{align*}
z_1'-z_2'=d'^{\,n}y',
\end{align*}
so $[z_1']=[z_2']$ in $H^{n+1}(X')$.
It also does not depend on the chosen representative of $[z'']$. If $\tilde z''=z''+d''^{\,n-1}w''$, choose $w\in X^{n-1}$ with $p^{n-1}w=w''$, and set
\begin{align*}
\tilde x=x+d^{n-1}w.
\end{align*}
Then
\begin{align*}
p^n\tilde x
&=p^n x+p^n(d^{n-1}w)\\
&=z''+d''^{\,n-1}(p^{n-1}w)\\
&=z''+d''^{\,n-1}w''\\
&=\tilde z''.
\end{align*}
Moreover,
\begin{align*}
d^n\tilde x
&=d^n x+d^n d^{n-1}w\\
&=d^n x.
\end{align*}
Thus the same element $z'\in X'^{n+1}$ may be used for both representatives, so $\partial([z''])$ is well-defined.
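Applying the *Snake Lemma* in each degree to the diagram whose top row consists of the cokernels of $d'^{\,n-1}$, $d^{n-1}$, $d''^{\,n-1}$, whose bottom row consists of the kernels of $d'^{\,n+1}$, $d^{n+1}$, $d''^{\,n+1}$, and whose vertical maps are induced by the differentials in degree $n$, gives exactness at each displayed cohomology object, and therefore the long exact sequence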
\begin{align*}
\cdots \longrightarrow H^n(X')\longrightarrow H^n(X)\longrightarrow H^n(X'')\xrightarrow{\partial}H^{n+1}(X')\longrightarrow H^{n+1}(X)\longrightarrow\cdots .
\end{align*}
Thus $\partial([z''])$ is exactly the obstruction to lifting the cocycle $z''$ to a cocycle of $X$: the chosen lift $x$ is a cocycle precisely when $d^n x=0$, which makes the resulting class in $H^{n+1}(X')$ vanish.
[/example]
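The degreewise diagram behind this long exact sequence can be written out explicitly. The following is a sketch using the auxiliary notation $Z^n(X)=\ker d^n$ and $B^n(X)=\operatorname{im} d^{n-1}$, which is introduced here for illustration and is not used elsewhere in these notes. The differential $d^n$ induces a map $\bar d^n:X^n/B^n(X)\to Z^{n+1}(X)$ with
\begin{align*}
\ker\bar d^n=Z^n(X)/B^n(X)=H^n(X),
\qquad
\operatorname{coker}\bar d^n=Z^{n+1}(X)/B^{n+1}(X)=H^{n+1}(X),
\end{align*}
and similarly for $X'$ and $X''$. The two rows
\begin{align*}
X'^n/B^n(X') \longrightarrow X^n/B^n(X) \longrightarrow X''^n/B^n(X'') \longrightarrow 0 \\
0 \longrightarrow Z^{n+1}(X') \longrightarrow Z^{n+1}(X) \longrightarrow Z^{n+1}(X'')
\end{align*}
are exact, because cokernels preserve the right-hand part and kernels preserve the left-hand part of the short exact sequence of complexes. With vertical maps $\bar d'^{\,n}$, $\bar d^{n}$, $\bar d''^{\,n}$, the kernel-cokernel sequence of this diagram is
\begin{align*}
H^n(X')\longrightarrow H^n(X)\longrightarrow H^n(X'')\xrightarrow{\partial}H^{n+1}(X')\longrightarrow H^{n+1}(X)\longrightarrow H^{n+1}(X''),
\end{align*}
and splicing these six-term pieces over all $n$ gives the long exact sequence.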
The five lemma addresses a different situation: if four of the vertical maps in a morphism of exact sequences are isomorphisms, the fifth is forced to be an isomorphism under suitable hypotheses. The issue is that a map in the middle of an exact row cannot be studied independently of its neighbours: exactness lets information about kernels and images travel across the diagram. The lemma formalizes the diagram chase that turns control of the four surrounding vertical maps into control of the remaining central one.
[quotetheorem:1938]
[citeproof:1938]
The requirement that the outer maps are isomorphisms cannot be weakened to arbitrary monomorphisms or epimorphisms. In abelian groups, a map between short exact sequences can have identical-looking end terms but a middle map such as multiplication by $2$ on $\mathbb Z$; without isomorphisms on the ends, the kernel or cokernel of the middle map need not vanish. The lemma does not classify morphisms of extensions; it only detects when the middle morphism is forced to be invertible. This detection principle is the local form of the five lemma for longer exact rows.
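To make the warning concrete, here is one small instance in abelian groups, offered as an illustration rather than as part of the lemma. Take both rows to be the short exact sequence $0\to\mathbb Z\xrightarrow{\operatorname{id}}\mathbb Z\to 0\to 0$, with vertical maps multiplication by $2$ on the first two terms and the zero map on the third:
\begin{align*}
0 \longrightarrow \mathbb Z \xrightarrow{\ \operatorname{id}\ } \mathbb Z \longrightarrow 0 \longrightarrow 0 \\
\downarrow \times 2 \qquad \downarrow \times 2 \qquad \downarrow 0 \\
0 \longrightarrow \mathbb Z \xrightarrow{\ \operatorname{id}\ } \mathbb Z \longrightarrow 0 \longrightarrow 0 .
\end{align*}
Both squares commute, the right-hand vertical map $0\to 0$ is an isomorphism, and the left-hand vertical map is a monomorphism but not an isomorphism. The middle map is then injective but not surjective:
\begin{align*}
\ker(\times 2)=0,
\qquad
\operatorname{coker}(\times 2)=\mathbb Z/2\mathbb Z\neq 0.
\end{align*}
So a monomorphism on one end is enough to keep the middle map monic, but it does not force the middle map to be an isomorphism.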
The exactness of the rows is doing real work here: without it, the information from the neighbouring vertical maps cannot be transported through images and kernels. A standard counterexample is a commutative five-term diagram of abelian groups with zero horizontal maps, where the four outer vertical maps are identities but the middle vertical map is multiplication by $2$ on $\mathbb Z$; the rows are not exact, and the middle map is not an isomorphism. The five lemma does not say that any four isomorphisms in a diagram force a fifth; it says this only inside the rigid environment of exact rows. This is the form used constantly when comparing long exact cohomology sequences induced by a morphism of short exact sequences of complexes.
[example: Diagram Chase in Sheaves of Abelian Groups]
Let $X$ be a topological space, and consider a commutative diagram of sheaves of abelian groups with exact rows
\begin{align*}
\mathcal F_1' \longrightarrow \mathcal F_2' \longrightarrow \mathcal F_3' \longrightarrow \mathcal F_4' \longrightarrow \mathcal F_5' \\
\downarrow \alpha_1 \qquad \downarrow \alpha_2 \qquad \downarrow \alpha_3 \qquad \downarrow \alpha_4 \qquad \downarrow \alpha_5 \\
\mathcal F_1 \longrightarrow \mathcal F_2 \longrightarrow \mathcal F_3 \longrightarrow \mathcal F_4 \longrightarrow \mathcal F_5 .
\end{align*}
Assume that $\alpha_1,\alpha_2,\alpha_4,\alpha_5$ are isomorphisms. For each point $x\in X$, taking stalks gives a commutative diagram of abelian groups
\begin{align*}
(\mathcal F_1')_x \longrightarrow (\mathcal F_2')_x \longrightarrow (\mathcal F_3')_x \longrightarrow (\mathcal F_4')_x \longrightarrow (\mathcal F_5')_x \\
\downarrow (\alpha_1)_x \qquad \downarrow (\alpha_2)_x \qquad \downarrow (\alpha_3)_x \qquad \downarrow (\alpha_4)_x \qquad \downarrow (\alpha_5)_x \\
(\mathcal F_1)_x \longrightarrow (\mathcal F_2)_x \longrightarrow (\mathcal F_3)_x \longrightarrow (\mathcal F_4)_x \longrightarrow (\mathcal F_5)_x .
\end{align*}
The two stalk rows are exact because exactness of sheaves of abelian groups is detected on stalks.
If $\beta_i:\mathcal F_i\to\mathcal F_i'$ is the inverse of $\alpha_i$ for $i\in\{1,2,4,5\}$, then
\begin{align*}
(\beta_i)_x(\alpha_i)_x
&=(\beta_i\alpha_i)_x\\
&=(\operatorname{id}_{\mathcal F_i'})_x\\
&=\operatorname{id}_{(\mathcal F_i')_x},
\end{align*}
and
\begin{align*}
(\alpha_i)_x(\beta_i)_x
&=(\alpha_i\beta_i)_x\\
&=(\operatorname{id}_{\mathcal F_i})_x\\
&=\operatorname{id}_{(\mathcal F_i)_x}.
\end{align*}
Thus $(\alpha_1)_x,(\alpha_2)_x,(\alpha_4)_x,(\alpha_5)_x$ are isomorphisms of abelian groups. Applying the *Five Lemma* in $\mathrm{Ab}$ to the stalk diagram shows that $(\alpha_3)_x$ is an isomorphism for every $x\in X$.
Now consider $\ker\alpha_3$ and $\operatorname{coker}\alpha_3$ as sheaves. For every $x\in X$,
\begin{align*}
(\ker\alpha_3)_x&=\ker((\alpha_3)_x)=0,\\
(\operatorname{coker}\alpha_3)_x&=\operatorname{coker}((\alpha_3)_x)=0,
\end{align*}
because $(\alpha_3)_x$ is an isomorphism. A sheaf of abelian groups with zero stalk at every point is the zero sheaf, so $\ker\alpha_3=0$ and $\operatorname{coker}\alpha_3=0$. Hence $\alpha_3$ is both monic and epic in $\mathrm{Sh}(X,\mathrm{Ab})$, and since a morphism in an abelian category that is both monic and epic is an isomorphism, $\alpha_3$ is an isomorphism. This shows how the five lemma for sheaves reduces to the ordinary five lemma on every stalk.
[/example]
The nine lemma explains how exactness of rows and columns in a $3\times 3$ diagram determines the remaining row. It is a compact way of organising repeated applications of the snake lemma.
[quotetheorem:4197]
[citeproof:4197]
The short exactness of the columns is indispensable. If the columns are merely exact at their middle objects, there is no reason for $A''\to B''$ to be monic or for $B''\to C''$ to be epic; in abelian groups, one can take a bottom row with a non-injective first map and choose upper rows so that only middle-column exactness is visible. The lemma does not manufacture endpoint exactness from a $3\times 3$ picture alone; the zeros and short exact column hypotheses supply that information. Its role is to prevent repeated snake-lemma arguments from obscuring the main point when building quotient exact sequences, filtrations, and long exact sequences in derived functor arguments.
[example: Nine Lemma for Modules]
Let
\begin{align*}
0\longrightarrow A'\xrightarrow{i'}B'\xrightarrow{p'}C'\longrightarrow 0
\end{align*}
be a short exact sequence of $R$-modules sitting inside another short exact sequence
\begin{align*}
0\longrightarrow A\xrightarrow{i}B\xrightarrow{p}C\longrightarrow 0,
\end{align*}
with $A'\subset A$, $B'\subset B$, $C'\subset C$, and with the squares compatible:
\begin{align*}
i(A')\subset B',
\qquad
p(B')\subset C',
\qquad
i|_{A'}=i',
\qquad
p|_{B'}=p'.
\end{align*}
We compute the quotient row
\begin{align*}
0\longrightarrow A/A'\xrightarrow{\bar i}B/B'\xrightarrow{\bar p}C/C'\longrightarrow 0,
\end{align*}
where
\begin{align*}
\bar i(a+A')&=i(a)+B',\\
\bar p(b+B')&=p(b)+C'.
\end{align*}
First, $\bar i$ is well-defined. If $a+A'=\tilde a+A'$, then $a-\tilde a\in A'$, so
\begin{align*}
i(a)-i(\tilde a)
&=i(a-\tilde a)\\
&=i'(a-\tilde a)\in B',
\end{align*}
and therefore $i(a)+B'=i(\tilde a)+B'$. Similarly, $\bar p$ is well-defined. If $b+B'=\tilde b+B'$, then $b-\tilde b\in B'$, so
\begin{align*}
p(b)-p(\tilde b)
&=p(b-\tilde b)\\
&=p'(b-\tilde b)\in C',
\end{align*}
and hence $p(b)+C'=p(\tilde b)+C'$.
The composite is zero because, for every $a\in A$,
\begin{align*}
(\bar p\bar i)(a+A')
&=\bar p(i(a)+B')\\
&=p(i(a))+C'\\
&=0+C',
\end{align*}
using $p i=0$ from exactness of the middle row.
Now compute the kernel of $\bar p$. If $b+B'\in\ker\bar p$, then
\begin{align*}
0+C'
&=\bar p(b+B')\\
&=p(b)+C',
\end{align*}
so $p(b)\in C'$. Since $p':B'\to C'$ is surjective, there is $b'\in B'$ with
\begin{align*}
p'(b')=p(b).
\end{align*}
Because $p|_{B'}=p'$, this gives
\begin{align*}
p(b-b')
&=p(b)-p(b')\\
&=p(b)-p'(b')\\
&=0.
\end{align*}
Exactness of the middle row gives $\ker p=\operatorname{im}i$, so there is $a\in A$ with
\begin{align*}
i(a)=b-b'.
\end{align*}
Thus
\begin{align*}
b+B'
&=(b-b')+B'\\
&=i(a)+B'\\
&=\bar i(a+A'),
\end{align*}
so $\ker\bar p\subseteq\operatorname{im}\bar i$. The reverse inclusion follows from $\bar p\bar i=0$, hence
\begin{align*}
\ker\bar p=\operatorname{im}\bar i.
\end{align*}
The map $\bar i$ is injective. If $\bar i(a+A')=0+B'$, then $i(a)\in B'$. Since $p'=p|_{B'}$ and $p(i(a))=0$, the element $i(a)\in B'$ lies in $\ker p'$, and exactness of the top row gives $\ker p'=\operatorname{im}i'$. Hence there is $a'\in A'$ such that
\begin{align*}
i'(a')=i(a).
\end{align*}
Using $i' = i|_{A'}$,
\begin{align*}
i(a-a')
&=i(a)-i(a')\\
&=i(a)-i'(a')\\
&=0.
\end{align*}
Exactness of the middle row gives that $i$ is injective, so $a-a'=0$, hence $a=a'\in A'$. Therefore $a+A'=0+A'$.
Finally, $\bar p$ is surjective. For any $c+C'\in C/C'$, exactness of the middle row gives some $b\in B$ with
\begin{align*}
p(b)=c.
\end{align*}
Then
\begin{align*}
\bar p(b+B')
&=p(b)+C'\\
&=c+C'.
\end{align*}
Thus the quotient row is short exact. This concrete quotient construction is the module-level picture behind the *Nine Lemma*: exactness of the first two rows and the compatible short exact columns force exactness of the third row.
[/example]
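A small numerical instance may help fix the picture. The specific modules below are chosen here for illustration and are not taken from the statement of the lemma. Over $R=\mathbb Z$, let
\begin{align*}
A=\mathbb Z,\qquad B=\mathbb Z\oplus\mathbb Z,\qquad C=\mathbb Z,\qquad i(a)=(a,0),\qquad p(a,b)=b,
\end{align*}
and take the submodules
\begin{align*}
A'=2\mathbb Z,\qquad B'=2\mathbb Z\oplus 3\mathbb Z,\qquad C'=3\mathbb Z.
\end{align*}
Then $i(A')=2\mathbb Z\oplus 0\subset B'$ and $p(B')=3\mathbb Z=C'$, and the restricted row $0\to 2\mathbb Z\to 2\mathbb Z\oplus 3\mathbb Z\to 3\mathbb Z\to 0$ is short exact because $\ker(p|_{B'})=2\mathbb Z\oplus 0=i(A')$. The quotient row becomes
\begin{align*}
0\longrightarrow \mathbb Z/2\mathbb Z\xrightarrow{\bar i}\mathbb Z/2\mathbb Z\oplus\mathbb Z/3\mathbb Z\xrightarrow{\bar p}\mathbb Z/3\mathbb Z\longrightarrow 0,
\qquad
\bar i(\overline a)=(\overline a,\overline 0),
\qquad
\bar p(\overline a,\overline b)=\overline b,
\end{align*}
which is visibly short exact, in agreement with the general argument above.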
The diagram lemmas are the working calculus of abelian categories. They let us convert local exactness data in a diagram into global information, and they justify the familiar homological constructions that appear in derived functors, cohomology theories, and sheaf-theoretic algebra.
Diagram lemmas give the working rules for exactness, but homological algebra also needs functors that preserve or detect those rules. Projectives, injectives, and exact functors provide that machinery, and they prepare the ground for derived functors as the next level of approximation.
# 11. Projectives, Injectives, and Exact Functors
This chapter turns the exactness formalism of abelian categories into the language used in homological algebra. After Chapters 9 and 10 treated kernels, cokernels, short exact sequences, and diagram lemmas as structural features of abelian categories, here we ask which functors preserve those structures and which objects make Hom-functors behave as exact functors. The resulting notions of projective and injective objects are the categorical replacements for free modules and divisible groups, and they prepare the ground for resolutions and derived functors.
## Left Exact, Right Exact, and Exact Functors
A functor between abelian categories rarely preserves every short exact sequence. The first problem is to isolate the part of exactness that survives under the basic functors encountered in algebra: Hom, tensor product, restriction, extension of scalars, and global sections.
[definition: Left Exact Functor]
Let $\mathcal A$ and $\mathcal B$ be abelian categories. An additive functor $F: \mathcal A \to \mathcal B$ is left exact if, for every short exact sequence
\begin{align*}
0 \to A' \xrightarrow{u} A \xrightarrow{v} A'' \to 0
\end{align*}
in $\mathcal A$, the sequence
\begin{align*}
0 \to F(A') \xrightarrow{F(u)} F(A) \xrightarrow{F(v)} F(A'')
\end{align*}
is exact in $\mathcal B$.
[/definition]
Left exactness says that the functor preserves kernels and the injective part of a short exact sequence. It does not require $F(v)$ to be an epimorphism.
[definition: Right Exact Functor]
Let $\mathcal A$ and $\mathcal B$ be abelian categories. An additive functor $F: \mathcal A \to \mathcal B$ is right exact if, for every short exact sequence
\begin{align*}
0 \to A' \xrightarrow{u} A \xrightarrow{v} A'' \to 0
\end{align*}
in $\mathcal A$, the sequence
\begin{align*}
F(A') \xrightarrow{F(u)} F(A) \xrightarrow{F(v)} F(A'') \to 0
\end{align*}
is exact in $\mathcal B$.
[/definition]
Right exactness says that the functor preserves cokernels and the surjective part of a short exact sequence. It does not require $F(u)$ to be a monomorphism.
[definition: Exact Functor]
Let $\mathcal A$ and $\mathcal B$ be abelian categories. An additive functor $F: \mathcal A \to \mathcal B$ is exact if, for every short exact sequence
\begin{align*}
0 \to A' \xrightarrow{u} A \xrightarrow{v} A'' \to 0
\end{align*}
in $\mathcal A$, the sequence
\begin{align*}
0 \to F(A') \xrightarrow{F(u)} F(A) \xrightarrow{F(v)} F(A'') \to 0
\end{align*}
is exact in $\mathcal B$.
[/definition]
Equivalently, an additive functor between abelian categories is exact precisely when it is both left exact and right exact.
Hom-functors are the basic test case because they convert objects into abelian groups of morphisms, but exactness is not automatic. A short exact sequence asks for both kernel and cokernel information to be preserved; Hom in one variable usually preserves only the kernel side, and the theorem identifies exactly which half survives without extra hypotheses.
[quotetheorem:4198]
[citeproof:4198]
This theorem explains why Hom-functors are the first source of derived functors: they preserve only half of a short exact sequence in general. The abelian-category hypotheses matter because the proof uses kernels and cokernels, not just the underlying sets of morphisms. The theorem also does not say that Hom is exact: the missing surjectivity condition is exactly where extension and lifting problems enter. The next example shows the failure in abelian groups, and the rest of the chapter isolates the objects for which this failure disappears.
[example: A Hom Functor Need Not Be Exact]
In $\mathbb Z\operatorname{-Mod}$, apply $\operatorname{Hom}_{\mathbb Z}(\mathbb Z/2\mathbb Z,-)$ to the short exact sequence
\begin{align*}
0 \to \mathbb Z \xrightarrow{\times 2} \mathbb Z \xrightarrow{\pi} \mathbb Z/2\mathbb Z \to 0,
\end{align*}
where $\pi(n)=\overline n$. We compute the three Hom-groups. If $f:\mathbb Z/2\mathbb Z \to \mathbb Z$ is a group homomorphism and $f(\overline 1)=a$, then
\begin{align*}
0=f(\overline 0)=f(2\overline 1)=2f(\overline 1)=2a.
\end{align*}
Since $\mathbb Z$ has no nonzero element killed by $2$, this forces $a=0$, so $f=0$. Hence
\begin{align*}
\operatorname{Hom}_{\mathbb Z}(\mathbb Z/2\mathbb Z,\mathbb Z)=0.
\end{align*}
The same calculation applies to both copies of $\mathbb Z$ in the sequence.
For the last term, a homomorphism $g:\mathbb Z/2\mathbb Z \to \mathbb Z/2\mathbb Z$ is determined by $g(\overline 1)$. The relation $2\overline 1=\overline 0$ imposes no further restriction, since
\begin{align*}
2g(\overline 1)=\overline 0
\end{align*}
for both possible values $g(\overline 1)=\overline 0$ and $g(\overline 1)=\overline 1$. Thus there are exactly two such homomorphisms, the zero map and the identity map, and
\begin{align*}
\operatorname{Hom}_{\mathbb Z}(\mathbb Z/2\mathbb Z,\mathbb Z/2\mathbb Z)\cong \mathbb Z/2\mathbb Z.
\end{align*}
Therefore the image of the induced map
\begin{align*}
\operatorname{Hom}_{\mathbb Z}(\mathbb Z/2\mathbb Z,\mathbb Z)
\to
\operatorname{Hom}_{\mathbb Z}(\mathbb Z/2\mathbb Z,\mathbb Z/2\mathbb Z)
\end{align*}
is $0$, while the identity map of $\mathbb Z/2\mathbb Z$ is a nonzero element of the target. The induced map is not surjective, so this Hom-functor, although left exact, does not carry this short exact sequence to a short exact sequence.
[/example]
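Collecting the three computations of this example in one display, with $H=\operatorname{Hom}_{\mathbb Z}(\mathbb Z/2\mathbb Z,-)$ as a shorthand introduced only here, the functor sends the short exact sequence to
\begin{align*}
0\longrightarrow \underbrace{H(\mathbb Z)}_{=\,0}\xrightarrow{H(\times 2)}\underbrace{H(\mathbb Z)}_{=\,0}\xrightarrow{H(\pi)}\underbrace{H(\mathbb Z/2\mathbb Z)}_{\cong\,\mathbb Z/2\mathbb Z}.
\end{align*}
This sequence is exact at the first two Hom-groups, as left exactness predicts, but
\begin{align*}
\operatorname{coker}H(\pi)\cong\mathbb Z/2\mathbb Z\neq 0,
\end{align*}
generated by the class of the identity map. The size of this cokernel is exactly the information that is lost by applying the functor to the sequence.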
Tensor product is the dual source of one-sided exactness for modules. Since $M\otimes_R-$ is built as a left adjoint, it is expected to preserve cokernels and therefore right exact sequences, but there is no corresponding reason for it to preserve kernels or monomorphisms. The theorem records the exact part that tensor product preserves before any flatness assumption is imposed.
The obstruction is the same one that later motivates flat modules: tensoring may destroy the left end of a short exact sequence. Before imposing flatness, we need the baseline guarantee that tensoring still preserves the cokernel part of an exact sequence, which is precisely right exactness.
[quotetheorem:4199]
[citeproof:4199]
The hypothesis that the original sequence is right exact is the part tensor product can see: it is a statement about a cokernel. Tensor product does not generally preserve monomorphisms, so an exact sequence beginning with $0 \to N' \to N$ may lose exactness at the left after tensoring. The next example gives the standard failure over $\mathbb Z$, and the condition that rules out this failure is flatness of $M$. Projectivity will imply flatness for modules, but the two notions play different roles in homological algebra.
[example: Tensor Product Need Not Preserve Monomorphisms]
Tensor the monomorphism $\mathbb Z \xrightarrow{\times 2} \mathbb Z$ with $\mathbb Z/2\mathbb Z$ over $\mathbb Z$. Using the standard identification
\begin{align*}
(\mathbb Z/2\mathbb Z)\otimes_{\mathbb Z}\mathbb Z \cong \mathbb Z/2\mathbb Z,
\qquad
\overline a \otimes n \mapsto n\overline a,
\end{align*}
the tensor product of $\times 2$ sends a pure tensor by
\begin{align*}
\overline a \otimes n
&\mapsto \overline a \otimes 2n \\
&\mapsto 2n\overline a.
\end{align*}
In particular, under the identification above, the induced map is
\begin{align*}
\mathbb Z/2\mathbb Z &\to \mathbb Z/2\mathbb Z,\\
\overline a &\mapsto 2\overline a.
\end{align*}
For the two elements of $\mathbb Z/2\mathbb Z$,
\begin{align*}
2\overline 0=\overline 0,
\qquad
2\overline 1=\overline 2=\overline 0,
\end{align*}
so this induced map is the zero map. Since $\overline 1 \neq \overline 0$ in $\mathbb Z/2\mathbb Z$ but maps to $\overline 0$, the induced map is not injective. Thus tensoring with $\mathbb Z/2\mathbb Z$ does not preserve the original monomorphism $\mathbb Z \to \mathbb Z$.
[/example]
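The same computation can be read on the whole short exact sequence $0\to\mathbb Z\xrightarrow{\times 2}\mathbb Z\xrightarrow{\pi}\mathbb Z/2\mathbb Z\to 0$. As a sketch, assuming the standard identification $(\mathbb Z/2\mathbb Z)\otimes_{\mathbb Z}(\mathbb Z/2\mathbb Z)\cong\mathbb Z/2\mathbb Z$ given by $\overline a\otimes\overline b\mapsto\overline{ab}$, tensoring with $\mathbb Z/2\mathbb Z$ produces
\begin{align*}
\mathbb Z/2\mathbb Z\xrightarrow{\ 0\ }\mathbb Z/2\mathbb Z\xrightarrow{\ \cong\ }\mathbb Z/2\mathbb Z\longrightarrow 0.
\end{align*}
This sequence is exact at the middle and right-hand terms: the last map is surjective and its kernel is $0$, which is the image of the zero map. It cannot be extended by $0\to$ on the left, since the first map is not injective. So right exactness survives, and only the left end of the original sequence is lost.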
## Projective Objects
When a morphism $A \to B$ is an epimorphism, maps into $B$ need not lift to maps into $A$. The projective objects are those domains from which every map lifts across every epimorphism, making them the objects that remove the right-exactness defect of $\operatorname{Hom}(P,-)$.
[definition: Projective Object]
Let $\mathcal A$ be an abelian category. An object $P \in \mathcal A$ is projective if, for every epimorphism $q:A \to B$ and every morphism $f:P \to B$, there exists a morphism $\tilde f:P \to A$ such that $q\tilde f=f$.
[/definition]
The definition is a lifting property. It says that $P$ sees epimorphisms as surjections on Hom-sets: for every epimorphism $q:A\to B$, the map $\operatorname{Hom}(P,A)\to\operatorname{Hom}(P,B)$, $g\mapsto qg$, is surjective. This reformulates projectivity as an exactness condition on a representable functor: since $\operatorname{Hom}(P,-)$ is always left exact, the only part of exactness that can fail is whether maps lift along epimorphisms. The next result turns that lifting condition into the statement that $\operatorname{Hom}(P,-)$ preserves short exact sequences.
[quotetheorem:4200]
[citeproof:4200]
This characterisation is often the working definition in computations. It says that projectivity is not an extra structure on $P$, but a test for whether maps out of $P$ turn quotients into surjections of Hom-groups. If $P$ is not projective, surjectivity can fail: for example, the identity map of $\mathbb Z/2\mathbb Z$ does not lift along the epimorphism $\mathbb Z \to \mathbb Z/2\mathbb Z$, because the only homomorphism $\mathbb Z/2\mathbb Z \to \mathbb Z$ is zero. In module categories, projective modules are therefore exactly the modules $P$ for which $\operatorname{Hom}_R(P,-)$ is exact, so that its higher derived functors vanish.
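For contrast, here is the simplest positive case, sketched for $P=\mathbb Z$ in $\mathbb Z\operatorname{-Mod}$. Evaluation at $1$ gives a natural isomorphism
\begin{align*}
\operatorname{Hom}_{\mathbb Z}(\mathbb Z,M)\xrightarrow{\ \cong\ }M,
\qquad
f\mapsto f(1),
\end{align*}
so applying $\operatorname{Hom}_{\mathbb Z}(\mathbb Z,-)$ to $0\to\mathbb Z\xrightarrow{\times 2}\mathbb Z\xrightarrow{\pi}\mathbb Z/2\mathbb Z\to 0$ returns, up to this identification, the same short exact sequence
\begin{align*}
0\longrightarrow\mathbb Z\xrightarrow{\times 2}\mathbb Z\xrightarrow{\pi}\mathbb Z/2\mathbb Z\longrightarrow 0.
\end{align*}
In terms of lifting, the map $f:\mathbb Z\to\mathbb Z/2\mathbb Z$ with $f(1)=\overline 1$ lifts to $\tilde f:\mathbb Z\to\mathbb Z$ with $\tilde f(1)=1$, since
\begin{align*}
\pi(\tilde f(1))=\pi(1)=\overline 1=f(1).
\end{align*}
The choice $\tilde f(1)=3$ works equally well, so lifts exist but are not unique.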
To build resolutions, it is not enough to recognize projective objects abstractly; one needs many concrete examples that map onto arbitrary modules. Free modules are the most accessible source, because a morphism out of a free module is specified by choosing images of basis elements, and those choices can be lifted one basis element at a time across any epimorphism.
[quotetheorem:4201]
[citeproof:4201]
Freeness is stronger than projectivity because a map out of a free module is determined independently on basis elements. The theorem does not imply that every projective module is free; over many rings there are projective modules which are direct summands of free modules but have no basis. What matters for resolutions is that free modules provide an abundant supply of projective objects from which arbitrary modules can be covered.
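A small example of a projective module that is not free, given as a sketch that uses the standard fact that a direct summand of a free module is projective. Let $R=\mathbb Z/6\mathbb Z$ and $e=\overline 3$. Then
\begin{align*}
e^2=\overline 9=\overline 3=e,
\qquad
(1-e)^2=\overline 4^{\,2}=\overline{16}=\overline 4=1-e,
\end{align*}
so $e$ and $1-e$ are idempotents and
\begin{align*}
R=Re\oplus R(1-e),
\qquad
Re=\{\overline 0,\overline 3\}\cong\mathbb Z/2\mathbb Z,
\qquad
R(1-e)=\{\overline 0,\overline 2,\overline 4\}\cong\mathbb Z/3\mathbb Z.
\end{align*}
The module $P=Re$ is a direct summand of the free module $R$, hence projective: a map $f:P\to B$ extends to $R$ by $r\mapsto f(re)$, this extension lifts across any epimorphism $A\to B$ because $R$ is free, and restricting the lift to $P$ gives a lift of $f$. However, $P$ is not free, since a nonzero free $R$-module on a finite basis with $k$ elements has $6^k$ elements, while $P$ has only $2$.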
[example: A Projective Resolution Begins With A Free Presentation]
Let $M$ be a left $R$-module. Take the free left $R$-module
\begin{align*}
F_0=\bigoplus_{m\in M}Re_m
\end{align*}
and define the $R$-linear map $\varepsilon_0:F_0\to M$ by $\varepsilon_0(e_m)=m$. For every $m\in M$ we have $m=\varepsilon_0(e_m)$, so $\varepsilon_0$ is surjective. Let $K_1=\ker(\varepsilon_0)$, let $j_1:K_1\hookrightarrow F_0$ be the inclusion, choose the free module
\begin{align*}
F_1=\bigoplus_{k\in K_1}Re_k,
\end{align*}
and define $p_1:F_1\to K_1$ by $p_1(e_k)=k$. Again $p_1$ is surjective, and if $d_1=j_1p_1:F_1\to F_0$, then
\begin{align*}
\operatorname{im}(d_1)
&=j_1(\operatorname{im}(p_1))\\
&=j_1(K_1)\\
&=\ker(\varepsilon_0).
\end{align*}
Repeat the same step recursively. Having defined $d_n:F_n\to F_{n-1}$, set
\begin{align*}
K_{n+1}=\ker(d_n),
\end{align*}
choose a free module $F_{n+1}$ with a surjection $p_{n+1}:F_{n+1}\to K_{n+1}$, let $j_{n+1}:K_{n+1}\hookrightarrow F_n$ be the inclusion, and define
\begin{align*}
d_{n+1}=j_{n+1}p_{n+1}:F_{n+1}\to F_n.
\end{align*}
Then
\begin{align*}
\operatorname{im}(d_{n+1})
&=j_{n+1}(\operatorname{im}(p_{n+1}))\\
&=j_{n+1}(K_{n+1})\\
&=\ker(d_n),
\end{align*}
so the sequence
\begin{align*}
\cdots \to F_2 \xrightarrow{d_2} F_1 \xrightarrow{d_1} F_0 \xrightarrow{\varepsilon_0} M \to 0
\end{align*}
is exact at every term. Each $F_i$ is free, hence projective by *Free Modules Are Projective*, so this construction gives the standard projective resolution obtained from successive free presentations.
[/example]
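The construction only needs some surjection from a free module at each step, so smaller free modules than $\bigoplus_{m\in M}Re_m$ may be used. As an illustrative sketch with choices made here, take $R=\mathbb Z/4\mathbb Z$ and $M=\mathbb Z/2\mathbb Z$, with $F_0=R$ and $\varepsilon_0:R\to M$ the reduction map. Then
\begin{align*}
K_1=\ker(\varepsilon_0)=\{\overline 0,\overline 2\},
\end{align*}
and the map $p_1:R\to K_1$ with $p_1(\overline 1)=\overline 2$ is surjective, so $d_1=j_1p_1$ is multiplication by $2$ on $R$. Its kernel is again
\begin{align*}
K_2=\ker(\times 2)=\{\overline 0,\overline 2\}=\operatorname{im}(\times 2),
\end{align*}
so the same choice can be repeated in every degree, giving the exact sequence
\begin{align*}
\cdots\xrightarrow{\times 2}\mathbb Z/4\mathbb Z\xrightarrow{\times 2}\mathbb Z/4\mathbb Z\xrightarrow{\times 2}\mathbb Z/4\mathbb Z\xrightarrow{\varepsilon_0}\mathbb Z/2\mathbb Z\longrightarrow 0.
\end{align*}
With these choices the pattern never terminates, which shows that a projective resolution may need infinitely many nonzero terms.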
## Injective Objects
The dual problem asks when maps defined on a subobject can be extended to a larger object. Injective objects are codomains with this extension property, and they repair the right-exactness defect of the contravariant Hom-functor.
[definition: Injective Object]
Let $\mathcal A$ be an abelian category. An object $I \in \mathcal A$ is injective if, for every monomorphism $j:A \to B$ and every morphism $f:A \to I$, there exists a morphism $\tilde f:B \to I$ such that $\tilde f j=f$.
[/definition]
The definition is the categorical dual of projectivity. Instead of lifting maps across quotients, it extends maps along subobjects. The useful test is not merely whether extensions exist one map at a time, but whether this extension property is exactly what removes the usual failure of exactness for contravariant Hom. Given a short exact sequence, applying $\operatorname{Hom}_{\mathcal A}(-,I)$ always reverses arrows and is left exact; the obstruction is whether morphisms into $I$ from the subobject extend far enough to make the last map surjective. Thus injectivity should be characterised by full exactness of this Hom-functor, not by a separate ad hoc lifting check.
[quotetheorem:4202]
[citeproof:4202]
Injective objects are often harder to recognise directly than projective objects because the definition quantifies over all subobjects of all objects. The theorem says that injectivity is exactly the condition needed to make the contravariant Hom-functor fully exact, but it does not say that every codomain has this extension property. In $\mathbb Z\operatorname{-Mod}$, the inclusion $\mathbb Z \to \mathbb Q$ and the identity map $\mathbb Z \to \mathbb Z$ show that $\mathbb Z$ is not injective: an extension would be a homomorphism $\Phi:\mathbb Q \to \mathbb Z$ with $\Phi(1)=1$, and then $2\Phi(\tfrac12)=\Phi(1)=1$, which has no solution in $\mathbb Z$. For modules, Baer's criterion gives a practical test that reduces all extensions from submodules to extensions from ideals.
[quotetheorem:4203]
[citeproof:4203]
This theorem is stated here as a recognition principle for module categories. Its proof uses a maximal-extension argument with Zorn's lemma and is usually treated in a homological algebra course.
[example: Divisible Abelian Groups As Injective Objects]
For $R=\mathbb Z$, the left ideals are exactly the subgroups $n\mathbb Z$ with $n\ge 0$. By *Baer Criterion*, an abelian group $I$ is injective precisely when every group homomorphism $\varphi:n\mathbb Z\to I$ extends to a group homomorphism $\Phi:\mathbb Z\to I$.
We show that this extension condition is equivalent to divisibility. Suppose first that the extension condition holds. Fix $x\in I$ and $n\ge 1$. Define $\varphi:n\mathbb Z\to I$ by
\begin{align*}
\varphi(nk)=kx.
\end{align*}
This is well-defined because $nk=nk'$ implies $k=k'$ in $\mathbb Z$. Let $\Phi:\mathbb Z\to I$ extend $\varphi$, and put $y=\Phi(1)$. Then
\begin{align*}
ny
&=n\Phi(1)\\
&=\Phi(n)\\
&=\varphi(n)\\
&=x.
\end{align*}
Thus every $x\in I$ is divisible by every $n\ge 1$.
Conversely, suppose $I$ is divisible. Let $\varphi:n\mathbb Z\to I$ be a homomorphism. If $n=0$, then $\varphi=0$ and the zero map $\mathbb Z\to I$ extends it, so assume $n\ge 1$ and set $x=\varphi(n)$. Choose $y\in I$ with $ny=x$. Define $\Phi:\mathbb Z\to I$ by
\begin{align*}
\Phi(k)=ky.
\end{align*}
For every $k\in\mathbb Z$,
\begin{align*}
\Phi(nk)
&=nky\\
&=k(ny)\\
&=kx\\
&=k\varphi(n)\\
&=\varphi(nk),
\end{align*}
so $\Phi$ extends $\varphi$.
The group $\mathbb Q$ is divisible because, for $x=\frac{a}{b}\in\mathbb Q$ with $b\neq 0$ and $n\ge 1$,
\begin{align*}
n\left(\frac{a}{bn}\right)=\frac{a}{b}=x.
\end{align*}
The quotient $\mathbb Q/\mathbb Z$ is divisible because, for $q+\mathbb Z\in \mathbb Q/\mathbb Z$,
\begin{align*}
n\left(\frac{q}{n}+\mathbb Z\right)=q+\mathbb Z.
\end{align*}
Hence $\mathbb Q$ and $\mathbb Q/\mathbb Z$ are injective abelian groups. By contrast, $\mathbb Z$ is not divisible, since there is no $y\in\mathbb Z$ with $2y=1$, so $\mathbb Z$ is not injective.
[/example]
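A worked instance of the extension step, with a homomorphism chosen here purely for illustration: let $I=\mathbb Q/\mathbb Z$, $n=2$, and define $\varphi:2\mathbb Z\to\mathbb Q/\mathbb Z$ by $\varphi(2k)=\frac{k}{3}+\mathbb Z$, so that $x=\varphi(2)=\frac13+\mathbb Z$. Divisibility provides $y=\frac16+\mathbb Z$ with
\begin{align*}
2y=\frac{2}{6}+\mathbb Z=\frac13+\mathbb Z=x,
\end{align*}
and the extension $\Phi(k)=\frac{k}{6}+\mathbb Z$ satisfies
\begin{align*}
\Phi(2k)=\frac{2k}{6}+\mathbb Z=\frac{k}{3}+\mathbb Z=\varphi(2k).
\end{align*}
The choice of $y$ is not unique: $y'=\frac23+\mathbb Z$ also works, because
\begin{align*}
2y'=\frac43+\mathbb Z=\frac13+\mathbb Z=x.
\end{align*}
The two extensions differ by the element $y'-y=\frac12+\mathbb Z$ of order $2$, which illustrates that injectivity guarantees existence of extensions, not uniqueness.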
## Enough Projectives And Enough Injectives
Projective and injective objects are useful for resolving arbitrary objects only when the category contains enough of them. The next question is therefore not whether special objects exist, but whether every object can be reached from a projective object or embedded into an injective one.
[definition: Enough Projectives]
An abelian category $\mathcal A$ has enough projectives if, for every object $A \in \mathcal A$, there exists a projective object $P \in \mathcal A$ and an epimorphism $P \to A$.
[/definition]
This condition says that projective objects are numerous enough to approximate all objects from above by epimorphisms.
The dual obstruction appears when a construction must extend maps along monomorphisms rather than lift them along epimorphisms. To build injective resolutions, every object must first sit inside an injective object; without such embeddings, right derived functors cannot be built uniformly across the category.
[definition: Enough Injectives]
An abelian category $\mathcal A$ has enough injectives if, for every object $A \in \mathcal A$, there exists an injective object $I \in \mathcal A$ and a monomorphism $A \to I$.
[/definition]
These conditions make it possible to replace arbitrary objects by exact chains of projective or injective objects. Such replacements are the input for derived functors.
The abstract definitions would be of limited use without large families of examples. For module categories, the first existence question is whether every module can be reached from a projective one, and the theorem answers this by using free modules as universal projective sources.
[quotetheorem:4204]
[citeproof:4204]
Module categories are special because free modules can be built directly from underlying sets, so every module is the target of an epimorphism from a projective module; such an epimorphism need not be a projective cover in the minimal sense. A general abelian category need not have enough projectives; for example, many sheaf categories have very few projective objects, which is one reason sheaf cohomology is usually built from injective or acyclic resolutions instead. Having enough projectives is exactly what makes the iterative construction of projective resolutions possible for every module.
[quotetheorem:4205]
[citeproof:4205]
This recognition-and-existence result belongs to the standard module-theoretic part of homological algebra, after Baer's criterion has been established. A proof embeds a module into a product of injective cogenerators, or uses Baer's criterion together with a construction based on Zorn's lemma. The point for these notes is that module categories support both projective and injective resolutions, while geometric categories such as sheaves often rely especially on injective resolutions for cohomology.
[example: Injective Envelopes As Minimal Injective Embeddings]
We verify in $\mathbb Z\operatorname{-Mod}$ that the inclusion $\mathbb Z\hookrightarrow \mathbb Q$ is the injective envelope of $\mathbb Z$. First, $\mathbb Q$ is injective by *Baer Criterion*, since $\mathbb Q$ is divisible: for $q\in \mathbb Q$ and $n\ge 1$,
\begin{align*}
n\left(\frac{q}{n}\right)=q.
\end{align*}
Next, $\mathbb Z$ is essential in $\mathbb Q$. Let $0\neq q\in \mathbb Q$, and write $q=\frac{a}{b}$ with $a,b\in\mathbb Z$, $b\neq 0$, and $a\neq 0$. Then
\begin{align*}
bq
&=b\left(\frac{a}{b}\right)\\
&=a.
\end{align*}
Since $a\neq 0$ and $a\in\mathbb Z$, the cyclic subgroup generated by $q$ meets $\mathbb Z$ nontrivially. Hence every nonzero subgroup of $\mathbb Q$ has nonzero intersection with $\mathbb Z$.
Finally, $\mathbb Q$ is minimal among injective abelian groups containing $\mathbb Z$. Suppose $E$ is an injective subgroup of $\mathbb Q$ with $\mathbb Z\subseteq E\subseteq\mathbb Q$. Again by *Baer Criterion*, $E$ is divisible. For each $n\ge 1$, divisibility applied to $1\in E$ gives some $y\in E$ such that
\begin{align*}
ny=1.
\end{align*}
Inside $\mathbb Q$, multiplying both sides of this equation by $\frac{1}{n}$ gives
\begin{align*}
y=\frac{1}{n}.
\end{align*}
Thus $\frac{1}{n}\in E$ for every $n\ge 1$, and therefore, for every $a\in\mathbb Z$,
\begin{align*}
\frac{a}{n}=a\left(\frac{1}{n}\right)\in E.
\end{align*}
Every rational number has the form $\frac{a}{n}$ with $a\in\mathbb Z$ and $n\ge 1$, so $\mathbb Q\subseteq E$. Since also $E\subseteq\mathbb Q$, we get $E=\mathbb Q$. Thus $\mathbb Q$ is exactly the injective module obtained by adjoining all divisions missing from $\mathbb Z$, with no smaller injective subgroup still containing $\mathbb Z$.
[/example]
The preceding existence theorems now turn into a construction: replace an object by a chain of projective or injective objects without changing the object up to exactness at the end. These replacements are not unique, but homological algebra proves that their derived invariants are independent of the chosen resolution.
[definition: Projective Resolution]
Let $\mathcal A$ be an abelian category with enough projectives. A projective resolution of an object $A \in \mathcal A$ is an exact sequence
\begin{align*}
\cdots \to P_2 \to P_1 \to P_0 \to A \to 0
\end{align*}
where each $P_i$ is projective.
[/definition]
Projective resolutions run to the left because they resolve $A$ by objects mapping onto it and then onto successive kernels.
Left exact functors require the dual kind of replacement. To apply such a functor without losing control of cokernels, the object should first embed into an object with an extension property, and then the successive cokernels should be resolved in the same way. This produces the right-running resolution used for right derived functors.
[definition: Injective Resolution]
Let $\mathcal A$ be an abelian category with enough injectives. An injective resolution of an object $A \in \mathcal A$ is an exact sequence
\begin{align*}
0 \to A \to I^0 \to I^1 \to I^2 \to \cdots
\end{align*}
where each $I^i$ is injective.
[/definition]
Projective resolutions are adapted to right exact functors: apply the functor to a projective resolution and measure the homology that remains. Injective resolutions are adapted to left exact functors: apply the functor to an injective resolution and measure the cohomology that remains. This is the bridge from exactness properties in abelian categories to the derived functors introduced in the final chapter.
Once projective and injective resolutions are available, the remaining failure of exactness can be measured systematically. That is the point at which the chapter sequence closes and derived functors enter: they convert the language of exactness into a computational theory of homology and cohomology.
# 12. Derived Functors Preview
Derived functors are the point at which the exactness formalism of abelian categories becomes computational. Chapters 9 through 11 explained kernels, cokernels, exact sequences, and projective and injective objects, while Chapters 2, 4, 5, and 6 explained how adjoints interact with limits and colimits. This chapter asks what information is lost when an additive functor fails to be exact, and how that failure can be recorded systematically by replacing objects with better ones before applying the functor.
The guiding principle is that non-exact functors often become computable on projective or injective resolutions. The resulting homology groups are not defects in the theory; they are invariants. They measure extensions, torsion phenomena, and cohomology, and they are the entry point to homological algebra.
## Resolutions and the Idea of Deriving Non-Exact Functors
What should we do with an additive functor that preserves only part of an exact sequence? A left exact functor remembers kernels but may lose cokernel information; a right exact functor remembers cokernels but may lose kernel information. Derived functors repair this by applying the functor not to the object itself, but to a resolution whose terms are adapted to the functor.
[definition: Chain Complex]
A chain complex in an abelian category $\mathcal A$ is a sequence of objects and morphisms
\begin{align*}
\cdots \xrightarrow{d_{3}} C_2 \xrightarrow{d_{2}} C_1 \xrightarrow{d_{1}} C_0 \xrightarrow{d_{0}} C_{-1} \xrightarrow{d_{-1}} \cdots
\end{align*}
such that $d_i d_{i+1} = 0$ for every $i$.
[/definition]
The condition $d_i d_{i+1}=0$ says that boundaries lie inside cycles. To use complexes as replacements for objects, we need an invariant that detects the precise degrees where exactness fails. That invariant is formed by comparing cycles with boundaries inside the abelian category itself.
[definition: Homology Object]
Let $C_\bullet$ be a chain complex in an abelian category $\mathcal A$. Its $i$th homology object is
\begin{align*}
H_i(C_\bullet) := \ker(d_i) / \operatorname{im}(d_{i+1}).
\end{align*}
[/definition]
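The definition can be made concrete under a simplifying assumption: take the abelian category to be finite-dimensional vector spaces over $\mathbb Q$, so each $d_i$ is a matrix and $\dim H_i = \dim\ker(d_i) - \operatorname{rank}(d_{i+1})$. The following is a minimal sketch; the helper names `rank`, `matmul`, and `homology_dims`, and the triangle example, are illustrative choices rather than part of the course text.

```python
from fractions import Fraction

def rank(matrix):
    """Rank of a rational matrix (list of rows) by Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    rk = 0
    for col in range(n_cols):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            factor = rows[r][col] / rows[rk][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rk])]
        rk += 1
    return rk

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def homology_dims(dims, diffs):
    """dims[i] = dim C_i for i >= 0; diffs[i] = matrix of d_i : C_i -> C_{i-1}.

    Differentials that are not listed are taken to be zero.
    """
    # Check the complex condition d_i d_{i+1} = 0 wherever both maps are given.
    for i in diffs:
        if i + 1 in diffs:
            product = matmul(diffs[i], diffs[i + 1])
            assert all(x == 0 for row in product for x in row)
    rk = lambda i: rank(diffs[i]) if i in diffs else 0
    # dim ker(d_i) = dim C_i - rank(d_i), then subtract rank(d_{i+1}).
    return [dims[i] - rk(i) - rk(i + 1) for i in range(len(dims))]

# Boundary maps of a triangle with vertices v0, v1, v2 and edges e01, e12, e02.
d1 = [[-1, 0, -1],
      [1, -1, 0],
      [0, 1, 1]]          # columns: e01 -> v1 - v0, e12 -> v2 - v1, e02 -> v2 - v0
d2 = [[1], [1], [-1]]     # the face maps to e01 + e12 - e02

hollow = homology_dims([3, 3], {1: d1})            # triangle without its face
filled = homology_dims([3, 3, 1], {1: d1, 2: d2})  # triangle with its face
```

In this sketch the hollow triangle has one-dimensional $H_1$, reflecting the cycle $e_{01}+e_{12}-e_{02}$ that is not a boundary, while adding the face makes that cycle a boundary and kills $H_1$.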
Derived functors built from injective resolutions naturally use upper indices and arrows running to the right. To treat those constructions without constantly reversing notation, we use the cochain version of a complex. It records the same obstruction to exactness, but in the indexing convention used for right derived functors.
[definition: Cochain Complex]
A cochain complex in an abelian category $\mathcal A$ is a sequence
\begin{align*}
\cdots \xrightarrow{d^{-1}} C^0 \xrightarrow{d^0} C^1 \xrightarrow{d^1} C^2 \xrightarrow{d^2} \cdots
\end{align*}
such that $d^{i+1}d^i = 0$ for every $i$.
[/definition]
Once the arrows run to the right, we need the same exactness detector in upper-index notation. The relevant object should record what survives the outgoing differential after quotienting by what already came from the previous degree. This gives the cohomological version used for right derived functors.
[definition: Cohomology Object]
Let $C^\bullet$ be a cochain complex in an abelian category $\mathcal A$. Its $i$th cohomology object is
\begin{align*}
H^i(C^\bullet) := \ker(d^i) / \operatorname{im}(d^{i-1}).
\end{align*}
[/definition]
To derive a right exact functor, the input object must be replaced before the functor is applied. The replacement should be built from projective objects so that maps out of it lift well, and it must be exact so that any homology appearing afterward comes from the functor's failure to preserve kernels rather than from the replacement process.
This motivates isolating the replacement itself as a formal object. A projective resolution packages the chosen projective terms together with the exact sequence connecting them back to the original object, so that derived functors can be defined by applying the functor to this controlled model.
[definition: Projective Resolution]
Let $\mathcal A$ be an abelian category with enough projectives. A projective resolution of an object $A \in \mathcal A$ is an exact sequence
\begin{align*}
\cdots \to P_2 \to P_1 \to P_0 \to A \to 0
\end{align*}
where each $P_i$ is projective.
[/definition]
Projective resolutions are used to derive right exact functors. The object $A$ is replaced by projective objects $P_i$, the functor is applied termwise, and the remaining homology records the part of exactness that the functor did not preserve.
[definition: Injective Resolution]
Let $\mathcal A$ be an abelian category with enough injectives. An injective resolution of an object $A \in \mathcal A$ is an exact sequence
\begin{align*}
0 \to A \to I^0 \to I^1 \to I^2 \to \cdots
\end{align*}
where each $I^i$ is injective.
[/definition]
Injective resolutions are used to derive left exact functors. They turn an object into a cochain complex of injectives, and cohomology after applying the functor measures the lost right-exact information.
The definitions of projective and injective resolutions are useful only if such resolutions actually exist for the objects under study. The next existence result identifies the exact hypotheses that let the inductive construction begin and continue at every stage.
[quotetheorem:4206]
[citeproof:4206]
The hypotheses are not decorative. Without enough projectives, the first epimorphism $P_0 \to A$ may not exist, so the inductive construction cannot even start; dually, without enough injectives, there may be no monomorphism $A \to I^0$ with $I^0$ injective. The theorem does not say that resolutions are unique, finite, or computationally small. Its role is existential: it guarantees that the machinery below has inputs, while examples such as cyclic abelian groups show that useful resolutions can sometimes be very short.
[example: A Projective Resolution of a Cyclic Abelian Group]
For $n \ge 1$, let $\pi:\mathbb Z \to \mathbb Z/n\mathbb Z$ be the quotient map $\pi(a)=\bar a$. We show that
\begin{align*}
0 \to \mathbb Z \xrightarrow{\cdot n} \mathbb Z \xrightarrow{\pi} \mathbb Z/n\mathbb Z \to 0
\end{align*}
is a projective resolution of $\mathbb Z/n\mathbb Z$ over $\mathbb Z$.
Both copies of $\mathbb Z$ are free $\mathbb Z$-modules, hence projective. The first map is injective because if $nx=0$ in $\mathbb Z$ and $n\ge 1$, then $x=0$. The image of multiplication by $n$ is
\begin{align*}
\operatorname{im}(\cdot n)
&= \{nx : x\in \mathbb Z\} \\
&= n\mathbb Z.
\end{align*}
The kernel of the quotient map is
\begin{align*}
\ker(\pi)
&= \{a\in \mathbb Z : \bar a = \bar 0 \text{ in } \mathbb Z/n\mathbb Z\} \\
&= \{a\in \mathbb Z : a \in n\mathbb Z\} \\
&= n\mathbb Z.
\end{align*}
Thus $\operatorname{im}(\cdot n)=\ker(\pi)$. Finally, $\pi$ is surjective because every class in $\mathbb Z/n\mathbb Z$ has the form $\bar a=\pi(a)$ for some $a\in \mathbb Z$. Therefore the sequence is exact, so it is a projective resolution of $\mathbb Z/n\mathbb Z$.
[/example]
This short resolution is the basic computational device for cyclic groups. It will be used below to compute both $\operatorname{Tor}$ and $\operatorname{Ext}$.
## Right and Left Derived Functors
How can the construction of a resolution give a functor rather than a choice-dependent object? A naive attempt would choose an arbitrary presentation for each object and apply the functor to that presentation, but different choices can give visibly different complexes. The key fact is that projective and injective resolutions are unique up to chain homotopy equivalence: any two resolutions of the same object are connected by a comparison map that is itself unique up to chain homotopy. Additive functors preserve chain homotopies, so the homology objects obtained after applying the original functor are independent of the chosen resolution up to canonical isomorphism.
The practical workflow is always the same: choose a projective or injective resolution, apply the functor termwise, identify the resulting differentials, and compute homology or cohomology. Most computations in elementary homological algebra are refinements of this four-step procedure.
[definition: Left Derived Functors]
Let $F: \mathcal A \to \mathcal B$ be an additive right exact functor between abelian categories, and suppose $\mathcal A$ has enough projectives. For $A \in \mathcal A$, choose a projective resolution $P_\bullet \to A$. The left derived functors of $F$ are
\begin{align*}
L_iF(A) := H_i(F(P_\bullet)).
\end{align*}
[/definition]
Since $F$ is right exact, $L_0F(A)$ recovers $F(A)$ up to natural isomorphism. The higher objects $L_iF(A)$ for $i>0$ measure the failure of $F$ to be exact on the left.
Left exact functors have the opposite defect: they preserve kernels at the start of a short exact sequence but may lose cokernel information later. To measure that lost information, one embeds the input into an injective resolution, applies the functor termwise, and reads the remaining obstruction as cohomology.
[definition: Right Derived Functors]
Let $F: \mathcal A \to \mathcal B$ be an additive left exact functor between abelian categories, and suppose $\mathcal A$ has enough injectives. For $A \in \mathcal A$, choose an injective resolution $A \to I^\bullet$. The right derived functors of $F$ are
\begin{align*}
R^iF(A) := H^i(F(I^\bullet)).
\end{align*}
[/definition]
Since $F$ is left exact, $R^0F(A)$ recovers $F(A)$ up to natural isomorphism. The objects $R^iF(A)$ for $i>0$ measure how far $F$ is from carrying exact sequences to exact sequences past the first kernel stage.
The definitions still depend on choices of resolutions, so the construction must satisfy two tests: degree zero should recover the original functor, and the higher degrees should assemble functorially. The remaining problem is to show that the objects obtained from a chosen resolution are independent enough to define functors with the expected degree-zero term.
[quotetheorem:4207]
[citeproof:4207]
The assumption of enough projectives is needed because the construction starts with projective resolutions; for module categories this is supplied by free modules, but an arbitrary abelian category need not have such replacements. Right exactness is what makes degree zero return the original functor: without it, $H_0(F(P_\bullet))$ would define a correction of $F$, not $F$ itself. The theorem does not make the higher $L_iF$ vanish; those groups are precisely the new invariants that appear when $F$ fails to preserve kernels.
[quotetheorem:4208]
[citeproof:4208]
Enough injectives are essential for the same reason that enough projectives were essential above: the construction requires an injective resolution for every object. Left exactness is also essential for the degree-zero identification $R^0F \cong F$; without it, the zeroth cohomology would not recover the original functor on short exact sequences. The theorem does not provide a preferred injective resolution or an efficient computation method, so later examples use special resolutions when available.
The indexing convention reflects the direction of the resolution: projective resolutions run to the left and produce homological indices $L_i$, while injective resolutions run to the right and produce cohomological indices $R^i$.
## Ext and Tor as Derived Functors
Which familiar algebraic constructions arise from this machine? Tensor product is right exact but not generally left exact, while $\operatorname{Hom}$ is left exact but not generally right exact. Their derived functors are $\operatorname{Tor}$ and $\operatorname{Ext}$.
[definition: Tor Functors]
Let $R$ be a ring, let $M$ be a right $R$-module, and let $N$ be a left $R$-module. Choose a projective resolution $P_\bullet \to M$ in the category of right $R$-modules. The Tor groups are
\begin{align*}
\operatorname{Tor}_i^R(M,N) := H_i(P_\bullet \otimes_R N).
\end{align*}
[/definition]
For commutative rings the distinction between left and right modules can be suppressed after stating it once. The group $\operatorname{Tor}_1$ measures the first obstruction to tensoring preserving a monomorphism.
[example: Computing Tor over the Integers]
Let $n,m\ge 1$ and tensor the projective resolution
\begin{align*}
0 \to \mathbb Z \xrightarrow{\cdot n} \mathbb Z \to \mathbb Z/n\mathbb Z \to 0
\end{align*}
with $\mathbb Z/m\mathbb Z$, after deleting the resolved term $\mathbb Z/n\mathbb Z$ as the definition of $\operatorname{Tor}$ requires. Under the standard identification $\mathbb Z\otimes_{\mathbb Z}\mathbb Z/m\mathbb Z\cong \mathbb Z/m\mathbb Z$, sending $a\otimes \bar{x}$ to $\overline{ax}$, the induced differential is
\begin{align*}
\mathbb Z/m\mathbb Z &\to \mathbb Z/m\mathbb Z,\\
\bar{x} &\mapsto \overline{nx}.
\end{align*}
Therefore
\begin{align*}
\operatorname{Tor}_1^{\mathbb Z}(\mathbb Z/n\mathbb Z,\mathbb Z/m\mathbb Z)
&\cong \ker(\mathbb Z/m\mathbb Z \xrightarrow{\cdot n} \mathbb Z/m\mathbb Z) \\
&= \{\bar{x}\in \mathbb Z/m\mathbb Z : \overline{nx}=\bar{0}\} \\
&= \{\bar{x}\in \mathbb Z/m\mathbb Z : m \mid nx\}.
\end{align*}
Let $d=\gcd(n,m)$, and write $n=dn'$ and $m=dm'$ with $\gcd(n',m')=1$. For $x\in \mathbb Z$,
\begin{align*}
m\mid nx
&\Longleftrightarrow dm'\mid dn'x \\
&\Longleftrightarrow m'\mid n'x \\
&\Longleftrightarrow m'\mid x,
\end{align*}
where the last equivalence uses $\gcd(n',m')=1$. Hence the kernel consists exactly of the residue classes $\overline{m't}$ in $\mathbb Z/dm'\mathbb Z$.
Define
\begin{align*}
\varphi:\mathbb Z/d\mathbb Z &\to \ker(\cdot n),\\
\bar{t} &\mapsto \overline{m't}.
\end{align*}
This is well-defined because if $t'=t+ds$, then
\begin{align*}
m't' = m't + m'ds = m't + ms,
\end{align*}
so $\overline{m't'}=\overline{m't}$ modulo $m$. It is surjective by the description of the kernel above. Its kernel is trivial because
\begin{align*}
\overline{m't}=\bar{0}\text{ in }\mathbb Z/dm'\mathbb Z
&\Longleftrightarrow dm'\mid m't \\
&\Longleftrightarrow d\mid t,
\end{align*}
so $\bar{t}=\bar{0}$ in $\mathbb Z/d\mathbb Z$. Thus $\varphi$ is an isomorphism, and
\begin{align*}
\operatorname{Tor}_1^{\mathbb Z}(\mathbb Z/n\mathbb Z,\mathbb Z/m\mathbb Z)
\cong \mathbb Z/\gcd(n,m)\mathbb Z.
\end{align*}
This group is precisely the subgroup of $\mathbb Z/m\mathbb Z$ consisting of elements killed by multiplication by $n$.
[/example]
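The kernel description in this example can be checked by brute force for small moduli. The sketch below represents $\mathbb Z/m\mathbb Z$ by the residues $0,\dots,m-1$; the function name `tor1_cyclic` and the sample pairs are illustrative assumptions, not notation from the text.

```python
from math import gcd

def tor1_cyclic(n, m):
    """Residues x in Z/m with n*x = 0, i.e. the kernel of multiplication by n."""
    return [x for x in range(m) if (n * x) % m == 0]

for n, m in [(4, 6), (5, 7), (12, 18)]:
    d = gcd(n, m)
    kernel = tor1_cyclic(n, m)
    # The kernel has gcd(n, m) elements ...
    assert len(kernel) == d
    # ... and consists of the multiples of m' = m / d, as the map phi predicts.
    assert kernel == [(m // d) * t for t in range(d)]
```

A finite check of this kind does not replace the proof, but it is a quick way to catch an indexing or divisibility slip when adapting the argument to other moduli.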
Tor measures the failure of tensor product to be exact. The dual obstruction for Hom is captured by Ext: to measure it, one replaces one argument by a projective resolution and takes the cohomology of the resulting Hom complex.
[definition: Ext Functors]
Let $R$ be a ring, let $M$ and $N$ be left $R$-modules, and choose a projective resolution $P_\bullet \to M$. The Ext groups are
\begin{align*}
\operatorname{Ext}_R^i(M,N) := H^i(\operatorname{Hom}_R(P_\bullet,N)).
\end{align*}
[/definition]
There is an equivalent construction using an injective resolution of $N$. This equivalence is one of the reasons $\operatorname{Ext}$ is a central invariant: it can be computed from a projective replacement of the first argument or an injective replacement of the second.
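To illustrate the injective side in one case, assume $N=\mathbb Z$ with the injective resolution $0\to\mathbb Z\to\mathbb Q\to\mathbb Q/\mathbb Z\to 0$; both $\mathbb Q$ and $\mathbb Q/\mathbb Z$ are divisible, hence injective over $\mathbb Z$. Since $\operatorname{Hom}_{\mathbb Z}(\mathbb Z/n\mathbb Z,\mathbb Q)=0$, the group $\operatorname{Ext}^1_{\mathbb Z}(\mathbb Z/n\mathbb Z,\mathbb Z)$ is identified with $\operatorname{Hom}_{\mathbb Z}(\mathbb Z/n\mathbb Z,\mathbb Q/\mathbb Z)$, that is, with the classes in $\mathbb Q/\mathbb Z$ killed by $n$. The sketch below enumerates those classes up to a denominator bound; the function name and the bound are hypothetical choices made for the illustration.

```python
from fractions import Fraction

def n_torsion_in_q_mod_z(n, max_denominator):
    """Classes x in Q/Z with n*x = 0, as fractions in [0, 1).

    Only denominators up to max_denominator are searched, so the bound
    should be at least n to see every n-torsion class.
    """
    classes = {Fraction(p, q)
               for q in range(1, max_denominator + 1)
               for p in range(q)}
    # n*x is zero in Q/Z exactly when n*x is an integer.
    return sorted(x for x in classes if (n * x).denominator == 1)

torsion = n_torsion_in_q_mod_z(6, 12)
# The classes found are k/6 for k = 0, ..., 5: a cyclic group of order 6.
assert torsion == [Fraction(k, 6) for k in range(6)]
```

The count agrees with the projective-side answer $A/nA=\mathbb Z/n\mathbb Z$ for $A=\mathbb Z$, which is what the equivalence of the two constructions predicts in this case.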
[example: Computing Ext from a Cyclic Group]
Let $n\ge 1$, and apply $\operatorname{Hom}_{\mathbb Z}(-,A)$ to the projective resolution
\begin{align*}
0 \to \mathbb Z \xrightarrow{\cdot n} \mathbb Z \to \mathbb Z/n\mathbb Z \to 0.
\end{align*}
Since $\operatorname{Hom}_{\mathbb Z}(-,A)$ is contravariant, the relevant part of the resulting cochain complex is
\begin{align*}
0 \to \operatorname{Hom}_{\mathbb Z}(\mathbb Z,A)
\xrightarrow{(\cdot n)^*}
\operatorname{Hom}_{\mathbb Z}(\mathbb Z,A)
\to 0,
\end{align*}
where $(\cdot n)^*(f)=f\circ(\cdot n)$. Identify $\operatorname{Hom}_{\mathbb Z}(\mathbb Z,A)$ with $A$ by sending a homomorphism $f$ to $f(1)$. If $f(1)=a$, then
\begin{align*}
(f\circ(\cdot n))(1)
&= f(n) \\
&= f(\underbrace{1+\cdots+1}_{n\text{ times}}) \\
&= \underbrace{f(1)+\cdots+f(1)}_{n\text{ times}} \\
&= na.
\end{align*}
Thus the induced cochain differential is multiplication by $n$:
\begin{align*}
A \xrightarrow{\cdot n} A.
\end{align*}
Therefore
\begin{align*}
\operatorname{Ext}_{\mathbb Z}^1(\mathbb Z/n\mathbb Z,A)
&\cong H^1\bigl(\operatorname{Hom}_{\mathbb Z}(P_\bullet,A)\bigr) \\
&= \operatorname{coker}(A \xrightarrow{\cdot n} A) \\
&= A/\operatorname{im}(\cdot n) \\
&= A/nA.
\end{align*}
For $A=\mathbb Z/m\mathbb Z$, let $d=\gcd(n,m)$, and write $n=dn'$ and $m=dm'$ with $\gcd(n',m')=1$. In $\mathbb Z/m\mathbb Z$,
\begin{align*}
nA
&= \{\overline{nx}:x\in \mathbb Z\} \\
&= \{\overline{dn'x}:x\in \mathbb Z\}.
\end{align*}
Every element of $nA$ is divisible by $d$ modulo $m$. Conversely, if $\overline{dt}\in \mathbb Z/m\mathbb Z$, choose integers $u,v$ with $un'+vm'=1$. Then
\begin{align*}
n(ut)
&= dn'ut \\
&= d(1-vm')t \\
&= dt - dm'vt \\
&= dt - mvt,
\end{align*}
so $\overline{n(ut)}=\overline{dt}$ modulo $m$. Hence
\begin{align*}
nA=\{\overline{dt}:t\in\mathbb Z\}.
\end{align*}
The quotient map
\begin{align*}
\psi:\mathbb Z/m\mathbb Z &\to \mathbb Z/d\mathbb Z,\\
\bar{x} &\mapsto \bar{x}
\end{align*}
is well-defined because $d\mid m$. Its kernel is
\begin{align*}
\ker(\psi)
&= \{\bar{x}\in \mathbb Z/m\mathbb Z : d\mid x\} \\
&= \{\overline{dt}:t\in \mathbb Z\} \\
&= nA.
\end{align*}
Thus
\begin{align*}
\operatorname{Ext}_{\mathbb Z}^1(\mathbb Z/n\mathbb Z,\mathbb Z/m\mathbb Z)
&\cong (\mathbb Z/m\mathbb Z)/n(\mathbb Z/m\mathbb Z) \\
&\cong \mathbb Z/d\mathbb Z \\
&= \mathbb Z/\gcd(n,m)\mathbb Z.
\end{align*}
So for cyclic coefficients, $\operatorname{Ext}^1$ is exactly the quotient of $A$ obtained by identifying elements that differ by a multiple of $n$, that is, by an element of $nA$.
[/example]
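The cokernel computation can likewise be checked numerically for $A=\mathbb Z/m\mathbb Z$. In this sketch the name `ext1_cyclic` is an illustrative assumption; it returns the image $nA$ as a set of residues together with the order of the quotient $A/nA$.

```python
from math import gcd

def ext1_cyclic(n, m):
    """Image nA inside A = Z/m and the order of the cokernel A/nA."""
    image = {(n * x) % m for x in range(m)}
    return image, m // len(image)

for n, m in [(4, 6), (5, 7), (12, 18)]:
    d = gcd(n, m)
    image, order = ext1_cyclic(n, m)
    # nA is the set of residues divisible by d, as shown with the Bezout step.
    assert image == {(d * t) % m for t in range(m // d)}
    # Hence A/nA has d elements, matching Z/gcd(n, m).
    assert order == d
```

The first assertion corresponds to the identity $nA=\{\overline{dt}\}$ and the second to the isomorphism induced by $\psi$.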
The calculation above shows that $\operatorname{Ext}^1$ can be nonzero, but it does not yet explain what its elements mean intrinsically. The natural interpretation comes from short exact sequences: a class should measure the different ways that one module can sit as a submodule with a prescribed quotient. The obstruction is that two sequences may have isomorphic middle terms while representing different extension data, because the inclusion of the submodule and the projection to the quotient are part of the structure. The next result gives the precise classification: $\operatorname{Ext}^1_R(M,N)$ is the group of extension classes of $M$ by $N$, with the zero class corresponding to split sequences.
[quotetheorem:4209]
[citeproof:4209]
The module hypotheses matter: the argument uses projective resolutions and the abelian structure of $R$-modules to form kernels, cokernels, and Baer sums of extensions. The theorem classifies extensions only up to equivalence of short exact sequences; it does not classify the middle module $E$ up to arbitrary isomorphism without remembering the maps from $N$ and to $M$. This interpretation explains why $\operatorname{Ext}^1$ is more than a computed cohomology group: it records the exact obstruction to splitting.
[example: Split and Non-Split Extensions]
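Consider first the sequence
\begin{align*}
0 \to \mathbb Z/2\mathbb Z \xrightarrow{i} \mathbb Z/2\mathbb Z\oplus \mathbb Z/2\mathbb Z \xrightarrow{p} \mathbb Z/2\mathbb Z \to 0
\end{align*}
with the maps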
\begin{align*}
i:\mathbb Z/2\mathbb Z &\to \mathbb Z/2\mathbb Z\oplus \mathbb Z/2\mathbb Z,
&
i(\bar a)&=(\bar a,\bar 0),\\
p:\mathbb Z/2\mathbb Z\oplus \mathbb Z/2\mathbb Z &\to \mathbb Z/2\mathbb Z,
&
p(\bar a,\bar b)&=\bar b.
\end{align*}
The map $i$ is injective because $(\bar a,\bar 0)=(\bar 0,\bar 0)$ forces $\bar a=\bar 0$. Moreover, $p(i(\bar a))=p(\bar a,\bar 0)=\bar 0$, so $\operatorname{im}(i)\subseteq \ker(p)$. Conversely, if $(\bar a,\bar b)\in \ker(p)$, then $\bar b=\bar 0$, hence
\begin{align*}
(\bar a,\bar b)=(\bar a,\bar 0)=i(\bar a),
\end{align*}
so $\ker(p)\subseteq \operatorname{im}(i)$. Thus $\operatorname{im}(i)=\ker(p)$, and $p$ is surjective because $p(\bar 0,\bar b)=\bar b$. The map
\begin{align*}
s:\mathbb Z/2\mathbb Z &\to \mathbb Z/2\mathbb Z\oplus \mathbb Z/2\mathbb Z,\\
s(\bar b)&=(\bar 0,\bar b)
\end{align*}
satisfies $p(s(\bar b))=p(\bar 0,\bar b)=\bar b$, so $p\circ s=\operatorname{id}_{\mathbb Z/2\mathbb Z}$. Therefore the sequence is split, and by *Ext One Classifies Extensions of Modules* it represents the zero class in $\operatorname{Ext}_{\mathbb Z}^1(\mathbb Z/2\mathbb Z,\mathbb Z/2\mathbb Z)$.
Now consider
\begin{align*}
0 \to \mathbb Z/2\mathbb Z \xrightarrow{i} \mathbb Z/4\mathbb Z \xrightarrow{q} \mathbb Z/2\mathbb Z \to 0,
\end{align*}
where
\begin{align*}
i(\bar a)&=\overline{2a}\quad \text{in }\mathbb Z/4\mathbb Z,\\
q(\bar x)&=\bar x\quad \text{modulo }2.
\end{align*}
The map $i$ is injective because
\begin{align*}
i(\bar a)=\bar 0 \text{ in }\mathbb Z/4\mathbb Z
&\Longleftrightarrow 4\mid 2a\\
&\Longleftrightarrow 2\mid a\\
&\Longleftrightarrow \bar a=\bar 0 \text{ in }\mathbb Z/2\mathbb Z.
\end{align*}
Also,
\begin{align*}
\operatorname{im}(i)
&=\{\bar 0,\bar 2\}\subseteq \mathbb Z/4\mathbb Z,\\
\ker(q)
&=\{\bar x\in \mathbb Z/4\mathbb Z:\bar x=\bar 0 \text{ modulo }2\}\\
&=\{\bar 0,\bar 2\}.
\end{align*}
Hence $\operatorname{im}(i)=\ker(q)$, and $q$ is surjective because $q(\bar 0)=\bar 0$ and $q(\bar 1)=\bar 1$.
This second sequence is not split. If it split, there would be a homomorphism $s:\mathbb Z/2\mathbb Z\to \mathbb Z/4\mathbb Z$ with $q\circ s=\operatorname{id}_{\mathbb Z/2\mathbb Z}$. Since $q(s(\bar 1))=\bar 1$, the element $s(\bar 1)$ would have to be either $\bar 1$ or $\bar 3$ in $\mathbb Z/4\mathbb Z$. But a homomorphism must satisfy
\begin{align*}
\bar 0
=s(\bar 0)
=s(\bar 1+\bar 1)
=s(\bar 1)+s(\bar 1)
=2s(\bar 1).
\end{align*}
For the two possible lifts,
\begin{align*}
2\bar 1=\bar 2\neq \bar 0,
\qquad
2\bar 3=\bar 6=\bar 2\neq \bar 0
\end{align*}
in $\mathbb Z/4\mathbb Z$, a contradiction. Thus the extension is non-split, so it represents a nonzero class. Since the previous computation gives
\begin{align*}
\operatorname{Ext}_{\mathbb Z}^1(\mathbb Z/2\mathbb Z,\mathbb Z/2\mathbb Z)
\cong \mathbb Z/2\mathbb Z,
\end{align*}
there are exactly two extension classes: the split class and the non-split class represented by $\mathbb Z/4\mathbb Z$.
[/example]
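Because all groups in this example are finite, the search for a splitting can be carried out exhaustively. A homomorphism $s$ out of $\mathbb Z/2\mathbb Z$ is determined by $y=s(\bar 1)$ subject to $y+y=0$, and it is a section when the projection sends $y$ to $\bar 1$. The helper `splittings` and the encodings of the two middle groups below are illustrative assumptions.

```python
from itertools import product

def splittings(elements, add, zero, proj):
    """Values y = s(1) giving a homomorphism s: Z/2 -> E with proj(s(1)) = 1."""
    return [y for y in elements if add(y, y) == zero and proj(y) == 1]

# Middle term Z/4 with q(x) = x mod 2.
z4_sections = splittings(range(4),
                         lambda x, y: (x + y) % 4,
                         0,
                         lambda x: x % 2)

# Middle term Z/2 + Z/2 with p(a, b) = b.
v4_sections = splittings(list(product(range(2), repeat=2)),
                         lambda x, y: ((x[0] + y[0]) % 2, (x[1] + y[1]) % 2),
                         (0, 0),
                         lambda x: x[1])

assert z4_sections == []                   # no section: the extension is non-split
assert v4_sections == [(0, 1), (1, 1)]     # (0, 1) is the section s of the text
```

The empty list for $\mathbb Z/4\mathbb Z$ is the computational form of the contradiction $2s(\bar 1)\neq\bar 0$, while the direct sum admits two sections, one of which is the map $s$ used above.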
## Long Exact Sequences and Universal Delta Functors
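Why do derived functors appear in long exact sequences? A short exact sequence of objects can be resolved compatibly by a short exact sequence of resolutions that is split in each degree. Because additive functors preserve split exact sequences, applying even a non-exact additive functor termwise still produces a short exact sequence of complexes. Homology then turns that short exact sequence of complexes into a long exact sequence.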
[quotetheorem:4210]
[citeproof:4210]
The short exactness hypothesis is necessary: if exactness fails at one degree, the connecting morphism may not be well-defined because a lifted boundary need not come from the left-hand complex. The theorem does not say that the long exact sequence splits, nor that the connecting morphism is zero; in applications, that connecting morphism often contains the main information. This result is the formal bridge from exact sequences of complexes to exact sequences of derived functors.
[quotetheorem:4211]
[citeproof:4211]
Enough projectives, or enough injectives in the dual statement, are needed to build the compatible resolutions used by the horseshoe argument. The theorem does not say that applying $F$ to the original short exact sequence remains exact; the whole point is that the higher derived functors repair the exactness lost by $F$. This is the computational payoff of deriving a functor: a broken short exact sequence is replaced by a longer exact sequence that keeps track of the obstruction terms.
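For a small instance, apply $-\otimes_{\mathbb Z}\mathbb Z/m\mathbb Z$ to $0\to\mathbb Z\xrightarrow{\cdot n}\mathbb Z\to\mathbb Z/n\mathbb Z\to 0$. Assuming $\operatorname{Tor}_1^{\mathbb Z}(\mathbb Z,\mathbb Z/m\mathbb Z)=0$ because $\mathbb Z$ is free, and using the identifications $\operatorname{Tor}_1\cong\mathbb Z/d\mathbb Z$ and $\mathbb Z/n\mathbb Z\otimes\mathbb Z/m\mathbb Z\cong\mathbb Z/d\mathbb Z$ with $d=\gcd(n,m)$, the long exact sequence reduces to four terms $0\to\mathbb Z/d\to\mathbb Z/m\xrightarrow{\cdot n}\mathbb Z/m\to\mathbb Z/d\to 0$. The sketch below checks exactness at each spot by brute force; the function names are hypothetical.

```python
from math import gcd

def image(f, domain):
    return {f(x) for x in domain}

def kernel(f, domain):
    return {x for x in domain if f(x) == 0}

def four_term_is_exact(n, m):
    """Check exactness of 0 -> Z/d -> Z/m -> Z/m -> Z/d -> 0 with d = gcd(n, m)."""
    d = gcd(n, m)
    m_prime = m // d
    Zd, Zm = range(d), range(m)
    phi = lambda t: (m_prime * t) % m   # Tor_1 term into Z/m, t -> m' t
    mult_n = lambda x: (n * x) % m      # the map induced by multiplication by n
    psi = lambda x: x % d               # reduction onto the tensor term Z/d
    return (
        kernel(phi, Zd) == {0}                      # injective at the left end
        and image(phi, Zd) == kernel(mult_n, Zm)    # exact at the first Z/m
        and image(mult_n, Zm) == kernel(psi, Zm)    # exact at the second Z/m
        and image(psi, Zm) == set(Zd)               # surjective at the right end
    )

assert all(four_term_is_exact(n, m) for n, m in [(4, 6), (5, 7), (12, 18), (3, 9)])
```

The two outer terms are the obstruction groups: the left one records the kernel that tensoring creates, and the right one is the ordinary tensor product that right exactness already controls.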
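The connecting morphism is often the most important map in the sequence. It is computed by lifting a cycle of the quotient complex to the middle complex and applying the differential; the result comes from the subcomplex, and its class measures the failure of the lift to be a cycle.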
Long exact sequences occur often enough that it is useful to make them part of the structure rather than restating their existence theorem each time. A delta functor packages the functors in all degrees together with the connecting maps that short exact sequences force.
[definition: Cohomological Delta Functor]
A cohomological delta functor from an abelian category $\mathcal A$ to an abelian category $\mathcal B$ is a family of additive functors $T^i: \mathcal A \to \mathcal B$ for $i\ge 0$, together with connecting morphisms
\begin{align*}
\delta^i: T^i(A'') \to T^{i+1}(A')
\end{align*}
for every short exact sequence $0 \to A' \to A \to A'' \to 0$, such that these data form natural long exact sequences.
[/definition]
The corresponding homological version has functors $T_i$ and connecting maps $T_i(A'') \to T_{i-1}(A')$.
This packaging raises a uniqueness question: if two constructions have the same degree-zero functor and the same exact-sequence behavior, should they be regarded as the same derived theory? Universality is the property that answers this question by forcing all higher-degree data from degree zero.
[definition: Universal Delta Functor]
A delta functor $T^\bullet$ is universal if, for every delta functor $S^\bullet$ and every natural transformation $T^0 \to S^0$, there exists a unique morphism of delta functors $T^\bullet \to S^\bullet$ extending it.
[/definition]
Universality says that derived functors are forced by their degree-zero part and their exactness behaviour. The remaining question is whether the right derived functors constructed from injective resolutions actually have this universal property, rather than merely giving one possible delta functor. The theorem answers that question and explains why competing constructions of the same right derived theory agree.
[quotetheorem:4212]
[citeproof:4212]
The enough-injectives hypothesis is what supplies the effacement step: every object must embed into an object on which the higher right derived functors vanish. Without that vanishing, the inductive uniqueness argument for extending a degree-zero natural transformation breaks down; without enough injectives, there may be no effacing monomorphism to start the induction. Universality does not compute the groups or choose canonical cocycle representatives, but it proves that any two constructions satisfying the same derived-functor axioms agree uniquely once they agree in degree zero. This is the conceptual reason that injective resolutions, projective resolutions in the appropriate variable, and other acyclic-resolution methods can be compared.
## Sheaf Cohomology as a Derived Functor
How does this abstract machinery produce geometric invariants? In sheaf theory, the global sections functor is left exact: a section of a subsheaf is detected on global sections, but surjectivity of sheaves need not give surjectivity on global sections. Right derived functors of global sections therefore measure the obstruction to gluing local data into global data.
[definition: Global Sections Functor]
Let $X$ be a topological space and let $\operatorname{Sh}(X)$ be the category of sheaves of abelian groups on $X$. The global sections functor is
\begin{align*}
\Gamma(X,-): \operatorname{Sh}(X) \to \operatorname{Ab}, \qquad \mathcal F \mapsto \mathcal F(X).
\end{align*}
[/definition]
The functor $\Gamma(X,-)$ is left exact, and $\operatorname{Sh}(X)$ has enough injectives, so its right derived functors are defined. The point of passing to derived functors is that global sections can miss local data that cannot be glued globally.
To make those gluing failures usable as invariants, one needs notation that records the derived functors together with the space and the sheaf being studied. The next definition packages the right derived functors of $\Gamma(X,-)$ into groups whose degree $0$ part is ordinary global sections and whose higher-degree parts measure successive obstructions to exactness.
[definition: Sheaf Cohomology]
Let $X$ be a topological space and let $\mathcal F$ be a sheaf of abelian groups on $X$. The sheaf cohomology groups of $\mathcal F$ are
\begin{align*}
H^i(X,\mathcal F) := R^i\Gamma(X,\mathcal F).
\end{align*}
[/definition]
This definition turns a categorical defect of exactness into the cohomology groups used in geometry and topology.
[example: Global Sections Need Not Be Exact]
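Take $X=S^1$, write $\mathcal C_X(Y)$ for the sheaf of continuous functions on $X$ with values in a topological abelian group $Y$, and let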
\begin{align*}
\exp: \mathcal C_X(\mathbb R) \to \mathcal C_X(S^1)
\end{align*}
be the morphism of sheaves sending a real-valued continuous function $f$ to $e^{2\pi i f}$. This morphism is surjective as a morphism of sheaves because every $S^1$-valued continuous function has local continuous arguments. Its kernel is the constant sheaf $\mathbb Z$, since
\begin{align*}
e^{2\pi i f}=1
\quad \Longleftrightarrow \quad
f(x)\in \mathbb Z \text{ for every }x,
\end{align*}
and a continuous integer-valued function is locally constant.
Now consider the global section
\begin{align*}
\operatorname{id}_{S^1}\in \mathcal C_X(S^1)(S^1),
\qquad
\operatorname{id}_{S^1}(z)=z.
\end{align*}
If this section lifted to a global section of $\mathcal C_X(\mathbb R)$, there would be a continuous function $f:S^1\to \mathbb R$ such that
\begin{align*}
e^{2\pi i f(z)}=z
\end{align*}
for every $z\in S^1$. Put $\gamma(t)=e^{2\pi i t}$ and define $h(t)=f(\gamma(t))$ for $t\in[0,1]$. Then
\begin{align*}
e^{2\pi i h(t)}
&=e^{2\pi i t},
\end{align*}
so $h(t)-t\in\mathbb Z$ for every $t$. The function $t\mapsto h(t)-t$ is continuous from the connected interval $[0,1]$ to the discrete set $\mathbb Z$, hence is constant; say $h(t)-t=k$. Thus
\begin{align*}
h(0)&=k,\\
h(1)&=k+1.
\end{align*}
But $\gamma(0)=\gamma(1)$, so $h(0)=f(\gamma(0))=f(\gamma(1))=h(1)$, contradicting $k=k+1$.
Therefore the sheaf epimorphism $\mathcal C_X(\mathbb R)\to \mathcal C_X(S^1)$ does not induce a surjection on global sections. The associated long exact sequence contains
\begin{align*}
\mathcal C_X(\mathbb R)(S^1)
\to
\mathcal C_X(S^1)(S^1)
\xrightarrow{\delta}
H^1(S^1,\mathbb Z),
\end{align*}
and exactness says that a global section lies in the image of the first map exactly when its connecting class under $\delta$ is zero. Since $\operatorname{id}_{S^1}$ has no global lift, its image in $H^1(S^1,\mathbb Z)$ is nonzero; this is the first cohomological obstruction to gluing the local logarithms into one global logarithm.
[/example]
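The failure of the global lift can also be observed numerically. The sketch below follows the loop $\gamma(t)=e^{2\pi i t}$ and accumulates a continuous real value $h(t)$ with $e^{2\pi i h(t)}=\gamma(t)$ from small local increments, which is how local logarithms are patched together. The function name, the sample count, and the starting value $h(0)=0$ are assumptions made for the illustration.

```python
import cmath
import math

def lift_along_loop(samples=1000):
    """Values h(k/samples) of a continuous lift of gamma(t) = exp(2*pi*i*t)."""
    h = 0.0
    values = [h]
    previous = cmath.exp(0j)
    for k in range(1, samples + 1):
        z = cmath.exp(2j * math.pi * k / samples)
        # The ratio z / previous is close to 1, so its phase is the small
        # local change of argument; dividing by 2*pi converts it to turns.
        h += cmath.phase(z / previous) / (2 * math.pi)
        values.append(h)
        previous = z
    return values

values = lift_along_loop()
jump = values[-1] - values[0]
# The loop returns to its starting point, but the lift has moved by one turn.
assert abs(jump - 1.0) < 1e-9
```

The accumulated jump of one full turn is the integer $k+1-k$ from the argument above; it is the numerical shadow of the nonzero class of $\operatorname{id}_{S^1}$ in $H^1(S^1,\mathbb Z)$.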
Derived functors close the course by tying together the central themes: adjunctions explain why many functors preserve half of the exact structure, abelian categories provide kernels and cokernels, and resolutions turn non-exactness into computable invariants. The next course in homological algebra develops this preview into a systematic language of derived categories, spectral sequences, and cohomological classification.
## References
- Mac Lane, Saunders. *Categories for the Working Mathematician*. Springer, 2nd ed., 1998.
- Awodey, Steve. *Category Theory*. Oxford University Press, 2nd ed., 2010.
- Riehl, Emily. *Category Theory in Context*. Dover, 2016.
- Weibel, Charles A. *An Introduction to Homological Algebra*. Cambridge University Press, 1994.
- Kashiwara, Masaki, and Schapira, Pierre. *Categories and Sheaves*. Springer, 2006.
Category Theory II: Adjunctions, Limits, and Abelian Categories
Created by admin on 5/29/2026 | Last updated on 6/1/2026