This course develops Ricci flow as a central tool in geometric analysis, viewing it as a nonlinear parabolic evolution equation for Riemannian metrics. The main goal is to understand how curvature changes under the flow, how geometric quantities satisfy evolution identities, and how these analytic ideas can be used to extract global information about the underlying manifold. Along the way, the course connects PDE methods with differential geometry in one of the most influential programs in modern geometry.
The chapters begin by setting up Ricci flow and proving short-time existence, then move to the evolution equations for curvature and the maximum principles that control them. From there, the course introduces Harnack inequalities, Li-Yau type estimates, and the analysis of singularity formation through blow-up limits. This leads naturally to Ricci solitons, ancient solutions, and Perelman’s entropy and reduced volume monotonicity formulas, which provide powerful new invariants and structural insight into the flow.
The later chapters focus on the three-dimensional case, where pinching estimates and canonical neighborhood structures organize the geometry near singularities. Surgery and long-time continuation show how to continue the flow past singular times while preserving control of the geometry. The course culminates in an outline of Perelman’s proof of the Poincaré conjecture, showing how the analytic machinery developed throughout the course fits together into a complete geometric classification strategy.
# Introduction
This opening chapter fixes the viewpoint of the course. Ricci flow is a way of studying Riemannian manifolds by evolving their metrics in the direction dictated by Ricci curvature, so the course sits at the meeting point of geometric intuition and nonlinear parabolic analysis. The main questions are existence, curvature control, singularity formation, and the geometric information retained when the flow develops high-curvature regions. Chapters 1--3 make these questions precise through Hamilton's analytic framework, Chapters 4--5 through maximum principles and Harnack inequalities, Chapters 8--9 through Perelman's monotonicity formulae, and Chapters 10--12 through canonical neighbourhoods, surgery, and the Poincaré application.
The course assumes the language of Riemannian geometry and parabolic PDE. We shall use tensor calculus on smooth manifolds, Sobolev and elliptic estimates where gauge choices are involved in Chapter 2, and basic topology of three-manifolds when discussing the Poincaré strategy in Chapters 11--12. This introduction is not a substitute for those prerequisites; it records the objects, conventions, and guiding examples that will recur throughout the notes.
## What Ricci Flow Tries to Do
The starting problem is that a Riemannian metric contains too much local data to classify directly. Curvature packages part of that data into tensors, but curvature itself changes under deformation of the metric. Ricci flow chooses a deformation rule in which the metric reacts to its own Ricci curvature, with the hope that regions of positive curvature contract, regions of negative curvature expand, and complicated geometry is driven toward canonical models or detectable singularities.
[definition: Ricci Flow]
Let $M$ be a smooth manifold, and let $\operatorname{Met}(M)$ denote the space of smooth Riemannian metrics on $M$. A Ricci flow on $M$ over a time interval $I \subset \mathbb R$ is a smooth map
\begin{align*}
g:I&\longrightarrow \operatorname{Met}(M), & t&\longmapsto g(t),
\end{align*}
satisfying
\begin{align*}
\frac{\partial}{\partial t}g(t) = -2\operatorname{Ric}(g(t))
\end{align*}
for all $t \in I$.
[/definition]
The factor $-2$ is a normalization that simplifies curvature evolution equations. The negative sign makes positive Ricci curvature shrink lengths, matching the heat-equation principle that the flow should smooth by moving in a curvature-decreasing direction.
[example: Constant Curvature Metrics]
If $(M^n,g_0)$ has constant sectional curvature $K_0$, then the Ricci flow stays in the one-parameter family $g(t)=\lambda(t)g_0$ and the scale factor satisfies
\begin{align*}
\lambda'(t)=-2(n-1)K_0,
\qquad
\lambda(0)=1.
\end{align*}
Hence
\begin{align*}
g(t)=\bigl(1-2(n-1)K_0t\bigr)g_0.
\end{align*}
Thus a round sphere with $K_0>0$ shrinks to zero scale at time $1/(2(n-1)K_0)$, a flat metric with $K_0=0$ remains fixed, and a hyperbolic metric with $K_0<0$ expands under the unnormalized flow. Later chapters reuse this example as the basic compact model for finite-time curvature blow-up.
[/example]
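The three behaviours in this example can be tabulated with exact arithmetic. The following is a minimal sketch, not part of the course's formal development; the function names `scale_factor` and `extinction_time` are hypothetical, and the code assumes $n\ge 2$ and a constant-curvature initial metric, so that $\lambda(t)=1-2(n-1)K_0t$ as derived above.

```python
from fractions import Fraction


def scale_factor(n, K0, t):
    """Return lambda(t) = 1 - 2(n-1) K0 t for the homothetic solution g(t) = lambda(t) g0.

    Assumes n >= 2 and that g0 has constant sectional curvature K0.
    """
    return 1 - 2 * (n - 1) * Fraction(K0) * Fraction(t)


def extinction_time(n, K0):
    """Return the time at which lambda reaches zero, or None when K0 <= 0."""
    if K0 <= 0:
        return None  # flat metrics are fixed, hyperbolic metrics expand forever
    return Fraction(1) / (2 * (n - 1) * Fraction(K0))


if __name__ == "__main__":
    n = 3
    for K0 in (1, 0, -1):
        # Print the curvature, the scale factor at t = 1/8, and the extinction time.
        print(K0, scale_factor(n, K0, Fraction(1, 8)), extinction_time(n, K0))
```

Exact fractions are used instead of floats so that the extinction time $1/(2(n-1)K_0)$ is reproduced without rounding.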
The constant-curvature calculation shows that the unnormalized equation mixes two effects: it changes shape and it changes total scale. When the scale change is the feature under study, the unnormalized equation is the right object; when the goal is to compare evolving shapes over long time, it is useful to add a compensating global term. The next definition records the volume-adapted equation used for that comparison; the scalar $r(t)$ is the spatial average of scalar curvature at time $t$.
[definition: Normalized Ricci Flow]
Let $M$ be a closed smooth manifold of dimension $n$, and let $\operatorname{Met}(M)$ denote the space of smooth Riemannian metrics on $M$. A normalized Ricci flow over a time interval $I \subset \mathbb R$ is a smooth map
\begin{align*}
g:I&\longrightarrow \operatorname{Met}(M), & t&\longmapsto g(t),
\end{align*}
satisfying
\begin{align*}
\frac{\partial}{\partial t}g(t) = -2\operatorname{Ric}(g(t)) + \frac{2}{n}r(t)g(t),
\end{align*}
where $r:I\to\mathbb R$ is given by
\begin{align*}
r(t)=\frac{1}{\operatorname{Vol}_{g(t)}(M)}\int_M S(g(t))\,d\mu_{g(t)}.
\end{align*}
[/definition]
The normalized equation removes a global scaling effect. It is useful when one wants to study convergence of shapes rather than collapse or expansion caused only by volume change.
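As a quick consistency check (a worked instance, not part of the definition), suppose $M$ is closed and $g_0$ has constant sectional curvature $K_0$. Then $\operatorname{Ric}(g_0)=(n-1)K_0g_0$ and $S(g_0)=n(n-1)K_0$ is constant, so its average is $r=n(n-1)K_0$. The right-hand side of the normalized equation at $g_0$ is therefore
\begin{align*}
-2\operatorname{Ric}(g_0)+\frac{2}{n}\,r\,g_0
=-2(n-1)K_0\,g_0+\frac{2}{n}\,n(n-1)K_0\,g_0
=0.
\end{align*}
Hence the constant family $g(t)=g_0$ solves the normalized flow for every sign of $K_0$: round, flat, and hyperbolic closed manifolds are all stationary once the global scale change is removed.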
## The Analytic Difficulty Behind the Equation
The next problem is that the displayed equation resembles a [heat equation](/page/Heat%20Equation) but is not a standard strictly parabolic system. The components of $g(t)$ depend on the choice of coordinates, while the equation itself is invariant under diffeomorphisms of $M$, and this invariance creates a degeneracy in the principal symbol. A large part of Hamilton's short-time existence theory is the task of separating this geometric invariance from the genuinely parabolic part of the evolution.
[remark: Diffeomorphism Invariance]
If $g(t)$ solves Ricci flow and $\varphi:M\to M$ is a time-independent diffeomorphism, then $\varphi^*g(t)$ also solves Ricci flow. This follows from naturality of the Ricci tensor under pullback:
\begin{align*}
\operatorname{Ric}(\varphi^*g)=\varphi^*\operatorname{Ric}(g).
\end{align*}
[/remark]
This invariance is geometrically necessary, since a flow of metrics should not depend on the names of points. Analytically it produces a gauge freedom: some directions in the space of metric components correspond to reparametrizing the manifold rather than changing intrinsic geometry. What is needed is a short-time existence theorem showing that the equation can be solved once the gauge degeneracy is separated from the parabolic directions. We work on closed manifolds, the clean setting in which no boundary conditions or completeness-at-infinity hypotheses are needed; on a manifold with boundary, parabolic boundary data must be added, and on a noncompact manifold one needs additional control at infinity for the analytic estimates to hold. With these conventions fixed, the foundational question is whether the geometric equation actually determines a smooth flow for a short positive time despite its diffeomorphism degeneracy.
[quotetheorem:5961]
[citeproof:5961]
This theorem is the analytic foundation for everything that follows. It says that the formal curvature evolution is a real dynamical system, at least for short time, once the diffeomorphism freedom is handled correctly. The theorem does not give a uniform lower bound for $\varepsilon$ depending only on the dimension, nor does it prevent singularities from forming later; the existence time depends on quantitative geometry of $g_0$, such as curvature and injectivity-radius control. The uniqueness statement is also tied to the smooth closed setting stated above, and later noncompact uniqueness results require extra bounded-curvature or completeness assumptions. The next chapters use this result as the starting point for curvature evolution and continuation arguments rather than as a long-time existence theorem.
## Curvature as the Main Unknown
Once a solution exists, the central question becomes how curvature evolves. The metric equation is first order in time and second order in space, but the quantities that detect singularity formation are the scalar curvature, the Ricci tensor, and the full Riemann curvature tensor. Their evolution equations combine heat-type diffusion with nonlinear reaction terms.
[definition: Curvature Norm Along a Flow]
Let $g:I\to\operatorname{Met}(M)$ be a Ricci flow. For each $t\in I$, the curvature norm at time $t$ is the function
\begin{align*}
|\operatorname{Rm}(g(t))|_{g(t)}:M&\longrightarrow [0,\infty), & x&\longmapsto |\operatorname{Rm}(g(t))_x|_{g(t)}.
\end{align*}
[/definition]
Controlling this norm is the main local regularity problem. The continuation question asks whether any obstruction to extending a closed smooth flow must be visible in this curvature norm, rather than in some hidden loss of coordinates or higher derivatives.
[quotetheorem:5962]
[citeproof:5962]
The blow-up criterion turns singularity analysis into the study of high-curvature regions. The hypotheses are doing real work: closedness removes boundary and spatial-infinity alternatives, while the assumption $T<\infty$ focuses the statement on finite-time obstruction to continuation. On noncompact manifolds, curvature bounds alone may not give the same compactness without completeness, injectivity-radius, or derivative control hypotheses, and under weaker regularity assumptions the phrase "smooth convergence to $g(T)$" is no longer automatic. Later chapters rescale around high-curvature regions, add the compactness hypotheses needed to pass to limits, and classify the possible ancient solutions that appear.
[example: Shrinking Round Sphere]
For the unit round metric $g_{S^n}$ on $S^n$, take $K_0=1$ in the constant-curvature solution above. Then
\begin{align*}
g(t)=\bigl(1-2(n-1)t\bigr)g_{S^n},
\qquad
T=\frac{1}{2(n-1)}.
\end{align*}
Since sectional curvature rescales by the reciprocal scale factor,
\begin{align*}
K(t)=\frac{1}{1-2(n-1)t}.
\end{align*}
For a constant-sectional-curvature metric, $|\operatorname{Rm}|^2=2n(n-1)K^2$, and therefore
\begin{align*}
|\operatorname{Rm}(g(t))|_{g(t)}=\frac{\sqrt{2n(n-1)}}{1-2(n-1)t}
=\frac{\sqrt{2n(n-1)}}{2(n-1)}(T-t)^{-1}.
\end{align*}
Thus the round sphere develops a spatially uniform finite-time singularity, and its curvature norm is exactly a constant multiple of $(T-t)^{-1}$, which blows up as $t\uparrow T$.
[/example]
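The blow-up rate in this example is easy to confirm numerically. The sketch below is illustrative only; the function names are hypothetical, and the formulas coded are exactly the ones displayed above for the unit round sphere with $n\ge 2$.

```python
import math


def extinction_time(n):
    """Singular time T = 1 / (2(n-1)) of the shrinking unit round n-sphere."""
    return 1.0 / (2 * (n - 1))


def rm_norm_round_sphere(n, t):
    """|Rm(g(t))| for g(t) = (1 - 2(n-1)t) g_{S^n}, using |Rm|^2 = 2n(n-1)K^2."""
    scale = 1.0 - 2 * (n - 1) * t
    if scale <= 0:
        raise ValueError("t must be smaller than the extinction time")
    K = 1.0 / scale  # sectional curvature scales by the reciprocal factor
    return math.sqrt(2 * n * (n - 1)) * K


def blowup_constant(n):
    """The constant C with |Rm| = C (T - t)^{-1}."""
    return math.sqrt(2 * n * (n - 1)) / (2 * (n - 1))


if __name__ == "__main__":
    n = 3
    T = extinction_time(n)
    for t in (0.0, 0.1, 0.2, 0.24):
        # The product (T - t) |Rm| should not depend on t.
        print(t, (T - t) * rm_norm_round_sphere(n, t), blowup_constant(n))
```

The product $(T-t)\,|\operatorname{Rm}|$ is independent of $t$ up to floating-point error, which is the numerical form of the statement that the blow-up is exactly of order $(T-t)^{-1}$.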
This model singularity is spatially uniform: the whole manifold collapses at the same curvature scale. More complicated flows can develop necks, caps, and localized high-curvature regions, which is why compactness and classification theorems become necessary.
## Scaling and Model Geometries
A geometric flow must be understood together with its scaling symmetries. Singularities are analysed by zooming in near points where curvature is large, and the formal rescaling rule must preserve the Ricci flow equation itself. The next quoted result records that parabolic scaling rule before it is used to define blow-up limits.
[quotetheorem:5963]
[citeproof:5963]
Scaling explains why blow-up limits are the natural objects near singularities: after rescaling, a small high-curvature region may converge to a complete model flow, but the scaling theorem alone gives no compactness, no curvature bounds, and no guarantee that a subsequential limit exists. Those conclusions require additional estimates such as noncollapsing and local derivative control. The most rigid models are self-similar, meaning that their evolution is generated only by scaling and diffeomorphism. Without a static soliton equation, such rescaled singularity models would have to be recognized from the full time-dependent flow rather than from one geometric structure. To state this self-similarity without carrying an entire time-dependent solution, we need a time-independent definition; the vector field below records the diffeomorphism part of the self-similar motion, and the constant $\rho$ records the scaling direction.
[definition: Ricci Soliton]
A Ricci soliton is a Riemannian manifold $(M,g)$ together with a vector field $X\in\mathfrak X(M)$ and a constant $\rho\in\mathbb R$ such that
\begin{align*}
\operatorname{Ric}(g)+\frac{1}{2}\mathcal L_X g=\rho g.
\end{align*}
It is shrinking if $\rho>0$, steady if $\rho=0$, and expanding if $\rho<0$.
[/definition]
Solitons are fixed points of Ricci flow modulo scaling and diffeomorphism. They form the first candidates for singularity models, just as self-similar solutions organize the singularity theory of many nonlinear PDE.
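Two standard instances illustrate the definition; both are routine checks rather than new results. First, if $\operatorname{Ric}(g)=\kappa g$ for a constant $\kappa$, then $X=0$ gives a soliton with $\rho=\kappa$. In particular the unit round sphere $S^n$ is shrinking with $\rho=n-1$, flat metrics are steady, and hyperbolic metrics of sectional curvature $-1$ are expanding with $\rho=-(n-1)$. Second, on $\mathbb R^n$ with the Euclidean metric $g$ and standard coordinates $x^i$, take $X=\rho\,x^i\partial_i$ for any constant $\rho$. Since the Christoffel symbols vanish in these coordinates,
\begin{align*}
(\mathcal L_Xg)_{ij}=\partial_iX_j+\partial_jX_i=2\rho\,\delta_{ij},
\qquad
\operatorname{Ric}(g)+\frac{1}{2}\mathcal L_Xg=0+\rho g=\rho g.
\end{align*}
Thus the same flat metric carries shrinking, steady, and expanding soliton structures, depending only on the sign of $\rho$; the case $\rho>0$ is usually called the Gaussian shrinker.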
[example: Shrinking Cylinder]
On $S^{n-1}\times\mathbb R$, let
\begin{align*}
g(t)=a(t)g_{S^{n-1}}+dz^2,
\end{align*}
where $g_{S^{n-1}}$ is the unit round metric and $n\ge 3$. The product splitting gives
\begin{align*}
\operatorname{Ric}(g(t))=(n-2)g_{S^{n-1}}+0\cdot dz^2.
\end{align*}
Substitution into $\partial_tg(t)=-2\operatorname{Ric}(g(t))$ yields
\begin{align*}
a'(t)=-2(n-2),
\end{align*}
so for the initially unit cylinder,
\begin{align*}
g(t)=\bigl(1-2(n-2)t\bigr)g_{S^{n-1}}+dz^2.
\end{align*}
The spherical factor collapses at time $T=1/(2(n-2))$, while the $\mathbb R$ direction does not shrink. Thus the round cylinder is a neck-type model: curvature concentrates in the compact spherical directions, but the geometry remains extended along the line.
[/example]
## The Road to Perelman's Structure Theory
The final problem motivating the course is how local analytic control can yield global topological information. In dimension three, Ricci flow tends to simplify geometry, but singularities can interrupt the evolution. Hamilton's program was to understand those singularities well enough to continue the flow through controlled topological changes; Perelman's work supplied the missing monotonicity, noncollapsing, and canonical-neighbourhood tools.
[explanation: Main Themes of the Course]
The early part of the course develops the equation as a quasilinear parabolic system. This includes the DeTurck trick, uniqueness, curvature evolution equations, and maximum principles for scalar and tensor quantities.
The middle part studies estimates that persist along the flow. Scalar curvature lower bounds, Hamilton's matrix and tensor maximum principles, and derivative estimates show how curvature pinching can improve over time.
The later part turns to singularities. Blow-up sequences, ancient solutions, reduced length, reduced volume, and entropy monotonicity are used to prevent collapsing and to identify geometric models near high-curvature regions.
The final part explains how these analytic tools enter the three-dimensional topological applications. The Poincaré strategy is not a single curvature computation; it is a chain linking parabolic smoothing, singularity models, canonical neighbourhoods, and long-time geometric decomposition.
[/explanation]
This chapter has introduced the viewpoint rather than the machinery. The next chapter turns that perspective into a working equation, studying Ricci flow as a geometric parabolic equation and identifying the diffeomorphism invariance that makes Hamilton's existence theorem nonstandard.
# 1. Ricci Flow as a Geometric Parabolic Equation
Ricci flow is the evolution equation that lets the curvature of a Riemannian manifold move the metric itself. In earlier geometric analysis courses, the metric was usually fixed and elliptic or parabolic equations were studied on top of it; here the coefficients of the analytic problem are part of the unknown. The chapter assumes the standard prerequisites from Riemannian geometry and geometric analysis: smooth manifolds, Riemannian metrics, curvature tensors, Lie derivatives, basic parabolic equations, and local-coordinate tensor calculus. It sets up the equation, records its basic symmetries, and studies the model solutions that guide the later singularity theory.
The guiding analogy is heat flow. A heat equation smooths a function by diffusing high-frequency variation, while Ricci flow aims to smooth a metric by reacting to Ricci curvature. The analogy is imperfect because the equation is invariant under diffeomorphisms, and that invariance is the first signal that the equation is only weakly parabolic before a gauge is chosen.
## The Ricci Flow Equation
What should it mean to smooth the geometry of a manifold rather than a scalar function? Since Ricci curvature measures the averaged failure of geodesic balls to look Euclidean, the first-order rule is to decrease the metric in positively Ricci-curved directions and increase it in negatively Ricci-curved directions.
[definition: Ricci Flow]
Let $M$ be a smooth manifold, let $I \subset \mathbb R$ be an interval, and let $\operatorname{Met}(M)$ denote the space of smooth Riemannian metrics on $M$. A Ricci flow on $M$ over $I$ is a smooth map
\begin{align*}
g:I&\longrightarrow \operatorname{Met}(M), & t&\longmapsto g(t),
\end{align*}
satisfying
\begin{align*}
\frac{\partial}{\partial t} g(t) = -2\operatorname{Ric}(g(t))
\end{align*}
for all $t \in I$.
[/definition]
The factor $-2$ is a convention chosen so that the curvature evolution equations take their standard form. It also aligns the constant-curvature examples with simple scalar ordinary differential equations, so the first test is to solve the equation on a space whose Ricci tensor is already known.
[example: Round Sphere Under Ricci Flow]
Let $g_0$ be the round metric of sectional curvature $1$ on $S^n$, with $n \ge 2$, and look for a homothetic solution of the form
\begin{align*}
g(t)=a(t)g_0
\end{align*}
with $a(0)=1$. The round metric has
\begin{align*}
\operatorname{Ric}(g_0)=(n-1)g_0.
\end{align*}
Because multiplying a metric by a positive constant leaves the Ricci tensor unchanged as a $(0,2)$-tensor, we have
\begin{align*}
\operatorname{Ric}(g(t))=\operatorname{Ric}(a(t)g_0)=\operatorname{Ric}(g_0)=(n-1)g_0.
\end{align*}
The time derivative of the ansatz is
\begin{align*}
\frac{\partial}{\partial t}g(t)=\frac{\partial}{\partial t}\bigl(a(t)g_0\bigr)=a'(t)g_0.
\end{align*}
Substituting these two identities into the Ricci flow equation gives
\begin{align*}
a'(t)g_0=-2\operatorname{Ric}(g(t)).
\end{align*}
Using $\operatorname{Ric}(g(t))=(n-1)g_0$, this becomes
\begin{align*}
a'(t)g_0=-2(n-1)g_0.
\end{align*}
Since $g_0$ is not the zero tensor, the scalar coefficient must satisfy
\begin{align*}
a'(t)=-2(n-1).
\end{align*}
Integrating from $0$ to $t$ gives
\begin{align*}
a(t)-a(0)=\int_0^t -2(n-1)\,d\tau.
\end{align*}
The integral is
\begin{align*}
\int_0^t -2(n-1)\,d\tau=-2(n-1)t,
\end{align*}
and $a(0)=1$, so
\begin{align*}
a(t)=1-2(n-1)t.
\end{align*}
The formula defines a Riemannian metric precisely while the scale factor is positive:
\begin{align*}
1-2(n-1)t>0.
\end{align*}
Equivalently,
\begin{align*}
t<\frac{1}{2(n-1)}.
\end{align*}
Thus the round sphere shrinks by a uniform scale factor until the extinction time
\begin{align*}
T=\frac{1}{2(n-1)}.
\end{align*}
[/example]
This example shows that positive Ricci curvature contracts distances. It also shows a failure of any naive expectation that the unnormalized equation preserves size: the volume of the round sphere is multiplied by $a(t)^{n/2}$ and tends to $0$ as $t$ approaches the extinction time. To compare long-time behaviour across solutions whose volume changes, the next problem is to modify the equation by a global scale correction rather than by changing its local curvature response.
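The volume factor can be checked directly. For a constant $a>0$ the volume form satisfies $d\mu_{ag_0}=a^{n/2}\,d\mu_{g_0}$, so with $a(t)=1-2(n-1)t$,
\begin{align*}
\operatorname{Vol}_{g(t)}(S^n)&=a(t)^{n/2}\operatorname{Vol}_{g_0}(S^n),\\
\frac{d}{dt}\operatorname{Vol}_{g(t)}(S^n)&=\frac{n}{2}\,a(t)^{n/2-1}a'(t)\operatorname{Vol}_{g_0}(S^n)
=-n(n-1)\,a(t)^{n/2-1}\operatorname{Vol}_{g_0}(S^n).
\end{align*}
This agrees with the standard first-variation formula $\frac{d}{dt}\operatorname{Vol}_{g(t)}(M)=-\int_MS(g(t))\,d\mu_{g(t)}$ for Ricci flow on a closed manifold, quoted here without proof: the scalar curvature of $g(t)$ is $S(g(t))=n(n-1)/a(t)$, and integrating it against $a(t)^{n/2}\,d\mu_{g_0}$ gives the same right-hand side.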
[definition: Normalized Ricci Flow]
Let $M$ be a closed smooth $n$-manifold, let $I \subset \mathbb R$ be an interval, and let $\operatorname{Met}(M)$ denote the space of smooth Riemannian metrics on $M$. A normalized Ricci flow is a smooth map
\begin{align*}
g:I&\longrightarrow \operatorname{Met}(M), & t&\longmapsto g(t),
\end{align*}
satisfying
\begin{align*}
\frac{\partial}{\partial t} g(t) = -2\operatorname{Ric}(g(t)) + \frac{2}{n}r(t)g(t),
\end{align*}
where
\begin{align*}
r(t)=\frac{1}{\operatorname{Vol}_{g(t)}(M)}\int_M S(g(t))\,d\mu_{g(t)}
\end{align*}
is the average scalar curvature.
[/definition]
The added term is spatially constant, so it changes the metric by uniform scaling rather than by local distortion. The coefficient has been chosen to answer a specific question: does the normalized equation keep the total volume fixed on a closed manifold?
[quotetheorem:5964]
[citeproof:5964]
The closedness hypothesis is used twice: it makes the total volume finite and removes boundary terms or boundary conditions from the volume calculation. On a noncompact manifold such as hyperbolic space, the total volume may be infinite, so the average scalar curvature in this definition is not available without extra choices. The theorem does not say that local volume elements are fixed; regions can still expand or contract, while the integral over all of $M$ stays constant. It also does not give long-time existence, since preserving volume does not prevent curvature concentration. The normalized equation is therefore best viewed as the unnormalized flow with a global scale correction, and the next symmetry explains how such scale changes interact with the original equation.
## Scaling and Curvature
How does Ricci flow behave when the unit of length is changed? A tempting but wrong normalization is to replace $g(t)$ by $\lambda g(t)$ while keeping the same time parameter; differentiating gives an extra factor $\lambda$ on the left-hand side but not on $\operatorname{Ric}$, which is unchanged by constant scaling, so the equation is no longer balanced unless $\lambda=1$ or the flow is Ricci-flat. A parabolic equation should scale time like length squared, and Ricci flow follows this rule because curvature has the dimension of inverse length squared.
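The imbalance can be written out in one line. In the following sketch, $h$ is a name introduced here for the naively rescaled family $h(t)=\lambda g(t)$ with constant $\lambda>0$. Since $\operatorname{Ric}(\lambda g)=\operatorname{Ric}(g)$ as a $(0,2)$-tensor,
\begin{align*}
\frac{\partial}{\partial t}h(t)=\lambda\frac{\partial}{\partial t}g(t)=-2\lambda\operatorname{Ric}(g(t))=-2\lambda\operatorname{Ric}(h(t)),
\end{align*}
which equals $-2\operatorname{Ric}(h(t))$ only when $\lambda=1$ or $\operatorname{Ric}(g(t))=0$. Slowing the clock as well, $t\mapsto\lambda g(t/\lambda)$, supplies the missing factor $1/\lambda$ through the chain rule:
\begin{align*}
\frac{\partial}{\partial t}\bigl(\lambda g(t/\lambda)\bigr)=\lambda\cdot\frac{1}{\lambda}\,(\partial_tg)(t/\lambda)=-2\operatorname{Ric}(g(t/\lambda))=-2\operatorname{Ric}\bigl(\lambda g(t/\lambda)\bigr).
\end{align*}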
[quotetheorem:5963]
[citeproof:5963]
The condition $\lambda>0$ is essential because a negative multiple of a Riemannian metric is not a Riemannian metric, and $\lambda=0$ collapses all tangent lengths. The interval statement matters as well: if $I=[0,T)$, then the rescaled solution lives on $[0,\lambda T)$, so the same geometric path is being observed with a different clock. The theorem does not say that every reparametrization of time gives another solution; the linear rescaling above is tied to constant spatial scaling. This is the symmetry used in blow-up arguments, but singularity analysis also needs the size of curvature after rescaling. The next computation fixes the conventions for Riemann, Ricci, scalar curvature, and curvature norms under constant metric scaling.
Using the wrong curvature scaling gives the wrong singularity model. For instance, if a point has curvature norm $Q$ and the metric is replaced by $Qg$, treating the curvature norm as unchanged would leave a model with curvature $Q$ rather than the intended normalized size. The following theorem prevents that mistake by separating the tensor type from the norm used to measure it.
[quotetheorem:5965]
[citeproof:5965]
The constant-scaling hypothesis is needed because variable conformal changes introduce derivative terms into the connection and curvature. The theorem also depends on the tensor type being stated: $\operatorname{Rm}$ as a $(1,3)$-tensor is unchanged, while the fully covariant $(0,4)$ tensor gains a factor of $\lambda$. It does not say that geometric curvature has become larger after enlarging lengths; the norm formula says the opposite, since doubling squared lengths halves curvature size. The point is practical: a region with curvature size $Q$ is rescaled by $Q$ so that its new curvature size is of order $1$. Time is rescaled by the same factor, so the next example records the local zoom used near a neck singularity.
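The bookkeeping behind the norm formula is a count of metric factors. As a short index sketch, write $\tilde g=\lambda g$ with $\lambda>0$ constant, so that $\tilde g^{ij}=\lambda^{-1}g^{ij}$ and the covariant curvature components satisfy $\tilde R_{ijkl}=\lambda R_{ijkl}$. Then
\begin{align*}
|\operatorname{Rm}(\tilde g)|_{\tilde g}^2
=\tilde g^{ia}\tilde g^{jb}\tilde g^{kc}\tilde g^{ld}\,\tilde R_{ijkl}\tilde R_{abcd}
=\lambda^{-4}\lambda^{2}\,g^{ia}g^{jb}g^{kc}g^{ld}R_{ijkl}R_{abcd}
=\lambda^{-2}|\operatorname{Rm}(g)|_g^2,
\end{align*}
so $|\operatorname{Rm}(\tilde g)|_{\tilde g}=\lambda^{-1}|\operatorname{Rm}(g)|_g$. The same count gives $\tilde R_{ij}=R_{ij}$ for the Ricci tensor and $S(\tilde g)=\tilde g^{ij}\tilde R_{ij}=\lambda^{-1}S(g)$ for the scalar curvature.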
[example: Neckpinch Scaling on $S^2 \times \mathbb R$]
[illustration:neck-blowup-cylinder-limit]
Consider a rotationally symmetric Ricci flow $g(t)$ on $S^2\times \mathbb R$ whose middle spherical cross-sections become small. Choose points and times $(p_k,t_k)$ with
\begin{align*}
Q_k=|\operatorname{Rm}(g(t_k))|_{g(t_k)}(p_k)\to\infty,
\end{align*}
and define the rescaled flow, whenever $t_k+s/Q_k$ lies in the original time interval, by
\begin{align*}
g_k(s)=Q_k g(t_k+s/Q_k).
\end{align*}
Equivalently, the new time coordinate is
\begin{align*}
s=Q_k(t-t_k).
\end{align*}
By *Parabolic Scaling of Ricci Flow*, each $g_k(s)$ is again a Ricci flow on its rescaled time interval.
At the base time $s=0$, the definition gives
\begin{align*}
g_k(0)=Q_k g(t_k).
\end{align*}
Apply *Scaling Law for Curvature* with $\lambda=Q_k$ to the metric $g(t_k)$. The curvature norm at $p_k$ transforms as
\begin{align*}
|\operatorname{Rm}(Q_k g(t_k))|_{Q_k g(t_k)}(p_k)=Q_k^{-1}|\operatorname{Rm}(g(t_k))|_{g(t_k)}(p_k).
\end{align*}
Since $g_k(0)=Q_k g(t_k)$, the left-hand side is
\begin{align*}
|\operatorname{Rm}(g_k(0))|_{g_k(0)}(p_k).
\end{align*}
By the definition of $Q_k$,
\begin{align*}
|\operatorname{Rm}(g(t_k))|_{g(t_k)}(p_k)=Q_k.
\end{align*}
Substituting this into the scaling formula gives
\begin{align*}
|\operatorname{Rm}(g_k(0))|_{g_k(0)}(p_k)=Q_k^{-1}Q_k=1.
\end{align*}
Thus the blow-up rescaling normalizes the curvature at the chosen basepoint instead of allowing it to diverge. Since multiplying the metric by $Q_k$ multiplies lengths by $Q_k^{1/2}$, the narrowing neck is viewed at its own curvature scale; in the standard neckpinch scenario, these normalized flows approach a shrinking round cylinder.
[/example]
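The normalization can be tested on the exact shrinking cylinder, used here as a stand-in for the neck region; this substitution is an assumption of the sketch, since a general rotationally symmetric neckpinch is only approximately cylindrical. On $S^2\times\mathbb R$ the sphere factor is $a(t)=1-2t$, and $|\operatorname{Rm}|=2/a$ because the only nonzero sectional curvature is $1/a$ on the $S^2$ factor. The function names below are hypothetical.

```python
from fractions import Fraction


def sphere_factor(t):
    """a(t) = 1 - 2t: scale of the S^2 factor of the shrinking cylinder on S^2 x R."""
    a = 1 - 2 * Fraction(t)
    if a <= 0:
        raise ValueError("time must be smaller than the singular time 1/2")
    return a


def rm_norm(a):
    """|Rm| of a S^2-factor of scale a times a line: |Rm|^2 = 4 K^2 with K = 1/a."""
    return Fraction(2) / a


def rescaled_sphere_factor(t_k, s):
    """Sphere factor of g_k(s) = Q_k g(t_k + s/Q_k), where Q_k = |Rm| at time t_k."""
    Q_k = rm_norm(sphere_factor(t_k))
    return Q_k * sphere_factor(Fraction(t_k) + Fraction(s) / Q_k)


if __name__ == "__main__":
    for t_k in (Fraction(0), Fraction(1, 4), Fraction(49, 100)):
        a0 = rescaled_sphere_factor(t_k, 0)
        # Print the base time, the rescaled sphere factor, and the normalized curvature.
        print(t_k, a0, rm_norm(a0))
```

For this model the rescaled sphere factor works out to $2-2s$ for every base time $t_k$, so the curvature norm at $s=0$ is always $1$, and the rescaled flows coincide up to the isometry $z\mapsto Q_k^{1/2}z$ of the line factor. That is the exact version of the statement that the normalized flows approach a shrinking round cylinder.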
## Constant-Curvature Model Solutions
Which solutions should be kept in mind when reading the general theory? The constant-curvature metrics give the basic signs of the equation: positive curvature shrinks, zero curvature stays fixed, and negative curvature expands.
[definition: Einstein Metric]
Let $M$ be a smooth $n$-manifold. A Riemannian metric $g$ on $M$ is Einstein if there exists a constant $\kappa \in \mathbb R$ such that
\begin{align*}
\operatorname{Ric}(g)=\kappa g.
\end{align*}
[/definition]
Einstein metrics are the natural finite-dimensional test cases because the Ricci tensor points exactly in the scaling direction of the metric. Without the Einstein condition, a single scale factor cannot solve the equation: different tangent directions may have different Ricci eigenvalues, so the metric changes shape as well as size. The next result turns the Einstein condition into the ordinary differential equation governing the scale factor.
[quotetheorem:5966]
[citeproof:5966]
The Einstein hypothesis is doing the work: it forces the Ricci tensor to be parallel to $g_0$, so the flow cannot leave the one-dimensional family of constant multiples of $g_0$. A product metric with unequal curvature factors gives a counterexample to such homothetic behaviour, because the factors evolve at different rates. The condition $1-2\kappa t>0$ is not a cosmetic restriction; once the scale factor reaches $0$, the formula no longer defines a Riemannian metric. The theorem does not assert that all finite-time singularities are homothetic collapses, since neckpinches and other anisotropic singularities have different local models. It packages the round sphere calculation and also describes flat and hyperbolic examples, so the sign of $\kappa$ now gives the three basic behaviours.
[example: Flat Tori and Hyperbolic Metrics]
Let $T^n=\mathbb R^n/\mathbb Z^n$ carry a flat metric $g_0$. Flatness gives $\operatorname{Ric}(g_0)=0$, and the constant family $g(t)=g_0$ has
\begin{align*}
\frac{\partial}{\partial t}g(t)=0.
\end{align*}
Since $\operatorname{Ric}(g(t))=\operatorname{Ric}(g_0)=0$, the Ricci flow equation becomes
\begin{align*}
\frac{\partial}{\partial t}g(t)=-2\operatorname{Ric}(g(t))=0.
\end{align*}
Thus every flat torus is stationary under Ricci flow.
Now let $g_0$ be a hyperbolic metric on a closed $n$-manifold, normalized so that every sectional curvature is $-1$. Then
\begin{align*}
\operatorname{Ric}(g_0)=-(n-1)g_0.
\end{align*}
Look for a homothetic solution $g(t)=a(t)g_0$ with $a(0)=1$. Constant scaling leaves Ricci curvature unchanged as a $(0,2)$-tensor, so
\begin{align*}
\operatorname{Ric}(g(t))=\operatorname{Ric}(a(t)g_0)=\operatorname{Ric}(g_0)=-(n-1)g_0.
\end{align*}
The time derivative of the ansatz is
\begin{align*}
\frac{\partial}{\partial t}g(t)=\frac{\partial}{\partial t}\bigl(a(t)g_0\bigr)=a'(t)g_0.
\end{align*}
Substituting these identities into $\partial_t g(t)=-2\operatorname{Ric}(g(t))$ gives
\begin{align*}
a'(t)g_0=-2\bigl(-(n-1)g_0\bigr)=2(n-1)g_0.
\end{align*}
Since $g_0$ is not the zero tensor, the scalar coefficient satisfies
\begin{align*}
a'(t)=2(n-1).
\end{align*}
Integrating from $0$ to $t$ gives
\begin{align*}
a(t)-a(0)=\int_0^t 2(n-1)\,d\tau=2(n-1)t.
\end{align*}
Using $a(0)=1$, we obtain
\begin{align*}
a(t)=1+2(n-1)t.
\end{align*}
Therefore
\begin{align*}
g(t)=\bigl(1+2(n-1)t\bigr)g_0.
\end{align*}
By the curvature scaling law for constant metric rescalings, the sectional curvature is multiplied by the reciprocal scale factor, so
\begin{align*}
K_{g(t)}=\bigl(1+2(n-1)t\bigr)^{-1}K_{g_0}=-\frac{1}{1+2(n-1)t}.
\end{align*}
Thus flat metrics remain fixed, while hyperbolic metrics expand linearly in scale and their sectional curvature approaches $0$ from below as $t\to\infty$.
[/example]
The cylinder is the first noncompact model that behaves differently in different directions. It is also the local shape expected in many neck singularities, so it deserves a separate calculation rather than being treated as another Einstein example.
[example: Shrinking Round Cylinders]
[illustration:shrinking-cylinder-vs-sphere]
Let $n \ge 3$, and write the product metric as
\begin{align*}
g_0=g_{S^{n-1}}+dz^2,
\end{align*}
where $g_{S^{n-1}}$ is the round metric of sectional curvature $1$ on $S^{n-1}$ and $dz^2$ is the Euclidean metric on $\mathbb R$. The round unit $(n-1)$-sphere has Ricci tensor $(n-2)g_{S^{n-1}}$, while the line has Ricci tensor $0$, so the product metric has
\begin{align*}
\operatorname{Ric}(g_0)=(n-2)g_{S^{n-1}}+0\cdot dz^2.
\end{align*}
Look for a solution that changes only the spherical factor:
\begin{align*}
g(t)=a(t)g_{S^{n-1}}+dz^2,
\end{align*}
with $a(0)=1$. Constant scaling of the spherical metric leaves its Ricci tensor unchanged as a $(0,2)$-tensor, and the line factor remains flat, hence
\begin{align*}
\operatorname{Ric}(g(t))=(n-2)g_{S^{n-1}}+0\cdot dz^2.
\end{align*}
The time derivative of the ansatz is
\begin{align*}
\frac{\partial}{\partial t}g(t)=a'(t)g_{S^{n-1}}+0\cdot dz^2.
\end{align*}
Substituting this and the Ricci tensor into $\partial_t g(t)=-2\operatorname{Ric}(g(t))$ gives
\begin{align*}
a'(t)g_{S^{n-1}}+0\cdot dz^2=-2(n-2)g_{S^{n-1}}+0\cdot dz^2.
\end{align*}
Comparing the spherical coefficients gives
\begin{align*}
a'(t)=-2(n-2).
\end{align*}
Integrating from $0$ to $t$ gives
\begin{align*}
a(t)-a(0)=\int_0^t -2(n-2)\,d\tau.
\end{align*}
The integral is
\begin{align*}
\int_0^t -2(n-2)\,d\tau=-2(n-2)t,
\end{align*}
and $a(0)=1$, so
\begin{align*}
a(t)=1-2(n-2)t.
\end{align*}
Therefore
\begin{align*}
g(t)=\bigl(1-2(n-2)t\bigr)g_{S^{n-1}}+dz^2.
\end{align*}
This is a Riemannian metric exactly while
\begin{align*}
1-2(n-2)t>0,
\end{align*}
equivalently
\begin{align*}
t<\frac{1}{2(n-2)}.
\end{align*}
The spherical cross-sections shrink to zero scale at time $\frac{1}{2(n-2)}$, while the axial term $dz^2$ never changes. When $n=2$, the spherical factor is $S^1$, whose Ricci curvature is zero, so the same coefficient equation becomes $a'(t)=0$ and the product flat cylinder $S^1\times\mathbb R$ is stationary.
[/example]
The cylinder example explains why Ricci flow singularities are not merely global collapses. A solution may develop high curvature in some directions while retaining an extended direction after rescaling.
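As a hedged supplement to the cylinder example, the curvature blow-up rate can be read off from the explicit solution. The symbols $T$ (singular time) and $S(g(t))$ (scalar curvature of $g(t)$) are introduced here for illustration, and the computation assumes only the standard fact that the round $(n-1)$-sphere of sectional curvature $1$ has scalar curvature $(n-1)(n-2)$, which scales inversely with the metric.
\begin{align*}
T&=\frac{1}{2(n-2)},\qquad a(t)=1-2(n-2)t=2(n-2)(T-t),\\
S(g(t))&=\frac{(n-1)(n-2)}{a(t)}=\frac{n-1}{2(T-t)}.
\end{align*}
For $n=3$ this gives $g(t)=(1-2t)g_{S^2}+dz^2$, $T=\tfrac12$, and $S(g(t))=\frac{2}{1-2t}$, so the curvature grows like $(T-t)^{-1}$ while the $\mathbb R$ factor stays flat.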
## Diffeomorphism Invariance and Weak Parabolicity
Why is Ricci flow harder than a standard nonlinear heat equation for the metric coefficients? The equation cannot distinguish a genuine geometric change from a time-dependent change of coordinates, so its linearization has degenerate directions coming from diffeomorphisms. If this invariance were ignored, two coordinate descriptions of the same evolving geometry would be treated as different analytic solutions, and any strict parabolicity statement for the raw metric coefficients would contradict that freedom.
[quotetheorem:5967]
[citeproof:5967]
The diffeomorphism being fixed in time is part of the statement; pulling back by a time-dependent family of diffeomorphisms introduces an additional Lie derivative term, so the pulled-back family no longer solves the same equation without modification. The diffeomorphism hypothesis is also essential, since an arbitrary smooth map need not pull a Riemannian metric back to a Riemannian metric (the pullback degenerates wherever the differential fails to be injective), and Ricci curvature is natural only under smooth changes of coordinates. The theorem does not identify two pulled-back flows as the same analytic curve of tensors; it says they represent the same geometric evolution in different coordinates. This invariance is geometrically necessary because Ricci flow should evolve geometry, not a preferred coordinate representation. Analytically it creates a degeneracy, and to name the degenerate directions we need the infinitesimal change in the metric generated by flowing along a vector field.
[definition: Lie Derivative Direction of a Metric]
Let $M$ be a smooth manifold, let $g$ be a Riemannian metric, and let $X\in \mathfrak X(M)$ be a smooth vector field. The Lie derivative direction of $g$ generated by $X$ is the symmetric $(0,2)$-tensor
\begin{align*}
\mathcal L_X g.
\end{align*}
[/definition]
These directions are tangent to the orbit of the diffeomorphism group through the metric. A strictly parabolic equation would control all high-frequency variations of the unknown, so the next point is to identify why Ricci flow fails that test before gauge-fixing.
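A short worked formula may help connect this definition with the earlier remark about time-dependent diffeomorphisms. The sketch assumes the Levi-Civita connection $\nabla$ of $g$, writes $X_j=\sum_k g_{jk}X^k$, and takes a smooth family $\psi_t$ of diffeomorphisms with $\partial_t\psi_t=Y_t\circ\psi_t$; the names $\psi_t$ and $Y_t$ are introduced here only for illustration.
\begin{align*}
(\mathcal L_Xg)_{ij}&=\nabla_iX_j+\nabla_jX_i,\\
\frac{\partial}{\partial t}\bigl(\psi_t^*g(t)\bigr)&=\psi_t^*\Bigl(\frac{\partial}{\partial t}g(t)\Bigr)+\psi_t^*\bigl(\mathcal L_{Y_t}g(t)\bigr).
\end{align*}
If $g(t)$ solves Ricci flow, the first term on the right equals $-2\operatorname{Ric}(\psi_t^*g(t))$ by naturality of the Ricci tensor, so the pulled-back family satisfies Ricci flow only up to the extra Lie derivative term, which vanishes when $\psi_t$ is independent of $t$.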
[remark: Weak Parabolicity]
In local coordinates, the highest-order part of the Ricci tensor resembles a Laplacian on the metric coefficients together with second-derivative terms produced by coordinate freedom. The diffeomorphism directions are responsible for the missing strict ellipticity in the spatial operator. Hamilton's short-time existence theorem is therefore not obtained by applying a standard strictly parabolic existence theorem directly to the Ricci flow equation written in coordinates.
[/remark]
This remark is the bridge to Chapter 2. There the DeTurck trick adds a carefully chosen Lie derivative term to fix the gauge, solves a strictly parabolic Ricci-DeTurck equation, and then pulls the solution back by diffeomorphisms.
## Isometries Along the Flow
Which symmetries of the initial geometry survive under Ricci flow? A possible failure mode would be that a symmetric initial metric immediately evolves into a less symmetric metric because the PDE chooses one representative among many coordinate descriptions. Since the equation is built naturally from the metric, pullback invariance and uniqueness rule out that failure whenever uniqueness is available.
[definition: Isometry Group]
Let $(M,g)$ be a Riemannian manifold. The isometry group of $(M,g)$ is
\begin{align*}
\operatorname{Isom}(M,g)=\{\phi \in \operatorname{Diff}(M):\phi^*g=g\}.
\end{align*}
[/definition]
The definition turns a geometric symmetry into an equation for pullbacks of the metric. The remaining issue is whether the flow could break that equation by choosing coordinates asymmetrically after time starts. Pullback invariance alone only shows that an isometry produces another solution with the same initial data; uniqueness is what identifies that pulled-back solution with the original flow.
[quotetheorem:5968]
[citeproof:5968]
Closedness is included because the uniqueness theorem being invoked here is Hamilton uniqueness on closed manifolds. On noncompact manifolds, uniqueness can require completeness, curvature bounds, or other hypotheses; without such assumptions, the same short argument is not available. The initial isometry condition is also necessary: Ricci flow does not create a prescribed symmetry from an asymmetric initial metric by this argument. The theorem does not say that the full isometry group is unchanged as an abstract group, since additional isometries may appear at later times in special situations. What it gives is a reliable inclusion of the initial symmetry group into the later symmetry groups, which lets symmetric initial data be studied inside a smaller ansatz such as rotationally symmetric metrics or homogeneous metrics. Later examples use this reduction to convert geometric PDE into coupled scalar equations without losing the symmetry forced by the initial metric.
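The argument behind the symmetry statement can be sketched in one chain, under the assumptions that $M$ is closed, $g(t)$ is the Ricci flow with $g(0)=g_0$, and $\phi\in\operatorname{Isom}(M,g_0)$; the auxiliary name $\tilde g(t)$ is introduced here.
\begin{align*}
\tilde g(t)&:=\phi^*g(t),\\
\partial_t\tilde g(t)&=\phi^*\bigl(-2\operatorname{Ric}(g(t))\bigr)=-2\operatorname{Ric}(\tilde g(t)),\\
\tilde g(0)&=\phi^*g_0=g_0.
\end{align*}
Uniqueness on closed manifolds then gives $\tilde g(t)=g(t)$, that is, $\phi^*g(t)=g(t)$ for every $t$ in the existence interval, so $\operatorname{Isom}(M,g_0)\subseteq\operatorname{Isom}(M,g(t))$.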
Once the basic definition and symmetries of Ricci flow are in place, the key remaining issue is analytic well-posedness. Hamilton's short-time existence theorem resolves that difficulty by showing how to make the system genuinely parabolic after accounting for gauge freedom.
# 2. Hamilton Short-Time Existence
Chapter 1 introduced Ricci flow as the equation
\begin{align*}
\partial_t g(t)=-2\operatorname{Ric}(g(t))
\end{align*}
and showed why its model solutions behave like heat equations for curvature. The analytic difficulty is that the metric itself is not governed by a strictly parabolic system in any coordinate-free sense. This chapter explains Hamilton's short-time existence theorem through DeTurck's gauge-fixing method: first isolate the degeneracy, then add a carefully chosen Lie derivative term, and finally undo the gauge by a family of diffeomorphisms.
The guiding idea is that Ricci flow is parabolic only modulo diffeomorphisms. The DeTurck trick turns this geometric statement into a classical quasilinear parabolic system, where standard existence theory applies. Once the gauged flow has been solved, the gauge is removed by pulling the solution back along diffeomorphisms generated by the DeTurck vector field.
## Linearization and Gauge Degeneracy
Why does the heat-equation intuition from the model solutions fail to give short-time existence directly? The obstruction is not that the Ricci tensor lacks second derivatives of the metric, but that its second-order part has directions corresponding to coordinate changes. These directions make the principal symbol non-invertible, so the usual parabolic theory for systems cannot be applied to the Ricci flow equation in its raw form.
Before computing the obstruction, we record the analytic meaning of the leading part of a geometric differential operator.
[definition: Principal Symbol of a Second-Order Metric Operator]
Let $(M,g)$ be a Riemannian manifold and let
\begin{align*}
P:\Gamma(S^2T^*M)\longrightarrow \Gamma(S^2T^*M)
\end{align*}
be a linear second-order differential operator on smooth symmetric $2$-tensors. Let $\operatorname{Met}(M)$ denote the open subset of $\Gamma(S^2T^*M)$ consisting of smooth positive-definite symmetric $2$-tensors, and let
\begin{align*}
\mathcal{P}:\operatorname{Met}(M)\longrightarrow \Gamma(S^2T^*M)
\end{align*}
be a nonlinear second-order metric operator. The principal symbol of $\mathcal{P}$ at a metric $g\in\operatorname{Met}(M)$ means the principal symbol of the linearized operator
\begin{align*}
D\mathcal{P}_g:\Gamma(S^2T^*M)\longrightarrow \Gamma(S^2T^*M).
\end{align*}
For $p\in M$ and $\xi \in T_p^*M$, the principal symbol of $P$ at $\xi$ is the [linear map](/page/Linear%20Map)
\begin{align*}
\sigma_\xi(P):S^2T_p^*M\longrightarrow S^2T_p^*M
\end{align*}
obtained by retaining only the second-derivative terms of $P$ and replacing each covariant derivative $\nabla_i$ by multiplication by $\xi_i$.
[/definition]
The principal symbol is the part of a linearized equation that decides whether a quasilinear system has heat-type smoothing. To decide whether raw Ricci flow meets this test, we need the next computation of the symbol of $g\mapsto -2\operatorname{Ric}(g)$.
[quotetheorem:5969]
[citeproof:5969]
This computation identifies the analytic defect precisely. The hypothesis $\xi\ne 0$ is essential because the zero covector carries no second-order information: at $\xi=0$ the displayed symbol is the zero map for every second-order operator, so it cannot test ellipticity or parabolicity. The non-zero-covector conclusion is also sharp. On Euclidean space with its flat metric, take $\xi=dx_1$ and $h=2\,dx_1\otimes dx_1$, which is the symbol-level form of $\mathcal{L}_Xg$ for a high-frequency vector field in the $x_1$ direction; substituting gives zero, so the raw Ricci operator has an explicit diffeomorphism kernel direction. By contrast, a transverse tensor such as $h=dx_2\otimes dx_2$ when $n\ge 2$ is not a coordinate-drift example of this form. The theorem therefore does not say that every perturbation is degenerate; it says that ordinary strictly parabolic theory cannot be applied to the unmodified Ricci flow operator, which forces the gauge-fixing step below.
[example: Degenerate Direction from a Coordinate Change]
Let $X$ be a smooth vector field and set $h=\mathcal{L}_Xg$, so $h_{ij}=\nabla_iX_j+\nabla_jX_i$. At the level of principal symbols, replacing $\nabla_i$ by $\xi_i$ and writing $\eta_j$ for the symbol of $X_j$ gives
\begin{align*}
h_{ij}=\xi_i\eta_j+\xi_j\eta_i.
\end{align*}
We compute the symbol from *[Principal Symbol and Degeneracy of the Linearized Ricci Flow Operator](/theorems/5969)* on this tensor. First,
\begin{align*}
\operatorname{tr}_g h=\sum_{k,\ell}(g^{-1})_{k\ell}(\xi_k\eta_\ell+\xi_\ell\eta_k).
\end{align*}
Since $(g^{-1})_{k\ell}=(g^{-1})_{\ell k}$, the second summand becomes the first after interchanging the dummy indices $k$ and $\ell$, hence
\begin{align*}
\operatorname{tr}_g h=2\sum_{k,\ell}(g^{-1})_{k\ell}\xi_k\eta_\ell.
\end{align*}
Write $\langle \xi,\eta\rangle_g=\sum_{k,\ell}(g^{-1})_{k\ell}\xi_k\eta_\ell$, $|\xi|_g^2=\langle \xi,\xi\rangle_g$, and $\xi^k=\sum_\ell(g^{-1})_{k\ell}\xi_\ell$. Then
\begin{align*}
\sum_{k,\ell}\xi_i(g^{-1})_{k\ell}\xi_\ell h_{kj}=\xi_i\sum_{k,\ell}(g^{-1})_{k\ell}\xi_\ell(\xi_k\eta_j+\xi_j\eta_k).
\end{align*}
Expanding the two summands gives
\begin{align*}
\sum_{k,\ell}\xi_i(g^{-1})_{k\ell}\xi_\ell h_{kj}=\xi_i|\xi|_g^2\eta_j+\xi_i\xi_j\langle \xi,\eta\rangle_g.
\end{align*}
The same calculation with $i$ and $j$ interchanged gives
\begin{align*}
\sum_{k,\ell}\xi_j(g^{-1})_{k\ell}\xi_\ell h_{ki}=\xi_j|\xi|_g^2\eta_i+\xi_i\xi_j\langle \xi,\eta\rangle_g.
\end{align*}
Substituting these identities into the principal symbol gives
\begin{align*}
\sigma_\xi(D(-2\operatorname{Ric})_g)(h)_{ij}=|\xi|_g^2(\xi_i\eta_j+\xi_j\eta_i)+2\xi_i\xi_j\langle \xi,\eta\rangle_g-(\xi_i|\xi|_g^2\eta_j+\xi_i\xi_j\langle \xi,\eta\rangle_g)-(\xi_j|\xi|_g^2\eta_i+\xi_i\xi_j\langle \xi,\eta\rangle_g).
\end{align*}
The two $|\xi|_g^2$ terms cancel the two terms in $|\xi|_g^2(\xi_i\eta_j+\xi_j\eta_i)$, and the two copies of $\xi_i\xi_j\langle \xi,\eta\rangle_g$ subtract the trace contribution, so
\begin{align*}
\sigma_\xi(D(-2\operatorname{Ric})_g)(h)_{ij}=0.
\end{align*}
Thus the symbol annihilates infinitesimal coordinate-change tensors $h=\mathcal{L}_Xg$, which is exactly the diffeomorphism degeneracy of the unmodified Ricci flow operator.
[/example]
The example explains why a geometric flow can fail the standard PDE test while still being expected to exist. This is the infinitesimal version of the pullback invariance from Chapter 1: the equation determines the metric modulo diffeomorphism freedom. The next step is to prescribe a coordinate gauge dynamically.
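The two Euclidean test tensors mentioned before the example can be checked by hand. This sketch assumes the symbol has the form used in the substitution step of the example, namely $\sigma_\xi(h)_{ij}=|\xi|_g^2h_{ij}+\xi_i\xi_j\operatorname{tr}_gh-\xi_i\sum_k\xi^kh_{kj}-\xi_j\sum_k\xi^kh_{ki}$, and works on $\mathbb R^n$ with the flat metric and $\xi=dx_1$, so that $|\xi|^2=1$ and $\sum_k\xi^kh_{kj}=h_{1j}$.
\begin{align*}
h=2\,dx_1\otimes dx_1:&\quad \sigma_\xi(h)=(2+2-2-2)\,dx_1\otimes dx_1=0,\\
h=dx_2\otimes dx_2:&\quad \sigma_\xi(h)=dx_2\otimes dx_2+dx_1\otimes dx_1\ne 0.
\end{align*}
The first tensor is a pure gauge direction, while the second is detected by the symbol, in line with the statement that not every perturbation is degenerate.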
## The DeTurck Vector Field and the Gauged Equation
How can one add a term that removes the diffeomorphism degeneracy without changing the geometric content of the flow? DeTurck's idea is to compare the evolving metric with a fixed background metric and use the difference of the two Levi-Civita connections to produce a vector field. The Lie derivative along this vector field supplies exactly the missing second-order terms.
We fix a smooth background metric $\bar{g}$ on the compact manifold $M$. The comparison with $\bar{g}$ is auxiliary; it supplies a reference connection and disappears after the gauge is undone.
[definition: DeTurck Vector Field]
Let
\begin{align*}
\operatorname{Met}(M)=\{g\in\Gamma(S^2T^*M):g_p \text{ is positive definite for every } p\in M\}
\end{align*}
be the open subset of smooth positive-definite symmetric $2$-tensor fields on $M$. The DeTurck vector field is the map
\begin{align*}
W:\operatorname{Met}(M)\times \operatorname{Met}(M)\longrightarrow \Gamma(TM).
\end{align*}
It sends $(g,\bar{g})$ to the vector field $W(g,\bar{g})$. In local coordinates, write
\begin{align*}
W(g,\bar{g})=\sum_k W(g,\bar{g})^k\,\partial_{x_k}.
\end{align*}
The coefficients are
\begin{align*}
W(g,\bar{g})^k=\sum_{i,j}g^{ij}\left(\Gamma(g)^k_{ij}-\Gamma(\bar{g})^k_{ij}\right),
\end{align*}
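where $\Gamma(g)^k_{ij}$ and $\Gamma(\bar g)^k_{ij}$ are the usual Christoffel symbols. When a lower output index is used later, we write $W_k=g_{k\ell}W^\ell$ and $\Gamma(g)_{ij,k}=g_{k\ell}\Gamma(g)^\ell_{ij}$; for the background symbols the index is lowered with the same evolving metric, $\Gamma(\bar g)_{ij,k}=g_{k\ell}\Gamma(\bar g)^\ell_{ij}$, so that $W_k$ is the difference of the two lowered traces.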
[/definition]
The difference of two Christoffel symbols is tensorial, so this formula defines a genuine vector field even though the symbols separately depend on coordinates. To use this vector field in the PDE, we build it into the metric evolution as a Lie derivative term.
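As an illustrative computation, not taken from the source, consider a coordinate chart on which the background metric is Euclidean, $\bar g_{ij}=\delta_{ij}$, and the evolving metric is conformal to it, $g_{ij}=e^{2u}\delta_{ij}$ for a smooth function $u$. These choices and the name $u$ are assumptions made for the sketch. Then $\Gamma(\bar g)^k_{ij}=0$ and
\begin{align*}
\Gamma(g)^k_{ij}&=\delta_{ik}\,\partial_ju+\delta_{jk}\,\partial_iu-\delta_{ij}\,\partial_ku,\\
W(g,\bar g)^k&=\sum_{i,j}e^{-2u}\delta_{ij}\,\Gamma(g)^k_{ij}=(2-n)\,e^{-2u}\,\partial_ku.
\end{align*}
In particular $W$ vanishes identically in this chart when $n=2$, and otherwise it involves exactly one derivative of $g$, which is why $\mathcal L_Wg$ contributes second-order terms.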
[definition: Ricci-DeTurck Flow]
Let $\bar{g}\in \operatorname{Met}(M)$ be a fixed background metric and let $T>0$. A Ricci-DeTurck flow with background $\bar{g}$ is a smooth map
\begin{align*}
\hat{g}:[0,T]\longrightarrow \operatorname{Met}(M)\subset \Gamma(S^2T^*M),\qquad t\longmapsto \hat{g}(t),
\end{align*}
such that
\begin{align*}
\partial_t\hat{g}(t)=-2\operatorname{Ric}(\hat{g}(t))+\mathcal{L}_{W(\hat{g}(t),\bar{g})}\hat{g}(t)
\end{align*}
for all $t\in[0,T]$.
[/definition]
The sign convention is chosen so that the added Lie derivative cancels the bad part of the Ricci symbol. The reason for introducing this equation is the following parabolicity result: in local coordinates its leading part becomes the metric Laplacian on the components of $\hat{g}$.
[quotetheorem:5970]
[citeproof:5970]
This theorem is the analytic core of Hamilton's short-time existence argument. Positive definiteness is a necessary hypothesis: if in local coordinates the leading coefficient were $\operatorname{diag}(1,0,\dots,0)$, then the covector $dx_2$ would give zero principal symbol, while a coefficient matrix with a negative entry would produce backward heat behaviour in that direction. The uniform version also needs eigenvalue bounds: the matrices $\operatorname{diag}(1,\varepsilon)$ on a two-dimensional chart are positive definite for every $\varepsilon>0$, but the ellipticity constant tends to $0$ as $\varepsilon\downarrow 0$, so uniform parabolic estimates cannot hold with constants independent of $\varepsilon$. Compactness avoids boundary conditions and gives uniform control on the coefficients for a short time; on an open interval or a manifold with boundary, the scalar heat equation already shows that uniqueness and existence require boundary conditions. Smoothness of $\bar{g}$ and of the evolving coefficients is also part of the hypothesis: if $\bar{g}$ has only continuous first derivatives, then the displayed $Q_{ij}$ need not be a smooth function of the listed arguments because $\partial^2\bar{g}$ may fail to exist classically. The theorem does not itself solve the initial value problem. Its role is to put the gauged equation into the exact class where quasilinear parabolic existence theory applies.
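At the level of principal symbols, the cancellation behind this theorem can be sketched as follows. The sketch assumes the Ricci symbol in the form $\sigma_\xi(D(-2\operatorname{Ric})_g)(h)_{ij}=|\xi|_g^2h_{ij}+\xi_i\xi_j\operatorname{tr}_gh-\xi_i\xi^kh_{kj}-\xi_j\xi^kh_{ki}$ used in the earlier degenerate-direction example, and writes $\eta_j$ for the first-order symbol of the linearization of $W_j$ in $g$, with the background $\bar g$ held fixed. The notation $D(\mathcal L_Wg)_g$ is introduced here, and summation over the repeated index $k$ is understood.
\begin{align*}
\eta_j&=\xi^kh_{kj}-\tfrac12\xi_j\operatorname{tr}_gh,\\
\sigma_\xi\bigl(D(\mathcal L_Wg)_g\bigr)(h)_{ij}&=\xi_i\eta_j+\xi_j\eta_i=\xi_i\xi^kh_{kj}+\xi_j\xi^kh_{ki}-\xi_i\xi_j\operatorname{tr}_gh,\\
\sigma_\xi\bigl(D(-2\operatorname{Ric}+\mathcal L_Wg)_g\bigr)(h)_{ij}&=|\xi|_g^2h_{ij}.
\end{align*}
The last map is invertible on symmetric $2$-tensors for every $\xi\ne 0$, which is the symbol-level meaning of strict parabolicity.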
[example: Local Coordinate Derivation of the Ricci-DeTurck Operator]
Choose coordinates on $U\subset M$ and write $g_{ij}=\hat g_{ij}$ during the calculation. Put
\begin{align*}
C_j(g)=\sum_{a,b}(g^{-1})_{ab}\Gamma(g)_{ab,j}
\end{align*}
and
\begin{align*}
\bar C_j=\sum_{a,b}(g^{-1})_{ab}\Gamma(\bar g)_{ab,j}.
\end{align*}
With the coordinate convention used in the DeTurck vector field, the relevant covariant component is $W_j=C_j(g)-\bar C_j$. The second-order part of the Ricci term is
\begin{align*}
-2\operatorname{Ric}_{ij}(g)=\sum_{a,b}(g^{-1})_{ab}\partial_a\partial_b g_{ij}-\partial_iC_j(g)-\partial_jC_i(g)+R_{ij}^{(1)}(g^{-1},\partial g),
\end{align*}
where $R_{ij}^{(1)}$ contains only terms with at most one derivative of $g$.
For the Lie derivative term,
\begin{align*}
(\mathcal L_Wg)_{ij}=\nabla_iW_j+\nabla_jW_i.
\end{align*}
Since
\begin{align*}
\nabla_iW_j=\partial_iW_j-\sum_k\Gamma(g)^k_{ij}W_k,
\end{align*}
the product $\sum_k\Gamma(g)^k_{ij}W_k$ contains at most first derivatives of $g$ and of $\bar g$. Thus the second-order contribution of $\mathcal L_Wg$ comes from $\partial_iW_j+\partial_jW_i$. Using $W_j=C_j(g)-\bar C_j$ gives
\begin{align*}
\partial_iW_j+\partial_jW_i=\partial_iC_j(g)+\partial_jC_i(g)-\partial_i\bar C_j-\partial_j\bar C_i.
\end{align*}
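The terms $\partial_i\bar C_j$ and $\partial_j\bar C_i$ contain at most first derivatives of $g$, because $g$ enters $\bar C_j$ only without derivatives, together with up to second derivatives of the fixed background metric $\bar g$. They therefore belong to the lower-order part $Q_{ij}$, as do $R_{ij}^{(1)}$ and the Christoffel products coming from $\nabla_iW_j$.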
Adding the Ricci term and the Lie derivative term gives
\begin{align*}
-2\operatorname{Ric}_{ij}(g)+(\mathcal L_Wg)_{ij}=\sum_{a,b}(g^{-1})_{ab}\partial_a\partial_b g_{ij}-\partial_iC_j(g)-\partial_jC_i(g)+\partial_iC_j(g)+\partial_jC_i(g)+Q_{ij}(g^{-1},\partial g,\bar g^{-1},\partial\bar g,\partial^2\bar g).
\end{align*}
The two terms $-\partial_iC_j(g)$ and $+\partial_iC_j(g)$ cancel, and the two terms $-\partial_jC_i(g)$ and $+\partial_jC_i(g)$ cancel. Hence
\begin{align*}
-2\operatorname{Ric}_{ij}(g)+(\mathcal L_Wg)_{ij}=\sum_{a,b}(g^{-1})_{ab}\partial_a\partial_b g_{ij}+Q_{ij}(g^{-1},\partial g,\bar g^{-1},\partial\bar g,\partial^2\bar g).
\end{align*}
Thus the Ricci-DeTurck equation has leading operator $\sum_{a,b}(g^{-1})_{ab}\partial_a\partial_b$ acting componentwise on $g_{ij}$, so its principal part is the heat operator determined by the evolving positive-definite metric.
[/example]
The coordinate calculation also shows that the background metric enters the gauged equation only through the lower-order term $Q_{ij}$, so it does not affect the principal part. The compactness assumption in the next theorem is used to avoid boundary conditions and to keep the ellipticity constants uniform; on a non-compact manifold, additional completeness and bounded-geometry hypotheses would be needed. The theorem will not yet produce a Ricci flow, because it solves the gauge-fixed equation rather than the original geometric equation. It is nevertheless the decisive PDE input, because the remaining step is an ODE for diffeomorphisms.
[quotetheorem:5971]
[citeproof:5971]
The Ricci-DeTurck theorem proves existence for a gauged equation, not yet for Ricci flow itself. The compactness hypothesis is doing concrete work: on a manifold with boundary, even the scalar heat equation on $[0,1]$ is not a well-posed initial-value problem until Dirichlet, Neumann, or another boundary condition is imposed, and the metric system has the same issue componentwise. On a non-compact manifold, coefficients and their derivatives can fail to remain controlled at infinity; for instance, an operator that is parabolic in each coordinate chart need not admit a single positive lower ellipticity constant valid on the whole manifold unless completeness and bounded-geometry assumptions are imposed. Smoothness of the initial data is also used: rough initial metrics do not define the classical Ricci tensor and DeTurck vector field in the sense used here. The theorem's uniqueness conclusion is only uniqueness within the chosen background and gauge. A Ricci flow pulled back by a time-dependent diffeomorphism need not solve the same initial-value problem unless the diffeomorphism is controlled at time zero and in time; the DeTurck argument proves uniqueness by pushing competing Ricci flows into the same gauged system first. The final step is to convert the gauged solution back into a geometric solution by solving an ordinary differential equation for diffeomorphisms. This is the same structural move that appears in harmonic-map gauge for geometric PDEs and in gauge fixing for Yang-Mills flow: use a reference object to choose coordinates or gauge, solve a parabolic system, and then remove the auxiliary choice.
## Pullback by Harmonic-Map-Gauge Diffeomorphisms
What is the relationship between the gauged solution and the original Ricci flow? The DeTurck vector field records the coordinate drift between the evolving metric and the fixed background. Pulling back by the inverse drift removes the Lie derivative term, leaving the Ricci flow equation.
The following convention fixes the direction of the diffeomorphisms. With $\hat{g}(t)$ solving Ricci-DeTurck flow, let $\varphi_t:M\to M$ solve a time-dependent ODE generated by $-W(\hat{g}(t),\bar{g})$.
[definition: DeTurck Diffeomorphism]
Let $\hat{g}:[0,T]\to\operatorname{Met}(M)$ be a smooth Ricci-DeTurck solution with background $\bar{g}$. The associated DeTurck diffeomorphisms are a smooth map
\begin{align*}
\varphi:[0,T]\longrightarrow \operatorname{Diff}(M),\qquad t\longmapsto \varphi_t,
\end{align*}
where each $\varphi_t:M\to M$ solves
\begin{align*}
\frac{d}{dt}\varphi_t(x)=-W(\hat{g}(t),\bar{g})(\varphi_t(x)),\qquad \varphi_0=\operatorname{id}_M.
\end{align*}
[/definition]
Because $M$ is compact and the vector field is smooth on $M\times[0,T]$, this ODE produces diffeomorphisms throughout the time interval. The point of choosing this particular motion is that the pullback variation contributes a Lie derivative with the opposite sign from the DeTurck term. The theorem records the cancellation that converts the strictly parabolic auxiliary equation back into the geometric Ricci flow.
The remaining question is whether this gauge removal is exact at the level of the nonlinear equation, rather than only a heuristic cancellation of first-order terms. The formal statement below isolates the pullback computation and identifies the initial condition, so that the auxiliary DeTurck solution can be used as a genuine Ricci flow solution.
[quotetheorem:5972]
[citeproof:5972]
This pullback argument completes the existence half of Hamilton's theorem. The sign in the diffeomorphism ODE is necessary: choosing the opposite sign would double the Lie derivative term rather than cancel it. Compactness is needed for the stated global-in-space ODE conclusion; on a non-compact manifold, a smooth time-dependent vector field can have integral curves that escape to infinity in finite time, such as $\dot{x}=x^2$ on $\mathbb R$. Smoothness is needed for the same reason: a merely continuous vector field can fail to generate a unique flow, as in the scalar ODE $\dot{x}=|x|^{1/2}$ with $x(0)=0$. The theorem does not say that the Ricci-DeTurck metric itself is a Ricci flow; it says that it is equivalent to one after the prescribed pullback. It motivates the compact uniqueness theorem as well, since two Ricci flows can be compared by pushing both into the same DeTurck gauge.
[quotetheorem:5961]
[citeproof:5961]
The uniqueness statement is part of the geometric payoff of the DeTurck trick. Compactness is used here to make both the parabolic uniqueness argument and the diffeomorphism ODE global on a common short time interval; without compactness, extra hypotheses are needed to prevent loss of control at infinity, and the ODE example $\dot{x}=x^2$ on $\mathbb R$ shows the possible finite-time escape. Smoothness is also part of the classical theorem: if the initial metric is not differentiable enough to define $\operatorname{Ric}(g_0)$ classically, then the equation is not a smooth Ricci flow initial-value problem in this sense. Positive definiteness is indispensable because a degenerate symmetric tensor is not a Riemannian metric, and the inverse metric appearing in the Ricci tensor and DeTurck operator would not exist. The theorem does not describe how large $T$ is or whether singularities later form. It supplies the local-in-time foundation on which the later curvature evolution equations, maximum principles, and singularity analysis depend.
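The two scalar ODE examples cited in this section can be made explicit. The initial value $x_0>0$ is a parameter introduced here for illustration.
\begin{align*}
\dot x=x^2,\ x(0)=x_0>0:&\quad x(t)=\frac{x_0}{1-x_0t},\quad\text{which blows up as } t\uparrow \frac{1}{x_0},\\
\dot x=|x|^{1/2},\ x(0)=0:&\quad x(t)\equiv 0\quad\text{and}\quad x(t)=\frac{t^2}{4}\ (t\ge 0)\quad\text{both solve the problem.}
\end{align*}
Neither phenomenon occurs for a smooth time-dependent vector field on a compact manifold, which is the setting of the DeTurck diffeomorphisms.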
[example: Perturbed Round Metric on the Sphere]
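Let $M=S^n$, write $g_{\mathrm{round}}$ for the round metric, let $h$ be a smooth symmetric $(0,2)$-tensor field on $S^n$, and let $\varepsilon\in\mathbb R$. Set $g_0=g_{\mathrm{round}}+\varepsilon h$. Since $S^n$ is compact and $h$ is smooth, the quantity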
\begin{align*}
A:=\sup_{\substack{x\in S^n,\ v\in T_xS^n,\ v\ne 0}}\frac{|h_x(v,v)|}{g_{\mathrm{round},x}(v,v)}
\end{align*}
is finite. If $|\varepsilon|A<1$, then for every non-zero $v\in T_xS^n$,
\begin{align*}
g_0(v,v)=g_{\mathrm{round}}(v,v)+\varepsilon h(v,v)
\end{align*}
and therefore
\begin{align*}
g_0(v,v)\ge g_{\mathrm{round}}(v,v)-|\varepsilon|\,|h(v,v)|.
\end{align*}
By the definition of $A$,
\begin{align*}
|\varepsilon|\,|h(v,v)|\le |\varepsilon|A\,g_{\mathrm{round}}(v,v),
\end{align*}
so
\begin{align*}
g_0(v,v)\ge (1-|\varepsilon|A)g_{\mathrm{round}}(v,v)>0.
\end{align*}
Thus $g_0=g_{\mathrm{round}}+\varepsilon h$ is a smooth Riemannian metric, so *Hamilton Short-Time Existence on Compact Manifolds* gives a smooth Ricci flow $g(t)$ with $g(0)=g_0$ on some interval $[0,T]$.
In DeTurck gauge with background $g_0$, let $\hat g(t)$ solve the Ricci-DeTurck equation and define the perturbation
\begin{align*}
u_{ij}(t)=\hat g_{ij}(t)-(g_0)_{ij}.
\end{align*}
Since $g_0$ is independent of $t$,
\begin{align*}
\partial_tu_{ij}=\partial_t\hat g_{ij}.
\end{align*}
By *Ricci-DeTurck Flow Is Strictly Parabolic*, in local coordinates,
\begin{align*}
\partial_t\hat g_{ij}=\sum_{a,b}(\hat g^{-1})_{ab}\partial_a\partial_b\hat g_{ij}+Q_{ij}(\hat g^{-1},\partial\hat g,g_0^{-1},\partial g_0,\partial^2g_0).
\end{align*}
Substituting $\hat g_{ij}=(g_0)_{ij}+u_{ij}$ gives
\begin{align*}
\partial_tu_{ij}=\sum_{a,b}(\hat g^{-1})_{ab}\partial_a\partial_b\big((g_0)_{ij}+u_{ij}\big)+Q_{ij}(\hat g^{-1},\partial\hat g,g_0^{-1},\partial g_0,\partial^2g_0).
\end{align*}
Using linearity of the coordinate derivatives,
\begin{align*}
\partial_a\partial_b\big((g_0)_{ij}+u_{ij}\big)=\partial_a\partial_b(g_0)_{ij}+\partial_a\partial_bu_{ij}.
\end{align*}
Hence
\begin{align*}
\partial_tu_{ij}=\sum_{a,b}(\hat g^{-1})_{ab}\partial_a\partial_bu_{ij}+\sum_{a,b}(\hat g^{-1})_{ab}\partial_a\partial_b(g_0)_{ij}+Q_{ij}(\hat g^{-1},\partial\hat g,g_0^{-1},\partial g_0,\partial^2g_0).
\end{align*}
The only second derivatives of the unknown perturbation $u$ occur in
\begin{align*}
\sum_{a,b}(\hat g^{-1})_{ab}\partial_a\partial_bu_{ij}.
\end{align*}
Because $\hat g(t)$ is positive definite, the quadratic form $\sum_{a,b}(\hat g^{-1})_{ab}\xi_a\xi_b$ is positive for every non-zero covector $\xi$, so this leading term is a heat-type operator acting componentwise on $u_{ij}$. The remaining terms involve the fixed initial metric $g_0=g_{\mathrm{round}}+\varepsilon h$, first derivatives of the evolving metric, and nonlinear dependence on $\hat g^{-1}$; these are precisely where the curvature of the round sphere, the chosen perturbation $h$, and nonlinear interactions enter. Thus small smooth perturbations of the round metric have a well-defined short-time Ricci-flow evolution before any question of stability or convergence is addressed.
[/example]
The construction in this chapter is local in time but global on the compact manifold. Later chapters use this foundation to derive evolution equations for curvature, apply tensor maximum principles, and study the formation of singularities. The essential lesson is that Ricci flow becomes a usable parabolic equation only after separating geometric evolution from diffeomorphism freedom.
Short-time existence makes Ricci flow a legitimate evolution equation, but the main geometric information now comes from understanding what it does to curvature. The next chapter computes those curvature evolution equations and prepares the tools needed to control them.
# 3. Evolution Equations for Curvature
Ricci flow is useful because curvature is both the quantity driving the equation and the quantity whose concentration signals singularity formation. After short-time existence, the next task is to compute how curvature and its derivatives evolve under $\partial_t g = -2\operatorname{Ric}$. This chapter develops Hamilton's evolution equations, separates diffusion from the nonlinear curvature reactions, and prepares the analytic estimates used later to control singularity models.
The guiding theme is that Ricci flow is a heat equation for curvature with quadratic lower-order terms. The metric itself evolves weakly parabolically, but after the geometric identities are written in the right form, the curvature tensor satisfies a strictly parabolic system modulo algebraic curvature interactions. The prerequisites are the short-time existence theorem, the basic Riemannian identities for the Levi-Civita connection, the first and second Bianchi identities, and the [scalar parabolic maximum principle](/theorems/5984). This is the point where the course begins to use maximum principles and local derivative estimates in earnest.
## Differentiating the Basic Geometric Objects
The first problem is practical: if the metric changes with time, then every object built from it changes too. We need formulas for the inverse metric, the connection, and the volume form before the curvature evolution equations can be derived.
[quotetheorem:5973]
[citeproof:5973]
This formula is the first warning that raising and lowering indices does not commute with time differentiation. The smoothness of the metric family is doing real work: the proof differentiates the inverse relation $g^{-1}g=I$, so a merely continuous or piecewise $C^1$ family can have no pointwise value of $\partial_t g^{ij}$ at the break times even when each individual $g(t)$ is a smooth metric on $M$. The identity also does not say that $\partial_t g^{-1}$ is obtained by raising both indices of $\partial_t g$; for instance, if $g(t)=e^{2f(t)}g_0$, then $\partial_t g^{ij}=-2f'(t)g^{ij}$ while raising $v_{ij}=2f'(t)g_{ij}$ gives $v^{ij}=2f'(t)g^{ij}$ with the opposite sign. This failure mode is exactly why later contractions of evolving tensors must include both the derivative of the tensor and the derivative of the metric used to raise indices. The next object is the Levi-Civita connection, whose variation is tensorial even though the Christoffel symbols themselves are coordinate-dependent.
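A worked check of the sign may be useful here. It assumes the theorem is the standard identity $\partial_tg^{ij}=-g^{ik}g^{j\ell}\partial_tg_{k\ell}$, with summation over repeated indices, which for Ricci flow reads $\partial_tg^{ij}=2\operatorname{Ric}^{ij}$ with both indices raised by $g$.
\begin{align*}
0&=\partial_t\bigl(g^{ik}g_{kj}\bigr)=(\partial_tg^{ik})g_{kj}+g^{ik}\partial_tg_{kj},\\
\partial_tg^{i\ell}&=-g^{ik}(\partial_tg_{kj})g^{j\ell},\\
g(t)=e^{2f(t)}g_0:\quad \partial_tg^{i\ell}&=-g^{ik}\bigl(2f'(t)g_{kj}\bigr)g^{j\ell}=-2f'(t)g^{i\ell}.
\end{align*}
The last line agrees with differentiating $g^{i\ell}=e^{-2f(t)}g_0^{i\ell}$ directly.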
[quotetheorem:5974]
[citeproof:5974]
The connection variation is the mechanism by which covariant derivatives fail to commute with $\partial_t$, so it supplies the correction terms in every later time derivative of a covariantly differentiated tensor. The Levi-Civita hypothesis is essential here: the cancellation in the proof uses both metric compatibility and torsion-freeness, and a connection with torsion would acquire extra terms involving the torsion tensor and its variation. Smooth time dependence is also needed because the formula differentiates first spatial derivatives of $g(t)$; if the metric is only Lipschitz in time, $\partial_t\Gamma_{ij}^k$ may exist only weakly and the displayed pointwise tensor identity can fail. The formula should also not be read as a tensorial statement about arbitrary coordinate symbols, since Christoffel symbols for a non-Levi-Civita connection or a nonsmooth coordinate change do not obey the same variation law. To use integration, monotonicity, and global quantities, we also need to know how the measure $d\mu_g$ changes as the metric evolves.
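For orientation, and assuming the theorem is the standard variation formula for a family with $\partial_tg=v$, the Ricci flow case and a quick consistency check can be written out as follows. The homothetic family $g(t)=c(t)g_0$ with $c(t)>0$ is an illustrative choice, and summation over $\ell$ is understood.
\begin{align*}
\partial_t\Gamma^k_{ij}&=\tfrac12g^{k\ell}\bigl(\nabla_iv_{j\ell}+\nabla_jv_{i\ell}-\nabla_\ell v_{ij}\bigr),\\
v=-2\operatorname{Ric}:\quad \partial_t\Gamma^k_{ij}&=-g^{k\ell}\bigl(\nabla_i\operatorname{Ric}_{j\ell}+\nabla_j\operatorname{Ric}_{i\ell}-\nabla_\ell\operatorname{Ric}_{ij}\bigr),\\
v=c'(t)g_0:\quad \partial_t\Gamma^k_{ij}&=0\quad\text{since }\nabla g_0=0.
\end{align*}
The last line matches the fact that rescaling a metric by a positive constant does not change its Levi-Civita connection.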
[quotetheorem:5975]
[citeproof:5975]
This identity says that positive scalar curvature locally decreases volume under the unnormalised flow, while negative scalar curvature locally increases it. The determinant calculation requires a differentiable family of positive-definite matrices; if the metric degenerates or is only continuous in time, the Riemannian density may fail to have the displayed derivative. The formula is local in the density and does not by itself justify differentiating the total volume of a noncompact manifold, where integrability and boundary control at infinity are separate issues. It is often paired with the curvature evolution equations when differentiating integral quantities, because those later arguments must combine density variation with the evolution of the integrand.
[example: Volume Change on an Einstein Metric]
Let $(M,g_0)$ be Einstein with $\operatorname{Ric}(g_0)=\lambda g_0$, and suppose the Ricci flow solution has the homothetic form $g(t)=c(t)g_0$. Constant rescaling leaves the Levi-Civita connection unchanged, so $\operatorname{Ric}(g(t))=\operatorname{Ric}(g_0)=\lambda g_0$. The Ricci flow equation therefore gives
\begin{align*}
c'(t)g_0=\partial_t g(t)=-2\operatorname{Ric}(g(t))=-2\lambda g_0.
\end{align*}
Since $g_0$ is nondegenerate, $c'(t)=-2\lambda$.
Next, $g(t)^{ij}=c(t)^{-1}g_0^{ij}$, and hence
\begin{align*}
S(t)=g(t)^{ij}\operatorname{Ric}(g(t))_{ij}=c(t)^{-1}g_0^{ij}\lambda (g_0)_{ij}=\frac{n\lambda}{c(t)}.
\end{align*}
The determinant scales by $\det(c(t)g_0)=c(t)^n\det(g_0)$, so
\begin{align*}
d\mu_{g(t)}=c(t)^{n/2}d\mu_{g_0}.
\end{align*}
Differentiating this density directly gives
\begin{align*}
\partial_t(d\mu_{g(t)})=\frac{n}{2}c(t)^{n/2-1}c'(t)d\mu_{g_0}=-n\lambda c(t)^{n/2-1}d\mu_{g_0}.
\end{align*}
On the other hand, the theorem *Evolution of the Volume Form* gives
\begin{align*}
-S(t)d\mu_{g(t)}=-\frac{n\lambda}{c(t)}c(t)^{n/2}d\mu_{g_0}=-n\lambda c(t)^{n/2-1}d\mu_{g_0}.
\end{align*}
The two expressions agree, so the general volume evolution formula matches the homothetic Einstein model exactly.
[/example]
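The agreement in the Einstein example can also be checked numerically. The following is a minimal sketch, not part of the original notes: it assumes the normalisation $c(0)=1$, which the example does not fix, and the function names are hypothetical. It compares a centred finite difference of the density factor $c(t)^{n/2}$ with $-S(t)c(t)^{n/2}$.

```python
def density_factor(c, n):
    """Density factor c^(n/2) of g = c * g0 relative to d(mu_{g0})."""
    return c ** (n / 2.0)


def check_volume_evolution(n=3, lam=1.0, t=0.1, h=1e-6):
    """Return (finite-difference derivative, -S * density) at time t.

    Assumption for illustration: c(0) = 1, so c(t) = 1 - 2*lam*t.
    """
    def c(s):
        return 1.0 - 2.0 * lam * s  # solves c'(t) = -2*lam

    # Centred difference approximating d/dt of c(t)^(n/2).
    lhs = (density_factor(c(t + h), n) - density_factor(c(t - h), n)) / (2.0 * h)
    # Right-hand side of the volume evolution formula, with S(t) = n*lam/c(t).
    scalar = n * lam / c(t)
    rhs = -scalar * density_factor(c(t), n)
    return lhs, rhs


if __name__ == "__main__":
    for lam in (1.0, -1.0):
        print(lam, check_volume_evolution(lam=lam))
```

The two returned numbers should agree up to finite-difference error for either sign of $\lambda$, as long as $c(t)$ stays positive on the sampled interval.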
## Scalar and Ricci Curvature Evolution
The next question is whether Ricci flow diffuses curvature the way the heat equation diffuses temperature. The answer is yes for scalar curvature, but with an additional nonnegative reaction term that drives curvature concentration.
[quotetheorem:5976]
[citeproof:5976]
This is the first complete curvature heat equation in the course. The smoothness hypothesis is needed because the derivation differentiates the connection and commutes covariant derivatives; for a merely continuous evolving metric the displayed pointwise identity is not meaningful. The Ricci-flow hypothesis is also essential: under a general variation $\partial_t g=v$, the scalar variation contains divergence terms and $-\langle v,\operatorname{Ric}\rangle_g$, so the reaction term $2|\operatorname{Ric}|^2$ is special to $v=-2\operatorname{Ric}$. The formula does not say that scalar curvature behaves exactly like a linear heat solution: even spatially constant positive scalar curvature can grow because of the reaction term, while a linear heat equation with constant initial data would keep that constant unchanged. This distinction is what makes the next example a useful test case before we pass to tensor-valued curvature equations.
[example: Scalar Curvature in Dimension Two]
On a surface, $\operatorname{Ric}_{ij}=\frac{1}{2}Sg_{ij}$. Hence the Ricci norm is
\begin{align*}
|\operatorname{Ric}|^2=g^{ik}g^{j\ell}\operatorname{Ric}_{ij}\operatorname{Ric}_{k\ell}.
\end{align*}
Substituting $\operatorname{Ric}_{ij}=\frac{1}{2}Sg_{ij}$ gives
\begin{align*}
|\operatorname{Ric}|^2=g^{ik}g^{j\ell}\left(\frac{1}{2}Sg_{ij}\right)\left(\frac{1}{2}Sg_{k\ell}\right)=\frac{S^2}{4}g^{ik}g^{j\ell}g_{ij}g_{k\ell}.
\end{align*}
Using $g^{ik}g_{k\ell}=\delta^i_{\ell}$ and then $g^{j\ell}g_{ij}=\delta^\ell_i$, this becomes
\begin{align*}
|\operatorname{Ric}|^2=\frac{S^2}{4}\delta^i_{\ell}g^{j\ell}g_{ij}=\frac{S^2}{4}g^{ji}g_{ij}=\frac{S^2}{4}\delta^i_i.
\end{align*}
Since the dimension is $2$, $\delta^i_i=2$, so
\begin{align*}
|\operatorname{Ric}|^2=\frac{S^2}{2}.
\end{align*}
Substituting this into the scalar curvature evolution equation gives
\begin{align*}
\partial_t S=\Delta S+2|\operatorname{Ric}|^2=\Delta S+2\cdot\frac{S^2}{2}=\Delta S+S^2.
\end{align*}
For a round sphere, the scalar curvature is spatially constant along the homothetic flow, so $\Delta S(t)=0$. The equation therefore reduces to the ordinary differential equation
\begin{align*}
\frac{dS}{dt}=S^2.
\end{align*}
Write $S_0=S(0)>0$. Dividing by $S^2$ gives
\begin{align*}
S^{-2}\frac{dS}{dt}=1.
\end{align*}
Integrating from $0$ to $t$ gives
\begin{align*}
\int_{S_0}^{S(t)}s^{-2}\,ds=\int_0^t 1\,d\tau.
\end{align*}
The two integrals are
\begin{align*}
-\frac{1}{S(t)}+\frac{1}{S_0}=t.
\end{align*}
Solving for $S(t)$ yields
\begin{align*}
S(t)=\frac{S_0}{1-S_0t}.
\end{align*}
Thus $S(t)\to+\infty$ as $t\uparrow S_0^{-1}$, giving the simplest positive-curvature model of finite-time curvature blow-up.
[/example]
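As a sanity check on the closed form, one can integrate $dS/dt=S^2$ numerically and compare. The sketch below uses a classical fourth-order Runge-Kutta step; the function names and the step count are illustrative choices, not part of the notes.

```python
def exact_scalar_curvature(s0, t):
    """Closed form S(t) = S0 / (1 - S0 * t), valid for t < 1/S0."""
    return s0 / (1.0 - s0 * t)


def rk4_scalar_curvature(s0, t_end, steps=1000):
    """Integrate dS/dt = S^2 from S(0) = s0 up to t_end with classical RK4."""
    def f(s):
        return s * s

    h = t_end / steps
    s = s0
    for _ in range(steps):
        k1 = f(s)
        k2 = f(s + 0.5 * h * k1)
        k3 = f(s + 0.5 * h * k2)
        k4 = f(s + h * k3)
        s += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return s


if __name__ == "__main__":
    s0 = 2.0  # scalar curvature of the unit round 2-sphere
    for t in (0.1, 0.25, 0.4):
        print(t, rk4_scalar_curvature(s0, t), exact_scalar_curvature(s0, t))
```

With $S_0=2$ the blow-up time is $S_0^{-1}=1/2$, so the sampled times stay inside the interval of existence; the numerical integration becomes unreliable as $t$ approaches that time.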
Scalar curvature is only a contraction of the Ricci tensor, so it loses directional information about how the metric is bending. To study preservation of tensor inequalities and later curvature pinching, we need the evolution equation for $\operatorname{Ric}$ itself.
[quotetheorem:5977]
[citeproof:5977]
The Ricci equation shows that positivity questions for Ricci curvature are not scalar questions in high dimension. The hypotheses again matter because the equation is pointwise and tensorial; without a smooth Ricci flow there is no controlled way to interpret the Laplacian of $\operatorname{Ric}$. The dimension assumption is deliberately left general, but the consequences are not dimension-free: in dimension two the Ricci tensor is determined by $S$, while in higher dimension the Weyl curvature can enter the reaction through $R_{ikj\ell}\operatorname{Ric}^{k\ell}$. The theorem also does not assert that Ricci positivity is automatically preserved in every dimension, since the reaction term is a matrix-valued expression rather than a scalar nonnegative term. Later maximum principles will treat the Ricci tensor as a section of a bundle and analyse whether the reaction term preserves convex curvature cones.
[remark: Sign Convention]
These notes use the curvature convention for which the round sphere has positive sectional curvature and the scalar curvature evolution is $\partial_t S=\Delta S+2|\operatorname{Ric}|^2$. With the opposite Riemann tensor convention, intermediate signs in the full curvature formula change, but the geometric Ricci flow equation and scalar evolution retain the displayed form after conventions are adjusted consistently.
[/remark]
## Full Riemann Curvature Evolution
Scalar and Ricci curvature do not determine the full curvature tensor in dimensions at least four, and singularity analysis needs control of all sectional curvatures. The main computation of the chapter is Hamilton's evolution equation for the Riemann curvature tensor.
The curvature tensor is a section of a tensor bundle, so the diffusion term cannot be the scalar Laplacian applied component by component without specifying how components are compared. The Levi-Civita connection gives the correct bundle derivative, and the resulting second-order operator is the rough Laplacian.
[definition: Rough Laplacian on Tensors]
Let $(M,g)$ be a Riemannian manifold with Levi-Civita connection $\nabla$, and let $T^r_sM$ be the tensor bundle of type $(r,s)$. The rough Laplacian on type $(r,s)$ tensor fields is the map
\begin{align*}
\Delta:C^\infty(T^r_sM)\longrightarrow C^\infty(T^r_sM).
\end{align*}
It is given by
\begin{align*}
\Delta T=g^{ij}\nabla_i\nabla_jT.
\end{align*}
[/definition]
Unlike the Hodge Laplacian, the rough Laplacian is defined directly from the connection on the relevant tensor bundle. Since the Riemann tensor has four indices and the metric also raises and lowers indices during the flow, the exact formula must record both diffusion and the algebraic terms produced by commutators and metric variation.
[quotetheorem:5978]
[citeproof:5978]
This theorem is the tensorial version of the heat-plus-reaction principle. Smoothness is needed because the proof differentiates Christoffel symbols and invokes Bianchi identities pointwise; the formula is not a weak statement for rough metrics. The Ricci-flow equation is also essential: if the metric is instead deformed by an arbitrary tensor $v$, the curvature variation contains second covariant derivatives of $v$, and the special Laplacian-plus-quadratic structure no longer appears. A concrete limitation is visible on a flat torus with a time-dependent nonhomothetic metric: curvature may be produced by the imposed metric variation even though the initial curvature is zero, so Hamilton's reaction term cannot describe the evolution without the Ricci-flow hypothesis. The formula also does not provide a scalar [comparison principle](/theorems/4870) for each component, because the quadratic term mixes the components of $\operatorname{Rm}$ and changes under a change of frame. The next definition packages that algebraic reaction so estimates can use its size without carrying every index.
[definition: Curvature Quadratic Term]
For each point $p\in M$, let $\operatorname{Curv}(T_pM)$ denote the [vector space](/page/Vector%20Space) of algebraic curvature tensors on $T_pM$. The curvature quadratic term is the fibrewise map
\begin{align*}
Q_p:\operatorname{Curv}(T_pM)\longrightarrow \operatorname{Curv}(T_pM).
\end{align*}
The value $Q_p(A)$ is the algebraic curvature tensor whose coordinate expression is obtained by replacing every occurrence of $\operatorname{Rm}$ in the non-diffusion terms of Hamilton's Riemann curvature evolution equation by $A$.
[/definition]
Equivalently, on the curvature-tensor subbundle $\operatorname{Curv}(TM)\subset T^0_4M$, these fibre maps define a bundle map $Q:\operatorname{Curv}(TM)\to\operatorname{Curv}(TM)$, and Hamilton's equation can be written schematically as
\begin{align*}
\partial_t\operatorname{Rm}=\Delta\operatorname{Rm}+Q(\operatorname{Rm}).
\end{align*}
The symbol $Q(\operatorname{Rm})$ suppresses index structure but preserves the analytic feature needed for estimates: it is a finite quadratic contraction of curvature with coefficients determined by the dimension. The next issue is to replace Hamilton's long component formula by a short inequality that can be inserted into parabolic maximum-principle arguments. For that purpose, we need a pointwise bound for $Q(\operatorname{Rm})$ depending only on $n$ and $|\operatorname{Rm}|$.
[quotetheorem:5979]
[citeproof:5979]
This estimate is what turns Hamilton's exact tensor formula into usable parabolic inequalities. The finite-dimensional hypothesis is essential: the constant comes from counting finitely many contractions in an orthonormal frame, and there is no dimension-free constant asserted here. If the dimension is allowed to vary, the number of contraction terms grows, so the same numerical constant cannot be reused uniformly across all $n$. The bound also discards sign and cone information, so it is too crude for curvature-positivity preservation; a curvature operator can have reaction terms with favourable signs that this absolute-value estimate forgets. It is, however, exactly the level of information needed for scalar norm inequalities and derivative estimates.
[example: Curvature Growth on Shrinking Space Forms]
Let $(M^n,g_0)$ have constant sectional curvature $K_0>0$. Since constant rescaling leaves the Levi-Civita connection unchanged, the Ricci tensor of $g(t)=c(t)g_0$ is
\begin{align*}
\operatorname{Ric}(g(t))=\operatorname{Ric}(g_0)=(n-1)K_0g_0.
\end{align*}
The Ricci flow equation gives
\begin{align*}
c'(t)g_0=\partial_t g(t)=-2\operatorname{Ric}(g(t))=-2(n-1)K_0g_0.
\end{align*}
Because $g_0$ is nondegenerate, $c'(t)=-2(n-1)K_0$. With $c(0)=1$, this integrates to
\begin{align*}
c(t)=1-2(n-1)K_0t.
\end{align*}
Thus the shrinking time is
\begin{align*}
T=\frac{1}{2(n-1)K_0}.
\end{align*}
Equivalently,
\begin{align*}
c(t)=2(n-1)K_0(T-t).
\end{align*}
For a space form with sectional curvature $K(t)$, the curvature tensor in a $g(t)$-orthonormal frame is
\begin{align*}
R_{ijk\ell}(t)=K(t)(\delta_{ik}\delta_{j\ell}-\delta_{i\ell}\delta_{jk}).
\end{align*}
Sectional curvature scales by the reciprocal factor under $g(t)=c(t)g_0$, so
\begin{align*}
K(t)=\frac{K_0}{c(t)}.
\end{align*}
Substituting the expression for $c(t)$ gives
\begin{align*}
K(t)=\frac{K_0}{2(n-1)K_0(T-t)}=\frac{1}{2(n-1)(T-t)}.
\end{align*}
Now compute the norm in the same orthonormal frame:
\begin{align*}
|\operatorname{Rm}(g(t))|^2=K(t)^2\sum_{i,j,k,\ell}(\delta_{ik}\delta_{j\ell}-\delta_{i\ell}\delta_{jk})^2.
\end{align*}
Expanding the square gives
\begin{align*}
\sum_{i,j,k,\ell}(\delta_{ik}\delta_{j\ell}-\delta_{i\ell}\delta_{jk})^2=\sum_{i,j,k,\ell}\delta_{ik}\delta_{j\ell}-2\sum_{i,j,k,\ell}\delta_{ik}\delta_{j\ell}\delta_{i\ell}\delta_{jk}+\sum_{i,j,k,\ell}\delta_{i\ell}\delta_{jk}.
\end{align*}
The first sum is $n^2$, since $\delta_{ik}\delta_{j\ell}$ is nonzero exactly when $k=i$ and $\ell=j$. The third sum is also $n^2$, since $\delta_{i\ell}\delta_{jk}$ is nonzero exactly when $\ell=i$ and $k=j$. In the middle sum, the four Kronecker deltas force $i=k$, $j=\ell$, $i=\ell$, and $j=k$, hence $i=j=k=\ell$, giving $n$ nonzero terms. Therefore
\begin{align*}
\sum_{i,j,k,\ell}(\delta_{ik}\delta_{j\ell}-\delta_{i\ell}\delta_{jk})^2=n^2-2n+n^2=2n(n-1).
\end{align*}
So
\begin{align*}
|\operatorname{Rm}(g(t))|^2=2n(n-1)K(t)^2.
\end{align*}
Since $K(t)>0$, taking the square root gives
\begin{align*}
|\operatorname{Rm}(g(t))|=\sqrt{2n(n-1)}K(t).
\end{align*}
Substituting the formula for $K(t)$ yields
\begin{align*}
|\operatorname{Rm}(g(t))|=\frac{\sqrt{2n(n-1)}}{2(n-1)}\frac{1}{T-t}.
\end{align*}
Thus shrinking positive-curvature space forms have curvature growing exactly like $(T-t)^{-1}$, which is the model Type I curvature growth rate.
[/example]
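The Kronecker-delta count and the resulting growth rate are easy to confirm by brute force. The following sketch, with hypothetical function names and the normalisation $c(0)=1$ used in the example, sums the squared components directly and evaluates $|\operatorname{Rm}(g(t))|$ from the formulas above.

```python
import math
from itertools import product


def delta(a, b):
    """Kronecker delta."""
    return 1 if a == b else 0


def delta_square_sum(n):
    """Sum over i,j,k,l of (d_ik d_jl - d_il d_jk)^2 in dimension n."""
    return sum(
        (delta(i, k) * delta(j, l) - delta(i, l) * delta(j, k)) ** 2
        for i, j, k, l in product(range(n), repeat=4)
    )


def blowup_time(n, k0):
    """Shrinking time T = 1 / (2 (n-1) K0)."""
    return 1.0 / (2.0 * (n - 1) * k0)


def rm_norm(n, k0, t):
    """|Rm(g(t))| = sqrt(2n(n-1)) K(t) with K(t) = 1 / (2 (n-1) (T - t))."""
    k_t = 1.0 / (2.0 * (n - 1) * (blowup_time(n, k0) - t))
    return math.sqrt(2.0 * n * (n - 1)) * k_t


if __name__ == "__main__":
    for n in range(2, 6):
        print(n, delta_square_sum(n), 2 * n * (n - 1))
```

The product $|\operatorname{Rm}(g(t))|\,(T-t)$ computed from `rm_norm` is independent of $t$, which is the Type I rate stated in the example.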
## Moving Frames and Strictly Parabolic Curvature Systems
Hamilton's curvature equation is tensorial, but comparing tensors at different times is awkward because the metric on the tensor bundle is itself evolving. Uhlenbeck's moving-frame trick replaces the evolving tangent bundle metric by a fixed Euclidean bundle metric, turning the curvature equation into a strictly parabolic system plus algebraic terms.
[definition: Uhlenbeck Moving Frame]
Let $(M,g(t))$ solve Ricci flow on a time interval $I$, and let $(E,h)$ be a fixed rank-$n$ Euclidean vector bundle over $M$. A Uhlenbeck moving frame is a smooth time-dependent family of bundle isomorphisms over $\operatorname{id}_M$,
\begin{align*}
\iota:I\longrightarrow C^\infty(\operatorname{Iso}(E,TM)), \qquad t\longmapsto \iota_t:E\to TM,
\end{align*}
such that $\iota_t^*g(t)=h$ for every $t\in I$.
[/definition]
The frame evolves so that the background [inner product](/page/Inner%20Product) remains fixed. Pulling back curvature to $E$ removes the changing metric from the fibrewise norm and isolates the parabolic operator.
[quotetheorem:5980]
[citeproof:5980]
The point is not to change the geometry, but to place the equation in a fixed vector bundle where convexity and maximum principle arguments can be applied cleanly. The smooth frame assumption is necessary because the calculation differentiates $\iota_t$ and the pulled-back connection; a discontinuous choice of frames would add meaningless gauge jumps to $\partial_t\mathcal R$. The construction is also local unless the chosen Euclidean bundle and frame ODE can be arranged globally, so the theorem should not be read as a global trivialisation statement for every tangent bundle. This is a recurring technique whenever the evolving metric creates bookkeeping that obscures the parabolic structure.
[remark: Why the Moving Frame Matters]
Curvature positivity conditions, such as nonnegative curvature operator, are conditions on eigenvalues of a fibrewise symmetric operator. If the fibre metric is changing, the comparison of eigenvalues over time carries extra terms. The moving-frame formulation absorbs those terms into the choice of gauge, leaving the reaction ODE as the object to test against invariant convex cones.
[/remark]
## Curvature Norms and First Derivative Estimates
The final question in this chapter is quantitative: once curvature is bounded, can its derivatives be controlled? Ricci flow has a smoothing effect, and Shi's estimates make this precise by bounding derivatives of curvature at positive times.
[quotetheorem:5981]
[citeproof:5981]
This inequality is the analytic bridge from tensor evolution to maximum principles. The fixed-fibre or metric-compatible formulation matters: if the evolving norm is differentiated without accounting for the metric variation, extra Ricci-linear terms appear and the displayed coefficient of the gradient term is not the whole story. The smooth Ricci-flow hypothesis is again needed because the Bochner identity uses two covariant derivatives of curvature and a pointwise maximum-principle interpretation. It controls the zeroth-order curvature norm, but singularity analysis also needs bounds on spatial oscillation, so the next estimate differentiates the curvature equation once.
[quotetheorem:5982]
[citeproof:5982]
This estimate is the first member of a hierarchy: higher covariant derivatives satisfy similar inequalities with a negative highest-derivative term and lower-order products involving curvature derivatives. The hypothesis that the connection is the time-dependent Levi-Civita connection is not cosmetic, since commutators of a different connection would contain torsion and nonmetricity terms not controlled by $|\operatorname{Rm}|\,|\nabla\operatorname{Rm}|^2$. The estimate also does not close by itself; without a separate curvature bound, the coefficient multiplying $|\nabla\operatorname{Rm}|^2$ may become unbounded. The remaining problem is to convert this hierarchy into bounds that depend only on a curvature bound, scale, and elapsed time. Shi's estimates solve that problem by combining the inequalities with time weights, cutoffs, and the parabolic maximum principle.
[remark: Shi Local Derivative Estimates as a Background Input]
Shi's derivative estimates are used here as a background regularity input rather than as a theorem proved in this page. In the local form needed below, if a smooth Ricci flow has $|\operatorname{Rm}|\le K$ on a compactly contained parabolic neighbourhood $B_{g(0)}(x_0,r)\times[0,T]$, then for each $m\ge1$ and each $0<\tau\le T$ there is a constant $C=C(n,m,K,r,T,\tau)$ such that
\begin{align*}
|\nabla^m\operatorname{Rm}|\le C
\end{align*}
on $B_{g(0)}(x_0,r/2)\times[\tau,T]$. In the complete bounded-curvature case this has the scale-invariant positive-time form
\begin{align*}
|\nabla^m\operatorname{Rm}|\le C_mK\left(t^{-m/2}+K^{m/2}\right).
\end{align*}
[/remark]
Shi's estimate is a central regularity result: a zeroth-order curvature bound instantly produces bounds for all curvature derivatives at positive time. The compact containment or completeness assumptions are not decorative; without them a maximum of the cutoff Bernstein quantity may escape through the boundary of the region where the curvature bound is known. The estimate also does not give a uniform bound up to $t=0$ from a curvature bound alone, as the factor $t^{-m/2}$ records parabolic smoothing from positive time. This is why blow-up limits of Ricci flows are smooth after parabolic rescaling, provided curvature is controlled on the relevant neighbourhoods.
[example: First Derivative Scale in a Bounded Curvature Flow]
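Assume a complete Ricci flow has $|\operatorname{Rm}|\le K$ on $M\times[0,K^{-1}]$. Applying *[Shi Local Derivative Estimates](/theorems/5983)* with $m=1$ gives $|\nabla\operatorname{Rm}|\le C_1K(t^{-1/2}+K^{1/2})$. For $0<t\le K^{-1}$ we have $K^{1/2}\le t^{-1/2}$, so after absorbing the resulting factor of $2$ into $C_1$ this gives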
\begin{align*}
|\nabla\operatorname{Rm}|\le C_1K t^{-1/2}.
\end{align*}
Equivalently,
\begin{align*}
|\nabla\operatorname{Rm}|\le \frac{C_1K}{t^{1/2}}.
\end{align*}
Now fix $0<\alpha\le 1$ and evaluate at $t=\alpha K^{-1}$. Since
\begin{align*}
t^{1/2}=(\alpha K^{-1})^{1/2}=\alpha^{1/2}(K^{-1})^{1/2}=\alpha^{1/2}K^{-1/2},
\end{align*}
we have
\begin{align*}
\frac{C_1K}{t^{1/2}}
=\frac{C_1K}{\alpha^{1/2}K^{-1/2}}
=C_1K\alpha^{-1/2}K^{1/2}
=C_1\alpha^{-1/2}K^{3/2}.
\end{align*}
Thus
\begin{align*}
|\nabla\operatorname{Rm}|(x,\alpha K^{-1})\le C_1\alpha^{-1/2}K^{3/2}.
\end{align*}
The power $K^{3/2}$ has the expected scale: curvature has dimension length$^{-2}$, so one covariant spatial derivative adds one factor of length$^{-1}$, giving length$^{-3}=(\text{length}^{-2})^{3/2}$.
[/example]
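The arithmetic and the scaling behaviour in this example can be checked mechanically. In the sketch below the constant $C_1$ is not known explicitly, so it is set to the placeholder value `1.0` purely for illustration; the function names are hypothetical.

```python
import math


def first_derivative_bound(K, t, c1=1.0):
    """The example's bound C_1 * K * t^(-1/2), stated for 0 < t <= 1/K.

    c1 is a placeholder: the actual constant in Shi's estimate is not specified.
    """
    if not 0.0 < t <= 1.0 / K:
        raise ValueError("the bound is only stated for 0 < t <= 1/K")
    return c1 * K / math.sqrt(t)


def bound_at_fraction(K, alpha, c1=1.0):
    """Evaluate the bound at t = alpha / K, for 0 < alpha <= 1."""
    return first_derivative_bound(K, alpha / K, c1)


if __name__ == "__main__":
    K, alpha = 4.0, 0.25
    # Compare with the closed form C_1 * alpha^(-1/2) * K^(3/2).
    print(bound_at_fraction(K, alpha), alpha ** -0.5 * K ** 1.5)
```

Replacing $K$ by $\lambda K$ multiplies `bound_at_fraction` by $\lambda^{3/2}$, matching the dimensional count at the end of the example.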
The chapter's main output is therefore a toolkit rather than a single theorem. We can now differentiate geometric quantities under the flow, regard curvature as solving a nonlinear heat equation, use moving frames to expose the parabolic structure, and invoke Shi estimates to upgrade curvature bounds into smooth local control. These tools will feed directly into tensor maximum principles and the analysis of singularity formation.
The evolution formulas reveal Ricci flow as a nonlinear heat process for curvature, but local differential equations alone do not give global geometric control. Maximum principles supply that missing step by turning the evolution identities into preservation and comparison results.
# 4. Maximum Principles in Ricci Flow
Maximum principles are the bridge between the local evolution equations of curvature and global geometric conclusions. Chapter 3 derived heat-type equations such as $\partial_t S = \Delta S + 2|\operatorname{Ric}|^2$ and analogous tensor equations for $\operatorname{Ric}$ and $\operatorname{Rm}$. This chapter assumes the compact parabolic maximum principle, the Ricci flow curvature evolution equations, and the basic tensor notation for symmetric two-tensors and algebraic curvature operators. The main question is how a local differential inequality prevents a scalar, tensor, or curvature operator from leaving a prescribed positivity region; scalar inequalities give quantitative lower bounds, while tensor maximum principles preserve the curvature cones that drive Hamilton's program.
## Scalar Parabolic Maximum Principle on Compact Manifolds
The first problem is to understand what a heat-type inequality can say on a manifold whose metric is itself moving. In Ricci flow the Laplacian, volume form, and distance function depend on time, but at a spatial minimum the elementary sign information behind the maximum principle still survives.
[quotetheorem:5984]
[citeproof:5984]
This theorem turns scalar curvature evolution into a comparison with an ordinary differential equation. Compactness is used to ensure that a first spatial minimum is attained; on a noncompact manifold the same conclusion requires additional hypotheses such as bounded geometry and barriers at infinity. The local Lipschitz condition on $F$ prevents ambiguity in the comparison ODE, so dropping it can destroy uniqueness of the barrier. The theorem gives lower bounds and preservation of scalar inequalities, but it does not by itself explain strict positivity or tensorial curvature conditions; those require the strong and bundle-valued versions below.
[example: Lower Scalar Curvature Bound]
Let $(M^n,g(t))$ be a compact Ricci flow, and let $R$ be its scalar curvature, written $S$ in Chapter 3. At a point, choose a $g(t)$-orthonormal frame diagonalizing $\operatorname{Ric}$, with eigenvalues $\lambda_1,\dots,\lambda_n$. Then $R=\sum_{i=1}^n\lambda_i$ and $|\operatorname{Ric}|^2=\sum_{i=1}^n\lambda_i^2$, so Cauchy--Schwarz applied to $(\lambda_1,\dots,\lambda_n)$ and $(1,\dots,1)$ gives
\begin{align*}
R^2=\left(\sum_{i=1}^n\lambda_i\right)^2\le \left(\sum_{i=1}^n\lambda_i^2\right)\left(\sum_{i=1}^n1^2\right)=n|\operatorname{Ric}|^2.
\end{align*}
Thus $|\operatorname{Ric}|^2\ge R^2/n$. Using the scalar curvature evolution equation under Ricci flow, we get
\begin{align*}
\partial_t R=\Delta R+2|\operatorname{Ric}|^2\ge \Delta R+\frac{2}{n}R^2.
\end{align*}
Assume $R(\cdot,0)\ge -K$ for some $K>0$. We compare $R$ with the solution of
\begin{align*}
\varphi'(t)=\frac{2}{n}\varphi(t)^2,\qquad \varphi(0)=-K.
\end{align*}
Since $\varphi(0)=-K\ne 0$, the solution remains nonzero on its interval of existence, and division by $\varphi(t)^2$ gives
\begin{align*}
\frac{\varphi'(t)}{\varphi(t)^2}=\frac{2}{n}.
\end{align*}
Also,
\begin{align*}
\frac{d}{dt}\left(-\frac{1}{\varphi(t)}\right)=\frac{\varphi'(t)}{\varphi(t)^2}.
\end{align*}
Therefore
\begin{align*}
-\frac{1}{\varphi(t)}=-\frac{1}{\varphi(0)}+\frac{2}{n}t.
\end{align*}
Substituting $\varphi(0)=-K$ gives
\begin{align*}
-\frac{1}{\varphi(t)}=\frac{1}{K}+\frac{2}{n}t.
\end{align*}
Multiplying by $-1$ and combining the terms over the common denominator $K$,
\begin{align*}
\frac{1}{\varphi(t)}=-\frac{1}{K}-\frac{2}{n}t=-\frac{1+\frac{2K}{n}t}{K}.
\end{align*}
Hence
\begin{align*}
\varphi(t)=-\frac{K}{1+\frac{2K}{n}t}.
\end{align*}
By the *Scalar Parabolic Maximum Principle*, applied with $F(s,t)=\frac{2}{n}s^2$, the initial inequality $\varphi(0)=-K\le \inf_M R(\cdot,0)$ implies
\begin{align*}
R(x,t)\ge \varphi(t)=-\frac{K}{1+\frac{2K}{n}t}
\end{align*}
for every $x\in M$ and every time for which the compact flow exists. Thus scalar curvature cannot immediately run to $-\infty$ under compact Ricci flow; the quadratic reaction term gives an explicit lower barrier that moves upward from $-K$.
[/example]
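Both ingredients of this example, the trace inequality $R^2\le n|\operatorname{Ric}|^2$ and the explicit barrier $\varphi$, can be tested numerically. The sketch below is illustrative only; the function names are hypothetical, and the ODE is checked with a centred finite difference rather than symbolically.

```python
def barrier(K, n, t):
    """Lower barrier phi(t) = -K / (1 + 2 K t / n)."""
    return -K / (1.0 + 2.0 * K * t / n)


def barrier_residual(K, n, t, h=1e-6):
    """phi'(t) - (2/n) phi(t)^2, with phi' approximated by a centred difference."""
    dphi = (barrier(K, n, t + h) - barrier(K, n, t - h)) / (2.0 * h)
    return dphi - (2.0 / n) * barrier(K, n, t) ** 2


def cauchy_schwarz_gap(eigenvalues):
    """n |Ric|^2 - R^2 for a list of Ricci eigenvalues; nonnegative by Cauchy-Schwarz."""
    n = len(eigenvalues)
    scalar = sum(eigenvalues)
    ric_norm_sq = sum(x * x for x in eigenvalues)
    return n * ric_norm_sq - scalar * scalar


if __name__ == "__main__":
    print(barrier_residual(2.0, 3, 0.5))        # close to zero
    print(cauchy_schwarz_gap([0.0, 1.0, 3.0]))  # strictly positive
    print(cauchy_schwarz_gap([1.0, 1.0, 1.0]))  # zero in the Einstein case
```

The gap vanishes exactly when all Ricci eigenvalues coincide, which is the pointwise Einstein case in which the inequality $\partial_t R\ge\Delta R+\frac{2}{n}R^2$ becomes an equality.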
The lower-bound example is a weak comparison statement: it prevents crossing below an ODE barrier. A later pinching argument also needs to know what happens when the minimum is actually attained at positive time, so the next theorem records the strong form needed to distinguish strict positivity from rigidity.
[quotetheorem:5985]
[citeproof:5985]
For Ricci flow, the scalar strong principle says that a nonnegative scalar quantity either becomes strictly positive or falls into the rigid alternative of vanishing identically. Connectedness is essential for the global conclusion: on a disconnected manifold the function can vanish identically on one component while being positive on another. The nonnegativity assumption is also essential, since sign-changing solutions of the heat equation can have isolated zeros without being identically zero. This result still applies only to scalar functions; tensor positivity can fail through a zero eigenvector moving in a vector bundle, so the next step is to formulate a maximum principle that controls those directions.
## Maximum Principle for Vector Bundles and Symmetric Two-Tensors
The next problem is to preserve positivity for sections of vector bundles. For a symmetric two-tensor $h$, nonnegativity means $h(v,v) \ge 0$ for every tangent vector $v$, so the dangerous point is a spacetime point where a zero eigenvalue first appears.
[definition: Time-Dependent Bundle Laplacian]
Let $E \to M$ be a vector bundle over a compact manifold with a time-dependent bundle metric and compatible connection $\nabla(t)$. For each $t$, the associated rough Laplacian is the operator
\begin{align*}
\Delta_t : \Gamma(E) \to \Gamma(E)
\end{align*}
defined by
\begin{align*}
\Delta_t s = \sum_{i=1}^{n}\left(\nabla(t)_{e_i}\nabla(t)_{e_i}s - \nabla(t)_{\nabla^{g(t)}_{e_i}e_i}s\right),
\end{align*}
where $(e_1,\dots,e_n)$ is any local $g(t)$-orthonormal frame.
[/definition]
The rough Laplacian diffuses sections in a way compatible with parallel transport, so the heat term alone should not create a first exit from a parallel-transport-invariant constraint. To state the constraint that the section must remain inside, we need a fiberwise convex target set rather than an ordered interval.
[definition: Fiberwise Closed Convex Set]
Let $E \to M$ be a vector bundle with fiber metric. A family $\mathcal K \subset E$ is a fiberwise closed convex set if each fiber slice $\mathcal K_x = \mathcal K \cap E_x$ is a closed convex subset of $E_x$.
[/definition]
Convexity is the geometric replacement for ordering in the scalar maximum principle: it supplies supporting hyperplanes at boundary points. The maximum principle below is needed because the reaction term must be tested against those supporting hyperplanes, while the diffusion term is controlled by parallel extension.
[quotetheorem:5986]
[citeproof:5986]
Hamilton's principle is general, but its hypotheses carry real content. Closedness and convexity make nearest-point and supporting-hyperplane arguments available; without convexity, the heat term may push a section across an inward-curving boundary even when the fiber ODE looks harmless. Parallel-transport invariance is also necessary, because the diffusion term compares nearby fibers using the connection. In Ricci flow applications the fibers are often symmetric two-tensors such as $\operatorname{Ric}$ or modified curvature tensors, and the next theorem converts the abstract ODE-invariance hypothesis into the null-vector condition used in actual tensor calculations.
[quotetheorem:5987]
[citeproof:5987]
This tensor principle explains why curvature positivity is plausible under Ricci flow: once the curvature evolution has been written as diffusion plus algebraic reaction, the question becomes an algebraic tangency condition on a cone. The null-vector condition is the part that cannot be omitted; a reaction term that is negative on a null direction can immediately create a negative eigenvalue from a semidefinite tensor. The theorem preserves weak nonnegativity, but it does not say that zero eigenvalues disappear or that the null space has any regularity. Those stronger conclusions require the equality case of the tensor maximum principle and lead to the parallel-null-space discussion later in the chapter.
[example: Ricci Tensor in Dimension Three]
On a $3$-manifold, the Riemann curvature tensor is determined by the Ricci tensor, so the null-vector condition can be checked in Ricci eigenvalues. Suppose $\operatorname{Ric}\ge 0$ and choose a $g(t)$-orthonormal frame diagonalising $\operatorname{Ric}$ with eigenvalues $0,\mu,\nu$, where $e_1$ spans the zero eigendirection. Thus $R_{11}=0$, $R_{22}=\mu$, $R_{33}=\nu$, and $R_{kl}=0$ for $k\ne l$.
The reaction part of the covariant Ricci evolution is
\begin{align*}
Q_{ij}=2R_{ikjl}R_{kl}-2R_{ik}R_{kj}.
\end{align*}
Evaluating on the null vector $e_1$ gives
\begin{align*}
Q(e_1,e_1)=Q_{11}=2R_{1k1l}R_{kl}-2R_{1k}R_{k1}.
\end{align*}
Since $\operatorname{Ric}$ is diagonal and $R_{11}=R_{12}=R_{13}=0$, the second term is
\begin{align*}
2R_{1k}R_{k1}=2(R_{11}R_{11}+R_{12}R_{21}+R_{13}R_{31})=0.
\end{align*}
The first term keeps only the diagonal Ricci entries:
\begin{align*}
2R_{1k1l}R_{kl}=2R_{1212}R_{22}+2R_{1313}R_{33}=2K_{12}\mu+2K_{13}\nu.
\end{align*}
Hence
\begin{align*}
Q(e_1,e_1)=2K_{12}\mu+2K_{13}\nu.
\end{align*}
In dimension three, the Ricci eigenvalues are sums of the sectional curvatures through the corresponding basis vector:
\begin{align*}
0=K_{12}+K_{13}.
\end{align*}
\begin{align*}
\mu=K_{12}+K_{23}.
\end{align*}
\begin{align*}
\nu=K_{13}+K_{23}.
\end{align*}
Subtracting the last equation from the middle equation gives
\begin{align*}
\mu-\nu=K_{12}-K_{13}.
\end{align*}
The first of the three equations, $K_{12}+K_{13}=0$, gives
\begin{align*}
K_{13}=-K_{12}.
\end{align*}
Therefore
\begin{align*}
\mu-\nu=K_{12}-(-K_{12})=2K_{12}.
\end{align*}
So
\begin{align*}
K_{12}=\frac{\mu-\nu}{2}.
\end{align*}
Using $K_{13}=-K_{12}$ also gives
\begin{align*}
K_{13}=\frac{\nu-\mu}{2}.
\end{align*}
Substituting these values into $Q(e_1,e_1)$,
\begin{align*}
Q(e_1,e_1)=2\left(\frac{\mu-\nu}{2}\right)\mu+2\left(\frac{\nu-\mu}{2}\right)\nu.
\end{align*}
Thus
\begin{align*}
Q(e_1,e_1)=(\mu-\nu)\mu+(\nu-\mu)\nu.
\end{align*}
Expanding the two products,
\begin{align*}
Q(e_1,e_1)=\mu^2-\mu\nu+\nu^2-\mu\nu.
\end{align*}
Combining the middle terms,
\begin{align*}
Q(e_1,e_1)=\mu^2-2\mu\nu+\nu^2.
\end{align*}
Factoring the square,
\begin{align*}
Q(e_1,e_1)=(\mu-\nu)^2\ge 0.
\end{align*}
Thus the reaction term is inward-pointing on every null direction on the boundary of the cone of nonnegative Ricci tensors. By the *Tensor Maximum Principle for Symmetric Two-Tensors*, nonnegative Ricci curvature is preserved in dimension three.
[/example]
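The identity $Q(e_1,e_1)=(\mu-\nu)^2$ can be confirmed by building the full curvature tensor and contracting. The sketch below assumes the standard three-dimensional decomposition $R_{ijkl}=R_{ik}\delta_{jl}+R_{jl}\delta_{ik}-R_{il}\delta_{jk}-R_{jk}\delta_{il}-\tfrac{R}{2}(\delta_{ik}\delta_{jl}-\delta_{il}\delta_{jk})$ in an orthonormal frame, written in the index convention of this example so that $R_{1212}=K_{12}$. That formula is not stated in the notes, and the function names are hypothetical.

```python
from itertools import product


def riemann_from_ricci_3d(ric):
    """Curvature components R_ijkl in dimension 3 from a Ricci matrix.

    Assumption: orthonormal frame, convention with R_ijij equal to the
    sectional curvature of the (i, j) plane.
    """
    def d(a, b):
        return 1.0 if a == b else 0.0

    scalar = sum(ric[i][i] for i in range(3))
    return {
        (i, j, k, l): (
            ric[i][k] * d(j, l) + ric[j][l] * d(i, k)
            - ric[i][l] * d(j, k) - ric[j][k] * d(i, l)
            - 0.5 * scalar * (d(i, k) * d(j, l) - d(i, l) * d(j, k))
        )
        for i, j, k, l in product(range(3), repeat=4)
    }


def reaction_on_null_vector(mu, nu):
    """Q_11 = 2 R_1k1l R_kl - 2 R_1k R_k1 for Ric = diag(0, mu, nu)."""
    ric = [[0.0, 0.0, 0.0], [0.0, mu, 0.0], [0.0, 0.0, nu]]
    rm = riemann_from_ricci_3d(ric)
    first = 2.0 * sum(rm[0, k, 0, l] * ric[k][l] for k in range(3) for l in range(3))
    second = 2.0 * sum(ric[0][k] * ric[k][0] for k in range(3))
    return first - second


if __name__ == "__main__":
    for mu, nu in [(3.0, 1.0), (0.5, 2.0), (1.0, 1.0)]:
        print(mu, nu, reaction_on_null_vector(mu, nu), (mu - nu) ** 2)
```

The brute-force contraction avoids solving for $K_{12}$ and $K_{13}$ by hand, so it gives an independent check of the elimination carried out in the example.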
The preceding example is an algebraic computation in disguise. The next section packages such computations using cones of algebraic curvature tensors, which are the natural fibers for the full curvature operator.
## Invariant Convex Cones of Algebraic Curvature Tensors
The final problem is to identify curvature conditions that are preserved by the quadratic reaction term in the Riemann curvature evolution. Diffusion is handled by Hamilton's principle; the real content is to prove that a chosen cone is invariant under the curvature ODE.
[definition: Algebraic Curvature Tensor]
Let $V$ be a finite-dimensional real [inner product space](/page/Inner%20Product%20Space). An algebraic curvature tensor on $V$ is a multilinear map $R: V^4 \to \mathbb R$ satisfying
\begin{align*}
R_{ijkl} = -R_{jikl} = -R_{ijlk}.
\end{align*}
\begin{align*}
R_{ijkl} = R_{klij}.
\end{align*}
\begin{align*}
R_{ijkl} + R_{iklj} + R_{iljk} = 0.
\end{align*}
[/definition]
The algebraic symmetries isolate the pointwise object whose evolution is governed by diffusion plus a quadratic reaction term. To apply a convex-cone maximum principle, we need to express a familiar curvature positivity condition as a closed convex cone in this vector space.
[definition: Nonnegative Curvature Operator]
Let $R$ be an algebraic curvature tensor on an inner product space $V$. The associated curvature operator $\mathcal R: \Lambda^2 V^* \to \Lambda^2 V^*$ is defined by
\begin{align*}
(\mathcal R(\omega))_{ij} = \frac{1}{2}\sum_{k,l} R_{ijkl}\omega_{kl}.
\end{align*}
The curvature operator is nonnegative if
\begin{align*}
(\mathcal R\omega,\omega)_{\Lambda^2 V^*} \ge 0
\end{align*}
for every $\omega \in \Lambda^2 V^*$.
[/definition]
This cone is closed, convex, and invariant under orthogonal changes of frame. The preservation theorem is needed because all geometric applications require the cone to be invariant not only under frame changes, but also under the curvature reaction ODE.
[quotetheorem:5988]
[citeproof:5988]
This theorem is one of the central structural facts of the subject. It says that a pointwise curvature inequality, although nonlinear and tensorial, behaves like a parabolic invariant once the right algebraic cone has been identified. The curvature-operator hypothesis is stronger than nonnegative sectional curvature in higher dimensions, and the proof uses that strength through the convex cone of [self-adjoint operators](/page/Self-Adjoint%20Operators) on $\Lambda^2T^*M$. The compactness assumption again avoids boundary-at-infinity issues; without it, one needs a noncompact maximum principle with curvature control. The theorem does not classify metrics with nonnegative curvature operator, but it supplies the preserved positivity condition needed for later pinching and rigidity arguments.
[quotetheorem:5989]
[citeproof:5989]
Preservation is a weak statement: it keeps a solution inside the cone but does not say whether boundary points disappear. The dimension-three restriction is essential for this Ricci argument because the Riemann tensor is then determined by $\operatorname{Ric}$; in higher dimensions the Weyl tensor contributes extra reaction terms and nonnegative Ricci curvature is not preserved by the same null-vector check. The theorem also does not produce quantitative lower Ricci bounds beyond nonnegativity. To understand pinching and rigidity, the strong tensor maximum principle is needed to decide when weakly positive curvature improves to strict positivity and when a parallel null distribution remains.
[quotetheorem:5990]
[citeproof:5990]
The improvement theorem is used to pass from weak curvature assumptions to strict curvature geometry after a short time. The holonomy condition is the hypothesis that rules out a nonzero parallel family of curvature-null two-forms; without it, product metrics such as round cylinders can remain on the boundary of the cone. The assumption that the curvature operator is somewhere positive excludes the opposite failure mode in which every curvature direction is null. The theorem does not say that every weakly nonnegative solution becomes round or locally symmetric, but it identifies the analytic alternative that later rigidity theorems refine using holonomy theory and topological splitting arguments.
[example: Round Cylinder and Boundary of the Cone]
For $n\ge 3$, write the shrinking cylinder metric as
\begin{align*}
g(t)=r(t)^2 g_{S^{n-1}}+dz^2,\qquad r(t)^2=r(0)^2-2(n-2)t.
\end{align*}
Choose a $g(t)$-orthonormal product frame $e_1,\dots,e_{n-1},e_n$, where $e_1,\dots,e_{n-1}$ are tangent to $S^{n-1}$ and $e_n=\partial_z$. The round factor has sectional curvature $1/r(t)^2$, while the line factor is flat and the product has no mixed curvature terms. Thus
\begin{align*}
R(e_i,e_j,e_i,e_j)=\frac{1}{r(t)^2}
\end{align*}
for $1\le i<j\le n-1$, and
\begin{align*}
R(e_i,e_n,e_i,e_n)=0
\end{align*}
for $1\le i\le n-1$. Hence the curvature operator has eigenvalue $1/r(t)^2$ on each sphere two-form $e_i\wedge e_j$ and eigenvalue $0$ on each mixed two-form $e_i\wedge e_n$.
Let
\begin{align*}
\omega=\sum_{1\le i<j\le n-1} a_{ij}e_i\wedge e_j+\sum_{i=1}^{n-1} b_i e_i\wedge e_n.
\end{align*}
The two-form basis is orthonormal, so the sphere and mixed components are orthogonal. Therefore
\begin{align*}
(\mathcal R\omega,\omega)=\frac{1}{r(t)^2}\sum_{1\le i<j\le n-1}a_{ij}^2+\sum_{i=1}^{n-1}0\cdot b_i^2.
\end{align*}
Equivalently,
\begin{align*}
(\mathcal R\omega,\omega)=\frac{1}{r(t)^2}\sum_{1\le i<j\le n-1}a_{ij}^2\ge 0.
\end{align*}
So the curvature operator is nonnegative. It is not positive, because if $v\in TS^{n-1}$ is nonzero, then
\begin{align*}
(\mathcal R(v\wedge e_n),v\wedge e_n)=0.
\end{align*}
The null space is exactly the mixed subspace
\begin{align*}
\{v\wedge e_n:v\in TS^{n-1}\}.
\end{align*}
For the product connection, $\nabla_X e_n=0$ for every vector field $X$. Hence
\begin{align*}
\nabla_X(v\wedge e_n)=(\nabla_Xv)\wedge e_n+v\wedge\nabla_Xe_n.
\end{align*}
Using $\nabla_Xe_n=0$, this becomes
\begin{align*}
\nabla_X(v\wedge e_n)=(\nabla_Xv)\wedge e_n,
\end{align*}
which is again a mixed two-form. Thus the zero curvature directions form a parallel null subspace, and the shrinking cylinder remains on the boundary of the nonnegative curvature-operator cone instead of becoming positive. This is the model reason the strong tensor maximum principle must include a parallel-null-space alternative.
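The eigenvalue count in this example can be tabulated directly. The sketch below assumes the diagonal form of the curvature operator in the product frame described above; the helper names are hypothetical and the choice $n=4$, $r(0)^2=1$ is only for illustration.

```python
from itertools import combinations

def cylinder_curvature_operator(n, r_squared):
    """Diagonal of the curvature operator of S^{n-1}(r) x R on the basis e_i ^ e_j, i < j.

    The last index (n - 1, counting from 0) plays the role of the line direction e_n.
    """
    line = n - 1
    diagonal = {}
    for i, j in combinations(range(n), 2):
        # Mixed two-forms e_i ^ e_n are null; sphere two-forms have eigenvalue 1/r^2.
        diagonal[(i, j)] = 0.0 if j == line else 1.0 / r_squared
    return diagonal

def radius_squared(n, r0_squared, t):
    """r(t)^2 = r(0)^2 - 2(n-2)t for the shrinking cylinder."""
    return r0_squared - 2 * (n - 2) * t

n, r0_squared = 4, 1.0
results = []
for t in (0.0, 0.1, 0.2):
    diag = cylinder_curvature_operator(n, radius_squared(n, r0_squared, t))
    null_dim = sum(1 for value in diag.values() if value == 0.0)
    results.append((t, null_dim, min(diag.values())))
    print(t, null_dim, min(diag.values()))
```

At each sampled time the null space has dimension $n-1$ and the smallest eigenvalue stays at $0$, which is the numerical shadow of the statement that the cylinder never leaves the boundary of the cone.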
[/example]
Maximum principles therefore convert the evolution equations of the previous chapter into durable geometric information. Scalar comparison gives quantitative lower bounds, Hamilton's bundle principle preserves tensor cones, and invariant curvature cones turn algebraic positivity into parabolic invariance. These tools will be used repeatedly when studying singularity formation, pinching estimates, and the structure of ancient solutions.
The preservation results above are largely qualitative, and many applications require sharper quantitative control. Harnack inequalities provide that refinement by comparing values of solutions across space and time along an evolving metric.
# 5. Harnack Inequalities and Li-Yau Estimates
This chapter studies the quantitative form of the maximum principle for heat-type equations on manifolds whose geometry is itself evolving. Chapters 3 and 4 used evolution equations and tensor maximum principles to obtain qualitative preservation results. We now ask for estimates that compare a solution at different points and times, and for curvature analogues that behave like differential Harnack inequalities for the Ricci flow.
The expected background is the heat equation on Riemannian manifolds, Bochner's formula, the parabolic maximum principle, and the curvature evolution equations for Ricci flow. The guiding pattern is that a parabolic equation often controls not only the value of a positive solution, but also a combination of its time derivative, its gradient, and the ambient curvature. In the static case this is the [Li-Yau gradient estimate](/theorems/5992) for the heat equation. Under Ricci flow the metric variation contributes additional terms, and Hamilton's Harnack estimates turn the same idea into a curvature inequality.
## Positive Heat Solutions on Fixed Backgrounds
The first question is how much a positive heat solution can oscillate between nearby points and nearby times. The ordinary maximum principle controls extrema, but it does not directly compare $u(x,t)$ with $u(y,s)$ when $s<t$. The Li-Yau estimate supplies the missing differential control by applying the parabolic maximum principle to the logarithm of the solution.
[definition: Positive Heat Solution]
Let $(M,g)$ be a Riemannian manifold and let $I \subset \mathbb R$ be an interval. A positive heat solution on $M \times I$ is a smooth function $u:M \times I \to (0,\infty)$ satisfying
\begin{align*}
\partial_t u = \Delta_g u.
\end{align*}
[/definition]
Passing to $f=\log u$ separates the multiplicative scale of $u$ from its spatial variation. To build a maximum-principle quantity involving $|\nabla f|^2$ and $\partial_t f$, we first need the exact equation satisfied by this logarithmic variable.
[quotetheorem:5991]
[citeproof:5991]
The logarithmic equation contains the two competing quantities in the Harnack expression. Positivity of $u$ is not a cosmetic assumption: it is what makes $f=\log u$ smooth and turns multiplicative comparison of $u$ into additive comparison of $f$. The chain-rule identity itself still holds if $g$ is replaced by a time-dependent family and $u$ solves $\partial_tu=\Delta_{g(t)}u$, but the static Li-Yau computation does not pass unchanged to that setting because differentiating $|\nabla f|_{g(t)}^2$ and $\Delta_{g(t)}f$ produces metric-variation terms.
A concrete model shows the obstruction. On $\mathbb R^n$ let $g(t)=e^{-2t}g_{\mathrm{Euc}}$ and solve
\begin{align*}
\partial_tu=\Delta_{g(t)}u=e^{2t}\Delta_{\mathrm{Euc}}u.
\end{align*}
The heat kernel is the Euclidean kernel with effective time
\begin{align*}
A(t)=\int_0^t e^{2s}\,ds=\frac{e^{2t}-1}{2}.
\end{align*}
For $f=\log u$, the static Harnack expression becomes
\begin{align*}
|\nabla f|_{g(t)}^2-\partial_t f=\frac{n}{2e^{-2t}A(t)}
=\frac{n}{1-e^{-2t}},
\end{align*}
which is larger than $n/(2t)$ for $t>0$. Thus the fixed-background Li-Yau bound with the same time constant fails for this moving metric, even though the logarithmic heat equation remains valid with $g(t)$ inserted.
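A finite-difference check of this model is sketched below. It assumes the kernel $u=(4\pi A(t))^{-n/2}\exp(-|x|^2/(4A(t)))$ described above, evaluated at a point at Euclidean distance `r` from the origin; the function names and sample values are illustrative.

```python
import math

def log_kernel(r, t, n):
    """f = log u for the Euclidean kernel run at effective time A(t) = (e^{2t} - 1)/2."""
    a = (math.exp(2 * t) - 1) / 2
    return -0.5 * n * math.log(4 * math.pi * a) - r * r / (4 * a)

def harnack_expression(r, t, n, h=1e-5):
    """Approximate |grad f|^2_{g(t)} - d_t f by central differences."""
    df_dr = (log_kernel(r + h, t, n) - log_kernel(r - h, t, n)) / (2 * h)
    df_dt = (log_kernel(r, t + h, n) - log_kernel(r, t - h, n)) / (2 * h)
    # g(t) = e^{-2t} g_Euc, so the inverse metric contributes a factor e^{2t}.
    return math.exp(2 * t) * df_dr ** 2 - df_dt

n = 3
rows = []
for t in (0.1, 0.5, 2.0):
    numeric = harnack_expression(0.7, t, n)
    closed_form = n / (1 - math.exp(-2 * t))
    static_bound = n / (2 * t)
    rows.append((t, numeric, closed_form, static_bound))
    print(t, numeric, closed_form, static_bound)
```

In each row the numerical value should agree with $n/(1-e^{-2t})$ up to discretisation error and exceed the static constant $n/(2t)$.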
The next step is to use Bochner's identity to control the parabolic evolution of the Harnack expression. This is the first point where the geometry of the background manifold enters the argument: Bochner's formula produces a Ricci curvature term that must have a favourable sign. The estimate below is therefore not only a heat-equation estimate, but a comparison between diffusion and the lower Ricci curvature of the space on which diffusion takes place.
[quotetheorem:5992]
[citeproof:5992]
Completeness prevents the maximum-principle argument from losing maxima at a metric boundary; on a compact manifold this issue is absent, while on a noncompact manifold it is replaced by the cutoff hypotheses in the statement. The curvature condition $\operatorname{Ric}_g\ge 0$ is precisely what prevents the Bochner curvature term from having the wrong sign. On hyperbolic space, where $\operatorname{Ric}=-(n-1)g$, the sharp Li-Yau type estimate contains additional lower-curvature terms, so the Euclidean constant $n/(2t)$ is not a universal bound. Positivity of $u$ is needed throughout because the estimate is really an estimate for $\log u$: a signed heat solution such as a difference of two positive heat kernels has zeros, so $\log u$ is not defined globally and no multiplicative Harnack comparison of this form can hold across its nodal set.
The differential estimate is local in space-time, so it is not yet the comparison theorem that Harnack inequalities are meant to provide. To compare two separate points, we integrate the logarithmic derivative of $u$ along a space-time path and optimise over the path.
[quotetheorem:5993]
[citeproof:5993]
This integrated form uses the same hypotheses as the differential Li-Yau estimate, so it inherits both its strength and its limitations. The restriction $s>0$ is essential because the time factor
\begin{align*}
\left(\frac{t}{s}\right)^{n/2}
\end{align*}
records the singular behaviour of heat kernels as the initial time is approached. The estimate compares positive solutions only forward in time and depends on the Riemannian distance for the fixed metric; once the metric moves, the path energy and the Harnack quantity must be modified.
The constants in this comparison theorem are not artifacts of the proof. The fundamental solution on Euclidean space saturates the differential inequality, so any stronger universal estimate would fail even in flat geometry.
[example: Euclidean Gaussian Heat Kernel]
On $\mathbb R^n$ with its Euclidean metric, the heat kernel based at the origin is
\begin{align*}
G(x,t)=(4\pi t)^{-n/2}\exp\left(-\frac{|x|^2}{4t}\right), \qquad t>0.
\end{align*}
For $f=\log G$, taking the logarithm of the product gives
\begin{align*}
f(x,t)=-\frac{n}{2}\log(4\pi t)-\frac{|x|^2}{4t}.
\end{align*}
When differentiating in the spatial variables, $t$ is fixed and $\nabla |x|^2=2x$, so
\begin{align*}
\nabla f=0-\frac{1}{4t}\nabla |x|^2=-\frac{x}{2t}.
\end{align*}
Therefore
\begin{align*}
|\nabla f|^2=\left\langle -\frac{x}{2t},-\frac{x}{2t}\right\rangle=\frac{|x|^2}{4t^2}.
\end{align*}
When differentiating in time, $x$ is fixed and $\partial_t(t^{-1})=-t^{-2}$, hence
\begin{align*}
\partial_t f=-\frac{n}{2}\cdot\frac{1}{t}-\frac{|x|^2}{4}\partial_t(t^{-1})=-\frac{n}{2t}+\frac{|x|^2}{4t^2}.
\end{align*}
Substituting the two computed terms into the Harnack expression gives
\begin{align*}
|\nabla f|^2-\partial_t f=\frac{|x|^2}{4t^2}-\left(-\frac{n}{2t}+\frac{|x|^2}{4t^2}\right)=\frac{n}{2t}.
\end{align*}
Thus the Euclidean Gaussian attains the constant $n/(2t)$ pointwise, so the Li-Yau estimate is sharp on the flat model.
[/example]
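The integrated comparison can also be tested on the Gaussian. The sketch below assumes the integrated estimate takes the standard form $u(x,s)\le u(y,t)\,(t/s)^{n/2}\exp\big(d(x,y)^2/(4(t-s))\big)$ for $0<s<t$, which is consistent with the time factor discussed above but is not quoted from the theorem statement; the function names are hypothetical.

```python
import math
import random

def heat_kernel(x, t):
    """Euclidean heat kernel based at the origin, evaluated at the point x (a tuple)."""
    n = len(x)
    return (4 * math.pi * t) ** (-n / 2) * math.exp(-sum(c * c for c in x) / (4 * t))

def harnack_bound(y, t, x, s):
    """Right-hand side G(y,t) (t/s)^{n/2} exp(|x-y|^2 / (4(t-s))) for 0 < s < t."""
    n = len(x)
    dist_sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return heat_kernel(y, t) * (t / s) ** (n / 2) * math.exp(dist_sq / (4 * (t - s)))

random.seed(1)
violations = 0
for _ in range(2000):
    x = tuple(random.uniform(-2, 2) for _ in range(3))
    y = tuple(random.uniform(-2, 2) for _ in range(3))
    s = random.uniform(0.1, 1.0)
    t = s + random.uniform(0.1, 1.0)
    # Small relative slack absorbs floating-point rounding.
    if heat_kernel(x, s) > harnack_bound(y, t, x, s) * (1 + 1e-9):
        violations += 1

# Equality is expected along the ray y = (t/s) x.
x, s, t = (0.3, -0.4, 0.5), 0.5, 1.25
y = tuple((t / s) * c for c in x)
equality_ratio = heat_kernel(x, s) / harnack_bound(y, t, x, s)
print(violations, equality_ratio)
```

The equality ray $y=(t/s)x$ is the straight space-time path through the origin, which matches the statement that the flat model saturates the estimate.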
## Heat Equations Coupled to Ricci Flow
Ricci flow changes the metric while the heat equation evolves, so the next problem is to identify which Bochner identities survive in a moving geometry. The main new feature is that gradients, Laplacians, and volume forms all depend on $t$. The analytic gain is that the variation $\partial_t g=-2\operatorname{Ric}$ cancels the Ricci term appearing in the ordinary Bochner formula.
[definition: Forward Heat Solution Along Ricci Flow]
Let $(M,g(t))_{t\in I}$ be a Ricci flow. A forward heat solution along the flow is a smooth function $u:M\times I\to\mathbb R$ satisfying
\begin{align*}
\partial_t u=\Delta_{g(t)}u.
\end{align*}
[/definition]
The fixed-metric Bochner formula contains $\operatorname{Ric}(\nabla u,\nabla u)$. Under Ricci flow that term is exactly balanced by differentiating the metric used to measure $|\nabla u|^2$, which gives a cleaner evolution equation for the gradient energy density.
[quotetheorem:5994]
[citeproof:5994]
The cancellation depends on both hypotheses: the metric must evolve by $\partial_tg=-2\operatorname{Ric}$ and $u$ must solve the forward heat equation for that same family of metrics. If the metric evolved by a different symmetric tensor, the time derivative of $g^{-1}$ would leave an additional term in the gradient evolution. If $u$ had a forcing term or solved the conjugate heat equation instead, the final expression would contain the gradient of that defect. Thus the theorem is a precise compatibility identity, not a general Bochner formula for arbitrary moving backgrounds.
The evolving Bochner formula is a pointwise identity, so it does not yet explain how global heat quantities change in time. The next problem is to differentiate integrals under Ricci flow, and this motivates the volume variation formula.
[quotetheorem:5995]
[citeproof:5995]
This formula is local and requires no compactness, but global differentiations of integrals require enough integrability or compact support to justify passing the derivative through the integral sign. For instance, on the static Ricci flow $(\mathbb R,g_{\mathrm{Euc}})$ the local formula gives $\partial_t d\mu=0$, but the global quantity
\begin{align*}
\int_{\mathbb R} e^t\,d\mu
\end{align*}
is infinite for every $t$. Its derivative is therefore not a finite number obtained by integrating $\partial_t(e^t)$ over $\mathbb R$; the formal calculation has no global meaning without an integrability hypothesis. The scalar curvature appears because the volume density responds only to the trace of the metric variation. For a general variation
\begin{align*}
\partial_tg=h,
\end{align*}
the volume factor would be
\begin{align*}
\frac{1}{2}\operatorname{tr}_g h
\end{align*}
instead of $-R$. Under Ricci flow this trace term is exactly what changes the formal adjoint of the heat operator.
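The general volume factor $\tfrac12\operatorname{tr}_g h$ can be checked at a single point in a coordinate chart. The sketch below assumes a two-dimensional chart in which the volume density is $\sqrt{\det g}$; the matrices `g` and `h` are arbitrary illustrative choices, not data from the text.

```python
import math

def sqrt_det(g):
    """Volume density sqrt(det g) for a symmetric positive-definite 2x2 matrix."""
    return math.sqrt(g[0][0] * g[1][1] - g[0][1] * g[1][0])

def half_trace(g, h):
    """(1/2) tr_g h = (1/2) g^{ij} h_{ij} for symmetric 2x2 matrices."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    inv = [[g[1][1] / det, -g[0][1] / det], [-g[1][0] / det, g[0][0] / det]]
    return 0.5 * sum(inv[i][j] * h[i][j] for i in range(2) for j in range(2))

def perturb(g, h, eps):
    """The metric g + eps * h, entry by entry."""
    return [[g[i][j] + eps * h[i][j] for j in range(2)] for i in range(2)]

g = [[2.0, 0.3], [0.3, 1.5]]
h = [[-0.4, 0.1], [0.1, 0.7]]
eps = 1e-6
# Central difference of the volume density in the direction h.
density_rate = (sqrt_det(perturb(g, h, eps)) - sqrt_det(perturb(g, h, -eps))) / (2 * eps)
predicted_rate = half_trace(g, h) * sqrt_det(g)
print(density_rate, predicted_rate)
```

Taking $h=-2\operatorname{Ric}$ in `half_trace` returns $-\operatorname{tr}_g\operatorname{Ric}=-R$, which is the Ricci flow case.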
Because the volume form loses a factor of scalar curvature, the adjoint of the forward heat operator is not obtained by merely changing the sign of time. This leads to the conjugate heat equation, which will become central in entropy formulae.
[definition: Conjugate Heat Solution]
Let $(M,g(t))_{t\in[0,T]}$ be a Ricci flow and write $\tau=T-t$. A conjugate heat solution is a smooth function $v:M\times[0,T]\to\mathbb R$ satisfying
\begin{align*}
-\partial_t v-\Delta_{g(t)}v+Rv=0.
\end{align*}
[/definition]
Equivalently, in backward time the same equation reads $\partial_\tau v=\Delta_{g(T-\tau)}v-Rv$. The extra scalar curvature term is forced by the evolution of $d\mu_{g(t)}$; the next identity verifies that it is exactly the term needed to make forward and backward heat solutions dual to each other.
[quotetheorem:5996]
[citeproof:5996]
Compactness is used to remove boundary terms and to justify differentiating the integral without imposing decay assumptions at infinity. On a manifold with boundary, one would need compatible boundary conditions, such as Dirichlet or Neumann conditions that make the integration-by-parts boundary contribution vanish. On a complete noncompact flow, the same conservation law requires decay or cutoff hypotheses strong enough to pass to the limit in the integration-by-parts identity. This is why the compact statement captures the formal adjointness cleanly, while applications to noncompact limits need separate analytic control.
The pairing identity is the basic adjointness statement behind later entropy monotonicity formulae. For the present chapter, the same evolving calculus also gives pointwise and integral control of heat energy along the flow.
[example: Heat Energy Along Ricci Flow]
Let $u$ be a forward heat solution on a compact Ricci flow $(M,g(t))$, and define its Dirichlet energy by
\begin{align*}
E(t)=\int_M |\nabla u|_{g(t)}^2\,d\mu_{g(t)}.
\end{align*}
By *Evolving Bochner Formula For Gradients*,
\begin{align*}
(\partial_t-\Delta_{g(t)})|\nabla u|^2=-2|\operatorname{Hess}u|^2.
\end{align*}
Equivalently,
\begin{align*}
\partial_t|\nabla u|^2=\Delta_{g(t)}|\nabla u|^2-2|\operatorname{Hess}u|^2.
\end{align*}
Since $|\operatorname{Hess}u|^2\ge 0$, the pointwise identity immediately gives
\begin{align*}
(\partial_t-\Delta_{g(t)})|\nabla u|^2\le 0.
\end{align*}
For the integrated energy, compactness justifies differentiating under the integral sign. Using the product rule for the time-dependent measure,
\begin{align*}
\frac{d}{dt}E(t)=\int_M \partial_t|\nabla u|^2\,d\mu_{g(t)}+\int_M |\nabla u|^2\,\partial_t(d\mu_{g(t)}).
\end{align*}
By *Evolving Volume Formula*,
\begin{align*}
\partial_t(d\mu_{g(t)})=-R\,d\mu_{g(t)}.
\end{align*}
Substituting the two evolution formulae gives
\begin{align*}
\frac{d}{dt}E(t)=\int_M \left(\Delta_{g(t)}|\nabla u|^2-2|\operatorname{Hess}u|^2\right)\,d\mu_{g(t)}-\int_M R|\nabla u|^2\,d\mu_{g(t)}.
\end{align*}
Because $M$ is compact without boundary, [integration by parts](/theorems/2098) gives
\begin{align*}
\int_M \Delta_{g(t)}|\nabla u|^2\,d\mu_{g(t)}=0.
\end{align*}
Therefore
\begin{align*}
\frac{d}{dt}E(t)=-2\int_M |\operatorname{Hess}u|^2\,d\mu_{g(t)}-\int_M R|\nabla u|^2\,d\mu_{g(t)}.
\end{align*}
If $R\ge 0$, both terms on the right are nonpositive, so $E'(t)\le 0$. Thus nonnegative scalar curvature makes the total Dirichlet energy nonincreasing, while the pointwise Bochner inequality shows that $|\nabla u|^2$ is a heat subsolution without any sign assumption on the scalar curvature.
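The energy identity can be checked in the simplest case. The sketch below assumes the flat circle of length $2\pi$, which is a static Ricci flow with $R=0$, so only the Hessian term is tested; the solution $u=e^{-k^2t}\cos(kx)$ and the function name are illustrative choices.

```python
import math

def energy_and_hessian(t, k, points=512):
    """Return (E(t), integral of |Hess u|^2) for u = exp(-k^2 t) cos(kx) on the flat circle."""
    dx = 2 * math.pi / points
    decay = math.exp(-k * k * t)
    energy = 0.0
    hessian = 0.0
    for i in range(points):
        x = i * dx
        energy += (k * decay * math.sin(k * x)) ** 2 * dx       # |u_x|^2
        hessian += (k * k * decay * math.cos(k * x)) ** 2 * dx  # |u_xx|^2
    return energy, hessian

k, t, h = 3, 0.1, 1e-5
# Central difference approximation of E'(t).
energy_rate = (energy_and_hessian(t + h, k)[0] - energy_and_hessian(t - h, k)[0]) / (2 * h)
# Right-hand side of the energy identity when R = 0.
hessian_term = -2 * energy_and_hessian(t, k)[1]
print(energy_rate, hessian_term)
```

A test with a nonzero scalar curvature term would need a genuinely curved flow, such as an eigenfunction on the shrinking round sphere, which this sketch does not attempt.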
[/example]
## Hamilton Harnack Inequalities For Curvature
The final question is whether curvature under Ricci flow satisfies a differential Harnack inequality analogous to Li-Yau. Since curvature itself solves a nonlinear heat-type equation, the Harnack quantity must include both time derivatives and spatial derivatives of curvature. Hamilton discovered that nonnegativity of the curvature operator provides enough convexity for a tensor maximum principle argument.
[definition: Nonnegative Curvature Operator]
Let $(M,g)$ be a Riemannian manifold. The curvature operator is the symmetric endomorphism
\begin{align*}
\operatorname{Rm}:\Lambda^2T_pM\to\Lambda^2T_pM
\end{align*}
determined by the Riemann curvature tensor at $p$. The metric has nonnegative curvature operator if
\begin{align*}
\langle \operatorname{Rm}(\omega),\omega\rangle \ge 0
\end{align*}
for every $p\in M$ and every $\omega\in\Lambda^2T_pM$.
[/definition]
The definition isolates the convex cone used by Hamilton's tensor maximum principle. The course ultimately wants a scalar differential inequality for $R$, but the clean proof first establishes Hamilton's matrix inequality and then contracts it. The [trace theorem](/theorems/60) is therefore stated here as the scalar consequence, with its proof read together with the matrix theorem that follows.
[quotetheorem:5997]
[citeproof:5997]
Completeness and bounded curvature are the analytic hypotheses that allow the tensor maximum principle to be applied globally, either directly in the compact case or through cutoff arguments in the complete case. Nonnegative curvature operator is the structural condition that keeps the algebraic reaction terms inside the nonnegative cone; weaker scalar curvature assumptions do not control the full quadratic curvature terms in the Harnack tensor. For example, on a product of two surfaces whose sectional curvatures have opposite signs, the scalar curvature can be positive if the positive factor dominates, but the curvature operator still has a negative two-form direction coming from the negatively curved factor. Hamilton's matrix reaction terms see that direction, so scalar curvature positivity alone cannot preserve the Harnack cone. The trace estimate is powerful but loses information from the matrix inequality, since it controls only scalar curvature after choosing a vector field and no longer records the two-form directions of the curvature operator.
The trace theorem has a free vector field $V$, but applications need comparison along trajectories rather than only at a fixed point. A pointwise choice of $V$ does not yet say how curvature changes as one moves through space-time. By taking $V$ from the velocity of a chosen curve, the estimate becomes an inequality that can be integrated along paths.
[quotetheorem:5998]
[citeproof:5998]
The curve form turns the trace Harnack inequality into a differential comparison along a chosen trajectory, which is the curvature analogue of integrating the Li-Yau estimate along a space-time path. Its hypotheses are not incidental. If completeness fails, a maximum of the Harnack tensor can escape through the missing boundary of the manifold, so the local evolution inequality need not imply a global comparison. If curvature is unbounded on a complete noncompact flow, cutoff terms in the tensor maximum principle can acquire uncontrolled curvature factors and may not vanish as the cutoff radius tends to infinity. If the curvature operator is not nonnegative, a two-form direction with negative curvature can enter the matrix Harnack quantity even when $R$ is positive, and the contracted curve inequality has no tensor inequality behind it. At the same time, the inequality is only a scalar statement. It controls $\tau R$ along curves, but it does not determine how sectional or curvature-operator directions evolve, and it cannot distinguish two flows with the same scalar curvature but different curvature operators.
This loss of directional information is the reason the course needs Hamilton's matrix form rather than stopping at the trace estimate. The tensor maximum principle naturally acts on a cone of quadratic forms involving two-form directions, and the trace Harnack inequality is obtained by contracting that stronger statement. The next theorem records the full preserved inequality, which is the version stable enough to identify equality cases and to feed into later singularity analysis.
[quotetheorem:5999]
[citeproof:5999]
The assumptions in the matrix theorem are stronger than those in many scalar curvature estimates because the conclusion is a full quadratic-form inequality, not only a statement about $R$. Completeness and bounded curvature are needed for the tensor maximum principle and the limiting cutoff argument; without them, the evolution inequality for $Z$ may not be enough to control behaviour at infinity. A complete noncompact flow with uncontrolled curvature growth can produce cutoff error terms that do not vanish, so the global maximum-principle conclusion is no longer justified by the local evolution inequality alone. Nonnegative curvature operator is also essential to Hamilton's proof because the reaction terms act on two-form directions: products such as a positively curved surface with a negatively curved one give mixed curvature-operator signs even when some scalar quantities remain controlled, and the Harnack cone is not preserved by the same algebraic argument. The theorem still has limitations: it is a forward-time estimate for solutions with the prescribed curvature positivity, and it does not apply directly to arbitrary Ricci flows, to flows after curvature-operator positivity has failed, or to weak singular limits without a separate approximation argument.
The equality cases of the matrix inequality are geometrically meaningful, not just algebraic degeneracies. Shrinking solitons provide the model case where the Harnack quantity is exactly balanced by self-similar motion.
[example: Harnack Monotonicity On Shrinking Solitons]
A shrinking gradient Ricci soliton at backward time $\tau>0$ satisfies
\begin{align*}
\operatorname{Ric}+\operatorname{Hess} f=\frac{1}{2\tau}g.
\end{align*}
Tracing with $g^{ij}$ gives
\begin{align*}
R+\Delta f=\frac{n}{2\tau}.
\end{align*}
Taking the divergence of the soliton equation and using the contracted Bianchi identity gives
\begin{align*}
0=\frac{1}{2}\nabla_jR+\nabla^i\nabla_i\nabla_jf.
\end{align*}
Commuting derivatives on the one-form $\nabla_j f$ gives
\begin{align*}
\nabla^i\nabla_i\nabla_jf=\nabla_j\Delta f+\operatorname{Ric}_{jk}\nabla^kf.
\end{align*}
Since $\tau$ is fixed on each time slice, differentiating $R+\Delta f=n/(2\tau)$ in space gives
\begin{align*}
\nabla_j\Delta f=-\nabla_jR.
\end{align*}
Substituting this into the divergence identity gives
\begin{align*}
0=\frac{1}{2}\nabla_jR-\nabla_jR+\operatorname{Ric}_{jk}\nabla^kf.
\end{align*}
Hence
\begin{align*}
\nabla_jR=2\operatorname{Ric}_{jk}\nabla^kf.
\end{align*}
Along the canonical self-similar space-time curves, the distinguished Harnack direction is $V=-\nabla f$. In Hamilton's trace Harnack expression, the gradient and Ricci terms then become
\begin{align*}
2\langle \nabla R,V\rangle+2\operatorname{Ric}(V,V)=-2\operatorname{Ric}(\nabla f,\nabla f).
\end{align*}
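Normalise time so that the soliton becomes singular at $t=0$, which gives $t=-\tau<0$. On the exact self-similar shrinking flow the scalar curvature satisfies $\partial_tR=\langle\nabla R,\nabla f\rangle+R/\tau=2\operatorname{Ric}(\nabla f,\nabla f)-R/t$, so the time derivative and scaling term balance this remaining contribution: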
\begin{align*}
\partial_tR-2\operatorname{Ric}(\nabla f,\nabla f)+\frac{R}{t}=0.
\end{align*}
Therefore
\begin{align*}
\partial_tR+2\langle\nabla R,-\nabla f\rangle+2\operatorname{Ric}(\nabla f,\nabla f)+\frac{R}{t}=0.
\end{align*}
Thus the trace Harnack inequality is saturated on the soliton directions, so the rescaled scalar curvature along canonical soliton curves is constant in the equality model rather than merely monotone.
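The simplest instance is the shrinking round sphere, where $f$ is constant and the traced soliton equation gives $R=n/(2\tau)$. The sketch below checks the displayed balance in that case, assuming the time origin is placed at the singular time so that $t=-\tau$; this convention and the function names are assumptions made for the check.

```python
def scalar_curvature(n, tau):
    """R = n / (2 tau) on the shrinking round sphere (soliton equation with f constant)."""
    return n / (2 * tau)

def soliton_harnack_residual(n, tau, h=1e-6):
    """dR/dt + R/t with t = -tau; the gradient and Ricci terms vanish since grad f = 0."""
    # d/dt = -d/dtau, approximated by a central difference in tau.
    dR_dt = -(scalar_curvature(n, tau + h) - scalar_curvature(n, tau - h)) / (2 * h)
    t = -tau
    return dR_dt + scalar_curvature(n, tau) / t

residuals = [soliton_harnack_residual(3, tau) for tau in (0.25, 0.5, 1.0)]
print(residuals)
```

The residuals should vanish up to finite-difference error, reflecting $\partial_tR=R/\tau$ on the round sphere.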
[/example]
This soliton example previews the role Harnack inequalities will play in singularity analysis. The estimates identify self-similar shrinking geometries as rigid models and constrain the possible behaviour of ancient limits.
[remark: Role In Singularity Analysis]
Li-Yau estimates control positive heat solutions by comparing space-time points, while Hamilton Harnack inequalities control curvature in a comparable differential way. In blow-up analysis, these estimates rule out certain oscillatory behaviours of ancient solutions and identify self-similar shrinking solutions as equality models. This is why Harnack inequalities appear before entropy and reduced-volume methods in the course: they give the first systematic bridge from parabolic maximum principles to geometric monotonicity.
[/remark]
Harnack estimates show that Ricci flow has strong hidden monotonicity, and that structure becomes most useful near singular behavior. The next chapter uses these estimates to analyze blow-up limits and explain what appears when curvature becomes large.
# 6. Singularities and Blow-Up Limits
These notes study Ricci flow as a geometric evolution equation, with emphasis on how curvature estimates, maximum principles, compactness, and monotonicity combine to analyse changing Riemannian metrics. The prerequisites are the basic differential geometry of curvature tensors and Ricci curvature, parabolic PDE intuition, and the short-time existence theory developed in Chapter 2. Chapters 3--5 treated smooth existence intervals, curvature evolution, maximum principles, and Harnack estimates; this chapter begins the singularity theory by asking what forces a maximal solution to end, how to magnify the geometry near high-curvature regions, and what limiting flows record the shape of the breakdown.
## Maximal Time and Curvature Blow-Up
A first obstruction must be separated from all secondary ones: does a compact Ricci flow stop because the underlying manifold degenerates, or because curvature becomes unbounded? Hamilton's extension principle says that, for a smooth compact flow, bounded curvature is enough to continue the solution past the alleged final time.
[definition: Maximal Ricci Flow]
Let $M$ be a compact smooth manifold, and let $\operatorname{Met}^\infty(M)$ denote the space of smooth Riemannian metrics on $M$. A smooth map
\begin{align*}
g:[0,T) &\longrightarrow \operatorname{Met}^\infty(M)
\end{align*}
is a maximal Ricci flow on $[0,T)$ if $g(t)$ solves
\begin{align*}
\frac{\partial g}{\partial t} &= -2\operatorname{Ric}(g(t))
\end{align*}
for every $t\in[0,T)$, and there is no $\varepsilon>0$ and smooth Ricci flow $\tilde g:[0,T+\varepsilon)\to\operatorname{Met}^\infty(M)$ such that $\tilde g(t)=g(t)$ for all $t\in[0,T)$.
[/definition]
The definition is deliberately phrased as a non-extension statement. Since short-time existence applies from any smooth metric, a maximal interval can end only if the limiting metric fails to be smooth enough to restart the equation. The next theorem gives the analytic criterion that rules out such failure when curvature remains bounded.
[quotetheorem:6000]
[citeproof:6000]
The hypotheses are doing real work. Compactness prevents escape of geometry to spatial infinity; on noncompact manifolds, bounded curvature alone is not the clean global continuation statement without additional completeness and bounded-geometry assumptions. The finite upper time $T<\infty$ matters because the theorem is a local-in-time extension result, not a statement about what happens at infinite time. The curvature bound is the decisive analytic hypothesis: the shrinking round sphere below has finite maximal time precisely because its curvature becomes unbounded. The theorem does not classify the singularity or say where high curvature occurs; it only converts the geometric problem of continuing metrics into a curvature bound. For a maximal compact flow, any finite-time obstruction must therefore be detected by the curvature norm somewhere in space-time, which gives the basic blow-up criterion used whenever we choose singular sequences.
[quotetheorem:6001]
[citeproof:6001]
Compactness and maximality are essential here. Without maximality, a flow may be artificially stopped at a finite time while curvature remains bounded, for instance by restricting any smooth solution to a shorter interval. Without compactness, extension requires additional global hypotheses because bounded curvature at finite time does not by itself encode all possible behaviour at infinity. The conclusion also does not say that curvature blows up everywhere, nor does it prescribe a rate or a shape for the high-curvature region. It only identifies the obstruction to extension, and this is why singularity analysis begins by selecting points and times where the curvature is large.
[example: Round Sphere Extinction]
Let $(S^n,g_0)$ have constant sectional curvature $1$, and look for a Ricci flow of the form $g(t)=r(t)^2g_0$ with $r(0)=1$. Since $\operatorname{Ric}(g_0)=(n-1)g_0$ and constant rescaling does not change the Ricci tensor as a $(0,2)$-tensor, we have
\begin{align*}
\operatorname{Ric}(g(t))=(n-1)g_0.
\end{align*}
Also,
\begin{align*}
\frac{\partial g}{\partial t}=2r(t)r'(t)g_0.
\end{align*}
The Ricci flow equation $\partial_t g=-2\operatorname{Ric}(g(t))$ therefore gives
\begin{align*}
2r(t)r'(t)g_0=-2(n-1)g_0.
\end{align*}
Cancelling the nonzero tensor $g_0$ gives
\begin{align*}
2r(t)r'(t)=-2(n-1).
\end{align*}
Since $\frac{d}{dt}(r(t)^2)=2r(t)r'(t)$, this becomes
\begin{align*}
\frac{d}{dt}\left(r(t)^2\right)=-2(n-1).
\end{align*}
Integrating from $0$ to $t$ and using $r(0)^2=1$ gives
\begin{align*}
r(t)^2=1-2(n-1)t.
\end{align*}
Thus the solution exists while $r(t)^2>0$, so the extinction time is
\begin{align*}
T=\frac{1}{2(n-1)}.
\end{align*}
The volume scales by the factor $r(t)^n$, hence
\begin{align*}
\operatorname{Vol}(S^n,g(t))=r(t)^n\operatorname{Vol}(S^n,g_0).
\end{align*}
Substituting the formula for $r(t)^2$ gives
\begin{align*}
\operatorname{Vol}(S^n,g(t))=\left(1-2(n-1)t\right)^{n/2}\operatorname{Vol}(S^n,g_0).
\end{align*}
This tends to $0$ as $t\uparrow T$. The sectional curvature of $r(t)^2g_0$ is
\begin{align*}
K(t)=\frac{1}{r(t)^2}.
\end{align*}
Using $r(t)^2=1-2(n-1)t=2(n-1)(T-t)$, we get
\begin{align*}
K(t)=\frac{1}{2(n-1)(T-t)}.
\end{align*}
For an $n$-dimensional metric of constant sectional curvature $K(t)$,
\begin{align*}
|\operatorname{Rm}(g(t))|=\sqrt{2n(n-1)}\,K(t).
\end{align*}
Therefore
\begin{align*}
|\operatorname{Rm}(g(t))|=\frac{\sqrt{2n(n-1)}}{2(n-1)(T-t)}.
\end{align*}
The round sphere becomes extinct at finite time, its volume collapses to zero, and its curvature blows up at exactly the Type I rate $(T-t)^{-1}$.
[/example]
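The closed-form quantities in the example can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function names `sphere_flow` and `euler_r2` are hypothetical, and the forward Euler step is used only to confirm that the ODE $\frac{d}{dt}(r^2)=-2(n-1)$ reproduces the stated formula for $r(t)^2$.

```python
import math

def sphere_flow(n, t):
    """Closed-form data for the shrinking round sphere g(t) = r(t)^2 g_0."""
    T = 1.0 / (2 * (n - 1))                # extinction time
    r2 = 1.0 - 2 * (n - 1) * t             # r(t)^2
    K = 1.0 / r2                           # sectional curvature
    rm = math.sqrt(2 * n * (n - 1)) * K    # |Rm|
    vol_ratio = r2 ** (n / 2)              # Vol(g(t)) / Vol(g_0)
    return T, r2, K, rm, vol_ratio

def euler_r2(n, t_end, steps=1000):
    """Integrate d(r^2)/dt = -2(n-1) from r(0)^2 = 1 by forward Euler."""
    r2, dt = 1.0, t_end / steps
    for _ in range(steps):
        r2 += dt * (-2 * (n - 1))
    return r2

n = 3
T = 1.0 / (2 * (n - 1))
for frac in (0.5, 0.9, 0.99):
    t = frac * T
    _, r2, K, rm, vol = sphere_flow(n, t)
    # The product (T - t)|Rm| should not depend on t.
    print(f"t/T={frac:4.2f}  r^2={r2:.4f}  vol ratio={vol:.6f}  (T-t)|Rm|={(T - t) * rm:.6f}")
```

The last column stays equal to $\sqrt{2n(n-1)}/(2(n-1))$ while the volume ratio tends to $0$, which is the Type I behaviour described in the example.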
This example gives the first model for a singularity: after rescaling by the curvature size, the shrinking sphere has the same round geometry at every scale. The next section formalizes this rescaling.
## Type I and Type II Rates
Once blow-up is known, the next question is how fast it occurs. Ricci flow has a natural parabolic scaling, and the rate of curvature growth relative to $(T-t)^{-1}$ separates singularities into the Type I and Type II regimes.
[definition: Parabolic Rescaling]
Let $M$ be a smooth manifold, let $\operatorname{Met}^\infty(M)$ denote the space of smooth Riemannian metrics on $M$, and let
\begin{align*}
g:I &\longrightarrow \operatorname{Met}^\infty(M)
\end{align*}
be a smooth Ricci flow on an interval $I\subset\mathbb R$. For $\lambda>0$ and $t_0\in I$, define the rescaled interval
\begin{align*}
I_{\lambda,t_0} &= \{s\in\mathbb R: t_0+s/\lambda\in I\}.
\end{align*}
Parabolic rescaling based at time $t_0$ with factor $\lambda$ is the transformation from smooth Ricci flows $g:I\to\operatorname{Met}^\infty(M)$ to smooth Ricci flows $g_\lambda:I_{\lambda,t_0}\to\operatorname{Met}^\infty(M)$ given by
\begin{align*}
g_\lambda(s) &= \lambda\, g\left(t_0+\frac{s}{\lambda}\right)
\end{align*}
for all $s\in I_{\lambda,t_0}$.
[/definition]
This definition tells us how to zoom in, but a useful zoom must preserve the equation and must transform curvature in a predictable way. The next calculation verifies that the rescaled family is again a Ricci flow and identifies the normalization that makes a chosen high-curvature point have unit curvature.
[quotetheorem:6002]
[citeproof:6002]
The factor $\lambda$ and the shifted time variable are both necessary. If one rescales the metric without rescaling time by the same factor, the resulting family no longer satisfies the Ricci flow equation; if one rescales time without rescaling the metric, the equation fails in the same way and the curvature is not normalized. The theorem does not say that every choice of $\lambda$ is geometrically useful: choosing $\lambda$ much smaller than the curvature at the basepoint leaves the rescaled curvature large and the singularity unresolved, while choosing it much larger makes the rescaled curvature tend to zero and pushes the interesting region out of view. The scale-invariant quantity is therefore $(T-t)\sup_M |\operatorname{Rm}(\cdot,t)|$. Boundedness of this expression means that the curvature grows no faster, up to a constant factor, than on the shrinking round sphere; unboundedness means that along some sequence of times the curvature length scale $|\operatorname{Rm}|^{-1/2}$ becomes much smaller than the parabolic scale $\sqrt{T-t}$ of the remaining time. This dichotomy is the main rate distinction for finite-time singularities.
[definition: Type I and Type II Singularities]
Let $g(t)$ be a compact Ricci flow on $[0,T)$ with $T<\infty$ and curvature blow-up at $T$. The singularity is Type I if there exists $C<\infty$ such that
\begin{align*}
\sup_{x\in M}|\operatorname{Rm}(x,t)| &\le \frac{C}{T-t}
\end{align*}
for all $t\in[0,T)$. The singularity is Type II if it is not Type I.
[/definition]
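A Type I bound is invariant under parabolic rescaling: after rescaling by $\lambda=(T-t_0)^{-1}$ around a time $t_0$ close to $T$, the rescaled flow satisfies a bound of the same form with the same constant $C$. Type II singularities require choosing scales by the actual maximum curvature rather than by the remaining time.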
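As a worked check of how a Type I bound behaves under rescaling, here is a short computation. It assumes the standard scaling law $|\operatorname{Rm}_{\lambda g}|=\lambda^{-1}|\operatorname{Rm}_g|$ for constant $\lambda>0$; the symbol $\operatorname{Rm}_{g_\lambda}$ and the choice $\lambda=(T-t_0)^{-1}$ are notation introduced only for this illustration. For $g_\lambda(s)=\lambda g(t_0+s/\lambda)$ and a Type I bound with constant $C$,
\begin{align*}
\sup_{x\in M}|\operatorname{Rm}_{g_\lambda}(x,s)| &= \frac{1}{\lambda}\sup_{x\in M}\left|\operatorname{Rm}_g\left(x,t_0+\frac{s}{\lambda}\right)\right| \\
&\le \frac{1}{\lambda}\cdot\frac{C}{T-t_0-s/\lambda} \\
&= \frac{C}{\lambda(T-t_0)-s}.
\end{align*}
With $\lambda=(T-t_0)^{-1}$ this becomes
\begin{align*}
\sup_{x\in M}|\operatorname{Rm}_{g_\lambda}(x,s)| &\le \frac{C}{1-s}
\end{align*}
for $s\in[-t_0/(T-t_0),1)$. The rescaled singular time is $1$ and the constant $C$ is unchanged for every $t_0$, while the left endpoint of the rescaled interval tends to $-\infty$ as $t_0\uparrow T$.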
[example: Cylindrical Neckpinch]
For the shrinking round cylinder, take
\begin{align*}
g_{\mathrm{cyl}}(t)=a(t)g_{S^n}+dz^2
\end{align*}
on $S^n\times\mathbb R$, where $g_{S^n}$ has constant sectional curvature $1$ and $a(t)>0$. The product Ricci tensor has no $\mathbb R$-component, and constant rescaling of the $S^n$ metric leaves the Ricci tensor unchanged as a $(0,2)$-tensor, so
\begin{align*}
\operatorname{Ric}(g_{\mathrm{cyl}}(t))=(n-1)g_{S^n}+0\cdot dz^2.
\end{align*}
The time derivative is
\begin{align*}
\frac{\partial g_{\mathrm{cyl}}}{\partial t}=a'(t)g_{S^n}.
\end{align*}
Substituting these two expressions into $\partial_t g=-2\operatorname{Ric}$ gives
\begin{align*}
a'(t)g_{S^n}=-2(n-1)g_{S^n}.
\end{align*}
Cancelling the nonzero tensor $g_{S^n}$ gives
\begin{align*}
a'(t)=-2(n-1).
\end{align*}
If the cylindrical factor becomes extinct at time $T$, then $a(T)=0$. Integrating from $t$ to $T$ gives
\begin{align*}
a(T)-a(t)=\int_{[t,T]} -2(n-1)\,du.
\end{align*}
The integral equals $-2(n-1)(T-t)$, so
\begin{align*}
-a(t)=-2(n-1)(T-t).
\end{align*}
Therefore
\begin{align*}
a(t)=2(n-1)(T-t).
\end{align*}
The only nonzero sectional curvatures are those of planes tangent to the $S^n$ factor. Since scaling $g_{S^n}$ by $a(t)$ changes sectional curvature from $1$ to $1/a(t)$, we have
\begin{align*}
K(t)=\frac{1}{a(t)}.
\end{align*}
Using the formula for $a(t)$ gives
\begin{align*}
K(t)=\frac{1}{2(n-1)(T-t)}.
\end{align*}
For an $n$-dimensional constant-curvature factor, the curvature norm is
\begin{align*}
|\operatorname{Rm}(g_{\mathrm{cyl}}(t))|=\sqrt{2n(n-1)}\,K(t).
\end{align*}
Substituting the value of $K(t)$ yields
\begin{align*}
|\operatorname{Rm}(g_{\mathrm{cyl}}(t))|=\frac{\sqrt{2n(n-1)}}{2(n-1)(T-t)}.
\end{align*}
Thus the shrinking cylinder has Type I curvature growth. A rotationally symmetric neckpinch on compact $S^{n+1}$ can have this cylinder as its pointed high-curvature limit: the original manifold is compact, but the rescaled neck limit is the noncompact product $S^n\times\mathbb R$.
[/example]
The cylindrical neckpinch shows that a Type I singularity can remember a noncompact local geometry rather than the topology of the original compact manifold. This raises the structural question of what equations the limiting models satisfy. Under the compactness and noncollapsing inputs developed later, Type I limits are forced into the shrinking soliton class.
[quotetheorem:6003]
[citeproof:6003]
The singular-point and convergence hypotheses are essential. If the basepoint is chosen in a region whose curvature remains bounded up to $T$, the same time rescaling can converge to a flat limit rather than a singularity model. If compactness or noncollapsing fails, there may be no smooth pointed limit to classify. The theorem also does not classify all shrinking solitons or say that every arbitrary high-curvature rescaling has the same limit; it identifies the canonical structure of Type I limits once the blow-up sequence is centred at a genuine singular point and a smooth nonflat limit exists.
## Pointed Convergence of Ricci Flows
Blow-up limits live on spaces that need not be compact and may have different topology from the original manifold. To make sense of convergence in this setting, we use pointed Cheeger-Gromov convergence: convergence on larger and larger compact sets around chosen basepoints, after choosing smooth identifications.
[definition: Pointed Smooth Cheeger-Gromov Convergence]
A sequence of pointed complete Riemannian manifolds $(M_j,g_j,x_j)$ converges smoothly in the pointed Cheeger-Gromov sense to $(M_\infty,g_\infty,x_\infty)$ if there are compact sets $K_j\subset M_\infty$ exhausting $M_\infty$, with $x_\infty\in K_j$, and smooth embeddings $\Phi_j:K_j\to M_j$ satisfying $\Phi_j(x_\infty)=x_j$, such that
\begin{align*}
\Phi_j^*g_j \to g_\infty
\end{align*}
smoothly on every compact subset of $M_\infty$.
[/definition]
The preceding definition handles a single time slice, but singularity models are flows rather than isolated manifolds. We therefore need convergence that keeps track of the time-dependent metrics on compact space-time subsets while still allowing the underlying manifolds to be identified only locally near the basepoint.
[definition: Pointed Smooth Convergence of Ricci Flows]
A sequence of pointed complete Ricci flows $(M_j,g_j(t),x_j)$ defined for $t\in(a,b)$, where each
\begin{align*}
g_j:(a,b)&\longrightarrow \operatorname{Met}^\infty_{\mathrm{complete}}(M_j)
\end{align*}
is a smooth Ricci flow into the space of smooth complete Riemannian metrics on $M_j$, converges smoothly in the pointed Cheeger-Gromov sense to $(M_\infty,g_\infty(t),x_\infty)$, where
\begin{align*}
g_\infty:(a,b)&\longrightarrow \operatorname{Met}^\infty_{\mathrm{complete}}(M_\infty),
\end{align*}
if there are compact exhaustions $K_j\subset M_\infty$ and embeddings $\Phi_j:K_j\to M_j$ with $\Phi_j(x_\infty)=x_j$ such that, for every compact $K\subset M_\infty$ and compact interval $J\subset(a,b)$,
\begin{align*}
\Phi_j^*g_j(t) \to g_\infty(t)
\end{align*}
smoothly on $K\times J$.
[/definition]
This definition is useful only if there is a [compactness theorem](/theorems/2748) giving subsequential limits from geometric hypotheses. In blow-up arguments the required hypotheses are local curvature bounds and a basepoint noncollapsing condition, because those are stable under parabolic rescaling. Hamilton's theorem packages these conditions into the compactness statement used throughout the rest of the course.
[quotetheorem:6004]
[citeproof:6004]
Each hypothesis prevents a specific compactness failure. Curvature bounds are needed because a sequence of round spheres with radii tending to $0$ has curvature tending to infinity and no smooth bounded-curvature limit at the original scale. The injectivity-radius lower bound rules out collapse: flat tori with one circle factor shrinking to length $1/j$ have uniformly bounded curvature but collapse to a lower-dimensional space. Completeness and uniform control on bounded balls are what make the pointed limit a complete flow rather than a partial local object. The theorem does not identify the limit uniquely; different subsequences or basepoints can give different limits. In Ricci flow singularity analysis, the injectivity-radius input is often verified through Perelman's noncollapsing theorem rather than direct calculation.
[example: Why the Basepoint Matters]
Let $M_j=S^n\times[-j,j]$ carry the product metric
\begin{align*}
g_j &= g_{S^n}+dz^2,
\end{align*}
where $g_{S^n}$ has sectional curvature $1$. Choose the middle basepoint $x_j=(p,0)$. For any fixed $R<\infty$ and all $j>R$, the product ball of radius $R$ around $(p,0)$ lies inside $S^n\times(-j,j)$, and the inclusion
\begin{align*}
S^n\times[-R,R] &\longrightarrow S^n\times[-j,j]
\end{align*}
pulls back $g_j$ to exactly $g_{S^n}+dz^2$. Hence on every fixed compact cylinder there is no metric error to take to zero, and the pointed limit is
\begin{align*}
(S^n\times\mathbb R,\; g_{S^n}+dz^2,\; (p,0)).
\end{align*}
Now modify the same long cylindrical pieces by attaching a smooth cap at the right end. If the basepoint is instead chosen at the cap, say $y_j$ is the cap tip, then for each fixed $R$ the ball $B_{g_j}(y_j,R)$ lies in the cap together with only a bounded length of the adjacent cylinder once $j$ is large. The distant left part of the cylinder is at distance approximately $2j$ from $y_j$, so it is outside every fixed pointed ball:
\begin{align*}
d_{g_j}(y_j,S^n\times\{-j\}) &\ge 2j-C
\end{align*}
for a constant $C$ depending only on the cap geometry. Since $2j-C\to\infty$, that far cylindrical region disappears from the pointed limit based at $y_j$. Thus the middle basepoints see the complete round cylinder, while cap basepoints see the cap geometry; pointed convergence records geometry near the chosen basepoint, not the whole ambient manifold at once.
[/example]
The example shows why the basepoint is part of the data and why compactness must be formulated locally around that point. We now combine point selection, curvature normalization, parabolic time rescaling, and Hamilton compactness into the standard blow-up construction. This is the mechanism that turns finite-time curvature blow-up into an ancient limiting flow.
[remark: Pointed Blow-Up Construction]
The standard pointed blow-up construction combines point selection, curvature normalization, parabolic rescaling, and Hamilton compactness. Given a compact Ricci flow with curvature blow-up at $T$, choose $(x_j,t_j)$ with $t_j\uparrow T$ and $\lambda_j=|\operatorname{Rm}(x_j,t_j)|\to\infty$, then set
\begin{align*}
g_j(s)=\lambda_j g\left(t_j+\frac{s}{\lambda_j}\right).
\end{align*}
If the rescaled flows have uniform curvature bounds on compact pointed parabolic neighbourhoods and a noncollapsing basepoint injectivity-radius lower bound, a subsequence converges on compact backward time intervals to a complete pointed ancient Ricci flow normalized by $|\operatorname{Rm}|(x_\infty,0)=1$.
[/remark]
The assumptions isolate the two compactness inputs that are not automatic from curvature blow-up alone. Uniform curvature bounds on pointed parabolic neighbourhoods are needed because normalizing the curvature at one point gives no control a fixed distance away; without such bounds, Hamilton compactness cannot be applied. Noncollapsing is needed because bounded curvature sequences can collapse, as in shrinking circle factors of flat tori, and then no smooth limit of the same dimension exists. The conclusion also depends on the chosen basepoints and scales: the normalization gives a nonflat model at the basepoint, but the model can be steady, shrinking, or more complicated depending on the singularity type and the point-picking procedure.
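The construction can be traced explicitly on the shrinking round sphere, where every quantity has a closed form. The sketch below is illustrative only: the helper names `blow_up_data` and `rescaled_radius_sq` and the choice $t_j=T(1-1/j)$ are assumptions made here, and the sphere formulas $r(t)^2=1-2(n-1)t$ and $|\operatorname{Rm}|=\sqrt{2n(n-1)}/r(t)^2$ are taken from the Round Sphere Extinction example.

```python
import math

def blow_up_data(n, t_j):
    """Scale and rescaled time interval for the shrinking sphere at time t_j."""
    T = 1.0 / (2 * (n - 1))
    # lambda_j = |Rm| at time t_j, from the closed form for the round sphere
    lam = math.sqrt(2 * n * (n - 1)) / (2 * (n - 1) * (T - t_j))
    s_min = -lam * t_j           # image of the initial time t = 0
    s_max = lam * (T - t_j)      # image of the singular time t = T
    return lam, s_min, s_max

def rescaled_radius_sq(n, t_j, s):
    """Squared radius of g_j(s) = lam_j * g(t_j + s / lam_j)."""
    lam, s_min, s_max = blow_up_data(n, t_j)
    assert s_min <= s < s_max, "s outside the rescaled time interval"
    t = t_j + s / lam
    return lam * (1.0 - 2 * (n - 1) * t)

n = 3
T = 1.0 / (2 * (n - 1))
for j in (10, 100, 1000):
    t_j = T * (1 - 1.0 / j)
    lam, s_min, s_max = blow_up_data(n, t_j)
    print(j, round(lam, 3), round(s_min, 3), round(s_max, 6),
          round(rescaled_radius_sq(n, t_j, -1.0), 6))
```

In this model the rescaled metrics at a fixed rescaled time agree for every $j$, and the left endpoint of the rescaled interval tends to $-\infty$, so the limit is an ancient shrinking sphere. The compactness hypotheses hold trivially here; in general they are exactly what has to be verified.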
[example: Cigar Soliton as a Type II Model]
Hamilton's cigar soliton on $\mathbb R^2$ is the conformal metric
\begin{align*}
g=\frac{dx_1^2+dx_2^2}{1+x_1^2+x_2^2}.
\end{align*}
Write $r^2=x_1^2+x_2^2$ and $g=e^{2u}(dx_1^2+dx_2^2)$, so $e^{2u}=(1+r^2)^{-1}$ and
\begin{align*}
u=-\frac{1}{2}\log(1+r^2).
\end{align*}
For a conformal surface metric, the Gaussian curvature is $K=-e^{-2u}\Delta_{\mathbb R^2}u$. For $i=1,2$,
\begin{align*}
\frac{\partial u}{\partial x_i}=-\frac{x_i}{1+r^2}.
\end{align*}
Differentiating once more gives
\begin{align*}
\frac{\partial^2u}{\partial x_i^2}=-\frac{1}{1+r^2}+\frac{2x_i^2}{(1+r^2)^2}.
\end{align*}
Hence
\begin{align*}
\Delta_{\mathbb R^2}u=-\frac{2}{1+r^2}+\frac{2r^2}{(1+r^2)^2}.
\end{align*}
Putting the terms over the common denominator $(1+r^2)^2$ gives
\begin{align*}
\Delta_{\mathbb R^2}u=\frac{-2(1+r^2)+2r^2}{(1+r^2)^2}.
\end{align*}
Therefore
\begin{align*}
\Delta_{\mathbb R^2}u=-\frac{2}{(1+r^2)^2}.
\end{align*}
Since $e^{-2u}=1+r^2$,
\begin{align*}
K=-(1+r^2)\left(-\frac{2}{(1+r^2)^2}\right).
\end{align*}
Thus
\begin{align*}
K=\frac{2}{1+r^2}.
\end{align*}
The curvature is maximal at $r=0$, where $K(0)=2$, and $K(r)\to0$ as $r\to\infty$.
The metric is complete because the length of a radial ray from $0$ to Euclidean radius $R$ is
\begin{align*}
\int_0^R\frac{d\rho}{\sqrt{1+\rho^2}}=\operatorname{arsinh}(R),
\end{align*}
and $\operatorname{arsinh}(R)\to\infty$ as $R\to\infty$. To verify the steady soliton equation, take
\begin{align*}
f=-\log(1+r^2).
\end{align*}
The conformal Christoffel symbols are
\begin{align*}
\Gamma^k_{ij}=\delta_{jk}u_i+\delta_{ik}u_j-\delta_{ij}u_k,
\end{align*}
where
\begin{align*}
u_i=-\frac{x_i}{1+r^2}.
\end{align*}
Also
\begin{align*}
f_i=-\frac{2x_i}{1+r^2}.
\end{align*}
Differentiating gives
\begin{align*}
f_{ij}=-\frac{2\delta_{ij}}{1+r^2}+\frac{4x_ix_j}{(1+r^2)^2}.
\end{align*}
Using the formula for $\Gamma^k_{ij}$,
\begin{align*}
\Gamma^k_{ij}f_k=u_if_j+u_jf_i-\delta_{ij}u_kf_k.
\end{align*}
The first two terms are
\begin{align*}
u_if_j+u_jf_i=\frac{4x_ix_j}{(1+r^2)^2}.
\end{align*}
The contracted term is
\begin{align*}
u_kf_k=\frac{2r^2}{(1+r^2)^2}.
\end{align*}
Therefore
\begin{align*}
\Gamma^k_{ij}f_k=\frac{4x_ix_j}{(1+r^2)^2}-\frac{2r^2\delta_{ij}}{(1+r^2)^2}.
\end{align*}
Now compute the Hessian:
\begin{align*}
(\nabla_g^2f)_{ij}=f_{ij}-\Gamma^k_{ij}f_k.
\end{align*}
Substituting the displayed expressions gives
\begin{align*}
(\nabla_g^2f)_{ij}=-\frac{2\delta_{ij}}{1+r^2}+\frac{2r^2\delta_{ij}}{(1+r^2)^2}.
\end{align*}
Putting the terms over the common denominator $(1+r^2)^2$,
\begin{align*}
(\nabla_g^2f)_{ij}=\frac{-2(1+r^2)\delta_{ij}+2r^2\delta_{ij}}{(1+r^2)^2}.
\end{align*}
Hence
\begin{align*}
(\nabla_g^2f)_{ij}=-\frac{2\delta_{ij}}{(1+r^2)^2}.
\end{align*}
On a surface, $\operatorname{Ric}=Kg$, so
\begin{align*}
\operatorname{Ric}_{ij}=\frac{2}{1+r^2}\cdot\frac{\delta_{ij}}{1+r^2}.
\end{align*}
Thus
\begin{align*}
\operatorname{Ric}_{ij}=\frac{2\delta_{ij}}{(1+r^2)^2}.
\end{align*}
Adding the two tensor components gives
\begin{align*}
\operatorname{Ric}_{ij}+(\nabla_g^2f)_{ij}=0.
\end{align*}
Therefore
\begin{align*}
\operatorname{Ric}+\nabla_g^2f=0,
\end{align*}
so the cigar is a steady gradient Ricci soliton. Its curvature is concentrated near the origin while the metric is complete and noncompact, which is why it appears as a Type II blow-up model rather than as a shrinking Type I model.
[/example]
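The curvature formula $K=2/(1+r^2)$ can be cross-checked numerically from the conformal formula $K=-e^{-2u}\Delta_{\mathbb R^2}u$. The following is a minimal sketch using a five-point finite-difference Laplacian; the function names and the step size `h` are choices made here, not part of the example.

```python
import math

def u(x1, x2):
    """Conformal exponent of the cigar: g = exp(2u) (dx1^2 + dx2^2)."""
    return -0.5 * math.log(1.0 + x1 * x1 + x2 * x2)

def gauss_curvature(x1, x2, h=1e-3):
    """K = -exp(-2u) * Laplacian(u), with a 5-point finite-difference Laplacian."""
    lap = (u(x1 + h, x2) + u(x1 - h, x2) + u(x1, x2 + h) + u(x1, x2 - h)
           - 4.0 * u(x1, x2)) / (h * h)
    return -math.exp(-2.0 * u(x1, x2)) * lap

for (x1, x2) in [(0.0, 0.0), (1.0, 0.0), (0.6, -0.8), (3.0, 4.0)]:
    r2 = x1 * x1 + x2 * x2
    exact = 2.0 / (1.0 + r2)
    print(f"r^2={r2:5.2f}  numerical K={gauss_curvature(x1, x2):.6f}  exact K={exact:.6f}")
```

The numerical and exact values agree up to the finite-difference error, with the maximum at the origin and decay as $r\to\infty$.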
The cigar contrasts with the shrinking sphere and shrinking cylinder: it is steady rather than shrinking, and its natural scale is dictated by the selected high-curvature points rather than by the remaining time to extinction. This distinction is why Type II analysis needs more flexible point-picking arguments.
## Singularities as Geometric Models
The chapter's results reorganize finite-time breakdown into a compactness problem. Curvature blow-up identifies where smooth existence fails; parabolic rescaling chooses the appropriate microscope; pointed convergence extracts a complete limiting flow.
[remark: Three Pieces of Singularity Analysis]
A singularity model always depends on three inputs: a sequence of basepoints, a sequence of times, and a sequence of scales. Type I analysis often takes the scale from $(T-t_j)^{-1}$ or from the curvature at the basepoint, while Type II analysis uses point-picking to find regions where curvature is almost maximal on a suitable parabolic neighbourhood. The limit is not the singularity itself; it is the geometry seen after renormalizing the flow near the singular region.
[/remark]
This viewpoint prepares the next stage of the course. Chapters 8 and 9 develop Perelman's monotonicity formulae and noncollapsing estimates, which supply the compactness hypotheses and force strong structure on the limits, especially in dimension three.
Singularity analysis reveals that high-curvature regions become visible only after rescaling, so the right models are the limits of these rescalings. That leads naturally to solitons and ancient solutions, which capture the self-similar and long-time patterns governing singularities.
# 7. Ricci Solitons and Ancient Solutions
Ricci flow develops singularities by concentrating curvature, so the next question is what a solution looks like after we zoom in near a high-curvature region. The parabolic rescaling from Chapter 6 suggests that the limiting models should be self-similar solutions and ancient solutions, because rescaling pushes the initial time off to the infinite past. This chapter introduces Ricci solitons as exact self-similar solutions, records the identities that make gradient solitons analytically useful, and explains the role of ancient $\kappa$-solutions in the three-dimensional singularity analysis.
## Self-Similar Models for Ricci Flow
The basic problem is to identify Ricci flows whose evolution is generated only by diffeomorphisms and scaling. Such solutions are the fixed points of the Ricci flow dynamics after quotienting by the two symmetries that do not change the underlying local geometry.
[definition: Ricci Soliton]
Let $(M,g)$ be a smooth Riemannian manifold. A Ricci soliton is a triple $(M,g,X)$, where $X \in \mathfrak{X}(M)$, satisfying
\begin{align*}
\operatorname{Ric} + \frac{1}{2}\mathcal{L}_X g = \lambda g
\end{align*}
for some constant $\lambda \in \mathbb{R}$.
[/definition]
The soliton equation says that Ricci curvature is balanced by an infinitesimal diffeomorphism and a constant scaling term. To distinguish the singularity models that occur before, at, and after a preferred time, we need to separate solitons by the sign of this scaling term.
[definition: Shrinking Steady and Expanding Soliton]
A Ricci soliton $(M,g,X)$ with soliton constant $\lambda$ is called shrinking if $\lambda > 0$, steady if $\lambda = 0$, and expanding if $\lambda < 0$.
[/definition]
This sign convention links the static soliton equation to the time direction of the corresponding Ricci flow. The next problem is to find a form of the soliton equation that supports [integration by parts](/theorems/210) and scalar elliptic identities; this motivates restricting to vector fields that are gradients.
[definition: Gradient Ricci Soliton]
Let $(M,g)$ be a smooth Riemannian manifold and let $f: M \to \mathbb{R}$ be a smooth function. A gradient Ricci soliton is a triple $(M,g,f)$ satisfying
\begin{align*}
\operatorname{Ric} + \operatorname{Hess}_{g} f = \lambda g
\end{align*}
for some constant $\lambda \in \mathbb{R}$.
[/definition]
Here $\operatorname{Hess}_{g} f$ denotes the Hessian of $f$ with respect to $g$. In this convention a gradient soliton is the preceding soliton with vector field $X=\nabla f$, since
\begin{align*}
\mathcal{L}_{\nabla f}g=2\operatorname{Hess}_{g} f.
\end{align*}
The next step is to verify that this elliptic equation really generates a Ricci flow solution rather than only a formal stationary condition.
[quotetheorem:6006]
[citeproof:6006]
This theorem explains why solitons appear when a Ricci flow is viewed near a singularity: after rescaling, a soliton evolves without changing shape. The positivity condition $1-2\lambda t>0$ is essential because the metric degenerates when the homothetic scale reaches zero; for a shrinker this corresponds to the finite extinction time. Completeness is also not cosmetic: without completeness, the vector field used to generate $\varphi_t$ need not integrate to diffeomorphisms for the whole required interval. The theorem does not classify solitons or assert that every singularity model is exactly self-similar; it only shows that solutions of the soliton equation produce exact model flows. The simplest examples already contain the local geometries that recur in neck and cap regions.
[example: Gaussian Shrinker]
On $\mathbb{R}^n$ with Euclidean coordinates $(x^1,\ldots,x^n)$ and Euclidean metric $g_{\rm E}$, set
\begin{align*}
f(x)=\frac{|x|^2}{4}=\frac{1}{4}\sum_{i=1}^n (x^i)^2.
\end{align*}
The Euclidean Christoffel symbols vanish in these coordinates, so the Hessian components reduce to ordinary second derivatives. First,
\begin{align*}
\partial_i f=\frac{1}{2}x^i.
\end{align*}
Then
\begin{align*}
\partial_i\partial_j f=\frac{1}{2}\delta_{ij}.
\end{align*}
Therefore
\begin{align*}
(\operatorname{Hess}_{g} f)_{ij}=\partial_i\partial_j f-\Gamma^k_{ij}\partial_k f=\frac{1}{2}\delta_{ij}-0=\frac{1}{2}(g_{\rm E})_{ij}.
\end{align*}
The Euclidean metric has zero curvature, hence $\operatorname{Ric}_{g_{\rm E}}=0$. Componentwise this gives
\begin{align*}
(\operatorname{Ric}_{g_{\rm E}}+\operatorname{Hess}_{g} f)_{ij}=0+\frac{1}{2}(g_{\rm E})_{ij}=\frac{1}{2}(g_{\rm E})_{ij}.
\end{align*}
Thus
\begin{align*}
\operatorname{Ric}_{g_{\rm E}}+\operatorname{Hess}_{g} f=\frac{1}{2}g_{\rm E},
\end{align*}
so $(\mathbb{R}^n,g_{\rm E},f)$ is a gradient shrinking soliton with soliton constant $\lambda=1/2$.
For the associated self-similar flow, $c(t)=1-2\lambda t=1-t$, and
\begin{align*}
\nabla f=\frac{1}{2}\sum_{i=1}^n x^i\frac{\partial}{\partial x^i}.
\end{align*}
The diffeomorphism equation is
\begin{align*}
\frac{d}{dt}\varphi_t(x)=\frac{1}{1-t}\cdot\frac{1}{2}\varphi_t(x).
\end{align*}
Solving this scalar equation in each coordinate gives
\begin{align*}
\varphi_t(x)=(1-t)^{-1/2}x.
\end{align*}
Dilation by $(1-t)^{-1/2}$ pulls back $g_{\rm E}$ to $(1-t)^{-1}g_{\rm E}$, so
\begin{align*}
g(t)=(1-t)\varphi_t^*g_{\rm E}=(1-t)(1-t)^{-1}g_{\rm E}=g_{\rm E}.
\end{align*}
The Gaussian shrinker is therefore the flat baseline shrinker: its self-similar evolution is produced entirely by the cancellation between scaling and pullback, not by curvature change.
[/example]
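The diffeomorphism equation in this example can also be integrated numerically as a sanity check. The sketch below applies a classical fourth-order Runge-Kutta step to one coordinate of $\frac{d}{dt}\varphi_t=\frac{1}{2(1-t)}\varphi_t$; the function name `rk4_dilation` and the step count are choices made here for illustration. Since the equation is linear and decoupled, one coordinate suffices.

```python
import math

def rk4_dilation(x0, t_end, steps=2000):
    """Integrate d(phi)/dt = phi / (2 (1 - t)), phi(0) = x0, on [0, t_end], t_end < 1."""
    def rhs(t, phi):
        return phi / (2.0 * (1.0 - t))
    phi, t, dt = x0, 0.0, t_end / steps
    for _ in range(steps):
        k1 = rhs(t, phi)
        k2 = rhs(t + dt / 2, phi + dt * k1 / 2)
        k3 = rhs(t + dt / 2, phi + dt * k2 / 2)
        k4 = rhs(t + dt, phi + dt * k3)
        phi += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += dt
    return phi

for t_end in (0.25, 0.5, 0.9):
    numeric = rk4_dilation(1.0, t_end)
    exact = (1.0 - t_end) ** -0.5
    # Conformal factor of g(t) = (1 - t) * phi_t^* g_E relative to g_E.
    factor = (1.0 - t_end) * numeric ** 2
    print(f"t={t_end:.2f}  phi numeric={numeric:.8f}  exact={exact:.8f}  g(t)/g_E={factor:.8f}")
```

The last column stays at $1$ up to integration error, matching the conclusion $g(t)=g_{\rm E}$.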
The Euclidean model is flat, so its self-similarity carries no curvature information. Curved compact examples show how positive Ricci curvature drives finite-time collapse.
[example: Round Shrinking Sphere]
Let $S^n$ carry the round metric $g$ normalized by
\begin{align*}
\operatorname{Ric}_g=(n-1)g.
\end{align*}
If $f$ is constant, then in any local coordinates $\partial_i f=0$ and $\partial_i\partial_j f=0$, so
\begin{align*}
(\operatorname{Hess}_{g} f)_{ij}=\partial_i\partial_j f-\Gamma^k_{ij}\partial_k f=0-\Gamma^k_{ij}\cdot 0=0.
\end{align*}
Therefore
\begin{align*}
\operatorname{Ric}_g+\operatorname{Hess}_{g} f=(n-1)g+0=(n-1)g.
\end{align*}
Thus the round sphere is a compact gradient Ricci soliton with soliton constant $\lambda=n-1$, hence a shrinking soliton for $n\ge 2$.
For its homothetic Ricci flow, write $g(t)=a(t)g$ with $a(0)=1$. Since constant scaling leaves the Ricci tensor unchanged as a covariant tensor,
\begin{align*}
\operatorname{Ric}_{g(t)}=\operatorname{Ric}_{a(t)g}=\operatorname{Ric}_g=(n-1)g.
\end{align*}
The Ricci flow equation $\partial_t g(t)=-2\operatorname{Ric}_{g(t)}$ becomes
\begin{align*}
a'(t)g=-2(n-1)g.
\end{align*}
Cancelling the nonzero tensor $g$ gives
\begin{align*}
a'(t)=-2(n-1).
\end{align*}
With $a(0)=1$, integration gives
\begin{align*}
a(t)=1-2(n-1)t.
\end{align*}
Hence
\begin{align*}
g(t)=\bigl(1-2(n-1)t\bigr)g.
\end{align*}
The radius scale is the square root of the metric scale, namely $\sqrt{1-2(n-1)t}$, and extinction occurs when
\begin{align*}
1-2(n-1)t=0.
\end{align*}
Solving for $t$ gives
\begin{align*}
t=\frac{1}{2(n-1)}.
\end{align*}
The round shrinking sphere is therefore the compact model in which positive Ricci curvature drives pure homothetic collapse in finite forward time.
[/example]
Products provide the first noncompact curved shrinkers. They are important because three-dimensional neck regions resemble cylinders after rescaling.
[example: Shrinking Cylinder]
Let $k\ge 2$ and put the product metric
\begin{align*}
g=g_{S^k}\oplus g_{\mathbb R^{n-k}}
\end{align*}
on $S^k\times \mathbb R^{n-k}$, where the round sphere is scaled so that
\begin{align*}
\operatorname{Ric}_{S^k}=\frac{1}{2}g_{S^k}.
\end{align*}
For instance, since the unit round sphere has $\operatorname{Ric}=(k-1)g_{\mathrm{unit}}$, this normalization is obtained by taking radius $r=\sqrt{2(k-1)}$.
Let $y=(y^1,\ldots,y^{n-k})$ be Euclidean coordinates on $\mathbb R^{n-k}$ and set
\begin{align*}
f(y)=\frac{|y|^2}{4}=\frac{1}{4}\sum_{\alpha=1}^{n-k}(y^\alpha)^2.
\end{align*}
The product Ricci tensor splits by the two factors, so
\begin{align*}
\operatorname{Ric}_g=\operatorname{Ric}_{S^k}\oplus \operatorname{Ric}_{\mathbb R^{n-k}}.
\end{align*}
Using the chosen sphere normalization and the flatness of the Euclidean factor,
\begin{align*}
\operatorname{Ric}_g=\frac{1}{2}g_{S^k}\oplus 0.
\end{align*}
Since $f$ depends only on the Euclidean variable $y$, its first derivatives in the $S^k$ directions are zero, and its mixed second derivatives between $S^k$ and $\mathbb R^{n-k}$ are zero. On the Euclidean factor,
\begin{align*}
\partial_\alpha f=\frac{1}{2}y^\alpha.
\end{align*}
Differentiating once more gives
\begin{align*}
\partial_\alpha\partial_\beta f=\frac{1}{2}\delta_{\alpha\beta}.
\end{align*}
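In Euclidean coordinates on the flat factor, the Hessian components are
\begin{align*}
(\operatorname{Hess}_{g} f)_{\alpha\beta}=\partial_\alpha\partial_\beta f-\Gamma^\gamma_{\alpha\beta}\partial_\gamma f.
\end{align*}
The Christoffel symbols $\Gamma^\gamma_{\alpha\beta}$ vanish in these coordinates, so substituting the second derivatives computed above gives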
\begin{align*}
(\operatorname{Hess}_{g} f)_{\alpha\beta}=\frac{1}{2}\delta_{\alpha\beta}-0.
\end{align*}
Since $(g_{\mathbb R^{n-k}})_{\alpha\beta}=\delta_{\alpha\beta}$ in Euclidean coordinates,
\begin{align*}
(\operatorname{Hess}_{g} f)_{\alpha\beta}=\frac{1}{2}(g_{\mathbb R^{n-k}})_{\alpha\beta}.
\end{align*}
Therefore, as a tensor on the product,
\begin{align*}
\operatorname{Hess}_{g} f=0\oplus \frac{1}{2}g_{\mathbb R^{n-k}}.
\end{align*}
Adding the Ricci and Hessian terms factor by factor gives
\begin{align*}
\operatorname{Ric}_g+\operatorname{Hess}_{g} f=\left(\frac{1}{2}g_{S^k}\oplus 0\right)+\left(0\oplus \frac{1}{2}g_{\mathbb R^{n-k}}\right).
\end{align*}
Thus
\begin{align*}
\operatorname{Ric}_g+\operatorname{Hess}_{g} f=\frac{1}{2}g_{S^k}\oplus \frac{1}{2}g_{\mathbb R^{n-k}}.
\end{align*}
Factoring out $\frac{1}{2}$ from the product metric,
\begin{align*}
\operatorname{Ric}_g+\operatorname{Hess}_{g} f=\frac{1}{2}\left(g_{S^k}\oplus g_{\mathbb R^{n-k}}\right)=\frac{1}{2}g.
\end{align*}
So $S^k\times\mathbb R^{n-k}$ is a gradient shrinking soliton with soliton constant $\lambda=1/2$. The positive Ricci curvature comes from the spherical directions, while the quadratic potential supplies exactly the missing Hessian term in the Euclidean directions, making this product the standard shrinking cylinder model.
[/example]
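The associated self-similar flow can be written out in the same way as for the Gaussian shrinker. The following is a sketch that assumes the flow has the form $g(t)=(1-2\lambda t)\varphi_t^*g$ used in that example; the coordinate $\theta\in S^k$ is notation introduced here. With $\lambda=1/2$ and $\nabla f=\frac{1}{2}y^\alpha\partial_\alpha$, the diffeomorphisms fix the sphere factor and dilate the Euclidean factor,
\begin{align*}
\varphi_t(\theta,y)=\left(\theta,(1-t)^{-1/2}y\right),
\end{align*}
so
\begin{align*}
g(t)=(1-t)\varphi_t^*g=(1-t)\left(g_{S^k}\oplus(1-t)^{-1}g_{\mathbb R^{n-k}}\right)=(1-t)g_{S^k}\oplus g_{\mathbb R^{n-k}}.
\end{align*}
As a check, constant scaling leaves the Ricci tensor unchanged, so $\operatorname{Ric}_{g(t)}=\frac{1}{2}g_{S^k}\oplus 0$ and
\begin{align*}
\frac{\partial g}{\partial t}=-g_{S^k}\oplus 0=-2\operatorname{Ric}_{g(t)}.
\end{align*}
Only the spherical factor shrinks, and it becomes extinct at $t=1$. Since $g_{S^k}=2(k-1)g_{\mathrm{unit}}$, the sphere factor is $2(k-1)(1-t)g_{\mathrm{unit}}$, which agrees with the formula $a(t)=2(n-1)(T-t)$ from the Cylindrical Neckpinch example once the sphere dimension there is renamed $k$ and $T=1$.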
## Soliton Identities and Potential Functions
Once a soliton is known to be gradient, the next problem is to extract scalar identities from the tensor equation. These identities are the bridge between the geometric equation and analytic estimates for curvature, volume growth, and weighted integration.
[quotetheorem:6007]
[citeproof:6007]
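As a consistency check, the trace identity in the form $R+\Delta f=n\lambda$ can be tested on two models from these notes. The computation is an illustrative sketch that reuses only formulas already derived for the cigar and the shrinking cylinder. For the cigar, $n=2$, $\lambda=0$, $R=2K$, and $g^{ij}=(1+r^2)\delta_{ij}$, so
\begin{align*}
R+\Delta f &= \frac{4}{1+r^2}+(1+r^2)\sum_{i=1}^{2}\left(-\frac{2}{(1+r^2)^2}\right) \\
&= \frac{4}{1+r^2}-\frac{4}{1+r^2}=0=n\lambda.
\end{align*}
For the shrinking cylinder $S^k\times\mathbb R^{n-k}$ with $\lambda=1/2$, taking traces of $\operatorname{Ric}_g=\frac{1}{2}g_{S^k}\oplus 0$ and $\operatorname{Hess}_{g} f=0\oplus\frac{1}{2}g_{\mathbb R^{n-k}}$ gives
\begin{align*}
R+\Delta f=\frac{k}{2}+\frac{n-k}{2}=\frac{n}{2}=n\lambda.
\end{align*}
In the first case curvature and potential cancel pointwise; in the second they contribute in complementary directions.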
The trace identity controls the average convexity of the potential through scalar curvature. The dimension enters only through $\operatorname{tr}_g g=n$, and the identity is obtained pointwise by tracing the soliton equation, so it is local and does not require completeness. The soliton equation itself is essential: on Euclidean $\mathbb{R}^n$ with $\lambda=0$ and $f(x)=x_1^2$, one has $R=0$ and $\Delta f=2$, so $R+\Delta f\ne n\lambda$; the failure occurs because $\operatorname{Ric}+\operatorname{Hess}_{g} f=\operatorname{Hess}_{g} f$ is not equal to $\lambda g=0$. The identity also does not determine $R$ or $\Delta f$ separately: for example, the Gaussian shrinker has $R=0$ and $\Delta f=n/2$, while the round sphere normalized to $\lambda=1/2$ has $R=n/2$ and constant $f$. To obtain pointwise control of the interaction between curvature and the gradient of $f$, we need an identity that uses the Bianchi identity rather than only the trace.
[quotetheorem:6008]
[citeproof:6008]
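The conserved quantity can be evaluated on the explicit models from these notes. The sketch below assumes the identity is normalized as $R+|\nabla f|^2-2\lambda f=\text{const}$, which reduces to $R+|\nabla f|^2-f$ for $\lambda=1/2$ and to $R+|\nabla f|^2$ for $\lambda=0$; the function names are hypothetical, and each model returns closed-form values taken from the corresponding example.

```python
import math

def gaussian(r2):
    """(R, |grad f|^2, f, lambda) for the Gaussian shrinker at |x|^2 = r2."""
    return 0.0, r2 / 4.0, r2 / 4.0, 0.5

def cylinder(y2, k=2):
    """Same data for S^k x R^{n-k} with Ric_{S^k} = g_{S^k} / 2, at |y|^2 = y2."""
    return k / 2.0, y2 / 4.0, y2 / 4.0, 0.5

def cigar(r2):
    """Same data for the cigar with f = -log(1 + r^2), at Euclidean r^2 = r2."""
    R = 4.0 / (1.0 + r2)                                 # R = 2K on a surface
    grad_sq = (1.0 + r2) * 4.0 * r2 / (1.0 + r2) ** 2    # g^{ij} f_i f_j
    return R, grad_sq, -math.log(1.0 + r2), 0.0

def conserved(model, s):
    """Evaluate R + |grad f|^2 - 2 * lambda * f for one model at parameter s."""
    R, grad_sq, f, lam = model(s)
    return R + grad_sq - 2.0 * lam * f

for s in (0.0, 1.0, 7.5):
    print(s, conserved(gaussian, s), conserved(cylinder, s), conserved(cigar, s))
```

Each column is independent of the sample point. Under the stated normalization, the Gaussian potential $|x|^2/4$ already gives the value $0$, while on the cylinder the value is $k/2$, so replacing $f$ by $f+k/2$ achieves the value $0$ there.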
This conserved quantity is often used to normalize $f$ by adding a constant. Each main hypothesis has a visible role. If the soliton equation is removed, Euclidean $\mathbb{R}^n$ with $f(x)=\sin x_1$ and $\lambda=0$ gives $R+|\nabla f|^2=\cos^2 x_1$, which is not constant. The gradient hypothesis cannot be replaced by an arbitrary soliton vector field: on a flat torus, any nonzero parallel vector field $X$ gives a steady Ricci soliton because $\operatorname{Ric}=0$ and $\mathcal L_X g=0$, but the dual one-form of $X$ need not be exact, so there is no global potential $f$ with $\nabla f=X$. Connectedness is also necessary for a single global constant: take the disjoint union of two copies of the Gaussian shrinker and add different constants to the two potentials; the soliton equation is unchanged on each component, while $R+|\nabla f|^2-f$ takes two different constant values. The identity does not imply $R$ is constant, and the Bryant soliton is an important counterexample to that stronger conclusion. For shrinkers with $\lambda=1/2$, the normalization is commonly chosen so that
\begin{align*}
R+|\nabla f|^2-f=0,
\end{align*}
which prepares the weighted analytic formalism used next.
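The two scalar identities can be sanity-checked numerically on the Gaussian shrinker. The following is a minimal sketch, assuming Euclidean $\mathbb{R}^3$ with $R=0$, $\lambda=1/2$, and $f(x)=|x|^2/4$; the helper names `grad_sq`, `laplacian`, and `scalar_identities` are illustrative, and the derivatives are central finite differences rather than exact formulas.

```python
LAM = 0.5  # soliton constant of the Gaussian shrinker (assumed normalization)

def f(x):
    """Gaussian shrinker potential f(x) = |x|^2 / 4 on Euclidean R^n."""
    return sum(c * c for c in x) / 4.0

def shifted(x, i, h):
    """Copy of x with the i-th coordinate moved by h."""
    y = list(x)
    y[i] += h
    return y

def grad_sq(func, x, h=1e-3):
    """Central-difference approximation of |grad func|^2 at x."""
    return sum(((func(shifted(x, i, h)) - func(shifted(x, i, -h))) / (2 * h)) ** 2
               for i in range(len(x)))

def laplacian(func, x, h=1e-3):
    """Central-difference approximation of the Euclidean Laplacian at x."""
    return sum((func(shifted(x, i, h)) - 2 * func(x) + func(shifted(x, i, -h))) / (h * h)
               for i in range(len(x)))

def scalar_identities(x):
    """Return (R + Lap f - n*lambda, R + |grad f|^2 - f) with R = 0 on flat space."""
    R = 0.0
    trace_defect = R + laplacian(f, x) - len(x) * LAM
    conserved = R + grad_sq(f, x) - f(x)
    return trace_defect, conserved

# Both numbers should vanish up to finite-difference and rounding error.
print(scalar_identities([0.3, -1.2, 2.0]))
```

Replacing `f` by a function that does not satisfy the soliton equation, such as $\sin x_1$, makes the second quantity depend on the point, in line with the counterexample above.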
[remark: Weighted Measure on a Shrinker]
A gradient shrinking soliton carries the natural weighted measure $e^{-f}\,d\operatorname{vol}_g$. The potential $f$ acts like a confining function in noncompact examples. For $u \in C^\infty(M)$, the drift Laplacian is the operator $\Delta_f:C^\infty(M)\to C^\infty(M)$ defined by
\begin{align*}
\Delta_f u=\Delta u-(\nabla f,\nabla u)_g.
\end{align*}
Many elliptic estimates on the soliton are written using this operator, or using its closure on the weighted space $L^2(M,e^{-f}\,d\operatorname{vol}_g)$ when functional-analytic domains are needed.
[/remark]
The scalar identities are not only formal consequences; they predict the shape of model solutions. The Bryant soliton illustrates the steady case, where the potential replaces scaling as the source of self-similarity.
[example: Bryant Soliton]
In each dimension $n\ge 3$ there is a complete rotationally symmetric steady gradient Ricci soliton on $\mathbb{R}^n$, called the Bryant soliton. In the steady case the soliton constant is $\lambda=0$, so the gradient soliton equation is
\begin{align*}
\operatorname{Ric}+\operatorname{Hess}_{g} f=0.
\end{align*}
Equivalently,
\begin{align*}
\operatorname{Hess}_{g} f=-\operatorname{Ric}.
\end{align*}
Because the Bryant soliton has positive sectional curvature, its Ricci tensor is positive definite. Thus for every nonzero tangent vector $v$,
\begin{align*}
\operatorname{Hess}_{g} f(v,v)=-\operatorname{Ric}(v,v)<0,
\end{align*}
so the potential is strictly concave along geodesic directions.
Taking the trace of the soliton equation with respect to $g$ gives
\begin{align*}
\operatorname{tr}_g(\operatorname{Ric})+\operatorname{tr}_g(\operatorname{Hess}_{g} f)=\operatorname{tr}_g(0).
\end{align*}
Since $\operatorname{tr}_g(\operatorname{Ric})=R$, $\operatorname{tr}_g(\operatorname{Hess}_{g} f)=\Delta f$, and $\operatorname{tr}_g(0)=0$, this becomes
\begin{align*}
R+\Delta f=0.
\end{align*}
Thus the positive scalar curvature is exactly balanced by a negative Laplacian of the potential:
\begin{align*}
\Delta f=-R<0.
\end{align*}
The Bryant soliton is rotationally symmetric, has one cap-like tip, and, after the rescalings used in singularity analysis, is asymptotic at infinity to a shrinking cylinder. It therefore supplies the standard steady cap model: positive curvature concentrates near the tip, while far from the tip the geometry resembles the cylindrical neck geometry that appears in three-dimensional blow-up limits.
[/example]
## Ancient Solutions and Two-Dimensional Models
Blow-up limits at finite-time singularities are defined on time intervals extending backwards indefinitely, so the next question is how restrictive the ancient condition is. In low dimensions, curvature positivity and noncollapsing force a small list of possible models.
[definition: Ancient Ricci Flow]
An ancient Ricci flow is a smooth family $(M,g(t))_{t\in(-\infty,T)}$ satisfying
\begin{align*}
\partial_t g(t)=-2\operatorname{Ric}_{g(t)}
\end{align*}
for some $T\in\mathbb{R}\cup\{\infty\}$.
[/definition]
Ancientness alone is weak: static flat manifolds and solitons both qualify after choosing a suitable time interval. The next classification result becomes meaningful only after adding the curvature and boundedness hypotheses that arise in the surface singularity theory.
[quotetheorem:6009]
[citeproof:6009]
This is a soliton rigidity statement, not a classification of all ancient surface flows. The soliton or Harnack-equality hypothesis is essential: the compact ancient oval, also called the Rosenau solution in this context, has bounded positive curvature but is not a gradient shrinking soliton, so it is outside the theorem rather than a counterexample. Completeness is also necessary, since restricting the cigar metric to a proper open disc produces an incomplete steady soliton patch with positive curvature that is not the complete cigar. Boundedness is the compactness input for the steady noncompact alternative, while positivity separates the statement from flat ancient quotients such as a flat torus or flat cylinder. The compact and noncompact alternatives cannot be merged: the round sphere is compact and shrinking, while Hamilton's cigar is noncompact and steady with collapsed large-scale geometry. The cigar's collapsed geometry shows why Perelman's noncollapsing assumptions are needed in higher-dimensional singularity analysis. This prepares the transition from low-dimensional soliton rigidity to the broader but still controlled class of three-dimensional $\kappa$-solutions.
[example: Hamilton Cigar]
On $\mathbb{R}^2$, write $r^2=x_1^2+x_2^2$ and
\begin{align*}
g=\frac{dx_1^2+dx_2^2}{1+r^2}.
\end{align*}
This is conformal to the Euclidean metric, so
\begin{align*}
g=e^{2u}(dx_1^2+dx_2^2),\qquad u=-\frac{1}{2}\log(1+r^2).
\end{align*}
For a conformal metric in two dimensions, the Gaussian curvature is
\begin{align*}
K=-e^{-2u}\Delta_{\mathbb R^2}u.
\end{align*}
Since $u$ is radial,
\begin{align*}
u_r=-\frac{r}{1+r^2}.
\end{align*}
Multiplying by $r$ gives
\begin{align*}
r u_r=-\frac{r^2}{1+r^2}.
\end{align*}
Differentiating this quotient,
\begin{align*}
\frac{d}{dr}(r u_r)=-\frac{2r(1+r^2)-2r^3}{(1+r^2)^2}.
\end{align*}
The numerator reduces to $2r$, so
\begin{align*}
\frac{d}{dr}(r u_r)=-\frac{2r}{(1+r^2)^2}.
\end{align*}
Therefore
\begin{align*}
\Delta_{\mathbb R^2}u=\frac{1}{r}\frac{d}{dr}(r u_r)=-\frac{2}{(1+r^2)^2}.
\end{align*}
Also $e^{-2u}=1+r^2$, and hence
\begin{align*}
K=-(1+r^2)\left(-\frac{2}{(1+r^2)^2}\right)=\frac{2}{1+r^2}>0.
\end{align*}
The metric is complete because the radial distance from $0$ to the circle of Euclidean radius $R$ (in this example $R$ denotes a radius, not the scalar curvature) is
\begin{align*}
\int_0^R \frac{dr}{\sqrt{1+r^2}}.
\end{align*}
For $r\ge 0$ one has $\sqrt{1+r^2}\le 1+r$, so
\begin{align*}
\int_0^R \frac{dr}{\sqrt{1+r^2}}\ge \int_0^R \frac{dr}{1+r}.
\end{align*}
The right-hand integral is
\begin{align*}
\int_0^R \frac{dr}{1+r}=\log(1+R),
\end{align*}
which tends to $\infty$ as $R\to\infty$.
To verify the steady soliton equation, set
\begin{align*}
f=-\log(1+r^2).
\end{align*}
For $g=e^{2u}\delta$, the Christoffel symbols are
\begin{align*}
\Gamma^k_{ij}=\delta_i^k\partial_j u+\delta_j^k\partial_i u-\delta_{ij}\partial^k u.
\end{align*}
Here
\begin{align*}
\partial_i u=-\frac{x_i}{1+r^2}.
\end{align*}
Also
\begin{align*}
\partial_i f=-\frac{2x_i}{1+r^2}.
\end{align*}
Differentiating once more,
\begin{align*}
\partial_i\partial_j f=-\frac{2\delta_{ij}}{1+r^2}+\frac{4x_i x_j}{(1+r^2)^2}.
\end{align*}
The Christoffel term is
\begin{align*}
\Gamma^k_{ij}\partial_k f=\partial_j u\,\partial_i f+\partial_i u\,\partial_j f-\delta_{ij}\partial^k u\,\partial_k f.
\end{align*}
Substituting the displayed first derivatives gives
\begin{align*}
\Gamma^k_{ij}\partial_k f=\frac{2x_i x_j}{(1+r^2)^2}+\frac{2x_i x_j}{(1+r^2)^2}-\delta_{ij}\frac{2r^2}{(1+r^2)^2}.
\end{align*}
Combining the first two terms,
\begin{align*}
\Gamma^k_{ij}\partial_k f=\frac{4x_i x_j}{(1+r^2)^2}-\frac{2r^2\delta_{ij}}{(1+r^2)^2}.
\end{align*}
Thus
\begin{align*}
(\operatorname{Hess}_{g} f)_{ij}=\partial_i\partial_j f-\Gamma^k_{ij}\partial_k f.
\end{align*}
Substituting the two formulas,
\begin{align*}
(\operatorname{Hess}_{g} f)_{ij}=-\frac{2\delta_{ij}}{1+r^2}+\frac{4x_i x_j}{(1+r^2)^2}-\frac{4x_i x_j}{(1+r^2)^2}+\frac{2r^2\delta_{ij}}{(1+r^2)^2}.
\end{align*}
The $x_i x_j$ terms cancel. Rewriting the remaining first term over the common denominator,
\begin{align*}
(\operatorname{Hess}_{g} f)_{ij}=-\frac{2\delta_{ij}(1+r^2)}{(1+r^2)^2}+\frac{2r^2\delta_{ij}}{(1+r^2)^2}.
\end{align*}
The numerator is $-2\delta_{ij}-2r^2\delta_{ij}+2r^2\delta_{ij}$, so
\begin{align*}
(\operatorname{Hess}_{g} f)_{ij}=-\frac{2\delta_{ij}}{(1+r^2)^2}.
\end{align*}
In dimension two, $\operatorname{Ric}=K g$. Since $g_{ij}=\delta_{ij}/(1+r^2)$,
\begin{align*}
(\operatorname{Ric})_{ij}=\frac{2}{1+r^2}\cdot \frac{\delta_{ij}}{1+r^2}=\frac{2\delta_{ij}}{(1+r^2)^2}.
\end{align*}
Adding the Ricci and Hessian components,
\begin{align*}
(\operatorname{Ric}+\operatorname{Hess}_{g} f)_{ij}=\frac{2\delta_{ij}}{(1+r^2)^2}-\frac{2\delta_{ij}}{(1+r^2)^2}=0.
\end{align*}
Therefore
\begin{align*}
\operatorname{Ric}+\operatorname{Hess}_{g} f=0.
\end{align*}
Thus the cigar is a steady gradient Ricci soliton, so its Ricci flow evolution is produced by diffeomorphisms rather than by homothetic shrinking or expansion.
Finally, the length of the coordinate circle $r=R$ is computed from $g_{\theta\theta}=R^2/(1+R^2)$:
\begin{align*}
\int_0^{2\pi}\sqrt{g_{\theta\theta}}\,d\theta=\int_0^{2\pi}\frac{R}{\sqrt{1+R^2}}\,d\theta.
\end{align*}
The integrand is constant in $\theta$, so
\begin{align*}
\int_0^{2\pi}\frac{R}{\sqrt{1+R^2}}\,d\theta=\frac{2\pi R}{\sqrt{1+R^2}}.
\end{align*}
Since $R/\sqrt{1+R^2}\le 1$,
\begin{align*}
\frac{2\pi R}{\sqrt{1+R^2}}\le 2\pi.
\end{align*}
So the cigar has bounded circumference growth at infinity. This bounded cross-sectional size is the geometric reason it is collapsed at large scales, and why taking a naive product with a line does not produce a three-dimensional $\kappa$-noncollapsed model.
[/example]
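The closed formulas in the cigar example can be checked numerically. The sketch below assumes only the conformal description $g=e^{2u}(dx_1^2+dx_2^2)$ with $u=-\tfrac12\log(1+r^2)$; the function names are illustrative, the curvature uses a five-point finite-difference Laplacian, and the radial distance is compared with $\operatorname{arcsinh}$, the exact antiderivative of $1/\sqrt{1+r^2}$.

```python
import math

def u(x, y):
    """Conformal exponent of the cigar: g = exp(2u) * (dx^2 + dy^2)."""
    return -0.5 * math.log(1.0 + x * x + y * y)

def gauss_curvature(x, y, h=1e-3):
    """K = -exp(-2u) * (flat Laplacian of u), using a five-point stencil."""
    lap = (u(x + h, y) + u(x - h, y) + u(x, y + h) + u(x, y - h)
           - 4.0 * u(x, y)) / (h * h)
    return -math.exp(-2.0 * u(x, y)) * lap

def radial_distance(radius, steps=20000):
    """Midpoint-rule value of the integral of dr / sqrt(1 + r^2) over [0, radius]."""
    dr = radius / steps
    return sum(dr / math.sqrt(1.0 + ((k + 0.5) * dr) ** 2) for k in range(steps))

def circumference(radius):
    """Length of the coordinate circle of Euclidean radius `radius`."""
    return 2.0 * math.pi * radius / math.sqrt(1.0 + radius * radius)

# Curvature: numerical value next to the closed formula 2 / (1 + r^2).
for x, y in [(0.0, 0.0), (0.7, -0.4), (3.0, 2.0)]:
    print(gauss_curvature(x, y), 2.0 / (1.0 + x * x + y * y))

# Distance grows without bound while the circumference stays below 2*pi.
for radius in [1.0, 10.0, 100.0]:
    print(radius, radial_distance(radius), math.asinh(radius), circumference(radius))
```

The second loop shows the two competing behaviours side by side: the distance to the origin grows logarithmically, while the circle length saturates.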
## Three-Dimensional $\kappa$-Solutions
For three-dimensional singularities, the central problem is to understand all possible noncollapsed ancient blow-up limits with nonnegative curvature. Perelman's compactness and noncollapsing results turn this into the study of $\kappa$-solutions.
[definition: Three-Dimensional Kappa Solution]
A three-dimensional $\kappa$-solution is an ancient Ricci flow $(M^3,g(t))_{t\in(-\infty,T]}$ such that each time slice is complete, the curvature operator is nonnegative and bounded, the scalar curvature is positive, and the solution is $\kappa$-noncollapsed on all scales for some $\kappa>0$.
[/definition]
This definition packages the curvature, completeness, and ancientness properties that survive blow-up. To make the volume hypothesis precise, we need the scale-invariant noncollapsing condition that rules out regions with small volume but controlled curvature.
[definition: Kappa Noncollapsed on All Scales]
Let $(M^n,g(t))_{t\in I}$ be a Ricci flow and let $\kappa>0$. For $x\in M$, $t\in I$, and $r>0$, define the time-$t$ geodesic ball
\begin{align*}
B_{g(t)}(x,r)=\{y\in M:d_{g(t)}(x,y)<r\}
\end{align*}
and, when $[t-r^2,t]\subset I$, the parabolic neighbourhood
\begin{align*}
P(x,t,r)=\{(y,s)\in M\times I:s\in[t-r^2,t],\ y\in B_{g(t)}(x,r)\}.
\end{align*}
For each $t\in I$, let
\begin{align*}
\operatorname{Vol}_{g(t)}:\mathcal B(M)\to[0,\infty]
\end{align*}
be the Riemannian volume measure on Borel subsets of $M$. The flow is $\kappa$-noncollapsed on all scales if, whenever $P(x,t,r)$ is defined and $|\operatorname{Rm}|(y,s)\le r^{-2}$ for all $(y,s)\in P(x,t,r)$, the time-$t$ volume satisfies
\begin{align*}
\operatorname{Vol}_{g(t)}(B_{g(t)}(x,r))\ge \kappa r^n.
\end{align*}
[/definition]
This definition is designed to survive parabolic rescaling, so it is well matched to singularity formation. Curvature bounds alone are insufficient because a region can have $|\operatorname{Rm}|\le r^{-2}$ while looking like a thin quotient or a product with a very small circle, giving volume much smaller than $r^n$. The cigar illustrates the same danger in two dimensions: its curvature is controlled, but its large-scale circumference growth is too small for the kind of noncollapsing needed in Perelman's compactness theory. The resulting models have a constrained global shape.
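The thin-product failure mode can be made quantitative in the simplest flat case. The sketch below assumes the static flat flow on $\mathbb{R}^2\times S^1_\varepsilon$, where $|\operatorname{Rm}|=0\le r^{-2}$ at every scale, and uses two elementary volume bounds: the ball lies inside a disc of radius $r$ times the whole circle, and it is the image of a Euclidean ball under the covering map. The function name is illustrative.

```python
import math

def volume_ratio_upper_bound(eps, r):
    """Upper bound for Vol(B(x, r)) / r^3 in flat R^2 x S^1 with circle length eps.

    The ball sits inside (disc of radius r) x (whole circle), so its volume is
    at most pi * r^2 * eps; it is also at most the Euclidean ball volume.
    """
    product_bound = math.pi * r * r * eps
    euclidean_bound = 4.0 / 3.0 * math.pi * r ** 3
    return min(product_bound, euclidean_bound) / r ** 3

# The curvature condition holds at every scale, yet the ratio decays like pi*eps/r.
for r in [1e-4, 1.0, 100.0, 1e4]:
    print(r, volume_ratio_upper_bound(1e-3, r))
```

Since the bound tends to zero as $r\to\infty$, no fixed $\kappa>0$ works on all scales for this flow, even though its curvature vanishes identically.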
The next statement records the canonical-neighbourhood alternative for three-dimensional $\kappa$-solutions. It is the precise local form needed later for surgery: high-curvature regions are forced to look like necks, caps, or compact positive-curvature pieces.
[quotetheorem:6010]
[citeproof:6010]
This structure theorem is the geometric input behind surgery. Its hypotheses rule out concrete failure modes. If completeness is dropped, a proper open subset of the shrinking round cylinder can satisfy the same local curvature equations but has artificial boundary behaviour and no global neck model. If bounded curvature on time slices is dropped, pointed compactness can fail before any cylindrical or cap limit is extracted; the assumption is what prevents a sequence of basepoints from seeing unbounded curvature on every fixed parabolic ball after normalization. If nonnegative curvature is dropped, the Hamilton-Ivey pinching and strong maximum principle no longer force cylindrical nonnegative-curvature limits, so a sign-changing ancient blow-up candidate would fall outside the neck-and-cap alternative. If positive scalar curvature is dropped, flat quotients such as a flat three-torus are ancient and noncollapsed but have no singular high-curvature model. The $\kappa$-noncollapsing hypothesis prevents more subtle collapsed examples: the ancient shrinking flow on $S^2\times S^1_\varepsilon$ has controlled nonnegative curvature but arbitrarily small normalized volume when the circle length $\varepsilon$ is small, and $(\text{cigar})\times\mathbb{R}$ is complete, ancient, and has bounded nonnegative curvature but is collapsed at large scales. The conclusion is local at sufficiently high curvature: singular regions are therefore either necks, where cutting is possible, or caps and compact positive-curvature pieces, where the topology is controlled.
The topological payoff is that the analytic blow-up theorem identifies the pieces on which surgery acts. A neck has cross-section a spherical space form and behaves like a product interval, so cutting it separates connected-sum factors. A cap has bounded topology and positive curvature, so it can be replaced by a standard cap without introducing uncontrolled fundamental-group or prime-decomposition behaviour. This is the bridge from curvature concentration to the three-dimensional topological decomposition proved later in the course.
[illustration:canonical-neighborhood-models]
[example: Neck and Cap Picture]
Consider a three-dimensional Ricci flow $(M^3,g(t))$ forming a neckpinch at time $T<\infty$, and choose spacetime points $(x_i,t_i)$ with $t_i\uparrow T$ and
\begin{align*}
Q_i=R(x_i,t_i)\to\infty.
\end{align*}
Define the parabolically rescaled flows by
\begin{align*}
g_i(s)=Q_i\,g\left(t_i+\frac{s}{Q_i}\right).
\end{align*}
At the basepoint and rescaled time $s=0$, scalar curvature rescales by the inverse metric factor, so
\begin{align*}
R_{g_i}(x_i,0)=Q_i^{-1}R_g(x_i,t_i)=Q_i^{-1}Q_i=1.
\end{align*}
For any fixed $A>0$, the time $s=-A$ corresponds to original time
\begin{align*}
t_i-\frac{A}{Q_i}.
\end{align*}
Since $t_i\to T>0$ and $Q_i\to\infty$, one has $A/Q_i\to 0$, so $t_i-A/Q_i$ lies in the original time interval for all sufficiently large $i$. Thus the rescaled flows are defined farther and farther backward in the $s$-variable, which is the mechanism by which ancient blow-up limits arise.
Under the curvature and noncollapsing hypotheses used in the three-dimensional singularity analysis, a pointed subsequence
\begin{align*}
(M,g_i(s),x_i)
\end{align*}
converges to a three-dimensional $\kappa$-solution; the convergence rests on the same compactness and noncollapsing input that underlies the canonical-neighbourhood alternative for three-dimensional $\kappa$-solutions. If the points $x_i$ are chosen in the middle of the neck, then after rescaling the spherical cross-sections have controlled size and the axial direction remains visible, so the pointed limit is modeled on the shrinking round cylinder
\begin{align*}
S^2\times\mathbb R.
\end{align*}
If instead the points $x_i$ are chosen near the end of the neck, the rescaled geometry still has a cylindrical region on one side, but the other side closes off in a positively curved cap; the corresponding pointed limit is modeled on a Bryant-type cap.
The same singular time can therefore produce different local models because the blow-up process records the geometry seen from the chosen basepoint: center points of the neck see a cylinder, while end points see a cap attached to a cylindrical neck.
[/example]
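The rescaling arithmetic in this example can be tried on an explicit model. The sketch below assumes the exact shrinking round cylinder $g(t)=(1-2t)g_{S^2}+dz^2$ on $S^2\times\mathbb R$, with scalar curvature $2/(1-2t)$ and singular time $T=1/2$, as a stand-in for the middle of a neck; a genuine neckpinch is only approximately cylindrical, and the function names are illustrative.

```python
def scalar_curvature(t):
    """Scalar curvature of g(t) = (1 - 2t) g_{S^2} + dz^2 for t < 1/2."""
    return 2.0 / (1.0 - 2.0 * t)

def rescaled_curvature(t_i, s):
    """Scalar curvature of g_i(s) = Q_i g(t_i + s / Q_i), where Q_i = R(t_i)."""
    Q = scalar_curvature(t_i)
    return scalar_curvature(t_i + s / Q) / Q

def backward_range(t_i, t_start=0.0):
    """Largest A for which rescaled time s = -A still lands in [t_start, T)."""
    return scalar_curvature(t_i) * (t_i - t_start)

# As t_i increases to T = 1/2 the rescaled curvature at s = 0 stays equal to 1,
# the value at s = -3 stays equal to 1/4, and the available backward range grows.
for t_i in [0.4, 0.49, 0.499, 0.4999]:
    print(t_i, rescaled_curvature(t_i, 0.0), rescaled_curvature(t_i, -3.0),
          backward_range(t_i))
```

In this model the rescaled curvature equals $1/(1-s)$ independently of $i$, which reflects the self-similarity of the cylinder; the only thing that changes with $i$ is how far back in $s$ the rescaled flow is defined.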
## How Solitons and Ancient Solutions Fit Into Singularity Analysis
The chapter closes by connecting the model objects back to the Ricci flow developed earlier in the course. The guiding question is how a finite-time singularity can be studied through limits that exist for all negative times.
[explanation: Blow Up Philosophy]
Let $(M,g(t))$ develop a singularity at time $T<\infty$, and choose spacetime points $(x_i,t_i)$ with $t_i\uparrow T$ and curvature scales satisfying
\begin{align*}
Q_i=R(x_i,t_i)\to\infty.
\end{align*}
The parabolically rescaled flows
\begin{align*}
g_i(s)=Q_i\,g\left(t_i+\frac{s}{Q_i}\right)
\end{align*}
are defined on longer and longer backward time intervals. Compactness theorems, together with curvature estimates and noncollapsing, produce ancient limiting flows. Ricci solitons are the rigid self-similar members of this class, while general $\kappa$-solutions describe the larger family of possible three-dimensional singularity models.
[/explanation]
The practical lesson is that singularities are not arbitrary failures of the PDE. After rescaling, they are organized by solitons, ancient solutions, and noncollapsing geometry, which is why the later surgery theory can replace high-curvature regions by standard pieces while preserving control of the flow.
Solitons and ancient solutions describe the rigid geometric models that emerge after rescaling near singularities. Perelman's entropy formulae then add a variational viewpoint, showing why these models are singled out by monotonicity rather than just by compactness.
# 8. Perelman's Entropy Formulae
Perelman's entropy formulae add a variational layer to Ricci flow. The chapter assumes the Ricci flow equation, scalar and Ricci curvature evolution, integration by parts on closed manifolds, and the basic conjugate heat operator. Chapters 1--3 treated Ricci flow as a geometric parabolic equation and used curvature evolution to control singularity formation; here the main question is how to find quantities that move monotonically along the flow despite the diffeomorphism and scaling symmetries introduced in Chapters 1 and 2. The functionals $\mathcal F$ and $\mathcal W$ combine curvature, a weighted measure, and a backwards heat equation to reveal Ricci flow as a gradient-like system modulo diffeomorphisms.
## Ricci Flow as a Gradient Flow Modulo Diffeomorphism
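Ricci flow is not the literal gradient flow of total scalar curvature on the full space of metrics. The first variation of $\int_M S_g\,d\mu_g$ in a direction $h$ is $\int_M\langle -\operatorname{Ric}_g+\tfrac12 S_g\,g,h\rangle_g\,d\mu_g$, so its gradient contains a trace term coming from the variation of $d\mu_g$ in conformal directions; in addition, the Ricci flow equation is invariant under pullback by diffeomorphisms. The first problem is to choose a measure that is transported with the geometry so that the curvature term has a useful variational derivative.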
[definition: Perelman F Functional]
Let $M$ be a closed smooth manifold, and let $\operatorname{Met}(M)$ denote the space of smooth Riemannian metrics on $M$. Perelman's $\mathcal F$-functional is the map
\begin{align*}
\mathcal F: \operatorname{Met}(M) \times C^\infty(M) \to \mathbb R
\end{align*}
defined by
\begin{align*}
\mathcal F[g,f] = \int_M \left(S_g + |\nabla f|_g^2\right)e^{-f}\,d\mu_g.
\end{align*}
[/definition]
The weight $e^{-f}d\mu_g$ is part of the data, not an afterthought, so the next issue is to decide which changes of $f$ represent a genuine change of weighted geometry rather than a rescaling of mass. Fixing total mass gives a probability measure and removes the constant rescaling direction from the variational problem.
[definition: Normalized Weighted Measure]
For a closed Riemannian manifold $(M,g)$ and $f \in C^\infty(M)$, the weighted measure $e^{-f}d\mu_g$ is normalized if
\begin{align*}
\int_M e^{-f}\,d\mu_g = 1.
\end{align*}
[/definition]
With this normalization, $f$ plays the role of a potential for the measure against which scalar curvature is averaged. For the clean tensorial variation formula, however, one must distinguish fixed total mass from the stronger condition that the weighted density itself is held fixed to first order. The next formula uses this stronger pointwise density constraint; the weaker fixed-mass constraint leads to an additional Lagrange-multiplier term rather than to the displayed tensor alone.
[quotetheorem:6011]
[citeproof:6011]
The formula identifies the direction in which $\mathcal F$ increases and shows that the relevant tensor is not just $\operatorname{Ric}_g$, but the Bakry-Emery type combination $\operatorname{Ric}_g+\operatorname{Hess}_{g} f$. Closedness is used to discard the divergence terms after integration by parts. To see why the stronger constraint matters, fix $g$ and vary $f$ by a nonconstant $v$ with $\int_M v e^{-f}d\mu_g=0$; the total mass is fixed to first order, but the term involving $v(2\Delta f-|\nabla f|_g^2+S_g)$ remains unless the Euler-Lagrange multiplier equation is imposed. Thus fixed total mass alone does not give the tensor formula above.
The same point is separate from boundary effects. Let $\Omega\subset \mathbb R^n$ be a compact domain with smooth boundary $\partial\Omega$, let $\nu$ denote its outward unit normal, and let $\mathcal H^{n-1}$ denote surface measure. Taking $g$ Euclidean and varying $f$ on $\Omega$ produces boundary contributions such as $\int_{\partial\Omega}\partial_\nu f\,e^{-f}\,d\mathcal H^{n-1}$ unless Neumann-type data are imposed, so the displayed closed-manifold formula cannot be read as a boundary formula. Constant rescalings are also outside the constrained weighted geometry: replacing $f$ by $f+c(s)$ with $c'(0)\ne 0$ rescales $e^{-f}d\mu_g$ and changes $\mathcal F$ by a pure mass factor even when the metric is fixed. The theorem does not say that Ricci flow is the literal gradient flow of total scalar curvature on all metrics; it says that, after choosing a weighted measure and imposing the pointwise density constraint, the tensor controlling the variation is $\operatorname{Ric}_g+\operatorname{Hess}_{g} f$. The natural next question is whether ordinary Ricci flow can be put into this direction after using the diffeomorphism freedom of the equation.
[quotetheorem:6012]
[citeproof:6012]
This is the first place in the course where a monotone quantity detects a soliton equation. Closedness again removes boundary terms. For instance, on an evolving compact region with boundary, integration by parts in the weighted Bochner identity leaves flux terms through $\partial M$; without boundary conditions such as vanishing normal flux, those terms can have either sign and the displayed nonnegative formula no longer follows. The backward equation for $f$ is not cosmetic: if one keeps $f$ fixed along a nonstationary Ricci flow, then $e^{-f}d\mu_g$ changes by $-S_ge^{-f}d\mu_g$ and the constrained first variation is no longer the derivative being computed. Monotonicity by itself does not classify the solution, since many flows have strictly increasing $\mathcal F$; the rigidity information appears only when the nonnegative integrand vanishes identically. Vanishing of the integrand means that the flow is moving only by diffeomorphism in the weighted geometry.
[example: Flat Torus and the F Functional]
Let $M=\mathbb R^n/\mathbb Z^n$ with its flat metric $g$, and let $f\equiv c$ be chosen so that the weighted measure is normalized. The normalization condition is
\begin{align*}
1=\int_M e^{-f}\,d\mu_g=\int_M e^{-c}\,d\mu_g=e^{-c}\operatorname{Vol}_g(M).
\end{align*}
Thus $e^{-c}=\operatorname{Vol}_g(M)^{-1}$. Since $g$ is flat, $\operatorname{Ric}_g=0$ and $S_g=0$. Since $f$ is constant, $\nabla f=0$, so $|\nabla f|_g^2=0$ and $\operatorname{Hess}_{g} f=0$.
Therefore
\begin{align*}
\mathcal F[g,f]=\int_M \left(S_g+|\nabla f|_g^2\right)e^{-f}\,d\mu_g.
\end{align*}
Substituting $S_g=0$ and $|\nabla f|_g^2=0$ gives
\begin{align*}
\mathcal F[g,f]=\int_M 0\cdot e^{-c}\,d\mu_g=0.
\end{align*}
The tensor in the $\mathcal F$ monotonicity formula is
\begin{align*}
\operatorname{Ric}_g+\operatorname{Hess}_{g} f=0+0=0.
\end{align*}
Along the stationary flat Ricci flow $g(t)=g$ with $f(t)=f$, the monotonicity identity gives
\begin{align*}
\frac{d}{dt}\mathcal F[g(t),f(t)]=2\int_M |0|_g^2 e^{-f}\,d\mu_g=0.
\end{align*}
Thus the flat torus has zero $\mathcal F$-entropy production, exactly matching the fact that its Ricci flow does not move.
[/example]
## The $\mathcal W$ Entropy and the Conjugate Heat Equation
The $\mathcal F$-functional sees diffeomorphism symmetry, but singularity analysis also requires control under parabolic rescaling. The next problem is to build a functional whose normalization changes correctly when a flow is viewed at a shrinking time scale.
[definition: Backward Time Parameter]
Along a Ricci flow $g(t)$ defined for $t \in I$, a backward time parameter is a smooth map
\begin{align*}
\tau: I \to (0,\infty)
\end{align*}
satisfying
\begin{align*}
\frac{d\tau}{dt} = -1.
\end{align*}
[/definition]
The parameter $\tau$ measures time remaining before a chosen singular time and records how far a parabolic blow-up is from its target scale. Since a heat kernel at scale $\tau$ has mass density of size $(4\pi\tau)^{-n/2}$, the entropy must include the same factor to compare different scales.
[definition: Perelman W Entropy]
Let $M$ be a closed $n$-dimensional smooth manifold, and let $\operatorname{Met}(M)$ denote the space of smooth Riemannian metrics on $M$. Perelman's $\mathcal W$-entropy is the map
\begin{align*}
\mathcal W: \operatorname{Met}(M) \times C^\infty(M) \times (0,\infty) \to \mathbb R
\end{align*}
defined by
\begin{align*}
\mathcal W[g,f,\tau]
= \int_M \left(\tau(S_g+|\nabla f|_g^2)+f-n\right)(4\pi\tau)^{-n/2}e^{-f}\,d\mu_g.
\end{align*}
[/definition]
The formula has the form of an expectation, but that interpretation is valid only after the density has total mass $1$. The next normalization fixes the admissible potentials for a given scale and makes $\mathcal W$ comparable to the Gaussian entropy on Euclidean space.
[definition: Shrinker Normalization]
For a closed $n$-dimensional Riemannian manifold $(M,g)$, a function $f\in C^\infty(M)$ and a number $\tau>0$ satisfy the shrinker normalization if
\begin{align*}
\int_M (4\pi\tau)^{-n/2}e^{-f}\,d\mu_g = 1.
\end{align*}
[/definition]
The normalized density solves only half of the problem: it fixes mass at a single time, but the entropy formula needs a rule for moving that mass through an evolving metric. This creates the need for the backwards adjoint heat equation, whose curvature term compensates for the change of $d\mu_g$ under Ricci flow.
[definition: Conjugate Heat Equation]
Let $(M,g(t))$ be a Ricci flow defined for $t \in I$. A smooth map
\begin{align*}
u: I \times M \to (0,\infty)
\end{align*}
solves the conjugate heat equation if
\begin{align*}
\partial_t u = -\Delta u + S_g u.
\end{align*}
[/definition]
The sign of the scalar curvature term is forced by the evolution of the volume form; if $u$ solves this equation, then $u\,d\mu_g$ has time-independent total mass on a closed manifold. Since $\mathcal W$ is written in terms of $f$ rather than $u$, the next step is to translate the conjugate heat equation into the potential equation.
[quotetheorem:6013]
[citeproof:6013]
This reformulation is what makes $\mathcal W$ usable along a flow. The hypothesis $\tau>0$ is needed both for the Gaussian factor and for the term $n/(2\tau)$; at $\tau=0$, the expression $(4\pi\tau)^{-n/2}$ is undefined and no smooth potential equation of this form remains. Positivity of $u$ is also essential because $f=-\log u-(n/2)\log(4\pi\tau)$ is otherwise not a smooth real-valued potential; a heat solution on a disconnected closed manifold that is identically zero on one component gives a direct failure of the logarithmic parametrization while still solving the linear conjugate heat equation. Even on a connected closed Ricci flow, prescribing terminal data that changes sign produces a signed solution of the linear equation for short backward time, but no real potential $f$ can represent it as $(4\pi\tau)^{-n/2}e^{-f}$. The condition $\partial_t\tau=-1$ is part of the equivalence as well: if $\tau$ is held constant, differentiating the Gaussian factor loses the term $n/(2\tau)$, so the displayed potential equation is no longer equivalent to the conjugate heat equation. The theorem is an algebraic equivalence between two ways of writing the same transported density; it does not by itself prove monotonicity, which comes only after differentiating the full entropy functional. The entropy can now be differentiated while the probability measure is transported by the conjugate heat equation.
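The equivalence can be tested numerically in the simplest flat case. The sketch below works on the flat line ($n=1$, $S_g=0$) with the Gaussian density at scale $\tau=T-t$, and it assumes that the potential equation has the standard form $\partial_t f=-\Delta f+|\nabla f|^2-S_g+\tfrac{n}{2\tau}$. The value $T=1$ and all function names are hypothetical choices for the illustration, and derivatives are central finite differences.

```python
import math

T = 1.0  # hypothetical reference time; the backward parameter is tau = T - t

def u(x, t):
    """Gaussian density (4 pi tau)^(-1/2) exp(-x^2 / (4 tau)) on the flat line."""
    tau = T - t
    return math.exp(-x * x / (4.0 * tau)) / math.sqrt(4.0 * math.pi * tau)

def f(x, t):
    """Potential recovered from u via f = -log u - (n/2) log(4 pi tau), n = 1."""
    tau = T - t
    return -math.log(u(x, t)) - 0.5 * math.log(4.0 * math.pi * tau)

def d_t(func, x, t, h=1e-4):
    return (func(x, t + h) - func(x, t - h)) / (2.0 * h)

def d_x(func, x, t, h=1e-4):
    return (func(x + h, t) - func(x - h, t)) / (2.0 * h)

def d_xx(func, x, t, h=1e-3):
    return (func(x + h, t) - 2.0 * func(x, t) + func(x - h, t)) / (h * h)

def conjugate_heat_residual(x, t):
    """Residual of u_t = -u_xx + S u with S = 0."""
    return d_t(u, x, t) + d_xx(u, x, t)

def potential_residual(x, t, n=1):
    """Residual of f_t = -f_xx + f_x^2 - S + n / (2 tau) with S = 0 (assumed form)."""
    tau = T - t
    return d_t(f, x, t) - (-d_xx(f, x, t) + d_x(f, x, t) ** 2 + n / (2.0 * tau))

# Both residuals should be zero up to discretization error.
print(conjugate_heat_residual(0.8, 0.3), potential_residual(0.8, 0.3))
```

Dropping the term `n / (2.0 * tau)` from `potential_residual` leaves a residual of size $1/(2\tau)$, which is the numerical counterpart of the remark about holding $\tau$ constant.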
[example: Entropy of Euclidean Space]
On $\mathbb R^n$ with the Euclidean metric and $\tau>0$, take
\begin{align*}
f(x)=\frac{|x|^2}{4\tau}.
\end{align*}
This is a noncompact model case rather than an example inside the closed-manifold hypotheses above; the Gaussian decay is what makes the integrals finite and prevents boundary terms at infinity. The associated density is
\begin{align*}
u(x)=(4\pi\tau)^{-n/2}\exp\left(-\frac{|x|^2}{4\tau}\right).
\end{align*}
Since
\begin{align*}
\int_{\mathbb R}e^{-y^2/(4\tau)}\,dy=(4\pi\tau)^{1/2},
\end{align*}
multiplying the one-dimensional integrals gives
\begin{align*}
\int_{\mathbb R^n}u(x)\,d\mathcal L^n(x)=(4\pi\tau)^{-n/2}(4\pi\tau)^{n/2}=1.
\end{align*}
Thus $u\,d\mathcal L^n$ is a probability measure.
Let $X$ have law $u\,d\mathcal L^n$. The density factors into identical one-dimensional Gaussian densities, so the coordinates are independent. Each coordinate has mean $0$ because $y e^{-y^2/(4\tau)}$ is odd. For the second moment, differentiating
\begin{align*}
\int_{\mathbb R}e^{-ay^2}\,dy=\sqrt{\frac{\pi}{a}}
\end{align*}
with respect to $a>0$ gives
\begin{align*}
\int_{\mathbb R}y^2e^{-ay^2}\,dy=\frac{\sqrt{\pi}}{2a^{3/2}}.
\end{align*}
Taking $a=(4\tau)^{-1}$ gives
\begin{align*}
\int_{\mathbb R}y^2e^{-y^2/(4\tau)}\,dy=4\sqrt{\pi}\tau^{3/2}.
\end{align*}
Therefore
\begin{align*}
\mathbb E[X_i^2]=(4\pi\tau)^{-1/2}4\sqrt{\pi}\tau^{3/2}=2\tau.
\end{align*}
Summing over the coordinates,
\begin{align*}
\mathbb E[|X|^2]=\sum_{i=1}^n\mathbb E[X_i^2]=2n\tau.
\end{align*}
For the Euclidean metric, $S_g=0$. Also
\begin{align*}
\nabla f(x)=\frac{x}{2\tau}.
\end{align*}
Hence
\begin{align*}
|\nabla f(x)|^2=\frac{|x|^2}{4\tau^2}.
\end{align*}
The integrand in $\mathcal W$ is therefore
\begin{align*}
\tau(S_g+|\nabla f|^2)+f-n=\tau\frac{|x|^2}{4\tau^2}+\frac{|x|^2}{4\tau}-n=\frac{|x|^2}{2\tau}-n.
\end{align*}
Using the probability density $u$,
\begin{align*}
\mathcal W[g,f,\tau]=\int_{\mathbb R^n}\left(\frac{|x|^2}{2\tau}-n\right)u(x)\,d\mathcal L^n(x).
\end{align*}
This integral is the expectation of the displayed integrand, so
\begin{align*}
\mathcal W[g,f,\tau]=\frac{1}{2\tau}\mathbb E[|X|^2]-n.
\end{align*}
Substituting $\mathbb E[|X|^2]=2n\tau$ gives
\begin{align*}
\mathcal W[g,f,\tau]=\frac{1}{2\tau}(2n\tau)-n=0.
\end{align*}
Thus the Euclidean Gaussian has zero $\mathcal W$-entropy, giving the model normalization against which shrinking entropy is compared.
[/example]
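The three numbers computed in this example (total mass $1$, second moment $2n\tau$, entropy $0$) can be reproduced by direct quadrature. The sketch below assumes $n=1$ and a midpoint rule on a window of half-width $12\sqrt{\tau}$, outside which the Gaussian is negligible in double precision; the function name and the window size are illustrative choices.

```python
import math

def gaussian_checks(tau, half_width=12.0, steps=20000):
    """Midpoint-rule values of (mass, second moment, W) on the flat line, n = 1."""
    a = half_width * math.sqrt(tau)
    dx = 2.0 * a / steps
    mass = second_moment = entropy = 0.0
    for k in range(steps):
        x = -a + (k + 0.5) * dx
        f = x * x / (4.0 * tau)                 # potential |x|^2 / (4 tau)
        grad_f_sq = (x / (2.0 * tau)) ** 2      # |grad f|^2
        u = math.exp(-f) / math.sqrt(4.0 * math.pi * tau)
        mass += u * dx
        second_moment += x * x * u * dx
        # Integrand of W with S_g = 0 and n = 1.
        entropy += (tau * grad_f_sq + f - 1.0) * u * dx
    return mass, second_moment, entropy

# Expected pattern: mass close to 1, second moment close to 2*tau, W close to 0.
for tau in [0.1, 0.5, 2.0]:
    print(tau, gaussian_checks(tau))
```

The fact that the entropy value does not depend on $\tau$ reflects the scale invariance built into the factor $(4\pi\tau)^{-n/2}$.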
## Monotonicity and Equality Cases
The main analytic payoff is a scale-sensitive monotonicity identity. The question is no longer whether a quantity increases, but what equation is forced when the increase stops.
[quotetheorem:6014]
[citeproof:6014]
The closed-manifold assumption is what allows every integration by parts in the differentiation of $\mathcal W$ to close without boundary contributions. On $\mathbb R^n$ the Gaussian example works because $u$ decays fast enough that boundary terms over large spheres vanish; if instead $u$ has slow decay, such as a positive function comparable to $(1+|x|)^{-n}$ after truncation and smoothing, the entropy integrals or the boundary terms can fail to converge, so the compact formula has no automatic noncompact extension. The shrinker normalization and the conjugate heat equation serve different roles: the normalization fixes the density as a probability measure at each time, while the conjugate heat equation transports that density compatibly with the evolving volume form. If the density is multiplied by a constant $a\ne 1$, the derivative formula is multiplied by $a$ but $\mathcal W$ is no longer the normalized entropy whose scale comparison is used in singularity analysis. If $u$ is kept fixed while the metric evolves, then $\partial_t(u\,d\mu_g)=-S_gu\,d\mu_g$ in general, so the cancellations with the conjugate heat equation are absent. The condition $\tau>0$ is indispensable because the square contains the scale term $(2\tau)^{-1}g$ and because the heat-kernel normalization has no finite meaning at $\tau=0$. The theorem gives a monotone number, but it does not assert convergence of the flow or existence of a limiting shrinker without compactness and noncollapsing input. The monotonicity identity leaves a naming problem: the vanishing tensor is the geometric structure that entropy detects, so it should be treated as an object in its own right. This motivates the definition of a gradient shrinking Ricci soliton at the same scale $\tau$ that appears in $\mathcal W$.
[definition: Gradient Shrinking Ricci Soliton]
A Riemannian manifold $(M,g)$ with $f\in C^\infty(M)$ is a gradient shrinking Ricci soliton at scale $\tau>0$ if
\begin{align*}
\operatorname{Ric}_g + \operatorname{Hess}_{g} f = \frac{1}{2\tau}g.
\end{align*}
[/definition]
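As a sanity check on the normalization, one can test the definition on flat $\mathbb R^n$ with the quadratic potential $f(x)=|x|^2/(4\tau)$. The choice of $f$ is an illustrative assumption matching the Gaussian model; it is not part of the definition. Since the Euclidean metric is Ricci-flat and $\partial_i\partial_j f=\delta_{ij}/(2\tau)$,
\begin{align*}
\operatorname{Ric}_{g_{\mathrm{Euc}}}+\operatorname{Hess}_{g_{\mathrm{Euc}}} f=0+\frac{1}{2\tau}g_{\mathrm{Euc}}=\frac{1}{2\tau}g_{\mathrm{Euc}}.
\end{align*}
So $(\mathbb R^n,g_{\mathrm{Euc}},f)$ satisfies the soliton equation at scale $\tau$, and the associated density $(4\pi\tau)^{-n/2}e^{-f}$ is the Gaussian of total mass $1$.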
Shrinking solitons are self-similar models: under Ricci flow they evolve by diffeomorphism and scaling, not by producing genuinely new shapes. The entropy formula detects such models through the vanishing of a weighted square, but that analytic condition must still be translated into a pointwise geometric equation. The rigidity statement below explains when constancy of the entropy forces the flow to be a shrinker rather than merely having the same numerical entropy at isolated times.
[quotetheorem:6015]
[citeproof:6015]
The equality case is the bridge from analysis to geometry. The positivity of the density matters here: an integral of a squared tensor against a positive smooth measure vanishes only when the tensor vanishes at every point. If a nonnegative density vanished on an [open set](/page/Open%20Set), a tensor supported in that open set would have zero weighted square integral without vanishing globally, so positivity cannot be replaced by arbitrary nonnegative weight. Constancy on an interval is stronger than vanishing of the derivative at a single time; a nonnegative square integral can vanish at one time and become positive immediately afterwards, so a single equality time does not force a soliton on a time interval. The conjugate heat and normalization hypotheses are also part of the rigidity statement: with a density transported by the wrong equation, zero change of the numerical functional would not force the displayed square to be the derivative. The result also does not say that a nearly constant entropy automatically gives an exact soliton; turning small entropy production into geometric closeness requires separate compactness and regularity estimates. It is the reason blow-up limits at Type I singularities are expected to be shrinkers once compactness and noncollapsing have been established.
[example: W Entropy on the Round Shrinking Sphere]
Let $S^n$ carry the round metric $g(t)=r(t)^2g_{S^n}$, where $g_{S^n}$ is the unit round metric. Since $\operatorname{Ric}_{g_{S^n}}=(n-1)g_{S^n}$ and constant scaling leaves the Ricci tensor unchanged as a $(0,2)$-tensor, we have
\begin{align*}
\operatorname{Ric}_{g(t)}=(n-1)g_{S^n}.
\end{align*}
Because $g(t)=r(t)^2g_{S^n}$, this is equivalently
\begin{align*}
\operatorname{Ric}_{g(t)}=(n-1)r(t)^{-2}g(t).
\end{align*}
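Differentiating the ansatz $g(t)=r(t)^2g_{S^n}$ in time gives
\begin{align*}
\partial_t g(t)=\frac{d}{dt}\left(r(t)^2\right)g_{S^n}.
\end{align*}
The Ricci flow equation, combined with the formula for $\operatorname{Ric}_{g(t)}$, gives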
\begin{align*}
\partial_t g(t)=-2\operatorname{Ric}_{g(t)}=-2(n-1)g_{S^n}.
\end{align*}
Comparing the two expressions for $\partial_t g(t)$ yields
\begin{align*}
\frac{d}{dt}\left(r(t)^2\right)=-2(n-1).
\end{align*}
If $\tau$ is the backward time to extinction, so $\partial_t\tau=-1$ and $r(t)^2=0$ when $\tau=0$, then
\begin{align*}
r(t)^2=2(n-1)\tau(t).
\end{align*}
Take $f\equiv c(t)$ constant in space and choose $c(t)$ by the shrinker normalization. Then
\begin{align*}
1=\int_{S^n}(4\pi\tau)^{-n/2}e^{-c(t)}\,d\mu_{g(t)}.
\end{align*}
Since the integrand is spatially constant apart from the volume form,
\begin{align*}
1=(4\pi\tau)^{-n/2}e^{-c(t)}\operatorname{Vol}_{g(t)}(S^n).
\end{align*}
Thus
\begin{align*}
e^{-c(t)}=\frac{(4\pi\tau)^{n/2}}{\operatorname{Vol}_{g(t)}(S^n)}.
\end{align*}
The volume of the scaled round metric is
\begin{align*}
\operatorname{Vol}_{g(t)}(S^n)=r(t)^n\operatorname{Vol}_{g_{S^n}}(S^n).
\end{align*}
Using $r(t)^2=2(n-1)\tau(t)$ gives
\begin{align*}
\operatorname{Vol}_{g(t)}(S^n)=(2(n-1)\tau)^{n/2}\operatorname{Vol}_{g_{S^n}}(S^n).
\end{align*}
Substitution into the normalization formula gives
\begin{align*}
e^{-c(t)}=\frac{(4\pi\tau)^{n/2}}{(2(n-1)\tau)^{n/2}\operatorname{Vol}_{g_{S^n}}(S^n)}.
\end{align*}
Cancelling the factor $\tau^{n/2}$ yields
\begin{align*}
e^{-c(t)}=\frac{\left(\frac{2\pi}{n-1}\right)^{n/2}}{\operatorname{Vol}_{g_{S^n}}(S^n)}.
\end{align*}
Hence $c(t)$ is independent of $t$.
Because $f$ is spatially constant,
\begin{align*}
\nabla f=0.
\end{align*}
Also,
\begin{align*}
\operatorname{Hess}_{g} f=0.
\end{align*}
Using $r(t)^2=2(n-1)\tau(t)$, we compute
\begin{align*}
\operatorname{Ric}_{g(t)}+\operatorname{Hess}_{g} f=(n-1)r(t)^{-2}g(t).
\end{align*}
Replacing $r(t)^2$ by $2(n-1)\tau(t)$ gives
\begin{align*}
\operatorname{Ric}_{g(t)}+\operatorname{Hess}_{g} f=\frac{n-1}{2(n-1)\tau(t)}g(t).
\end{align*}
Cancelling $n-1$ gives the gradient shrinking soliton equation
\begin{align*}
\operatorname{Ric}_{g(t)}+\operatorname{Hess}_{g} f=\frac{1}{2\tau(t)}g(t).
\end{align*}
Therefore the tensor in Perelman's $\mathcal W$ monotonicity formula vanishes:
\begin{align*}
\operatorname{Ric}_{g(t)}+\operatorname{Hess}_{g} f-\frac{1}{2\tau(t)}g(t)=0.
\end{align*}
The monotonicity identity then gives
\begin{align*}
\frac{d}{dt}\mathcal W[g(t),f,\tau(t)]=0.
\end{align*}
Thus $\mathcal W$ is constant along the normalized round shrinking solution.
Its value can be read directly from the definition. Since
\begin{align*}
\operatorname{Ric}_{g(t)}=\frac{1}{2\tau(t)}g(t),
\end{align*}
taking the $g(t)$-trace gives
\begin{align*}
S_{g(t)}=\frac{n}{2\tau(t)}.
\end{align*}
Because $|\nabla f|^2=0$ and $f=c$, the integrand in $\mathcal W$ is
\begin{align*}
\tau(S_{g(t)}+|\nabla f|^2)+f-n=\tau\frac{n}{2\tau}+c-n.
\end{align*}
Cancelling $\tau$ gives
\begin{align*}
\tau(S_{g(t)}+|\nabla f|^2)+f-n=c-\frac{n}{2}.
\end{align*}
The shrinker normalization says that $(4\pi\tau)^{-n/2}e^{-c}d\mu_{g(t)}$ has total mass $1$, so
\begin{align*}
\mathcal W[g(t),f,\tau(t)]=c-\frac{n}{2}.
\end{align*}
The round shrinking sphere therefore has constant $\mathcal W$-entropy; unlike the Euclidean Gaussian model, its constant value includes the normalized compact volume through the constant $c$ and reflects the positive scalar curvature of the shrinker.
[/example]
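The example takes $f$ constant in space, so it is worth confirming that the resulting density is admissible. Assume the conjugate heat equation is written as $\partial_t u=-\Delta u+S_gu$; this sign convention is an assumption here, chosen to be compatible with $\partial_t d\mu_g=-S_g\,d\mu_g$ and conservation of total mass. Put $u=(4\pi\tau)^{-n/2}e^{-c}$ with $c$ constant. Using $\partial_t\tau=-1$, $\Delta u=0$, and $S_{g(t)}=n/(2\tau)$,
\begin{align*}
\partial_t u&=-\frac{n}{2}\,\tau^{-1}(\partial_t\tau)\,u=\frac{n}{2\tau}\,u, &
-\Delta u+S_{g(t)}u&=\frac{n}{2\tau}\,u.
\end{align*}
The two expressions agree, so under this convention the spatially constant density solves the conjugate heat equation along the round shrinking sphere, and the monotonicity identity applies to it.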
## Entropy Functionals as Singularity Detectors
The two entropy formulae have complementary roles in the rest of the course. The $\mathcal F$-functional detects steady gradient structure modulo diffeomorphism, while $\mathcal W$ detects shrinking structure at the scale of a potential singularity.
[remark: Scaling Behaviour]
If $g$ is replaced by $\lambda g$ and $\tau$ by $\lambda\tau$ for a constant $\lambda>0$, with $f$ unchanged, then the shrinker normalization is preserved and $\mathcal W[\lambda g,f,\lambda\tau]=\mathcal W[g,f,\tau]$. This is the scaling compatibility that makes $\mathcal W$ suitable for blow-up arguments.
[/remark]
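The invariance can be checked term by term, assuming $\mathcal W$ is the integral of $\tau(S_g+|\nabla f|^2)+f-n$ against $(4\pi\tau)^{-n/2}e^{-f}d\mu_g$, as in the sphere example. Write $\tilde g=\lambda g$ and $\tilde\tau=\lambda\tau$ (notation introduced only for this check) and keep $f$ fixed. Then
\begin{align*}
S_{\tilde g}&=\lambda^{-1}S_g, &
|\nabla f|_{\tilde g}^2&=\lambda^{-1}|\nabla f|_g^2, \\
d\mu_{\tilde g}&=\lambda^{n/2}\,d\mu_g, &
(4\pi\tilde\tau)^{-n/2}&=\lambda^{-n/2}(4\pi\tau)^{-n/2}.
\end{align*}
Hence
\begin{align*}
\tilde\tau\left(S_{\tilde g}+|\nabla f|_{\tilde g}^2\right)&=\tau\left(S_g+|\nabla f|_g^2\right), &
(4\pi\tilde\tau)^{-n/2}e^{-f}\,d\mu_{\tilde g}&=(4\pi\tau)^{-n/2}e^{-f}\,d\mu_g.
\end{align*}
Both the integrand and the weighted measure are unchanged, so the normalization and the value of $\mathcal W$ are the same before and after rescaling.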
The next stages of the theory use these identities to rule out collapsing at bounded curvature scales and to analyze singularity models. This role is analogous to Lyapunov functionals in parabolic PDE and free energy in statistical mechanics: the formula does not merely bound a solution, but identifies the structured states where entropy production stops. In Ricci flow those structured states are gradient solitons, so the monotonicity formula becomes a rigidity mechanism for the limiting geometry.
Entropy identifies the geometric states that resist further dissipation, but reduced distance and reduced volume give a more geometric way to measure that rigidity. The next chapter replaces ordinary distance with a flow-adapted notion that is tailored to singularity analysis.
# 9. Reduced Distance and Reduced Volume
This chapter develops Perelman's reduced geometry for Ricci flow, assuming the Ricci flow equation from Chapter 1, scalar curvature evolution from Chapter 3, conjugate heat operators from Chapter 5, and parabolic rescaling from Chapter 6. The goal is to replace ordinary distance, which is tied to a single fixed metric, by a space-time action adapted to an evolving metric. From this action we construct reduced distance and reduced volume, then use their monotonicity to prove rigidity and noncollapsing results that underpin singularity analysis.
## The Backward Geometry of Ricci Flow
The first problem is that a Ricci flow has no single Riemannian distance function connecting points at different times. A curve joining $(q,\tau_1)$ to $(p,\tau_2)$ must pay both for spatial speed and for the scalar curvature encountered along the way. Perelman's $\mathcal L$-length is designed so that its Euler-Lagrange equation matches the adjoint heat operator and produces a comparison theory for the conjugate heat kernel.
[definition: Backward Time Ricci Flow]
Let $(M,g(t))$ be a Ricci flow for $t\in [0,T)$. Fix $t_0\in (0,T)$ and define the backward-time map
\begin{align*}
\Theta:(0,t_0]&\to [0,t_0), & \Theta(\tau)&=t_0-\tau.
\end{align*}
The backward metric family is the map
\begin{align*}
g:(0,t_0]&\to \Gamma(S^2T^*M), & \tau&\mapsto g(\tau):=g(\Theta(\tau))=g(t_0-\tau),
\end{align*}
and it satisfies
\begin{align*}
\frac{\partial g}{\partial \tau}=2\operatorname{Ric}_{g(\tau)}.
\end{align*}
[/definition]
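The sign change in the definition is a chain rule computation. With $t=t_0-\tau$ and the forward equation $\partial_t g=-2\operatorname{Ric}$,
\begin{align*}
\frac{\partial}{\partial\tau}\,g(t_0-\tau)=-\left.\frac{\partial g}{\partial t}\right|_{t=t_0-\tau}=2\operatorname{Ric}_{g(t_0-\tau)}.
\end{align*}
Taking the trace gives the corresponding evolution of the volume form, which is the source of the curvature term in later volume computations:
\begin{align*}
\frac{\partial}{\partial\tau}\,d\operatorname{vol}_{g(\tau)}=\frac12\operatorname{tr}_{g(\tau)}\left(\frac{\partial g}{\partial\tau}\right)d\operatorname{vol}_{g(\tau)}=R\,d\operatorname{vol}_{g(\tau)}.
\end{align*}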
Backward time turns the forward heat operator into the conjugate heat operator and makes singularities appear as limits as $\tau\downarrow 0$. This motivates assigning an action to curves in this time direction.
[definition: L Length]
Let $(M,g(\tau))$ be a Ricci flow written in backward time and let $0\le\tau_1<\tau_2\le t_0$; the case $\tau_1=0$ is the one used for reduced distance below. Denote by $\mathcal P_{\tau_1,\tau_2}(M)$ the space of piecewise smooth curves $\gamma:[\tau_1,\tau_2]\to M$. The $\mathcal L$-length functional is the map
\begin{align*}
\mathcal L:\mathcal P_{\tau_1,\tau_2}(M)&\to \mathbb R, &
\mathcal L(\gamma)&=\int_{\tau_1}^{\tau_2}\sqrt{\tau}\left(|\dot{\gamma}(\tau)|_{g(\tau)}^2+R(\gamma(\tau),\tau)\right)d\tau.
\end{align*}
[/definition]
The factor $\sqrt{\tau}$ is the scaling that makes $\mathcal L$ compatible with parabolic dilation. Spatial speed is expensive near later backward times, while curvature is sampled with the same parabolic weight. This motivates the following definition, which turns minimal $\mathcal L$-length into a dimensionless analogue of distance squared divided by time.
[definition: Reduced Distance]
Fix a basepoint $(p,0)$ in backward time. The reduced distance based at $(p,0)$ is the extended-real-valued function
\begin{align*}
l:M\times (0,t_0]&\to \mathbb R\cup\{\infty\},
\end{align*}
defined by
\begin{align*}
l(q,\tau)=\frac{1}{2\sqrt{\tau}}\inf_{\gamma}\mathcal L(\gamma),
\end{align*}
where the infimum is taken over piecewise smooth curves $\gamma:[0,\tau]\to M$ with $\gamma(0)=p$ and $\gamma(\tau)=q$.
[/definition]
The reduced distance is dimensionless and invariant under the natural parabolic rescalings of Ricci flow. It behaves like squared distance divided by time in flat space, with curvature correction terms recording the geometry seen by optimal space-time paths.
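The scaling claim can be made explicit. The following is a sketch under the assumption that parabolic rescaling by $\lambda>0$ means $\tilde g(\tilde\tau)=\lambda\,g(\tilde\tau/\lambda)$, with curves reparametrized as $\tilde\gamma(\tilde\tau)=\gamma(\tilde\tau/\lambda)$; the tilde notation is introduced only here. Then $\tilde g$ is again a backward Ricci flow, and
\begin{align*}
|\dot{\tilde\gamma}(\tilde\tau)|_{\tilde g(\tilde\tau)}^2&=\lambda\cdot\lambda^{-2}\,|\dot\gamma(\tilde\tau/\lambda)|_{g(\tilde\tau/\lambda)}^2=\lambda^{-1}|\dot\gamma|^2, &
\tilde R&=\lambda^{-1}R.
\end{align*}
Substituting $\tilde\tau=\lambda\sigma$, so that $\sqrt{\tilde\tau}=\sqrt{\lambda}\sqrt{\sigma}$ and $d\tilde\tau=\lambda\,d\sigma$,
\begin{align*}
\tilde{\mathcal L}(\tilde\gamma)=\int_0^{\lambda\tau}\sqrt{\tilde\tau}\left(|\dot{\tilde\gamma}|_{\tilde g}^2+\tilde R\right)d\tilde\tau
=\sqrt{\lambda}\int_0^{\tau}\sqrt{\sigma}\left(|\dot\gamma|_{g}^2+R\right)d\sigma=\sqrt{\lambda}\,\mathcal L(\gamma).
\end{align*}
Dividing by $2\sqrt{\lambda\tau}$ and taking the infimum over curves gives
\begin{align*}
\tilde l(q,\lambda\tau)=\frac{\sqrt{\lambda}}{2\sqrt{\lambda\tau}}\inf_\gamma\mathcal L(\gamma)=l(q,\tau),
\end{align*}
which is the invariance asserted above: $\mathcal L$ scales like a length, while $l$ is dimensionless.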
[example: Gaussian Shrinker Reduced Distance]
Consider the static Euclidean flow $(\mathbb R^n,g(\sigma))$ with $g(\sigma)=g_{\mathrm{Euc}}$ and $R=0$, based at $(0,0)$. For a piecewise smooth curve $\gamma:[0,\tau]\to\mathbb R^n$ with $\gamma(0)=0$ and $\gamma(\tau)=q$, the $\mathcal L$-length is
\begin{align*}
\mathcal L(\gamma)=\int_0^\tau \sqrt{\sigma}\,|\dot\gamma(\sigma)|^2\,d\sigma.
\end{align*}
Put $s=\sqrt{\sigma}$ and define $\eta(s)=\gamma(s^2)$ for $0\le s\le \sqrt{\tau}$. Then $d\sigma=2s\,ds$, and for $s>0$,
\begin{align*}
\eta'(s)=2s\,\dot\gamma(s^2).
\end{align*}
Hence
\begin{align*}
\dot\gamma(s^2)=\frac{\eta'(s)}{2s}.
\end{align*}
Substituting these identities into the action gives
\begin{align*}
\mathcal L(\gamma)=\int_0^{\sqrt{\tau}} s\left|\frac{\eta'(s)}{2s}\right|^2 2s\,ds.
\end{align*}
Therefore
\begin{align*}
\mathcal L(\gamma)=\frac12\int_0^{\sqrt{\tau}}|\eta'(s)|^2\,ds.
\end{align*}
Since $\eta(0)=0$ and $\eta(\sqrt{\tau})=q$,
\begin{align*}
q=\int_0^{\sqrt{\tau}}\eta'(s)\,ds.
\end{align*}
Taking norms and applying the integral [Cauchy-Schwarz inequality](/theorems/432),
\begin{align*}
|q|\le \left(\int_0^{\sqrt{\tau}}1\,ds\right)^{1/2}\left(\int_0^{\sqrt{\tau}}|\eta'(s)|^2\,ds\right)^{1/2}.
\end{align*}
Since $\int_0^{\sqrt{\tau}}1\,ds=\sqrt{\tau}$, this becomes
\begin{align*}
|q|\le \tau^{1/4}\left(\int_0^{\sqrt{\tau}}|\eta'(s)|^2\,ds\right)^{1/2}.
\end{align*}
Squaring both sides gives
\begin{align*}
\int_0^{\sqrt{\tau}}|\eta'(s)|^2\,ds\ge \frac{|q|^2}{\sqrt{\tau}}.
\end{align*}
Equality occurs for the constant-speed $s$-curve
\begin{align*}
\eta(s)=\frac{s}{\sqrt{\tau}}q.
\end{align*}
In terms of the original backward time variable, this minimizing curve is
\begin{align*}
\gamma(\sigma)=\eta(\sqrt{\sigma})=\sqrt{\frac{\sigma}{\tau}}\,q.
\end{align*}
Thus
\begin{align*}
\inf_\gamma \mathcal L(\gamma)=\frac12\cdot\frac{|q|^2}{\sqrt{\tau}}.
\end{align*}
So
\begin{align*}
\inf_\gamma \mathcal L(\gamma)=\frac{|q|^2}{2\sqrt{\tau}}.
\end{align*}
By the definition of reduced distance,
\begin{align*}
l(q,\tau)=\frac{1}{2\sqrt{\tau}}\inf_\gamma\mathcal L(\gamma).
\end{align*}
Substituting the computed infimum,
\begin{align*}
l(q,\tau)=\frac{1}{2\sqrt{\tau}}\cdot\frac{|q|^2}{2\sqrt{\tau}}.
\end{align*}
Therefore
\begin{align*}
l(q,\tau)=\frac{|q|^2}{4\tau}.
\end{align*}
The minimizing curve is constant-speed in $s=\sqrt{\sigma}$, not in backward time $\sigma$, and the resulting exponent is exactly the Gaussian exponent in the Euclidean heat kernel.
[/example]
This example explains the normalization in the definition: the Euclidean model should produce exactly the heat kernel density after exponentiating $-l$. Reduced distance is defined by an infimum over paths, so later variation formulae need to distinguish arbitrary competitors from paths where the first variation vanishes. The next definition names those critical paths of the $\mathcal L$-functional, which play the role of geodesics for Perelman's backward-time length.
[definition: L Geodesic]
Fix $0<\tau_1<\tau_2\le t_0$ and endpoints $q_1,q_2\in M$. Let
\begin{align*}
\mathcal P(q_1,\tau_1;q_2,\tau_2)=\{\gamma\in C^\infty([\tau_1,\tau_2];M):\gamma(\tau_1)=q_1,\ \gamma(\tau_2)=q_2\}
\end{align*}
with variations given by smooth one-parameter families in this endpoint-fixed path space. An $\mathcal L$-geodesic is a smooth critical point of the functional
\begin{align*}
\mathcal L:\mathcal P(q_1,\tau_1;q_2,\tau_2)\to \mathbb R.
\end{align*}
[/definition]
The definition isolates the curves along which reduced distance is differentiated. The next task is to compute their Euler-Lagrange equation, since this equation is what later feeds into the first and second variation formulae.
[quotetheorem:6016]
[citeproof:6016]
The geodesic equation gives control of first variations, but it is only valid along smooth critical paths for the backward Ricci flow action. The backward Ricci flow structure is essential: without the term $\partial_\tau g=2\operatorname{Ric}$, the Ricci contribution in the Euler-Lagrange equation would have the wrong coefficient, and later cancellations with volume distortion would fail. The equation also does not say that every endpoint is joined by a unique minimizing $\mathcal L$-geodesic; cut-locus and regularity issues remain. Monotonicity therefore requires an inequality for the reduced heat-kernel density that survives beyond smooth minimizing points.
The hypotheses are not merely technical. If the metric family were static but not Ricci-flat, the $2\operatorname{Ric}(\dot\gamma,\cdot)^\sharp$ term would not arise from differentiating the metric, so the displayed equation would not describe critical points of the same action. If the endpoint lies on an $\mathcal L$-cut locus, two minimizing curves can meet the same point and the first variation of the infimum need not be represented by a single terminal velocity. If $\tau=0$ were treated as an ordinary endpoint rather than a limiting base time, the coefficient $(2\tau)^{-1}\dot\gamma$ would be singular without the finite reduced-length condition controlling the initial behaviour.
[quotetheorem:6017]
[citeproof:6017]
The inequality has the same form as the logarithmic transformation of the conjugate heat equation, which is why the cut-locus formulation matters. At smooth points the calculation is a differential identity plus a trace estimate; at cut points, derivatives of $l$ may not exist, so the statement must be interpreted by barriers or distributions. Without this weak formulation, the later integration argument would ignore the most important obstruction to differentiating a distance-like function. This is the bridge from reduced distance to a genuine monotone integral.
Each assumption rules out a concrete failure mode. Completeness prevents minimizing $\mathcal L$-geodesics from escaping through a missing end before reaching the terminal point, as can happen on a punctured manifold with the inherited Euclidean metric. Bounded curvature on compact backward-time intervals gives uniform control of the first and second variation formulae; without it, curvature terms along an approximating minimizing sequence need not have a controlled limit. The smooth-point assumption is needed for the pointwise differential inequality because the ordinary distance function on a round sphere already fails to be differentiable at antipodal cut points, and reduced distance has the same cut-locus phenomenon. Finiteness of $l$ excludes endpoints for which the infimum is infinite, where the exponential density vanishes but the logarithmic differential expression is not meaningful.
## Reduced Volume Monotonicity
The central question is whether the Gaussian heat kernel density has a Ricci-flow analogue whose total mass can only decrease in backward time. The reduced distance supplies the exponential factor, while the evolving volume form supplies the geometric correction. The resulting quantity is the reduced volume.
[definition: Reduced Volume]
For a complete backward Ricci flow $(M,g(\tau))$ based at $(p,0)$, the reduced volume is the function
\begin{align*}
\widetilde V:(0,t_0]&\to [0,\infty],
\end{align*}
defined by
\begin{align*}
\widetilde V(\tau)=\int_M (4\pi\tau)^{-n/2}e^{-l(q,\tau)}\,d\operatorname{vol}_{g(\tau)}(q).
\end{align*}
[/definition]
The integrand resembles the Euclidean heat kernel, but $l$ replaces squared distance divided by $4\tau$. Under parabolic rescaling, the prefactor, reduced distance, and volume form transform so that $\widetilde V$ is unchanged. This motivates proving the central monotonicity theorem for this integral.
[quotetheorem:6018]
[citeproof:6018]
Reduced volume monotonicity is the Ricci-flow analogue of Bishop-Gromov volume comparison. Completeness and bounded curvature are not cosmetic assumptions: they justify existence and control of minimizing $\mathcal L$-geodesics, cutoff integration by parts, and passage through the barrier inequality. Monotonicity by itself does not classify the flow, and a value below $1$ only records that the geometry has lost Gaussian volume relative to Euclidean space. The equality case is the additional input that characterizes self-similar models.
The short-time Euclidean asymptotic supplies the normalization $\lim_{\tau\downarrow 0}\widetilde V(\tau)=1$. Without it, a cone-like or orbifold-like tangent model would produce the [Gaussian integral](/theorems/1140) over the tangent model rather than the Euclidean value, so the upper bound by $1$ would not be the correct reference. Finiteness is also needed: on a noncompact flow with uncontrolled negative reduced distance at infinity, the formal density may fail to be integrable, and differentiating the total mass would no longer be justified by the cutoff argument.
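The reference value $1$ can be checked directly in the flat model. Using $l(q,\tau)=|q|^2/(4\tau)$ from the Euclidean example and the product structure of the Gaussian,
\begin{align*}
\widetilde V_{\mathrm{Euc}}(\tau)=\int_{\mathbb R^n}(4\pi\tau)^{-n/2}e^{-|q|^2/(4\tau)}\,dq
=\prod_{i=1}^{n}\left((4\pi\tau)^{-1/2}\int_{\mathbb R}e^{-x_i^2/(4\tau)}\,dx_i\right)=1
\end{align*}
for every $\tau>0$. The symbol $\widetilde V_{\mathrm{Euc}}$ is notation introduced here for the reduced volume of static Euclidean space; it is the constant against which other flows are compared.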
[example: Shrinking Cylinder Reduced Volume Comparison]
For the shrinking round cylinder, write the backward-time metric near its singular time as
\begin{align*}
g(\sigma)=2(n-2)\sigma\,g_{S^{n-1}}+dz^2
\end{align*}
on $S^{n-1}\times\mathbb R$. The scalar curvature of a round $S^{n-1}$ of radius $\sqrt{2(n-2)\sigma}$ is
\begin{align*}
R(\sigma)=\frac{(n-1)(n-2)}{2(n-2)\sigma}.
\end{align*}
Thus
\begin{align*}
R(\sigma)=\frac{n-1}{2\sigma}.
\end{align*}
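Take the basepoint $(\theta_0,0)\in S^{n-1}\times\mathbb R$ at backward time $0$ and consider an endpoint $(\theta_0,z)$ with no spherical displacement. Because $R$ depends only on $\sigma$ and any motion in the spherical factor only adds to the kinetic term, it suffices to consider curves $\sigma\mapsto(\theta_0,z(\sigma))$ whose spherical component is constant. For such a curve, with $z(0)=0$ and $z(\tau)=z$, the $\mathcal L$-length is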
\begin{align*}
\mathcal L(z)=\int_0^\tau \sqrt{\sigma}\left(|\dot z(\sigma)|^2+\frac{n-1}{2\sigma}\right)d\sigma.
\end{align*}
Put $s=\sqrt{\sigma}$ and $\eta(s)=z(s^2)$. Then $d\sigma=2s\,ds$ and $\eta'(s)=2s\,\dot z(s^2)$, so
\begin{align*}
\int_0^\tau \sqrt{\sigma}\,|\dot z(\sigma)|^2\,d\sigma
=\frac12\int_0^{\sqrt{\tau}}|\eta'(s)|^2\,ds.
\end{align*}
Since $\eta(0)=0$ and $\eta(\sqrt{\tau})=z$,
\begin{align*}
z=\int_0^{\sqrt{\tau}}\eta'(s)\,ds.
\end{align*}
The Cauchy-Schwarz inequality gives
\begin{align*}
|z|^2\le \left(\int_0^{\sqrt{\tau}}1\,ds\right)\left(\int_0^{\sqrt{\tau}}|\eta'(s)|^2\,ds\right).
\end{align*}
Since $\int_0^{\sqrt{\tau}}1\,ds=\sqrt{\tau}$, this is
\begin{align*}
\int_0^{\sqrt{\tau}}|\eta'(s)|^2\,ds\ge \frac{z^2}{\sqrt{\tau}}.
\end{align*}
Hence
\begin{align*}
\int_0^\tau \sqrt{\sigma}\,|\dot z(\sigma)|^2\,d\sigma\ge \frac{z^2}{2\sqrt{\tau}}.
\end{align*}
Equality holds for $\eta(s)=sz/\sqrt{\tau}$, equivalently $z(\sigma)=\sqrt{\sigma/\tau}\,z$.
The curvature part is
\begin{align*}
\int_0^\tau \sqrt{\sigma}\,\frac{n-1}{2\sigma}\,d\sigma
=\frac{n-1}{2}\int_0^\tau \sigma^{-1/2}\,d\sigma.
\end{align*}
Since $\int_0^\tau \sigma^{-1/2}\,d\sigma=2\sqrt{\tau}$, this becomes
\begin{align*}
\int_0^\tau \sqrt{\sigma}\,\frac{n-1}{2\sigma}\,d\sigma=(n-1)\sqrt{\tau}.
\end{align*}
Therefore
\begin{align*}
\inf \mathcal L=\frac{z^2}{2\sqrt{\tau}}+(n-1)\sqrt{\tau}.
\end{align*}
By the definition of reduced distance,
\begin{align*}
l(\theta_0,z,\tau)=\frac{1}{2\sqrt{\tau}}\left(\frac{z^2}{2\sqrt{\tau}}+(n-1)\sqrt{\tau}\right).
\end{align*}
Thus
\begin{align*}
l(\theta_0,z,\tau)=\frac{z^2}{4\tau}+\frac{n-1}{2}.
\end{align*}
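The same value holds at every point $(\theta,z)$: the spherical factor has radius $\sqrt{2(n-2)\sigma}\to0$ as $\sigma\downarrow0$, so any angular displacement can be carried out near $\sigma=0$ at arbitrarily small $\mathcal L$-cost, and the infimum does not depend on $\theta$. The reduced-volume density is therefore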
\begin{align*}
(4\pi\tau)^{-n/2}\exp\left(-\frac{z^2}{4\tau}-\frac{n-1}{2}\right).
\end{align*}
The spherical radius is $\sqrt{2(n-2)\tau}$, so the volume form is
\begin{align*}
d\operatorname{vol}_{g(\tau)}=\left(2(n-2)\tau\right)^{(n-1)/2}d\operatorname{vol}_{S^{n-1}}\,dz.
\end{align*}
Hence
\begin{align*}
\widetilde V(\tau)=(4\pi\tau)^{-n/2}e^{-(n-1)/2}\left(2(n-2)\tau\right)^{(n-1)/2}\operatorname{vol}(S^{n-1})\int_{\mathbb R}e^{-z^2/(4\tau)}\,dz.
\end{align*}
The one-dimensional Gaussian integral is
\begin{align*}
\int_{\mathbb R}e^{-z^2/(4\tau)}\,dz=(4\pi\tau)^{1/2}.
\end{align*}
Substituting this gives
\begin{align*}
\widetilde V(\tau)=(4\pi\tau)^{-n/2}e^{-(n-1)/2}\left(2(n-2)\tau\right)^{(n-1)/2}\operatorname{vol}(S^{n-1})(4\pi\tau)^{1/2}.
\end{align*}
Combining the powers of $\tau$ gives $\tau^{-n/2}\tau^{(n-1)/2}\tau^{1/2}=\tau^0$, so
\begin{align*}
\widetilde V(\tau)=(4\pi)^{-(n-1)/2}e^{-(n-1)/2}\left(2(n-2)\right)^{(n-1)/2}\operatorname{vol}(S^{n-1}).
\end{align*}
Thus the reduced volume is independent of $\tau$ for the cylinder shrinker. This constant is strictly below the Euclidean value $1$, so the reduced volume records that a neck has cylindrical, codimension-one Gaussian geometry rather than flat Euclidean geometry.
[/example]
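The constant in the cylinder example simplifies, which makes the comparison with $1$ concrete. Collecting the factors $(4\pi)^{-(n-1)/2}$, $e^{-(n-1)/2}$, and $(2(n-2))^{(n-1)/2}$ into a single power,
\begin{align*}
\widetilde V(\tau)=\left(\frac{n-2}{2\pi e}\right)^{(n-1)/2}\operatorname{vol}(S^{n-1}).
\end{align*}
In dimension $n=3$, where $\operatorname{vol}(S^2)=4\pi$, this gives
\begin{align*}
\widetilde V(\tau)=\frac{1}{2\pi e}\cdot 4\pi=\frac{2}{e}\approx 0.736,
\end{align*}
which is below the Euclidean value $1$, as the example asserts for the three-dimensional neck model $S^2\times\mathbb R$.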
The cylinder example shows that equality in reduced volume monotonicity is a special geometric condition rather than a generic feature of ancient solutions. This motivates introducing the self-similar models that are expected to account for equality.
[definition: Gradient Shrinking Ricci Soliton]
A gradient shrinking Ricci soliton is a Riemannian manifold $(M,g)$ together with a smooth function $f:M\to\mathbb R$ and a constant $\lambda>0$ such that
\begin{align*}
\operatorname{Ric}+\operatorname{Hess}_{g} f=\lambda g.
\end{align*}
[/definition]
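The link between this static equation and an actual Ricci flow can be sketched as follows. Assume the time-dependent vector field $\sigma(t)^{-1}\nabla_g f$, with $\sigma(t)=1-2\lambda t$ and $t<\frac{1}{2\lambda}$, integrates to a family of diffeomorphisms $\psi_t$ with $\psi_0=\mathrm{id}$; this holds on closed manifolds and is an assumption otherwise. The symbols $\sigma$ and $\psi_t$ are introduced only for this sketch. Set $g(t)=\sigma(t)\,\psi_t^*g$. Using $\mathcal L_{\nabla f}g=2\operatorname{Hess}_g f$ and the invariance of the Ricci tensor under constant scaling and diffeomorphism,
\begin{align*}
\partial_t g(t)&=-2\lambda\,\psi_t^*g+\sigma(t)\,\psi_t^*\left(\mathcal L_{\sigma(t)^{-1}\nabla f}\,g\right) \\
&=\psi_t^*\left(-2\lambda g+2\operatorname{Hess}_g f\right) \\
&=-2\,\psi_t^*\operatorname{Ric}_g=-2\operatorname{Ric}_{g(t)}.
\end{align*}
With the backward time $\tau=\frac{1}{2\lambda}-t$ one has $\sigma(t)=2\lambda\tau$, and pulling back the soliton equation gives
\begin{align*}
\operatorname{Ric}_{g(t)}+\operatorname{Hess}_{g(t)}(f\circ\psi_t)=\lambda\,\psi_t^*g=\frac{\lambda}{2\lambda\tau}\,g(t)=\frac{1}{2\tau}\,g(t),
\end{align*}
so each time slice is a shrinker at scale $\tau$ in the sense used for the $\mathcal W$-entropy.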
Such a soliton generates a Ricci flow by diffeomorphisms and scaling. With $\lambda=\frac{1}{2\tau}$ the equation is the soliton equation at scale $\tau$ that appeared with the $\mathcal W$-entropy. It is the geometric object for which the flow looks the same at every parabolic scale. This motivates the rigidity theorem identifying equality in reduced volume monotonicity with shrinking soliton structure.
[quotetheorem:6019]
[citeproof:6019]
This rigidity result explains why reduced volume is so effective in blow-up analysis, but its hypotheses are substantial. Constancy is much stronger than having a small derivative along a sequence, and the conclusion is local to the region controlled by minimizing $\mathcal L$-geodesics unless additional global assumptions are available. Completeness and curvature control prevent equality from being lost through escape of mass or singular cut-locus behaviour. A singularity model obtained as a limit of rescalings inherits enough monotonicity that constancy often follows, forcing the limit to be a shrinker.
Concrete failures show why the statement is phrased this way. On an incomplete domain of Euclidean space, Gaussian mass can disappear through the boundary even though the local tensor equation is flat. With no curvature control, equality in the integrated inequality need not upgrade to a smooth soliton equation because the second variation and parabolic regularity inputs can break down. If the reduced volume is infinite, constancy has no rigidity content: an infinite mass can remain infinite while the local geometry changes. These examples explain why the conclusion is tied to the minimizing $\mathcal L$-geodesic region and to finite reduced-volume mass.
## Singularity Models and Noncollapsing
The final use of reduced volume is to prevent high-curvature regions from collapsing faster than their curvature scale. Singularities are studied by rescaling around points where $|\operatorname{Rm}|$ is large, and the danger is that the rescaled balls might have very small volume. Reduced volume gives a lower bound which survives under blow-up limits.
[definition: Kappa-Noncollapsed at a Scale]
A Ricci flow $(M,g(t))$ is $\kappa$-noncollapsed at scale $r_0$ if for every geodesic ball $B_{g(t)}(x,r)$ with $0<r\le r_0$ and
\begin{align*}
|\operatorname{Rm}|\le r^{-2}\quad\text{on }B_{g(t)}(x,r),
\end{align*}
one has
\begin{align*}
\operatorname{vol}_{g(t)}B_{g(t)}(x,r)\ge \kappa r^n.
\end{align*}
[/definition]
The curvature hypothesis says that the ball is being viewed at a scale where curvature is controlled. The conclusion rules out almost-flat balls with vanishing normalized volume. This motivates the theorem asserting that closed Ricci flows cannot locally collapse before a finite time.
[quotetheorem:6020]
[citeproof:6020]
No local collapsing supplies the compactness needed for singularity analysis, but the statement is deliberately local and scale-restricted. Closedness and the finite time bound give uniform global control needed to choose $\kappa$ and $r_0$ from the initial data; without such input, collapse can occur in families of almost-flat manifolds. The curvature-scale hypothesis is also essential, since high curvature inside the ball can destroy any comparison with Euclidean or reduced-volume geometry. The theorem prevents collapse at controlled curvature scales, not arbitrary degeneration of topology or curvature.
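A flat model shows what collapse at a controlled curvature scale looks like. Consider the static flat flow on $S^1_\varepsilon\times\mathbb R^{n-1}$, where $S^1_\varepsilon$ is a circle of length $2\pi\varepsilon$ and $\omega_{n-1}$ denotes the volume of the unit ball in $\mathbb R^{n-1}$; this example and its notation are supplied here for illustration. Since $\operatorname{Rm}=0$, the curvature condition $|\operatorname{Rm}|\le r^{-2}$ holds on every ball, while each ball is contained in the product of the circle with a Euclidean $(n-1)$-ball of the same radius, so
\begin{align*}
\frac{\operatorname{vol}B(x,r)}{r^n}\le\frac{2\pi\varepsilon\,\omega_{n-1}r^{n-1}}{r^n}=\frac{2\pi\varepsilon\,\omega_{n-1}}{r}.
\end{align*}
At a fixed scale $r=r_0$ the right-hand side tends to $0$ as $\varepsilon\to0$, so no single $\kappa$ works for the whole family. This does not conflict with the theorem, whose constant depends on the initial data of one closed flow; it shows why such dependence cannot be removed.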
Together with curvature estimates, no local collapsing allows Hamilton compactness to produce pointed smooth limits from blow-up sequences. This motivates making precise what such a blow-up limit is.
[definition: Singularity Model]
A singularity model for a Ricci flow is a pointed smooth limit
\begin{align*}
(M_i,Q_i g(t_i+Q_i^{-1}s),x_i)\longrightarrow (M_\infty,g_\infty(s),x_\infty)
\end{align*}
where $Q_i=|\operatorname{Rm}|(x_i,t_i)\to\infty$ and there is an interval $J\subseteq\mathbb R$ such that $0\in J$, each rescaled flow is defined on every compact subinterval of $J$ for all sufficiently large $i$, and the convergence is smooth on compact subsets of $M_\infty\times J$ after choosing pointed embeddings.
[/definition]
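The rescaling in the definition is chosen so that each rescaled family is again a Ricci flow with unit curvature at the basepoint. Writing $g_i(s)=Q_i\,g(t_i+Q_i^{-1}s)$ (the name $g_i$ is introduced here) and using the scale invariance of the Ricci tensor,
\begin{align*}
\partial_s g_i(s)&=Q_i\cdot Q_i^{-1}\,(\partial_t g)(t_i+Q_i^{-1}s)=-2\operatorname{Ric}_{g(t_i+Q_i^{-1}s)}=-2\operatorname{Ric}_{g_i(s)}, \\
|\operatorname{Rm}_{g_i}|_{g_i}(x_i,0)&=Q_i^{-1}\,|\operatorname{Rm}_{g}|_{g}(x_i,t_i)=1.
\end{align*}
If the original flow is defined for $t\in[0,T)$, then $g_i$ is defined for $s\in[-Q_it_i,\,Q_i(T-t_i))$, which is why the backward extent of the limit interval $J$ grows when $Q_it_i\to\infty$.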
Reduced volume is scale-invariant, so it passes naturally to singularity models. The monotonicity in the original flow becomes monotonicity in the limit; if the base times approach a singular time in a type-I fashion, the limit often has constant reduced volume. This motivates the theorem identifying those type-I models as shrinkers.
[quotetheorem:6021]
[citeproof:6021]
This theorem is the analytic reason why shrinkers are the first models to classify in any Ricci-flow singularity theory. The type-I assumption is essential because it ties the curvature scale to the remaining time and makes the blow-up limit see a self-similar backward interval; type-II singularities can instead lead to steady or ancient nonshrinking models. Nonflatness rules out the Euclidean Gaussian model, while no local collapsing and Hamilton compactness provide the smooth pointed limit on which reduced volume can be evaluated. Cylinders, spheres, and their quotients appear as the canonical neck and cap models in the three-dimensional picture.
A useful contrast is the two-dimensional cigar soliton, which is steady rather than shrinking and occurs as a model for type-II behaviour in Ricci-flow analysis. In that setting the curvature scale is much larger than the reciprocal remaining time, so the rescaled limit need not have constant reduced volume across the backward intervals used in the type-I argument. This is why the theorem identifies type-I blow-up limits with shrinkers but does not classify all possible ancient limits.
[remark: Role in Perelman's Surgery Theory]
Reduced volume monotonicity and no local collapsing are not classification theorems by themselves. Their role is to guarantee that blow-up limits exist with enough volume and curvature control to enter the classification of ancient solutions. In dimension three, this is the bridge from analytic monotonicity to the canonical-neighbourhood structure used in Ricci flow with surgery.
[/remark]
Reduced volume turns monotonicity into a powerful compactness and noncollapsing principle, which is exactly what is needed to study three-dimensional singularities. The next chapter combines these estimates with curvature pinching to produce canonical neighborhood descriptions of high-curvature regions.
# 10. Three-Dimensional Pinching and Canonical Neighborhoods
Chapters 6--9 developed singularity models through blow-up limits, entropy, reduced volume, and noncollapsing. In dimension three, the curvature operator has special algebraic structure: its eigenvalues are sectional curvatures, and Ricci flow forces negative curvature directions to be controlled by positive curvature at large scale. This chapter explains how the [Hamilton-Ivey pinching estimate](/theorems/6022) converts arbitrary initial curvature bounds into almost nonnegative curvature near singularities, and how this leads to the language of necks, caps, and canonical neighborhoods.
The guiding question is geometric rather than only analytic: when curvature becomes large in a three-dimensional Ricci flow, what does a small neighbourhood of the point look like after rescaling? The goal is to state the pinching, noncollapsing, canonical-neighbourhood, and neck-detection tools in a form that can be used in the surgery construction. The prerequisites are the maximum principle for tensor equations, Hamilton compactness, Perelman's noncollapsing theorem, and the blow-up formalism from the preceding chapters.
## Curvature Operator Pinching in Dimension Three
A general Ricci flow may begin with mixed sectional curvature, so the first problem is to understand whether large positive curvature can coexist with uncontrolled negative curvature near a singularity. In three dimensions, the curvature operator acts on $\bigwedge^2 T_pM$, and its three eigenvalues record the sectional curvatures of an orthonormal frame. The evolution equation contains a quadratic reaction term that improves the least eigenvalue relative to the scalar curvature.
[definition: Curvature Operator Eigenvalues in Dimension Three]
Let $(M^3,g)$ be a Riemannian manifold and let $\nu(p) \leq \lambda(p) \leq \mu(p)$ be the eigenvalues of the curvature operator $\operatorname{Rm}_p:\bigwedge^2 T_pM \to \bigwedge^2 T_pM$.
[/definition]
With this convention, the scalar curvature is
\begin{align*}
R(p)=2(\nu(p)+\lambda(p)+\mu(p)).
\end{align*}
The least eigenvalue $\nu$ measures the most negative sectional curvature at the point. The scalar curvature combines all three eigenvalues, so the central pinching question is whether $\nu$ can remain large and negative when $R$ becomes very large.
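As a quick sanity check of this convention, the identity can be evaluated on three model geometries. The values below are the standard sectional curvatures of these spaces; the radius $\rho>0$ is a symbol introduced only for this sketch.
\begin{align*}
S^3(\rho):&\quad \nu=\lambda=\mu=\rho^{-2}, & R&=2\cdot 3\rho^{-2}=6\rho^{-2},\\
S^2(\rho)\times\mathbb R:&\quad \nu=\lambda=0,\ \mu=\rho^{-2}, & R&=2\rho^{-2},\\
\mathbb H^3:&\quad \nu=\lambda=\mu=-1, & R&=-6.
\end{align*}
On the cylinder the ratio $\nu/R$ vanishes, while on hyperbolic space it equals $1/6$ with $R<0$. The pinching question concerns only the regime in which $R$ is large and positive.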
[remark: Hamilton-Ivey Pinching as a Background Input]
The Hamilton-Ivey estimate is used here as a structural background input for three-dimensional singularity analysis. In the form needed below, for a complete three-dimensional Ricci flow with bounded curvature on compact time intervals and an initial lower bound $\nu(\cdot,0)\ge -1$ for the least curvature-operator eigenvalue, there are a constant $s_0>0$ and a nonnegative pinching function $\Phi$, depending only on the dimension and satisfying $\Phi(s)\to0$ as $s\to\infty$, such that, whenever $R(x,t)\ge s_0$,
\begin{align*}
\nu(x,t)\ge -R(x,t)\Phi(R(x,t)).
\end{align*}
Thus high-scalar-curvature blow-up regimes cannot retain an uncontrolled negative curvature direction.
[/remark]
This pinching input is the first place where dimension three enters in a decisive way. It does not say that the original solution has nonnegative curvature, but it says that the regions relevant to singularity formation become asymptotically nonnegative after parabolic rescaling.
The hypotheses rule out several genuine failures. The initial lower bound on $\nu$ fixes the scale of the estimate; without any lower curvature-operator bound at time $0$, one could insert regions with arbitrarily negative sectional curvature and no uniform pinching function could be forced from the flow alone. Bounded curvature on compact time intervals is needed for the maximum-principle argument and for the blow-up procedure to have controlled local geometry. Completeness prevents boundary or missing-end effects from producing artificial minima of the pinching quantity; on incomplete manifolds, local barriers can fail because the maximum principle has no global domain on which to operate.
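To see what kind of function $\Phi$ can arise, here is a sketch under an explicit assumption. Suppose the estimate is available in the logarithmic form below at every point with $\nu<0$. The constant $3$ and the omission of time-dependent terms are assumptions of this sketch, since the exact constants depend on how the eigenvalues are normalised.
\begin{align*}
R\ \geq\ |\nu|\bigl(\log|\nu|-3\bigr).
\end{align*}
Fix a point with $R>e^{6}$ and $\nu<0$. If $|\nu|<R^{1/2}$, then trivially $|\nu|<R\cdot R^{-1/2}$. If $|\nu|\geq R^{1/2}$, then $\log|\nu|\geq\tfrac12\log R$, and the assumed inequality gives
\begin{align*}
R\geq |\nu|\bigl(\tfrac12\log R-3\bigr), \qquad\text{hence}\qquad |\nu|\leq \frac{R}{\tfrac12\log R-3}.
\end{align*}
In either case $\nu\geq -R\,\Phi(R)$ with
\begin{align*}
\Phi(s)=\max\left\{s^{-1/2},\ \frac{1}{\tfrac12\log s-3}\right\}, \qquad s>e^{6},
\end{align*}
and $\Phi(s)\to0$ as $s\to\infty$. Under the stated assumption, any $s_0>e^{6}$ works. The decay is only logarithmic, which is why the estimate carries information at very high curvature and essentially none at bounded curvature.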
[example: Rescaled High-Curvature Limit at a Singular Point]
Let $(x_k,t_k)$ be a sequence with $Q_k=R_g(x_k,t_k)\to\infty$, and define
\begin{align*}
g_k(s)=Q_k g(t_k+Q_k^{-1}s), \qquad s\in[-Q_kt_k,0].
\end{align*}
If $t=t_k+Q_k^{-1}s$, then multiplying the metric by the constant $Q_k$ multiplies scalar curvature and curvature-operator eigenvalues by $Q_k^{-1}$. Thus
\begin{align*}
R_{g_k}(y,s)=Q_k^{-1}R_g(y,t).
\end{align*}
Also
\begin{align*}
\nu_{g_k}(y,s)=Q_k^{-1}\nu_g(y,t).
\end{align*}
At the base point and rescaled time $s=0$,
\begin{align*}
R_{g_k}(x_k,0)=Q_k^{-1}R_g(x_k,t_k)=Q_k^{-1}Q_k=1.
\end{align*}
Now consider a spacetime subregion of the rescaled flows on which $R_{g_k}\leq C$ and where the corresponding unrescaled scalar curvatures $R_g=Q_kR_{g_k}$ tend to infinity. By the *Hamilton-Ivey Pinching Estimate*, for all points with $R_g(y,t)\geq s_0$,
\begin{align*}
\nu_g(y,t)\geq -R_g(y,t)\Phi(R_g(y,t)).
\end{align*}
Multiplying this inequality by $Q_k^{-1}$ gives
\begin{align*}
Q_k^{-1}\nu_g(y,t)\geq -Q_k^{-1}R_g(y,t)\Phi(R_g(y,t)).
\end{align*}
Using $\nu_{g_k}(y,s)=Q_k^{-1}\nu_g(y,t)$ and $R_g(y,t)=Q_kR_{g_k}(y,s)$, this becomes
\begin{align*}
\nu_{g_k}(y,s)\geq -R_{g_k}(y,s)\Phi(Q_kR_{g_k}(y,s)).
\end{align*}
Since $R_{g_k}(y,s)\leq C$, we further get
\begin{align*}
\nu_{g_k}(y,s)\geq -C\Phi(Q_kR_{g_k}(y,s)).
\end{align*}
On the chosen subregion $Q_kR_{g_k}(y,s)\to\infty$, and $\Phi(r)\to 0$ as $r\to\infty$, so the lower bound tends to $0$.
Therefore, whenever the rescaled flows $g_k$ converge smoothly along a subsequence to a pointed limit, the least curvature-operator eigenvalue satisfies $\nu_\infty\geq 0$ at every limit point arising from such a subregion. Since the curvature-operator eigenvalues are ordered as $\nu_\infty\leq\lambda_\infty\leq\mu_\infty$, the limiting curvature operator is nonnegative there. The rescaling fixes the base scalar curvature at $1$, while Hamilton-Ivey pinching removes negative sectional curvature in the high-curvature limit once the compactness hypotheses needed for smooth convergence are available.
[/example]
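The convergence of the lower bound in this example can be made uniform under two extra assumptions, both introduced only for this sketch. First, the subregion satisfies $c\leq R_{g_k}\leq C$ for some fixed $c>0$. Second, $\Phi$ is nonincreasing for large arguments, which can be arranged by replacing $\Phi(s)$ with $\sup_{r\geq s}\Phi(r)$, a function that still tends to $0$. Then, as soon as $cQ_k\geq s_0$,
\begin{align*}
\nu_{g_k}(y,s)\ \geq\ -C\,\Phi(Q_kR_{g_k}(y,s))\ \geq\ -C\,\Phi(cQ_k).
\end{align*}
The right-hand side depends on $k$ only, so the negative part of the curvature operator tends to zero uniformly on the subregion. For instance, with the illustrative choice $\Phi(s)=1/\log s$, the bound reads $\nu_{g_k}\geq -C/\log(cQ_k)$.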
The example shows how pinching interacts with blow-up analysis. To turn it into a compactness theorem, one also needs local derivative estimates and a lower bound on volume at the curvature scale; those ingredients are supplied by the noncollapsing theory from earlier chapters.
[remark: Why Pinching Is Weaker Than Positivity]
Hamilton-Ivey pinching permits negative sectional curvature at bounded scale and at early times. Its force is asymptotic: the ratio between the negative part of the curvature operator and the scalar curvature goes to zero in the high-curvature regime. This is exactly the regime seen by singularity models.
[/remark]
The remaining sections translate this analytic control into geometry. Once blow-up limits have nonnegative curvature operator and are noncollapsed, three-dimensional geometry restricts their local shapes strongly.
## Necks, Caps, and Canonical Neighborhoods
The next problem is to give names to the geometric pieces that appear near high-curvature points. A singularity model may look cylindrical along one direction, close off like a cap, or be compact and positively curved. The canonical neighbourhood theorem asserts that, after rescaling by the local curvature, these are essentially the only possibilities.
[definition: Epsilon Neck]
Let $(M^3,g)$ be a Riemannian manifold, let $x_0 \in M$ satisfy $R(x_0)>0$, let $\varepsilon>0$, and set $k=\lceil \varepsilon^{-1}\rceil$. An open set $U \subset M$ is an $\varepsilon$-neck centred at $x_0$ if there is a diffeomorphism
\begin{align*}
\Phi:S^2 \times (-\varepsilon^{-1},\varepsilon^{-1}) \to U
\end{align*}
with $\Phi(\theta_0,0)=x_0$ for some $\theta_0\in S^2$ such that, for $Q=R(x_0)$, the rescaled pulled-back metric $Q\,\Phi^*g$ is within $\varepsilon$ in the $C^k$ norm on $S^2 \times (-\varepsilon^{-1},\varepsilon^{-1})$ of
\begin{align*}
g_{\mathrm{cyl}}=2g_{S^2(1)}+dz^2.
\end{align*}
[/definition]
An $\varepsilon$-neck models the part of a solution that is locally cylindrical. The long interval in the definition means that the centre is far from the ends of the parametrised cylinder, so local estimates near the centre do not depend on boundary effects.
For this normalization, $g_{\mathrm{cyl}}$ has scalar curvature $1$, curvature-operator eigenvalues $0,0,1/2$, and the two zero eigenvalues come from the mixed planes containing the axial direction.
[illustration:epsilon-neck-rescaling]
[example: Cylindrical Neck in a Shrinking Product Solution]
Write the product flow as $g(t)=a(t)g_{S^2(1)}+dz^2$, with $a(t)>0$ before extinction. Since $\operatorname{Ric}_{a(t)g_{S^2(1)}}=g_{S^2(1)}$ and the $\mathbb R$ factor is flat, the Ricci flow equation gives
\begin{align*}
a'(t)g_{S^2(1)}=-2g_{S^2(1)}.
\end{align*}
Thus $a'(t)=-2$, while the line factor remains $dz^2$. The round sphere with metric $a(t)g_{S^2(1)}$ has sectional curvature $a(t)^{-1}$, so the product scalar curvature is
\begin{align*}
R(t)=2a(t)^{-1}.
\end{align*}
Fix a point $(\theta_0,z_0)$ and set $Q=R(t)=2/a(t)$. On a centred product region define
\begin{align*}
\Phi(\theta,\zeta)=\left(\theta,z_0+\sqrt{\frac{a(t)}{2}}\,\zeta\right).
\end{align*}
Then the spherical part pulls back to $a(t)g_{S^2(1)}$, and the line part satisfies
\begin{align*}
\Phi^*dz^2=\left(\sqrt{\frac{a(t)}{2}}\right)^2d\zeta^2=\frac{a(t)}{2}d\zeta^2.
\end{align*}
Therefore
\begin{align*}
\Phi^*g(t)=a(t)g_{S^2(1)}+\frac{a(t)}{2}d\zeta^2.
\end{align*}
Multiplying by $Q=2/a(t)$ gives
\begin{align*}
Q\,\Phi^*g(t)=2g_{S^2(1)}+d\zeta^2.
\end{align*}
This is exactly $g_{\mathrm{cyl}}$, the standard cylinder with scalar curvature $1$. Hence, whenever the product interval contains $S^2\times(-\varepsilon^{-1},\varepsilon^{-1})$ around the chosen centre, it is an $\varepsilon$-neck with zero $C^k$ error for every $k$.
[/example]
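The same product solution gives two explicit scale relations, recorded here as a sketch. The symbols $a_0=a(0)$ and $T=a_0/2$ are introduced only for this computation. Integrating $a'(t)=-2$ gives
\begin{align*}
a(t)=a_0-2t=2(T-t), \qquad R(t)=\frac{2}{a(t)}=\frac{1}{T-t},
\end{align*}
so the scalar curvature blows up like the inverse of the time remaining before extinction. Moreover, since $\sqrt{a(t)/2}=R(t)^{-1/2}$, the map $\Phi$ of the example sends $\zeta\in(-\varepsilon^{-1},\varepsilon^{-1})$ to the axial interval
\begin{align*}
|z-z_0|<\varepsilon^{-1}R(t)^{-1/2}.
\end{align*}
In the unrescaled metric, an $\varepsilon$-neck therefore extends a distance $\varepsilon^{-1}$ times the curvature scale $R^{-1/2}$ to each side of its centre.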
Necks describe the middle of a developing tube, but surgery also needs to understand where such tubes terminate. A cap is a positively curved end that closes off one side of a neck while remaining topologically simple.
[definition: Epsilon Cap]
Let $(M^3,g)$ be a Riemannian manifold and let $\varepsilon>0$. An open set $U \subset M$ is an $\varepsilon$-cap if $U$ is diffeomorphic to either $B^3$ or $\mathbb{R}P^3 \setminus \overline{B}^3$, and outside a compact core it contains an $\varepsilon$-neck.
[/definition]
The definition separates topology from geometry: the core is where the end closes, while the exterior neck gives a controlled cylindrical interface. This is the geometry one expects when a positively curved region is attached to a long thin tube.
[example: Positively Curved Cap Attached to a Cylindrical Neck]
Let $g$ be a rotationally symmetric metric on $\mathbb R^3$ of the form
\begin{align*}
g=dr^2+f(r)^2g_{S^2(1)}.
\end{align*}
Smoothness at the origin is encoded by $f(0)=0$ and $f'(0)=1$. Suppose that near the origin, for $r>0$, the warping function satisfies $f(r)>0$, $f''(r)<0$, and $|f'(r)|<1$. For a warped product metric $dr^2+f(r)^2g_{S^2(1)}$, the sectional curvature of a radial plane is
\begin{align*}
K_{\mathrm{rad}}(r)=-\frac{f''(r)}{f(r)}.
\end{align*}
Since $f''(r)<0$ and $f(r)>0$, this gives $K_{\mathrm{rad}}(r)>0$. The sectional curvature of a plane tangent to the spherical factor is
\begin{align*}
K_{\mathrm{tan}}(r)=\frac{1-(f'(r))^2}{f(r)^2}.
\end{align*}
Since $|f'(r)|<1$, we have $(f'(r))^2<1$, hence $1-(f'(r))^2>0$, and therefore $K_{\mathrm{tan}}(r)>0$. Thus the central ball has strictly positive sectional curvature and supplies the cap core.
Now suppose that on a long annular region the metric is, after writing an axial coordinate $z$, $C^k$-close after scaling to the round product cylinder
\begin{align*}
\rho^2 g_{S^2(1)}+dz^2.
\end{align*}
For the exact cylinder, the sphere factor has scalar curvature $2/\rho^2$ and the line factor is flat, so the product scalar curvature is
\begin{align*}
R=\frac{2}{\rho^2}.
\end{align*}
Set $Q=2/\rho^2$ and define
\begin{align*}
\Phi(\theta,\zeta)=\left(\theta,z_0+\frac{\rho}{\sqrt 2}\zeta\right).
\end{align*}
The spherical part pulls back to $\rho^2 g_{S^2(1)}$, while the axial part pulls back as
\begin{align*}
\Phi^*dz^2=\left(\frac{\rho}{\sqrt 2}\right)^2d\zeta^2=\frac{\rho^2}{2}d\zeta^2.
\end{align*}
Hence
\begin{align*}
\Phi^*(\rho^2g_{S^2(1)}+dz^2)=\rho^2g_{S^2(1)}+\frac{\rho^2}{2}d\zeta^2.
\end{align*}
Multiplying by $Q$ gives
\begin{align*}
Q\,\Phi^*(\rho^2g_{S^2(1)}+dz^2)=\frac{2}{\rho^2}\rho^2g_{S^2(1)}+\frac{2}{\rho^2}\frac{\rho^2}{2}d\zeta^2.
\end{align*}
The two coefficients simplify to $2$ and $1$, so
\begin{align*}
Q\,\Phi^*(\rho^2g_{S^2(1)}+dz^2)=2g_{S^2(1)}+d\zeta^2=g_{\mathrm{cyl}}.
\end{align*}
If the annular metric is within $\varepsilon$ of this exact cylinder in the required $C^k$ norm after the same rescaling, then the annulus is an $\varepsilon$-neck. The union of the positively curved central ball with this cylindrical annulus is therefore an $\varepsilon$-cap: the core closes off like a ball, while the exterior gives the controlled neck used in three-dimensional surgery.
[/example]
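For a concrete profile satisfying the hypotheses of this example, one may take the warping function below. It is an illustrative choice made for this sketch, not a model used elsewhere in the text, and the resulting metric is only asymptotically cylindrical rather than exactly cylindrical outside a compact set.
\begin{align*}
f(r)=\rho\tanh\frac{r}{\rho},\qquad f'(r)=\operatorname{sech}^2\frac{r}{\rho},\qquad f''(r)=-\frac{2}{\rho}\operatorname{sech}^2\frac{r}{\rho}\,\tanh\frac{r}{\rho}.
\end{align*}
Then $f(0)=0$, $f'(0)=1$, and for $r>0$ one has $f>0$, $f''<0$, and $0<f'<1$. The curvature formulas give
\begin{align*}
K_{\mathrm{rad}}(r)&=\frac{2}{\rho^2}\operatorname{sech}^2\frac{r}{\rho},\\
K_{\mathrm{tan}}(r)&=\frac{1-\operatorname{sech}^4(r/\rho)}{\rho^2\tanh^2(r/\rho)}=\frac{1+\operatorname{sech}^2(r/\rho)}{\rho^2}.
\end{align*}
Both are positive for every $r$ and both equal $2/\rho^2$ at the origin. As $r\to\infty$, $K_{\mathrm{rad}}\to0$ and $K_{\mathrm{tan}}\to\rho^{-2}$, which are the sectional curvatures of the cylinder $\rho^2g_{S^2(1)}+dz^2$. Since $f(r)\to\rho$ exponentially fast, regions sufficiently far from the origin are close to that cylinder, and the rescaling by $Q=2/\rho^2$ applies there up to a small error.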
Necks and caps are local pieces, but the surgery argument needs a single criterion that says every sufficiently curved point lies in a controlled piece. This motivates the canonical neighbourhood package, where the local model, curvature scale, derivative bounds, and volume information are treated together.
[definition: Canonical Neighborhood]
Fix $\varepsilon>0$, $A>0$, and an integer $k\geq 10$. Let $(M^3,g(t))$ be a Ricci flow and let $(x,t)$ be a spacetime point with $Q=R(x,t)>0$. A spacetime neighbourhood $\mathcal U$ of $(x,t)$ is an $(\varepsilon,A,k)$-canonical neighbourhood at scale $Q^{-1/2}$ if, after the parabolic rescaling
\begin{align*}
\tilde g(s)=Q\,g(t+Q^{-1}s), \qquad s\in[-A,0],
\end{align*}
the pointed flow $(\mathcal U,x,\tilde g(s))$ is, in pointed $C^k$ topology on the spatial ball of radius $A$ and time interval $[-A,0]$, within $\varepsilon$ of one of the following pointed model flows: a strong $\varepsilon$-neck modelled on the round shrinking cylinder of scalar curvature $1$ at $s=0$, an $\varepsilon$-cap with a cylindrical end, a compact positively curved component, or a three-dimensional ancient $\kappa$-solution with nonnegative curvature operator.
[/definition]
The comparison is now a statement about maps and topology, not only about resemblance. A model flow supplies the domain, the parabolic rescaling supplies the curvature scale, and the pointed $C^k$ distance records the controlled derivatives needed later in surgery.
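The parabolic rescaling in the definition is chosen so that $\tilde g$ is again a Ricci flow. As a short check, using only that the Ricci tensor is unchanged when the metric is multiplied by a positive constant,
\begin{align*}
\partial_s\tilde g(s)=Q\cdot Q^{-1}(\partial_tg)(t+Q^{-1}s)=-2\operatorname{Ric}\bigl(g(t+Q^{-1}s)\bigr)=-2\operatorname{Ric}\bigl(\tilde g(s)\bigr).
\end{align*}
Scaling the metric by $Q$ without the factor $Q^{-1}$ in time would instead give $\partial_s\tilde g=-2Q\operatorname{Ric}(\tilde g)$, which is not a Ricci flow. With the parabolic choice, $R_{\tilde g}(x,0)=Q^{-1}R_g(x,t)=1$, and a $g(t)$-distance of $Q^{-1/2}$ corresponds to a $\tilde g(0)$-distance of $1$, which is why $Q^{-1/2}$ is called the scale of the neighbourhood.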
[quotetheorem:6023]
[citeproof:6023]
The theorem is a local compactness statement disguised as a geometric classification. It says that once the curvature scale is small compared with the global scale fixed by the initial data, every high-curvature point has a standard model.
Each hypothesis has a compactness role. Without noncollapsing, a sequence of bounded-curvature product regions such as $S^2 \times S^1$ with the $S^1$ factor shrinking can converge to a two-dimensional limit, so a three-dimensional neck or cap model is lost. Without bounded curvature on compact time intervals, a blow-up sequence can have no smooth subsequential limit on any uniform parabolic neighbourhood. The high-curvature assumption is also essential: at ordinary curvature scales the geometry is essentially unconstrained. A closed three-manifold can carry features inherited from the initial metric, such as a handle or a low-curvature lumpy region, and there is no reason for a point in such a region to lie near a cylinder, cap, or ancient model.
[remark: Role in Surgery]
Canonical neighbourhoods tell us where surgery may be performed: necks are the regions that can be cut, and caps are the regions that can be inserted or retained. Without this local classification, cutting the manifold would have no controlled geometric meaning.
[/remark]
[illustration:surgery-curvature-scale]
## Kappa-Noncollapsed High-Curvature Regions
The canonical neighbourhood theorem relies on a volume condition that prevents a bounded-curvature region from degenerating into a collapsed space. The question is whether a parabolic rescaling near a singularity still retains enough volume for Hamilton compactness to apply. Perelman's answer is the $\kappa$-noncollapsing theorem, which supplies a uniform lower bound at curvature-controlled scales.
[definition: Kappa-Noncollapsed at a Scale]
Let $(M^n,g)$ be a Riemannian manifold, let $\rho>0$, and let $\kappa>0$. The metric $g$ is $\kappa$-noncollapsed at scale $\rho$ if every geodesic ball $B(x,r)$ with $0<r<\rho$ and
\begin{align*}
|\operatorname{Rm}| \leq r^{-2} \quad \text{on } B(x,r)
\end{align*}
satisfies
\begin{align*}
\operatorname{Vol}_g(B(x,r)) \geq \kappa r^n.
\end{align*}
[/definition]
Noncollapsing is scale invariant, so it survives the blow-ups used to study singularities. The obstruction is that curvature bounds alone do not prevent a high-curvature limit from losing volume and collapsing to a lower-dimensional space. Perelman's monotonicity formulae supply the missing volume lower bound on finite time intervals, so noncollapsing becomes an output of the flow rather than an additional hypothesis.
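The scale invariance can be checked directly from the definition. In the following sketch, $\tilde g=Qg$ for a constant $Q>0$ and $\tilde r=Q^{1/2}r$; these symbols are introduced only for this computation. Distances scale by $Q^{1/2}$, volumes by $Q^{n/2}$, and $|\operatorname{Rm}|$ by $Q^{-1}$, so
\begin{align*}
B_{\tilde g}(x,\tilde r)=B_g(x,r),\qquad |\operatorname{Rm}_{\tilde g}|\,\tilde r^{2}=|\operatorname{Rm}_g|\,r^{2},\qquad \frac{\operatorname{Vol}_{\tilde g}(B_{\tilde g}(x,\tilde r))}{\tilde r^{\,n}}=\frac{Q^{n/2}\operatorname{Vol}_g(B_g(x,r))}{Q^{n/2}r^{n}}=\frac{\operatorname{Vol}_g(B_g(x,r))}{r^{n}}.
\end{align*}
Hence $g$ is $\kappa$-noncollapsed at scale $\rho$ exactly when $\tilde g$ is $\kappa$-noncollapsed at scale $Q^{1/2}\rho$, with the same $\kappa$. In a blow-up sequence with $Q=Q_k\to\infty$, the admissible scale $Q_k^{1/2}\rho$ tends to infinity.

For contrast, consider the flat product $S^1_\ell\times\mathbb R^2$ with a circle of length $\ell$, a hypothetical example not taken from the text. The curvature condition holds for every $r$, while the inclusion $B(x,r)\subset S^1_\ell\times B^2(r)$ gives
\begin{align*}
\frac{\operatorname{Vol}(B(x,r))}{r^{3}}\leq\frac{\pi\ell r^{2}}{r^{3}}=\frac{\pi\ell}{r},
\end{align*}
which tends to $0$ as $r\to\infty$. For any fixed $\kappa>0$, this metric fails to be $\kappa$-noncollapsed at sufficiently large scales.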
[quotetheorem:6024]
[citeproof:6024]
The finite-time and compactness assumptions matter. On infinite time intervals, rescaled long-time solutions can have volume ratios tending to zero along a sequence of times, so no single $\kappa$ need work forever without additional hypotheses. Compactness, or corresponding complete bounded-geometry assumptions, supplies the lower entropy bound from which the volume estimate is derived; on a complete noncompact initial manifold with uncontrolled geometry at infinity, collapsing can already be present far out at time $0$. The curvature-controlled scale condition is also necessary: a ball around a high-curvature spike may have small volume because $|\operatorname{Rm}|$ is much larger than $r^{-2}$ inside the ball, and the theorem does not classify that as collapse.
Together with pinching, noncollapsing provides the two compactness inputs for high-curvature limits: curvature operators become nonnegative, and unit balls retain definite volume. The next principle explains how cylindrical geometry is detected from those inputs.
[remark: Neck Detection Principle]
In the canonical-neighbourhood regime, neck detection is the practical rule that turns the classification alternatives into a cuttable local model. At a sufficiently high-curvature point, if the point is not in the core of an $\varepsilon$-cap and not in a compact positively curved component, the canonical-neighbourhood alternatives force a neighbourhood of the point to be an $\varepsilon$-neck after rescaling.
[/remark]
Neck detection lets one recognise cuttable regions from local curvature information rather than from a global parametrisation.
The exclusions in the statement are necessary. Near the core of a cap, the curvature may be close to cylindrical on one side while the other side closes off, so a centred two-sided neck need not exist. In a compact positively curved component, the geometry may resemble a round space form rather than a long cylinder, even when the curvature is large and well controlled. What distinguishes necks from caps and compact round regions is a nearly cylindrical eigenvalue pattern: for the round cylinder at scalar curvature $1$, two curvature-operator eigenvalues are near $0$ and the remaining eigenvalue is near $1/2$.
[example: Detecting a Neck Before Surgery]
Let $Q=R(x,t)$ and rescale the metric at time $t$ by $\tilde g=Qg(t)$, so the rescaled scalar curvature at $x$ is
\begin{align*}
R_{\tilde g}(x)=Q^{-1}R_g(x,t)=Q^{-1}Q=1.
\end{align*}
Curvature-operator eigenvalues scale in the same way, so
\begin{align*}
\tilde \nu=Q^{-1}\nu,\qquad \tilde \lambda=Q^{-1}\lambda,\qquad \tilde \mu=Q^{-1}\mu.
\end{align*}
Assume that on a ball of radius comparable to $Q^{-1/2}$ the scalar curvature is bounded by a fixed multiple of $Q$, and that at the rescaled base point
\begin{align*}
|\tilde \nu|\leq \delta,\qquad |\tilde \lambda|\leq \delta,\qquad |\tilde \mu-\tfrac12|\leq \delta.
\end{align*}
The scalar-curvature identity in dimension three gives
\begin{align*}
R_{\tilde g}(x)
&=2(\tilde \nu+\tilde \lambda+\tilde \mu).
\end{align*}
For the model cylinder of scalar curvature $1$, the curvature-operator eigenvalues are $0,0,\tfrac12$, and therefore
\begin{align*}
2\left(0+0+\frac12\right)=1.
\end{align*}
Thus the assumed eigenvalue pattern is precisely the statement that the curvature operator at the base point is $\delta$-close to the scalar-curvature-one round cylinder pattern.
If $x$ is not contained in the core of an $\varepsilon$-cap and is not in a compact positively curved component, the *[Neck Detection Principle](/theorems/6025)* applies to this high-curvature point. It follows that a neighbourhood of $x$ at time $t$ is an $\varepsilon$-neck after rescaling by $Q=R(x,t)$. The positive eigenvalue $\tilde \mu\approx \tfrac12$ is the sectional curvature of the spherical cross-sections, while the two eigenvalues $\tilde \nu\approx 0$ and $\tilde \lambda\approx 0$ are the mixed sectional curvatures of planes containing the axial direction.
[/example]
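The three eigenvalue conditions in this example are not independent once the normalisation $R_{\tilde g}(x)=1$ is imposed. As a short check,
\begin{align*}
2(\tilde\nu+\tilde\lambda+\tilde\mu)=1 \quad\Longrightarrow\quad \tilde\mu-\tfrac12=-(\tilde\nu+\tilde\lambda) \quad\Longrightarrow\quad \bigl|\tilde\mu-\tfrac12\bigr|\leq|\tilde\nu|+|\tilde\lambda|\leq2\delta.
\end{align*}
At the base point, smallness of the two lowest eigenvalues therefore already forces $\tilde\mu$ to be within $2\delta$ of $\tfrac12$. Because $\tilde\nu\leq\tilde\lambda$, it is enough to verify the two one-sided bounds $\tilde\nu\geq-\delta$ and $\tilde\lambda\leq\delta$: together with the ordering they give $|\tilde\nu|\leq\delta$ and $|\tilde\lambda|\leq\delta$.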
The chapter therefore has a precise progression. Hamilton-Ivey pinching removes uncontrolled negative curvature from the singularity scale; noncollapsing prevents volume degeneration; canonical neighbourhoods and neck detection convert these analytic estimates into a geometric list. This list is the local language used in the construction and analysis of Ricci flow with surgery.
Canonical neighborhoods give a precise local picture of the singularity scale, but singularity formation still interrupts the flow unless those regions can be excised and replaced. Surgery and long-time continuation provide that mechanism, allowing the flow to proceed past neck-like singularities while preserving the key estimates.
# 11. Surgery and Long-Time Continuation
The course has developed Ricci flow up to the point where singularity formation is unavoidable in dimension three. Chapters 6--10 supplied the local analytic tools: compactness and blow-up limits, noncollapsing, Hamilton-Ivey pinching, and canonical neighbourhoods modelled on ancient high-curvature geometries. This chapter explains how those ingredients are assembled into Perelman's surgery procedure, which cuts controlled neckpinches, caps the remaining ends, and restarts the flow without losing the estimates needed for continuation.
Ricci flow in dimension three cannot usually be continued smoothly through every singular time: neckpinches form, curvature blows up, and regions that are topologically simple close off into high-curvature caps. Surgery replaces the parts where curvature has concentrated by standard caps, then continues the evolution with controlled geometry. The chapter follows the logic of the construction from the local standard solution, through curvature and canonical-neighbourhood control at surgery times, to the thick-thin picture that links long-time Ricci flow with three-manifold topology.
## Standard Caps and Surgery Along Strong Necks
The first problem is local: near a developing singularity, how can one remove the part that is about to pinch while inserting a new end whose curvature and derivatives match the scale of the surrounding flow? The answer uses the standard solution, a complete rotationally symmetric model that behaves like a cap attached to a round cylinder. It supplies the replacement geometry used at each surgery scale.
[definition: Strong Delta Neck]
Let $(M,g(t))$ be a Ricci flow on a time interval containing $t_0$, let $x_0 \in M$, let $h>0$, and let $\delta>0$. A parabolic neighbourhood of $(x_0,t_0)$ is a strong $\delta$-neck of radius $h$ if, after scaling the metric by $h^{-2}$ and shifting time so that $t_0=0$, it is $\delta$-close in high $C^k$ topology on a fixed parabolic cylinder to the corresponding subset of the shrinking round cylinder $S^2\times \mathbb R$.
[/definition]
The word strong records that the comparison is parabolic, not only spatial. This is important because the surgery is made at one time slice, but its admissibility is justified by how the flow has behaved in a spacetime neighbourhood before that time.
[illustration:strong-neck-parabolic-slices]
[example: Dumbbell Neckpinch]
Consider a rotationally symmetric metric on $S^3$ made from two almost round lobes joined by a long neck whose thinnest cross-section has radius $h$. The local model for the high-curvature part is a round cylinder with metric
\begin{align*}
g_{\operatorname{cyl}}(t)=\rho(t)^2 g_{S^2(1)}+ds^2.
\end{align*}
For the round sphere factor of radius $\rho(t)$, the Ricci tensor is $g_{S^2(1)}$ in the $S^2$ directions and $0$ in the axial direction. Since Ricci flow satisfies $\partial_t g=-2\operatorname{Ric}$, comparing the $S^2$ terms gives
\begin{align*}
\partial_t\bigl(\rho(t)^2 g_{S^2(1)}\bigr)=-2g_{S^2(1)}.
\end{align*}
Equivalently,
\begin{align*}
(\rho(t)^2)'g_{S^2(1)}=-2g_{S^2(1)}.
\end{align*}
Cancelling the fixed nonzero tensor $g_{S^2(1)}$ gives
\begin{align*}
(\rho(t)^2)'=-2.
\end{align*}
Integrating from $0$ to $t$ gives
\begin{align*}
\rho(t)^2=\rho(0)^2-2t.
\end{align*}
The scalar curvature of $S^2(\rho(t))\times \mathbb R$ is therefore
\begin{align*}
R_{\operatorname{cyl}}(t)=\frac{2}{\rho(t)^2}.
\end{align*}
Substituting the radius formula gives
\begin{align*}
R_{\operatorname{cyl}}(t)=\frac{2}{\rho(0)^2-2t}.
\end{align*}
Thus the cylindrical neck curvature becomes large as the radius shrinks.
At a time when the thinnest radius is $h$, rescaling the metric by $h^{-2}$ and setting $\sigma=s/h$ gives
\begin{align*}
h^{-2}\bigl(h^2 g_{S^2(1)}+ds^2\bigr)=g_{S^2(1)}+h^{-2}ds^2.
\end{align*}
Since $d\sigma=h^{-1}ds$, this becomes
\begin{align*}
h^{-2}\bigl(h^2 g_{S^2(1)}+ds^2\bigr)=g_{S^2(1)}+d\sigma^2.
\end{align*}
So the rescaled neck is modelled by the unit round cylinder $S^2\times \mathbb R$.
At surgery one chooses two nearly round cross-sections $S^2\times\{\sigma_-\}$ and $S^2\times\{\sigma_+\}$ in this rescaled cylindrical region, removes the middle segment, and attaches scaled standard caps to the remaining boundary spheres. If the unit cap metric $\bar g$ has $|\operatorname{Rm}_{\bar g}|\le C_0$, then the cap inserted at radius $h$ has metric $h^2\bar g$, and curvature scales by
\begin{align*}
|\operatorname{Rm}_{h^2\bar g}|=h^{-2}|\operatorname{Rm}_{\bar g}|.
\end{align*}
Using the unit-scale bound gives
\begin{align*}
|\operatorname{Rm}_{h^2\bar g}|\le C_0h^{-2}.
\end{align*}
The post-surgery pieces therefore have curvature controlled at the surgery scale, while the two lobes continue as separate components capped by standard ends.
[/example]
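The unit-radius normalisation used in this example differs by a constant factor from the scalar-curvature-one model $2g_{S^2(1)}+dz^2$ of the $\varepsilon$-neck definition. As a sketch of the conversion, with $\zeta=\sqrt2\,\sigma$ introduced only here: the exact cylinder of radius $h$ has scalar curvature $2/h^2$, so rescaling by $Q=2h^{-2}$ gives
\begin{align*}
Q\bigl(h^2g_{S^2(1)}+ds^2\bigr)=2\bigl(g_{S^2(1)}+d\sigma^2\bigr)=2g_{S^2(1)}+d\zeta^2.
\end{align*}
The $h^{-2}$-rescaled neck thus has scalar curvature $2$ rather than $1$, and axial lengths measured in $\sigma$ are shorter than those measured in $\zeta$ by a factor $\sqrt2$. Since $\rho(t)^2$ decreases at rate $2$, the exact cylinder of radius $h$ has remaining lifetime $h^2/2$, which is $1/2$ in the time variable rescaled by $h^{-2}$.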
The neck model by itself only says where to cut. To fill in the cut boundary, we need a canonical cap whose curvature is positive, whose asymptotic end is cylindrical, and whose future flow is available for comparison.
[definition: Standard Solution]
A standard solution is a complete Ricci flow $(M_{\mathrm{st}},g_{\mathrm{st}}(t))$, $0\le t<1$, on a manifold diffeomorphic to $\mathbb R^3$, such that $g_{\mathrm{st}}(0)$ is rotationally symmetric, has nonnegative curvature operator, has bounded curvature, and outside a compact set is isometric to a round half-cylinder $S^2\times [0,\infty)$ of fixed scalar curvature.
[/definition]
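The time interval $[0,1)$ is consistent with the cylindrical end if the half-cylinder is normalised to scalar curvature $1$. That normalisation is an assumption of this sketch, since the definition only says that the scalar curvature is fixed. For the exact round cylinder $a(t)g_{S^2(1)}+dz^2$ with $R(0)=2/a(0)=1$, one has $a(0)=2$ and $a'(t)=-2$, so
\begin{align*}
a(t)=2-2t,\qquad R(t)=\frac{2}{a(t)}=\frac{1}{1-t},\qquad 0\leq t<1.
\end{align*}
The model cylinder becomes extinct at $t=1$, so a solution asymptotic to it cannot be expected to keep bounded curvature up to or beyond that time.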
The initial cap is not an arbitrary smoothing of a cylinder; the construction needs a model whose future behaviour is controlled after it has been inserted. The next theorem supplies this model and gives the derivative estimates needed when the restarted flow is compared with the cap.
[quotetheorem:6026]
[citeproof:6026]
The completeness and bounded-curvature hypotheses are what let the standard solution serve as initial data for comparison after a cap has been inserted; a merely formal capped cylinder with unbounded curvature would not give uniform local existence or Shi estimates. Nonnegativity of the curvature operator is also essential, because surgery must not introduce new negatively curved high-curvature regions that violate Hamilton-Ivey pinching. The cylindrical end assumption is the matching condition: without an asymptotic round half-cylinder, the cap could not be glued into a strong neck with errors controlled at the surgery scale. The theorem does not say that every cap-like region is standard; it gives one canonical model whose controlled future is used to test the cutoff construction.
We therefore need a precise name for the actual operation that uses this model: cutting a strong neck at scale $h$, deciding which side survives, and inserting the scaled standard cap with a controlled transition metric.
[definition: Delta Cutoff Surgery]
Let a time slice $(M,g)$ contain a strong $\delta$-neck of radius $h$ centred at $x_0$. A $\delta$-cutoff surgery at scale $h$ removes the central part of the neck, keeps the side or sides selected by the surgery algorithm, and replaces each new boundary sphere by a scaled copy of the standard cap, using a transition metric that is $\delta$-close after scaling by $h^{-2}$ to the standard cylindrical-cap transition.
[/definition]
[illustration:neck-cutoff-surgery]
After rescaling by $h^{-2}$, the surgery construction is performed at unit scale. The important question is whether the cutoff has introduced curvature larger than the scale that the post-surgery estimates can tolerate.
[quotetheorem:6027]
[citeproof:6027]
These estimates do not assert that surgery improves every curvature quantity pointwise, and they do not control old high-curvature regions away from the cutoff. The strong-neck hypothesis is needed because the unit-scale cutoff is designed for a nearly cylindrical metric. For a concrete failure mode, suppose the cross-sectional metrics along the neck alternate over a short axial interval between nearly round spheres and noticeably elongated ellipsoidal spheres, or suppose the neck radius oscillates on a length scale much smaller than $h$. Interpolating such a region to a fixed cap profile would differentiate the oscillation and could create second-derivative terms in the metric larger than the intended unit-scale bounds; after rescaling back, those terms would appear as curvature much larger than $h^{-2}$. The smallness of $\delta$ is equally important: the fixed cap model has controlled curvature, but a large perturbation of the neck could turn the transition region into an uncontrolled smoothing rather than a perturbative construction. What the theorem gives is the precise scale compatibility needed for the restarted flow and for the canonical-neighbourhood induction to treat the new cap as a known model.
## Curvature Control Across Surgery Times
The next problem is global in time: after changing the metric discontinuously, why do the curvature hypotheses needed for the next stage of the flow survive? The surgery construction is designed around a hierarchy of scales. Canonical neighbourhoods are required wherever the scalar curvature is at least a threshold $r^{-2}$, surgery is performed on necks of much smaller radius $h\ll r$, and the restarted solution is required to satisfy the same pinching and canonical-neighbourhood assumptions as before.
[definition: Ricci Flow With Surgery]
A Ricci flow with surgery on a closed three-manifold is a sequence of smooth Ricci flows $(M_i,g_i(t))$ on time intervals $[T_i,T_{i+1})$, together with surgery identifications at the discrete times $T_i$, such that each transition from $T_i^-$ to $T_i^+$ is obtained by cutting along a finite collection of strong necks, discarding components specified by the surgery rules, attaching standard caps, and restarting the smooth Ricci flow from the resulting post-surgery metric.
[/definition]
The definition packages the procedure, but it does not yet say how the algorithm recognizes that a high-curvature region is eligible for cutting. For that we need a condition asserting that every sufficiently curved point is already close to one of the known singularity models.
[definition: Canonical Neighbourhood Assumption]
A three-dimensional Ricci flow with surgery satisfies the canonical neighbourhood assumption at scale $r>0$ if every spacetime point with scalar curvature at least $r^{-2}$ has a neighbourhood, after scaling by the curvature at the point, modelled on an $\varepsilon$-neck, an $\varepsilon$-cap, a compact positively curved component, or an ancient solution from the prescribed model class.
[/definition]
This assumption is the bridge between local singularity models and the global algorithm. The issue for continuation is whether this bridge survives the discontinuous metric change at surgery time, especially at points lying on the newly inserted caps and transition regions. The next input is needed to close the induction over surgery times.
[remark: Persistence of Canonical Neighbourhoods as a Surgery Input]
The surgery induction uses a persistence input for canonical neighbourhoods. For fixed $\varepsilon>0$, finite time interval $[0,T]$, initial data, and standard-cap model, the parameters can be chosen so that a sufficiently precise surgery along strong $\delta$-necks of radius $h\ll r$ preserves the $\varepsilon$-canonical-neighbourhood property at all points with scalar curvature at least $r^{-2}$, and the restarted smooth flow preserves that conclusion for a definite time of order $h^2$ unless a later surgery occurs.
[/remark]
This persistence input is the restart condition for the surgery induction. It says that a sufficiently precise cutoff, made inside the established parameter hierarchy, restores the same canonical-neighbourhood alternatives after the metric discontinuity at surgery time.
The hypotheses show why surgery is not an isolated geometric operation. The pinching and noncollapsing assumptions are needed in the blow-up analysis: without them, a first bad point could limit to a collapsed or negatively curved ancient geometry outside the classified model class. The small parameter hierarchy prevents the inserted cap from being visible at the wrong scale; if $h$ were comparable to $r$ or $\delta$ were not small, the transition region could fail to resemble either a cap or a neck after curvature rescaling. The input also has a finite-time character, since the constants are chosen on $[0,T]$ and are not a single uniform choice for all future time without rechecking the long-time scale choices. Its role is to close the induction: canonical neighbourhoods justify surgery, and the standard cap plus compactness restore canonical neighbourhoods after surgery.
A concrete way for the hypotheses to fail is to take locally cylindrical regions with an additional circle direction collapsed to length tending to zero, as in quotients modelled on $S^2\times S^1_\ell$ with $\ell\to 0$ after normalization. Curvature bounds alone would not produce a noncollapsed three-dimensional ancient limit. Likewise, a cutoff performed on a neck whose cross-sections are not $C^k$-close to round spheres could make the transition region visible after curvature rescaling as neither a cap nor a neck. These examples explain why noncollapsing, strong-neck quality, and the hierarchy $h\ll r$ are not interchangeable technical assumptions.
[remark: Parameter Hierarchy]
The usual notation separates the canonical scale $r$, the neck quality $\delta$, and the surgery radius $h$. The radius $h$ is chosen much smaller than $r$, while $\delta$ is chosen small enough that every cutoff error is absorbed by the canonical-neighbourhood tolerances and pinching estimates.
[/remark]
Curvature pinching is another quantity that has to survive surgery. In three dimensions the scalar curvature controls the negative part of the curvature operator at high curvature, but surgery replaces part of the metric by a cap where the old pinching estimate is no longer inherited automatically. The issue is whether the transition and cap can be chosen so that no new negative curvature direction violates the Hamilton-Ivey control at the surgery scale.
This is a separate restart condition from preserving canonical neighbourhoods: even if the post-surgery regions look like caps and necks, the curvature operator still has to satisfy the quantitative lower bound used in the later blow-up arguments. The next formal result supplies that pinching preservation, allowing the surgery flow to continue with the same curvature-control framework after each cutoff.
[quotetheorem:6029]
[citeproof:6029]
The positivity of the standard cap is the reason this theorem is available: a cap with even a small uncontrolled negative curvature direction at scale $h$ could violate the Hamilton-Ivey lower bound precisely where the scalar curvature is large. The finite-time tolerance matters because the pinching constants are not absolute under arbitrary perturbations; they are chosen so that the transition errors are absorbed on the interval under consideration. The theorem is also limited in scope: pinching controls the negative part of the curvature operator, but it does not by itself give canonical neighbourhoods, derivative bounds, or noncollapsing. Those additional estimates are needed before the restart-and-surgery procedure can be iterated without surgery times accumulating.
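To make the statement that scalar curvature controls the negative part of the curvature operator concrete, here is a short worked consequence. It is a sketch that assumes the pinching estimate is used in its standard normalized form, with $\nu(x,t)$ the smallest eigenvalue of the curvature operator and initial data scaled so that $\nu\ge -1$ at $t=0$; the symbols $\nu$ and $K$ are notation for this sketch and need not match the theorem quoted above. In that form, at every point where $\nu<0$,
\begin{align*}
R\ \ge\ |\nu|\bigl(\log|\nu|+\log(1+t)-3\bigr).
\end{align*}
Fix $K>0$ and suppose $|\nu|\ge e^{K+3}$. Since $\log(1+t)\ge 0$ for $t\ge 0$,
\begin{align*}
\log|\nu|+\log(1+t)-3\ \ge\ (K+3)+0-3=K,
\end{align*}
and therefore
\begin{align*}
R\ \ge\ K|\nu|,\qquad\text{that is,}\qquad \frac{|\nu|}{R}\ \le\ \frac1K.
\end{align*}
So negative curvature of size at least $e^{K+3}$ is at most a $1/K$ fraction of the scalar curvature. A cap with positive curvature operator has no points with $\nu<0$, so it imposes no condition in this inequality, which is the role positivity plays at the surgery scale.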
The final existence theorem uses the usual surgery hierarchy. The neck radius $h$ fixes the physical scale of the necks being cut, and $D$ is a dimensionless trigger parameter chosen after $h$; the scalar-curvature threshold $D h^{-2}$ tells the algorithm when the largest-curvature region is ready for canonical-neighbourhood detection and surgery selection. Taking $D$ large separates ordinary high curvature from the controlled cutoff scale.
We can now state the global continuation result that packages these local restart estimates into a complete surgery construction. The next theorem is not another preservation lemma; it asserts that, after choosing the hierarchy of parameters, the Ricci flow with surgery exists through the required time intervals and performs only controlled, discrete topological modifications.
[quotetheorem:6030]
[citeproof:6030]
The theorem is the technical form of long-time continuation: the solution may change topology at discrete times, but the analytic estimates needed for Ricci flow continue indefinitely unless no component remains. The dependence of the parameters on finite time intervals is important, because the surgery scale must stay much smaller than the canonical scale relevant to the interval being controlled. Noncollapsing is not a cosmetic hypothesis; without it, blow-up limits near a supposed failure of the canonical-neighbourhood assumption could collapse and evade the compactness-and-classification argument. The finite-surgeries conclusion also depends on discarding only standard components and on a quantitative drop at each surgery; an uncontrolled cutting procedure could restart the flow but still allow infinitely many tiny topological changes in bounded time.
Each part of the statement rules out a specific pathology. If the initial manifold were not closed, curvature escaping down an end would require separate hypotheses at infinity before short-time and compactness arguments could be applied. If canonical neighbourhoods were unavailable at the surgery threshold, a high-curvature region shaped like a badly distorted horn would give no canonical sphere along which to cut. If noncollapsing were imposed only at scales much smaller than the surgery radius, the blow-up sequence used to protect the next canonical-neighbourhood scale could collapse at the relevant neck scale. If the quantitative drop at surgery were omitted, a sequence of ever smaller cutoffs could accumulate before any fixed positive time.
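The last point, that a quantitative drop prevents accumulation of surgeries, can be illustrated by a volume count. This is a sketch under stated assumptions, not the argument of the theorem: assume $R_{\min}(0)<0$ and set $C=3/(2|R_{\min}(0)|)$; assume every surgery on $[0,T]$ removes volume at least $c\,h^3$ for a constant $c>0$ and a fixed lower bound $h$ on the surgery radius on that interval; and assume surgery neither increases volume nor lowers the minimum of scalar curvature. The constants $c$ and $C$ and the counting function $N(T)$ are introduced here for illustration. The maximum principle and the volume evolution give, on each smooth interval,
\begin{align*}
R_{\min}(t)\ \ge\ -\frac{3}{2(t+C)},\qquad \frac{d}{dt}\operatorname{Vol}(t)=-\int_M R\,d\mu\ \le\ \frac{3}{2(t+C)}\operatorname{Vol}(t).
\end{align*}
Hence the normalized volume
\begin{align*}
f(t)=\Bigl(\frac{t+C}{C}\Bigr)^{-3/2}\operatorname{Vol}(t)
\end{align*}
is nonincreasing on smooth intervals, and at each surgery time in $[0,T]$ it drops by at least $c\,h^3\bigl(\tfrac{T+C}{C}\bigr)^{-3/2}$. Since $f\ge 0$ and $f(0)=\operatorname{Vol}(0)$, the number $N(T)$ of surgeries on $[0,T]$ satisfies
\begin{align*}
N(T)\ \le\ \frac{\operatorname{Vol}(0)}{c\,h^3}\Bigl(\frac{T+C}{C}\Bigr)^{3/2}.
\end{align*}
Without a definite drop $c\,h^3$ no such bound is available, which is the accumulation pathology described above.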
[example: Spherical Component Removed at Surgery]
Let the component be one of the standard spherical pieces recognized by the surgery algorithm, at scale $h$. To see the scale of its remaining evolution, take the round model $S^3/\Gamma$ with metric $g(0)=h^2 g_0$, where $g_0$ has sectional curvature $1$. Write the evolving round metric as
\begin{align*}
g(t)=a(t)g_0,\qquad a(0)=h^2.
\end{align*}
For $g_0$ in dimension $3$, constant sectional curvature $1$ gives
\begin{align*}
\operatorname{Ric}_{g_0}=2g_0.
\end{align*}
Under the constant scaling $g(t)=a(t)g_0$, the Ricci tensor as a $(0,2)$-tensor is
\begin{align*}
\operatorname{Ric}_{g(t)}=2g_0.
\end{align*}
The Ricci flow equation $\partial_t g=-2\operatorname{Ric}$ therefore gives
\begin{align*}
a'(t)g_0=-2(2g_0)=-4g_0.
\end{align*}
Cancelling the fixed nonzero tensor $g_0$ gives
\begin{align*}
a'(t)=-4.
\end{align*}
Integrating from $0$ to $t$ yields
\begin{align*}
a(t)-a(0)=-4t,
\end{align*}
so
\begin{align*}
a(t)=h^2-4t.
\end{align*}
The metric shrinks to zero size when
\begin{align*}
h^2-4t=0,
\end{align*}
hence at
\begin{align*}
t=\frac{h^2}{4}.
\end{align*}
Thus, in the round model, a spherical component at surgery scale $h$ has a remaining lifetime of exactly $h^2/4$, which is of order $h^2$. The algorithm discards such a component instead of attaching caps to it: it is recorded as a standard extinct topological piece, while the surviving components continue under Ricci flow with surgery.
[/example]
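The lifetime in this example can be cross-checked by parabolic scaling. The notation $\bar g$ for the unit-normalized round solution is introduced here only for this check. If $\bar g(s)=(1-4s)g_0$ is the round solution with $\bar g(0)=g_0$, then the parabolically rescaled flow is
\begin{align*}
h^2\,\bar g(h^{-2}t)=h^2\Bigl(1-\frac{4t}{h^2}\Bigr)g_0=(h^2-4t)\,g_0,
\end{align*}
which agrees with $a(t)=h^2-4t$. Since the metric $a\,g_0$ has sectional curvature $a^{-1}$ and scalar curvature $6a^{-1}$,
\begin{align*}
R(t)=\frac{6}{h^2-4t},\qquad R(0)=6h^{-2}.
\end{align*}
The remaining lifetime is therefore
\begin{align*}
\frac{h^2}{4}=\frac{3}{2}\,R(0)^{-1},
\end{align*}
so a discarded round component lives for a time comparable to the inverse of its scalar curvature at the moment it is recognized.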
## Thick-Thin Decomposition and Extinction
The long-time problem is different from the finite-time surgery problem. Once high-curvature singularities have been controlled, one asks which parts persist for large time and which collapse. Perelman's thick-thin decomposition separates regions with definite volume at the curvature scale from regions that are locally collapsed.
[definition: Thick Part]
Let $(M(t),g(t))$ be a Ricci flow with surgery and let $w>0$. The $w$-thick part at time $t$ is the set of points $x\in M(t)$ for which there exists a radius $\rho\in (0,\sqrt{t}]$ such that sectional curvatures on $B(x,\rho)$ are bounded below by $-\rho^{-2}$ and
\begin{align*}
\operatorname{vol}_{g(t)} B(x,\rho) \ge w\rho^3.
\end{align*}
[/definition]
The thick part captures regions where Hamilton compactness can see a genuine three-dimensional limit. To study the complementary behaviour, we need a name for the points where every curvature-controlled ball has too little volume to be noncollapsed.
[definition: Thin Part]
With the same notation, the $w$-thin part at time $t$ is the complement in $M(t)$ of the $w$-thick part.
[/definition]
[illustration:thick-thin-ricci-decomposition]
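As a hedged illustration of the thick condition, consider a closed hyperbolic three-manifold with metric $g_{\mathrm{hyp}}$ of sectional curvature $-1$; this model, the radius function $s(t)$, and the constant $w_0$ are introduced only for this sketch. Since $\operatorname{Ric}_{g_{\mathrm{hyp}}}=-2g_{\mathrm{hyp}}$, the ansatz $g(t)=a(t)g_{\mathrm{hyp}}$ with $a(0)=1$ gives $a'(t)=4$, so
\begin{align*}
g(t)=(1+4t)\,g_{\mathrm{hyp}},\qquad K(t)=-\frac{1}{1+4t}.
\end{align*}
Take $\rho=\sqrt t$. The curvature condition holds because $t\le 1+4t$:
\begin{align*}
-\frac{1}{1+4t}\ \ge\ -\frac1t=-\rho^{-2}.
\end{align*}
A $g(t)$-ball of radius $\sqrt t$ is a $g_{\mathrm{hyp}}$-ball of radius $s(t)=\sqrt{t/(1+4t)}$, and volumes scale by $(1+4t)^{3/2}$, so
\begin{align*}
\frac{\operatorname{vol}_{g(t)}B(x,\sqrt t)}{t^{3/2}}=\Bigl(\frac{1+4t}{t}\Bigr)^{3/2}\operatorname{vol}_{g_{\mathrm{hyp}}}B_{g_{\mathrm{hyp}}}\bigl(x,s(t)\bigr).
\end{align*}
For $t\ge 1$ one has $(1+4t)/t\ge 4$ and $s(t)\ge 1/\sqrt5$, hence
\begin{align*}
\frac{\operatorname{vol}_{g(t)}B(x,\sqrt t)}{t^{3/2}}\ \ge\ 8\,\operatorname{vol}_{g_{\mathrm{hyp}}}B_{g_{\mathrm{hyp}}}\bigl(x,1/\sqrt5\bigr)\ \ge\ w_0,
\end{align*}
where $w_0=8\inf_x \operatorname{vol}_{g_{\mathrm{hyp}}}B_{g_{\mathrm{hyp}}}(x,1/\sqrt5)>0$ by compactness. In this model every point is $w$-thick for every $w\le w_0$ and every $t\ge 1$.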
The definitions are tuned to the scale $\sqrt t$ because long-time nonsingular regions of Ricci flow naturally have curvature of order $t^{-1}$ and volume of order $t^{3/2}$, so that the parabolically rescaled metric $t^{-1}g(t)$ has curvature and volume of order one. The main theorem asks what geometric limits these two alternatives force when time is large and the surgery scale is far below the natural curvature scale.
[quotetheorem:6031]
[citeproof:6031]
This result is the analytic gateway from Ricci flow to three-manifold topology, but its hypotheses encode several scale choices. The curvature lower bound in the definition of thickness is needed because volume lower bounds alone do not give Hamilton compactness; without such a lower bound, a ball can have definite volume while containing small regions of very negative sectional curvature, so no controlled smooth limit need exist after rescaling. The requirement that surgery scales lie far below $\sqrt t$ is also structural: if caps were inserted at the same scale as the long-time geometry, the thick part could see fresh artificial cap geometry instead of the limiting hyperbolic pieces, and the thin part would not be a purely collapsed region of the old flow.
The thin side is where the geometric-topological bridge becomes visible. Local collapse under a lower sectional-curvature bound is not an arbitrary loss of volume; in dimension three it forces the region to be built from standard collapsed geometries, such as circle-fibred, torus-fibred, and Seifert-type pieces assembled along incompressible tori. The topology extracted from this collapsed geometry is graph-manifold topology, while the thick components carry the hyperbolic pieces. Thus the analytic thick-thin decomposition is the Ricci-flow mechanism behind the hyperbolic/graph-manifold split in geometrization.
This long-time alternative also shows why the next result requires a different argument. Thick hyperbolic pieces and thin graph-manifold pieces are possible persistent geometries for general initial topology, so the thick-thin theorem by itself does not force a flow to disappear. In the simply connected case, however, such persistent pieces are incompatible with the topological input used in Perelman's extinction argument: an essential two-sphere sweepout supplies a width that must decrease through the smooth flow and cannot be increased by surgery. The following theorem isolates that finite-time mechanism, turning the continuation theory developed above into extinction rather than a long-time geometric decomposition.
[quotetheorem:6032]
[citeproof:6032]
The theorem is the point where the analytic continuation theory for Ricci flow with surgery meets topology. The simply connected hypothesis is stronger than what the extinction argument needs, but it is a clean route to finite fundamental group and to the sweepout input. The conclusion is not a consequence of curvature estimates alone: long-time continuation would still permit persistent geometric pieces on other topologies, and the thick-thin theorem describes such possibilities rather than excluding them. What excludes them here is the width inequality together with the fact that surgery does not increase the chosen topological complexity. This is why finite-time extinction becomes the Ricci-flow route to the [Poincare conjecture](/theorems/6037).
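A short computation shows how a width inequality forces extinction in finite time. It is a sketch under an assumption: that the width $W(t)\ge 0$ satisfies a differential inequality of the Colding-Minicozzi form, in the sense of forward difference quotients, with a constant $C>0$ determined by the initial lower bound on scalar curvature. The specific constants are assumptions of this sketch and are not taken from the theorem quoted above.
\begin{align*}
\frac{d}{dt}W(t)\ \le\ -4\pi+\frac{3}{4(t+C)}\,W(t).
\end{align*}
Multiplying by the integrating factor $(t+C)^{-3/4}$ gives
\begin{align*}
\frac{d}{dt}\Bigl[(t+C)^{-3/4}W(t)\Bigr]\ \le\ -4\pi\,(t+C)^{-3/4}.
\end{align*}
Integrating from $0$ to $t$ yields
\begin{align*}
(t+C)^{-3/4}W(t)\ \le\ C^{-3/4}W(0)-16\pi\Bigl[(t+C)^{1/4}-C^{1/4}\Bigr].
\end{align*}
The left side is nonnegative as long as the flow exists with a nontrivial sweepout, while the right side tends to $-\infty$ as $t\to\infty$. The flow therefore cannot survive past the time at which the right side reaches zero, which gives an explicit upper bound on the extinction time in terms of $W(0)$ and $C$.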
[example: Extinction of a Spherical Space Form]
Let $M=S^3/\Gamma$, where $\Gamma$ is a finite group acting freely by isometries on the round $S^3$, and normalize the initial metric $g_0$ so that it has sectional curvature $1$. Since the quotient map is locally isometric, $g_0$ has the same local curvature tensor as the round $S^3$. In dimension $3$, constant sectional curvature $1$ gives
\begin{align*}
\operatorname{Ric}_{g_0}=2g_0.
\end{align*}
By symmetry and uniqueness of the Ricci flow, the evolving metric remains a constant multiple of $g_0$, so write
\begin{align*}
g(t)=a(t)g_0
\end{align*}
with
\begin{align*}
a(0)=1.
\end{align*}
For the constant scaling $g(t)=a(t)g_0$, the Ricci tensor as a $(0,2)$-tensor is
\begin{align*}
\operatorname{Ric}_{g(t)}=2g_0.
\end{align*}
The Ricci flow equation $\partial_t g=-2\operatorname{Ric}$ therefore gives
\begin{align*}
\partial_t(a(t)g_0)=-2(2g_0).
\end{align*}
Since $g_0$ is independent of $t$, this is
\begin{align*}
a'(t)g_0=-4g_0.
\end{align*}
Cancelling the fixed nonzero tensor $g_0$ gives
\begin{align*}
a'(t)=-4.
\end{align*}
Integrating from $0$ to $t$ gives
\begin{align*}
a(t)-a(0)=\int_0^t -4\,d\tau.
\end{align*}
The integral is
\begin{align*}
\int_0^t -4\,d\tau=-4t.
\end{align*}
Using $a(0)=1$, we obtain
\begin{align*}
a(t)=1-4t.
\end{align*}
The metric reaches zero size exactly when
\begin{align*}
1-4t=0.
\end{align*}
Solving gives
\begin{align*}
t=\frac14.
\end{align*}
Thus a round spherical space form becomes extinct at time $1/4$ under this normalization. In a general Ricci flow with surgery, components recognized as spherical space-form pieces are removed because their remaining round-model evolution is completely controlled and lasts only a bounded time at the corresponding curvature scale.
[/example]
The chapter leaves us with a complete qualitative picture. High-curvature necks are cut and capped using the standard solution; the canonical-neighbourhood and pinching estimates survive the discontinuous surgery times; and the large-time decomposition identifies the only possible persistent geometries. In the simply connected case those persistent geometries are ruled out, giving finite-time extinction and the Ricci-flow route to the Poincare conjecture.
Surgery completes the analytic framework needed to run Ricci flow on a simply connected closed three-manifold until extinction or decomposition. The final chapter gathers the preceding chapters into the overall strategy of Perelman's proof of the Poincaré conjecture.
# 12. Outline of the Poincare Conjecture Proof
This chapter explains how the analytic theory developed in Chapters 1--11 is assembled into Perelman's proof of the Poincare conjecture. The starting point is a closed simply connected three-manifold with an arbitrary Riemannian metric, and the central question is whether Ricci flow can turn that metric into recognizable topology before singularities destroy the solution. The answer uses three linked mechanisms: controlled singularity models, surgery along necks, and finite-time extinction in the simply connected case.
## From an Arbitrary Three-Manifold to Ricci Flow with Surgery
The first problem is that a closed three-manifold need not carry any useful initial geometry. Ricci flow gives a canonical deformation of the metric, but in dimension three singularities generally form before the manifold has settled into a standard geometry. This motivates replacing a single smooth flow by a controlled piecewise-smooth process that can continue past singular times.
[definition: Ricci Flow with Surgery]
A Ricci flow with surgery on a closed three-manifold consists of a finite or locally finite sequence of smooth Ricci flows $(M_i, g_i(t))$ on time intervals $[T_i,T_{i+1})$, together with surgery operations at times $T_i$ for $i \ge 1$ in which selected embedded neck regions are removed and the remaining boundary components are capped by standard caps.
[/definition]
The definition records the global bookkeeping of the flow, but it does not yet say which regions may be cut. A legitimate cut must occur where the geometry has become cylindrical at the curvature scale, because only then does a central two-sphere determine a stable topological operation. This motivates the local model used to recognize surgery regions.
[definition: Strong Epsilon Neck]
Let $(M,g(t))$ be a Ricci flow and let $(x_0,t_0) \in M \times [0,T)$ with scalar curvature $Q=R(x_0,t_0)>0$ and $t_0\ge Q^{-1}$. For fixed $\varepsilon>0$ and integer $k\ge \varepsilon^{-1}$, a strong $\varepsilon$-neck centered at $(x_0,t_0)$ is a pointed parabolic neighbourhood
\begin{align*}
\{(x,t): x\in U,\ t\in [t_0-Q^{-1},t_0]\}
\end{align*}
which, after the rescaling $\tilde g(s)=Q\,g(t_0+Q^{-1}s)$ and after choosing pointed coordinates sending $x_0$ to a point on the central slice of $S^2\times \mathbb R$, is $\varepsilon$-close in $C^k$ on $S^2\times (-\varepsilon^{-1},\varepsilon^{-1})\times[-1,0]$ to the corresponding region of the shrinking round cylinder.
[/definition]
[illustration:surgery-sphere-strong-neck]
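For reference, the model flow in this definition can be written explicitly. The following is a short worked computation in which $g_{S^2}$ denotes the unit round metric on $S^2$, $u$ is the coordinate on $\mathbb R$, and $b(s)$ is a scale function introduced here for illustration. Write
\begin{align*}
g(s)=b(s)\,g_{S^2}+du^2.
\end{align*}
The unit two-sphere has $\operatorname{Ric}_{g_{S^2}}=g_{S^2}$, the Ricci tensor is unchanged by constant scaling, and the flat line factor contributes nothing, so
\begin{align*}
\operatorname{Ric}_{g(s)}=g_{S^2},\qquad R_{g(s)}=\frac{2}{b(s)}.
\end{align*}
The Ricci flow equation $\partial_s g=-2\operatorname{Ric}$ becomes $b'(s)=-2$. Normalizing $R=1$ at $s=0$ forces $b(0)=2$, hence
\begin{align*}
b(s)=2-2s,\qquad R(s)=\frac{1}{1-s},\qquad s\le 0.
\end{align*}
On the time interval $[-1,0]$ used in the definition, the scalar curvature of the model therefore ranges from $1/2$ to $1$, and the radius $\sqrt{b(s)}$ of the cross-sectional sphere ranges from $2$ down to $\sqrt2$.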
A strong neck turns analytic blow-up information into a topological instruction: cut across the central two-sphere and cap the two sides. The remaining challenge is existence, since the flow must reach singular time, detect necks at the right scale, insert standard caps, and restart while preserving the estimates needed for the next singular time. This motivates the surgery existence theorem.
[quotetheorem:6033]
[citeproof:6033]
The closedness hypothesis is used to obtain uniform initial entropy, short-time curvature control, and compactness of the spatial slices; on a noncompact three-manifold such as $S^2\times\mathbb R$ with an incomplete or poorly controlled end, high-curvature regions can interact with infinity and the same finite-parameter surgery construction is not available without extra bounded-geometry hypotheses. The orientation assumption is harmless for the Poincare application but keeps the standard three-dimensional surgery discussion in the oriented category; nonorientable manifolds require passing to orientable covers or adding separate bookkeeping for caps. The theorem also depends on choosing the cutoff hierarchy in the stated order: if the neck scale is not much smaller than the canonical-neighbourhood scale, cutting can occur in a region that is not close enough to a cylinder for the standard cap estimates to match.
The theorem does not classify the manifold and does not assert that the flow is unique across surgery times. It supplies one controlled continuation whose singular times are topologically interpretable. The simplest case to compare with is a solution where no localized surgery is needed.
[example: Round Three-Sphere Under Ricci Flow]
Take $M=S^3$ with the round initial metric $g_0$ of sectional curvature $1$. If $g(t)=a(t)g_0$ remains round, then the sectional curvature of $a(t)g_0$ is $a(t)^{-1}$, and in dimension $3$ the Ricci tensor of a constant-curvature metric is $2K$ times the metric. Thus
\begin{align*}
\operatorname{Ric}(a(t)g_0)=2a(t)^{-1}a(t)g_0=2g_0.
\end{align*}
The Ricci flow equation $\partial_t g=-2\operatorname{Ric}(g)$ gives
\begin{align*}
a'(t)g_0=-2\operatorname{Ric}(a(t)g_0)=-4g_0.
\end{align*}
Therefore $a'(t)=-4$, and since $a(0)=1$,
\begin{align*}
a(t)=1-4t.
\end{align*}
So the round solution is
\begin{align*}
g(t)=(1-4t)g_0,\qquad 0\le t<\frac14.
\end{align*}
For a round three-sphere of sectional curvature $K$, the scalar curvature is the sum of the sectional curvatures over the $3\cdot 2=6$ ordered pairs of distinct orthonormal directions, so $R=6K$. Here $K(t)=(1-4t)^{-1}$, hence
\begin{align*}
R(t)=\frac{6}{1-4t}.
\end{align*}
This scalar curvature is spatially constant for every $t$, and
\begin{align*}
\lim_{t\uparrow 1/4}R(t)=+\infty.
\end{align*}
At the same time, multiplying the metric by $1-4t$ multiplies all distances by $\sqrt{1-4t}$, so every diameter tends to $0$ as $t\uparrow 1/4$. The singularity is therefore global extinction of an already spherical component, not a localized neck that calls for surgery.
[/example]
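A quick consistency check on this example uses the volume evolution $\frac{d}{dt}\operatorname{Vol}=-\int_M R\,d\mu$ under Ricci flow, together with the fact that the unit round $S^3$ has volume $2\pi^2$. Scaling the metric by $1-4t$ scales three-dimensional volume by $(1-4t)^{3/2}$, so
\begin{align*}
\operatorname{Vol}(t)=2\pi^2(1-4t)^{3/2},\qquad \frac{d}{dt}\operatorname{Vol}(t)=-12\pi^2(1-4t)^{1/2}.
\end{align*}
On the other hand, $R(t)$ is spatially constant, so
\begin{align*}
-\int_{S^3}R\,d\mu=-\frac{6}{1-4t}\cdot 2\pi^2(1-4t)^{3/2}=-12\pi^2(1-4t)^{1/2}.
\end{align*}
The two expressions agree, which confirms that the formulas for $g(t)$ and $R(t)$ are compatible, and shows the volume tending to $0$ as $t\uparrow 1/4$.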
The round sphere is the simplest model for the final outcome, but most initial metrics develop inhomogeneous curvature. The next question is why those singular regions remain sufficiently standard for surgery to be a legitimate operation rather than an uncontrolled modification.
## No Local Collapsing and Canonical Neighbourhoods
The main analytic obstruction to understanding singularities is collapse: after rescaling around a point of large curvature, a sequence of balls might have bounded curvature but vanishing volume. Collapsed limits can lose dimension and are too weak to classify. This motivates a scale-invariant lower volume condition tied to curvature control.
[definition: Kappa Noncollapsing]
Let $(M,g(t))$ be a Ricci flow on a time interval $[0,T)$. The flow is $\kappa$-noncollapsed on scales at most $r_0$ if the following holds for every $0<r\le r_0$ and every $(x,t)$ with $t\ge r^2$: whenever the backward parabolic ball
\begin{align*}
P(x,t,r,-r^2)=\{(y,s): s\in[t-r^2,t],\ d_{g(s)}(y,x)<r\}
\end{align*}
is relatively compact in spacetime and satisfies $|\operatorname{Rm}| \le r^{-2}$ throughout, the time-$t$ volume satisfies
\begin{align*}
\operatorname{Vol}_{g(t)} B(x,r) \ge \kappa r^3.
\end{align*}
[/definition]
This condition says that a region with controlled curvature has the expected three-dimensional volume. It is precisely the compactness input needed to pass from a sequence of increasingly curved regions to a nonflat ancient solution. This motivates Perelman's theorem guaranteeing the condition for compact flows.
[quotetheorem:6020]
[citeproof:6020]
Closedness and finite time are essential in this form of the theorem. On noncompact initial data, a flat product with a very small circle factor gives bounded curvature and arbitrarily small volume ratios unless an initial noncollapsing assumption is imposed. On infinite time intervals, a constant $\kappa$ need not be obtained from the initial entropy without restricting the time horizon, because the entropy comparison used in the proof runs backward from a finite final time. The curvature bound on the whole parabolic ball is also necessary: a spatial ball with large volume at one time can have nearby spacetime curvature spikes that prevent the parabolic rescaling and compactness argument from applying.
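The flat-product example can be made quantitative. As a sketch, take the flat metric on $\mathbb R^2\times S^1_\ell$, where $S^1_\ell$ is a circle of length $\ell$ and $D_r\subset\mathbb R^2$ denotes the Euclidean disc of radius $r$; both symbols are introduced here for illustration. The metric is Ricci-flat, so the flow is static, and $|\operatorname{Rm}|=0\le r^{-2}$ on every parabolic ball. Projection to $\mathbb R^2$ does not increase distance, so
\begin{align*}
B(x,r)\subset D_r\times S^1_\ell,\qquad \operatorname{Vol}B(x,r)\ \le\ \pi r^2\ell.
\end{align*}
The volume ratio therefore satisfies
\begin{align*}
\frac{\operatorname{Vol}B(x,r)}{r^3}\ \le\ \frac{\pi\ell}{r}.
\end{align*}
For the flow to be $\kappa$-noncollapsed at scale $r$ one needs $\kappa\le \pi\ell/r$, and this upper bound tends to $0$ as $\ell\to 0$ with $r$ fixed. No positive $\kappa$ works uniformly over this family, even though the curvature hypothesis holds trivially.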
The theorem does not say that high-curvature regions are cylindrical, nor does it bound curvature from above. It only prevents collapse at scales where curvature has already been controlled. That lower volume input is what allows Hamilton compactness to produce genuine three-dimensional ancient limits, and the next theorem classifies the local geometry of those limits.
[definition: Canonical Neighbourhood]
A point $(x,t)$ in a three-dimensional Ricci flow has an $\varepsilon$-canonical neighbourhood if, after rescaling by the scalar curvature at $(x,t)$, a neighbourhood of $x$ at time $t$ is $\varepsilon$-close in high $C^k$ topology to one of the standard local models: a round neck, a cap attached to a neck, or a compact component with positive curvature.
[/definition]
The definition is deliberately geometric rather than purely tensorial. It identifies the shapes on which surgery has a topological meaning, and it is useful only if high curvature forces these shapes. This motivates the theorem that supplies canonical neighbourhoods from the analytic hypotheses already established.
[quotetheorem:6034]
[citeproof:6034]
The quantifier order matters. If $\varepsilon$ is chosen after the curvature threshold, a sequence of increasingly accurate counterexamples would not force a fixed limiting model; if the surgery cutoff is not chosen small relative to $\varepsilon$, the inserted cap may fail to be close enough to the standard model at the scale used in the blow-up. Noncollapsing is also indispensable: collapsed graph-manifold regions can have large curvature in some directions while lacking a three-dimensional $\kappa$-solution limit, so the neck-cap-spherical classification would not follow.
The theorem does not state that every point of the manifold lies in a canonical neighbourhood and does not specify a unique surgery sphere inside a neck. It applies only above the threshold $R_{\mathrm{can}}$, and lower-curvature regions are controlled by separate long-time estimates. Its role is to turn the analytic estimates into topological control: every point of sufficiently high curvature has a neighbourhood of known shape, so the permitted topological changes are cutting along two-spheres, discarding spherical space-form components, and capping neck boundaries. A neckpinch on a three-sphere gives the model picture.
[example: Tracking a Neckpinch on a Three-Sphere]
Consider a metric on $S^3$ with two large nearly round lobes joined by a thin cylindrical region, and choose points $(x_j,t_j)$ on the middle of the neck where the scalar curvature tends to infinity. Write
\begin{align*}
Q_j=R(x_j,t_j).
\end{align*}
Thus $Q_j\to\infty$. Rescale the flow around $(x_j,t_j)$ by
\begin{align*}
\widetilde g_j(s)=Q_j\,g(t_j+Q_j^{-1}s),\qquad -1\le s\le 0.
\end{align*}
Scalar curvature scales by $R_{\lambda g}=\lambda^{-1}R_g$, so at the rescaled basepoint one has
\begin{align*}
\widetilde R_j(x_j,0)=Q_j^{-1}R_g(x_j,t_j)=Q_j^{-1}Q_j=1.
\end{align*}
Distances scale by $\sqrt{Q_j}$ under $g\mapsto Q_jg$, so the unrescaled neck radius corresponding to a unit-size cylindrical region in $\widetilde g_j(0)$ is on the order of $Q_j^{-1/2}$.
For all sufficiently large $j$, the inequality $Q_j\ge R_{\mathrm{can}}$ puts $(x_j,t_j)$ in the range of the *Canonical Neighbourhood Theorem*. In the neckpinch case, the canonical neighbourhood is the neck model: after the rescaling above, a neighbourhood of $(x_j,0)$ is $\varepsilon$-close to a region of $S^2\times\mathbb R$ with the shrinking round cylinder metric. The surgery sphere is the central copy
\begin{align*}
\Sigma_j=S^2\times\{0\}.
\end{align*}
Cutting $S^3$ along $\Sigma_j$ gives two compact pieces $A$ and $B$ with common boundary:
\begin{align*}
S^3=A\cup_{\Sigma_j}B.
\end{align*}
Their intersection is exactly the cutting sphere:
\begin{align*}
A\cap B=\Sigma_j.
\end{align*}
Capping the two boundary spheres means attaching one $3$-ball to each side:
\begin{align*}
M_1=A\cup_{\Sigma_j}B^3.
\end{align*}
\begin{align*}
M_2=B\cup_{\Sigma_j}B^3.
\end{align*}
By Alexander's theorem (the smooth Schoenflies theorem in dimension three), every smoothly embedded $S^2\subset S^3$ bounds a $3$-ball on each side. Hence $A$ and $B$ are each $3$-balls, and therefore
\begin{align*}
M_1\cong B^3\cup_{S^2}B^3\cong S^3.
\end{align*}
\begin{align*}
M_2\cong B^3\cup_{S^2}B^3\cong S^3.
\end{align*}
The neckpinch therefore produces two spherical components rather than a persistent non-spherical piece, and the later extinction of those components records that the singularity was a controlled spherical splitting.
[/example]
This example shows why surgery is not an extra topological guess: the location and type of the cut are forced by the singularity model. The remaining issue for the Poincare conjecture is to understand what the absence of fundamental group does to the long-time alternatives in geometrization.
## Controlled Topological Changes in the Simply Connected Case
Ricci flow with surgery is designed for geometrization, where long-time pieces may carry hyperbolic geometry, graph-manifold structure, or spherical geometry. The Poincare conjecture is a special case: the manifold is closed and simply connected, so many pieces allowed in general geometrization cannot occur. This motivates isolating the topological surfaces that would record non-spherical pieces.
[definition: Incompressible Surface]
Let $M$ be a connected three-manifold. A connected embedded surface $\Sigma \subset M$ is incompressible if the induced homomorphism $\pi_1(\Sigma) \to \pi_1(M)$ is injective and $\Sigma$ is not a two-sphere bounding a three-ball.
[/definition]
Incompressible surfaces are the surfaces along which non-spherical geometrization pieces are detected. A simply connected ambient manifold cannot contain an incompressible surface with nontrivial fundamental group, because that fundamental group would have to inject into the trivial group. This motivates the topological exclusion needed in the Poincare case.
[quotetheorem:6035]
[citeproof:6035]
Simple connectivity is the decisive hypothesis. For example, $S^1\times S^2$ and many torus bundles have nontrivial fundamental group, so incompressible surfaces can persist and the conclusion fails. Closedness is also used because the prime and JSJ decompositions invoked here are compact three-manifold tools; finite-volume hyperbolic cusps in noncompact manifolds have incompressible boundary tori without contradicting any compact simply connected classification. The theorem does not say that all surgery components are simply connected at every intermediate time, since a two-sphere cut may separate summands before caps are attached; it says that persistent incompressible non-spherical pieces cannot be traced back to the initial simply connected manifold.
This theorem is a topological filter on the general geometrization picture. It leaves spherical extinction as the only possible behaviour compatible with simple connectivity. A hyperbolic piece illustrates what is being excluded.
[example: Why Hyperbolic Pieces Do Not Appear]
Suppose a long-time component $N$ of the surgery flow contains a finite-volume hyperbolic piece $H$ whose boundary has a torus component $T$. In the geometric decomposition, $T$ is incompressible, so the inclusion map
\begin{align*}
i:T\hookrightarrow N
\end{align*}
induces an injective homomorphism
\begin{align*}
i_*:\pi_1(T)\longrightarrow \pi_1(N).
\end{align*}
Since $T\cong S^1\times S^1$, its fundamental group is
\begin{align*}
\pi_1(T)\cong \pi_1(S^1)\times \pi_1(S^1)\cong \mathbb Z\times \mathbb Z=\mathbb Z^2.
\end{align*}
Thus $\pi_1(N)$ contains an injected copy of $\mathbb Z^2$.
The surgeries used in the flow cut along embedded two-spheres and cap by three-balls. Reversing such a surgery removes the caps and glues along a two-sphere, which changes the global connected-sum bookkeeping but does not turn an injected torus subgroup into the trivial group. Therefore this injected $\mathbb Z^2$ subgroup would trace back to a nontrivial subgroup of the fundamental group of the original manifold. If the original manifold is simply connected, then
\begin{align*}
\pi_1(M)=0,
\end{align*}
so it cannot contain a subgroup isomorphic to $\mathbb Z^2$. Hence a persistent finite-volume hyperbolic piece with torus boundary is incompatible with the simply connected Poincare case.
[/example]
The example isolates the reason the Poincare conjecture is easier than full geometrization once Perelman's analytic machine is available. General three-manifolds may have genuine long-time geometric pieces, while simply connected manifolds have no fundamental group in which these pieces can be recorded.
## Extinction and Recovery of Spherical Topology
The final analytic problem is to prove that the surgery flow does not continue forever in the simply connected case. Extinction means that after finitely many surgeries and smooth flow intervals, no component remains. This motivates naming the event whose occurrence will encode spherical topology.
[definition: Extinction Time]
For a Ricci flow with surgery, the extinction time is the first time $T_{\mathrm{ext}}$ such that the post-surgery manifold is empty. If no such time exists, the flow is said to be non-extinct.
[/definition]
Extinction is a metric event, but its proof depends on excluding all non-spherical long-time geometries. Perelman proves a broader finite extinction theorem for manifolds whose prime decomposition has no aspherical pieces of the relevant kind. This motivates applying the long-time analysis to simply connected initial data.
[quotetheorem:6036]
[citeproof:6036]
The simply connected hypothesis cannot be dropped. A closed hyperbolic three-manifold has nontrivial fundamental group, and its Ricci flow starting from the hyperbolic metric expands for all time instead of becoming extinct; after normalization, the long-time behaviour is hyperbolic. The theorem also relies on the closed initial manifold and on the surgery parameters coming from Perelman's construction; with uncontrolled cutoffs, a surgery process could discard or cap regions in a way that no longer reflects the analytic long-time alternatives. The theorem does not assert that extinction happens before the first singular time or that every component is spherical at all earlier times. It says that all components disappear after finitely many controlled analytic and topological steps.
Finite extinction tells us that the initial manifold decomposes, through the reverse of the surgery process, into pieces that disappeared as spherical components. The remaining task is to translate this controlled disappearance into the classical classification of closed simply connected three-manifolds. This motivates the final theorem.
[quotetheorem:6037]
[citeproof:6037]
Closedness is needed in the statement: contractible noncompact three-manifolds such as the Whitehead manifold show that simple connectivity alone does not characterize $S^3$ outside the closed category. Smoothness is the category in which the Ricci flow and surgery construction is run, and simple connectivity is essential because lens spaces are closed spherical three-manifolds with finite nontrivial fundamental group. The theorem does not say that the initial metric becomes round before extinction, nor does it identify a canonical diffeomorphism to $S^3$. It concludes the topological type by combining finite extinction with the surgery bookkeeping, and the recovery example below illustrates that bookkeeping in the simplest neck-cut situation.
[example: Recovering the Original Manifold from Extinction]
Let the neck cut have central sphere $\Sigma\cong S^2$. After cutting along $\Sigma$ and capping the two boundary spheres by $3$-balls, suppose the two closed components that later shrink to extinction are spherical space forms $X$ and $Y$. Reversing the surgery removes the two cap balls and glues the resulting boundary spheres back together, so the original manifold is
\begin{align*}
M\cong \bigl(X\setminus \operatorname{int} B_X^3\bigr)\cup_{\Sigma}\bigl(Y\setminus \operatorname{int} B_Y^3\bigr).
\end{align*}
This is the connected sum construction, hence
\begin{align*}
M\cong X\# Y.
\end{align*}
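Removing an open $3$-ball does not change the fundamental group of a three-manifold, and the gluing sphere $\Sigma$ is simply connected. The standard *Seifert-van Kampen theorem* computation for a connected sum in dimension $3$ therefore gives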
\begin{align*}
\pi_1(X\# Y)\cong \pi_1(X)*\pi_1(Y).
\end{align*}
Since $M\cong X\#Y$, this gives
\begin{align*}
\pi_1(M)\cong \pi_1(X)*\pi_1(Y).
\end{align*}
If the original manifold is simply connected, then $\pi_1(M)\cong \{e\}$, so
\begin{align*}
\pi_1(X)*\pi_1(Y)\cong \{e\}.
\end{align*}
The inclusions of the two factors into a free product are injective, so neither $\pi_1(X)$ nor $\pi_1(Y)$ can contain any element other than the identity. Therefore
\begin{align*}
\pi_1(X)\cong \{e\},\qquad \pi_1(Y)\cong \{e\}.
\end{align*}
Each spherical space form has the form $S^3/\Gamma$, with fundamental group $\Gamma$. The identities above therefore force $\Gamma=\{e\}$ for both factors, so
\begin{align*}
X\cong S^3,\qquad Y\cong S^3.
\end{align*}
Substituting these identifications into the connected sum description gives
\begin{align*}
M\cong S^3\# S^3.
\end{align*}
Removing an open $3$-ball from $S^3$ leaves a closed $3$-ball, hence
\begin{align*}
S^3\# S^3\cong B^3\cup_{S^2}B^3.
\end{align*}
Gluing two $3$-balls along their common boundary sphere gives $S^3$, and therefore
\begin{align*}
M\cong S^3.
\end{align*}
Thus, in this simplest extinction pattern, reversing the surgery bookkeeping recovers the original manifold as a three-sphere rather than as a non-spherical connected sum.
[/example]
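The example treats a neck sphere that separates. As a complementary sketch, assume instead that $\Sigma$ is a non-separating two-sphere in a closed orientable three-manifold $M$, and let $X$ denote the connected manifold obtained by cutting along $\Sigma$ and capping the two boundary spheres; this use of $X$ is introduced here only for illustration. The standard identification is then
\begin{align*}
M\cong X\#(S^2\times S^1),\qquad \pi_1(M)\cong \pi_1(X)*\mathbb{Z}.
\end{align*}
The factor $\mathbb{Z}$ injects into the free product, so $\pi_1(M)$ is nontrivial. Simple connectivity therefore rules out this pattern: in a simply connected closed three-manifold every neck sphere separates, and the bookkeeping of the example applies.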
The outline also explains how the major estimates from the course fit together. Short-time existence starts the flow, curvature evolution and maximum principles control pinching, entropy gives noncollapsing, compactness produces singularity models, canonical neighbourhoods authorize surgery, and the topological consequences of simple connectivity force extinction. The Poincaré conjecture is therefore the endpoint of a chain in which analysis restricts geometry and geometry restricts topology.
## Connections and Further Reading
This course sits between parabolic geometric PDE, metric degeneration, and three-manifold topology. For background on the geometric language, see [Cambridge III Differential Geometry](/page/Cambridge%20III%20Differential%20Geometry), [Geometric Analysis I: Comparison Geometry](/page/Geometric%20Analysis%20I%3A%20Comparison%20Geometry), and [Geometric Analysis II: Minimal Surfaces and Harmonic Maps](/page/Geometric%20Analysis%20II%3A%20Minimal%20Surfaces%20and%20Harmonic%20Maps). A natural next analytic topic is the study of Ricci solitons and ancient solutions, since these are the equality and blow-up models behind entropy monotonicity, compactness, and singularity formation. The shrinking sphere, shrinking cylinder, Bryant soliton, and kappa-solution picture are not isolated examples; they are the model spaces that make canonical neighbourhoods and surgery possible.
Another direction is the broader geometrization theorem. The proof of the Poincaré conjecture uses simple connectivity to eliminate incompressible surfaces and long-time hyperbolic or graph-manifold pieces, but general closed three-manifolds require the full thick-thin analysis. The surrounding topological language is connected to [Topology](/page/Topology), [Cambridge IB Analysis and Topology](/page/Cambridge%20IB%20Analysis%20and%20Topology), and [Cambridge III Algebraic Topology](/page/Cambridge%20III%20Algebraic%20Topology). In the general setting, Ricci flow with surgery does not merely prove extinction; it separates spherical, hyperbolic, and collapsed graph-like geometries into the pieces predicted by the topological decomposition.
Several neighbouring Androma topics supply useful background. Cheeger-Gromov compactness explains why pointed limits of high-curvature regions exist after rescaling. Maximum principles for tensors underlie Hamilton-Ivey pinching and preserved curvature cones. [Sobolev Space](/page/Sobolev%20Space), [Homogeneous Sobolev Space](/page/Homogeneous%20Sobolev%20Space), and [Inhomogeneous Sobolev Space](/page/Inhomogeneous%20Sobolev%20Space) are the analytic background for Perelman's entropy and noncollapsing arguments. Finally, the topology of prime decomposition, incompressible surfaces, and connected sums is the bridge from controlled surgery to the classification of simply connected closed three-manifolds.
For a next reading path, one can first revisit compactness and maximum principles in geometric analysis, then study Ricci solitons and kappa-solutions as singularity models, and only then return to the full geometrization proof. That order keeps the analytic mechanisms visible while showing how Perelman's estimates become topological information.
## References
- [Geometric Analysis I: Comparison Geometry](/page/Geometric%20Analysis%20I%3A%20Comparison%20Geometry), for comparison estimates and curvature intuition used before Ricci flow enters.
- [Geometric Analysis II: Minimal Surfaces and Harmonic Maps](/page/Geometric%20Analysis%20II%3A%20Minimal%20Surfaces%20and%20Harmonic%20Maps), for neighbouring geometric-analysis methods and variational examples.
- [Cambridge III Differential Geometry](/page/Cambridge%20III%20Differential%20Geometry), for Riemannian metrics, curvature tensors, geodesics, and the differential-geometric language used throughout the course.
- [Sobolev Space](/page/Sobolev%20Space), [Homogeneous Sobolev Space](/page/Homogeneous%20Sobolev%20Space), and [Inhomogeneous Sobolev Space](/page/Inhomogeneous%20Sobolev%20Space), for the functional-analytic estimates behind heat-flow and entropy arguments.
- [Topology](/page/Topology), [Cambridge IB Analysis and Topology](/page/Cambridge%20IB%20Analysis%20and%20Topology), and [Cambridge III Algebraic Topology](/page/Cambridge%20III%20Algebraic%20Topology), for the topological background behind connected sums, incompressible surfaces, and the Poincaré conclusion.
Geometric Analysis III: Ricci Flow
Created by admin on 6/7/2026 | Last updated on 6/7/2026