Dynamical systems is the mathematical study of how things change over time. Whether modeling the swing of a pendulum, the spread of disease, or the orbits of planets, we describe these phenomena using differential equations and maps that evolve according to deterministic rules. This course develops the core theory and techniques for understanding qualitative behavior — asking not "what is the exact solution?" but rather "how does the system behave in the long term?" What attracts trajectories? When do stable equilibria lose stability? How do small changes in parameters trigger dramatic shifts in behavior? These questions form the heart of dynamical systems and have profound applications across physics, biology, engineering, and ecology.
The course begins with fundamental concepts: what we mean by flows and solutions to differential equations, and how to visualize dynamics in the plane using phase portraits. From there, we develop the theory of stability, learning to classify equilibrium points and determine which ones attract or repel nearby trajectories. We then turn to more complex dynamics — periodic orbits that cycle indefinitely, and bifurcations where the qualitative structure of the system changes as parameters vary. The final chapters extend these ideas beyond continuous flows to discrete maps and introduce chaos, where sensitivity to initial conditions produces behavior that is deterministic yet unpredictable.
Each chapter builds systematically on the last. The geometric intuition from planar flows underpins our stability analysis. Understanding when periodic orbits exist and remain stable prepares us for bifurcation theory, where we track how these objects collide and disappear. Maps extend these insights to discrete time, and chaotic dynamics reveal the frontiers where order gives way to complexity.
# Introduction
This chapter establishes the foundational language of dynamical systems. A dynamical system is a mathematical structure specifying how a set of variables evolves in time according to a fixed rule. Rather than pursuing explicit solutions — which are unavailable for almost all nonlinear systems — this course adopts a geometric viewpoint: we study qualitative and topological properties of the collection of all trajectories. The central objects introduced here, the phase space, the flow, and various species of invariant sets, appear in every subsequent chapter.
## The State Space and the Vector Field
The starting point is a choice of state space and a rule for how points in that space evolve.
[definition: Dynamical System and State Space]
A **state space** (or **phase space**) is a subset $E \subseteq \mathbb{R}^n$. The **state** of the system at time $t$ is a point $x \in E$. An **autonomous dynamical system** on $E$ is specified by a continuously differentiable vector field $f \colon E \to \mathbb{R}^n$, and the evolution of the state is governed by the ordinary differential equation
\begin{align*}
\dot{x} \equiv \frac{dx}{dt} = f(x).
\end{align*}
A **non-autonomous** system has a time-dependent vector field $f \colon E \times \mathbb{R} \to \mathbb{R}^n$, giving $\dot{x} = f(x, t)$.
[/definition]
At first glance, the restriction to autonomous first-order systems might seem severe: physical systems often have time-dependent forcing, and many classical equations (the pendulum equation, the beam equation) are naturally second-order or higher. Two standard reductions show that neither of these is a genuine limitation. Time-dependence can be eliminated by introducing an extra clock variable, and higher-order derivatives can be traded for additional first-order equations by enlarging the state vector.
[remark: Autonomous Reduction]
Every non-autonomous system $\dot{x} = f(x, t)$ can be formally made autonomous by enlarging the state space. Define $y = (x, \theta) \in E \times \mathbb{R}$, and introduce the extended vector field $g \colon E \times \mathbb{R} \to \mathbb{R}^n \times \mathbb{R}$ by $g(y) = (f(x, \theta), 1)$. Then $\dot{y} = g(y)$ is autonomous with $\theta(t) = t$ (setting the initial condition $\theta(0) = 0$). Throughout this course, all systems are assumed autonomous unless stated otherwise.
[/remark]
Together, these two reductions show that the autonomous first-order framework is not a restriction: any non-autonomous or higher-order system can be absorbed into it by enlarging the state space. The price of the reduction is a larger state space, but the reward is a uniform framework in which every system has the form $\dot{x} = f(x)$. The real content of the theory begins once we fix such a system and ask what the totality of its solutions looks like. In particular, the second reduction — rewriting a single $n$th-order equation as an $n$-dimensional first-order system — is how higher-order mechanics enters the theory.
[remark: Reduction to First-Order Systems]
Any scalar $n$th-order ODE of the form
\begin{align*}
\frac{d^n x}{dt^n} = g\!\left(x,\, \frac{dx}{dt},\, \ldots,\, \frac{d^{n-1}x}{dt^{n-1}}\right)
\end{align*}
can be written as a first-order system on $E = \mathbb{R}^n$. Set
\begin{align*}
y = \left(x,\; \frac{dx}{dt},\; \ldots,\; \frac{d^{n-1}x}{dt^{n-1}}\right)^\top.
\end{align*}
Then $\dot{y} = f(y)$, where $f_i(y) = y_{i+1}$ for $1 \le i \le n-1$ and $f_n(y) = g(y)$. This reduction shows that the first-order autonomous framework is fully general.
[/remark]
## Initial Value Problems and the Flow
Given an autonomous system $\dot{x} = f(x)$, one naturally asks: does a solution starting from a prescribed initial state exist, and is it unique?
[definition: Initial Value Problem]
Let $E \subseteq \mathbb{R}^n$, $f \colon E \to \mathbb{R}^n$ a vector field, $x_0 \in E$, and $t_0 \in I \subseteq \mathbb{R}$ an interval. The **initial value problem** (IVP) is: find a differentiable map $x \colon I \to E$ satisfying
\begin{align*}
\dot{x}(t) = f(x(t)) \quad \text{for all } t \in I, \qquad x(t_0) = x_0.
\end{align*}
[/definition]
The IVP asks only whether a solution exists through a single prescribed point. But a dynamical system has infinitely many initial conditions, one for each point of $E$. Solving each IVP separately would produce an unwieldy collection of individual curves. The key insight is that, for an autonomous system, all these solutions can be assembled into a single smooth family — the flow map — which encodes the entire dynamics in one object.
For an autonomous system, the collection of all solutions can be packaged into a single object — the flow.
[definition: Flow]
Let $f \colon E \to \mathbb{R}^n$ be a vector field on $E \subseteq \mathbb{R}^n$. The **flow** of $f$ is the map $\varphi \colon \mathbb{R} \times E \to E$ defined by $\varphi(t, x_0) = x(t)$, where $x \colon \mathbb{R} \to E$ is the solution to the IVP $\dot{x} = f(x)$, $x(0) = x_0$. When the solution exists for all $t \in \mathbb{R}$, the flow satisfies the **group law**
\begin{align*}
\varphi(s+t, x_0) = \varphi(s, \varphi(t, x_0)) \quad \text{for all } s, t \in \mathbb{R},\; x_0 \in E.
\end{align*}
[/definition]
The group law is the key structural fact about autonomous flows: it expresses that evolving for time $t$ and then for time $s$ is the same as evolving for time $s + t$. This property fails for non-autonomous systems, which is one reason the autonomous case is treated separately.
### Existence and Uniqueness
Two fundamental results govern the IVP. The first guarantees existence under mere continuity.
[quotetheorem:2773]
The Cauchy-Peano theorem guarantees existence, but says nothing about uniqueness and nothing about what happens at the boundary of the interval $|t - t_0| < \min(\alpha, \beta/M)$. In particular, the theorem does not assert that a solution can be extended, that different initial conditions lead to different solutions, or that the maximal interval of existence is $\mathbb{R}$ — all of these can fail. The next example shows that multiple solutions can emanate from the same initial condition.
[example: Non-uniqueness Without Lipschitz Condition]
Consider the scalar system on $E = \mathbb{R}$:
\begin{align*}
\dot{x} = \begin{cases} \sqrt{x} & x \ge 0, \\ 0 & x < 0. \end{cases}
\end{align*}
The right-hand side is continuous on $\mathbb{R}$, so the Cauchy-Peano theorem guarantees existence from any initial condition. However, for the initial condition $x(0) = 0$, the family
\begin{align*}
x_\tau(t) = \begin{cases} 0 & t < \tau, \\ \tfrac{1}{4}(t - \tau)^2 & t \ge \tau \end{cases}
\end{align*}
is a solution for every $\tau \ge 0$, as one verifies by direct computation: for $t > \tau$, $\dot{x}_\tau = \tfrac{1}{2}(t - \tau)$ and $\sqrt{x_\tau} = \tfrac{1}{2}(t - \tau)$, so the equation is satisfied. Since $\tau$ is arbitrary, infinitely many distinct solutions start from the origin. The failure of uniqueness arises because $x \mapsto \sqrt{x}$ is not Lipschitz near $x = 0$: for small $x > 0$, $|\sqrt{x} - 0|/|x - 0| = x^{-1/2} \to \infty$.
[/example]
To obtain uniqueness, one imposes a Lipschitz condition on the vector field. The idea is to demand that $f$ does not vary too wildly: if two initial conditions are close, the corresponding values of $f$ must also be close, with a quantitative bound on the rate of divergence. This is weaker than differentiability but strong enough to prevent the branching behaviour seen in the $\sqrt{x}$ example, where the vector field's rate of change blows up near the origin.
[definition: Lipschitz Condition]
A map $f \colon U \subseteq \mathbb{R}^n \to \mathbb{R}^m$ satisfies a **Lipschitz condition** at $x_0 \in U$ with **Lipschitz constant** $L \ge 0$ if there exists $a > 0$ such that
\begin{align*}
|x - x_0| < a \implies |f(x) - f(x_0)| \le L\,|x - x_0|.
\end{align*}
The map $f$ is called **locally Lipschitz** on $U$ if it satisfies a Lipschitz condition at every point of $U$ (possibly with varying constants).
[/definition]
The Lipschitz condition threads between continuity (which gives only existence) and differentiability (which is more than needed). It is the minimal regularity that forces solutions to be unique. In practice, the most important source of locally Lipschitz vector fields is continuous differentiability: if $f \in C^1(E)$, then on any bounded open set $U \subset E$, the mean value inequality gives $|f(x) - f(y)| \le \sup_{z \in U} |Jf_z| \cdot |x - y|$, so $f$ is Lipschitz on $U$ with constant $L = \sup_{z \in U} |Jf_z|$.
[remark: Regularity Hierarchy]
The implication chain
\begin{align*}
C^1 \implies \text{locally Lipschitz} \implies \text{continuous}
\end{align*}
holds for maps between Euclidean spaces. Differentiability implies the Lipschitz condition because $|f(x) - f(x_0)| \le \sup_{z \in U} |Jf_z|\cdot|x - x_0|$ by the mean value inequality. The converse fails: $x \mapsto |x|$ is Lipschitz but not differentiable at $0$.
[/remark]
With the Lipschitz condition in place, uniqueness follows by a fixed-point argument. One reformulates the IVP as an integral equation $x(t) = x_0 + \int_0^t f(x(s))\, ds$, then shows that the right-hand side defines a contraction on the Banach space of continuous curves $C([0, \delta]; \mathbb{R}^n)$ for $\delta$ sufficiently small (depending on $L$). The Banach fixed-point theorem then yields a unique fixed point — the unique solution to the IVP on $[0, \delta]$. The same argument applies backward in time, and iterating the construction (starting from the endpoint of each interval) extends the solution to a maximal interval of existence.
[quotetheorem:2774]
The Picard-Lindelöf theorem guarantees a local solution and continuous dependence on initial data. It does not guarantee that the solution persists for all time, nor does it assert global Lipschitz continuity. The condition that $f$ is Lipschitz at $x_0$ cannot be relaxed to mere continuity without losing uniqueness, as the non-uniqueness example above shows. The word "local" is essential: the length of the interval of guaranteed existence depends on the Lipschitz constant and on how far the solution is permitted to travel, and this interval shrinks as the solution grows. The next example shows that even with a smooth, Lipschitz vector field, solutions may escape to infinity in finite time.
[example: Finite-Time Blow-Up]
Consider $\dot{x} = x^3$ on $E = \mathbb{R}$ with initial condition $x(0) = 1$. The vector field $x \mapsto x^3$ is smooth, hence Lipschitz near any point; the Picard-Lindelöf theorem guarantees a unique local solution. Separating variables,
\begin{align*}
\int_1^{x(t)} \frac{ds}{s^3} = \int_0^t d\tau \implies -\frac{1}{2x(t)^2} + \frac{1}{2} = t \implies x(t) = \frac{1}{\sqrt{1 - 2t}}.
\end{align*}
This solution exists only for $t \in (-\infty, \tfrac{1}{2})$, and $x(t) \to +\infty$ as $t \to \tfrac{1}{2}^-$. This does not contradict the Picard-Lindelöf theorem, which asserts existence on some interval around $t_0 = 0$ — the issue is that the Lipschitz constant of $f(x) = x^3$ on any neighbourhood of $x_0$ grows without bound as $|x| \to \infty$, so the interval of guaranteed existence shrinks to zero as the solution grows.
[/example]
From this point forward, all vector fields $f$ are assumed to be continuously differentiable (hence locally Lipschitz) unless stated otherwise.
## Orbits and Trajectories
By the Picard-Lindelöf theorem, each initial condition $x_0 \in E$ determines a unique solution curve (at least locally). But we have one such curve for every point of $E$. Studying each curve in isolation would give a picture as fragmented as a collection of individual rain tracks. The orbit concept organises all these curves into a coherent geometric structure: each point traces out a curve in the phase space, and the entire phase space is partitioned into these non-crossing curves. This partition — the **phase portrait** — is the primary object of study in the qualitative theory.
The geometric picture of a dynamical system is built from orbits — the curves traced by solutions in the phase space.
[definition: Orbit]
Let $\varphi$ be the flow of $\dot{x} = f(x)$ on $E \subseteq \mathbb{R}^n$. The **orbit** (or **trajectory**) of $\varphi$ through $x_0 \in E$ is the set
\begin{align*}
\mathcal{O}(x_0) = \{\varphi(t, x_0) : t \in \mathbb{R}\}.
\end{align*}
The **forward orbit** is $\mathcal{O}^+(x_0) = \{\varphi(t, x_0) : t \ge 0\}$, and the **backward orbit** is $\mathcal{O}^-(x_0) = \{\varphi(t, x_0) : t \le 0\}$.
[/definition]
A fundamental structural property of autonomous systems is that distinct orbits never cross. Precisely: if $\varphi(t_1, x_1) = \varphi(t_2, x_2)$ for some $t_1, t_2$, then $\mathcal{O}(x_1) = \mathcal{O}(x_2)$. This follows from the uniqueness theorem: if two solutions meet at a common point, the group law forces them to agree everywhere. The phase space is therefore partitioned into disjoint orbits.
<!-- illustration-needed: a phase portrait in R^2 showing two different trajectories meeting would be impossible — show instead a partition of the plane into non-crossing orbits, with arrows indicating direction of flow -->
## Invariant Sets and Special Orbits
Knowing the phase portrait exists does not tell us where most trajectories go. The phase space may be infinite, and individual trajectories may be complicated. The practical strategy is to identify special subsets of $E$ that the flow cannot escape: once a trajectory enters such a set, it stays there forever. These are **invariant sets**. If we can show that every trajectory eventually enters an invariant set with simple internal structure — say, a set containing only a fixed point or a periodic orbit — we have determined the long-time fate of all solutions without ever writing down an explicit formula.
To see why invariance matters, consider what goes wrong without it: if we try to study a trajectory by confining our attention to some arbitrary compact region, there is no guarantee that the trajectory stays in the region, and our analysis breaks down the moment the trajectory exits. An invariant set is precisely the class of region for which this problem cannot occur.
The key structural concept is that of an invariant set.
[definition: Invariant Set]
A set $\Lambda \subseteq E$ is **positively invariant** under the flow $\varphi$ if $x \in \Lambda \implies \varphi(t, x) \in \Lambda$ for all $t \ge 0$. It is **negatively invariant** if the same holds for all $t \le 0$. It is **(fully) invariant** if it is both positively and negatively invariant, i.e., if $x \in \Lambda \implies \mathcal{O}(x) \subseteq \Lambda$.
[/definition]
In practice, positive invariance is the most useful notion: it guarantees that once a trajectory enters a region, it remains there for all future time. The standard technique for establishing positive invariance of a compact set $\Lambda$ is to check the boundary condition — at every point of $\partial \Lambda$, the vector field $f$ must point into $\Lambda$ (or be tangent to it). If $f$ points strictly inward everywhere on $\partial \Lambda$, no trajectory can escape, because escape would require the velocity to point outward at the moment of exit.
[example: Invariant Set via Boundary Analysis]
Consider the one-dimensional system $\dot{x} = -x$ on $E = \mathbb{R}$. We claim that the closed interval $\Lambda = [-1, 1]$ is positively invariant. At the boundary point $x = 1$, the vector field evaluates to $f(1) = -1 < 0$, so the flow points strictly inward. At $x = -1$, $f(-1) = 1 > 0$, again pointing inward. Since $f$ is continuous and points strictly into $\Lambda$ on $\partial \Lambda = \{-1, 1\}$, any trajectory starting in $\Lambda$ cannot exit: exiting would require the state to reach $\pm 1$ with the velocity directed outward, contradicting $f(\pm 1) = \mp 1$. Hence $\Lambda$ is positively invariant. (The explicit solution $x(t) = x_0 e^{-t}$ confirms that $|x(t)| \le |x_0| \le 1$ for all $t \ge 0$.)
[/example]
The orbit $\mathcal{O}(x_0)$ is itself invariant for every $x_0$. Among all invariant sets, the simplest are those consisting of a single orbit: a point that does not move, or a trajectory that returns to itself. These are the fixed points and periodic orbits, respectively. Each represents a distinct mode of long-term behaviour that the system can lock into.
[definition: Fixed Point]
A point $x^* \in E$ is a **fixed point** (also called an **equilibrium**, **stationary point**, or **critical point**) of $f$ if $f(x^*) = 0$. The fixed point satisfies $\varphi(t, x^*) = x^*$ for all $t \in \mathbb{R}$, so $\mathcal{O}(x^*) = \{x^*\}$.
[/definition]
A fixed point is the simplest possible invariant set: a single state from which the system never moves. In many systems (for instance, any gradient flow $\dot{x} = -\nabla V(x)$ on a compact domain) every trajectory converges to a fixed point as $t \to \infty$, so finding and classifying fixed points already determines the global dynamics. But not all systems have this property — in the plane, the simplest counter-example is the harmonic oscillator $\ddot{x} + x = 0$, whose trajectories are circles that never approach any fixed point. The next level of complexity is therefore a trajectory that moves but eventually returns to where it started — a **periodic orbit**. Periodic orbits are the dynamical analogue of closed curves in geometry, and they play a central role in the long-time analysis of planar systems.
[definition: Periodic Point and Periodic Orbit]
A point $x_0 \in E$ is a **periodic point** with **period** $T > 0$ if $\varphi(T, x_0) = x_0$ and $\varphi(t, x_0) \ne x_0$ for all $0 < t < T$. The associated **periodic orbit** is the set $\mathcal{C} = \{\varphi(t, x_0) : 0 \le t < T\}$, which is a closed curve in $E$.
[/definition]
Not all periodic orbits are equally important. In a Hamiltonian system, periodic orbits typically come in continuous families (nested closed curves on an energy surface), and perturbing the initial condition slightly just moves to a neighbouring member of the family. Such orbits cannot be the long-time attractor for a generic trajectory. The dynamically significant periodic orbits are the **isolated** ones — those which attract or repel nearby trajectories.
[definition: Limit Cycle]
A periodic orbit $\mathcal{C}$ is a **limit cycle** if it is isolated: there exists a neighbourhood $U$ of $\mathcal{C}$ in $E$ containing no other periodic orbits.
[/definition]
The distinction between a periodic orbit and a limit cycle is important: a periodic orbit can be part of a continuous family (in which case it is not isolated), while a limit cycle stands alone and typically attracts or repels nearby trajectories.
[example: A Family of Periodic Orbits with No Limit Cycle]
Consider the system on $\mathbb{R}^2$:
\begin{align*}
\dot{x} = -y^3, \qquad \dot{y} = x^3.
\end{align*}
The function $H(x, y) = x^4 + y^4$ is a first integral: differentiating along trajectories,
\begin{align*}
\dot{H} = 4x^3 \dot{x} + 4y^3 \dot{y} = 4x^3(-y^3) + 4y^3(x^3) = 0.
\end{align*}
Every level set $\{x^4 + y^4 = c\}$ for $c > 0$ is therefore an orbit. These are closed curves surrounding the origin (which is a fixed point), so all non-equilibrium orbits are periodic. Since every neighbourhood of any such orbit contains other periodic orbits (on adjacent level sets), no orbit is isolated and hence there are no limit cycles.
[/example]
The preceding example shows that the mere existence of periodic orbits tells us little: in a Hamiltonian system every orbit may be periodic, yet the dynamics is far from attracting. To detect genuine long-time behaviour we must identify the isolated periodic orbits — the limit cycles — toward which nearby trajectories converge.
[example: A Limit Cycle]
Consider the system on $\mathbb{R}^2$:
\begin{align*}
\dot{x} = -y + x(1 - x^2 - y^2), \qquad \dot{y} = x + y(1 - x^2 - y^2).
\end{align*}
Converting to polar coordinates $r, \theta$ via $x = r\cos\theta$, $y = r\sin\theta$, one computes $r\dot{r} = x\dot{x} + y\dot{y}$ and $r^2\dot{\theta} = x\dot{y} - y\dot{x}$, giving
\begin{align*}
\dot{r} = r(1 - r^2), \qquad \dot{\theta} = 1.
\end{align*}
The radial equation $\dot{r} = r(1 - r^2)$ has equilibria at $r = 0$ and $r = 1$. For $0 < r < 1$, $\dot{r} > 0$ (orbits spiral outward); for $r > 1$, $\dot{r} < 0$ (orbits spiral inward). The circle $r = 1$ is therefore an isolated periodic orbit — the unique limit cycle.
To confirm that the origin is the only fixed point in Cartesian coordinates: a fixed point requires $\dot{x} = \dot{y} = 0$, hence $\dot{r} = 0$. This gives $r = 0$ (the origin) or $r = 1$. For $r = 1$, we would need $\dot{x} = -y + x \cdot 0 = -y = 0$ and $\dot{y} = x + y \cdot 0 = x = 0$, forcing $x = y = 0$, a contradiction since $r = 1$. So the origin is the only fixed point.
[/example]
The limit cycle example illustrates the general principle: to understand the phase portrait globally, one identifies the special invariant objects — fixed points, limit cycles, connecting orbits — and then determines how the remaining trajectories are organised around them. In the plane, the Poincaré-Bendixson theorem (Chapter 3) will show that these special orbits are essentially the only possible long-time behaviours for bounded trajectories. The connecting orbit definitions below introduce the remaining building blocks of this catalogue.
### Connecting Orbits
When a dynamical system has multiple fixed points, orbits may connect them asymptotically.
[definition: Homoclinic Orbit]
Let $x^* \in E$ be a fixed point of $f$, and suppose there exists $y \ne x^*$ such that
\begin{align*}
\varphi(t, y) \to x^* \quad \text{as } t \to +\infty \quad \text{and} \quad \varphi(t, y) \to x^* \quad \text{as } t \to -\infty.
\end{align*}
Then the orbit $\mathcal{O}(y)$ is called a **homoclinic orbit** (connecting $x^*$ to itself).
[/definition]
<!-- illustration-needed: a homoclinic orbit in R^2 — a loop-shaped trajectory departing from a saddle point x^* and returning to x^* as t -> ±∞, with arrows showing the direction of flow -->
A homoclinic orbit forms a loop in phase space: it departs from a fixed point along the unstable manifold and returns to the same point along the stable manifold, spending infinite time at both ends. These orbits are structurally fragile — a generic perturbation breaks the loop — but when they exist, they are dynamically significant because nearby trajectories can exhibit sensitive dependence on initial conditions. When a system has more than one fixed point, the analogous phenomenon is a trajectory connecting two distinct equilibria.
[definition: Heteroclinic Orbit]
Let $x_0^*, x_1^* \in E$ be two distinct fixed points of $f$, and suppose there exists $y \ne x_0^*, x_1^*$ such that
\begin{align*}
\varphi(t, y) \to x_0^* \text{ as } t \to -\infty \qquad \text{and} \qquad \varphi(t, y) \to x_1^* \text{ as } t \to +\infty.
\end{align*}
Then $\mathcal{O}(y)$ is a **heteroclinic orbit** connecting $x_0^*$ (in the past) to $x_1^*$ (in the future). A finite collection of heteroclinic orbits $\{\mathcal{O}(y_1), \ldots, \mathcal{O}(y_k)\}$ whose endpoints form a cycle is called a **heteroclinic cycle**.
[/definition]
<!-- illustration-needed: a heteroclinic orbit in R^2 — a trajectory running from one saddle point x_0^* to another x_1^*, with arrows on the orbit and the two fixed points labelled -->
## Limit Sets
We know that orbits partition the phase space into non-crossing curves, and that certain special orbits — fixed points, periodic orbits, connecting orbits — serve as organising landmarks. But given a specific trajectory $\varphi(t, x_0)$ that is not itself one of these special orbits, where does it end up as $t \to \infty$? The forward orbit $\mathcal{O}^+(x_0)$ is a curve in $E$, possibly unbounded, possibly dense in some region. To make the question "where does the trajectory go?" precise, we need the concept of the limit set: the collection of all points that the orbit approaches arbitrarily closely as time advances.
[definition: Omega-Limit Set]
Let $\varphi$ be the flow of $\dot{x} = f(x)$ on $E$, and $x \in E$. The **$\omega$-limit set** of $x$ is
\begin{align*}
\omega(x) = \{y \in E : \exists\, t_n \to +\infty \text{ with } \varphi(t_n, x) \to y\}.
\end{align*}
The **$\alpha$-limit set** is defined analogously with $t_n \to -\infty$.
[/definition]
Informally, $\omega(x)$ is the set of all accumulation points of the forward orbit $\mathcal{O}^+(x)$. For example, if a trajectory spirals inward toward a circle, that circle is the $\omega$-limit set.
[quotetheorem:2775]
[explanation: What the Omega-Limit Theorem Does and Does Not Say]
The theorem says that whenever a trajectory is trapped in a bounded region, its set of accumulation points is a non-trivial, structurally coherent object. The boundedness hypothesis is essential: if the orbit escapes to infinity, $\omega(x)$ may be empty. The theorem does not say that $\omega(x)$ is a single point, a single orbit, or even simple in shape — in higher dimensions, $\omega(x)$ can be a complicated fractal set (a strange attractor). In the plane, however, the theorem's conclusion combined with the Jordan curve theorem leads to the Poincaré-Bendixson theorem: a non-empty, compact, connected, invariant set in $\mathbb{R}^2$ that contains no fixed point must contain a periodic orbit, severely restricting what $\omega(x)$ can look like.
[/explanation]
[proof]
The four properties are established separately.
**Non-empty:** Since $\mathcal{O}^+(x)$ is bounded, the sequence $(\varphi(n, x))_{n \ge 1}$ is bounded in $\mathbb{R}^n$. By the Bolzano-Weierstrass theorem it has a convergent subsequence, and the limit belongs to $\omega(x)$ by definition.
**Closed:** Let $(y_k)$ be a sequence in $\omega(x)$ with $y_k \to y$. For each $k$, there is a sequence of times $t_n^{(k)} \to \infty$ with $\varphi(t_n^{(k)}, x) \to y_k$. By a diagonal argument, one extracts a sequence $s_m \to \infty$ with $\varphi(s_m, x) \to y$, so $y \in \omega(x)$.
**Invariant:** Let $y \in \omega(x)$ and $s \in \mathbb{R}$; we must show $\varphi(s, y) \in \omega(x)$. By definition there exists $t_n \to \infty$ with $\varphi(t_n, x) \to y$. The group law gives $\varphi(t_n + s, x) = \varphi(s, \varphi(t_n, x))$. Since $\varphi(s, \cdot)$ is continuous, $\varphi(s, \varphi(t_n, x)) \to \varphi(s, y)$, and $t_n + s \to \infty$, so $\varphi(s, y) \in \omega(x)$.
**Connected:** Suppose for contradiction that $\omega(x) = A \cup B$ with $A, B$ non-empty, open in $\omega(x)$, and disjoint. Choose $a \in A$ and $b \in B$ with associated time sequences $t_n \to \infty$ (with $\varphi(t_n, x) \to a$) and $s_n \to \infty$ (with $\varphi(s_n, x) \to b$). For all sufficiently large $n$, the continuous path $t \mapsto \varphi(t, x)$ from $\varphi(t_n, x)$ (near $A$) to $\varphi(s_n, x)$ (near $B$) must pass through points at positive distance from both $A$ and $B$. These intermediate points accumulate to a point in $\omega(x) \setminus (A \cup B)$, a contradiction.
[/proof]
## Topological Equivalence and Structural Stability
When are two dynamical systems "the same"? The most naive answer — that the vector fields are identical — is far too restrictive. A student of mechanics and a student of biology might arrive at phase portraits that are mirror images of each other, or differ only in the speed at which trajectories are traversed, yet exhibit the same long-time behaviour. On the other hand, insisting that orbits be related by a smooth diffeomorphism (a $C^1$ or $C^\infty$ map) is too strict: a spiral and a set of concentric circles are qualitatively "the same" in that both are families of closed or winding curves around an equilibrium, but no diffeomorphism can flatten a spiral into a circle (since their tangent directions differ). The right equivalence relation for dynamical systems is one that respects orbit structure — preserving which points flow to which — without requiring the time parameterisation or the smooth geometry to match. This is topological equivalence.
Two dynamical systems may differ in their explicit form yet share the same qualitative orbit structure. The correct notion of equivalence for flows is topological rather than algebraic.
[definition: Topological Equivalence of Flows]
Two flows $\varphi^f$ on $E^f \subseteq \mathbb{R}^n$ and $\varphi^g$ on $E^g \subseteq \mathbb{R}^m$ are **topologically equivalent** if there exists a homeomorphism $h \colon E^f \to E^g$ and a continuous function $\tau \colon E^f \times \mathbb{R} \to \mathbb{R}$ that is strictly increasing in its second argument and satisfies $\tau(x, 0) = 0$, such that
\begin{align*}
h(\varphi^f(t, x)) = \varphi^g(\tau(x,t), h(x)) \quad \text{for all } x \in E^f,\; t \in \mathbb{R},
\end{align*}
with the cocycle condition $\tau(x, t_1 + t_2) = \tau(x, t_1) + \tau(\varphi^f(t_1, x), t_2)$. The homeomorphism $h$ maps orbits of $\varphi^f$ to orbits of $\varphi^g$ preserving orientation. The stronger condition $\tau(x, t) = t$ (i.e., the time parameterisation is also preserved) defines **topological conjugacy**.
[/definition]
Topological equivalence maps fixed points to fixed points and periodic orbits to periodic orbits (though periods may differ). It preserves the topological type of every orbit. Topological conjugacy is strictly stronger: it additionally requires that the time taken to traverse corresponding orbit segments be the same.
[example: Two Topologically Conjugate Spiral Systems]
Consider the two systems in polar coordinates:
\begin{align*}
\text{System }f: \quad &\dot{r} = -r, \quad \dot{\theta} = 1;\\
\text{System }g: \quad &\dot{\rho} = -2\rho, \quad \dot{\psi} = 0.
\end{align*}
Integrating explicitly:
\begin{align*}
\varphi^f(t, r_0, \theta_0) = (r_0 e^{-t},\; \theta_0 + t), \qquad \varphi^g(t, \rho_0, \psi_0) = (\rho_0 e^{-2t},\; \psi_0).
\end{align*}
Define $h \colon E^f \to E^g$ by $h(0) = 0$ and $h(r, \theta) = (r^2, \theta + \ln r)$ for $r > 0$. Then
\begin{align*}
h(\varphi^f(t, r_0, \theta_0)) = (r_0^2 e^{-2t},\; \theta_0 + \ln r_0) = \varphi^g(t, h(r_0, \theta_0)).
\end{align*}
Since $\tau(x, t) = t$, the systems are in fact topologically conjugate (not merely equivalent). System $f$ has spiralling orbits (both $r$ and $\theta$ change), while system $g$ has radial orbits ($\theta$ is constant); the homeomorphism $h$ identifies these geometrically different pictures.
[/example]
The map $h$ in the above example needs verification that it is indeed a homeomorphism. For $r > 0$, $h(r, \theta) = (r^2, \theta + \ln r)$: this is a bijection from $\mathbb{R}^2 \setminus \{0\}$ to itself (the inverse sends $(\rho, \psi)$ to $(\sqrt{\rho},\, \psi - \tfrac{1}{2}\ln\rho)$), and both $h$ and $h^{-1}$ are continuous, since they are compositions of continuous functions on $r > 0$. At the origin, $h(0) = 0$ and $h(r, \theta) \to (0, \cdot)$ as $r \to 0$, so continuity at $0$ follows. Hence $h$ is a homeomorphism.
[example: Same Trajectories, Different Time Parameterisations]
Consider the systems
\begin{align*}
\text{System }f: \quad &\dot{r} = 0, \quad \dot{\theta} = 1;\\
\text{System }g: \quad &\dot{r} = 0, \quad \dot{\theta} = r + \sin^2\theta.
\end{align*}
Both systems have $\dot{r} = 0$, so every circle $\{r = c\}$ is invariant and the trajectories of both systems are the same circles traversed in the same direction. Setting $h = \operatorname{id}_{E}$, we see that orbits match up; however, the angular speeds differ ($\dot{\theta} = 1$ vs.\ $\dot{\theta} = r + \sin^2\theta$), so the time parameterisation $\tau$ is non-trivial. The systems are topologically equivalent but not conjugate.
[/example]
The examples above show that topological equivalence is the correct lens for comparing two given systems. But in applications, we rarely know the vector field exactly — measurement error, modelling simplifications, and parameter uncertainty all introduce small perturbations. The natural question is: if we perturb the vector field slightly (in the $C^1$ topology), does the qualitative picture survive? A system for which the answer is "yes" is called structurally stable. Structural stability is the dynamical analogue of transversality in differential topology: it identifies the systems whose behaviour is robust, as opposed to those that sit at a critical boundary where arbitrarily small changes can destroy or create invariant objects.
[definition: Structural Stability]
The vector field $f \colon E \to \mathbb{R}^n$ (defining $\dot{x} = f(x)$) is **structurally stable** if there exists $\varepsilon > 0$ such that for every perturbation $g \colon E \to \mathbb{R}^n$ satisfying
\begin{align*}
|g(x)| + \sum_{i,j} \left|\frac{\partial g_j}{\partial x_i}(x)\right| < \varepsilon \quad \text{for all } x \in E,
\end{align*}
the perturbed vector field $f + g$ is topologically equivalent to $f$.
[/definition]
Structural stability is the appropriate notion of robustness for a dynamical system: if the model equations are only known approximately, we want confidence that small modelling errors do not change the qualitative picture. A system that is not structurally stable sits at a bifurcation — arbitrarily small perturbations can alter the orbit topology, creating or destroying fixed points, limit cycles, or connecting orbits. The classification of structurally unstable systems is the content of bifurcation theory, which appears in later chapters.
These topological and structural concepts now must be realized concretely in the simplest nontrivial setting: two-dimensional phase spaces. Understanding how to classify fixed points and sketch phase portraits in $\mathbb{R}^2$ forms the foundation for all qualitative dynamics.
# 2. Flows in the Plane
The preceding chapter developed the foundations of autonomous dynamical systems in $\mathbb{R}^n$: existence and uniqueness of trajectories, the flow map $\varphi$, and the basic vocabulary of orbits, limit sets, and topological equivalence. This chapter specialises to the plane $\mathbb{R}^2$ and asks a more concrete question: given a nonlinear system $\dot{x} = f(x, y)$, $\dot{y} = g(x, y)$, how can we determine the qualitative structure of the phase portrait without solving the equations explicitly? The answer is linearisation — approximating the system near each fixed point by a linear ODE whose phase portrait we can classify completely. We develop this classification, establish when linearisation faithfully reflects the nonlinear dynamics (the Hartman–Grobman theorem), state the Stable Manifold Theorem for hyperbolic fixed points, and conclude with a systematic procedure for sketching phase portraits.
## Linearisation at a Fixed Point
**Problem.** Given a smooth vector field $f \colon U \subseteq \mathbb{R}^2 \to \mathbb{R}^2$ and a fixed point $x_0$ (i.e. $f(x_0) = 0$), what is the simplest approximation to $\dot{x} = f(x)$ that captures local behaviour near $x_0$?
The Taylor expansion of $f$ about $x_0$ gives
\begin{align*}
f(x) = f(x_0) + Jf_{x_0}(x - x_0) + O(|x - x_0|^2) = Jf_{x_0}(x - x_0) + O(|x - x_0|^2),
\end{align*}
since $f(x_0) = 0$. Setting $y = x - x_0$, the leading-order approximation is the linear system $\dot{y} = Ay$.
[definition: Linearisation]
Let $U \subseteq \mathbb{R}^n$ be open and $f \colon U \to \mathbb{R}^n$ be continuously differentiable. Let $x_0 \in U$ be a fixed point of $\dot{x} = f(x)$, so that $f(x_0) = 0$. The **linearisation** of $f$ about $x_0$ is the linear system
\begin{align*}
\dot{y} = Ay, \qquad y = x - x_0,
\end{align*}
where $A = Jf_{x_0} \in \mathbb{R}^{n \times n}$ is the Jacobian matrix of $f$ evaluated at $x_0$, with entries
\begin{align*}
A_{ij} = \frac{\partial f_i}{\partial x_j}\bigg|_{x = x_0}.
\end{align*}
[/definition]
The linearisation discards the $O(|y|^2)$ corrections and replaces the nonlinear flow near $x_0$ with the linear flow of $\dot{y} = Ay$. Whether this approximation is faithful — in the sense of topological equivalence of the two flows — depends on the eigenvalues of $A$, which is the central question of this chapter.
## Classification of Fixed Points in $\mathbb{R}^2$
**Problem.** Two fixed points can both be stable and yet look completely different: one attracts all nearby trajectories monotonically along straight lines, another winds them inward in spirals, a third repels them along two sharply defined curves. Without a classification, every new system presents an unfamiliar picture and we have no language to describe what we see, let alone to predict it from the equations. The goal of this section is to answer: *given the eigenvalues of the linearisation, what does the phase portrait look like near the fixed point?*
For a planar linear system $\dot{y} = Ay$ with $A \in \mathbb{R}^{2 \times 2}$, the qualitative behaviour near the origin is completely determined by the eigenvalues $\lambda_1, \lambda_2$ of $A$. These satisfy the characteristic equation
\begin{align*}
\lambda^2 - (\operatorname{Tr} A)\lambda + \det A = 0,
\end{align*}
so the eigenvalue type is read off from the trace and determinant of $A$.
We classify the generic cases first, then handle the degenerate ones. The taxonomy arises because the eigenvalues $\lambda_1, \lambda_2$ of $A$ can be real or complex, and if real they can agree or disagree in sign. Each combination produces qualitatively distinct geometry: real eigenvalues of the same sign give all trajectories approaching or receding along preferred directions (nodes); real eigenvalues of opposite sign produce a saddle with one family approaching and one receding; complex eigenvalues with non-zero real part produce spirals whose rate and direction of winding are set by $\operatorname{Im}(\lambda)$ while their stability is set by $\operatorname{Re}(\lambda)$. There is no overlap between these cases — each determines a different topological type of flow.
### Hyperbolic Fixed Points
[definition: Hyperbolic Fixed Point]
A fixed point $x_0$ of $\dot{x} = f(x)$ is **hyperbolic** if all eigenvalues of the linearisation matrix $A = Jf_{x_0}$ have non-zero real part: $\operatorname{Re}(\lambda_i) \neq 0$ for all $i$.
[/definition]
Hyperbolic fixed points are the structurally stable case: small perturbations to $f$ do not change their qualitative type. There are three hyperbolic configurations in the plane.
[definition: Saddle Point]
The fixed point $x_0$ is a **saddle point** if the eigenvalues $\lambda_1, \lambda_2$ of $A$ are real with opposite sign: $\lambda_1 < 0 < \lambda_2$. Equivalently, $\det A < 0$.
[/definition]
Near a saddle, trajectories approach $x_0$ along the eigendirection corresponding to $\lambda_1 < 0$ and recede along the eigendirection corresponding to $\lambda_2 > 0$. The phase portrait exhibits two pairs of separatrix curves through the fixed point; all other nearby trajectories are repelled.
<!-- illustration-needed: saddle point phase portrait — show two straight-line trajectories along the stable eigendirection (approaching the origin) and two along the unstable eigendirection (leaving the origin), with nearby hyperbolic-shaped trajectories -->
[definition: Node]
The fixed point $x_0$ is a **node** if the eigenvalues $\lambda_1, \lambda_2$ of $A$ are real and of the same sign: $\lambda_1, \lambda_2 < 0$ (stable node) or $\lambda_1, \lambda_2 > 0$ (unstable node). Equivalently, $\det A > 0$ and $(\operatorname{Tr} A)^2 > 4\det A$.
[/definition]
For a stable node, all trajectories converge to $x_0$ as $t \to +\infty$, approaching tangentially to the eigendirection of the eigenvalue of smaller absolute value (the "slower" direction). For an unstable node the picture is reversed. When $\lambda_1 = \lambda_2 = \lambda$ and $A$ is diagonalisable (star node), all trajectories are straight lines through $x_0$. When $A$ has a single Jordan block, $A = \begin{pmatrix} \lambda & 1 \\ 0 & \lambda \end{pmatrix}$ (improper node), the system gives
\begin{align*}
y(t) = y_0 e^{\lambda t}, \qquad x(t) = x_0 e^{\lambda t} + y_0 t e^{\lambda t},
\end{align*}
so that $y/x = y_0/(x_0 + y_0 t) \to 0$ as $t \to \pm\infty$: all trajectories become tangent to the $x$-axis.
<!-- illustration-needed: stable node phase portrait — show trajectories spiralling along the slow eigendirection and converging to the origin, contrasting with the star node where all trajectories are radial lines -->
[definition: Focus]
The fixed point $x_0$ is a **focus** (or spiral) if the eigenvalues of $A$ are complex with non-zero real part: $\lambda_{1,2} = \alpha \pm i\omega$ with $\alpha \neq 0$ and $\omega \neq 0$. The focus is **stable** if $\alpha < 0$ and **unstable** if $\alpha > 0$.
[/definition]
To see why trajectories spiral, let $v = v_x + iv_y$ be a complex eigenvector with eigenvalue $\lambda = \alpha + i\omega$. Separating real and imaginary parts of $Av = \lambda v$:
\begin{align*}
Av_x &= \alphav_x - \omegav_y, \\
Av_y &= \alphav_y + \omegav_x.
\end{align*}
The vectors $v_x$ and $v_y$ serve as the axes of spiralling: $\omega$ drives rotation between them and $\alpha$ drives radial growth or decay. Because no real eigenvector exists, there is no preferred direction in the real plane along which the flow points invariantly — trajectories spiral around the fixed point.
[remark: Trace-Determinant Criterion]
The eigenvalue type is read directly from $\operatorname{Tr} A$ and $\det A$:
- $\det A < 0$: saddle.
- $\det A > 0$, $(\operatorname{Tr} A)^2 > 4\det A$: node ($\operatorname{Tr} A \neq 0$) or repeated real eigenvalue.
- $\det A > 0$, $(\operatorname{Tr} A)^2 < 4\det A$, $\operatorname{Tr} A \neq 0$: focus.
- $\det A > 0$, $\operatorname{Tr} A = 0$: centre (non-hyperbolic).
[/remark]
<!-- illustration-needed: trace-determinant diagram — show the $\det A$-vs-$\operatorname{Tr} A$ plane divided into regions: $\det A < 0$ (saddle), $\det A > 0$ above the parabola $(\operatorname{Tr} A)^2 = 4\det A$ (node), $\det A > 0$ below the parabola with $\operatorname{Tr} A \neq 0$ (focus), and $\det A > 0$ on the $\operatorname{Tr} A = 0$ axis (centre). Label each region and the parabola. -->
We further classify hyperbolic fixed points by the sign of $\operatorname{Re}(\lambda_i)$:
[definition: Source, Sink, Saddle]
Let $x_0$ be a hyperbolic fixed point of $\dot{x} = f(x)$ with linearisation eigenvalues $\lambda_1, \lambda_2$.
- $x_0$ is a **sink** (or stable fixed point) if $\operatorname{Re}(\lambda_i) < 0$ for all $i$.
- $x_0$ is a **source** (or unstable fixed point) if $\operatorname{Re}(\lambda_i) > 0$ for all $i$.
- $x_0$ is a **saddle** if one eigenvalue has negative real part and the other has positive real part.
[/definition]
### Non-Hyperbolic Fixed Points: The Centre
When $\operatorname{Tr} A = 0$ and $\det A > 0$, the eigenvalues are purely imaginary, $\lambda_{1,2} = \pm i\omega$ with $\omega > 0$. This is the only non-hyperbolic case in the planar classification that has a standard name.
[definition: Centre]
The fixed point $x_0$ is a **centre** of the linearised system $\dot{y} = Ay$ if the eigenvalues of $A$ are purely imaginary: $\lambda_{1,2} = \pm i\omega$ with $\omega > 0$. All trajectories of the linear system near $x_0$ are closed curves (ellipses in suitable coordinates).
[/definition]
The centre is a degenerate, structurally unstable configuration for the linearisation: it sits on the boundary between stable and unstable spirals. Whether the full nonlinear system also has a centre or instead a focus depends on higher-order terms, as discussed below.
The remaining non-hyperbolic cases arise when $\det A = 0$ (one or both eigenvalues zero): these produce lines of fixed points or the degenerate star/improper node patterns described above, and are treated individually when encountered.
## Fixed Points of Hamiltonian Systems
A class of systems where the classification simplifies significantly is provided by Hamiltonian mechanics.
**Question.** For a planar system arising from a Hamiltonian $H \colon \mathbb{R}^2 \to \mathbb{R}$, what types of fixed points can occur?
A planar Hamiltonian system takes the form
\begin{align*}
\dot{x} = \frac{\partial H}{\partial y}, \qquad \dot{y} = -\frac{\partial H}{\partial x}.
\end{align*}
[quotetheorem:2776]
[citeproof:2776]
[remark: Centres Generic in Hamiltonian Systems]
For Hamiltonian systems the centre is the generic stable configuration: whenever $H$ has a non-degenerate local minimum at $x_0$ (i.e. the Hessian of $H$ is positive definite), the fixed point is a stable centre of the linearisation, and the nonlinear system also exhibits closed orbits (as level curves of $H$). This is in stark contrast to general systems, where the centre is structurally unstable.
[/remark]
## The Effect of Nonlinear Terms
The central question of this section is: when does the linearised phase portrait faithfully represent the nonlinear one?
**Question.** If the linearisation $\dot{y} = Ay$ at a fixed point $x_0$ has a certain qualitative type (saddle, node, focus), does the nonlinear system $\dot{x} = f(x)$ have the same qualitative type near $x_0$?
The answer is affirmative precisely when the fixed point is hyperbolic. This is the content of the Hartman–Grobman theorem, which the course states without full proof.
[quotetheorem:2777]
This theorem is the justification for the entire linearisation programme: for hyperbolic fixed points, the linearisation captures all topological features of the flow (fixed point type, number and directions of stable and unstable manifolds). The theorem fails for non-hyperbolic fixed points, where higher-order terms must be analysed.
[example: Hyperbolicity Cannot Be Dropped]
To see concretely that the theorem fails without hyperbolicity, consider the one-parameter family
\begin{align*}
\dot{x} &= \varepsilon x - y, \\
\dot{y} &= x + \varepsilon y,
\end{align*}
for $\varepsilon \in \mathbb{R}$. The matrix $A_\varepsilon = \begin{pmatrix} \varepsilon & -1 \\ 1 & \varepsilon \end{pmatrix}$ has eigenvalues $\varepsilon \pm i$. For any $\varepsilon > 0$ the fixed point is an unstable focus; for $\varepsilon < 0$ it is a stable focus; for $\varepsilon = 0$ it is a centre with purely imaginary eigenvalues $\pm i$. At $\varepsilon = 0$ the fixed point is non-hyperbolic, and any arbitrarily small perturbation $\varepsilon \neq 0$ changes the topological type from centre to focus — the portrait is not stable. Hartman–Grobman says nothing for $\varepsilon = 0$, and indeed the full nonlinear system at $\varepsilon = 0$ could be either a centre or a focus depending on higher-order terms.
[/example]
### Non-Hyperbolic Case: The Centre–Focus Problem
When $A$ has purely imaginary eigenvalues $\pm i\omega$, the linearisation is a centre, but the nonlinear system need not be. Three outcomes are possible.
[example: Linearised Centre That Becomes a Stable Focus]
Consider the system
\begin{align*}
\dot{x} &= -y - x^3, \\
\dot{y} &= x - y^3.
\end{align*}
The linearisation at the origin has $A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$, which has eigenvalues $\pm i$: a centre. To analyse the nonlinear system, convert to polar coordinates $x = r\cos\theta$, $y = r\sin\theta$. Computing $r\dot{r} = x\dot{x} + y\dot{y}$:
\begin{align*}
r\dot{r} = x(-y - x^3) + y(x - y^3) = -x^4 - y^4.
\end{align*}
Thus $\dot{r} = -(x^4 + y^4)/r$. Since $x^4 + y^4 > 0$ for $(x,y) \neq (0,0)$, we have $\dot{r} < 0$ for all $r > 0$: the radius decreases strictly along every trajectory. The origin is therefore a stable focus of the nonlinear system, not a centre, despite the linearisation predicting a centre.
[/example]
[example: A Nonlinear Centre]
Consider the system
\begin{align*}
\dot{x} &= -y - 2x^2 y, \\
\dot{y} &= x + 2y^2 x.
\end{align*}
This is a Hamiltonian system with $H(x,y) = \frac{1}{2}(x^2 + y^2) + \frac{1}{2}(x^2 + y^2)^2/2$, so every trajectory lies on a level set of $H$. Since $H$ has a global minimum at the origin, all non-trivial nearby orbits are closed curves. The linearisation has eigenvalues $\pm i$ (a centre), and the nonlinear system preserves this: the origin is a genuine nonlinear centre.
[/example]
The contrast between these two examples shows that without hyperbolicity, the linearisation provides only local information and the higher-order terms determine whether a centre or focus occurs.
### Non-Hyperbolic Case: One Zero Eigenvalue
When $A$ has one zero eigenvalue and one non-zero eigenvalue, various degenerate configurations arise depending on the nonlinear terms. Two illustrative examples:
- **Saddle-node.** The system $\dot{x} = x^2$, $\dot{y} = -y$ has linearisation at the origin with eigenvalues $0$ and $-1$. The $x$-axis is not a manifold of equilibria; instead, trajectories approach the origin from one side and recede on the other — a saddle-node configuration.
- **Nonlinear saddle.** The system $\dot{x} = x^3$, $\dot{y} = -y$ has linearisation eigenvalues $0$ and $-1$. The $x$-direction is a centre manifold whose dynamics are governed by $\dot{x} = x^3$, a nonlinear saddle.
When $A = 0$ (both eigenvalues zero), almost any behaviour is possible. One approach is to switch to polar coordinates and find the angles at which $\dot{\theta} = 0$ as $r \to 0$; between consecutive such angles, three distinct behaviours (elliptic, hyperbolic, parabolic sectors) can occur.
## Stable and Unstable Manifolds
**Problem.** The Hartman–Grobman theorem tells us that near a hyperbolic fixed point the nonlinear flow is topologically conjugate to the linear one. But topology is coarse: it tells us the portrait looks like a saddle, a node, or a spiral, but nothing about *which initial conditions* are actually attracted to the fixed point, or how those sets depend on the nonlinear terms. For a saddle, the two stable eigendirections $E^s$ are tangent to the attracting curves — but are those eigendirections themselves invariant under the nonlinear flow? The answer is no: trajectories starting on $E^s$ need not stay on $E^s$, because $E^s$ is a property of the linearisation and the nonlinear corrections bend the actual trajectories away from it. What *is* globally invariant is a smooth curve tangent to $E^s$ at the fixed point — the stable manifold. Identifying it is what determines which initial conditions are captured by the fixed point.
For a hyperbolic fixed point, the linearised phase space decomposes into invariant subspaces. The Stable Manifold Theorem shows this decomposition lifts to invariant manifolds of the nonlinear system.
[definition: Stable, Unstable, and Centre Subspaces]
Let $x_0$ be a fixed point of $\dot{x} = f(x)$ with linearisation matrix $A$. The **stable subspace** $E^s$, **unstable subspace** $E^u$, and **centre subspace** $E^c$ at $x_0$ are the linear subspaces of $\mathbb{R}^n$ spanned respectively by the (generalised) eigenvectors of $A$ whose eigenvalues have real part $< 0$, $> 0$, and $= 0$. These subspaces are $A$-invariant and satisfy $\mathbb{R}^n = E^s \oplus E^u \oplus E^c$.
[/definition]
A hyperbolic fixed point has $E^c = \{0\}$, so $\mathbb{R}^n = E^s \oplus E^u$.
The fundamental theorem of the subject is:
[quotetheorem:2778]
The geometric meaning in $\mathbb{R}^2$ is the most important to absorb: near a hyperbolic saddle point, there are two invariant curves through the fixed point — one along which trajectories approach the fixed point forward in time (the stable manifold, tangent to $E^s$) and one along which they approach backward in time (the unstable manifold, tangent to $E^u$). The theorem guarantees that the eigenvectors of $A$ tell us the tangent directions of these curves, even though the curves themselves are curved by nonlinear terms.
<!-- illustration-needed: stable and unstable manifolds at a saddle — show $E^s$ and $E^u$ as the linearised eigenvector directions at the origin, and $W^s_{\mathrm{loc}}$ and $W^u_{\mathrm{loc}}$ as smooth curves tangent to them, with arrows indicating flow direction along each branch -->
### Computing Stable and Unstable Manifolds by Power Series
The Stable Manifold Theorem is constructive: the manifolds can be approximated locally by power series whose coefficients are determined by plugging the ansatz into the ODE.
**Procedure.** Let $0$ be a hyperbolic fixed point of $\dot{x} = f(x)$ in $\mathbb{R}^2$, with eigenvectors $e_1, e_2$ and eigenvalues $\lambda_1, \lambda_2$.
For each eigenvector $e_i$ with $\lambda_i < 0$ (stable direction): if $(e_i)_x \neq 0$, write the stable manifold as a graph $y = y(x)$ and expand
\begin{align*}
y(x) = a_1 \frac{(e_i)_y}{(e_i)_x} x + a_2 x^2 + a_3 x^3 + \cdots
\end{align*}
(note $a_0 = 0$ since $0$ lies on the manifold, and the linear coefficient is fixed by the tangency condition). Substitute into the constraint $\dot{y} = \dot{x}\, dy/dx$ — where $\dot{x}$ and $\dot{y}$ are given by the ODE — and equate coefficients to solve for $a_2, a_3, \ldots$. If $(e_i)_x = 0$, expand $x = x(y)$ instead. For unstable manifolds, repeat with $\lambda_i > 0$.
[example: Power Series for the Stable Manifold]
Consider the system
\begin{align*}
\dot{x} = x, \qquad \dot{y} = -y + x^2.
\end{align*}
The origin is a fixed point. The linearisation is $A = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$, with eigenvalues $\lambda_1 = 1$ (eigenvector $(1, 0)^\top$) and $\lambda_2 = -1$ (eigenvector $(0, 1)^\top$). This is a hyperbolic saddle.
The system can be solved exactly: $x(t) = x_0 e^t$, $y(t) = \frac{1}{3}x_0^2 e^{2t} + \bigl(y_0 - \frac{1}{3}x_0^2\bigr)e^{-t}$. The two invariant curves $x = 0$ (stable, since $x \to 0$ as $t \to +\infty$ when $x_0 = 0$) and $y = \frac{1}{3}x^2$ (unstable, since $y - \frac{1}{3}x^2 \propto e^{-t} \to 0$) are visible directly.
To verify using the power series method: for the unstable manifold, tangent to $E^u = \operatorname{span}(1,0)^\top$, write $y(x) = a_2 x^2 + a_3 x^3 + \cdots$. The constraint $\dot{y} = \dot{x}\, dy/dx$ gives
\begin{align*}
-y + x^2 &= x \cdot (2a_2 x + 3a_3 x^2 + \cdots).
\end{align*}
Substituting $y = a_2 x^2 + a_3 x^3 + \cdots$ on the left:
\begin{align*}
(-a_2 x^2 - a_3 x^3 + \cdots) + x^2 &= 2a_2 x^2 + 3a_3 x^3 + \cdots.
\end{align*}
Equating coefficients of $x^2$: $1 - a_2 = 2a_2$, so $a_2 = \frac{1}{3}$. Equating $x^3$: $-a_3 = 3a_3$, so $a_3 = 0$; similarly all higher coefficients vanish. The unstable manifold is $y = \frac{1}{3}x^2$, confirming the exact solution.
For the stable manifold, tangent to $E^s = \operatorname{span}(0,1)^\top$, write $x(y) = b_2 y^2 + b_3 y^3 + \cdots$. The analogous computation yields $x(y) = 0$, i.e. the stable manifold is the $y$-axis, again consistent with the exact solution.
[/example]
[remark: Radius of Convergence]
The power series expansion converges only in a neighbourhood of the fixed point. For a system like
\begin{align*}
\dot{x} = x - xy, \qquad \dot{y} = -y + x^2,
\end{align*}
the unstable manifold can be approximated as $y(x) = \frac{1}{3}x^2 + \frac{2}{45}x^4 + O(x^6)$, but the truncated series is not itself a trajectory, and the true unstable manifold cannot be extended globally by this power series — it fails to converge beyond the local neighbourhood. The Stable Manifold Theorem is a local result, and extending manifolds globally requires additional analysis.
[/remark]
## Separatrices
**Problem.** Knowing the local geometry near each fixed point is not enough to understand the global phase portrait. Two initial conditions very close to each other near a saddle can end up in completely different places: one spirals into a stable focus, the other escapes to infinity. How do we determine which initial conditions lead to qualitatively different long-term fates? The answer is to identify the separatrices — curves that act as boundaries between regions with distinct asymptotic behaviour.
At a saddle point in $\mathbb{R}^2$, the stable and unstable manifolds are curves. Because trajectories on opposite sides of a stable or unstable manifold branch diverge to different limit sets, these curves divide the plane into regions with qualitatively different long-time behaviour.
[definition: Separatrix]
A **separatrix** of a flow $\varphi \colon \mathbb{R} \times U \to \mathbb{R}^2$ is a trajectory that separates the phase space into two regions with qualitatively different asymptotic behaviour. The stable and unstable manifolds of a saddle point are the primary examples.
[/definition]
Separatrices are the skeleton of the phase portrait: once their positions are determined, the remaining trajectories are forced into the sectors they delimit. Two trajectories on the same side of a separatrix must have the same qualitative fate (same $\omega$-limit set); two trajectories on opposite sides may not.
<!-- illustration-needed: separatrices of a saddle — show $W^s$ and $W^u$ dividing a neighbourhood of the saddle into four sectors, with arrows indicating inward flow on $W^s$ and outward flow on $W^u$, and sample trajectories in each sector -->
## Sketching Phase Portraits
**Problem.** Most nonlinear systems cannot be solved explicitly, yet we need to understand the global behaviour of trajectories. Given a system $\dot{x} = f(x,y)$, $\dot{y} = g(x,y)$ that we cannot integrate, how do we construct a reliable global picture of the flow — one that correctly shows which trajectories converge, which diverge, and which regions of the plane belong to the basin of attraction of each fixed point? The answer is a systematic procedure that stitches together the local information (fixed point classifications, stable and unstable manifolds) with global constraints (nullclines, symmetries, large-$r$ asymptotics) to produce a qualitatively faithful portrait without ever solving the ODE.
The classification theory and the Stable Manifold Theorem together yield a systematic procedure for constructing qualitatively correct phase portraits of planar systems.
**Procedure for Sketching Phase Portraits of $\dot{x} = f(x,y)$, $\dot{y} = g(x,y)$.**
(1) **Attempt an explicit solution.** If the ODE is separable, exact, or otherwise tractable, solve it. This is rarely possible for genuinely nonlinear systems but should be the first check.
(2) **Locate fixed points.** Find all $(x_0, y_0)$ with $f(x_0, y_0) = 0$ and $g(x_0, y_0) = 0$.
(3) **Classify hyperbolic fixed points.** For each fixed point, compute $A = Jf_{(x_0,y_0)}$, find the eigenvalues from $\lambda^2 - (\operatorname{Tr} A)\lambda + \det A = 0$, and classify as saddle, stable/unstable node, or stable/unstable focus. For saddle points, find the eigenvectors and use the power series method to sketch the local stable and unstable manifolds.
(4) **Handle non-hyperbolic fixed points via polar coordinates.** If a fixed point has purely imaginary or zero eigenvalues, convert to polar coordinates using
\begin{align*}
r\dot{r} &= \dot{x}\, x + \dot{y}\, y, \\
r^2\dot{\theta} &= \dot{y}\, x - \dot{x}\, y,
\end{align*}
and determine the sign of $\dot{r}$ to distinguish stable focus, unstable focus, and centre.
(5) **Locate nullclines.** The $x$-nullcline is $\{f(x,y) = 0\}$ (where $\dot{x} = 0$) and the $y$-nullcline is $\{g(x,y) = 0\}$ (where $\dot{y} = 0$). Fixed points are intersections of nullclines. Between nullclines, the signs of $\dot{x}$ and $\dot{y}$ are constant, constraining the direction of flow.
(6) **Exploit symmetries.** If $f(-x, y) = -f(x,y)$ and $g(-x,y) = g(x,y)$, the system is symmetric under $x \mapsto -x$ and the phase portrait is symmetric about the $y$-axis. Similarly for other reflections or rotations.
(7) **Analyse large-$r$ behaviour.** Use polar coordinates or estimates to determine where trajectories go as $|(x,y)| \to \infty$. This constrains whether orbits are bounded or unbounded.
(8) **Fill in remaining trajectories.** Use the vector field $(\dot{x}, \dot{y})$ and the information collected to sketch orbits between the structural features (fixed points, separatrices, nullclines).
[example: A Saddle-Node Phase Portrait]
Consider the system
\begin{align*}
\dot{x} = x - xy, \qquad \dot{y} = -y + x^2.
\end{align*}
**Fixed points.** Setting $\dot{x} = 0$: $x(1-y) = 0$, so $x = 0$ or $y = 1$. Setting $\dot{y} = 0$: $y = x^2$. The intersection $x = 0$, $y = 0$ gives the fixed point at the origin. The intersection $y = 1$, $y = x^2$ gives $x^2 = 1$, so two further fixed points at $(\pm 1, 1)$.
**Classification of the origin.** The Jacobian at $(0,0)$ is
\begin{align*}
A = \begin{pmatrix} 1 - y & -x \\ 2x & -1 \end{pmatrix}\bigg|_{(0,0)} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.
\end{align*}
Eigenvalues $\lambda_1 = 1$, $\lambda_2 = -1$: the origin is a saddle. The eigenvectors are $(1,0)^\top$ (unstable) and $(0,1)^\top$ (stable). The unstable manifold is tangent to the $x$-axis and the stable manifold is the $y$-axis itself (since $\dot{x}|_{x=0} = 0$, so the $y$-axis is invariant).
**Symmetry.** The system satisfies $f(-x,y) = -f(x,y)$ and $g(-x,y) = g(x,y)$, so the phase portrait is symmetric under $x \mapsto -x$.
**Nullclines.** The $x$-nullcline is $x(1-y) = 0$, i.e. the $y$-axis $\{x = 0\}$ and the horizontal line $\{y = 1\}$. The $y$-nullcline is $y = x^2$, a parabola. In the region $x > 0$, $0 < y < 1$: $\dot{x} > 0$ and $\dot{y} > 0$ for $x^2 > y$, pointing up and to the right. Crossing into $y > x^2$ reverses the sign of $\dot{y}$.
**Large-$r$ behaviour.** For $|(x,y)| \to \infty$ with $|y| \ll |x|$, the dominant term in $\dot{x}$ is $x$ (growth) and $\dot{y} \approx x^2$ (growth), so trajectories escape to infinity in this regime. On the other hand, along the $y$-axis ($x = 0$), $\dot{x} = 0$ and $\dot{y} = -y$, so the $y$-axis is an invariant line along which all trajectories converge to the origin as $t \to +\infty$ — confirming it is the global stable manifold of the origin.
**Local manifolds at $(\pm 1, 1)$.** By symmetry it suffices to analyse $(1,1)$. The Jacobian there is
\begin{align*}
A = \begin{pmatrix} 1-y & -x \\ 2x & -1 \end{pmatrix}\bigg|_{(1,1)} = \begin{pmatrix} 0 & -1 \\ 2 & -1 \end{pmatrix}.
\end{align*}
The characteristic equation is $\lambda^2 + \lambda + 2 = 0$, with roots $\lambda = \frac{-1 \pm \sqrt{-7}}{2} = -\tfrac{1}{2} \pm \tfrac{\sqrt{7}}{2}i$. Since $\operatorname{Re}(\lambda) = -\tfrac{1}{2} < 0$, the point $(1,1)$ is a stable focus. Nearby trajectories spiral inward, and by symmetry $(-1,1)$ is also a stable focus.
[/example]
[remark: Global Manifolds from Local Computation]
The local stable and unstable manifolds computed by power series can be extended globally: once a trajectory is found starting tangent to $E^s$ or $E^u$, it can be followed forward or backward in time to trace the global manifold. In practice, one combines the local power series approximation near the fixed point with the qualitative flow information (nullclines, symmetries, behaviour at infinity) to trace the global picture.
[/remark]
With the geometry of phase portraits now in hand, we shift from describing what the dynamics looks like to quantifying stability rigorously. Lyapunov functions and basins of attraction provide the tools to determine precisely which invariant sets attract or repel nearby solutions.
# 3. Stability
The preceding chapter developed qualitative tools for understanding the local geometry of flows near fixed points — eigenvalue classification, linearisation, stable and unstable manifolds. These tools answer the question of *what* the phase portrait looks like near an equilibrium. This chapter asks a different question: given a fixed point or more general invariant set, what happens to a trajectory that starts *close* but not exactly on it? Does it stay close? Does it converge? These are the questions of stability theory.
We begin by giving precise definitions of several distinct stability concepts — Lyapunov stability, quasi-asymptotic stability, and asymptotic stability — which capture different aspects of the qualitative behaviour near an invariant set. We then introduce the central tool of the chapter: Lyapunov functions. A Lyapunov function is a scalar-valued function on phase space whose level sets trap trajectories, allowing us to prove stability without ever solving the differential equation. We prove Lyapunov's two main stability theorems and La Salle's invariance principle, which extends the method to cases where the Lyapunov function is not strictly decreasing. The chapter closes with bounding functions, which are used to confine trajectories to compact regions even when a full Lyapunov argument is not available.
## Definitions of Stability
**Problem.** Suppose $\Lambda \subset E$ is an invariant set of the autonomous system $\dot{x} = f(x)$. If we perturb the initial condition slightly away from $\Lambda$, does the resulting trajectory remain near $\Lambda$? Does it return to $\Lambda$? The answer depends on which of several distinct properties $\Lambda$ possesses, and getting the definitions right is essential.
To make the question precise we need to measure proximity to a set.
[definition: Neighbourhood of a Set]
Let $E \subseteq \mathbb{R}^n$ and let $\Lambda \subset E$. For $\delta > 0$, the **$\delta$-neighbourhood** of $\Lambda$ is
\begin{align*}
N_\delta(\Lambda) := \{ x \in E : \exists\, y \in \Lambda \text{ such that } |x - y| < \delta \}.
\end{align*}
[/definition]
This is precisely the open $\delta$-tube around $\Lambda$, the set of all points within Euclidean distance $\delta$ of some point of $\Lambda$. We also need a notion of a trajectory *tending to* a set.
[definition: Flow Tending to a Set]
Let $\phi_t \colon E \to E$ be the flow of $\dot{x} = f(x)$. We say that $\phi_t(x) \to \Lambda$ as $t \to \infty$ if
\begin{align*}
\inf_{y \in \Lambda} |\phi_t(x) - y| \to 0 \quad \text{as } t \to \infty.
\end{align*}
[/definition]
The infimum $\inf_{y \in \Lambda} |\phi_t(x) - y|$ is the distance from the trajectory point $\phi_t(x)$ to the set $\Lambda$. The condition requires this distance to shrink to zero. Requiring all trajectories starting near $\Lambda$ to eventually tend to $\Lambda$ is, however, too strong for many applications: there are systems where trajectories beginning near an equilibrium do not converge but also do not drift away. The three definitions below separate these behaviours.
[definition: Lyapunov Stability]
An invariant set $\Lambda \subset E$ is **Lyapunov stable** if for every $\epsilon > 0$ there exists $\delta > 0$ such that
\begin{align*}
x \in N_\delta(\Lambda) \implies \phi_t(x) \in N_\epsilon(\Lambda) \quad \text{for all } t \geq 0.
\end{align*}
[/definition]
The intuition is: *start near $\Lambda$, stay near $\Lambda$*. The $\epsilon$-$\delta$ structure mirrors the classical definition of continuity — given any tolerance $\epsilon$ on the future position, we can find a restriction $\delta$ on the initial perturbation that guarantees the trajectory stays within that tolerance for all future time. Importantly, Lyapunov stability makes no claim about convergence; the trajectory need not approach $\Lambda$, only remain confined.
<!-- illustration-needed: Lyapunov stability — show an invariant set Lambda with an epsilon-tube and a delta-tube, with a trajectory starting in the delta-tube and remaining inside the epsilon-tube for all positive time -->
[definition: Quasi-Asymptotic Stability]
An invariant set $\Lambda \subset E$ is **quasi-asymptotically stable** if there exists $\delta > 0$ such that
\begin{align*}
x \in N_\delta(\Lambda) \implies \phi_t(x) \to \Lambda \text{ as } t \to \infty.
\end{align*}
[/definition]
The intuition is: *start near $\Lambda$, tend to $\Lambda$*. Quasi-asymptotic stability asserts eventual convergence to $\Lambda$, but says nothing about whether the trajectory remains close along the way. A trajectory might first wander far from $\Lambda$ before eventually returning — quasi-asymptotic stability does not forbid this.
<!-- illustration-needed: Quasi-asymptotic stability — show a trajectory that initially moves away from Lambda before converging back, contrasted with Lyapunov stability where the trajectory stays confined -->
These two properties are logically independent. A system can be Lyapunov stable without being quasi-asymptotically stable (a centre is the canonical example), and conversely quasi-asymptotically stable without being Lyapunov stable. The combination of both is the most useful notion in practice.
[definition: Asymptotic Stability]
An invariant set $\Lambda \subset E$ is **asymptotically stable** if it is both Lyapunov stable and quasi-asymptotically stable.
[/definition]
Asymptotic stability means that trajectories starting sufficiently close to $\Lambda$ both remain close at all times and converge to $\Lambda$ as $t \to \infty$. This is the gold standard for equilibria: it guarantees that small perturbations are corrected over time without the system first undergoing large excursions.
[definition: Unstable Invariant Set]
An invariant set $\Lambda$ is **unstable** if it is neither Lyapunov stable nor quasi-asymptotically stable.
[/definition]
These four definitions partition the possible long-term responses to perturbation: a set either keeps trajectories nearby (Lyapunov stable), attracts them back (quasi-asymptotically stable), does both (asymptotically stable), or does neither (unstable). The following example shows that asymptotic stability does not require the norm to decrease monotonically — a trajectory can make a transient excursion before converging, and the Lyapunov stability condition is satisfied through a uniform bound rather than pointwise decay.
[example: Asymptotic Stability with Transient Growth]
Consider the system
\begin{align*}
\begin{pmatrix} \dot{x} \\ \dot{y} \end{pmatrix} = \begin{pmatrix} y - \mu x \\ -2\mu y \end{pmatrix}, \quad \mu > 0.
\end{align*}
The system is linear, and its solution with initial condition $(x_0, y_0)$ is
\begin{align*}
y(t) &= y_0 e^{-2\mu t}, \\
x(t) &= x_0 e^{-\mu t} + y_0 \mu^{-1}(e^{-\mu t} - e^{-2\mu t}).
\end{align*}
We verify asymptotic stability of the origin. For Lyapunov stability, we need a bound on $|x(t)|$ in terms of the initial data. Using the explicit solution and the triangle inequality:
\begin{align*}
|x(t)| \leq |x_0| e^{-\mu t} + |y_0| \mu^{-1} e^{-\mu t} \leq |x_0| + \mu^{-1} |y_0|.
\end{align*}
Therefore
\begin{align*}
x(t)^2 + y(t)^2 \leq \left(x_0^2 + y_0^2\right)\left(1 + \mu^{-1} + \frac{\mu^{-2}}{16}\right),
\end{align*}
which shows that the norm of the solution is uniformly bounded by a constant multiple of the initial norm. Given $\epsilon > 0$, choose $\delta$ small enough that $\delta^2 (1 + \mu^{-1} + \mu^{-2}/16) < \epsilon^2$; then any trajectory starting within $\delta$ of the origin stays within $\epsilon$. This proves Lyapunov stability.
For quasi-asymptotic stability, both $x(t)$ and $y(t)$ contain only decaying exponentials, so $x(t) \to 0$ and $y(t) \to 0$ as $t \to \infty$ for any initial condition. Hence the origin is asymptotically stable.
This example is instructive precisely because when $\mu$ is small, the solution can attain large values — the $\mu^{-1}$ factor in $x(t)$ can make the transient excursion much larger than the initial perturbation — before eventually decaying. Lyapunov stability is confirmed not by monotone decay, but by the uniform bound. Requiring the norm to decrease monotonically is strictly stronger and would exclude this example.
[/example]
When there are multiple asymptotically stable invariant sets, each trajectory is attracted to one of them. The set of initial conditions leading to a particular limit set is characterised by the basin of attraction.
[definition: Basin of Attraction]
If $\Lambda$ is an asymptotically stable invariant set, the **basin of attraction** of $\Lambda$ is
\begin{align*}
\mathcal{B}(\Lambda) := \{ x \in E : \phi_t(x) \to \Lambda \text{ as } t \to \infty \}.
\end{align*}
[/definition]
The basin of attraction measures the reach of $\Lambda$: it is the collection of all initial conditions whose long-time evolution is governed by $\Lambda$. In the simplest situation — a single asymptotically stable equilibrium — the basin may fill the entire phase space, leading to the following strengthened notion.
[definition: Globally Attracting Set]
If $\mathcal{B}(\Lambda) = \mathbb{R}^n$, then $\Lambda$ is called **globally attracting** (or **globally stable**).
[/definition]
The basin of attraction can have a highly intricate geometric structure when there are multiple stable invariant sets, with fractal boundaries separating different basins. For a single isolated fixed point $x^* \in E$ with $f(x^*) = 0$, the stability can often be read off from the eigenvalues of the Jacobian matrix $Jf_{x^*}$.
[quotetheorem:2779]
This theorem follows from the topological conjugacy of the nonlinear flow near a hyperbolic fixed point with its linearisation, established in Chapter 2. The linearisation analysis of eigenvalues thus directly determines stability for hyperbolic points. However, when $x^*$ is non-hyperbolic, or when we want quantitative information about the basin of attraction, or when the system is too complicated for explicit linearisation, we need a more powerful method. This is provided by Lyapunov functions.
## Lyapunov Functions
**Problem.** For a complicated nonlinear system $\dot{x} = f(x)$, how can we certify stability of an equilibrium without solving the ODE? The answer is to find a scalar function $\mathcal{V}$ on phase space that plays the role of an "energy": if $\mathcal{V}$ is positive everywhere away from the equilibrium and decreases along every trajectory, then the trajectories cannot escape to large values, and in fact must converge.
[definition: Lyapunov Function]
Let $E \subseteq \mathbb{R}^n$ be an open neighbourhood of the origin, and suppose $f \colon E \to \mathbb{R}^n$ is continuously differentiable with $f(0) = 0$. A function $\mathcal{V} \colon E \to \mathbb{R}$ is a **Lyapunov function** for the system $\dot{x} = f(x)$ on $E$ if:
1. $\mathcal{V} \in C^1(E)$,
2. $\mathcal{V}(0) = 0$,
3. $\mathcal{V}(x) > 0$ for all $x \in E \setminus \{0\}$,
4. $f(x) \cdot \nabla \mathcal{V}(x) \leq 0$ for all $x \in E$.
[/definition]
Condition (4) is the crucial one. Since $\dot{\mathcal{V}}(x) := \frac{d}{dt}\mathcal{V}(\phi_t(x))\big|_{t=0} = \nabla \mathcal{V}(x) \cdot f(x)$, the condition $f \cdot \nabla \mathcal{V} \leq 0$ means that $\mathcal{V}$ is non-increasing along every trajectory. The following proposition clarifies the equivalence between the pointwise condition on $\dot{\mathcal{V}}$ and the global condition on the flow.
[quotetheorem:2780]
[citeproof:2780]
This equivalence is useful because the pointwise condition $\dot{\mathcal{V}} \leq 0$ is far easier to check in practice — it reduces to an algebraic inequality on $f$ and $\nabla \mathcal{V}$ — whereas the global monotonicity condition is what drives the geometric argument. The following example works out both conditions explicitly for the simplest possible case, confirming that they agree and showing how the verification proceeds in concrete terms.
[example: Lyapunov Function for a Linear Decay]
Consider the scalar system $\dot{x} = -ax$ with $a > 0$ and state space $E = \mathbb{R}$. The flow is $\phi_t(x) = x e^{-at}$. Take
\begin{align*}
\mathcal{V}(x) = \frac{1}{2} x^2.
\end{align*}
We verify the three conditions. First, $\mathcal{V}(0) = 0$. Second, $\mathcal{V}(x) = \frac{1}{2}x^2 > 0$ for $x \neq 0$. Third, computing the time derivative along trajectories:
\begin{align*}
\dot{\mathcal{V}}(x) = \frac{d\mathcal{V}}{dx} \cdot f(x) = x \cdot (-ax) = -ax^2 \leq 0,
\end{align*}
with equality only at $x = 0$. So $\mathcal{V}$ is a Lyapunov function. We can also verify directly from the flow:
\begin{align*}
\mathcal{V}(\phi_t(x)) = \frac{1}{2}x^2 e^{-2at} < \frac{1}{2} x^2 = \mathcal{V}(x)
\end{align*}
for all $x \neq 0$ and $t > 0$, confirming the non-increasing condition globally.
[/example]
The power of Lyapunov functions lies in the following two theorems, which translate properties of $\mathcal{V}$ into stability conclusions for the origin.
[quotetheorem:2781]
[citeproof:2781]
<!-- illustration-needed: Lyapunov's First Theorem proof — show the origin inside a sublevel set U_1 = {V < alpha} inside the ball B(0, epsilon), with alpha defined as the minimum of V on the sphere |x| = epsilon. Show a trajectory starting in B(0,delta) confined to U_1 -->
The proof shows that the sublevel sets $\{\mathcal{V}(x) \leq c\}$ are forward-invariant: once a trajectory enters a sublevel set, it remains there. The origin sits at the bottom of these nested sublevel sets. When $\mathcal{V}$ is strictly decreasing — not merely non-increasing — along trajectories, the trajectory is not merely confined but actually driven toward the origin.
[quotetheorem:2782]
[citeproof:2782]
The compactness argument in the proof — finding a uniform lower bound $b$ on $|\dot{\mathcal{V}}|$ over the annulus $A_{\alpha,\epsilon}$ — is the key step that converts a pointwise strict inequality into a global contradiction. It is worth pausing to understand what the level sets of $\mathcal{V}$ look like geometrically, since this picture will recur throughout the rest of the chapter.
[remark: Geometric Interpretation]
The level sets $\{\mathcal{V}(x) = c\}$ for varying $c > 0$ form a nested family of closed surfaces surrounding the origin. When $\dot{\mathcal{V}} \leq 0$, every trajectory moves inward through this family (or tangentially to a level set), never outward. When $\dot{\mathcal{V}} < 0$ strictly, the trajectory must cross every level set and therefore converge to the only point inside all of them: the origin.
[/remark]
<!-- illustration-needed: Lyapunov level sets — show the origin surrounded by nested closed curves {V = c1}, {V = c2}, {V = c3} with c1 < c2 < c3, and a trajectory that crosses each curve inward, spiraling toward the origin -->
The two Lyapunov theorems so far cover the case where we want to certify stability. The same framework can be turned around to certify instability, using the fact that instability is stability of the time-reversed flow. The following remark records this duality, which is occasionally the quickest route to an instability proof when a direct argument is unwieldy.
[remark: Instability via Time Reversal]
To prove instability of an equilibrium using a Lyapunov-type argument, reverse the sense of time: replace $t$ by $-t$ and $f$ by $-f$. A strict Lyapunov function for the time-reversed system provides a function that is strictly increasing along trajectories of the original system, certifying instability.
[/remark]
## La Salle's Invariance Principle
**Problem.** Lyapunov's Second Theorem requires $\dot{\mathcal{V}} < 0$ everywhere away from the origin. In many systems of interest — particularly those with conserved quantities or symmetries — one can only achieve $\dot{\mathcal{V}} \leq 0$ with equality on a set larger than just the origin. Can asymptotic stability still be concluded?
The answer is yes, provided the set where $\dot{\mathcal{V}} = 0$ contains no invariant trajectories other than the origin itself. This is the content of La Salle's invariance principle, which locates the $\omega$-limit set of every trajectory not in terms of the full Lyapunov function decrease, but in terms of where $\mathcal{V}$ eventually stabilises.
[quotetheorem:2783]
[citeproof:2783]
The theorem as stated locates $\omega$-limit sets inside the level-set-invariant family $M_c$, which is precise but somewhat indirect to use. In applications one almost never works with $M_c$ directly; instead, one passes to the set where $\dot{\mathcal{V}}$ vanishes, which is typically characterised by an explicit equation on the phase variables. The following remark restates the conclusion in this more operational form.
[remark: The Useful Reformulation]
In practice La Salle is applied as follows. Let $S := \{y \in D : \dot{\mathcal{V}}(y) = 0\}$. If $y \in \omega(x) \subseteq M_c$, then $\mathcal{V}(\phi_t(y)) = c$ for all $t \geq 0$, which differentiating gives $\dot{\mathcal{V}}(\phi_t(y)) = 0$ for all $t \geq 0$. In particular $y \in S$. Since $\omega(x)$ is itself invariant (Chapter 1), it is an invariant subset of $S$. Therefore:
\begin{align*}
\phi_t(x) \to \text{largest invariant subset of } S \cap D \quad \text{as } t \to \infty.
\end{align*}
If the largest invariant subset of $S \cap D$ is $\{0\}$, then every trajectory in $D$ converges to the origin.
[/remark]
The following corollary is the form in which La Salle's principle is most often applied.
[quotetheorem:2784]
[citeproof:2784]
The key observation in the proof is that knowing the $\omega$-limit set is $\{0\}$ (via La Salle) only gives convergence along a subsequence, while Lyapunov stability upgrades subsequential convergence to full convergence.
### Worked Examples Using La Salle
The following examples illustrate the general strategy for applying La Salle's principle.
[example: Perturbed Hamiltonian System]
Consider the system
\begin{align*}
\dot{x} &= y - xy^2, \\
\dot{y} &= -x^3.
\end{align*}
Without the term $-xy^2$, this would be the Hamiltonian system with $H(x, y) = \frac{1}{2}y^2 + \frac{1}{4}x^4$. We compute $\dot{H}$ along trajectories of the actual system:
\begin{align*}
\dot{H} &= y \dot{y} + x^3 \dot{x} = y(-x^3) + x^3(y - xy^2) = -x^4 y^2 \leq 0.
\end{align*}
Since $H \geq 0$ and $H(x,y) = 0$ iff $(x,y) = (0,0)$, the function $H$ is a (non-strict) Lyapunov function on $\mathbb{R}^2$. For any initial condition $x_0$, the domain $\mathcal{D} := \{x : H(x) \leq H(x_0)\}$ is compact, forward-invariant (since $\dot{H} \leq 0$), and closed. La Salle's Invariance Principle gives
\begin{align*}
\omega(x_0) \subseteq \text{largest invariant subset of } \{(x,y) : \dot{H}(x,y) = 0\} \cap \mathcal{D}.
\end{align*}
The set $\{\dot{H} = 0\} = \{x = 0\} \cup \{y = 0\}$. On the line $x = 0$: the equations give $\dot{x} = y$ and $\dot{y} = 0$, so $y$ is constant and $x(t) = yt$. For this to stay in $\{x = 0\}$ requires $y = 0$. On the line $y = 0$: the equations give $\dot{x} = 0$ and $\dot{y} = -x^3$, so $x$ is constant. For the trajectory to remain in $\{y = 0\}$ requires $\dot{y} = -x^3 = 0$, hence $x = 0$. The only invariant subset of $\{\dot{H} = 0\}$ is therefore $\{(0,0)\}$. By La Salle, $\omega(x_0) = \{0\}$ for every $x_0$, so the origin is a global attractor.
[/example]
The perturbed Hamiltonian example illustrates La Salle in a global setting where the entire phase plane is available as a forward-invariant domain. In many systems, however, the dissipation $\dot{\mathcal{V}} \leq 0$ only holds in a bounded region, and we must restrict attention to a sublevel set that sits inside that region. The next example shows how to carry out this restriction systematically and how the choice of sublevel set directly yields an estimate of the basin of attraction.
[example: Estimating the Basin of Attraction via Sublevel Sets]
Consider the system
\begin{align*}
\dot{x} &= y + \mu\!\left(\tfrac{1}{3}x^3 - x\right), \\
\dot{y} &= -x,
\end{align*}
with $\mu > 0$. We try the candidate $\mathcal{V}(x, y) = x^2 + y^2$, which is a natural choice when $\dot{x}$ involves $y$ and $\dot{y}$ involves $-x$, since the cross-terms cancel. Computing:
\begin{align*}
\dot{\mathcal{V}} &= 2x\dot{x} + 2y\dot{y} = 2x\!\left(y + \mu\!\left(\tfrac{1}{3}x^3 - x\right)\right) + 2y(-x) \\
&= 2\mu x^2 \!\left(\tfrac{1}{3}x^2 - 1\right).
\end{align*}
This is non-positive when $|x| \leq \sqrt{3}$. Choose the domain $\mathcal{D} := \{(x,y) : \mathcal{V}(x,y) \leq 3\}$. On $\mathcal{D}$ we have $x^2 \leq x^2 + y^2 \leq 3$, so $|x| \leq \sqrt{3}$, confirming $\dot{\mathcal{V}} \leq 0$ on $\mathcal{D}$. The set $\mathcal{D}$ is compact (a closed disk of radius $\sqrt{3}$) and forward-invariant.
La Salle's principle gives $\phi_t(x) \to$ largest invariant subset of $\{\dot{\mathcal{V}} = 0\} \cap \mathcal{D}$. The set $\{\dot{\mathcal{V}} = 0\}$ consists of the lines $x = 0$ and $x = \pm\sqrt{3}$.
We claim the only invariant subset within $\mathcal{D}$ is the origin. On the line $x = 0$: the equation gives $\dot{x} = y$, so the trajectory stays in $\{x = 0\}$ only if $y = 0$, whence $(x,y) = (0,0)$. The points $(\pm\sqrt{3}, 0)$ lie on $\partial\mathcal{D}$ and have $\dot{x} = \pm\sqrt{3}\mu(1 - 1) = 0$ and $\dot{y} = \mp\sqrt{3} \neq 0$, so they are not invariant. For $x = \pm\sqrt{3}$ with $y \neq 0$, the trajectory immediately leaves the line $x = \pm\sqrt{3}$. Hence the only invariant subset of $\{\dot{\mathcal{V}} = 0\} \cap \mathcal{D}$ is $\{(0,0)\}$, and by La Salle, $\mathcal{D} \subseteq \mathcal{B}(\{0\})$.
[/example]
The method illustrated in these examples generalises to a systematic procedure.
[explanation: General Method for Establishing Stability via La Salle]
To prove asymptotic stability of the origin and estimate its basin of attraction:
1. **Find a candidate $\mathcal{V}$** that is positive definite ($\mathcal{V}(0) = 0$, $\mathcal{V}(x) > 0$ for $x \neq 0$) and $C^1$.
2. **Compute $\dot{\mathcal{V}}$** along trajectories and determine the largest domain $\mathcal{D}$ on which $\dot{\mathcal{V}} \leq 0$.
3. **Find a sublevel set** $C_k := \{x : \mathcal{V}(x) < k\} \subseteq \mathcal{D}$ that is compact and forward-invariant (sublevel sets of $\mathcal{V}$ with $\dot{\mathcal{V}} \leq 0$ on their boundary are forward-invariant).
4. **Identify the invariant subsets of $\{\dot{\mathcal{V}} = 0\} \cap C_k$**. If the only invariant subset is $\{0\}$, then La Salle gives $C_k \subseteq \mathcal{B}(\{0\})$.
5. **Optimise**: try different $\mathcal{V}$ or adjust $k$ to obtain the largest possible estimate of the basin of attraction.
When $\mathcal{V}$ is strictly decreasing — $\dot{\mathcal{V}} < 0$ away from the origin — steps 4 and 5 are automatic, recovering Lyapunov's Second Theorem as a special case of La Salle.
[/explanation]
### A Key Examination Example
The following worked problem from the 2010 Tripos illustrates the full strategy.
[example: 2010 Tripos — Basin of Attraction via La Salle]
Consider the system
\begin{align*}
\dot{x} &= -x + 2y + 2xy - x^2 - 4y^2, \\
\dot{y} &= -y + xy,
\end{align*}
and the candidate Lyapunov function $\mathcal{V}(x,y) = x^2 + \beta y^2$ for constant $\beta > 0$.
**Step 1: Find the values of $\beta$ for which $\mathcal{V}$ is a Lyapunov function near the origin.**
Since $\mathcal{V}(0,0) = 0$ and $\mathcal{V}(x,y) > 0$ for $(x,y) \neq (0,0)$ when $\beta > 0$, it remains to check $\dot{\mathcal{V}} \leq 0$ in a neighbourhood of the origin. Computing:
\begin{align*}
\dot{\mathcal{V}} &= 2x\dot{x} + 2\beta y \dot{y} \\
&= 2x(-x + 2y + 2xy - x^2 - 4y^2) + 2\beta y(-y + xy) \\
&= -2x^2 + 4xy - 2\beta y^2 + \mathcal{O}(3) \\
&= -(x - y)^2 - 2y^2(\beta - 1) + \mathcal{O}(3),
\end{align*}
where $\mathcal{O}(3)$ denotes terms of order three and higher in $(x, y)$.
For $\beta > 1$, both $-(x-y)^2$ and $-2y^2(\beta - 1)$ are non-positive, so $\dot{\mathcal{V}} \leq 0$ in a sufficiently small neighbourhood of the origin, with equality only when $x = y = 0$. Hence $\mathcal{V}$ is a Lyapunov function for $\beta > 1$.
For $\beta = 1$: on the line $x = y$, the quadratic terms vanish and $\dot{\mathcal{V}} = \mathcal{O}(3)$. Substituting $x = y$ gives $\dot{\mathcal{V}} = -4x^3$, which is positive for $x < 0$. Hence $\beta = 1$ does not work.
For $\beta < 1$: setting $x = y$ gives $\dot{\mathcal{V}} \approx -2y^2(\beta - 1) > 0$ for small $y \neq 0$, so $\mathcal{V}$ fails to be non-increasing. Hence the Lyapunov condition holds precisely for $\beta > 1$.
**Step 2: For $\beta = 2$, estimate the basin of attraction using $\mathcal{V} = x^2 + 2y^2$.**
With $\beta = 2$, the full (not just leading-order) expression for $\dot{\mathcal{V}}$ is:
\begin{align*}
\dot{\mathcal{V}} &= 2x(-x + 2y + 2xy - x^2 - 4y^2) + 4y(-y + xy) \\
&= (1 + x)(-2x^2 + 4xy - 4y^2).
\end{align*}
The quadratic factor $-2x^2 + 4xy - 4y^2 = -2(x^2 - 2xy + 2y^2) = -2((x-y)^2 + y^2) < 0$ for $(x,y) \neq (0,0)$. The factor $(1 + x) > 0$ when $x > -1$. Therefore $\dot{\mathcal{V}} < 0$ for all $(x,y) \neq (0,0)$ with $x > -1$.
For $k < 1$, the sublevel set $C_k := \{x^2 + 2y^2 < k\}$ is an open ellipse contained in $\{x > -1\}$ (since $x^2 < k < 1$ implies $|x| < 1$). The closed set $\overline{C}_k$ is compact and, since $\dot{\mathcal{V}} < 0$ on $\partial C_k \subset \{x > -1, (x,y) \neq 0\}$, is forward-invariant. Since $\mathcal{V}$ is strictly decreasing on $\overline{C}_k \setminus \{0\}$, the La Salle Corollary gives $C_k \subseteq \mathcal{B}(\{0\})$.
This holds for every $k \in (0, 1)$, so
\begin{align*}
\bigcup_{k \in (0,1)} C_k = \{(x,y) : x^2 + 2y^2 < 1\} \subseteq \mathcal{B}(\{0\}).
\end{align*}
The basin of attraction of the origin contains the open ellipse $\{x^2 + 2y^2 < 1\}$.
[/example]
## Bounding Functions
**Problem.** The Lyapunov method proves stability of an equilibrium and estimates its basin of attraction. A different but related question is: even if the system does not have a stable equilibrium, can we show that all trajectories remain bounded — confined to some compact region of phase space? This is the role of bounding functions.
Even when a globally defined strict Lyapunov function does not exist, one can sometimes find a function $\mathcal{V}$ that decreases outside a compact region, forcing all trajectories to eventually enter and remain inside that region.
[definition: Bounding Function]
A continuously differentiable function $\mathcal{V} \colon E \to \mathbb{R}$ is a **bounding function** for $\dot{x} = f(x)$ if:
1. For each $k > 0$, the sublevel set $V_k := \{x \in E : \mathcal{V}(x) < k\}$ is a bounded, simply connected domain with $V_k \subset V_{k'}$ whenever $k < k'$.
2. There exists a compact, simply connected domain $D \subset E$ and constants $\delta > 0$, $\kappa > 0$ such that $D \subset V_\kappa$ and
\begin{align*}
\dot{\mathcal{V}}(x) < -\delta < 0 \quad \text{for all } x \in E \setminus D.
\end{align*}
[/definition]
The definition captures a funnelling mechanism: outside the compact core $D$, the function $\mathcal{V}$ is uniformly driven down, so any trajectory starting outside $D$ must eventually descend into the sublevel set $V_\kappa$ that already contains $D$. Once inside $V_\kappa$, the trajectory cannot escape because the decrease condition on $\partial V_\kappa$ prevents exit. The following theorem makes this argument precise.
[quotetheorem:2785]
The proof follows by a two-case analysis on the initial position: for any initial condition, either the trajectory is already in $V_\kappa$, or it lies outside $D$ and $\mathcal{V}$ decreases at rate at least $\delta$ until the trajectory enters $D \subset V_\kappa$. Once inside $V_\kappa$, the trajectory cannot escape because $\dot{\mathcal{V}} < 0$ everywhere on $\partial V_\kappa \setminus D \subseteq E \setminus D$.
<!-- illustration-needed: Bounding function — show the compact domain D inside the sublevel set V_kappa, with the region E\D shaded to indicate where V-dot < -delta < 0. Show a trajectory entering from outside and being funnelled into V_kappa -->
Applying this theorem requires exhibiting a concrete $\mathcal{V}$ and computing $\dot{\mathcal{V}}$ explicitly to identify the compact core $D$ outside which the decrease condition holds. Polar coordinate systems, where the radial and angular dynamics decouple at leading order, are a natural setting for this computation — the condition $\dot{\mathcal{V}} < -\delta$ outside $D$ becomes an inequality on $r$ alone. The following example carries out this programme for a system with $\theta$-dependent perturbations.
[example: Bounding Functions in Polar Coordinates]
Consider the system in polar coordinates:
\begin{align*}
\dot{r} &= r - r^3(1 + k\sin\theta) - r^5, \quad 0 < k < 1, \\
\dot{\theta} &= 1.
\end{align*}
We take the candidate $\mathcal{V}(r, \theta) = \frac{1}{2}r^2$. Then
\begin{align*}
\dot{\mathcal{V}} = r\dot{r} = r^2\bigl(1 - r^2(1 + k\sin\theta) - r^4\bigr).
\end{align*}
Since $\sin\theta \geq -1$, we have $1 + k\sin\theta \geq 1 - k > 0$, and therefore
\begin{align*}
\dot{\mathcal{V}} \leq r^2\bigl(1 - r^2(1-k) - r^4\bigr).
\end{align*}
For $r^2 > \frac{1}{1-k}$, we have $r^2(1-k) > 1$, so
\begin{align*}
\dot{\mathcal{V}} \leq r^2\!\left(1 - 1 - \frac{r^4}{(1-k)r^2/(1-k)}\right) < 0.
\end{align*}
More precisely, when $r^2 \geq \frac{1}{1-k}$:
\begin{align*}
\dot{\mathcal{V}} \leq \frac{1}{1-k}\!\left(1 - 1 - \frac{1}{(1-k)^2}\right) = -\frac{1}{(1-k)^3} < 0.
\end{align*}
Setting $D := \{r \leq \frac{1}{\sqrt{1-k}}\}$, we see that $\dot{\mathcal{V}} < -\frac{1}{(1-k)^3}$ for all $r > \frac{1}{\sqrt{1-k}}$, i.e., outside $D$. By the Bounding Function Theorem, all trajectories eventually enter and remain in the disk $r \leq \frac{1}{\sqrt{1-k}}$.
This confines the long-time dynamics to a compact region where more detailed analysis — for instance via Poincaré-Bendixson theory — can establish the existence of limit cycles.
[/example]
Fixed points are the simplest invariant sets, but they represent only the stationary solutions. Periodic orbits — the next layer of complexity — appear when a trajectory closes on itself, and their stability properties determine whether perturbations grow or decay.
# 4. Existence and Stability of Periodic Orbits
The preceding chapters developed tools for understanding fixed points and their local dynamics through linearisation, stable manifold theory, and Lyapunov functions. Periodic orbits — closed trajectories that the system traverses indefinitely — present a qualitatively different challenge. They cannot be detected by local analysis near a fixed point, and their existence often requires global information about the phase portrait. This chapter develops three categories of tools: index-theoretic methods that constrain where periodic orbits can exist, existence theorems that guarantee their presence under trapping conditions, and Floquet theory which analyses their stability by studying how nearby trajectories evolve over one period. The chapter closes with a detailed treatment of the Van der Pol oscillator, which serves as a testbed for all these techniques simultaneously.
## The Poincaré Index
<!-- illustration-needed: Two simple closed curves in the plane with a vector field drawn at sample points — one curve encircling a node (vector field rotating once counterclockwise, index +1) and one encircling a saddle (alternating stable/unstable directions, index -1). Labels show the angle psi rotating by 2pi vs -2pi. -->
The central question motivating this section is: can we determine, without solving the system, whether a given region of the phase plane can contain a periodic orbit? A periodic orbit is a closed curve along which the vector field is everywhere tangent. Any topological constraint on how a vector field can behave on closed curves will therefore constrain the possible locations of periodic orbits. The Poincaré index provides exactly such a constraint.
[definition: Poincaré Index of a Curve]
Let $f = (f_1, f_2) \colon U \to \mathbb{R}^2$ be a continuously differentiable vector field on an open set $U \subseteq \mathbb{R}^2$. For any simple closed curve $\Gamma \subset U$ that does not pass through any zero of $f$, define the angle function $\psi \colon \Gamma \to \mathbb{R}$ by $\psi = \arctan(f_2 / f_1)$. The **Poincaré index** $I_\Gamma$ is the integer multiple of $2\pi$ in the total change of $\psi$ as $\Gamma$ is traversed once counterclockwise:
\begin{align*}
I_\Gamma = \frac{1}{2\pi} \oint_\Gamma d\psi = \frac{1}{2\pi} \oint_\Gamma \frac{f_1\, df_2 - f_2\, df_1}{f_1^2 + f_2^2}.
\end{align*}
[/definition]
The integral formula for $I_\Gamma$ follows from differentiating $\psi = \arctan(f_2/f_1)$ and computing $d\psi$ in terms of $f_1$ and $f_2$. Since $\Gamma$ is a closed curve and $f$ is nonzero on it, the angle $\psi$ returns to a value differing from its starting value by an integer multiple of $2\pi$; this integer is $I_\Gamma$.
[remark: Relation to the Winding Number]
The Poincaré index should not be confused with the winding number of a planar curve around a point, nor with the argument principle from complex analysis. The crucial difference is that $\psi = \arctan(f_2/f_1)$ measures the angle of the **vector field** $f$ at each point of $\Gamma$, not the angle of the position vector from a base point. All quantities are real, and no complex variable theory is involved.
[/remark]
When $\Gamma$ encloses an isolated zero of $f$ — that is, a fixed point of the dynamical system $\dot{x} = f(x)$ — the index captures topological information about how the vector field winds around that fixed point.
[definition: Poincaré Index of an Isolated Fixed Point]
Let $x_0$ be an isolated fixed point of $\dot{x} = f(x)$, meaning $f(x_0) = 0$ and there exists $r > 0$ such that $f(x) \neq 0$ for all $x \in B(x_0, r) \setminus \{x_0\}$. The **Poincaré index** of $x_0$ is defined as $I_{x_0} := I_\Gamma$ for any simple closed curve $\Gamma$ that encloses $x_0$ and no other fixed point. This value is independent of the particular choice of $\Gamma$, as follows from the homotopy invariance stated below.
[/definition]
The following properties of the index are fundamental. The first two are proved using continuity and homotopy arguments; the third is a direct consequence of the formula for $I_\Gamma$.
[quotetheorem:2786]
[citeproof:2786]
The integer constraint in (i) is what gives the index its topological power: a quantity that varies continuously but takes only integer values must be locally constant, so it cannot change under small deformations of the curve or the vector field. This rigidity is the engine behind every application of the index. Property (ii) provides the contrapositive that makes the index useful in practice — if $I_\Gamma \neq 0$, the curve must enclose at least one zero of $f$. The homotopy invariance (iii) is what allows us to define the index of an isolated fixed point unambiguously, since any two enclosing curves can be continuously deformed into each other without crossing the zero. One subtlety deserves emphasis: the definition of $I_\Gamma$ requires that $f \neq 0$ on $\Gamma$. If $\Gamma$ passes through a zero of $f$, the angle $\psi$ is undefined at that point and the integral diverges; the index is simply not defined for such curves. This is not a technicality — it is the reason why homotopy deformations in (iii) must avoid fixed points, and it is what makes the index constraint theorem below non-trivial: the constraint applies only to curves on which the vector field never vanishes.
[quotetheorem:2787]
[citeproof:2787]
These values immediately yield a powerful necessary condition for the existence of periodic orbits. The index $+1$ shared by nodes and foci does not, by itself, determine the type: a stable focus, an unstable focus, and a stable node all carry the same index. The index only constrains which combinations of fixed points a periodic orbit can enclose, not the type of each fixed point.
[quotetheorem:2788]
[citeproof:2788]
One consequence worth noting: if all fixed points enclosed by a putative periodic orbit are saddles (index $-1$ each), then the total enclosed index is negative or zero, never $+1$, so no periodic orbit can surround only saddle points. Conversely, a single node or focus suffices.
[example: Ruling Out Periodic Orbits via the Index]
Consider the competitive system
\begin{align*}
\dot{r} &= r(3 - r - 2s), \\
\dot{s} &= s(2 - r - s).
\end{align*}
The fixed points in the first quadrant $r, s \geq 0$ are $(0,0)$, $(0,2)$, $(3,0)$ (all nodes), and $(1,1)$ (a saddle). The axes $\{r = 0\}$ and $\{s = 0\}$ are invariant, so no periodic orbit can cross either axis. Thus any periodic orbit in the open first quadrant $r, s > 0$ must encircle some subset of the interior fixed points. The only interior fixed point is the saddle $(1,1)$, which has index $-1$. Since the index of any periodic orbit must equal $+1$, and the only candidate for enclosed fixed points yields index $-1 \neq 1$, there are no periodic orbits in the first quadrant.
[/example]
## The Poincaré–Bendixson Theorem
The index theory of the previous section rules out periodic orbits in certain regions. The Poincaré–Bendixson theorem does the opposite: under appropriate trapping conditions, it guarantees that a periodic orbit must exist. The key hypothesis is the confinement of a forward orbit to a compact region that contains no fixed points. In such a region, the only possible long-time behaviour — by the classification of $\omega$-limit sets in the plane — is a periodic orbit.
[quotetheorem:2789]
The proof of this theorem uses the structure of planar flows in an essential way: in $\mathbb{R}^2$, a flow satisfying the no-crossing property (guaranteed by uniqueness of solutions) cannot exhibit the complicated recurrence behaviour possible in higher dimensions. The key fact is that the $\omega$-limit set must be a non-empty, compact, connected, invariant set containing no fixed points; in $\mathbb{R}^2$, such a set is necessarily a periodic orbit. The proof requires the Jordan Curve Theorem — which gives the inside/outside dichotomy for simple closed curves in $\mathbb{R}^2$ — together with the fact that distinct trajectories cannot cross; in $\mathbb{R}^3$ both ingredients fail, and the theorem has no analogue (the Lorenz attractor is a compact invariant set with no fixed points and no periodic orbits). We state the result without proof.
[remark: Simply Connected Regions Are Excluded]
The compact trapping set $D$ cannot be simply connected (i.e., topologically a disc). If it were, then since $D$ is positively invariant with no fixed points, the index theorem applied to the boundary of $D$ would force the sum of indices of fixed points inside $D$ to equal $+1$, contradicting the assumption that $D$ contains no fixed points. Thus $D$ must have a hole — in practice, it is typically an annular region from which trajectories can neither escape through the outer boundary nor enter through the inner boundary.
[/remark]
<!-- illustration-needed: An annular trapping region D with trajectories pointing inward on the outer boundary and inward on the inner boundary, containing a periodic orbit in between. -->
The practical challenge is constructing the trapping annulus. The outer boundary is typically found by showing that $\dot{r} < 0$ (or energy is decreasing) at large radius. The inner boundary is found near an unstable fixed point: since the origin repels nearby trajectories, there is a small circle on which $\dot{r} > 0$. Any annular region bounded by these two circles is then positively invariant and contains no fixed points — exactly the hypothesis of Poincaré–Bendixson.
[example: Existence of a Limit Cycle via an Annulus]
Consider the system
\begin{align*}
\dot{x} &= x - y - 2x(x^2 + y^2), \\
\dot{y} &= y + x - y(x^2 + y^2).
\end{align*}
The linear part has an unstable focus at the origin, while the cubic terms produce inward forcing at large radius. To quantify this, convert to polar coordinates. Computing $\dot{r} = (x\dot{x} + y\dot{y})/r$ gives
\begin{align*}
\dot{r} = r(1 - r^2 - x^2) = r(1 - r^2 - r^2\cos^2\theta),
\end{align*}
from which
\begin{align*}
r(1 - 2r^2) < \dot{r} < r(1 - r^2).
\end{align*}
The left inequality shows $\dot{r} > 0$ whenever $r < 1/\sqrt{2}$; the right inequality shows $\dot{r} < 0$ whenever $r > 1$. Thus every trajectory entering the annulus
\begin{align*}
C = \left\{(x,y) : \frac{1}{\sqrt{2}} \leq \sqrt{x^2 + y^2} \leq 1\right\}
\end{align*}
remains in $C$ for all future time. To check for fixed points in $C$, compute
\begin{align*}
\dot{\theta} = \frac{x\dot{y} - \dot{x}y}{r^2} = 1 + \frac{1}{2}r^2\sin 2\theta.
\end{align*}
Since $r \leq 1$ in $C$, we have $|\frac{1}{2}r^2\sin 2\theta| \leq \frac{1}{2}r^2 \leq \frac{1}{2}$, so $\dot{\theta} \geq 1 - \frac{1}{2} = \frac{1}{2} > 0$. Thus $\dot{\theta} \neq 0$ throughout $C$ and there are no fixed points. By the Poincaré–Bendixson theorem, there exists a periodic orbit contained in $C$.
[/example]
The annulus argument works because the bounding circles provide clean quantitative estimates for $\dot{r}$. In more complex systems, the inner and outer boundaries may be more elaborate curves, but the principle — find a region from which trajectories cannot escape and which contains no fixed points — remains the same.
[example: Periodic Orbit with Parameter Restriction]
Consider the system, written in polar coordinates,
\begin{align*}
\dot{r} &= r + 2r^2\cos\theta - r^3, \\
\dot{\theta} &= 1 - ar\cos\theta.
\end{align*}
One computes that $\dot{r} > 0$ for $r < \sqrt{2} - 1$ and $\dot{r} < 0$ for $r > \sqrt{2} + 1$, so the annulus $\sqrt{2} - 1 \leq r \leq \sqrt{2} + 1$ is forward invariant. Fixed points inside the annulus satisfy the simultaneous equations $1 + 2r\cos\theta - r^2 = 0$ and $1 - ar\cos\theta = 0$, giving $r^2 = 1 + 2/a$. Such fixed points lie in the annulus only when $a > \sqrt{2} - 1$. Therefore, when $a < \sqrt{2} - 1$, the annulus contains no fixed points, and the Poincaré–Bendixson theorem guarantees the existence of a periodic orbit.
[/example]
## Dulac's Criterion and the Divergence Test
Even when a trapping region exists, we may need to determine whether the periodic orbit inside it is unique. Dulac's criterion addresses the opposite question to Poincaré–Bendixson: it provides conditions that rule out periodic orbits entirely in a region, or limit their number. The key idea is to use the divergence theorem. If a periodic orbit $\Gamma$ exists in a simply connected domain $D$, then the vector field $f$ is tangent to $\Gamma$, so $f \cdot n = 0$ on $\Gamma$ (where $n$ is the outward unit normal). By the divergence theorem applied to the region $A$ enclosed by $\Gamma$,
\begin{align*}
0 = \oint_\Gamma f \cdot n\, d\ell = \int_A \nabla \cdot f\, dA.
\end{align*}
If $\nabla \cdot f$ has one sign throughout $A$, this is a contradiction. Multiplying $f$ by a positive scalar function $\phi$ before applying this reasoning gives Dulac's criterion.
[quotetheorem:2790]
[citeproof:2790]
Dulac's criterion, like all topological methods in this chapter, is a global tool: it says something about entire regions, not about the behaviour near a single point. This is what makes it complementary to Floquet theory, which operates near a specific periodic orbit.
[remark: The Divergence Test as a Special Case]
The choice $\phi \equiv 1$ is called the **divergence test**: if $\nabla \cdot f$ is of one sign throughout $D$, then $D$ contains no periodic orbits. The general criterion with a nontrivial $\phi$ is more powerful because an appropriate choice of $\phi$ can make $\nabla \cdot (\phi\, f)$ one-signed even when $\nabla \cdot f$ changes sign.
[/remark]
The power of Dulac's criterion extends to doubly connected domains, where it can establish uniqueness rather than nonexistence.
[quotetheorem:2791]
[citeproof:2791]
The doubly connected version is particularly useful in population dynamics, where the annular domain models a region with a lower bound on population densities (the hole corresponds to extinction) and an upper bound from carrying capacity.
[example: Lotka–Volterra Type System]
Consider the system
\begin{align*}
\dot{r} &= r(a - br - cs), \quad b, c > 0, \\
\dot{s} &= s(d - er - fs), \quad e, f > 0.
\end{align*}
Take $\phi = (rs)^{-1}$, which is smooth and positive on the open first quadrant $\{r > 0, s > 0\}$. Computing:
\begin{align*}
\nabla \cdot (\phi\, f) = \frac{\partial}{\partial r}\!\left(\frac{a - br - cs}{s}\right) + \frac{\partial}{\partial s}\!\left(\frac{d - er - fs}{r}\right) = -\frac{b}{s} - \frac{f}{r}.
\end{align*}
This expression is strictly negative throughout $\{r > 0, s > 0\}$. By Dulac's criterion, there are no periodic orbits in the first quadrant.
[/example]
The choice $\phi = (rs)^{-1}$ in the Lotka–Volterra example is a Dulac function that works for a broad family of competitive systems. Finding an appropriate $\phi$ is often the creative step; once found, the divergence computation is mechanical.
[example: Damped Pendulum with Constant Torque]
The damped pendulum with constant torque satisfies
\begin{align*}
\dot{\theta} &= p, \\
\dot{p} &= F - kp - \sin\theta,
\end{align*}
with $k > 0$. Taking $\phi \equiv 1$, the divergence test gives
\begin{align*}
\nabla \cdot f = \frac{\partial \dot{\theta}}{\partial \theta} + \frac{\partial \dot{p}}{\partial p} = 0 + (-k) = -k < 0,
\end{align*}
which is negative everywhere. On the cylinder $\mathbb{T} \times \mathbb{R}$ (with $\theta \in \mathbb{R}/2\pi\mathbb{Z}$), the divergence test excludes all periodic orbits that lie in a simply connected domain. The only periodic orbits compatible with this are those that wind around the cylinder (i.e., orbits along which $\theta$ increases by $2\pi$ over one period). Thus there is at most one such orbit, which must encircle the entire cylinder in the $\theta$-direction.
[/example]
## Near-Hamiltonian Flows and the Energy Balance Method
The index and Dulac methods address existence and non-existence of periodic orbits using topological and divergence-based arguments. Neither says anything about *which* level curves survive as isolated limit cycles when the system is nearly integrable. The energy balance method fills this gap: for systems close to a Hamiltonian one, it identifies which unperturbed orbits persist under perturbation by checking whether energy is gained or lost on balance over one circuit.
Pure Hamiltonian systems — those of the form $\dot{x} = H_y$, $\dot{y} = -H_x$ — have $\nabla \cdot f = H_{xy} - H_{xy} = 0$, so the divergence test yields no information about periodic orbits. Moreover, Hamiltonian systems have $dH/dt = 0$, so every level curve of $H$ is invariant. When closed, these level curves form a continuous family of periodic orbits rather than isolated limit cycles. Adding a small perturbation to such a system generically destroys most of these orbits, leaving at most finitely many isolated ones. The energy balance method determines which level curves survive as limit cycles.
### Hamiltonian Systems and Their Properties
Hamiltonian systems occupy a special position in the classification of planar flows. Unlike dissipative systems — where areas in phase space contract under the flow, forcing energy to decrease — Hamiltonian systems are area-preserving. This means isolated limit cycles cannot exist: the flow neither contracts nor expands phase space, so no orbit can attract or repel neighbours. Instead, Hamiltonian systems generically have families of periodic orbits filling open regions, organised by the level curves of the conserved energy function $H$. This structure is rigid and beautiful, but also fragile: any perturbation that breaks the area-preserving property will in general destroy all but finitely many of these orbits.
[definition: Hamiltonian System]
A dynamical system $\dot{x} = f(x)$ with $x = (x, y) \in \mathbb{R}^2$ is **Hamiltonian** with Hamiltonian $H \colon \mathbb{R}^2 \to \mathbb{R}$ if
\begin{align*}
\dot{x} = \frac{\partial H}{\partial y}(x, y), \qquad \dot{y} = -\frac{\partial H}{\partial x}(x, y).
\end{align*}
[/definition]
For a Hamiltonian system, $dH/dt = H_x \dot{x} + H_y \dot{y} = H_x H_y - H_y H_x = 0$, so $H$ is a first integral. Level curves $\{H = c\}$ are invariant under the flow; if they are closed curves, they are periodic orbits. The Jacobian matrix of a Hamiltonian vector field always has trace zero, so fixed points are either saddles or nonlinear centres.
### Near-Hamiltonian Perturbations
When a small perturbation is added to a Hamiltonian system, the rigid structure of concentric periodic orbits is generically destroyed. The divergence of the perturbed field is no longer zero, so trajectories no longer preserve area. Some level curves — those where the perturbation injects and dissipates energy in equal measure over one circuit — persist as isolated limit cycles. The rest either expand outward or contract inward. The energy balance method makes this precise.
[definition: Near-Hamiltonian System]
A **near-Hamiltonian system** is a system of the form
\begin{align*}
\begin{pmatrix} \dot{x} \\ \dot{y} \end{pmatrix} = \begin{pmatrix} f_1(x,y) + \varepsilon g_1(x,y) \\ f_2(x,y) + \varepsilon g_2(x,y) \end{pmatrix},
\end{align*}
where $f_1 = H_y$ and $f_2 = -H_x$ for some $H \in C^2(\mathbb{R}^2)$, the perturbation functions $g_1, g_2$ are continuously differentiable, and $\varepsilon > 0$ is a small parameter.
[/definition]
For this system, the rate of change of $H$ along a trajectory is
\begin{align*}
\frac{dH}{dt} = H_x \dot{x} + H_y \dot{y} = H_x(f_1 + \varepsilon g_1) + H_y(f_2 + \varepsilon g_2).
\end{align*}
Since $f_1 = H_y$ and $f_2 = -H_x$, we have $H_x f_1 + H_y f_2 = H_x H_y + H_y(-H_x) = 0$. Thus
\begin{align*}
\frac{dH}{dt} = \varepsilon(g_1 H_x + g_2 H_y) = \varepsilon(-g_1 f_2 + g_2 f_1).
\end{align*}
This observation directly yields a necessary condition for periodic orbits.
[quotetheorem:2792]
[citeproof:2792]
### The Energy Balance Method
The energy balance method converts the problem of finding limit cycles into a root-finding problem for a single function $M(H_0)$. The idea is that for small $\varepsilon$, the perturbed periodic orbits lie close to the unperturbed ones, so the net energy change over one circuit can be computed approximately by integrating $\dot{H}$ along the unperturbed orbit. An orbit persists if and only if this net change is zero.
For small $\varepsilon$, the periodic orbits of the perturbed system lie close to the level curves $\{H = H_0\}$ of the unperturbed Hamiltonian. Specifically, by continuous dependence on parameters, any periodic orbit $H_\varepsilon$ of the perturbed system is within $O(\varepsilon)$ of some unperturbed level curve $\{H = H_0\}$.
The change in $H$ accumulated over the perturbed orbit $H_\varepsilon$ (a closed curve close to $\{H = H_0\}$) is
\begin{align*}
\Delta H(H_0) := \oint_{H_\varepsilon} dH = \varepsilon \oint_{H_0} (g_2 f_1 - g_1 f_2)\, dt + O(\varepsilon^2),
\end{align*}
where the leading-order approximation replaces the integration contour $H_\varepsilon$ by the unperturbed orbit $\{H = H_0\}$, incurring an $O(\varepsilon)$ error in the integrand integrated over an $O(1)$ length, giving an $O(\varepsilon^2)$ correction.
[definition: Energy Balance Method]
The **energy balance method** for finding limit cycles of near-Hamiltonian flows proceeds as follows. Define the **Melnikov function**
\begin{align*}
M(H_0) := \oint_{\{H = H_0\}} (g_2 f_1 - g_1 f_2)\, dt,
\end{align*}
where the integral is taken along the unperturbed periodic orbit $\{H = H_0\}$. A simple zero of $M$ — a value $H_0^*$ with $M(H_0^*) = 0$ and $M'(H_0^*) \neq 0$ — indicates the existence of a periodic orbit of the perturbed system near $\{H = H_0^*\}$ for sufficiently small $\varepsilon > 0$. If $M$ has no simple zero on an interval of values, there are no periodic orbits near those level curves.
[/definition]
The condition $M'(H_0^*) \neq 0$ is the non-degeneracy that ensures the zero is simple and that the implicit function theorem applies to continue the orbit to small $\varepsilon > 0$. When $M$ vanishes identically on an interval, the method is inconclusive and higher-order terms in $\varepsilon$ must be considered.
[remark: Stability from the Energy Balance Method]
The stability of the periodic orbit found near $\{H = H_0^*\}$ can be read off from the sign of $M'(H_0^*)$. If $M'(H_0^*) < 0$, then trajectories starting at nearby level curves with $H_0 > H_0^*$ have $\Delta H < 0$ (they lose energy) and those with $H_0 < H_0^*$ have $\Delta H > 0$ (they gain energy), so trajectories converge to the periodic orbit, making it stable. If $M'(H_0^*) > 0$, the orbit is unstable.
[/remark]
The sign condition $M'(H_0^*) < 0$ for stability is consistent with the Floquet multiplier formula: since $\nabla \cdot f = \varepsilon(\partial_x g_1 + \partial_y g_2)$ for the near-Hamiltonian system, the Floquet multiplier $\mu_2 = \exp(\int_0^P \nabla \cdot f\, dt)$ is less than $1$ precisely when the orbit is a net energy sink — which is what $M'(H_0^*) < 0$ encodes to leading order in $\varepsilon$.
[example: A Near-Hamiltonian System with One Limit Cycle]
Consider
\begin{align*}
\begin{pmatrix} \dot{x} \\ \dot{y} \end{pmatrix} = \begin{pmatrix} y \\ -x + x^2 + \varepsilon y(a - x) \end{pmatrix}, \qquad \varepsilon > 0.
\end{align*}
For $\varepsilon = 0$ this is a Hamiltonian system with $H(x,y) = \frac{1}{2}(x^2 + y^2) - \frac{1}{3}x^3$. The perturbation has $g_1 = 0$ and $g_2 = y(a - x)$. Since $dH/dt = \varepsilon y^2(a - x)$ along trajectories, the sign of $\dot{H}$ is determined by whether $x < a$ or $x > a$. By the necessary condition for periodic orbits, no periodic orbit can lie entirely in $\{x < a\}$ or entirely in $\{x > a\}$; any periodic orbit must cross the line $x = a$.
The Poincaré index analysis shows that the fixed points are $(0,0)$ (an unstable focus) and $(1,0)$ (a saddle). Any periodic orbit must encircle $(0,0)$ but not $(1,0)$, as the sum of indices would otherwise not equal $+1$.
The Melnikov function is
\begin{align*}
M(H_0) &= \oint_{\{H = H_0\}} g_2 f_1\, dt = \oint_{\{H = H_0\}} y(a-x)\, dx \\
&= 2\int_{x_{\min}}^{x_{\max}} y(a-x)\, dx = 2\int_{x_{\min}}^{x_{\max}} \sqrt{2H_0 - x^2 + \tfrac{2}{3}x^3}\,(a-x)\, dx,
\end{align*}
where the factor of $2$ accounts for symmetry (the orbit is traversed once in each direction in $y$). This integral cannot be evaluated in closed form, but numerical computation shows that for $a = \frac{1}{3}$, there exists a unique $H_0^* \in (0, \frac{1}{6})$ where $M(H_0^*) = 0$ with $M'(H_0^*) \neq 0$. The perturbed system therefore has exactly one limit cycle, lying close to the level curve $\{H = H_0^*\}$.
[/example]
<!-- illustration-needed: Phase portrait of the unperturbed Hamiltonian system showing closed level curves of H, the saddle at (1,0), and the centre at (0,0); overlay the approximate location of H_0* -->
## Stability of Periodic Orbits
A periodic orbit that exists can be stable or unstable. Establishing stability requires more than checking whether nearby initial conditions stay close at some fixed time — one must understand what happens over arbitrarily many periods. Near a fixed point, the constant Jacobian governs the long-term behaviour, and eigenvalues tell the whole story. Near a periodic orbit, the Jacobian varies periodically with time, so eigenvalue analysis at a single point is insufficient. Floquet theory provides the appropriate framework by examining how small perturbations evolve over exactly one period.
### Floquet Theory and the Fundamental Matrix
Consider a periodic orbit $\hat{x}(t)$ of $\dot{x} = f(x)$ with period $P$, so $\hat{x}(t+P) = \hat{x}(t)$. Write a nearby solution as $x(t) = \hat{x}(t) + \xi(t)$, where $|\xi(0)|$ is small. Substituting into $\dot{x} = f(x)$ and expanding:
\begin{align*}
\dot{\hat{x}} + \dot{\xi} = f(\hat{x} + \xi) = f(\hat{x}) + Jf_{\hat{x}}(t)\,\xi + O(|\xi|^2).
\end{align*}
Since $\dot{\hat{x}} = f(\hat{x})$, the perturbation satisfies the **linearised variational equation**
\begin{align*}
\dot{\xi}(t) = A(t)\,\xi(t), \qquad A(t) := Jf_{\hat{x}(t)},
\end{align*}
where $Jf_{\hat{x}(t)}$ is the Jacobian matrix of $f$ evaluated along the orbit. Since $\hat{x}(t)$ is periodic, $A(t)$ is $P$-periodic: $A(t+P) = A(t)$.
Linear ODEs with periodic coefficients have a fundamental structure captured by Floquet theory. The general solution is written as
\begin{align*}
\xi(t) = F(t)\,\xi(0),
\end{align*}
where $F(t)$ is the **fundamental matrix solution**, which satisfies
\begin{align*}
\dot{F}(t) = A(t)\,F(t), \qquad F(0) = I.
\end{align*}
The matrix $F(P)$ — called the **monodromy matrix** — encodes how perturbations evolve over one period: $\xi(P) = F(P)\,\xi(0)$, and more generally $\xi(nP) = F(P)^n\,\xi(0)$.
[definition: Floquet Multipliers]
The **Floquet multipliers** $\mu_1, \ldots, \mu_n$ of a periodic orbit $\hat{x}(t)$ of period $P$ are the eigenvalues of the monodromy matrix $F(P)$, where $F(t)$ is the fundamental matrix solution of $\dot{F}(t) = A(t)F(t)$ with $F(0) = I$ and $A(t) = Jf_{\hat{x}(t)}$.
[/definition]
The Floquet multipliers encode exactly how a perturbation is magnified or shrunk after one complete period. A multiplier $|\mu_i| < 1$ means the corresponding perturbation direction contracts after each period; $|\mu_i| > 1$ means it expands. As the next theorem shows, the value $\mu = 1$ always appears — it corresponds to the direction along the orbit, where a perturbation merely shifts the phase and neither grows nor decays.
[definition: Floquet Exponents]
The **Floquet exponents** (or **Lyapunov exponents**) $\lambda_i$ associated to the Floquet multipliers are defined by $\lambda_i = P^{-1}\log\mu_i$, so that $\mu_i = e^{\lambda_i P}$. They measure the mean exponential rate of growth or decay of perturbations in the direction of the $i$-th eigenspace of $F(P)$.
[/definition]
The Floquet exponents are the continuous-time counterparts of the Floquet multipliers: a negative exponent $\lambda_i < 0$ corresponds to a contracting direction, while a positive exponent $\lambda_i > 0$ corresponds to an expanding one. The exponent $\lambda = 0$ (multiplier $\mu = 1$) always appears, corresponding to the phase direction along the orbit.
[quotetheorem:2793]
[citeproof:2793]
The eigenvector $\dot{\hat{x}}(0)$ corresponds to a perturbation along the orbit itself; such a perturbation merely shifts the phase and does not decay or grow. All stability information is encoded in the remaining Floquet multipliers.
[definition: Hyperbolic Periodic Orbit]
A periodic orbit $\hat{x}(t)$ is **hyperbolic** if all Floquet multipliers except the unit multiplier $\mu = 1$ lie strictly off the unit circle, i.e., $|\mu_i| \neq 1$ for all non-unit multipliers $\mu_i$.
[/definition]
Hyperbolicity is the generic condition: among all periodic orbits, the non-hyperbolic ones — those with a multiplier exactly on the unit circle — form a set of codimension at least one in parameter space. A hyperbolic periodic orbit persists under small perturbations of the vector field (by the implicit function theorem applied to the Poincaré return map). Non-hyperbolic orbits are the borderline case where stability is not determined by the linear approximation alone.
[quotetheorem:2794]
[citeproof:2794]
The stability criterion above applies when the orbit is hyperbolic. When a non-unit Floquet multiplier has $|\mu_i| = 1$ — the non-hyperbolic case — the linearisation is inconclusive. Stability must then be determined by higher-order terms, much as for a centre at a fixed point; the orbit may be Lyapunov stable without being asymptotically stable, or it may be unstable despite the inconclusive linear picture. The relationship to Lyapunov stability is subtle: asymptotic orbital stability (nearby orbits converging to $\hat{x}$) is strictly stronger than Lyapunov stability of the periodic orbit as a subset of phase space.
### The Floquet Multiplier Formula in $\mathbb{R}^2$
In two dimensions, the monodromy matrix $F(P)$ is $2 \times 2$. One eigenvalue is always $\mu_1 = 1$ (the unit multiplier, from the phase direction along the orbit). Since $\det F(P) = \mu_1 \mu_2 = \mu_2$, the non-unit multiplier is determined by $\det F(P)$ alone. The following proposition gives a computable formula for this determinant.
[quotetheorem:2795]
[citeproof:2795]
This formula gives an immediately applicable stability criterion in $\mathbb{R}^2$:
\begin{align*}
\int_0^P \nabla \cdot f(\hat{x})\, dt < 0 &\implies \mu_2 < 1 \implies \text{asymptotically stable}, \\
\int_0^P \nabla \cdot f(\hat{x})\, dt > 0 &\implies \mu_2 > 1 \implies \text{unstable}, \\
\int_0^P \nabla \cdot f(\hat{x})\, dt = 0 &\implies \mu_2 = 1 \implies \text{non-hyperbolic}.
\end{align*}
Notice also that Dulac's criterion is intimately related to this formula: if $\nabla \cdot f < 0$ everywhere, then the integral above is negative for any periodic orbit, implying $\mu_2 < 1$ and hence stability. Combined with the fact that Dulac with $\phi = 1$ excludes periodic orbits in simply connected regions, this shows the two criteria are consistent.
[example: Stability of a Limit Cycle via the Floquet Formula]
Consider the system in polar coordinates:
\begin{align*}
\dot{r} = r(1 - r^2), \qquad \dot{\theta} = \frac{1}{r}.
\end{align*}
Setting $\dot{r} = 0$ gives $r = 0$ (fixed point) or $r = 1$ (potential periodic orbit). At $r = 1$, $\dot{\theta} = 1$, so the orbit has period $T = 2\pi$.
To verify stability, set $r = 1 + \delta(t)$ and $\theta = t + \varepsilon(t)$ for small $\delta, \varepsilon$. The equation for $r$ gives:
\begin{align*}
\dot{\delta} = (1 + \delta)[1 - (1+\delta)^2] = (1+\delta)(-2\delta + O(\delta^2)) = -2\delta + O(\delta^2),
\end{align*}
so $\dot{\delta} \approx -2\delta$, giving $\delta(t) = \delta_0 e^{-2t}$. For $\theta$:
\begin{align*}
\dot{\varepsilon} = \frac{1}{1+\delta} - 1 = -\delta + O(\delta^2) \approx -\delta_0 e^{-2t}.
\end{align*}
Integrating: $\varepsilon(t) = \frac{1}{2}\delta_0 e^{-2t} + (\varepsilon_0 - \frac{1}{2}\delta_0)$. The monodromy matrix (at $t = T = 2\pi$) is
\begin{align*}
F(T) = \begin{pmatrix} e^{-4\pi} & 0 \\ \frac{1}{2}(e^{-4\pi} - 1) & 1 \end{pmatrix},
\end{align*}
with eigenvalues $\mu_1 = 1$ and $\mu_2 = e^{-4\pi} < 1$. The orbit is asymptotically stable.
To confirm using the Floquet formula, compute $\nabla \cdot f$ in polar coordinates:
\begin{align*}
\nabla \cdot f = \frac{1}{r}\frac{\partial(r\dot{r})}{\partial r} + \frac{1}{r}\frac{\partial(r\dot{\theta})}{\partial \theta} = \frac{1}{r}\frac{\partial}{\partial r}[r^2(1-r^2)] + 0 = 2 - 4r^2.
\end{align*}
On the orbit $r = 1$: $\nabla \cdot f = -2$. Then
\begin{align*}
\int_0^{2\pi} \nabla \cdot f\, dt = -2 \cdot 2\pi = -4\pi < 0,
\end{align*}
confirming $\mu_2 = e^{-4\pi} < 1$ and asymptotic stability, consistent with both approaches.
[/example]
## The Van der Pol Oscillator
The Van der Pol oscillator is a second-order nonlinear ODE that models a self-sustaining oscillation arising from nonlinear damping. It was originally introduced to describe vacuum tube circuits, but it has since become a canonical example in dynamical systems theory because it exhibits — provably — a unique stable limit cycle for every value of the parameter $\mu > 0$, while being amenable to different asymptotic analyses depending on whether $\mu$ is small or large.
[definition: Van der Pol System]
The **Van der Pol oscillator** with parameter $\mu > 0$ is the planar system
\begin{align*}
\dot{x} &= y - F(x), \qquad F(x) = \mu x\!\left(\tfrac{1}{3}x^2 - 1\right), \\
\dot{y} &= -x.
\end{align*}
[/definition]
The function $F(x) = \mu x(\frac{1}{3}x^2 - 1)$ is a cubic with roots at $x = 0, \pm\sqrt{3}$ and satisfies $F'(x) = \mu(x^2 - 1)$, so $F$ is decreasing on $(-1,1)$ and increasing outside this interval.
The system has a symmetry $(x, y) \mapsto (-x, -y)$ (rotation by $\pi$), which will be exploited to prove the existence of a periodic orbit by a symmetry argument.
For $\mu = 0$, the system is Hamiltonian with $H = \frac{1}{2}(x^2 + y^2)$, and all trajectories except the origin are periodic circles. For $\mu > 0$, the origin is the unique fixed point. Its Jacobian is
\begin{align*}
Jf_{(0,0)} = \begin{pmatrix} \mu & 1 \\ -1 & 0 \end{pmatrix},
\end{align*}
with characteristic polynomial $\lambda^2 - \mu\lambda + 1 = 0$ and eigenvalues $\lambda = \frac{\mu \pm \sqrt{\mu^2 - 4}}{2}$. For $0 < \mu < 2$, the eigenvalues are complex with positive real part $\mu/2 > 0$, so the origin is an unstable focus. For $\mu > 2$, both eigenvalues are real and positive, making it an unstable node.
### Nullclines and the Liénard Plane
The phase plane for Van der Pol has a rich geometric structure that can be read directly from the nullclines. The $\dot{x} = 0$ nullcline is the cubic curve $y = F(x)$, which separates regions of rightward and leftward motion. The $\dot{y} = 0$ nullcline is the vertical axis $x = 0$, separating upward and downward motion. The limit cycle must weave between these regions, spending time near each nullcline. This interplay between the two nullclines — and the energy calculation it implies — is what the Liénard plane makes precise.
The $\dot{x} = 0$ nullcline is the curve $y = F(x)$, and the $\dot{y} = 0$ nullcline is the line $x = 0$. Consider the Liénard energy $H = \frac{1}{2}(x^2 + y^2)$. Its rate of change along trajectories is
\begin{align*}
\dot{H} = x\dot{x} + y\dot{y} = x(y - F(x)) + y(-x) = -xF(x) = -\mu x^2\!\left(\tfrac{1}{3}x^2 - 1\right).
\end{align*}
Since $x^2(\frac{1}{3}x^2 - 1) < 0$ for $0 < |x| < \sqrt{3}$ and $> 0$ for $|x| > \sqrt{3}$:
- $\dot{H} > 0$ in the strip $\{0 < |x| < \sqrt{3}\}$ (energy increases),
- $\dot{H} < 0$ in $\{|x| > \sqrt{3}\}$ (energy decreases).
This mixed sign prevents $H$ from serving directly as a Lyapunov function, but the structure is precisely what forces trajectories to converge to a limit cycle.
### Existence and Uniqueness of the Limit Cycle
The proof of existence uses a symmetric trapping argument. Consider a trajectory starting at the point $(-\sqrt{3}, y_0)$ on the left edge of the strip. By symmetry of the system under $(x,y) \mapsto (-x,-y)$, the trajectory crosses the right edge $\{x = \sqrt{3}\}$ at some point $(\sqrt{3}, y_1)$ and, continuing, crosses back to $\{x = -\sqrt{3}\}$ at $(-\sqrt{3}, -y_2)$ (by the rotation symmetry, the $y$-coordinate after one half-circuit changes sign).
A periodic orbit exists precisely when $y_2 = y_0$, so that the trajectory is closed by symmetry. To show such a value exists and is unique, define the function
\begin{align*}
\Delta\hat{H}(y_0) := H\!\left((\sqrt{3}, y_2)\right) - H\!\left((-\sqrt{3}, y_0)\right) = \tfrac{1}{2}(y_2^2 - y_0^2).
\end{align*}
The key properties of $\Delta\hat{H}$ are established by the following argument. First, as $y_0 \to 0^+$, the trajectory starting at $(-\sqrt{3}, y_0)$ passes through the inner strip where $\dot{H} > 0$, so $H$ increases: $H(\sqrt{3}, y_1) > H(-\sqrt{3}, y_0)$, implying $y_1 > y_0$. A further computation shows $y_2 > y_0$ for small $y_0$, so $\Delta\hat{H}(0^+) > 0$.
Second, as $y_0 \to \infty$, the trajectory rises to large energy. In the outer region $\{|x| > \sqrt{3}\}$ where $\dot{H} < 0$, the energy loss dominates the energy gain in the strip, so $y_2 < y_0$ for large $y_0$, giving $\Delta\hat{H}(y_0) < 0$ for large $y_0$. Moreover, $\Delta\hat{H}(y_0) \to -\infty$.
Third, $\Delta\hat{H}$ is a decreasing function of $y_0$. This follows because $y_1$ and $y_2$ are increasing functions of $y_0$ (by the no-crossing property of trajectories), but the energy $\Delta H$ gained in the strip is a decreasing function of $y_0$: as $y_0$ increases, trajectories pass higher in the phase plane, where the strip contribution $\int_{-\sqrt{3}}^{\sqrt{3}} \dot{H}/\dot{x}\, dx = \int_{-\sqrt{3}}^{\sqrt{3}} [-xF(x)/(y - F(x))]\, dx$ decreases because the denominator $y - F(x)$ grows.
By continuity, $\Delta\hat{H}$ starts positive, decreases monotonically, and tends to $-\infty$. It therefore has exactly one zero, call it $y_0 = y^*$. At this value, $y_2 = y^*$, and the rotation symmetry of the system closes the orbit into a periodic orbit.
[quotetheorem:2796]
[citeproof:2796]
The existence proof is constructive: it identifies the specific initial condition $y^*$ from which the symmetric trajectory closes. The uniqueness proof uses the strict monotonicity of $\Delta\hat{H}$, which in turn rests on the no-crossing property of trajectories. When $\mu = 0$, the system is exactly Hamiltonian and every circle is a periodic orbit; for $\mu > 0$, all these orbits are destroyed except one, which is the limit cycle at $y^* = y^*(\mu)$. As $\mu \to 0^+$, the limit cycle approaches the circle of radius $2$ (as shown by the Melnikov analysis in the next section).
### Limiting Behaviour: Small and Large $\mu$
The Van der Pol oscillator has qualitatively different behaviour depending on whether $\mu$ is small or large, even though a unique stable limit cycle exists for all $\mu > 0$. For small $\mu$, the system is a weakly damped oscillator close to Hamiltonian, and the limit cycle is nearly circular. For large $\mu$, the damping dominates and the oscillation becomes a relaxation oscillation — a pattern of slow drifts punctuated by rapid jumps — whose period grows linearly with $\mu$. At $\mu = 0$ the system is exactly Hamiltonian, with all circles $\{H = H_0\}$ being periodic orbits; the unique limit cycle for $\mu > 0$ bifurcates from this continuum as $\mu$ increases from zero, with only the circle of radius $2$ (energy $H_0 = 2$) surviving to leading order.
**Small $\mu$ ($0 < \mu \ll 1$):** The near-Hamiltonian approach applies with unperturbed Hamiltonian $H = \frac{1}{2}(x^2 + y^2)$ and the perturbation entering through $\dot{x} = y - F(x)$. The unperturbed orbits are circles of radius $r = \sqrt{2H_0}$ parameterised as $x = r\cos t$, $y = r\sin t$. Along trajectories, $dH/dt = -xF(x) = \mu x^2(1 - \frac{x^2}{3})$, so the Melnikov function is
\begin{align*}
M(H_0) &= \mu\int_0^{2\pi} r^2\cos^2 t\!\left(1 - \tfrac{r^2\cos^2 t}{3}\right) dt \\
&= \mu r^2\!\left[\pi - \tfrac{r^2}{3}\cdot\tfrac{3\pi}{4}\right] = \mu\pi r^2\!\left(1 - \tfrac{r^2}{4}\right).
\end{align*}
Substituting $r^2 = 2H_0$ gives $M(H_0) = 2\mu\pi H_0(1 - H_0/2)$. This has a simple zero at $H_0^* = 2$, corresponding to radius $r^* = \sqrt{2H_0^*} = 2$. To confirm stability: $M'(2) = 2\mu\pi(1 - H_0)\big|_{H_0=2} = -2\mu\pi < 0$, so $M < 0$ for $H_0 > 2$ (orbits lose energy and contract toward the limit cycle) and $M > 0$ for $H_0 < 2$ (orbits gain energy and expand toward it). The limit cycle for small $\mu$ has approximate radius $2$.
**Large $\mu$ ($\mu \gg 1$):** For large $\mu$, the $\dot{x}$ equation dominates: trajectories move rapidly toward the nullcline $y = F(x)$ and then travel slowly along it. This separation into fast and slow timescales is the signature of a relaxation oscillation. The period is dominated by the slow segments along the nullcline $y = F(x)$ between its local extrema at $x = \pm 1$, $F(\pm 1) = \mp \frac{2\mu}{3}$. The slow motion satisfies $\dot{y} = -x$, so $dt = dy/(-x)$. Along the nullcline $y = F(x) = \mu x(\frac{1}{3}x^2-1)$, we have $dy = F'(x)\,dx = \mu(x^2-1)\,dx$. By the rotational symmetry, the period is twice the time to traverse from $(−2, F(−2))$ to $(−1, F(−1))$ (one slow segment):
\begin{align*}
T \approx 2\int_{-2}^{-1} \frac{F'(x)}{-x}\, dx = 2\int_{-2}^{-1} \frac{\mu(x^2-1)}{-x}\, dx = 2\mu\int_1^2 \frac{x^2-1}{x}\, dx = 2\mu\left[\frac{x^2}{2} - \ln x\right]_1^2.
\end{align*}
Evaluating: $\int_1^2 \frac{x^2-1}{x}\,dx = \int_1^2 (x - x^{-1})\,dx = [\frac{x^2}{2} - \ln x]_1^2 = (2 - \ln 2) - \frac{1}{2} = \frac{3}{2} - \ln 2$. Thus
\begin{align*}
T \approx \mu(3 - 2\ln 2) + O(1) \quad \text{as } \mu \to \infty.
\end{align*}
<!-- illustration-needed: Relaxation oscillation phase portrait for large mu — show the limit cycle hugging the cubic nullcline y=F(x) on the slow segments, with fast near-horizontal jumps connecting the two branches near x=pm1. Label the slow segments (drift along nullcline, timescale O(mu)) and fast segments (rapid jump, timescale O(1/mu)). -->
[remark: Universality of the Van der Pol Oscillator]
The Van der Pol system serves as a prototype for a broad class of phenomena. The energy balance analysis demonstrates how self-sustaining oscillations arise when a nonlinear damping term extracts energy from small oscillations and dissipates energy from large ones — a balance that uniquely determines the amplitude of the limit cycle. The Floquet multiplier formula confirms that the multiplier $\mu_2 = \exp\!\left(\int_0^P \nabla \cdot f\, dt\right)$ satisfies $\mu_2 < 1$ for all $\mu > 0$, consistent with the stability established by the energy argument.
[/remark]
When parameters change continuously, the qualitative structure of solutions need not change continuously: new fixed points emerge, periodic orbits change stability, or entirely new dynamics appear. Bifurcation theory classifies these transitions systematically.
# 5. Bifurcations
The previous chapters developed tools for understanding the qualitative behaviour of dynamical systems at a fixed parameter value: classifying fixed points, finding invariant manifolds, and applying Lyapunov methods to determine stability. This chapter addresses a fundamentally different question: what happens to the global structure of the phase portrait as a parameter continuously varies? The answer is the theory of bifurcations, which describes the precise mechanisms by which fixed points are created and destroyed, and by which periodic orbits emerge from equilibria.
## What Is a Bifurcation?
Before introducing definitions, it is worth identifying the central problem. Consider a family of autonomous systems parameterised by a scalar $\mu \in \mathbb{R}$:
\begin{align*}
\dot{x} = f(x;\mu), \quad x \in E \subseteq \mathbb{R}^n,
\end{align*}
where $f \colon E \times \mathbb{R} \to \mathbb{R}^n$ depends continuously — and typically smoothly — on both the state $x$ and the parameter $\mu$. For each fixed $\mu$, we obtain a dynamical system whose phase portrait we can analyse by the methods of earlier chapters. The question is: as $\mu$ varies, does the topological structure of the phase portrait change, and if so, how?
[definition: Bifurcation]
Let $f \colon E \times \mathbb{R} \to \mathbb{R}^n$ be a smooth vector field depending on a parameter $\mu \in \mathbb{R}$. A **bifurcation** is a value $\mu_0 \in \mathbb{R}$, called the **bifurcation point**, at which the topological structure of the flow $\varphi(t, \,\cdot\,;\mu)$ changes as $\mu$ passes through $\mu_0$. More precisely, the flows $\varphi(t, \,\cdot\,;\mu_1)$ and $\varphi(t, \,\cdot\,;\mu_2)$ are not topologically equivalent whenever $\mu_1 < \mu_0 < \mu_2$ (with $\mu_1, \mu_2$ sufficiently close to $\mu_0$).
[/definition]
Topological equivalence here is in the sense defined in Chapter 1: there exists a homeomorphism of the phase space carrying orbits of one flow to orbits of the other, preserving the direction of time. A bifurcation is therefore not merely a quantitative change (such as a fixed point moving to a new location) but a qualitative change in the catalogue of invariant sets.
[definition: Bifurcation Diagram]
A **bifurcation diagram** is a graph in the $(\mu, x)$-plane (or more generally the $(\mu, \|\cdot\|)$-plane) recording, as functions of $\mu$, the locations of fixed points and the amplitudes of periodic orbits. Stable branches are conventionally drawn as solid curves; unstable branches as dashed or dotted curves.
[/definition]
The bifurcation diagram gives a global summary of how the family of systems changes with the parameter and will be our primary visualisation tool throughout this chapter.
[example: Supercritical Hopf-like normal form]
Consider the planar system in polar coordinates:
\begin{align*}
\dot{r} &= r(\mu - r^2), \\
\dot{\theta} &= 1.
\end{align*}
For $\mu \leq 0$: the only invariant set is the fixed point at $r = 0$. To see why there is no periodic orbit, note that any periodic orbit must satisfy $\dot{r} = 0$ with $r > 0$, hence $\mu = r^2 > 0$, a contradiction. For $\mu > 0$: there is a periodic orbit at $r = \sqrt{\mu}$, since $\dot{r} = 0$ there and $\dot{\theta} = 1 \neq 0$. The stability follows from $\partial(\dot{r})/\partial r\big|_{r=\sqrt{\mu}} = \mu - 3r^2\big|_{r=\sqrt{\mu}} = -2\mu < 0$, so the orbit is attracting. The topological structure changes at $\mu = 0$: a new closed orbit is born, making $\mu = 0$ a bifurcation point.
[/example]
[example: Transcritical bifurcation preview]
Consider the system
\begin{align*}
\dot{x} &= xy - \mu x, \\
\dot{y} &= -y + x^2.
\end{align*}
For $\mu \leq 0$, the origin is the only fixed point near zero, and it is unstable. To see this, the Jacobian at the origin is
\begin{align*}
Jf_{\mathbf{0}} = \begin{pmatrix} -\mu & 0 \\ 0 & -1 \end{pmatrix},
\end{align*}
which has eigenvalues $-\mu$ and $-1$. For $\mu < 0$, the eigenvalue $-\mu > 0$, so the origin is an unstable fixed point. At $\mu = 0$ the eigenvalue $-\mu = 0$, making the origin non-hyperbolic. For $\mu > 0$, substituting $\dot{y} = 0$ gives $y = x^2$, and $\dot{x} = 0$ with $x \neq 0$ gives $y = \mu$, so new fixed points appear at $(\pm\sqrt{\mu}, \mu)$. At the fixed point $(\sqrt{\mu}, \mu)$, the Jacobian is
\begin{align*}
Jf_{(\sqrt{\mu},\mu)} = \begin{pmatrix} \mu - \mu & \sqrt{\mu} \\ 2\sqrt{\mu} & -1 \end{pmatrix} = \begin{pmatrix} 0 & \sqrt{\mu} \\ 2\sqrt{\mu} & -1 \end{pmatrix}.
\end{align*}
The trace is $-1 < 0$ and the determinant is $0 \cdot (-1) - \sqrt{\mu} \cdot 2\sqrt{\mu} = -2\mu < 0$ for $\mu > 0$. Since the determinant is negative, the eigenvalues have opposite signs, so $(\sqrt{\mu}, \mu)$ is a saddle point. At $(-\sqrt{\mu}, \mu)$ the same computation applies by symmetry. Meanwhile, the origin for $\mu > 0$ has Jacobian $\begin{pmatrix} -\mu & 0 \\ 0 & -1 \end{pmatrix}$ with both eigenvalues $-\mu < 0$ and $-1 < 0$, so the origin becomes stable. The qualitative structure — specifically the number and stability types of fixed points — changes at $\mu = 0$.
[/example]
## The Centre Manifold and Extended Centre Manifold
### Reduction via the Centre Manifold Theorem
When we are near a bifurcation point, the fixed point is non-hyperbolic: the Jacobian $Jf_{x^*}$ has at least one eigenvalue on the imaginary axis. This invalidates the Hartman–Grobman theorem, and the local dynamics near the fixed point is no longer determined by linearisation alone. The resolution is to identify a lower-dimensional invariant manifold — the centre manifold — that captures all the interesting long-time behaviour.
Recall from Chapter 2 the stable and unstable subspaces $E^s$ and $E^u$ for a hyperbolic fixed point. For a non-hyperbolic fixed point the linearisation has a **centre subspace** $E^c$ spanned by generalised eigenvectors whose eigenvalues have zero real part. The Centre Manifold Theorem asserts that the nonlinear system possesses a corresponding invariant manifold tangent to $E^c$.
[quotetheorem:2797]
The Centre Manifold Theorem is proved using the theory of invariant manifolds for flows, which extends the stable manifold construction from Chapter 2. The key step is to write the system in block-diagonal form separating the centre, stable, and unstable directions, and then seek the centre manifold as a fixed point of a contraction mapping on a space of graphs over $E^c$. The stable and unstable manifolds $W^s$ and $W^u$ are constructed by the same Hadamard–Perron technique as in the hyperbolic case. The non-uniqueness of $W^c$ arises because the contraction does not control flat perturbations: two graphs that agree to all finite orders in Taylor series are equally consistent solutions of the invariance equation. For this reason, $W^c$ is only determined up to a flat error, but this is invisible when we classify bifurcations by finite-order normal form computations.
Several remarks compare this with the Stable Manifold Theorem (Chapter 2), which applies at hyperbolic fixed points.
[remark: Comparison with the Stable Manifold Theorem]
For hyperbolic fixed points, the stable manifold $W^s$ was characterised globally as $W^s = \{x : \varphi(t, x) \to \mathbf{0} \text{ as } t \to +\infty\}$. This characterisation fails at non-hyperbolic points, because trajectories do not necessarily tend directly to $\mathbf{0}$; instead they tend to the centre manifold $W^c$. The centre manifold is thus the attractor for the long-time behaviour in the stable and centre directions. The non-uniqueness of $W^c$ is genuine: different smooth extensions of the local centre manifold can differ by flat functions (functions with all derivatives zero at $\mathbf{0}$), and these differences are invisible at any finite order of Taylor expansion.
[/remark]
The key practical consequence is this: on $W^c$, the dynamics is governed by an ODE of dimension $n_c < n$, which is typically one or two. All qualitative information about the bifurcation — whether it is a saddle-node, transcritical, or pitchfork, and whether it is super- or subcritical — is encoded in this reduced system.
[example: Finding the Centre Manifold]
Consider the system
\begin{align*}
\begin{pmatrix} \dot{x} \\ \dot{y} \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & -\lambda \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} f(x,y) \\ g(x,y) \end{pmatrix}, \quad \lambda > 0,
\end{align*}
where $f, g = O(|x|^2)$ are smooth nonlinear terms. The linearisation has eigenvalues $0$ and $-\lambda$, so $E^c = \{y = 0\}$ and $E^s = \{x = 0\}$. By the Centre Manifold Theorem, $W^c$ is a curve through the origin tangent to $E^c$, hence of the form $y = h(x)$ for a smooth function $h$ satisfying $h(0) = 0$ and $h'(0) = 0$.
To find $h$, we use the invariance of $W^c$: if $y = h(x)$ on $W^c$, then differentiating with respect to $t$ gives
\begin{align*}
\dot{y} = \frac{dh}{dx}\, \dot{x}.
\end{align*}
Substituting $\dot{y} = -\lambda h(x) + g(x, h(x))$ and $\dot{x} = f(x, h(x))$, we obtain the invariance equation
\begin{align*}
-\lambda h(x) + g(x, h(x)) = h'(x)\, f(x, h(x)).
\end{align*}
We expand $h(x) = a_2 x^2 + a_3 x^3 + \cdots$ and match coefficients order by order to determine $a_2, a_3, \ldots$. Once $h$ is known to sufficient order, the dynamics on $W^c$ is given by the one-dimensional ODE $\dot{x} = f(x, h(x))$.
[/example]
This procedure — reducing to a one-dimensional ODE on the centre manifold — is the core technique for bifurcation analysis in the absence of oscillatory behaviour. Once the reduced equation $\dot{x} = f(x, h(x))$ is known, we can classify the bifurcation by reading off the Taylor coefficients, exactly as described in the stationary bifurcations section below. The centre manifold thereby acts as a lens that focuses the full $n$-dimensional dynamics onto a low-dimensional skeleton that contains all qualitatively essential information.
### The Extended Centre Manifold
The Centre Manifold Theorem as stated applies at a fixed parameter value. To understand how the dynamics changes as $\mu$ varies — the essential question for bifurcation theory — we embed the parameter into the phase space.
The key device is to append the auxiliary equation $\dot{\mu} = 0$ to the system, treating $\mu$ as a new state variable that is frozen in time. The extended system on $E \times \mathbb{R}$ is
\begin{align*}
\dot{x} &= f(x; \mu), \\
\dot{\mu} &= 0.
\end{align*}
The fixed point $(\mathbf{0}, 0)$ of this extended system is non-hyperbolic (since $\dot{\mu} = 0$ contributes a zero eigenvalue), and the Centre Manifold Theorem applies to yield the **extended centre manifold** $W^c_{\text{ext}}$. This is an invariant manifold of the extended system that contains the $\mu$-axis and is tangent to the extended centre subspace $E^c_{\text{ext}}$.
The extended centre manifold is two-dimensional when the original centre manifold is one-dimensional (the extra dimension coming from $\mu$). A trajectory on $W^c_{\text{ext}}$ evolves with fixed $\mu$ (since $\dot{\mu} = 0$), and the one-dimensional reduced ODE for $\dot{x}$ (or $\dot{y}$, depending on the parameterisation) describes the bifurcation as $\mu$ varies.
[example: Extended Centre Manifold for a Transcritical System]
Consider
\begin{align*}
\dot{x} &= -\mu x + xy, \\
\dot{y} &= -y + x^2, \\
\dot{\mu} &= 0.
\end{align*}
The linear part at the origin is
\begin{align*}
\begin{pmatrix} \dot{x} \\ \dot{y} \\ \dot{\mu} \end{pmatrix} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} x \\ y \\ \mu \end{pmatrix} + \begin{pmatrix} -\mu x + xy \\ x^2 \\ 0 \end{pmatrix}.
\end{align*}
The centre subspace $E^c_{\text{ext}}$ is spanned by the directions corresponding to the two zero eigenvalues, giving the $(x,\mu)$-plane ($y = 0$). By the Centre Manifold Theorem, the extended centre manifold is of the form $y = h(x, \mu)$ with
\begin{align*}
h(0,0) = 0, \quad D_x h(0,0) = 0, \quad D_\mu h(0,0) = 0.
\end{align*}
We seek $h$ as a power series in combined order:
\begin{align*}
h(x,\mu) = a_{2,0} x^2 + a_{1,1} x\mu + a_{0,2} \mu^2 + O(3),
\end{align*}
where $O(n)$ denotes terms of combined degree at least $n$ (so $x\mu$ has combined degree $2$). The invariance condition is
\begin{align*}
\dot{y} = \frac{\partial h}{\partial x}\, \dot{x} + \frac{\partial h}{\partial \mu}\, \dot{\mu} = \frac{\partial h}{\partial x}(-\mu x + xh).
\end{align*}
Also, from the ODE, $\dot{y} = -h + x^2$. Substituting $h$ and matching at combined order $2$:
\begin{align*}
-a_{2,0} x^2 - a_{1,1} x\mu - a_{0,2}\mu^2 + x^2 = 2a_{2,0} x \cdot (-\mu x) + O(3).
\end{align*}
At order $x^2$: $-a_{2,0} + 1 = 0$, so $a_{2,0} = 1$. At order $x\mu$: $-a_{1,1} = 0$, and at order $\mu^2$: $-a_{0,2} = 0$. Therefore
\begin{align*}
h(x,\mu) = x^2 + O(3).
\end{align*}
Substituting back into $\dot{x}$ gives the reduced equation on the extended centre manifold:
\begin{align*}
\dot{x} = -\mu x + x h(x,\mu) = -\mu x + x^3 + O(4).
\end{align*}
This is the evolution equation from which we will classify the bifurcation.
[/example]
The computation above illustrates an important structural feature: the extended centre manifold absorbs all dependence on $\mu$ into the reduced equation, so the one-dimensional ODE $\dot{x} = -\mu x + x^3 + O(4)$ encodes the full bifurcation behaviour of the original two-dimensional system near $(x, y, \mu) = (0, 0, 0)$. In the next example, we encounter a case where the linear part already involves $\mu$, requiring a different starting ansatz for $h$.
[example: Extended Centre Manifold with Non-Zero Linear Term in $\mu$]
Consider
\begin{align*}
\dot{x} &= \mu + x^2 + xy + y^2, \\
\dot{y} &= 2\mu - y + x^2 + xy.
\end{align*}
This system has a non-hyperbolic fixed point at the origin when $\mu = 0$. The extended system (appending $\dot{\mu} = 0$) has linearisation matrix
\begin{align*}
\begin{pmatrix} 0 & 0 & 1 \\ 0 & -1 & 2 \\ 0 & 0 & 0 \end{pmatrix}
\end{align*}
with eigenvalues $0, -1, 0$. The centre subspace $E^c_{\text{ext}}$ is determined by the generalised eigenvectors for the zero eigenvalues; computing these shows that $E^c_{\text{ext}}$ is the plane $y = 2\mu$. Therefore the extended centre manifold has the form $y = h(x,\mu)$ with $h(0,0) = 0$ and $D h(0,0) = (0, 2)$ (the gradient of $2\mu$ at the origin).
We seek
\begin{align*}
h(x,\mu) = 2\mu + a_{2,0}x^2 + a_{1,1} x\mu + a_{0,2}\mu^2 + O(3).
\end{align*}
The invariance equation $\dot{y} = (\partial h/\partial x)\dot{x} + (\partial h/\partial \mu)\dot{\mu}$ with $\dot{\mu} = 0$ gives
\begin{align*}
2\mu - h + x^2 + xh = \frac{\partial h}{\partial x}(\mu + x^2 + xh + h^2).
\end{align*}
Matching at combined order $2$ yields $h(x,\mu) = 2\mu + x^2 + O(3)$. The reduced evolution equation on the extended centre manifold is then
\begin{align*}
\dot{x} &= \mu + x^2 + x(2\mu + x^2) + (2\mu + x^2)^2 \\
&= \mu + x^2 + 2\mu x + x^3 + 4\mu^2 + O(3).
\end{align*}
To compare terms of different orders, we use the **balancing argument**: since $\mu$ and $x$ are both assumed small, we write $x = O(\mu^a)$ for some $a > 0$ to be chosen, and select $a$ so that the two dominant terms in $\dot{x}$ are of the same order. The candidates for the dominant balance are $\mu$ and $x^2$ (all other terms are higher order under any reasonable scaling). Requiring $\mu \sim x^2$ gives $x = O(\mu^{1/2})$. With this scaling:
\begin{align*}
\dot{x} = \underbrace{\mu + x^2}_{O(\mu)} + \underbrace{2\mu x + 4\mu^2}_{O(\mu^{3/2})} + O(\mu^{3/2}),
\end{align*}
so to leading order $\dot{x} = \mu + x^2 + O(\mu^{3/2})$. This is the saddle-node normal form, as we shall see in the next section.
[/example]
The balancing argument is not merely a bookkeeping device: it reveals which terms compete at leading order and thereby identifies the bifurcation type without yet solving anything explicitly. By requiring $\mu \sim x^2$, we have determined that the critical manifold is a parabola in the $(x,\mu)$-plane, and the dominant balance $\dot{x} = \mu + x^2$ tells us that two fixed points collide when $x^2 = -\mu$, i.e., for $\mu < 0$. This is the saddle-node scenario. The general classification proceeds by the same method, applied systematically.
### The General Method
The computation in the examples above follows a systematic procedure. We summarise it here for reference.
[explanation: General Method for Classifying a Bifurcation via the Extended Centre Manifold]
Suppose the dynamical system $\dot{x} = f(x;\mu)$ has a bifurcation point at $(x, \mu) = (\mathbf{0}, 0)$.
**(i) Extend the system.** Append $\dot{\mu} = 0$ and find the Jacobian of the extended system. Identify the eigenvalues; the Centre Manifold Theorem applies when some eigenvalues have zero real part and others have negative real part (no positive real parts at the bifurcation point).
**(ii) Find $E^c_{\text{ext}}$.** The centre subspace of the extended linearisation is spanned by the generalised eigenvectors for zero eigenvalues. Its dimension equals the number of zero eigenvalues (counting generalised eigenvectors), typically $2$ after the $\mu$ direction is included.
**(iii) Choose a parameterisation.** The extended centre manifold is locally a graph over $E^c_{\text{ext}}$. Depending on the structure of the system, write either $y = h(x,\mu)$ or $x = h(y,\mu)$, choosing whichever requires expanding $h$ to lower combined order to classify the bifurcation. In practice, choose the variable that appears only linearly in the stable direction of the linearisation.
**(iv) Determine the initial conditions on $h$.** The Centre Manifold Theorem requires $h(0,0)$ to equal the value of the parameterised variable at the fixed point, and $Dh(0,0)$ to match the gradient of the corresponding component of $E^c_{\text{ext}}$.
**(v) Derive the invariance equation.** Since $W^c_{\text{ext}}$ is invariant, the chain rule gives
\begin{align*}
\dot{y} = \frac{\partial h}{\partial x}\, \dot{x} + \frac{\partial h}{\partial \mu}\, \dot{\mu} = \frac{\partial h}{\partial x}\, \dot{x},
\end{align*}
where the last equality uses $\dot{\mu} = 0$. Both $\dot{x}$ and $\dot{y}$ are known from the dynamical system; substitute $y = h(x,\mu)$ throughout.
**(vi) Match coefficients by combined order.** Write $h = v_1 + v_2 + v_3 + \cdots$ where $v_k$ collects all monomials $x^j \mu^l$ with $j + l = k$. Determine coefficients in ascending combined order; each level provides exactly the equations needed to find the next batch of coefficients.
**(vii) Write the reduced equation.** Substitute $y = h(x,\mu)$ into the expression for $\dot{x}$ (or $\dot{y}$) from the original system to obtain the one-dimensional evolution equation on the extended centre manifold.
**(viii) Apply the balancing argument.** Write $x = O(\mu^a)$ and choose $a$ to balance the two dominant terms. This determines the leading-order approximation to $\dot{x}$ and identifies the bifurcation type by the method described in the next section.
[/explanation]
This general method is not merely algorithmic scaffolding: each step has a geometric meaning. Step (i) extends the phase space so that $\mu$ is treated on equal footing with the dynamical variables; step (ii) identifies the directions along which slow dynamics occurs; steps (iii)–(vi) construct the slow manifold as a graph, using the fast directions to eliminate variables; and steps (vii)–(viii) reveal the leading-order reduced dynamics on the slow manifold. The result is a one-dimensional ODE whose Taylor coefficients directly encode which of the three standard bifurcation types is occurring.
[example: Higher-Dimensional System — Lorenz at the Origin]
The Lorenz system
\begin{align*}
\dot{x} &= \sigma(y - x), \\
\dot{y} &= rx - y - xz, \\
\dot{z} &= -bz + xy,
\end{align*}
with parameters $\sigma, r, b > 0$, has a fixed point at the origin for all parameter values. The Jacobian at the origin is
\begin{align*}
Jf_{\mathbf{0}} = \begin{pmatrix} -\sigma & \sigma & 0 \\ r & -1 & 0 \\ 0 & 0 & -b \end{pmatrix}.
\end{align*}
The eigenvalues of this matrix are $-b$ (with eigenvector $(0,0,1)^\top$) and the two eigenvalues of the upper-left $2 \times 2$ block $\begin{pmatrix} -\sigma & \sigma \\ r & -1 \end{pmatrix}$, which are $\frac{-(1+\sigma) \pm \sqrt{(1+\sigma)^2 - 4\sigma(1-r)}}{2}$. The origin becomes non-hyperbolic when one eigenvalue is zero, which occurs precisely when $\det Jf_{\mathbf{0}} = 0$. The determinant of the upper block is $\sigma(1-r)$, so a zero eigenvalue occurs at $r = 1$.
Setting $\mu := r - 1$ and $v := y - x$ as new coordinates, the origin undergoes a bifurcation at $\mu = 0$. The extended system (appending $\dot{\mu} = 0$) can be written with a $4 \times 4$ matrix having eigenvalues $-b, -(1+\sigma), 0, 0$, and the extended centre manifold is two-dimensional. Setting $v = V(x,\mu)$ and $z = Z(x,\mu)$ on this manifold, the invariance equations become
\begin{align*}
\dot{V} &= \dot{x}\, V_x = -(1+\sigma)V + x(\mu - Z), \\
\dot{Z} &= \dot{x}\, Z_x = -bZ + x(V + x).
\end{align*}
Since $V, Z = O(2)$ at the origin, the leading-order terms give $V = O(2)$ and $Z = O(2)$, and the leading-order reduced equation for $\dot{x}$ on the extended centre manifold is $\dot{x} = \sigma V + O(3)$. Working to the necessary order one finds that this system undergoes a pitchfork bifurcation at $\mu = 0$ (and hence $r = 1$), at which the origin changes stability and two symmetric fixed points $(\pm\sqrt{\mu b}, \pm\sqrt{\mu b}, \mu)$ (in original coordinates) bifurcate off.
[/example]
The Lorenz example shows that the extended centre manifold method scales to three-dimensional systems: the same algebraic procedure — extend, identify $E^c_{\text{ext}}$, write $W^c_{\text{ext}}$ as a graph, solve the invariance equation order by order — applies without essential modification. The three-dimensional nature of the Lorenz system only means that we must track two components of the graph ($V$ and $Z$), each satisfying a separate invariance equation that can be solved in the same way. The bifurcation type is then read off from the one-dimensional reduced equation for $\dot{x}$ on the extended centre manifold.
## Stationary Bifurcations: Zero Eigenvalue
We now focus on the case where the bifurcation involves a zero eigenvalue of the Jacobian. Working with the reduced one-dimensional equation
\begin{align*}
\dot{x} = f(x; \mu), \quad x \in \mathbb{R},
\end{align*}
and assuming that $x = 0$ is a non-hyperbolic fixed point at $\mu = 0$ (so $f(0;0) = 0$ and $\partial f/\partial x\big|_{(0,0)} = 0$), we classify the possible bifurcation types by the Taylor expansion of $f$.
Expanding in $x$ with coefficients depending on $\mu$:
\begin{align*}
\dot{x} = f(x;\mu) = a_0(\mu) + a_1(\mu)x + a_2(\mu)x^2 + a_3(\mu)x^3 + \cdots
\end{align*}
The hypotheses give $a_0(0) = 0$ and $a_1(0) = 0$. Expanding the coefficients in $\mu$: $a_0(\mu) = a_{01}\mu + O(\mu^2)$ and $a_1(\mu) = a_{11}\mu + O(\mu^2)$, while $a_k(\mu) = a_{k0} + O(\mu)$ for $k \geq 2$. The most generic case — where $a_{01}, a_{11}, a_{20}$ are all nonzero — always reduces to the saddle-node normal form after a change of variables. The other standard types arise as special cases when certain genericity conditions fail.
### Saddle-Node Bifurcation
[definition: Saddle-Node Bifurcation]
A **saddle-node bifurcation** (also called a **fold bifurcation**) occurs at $(x, \mu) = (0, 0)$ when the evolution equation on the (extended) centre manifold has the normal form
\begin{align*}
\dot{x} = \mu - x^2.
\end{align*}
The conditions on the original system $f(x;\mu)$ that guarantee reduction to this normal form are: $f(0;0) = 0$, $\partial_x f(0;0) = 0$, $\partial_\mu f(0;0) \neq 0$, and $\partial_{xx} f(0;0) \neq 0$.
[/definition]
To understand the saddle-node bifurcation, we read off the fixed points of $\dot{x} = \mu - x^2$ directly: $x = \pm\sqrt{\mu}$, which exist only for $\mu \geq 0$. For the stability, $\partial(\mu - x^2)/\partial x = -2x$: at $x = +\sqrt{\mu}$ this is $-2\sqrt{\mu} < 0$ (stable), and at $x = -\sqrt{\mu}$ it is $+2\sqrt{\mu} > 0$ (unstable). We summarise:
- $\mu > 0$: two fixed points, $x = \sqrt{\mu}$ (stable, a node) and $x = -\sqrt{\mu}$ (unstable, a saddle). Hence the name.
- $\mu = 0$: one fixed point at $x = 0$, which is a half-stable saddle-node (non-hyperbolic).
- $\mu < 0$: no fixed points.
<!-- illustration-needed: Bifurcation diagram for the saddle-node: plot x against mu, showing the upper branch (stable, solid) and lower branch (unstable, dashed) meeting at mu=0 in a parabola opening to the right -->
The saddle-node bifurcation is the **generic** bifurcation: whenever $\partial_\mu f \neq 0$ and $\partial_{xx} f \neq 0$ at a non-hyperbolic fixed point, the system undergoes a saddle-node bifurcation. This can be seen from the general Taylor expansion: completing the square in $x$ and rescaling eliminates all other terms to leading order, yielding exactly $\dot{X} = \mu - X^2$.
### Transcritical Bifurcation
If the system has a fixed point at $x = 0$ for all values of $\mu$ — not just at $\mu = 0$ — then $a_0(\mu) = 0$ identically, and the saddle-node form cannot arise. The next generic case is the transcritical bifurcation.
[definition: Transcritical Bifurcation]
A **transcritical bifurcation** occurs at $(0, 0)$ when the evolution equation has the normal form
\begin{align*}
\dot{x} = \mu x - x^2.
\end{align*}
This arises when $f(0;\mu) = 0$ for all $\mu$ (so $x = 0$ is a fixed point for all $\mu$), $\partial_x f(0;0) = 0$, $\partial_{\mu x} f(0;0) \neq 0$, and $\partial_{xx} f(0;0) \neq 0$.
[/definition]
The fixed points of $\dot{x} = \mu x - x^2 = x(\mu - x)$ are $x = 0$ and $x = \mu$, both existing for all $\mu$. The stability at $x = 0$: $\partial(\mu x - x^2)/\partial x\big|_{x=0} = \mu$, so $x = 0$ is stable for $\mu < 0$ and unstable for $\mu > 0$. The stability at $x = \mu$: $\partial(\mu x - x^2)/\partial x\big|_{x=\mu} = \mu - 2\mu = -\mu$, so $x = \mu$ is stable for $\mu > 0$ and unstable for $\mu < 0$. The two branches of fixed points cross at the origin and **exchange stability** at $\mu = 0$.
<!-- illustration-needed: Bifurcation diagram for the transcritical bifurcation: two curves x=0 and x=mu crossing at the origin; each changes from solid to dashed at the crossing point -->
### Pitchfork Bifurcation
When the system has an additional symmetry — specifically, when $f$ is odd in $x$ (i.e., $f(-x;\mu) = -f(x;\mu)$) — then both $a_0(\mu)$ and $a_2(\mu)$ vanish identically, and the leading-order behaviour involves $x^3$.
[definition: Pitchfork Bifurcation]
A **pitchfork bifurcation** occurs at $(0,0)$ when the evolution equation has the normal form
\begin{align*}
\dot{x} = \mu x \pm x^3.
\end{align*}
The case $\dot{x} = \mu x - x^3$ is called **supercritical** and the case $\dot{x} = \mu x + x^3$ is called **subcritical**. The terminology reflects the stability of the bifurcating branches: in the supercritical case, the new fixed points that appear for $\mu > 0$ are stable; in the subcritical case, the new fixed points appearing for $\mu < 0$ are unstable.
[/definition]
For the supercritical case $\dot{x} = \mu x - x^3 = x(\mu - x^2)$: fixed points are $x = 0$ and $x = \pm\sqrt{\mu}$ (the latter existing only for $\mu > 0$). The stability at $x = 0$ is given by the derivative $\mu - 3x^2\big|_{x=0} = \mu$: stable for $\mu < 0$, unstable for $\mu > 0$. The stability at $x = \pm\sqrt{\mu}$ is $\mu - 3\mu = -2\mu < 0$ for $\mu > 0$: these are stable. So for $\mu < 0$ there is a single stable fixed point at the origin; as $\mu$ increases through $0$, the origin loses stability and two new stable fixed points bifurcate symmetrically.
For the subcritical case $\dot{x} = \mu x + x^3$: fixed points are $x = 0$ and $x = \pm\sqrt{-\mu}$ (the latter for $\mu < 0$). The branch $x = \pm\sqrt{-\mu}$ is unstable (derivative at this point is $\mu + 3(-\mu) = -2\mu > 0$ for $\mu < 0$). The origin is stable for $\mu < 0$ and unstable for $\mu > 0$. In the subcritical case, the system loses stability at $\mu = 0$ without gaining any nearby stable alternatives: a different global attractor must take over, typically through a large-amplitude jump.
<!-- illustration-needed: Side-by-side bifurcation diagrams for supercritical pitchfork (pitchfork opens to the right, new branches stable) and subcritical pitchfork (branches to the left, unstable) -->
The subcritical case deserves emphasis: when the pitchfork is subcritical, the system loses stability at $\mu = 0$ with no nearby stable alternative. The two branches $x = \pm\sqrt{-\mu}$ that appear for $\mu < 0$ are unstable, so they do not catch the trajectory. A perturbation that pushes the system past the unstable branch will send it to a large-amplitude attractor — if one exists globally — or to infinity. In engineering applications, such subcritical losses of stability are associated with sudden buckling or snap-through instabilities, where the structure jumps discontinuously to a new configuration upon exceeding the critical load.
[remark: Identifying the Bifurcation Type]
To classify a bifurcation from the reduced one-dimensional equation $\dot{x} = f(x;\mu)$, apply the following decision procedure after carrying out the balancing argument:
1. If $f(0;\mu) \neq 0$ for generic $\mu$ (i.e., the origin is not a fixed point for all $\mu$), then the bifurcation is a **saddle-node**: two fixed points collide and annihilate.
2. If $f(0;\mu) = 0$ for all $\mu$ (fixed point persists), check whether the leading nonlinear term in $x$ is $x^2$ or $x^3$. If $x^2$, it is a **transcritical** bifurcation (two fixed points exchange stability). If $x^3$ (and the system is odd in $x$), it is a **pitchfork** bifurcation.
A more systematic rule: if in the balanced form of $\dot{x}$ there appears a cubic term in $x$ with no quadratic, the bifurcation is pitchfork. If there is a quadratic term in $x$ with a standalone $\mu$ term, it is saddle-node. If the quadratic term in $x$ appears multiplied by $\mu$ rather than as a standalone constant, it is transcritical.
[/remark]
The decision procedure in the remark is most reliable when the reduced equation is already in normal form or close to it. In practice, one first carries out the centre manifold reduction, then applies the balancing argument, and only at that stage reads off the type. Attempting to classify directly from the original high-dimensional system — before reduction — risks misidentifying the bifurcation because the higher-dimensional terms can shift the apparent leading order.
### Structural Stability of Stationary Bifurcations
We now ask which of these bifurcations persist under small perturbations — that is, which are **structurally stable**.
[explanation: Structural Stability of the Standard Bifurcations]
Recall from Chapter 1 that a vector field $f$ is structurally stable if any sufficiently small perturbation $f + g$ is topologically equivalent to $f$.
**Saddle-node bifurcation:** The saddle-node is structurally stable among all bifurcations. Any small perturbation of the normal form $\dot{x} = \mu - x^2$ merely shifts the location of the bifurcation point slightly in the $(\mu, x)$-plane; it does not change the topological type. The qualitative picture — two fixed points collide and disappear — persists.
**Transcritical bifurcation:** The transcritical bifurcation is **structurally unstable**. To see this, perturb the normal form:
\begin{align*}
\dot{x} = \varepsilon + \mu x - x^2.
\end{align*}
For $\varepsilon \neq 0$, the fixed points no longer satisfy $x = 0$ for all $\mu$. Instead, the crossing of two fixed-point branches is broken: for $\varepsilon > 0$ the two branches no longer intersect (the discriminant $\mu^2 + 4\varepsilon$ is always positive, so there are always two real fixed points, but they never coincide at $\mu = 0$); for $\varepsilon < 0$ there is a gap in $\mu$ where no fixed points exist. In either case, the exchange-of-stability picture is replaced by a qualitatively different topology, showing that the transcritical bifurcation is unstable to perturbations.
**Pitchfork bifurcation:** The pitchfork bifurcation is also **structurally unstable**, for the same reason: it relies on the odd symmetry $f(-x;\mu) = -f(x;\mu)$ being exact. Perturb the supercritical normal form:
\begin{align*}
\dot{x} = \varepsilon + \mu x - x^3.
\end{align*}
For $\varepsilon \neq 0$, the system no longer has $x = 0$ as a fixed point for all $\mu$, and the symmetric pitchfork diagram is split into one of two qualitatively different pictures (depending on the sign of $\varepsilon$): either one branch of fixed points exists for all $\mu$ and a saddle-node occurs at some $\mu < 0$ (for $\varepsilon > 0$), or the reverse. The characteristic three-branch structure of the pitchfork is destroyed.
This shows that in applications where the odd symmetry is only approximate, the pitchfork will be replaced by a perturbed pitchfork diagram, and one should expect to see isolated saddle-node bifurcations nearby.
[/explanation]
## Oscillatory Bifurcations: The Hopf Bifurcation
The zero-eigenvalue bifurcations classified in the previous section cannot account for all observed qualitative changes in dynamical systems. Many systems — chemical oscillators, predator-prey models, the van der Pol circuit, neural firing models — develop self-sustaining periodic behaviour as a parameter crosses a critical value. A stable equilibrium becomes oscillatory, but the equilibrium itself never disappears: it just loses stability while a periodic orbit is born around it. No zero eigenvalue is involved. The question is: what mechanism creates a limit cycle from a stable spiral equilibrium, and what controls whether the transition is gradual or sudden?
The answer cannot come from stationary bifurcation theory. A zero eigenvalue means one direction of the Jacobian is neither expanding nor contracting at first order, and the resulting bifurcation involves fixed points colliding or exchanging stability. But a spiral equilibrium has complex eigenvalues with non-zero imaginary part — it is genuinely rotating — and the loss of stability of a spiral must involve a pair of complex eigenvalues crossing the imaginary axis simultaneously. This purely oscillatory crossing is the Hopf bifurcation.
[example: Van der Pol Oscillator]
To make the question concrete, consider the van der Pol oscillator with a parameter controlling dissipation:
\begin{align*}
\ddot{u} - \mu(1 - u^2)\dot{u} + u &= 0, \quad \mu \in \mathbb{R}.
\end{align*}
Writing $x = u$ and $y = \dot{u}$, this becomes the planar system
\begin{align*}
\dot{x} &= y, \\
\dot{y} &= -x + \mu(1 - x^2)y.
\end{align*}
The only fixed point is the origin. The Jacobian at the origin is
\begin{align*}
Jf_{\mathbf{0}} = \begin{pmatrix} 0 & 1 \\ -1 & \mu \end{pmatrix},
\end{align*}
which has characteristic polynomial $\lambda^2 - \mu\lambda + 1 = 0$ and eigenvalues
\begin{align*}
\lambda = \frac{\mu \pm \sqrt{\mu^2 - 4}}{2}.
\end{align*}
For $|\mu| < 2$, the discriminant $\mu^2 - 4 < 0$, so the eigenvalues are $\lambda = \frac{\mu}{2} \pm i\frac{\sqrt{4 - \mu^2}}{2}$, a pair of complex conjugates with real part $\alpha(\mu) = \mu/2$. For $\mu < 0$ the real part is negative and the origin is a stable spiral: trajectories wind inward. For $\mu > 0$ the real part is positive and the origin is an unstable spiral: trajectories wind outward. At $\mu = 0$, the eigenvalues are $\pm i$, purely imaginary. The transversality condition holds: $d\alpha/d\mu = 1/2 \neq 0$.
Numerical simulation shows that for $\mu > 0$ sufficiently small, there exists a stable limit cycle surrounding the unstable origin. As $\mu$ increases through $0$, the stable spiral loses stability and a periodic orbit appears. This is the phenomenon that zero-eigenvalue theory cannot explain: no eigenvalue passes through zero, yet the qualitative structure changes completely.
[/example]
With this motivating example in hand, we now define the Hopf bifurcation precisely.
[definition: Hopf Bifurcation]
A **Hopf bifurcation** occurs at a fixed point $x^* = \mathbf{0}$ at parameter value $\mu = 0$ when the Jacobian $Jf_{\mathbf{0}}(\mu)$ has a pair of purely imaginary eigenvalues $\lambda(\mu) = \alpha(\mu) \pm i\omega(\mu)$ with $\omega(0) \neq 0$, satisfying the **transversality condition**
\begin{align*}
\left.\frac{d\alpha}{d\mu}\right|_{\mu=0} \neq 0.
\end{align*}
The transversality condition ensures that the eigenvalues actually cross the imaginary axis (rather than touching it and returning) as $\mu$ passes through $0$.
[/definition]
At a Hopf bifurcation, the centre manifold is two-dimensional (spanned by the real and imaginary parts of the complex eigenvector). The dynamics on this two-dimensional centre manifold, in polar coordinates $(r, \theta)$ compatible with the rotational structure of the complex eigenvalues, takes the following normal form.
[definition: Normal Form for the Hopf Bifurcation]
In a neighbourhood of a Hopf bifurcation at the origin, the flow on the two-dimensional centre manifold is, to leading order, described by
\begin{align*}
\dot{r} &= (\mu - ar^2)r, \\
\dot{\theta} &= \omega,
\end{align*}
where $\omega > 0$ is the imaginary part of the eigenvalue at $\mu = 0$, and $a \in \mathbb{R}$ is a computable quantity (depending on third-order derivatives of $f$ at the fixed point). The sign of $a$ determines the type of the Hopf bifurcation:
- $a > 0$: **supercritical** Hopf bifurcation.
- $a < 0$: **subcritical** Hopf bifurcation.
[/definition]
The existence and uniqueness of the periodic orbit — and its amplitude as a function of $\mu$ — is captured by the following fundamental theorem.
[quotetheorem:234]
The proof of the Hopf Bifurcation Theorem proceeds by reducing to the two-dimensional normal form via the centre manifold, then applying the Poincaré return map to a transverse section: the periodic orbit corresponds to a fixed point of this return map, and its existence follows from the implicit function theorem once the non-degeneracy condition $a \neq 0$ is verified.
The normal form is integrable: the $\dot{r}$ equation decouples from $\dot{\theta}$ and is the one-dimensional ODE
\begin{align*}
\dot{r} = r(\mu - ar^2).
\end{align*}
Fixed points of this equation with $r > 0$ give periodic orbits of the full system, since $\dot{\theta} = \omega \neq 0$ ensures the trajectory is genuinely periodic. A fixed point at $r^* > 0$ requires $\mu = a(r^*)^2$, so $r^* = \sqrt{\mu/a}$, which exists when $\mu/a > 0$.
### Supercritical Hopf Bifurcation ($a > 0$)
For the supercritical case, the periodic orbit at $r^* = \sqrt{\mu/a}$ exists for $\mu > 0$.
- $\mu < 0$: The fixed point $r = 0$ is the only equilibrium; the derivative $\partial(\mu - ar^2)/\partial r\big|_{r=0} = \mu < 0$ shows that it is a stable spiral (trajectories in $\mathbb{R}^2$ spiral inward toward the origin).
- $\mu = 0$: The eigenvalues are purely imaginary; the linearised system is a centre, but the nonlinear terms determine the true behaviour.
- $\mu > 0$: The origin is unstable (derivative $\mu > 0$); a stable limit cycle at $r = \sqrt{\mu/a}$ surrounds it. The derivative of $\dot{r}$ with respect to $r$ at this limit cycle is $\mu - 3ar^2\big|_{r=\sqrt{\mu/a}} = \mu - 3\mu = -2\mu < 0$, confirming stability. Trajectories starting between the origin and the cycle spiral outward to the cycle; trajectories starting outside the cycle spiral inward to it.
<!-- illustration-needed: Supercritical Hopf bifurcation diagram: amplitude r of the periodic orbit plotted against mu, showing r=0 stable for mu<0, bifurcation at mu=0, and stable branch r=sqrt(mu/a) for mu>0 -->
<!-- illustration-needed: Phase portraits in the (x,y)-plane for the supercritical Hopf at three values: mu<0 (stable spiral to origin), mu=0 (neutral), mu>0 (unstable origin surrounded by stable limit cycle) -->
The supercritical case is the benign scenario: as $\mu$ increases through zero, the system transitions continuously from a stable fixed point to a stable periodic orbit. The amplitude $r^* = \sqrt{\mu/a}$ grows from zero like a square root, so the oscillation builds gradually. This gradual onset is characteristic of supercritical Hopf bifurcations and distinguishes them sharply from the abrupt loss of stability seen in the subcritical case. In particular, near the bifurcation, the amplitude and period of the limit cycle can be measured and compared directly with the $\sqrt{\mu}$ and $2\pi/\omega_0$ predictions — a valuable diagnostic in experiments.
### Subcritical Hopf Bifurcation ($a < 0$)
For the subcritical case, the periodic orbit at $r^* = \sqrt{\mu/a} = \sqrt{-\mu/|a|}$ exists for $\mu < 0$.
- $\mu < 0$: An unstable limit cycle at $r = \sqrt{-\mu/|a|}$ surrounds the stable fixed point at $r = 0$. Trajectories starting inside the cycle spiral inward to the origin; trajectories starting outside the cycle spiral outward (escaping to infinity in the truncated normal form, or to some other attractor in the full system).
- $\mu = 0$: The limit cycle has collapsed to zero; the origin has purely imaginary eigenvalues.
- $\mu > 0$: The origin is an unstable spiral. No nearby periodic orbit exists; trajectories escape.
The subcritical Hopf bifurcation is more dangerous in applications: as $\mu$ increases through $0$, the stable fixed point abruptly loses stability with no nearby stable periodic orbit to catch the system. This typically produces a large-amplitude jump to a distant attractor, making the transition discontinuous and potentially hysteretic.
<!-- illustration-needed: Subcritical Hopf bifurcation diagram showing the unstable limit cycle branch for mu<0, and the qualitative difference from the supercritical case -->
[remark: Comparison of Supercritical and Subcritical Hopf Bifurcations]
Both types share the essential defining feature: a limit cycle is created or destroyed as $\mu$ passes through $0$. The difference lies in which side of the bifurcation carries the periodic orbit and whether that orbit is stable or unstable. In the supercritical case, the system transitions smoothly from a stable fixed point to a stable periodic orbit. In the subcritical case, the loss of stability of the fixed point is catastrophic: the system must jump to a distant state, and there is a region of bistability (for $\mu$ slightly below $0$) where both the stable fixed point and some large-amplitude attractor coexist.
[/remark]
The continuous-time framework has been comprehensive, yet many physical and mathematical systems evolve in discrete steps rather than continuously. Poincare return maps, numerical integrators, and discrete population models all give rise naturally to maps, where bifurcation analysis must be adapted.
# 6. Maps
The continuous-time theory developed in the preceding chapters — flows, fixed points, limit cycles, bifurcations — has a parallel discrete-time counterpart in the theory of iterated maps. A **map** $F \colon E \to E$ on a state space $E \subseteq \mathbb{R}^n$ defines a dynamical system by iteration: given an initial condition $x_0 \in E$, the orbit is the sequence $x_0, F(x_0), F^2(x_0), \ldots$, where $F^n$ denotes the $n$-fold composition of $F$ with itself. Discrete maps arise naturally as Poincaré return maps for continuous systems, as numerical integration schemes, and as models in their own right in population dynamics, economics, and digital signal processing. This chapter introduces the five canonical maps of the course, develops the definitions of fixed points, cycles, and attractors for maps, analyses their stability via linearisation, and classifies the bifurcations that occur as parameters vary — paralleling the saddle-node, transcritical, pitchfork, and period-doubling phenomena for one-dimensional maps.
## The Five Canonical Maps
**What are the fundamental examples that illustrate the full range of discrete dynamical behaviour?** Before developing general theory, it is essential to have a repertoire of explicit maps whose dynamics are well understood. The five examples below range from the completely regular (the rotation map) to the apparently random (the logistic map at large parameter values), and from one-dimensional maps of an interval to two-dimensional maps of the unit square.
[definition: Logistic Map]
The **logistic map** is the map $F \colon [0,1] \to [0,1]$ defined by
\begin{align*}
F(x) = \mu x(1 - x),
\end{align*}
where $\mu \in (0, 4]$ is a real parameter. The iteration is $x_{n+1} = \mu x_n(1 - x_n)$.
[/definition]
The logistic map was introduced by Robert May in 1976 as a model of population dynamics: $x_n \in [0,1]$ represents the population relative to the carrying capacity, and $\mu$ is the growth rate. For $\mu \in (0,4]$ the map sends $[0,1]$ into itself, as can be verified by noting that $F$ achieves its maximum $\mu/4$ at $x = 1/2$. The parameter $\mu = 3$ already produces visually striking dynamics: the graph of $F$ over $[0,1]$ is a parabola with vertex at $(1/2, 3/4)$, and iterating from almost any initial condition eventually leads to a 2-cycle. As $\mu$ increases toward 4, the map undergoes a cascade of period-doubling bifurcations culminating in chaotic dynamics.
<!-- illustration-needed: Graph of the logistic map F(x) = μx(1-x) over [0,1] for μ=3, together with the diagonal y=x and a cobweb diagram showing convergence to a 2-cycle -->
[definition: Tent Map]
The **tent map** is the map $F \colon [0,1] \to [0,1]$ defined by
\begin{align*}
F(x) = \begin{cases} \mu x & 0 \leq x \leq \tfrac{1}{2}, \\ \mu(1 - x) & \tfrac{1}{2} \leq x \leq 1, \end{cases}
\end{align*}
where $\mu \in (0, 2]$ is a real parameter.
[/definition]
The tent map is piecewise linear, which makes many calculations exact rather than approximate. At $\mu = 2$ it maps $[0,1]$ onto $[0,1]$ and is conjugate to the logistic map at $\mu = 4$: there is a homeomorphism $h \colon [0,1] \to [0,1]$ satisfying $h \circ F_{\text{tent}} = F_{\text{logistic}} \circ h$, so the two maps have identical symbolic dynamics. The tent shape (an inverted V) means the map has slope $\pm \mu$ everywhere except at the kink $x = 1/2$.
[definition: Rotation Map]
The **rotation map** is the map $F \colon \mathbb{R}/\mathbb{Z} \to \mathbb{R}/\mathbb{Z}$ (the circle with circumference 1, identified with $[0,1)$ under addition modulo 1) defined by
\begin{align*}
F(x) = x + \omega \pmod{1},
\end{align*}
where $\omega \in \mathbb{R}$ is the rotation number.
[/definition]
The rotation map exhibits a sharp dichotomy depending on the arithmetic nature of $\omega$. Unlike the logistic or tent map, it has no stretching or folding: every point moves by exactly the same angular displacement at every step, so the rotation is volume-preserving and the map is an isometry of the circle. The only source of complexity is number-theoretic — whether $\omega$ is rational or irrational determines whether orbits are finite or infinite, periodic or aperiodic, and sparse or dense. This arithmetic sensitivity makes the rotation map a clean prototype for the dichotomy between integrable and ergodic dynamics.
[remark: Rational vs Irrational Rotation]
If $\omega \in \mathbb{Q}$, say $\omega = p/q$ in lowest terms, then every point is periodic with period exactly $q$: $F^q(x) = x + q\omega = x + p \equiv x \pmod{1}$. If $\omega \notin \mathbb{Q}$, then every orbit is dense in $[0,1)$: for any $x_0$ and any interval $(a,b) \subseteq [0,1)$, there exists $n \in \mathbb{N}$ such that $F^n(x_0) \in (a,b)$. This is because $\{n\omega \bmod 1 : n \in \mathbb{N}\}$ is equidistributed (Weyl's theorem), and in particular dense.
[/remark]
<!-- illustration-needed: Cobweb diagram for the rotation map on [0,1] with ω irrational, showing how iterates are spread densely around the circle without returning exactly to the start -->
[definition: Bernoulli Shift Map]
The **Bernoulli shift map** (also called the **sawtooth map**) is the map $F \colon [0,1) \to [0,1)$ defined by
\begin{align*}
F(x) = 2x \pmod{1}.
\end{align*}
[/definition]
The Bernoulli shift map has a vivid description in terms of binary representations. Write $x_0 = 0.a_1 a_2 a_3 \cdots$ in binary, where each $a_k \in \{0,1\}$. Multiplying by 2 shifts the binary point one place to the right, and taking the result modulo 1 discards the integer part. Thus
\begin{align*}
x_1 = F(x_0) = 0.a_2 a_3 a_4 \cdots, \quad x_2 = F^2(x_0) = 0.a_3 a_4 a_5 \cdots.
\end{align*}
After $n$ iterations, the digit $a_{n+1}$ has moved to the leading position, so iteration of $F$ reads off the binary digits of $x_0$ one by one.
[remark: Trichotomy of Bernoulli Orbits]
The dynamics of the Bernoulli shift depend on the binary expansion of $x_0$:
- If $x_0 = p \cdot 2^{-n}$ for some integers $p, n$ (i.e. $x_0$ is a dyadic rational), then the binary expansion of $x_0$ is eventually zero, so the orbit eventually reaches $0$ and is absorbed there.
- If $x_0 = p/q \in \mathbb{Q}$ with $q$ not a power of 2, then the binary expansion of $x_0$ is eventually periodic, so the orbit eventually reaches a periodic cycle.
- If $x_0 \notin \mathbb{Q}$, then the binary expansion of $x_0$ is aperiodic, so the orbit is aperiodic. Moreover, orbits of irrational $x_0$ are dense in $[0,1)$.
[/remark]
<!-- illustration-needed: Graph of the Bernoulli shift map F(x) = 2x mod 1 on [0,1], showing the two linear branches and the jump at x = 1/2 -->
The Bernoulli shift is the simplest model of a chaotic map: it has sensitive dependence on initial conditions (two nearby points $x_0$ and $x_0 + \varepsilon$ will differ by $2^n \varepsilon \pmod 1$ after $n$ steps, so they diverge exponentially), yet the map is completely explicit. It is the prototype for symbolic dynamics, where one encodes an orbit by the sequence of symbols recording which half of the interval the orbit visits at each step.
[example: Sensitive Dependence in the Bernoulli Shift]
Take two initial conditions $x_0 = 0.100$ and $x_0' = 0.101$, both in $[0,1)$. In decimal, their initial separation is $|x_0' - x_0| = 0.001$. Since $F(x) = 2x \bmod 1$, successive iterates of $x_0$ are:
\begin{align*}
x_0 &= 0.100, \\
x_1 &= 0.200, \\
x_2 &= 0.400, \\
x_3 &= 0.800, \\
x_4 &= 0.600, \\
x_5 &= 0.200.
\end{align*}
Successive iterates of $x_0' = 0.101$ are:
\begin{align*}
x_0' &= 0.101, \\
x_1' &= 0.202, \\
x_2' &= 0.404, \\
x_3' &= 0.808, \\
x_4' &= 0.616, \\
x_5' &= 0.232.
\end{align*}
The separations at each step are $|x_n' - x_n| = 0.001 \cdot 2^n$ (as long as no modular reduction has separated them into different branches). Computing: at $n=0$, separation $= 0.001$; at $n=1$, $0.002$; at $n=2$, $0.004$; at $n=3$, $0.008$; at $n=4$, $0.016$; at $n=5$, the separation is $|0.232 - 0.200| = 0.032 = 0.001 \cdot 2^5$. Thus the separation grows by a factor of 2 at every step, confirming exponential divergence. After $n = 10$ steps the separation has grown to $0.001 \cdot 2^{10} = 1.024$, which exceeds the size of the entire interval — the two orbits are completely unrelated. Two initial conditions that agree to three decimal places are indistinguishable after 10 iterations.
[/example]
[definition: Baker Map]
The **baker map** is the map $F \colon [0,1) \times [0,1) \to [0,1) \times [0,1)$ defined by
\begin{align*}
F(x, y) = \begin{cases} \left(2x,\; \dfrac{y}{2}\right) & 0 \leq x < \tfrac{1}{2}, \\ \left(2x - 1,\; \dfrac{y+1}{2}\right) & \tfrac{1}{2} \leq x < 1. \end{cases}
\end{align*}
Equivalently, $x_{n+1} = 2x_n \bmod 1$ and $y_{n+1} = \frac{y_n + \lfloor 2x_n \rfloor}{2}$.
[/definition]
The baker map is the two-dimensional extension of the Bernoulli shift: the $x$-coordinate evolves exactly as under the sawtooth map, while the $y$-coordinate is compressed. In the $x < 1/2$ case, the unit square is stretched horizontally by a factor of 2 and compressed vertically by a factor of 2, so that area is preserved. In the $x \geq 1/2$ case, the same operation is performed on the right half of the square and the result is translated upward. The name comes from the analogy with kneading dough: the square is stretched, cut, and stacked. The baker map preserves area (the Jacobian determinant is 1 everywhere), making it a simple model of area-preserving chaotic dynamics.
<!-- illustration-needed: The baker map acting on the unit square: left half stretched to a wide thin rectangle on the bottom, right half stretched to a wide thin rectangle placed on top, illustrating the "stretch and stack" action -->
## Fixed Points, Cycles, and Invariant Sets
**Given a map $F \colon E \to E$, which points return to themselves under iteration, and what sets are mapped to themselves?** The qualitative long-time behaviour of an orbit is organised around the invariant structures of the map, the most fundamental of which are fixed points and periodic orbits.
Consider a general map $F \colon E \to E$ where $E \subseteq \mathbb{R}^n$.
[definition: Fixed Point of a Map]
A point $x^* \in E$ is a **fixed point** of $F$ if
\begin{align*}
F(x^*) = x^*.
\end{align*}
[/definition]
Fixed points of maps are the analogue of equilibria for flows. A periodic orbit with period greater than one has no analogue in the continuous-time setting at the level of a single point, but the concept of a period-$n$ point captures a point that returns to itself after $n$ steps.
[definition: Periodic Point of Period $n$]
A point $x^* \in E$ is a **periodic point of period $n$** if
\begin{align*}
F^n(x^*) = x^*
\end{align*}
and $F^k(x^*) \neq x^*$ for all $k \in \{1, 2, \ldots, n-1\}$. The smallest such $n$ is called the **minimal period** of $x^*$.
[/definition]
Every fixed point is a periodic point of period 1. A periodic point of minimal period $n$ visits exactly $n$ distinct points in $E$ before returning to itself.
[definition: $N$-Cycle]
If $x_0 \in E$ is a periodic point of minimal period $N$, then the set
\begin{align*}
\bigl\{x_0,\, F(x_0),\, F^2(x_0),\, \ldots,\, F^{N-1}(x_0)\bigr\}
\end{align*}
is called an **$N$-cycle** (or a **periodic orbit of period $N$**).
[/definition]
Note that every point of an $N$-cycle is itself a periodic point of minimal period $N$. The cycle as a set is a finite invariant set for $F$.
[definition: Invariant Set for a Map]
A set $\Lambda \subseteq E$ is **invariant** under $F$ if
\begin{align*}
x \in \Lambda \implies F(x) \in \Lambda.
\end{align*}
[/definition]
Invariant sets are the skeleton around which the dynamics organise themselves. Every fixed point is an invariant set (a singleton), every $N$-cycle is a finite invariant set, and the entire state space $E$ is trivially invariant. More interesting examples arise as invariant manifolds, attractors, and repellers that partition the phase space according to the long-time behaviour of orbits. The natural object to associate to a given initial condition is its orbit — the set of all points visited under iteration.
[definition: Forward Orbit]
The **forward orbit** of $x_0 \in E$ under $F$ is the set
\begin{align*}
\mathcal{O}^+(x_0) = \bigl\{x_0,\, F(x_0),\, F^2(x_0),\, \ldots\bigr\}.
\end{align*}
[/definition]
The forward orbit of any point is a forward-invariant set; it is fully invariant (i.e. $F^{-1}(\Lambda) = \Lambda$ as well) only when $F$ is invertible and we include the backward orbit.
The stability of invariant sets for maps parallels the definitions already established for flows. An invariant set $\Lambda$ is **Lyapunov stable** if trajectories starting close to $\Lambda$ remain close; it is **quasi-asymptotically stable** if trajectories starting close to $\Lambda$ converge to $\Lambda$ in the limit $n \to \infty$; and it is **asymptotically stable** if it is both Lyapunov stable and quasi-asymptotically stable. These definitions carry over verbatim from the continuous-time case by replacing the flow $\varphi$ with the iterate $F^n$ and the limit $t \to \infty$ with $n \to \infty$.
[definition: Attractor]
An **attractor** is an invariant set $\Lambda$ that is asymptotically stable and that contains no proper non-empty subset which is itself both invariant and asymptotically stable.
[/definition]
The minimality condition in the definition of an attractor rules out redundancy: an attractor is the smallest invariant set that attracts nearby orbits. Fixed points and stable $N$-cycles are the simplest examples; strange attractors with fractal geometry arise in chaotic systems.
## Stability of Fixed Points
**Given a fixed point $x^*$ of $F$, how does one determine whether nearby orbits converge to or diverge from $x^*$?** As in the continuous-time case, the answer at the linear level is given by the eigenvalues of the derivative of $F$ at the fixed point.
Let $x^*$ be a fixed point of $F \colon E \to E$ where $E \subseteq \mathbb{R}^n$ is open and $F$ is continuously differentiable. Define the displacement $y_n := x_n - x^*$. Since $x^*$ is a fixed point, expanding $F$ around $x^*$ gives
\begin{align*}
x_{n+1} = F(x^* + y_n) = F(x^*) + Ay_n + O(|y_n|^2) = x^* + Ay_n + O(|y_n|^2),
\end{align*}
where $A = JF_{x^*} \in \mathbb{R}^{n \times n}$ is the Jacobian matrix of $F$ evaluated at $x^*$. Subtracting $x^*$ from both sides:
\begin{align*}
y_{n+1} = Ay_n + O(|y_n|^2).
\end{align*}
For small $|y_n|$, the dominant behaviour is determined by the linear map $y \mapsto Ay$. Under repeated application of the linear map, $|y_n|$ grows like $|\lambda|^n$ where $\lambda$ is the eigenvalue of $A$ with largest modulus. This motivates the following stability criterion.
[quotetheorem:2798]
[citeproof:2798]
The condition $|\lambda| < 1$ replaces the condition $\operatorname{Re}(\lambda) < 0$ from the continuous-time theory. This is because the discrete analogue of the exponential $e^{\lambda t}$ is $\lambda^n$, and $|\lambda^n| \to 0$ as $n \to \infty$ if and only if $|\lambda| < 1$. The unit circle $|\lambda| = 1$ in the complex plane plays the role of the imaginary axis.
[remark: Stability of Cycles]
The stability of an $N$-cycle $\{x_0, F(x_0), \ldots, F^{N-1}(x_0)\}$ is determined by treating each point of the cycle as a fixed point of the iterated map $G := F^N$. The Jacobian of $G$ at $x_0$ is, by the chain rule,
\begin{align*}
JG_{x_0} = JF_{F^{N-1}(x_0)} \cdots JF_{F(x_0)} \cdot JF_{x_0}.
\end{align*}
The cycle is asymptotically stable if all eigenvalues of this product matrix have modulus less than 1, and unstable if any eigenvalue has modulus greater than 1. Crucially, the same product matrix (up to cyclic permutation) is obtained at every point of the cycle, so the stability type is the same for all points of the cycle.
[/remark]
[remark: Indeterminacy at the Unit Circle]
The non-hyperbolic case $|\lambda_i| = 1$ for some $i$ is genuinely indeterminate — the linearisation alone gives no information and the nonlinear terms decide stability. A concrete illustration: take $F(x) = x + x^3$ at the fixed point $x^* = 0$. The multiplier is $F'(0) = 1$, which lies on the unit circle, so the theorem is silent. However, for any $x_0 > 0$ the iterates satisfy $x_{n+1} = x_n + x_n^3 > x_n$, so the sequence is strictly increasing and cannot converge to 0 — the origin is unstable, despite having multiplier exactly 1. This example shows that one must not conclude stability from $|\lambda| = 1$; a separate centre manifold or Lyapunov function analysis is required.
[/remark]
The logistic map provides a worked example of all three possibilities — unstable fixed point at $x^* = 0$ for $\mu > 1$, a hyperbolic stable fixed point for $\mu \in (1,3)$, and a non-hyperbolic point at $\mu = 3$ where the multiplier exits the unit circle and a period-doubling bifurcation is born.
[example: Fixed Points of the Logistic Map]
For the logistic map $F(x) = \mu x(1-x)$ on $[0,1]$, we find the fixed points by solving $F(x^*) = x^*$, i.e. $\mu x^*(1-x^*) = x^*$. This gives $x^*((\mu-1) - \mu x^*) = 0$, so
\begin{align*}
x^* = 0 \quad \text{or} \quad x^* = \frac{\mu - 1}{\mu}.
\end{align*}
The second fixed point lies in $(0,1)$ if and only if $\mu > 1$. To assess stability, compute $F'(x) = \mu(1-2x)$. At $x^* = 0$: $F'(0) = \mu$. For $\mu > 1$, we have $|F'(0)| = \mu > 1$, so the origin is unstable. At $x^* = (\mu-1)/\mu$:
\begin{align*}
F'\!\left(\frac{\mu-1}{\mu}\right) = \mu\left(1 - \frac{2(\mu-1)}{\mu}\right) = \mu \cdot \frac{2-\mu}{\mu} = 2 - \mu.
\end{align*}
The condition $|2-\mu| < 1$ is equivalent to $1 < \mu < 3$. For $\mu \in (1,3)$, the non-trivial fixed point is asymptotically stable. At $\mu = 3$, the eigenvalue equals $-1$ (the multiplier passes through the unit circle), which signals a period-doubling bifurcation — the fixed point loses stability and a stable 2-cycle is born.
[/example]
[example: Computing a 2-Cycle of the Logistic Map]
At $\mu = 3.2$, the non-trivial fixed point $x^* = (\mu-1)/\mu = 2.2/3.2 = 0.6875$ has multiplier $|2 - \mu| = |2 - 3.2| = 1.2 > 1$, so it is unstable. We expect a stable 2-cycle. A 2-cycle consists of two distinct points $p, q \in [0,1]$ with $F(p) = q$ and $F(q) = p$, i.e. both $p$ and $q$ are fixed points of $F^2$. Every fixed point of $F$ is also a fixed point of $F^2$, so the fixed points of $F^2$ beyond the two fixed points of $F$ are the genuine period-2 points.
The equation $F^2(x) = x$ is a degree-4 polynomial; dividing out the fixed points of $F$ leaves a quadratic. Setting $F(x) = \mu x(1-x)$, the fixed points of $F$ satisfy $\mu x^2 - (\mu-1)x = 0$, i.e. $x = 0$ or $x = (\mu-1)/\mu$. Factoring these out of $F^2(x) - x$ by polynomial long division yields the remaining quadratic:
\begin{align*}
\mu^2 x^2 - \mu(\mu+1)x + (\mu+1) = 0.
\end{align*}
Using the quadratic formula with $\mu = 3.2$:
\begin{align*}
x &= \frac{\mu(\mu+1) \pm \sqrt{\mu^2(\mu+1)^2 - 4\mu^2(\mu+1)}}{2\mu^2} = \frac{(\mu+1) \pm \sqrt{(\mu+1)(\mu-3)}}{2\mu}.
\end{align*}
With $\mu = 3.2$: $\mu + 1 = 4.2$, $\mu - 3 = 0.2$, $(\mu+1)(\mu-3) = 0.84$, $\sqrt{0.84} \approx 0.9165$. Therefore:
\begin{align*}
p &= \frac{4.2 - 0.9165}{6.4} \approx \frac{3.2835}{6.4} \approx 0.5131, \\
q &= \frac{4.2 + 0.9165}{6.4} \approx \frac{5.1165}{6.4} \approx 0.7994.
\end{align*}
We verify: $F(0.5131) = 3.2 \cdot 0.5131 \cdot (1 - 0.5131) = 3.2 \cdot 0.5131 \cdot 0.4869 \approx 3.2 \cdot 0.2499 \approx 0.7997 \approx q$. And $F(0.7994) = 3.2 \cdot 0.7994 \cdot 0.2006 \approx 3.2 \cdot 0.1604 \approx 0.5133 \approx p$. The 2-cycle is confirmed.
To assess stability, the multiplier of $F^2$ at either 2-cycle point equals $F'(p) \cdot F'(q)$ by the chain rule. Using $F'(x) = \mu(1-2x) = 3.2(1-2x)$:
\begin{align*}
F'(p) &= 3.2(1 - 2 \cdot 0.5131) = 3.2 \cdot (-0.0262) \approx -0.0838, \\
F'(q) &= 3.2(1 - 2 \cdot 0.7994) = 3.2 \cdot (-0.5988) \approx -1.9162.
\end{align*}
Thus the multiplier of $F^2$ at the 2-cycle is $(-0.0838) \cdot (-1.9162) \approx 0.1606$. Since $|0.1606| < 1$, the 2-cycle is asymptotically stable, confirming that orbits from almost any initial condition near $x_0 = 0.5$ converge to this 2-cycle for $\mu = 3.2$.
[/example]
## Bifurcations of One-Dimensional Maps
**How does the qualitative behaviour of a one-dimensional map $F(\cdot\,; \mu)$ change as the parameter $\mu$ varies?** In the continuous-time setting, bifurcations of equilibria occurred when eigenvalues of the linearisation crossed the imaginary axis. For one-dimensional maps, the analogous condition is that the eigenvalue (multiplier) $\lambda = \partial F / \partial x$ at a fixed point crosses the unit circle. Since we are in one dimension, this can only happen at $\lambda = +1$ or $\lambda = -1$.
Consider the parameterised one-dimensional map $x_{n+1} = F(x_n; \mu)$. Without loss of generality, assume there is a fixed point at $x = 0$ when $\mu = 0$, so $F(0;0) = 0$, and suppose a bifurcation occurs at $\mu = 0$, meaning $\frac{\partial F}{\partial x}(0;0) = \pm 1$.
The Taylor expansion of $F$ around $(x,\mu) = (0,0)$ takes the form
\begin{align*}
F(x;\mu) = x + \lambda x + F_\mu \mu + \tfrac{1}{2} F_{xx} x^2 + F_{x\mu} x\mu + \tfrac{1}{2} F_{\mu\mu}\mu^2 + \tfrac{1}{6}F_{xxx}x^3 + \cdots
\end{align*}
where subscripts denote partial derivatives evaluated at the origin. For a fixed point we require $F(x^*;\mu) = x^*$, so the bifurcating fixed points satisfy $F(x^*;\mu) - x^* = 0$. The type of bifurcation is determined by which derivatives vanish.
### Saddle-Node Bifurcation
[definition: Saddle-Node Bifurcation (Maps)]
A **saddle-node bifurcation** occurs when $\lambda = \partial F/\partial x\big|_{(0,0)} = 1$, $F_\mu \neq 0$, and $F_{xx} \neq 0$. The map near the bifurcation takes the form
\begin{align*}
x_{n+1} = x_n + \mu F_\mu + \tfrac{1}{2} x_n^2 F_{xx} + O(3),
\end{align*}
where $O(3)$ denotes terms of order $|\mu|^{3/2}$ or $|x|^3$ or products of lower-order terms.
[/definition]
The balance of terms in the fixed-point equation $F(x^*;\mu) = x^*$ gives $\mu F_\mu + \tfrac{1}{2}(x^*)^2 F_{xx} \approx 0$, so
\begin{align*}
x^* \approx \pm\sqrt{\frac{-2\mu F_\mu}{F_{xx}}}.
\end{align*}
This reveals that two fixed points are created (or destroyed) as $\mu$ passes through 0, with $x^* = O(\sqrt{\mu})$. The two fixed points collide and annihilate at $\mu = 0$.
[definition: Normal Form — Saddle-Node]
The **normal form of the saddle-node bifurcation** for maps is
\begin{align*}
x_{n+1} = x_n + \mu - x_n^2.
\end{align*}
[/definition]
For $\mu > 0$ this map has two fixed points $x^* = \pm\sqrt{\mu}$. The multiplier at each fixed point is $F_x = 1 - 2x^*$. At $x^* = +\sqrt{\mu}$: multiplier $= 1 - 2\sqrt{\mu} < 1$ (stable for small $\mu > 0$). At $x^* = -\sqrt{\mu}$: multiplier $= 1 + 2\sqrt{\mu} > 1$ (unstable). For $\mu < 0$ there are no fixed points. At $\mu = 0$ the two fixed points merge at the origin, where the multiplier is exactly 1. This is entirely analogous to the saddle-node bifurcation for flows.
<!-- illustration-needed: Bifurcation diagram for the saddle-node normal form x_{n+1} = x_n + μ - x_n^2, showing the parabola of fixed points in the (μ, x*) plane with the stable branch drawn solid and unstable branch dashed -->
[example: Saddle-Node Bifurcation for a Concrete Map]
Consider the map $F(x;\mu) = x + \mu - x^2$ on $\mathbb{R}$. This is precisely the saddle-node normal form with $F_\mu = 1$ and $F_{xx} = -2$. We verify the conditions: $F(0;0) = 0$ (fixed point at origin), $F_x(0;0) = 1$ (multiplier equals 1), $F_\mu(0;0) = 1 \neq 0$, and $F_{xx}(0;0) = -2 \neq 0$. All saddle-node conditions are satisfied.
The fixed points satisfy $F(x^*;\mu) = x^*$, i.e. $\mu - (x^*)^2 = 0$, giving $x^* = \pm\sqrt{\mu}$. These exist for $\mu > 0$ and collide at the origin when $\mu = 0$. For $\mu < 0$ there are no fixed points.
At $\mu = 0.04$: $x^* = \pm 0.2$. The multiplier at $x^* = +0.2$ is $F_x(0.2; 0.04) = 1 - 2 \cdot 0.2 = 0.6$, so $|0.6| < 1$ and this branch is stable. The multiplier at $x^* = -0.2$ is $1 - 2 \cdot (-0.2) = 1.4$, so $|1.4| > 1$ and this branch is unstable. We verify convergence: starting from $x_0 = 0.1$ and iterating $x_{n+1} = x_n + 0.04 - x_n^2$:
\begin{align*}
x_0 &= 0.1, & x_1 &= 0.1 + 0.04 - 0.01 = 0.13, \\
x_2 &= 0.13 + 0.04 - 0.0169 = 0.1531, & x_3 &\approx 0.1531 + 0.04 - 0.02344 \approx 0.1697.
\end{align*}
The iterates are increasing toward $x^* = 0.2$, consistent with asymptotic stability. The saddle-node at $\mu = 0$ is confirmed: as $\mu$ decreases through 0, the two fixed points coalesce and disappear, and orbits no longer have a nearby attractor.
[/example]
### Transcritical Bifurcation
[definition: Transcritical Bifurcation (Maps)]
A **transcritical bifurcation** occurs when $\lambda = 1$, $F_\mu = 0$, and $F_{xx} \neq 0$. The fixed-point equation near the bifurcation reduces to
\begin{align*}
x_{n+1} = x_n + \tfrac{1}{2}x_n^2 F_{xx} + x_n\mu F_{x\mu} + \tfrac{1}{2}\mu^2 F_{\mu\mu} + O(3).
\end{align*}
[/definition]
The condition $F_\mu = 0$ means the fixed point $x = 0$ persists for all values of $\mu$ near 0 (the parameter does not move the fixed point). What changes is the stability: a second branch of fixed points crosses through $x = 0$ at $\mu = 0$, and the two branches exchange stability at the crossing.
[definition: Normal Form — Transcritical]
The **normal form of the transcritical bifurcation** for maps is
\begin{align*}
x_{n+1} = x_n + x_n(x_n - \mu b), \quad b > 0.
\end{align*}
[/definition]
The fixed points are $x^* = 0$ and $x^* = \mu b$. Expanding the normal form as $F(x;\mu) = x + x^2 - \mu b x$, the derivative is $F_x(x;\mu) = 1 + 2x - \mu b$. At $x^* = 0$: $F_x(0;\mu) = 1 - \mu b$. At $x^* = \mu b$: $F_x(\mu b;\mu) = 1 + 2\mu b - \mu b = 1 + \mu b$. For $\mu > 0$ with $0 < \mu b < 2$: $|1 - \mu b| < 1$ so $x^* = 0$ is stable; the non-trivial branch $x^* = \mu b$ has multiplier $1 + \mu b > 1$ and is unstable. For $\mu < 0$: the roles swap. The two branches exchange stability as they cross at $\mu = 0$.
### Pitchfork Bifurcation
[definition: Pitchfork Bifurcation (Maps)]
A **pitchfork bifurcation** occurs when $\lambda = 1$, $F_\mu = 0$, $F_{xx} = 0$, and $F_{xxx} \neq 0$. The map near the bifurcation takes the form
\begin{align*}
x_{n+1} = x_n + x_n\mu F_{\mu x} + \tfrac{1}{6}x_n^3 F_{xxx} + O(4) + \text{"$O(3)$" terms involving $\mu$}.
\end{align*}
[/definition]
The vanishing of $F_{xx}$ imposes an odd symmetry ($x \mapsto -x$) on the leading nonlinear term, so the map is approximately odd in $x$. The fixed-point equation gives $x^*(x_n\mu F_{\mu x} + \frac{1}{6}(x^*)^2 F_{xxx}) = 0$, yielding
\begin{align*}
x^* = 0 \quad \text{or} \quad (x^*)^2 = \frac{-6\mu F_{\mu x}}{F_{xxx}}.
\end{align*}
So a pair of non-trivial fixed points $x^* = O(\sqrt{\mu})$ bifurcates from the origin. The fixed points balance with $x^* = O(\mu^{1/2})$.
[definition: Normal Form — Pitchfork]
The **normal form of the pitchfork bifurcation** for maps is
\begin{align*}
x_{n+1} = x_n + x_n(\mu - ax_n^2), \quad a = \pm 1.
\end{align*}
[/definition]
For $a = +1$ (supercritical pitchfork): the origin is stable for $\mu < 0$ and unstable for $\mu > 0$. For $\mu > 0$, two stable fixed points $x^* = \pm\sqrt{\mu}$ are born. For $a = -1$ (subcritical pitchfork): the origin is unstable for $\mu > 0$ and stable for $\mu < 0$. For $\mu < 0$, two unstable fixed points $x^* = \pm\sqrt{-\mu/|a|}$ coexist with the stable origin before being annihilated at $\mu = 0$.
### Period-Doubling Bifurcation
The period-doubling bifurcation has no analogue in the continuous-time bifurcation theory of equilibria: it arises specifically because the multiplier can pass through $-1$ as well as $+1$.
[definition: Period-Doubling Bifurcation]
A **period-doubling bifurcation** occurs when the multiplier $\lambda = \partial F/\partial x$ at a fixed point passes through $-1$.
[/definition]
To understand why this is called a period-doubling bifurcation, suppose $F(0;\mu) = 0$ for all $\mu$ (so the origin is always a fixed point) and $F_x(0;0) = -1$. Expand:
\begin{align*}
x_{n+1} = -x_n + A\mu x_n + Bx_n^2 + C x_n^3 + D\mu^2 + O(4) + \text{"$O(3)$" terms involving $\mu$}.
\end{align*}
For a fixed point of $F$ we need $x_{n+1} = x_n$, which gives $-2x_n + \text{higher order} = 0$, so the only fixed point near the origin is $x_n = 0$.
Now consider $F^2$, the second iterate. Applying $F$ twice and collecting terms:
\begin{align*}
x_{n+2} = x_n - 2A\mu x_n - 2(C + B^2)x_n^3 + O(\mu^2) + O(4) + \text{"$O(3)$" terms involving $\mu$}.
\end{align*}
A fixed point of $F^2$ satisfies $x_{n+2} = x_n$, i.e.
\begin{align*}
-2A\mu x_n - 2(C + B^2)x_n^3 = 0,
\end{align*}
giving
\begin{align*}
x_n = 0 \quad \text{or} \quad x_n = \pm\sqrt{\frac{-A\mu}{C + B^2}}.
\end{align*}
The solution $x_n = 0$ is the existing fixed point of $F$ (which has period 1). The solutions $x_n = \pm\sqrt{-A\mu/(C+B^2)}$ are genuine period-2 points, provided the expression under the square root is positive. These points form a 2-cycle that bifurcates from the origin.
To check the stability of this 2-cycle, compute the multiplier of $F^2$ at the 2-cycle points:
\begin{align*}
\frac{d}{dx} F^2(x) \Big|_{x^*} = 1 - 2A\mu - 6(C+B^2)(x^*)^2 + O(\mu^{3/2}).
\end{align*}
Substituting $(x^*)^2 = -A\mu/(C+B^2)$:
\begin{align*}
\frac{d}{dx} F^2(x^*) = 1 - 2A\mu + 6A\mu + O(\mu^{3/2}) = 1 + 4A\mu + O(\mu^{3/2}).
\end{align*}
Meanwhile, at the original fixed point $x^* = 0$:
\begin{align*}
\frac{d}{dx} F^2(0) = (-1+A\mu)^2 + \cdots = 1 - 2A\mu + O(\mu^2).
\end{align*}
For small $\mu > 0$ with $A < 0$ (so that the 2-cycle exists): the multiplier of $F^2$ at the 2-cycle satisfies $1 + 4A\mu < 1$ (stable 2-cycle), while at $x = 0$ the multiplier satisfies $1 - 2A\mu > 1$ (unstable fixed point). **The fixed point and the 2-cycle have opposite stability**, which is the hallmark of the period-doubling bifurcation.
<!-- illustration-needed: Bifurcation diagram for a period-doubling bifurcation, showing the single branch of fixed points becoming unstable as μ increases through 0, with a stable 2-cycle branching off symmetrically above and below -->
### Classification Summary
The classification of bifurcations at a fixed point of a one-dimensional map is determined by a simple algorithm based on the multiplier $\lambda = F_x(0;0)$ and the vanishing of partial derivatives.
If $\lambda = 1$:
- $F_\mu \neq 0$ and $F_{xx} \neq 0$: **saddle-node bifurcation** — two fixed points collide and annihilate.
- $F_\mu = 0$ and $F_{xx} \neq 0$: **transcritical bifurcation** — two branches of fixed points cross and exchange stability.
- $F_\mu = 0$, $F_{xx} = 0$, and $F_{xxx} \neq 0$: **pitchfork bifurcation** — a pair of fixed points bifurcates from the origin.
If $\lambda = -1$: **period-doubling bifurcation** — the fixed point loses stability as a 2-cycle is born.
This classification mirrors the one for continuous-time systems in Chapter 5, with the roles of eigenvalues on the imaginary axis replaced by multipliers on the unit circle. The saddle-node, transcritical, and pitchfork bifurcations all arise when the multiplier is $+1$, while the distinctly discrete-time phenomenon of period-doubling arises when the multiplier is $-1$. Repeated period-doubling as a parameter varies produces a **period-doubling cascade**, and the accumulation point of such a cascade is one route to chaos — a phenomenon studied in detail when examining the logistic map at large $\mu$.
Within the ordered regime of maps — fixed points and periodic orbits — bifurcations proceed predictably through period-doubling and other routes. Beyond these boundaries lies chaotic dynamics, where sensitive dependence on initial conditions and dense orbits create fundamentally unpredictable long-term behavior.
# 7. Chaos
The preceding chapters developed tools for understanding ordered dynamics: fixed points, periodic orbits, limit cycles, and bifurcations. These structures are predictable in the sense that nearby initial conditions lead to qualitatively similar long-term behaviour. This chapter takes a fundamentally different direction. We ask: can a deterministic system produce behaviour so sensitive to initial conditions that long-term prediction becomes impossible? The answer is yes, and the mathematical framework capturing this phenomenon is called **chaos**. The central objects are two competing definitions of chaos — one due to Devaney, which is intrinsic and dynamical, and one due to Glendinning, which is combinatorial — together with the remarkable result that the presence of a period-three orbit implies chaotic behaviour of every period.
## Two Definitions of Chaos
**Problem.** Given a map $F \colon \Lambda \to \Lambda$ on an invariant set $\Lambda \subset \mathbb{R}$, what properties should we require in order to call $F$ chaotic? Two independent properties capture the two aspects of chaotic dynamics: extreme sensitivity to initial conditions, and a mixing property that prevents the system from decomposing into independent subsystems.
[definition: Sensitive Dependence on Initial Conditions]
Let $\Lambda \subset \mathbb{R}$ be an invariant set under $F \colon \Lambda \to \Lambda$, where $F$ is continuous. We say $F$ has **sensitive dependence on initial conditions** on $\Lambda$ if there exists $\delta > 0$ such that for every $x \in \Lambda$ and every $\varepsilon > 0$, there exist $y \in \Lambda$ with $|y - x| < \varepsilon$ and $n \in \mathbb{N}$ with
\begin{align*}
|F^n(y) - F^n(x)| > \delta.
\end{align*}
The constant $\delta$ is called a **sensitivity constant** for $F$ on $\Lambda$.
[/definition]
The definition captures the idea that no matter how precisely we know the initial condition $x$, there are arbitrarily nearby initial conditions $y$ whose orbits eventually diverge from that of $x$ by at least $\delta$. Note carefully that the definition does not require the orbits to diverge immediately, nor that they diverge for all time — only that at some future time $n$ the separation exceeds $\delta$. This makes it a weaker condition than one might initially expect.
[definition: Topological Transitivity]
Let $\Lambda \subset \mathbb{R}$ be an invariant set under $F \colon \Lambda \to \Lambda$, where $F$ is continuous. We say $F$ is **topologically transitive** on $\Lambda$ if for any two non-empty open sets $U, V \subset \Lambda$, there exists $n > 0$ such that
\begin{align*}
F^n(U) \cap V \neq \varnothing.
\end{align*}
[/definition]
Topological transitivity means that no non-trivial open subset of $\Lambda$ is forward-invariant: iterating any open set will eventually reach any other open set. This rules out the possibility that $\Lambda$ decomposes into two disjoint invariant open sets, which would be a sign of non-chaotic, decomposable dynamics.
It is essential to understand that these two properties are logically independent. The following examples show both combinations are possible.
[example: Transitive But Not Sensitive]
Consider the irrational rotation $F \colon [0,1) \to [0,1)$ defined by
\begin{align*}
F(x) = x + \omega \pmod{1},
\end{align*}
where $\omega \notin \mathbb{Q}$, for instance $\omega = \sqrt{2}$. By the equidistribution theorem for irrational rotations, the forward orbit $\{F^n(x) : n \geq 0\}$ is dense in $[0,1)$ for every $x \in [0,1)$. Density of orbits implies topological transitivity: given non-empty open $U, V \subset [0,1)$, pick any $x \in U$; since the orbit of $x$ is dense, some iterate $F^n(x)$ falls in $V$, hence $F^n(U) \cap V \neq \varnothing$.
However, $F$ does not have sensitive dependence on initial conditions. Since $F$ is an isometry — $|F(x) - F(y)| = |x - y|$ for all $x, y$ (distances are preserved modulo 1 for small separations) — we have $|F^n(x) - F^n(y)| = |x - y|$ for all $n$. Given any proposed sensitivity constant $\delta > 0$, choosing $\varepsilon < \delta$ ensures that for any $y$ with $|y - x| < \varepsilon$, we have $|F^n(y) - F^n(x)| = |y - x| < \varepsilon < \delta$ for all $n \geq 0$. Thus $F$ fails to be SDIC.
[/example]
[example: Sensitive But Not Transitive]
Consider the modified sawtooth map $F \colon \mathbb{R} \to \mathbb{R}$ defined by
\begin{align*}
F(x) = \begin{cases}
2(-1 - x) & x < -\tfrac{1}{2}, \\
2x & -\tfrac{1}{2} \leq x \leq \tfrac{1}{2}, \\
2(2 - x) & x > \tfrac{1}{2}.
\end{cases}
\end{align*}
The derivative satisfies $|F'(x)| = 2$ wherever it exists. For any $x$ and $y$ with $|y - x| < \varepsilon$, after $n$ iterates the separation satisfies $|F^n(y) - F^n(x)| \approx 2^n |y - x|$, so by choosing $n$ large enough we obtain $|F^n(y) - F^n(x)| > \delta = \frac{1}{4}$ for any $\varepsilon > 0$. Hence $F$ has sensitive dependence on initial conditions with $\delta = \frac{1}{4}$.
On the other hand, $F$ maps the half-line $\{x > 0\}$ to $\{x > 0\}$ and the half-line $\{x < 0\}$ to $\{x < 0\}$ (one checks from the formula that $F(x) > 0$ when $x > 0$ and $F(x) < 0$ when $x < 0$). Therefore, taking $U \subset \{x > 0\}$ and $V \subset \{x < 0\}$, no iterate $F^n(U)$ can intersect $V$. Hence $F$ is not topologically transitive on $\mathbb{R}$.
[/example]
With these two independent conditions in hand, Devaney's definition of chaos requires both properties together with a third: the periodic points should be dense.
[definition: Chaos in the Sense of Devaney]
Let $\Lambda \subset \mathbb{R}$ be a closed invariant set under a continuous map $F \colon \Lambda \to \Lambda$. We say $F$ is **chaotic on $\Lambda$ in the sense of Devaney** if all three of the following hold:
1. $F$ has sensitive dependence on initial conditions on $\Lambda$;
2. $F$ is topologically transitive on $\Lambda$;
3. The set of periodic points of $F$ is dense in $\Lambda$.
[/definition]
The third condition — density of periodic points — ensures that the chaotic behaviour is not merely an asymptotic phenomenon but is embedded densely throughout the invariant set. A system satisfying all three conditions is genuinely complex at every scale.
An alternative approach to defining chaos is combinatorial rather than topological. It is based on the concept of a horseshoe, which captures the key geometric mechanism by which orbits diverge.
[definition: Horseshoe]
Let $F \colon I \to I$ be a continuous map on a closed interval $I \subset \mathbb{R}$. We say $F$ has a **horseshoe** if there exist an open interval $J \subset I$ and disjoint open subintervals $K_0, K_1 \subset J$ such that
\begin{align*}
F(K_0) = J \quad \text{and} \quad F(K_1) = J.
\end{align*}
[/definition]
<!-- illustration-needed: horseshoe map — show an interval J with two disjoint subintervals K_0 and K_1, both of which stretch and map onto the full interval J under F, creating the characteristic "horseshoe" shape -->
The horseshoe condition says that $F$ stretches two disjoint pieces of $J$ to cover all of $J$. This is a geometric version of the doubling property of the binary shift and is the key mechanism behind exponential divergence of orbits.
[definition: Chaos in the Sense of Glendinning]
Let $F \colon I \to I$ be a continuous map on a closed interval $I \subset \mathbb{R}$. We say $F$ is **chaotic in the sense of Glendinning** if $F^n$ has a horseshoe for some $n \geq 1$.
[/definition]
[remark: Relationship Between the Two Definitions]
The Glendinning definition is strictly stronger than the Devaney definition in the following sense: if $F$ is chaotic in the sense of Glendinning, then $F$ is chaotic in the sense of Devaney. The converse does not hold in general. We will establish the implication Glendinning $\Rightarrow$ Devaney in the next section via symbolic dynamics.
[/remark]
## The Sawtooth Map as a Model of Chaos
**Question.** What does a concrete chaotic map look like, and how do we verify the three Devaney conditions directly from the map's definition rather than invoking abstract theory? The sawtooth map, also called the **Bernoulli shift**, answers both questions simultaneously: it is simple enough that every computation is explicit, yet it exhibits the full complexity demanded by Devaney's definition. It will serve as the primary example from which all subsequent arguments about chaos on intervals are modelled.
The sawtooth map $F \colon [0,1) \to [0,1)$ is defined by
\begin{align*}
F(x) = 2x \pmod{1}.
\end{align*}
In terms of binary expansions, if $x = 0.a_1 a_2 a_3 \ldots$ in base 2, then $F(x) = 0.a_2 a_3 a_4 \ldots$ — the map simply discards the leading binary digit. This identification is the key to all three proofs below.
[quotetheorem:2799]
[citeproof:2799]
The key tool here is the binary expansion: every point $x \in [0,1)$ can be written as $x = 0.a_1 a_2 a_3 \ldots$ in base 2, and the sawtooth map acts by discarding the leading digit. This makes the horseshoe completely transparent — the two halves $K_0$ and $K_1$ correspond to the cases $a_1 = 0$ and $a_1 = 1$, and stripping the leading digit maps both halves onto the entire unit interval. The same binary-expansion perspective will drive all three verifications of the Devaney conditions.
[quotetheorem:2800]
[citeproof:2800]
The sawtooth map thus serves as the canonical model. Every subsequent argument about chaos on intervals reduces, via symbolic dynamics, to properties of this map.
Two features of this verification are worth noting. First, the construction uses binary expansions in an entirely explicit way: there is no appeal to compactness or measure theory, only digit manipulation. Second, the three conditions are verified by completely different arguments — sensitive dependence via digit-flipping, topological transitivity via concatenation, and density of periodic points via repeating blocks — yet all three rest on the same foundational representation. This is what makes the sawtooth map the correct canonical example: its structure lays bare all the mechanisms of chaos simultaneously.
## Horseshoes and Symbolic Dynamics
**Question.** If $F$ has a horseshoe, can we recover all three Devaney conditions in a unified way, rather than verifying them separately?
The answer is yes, through **symbolic dynamics**: we encode orbits as binary sequences and show that the dynamics on the invariant set $\Lambda$ is topologically conjugate to the binary shift on $\{0,1\}^\mathbb{N}$. This gives Devaney chaos essentially for free.
Suppose $f \colon \mathbb{R} \to \mathbb{R}$ is continuous and has a horseshoe on $J \subset \mathbb{R}$ with accompanying open sets $K_0, K_1 \subset J$. Assume additionally:
1. $\overline{K_0} \cap \overline{K_1} = \varnothing$;
2. $f$ is monotone on each of $\overline{K_0}$ and $\overline{K_1}$.
These conditions guarantee that $f$ restricted to each of $\overline{K_0}$ and $\overline{K_1}$ has an inverse, and that the two images of the horseshoe do not overlap. A consequence is that
\begin{align*}
f(x) \in \overline{J} \implies x \in \overline{K_0} \cup \overline{K_1}.
\end{align*}
This implication is what makes the symbolic coding well-defined: every point in $\overline{J}$ that remains in $\overline{J}$ after one iterate must have come from one of the two distinguished subintervals, so the label $0$ or $1$ can be assigned unambiguously. Without monotonicity, a single preimage in $\overline{J}$ could lie outside $\overline{K_0} \cup \overline{K_1}$, breaking the one-to-one correspondence between points and binary sequences.
<!-- illustration-needed: horseshoe to shift — show the interval J with K_0 and K_1 mapped onto J, with the symbolic coding a_n = 0 or 1 depending on which subinterval contains f^{n-1}(x) -->
Define the **maximal invariant set** of the horseshoe:
\begin{align*}
\Lambda = \bigl\{ x \in \overline{J} : f^n(x) \in \overline{J} \text{ for all } n \geq 0 \bigr\}.
\end{align*}
The set $\Lambda$ consists precisely of those points whose entire forward orbit never escapes $\overline{J}$. Since most points eventually leave $\overline{J}$ and are thereafter irrelevant to the horseshoe dynamics, $\Lambda$ is the correct object of study. The first step is to confirm that $\Lambda$ is genuinely invariant — that is, $f$ maps $\Lambda$ to itself and not to a strict subset.
[quotetheorem:2801]
[citeproof:2801]
For any $x \in \Lambda$, since $f^{n-1}(x) \in \overline{J}$ and the consequence of monotonicity forces $f^{n-1}(x) \in \overline{K_0} \cup \overline{K_1}$ (as the union is disjoint), define the **itinerary** of $x$ by
\begin{align*}
a_n = \begin{cases} 0 & f^{n-1}(x) \in \overline{K_0}, \\ 1 & f^{n-1}(x) \in \overline{K_1}. \end{cases}
\end{align*}
This assigns to every $x \in \Lambda$ a binary sequence $a(x) = a_1 a_2 a_3 \ldots \in \{0,1\}^\mathbb{N}$. The itinerary of $f(x)$ is $a_2 a_3 a_4 \ldots$ — the left-shift of $a(x)$. Thus the dynamics of $f$ on $\Lambda$ corresponds to the left-shift on $\{0,1\}^\mathbb{N}$, which is precisely the binary expansion model of the sawtooth map. Applying the arguments from the previous section almost verbatim to this symbolic representation yields all three Devaney conditions on $\Lambda$. This establishes the implication Glendinning $\Rightarrow$ Devaney, as claimed in the remark above.
## Period Three Implies Chaos
**Question.** Can a single low-period orbit force global complexity throughout the entire interval? The existence of a fixed point is compatible with perfectly ordered dynamics — it says nothing about sensitive dependence. A 2-cycle likewise imposes no chaos. But a 3-cycle is qualitatively different: it forces the map to stretch and fold the interval in a way that produces horseshoes in $f^2$, and therefore Glendinning chaos. This is the content of the next theorem.
[quotetheorem:2802]
[citeproof:2802]
<!-- illustration-needed: period-3-implies-chaos — phase portrait showing the 3-cycle x_1 < x_2 < x_3, the fixed point z in (x_2,x_3), the point y in (x_1,x_2), and the intervals K_1 = (y,r) and K_2 = (s,z) that form the horseshoe for f^2 -->
The result is remarkable because period 3 is the minimum period from which chaos can be forced: a 2-cycle does not imply chaos, since a map with only 2-cycles can fail the density-of-periodic-points condition and can even be an isometry in suitable coordinates. The argument above depends critically on continuity of $f$ — specifically, on the intermediate value theorem applied three times. For discontinuous maps (such as the tent map with jumps), a 3-cycle need not produce a horseshoe. The transition from period 2 to period 3 is therefore a genuine threshold, not merely an artifact of the proof strategy.
## Existence of $N$-Cycles from a 3-Cycle
The implication of period three does not stop at chaos: it implies the existence of periodic orbits of every period. This requires a lemma about interval maps.
[quotetheorem:2803]
The proof uses the intermediate value theorem and a standard compactness argument to extract the smallest sub-interval that maps onto $B$. This lemma is the main technical tool in the following result.
[quotetheorem:2804]
[citeproof:2804]
The asymmetry implicit in the Sharkovsky ordering is already visible here: period 3 forces period 5, but the converse fails. A map with a 5-cycle need not have a 3-cycle, because the directed graph encoding of a 5-cycle does not contain a closed path of length 3 in general. The reason is structural: a 3-cycle creates the containment $F(I_R) \supset I_L \cup I_R$ with a self-loop on $I_R$, which generates paths of every length; a 5-cycle does not automatically produce this self-loop. Powers of 2 are at the other extreme: period-$2^k$ maps can arise from repeated period-doubling bifurcations without ever producing an orbit of odd period, which is why such maps avoid Glendinning chaos.
### Directed Graph Encoding
The containment relations $F(I_L) \supset I_R$ and $F(I_R) \supset I_L \cup I_R$ can be encoded in a directed graph: draw an arrow $I_A \to I_B$ whenever $F(I_A) \supset I_B$. For the 3-cycle with $x_1 < x_2 < x_3$, the graph has arrows $I_L \to I_R$, $I_R \to I_L$, and $I_R \to I_R$. The existence of an $N$-cycle corresponds to the existence of a closed directed path of length $N$ in this graph. This graphical encoding is particularly useful for reading off which periods are forced by a given cycle structure.
[example: 4-Cycle and Its Forced Periods]
Suppose $F \colon I \to I$ is continuous and has a 4-cycle with $F(x_1) = x_2$, $F(x_2) = x_3$, $F(x_3) = x_4$, $F(x_4) = x_1$ and $x_1 < x_2 < x_3 < x_4$. Define the subintervals $I_A = [x_1,x_2]$, $I_B = [x_2,x_3]$, $I_C = [x_3,x_4]$. By continuity of $F$: $F(I_A) \supset I_B$, $F(I_B) \supset I_C$, $F(I_C) \supset I_A \cup I_B \cup I_C$. The directed graph has arrows $I_A \to I_B$, $I_B \to I_C$, $I_C \to I_A$, $I_C \to I_B$, $I_C \to I_C$. A closed path $I_C \to I_C$ gives a fixed point in $I_C$. A closed path $I_A \to I_B \to I_C \to I_A$ of length 3 gives a 3-cycle (with points in $I_A$, $I_B$, $I_C$ respectively). By the period-3-implies-all-periods theorem, this 4-cycle therefore forces cycles of all periods.
[/example]
The directed graph method is more than a bookkeeping device. Every closed path of length $N$ in the graph corresponds to a genuine $N$-cycle, by the same intermediate value argument used in the period-three case. The graph for the 3-cycle has three nodes and three arrows, giving a Sharkovsky-complete family of periods; the graph for a 4-cycle or 5-cycle has more nodes but may lack certain closed paths, which is precisely why those periods do not force all others.
## Sharkovsky's Theorem
**Question.** Period 3 forces every period. But what about a 5-cycle or a 7-cycle — which periods do they force? And if a map has only powers-of-2 periods, can we characterise exactly what that means for its dynamics? Sharkovsky's theorem answers all of these questions at once by providing a complete linear ordering on $\mathbb{N}$ such that a $k$-cycle forces $l$-cycles for every $l$ appearing later in the ordering.
The period-three result is a special case of a much more general theorem that completely characterises which periods imply which others for continuous interval maps. Define the **Sharkovsky ordering** $\triangleleft$ on $\mathbb{N}$ by reading the following list from right to left:
\begin{align*}
3 \triangleright 5 \triangleright 7 \triangleright \cdots \triangleright 2 \cdot 3 \triangleright 2 \cdot 5 \triangleright 2 \cdot 7 \triangleright \cdots \triangleright 2^2 \cdot 3 \triangleright 2^2 \cdot 5 \triangleright \cdots \triangleright 2^3 \triangleright 2^2 \triangleright 2 \triangleright 1.
\end{align*}
In other words, the ordering places $3$ first (strongest), then all odd numbers $\geq 3$ in increasing order, then $2$ times each odd number $\geq 3$, then $4$ times each odd number $\geq 3$, and so on, with all powers of $2$ at the end in decreasing order, and $1$ last.
[quotetheorem:2805]
The proof extends the directed graph method from the period-three case to arbitrary cycle structures. Given a $k$-cycle with points $x_1 < x_2 < \cdots < x_k$, one partitions $[x_1, x_k]$ into subintervals $I_1, \ldots, I_{k-1}$ and draws an arrow $I_i \to I_j$ whenever $F(I_i) \supset I_j$. The combinatorial analysis of which closed paths exist in this graph — depending on the permutation type of the $k$-cycle — yields exactly the Sharkovsky ordering. The argument is elementary but requires a careful case analysis over all possible orderings of the cycle points.
[remark: Consequences of Sharkovsky's Theorem]
Several important special cases deserve mention.
- **Period 3 is strongest**: since $3$ is first in the Sharkovsky ordering, the existence of a 3-cycle implies cycles of all other periods. This recovers the previous theorem.
- **Fixed points are weakest**: since $1$ is last in the ordering, a fixed point does not force any other period. However, every cycle implies a fixed point, since every $k$ satisfies $1 \triangleleft k$.
- **Powers of 2**: maps with only power-of-2 periods are precisely those that avoid chaos in a certain sense; they correspond to the period-doubling cascade seen in the logistic map.
[/remark]
The special role of powers of 2 is explained by the period-doubling mechanism: a stable fixed point can period-double to a stable 2-cycle, which can period-double to a stable 4-cycle, and so on, without ever producing an orbit of odd period greater than 1. At each stage, the map is far from chaotic — the dynamics concentrates near the attracting cycle. The Sharkovsky ordering captures this: powers of 2 appear at the very end of the ordering, so they force as few periods as possible. An odd period, by contrast, sits near the beginning of the ordering and forces almost everything.
[quotetheorem:2806]
[citeproof:2806]
This result clarifies why powers of 2 are structurally different: a map can undergo period-doubling bifurcations indefinitely — passing through periods $1, 2, 4, 8, \ldots$ — without ever developing the odd-factor orbits that Sharkovsky's theorem would then force to proliferate. The period-doubling cascade therefore represents a controlled pathway toward complexity that stays just outside the Glendinning-chaotic regime until the accumulated doublings produce an odd-factor period.
## The Tent Map and Its Chaotic Behaviour
**Question.** We need a parametric example that exhibits the transition from order to chaos as a parameter is varied, with explicit horseshoe constructions available at each stage. The tent map provides exactly this: a one-parameter family of piecewise-linear maps whose dynamics can be tracked precisely.
[definition: Tent Map]
For $\mu > 0$, the **tent map** $F_\mu \colon [0,1] \to \mathbb{R}$ is defined by
\begin{align*}
F_\mu(x) = \begin{cases} \mu x & 0 \leq x \leq \tfrac{1}{2}, \\ \mu(1-x) & \tfrac{1}{2} \leq x \leq 1. \end{cases}
\end{align*}
[/definition]
<!-- illustration-needed: tent map — graph of F_mu on [0,1] for mu = 1.5 and mu = 2, showing the tent shape peaking at x = 1/2 with height mu/2 -->
The global behaviour of the tent map depends critically on $\mu$. For $\mu < 1$, the maximum value of $F_\mu$ is $\mu/2 < 1/2$, so $F_\mu(x) < x$ for all $x > 0$ and all orbits converge to $0$. For $\mu = 1$, every point in $[0,\tfrac{1}{2}]$ is a fixed point and every point in $[\tfrac{1}{2},1]$ maps to its reflection $1-x \in [0,\tfrac{1}{2}]$, giving a fixed point after one step. These cases are non-chaotic.
[quotetheorem:2807]
[citeproof:2807]
At the boundary $\mu = 2$, the tent map maps $[0,1]$ onto $[0,1]$ exactly — the peak reaches height $F_2(\tfrac{1}{2}) = 1$. This is the critical case: the horseshoe argument for $\mu > 2$ degenerates because $K_1$ and $K_2$ are no longer disjoint (they share the boundary point $\tfrac{1}{2}$), yet the dynamics at $\mu = 2$ is still highly chaotic. One can verify directly that $F_2$ is topologically conjugate to the sawtooth map via the conjugacy $h(x) = \sin^2(\pi x / 2)$, which intertwines the two maps. So the transition is not from chaos to order at $\mu = 2$ but rather a change in the geometry of the horseshoe.
[quotetheorem:2808]
[citeproof:2808]
The argument breaks at $\mu = 1$ because the geometric series $2^{1/2^n}$ approaches $1$ from above: for $\mu = 1$ there is no $n$ with $1 \geq 2^{1/2^n}$, since $2^{1/2^n} > 1$ for all $n$. This is consistent with the behaviour at $\mu = 1$: every point in $[0,\tfrac{1}{2}]$ is a fixed point, so the map is maximally non-chaotic, and the self-similar renormalisation argument collapses because the tent map with parameter $\mu^{2^n} = 1$ has no expanding structure.
[quotetheorem:2809]
This clean trichotomy makes the tent map one of the most useful examples in the theory: by varying a single parameter, one transitions sharply from ordered dynamics to full chaos.
## Unimodal Maps and Feigenbaum's Constant
**Question.** The period-doubling cascade we observed in the tent map might seem like a special feature of the piecewise-linear structure. Is this behaviour specific to the tent map, or does the same quantitative pattern — the same ratio of successive bifurcation intervals — appear for any map with a single peak? Feigenbaum's discovery answered this definitively: the cascade is universal, and the limiting ratio is a constant that depends on nothing except the existence of a smooth unimodal maximum.
The tent map is a special case of the broader class of unimodal maps. The universal route to chaos observed in this class — a cascade of period-doubling bifurcations — is one of the deepest quantitative discoveries in dynamical systems.
[definition: Unimodal Map]
A **unimodal map** on $[a,b]$ is a continuous map $F \colon [a,b] \to [a,b]$ satisfying:
1. $F(a) = F(b) = a$;
2. There exists $c \in (a,b)$ such that $F$ is strictly increasing on $[a,c)$ and strictly decreasing on $(c,b]$.
[/definition]
The **logistic map** $F_r \colon [0,1] \to [0,1]$ defined by
\begin{align*}
F_r(x) = r x(1-x)
\end{align*}
is the canonical example of a unimodal map for $r \in (0,4]$, with peak at $c = \tfrac{1}{2}$ and $F_r(0) = F_r(1) = 0$.
As the parameter $r$ increases from $1$ to $4$, the logistic map undergoes a **period-doubling cascade**: the stable fixed point at $x^* = 1 - 1/r$ loses stability at $r = 3$, giving rise to a stable 2-cycle; that 2-cycle loses stability at $r \approx 3.449$, giving rise to a stable 4-cycle; and so on. The bifurcation values form an accumulating sequence:
\begin{align*}
r_0 &= 1 \quad (\text{stable fixed point at origin}), \\
r_1 &= 3 \quad (2\text{-cycle appears}), \\
r_2 &\approx 3.449 \quad (4\text{-cycle appears}), \\
r_3 &\approx 3.544 \quad (8\text{-cycle appears}), \\
&\vdots
\end{align*}
Feigenbaum discovered that this cascade has a universal quantitative structure: the ratio of successive bifurcation intervals converges to a constant independent of the particular unimodal map.
[definition: Feigenbaum's Constant]
For the period-doubling cascade in a unimodal map, the **Feigenbaum constant** is
\begin{align*}
\delta = \lim_{n \to \infty} \frac{r_n - r_{n-1}}{r_{n+1} - r_n} \approx 4.6692\ldots
\end{align*}
[/definition]
[remark: Universality of the Feigenbaum Constant]
The remarkable fact is that this limiting ratio $\delta \approx 4.6692$ is the same for every unimodal map satisfying mild smoothness conditions, regardless of the specific form of the map. This universality means that the period-doubling route to chaos is not an artifact of the logistic map but a generic phenomenon in one-dimensional dynamics. The proof of universality uses renormalisation group methods from functional analysis: one defines a renormalisation operator $\mathcal{R}$ acting on a space of unimodal maps, shows that $\mathcal{R}$ has a hyperbolic fixed point $f^*$, and identifies $\delta$ as the unstable eigenvalue of $D\mathcal{R}_{f^*}$. The constant is thus a universal property of the renormalisation operator, not of any particular map.
[/remark]
<!-- illustration-needed: logistic bifurcation diagram — plot of the attractor of the logistic map as a function of r, showing the period-doubling cascade from r=1 to r=4, with the accumulation point of bifurcations visible -->
The period-doubling cascade accumulates at a value $r_\infty$ beyond which chaotic behaviour appears. For the logistic map $r_\infty \approx 3.570$. For $r > r_\infty$, the logistic map exhibits sensitive dependence on initial conditions and the orbit structure becomes extremely complex. The Sharkovsky theorem and the horseshoe arguments developed in this chapter provide the rigorous framework for understanding why: once the period-doubling cascade passes through $r$-values yielding odd-period orbits, Glendinning chaos follows.
Contents
- Introduction
- The State Space and the Vector Field
- Initial Value Problems and the Flow
- Existence and Uniqueness
- Orbits and Trajectories
- Invariant Sets and Special Orbits
- Connecting Orbits
- Limit Sets
- Topological Equivalence and Structural Stability
- 2. Flows in the Plane
- Linearisation at a Fixed Point
- Classification of Fixed Points in $\mathbb{R}^2$
- Hyperbolic Fixed Points
- Non-Hyperbolic Fixed Points: The Centre
- Fixed Points of Hamiltonian Systems
- The Effect of Nonlinear Terms
- Non-Hyperbolic Case: The Centre–Focus Problem
- Non-Hyperbolic Case: One Zero Eigenvalue
- Stable and Unstable Manifolds
- Computing Stable and Unstable Manifolds by Power Series
- Separatrices
- Sketching Phase Portraits
- 3. Stability
- Definitions of Stability
- Lyapunov Functions
- La Salle's Invariance Principle
- Worked Examples Using La Salle
- A Key Examination Example
- Bounding Functions
- 4. Existence and Stability of Periodic Orbits
- The Poincaré Index
- The Poincaré–Bendixson Theorem
- Dulac's Criterion and the Divergence Test
- Near-Hamiltonian Flows and the Energy Balance Method
- Hamiltonian Systems and Their Properties
- Near-Hamiltonian Perturbations
- The Energy Balance Method
- Stability of Periodic Orbits
- Floquet Theory and the Fundamental Matrix
- The Floquet Multiplier Formula in $\mathbb{R}^2$
- The Van der Pol Oscillator
- Nullclines and the Liénard Plane
- Existence and Uniqueness of the Limit Cycle
- Limiting Behaviour: Small and Large $\mu$
- 5. Bifurcations
- What Is a Bifurcation?
- The Centre Manifold and Extended Centre Manifold
- Reduction via the Centre Manifold Theorem
- The Extended Centre Manifold
- The General Method
- Stationary Bifurcations: Zero Eigenvalue
- Saddle-Node Bifurcation
- Transcritical Bifurcation
- Pitchfork Bifurcation
- Structural Stability of Stationary Bifurcations
- Oscillatory Bifurcations: The Hopf Bifurcation
- Supercritical Hopf Bifurcation ($a > 0$)
- Subcritical Hopf Bifurcation ($a < 0$)
- 6. Maps
- The Five Canonical Maps
- Fixed Points, Cycles, and Invariant Sets
- Stability of Fixed Points
- Bifurcations of One-Dimensional Maps
- Saddle-Node Bifurcation
- Transcritical Bifurcation
- Pitchfork Bifurcation
- Period-Doubling Bifurcation
- Classification Summary
- 7. Chaos
- Two Definitions of Chaos
- The Sawtooth Map as a Model of Chaos
- Horseshoes and Symbolic Dynamics
- Period Three Implies Chaos
- Existence of $N$-Cycles from a 3-Cycle
- Directed Graph Encoding
- Sharkovsky's Theorem
- The Tent Map and Its Chaotic Behaviour
- Unimodal Maps and Feigenbaum's Constant
Cambridge II Dynamical Systems
Content
Problems
History
Created by admin on 4/29/2026 | Last updated on 4/29/2026
Prerequisites
No prerequisites required for this page.
Rate this page
★
★
★
★
★
Poor
Excellent