Test new #24
Changes to Content
Original Content
The derivative is the central concept of differential calculus — the precise formulation of the idea that a function has an instantaneous rate of change. Where a [limit](/page/Limit) captures the behaviour of a quantity as a parameter tends to some value, the derivative applies this mechanism to the difference quotient of a function, extracting the slope of the tangent line from the slopes of secant lines. Every major application of calculus — optimisation, differential equations, Taylor approximation, the [Fundamental Theorem of Calculus](/theorems/632) linking differentiation to [integration](/page/Integral) — depends on the derivative. This page develops the derivative of a real-valued function of one real variable: its definition, its algebraic properties, the mean value theorems that give it global reach, and the higher-order theory culminating in Taylor's theorem.
[motivation]
### Rates of Change and the Tangent Problem
Many quantities in mathematics and the sciences vary with respect to one another: position changes with time, pressure changes with volume, area changes with the length of a side. The fundamental question of differential calculus is: *at what rate does one quantity change with respect to another, at a specific instant?*
The difficulty is that "instantaneous rate of change" is not a directly observable quantity. Over a time interval $[t_0, t_0 + h]$, one can measure the average velocity of a particle as the ratio $\Delta x / \Delta t = (x(t_0 + h) - x(t_0))/h$. But this is a rate over an interval, not at a point. Taking $h$ smaller gives a better approximation, but setting $h = 0$ produces $0/0$ — the ratio is undefined. The derivative resolves this by defining the instantaneous rate as the [limit](/page/Limit) of the average rate as $h \to 0$, provided the limit exists.
### What Fails Without the Derivative
Without the derivative, one cannot formulate the condition for a function to have a local extremum, determine when two functions grow at the same rate, approximate a function by polynomials, or relate the rate of change of a quantity to its accumulation. The mean value theorem — the bridge between local information (the derivative at each point) and global information (the change over an interval) — requires differentiability as a hypothesis. Taylor's theorem, which approximates smooth [functions](/page/Function) by polynomials with explicit error bounds, is built entirely on iterated differentiation. And the [Fundamental Theorem of Calculus](/theorems/632), which connects the [integral](/page/Integral) to antidifferentiation, makes differentiation indispensable for computing integrals.
### The Geometric Picture
Geometrically, the derivative of $f$ at $a$ is the slope of the unique line that best approximates the graph of $f$ near $(a, f(a))$. The secant line through $(a, f(a))$ and $(a+h, f(a+h))$ has slope $(f(a+h) - f(a))/h$; as $h \to 0$, these secant lines rotate toward a limiting position — the tangent line — and the derivative is the slope of this tangent. The existence of the derivative is thus equivalent to the existence of a well-defined tangent direction, which fails at corners, cusps, and points of vertical tangency.
[/motivation]
## Definition
The derivative is defined as a limit of difference quotients. The domain of the function must be such that the limit can be formed — the point must be a limit point of the domain from both sides (for the full derivative) or from one side (for one-sided derivatives).
[definition: Derivative]
Let $E \subseteq \mathbb{R}$, let $a \in E$ be a limit point of $E$, and let $f : E \to \mathbb{R}$. The **derivative of $f$ at $a$** is
\begin{align*}
f'(a) &:= \lim_{h \to 0} \frac{f(a+h) - f(a)}{h},
\end{align*}
provided this [limit](/page/Limit) exists and is finite. Equivalently,
\begin{align*}
f'(a) &= \lim_{x \to a} \frac{f(x) - f(a)}{x - a}.
\end{align*}
If $f'(a)$ exists, $f$ is said to be **differentiable at $a$**. If $f$ is differentiable at every point of an [open set](/page/Open%20Set) $U \subseteq E$, $f$ is **differentiable on $U$**, and the function
\begin{align*}
f' : U &\to \mathbb{R} \\
a &\mapsto f'(a)
\end{align*}
is called the **derivative of $f$**.
[/definition]
The two formulations are related by the substitution $x = a + h$. The limit $h \to 0$ is taken over all $h \neq 0$ such that $a + h \in E$; the limit $x \to a$ is taken over all $x \in E \setminus \{a\}$. In either case, the definition asks for a limit of the slopes of secant lines through the fixed point $(a, f(a))$.
An equivalent characterisation — often more useful for proofs — reformulates differentiability as a linear approximation property. The function $f$ is differentiable at $a$ with derivative $f'(a) = L$ if and only if there exists a function $\varepsilon : E \to \mathbb{R}$ with $\varepsilon(x) \to 0$ as $x \to a$ such that
\begin{align*}
f(x) &= f(a) + L(x - a) + \varepsilon(x)(x - a) \quad \text{for all } x \in E.
\end{align*}
This says that $f$ is approximated near $a$ by the affine function $x \mapsto f(a) + L(x-a)$ — the tangent line — with an error that is $o(|x-a|)$ as $x \to a$. The derivative $L = f'(a)$ is the unique real number for which such an approximation exists. This linear-approximation viewpoint is the starting point for the generalisation to several variables, where the derivative becomes a [linear map](/page/Linear%20Map) rather than a number.
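To make the characterisation concrete, here is a short worked instance; the choice $f(x) = x^2$ is made here purely for illustration. Expanding about a point $a \in \mathbb{R}$ gives
\begin{align*}
x^2 &= a^2 + 2a(x - a) + (x - a)^2 \\
&= f(a) + L(x - a) + \varepsilon(x)(x - a), \quad \text{with } L = 2a, \ \varepsilon(x) = x - a.
\end{align*}
Since $\varepsilon(x) = x - a \to 0$ as $x \to a$, the characterisation yields $f'(a) = 2a$.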
[example: Derivative Of A Power Function]
Let $f : \mathbb{R} \to \mathbb{R}$ be defined by $f(x) = x^n$ for a fixed $n \in \mathbb{N}$. For any $a \in \mathbb{R}$, the difference quotient is
\begin{align*}
\frac{f(a+h) - f(a)}{h} &= \frac{(a+h)^n - a^n}{h}.
\end{align*}
Using the factorisation $u^n - v^n = (u - v)(u^{n-1} + u^{n-2}v + \cdots + v^{n-1})$ with $u = a+h$ and $v = a$:
\begin{align*}
\frac{(a+h)^n - a^n}{h} &= (a+h)^{n-1} + (a+h)^{n-2}a + \cdots + a^{n-1}.
\end{align*}
This is a sum of $n$ terms. As $h \to 0$, each $(a+h)^{n-1-k} a^k \to a^{n-1}$, so
\begin{align*}
f'(a) &= \lim_{h \to 0} \sum_{k=0}^{n-1} (a+h)^{n-1-k} a^k = n a^{n-1}.
\end{align*}
The power rule $f'(a) = na^{n-1}$ extends to all real exponents $n \in \mathbb{R}$ (for $a > 0$) via the exponential function: if $f(x) = x^n = e^{n \ln x}$, the chain rule gives $f'(x) = n x^{n-1}$.
[/example]
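The convergence of the difference quotient to $na^{n-1}$ can also be observed numerically. The following is a minimal sketch, not part of the original development: the helper names `difference_quotient` and `power_rule_errors` and the sample values $n = 5$, $a = 2$ are assumptions chosen for illustration.

```python
def difference_quotient(f, a, h):
    """Slope of the secant line through (a, f(a)) and (a + h, f(a + h))."""
    return (f(a + h) - f(a)) / h


def power_rule_errors(n, a, steps=5):
    """Gaps |quotient - n*a**(n-1)| for h = 10**-1, ..., 10**-steps."""
    exact = n * a ** (n - 1)
    f = lambda x: x ** n
    return [abs(difference_quotient(f, a, 10.0 ** -k) - exact)
            for k in range(1, steps + 1)]


# Illustrative values (assumed): f(x) = x**5 at a = 2, where f'(2) = 80.
errors = power_rule_errors(n=5, a=2.0)
for k, err in enumerate(errors, start=1):
    print(f"h = 1e-{k}: error = {err:.3e}")
```

The gap shrinks roughly in proportion to $h$, as the factorisation in the example suggests. For much smaller $h$, floating-point cancellation in the numerator eventually dominates, so the numerical quotient is only a guide and not a substitute for the limit.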
## Differentiability and [Continuity](/page/Continuity)
Differentiability is a stronger condition than continuity: every differentiable function is continuous, but the converse fails dramatically. Understanding this gap is essential for knowing when the tools of differential calculus apply.
The relationship between the two concepts reflects the geometric distinction between functions whose graphs have well-defined tangent lines and functions whose graphs are too rough or irregular for any linear approximation to work.
[theorem: Differentiability Implies Continuity]
Let $E \subseteq \mathbb{R}$, $a \in E$ a limit point, and $f : E \to \mathbb{R}$ differentiable at $a$. Then $f$ is continuous at $a$.
[/theorem]
The proof uses the decomposition $f(x) - f(a) = \frac{f(x) - f(a)}{x - a} \cdot (x - a)$. As $x \to a$, the first factor tends to $f'(a)$ (a finite number) and the second factor tends to $0$, so their product tends to $0$, giving $f(x) \to f(a)$.
The converse fails: continuity does not imply differentiability. The simplest example is the absolute value function, which is continuous everywhere but not differentiable at the origin.
[example: Non-Differentiability Of The Absolute Value]
The function $f : \mathbb{R} \to \mathbb{R}$ defined by $f(x) = |x|$ is continuous at $0$ (since $|f(x) - f(0)| = |x| \to 0$) but not differentiable at $0$. The left and right difference quotients give different limits:
\begin{align*}
\lim_{h \to 0^+} \frac{|h| - 0}{h} &= \lim_{h \to 0^+} \frac{h}{h} = 1, \\
\lim_{h \to 0^-} \frac{|h| - 0}{h} &= \lim_{h \to 0^-} \frac{-h}{h} = -1.
\end{align*}
Since the one-sided limits disagree, the two-sided limit does not exist, and $f'(0)$ is undefined. Geometrically, the graph of $|x|$ has a corner at the origin: the left half has slope $-1$ and the right half has slope $+1$, so there is no single tangent line.
[/example]
[example: A Continuous Nowhere-Differentiable Function]
The failure of the converse is far worse than isolated corners. Weierstrass (1872) exhibited a continuous function $f : \mathbb{R} \to \mathbb{R}$ that is differentiable at *no* point. One such construction is
\begin{align*}
f(x) &= \sum_{n=0}^\infty a^n \cos(b^n \pi x),
\end{align*}
where $0 < a < 1$, $b$ is a positive odd integer, and $ab > 1 + \frac{3\pi}{2}$. The [series](/page/Series) [converges uniformly](/page/Uniform%20Convergence) (by the Weierstrass $M$-test, since $|a^n \cos(b^n \pi x)| \leq a^n$ and $\sum a^n < \infty$), so $f$ is continuous. Heuristically, the $n$-th term has amplitude $a^n$ but its derivative has size up to $\pi (ab)^n$, which is unbounded because $ab > 1$: the oscillations at scale $b^{-n}$ steepen faster than the factor $a^n$ damps them. The stronger hypothesis $ab > 1 + \frac{3\pi}{2}$ is what Weierstrass's argument uses to turn this heuristic into a proof that the difference quotient fails to converge at every point. The existence of such functions shows that continuity and differentiability are genuinely different properties: one cannot infer differentiability from continuity alone, even at a single point.
[/example]
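The uniform convergence claimed in the example can be quantified. As a sketch, write $S_N(x) = \sum_{n=0}^N a^n \cos(b^n \pi x)$ for the $N$-th partial sum (the notation $S_N$ is introduced here for illustration). The geometric series gives, for every $x \in \mathbb{R}$,
\begin{align*}
|f(x) - S_N(x)| &\leq \sum_{n=N+1}^\infty a^n = \frac{a^{N+1}}{1 - a},
\end{align*}
a bound independent of $x$ that tends to $0$ as $N \to \infty$. Each partial sum $S_N$ is smooth, so the failure of differentiability appears only in the limit.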
## Algebraic Rules
Computing derivatives from the definition for each new function would be impractical. The algebraic rules — the sum rule, product rule, quotient rule, and chain rule — reduce the differentiation of complicated expressions to the differentiation of their building blocks. These rules, together with the derivatives of the elementary functions, suffice to differentiate any expression built from polynomials, rational functions, exponentials, logarithms, and trigonometric functions.
[quotetheorem:198]
The sum rule and product rule are proved by adding and subtracting auxiliary terms in the difference quotient. The chain rule is subtler: the naive argument "cancel $\Delta g$ in $(\Delta f / \Delta g) \cdot (\Delta g / \Delta x)$" fails when $\Delta g = 0$, and a correct proof requires the linear-approximation characterisation of differentiability.
For the **product rule**, the key manipulation is:
\begin{align*}
\frac{f(x)g(x) - f(a)g(a)}{x - a} &= \frac{f(x) - f(a)}{x-a} \cdot g(x) + f(a) \cdot \frac{g(x) - g(a)}{x-a}.
\end{align*}
As $x \to a$, the first term tends to $f'(a) g(a)$ (using continuity of $g$ at $a$, which follows from differentiability) and the second tends to $f(a) g'(a)$.
For the **chain rule**, write $g(x) = g(a) + g'(a)(x-a) + \varepsilon_g(x)(x-a)$ and $f(y) = f(g(a)) + f'(g(a))(y - g(a)) + \varepsilon_f(y)(y - g(a))$. Substituting $y = g(x)$ and dividing by $x - a$ gives $(f \circ g)'(a) = f'(g(a)) \cdot g'(a)$; the error terms vanish in the limit because $\varepsilon_g(x) \to 0$ and $\varepsilon_f(g(x)) \to 0$ as $x \to a$.
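The substitution can be written out as follows. This is a sketch under the convention, assumed here, that $\varepsilon_f(g(a)) = 0$, so that $\varepsilon_f$ is continuous at $g(a)$ and $\varepsilon_f(g(x)) \to 0$ follows from the continuity of $g$ at $a$:
\begin{align*}
f(g(x)) - f(g(a)) &= \bigl(f'(g(a)) + \varepsilon_f(g(x))\bigr)\bigl(g(x) - g(a)\bigr) \\
&= \bigl(f'(g(a)) + \varepsilon_f(g(x))\bigr)\bigl(g'(a) + \varepsilon_g(x)\bigr)(x - a).
\end{align*}
Dividing by $x - a$ and letting $x \to a$ leaves $f'(g(a)) \cdot g'(a)$. No division by $g(x) - g(a)$ occurs, which is why the argument survives when $g(x) = g(a)$.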
[example: Derivative Of The Exponential Function]
The exponential function $\exp : \mathbb{R} \to \mathbb{R}$ is the unique function satisfying $\exp'(x) = \exp(x)$ for all $x$ and $\exp(0) = 1$. To verify this from the series definition $\exp(x) = \sum_{n=0}^\infty x^n / n!$, one computes
\begin{align*}
\frac{\exp(a+h) - \exp(a)}{h} &= \exp(a) \cdot \frac{\exp(h) - 1}{h}.
\end{align*}
The factor $(\exp(h) - 1)/h = \sum_{n=1}^\infty h^{n-1}/n!$ converges to $1$ as $h \to 0$: the $n = 1$ term equals $1$, and for $|h| \leq 1$ the remaining terms sum to at most $|h| \sum_{n=2}^\infty 1/n! \leq |h|$ in absolute value. Hence $\exp'(a) = \exp(a)$.
Combined with the chain rule, this gives the derivative of $a^x = \exp(x \ln a)$ as $a^x \ln a$, and the derivative of $\ln x$ (the inverse of $\exp$) as $1/x$ via inverse function differentiation.
[/example]
## The Mean Value Theorems
The definition of the derivative provides only *local* information: the behaviour of $f$ in arbitrarily small neighbourhoods of a single point. The mean value theorems convert this local information into *global* conclusions about the function's behaviour over an entire interval. They are the workhorses of analysis: every proof that uses "the derivative is nonnegative, therefore the function is nondecreasing" relies on the mean value theorem.
### Rolle's Theorem
The simplest mean value theorem is Rolle's theorem, which asserts that a differentiable function that starts and ends at the same value must have a horizontal tangent somewhere in between. It is the stepping stone to the full mean value theorem.
[theorem: Rolle's Theorem]
Let $f : [a, b] \to \mathbb{R}$ be continuous on $[a, b]$ and differentiable on $(a, b)$, with $f(a) = f(b)$. Then there exists $c \in (a, b)$ with $f'(c) = 0$.
[/theorem]
If $f$ is constant on $[a, b]$, then $f'(c) = 0$ for every $c \in (a, b)$. If $f$ is not constant, the extreme value theorem (applied to the continuous function $f$ on the compact [set](/page/Set) $[a, b]$) guarantees that $f$ attains both a maximum and a minimum on $[a, b]$. Since $f(a) = f(b)$ and $f$ is not constant, at least one of these extreme values differs from $f(a)$ and is therefore attained at an interior point $c \in (a, b)$. At such a point the derivative must vanish: the one-sided difference quotients have opposite signs (or are zero), so their common limit can only be zero.
### The Mean Value Theorem
Rolle's theorem is the special case $f(a) = f(b)$; the general case follows by subtracting the linear function connecting $(a, f(a))$ to $(b, f(b))$.
[quotetheorem:186]
The proof applies Rolle's theorem to the auxiliary function $g(x) = f(x) - f(a) - \frac{f(b) - f(a)}{b - a}(x - a)$, which satisfies $g(a) = g(b) = 0$ and inherits continuity on $[a, b]$ and differentiability on $(a, b)$ from $f$.
The mean value theorem has numerous immediate consequences. A function with $f'(x) = 0$ for all $x$ in an interval is constant on that interval (apply the MVT to any two points). A function with $f'(x) > 0$ on an interval is strictly increasing. A function with a bounded derivative $|f'(x)| \leq M$ is Lipschitz continuous with constant $M$. These are qualitative conclusions about a function's global behaviour, derived entirely from pointwise information about its derivative.
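As a worked instance (the function is chosen here purely for illustration, and the mean value theorem is assumed in its standard form $f'(c) = (f(b) - f(a))/(b - a)$), take $f(x) = x^3$ on $[0, 2]$. The theorem asks for $c \in (0, 2)$ with
\begin{align*}
f'(c) = 3c^2 &= \frac{f(2) - f(0)}{2 - 0} = 4, \quad \text{so} \quad c = \frac{2}{\sqrt{3}} \approx 1.155.
\end{align*}
The Lipschitz consequence is equally concrete: $|f'(x)| = 3x^2 \leq 12$ on $[0, 2]$, so $|f(x) - f(y)| \leq 12\,|x - y|$ for all $x, y \in [0, 2]$.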
### Why the Mean Value Theorem Fails for Vector-Valued Functions
The MVT is fundamentally a result about real-valued functions. For functions $f : [a, b] \to \mathbb{R}^n$ with $n \geq 2$, the conclusion $f'(c) = (f(b) - f(a))/(b-a)$ can fail: take $f(t) = (\cos t, \sin t)$ on $[0, 2\pi]$, which satisfies $f(0) = f(2\pi)$ but $\|f'(t)\| = 1$ for all $t$, so $f'(c) \neq 0$ for any $c$. The correct generalisation is the [Mean Value Inequality](/theorems/328) $\|f(b) - f(a)\| \leq M |b - a|$, where $M$ is any upper bound for $\|f'(t)\|$ on $(a, b)$; it replaces equality with an upper bound.
### Cauchy's Mean Value Theorem
Cauchy's generalisation of the MVT replaces the linear function connecting two points with a parametric curve, and is the key ingredient in the proof of L'Hôpital's rule.
[quotetheorem:187]
The proof applies Rolle's theorem to $h(x) = f(x)(g(b) - g(a)) - g(x)(f(b) - f(a))$, which satisfies $h(a) = h(b)$.
## L'Hôpital's Rule
A recurring problem in analysis is the evaluation of limits of the form $\lim_{x \to a} f(x)/g(x)$ when both $f(x) \to 0$ and $g(x) \to 0$ (or both tend to $\pm \infty$). L'Hôpital's rule asserts that, under appropriate hypotheses, such a limit equals the limit of $f'(x)/g'(x)$, reducing the problem to a (hopefully simpler) computation with the derivatives.
[theorem: L'Hôpital's Rule]
Let $f, g : (a, b) \to \mathbb{R}$ be differentiable with $g'(x) \neq 0$ for all $x \in (a, b)$. Suppose $\lim_{x \to a^+} f(x) = \lim_{x \to a^+} g(x) = 0$. If the limit $\lim_{x \to a^+} f'(x)/g'(x) = L$ exists (finite or $\pm \infty$), then
\begin{align*}
\lim_{x \to a^+} \frac{f(x)}{g(x)} &= L.
\end{align*}
[/theorem]
The proof uses Cauchy's Mean Value Theorem. Extend $f$ and $g$ continuously to $[a, b)$ by setting $f(a) = g(a) = 0$. For $x \in (a, b)$ we have $g(x) \neq 0$: otherwise Rolle's theorem applied to $g$ on $[a, x]$ would produce a zero of $g'$. Cauchy's MVT applied to $[a, x]$ then gives a point $c_x \in (a, x)$ with $f(x)/g(x) = f'(c_x)/g'(c_x)$. As $x \to a^+$, the point $c_x \to a^+$ as well (since $a < c_x < x$), so $f(x)/g(x) = f'(c_x)/g'(c_x) \to L$.
[example: A Standard L'Hôpital Computation]
The limit $\lim_{x \to 0} (\sin x) / x$ is the canonical $0/0$ indeterminate form. With $f(x) = \sin x$ and $g(x) = x$, L'Hôpital's rule gives
\begin{align*}
\lim_{x \to 0} \frac{\sin x}{x} &= \lim_{x \to 0} \frac{\cos x}{1} = 1,
\end{align*}
provided the derivative limit exists (which it does, since $\cos x \to 1$). However, this application is circular if one defines $(\sin x)' = \cos x$ using the limit $\lim_{x \to 0} (\sin x)/x = 1$ in the first place. An independent proof of the limit (e.g., by geometric area comparison) is needed before L'Hôpital can be legitimately applied.
[/example]
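A second computation, chosen here for illustration, shows the rule applied twice. With $f(x) = e^x - 1 - x$ and $g(x) = x^2$ on $(0, 1)$, both functions tend to $0$ as $x \to 0^+$ and $g'(x) = 2x \neq 0$. The quotient $f'(x)/g'(x) = (e^x - 1)/(2x)$ is again of the form $0/0$, and its denominator has nonvanishing derivative, so the rule applies to it as well:
\begin{align*}
\lim_{x \to 0^+} \frac{e^x - 1 - x}{x^2} &= \lim_{x \to 0^+} \frac{e^x - 1}{2x} = \lim_{x \to 0^+} \frac{e^x}{2} = \frac{1}{2}.
\end{align*}
The equalities are justified from right to left: the last limit exists, which gives the middle one, which in turn gives the first.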
[remark: When L'Hôpital Fails]
The converse of L'Hôpital's rule is false: the limit $\lim f(x)/g(x)$ may exist even when $\lim f'(x)/g'(x)$ does not. For example, as $x \to \infty$ (where the analogous $\infty/\infty$ form of the rule applies), $f(x) = x + \sin x$ and $g(x) = x$ give $\lim_{x \to \infty} f(x)/g(x) = 1$, but $f'(x)/g'(x) = 1 + \cos x$ oscillates and has no limit. The rule also does not apply when its hypotheses fail: if $g'(x) = 0$ at points accumulating at $a$, the conclusion can break down.
[/remark]
## Higher Derivatives and Taylor's Theorem
If the derivative $f'$ is itself differentiable, one obtains the **second derivative** $f'' = (f')'$, and by induction the $n$-th derivative $f^{(n)}$. A function possessing continuous derivatives up to order $k$ is said to be of class $C^k$; a function of class $C^k$ for every $k$ is **smooth** ($C^\infty$). The higher derivatives encode the curvature, inflection, and higher-order bending of the graph, and they are the ingredients of polynomial approximation via Taylor's theorem.
[definition: Higher Derivative]
Let $U \subseteq \mathbb{R}$ be open and $f : U \to \mathbb{R}$. For $n \in \mathbb{N}$, the **$n$-th derivative** $f^{(n)} : U \to \mathbb{R}$ is defined inductively by $f^{(0)} = f$ and $f^{(n)} = (f^{(n-1)})'$, provided each intermediate derivative exists and is differentiable. The function $f$ is of class $C^n(U)$ if $f^{(n)}$ exists and is continuous on $U$, and of class $C^\infty(U)$ if $f \in C^n(U)$ for every $n \in \mathbb{N}$.
[/definition]
### Taylor Polynomials
The derivative $f'(a)$ gives the best *linear* approximation to $f$ near $a$. The higher derivatives provide progressively better *polynomial* approximations. The $n$-th Taylor polynomial captures the behaviour of $f$ near $a$ up to order $n$.
[definition: Taylor Polynomial]
Let $U \subseteq \mathbb{R}$ be open, $a \in U$, and $f \in C^n(U)$. The **$n$-th Taylor polynomial of $f$ centred at $a$** is
\begin{align*}
T_n(x) &= \sum_{k=0}^n \frac{f^{(k)}(a)}{k!}(x - a)^k.
\end{align*}
The polynomial $T_n$ is the unique polynomial of degree $\leq n$ satisfying $T_n^{(k)}(a) = f^{(k)}(a)$ for $k = 0, 1, \ldots, n$: it matches $f$ and all its derivatives up to order $n$ at the point $a$.
[/definition]
The key question is: how well does $T_n$ approximate $f$? Taylor's theorem provides the answer by giving an explicit formula for the remainder $R_n(x) = f(x) - T_n(x)$.
[theorem: Taylor's Theorem With Lagrange Remainder]
Let $U \subseteq \mathbb{R}$ be open, $a \in U$, and $f \in C^n(U)$ with $f^{(n+1)}$ existing on $U$. Then for every $x \in U$ with the closed interval between $a$ and $x$ contained in $U$, there exists $c$ strictly between $a$ and $x$ such that
\begin{align*}
f(x) &= \sum_{k=0}^n \frac{f^{(k)}(a)}{k!}(x - a)^k + \frac{f^{(n+1)}(c)}{(n+1)!}(x-a)^{n+1}.
\end{align*}
[/theorem]
The Lagrange remainder $R_n(x) = f^{(n+1)}(c)(x-a)^{n+1}/(n+1)!$ shows that the error in the $n$-th Taylor approximation is controlled by the $(n+1)$-th derivative. When $n = 0$, this reduces to the mean value theorem: $f(x) = f(a) + f'(c)(x-a)$. The proof proceeds by applying the generalised mean value theorem (Cauchy's MVT) repeatedly, peeling off one derivative at each step.
[example: Taylor Expansion Of The Exponential]
Since $\exp^{(k)}(x) = \exp(x)$ for all $k$, and $\exp(0) = 1$, the Taylor polynomial centred at $a = 0$ is
\begin{align*}
T_n(x) &= \sum_{k=0}^n \frac{x^k}{k!}.
\end{align*}
The Lagrange remainder satisfies $|R_n(x)| = |\exp(c)| \cdot |x|^{n+1}/(n+1)!$ for some $c$ between $0$ and $x$. For any fixed $x$, the factor $|x|^{n+1}/(n+1)! \to 0$ as $n \to \infty$ (since $n!$ grows faster than any exponential), and $|\exp(c)| \leq \exp(|x|)$. Therefore $R_n(x) \to 0$, and the Taylor series converges to $\exp(x)$:
\begin{align*}
\exp(x) &= \sum_{k=0}^\infty \frac{x^k}{k!} \quad \text{for all } x \in \mathbb{R}.
\end{align*}
The exponential is thus equal to its Taylor series on all of $\mathbb{R}$, a property that not every smooth function shares.
[/example]
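The remainder estimate in the example can be checked numerically. The sketch below is illustrative only: the function names `taylor_exp` and `lagrange_bound` and the sample point $x = 1$ are assumptions made here, and the bound used is $\exp(|x|)\,|x|^{n+1}/(n+1)!$ as derived above.

```python
import math


def taylor_exp(x, n):
    """n-th Taylor polynomial of exp centred at 0, evaluated at x."""
    return sum(x ** k / math.factorial(k) for k in range(n + 1))


def lagrange_bound(x, n):
    """Upper bound exp(|x|) * |x|**(n+1) / (n+1)! on |R_n(x)|."""
    return math.exp(abs(x)) * abs(x) ** (n + 1) / math.factorial(n + 1)


# Illustrative sample point (assumed): x = 1.
x = 1.0
for n in (2, 5, 10):
    remainder = abs(math.exp(x) - taylor_exp(x, n))
    print(f"n = {n:2d}: |R_n| = {remainder:.3e}, bound = {lagrange_bound(x, n):.3e}")
```

In each case the observed remainder should sit below the bound, and both shrink rapidly as $n$ grows because of the factorial in the denominator.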
[example: A Smooth Function Not Equal To Its Taylor Series]
The Taylor series of a $C^\infty$ function need not converge to the function. Define
\begin{align*}
f : \mathbb{R} &\to \mathbb{R} \\
x &\mapsto \begin{cases} e^{-1/x^2} & \text{if } x \neq 0, \\ 0 & \text{if } x = 0. \end{cases}
\end{align*}
Then $f \in C^\infty(\mathbb{R})$ and $f^{(n)}(0) = 0$ for every $n \in \mathbb{N}_0$ (proved by induction, using L'Hôpital's rule to show that $\lim_{x \to 0} x^{-k} e^{-1/x^2} = 0$ for every $k$). The Taylor series of $f$ at $0$ is therefore identically zero — $T_n(x) = 0$ for every $n$ — yet $f(x) > 0$ for all $x \neq 0$. The Taylor series converges, but to the wrong function. A function whose Taylor series at every point converges to the function in a neighbourhood is called **real-analytic**; this example shows that $C^\infty$ does not imply real-analyticity.
[/example]
## The Fundamental Theorem of Calculus
The deepest single result connecting differentiation to [integration](/page/Integral) is the [Fundamental Theorem of Calculus](/theorems/632), which asserts that the two operations are inverses of each other (under appropriate regularity hypotheses). It has two parts: Part I says that differentiation undoes integration, and Part II says that integration undoes differentiation.
[quotetheorem:632]
Part I is proved by estimating the difference quotient of $F$. For small $h > 0$:
\begin{align*}
\frac{F(x+h) - F(x)}{h} &= \frac{1}{h} \int_x^{x+h} f(t) \, d\mathcal{L}^1(t).
\end{align*}
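By continuity of $f$ at $x$, for any $\varepsilon > 0$ and all sufficiently small $h$, $|f(t) - f(x)| < \varepsilon$ for all $t \in [x, x+h]$, so the integral is within $\varepsilon h$ of $f(x) \cdot h$ and the difference quotient is within $\varepsilon$ of $f(x)$. The same estimate holds for $h < 0$, and since $\varepsilon$ was arbitrary, $F'(x) = f(x)$.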
Part II is proved using Part I and the mean value theorem. If $G' = f$ and $F(x) = \int_a^x f$, then $(G - F)' = f - f = 0$ on $(a, b)$, so $G - F$ is constant by the MVT corollary. Evaluating at $x = a$ gives $G(a) - F(a) = G(a)$, hence $G(x) = F(x) + G(a)$, so $\int_a^b f = F(b) = G(b) - G(a)$.
The FTC is the reason that [integration](/page/Integral) in practice reduces to finding antiderivatives: to compute $\int_a^b f(t) \, dt$, one finds a function $G$ with $G' = f$ and evaluates $G(b) - G(a)$. Every integration technique — substitution, [integration by parts](/theorems/210), partial fractions — is a method for finding antiderivatives, and the FTC justifies why this works.
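A small numerical comparison illustrates the point. This is a sketch under stated assumptions: the integrand $\cos$ on $[0, 1]$, the antiderivative $G = \sin$, and the helper name `midpoint_integral` are all chosen here for illustration, and the midpoint rule stands in for the integral as a generic Riemann-sum approximation.

```python
import math


def midpoint_integral(f, a, b, n=1000):
    """Midpoint Riemann sum of f over [a, b] with n equal subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


# Antiderivative route: G = sin satisfies G' = cos, so the integral is G(b) - G(a).
a, b = 0.0, 1.0
via_antiderivative = math.sin(b) - math.sin(a)
via_riemann_sum = midpoint_integral(math.cos, a, b)
print(via_antiderivative, via_riemann_sum)
```

The two values agree to within the discretisation error of the Riemann sum, but the antiderivative route needs two function evaluations where the sum needs a thousand.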
### The Relationship To Weak Derivatives
The classical derivative requires the limit of the difference quotient to exist pointwise. In PDE theory and the modern theory of [distributions](/page/Distribution), this is too restrictive: many natural "solutions" (e.g., the Heaviside step function as an "antiderivative" of the Dirac delta) are not differentiable in the classical sense. The **[weak derivative](/page/Weak%20Derivative)** generalises differentiation by replacing the pointwise limit with an integration-by-parts identity: a locally integrable function $g$ is the weak derivative of $f$ if
\begin{align*}
\int_\Omega f \varphi' \, d\mathcal{L}^1 &= -\int_\Omega g \varphi \, d\mathcal{L}^1 \quad \text{for all } \varphi \in C_c^\infty(\Omega).
\end{align*}
When $f$ is classically differentiable with continuous derivative, the weak derivative agrees with $f'$ by integration by parts. The [Sobolev spaces](/page/Sobolev%20Space) $W^{k,p}(\Omega)$ are built on this notion, and the [distribution](/page/Distribution) theory extends it further to objects that are not functions at all.
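As an illustration tying back to the absolute value function (the choice $\Omega = (-1, 1)$ is made here for concreteness), $f(x) = |x|$ has weak derivative $g(x) = \operatorname{sgn}(x)$ on $\Omega$, even though $f'(0)$ does not exist classically. For $\varphi \in C_c^\infty(\Omega)$, integrating by parts on each half and using that $\varphi$ vanishes near $\pm 1$:
\begin{align*}
\int_{-1}^{1} |x| \varphi'(x) \, dx &= -\int_{-1}^{0} x \varphi'(x) \, dx + \int_{0}^{1} x \varphi'(x) \, dx \\
&= \int_{-1}^{0} \varphi(x) \, dx - \int_{0}^{1} \varphi(x) \, dx \\
&= -\int_{-1}^{1} \operatorname{sgn}(x) \varphi(x) \, dx.
\end{align*}
The boundary terms at $0$ vanish because of the factor $x$. The value assigned to $g$ at the single point $0$ is immaterial.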
## Beyond One Variable
For functions $f : U \subseteq \mathbb{R}^m \to \mathbb{R}^n$ between Euclidean spaces, the derivative at a point $a$ is no longer a number but a **linear map** $f'(a) : \mathbb{R}^m \to \mathbb{R}^n$ — the best linear approximation to $f$ near $a$. The condition $f(x) = f(a) + f'(a)(x-a) + o(|x-a|)$ generalises directly, with the product $f'(a)(x-a)$ replaced by the action of a linear map on a vector. The matrix representing $f'(a)$ with respect to the standard bases is the **Jacobian matrix** $(D_j f_i(a))$.
The algebraic rules generalise: the [Chain Rule for Maps Between Euclidean Spaces](/theorems/323) asserts that $(g \circ f)'(a) = g'(f(a)) \circ f'(a)$ — the derivative of a composition is the composition of the derivatives, with matrix multiplication replacing scalar multiplication. The [Inverse Function Theorem](/theorems/51) asserts that if the derivative $f'(a)$ is an invertible linear map, then $f$ is locally invertible with a differentiable inverse.
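For instance (an illustrative choice, not taken from the text above), the polar-coordinate map $f(r, \theta) = (r\cos\theta, r\sin\theta)$ on $U = (0, \infty) \times \mathbb{R}$ has Jacobian matrix
\begin{align*}
f'(r, \theta) &= \begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix}, \qquad \det f'(r, \theta) = r\cos^2\theta + r\sin^2\theta = r.
\end{align*}
Since $r > 0$ on $U$, the derivative is invertible at every point, so the Inverse Function Theorem makes $f$ locally invertible everywhere on $U$; it is not globally invertible, because $f(r, \theta + 2\pi) = f(r, \theta)$.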
Proposed Changes
Add math here
\begin{align*}
\sin (x)
\end{align*}
[example]
# test
[definition:test]
test
[/definition]
In equation \ref{eq:sample}, we find the value of an
interesting integral:
\theorem{1}
[thm:1]
{{theorem}}
[Pythagorean Theorem](/theorems/1)
[/example]
\begin{equation}
\int_0^\infty \frac{x^3}{e^x-1}\,dx = \frac{\pi^4}{15}
\end{equation}
$d$
\begin{equation}
\int_0^\infty \frac{x^3}{e^x-1}\,dx = \frac{\pi^4}{15}
\end{equation}
\begin{equation}
\int_0^\infty \frac{x^3}{e^x-1}\,dx = \frac{\pi^4}{15}
\end{equation}
\begin{equation}
\int_0^\infty \frac{x^3}{e^x-1}\,dx = \frac{\pi^4}{15}
\end{equation}
\begin{equation}
\int_0^\infty \frac{x^3}{e^x-1}\,dx = \frac{\pi^4}{15}
\end{equation}
[quotetheorem:23]
4 modified
6 added
74 removed
0 unchanged
Modified
text
Original
The derivative is the central concept of differential calculus — the precise formulation of the idea that a function has an instantaneous rate of change. Where a [limit](/page/Limit) captures the behaviour of a quantity as a parameter tends to some value, the derivative applies this mechanism to the difference quotient of a function, extracting the slope of the tangent line from the slopes of secant lines. Every major application of calculus — optimisation, differential equations, Taylor approximation, the [Fundamental Theorem of Calculus](/theorems/632) linking differentiation to [integration](/page/Integral) — depends on the derivative. This page develops the derivative of a real-valued function of one real variable: its definition, its algebraic properties, the mean value theorems that give it global reach, and the higher-order theory culminating in Taylor's theorem.
Proposed
Add math here
Modified
align*
Original
\begin{align*}
f(x) &= f(a) + L(x - a) + \varepsilon(x)(x - a) \quad \text{for all } x \in E.
\end{align*}
Proposed
\begin{align*}
\sin (x)
\end{align*}
Removed
motivation
[motivation]
### Rates of Change and the Tangent Problem
Many quantities in mathematics and the sciences vary with respect to one another: position changes with time, pressure changes with volume, area changes with the length of a side. The fundamental question of differential calculus is: *at what rate does one quantity change with respect to another, at a specific instant?*
The difficulty is that "instantaneous rate of change" is not a directly observable quantity. Over a time interval $[t_0, t_0 + h]$, one can measure the average velocity of a particle as the ratio $\Delta x / \Delta t = (x(t_0 + h) - x(t_0))/h$. But this is a rate over an interval, not at a point. Taking $h$ smaller gives a better approximation, but setting $h = 0$ produces $0/0$ — the ratio is undefined. The derivative resolves this by defining the instantaneous rate as the [limit](/page/Limit) of the average rate as $h \to 0$, provided the limit exists.
### What Fails Without the Derivative
Without the derivative, one cannot formulate the condition for a function to have a local extremum, determine when two functions grow at the same rate, approximate a function by polynomials, or relate the rate of change of a quantity to its accumulation. The mean value theorem — the bridge between local information (the derivative at each point) and global information (the change over an interval) — requires differentiability as a hypothesis. Taylor's theorem, which approximates smooth [functions](/page/Function) by polynomials with explicit error bounds, is built entirely on iterated differentiation. And the [Fundamental Theorem of Calculus](/theorems/632), which connects the [integral](/page/Integral) to antidifferentiation, makes differentiation indispensable for computing integrals.
### The Geometric Picture
Geometrically, the derivative of $f$ at $a$ is the slope of the unique line that best approximates the graph of $f$ near $(a, f(a))$. The secant line through $(a, f(a))$ and $(a+h, f(a+h))$ has slope $(f(a+h) - f(a))/h$; as $h \to 0$, these secant lines rotate toward a limiting position — the tangent line — and the derivative is the slope of this tangent. The existence of the derivative is thus equivalent to the existence of a well-defined tangent direction, which fails at corners, cusps, and points of vertical tangency.
[/motivation]
Added
example
[example]
# test
[definition:test]
test
[/definition]
In equation \ref{eq:sample}, we find the value of an
interesting integral:
\theorem{1}
[thm:1]
{{theorem}}
[Pythagorean Theorem](/theorems/1)
[/example]
Added
equation
\begin{equation}
\int_0^\infty \frac{x^3}{e^x-1}\,dx = \frac{\pi^4}{15}
\end{equation}
Modified
text
Original
## Definition
Proposed
$d$
Removed
definition
Derivative
[definition: Derivative]
Let $E \subseteq \mathbb{R}$, let $a \in E$ be a limit point of $E$, and let $f : E \to \mathbb{R}$. The **derivative of $f$ at $a$** is
\begin{align*}
f'(a) &:= \lim_{h \to 0} \frac{f(a+h) - f(a)}{h},
\end{align*}
provided this [limit](/page/Limit) exists and is finite. Equivalently,
\begin{align*}
f'(a) &= \lim_{x \to a} \frac{f(x) - f(a)}{x - a}.
\end{align*}
If $f'(a)$ exists, $f$ is said to be **differentiable at $a$**. If $f$ is differentiable at every point of an [open set](/page/Open%20Set) $U \subseteq E$, $f$ is **differentiable on $U$**, and the function
\begin{align*}
f' : U &\to \mathbb{R} \\
a &\mapsto f'(a)
\end{align*}
is called the **derivative of $f$**.
[/definition]
Added
equation
\begin{equation}
\int_0^\infty \frac{x^3}{e^x-1}\,dx = \frac{\pi^4}{15}
\end{equation}
Removed
text
The two formulations are related by the substitution $x = a + h$. The limit $h \to 0$ is taken over all $h \neq 0$ such that $a + h \in E$; the limit $x \to a$ is taken over all $x \in E \setminus \{a\}$. In either case, the definition asks for a limit of the slopes of secant lines through the fixed point $(a, f(a))$.
Added
equation
\begin{equation}
\int_0^\infty \frac{x^3}{e^x-1}\,dx = \frac{\pi^4}{15}
\end{equation}
Removed
text
An equivalent characterisation — often more useful for proofs — reformulates differentiability as a linear approximation property. The function $f$ is differentiable at $a$ with derivative $f'(a) = L$ if and only if there exists a function $\varepsilon : E \to \mathbb{R}$ with $\varepsilon(x) \to 0$ as $x \to a$ such that
Added
equation
\begin{equation}
\int_0^\infty \frac{x^3}{e^x-1}\,dx = \frac{\pi^4}{15}
\end{equation}
Added
equation
\begin{equation}
\int_0^\infty \frac{x^3}{e^x-1}\,dx = \frac{\pi^4}{15}
\end{equation}
Removed
text
This says that $f$ is approximated near $a$ by the affine function $x \mapsto f(a) + L(x-a)$ — the tangent line — with an error that is $o(|x-a|)$ as $x \to a$. The derivative $L = f'(a)$ is the unique real number for which such an approximation exists. This linear-approximation viewpoint is the starting point for the generalisation to several variables, where the derivative becomes a [linear map](/page/Linear%20Map) rather than a number.
Modified
text
Original
The derivative is defined as a limit of difference quotients. The domain of the function must be such that the limit can be formed — the point must be a limit point of the domain from both sides (for the full derivative) or from one side (for one-sided derivatives).
Proposed
[quotetheorem:23]
Removed
example
Derivative Of A Power Function
[example: Derivative Of A Power Function]
Let $f : \mathbb{R} \to \mathbb{R}$ be defined by $f(x) = x^n$ for a fixed $n \in \mathbb{N}$. For any $a \in \mathbb{R}$, the difference quotient is
\begin{align*}
\frac{f(a+h) - f(a)}{h} &= \frac{(a+h)^n - a^n}{h}.
\end{align*}
Using the factorisation $u^n - v^n = (u - v)(u^{n-1} + u^{n-2}v + \cdots + v^{n-1})$ with $u = a+h$ and $v = a$:
\begin{align*}
\frac{(a+h)^n - a^n}{h} &= (a+h)^{n-1} + (a+h)^{n-2}a + \cdots + a^{n-1}.
\end{align*}
This is a sum of $n$ terms. As $h \to 0$, each $(a+h)^{n-1-k} a^k \to a^{n-1}$, so
\begin{align*}
f'(a) &= \lim_{h \to 0} \sum_{k=0}^{n-1} (a+h)^{n-1-k} a^k = n a^{n-1}.
\end{align*}
The power rule $f'(a) = na^{n-1}$ extends to all real exponents $n \in \mathbb{R}$ (for $a > 0$) via the exponential function: if $f(x) = x^n = e^{n \ln x}$, the chain rule gives $f'(x) = n x^{n-1}$.
[/example]
Removed
text
## Differentiability and [Continuity](/page/Continuity)
Removed
text
Differentiability is a stronger condition than continuity: every differentiable function is continuous, but the converse fails dramatically. Understanding this gap is essential for knowing when the tools of differential calculus apply.
Removed
text
The relationship between the two concepts reflects the geometric distinction between functions whose graphs have well-defined tangent lines and functions whose graphs are too rough or irregular for any linear approximation to work.
Removed
theorem
Differentiability Implies Continuity
[theorem: Differentiability Implies Continuity]
Let $E \subseteq \mathbb{R}$, $a \in E$ a limit point, and $f : E \to \mathbb{R}$ differentiable at $a$. Then $f$ is continuous at $a$.
[/theorem]
Removed
text
The proof uses the decomposition $f(x) - f(a) = \frac{f(x) - f(a)}{x - a} \cdot (x - a)$. As $x \to a$, the first factor tends to $f'(a)$ (a finite number) and the second factor tends to $0$, so their product tends to $0$, giving $f(x) \to f(a)$.
Removed
text
The converse fails: continuity does not imply differentiability. The simplest example is the absolute value function, which is continuous everywhere but not differentiable at the origin.
Removed
example
Non-Differentiability Of The Absolute Value
[example: Non-Differentiability Of The Absolute Value]
The function $f : \mathbb{R} \to \mathbb{R}$ defined by $f(x) = |x|$ is continuous at $0$ (since $|f(x) - f(0)| = |x| \to 0$) but not differentiable at $0$. The left and right difference quotients give different limits:
\begin{align*}
\lim_{h \to 0^+} \frac{|h| - 0}{h} &= \lim_{h \to 0^+} \frac{h}{h} = 1, \\
\lim_{h \to 0^-} \frac{|h| - 0}{h} &= \lim_{h \to 0^-} \frac{-h}{h} = -1.
\end{align*}
Since the one-sided limits disagree, the two-sided limit does not exist, and $f'(0)$ is undefined. Geometrically, the graph of $|x|$ has a corner at the origin: the left half has slope $-1$ and the right half has slope $+1$, so there is no single tangent line.
[/example]
Removed
example
A Continuous Nowhere-Differentiable Function
[example: A Continuous Nowhere-Differentiable Function]
The failure of the converse is far worse than isolated corners. Weierstrass (1872) exhibited a continuous function $f : \mathbb{R} \to \mathbb{R}$ that is differentiable at *no* point. One such construction is
\begin{align*}
f(x) &= \sum_{n=0}^\infty a^n \cos(b^n \pi x),
\end{align*}
where $0 < a < 1$, $b$ is a positive odd integer, and $ab > 1 + \frac{3\pi}{2}$. The [series](/page/Series) [converges uniformly](/page/Uniform%20Convergence) (by the Weierstrass $M$-test, since $|a^n \cos(b^n \pi x)| \leq a^n$ and $\sum a^n < \infty$), so $f$ is continuous. But the condition $ab > 1$ ensures that the oscillations at scale $b^{-n}$ grow faster than the damping factor $a^n$ contracts them, preventing the difference quotient from converging at any point. The existence of such functions demonstrates that continuity and differentiability are genuinely different properties — one cannot infer differentiability from continuity alone, even at a single point.
[/example]
Removed
text
## Algebraic Rules
Removed
text
Computing derivatives from the definition for each new function would be impractical. The algebraic rules — the sum rule, product rule, quotient rule, and chain rule — reduce the differentiation of complicated expressions to the differentiation of their building blocks. These rules, together with the derivatives of the elementary functions, suffice to differentiate any expression built from polynomials, rational functions, exponentials, logarithms, and trigonometric functions.
Removed
text
[quotetheorem:198]
Removed
text
The sum rule and product rule are proved by adding and subtracting auxiliary terms in the difference quotient. The chain rule is subtler: the naive argument "cancel $\Delta g$ in $(\Delta f / \Delta g) \cdot (\Delta g / \Delta x)$" fails when $\Delta g = 0$, and a correct proof requires the linear-approximation characterisation of differentiability.
Removed
text
For the **product rule**, the key manipulation is:
Removed
align*
\begin{align*}
\frac{f(x)g(x) - f(a)g(a)}{x - a} &= \frac{f(x) - f(a)}{x-a} \cdot g(x) + f(a) \cdot \frac{g(x) - g(a)}{x-a}.
\end{align*}
Removed
text
As $x \to a$, the first term tends to $f'(a) g(a)$ (using continuity of $g$ at $a$, which follows from differentiability) and the second tends to $f(a) g'(a)$.
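A small sanity check, with functions chosen here purely for illustration: for $f(x) = x^2$ and $g(x) = x^3$, the product rule gives
\begin{align*}
(fg)'(a) &= f'(a) g(a) + f(a) g'(a) = 2a \cdot a^3 + a^2 \cdot 3a^2 = 5a^4,
\end{align*}
which agrees with differentiating $f(x)g(x) = x^5$ directly by the power rule.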
Removed
text
For the **chain rule**, write $g(x) = g(a) + g'(a)(x-a) + \varepsilon_g(x)(x-a)$ and $f(y) = f(g(a)) + f'(g(a))(y - g(a)) + \varepsilon_f(y)(y - g(a))$. Substituting $y = g(x)$ and dividing by $x - a$ gives $(f \circ g)'(a) = f'(g(a)) \cdot g'(a)$; the error terms vanish in the limit because $\varepsilon_g(x) \to 0$ and $\varepsilon_f(g(x)) \to 0$ as $x \to a$.
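For instance, taking the illustrative choices $g(x) = x^2 + 1$ and $f(y) = y^3$ (not from the text above), the chain rule gives
\begin{align*}
(f \circ g)'(a) &= f'(g(a)) \cdot g'(a) = 3(a^2 + 1)^2 \cdot 2a = 6a(a^2+1)^2.
\end{align*}
Expanding $(x^2+1)^3 = x^6 + 3x^4 + 3x^2 + 1$ and differentiating term by term yields $6a^5 + 12a^3 + 6a$, the same polynomial.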
Removed
example
Derivative Of The Exponential Function
[example: Derivative Of The Exponential Function]
The exponential function $\exp : \mathbb{R} \to \mathbb{R}$ is the unique function satisfying $\exp'(x) = \exp(x)$ for all $x$ and $\exp(0) = 1$. To verify the differential equation from the series definition $\exp(x) = \sum_{n=0}^\infty x^n / n!$, one uses the functional equation $\exp(a+h) = \exp(a)\exp(h)$ to compute
\begin{align*}
\frac{\exp(a+h) - \exp(a)}{h} &= \exp(a) \cdot \frac{\exp(h) - 1}{h}.
\end{align*}
The factor $(\exp(h) - 1)/h = \sum_{n=1}^\infty h^{n-1}/n!$ converges to $1$ as $h \to 0$: its first term ($n = 1$) equals $1$, and the remaining terms sum to a quantity that is $O(h)$. Hence $\exp'(a) = \exp(a)$.
Combined with the chain rule, this gives the derivative of $a^x = \exp(x \ln a)$ as $a^x \ln a$, and the derivative of $\ln x$ (the inverse of $\exp$) as $1/x$ via inverse function differentiation.
[/example]
Removed
text
## The Mean Value Theorems
Removed
text
The definition of the derivative provides only *local* information: the behaviour of $f$ in arbitrarily small neighbourhoods of a single point. The mean value theorems convert this local information into *global* conclusions about the function's behaviour over an entire interval. They are the workhorses of analysis: every proof that uses "the derivative is nonnegative, therefore the function is nondecreasing" relies on the mean value theorem.
Removed
text
### Rolle's Theorem
Removed
text
The simplest mean value theorem is Rolle's theorem, which asserts that a differentiable function that starts and ends at the same value must have a horizontal tangent somewhere in between. It is the stepping stone to the full mean value theorem.
Removed
theorem
Rolle's Theorem
[theorem: Rolle's Theorem]
Let $f : [a, b] \to \mathbb{R}$ be continuous on $[a, b]$ and differentiable on $(a, b)$, with $f(a) = f(b)$. Then there exists $c \in (a, b)$ with $f'(c) = 0$.
[/theorem]
Removed
text
If $f$ is constant on $[a, b]$, then $f'(c) = 0$ for every $c \in (a, b)$. If $f$ is not constant, the extreme value theorem (applied to the continuous function $f$ on the compact [set](/page/Set) $[a, b]$) guarantees that $f$ attains both a maximum and a minimum on $[a, b]$. Since $f(a) = f(b)$ and $f$ is not constant, at least one of these is attained at an interior point $c \in (a, b)$. At such a point the derivative must vanish: the one-sided difference quotients have opposite signs (or are zero), so their common limit can only be zero.
Removed
text
### The Mean Value Theorem
Removed
text
Rolle's theorem is the special case $f(a) = f(b)$; the general case follows by subtracting the linear function connecting $(a, f(a))$ to $(b, f(b))$.
Removed
text
[quotetheorem:186]
Removed
text
The proof applies Rolle's theorem to the auxiliary function $g(x) = f(x) - f(a) - \frac{f(b) - f(a)}{b - a}(x - a)$, which satisfies $g(a) = g(b) = 0$ and inherits continuity on $[a, b]$ and differentiability on $(a, b)$ from $f$.
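To see the theorem in a concrete case, consider the illustrative function $f(x) = x^2$ on $[0, 2]$ (chosen here for convenience). The slope of the secant line through the endpoints is
\begin{align*}
\frac{f(2) - f(0)}{2 - 0} &= \frac{4 - 0}{2} = 2,
\end{align*}
and $f'(c) = 2c = 2$ holds exactly at $c = 1 \in (0, 2)$. The auxiliary function is $g(x) = x^2 - 2x$, which vanishes at both endpoints and satisfies $g'(1) = 0$, as Rolle's theorem requires.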
Removed
text
The mean value theorem has numerous immediate consequences. A function with $f'(x) = 0$ for all $x$ in an interval is constant on that interval (apply the MVT to any two points). A function with $f'(x) > 0$ on an interval is strictly increasing. A function with a bounded derivative $|f'(x)| \leq M$ is Lipschitz continuous with constant $M$. These are qualitative conclusions about a function's global behaviour, derived entirely from pointwise information about its derivative.
Removed
text
### Why the Mean Value Theorem Fails for Vector-Valued Functions
Removed
text
The MVT is a result about real-valued functions. For functions $f : [a, b] \to \mathbb{R}^n$ with $n \geq 2$, the conclusion $f'(c) = (f(b) - f(a))/(b-a)$ can fail: take $f(t) = (\cos t, \sin t)$ on $[0, 2\pi]$, which satisfies $f(0) = f(2\pi)$ but $\|f'(t)\| = 1$ for all $t$, so $f'(c) \neq 0$ for any $c$. The correct generalisation is the [Mean Value Inequality](/theorems/328) $\|f(b) - f(a)\| \leq M |b - a|$, where $M$ is any bound with $\|f'(t)\| \leq M$ for all $t \in (a, b)$; it replaces equality with an upper bound.
Removed
text
### Cauchy's Mean Value Theorem
Removed
text
Cauchy's generalisation of the MVT replaces the linear function connecting two points with a parametric curve, and is the key ingredient in the proof of L'Hôpital's rule.
Removed
text
[quotetheorem:187]
Removed
text
The proof applies Rolle's theorem to $h(x) = f(x)(g(b) - g(a)) - g(x)(f(b) - f(a))$, which satisfies $h(a) = h(b)$.
Removed
text
## L'Hôpital's Rule
Removed
text
A recurring problem in analysis is the evaluation of limits of the form $\lim_{x \to a} f(x)/g(x)$ when both $f(x) \to 0$ and $g(x) \to 0$ (or both tend to $\pm \infty$). L'Hôpital's rule asserts that, under appropriate hypotheses, such a limit equals the limit of $f'(x)/g'(x)$, reducing the problem to a (hopefully simpler) computation with the derivatives.
Removed
theorem
L'Hôpital's Rule
[theorem: L'Hôpital's Rule]
Let $f, g : (a, b) \to \mathbb{R}$ be differentiable with $g'(x) \neq 0$ for all $x \in (a, b)$. Suppose $\lim_{x \to a^+} f(x) = \lim_{x \to a^+} g(x) = 0$. If the limit $\lim_{x \to a^+} f'(x)/g'(x) = L$ exists (finite or $\pm \infty$), then
\begin{align*}
\lim_{x \to a^+} \frac{f(x)}{g(x)} &= L.
\end{align*}
[/theorem]
Removed
text
The proof uses Cauchy's Mean Value Theorem. Define $f(a) = g(a) = 0$, which extends $f$ and $g$ continuously to $[a, b)$. For $x \in (a, b)$, Rolle's theorem and $g' \neq 0$ give $g(x) \neq g(a) = 0$, so the quotient $f(x)/g(x)$ is defined. Cauchy's MVT applied to $[a, x]$ then gives a point $c_x \in (a, x)$ with $f(x)/g(x) = f'(c_x)/g'(c_x)$. As $x \to a^+$, the point $c_x \to a^+$ as well (since $a < c_x < x$), so $f(x)/g(x) = f'(c_x)/g'(c_x) \to L$.
Removed
example
A Standard L'Hôpital Computation
[example: A Standard L'Hôpital Computation]
The limit $\lim_{x \to 0} (\sin x) / x$ is the canonical $0/0$ indeterminate form. With $f(x) = \sin x$ and $g(x) = x$, L'Hôpital's rule gives
\begin{align*}
\lim_{x \to 0} \frac{\sin x}{x} &= \lim_{x \to 0} \frac{\cos x}{1} = 1,
\end{align*}
provided the derivative limit exists (which it does, since $\cos x \to 1$). However, this application is circular if one defines $(\sin x)' = \cos x$ using the limit $\lim_{x \to 0} (\sin x)/x = 1$ in the first place. An independent proof of the limit (e.g., by geometric area comparison) is needed before L'Hôpital can be legitimately applied.
[/example]
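A further illustration, offered as a sketch and assuming that $(\cos x)' = -\sin x$ and $\lim_{x \to 0} (\sin x)/x = 1$ have been established independently: with $f(x) = 1 - \cos x$ and $g(x) = x^2$ on an interval $(0, b)$, both functions tend to $0$ as $x \to 0^+$ and $g'(x) = 2x \neq 0$, so
\begin{align*}
\lim_{x \to 0^+} \frac{1 - \cos x}{x^2} &= \lim_{x \to 0^+} \frac{\sin x}{2x} = \frac{1}{2}.
\end{align*}
The same value is obtained as $x \to 0^-$, since the quotient is an even function of $x$.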
Removed
remark
When L'Hôpital Fails
[remark: When L'Hôpital Fails]
The converse of L'Hôpital's rule is false: the limit $\lim f(x)/g(x)$ may exist even when $\lim f'(x)/g'(x)$ does not. For example, $f(x) = x + \sin x$ and $g(x) = x$ give $\lim_{x \to \infty} f(x)/g(x) = 1$, but $f'(x)/g'(x) = 1 + \cos x$ oscillates and has no limit. The rule also does not apply when the hypotheses fail — if $g'(x) = 0$ at points accumulating at $a$, the conclusion can break down.
[/remark]
Removed
text
## Higher Derivatives and Taylor's Theorem
Removed
text
If the derivative $f'$ is itself differentiable, one obtains the **second derivative** $f'' = (f')'$, and by induction the $n$-th derivative $f^{(n)}$. A function possessing continuous derivatives up to order $k$ is said to be of class $C^k$; a function of class $C^k$ for every $k$ is **smooth** ($C^\infty$). The higher derivatives encode the curvature, inflection, and higher-order bending of the graph, and they are the ingredients of polynomial approximation via Taylor's theorem.
Removed
definition
Higher Derivative
[definition: Higher Derivative]
Let $U \subseteq \mathbb{R}$ be open and $f : U \to \mathbb{R}$. For $n \in \mathbb{N}$, the **$n$-th derivative** $f^{(n)} : U \to \mathbb{R}$ is defined inductively by $f^{(0)} = f$ and $f^{(n)} = (f^{(n-1)})'$, provided each intermediate derivative exists and is differentiable. The function $f$ is of class $C^n(U)$ if $f^{(n)}$ exists and is continuous on $U$, and of class $C^\infty(U)$ if $f \in C^n(U)$ for every $n \in \mathbb{N}$.
[/definition]
Removed
text
### Taylor Polynomials
Removed
text
The derivative $f'(a)$ gives the best *linear* approximation to $f$ near $a$. The higher derivatives provide progressively better *polynomial* approximations. The $n$-th Taylor polynomial captures the behaviour of $f$ near $a$ up to order $n$.
Removed
definition
Taylor Polynomial
[definition: Taylor Polynomial]
Let $U \subseteq \mathbb{R}$ be open, $a \in U$, and $f \in C^n(U)$. The **$n$-th Taylor polynomial of $f$ centred at $a$** is
\begin{align*}
T_n(x) &= \sum_{k=0}^n \frac{f^{(k)}(a)}{k!}(x - a)^k.
\end{align*}
The polynomial $T_n$ is the unique polynomial of degree $\leq n$ satisfying $T_n^{(k)}(a) = f^{(k)}(a)$ for $k = 0, 1, \ldots, n$: it matches $f$ and all its derivatives up to order $n$ at the point $a$.
[/definition]
Removed
text
The key question is: how well does $T_n$ approximate $f$? Taylor's theorem provides the answer by giving an explicit formula for the remainder $R_n(x) = f(x) - T_n(x)$.
Removed
theorem
Taylor's Theorem With Lagrange Remainder
[theorem: Taylor's Theorem With Lagrange Remainder]
Let $U \subseteq \mathbb{R}$ be open, $a \in U$, and $f \in C^n(U)$ with $f^{(n+1)}$ existing on $U$. Then for every $x \in U$ with the closed interval between $a$ and $x$ contained in $U$, there exists $c$ strictly between $a$ and $x$ such that
\begin{align*}
f(x) &= \sum_{k=0}^n \frac{f^{(k)}(a)}{k!}(x - a)^k + \frac{f^{(n+1)}(c)}{(n+1)!}(x-a)^{n+1}.
\end{align*}
[/theorem]
Removed
text
The Lagrange remainder $R_n(x) = f^{(n+1)}(c)(x-a)^{n+1}/(n+1)!$ shows that the error in the $n$-th Taylor approximation is controlled by the $(n+1)$-th derivative. When $n = 0$, this reduces to the mean value theorem: $f(x) = f(a) + f'(c)(x-a)$. The proof proceeds by applying the generalised mean value theorem (Cauchy's MVT) repeatedly, peeling off one derivative at each step.
Removed
example
Taylor Expansion Of The Exponential
[example: Taylor Expansion Of The Exponential]
Since $\exp^{(k)}(x) = \exp(x)$ for all $k$, and $\exp(0) = 1$, the Taylor polynomial centred at $a = 0$ is
\begin{align*}
T_n(x) &= \sum_{k=0}^n \frac{x^k}{k!}.
\end{align*}
The Lagrange remainder satisfies $|R_n(x)| = |\exp(c)| \cdot |x|^{n+1}/(n+1)!$ for some $c$ between $0$ and $x$. For any fixed $x$, the factor $|x|^{n+1}/(n+1)! \to 0$ as $n \to \infty$ (since $n!$ grows faster than any exponential), and $|\exp(c)| \leq \exp(|x|)$. Therefore $R_n(x) \to 0$, and the Taylor series converges to $\exp(x)$:
\begin{align*}
\exp(x) &= \sum_{k=0}^\infty \frac{x^k}{k!} \quad \text{for all } x \in \mathbb{R}.
\end{align*}
Thus the Taylor series of $\exp$ converges to the function on all of $\mathbb{R}$, a property that smooth functions need not have.
[/example]
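As a numerical illustration of the remainder bound (the choice $n = 4$, $x = 1$ is made here for convenience, and the crude bound $e < 3$ is assumed), the Taylor polynomial and Lagrange remainder give
\begin{align*}
T_4(1) &= 1 + 1 + \frac{1}{2} + \frac{1}{6} + \frac{1}{24} = \frac{65}{24} \approx 2.7083, \\
|R_4(1)| &= \frac{\exp(c)}{5!} < \frac{3}{120} = 0.025 \quad \text{for some } c \in (0, 1).
\end{align*}
So $e$ lies within $0.025$ of $65/24$; the actual error is about $0.01$.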
Removed
example
A Smooth Function Not Equal To Its Taylor Series
[example: A Smooth Function Not Equal To Its Taylor Series]
The Taylor series of a $C^\infty$ function need not converge to the function. Define
\begin{align*}
f : \mathbb{R} &\to \mathbb{R} \\
x &\mapsto \begin{cases} e^{-1/x^2} & \text{if } x \neq 0, \\ 0 & \text{if } x = 0. \end{cases}
\end{align*}
Then $f \in C^\infty(\mathbb{R})$ and $f^{(n)}(0) = 0$ for every $n \in \mathbb{N}_0$ (proved by induction, using L'Hôpital's rule to show that $\lim_{x \to 0} x^{-k} e^{-1/x^2} = 0$ for every $k$). The Taylor series of $f$ at $0$ is therefore identically zero — $T_n(x) = 0$ for every $n$ — yet $f(x) > 0$ for all $x \neq 0$. The Taylor series converges, but to the wrong function. A function whose Taylor series at every point converges to the function in a neighbourhood is called **real-analytic**; this example shows that $C^\infty$ does not imply real-analyticity.
[/example]
Removed
text
## The Fundamental Theorem of Calculus
Removed
text
The deepest single result connecting differentiation to [integration](/page/Integral) is the [Fundamental Theorem of Calculus](/theorems/632), which asserts that the two operations are inverses of each other (under appropriate regularity hypotheses). It has two parts: Part I says that differentiation undoes integration, and Part II says that integration undoes differentiation.
Removed
text
[quotetheorem:632]
Removed
text
Part I is proved by estimating the difference quotient of $F$. For small $h > 0$:
Removed
align*
\begin{align*}
\frac{F(x+h) - F(x)}{h} &= \frac{1}{h} \int_x^{x+h} f(t) \, d\mathcal{L}^1(t).
\end{align*}
Removed
text
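By continuity of $f$ at $x$, for any $\varepsilon > 0$ there is $\delta > 0$ such that $|f(t) - f(x)| < \varepsilon$ for all $t \in [x, x+h]$ whenever $0 < h < \delta$. The integral is then within $\varepsilon h$ of $f(x) \cdot h$, so the difference quotient is within $\varepsilon$ of $f(x)$. The case $h < 0$ is symmetric, and hence $F'(x) = f(x)$.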
Removed
text
Part II is proved using Part I and the mean value theorem. If $G' = f$ and $F(x) = \int_a^x f$, then $(G - F)' = f - f = 0$ on $(a, b)$, so $G - F$ is constant by the MVT corollary. Evaluating at $x = a$ gives $G(a) - F(a) = G(a)$, hence $G(x) = F(x) + G(a)$, so $\int_a^b f = F(b) = G(b) - G(a)$.
Removed
text
The FTC is the reason that [integration](/page/Integral) in practice reduces to finding antiderivatives: to compute $\int_a^b f(t) \, dt$, one finds a function $G$ with $G' = f$ and evaluates $G(b) - G(a)$. Every integration technique — substitution, [integration by parts](/theorems/210), partial fractions — is a method for finding antiderivatives, and the FTC justifies why this works.
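A minimal worked instance, with the integrand chosen here for illustration: to compute $\int_0^1 t^2 \, dt$, take $G(t) = t^3/3$, so that $G'(t) = t^2$, and evaluate
\begin{align*}
\int_0^1 t^2 \, dt &= G(1) - G(0) = \frac{1}{3} - 0 = \frac{1}{3}.
\end{align*}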
Removed
text
### The Relationship To Weak Derivatives
Removed
text
The classical derivative requires the limit of the difference quotient to exist pointwise. In PDE theory and the modern theory of [distributions](/page/Distribution), this is too restrictive: many natural "solutions" (e.g., the Heaviside step function as an "antiderivative" of the Dirac delta) are not differentiable in the classical sense. The **[weak derivative](/page/Weak%20Derivative)** generalises differentiation by replacing the pointwise limit with an integration-by-parts identity: a locally integrable function $g$ is the weak derivative of $f$ if
Removed
align*
\begin{align*}
\int_\Omega f \varphi' \, d\mathcal{L}^1 &= -\int_\Omega g \varphi \, d\mathcal{L}^1 \quad \text{for all } \varphi \in C_c^\infty(\Omega).
\end{align*}
Removed
text
When $f$ is classically differentiable with continuous derivative, the weak derivative agrees with $f'$ by integration by parts. The [Sobolev spaces](/page/Sobolev%20Space) $W^{k,p}(\Omega)$ are built on this notion, and the [distribution](/page/Distribution) theory extends it further to objects that are not functions at all.
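As an illustration (taking $\Omega = \mathbb{R}$ and writing $dx$ for $d\mathcal{L}^1$, both choices made here for brevity), the function $f(x) = |x|$ has weak derivative $g(x) = \operatorname{sgn}(x)$, even though $f$ is not classically differentiable at $0$. For $\varphi \in C_c^\infty(\mathbb{R})$, splitting the integral at $0$ and integrating by parts on each half gives
\begin{align*}
\int_{\mathbb{R}} |x| \varphi'(x) \, dx &= \int_{-\infty}^0 (-x) \varphi'(x) \, dx + \int_0^\infty x \varphi'(x) \, dx \\
&= \int_{-\infty}^0 \varphi(x) \, dx - \int_0^\infty \varphi(x) \, dx \\
&= -\int_{\mathbb{R}} \operatorname{sgn}(x) \varphi(x) \, dx.
\end{align*}
The boundary terms vanish because $x \varphi(x)$ is zero at $x = 0$ and $\varphi$ has compact support.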
Removed
text
## Beyond One Variable
Removed
text
For functions $f : U \subseteq \mathbb{R}^m \to \mathbb{R}^n$ between Euclidean spaces, the derivative at a point $a$ is no longer a number but a **linear map** $f'(a) : \mathbb{R}^m \to \mathbb{R}^n$ — the best linear approximation to $f$ near $a$. The condition $f(x) = f(a) + f'(a)(x-a) + o(|x-a|)$ generalises directly, with the product $f'(a)(x-a)$ replaced by the action of a linear map on a vector. The matrix representing $f'(a)$ with respect to the standard bases is the **Jacobian matrix** $(D_j f_i(a))$.
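For example, with the illustrative map $f : \mathbb{R}^2 \to \mathbb{R}^2$, $f(x, y) = (x^2 y, \, x + y)$ (chosen here, not taken from the text), the Jacobian matrix at a point $(x, y)$ is
\begin{align*}
f'(x, y) &= \begin{pmatrix} 2xy & x^2 \\ 1 & 1 \end{pmatrix},
\end{align*}
whose rows are the gradients of the component functions $f_1(x, y) = x^2 y$ and $f_2(x, y) = x + y$.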
Removed
text
The algebraic rules generalise: the [Chain Rule for Maps Between Euclidean Spaces](/theorems/323) asserts that $(g \circ f)'(a) = g'(f(a)) \circ f'(a)$, so the derivative of a composition is the composition of the derivatives, with matrix multiplication replacing scalar multiplication. The [Inverse Function Theorem](/theorems/51) asserts that if $f : U \subseteq \mathbb{R}^n \to \mathbb{R}^n$ is continuously differentiable near $a$ and the derivative $f'(a)$ is an invertible linear map, then $f$ is locally invertible near $a$ with a continuously differentiable inverse.
Removed
text
## References
Removed
bullet
- Rudin, *Principles of Mathematical Analysis* (1976).
- Spivak, *Calculus* (2008).
- Abbott, *Understanding Analysis* (2015).
- Tao, *Analysis I* (2016).