Derivative
The derivative is the central concept of differential calculus — the precise formulation of the idea that a function has an instantaneous rate of change. Where a limit captures the behaviour of a quantity as a parameter tends to some value, the derivative applies this mechanism to the difference quotient of a function, extracting the slope of the tangent line from the slopes of secant lines. Every major application of calculus — optimisation, differential equations, Taylor approximation, the Fundamental Theorem of Calculus linking differentiation to integration — depends on the derivative. This page develops the derivative of a real-valued function of one real variable: its definition, its algebraic properties, the mean value theorems that give it global reach, and the higher-order theory culminating in Taylor's theorem.
[motivation]
Rates of Change and the Tangent Problem
Many quantities in mathematics and the sciences vary with respect to one another: position changes with time, pressure changes with volume, area changes with the length of a side. The fundamental question of differential calculus is: at what rate does one quantity change with respect to another, at a specific instant?
The difficulty is that "instantaneous rate of change" is not a directly observable quantity. Over a time interval $[t_0, t_0 + h]$, one can measure the average velocity of a particle as the ratio $\Delta x / \Delta t = (x(t_0 + h) - x(t_0))/h$. But this is a rate over an interval, not at a point. Taking $h$ smaller gives a better approximation, but setting $h = 0$ produces $0/0$ — the ratio is undefined. The derivative resolves this by defining the instantaneous rate as the limit of the average rate as $h \to 0$, provided the limit exists.
What Fails Without the Derivative
Without the derivative, one cannot formulate the condition for a function to have a local extremum, determine when two functions grow at the same rate, approximate a function by polynomials, or relate the rate of change of a quantity to its accumulation. The mean value theorem — the bridge between local information (the derivative at each point) and global information (the change over an interval) — requires differentiability as a hypothesis. Taylor's theorem, which approximates smooth functions by polynomials with explicit error bounds, is built entirely on iterated differentiation. And the Fundamental Theorem of Calculus, which connects the integral to antidifferentiation, makes differentiation indispensable for computing integrals.
The Geometric Picture
Geometrically, the derivative of $f$ at $a$ is the slope of the unique line that best approximates the graph of $f$ near $(a, f(a))$. The secant line through $(a, f(a))$ and $(a+h, f(a+h))$ has slope $(f(a+h) - f(a))/h$; as $h \to 0$, these secant lines rotate toward a limiting position, the tangent line, and the derivative is the slope of this tangent. The existence of the derivative is thus equivalent to the existence of a well-defined non-vertical tangent line. This fails at corners, where the one-sided limiting positions differ, and at cusps and points of vertical tangency, where the secant slopes are unbounded.
[/motivation]
Definition
The derivative is defined as a limit of difference quotients. The domain of the function must be such that the limit can be formed — the point must be a limit point of the domain from both sides (for the full derivative) or from one side (for one-sided derivatives).
[definition: Derivative]
Let $E \subseteq \mathbb{R}$, let $a \in E$ be a limit point of $E$, and let $f : E \to \mathbb{R}$. The derivative of $f$ at $a$ is
\begin{align*} f'(a) &:= \lim_{h \to 0} \frac{f(a+h) - f(a)}{h}, \end{align*}
provided this limit exists and is finite. Equivalently,
\begin{align*} f'(a) &= \lim_{x \to a} \frac{f(x) - f(a)}{x - a}. \end{align*}
If $f'(a)$ exists, $f$ is said to be differentiable at $a$. If $f$ is differentiable at every point of an open set $U \subseteq E$, $f$ is differentiable on $U$, and the function
\begin{align*} f' : U &\to \mathbb{R} \\ a &\mapsto f'(a) \end{align*}
is called the derivative of $f$.
[/definition]
The two formulations are related by the substitution $x = a + h$. The limit $h \to 0$ is taken over all $h \neq 0$ such that $a + h \in E$; the limit $x \to a$ is taken over all $x \in E \setminus \{a\}$. In either case, the definition asks for a limit of the slopes of secant lines through the fixed point $(a, f(a))$.
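As a rough numerical illustration of the definition, the sketch below evaluates the difference quotient of $f(x) = x^2$ at $a = 3$ for shrinking $h$. The helper name `difference_quotient`, the test function, and the step sizes are assumptions chosen for illustration; floating-point rounding limits how small $h$ can usefully be.

```python
def difference_quotient(f, a, h):
    """Slope of the secant line through (a, f(a)) and (a + h, f(a + h))."""
    return (f(a + h) - f(a)) / h


def square(x):
    return x * x


# Secant slopes at a = 3 for step sizes 10^-1, ..., 10^-6.
slopes = [difference_quotient(square, 3.0, 10.0 ** -k) for k in range(1, 7)]
for k, slope in enumerate(slopes, start=1):
    print(f"h = 1e-{k}: slope = {slope:.8f}")
```

For this $f$ the quotient equals $6 + h$ in exact arithmetic, so the printed slopes approach $f'(3) = 6$.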
An equivalent characterisation — often more useful for proofs — reformulates differentiability as a linear approximation property. The function $f$ is differentiable at $a$ with derivative $f'(a) = L$ if and only if there exists a function $\varepsilon : E \to \mathbb{R}$ with $\varepsilon(x) \to 0$ as $x \to a$ such that
\begin{align*} f(x) &= f(a) + L(x - a) + \varepsilon(x)(x - a) \quad \text{for all } x \in E. \end{align*}
This says that $f$ is approximated near $a$ by the affine function $x \mapsto f(a) + L(x-a)$ — the tangent line — with an error that is $o(|x-a|)$ as $x \to a$. The derivative $L = f'(a)$ is the unique real number for which such an approximation exists. This linear-approximation viewpoint is the starting point for the generalisation to several variables, where the derivative becomes a linear map rather than a number.
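As a worked instance of this characterisation, take $f(x) = x^2$ on $E = \mathbb{R}$ (a choice made here purely for illustration). Expanding around $a$ gives
\begin{align*} x^2 &= a^2 + 2a(x - a) + (x - a)^2, \end{align*}
so $L = 2a$ and the error function is $\varepsilon(x) = x - a$, which tends to $0$ as $x \to a$.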
[example: Derivative Of A Power Function]
Let $f : \mathbb{R} \to \mathbb{R}$ be defined by $f(x) = x^n$ for a fixed $n \in \mathbb{N}$. For any $a \in \mathbb{R}$, the difference quotient is
\begin{align*} \frac{f(a+h) - f(a)}{h} &= \frac{(a+h)^n - a^n}{h}. \end{align*}
Using the factorisation $u^n - v^n = (u - v)(u^{n-1} + u^{n-2}v + \cdots + v^{n-1})$ with $u = a+h$ and $v = a$:
\begin{align*} \frac{(a+h)^n - a^n}{h} &= (a+h)^{n-1} + (a+h)^{n-2}a + \cdots + a^{n-1}. \end{align*}
This is a sum of $n$ terms. As $h \to 0$, each $(a+h)^{n-1-k} a^k \to a^{n-1}$, so
\begin{align*} f'(a) &= \lim_{h \to 0} \sum_{k=0}^{n-1} (a+h)^{n-1-k} a^k = n a^{n-1}. \end{align*}
The power rule $f'(a) = na^{n-1}$ extends to all real exponents $n \in \mathbb{R}$ (for $a > 0$) via the exponential function: if $f(x) = x^n = e^{n \ln x}$, the chain rule gives $f'(x) = n x^{n-1}$.
[/example]
Differentiability and Continuity
Differentiability is a stronger condition than continuity: every differentiable function is continuous, but the converse fails dramatically. Understanding this gap is essential for knowing when the tools of differential calculus apply.
The relationship between the two concepts reflects the geometric distinction between functions whose graphs have well-defined tangent lines and functions whose graphs are too rough or irregular for any linear approximation to work.
[theorem: Differentiability Implies Continuity]
Let $E \subseteq \mathbb{R}$, $a \in E$ a limit point, and $f : E \to \mathbb{R}$ differentiable at $a$. Then $f$ is continuous at $a$.
[/theorem]
The proof uses the decomposition $f(x) - f(a) = \frac{f(x) - f(a)}{x - a} \cdot (x - a)$. As $x \to a$, the first factor tends to $f'(a)$ (a finite number) and the second factor tends to $0$, so their product tends to $0$, giving $f(x) \to f(a)$.
The converse fails: continuity does not imply differentiability. The simplest example is the absolute value function, which is continuous everywhere but not differentiable at the origin.
[example: Non-Differentiability Of The Absolute Value]
The function $f : \mathbb{R} \to \mathbb{R}$ defined by $f(x) = |x|$ is continuous at $0$ (since $|f(x) - f(0)| = |x| \to 0$) but not differentiable at $0$. The left and right difference quotients give different limits:
\begin{align*} \lim_{h \to 0^+} \frac{|h| - 0}{h} &= \lim_{h \to 0^+} \frac{h}{h} = 1, \\ \lim_{h \to 0^-} \frac{|h| - 0}{h} &= \lim_{h \to 0^-} \frac{-h}{h} = -1. \end{align*}
Since the one-sided limits disagree, the two-sided limit does not exist, and $f'(0)$ is undefined. Geometrically, the graph of $|x|$ has a corner at the origin: the left half has slope $-1$ and the right half has slope $+1$, so there is no single tangent line.
[/example]
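The disagreement of the one-sided limits for $|x|$ can also be seen numerically. The following is a minimal sketch; the helper name `one_sided_quotients` is hypothetical and introduced only for this illustration.

```python
def one_sided_quotients(f, a, h):
    """Return the (left, right) difference quotients of f at a for step h > 0."""
    right = (f(a + h) - f(a)) / h
    left = (f(a - h) - f(a)) / (-h)
    return left, right


for h in (1e-1, 1e-3, 1e-6):
    left, right = one_sided_quotients(abs, 0.0, h)
    print(f"h = {h:g}: left = {left}, right = {right}")
```

The left quotient stays at $-1$ and the right quotient at $+1$ for every step size, so no single limiting slope exists.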
[example: A Continuous Nowhere-Differentiable Function]
The failure of the converse is far worse than isolated corners. Weierstrass (1872) exhibited a continuous function $f : \mathbb{R} \to \mathbb{R}$ that is differentiable at no point. One such construction is
\begin{align*} f(x) &= \sum_{n=0}^\infty a^n \cos(b^n \pi x), \end{align*}
where $0 < a < 1$, $b$ is a positive odd integer, and $ab > 1 + \frac{3\pi}{2}$. The series converges uniformly (by the Weierstrass $M$-test, since $|a^n \cos(b^n \pi x)| \leq a^n$ and $\sum a^n < \infty$), so $f$ is continuous. Heuristically, the $n$-th term has amplitude $a^n$ but oscillates at scale $b^{-n}$, so its slopes are of order $(ab)^n$; because $ab > 1$ these slopes grow without bound, and Weierstrass's stronger condition on $ab$ is what makes this rigorous, preventing the difference quotient from converging at any point. The existence of such functions demonstrates that continuity and differentiability are genuinely different properties: one cannot infer differentiability from continuity alone, even at a single point.
[/example]
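The uniform bound behind the $M$-test can be checked on partial sums. The sketch below assumes the illustrative parameters $a = 0.5$ and $b = 13$ (which satisfy the stated condition) and a hypothetical helper `weierstrass_partial`; it says nothing about differentiability, only about the bound $|f_N(x)| \leq 1/(1-a)$.

```python
import math


def weierstrass_partial(x, a=0.5, b=13, terms=8):
    """Partial sum of sum_{n < terms} a^n cos(b^n pi x)."""
    return sum(a ** n * math.cos(b ** n * math.pi * x) for n in range(terms))


a, b = 0.5, 13
assert a * b > 1 + 3 * math.pi / 2  # Weierstrass's condition on a and b

# M-test bound: every partial sum satisfies |f_N(x)| <= 1 / (1 - a) = 2.
samples = [weierstrass_partial(k / 1000) for k in range(-1000, 1001)]
print(max(abs(v) for v in samples))
```

Only a few terms are summed because $b^n \pi x$ quickly exceeds the range in which floating-point cosine is accurate.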
Algebraic Rules
Computing derivatives from the definition for each new function would be impractical. The algebraic rules — the sum rule, product rule, quotient rule, and chain rule — reduce the differentiation of complicated expressions to the differentiation of their building blocks. These rules, together with the derivatives of the elementary functions, suffice to differentiate any expression built from polynomials, rational functions, exponentials, logarithms, and trigonometric functions.
[quotetheorem:198]
The sum rule and product rule are proved by adding and subtracting auxiliary terms in the difference quotient. The chain rule is subtler: the naive argument "cancel $\Delta g$ in $(\Delta f / \Delta g) \cdot (\Delta g / \Delta x)$" fails when $\Delta g = 0$, and a correct proof requires the linear-approximation characterisation of differentiability.
For the product rule, the key manipulation is:
\begin{align*} \frac{f(x)g(x) - f(a)g(a)}{x - a} &= \frac{f(x) - f(a)}{x-a} \cdot g(x) + f(a) \cdot \frac{g(x) - g(a)}{x-a}. \end{align*}
As $x \to a$, the first term tends to $f'(a) g(a)$ (using continuity of $g$ at $a$, which follows from differentiability) and the second tends to $f(a) g'(a)$.
For the chain rule, write $g(x) = g(a) + g'(a)(x-a) + \varepsilon_g(x)(x-a)$ and $f(y) = f(g(a)) + f'(g(a))(y - g(a)) + \varepsilon_f(y)(y - g(a))$. Substituting $y = g(x)$ and dividing by $x - a$ gives $(f \circ g)'(a) = f'(g(a)) \cdot g'(a)$; the error terms vanish in the limit because $\varepsilon_g(x) \to 0$ and, since $g$ is continuous at $a$ and we may set $\varepsilon_f(g(a)) = 0$, also $\varepsilon_f(g(x)) \to 0$ as $x \to a$.
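Written out, the substitution in the chain rule argument reads as follows:
\begin{align*} f(g(x)) - f(g(a)) &= \bigl[f'(g(a)) + \varepsilon_f(g(x))\bigr]\,(g(x) - g(a)) \\ &= \bigl[f'(g(a)) + \varepsilon_f(g(x))\bigr]\,\bigl[g'(a) + \varepsilon_g(x)\bigr]\,(x - a). \end{align*}
Dividing by $x - a$ and letting $x \to a$ leaves $f'(g(a)) \cdot g'(a)$. No division by $g(x) - g(a)$ occurs, which is why the case $\Delta g = 0$ causes no difficulty.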
[example: Derivative Of The Exponential Function]
The exponential function $\exp : \mathbb{R} \to \mathbb{R}$ is the unique function satisfying $\exp'(x) = \exp(x)$ for all $x$ and $\exp(0) = 1$. To verify the derivative formula from the series definition $\exp(x) = \sum_{n=0}^\infty x^n / n!$, one uses the functional equation $\exp(a+h) = \exp(a)\exp(h)$ to compute
\begin{align*} \frac{\exp(a+h) - \exp(a)}{h} &= \exp(a) \cdot \frac{\exp(h) - 1}{h}. \end{align*}
The factor $(\exp(h) - 1)/h = \sum_{n=1}^\infty h^{n-1}/n!$ converges to $1$ as $h \to 0$ (the $n = 1$ term equals $1$ and the remaining terms sum to $O(h)$). Hence $\exp'(a) = \exp(a)$.
Combined with the chain rule, this gives the derivative of $a^x = \exp(x \ln a)$ as $a^x \ln a$, and the derivative of $\ln x$ (the inverse of $\exp$) as $1/x$ via inverse function differentiation.
[/example]
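The limit $(\exp(h) - 1)/h \to 1$ used in the exponential example can be observed numerically. This is a small sketch with arbitrarily chosen step sizes; it uses the standard-library function `math.expm1`, which computes $\exp(h) - 1$ without the cancellation that a direct subtraction would suffer for tiny $h$.

```python
import math

# (exp(h) - 1) / h for shrinking h; math.expm1 avoids cancellation in exp(h) - 1.
ratios = {h: math.expm1(h) / h for h in (1e-1, 1e-2, 1e-4, 1e-8)}
for h, ratio in ratios.items():
    print(f"h = {h:g}: (exp(h) - 1)/h = {ratio:.12f}")
```

The values behave like $1 + h/2$, consistent with the series $\sum_{n \geq 1} h^{n-1}/n!$.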
The Mean Value Theorems
The definition of the derivative provides only local information — the behaviour of $f$ in an infinitesimal neighbourhood of a single point. The mean value theorems convert this local information into global conclusions about the function's behaviour over an entire interval. They are the workhorses of analysis: every proof that uses "the derivative is nonnegative, therefore the function is nondecreasing" relies on the mean value theorem.
Rolle's Theorem
The simplest mean value theorem is Rolle's theorem, which asserts that a differentiable function that starts and ends at the same value must have a horizontal tangent somewhere in between. It is the stepping stone to the full mean value theorem.
[theorem: Rolle's Theorem]
Let $f : [a, b] \to \mathbb{R}$ be continuous on $[a, b]$ and differentiable on $(a, b)$, with $f(a) = f(b)$. Then there exists $c \in (a, b)$ with $f'(c) = 0$.
[/theorem]
If $f$ is constant on $[a, b]$, then $f'(c) = 0$ for every $c \in (a, b)$. If $f$ is not constant, the extreme value theorem (applied to the continuous function $f$ on the compact set $[a, b]$) guarantees that $f$ attains both a maximum and a minimum on $[a, b]$. Because $f(a) = f(b)$ and $f$ is not constant, at least one of these extrema is attained at an interior point $c \in (a, b)$. At such a point the derivative must vanish: the one-sided difference quotients have opposite signs (or are zero), so their common limit can only be zero.
The Mean Value Theorem
Rolle's theorem is the special case $f(a) = f(b)$; the general case follows by subtracting the linear function connecting $(a, f(a))$ to $(b, f(b))$.
[quotetheorem:186]
The proof applies Rolle's theorem to the auxiliary function $g(x) = f(x) - f(a) - \frac{f(b) - f(a)}{b - a}(x - a)$, which satisfies $g(a) = g(b) = 0$ and inherits continuity on $[a, b]$ and differentiability on $(a, b)$ from $f$.
The mean value theorem has numerous immediate consequences. A function with $f'(x) = 0$ for all $x$ in an interval is constant on that interval (apply the MVT to any two points). A function with $f'(x) > 0$ on an interval is strictly increasing. A function with a bounded derivative $|f'(x)| \leq M$ is Lipschitz continuous with constant $M$. These are qualitative conclusions about a function's global behaviour, derived entirely from pointwise information about its derivative.
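To make the existence statement concrete, the sketch below locates a mean value point for the illustrative choice $f(x) = x^3$ on $[0, 2]$ by bisection on $f'(x) - \frac{f(b) - f(a)}{b - a}$. The function, the interval, and the helper `bisect_root` are assumptions made for this example; the theorem itself gives no method for finding $c$.

```python
def bisect_root(func, lo, hi, tol=1e-12):
    """Bisection for a root of func on [lo, hi]; assumes a sign change."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def f(x):
    return x ** 3


def f_prime(x):
    return 3 * x ** 2


a, b = 0.0, 2.0
secant_slope = (f(b) - f(a)) / (b - a)  # equals 4 for this f and interval
c = bisect_root(lambda x: f_prime(x) - secant_slope, a, b)
print(c)  # a point in (a, b) where f'(c) matches the secant slope
```

Here the point can also be found by hand: $3c^2 = 4$ gives $c = 2/\sqrt{3}$.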
Why the Mean Value Theorem Fails for Vector-Valued Functions
The MVT is a fundamentally real result. For functions $f : [a, b] \to \mathbb{R}^n$ with $n \geq 2$, the conclusion $f'(c) = (f(b) - f(a))/(b-a)$ can fail: take $f(t) = (\cos t, \sin t)$ on $[0, 2\pi]$, which satisfies $f(0) = f(2\pi)$, so the right-hand side is $0$, but $\|f'(t)\| = 1$ for all $t$, so $f'(c) \neq 0$ for any $c$. The correct generalisation is the Mean Value Inequality $\|f(b) - f(a)\| \leq M |b - a|$, valid whenever $\|f'(t)\| \leq M$ for all $t \in (a, b)$, which replaces equality with an upper bound.
Cauchy's Mean Value Theorem
Cauchy's generalisation of the MVT replaces the linear function connecting two points with a parametric curve, and is the key ingredient in the proof of L'Hôpital's rule.
[quotetheorem:187]
The proof applies Rolle's theorem to $h(x) = f(x)(g(b) - g(a)) - g(x)(f(b) - f(a))$, which satisfies $h(a) = h(b)$.
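The endpoint values of this auxiliary function can be checked directly:
\begin{align*} h(a) &= f(a)g(b) - f(a)g(a) - g(a)f(b) + g(a)f(a) = f(a)g(b) - g(a)f(b), \\ h(b) &= f(b)g(b) - f(b)g(a) - g(b)f(b) + g(b)f(a) = f(a)g(b) - g(a)f(b). \end{align*}
Rolle's theorem then yields $c \in (a, b)$ with $h'(c) = 0$, that is, $f'(c)(g(b) - g(a)) = g'(c)(f(b) - f(a))$, which is the usual form of Cauchy's conclusion.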
L'Hôpital's Rule
A recurring problem in analysis is the evaluation of limits of the form $\lim_{x \to a} f(x)/g(x)$ when both $f(x) \to 0$ and $g(x) \to 0$ (or both tend to $\pm \infty$). L'Hôpital's rule asserts that, under appropriate hypotheses, such a limit equals the limit of $f'(x)/g'(x)$, reducing the problem to a (hopefully simpler) computation with the derivatives.
[theorem: L'Hôpital's Rule]
Let $f, g : (a, b) \to \mathbb{R}$ be differentiable with $g'(x) \neq 0$ for all $x \in (a, b)$. Suppose $\lim_{x \to a^+} f(x) = \lim_{x \to a^+} g(x) = 0$. If the limit $\lim_{x \to a^+} f'(x)/g'(x) = L$ exists (finite or $\pm \infty$), then
\begin{align*}
\lim_{x \to a^+} \frac{f(x)}{g(x)} &= L.
\end{align*}
[/theorem]
The proof uses Cauchy's Mean Value Theorem. Extend $f$ and $g$ continuously to $a$ by setting $f(a) = g(a) = 0$. For $x \in (a, b)$, Rolle's theorem and the hypothesis $g' \neq 0$ give $g(x) \neq g(a) = 0$, so Cauchy's MVT applied to $[a, x]$ gives a point $c_x \in (a, x)$ with $f(x)/g(x) = f'(c_x)/g'(c_x)$. As $x \to a^+$, the point $c_x \to a^+$ as well (since $a < c_x < x$), so $f(x)/g(x) = f'(c_x)/g'(c_x) \to L$.
[example: A Standard L'Hôpital Computation]
The limit $\lim_{x \to 0} (\sin x) / x$ is the canonical $0/0$ indeterminate form. With $f(x) = \sin x$ and $g(x) = x$, L'Hôpital's rule (applied to each one-sided limit) gives
\begin{align*} \lim_{x \to 0} \frac{\sin x}{x} &= \lim_{x \to 0} \frac{\cos x}{1} = 1, \end{align*}
provided the derivative limit exists (which it does, since $\cos x \to 1$). However, this application is circular if one defines $(\sin x)' = \cos x$ using the limit $\lim_{x \to 0} (\sin x)/x = 1$ in the first place. An independent proof of the limit (e.g., by geometric area comparison) is needed before L'Hôpital can be legitimately applied.
[/example]
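A second illustration, chosen here as an example, shows the rule applied twice. Both quotients below are $0/0$ forms as $x \to 0^+$, and the derivatives of the denominators, $2x$ and $2$, are non-zero on $(0, b)$:
\begin{align*} \lim_{x \to 0^+} \frac{e^x - 1 - x}{x^2} &= \lim_{x \to 0^+} \frac{e^x - 1}{2x} = \lim_{x \to 0^+} \frac{e^x}{2} = \frac{1}{2}. \end{align*}
The chain is justified from right to left: the last limit exists, so the rule gives the middle one, which in turn gives the first.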
[remark: When L'Hôpital Fails]
The converse of L'Hôpital's rule is false: the limit $\lim f(x)/g(x)$ may exist even when $\lim f'(x)/g'(x)$ does not. For example, $f(x) = x + \sin x$ and $g(x) = x$ give $\lim_{x \to \infty} f(x)/g(x) = 1$, but $f'(x)/g'(x) = 1 + \cos x$ oscillates and has no limit. The rule also does not apply when the hypotheses fail — if $g'(x) = 0$ at points accumulating at $a$, the conclusion can break down.
[/remark]
Higher Derivatives and Taylor's Theorem
If the derivative $f'$ is itself differentiable, one obtains the second derivative $f'' = (f')'$, and by induction the $n$-th derivative $f^{(n)}$. A function possessing continuous derivatives up to order $k$ is said to be of class $C^k$; a function of class $C^k$ for every $k$ is smooth ($C^\infty$). The higher derivatives encode the curvature, inflection, and higher-order bending of the graph, and they are the ingredients of polynomial approximation via Taylor's theorem.
[definition: Higher Derivative]
Let $U \subseteq \mathbb{R}$ be open and $f : U \to \mathbb{R}$. For $n \in \mathbb{N}$, the $n$-th derivative $f^{(n)} : U \to \mathbb{R}$ is defined inductively by $f^{(0)} = f$ and $f^{(n)} = (f^{(n-1)})'$, provided each intermediate derivative exists and is differentiable. The function $f$ is of class $C^n(U)$ if $f^{(n)}$ exists and is continuous on $U$, and of class $C^\infty(U)$ if $f \in C^n(U)$ for every $n \in \mathbb{N}$.
[/definition]
Taylor Polynomials
The derivative $f'(a)$ gives the best linear approximation to $f$ near $a$. The higher derivatives provide progressively better polynomial approximations. The $n$-th Taylor polynomial captures the behaviour of $f$ near $a$ up to order $n$.
[definition: Taylor Polynomial]
Let $U \subseteq \mathbb{R}$ be open, $a \in U$, and $f \in C^n(U)$. The $n$-th Taylor polynomial of $f$ centred at $a$ is
\begin{align*} T_n(x) &= \sum_{k=0}^n \frac{f^{(k)}(a)}{k!}(x - a)^k. \end{align*}
The polynomial $T_n$ is the unique polynomial of degree $\leq n$ satisfying $T_n^{(k)}(a) = f^{(k)}(a)$ for $k = 0, 1, \ldots, n$: it matches $f$ and all its derivatives up to order $n$ at the point $a$.
[/definition]
The key question is: how well does $T_n$ approximate $f$? Taylor's theorem provides the answer by giving an explicit formula for the remainder $R_n(x) = f(x) - T_n(x)$.
[theorem: Taylor's Theorem With Lagrange Remainder]
Let $U \subseteq \mathbb{R}$ be open, $a \in U$, and $f \in C^n(U)$ with $f^{(n+1)}$ existing on $U$. Then for every $x \in U$ with the closed interval between $a$ and $x$ contained in $U$, there exists $c$ strictly between $a$ and $x$ such that
\begin{align*}
f(x) &= \sum_{k=0}^n \frac{f^{(k)}(a)}{k!}(x - a)^k + \frac{f^{(n+1)}(c)}{(n+1)!}(x-a)^{n+1}.
\end{align*}
[/theorem]
The Lagrange remainder $R_n(x) = f^{(n+1)}(c)(x-a)^{n+1}/(n+1)!$ shows that the error in the $n$-th Taylor approximation is controlled by the $(n+1)$-th derivative. When $n = 0$, this reduces to the mean value theorem: $f(x) = f(a) + f'(c)(x-a)$. The proof proceeds by applying the generalised mean value theorem (Cauchy's MVT) repeatedly, peeling off one derivative at each step.
[example: Taylor Expansion Of The Exponential]
Since $\exp^{(k)}(x) = \exp(x)$ for all $k$, and $\exp(0) = 1$, the Taylor polynomial centred at $a = 0$ is
\begin{align*} T_n(x) &= \sum_{k=0}^n \frac{x^k}{k!}. \end{align*}
The Lagrange remainder satisfies $|R_n(x)| = |\exp(c)| \cdot |x|^{n+1}/(n+1)!$ for some $c$ between $0$ and $x$. For any fixed $x$, the factor $|x|^{n+1}/(n+1)! \to 0$ as $n \to \infty$ (since $n!$ grows faster than any exponential), and $|\exp(c)| \leq \exp(|x|)$. Therefore $R_n(x) \to 0$, and the Taylor series converges to $\exp(x)$:
\begin{align*} \exp(x) &= \sum_{k=0}^\infty \frac{x^k}{k!} \quad \text{for all } x \in \mathbb{R}. \end{align*}
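The same remainder estimate shows that the Taylor series of $\sin$ and $\cos$ converge to the function on all of $\mathbb{R}$, but convergence of a Taylor series to its function is not automatic for smooth functions.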
[/example]
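The remainder estimate for the exponential can be compared with the actual error. The sketch below is illustrative only: the helper names `taylor_exp` and `lagrange_bound` and the sample point $x = 2$ are assumptions, and the bound uses the estimate $|\exp(c)| \leq \exp(|x|)$ from the example.

```python
import math


def taylor_exp(x, n):
    """n-th Taylor polynomial of exp centred at 0, evaluated at x."""
    return sum(x ** k / math.factorial(k) for k in range(n + 1))


def lagrange_bound(x, n):
    """Upper bound exp(|x|) * |x|^(n+1) / (n+1)! on the remainder |R_n(x)|."""
    return math.exp(abs(x)) * abs(x) ** (n + 1) / math.factorial(n + 1)


x = 2.0
for n in (2, 5, 10, 15):
    error = abs(math.exp(x) - taylor_exp(x, n))
    print(f"n = {n:2d}: error = {error:.3e}, bound = {lagrange_bound(x, n):.3e}")
```

In each row the observed error should sit below the bound, and both shrink rapidly as $n$ grows.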
[example: A Smooth Function Not Equal To Its Taylor Series]
The Taylor series of a $C^\infty$ function need not converge to the function. Define
\begin{align*} f : \mathbb{R} &\to \mathbb{R} \\ x &\mapsto \begin{cases} e^{-1/x^2} & \text{if } x \neq 0, \\ 0 & \text{if } x = 0. \end{cases} \end{align*}
Then $f \in C^\infty(\mathbb{R})$ and $f^{(n)}(0) = 0$ for every $n \in \mathbb{N}_0$ (proved by induction, using L'Hôpital's rule to show that $\lim_{x \to 0} x^{-k} e^{-1/x^2} = 0$ for every $k$). The Taylor series of $f$ at $0$ is therefore identically zero — $T_n(x) = 0$ for every $n$ — yet $f(x) > 0$ for all $x \neq 0$. The Taylor series converges, but to the wrong function. A function whose Taylor series at every point converges to the function in a neighbourhood is called real-analytic; this example shows that $C^\infty$ does not imply real-analyticity.
[/example]
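The extreme flatness of this function at the origin is visible numerically. In the sketch below, the helper name `flat`, the sample points, and the exponent $k = 10$ are arbitrary choices for illustration.

```python
import math


def flat(x):
    """exp(-1/x^2) for x != 0, extended by 0 at the origin."""
    return math.exp(-1.0 / (x * x)) if x != 0 else 0.0


# x^(-k) * flat(x) stays tiny near 0 even for a large power such as k = 10.
for x in (0.5, 0.2, 0.1):
    print(f"x = {x}: flat(x) = {flat(x):.3e}, x^-10 * flat(x) = {flat(x) / x ** 10:.3e}")
```

Dividing by a high power of $x$ does not stop the values from collapsing toward $0$, which is the mechanism that forces every derivative at the origin to vanish.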
The Fundamental Theorem of Calculus
The deepest single result connecting differentiation to integration is the Fundamental Theorem of Calculus, which asserts that the two operations are inverses of each other (under appropriate regularity hypotheses). It has two parts: Part I says that differentiation undoes integration, and Part II says that integration undoes differentiation.
[quotetheorem:632]
Part I is proved by estimating the difference quotient of $F$. For small $h > 0$:
\begin{align*} \frac{F(x+h) - F(x)}{h} &= \frac{1}{h} \int_x^{x+h} f(t) \, d\mathcal{L}^1(t). \end{align*}
By continuity of $f$ at $x$, for any $\varepsilon > 0$ and $h$ sufficiently small, $|f(t) - f(x)| < \varepsilon$ for all $t \in [x, x+h]$, so the integral is within $\varepsilon h$ of $f(x) \cdot h$ and the difference quotient is within $\varepsilon$ of $f(x)$. The case $h < 0$ is symmetric, so $F'(x) = f(x)$.
Part II is proved using Part I and the mean value theorem. If $G' = f$ and $F(x) = \int_a^x f$, then $(G - F)' = f - f = 0$ on $(a, b)$, so $G - F$ is constant by the MVT corollary. Since $F(a) = 0$, evaluating at $x = a$ shows the constant is $G(a)$, hence $G(x) = F(x) + G(a)$, so $\int_a^b f = F(b) = G(b) - G(a)$.
The FTC is the reason that integration in practice reduces to finding antiderivatives: to compute $\int_a^b f(t) \, dt$, one finds a function $G$ with $G' = f$ and evaluates $G(b) - G(a)$. Every integration technique — substitution, integration by parts, partial fractions — is a method for finding antiderivatives, and the FTC justifies why this works.
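The antiderivative recipe can be compared against a direct numerical approximation of the integral. This is a minimal sketch assuming the illustrative integrand $\cos$ on $[0, \pi/2]$ with antiderivative $\sin$; the helper `trapezoid` is a plain composite trapezoid rule written for this example.

```python
import math


def trapezoid(f, a, b, steps=10_000):
    """Composite trapezoid approximation of the integral of f over [a, b]."""
    width = (b - a) / steps
    interior = sum(f(a + i * width) for i in range(1, steps))
    return width * (0.5 * (f(a) + f(b)) + interior)


a, b = 0.0, math.pi / 2
numeric = trapezoid(math.cos, a, b)   # approximates the integral of cos
exact = math.sin(b) - math.sin(a)     # G(b) - G(a) with G = sin, G' = cos
print(numeric, exact)
```

The two values agree up to the discretisation error of the trapezoid rule, while the antiderivative evaluation needs only two function calls.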
The Relationship To Weak Derivatives
The classical derivative requires the limit of the difference quotient to exist pointwise. In PDE theory and the modern theory of distributions, this is too restrictive: many natural "solutions" (e.g., the Heaviside step function as an "antiderivative" of the Dirac delta) are not differentiable in the classical sense. The weak derivative generalises differentiation by replacing the pointwise limit with an integration-by-parts identity: a locally integrable function $g$ is the weak derivative of a locally integrable function $f$ on an open set $\Omega \subseteq \mathbb{R}$ if
\begin{align*} \int_\Omega f \varphi' \, d\mathcal{L}^1 &= -\int_\Omega g \varphi \, d\mathcal{L}^1 \quad \text{for all } \varphi \in C_c^\infty(\Omega). \end{align*}
When $f$ is classically differentiable with continuous derivative, the weak derivative agrees with $f'$ by integration by parts. The Sobolev spaces $W^{k,p}(\Omega)$ are built on this notion, and the distribution theory extends it further to objects that are not functions at all.
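For instance, on $\Omega = \mathbb{R}$ the function $f(x) = |x|$, which has no classical derivative at $0$, has weak derivative $g(x) = \operatorname{sgn}(x)$. A sketch of the verification splits the integral at $0$ and integrates by parts on each half-line; the boundary terms vanish because $\varphi$ has compact support and $|x|$ vanishes at $0$:
\begin{align*} \int_{\mathbb{R}} |x| \varphi'(x) \, d\mathcal{L}^1(x) &= \int_0^\infty x \varphi'(x) \, d\mathcal{L}^1(x) - \int_{-\infty}^0 x \varphi'(x) \, d\mathcal{L}^1(x) \\ &= -\int_0^\infty \varphi(x) \, d\mathcal{L}^1(x) + \int_{-\infty}^0 \varphi(x) \, d\mathcal{L}^1(x) = -\int_{\mathbb{R}} \operatorname{sgn}(x) \varphi(x) \, d\mathcal{L}^1(x). \end{align*}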
Beyond One Variable
For functions $f : U \subseteq \mathbb{R}^m \to \mathbb{R}^n$ between Euclidean spaces, the derivative at a point $a$ is no longer a number but a linear map $f'(a) : \mathbb{R}^m \to \mathbb{R}^n$ — the best linear approximation to $f$ near $a$. The condition $f(x) = f(a) + f'(a)(x-a) + o(|x-a|)$ generalises directly, with the product $f'(a)(x-a)$ replaced by the action of a linear map on a vector. The matrix representing $f'(a)$ with respect to the standard bases is the Jacobian matrix $(D_j f_i(a))$.
The algebraic rules generalise: the Chain Rule for Maps Between Euclidean Spaces asserts that $(g \circ f)'(a) = g'(f(a)) \circ f'(a)$, that is, the derivative of a composition is the composition of the derivatives, with matrix multiplication replacing scalar multiplication. The Inverse Function Theorem asserts that if $m = n$, $f$ is continuously differentiable near $a$, and the derivative $f'(a)$ is an invertible linear map, then $f$ is locally invertible near $a$ with a continuously differentiable inverse.
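As a concrete illustration (an example chosen here), the polar-coordinate map $f(r, \theta) = (r\cos\theta, r\sin\theta)$ on $(0, \infty) \times \mathbb{R}$ has Jacobian matrix and determinant
\begin{align*} f'(r, \theta) &= \begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix}, & \det f'(r, \theta) &= r\cos^2\theta + r\sin^2\theta = r. \end{align*}
Since $r > 0$ and $f$ is smooth, the derivative is invertible at every point and $f$ is locally invertible there, although it is not globally injective because $\theta$ and $\theta + 2\pi$ give the same image.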