Divided differences are the coefficients that make polynomial interpolation usable without solving a new linear system each time a data point is added. Given values of a function at finitely many distinct nodes, they measure the coefficient of the highest-degree term in the interpolating polynomial over those nodes. This makes them central to [Polynomial Interpolation](/page/Polynomial%20Interpolation), Newton Polynomial, Finite Difference, and the numerical study of [Derivative](/page/Derivative) approximation.
The main point is that divided differences package local information in a symmetric algebraic form. For nearby nodes they resemble derivatives; for separated nodes they describe global interpolation data. They also explain why Newton's interpolation formula updates one node at a time, while the Lagrange formula treats all nodes at once.
## Definition
When a function is known only at finitely many nodes, the central interpolation question is which polynomial those data determine. The coefficient of the highest possible power is the part that changes when the interpolation order increases, so it is the coefficient worth naming first.
[definition: Divided Difference]
Let $n$ be a nonnegative integer, let $(x_0, \ldots, x_n)$ be a tuple of pairwise distinct [real numbers](/page/Real%20Numbers), and let $X_n=\{x_0,\ldots,x_n\}$. The order-$n$ divided difference at the ordered nodes $(x_0,\ldots,x_n)$ is the functional
\begin{align*}
D_{x_0,\ldots,x_n}: \mathbb{R}^{X_n} \to \mathbb{R}
\end{align*}
defined as follows. Let $\mathcal{P}_n$ denote the [vector space](/page/Vector%20Space) of real polynomials of degree at most $n$. For a polynomial $q(t)$, write $[t^n]q(t)$ for the coefficient of $t^n$ in $q(t)$. For $f:X_n\to\mathbb{R}$, if $p\in \mathcal{P}_n$ is the unique polynomial satisfying $p(x_i)=f(x_i)$ for $0\le i\le n$, then
\begin{align*}
D_{x_0,\ldots,x_n}(f) = [t^n]p(t).
\end{align*}
The value $D_{x_0,\ldots,x_n}(f)$ is denoted by
\begin{align*}
f[x_0,\ldots,x_n].
\end{align*}
[/definition]
The restriction to distinct nodes prevents denominators such as $x_k-x_j$ from vanishing once the recursive formula is introduced. Repeated nodes lead to Hermite divided differences, where derivative data replaces the missing separation between nodes.
The definition above uses both an ordered tuple and an underlying finite set, and those two pieces of data play different roles. The tuple controls the entries in a divided-difference table; the set is the domain on which the sampled function is known. Naming this node data prevents later formulas from blurring an algorithmic ordering with the interpolation problem itself.
[definition: Interpolation Nodes]
Let $n$ be a nonnegative integer. An ordered list of interpolation nodes is a tuple $(x_0, \ldots, x_n)$ of pairwise distinct real numbers. Its underlying node set is
\begin{align*}
X_n = \{x_0, \ldots, x_n\} \subset \mathbb{R}.
\end{align*}
[/definition]
The notation $f[x_0,\ldots,x_n]$ is not evaluation of $f$ at a tuple. It records the nodes used to form the interpolation functional. For order zero, the interpolating polynomial is constant, so the divided difference recovers the given function value.
For computation, the leading-coefficient definition is too indirect: constructing the whole interpolating polynomial just to read one coefficient wastes the triangular structure of interpolation data. The recursive definition gives the Newton-table algorithm used in numerical interpolation.
[definition: Recursive Divided Difference]
Let $(x_j,\ldots,x_k)$ be pairwise distinct real numbers with $0\le j\le k$, and let $X_{j,k}=\{x_j,\ldots,x_k\}$. The recursive divided difference is the functional
\begin{align*}
R_{x_j,\ldots,x_k}: \mathbb{R}^{X_{j,k}} \to \mathbb{R}
\end{align*}
defined as follows: for $f:X_{j,k}\to\mathbb{R}$, set $f[x_j] = f(x_j)$ and, for $j<k$, set
\begin{align*}
f[x_j,\ldots,x_k] = \frac{f[x_{j+1},\ldots,x_k]-f[x_j,\ldots,x_{k-1}]}{x_k-x_j}.
\end{align*}
[/definition]
This recursion divides a change in an order-$(k-j-1)$ divided difference by the span of the outer nodes. It is the analogue of a difference quotient after the original function values have been replaced by lower-order interpolation coefficients.
## Equivalent Characterisations
The recursive rule is convenient for computation, but it is not yet clear that it is measuring the same quantity as the interpolation coefficient that motivated divided differences. Without that agreement, the triangular table would produce numbers whose connection to the interpolating polynomial is only asserted, not justified. The key issue is whether each recursive entry is exactly the leading coefficient of the polynomial interpolating the corresponding nodes.
[quotetheorem:907]
This theorem justifies the triangular divided-difference table. Each entry depends on the two entries immediately below it from the previous order, so adding a new node appends one diagonal rather than recomputing a full interpolation problem.
The recursive definition also has to agree with the other standard ways of constructing the interpolating polynomial. In Lagrange form, the polynomial is written directly from basis functions attached to the nodes. In Newton form, the same polynomial is built one node at a time, with divided differences as the coefficients in the triangular table. In Vandermonde form, one instead writes the unknown polynomial as $a_0+a_1t+\cdots+a_nt^n$ and solves the interpolation equations for the coefficient list $(a_0,\ldots,a_n)$.
To compare these constructions without changing the underlying interpolation problem, we need to translate their notations into one common polynomial language. The comparison theorem uses slightly more formal notation for this same picture. Here $\mathbb{R}[t]$ means the set of polynomials in the variable $t$ with real coefficients. The notation $\mathcal{P}_n(\mathbb{R})$ is the same polynomial space earlier denoted by $\mathcal{P}_n$: real polynomials of degree at most $n$. A vector in $\mathbb{R}^{n+1}$ is just an ordered list of $n+1$ real numbers, so in this context it records the coefficients of $1,t,\ldots,t^n$. With that vocabulary fixed, the next theorem says that the Lagrange, Vandermonde, and Newton descriptions are three coordinate systems for one interpolating polynomial.
[quotetheorem:8329]
This equivalence explains why divided differences can be used interchangeably with Lagrange weights or Vandermonde coefficient vectors once the polynomial space and coefficient conventions have been fixed. It also shows that the Newton table is not a separate construction, but a computationally efficient coordinate system for the unique interpolating polynomial.
The notation $f[x_0,\ldots,x_n]$ records the nodes in an order, and the recursion itself also chooses an order when it peels off the left and right endpoints. This creates a possible ambiguity: if the same finite set of nodes is listed differently, the recursive computation might appear to give a different coefficient. To use divided differences in estimates or comparisons, one needs to know that the value depends only on the set of distinct nodes, not on the chosen listing.
[quotetheorem:9548]
Symmetry is useful when estimating divided differences, since the nodes may be reordered to make a computation convenient. It does not remove the need for an ordered list in algorithms.
## Standard Examples
The first nonzero orders show how divided differences generalise slopes. Order one is exactly the secant slope; higher order detects higher-degree polynomial behaviour.
[example: First and Second Order Divided Differences]
Let $f:\{1,2,4\}\to\mathbb{R}$ be given by $f(t)=t^2$ on these nodes, so $f(1)=1$, $f(2)=4$, and $f(4)=16$. For the first pair of nodes, the recursive divided-difference formula gives
\begin{align*}
f[1,2]=\frac{f[2]-f[1]}{2-1}.
\end{align*}
Since $f[2]=f(2)=4$ and $f[1]=f(1)=1$, this becomes
\begin{align*}
f[1,2]=\frac{4-1}{2-1}.
\end{align*}
The numerator is $4-1=3$ and the denominator is $2-1=1$, hence
\begin{align*}
f[1,2]=3.
\end{align*}
For the second pair,
\begin{align*}
f[2,4]=\frac{f[4]-f[2]}{4-2}.
\end{align*}
Using $f[4]=f(4)=16$ and $f[2]=f(2)=4$, we get
\begin{align*}
f[2,4]=\frac{16-4}{4-2}.
\end{align*}
The numerator is $12$ and the denominator is $2$, so
\begin{align*}
f[2,4]=6.
\end{align*}
Now the second-order divided difference is
\begin{align*}
f[1,2,4]=\frac{f[2,4]-f[1,2]}{4-1}.
\end{align*}
Substituting the two first-order values gives
\begin{align*}
f[1,2,4]=\frac{6-3}{4-1}.
\end{align*}
Both numerator and denominator are $3$, so
\begin{align*}
f[1,2,4]=1.
\end{align*}
Thus the order-two divided difference recovers the coefficient of $t^2$ in the interpolating polynomial $p(t)=t^2$, namely $1$.
[/example]
This example also warns against confusing second divided differences with second derivatives. For $f(t)=t^2$, the [second derivative](/page/Second%20Derivative) is $2$, while the order-two divided difference is $1$, the leading coefficient.
A divided difference of order $n$ kills polynomial data of degree below $n$. This is why it functions as a discrete analogue of the highest derivative term.
[example: Lower Degree Polynomial Gives Zero]
Let $(x_0,x_1,x_2,x_3)$ be any four distinct real numbers, let $X_3=\{x_0,x_1,x_2,x_3\}$, and let $f:X_3\to\mathbb{R}$ be the restriction of
\begin{align*}
q(t)=5t^2-3t+7.
\end{align*}
For each $i\in\{0,1,2,3\}$, restriction means
\begin{align*}
f(x_i)=q(x_i)=5x_i^2-3x_i+7.
\end{align*}
Also $q\in\mathcal{P}_3$, since it can be written as
\begin{align*}
q(t)=0\cdot t^3+5t^2-3t+7.
\end{align*}
Thus the degree-at-most-$3$ polynomial interpolating the data is $q$ itself, and its coefficient of $t^3$ is
\begin{align*}
[t^3]q(t)=0.
\end{align*}
By the leading-coefficient definition of the order-$3$ divided difference,
\begin{align*}
f[x_0,x_1,x_2,x_3]=[t^3]q(t)=0.
\end{align*}
This shows that an order-$3$ divided difference measures the cubic part of the interpolating data, not the size of the function values.
[/example]
The distinct-node hypothesis is not a technical ornament. If two nodes collide, the ordinary recursive quotient asks us to divide by zero.
[example: Repeated Nodes Break the Ordinary Definition]
Suppose one tries to form an ordinary divided-difference table using the node list $(1,1,2)$. Already at order one, the recursive definition would require the entry formed from the first two nodes:
\begin{align*}
f[1,1]=\frac{f[1]-f[1]}{1-1}.
\end{align*}
For an order-zero divided difference, $f[1]=f(1)$, so the numerator is
\begin{align*}
f[1]-f[1]=f(1)-f(1)=0.
\end{align*}
The denominator is
\begin{align*}
1-1=0.
\end{align*}
Thus the required quotient is $0/0$, which is not a real number. The ordinary recursive definition therefore does not define $f[1,1]$, and without that entry it cannot define the higher-order entry $f[1,1,2]$. Hermite interpolation replaces the missing separation between repeated nodes with derivative data, so it is a different construction rather than an ordinary divided difference with coincident nodes.
[/example]
Equally spaced nodes connect divided differences with finite differences. This is where interpolation theory meets numerical differentiation and finite-difference schemes.
[example: Equally Spaced Nodes and Finite Differences]
Let $h>0$, let the nodes be $x_i=x_0+ih$ for $0\le i\le n$, and define forward differences by $\Delta^0 f(x_i)=f(x_i)$ and
\begin{align*}
\Delta^{r+1}f(x_i)=\Delta^r f(x_{i+1})-\Delta^r f(x_i).
\end{align*}
We show that for equally spaced nodes,
\begin{align*}
f[x_0,x_1,\ldots,x_n]=\frac{\Delta^n f(x_0)}{n!h^n}.
\end{align*}
More generally, for $0\le j\le j+r\le n$, we prove
\begin{align*}
f[x_j,x_{j+1},\ldots,x_{j+r}]=\frac{\Delta^r f(x_j)}{r!h^r}.
\end{align*}
For $r=0$, this says
\begin{align*}
f[x_j]=f(x_j)=\Delta^0 f(x_j),
\end{align*}
so the formula holds because $0!h^0=1$. Suppose the formula holds for order $r-1$, where $r\ge 1$. Since
\begin{align*}
x_{j+r}-x_j=(x_0+(j+r)h)-(x_0+jh)=rh,
\end{align*}
the recursive divided-difference formula gives
\begin{align*}
f[x_j,\ldots,x_{j+r}]=\frac{f[x_{j+1},\ldots,x_{j+r}]-f[x_j,\ldots,x_{j+r-1}]}{rh}.
\end{align*}
Using the induction hypothesis on the two order-$(r-1)$ divided differences,
\begin{align*}
f[x_j,\ldots,x_{j+r}]=\frac{\frac{\Delta^{r-1}f(x_{j+1})}{(r-1)!h^{r-1}}-\frac{\Delta^{r-1}f(x_j)}{(r-1)!h^{r-1}}}{rh}.
\end{align*}
Putting the two fractions over the common denominator $(r-1)!h^{r-1}$ gives
\begin{align*}
f[x_j,\ldots,x_{j+r}]=\frac{\Delta^{r-1}f(x_{j+1})-\Delta^{r-1}f(x_j)}{r!h^r}.
\end{align*}
By the definition of the forward difference, the numerator is $\Delta^r f(x_j)$, so
\begin{align*}
f[x_j,\ldots,x_{j+r}]=\frac{\Delta^r f(x_j)}{r!h^r}.
\end{align*}
Taking $j=0$ and $r=n$ gives the stated formula.
For $n=2$, the second forward difference is
\begin{align*}
\Delta^2 f(x_0)=\Delta f(x_1)-\Delta f(x_0).
\end{align*}
Since $x_1=x_0+h$ and $x_2=x_0+2h$,
\begin{align*}
\Delta f(x_1)=f(x_2)-f(x_1)=f(x_0+2h)-f(x_0+h).
\end{align*}
Also,
\begin{align*}
\Delta f(x_0)=f(x_1)-f(x_0)=f(x_0+h)-f(x_0).
\end{align*}
Therefore
\begin{align*}
\Delta^2 f(x_0)=f(x_0+2h)-f(x_0+h)-f(x_0+h)+f(x_0).
\end{align*}
Combining the two middle terms,
\begin{align*}
\Delta^2 f(x_0)=f(x_0+2h)-2f(x_0+h)+f(x_0).
\end{align*}
Since $2!h^2=2h^2$, the general formula becomes
\begin{align*}
f[x_0,x_0+h,x_0+2h]=\frac{f(x_0+2h)-2f(x_0+h)+f(x_0)}{2h^2}.
\end{align*}
Thus finite differences and divided differences contain the same information on an equally spaced grid, with the divided difference normalised by the geometric factor $n!h^n$.
[/example]
The factor $n!$ is the same factor relating the leading coefficient of a Taylor polynomial to the $n$th derivative. This makes the connection with smooth calculus precise.
## Properties
The usual monomial basis makes interpolation coefficients change globally when a new node is added, which is why solving a Vandermonde system is expensive and opaque. Divided differences suggest a better basis: at stage $k$, add a factor that vanishes at all earlier nodes, so the new coefficient can correct the value at $x_k$ without disturbing interpolation already achieved at $x_0,\ldots,x_{k-1}$. The following definition packages this update-friendly form into a single polynomial.
[definition: Newton Interpolation Polynomial]
Let $(x_0,\ldots,x_n)$ be pairwise distinct real numbers, let $X_n=\{x_0,\ldots,x_n\}$, and let $f:X_n\to\mathbb{R}$. The Newton interpolation polynomial associated to these data is the polynomial $N_n\in\mathcal{P}_n$ defined by
\begin{align*}
N_n(t)=f[x_0]+\sum_{k=1}^{n} f[x_0,\ldots,x_k]\prod_{m=0}^{k-1}(t-x_m)
\end{align*}
for every $t\in\mathbb{R}$.
[/definition]
The coefficient multiplying each new basis factor is exactly the divided difference over the nodes seen so far. To use this expression as interpolation rather than just notation, we need to know that it really matches the prescribed data at every node. In the quoted statement, the notation $P_n[x]$ denotes the same polynomial space as $\mathcal{P}_n$ above: polynomials in the variable $x$ of degree at most $n$.
[quotetheorem:474]
This theorem is the practical heart of the subject. Once the divided-difference table is built, the interpolating polynomial is read off without solving a Vandermonde system.
The equally spaced formula shows that divided differences resemble normalised finite differences, but that observation alone is still discrete. For estimates, one needs a bridge from these algebraic coefficients to the size of an actual derivative of a smooth function. The obstruction is that the divided difference is computed from finitely many values, while the derivative is local; the mean-value theorem for divided differences identifies a point where the two viewpoints agree after the factorial normalisation. Here $C^n[a,b]$ denotes the class of functions on the interval $[a,b]$ whose derivatives up to order $n$ exist and are continuous, and $f^{(n)}$ denotes the $n$th derivative.
[quotetheorem:908]
This result explains the normalisation in the equally spaced formula. It also shows that a uniform bound on $f^{(n)}$ controls all order-$n$ divided differences formed from distinct nodes in the interval.
Error analysis asks how much is lost by replacing a function with its interpolating polynomial. The answer is another divided difference, now formed by adding the evaluation point as a new node.
[quotetheorem:9549]
When combined with the [mean value theorem](/theorems/186) for divided differences, this gives the familiar derivative-based interpolation error estimate for functions in $C^{n+1}$.
To analyse sums of data, perturbations of data, or decompositions into a polynomial part and a remainder, divided differences must respect addition and scalar multiplication. The next property records that algebraic compatibility.
[quotetheorem:9550]
This property lets one analyse monomials, basis polynomials, or error terms separately and then recombine the results.
## Relationship to Other Concepts
Divided differences are most closely tied to [Polynomial Interpolation](/page/Polynomial%20Interpolation). The Lagrange form expresses the interpolating polynomial as a sum of cardinal basis polynomials; the Newton form expresses it in a nested basis whose coefficients are divided differences.
They are also related to Finite Difference. On equally spaced grids, finite differences are scaled divided differences. On nonuniform grids, divided differences are the more natural object because they retain the geometry of the nodes.
In [Numerical Analysis](/page/Numerical%20Analysis), divided differences support adaptive interpolation, error estimation, and construction of interpolation tables. Since a new node only adds one new diagonal to the table, the method is efficient when samples arrive sequentially.
In classical Real Analysis, divided differences form a bridge between polynomial approximation and derivatives. The generalised mean value theorem says that the order-$n$ divided difference samples the $n$th derivative after the correct factorial normalisation.
Divided differences also appear in [approximation theory](/page/Approximation%20Theory), spline theory, and the theory of numerical quadrature. In these areas they measure smoothness of data without assuming that the data initially comes from a differentiable function.
## References
Isaac Newton, *Methodus Differentialis* (1711).
Philip J. Davis, *Interpolation and Approximation* (1963).
Carl de Boor, *A Practical Guide to Splines* (1978).
Endre Süli and David Mayers, *An Introduction to Numerical Analysis* (2003).
[Polynomial Interpolation](/page/Polynomial%20Interpolation).
[Leading Coefficient of an Iterated Finite Difference](/theorems/9053).
Divided Difference
Also known as: Divided differences, Newton divided differences, divided difference table, finite divided differences, interpolation coefficients