[step:Show that the Newton divided-difference form interpolates the same data]The divided differences $f[x_i,\dots,x_j]$ and the Newton polynomial $N_f$ are those defined in the theorem statement. The denominators in the recursive definition are nonzero because the nodes are pairwise distinct. Each summand in
\begin{align*}
N_f(t):=f[x_0]+\sum_{m=1}^{n} f[x_0,\dots,x_m]\prod_{j=0}^{m-1}(t-x_j)
\end{align*}
has degree at most $m$, so $N_f \in \mathcal{P}_n(\mathbb{R})$.
[claim:The divided difference is the leading coefficient of the interpolating polynomial on its nodes]
Let $m \in \{0,1,\dots,n\}$ and let $y_0,\dots,y_m$ be pairwise distinct elements of $\mathbb{R}$ chosen from the ordered list $x_0,\dots,x_n$. Let $P_m \in \mathcal{P}_m(\mathbb{R})$ be the unique polynomial satisfying $P_m(y_i)=f(y_i)$ for every $i \in \{0,1,\dots,m\}$. Then the coefficient of $t^m$ in $P_m$ is the recursively defined divided difference $f[y_0,\dots,y_m]$.
[/claim]
[proof]
We prove the claim by induction on $m$. For $m=0$, the unique interpolating polynomial is the constant polynomial $P_0(t)=f(y_0)$, and its coefficient of $t^0$ is $f[y_0]$.
Assume $m \ge 1$ and that the claim is known for $m-1$ nodes. Let $P_{0,m-1} \in \mathcal{P}_{m-1}(\mathbb{R})$ be the unique interpolating polynomial for $y_0,\dots,y_{m-1}$, and let $P_{1,m} \in \mathcal{P}_{m-1}(\mathbb{R})$ be the unique interpolating polynomial for $y_1,\dots,y_m$. Define $R_m \in \mathbb{R}[t]$ by
\begin{align*}
R_m(t):=\frac{(t-y_0)P_{1,m}(t)-(t-y_m)P_{0,m-1}(t)}{y_m-y_0}.
\end{align*}
The denominator is nonzero because $y_0 \ne y_m$, and $R_m \in \mathcal{P}_m(\mathbb{R})$. At $t=y_0$, the first term vanishes and $R_m(y_0)=P_{0,m-1}(y_0)=f(y_0)$. At $t=y_m$, the second term vanishes and $R_m(y_m)=P_{1,m}(y_m)=f(y_m)$. If $1 \le i \le m-1$, then $P_{0,m-1}(y_i)=P_{1,m}(y_i)=f(y_i)$, so
\begin{align*}
R_m(y_i)=\frac{(y_i-y_0)f(y_i)-(y_i-y_m)f(y_i)}{y_m-y_0}=f(y_i).
\end{align*}
Thus $R_m$ interpolates $f$ at $y_0,\dots,y_m$, and uniqueness of interpolation at distinct nodes gives $R_m=P_m$.
By the induction hypothesis, the coefficient of $t^{m-1}$ in $P_{1,m}$ is $f[y_1,\dots,y_m]$, and the coefficient of $t^{m-1}$ in $P_{0,m-1}$ is $f[y_0,\dots,y_{m-1}]$. Therefore the coefficient of $t^m$ in $R_m=P_m$ is
\begin{align*}
\frac{f[y_1,\dots,y_m]-f[y_0,\dots,y_{m-1}]}{y_m-y_0}=f[y_0,\dots,y_m],
\end{align*}
which is exactly the recursive definition of the divided difference. This proves the claim.
[/proof]
We now prove by induction on $m \in \{0,1,\dots,n\}$ that the truncated Newton polynomial
\begin{align*}
N_m(t):=f[x_0]+\sum_{r=1}^{m} f[x_0,\dots,x_r]\prod_{j=0}^{r-1}(t-x_j)
\end{align*}
satisfies $N_m(x_i)=f(x_i)$ for every $i \in \{0,1,\dots,m\}$. For $m=0$, this is $N_0(x_0)=f[x_0]=f(x_0)$.
Assume the assertion holds for $m-1$, where $1 \le m \le n$. For $i<m$, the final product in $N_m-N_{m-1}$ contains the factor $x_i-x_i$, so $N_m(x_i)=N_{m-1}(x_i)=f(x_i)$. Let $P_m \in \mathcal{P}_m(\mathbb{R})$ be the unique polynomial interpolating $f$ at $x_0,\dots,x_m$. By the claim, the coefficient of $t^m$ in $P_m$ is $f[x_0,\dots,x_m]$. The coefficient of $t^m$ in $N_m$ is also $f[x_0,\dots,x_m]$, because the last Newton basis product has leading coefficient $1$ and every earlier summand has degree at most $m-1$. Hence $N_m-P_m \in \mathcal{P}_{m-1}(\mathbb{R})$. Since $N_m(x_i)=P_m(x_i)=f(x_i)$ for $i \in \{0,1,\dots,m-1\}$, the polynomial $N_m-P_m$ has the $m$ distinct roots $x_0,\dots,x_{m-1}$. The root-counting fact used above forces $N_m-P_m=0$, so $N_m=P_m$. In particular, $N_m(x_m)=P_m(x_m)=f(x_m)$.
Thus $N_n=N_f$ interpolates $f$ at all nodes $x_0,\dots,x_n$.[/step]