Whitney Inequality for Local Polynomial Approximation

Theorem

Edit Issues Pull Requests Attributions Admin

Discussion

Proof

[proofplan] We first reduce the problem to the reference interval $[0,1]$ by an affine change of variables and check that both the best approximation error and the $r$-th modulus of smoothness are invariant under this normalization. Let $\mathcal{P}_{r-1}$ denote the [vector space](/page/Vector%20Space) of real polynomials of degree at most $r-1$, restricted to the interval under discussion. On $[0,1]$, we prove the needed estimate by contradiction: the interpolation projector normalizes away the polynomial kernel, while compactness for anchored functions with vanishing $r$-th differences forces a limiting polynomial with too many zeros. Scaling then returns the estimate on the original interval. [/proofplan] [step:Reduce the estimate to the reference interval $[0,1]$] If $I=\{a\}$ is a singleton, then the constant polynomial $p(x)=f(a)$ satisfies $f-p=0$ on $I$, so $E_{r-1}(f;I)=0$ and the desired estimate follows. We may therefore assume that $I = [a,b] \subset \mathbb{R}$ with $a < b$, and define the affine map \begin{align*} T: [0,1] &\to I \end{align*} by $T(y) = a + (b-a)y$. Define \begin{align*} g: [0,1] &\to \mathbb{R} \end{align*} by $g(y) = f(T(y))$. For a compact interval $J \subset \mathbb{R}$ and a [continuous function](/page/Continuous%20Function) $v \in C(J)$, define the best uniform approximation error by \begin{align*} E_{r-1}(v;J) := \inf_{p \in \mathcal{P}_{r-1}} \|v-p\|_{C(J)}. \end{align*} For any function $v$ on an interval, any $h \geq 0$, and any point $x$ for which $x,x+h,\dots,x+rh$ lie in the interval, define the $r$-th forward finite difference by \begin{align*} \Delta_h^r v(x) := \sum_{k=0}^r (-1)^{r-k}\binom{r}{k}v(x+kh). \end{align*} For $t \geq 0$, define the $r$-th modulus of smoothness on $J$ by \begin{align*} \omega_r(v,t;J) := \sup\{|\Delta_h^r v(x)| : 0 \leq h \leq t,\ x,x+h,\dots,x+rh \in J\}. \end{align*} For every polynomial $p \in \mathcal{P}_{r-1}$ on $I$, the function $q := p \circ T$ belongs to $\mathcal{P}_{r-1}$ on $[0,1]$. Conversely, every $q \in \mathcal{P}_{r-1}$ on $[0,1]$ has the form $q = p \circ T$ for a unique $p \in \mathcal{P}_{r-1}$ on $I$, namely the polynomial map $p:I\to\mathbb{R}$ defined by \begin{align*} p(x) := q\left(\frac{x-a}{b-a}\right). \end{align*} Therefore \begin{align*} E_{r-1}(f;I) = E_{r-1}(g;[0,1]). \end{align*} Next let $0 \leq \eta \leq 1$ and let $y,y+\eta,\dots,y+r\eta \in [0,1]$. Set $x := T(y)$ and $h := (b-a)\eta$. Then $x,x+h,\dots,x+rh \in I$, and \begin{align*} \Delta_\eta^r g(y) = \Delta_h^r f(x). \end{align*} Conversely every admissible $r$-th finite difference of $f$ on $I$ arises in this way. Hence \begin{align*} \omega_r(g,1;[0,1]) = \omega_r(f,|I|;I). \end{align*} It is therefore enough to prove that there is a constant $A_r > 0$ such that \begin{align*} E_{r-1}(g;[0,1]) \leq A_r \omega_r(g,1;[0,1]) \end{align*} for every $g \in C([0,1])$. [guided] The affine normalization removes the interval length from the problem. We define the map \begin{align*} T: [0,1] &\to I \end{align*} by $T(y) = a+(b-a)y$, and we pull $f$ back to \begin{align*} g: [0,1] &\to \mathbb{R} \end{align*} by $g(y)=f(T(y))$. We also fix the approximation and finite-difference notation. For a compact interval $J \subset \mathbb{R}$ and $v \in C(J)$, the best approximation error is \begin{align*} E_{r-1}(v;J) := \inf_{p \in \mathcal{P}_{r-1}} \|v-p\|_{C(J)}. \end{align*} For any function $v$ on an interval, any $h \geq 0$, and any admissible point $x$, set \begin{align*} \Delta_h^r v(x) := \sum_{k=0}^r (-1)^{r-k}\binom{r}{k}v(x+kh). \end{align*} For $t \geq 0$, the modulus of smoothness is \begin{align*} \omega_r(v,t;J) := \sup\{|\Delta_h^r v(x)| : 0 \leq h \leq t,\ x,x+h,\dots,x+rh \in J\}. \end{align*} This is the supremum of the absolute values of all admissible $r$-th differences with step at most $t$. The first quantity to check is the best polynomial approximation error. Composition with $T$ sends every polynomial of degree at most $r-1$ on $I$ to a polynomial of degree at most $r-1$ on $[0,1]$. Composition with $T^{-1}$ gives the inverse correspondence. Since $T$ is a bijection from $[0,1]$ onto $I$, the uniform errors agree: \begin{align*} \sup_{x \in I}|f(x)-p(x)| = \sup_{y \in [0,1]} |g(y)-p(T(y))|. \end{align*} Taking the infimum over all $p \in \mathcal{P}_{r-1}$ gives \begin{align*} E_{r-1}(f;I)=E_{r-1}(g;[0,1]). \end{align*} Now we check the finite differences. If $y,y+\eta,\dots,y+r\eta \in [0,1]$ and $h=(b-a)\eta$, then \begin{align*} T(y+k\eta)=T(y)+kh \end{align*} for each $k \in \{0,\dots,r\}$. Therefore \begin{align*} \Delta_\eta^r g(y) = \sum_{k=0}^r (-1)^{r-k}\binom{r}{k}g(y+k\eta) = \sum_{k=0}^r (-1)^{r-k}\binom{r}{k}f(T(y)+kh) = \Delta_h^r f(T(y)). \end{align*} The admissibility condition is also preserved: $y,y+\eta,\dots,y+r\eta \in [0,1]$ exactly corresponds to $T(y),T(y)+h,\dots,T(y)+rh \in I$. Hence \begin{align*} \omega_r(g,1;[0,1])=\omega_r(f,|I|;I). \end{align*} Thus the whole theorem reduces to proving the reference interval estimate on $[0,1]$. [/guided] [/step] [step:Identify the kernel of the finite-difference seminorm using Fréchet's equation] Let $u \in C([0,1])$ satisfy \begin{align*} \omega_r(u,1;[0,1]) = 0. \end{align*} Then $\Delta_h^r u(x)=0$ for every $h \geq 0$ and every $x \in [0,1]$ such that $x,x+h,\dots,x+rh \in [0,1]$. We use the following precise local form of Fréchet's finite-difference theorem. If $J \subset \mathbb{R}$ is an interval and $w:J \to \mathbb{R}$ is continuous with $\Delta_h^r w(x)=0$ whenever $x,x+h,\dots,x+rh \in J$, then $w$ agrees on $J$ with a polynomial of degree at most $r-1$. Its hypotheses are exactly satisfied with $J=[0,1]$ and $w=u$, because $u$ is continuous and the displayed vanishing holds for every admissible equal-step $r$-th difference. Hence $u \in \mathcal{P}_{r-1}$. Conversely, if $u \in \mathcal{P}_{r-1}$, then each application of the forward difference operator lowers polynomial degree by one. Therefore $\Delta_h^r u(x)=0$ for every admissible pair $(x,h)$, and so \begin{align*} \omega_r(u,1;[0,1])=0. \end{align*} Thus \begin{align*} \omega_r(u,1;[0,1])=0 \end{align*} if and only if $u \in \mathcal{P}_{r-1}$. [guided] The point of this step is to identify exactly which functions are invisible to the $r$-th finite-difference seminorm. Suppose first that $u \in C([0,1])$ and \begin{align*} \omega_r(u,1;[0,1]) = 0. \end{align*} By the definition of the modulus, this means that every admissible equal-step finite difference vanishes: \begin{align*} \Delta_h^r u(x)=0 \end{align*} whenever $h \geq 0$ and $x,x+h,\dots,x+rh \in [0,1]$. We now apply the local form of Fréchet's finite-difference theorem stated in the exact proof. The result requires an interval $J \subset \mathbb{R}$, a continuous function $w:J \to \mathbb{R}$, and the identity $\Delta_h^r w(x)=0$ for every admissible equal-step $r$-th difference in $J$. These hypotheses hold with $J=[0,1]$ and $w=u$: continuity is assumed, and the vanishing of all admissible differences is exactly the conclusion of $\omega_r(u,1;[0,1])=0$. Therefore $u$ agrees on $[0,1]$ with a polynomial of degree at most $r-1$, so $u \in \mathcal{P}_{r-1}$. The converse is algebraic. If $u \in \mathcal{P}_{r-1}$, then applying one forward difference lowers the degree by one, and after $r$ applications the result is the zero polynomial. Hence every admissible $r$-th difference of $u$ is zero, so \begin{align*} \omega_r(u,1;[0,1])=0. \end{align*} This proves that the kernel of the seminorm $u \mapsto \omega_r(u,1;[0,1])$ is precisely $\mathcal{P}_{r-1}$. [/guided] [/step] [step:Construct a bounded projector from sampled values] Choose distinct points $t_0,\dots,t_{r-1} \in [0,1]$; for instance, when $r\geq 2$ take \begin{align*} t_j:=\frac{j}{r-1} \end{align*} for $j\in\{0,\dots,r-1\}$, and when $r=1$ take $t_0=0$. Let $L_j \in \mathcal{P}_{r-1}$ denote the Lagrange polynomial with $L_j(t_i)=\delta_{ij}$. Define the interpolation operator \begin{align*} \Pi_r:C([0,1]) \to \mathcal{P}_{r-1}, \qquad \Pi_r u := \sum_{j=0}^{r-1} u(t_j)L_j. \end{align*} Since each $L_j$ is continuous on $[0,1]$, the constant \begin{align*} B_r := 1+\sum_{j=0}^{r-1}\|L_j\|_{C([0,1])} \end{align*} is finite and depends only on $r$. For every $u\in C([0,1])$, \begin{align*} \|u-\Pi_r u\|_{C([0,1])}\leq B_r\|u\|_{C([0,1])}. \end{align*} Moreover $\Pi_r p=p$ for every $p\in\mathcal{P}_{r-1}$. [/step] [step:Prove the reference interval estimate by compactness and anchoring] We prove that there is a constant $A_r>0$ such that every $u\in C([0,1])$ satisfies \begin{align*} E_{r-1}(u;[0,1])\leq A_r\omega_r(u,1;[0,1]). \end{align*} Assume the contrary. Then for each $n\in\mathbb{N}$ there is $u_n\in C([0,1])$ such that \begin{align*} E_{r-1}(u_n;[0,1])>n\omega_r(u_n,1;[0,1]). \end{align*} After replacing $u_n$ by $u_n/E_{r-1}(u_n;[0,1])$, we may assume \begin{align*} E_{r-1}(u_n;[0,1])=1 \end{align*} and \begin{align*} \omega_r(u_n,1;[0,1])<\frac{1}{n}. \end{align*} Define $v_n:=u_n-\Pi_r u_n$. Since $\Pi_r p=p$ for every $p\in\mathcal{P}_{r-1}$, subtracting the polynomial $\Pi_r u_n$ does not change the best approximation error or any $r$-th finite difference. Hence \begin{align*} E_{r-1}(v_n;[0,1])=1 \end{align*} and \begin{align*} \omega_r(v_n,1;[0,1])<\frac{1}{n}. \end{align*} Also $v_n(t_j)=0$ for each $j\in\{0,\dots,r-1\}$. Choosing $p_n\in\mathcal{P}_{r-1}$ with $\|u_n-p_n\|_{C([0,1])}\leq 2$, and using $\Pi_r p_n=p_n$, gives \begin{align*} \|v_n\|_{C([0,1])}=\|(I-\Pi_r)(u_n-p_n)\|_{C([0,1])}\leq B_r\|u_n-p_n\|_{C([0,1])}\leq 2B_r. \end{align*} [claim:Anchored functions with vanishing $r$-th modulus converge uniformly to zero] Let $(w_n)_{n=1}^{\infty}\subset C([0,1])$ be uniformly bounded, suppose $w_n(t_j)=0$ for each $j\in\{0,\dots,r-1\}$, and suppose \begin{align*} \omega_r(w_n,1;[0,1])\to 0. \end{align*} Then $w_n\to 0$ uniformly on $[0,1]$. [/claim] [proof] Let $M>0$ be such that $\|w_n\|_{C([0,1])}\leq M$ for every $n\in\mathbb{N}$. We use the following explicit Marchaud finite-difference estimate on $[0,1]$: for each $r\in\mathbb{N}$ there are constants $K_r,L_r>0$, depending only on $r$, such that every $w\in C([0,1])$ and every $0<\delta\leq 1$ satisfy \begin{align*} \omega_1(w,\delta;[0,1])\leq K_r\delta\|w\|_{C([0,1])}+L_r\omega_r(w,1;[0,1]). \end{align*} Here $\omega_1$ is the first modulus of smoothness defined by the same formula with $r=1$. Applying this estimate to $w=w_n$ gives, for every $0<\delta\leq1$, \begin{align*} \omega_1(w_n,\delta;[0,1])\leq K_rM\delta+L_r\omega_r(w_n,1;[0,1]). \end{align*} Since $\omega_r(w_n,1;[0,1])\to0$, the family $(w_n)$ is equicontinuous: given $\varepsilon>0$, choose $\delta>0$ with $K_rM\delta<\varepsilon/2$, and then choose $N\in\mathbb{N}$ such that $L_r\omega_r(w_n,1;[0,1])<\varepsilon/2$ for all $n\geq N$; the finitely many functions $w_1,\dots,w_{N-1}$ are uniformly continuous on $[0,1]$, so decreasing $\delta$ if necessary gives the same bound for them. The sequence $(w_n)$ is uniformly bounded by $M$ and equicontinuous on the compact [metric space](/page/Metric%20Space) $[0,1]$. The sequential [Arzelà-Ascoli theorem](/theorems/885) therefore gives that every subsequence of $(w_n)$ contains a uniformly convergent further subsequence. Take an arbitrary subsequence $(w_{n_k})_{k=1}^{\infty}$ and choose a further subsequence $(w_{n_{k_\ell}})_{\ell=1}^{\infty}$ converging uniformly to some $w\in C([0,1])$. For every admissible pair $(x,h)$, the finite-difference functional $z\mapsto \Delta_h^r z(x)$ is continuous with respect to the uniform norm, since it is a finite linear combination of point evaluations. Hence \begin{align*} \Delta_h^r w(x)=\lim_{\ell\to\infty}\Delta_h^r w_{n_{k_\ell}}(x)=0. \end{align*} By the kernel identification from the previous step, $w\in\mathcal{P}_{r-1}$. [Uniform convergence](/page/Uniform%20Convergence) also preserves the anchored values, hence $w(t_j)=0$ for each $j\in\{0,\dots,r-1\}$. Since the nodes $t_0,\dots,t_{r-1}$ are distinct, a polynomial of degree at most $r-1$ with these $r$ zeros is the zero polynomial. Therefore $w=0$ on $[0,1]$. Every subsequence has a further subsequence converging uniformly to $0$. If the full sequence did not converge uniformly to $0$, there would be an $\varepsilon>0$ and a subsequence with $\|w_{n_k}\|_{C([0,1])}\geq\varepsilon$ for all $k$, contradicting the further subsequence convergence to $0$. Thus $w_n\to0$ uniformly. [/proof] Applying the claim to $(v_n)$ gives $\|v_n\|_{C([0,1])}\to 0$, and therefore \begin{align*} 1=E_{r-1}(v_n;[0,1])\leq \|v_n\|_{C([0,1])}\to 0, \end{align*} a contradiction. Thus the reference interval estimate holds. The constant $A_r$ is supplied by this contradiction argument; its dependence is only on the order $r$, since the compactness step uses only the Marchaud constants $K_r,L_r$, the fixed interpolation nodes chosen above, and the uniform bound obtained from $B_r$. [guided] We now prove the fixed-interval estimate rather than citing it. Suppose no constant $A_r$ exists. Then for each $n\in\mathbb{N}$ we can choose $u_n\in C([0,1])$ such that \begin{align*} E_{r-1}(u_n;[0,1])>n\omega_r(u_n,1;[0,1]). \end{align*} The error $E_{r-1}(u_n;[0,1])$ is nonzero, because otherwise the displayed strict inequality would force $0>n\omega_r(u_n,1;[0,1])$, impossible. Dividing by this positive number, we may assume \begin{align*} E_{r-1}(u_n;[0,1])=1 \end{align*} and \begin{align*} \omega_r(u_n,1;[0,1])<\frac{1}{n}. \end{align*} The polynomial part is invisible to both sides of the inequality, so we remove it in a controlled way. Define \begin{align*} v_n:=u_n-\Pi_r u_n. \end{align*} Since $\Pi_r u_n\in\mathcal{P}_{r-1}$ and adding or subtracting a polynomial of degree at most $r-1$ merely translates the approximating polynomial, the best approximation error is unchanged: \begin{align*} E_{r-1}(v_n;[0,1])=E_{r-1}(u_n;[0,1])=1. \end{align*} The same subtraction does not change the $r$-th finite differences, because every $r$-th finite difference of a polynomial in $\mathcal{P}_{r-1}$ is zero. Hence \begin{align*} \omega_r(v_n,1;[0,1])=\omega_r(u_n,1;[0,1])<\frac{1}{n}. \end{align*} The interpolation property also gives anchoring: \begin{align*} v_n(t_j)=0 \end{align*} for each $j\in\{0,\dots,r-1\}$. We still need a uniform bound. Since $E_{r-1}(u_n;[0,1])=1$, the definition of the infimum gives a polynomial $p_n\in\mathcal{P}_{r-1}$ such that \begin{align*} \|u_n-p_n\|_{C([0,1])}\leq 2. \end{align*} Because $\Pi_r p_n=p_n$, we have \begin{align*} v_n=(I-\Pi_r)(u_n-p_n). \end{align*} The projector estimate from the previous step therefore gives \begin{align*} \|v_n\|_{C([0,1])}\leq B_r\|u_n-p_n\|_{C([0,1])}\leq 2B_r. \end{align*} We now prove the compactness conclusion needed here. The sequence $(v_n)$ is uniformly bounded by $2B_r$, is anchored by $v_n(t_j)=0$ for each $j\in\{0,\dots,r-1\}$, and satisfies $\omega_r(v_n,1;[0,1])\to0$. The explicit Marchaud estimate used in the claim says that there are constants $K_r,L_r>0$, depending only on $r$, such that every $w\in C([0,1])$ and every $0<\delta\leq1$ satisfy \begin{align*} \omega_1(w,\delta;[0,1])\leq K_r\delta\|w\|_{C([0,1])}+L_r\omega_r(w,1;[0,1]). \end{align*} Applying this to $w=v_n$ gives \begin{align*} \omega_1(v_n,\delta;[0,1])\leq 2K_rB_r\delta+L_r\omega_r(v_n,1;[0,1]). \end{align*} Given $\varepsilon>0$, choose $\delta>0$ with $2K_rB_r\delta<\varepsilon/2$ and then choose $N\in\mathbb{N}$ such that $L_r\omega_r(v_n,1;[0,1])<\varepsilon/2$ for all $n\geq N$. The finitely many functions $v_1,\dots,v_{N-1}$ are uniformly continuous on $[0,1]$, so decreasing $\delta$ if necessary gives equicontinuity for the entire sequence. Since the sequence is also uniformly bounded, the sequential Arzelà-Ascoli theorem implies that every subsequence of $(v_n)$ contains a uniformly convergent further subsequence. Let $(v_{n_{k_\ell}})$ be such a uniformly convergent further subsequence, and let $v\in C([0,1])$ be its uniform limit. For every admissible pair $(x,h)$, the map $z\mapsto\Delta_h^r z(x)$ is continuous in the uniform norm, because it is a finite linear combination of point evaluations. Hence \begin{align*} \Delta_h^r v(x)=\lim_{\ell\to\infty}\Delta_h^r v_{n_{k_\ell}}(x)=0. \end{align*} The kernel identification from the previous step then gives $v\in\mathcal{P}_{r-1}$. Also, uniform convergence preserves point values, so \begin{align*} v(t_j)=\lim_{\ell\to\infty}v_{n_{k_\ell}}(t_j)=0. \end{align*} This holds for every $j\in\{0,\dots,r-1\}$. A polynomial of degree at most $r-1$ with the $r$ distinct zeros $t_0,\dots,t_{r-1}$ is identically zero. Thus every subsequence of $(v_n)$ has a further subsequence converging uniformly to $0$. This subsequential statement forces the whole sequence to converge uniformly to $0$: otherwise some $\varepsilon>0$ and some subsequence would satisfy $\|v_{n_k}\|_{C([0,1])}\geq\varepsilon$, contradicting the existence of a further subsequence converging uniformly to $0$. Hence \begin{align*} \|v_n\|_{C([0,1])}\to 0. \end{align*} But the zero polynomial is an admissible approximant, so \begin{align*} 1=E_{r-1}(v_n;[0,1])\leq \|v_n\|_{C([0,1])}\to 0, \end{align*} which is impossible. The contradiction proves the reference interval estimate. [/guided] [/step] [step:Scale the reference interval estimate back to $I$] Applying the reference estimate to $g=f\circ T$ gives \begin{align*} E_{r-1}(g;[0,1]) \leq A_r \omega_r(g,1;[0,1]). \end{align*} Using the equalities from the affine reduction step, \begin{align*} E_{r-1}(f;I) \leq A_r \omega_r(f,|I|;I). \end{align*} Thus the theorem holds with \begin{align*} C_r := A_r. \end{align*} The constant depends only on $r$, because the reference interval argument depends only on the order of the finite difference and not on $I$ or $f$. [/step]

Prerequisites (0/5 completed)

Prerequisites Graph

Interactive dependency map showing how this theorem builds on foundational concepts

Loading dependency graph...

Theorems

Arzelà-Ascoli Theorem

Definitions & Concepts

What brings you to Androma?

Start with a route through the knowledge graph.