Chebyshev Alternation Theorem for Haar Spaces — Statement & Proof

Theorem

Edit Issues Pull Requests Attributions Admin

Discussion

Proof

[proofplan] Write $e:=f-p$. The sufficiency direction is the direct zero-counting argument: any strictly better approximant forces $q-p$ to alternate sign across the $n+1$ extremal points, hence to have at least $n$ distinct zeros, contradicting the Haar property unless $q=p$. For necessity, we prove the standard Haar improvement alternative: either the active error set contains an alternating chain of length $n+1$, or there is a direction $h\in X$ whose sign agrees with $e$ on the whole active set. A small perturbation $p+\varepsilon h$ then lowers the maximum error on the active set while compactness and continuity keep the inactive set below the old maximum, contradicting best approximation. [/proofplan] [step:Show that an alternating extremal chain forces best approximation] Assume that there exist points $a \leq x_0 < \cdots < x_n \leq b$ and a sign $\sigma \in \{-1,1\}$ such that \begin{align*} e(x_i)=\sigma(-1)^iE \end{align*} for every $i\in\{0,\dots,n\}$. Suppose, toward a contradiction, that there exists $q\in X$ such that \begin{align*} \|f-q\|_{C([a,b])}<E. \end{align*} Define $r\in X$ by $r:=q-p$. At each $x_i$, \begin{align*} |e(x_i)-r(x_i)|=|f(x_i)-q(x_i)|<E. \end{align*} If $e(x_i)=E$, then this inequality implies $r(x_i)>0$. If $e(x_i)=-E$, then it implies $r(x_i)<0$. Hence \begin{align*} \sigma(-1)^i r(x_i)>0 \end{align*} for every $i\in\{0,\dots,n\}$. Therefore $r(x_i)$ and $r(x_{i+1})$ have opposite signs for each $i\in\{0,\dots,n-1\}$. Since $r$ is continuous on $[a,b]$, for each such $i$ there exists a point $z_i\in(x_i,x_{i+1})$ such that $r(z_i)=0$. The intervals $(x_i,x_{i+1})$ are pairwise disjoint, so the zeros $z_0,\dots,z_{n-1}$ are distinct. Thus $r$ has at least $n$ distinct zeros in $[a,b]$. Because $X$ is a Haar space of dimension $n$, every nonzero element of $X$ has at most $n-1$ distinct zeros. Hence $r=0$, so $q=p$. But then \begin{align*} \|f-q\|_{C([a,b])}=\|f-p\|_{C([a,b])}=E, \end{align*} contradicting $\|f-q\|_{C([a,b])}<E$. Therefore no strictly better $q$ exists, and $p$ is a best approximation. [/step] [step:Prove the Haar improvement alternative on the active error set] Define the active set $A\subset [a,b]$ by \begin{align*} A:=\{x\in[a,b]: |e(x)|=E\}. \end{align*} Since $e$ is continuous and $[a,b]$ is compact, $A$ is a nonempty compact subset of $[a,b]$. Define the sign function $s:A\to\{-1,1\}$ by \begin{align*} s(x):=\operatorname{sgn}(e(x)). \end{align*} This is well-defined on $A$ because $|e(x)|=E$. If $E=0$, then $f=p\in X$, and the theorem is immediate: any $n+1$ distinct points in $[a,b]$ give $e(x_i)=0=\sigma(-1)^iE$. Thus assume $E>0$. We prove the following alternative. [claim:Haar improvement alternative] Either $A$ contains points $x_0<\cdots<x_n$ such that $s(x_i)=\sigma(-1)^i$ for some $\sigma\in\{-1,1\}$, or there exists $h\in X$ such that \begin{align*} s(x)h(x)>0 \end{align*} for every $x\in A$. [/claim] [proof] Choose a basis $u_1,\dots,u_n$ of $X$. Define the evaluation map \begin{align*} U:[a,b]\to\mathbb{R}^n \end{align*} by \begin{align*} U(x):=(u_1(x),\dots,u_n(x)). \end{align*} The Haar property implies that for every choice of distinct points $t_1<\cdots<t_n$ in $[a,b]$, the matrix $(u_j(t_i))_{i,j=1}^n$ is invertible. Indeed, if it were singular, there would be coefficients $c_1,\dots,c_n$, not all zero, such that $h:=\sum_{j=1}^n c_j u_j$ vanished at all $t_1,\dots,t_n$, contradicting the Haar property. Define the compact set $K\subset\mathbb{R}^n$ by \begin{align*} K:=\{s(x)U(x):x\in A\}. \end{align*} A function $h=\sum_{j=1}^n c_j u_j\in X$ satisfies $s(x)h(x)>0$ on $A$ exactly when the vector $c:=(c_1,\dots,c_n)\in\mathbb{R}^n$ satisfies \begin{align*} c\cdot v>0 \end{align*} for every $v\in K$. If $0\notin \operatorname{conv}(K)$, the compact convex set $\operatorname{conv}(K)$ and the singleton $\{0\}$ are disjoint. Since $\operatorname{conv}(K)$ is compact, choose $w\in\operatorname{conv}(K)$ with minimal Euclidean norm. Then $w\neq0$. For every $v\in\operatorname{conv}(K)$ and every $t\in[0,1]$, minimality of $w$ gives \begin{align*} |w|^2\leq |(1-t)w+tv|^2. \end{align*} Expanding the right-hand side and dividing by $t>0$ gives \begin{align*} 0\leq 2w\cdot(v-w)+t|v-w|^2. \end{align*} Letting $t\downarrow0$ yields $w\cdot v\geq |w|^2>0$. In particular $w\cdot v>0$ for every $v\in K$. Therefore $h:=\sum_{j=1}^n w_j u_j$ satisfies $s(x)h(x)>0$ for every $x\in A$. It remains to show that if $0\in\operatorname{conv}(K)$, then $A$ contains an alternating chain of length $n+1$. By Carathéodory's finite-dimensional convexity theorem, there exist points $y_0<\cdots<y_m$ in $A$ and coefficients $\lambda_0,\dots,\lambda_m>0$, with $m\leq n$ and $\sum_{r=0}^m\lambda_r=1$, such that \begin{align*} \sum_{r=0}^m \lambda_r s(y_r)U(y_r)=0. \end{align*} Choose such a representation with the smallest possible number of support points. The vectors $U(y_0),\dots,U(y_m)$ are linearly independent whenever $m\leq n-1$. To prove this, extend the distinct points $y_0,\dots,y_m$ to $n$ distinct points $t_1<\cdots<t_n$ in $[a,b]$. The Haar determinant invertibility proved above says that the $n\times n$ matrix $(u_j(t_i))_{i,j=1}^n$ is invertible, so its rows are linearly independent. The rows corresponding to $y_0,\dots,y_m$ are a subset of these rows, hence $U(y_0),\dots,U(y_m)$ are linearly independent. Therefore the displayed positive dependence cannot have $m\leq n-1$. Hence $m=n$. We now prove that the signs $s(y_0),\dots,s(y_n)$ alternate. Suppose instead that $s(y_j)=s(y_{j+1})$ for some $j\in\{0,\dots,n-1\}$. Let $c_r:=\lambda_r s(y_r)$, so \begin{align*} \sum_{r=0}^n c_rU(y_r)=0. \end{align*} Because every $n$ of the vectors $U(y_0),\dots,U(y_n)$ are linearly independent, this dependence is unique up to multiplication by a nonzero scalar. For a Chebyshev system, the coefficients of the unique dependence among the $n+1$ ordered evaluation vectors have alternating signs. To verify this directly, define the ordered Haar determinant map on the ordered configuration set by \begin{align*} D(t_1,\dots,t_n):=\det(u_j(t_i))_{i,j=1}^n. \end{align*} The set $\{(t_1,\dots,t_n)\in[a,b]^n:t_1<\cdots<t_n\}$ is connected, and $D$ is continuous and never zero by the Haar determinant invertibility proved above. Hence $D$ has one fixed sign on this ordered configuration set, after possibly replacing $u_1$ by $-u_1$. Delete the $r$-th vector and expand the determinant of the resulting $n\times n$ matrix. Cramer's cofactor formula gives \begin{align*} \operatorname{sgn}(c_r)=\tau(-1)^r \end{align*} for one fixed sign $\tau\in\{-1,1\}$ and every $r\in\{0,\dots,n\}$. Since $\lambda_r>0$, this means \begin{align*} s(y_r)=\operatorname{sgn}(c_r)=\tau(-1)^r. \end{align*} Thus the signs alternate, contradicting $s(y_j)=s(y_{j+1})$. Hence $y_0<\cdots<y_n$ is the required alternating chain. [/proof] [guided] The point of this alternative is to turn the geometric failure of alternation into an analytic direction that improves the approximation. Choose a basis $u_1,\dots,u_n$ of $X$. Define the evaluation map \begin{align*} U:[a,b]\to\mathbb{R}^n \end{align*} by \begin{align*} U(x):=(u_1(x),\dots,u_n(x)). \end{align*} The first structural fact we need is that evaluations at $n$ distinct points are independent. If $t_1<\cdots<t_n$ are points in $[a,b]$ and the matrix $(u_j(t_i))_{i,j=1}^n$ were singular, then there would be coefficients $c_1,\dots,c_n$, not all zero, such that the function $h:=\sum_{j=1}^n c_j u_j\in X$ vanished at every point $t_1,\dots,t_n$. This would contradict the Haar property, because a nonzero element of an $n$-dimensional Haar space has at most $n-1$ distinct zeros. Hence every such Haar evaluation matrix is invertible. Now encode each active point $x\in A$ by the signed evaluation vector $s(x)U(x)\in\mathbb{R}^n$, and define \begin{align*} K:=\{s(x)U(x):x\in A\}. \end{align*} The set $K$ is compact because $A$ is compact and $x\mapsto s(x)U(x)$ is continuous on $A$: here $s(x)=e(x)/E$ on $A$ since $E>0$. A direction $h=\sum_{j=1}^n c_j u_j\in X$ satisfies \begin{align*} s(x)h(x)>0 \end{align*} for every $x\in A$ exactly when the coefficient vector $c:=(c_1,\dots,c_n)\in\mathbb{R}^n$ satisfies \begin{align*} c\cdot v>0 \end{align*} for every $v\in K$. If $0\notin\operatorname{conv}(K)$, then the compact convex set $\operatorname{conv}(K)$ is disjoint from the singleton $\{0\}$. Choose $w\in\operatorname{conv}(K)$ with minimal Euclidean norm. Since $0\notin\operatorname{conv}(K)$, we have $w\neq0$. For every $v\in\operatorname{conv}(K)$ and every $t\in[0,1]$, the point $(1-t)w+tv$ belongs to $\operatorname{conv}(K)$, so minimality of $w$ gives \begin{align*} |w|^2\leq |(1-t)w+tv|^2. \end{align*} Expanding the square and subtracting $|w|^2$ gives \begin{align*} 0\leq 2t w\cdot(v-w)+t^2|v-w|^2. \end{align*} For $t>0$, division by $t$ gives \begin{align*} 0\leq 2w\cdot(v-w)+t|v-w|^2. \end{align*} Letting $t\downarrow0$ yields \begin{align*} w\cdot v\geq |w|^2>0. \end{align*} Taking $v=s(x)U(x)$, define $h\in X$ by \begin{align*} h:=\sum_{j=1}^n w_j u_j. \end{align*} Then $s(x)h(x)=w\cdot s(x)U(x)>0$ for every $x\in A$, which is the desired improvement direction. It remains to treat the case $0\in\operatorname{conv}(K)$. By Carathéodory's finite-dimensional convexity theorem, there exist points $y_0<\cdots<y_m$ in $A$ and coefficients $\lambda_0,\dots,\lambda_m>0$, with $m\leq n$ and $\sum_{r=0}^m\lambda_r=1$, such that \begin{align*} \sum_{r=0}^m \lambda_r s(y_r)U(y_r)=0. \end{align*} Choose such a representation with the smallest possible number of support points. We claim that $m=n$. If $m\leq n-1$, extend the distinct points $y_0,\dots,y_m$ to $n$ distinct points $t_1<\cdots<t_n$ in $[a,b]$. The matrix $(u_j(t_i))_{i,j=1}^n$ is invertible by the Haar determinant argument above, so its rows are linearly independent. The rows $U(y_0),\dots,U(y_m)$ are a subset of those rows, hence they are linearly independent. This contradicts the displayed positive dependence. Therefore $m=n$. We now prove that the signs $s(y_0),\dots,s(y_n)$ alternate. Let $c_r:=\lambda_r s(y_r)$, so \begin{align*} \sum_{r=0}^n c_rU(y_r)=0. \end{align*} Every $n$ of the vectors $U(y_0),\dots,U(y_n)$ are linearly independent by the same extension-and-invertibility argument, so this dependence is unique up to multiplication by a nonzero scalar. To compute its signs, define the ordered Haar determinant map by \begin{align*} D(t_1,\dots,t_n):=\det(u_j(t_i))_{i,j=1}^n. \end{align*} The ordered configuration set $\{(t_1,\dots,t_n)\in[a,b]^n:t_1<\cdots<t_n\}$ is connected, and $D$ is continuous and never zero on it. Hence $D$ has one fixed sign there, after possibly replacing $u_1$ by $-u_1$. Cramer's cofactor formula for the unique dependence among the $n+1$ ordered vectors gives \begin{align*} \operatorname{sgn}(c_r)=\tau(-1)^r \end{align*} for one fixed sign $\tau\in\{-1,1\}$ and every $r\in\{0,\dots,n\}$. Since each $\lambda_r>0$, the sign of $c_r=\lambda_r s(y_r)$ is exactly $s(y_r)$. Therefore \begin{align*} s(y_r)=\tau(-1)^r \end{align*} for every $r\in\{0,\dots,n\}$. Thus $y_0<\cdots<y_n$ is an alternating chain in $A$. [/guided] [/step] [step:Use the improvement alternative to prove necessity] Assume that $p$ is a best approximation to $f$ from $X$. If $E=0$, then $f=p$, and any strictly increasing choice of $n+1$ points in $[a,b]$ satisfies \begin{align*} f(x_i)-p(x_i)=0=\sigma(-1)^iE. \end{align*} Thus assume $E>0$. Apply the Haar improvement alternative to the active set $A=\{x\in[a,b]:|e(x)|=E\}$. If $A$ contains an alternating chain of length $n+1$, then the desired conclusion follows. Suppose instead that there exists $h\in X$ such that \begin{align*} s(x)h(x)>0 \end{align*} for every $x\in A$. Since $E>0$ and $|e(x)|=E$ on $A$, we have $s(x)=e(x)/E$ for every $x\in A$. Hence the function $x\mapsto s(x)h(x)$ is continuous on $A$. Since $A$ is compact, define \begin{align*} \alpha:=\min_{x\in A}s(x)h(x)>0. \end{align*} Let \begin{align*} M:=\|h\|_{C([a,b])}. \end{align*} If $M=0$, then $h=0$, contradicting $\alpha>0$, so $M>0$. Choose $\varepsilon>0$ such that \begin{align*} 0<\varepsilon<\frac{2E\alpha}{M^2}. \end{align*} For $x\in A$, we have $e(x)=s(x)E$, and therefore \begin{align*} |e(x)-\varepsilon h(x)|^2=E^2-2\varepsilon E s(x)h(x)+\varepsilon^2 h(x)^2. \end{align*} Using $s(x)h(x)\geq\alpha$ and $|h(x)|\leq M$, we get \begin{align*} |e(x)-\varepsilon h(x)|^2\leq E^2-2\varepsilon E\alpha+\varepsilon^2M^2<E^2. \end{align*} Thus \begin{align*} |e(x)-\varepsilon h(x)|<E \end{align*} for every $x\in A$. It remains to control points away from $A$. Because $A$ is compact and the function $x\mapsto |e(x)|$ is continuous, for every open neighbourhood $V$ of $A$ in $[a,b]$ with compact complement $[a,b]\setminus V$, the maximum of $|e|$ on $[a,b]\setminus V$ is strictly smaller than $E$. Choose such a neighbourhood $V$ small enough that the strict inequality on $A$ above persists on $V$ by continuity of $e-\varepsilon h$. Then there exists $\eta>0$ such that \begin{align*} |e(x)-\varepsilon h(x)|\leq E-\eta \end{align*} for every $x\in V$. On the compact set $[a,b]\setminus V$, define \begin{align*} \beta:=E-\max_{x\in [a,b]\setminus V}|e(x)|>0. \end{align*} By decreasing $\varepsilon$ if necessary while keeping $\varepsilon>0$, we may also require \begin{align*} \varepsilon M<\beta. \end{align*} Then for $x\in [a,b]\setminus V$, \begin{align*} |e(x)-\varepsilon h(x)|\leq |e(x)|+\varepsilon |h(x)|\leq E-\beta+\varepsilon M<E. \end{align*} Combining the estimates on $V$ and on $[a,b]\setminus V$ gives \begin{align*} \|f-(p+\varepsilon h)\|_{C([a,b])}=\|e-\varepsilon h\|_{C([a,b])}<E. \end{align*} Since $p+\varepsilon h\in X$, this contradicts the assumption that $p$ is a best approximation. Therefore the improvement direction cannot exist, and the Haar improvement alternative forces an alternating chain $x_0<\cdots<x_n$ in $A$. For those points, \begin{align*} e(x_i)=s(x_i)E=\sigma(-1)^iE, \end{align*} which is the desired conclusion. [/step] [step:Conclude the equivalence] The first step proves that the existence of an alternating extremal chain implies that $p$ is a best approximation from $X$. The preceding step proves that every best approximation must possess such a chain. Hence $p$ is a best uniform approximation to $f$ from the Haar space $X$ if and only if there exist $a\leq x_0<\cdots<x_n\leq b$ and $\sigma\in\{-1,1\}$ such that \begin{align*} f(x_i)-p(x_i)=\sigma(-1)^i\|f-p\|_{C([a,b])} \end{align*} for every $i\in\{0,\dots,n\}$. [/step]

Prerequisites (0/2 completed)

Prerequisites Graph

Interactive dependency map showing how this theorem builds on foundational concepts

Loading dependency graph...

Definitions & Concepts

What brings you to Androma?

Start with a route through the knowledge graph.

Chebyshev Alternation Theorem for Haar Spaces (Theorem # 6882)

Discussion

Proof

Prerequisites (0/2 completed)

Prerequisites Graph

Explore Further

Sign in to Androma

Check your inbox

One last step

Chebyshev Alternation Theorem for Haar Spaces (Theorem # 6882)

Discussion

Proof

Prerequisites (0/2 completed)

Prerequisites Graph

Explore Further