[proofplan]
Choose the $n$ basis vectors corresponding to the $n$ largest coefficient magnitudes and approximate $f$ by the partial sum over those indices. Orthogonality converts the squared approximation error into the sum of the squared omitted coefficients. The assumed algebraic decay bounds this tail by a convergent power-series tail, and the integral comparison gives the rate $n^{1-2s}$ for the squared error. Taking square roots yields the stated algebraic best $n$-term rate.
[/proofplan]
custom_env
admin
[step:Approximate by the $n$ largest coefficients]Fix $n \in \mathbb{N}$. Let $\Lambda_n \subset I$ be a set of indices with $|\Lambda_n| \leq n$ such that the multiset $\{|a_k| : k \in \Lambda_n\}$ consists of the first $n$ terms of the nonincreasing rearrangement $(a_m^*)_{m \geq 1}$, omitting repeated zero terms if fewer than $n$ nonzero coefficients are present. Define the $n$-term approximant
\begin{align*}
g_n := \sum_{k \in \Lambda_n} a_k e_k \in H.
\end{align*}
This is an admissible competitor in the definition of $\sigma_n(f)_{H,\mathcal{D}}$, since it uses at most $n$ elements of $\mathcal{D}$. Therefore
\begin{align*}
\sigma_n(f)_{H,\mathcal{D}} \leq \|f-g_n\|_H.
\end{align*}[/step]
custom_env
admin
[guided]Fix $n \in \mathbb{N}$. The best $n$-term error is an infimum over all linear combinations of at most $n$ dictionary elements, so to prove an upper bound it is enough to exhibit one particular admissible $n$-term approximation.
We choose the most efficient candidate suggested by orthogonality: keep the $n$ largest coefficients. Let $\Lambda_n \subset I$ be a set of indices with $|\Lambda_n| \leq n$ such that the magnitudes $\{|a_k| : k \in \Lambda_n\}$ are exactly the first $n$ entries of the nonincreasing rearrangement $(a_m^*)_{m \geq 1}$, ignoring padded zero entries if the coefficient support has fewer than $n$ nonzero terms. Define
\begin{align*}
g_n := \sum_{k \in \Lambda_n} a_k e_k \in H.
\end{align*}
The vector $g_n$ belongs to the span of at most $n$ elements of $\mathcal{D} = \{e_k : k \in I\}$. Hence $g_n$ is one of the competitors allowed in the definition
\begin{align*}
\sigma_n(f)_{H,\mathcal{D}} := \inf\left\{\left\|f - \sum_{k \in \Lambda} c_k e_k\right\|_H : \Lambda \subset I,\ |\Lambda| \leq n,\ c_k \in \mathbb{F}\right\}.
\end{align*}
Since an infimum is bounded above by the value of any admissible competitor, we obtain
\begin{align*}
\sigma_n(f)_{H,\mathcal{D}} \leq \|f-g_n\|_H.
\end{align*}[/guided]
custom_env
admin
[step:Convert the approximation error into a coefficient tail]
Because $f = \sum_{k \in I} a_k e_k$ in $H$ and $(e_k)_{k \in I}$ is orthonormal, the difference $f-g_n$ has coefficients $a_k$ on $I \setminus \Lambda_n$ and coefficient $0$ on $\Lambda_n$. For finite partial sums over $I \setminus \Lambda_n$, orthonormality gives the Pythagorean identity
\begin{align*}
\left\|\sum_{k \in F} a_k e_k\right\|_H^2 = \sum_{k \in F} |a_k|^2
\end{align*}
for every finite set $F \subset I \setminus \Lambda_n$. Since $f-g_n$ is the norm limit in $H$ of these finite partial sums as $F$ increases through the finite subsets of $I \setminus \Lambda_n$, norm continuity gives convergence of the left-hand side. The right-hand side is an increasing net of nonnegative finite sums, so its limit is the unordered series over $I \setminus \Lambda_n$. Hence
\begin{align*}
\|f-g_n\|_H^2 = \sum_{k \in I \setminus \Lambda_n} |a_k|^2.
\end{align*}
By the choice of $\Lambda_n$, the omitted magnitudes are precisely the rearranged tail $(a_m^*)_{m>n}$, counted with multiplicity. Hence
\begin{align*}
\|f-g_n\|_H^2 = \sum_{m=n+1}^{\infty} (a_m^*)^2.
\end{align*}
Combining this identity with the previous step yields
\begin{align*}
\sigma_n(f)_{H,\mathcal{D}}^2 \leq \sum_{m=n+1}^{\infty} (a_m^*)^2.
\end{align*}
[/step]
custom_env
admin
[step:Bound the rearranged tail by an integral comparison]The coefficient decay hypothesis gives $(a_m^*)^2 \leq C^2 m^{-2s}$ for every $m \in \mathbb{N}$. Since $s > \frac{1}{2}$, we have $2s > 1$. Therefore
\begin{align*}
\sum_{m=n+1}^{\infty} (a_m^*)^2 \leq C^2 \sum_{m=n+1}^{\infty} m^{-2s}.
\end{align*}
Let $\mathcal{L}^1$ denote one-dimensional [Lebesgue measure](/page/Lebesgue%20Measure) on $\mathbb{R}$. The function $\varphi: [n,\infty) \to (0,\infty)$ defined by $\varphi(x) := x^{-2s}$ is positive and decreasing, so the integral comparison estimate gives
\begin{align*}
\sum_{m=n+1}^{\infty} m^{-2s} \leq \int_n^\infty x^{-2s}\,d\mathcal{L}^1(x).
\end{align*}
Evaluating the improper integral,
\begin{align*}
\int_n^\infty x^{-2s}\,d\mathcal{L}^1(x) = \frac{n^{1-2s}}{2s-1}.
\end{align*}
Thus
\begin{align*}
\sum_{m=n+1}^{\infty} (a_m^*)^2 \leq \frac{C^2}{2s-1} n^{1-2s}.
\end{align*}[/step]
custom_env
admin
[guided]We now estimate the tail left after keeping the $n$ largest coefficients. The decay assumption says that every rearranged coefficient satisfies
\begin{align*}
a_m^* \leq C m^{-s}.
\end{align*}
Squaring both sides is legitimate because both sides are nonnegative, and gives
\begin{align*}
(a_m^*)^2 \leq C^2 m^{-2s}.
\end{align*}
Therefore the omitted squared coefficients satisfy
\begin{align*}
\sum_{m=n+1}^{\infty} (a_m^*)^2 \leq C^2 \sum_{m=n+1}^{\infty} m^{-2s}.
\end{align*}
The condition $s > \frac{1}{2}$ is used exactly here: it implies $2s > 1$, so the power-series tail with exponent $2s$ is summable. Let $\mathcal{L}^1$ denote one-dimensional Lebesgue measure on $\mathbb{R}$. To obtain the dependence on $n$, define the comparison function $\varphi: [n,\infty) \to (0,\infty)$ by
\begin{align*}
\varphi(x) := x^{-2s}.
\end{align*}
This function is decreasing on $[n,\infty)$ because $2s>0$. Hence, for each integer $m \geq n+1$, the monotonicity of $\varphi$ gives
\begin{align*}
m^{-2s} = \varphi(m) \leq \int_{m-1}^{m} \varphi(x)\,d\mathcal{L}^1(x).
\end{align*}
Summing this estimate over all $m \geq n+1$ gives the integral comparison
\begin{align*}
\sum_{m=n+1}^{\infty} m^{-2s} \leq \int_n^\infty x^{-2s}\,d\mathcal{L}^1(x).
\end{align*}
Since $2s>1$, the improper integral converges and evaluates to
\begin{align*}
\int_n^\infty x^{-2s}\,d\mathcal{L}^1(x) = \frac{n^{1-2s}}{2s-1}.
\end{align*}
Combining the decay estimate with this integral comparison yields
\begin{align*}
\sum_{m=n+1}^{\infty} (a_m^*)^2 \leq \frac{C^2}{2s-1} n^{1-2s}.
\end{align*}[/guided]
custom_env
admin
[step:Take square roots to obtain the algebraic rate]
From the preceding two steps,
\begin{align*}
\sigma_n(f)_{H,\mathcal{D}}^2 \leq \frac{C^2}{2s-1} n^{1-2s}.
\end{align*}
Both sides are nonnegative, so taking square roots gives
\begin{align*}
\sigma_n(f)_{H,\mathcal{D}} \leq \frac{C}{\sqrt{2s-1}} n^{\frac{1}{2}-s}.
\end{align*}
Since $\frac{1}{2}-s = -(s-\frac{1}{2})$, this is
\begin{align*}
\sigma_n(f)_{H,\mathcal{D}} \leq \frac{C}{\sqrt{2s-1}} n^{-(s-\frac{1}{2})}.
\end{align*}
Thus the theorem holds with $C' = C(2s-1)^{-1/2}$, which depends only on $C$ and $s$.
[/step]