[proofplan]
The proof has four stages. First, recall the Whitney $C^1$ condition and import (as standard external lemmas) the Whitney decomposition of $\mathbb{R}^n \setminus K$ and a smooth partition of unity subordinate to it. Second, build local first-order polynomial approximants $P_j$ on each Whitney cube using the data $(f, d)$ at carefully chosen reference points $x_j \in K$. Third, define $\tilde f$ piecewise on $K$ and on $\mathbb{R}^n \setminus K$, prove that $\tilde f \in C^\infty(\mathbb{R}^n \setminus K)$, and rewrite $\nabla \tilde f$ there in terms of the differences $P_j - P_i$ relative to a per-point reference index $i = i(x)$, whose choice does not affect the formula's value because $\sum_j \nabla \psi_j = 0$. Fourth, prove that $\tilde f$ is differentiable at every point of $K$ with gradient $d$ and that $\nabla \tilde f$ is continuous there, using the key estimate $|P_i(x) - P_j(x)| = o(\ell(Q_i))$ between adjacent cubes, derived from the Whitney remainder estimate and the continuity of $d$.
[/proofplan]
[step:State the Whitney $C^1$ condition and import the Whitney decomposition]
Let $K \subseteq \mathbb{R}^n$ be closed and nonempty. The functions $f: K \to \mathbb{R}$ and $d = (d_1, \dots, d_n): K \to \mathbb{R}^n$ (we write $d(a)$ for the candidate gradient at $a \in K$) satisfy the **Whitney $C^1$ condition** if the remainder
\begin{align*}
R: (K \times K) \setminus \{(a, a) : a \in K\} &\to \mathbb{R}, \\
(a, b) &\mapsto \frac{f(b) - f(a) - d(a) \cdot (b - a)}{|b - a|}
\end{align*}
satisfies: for every compact $H \subseteq K$ and every $\varepsilon > 0$, there exists $\delta = \delta(H, \varepsilon) > 0$ such that
\begin{align*}
|R(a, b)| < \varepsilon \qquad \text{for all } a, b \in H \text{ with } 0 < |a - b| < \delta. \tag{W}
\end{align*}
We call (W) the **Whitney remainder estimate**.
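As a quick sanity check of (W), offered only as an illustration and not used in the proof, take $n = 1$, $K = [0, 1]$, $f(t) = t^2$ and $d(t) = 2t$ (these choices are ours). Then
\begin{align*}
R(a, b) = \frac{b^2 - a^2 - 2a(b - a)}{|b - a|} = \frac{(b - a)^2}{|b - a|} = |b - a|,
\end{align*}
so (W) holds on every compact $H \subseteq K$ with $\delta := \varepsilon$.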
The construction below relies on two well-known auxiliary results from harmonic analysis, used here as black boxes.
We use the **Whitney Decomposition**: for any closed proper subset $K \subsetneq \mathbb{R}^n$, there exists a countable collection $\mathcal{Q} = \{Q_j\}_{j \in J}$ of closed dyadic cubes in $\mathbb{R}^n \setminus K$ with the following properties (see Stein, *Singular Integrals and Differentiability Properties of Functions*, Princeton University Press, 1970, Chapter VI, §1):
- (D1) $\bigcup_{j \in J} Q_j = \mathbb{R}^n \setminus K$ and $\operatorname{int}(Q_i) \cap \operatorname{int}(Q_j) = \varnothing$ for $i \ne j$.
- (D2) Comparability: for each $j \in J$, $\sqrt{n} \, \ell(Q_j) \le \operatorname{dist}(Q_j, K) \le 4\sqrt{n} \, \ell(Q_j)$, where $\ell(Q_j)$ denotes the side length of $Q_j$.
- (D3) Bounded overlap of enlargements: for the $\tfrac{9}{8}$-enlargement $Q_j^* := \tfrac{9}{8} Q_j$ (the cube concentric with $Q_j$ scaled by $9/8$), each point of $\mathbb{R}^n \setminus K$ lies in at most $N_n$ such enlargements, with $N_n$ a dimensional constant.
- (D4) Adjacency comparability: if $Q_i^* \cap Q_j^* \ne \varnothing$, then $\tfrac{1}{4}\ell(Q_i) \le \ell(Q_j) \le 4 \ell(Q_i)$.
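For orientation, here is a hand-built family satisfying (D1) to (D4) in the simplest case $n = 1$, $K = \{0\}$; it is an illustrative assumption, not necessarily the family produced by Stein's construction. Take the dyadic intervals $Q_k^{\pm} := \pm[2^k, 2^{k+1}]$, $k \in \mathbb{Z}$. Then
\begin{align*}
\ell(Q_k^{\pm}) = 2^k, \qquad \operatorname{dist}(Q_k^{\pm}, K) = 2^k, \qquad (Q_k^{+})^* = \left[\tfrac{15}{16}\, 2^k, \ \tfrac{33}{16}\, 2^k\right].
\end{align*}
Here (D2) reads $2^k \le 2^k \le 4 \cdot 2^k$. Among the other enlargements, $(Q_k^{+})^*$ meets only $(Q_{k-1}^{+})^*$ and $(Q_{k+1}^{+})^*$, because $(Q_{k+2}^{+})^*$ starts at $\tfrac{15}{4}\, 2^k > \tfrac{33}{16}\, 2^k$. Hence each point lies in at most two enlargements, as (D3) requires, and intersecting enlargements have side-length ratio $2$, as (D4) requires.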
We also use the **Whitney Partition of Unity**: with $\mathcal Q$ as above, there exist smooth functions $\{\psi_j\}_{j \in J}$ with (Stein, op. cit., Chapter VI, §1.3):
- (P1) $\operatorname{supp} \psi_j \subseteq Q_j^*$.
- (P2) $0 \le \psi_j \le 1$ and $\sum_{j \in J} \psi_j(x) = 1$ for every $x \in \mathbb{R}^n \setminus K$ (the sum is locally finite by (D3)).
- (P3) $|\nabla \psi_j(x)| \le C_n / \ell(Q_j)$ for every $x \in \mathbb{R}^n$, with $C_n$ a dimensional constant.
If $K = \mathbb{R}^n$, set $\tilde f := f$: applying (W) with $a$ fixed and $b \to a$ shows that $f$ is differentiable at every $a$ with $\nabla f(a) = d(a)$, and $d$ is continuous by hypothesis, so $f \in C^1(\mathbb{R}^n)$ with $\nabla f = d$. Assume henceforth $K \subsetneq \mathbb{R}^n$ and apply the two cited results.
[/step]
[step:Construct the reference points and the local affine approximants]
For each Whitney cube $Q_j$, pick a point $x_j \in K$ with
\begin{align*}
|x_j - c_j| \le 2 \operatorname{dist}(Q_j, K),
\end{align*}
where $c_j$ is the centre of $Q_j$. Such a choice exists: since $K$ is closed, the infimum defining $\operatorname{dist}(c_j, K)$ is attained at some $x_j \in K$, and $\operatorname{dist}(c_j, K) \le \operatorname{dist}(Q_j, K) + \operatorname{diam}(Q_j) \le \operatorname{dist}(Q_j, K) + \sqrt{n} \, \ell(Q_j) \le 2 \operatorname{dist}(Q_j, K)$ by (D2).
Define the **local first-order polynomial approximant**
\begin{align*}
P_j: \mathbb{R}^n &\to \mathbb{R}, \\
x &\mapsto f(x_j) + d(x_j) \cdot (x - x_j).
\end{align*}
This is the candidate Taylor polynomial of $\tilde f$ at $x_j$ (viewing $d(x_j)$ as the would-be gradient of $\tilde f$ at $x_j$).
Define the extension
\begin{align*}
\tilde f: \mathbb{R}^n &\to \mathbb{R}, \\
\tilde f(x) &:= \begin{cases}
f(x), & x \in K, \\
\sum_{j \in J} \psi_j(x) P_j(x), & x \in \mathbb{R}^n \setminus K.
\end{cases}
\end{align*}
The sum on $\mathbb{R}^n \setminus K$ is locally finite by (D3), hence $\tilde f$ is well-defined on $\mathbb{R}^n \setminus K$.
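A degenerate case shows what the formula does (an illustration with data of our own choosing): if $K = \{a\}$ is a single point with $f(a) = f_0$ and $d(a) = d_0$, then every reference point is $x_j = a$, every $P_j$ is the same affine function, and (P2) gives
\begin{align*}
\tilde f(x) = \sum_{j \in J} \psi_j(x) \bigl(f_0 + d_0 \cdot (x - a)\bigr) = f_0 + d_0 \cdot (x - a), \qquad x \ne a,
\end{align*}
so $\tilde f$ is the affine function with value $f_0$ and gradient $d_0$ at $a$, as one would expect.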
[/step]
[step:Compute $\nabla \tilde f$ on $\mathbb{R}^n \setminus K$ and prove smoothness there]
We work locally near a fixed point $x_0 \in \mathbb{R}^n \setminus K$. By (D3) and (P1), there is an open neighbourhood $U \ni x_0$ with $U \subset \mathbb{R}^n \setminus K$ such that the index set
\begin{align*}
J(U) := \{j \in J : Q_j^* \cap U \ne \varnothing\}
\end{align*}
is finite. (One way to see this: take $U := B(x_0, r)$ with $r := \tfrac{1}{2} \operatorname{dist}(x_0, K)$. By (D2), every cube $Q_j$ with $Q_j^* \cap U \ne \varnothing$ has side length comparable to $\operatorname{dist}(x_0, K)$ and lies within a bounded multiple of that distance from $x_0$; since the cubes have pairwise disjoint interiors by (D1), a volume count leaves only finitely many.) On $U$, $\psi_j \equiv 0$ for $j \notin J(U)$, so
\begin{align*}
\tilde f(x) = \sum_{j \in J(U)} \psi_j(x) P_j(x), \qquad x \in U,
\end{align*}
is a finite sum of $C^\infty$ functions. Term-by-term differentiation gives
\begin{align*}
\nabla \tilde f(x) = \sum_{j \in J(U)} (\nabla \psi_j(x)) P_j(x) + \sum_{j \in J(U)} \psi_j(x) \nabla P_j(x) = \sum_{j \in J(U)} (\nabla \psi_j(x)) P_j(x) + \sum_{j \in J(U)} \psi_j(x) d(x_j),
\end{align*}
using $\nabla P_j(x) = d(x_j)$.
**Per-point reference index.** For each $x \in \mathbb{R}^n \setminus K$, fix any index $i = i(x) \in J$ such that $\psi_i(x) > 0$. Such an $i$ exists because $\sum_{j \in J} \psi_j(x) = 1$ by (P2), so at least one term is positive; (P1) further forces $x \in Q_i^*$. Differentiating the identity $\sum_{j \in J} \psi_j \equiv 1$ on $\mathbb{R}^n \setminus K$ gives
\begin{align*}
\sum_{j \in J} \nabla \psi_j(x) = 0 \qquad \text{for all } x \in \mathbb{R}^n \setminus K, \tag{$\dagger$}
\end{align*}
where the sum is locally finite (only $j \in J(U)$ contribute near $x$). Hence subtracting $P_i(x)$ from each $P_j(x)$ in the first sum changes nothing:
\begin{align*}
\sum_{j \in J(U)} (\nabla \psi_j(x))(P_j(x) - P_i(x)) = \sum_{j \in J(U)} (\nabla \psi_j(x)) P_j(x) - P_i(x) \sum_{j \in J(U)} \nabla \psi_j(x) = \sum_{j \in J(U)} (\nabla \psi_j(x)) P_j(x),
\end{align*}
since the second term vanishes by ($\dagger$). Thus
\begin{align*}
\nabla \tilde f(x) = \sum_{j \in J(U)} (\nabla \psi_j(x))(P_j(x) - P_i(x)) + \sum_{j \in J(U)} \psi_j(x) d(x_j). \tag{$\spadesuit$}
\end{align*}
**Independence of the choice $i = i(x)$.** If $i' \in J$ is another admissible index (i.e.\ $\psi_{i'}(x) > 0$), the difference between the two right-hand sides of ($\spadesuit$) using $i$ versus $i'$ is
\begin{align*}
\sum_{j \in J(U)} (\nabla \psi_j(x))(P_{i'}(x) - P_i(x)) = (P_{i'}(x) - P_i(x)) \sum_{j \in J(U)} \nabla \psi_j(x) = 0,
\end{align*}
again by ($\dagger$). Hence ($\spadesuit$) defines $\nabla \tilde f(x)$ unambiguously.
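To see the cancellation concretely, suppose (as a simplifying assumption, for illustration only) that exactly two indices contribute near $x$, say $J(U) = \{1, 2\}$. Then ($\dagger$) reads $\nabla \psi_2(x) = -\nabla \psi_1(x)$, and both admissible choices give the same first sum in ($\spadesuit$):
\begin{align*}
i = 2: &\quad \nabla \psi_1(x) \, (P_1(x) - P_2(x)), \\
i = 1: &\quad \nabla \psi_2(x) \, (P_2(x) - P_1(x)) = \nabla \psi_1(x) \, (P_1(x) - P_2(x)).
\end{align*}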
**Smoothness on the complement.** Fix $x_0 \in \mathbb{R}^n \setminus K$ and the neighbourhood $U$ above. On $U$, $\tilde f$ is the finite sum $\sum_{j \in J(U)} \psi_j P_j$ of $C^\infty$ functions, hence $\tilde f \in C^\infty(U)$. Since $x_0$ was arbitrary, $\tilde f \in C^\infty(\mathbb{R}^n \setminus K)$.
[/step]
[step:Estimate $|P_i(x) - P_j(x)|$ between adjacent cubes]
Fix $x \in \mathbb{R}^n \setminus K$ and the reference index $i = i(x)$ from the previous step (so $x \in Q_i^*$). For each $j \in J$ with $\psi_j(x) > 0$ we have $x \in Q_j^*$ by (P1), so $Q_i^* \cap Q_j^* \ne \varnothing$. By (D4), $\tfrac{1}{4}\ell(Q_i) \le \ell(Q_j) \le 4 \ell(Q_i)$, and hence by (D2)
\begin{align*}
\operatorname{dist}(Q_j, K) \asymp \operatorname{dist}(Q_i, K) \asymp \ell(Q_i),
\end{align*}
where $\asymp$ means equality up to dimensional constants.
We claim $|x - x_j| \le C \ell(Q_i)$ for a dimensional constant $C$. Indeed, $|x - c_j| \le \operatorname{diam}(Q_j^*) = (9/8) \sqrt{n} \, \ell(Q_j) \le (9/8) \sqrt{n} \cdot 4 \ell(Q_i) = (9/2) \sqrt{n} \, \ell(Q_i)$, and $|c_j - x_j| \le 2 \operatorname{dist}(Q_j, K) \le 8 \sqrt{n} \, \ell(Q_j) \le 32 \sqrt{n} \, \ell(Q_i)$. Combining via the triangle inequality, $|x - x_j| \le C \ell(Q_i)$ with $C := (9/2 + 32)\sqrt{n}$. By the same argument applied to $i$ (with $\ell(Q_i)$ in place of $\ell(Q_j)$), $|x - x_i| \le C \ell(Q_i)$. Thus
\begin{align*}
|x_i - x_j| \le |x_i - x| + |x - x_j| \le 2C \, \ell(Q_i) =: C' \ell(Q_i).
\end{align*}
Compute
\begin{align*}
P_i(x) - P_j(x) = f(x_i) - f(x_j) + d(x_i) \cdot (x - x_i) - d(x_j) \cdot (x - x_j).
\end{align*}
If $x_i = x_j$, then $P_i = P_j$ and the estimates below hold trivially, so assume $x_i \ne x_j$. By definition of $R$ with $a = x_i$, $b = x_j$,
\begin{align*}
f(x_j) - f(x_i) - d(x_i) \cdot (x_j - x_i) = |x_i - x_j| R(x_i, x_j),
\end{align*}
so
\begin{align*}
f(x_i) - f(x_j) = -|x_i - x_j| R(x_i, x_j) - d(x_i)\cdot(x_j - x_i).
\end{align*}
Substituting,
\begin{align*}
P_i(x) - P_j(x) = -|x_i - x_j| R(x_i, x_j) - d(x_i) \cdot (x_j - x_i) + d(x_i) \cdot (x - x_i) - d(x_j) \cdot (x - x_j).
\end{align*}
The middle two terms simplify: $-d(x_i)\cdot(x_j - x_i) + d(x_i)\cdot(x - x_i) = d(x_i)\cdot(x - x_j)$, so
\begin{align*}
P_i(x) - P_j(x) = -|x_i - x_j| R(x_i, x_j) + (d(x_i) - d(x_j))\cdot(x - x_j). \tag{$\star$}
\end{align*}
By ($\star$) and the bounds $|x_i - x_j| \le C' \ell(Q_i)$, $|x - x_j| \le C \ell(Q_i)$:
\begin{align*}
|P_i(x) - P_j(x)| \le C' \ell(Q_i) \cdot |R(x_i, x_j)| + |d(x_i) - d(x_j)| \cdot C \ell(Q_i).
\end{align*}
Both $|R(x_i, x_j)|$ and $|d(x_i) - d(x_j)|$ are $o(1)$ as $\ell(Q_i) \to 0$ along sequences for which $x_i, x_j$ remain in a compact set $H \subseteq K$, because $|x_i - x_j| \le C' \ell(Q_i) \to 0$: the first by (W), and the second by uniform continuity of $d$ on $H$. Hence
\begin{align*}
|P_i(x) - P_j(x)| = o(\ell(Q_i)) \quad \text{as } \ell(Q_i) \to 0,
\end{align*}
uniformly over pairs $(i, j)$ with $Q_i^* \cap Q_j^* \ne \varnothing$ and $x_i, x_j \in H$.
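For a sense of the rate, consider the illustrative data $n = 1$, $K = [0, 1]$, $f(t) = t^2$, $d(t) = 2t$ (our choice, not part of the proof). Here $P_j(x) = x_j^2 + 2 x_j (x - x_j) = x^2 - (x - x_j)^2$, so
\begin{align*}
|P_i(x) - P_j(x)| = \bigl| (x - x_j)^2 - (x - x_i)^2 \bigr| \le \max\{|x - x_i|^2, |x - x_j|^2\} \le C^2 \ell(Q_i)^2,
\end{align*}
which is $O(\ell(Q_i)^2)$ and in particular $o(\ell(Q_i))$. In general only the qualitative $o(\ell(Q_i))$ rate is available, since (W) prescribes no explicit modulus.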
[/step]
[step:Differentiability of $\tilde f$ at points of $K$ with $\nabla \tilde f|_K = d$]
Fix $x_* \in K$ and a compact neighbourhood $H \subseteq K$ of $x_*$ (e.g.\ $H := K \cap \overline{B(x_*, 1)}$).
*Case A: $x \in K$.* By definition $\tilde f(x) = f(x)$ and $\tilde f(x_*) = f(x_*)$, so
\begin{align*}
\tilde f(x) - \tilde f(x_*) - d(x_*) \cdot (x - x_*) = f(x) - f(x_*) - d(x_*) \cdot (x - x_*) = |x - x_*| R(x_*, x).
\end{align*}
By (W) applied with $a = x_*$, $b = x$, this is $o(|x - x_*|)$ as $x \to x_*$ in $K$.
*Case B: $x \in \mathbb{R}^n \setminus K$.* Let $i = i(x)$ be the reference index from Step 3, so that $\psi_i(x) > 0$ and hence $x \in Q_i^*$ by (P1). By Step 4, $|x - x_i| \le C \ell(Q_i)$.
We compute
\begin{align*}
\tilde f(x) - P_i(x) = \sum_{j \in J(x)} \psi_j(x)(P_j(x) - P_i(x)),
\end{align*}
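where $J(x) := \{j \in J : \psi_j(x) > 0\}$ has cardinality at most $N_n$ by (P1) and (D3). By the estimate of Step 4 applied to each pair $(i, j)$ with $j \in J(x)$, $|P_j(x) - P_i(x)| = o(\ell(Q_i))$. This requires $x_i, x_j \in H$, which holds once $x$ is sufficiently close to $x_*$, because every such reference point lies within $C \ell(Q_i)$ of $x$ and $\ell(Q_i) \to 0$ as $x \to x_*$ (shown below). Since $0 \le \psi_j(x) \le 1$,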
\begin{align*}
|\tilde f(x) - P_i(x)| \le \sum_{j \in J(x)} \psi_j(x)\,|P_j(x) - P_i(x)| \le N_n \cdot o(\ell(Q_i)) = o(\ell(Q_i)).
\end{align*}
We bound $\ell(Q_i)$ by $|x - x_*|$. Since $Q_i^*$ is the $\tfrac{9}{8}$-enlargement of $Q_i$, it extends $Q_i$ by $\tfrac{1}{16}\ell(Q_i)$ in each coordinate direction, so every point of $Q_i^*$ lies within $\tfrac{1}{16}\sqrt{n}\,\ell(Q_i)$ of $Q_i$. As $x \in Q_i^*$, (D2) gives
\begin{align*}
\operatorname{dist}(x, K) \ge \operatorname{dist}(Q_i, K) - \tfrac{1}{16}\sqrt{n}\,\ell(Q_i) \ge \sqrt{n}\,\ell(Q_i) - \tfrac{1}{16}\sqrt{n}\,\ell(Q_i) = \tfrac{15}{16}\sqrt{n}\,\ell(Q_i).
\end{align*}
Thus $\operatorname{dist}(x, K) \ge c \, \ell(Q_i)$ with $c := (15/16)\sqrt{n}$, and since $x_* \in K$, $|x - x_*| \ge \operatorname{dist}(x, K) \ge c \, \ell(Q_i)$. Therefore $\ell(Q_i) \le |x - x_*|/c$, and
\begin{align*}
|\tilde f(x) - P_i(x)| = o(\ell(Q_i)) = o(|x - x_*|).
\end{align*}
For $x \to x_*$ with $x \notin K$, we also have $x_i \to x_*$, since $|x_i - x_*| \le |x_i - x| + |x - x_*| \le C \ell(Q_i) + |x - x_*| \le (C/c + 1) |x - x_*|$. In particular, $x_i \in H$ for $x$ sufficiently close to $x_*$.
Compute
\begin{align*}
P_i(x) - f(x_*) - d(x_*) \cdot (x - x_*) &= f(x_i) + d(x_i) \cdot (x - x_i) - f(x_*) - d(x_*) \cdot (x - x_*) \\
&= [f(x_i) - f(x_*) - d(x_*)\cdot(x_i - x_*)] + (d(x_i) - d(x_*))\cdot(x - x_i) \\
&= |x_i - x_*| R(x_*, x_i) + (d(x_i) - d(x_*))\cdot(x - x_i).
\end{align*}
By (W) with $a = x_*$, $b = x_i$, the first term is $|x_i - x_*| \cdot o(1)$, and $|x_i - x_*| \le (C/c + 1)|x - x_*|$, so this is $o(|x - x_*|)$. The second term is bounded by $|d(x_i) - d(x_*)| \cdot |x - x_i| \le |d(x_i) - d(x_*)| \cdot C \ell(Q_i) \le |d(x_i) - d(x_*)| \cdot (C/c) \cdot |x - x_*|$, which is $o(|x - x_*|)$ by continuity of $d$ at $x_*$ (hypothesis).
Combining Case B,
\begin{align*}
\tilde f(x) - \tilde f(x_*) - d(x_*) \cdot (x - x_*) = (\tilde f(x) - P_i(x)) + (P_i(x) - f(x_*) - d(x_*)\cdot(x - x_*)) = o(|x - x_*|).
\end{align*}
In both Cases A and B, $\tilde f(x) - \tilde f(x_*) - d(x_*) \cdot (x - x_*) = o(|x - x_*|)$ as $x \to x_*$. Hence $\tilde f$ is differentiable at $x_*$ with $\nabla \tilde f(x_*) = d(x_*)$.
[/step]
[step:Continuity of $\nabla \tilde f$ at points of $K$]
Fix $x_* \in K$ and a compact neighbourhood $H \subseteq K$ of $x_*$. We show $\nabla \tilde f(x) \to d(x_*)$ as $x \to x_*$.
For $x \in K$, $\nabla \tilde f(x) = d(x)$ by the previous step, which converges to $d(x_*)$ by continuity of $d$.
For $x \in \mathbb{R}^n \setminus K$ near $x_*$, formula ($\spadesuit$) gives
\begin{align*}
\nabla \tilde f(x) = \sum_{j \in J(x)}(\nabla \psi_j(x))(P_j(x) - P_i(x)) + \sum_{j \in J(x)} \psi_j(x) d(x_j),
\end{align*}
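where $i = i(x) \in J(x)$ is the per-point reference index from Step 3. Restricting the sums from $J(U)$ to $J(x)$ loses nothing: if $\psi_j(x) = 0$, then $x$ is a minimum point of the smooth function $\psi_j \ge 0$, so $\nabla \psi_j(x) = 0$. By (P3) and (D4), $|\nabla \psi_j(x)| \le C_n / \ell(Q_j) \le 4 C_n / \ell(Q_i)$. By Step 4, $|P_j(x) - P_i(x)| = o(\ell(Q_i))$ uniformly for $x_i, x_j \in H$. The cardinality satisfies $|J(x)| \le N_n$. Hence the first sum is bounded by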
\begin{align*}
N_n \cdot \frac{4 C_n}{\ell(Q_i)} \cdot o(\ell(Q_i)) = o(1) \quad \text{as } x \to x_*,
\end{align*}
using that $\ell(Q_i) \to 0$ since $\ell(Q_i) \le |x - x_*|/c \to 0$.
The second sum is a convex combination of values $\{d(x_j) : j \in J(x)\}$ (the coefficients $\psi_j(x)$ sum to $1$ by (P2) and $\psi_j(x) > 0$ only for $j \in J(x)$). Each $x_j$ with $j \in J(x)$ satisfies $|x_j - x_*| \le |x_j - x| + |x - x_*| \le (C/c + 1)|x - x_*| \to 0$, so $x_j \to x_*$ uniformly over $j \in J(x)$. By continuity of $d$ at $x_*$, $d(x_j) \to d(x_*)$ uniformly, hence the convex combination converges to $d(x_*)$.
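As an illustration of the two limits (with data of our own choosing: $n = 1$, $K = [0, 1]$, $f(t) = t^2$, $d(t) = 2t$), one has $P_j(x) = x^2 - (x - x_j)^2$ and hence $|P_j(x) - P_i(x)| \le C^2 \ell(Q_i)^2$, so
\begin{align*}
\Bigl| \sum_{j \in J(x)} \psi_j'(x) \, (P_j(x) - P_i(x)) \Bigr| &\le N_1 \cdot \frac{4 C_1}{\ell(Q_i)} \cdot C^2 \ell(Q_i)^2 = 4 N_1 C_1 C^2 \, \ell(Q_i), \\
\Bigl| \sum_{j \in J(x)} \psi_j(x) \, 2 x_j - 2 x_* \Bigr| &\le 2 \max_{j \in J(x)} |x_j - x_*| \le 2 (C/c + 1) \, |x - x_*|,
\end{align*}
and both bounds tend to $0$ as $x \to x_*$.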
Combining,
\begin{align*}
\nabla \tilde f(x) \to d(x_*) = \nabla \tilde f(x_*) \quad \text{as } x \to x_*, \, x \notin K.
\end{align*}
Thus $\nabla \tilde f$ is continuous at every $x_* \in K$. Combined with Step 3 ($\tilde f \in C^\infty(\mathbb{R}^n \setminus K)$, so $\nabla \tilde f$ is continuous on the open complement) and the differentiability established in Step 5, we conclude $\tilde f \in C^1(\mathbb{R}^n)$, with $\tilde f|_K = f$ and $\nabla \tilde f|_K = d$.
This completes the proof.
[/step]