Properties of the Subdifferential — Statement & Proof

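Theorem

Let $U \subseteq \mathbb{R}^n$ be an open convex set and let $f: U \to \mathbb{R}$ be convex. Then: (i) for every $x \in U$, the subdifferential $\partial f(x)$ is nonempty, compact, and convex; (ii) for all $x, y \in U$, $p \in \partial f(x)$, and $q \in \partial f(y)$, one has $(p - q) \cdot (x - y) \ge 0$; (iii) if $f$ is differentiable at $x \in U$, then $\partial f(x) = \{\nabla f(x)\}$; (iv) the graph $\{(x, p) : x \in U, p \in \partial f(x)\}$ is closed in $U \times \mathbb{R}^n$.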

Proof

[proofplan] The four properties are proved separately but use a common toolkit: the supporting hyperplane theorem at a boundary point of the epigraph (for non-emptiness), the local Lipschitz bound from [Convex Functions Are Locally Lipschitz](/theorems/3086) (for boundedness of $\partial f(x)$ and, through continuity of $f$, for closedness of the graph), the subgradient inequality applied symmetrically at two points (for monotonicity), and the first-order Taylor expansion (for the differentiable case). Convexity and closedness of $\partial f(x)$ follow directly because the defining inequality is affine in $p$. [/proofplan]

[step:Set up the subgradient inequality and the local Lipschitz constant] Recall that for a convex function $f: U \to \mathbb{R}$ on an open convex set $U \subseteq \mathbb{R}^n$, the subdifferential at $x \in U$ is the set \begin{align*} \partial f(x) := \{p \in \mathbb{R}^n : f(y) \ge f(x) + p \cdot (y - x) \text{ for every } y \in U\}. \end{align*} The defining inequality is the **subgradient inequality**.

By [Convex Functions Are Locally Lipschitz](/theorems/3086), for every $x \in U$ there exists $r_0 > 0$ with $\overline{B}(x, r_0) \subseteq U$ and a constant $L = L(x) > 0$ such that \begin{align*} |f(y) - f(z)| \le L |y - z| \qquad \text{for all } y, z \in \overline{B}(x, r_0). \end{align*} We refer to $L$ as the **local Lipschitz constant of $f$ at $x$** and to $r_0$ as the **local Lipschitz radius**. These quantities depend only on the chosen compact neighbourhood of $x$ and on $f$. [/step]

[step:Prove $\partial f(x)$ is nonempty by separating the epigraph] Define the epigraph of $f$ over $U$ by \begin{align*} \operatorname{epi}(f) := \{(y, t) \in U \times \mathbb{R} : t \ge f(y)\} \subseteq \mathbb{R}^{n+1}. \end{align*} We treat $\operatorname{epi}(f)$ as a subset of $\mathbb{R}^{n+1}$ equipped with its standard inner product.

[claim:$\operatorname{epi}(f)$ is convex.] [/claim]

*Proof of claim.* Let $(y_1, t_1), (y_2, t_2) \in \operatorname{epi}(f)$ and $\lambda \in [0, 1]$. Then $y_1, y_2 \in U$ and $t_i \ge f(y_i)$ for $i = 1, 2$.
By convexity of $U$, $\lambda y_1 + (1 - \lambda) y_2 \in U$. By convexity of $f$, \begin{align*} f(\lambda y_1 + (1 - \lambda) y_2) &\le \lambda f(y_1) + (1 - \lambda) f(y_2) \\ &\le \lambda t_1 + (1 - \lambda) t_2. \end{align*} Hence $(\lambda y_1 + (1 - \lambda) y_2, \lambda t_1 + (1 - \lambda) t_2) \in \operatorname{epi}(f)$. $\square$

Fix $x \in U$. The point $(x, f(x))$ belongs to $\operatorname{epi}(f)$ but not to its interior: for every $\varepsilon > 0$, the point $(x, f(x) - \varepsilon)$ fails $t \ge f(x)$ and so lies outside $\operatorname{epi}(f)$, and these points come arbitrarily close to $(x, f(x))$. Therefore $(x, f(x))$ is a boundary point of $\operatorname{epi}(f)$ in $\mathbb{R}^{n+1}$. By the supporting hyperplane theorem applied to the convex set $\operatorname{epi}(f) \subseteq \mathbb{R}^{n+1}$ at the boundary point $(x, f(x))$, there exists a nonzero vector $(a, b) \in \mathbb{R}^n \times \mathbb{R}$ such that \begin{align*} a \cdot (y - x) + b (t - f(x)) \le 0 \qquad \text{for every } (y, t) \in \operatorname{epi}(f). \end{align*}

Taking $y = x$ and $t = f(x) + s$ with $s > 0$ gives $b s \le 0$ for every $s > 0$, so $b \le 0$. We claim $b < 0$. Suppose for contradiction $b = 0$. Then $a \cdot (y - x) \le 0$ for every $y \in U$. Since $x \in U$ and $U$ is open, $x + \delta a \in U$ for some $\delta > 0$; taking $y = x + \delta a$ gives $\delta |a|^2 \le 0$, hence $a = 0$. This contradicts $(a, b) \ne 0$. Hence $b < 0$.

Set $p := -a / b \in \mathbb{R}^n$. For $y \in U$, the point $(y, f(y))$ lies in $\operatorname{epi}(f)$, so taking $t = f(y)$ in the inequality gives \begin{align*} a \cdot (y - x) + b (f(y) - f(x)) \le 0. \end{align*} Dividing by $-b > 0$ and rearranging, \begin{align*} f(y) - f(x) \ge -\frac{a}{b} \cdot (y - x) = p \cdot (y - x). \end{align*} Hence $p \in \partial f(x)$, so $\partial f(x) \ne \varnothing$. [/step]

[step:Prove $\partial f(x)$ is convex from the linearity of the defining inequality] Let $p_1, p_2 \in \partial f(x)$ and $\lambda \in [0, 1]$.
For every $y \in U$, the subgradient inequalities give \begin{align*} f(y) &\ge f(x) + p_1 \cdot (y - x), \\ f(y) &\ge f(x) + p_2 \cdot (y - x). \end{align*} Multiplying the first by $\lambda$ and the second by $1 - \lambda$ (both nonnegative), and adding: \begin{align*} \lambda f(y) + (1 - \lambda) f(y) \ge \lambda f(x) + (1 - \lambda) f(x) + (\lambda p_1 + (1 - \lambda) p_2) \cdot (y - x), \end{align*} i.e., \begin{align*} f(y) \ge f(x) + (\lambda p_1 + (1 - \lambda) p_2) \cdot (y - x). \end{align*} Hence $\lambda p_1 + (1 - \lambda) p_2 \in \partial f(x)$, so $\partial f(x)$ is convex. [/step]

[step:Prove $\partial f(x)$ is bounded by the local Lipschitz constant] Let $L = L(x)$ and $r_0 = r_0(x)$ be as in Step 1. Fix $p \in \partial f(x)$. We show $|p| \le L$. If $p = 0$ the bound is automatic. Otherwise, set $y := x + (r_0/2) \, p / |p|$. Then $|y - x| = r_0/2 \le r_0$, so $y \in \overline{B}(x, r_0) \subseteq U$. The subgradient inequality at $y$ reads \begin{align*} f(y) - f(x) \ge p \cdot (y - x) = p \cdot \frac{r_0}{2} \cdot \frac{p}{|p|} = \frac{r_0 |p|}{2}. \end{align*} On the other hand, the local Lipschitz bound (with $z = x$) gives \begin{align*} f(y) - f(x) \le |f(y) - f(x)| \le L |y - x| = \frac{L r_0}{2}. \end{align*} Combining the two inequalities, \begin{align*} \frac{r_0 |p|}{2} \le \frac{L r_0}{2}, \end{align*} hence $|p| \le L$. Therefore $\partial f(x) \subseteq \overline{B}(0, L)$, so $\partial f(x)$ is bounded. [/step]

[step:Prove $\partial f(x)$ is closed, hence compact] Let $(p_k)_{k \in \mathbb{N}} \subseteq \partial f(x)$ with $p_k \to p$ in $\mathbb{R}^n$. For each $y \in U$ and each $k$, \begin{align*} f(y) \ge f(x) + p_k \cdot (y - x). \end{align*} For fixed $y$, the right-hand side is an affine (hence continuous) function of $p_k$. Letting $k \to \infty$: \begin{align*} f(y) \ge f(x) + p \cdot (y - x). \end{align*} This holds for every $y \in U$, so $p \in \partial f(x)$. Hence $\partial f(x)$ is closed.
Since $\partial f(x)$ is closed and bounded (Step 4) in $\mathbb{R}^n$, it is compact by the Heine–Borel theorem. Together with Steps 2 and 3, this proves part (i): $\partial f(x)$ is nonempty, compact, and convex. [/step]

[step:Prove monotonicity by adding the two subgradient inequalities] Fix $x, y \in U$, $p \in \partial f(x)$, and $q \in \partial f(y)$. The subgradient inequality for $p$ at $x$, evaluated at the test point $y$: \begin{align*} f(y) \ge f(x) + p \cdot (y - x). \tag{$\ast$} \end{align*} The subgradient inequality for $q$ at $y$, evaluated at the test point $x$: \begin{align*} f(x) \ge f(y) + q \cdot (x - y). \tag{$\ast\ast$} \end{align*} Adding ($\ast$) and ($\ast\ast$): \begin{align*} f(x) + f(y) \ge f(x) + f(y) + p \cdot (y - x) + q \cdot (x - y). \end{align*} Cancelling $f(x) + f(y)$ from both sides: \begin{align*} 0 \ge p \cdot (y - x) + q \cdot (x - y) = -(p - q) \cdot (x - y). \end{align*} Hence $(p - q) \cdot (x - y) \ge 0$, proving (ii). [/step]

[step:Identify $\partial f(x) = \{\nabla f(x)\}$ at points of differentiability] Suppose $f$ is (classically) differentiable at $x \in U$, with derivative $\nabla f(x) \in \mathbb{R}^n$. We first show $\nabla f(x) \in \partial f(x)$.

Let $y \in U$ and consider the function \begin{align*} \varphi: [0, 1] &\to \mathbb{R}, \\ t &\mapsto f(x + t(y - x)). \end{align*} This is well-defined because $U$ is convex, so $x + t(y - x) \in U$ for $t \in [0, 1]$. The function $\varphi$ is convex on $[0, 1]$: for $s, t \in [0, 1]$ and $\lambda \in [0, 1]$, \begin{align*} \varphi(\lambda s + (1 - \lambda) t) &= f(x + (\lambda s + (1 - \lambda) t)(y - x)) \\ &= f(\lambda (x + s(y - x)) + (1 - \lambda)(x + t(y - x))) \\ &\le \lambda f(x + s(y - x)) + (1 - \lambda) f(x + t(y - x)) \\ &= \lambda \varphi(s) + (1 - \lambda) \varphi(t).
\end{align*} A real-valued convex function on $[0, 1]$ has the property that its difference quotient $(\varphi(t) - \varphi(0))/t$ is nondecreasing in $t \in (0, 1]$ (a standard one-dimensional fact: convexity is equivalent to the slope condition). Hence \begin{align*} \varphi(1) - \varphi(0) \ge \lim_{t \to 0^+} \frac{\varphi(t) - \varphi(0)}{t} = \varphi'(0^+), \end{align*} provided the right-derivative at $0$ exists. Since $f$ is differentiable at $x$, the chain-rule limit \begin{align*} \varphi'(0^+) = \lim_{t \to 0^+} \frac{f(x + t(y - x)) - f(x)}{t} = \nabla f(x) \cdot (y - x) \end{align*} exists. Substituting, \begin{align*} f(y) - f(x) = \varphi(1) - \varphi(0) \ge \nabla f(x) \cdot (y - x), \end{align*} which is the subgradient inequality. Hence $\nabla f(x) \in \partial f(x)$.

Conversely, suppose $p \in \partial f(x)$. We show $p = \nabla f(x)$. Fix any $v \in \mathbb{R}^n$ with $|v| = 1$. Since $U$ is open, there exists $\delta > 0$ with $x + tv \in U$ for $|t| < \delta$. The subgradient inequality at $y = x + tv$: \begin{align*} f(x + tv) - f(x) \ge p \cdot (tv) = t (p \cdot v). \end{align*} By differentiability of $f$ at $x$, \begin{align*} f(x + tv) - f(x) = t \nabla f(x) \cdot v + o(t) \qquad \text{as } t \to 0. \end{align*} Substituting: \begin{align*} t \nabla f(x) \cdot v + o(t) \ge t (p \cdot v). \end{align*} For $0 < t < \delta$, dividing by $t$ and letting $t \to 0^+$: \begin{align*} \nabla f(x) \cdot v \ge p \cdot v. \end{align*} Replacing $v$ by $-v$ yields the reverse inequality $\nabla f(x) \cdot (-v) \ge p \cdot (-v)$, equivalently $\nabla f(x) \cdot v \le p \cdot v$. Combining, \begin{align*} \nabla f(x) \cdot v = p \cdot v \qquad \text{for every unit vector } v \in \mathbb{R}^n. \end{align*} Choosing $v = e_i$ (the standard basis vectors) for $i = 1, \dots, n$ gives that the components of $\nabla f(x)$ and $p$ agree, hence $p = \nabla f(x)$. This proves $\partial f(x) = \{\nabla f(x)\}$, establishing (iii).
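
As an illustration of the argument for (iii), and not part of the original proof, consider the hypothetical example $f(x) = |x|^2$ on $U = \mathbb{R}^n$, which is convex and differentiable with $\nabla f(x) = 2x$. The inclusion $\nabla f(x) \in \partial f(x)$ can be checked by hand: \begin{align*} f(y) - f(x) - 2x \cdot (y - x) = |y|^2 - 2 x \cdot y + |x|^2 = |y - x|^2 \ge 0. \end{align*} For uniqueness, if $p \in \partial f(x)$ and $|v| = 1$, the subgradient inequality at $y = x + tv$ reads \begin{align*} 2t \, (x \cdot v) + t^2 \ge t \, (p \cdot v). \end{align*} Dividing by $t > 0$ and letting $t \to 0^+$ gives $(2x - p) \cdot v \ge 0$ for every unit vector $v$; applying this to both $v$ and $-v$ forces $(2x - p) \cdot v = 0$, hence $p = 2x$. In this example the $o(t)$ term of the general argument is exactly $t^2$.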
[/step]

[step:Prove the graph of $\partial f$ is closed in $U \times \mathbb{R}^n$] Let $G := \{(x, p) : x \in U, p \in \partial f(x)\}$ denote the graph of the subdifferential. Suppose $(x_k, p_k) \in G$ with $(x_k, p_k) \to (x, p)$ in $U \times \mathbb{R}^n$ (where $U$ has its subspace topology). We show $(x, p) \in G$.

By [Convex Functions Are Locally Lipschitz](/theorems/3086), $f$ is continuous on $U$. The convergence $x_k \to x$ in $U$ therefore gives $f(x_k) \to f(x)$. Fix $y \in U$. For each $k$, since $p_k \in \partial f(x_k)$: \begin{align*} f(y) \ge f(x_k) + p_k \cdot (y - x_k). \end{align*} Each term on the right is continuous in $(x_k, p_k)$: $f(x_k) \to f(x)$ by continuity of $f$, and $p_k \cdot (y - x_k) \to p \cdot (y - x)$ by continuity of the inner product and the convergence $(x_k, p_k) \to (x, p)$. Letting $k \to \infty$: \begin{align*} f(y) \ge f(x) + p \cdot (y - x). \end{align*} Since $y \in U$ was arbitrary, $p \in \partial f(x)$. Hence $(x, p) \in G$, so $G$ is closed in $U \times \mathbb{R}^n$, proving (iv).

This completes the proofs of all four parts. [/step]
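
A worked example may help to see the four parts together; the choice $f(x) = |x|$ on $U = \mathbb{R}$ is an illustrative assumption, not taken from the proof above. At $x = 0$, the subgradient inequality reads $|y| \ge p y$ for every $y \in \mathbb{R}$, which holds exactly when $|p| \le 1$. At $x \ne 0$, $f$ is differentiable with $f'(x) = \operatorname{sgn}(x)$, so part (iii) applies. Hence \begin{align*} \partial f(x) = \begin{cases} \{-1\} & \text{if } x < 0, \\ [-1, 1] & \text{if } x = 0, \\ \{1\} & \text{if } x > 0. \end{cases} \end{align*} Each of these sets is nonempty, compact, and convex, and each lies in $[-1, 1] = \overline{B}(0, L)$ with $L = 1$, a Lipschitz constant of $|x|$, in line with (i). For (ii), take $x < 0 < y$, so that $p = -1$ and $q = 1$: \begin{align*} (p - q)(x - y) = (-2)(x - y) = 2(y - x) > 0. \end{align*} Part (iii) is visible at every $x \ne 0$, while at $x = 0$, where $f$ is not differentiable, the subdifferential is not a singleton. For (iv), the sequence $(x_k, p_k) = (1/k, 1)$ lies in the graph and converges to $(0, 1)$, which is again in the graph because $1 \in [-1, 1] = \partial f(0)$.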

Properties of the Subdifferential (Theorem # 3087)
