Subdifferential Formula for the Support Function of a Compact Convex Set

Subdifferential Formula for the Support Function of a Compact Convex Set (Theorem # 6678)

Theorem

Edit Issues Pull Requests Attributions Admin

Discussion

Proof

[proofplan] We prove both inclusions directly from the definition of the subdifferential. A maximizer $x \in C$ in direction $y$ gives a global affine minorant of the support function, so it is a subgradient at $y$. Conversely, if $x$ is a subgradient, the subgradient inequality first forces $y \cdot x = \sigma_C(y)$, and then a compact convex projection argument forces $x \in C$. These two facts identify $x$ as a maximizer of $c \mapsto y \cdot c$ over $C$. [/proofplan] [step:Verify that the support function and the exposed face are well-defined] For each $z \in \mathbb{R}^n$, define the linear functional $\ell_z: C \to \mathbb{R}$ by \begin{align*} \ell_z(c) = z \cdot c. \end{align*} Since $\ell_z$ is continuous and $C$ is compact and nonempty, $\ell_z$ attains a finite maximum on $C$. Hence $\sigma_C(z)$ is finite for every $z \in \mathbb{R}^n$, and the set \begin{align*} A_y := \operatorname{argmax}_{c \in C} y \cdot c = \{c \in C : y \cdot c = \sigma_C(y)\} \end{align*} is nonempty. [/step] [step:Show that every maximizer gives a subgradient] Let $x \in A_y$. Then $x \in C$ and $y \cdot x = \sigma_C(y)$. For an arbitrary $z \in \mathbb{R}^n$, the definition of $\sigma_C(z)$ as a maximum over $C$ gives \begin{align*} \sigma_C(z) \ge z \cdot x. \end{align*} Using $y \cdot x = \sigma_C(y)$, we rewrite the right-hand side as \begin{align*} z \cdot x = \sigma_C(y) + x \cdot (z - y). \end{align*} Therefore, for every $z \in \mathbb{R}^n$, \begin{align*} \sigma_C(z) \ge \sigma_C(y) + x \cdot (z - y). \end{align*} This is exactly the definition of $x \in \partial \sigma_C(y)$. Hence \begin{align*} A_y \subset \partial \sigma_C(y). \end{align*} [guided] Let $x \in A_y$. The meaning of $x \in A_y$ is that $x$ is a point of $C$ where the linear functional $c \mapsto y \cdot c$ reaches its maximum. Thus \begin{align*} x \in C \end{align*} and \begin{align*} y \cdot x = \sigma_C(y). \end{align*} To prove that $x$ is a subgradient of $\sigma_C$ at $y$, we must prove the global affine lower bound required by the definition of the subdifferential. That means we must show that for every $z \in \mathbb{R}^n$, \begin{align*} \sigma_C(z) \ge \sigma_C(y) + x \cdot (z - y). \end{align*} Fix $z \in \mathbb{R}^n$. Since $x \in C$ and $\sigma_C(z)$ is the maximum of $c \mapsto z \cdot c$ over all $c \in C$, the value at $x$ is bounded above by that maximum: \begin{align*} z \cdot x \le \sigma_C(z). \end{align*} Because $x$ maximizes in direction $y$, we also have $y \cdot x = \sigma_C(y)$. Therefore \begin{align*} z \cdot x = y \cdot x + x \cdot (z - y) = \sigma_C(y) + x \cdot (z - y). \end{align*} Combining the two displayed relations gives \begin{align*} \sigma_C(z) \ge \sigma_C(y) + x \cdot (z - y). \end{align*} Since $z \in \mathbb{R}^n$ was arbitrary, this proves $x \in \partial \sigma_C(y)$. [/guided] [/step] [step:Use the subgradient inequality to force equality in the support direction] Let $x \in \partial \sigma_C(y)$. By definition, for every $z \in \mathbb{R}^n$, \begin{align*} \sigma_C(z) \ge \sigma_C(y) + x \cdot (z - y). \end{align*} Taking $z = 0$ and using $\sigma_C(0) = 0$ gives \begin{align*} 0 \ge \sigma_C(y) - x \cdot y. \end{align*} Thus \begin{align*} x \cdot y \ge \sigma_C(y). \end{align*} Taking $z = 2y$ gives \begin{align*} \sigma_C(2y) \ge \sigma_C(y) + x \cdot y. \end{align*} Since $\sigma_C(2y) = 2\sigma_C(y)$, this gives \begin{align*} 2\sigma_C(y) \ge \sigma_C(y) + x \cdot y. \end{align*} Hence \begin{align*} \sigma_C(y) \ge x \cdot y. \end{align*} Therefore \begin{align*} x \cdot y = \sigma_C(y). \end{align*} [/step] [step:Prove that a subgradient must lie in the compact convex set] We prove $x \in C$. Suppose, for contradiction, that $x \notin C$. Define $d: C \to \mathbb{R}$ by \begin{align*} d(c) = |x - c|^2. \end{align*} The map $d$ is continuous, and $C$ is compact and nonempty, so there exists $c_0 \in C$ such that \begin{align*} |x - c_0|^2 = \min_{c \in C} |x - c|^2. \end{align*} Since $x \notin C$, we have $v := x - c_0 \ne 0$. For any $c \in C$, define $\gamma_c: [0,1] \to C$ by \begin{align*} \gamma_c(t) = (1-t)c_0 + tc. \end{align*} Convexity of $C$ gives $\gamma_c(t) \in C$ for every $t \in [0,1]$. The real function $\phi_c: [0,1] \to \mathbb{R}$ defined by \begin{align*} \phi_c(t) = |x - \gamma_c(t)|^2 \end{align*} has a minimum at $t = 0$. Indeed, $\gamma_c(t) \in C$ for every $t \in [0,1]$, and $c_0$ was chosen to minimize the distance from $x$ to $C$. Hence for every $h \in (0,1]$, \begin{align*} \frac{\phi_c(h)-\phi_c(0)}{h} \ge 0. \end{align*} Passing to the limit $h \downarrow 0$ gives that the right derivative at $0$ is nonnegative: \begin{align*} 0 \le \phi_c'(0) = -2(x - c_0) \cdot (c - c_0). \end{align*} Thus \begin{align*} v \cdot c \le v \cdot c_0 \end{align*} for every $c \in C$. Hence \begin{align*} \sigma_C(v) = \max_{c \in C} v \cdot c \le v \cdot c_0. \end{align*} But \begin{align*} v \cdot x = v \cdot c_0 + |v|^2 > v \cdot c_0. \end{align*} Therefore \begin{align*} v \cdot x - \sigma_C(v) > 0. \end{align*} Now rewrite the subgradient inequality as \begin{align*} x \cdot z - \sigma_C(z) \le x \cdot y - \sigma_C(y) \end{align*} for every $z \in \mathbb{R}^n$. For each $t > 0$, take $z = tv$. Since $t > 0$, \begin{align*} \sigma_C(tv) = t\sigma_C(v). \end{align*} Thus \begin{align*} t(x \cdot v - \sigma_C(v)) \le x \cdot y - \sigma_C(y) \end{align*} for every $t > 0$. The coefficient $x \cdot v - \sigma_C(v)$ is positive, so the left-hand side tends to $+\infty$ as $t \to \infty$, while the right-hand side is fixed. This contradiction proves $x \in C$. [/step] [step:Conclude that every subgradient is an exposed maximizer] Let $x \in \partial \sigma_C(y)$. The previous two steps show that $x \in C$ and \begin{align*} y \cdot x = \sigma_C(y). \end{align*} Therefore $x \in A_y$. Hence \begin{align*} \partial \sigma_C(y) \subset A_y. \end{align*} Together with the first inclusion, \begin{align*} \partial \sigma_C(y) = A_y = \operatorname{argmax}_{c \in C} y \cdot c. \end{align*} This proves the formula for every $y \in \mathbb{R}^n$. [/step]

Explore Further

Momentum Map Conservation Theorem applied Similarity Theorem for Minimal Realizations applied Single-Tape Simulation Theorem for Multi-Tape Turing Machines applied Equivalence of Nondeterministic Polynomial Time and Polynomial-Time Verification applied Bode Sensitivity Integral applied Birkhoff Normal Form Theorem applied Completeness of Bounded Quantified Boolean Satisfiability applied Moreau-Rockafellar Subdifferential Sum Rule applied

What brings you to Androma?

Start with a route through the knowledge graph.