Existence of a Supporting Hyperplane for Convex Functions

Theorem

Edit Issues Pull Requests Attributions Admin

Discussion

Proof

[proofplan] We construct the subgradient $r$ in three stages. First, we show that every convex function $f: \mathbb{R}^n \to \mathbb{R}$ possesses one-sided directional derivatives in every direction and that the resulting map $d \mapsto g(d)$ is sublinear (positively homogeneous and subadditive). Second, we apply the [Hahn-Banach Theorem](/theorems/879) to the sublinear functional $g$ to obtain a linear functional $L: \mathbb{R}^n \to \mathbb{R}$ satisfying $L(d) \leq g(d)$ for all $d \in \mathbb{R}^n$. Third, we identify $L$ with the inner product against a vector $r \in \mathbb{R}^n$ and verify that $r$ satisfies the supporting hyperplane inequality. [/proofplan] [step:Establish existence and monotonicity of the difference quotient for convex functions] Fix $x \in \mathbb{R}^n$ and a direction $d \in \mathbb{R}^n$. For $t > 0$, define the difference quotient \begin{align*} \varphi: (0, \infty) &\to \mathbb{R} \\ t &\mapsto \frac{f(x + td) - f(x)}{t}. \end{align*} We claim that $\varphi$ is nondecreasing. Let $0 < s < t$. Write $x + sd$ as a convex combination of $x$ and $x + td$: \begin{align*} x + sd = \frac{t - s}{t} \cdot x + \frac{s}{t} \cdot (x + td). \end{align*} Since $0 < s < t$, the coefficient $\lambda := s/t$ satisfies $0 < \lambda < 1$. By convexity of $f$, \begin{align*} f(x + sd) \leq (1 - \lambda) f(x) + \lambda f(x + td) = f(x) + \frac{s}{t}\bigl(f(x + td) - f(x)\bigr). \end{align*} Subtracting $f(x)$ and dividing by $s > 0$: \begin{align*} \frac{f(x + sd) - f(x)}{s} \leq \frac{f(x + td) - f(x)}{t}, \end{align*} which gives $\varphi(s) \leq \varphi(t)$. Hence $\varphi$ is nondecreasing on $(0, \infty)$. [guided] The goal of this step is to understand the behaviour of the one-sided difference quotient $t \mapsto \frac{f(x+td) - f(x)}{t}$ as $t \searrow 0$. For the limit to exist, we need some monotonicity or boundedness property. Fix $x \in \mathbb{R}^n$ and a direction $d \in \mathbb{R}^n$. Define the difference quotient \begin{align*} \varphi: (0, \infty) &\to \mathbb{R} \\ t &\mapsto \frac{f(x + td) - f(x)}{t}. \end{align*} Why should $\varphi$ be monotone? The key observation is that the point $x + sd$ (for $0 < s < t$) lies on the line segment joining $x$ and $x + td$, so convexity provides a bound on $f(x + sd)$. Explicitly, we write \begin{align*} x + sd = \frac{t - s}{t} \cdot x + \frac{s}{t} \cdot (x + td), \end{align*} which is a convex combination because the coefficient $\lambda := s/t$ satisfies $0 < \lambda < 1$. By convexity of $f$, \begin{align*} f(x + sd) &\leq (1 - \lambda) f(x) + \lambda f(x + td) \\ &= f(x) + \frac{s}{t}\bigl(f(x + td) - f(x)\bigr). \end{align*} Subtracting $f(x)$ from both sides and dividing by $s > 0$ (which preserves the inequality direction since $s > 0$): \begin{align*} \frac{f(x + sd) - f(x)}{s} \leq \frac{f(x + td) - f(x)}{t}. \end{align*} This is precisely $\varphi(s) \leq \varphi(t)$, so $\varphi$ is nondecreasing on $(0, \infty)$. This monotonicity is the engine of the entire proof: it simultaneously guarantees that the limit as $t \searrow 0$ exists and provides the subadditivity of the limiting object. [/guided] [/step] [step:Prove that the one-sided directional derivative exists and is finite] Since $\varphi$ is nondecreasing, the limit \begin{align*} g(d) := \lim_{t \to 0^+} \frac{f(x + td) - f(x)}{t} = \inf_{t > 0} \frac{f(x + td) - f(x)}{t} \end{align*} exists in $\mathbb{R} \cup \{-\infty\}$. We verify that $g(d) > -\infty$. Write $x$ as the midpoint convex combination \begin{align*} x = \frac{1}{2}(x + td) + \frac{1}{2}(x - td) \end{align*} for any $t > 0$. By convexity of $f$, \begin{align*} f(x) \leq \frac{1}{2} f(x + td) + \frac{1}{2} f(x - td). \end{align*} Rearranging: \begin{align*} f(x + td) - f(x) \geq f(x) - f(x - td) = -(f(x - td) - f(x)). \end{align*} Dividing by $t > 0$: \begin{align*} \frac{f(x + td) - f(x)}{t} \geq -\frac{f(x - td) - f(x)}{t} = -\varphi_{-d}(t), \end{align*} where $\varphi_{-d}(t) := \frac{f(x - td) - f(x)}{t}$ is the difference quotient in the direction $-d$. Since $\varphi_{-d}$ is also nondecreasing (by the result of the previous step applied to the direction $-d$), for any fixed $t_0 > 0$ and all $0 < t \leq t_0$ we have $\varphi_{-d}(t) \leq \varphi_{-d}(t_0)$. Therefore \begin{align*} \varphi(t) = \frac{f(x + td) - f(x)}{t} \geq -\varphi_{-d}(t_0) = -\frac{f(x - t_0 d) - f(x)}{t_0} > -\infty. \end{align*} Since $\varphi$ is bounded below by a finite constant and nondecreasing, $g(d) = \inf_{t > 0} \varphi(t)$ is a finite real number. [guided] We established in the previous step that $\varphi(t) = \frac{f(x+td) - f(x)}{t}$ is nondecreasing. A nondecreasing function that is bounded below has a finite infimum, so the limit $g(d) = \lim_{t \to 0^+} \varphi(t) = \inf_{t > 0} \varphi(t)$ exists — provided we can rule out $g(d) = -\infty$. Why might $g(d)$ be $-\infty$? If $\varphi(t) \to -\infty$ as $t \to 0^+$, the infimum would be $-\infty$. But $\varphi$ is nondecreasing, so this would require $\varphi(t) = -\infty$ for all $t > 0$, contradicting the fact that $f$ is real-valued. We give a cleaner quantitative bound using the reverse direction. Write $x$ as the midpoint \begin{align*} x = \frac{1}{2}(x + td) + \frac{1}{2}(x - td). \end{align*} Convexity gives $f(x) \leq \frac{1}{2} f(x + td) + \frac{1}{2} f(x - td)$, which rearranges to \begin{align*} f(x + td) - f(x) \geq -(f(x - td) - f(x)). \end{align*} Dividing by $t > 0$ and using the notation $\varphi_{-d}(t) = \frac{f(x - td) - f(x)}{t}$ for the difference quotient in the direction $-d$: \begin{align*} \varphi(t) \geq -\varphi_{-d}(t). \end{align*} The function $\varphi_{-d}$ is also nondecreasing (the monotonicity argument applies to every direction, including $-d$). Fixing any $t_0 > 0$, for all $0 < t \leq t_0$ we have $\varphi_{-d}(t) \leq \varphi_{-d}(t_0)$, hence \begin{align*} \varphi(t) \geq -\varphi_{-d}(t_0) = -\frac{f(x - t_0 d) - f(x)}{t_0}. \end{align*} The right-hand side is a fixed real number (since $f$ maps into $\mathbb{R}$), so $g(d) = \inf_{t > 0} \varphi(t) \geq -\varphi_{-d}(t_0) > -\infty$. The conclusion: for every direction $d \in \mathbb{R}^n$, the one-sided directional derivative \begin{align*} g(d) = \lim_{t \to 0^+} \frac{f(x + td) - f(x)}{t} \end{align*} exists as a finite real number. [/guided] [/step] [step:Show that the directional derivative $g$ is sublinear: positively homogeneous and subadditive] We verify two properties of the map $g: \mathbb{R}^n \to \mathbb{R}$. **Positive homogeneity.** For $\lambda > 0$ and $d \in \mathbb{R}^n$, \begin{align*} g(\lambda d) = \lim_{t \to 0^+} \frac{f(x + t\lambda d) - f(x)}{t} = \lambda \lim_{s \to 0^+} \frac{f(x + sd) - f(x)}{s} = \lambda \, g(d), \end{align*} where we substituted $s = t\lambda$ (so $s \to 0^+$ as $t \to 0^+$ since $\lambda > 0$) and used $t = s/\lambda$. **Subadditivity.** Let $d_1, d_2 \in \mathbb{R}^n$. For $t > 0$, write \begin{align*} x + t(d_1 + d_2) = \frac{1}{2}(x + 2td_1) + \frac{1}{2}(x + 2td_2). \end{align*} By convexity of $f$, \begin{align*} f(x + t(d_1 + d_2)) \leq \frac{1}{2} f(x + 2td_1) + \frac{1}{2} f(x + 2td_2). \end{align*} Subtracting $f(x) = \frac{1}{2} f(x) + \frac{1}{2} f(x)$ and dividing by $t > 0$: \begin{align*} \frac{f(x + t(d_1 + d_2)) - f(x)}{t} &\leq \frac{1}{2} \cdot \frac{f(x + 2td_1) - f(x)}{t} + \frac{1}{2} \cdot \frac{f(x + 2td_2) - f(x)}{t} \\ &= \frac{f(x + 2td_1) - f(x)}{2t} + \frac{f(x + 2td_2) - f(x)}{2t}. \end{align*} Taking $t \to 0^+$ (with $s := 2t \to 0^+$): \begin{align*} g(d_1 + d_2) \leq g(d_1) + g(d_2). \end{align*} Since $g$ is positively homogeneous and subadditive, $g$ is a **sublinear functional** on $\mathbb{R}^n$. [guided] We need to verify that the directional derivative map $g: \mathbb{R}^n \to \mathbb{R}$ is sublinear, meaning it is both positively homogeneous and subadditive. This is the key structural property that allows us to apply the Hahn-Banach Theorem in the next step. **Positive homogeneity: $g(\lambda d) = \lambda \, g(d)$ for $\lambda > 0$.** This is a direct consequence of reparametrising the limit. For $\lambda > 0$, \begin{align*} g(\lambda d) = \lim_{t \to 0^+} \frac{f(x + t\lambda d) - f(x)}{t}. \end{align*} Substituting $s = t\lambda$ (so $t = s/\lambda$ and $s \to 0^+$ as $t \to 0^+$): \begin{align*} g(\lambda d) = \lim_{s \to 0^+} \frac{f(x + sd) - f(x)}{s/\lambda} = \lambda \lim_{s \to 0^+} \frac{f(x + sd) - f(x)}{s} = \lambda \, g(d). \end{align*} **Subadditivity: $g(d_1 + d_2) \leq g(d_1) + g(d_2)$.** The idea is to express $x + t(d_1 + d_2)$ as a convex combination that splits the two directions. Observe \begin{align*} x + t(d_1 + d_2) = \frac{1}{2}(x + 2td_1) + \frac{1}{2}(x + 2td_2). \end{align*} Why this particular combination? Because it is a midpoint, so the convexity inequality has equal weights $1/2$, and the factor $2t$ in each term is chosen so that after dividing by $t$ we recover the difference quotients $\frac{f(x + 2td_i) - f(x)}{2t}$, each of which has the form $\varphi_{d_i}(2t)$ and converges to $g(d_i)$. By convexity of $f$, \begin{align*} f(x + t(d_1 + d_2)) \leq \frac{1}{2} f(x + 2td_1) + \frac{1}{2} f(x + 2td_2). \end{align*} Subtracting $f(x)$ from both sides and dividing by $t > 0$: \begin{align*} \frac{f(x + t(d_1 + d_2)) - f(x)}{t} &\leq \frac{f(x + 2td_1) - f(x)}{2t} + \frac{f(x + 2td_2) - f(x)}{2t}. \end{align*} Taking $t \to 0^+$ on both sides (setting $s = 2t \to 0^+$), the left side converges to $g(d_1 + d_2)$ and each term on the right converges to $g(d_i)$: \begin{align*} g(d_1 + d_2) \leq g(d_1) + g(d_2). \end{align*} Since $g$ satisfies both positive homogeneity and subadditivity, $g$ is a sublinear functional on the vector space $\mathbb{R}^n$. This is precisely the kind of functional that dominates linear functionals in the [Hahn-Banach Theorem](/theorems/879). [/guided] [/step] [step:Apply the Hahn-Banach Theorem to obtain a linear functional dominated by $g$] Consider the zero subspace $W = \{0\} \subset \mathbb{R}^n$ and define the linear functional \begin{align*} \ell: W &\to \mathbb{R} \\ 0 &\mapsto 0. \end{align*} Since $g(0) = \lim_{t \to 0^+} \frac{f(x) - f(x)}{t} = 0$, we have $\ell(0) = 0 = g(0)$, so $\ell(w) \leq g(w)$ for all $w \in W$. The functional $g: \mathbb{R}^n \to \mathbb{R}$ is positively homogeneous and subadditive (established in the previous step). We verify the hypotheses of the [Hahn-Banach Theorem](/theorems/879): $\mathbb{R}^n$ is a real vector space, $W = \{0\}$ is a subspace, $g$ is positively homogeneous and subadditive, $\ell$ is linear on $W$, and $\ell(w) \leq g(w)$ for all $w \in W$ (the single condition $0 \leq 0$ holds). All hypotheses are satisfied. By the [Hahn-Banach Theorem](/theorems/879) applied with $V = \mathbb{R}^n$, $W = \{0\}$, and $p = g$, there exists a linear extension \begin{align*} L: \mathbb{R}^n \to \mathbb{R} \end{align*} such that \begin{align*} L(d) \leq g(d) \quad \text{for all } d \in \mathbb{R}^n. \end{align*} [guided] We now apply the [Hahn-Banach Theorem](/theorems/879) to promote the sublinear functional $g$ into a linear functional. The [Hahn-Banach Theorem](/theorems/879) requires four inputs: 1. A real vector space $V$. 2. A subspace $W \subset V$. 3. A sublinear functional $p: V \to \mathbb{R}$ (positively homogeneous and subadditive). 4. A linear functional $\ell: W \to \mathbb{R}$ with $\ell(w) \leq p(w)$ for all $w \in W$. We choose: $V = \mathbb{R}^n$, $W = \{0\}$, $p = g$, and $\ell: \{0\} \to \mathbb{R}$ the zero map. We verify all hypotheses: - $\mathbb{R}^n$ is a real vector space. - $\{0\}$ is a subspace of $\mathbb{R}^n$. - $g$ is positively homogeneous: $g(\lambda d) = \lambda g(d)$ for all $\lambda > 0$, $d \in \mathbb{R}^n$ (shown in the previous step). - $g$ is subadditive: $g(d_1 + d_2) \leq g(d_1) + g(d_2)$ for all $d_1, d_2 \in \mathbb{R}^n$ (shown in the previous step). - $\ell(0) = 0 \leq g(0) = 0$, so the domination condition holds on $W$. The [Hahn-Banach Theorem](/theorems/879) produces a linear functional $L: \mathbb{R}^n \to \mathbb{R}$ extending $\ell$ (which is vacuous since $W = \{0\}$) and satisfying \begin{align*} L(d) \leq g(d) \quad \text{for all } d \in \mathbb{R}^n. \end{align*} One might ask: why start from the trivial subspace $\{0\}$ rather than from a more informative subspace? In $\mathbb{R}^n$, the Hahn-Banach extension from $\{0\}$ is sufficient because any linear functional on a finite-dimensional space is automatically continuous, and the extension is guaranteed to exist. Starting from $\{0\}$ keeps the argument clean and avoids the need to construct $\ell$ on a nontrivial subspace. [/guided] [/step] [step:Represent the linear functional $L$ as the inner product with a vector $r \in \mathbb{R}^n$] Since $L: \mathbb{R}^n \to \mathbb{R}$ is a linear functional on a finite-dimensional space, it is determined by its action on the standard basis. Define $r \in \mathbb{R}^n$ by \begin{align*} r_i := L(e_i) \quad \text{for } i = 1, \ldots, n, \end{align*} where $e_1, \ldots, e_n$ denotes the standard basis of $\mathbb{R}^n$. By linearity of $L$, for any $d = \sum_{i=1}^n d_i \, e_i \in \mathbb{R}^n$, \begin{align*} L(d) = \sum_{i=1}^n d_i \, L(e_i) = \sum_{i=1}^n r_i \, d_i = r \cdot d. \end{align*} The domination condition $L(d) \leq g(d)$ therefore reads \begin{align*} r \cdot d \leq g(d) = \lim_{t \to 0^+} \frac{f(x + td) - f(x)}{t} \quad \text{for all } d \in \mathbb{R}^n. \end{align*} [guided] Every linear functional on $\mathbb{R}^n$ can be represented as an inner product — this is a consequence of the finite-dimensional Riesz representation (or, more elementarily, of the fact that linear maps from $\mathbb{R}^n$ to $\mathbb{R}$ are determined by their action on a basis). Define the vector $r = (r_1, \ldots, r_n) \in \mathbb{R}^n$ by $r_i := L(e_i)$, where $e_1, \ldots, e_n$ is the standard basis of $\mathbb{R}^n$. For any $d = \sum_{i=1}^n d_i \, e_i$, linearity of $L$ gives \begin{align*} L(d) = L\Bigl(\sum_{i=1}^n d_i \, e_i\Bigr) = \sum_{i=1}^n d_i \, L(e_i) = \sum_{i=1}^n r_i \, d_i = r \cdot d. \end{align*} Substituting into the Hahn-Banach conclusion $L(d) \leq g(d)$: \begin{align*} r \cdot d \leq g(d) \quad \text{for all } d \in \mathbb{R}^n. \end{align*} This is the vector $r$ that will serve as the subgradient. The remaining task is to promote the infinitesimal inequality $r \cdot d \leq g(d)$ into the global inequality $f(y) \geq f(x) + r \cdot (y - x)$. [/guided] [/step] [step:Verify the supporting hyperplane inequality for all $y \in \mathbb{R}^n$] Fix an arbitrary $y \in \mathbb{R}^n$ and set $d := y - x$. By the monotonicity established in the first step, $\varphi(t) = \frac{f(x + td) - f(x)}{t}$ is nondecreasing on $(0, \infty)$. Evaluating at $t = 1$: \begin{align*} g(d) = \inf_{t > 0} \frac{f(x + td) - f(x)}{t} \leq \frac{f(x + d) - f(x)}{1} = f(y) - f(x). \end{align*} Combining with the domination inequality $r \cdot d \leq g(d)$: \begin{align*} r \cdot (y - x) = r \cdot d \leq g(d) \leq f(y) - f(x). \end{align*} Rearranging: \begin{align*} f(y) \geq f(x) + r \cdot (y - x). \end{align*} Since $y \in \mathbb{R}^n$ was arbitrary, this completes the proof. $\blacksquare$ [guided] We now close the argument by converting the infinitesimal bound $r \cdot d \leq g(d)$ into the global supporting hyperplane inequality. Fix any $y \in \mathbb{R}^n$ and set $d := y - x$. From the first step, the difference quotient $\varphi(t) = \frac{f(x + td) - f(x)}{t}$ is nondecreasing. The directional derivative $g(d)$ is defined as the infimum of $\varphi$ over $t > 0$. Evaluating $\varphi$ at the specific value $t = 1$: \begin{align*} g(d) = \inf_{t > 0} \varphi(t) \leq \varphi(1) = f(x + d) - f(x) = f(y) - f(x). \end{align*} This inequality is the crucial passage from infinitesimal (directional derivative) to global (function values): it uses the fact that the infimum of a nondecreasing function is at most any particular value, and $t = 1$ gives precisely $f(y) - f(x)$. Chaining the two inequalities: \begin{align*} r \cdot (y - x) \leq g(y - x) \leq f(y) - f(x). \end{align*} The first inequality is the Hahn-Banach domination, and the second is the monotonicity estimate. Rearranging yields \begin{align*} f(y) \geq f(x) + r \cdot (y - x), \end{align*} which is the desired supporting hyperplane inequality. Since $y \in \mathbb{R}^n$ was arbitrary and $x \in \mathbb{R}^n$ was fixed at the outset, the proof is complete. $\blacksquare$ [/guided] [/step]

Prerequisites (0/2 completed)

Prerequisites Graph

Interactive dependency map showing how this theorem builds on foundational concepts

Loading dependency graph...

Theorems

Hahn-Banach Theorem

Definitions & Concepts

1214

What brings you to Androma?

Start with a route through the knowledge graph.