First-Order Characterization of Convexity — Statement & Proof

First-Order Characterization of Convexity (Theorem # 6666)




Proof

[proofplan] The forward direction restricts the convex function $f$ to the line segment from $x$ to $y$ and uses the difference quotients at the initial point of the segment. Convexity gives an upper bound on these quotients in terms of $f(y)-f(x)$, and differentiability identifies their limit with $\nabla f(x)\cdot(y-x)$. For the reverse direction, we apply the assumed supporting-hyperplane inequality at the intermediate point $z=(1-\lambda)x+\lambda y$, once with target $x$ and once with target $y$. The two gradient terms cancel after multiplying by the convex weights, leaving exactly the convexity inequality. [/proofplan] [step:Derive the supporting inequality from convexity along a line segment] Assume first that $f$ is convex on $U$. Fix $x,y \in U$. Since $U$ is convex, the map \begin{align*} \gamma_{x,y}: [0,1] \to U, \qquad t \mapsto x+t(y-x) \end{align*} is well-defined. Define the real-valued function \begin{align*} \varphi_{x,y}: [0,1] \to \mathbb{R}, \qquad t \mapsto f(\gamma_{x,y}(t)). \end{align*} For every $t \in (0,1]$, convexity of $f$ applied to $\gamma_{x,y}(t)=(1-t)x+ty$ gives \begin{align*} \varphi_{x,y}(t) \leq (1-t)\varphi_{x,y}(0)+t\varphi_{x,y}(1). \end{align*} Rearranging this inequality and dividing by $t>0$ gives \begin{align*} \frac{\varphi_{x,y}(t)-\varphi_{x,y}(0)}{t} \leq \varphi_{x,y}(1)-\varphi_{x,y}(0). \end{align*} Because $f$ is differentiable at $x$, the directional derivative of $f$ at $x$ in the direction $y-x$ exists and equals $\nabla f(x)\cdot(y-x)$. Equivalently, \begin{align*} \lim_{t\downarrow 0}\frac{f(x+t(y-x))-f(x)}{t}=\nabla f(x)\cdot(y-x). \end{align*} Taking the limit as $t\downarrow 0$ in the previous inequality therefore yields \begin{align*} \nabla f(x)\cdot(y-x) \leq f(y)-f(x). \end{align*} This is precisely \begin{align*} f(y) \geq f(x)+\nabla f(x)\cdot(y-x). \end{align*} [guided] Assume that $f$ is convex on $U$, and fix two points $x,y \in U$. 
The desired inequality compares $f(y)$ with the affine approximation to $f$ at $x$, so we examine $f$ only along the line segment beginning at $x$ and ending at $y$. Since $U$ is convex, every point of this segment lies in $U$, so the map \begin{align*} \gamma_{x,y}: [0,1] \to U, \qquad t \mapsto x+t(y-x) \end{align*} is well-defined. Define \begin{align*} \varphi_{x,y}: [0,1] \to \mathbb{R}, \qquad t \mapsto f(\gamma_{x,y}(t)). \end{align*} The point $\gamma_{x,y}(t)$ is the convex combination $(1-t)x+ty$. Therefore convexity of $f$ gives, for each $t \in (0,1]$, \begin{align*} f(x+t(y-x)) \leq (1-t)f(x)+tf(y). \end{align*} In terms of $\varphi_{x,y}$, this is \begin{align*} \varphi_{x,y}(t) \leq (1-t)\varphi_{x,y}(0)+t\varphi_{x,y}(1). \end{align*} We now isolate a difference quotient at the point $t=0$. Subtracting $\varphi_{x,y}(0)$ from both sides gives \begin{align*} \varphi_{x,y}(t)-\varphi_{x,y}(0) \leq t(\varphi_{x,y}(1)-\varphi_{x,y}(0)). \end{align*} Since $t>0$, division by $t$ preserves the inequality: \begin{align*} \frac{\varphi_{x,y}(t)-\varphi_{x,y}(0)}{t} \leq \varphi_{x,y}(1)-\varphi_{x,y}(0). \end{align*} The left-hand side is the one-sided difference quotient of $f$ at $x$ in the direction $y-x$. Because $f$ is differentiable at $x$, its first-order expansion at $x$ implies \begin{align*} \lim_{t\downarrow 0}\frac{f(x+t(y-x))-f(x)}{t}=\nabla f(x)\cdot(y-x). \end{align*} The right-hand side is independent of $t$, so taking $t\downarrow 0$ gives \begin{align*} \nabla f(x)\cdot(y-x) \leq f(y)-f(x). \end{align*} Rearranging gives the supporting inequality \begin{align*} f(y) \geq f(x)+\nabla f(x)\cdot(y-x). \end{align*} [/guided] [/step] [step:Use the supporting inequality at an intermediate point] Conversely, assume that for every $a,b \in U$, \begin{align*} f(b) \geq f(a)+\nabla f(a)\cdot(b-a). \end{align*} Fix $x,y \in U$ and $\lambda \in [0,1]$. If $\lambda=0$ or $\lambda=1$, the convexity inequality is an equality. 
Hence assume $\lambda \in (0,1)$. Define the intermediate point \begin{align*} z := (1-\lambda)x+\lambda y. \end{align*} Since $U$ is convex, $z \in U$. Applying the assumed inequality with $a=z$ and $b=x$ gives \begin{align*} f(x) \geq f(z)+\nabla f(z)\cdot(x-z). \end{align*} Applying it with $a=z$ and $b=y$ gives \begin{align*} f(y) \geq f(z)+\nabla f(z)\cdot(y-z). \end{align*} Multiply the first inequality by $1-\lambda$ and the second by $\lambda$; both weights are positive, so the inequalities are preserved. Adding the results and using $(1-\lambda)f(z)+\lambda f(z)=f(z)$ gives \begin{align*} (1-\lambda)f(x)+\lambda f(y) \geq f(z)+\nabla f(z)\cdot((1-\lambda)(x-z)+\lambda(y-z)). \end{align*} By the definition of $z$, \begin{align*} (1-\lambda)(x-z)+\lambda(y-z)=(1-\lambda)x+\lambda y-z=0. \end{align*} Therefore \begin{align*} (1-\lambda)f(x)+\lambda f(y) \geq f(z). \end{align*} Substituting $z=(1-\lambda)x+\lambda y$ gives \begin{align*} f((1-\lambda)x+\lambda y) \leq (1-\lambda)f(x)+\lambda f(y). \end{align*} Since $x,y \in U$ and $\lambda \in [0,1]$ were arbitrary, $f$ is convex on $U$. [/step]
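
As a concrete check of the criterion, here is a small worked example; the functions are chosen purely for illustration and are not part of the original proof. Take $U=\mathbb{R}^n$ and $f(x)=\|x\|^2$, for which $\nabla f(x)=2x$. Then for all $x,y \in \mathbb{R}^n$, \begin{align*} f(y)-f(x)-\nabla f(x)\cdot(y-x) = \|y\|^2-\|x\|^2-2x\cdot y+2\|x\|^2 = \|y-x\|^2 \geq 0, \end{align*} so the supporting inequality holds at every pair of points, and the reverse direction shows that $f$ is convex. By contrast, for the illustrative function $g(t)=-t^2$ on $\mathbb{R}$ one has $g'(t)=-2t$ and \begin{align*} g(y)-g(x)-g'(x)(y-x) = -y^2+x^2+2x(y-x) = -(y-x)^2, \end{align*} which is negative whenever $y \neq x$. The supporting inequality fails, so by the forward direction $g$ is not convex on $\mathbb{R}$.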

