Convexity begins with a promise about interpolation: if two states are allowed, then every mixture between them is allowed, and the value of a function at a mixture should not exceed the same mixture of the endpoint values. This promise is powerful because it turns a global problem into a problem controlled by line segments. In optimization, it is the difference between a landscape where a local minimum may be a trap and a landscape where every local minimum is already global.
text
admin
The first failure to keep in mind is the function $f: \mathbb{R} \to \mathbb{R}$ given by $f(x) = x^4 - x^2$. Its graph has two valleys and a hill between them. A descent method started near one valley has no reason to find the other, and a tangent line near the hill gives misleading information about the global shape. Convex functions are designed to exclude exactly this kind of hidden non-global behaviour.
text
admin
[example: A Non-Convex Double Well]
Let $f:\mathbb{R}\to\mathbb{R}$ be defined by $f(x)=x^4-x^2$. First note that the symmetric pair $-1,1$ does not reveal non-convexity, because
\begin{align*}
f(-1)=(-1)^4-(-1)^2=1-1=0
\end{align*}
and
\begin{align*}
f(1)=1^4-1^2=1-1=0.
\end{align*}
Their midpoint is $0$, and
\begin{align*}
f(0)=0^4-0^2=0.
\end{align*}
Thus
\begin{align*}
f(0)=0=\frac{0+0}{2}=\frac{f(-1)+f(1)}{2},
\end{align*}
so this pair satisfies the midpoint equality.
Instead take $x=-1/\sqrt{2}$ and $y=1/\sqrt{2}$. Then
\begin{align*}
x^2=\left(-\frac{1}{\sqrt{2}}\right)^2=\frac{1}{2}
\end{align*}
and
\begin{align*}
x^4=(x^2)^2=\left(\frac{1}{2}\right)^2=\frac{1}{4},
\end{align*}
so
\begin{align*}
f(x)=x^4-x^2=\frac{1}{4}-\frac{1}{2}=-\frac{1}{4}.
\end{align*}
The same calculation gives
\begin{align*}
y^2=\left(\frac{1}{\sqrt{2}}\right)^2=\frac{1}{2}
\end{align*}
and
\begin{align*}
y^4=\frac{1}{4},
\end{align*}
hence
\begin{align*}
f(y)=\frac{1}{4}-\frac{1}{2}=-\frac{1}{4}.
\end{align*}
Their midpoint is
\begin{align*}
\frac{x+y}{2}=\frac{-1/\sqrt{2}+1/\sqrt{2}}{2}=0,
\end{align*}
so
\begin{align*}
f\left(\frac{x+y}{2}\right)=f(0)=0.
\end{align*}
The average endpoint value is
\begin{align*}
\frac{f(x)+f(y)}{2}=\frac{-1/4-1/4}{2}=\frac{-1/2}{2}=-\frac{1}{4}.
\end{align*}
Therefore
\begin{align*}
f\left(\frac{x+y}{2}\right)=0>-\frac{1}{4}=\frac{f(x)+f(y)}{2}.
\end{align*}
At $t=1/2$, the convexity inequality would require the midpoint value to be at most the average endpoint value, so this pair proves that $f(x)=x^4-x^2$ is not convex on $\mathbb{R}$ even though it is a polynomial.
[/example]
example
admin
The point of convexity is not smoothness. It is a compatibility condition between the geometry of the domain and the arithmetic of function values. A convex function may have corners, may fail to be differentiable, and may take the value $+\infty$ in the extended-valued formalism. What it gives in return is a rigid global geometry: epigraphs become convex sets, subgradients replace missing derivatives, and minimization becomes far more stable.
text
admin
## Definition
h2
admin
The function definition is the centre of the subject. It asks for one inequality to hold along every line segment in the domain: the value at an interpolated point must sit below the interpolated endpoint values.
text
admin
[definition: Convex Function]
Let $V$ be a real [vector space](/page/Vector%20Space), let $C \subset V$ be a set such that $(1-t)x+ty\in C$ for every $x,y\in C$ and every $t\in[0,1]$, and let $f: C \to \mathbb{R}$ be a function. The function $f$ is convex on $C$ if for every $x,y \in C$ and every $t \in [0,1]$,
\begin{align*}
f((1-t)x+ty) \le (1-t)f(x)+t f(y).
\end{align*}
[/definition]
definition
admin
This definition is deliberately global: it quantifies over every segment in the domain. It also makes no reference to derivatives, which is why it applies equally well to smooth functions, piecewise-linear functions, norms, and many functions with corners. The definition above has already built the segment condition into its hypotheses; the next section separates that domain condition from the function inequality and then sharpens the function side in two standard directions.
text
admin
## Domains, Strictness, and Infinite Values
h2
admin
### Convex Domains
h3
admin
Before the inequality for function values can even be tested, the interpolated point must remain admissible. If a feasible region has a gap, the segment between two feasible points may leave the region, and the chord comparison has no value there. Convex sets isolate the domains where line-segment arguments can run without leaving the problem.
text
admin
[definition: Convex Set]
Let $V$ be a real vector space. A set $C \subset V$ is convex if for every $x,y \in C$ and every $t \in [0,1]$, the point $(1-t)x + ty$ belongs to $C$.
[/definition]
definition
admin
Once the domain has this segment geometry, the function inequality can be applied repeatedly without leaving the problem. Some optimization problems require more than the absence of hidden bumps: they need to rule out flat valleys where different points have the same best value.
text
admin
### Strict Convexity
h3
admin
If an optimization problem has a whole flat valley of best points, convexity still gives global control but it does not select a unique answer. Strict convexity is the condition that every genuine mixture improves on the straight-line interpolation of endpoint values. It is the right strengthening when the geometry should rule out ties between distinct feasible points.
text
admin
[definition: Strictly Convex Function]
Let $V$ be a real vector space, let $C \subset V$ be convex, and let $f: C \to \mathbb{R}$ be a function. The function $f$ is strictly convex on $C$ if for every distinct $x,y \in C$ and every $t \in (0,1)$,
\begin{align*}
f((1-t)x+ty) < (1-t)f(x)+t f(y).
\end{align*}
[/definition]
definition
admin
Strict convexity is stronger than convexity, but it is not the same as having positive curvature everywhere. The function $x \mapsto x^4$ is strictly convex on $\mathbb{R}$ even though its second derivative vanishes at $0$. Another limitation of the finite-valued definition is that it keeps the feasible set outside the function, while constrained optimization often works better when the objective and the constraint are part of one object.
text
admin
### Extended-Valued Functions
h3
admin
Many convex problems are constrained: some points are feasible and others should never be chosen. Keeping the feasible set separate from the objective works, but it hides the fact that constraints and costs often obey the same convex calculus. Extended-valued functions encode forbidden points by assigning them the value $+\infty$, turning feasibility into an infinite penalty.
text
admin
[definition: Extended-Valued Convex Function]
Let $V$ be a real vector space. A function $f: V \to \mathbb{R} \cup \{+\infty\}$ is extended-valued convex if for every $x,y \in V$ and every $t \in [0,1]$,
\begin{align*}
f((1-t)x+ty) \le (1-t)f(x)+t f(y),
\end{align*}
where arithmetic in $\mathbb{R}\cup\{+\infty\}$ is interpreted by $a+(+\infty)=+\infty$ for $a\in\mathbb{R}\cup\{+\infty\}$, $s(+\infty)=+\infty$ for $s>0$, and $0\cdot(+\infty)=0$.
[/definition]