Minkowski's Theorem on Extreme Points — Statement & Proof

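Theorem

Let $C \subset \mathbb{R}^n$ be a non-empty compact convex set, and let $\operatorname{ext} C$ denote its set of extreme points. Then \begin{align*} C=\operatorname{conv}(\operatorname{ext} C). \end{align*}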

Proof

[proofplan] We prove the result by induction on the affine dimension of $C$. The main geometric input is that every relative boundary point of a compact convex set lies in a proper exposed face, and such a face has strictly smaller affine dimension. Relative interior points are then reduced to boundary points by intersecting $C$ with a line through the point and writing the point as a convex combination of the two endpoints of the resulting compact segment. [/proofplan] [step:Record the elementary properties of exposed faces and extreme points] Let $A := \operatorname{aff} C$ be the [affine hull](/page/Affine%20Hull) of $C$, and let $d := \dim A$. Let $\tau_A$ denote the relative topology on the affine space $A$. Define $\operatorname{relint}_A C$ to be the interior of $C$ with respect to $\tau_A$, and define $\partial_A C := \overline{C}^{\tau_A} \setminus \operatorname{relint}_A C$ to be the boundary of $C$ relative to $A$. Since $C$ is compact in $\mathbb{R}^n$, it is closed in $\mathbb{R}^n$, so $\overline{C}^{\tau_A}=C$. For a non-zero linear functional $\ell: \mathbb{R}^n \to \mathbb{R}$, define the exposed face of $C$ associated to $\ell$ by \begin{align*} F_\ell := \{x \in C : \ell(x)=\max_{y \in C}\ell(y)\}. \end{align*} Since $C$ is compact and $\ell$ is continuous, the maximum exists, so $F_\ell$ is non-empty. Also $F_\ell$ is compact because it is a closed subset of the compact set $C$, and it is convex: if $x_1,x_2 \in F_\ell$ and $t \in [0,1]$, then $(1-t)x_1+t x_2 \in C$ by convexity of $C$, and \begin{align*} \ell((1-t)x_1+t x_2)=(1-t)\ell(x_1)+t\ell(x_2)=\max_{y \in C}\ell(y). \end{align*} We next note that extreme points of $F_\ell$ are extreme points of $C$. Let $p \in \operatorname{ext} F_\ell$. Suppose $u,v \in C$ and $t \in (0,1)$ satisfy \begin{align*} p=(1-t)u+tv. \end{align*} Applying $\ell$ gives \begin{align*} \max_{y \in C}\ell(y) = \ell(p) = (1-t)\ell(u)+t\ell(v) \leq (1-t)\max_{y \in C}\ell(y)+t\max_{y \in C}\ell(y) = \max_{y \in C}\ell(y). \end{align*} The inequality is therefore an equality. 
Since $t \in (0,1)$ and both $\ell(u),\ell(v)$ are at most the maximum, equality forces \begin{align*} \ell(u)=\ell(v)=\max_{y \in C}\ell(y). \end{align*} Thus $u,v \in F_\ell$. Because $p$ is extreme in $F_\ell$, we get $u=v=p$. Hence $p \in \operatorname{ext} C$. [guided] The purpose of this step is to make induction compatible with faces. A face is useful only if its extreme points remain extreme points of the original set. Let $A := \operatorname{aff} C$ be the affine hull of $C$, and let $d := \dim A$. For a non-zero linear functional $\ell: \mathbb{R}^n \to \mathbb{R}$, define \begin{align*} F_\ell := \{x \in C : \ell(x)=\max_{y \in C}\ell(y)\}. \end{align*} The maximum exists because $C$ is compact and $\ell$ is continuous. The set $F_\ell$ is non-empty by existence of the maximum, compact because it is a closed subset of the compact set $C$, and convex because if $x_1,x_2 \in F_\ell$ and $t \in [0,1]$, then \begin{align*} \ell((1-t)x_1+t x_2) = (1-t)\ell(x_1)+t\ell(x_2) = \max_{y \in C}\ell(y). \end{align*} Since $C$ is convex, $(1-t)x_1+t x_2 \in C$, so this point lies in $F_\ell$. Now let $p \in \operatorname{ext} F_\ell$. We prove that $p$ is also extreme in $C$. Suppose $u,v \in C$ and $t \in (0,1)$ satisfy \begin{align*} p=(1-t)u+tv. \end{align*} Because $p \in F_\ell$, we have $\ell(p)=\max_{y \in C}\ell(y)$. Applying $\ell$ to the convex combination gives \begin{align*} \max_{y \in C}\ell(y) = \ell(p) = (1-t)\ell(u)+t\ell(v) \leq (1-t)\max_{y \in C}\ell(y)+t\max_{y \in C}\ell(y) = \max_{y \in C}\ell(y). \end{align*} The middle inequality can be an equality only if both $u$ and $v$ also attain the same maximum, because $t$ and $1-t$ are positive. Therefore \begin{align*} \ell(u)=\ell(v)=\max_{y \in C}\ell(y), \end{align*} so $u,v \in F_\ell$. Since $p$ is extreme in $F_\ell$, the equality $p=(1-t)u+tv$ implies $u=v=p$. This is exactly the definition of $p \in \operatorname{ext} C$. 
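For a concrete instance of this step, consider the following sketch. The set $C=[0,1]^2 \subset \mathbb{R}^2$ and the functional $\ell(x_1,x_2)=x_1$ are assumptions chosen here purely for illustration; they do not appear elsewhere in the proof. \begin{align*} \max_{y \in C}\ell(y) &= 1, \\ F_\ell &= \{x \in C : x_1=1\} = \{1\}\times[0,1], \\ \operatorname{ext} F_\ell &= \{(1,0),(1,1)\}, \\ \operatorname{ext} C &= \{(0,0),(1,0),(0,1),(1,1)\}. \end{align*} In this example $F_\ell$ is a compact convex segment, and both of its extreme points are vertices of the square, so $\operatorname{ext} F_\ell \subset \operatorname{ext} C$, in agreement with the argument above.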
[/guided] [/step] [step:Prove the theorem by induction on affine dimension] We prove the assertion for every non-empty compact convex set $C \subset \mathbb{R}^n$ by induction on $d=\dim(\operatorname{aff} C)$. If $d=0$, then $C=\{p\}$ for a point $p \in \mathbb{R}^n$. The point $p$ is extreme in $C$, and hence \begin{align*} C=\{p\}=\operatorname{conv}(\operatorname{ext} C). \end{align*} Assume the theorem has been proved for all non-empty compact convex subsets of $\mathbb{R}^n$ of affine dimension strictly less than $d$, and let $C$ be non-empty, compact, and convex with $\dim(\operatorname{aff} C)=d \geq 1$. [/step] [step:Place every relative boundary point in a lower-dimensional exposed face] Let $x \in \partial_A C$, where $\partial_A C$ denotes the boundary of $C$ relative to the affine space $A=\operatorname{aff} C$. We claim that there exists a non-zero linear functional $\ell: \mathbb{R}^n \to \mathbb{R}$ such that \begin{align*} x \in F_\ell \quad\text{and}\quad F_\ell \neq C. \end{align*} The set $C$ is closed and convex in the finite-dimensional affine space $A$. By the [Supporting Hyperplane Theorem](/theorems/???), applied in the affine space $A$, there exists an affine hyperplane $H \subset A$ through $x$ such that $C$ is contained in one of the two closed half-spaces of $A$ determined by $H$. Concretely, fix a point $y_0 \in A$. Then there are a non-zero linear functional $L: A_0 \to \mathbb{R}$ on the direction space $A_0 := A-A$ and real numbers $a,b \in \mathbb{R}$ such that the affine functional $h: A \to \mathbb{R}$ defined by $h(y)=L(y-y_0)+b$ satisfies \begin{align*} H=\{y \in A : h(y)=a\}, \qquad h(x)=a, \qquad h(y)\leq a \text{ for all } y \in C. \end{align*} Extend $L$ from $A_0$ to a linear functional $\ell: \mathbb{R}^n \to \mathbb{R}$; it is non-zero because $L$ is. For every $y \in C$, subtracting the common affine constant gives \begin{align*} \ell(y)-\ell(x) = L(y-x) = h(y)-h(x) \leq 0. 
\end{align*} Hence $\ell(y)\leq \ell(x)$ for every $y \in C$, so $\ell$ attains its maximum over $C$ at $x$, and therefore \begin{align*} x \in F_\ell. \end{align*} Every $y \in F_\ell$ satisfies $\ell(y)=\ell(x)$, hence $h(y)=h(x)=a$, so $F_\ell \subset C \cap H$. Since $H$ is a proper affine hyperplane of $A=\operatorname{aff} C$, the set $C$ is not contained in $H$, and therefore $F_\ell \neq C$. Moreover, $\operatorname{aff} F_\ell \subset H$ and $\dim H=d-1$, so \begin{align*} \dim(\operatorname{aff} F_\ell)\leq d-1<d. \end{align*} By the induction hypothesis applied to the non-empty compact convex set $F_\ell$, \begin{align*} F_\ell=\operatorname{conv}(\operatorname{ext} F_\ell). \end{align*} The previous step gives $\operatorname{ext} F_\ell \subset \operatorname{ext} C$, so \begin{align*} x \in F_\ell = \operatorname{conv}(\operatorname{ext} F_\ell) \subset \operatorname{conv}(\operatorname{ext} C). \end{align*} [guided] We now handle points on the relative boundary. The reason boundary points are easier is that a supporting hyperplane cuts out a smaller convex set that still contains the point. Let $x \in \partial_A C$, where $\partial_A C$ denotes the boundary taken inside the affine space $A=\operatorname{aff} C$. The set $C$ is closed and convex in the finite-dimensional affine space $A$. We apply the [Supporting Hyperplane Theorem](/theorems/???) in $A$. Its hypotheses are satisfied: closedness follows from compactness of $C$ in $\mathbb{R}^n$, convexity is a hypothesis of the theorem, and $x$ is a relative boundary point by assumption. Therefore there is an affine hyperplane $H \subset A$ through $x$ supporting $C$. Concretely, there are a non-zero linear functional $L: A_0 \to \mathbb{R}$ on the direction space $A_0 := A-A$, a fixed point $y_0 \in A$, and real numbers $a,b \in \mathbb{R}$ such that the affine functional $h: A \to \mathbb{R}$ defined by $h(y)=L(y-y_0)+b$ satisfies \begin{align*} H=\{y \in A : h(y)=a\}, \qquad h(x)=a, \qquad h(y)\leq a \text{ for all } y \in C. 
\end{align*} The geometric meaning is that $C$ lies entirely on one side of $H$, while $x$ lies on the hyperplane itself. Let $\ell: \mathbb{R}^n \to \mathbb{R}$ be a linear extension of $L$ from $A_0$ to $\mathbb{R}^n$. For every $y \in C$, the affine constants in $h(y)$ and $h(x)$ cancel, so \begin{align*} \ell(y)-\ell(x) = L(y-x) = h(y)-h(x) \leq 0. \end{align*} Thus $\ell(y)\leq \ell(x)$ for all $y \in C$. Hence the exposed face \begin{align*} F_\ell := \{y \in C : \ell(y)=\max_{z \in C}\ell(z)\} \end{align*} contains $x$, because $x$ is one of the points where $\ell$ attains its maximum on $C$. This face is proper. Indeed, every $y \in F_\ell$ satisfies $\ell(y)=\ell(x)$, so $h(y)=h(x)=a$ and hence $F_\ell \subset C \cap H$. If $F_\ell=C$, then $C \subset H$, which would imply $\operatorname{aff} C \subset H$. But $\operatorname{aff} C=A$, while $H$ is a proper affine hyperplane of $A$, a contradiction. Thus $F_\ell \neq C$. Moreover, $\operatorname{aff} F_\ell \subset H$ and $\dim H=d-1$, so \begin{align*} \dim(\operatorname{aff} F_\ell)\leq d-1<d. \end{align*} Now the induction hypothesis applies to $F_\ell$: it is non-empty, compact, convex, and has smaller affine dimension. Hence \begin{align*} F_\ell=\operatorname{conv}(\operatorname{ext} F_\ell). \end{align*} From the face argument proved earlier, every extreme point of $F_\ell$ is an extreme point of $C$. Therefore \begin{align*} x \in F_\ell = \operatorname{conv}(\operatorname{ext} F_\ell) \subset \operatorname{conv}(\operatorname{ext} C). \end{align*} So every relative boundary point of $C$ lies in the convex hull of the extreme points of $C$. [/guided] [/step] [step:Express every relative interior point as a convex combination of two relative boundary points] Let $x \in \operatorname{relint}_A C$. Choose a non-zero vector $v$ in the direction space of $A$; such a vector exists because $d \geq 1$. Define \begin{align*} I := \{s \in \mathbb{R} : x+s v \in C\}. \end{align*} Let $\gamma: \mathbb{R} \to A$ be the affine map $\gamma(s)=x+sv$. The set $I$ is non-empty because $0 \in I$. It is closed because $I=\gamma^{-1}(C)$, the map $\gamma$ is continuous, and $C$ is closed. 
It is bounded because $C$ is compact, hence bounded: choosing $R>0$ such that $C \subset B(0,R)$, every $s \in I$ satisfies $|x+sv|\leq R$, and therefore \begin{align*} |s|\,|v| \leq |x+sv|+|x| \leq R+|x|. \end{align*} Since $v\neq 0$, this gives $|s|\leq (R+|x|)/|v|$. Thus $I$ is compact in $\mathbb{R}$. The set $I$ is convex because $C$ is convex and $\gamma$ is affine. Since $x \in \operatorname{relint}_A C$, there exists $\varepsilon>0$ such that \begin{align*} x+s v \in C \quad \text{for all } s \in (-\varepsilon,\varepsilon), \end{align*} so $I$ contains an interval around $0$. Thus $I=[\alpha,\beta]$ for [real numbers](/page/Real%20Numbers) $\alpha<0<\beta$. Define \begin{align*} y := x+\alpha v, \qquad z := x+\beta v. \end{align*} The endpoint property of $\alpha$ and $\beta$ implies $y,z \in \partial_A C$. Also \begin{align*} x = \frac{\beta}{\beta-\alpha}y + \frac{-\alpha}{\beta-\alpha}z, \end{align*} and the coefficients are non-negative and sum to $1$. By the previous step, \begin{align*} y,z \in \operatorname{conv}(\operatorname{ext} C). \end{align*} Since $\operatorname{conv}(\operatorname{ext} C)$ is convex, it follows that \begin{align*} x \in \operatorname{conv}(\operatorname{ext} C). \end{align*} [guided] Now suppose $x$ is not on the relative boundary, so $x \in \operatorname{relint}_A C$. We reduce this case to the boundary case by drawing a line through $x$. Choose a non-zero vector $v$ in the direction space of $A$. Define \begin{align*} I := \{s \in \mathbb{R} : x+s v \in C\}. \end{align*} This is the slice of $C$ along the line $x+\mathbb{R}v$. Let $\gamma: \mathbb{R} \to A$ be the affine map $\gamma(s)=x+sv$. The set $I$ is non-empty because $0 \in I$. It is closed because $I=\gamma^{-1}(C)$, the map $\gamma$ is continuous, and $C$ is closed. To see boundedness, use compactness of $C$: there exists $R>0$ such that $C \subset B(0,R)$. If $s \in I$, then $x+sv \in C$, so $|x+sv|\leq R$. 
The triangle inequality gives \begin{align*} |s|\,|v| = |sv| \leq |x+sv|+|x| \leq R+|x|. \end{align*} Because $v\neq 0$, we obtain $|s|\leq (R+|x|)/|v|$. Thus $I$ is closed and bounded in $\mathbb{R}$, hence compact. It is convex because if $s_1,s_2 \in I$ and $t \in [0,1]$, then \begin{align*} x+\bigl((1-t)s_1+t s_2\bigr)v = (1-t)(x+s_1v)+t(x+s_2v) \in C. \end{align*} Hence $I$ is a compact interval in $\mathbb{R}$. Because $x$ is in the relative interior of $C$, there exists $\varepsilon>0$ such that \begin{align*} x+s v \in C \quad \text{for all } s \in (-\varepsilon,\varepsilon). \end{align*} Therefore $I$ contains an open interval around $0$. Since $I$ is compact and convex in $\mathbb{R}$, there are real numbers $\alpha<0<\beta$ such that \begin{align*} I=[\alpha,\beta]. \end{align*} Define the two endpoints of the slice by \begin{align*} y := x+\alpha v, \qquad z := x+\beta v. \end{align*} Since $\alpha,\beta \in I$, we have $y,z \in C$. Moreover, $y$ and $z$ lie on the relative boundary of $C$: if, for instance, $y$ were in the relative interior, then a small relative neighbourhood of $y$ in $A$ would be contained in $C$. Because $y-\delta v = x+(\alpha-\delta)v$ lies in $A$ and tends to $y$ as $\delta \to 0$, this would give $x+(\alpha-\delta)v \in C$ for some $\delta>0$, contradicting the minimality of $\alpha$. The argument for $z$ is identical, using $\beta+\delta$. Solving for $x$ in terms of $y$ and $z$ gives \begin{align*} x = \frac{\beta}{\beta-\alpha}y + \frac{-\alpha}{\beta-\alpha}z. \end{align*} Since $\alpha<0<\beta$, both coefficients are non-negative and \begin{align*} \frac{\beta}{\beta-\alpha}+\frac{-\alpha}{\beta-\alpha}=1. \end{align*} Thus $x$ is a convex combination of the two boundary points $y$ and $z$. The previous step proved that both boundary points belong to $\operatorname{conv}(\operatorname{ext} C)$, and a convex hull is convex. Therefore \begin{align*} x \in \operatorname{conv}(\operatorname{ext} C). 
\end{align*} [/guided] [/step] [step:Combine the boundary and interior cases] Every point $x \in C$ lies either in $\partial_A C$ or in $\operatorname{relint}_A C$. The previous two steps show that in both cases \begin{align*} x \in \operatorname{conv}(\operatorname{ext} C). \end{align*} Hence \begin{align*} C \subset \operatorname{conv}(\operatorname{ext} C). \end{align*} Conversely, $\operatorname{ext} C \subset C$ by definition, and since $C$ is convex, every convex combination of points of $\operatorname{ext} C$ lies in $C$. Therefore \begin{align*} \operatorname{conv}(\operatorname{ext} C)\subset C. \end{align*} Combining the two inclusions gives \begin{align*} C=\operatorname{conv}(\operatorname{ext} C). \end{align*} This completes the induction and the proof. [/step]
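
The steps above can be traced on a small worked example. The following is only an illustrative sketch: the set $C=[0,1]^2 \subset \mathbb{R}^2$, the point $x=(\tfrac12,\tfrac14)$, and the direction $v=(1,0)$ are assumptions made here and are not part of the original proof. In this case $A=\mathbb{R}^2$, $d=2$, and $x \in \operatorname{relint}_A C$. The slice along the line $x+\mathbb{R}v$ is \begin{align*} I &= \{s \in \mathbb{R} : (\tfrac12+s,\tfrac14) \in C\} = [-\tfrac12,\tfrac12], \qquad \alpha=-\tfrac12, \quad \beta=\tfrac12, \\ y &= x+\alpha v = (0,\tfrac14), \qquad z = x+\beta v = (1,\tfrac14), \\ x &= \frac{\beta}{\beta-\alpha}y+\frac{-\alpha}{\beta-\alpha}z = \tfrac12 y+\tfrac12 z. \end{align*} The boundary points $y$ and $z$ lie in the exposed faces $\{0\}\times[0,1]$ and $\{1\}\times[0,1]$, which have affine dimension $1<d$, and inside those faces \begin{align*} y &= \tfrac34(0,0)+\tfrac14(0,1), \qquad z = \tfrac34(1,0)+\tfrac14(1,1). \end{align*} Substituting gives \begin{align*} x = \tfrac38(0,0)+\tfrac18(0,1)+\tfrac38(1,0)+\tfrac18(1,1), \end{align*} a convex combination of the four vertices of the square, which are exactly its extreme points. The coefficients sum to $1$, and the two coordinates evaluate to $\tfrac38+\tfrac18=\tfrac12$ and $\tfrac18+\tfrac18=\tfrac14$, as required.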

Minkowski's Theorem on Extreme Points (Theorem # 4093)
