[proofplan]
The inclusion $K \subset (K^\circ)^\circ$ follows immediately from the definition of the polar set. For the reverse inclusion, we prove the contrapositive: if $x_0 \notin K$, then there is a vector $y \in K^\circ$ with $x_0 \cdot y > 1$. The separating vector is constructed directly from the closest point of $K$ to $x_0$, whose existence follows from closedness in finite dimension; convexity then gives the supporting inequality.
[/proofplan]
[step:Show every point of $K$ lies in the bipolar]
Let $x_0 \in K$. To prove $x_0 \in (K^\circ)^\circ$, let $y \in K^\circ$. By the definition of the [polar set](/page/Polar%20Set), every point $x \in K$ satisfies $x \cdot y \leq 1$. Applying this with $x = x_0$ gives $x_0 \cdot y \leq 1$. Since this holds for every $y \in K^\circ$, the definition of the polar of $K^\circ$ gives $x_0 \in (K^\circ)^\circ$. Hence
\begin{align*}
K \subset (K^\circ)^\circ.
\end{align*}
[/step]
[step:Choose a nearest point of $K$ to a point outside $K$]
Let $x_0 \in \mathbb{R}^n \setminus K$. Define the distance from $x_0$ to $K$ by
\begin{align*}
d := \inf\{|x_0 - x| : x \in K\}.
\end{align*}
Since $K$ is nonempty, choose a sequence $(x_m)$ in $K$ such that $|x_0 - x_m| \to d$. For all sufficiently large $m$, we have $|x_0 - x_m| \leq d + 1$, and hence
\begin{align*}
|x_m| \leq |x_0| + |x_0 - x_m| \leq |x_0| + d + 1.
\end{align*}
Thus a tail of $(x_m)$ is bounded. By the [Bolzano-Weierstrass theorem](/page/Bolzano-Weierstrass%20Theorem), there is a subsequence $(x_{m_j})$ converging to some point $p \in \mathbb{R}^n$. Since $K$ is closed and every $x_{m_j}$ lies in $K$, we have $p \in K$. Continuity of the Euclidean norm gives
\begin{align*}
|x_0 - p| = \lim_{j \to \infty} |x_0 - x_{m_j}| = d.
\end{align*}
Because $x_0 \notin K$ and $p \in K$, we have $d = |x_0 - p| > 0$.
[guided]
We need an actual supporting vector, not only an infimum. The natural vector will point from the nearest point of $K$ to $x_0$, so we first prove that such a nearest point exists.
Define
\begin{align*}
d := \inf\{|x_0 - x| : x \in K\}.
\end{align*}
Because $K$ is nonempty, this infimum is taken over a nonempty subset of $[0,\infty)$. Choose a sequence $(x_m)$ in $K$ such that $|x_0 - x_m| \to d$. This sequence is eventually bounded: for all sufficiently large $m$,
\begin{align*}
|x_0 - x_m| \leq d + 1,
\end{align*}
and therefore the triangle inequality gives
\begin{align*}
|x_m| \leq |x_0| + |x_0 - x_m| \leq |x_0| + d + 1.
\end{align*}
The [Bolzano-Weierstrass theorem](/page/Bolzano-Weierstrass%20Theorem) says that every bounded sequence in $\mathbb{R}^n$ has a convergent subsequence, so there is a subsequence $(x_{m_j})$ and a point $p \in \mathbb{R}^n$ such that $x_{m_j} \to p$. Closedness of $K$ is used here: since every $x_{m_j}$ belongs to $K$ and $K$ is closed, the limit $p$ also belongs to $K$. Finally, continuity of the Euclidean norm gives
\begin{align*}
|x_0 - p| = \lim_{j \to \infty} |x_0 - x_{m_j}| = d.
\end{align*}
Thus $p$ is a nearest point of $K$ to $x_0$. Since $x_0 \notin K$ but $p \in K$, the distance is positive:
\begin{align*}
d = |x_0 - p| > 0.
\end{align*}
[/guided]
[/step]
[step:Use convexity to obtain a supporting inequality]
Define the vector $v \in \mathbb{R}^n$ by
\begin{align*}
v := x_0 - p.
\end{align*}
We claim that
\begin{align*}
x \cdot v \leq p \cdot v
\end{align*}
for every $x \in K$.
Fix $x \in K$. Since $K$ is convex and $p,x \in K$, the point
\begin{align*}
p_t := (1-t)p + tx
\end{align*}
belongs to $K$ for every $t \in [0,1]$. Since $p$ minimizes the distance to $x_0$ over $K$, we have
\begin{align*}
|x_0 - p|^2 \leq |x_0 - p_t|^2
\end{align*}
for every $t \in [0,1]$. Using $v = x_0 - p$ and $p_t = p + t(x-p)$, this becomes
\begin{align*}
|v|^2 \leq |v - t(x-p)|^2
= |v|^2 - 2t\,v \cdot (x-p) + t^2 |x-p|^2.
\end{align*}
Subtracting $|v|^2$ and dividing by $t > 0$ gives
\begin{align*}
0 \leq -2\,v \cdot (x-p) + t |x-p|^2.
\end{align*}
Letting $t \downarrow 0$ yields $v \cdot (x-p) \leq 0$, equivalently
\begin{align*}
x \cdot v \leq p \cdot v.
\end{align*}
[guided]
The nearest point $p$ gives a geometric normal vector. Define
\begin{align*}
v := x_0 - p.
\end{align*}
We will prove that the hyperplane through $p$ orthogonal to $v$ supports $K$.
Fix $x \in K$. Convexity is used by moving from $p$ a small distance toward $x$. For each $t \in [0,1]$, define
\begin{align*}
p_t := (1-t)p + tx.
\end{align*}
Since $K$ is convex and both endpoints $p$ and $x$ lie in $K$, we have $p_t \in K$. Because $p$ is a nearest point of $K$ to $x_0$, the distance from $x_0$ to $p_t$ cannot be smaller than the distance from $x_0$ to $p$:
\begin{align*}
|x_0 - p|^2 \leq |x_0 - p_t|^2.
\end{align*}
Now rewrite this inequality using $v = x_0 - p$ and $p_t = p + t(x-p)$. Then
\begin{align*}
x_0 - p_t = x_0 - p - t(x-p) = v - t(x-p),
\end{align*}
so
\begin{align*}
|v|^2 \leq |v - t(x-p)|^2
= |v|^2 - 2t\,v \cdot (x-p) + t^2 |x-p|^2.
\end{align*}
Subtracting $|v|^2$ from both sides gives
\begin{align*}
0 \leq -2t\,v \cdot (x-p) + t^2 |x-p|^2.
\end{align*}
For $t > 0$, divide by $t$:
\begin{align*}
0 \leq -2\,v \cdot (x-p) + t |x-p|^2.
\end{align*}
Letting $t \downarrow 0$ removes the quadratic error term and gives
\begin{align*}
0 \leq -2\,v \cdot (x-p),
\end{align*}
hence $v \cdot (x-p) \leq 0$. This is exactly
\begin{align*}
x \cdot v \leq p \cdot v.
\end{align*}
[/guided]
[/step]
[step:Normalize the supporting vector to separate $x_0$ from the bipolar]
Since $0 \in K$, the supporting inequality with $x = 0$ gives
\begin{align*}
0 \leq p \cdot v.
\end{align*}
Also,
\begin{align*}
x_0 \cdot v = (p+v)\cdot v = p \cdot v + |v|^2 > p \cdot v,
\end{align*}
because $|v| = d > 0$.
Define $\lambda \in (0,\infty)$ by
\begin{align*}
\lambda :=
\begin{cases}
\frac{1}{p \cdot v}, & \text{if } p \cdot v > 0,\\
\frac{2}{x_0 \cdot v}, & \text{if } p \cdot v = 0.
\end{cases}
\end{align*}
Define $y \in \mathbb{R}^n$ by
\begin{align*}
y := \lambda v.
\end{align*}
If $p \cdot v > 0$, then for every $x \in K$,
\begin{align*}
x \cdot y = \lambda x \cdot v \leq \lambda p \cdot v = 1,
\end{align*}
while
\begin{align*}
x_0 \cdot y = \lambda x_0 \cdot v > \lambda p \cdot v = 1.
\end{align*}
If $p \cdot v = 0$, then for every $x \in K$,
\begin{align*}
x \cdot y = \lambda x \cdot v \leq 0 \leq 1,
\end{align*}
while
\begin{align*}
x_0 \cdot y = \frac{2}{x_0 \cdot v} x_0 \cdot v = 2 > 1.
\end{align*}
In both cases, $y \in K^\circ$ and $x_0 \cdot y > 1$. Hence $x_0 \notin (K^\circ)^\circ$.
[guided]
The supporting inequality gives $x \cdot v \leq p \cdot v$ for every $x \in K$. To turn this into membership in the polar set, we need the right-hand side to be at most $1$ after scaling. The assumption $0 \in K$ ensures that the supporting level is nonnegative: substituting $x = 0$ gives
\begin{align*}
0 \cdot v \leq p \cdot v,
\end{align*}
so
\begin{align*}
0 \leq p \cdot v.
\end{align*}
The outside point lies strictly beyond the supporting hyperplane because
\begin{align*}
x_0 \cdot v = (p+v)\cdot v = p \cdot v + |v|^2.
\end{align*}
Since $|v| = |x_0-p| > 0$, this gives
\begin{align*}
x_0 \cdot v > p \cdot v.
\end{align*}
There are two normalization cases. If $p \cdot v > 0$, define
\begin{align*}
\lambda := \frac{1}{p \cdot v}.
\end{align*}
If $p \cdot v = 0$, then $x_0 \cdot v = |v|^2 > 0$, and we define
\begin{align*}
\lambda := \frac{2}{x_0 \cdot v}.
\end{align*}
In either case $\lambda > 0$, so define
\begin{align*}
y := \lambda v.
\end{align*}
When $p \cdot v > 0$, the supporting inequality implies that for every $x \in K$,
\begin{align*}
x \cdot y = \lambda x \cdot v \leq \lambda p \cdot v = 1.
\end{align*}
Thus $y \in K^\circ$. At the same time,
\begin{align*}
x_0 \cdot y = \lambda x_0 \cdot v > \lambda p \cdot v = 1.
\end{align*}
So $y$ is a polar vector that excludes $x_0$ from the bipolar.
When $p \cdot v = 0$, the supporting inequality gives $x \cdot v \leq 0$ for every $x \in K$. Therefore, for every $x \in K$,
\begin{align*}
x \cdot y = \lambda x \cdot v \leq 0 \leq 1,
\end{align*}
so again $y \in K^\circ$. The chosen scaling gives
\begin{align*}
x_0 \cdot y = \frac{2}{x_0 \cdot v} x_0 \cdot v = 2 > 1.
\end{align*}
Thus in both cases we have constructed $y \in K^\circ$ with $x_0 \cdot y > 1$, which means $x_0 \notin (K^\circ)^\circ$.
[/guided]
[/step]
[step:Conclude equality of $K$ and its bipolar]
We have shown that every $x_0 \in K$ belongs to $(K^\circ)^\circ$. We have also shown that every $x_0 \in \mathbb{R}^n \setminus K$ fails to belong to $(K^\circ)^\circ$. Therefore
\begin{align*}
(K^\circ)^\circ = K.
\end{align*}
This proves the theorem.
[/step]