[proofplan]
We parameterize all convex combinations of at most $n+1$ points of $A$ by the product of the standard simplex $\Delta_n$ with $A^{n+1}$. This parameter space is compact, and the convex-combination map from it into $\mathbb{R}^n$ is continuous. [Carathéodory's theorem](/theorems/4079) identifies the image of this map with the whole convex hull, so $\operatorname{conv}(A)$ is compact as the continuous image of a compact set, which we verify directly from the open-cover definition of compactness.
[/proofplan]
[step:Define the simplex parameter space for convex combinations]
Define the standard $n$-simplex
\begin{align*}
\Delta_n := \left\{(\lambda_1,\dots,\lambda_{n+1}) \in \mathbb{R}^{n+1} : \lambda_i \geq 0 \text{ for every } i \in \{1,\dots,n+1\}, \ \sum_{i=1}^{n+1} \lambda_i = 1\right\}.
\end{align*}
The set $\Delta_n$ is closed in $\mathbb{R}^{n+1}$ because it is the intersection of the closed half-spaces $\{\lambda_i \geq 0\}$ and the closed affine hyperplane $\{\sum_{i=1}^{n+1}\lambda_i = 1\}$. It is bounded because each coordinate satisfies $0 \leq \lambda_i \leq 1$. Hence $\Delta_n$ is compact by the [Heine-Borel theorem](/theorems/315).
Since $A$ is compact and finite products of compact spaces are compact, the product space
\begin{align*}
K := \Delta_n \times A^{n+1}
\end{align*}
is compact, where $A^{n+1}$ denotes the Cartesian product of $n+1$ copies of $A$.
[guided]
The goal is to encode a convex combination by two pieces of data: the coefficients and the points being combined. The coefficients live in the simplex
\begin{align*}
\Delta_n := \left\{(\lambda_1,\dots,\lambda_{n+1}) \in \mathbb{R}^{n+1} : \lambda_i \geq 0 \text{ for every } i \in \{1,\dots,n+1\}, \ \sum_{i=1}^{n+1} \lambda_i = 1\right\}.
\end{align*}
This definition exactly records the two requirements for convex coefficients: non-negativity and total mass $1$.
We verify compactness of $\Delta_n$. For each $i \in \{1,\dots,n+1\}$, the condition $\lambda_i \geq 0$ defines a closed half-space in $\mathbb{R}^{n+1}$. The condition $\sum_{i=1}^{n+1}\lambda_i = 1$ defines a closed affine hyperplane because the map $(\lambda_1,\dots,\lambda_{n+1}) \mapsto \sum_{i=1}^{n+1}\lambda_i$ is continuous. Therefore $\Delta_n$ is closed as a finite intersection of closed sets. Also, if $\lambda \in \Delta_n$, then every coordinate satisfies $0 \leq \lambda_i \leq 1$, so $\Delta_n$ is bounded. By the [Heine-Borel theorem](/theorems/309), a closed and bounded subset of Euclidean space is compact, hence $\Delta_n$ is compact.
Now define
\begin{align*}
K := \Delta_n \times A^{n+1}.
\end{align*}
Here $A^{n+1}$ means the Cartesian product of $n+1$ copies of $A$. Since $A$ is compact by hypothesis, $A^{n+1}$ is compact because finite products of compact spaces are compact. Since $\Delta_n$ is compact by the Heine-Borel argument above, the same fact applied to the product $\Delta_n \times A^{n+1}$ shows that $K$ is compact.
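As an illustration, assume $n = 1$ and $A = [0,1] \subset \mathbb{R}$ (a choice made only for this example). The parameter space can then be written out explicitly:
\begin{align*}
\Delta_1 &= \{(t, 1-t) : t \in [0,1]\}, \\
K &= \Delta_1 \times [0,1]^2 \subset \mathbb{R}^2 \times \mathbb{R}^2 .
\end{align*}
Here $\Delta_1$ is the line segment joining $(1,0)$ and $(0,1)$, and $K$ is a closed and bounded subset of $\mathbb{R}^4$, in agreement with the general argument.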
[/guided]
[/step]
[step:Build the continuous convex-combination map]
Define
\begin{align*}
F: K &\to \mathbb{R}^n \\
\bigl((\lambda_1,\dots,\lambda_{n+1}), a_1,\dots,a_{n+1}\bigr)
&\mapsto \sum_{i=1}^{n+1} \lambda_i a_i .
\end{align*}
The map $F$ is continuous because each coordinate function is a finite sum of products of coordinate projections from $K$ to $\mathbb{R}$.
[guided]
We now turn the parameter data into an actual point of $\mathbb{R}^n$. Define
\begin{align*}
F: K &\to \mathbb{R}^n \\
\bigl((\lambda_1,\dots,\lambda_{n+1}), a_1,\dots,a_{n+1}\bigr)
&\mapsto \sum_{i=1}^{n+1} \lambda_i a_i .
\end{align*}
This is the natural convex-combination map: the vector $a_i \in A \subset \mathbb{R}^n$ is weighted by the coefficient $\lambda_i$, and the weights sum to $1$.
We verify continuity. Write each point $a_i \in \mathbb{R}^n$ in coordinates as
\begin{align*}
a_i = \bigl((a_i)_1,\dots,(a_i)_n\bigr).
\end{align*}
For each coordinate $j \in \{1,\dots,n\}$, the $j$-th coordinate of $F$ is
\begin{align*}
F_j\bigl((\lambda_1,\dots,\lambda_{n+1}), a_1,\dots,a_{n+1}\bigr)
= \sum_{i=1}^{n+1} \lambda_i (a_i)_j.
\end{align*}
Each projection map onto $\lambda_i$ and onto $(a_i)_j$ is continuous, products of real-valued continuous functions are continuous, and finite sums of continuous functions are continuous. Hence each coordinate function $F_j$ is continuous, and therefore $F$ is continuous as a map into $\mathbb{R}^n$.
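For instance, in the illustrative case $n = 1$ the map has a single coordinate function,
\begin{align*}
F\bigl((\lambda_1,\lambda_2), a_1, a_2\bigr) = \lambda_1 a_1 + \lambda_2 a_2,
\end{align*}
which is the restriction to $K$ of a polynomial in the four real variables $\lambda_1, \lambda_2, a_1, a_2$, and is therefore continuous.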
[/guided]
[/step]
[step:Identify the image of the map with the convex hull]
We claim that
\begin{align*}
F(K) = \operatorname{conv}(A).
\end{align*}
First, if $y \in F(K)$, then there exist $(\lambda_1,\dots,\lambda_{n+1}) \in \Delta_n$ and $a_1,\dots,a_{n+1} \in A$ such that
\begin{align*}
y = \sum_{i=1}^{n+1} \lambda_i a_i.
\end{align*}
This is a convex combination of points of $A$, so $y \in \operatorname{conv}(A)$.
Conversely, if $y \in \operatorname{conv}(A)$, then by [Carathéodory's theorem](/theorems/2954) for convex hulls in $\mathbb{R}^n$, there are an integer $m \in \{1,\dots,n+1\}$, points $b_1,\dots,b_m \in A$, and coefficients $\mu_1,\dots,\mu_m \in [0,1]$ with $\sum_{i=1}^{m}\mu_i = 1$ such that
\begin{align*}
y = \sum_{i=1}^{m} \mu_i b_i.
\end{align*}
If $m = n+1$, no padding is needed. If $m < n+1$, set $a_* := b_1 \in A$ as a padding point. Define $a_i := b_i$ and $\lambda_i := \mu_i$ for $1 \leq i \leq m$, and for $m < i \leq n+1$ define $a_i := a_*$ and $\lambda_i := 0$. Since the added coefficients are zero, $(\lambda_1,\dots,\lambda_{n+1}) \in \Delta_n$; moreover $a_1,\dots,a_{n+1} \in A$, and
\begin{align*}
y = \sum_{i=1}^{n+1} \lambda_i a_i.
\end{align*}
Thus $y = F((\lambda_1,\dots,\lambda_{n+1}),a_1,\dots,a_{n+1})$, so $y \in F(K)$.
[guided]
We prove both inclusions.
First suppose $y \in F(K)$. By definition of image, there is some element
\begin{align*}
\bigl((\lambda_1,\dots,\lambda_{n+1}),a_1,\dots,a_{n+1}\bigr) \in K
\end{align*}
such that
\begin{align*}
y = F\bigl((\lambda_1,\dots,\lambda_{n+1}),a_1,\dots,a_{n+1}\bigr)
= \sum_{i=1}^{n+1} \lambda_i a_i.
\end{align*}
Because the parameter lies in $K = \Delta_n \times A^{n+1}$, we have $a_i \in A$ for every $i$, $\lambda_i \geq 0$ for every $i$, and $\sum_{i=1}^{n+1}\lambda_i = 1$. Therefore this expression is a convex combination of points of $A$, and hence $y \in \operatorname{conv}(A)$. This proves $F(K) \subset \operatorname{conv}(A)$.
For the reverse inclusion, take $y \in \operatorname{conv}(A)$. The definition of convex hull allows $y$ to be written as a finite convex combination of points of $A$, but the number of points in that combination may initially depend on $y$. We use Carathéodory's theorem, which states that in $\mathbb{R}^n$ every point in the convex hull of a set is a convex combination of at most $n+1$ points of that set. Hence there exist $a_1,\dots,a_{n+1} \in A$ and coefficients $(\lambda_1,\dots,\lambda_{n+1}) \in \Delta_n$ such that
\begin{align*}
y = \sum_{i=1}^{n+1} \lambda_i a_i.
\end{align*}
If Carathéodory's theorem initially gives fewer than $n+1$ points, we fill each remaining slot with a point of $A$ already used in the combination and assign it the coefficient $0$; this changes neither the sum nor the total weight, and it yields exactly $n+1$ slots. The displayed data then define an element of $K$, and by the definition of $F$ we get
\begin{align*}
y = F\bigl((\lambda_1,\dots,\lambda_{n+1}),a_1,\dots,a_{n+1}\bigr).
\end{align*}
Thus $y \in F(K)$, proving $\operatorname{conv}(A) \subset F(K)$.
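To see the padding concretely, consider the illustrative case $n = 2$ with $A = \{(0,0), (1,0), (0,1)\}$ and $y = (\tfrac{1}{2}, 0)$ (an example chosen here, not part of the theorem). A representation of $y$ with only two points of $A$ is extended to three slots by repeating $(0,0)$ with coefficient $0$:
\begin{align*}
y &= \tfrac{1}{2}(0,0) + \tfrac{1}{2}(1,0) \\
&= \tfrac{1}{2}(0,0) + \tfrac{1}{2}(1,0) + 0 \cdot (0,0) \\
&= F\bigl((\tfrac{1}{2},\tfrac{1}{2},0),\,(0,0),\,(1,0),\,(0,0)\bigr).
\end{align*}
The coefficient vector $(\tfrac{1}{2},\tfrac{1}{2},0)$ lies in $\Delta_2$, so the padded data form an element of $K = \Delta_2 \times A^{3}$.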
[/guided]
[/step]
[step:Conclude compactness from continuity and compactness of the parameter space]
Since $K$ is compact and $F:K \to \mathbb{R}^n$ is continuous, $F(K)$ is compact. By the equality $F(K)=\operatorname{conv}(A)$ proved above, $\operatorname{conv}(A)$ is compact.
[guided]
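We have arranged the proof so that the desired set is the continuous image of a compact set. Since $K$ is compact and $F:K \to \mathbb{R}^n$ is continuous, the image $F(K)$ is compact. Explicitly, if $\{U_\alpha\}$ is an open cover of $F(K)$, then the preimages $F^{-1}(U_\alpha)$ are open in $K$ by continuity and cover $K$. Compactness of $K$ yields finitely many indices $\alpha_1,\dots,\alpha_k$ with $K = \bigcup_{j=1}^{k} F^{-1}(U_{\alpha_j})$, and then $F(K) \subset \bigcup_{j=1}^{k} U_{\alpha_j}$, so $U_{\alpha_1},\dots,U_{\alpha_k}$ is a finite subcover of $F(K)$.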
The previous step proved
\begin{align*}
F(K)=\operatorname{conv}(A).
\end{align*}
Therefore $\operatorname{conv}(A)$ is compact. This is exactly the theorem statement.
[/guided]
[/step]