[proofplan]
We prove each part by direct application of the definition of convexity and standard calculus. Parts (i)--(iv) verify the convexity inequality for the newly constructed function or set. Part (v) derives the first-order condition by rearranging the convexity inequality along a line segment and taking the limit as $t \to 0^+$. Part (vi) is a short contradiction argument. Part (vii) connects convexity to the Hessian via Taylor's theorem applied along line segments.
[/proofplan]
[step:Prove part (i): non-negative linear combinations preserve (strict) convexity]
Let $x, y \in C$ and $t \in (0,1)$. Since $f$ and $g$ are both convex on $C$:
\begin{align*}
(af + bg)\bigl((1-t)x + ty\bigr) &= af\bigl((1-t)x + ty\bigr) + bg\bigl((1-t)x + ty\bigr) \\
&\leq a\bigl[(1-t)f(x) + tf(y)\bigr] + b\bigl[(1-t)g(x) + tg(y)\bigr] \\
&= (1-t)\bigl[af(x) + bg(x)\bigr] + t\bigl[af(y) + bg(y)\bigr] \\
&= (1-t)(af + bg)(x) + t(af + bg)(y).
\end{align*}
The second line uses $a > 0$ and the convexity of $f$ (so $a$ preserves the inequality direction), and similarly $b > 0$ with the convexity of $g$. This establishes convexity of $af + bg$.
For strict convexity: if $g$ is strictly convex and $x \neq y$, then $g\bigl((1-t)x + ty\bigr) < (1-t)g(x) + tg(y)$ for $t \in (0,1)$. Since $b > 0$, the inequality in the second line becomes strict, so $(af + bg)\bigl((1-t)x + ty\bigr) < (1-t)(af + bg)(x) + t(af + bg)(y)$.
[/step]
[step:Prove part (ii): composition with affine maps preserves convexity]
Define $g : \mathbb{R}^m \to \mathbb{R}$ by $g(x) = f(Ax - b)$. Let $x, y \in \mathbb{R}^m$ and $t \in (0,1)$. Since the affine map $x \mapsto Ax - b$ is linear in $x$:
\begin{align*}
A\bigl((1-t)x + ty\bigr) - b = (1-t)(Ax - b) + t(Ay - b).
\end{align*}
The points $Ax - b$ and $Ay - b$ lie in $\mathbb{R}^d = C$, so by the convexity of $f$:
\begin{align*}
g\bigl((1-t)x + ty\bigr) &= f\bigl(A\bigl((1-t)x + ty\bigr) - b\bigr) \\
&= f\bigl((1-t)(Ax-b) + t(Ay-b)\bigr) \\
&\leq (1-t)f(Ax-b) + tf(Ay-b) \\
&= (1-t)g(x) + tg(y).
\end{align*}
[/step]
[step:Prove part (iii): pointwise suprema of convex functions are convex on their effective domain]
First, we show $D = \{x \in C : g(x) < \infty\}$ is convex. Take $x, y \in D$ (so $g(x) < \infty$ and $g(y) < \infty$) and $t \in (0,1)$. Since $C$ is convex, $(1-t)x + ty \in C$. For each $\alpha \in I$, the convexity of $f_\alpha$ gives
\begin{align*}
f_\alpha\bigl((1-t)x + ty\bigr) \leq (1-t)f_\alpha(x) + tf_\alpha(y) \leq (1-t)g(x) + tg(y),
\end{align*}
where the second inequality uses $f_\alpha(x) \leq \sup_{\alpha' \in I} f_{\alpha'}(x) = g(x)$ and similarly for $y$. Taking the supremum over $\alpha \in I$:
\begin{align*}
g\bigl((1-t)x + ty\bigr) = \sup_{\alpha \in I} f_\alpha\bigl((1-t)x + ty\bigr) \leq (1-t)g(x) + tg(y) < \infty.
\end{align*}
This shows $(1-t)x + ty \in D$ (the supremum is finite), establishing that $D$ is convex. The displayed inequality is also the convexity condition for $g$ on $D$.
[guided]
The argument has two components. First, each convex function $f_\alpha$ satisfies the convexity inequality, giving an upper bound on $f_\alpha$ at the convex combination. Second, we use $f_\alpha \leq g$ to replace $f_\alpha(x)$ and $f_\alpha(y)$ by $g(x)$ and $g(y)$, yielding an upper bound that is uniform in $\alpha$. Taking the supremum on the left-hand side then gives the convexity of $g$ itself.
An important subtlety: we must verify that $(1-t)x + ty \in D$, not just that $g$ satisfies the convexity inequality. The displayed computation confirms both simultaneously -- the right-hand side is finite (since $g(x), g(y) < \infty$), so $g((1-t)x + ty) < \infty$ and the point lies in $D$.
[/guided]
[/step]
[step:Prove part (iv): sublevel sets of convex functions are convex]
Let $S_M := \{x \in C : f(x) \leq M\}$ and take $x, y \in S_M$, so $f(x) \leq M$ and $f(y) \leq M$. For $t \in (0,1)$, the point $(1-t)x + ty \in C$ by convexity of $C$, and
\begin{align*}
f\bigl((1-t)x + ty\bigr) \leq (1-t)f(x) + tf(y) \leq (1-t)M + tM = M.
\end{align*}
Therefore $(1-t)x + ty \in S_M$, and $S_M$ is convex.
[/step]
[step:Prove part (v): the first-order condition from convexity and differentiability]
Let $x \in \operatorname{int}(C)$ and $y \in C$. For $t \in (0,1)$, the point $x + t(y - x) = (1-t)x + ty \in C$ by convexity of $C$. The convexity of $f$ gives
\begin{align*}
f\bigl(x + t(y-x)\bigr) \leq (1-t)f(x) + tf(y) = f(x) + t\bigl(f(y) - f(x)\bigr).
\end{align*}
Rearranging:
\begin{align*}
\frac{f\bigl(x + t(y-x)\bigr) - f(x)}{t} \leq f(y) - f(x).
\end{align*}
Since $f$ is differentiable at $x$, the left-hand side has a limit as $t \to 0^+$. By the definition of the directional derivative and the chain rule applied to the map $t \mapsto f(x + t(y-x))$:
\begin{align*}
\lim_{t \to 0^+} \frac{f\bigl(x + t(y-x)\bigr) - f(x)}{t} = \nabla f(x)^\top (y - x).
\end{align*}
Since the inequality holds for all $t \in (0,1)$, it persists in the limit:
\begin{align*}
\nabla f(x)^\top (y - x) \leq f(y) - f(x).
\end{align*}
Rearranging gives $f(y) \geq f(x) + \nabla f(x)^\top (y - x)$.
For the "in particular" statement: if $\nabla f(x) = 0$, the inequality becomes $f(y) \geq f(x)$ for all $y \in C$, so $x$ is a global minimiser.
[guided]
The geometric interpretation is fundamental. The first-order condition says that the tangent hyperplane $y \mapsto f(x) + \nabla f(x)^\top(y - x)$ at any point $x$ lies below the graph of $f$. This is equivalent to saying that the graph of $f$ lies above every tangent hyperplane -- the defining property of a convex function from the perspective of supporting hyperplanes.
The derivation proceeds by dividing the convexity inequality by $t$:
\begin{align*}
\frac{f(x + t(y-x)) - f(x)}{t} \leq f(y) - f(x).
\end{align*}
The left-hand side is a difference quotient along the direction $y - x$. As $t \to 0^+$, this converges to the directional derivative $\nabla f(x)^\top(y-x)$ by differentiability. The right-hand side is constant in $t$, so the limit inequality follows.
Why do we need $x \in \operatorname{int}(C)$? Differentiability at $x$ requires that $x$ be an interior point of the domain, so that the difference quotient is well-defined for small perturbations in all directions.
[/guided]
[/step]
[step:Prove part (vi): strict convexity implies uniqueness of minimisers]
Suppose for contradiction that $x, y \in C$ are both minimisers of $f$ with $x \neq y$, so $f(x) = f(y) = \min_{z \in C} f(z)$. By strict convexity with $t = 1/2$:
\begin{align*}
f\!\left(\frac{x + y}{2}\right) < \frac{1}{2}f(x) + \frac{1}{2}f(y) = f(x).
\end{align*}
Since $(x + y)/2 \in C$ by convexity of $C$, this contradicts the assumption that $f(x)$ is the minimum of $f$ over $C$.
[/step]
[step:Prove part (vii)(a): convexity is equivalent to the Hessian being positive semi-definite everywhere]
Let $f : \mathbb{R}^d \to \mathbb{R}$ be twice continuously differentiable, with Hessian $H(x) = \bigl(\partial_{x_i}\partial_{x_j} f(x)\bigr)_{i,j=1}^d$.
**($\Rightarrow$)** Assume $f$ is convex. Fix $x \in \mathbb{R}^d$ and $v \in \mathbb{R}^d$. Define the one-variable function $\varphi : \mathbb{R} \to \mathbb{R}$ by $\varphi(t) = f(x + tv)$. By the first-order condition (part (v)) applied with $y = x + tv$ and then $y = x$ (taking $x + sv$ in place of $x$):
\begin{align*}
\varphi(t) = f(x + tv) \geq f(x + sv) + \nabla f(x + sv)^\top(t - s)v = \varphi(s) + \varphi'(s)(t - s)
\end{align*}
for all $s, t \in \mathbb{R}$. This shows $\varphi$ is convex on $\mathbb{R}$. In particular, $\varphi''(0) \geq 0$. Computing $\varphi''(0)$ by the chain rule:
\begin{align*}
\varphi''(0) = v^\top H(x) v.
\end{align*}
Since $v \in \mathbb{R}^d$ was arbitrary, $H(x)$ is positive semi-definite. Since $x$ was arbitrary, this holds for all $x \in \mathbb{R}^d$.
**($\Leftarrow$)** Assume $H(x)$ is positive semi-definite for all $x$. Fix $x, y \in \mathbb{R}^d$ and $t \in (0,1)$. By Taylor's theorem with integral remainder applied to $f$ along the line segment from $x$ to $y$:
\begin{align*}
f(y) = f(x) + \nabla f(x)^\top(y - x) + \int_0^1 (1-s)(y-x)^\top H\bigl(x + s(y-x)\bigr)(y-x) \, d\mathcal{L}^1(s).
\end{align*}
Since $(y-x)^\top H(\xi)(y-x) \geq 0$ for every $\xi$ (by the positive semi-definiteness hypothesis), the integral is non-negative:
\begin{align*}
f(y) \geq f(x) + \nabla f(x)^\top(y - x).
\end{align*}
This is the first-order condition, which implies convexity: for $t \in (0,1)$, applying the inequality once with the pair $(x, (1-t)x + ty)$ and once with $(y, (1-t)x + ty)$, and taking the convex combination $(1-t) \cdot [\text{first}] + t \cdot [\text{second}]$, yields $f((1-t)x + ty) \leq (1-t)f(x) + tf(y)$.
[guided]
The forward direction ($\Rightarrow$) reduces convexity in $\mathbb{R}^d$ to convexity along every line. If $f$ is convex, then for any $x, v \in \mathbb{R}^d$, the restriction $\varphi(t) = f(x + tv)$ is a convex function of one variable. A twice-differentiable convex function of one variable satisfies $\varphi''(t) \geq 0$ for all $t$. In particular $\varphi''(0) = v^\top H(x) v \geq 0$, giving positive semi-definiteness.
The reverse direction ($\Leftarrow$) uses Taylor's theorem to establish the first-order condition $f(y) \geq f(x) + \nabla f(x)^\top(y-x)$, and then deduces convexity from it. To see how the first-order condition implies convexity: let $z = (1-t)x + ty$ for $t \in (0,1)$. Apply the first-order condition at $z$ with $y$ replaced by $x$: $f(x) \geq f(z) + \nabla f(z)^\top(x - z)$. Apply it at $z$ with $y$ replaced by $y$: $f(y) \geq f(z) + \nabla f(z)^\top(y - z)$. Multiply the first by $(1-t)$ and the second by $t$ and add:
\begin{align*}
(1-t)f(x) + tf(y) &\geq f(z) + \nabla f(z)^\top\bigl[(1-t)(x-z) + t(y-z)\bigr] \\
&= f(z) + \nabla f(z)^\top\bigl[(1-t)x + ty - z\bigr] \\
&= f(z),
\end{align*}
since $(1-t)x + ty = z$ by definition. This is the convexity inequality.
[/guided]
[/step]
[step:Prove part (vii)(b): positive definite Hessian implies strict convexity]
Assume $H(x)$ is positive definite for all $x \in \mathbb{R}^d$. Take $x, y \in \mathbb{R}^d$ with $x \neq y$ and $t \in (0,1)$. By the Taylor expansion with integral remainder used in the previous step:
\begin{align*}
f(y) = f(x) + \nabla f(x)^\top(y - x) + \int_0^1 (1-s)(y-x)^\top H\bigl(x + s(y-x)\bigr)(y-x) \, d\mathcal{L}^1(s).
\end{align*}
Since $y \neq x$, the direction $v := y - x \neq 0$, and $v^\top H(\xi) v > 0$ for every $\xi$ by positive definiteness. The integrand $(1-s)v^\top H(x + sv)v > 0$ for $s \in [0,1)$ (and is zero only at $s = 1$), so the integral is strictly positive:
\begin{align*}
f(y) > f(x) + \nabla f(x)^\top(y - x).
\end{align*}
This is a strict first-order condition. The same argument as in part (vii)(a) (taking convex combinations of the strict first-order conditions at $x$ and $y$) then yields the strict convexity inequality $f((1-t)x + ty) < (1-t)f(x) + tf(y)$ for $t \in (0,1)$ and $x \neq y$.
[/step]