[proofplan]
The proof pivots on the supporting hyperplane theorem for convex functions: at the point $\mu := \mathbb{E}[Z] \in \mathbb{R}^d$, convexity guarantees the existence of a subgradient $c \in \mathbb{R}^d$ such that the affine function $x \mapsto f(\mu) + c \cdot (x - \mu)$ lies everywhere below $f$. Substituting the random variable $Z$ into this global inequality yields a pointwise assertion on $\Omega$. Taking expectations of both sides, the linear term $c \cdot (\mathbb{E}[Z] - \mu)$ vanishes by definition of $\mu$, and what remains is exactly $\mathbb{E}[f(Z)] \geq f(\mathbb{E}[Z])$.
[/proofplan]
[step:Apply the supporting hyperplane theorem at $\mu := \mathbb{E}[Z]$ to produce a global affine lower bound for $f$]
Since $f: \mathbb{R}^d \to \mathbb{R}$ is convex, by [Convex Functions Are Locally Lipschitz](/theorems/3086) it is continuous at every point of $\mathbb{R}^d$, hence Borel measurable. Since $Z$ is $(\mathcal{F}, \mathcal{B}(\mathbb{R}^d))$-measurable, the composition $f \circ Z: \Omega \to \mathbb{R}$ is $(\mathcal{F}, \mathcal{B}(\mathbb{R}))$-measurable, confirming that $f(Z)$ is a well-defined real-valued random variable.
Set $\mu := \mathbb{E}[Z] \in \mathbb{R}^d$, which is finite since $\mathbb{E}|Z| < \infty$ by hypothesis. Applying the [Existence of a Supporting Hyperplane for Convex Functions](/theorems/7) to $f$ at the point $\mu$, there exists a vector $c \in \mathbb{R}^d$ such that
\begin{align*}
f(x) \geq f(\mu) + c \cdot (x - \mu) \quad \text{for all } x \in \mathbb{R}^d,
\end{align*}
where $c \cdot (x - \mu) = \sum_{i=1}^d c_i(x_i - \mu_i)$ is the standard Euclidean inner product.
[guided]
We want a global lower bound for $f$ that can be integrated against $\mathbb{P}$. The central observation is that convexity is precisely the condition that guarantees such a bound exists: any convex function $f: \mathbb{R}^d \to \mathbb{R}$ admits a supporting affine minorant at every point. This is the content of the [Existence of a Supporting Hyperplane for Convex Functions](/theorems/7).
Before applying the theorem, we verify that $f(Z)$ is a well-defined random variable so that expectations of it are meaningful. By [Convex Functions Are Locally Lipschitz](/theorems/3086), $f$ is locally Lipschitz on $\mathbb{R}^d$, hence continuous everywhere. A continuous map between metric spaces sends open sets to Borel sets (and preimages of open sets are open), so $f$ is $(\mathcal{B}(\mathbb{R}^d), \mathcal{B}(\mathbb{R}))$-measurable. Since $Z: \Omega \to \mathbb{R}^d$ is $(\mathcal{F}, \mathcal{B}(\mathbb{R}^d))$-measurable, the composition $f \circ Z: \Omega \to \mathbb{R}$ is $(\mathcal{F}, \mathcal{B}(\mathbb{R}))$-measurable. Hence $f(Z)$ is a well-defined real-valued random variable on $(\Omega, \mathcal{F}, \mathbb{P})$, and $\mathbb{E}[f(Z)]$ is meaningful.
We set $\mu := \mathbb{E}[Z]$. Since $\mathbb{E}|Z| < \infty$ by hypothesis, each component satisfies $\mathbb{E}|Z_i| \leq \mathbb{E}|Z| < \infty$, so $\mu_i = \mathbb{E}[Z_i] \in \mathbb{R}$ for each $i = 1, \dots, d$, giving $\mu \in \mathbb{R}^d$.
We now apply the [Existence of a Supporting Hyperplane for Convex Functions](/theorems/7) at $\mu$. That theorem requires: (i) $f$ is convex — given; (ii) $\mu \in \mathbb{R}^d$ — verified above. The conclusion supplies a subgradient $c \in \mathbb{R}^d$ such that
\begin{align*}
f(x) \geq f(\mu) + c \cdot (x - \mu) \quad \text{for all } x \in \mathbb{R}^d.
\end{align*}
The affine function $x \mapsto f(\mu) + c \cdot (x - \mu)$ is a global lower bound for $f$ that achieves equality at $x = \mu$. It is this global bound that carries the geometric content of convexity into an analytic inequality between expectations.
[/guided]
[/step]
[step:Substitute $Z$ into the affine lower bound, integrate both sides, and identify $c \cdot (\mathbb{E}[Z] - \mu) = 0$ to conclude]
Substituting $x = Z(\omega)$ into the supporting hyperplane inequality — which holds for every $x \in \mathbb{R}^d$ — gives the pointwise inequality
\begin{align*}
f(Z(\omega)) \geq f(\mu) + c \cdot (Z(\omega) - \mu) \quad \text{for every } \omega \in \Omega.
\end{align*}
The right-hand side belongs to $L^1(\Omega, \mathcal{F}, \mathbb{P})$: since $|c \cdot Z| \leq |c||Z|$ by the Cauchy-Schwarz inequality and $|c| \in [0, \infty)$ is a finite constant,
\begin{align*}
\mathbb{E}|c \cdot Z| \leq |c|\,\mathbb{E}|Z| < \infty,
\end{align*}
where the last inequality uses $\mathbb{E}|Z| < \infty$ by hypothesis; the constants $f(\mu)$ and $c \cdot \mu$ are finite. The left-hand side satisfies $\mathbb{E}|f(Z)| < \infty$ by hypothesis. By monotonicity and linearity of expectation ([Properties of Expectation](/theorems/1117)):
\begin{align*}
\mathbb{E}[f(Z)] \geq \mathbb{E}\bigl[f(\mu) + c \cdot (Z - \mu)\bigr] = f(\mu) + c \cdot \bigl(\mathbb{E}[Z] - \mu\bigr) = f(\mu) + c \cdot (\mu - \mu) = f(\mu) = f(\mathbb{E}[Z]).
\end{align*}
This completes the proof.
[guided]
The supporting hyperplane inequality $f(x) \geq f(\mu) + c \cdot (x - \mu)$ holds for every $x \in \mathbb{R}^d$. Since $Z(\omega) \in \mathbb{R}^d$ for every $\omega \in \Omega$, substituting $x = Z(\omega)$ is valid at each $\omega$:
\begin{align*}
f(Z(\omega)) \geq f(\mu) + c \cdot (Z(\omega) - \mu) \quad \text{for every } \omega \in \Omega.
\end{align*}
This is a deterministic inequality holding at every point of $\Omega$, not merely $\mathbb{P}$-almost surely.
We wish to integrate both sides over $(\Omega, \mathcal{F}, \mathbb{P})$. To apply monotonicity of expectation from [Properties of Expectation](/theorems/1117), both sides must be in $L^1(\Omega, \mathcal{F}, \mathbb{P})$. We verify this for each side in turn.
**Left side.** By hypothesis, $\mathbb{E}|f(Z)| < \infty$, so $f(Z) \in L^1(\Omega, \mathcal{F}, \mathbb{P})$.
**Right side.** We write $f(\mu) + c \cdot (Z - \mu) = f(\mu) + c \cdot Z - c \cdot \mu$. Here $f(\mu) \in \mathbb{R}$ (finite, since $\mu \in \mathbb{R}^d$) and $c \cdot \mu = \sum_{i=1}^d c_i \mu_i \in \mathbb{R}$ (finite, since $c, \mu \in \mathbb{R}^d$) are constants. For the term $c \cdot Z = \sum_{i=1}^d c_i Z_i$, the Cauchy-Schwarz inequality for the Euclidean inner product gives $|c \cdot Z| \leq |c||Z|$. Since $|c| \in [0, \infty)$ is a finite constant,
\begin{align*}
\mathbb{E}|c \cdot Z| \leq |c|\,\mathbb{E}|Z| < \infty,
\end{align*}
where the final inequality uses $\mathbb{E}|Z| < \infty$ by hypothesis. Hence $c \cdot Z \in L^1(\Omega, \mathcal{F}, \mathbb{P})$, and the entire right-hand side is integrable.
Both sides now confirmed to be in $L^1(\Omega, \mathcal{F}, \mathbb{P})$. Since $f(Z(\omega)) \geq f(\mu) + c \cdot (Z(\omega) - \mu)$ holds for every $\omega \in \Omega$, monotonicity of expectation from [Properties of Expectation](/theorems/1117) gives
\begin{align*}
\mathbb{E}[f(Z)] \geq \mathbb{E}\bigl[f(\mu) + c \cdot (Z - \mu)\bigr].
\end{align*}
By linearity of expectation ([Properties of Expectation](/theorems/1117)), distributing over the integrable summands:
\begin{align*}
\mathbb{E}\bigl[f(\mu) + c \cdot (Z - \mu)\bigr] = f(\mu) + c \cdot \bigl(\mathbb{E}[Z] - \mu\bigr).
\end{align*}
Linearity applies to $c \cdot Z = \sum_{i=1}^d c_i Z_i$ because each $Z_i \in L^1(\Omega, \mathcal{F}, \mathbb{P})$ (following from $\mathbb{E}|Z_i| \leq \mathbb{E}|Z| < \infty$), and $\mathbb{E}[c_i Z_i] = c_i \mathbb{E}[Z_i]$ for each $i$.
By definition of $\mu$, we have $\mathbb{E}[Z] = \mu$, so $\mathbb{E}[Z] - \mu = 0 \in \mathbb{R}^d$ and therefore $c \cdot (\mathbb{E}[Z] - \mu) = 0$. The chain of inequalities gives
\begin{align*}
\mathbb{E}[f(Z)] \geq f(\mu) + 0 = f(\mu) = f(\mathbb{E}[Z]),
\end{align*}
which is Jensen's inequality.
[/guided]
[/step]