Jensen's Inequality for finite measure spaces — Statement & Proof

Jensen's Inequality for finite measure spaces (Theorem # 8)

Theorem


Discussion

Proof

[proofplan]
We prove Jensen's inequality in three steps. First, we show that the mean $m := \frac{1}{\mu(E)}\int_{E} g \, d\mu(\omega)$ lies in the interval $I$, so that $f(m)$ is well-defined. Second, we invoke the [Existence of Supporting Hyperplane](/theorems/7) at $m$ to produce an affine minorant $t \mapsto f(m) + r(t - m)$ lying below $f$ on all of $I$. Third, we substitute $t = g(\omega)$, integrate the resulting pointwise inequality against the normalized measure $\mu / \mu(E)$, and observe that the affine correction term vanishes by the definition of $m$, yielding the desired inequality.
[/proofplan]
[step:Show that the mean value $m$ lies in $I$ so that $f(m)$ is well-defined]
Since $0 < \mu(E) < \infty$, define the mean
\begin{align*} m := \frac{1}{\mu(E)} \int_{E} g \, d\mu(\omega). \end{align*}
Since $g: E \to I$, we have $g(\omega) \in I$ for every $\omega \in E$. Let $a \leq b$ (possibly infinite) be the endpoints of the interval $I$. Then $a \leq g(\omega) \leq b$ for all $\omega \in E$. Integrating this inequality against the probability measure $\tilde{\mu} := \mu / \mu(E)$ on the [measure space](/pages/1251) $(E, \mathcal{E})$ and applying monotonicity of the integral gives
\begin{align*} a = \int_{E} a \, d\tilde{\mu}(\omega) \leq \int_{E} g \, d\tilde{\mu}(\omega) = m \leq \int_{E} b \, d\tilde{\mu}(\omega) = b, \end{align*}
where a bound is trivial if the corresponding endpoint is infinite, since $m$ is finite by the $\mu$-integrability of $g$. It remains to exclude a finite endpoint that does not belong to $I$. If $a \notin I$, then $g(\omega) - a > 0$ for every $\omega \in E$, and a strictly positive function has a strictly positive integral over a set of positive measure, so $m - a = \int_{E} (g - a) \, d\tilde{\mu}(\omega) > 0$. The same argument applied to $b - g$ shows $m < b$ whenever $b \notin I$. Hence $m \in I$, and $f(m)$ is well-defined.
[guided]
The entire proof rests on evaluating $f$ at the point $m$, so we must first verify that $m \in I$. If $m$ fell outside $I$, the expression $f(m)$ would be undefined and the inequality could not even be stated. Define the normalized measure
\begin{align*} \tilde{\mu} := \frac{\mu}{\mu(E)}. \end{align*}
Since $\mu(E) \in (0, \infty)$ by hypothesis, $\tilde{\mu}$ is a well-defined probability measure on $(E, \mathcal{E})$, and
\begin{align*} m = \int_{E} g \, d\tilde{\mu}(\omega). \end{align*}
Why must $m$ lie in $I$?
The key fact is that integration against a probability measure cannot move the result outside the convex hull of the values being integrated. Since $g(\omega) \in I$ for every $\omega \in E$ and $I$ is an interval (hence convex), the convex hull of $g(E)$ is contained in $I$. Concretely, let $a \leq b$ be the endpoints of $I$. Then $a \leq g(\omega) \leq b$ for all $\omega \in E$. Integrating this pair of inequalities against $\tilde{\mu}$ and using monotonicity of the integral gives
\begin{align*} a = \int_{E} a \, d\tilde{\mu}(\omega) \leq \int_{E} g \, d\tilde{\mu}(\omega) = m \leq \int_{E} b \, d\tilde{\mu}(\omega) = b. \end{align*}
This places $m$ between the endpoints of $I$, and two further observations show that $m$ belongs to $I$ itself. First, $m$ is finite because $g$ is $\mu$-integrable, so an infinite endpoint is never attained. Second, if a finite endpoint, say $a$, does not belong to $I$, then $g - a > 0$ everywhere on $E$, and a strictly positive function has a strictly positive integral over a set of positive measure, so $m - a = \int_{E} (g - a) \, d\tilde{\mu}(\omega) > 0$. The same argument applies to $b$. Hence $m \in I$, and $f(m)$ is well-defined. This is the measure-theoretic version of the fact that a convex combination of points in a convex set remains in that set.
[/guided]
[/step]
[step:Obtain an affine minorant of $f$ at $m$ via the supporting hyperplane]
Since $f: I \to \mathbb{R}$ is convex and $m \in I$, the Existence of Supporting Hyperplane (Theorem 7) guarantees the existence of a constant $r \in \mathbb{R}$ (a subgradient of $f$ at $m$) such that
\begin{align*} f(t) \geq f(m) + r(t - m) \quad \text{for all } t \in I. \end{align*}
This is the one-dimensional supporting hyperplane inequality: the graph of $f$ lies on or above the graph of the affine function $t \mapsto f(m) + r(t - m)$.
[guided]
The Existence of Supporting Hyperplane (Theorem 7) is the central ingredient of the proof. What does it say? For a convex function $f: I \to \mathbb{R}$ and any point $m \in I$, there exists a real number $r$ such that the affine function $\ell(t) := f(m) + r(t - m)$ satisfies $f(t) \geq \ell(t)$ for all $t \in I$. Geometrically, the graph of $f$ lies on or above the line through $(m, f(m))$ with slope $r$. Where does $r$ come from?
At any interior point $m$ of $I$, the left and right derivatives $f'_{-}(m)$ and $f'_{+}(m)$ both exist (this is a standard property of convex functions on intervals) and satisfy $f'_{-}(m) \leq f'_{+}(m)$. Any value $r \in [f'_{-}(m), f'_{+}(m)]$ serves as a subgradient. If $m$ is a left endpoint of $I$ (say $m = a$ where $I = [a, b]$), the right derivative $f'_{+}(a)$ still exists in $[-\infty, \infty)$, and whenever it is finite the supporting inequality holds with $r = f'_{+}(a)$. In the present proof, $m = a$ can occur only when $g = a$ holds $\mu$-almost everywhere, because $g - a \geq 0$ then has integral zero; in that case both sides of Jensen's inequality equal $f(a)$. The hypothesis that $f$ is convex is consumed here: without convexity, a supporting affine minorant need not exist. The plan for the next step is to evaluate the pointwise inequality
\begin{align*} f(t) \geq f(m) + r(t - m) \quad \text{for all } t \in I \end{align*}
at $t = g(\omega)$ and integrate.
[/guided]
[/step]
[step:Substitute $t = g(\omega)$, integrate, and conclude by the definition of $m$]
For every $\omega \in E$, we have $g(\omega) \in I$. Substituting $t = g(\omega)$ into the supporting hyperplane inequality from the previous step gives
\begin{align*} f(g(\omega)) \geq f(m) + r\bigl(g(\omega) - m\bigr) \quad \text{for all } \omega \in E. \end{align*}
Both sides are $\mu$-[integrable](/pages/1152): the left-hand side $f \circ g$ is integrable by hypothesis, and the right-hand side $\omega \mapsto f(m) + r(g(\omega) - m)$ is an affine function of the integrable function $g$ with finite constants $f(m), r, m$, hence integrable on the finite-measure set $E$. Integrating both sides against the normalized measure $\tilde{\mu} = \mu / \mu(E)$ and applying monotonicity of the integral (which preserves the direction of a pointwise inequality between integrable functions) gives
\begin{align*} \frac{1}{\mu(E)} \int_{E} f(g(\omega)) \, d\mu(\omega) &\geq \frac{1}{\mu(E)} \int_{E} \bigl[f(m) + r(g(\omega) - m)\bigr] \, d\mu(\omega).
\end{align*}
By linearity of the integral, the right-hand side expands as
\begin{align*} \frac{1}{\mu(E)} \int_{E} \bigl[f(m) + r(g(\omega) - m)\bigr] \, d\mu(\omega) &= f(m) \cdot \frac{\mu(E)}{\mu(E)} + r\!\left(\frac{1}{\mu(E)} \int_{E} g \, d\mu(\omega) - m\right) \\ &= f(m) + r(m - m) \\ &= f(m). \end{align*}
In the second equality, we used the definition $m = \frac{1}{\mu(E)}\int_{E} g \, d\mu(\omega)$, so the correction term $r(m - m)$ vanishes. Combining gives
\begin{align*} f\!\left(\frac{1}{\mu(E)} \int_{E} g \, d\mu(\omega)\right) = f(m) \leq \frac{1}{\mu(E)} \int_{E} f \circ g \, d\mu(\omega), \end{align*}
which is the desired Jensen's inequality.
[guided]
This is the step where the definition of $m$ pays off. We substitute $t = g(\omega)$ into the affine minorant inequality to obtain a pointwise bound valid for every $\omega \in E$:
\begin{align*} f(g(\omega)) \geq f(m) + r\bigl(g(\omega) - m\bigr) \quad \text{for all } \omega \in E. \end{align*}
Before integrating, we must verify that both sides are $\mu$-integrable so that monotonicity of the integral applies. The left-hand side $f \circ g$ is integrable by hypothesis. For the right-hand side, the function $\omega \mapsto f(m) + r(g(\omega) - m)$ is an affine function of $g(\omega)$. Since $g$ is $\mu$-integrable by hypothesis, $\mu(E)$ is finite, and $f(m)$, $r$, and $m$ are finite real constants, this affine combination is $\mu$-integrable. With both sides integrable, monotonicity of the integral preserves the pointwise inequality upon integration against $\tilde{\mu} = \mu / \mu(E)$:
\begin{align*} \frac{1}{\mu(E)} \int_{E} f(g(\omega)) \, d\mu(\omega) \geq \frac{1}{\mu(E)} \int_{E} \bigl[f(m) + r(g(\omega) - m)\bigr] \, d\mu(\omega). \end{align*}
Now we expand the right-hand side using linearity of the integral. The constant term $f(m)$ integrates to $f(m) \cdot \tilde{\mu}(E) = f(m) \cdot 1 = f(m)$. The linear term becomes $r \cdot \left(\frac{1}{\mu(E)} \int_{E} g \, d\mu(\omega) - m\right)$.
Here is the critical cancellation: by the very definition of $m$ as the mean of $g$, we have $\frac{1}{\mu(E)} \int_{E} g \, d\mu(\omega) = m$, so the linear term equals $r \cdot (m - m) = 0$. This is not a coincidence: the mean $m$ is precisely the point at which the linear correction integrates to zero against the normalized measure. Combining:
\begin{align*} \frac{1}{\mu(E)} \int_{E} f(g(\omega)) \, d\mu(\omega) &\geq f(m) + r(m - m) = f(m) \\ &= f\!\left(\frac{1}{\mu(E)} \int_{E} g \, d\mu(\omega)\right). \end{align*}
This completes the proof of Jensen's inequality for finite measure spaces.
[/guided]
[/step]
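
As a sanity check of the three steps, here is a small worked instance. The choices $E = [0, 2]$ with Lebesgue measure $\mu$, $g(\omega) = \omega$, and $f(t) = t^{2}$ on $I = [0, 2]$ are illustrative assumptions, not part of the theorem; they are picked so that $\mu(E) = 2 \neq 1$ and the normalization is visible.

```latex
\begin{align*}
m &= \frac{1}{\mu(E)} \int_{E} g \, d\mu = \frac{1}{2} \int_{0}^{2} \omega \, d\omega = 1 \in I, \\
r &= f'(m) = 2, \qquad f(m) + r(t - m) = 2t - 1, \\
f(t) - \bigl(f(m) + r(t - m)\bigr) &= t^{2} - 2t + 1 = (t - 1)^{2} \geq 0 \quad \text{for all } t \in I, \\
\frac{1}{\mu(E)} \int_{E} f \circ g \, d\mu &= \frac{1}{2} \int_{0}^{2} \omega^{2} \, d\omega = \frac{4}{3} \geq 1 = f(m).
\end{align*}
```

In this instance the gap $\frac{4}{3} - 1 = \frac{1}{3}$ equals $\frac{1}{\mu(E)} \int_{E} (g - m)^{2} \, d\mu$, the normalized integral of the slack $(t - 1)^{2}$ in the supporting inequality.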

Prerequisites

Theorems

Existence of a Supporting Hyperplane for Convex Functions

Definitions & Concepts

measure space

convex function

Lebesgue integral
