Basic Shape of a Probability Generating Function

Basic Shape of a Probability Generating Function (Theorem # 9501)

Theorem

Edit Issues Pull Requests Attributions Admin

Discussion

Proof

[proofplan] Write the [probability generating function](/page/Probability%20Generating%20Function) as the [power series](/page/Power%20Series) $G_X(s)=\sum_{k=0}^{\infty}p_k s^k$, where $p_k=\mathbb P(X=k)$ is nonnegative and the coefficients sum to $1$. Monotonicity and strict monotonicity follow by comparing $b^k$ with $a^k$ term by term for $0\le a<b\le 1$. Convexity follows because each monomial $s\mapsto s^k$ is convex and pointwise limits of convex finite partial sums remain convex. Strict convexity follows by isolating one positive coefficient $p_m$ with $m\ge 2$, since $s\mapsto s^m$ is strictly convex on $(0,1)$ and all remaining terms preserve the weak convex inequality. [/proofplan] [step:Expand the probability generating function into nonnegative coefficient weights] For each $k\in\{0,1,2,\dots\}$, define \begin{align*} p_k:=\mathbb P(X=k). \end{align*} Then $p_k\ge 0$ for every $k$, and since $X$ takes values in $\{0,1,2,\dots\}$, the events $\{X=k\}$ are pairwise disjoint and have union $\Omega$. Countable additivity gives \begin{align*} \sum_{k=0}^{\infty}p_k=1. \end{align*} By the definition of the probability [generating function](/page/Generating%20Function), for every $s\in[0,1]$, \begin{align*} G_X(s)=\sum_{k=0}^{\infty}p_k s^k. \end{align*} This series is well-defined on $[0,1]$ because $0\le p_k s^k\le p_k$ for every $k$ and $\sum_{k=0}^{\infty}p_k=1$. [guided] We first turn the statement into a statement about a power series with nonnegative coefficients. For each integer $k\ge 0$, define the coefficient \begin{align*} p_k:=\mathbb P(X=k). \end{align*} These coefficients are nonnegative because probabilities are nonnegative. Since $X$ takes only values in $\{0,1,2,\dots\}$, exactly one of the events $\{X=k\}$ occurs for each $\omega\in\Omega$. Therefore these events are pairwise disjoint and their union is $\Omega$. Countable additivity of $\mathbb P$ gives \begin{align*} \sum_{k=0}^{\infty}p_k=\mathbb P(\Omega)=1. \end{align*} The probability generating function is therefore, for $s\in[0,1]$, \begin{align*} G_X(s)=\mathbb E[s^X]=\sum_{k=0}^{\infty}p_k s^k. \end{align*} This formula is legitimate on $[0,1]$: since $0\le s\le 1$, we have $0\le s^k\le 1$, so \begin{align*} 0\le p_k s^k\le p_k. \end{align*} Because the majorant series $\sum_{k=0}^{\infty}p_k$ has sum $1$, the defining series for $G_X(s)$ converges absolutely and defines a value in $[0,1]$. [/guided] [/step] [step:Compare powers term by term to prove monotonicity and strict increase] Let $a,b\in[0,1]$ satisfy $a<b$. For every $k\in\{0,1,2,\dots\}$, \begin{align*} a^k\le b^k, \end{align*} with the convention $a^0=b^0=1$. Multiplying by $p_k\ge 0$ and summing over $k$ gives \begin{align*} G_X(a)=\sum_{k=0}^{\infty}p_k a^k\le \sum_{k=0}^{\infty}p_k b^k=G_X(b). \end{align*} Thus $G_X$ is nondecreasing on $[0,1]$. Assume now that $\mathbb P(X\ge 1)>0$. Since \begin{align*} \mathbb P(X\ge 1)=\sum_{k=1}^{\infty}p_k, \end{align*} there exists $m\in\{1,2,\dots\}$ such that $p_m>0$. For this $m$, the inequality $a^m<b^m$ holds whenever $0\le a<b\le 1$. Hence \begin{align*} p_m b^m-p_m a^m>0. \end{align*} All other summands satisfy $p_k b^k-p_k a^k\ge 0$, so \begin{align*} G_X(b)-G_X(a)=\sum_{k=0}^{\infty}p_k(b^k-a^k)>0. \end{align*} Therefore $G_X$ is strictly increasing on $[0,1]$. [guided] We prove monotonicity by comparing the power series coefficient by coefficient. Let $a,b\in[0,1]$ satisfy $a<b$. For each $k\in\{0,1,2,\dots\}$, the map $s\mapsto s^k$ is nondecreasing on $[0,1]$, with the convention $a^0=b^0=1$. Hence \begin{align*} a^k\le b^k. \end{align*} Since each coefficient $p_k$ is nonnegative, multiplying preserves the inequality: \begin{align*} p_k a^k\le p_k b^k. \end{align*} The series on both sides converge because $0\le p_k s^k\le p_k$ for $s\in[0,1]$ and $\sum_{k=0}^{\infty}p_k=1$. Therefore summing the termwise inequalities gives \begin{align*} G_X(a)=\sum_{k=0}^{\infty}p_k a^k\le \sum_{k=0}^{\infty}p_k b^k=G_X(b). \end{align*} This proves that $G_X$ is nondecreasing on $[0,1]$. Now assume $\mathbb P(X\ge 1)>0$. Since $X$ takes values in $\{0,1,2,\dots\}$, the event $\{X\ge 1\}$ is the disjoint union of the events $\{X=k\}$ for $k\ge 1$. Thus countable additivity gives \begin{align*} \mathbb P(X\ge 1)=\sum_{k=1}^{\infty}p_k. \end{align*} Because this sum is positive and every summand is nonnegative, there is some $m\in\{1,2,\dots\}$ such that $p_m>0$. For this index $m$, the inequality is strict: \begin{align*} a^m<b^m, \end{align*} so \begin{align*} p_m(b^m-a^m)>0. \end{align*} For every other index $k$, we still have $p_k(b^k-a^k)\ge 0$. Hence the nonnegative series of gaps has at least one strictly positive term, and therefore \begin{align*} G_X(b)-G_X(a)=\sum_{k=0}^{\infty}p_k(b^k-a^k)>0. \end{align*} Thus $G_X(a)<G_X(b)$ whenever $0\le a<b\le 1$, which is exactly strict increase on $[0,1]$. [/guided] [/step] [step:Pass convexity from monomials to the infinite probability generating series] For each $k\in\{0,1,2,\dots\}$, define the function $\varphi_k:[0,1]\to\mathbb R$ by $\varphi_k(s)=s^k$. The functions $\varphi_0$ and $\varphi_1$ are affine. For $k\ge 2$, $\varphi_k$ is twice differentiable on $(0,1)$ and satisfies \begin{align*} \varphi_k''(s)=k(k-1)s^{k-2}\ge 0 \end{align*} for every $s\in(0,1)$, so $\varphi_k$ is convex on $[0,1]$. For each $N\in\{0,1,2,\dots\}$, define the partial-sum function $S_N:[0,1]\to\mathbb R$ by \begin{align*} S_N(s)=\sum_{k=0}^{N}p_k s^k. \end{align*} Each $S_N$ is convex on $[0,1]$, because it is a finite nonnegative linear combination of convex functions. Let $a,b\in[0,1]$ and let $\lambda\in[0,1]$. Convexity of $S_N$ gives \begin{align*} S_N(\lambda a+(1-\lambda)b)\le \lambda S_N(a)+(1-\lambda)S_N(b). \end{align*} For the three points $\lambda a+(1-\lambda)b$, $a$, and $b$ in $[0,1]$, pointwise convergence gives \begin{align*} \lim_{N\to\infty}S_N(\lambda a+(1-\lambda)b)=G_X(\lambda a+(1-\lambda)b), \end{align*} \begin{align*} \lim_{N\to\infty}S_N(a)=G_X(a), \end{align*} and \begin{align*} \lim_{N\to\infty}S_N(b)=G_X(b). \end{align*} Taking limits in the finite convexity inequality preserves the weak inequality, so \begin{align*} G_X(\lambda a+(1-\lambda)b)\le \lambda G_X(a)+(1-\lambda)G_X(b). \end{align*} Thus $G_X$ is convex on $[0,1]$. [guided] The goal is to prove convexity without differentiating $G_X$, because differentiating the infinite series would require moment assumptions that are not present in the theorem. Instead, we approximate $G_X$ by finite convex sums and then pass to the pointwise limit. For each $k\in\{0,1,2,\dots\}$, define the monomial function $\varphi_k:[0,1]\to\mathbb R$ by \begin{align*} \varphi_k(s)=s^k. \end{align*} The functions $\varphi_0$ and $\varphi_1$ are affine, hence convex. If $k\ge 2$, then $\varphi_k$ is twice differentiable on $(0,1)$ and \begin{align*} \varphi_k''(s)=k(k-1)s^{k-2}\ge 0 \end{align*} for every $s\in(0,1)$. Therefore $\varphi_k$ is convex on $[0,1]$. Now define, for each $N\in\{0,1,2,\dots\}$, the finite partial sum $S_N:[0,1]\to\mathbb R$ by \begin{align*} S_N(s)=\sum_{k=0}^{N}p_k s^k. \end{align*} Each coefficient $p_k$ is nonnegative, and nonnegative finite linear combinations preserve convexity. Hence every $S_N$ is convex on $[0,1]$. Thus, for any $a,b\in[0,1]$ and $\lambda\in[0,1]$, \begin{align*} S_N(\lambda a+(1-\lambda)b)\le \lambda S_N(a)+(1-\lambda)S_N(b). \end{align*} For each fixed $s\in[0,1]$, the partial sums $S_N(s)$ converge to $G_X(s)$ by the power-series representation of the probability generating function: \begin{align*} \lim_{N\to\infty}S_N(s)=G_X(s). \end{align*} The three arguments $\lambda a+(1-\lambda)b$, $a$, and $b$ all lie in $[0,1]$, so the pointwise convergence applies to each of them: \begin{align*} \lim_{N\to\infty}S_N(\lambda a+(1-\lambda)b)=G_X(\lambda a+(1-\lambda)b), \end{align*} \begin{align*} \lim_{N\to\infty}S_N(a)=G_X(a), \end{align*} and \begin{align*} \lim_{N\to\infty}S_N(b)=G_X(b). \end{align*} Since the inequality holds for every $N$ and all three limits exist as [real numbers](/page/Real%20Numbers), passing to the limit preserves the weak inequality. Therefore \begin{align*} G_X(\lambda a+(1-\lambda)b)\le \lambda G_X(a)+(1-\lambda)G_X(b). \end{align*} This is exactly the definition of convexity on $[0,1]$. [/guided] [/step] [step:Isolate one positive higher-order coefficient to prove strict convexity] Assume that $\mathbb P(X\ge 2)>0$. Since \begin{align*} \mathbb P(X\ge 2)=\sum_{k=2}^{\infty}p_k, \end{align*} there exists $m\in\{2,3,\dots\}$ such that $p_m>0$. Let $a,b\in(0,1)$ satisfy $a<b$, and let $\lambda\in(0,1)$. For every $k\in\{0,1,2,\dots\}$, convexity of $s\mapsto s^k$ gives \begin{align*} (\lambda a+(1-\lambda)b)^k\le \lambda a^k+(1-\lambda)b^k. \end{align*} For $k=m$, the function $s\mapsto s^m$ is strictly convex on $(0,1)$ because \begin{align*} m(m-1)s^{m-2}>0 \end{align*} for every $s\in(0,1)$. Therefore \begin{align*} (\lambda a+(1-\lambda)b)^m<\lambda a^m+(1-\lambda)b^m. \end{align*} Multiplying the weak inequalities by $p_k\ge 0$, multiplying the strict inequality by $p_m>0$, and summing the resulting nonnegative termwise gaps gives \begin{align*} G_X(\lambda a+(1-\lambda)b)<\lambda G_X(a)+(1-\lambda)G_X(b). \end{align*} Thus $G_X$ is strictly convex on $(0,1)$ whenever $\mathbb P(X\ge 2)>0$. [guided] We prove strict convexity by locating one genuinely curved monomial with positive coefficient. Assume $\mathbb P(X\ge 2)>0$. Since $X$ takes values in $\{0,1,2,\dots\}$, the event $\{X\ge 2\}$ is the disjoint union of the events $\{X=k\}$ for $k\ge 2$. Therefore \begin{align*} \mathbb P(X\ge 2)=\sum_{k=2}^{\infty}p_k. \end{align*} All summands are nonnegative, and the sum is positive, so there exists $m\in\{2,3,\dots\}$ with $p_m>0$. Let $a,b\in(0,1)$ satisfy $a<b$, and let $\lambda\in(0,1)$. For every $k\in\{0,1,2,\dots\}$, convexity of the monomial $s\mapsto s^k$ on $[0,1]$ gives \begin{align*} (\lambda a+(1-\lambda)b)^k\le \lambda a^k+(1-\lambda)b^k. \end{align*} For the selected index $m\ge 2$, the monomial $s\mapsto s^m$ is strictly convex on $(0,1)$ because its [second derivative](/page/Second%20Derivative) is \begin{align*} m(m-1)s^{m-2}>0 \end{align*} for every $s\in(0,1)$. Since $a\ne b$ and $\lambda\in(0,1)$, strict convexity gives \begin{align*} (\lambda a+(1-\lambda)b)^m<\lambda a^m+(1-\lambda)b^m. \end{align*} Multiplying the weak inequalities by $p_k\ge 0$ preserves them, and multiplying the strict inequality for $k=m$ by $p_m>0$ preserves strictness. Thus the termwise convexity gaps are all nonnegative and the $m$th gap is strictly positive. Summing the gaps in the convergent probability generating series gives \begin{align*} G_X(\lambda a+(1-\lambda)b)<\lambda G_X(a)+(1-\lambda)G_X(b). \end{align*} This holds for every $a,b\in(0,1)$ with $a<b$ and every $\lambda\in(0,1)$, so $G_X$ is strictly convex on $(0,1)$. [/guided] [/step]

Prerequisites (0/3 completed)

Prerequisites Graph

Interactive dependency map showing how this theorem builds on foundational concepts

Loading dependency graph...

Definitions & Concepts

Explore Further

What brings you to Androma?

Start with a route through the knowledge graph.