[proofplan]
Write the [probability generating function](/page/Probability%20Generating%20Function) as the [power series](/page/Power%20Series) $G_X(s)=\sum_{k=0}^{\infty}p_k s^k$, where $p_k=\mathbb P(X=k)$ is nonnegative and the coefficients sum to $1$. Monotonicity and strict monotonicity follow by comparing $b^k$ with $a^k$ term by term for $0\le a<b\le 1$. Convexity follows because each monomial $s\mapsto s^k$ is convex and pointwise limits of convex finite partial sums remain convex. Strict convexity follows by isolating one positive coefficient $p_m$ with $m\ge 2$, since $s\mapsto s^m$ is strictly convex on $(0,1)$ and all remaining terms preserve the weak convex inequality.
[/proofplan]
[step:Expand the probability generating function into nonnegative coefficient weights]
For each $k\in\{0,1,2,\dots\}$, define
\begin{align*}
p_k:=\mathbb P(X=k).
\end{align*}
Then $p_k\ge 0$ for every $k$, and since $X$ takes values in $\{0,1,2,\dots\}$, the events $\{X=k\}$ are pairwise disjoint and have union $\Omega$. Countable additivity gives
\begin{align*}
\sum_{k=0}^{\infty}p_k=1.
\end{align*}
By the definition of the probability [generating function](/page/Generating%20Function), for every $s\in[0,1]$,
\begin{align*}
G_X(s)=\sum_{k=0}^{\infty}p_k s^k.
\end{align*}
This series is well-defined on $[0,1]$ because $0\le p_k s^k\le p_k$ for every $k$ and $\sum_{k=0}^{\infty}p_k=1$.
[guided]
We first turn the statement into a statement about a power series with nonnegative coefficients. For each integer $k\ge 0$, define the coefficient
\begin{align*}
p_k:=\mathbb P(X=k).
\end{align*}
These coefficients are nonnegative because probabilities are nonnegative. Since $X$ takes only values in $\{0,1,2,\dots\}$, exactly one of the events $\{X=k\}$ occurs for each $\omega\in\Omega$. Therefore these events are pairwise disjoint and their union is $\Omega$. Countable additivity of $\mathbb P$ gives
\begin{align*}
\sum_{k=0}^{\infty}p_k=\mathbb P(\Omega)=1.
\end{align*}
The probability generating function is therefore, for $s\in[0,1]$,
\begin{align*}
G_X(s)=\mathbb E[s^X]=\sum_{k=0}^{\infty}p_k s^k.
\end{align*}
This formula is legitimate on $[0,1]$: since $0\le s\le 1$, we have $0\le s^k\le 1$, so
\begin{align*}
0\le p_k s^k\le p_k.
\end{align*}
Because the majorant series $\sum_{k=0}^{\infty}p_k$ has sum $1$, the defining series for $G_X(s)$ converges absolutely and defines a value in $[0,1]$.
[/guided]
[/step]
[step:Compare powers term by term to prove monotonicity and strict increase]
Let $a,b\in[0,1]$ satisfy $a<b$. For every $k\in\{0,1,2,\dots\}$,
\begin{align*}
a^k\le b^k,
\end{align*}
with the convention $a^0=b^0=1$. Multiplying by $p_k\ge 0$ and summing over $k$ gives
\begin{align*}
G_X(a)=\sum_{k=0}^{\infty}p_k a^k\le \sum_{k=0}^{\infty}p_k b^k=G_X(b).
\end{align*}
Thus $G_X$ is nondecreasing on $[0,1]$.
Assume now that $\mathbb P(X\ge 1)>0$. Since
\begin{align*}
\mathbb P(X\ge 1)=\sum_{k=1}^{\infty}p_k,
\end{align*}
there exists $m\in\{1,2,\dots\}$ such that $p_m>0$. For this $m$, the inequality $a^m<b^m$ holds whenever $0\le a<b\le 1$. Hence
\begin{align*}
p_m b^m-p_m a^m>0.
\end{align*}
All other summands satisfy $p_k b^k-p_k a^k\ge 0$, so
\begin{align*}
G_X(b)-G_X(a)=\sum_{k=0}^{\infty}p_k(b^k-a^k)>0.
\end{align*}
Therefore $G_X$ is strictly increasing on $[0,1]$.
[guided]
We prove monotonicity by comparing the power series coefficient by coefficient. Let $a,b\in[0,1]$ satisfy $a<b$. For each $k\in\{0,1,2,\dots\}$, the map $s\mapsto s^k$ is nondecreasing on $[0,1]$, with the convention $a^0=b^0=1$. Hence
\begin{align*}
a^k\le b^k.
\end{align*}
Since each coefficient $p_k$ is nonnegative, multiplying preserves the inequality:
\begin{align*}
p_k a^k\le p_k b^k.
\end{align*}
The series on both sides converge because $0\le p_k s^k\le p_k$ for $s\in[0,1]$ and $\sum_{k=0}^{\infty}p_k=1$. Therefore summing the termwise inequalities gives
\begin{align*}
G_X(a)=\sum_{k=0}^{\infty}p_k a^k\le \sum_{k=0}^{\infty}p_k b^k=G_X(b).
\end{align*}
This proves that $G_X$ is nondecreasing on $[0,1]$.
Now assume $\mathbb P(X\ge 1)>0$. Since $X$ takes values in $\{0,1,2,\dots\}$, the event $\{X\ge 1\}$ is the disjoint union of the events $\{X=k\}$ for $k\ge 1$. Thus countable additivity gives
\begin{align*}
\mathbb P(X\ge 1)=\sum_{k=1}^{\infty}p_k.
\end{align*}
Because this sum is positive and every summand is nonnegative, there is some $m\in\{1,2,\dots\}$ such that $p_m>0$. For this index $m$, the inequality is strict:
\begin{align*}
a^m<b^m,
\end{align*}
so
\begin{align*}
p_m(b^m-a^m)>0.
\end{align*}
For every other index $k$, we still have $p_k(b^k-a^k)\ge 0$. Hence the nonnegative series of gaps has at least one strictly positive term, and therefore
\begin{align*}
G_X(b)-G_X(a)=\sum_{k=0}^{\infty}p_k(b^k-a^k)>0.
\end{align*}
Thus $G_X(a)<G_X(b)$ whenever $0\le a<b\le 1$, which is exactly strict increase on $[0,1]$.
[/guided]
[/step]
[step:Pass convexity from monomials to the infinite probability generating series]
For each $k\in\{0,1,2,\dots\}$, define the function $\varphi_k:[0,1]\to\mathbb R$ by $\varphi_k(s)=s^k$. The functions $\varphi_0$ and $\varphi_1$ are affine. For $k\ge 2$, $\varphi_k$ is twice differentiable on $(0,1)$ and satisfies
\begin{align*}
\varphi_k''(s)=k(k-1)s^{k-2}\ge 0
\end{align*}
for every $s\in(0,1)$, so $\varphi_k$ is convex on $[0,1]$.
For each $N\in\{0,1,2,\dots\}$, define the partial-sum function $S_N:[0,1]\to\mathbb R$ by
\begin{align*}
S_N(s)=\sum_{k=0}^{N}p_k s^k.
\end{align*}
Each $S_N$ is convex on $[0,1]$, because it is a finite nonnegative linear combination of convex functions. Let $a,b\in[0,1]$ and let $\lambda\in[0,1]$. Convexity of $S_N$ gives
\begin{align*}
S_N(\lambda a+(1-\lambda)b)\le \lambda S_N(a)+(1-\lambda)S_N(b).
\end{align*}
For the three points $\lambda a+(1-\lambda)b$, $a$, and $b$ in $[0,1]$, pointwise convergence gives
\begin{align*}
\lim_{N\to\infty}S_N(\lambda a+(1-\lambda)b)=G_X(\lambda a+(1-\lambda)b),
\end{align*}
\begin{align*}
\lim_{N\to\infty}S_N(a)=G_X(a),
\end{align*}
and
\begin{align*}
\lim_{N\to\infty}S_N(b)=G_X(b).
\end{align*}
Taking limits in the finite convexity inequality preserves the weak inequality, so
\begin{align*}
G_X(\lambda a+(1-\lambda)b)\le \lambda G_X(a)+(1-\lambda)G_X(b).
\end{align*}
Thus $G_X$ is convex on $[0,1]$.
[guided]
The goal is to prove convexity without differentiating $G_X$, because differentiating the infinite series would require moment assumptions that are not present in the theorem. Instead, we approximate $G_X$ by finite convex sums and then pass to the pointwise limit.
For each $k\in\{0,1,2,\dots\}$, define the monomial function $\varphi_k:[0,1]\to\mathbb R$ by
\begin{align*}
\varphi_k(s)=s^k.
\end{align*}
The functions $\varphi_0$ and $\varphi_1$ are affine, hence convex. If $k\ge 2$, then $\varphi_k$ is twice differentiable on $(0,1)$ and
\begin{align*}
\varphi_k''(s)=k(k-1)s^{k-2}\ge 0
\end{align*}
for every $s\in(0,1)$. Therefore $\varphi_k$ is convex on $[0,1]$.
Now define, for each $N\in\{0,1,2,\dots\}$, the finite partial sum $S_N:[0,1]\to\mathbb R$ by
\begin{align*}
S_N(s)=\sum_{k=0}^{N}p_k s^k.
\end{align*}
Each coefficient $p_k$ is nonnegative, and nonnegative finite linear combinations preserve convexity. Hence every $S_N$ is convex on $[0,1]$. Thus, for any $a,b\in[0,1]$ and $\lambda\in[0,1]$,
\begin{align*}
S_N(\lambda a+(1-\lambda)b)\le \lambda S_N(a)+(1-\lambda)S_N(b).
\end{align*}
For each fixed $s\in[0,1]$, the partial sums $S_N(s)$ converge to $G_X(s)$ by the power-series representation of the probability generating function:
\begin{align*}
\lim_{N\to\infty}S_N(s)=G_X(s).
\end{align*}
The three arguments $\lambda a+(1-\lambda)b$, $a$, and $b$ all lie in $[0,1]$, so the pointwise convergence applies to each of them:
\begin{align*}
\lim_{N\to\infty}S_N(\lambda a+(1-\lambda)b)=G_X(\lambda a+(1-\lambda)b),
\end{align*}
\begin{align*}
\lim_{N\to\infty}S_N(a)=G_X(a),
\end{align*}
and
\begin{align*}
\lim_{N\to\infty}S_N(b)=G_X(b).
\end{align*}
Since the inequality holds for every $N$ and all three limits exist as [real numbers](/page/Real%20Numbers), passing to the limit preserves the weak inequality. Therefore
\begin{align*}
G_X(\lambda a+(1-\lambda)b)\le \lambda G_X(a)+(1-\lambda)G_X(b).
\end{align*}
This is exactly the definition of convexity on $[0,1]$.
[/guided]
[/step]
[step:Isolate one positive higher-order coefficient to prove strict convexity]
Assume that $\mathbb P(X\ge 2)>0$. Since
\begin{align*}
\mathbb P(X\ge 2)=\sum_{k=2}^{\infty}p_k,
\end{align*}
there exists $m\in\{2,3,\dots\}$ such that $p_m>0$.
Let $a,b\in(0,1)$ satisfy $a<b$, and let $\lambda\in(0,1)$. For every $k\in\{0,1,2,\dots\}$, convexity of $s\mapsto s^k$ gives
\begin{align*}
(\lambda a+(1-\lambda)b)^k\le \lambda a^k+(1-\lambda)b^k.
\end{align*}
For $k=m$, the function $s\mapsto s^m$ is strictly convex on $(0,1)$ because
\begin{align*}
m(m-1)s^{m-2}>0
\end{align*}
for every $s\in(0,1)$. Therefore
\begin{align*}
(\lambda a+(1-\lambda)b)^m<\lambda a^m+(1-\lambda)b^m.
\end{align*}
Multiplying the weak inequalities by $p_k\ge 0$, multiplying the strict inequality by $p_m>0$, and summing the resulting nonnegative termwise gaps gives
\begin{align*}
G_X(\lambda a+(1-\lambda)b)<\lambda G_X(a)+(1-\lambda)G_X(b).
\end{align*}
Thus $G_X$ is strictly convex on $(0,1)$ whenever $\mathbb P(X\ge 2)>0$.
[guided]
We prove strict convexity by locating one genuinely curved monomial with positive coefficient. Assume $\mathbb P(X\ge 2)>0$. Since $X$ takes values in $\{0,1,2,\dots\}$, the event $\{X\ge 2\}$ is the disjoint union of the events $\{X=k\}$ for $k\ge 2$. Therefore
\begin{align*}
\mathbb P(X\ge 2)=\sum_{k=2}^{\infty}p_k.
\end{align*}
All summands are nonnegative, and the sum is positive, so there exists $m\in\{2,3,\dots\}$ with $p_m>0$.
Let $a,b\in(0,1)$ satisfy $a<b$, and let $\lambda\in(0,1)$. For every $k\in\{0,1,2,\dots\}$, convexity of the monomial $s\mapsto s^k$ on $[0,1]$ gives
\begin{align*}
(\lambda a+(1-\lambda)b)^k\le \lambda a^k+(1-\lambda)b^k.
\end{align*}
For the selected index $m\ge 2$, the monomial $s\mapsto s^m$ is strictly convex on $(0,1)$ because its [second derivative](/page/Second%20Derivative) is
\begin{align*}
m(m-1)s^{m-2}>0
\end{align*}
for every $s\in(0,1)$. Since $a\ne b$ and $\lambda\in(0,1)$, strict convexity gives
\begin{align*}
(\lambda a+(1-\lambda)b)^m<\lambda a^m+(1-\lambda)b^m.
\end{align*}
Multiplying the weak inequalities by $p_k\ge 0$ preserves them, and multiplying the strict inequality for $k=m$ by $p_m>0$ preserves strictness. Thus the termwise convexity gaps are all nonnegative and the $m$th gap is strictly positive. Summing the gaps in the convergent probability generating series gives
\begin{align*}
G_X(\lambda a+(1-\lambda)b)<\lambda G_X(a)+(1-\lambda)G_X(b).
\end{align*}
This holds for every $a,b\in(0,1)$ with $a<b$ and every $\lambda\in(0,1)$, so $G_X$ is strictly convex on $(0,1)$.
[/guided]
[/step]