[proofplan]
The argument rests on the concave estimate $\ln x \le x - 1$, valid for all $x > 0$, with equality iff $x = 1$. Applied to $x = q_i / p_i$ on the support of $p$, this turns a weighted sum of log-ratios into a telescoping expression that reduces to at most $\sum q_i - \sum p_i \le 0$. Rearranging gives Gibbs' inequality. Equality forces both $q_i/p_i = 1$ on the support of $p$ and $\sum_{i : p_i \ne 0} q_i = 1$, which together imply $q = p$ on all indices. The conversion from $\ln$ to $\log_2$ is a positive-scalar multiplication that preserves the inequality.
[/proofplan]
[step:Reduce from $\log_2$ to $\ln$]
For $x > 0$, $\log_2 x = \ln x / \ln 2$. Since $\ln 2 > 0$, multiplying both sides of the desired inequality by $\ln 2 > 0$ shows it is equivalent to
\begin{align*}
-\sum_{i=1}^n p_i \ln p_i \le -\sum_{i=1}^n p_i \ln q_i,
\end{align*}
with the same equality condition. We prove this reformulated inequality.
We use the convention $0 \cdot \ln 0 := 0$ (justified by $\lim_{x \to 0^+} x \ln x = 0$) and $0 \cdot \ln q_i := 0$ even when $q_i = 0$. To avoid pathologies when $q_i = 0$, set
\begin{align*}
I := \{i \in \{1, \ldots, n\} : p_i \ne 0\},
\end{align*}
and observe that both sums are supported on $I$:
\begin{align*}
-\sum_{i=1}^n p_i \ln p_i = -\sum_{i \in I} p_i \ln p_i, \qquad -\sum_{i=1}^n p_i \ln q_i = -\sum_{i \in I} p_i \ln q_i.
\end{align*}
If there is any $i \in I$ with $q_i = 0$, then $p_i > 0$ and $p_i \ln q_i = p_i \cdot (-\infty) = -\infty$, so the right-hand side is $+\infty$ and the inequality holds vacuously; equality is then impossible since $q_i = 0 \ne p_i$. Assume therefore that $q_i > 0$ for every $i \in I$.
[/step]
[step:Record the elementary estimate $\ln x \le x - 1$]
[claim:For all $x > 0$, $\ln x \le x - 1$, with equality iff $x = 1$]
Define
\begin{align*}
\varphi : (0, \infty) &\to \mathbb{R}, \\
x &\mapsto (x - 1) - \ln x.
\end{align*}
Then $\varphi$ is differentiable on $(0, \infty)$ with $\varphi'(x) = 1 - 1/x$. Thus $\varphi'(x) < 0$ on $(0, 1)$, $\varphi'(1) = 0$, and $\varphi'(x) > 0$ on $(1, \infty)$. By the first-derivative test, $\varphi$ attains its global minimum on $(0, \infty)$ at $x = 1$, where $\varphi(1) = 0$. Hence $\varphi(x) \ge 0$ for all $x > 0$, with equality iff $x = 1$. Rearranging gives $\ln x \le x - 1$ with equality iff $x = 1$.
[/claim]
[proof][/proof]
[/step]
[step:Apply the estimate to $x = q_i / p_i$ and sum]
For each $i \in I$, set $x_i := q_i / p_i > 0$. The claim of Step 2 gives
\begin{align*}
\ln \frac{q_i}{p_i} \le \frac{q_i}{p_i} - 1, \qquad i \in I,
\end{align*}
with equality iff $q_i = p_i$. Multiplying by $p_i > 0$ (which preserves the inequality and the equality condition) and summing over $i \in I$,
\begin{align*}
\sum_{i \in I} p_i \ln \frac{q_i}{p_i} \le \sum_{i \in I} p_i \left(\frac{q_i}{p_i} - 1\right) = \sum_{i \in I} q_i - \sum_{i \in I} p_i.
\end{align*}
Now $\sum_{i \in I} p_i = \sum_{i=1}^n p_i = 1$ (since $p_i = 0$ outside $I$), and $\sum_{i \in I} q_i \le \sum_{i=1}^n q_i = 1$ (since the summands $q_i \ge 0$ are non-negative). Substituting,
\begin{align*}
\sum_{i \in I} p_i \ln \frac{q_i}{p_i} \le \sum_{i \in I} q_i - 1 \le 1 - 1 = 0.
\end{align*}
[guided]
We chose $x = q_i/p_i$ specifically to introduce both distributions into the inequality. The weight $p_i$ attached to the $i$-th term plays two roles: it guarantees the sum is supported on $I$ (keeping all ratios well-defined), and it cancels the denominator in $p_i(q_i/p_i - 1) = q_i - p_i$, turning the bound into a telescoping difference of two probability sums. The final $\sum_{i \in I} q_i \le 1$ uses only non-negativity of $q$ and the restriction of the summation to $I$; it becomes an equality exactly when $q$ is also supported in $I$.
[/guided]
[/step]
[step:Rearrange the log-ratio sum into Gibbs' inequality]
Split the logarithm: for $i \in I$, $\ln(q_i/p_i) = \ln q_i - \ln p_i$. Therefore
\begin{align*}
\sum_{i \in I} p_i \ln \frac{q_i}{p_i} = \sum_{i \in I} p_i \ln q_i - \sum_{i \in I} p_i \ln p_i.
\end{align*}
Using the support identities from Step 1,
\begin{align*}
\sum_{i \in I} p_i \ln q_i = \sum_{i=1}^n p_i \ln q_i, \qquad \sum_{i \in I} p_i \ln p_i = \sum_{i=1}^n p_i \ln p_i.
\end{align*}
Combining with the bound from Step 3,
\begin{align*}
\sum_{i=1}^n p_i \ln q_i - \sum_{i=1}^n p_i \ln p_i \le 0,
\end{align*}
which rearranges to
\begin{align*}
-\sum_{i=1}^n p_i \ln p_i \le -\sum_{i=1}^n p_i \ln q_i.
\end{align*}
By Step 1, this is the form of Gibbs' inequality with $\log_2$ replaced by $\ln$, hence establishes Gibbs' inequality.
[/step]
[step:Characterise the case of equality]
Tracing the equality conditions through Steps 3 and 1:
**Equality in Step 3 requires both:**
(a) $q_i / p_i = 1$ for every $i \in I$, i.e., $q_i = p_i$ on the support of $p$; and
(b) $\sum_{i \in I} q_i = \sum_{i=1}^n q_i$, i.e., $q_i = 0$ for every $i \notin I$.
**Conclusion.** Conditions (a) and (b) together say: $q_i = p_i$ for every $i \in I$, and $q_i = 0 = p_i$ for every $i \notin I$ (using the definition of $I$). Hence $q_i = p_i$ for all $i \in \{1, \ldots, n\}$.
Conversely, if $q_i = p_i$ for all $i$, then both sides of Gibbs' inequality are the (entropy) value $-\sum p_i \log p_i$, giving equality.
Equality therefore holds if and only if $p_i = q_i$ for all $i \in \{1, \ldots, n\}$, completing the proof.
[guided]
Equality in the Gibbs inequality comes from two sources, both of which must saturate. First, the pointwise estimate $\ln x \le x - 1$ is an equality only at $x = 1$, forcing $q_i = p_i$ on the support $I$ of $p$. Second, the final estimate $\sum_{i \in I} q_i \le \sum_{i=1}^n q_i = 1$ is an equality only when $q_i = 0$ outside $I$. Both must hold simultaneously, and together they pin down $q$ completely: $q = p$ on $I$ and $q = 0 = p$ off $I$. Each condition alone is not enough — one could imagine $q = p$ on the support of $p$ but $q$ placing mass outside that support, which would violate normalisation of $q$. This is why the theorem states the equality condition as the single sharp statement $p = q$.
[/guided]
[/step]