[proofplan]
Part (i) is a direct binomial expansion of $x^p = (1 + p^k y)^p$: the constant and linear terms give $1 + p^{k+1}y$, and we show every other term has $p$-adic valuation at least $k + 2$. The intermediate binomial coefficients $\binom{p}{j}$ for $1 \leq j \leq p - 1$ are divisible by $p$, and the factor $p^{kj}$ supplies extra valuation, yielding $p^{2k+1} \mid \binom{p}{j}(p^k y)^j$; since $k \geq 1$, $2k + 1 \geq k + 2$. The final term $(p^k y)^p$ has valuation $pk$, and the odd-prime hypothesis $p \geq 3$ gives $pk \geq 3k \geq k + 2$. Part (ii) follows by iterating (i) exactly $k$ times, each iteration raising the exponent of $p$ in the base by one.
[/proofplan]
[step:Normalise the hypothesis to an equality and binomial-expand $x^p$]
Fix $p$ an odd prime, $y \in \mathbb{Z}$, and $k \geq 1$. Let $x \in \mathbb{Z}$ satisfy $x \equiv 1 + p^k y \pmod{p^{k+1}}$, i.e. $x = 1 + p^k y + p^{k+1} z$ for some $z \in \mathbb{Z}$. Replacing $y$ by $y' := y + pz$ yields $x = 1 + p^k y'$; since $y' \equiv y \pmod p$, the target congruence
\begin{align*}
x^p \equiv 1 + p^{k+1} y \pmod{p^{k+2}}
\end{align*}
is equivalent to
\begin{align*}
x^p \equiv 1 + p^{k+1} y' \pmod{p^{k+2}}.
\end{align*}
It therefore suffices to prove the statement under the stronger assumption $x = 1 + p^k y$, which we adopt henceforth.
Expanding by the binomial theorem in $\mathbb{Z}$:
\begin{align*}
x^p = (1 + p^k y)^p = \sum_{j=0}^{p} \binom{p}{j} (p^k y)^j = 1 + p \cdot p^k y + \sum_{j=2}^{p-1} \binom{p}{j} p^{kj} y^j + p^{pk} y^p.
\end{align*}
Collecting the first two terms gives $1 + p^{k+1}y$, so
\begin{align*}
x^p - (1 + p^{k+1} y) &= \underbrace{\sum_{j=2}^{p-1} \binom{p}{j} p^{kj} y^j}_{=: A} + \underbrace{p^{pk} y^p}_{=: B}.
\end{align*}
It remains to show $p^{k+2} \mid A$ and $p^{k+2} \mid B$.
[guided]
The hypothesis gives us $x \equiv 1 + p^k y \pmod{p^{k+1}}$, but working modulo $p^{k+2}$ requires a little more information about $x$. The trick is to upgrade to an exact equality by absorbing the ambiguity into $y$: write $x = 1 + p^k y + p^{k+1} z$, then replace $y$ by $y' := y + pz$ to get $x = 1 + p^k y'$. Since $y' \equiv y \pmod p$, and we are going to read off the conclusion modulo $p^{k+2}$ where it depends on $y$ only through $p^{k+1} y \pmod{p^{k+2}}$ — a quantity determined by $y \bmod p$ — this replacement does not change what we need to prove. We rename $y'$ back to $y$ and assume $x = 1 + p^k y$ exactly.
Now we expand $(1 + p^k y)^p$ with the binomial theorem, valid in any commutative ring (here $\mathbb{Z}$). The $j$-th term is $\binom{p}{j} (p^k y)^j$, so
\begin{align*}
x^p = 1 + \binom{p}{1} p^k y + \sum_{j=2}^{p-1} \binom{p}{j} p^{kj} y^j + \binom{p}{p} p^{pk} y^p.
\end{align*}
The outer two terms at $j = 0, 1, p$ are easy: $j = 0$ gives $1$; $j = 1$ gives $p \cdot p^k y = p^{k+1} y$ (this is the "main term" we were aiming for); $j = p$ gives $p^{pk} y^p$ (since $\binom{p}{p} = 1$). The middle terms for $2 \leq j \leq p - 1$ are where we need to check $p$-divisibility, because each of them contributes to the error $x^p - (1 + p^{k+1}y)$ that we want to show is divisible by $p^{k+2}$.
[/guided]
[/step]
[step:Bound the $p$-adic valuation of the middle binomial terms]
For $1 \leq j \leq p - 1$, the prime $p$ divides $\binom{p}{j}$. To see this, recall $\binom{p}{j} = \frac{p!}{j!\,(p-j)!}$. The numerator $p!$ contains the factor $p$, while the denominator $j!\,(p-j)!$ is a product of integers strictly less than $p$ and therefore coprime to $p$. Consequently $p \mid \binom{p}{j}$.
Hence for each $j$ with $2 \leq j \leq p - 1$:
\begin{align*}
\binom{p}{j} p^{kj} y^j \equiv 0 \pmod{p^{kj + 1}}.
\end{align*}
We show $kj + 1 \geq k + 2$ for $j \geq 2$ and $k \geq 1$:
\begin{align*}
kj + 1 - (k + 2) = k(j - 1) - 1 \geq k \cdot 1 - 1 = k - 1 \geq 0.
\end{align*}
Therefore each such term is divisible by $p^{k+2}$, and summing,
\begin{align*}
A = \sum_{j=2}^{p-1} \binom{p}{j} p^{kj} y^j \equiv 0 \pmod{p^{k+2}}.
\end{align*}
[guided]
We need to show that for $2 \leq j \leq p - 1$, the term $\binom{p}{j} p^{kj} y^j$ is divisible by $p^{k+2}$. There are two sources of $p$-divisibility in this term:
1. **From the binomial coefficient.** Writing $\binom{p}{j} = \frac{p!}{j!(p-j)!}$, the factor $p$ in the numerator is not cancelled by anything in the denominator when $1 \leq j \leq p - 1$, because both $j!$ and $(p-j)!$ are products of integers from $1$ to at most $p - 1$, all coprime to $p$. So $p \mid \binom{p}{j}$, contributing one factor of $p$. (It turns out $v_p(\binom{p}{j}) = 1$ exactly, but we only need $\geq 1$.)
2. **From the power $p^{kj}$.** This contributes $kj$ factors of $p$.
Combined, $v_p\bigl(\binom{p}{j} p^{kj} y^j\bigr) \geq kj + 1$. We need this to be at least $k + 2$, i.e. $kj + 1 \geq k + 2$, i.e. $k(j - 1) \geq 1$. For $j \geq 2$ and $k \geq 1$ this is $k(j-1) \geq 1 \cdot 1 = 1$, so the inequality holds.
Notice that the inequality $kj + 1 \geq k + 2$ can **fail** if $j = 1$: then $kj + 1 = k + 1 < k + 2$. That is exactly why we separated the $j = 1$ term from the sum at the outset — it is the leading contribution, equal to $p^{k+1} y$, and must not be discarded.
[/guided]
[/step]
[step:Bound the $p$-adic valuation of the final term using $p \geq 3$]
For the term $B = p^{pk} y^p$, the exponent of $p$ is $pk$. We need $pk \geq k + 2$, i.e. $(p - 1)k \geq 2$. Since $p \geq 3$ (the hypothesis that $p$ is **odd**), we have $p - 1 \geq 2$, and $k \geq 1$, so
\begin{align*}
(p - 1)k \geq 2 \cdot 1 = 2.
\end{align*}
Therefore $pk \geq k + 2$, and
\begin{align*}
B = p^{pk} y^p \equiv 0 \pmod{p^{k+2}}.
\end{align*}
Combining with the previous step:
\begin{align*}
x^p - (1 + p^{k+1} y) = A + B \equiv 0 \pmod{p^{k+2}},
\end{align*}
which is the conclusion of part (i).
[guided]
The last term $B = p^{pk} y^p$ needs $p^{pk}$ to be divisible by $p^{k+2}$, i.e. $pk \geq k + 2$, which rearranges to $(p-1)k \geq 2$. We have $k \geq 1$, so we need $p - 1 \geq 2$, i.e. $p \geq 3$.
**This is the only place where the hypothesis "$p$ is odd" is used.** For $p = 2$ and $k = 1$ the inequality $(p-1)k = 1 \cdot 1 = 1$ fails, and indeed the lemma is false in that case: $x = 1 + 2y$ gives $x^2 = 1 + 4y + 4y^2 = 1 + 4y(1 + y)$, and the analogue "$x^2 \equiv 1 + 4y \pmod 8$" would require $4y(1+y) \equiv 4y \pmod 8$, i.e. $4y^2 \equiv 0 \pmod 8$, which fails for odd $y$. The parity constraint $p \geq 3$ is essential.
For $p \geq 3$, the valuation estimates combine to give
\begin{align*}
x^p - (1 + p^{k+1} y) = \underbrace{A}_{v_p \geq k+2} + \underbrace{B}_{v_p \geq k+2},
\end{align*}
so the difference has $p$-adic valuation $\geq k + 2$, which is the statement $x^p \equiv 1 + p^{k+1} y \pmod{p^{k+2}}$ of part (i).
[/guided]
[/step]
[step:Derive part (ii) by iterating part (i) from exponent $1$ up to exponent $k$]
We prove part (ii), namely $(1 + py)^{p^k} \equiv 1 + p^{k+1} y \pmod{p^{k+2}}$, by induction on $k \geq 1$ using part (i) as the inductive step.
Base case $k = 1$. Part (i) with $k = 1$ applied to $x = 1 + p \cdot y$ (so that $x \equiv 1 + p y \pmod{p^2}$ automatically — the hypothesis is satisfied with equality) gives
\begin{align*}
(1 + py)^p = x^p \equiv 1 + p^{2} y \pmod{p^{3}}.
\end{align*}
This is part (ii) at $k = 1$.
Inductive step. Suppose the statement holds at level $k$:
\begin{align*}
(1 + py)^{p^k} \equiv 1 + p^{k+1} y \pmod{p^{k+2}}. \tag{IH}
\end{align*}
Set $x := (1 + py)^{p^k}$. The inductive hypothesis (IH) reads $x \equiv 1 + p^{k+1} y \pmod{p^{k+2}}$, which is exactly the hypothesis of part (i) with $k$ replaced by $k + 1$ and the same $y$. Part (i) with this substitution yields
\begin{align*}
x^p \equiv 1 + p^{(k+1)+1} y \pmod{p^{(k+1)+2}},
\end{align*}
i.e.
\begin{align*}
\bigl((1 + py)^{p^k}\bigr)^p = (1 + py)^{p^{k+1}} \equiv 1 + p^{k+2} y \pmod{p^{k+3}}.
\end{align*}
This is the statement at level $k + 1$, completing the induction.
By induction on $k$, part (ii) holds for all $k \geq 1$, finishing the proof of the Power Lifting Lemma.
[guided]
Part (ii) asks us to compute $(1 + py)^{p^k} \bmod p^{k+2}$, and the only tool available is part (i), which raises to the $p$-th power and upgrades the exponent of $p$ in the "perturbation" by one. So we iterate part (i) $k$ times, each time advancing from level $\ell$ to level $\ell + 1$.
Formally this is an induction on $k$. At $k = 1$ the claim is $(1 + py)^p \equiv 1 + p^2 y \pmod{p^3}$, which is exactly part (i) with the starting value $x = 1 + py$ (so the hypothesis $x \equiv 1 + p^1 y \pmod{p^2}$ holds automatically with equality).
For the inductive step, suppose $(1 + py)^{p^k} \equiv 1 + p^{k+1} y \pmod{p^{k+2}}$ — this is the conclusion at level $k$, and it matches the hypothesis of part (i) at level $k + 1$ for the element $x := (1 + py)^{p^k}$. Part (i) at level $k + 1$ gives
\begin{align*}
x^p \equiv 1 + p^{k+2} y \pmod{p^{k+3}},
\end{align*}
and $x^p = \bigl((1 + py)^{p^k}\bigr)^p = (1 + py)^{p^{k+1}}$. This is the claim at level $k + 1$.
The structure of the argument is therefore: part (i) is the "one-step lift" from exponent $p^{k+1}$ (mod $p^{k+2}$) to exponent $p^{k+2}$ (mod $p^{k+3}$), and part (ii) glues $k$ such lifts together starting from the initial perturbation $p \cdot y$.
[/guided]
[/step]