[proofplan]
We prove by induction on $k \in \mathbb{N}$ that $(A_G^k)_{xy}$ counts walks of length $k$ from $x$ to $y$. The base case $k = 1$ is immediate from the definition of the adjacency matrix. For the inductive step, we use the matrix product formula $(A_G^k)_{xy} = \sum_z (A_G^{k-1})_{xz} (A_G)_{zy}$ together with the walk-splitting bijection: every walk of length $k$ from $x$ to $y$ decomposes uniquely as a walk of length $k-1$ from $x$ to some vertex $z$, followed by an edge $zy$.
[/proofplan]
[step:Fix notation and set up induction on walk length]
Let $G$ be a graph with vertex set $V(G)$ of cardinality $n$, identified with $\{1, \ldots, n\}$ by some labelling, and adjacency matrix $A_G \in \{0,1\}^{n \times n}$ defined by
\begin{align*}
(A_G)_{ij} = \mathbb{1}_{\{i \sim j\}}.
\end{align*}
A *walk of length $k$ from $x$ to $y$* is a sequence of vertices $(v_0, v_1, \ldots, v_k)$ with $v_0 = x$, $v_k = y$, and $v_{i-1} \sim v_i$ (i.e., $v_{i-1} v_i \in E(G)$) for $i = 1, \ldots, k$. Let $W_k(x, y)$ denote the set of such walks. We prove by induction on $k \geq 1$ that
\begin{align*}
(A_G^k)_{xy} = |W_k(x, y)|.
\end{align*}
[/step]
[step:Establish the base case $k = 1$ from the definition of the adjacency matrix]
For $k = 1$, a walk of length $1$ from $x$ to $y$ is a pair $(x, y)$ with $x \sim y$; there is exactly one such walk if $x \sim y$ and none otherwise. Hence
\begin{align*}
|W_1(x, y)| = \mathbb{1}_{\{x \sim y\}} = (A_G)_{xy} = (A_G^1)_{xy}.
\end{align*}
[/step]
[step:Express $(A_G^k)_{xy}$ via matrix multiplication in terms of $(A_G^{k-1})_{xz}$]
Assume $k \geq 2$ and the inductive hypothesis that $(A_G^{k-1})_{xz} = |W_{k-1}(x, z)|$ for every pair of vertices $x, z \in V(G)$.
By associativity of matrix multiplication, $A_G^k = A_G^{k-1} \cdot A_G$, so by the definition of the matrix product,
\begin{align*}
(A_G^k)_{xy} = \sum_{z \in V(G)} (A_G^{k-1})_{xz} \cdot (A_G)_{zy} = \sum_{z \in V(G)} (A_G^{k-1})_{xz} \cdot \mathbb{1}_{\{z \sim y\}}.
\end{align*}
Applying the inductive hypothesis to replace $(A_G^{k-1})_{xz}$:
\begin{align*}
(A_G^k)_{xy} = \sum_{z \in V(G)} |W_{k-1}(x, z)| \cdot \mathbb{1}_{\{z \sim y\}} = \sum_{\substack{z \in V(G) \\ z \sim y}} |W_{k-1}(x, z)|.
\end{align*}
[/step]
[step:Construct a bijection between $W_k(x, y)$ and walks of length $k-1$ ending at a neighbour of $y$]
Consider the map
\begin{align*}
\Phi: W_k(x, y) &\to \bigsqcup_{\substack{z \in V(G) \\ z \sim y}} W_{k-1}(x, z) \\
(v_0, v_1, \ldots, v_{k-1}, v_k) &\mapsto (v_0, v_1, \ldots, v_{k-1}),
\end{align*}
which removes the final vertex. We verify $\Phi$ is well-defined and bijective.
*Well-defined.* Given a walk $w = (v_0, \ldots, v_k) \in W_k(x, y)$, its truncation $\Phi(w) = (v_0, \ldots, v_{k-1})$ is a sequence with $v_0 = x$ and consecutive vertices adjacent (inherited from $w$). Setting $z := v_{k-1}$, the walk condition at the final step gives $v_{k-1} \sim v_k$, i.e., $z \sim y$. Hence $\Phi(w) \in W_{k-1}(x, z)$ with $z \sim y$, so $\Phi(w)$ lies in the target disjoint union.
*Injective.* If $\Phi(w) = \Phi(w')$ for $w = (v_0, \ldots, v_k)$ and $w' = (v'_0, \ldots, v'_k)$, then $v_i = v'_i$ for $i = 0, \ldots, k-1$; and $v_k = y = v'_k$ by the definition of $W_k(x, y)$. So $w = w'$.
*Surjective.* Given $z \sim y$ and a walk $(v_0, \ldots, v_{k-1}) \in W_{k-1}(x, z)$, the extended sequence $(v_0, \ldots, v_{k-1}, y)$ satisfies $v_0 = x$, $v_k := y = y$, and the adjacency conditions $v_{i-1} \sim v_i$ for $i = 1, \ldots, k - 1$ (inherited) and $v_{k-1} = z \sim y$ (by hypothesis). Hence it lies in $W_k(x, y)$ and maps to the given walk under $\Phi$.
Therefore
\begin{align*}
|W_k(x, y)| = \sum_{\substack{z \in V(G) \\ z \sim y}} |W_{k-1}(x, z)|.
\end{align*}
[guided]
A walk of length $k$ has $k + 1$ vertices $(v_0, v_1, \ldots, v_k)$. To split such a walk into a shorter piece, the natural cut is at the second-to-last vertex: the walk is uniquely determined by its initial segment $(v_0, \ldots, v_{k-1})$ (a walk of length $k-1$) together with the final edge $v_{k-1} v_k$. Since $v_k = y$ is fixed, the only freedom in the final edge is the choice of $v_{k-1}$, which must be a neighbour of $y$.
This gives a correspondence between walks in $W_k(x, y)$ and pairs (walk of length $k-1$ from $x$ to some $z \in N(y)$, final vertex $y$). Since the final vertex $y$ is determined, the pair is really just the walk of length $k-1$. The map $\Phi$ formalises this correspondence by dropping the final vertex.
Why is the map injective? Because we know the target vertex $y$ is fixed in $W_k(x, y)$: removing $y$ and remembering "this is a walk of length $k-1$ ending at some $z \sim y$" loses no information.
Why surjective? Given any walk of length $k - 1$ ending at a neighbour $z$ of $y$, we can extend by the edge $zy$ to produce a walk of length $k$ ending at $y$.
Counting both sides:
\begin{align*}
|W_k(x, y)| = \sum_{z \sim y} |W_{k-1}(x, z)|.
\end{align*}
[/guided]
[/step]
[step:Complete the induction]
Combining the matrix identity and the bijective count,
\begin{align*}
(A_G^k)_{xy} = \sum_{z \sim y} |W_{k-1}(x, z)| = |W_k(x, y)|.
\end{align*}
This closes the induction and proves the claim for all $k \in \mathbb{N}$.
[/step]