[proofplan]
We prove the identity by comparing the tableau generating functions on both sides. The product $h_r s_\mu$ enumerates pairs consisting of a semistandard tableau of shape $\mu$ and a weakly increasing row word of length $r$. Applying Schensted row insertion to that word gives a weight-preserving bijection from such pairs to semistandard tableaux whose shape is obtained from $\mu$ by adding a horizontal strip of size $r$. The inverse is reverse row insertion from the added boxes, taken from right to left.
[/proofplan]
[step:Rewrite the product as a generating function over tableaux and row words]
Let $\operatorname{SSYT}(\nu)$ denote the set of semistandard Young tableaux of shape $\nu$ with entries in $\mathbb{N}=\{1,2,\dots\}$. For $T \in \operatorname{SSYT}(\nu)$, define its weight sequence $\operatorname{wt}(T)=(m_1,m_2,\dots)$ by declaring $m_i$ to be the number of entries of $T$ equal to $i$, and write
\begin{align*}
x^{\operatorname{wt}(T)}:=\prod_{i\geq 1} x_i^{m_i}.
\end{align*}
Only finitely many factors differ from $1$, since $T$ has finitely many boxes.
By the tableau definition of Schur functions,
\begin{align*}
s_\mu=\sum_{T\in \operatorname{SSYT}(\mu)} x^{\operatorname{wt}(T)}.
\end{align*}
Also,
\begin{align*}
h_r=\sum_{1\leq a_1\leq \cdots \leq a_r} x_{a_1}\cdots x_{a_r},
\end{align*}
with the convention that for $r=0$ the only word is the empty word and its monomial is $1$. Therefore
\begin{align*}
h_r s_\mu
=
\sum_{\substack{T\in \operatorname{SSYT}(\mu)\\ 1\leq a_1\leq\cdots\leq a_r}}
x^{\operatorname{wt}(T)}x_{a_1}\cdots x_{a_r}.
\end{align*}
Thus it suffices to construct a weight-preserving bijection from pairs $(T,a)$, where $T\in \operatorname{SSYT}(\mu)$ and $a=(a_1,\dots,a_r)$ is weakly increasing, onto the disjoint union of $\operatorname{SSYT}(\lambda)$ over all partitions $\lambda$ such that $\lambda/\mu$ is a horizontal strip of size $r$.
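As a sanity check of this reformulation, here is the smallest nontrivial case worked out under an added simplification that is not part of the argument above: we set $x_i=0$ for $i>2$, so only the letters $1$ and $2$ occur. Take $\mu=(1)$ and $r=1$. The pairs $(T,a)$ consist of a single box with entry in $\{1,2\}$ and a single letter in $\{1,2\}$, which gives four pairs and
\begin{align*}
h_1 s_{(1)}\big|_{x_3=x_4=\cdots=0}
=(x_1+x_2)(x_1+x_2)
=x_1^2+2x_1x_2+x_2^2.
\end{align*}
The partitions $\lambda$ with $\lambda/\mu$ a horizontal strip of size $1$ are $(2)$ and $(1,1)$. With entries in $\{1,2\}$ there are three semistandard tableaux of shape $(2)$ and one of shape $(1,1)$, so
\begin{align*}
s_{(2)}\big|_{x_3=x_4=\cdots=0}+s_{(1,1)}\big|_{x_3=x_4=\cdots=0}
=(x_1^2+x_1x_2+x_2^2)+x_1x_2,
\end{align*}
which agrees with the product, and both sides count four objects.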
[/step]
[step:Insert the weakly increasing row word and track the new shape]
For a semistandard tableau $T$ and an integer $b\in\mathbb{N}$, let $T\leftarrow b$ denote ordinary row insertion of $b$ into $T$: in the first row, replace the leftmost entry strictly larger than $b$ by $b$ and bump the replaced entry to the next row; if no entry is strictly larger than $b$, append $b$ at the end of the row and stop; continue row by row with the bumped entry.
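The insertion rule just described can be sketched in a few lines of Python. This is only an illustrative sketch: the representation of a tableau as a list of rows, the function name `row_insert`, and the 0-indexed box coordinates are assumptions made here, not notation from the proof.

```python
from bisect import bisect_right


def row_insert(tableau, b):
    """Ordinary row insertion T <- b (illustrative sketch).

    `tableau` is a list of weakly increasing rows forming a semistandard
    tableau. Returns the new tableau and the 0-indexed (row, column) of
    the single box that was created.
    """
    rows = [list(row) for row in tableau]  # copy, so the input is not modified
    for i, row in enumerate(rows):
        # In a weakly increasing row, bisect_right gives the index of the
        # leftmost entry strictly larger than b (or len(row) if none exists).
        j = bisect_right(row, b)
        if j == len(row):
            row.append(b)          # nothing larger: append and stop
            return rows, (i, j)
        row[j], b = b, row[j]      # replace that entry and bump it downward
    rows.append([b])               # bumped out of the last row: start a new row
    return rows, (len(rows) - 1, 0)


T = [[1, 2, 2, 3], [2, 3, 5, 5], [4, 4, 6], [5, 6]]
U, new_box = row_insert(T, 2)
print(U)        # [[1, 2, 2, 2], [2, 3, 3, 5], [4, 4, 5], [5, 6, 6]]
print(new_box)  # (3, 2)
```

In this example the inserted $2$ bumps $3$, which bumps $5$, which bumps $6$, which is finally appended to the fourth row.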
Define
\begin{align*}
\Phi:\{(T,a):T\in\operatorname{SSYT}(\mu),\ 1\leq a_1\leq\cdots\leq a_r\}
&\to \bigcup_{\lambda}\operatorname{SSYT}(\lambda)\\
(T,(a_1,\dots,a_r))&\mapsto (((T\leftarrow a_1)\leftarrow a_2)\cdots\leftarrow a_r),
\end{align*}
where the target union is initially taken over all partitions $\lambda$.
Ordinary row insertion sends semistandard tableaux to semistandard tableaux, and the multiset of entries of $T\leftarrow b$ is that of $T$ together with one additional copy of $b$. Hence, if $U=\Phi(T,a)$, then
\begin{align*}
x^{\operatorname{wt}(U)}
=
x^{\operatorname{wt}(T)}x_{a_1}\cdots x_{a_r}.
\end{align*}
Moreover, each insertion adds exactly one box, so $\lambda=\operatorname{shape}(U)$ satisfies $\mu\subset\lambda$ and $|\lambda|-|\mu|=r$. The row-bumping lemma for semistandard row insertion says that if $a_1\leq\cdots\leq a_r$, then each newly created box lies in a column strictly to the right of the previously created one. Therefore the $r$ boxes added to $\mu$ lie in distinct columns, that is, $\lambda/\mu$ is a horizontal strip of size $r$. (This cites a result not yet in the wiki: Row-Bumping Lemma for Semistandard Row Insertion.)
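To see the map $\Phi$ and the column behaviour of the new boxes on a concrete case, the following sketch inserts a weakly increasing word and records where each new box appears. The names `row_insert` and `phi`, the list-of-rows representation, and the 0-indexed coordinates are assumptions of this sketch; the example tableau is chosen arbitrarily.

```python
from bisect import bisect_right


def row_insert(tableau, b):
    """Row insertion T <- b; returns (new tableau, 0-indexed new box)."""
    rows = [list(row) for row in tableau]
    for i, row in enumerate(rows):
        j = bisect_right(row, b)   # leftmost entry strictly larger than b
        if j == len(row):
            row.append(b)
            return rows, (i, j)
        row[j], b = b, row[j]      # replace and bump
    rows.append([b])
    return rows, (len(rows) - 1, 0)


def phi(tableau, word):
    """Insert the letters of `word` left to right; also return the new boxes in order."""
    boxes = []
    for b in word:
        tableau, box = row_insert(tableau, b)
        boxes.append(box)
    return tableau, boxes


T = [[1, 2, 4], [3, 5]]           # shape mu = (3, 2)
word = [1, 3, 3]                  # weakly increasing, r = 3
U, boxes = phi(T, word)
columns = [col for _, col in boxes]

print(U)        # [[1, 1, 3, 3], [2, 4], [3, 5]]
print(boxes)    # [(2, 0), (2, 1), (0, 3)]

# The new boxes appear in strictly increasing columns, so they form a horizontal strip.
assert all(c1 < c2 for c1, c2 in zip(columns, columns[1:]))
# The multiset of entries grows by exactly the inserted letters.
assert sorted(x for row in U for x in row) == sorted([x for row in T for x in row] + word)
```

Here the final shape is $(4,2,2)$, and the three added boxes lie in columns $1$, $2$, and $4$ in the 1-indexed convention of the text.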
[guided]
We now build the map that should explain the whole identity. A term of $h_r s_\mu$ consists of a tableau $T\in\operatorname{SSYT}(\mu)$ and a weakly increasing word $a=(a_1,\dots,a_r)$ with $1\leq a_1\leq\cdots\leq a_r$. The natural operation is to insert these letters into $T$ one at a time, from left to right.
For a semistandard tableau $T$ and a letter $b\in\mathbb{N}$, ordinary row insertion $T\leftarrow b$ is defined as follows. In the first row, locate the leftmost entry strictly larger than $b$. If such an entry exists, replace it by $b$ and bump the replaced entry into the second row. If no such entry exists, append $b$ to the end of the first row and stop. The same rule is then applied in the second row to the bumped entry, and the process continues until some entry is appended at the end of a row. This creates exactly one new box.
Define
\begin{align*}
\Phi:\{(T,a):T\in\operatorname{SSYT}(\mu),\ 1\leq a_1\leq\cdots\leq a_r\}
&\to \bigcup_{\lambda}\operatorname{SSYT}(\lambda)\\
(T,(a_1,\dots,a_r))&\mapsto (((T\leftarrow a_1)\leftarrow a_2)\cdots\leftarrow a_r).
\end{align*}
The insertion procedure only moves existing entries and adds the inserted letter. Therefore, after inserting all letters $a_1,\dots,a_r$, the weight has changed exactly by adding one copy of each $a_i$:
\begin{align*}
x^{\operatorname{wt}(\Phi(T,a))}
=
x^{\operatorname{wt}(T)}x_{a_1}\cdots x_{a_r}.
\end{align*}
The remaining point is geometric: where are the new boxes? Since the inserted word is weakly increasing, the row-bumping lemma for semistandard row insertion applies. Its hypotheses are precisely that we use ordinary row insertion and that the inserted letters satisfy $a_1\leq\cdots\leq a_r$. The conclusion is that the column indices of the newly created boxes are strictly increasing as the insertions proceed. Hence no two newly created boxes lie in the same column. Since each insertion creates exactly one box, the final shape $\lambda=\operatorname{shape}(\Phi(T,a))$ contains $\mu$ and has $r$ more boxes, and by definition $\lambda/\mu$ is a horizontal strip of size $r$. This invocation uses a standard result not yet represented as a separate wiki theorem: Row-Bumping Lemma for Semistandard Row Insertion.
[/guided]
[/step]
[step:Reverse the insertion from a horizontal strip]
Fix a partition $\lambda$ such that $\lambda/\mu$ is a horizontal strip of size $r$, and let $U\in\operatorname{SSYT}(\lambda)$. Since no two boxes of $\lambda/\mu$ lie in the same column, the $r$ columns containing a box of $\lambda/\mu$ are distinct; label them so that
\begin{align*}
c_r>c_{r-1}>\cdots>c_1,
\end{align*}
and let $B_i$ denote the unique box of $\lambda/\mu$ in column $c_i$. Thus $B_r$ is the rightmost box of the strip and $B_1$ is the leftmost.
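Starting with $U_r:=U$, perform reverse row insertion by removing $B_r$, then $B_{r-1}$, and so on down to $B_1$. At the removal of $B_i$, the tableau $U_i$ has shape $\mu\cup\{B_1,\dots,B_i\}$, and $B_i$ is a removable corner of this shape: a box to its right in the same row would lie outside $\mu$ and hence be some $B_j$ with $j>i$, which has already been removed, and a box directly below it would lie outside $\mu$ and hence be a second box of $\lambda/\mu$ in column $c_i$. Reverse row insertion deletes the entry in $B_i$ and moves it up one row, where it replaces the rightmost entry strictly smaller than it; the replaced entry moves up in the same way, and so on until an entry is ejected from the first row (if $B_i$ lies in the first row, its own entry is ejected). Let $a_i\in\mathbb{N}$ denote the ejected letter, and let $U_{i-1}$ denote the resulting tableau. The reverse row-insertion algorithm is inverse to ordinary row insertion, so every $U_i$ is semistandard and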
\begin{align*}
U_i=U_{i-1}\leftarrow a_i.
\end{align*}
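The row-bumping lemma also determines the order of the ejected letters. If $a_i>a_{i+1}$ for some $i$, the lemma would place the box created by inserting $a_{i+1}$ weakly to the left of the box created by inserting $a_i$. But the insertions $U_{i+1}=(U_{i-1}\leftarrow a_i)\leftarrow a_{i+1}$ create $B_i$ and then $B_{i+1}$, and $c_i<c_{i+1}$. Hence the ejected letters satisfy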
\begin{align*}
a_1\leq a_2\leq\cdots\leq a_r.
\end{align*}
After all removals, the remaining shape is $\mu$, so $U_0\in\operatorname{SSYT}(\mu)$. Thus every tableau $U$ of shape $\lambda$ with $\lambda/\mu$ a horizontal strip determines a pair $(U_0,(a_1,\dots,a_r))$ in the domain of $\Phi$.
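The reverse procedure of this step can be sketched as follows. The function names `reverse_row_insert` and `unpieri`, the list-of-rows representation, and the 0-indexed rows are assumptions of this sketch; it presupposes that the box being removed is a removable corner and that the skew shape really is a horizontal strip, and it does not validate either condition.

```python
from bisect import bisect_left


def reverse_row_insert(tableau, row):
    """Undo one row insertion by removing the last box of row `row` (0-indexed).

    Assumes that box is a removable corner. Returns (smaller tableau, ejected letter).
    """
    rows = [list(r) for r in tableau]
    b = rows[row].pop()                  # entry sitting in the removed corner
    if not rows[row]:
        rows.pop(row)                    # a corner row that becomes empty is the last row
    for i in range(row - 1, -1, -1):
        # Rightmost entry strictly smaller than b in the row above.
        j = bisect_left(rows[i], b) - 1
        rows[i][j], b = b, rows[i][j]    # replace it and carry the old entry upward
    return rows, b


def unpieri(tableau, mu):
    """Remove the boxes of shape(tableau)/mu from right to left.

    Returns (T, word) with T of shape mu and word listed as (a_1, ..., a_r).
    """
    rows = [list(r) for r in tableau]
    mu = list(mu) + [0] * (len(rows) - len(mu))
    # Boxes of the skew shape, stored as (column, row) so sorting orders by column.
    strip = [(c, i) for i, r in enumerate(rows) for c in range(mu[i], len(r))]
    word = []
    for _, i in sorted(strip, reverse=True):   # rightmost box first
        rows, a = reverse_row_insert(rows, i)
        word.append(a)                          # collects a_r, a_{r-1}, ..., a_1
    word.reverse()
    return rows, word


U = [[1, 1, 3, 3], [2, 4], [3, 5]]   # shape (4, 2, 2)
T, word = unpieri(U, (3, 2))
print(T)     # [[1, 2, 4], [3, 5]]
print(word)  # [1, 3, 3]
```

In this example the ejected letters are $3$, $3$, $1$ in order of removal, so read in the order $a_1,a_2,a_3$ they form the weakly increasing word $(1,3,3)$.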
[/step]
[step:Show the two constructions are inverse and sum over the allowed shapes]
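Ordinary row insertion and reverse row insertion are inverse operations at each inserted box. Start from a pair $(T,(a_1,\dots,a_r))$ and let $U=\Phi(T,a)$. By the row-bumping lemma, the box created by inserting $a_j$ lies strictly to the right of the box created by inserting $a_{j-1}$, so the rightmost box of $\operatorname{shape}(U)/\mu$ is the one created last. Removing the added boxes from right to left therefore undoes the insertions in reverse order, ejecting $a_r,a_{r-1},\dots,a_1$ and recovering the same tableau $T$ and the same word $(a_1,\dots,a_r)$. Conversely, starting from $U\in\operatorname{SSYT}(\lambda)$ with $\lambda/\mu$ a horizontal strip of size $r$, reverse insertion produces a pair $(U_0,(a_1,\dots,a_r))$ with $U_i=U_{i-1}\leftarrow a_i$ for each $i$, so forward insertion recovers $U_r=U$.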
Hence $\Phi$ is a weight-preserving bijection
\begin{align*}
\{(T,a):T\in\operatorname{SSYT}(\mu),\ 1\leq a_1\leq\cdots\leq a_r\}
\longleftrightarrow
\bigsqcup_{\substack{\lambda\text{ a partition}\\ \mu\subset\lambda\\ |\lambda|-|\mu|=r\\ \lambda/\mu\text{ horizontal}}}
\operatorname{SSYT}(\lambda).
\end{align*}
Summing the common monomial weight over both sides gives
\begin{align*}
h_r s_\mu
&=
\sum_{\substack{T\in\operatorname{SSYT}(\mu)\\1\leq a_1\leq\cdots\leq a_r}}
x^{\operatorname{wt}(T)}x_{a_1}\cdots x_{a_r}\\
&=
\sum_{\substack{\lambda\text{ a partition}\\ \mu\subset\lambda\\ |\lambda|-|\mu|=r\\ \lambda/\mu\text{ horizontal}}}
\sum_{U\in\operatorname{SSYT}(\lambda)}x^{\operatorname{wt}(U)}\\
&=
\sum_{\substack{\lambda\text{ a partition}\\ \mu\subset\lambda\\ |\lambda|-|\mu|=r\\ \lambda/\mu\text{ horizontal}}}
s_\lambda.
\end{align*}
This is the desired Horizontal Pieri Rule.
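The bijection can be checked by brute force in small cases. The sketch below restricts entries to $\{1,\dots,n\}$ (equivalently, sets $x_i=0$ for $i>n$), which is a simplification introduced here and not part of the proof. All function names are hypothetical, and the enumeration of horizontal strips uses the standard interlacing description $\lambda_1\geq\mu_1\geq\lambda_2\geq\mu_2\geq\cdots$.

```python
from bisect import bisect_right
from itertools import combinations_with_replacement, product


def row_insert(tableau, b):
    """Ordinary row insertion T <- b on a list-of-rows tableau."""
    rows = [list(r) for r in tableau]
    for row in rows:
        j = bisect_right(row, b)   # leftmost entry strictly larger than b
        if j == len(row):
            row.append(b)
            return rows
        row[j], b = b, row[j]
    rows.append([b])
    return rows


def ssyt(shape, n):
    """All semistandard tableaux of `shape` with entries in 1..n (brute force)."""
    cells = [(i, j) for i, length in enumerate(shape) for j in range(length)]
    result = []
    for values in product(range(1, n + 1), repeat=len(cells)):
        t = dict(zip(cells, values))
        # Rows weakly increase, columns strictly increase.
        if all((j == 0 or t[i, j - 1] <= v) and (i == 0 or t[i - 1, j] < v)
               for (i, j), v in t.items()):
            result.append([[t[i, j] for j in range(length)]
                           for i, length in enumerate(shape)])
    return result


def horizontal_strips(mu, r):
    """Partitions lam with lam/mu a horizontal strip of size r."""
    mu = list(mu) + [0]
    upper = [mu[0] + r] + mu[:-1]   # lam_i <= mu_{i-1}; lam_1 <= mu_1 + r
    ranges = [range(mu[i], upper[i] + 1) for i in range(len(mu))]
    return [tuple(x for x in lam if x)
            for lam in product(*ranges) if sum(lam) - sum(mu) == r]


def check_pieri(mu, r, n):
    """True if Phi maps the pairs (T, a) bijectively onto the expected tableaux."""
    images = []
    for T in ssyt(mu, n):
        for word in combinations_with_replacement(range(1, n + 1), r):
            U = T
            for b in word:          # words come out weakly increasing
                U = row_insert(U, b)
            images.append(tuple(map(tuple, U)))
    targets = [tuple(map(tuple, U))
               for lam in horizontal_strips(mu, r) for U in ssyt(lam, n)]
    return sorted(images) == sorted(targets)


print(check_pieri((2, 1), 2, 3))
```

Comparing the sorted lists tests injectivity and surjectivity at once, since the target tableaux are pairwise distinct. Weight preservation is not tested separately here because row insertion only rearranges entries and adds the inserted letter.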
[/step]