[proofplan]
The rank-zero and rank-one intervals are handled directly from the defining recursion of the Möbius function and the stated convention for falling chains. For rank at least two, the EL-labeling gives a shelling of the order complex of the open interval $(x,y)$, with facets indexed by maximal chains in $[x,y]$. The EL-shelling restriction-face criterion says that precisely the falling maximal chains are the homology facets, so the order complex has reduced Euler characteristic equal to $(-1)^{r-2}f(x,y)$. Finally, Philip Hall's theorem identifies this reduced Euler characteristic with $\mu_P(x,y)$, and $(-1)^{r-2}=(-1)^r$.
[/proofplan]
[step:Check the rank-zero and rank-one intervals directly]
If $r=0$, then $x=y$. By the defining normalization of the Möbius function,
\begin{align*}
\mu_P(x,x)=1.
\end{align*}
By convention, $f(x,x)=1$, so
\begin{align*}
\mu_P(x,x)=1=(-1)^0 f(x,x).
\end{align*}
Now suppose $r=1$. Then $x\lessdot y$, and the interval $[x,y]$ has exactly one maximal chain,
\begin{align*}
x\lessdot y.
\end{align*}
Its label word has length one, hence is strictly decreasing in the vacuous adjacent-pair sense, so $f(x,y)=1$. The Möbius recursion gives
\begin{align*}
0=\sum_{x\le z\le y}\mu_P(x,z)=\mu_P(x,x)+\mu_P(x,y)=1+\mu_P(x,y).
\end{align*}
Therefore
\begin{align*}
\mu_P(x,y)=-1=(-1)^1 f(x,y).
\end{align*}
[/step]
[step:Pass from maximal chains to facets of the open interval order complex]
Assume from now on that $r\ge 2$. Let
\begin{align*}
\Delta=\triangle((x,y))
\end{align*}
denote the order complex of the open interval
\begin{align*}
(x,y)=\{z\in P:x<z<y\}.
\end{align*}
A facet of $\Delta$ is exactly a maximal chain in $(x,y)$, hence is equivalently a maximal chain
\begin{align*}
C: x=x_0\lessdot x_1\lessdot \cdots \lessdot x_r=y
\end{align*}
in $[x,y]$, with the endpoints $x$ and $y$ removed. Since $P$ is graded and the rank difference from $x$ to $y$ is $r$, such a facet contains the $r-1$ vertices
\begin{align*}
x_1,x_2,\dots,x_{r-1}.
\end{align*}
Thus every facet of $\Delta$ has dimension $r-2$.
[guided]
We now translate the combinatorics of chains into the topology of a simplicial complex. Define
\begin{align*}
\Delta=\triangle((x,y)).
\end{align*}
By definition, the vertices of $\Delta$ are the elements $z\in P$ satisfying $x<z<y$, and a face of $\Delta$ is a finite chain in the open interval $(x,y)$.
A facet of $\Delta$ is a maximal face, so it is a maximal chain in the open interval. Because $[x,y]$ is graded of rank difference $r$, every maximal chain from $x$ to $y$ has the form
\begin{align*}
C: x=x_0\lessdot x_1\lessdot \cdots \lessdot x_r=y.
\end{align*}
Removing the two endpoints gives the chain
\begin{align*}
x_1<x_2<\cdots <x_{r-1}
\end{align*}
inside $(x,y)$, and this chain is a facet of $\Delta$. Conversely, every facet of $\Delta$ is obtained in this way by adjoining $x$ at the bottom and $y$ at the top. Therefore facets of $\Delta$ are indexed by maximal chains in $[x,y]$.
The number of vertices in the facet corresponding to $C$ is $r-1$. A simplex with $r-1$ vertices has dimension $r-2$, so $\Delta$ is pure of dimension $r-2$. This dimension is the source of the sign in the final Euler characteristic calculation.
[/guided]
[/step]
[step:Use the EL-shelling criterion to count the homology facets]
Since $P$ is finite and graded, the interval $[x,y]$ is finite, bounded by $x$ and $y$, and graded with rank difference $r$. By the theorem statement, the restriction of $\lambda$ to cover relations in $[x,y]$ is the inherited EL-labeling. Therefore the standard EL-shelling theorem applies to this bounded graded interval and induces a shelling of $\Delta$ whose facets are ordered lexicographically by their label words.
For a facet $F$ in this shelling, let $R(F)$ denote its shelling restriction face, that is, the unique minimal face of $F$ that is not contained in the union of the facets preceding $F$ in the shelling order. The EL-shelling restriction-face criterion says that, for the facet corresponding to
\begin{align*}
C: x=x_0\lessdot x_1\lessdot \cdots \lessdot x_r=y
\end{align*}
the equality $R(F)=F$ holds exactly when every adjacent pair in the label word is a strict descent:
\begin{align*}
\lambda(x_{i-1},x_i)>\lambda(x_i,x_{i+1}) \text{ for every } i\in\{1,\dots,r-1\}.
\end{align*}
Equivalently, the full label word is strictly decreasing. By the definition of $f(x,y)$, the number of such facets is $f(x,y)$.
Let $d=r-2$. Write the shelling facets as $F_1,\dots,F_t$ in lexicographic shelling order, and write $R(F_k)$ for the shelling restriction face of $F_k$. Let $\widetilde{\chi}(\Delta)$ denote the reduced Euler characteristic of the finite simplicial complex $\Delta$. Since $\Delta$ is pure of dimension $d$, the [Partition of Faces by Restriction Faces]([citetheorem:8128]) applied to this shelling partitions all faces of $\Delta$, including the empty face, into intervals $\{G:R(F_k)\subseteq G\subseteq F_k\}$. If $R(F_k)\subsetneq F_k$, then the interval is a Boolean interval with at least one free vertex, so its alternating contribution to $\widetilde{\chi}(\Delta)$ is zero by cancellation of the Boolean binomial sum. If $R(F_k)=F_k$, then its only face is the facet $F_k$, and its contribution is $(-1)^d$. Therefore
\begin{align*}
\widetilde{\chi}(\Delta)=(-1)^{r-2}f(x,y).
\end{align*}
[guided]
The shelling step is where the word "falling" is converted into topology. We have already verified that $[x,y]$ is finite, bounded, and graded, and the theorem statement gives that the restricted labeling is an EL-labeling of this interval. Hence the standard EL-shelling theorem applies: ordering the facets of the order complex $\Delta=\triangle((x,y))$ lexicographically by their full label words gives a shelling.
For a facet $F$ in this shelling, define $R(F)$ to be its shelling restriction face, the unique minimal face of $F$ not contained in the union of the earlier facets. The EL-shelling restriction-face criterion identifies $R(F)$ in terms of descents of the label word. For the maximal chain
\begin{align*}
C: x=x_0\lessdot x_1\lessdot \cdots \lessdot x_r=y,
\end{align*}
the internal vertex $x_i$ belongs to $R(F)$ exactly when the adjacent labels form a strict descent:
\begin{align*}
\lambda(x_{i-1},x_i)>\lambda(x_i,x_{i+1}).
\end{align*}
Thus $R(F)=F$ exactly when every internal vertex $x_i$ belongs to $R(F)$, which is exactly the condition
\begin{align*}
\lambda(x_{i-1},x_i)>\lambda(x_i,x_{i+1}) \text{ for every } i\in\{1,\dots,r-1\}.
\end{align*}
This is the strictly decreasing, or falling, condition in the repaired theorem statement. Therefore the number of facets with $R(F)=F$ is precisely $f(x,y)$.
Now let $d=r-2$, so every facet of $\Delta$ has dimension $d$. Write the shelling facets as $F_1,\dots,F_t$, and let $R(F_k)$ be the restriction face of $F_k$. Let $\widetilde{\chi}(\Delta)$ denote the reduced Euler characteristic of $\Delta$. By the [Partition of Faces by Restriction Faces]([citetheorem:8128]), the faces of $\Delta$, including the empty face, are partitioned into the intervals
\begin{align*}
\{G:R(F_k)\subseteq G\subseteq F_k\}.
\end{align*}
If $R(F_k)\subsetneq F_k$, then this interval is a Boolean interval with at least one optional vertex. The alternating sum over that Boolean interval cancels to zero. If $R(F_k)=F_k$, then the interval consists only of the facet $F_k$, and since $\dim F_k=d$, its contribution to the reduced Euler characteristic is $(-1)^d$. Summing over all shelling intervals gives
\begin{align*}
\widetilde{\chi}(\Delta)=(-1)^d f(x,y)=(-1)^{r-2}f(x,y).
\end{align*}
[/guided]
[/step]
[step:Convert the reduced Euler characteristic into the Möbius value]
Since $P$ is finite, the interval $[x,y]$ is finite, so [Philip Hall Theorem]([citetheorem:8124]) applies to the finite poset interval. Because $x<y$, that theorem gives
\begin{align*}
\mu_P(x,y)=\widetilde{\chi}(\triangle((x,y))).
\end{align*}
Using the notation $\Delta=\triangle((x,y))$ from the previous steps and substituting the shelling computation, we obtain
\begin{align*}
\mu_P(x,y)=\widetilde{\chi}(\Delta)=(-1)^{r-2}f(x,y).
\end{align*}
Since $(-1)^{r-2}=(-1)^r$, this becomes
\begin{align*}
\mu_P(x,y)=(-1)^r f(x,y).
\end{align*}
Together with the rank-zero and rank-one cases, this proves the formula for every interval $[x,y]$.
[/step]