[proofplan]
We split the $(m+n)$-fold iterated join into the first $m$ observations and the next $n$ observations pulled back by $T^m$. The entropy of a join of two finite partitions is at most the sum of their entropies, and this finite inequality is proved directly from the elementary logarithmic inequality $\log x\leq x-1$. Finally, measure preservation implies that pulling a finite partition back by $T^m$ does not change its entropy, so the two entropy terms become $a_m$ and $a_n$.
[/proofplan]
[step:Split the iterated join after the first $m$ observations]
Fix $m,n\in\mathbb{N}$. Define the finite measurable partitions
\begin{align*}
\beta := \bigvee_{r=0}^{m-1}T^{-r}\alpha
\end{align*}
and
\begin{align*}
\gamma := \bigvee_{s=0}^{n-1}T^{-s}\alpha.
\end{align*}
For a finite measurable partition $\pi$ of $X$, write $T^{-m}\pi$ for the finite measurable partition whose elements are the sets $T^{-m}P$ with $P\in\pi$, ignoring repeated empty sets if necessary. Since preimages commute with finite intersections, we have
\begin{align*}
T^{-m}\gamma = \bigvee_{s=0}^{n-1}T^{-(m+s)}\alpha.
\end{align*}
Therefore
\begin{align*}
\alpha_{m+n} = \bigvee_{r=0}^{m+n-1}T^{-r}\alpha = \beta \vee T^{-m}\gamma.
\end{align*}
[/step]
[step:Establish the entropy estimates for joins and measure-preserving pullbacks]
[claim:Entropy of a join is subadditive for finite measurable partitions]
If $\rho$ and $\sigma$ are finite measurable partitions of $X$, then
\begin{align*}
H_\mu(\rho\vee\sigma)\leq H_\mu(\rho)+H_\mu(\sigma).
\end{align*}
[/claim]
[proof]
Write $\rho=\{A_1,\dots,A_N\}$ and $\sigma=\{B_1,\dots,B_M\}$. Define
\begin{align*}
p_i := \mu(A_i), \qquad q_j := \mu(B_j), \qquad c_{ij}:=\mu(A_i\cap B_j).
\end{align*}
The atoms of $\rho\vee\sigma$ are the nonempty sets among the intersections $A_i\cap B_j$, and the zero-measure intersections contribute $0$ to entropy by the convention $0\log 0=0$. Let
\begin{align*}
S := \{(i,j)\in\{1,\dots,N\}\times\{1,\dots,M\}: c_{ij}>0\}.
\end{align*}
For $(i,j)\in S$, we have $p_i>0$ and $q_j>0$. Applying $\log x\leq x-1$ to
\begin{align*}
x := \frac{p_iq_j}{c_{ij}}
\end{align*}
gives
\begin{align*}
c_{ij}\log\frac{p_iq_j}{c_{ij}}\leq p_iq_j-c_{ij}.
\end{align*}
Summing over $S$ gives
\begin{align*}
\sum_{(i,j)\in S}c_{ij}\log\frac{p_iq_j}{c_{ij}}\leq \sum_{(i,j)\in S}p_iq_j-\sum_{(i,j)\in S}c_{ij}\leq 1-1=0.
\end{align*}
Hence
\begin{align*}
\sum_{(i,j)\in S}c_{ij}\log\frac{c_{ij}}{p_iq_j}\geq 0.
\end{align*}
Using this inequality and the marginal identities $\sum_j c_{ij}=p_i$ and $\sum_i c_{ij}=q_j$, we obtain
\begin{align*}
H_\mu(\rho\vee\sigma) = -\sum_{(i,j)\in S}c_{ij}\log c_{ij}\leq -\sum_{(i,j)\in S}c_{ij}\log(p_iq_j)=H_\mu(\rho)+H_\mu(\sigma).
\end{align*}
This proves the claim.
[/proof]
Now let $\pi=\{P_1,\dots,P_K\}$ be any finite measurable partition of $X$. Since $T$ is measure-preserving, induction on $m$ gives
\begin{align*}
\mu(T^{-m}E)=\mu(E)
\end{align*}
for every $E\in\mathcal{B}$. Therefore
\begin{align*}
H_\mu(T^{-m}\pi)= -\sum_{k=1}^{K}\mu(T^{-m}P_k)\log\mu(T^{-m}P_k)= -\sum_{k=1}^{K}\mu(P_k)\log\mu(P_k)=H_\mu(\pi).
\end{align*}
[guided]
We need two entropy facts, and both are finite calculations.
First, take two finite measurable partitions $\rho=\{A_1,\dots,A_N\}$ and $\sigma=\{B_1,\dots,B_M\}$. Define
\begin{align*}
p_i := \mu(A_i), \qquad q_j := \mu(B_j), \qquad c_{ij}:=\mu(A_i\cap B_j).
\end{align*}
The partition $\rho\vee\sigma$ is made from the intersections $A_i\cap B_j$. Thus its entropy is the entropy of the finite joint probability table $(c_{ij})$. The desired inequality says that the entropy of this joint table is at most the sum of the entropies of its two marginal tables $(p_i)$ and $(q_j)$.
Let
\begin{align*}
S := \{(i,j)\in\{1,\dots,N\}\times\{1,\dots,M\}: c_{ij}>0\}.
\end{align*}
For $(i,j)\in S$, both $p_i$ and $q_j$ are positive, because $A_i\cap B_j\subseteq A_i$ and $A_i\cap B_j\subseteq B_j$. We may therefore apply the elementary inequality $\log x\leq x-1$ to the positive number
\begin{align*}
x := \frac{p_iq_j}{c_{ij}}.
\end{align*}
This gives
\begin{align*}
c_{ij}\log\frac{p_iq_j}{c_{ij}}\leq p_iq_j-c_{ij}.
\end{align*}
Summing over all positive cells of the joint table gives
\begin{align*}
\sum_{(i,j)\in S}c_{ij}\log\frac{p_iq_j}{c_{ij}}\leq \sum_{(i,j)\in S}p_iq_j-\sum_{(i,j)\in S}c_{ij}\leq 1-1=0.
\end{align*}
Equivalently,
\begin{align*}
\sum_{(i,j)\in S}c_{ij}\log\frac{c_{ij}}{p_iq_j}\geq 0.
\end{align*}
Now expand the entropy of the join and use this non-negativity:
\begin{align*}
H_\mu(\rho\vee\sigma) = -\sum_{(i,j)\in S}c_{ij}\log c_{ij}\leq -\sum_{(i,j)\in S}c_{ij}\log(p_iq_j).
\end{align*}
Finally, separate the logarithm and use the marginal identities $\sum_j c_{ij}=p_i$ and $\sum_i c_{ij}=q_j$:
\begin{align*}
-\sum_{(i,j)\in S}c_{ij}\log(p_iq_j)= -\sum_{i=1}^{N}p_i\log p_i-\sum_{j=1}^{M}q_j\log q_j=H_\mu(\rho)+H_\mu(\sigma).
\end{align*}
Thus
\begin{align*}
H_\mu(\rho\vee\sigma)\leq H_\mu(\rho)+H_\mu(\sigma).
\end{align*}
Second, let $\pi=\{P_1,\dots,P_K\}$ be a finite measurable partition of $X$. Since $T$ is measure-preserving, applying the identity $\mu(T^{-1}E)=\mu(E)$ repeatedly gives
\begin{align*}
\mu(T^{-m}E)=\mu(E)
\end{align*}
for every $E\in\mathcal{B}$. Therefore every atom of $T^{-m}\pi$ has the same measure as the corresponding atom of $\pi$, and so
\begin{align*}
H_\mu(T^{-m}\pi)= -\sum_{k=1}^{K}\mu(T^{-m}P_k)\log\mu(T^{-m}P_k)= -\sum_{k=1}^{K}\mu(P_k)\log\mu(P_k)=H_\mu(\pi).
\end{align*}
[/guided]
[/step]
[step:Apply the estimates to the two pieces of the split join]
From the splitting identity,
\begin{align*}
a_{m+n}=H_\mu(\alpha_{m+n})=H_\mu(\beta\vee T^{-m}\gamma).
\end{align*}
By [subadditivity of entropy](/theorems/1634) for finite joins applied to the finite measurable partitions $\beta$ and $T^{-m}\gamma$,
\begin{align*}
H_\mu(\beta\vee T^{-m}\gamma)\leq H_\mu(\beta)+H_\mu(T^{-m}\gamma).
\end{align*}
The first term is
\begin{align*}
H_\mu(\beta)=H_\mu\left(\bigvee_{r=0}^{m-1}T^{-r}\alpha\right)=a_m.
\end{align*}
The pullback invariance of entropy under the measure-preserving map $T^m$ gives
\begin{align*}
H_\mu(T^{-m}\gamma)=H_\mu(\gamma)=H_\mu\left(\bigvee_{s=0}^{n-1}T^{-s}\alpha\right)=a_n.
\end{align*}
Combining these identities yields
\begin{align*}
a_{m+n}\leq a_m+a_n.
\end{align*}
Since $m,n\in\mathbb{N}$ were arbitrary, the sequence $(a_n)_{n\in\mathbb{N}}$ is subadditive.
[/step]