[guided]Assume $T_{\#}\mu=\nu$. The definition of the pushforward measure says that $T_{\#}\mu$ is the probability measure on $(Y,\mathcal{B})$ defined by
\begin{align*}
(T_{\#}\mu)(B)=\mu(T^{-1}(B))
\end{align*}
for every $B \in \mathcal{B}$. Therefore the hypothesis $T_{\#}\mu=\nu$ gives, for every $B \in \mathcal{B}$,
\begin{align*}
\nu(B)=\mu(T^{-1}(B)).
\end{align*}
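As a small sanity check of this formula (a toy example that is not part of the source argument; every specific choice below is an illustrative assumption), take $X=\{1,2,3,4\}$ with $\mu$ the uniform probability measure, $Y=\{0,1\}$, both equipped with their power-set $\sigma$-algebras, and $T(x)=x \bmod 2$. Then
\begin{align*}
\nu(\{0\})=\mu(T^{-1}(\{0\}))=\mu(\{2,4\})=\tfrac{1}{2},\qquad \nu(\{1\})=\mu(T^{-1}(\{1\}))=\mu(\{1,3\})=\tfrac{1}{2},
\end{align*}
so in this example $T_{\#}\mu$ is the uniform probability measure on $\{0,1\}$.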
The integral identity is checked first on simple functions, that is, on finite linear combinations of indicator functions, where it reduces to the identity for sets above. Let $s: Y \to \mathbb{R}$ be a $\mathcal{B}$-measurable simple function (automatically bounded, since it takes only finitely many values), and write
\begin{align*}
s=\sum_{k=1}^{m} a_k\mathbb{1}_{B_k},
\end{align*}
where $m \in \mathbb{N}$, $a_k \in \mathbb{R}$, and $B_k \in \mathcal{B}$. For each $k$ we have $\mathbb{1}_{B_k}\circ T=\mathbb{1}_{T^{-1}(B_k)}$, and $T^{-1}(B_k) \in \mathcal{A}$ because $T$ is measurable, so the composition $s\circ T: X \to \mathbb{R}$ is the $\mathcal{A}$-measurable simple function
\begin{align*}
s\circ T=\sum_{k=1}^{m} a_k\mathbb{1}_{T^{-1}(B_k)}.
\end{align*}
By the definition of the integral of a simple function, which does not depend on the chosen representation of $s$,
\begin{align*}
\int_Y s(y)\,d\nu(y)=\sum_{k=1}^{m}a_k\nu(B_k).
\end{align*}
Since $\nu(B_k)=\mu(T^{-1}(B_k))$ for each $k$,
\begin{align*}
\sum_{k=1}^{m}a_k\nu(B_k)=\sum_{k=1}^{m}a_k\mu(T^{-1}(B_k)).
\end{align*}
The right-hand side is exactly the simple-function integral of $s\circ T$ over $X$:
\begin{align*}
\sum_{k=1}^{m}a_k\mu(T^{-1}(B_k))=\int_X s(T(x))\,d\mu(x).
\end{align*}
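Chaining the last three displays gives the identity for simple functions, recorded here explicitly because it is the form used in the approximation step:
\begin{align*}
\int_Y s(y)\,d\nu(y)=\sum_{k=1}^{m}a_k\nu(B_k)=\sum_{k=1}^{m}a_k\mu(T^{-1}(B_k))=\int_X s(T(x))\,d\mu(x).
\end{align*}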
Now let $f: Y \to \mathbb{R}$ be bounded and $\mathcal{B}$-measurable. The composition $f\circ T$ is $\mathcal{A}$-measurable because $T$ is $\mathcal{A}$-$\mathcal{B}$ measurable and $f$ is $\mathcal{B}$-measurable. Since $f$ is bounded and measurable, it can be uniformly approximated by bounded measurable simple functions: for each $n \in \mathbb{N}$, choose $s_n: Y \to \mathbb{R}$ simple and $\mathcal{B}$-measurable such that
\begin{align*}
\sup_{y \in Y}|s_n(y)-f(y)|\leq \frac{1}{n}.
\end{align*}
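One concrete choice, offered as an illustration rather than as the only possible construction, is to round $f$ down to the grid $\frac{1}{n}\mathbb{Z}$. Here $M$ denotes any constant with $|f(y)|\leq M$ for all $y \in Y$, and $\lfloor\cdot\rfloor$ is the floor function; both symbols are introduced only for this sketch. Define
\begin{align*}
s_n(y)=\frac{\lfloor n f(y)\rfloor}{n},\qquad\text{so that}\qquad 0\leq f(y)-s_n(y)<\frac{1}{n}\quad\text{for every } y \in Y.
\end{align*}
Since $|nf(y)|\leq nM$, the integer $\lfloor n f(y)\rfloor$ ranges over a finite set, so $s_n$ takes finitely many values, and each level set
\begin{align*}
\{y \in Y: s_n(y)=j/n\}=f^{-1}\bigl([j/n,(j+1)/n)\bigr),\qquad j \in \mathbb{Z},
\end{align*}
belongs to $\mathcal{B}$ because $f$ is $\mathcal{B}$-measurable. Hence this $s_n$ is a $\mathcal{B}$-measurable simple function satisfying the required bound.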
Applying the simple-function identity to $s_n$ gives
\begin{align*}
\int_Y s_n(y)\,d\nu(y)=\int_X s_n(T(x))\,d\mu(x).
\end{align*}
Because both $\mu$ and $\nu$ are probability measures, the uniform error controls the integral error. On $Y$, using $\nu(Y)=1$,
\begin{align*}
\left|\int_Y f(y)\,d\nu(y)-\int_Y s_n(y)\,d\nu(y)\right|\leq \int_Y |f(y)-s_n(y)|\,d\nu(y)\leq \frac{1}{n}\,\nu(Y)=\frac{1}{n}.
\end{align*}
Likewise, since $|f(T(x))-s_n(T(x))|\leq 1/n$ for every $x \in X$,
\begin{align*}
\left|\int_X f(T(x))\,d\mu(x)-\int_X s_n(T(x))\,d\mu(x)\right|\leq \frac{1}{n}.
\end{align*}
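To make the limiting step explicit, the two error bounds can be combined with the simple-function identity for $s_n$ through the triangle inequality. Writing $f\circ T$ for $x\mapsto f(T(x))$, for every $n \in \mathbb{N}$ this gives
\begin{align*}
\left|\int_Y f\,d\nu-\int_X f\circ T\,d\mu\right|
&\leq \left|\int_Y f\,d\nu-\int_Y s_n\,d\nu\right|
+\left|\int_Y s_n\,d\nu-\int_X s_n\circ T\,d\mu\right|
+\left|\int_X s_n\circ T\,d\mu-\int_X f\circ T\,d\mu\right|\\
&\leq \frac{1}{n}+0+\frac{1}{n}=\frac{2}{n},
\end{align*}
where the middle term vanishes by the simple-function identity applied to $s_n$.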
Thus both sides of the simple-function identity converge to the corresponding integrals for $f$. Passing to the limit as $n\to\infty$ gives
\begin{align*}
\int_Y f(y)\,d\nu(y)=\int_X f(T(x))\,d\mu(x).
\end{align*}[/guided]