[proofplan]
We expand $\operatorname{Var}(\hat{f}_{\mathrm{rf}}(x))$ using the bilinearity of covariance, separate the diagonal (variance) and off-diagonal (covariance) terms, express the covariances in terms of $\rho$ and $\sigma_T^2$, and rearrange the resulting expression into the stated form.
[/proofplan]
[step:Expand the variance of the average using bilinearity of covariance]
Fix $x \in \mathbb{R}^p$. Write $T_b := \hat{T}^{(b)}(x)$ for brevity. By the bilinearity of covariance applied to the linear combination $\hat{f}_{\mathrm{rf}}(x) = \frac{1}{B}\sum_{b=1}^B T_b$:
\begin{align*}
\operatorname{Var}\!\left(\frac{1}{B}\sum_{b=1}^B T_b\right) = \frac{1}{B^2} \sum_{b_1=1}^B \sum_{b_2=1}^B \operatorname{Cov}(T_{b_1}, T_{b_2}).
\end{align*}
This uses $\operatorname{Var}\!\left(\sum_b \alpha_b T_b\right) = \sum_{b_1, b_2} \alpha_{b_1} \alpha_{b_2} \operatorname{Cov}(T_{b_1}, T_{b_2})$ with $\alpha_b = 1/B$ for each $b$.
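As a sanity check on this identity, here is a minimal Python sketch, not part of the proof. It compares the empirical variance of an average with the double sum of empirical covariances. The names `sample_cov`, `variance_of_average`, and `double_sum_of_covariances` are illustrative, and the simulated draws (a shared Gaussian component plus independent noise) are an assumption made only to obtain correlated $T_b$. Because empirical covariance is itself bilinear, the two sides should agree up to floating-point error for any data.

```python
import random

def sample_cov(u, v):
    """Empirical covariance (dividing by n) of two equal-length lists."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / n

def variance_of_average(samples):
    """Left-hand side: empirical variance of (1/B) * sum_b T_b."""
    B, n = len(samples), len(samples[0])
    avg = [sum(samples[b][i] for b in range(B)) / B for i in range(n)]
    return sample_cov(avg, avg)

def double_sum_of_covariances(samples):
    """Right-hand side: (1/B^2) * sum over ordered pairs (b1, b2) of Cov(T_b1, T_b2)."""
    B = len(samples)
    total = sum(
        sample_cov(samples[b1], samples[b2])
        for b1 in range(B)
        for b2 in range(B)
    )
    return total / B**2

# Hypothetical data: B correlated "tree predictions", each observed n times.
rng = random.Random(0)
B, n = 4, 500
shared = [rng.gauss(0, 1) for _ in range(n)]  # common component induces correlation
samples = [[s + rng.gauss(0, 1) for s in shared] for _ in range(B)]

lhs = variance_of_average(samples)
rhs = double_sum_of_covariances(samples)
print(abs(lhs - rhs))  # difference is at floating-point rounding level
```

The check involves no distributional assumption: the expansion is purely algebraic, which is why it holds for the empirical moments as well as the population ones.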
[/step]
[step:Separate diagonal and off-diagonal contributions]
Split the double sum into the $B$ diagonal terms ($b_1 = b_2$) and the $B(B-1)$ off-diagonal terms ($b_1 \neq b_2$):
\begin{align*}
\sum_{b_1=1}^B \sum_{b_2=1}^B \operatorname{Cov}(T_{b_1}, T_{b_2}) = \sum_{b=1}^B \operatorname{Var}(T_b) + \sum_{\substack{b_1, b_2 = 1 \\ b_1 \neq b_2}}^B \operatorname{Cov}(T_{b_1}, T_{b_2}).
\end{align*}
For the diagonal terms: since the $T_b$ are identically distributed with $\operatorname{Var}(T_b) = \sigma_T^2$, the sum of variances is $B \sigma_T^2$.
For the off-diagonal terms: for $b_1 \neq b_2$, the definition of correlation gives $\operatorname{Corr}(T_{b_1}, T_{b_2}) = \operatorname{Cov}(T_{b_1}, T_{b_2}) / (\sigma_T \cdot \sigma_T)$, since both marginal standard deviations equal $\sigma_T$ (taken to be positive, so that the correlation is defined). The hypothesis $\operatorname{Corr}(T_{b_1}, T_{b_2}) = \rho$ for all $b_1 \neq b_2$ therefore gives $\operatorname{Cov}(T_{b_1}, T_{b_2}) = \rho \sigma_T^2$. There are $B(B-1)$ ordered pairs with $b_1 \neq b_2$, so the off-diagonal sum equals $B(B-1) \rho \sigma_T^2$.
Substituting:
\begin{align*}
\operatorname{Var}(\hat{f}_{\mathrm{rf}}(x)) = \frac{1}{B^2}\bigl[B\sigma_T^2 + B(B-1)\rho\sigma_T^2\bigr] = \frac{\sigma_T^2}{B} + \frac{(B-1)\rho\sigma_T^2}{B}.
\end{align*}
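The counting argument can be checked on a concrete case. The following Python sketch is an illustration under assumed values ($B = 5$, $\rho = 3/10$, $\sigma_T^2 = 2$, none of which come from the statement). It builds the equicorrelated covariance matrix, sums its diagonal and off-diagonal entries separately, and uses exact rational arithmetic to compare them with $B\sigma_T^2$ and $B(B-1)\rho\sigma_T^2$. The helper names are hypothetical.

```python
from fractions import Fraction

def equicorrelated_cov(B, rho, sigma2):
    """B x B matrix with sigma2 on the diagonal and rho * sigma2 elsewhere."""
    return [[sigma2 if i == j else rho * sigma2 for j in range(B)] for i in range(B)]

def split_double_sum(cov):
    """Return (diagonal sum, off-diagonal sum) of a square covariance matrix."""
    B = len(cov)
    diag = sum(cov[b][b] for b in range(B))
    off = sum(cov[i][j] for i in range(B) for j in range(B) if i != j)
    return diag, off

# Assumed example values, chosen only for illustration.
B, rho, sigma2 = 5, Fraction(3, 10), Fraction(2)

diag, off = split_double_sum(equicorrelated_cov(B, rho, sigma2))
var_avg = (diag + off) / B**2  # variance of the average, via the double sum

print(diag, off, var_avg)
```

With these values the diagonal contributes $5 \cdot 2 = 10$ and the $20$ ordered off-diagonal pairs contribute $20 \cdot \tfrac{3}{5} = 12$, so the variance of the average is $22/25$.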
[/step]
[step:Rearrange into the stated form]
Factor out $\sigma_T^2 / B$:
\begin{align*}
\frac{\sigma_T^2}{B} + \frac{(B-1)\rho\sigma_T^2}{B} = \frac{\sigma_T^2}{B}\bigl[1 + (B-1)\rho\bigr] = \frac{\sigma_T^2}{B}\bigl[1 - \rho + B\rho\bigr] = \frac{(1-\rho)\sigma_T^2}{B} + \rho\,\sigma_T^2.
\end{align*}
The second equality uses $1 + (B-1)\rho = 1 + B\rho - \rho = (1 - \rho) + B\rho$. Therefore
\begin{align*}
\operatorname{Var}(\hat{f}_{\mathrm{rf}}(x)) = \frac{1 - \rho}{B}\,\sigma_T^2 + \rho\,\sigma_T^2,
\end{align*}
which is the stated decomposition. The first term $\frac{1-\rho}{B}\sigma_T^2$ decays like $1/B$ and tends to $0$ as $B \to \infty$ (the benefit of averaging), while the second term $\rho\,\sigma_T^2$ does not depend on $B$ and therefore persists regardless of the ensemble size (the cost of correlation between the base learners).
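To make the roles of the two terms concrete, the following Python sketch evaluates them for growing $B$ under assumed values $\rho = 1/4$ and $\sigma_T^2 = 1$. The values and the function name `variance_terms` are illustrative only. It also confirms, in exact arithmetic, that the rearranged form agrees with $\frac{\sigma_T^2}{B}\bigl[1 + (B-1)\rho\bigr]$.

```python
from fractions import Fraction

def variance_terms(B, rho, sigma2):
    """Return the two terms of the decomposition: ((1 - rho) * sigma2 / B, rho * sigma2)."""
    averaging_term = (1 - rho) * sigma2 / B   # shrinks as B grows
    correlation_term = rho * sigma2           # independent of B
    return averaging_term, correlation_term

# Assumed example values, chosen only for illustration.
rho, sigma2 = Fraction(1, 4), Fraction(1)

for B in (1, 10, 100, 1000):
    averaging_term, correlation_term = variance_terms(B, rho, sigma2)
    total = averaging_term + correlation_term
    # The rearranged form must match the un-rearranged expression exactly.
    assert total == sigma2 / B * (1 + (B - 1) * rho)
    print(B, float(averaging_term), float(correlation_term), float(total))
```

For $B = 1$ the two terms sum to $\sigma_T^2$, the variance of a single tree. As $B$ grows, the total approaches $\rho\,\sigma_T^2$, which here is $1/4$.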
[/step]