[guided]The purpose of the upper Brun weights is to replace the exact condition $\gcd(a,P_z)=1$ by a divisor sum that can be evaluated through the sieve axioms. The exact condition is already encoded by the map $I_z: \mathcal A\to\{0,1\}$, where $I_z(a)=1$ precisely when $a$ survives the sieving by primes in $\mathcal P$ below $z$.
Since $s=\frac{\log D}{\log z}$ is assumed to lie in a compact subinterval of $(1,\infty)$, the upper part of the weight-level dimension-one Brun fundamental lemma applies. Its hypotheses are exactly the data in the theorem statement: the divisors are squarefree and supported on $\mathcal P$, the density map $g: \mathcal D_\infty\to\mathbb R$ is multiplicative, and the level is $D$. The conclusion gives weights $\lambda_d^+\in\mathbb R$, supported on divisors $d\mid P_z$ with $d\leq D$, with $|\lambda_d^+|\leq1$, and with the pointwise majorization
\begin{align*}
I_z(a)\leq \sum_{d\mid \gcd(a,P_z)}\lambda_d^+
\end{align*}
for every $a\in\mathcal A$.
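As a purely illustrative check of what such a majorization looks like, consider a hypothetical toy case that is not taken from the theorem: suppose the primes of $\mathcal P$ below $z$ are $2,3,5$, so that $P_z=30$, and take the truncated M\"obius weights $\lambda_d^+=\mu(d)$ for $d\mid 30$ with at most two prime factors and $\lambda_{30}^+=0$. These are not claimed to be the weights produced by the fundamental lemma; they only show the shape of the inequality. If $\gcd(a,P_z)$ has exactly $k$ prime factors, then
\begin{align*}
\sum_{d\mid\gcd(a,P_z)}\lambda_d^+=1-k+\binom{k}{2}=\begin{cases}1,&k=0,\\0,&k=1,\\0,&k=2,\\1,&k=3,\end{cases}
\end{align*}
so the sum equals $I_z(a)$ when $k\leq2$ and exceeds it by $1$ when $k=3$. In this toy case the weights satisfy $|\lambda_d^+|\leq1$ and are supported on $d\leq15$, so the support condition $d\leq D$ holds once $D\geq15$.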
We now sum over $a\in\mathcal A$. This causes no convergence issue because $\mathcal A$ is finite and $P_z$ has only finitely many divisors. Interchanging the two finite sums, a fixed divisor $d\mid P_z$ appears in the inner sum exactly for those $a\in\mathcal A$ with $d\mid a$, and those elements form $\mathcal A_d$. Hence
\begin{align*}
S_{\mathcal A,\mathcal P}(z)=\sum_{a\in\mathcal A}I_z(a)\leq \sum_{a\in\mathcal A}\sum_{d\mid\gcd(a,P_z)}\lambda_d^+=\sum_{d\mid P_z}\lambda_d^+|\mathcal A_d|.
\end{align*}
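The interchange of summation in the last step can be written out with an indicator. This is only a more explicit form of the same computation: the symbol $\mathbf 1_{d\mid a}$ is notation introduced here for the indicator of the condition $d\mid a$, and the display assumes, as the text indicates, that $\mathcal A_d$ is the set of elements of $\mathcal A$ divisible by $d$.
\begin{align*}
\sum_{a\in\mathcal A}\sum_{d\mid\gcd(a,P_z)}\lambda_d^+
=\sum_{a\in\mathcal A}\sum_{d\mid P_z}\lambda_d^+\mathbf 1_{d\mid a}
=\sum_{d\mid P_z}\lambda_d^+\sum_{a\in\mathcal A}\mathbf 1_{d\mid a}
=\sum_{d\mid P_z}\lambda_d^+|\mathcal A_d|.
\end{align*}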
Each divisor $d\mid P_z$ is squarefree and supported on $\mathcal P$, so the sieve axiom applies to it and gives $|\mathcal A_d|=Xg(d)+r_d$. Substitution yields
\begin{align*}
S_{\mathcal A,\mathcal P}(z)\leq X\sum_{d\mid P_z}\lambda_d^+g(d)+\sum_{d\mid P_z}\lambda_d^+r_d.
\end{align*}
The last sum is the only part depending on the remainders. Since the weights vanish for $d>D$, every nonzero term has $d\in\mathcal D(D)$, and since $|\lambda_d^+|\leq1$, the triangle inequality gives
\begin{align*}
\left|\sum_{d\mid P_z}\lambda_d^+r_d\right|\leq \sum_{d\in\mathcal D(D)}|r_d|=R(D).
\end{align*}
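In more detail, and assuming that $\mathcal D(D)$ denotes the squarefree integers $d\leq D$ supported on $\mathcal P$ (so that it contains every $d\mid P_z$ with $d\leq D$), the estimate is the chain
\begin{align*}
\left|\sum_{d\mid P_z}\lambda_d^+r_d\right|
\leq\sum_{\substack{d\mid P_z\\ d\leq D}}|\lambda_d^+|\,|r_d|
\leq\sum_{\substack{d\mid P_z\\ d\leq D}}|r_d|
\leq\sum_{d\in\mathcal D(D)}|r_d|,
\end{align*}
where the first step uses the support of the weights together with the triangle inequality, the second uses $|\lambda_d^+|\leq1$, and the third enlarges the index set, which can only add nonnegative terms.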
Since the remainder sum $\sum_{d\mid P_z}\lambda_d^+r_d$ is at most its absolute value, the upper majorization has reduced the sifted count to the weighted density sum, with an error of at most $R(D)$:
\begin{align*}
S_{\mathcal A,\mathcal P}(z)\leq X\sum_{d\mid P_z}\lambda_d^+g(d)+R(D).
\end{align*}[/guided]