[guided]The goal is to turn the approximate map $f:X\to Y$ into an actual ambient metric in which the two spaces are close. The cross-distance from $x\in X$ to $y\in Y$ is defined by allowing travel inside $X$ to some point $a$, paying the bridge cost $\varepsilon$, and then travelling inside $Y$ from $f(a)$ to $y$:
\begin{align*}
\rho(x,y)
:=
\inf_{a\in X}
\left\{
d_X(x,a)+\varepsilon+d_Y(f(a),y)
\right\}.
\end{align*}
The restrictions $\rho|_{X\times X}=d_X$ and $\rho|_{Y\times Y}=d_Y$ are imposed by definition, and symmetry is imposed by setting $\rho(y,x)=\rho(x,y)$.
The only non-formal point is the triangle inequality. Triangles entirely inside $X$ or entirely inside $Y$ are exactly the triangle inequalities for $d_X$ and $d_Y$. For a triangle with two points $x,x'\in X$ and one point $y\in Y$, we must prove
\begin{align*}
d_X(x,x')\leq \rho(x,y)+\rho(y,x').
\end{align*}
Choose arbitrary comparison points $a,b\in X$. Since $f$ has distortion at most $\varepsilon$, we have
\begin{align*}
d_X(a,b)
\leq d_Y(f(a),f(b))+\varepsilon.
\end{align*}
Using the triangle inequality first in $X$, then the distortion bound, and then the triangle inequality in $Y$, we obtain
\begin{align*}
d_X(x,x')
&\leq d_X(x,a)+d_X(a,b)+d_X(b,x')\\
&\leq d_X(x,a)+d_Y(f(a),f(b))+\varepsilon+d_X(b,x')\\
&\leq d_X(x,a)+d_Y(f(a),y)+d_Y(y,f(b))+\varepsilon+d_X(b,x')\\
&\leq
\left[d_X(x,a)+\varepsilon+d_Y(f(a),y)\right]
+
\left[d_X(x',b)+\varepsilon+d_Y(f(b),y)\right].
\end{align*}
Because this holds for every $a,b\in X$, taking the infimum over both variables gives the desired mixed triangle inequality.
The other mixed triangle inequalities are built into the infimum formula. For example, if $x,x'\in X$ and $y\in Y$, then for every $a\in X$,
\begin{align*}
\rho(x,y)
&\leq d_X(x,a)+\varepsilon+d_Y(f(a),y)\\
&\leq d_X(x,x')+d_X(x',a)+\varepsilon+d_Y(f(a),y).
\end{align*}
Taking the infimum over $a\in X$ gives
\begin{align*}
\rho(x,y)\leq d_X(x,x')+\rho(x',y).
\end{align*}
Similarly, if $x\in X$ and $y,y'\in Y$, then for every $a\in X$,
\begin{align*}
\rho(x,y)
&\leq d_X(x,a)+\varepsilon+d_Y(f(a),y)\\
&\leq d_X(x,a)+\varepsilon+d_Y(f(a),y')+d_Y(y',y),
\end{align*}
and taking the infimum over $a\in X$ gives
\begin{align*}
\rho(x,y)\leq \rho(x,y')+d_Y(y',y).
\end{align*}
Thus $\rho$ is a metric on the disjoint union.
Finally, this metric places $X$ close to $Y$. For each $x\in X$,
\begin{align*}
\rho(x,f(x))\leq \varepsilon.
\end{align*}
For each $y\in Y$, the $\varepsilon$-density of $f(X)$ gives some $x\in X$ with $d_Y(f(x),y)<\varepsilon$, and therefore
\begin{align*}
\rho(x,y)\leq \varepsilon+d_Y(f(x),y)<2\varepsilon.
\end{align*}
Hence
\begin{align*}
d_H^\rho(X,Y)\leq 2\varepsilon,
\end{align*}
so
\begin{align*}
d_{GH}(X,Y)\leq 2\varepsilon.
\end{align*}[/guided]