[proofplan]
We verify the metric axioms on $X\times Y$ coordinatewise. Nonnegativity, symmetry, and identity of indiscernibles follow directly from the corresponding axioms for $d_X$ and $d_Y$. The only substantive point is the triangle inequality: for $1\le p<\infty$ we prove the required two-coordinate Minkowski estimate, and for $p=\infty$ we use the elementary inequality $\max\{a+c,b+d\}\le \max\{a,b\}+\max\{c,d\}$ for nonnegative [real numbers](/page/Real%20Numbers).
[/proofplan]
[step:Verify nonnegativity and symmetry from the coordinate metrics]
Let $(x_1,y_1),(x_2,y_2)\in X\times Y$.
First suppose $1\le p<\infty$. Since $d_X(x_1,x_2)\ge 0$ and $d_Y(y_1,y_2)\ge 0$, the quantity
\begin{align*}
d_X(x_1,x_2)^p+d_Y(y_1,y_2)^p
\end{align*}
is nonnegative, so $d_p((x_1,y_1),(x_2,y_2))\ge 0$. Since $d_X$ and $d_Y$ are symmetric,
\begin{align*}
d_X(x_1,x_2)=d_X(x_2,x_1)
\end{align*}
and
\begin{align*}
d_Y(y_1,y_2)=d_Y(y_2,y_1).
\end{align*}
Substituting these equalities into the finite-$p$ formula gives
\begin{align*}
d_p((x_1,y_1),(x_2,y_2))=d_p((x_2,y_2),(x_1,y_1)).
\end{align*}
Now suppose $p=\infty$. The maximum of two nonnegative real numbers is nonnegative, so $d_\infty((x_1,y_1),(x_2,y_2))\ge 0$. The same symmetry equalities for $d_X$ and $d_Y$ give
\begin{align*}
d_\infty((x_1,y_1),(x_2,y_2))=d_\infty((x_2,y_2),(x_1,y_1)).
\end{align*}
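As a concrete illustration, assume $X=Y=\mathbb R$ with the usual metric $|s-t|$ (a choice made only for this example, not part of the theorem) and take the points $(0,0)$ and $(3,4)$. The coordinate distances are $3$ and $4$, so
\begin{align*}
d_1((0,0),(3,4))&=3+4=7,\\
d_2((0,0),(3,4))&=\left(3^2+4^2\right)^{1/2}=5,\\
d_\infty((0,0),(3,4))&=\max\{3,4\}=4.
\end{align*}
Each value is nonnegative and is unchanged when the two points are swapped.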
[/step]
[step:Check identity of indiscernibles coordinate by coordinate]
Let $(x_1,y_1),(x_2,y_2)\in X\times Y$.
If $(x_1,y_1)=(x_2,y_2)$, then $x_1=x_2$ and $y_1=y_2$. Since $d_X$ and $d_Y$ are metrics,
\begin{align*}
d_X(x_1,x_2)=0
\end{align*}
and
\begin{align*}
d_Y(y_1,y_2)=0.
\end{align*}
Therefore $d_p((x_1,y_1),(x_2,y_2))=0$ for every $1\le p\le\infty$.
Conversely, first suppose $1\le p<\infty$ and
\begin{align*}
d_p((x_1,y_1),(x_2,y_2))=0.
\end{align*}
By the finite-$p$ formula,
\begin{align*}
d_X(x_1,x_2)^p+d_Y(y_1,y_2)^p=0.
\end{align*}
Both summands are nonnegative, so each summand is zero:
\begin{align*}
d_X(x_1,x_2)^p=0
\end{align*}
and
\begin{align*}
d_Y(y_1,y_2)^p=0.
\end{align*}
Since $t^p=0$ forces $t=0$ for every $t\ge 0$, this implies
\begin{align*}
d_X(x_1,x_2)=0
\end{align*}
and
\begin{align*}
d_Y(y_1,y_2)=0.
\end{align*}
The identity of indiscernibles for $d_X$ and $d_Y$ gives $x_1=x_2$ and $y_1=y_2$, hence $(x_1,y_1)=(x_2,y_2)$.
Now suppose $p=\infty$ and
\begin{align*}
d_\infty((x_1,y_1),(x_2,y_2))=0.
\end{align*}
Then
\begin{align*}
\max\{d_X(x_1,x_2),d_Y(y_1,y_2)\}=0.
\end{align*}
Since both entries in the maximum are nonnegative, both are zero. Thus $d_X(x_1,x_2)=0$ and $d_Y(y_1,y_2)=0$, so $x_1=x_2$ and $y_1=y_2$. Hence $(x_1,y_1)=(x_2,y_2)$.
[/step]
[step:Prove the two-coordinate Minkowski estimate for finite $p$]
We prove the following estimate: for every $1\le p<\infty$ and every $a,b,c,d\in[0,\infty)$,
\begin{align*}
\left((a+c)^p+(b+d)^p\right)^{1/p}\le \left(a^p+b^p\right)^{1/p}+\left(c^p+d^p\right)^{1/p}.
\end{align*}
For $p=1$, the estimate is an equality:
\begin{align*}
(a+c)+(b+d)=(a+b)+(c+d).
\end{align*}
Assume now that $1<p<\infty$. Define $q\in(1,\infty)$ by
\begin{align*}
q=\frac{p}{p-1}.
\end{align*}
Then
\begin{align*}
\frac{1}{p}+\frac{1}{q}=1.
\end{align*}
We first use the two-term Hölder inequality: for all $r,s,u,v\in[0,\infty)$,
\begin{align*}
ru+sv\le \left(r^p+s^p\right)^{1/p}\left(u^q+v^q\right)^{1/q}.
\end{align*}
For completeness, this follows from [Young's inequality](/theorems/244) $AB\le A^p/p+B^q/q$ for $A,B\ge 0$, which is obtained by minimizing the function $\phi_A:(0,\infty)\to\mathbb R$ given by $\phi_A(B)=A^p/p+B^q/q-AB$ and using $\phi_A'(B)=B^{q-1}-A$. Normalizing by
\begin{align*}
R=\left(r^p+s^p\right)^{1/p}
\end{align*}
and
\begin{align*}
U=\left(u^q+v^q\right)^{1/q}
\end{align*}
when $R,U>0$, applying Young's inequality to the pairs $(r/R,u/U)$ and $(s/R,v/U)$ and adding the results gives the displayed Hölder inequality; if $R=0$ or $U=0$, both sides of the Hölder inequality vanish.
Define
\begin{align*}
S=\left((a+c)^p+(b+d)^p\right)^{1/p}.
\end{align*}
If $S=0$, the desired estimate follows because the right-hand side is nonnegative. Assume $S>0$. Splitting off one factor from each summand of $S^p=(a+c)^p+(b+d)^p$ gives
\begin{align*}
S^p=a(a+c)^{p-1}+c(a+c)^{p-1}+b(b+d)^{p-1}+d(b+d)^{p-1}.
\end{align*}
Grouping the terms with coefficients $a,b$ and the terms with coefficients $c,d$, then applying the two-term Hölder inequality twice, gives
\begin{align*}
S^p\le \left(a^p+b^p\right)^{1/p}\left((a+c)^{(p-1)q}+(b+d)^{(p-1)q}\right)^{1/q}+\left(c^p+d^p\right)^{1/p}\left((a+c)^{(p-1)q}+(b+d)^{(p-1)q}\right)^{1/q}.
\end{align*}
Since $(p-1)q=p$, the common second factor is
\begin{align*}
\left((a+c)^p+(b+d)^p\right)^{1/q}=S^{p/q}=S^{p-1}.
\end{align*}
Thus
\begin{align*}
S^p\le \left(\left(a^p+b^p\right)^{1/p}+\left(c^p+d^p\right)^{1/p}\right)S^{p-1}.
\end{align*}
Dividing by $S^{p-1}>0$ proves the two-coordinate Minkowski estimate.
[guided]
We need an inequality that says the $p$-length of a sum of two nonnegative coordinate vectors is at most the sum of their $p$-lengths. Written in coordinates, this is exactly the estimate
\begin{align*}
\left((a+c)^p+(b+d)^p\right)^{1/p}\le \left(a^p+b^p\right)^{1/p}+\left(c^p+d^p\right)^{1/p}.
\end{align*}
The case $p=1$ contains no hidden inequality: both sides reduce to the same sum,
\begin{align*}
(a+c)+(b+d)=(a+b)+(c+d).
\end{align*}
Now assume $1<p<\infty$. Define the conjugate exponent $q\in(1,\infty)$ by
\begin{align*}
q=\frac{p}{p-1}.
\end{align*}
Then
\begin{align*}
\frac{1}{p}+\frac{1}{q}=1.
\end{align*}
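For orientation, a few sample values of the conjugate exponent (chosen here only as illustrations) are
\begin{align*}
p=2\implies q=2,\qquad p=3\implies q=\tfrac{3}{2},\qquad p=\tfrac{3}{2}\implies q=3.
\end{align*}
In each case one can check directly that $(p-1)q=p$, an identity used later in the argument.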
The estimate rests on the two-term Hölder inequality. We record it in the exact form needed here: for $r,s,u,v\in[0,\infty)$,
\begin{align*}
ru+sv\le \left(r^p+s^p\right)^{1/p}\left(u^q+v^q\right)^{1/q}.
\end{align*}
Here is the verification. Young's inequality says that for $A,B\ge 0$,
\begin{align*}
AB\le \frac{A^p}{p}+\frac{B^q}{q}.
\end{align*}
To justify it, fix $A\ge 0$ and consider the function $\phi_A:(0,\infty)\to\mathbb R$ defined by
\begin{align*}
\phi_A(B)=\frac{A^p}{p}+\frac{B^q}{q}-AB.
\end{align*}
Its derivative is
\begin{align*}
\phi_A'(B)=B^{q-1}-A.
\end{align*}
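If $A>0$, then $\phi_A'$ is negative on $(0,A^{1/(q-1)})$ and positive on $(A^{1/(q-1)},\infty)$, so the minimum occurs at $B=A^{1/(q-1)}$, equivalently at $B^q=A^p$, and the minimum value is $0$ because $1/p+1/q=1$. If $A=0$, then $\phi_0(B)=B^q/q\ge 0$ directly. The endpoint case $B=0$ is immediate, so Young's inequality holds for all $A,B\ge 0$.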
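As a sanity check in the simplest case (an illustration, assuming $p=q=2$), Young's inequality reads
\begin{align*}
AB\le \frac{A^2}{2}+\frac{B^2}{2},
\end{align*}
which is equivalent to $(A-B)^2\ge 0$. Equality holds exactly when $A=B$, matching the minimizer condition $B^q=A^p$.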
To derive Hölder, define
\begin{align*}
R=\left(r^p+s^p\right)^{1/p}
\end{align*}
and
\begin{align*}
U=\left(u^q+v^q\right)^{1/q}.
\end{align*}
If $R=0$ or $U=0$, then the products $ru$ and $sv$ are both zero, and the Hölder inequality follows. If $R>0$ and $U>0$, apply Young's inequality to $A=r/R$, $B=u/U$ and again to $A=s/R$, $B=v/U$. Adding the two inequalities gives
\begin{align*}
\frac{ru+sv}{RU}\le \frac{r^p+s^p}{pR^p}+\frac{u^q+v^q}{qU^q}.
\end{align*}
By the definitions of $R$ and $U$, the right-hand side is
\begin{align*}
\frac{1}{p}+\frac{1}{q}=1.
\end{align*}
Multiplying by $RU$ gives the two-term Hölder inequality.
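For instance, with values chosen only for illustration, take $p=q=2$ and $(r,s,u,v)=(3,4,4,3)$. Then
\begin{align*}
ru+sv=12+12=24\le 25=\left(3^2+4^2\right)^{1/2}\left(4^2+3^2\right)^{1/2},
\end{align*}
which is the familiar Cauchy-Schwarz inequality in two coordinates.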
Now define
\begin{align*}
S=\left((a+c)^p+(b+d)^p\right)^{1/p}.
\end{align*}
If $S=0$, then the left-hand side of the desired inequality is zero, while the right-hand side is nonnegative. Hence the estimate holds in that case. Assume $S>0$.
The useful move is to write $S^p$ with one copy of each coordinate separated:
\begin{align*}
S^p=(a+c)(a+c)^{p-1}+(b+d)(b+d)^{p-1}.
\end{align*}
Expanding the two sums gives
\begin{align*}
S^p=a(a+c)^{p-1}+c(a+c)^{p-1}+b(b+d)^{p-1}+d(b+d)^{p-1}.
\end{align*}
Group the terms involving $a,b$ together and the terms involving $c,d$ together:
\begin{align*}
S^p=\left(a(a+c)^{p-1}+b(b+d)^{p-1}\right)+\left(c(a+c)^{p-1}+d(b+d)^{p-1}\right).
\end{align*}
Apply the two-term Hölder inequality to the first parenthesis with
\begin{align*}
(r,s,u,v)=(a,b,(a+c)^{p-1},(b+d)^{p-1})
\end{align*}
and to the second parenthesis with
\begin{align*}
(r,s,u,v)=(c,d,(a+c)^{p-1},(b+d)^{p-1}).
\end{align*}
This yields
\begin{align*}
S^p\le \left(a^p+b^p\right)^{1/p}\left((a+c)^{(p-1)q}+(b+d)^{(p-1)q}\right)^{1/q}+\left(c^p+d^p\right)^{1/p}\left((a+c)^{(p-1)q}+(b+d)^{(p-1)q}\right)^{1/q}.
\end{align*}
Because $q=p/(p-1)$, we have $(p-1)q=p$. Therefore the shared second factor is
\begin{align*}
\left((a+c)^p+(b+d)^p\right)^{1/q}=S^{p/q}=S^{p-1}.
\end{align*}
Substituting this into the previous inequality gives
\begin{align*}
S^p\le \left(\left(a^p+b^p\right)^{1/p}+\left(c^p+d^p\right)^{1/p}\right)S^{p-1}.
\end{align*}
Since $S>0$, division by $S^{p-1}$ is valid and gives
\begin{align*}
S\le \left(a^p+b^p\right)^{1/p}+\left(c^p+d^p\right)^{1/p}.
\end{align*}
This is the desired two-coordinate Minkowski estimate.
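A quick numerical check, with values picked only for illustration: take $p=2$ and $(a,b,c,d)=(1,2,2,1)$. Then
\begin{align*}
\left(3^2+3^2\right)^{1/2}=\sqrt{18}\le \sqrt{20}=2\sqrt{5}=\left(1^2+2^2\right)^{1/2}+\left(2^2+1^2\right)^{1/2}.
\end{align*}
The inequality is strict here because the vectors $(a,b)$ and $(c,d)$ are not proportional.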
[/guided]
[/step]
[step:Apply the finite Minkowski estimate to prove the finite $p$ triangle inequality]
Assume $1\le p<\infty$. Let $(x_1,y_1),(x_2,y_2),(x_3,y_3)\in X\times Y$. Define four nonnegative real numbers
\begin{align*}
a&=d_X(x_1,x_2), & b&=d_Y(y_1,y_2),\\
c&=d_X(x_2,x_3), & d&=d_Y(y_2,y_3).
\end{align*}
The triangle inequalities in $(X,d_X)$ and $(Y,d_Y)$ give
\begin{align*}
d_X(x_1,x_3)\le a+c
\end{align*}
and
\begin{align*}
d_Y(y_1,y_3)\le b+d.
\end{align*}
Since the functions $t\mapsto t^p$ and $t\mapsto t^{1/p}$ are increasing on $[0,\infty)$, the finite-$p$ formula gives
\begin{align*}
d_p((x_1,y_1),(x_3,y_3))\le \left((a+c)^p+(b+d)^p\right)^{1/p}.
\end{align*}
By the two-coordinate Minkowski estimate proved above,
\begin{align*}
\left((a+c)^p+(b+d)^p\right)^{1/p}\le \left(a^p+b^p\right)^{1/p}+\left(c^p+d^p\right)^{1/p}.
\end{align*}
Combining the last two inequalities and using the definitions of $a,b,c,d$, we obtain
\begin{align*}
d_p((x_1,y_1),(x_3,y_3))\le d_p((x_1,y_1),(x_2,y_2))+d_p((x_2,y_2),(x_3,y_3)).
\end{align*}
Thus $d_p$ satisfies the triangle inequality for every $1\le p<\infty$.
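To see the chain of inequalities in a concrete case, assume $X=Y=\mathbb R$ with the usual metric $|s-t|$ and $p=2$ (illustrative choices, not part of the theorem), and take the points $(0,0)$, $(3,4)$, $(1,7)$. Then $a=3$, $b=4$, $c=2$, $d=3$, and
\begin{align*}
d_2((0,0),(1,7))=\sqrt{50}\le \sqrt{74}=\left((a+c)^2+(b+d)^2\right)^{1/2}\le 5+\sqrt{13}=d_2((0,0),(3,4))+d_2((3,4),(1,7)).
\end{align*}
The last inequality can be checked by squaring: $(5+\sqrt{13})^2=38+10\sqrt{13}$, and $10\sqrt{13}\ge 36$ because $36^2=1296\le 1300$.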
[/step]
[step:Use the maximum inequality to prove the $p=\infty$ triangle inequality]
Let $(x_1,y_1),(x_2,y_2),(x_3,y_3)\in X\times Y$. Define
\begin{align*}
a&=d_X(x_1,x_2), & b&=d_Y(y_1,y_2),\\
c&=d_X(x_2,x_3), & d&=d_Y(y_2,y_3).
\end{align*}
The coordinate triangle inequalities give
\begin{align*}
d_X(x_1,x_3)\le a+c
\end{align*}
and
\begin{align*}
d_Y(y_1,y_3)\le b+d.
\end{align*}
Since the maximum is nondecreasing in each argument,
\begin{align*}
d_\infty((x_1,y_1),(x_3,y_3))\le \max\{a+c,b+d\}.
\end{align*}
Since $a\le \max\{a,b\}$, $b\le \max\{a,b\}$, $c\le \max\{c,d\}$, and $d\le \max\{c,d\}$, we have
\begin{align*}
a+c\le \max\{a,b\}+\max\{c,d\}
\end{align*}
and
\begin{align*}
b+d\le \max\{a,b\}+\max\{c,d\}.
\end{align*}
Taking the maximum of the two left-hand sides gives
\begin{align*}
\max\{a+c,b+d\}\le \max\{a,b\}+\max\{c,d\}.
\end{align*}
Substituting the definitions of $a,b,c,d$ yields
\begin{align*}
d_\infty((x_1,y_1),(x_3,y_3))\le d_\infty((x_1,y_1),(x_2,y_2))+d_\infty((x_2,y_2),(x_3,y_3)).
\end{align*}
Thus $d_\infty$ satisfies the triangle inequality.
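For a concrete instance, assume $X=Y=\mathbb R$ with the usual metric $|s-t|$ (an illustrative choice, not part of the theorem) and take the points $(0,0)$, $(3,1)$, $(4,4)$. Then $a=3$, $b=1$, $c=1$, $d=3$, and
\begin{align*}
d_\infty((0,0),(4,4))=4=\max\{a+c,b+d\}\le \max\{3,1\}+\max\{1,3\}=6.
\end{align*}
The final inequality is strict here because the two maxima are attained in different coordinates.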
[/step]
[step:Conclude that every product formula satisfies the metric axioms]
For $1\le p<\infty$, the finite-$p$ formula is nonnegative, symmetric, separates points, and satisfies the triangle inequality by the preceding steps. Hence $d_p$ is a metric on $X\times Y$.
For $p=\infty$, the maximum formula is nonnegative, symmetric, separates points, and satisfies the triangle inequality by the preceding steps. Hence $d_\infty$ is a metric on $X\times Y$. This proves the theorem for every $1\le p\le\infty$.
[/step]