[proofplan]
The strategy is to reduce the SVD of the rectangular matrix $A \in \mathbb{R}^{n \times m}$ to the Spectral Theorem for Real Symmetric Matrices applied to a positive semidefinite Gram matrix. Form $G := A^\top A \in \mathbb{R}^{m \times m}$, which is symmetric and positive semidefinite. Diagonalise $G = Q D Q^\top$ via the spectral theorem, with $Q$ orthogonal and $D = \operatorname{diag}(\lambda_1, \dots, \lambda_m)$ with $\lambda_i \ge 0$ ordered decreasingly. The columns of $A Q$ are then orthogonal vectors in $\mathbb{R}^n$ with norms $\sigma_i := \sqrt{\lambda_i}$. Normalising the non-zero columns yields the first $r := \operatorname{rank}(A)$ columns of $P$, and extending to an orthonormal basis of $\mathbb{R}^n$ (using the hypothesis $m \le n$) by Gram–Schmidt orthonormalisation produces an $n \times n$ orthogonal $P$. Reading off the resulting identity gives $A = P \Sigma Q^\top$ with $\Sigma$ "diagonal" in the rectangular sense. Uniqueness of the singular values follows from their identification as the square roots of the eigenvalues of $G$.
[/proofplan]
[step:Set up notation and reduce to the spectral theorem via the Gram matrix]
The theorem concerns a linear map $L: \mathbb{R}^m \to \mathbb{R}^n$ with $m \le n$. Let $A := JL \in \mathbb{R}^{n \times m}$ denote the matrix of $L$ in the standard bases of $\mathbb{R}^m$ and $\mathbb{R}^n$ (the Jacobian matrix of $L$, which for a linear map coincides with its matrix in standard bases and is independent of the base point). Throughout the proof we work with $A$ as an $n \times m$ matrix; the action of $L$ on a vector $x \in \mathbb{R}^m$ is the matrix–vector product $A x \in \mathbb{R}^n$.
Define the Gram matrix
\begin{align*}
G := A^\top A \in \mathbb{R}^{m \times m}.
\end{align*}
We verify two structural properties of $G$.
(i) **$G$ is symmetric**: $G^\top = (A^\top A)^\top = A^\top (A^\top)^\top = A^\top A = G$.
(ii) **$G$ is positive semidefinite**: for every $x \in \mathbb{R}^m$,
\begin{align*}
x^\top G x = x^\top A^\top A x = (A x)^\top (A x) = |A x|^2 \ge 0.
\end{align*}
By the Spectral Theorem for Real Symmetric Matrices (a real symmetric matrix is orthogonally diagonalisable), there exists an orthogonal matrix $Q \in \mathbb{R}^{m \times m}$ (so $Q^\top Q = Q Q^\top = I_m$) and a diagonal matrix $D = \operatorname{diag}(\lambda_1, \dots, \lambda_m)$ with real entries such that
\begin{align*}
G = Q D Q^\top.
\end{align*}
Write $Q = [q_1\ q_2\ \cdots\ q_m]$ with $q_i \in \mathbb{R}^m$ the $i$-th column of $Q$, so that $\{q_1, \dots, q_m\}$ is orthonormal and $G q_i = \lambda_i q_i$ (each $q_i$ is a unit eigenvector of $G$ with eigenvalue $\lambda_i$).
The eigenvalues $\lambda_i$ are non-negative: applying the positive semidefiniteness of $G$ to the unit vector $q_i$,
\begin{align*}
\lambda_i = \lambda_i |q_i|^2 = q_i^\top (\lambda_i q_i) = q_i^\top G q_i = |A q_i|^2 \ge 0.
\end{align*}
Order the eigenvalues so that $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_m \ge 0$ (and reorder the columns of $Q$ accordingly), and set
\begin{align*}
\sigma_i := \sqrt{\lambda_i} \ge 0, \qquad \sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_m \ge 0.
\end{align*}
[/step]
[step:Construct the orthogonal matrix $P$ from the orthonormal image vectors]
For each $i = 1, \dots, m$, set $u_i := A q_i \in \mathbb{R}^n$. We compute the inner products: for $i, j \in \{1, \dots, m\}$,
\begin{align*}
u_i \cdot u_j = (A q_i)^\top (A q_j) = q_i^\top A^\top A q_j = q_i^\top G q_j = q_i^\top \lambda_j q_j = \lambda_j (q_i \cdot q_j) = \lambda_j \delta_{ij},
\end{align*}
where $\delta_{ij}$ is the Kronecker delta and we used $G q_j = \lambda_j q_j$ together with the orthonormality $q_i \cdot q_j = \delta_{ij}$. In particular,
\begin{align*}
|u_i|^2 = \lambda_i = \sigma_i^2, \qquad u_i \cdot u_j = 0 \text{ for } i \ne j.
\end{align*}
Let $r := \#\{i : \sigma_i > 0\}$, the number of strictly positive singular values. Since the $\sigma_i$ are non-increasing, $\sigma_1, \dots, \sigma_r > 0$ and $\sigma_{r+1} = \cdots = \sigma_m = 0$. For $i = 1, \dots, r$, define
\begin{align*}
p_i := \frac{u_i}{\sigma_i} \in \mathbb{R}^n,
\end{align*}
so $|p_i|^2 = |u_i|^2 / \sigma_i^2 = \sigma_i^2 / \sigma_i^2 = 1$. The orthogonality computation above gives $p_i \cdot p_j = (u_i \cdot u_j)/(\sigma_i \sigma_j) = \delta_{ij}$ for $i, j \in \{1, \dots, r\}$. Hence $\{p_1, \dots, p_r\}$ is an orthonormal set in $\mathbb{R}^n$.
We now invoke the hypothesis of the theorem that $m \le n$. Since $r \le m$ (as $r$ counts a subset of indices $\{1, \dots, m\}$) and $m \le n$ by hypothesis, we have $r \le n$. Therefore the orthonormal set $\{p_1, \dots, p_r\} \subset \mathbb{R}^n$ has at most $n$ elements and can be extended: choose any vectors $w_{r+1}, \dots, w_n \in \mathbb{R}^n$ such that $\{p_1, \dots, p_r, w_{r+1}, \dots, w_n\}$ is a basis of $\mathbb{R}^n$ (possible since $r \le n$), and apply Gram–Schmidt orthonormalisation to this basis, leaving the already-orthonormal vectors $p_1, \dots, p_r$ unchanged and producing orthonormal vectors $p_{r+1}, \dots, p_n \in \mathbb{R}^n$. The result is an orthonormal basis $\{p_1, \dots, p_n\}$ of $\mathbb{R}^n$. Define
\begin{align*}
P := [p_1\ p_2\ \cdots\ p_n] \in \mathbb{R}^{n \times n}.
\end{align*}
Then $P$ is orthogonal: $(P^\top P)_{ij} = p_i \cdot p_j = \delta_{ij}$, so $P^\top P = I_n$.
[/step]
[step:Define the rectangular diagonal matrix $\Sigma$ and verify the SVD identity]
Define
\begin{align*}
\Sigma \in \mathbb{R}^{n \times m}, \qquad \Sigma_{ij} := \begin{cases} \sigma_i & \text{if } i = j \in \{1, \dots, m\}, \\ 0 & \text{otherwise.} \end{cases}
\end{align*}
That is, $\Sigma$ has $\sigma_1, \dots, \sigma_m$ on its main diagonal and zeros elsewhere; the bottom $n - m$ rows are entirely zero (this is well-defined precisely because $m \le n$).
We verify $A = P \Sigma Q^\top$ by checking equality of the action on the orthonormal basis $\{q_1, \dots, q_m\}$ of $\mathbb{R}^m$.
For $i \in \{1, \dots, r\}$:
\begin{align*}
(P \Sigma Q^\top) q_i = P \Sigma (Q^\top q_i) = P \Sigma e_i^{(m)} = P (\sigma_i e_i^{(n)}) = \sigma_i p_i = \sigma_i \cdot \frac{u_i}{\sigma_i} = u_i = A q_i.
\end{align*}
Here $Q^\top q_i = e_i^{(m)}$ (the $i$-th standard basis vector of $\mathbb{R}^m$) because $Q$ is orthogonal and $q_i$ is the $i$-th column of $Q$; $\Sigma e_i^{(m)} = \sigma_i e_i^{(n)}$ since the $i$-th column of $\Sigma$ has $\sigma_i$ in slot $i$ and zeros elsewhere (valid for $i \le m \le n$); and $P e_i^{(n)} = p_i$ since $p_i$ is the $i$-th column of $P$.
For $i \in \{r+1, \dots, m\}$:
\begin{align*}
(P \Sigma Q^\top) q_i = P \Sigma e_i^{(m)} = P (\sigma_i e_i^{(n)}) = P \cdot 0 = 0,
\end{align*}
since $\sigma_i = 0$ for $i > r$. On the other hand,
\begin{align*}
A q_i = u_i, \qquad |u_i|^2 = \sigma_i^2 = 0 \implies u_i = 0,
\end{align*}
so $A q_i = 0$ as well.
Hence $(P \Sigma Q^\top) q_i = A q_i$ for every $i \in \{1, \dots, m\}$. Since $\{q_1, \dots, q_m\}$ is a basis of $\mathbb{R}^m$, the matrices $P \Sigma Q^\top$ and $A$ agree on all of $\mathbb{R}^m$, hence as $n \times m$ matrices,
\begin{align*}
A = P \Sigma Q^\top.
\end{align*}
[/step]
[step:Establish uniqueness of the singular values]
Suppose $A = P_1 \Sigma_1 Q_1^\top = P_2 \Sigma_2 Q_2^\top$ are two SVDs, with $\Sigma_1, \Sigma_2$ both rectangular diagonal in $\mathbb{R}^{n \times m}$ with non-increasing non-negative diagonals $\sigma_1^{(1)} \ge \cdots \ge \sigma_m^{(1)}$ and $\sigma_1^{(2)} \ge \cdots \ge \sigma_m^{(2)}$, and $P_1, P_2 \in \mathbb{R}^{n \times n}$, $Q_1, Q_2 \in \mathbb{R}^{m \times m}$ orthogonal.
Compute the Gram matrix using each decomposition: for $k \in \{1, 2\}$,
\begin{align*}
G = A^\top A = (P_k \Sigma_k Q_k^\top)^\top (P_k \Sigma_k Q_k^\top) = Q_k \Sigma_k^\top P_k^\top P_k \Sigma_k Q_k^\top = Q_k (\Sigma_k^\top \Sigma_k) Q_k^\top,
\end{align*}
using $P_k^\top P_k = I_n$. The matrix $\Sigma_k^\top \Sigma_k \in \mathbb{R}^{m \times m}$ is diagonal with entries $(\sigma_j^{(k)})^2$ in the $j$-th slot, so
\begin{align*}
G = Q_1 \operatorname{diag}\bigl((\sigma_j^{(1)})^2\bigr) Q_1^\top = Q_2 \operatorname{diag}\bigl((\sigma_j^{(2)})^2\bigr) Q_2^\top.
\end{align*}
Both right-hand sides are orthogonal diagonalisations of $G$ with eigenvalues listed in non-increasing order. The multiset of eigenvalues of a matrix is uniquely determined as the multiset of roots (with multiplicity) of the characteristic polynomial $\det(\lambda I - G)$, which depends only on $G$. Hence
\begin{align*}
(\sigma_j^{(1)})^2 = (\sigma_j^{(2)})^2 \quad \text{for } j = 1, \dots, m.
\end{align*}
Since $\sigma_j^{(k)} \ge 0$, taking square roots gives $\sigma_j^{(1)} = \sigma_j^{(2)}$ for all $j$. So the singular values are uniquely determined by $A$, namely $\sigma_j = \sqrt{\lambda_j(G)}$ where $\lambda_j(G)$ is the $j$-th largest eigenvalue (with multiplicity) of $G = A^\top A$.
(The matrices $P$ and $Q$ are *not* unique when singular values repeat: in an eigenspace of $G$ of dimension $> 1$, any orthonormal basis can be chosen as the corresponding columns of $Q$, with the corresponding columns of $P$ then determined, or, in the case of zero singular values, free.)
[/step]