[proofplan]
We diagonalize the positive [compact operator](/page/Compact%20Operator) $|T|=(T^*T)^{1/2}$ and take its positive eigenvalues as the singular values. The corresponding eigenvectors give the right singular vectors $v_j$, and the left singular vectors are defined by normalizing $Tv_j$. The identities for $T$ and $T^*$ then follow from $T^*T=|T|^2$, while the expansion follows by expanding the component of $x$ in $(\ker T)^\perp$ in the orthonormal eigenbasis of $|T|$. Finally, the norm of the truncation error is the largest omitted singular value, which tends to $0$ in the infinite case.
[/proofplan]
[step:Diagonalize the compact positive operator $|T|$]
Let $A:=T^*T\in\mathcal{L}(H)$. Here $\mathcal{L}(H)$ denotes the bounded linear maps $H\to H$, and $\mathcal{L}(K,H)$ denotes the bounded linear maps $K\to H$. Since $T$ is compact and $T^*\in\mathcal{L}(K,H)$ is bounded, $A=T^*T$ is compact. For every $x\in H$, \begin{align*}(Ax,x)_H=(Tx,Tx)_K\ge 0.\end{align*} Thus $A$ is non-negative. Also $A$ is self-adjoint because, by the basic adjoint identity for products,
\begin{align*}
A^*=(T^*T)^*=T^*T=A.
\end{align*}
Let $|T|:=A^{1/2}$ be the unique non-negative square root of $A$; here we use the existence and basic properties of the positive square root for bounded non-negative [self-adjoint operators](/page/Self-Adjoint%20Operators) (citing a result not yet in the wiki: Existence and basic properties of $|T|=(T^*T)^{1/2}$). Since $A$ is compact and non-negative, $|T|$ is compact by the compactness part of the same square-root result.
We now apply the spectral theorem for compact positive self-adjoint operators to $|T|$ (citing a result not yet in the wiki: Spectral theorem for compact positive self-adjoint operators). It gives an index set $J$, where $J=\varnothing$, $J=\{1,\dots,m\}$ for some $m\in\mathbb N$, or $J=\mathbb N$, positive eigenvalues $(s_j)_{j\in J}$ repeated according to multiplicity and ordered non-increasingly, and an orthonormal family $(v_j)_{j\in J}$ in $H$ such that
\begin{align*}
|T|v_j=s_jv_j
\end{align*}
for every $j\in J$, and
\begin{align*}
\overline{\operatorname{span}}\{v_j:j\in J\}=(\ker |T|)^\perp.
\end{align*}
If $J=\mathbb N$, the same spectral theorem gives
\begin{align*}
s_j\to 0.
\end{align*}
It remains to identify $\ker |T|$ with $\ker T$. For $x\in H$, \begin{align*}\||T|x\|_H^2=(|T|^2x,x)_H=(T^*Tx,x)_H=\|Tx\|_K^2.\end{align*} Therefore $|T|x=0$ if and only if $Tx=0$, and hence
\begin{align*}
\ker |T|=\ker T.
\end{align*}
Thus
\begin{align*}
\overline{\operatorname{span}}\{v_j:j\in J\}=(\ker T)^\perp.
\end{align*}
[guided]
The operator to diagonalize is not $T$ itself, since $T$ maps from $H$ to $K$ and need not be normal or even act on one space. Instead we form the positive operator $A:=T^*T\in\mathcal{L}(H)$. Compactness is preserved under composition with bounded operators: $T$ is compact by hypothesis and $T^*$ is bounded, so $T^*T$ is compact. The operator $A$ is non-negative because for every $x\in H$,
\begin{align*}
(Ax,x)_H=(T^*Tx,x)_H=(Tx,Tx)_K\ge 0.
\end{align*}
It is also self-adjoint, since
\begin{align*}
A^*=(T^*T)^*=T^*T=A.
\end{align*}
Now define $|T|:=A^{1/2}$, the unique non-negative square root of $A$. This uses the standard square-root theorem for bounded non-negative self-adjoint operators, together with the fact that when $A$ is compact and non-negative, its square root is compact as well (citing a result not yet in the wiki: Existence and basic properties of $|T|=(T^*T)^{1/2}$). Thus $|T|$ is a compact positive self-adjoint operator on $H$.
We may therefore apply the compact positive spectral theorem to $|T|$ (citing a result not yet in the wiki: Spectral theorem for compact positive self-adjoint operators). The theorem gives all positive eigenvalues of $|T|$, repeated according to the dimensions of their eigenspaces. These positive eigenvalues form either no list, a finite list, or a countably infinite list. We denote them by $(s_j)_{j\in J}$ and choose corresponding orthonormal eigenvectors $(v_j)_{j\in J}$ such that
\begin{align*}
|T|v_j=s_jv_j
\end{align*}
for every $j\in J$. The spectral theorem also gives
\begin{align*}
\overline{\operatorname{span}}\{v_j:j\in J\}=(\ker |T|)^\perp.
\end{align*}
In the infinite case, compactness is exactly what forces the positive eigenvalues to tend to $0$, so
\begin{align*}
s_j\to 0.
\end{align*}
The final point in this step is to replace $\ker |T|$ by $\ker T$. This is not a convention; it follows from a norm identity. Since $|T|^2=T^*T$, for every $x\in H$ we have
\begin{align*}
\||T|x\|_H^2=(|T|x,|T|x)_H=(|T|^2x,x)_H=(T^*Tx,x)_H=(Tx,Tx)_K=\|Tx\|_K^2.
\end{align*}
Therefore $\||T|x\|_H=0$ if and only if $\|Tx\|_K=0$, which is equivalent to $|T|x=0$ if and only if $Tx=0$. Hence
\begin{align*}
\ker |T|=\ker T.
\end{align*}
Substituting this into the spectral-theorem conclusion gives
\begin{align*}
\overline{\operatorname{span}}\{v_j:j\in J\}=(\ker T)^\perp.
\end{align*}
[/guided]
[/step]
[step:Normalize the images of the right singular vectors]
If $J=\varnothing$, then $\ker T=H$ by the preceding step, so $T=0$ and the theorem follows with empty families. Assume from now on that $J\ne\varnothing$.
For each $j\in J$, define
\begin{align*}
u_j:=s_j^{-1}Tv_j\in K.
\end{align*}
Then
\begin{align*}
Tv_j=s_j u_j.
\end{align*}
For $i,j\in J$, using the adjoint identity with inner products linear in the first argument, together with $T^*T=|T|^2$ and $|T|v_i=s_iv_i$, we obtain \begin{align*}(u_i,u_j)_K=s_i^{-1}s_j^{-1}(Tv_i,Tv_j)_K=s_i^{-1}s_j^{-1}(T^*Tv_i,v_j)_H.\end{align*} Since $T^*Tv_i=|T|^2v_i=s_i^2v_i$, this gives
\begin{align*}
(u_i,u_j)_K=s_i s_j^{-1}(v_i,v_j)_H.
\end{align*}
If $i=j$, this equals $1$, and if $i\ne j$, this equals $0$. Thus $(u_j)_{j\in J}$ is an orthonormal family in $K$. Since each $u_j=s_j^{-1}Tv_j$ belongs to $\operatorname{Range}(T)$, the family lies in $\overline{\operatorname{Range}(T)}$.
[/step]
[step:Compute the adjoint action on the left singular vectors]
For every $j\in J$, the definition of $u_j$ gives
\begin{align*}
T^*u_j=s_j^{-1}T^*Tv_j=s_j^{-1}|T|^2v_j.
\end{align*}
Since $|T|v_j=s_jv_j$, we have $|T|^2v_j=s_j^2v_j$. Therefore
\begin{align*}
T^*u_j=s_jv_j.
\end{align*}
Together with $Tv_j=s_j u_j$, this proves the two displayed singular-vector identities.
[/step]
[step:Expand $Tx$ by expanding the component of $x$ in $(\ker T)^\perp$]
Let $M:=\overline{\operatorname{span}}\{v_j:j\in J\}\subset H$. From the first step, $M=(\ker T)^\perp$. For $x\in H$, let $x_0\in\ker T$ and $x_1\in M$ be the unique [orthogonal decomposition](/theorems/436) \begin{align*}x=x_0+x_1.\end{align*} By the Parseval expansion for an [orthonormal basis](/page/Orthonormal%20Basis) of $M$ (citing a result not yet in the wiki: [Parseval identity](/theorems/248) for Hilbert spaces), \begin{align*}x_1=\sum_{j\in J}(x_1,v_j)_H v_j.\end{align*} Since $x_0\perp v_j$ for every $j\in J$, we have \begin{align*}(x_1,v_j)_H=(x,v_j)_H.\end{align*} Thus
\begin{align*}
x_1=\sum_{j\in J}(x,v_j)_H v_j.
\end{align*}
Because $T$ is bounded, applying $T$ to the norm-convergent expansion of $x_1$ gives \begin{align*}Tx=Tx_0+Tx_1=\sum_{j\in J}(x,v_j)_H Tv_j.\end{align*} Since $x_0\in\ker T$ and $Tv_j=s_j u_j$, this becomes \begin{align*}Tx=\sum_{j\in J}s_j(x,v_j)_H u_j.\end{align*} When $J$ is finite this is a finite sum, and when $J=\varnothing$ it is the zero sum.
[/step]
[step:Identify the closed span of the left singular vectors with the closed range of $T$]
Let
\begin{align*}
N:=\overline{\operatorname{span}}\{u_j:j\in J\}\subset K.
\end{align*}
Since each $u_j$ lies in $\operatorname{Range}(T)$, we have
\begin{align*}
N\subset \overline{\operatorname{Range}(T)}.
\end{align*}
Conversely, let $y\in\operatorname{Range}(T)$. Then there is an element $x\in H$ such that $y=Tx$. By the expansion just proved,
\begin{align*}
y=\sum_{j\in J}s_j(x,v_j)_H u_j.
\end{align*}
The right-hand side is a norm limit of finite linear combinations of the vectors $u_j$, so $y\in N$. Hence
\begin{align*}
\operatorname{Range}(T)\subset N.
\end{align*}
Since $N$ is closed by definition, it follows that
\begin{align*}
\overline{\operatorname{Range}(T)}\subset N.
\end{align*}
Combining the two inclusions gives
\begin{align*}
\overline{\operatorname{span}}\{u_j:j\in J\}=\overline{\operatorname{Range}(T)}.
\end{align*}
[guided]
This step proves a point that cannot be skipped: the range of a compact operator need not be closed, so it is not enough to say that the vectors $u_j$ lie in the range. We define the candidate closed subspace
\begin{align*}
N:=\overline{\operatorname{span}}\{u_j:j\in J\}\subset K.
\end{align*}
Because $u_j=s_j^{-1}Tv_j$, every $u_j$ belongs to $\operatorname{Range}(T)$. Therefore every finite linear combination of the $u_j$ belongs to $\operatorname{Range}(T)$, and after taking closures we get
\begin{align*}
N\subset \overline{\operatorname{Range}(T)}.
\end{align*}
For the reverse inclusion, take an arbitrary $y\in\operatorname{Range}(T)$. By definition of the range, there exists $x\in H$ such that $y=Tx$. The singular-vector expansion already proved gives
\begin{align*}
y=Tx=\sum_{j\in J}s_j(x,v_j)_H u_j.
\end{align*}
This series is a norm-convergent limit of finite linear combinations of the $u_j$, so $y\in N$. Since $y\in\operatorname{Range}(T)$ was arbitrary, we have
\begin{align*}
\operatorname{Range}(T)\subset N.
\end{align*}
Now use that $N$ is closed by its definition as a closed linear span. Taking closures in the preceding inclusion gives
\begin{align*}
\overline{\operatorname{Range}(T)}\subset N.
\end{align*}
Together with the first inclusion, this proves
\begin{align*}
\overline{\operatorname{span}}\{u_j:j\in J\}=\overline{\operatorname{Range}(T)}.
\end{align*}
[/guided]
[/step]
[step:Compute the operator norm of the truncation error in the infinite case]
Assume $J=\mathbb N$. For each $n\in\mathbb N$, define the finite-rank operator $T_n:H\to K$ by \begin{align*}T_nx=\sum_{j=1}^n s_j(x,v_j)_H u_j.\end{align*} Its range is contained in the finite-dimensional subspace $\operatorname{span}\{u_1,\dots,u_n\}$, so $T_n$ has finite rank.
Let $x\in H$ with $\|x\|_H\le 1$. Using the expansion of $Tx$ and the orthonormality of $(u_j)_{j\in\mathbb N}$, the Parseval identity gives \begin{align*}\|(T-T_n)x\|_K^2=\sum_{j=n+1}^{\infty}s_j^2 |(x,v_j)_H|^2.\end{align*} Since the sequence $(s_j)$ is non-increasing, \begin{align*}\|(T-T_n)x\|_K^2\le s_{n+1}^2\sum_{j=n+1}^{\infty}|(x,v_j)_H|^2.\end{align*} [Bessel's inequality](/theorems/540) for the orthonormal family $(v_j)_{j\in\mathbb N}$ gives \begin{align*}\sum_{j=n+1}^{\infty}|(x,v_j)_H|^2\le \|x\|_H^2\le 1.\end{align*} Therefore \begin{align*}\|(T-T_n)x\|_K\le s_{n+1}.\end{align*} Taking the supremum over all $x\in H$ with $\|x\|_H\le 1$ yields \begin{align*}\|T-T_n\|_{\mathcal{L}(H,K)}\le s_{n+1}.\end{align*}
The reverse inequality follows by evaluating at $v_{n+1}$, which has norm $1$: \begin{align*}(T-T_n)v_{n+1}=s_{n+1}u_{n+1}.\end{align*} Thus \begin{align*}\|T-T_n\|_{\mathcal{L}(H,K)}\ge \|(T-T_n)v_{n+1}\|_K=s_{n+1}.\end{align*} Hence \begin{align*}\|T-T_n\|_{\mathcal{L}(H,K)}=s_{n+1}.\end{align*} Since $s_j\to 0$, we conclude \begin{align*}\|T-T_n\|_{\mathcal{L}(H,K)}\to 0.\end{align*}
This also confirms that the singular-value series converges in $K$ for every $x\in H$, because it is the norm limit of the finite-rank truncations $T_nx$ to $Tx$. This completes the proof.
[/step]