[proofplan]
We compare both sides by expanding them in the standard coordinates. The Frobenius norm of $T$ is the Euclidean norm of its standard matrix, so its square is the sum of the squares of all entries of $A$. On the other hand, the diagonal entries of $A^\top A$ are precisely the squared Euclidean lengths of the columns of $A$, and summing those diagonal entries gives the same double sum.
[/proofplan]
[step:Expand the Frobenius norm through the entries of the standard matrix]Let $e_1,\ldots,e_m$ denote the standard basis of $\mathbb{R}^m$, and let $\varepsilon_1,\ldots,\varepsilon_n$ denote the standard basis of $\mathbb{R}^n$. Since $A=(A_{ij})$ is the standard matrix of $T$, for each $j \in \{1,\ldots,m\}$ the $j$-th column of $A$ is the coordinate vector of $T(e_j)$ in the basis $\varepsilon_1,\ldots,\varepsilon_n$; equivalently,
\begin{align*}
T(e_j)=\sum_{i=1}^n A_{ij}\varepsilon_i.
\end{align*}
By the definition of the Frobenius norm of a [linear map](/page/Linear%20Map) as the Euclidean norm of its standard matrix,
\begin{align*}
\|T\|_F^2=\sum_{j=1}^m\sum_{i=1}^n A_{ij}^2.
\end{align*}[/step]
[guided]We first translate the linear map $T$ into coordinates, because the theorem compares a norm of $T$ with an expression built from the matrix $A$. Let $e_1,\ldots,e_m$ be the standard basis of $\mathbb{R}^m$, and let $\varepsilon_1,\ldots,\varepsilon_n$ be the standard basis of $\mathbb{R}^n$. The phrase "$A$ is the standard matrix of $T$" means exactly that the $j$-th column of $A$ records the standard coordinates of $T(e_j)$. Thus, for every $j \in \{1,\ldots,m\}$,
\begin{align*}
T(e_j)=\sum_{i=1}^n A_{ij}\varepsilon_i.
\end{align*}
The Frobenius norm of $T$ is defined by taking the Euclidean norm of this standard matrix. Therefore its square is the sum of the squares of all matrix entries:
\begin{align*}
\|T\|_F^2=\sum_{j=1}^m\sum_{i=1}^n A_{ij}^2.
\end{align*}
This is the quantity we must recover from the trace of $A^\top A$.[/guided]
[step:Compute the diagonal entries of $A^\top A$]
For each $j \in \{1,\ldots,m\}$, the definition of matrix multiplication gives
\begin{align*}
(A^\top A)_{jj}=\sum_{i=1}^n (A^\top)_{ji}A_{ij}.
\end{align*}
By the definition of transpose, $(A^\top)_{ji}=A_{ij}$ for every $i \in \{1,\ldots,n\}$. Hence
\begin{align*}
(A^\top A)_{jj}=\sum_{i=1}^n A_{ij}^2.
\end{align*}
[/step]
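[guided]A small worked example may make this computation concrete. The matrix below is a hypothetical illustration with $n=3$ and $m=2$; it is not taken from the statement. Take
\begin{align*}
A=\begin{pmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{pmatrix}.
\end{align*}
Its columns are $(1,3,5)^\top$ and $(2,4,6)^\top$, and the formula for the diagonal entries gives
\begin{align*}
(A^\top A)_{11}&=1^2+3^2+5^2=35,\\
(A^\top A)_{22}&=2^2+4^2+6^2=56.
\end{align*}
Each diagonal entry is the squared Euclidean length of the corresponding column of $A$, as the general computation predicts.[/guided]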
[step:Sum the diagonal entries to identify the trace]
Since $A$ is an $n \times m$ matrix, the product $A^\top A$ is an $m \times m$ matrix. By the definition of trace,
\begin{align*}
\operatorname{tr}(A^\top A)=\sum_{j=1}^m (A^\top A)_{jj}.
\end{align*}
Substituting the diagonal-entry computation from the previous step gives
\begin{align*}
\operatorname{tr}(A^\top A)=\sum_{j=1}^m\sum_{i=1}^n A_{ij}^2.
\end{align*}
The right-hand side is exactly the expansion of $\|T\|_F^2$ obtained above. Therefore
\begin{align*}
\|T\|_F^2=\operatorname{tr}(A^\top A).
\end{align*}
This proves the claimed trace formula.
[/step]
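[guided]As a sanity check, the identity can be verified by hand on a small case. The matrix below is a hypothetical example with $n=2$ and $m=3$, chosen only for illustration. Let
\begin{align*}
A=\begin{pmatrix} 1 & 0 & 2 \\ -1 & 3 & 1 \end{pmatrix},
\qquad
A^\top A=\begin{pmatrix} 2 & -3 & 1 \\ -3 & 9 & 3 \\ 1 & 3 & 5 \end{pmatrix}.
\end{align*}
Summing the squares of all entries of $A$ gives
\begin{align*}
1^2+0^2+2^2+(-1)^2+3^2+1^2=16,
\end{align*}
while summing the diagonal entries of $A^\top A$ gives
\begin{align*}
\operatorname{tr}(A^\top A)=2+9+5=16.
\end{align*}
The two values agree. Note that $A^\top A$ is a $3 \times 3$ matrix here, matching $m=3$, and that its off-diagonal entries play no role in the trace.[/guided]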