The derivative of a scalar-valued function has a natural size: in one dimension it is an absolute value, and in several dimensions it is the Euclidean length of the gradient. The derivative of a vector-valued function carries more data. For a map $f$ from an open subset of $\mathbb{R}^m$ to $\mathbb{R}^n$ that is differentiable at a point $a$, the total derivative $Df_a: \mathbb{R}^m \to \mathbb{R}^n$ is a [linear map](/page/Linear%20Map), and its [Jacobian matrix](/page/Jacobian%20Matrix) $Jf_a$ contains $nm$ first partial derivatives.
This creates a practical problem. If the goal is to estimate the derivative, listing all entries is unwieldy, but replacing the whole derivative by its largest stretching factor forgets how much first-order variation is spread across the different coordinate directions and components. The Euclidean norm on linear maps solves the bookkeeping problem by treating the matrix of a linear map as one Euclidean vector.
The norm has two faces: it is the square root of the sum of the squares of all matrix entries, and it is also the square root of the sum of the squared lengths of the images of the standard basis vectors. The first face is convenient for computation; the second explains why this is a norm on maps rather than only a norm on arrays.
[example: A Jacobian with Several Entries]
Let $T:\mathbb{R}^3\to\mathbb{R}^2$ have standard matrix with first row $(1,-2,0)$ and second row $(3,1,-1)$. Its Frobenius norm is the square root of the sum of the squares of these six entries, so the squared norm is
\begin{align*}
\|T\|_F^2=1^2+(-2)^2+0^2+3^2+1^2+(-1)^2.
\end{align*}
The individual squares are $1^2=1$, $(-2)^2=4$, $0^2=0$, $3^2=9$, $1^2=1$, and $(-1)^2=1$, hence
\begin{align*}
\|T\|_F^2=1+4+0+9+1+1=16.
\end{align*}
Therefore
\begin{align*}
\|T\|_F=\sqrt{16}=4.
\end{align*}
If $e_1,e_2,e_3$ are the standard basis vectors of $\mathbb{R}^3$, then the columns of the standard matrix give
\begin{align*}
T(e_1)=(1,3),\quad T(e_2)=(-2,1),\quad T(e_3)=(0,-1).
\end{align*}
Their squared Euclidean lengths are
\begin{align*}
|T(e_1)|^2&=1^2+3^2=10,\\
|T(e_2)|^2&=(-2)^2+1^2=5,\\
|T(e_3)|^2&=0^2+(-1)^2=1.
\end{align*}
Adding the three column contributions gives
\begin{align*}
|T(e_1)|^2+|T(e_2)|^2+|T(e_3)|^2=10+5+1=16.
\end{align*}
Thus the entrywise computation and the column-length computation give the same squared size for the linear map.
[/example]
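The two computations in this example can be replayed in a few lines of code. The following is a minimal sketch in plain Python; the variable names are illustrative choices, not notation from this page, and the matrix is stored as a list of rows.

```python
import math

# Standard matrix of T from the example: rows (1, -2, 0) and (3, 1, -1).
A = [[1, -2, 0],
     [3, 1, -1]]

# First computation: sum the squares of all six entries.
entrywise = sum(a * a for row in A for a in row)

# Second computation: the columns of A are T(e_1), T(e_2), T(e_3);
# sum their squared Euclidean lengths.
columns = list(zip(*A))
column_sq_lengths = [sum(a * a for a in col) for col in columns]
columnwise = sum(column_sq_lengths)

print(entrywise)             # 16
print(column_sq_lengths)     # [10, 5, 1]
print(columnwise)            # 16
print(math.sqrt(entrywise))  # 4.0
```

Both sums run over the same six squares, grouped differently, which is why they agree.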
## Definition
The object of study is the norm itself: a scalar measurement assigned to each linear map. Because the measurement is entrywise, the definition below also fixes the standard coordinate convention, so the formula is unambiguous from the start.
[definition: Euclidean Norm on Linear Maps]
Let $m,n \in \mathbb{N}$, and write $\mathcal{L}(\mathbb{R}^m,\mathbb{R}^n)$ for the set of all linear maps from $\mathbb{R}^m$ to $\mathbb{R}^n$. The Euclidean norm on linear maps, also called the Frobenius norm, is the function
\begin{align*}
\|\cdot\|_F: \mathcal{L}(\mathbb{R}^m,\mathbb{R}^n) \to [0,\infty).
\end{align*}
It is given by
\begin{align*}
\|T\|_F:=\left(\sum_{j=1}^m \sum_{i=1}^n A_{ij}^2\right)^{1/2},
\end{align*}
where $A \in \mathbb{R}^{n \times m}$ is the matrix whose $j$-th column is $T(e_j)$, and $e_1,\ldots,e_m$ is the standard basis of $\mathbb{R}^m$.
[/definition]
The subscript $F$ distinguishes this norm from the operator norm. Many undergraduate notes write $\|T\|$ when the Euclidean norm on linear maps is the only norm being discussed. Since analysis often compares several norms at once, this page keeps the subscript until the context is unambiguous.
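To see that the two norms really differ, consider the identity map on $\mathbb{R}^2$: it preserves every length, so its largest stretching factor is $1$, while its Frobenius norm is $\sqrt{2}$. The sketch below checks this numerically. The helper names `frobenius_norm` and `apply` are hypothetical, and sampling unit vectors only estimates the largest stretching factor in general; it is not a method for computing operator norms.

```python
import math

def frobenius_norm(A):
    """Square root of the sum of squared entries of A (a list of rows)."""
    return math.sqrt(sum(a * a for row in A for a in row))

def apply(A, x):
    """Matrix-vector product A x, with A a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

identity = [[1.0, 0.0],
            [0.0, 1.0]]

# Sample 360 unit vectors on the circle and record the largest |Ax|.
samples = [(math.cos(2 * math.pi * k / 360), math.sin(2 * math.pi * k / 360))
           for k in range(360)]
largest_stretch = max(math.hypot(*apply(identity, x)) for x in samples)

print(frobenius_norm(identity))  # about 1.4142
print(largest_stretch)           # about 1.0
```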
## Matrix Representation and Ambient Space
A linear map is an abstract function before it is a matrix. The definition above used the standard bases to turn the map into an array, and that convention deserves its own name because it is also the convention used for Jacobian matrices in multivariable calculus.
[definition: Standard Matrix of a Linear Map]
Let $m,n \in \mathbb{N}$, and let $T: \mathbb{R}^m \to \mathbb{R}^n$ be a linear map. The standard matrix of $T$ is the matrix $A \in \mathbb{R}^{n \times m}$ whose $j$-th column is $T(e_j)$, where $e_1,\ldots,e_m$ is the standard basis of $\mathbb{R}^m$.
[/definition]
This definition fixes the indexing convention: $A_{ij}$ is the $i$-th component of $T(e_j)$. That convention is important because the derivative of a map $f:U\subset\mathbb{R}^m\to\mathbb{R}^n$ has Jacobian entries $(Jf_a)_{ij}=\partial_{x_j}f_i(a)$. Once we want to add linear maps, scale them, and measure their size by a norm, we also need to name the ambient [vector space](/page/Vector%20Space) where those operations take place. That space keeps the source and target dimensions fixed, so all entries being compared live in the same coordinate array.
[definition: Space of Linear Maps Between Euclidean Spaces]
For $m,n \in \mathbb{N}$, the space $\mathcal{L}(\mathbb{R}^m,\mathbb{R}^n)$ is the real vector space of all linear maps $T: \mathbb{R}^m \to \mathbb{R}^n$.
[/definition]
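Passing from a linear map to its standard matrix respects the vector space operations: the standard matrix of $2S+T$ is $2A+B$ when $A$ and $B$ are the standard matrices of $S$ and $T$. The sketch below illustrates this for one pair of maps. It assumes a linear map is represented as a Python callable on lists; the function `standard_matrix` and the map `T` are hypothetical illustrations, and `S` is chosen to match the first example on this page.

```python
def standard_matrix(T, m):
    """Return the standard matrix of T as a list of rows.

    T is assumed (not checked) to be a linear callable taking a list of
    m numbers and returning a list of n numbers.
    """
    # Column j is T(e_j), where e_j has a 1 in position j and 0 elsewhere.
    columns = [T([1 if k == j else 0 for k in range(m)]) for j in range(m)]
    n = len(columns[0])
    # Transpose so that A[i][j] is the i-th component of T(e_j)
    # (indices are 0-based in the code, 1-based in the text).
    return [[columns[j][i] for j in range(m)] for i in range(n)]

def S(x):
    return [x[0] - 2 * x[1], 3 * x[0] + x[1] - x[2]]

def T(x):  # hypothetical second map
    return [x[1], x[0] + x[2]]

def combo(x):
    """The map 2S + T, formed pointwise."""
    return [2 * s + t for s, t in zip(S(x), T(x))]

A = standard_matrix(S, 3)
B = standard_matrix(T, 3)
C = standard_matrix(combo, 3)
entrywise = [[2 * a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

print(A)               # [[1, -2, 0], [3, 1, -1]]
print(C == entrywise)  # True
```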
The formula for $\|T\|_F$ is modeled on the Euclidean norm of a vector, so it should satisfy the familiar norm laws. This matters because estimates such as $\|S+T\|_F\le \|S\|_F+\|T\|_F$ are used constantly when linear maps arise as derivatives or approximations. The next result records that the formula genuinely defines a norm on the space just introduced.
[quotetheorem:9172]
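A numerical spot check is no substitute for the proof, but it can make the norm laws concrete. The following sketch tests the triangle inequality and homogeneity on one pair of standard matrices. The helper names and the second matrix `T` are hypothetical choices made for illustration.

```python
import math

def frobenius_norm(A):
    """Square root of the sum of squared entries of A (a list of rows)."""
    return math.sqrt(sum(a * a for row in A for a in row))

def add(A, B):
    """Entrywise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    """Multiply every entry of A by the scalar c."""
    return [[c * a for a in row] for row in A]

S = [[1.0, -2.0, 0.0], [3.0, 1.0, -1.0]]  # Frobenius norm 4
T = [[0.0, 2.0, 0.0], [-3.0, 0.0, 1.0]]   # hypothetical second matrix

# Triangle inequality: ||S + T||_F <= ||S||_F + ||T||_F.
lhs = frobenius_norm(add(S, T))
rhs = frobenius_norm(S) + frobenius_norm(T)
print(lhs <= rhs)  # True

# Homogeneity: ||cS||_F = |c| ||S||_F, here with c = -3.
print(frobenius_norm(scale(-3.0, S)))  # 12.0
```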
The theorem lets us use the language of normed spaces for finite-dimensional spaces of linear maps. A sequence of linear maps $T_k$ converges to $T$ in Frobenius norm precisely when every entry of the standard matrix of $T_k$ converges to the corresponding entry of the standard matrix of $T$, since $\|T_k-T\|_F^2$ is the sum of the squared entrywise differences. This is the first reason the norm is so convenient: it turns a list of scalar estimates into one vector estimate.
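The link between entrywise convergence and convergence in Frobenius norm can be watched on a small sequence. The sketch below uses a hypothetical sequence $A_k = A + \tfrac{1}{k}P$ of $2\times 2$ matrices and compares the Frobenius distance to $A$ with the largest entrywise gap. For an $n\times m$ matrix, the largest absolute entry is at most the Frobenius norm, which is at most $\sqrt{nm}$ times the largest absolute entry, so the two quantities tend to zero together.

```python
import math

def frobenius_distance(A, B):
    """Frobenius norm of A - B for matrices given as lists of rows."""
    return math.sqrt(sum((a - b) ** 2
                         for ra, rb in zip(A, B)
                         for a, b in zip(ra, rb)))

# Hypothetical limit matrix and perturbation direction.
limit = [[1.0, -2.0], [3.0, 1.0]]
perturbation = [[1.0, 1.0], [-1.0, 2.0]]

def A_k(k):
    """The k-th matrix of the sequence: limit + perturbation / k."""
    return [[l + p / k for l, p in zip(rl, rp)]
            for rl, rp in zip(limit, perturbation)]

for k in (1, 10, 100, 1000):
    Ak = A_k(k)
    dist = frobenius_distance(Ak, limit)
    max_gap = max(abs(a - b)
                  for ra, rb in zip(Ak, limit)
                  for a, b in zip(ra, rb))
    # Here n * m = 4, so max_gap <= dist <= 2 * max_gap.
    print(k, dist, max_gap)
```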
A linear functional is the smallest non-scalar test case. It has several coefficients but only one output component. This example verifies that the definition reduces to an ordinary Euclidean length in that case.
[example: Linear Functionals]
Let $\ell: \mathbb{R}^3 \to \mathbb{R}$ be given by
\begin{align*}
\ell(x_1,x_2,x_3)=2x_1-x_2+2x_3.
\end{align*}
Since $\ell(e_1)=2$, $\ell(e_2)=-1$, and $\ell(e_3)=2$, the standard matrix is the $1\times 3$ row matrix $(2,-1,2)$. By the definition of the Frobenius norm,
\begin{align*}
\|\ell\|_F=\left(2^2+(-1)^2+2^2\right)^{1/2}.
\end{align*}
The three squares are
\begin{align*}
2^2=4,\quad (-1)^2=1,\quad 2^2=4.
\end{align*}
Thus
\begin{align*}
2^2+(-1)^2+2^2=4+1+4=9.
\end{align*}
Therefore
\begin{align*}
\|\ell\|_F=\sqrt{9}=3.
\end{align*}
For linear maps into $\mathbb{R}$, the Frobenius norm is the Euclidean length of the row of coefficients.
[/example]
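The same computation can be carried out by evaluating the functional on the standard basis. This is a minimal sketch that assumes $\ell$ is represented as a Python callable on lists; the names `ell`, `basis`, and `coefficients` are illustrative.

```python
import math

def ell(x):
    """The linear functional from the example: 2*x1 - x2 + 2*x3."""
    return 2 * x[0] - x[1] + 2 * x[2]

# Evaluating ell on the standard basis recovers the row of coefficients.
basis = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
coefficients = [ell(e) for e in basis]

# The Frobenius norm is the Euclidean length of that row.
norm = math.sqrt(sum(c * c for c in coefficients))

print(coefficients)  # [2, -1, 2]
print(norm)          # 3.0
```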