Given a [linear map](/page/Linear%20Map) $A: \mathbb{R}^n \to \mathbb{R}^m$ represented by a matrix, the transpose $A^\top$ satisfies $(Ax) \cdot y = x \cdot (A^\top y)$ for all $x \in \mathbb{R}^n$ and $y \in \mathbb{R}^m$. This simple identity — the ability to "move" the operator from one side of the inner product to the other — is the foundation of duality in linear algebra: the rank of $A$ equals the rank of $A^\top$, the null space of $A^\top$ is the orthogonal complement of the column space of $A$, and the solvability of $Ax = b$ reduces to a condition on $b$ relative to $\ker(A^\top)$. The adjoint of an operator is the infinite-dimensional generalisation of this idea. It is the central construction linking an operator to its dual, and it provides the language in which spectral theory, the [Fredholm alternative](/page/The%20Fredholm%20Alternative), and the theory of [self-adjoint operators](/page/Self-Adjoint%20Operators) are formulated.
This page develops the adjoint in two settings: first for bounded operators between [Banach spaces](/page/Banach%20Space) (the Banach adjoint, which acts on dual spaces), and then for bounded operators between [Hilbert spaces](/page/Hilbert%20Space) (the Hilbert adjoint, which acts on the spaces themselves). The distinction matters: the Banach adjoint always exists by pure duality, while the Hilbert adjoint relies on the [Riesz Representation Theorem](/theorems/221) to identify a Hilbert space with its dual. The Hilbert adjoint is the one that appears most often in PDE theory and spectral theory, and we develop its properties in detail. We also introduce self-adjoint operators and establish their two most basic properties, namely that their eigenvalues are real and that their operator norm equals the supremum of $|(Tx, x)_H|$ over unit vectors, as preparation for the [separate page](/page/Self-Adjoint%20Operators) devoted to the deeper theory.
## Motivation
[motivation]
### The Finite-Dimensional Template
Consider a linear map $A: \mathbb{R}^n \to \mathbb{R}^m$ and the question: for which $b \in \mathbb{R}^m$ does $Ax = b$ have a solution? The answer is classical: $b$ must lie in the column space of $A$, which is the orthogonal complement of $\ker(A^\top)$. That is, $Ax = b$ is solvable if and only if $y \cdot b = 0$ for every $y$ with $A^\top y = 0$. This characterisation depends entirely on the relationship between $A$ and its transpose $A^\top$, defined by the identity
\begin{align*}
(Ax) \cdot y = x \cdot (A^\top y) \quad \text{for all } x \in \mathbb{R}^n, \, y \in \mathbb{R}^m.
\end{align*}
The transpose gives us a "dual" operator that moves in the reverse direction, and the interplay between $A$ and $A^\top$ — their kernels, their ranges, the dimensional relationships — is the backbone of finite-dimensional linear algebra.
### The Problem in Infinite Dimensions
When we move to operators $T: X \to Y$ between infinite-dimensional Banach spaces, we lose the matrix representation, and with it the concrete recipe for computing a transpose. More fundamentally, a general Banach space has no inner product, so the identity $(Ax) \cdot y = x \cdot (A^\top y)$ does not even make sense. We need a substitute for the inner product pairing. The natural candidate is the *duality pairing* between a space and its dual: for $f \in X^*$ and $x \in X$, the evaluation $f(x)$ plays the role of the inner product. This leads to the Banach adjoint, which operates on dual spaces.
### The Hilbert Space Advantage
In Hilbert spaces, we have both the duality pairing and the inner product, and the [Riesz Representation Theorem](/theorems/221) provides a canonical isometric isomorphism between a Hilbert space and its dual. This identification allows us to "internalise" the adjoint: for a bounded operator $T: H \to K$ between Hilbert spaces, instead of an operator $T^*: K^* \to H^*$ between dual spaces, we obtain an operator $T^*: K \to H$ between the original spaces, satisfying exactly the same identity as the finite-dimensional transpose:
\begin{align*}
(Tx, y)_K = (x, T^*y)_H.
\end{align*}
The Hilbert adjoint is far more powerful than its Banach counterpart because it lives in the same spaces as the original operator. Self-adjointness ($T = T^*$) becomes a meaningful condition, eigenvalue theory acquires a geometric flavour through orthogonal decompositions, and the kernel-range duality relations become statements about orthogonal complements rather than annihilators.
[/motivation]
## The Banach Space Adjoint
Before introducing the Hilbert adjoint, we define the more general Banach adjoint. The construction requires no inner product — only the duality pairing between a Banach space and its [continuous](/page/Continuity) dual.
The idea is simple: if $T: X \to Y$ is bounded and linear, and $f \in Y^*$ is a functional on $Y$, then the composition $f \circ T$ is a functional on $X$. This "pullback" operation defines an operator from $Y^*$ to $X^*$.
[definition:Banach Adjoint]
Let $X$ and $Y$ be Banach spaces and let $T \in \mathcal{L}(X, Y)$. The **Banach adjoint** (also called the **dual operator** or **transpose**) of $T$ is the operator
\begin{align*}
T^*: Y^* &\to X^* \\
f &\mapsto f \circ T,
\end{align*}
so that $(T^*f)(x) = f(Tx)$ for all $x \in X$ and $f \in Y^*$.
[/definition]
The Banach adjoint is automatically bounded with $\|T^*\|_{\mathcal{L}(Y^*, X^*)} = \|T\|_{\mathcal{L}(X, Y)}$. The bound $\|T^*\| \le \|T\|$ is immediate: $|(T^*f)(x)| = |f(Tx)| \le \|f\|_{Y^*} \|Tx\|_Y \le \|f\|_{Y^*} \|T\| \|x\|_X$. The reverse inequality $\|T\| \le \|T^*\|$ follows from the Hahn–Banach theorem: for each $x \in X$ with $Tx \neq 0$, there exists $f \in Y^*$ with $\|f\|_{Y^*} = 1$ and $f(Tx) = \|Tx\|_Y$, giving $\|Tx\|_Y = (T^*f)(x) \le \|T^*f\|_{X^*} \|x\|_X \le \|T^*\| \|x\|_X$.
[example:The Matrix Transpose As Banach Adjoint]
Let $X = \mathbb{R}^n$ and $Y = \mathbb{R}^m$ with the Euclidean norms. Every linear map $A: \mathbb{R}^n \to \mathbb{R}^m$ is represented by an $m \times n$ matrix. The dual $(\mathbb{R}^m)^*$ is canonically identified with $\mathbb{R}^m$ via $f_y \leftrightarrow y$, where $f_y(z) = y \cdot z$, and similarly for $(\mathbb{R}^n)^*$. Under this identification, the Banach adjoint $A^*: (\mathbb{R}^m)^* \to (\mathbb{R}^n)^*$ corresponds to the matrix transpose $A^\top: \mathbb{R}^m \to \mathbb{R}^n$. Indeed, for $y \in \mathbb{R}^m$ and $x \in \mathbb{R}^n$:
\begin{align*}
(A^*f_y)(x) = f_y(Ax) = y \cdot (Ax) = (A^\top y) \cdot x = f_{A^\top y}(x),
\end{align*}
so $A^* f_y = f_{A^\top y}$, which under the identification $f_z \leftrightarrow z$ gives $A^* y = A^\top y$.
[/example]
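The identity in this example can be checked numerically. The following is a minimal sketch using plain Python lists; the helper names `dot`, `matvec`, and `transpose` and the sample matrix are illustrative choices, not part of the page.

```python
# Hypothetical helpers for illustration; plain lists stand in for vectors and matrices.

def dot(u, v):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def matvec(A, x):
    """Apply the matrix A (a list of rows) to the vector x."""
    return [dot(row, x) for row in A]

def transpose(A):
    """Return the transpose of A as a list of rows."""
    return [list(col) for col in zip(*A)]

# A maps R^3 -> R^2, so its transpose maps R^2 -> R^3.
A = [[1, 2, 0],
     [3, -1, 4]]
x = [2, 1, -1]   # element of R^3
y = [5, -2]      # element of R^2, identified with the functional f_y

lhs = dot(y, matvec(A, x))             # f_y(Ax) = (A^* f_y)(x)
rhs = dot(matvec(transpose(A), y), x)  # f_{A^T y}(x)
print(lhs, rhs)
```

Both pairings agree, which is the statement $A^* f_y = f_{A^\top y}$ evaluated at one sample pair $(x, y)$.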
The Banach adjoint has a significant limitation: it maps between dual spaces, not between the original spaces. If $T: X \to Y$, then $T^*: Y^* \to X^*$. In general, $X^*$ and $X$ are different objects — there is no canonical way to identify them. This asymmetry makes the Banach adjoint less suitable for spectral theory, where we want to compare an operator with its adjoint on the *same* space. In Hilbert spaces, the Riesz Representation Theorem resolves this asymmetry.
## The Hilbert Adjoint
The defining identity of the matrix transpose — $(Ax) \cdot y = x \cdot (A^\top y)$ — makes sense in Hilbert spaces because the inner product provides the pairing. The question is whether, given $T \in \mathcal{L}(H, K)$, there always exists an operator $T^* \in \mathcal{L}(K, H)$ satisfying $(Tx, y)_K = (x, T^*y)_H$. This is not a formal tautology: the existence of $T^*$ requires producing, for each $y \in K$, a specific element of $H$ that represents the functional $x \mapsto (Tx, y)_K$. The [Riesz Representation Theorem](/theorems/221) is precisely the tool that supplies this element. The resulting adjoint has the same operator norm as $T$, which is a stronger statement than mere existence — it says the adjoint operation preserves the metric structure of the operator space.
[quotetheorem:550]
The norm identity $\|T^*\| = \|T\|$ has an important consequence that goes beyond the individual operator. Combined with the fact that $(T^*)^* = T$ (which follows from the symmetry of the defining identity), it implies the $C^*$-identity: $\|T^*T\|_{\mathcal{L}(H)} = \|T\|_{\mathcal{L}(H,K)}^2$. To see this, note that $\|T^*T\| \le \|T^*\| \|T\| = \|T\|^2$ by submultiplicativity. For the reverse, $\|Tx\|_K^2 = (Tx, Tx)_K = (x, T^*Tx)_H \le \|x\|_H \|T^*Tx\|_H \le \|x\|_H \|T^*T\| \|x\|_H$, giving $\|T\|^2 \le \|T^*T\|$. This identity characterises the algebras of bounded operators on Hilbert spaces and is the starting point of $C^*$-algebra theory.
### The Relationship Between the Two Adjoints
If $H$ and $K$ are Hilbert spaces, $T \in \mathcal{L}(H, K)$ has both a Banach adjoint $T^*_B: K^* \to H^*$ and a Hilbert adjoint $T^*_H: K \to H$. These are related by the Riesz isomorphisms $\Phi_H: H^* \to H$ and $\Phi_K: K^* \to K$ (which send a functional to its Riesz representer):
\begin{align*}
T^*_H = \Phi_H \circ T^*_B \circ \Phi_K^{-1}.
\end{align*}
In other words, the Hilbert adjoint is obtained from the Banach adjoint by "translating" from dual spaces back to the original spaces via the Riesz identification. Because this identification is isometric, the norm equality $\|T^*_H\| = \|T\|$ is consistent with the Banach adjoint norm equality $\|T^*_B\| = \|T\|$.
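The formula can be verified by following an element $y \in K$ through the three maps. A sketch in the real case, writing $(\cdot, y)_K$ for the functional $z \mapsto (z, y)_K$:
\begin{align*}
\Phi_K^{-1} y &= (\cdot, y)_K \in K^*, \\
T^*_B\big(\Phi_K^{-1} y\big) &= (T\,\cdot, y)_K = (\cdot, T^*_H y)_H \in H^*, \\
\Phi_H\big(T^*_B \Phi_K^{-1} y\big) &= T^*_H y.
\end{align*}
The middle equality is the defining identity of the Hilbert adjoint, and the last line reads off the Riesz representer in $H$.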
### Algebraic Properties of the Hilbert Adjoint
The adjoint operation on $\mathcal{L}(H, K)$ is well-behaved algebraically. All of the following properties are verified by checking the defining identity $(Tx, y) = (x, T^*y)$ on arbitrary elements.
Let $H$, $K$, and $L$ be Hilbert spaces, $S, T \in \mathcal{L}(H, K)$, $R \in \mathcal{L}(K, L)$, and $\lambda \in \mathbb{R}$. Then:
(a) **Linearity:** $(\lambda S + T)^* = \lambda S^* + T^*$. (Over $\mathbb{C}$, the adjoint is instead conjugate-linear: $(\lambda S + T)^* = \bar{\lambda}S^* + T^*$.)
(b) **Involution:** $(T^*)^* = T$.
(c) **Composition:** $(R \circ T)^* = T^* \circ R^*$.
(d) **Identity:** $I^* = I$.
(e) **Inversion:** If $T \in \mathcal{L}(H, K)$ is bijective with bounded inverse, then $(T^{-1})^* = (T^*)^{-1}$.
Property (b) follows from the symmetry of the inner product: $(T^*y, x)_H = \overline{(x, T^*y)_H} = \overline{(Tx, y)_K} = (y, Tx)_K$, so $T^*$ has adjoint $T$. Property (c) is the operator analogue of $(AB)^\top = B^\top A^\top$: for $x \in H$ and $z \in L$, $(RTx, z)_L = (Tx, R^*z)_K = (x, T^*R^*z)_H$.
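Property (e) can be obtained from (c) and (d) alone. One way to see it, sketched here, is to take adjoints of the two identities $T^{-1} T = I_H$ and $T T^{-1} = I_K$ that characterise the inverse:
\begin{align*}
T^* (T^{-1})^* &= (T^{-1} T)^* = I_H^* = I_H, \\
(T^{-1})^* T^* &= (T T^{-1})^* = I_K^* = I_K.
\end{align*}
Hence $(T^{-1})^* \in \mathcal{L}(H, K)$ is a two-sided inverse of $T^* \in \mathcal{L}(K, H)$, which is the claim $(T^{-1})^* = (T^*)^{-1}$.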
[example:Adjoint Of A Multiplication Operator]
Let $H = L^2(U, \mathcal{L}^n)$ where $U \subseteq \mathbb{R}^n$ is a measurable [set](/page/Set), and let $m \in L^\infty(U)$ be a real-valued essentially bounded [function](/page/Function). Define the multiplication operator
\begin{align*}
M_m: L^2(U) &\to L^2(U) \\
f &\mapsto mf,
\end{align*}
so that $(M_m f)(x) = m(x) f(x)$ for $\mathcal{L}^n$-a.e. $x \in U$. This is bounded with $\|M_m\|_{\mathcal{L}(L^2)} = \|m\|_{L^\infty}$. The adjoint $M_m^*$ is determined by the identity:
\begin{align*}
(M_m f, g)_{L^2} = \int_U m(x) f(x) g(x) \, d\mathcal{L}^n(x) = \int_U f(x) m(x) g(x) \, d\mathcal{L}^n(x) = (f, M_m g)_{L^2}.
\end{align*}
Therefore $M_m^* = M_m$. That is, multiplication by a real-valued $L^\infty$ function is self-adjoint. Over $\mathbb{C}$, we would instead have $M_m^* = M_{\bar{m}}$: the adjoint of multiplication by $m$ is multiplication by $\bar{m}$.
[/example]
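For the complex case mentioned at the end of the example, the computation can be sketched as follows, assuming the inner product $(f, g)_{L^2} = \int_U f \bar{g} \, d\mathcal{L}^n$, conjugate-linear in the second argument:
\begin{align*}
(M_m f, g)_{L^2} = \int_U m f \, \bar{g} \, d\mathcal{L}^n = \int_U f \, \overline{\bar{m} g} \, d\mathcal{L}^n = (f, M_{\bar{m}} g)_{L^2}.
\end{align*}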
## Kernel-Range Duality
The most important structural consequence of the adjoint is the relationship between the kernels and ranges of $T$ and $T^*$. In finite dimensions, the rank-nullity theorem says $\dim(\ker A) + \dim(\operatorname{Range} A) = n$ for an $m \times n$ matrix $A$, and the "dual" rank-nullity theorem for $A^\top$ gives $\dim(\ker A^\top) + \dim(\operatorname{Range} A^\top) = m$. The transpose identity gives the orthogonal complement relations $\ker(A^\top) = \operatorname{Range}(A)^\perp$ and $\ker(A) = \operatorname{Range}(A^\top)^\perp$ directly: $y \in \operatorname{Range}(A)^\perp$ means $(Ax) \cdot y = x \cdot (A^\top y) = 0$ for all $x$, which holds exactly when $A^\top y = 0$. Combined with the two dimension counts, these relations also recover $\operatorname{rank}(A) = \operatorname{rank}(A^\top)$.
In infinite dimensions, these identities survive — but with a crucial refinement: the closure of the range replaces the range itself. The range of a bounded operator between infinite-dimensional spaces need not be closed (this failure is one of the central difficulties of operator theory), and the orthogonal complement relations hold for the *closed* range. The following theorem makes this precise.
[quotetheorem:551]
The power of these relations lies in their reformulation of solvability. The equation $Tx = y$ has a solution if and only if $y \in \operatorname{Range}(T)$. By (i), every such $y$ satisfies $y \perp \ker(T^*)$, so orthogonality to $\ker(T^*)$ is always a necessary condition. When $\operatorname{Range}(T)$ is closed, the relation $\overline{\operatorname{Range}(T)} = \ker(T^*)^\perp$ shows that the condition is also sufficient: $Tx = y$ is solvable *if and only if* $(y, z)_K = 0$ for every $z \in \ker(T^*)$. This is the content of the [Fredholm alternative](/page/The%20Fredholm%20Alternative) in its simplest form. In the theory of [elliptic PDEs](/page/Second-Order%20Elliptic%20Equations), this principle determines whether a [boundary](/page/Boundary) value problem $Lu = f$ has a solution: one must check $f$ against the solutions of the adjoint homogeneous problem $L^*v = 0$.
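In finite dimensions every range is closed, so the orthogonality test is both necessary and sufficient. The following sketch illustrates this for a sample $3 \times 2$ matrix; the matrix, the vector `z`, and the helper names are illustrative assumptions.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def matvec(A, x):
    """Apply the matrix A (a list of rows) to the vector x."""
    return [dot(row, x) for row in A]

# A maps R^2 -> R^3; its columns span a plane in R^3.
A = [[1, 0],
     [0, 1],
     [1, 1]]
A_T = [list(col) for col in zip(*A)]

# ker(A^T) is spanned by z = (1, 1, -1): A^T z = (1 - 1, 1 - 1) = 0.
z = [1, 1, -1]
kernel_check = matvec(A_T, z)

b_good = matvec(A, [2, 3])   # lies in Range(A) by construction
b_bad = [1, 0, 0]            # candidate right-hand side to be tested

good_pairing = dot(z, b_good)  # vanishes, consistent with solvability
bad_pairing = dot(z, b_bad)    # nonzero, so A x = b_bad has no solution
print(kernel_check, good_pairing, bad_pairing)
```

A nonzero pairing with an element of $\ker(A^\top)$ certifies that the system has no solution, without attempting to solve it.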
The "in particular" statement deserves emphasis: injectivity and dense range are dual properties. An operator $T$ is injective if and only if $T^*$ has dense range, and $T$ has dense range if and only if $T^*$ is injective. This duality is used constantly in functional analysis — for instance, to show that certain operators are injective by exhibiting their adjoints as having dense range.
[example:The Left And Right Shift Operators]
Let $H = \ell^2(\mathbb{N})$, the space of square-summable real [sequences](/page/Sequence) $x = (x_1, x_2, x_3, \dots)$ with the inner product $(x, y)_{\ell^2} = \sum_{k=1}^\infty x_k y_k$. Define the right shift operator
\begin{align*}
S: \ell^2(\mathbb{N}) &\to \ell^2(\mathbb{N}) \\
(x_1, x_2, x_3, \dots) &\mapsto (0, x_1, x_2, x_3, \dots)
\end{align*}
and the left shift operator
\begin{align*}
L: \ell^2(\mathbb{N}) &\to \ell^2(\mathbb{N}) \\
(x_1, x_2, x_3, \dots) &\mapsto (x_2, x_3, x_4, \dots).
\end{align*}
We verify that $S^* = L$. For any $x, y \in \ell^2(\mathbb{N})$:
\begin{align*}
(Sx, y)_{\ell^2} = \sum_{k=1}^\infty (Sx)_k y_k = \sum_{k=2}^\infty x_{k-1} y_k = \sum_{j=1}^\infty x_j y_{j+1} = \sum_{j=1}^\infty x_j (Ly)_j = (x, Ly)_{\ell^2},
\end{align*}
where we substituted $j = k - 1$. So $S^* = L$.
This example illustrates the kernel-range duality. The right shift $S$ is injective ($\ker(S) = \{0\}$), so by part (iv), $\overline{\operatorname{Range}(S^*)} = \overline{\operatorname{Range}(L)} = \ker(S)^\perp = \{0\}^\perp = \ell^2(\mathbb{N})$. Indeed, $L$ is surjective (for any target $y = (y_1, y_2, \dots) \in \ell^2$, choose $x = (0, y_1, y_2, \dots)$), so its range is all of $\ell^2$, consistent with the prediction.
Conversely, $L$ is not injective: $\ker(L) = \operatorname{span}\{e_1\}$ where $e_1 = (1, 0, 0, \dots)$. By part (i), $\ker(L) = \ker(S^*) = \operatorname{Range}(S)^\perp$. This says that the range of the right shift — the set of sequences with first coordinate zero — has orthogonal complement $\operatorname{span}\{e_1\}$, which is correct. Notice that $\operatorname{Range}(S)$ is closed (it is the hyperplane $\{x \in \ell^2 : x_1 = 0\}$) but is not all of $\ell^2$, so $S$ is not surjective. Correspondingly, $S^* = L$ is not injective.
[/example]
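On finitely supported sequences the identity $S^* = L$ can be checked exactly with integer arithmetic. A minimal sketch, where the function names `right_shift`, `left_shift`, and `inner` are illustrative and lists stand in for sequences with zero tails:

```python
def right_shift(x):
    """S: prepend a zero, (x1, x2, ...) -> (0, x1, x2, ...)."""
    return [0] + list(x)

def left_shift(x):
    """L: drop the first entry, (x1, x2, ...) -> (x2, x3, ...)."""
    return list(x[1:])

def inner(x, y):
    """l^2 inner product of finitely supported sequences.

    zip stops at the shorter list, which matches padding it with zeros.
    """
    return sum(a * b for a, b in zip(x, y))

x = [1, 2, 3]
y = [4, 5, 6, 7]
e1 = [1, 0, 0]

print(inner(right_shift(x), y), inner(x, left_shift(y)))  # (Sx, y) and (x, Ly)
print(left_shift(right_shift(x)))   # L S x recovers x
print(right_shift(left_shift(y)))   # S L y loses the first coordinate of y
print(left_shift(e1))               # e1 lies in ker(L)
```

The asymmetry between $LS = I$ and $SL \neq I$ reflects the example: $S$ is injective but not surjective, while $L$ is surjective but not injective.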
### Why Closedness of the Range Matters
The distinction between parts (i)–(ii) and (iii)–(iv) of the [Kernel-Range Duality](/theorems/551) is not a technicality; it reflects a genuine phenomenon. If $\operatorname{Range}(T)$ is not closed, then $\ker(T^*)^\perp = \overline{\operatorname{Range}(T)} \supsetneq \operatorname{Range}(T)$: there exist vectors $y$ that are orthogonal to every solution of the adjoint homogeneous equation $T^*z = 0$, yet the equation $Tx = y$ still has no solution. In other words, the condition $y \perp \ker(T^*)$ remains necessary for solvability but is no longer sufficient. The additional hypothesis that the range is closed (which in the [Fredholm alternative](/page/The%20Fredholm%20Alternative) is guaranteed by the compactness of the relevant operator) is what upgrades the orthogonality condition from necessary to necessary and sufficient.
[example:Failure Of Solvability With Non Closed Range]
Let $H = \ell^2(\mathbb{N})$ and define $T \in \mathcal{L}(\ell^2)$ by
\begin{align*}
T: \ell^2(\mathbb{N}) &\to \ell^2(\mathbb{N}) \\
(x_1, x_2, x_3, \dots) &\mapsto \left(x_1, \frac{x_2}{2}, \frac{x_3}{3}, \dots\right).
\end{align*}
Then $T$ is bounded (with $\|T\| = 1$), injective, and self-adjoint ($T^* = T$, since the eigenvalues $1/k$ are real and the standard basis vectors are eigenvectors). Since $T^* = T$ is injective, we have $\ker(T^*) = \{0\}$, so $\ker(T^*)^\perp = \ell^2$ and the duality relation gives $\overline{\operatorname{Range}(T)} = \ell^2$.
However, $\operatorname{Range}(T)$ itself is not closed. Consider the sequence $y = (1, 1/2, 1/3, \dots) \in \ell^2$ (since $\sum 1/k^2 < \infty$). If $Tx = y$, then $x_k/k = 1/k$, giving $x_k = 1$ for all $k$, so $x = (1, 1, 1, \dots) \notin \ell^2$. Therefore $y \notin \operatorname{Range}(T)$ even though $y \perp \ker(T^*) = \{0\}$.
This example shows that the equation $Tx = y$ can fail to be solvable even when the necessary orthogonality condition is satisfied, precisely because the range is not closed.
[/example]
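The failure can be made quantitative. The truncations $x^{(N)} = (1, \dots, 1, 0, 0, \dots)$ with $N$ ones satisfy $T x^{(N)} \to y$ in $\ell^2$, so $y$ lies in the closure of the range, while $\|x^{(N)}\| = \sqrt{N}$ is unbounded. The sketch below computes both quantities; the function names are illustrative, and the tail sum is evaluated through the Basel identity $\sum_{k \ge 1} 1/k^2 = \pi^2/6$.

```python
import math

def residual_sq(N):
    """||T x^(N) - y||^2 where x^(N) = (1, ..., 1, 0, 0, ...) has N ones.

    T x^(N) agrees with y = (1, 1/2, 1/3, ...) in the first N slots, so the
    residual is the tail sum of 1/k^2 over k > N, computed here as
    pi^2/6 minus the partial sum.
    """
    return math.pi ** 2 / 6 - sum(1.0 / k ** 2 for k in range(1, N + 1))

def norm_x(N):
    """||x^(N)|| = sqrt(N), which grows without bound."""
    return math.sqrt(N)

for N in (10, 100, 1000):
    print(N, residual_sq(N), norm_x(N))
```

The residual shrinks roughly like $1/N$ while the norm of the approximate solution grows like $\sqrt{N}$, which is the typical signature of a non-closed range.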
## Self-Adjoint Operators: First Properties
An operator that equals its own adjoint occupies a distinguished position in the theory — analogous to symmetric matrices in finite dimensions. The full theory of self-adjoint operators, including spectral decomposition and the functional calculus, is developed on the [self-adjoint operators page](/page/Self-Adjoint%20Operators). Here we introduce the definition and establish the two most basic structural consequences: eigenvalues are real, and the operator norm is determined by the quadratic form.
The condition $T = T^*$ requires $H = K$ (since $T: H \to K$ and $T^*: K \to H$ must have the same domain and codomain). It says that $(Tx, y)_H = (x, Ty)_H$ for all $x, y \in H$, so that $(x, y) \mapsto (Tx, y)_H$ is a *symmetric* bilinear form (Hermitian over $\mathbb{C}$). A symmetric form is recovered from its diagonal values by polarisation, so the quadratic form $x \mapsto (Tx, x)_H$ completely determines a self-adjoint operator.
[definition:Self-Adjoint Operator]
Let $H$ be a Hilbert space with inner product $(\cdot, \cdot)_H$. A bounded linear operator $T \in \mathcal{L}(H)$ is **self-adjoint** if $T^* = T$, that is,
\begin{align*}
(Tx, y)_H = (x, Ty)_H \quad \text{for all } x, y \in H.
\end{align*}
[/definition]
The multiplication operator $M_m$ from the earlier example is self-adjoint when $m$ is real-valued. In finite dimensions, self-adjoint operators on $\mathbb{R}^n$ (with the standard inner product) are exactly the symmetric matrices.
The first fundamental property is that self-adjointness forces all eigenvalues to be real, even when the underlying field is $\mathbb{C}$. This should be compared with the finite-dimensional case: a real symmetric matrix has only real eigenvalues, and the proof is the same.
[theorem:Eigenvalues Of Self-Adjoint Operators Are Real]
Let $H$ be a Hilbert space over $\mathbb{C}$ and let $T \in \mathcal{L}(H)$ be self-adjoint. If $\lambda \in \mathbb{C}$ is an eigenvalue of $T$ with eigenvector $x \in H \setminus \{0\}$ (so $Tx = \lambda x$), then $\lambda \in \mathbb{R}$.
Moreover, eigenvectors corresponding to distinct eigenvalues are orthogonal: if $Tx = \lambda x$ and $Ty = \mu y$ with $\lambda \neq \mu$ and $x, y \neq 0$, then $(x, y)_H = 0$.
[/theorem]
[proof]
**Step 1: Eigenvalues are real.**
\begin{align*}
\lambda \|x\|_H^2 = \lambda(x, x)_H = (\lambda x, x)_H = (Tx, x)_H = (x, Tx)_H = (x, \lambda x)_H = \bar{\lambda}(x, x)_H = \bar{\lambda}\|x\|_H^2.
\end{align*}
Since $x \neq 0$, we have $\|x\|_H^2 > 0$, so $\lambda = \bar{\lambda}$, which means $\lambda \in \mathbb{R}$.
**Step 2: Orthogonality of eigenvectors for distinct eigenvalues.**
\begin{align*}
\lambda(x, y)_H = (\lambda x, y)_H = (Tx, y)_H = (x, Ty)_H = (x, \mu y)_H = \bar{\mu}(x, y)_H = \mu(x, y)_H,
\end{align*}
where the last equality uses the fact that $\mu \in \mathbb{R}$ (by Step 1). Therefore $(\lambda - \mu)(x, y)_H = 0$. Since $\lambda \neq \mu$, we conclude $(x, y)_H = 0$.
[/proof]
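In the simplest finite-dimensional case both conclusions can be observed directly. The sketch below treats a real symmetric $2 \times 2$ matrix with nonzero off-diagonal entry; the function `symmetric_2x2_eigen` is a hypothetical helper written for this illustration.

```python
import math

def matvec(A, x):
    """Apply the matrix A (a list of rows) to the vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def dot(u, v):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def symmetric_2x2_eigen(a, b, d):
    """Eigenpairs of the real symmetric matrix [[a, b], [b, d]] with b != 0.

    The discriminant (a - d)^2 + 4 b^2 is non-negative, so both roots of the
    characteristic polynomial are real.
    """
    root = math.sqrt((a - d) ** 2 + 4 * b ** 2)
    lam1 = (a + d - root) / 2
    lam2 = (a + d + root) / 2
    # For b != 0, the vector (b, lam - a) solves (A - lam I) v = 0.
    return (lam1, [b, lam1 - a]), (lam2, [b, lam2 - a])

A = [[2, 1], [1, 2]]
(lam1, v1), (lam2, v2) = symmetric_2x2_eigen(2, 1, 2)

print(lam1, v1, matvec(A, v1))
print(lam2, v2, matvec(A, v2))
print(dot(v1, v2))  # eigenvectors for distinct eigenvalues are orthogonal
```

The non-negative discriminant is the $2 \times 2$ shadow of Step 1, and the vanishing inner product is Step 2.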
The second fundamental property connects the operator norm to the quadratic form $(Tx, x)_H$. For a general operator, $\|T\| = \sup_{\|x\| = 1} \|Tx\|$, and the inner quantity $|(Tx, x)|$ can be strictly smaller than $\|Tx\| \|x\|$ (the Cauchy–Schwarz inequality is not always sharp). For self-adjoint operators, however, the quadratic form captures the full norm. This is because the symmetric bilinear form $(Tx, y)_H$ is completely determined by its diagonal values $(Tx, x)_H$ via the polarisation identity.
[theorem:Norm Of A Self-Adjoint Operator]
Let $H$ be a Hilbert space and let $T \in \mathcal{L}(H)$ be self-adjoint. Then
\begin{align*}
\|T\|_{\mathcal{L}(H)} = \sup_{\substack{x \in H \\ \|x\|_H = 1}} |(Tx, x)_H|.
\end{align*}
[/theorem]
[proof]
**Step 1: The upper bound.**
By the Cauchy–Schwarz inequality, $|(Tx, x)_H| \le \|Tx\|_H \|x\|_H \le \|T\| \|x\|_H^2$. For $\|x\|_H = 1$, this gives $|(Tx, x)_H| \le \|T\|$. Let $M := \sup_{\|x\| = 1} |(Tx, x)_H|$. Then $M \le \|T\|$.
**Step 2: The reverse inequality via polarisation.**
[claim:Polarisation Bound]
For all $u, v \in H$ with $\|u\|_H = \|v\|_H = 1$:
\begin{align*}
|(Tu, v)_H| \le M.
\end{align*}
[/claim]
[proof]
The self-adjointness of $T$ gives the real polarisation identity: for any $u, v \in H$,
\begin{align*}
4(Tu, v)_H = (T(u + v), u + v)_H - (T(u - v), u - v)_H.
\end{align*}
To verify this, expand both sides using bilinearity and $T^* = T$:
\begin{align*}
(T(u+v), u+v)_H &= (Tu, u)_H + (Tu, v)_H + (Tv, u)_H + (Tv, v)_H, \\
(T(u-v), u-v)_H &= (Tu, u)_H - (Tu, v)_H - (Tv, u)_H + (Tv, v)_H.
\end{align*}
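Subtracting: $(T(u+v), u+v)_H - (T(u-v), u-v)_H = 2(Tu, v)_H + 2(Tv, u)_H$. Since $T$ is self-adjoint, $(Tv, u)_H = (v, Tu)_H = \overline{(Tu, v)_H}$, and over $\mathbb{R}$ this equals $(Tu, v)_H$, so the difference equals $4(Tu, v)_H$. (Over $\mathbb{C}$ the difference equals $4 \operatorname{Re}(Tu, v)_H$; replacing $v$ by $e^{i\theta} v$ with $\theta$ chosen so that $(Tu, e^{i\theta} v)_H = |(Tu, v)_H|$ reduces the estimate below to the real case.)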
By the definition of $M$ and the parallelogram-type estimate:
\begin{align*}
4|(Tu, v)_H| &\le |(T(u+v), u+v)_H| + |(T(u-v), u-v)_H| \\
&\le M\|u + v\|_H^2 + M\|u - v\|_H^2 \\
&= M \cdot 2(\|u\|_H^2 + \|v\|_H^2) = 4M,
\end{align*}
where the last line uses the parallelogram law together with the normalisation $\|u\|_H = \|v\|_H = 1$. Dividing by $4$ gives $|(Tu, v)_H| \le M$.
[/proof]
Taking the supremum over $\|v\| = 1$ gives $\|Tu\|_H = \sup_{\|v\| = 1} |(Tu, v)_H| \le M$ for all unit vectors $u$. Therefore $\|T\| \le M$, and combined with Step 1, $\|T\| = M$.
[/proof]
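The self-adjointness hypothesis is essential: for a general operator, the supremum of $|(Tx, x)_H|$ over unit vectors can be strictly smaller than $\|T\|$. For example, the truncated shift $N(x_1, x_2) = (0, x_1)$ on $\mathbb{R}^2$ satisfies $\|N\| = 1$, but $(Nx, x) = x_1 x_2$ has absolute value at most $\tfrac{1}{2}(x_1^2 + x_2^2) = \tfrac{1}{2}$ on unit vectors. The right shift $S$ on $\ell^2(\mathbb{N})$ itself does not separate the two quantities: $\|S\| = 1$ and $(Sx, x) = \sum_{k=1}^\infty x_k x_{k+1}$ has supremum $1$ over unit vectors, approached by $x = n^{-1/2}(1, \dots, 1, 0, \dots)$ with $n$ ones in the [limit](/page/Limit) $n \to \infty$ but never attained.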
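The contrast between self-adjoint and general operators can be seen numerically in $\mathbb{R}^2$. The sketch below samples unit vectors on the circle and compares the supremum of $|(Ax, x)|$ with the operator norm, for a symmetric matrix and for the two-dimensional truncated shift $(x_1, x_2) \mapsto (0, x_1)$. The function `sampled_sups` and both sample matrices are illustrative assumptions, and sampling only approximates the true suprema.

```python
import math

def matvec(A, x):
    """Apply the matrix A (a list of rows) to the vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def dot(u, v):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def sampled_sups(A, samples=3600):
    """Approximate sup |(Ax, x)| and sup ||Ax|| over unit vectors in R^2."""
    best_form, best_norm = 0.0, 0.0
    for i in range(samples):
        theta = 2 * math.pi * i / samples
        x = [math.cos(theta), math.sin(theta)]
        Ax = matvec(A, x)
        best_form = max(best_form, abs(dot(Ax, x)))
        best_norm = max(best_norm, math.sqrt(dot(Ax, Ax)))
    return best_form, best_norm

symmetric = [[2, 1], [1, 2]]        # self-adjoint
truncated_shift = [[0, 0], [1, 0]]  # (x1, x2) -> (0, x1), not self-adjoint

form_sym, norm_sym = sampled_sups(symmetric)
form_shift, norm_shift = sampled_sups(truncated_shift)
print(form_sym, norm_sym)      # the two values agree for the symmetric matrix
print(form_shift, norm_shift)  # the quadratic form falls short of the norm
```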
## Problems
[problem]
Let $H = L^2([0,1], \mathcal{L}^1)$ and let $k \in L^2([0,1]^2, \mathcal{L}^2)$ be a kernel function. Define the [integral](/page/Integral) operator
\begin{align*}
T: L^2([0,1]) &\to L^2([0,1]) \\
(Tf)(x) &= \int_0^1 k(x, t) f(t) \, d\mathcal{L}^1(t).
\end{align*}
(a) Show that $T \in \mathcal{L}(L^2([0,1]))$ and that $\|T\|_{\mathcal{L}(L^2)} \le \|k\|_{L^2([0,1]^2)}$.
(b) Determine the adjoint $T^*$ and show that it is the integral operator with kernel $k^*(x, t) = k(t, x)$.
(c) Determine a necessary and sufficient condition on $k$ for $T$ to be self-adjoint.
[/problem]
[solution]
**Step 1: Boundedness.** For $f \in L^2([0,1])$, estimate $\|Tf\|_{L^2}$ using the Cauchy–Schwarz inequality in $t$ and then Fubini's theorem:
\begin{align*}
\|Tf\|_{L^2}^2 &= \int_0^1 \left|\int_0^1 k(x,t) f(t) \, d\mathcal{L}^1(t)\right|^2 d\mathcal{L}^1(x) \\
&\le \int_0^1 \left(\int_0^1 |k(x,t)|^2 \, d\mathcal{L}^1(t)\right) \left(\int_0^1 |f(t)|^2 \, d\mathcal{L}^1(t)\right) d\mathcal{L}^1(x) \\
&= \|f\|_{L^2}^2 \int_0^1 \int_0^1 |k(x,t)|^2 \, d\mathcal{L}^1(t) \, d\mathcal{L}^1(x) \\
&= \|k\|_{L^2([0,1]^2)}^2 \|f\|_{L^2}^2.
\end{align*}
Therefore $T$ is bounded with $\|T\|_{\mathcal{L}(L^2)} \le \|k\|_{L^2([0,1]^2)}$.
**Step 2: Computing the adjoint.** For $f, g \in L^2([0,1])$:
\begin{align*}
(Tf, g)_{L^2} &= \int_0^1 (Tf)(x) g(x) \, d\mathcal{L}^1(x) \\
&= \int_0^1 \int_0^1 k(x, t) f(t) g(x) \, d\mathcal{L}^1(t) \, d\mathcal{L}^1(x) \\
&= \int_0^1 f(t) \int_0^1 k(x, t) g(x) \, d\mathcal{L}^1(x) \, d\mathcal{L}^1(t) \\
&= \int_0^1 f(t) (T^*g)(t) \, d\mathcal{L}^1(t),
\end{align*}
where the exchange of integration order is justified by Fubini's theorem: the function $(x, t) \mapsto f(t) g(x)$ lies in $L^2([0,1]^2)$ with norm $\|f\|_{L^2} \|g\|_{L^2}$, so by the Cauchy–Schwarz inequality the integrand $k(x,t) f(t) g(x)$ is in $L^1([0,1]^2)$. Reading off the last line:
\begin{align*}
(T^*g)(t) = \int_0^1 k(x, t) g(x) \, d\mathcal{L}^1(x) = \int_0^1 k^*(t, x) g(x) \, d\mathcal{L}^1(x),
\end{align*}
where $k^*(t, x) := k(x, t)$ is the transposed kernel. So $T^*$ is the integral operator with kernel $k^*$.
**Step 3: Self-adjointness.** The operator $T$ is self-adjoint if and only if $T = T^*$, which holds if and only if $k(x, t) = k(t, x)$ for $\mathcal{L}^2$-a.e. $(x, t) \in [0,1]^2$. That is, $T$ is self-adjoint if and only if $k$ is a symmetric kernel.
[/solution]
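A discretised version of this problem makes the transposed kernel concrete. The sketch below replaces the integrals by midpoint sums on a uniform grid, so that $T$ becomes a matrix and the operator with kernel $k^*$ becomes its transpose. The sample kernel $k(x, t) = x t^2$, the test functions, and all function names are illustrative assumptions, and the discretisation only approximates the continuous operators.

```python
import math

def discretise(kernel, n):
    """Midpoint-rule matrix K[i][j] = kernel(x_i, t_j) / n on an n-point grid."""
    pts = [(i + 0.5) / n for i in range(n)]
    matrix = [[kernel(x, t) / n for t in pts] for x in pts]
    return matrix, pts

def matvec(A, v):
    """Apply the matrix A (a list of rows) to the vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def inner(u, v):
    """Midpoint approximation of the L^2([0,1]) inner product of grid functions."""
    return sum(a * b for a, b in zip(u, v)) / len(u)

def k(x, t):
    """A sample non-symmetric kernel."""
    return x * t ** 2

def k_star(x, t):
    """The transposed kernel k*(x, t) = k(t, x)."""
    return k(t, x)

n = 50
K, pts = discretise(k, n)
K_star, _ = discretise(k_star, n)

f = [math.sin(3 * p) for p in pts]
g = [1 + p for p in pts]

lhs = inner(matvec(K, f), g)       # approximates (Tf, g)
rhs = inner(f, matvec(K_star, g))  # approximates (f, T^* g)
print(lhs, rhs)
```

At the discrete level the two pairings agree up to rounding, because the matrix built from $k^*$ is exactly the transpose of the matrix built from $k$.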
## References
- Brezis, *Functional Analysis, [Sobolev Spaces](/page/Sobolev%20Space) and Partial Differential Equations* (2011), Chapter 2.
- Conway, *A Course in Functional Analysis* (1990), Chapter II.
- Lax, *Functional Analysis* (2002), Chapters 6 and 31.
- Reed and Simon, *Methods of Modern Mathematical Physics I: Functional Analysis* (1980), Chapter VI.