Symmetric Matrix

Also known as: Symmetric matrices, Real symmetric matrix, Self-adjoint matrix, Symmetric operator matrix, Symmetric square matrix


A symmetric matrix is the matrix form of a linear transformation whose action is balanced with respect to the Euclidean [inner product](/page/Inner%20Product). When a matrix $A$ is used in expressions such as $\langle Ax,y\rangle$, $x^\top Ay$, or $x^\top Ax$, symmetry determines whether the two vector inputs can be exchanged without changing the scalar measurement. This is why symmetric matrices appear across linear algebra, quadratic forms, [inner product spaces](/page/Inner%20Product%20Space), multivariable calculus, probability, and optimization.

A general real square matrix may rotate, shear, and stretch space in coupled ways. A real symmetric matrix has a much more rigid geometry: it admits perpendicular eigenvector directions and real eigenvalues. That geometric rigidity is the reason symmetric matrices are central to the spectral theorem, positive definiteness, Hessian tests, and covariance matrices.

This page works over the [real numbers](/page/Real%20Numbers). Over complex vector spaces, the closely related analogue is Hermitian symmetry, where transpose is replaced by conjugate transpose; confusing the two loses the inner-product interpretation.

## Definition

The central condition is that entries mirror across the main diagonal. This is the matrix-level way to say that the interaction from coordinate $i$ to coordinate $j$ is the same as the interaction from coordinate $j$ to coordinate $i$.

[definition: Symmetric Matrix] Let $n \in \mathbb{N}$. A matrix $A \in \mathbb{R}^{n \times n}$ with entries $A_{ij}$ is symmetric if \begin{align*} A_{ij}=A_{ji} \end{align*} for all $1 \le i,j \le n$. [/definition]

This definition is deliberately entrywise: it can be checked without choosing any extra language beyond the matrix entries themselves. Most of the theory, however, is cleaner once the row-column reflection operation has a name.
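The entrywise test can be sketched in a few lines of Python. This is only an illustration: the function name `is_symmetric` and the list-of-rows representation are assumptions made for this sketch, and exact equality is used, so floating-point data would need a tolerance instead.

```python
def is_symmetric(a):
    """Return True if the square matrix `a` (a list of rows) has a[i][j] == a[j][i]."""
    n = len(a)
    if any(len(row) != n for row in a):
        raise ValueError("matrix must be square")
    # Only pairs with i < j need checking: diagonal entries match themselves,
    # and the pair (j, i) is the same comparison as (i, j).
    return all(a[i][j] == a[j][i] for i in range(n) for j in range(i + 1, n))


print(is_symmetric([[2, -3], [-3, 5]]))  # True
print(is_symmetric([[1, 4], [0, 3]]))    # False
```

Checking only the entries above the diagonal mirrors the definition: an $n \times n$ matrix has $n(n-1)/2$ independent off-diagonal conditions.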
## Basic Notation and Related Definitions

The compact test for symmetry is $A^\top=A$, so the transpose operation must be fixed before it is used in later characterisations. Entrywise conditions are good for checking a matrix by hand, but they become awkward once matrices are added, multiplied, or used to represent linear maps. Naming the operation gives a reusable language for every comparison between a matrix and its reflected version.

[definition: Transpose of a Matrix] Let $m,n \in \mathbb{N}$. The transpose map is the function \begin{align*} T: \mathbb{R}^{m \times n} \to \mathbb{R}^{n \times m}, \qquad A \mapsto A^\top \end{align*} such that, if $A$ has entries $A_{ij}$, then $A^\top$ has entries satisfying \begin{align*} (A^\top)_{ij}=A_{ji}. \end{align*} [/definition]

With this notation, a real square matrix is symmetric exactly when $A^\top=A$.

Once the equation $A^\top=A$ is fixed, it is useful to ask where all matrices satisfying that equation live. They are not scattered examples: adding two such matrices or scaling one preserves the same transpose equation, so the symmetric matrices of a fixed size form a natural ambient space for later projections, bases, and dimension counts.

[definition: Space of Symmetric Matrices] For $n \in \mathbb{N}$, the space of real symmetric $n \times n$ matrices is \begin{align*} \operatorname{Sym}_n(\mathbb{R}) := \{A \in \mathbb{R}^{n \times n} : A^\top=A\}. \end{align*} [/definition]

Symmetry is only one way a matrix can interact simply with transposition. To separate the part unchanged by transpose from the part that cancels under transpose, we also need the opposite sign behavior; these matrices will be invisible to quadratic expressions of the form $x^\top Ax$.

[definition: Skew-Symmetric Matrix] Let $n \in \mathbb{N}$. A matrix $A \in \mathbb{R}^{n \times n}$ is skew-symmetric if \begin{align*} A^\top=-A. \end{align*} [/definition]

In entries, symmetry says $A_{ij}=A_{ji}$ and skew-symmetry says $A_{ij}=-A_{ji}$ for all $1 \le i,j \le n$; in particular, every diagonal entry of a skew-symmetric matrix is zero.
The notation $\operatorname{Sym}_n(\mathbb{R})$ should not be confused with the [symmetric group](/page/Symmetric%20Group) $S_n$: the former is a [vector space](/page/Vector%20Space) of matrices, and the latter is a group of permutations. Symmetric and skew-symmetric matrices form the two natural halves of a real square matrix. The symmetric half controls quadratic forms, while the skew-symmetric half controls the alternating part of the associated bilinear form.

## Equivalent Characterisations

The entrywise definition is useful for checking examples, but it does not yet explain the geometry. The geometric meaning is that a symmetric matrix can be moved from one side of the Euclidean inner product to the other.

[quotetheorem:9901]

This theorem identifies symmetric matrices as the coordinate representatives of self-adjoint linear maps on Euclidean space with its standard inner product and the standard [orthonormal basis](/page/Orthonormal%20Basis). It thereby connects this page to [linear maps](/page/Linear%20Map) and [self-adjoint operators](/page/Self-Adjoint%20Operators). With a different inner product or a non-orthonormal basis, the coordinate matrix of the same self-adjoint map need not satisfy the entrywise condition $A^\top=A$.

The same exchange principle can be expressed without mentioning an operator moving across an inner product. A square matrix also defines a two-input scalar function, and symmetry asks whether the two inputs play interchangeable roles.

[definition: Bilinear Form Associated to a Matrix] Let $n \in \mathbb{N}$ and let $A \in \mathbb{R}^{n \times n}$. The bilinear form associated to $A$ is the function \begin{align*} B_A: \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}, \qquad (x,y) \mapsto x^\top Ay. \end{align*} [/definition]

This construction turns matrix entries into scalar pairings of vectors.
It motivates the next equivalence, which says that symmetric matrices are exactly the matrices whose associated bilinear forms are symmetric. The quoted result is stated in the more general language of a field $\mathbb{F}$. For this page, take $\mathbb{F}=\mathbb{R}$, so $M_n(\mathbb{F})$ means the set of real $n\times n$ matrices and the associated bilinear form is the real-valued pairing $B_A(x,y)=x^\top Ay$. The general statement is included only to record that the same entrywise test is not special to real numbers. [quotetheorem:425] This characterisation explains why symmetric matrices appear whenever a construction depends on an unordered pair of directions. It is the matrix-level version of a [symmetric bilinear form](/page/Bilinear%20Form). Quadratic forms use the same vector on both sides of a square matrix. This creates an information loss: any skew-symmetric contribution cancels when the two inputs are forced to be equal. To study what remains, we isolate the transpose-invariant component of an arbitrary square matrix. [definition: Symmetric Part of a Matrix] Let $n \in \mathbb{N}$. The symmetric-part map is the function \begin{align*} \operatorname{sym}: \mathbb{R}^{n \times n} \to \operatorname{Sym}_n(\mathbb{R}), \qquad A \mapsto \frac{1}{2}(A+A^\top). \end{align*} [/definition] After defining $\operatorname{sym}(A)$, the essential question is whether this extraction changes the scalar quantity that a quadratic form computes. The skew-symmetric remainder should contribute nothing, but that cancellation must hold for every vector, not just in a special example. This gives the precise justification for replacing $A$ by $\operatorname{sym}(A)$ in quadratic-form problems. [quotetheorem:9902] The theorem does not say that $A$ and $\operatorname{sym}(A)$ act as the same linear map. It says that a single-input quadratic measurement loses the skew-symmetric information. 
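The cancellation of the skew-symmetric information can be checked numerically on a small case. The following is a minimal sketch, not a proof: the helper names `symmetric_part` and `quadratic_form`, the list-of-rows representation, and the sample matrix are all assumptions made for this illustration.

```python
def symmetric_part(a):
    """Return (A + A^T) / 2 for a square matrix given as a list of rows."""
    n = len(a)
    return [[(a[i][j] + a[j][i]) / 2 for j in range(n)] for i in range(n)]


def quadratic_form(a, x):
    """Return the scalar x^T A x."""
    n = len(x)
    return sum(x[i] * a[i][j] * x[j] for i in range(n) for j in range(n))


a = [[2, 5], [-1, 3]]   # hypothetical non-symmetric matrix
s = symmetric_part(a)   # [[2.0, 2.0], [2.0, 3.0]]
x = [1, 2]

# Both matrices give the same quadratic measurement at x ...
print(quadratic_form(a, x), quadratic_form(s, x))  # 22 22.0
# ... but they are different matrices, hence different linear maps.
print(a == s)  # False
```

The mixed coefficients $5$ and $-1$ are replaced by their average $2$ twice, so the total coefficient of $x_1x_2$ stays $4$ in both expressions.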
## Standard Examples

The smallest non-diagonal examples already show the mirrored-entry pattern. A two by two symmetric matrix has two free diagonal entries and one free off-diagonal entry.

[example: A Two by Two Symmetric Matrix] Let $A \in \mathbb{R}^{2 \times 2}$ be given by $A_{11}=2$, $A_{12}=-3$, $A_{21}=-3$, and $A_{22}=5$. The diagonal entries match themselves, and the only off-diagonal condition is \begin{align*} A_{12}=-3=A_{21}. \end{align*} Thus $A_{ij}=A_{ji}$ for all $1 \le i,j \le 2$, so $A$ is symmetric. For $x=(x_1,x_2) \in \mathbb{R}^2$, the product $Ax$ has first coordinate $2x_1-3x_2$ and second coordinate $-3x_1+5x_2$. Hence \begin{align*} x^\top Ax=x_1(2x_1-3x_2)+x_2(-3x_1+5x_2). \end{align*} Expanding the two products gives \begin{align*} x^\top Ax=2x_1^2-3x_1x_2-3x_1x_2+5x_2^2. \end{align*} Combining the two equal mixed terms gives \begin{align*} x^\top Ax=2x_1^2-6x_1x_2+5x_2^2. \end{align*} The coefficient $-6$ appears because the two mirrored entries $A_{12}$ and $A_{21}$ each contribute one copy of $-3x_1x_2$. [/example]

This example shows why off-diagonal entries encode interactions between coordinates. Symmetry forces the interaction from coordinate $i$ to coordinate $j$ to match the interaction from coordinate $j$ to coordinate $i$.

A non-symmetric matrix may still determine the same quadratic form as a symmetric matrix. The next example shows the boundary of what quadratic expressions can detect.

[example: A Non-Symmetric Matrix with the Same Quadratic Form as Its Symmetric Part] Let $A \in \mathbb{R}^{2 \times 2}$ be given by $A_{11}=1$, $A_{12}=4$, $A_{21}=0$, and $A_{22}=3$. Since $A_{12}=4$ while $A_{21}=0$, the matrix $A$ is not symmetric. By the definition of the symmetric part, \begin{align*} \operatorname{sym}(A)_{ij}=\frac{1}{2}(A_{ij}+A_{ji}). \end{align*} Thus \begin{align*} \operatorname{sym}(A)_{11}=\frac{1}{2}(1+1)=1. \end{align*} Also, \begin{align*} \operatorname{sym}(A)_{12}=\frac{1}{2}(4+0)=2.
\end{align*} Similarly, \begin{align*} \operatorname{sym}(A)_{21}=\frac{1}{2}(0+4)=2. \end{align*} Finally, \begin{align*} \operatorname{sym}(A)_{22}=\frac{1}{2}(3+3)=3. \end{align*} For $x=(x_1,x_2) \in \mathbb{R}^2$, the product $Ax$ has first coordinate $x_1+4x_2$ and second coordinate $3x_2$. Therefore \begin{align*} x^\top Ax=x_1(x_1+4x_2)+x_2(3x_2). \end{align*} Expanding the products gives \begin{align*} x^\top Ax=x_1^2+4x_1x_2+3x_2^2. \end{align*} The product $\operatorname{sym}(A)x$ has first coordinate $x_1+2x_2$ and second coordinate $2x_1+3x_2$. Hence \begin{align*} x^\top \operatorname{sym}(A)x=x_1(x_1+2x_2)+x_2(2x_1+3x_2). \end{align*} Expanding gives \begin{align*} x^\top \operatorname{sym}(A)x=x_1^2+2x_1x_2+2x_1x_2+3x_2^2. \end{align*} Combining the equal mixed terms gives \begin{align*} x^\top \operatorname{sym}(A)x=x_1^2+4x_1x_2+3x_2^2. \end{align*} Thus $x^\top Ax=x^\top \operatorname{sym}(A)x$ for every $x \in \mathbb{R}^2$. The quadratic form cannot distinguish $A$ from $\operatorname{sym}(A)$, although the two matrices are different linear maps because, for example, their actions on $(0,1)$ have different first coordinates: $4$ for $A(0,1)$ and $2$ for $\operatorname{sym}(A)(0,1)$. [/example] In multivariable calculus, second-order change depends on pairs of coordinate directions. A single [second derivative](/page/Second%20Derivative) such as $\partial^2 f/\partial x_i^2$ only measures bending along one coordinate axis, while a mixed derivative records how change in one direction varies as another coordinate changes. To compare all these second-order interactions, including the two possible orders for mixed directions, they must be organized in a form where row and column indices remember both directions. This need introduces a new object rather than another example of symmetrization. 
For a twice differentiable scalar function, we want one matrix attached to a point that records every second [partial derivative](/page/Partial%20Derivative) before using any equality of mixed partials. That matrix will later supply the quadratic term in Taylor approximation, so its indices must record exactly which two coordinate directions are being compared. The matrix depends both on the function and on the point: second partial derivatives can vary across the domain, and each entry must record which coordinate direction is differentiated first and which is differentiated second. The next definition packages this second-order data in the same way that a gradient packages first-order data, so that later symmetry and Taylor-expansion statements can refer to a single matrix instead of repeatedly listing all second partial derivatives.

[definition: Hessian Matrix] Let $U \subset \mathbb{R}^n$ be open, and let $f: U \to \mathbb{R}$ satisfy $f \in C^2(U)$. For $a \in U$, the Hessian matrix of $f$ at $a$ is the matrix $Hf_a \in \mathbb{R}^{n \times n}$ with entries \begin{align*} (Hf_a)_{ij}=\frac{\partial^2 f}{\partial x_i\partial x_j}(a). \end{align*} [/definition]

The Hessian measures second-order change of a scalar function. Because $f \in C^2(U)$, the mixed partial derivatives agree, and the symmetry of the Hessian is the matrix form of that equality.

[example: Hessian Matrix of a Quadratic Polynomial] Let $f: \mathbb{R}^2 \to \mathbb{R}$ be given by \begin{align*} f(x_1,x_2)=3x_1^2+4x_1x_2+x_2^2.
\end{align*} We compute the entries of $Hf_x$ from the second partial derivatives. First, \begin{align*} \frac{\partial f}{\partial x_1}(x_1,x_2)=6x_1+4x_2. \end{align*} Differentiating this expression again gives \begin{align*} \frac{\partial^2 f}{\partial x_1^2}(x_1,x_2)=6. \end{align*} Differentiating the same first partial with respect to $x_2$ gives \begin{align*} \frac{\partial^2 f}{\partial x_2\partial x_1}(x_1,x_2)=4. \end{align*} Similarly, \begin{align*} \frac{\partial f}{\partial x_2}(x_1,x_2)=4x_1+2x_2. \end{align*} Therefore \begin{align*} \frac{\partial^2 f}{\partial x_1\partial x_2}(x_1,x_2)=4. \end{align*} Also, \begin{align*} \frac{\partial^2 f}{\partial x_2^2}(x_1,x_2)=2. \end{align*} Thus, for every $x \in \mathbb{R}^2$, the Hessian has entries \begin{align*} (Hf_x)_{11}=6,\quad (Hf_x)_{12}=4,\quad (Hf_x)_{21}=4,\quad (Hf_x)_{22}=2. \end{align*} Since $(Hf_x)_{12}=4=(Hf_x)_{21}$ and the diagonal entries match themselves, $Hf_x$ is symmetric. For an increment $h=(h_1,h_2)$, this Hessian gives the quadratic Taylor term \begin{align*} \frac{1}{2}h^\top Hf_xh=\frac{1}{2}\bigl(h_1(6h_1+4h_2)+h_2(4h_1+2h_2)\bigr). \end{align*} Expanding the products gives \begin{align*} \frac{1}{2}h^\top Hf_xh=\frac{1}{2}(6h_1^2+4h_1h_2+4h_1h_2+2h_2^2). \end{align*} Combining the mixed terms gives \begin{align*} \frac{1}{2}h^\top Hf_xh=3h_1^2+4h_1h_2+h_2^2. \end{align*} So the second-order part of this polynomial is encoded by the symmetric matrix with entries $6$, $4$, $4$, and $2$. [/example] For a random vector, the variability of one component is not enough to describe its joint behavior. One also needs to measure how each pair of components varies together, and those pairwise measurements naturally form a square array; here $\mathcal B(\mathbb{R}^n)$ denotes the Borel sigma-algebra on $\mathbb{R}^n$. 
[definition: Covariance Matrix] Let $(\Omega, \mathcal F, \mathbb P)$ be a [probability space](/page/Probability%20Space), and let $X: (\Omega, \mathcal F) \to (\mathbb{R}^n, \mathcal B(\mathbb{R}^n))$ be a random vector with components $X=(X_1,\ldots,X_n)$ satisfying $\mathbb{E}[X_i^2]<\infty$ for each $1 \le i \le n$. The covariance matrix of $X$ is the matrix $\Sigma \in \mathbb{R}^{n \times n}$ with entries \begin{align*} \Sigma_{ij}=\operatorname{Cov}(X_i,X_j). \end{align*} [/definition] Here $\operatorname{Cov}(X_i,X_j)=\mathbb{E}[(X_i-\mathbb{E}[X_i])(X_j-\mathbb{E}[X_j])]$. Since multiplication of real numbers commutes, covariance matrices are symmetric. [example: Covariance Matrix with Correlation] Let $X=(X_1,X_2)$ satisfy $\operatorname{Var}(X_1)=4$, $\operatorname{Var}(X_2)=9$, and $\operatorname{Cov}(X_1,X_2)=6$. Since covariance is symmetric, \begin{align*} \operatorname{Cov}(X_2,X_1)=\operatorname{Cov}(X_1,X_2)=6. \end{align*} Thus the covariance matrix $\Sigma$ has entries $\Sigma_{11}=4$, $\Sigma_{12}=6$, $\Sigma_{21}=6$, and $\Sigma_{22}=9$, so $\Sigma$ is symmetric. For $v=(v_1,v_2)\in\mathbb{R}^2$, the product $\Sigma v$ has first coordinate $4v_1+6v_2$ and second coordinate $6v_1+9v_2$. Therefore \begin{align*} v^\top \Sigma v=v_1(4v_1+6v_2)+v_2(6v_1+9v_2). \end{align*} Expanding the products gives \begin{align*} v^\top \Sigma v=4v_1^2+6v_1v_2+6v_1v_2+9v_2^2. \end{align*} Combining the mixed terms gives \begin{align*} v^\top \Sigma v=4v_1^2+12v_1v_2+9v_2^2. \end{align*} Factoring the square gives \begin{align*} v^\top \Sigma v=(2v_1+3v_2)^2. \end{align*} Hence $v^\top \Sigma v\ge 0$ for every direction $v\in\mathbb{R}^2$. This same expression is the variance of $v_1X_1+v_2X_2$: by bilinearity of covariance, \begin{align*} \operatorname{Var}(v_1X_1+v_2X_2)=v_1^2\operatorname{Var}(X_1)+v_1v_2\operatorname{Cov}(X_1,X_2)+v_2v_1\operatorname{Cov}(X_2,X_1)+v_2^2\operatorname{Var}(X_2). 
\end{align*} Substituting the given values yields \begin{align*} \operatorname{Var}(v_1X_1+v_2X_2)=4v_1^2+6v_1v_2+6v_2v_1+9v_2^2. \end{align*} Since $v_2v_1=v_1v_2$, this becomes \begin{align*} \operatorname{Var}(v_1X_1+v_2X_2)=4v_1^2+12v_1v_2+9v_2^2=(2v_1+3v_2)^2. \end{align*} The covariance matrix is symmetric because the two cross-covariances agree, and its quadratic form records the variance of each linear combination of the two random variables. [/example]

These examples show several origins of the same condition $A^\top=A$. It can come from prescribed matrix entries, second derivatives, or second moments.

## Properties

The first structural fact is closure under the operations of a vector space. This matters because symmetric matrices are often added, scaled, and decomposed in computations.

[quotetheorem:9903]

Sums and scalar multiples of symmetric matrices remain symmetric. Products require more care because multiplication may disturb the [symmetry condition](/theorems/1360).

[example: Product of Symmetric Matrices Need Not Be Symmetric] Let $A,B \in \mathbb{R}^{2 \times 2}$ be defined by $A_{11}=1$, $A_{12}=0$, $A_{21}=0$, $A_{22}=2$, and $B_{11}=B_{12}=B_{21}=B_{22}=1$. The symmetry conditions for $A$ and $B$ are \begin{align*} A_{12}=0=A_{21} \end{align*} and \begin{align*} B_{12}=1=B_{21}. \end{align*} The diagonal entries match themselves, so both $A$ and $B$ are symmetric. Now compute the entries of $AB$ from the row-column formula $(AB)_{ij}=A_{i1}B_{1j}+A_{i2}B_{2j}$. The $(1,1)$ entry is \begin{align*} (AB)_{11}=A_{11}B_{11}+A_{12}B_{21}=1\cdot 1+0\cdot 1=1. \end{align*} The $(1,2)$ entry is \begin{align*} (AB)_{12}=A_{11}B_{12}+A_{12}B_{22}=1\cdot 1+0\cdot 1=1. \end{align*} The $(2,1)$ entry is \begin{align*} (AB)_{21}=A_{21}B_{11}+A_{22}B_{21}=0\cdot 1+2\cdot 1=2. \end{align*} The $(2,2)$ entry is \begin{align*} (AB)_{22}=A_{21}B_{12}+A_{22}B_{22}=0\cdot 1+2\cdot 1=2. \end{align*} Thus $(AB)_{12}=1$ while $(AB)_{21}=2$, so $(AB)_{12}\ne (AB)_{21}$.
Therefore $AB$ is not symmetric, even though both factors are symmetric. [/example] The next basic invariant is dimension. Counting degrees of freedom explains how much smaller $\operatorname{Sym}_n(\mathbb{R})$ is than the full space of square matrices, and it gives a useful coordinate count for symmetric bilinear forms. The theorem below is phrased over an arbitrary field $\mathbb{F}$, with $S_n$ denoting the space of symmetric $n\times n$ matrices. In the present note, the intended specialization is $\mathbb{F}=\mathbb{R}$ and $S_n=\operatorname{Sym}_n(\mathbb{R})$; the field language only says that the same independent-entry count works whenever the entries support the usual matrix addition and scalar multiplication. [quotetheorem:443] The dimension count shows that symmetry is a strong constraint, but it does not explain why symmetric matrices are so useful geometrically. The deeper issue is whether the linear map represented by a symmetric matrix can be understood along perpendicular coordinate directions, so that the matrix acts by independent scalings after an orthogonal [change of basis](/page/Change%20Of%20Basis). This is the structural question behind the spectral theorem: can every real symmetric matrix be reduced, by an orthonormal choice of coordinates, to independent one-dimensional actions? The quoted theorem answers this in its standard complex form. A Hermitian matrix is a complex square matrix satisfying $A=A^\dagger$, where $A^\dagger$ is the conjugate transpose; a unitary matrix $P$ satisfies $P^\dagger P=I$ and represents an orthonormal change of coordinates in $\mathbb{C}^n$. A real symmetric matrix is the real special case: conjugation does nothing, so $A^\dagger=A^\top$, and the unitary diagonalization restricts to the familiar orthogonal diagonalization over $\mathbb{R}$. [quotetheorem:925] For a real symmetric matrix, this means there is an orthonormal basis of eigenvectors. 
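As a small illustration, with a matrix chosen only for this sketch rather than taken from the quoted theorem, let $A \in \mathbb{R}^{2 \times 2}$ have entries $A_{11}=2$, $A_{12}=1$, $A_{21}=1$, and $A_{22}=2$. The unit vectors $u=\frac{1}{\sqrt{2}}(1,1)$ and $w=\frac{1}{\sqrt{2}}(1,-1)$ satisfy \begin{align*} Au=3u, \qquad Aw=w, \qquad u^\top w=\frac{1}{2}\bigl(1\cdot 1+1\cdot(-1)\bigr)=0, \end{align*} so they form an orthonormal basis of eigenvectors with eigenvalues $3$ and $1$. Writing $x=y_1u+y_2w$ and using $u^\top Aw=0$, the quadratic form becomes \begin{align*} x^\top Ax=3y_1^2+y_2^2. \end{align*}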
In that basis the matrix is diagonal, so applying the matrix simply multiplies each coordinate direction by its corresponding eigenvalue. The theorem is therefore not just a way to simplify entries; it explains why quadratic forms, ellipsoids, and second-derivative tests can be analyzed one perpendicular axis at a time. Before using a full orthogonal diagonalization, one often needs a more basic guarantee: the scalars that appear as eigenvalues should not leave the real number system. For a general real matrix complex eigenvalues can occur, so symmetry must be doing real work if every spectral scaling factor stays real. [quotetheorem:3279] Real eigenvalues alone do not give an orthogonal coordinate system. The missing geometric question is how eigenspaces for different eigenvalues sit relative to one another; for symmetric matrices, distinct spectral behaviors are forced into perpendicular directions. [quotetheorem:3280] This orthogonality is the geometric mechanism behind diagonalization. If two eigenvalues describe different scaling factors, their eigenspaces do not interfere with each other in the inner product: vectors from the two spaces meet at right angles. After choosing orthonormal bases inside each eigenspace, these perpendicular pieces assemble into the coordinate system used by the spectral theorem. Positive definiteness is the main inequality condition imposed on symmetric matrices. Symmetry guarantees that the quadratic expression $x^\top Ax$ has coherent geometry, but it does not guarantee that the expression behaves like a squared length. Many applications need a quadratic measurement that never vanishes except at the origin: an energy should not assign negative cost to a nonzero displacement, and a second-order approximation should bend upward in every direction at a strict local minimum. Positive definiteness isolates exactly this stronger situation. [definition: Positive Definite Matrix] Let $A \in \operatorname{Sym}_n(\mathbb{R})$. 
The matrix $A$ is positive definite if \begin{align*} x^\top Ax>0 \end{align*} for every $x \in \mathbb{R}^n \setminus \{0\}$. [/definition]

The definition of positive definiteness quantifies over every nonzero vector, which is usually impossible to check directly. Spectral diagonalization removes the obstruction for symmetric matrices: after an orthonormal change of coordinates, the quadratic form separates into independent squared-coordinate terms. In that diagonal form, the only way the expression can stay positive in every nonzero direction is for each diagonal coefficient, equivalently each eigenvalue, to be positive. The quoted theorem records this criterion precisely: it replaces the infinite family of inequalities $x^\top Ax>0$ with a sign condition on the finitely many eigenvalues, turning a geometric condition on all vectors into a computable algebraic one.

[quotetheorem:3289]

This result is one reason symmetric matrices are central in optimization, stability, and second derivative tests. It also explains why positive definite matrices are usually defined only inside the symmetric or Hermitian setting.

## Relationship to Other Concepts

A symmetric matrix is a [matrix](/page/Matrix), but it is best understood as a meeting point between several structures.
As a representative of a [linear map](/page/Linear%20Map), it is the finite-dimensional Euclidean form of a self-adjoint operator after an orthonormal basis has been chosen. Over a complex inner product space, the corresponding matrix condition is Hermitian symmetry, $A^*=A$, where $A^*$ is the conjugate transpose. Thus real symmetric matrices are the real special case of the broader self-adjoint picture. As a coefficient array for a [bilinear form](/page/Bilinear%20Form), it represents a symmetric pairing of vectors. As the coefficient matrix of a quadratic form, it records second-order scalar data. The connection with orthogonal diagonalization places symmetric matrices next to the spectral theorem. That theorem says that real symmetry is not merely an entrywise pattern; it is the condition that makes eigenvector geometry orthonormal and complete. In calculus, symmetric matrices appear as Hessian matrices when second partial derivatives are continuous. The [second derivative test](/theorems/8607) classifies critical points by the eigenvalues or definiteness of the Hessian. In probability and statistics, covariance matrices are symmetric because covariance is symmetric in its two arguments. Their positive semidefinite property says that every linear combination of the random variables has nonnegative variance. The final basic relationship is that symmetric matrices sit beside skew-symmetric matrices as complementary parts of every real square matrix. This is needed here because quadratic forms do not see all of a matrix: replacing $A$ by $\frac12(A+A^\top)$ leaves $x^\top Ax$ unchanged, while the skew-symmetric part contributes zero on every diagonal input $x,x$. The decomposition theorem packages that observation into a unique splitting of an arbitrary matrix into the part controlled by this page and the alternating part that must be handled separately. 
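The splitting can be sketched computationally on the non-symmetric example matrix used earlier on this page. The function name `split_symmetric_skew` and the list-of-rows representation are assumptions made for this illustration; the formulas $\frac12(A+A^\top)$ and $\frac12(A-A^\top)$ are the standard ones over the real numbers.

```python
def split_symmetric_skew(a):
    """Return (S, K) with S = (A + A^T)/2 and K = (A - A^T)/2, so that A = S + K."""
    n = len(a)
    s = [[(a[i][j] + a[j][i]) / 2 for j in range(n)] for i in range(n)]
    k = [[(a[i][j] - a[j][i]) / 2 for j in range(n)] for i in range(n)]
    return s, k


a = [[1, 4], [0, 3]]
s, k = split_symmetric_skew(a)
print(s)  # [[1.0, 2.0], [2.0, 3.0]]
print(k)  # [[0.0, 2.0], [-2.0, 0.0]]

# The skew-symmetric part contributes nothing to x^T A x.
x = [3, -1]
print(sum(x[i] * k[i][j] * x[j] for i in range(2) for j in range(2)))  # 0.0
```

Adding the two printed matrices entry by entry recovers `a`, and the zero diagonal of `k` is forced by the skew-symmetry condition.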
In the quoted statement, $M_n(\mathbb{F})$ denotes all $n\times n$ matrices over a field $\mathbb{F}$, $S_n$ denotes the symmetric ones, and $\mathcal{A}_n$ denotes the skew-symmetric ones. The hypothesis that the characteristic is not $2$ guarantees that division by $2$ is allowed; for the real matrices in this note, that condition is automatic.

[quotetheorem:442]

This decomposition separates the part visible to quadratic forms from the part invisible to them. The condition on the characteristic is essential: the displayed formulas for the two pieces require dividing by $2$, so the statement is not just a formal rearrangement in every field. Over the real numbers, the theorem explains why many questions about expressions such as $x^\top Ax$ can be reduced to the symmetric part of $A$, while questions involving commutators, rotations, or infinitesimal motion may depend on the skew-symmetric part. Thus the decomposition marks both the power and the limitation of symmetric matrices: they control the self-adjoint and quadratic behavior, but they do not by themselves describe every feature of a general linear transformation.

## Beyond and Connections

Symmetric matrices are a meeting point for several later topics. Their quadratic forms lead naturally to positive definiteness, optimization, conic sections, and energy methods, where the sign of $x^\top Ax$ carries geometric or analytic information. Their orthogonal diagonalization connects them to spectral theory, principal axes, and the classification of real inner-product operators. In numerical linear algebra, the symmetry condition is also a structural advantage: eigenvalues are real, eigenspaces behave cleanly, and many algorithms can exploit the absence of complex rotational behavior. The complementary skew-symmetric part points in a different direction. It is the linear-algebraic shadow of infinitesimal rotations, Lie brackets, and alternating bilinear forms.
Keeping both pieces in view helps explain why symmetric matrices are central but not universal: they describe the self-adjoint part of a linear map, while general matrix theory also needs orthogonal matrices, change of basis, normal operators, and other structures that measure behavior not visible to quadratic forms alone.

## References

[Matrix](/page/Matrix). [Linear Map](/page/Linear%20Map). [Inner Product Space](/page/Inner%20Product%20Space). Quadratic Form. [Orthogonal Matrix](/page/Orthogonal%20Matrix). Sheldon Axler, *Linear Algebra Done Right* (2015). Roger A. Horn and Charles R. Johnson, *Matrix Analysis* (2013). Gilbert Strang, *Linear Algebra and Its Applications* (2006).

Created by admin on 6/22/2026 | Last updated on 6/22/2026
