Gram Matrix Characterisation of Sums of Squares — Statement & Proof

Gram Matrix Characterisation of Sums of Squares (Theorem # 6712)

Theorem

Edit Issues Pull Requests Attributions Admin

Discussion

Proof

[proofplan] We prove both directions by translating between sums of polynomial squares and Gram matrices with respect to the monomial basis of $\mathbb{R}[x_1,\dots,x_n]_{\leq d}$. A positive semidefinite matrix is first realised as a Gram matrix of finitely many vectors, which turns the quadratic form $z_d(x)^\top Qz_d(x)$ into a sum of squares of linear combinations of the monomials in $z_d$. Conversely, a sum of squares gives coefficient vectors for the summand polynomials, and summing their rank-one Gram matrices produces the required positive semidefinite matrix. [/proofplan] [step:Fix the coefficient coordinates for polynomials of degree at most $d$] Let \begin{align*} A_d := \{\alpha \in \mathbb{N}_0^n : |\alpha| \leq d\}. \end{align*} Choose the ordering $A_d = \{\alpha_1,\dots,\alpha_m\}$ so that $z_d(x) = (x^{\alpha_1},\dots,x^{\alpha_m})^\top$ for every $x \in \mathbb{R}^n$. For each $a = (a_1,\dots,a_m)^\top \in \mathbb{R}^m$, define the polynomial map $q_a: \mathbb{R}^n \to \mathbb{R}$ by \begin{align*} q_a(x) := a^\top z_d(x) = \sum_{i=1}^m a_i x^{\alpha_i}. \end{align*} This gives the usual coefficient-coordinate identification between $\mathbb{R}^m$ and $\mathbb{R}[x_1,\dots,x_n]_{\leq d}$. [/step] [step:Convert a positive semidefinite Gram matrix into a sum of squares] Assume that $Q \in \mathbb{R}^{m \times m}$ is real symmetric positive semidefinite and that \begin{align*} p(x) = z_d(x)^\top Q z_d(x) \end{align*} for every $x \in \mathbb{R}^n$. We first realise $Q$ as a Gram matrix. Define a symmetric [bilinear form](/page/Bilinear%20Form) $B: \mathbb{R}^m \times \mathbb{R}^m \to \mathbb{R}$ by \begin{align*} B(u,v) := u^\top Qv. \end{align*} Since $Q$ is positive semidefinite, $B(u,u) \geq 0$ for every $u \in \mathbb{R}^m$. Let \begin{align*} N := \{u \in \mathbb{R}^m : B(u,u)=0\}. \end{align*} If $u \in N$ and $v \in \mathbb{R}^m$, then the polynomial $t \mapsto B(v+tu,v+tu)$ is nonnegative for every $t \in \mathbb{R}$. Since $B(u,u)=0$, this polynomial equals $B(v,v)+2tB(u,v)$, so nonnegativity for both positive and negative $t$ forces $B(u,v)=0$. Hence $B$ descends to an [inner product](/page/Inner%20Product) on the quotient [vector space](/page/Vector%20Space) $\mathbb{R}^m/N$. Choose an [orthonormal basis](/page/Orthonormal%20Basis) $e_1,\dots,e_r$ of $\mathbb{R}^m/N$ for this inner product. For $i=1,\dots,m$, let $b_i \in \mathbb{R}^m$ denote the $i$th standard basis vector, let $\pi_i \in \mathbb{R}^m/N$ denote the class of $b_i$, and define vectors $v_i \in \mathbb{R}^r$ by \begin{align*} v_i := ((\pi_i,e_1),\dots,(\pi_i,e_r))^\top. \end{align*} Then \begin{align*} Q_{ij} = B(b_i,b_j) = (\pi_i,\pi_j) = v_i^\top v_j \end{align*} for all $1 \leq i,j \leq m$. For each $\ell \in \{1,\dots,r\}$, define the polynomial map $s_\ell: \mathbb{R}^n \to \mathbb{R}$ by \begin{align*} s_\ell(x) := \sum_{i=1}^m (v_i)_\ell x^{\alpha_i}. \end{align*} Then, for every $x \in \mathbb{R}^n$, expanding the quadratic form gives \begin{align*} p(x) = \sum_{i=1}^m \sum_{j=1}^m Q_{ij}x^{\alpha_i}x^{\alpha_j}. \end{align*} Substituting $Q_{ij}=\sum_{\ell=1}^r (v_i)_\ell (v_j)_\ell$ gives \begin{align*} p(x) = \sum_{i=1}^m \sum_{j=1}^m \sum_{\ell=1}^r (v_i)_\ell (v_j)_\ell x^{\alpha_i}x^{\alpha_j}. \end{align*} Regrouping the finite sums gives \begin{align*} p(x) = \sum_{\ell=1}^r s_\ell(x)^2. \end{align*} Thus $p$ is a sum of squares in $\mathbb{R}[x_1,\dots,x_n]$. [guided] The point of this direction is to turn the positive semidefinite matrix $Q$ into actual square summands. A positive semidefinite matrix behaves like a Gram matrix, and we now construct that Gram representation explicitly. Define the symmetric bilinear map $B: \mathbb{R}^m \times \mathbb{R}^m \to \mathbb{R}$ by \begin{align*} B(u,v) := u^\top Qv. \end{align*} Because $Q$ is symmetric, $B$ is symmetric and bilinear. Because $Q$ is positive semidefinite, $B(u,u) \geq 0$ for every $u \in \mathbb{R}^m$. The only possible obstruction to being an inner product is that some nonzero vectors may have zero length. We remove precisely those vectors by defining \begin{align*} N := \{u \in \mathbb{R}^m : B(u,u)=0\}. \end{align*} We now prove the null-space property needed for the quotient. Fix $u \in N$ and $v \in \mathbb{R}^m$. For every $t \in \mathbb{R}$, positive semidefiniteness gives \begin{align*} 0 \leq B(v+tu,v+tu)=B(v,v)+2tB(u,v)+t^2B(u,u)=B(v,v)+2tB(u,v). \end{align*} If $B(u,v)>0$, then taking $t< -B(v,v)/(2B(u,v))$ contradicts this inequality. If $B(u,v)<0$, then taking $t> -B(v,v)/(2B(u,v))$ contradicts it. Hence $B(u,v)=0$. Therefore $B$ descends to a genuine inner product on the quotient space $\mathbb{R}^m/N$. Choose an orthonormal basis $e_1,\dots,e_r$ of this quotient [inner product space](/page/Inner%20Product%20Space). Let $b_i \in \mathbb{R}^m$ be the $i$th standard basis vector, and let $\pi_i$ be the class of $b_i$ in $\mathbb{R}^m/N$. Define \begin{align*} v_i := ((\pi_i,e_1),\dots,(\pi_i,e_r))^\top \in \mathbb{R}^r. \end{align*} Since $e_1,\dots,e_r$ is orthonormal, the inner product of $\pi_i$ and $\pi_j$ is the Euclidean dot product of their coordinate vectors: \begin{align*} (\pi_i,\pi_j) = v_i^\top v_j. \end{align*} But the quotient inner product was induced by $B$, so \begin{align*} (\pi_i,\pi_j) = B(b_i,b_j) = b_i^\top Qb_j = Q_{ij}. \end{align*} Thus $Q_{ij}=v_i^\top v_j$ for all $i,j$. Now define, for each $\ell \in \{1,\dots,r\}$, the polynomial map $s_\ell: \mathbb{R}^n \to \mathbb{R}$ by \begin{align*} s_\ell(x) := \sum_{i=1}^m (v_i)_\ell x^{\alpha_i}. \end{align*} Each $s_\ell$ has degree at most $d$, because every monomial $x^{\alpha_i}$ appearing in $z_d$ has $|\alpha_i| \leq d$. Expanding the assumed Gram representation gives \begin{align*} p(x) = z_d(x)^\top Qz_d(x). \end{align*} Therefore \begin{align*} p(x) = \sum_{i=1}^m \sum_{j=1}^m Q_{ij}x^{\alpha_i}x^{\alpha_j}. \end{align*} Substituting $Q_{ij}=\sum_{\ell=1}^r (v_i)_\ell(v_j)_\ell$ and regrouping the finite sums yields \begin{align*} p(x) = \sum_{\ell=1}^r \left(\sum_{i=1}^m (v_i)_\ell x^{\alpha_i}\right)^2. \end{align*} By the definition of $s_\ell$, this is \begin{align*} p(x) = \sum_{\ell=1}^r s_\ell(x)^2. \end{align*} This is exactly a finite sum of squares of real polynomials, so $p$ is a sum of squares. [/guided] [/step] [step:Show that square summands have degree at most $d$] Assume conversely that $p$ is a finite sum of squares, so there exist a positive integer $L$ and polynomials $q_1,\dots,q_L \in \mathbb{R}[x_1,\dots,x_n]$ such that \begin{align*} p = \sum_{k=1}^L q_k^2. \end{align*} We claim that each $q_k$ has degree at most $d$. If some $q_k$ had degree $e>d$, let $e$ be the maximum degree among the nonzero polynomials $q_1,\dots,q_L$, and let $h_k$ denote the homogeneous degree-$e$ part of $q_k$, with $h_k=0$ when $\deg q_k<e$. The homogeneous degree-$2e$ part of $\sum_{k=1}^L q_k^2$ is \begin{align*} \sum_{k=1}^L h_k^2. \end{align*} At least one $h_k$ is nonzero. A nonzero real polynomial cannot have a square-sum leading form identically equal to zero, because evaluating at a point where one nonzero $h_k$ is nonzero gives a strictly positive value after choosing a point outside the common zero set of the finitely many nonzero homogeneous polynomials. Hence the degree-$2e$ part of $p$ is nonzero, so $\deg p=2e>2d$, contradicting $p \in \mathbb{R}[x_1,\dots,x_n]_{\leq 2d}$. Therefore $q_k \in \mathbb{R}[x_1,\dots,x_n]_{\leq d}$ for every $k$. [/step] [step:Convert a sum of squares into a positive semidefinite Gram matrix] Since each $q_k$ has degree at most $d$, there is a unique coefficient vector $a_k=(a_{k1},\dots,a_{km})^\top \in \mathbb{R}^m$ such that \begin{align*} q_k(x)=a_k^\top z_d(x) \end{align*} for every $x \in \mathbb{R}^n$. Define \begin{align*} Q := \sum_{k=1}^L a_k a_k^\top \in \mathbb{R}^{m \times m}. \end{align*} The matrix $Q$ is symmetric because each rank-one matrix $a_k a_k^\top$ is symmetric. For every $u \in \mathbb{R}^m$, expanding the finite sum defining $Q$ gives \begin{align*} u^\top Q u = \sum_{k=1}^L u^\top a_k a_k^\top u. \end{align*} Since $u^\top a_k a_k^\top u = (a_k^\top u)^2$ for each $k$, we obtain \begin{align*} u^\top Q u = \sum_{k=1}^L (a_k^\top u)^2 \geq 0. \end{align*} Thus $Q$ is positive semidefinite. Finally, for every $x \in \mathbb{R}^n$, expanding $Q$ gives \begin{align*} z_d(x)^\top Q z_d(x) = \sum_{k=1}^L z_d(x)^\top a_k a_k^\top z_d(x). \end{align*} Since $z_d(x)^\top a_k a_k^\top z_d(x) = (a_k^\top z_d(x))^2$, this becomes \begin{align*} z_d(x)^\top Q z_d(x) = \sum_{k=1}^L (a_k^\top z_d(x))^2. \end{align*} Using $q_k(x)=a_k^\top z_d(x)$ and the assumed sum-of-squares representation of $p$, we get \begin{align*} z_d(x)^\top Q z_d(x) = p(x). \end{align*} Thus a positive semidefinite Gram matrix $Q$ with the required representation exists. [/step] [step:Conclude the equivalence] The second step proves that every positive semidefinite Gram representation \begin{align*} p(x)=z_d(x)^\top Qz_d(x) \end{align*} produces a sum-of-squares representation of $p$. The third and fourth steps prove that every sum-of-squares representation of a polynomial $p \in \mathbb{R}[x_1,\dots,x_n]_{\leq 2d}$ produces a real symmetric positive semidefinite matrix $Q$ satisfying the same identity for all $x \in \mathbb{R}^n$. Therefore $p$ is a sum of squares if and only if such a matrix $Q$ exists. [/step]

Prerequisites (0/2 completed)

Prerequisites Graph

Interactive dependency map showing how this theorem builds on foundational concepts

Loading dependency graph...

Definitions & Concepts

Explore Further

Orthonormal Basis Definition Inner Product Definition Marsden-Weinstein Reduction for an Invariant Hamiltonian applied Polynomial Size Bound for Local Gadget Constructions applied Moreau-Rockafellar Subdifferential Sum Rule applied Momentum Map Conservation Theorem applied Closure of Polynomial-Time Computable Functions Under Composition applied NP-Completeness of PARTITION applied Uniqueness of Minimal Realizations up to Similarity applied Polar Representation of the Minkowski Functional applied

What brings you to Androma?

Start with a route through the knowledge graph.

Gram Matrix Characterisation of Sums of Squares (Theorem # 6712)

Discussion

Proof

Prerequisites (0/2 completed)

Prerequisites Graph

Explore Further

Sign in to Androma

Check your inbox

One last step

Gram Matrix Characterisation of Sums of Squares (Theorem # 6712)

Discussion

Proof

Prerequisites (0/2 completed)

Prerequisites Graph

Explore Further