Exact Equivalence of Classical and Modified Gram-Schmidt (Theorem # 7969)

Proof

[proofplan]
We prove the equality of the two output lists by induction on the index $k$. The induction hypothesis says not only that the previously produced orthonormal vectors agree, but also that they span the same subspace as the previously processed input vectors. At the $k$th stage, the modified algorithm subtracts the same [orthogonal projection](/theorems/437) components as the classical formula, one component at a time. [Linear independence](/page/Linear%20Independence) guarantees that the common residual is nonzero, so the shared normalization rule gives the same normalized vector in both algorithms.
[/proofplan]

[step:Initialize the first residual in both algorithms]
For $k=1$, both algorithms have no previous orthonormal vectors. Thus the empty sum in the classical formula is zero, and the modified algorithm starts with $w_{1,0}=v_1$. Hence
\begin{align*} u_{C,1}=v_1=u_{M,1}. \end{align*}
Because $(v_1,\ldots,v_m)$ is linearly independent, $v_1\neq 0$. Therefore $u_{C,1}$ and $u_{M,1}$ are nonzero, and both algorithms normalize by the same positive scalar $|v_1|$. Consequently
\begin{align*} e_{C,1}=\frac{v_1}{|v_1|}=e_{M,1}. \end{align*}
The list $(e_{C,1})$ is orthonormal, and
\begin{align*} \operatorname{span}\{e_{C,1}\}=\operatorname{span}\{v_1\}. \end{align*}
[/step]

[step:Assume the previous Gram-Schmidt vectors agree and span the previous inputs]
Fix an index $k$ with $2\le k\le m$. Assume that, for every $1\le i<k$, the vectors produced by the two algorithms agree:
\begin{align*} e_{C,i}=e_{M,i}. \end{align*}
Let $e_i$ denote this common vector for $1\le i<k$. Assume also that $(e_1,\ldots,e_{k-1})$ is orthonormal and that
\begin{align*} \operatorname{span}\{e_1,\ldots,e_{k-1}\}=\operatorname{span}\{v_1,\ldots,v_{k-1}\}. \end{align*}
We prove the same conclusions at index $k$.
[/step]

[step:Compute the modified working vector after each projection]
We claim that, for every integer $j$ with $0\le j\le k-1$, the modified working vector satisfies
\begin{align*} w_{k,j}=v_k-\sum_{i=1}^{j}(v_k,e_i)_V e_i. \end{align*}
For $j=0$, this is the definition $w_{k,0}=v_k$, with the empty sum equal to zero. Now let $j$ satisfy $1\le j\le k-1$, and assume the formula holds for $j-1$. Since $(e_1,\ldots,e_{k-1})$ is orthonormal, we have $(e_i,e_j)_V=0$ for $i<j$ and $(e_j,e_j)_V=1$. Using linearity of the [inner product](/page/Inner%20Product) in the first argument, the coefficient subtracted by the modified algorithm is
\begin{align*} (w_{k,j-1},e_j)_V=(v_k,e_j)_V. \end{align*}
Indeed, the terms involving $e_i$ with $i<j$ vanish by orthogonality. Substituting this coefficient into the modified update gives
\begin{align*} w_{k,j}=w_{k,j-1}-(v_k,e_j)_V e_j. \end{align*}
Using the induction formula for $w_{k,j-1}$ yields
\begin{align*} w_{k,j}=v_k-\sum_{i=1}^{j}(v_k,e_i)_V e_i. \end{align*}
Thus the formula holds for every $0\le j\le k-1$ by finite induction.

[guided]
The purpose of this step is to show that the modified algorithm has not changed the mathematical residual; it has only changed the order in which the projection components are removed. We prove the precise formula
\begin{align*} w_{k,j}=v_k-\sum_{i=1}^{j}(v_k,e_i)_V e_i \end{align*}
for each $0\le j\le k-1$. When $j=0$, the statement says $w_{k,0}=v_k$, which is exactly the definition of the initial modified working vector.

Now suppose the formula has been proved for some $j-1$, where $1\le j\le k-1$. The modified algorithm computes the next coefficient from the current working vector:
\begin{align*} (w_{k,j-1},e_j)_V. \end{align*}
By the induction formula,
\begin{align*} w_{k,j-1}=v_k-\sum_{i=1}^{j-1}(v_k,e_i)_V e_i. \end{align*}
Taking the inner product with $e_j$ and using linearity in the first argument gives
\begin{align*} (w_{k,j-1},e_j)_V=(v_k,e_j)_V-\sum_{i=1}^{j-1}(v_k,e_i)_V(e_i,e_j)_V. \end{align*}
Because $(e_1,\ldots,e_{k-1})$ is orthonormal, every factor $(e_i,e_j)_V$ with $i<j$ is zero. Therefore
\begin{align*} (w_{k,j-1},e_j)_V=(v_k,e_j)_V. \end{align*}
This is the key point: although modified Gram-Schmidt computes the coefficient from the updated vector $w_{k,j-1}$, exact orthogonality of the previously removed directions makes that coefficient equal to the classical coefficient.

Substituting this equality into the modified update,
\begin{align*} w_{k,j}=w_{k,j-1}-(w_{k,j-1},e_j)_V e_j, \end{align*}
we obtain
\begin{align*} w_{k,j}=w_{k,j-1}-(v_k,e_j)_V e_j. \end{align*}
Finally, replacing $w_{k,j-1}$ by its induction formula gives
\begin{align*} w_{k,j}=v_k-\sum_{i=1}^{j-1}(v_k,e_i)_V e_i-(v_k,e_j)_V e_j. \end{align*}
Combining the two displayed subtraction terms into a single finite sum gives
\begin{align*} w_{k,j}=v_k-\sum_{i=1}^{j}(v_k,e_i)_V e_i. \end{align*}
This completes the induction over $j$.
[/guided]
[/step]

[step:Identify the classical and modified residuals]
Taking $j=k-1$ in the formula from the previous step gives
\begin{align*} u_{M,k}=w_{k,k-1}=v_k-\sum_{i=1}^{k-1}(v_k,e_i)_V e_i. \end{align*}
Since $e_i=e_{C,i}$ for every $1\le i<k$, the classical residual is
\begin{align*} u_{C,k}=v_k-\sum_{i=1}^{k-1}(v_k,e_i)_V e_i. \end{align*}
Therefore
\begin{align*} u_{C,k}=u_{M,k}. \end{align*}
Let $u_k$ denote this common residual.
[/step]

[step:Use linear independence to rule out a zero residual]
Suppose, for contradiction, that $u_k=0$. From the residual formula,
\begin{align*} v_k=\sum_{i=1}^{k-1}(v_k,e_i)_V e_i. \end{align*}
Thus $v_k\in \operatorname{span}\{e_1,\ldots,e_{k-1}\}$. By the induction hypothesis on spans,
\begin{align*} v_k\in \operatorname{span}\{v_1,\ldots,v_{k-1}\}. \end{align*}
This contradicts the linear independence of $(v_1,\ldots,v_m)$. Hence $u_k\neq 0$.
[/step]

[step:Normalize the common residual and update the induction]
Since $u_{C,k}=u_{M,k}=u_k$ and $u_k\neq 0$, both algorithms divide by the same positive scalar $|u_k|$. Hence
\begin{align*} e_{C,k}=\frac{u_k}{|u_k|}=e_{M,k}. \end{align*}
Moreover, $u_k$ is orthogonal to each $e_j$ with $1\le j<k$, because
\begin{align*} (u_k,e_j)_V=(v_k,e_j)_V-\sum_{i=1}^{k-1}(v_k,e_i)_V(e_i,e_j)_V=0. \end{align*}
Also $|e_{C,k}|=1$ by construction. Therefore $(e_1,\ldots,e_k)$ is orthonormal. Finally, since
\begin{align*} u_k=v_k-\sum_{i=1}^{k-1}(v_k,e_i)_V e_i, \end{align*}
we have $u_k\in \operatorname{span}\{v_1,\ldots,v_k\}$ and $v_k\in \operatorname{span}\{e_1,\ldots,e_k\}$. Combining these inclusions with
\begin{align*} \operatorname{span}\{e_1,\ldots,e_{k-1}\}=\operatorname{span}\{v_1,\ldots,v_{k-1}\} \end{align*}
gives
\begin{align*} \operatorname{span}\{e_1,\ldots,e_k\}=\operatorname{span}\{v_1,\ldots,v_k\}. \end{align*}
Thus the induction hypotheses are established at index $k$.
[/step]

[step:Conclude equality of the full orthonormal lists]
By induction on $k$, every residual used by either algorithm is nonzero, and for every $1\le k\le m$ the two algorithms produce the same normalized vector:
\begin{align*} e_{C,k}=e_{M,k}. \end{align*}
Therefore, in exact arithmetic and under the stated common inner product and normalization conventions, classical Gram-Schmidt and modified Gram-Schmidt produce the same orthonormal list.
[/step]
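
The two update rules compared in the proof can be sketched numerically. The following is a minimal illustration, not part of the theorem: it assumes $V=\mathbb{R}^n$ with the standard dot product, and the function names (`inner`, `normalize`, `classical_gram_schmidt`, `modified_gram_schmidt`) and the sample vectors are hypothetical choices made for this example. Because it uses floating-point numbers rather than exact arithmetic, the two outputs are only expected to agree up to rounding error.

```python
import math


def inner(x, y):
    """Standard dot product on R^n (an assumed choice of inner product)."""
    return sum(a * b for a, b in zip(x, y))


def normalize(u):
    """Divide u by its norm; a zero residual signals linear dependence."""
    norm = math.sqrt(inner(u, u))
    if norm == 0.0:
        raise ValueError("zero residual: input list is linearly dependent")
    return [a / norm for a in u]


def classical_gram_schmidt(vectors):
    """Classical variant: every coefficient is computed from the input v_k."""
    basis = []
    for v in vectors:
        coeffs = [inner(v, e) for e in basis]  # (v_k, e_i) for all i < k
        u = list(v)
        for c, e in zip(coeffs, basis):
            u = [a - c * b for a, b in zip(u, e)]
        basis.append(normalize(u))
    return basis


def modified_gram_schmidt(vectors):
    """Modified variant: each coefficient uses the current working vector."""
    basis = []
    for v in vectors:
        w = list(v)  # w_{k,0} = v_k
        for e in basis:
            c = inner(w, e)  # (w_{k,j-1}, e_j)
            w = [a - c * b for a, b in zip(w, e)]
        basis.append(normalize(w))
    return basis


# Sample linearly independent input (illustrative values only).
vs = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
e_c = classical_gram_schmidt(vs)
e_m = modified_gram_schmidt(vs)

# Largest componentwise difference between the two output lists.
max_diff = max(abs(a - b) for x, y in zip(e_c, e_m) for a, b in zip(x, y))
print(max_diff)
```

For a well-conditioned input such as this one, `max_diff` should be at the level of rounding error. The only difference between the two functions is which vector the coefficient is taken from, which is exactly the point the induction over $j$ addresses.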
