Exact Equivalence of Classical and Modified Gram-Schmidt

Proof

[proofplan] We prove the equality of the two output lists by induction on the index $k$. The induction hypothesis says not only that the previously produced orthonormal vectors agree, but also that they span the same subspace as the previously processed input vectors. At the $k$th stage, the modified algorithm subtracts the same [orthogonal projection](/theorems/437) components as the classical formula, one component at a time. [Linear independence](/page/Linear%20Independence) guarantees that the common residual is nonzero, so the shared normalization rule gives the same normalized vector in both algorithms. [/proofplan]

[step:Initialize the first residual in both algorithms] For $k=1$, both algorithms have no previous orthonormal vectors. Thus the empty sum in the classical formula is zero, and the modified algorithm starts with $w_{1,0}=v_1$. Hence \begin{align*} u_{C,1}=v_1=u_{M,1}. \end{align*} Because $(v_1,\ldots,v_m)$ is linearly independent, $v_1\neq 0$. Therefore $u_{C,1}$ and $u_{M,1}$ are nonzero, and both algorithms normalize by the same positive scalar $|v_1|$. Consequently \begin{align*} e_{C,1}=\frac{v_1}{|v_1|}=e_{M,1}. \end{align*} The list $(e_{C,1})$ is orthonormal, and \begin{align*} \operatorname{span}\{e_{C,1}\}=\operatorname{span}\{v_1\}. \end{align*} [/step]

[step:Assume the previous Gram-Schmidt vectors agree and span the previous inputs] Fix an index $k$ with $2\le k\le m$. Assume that, for every $1\le i<k$, the vectors produced by the two algorithms agree: \begin{align*} e_{C,i}=e_{M,i}. \end{align*} Let $e_i$ denote this common vector for $1\le i<k$. Assume also that $(e_1,\ldots,e_{k-1})$ is orthonormal and that \begin{align*} \operatorname{span}\{e_1,\ldots,e_{k-1}\}=\operatorname{span}\{v_1,\ldots,v_{k-1}\}. \end{align*} We prove the same conclusions at index $k$. [/step]

[step:Compute the modified working vector after each projection]We claim that, for every integer $j$ with $0\le j\le k-1$, the modified working vector satisfies \begin{align*} w_{k,j}=v_k-\sum_{i=1}^{j}(v_k,e_i)_V e_i. \end{align*} For $j=0$, this is the definition $w_{k,0}=v_k$, with the empty sum equal to zero. Now let $j$ satisfy $1\le j\le k-1$, and assume the formula holds for $j-1$. Since $(e_1,\ldots,e_{k-1})$ is orthonormal, we have $(e_i,e_j)_V=0$ for $i<j$ and $(e_j,e_j)_V=1$. Using linearity of the [inner product](/page/Inner%20Product) in the first argument, the coefficient subtracted by the modified algorithm is \begin{align*} (w_{k,j-1},e_j)_V=(v_k,e_j)_V. \end{align*} Indeed, the terms involving $e_i$ with $i<j$ vanish by orthogonality. Substituting this coefficient into the modified update gives \begin{align*} w_{k,j}=w_{k,j-1}-(v_k,e_j)_V e_j. \end{align*} Using the induction formula for $w_{k,j-1}$ yields \begin{align*} w_{k,j}=v_k-\sum_{i=1}^{j}(v_k,e_i)_V e_i. \end{align*} Thus the formula holds for every $0\le j\le k-1$ by finite induction.[/step]

[guided]The purpose of this step is to show that the modified algorithm has not changed the mathematical residual; it has only changed the order in which the projection components are removed. We prove the precise formula \begin{align*} w_{k,j}=v_k-\sum_{i=1}^{j}(v_k,e_i)_V e_i \end{align*} for each $0\le j\le k-1$. When $j=0$, the statement says $w_{k,0}=v_k$, which is exactly the definition of the initial modified working vector. Now suppose the formula has been proved for some $j-1$, where $1\le j\le k-1$. The modified algorithm computes the next coefficient from the current working vector: \begin{align*} (w_{k,j-1},e_j)_V. \end{align*} By the induction formula, \begin{align*} w_{k,j-1}=v_k-\sum_{i=1}^{j-1}(v_k,e_i)_V e_i. \end{align*} Taking the inner product with $e_j$ and using linearity in the first argument gives \begin{align*} (w_{k,j-1},e_j)_V=(v_k,e_j)_V-\sum_{i=1}^{j-1}(v_k,e_i)_V(e_i,e_j)_V. \end{align*} Because $(e_1,\ldots,e_{k-1})$ is orthonormal, every factor $(e_i,e_j)_V$ with $i<j$ is zero. Therefore \begin{align*} (w_{k,j-1},e_j)_V=(v_k,e_j)_V. \end{align*} This is the key point: although modified Gram-Schmidt computes the coefficient from the updated vector $w_{k,j-1}$, exact orthogonality of the previously removed directions makes that coefficient equal to the classical coefficient. Substituting this equality into the modified update, \begin{align*} w_{k,j}=w_{k,j-1}-(w_{k,j-1},e_j)_V e_j, \end{align*} we obtain \begin{align*} w_{k,j}=w_{k,j-1}-(v_k,e_j)_V e_j. \end{align*} Finally, replacing $w_{k,j-1}$ by its induction formula gives \begin{align*} w_{k,j}=v_k-\sum_{i=1}^{j-1}(v_k,e_i)_V e_i-(v_k,e_j)_V e_j. \end{align*} Combining the two displayed subtraction terms into a single finite sum gives \begin{align*} w_{k,j}=v_k-\sum_{i=1}^{j}(v_k,e_i)_V e_i. \end{align*} This completes the induction over $j$.[/guided]
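
As a concrete instance of the induction over $j$, the following worked case takes $k=3$ (assuming $m\ge 3$) and writes out the two projection removals explicitly. It uses only the update rule and the orthonormality of $(e_1,e_2)$ stated above; the choice $k=3$ is purely illustrative.

```latex
\begin{align*}
w_{3,0} &= v_3, \\
w_{3,1} &= w_{3,0}-(w_{3,0},e_1)_V e_1 = v_3-(v_3,e_1)_V e_1, \\
(w_{3,1},e_2)_V &= (v_3,e_2)_V-(v_3,e_1)_V(e_1,e_2)_V = (v_3,e_2)_V, \\
w_{3,2} &= w_{3,1}-(w_{3,1},e_2)_V e_2 = v_3-(v_3,e_1)_V e_1-(v_3,e_2)_V e_2.
\end{align*}
```

The third line is where orthogonality enters: $(e_1,e_2)_V=0$ removes the only term that could make the modified coefficient differ from the classical one.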

[step:Identify the classical and modified residuals] Taking $j=k-1$ in the formula from the previous step gives \begin{align*} u_{M,k}=w_{k,k-1}=v_k-\sum_{i=1}^{k-1}(v_k,e_i)_V e_i. \end{align*} Since $e_i=e_{C,i}$ for every $1\le i<k$, the classical residual is \begin{align*} u_{C,k}=v_k-\sum_{i=1}^{k-1}(v_k,e_i)_V e_i. \end{align*} Therefore \begin{align*} u_{C,k}=u_{M,k}. \end{align*} Let $u_k$ denote this common residual. [/step]
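
The equality of the two residuals can be checked in exact rational arithmetic with a short Python sketch. This is an illustration, not part of the proof, and it rests on assumptions not in the original: the inner product is taken to be the standard dot product on $\mathbb{Q}^3$, the input vectors are arbitrary, and all function names are hypothetical. To avoid square roots, the sketch stores the unnormalized residuals $u_i$ and uses the identity $(x,e_i)_V e_i=\frac{(x,u_i)_V}{(u_i,u_i)_V}u_i$, which holds because $e_i=u_i/|u_i|$.

```python
from fractions import Fraction


def dot(x, y):
    """Standard dot product, standing in for the inner product (., .)_V."""
    return sum(a * b for a, b in zip(x, y))


def remove_component(w, source, direction):
    """Return w - ((source, direction) / (direction, direction)) * direction."""
    c = dot(source, direction) / dot(direction, direction)
    return [wi - c * di for wi, di in zip(w, direction)]


def classical_residuals(vectors):
    """Residuals u_{C,k}: every coefficient is computed from the original v_k."""
    residuals = []
    for v in vectors:
        u = list(v)
        for prev in residuals:
            u = remove_component(u, v, prev)
        residuals.append(u)
    return residuals


def modified_residuals(vectors):
    """Residuals u_{M,k}: every coefficient is computed from the working vector."""
    residuals = []
    for v in vectors:
        w = list(v)
        for prev in residuals:
            w = remove_component(w, w, prev)
        residuals.append(w)
    return residuals


# Illustrative linearly independent input in Q^3 (an assumption of this sketch).
vs = [[Fraction(x) for x in row] for row in ([1, 1, 0], [1, 0, 1], [0, 1, 1])]
u_classical = classical_residuals(vs)
u_modified = modified_residuals(vs)
residuals_agree = u_classical == u_modified
```

Because `Fraction` arithmetic is exact, `residuals_agree` compares the two lists without any tolerance, which matches the exact-arithmetic setting of the statement.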

[step:Use linear independence to rule out a zero residual] Suppose, for contradiction, that $u_k=0$. From the residual formula, \begin{align*} v_k=\sum_{i=1}^{k-1}(v_k,e_i)_V e_i. \end{align*} Thus $v_k\in \operatorname{span}\{e_1,\ldots,e_{k-1}\}$. By the induction hypothesis on spans, \begin{align*} v_k\in \operatorname{span}\{v_1,\ldots,v_{k-1}\}. \end{align*} This contradicts the linear independence of $(v_1,\ldots,v_m)$. Hence $u_k\neq 0$. [/step]
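
The role of linear independence in this step can be illustrated with a small exact computation: a vector lying in the span of the earlier inputs leaves a zero residual, while a vector outside that span does not. The sketch below is only an illustration under stated assumptions: the standard dot product on $\mathbb{Q}^3$ plays the role of $(\cdot,\cdot)_V$, the vectors are arbitrary examples, the helper names are hypothetical, and projections are taken onto unnormalized orthogonal residuals to keep the arithmetic rational.

```python
from fractions import Fraction


def dot(x, y):
    """Standard dot product, standing in for the inner product (., .)_V."""
    return sum(a * b for a, b in zip(x, y))


def residual(v, previous):
    """Subtract from v its projections onto the mutually orthogonal,
    nonzero vectors in `previous`, one component at a time."""
    w = list(v)
    for u in previous:
        c = dot(w, u) / dot(u, u)
        w = [wi - c * ui for wi, ui in zip(w, u)]
    return w


v1 = [Fraction(1), Fraction(1), Fraction(0)]
v2 = [Fraction(1), Fraction(0), Fraction(1)]
u1 = residual(v1, [])
u2 = residual(v2, [u1])

# v1 + 2*v2 lies in span{v1, v2}; (0, 1, 1) does not.
dependent = [a + 2 * b for a, b in zip(v1, v2)]
independent = [Fraction(0), Fraction(1), Fraction(1)]

zero_residual = residual(dependent, [u1, u2])
nonzero_residual = residual(independent, [u1, u2])
```

Here `zero_residual` has all entries equal to zero, so normalization would be impossible for the dependent vector, whereas `nonzero_residual` can be normalized.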

[step:Normalize the common residual and update the induction] Since $u_{C,k}=u_{M,k}=u_k$ and $u_k\neq 0$, both algorithms divide by the same positive scalar $|u_k|$. Hence \begin{align*} e_{C,k}=\frac{u_k}{|u_k|}=e_{M,k}. \end{align*} Let $e_k$ denote this common vector. Moreover, $u_k$ is orthogonal to each $e_j$ with $1\le j<k$, because orthonormality of $(e_1,\ldots,e_{k-1})$ leaves only the $i=j$ term of the sum: \begin{align*} (u_k,e_j)_V=(v_k,e_j)_V-\sum_{i=1}^{k-1}(v_k,e_i)_V(e_i,e_j)_V=(v_k,e_j)_V-(v_k,e_j)_V(e_j,e_j)_V=0. \end{align*} Since $e_k$ is a positive scalar multiple of $u_k$, it is also orthogonal to each such $e_j$. Also $|e_k|=1$ by construction. Therefore $(e_1,\ldots,e_k)$ is orthonormal. Finally, since \begin{align*} u_k=v_k-\sum_{i=1}^{k-1}(v_k,e_i)_V e_i \end{align*} and $u_k=|u_k|e_k$, we have $e_k\in \operatorname{span}\{v_1,\ldots,v_k\}$ and $v_k\in \operatorname{span}\{e_1,\ldots,e_k\}$. Combining these inclusions with \begin{align*} \operatorname{span}\{e_1,\ldots,e_{k-1}\}=\operatorname{span}\{v_1,\ldots,v_{k-1}\} \end{align*} gives \begin{align*} \operatorname{span}\{e_1,\ldots,e_k\}=\operatorname{span}\{v_1,\ldots,v_k\}. \end{align*} Thus the induction hypotheses are established at index $k$. [/step]

[step:Conclude equality of the full orthonormal lists] By induction on $k$, every residual used by either algorithm is nonzero, and for every $1\le k\le m$ the two algorithms produce the same normalized vector: \begin{align*} e_{C,k}=e_{M,k}. \end{align*} Therefore, in exact arithmetic and under the stated common inner product and normalization conventions, classical Gram-Schmidt and modified Gram-Schmidt produce the same orthonormal list. [/step]
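
For readers who want to see both procedures side by side, here is a minimal Python sketch of the two algorithms with normalization. It is a hedged illustration rather than a reference implementation: it assumes the standard dot product on $\mathbb{R}^3$, uses hypothetical function names, and runs in floating-point arithmetic, so the two outputs are compared only up to rounding. The exact equality proved above concerns exact arithmetic, which floating point merely approximates.

```python
import math


def dot(x, y):
    """Standard dot product, standing in for the inner product (., .)_V."""
    return sum(a * b for a, b in zip(x, y))


def normalize(u):
    """Divide u by its norm; a zero norm signals a linearly dependent input."""
    norm = math.sqrt(dot(u, u))
    if norm == 0.0:
        raise ValueError("zero residual: input list is linearly dependent")
    return [a / norm for a in u]


def classical_gram_schmidt(vectors):
    """Each coefficient is the inner product of the original v_k with e_i."""
    es = []
    for v in vectors:
        u = list(v)
        for e in es:
            c = dot(v, e)
            u = [ui - c * ei for ui, ei in zip(u, e)]
        es.append(normalize(u))
    return es


def modified_gram_schmidt(vectors):
    """Each coefficient is the inner product of the working vector with e_i."""
    es = []
    for v in vectors:
        w = list(v)
        for e in es:
            c = dot(w, e)
            w = [wi - c * ei for wi, ei in zip(w, e)]
        es.append(normalize(w))
    return es


# Illustrative well-conditioned input (an assumption of this sketch).
vs = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
e_c = classical_gram_schmidt(vs)
e_m = modified_gram_schmidt(vs)
max_gap = max(abs(a - b) for x, y in zip(e_c, e_m) for a, b in zip(x, y))
```

The only difference between the two functions is whether the coefficient is computed from `v` or from the working vector `w`, which mirrors the single point at which the proof needs orthogonality.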
