Convergence of Inverse Iteration — Statement & Proof

Convergence of Inverse Iteration (Theorem # 1408)

Theorem

Edit Issues Pull Requests Attributions Admin

Discussion

No discussion available for this theorem.

Proof

[proofplan] Inverse iteration with shift $s$ applies the power method to the matrix $(A - sI)^{-1}$. We verify that this matrix is diagonalisable with the same eigenvectors as $A$ and eigenvalues $(\lambda_i - s)^{-1}$. The dominant eigenvalue of $(A - sI)^{-1}$ corresponds to the eigenvalue of $A$ closest to $s$, and the convergence rate of the power method is determined by the ratio of the second-largest to the largest eigenvalue magnitude of $(A - sI)^{-1}$, which equals $|(\lambda - s)/(\lambda' - s)| = \rho$. [/proofplan] [step:Identify the eigenvalues and eigenvectors of $(A - sI)^{-1}$] Since $A$ is diagonalisable, write $Aw_i = \lambda_iw_i$ for $i = 1, \ldots, n$. Then \begin{align*} (A - sI)w_i = (\lambda_i - s)w_i. \end{align*} The shift $s$ is not an eigenvalue of $A$ (since $\lambda$ is the closest eigenvalue to $s$ and $|\lambda - s| > 0$ by hypothesis), so $\lambda_i - s \neq 0$ for all $i$ and $(A - sI)$ is invertible. Applying $(A - sI)^{-1}$ to both sides: \begin{align*} w_i = (\lambda_i - s)(A - sI)^{-1}w_i, \qquad \text{hence} \qquad (A - sI)^{-1}w_i = \frac{1}{\lambda_i - s}\,w_i. \end{align*} Therefore $(A - sI)^{-1}$ is diagonalisable with the same eigenvectors $w_1, \ldots, w_n$ and eigenvalues $\mu_i := (\lambda_i - s)^{-1}$. [guided] The key algebraic fact is that shifting and inverting preserve the eigenvectors. If $Aw_i = \lambda_iw_i$, then $(A - sI)w_i = (\lambda_i - s)w_i$ — the shift moves each eigenvalue by $-s$ but leaves the eigenvectors unchanged. Inverting then flips each eigenvalue to its reciprocal: $(A - sI)^{-1}w_i = (\lambda_i - s)^{-1}w_i$. For the inverse to exist, we need $\lambda_i - s \neq 0$ for all $i$, i.e., $s$ must not be an eigenvalue of $A$. The hypothesis $|\lambda - s| > 0$ (where $\lambda$ is the *closest* eigenvalue to $s$) guarantees this: if the closest eigenvalue is at nonzero distance from $s$, then all eigenvalues are at nonzero distance from $s$. Since $A$ is diagonalisable with eigenvectors $\{w_1, \ldots, w_n\}$ forming a basis, and $(A - sI)^{-1}$ has the same eigenvectors, the matrix $(A - sI)^{-1}$ is also diagonalisable. [/guided] [/step] [step:Identify the dominant eigenvalue of $(A - sI)^{-1}$] The magnitude of the $i$-th eigenvalue of $(A - sI)^{-1}$ is $|\mu_i| = |\lambda_i - s|^{-1}$. The largest $|\mu_i|$ corresponds to the smallest $|\lambda_i - s|$, which is $|\lambda - s|$ by hypothesis. The second-largest corresponds to $|\lambda' - s|$, the second-smallest distance. Therefore the dominant eigenvalue of $(A - sI)^{-1}$ has modulus \begin{align*} |\mu_{\max}| = \frac{1}{|\lambda - s|}, \end{align*} and the ratio of the second-largest to dominant eigenvalue modulus is \begin{align*} \rho = \frac{|\mu_{\text{second}}|}{|\mu_{\max}|} = \frac{|\lambda - s|}{|\lambda' - s|} < 1. \end{align*} [guided] Inversion reverses the ordering of eigenvalue magnitudes relative to the shift $s$. The eigenvalue $\lambda$ of $A$ that is *closest* to $s$ produces the eigenvalue $(\lambda - s)^{-1}$ of $(A - sI)^{-1}$ that is *largest* in magnitude, because $|\lambda_i - s|$ is smallest for $i$ corresponding to $\lambda$. The ratio $\rho = |\lambda - s| / |\lambda' - s| < 1$ because $|\lambda - s| < |\lambda' - s|$ by definition ($\lambda$ is the closest eigenvalue, $\lambda'$ is the second-closest). This ratio controls the convergence speed: the smaller $\rho$ is, the faster convergence will be. When $s$ is very close to $\lambda$, $|\lambda - s|$ is small while $|\lambda' - s|$ remains bounded away from zero, making $\rho$ close to zero. [/guided] [/step] [step:Apply the power method convergence theorem to $(A - sI)^{-1}$] Inverse iteration computes $x^{(k+1)} \propto (A - sI)^{-1}x^{(k)}$, which is exactly the power method applied to the matrix $(A - sI)^{-1}$. Provided the initial vector has a nonzero component along $w$ (the eigenvector associated with $\lambda$), the [Convergence of the Power Method](/theorems/1405) guarantees convergence to $\operatorname{span}(w)$ at rate $O(\rho^k)$. [guided] The [Convergence of the Power Method](/theorems/1405) states that if $B$ is diagonalisable with a dominant eigenvalue of multiplicity one, and if the initial vector has a nonzero component in the dominant eigenspace, then the power iteration $x^{(k+1)} \propto Bx^{(k)}$ converges to the dominant eigenvector at rate $O(\rho^k)$, where $\rho$ is the ratio of the second-largest to the largest eigenvalue magnitude. We apply this with $B = (A - sI)^{-1}$. We have verified: (i) $B$ is diagonalisable (same eigenvectors as $A$); (ii) the dominant eigenvalue $\mu_{\max} = (\lambda - s)^{-1}$ has the largest modulus; (iii) the ratio of the second-largest to the dominant eigenvalue modulus is $\rho = |\lambda - s|/|\lambda' - s| < 1$. The hypothesis on the initial vector (nonzero component along $w$) is carried over from the theorem statement. The practical advantage of inverse iteration is that the convergence rate $\rho = |\lambda - s|/|\lambda' - s|$ can be made arbitrarily small by choosing $s$ close to $\lambda$. This is in contrast to the standard power method, where the convergence rate $|\lambda_2|/|\lambda_1|$ is fixed by the matrix. In the extreme case $s \to \lambda$, $\rho \to 0$ and convergence occurs in essentially a single step (though the linear system $(A - sI)x^{(k+1)} = x^{(k)}$ becomes increasingly ill-conditioned — fortunately, the errors in the solve are biased in the direction of $w$, which is harmless for eigenvector computation). [/guided] [/step]

Explore Further

Symmetry and Negative Definiteness of $A$ Numerical Analysis Strict Diagonal Dominance Implies Nonsingularity Numerical Analysis QR Factor of Iterated Power Numerical Analysis Spectral Radius Criterion Numerical Analysis Finite Termination of Conjugate Direction Methods Numerical Analysis Gradient Equals Negative Residual Numerical Analysis Convergence of Simultaneous Iteration Numerical Analysis Technical Convergence Lemma for Simultaneous Iteration Numerical Analysis

What brings you to Androma?

Start with a route through the knowledge graph.

Convergence of Inverse Iteration (Theorem # 1408)

Discussion

Proof

Explore Further

Sign in to Androma

Check your inbox

One last step

Convergence of Inverse Iteration (Theorem # 1408)

Discussion

Proof

Explore Further