Gradient Equals Negative Residual — Statement & Proof

Gradient Equals Negative Residual (Theorem # 1395)

Theorem

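Let $A \in \mathbb{R}^{n \times n}$ be symmetric, let $b \in \mathbb{R}^n$, and define $F(x) = \frac{1}{2}\langle x, Ax \rangle - \langle b, x \rangle$. For an iterate $x^{(k)}$ with residual $r^{(k)} := b - Ax^{(k)}$, the gradient of $F$ satisfies \begin{align*} \nabla F(x^{(k)}) = -r^{(k)}. \end{align*}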

Proof

[proofplan] We compute $\nabla F(x)$ for the quadratic functional $F(x) = \frac{1}{2}\langle x, Ax \rangle - \langle b, x \rangle$ by differentiating each term with respect to $x$. The symmetry of $A$ gives $\nabla(\frac{1}{2}x^\top Ax) = Ax$, and the linear term contributes $-b$, so $\nabla F(x) = Ax - b$. Evaluating at $x = x^{(k)}$ then gives $\nabla F(x^{(k)}) = -r^{(k)}$. [/proofplan]

[step:Differentiate $F(x) = \frac{1}{2}\langle x, Ax \rangle - \langle b, x \rangle$ term by term]

Write $F$ in component form. In coordinates $x = (x_1, \ldots, x_n)^\top$, with $A = (a_{ij})$:
\begin{align*} F(x) = \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij} x_i x_j - \sum_{i=1}^{n} b_i x_i. \end{align*}

Differentiating with respect to $x_\ell$ for $\ell = 1, \ldots, n$:
\begin{align*} \frac{\partial F}{\partial x_\ell} = \frac{1}{2}\sum_{j=1}^{n} a_{\ell j} x_j + \frac{1}{2}\sum_{i=1}^{n} a_{i\ell} x_i - b_\ell. \end{align*}

Since $A$ is symmetric ($a_{i\ell} = a_{\ell i}$), the two sums are equal:
\begin{align*} \frac{\partial F}{\partial x_\ell} = \sum_{j=1}^{n} a_{\ell j} x_j - b_\ell = (Ax)_\ell - b_\ell. \end{align*}

Assembling the components into a vector:
\begin{align*} \nabla F(x) = Ax - b. \end{align*}

Evaluating at $x = x^{(k)}$ and using the definition $r^{(k)} := b - Ax^{(k)}$:
\begin{align*} \nabla F(x^{(k)}) = Ax^{(k)} - b = -(b - Ax^{(k)}) = -r^{(k)}. \end{align*}

[guided] To differentiate $\frac{1}{2}x^\top Ax$, we use the product rule for bilinear forms. In index notation, $\frac{1}{2}\sum_{i,j} a_{ij} x_i x_j$ has two types of contributions to $\partial/\partial x_\ell$: terms where $i = \ell$ (giving $\frac{1}{2}\sum_j a_{\ell j} x_j$) and terms where $j = \ell$ (giving $\frac{1}{2}\sum_i a_{i\ell} x_i$). Symmetry of $A$ means $a_{i\ell} = a_{\ell i}$, so these two contributions are identical, and their sum is $\sum_j a_{\ell j} x_j = (Ax)_\ell$. In matrix calculus notation, the standard identity for a symmetric matrix $A$ is $\nabla(x^\top Ax) = 2Ax$, so $\nabla(\frac{1}{2}x^\top Ax) = Ax$. The linear term $-b^\top x$ differentiates to $-b$.

The identity $\nabla F(x^{(k)}) = -r^{(k)}$ is fundamental to the conjugate gradient method: it means that **the negative gradient of the quadratic objective is exactly the residual**. When $A$ is also positive definite, $F$ is strictly convex, so its unique minimizer is the point where $\nabla F$ vanishes, that is, where the residual is zero. Minimizing $F$ is therefore equivalent to solving $Ax = b$. This is also why steepest descent moves in the direction $-\nabla F(x^{(k)}) = r^{(k)}$: with an exact line search, each step strictly decreases $F$ whenever $r^{(k)} \neq 0$, and the residual tends to zero as the iterates converge. The Euclidean norm of the residual need not decrease at every individual step. [/guided] [/step]
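
The identity can also be checked numerically. The following is a minimal sketch, not part of the original proof: it assumes NumPy is available, builds a hypothetical random symmetric positive definite matrix `A`, and compares a central-difference approximation of $\nabla F$ with $-r$. The helper names `F`, `residual`, and `numerical_gradient` are illustrative choices. The last few lines take one steepest-descent step along $r$ with an exact line search and confirm that the objective decreases.

```python
import numpy as np


def F(A, b, x):
    """Quadratic objective F(x) = 1/2 <x, Ax> - <b, x>."""
    return 0.5 * x @ (A @ x) - b @ x


def residual(A, b, x):
    """Residual r = b - Ax."""
    return b - A @ x


def numerical_gradient(A, b, x, h=1e-6):
    """Central-difference approximation of grad F at x."""
    grad = np.zeros_like(x)
    for l in range(x.size):
        e = np.zeros_like(x)
        e[l] = h
        grad[l] = (F(A, b, x + e) - F(A, b, x - e)) / (2.0 * h)
    return grad


# Hypothetical test data (an assumption for illustration only).
rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
A = M.T @ M + 4.0 * np.eye(4)  # symmetric positive definite by construction
b = rng.standard_normal(4)
x = rng.standard_normal(4)

# Check grad F(x) = Ax - b = -r up to finite-difference error.
r = residual(A, b, x)
grad_fd = numerical_gradient(A, b, x)
gradient_matches = np.allclose(grad_fd, -r, atol=1e-6)

# One steepest-descent step along r with the exact line-search step size
# alpha = <r, r> / <r, Ar>, which minimizes F along the ray x + alpha * r.
alpha = (r @ r) / (r @ (A @ r))
x_next = x + alpha * r
objective_decreased = F(A, b, x_next) < F(A, b, x)

print("gradient matches -r:", gradient_matches)
print("objective decreased:", objective_decreased)
```

Because $F$ is quadratic, the central difference is exact up to rounding error, so the comparison tolerance only needs to absorb floating-point noise. Positive definiteness is used only for the line-search step; the gradient identity itself needs only symmetry of $A$.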
