Chain Rule for Partial Derivatives — Statement & Proof

Chain Rule for Partial Derivatives (Theorem # 7906)

Theorem

Discussion

Proof

[proofplan] We first use the chain rule for total derivatives to identify the total derivative of $f\circ g$ at $a$ as the composition $Df_{g(a)}\circ Dg_a$. Evaluating this [linear map](/page/Linear%20Map) on the standard basis vector $e_i\in\mathbb{R}^m$ turns the total derivative identity into the desired [partial derivative](/page/Partial%20Derivative) identity. The remaining work is to expand $Dg_a(e_i)$ in the standard basis of $\mathbb{R}^k$ and then use linearity of $Df_{g(a)}$. [/proofplan]

[step:Apply the total derivative chain rule to $f\circ g$] Define the composition map \begin{align*} h:U &\to \mathbb{R}^n \end{align*} by $h=f\circ g$. Since $g$ is differentiable at $a$ and $f$ is differentiable at $g(a)$, the chain rule for differentiable maps between Euclidean spaces gives that $h$ is differentiable at $a$ and \begin{align*} Dh_a=Df_{g(a)}\circ Dg_a. \end{align*} Equivalently, for every vector $v\in\mathbb{R}^m$, \begin{align*} Dh_a(v)=Df_{g(a)}(Dg_a(v)). \end{align*} (citing a result not yet in the wiki: Chain Rule for Differentiable Maps) [/step]

[step:Evaluate the derivative identity on the $i$-th coordinate direction] Fix $i\in\{1,\ldots,m\}$, and let $e_i\in\mathbb{R}^m$ denote the $i$-th standard basis vector. By [citetheorem:7904] applied to the differentiable map $h:U\to\mathbb{R}^n$, \begin{align*} \partial_{x_i}h(a)=Dh_a(e_i). \end{align*} Using the derivative identity from the previous step, we obtain \begin{align*} \partial_{x_i}(f\circ g)(a)=Df_{g(a)}(Dg_a(e_i)). \end{align*}

[guided] Fix $i\in\{1,\ldots,m\}$, and let $e_i\in\mathbb{R}^m$ be the vector with $1$ in the $i$-th coordinate and $0$ in all other coordinates. We evaluate on $e_i$ because a partial derivative is exactly the total derivative applied to a coordinate direction. We apply [citetheorem:7904] to the map $h:U\to\mathbb{R}^n$ defined by $h=f\circ g$. Its hypothesis is satisfied because the previous step showed that $h$ is differentiable at $a$.
Therefore the $i$-th partial derivative of $h$ exists and is given by \begin{align*} \partial_{x_i}h(a)=Dh_a(e_i). \end{align*} Since $h=f\circ g$, this is \begin{align*} \partial_{x_i}(f\circ g)(a)=Dh_a(e_i). \end{align*} The total derivative chain rule gives $Dh_a=Df_{g(a)}\circ Dg_a$, so substituting this identity into the preceding formula yields \begin{align*} \partial_{x_i}(f\circ g)(a)=Df_{g(a)}(Dg_a(e_i)). \end{align*} This reduces the desired partial derivative formula to a computation with the linear maps $Dg_a:\mathbb{R}^m\to\mathbb{R}^k$ and $Df_{g(a)}:\mathbb{R}^k\to\mathbb{R}^n$. [/guided] [/step]

[step:Expand $Dg_a(e_i)$ in the standard basis of $\mathbb{R}^k$] For each $r\in\{1,\ldots,k\}$, let $\varepsilon_r\in\mathbb{R}^k$ denote the $r$-th standard basis vector. Since $g:U\to\mathbb{R}^k$ is differentiable at $a$, [citetheorem:7904] applied to $g$ gives \begin{align*} Dg_a(e_i)=\partial_{x_i}g(a). \end{align*} Because $g=(g_1,\ldots,g_k)$, the vector-valued partial derivative is computed componentwise: \begin{align*} \partial_{x_i}g(a)=\sum_{r=1}^k \partial_{x_i}g_r(a)\,\varepsilon_r. \end{align*} Hence \begin{align*} Dg_a(e_i)=\sum_{r=1}^k \partial_{x_i}g_r(a)\,\varepsilon_r. \end{align*} [/step]

[step:Use linearity of $Df_{g(a)}$ to obtain the coordinate formula] Since $Df_{g(a)}:\mathbb{R}^k\to\mathbb{R}^n$ is a linear map, the expansion of $Dg_a(e_i)$ gives \begin{align*} Df_{g(a)}(Dg_a(e_i))=\sum_{r=1}^k \partial_{x_i}g_r(a)\,Df_{g(a)}(\varepsilon_r). \end{align*} Applying [citetheorem:7904] to the differentiable map $f:V\to\mathbb{R}^n$ at the point $g(a)$ gives \begin{align*} Df_{g(a)}(\varepsilon_r)=\partial_{y_r}f(g(a)) \end{align*} for every $r\in\{1,\ldots,k\}$. Therefore \begin{align*} Df_{g(a)}(Dg_a(e_i))=\sum_{r=1}^k \partial_{x_i}g_r(a)\,\partial_{y_r}f(g(a)).
\end{align*} Each $\partial_{x_i}g_r(a)$ is a real scalar and each $\partial_{y_r}f(g(a))$ is a vector in $\mathbb{R}^n$, so writing the scalar factor on the right changes only the notation: \begin{align*} Df_{g(a)}(Dg_a(e_i))=\sum_{r=1}^k \partial_{y_r}f(g(a))\,\partial_{x_i}g_r(a). \end{align*} Combining this identity with \begin{align*} \partial_{x_i}(f\circ g)(a)=Df_{g(a)}(Dg_a(e_i)) \end{align*} proves \begin{align*} \partial_{x_i}(f\circ g)(a)=\sum_{r=1}^k \partial_{y_r}f(g(a))\,\partial_{x_i}g_r(a). \end{align*} Since $i\in\{1,\ldots,m\}$ was arbitrary, the formula holds for every coordinate direction. [/step]
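The final formula can be sanity-checked numerically. The sketch below is not part of the proof: it picks two hypothetical smooth maps $g:\mathbb{R}^2\to\mathbb{R}^3$ and $f:\mathbb{R}^3\to\mathbb{R}^2$ (so $m=2$, $k=3$, $n=2$), approximates every partial derivative by central differences, and compares $\partial_{x_i}(f\circ g)(a)$ with $\sum_{r=1}^k \partial_{y_r}f(g(a))\,\partial_{x_i}g_r(a)$. The maps, the point `a`, the step size, and all function names are assumptions chosen only for illustration.

```python
import math

# Hypothetical example maps (assumptions, not from the theorem page).
def g(x):
    """g: R^2 -> R^3."""
    x1, x2 = x
    return [x1 * x2, math.sin(x1), x2 ** 2]

def f(y):
    """f: R^3 -> R^2."""
    y1, y2, y3 = y
    return [y1 + y2 * y3, math.exp(y1) * y3]

def partial(func, point, index, step=1e-6):
    """Central-difference estimate of the index-th partial derivative of func at point."""
    forward = list(point)
    backward = list(point)
    forward[index] += step
    backward[index] -= step
    return [(p - q) / (2 * step) for p, q in zip(func(forward), func(backward))]

def chain_rule_partial(f, g, a, i):
    """Right-hand side: sum over r of d_{y_r} f(g(a)) * d_{x_i} g_r(a)."""
    b = g(a)
    dg = partial(g, a, i)  # components d_{x_i} g_r(a), r = 1..k
    total = [0.0] * len(f(b))
    for r, scalar in enumerate(dg):
        df_r = partial(f, b, r)  # vector d_{y_r} f(g(a)) in R^n
        total = [t + scalar * c for t, c in zip(total, df_r)]
    return total

def direct_partial(f, g, a, i):
    """Left-hand side: d_{x_i} (f o g)(a), differentiating the composition directly."""
    return partial(lambda x: f(g(x)), a, i)

a = [0.7, -1.3]
for i in range(len(a)):
    lhs = direct_partial(f, g, a, i)
    rhs = chain_rule_partial(f, g, a, i)
    print(f"i={i + 1}: direct={lhs}, chain rule={rhs}")
```

The two printed vectors should agree up to finite-difference error for each $i$. Agreement at one point is only a plausibility check under these assumptions, not a substitute for the argument above.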

Prerequisites

Theorems

Partial Derivatives from Differentiability (Theorem #7904)

Definitions & Concepts

Partial Derivative

Linear Map
