Inverse Function Theorem — Statement & Proof

Proof

[proofplan] We reduce to the case where the total derivative at $a$ is the identity by post-composing with the inverse of $D_af$, writing $f(x) = x + \varphi(x)$ with $D_a\varphi = 0$. Continuity of $x \mapsto D_x\varphi$ localises $\varphi$ to a $\tfrac{1}{2}$-Lipschitz perturbation on a closed ball. Surjectivity onto a neighbourhood follows from the [Contraction Mapping Theorem](/theorems/71) applied to the auxiliary map $T_y(x) = y - \varphi(x)$, and injectivity is immediate from the contraction bound. The inverse $g$ is shown to be Lipschitz (hence continuous), and its differentiability with the formula $D_bg = (D_{g(b)}f)^{-1}$ follows from the definition of differentiability and the Lipschitz estimate, with continuity of $b \mapsto D_bg$ ensured by continuity of matrix inversion on $\mathrm{GL}_n(\mathbb{R})$. [/proofplan] [step:Reduce to the case $D_af = \mathrm{Id}$ by post-composing with $(D_af)^{-1}$] Set $\alpha := D_af: \mathbb{R}^n \to \mathbb{R}^n$, which is an [invertible linear map](/page/Linear%20Map) by hypothesis. Replace $f$ with $\alpha^{-1} \circ f$. Since $\alpha^{-1}$ is linear, the chain rule gives $D_a(\alpha^{-1} \circ f) = \alpha^{-1} \circ D_af = \mathrm{Id}$, and $\alpha^{-1} \circ f$ is a local $C^1$-diffeomorphism at $a$ if and only if $f$ is (because $\alpha^{-1}$ is a linear isomorphism), so it suffices to prove the theorem for the replaced map. Henceforth assume $D_af = \mathrm{Id}$. Define the perturbation \begin{align*} \varphi: U &\to \mathbb{R}^n \\ x &\mapsto f(x) - x, \end{align*} so that $f(x) = x + \varphi(x)$ and $D_a\varphi = D_af - \mathrm{Id} = 0$. [guided] The goal is to show $f$ is locally invertible near $a$ with a $C^1$ inverse. The hypothesis is that $D_af$ is an invertible linear map, but working with a general invertible linear map adds unnecessary complexity to every estimate. The reduction step eliminates this: by replacing $f$ with $\alpha^{-1} \circ f$ where $\alpha = D_af$, we arrange that the derivative at $a$ is $\mathrm{Id}$. 
This is legitimate because $\alpha^{-1}$ is a linear diffeomorphism of $\mathbb{R}^n$, so $f$ is a local diffeomorphism if and only if $\alpha^{-1} \circ f$ is. With this normalisation, write $f(x) = x + \varphi(x)$ where \begin{align*} \varphi: U &\to \mathbb{R}^n \\ x &\mapsto f(x) - x. \end{align*} Then $D_a\varphi = D_af - \mathrm{Id} = 0$. The map $\varphi$ is the "nonlinear part" of $f$ near $a$, and the condition $D_a\varphi = 0$ means this nonlinear part has vanishing first-order approximation at $a$ -- it is a perturbation with $\varphi(x) - \varphi(a) = o(\|x - a\|)$ as $x \to a$. The entire proof now reduces to exploiting this smallness. [/guided] [/step] [step:Localise so that $\varphi$ is a $\tfrac{1}{2}$-contraction on $\overline{B}(a,r)$] Since $f$ is $C^1$, the map $x \mapsto D_x\varphi = D_xf - \mathrm{Id}$ is continuous, and it vanishes at $a$, so there exists $r > 0$ such that $\overline{B}(a,r) \subset U$ and \begin{align*} \|D_x\varphi\| \leq \tfrac{1}{2} \quad \text{for all } x \in \overline{B}(a,r). \end{align*} By the [Mean Value Inequality](/theorems/328), for all $x, x' \in \overline{B}(a,r)$: \begin{align*} \|\varphi(x) - \varphi(x')\| \leq \tfrac{1}{2}\|x - x'\|. \end{align*} The hypotheses of the [Mean Value Inequality](/theorems/328) are satisfied: $\overline{B}(a,r)$ is convex, $\varphi$ is differentiable on $\overline{B}(a,r)$, and $\|D_x\varphi\| \leq \tfrac{1}{2}$ throughout. [guided] Why do we need $\varphi$ to be a contraction? The plan is to construct the inverse of $f$ via the [Contraction Mapping Theorem](/theorems/71), which requires a map with [Lipschitz](/page/Continuity%20(Metric%20Spaces)) constant strictly less than $1$. Since $D_a\varphi = 0$ and $x \mapsto D_x\varphi$ is continuous (because $f$ is $C^1$), we can make $\|D_x\varphi\|$ as small as we like by restricting to a sufficiently small ball around $a$. 
Choosing $r > 0$ so that $\|D_x\varphi\| \leq \tfrac{1}{2}$ on $\overline{B}(a,r)$, the [Mean Value Inequality](/theorems/328) converts this pointwise derivative bound into the Lipschitz estimate \begin{align*} \|\varphi(x) - \varphi(x')\| \leq \sup_{z \in \overline{B}(a,r)} \|D_z\varphi\| \cdot \|x - x'\| \leq \tfrac{1}{2}\|x - x'\| \end{align*} for all $x, x' \in \overline{B}(a,r)$. The convexity of $\overline{B}(a,r)$ is needed for the Mean Value Inequality (the line segment from $x$ to $x'$ must remain in the domain). The specific constant $\tfrac{1}{2}$ is not special -- any $\lambda \in (0,1)$ would work -- but $\tfrac{1}{2}$ makes the subsequent estimates clean. [/guided] [/step] [step:Prove surjectivity onto a ball around $f(a)$ via the Contraction Mapping Theorem] Set $V := B(a,r)$. Fix $y \in \mathbb{R}^n$ with $\|y - f(a)\| < r/2$. Define the auxiliary map \begin{align*} T_y: \overline{B}(a,r) &\to \mathbb{R}^n \\ x &\mapsto y - \varphi(x). \end{align*} We verify that $T_y$ maps $\overline{B}(a,r)$ into itself. For $x \in \overline{B}(a,r)$: \begin{align*} \|T_y(x) - a\| &= \|y - \varphi(x) - a\| \\ &\leq \|y - f(a)\| + \|\varphi(a) - \varphi(x)\| \\ &< \tfrac{r}{2} + \tfrac{1}{2}\|x - a\| \\ &\leq \tfrac{r}{2} + \tfrac{r}{2} = r, \end{align*} where the second line uses $f(a) = a + \varphi(a)$ and the triangle inequality, and the third uses the hypothesis $\|y - f(a)\| < r/2$ and the contraction estimate from the previous step. For the contraction property, for $x, x' \in \overline{B}(a,r)$: \begin{align*} \|T_y(x) - T_y(x')\| = \|\varphi(x') - \varphi(x)\| \leq \tfrac{1}{2}\|x - x'\|. \end{align*} The space $\overline{B}(a,r)$ is a closed subset of the complete metric space $\mathbb{R}^n$, hence complete. By the [Contraction Mapping Theorem](/theorems/71), $T_y$ has a unique fixed point $x_0 \in \overline{B}(a,r)$: \begin{align*} x_0 = T_y(x_0) = y - \varphi(x_0), \end{align*} which rearranges to $f(x_0) = x_0 + \varphi(x_0) = y$. 
Since $\|T_y(x_0) - a\| < r$ (strict inequality from the estimate above), the fixed point satisfies $x_0 \in B(a,r) = V$. Therefore $y \in f(V)$, and $W := f(V)$ contains the ball $B(f(a), r/2)$. [guided] The idea is to reformulate the equation $f(x) = y$ as a fixed-point problem. Since $f(x) = x + \varphi(x)$, the equation $x + \varphi(x) = y$ rearranges to $x = y - \varphi(x)$, so a solution is precisely a fixed point of the auxiliary map \begin{align*} T_y: \overline{B}(a,r) &\to \mathbb{R}^n, \qquad T_y(x) := y - \varphi(x). \end{align*} To apply the [Contraction Mapping Theorem](/theorems/71), three hypotheses must be verified: (i) the ambient space $\overline{B}(a,r)$ is a complete metric space, (ii) $T_y$ maps $\overline{B}(a,r)$ into itself, and (iii) $T_y$ is a strict contraction. Completeness holds because $\overline{B}(a,r)$ is a closed subset of the complete space $\mathbb{R}^n$. For the self-mapping property, note that $f(a) = a + \varphi(a)$, so $a = f(a) - \varphi(a)$. For any $x \in \overline{B}(a,r)$, the triangle inequality gives \begin{align*} \|T_y(x) - a\| &= \|y - \varphi(x) - a\| = \|(y - f(a)) + (\varphi(a) - \varphi(x))\| \\ &\leq \|y - f(a)\| + \|\varphi(a) - \varphi(x)\| < \tfrac{r}{2} + \tfrac{1}{2}\|x - a\| \leq \tfrac{r}{2} + \tfrac{r}{2} = r. \end{align*} The first term uses the hypothesis $\|y - f(a)\| < r/2$, and the second uses the $\tfrac{1}{2}$-Lipschitz bound $\|\varphi(a) - \varphi(x)\| \leq \tfrac{1}{2}\|x - a\|$ from the previous step. The contraction property follows directly from the same Lipschitz estimate: \begin{align*} \|T_y(x) - T_y(x')\| = \|(y - \varphi(x)) - (y - \varphi(x'))\| = \|\varphi(x') - \varphi(x)\| \leq \tfrac{1}{2}\|x - x'\|. \end{align*} The [Contraction Mapping Theorem](/theorems/71) now yields a unique fixed point $x_0 \in \overline{B}(a,r)$ satisfying $x_0 = T_y(x_0) = y - \varphi(x_0)$. Rearranging: $f(x_0) = x_0 + \varphi(x_0) = y$. 
The strict inequality $\|T_y(x_0) - a\| < r$ (not merely $\leq r$) from the self-mapping estimate ensures that $x_0 \in B(a,r) = V$, so the fixed point lies in the open ball, not merely in $\overline{B}(a,r)$. The constraint $\|y - f(a)\| < r/2$ determines the size of the image neighbourhood $W \supseteq B(f(a), r/2)$. The factor of $2$ arises from the contraction constant $\lambda = \tfrac{1}{2}$: in the self-mapping estimate, the budget $r$ is split between the displacement $\|y - f(a)\| < r/2$ and the perturbation $\|\varphi(a) - \varphi(x)\| \leq \lambda r = r/2$. With a general contraction constant $\lambda \in (0,1)$, the image ball would have radius $r(1 - \lambda)$. [/guided] [/step] [step:Prove injectivity of $f$ on $V$] Suppose $f(x) = f(x')$ for $x, x' \in V$. Then $x + \varphi(x) = x' + \varphi(x')$, so \begin{align*} \|x - x'\| = \|\varphi(x') - \varphi(x)\| \leq \tfrac{1}{2}\|x - x'\|. \end{align*} This forces $\|x - x'\| = 0$, hence $x = x'$. Therefore $f|_V: V \to W$ is a bijection. [/step] [step:Show that $W = f(V)$ is open] Let $y_0 = f(x_0) \in W$ with $x_0 \in V$. Since $V$ is open, there exists $\delta > 0$ with $B(x_0, \delta) \subset V$. Apply the contraction argument from the surjectivity step with $a$ replaced by $x_0$ and $r$ replaced by $\delta$: the derivative bound $\|D_x\varphi\| \leq \tfrac{1}{2}$ still holds on $\overline{B}(x_0, \delta) \subset \overline{B}(a,r)$, so the same argument shows $B(y_0, \delta/2) \subset f(B(x_0, \delta)) \subset W$. Since every point of $W$ has a neighbourhood contained in $W$, the set $W$ is open. [/step] [step:Prove the inverse $g = (f|_V)^{-1}$ is Lipschitz] For $y = f(x)$ and $y' = f(x')$ in $W$, write $y - y' = (x - x') + (\varphi(x) - \varphi(x'))$ and rearrange using the triangle inequality: \begin{align*} \|x - x'\| &\leq \|y - y'\| + \|\varphi(x) - \varphi(x')\| \leq \|y - y'\| + \tfrac{1}{2}\|x - x'\|. 
\end{align*} Subtracting $\tfrac{1}{2}\|x - x'\|$ from both sides and multiplying by $2$: \begin{align*} \|g(y) - g(y')\| = \|x - x'\| \leq 2\|y - y'\|. \end{align*} Hence $g$ is Lipschitz with constant $2$, and in particular continuous. [/step] [step:Show $g$ is $C^1$ with $D_bg = (D_{g(b)}f)^{-1}$] Fix $b = f(a_0) \in W$ where $a_0 = g(b) \in V$. Set $\beta := (D_{a_0}f)^{-1} \in \mathcal{L}(\mathbb{R}^n)$, which exists since $D_{a_0}f$ is invertible (the derivative bound $\|D_x\varphi\| \leq \tfrac{1}{2}$ implies $\|D_xf - \mathrm{Id}\| \leq \tfrac{1}{2} < 1$, so $D_xf$ is invertible for all $x \in V$ by the Neumann series). For $y \in W$ with $y \neq b$, set $k := y - b$ and $h := g(y) - g(b) = g(y) - a_0$. Then $f(a_0 + h) = b + k$, and differentiability of $f$ at $a_0$ gives \begin{align*} k = D_{a_0}f(h) + \|h\|\,\varepsilon(h), \end{align*} where $\varepsilon$ is defined near $0 \in \mathbb{R}^n$, takes values in $\mathbb{R}^n$, and satisfies $\varepsilon(h) \to 0$ as $h \to 0$. Applying $\beta$: \begin{align*} \beta(k) = h + \beta(\|h\|\,\varepsilon(h)), \end{align*} so the differentiability remainder for $g$ is \begin{align*} g(y) - g(b) - \beta(k) = -\beta(\|h\|\,\varepsilon(h)). \end{align*} Using the Lipschitz bound $\|h\| = \|g(y) - g(b)\| \leq 2\|k\|$ from the previous step: \begin{align*} \frac{\|g(y) - g(b) - \beta(k)\|}{\|k\|} &\leq \frac{\|\beta\| \cdot \|h\| \cdot \|\varepsilon(h)\|}{\|k\|} \leq 2\|\beta\| \cdot \|\varepsilon(h)\| \to 0 \end{align*} as $k \to 0$ (since $h \to 0$ by continuity of $g$, and $\varepsilon(h) \to 0$ by differentiability of $f$). Therefore $g$ is differentiable at $b$ with $D_bg = \beta = (D_{a_0}f)^{-1} = (D_{g(b)}f)^{-1}$. For continuity of $b \mapsto D_bg$: the map $b \mapsto D_{g(b)}f$ is continuous (composition of the continuous maps $g: W \to V$ and $x \mapsto D_xf: V \to \mathcal{L}(\mathbb{R}^n)$, the latter being continuous because $f$ is $C^1$). Matrix inversion $A \mapsto A^{-1}$ is continuous on $\mathrm{GL}_n(\mathbb{R})$ (the set of invertible $n \times n$ matrices). 
Therefore $b \mapsto D_bg = (D_{g(b)}f)^{-1}$ is continuous, and $g$ is $C^1$. [guided] This is the most delicate step. We must show $g$ is differentiable and compute its derivative. The strategy is to use the definition of differentiability directly: show that $g(y) - g(b) - \beta(y - b)$ is $o(\|y - b\|)$ as $y \to b$, where $\beta = (D_{g(b)}f)^{-1}$ is the candidate derivative. The key technical input is the Lipschitz estimate $\|h\| \leq 2\|k\|$ from the previous step. Without this, we could not control the ratio $\|h\|/\|k\|$, and the differentiability argument would fail. Here is the computation in detail. Starting from the differentiability of $f$ at $a_0 = g(b)$: for $h = g(y) - g(b)$ and $k = y - b = f(a_0 + h) - f(a_0)$, \begin{align*} k = D_{a_0}f(h) + \|h\|\,\varepsilon(h), \end{align*} where $\varepsilon(h) \to 0$ as $h \to 0$. Applying $\beta = (D_{a_0}f)^{-1}$ to both sides: \begin{align*} \beta(k) = h + \beta(\|h\|\,\varepsilon(h)). \end{align*} Rearranging: $g(y) - g(b) - \beta(k) = h - \beta(k) = -\beta(\|h\|\,\varepsilon(h))$. Taking norms and dividing by $\|k\|$: \begin{align*} \frac{\|g(y) - g(b) - \beta(k)\|}{\|k\|} = \frac{\|\beta(\|h\|\,\varepsilon(h))\|}{\|k\|} \leq \|\beta\| \cdot \frac{\|h\|}{\|k\|} \cdot \|\varepsilon(h)\|. \end{align*} The Lipschitz bound gives $\|h\|/\|k\| \leq 2$. As $k \to 0$, continuity of $g$ gives $h \to 0$, so $\|\varepsilon(h)\| \to 0$. Therefore the entire expression tends to $0$, confirming $D_bg = \beta = (D_{g(b)}f)^{-1}$. Why is $D_bg$ continuous in $b$? We need the composition $b \mapsto g(b) \mapsto D_{g(b)}f \mapsto (D_{g(b)}f)^{-1}$ to be continuous. The first arrow is continuous (just proved). The second is continuous because $f$ is $C^1$. 
The third -- matrix inversion -- is continuous on $\mathrm{GL}_n(\mathbb{R})$, which is an open subset of $\mathbb{R}^{n \times n}$, and $D_{g(b)}f$ remains in $\mathrm{GL}_n(\mathbb{R})$ for all $b \in W$ because $\|D_x\varphi\| \leq \tfrac{1}{2} < 1$ implies $D_xf = \mathrm{Id} + D_x\varphi$ is invertible (the Neumann series $\sum_{j=0}^\infty (-D_x\varphi)^j$ converges to $(D_xf)^{-1}$). Therefore $b \mapsto D_bg$ is continuous and $g$ is $C^1$. [/guided] [/step]
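
The fixed-point construction in the surjectivity step can be tried numerically. The following is a minimal sketch, not part of the proof: it assumes a hypothetical example map $f(x_1, x_2) = (x_1 + \tfrac{1}{4}x_2^2,\; x_2 + \tfrac{1}{4}x_1^2)$ on $\mathbb{R}^2$ with $a = 0$, for which $D_af = \mathrm{Id}$ and $\|D_x\varphi\| \leq \tfrac{1}{2}$ on $\overline{B}(0,1)$, so $r = 1$ works. The names `phi`, `f` and `local_inverse` are illustrative choices. The code iterates $T_y(x) = y - \varphi(x)$ from $x = a$ for a target $y$ with $\|y - f(a)\| < r/2$.

```python
import math

def phi(x):
    """Perturbation phi(x) = f(x) - x for the assumed example map."""
    x1, x2 = x
    return (0.25 * x2 * x2, 0.25 * x1 * x1)

def f(x):
    """Example map f(x) = x + phi(x); its derivative at the origin is Id."""
    p = phi(x)
    return (x[0] + p[0], x[1] + p[1])

def local_inverse(y, a=(0.0, 0.0), tol=1e-14, max_iter=200):
    """Approximate g(y) by iterating T_y(x) = y - phi(x), starting at a."""
    x = a
    for _ in range(max_iter):
        p = phi(x)
        x_next = (y[0] - p[0], y[1] - p[1])  # one application of T_y
        # Successive steps at least halve, since T_y is a 1/2-contraction.
        if math.hypot(x_next[0] - x[0], x_next[1] - x[1]) <= tol:
            return x_next
        x = x_next
    raise RuntimeError("fixed-point iteration did not converge")

# Target with |y - f(a)| < r/2 = 1/2, so the proof guarantees a solution in B(0, 1).
y = (0.3, -0.2)
x0 = local_inverse(y)
residual = math.hypot(f(x0)[0] - y[0], f(x0)[1] - y[1])
```

Here `residual` measures $\|f(x_0) - y\|$ for the computed point. For a map whose derivative at $a$ is not the identity, one would first apply the normalisation $f \mapsto \alpha^{-1} \circ f$ from the first step before iterating.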

Inverse Function Theorem (Theorem # 51)
