Implicit Function Theorem (Theorem # 52)
Theorem
Let $U \subseteq \mathbb{R}^{n+m}$ be an [open set](/page/Open%20Set) and let $F: U \to \mathbb{R}^m$ be a [continuously](/page/Continuity) [differentiable](/page/Derivative) map ($C^1$). We identify $\mathbb{R}^{n+m}$ with the product space $\mathbb{R}^n \times \mathbb{R}^m$. For any point $z \in U$, we write $z = (x, y)$ where $x \in \mathbb{R}^n$ corresponds to the first $n$ coordinates and $y \in \mathbb{R}^m$ corresponds to the last $m$ coordinates. Let $(x_0, y_0) \in U$ be a point satisfying:
1. Zero Condition: $F(x_0, y_0) = 0$.
2. Invertibility Condition: Let $L = D F_{(x_0, y_0)}: \mathbb{R}^{n+m} \to \mathbb{R}^m$ be the total derivative. We define the partial [linear map](/page/Linear%20Map) $D_y F_{(x_0, y_0)}: \mathbb{R}^m \to \mathbb{R}^m$ by restricting $L$ to the subspace $\{0\} \times \mathbb{R}^m$. We require this restriction to be an isomorphism (i.e., invertible).
Then, there exist open neighborhoods $V \subseteq \mathbb{R}^n$ of $x_0$ and $W \subseteq \mathbb{R}^m$ of $y_0$ (with $V \times W \subseteq U$) such that:
- For every $x \in V$, there exists a unique $y \in W$ such that $F(x, y) = 0$.
- Define $g: V \to W$ by letting $g(x)$ be this unique $y$. Then $g$ is a $C^1$ map, $g(x_0) = y_0$, and $F(x, g(x)) = 0$ for all $x \in V$.
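As an illustration (an example of our own choosing, not part of the theorem statement), take $n = m = 1$, $U = \mathbb{R}^2$, and $F(x, y) = x^2 + y^2 - 1$, whose zero set is the unit circle. At $(x_0, y_0) = (0, 1)$ both hypotheses hold:
\begin{align*}
F(0, 1) = 0, \qquad D_y F_{(x, y)} = 2y, \qquad D_y F_{(0, 1)} = 2 \neq 0.
\end{align*}
One admissible choice of neighborhoods is $V = (-1, 1)$ and $W = (0, \infty)$, with $g(x) = \sqrt{1 - x^2}$. At $(1, 0)$, by contrast, $D_y F_{(1, 0)} = 0$ and the conclusion fails: every neighborhood of $(1, 0)$ contains two solutions $y = \pm\sqrt{1 - x^2}$ for $x$ slightly less than $1$ and none for $x > 1$.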
Discussion
The Implicit Function Theorem guarantees that a system of $m$ equations $F(x, y) = 0$ in $n + m$ unknowns can be locally solved for $m$ of the variables as a $C^1$ function of the remaining $n$ variables, provided the partial derivative $D_y F$ is invertible at the point of interest. It is a cornerstone of nonlinear analysis, underpinning the local structure of level sets, constrained optimization via Lagrange multipliers, and the construction of coordinate charts on smooth manifolds defined by equations. Its proof reduces to the [Inverse Function Theorem](/pages/1026) applied to an auxiliary map that embeds the constraint into an invertible setting.
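A standard consequence, recorded here as a supplement rather than as part of the statement above, is a formula for the derivative of $g$. Differentiating the identity $F(x, g(x)) = 0$ with the chain rule gives, at every $x \in V$ where $D_y F_{(x, g(x))}$ is invertible (which holds for $x$ sufficiently close to $x_0$ by continuity of $DF$):
\begin{align*}
D_x F_{(x, g(x))} + D_y F_{(x, g(x))} \circ Dg_x &= 0, \\
Dg_x &= -\left(D_y F_{(x, g(x))}\right)^{-1} \circ D_x F_{(x, g(x))}.
\end{align*}
For the illustrative choice $F(x, y) = x^2 + y^2 - 1$ near $(0, 1)$, where $g(x) = \sqrt{1 - x^2}$, this reads $g'(x) = -(2y)^{-1}(2x) = -x / \sqrt{1 - x^2}$, which agrees with differentiating $g$ directly.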
Proof
[proofplan]
We reduce the Implicit Function Theorem to the [Inverse Function Theorem](/theorems/51). Define an auxiliary map $\Psi: U \to \mathbb{R}^{n+m}$ by $\Psi(x,y) = (x, F(x,y))$, whose Jacobian at $(x_0, y_0)$ is block-lower-triangular with diagonal blocks $I_n$ and $D_y F_{(x_0, y_0)}$, hence invertible. The Inverse Function Theorem provides a local $C^1$ inverse of $\Psi$ near $(x_0, y_0)$; since $\Psi$ fixes the $x$-coordinates, the second component of this inverse evaluated at $(x, 0)$ yields the desired implicit function $g$.
[/proofplan]
[step:Define the auxiliary map $\Psi: U \to \mathbb{R}^{n+m}$ that embeds $F$ into an invertible setting]
Define the map
\begin{align*}
\Psi: U \subseteq \mathbb{R}^{n+m} &\to \mathbb{R}^{n+m} \\
(x, y) &\mapsto (x, F(x, y)).
\end{align*}
Since $F: U \to \mathbb{R}^m$ is $C^1$ by hypothesis and the projection $(x,y) \mapsto x$ is smooth, the map $\Psi$ is $C^1$.
[guided]
The idea is to reformulate the equation $F(x, y) = 0$ as an inversion problem for a map between spaces of equal dimension. The equation $F(x, y) = 0$ constrains $m$ of the $n + m$ variables, so we expect the solution set to be locally an $n$-dimensional graph. To make this precise, we need a map between $(n+m)$-dimensional spaces that we can invert. The natural choice is to keep the "free" variables $x$ and replace $y$ with the constraint value $F(x,y)$: define
\begin{align*}
\Psi: U \subseteq \mathbb{R}^{n+m} &\to \mathbb{R}^{n+m} \\
(x, y) &\mapsto (x, F(x, y)).
\end{align*}
Since $F: U \to \mathbb{R}^m$ is $C^1$ by hypothesis and the coordinate projection $(x, y) \mapsto x$ is smooth, the composite map $\Psi$ is $C^1$. Notice that $\Psi(x_0, y_0) = (x_0, F(x_0, y_0)) = (x_0, 0)$ by the hypothesis $F(x_0, y_0) = 0$. The level set $\{F = 0\}$ near $(x_0, y_0)$ corresponds under $\Psi$ to points of the form $(x, 0)$, which is why this auxiliary map converts the implicit function problem into an inversion problem.
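For a concrete instance (an illustrative example of our own choosing), let $n = m = 1$, $F(x, y) = x^2 + y^2 - 1$, and $(x_0, y_0) = (0, 1)$. Then
\begin{align*}
\Psi(x, y) = (x,\; x^2 + y^2 - 1), \qquad \Psi(0, 1) = (0, 0),
\end{align*}
and $\Psi$ sends every point $(x, y)$ of the unit circle to the point $(x, 0)$ on the horizontal axis. Solving $F = 0$ for $y$ near $(0, 1)$ therefore amounts to inverting $\Psi$ near $(0, 0)$ and reading off the second coordinate.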
[/guided]
[/step]
[step:Compute $D\Psi_{(x_0, y_0)}$ and verify it is invertible via its block-triangular structure]
Write the variables in $\mathbb{R}^{n+m}$ as $(x, y)$ with $x \in \mathbb{R}^n$ and $y \in \mathbb{R}^m$. The [total derivative](/pages/1) of $\Psi$ at $(x_0, y_0)$ is the linear map $D\Psi_{(x_0, y_0)}: \mathbb{R}^{n+m} \to \mathbb{R}^{n+m}$ whose Jacobian matrix has the block form
\begin{align*}
J\Psi_{(x_0, y_0)} = \begin{pmatrix} I_n & 0 \\ D_x F_{(x_0, y_0)} & D_y F_{(x_0, y_0)} \end{pmatrix} \in \mathbb{R}^{(n+m) \times (n+m)},
\end{align*}
where $D_x F_{(x_0, y_0)}: \mathbb{R}^n \to \mathbb{R}^m$ denotes the partial derivative of $F$ with respect to the first $n$ variables and $D_y F_{(x_0, y_0)}: \mathbb{R}^m \to \mathbb{R}^m$ denotes the partial derivative with respect to the last $m$ variables. This matrix is block-lower-triangular. Its determinant factors as
\begin{align*}
\det J\Psi_{(x_0, y_0)} = \det(I_n) \cdot \det(J(D_y F)_{(x_0, y_0)}) = \det(J(D_y F)_{(x_0, y_0)}) \neq 0,
\end{align*}
since $D_y F_{(x_0, y_0)}$ is invertible by hypothesis. Therefore $D\Psi_{(x_0, y_0)}$ is invertible.
[guided]
To apply the [Inverse Function Theorem](/theorems/51), we must verify that $D\Psi_{(x_0, y_0)}$ is invertible. The first component of $\Psi$ is the identity on $x$, and the second component is $F(x, y)$. How does this affect the Jacobian? The partial derivative of the first component $(x, y) \mapsto x$ with respect to $x$ is $I_n$, and with respect to $y$ is $0$. The partial derivative of the second component $(x, y) \mapsto F(x, y)$ with respect to $x$ is $D_x F_{(x_0, y_0)}$ and with respect to $y$ is $D_y F_{(x_0, y_0)}$. Assembling these into the Jacobian matrix of $\Psi$ at $(x_0, y_0)$:
\begin{align*}
J\Psi_{(x_0, y_0)} = \begin{pmatrix} I_n & 0 \\ D_x F_{(x_0, y_0)} & D_y F_{(x_0, y_0)} \end{pmatrix} \in \mathbb{R}^{(n+m) \times (n+m)}.
\end{align*}
This is a block-lower-triangular matrix. The determinant of a block-triangular matrix is the product of the determinants of its diagonal blocks:
\begin{align*}
\det J\Psi_{(x_0, y_0)} = \det(I_n) \cdot \det(J(D_y F)_{(x_0, y_0)}) = 1 \cdot \det(J(D_y F)_{(x_0, y_0)}).
\end{align*}
The hypothesis that $D_y F_{(x_0, y_0)}: \mathbb{R}^m \to \mathbb{R}^m$ is invertible means precisely that the Jacobian matrix $J(D_y F)_{(x_0, y_0)}$ is nonsingular, so $\det(J(D_y F)_{(x_0, y_0)}) \neq 0$. This is the step where the invertibility hypothesis on $D_y F$ is consumed. Without it, $D\Psi_{(x_0, y_0)}$ could be singular and the Inverse Function Theorem would not apply.
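As a sanity check on a small example (assumed here purely for illustration), take $F(x, y) = x^2 + y^2 - 1$ with $n = m = 1$, so that $\Psi(x, y) = (x, x^2 + y^2 - 1)$. Then
\begin{align*}
J\Psi_{(x, y)} = \begin{pmatrix} 1 & 0 \\ 2x & 2y \end{pmatrix}, \qquad \det J\Psi_{(x, y)} = 2y, \qquad \det J\Psi_{(0, 1)} = 2 \neq 0, \qquad \det J\Psi_{(1, 0)} = 0.
\end{align*}
The off-diagonal entry $2x$ plays no role in the determinant, and invertibility fails exactly where $D_y F = 2y$ vanishes, that is, at the two points $(\pm 1, 0)$ of the circle where the tangent line is vertical.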
[/guided]
[/step]
[step:Apply the Inverse Function Theorem to $\Psi$ to obtain a local $C^1$ inverse]
The map $\Psi: U \to \mathbb{R}^{n+m}$ is $C^1$ and $D\Psi_{(x_0, y_0)}$ is invertible. By the [Inverse Function Theorem](/theorems/51), there exist an open neighborhood $A$ of $(x_0, y_0)$ in $\mathbb{R}^{n+m}$ with $A \subseteq U$ and an [open](/pages/1144) neighborhood $B$ of $\Psi(x_0, y_0) = (x_0, 0)$ in $\mathbb{R}^{n+m}$ such that $\Psi|_A: A \to B$ is a $C^1$ diffeomorphism. Denote the inverse by
\begin{align*}
\Phi := (\Psi|_A)^{-1}: B \to A,
\end{align*}
which is also $C^1$.
[guided]
We now verify the hypotheses of the [Inverse Function Theorem](/theorems/51). The theorem requires:
(i) $\Psi$ is defined on an open set $U \subseteq \mathbb{R}^{n+m}$, which holds by hypothesis;
(ii) $\Psi$ is $C^1$, which we established in the first step;
(iii) $D\Psi_{(x_0, y_0)}$ is invertible, which we verified in the second step.
All three conditions are satisfied. The Inverse Function Theorem therefore provides open sets $A \ni (x_0, y_0)$ and $B \ni \Psi(x_0, y_0) = (x_0, 0)$ with $A \subseteq U$ such that $\Psi|_A: A \to B$ is a bijection, and both $\Psi|_A$ and its inverse
\begin{align*}
\Phi := (\Psi|_A)^{-1}: B \to A
\end{align*}
are $C^1$. The key point: within $A$, the map $\Psi$ is injective, so for each $(x, z) \in B$ there is a unique $(x', y) \in A$ with $\Psi(x', y) = (x, z)$. Since the first component of $\Psi$ is the identity on $x$, this forces $x' = x$, meaning $\Phi$ has the form $\Phi(x, z) = (x, \varphi(x, z))$ for some $C^1$ map $\varphi$.
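In the illustrative case $F(x, y) = x^2 + y^2 - 1$ near $(x_0, y_0) = (0, 1)$ (an example we assume for concreteness), the local inverse can be written down explicitly. One admissible pair of open sets is
\begin{align*}
A = \{(x, y) \in \mathbb{R}^2 : y > 0\}, \qquad B = \{(x, z) \in \mathbb{R}^2 : z > x^2 - 1\},
\end{align*}
and solving $z = x^2 + y^2 - 1$ for $y > 0$ gives
\begin{align*}
\Phi(x, z) = \left(x,\; \sqrt{1 + z - x^2}\right), \qquad \varphi(x, z) = \sqrt{1 + z - x^2}.
\end{align*}
The first component of $\Phi$ is $x$, as the general argument predicts, and $\varphi$ is $C^1$ on $B$ because $1 + z - x^2 > 0$ there.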
[/guided]
[/step]
[step:Extract the implicit function $g$ from the second component of $\Phi$ restricted to $z = 0$]
Write $\Phi(x, z) = (\Phi_1(x, z), \Phi_2(x, z))$ with $\Phi_1: B \to \mathbb{R}^n$ and $\Phi_2: B \to \mathbb{R}^m$. Since $\Psi(x, y) = (x, F(x, y))$, the identity $\Psi(\Phi(x, z)) = (x, z)$ gives $\Phi_1(x, z) = x$ for all $(x, z) \in B$. In particular, $\Phi$ preserves the $x$-coordinate.
Since $B$ is open and contains $(x_0, 0)$, there exist open sets $V \subseteq \mathbb{R}^n$ containing $x_0$ and $W_0 \subseteq \mathbb{R}^m$ containing $0$ with $V \times W_0 \subseteq B$. Define
\begin{align*}
g: V &\to \mathbb{R}^m \\
x &\mapsto \Phi_2(x, 0).
\end{align*}
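The map $g$ is $C^1$ as the composition of $C^1$ maps (the inclusion $x \mapsto (x, 0)$ and $\Phi_2$), and $g(x_0) = \Phi_2(x_0, 0) = y_0$ because $\Phi(x_0, 0) = (x_0, y_0)$. Since $A$ is open and contains $(x_0, y_0)$, there exist open sets $V_1 \subseteq \mathbb{R}^n$ containing $x_0$ and $W \subseteq \mathbb{R}^m$ containing $y_0$ with $V_1 \times W \subseteq A$. By continuity of $g$, after replacing $V$ with the open set $V \cap V_1 \cap g^{-1}(W)$ we may assume $g(V) \subseteq W$ and $V \times W \subseteq A \subseteq U$; the inclusion $V \times W_0 \subseteq B$ is preserved under this shrinking.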
[guided]
We now extract the implicit function. Write $\Phi = (\Phi_1, \Phi_2)$ with $\Phi_1: B \to \mathbb{R}^n$ and $\Phi_2: B \to \mathbb{R}^m$. The defining relation $\Psi(\Phi(x, z)) = (x, z)$ means
\begin{align*}
(\Phi_1(x, z),\; F(\Phi_1(x, z), \Phi_2(x, z))) = (x, z).
\end{align*}
Comparing first components gives $\Phi_1(x, z) = x$. Comparing second components gives $F(x, \Phi_2(x, z)) = z$. Setting $z = 0$:
\begin{align*}
F(x, \Phi_2(x, 0)) = 0 \quad \text{for all } x \text{ such that } (x, 0) \in B.
\end{align*}
Since $B$ is open in $\mathbb{R}^{n+m}$ and contains $(x_0, 0)$, we can find open sets $V \subseteq \mathbb{R}^n$ containing $x_0$ and $W_0 \subseteq \mathbb{R}^m$ containing $0$ with $V \times W_0 \subseteq B$. Define $g: V \to \mathbb{R}^m$ by $g(x) = \Phi_2(x, 0)$. This map is $C^1$ because $\Phi_2$ is $C^1$ (as a component of the $C^1$ inverse $\Phi$) and $(x, 0)$ is a $C^1$ function of $x$. Note that $g(x_0) = \Phi_2(x_0, 0) = y_0$, since $\Phi(x_0, 0) = (\Psi|_A)^{-1}(x_0, 0) = (x_0, y_0)$. Because $A$ is open and contains $(x_0, y_0)$, we can choose an open neighborhood $W$ of $y_0$ and, after shrinking $V$ if necessary, arrange $V \times W \subseteq A \subseteq U$. Because $g$ is [continuous](/pages/1147) and $g(x_0) = y_0 \in W$, shrinking $V$ further gives $g(V) \subseteq W$. Shrinking $V$ does not affect the inclusion $V \times W_0 \subseteq B$.
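To see the construction on a concrete case (an example assumed for illustration), take $F(x, y) = x^2 + y^2 - 1$ near $(0, 1)$, with $A = \{y > 0\}$, $B = \{z > x^2 - 1\}$, and $\Phi(x, z) = (x, \sqrt{1 + z - x^2})$. Then
\begin{align*}
g(x) = \Phi_2(x, 0) = \sqrt{1 - x^2}, \qquad F(x, g(x)) = x^2 + (1 - x^2) - 1 = 0.
\end{align*}
The product requirement $V \times W_0 \subseteq B$ forces $V$ to be smaller than the full interval $(-1, 1)$ on which $g$ is defined. One valid choice is
\begin{align*}
V = \left(-\tfrac{1}{2}, \tfrac{1}{2}\right), \qquad W_0 = \left(-\tfrac{3}{4}, \tfrac{3}{4}\right), \qquad W = (0, \infty),
\end{align*}
since $x^2 - 1 < -\tfrac{3}{4} < z$ for all $(x, z) \in V \times W_0$, while $V \times W \subseteq A$ and $g(V) \subseteq W$.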
[/guided]
[/step]
[step:Verify uniqueness: for each $x \in V$, the point $y = g(x)$ is the only solution to $F(x, y) = 0$ in $W$]
Fix $x \in V$ and suppose $y \in W$ satisfies $F(x, y) = 0$. Then $\Psi(x, y) = (x, 0)$. Since $(x, y) \in V \times W \subseteq A$ and $(x, 0) \in V \times W_0 \subseteq B$, and $\Psi|_A: A \to B$ is injective, we conclude $(x, y) = \Phi(x, 0) = (x, g(x))$, so $y = g(x)$.
[guided]
Fix $x \in V$ and suppose $y \in W$ satisfies $F(x, y) = 0$. We must show $y = g(x)$, i.e., that no other solution exists in $W$. By definition of $\Psi$, we have
\begin{align*}
\Psi(x, y) = (x,\; F(x, y)) = (x,\; 0).
\end{align*}
For the injectivity of $\Psi|_A$ to apply, we need $(x, y) \in A$. This holds because $(x, y) \in V \times W \subseteq A$ by our choice of $V$ and $W$. We also need the image $(x, 0) \in B$: this holds because $(x, 0) \in V \times W_0 \subseteq B$.
Now consider the two points $(x, y)$ and $(x, g(x))$. Both lie in $A$ (since $g(x) \in W$ by construction), and both map under $\Psi$ to the same target:
\begin{align*}
\Psi(x, y) = (x, 0) \quad \text{and} \quad \Psi(x, g(x)) = (x,\; F(x, g(x))) = (x, 0),
\end{align*}
where the second equality uses $F(x, g(x)) = 0$ established in the extraction step. Since $\Psi|_A: A \to B$ is a bijection (provided by the Inverse Function Theorem), the preimage of $(x, 0)$ under $\Psi|_A$ is unique. Therefore $(x, y) = (x, g(x))$, which gives $y = g(x)$. This is the step where the local injectivity from the Inverse Function Theorem converts into uniqueness of the implicit function.
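The role of $W$ is visible in a simple example (chosen here for illustration): for $F(x, y) = x^2 + y^2 - 1$ near $(0, 1)$ with $W = (0, \infty)$ and $g(x) = \sqrt{1 - x^2}$, the full solution set of $F(x, y) = 0$ for fixed $x$ with $|x| < 1$ is
\begin{align*}
\{y \in \mathbb{R} : x^2 + y^2 = 1\} = \left\{\sqrt{1 - x^2},\; -\sqrt{1 - x^2}\right\},
\end{align*}
and only the first of these lies in $W$. Uniqueness is therefore a statement about solutions inside $W$; it would fail if $W$ were enlarged to all of $\mathbb{R}$.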
[/guided]
[/step]
[step:Confirm all conclusions: $g(x_0) = y_0$, $F(x, g(x)) = 0$, and $g$ is $C^1$]
We verify the three conclusions of the theorem:
1. **Initial condition.** $g(x_0) = \Phi_2(x_0, 0) = y_0$, since $\Phi(x_0, 0) = (\Psi|_A)^{-1}(x_0, 0) = (x_0, y_0)$ (using $\Psi(x_0, y_0) = (x_0, 0)$).
2. **Implicit equation.** For every $x \in V$, we have $F(x, g(x)) = 0$ from the identity $F(x, \Phi_2(x, 0)) = 0$ established in the extraction step.
3. **Regularity.** The map $g: V \to W$ is $C^1$ because it is the composition of the smooth inclusion $x \mapsto (x, 0)$ with $\Phi_2$, the second component of the $C^1$ map $\Phi$.
This completes the proof.
[/step]