Characteristic System for First-Order PDEs (Theorem # 49)
Theorem
Let $F: \mathbb{R}^n \times \mathbb{R} \times \mathbb{R}^n \to \mathbb{R}$ be smooth, where we write $F = F(a, b, c)$ with $a \in \mathbb{R}^n$ the gradient slot, $b \in \mathbb{R}$ the value slot, and $c \in \mathbb{R}^n$ the position slot. Let $\Omega \subseteq \mathbb{R}^n$ be open and suppose $u \in C^2(\Omega)$ satisfies $F(\nabla u(x), u(x), x) = 0$ for every $x \in \Omega$.
For any smooth curve $X: A \to \Omega$ defined on an interval $A \subseteq \mathbb{R}$, define the tracked quantities:
\begin{align*}
U: A &\to \mathbb{R}, \quad U(s) := u(X(s)), \\
G: A &\to \mathbb{R}^n, \quad G(s) := \nabla u(X(s)),
\end{align*}
and set $T(s) := (G(s), U(s), X(s))$. If $X$ is chosen so that
\begin{align*}
\dot{X}_j(s) = \frac{\partial F}{\partial a_j}\bigg|_{T(s)}, \quad j = 1, \dots, n,
\end{align*}
then the triple $T(s) = (G(s), U(s), X(s))$ satisfies the **characteristic ODE system**:
\begin{align*}
\dot{X}_j(s) &= \frac{\partial F}{\partial a_j}\bigg|_{T(s)}, \\[4pt]
\dot{G}_i(s) &= -\frac{\partial F}{\partial c_i}\bigg|_{T(s)} - G_i(s)\frac{\partial F}{\partial b}\bigg|_{T(s)}, \\[4pt]
\dot{U}(s) &= \sum_{j=1}^n G_j(s)\frac{\partial F}{\partial a_j}\bigg|_{T(s)},
\end{align*}
for each $i, j = 1, \dots, n$.
Discussion
Consider a nonlinear first-order PDE $F(\nabla u, u, x) = 0$ with $F$ smooth and $u \in C^2(\Omega)$ a classical solution. If a curve $X(s)$ satisfies $\dot{X}_j = \partial_{a_j} F|_{T(s)}$, then the triple $T(s) = (\nabla u(X(s)), u(X(s)), X(s))$ satisfies the closed characteristic ODE system $\dot{X}_j = \partial_{a_j} F|_T$, $\dot{G}_i = -\partial_{c_i} F|_T - G_i \, \partial_b F|_T$, and $\dot{U} = \sum_j G_j \, \partial_{a_j} F|_T$. The key step eliminates the second derivatives of $u$ by differentiating the PDE identity with respect to $x_i$.
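As an illustration, consider the eikonal equation $|\nabla u|^2 = 1$. This choice of $F$ is an assumption made here for the sake of example and is not part of the theorem. Taking $F(a, b, c) = \sum_{j=1}^n a_j^2 - 1$ gives $\partial F/\partial a_j = 2a_j$, $\partial F/\partial b = 0$, and $\partial F/\partial c_i = 0$, so the characteristic system reduces to
\begin{align*}
\dot{X}_j(s) &= 2G_j(s), \\
\dot{G}_i(s) &= 0, \\
\dot{U}(s) &= 2\sum_{j=1}^n G_j(s)^2 = 2,
\end{align*}
where the last equality uses $F(T(s)) = 0$, that is, $\sum_j G_j(s)^2 = 1$. Hence $G$ is constant along each characteristic, $X(s) = X(0) + 2s\,G(0)$ is a straight line for as long as it stays in $\Omega$, and $U(s) = U(0) + 2s$.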
Proof
[proofplan]
We derive the characteristic ODE system by differentiating two identities: the composition $U(s) = u(X(s))$ and the PDE constraint $F(\nabla u(x), u(x), x) = 0$. The $\dot{U}$ equation follows from a direct application of the chain rule. For the $\dot{G}_i$ equation, we differentiate the PDE identity with respect to $x_i$ to express the second-order term $\sum_j (\partial^2 u / \partial x_i \partial x_j) \cdot (\partial F / \partial a_j)$ in terms of first-order data, and then recognise this sum as $\dot{G}_i(s)$ along the characteristic curve.
[/proofplan]
[step:Derive the $\dot{U}$ equation via the chain rule]
Since $U(s) = u(X(s))$ and $u \in C^2(\Omega)$, the chain rule gives
\begin{align*}
\dot{U}(s) = \sum_{j=1}^n \frac{\partial u}{\partial x_j}(X(s)) \, \dot{X}_j(s) = \sum_{j=1}^n G_j(s) \, \dot{X}_j(s).
\end{align*}
Substituting the characteristic velocity condition $\dot{X}_j(s) = \frac{\partial F}{\partial a_j}\big|_{T(s)}$:
\begin{align*}
\dot{U}(s) = \sum_{j=1}^n G_j(s) \, \frac{\partial F}{\partial a_j}\bigg|_{T(s)}.
\end{align*}
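For instance, under the assumption (made here only for illustration) that $F(a, b, c) = \sum_{j=1}^n v_j(c)\, a_j$ for some smooth vector field $v: \mathbb{R}^n \to \mathbb{R}^n$, the PDE is the linear transport equation $v(x) \cdot \nabla u(x) = 0$ and $\partial F/\partial a_j = v_j(c)$. The formula above then reads
\begin{align*}
\dot{U}(s) = \sum_{j=1}^n G_j(s)\, v_j(X(s)) = F(T(s)) = 0,
\end{align*}
so $u$ is constant along each characteristic curve $\dot{X}(s) = v(X(s))$.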
[/step]
[step:Differentiate the PDE identity $F(\tau(x)) = 0$ with respect to $x_i$]
Define the auxiliary map $\tau: \Omega \to \mathbb{R}^n \times \mathbb{R} \times \mathbb{R}^n$ by $\tau(x) := (\nabla u(x), u(x), x)$, so the [first-order PDE](/page/Imperial%20Theory%20of%20PDEs%3A%20First%20Order%20Equations) reads $F(\tau(x)) = 0$ for all $x \in \Omega$. Since $u \in C^2(\Omega)$ and $F$ is smooth, the composition $F \circ \tau$ is $C^1$. Differentiating the identity $F(\tau(x)) = 0$ with respect to $x_i$ via the chain rule:
\begin{align*}
0 = \frac{\partial}{\partial x_i}\bigl[F(\tau(x))\bigr] = \sum_{j=1}^n \frac{\partial F}{\partial a_j}\bigg|_{\tau(x)} \frac{\partial^2 u}{\partial x_i \partial x_j}(x) + \frac{\partial F}{\partial b}\bigg|_{\tau(x)} \frac{\partial u}{\partial x_i}(x) + \frac{\partial F}{\partial c_i}\bigg|_{\tau(x)}.
\end{align*}
[guided]
The PDE identity $F(\nabla u(x), u(x), x) = 0$ holds for every $x \in \Omega$. Since this is an identity in $x$ (not just at a single point), we may differentiate both sides with respect to $x_i$. The map $\tau(x) = (\nabla u(x), u(x), x)$ has $2n + 1$ component functions of $x$, and $F$ depends on $2n + 1$ variables $(a_1, \ldots, a_n, b, c_1, \ldots, c_n)$.
Applying the multivariable chain rule to the composition $x \mapsto F(\tau(x))$:
\begin{align*}
\frac{\partial}{\partial x_i}\bigl[F(\tau(x))\bigr] &= \sum_{j=1}^n \frac{\partial F}{\partial a_j}\bigg|_{\tau(x)} \cdot \frac{\partial}{\partial x_i}\bigl[\partial_{x_j} u(x)\bigr] + \frac{\partial F}{\partial b}\bigg|_{\tau(x)} \cdot \frac{\partial u}{\partial x_i}(x) + \sum_{k=1}^n \frac{\partial F}{\partial c_k}\bigg|_{\tau(x)} \cdot \frac{\partial x_k}{\partial x_i}.
\end{align*}
The third sum simplifies because $\partial x_k / \partial x_i = \delta_{ki}$, so only the $k = i$ term survives. The first sum uses $\frac{\partial}{\partial x_i}[\partial_{x_j} u] = \frac{\partial^2 u}{\partial x_i \partial x_j}$, which exists and is continuous since $u \in C^2(\Omega)$. Setting the result equal to zero:
\begin{align*}
0 = \sum_{j=1}^n \frac{\partial F}{\partial a_j}\bigg|_{\tau(x)} \frac{\partial^2 u}{\partial x_i \partial x_j}(x) + \frac{\partial F}{\partial b}\bigg|_{\tau(x)} \frac{\partial u}{\partial x_i}(x) + \frac{\partial F}{\partial c_i}\bigg|_{\tau(x)}.
\end{align*}
This identity relates the second derivatives of $u$ to first-order data via $F$, which is the key to eliminating second derivatives from the $\dot{G}_i$ equation.
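A quick sanity check in one dimension, with a sample $F$ chosen here purely for illustration: take $n = 1$ and $F(a, b, c) = a^2 + b$, so the PDE reads $u'(x)^2 + u(x) = 0$. Then $\partial F/\partial a = 2a$, $\partial F/\partial b = 1$, $\partial F/\partial c = 0$, and the differentiated identity becomes
\begin{align*}
0 = 2u'(x)\,u''(x) + u'(x).
\end{align*}
The function $u(x) = -x^2/4$ solves this PDE, and indeed $2 \cdot (-x/2) \cdot (-1/2) + (-x/2) = 0$.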
[/guided]
[/step]
[step:Derive the $\dot{G}_i$ equation by evaluating along $X(s)$]
Since $G_i(s) = \partial_{x_i} u(X(s))$ and $u \in C^2(\Omega)$, the chain rule gives
\begin{align*}
\dot{G}_i(s) = \sum_{j=1}^n \frac{\partial^2 u}{\partial x_i \partial x_j}(X(s)) \, \dot{X}_j(s) = \sum_{j=1}^n \frac{\partial^2 u}{\partial x_i \partial x_j}(X(s)) \, \frac{\partial F}{\partial a_j}\bigg|_{T(s)}.
\end{align*}
Evaluating the differentiated PDE identity from the previous step at $x = X(s)$ and rearranging for the sum involving second derivatives:
\begin{align*}
\sum_{j=1}^n \frac{\partial F}{\partial a_j}\bigg|_{T(s)} \frac{\partial^2 u}{\partial x_i \partial x_j}(X(s)) = -\frac{\partial F}{\partial b}\bigg|_{T(s)} G_i(s) - \frac{\partial F}{\partial c_i}\bigg|_{T(s)}.
\end{align*}
The left-hand side equals $\dot{G}_i(s)$ by the chain-rule computation above (mixed partial derivatives of $u \in C^2(\Omega)$ commute), so
\begin{align*}
\dot{G}_i(s) = -\frac{\partial F}{\partial c_i}\bigg|_{T(s)} - G_i(s) \, \frac{\partial F}{\partial b}\bigg|_{T(s)}.
\end{align*}
[guided]
The $\dot{G}_i$ equation is the most substantial part of the derivation, because naively $\dot{G}_i$ involves the second derivatives $\partial^2 u / \partial x_i \partial x_j$ evaluated at $X(s)$. These are not part of the characteristic data $(G, U, X)$. The whole point of the [method of characteristics](/page/Method%20of%20Characteristics) is that these second derivatives can be eliminated using the PDE.
From the chain rule applied to $G_i(s) = \partial_{x_i} u(X(s))$:
\begin{align*}
\dot{G}_i(s) = \sum_{j=1}^n \frac{\partial^2 u}{\partial x_i \partial x_j}(X(s)) \, \dot{X}_j(s).
\end{align*}
Substituting the characteristic velocity $\dot{X}_j(s) = \frac{\partial F}{\partial a_j}\big|_{T(s)}$:
\begin{align*}
\dot{G}_i(s) = \sum_{j=1}^n \frac{\partial^2 u}{\partial x_i \partial x_j}(X(s)) \, \frac{\partial F}{\partial a_j}\bigg|_{T(s)}.
\end{align*}
Now look at the differentiated PDE identity evaluated at $x = X(s)$. Note that $\tau(X(s)) = (\nabla u(X(s)), u(X(s)), X(s)) = (G(s), U(s), X(s)) = T(s)$. The identity reads
\begin{align*}
\sum_{j=1}^n \frac{\partial F}{\partial a_j}\bigg|_{T(s)} \frac{\partial^2 u}{\partial x_i \partial x_j}(X(s)) + \frac{\partial F}{\partial b}\bigg|_{T(s)} G_i(s) + \frac{\partial F}{\partial c_i}\bigg|_{T(s)} = 0,
\end{align*}
where we used $\frac{\partial u}{\partial x_i}(X(s)) = G_i(s)$ by definition. Solving for the sum on the left:
\begin{align*}
\sum_{j=1}^n \frac{\partial F}{\partial a_j}\bigg|_{T(s)} \frac{\partial^2 u}{\partial x_i \partial x_j}(X(s)) = -\frac{\partial F}{\partial b}\bigg|_{T(s)} G_i(s) - \frac{\partial F}{\partial c_i}\bigg|_{T(s)}.
\end{align*}
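The left-hand side is precisely $\dot{G}_i(s)$ (the order of differentiation in $\partial^2 u / \partial x_i \partial x_j$ is immaterial because $u \in C^2(\Omega)$), and the right-hand side involves only the characteristic data $T(s) = (G(s), U(s), X(s))$. Therefore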
\begin{align*}
\dot{G}_i(s) = -\frac{\partial F}{\partial c_i}\bigg|_{T(s)} - G_i(s) \, \frac{\partial F}{\partial b}\bigg|_{T(s)}.
\end{align*}
This is the crucial "miracle" of the method of characteristics: the PDE provides exactly the relation needed to close the system. The second derivatives drop out, and the evolution of $(G, U, X)$ depends only on $(G, U, X)$.
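To see the closure in a familiar setting, suppose (as an illustrative assumption, not part of the theorem) that $F(a, b, c) = \tfrac{1}{2}\sum_{j=1}^n a_j^2 + V(c)$ for a smooth potential $V: \mathbb{R}^n \to \mathbb{R}$. Then $\partial F/\partial a_j = a_j$, $\partial F/\partial b = 0$, and $\partial F/\partial c_i = \partial V/\partial c_i$, so
\begin{align*}
\dot{X}_j(s) = G_j(s), \qquad \dot{G}_i(s) = -\frac{\partial V}{\partial c_i}(X(s)).
\end{align*}
These are Hamilton's equations for the Hamiltonian $F$, with $G$ playing the role of momentum; no second derivatives of $u$ appear.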
[/guided]
[/step]
[step:Assemble the three equations into the characteristic ODE system]
Collecting the results, the triple $T(s) = (G(s), U(s), X(s))$ satisfies the autonomous first-order ODE system:
\begin{align*}
\dot{X}_j(s) &= \frac{\partial F}{\partial a_j}\bigg|_{T(s)}, \\
\dot{G}_i(s) &= -\frac{\partial F}{\partial c_i}\bigg|_{T(s)} - G_i(s) \, \frac{\partial F}{\partial b}\bigg|_{T(s)}, \\
\dot{U}(s) &= \sum_{j=1}^n G_j(s) \, \frac{\partial F}{\partial a_j}\bigg|_{T(s)},
\end{align*}
for $i, j = 1, \ldots, n$. The first equation is the defining hypothesis on $X$, and the second and third are derived consequences. The system is closed: the right-hand sides depend only on $T(s)$, not on second derivatives of $u$.
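The assembled system can be checked against an explicit solution. As a sketch with a sample equation chosen here for illustration, take $n = 1$, $F(a, b, c) = a^2 + b$, and $u(x) = -x^2/4$, which satisfies $u'(x)^2 + u(x) = 0$ on $\Omega = \mathbb{R}$. The system becomes $\dot{X} = 2G$, $\dot{G} = -G$, $\dot{U} = 2G^2$. With $X(0) = x_0$ and $G(0) = u'(x_0) = -x_0/2$, the curve $X(s) = x_0 e^{-s}$ gives
\begin{align*}
G(s) &= u'(X(s)) = -\tfrac{1}{2} x_0 e^{-s}, & \dot{X}(s) &= -x_0 e^{-s} = 2G(s), \\
U(s) &= u(X(s)) = -\tfrac{1}{4} x_0^2 e^{-2s}, & \dot{G}(s) &= \tfrac{1}{2} x_0 e^{-s} = -G(s), \\
& & \dot{U}(s) &= \tfrac{1}{2} x_0^2 e^{-2s} = 2G(s)^2,
\end{align*}
in agreement with all three equations.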
[/step]