[proofplan]
The proof is a coordinate computation. We use the assumed functions $h,L_fh,\dots,L_f^{r-1}h,\psi_1,\dots,\psi_{n-r}$ to build the coordinate map $\Phi$, and the [inverse function theorem](/theorems/51) turns the invertibility of $D\Phi_{x_0}$ into a local chart. In that chart, the first $r-1$ external equations form an integrator chain because $L_gL_f^{k}h=0$ for $k\in\{0,\dots,r-2\}$, the final external equation has a nonzero input coefficient because $L_gL_f^{r-1}h\neq 0$, and the internal coordinates are input-free because $L_g\psi_j=0$. Substituting the static feedback then replaces the last external equation by $\dot{\xi}_r=v$, which, combined with the integrator chain, is equivalent to $\xi_1^{(r)}=v$.
[/proofplan]
[step:Use the inverse function theorem to obtain the coordinate chart]
Define the smooth map
\begin{align*}
\Phi: U \to \mathbb{R}^n, \quad x \mapsto \bigl(h(x),L_fh(x),\dots,L_f^{r-1}h(x),\psi_1(x),\dots,\psi_{n-r}(x)\bigr).
\end{align*}
By hypothesis, the derivative $D\Phi_{x_0}:\mathbb{R}^n \to \mathbb{R}^n$ is invertible. By the inverse function theorem, there exist an open neighbourhood $V \subset U$ of $x_0$ and an [open set](/page/Open%20Set) $\Omega \subset \mathbb{R}^n$ containing $\Phi(x_0)$ such that
\begin{align*}
\Phi|_V: V \to \Omega
\end{align*}
is a smooth diffeomorphism. Replace $U$ by $V$ and write
\begin{align*}
\Theta: \Omega \to V
\end{align*}
for the inverse map $\Theta=(\Phi|_V)^{-1}$.
For $z=(\xi,\eta)\in \Omega$, where $\xi=(\xi_1,\dots,\xi_r)\in \mathbb{R}^r$ and $\eta=(\eta_1,\dots,\eta_{n-r})\in \mathbb{R}^{n-r}$, and for the corresponding point $x=\Theta(z)\in V$, the coordinates are therefore
\begin{align*}
\xi_i=L_f^{i-1}h(x) \quad \text{for } i \in \{1,\dots,r\}, \qquad \eta_j=\psi_j(x) \quad \text{for } j \in \{1,\dots,n-r\}.
\end{align*}
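As a hedged illustration (the system below is an assumption chosen for this example, not part of the theorem), take $n=3$, $r=2$, $h(x)=x_1$, $f(x)=(x_2,x_1^2,-x_3)$ and $g(x)=(0,1+x_1^2,1+x_1^2)$, with completion function $\psi_1(x)=x_3-x_2$. Then $L_gh=0$, $L_gL_fh=1+x_1^2\neq 0$ and $L_g\psi_1=0$, and the coordinate map and its inverse can be written explicitly:
\begin{align*}
\Phi(x)=(x_1,x_2,x_3-x_2), \qquad D\Phi_x=\begin{pmatrix}1&0&0\\0&1&0\\0&-1&1\end{pmatrix}, \qquad \Theta(\xi_1,\xi_2,\eta_1)=(\xi_1,\xi_2,\eta_1+\xi_2).
\end{align*}
Here $\det D\Phi_x=1$ at every point, so in this particular example the chart happens to be global, with $V=\Omega=\mathbb{R}^3$; in general the inverse function theorem only provides a local chart.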
[/step]
[step:Compute the integrator chain for the first external coordinates]
Let $I \subset \mathbb{R}$ be an interval, let $u:I \to \mathbb{R}$ be an admissible input, and let
\begin{align*}
x:I \to V
\end{align*}
be a differentiable trajectory satisfying $\dot{x}(t)=f(x(t))+g(x(t))u(t)$. For each $i \in \{1,\dots,r\}$, define
\begin{align*}
\xi_i:I \to \mathbb{R}, \quad t \mapsto L_f^{i-1}h(x(t)).
\end{align*}
Fix $i \in \{1,\dots,r-1\}$. Applying the chain rule to the smooth function $L_f^{i-1}h:V \to \mathbb{R}$ gives
\begin{align*}
\dot{\xi}_i(t)=D(L_f^{i-1}h)_{x(t)}(\dot{x}(t)).
\end{align*}
Substituting the control-affine equation and using linearity of the differential,
\begin{align*}
\dot{\xi}_i(t)=D(L_f^{i-1}h)_{x(t)}(f(x(t)))+D(L_f^{i-1}h)_{x(t)}(g(x(t)))u(t).
\end{align*}
By the definition of Lie derivative, this is
\begin{align*}
\dot{\xi}_i(t)=L_f^i h(x(t))+L_gL_f^{i-1}h(x(t))u(t).
\end{align*}
Since $i-1 \in \{0,\dots,r-2\}$, the relative-degree hypothesis gives $L_gL_f^{i-1}h=0$ on $V$. Hence
\begin{align*}
\dot{\xi}_i(t)=L_f^i h(x(t))=\xi_{i+1}(t).
\end{align*}
Thus $\dot{\xi}_i=\xi_{i+1}$ for every $i \in \{1,\dots,r-1\}$.
[guided]
The first goal is to see why the coordinates built from $h$ and its drift Lie derivatives automatically form an integrator chain. Let $I \subset \mathbb{R}$ be an interval, let $u:I \to \mathbb{R}$ be an input, and let
\begin{align*}
x:I \to V
\end{align*}
be a differentiable solution of $\dot{x}(t)=f(x(t))+g(x(t))u(t)$. For each $i \in \{1,\dots,r\}$, the $i$th external coordinate along the trajectory is the real-valued function
\begin{align*}
\xi_i:I \to \mathbb{R}, \quad t \mapsto L_f^{i-1}h(x(t)).
\end{align*}
Fix $i \in \{1,\dots,r-1\}$. The function being differentiated is $L_f^{i-1}h:V \to \mathbb{R}$, and the curve is $x:I \to V$. The chain rule gives
\begin{align*}
\dot{\xi}_i(t)=D(L_f^{i-1}h)_{x(t)}(\dot{x}(t)).
\end{align*}
Now substitute the dynamics of $x$:
\begin{align*}
\dot{\xi}_i(t)=D(L_f^{i-1}h)_{x(t)}(f(x(t))+g(x(t))u(t)).
\end{align*}
Because $D(L_f^{i-1}h)_{x(t)}:\mathbb{R}^n \to \mathbb{R}$ is a [linear map](/page/Linear%20Map), the two vector-field contributions separate:
\begin{align*}
\dot{\xi}_i(t)=D(L_f^{i-1}h)_{x(t)}(f(x(t)))+D(L_f^{i-1}h)_{x(t)}(g(x(t)))u(t).
\end{align*}
By the definition of Lie derivative along $f$ and $g$, this becomes
\begin{align*}
\dot{\xi}_i(t)=L_f^i h(x(t))+L_gL_f^{i-1}h(x(t))u(t).
\end{align*}
This is exactly where the relative-degree hypothesis is used. Since $i\le r-1$, we have $i-1\in \{0,\dots,r-2\}$, and relative degree $r$ says that $L_gL_f^{i-1}h=0$ on the neighbourhood under consideration. Therefore the input term disappears:
\begin{align*}
\dot{\xi}_i(t)=L_f^i h(x(t)).
\end{align*}
But the next coordinate is defined by $\xi_{i+1}(t)=L_f^i h(x(t))$. Hence
\begin{align*}
\dot{\xi}_i(t)=\xi_{i+1}(t).
\end{align*}
Since the index $i$ was arbitrary in $\{1,\dots,r-1\}$, the whole chain $\dot{\xi}_1=\xi_2,\dots,\dot{\xi}_{r-1}=\xi_r$ follows.
[/guided]
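A minimal worked instance, under an assumed example system that is not taken from the theorem: let $n=3$, $h(x)=x_1$, $f(x)=(x_2,x_1^2,-x_3)$ and $g(x)=(0,1+x_1^2,1+x_1^2)$. This system has relative degree $r=2$, since $L_gh=0$ and $L_gL_fh=1+x_1^2\neq 0$, so the chain consists of the single equation for $i=1$:
\begin{align*}
\dot{\xi}_1=L_fh(x)+L_gh(x)\,u=x_2+0\cdot u=x_2=\xi_2.
\end{align*}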
[/step]
[step:Derive the last external equation and check that its input coefficient is nonzero]
For the final external coordinate
\begin{align*}
\xi_r:I \to \mathbb{R}, \quad t \mapsto L_f^{r-1}h(x(t)),
\end{align*}
the same chain-rule computation gives
\begin{align*}
\dot{\xi}_r(t)=L_f^r h(x(t))+L_gL_f^{r-1}h(x(t))u(t).
\end{align*}
Define the smooth functions
\begin{align*}
a:\Omega \to \mathbb{R}, \quad z \mapsto L_f^r h(\Theta(z)),
\end{align*}
and
\begin{align*}
b:\Omega \to \mathbb{R}, \quad z \mapsto L_gL_f^{r-1}h(\Theta(z)).
\end{align*}
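Since $\Theta(\Phi(x))=x$ for every $x\in V$, we have $L_f^rh(x)=a(\Phi(x))$ and $L_gL_f^{r-1}h(x)=b(\Phi(x))$. Writing $z=(\xi,\eta)=\Phi(x)$, the last external equation becomes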
\begin{align*}
\dot{\xi}_r=a(\xi,\eta)+b(\xi,\eta)u.
\end{align*}
The relative-degree hypothesis gives $L_gL_f^{r-1}h(x)\neq 0$ for every $x\in V$. Since $\Theta(\Omega)=V$, it follows that
\begin{align*}
b(\xi,\eta)\neq 0 \quad \text{for every } (\xi,\eta)\in \Omega.
\end{align*}
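To make $a$ and $b$ concrete, consider the following sketch for an assumed example system (chosen for illustration only): $n=3$, $r=2$, $h(x)=x_1$, $f(x)=(x_2,x_1^2,-x_3)$, $g(x)=(0,1+x_1^2,1+x_1^2)$, with inverse chart $\Theta(\xi_1,\xi_2,\eta_1)=(\xi_1,\xi_2,\eta_1+\xi_2)$. Since $L_f^2h(x)=x_1^2$ and $L_gL_fh(x)=1+x_1^2$, composing with $\Theta$ gives
\begin{align*}
a(\xi,\eta)=\xi_1^2, \qquad b(\xi,\eta)=1+\xi_1^2\geq 1, \qquad \dot{\xi}_2=\xi_1^2+(1+\xi_1^2)\,u.
\end{align*}
This agrees with the direct computation $\dot{x}_2=x_1^2+(1+x_1^2)u$, and $b$ is nonzero on all of $\Omega$ in this example.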
[/step]
[step:Use input-invariant completion coordinates to obtain the internal dynamics]
For each $j\in \{1,\dots,n-r\}$, define
\begin{align*}
\eta_j:I \to \mathbb{R}, \quad t \mapsto \psi_j(x(t)).
\end{align*}
Applying the chain rule to $\psi_j:V\to \mathbb{R}$ gives
\begin{align*}
\dot{\eta}_j(t)=D(\psi_j)_{x(t)}(\dot{x}(t)).
\end{align*}
Substituting $\dot{x}(t)=f(x(t))+g(x(t))u(t)$ and using the definition of Lie derivative,
\begin{align*}
\dot{\eta}_j(t)=L_f\psi_j(x(t))+L_g\psi_j(x(t))u(t).
\end{align*}
By hypothesis, $L_g\psi_j=0$ on $V$, so
\begin{align*}
\dot{\eta}_j(t)=L_f\psi_j(x(t)).
\end{align*}
Define the smooth map
\begin{align*}
q:\Omega \to \mathbb{R}^{n-r}, \quad z \mapsto \bigl(L_f\psi_1(\Theta(z)),\dots,L_f\psi_{n-r}(\Theta(z))\bigr).
\end{align*}
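Since $\Theta(\Phi(x))=x$ for every $x\in V$, the $j$th component of $q(\Phi(x(t)))$ equals $L_f\psi_j(x(t))$. Hence, with $z=(\xi,\eta)$, the internal coordinates satisfy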
\begin{align*}
\dot{\eta}=q(\xi,\eta).
\end{align*}
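As a hedged example of the internal dynamics, assume the illustrative system $n=3$, $r=2$, $h(x)=x_1$, $f(x)=(x_2,x_1^2,-x_3)$, $g(x)=(0,1+x_1^2,1+x_1^2)$ with completion function $\psi_1(x)=x_3-x_2$ (none of these come from the theorem itself). Then $L_g\psi_1=(1+x_1^2)-(1+x_1^2)=0$, and with $x_1=\xi_1$, $x_2=\xi_2$, $x_3=\eta_1+\xi_2$ we obtain
\begin{align*}
L_f\psi_1(x)=-x_1^2-x_3, \qquad q(\xi,\eta)=-\eta_1-\xi_2-\xi_1^2, \qquad \dot{\eta}_1=-\eta_1-\xi_2-\xi_1^2.
\end{align*}
The same result follows directly from $\dot{\eta}_1=\dot{x}_3-\dot{x}_2$, in which the two input terms $(1+x_1^2)u$ cancel.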
[/step]
[step:Recover the output equation and apply the feedback linearizing input]
Since the first coordinate is defined by $\xi_1=h(x)$ and the output is $y=h(x)$, we have
\begin{align*}
y=\xi_1.
\end{align*}
Combining the preceding coordinate equations gives the normal form
\begin{align*}
\dot{\xi}_i&=\xi_{i+1} \quad \text{for } i\in\{1,\dots,r-1\},\\
\dot{\xi}_r&=a(\xi,\eta)+b(\xi,\eta)u, \qquad \dot{\eta}=q(\xi,\eta), \qquad y=\xi_1.
\end{align*}
Because $b(\xi,\eta)\neq 0$ on $\Omega$, the feedback law
\begin{align*}
u=\frac{v-a(\xi,\eta)}{b(\xi,\eta)}
\end{align*}
is well-defined for every $(\xi,\eta)\in \Omega$. Substituting it into the last external equation gives
\begin{align*}
\dot{\xi}_r=a(\xi,\eta)+b(\xi,\eta)\frac{v-a(\xi,\eta)}{b(\xi,\eta)}=v.
\end{align*}
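The chain $\dot{\xi}_1=\xi_2,\dots,\dot{\xi}_{r-1}=\xi_r$ gives $\xi_1^{(k)}=\xi_{k+1}$ for every $k\in\{1,\dots,r-1\}$ by induction on $k$. Differentiating $\xi_1^{(r-1)}=\xi_r$ once more and using $\dot{\xi}_r=v$, this implies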
\begin{align*}
\xi_1^{(r)}=v.
\end{align*}
The map $q$ does not depend on $u$, so the substitution leaves the internal equation unchanged; the new input $v$ affects $\eta$ only through the dependence of $q$ on $\xi$. The internal dynamics therefore remain
\begin{align*}
\dot{\eta}=q(\xi,\eta).
\end{align*}
This proves the Byrnes-Isidori normal form and the stated feedback linearization of the external subsystem.
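For a concrete reading of the feedback law, consider the assumed example system $n=3$, $r=2$, $h(x)=x_1$, $f(x)=(x_2,x_1^2,-x_3)$, $g(x)=(0,1+x_1^2,1+x_1^2)$, $\psi_1(x)=x_3-x_2$ (an illustration only, not part of the theorem). For it, $a(\xi,\eta)=\xi_1^2$, $b(\xi,\eta)=1+\xi_1^2$ and $q(\xi,\eta)=-\eta_1-\xi_2-\xi_1^2$, so the feedback and the resulting closed loop read
\begin{align*}
u&=\frac{v-\xi_1^2}{1+\xi_1^2}=\frac{v-x_1^2}{1+x_1^2},\\
\dot{\xi}_1&=\xi_2, \qquad \dot{\xi}_2=v, \qquad \dot{\eta}_1=-\eta_1-\xi_2-\xi_1^2, \qquad y=\xi_1.
\end{align*}
In particular $\ddot{y}=v$, which is the case $r=2$ of $\xi_1^{(r)}=v$.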
[/step]