[guided]We prove the immersion statement directly from the linear algebra of the phase equations. Fix a point $(x,\theta)\in C_\phi$ and take a tangent vector
\begin{align*}
v\in T_{(x,\theta)}C_\phi
\end{align*}
such that
\begin{align*}
d(j_\phi)_{(x,\theta)}(v)=0.
\end{align*}
The goal is to prove $v=0$.
Work in a coordinate chart $(U,\kappa)$ on $X$ near $x$, with coordinates $x_1,\dots,x_n$, together with the standard coordinates $\theta_1,\dots,\theta_N$ on $\mathbb{R}^N_0$. In these coordinates, write
\begin{align*}
v=(\delta x,\delta\theta)\in\mathbb{R}^n\times\mathbb{R}^N.
\end{align*}
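In the induced coordinates on $T^*X$, the map $j_\phi$ has two components: the base component $x$ and the fiber component $d_x\phi(x,\theta)$, whose coordinates are $\partial_{x_1}\phi,\dots,\partial_{x_n}\phi$. The base component of $d(j_\phi)_{(x,\theta)}(v)$ is $\delta x$, so the equality $d(j_\phi)_{(x,\theta)}(v)=0$ first forces the base variation to vanish: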
\begin{align*}
\delta x=0.
\end{align*}
The same equality also says that the first variation of each fiber coordinate $\partial_{x_i}\phi$ vanishes:
\begin{align*}
d_{x,\theta}(\partial_{x_i}\phi)(x,\theta)(v)=0
\end{align*}
for every $1\leq i\leq n$.
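For reference, the two conditions above can be written out together in the chart coordinates. The following is only a sketch: it assumes that $\phi$ is smooth, that the fiber coordinates on $T^*X$ are those dual to $x_1,\dots,x_n$, and that the differential is computed by differentiating $(x,\theta)\mapsto(x,\partial_x\phi(x,\theta))$ on the ambient space and then restricting to $v\in T_{(x,\theta)}C_\phi$. The component names $\delta x_j$ and $\delta\theta_b$ for the entries of $v$ are notation introduced here.
\begin{align*}
d(j_\phi)_{(x,\theta)}(v)=\Bigl(\delta x,\ \Bigl(\sum_{j=1}^n\partial_{x_j}\partial_{x_i}\phi\,\delta x_j+\sum_{b=1}^N\partial_{\theta_b}\partial_{x_i}\phi\,\delta\theta_b\Bigr)_{1\leq i\leq n}\Bigr)
\end{align*}
All second derivatives are evaluated at $(x,\theta)$. Setting the first slot to zero gives $\delta x=0$, and setting the second slot to zero gives the $n$ equations $d_{x,\theta}(\partial_{x_i}\phi)(x,\theta)(v)=0$.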
Now we use that $v$ is tangent to the critical set. Since
\begin{align*}
C_\phi=F^{-1}(\{0\}),\quad F=(\partial_{\theta_1}\phi,\dots,\partial_{\theta_N}\phi),
\end{align*}
a tangent vector to $C_\phi$ lies in the kernel of $dF_{(x,\theta)}$. Thus
\begin{align*}
d_{x,\theta}(\partial_{\theta_a}\phi)(x,\theta)(v)=0
\end{align*}
for every $1\leq a\leq N$.
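In chart coordinates the tangency condition becomes a linear system. This is a sketch under the assumption that $\phi$ is smooth; the component names $\delta x_j$ and $\delta\theta_b$ for the entries of $v=(\delta x,\delta\theta)$ are introduced here only for illustration.
\begin{align*}
\sum_{j=1}^n\partial_{x_j}\partial_{\theta_a}\phi\,\delta x_j+\sum_{b=1}^N\partial_{\theta_b}\partial_{\theta_a}\phi\,\delta\theta_b=0,\qquad 1\leq a\leq N,
\end{align*}
with all second derivatives evaluated at $(x,\theta)$. This is the coordinate form of $dF_{(x,\theta)}(v)=0$.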
At this point the vanishing of the base variation matters. Since $\delta x=0$, the vector $v$ is purely vertical: $v=(0,\delta\theta)$ with $\delta\theta=(\delta\theta_1,\dots,\delta\theta_N)\in\mathbb{R}^N$. The equations coming from $d(j_\phi)_{(x,\theta)}(v)=0$ then read $\sum_{b=1}^N\partial_{\theta_b}\partial_{x_i}\phi\,\delta\theta_b=0$ for every $i$, while the tangent equations read $\sum_{b=1}^N\partial_{\theta_b}\partial_{\theta_a}\phi\,\delta\theta_b=0$ for every $a$, with all second derivatives evaluated at $(x,\theta)$. By the symmetry of second derivatives, these are exactly the components of the statement
\begin{align*}
(dF_{(x,\theta)})^\top(\delta\theta)=0.
\end{align*}
Here $(dF_{(x,\theta)})^\top:\mathbb{R}^N\to T_{(x,\theta)}^*\Omega$ is the transpose of the surjective [linear map](/page/Linear%20Map) $dF_{(x,\theta)}:T_{(x,\theta)}\Omega\to\mathbb{R}^N$.
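To see why the two families of equations assemble into the transpose, it may help to expand $(dF_{(x,\theta)})^\top$ in a coordinate basis. This is a sketch under stated assumptions: $\phi$ is smooth (so second derivatives commute), $\mathbb{R}^N$ is identified with its dual via the standard basis, and $dx_1,\dots,dx_n,d\theta_1,\dots,d\theta_N$ denotes the coordinate basis of $T^*_{(x,\theta)}\Omega$. For $w=(w_1,\dots,w_N)\in\mathbb{R}^N$,
\begin{align*}
(dF_{(x,\theta)})^\top(w)&=\sum_{b=1}^N w_b\,d(\partial_{\theta_b}\phi)_{(x,\theta)}\\
&=\sum_{i=1}^n\Bigl(\sum_{b=1}^N\partial_{x_i}\partial_{\theta_b}\phi\,w_b\Bigr)dx_i+\sum_{a=1}^N\Bigl(\sum_{b=1}^N\partial_{\theta_a}\partial_{\theta_b}\phi\,w_b\Bigr)d\theta_a.
\end{align*}
With $w=\delta\theta$, the $dx_i$-coefficients are the left-hand sides of the equations coming from $d(j_\phi)(v)=0$ when $\delta x=0$, and the $d\theta_a$-coefficients are the left-hand sides of the tangent equations.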
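The nondegeneracy hypothesis says that the covectors $d(\partial_{\theta_1}\phi),\dots,d(\partial_{\theta_N}\phi)$, which are the components of $dF_{(x,\theta)}$, are linearly independent at $(x,\theta)$. Equivalently, $dF_{(x,\theta)}$ is surjective. A surjective linear map $A$ has injective transpose: if $A^\top w=0$, then $\langle w,Au\rangle=\langle A^\top w,u\rangle=0$ for every $u$, and surjectivity of $A$ gives $w=0$. Hence the equation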
\begin{align*}
(dF_{(x,\theta)})^\top(\delta\theta)=0
\end{align*}
implies $\delta\theta=0$. Since also $\delta x=0$, we get $v=0$. Therefore the kernel of $d(j_\phi)_{(x,\theta)}$ is zero at every point, which proves that $j_\phi$ is an immersion.[/guided]