[proofplan]
The proof is local on $M$. At each point $p \in F^{-1}(\{y\})$, the regular value hypothesis says that $F$ is a submersion at $p$, so the Euclidean constant rank theorem gives coordinates in which $F$ is the coordinate projection onto $n$ coordinates. In those adapted coordinates, the level set $F^{-1}(\{y\})$ is exactly the coordinate slice where those $n$ coordinates vanish, hence is locally modeled on $\mathbb{R}^{m-n}$. These local models are compatible because they are restrictions of smooth coordinate changes on $M$.
[/proofplan]
[step:Discard the empty level set case]
Let
\begin{align*}
S := F^{-1}(\{y\}) \subset M.
\end{align*}
If $S = \varnothing$, then $S$ is an embedded smooth submanifold of $M$ by the standard empty-submanifold convention, and there is nothing further to prove. Hence assume $S \neq \varnothing$.
For every $p \in S$, the regular value hypothesis gives a surjective [linear map](/page/Linear%20Map)
\begin{align*}
dF_p: T_pM \to T_yN.
\end{align*}
Since $\dim T_pM = m$ and $\dim T_yN = n$, surjectivity implies $m \ge n$ by rank-nullity.
[/step]
[step:Choose adapted local coordinates near a point of the level set]
Fix $p \in S$. Choose smooth coordinate charts $(A,\varphi)$ on $M$ with $p \in A$ and $(B,\psi)$ on $N$ with $y \in B$, and shrink $A$ so that $F(A) \subset B$. Define the local representative
\begin{align*}
f: \varphi(A) &\to \psi(B), \\
z &\mapsto \psi\bigl(F(\varphi^{-1}(z))\bigr).
\end{align*}
Since $dF_p$ is surjective, the derivative $Df_{\varphi(p)}: \mathbb{R}^m \to \mathbb{R}^n$ has rank $n$. Hence some $n \times n$ minor of the Jacobian matrix $Jf_{\varphi(p)}$ is nonzero. The corresponding determinant is a smooth real-valued function of $z \in \varphi(A)$, so after shrinking $A$ around $p$ it remains nonzero on $\varphi(A)$. Therefore $Df_z$ has rank at least $n$ for every $z \in \varphi(A)$, and since the codomain is $\mathbb{R}^n$, it has rank exactly $n$ on this neighbourhood.
Now the Euclidean constant rank theorem applies to $f$ on this neighbourhood, because its rank is constantly $n$. It gives open neighbourhoods $U_p \subset M$ of $p$ and $V_y \subset N$ of $y$, with $F(U_p) \subset V_y$, and smooth coordinate charts
\begin{align*}
\theta_p: U_p &\to \theta_p(U_p) \subset \mathbb{R}^m, \\
\rho_y: V_y &\to \rho_y(V_y) \subset \mathbb{R}^n
\end{align*}
such that $\theta_p(p) = 0$, $\rho_y(y) = 0$, and
\begin{align*}
(\rho_y \circ F \circ \theta_p^{-1})(u_1,\dots,u_m)
=
(u_{m-n+1},\dots,u_m)
\end{align*}
for all $(u_1,\dots,u_m) \in \theta_p(U_p)$.
[guided]
Fix $p \in S$, so $F(p) = y$. We want coordinates on $M$ near $p$ and coordinates on $N$ near $y$ in which the equation $F(x) = y$ becomes a coordinate equation. The hypothesis that $y$ is a regular value gives the surjectivity of
\begin{align*}
dF_p: T_pM \to T_yN.
\end{align*}
Thus $F$ has rank $n$ at $p$.
Choose smooth coordinate charts $(A,\varphi)$ on $M$ with $p \in A$ and $(B,\psi)$ on $N$ with $y \in B$. Because $F$ is continuous and $F(p)=y \in B$, shrink $A$ to an open neighbourhood of $p$ such that $F(A) \subset B$. In these charts, define the local representative
\begin{align*}
f: \varphi(A) &\to \psi(B), \\
z &\mapsto \psi\bigl(F(\varphi^{-1}(z))\bigr).
\end{align*}
The derivative $Df_{\varphi(p)}: \mathbb{R}^m \to \mathbb{R}^n$ represents $dF_p: T_pM \to T_yN$ under the coordinate identifications induced by $d\varphi_p$ and $d\psi_y$. Since $dF_p$ is surjective, $Df_{\varphi(p)}$ has rank $n$.
The Euclidean constant rank theorem requires constant rank on a neighbourhood, not just rank $n$ at one point. We verify that hypothesis. Since $Df_{\varphi(p)}$ has rank $n$, some $n \times n$ minor of the Jacobian matrix $Jf_{\varphi(p)}$ is nonzero. Let $\Delta: \varphi(A) \to \mathbb{R}$ denote the determinant of that same minor of $Jf_z$. The entries of $Jf_z$ are smooth functions of $z$, so $\Delta$ is smooth and hence continuous. Since $\Delta(\varphi(p)) \neq 0$, after shrinking $A$ around $p$ we have $\Delta(z) \neq 0$ for every $z \in \varphi(A)$. Therefore $Df_z$ has rank at least $n$ on this smaller neighbourhood. Because the codomain is $\mathbb{R}^n$, no derivative into $\mathbb{R}^n$ can have rank greater than $n$, so $Df_z$ has rank exactly $n$ for every $z \in \varphi(A)$.
Now the Euclidean constant rank theorem applies to the local representative $f$ on this neighbourhood. It provides smaller coordinate neighbourhoods $U_p \subset M$ of $p$ and $V_y \subset N$ of $y$, with $F(U_p) \subset V_y$, and coordinate charts
\begin{align*}
\theta_p: U_p &\to \theta_p(U_p) \subset \mathbb{R}^m, \\
\rho_y: V_y &\to \rho_y(V_y) \subset \mathbb{R}^n
\end{align*}
such that $\theta_p(p) = 0$, $\rho_y(y) = 0$, and the coordinate expression of $F$ is the projection onto the final $n$ coordinates:
\begin{align*}
(\rho_y \circ F \circ \theta_p^{-1})(u_1,\dots,u_m)
=
(u_{m-n+1},\dots,u_m).
\end{align*}
This is the point where regularity is used. Without surjectivity of $dF_p$, we would not obtain a nonzero $n \times n$ minor, so we could not force rank $n$ on a neighbourhood or obtain a codimension-$n$ coordinate model for the level set.
[/guided]
[/step]
[step:Identify the level set as a coordinate slice]
We claim that, after possibly replacing $U_p$ by a smaller neighbourhood of $p$, the chart $\theta_p$ sends $U_p \cap S$ onto the coordinate slice
\begin{align*}
\theta_p(U_p) \cap \bigl(\mathbb{R}^{m-n} \times \{0\}\bigr).
\end{align*}
Indeed, for $q \in U_p$, write
\begin{align*}
\theta_p(q) = (u_1,\dots,u_m).
\end{align*}
Because $\rho_y$ is injective and $\rho_y(y) = 0$, we have
\begin{align*}
q \in S
&\iff F(q) = y \\
&\iff \rho_y(F(q)) = 0 \\
&\iff (\rho_y \circ F \circ \theta_p^{-1})(u_1,\dots,u_m) = 0 \\
&\iff (u_{m-n+1},\dots,u_m) = 0.
\end{align*}
Thus
\begin{align*}
\theta_p(U_p \cap S)
=
\theta_p(U_p) \cap \bigl(\mathbb{R}^{m-n} \times \{0\}\bigr).
\end{align*}
[guided]
Let $q \in U_p$, and write its adapted coordinate representation as
\begin{align*}
\theta_p(q) = (u_1,\dots,u_m).
\end{align*}
We now translate the equation $q \in S$ into these coordinates. Since $S = F^{-1}(\{y\})$, the condition $q \in S$ is exactly $F(q) = y$. Applying the coordinate chart $\rho_y$ on $N$ is legitimate because $F(U_p) \subset V_y$, and $\rho_y$ is injective on $V_y$. Since $\rho_y(y) = 0$, we get the equivalences
\begin{align*}
q \in S
&\iff F(q) = y \\
&\iff \rho_y(F(q)) = \rho_y(y) \\
&\iff \rho_y(F(q)) = 0.
\end{align*}
Using the adapted coordinate formula from the previous step,
\begin{align*}
\rho_y(F(q))
=
(\rho_y \circ F \circ \theta_p^{-1})(u_1,\dots,u_m)
=
(u_{m-n+1},\dots,u_m).
\end{align*}
Therefore
\begin{align*}
q \in S
\iff
(u_{m-n+1},\dots,u_m) = 0.
\end{align*}
So the level set is locally the coordinate slice where the final $n$ coordinates vanish:
\begin{align*}
\theta_p(U_p \cap S)
=
\theta_p(U_p) \cap \bigl(\mathbb{R}^{m-n} \times \{0\}\bigr).
\end{align*}
This is precisely the local normal form required for an embedded submanifold of codimension $n$.
[/guided]
[/step]
[step:Construct smooth charts on the level set]
Define the coordinate projection
\begin{align*}
\pi: \mathbb{R}^{m-n} \times \{0\} &\to \mathbb{R}^{m-n}, \\
(a_1,\dots,a_{m-n},0,\dots,0) &\mapsto (a_1,\dots,a_{m-n}).
\end{align*}
For each $p \in S$, define
\begin{align*}
\alpha_p: U_p \cap S &\to \mathbb{R}^{m-n}, \\
q &\mapsto \pi(\theta_p(q)).
\end{align*}
Since $\theta_p(U_p \cap S)$ is an open subset of the coordinate slice $\mathbb{R}^{m-n} \times \{0\}$ in its [subspace topology](/page/Subspace%20Topology), the image $\alpha_p(U_p \cap S)$ is open in $\mathbb{R}^{m-n}$. Moreover $\alpha_p$ is a bijection from $U_p \cap S$ onto this open subset, with inverse
\begin{align*}
\alpha_p^{-1}: \alpha_p(U_p \cap S) &\to U_p \cap S, \\
a &\mapsto \theta_p^{-1}(a,0).
\end{align*}
Here $(a,0)$ denotes the point of $\mathbb{R}^{m-n} \times \mathbb{R}^n$ whose first $m-n$ coordinates are $a$ and whose final $n$ coordinates are zero.
[/step]
[step:Verify smooth compatibility of the level set charts]
Let $p,r \in S$ with $(U_p \cap U_r \cap S) \neq \varnothing$. The transition map between the level set charts is
\begin{align*}
\alpha_r \circ \alpha_p^{-1}: \alpha_p(U_p \cap U_r \cap S) &\to \alpha_r(U_p \cap U_r \cap S), \\
a &\mapsto \pi\bigl(\theta_r(\theta_p^{-1}(a,0))\bigr).
\end{align*}
The map
\begin{align*}
\theta_r \circ \theta_p^{-1}: \theta_p(U_p \cap U_r) \to \theta_r(U_p \cap U_r)
\end{align*}
is a smooth transition map between smooth coordinate charts on $M$. The inclusion
\begin{align*}
\iota: \mathbb{R}^{m-n} &\to \mathbb{R}^m, \\
a &\mapsto (a,0)
\end{align*}
and the projection $\pi$ are smooth maps between Euclidean spaces. Hence
\begin{align*}
\alpha_r \circ \alpha_p^{-1}
=
\pi \circ \theta_r \circ \theta_p^{-1} \circ \iota
\end{align*}
is smooth as a composition of smooth maps. Reversing the roles of $p$ and $r$ gives smoothness of the inverse transition map. Therefore the charts $(U_p \cap S,\alpha_p)$ form a smooth atlas on $S$.
[/step]
[step:Conclude that the level set is an embedded submanifold of dimension $m-n$]
The sets $U_p \cap S$, with $p \in S$, cover $S$. In each ambient chart $\theta_p$ on $M$, the subset $S$ is represented as
\begin{align*}
\theta_p(U_p \cap S)
=
\theta_p(U_p) \cap \bigl(\mathbb{R}^{m-n} \times \{0\}\bigr).
\end{align*}
Thus $S$ is locally a coordinate slice of dimension $m-n$ inside $M$. The compatibility verification above gives $S$ a smooth atlas, and the coordinate-slice description shows that this smooth structure is embedded in the ambient smooth structure of $M$. Hence $F^{-1}(\{y\}) = S$ is an embedded smooth submanifold of $M$ of dimension $m-n$.
[/step]