[proofplan]
We prove the inclusion by the Fourier-decay characterization of the wave front set. After multiplying $u$ by a compactly supported cutoff in $x$, its [Fourier transform](/page/Fourier%20Transform) is an oscillatory integral with phase $\phi(x,\theta) - x \cdot \xi$. Away from the displayed critical image and away from the conic essential support of the amplitude, the integral is either smoothing or has no stationary points in the full $(x,\theta)$ variables, so repeated [integration by parts](/theorems/210) gives rapid decay in $\xi$. The nondegenerate assertion is the standard geometric theorem that a nondegenerate homogeneous phase parametrizes a conic immersed Lagrangian.
[/proofplan]
[step:Localize the distribution and write its Fourier transform as an oscillatory integral]
Fix a point $(x_0,\xi_0) \in T^*U \setminus 0$ which does not belong to
\begin{align*}
\overline{\{(x,\nabla_x\phi(x,\theta)) : (x,\theta) \in C_\phi \cap \operatorname{esssupp}(a), \ \nabla_x\phi(x,\theta) \ne 0\}}.
\end{align*}
We will prove that $(x_0,\xi_0) \notin \operatorname{WF}(u)$.
Choose $\chi \in C_c^\infty(U)$ with $\chi(x_0) \ne 0$, and set $K := \operatorname{supp}\chi \subset U$, which is compact. Since the excluded set is closed and conic in the covector variable and does not contain $(x_0,\xi_0)$, after shrinking the support of $\chi$ around $x_0$ there is an open conic neighbourhood $\Gamma \subset \mathbb{R}^n_0$ of $\xi_0$ satisfying
\begin{align*}
(K \times \Gamma) \cap \overline{\{(x,\nabla_x\phi(x,\theta)) : (x,\theta) \in C_\phi \cap \operatorname{esssupp}(a), \ \nabla_x\phi(x,\theta) \ne 0\}} = \varnothing.
\end{align*}
Define the localized amplitude
\begin{align*}
a_\chi: U \times \mathbb{R}^N &\to \mathbb{C}
\end{align*}
by
\begin{align*}
a_\chi(x,\theta) := \chi(x)a(x,\theta).
\end{align*}
Since $\chi$ has compact support in $U$, the $x$-support of $a_\chi$ is contained in the compact set $K \subset U$. The Fourier transform of $\chi u$ is therefore, in the oscillatory sense,
\begin{align*}
\widehat{\chi u}(\xi) = (2\pi)^{-N}\int_U \int_{\mathbb{R}^N} e^{i\Phi(x,\theta,\xi)} a_\chi(x,\theta)\, d\mathcal{L}^N(\theta)\, d\mathcal{L}^n(x),
\end{align*}
where the phase
\begin{align*}
\Phi: U \times \mathbb{R}^N_0 \times \mathbb{R}^n_0 &\to \mathbb{R}
\end{align*}
is defined by
\begin{align*}
\Phi(x,\theta,\xi) := \phi(x,\theta) - x \cdot \xi.
\end{align*}
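As a sanity check on this formula, here is a formal computation. It assumes, as the displayed formula suggests but this proof does not restate, that $u(x) = (2\pi)^{-N}\int_{\mathbb{R}^N} e^{i\phi(x,\theta)}a(x,\theta)\, d\mathcal{L}^N(\theta)$ in the oscillatory sense and that the Fourier transform is normalized as $\hat f(\xi) = \int e^{-ix\cdot\xi}f(x)\, d\mathcal{L}^n(x)$:
\begin{align*}
\widehat{\chi u}(\xi) &= \int_U e^{-ix\cdot\xi}\chi(x)u(x)\, d\mathcal{L}^n(x) \\
&= (2\pi)^{-N}\int_U \int_{\mathbb{R}^N} e^{i(\phi(x,\theta)-x\cdot\xi)}\chi(x)a(x,\theta)\, d\mathcal{L}^N(\theta)\, d\mathcal{L}^n(x).
\end{align*}
The exponent is $i\Phi(x,\theta,\xi)$ and the amplitude is $a_\chi(x,\theta)$, which agrees with the formula above.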
[guided]
The goal is to prove absence from the wave front set at $(x_0,\xi_0)$. By the Fourier transform criterion for wave front sets, it is enough to find a cutoff $\chi \in C_c^\infty(U)$ with $\chi(x_0) \ne 0$ and a conic neighbourhood $\Gamma$ of $\xi_0$ such that $\widehat{\chi u}(\xi)$ decays faster than every power of $|\xi|$ for $\xi \in \Gamma$.
Because $(x_0,\xi_0)$ is outside the closed critical image, we may shrink the support of $\chi$ around $x_0$ and choose an open conic neighbourhood $\Gamma \subset \mathbb{R}^n_0$ of $\xi_0$ so that
\begin{align*}
(K \times \Gamma) \cap \overline{\{(x,\nabla_x\phi(x,\theta)) : (x,\theta) \in C_\phi \cap \operatorname{esssupp}(a), \ \nabla_x\phi(x,\theta) \ne 0\}} = \varnothing,
\end{align*}
where $K := \operatorname{supp}\chi$. This is exactly the separation needed for nonstationary phase: in the cone $\Gamma$, the covector $\xi$ cannot equal $\nabla_x\phi(x,\theta)$ at a critical point of the phase in the $\theta$ variables, at least on the part of the amplitude which is not smoothing.
Now multiply the amplitude by the cutoff. Define
\begin{align*}
a_\chi: U \times \mathbb{R}^N &\to \mathbb{C}
\end{align*}
by
\begin{align*}
a_\chi(x,\theta) := \chi(x)a(x,\theta).
\end{align*}
The compact support of $\chi$ in $x$ makes the localized oscillatory integral a compactly supported distribution in the $x$ variable. Taking its Fourier transform gives the phase
\begin{align*}
\Phi: U \times \mathbb{R}^N_0 \times \mathbb{R}^n_0 &\to \mathbb{R}
\end{align*}
defined by
\begin{align*}
\Phi(x,\theta,\xi) := \phi(x,\theta) - x \cdot \xi.
\end{align*}
Thus
\begin{align*}
\widehat{\chi u}(\xi) = (2\pi)^{-N}\int_U \int_{\mathbb{R}^N} e^{i\Phi(x,\theta,\xi)} a_\chi(x,\theta)\, d\mathcal{L}^N(\theta)\, d\mathcal{L}^n(x).
\end{align*}
The measure in the outer integral is $n$-dimensional [Lebesgue measure](/page/Lebesgue%20Measure) on $U$, and the measure in the inner integral is $N$-dimensional Lebesgue measure in the oscillatory variable $\theta$.
[/guided]
[/step]
[step:Discard the part of the amplitude outside its conic essential support]
By the definition of $\operatorname{esssupp}(a)$, for every conic [open set](/page/Open%20Set) $V \subset U \times \mathbb{R}^N_0$ whose closure is disjoint from $\operatorname{esssupp}(a)$, the symbol $a$ is of order $-\infty$ on $V$ after multiplication by a conic cutoff supported in $V$.
Since
\begin{align*}
(K \times \Gamma) \cap \overline{\{(x,\nabla_x\phi(x,\theta)) : (x,\theta) \in C_\phi \cap \operatorname{esssupp}(a), \ \nabla_x\phi(x,\theta) \ne 0\}} = \varnothing,
\end{align*}
there is a closed conic neighbourhood $E \subset K \times \mathbb{R}^N_0$ of $\operatorname{esssupp}(a) \cap (K \times \mathbb{R}^N_0)$ such that
\begin{align*}
\{(x,\nabla_x\phi(x,\theta)) : (x,\theta) \in C_\phi \cap E, \ \nabla_x\phi(x,\theta) \ne 0\} \cap (K \times \Gamma) = \varnothing.
\end{align*}
Choose a conic cutoff
\begin{align*}
\rho: U \times \mathbb{R}^N_0 &\to [0,1]
\end{align*}
which is smooth, homogeneous of degree $0$ for $|\theta| \ge 1$, equal to $1$ on a conic neighbourhood of $\operatorname{esssupp}(a) \cap (K \times \mathbb{R}^N_0)$ for large $|\theta|$, and whose conic support over $K$ is contained in $E$ for large $|\theta|$. Decompose
\begin{align*}
a_\chi = a_{\chi,1} + a_{\chi,0},
\end{align*}
where
\begin{align*}
a_{\chi,1}(x,\theta) := \rho(x,\theta)a_\chi(x,\theta)
\end{align*}
and
\begin{align*}
a_{\chi,0}(x,\theta) := (1-\rho(x,\theta))a_\chi(x,\theta).
\end{align*}
The symbol $a_{\chi,0}$ is rapidly decreasing in $\theta$ together with all $x$- and $\theta$-derivatives, modulo a compact set in $\theta$. Hence the oscillatory integral defined by $a_{\chi,0}$ is a smooth compactly supported function of $x$, and its Fourier transform is rapidly decreasing in every conic region. Therefore this term does not contribute to $\operatorname{WF}(u)$.
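The smoothness claim can be checked by differentiating under the integral sign. The following is a sketch, assuming order $-\infty$ bounds of the form $|\partial_x^\gamma a_{\chi,0}(x,\theta)| \le C_{\gamma,k}(1+|\theta|)^{-k}$ for all $k$ (the constants $C_{\gamma,k}$ and $C_{\alpha,k}$ are notation introduced here). Each $x$-derivative that falls on $e^{i\phi}$ produces a factor of size $O(1+|\theta|)$ by homogeneity, so for every multi-index $\alpha$ and every $k > |\alpha|+N$,
\begin{align*}
\left|\partial_x^\alpha\left(e^{i\phi(x,\theta)}a_{\chi,0}(x,\theta)\right)\right| \le C_{\alpha,k}(1+|\theta|)^{|\alpha|-k}, \qquad \int_{\mathbb{R}^N}(1+|\theta|)^{|\alpha|-k}\, d\mathcal{L}^N(\theta) < \infty.
\end{align*}
Hence every $x$-derivative of the $\theta$-integral converges absolutely and uniformly for $x \in K$.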
It remains to prove rapid decay for the term with amplitude $a_{\chi,1}$.
[/step]
[step:Obtain a scaled nonstationary estimate on the retained conic support]
For $\xi \in \Gamma$, the gradient of $\Phi$ in the integration variables is
\begin{align*}
\nabla_{x,\theta}\Phi(x,\theta,\xi) = (\nabla_x\phi(x,\theta)-\xi,\nabla_\theta\phi(x,\theta)).
\end{align*}
Set
\begin{align*}
r(\theta,\xi) := |\theta| + |\xi|.
\end{align*}
On the conic support of $a_{\chi,1}$ over $K$, the preceding separation excludes simultaneous vanishing of $\nabla_\theta\phi(x,\theta)$ and $\nabla_x\phi(x,\theta)-\xi$ for $\xi \in \Gamma$. Since $\nabla_x\phi$ is homogeneous of degree $1$ in $\theta$ and $\nabla_\theta\phi$ is homogeneous of degree $0$ in $\theta$, this exclusion gives the scaled estimate: there exist constants $R > 0$ and $c > 0$ such that, for all $x \in K$, all $\xi \in \Gamma$ with $|\xi| \ge R$, and all $\theta$ in the conic support of $a_{\chi,1}$,
\begin{align*}
\frac{|\nabla_x\phi(x,\theta)-\xi|}{r(\theta,\xi)} + |\nabla_\theta\phi(x,\theta)| \ge c.
\end{align*}
Indeed, suppose the estimate failed. Then there would be sequences $(x_j,\theta_j,\xi_j)$ in the indicated range with $|\xi_j| \to \infty$ along which the left-hand side tends to $0$. After normalizing to $|\theta_j|+|\xi_j|=1$, compactness of $K$, closedness of the normalized conic support, and closedness of the cone $\overline{\Gamma}$ would give a [limit point](/page/Limit%20Point) with $\nabla_\theta\phi=0$ and $\nabla_x\phi=\xi$ in the excluded critical image. The degenerate regimes $|\theta_j|/|\xi_j| \to 0$ and $|\theta_j|/|\xi_j| \to \infty$ are excluded as well: in the first, $\nabla_x\phi(x_j,\theta_j) = O(|\theta_j|)$, so $|\nabla_x\phi-\xi|/r$ tends to $1$ and stays bounded away from $0$; in the second, homogeneity forces the normalized $x$-gradient term to have a nonzero limit unless the limiting critical image condition holds.
Equivalently, the phase $\Phi(\cdot,\cdot,\xi)$ is uniformly nonstationary on the retained conic support in the homogeneous metric in which $x$-derivatives are scaled by $r(\theta,\xi)^{-1}$ and $\theta$-derivatives are unscaled.
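The normalization $|\theta|+|\xi|=1$ is legitimate because the left-hand side of the scaled estimate is invariant under the joint dilation $(\theta,\xi) \mapsto (t\theta,t\xi)$. A short check for $t>0$, using only the homogeneity degrees stated above:
\begin{align*}
\frac{|\nabla_x\phi(x,t\theta)-t\xi|}{r(t\theta,t\xi)} + |\nabla_\theta\phi(x,t\theta)| = \frac{t\,|\nabla_x\phi(x,\theta)-\xi|}{t\,r(\theta,\xi)} + |\nabla_\theta\phi(x,\theta)| = \frac{|\nabla_x\phi(x,\theta)-\xi|}{r(\theta,\xi)} + |\nabla_\theta\phi(x,\theta)|.
\end{align*}
A lower bound on the normalized slice therefore carries over to every dilate that remains in the conic support.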
[guided]
The point of the estimate is that ordinary nonvanishing is not enough: $\xi$ is the large parameter, while $\theta$ is integrated over a noncompact cone. We therefore measure the $x$-part of the phase gradient at scale $r(\theta,\xi) = |\theta|+|\xi|$ and the $\theta$-part at scale $1$.
For $\xi \in \Gamma$ we have
\begin{align*}
\nabla_{x,\theta}\Phi(x,\theta,\xi) = (\nabla_x\phi(x,\theta)-\xi,\nabla_\theta\phi(x,\theta)).
\end{align*}
The cutoff $\rho$ was chosen so that the conic support of $a_{\chi,1}$ over $K$ lies in a closed conic set whose critical image misses $K \times \Gamma$. Thus there is no point on this support for which both $\nabla_\theta\phi(x,\theta)=0$ and $\nabla_x\phi(x,\theta)=\xi$ with $\xi \in \Gamma$.
Because $\phi$ is homogeneous of degree $1$ in $\theta$, the derivative $\nabla_x\phi$ is homogeneous of degree $1$ in $\theta$, while $\nabla_\theta\phi$ is homogeneous of degree $0$ in $\theta$. The correct compactness argument is therefore performed after normalizing $|\theta|+|\xi|=1$. On that normalized compact set, the [continuous function](/page/Continuous%20Function)
\begin{align*}
(x,\theta,\xi) \mapsto \frac{|\nabla_x\phi(x,\theta)-\xi|}{|\theta|+|\xi|} + |\nabla_\theta\phi(x,\theta)|
\end{align*}
cannot vanish on the retained support, because vanishing would say exactly that $(x,\theta)$ is critical in the $\theta$ variables and that $\xi=\nabla_x\phi(x,\theta)$, placing $(x,\xi)$ in the excluded critical image. Hence this function has a positive lower bound $c>0$ on the normalized closed support. Rescaling back gives, for $|\xi|\ge R$,
\begin{align*}
\frac{|\nabla_x\phi(x,\theta)-\xi|}{|\theta|+|\xi|} + |\nabla_\theta\phi(x,\theta)| \ge c.
\end{align*}
This is the homogeneous nonstationary estimate needed for [integration by parts](/theorems/2098). It distinguishes the regimes $|\theta| \ll |\xi|$, $|\theta| \sim |\xi|$, and $|\theta| \gg |\xi|$ because the first term is normalized by the combined large parameter $|\theta|+|\xi|$.
[/guided]
[/step]
[step:Integrate by parts in the full set of phase variables to get rapid decay]
Set $r(\theta,\xi) := |\theta|+|\xi|$ and define the scaled denominator
\begin{align*}
D(x,\theta,\xi) := r(\theta,\xi)^{-2}|\nabla_x\phi(x,\theta)-\xi|^2 + |\nabla_\theta\phi(x,\theta)|^2.
\end{align*}
By the scaled estimate from the preceding step, $D(x,\theta,\xi) \ge c_1>0$ on the retained conic support for $\xi \in \Gamma$ and $|\xi| \ge R$, where $c_1$ depends only on the separation constant $c$; one may take $c_1 = c^2/2$, since $s^2+t^2 \ge (s+t)^2/2$ for $s,t \ge 0$. Define the first-order differential operator
\begin{align*}
L_\xi: C^\infty(K \times \mathbb{R}^N_0) &\to C^\infty(K \times \mathbb{R}^N_0)
\end{align*}
by
\begin{align*}
L_\xi f := \frac{1}{iD(x,\theta,\xi)}\left(r(\theta,\xi)^{-2}(\nabla_x\phi(x,\theta)-\xi) \cdot \nabla_x f + \nabla_\theta\phi(x,\theta) \cdot \nabla_\theta f\right).
\end{align*}
Then
\begin{align*}
L_\xi(e^{i\Phi(x,\theta,\xi)}) = e^{i\Phi(x,\theta,\xi)}.
\end{align*}
Let $L_\xi^*$ denote the formal adjoint of $L_\xi$ with respect to the product measure $d\mathcal{L}^n(x)d\mathcal{L}^N(\theta)$. Repeated integration by parts gives, for every integer $M \ge 0$,
\begin{align*}
\widehat{\chi u}_1(\xi) = (2\pi)^{-N}\int_U \int_{\mathbb{R}^N} e^{i\Phi(x,\theta,\xi)}(L_\xi^*)^M a_{\chi,1}(x,\theta)\, d\mathcal{L}^N(\theta)\, d\mathcal{L}^n(x),
\end{align*}
where $\widehat{\chi u}_1$ denotes the Fourier transform of the localized oscillatory integral with amplitude $a_{\chi,1}$.
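For concreteness, the adjoint can be written out. This is a sketch, assuming that the formal adjoint is understood as the transpose for the bilinear pairing $\int (L_\xi f)\,g = \int f\,(L_\xi^* g)$; the coefficient fields $b_x$ and $b_\theta$ are shorthand introduced here:
\begin{align*}
b_x := \frac{r(\theta,\xi)^{-2}(\nabla_x\phi(x,\theta)-\xi)}{iD(x,\theta,\xi)}, \qquad b_\theta := \frac{\nabla_\theta\phi(x,\theta)}{iD(x,\theta,\xi)}, \qquad L_\xi^* g = -\nabla_x \cdot (b_x g) - \nabla_\theta \cdot (b_\theta g).
\end{align*}
Expanding the divergences gives $L_\xi^* g = -b_x \cdot \nabla_x g - b_\theta \cdot \nabla_\theta g - (\nabla_x \cdot b_x + \nabla_\theta \cdot b_\theta)g$, so each application either differentiates the amplitude or multiplies it by a derivative of the coefficients.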
The scaled estimate implies that the coefficients of $L_\xi^*$ and all their $x$- and $\theta$-derivatives are bounded by symbol estimates in the joint parameter $r(\theta,\xi)$; each integration by parts lowers the joint order by one. More precisely, for every multi-index pair $\alpha,\beta$ and every integer $M \ge 0$, there is a constant $C_{M,\alpha,\beta}>0$ such that
\begin{align*}
|\partial_x^\alpha\partial_\theta^\beta (L_\xi^*)^M a_{\chi,1}(x,\theta)| \le C_{M,\alpha,\beta}(1+|\theta|+|\xi|)^{m-M-|\beta|}
\end{align*}
on the retained conic support. Fix an integer $Q \ge 0$ and choose $M > m+N+Q$. Then, taking $\alpha=\beta=0$, the right-hand side is integrable in $\theta$, and integration over $\mathbb{R}^N$ leaves a bound by a constant multiple of $(1+|\xi|)^{-Q}$. Since the $x$-support is contained in the compact set $K$, there is a constant $C_Q>0$, depending on finitely many symbol seminorms of $a_{\chi,1}$, on $K$, on $\Gamma$, and on $Q$, such that
\begin{align*}
|\widehat{\chi u}_1(\xi)| \le C_Q(1+|\xi|)^{-Q}
\end{align*}
for all $\xi \in \Gamma$ with $|\xi| \ge R$.
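The $\theta$-integration behind this bound can be made explicit. A sketch in polar coordinates, writing $A := 1+|\xi|$, $p := M-m-N$, and $\omega_{N-1}$ for the surface measure of the unit sphere in $\mathbb{R}^N$ (all three are notation introduced here), and using $s^{N-1} \le (A+s)^{N-1}$:
\begin{align*}
\int_{\mathbb{R}^N}(1+|\theta|+|\xi|)^{m-M}\, d\mathcal{L}^N(\theta) = \omega_{N-1}\int_0^\infty (A+s)^{m-M}s^{N-1}\, ds \le \omega_{N-1}\int_0^\infty (A+s)^{-p-1}\, ds = \frac{\omega_{N-1}}{p}A^{-p}.
\end{align*}
When $M > m+N+Q$ we have $p > Q$ and $A \ge 1$, so $A^{-p} \le (1+|\xi|)^{-Q}$.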
[guided]
We now convert absence of stationary points into decay. The mechanism is integration by parts using a vector field pointed in the direction of the phase gradient.
Set $r(\theta,\xi) := |\theta|+|\xi|$ and define
\begin{align*}
D(x,\theta,\xi) := r(\theta,\xi)^{-2}|\nabla_x\phi(x,\theta)-\xi|^2 + |\nabla_\theta\phi(x,\theta)|^2.
\end{align*}
The scaled nonstationary estimate gives $D(x,\theta,\xi) \ge c_1>0$ on the retained conic support. Define
\begin{align*}
L_\xi: C^\infty(K \times \mathbb{R}^N_0) &\to C^\infty(K \times \mathbb{R}^N_0)
\end{align*}
by
\begin{align*}
L_\xi f := \frac{1}{iD(x,\theta,\xi)}\left(r(\theta,\xi)^{-2}(\nabla_x\phi(x,\theta)-\xi) \cdot \nabla_x f + \nabla_\theta\phi(x,\theta) \cdot \nabla_\theta f\right).
\end{align*}
This operator is chosen so that it reproduces the exponential in the scaled metric. Differentiating the exponential gives $\nabla_x e^{i\Phi}=i e^{i\Phi}(\nabla_x\phi-\xi)$ and $\nabla_\theta e^{i\Phi}=i e^{i\Phi}\nabla_\theta\phi$. Substituting these identities into the definition of $L_\xi$ yields
\begin{align*}
L_\xi(e^{i\Phi(x,\theta,\xi)}) = e^{i\Phi(x,\theta,\xi)}.
\end{align*}
This identity is the entire reason for introducing $L_\xi$: it lets us move derivatives from the oscillatory factor onto the amplitude. Let $L_\xi^*$ be the formal adjoint of $L_\xi$ with respect to the product measure $d\mathcal{L}^n(x)d\mathcal{L}^N(\theta)$. Since the $x$-support is compact and the $\theta$-integral is interpreted in the standard oscillatory sense with symbol cutoffs, integration by parts gives
\begin{align*}
\widehat{\chi u}_1(\xi) = (2\pi)^{-N}\int_U \int_{\mathbb{R}^N} e^{i\Phi(x,\theta,\xi)}(L_\xi^*)^M a_{\chi,1}(x,\theta)\, d\mathcal{L}^N(\theta)\, d\mathcal{L}^n(x)
\end{align*}
for every integer $M \ge 0$.
Why does this produce rapid decay in $\xi$? The denominator $D$ is bounded below, and the coefficients contain the scaling factor $r(\theta,\xi)^{-2}$ in the $x$-directions. Homogeneity of $\phi$ implies that derivatives of these coefficients satisfy symbol bounds in the joint variable $r(\theta,\xi)=|\theta|+|\xi|$. Combining these coefficient bounds with the symbol estimates for $a_{\chi,1} \in S^m$ gives, for every multi-index pair $\alpha,\beta$ and every integer $M \ge 0$,
\begin{align*}
|\partial_x^\alpha\partial_\theta^\beta (L_\xi^*)^M a_{\chi,1}(x,\theta)| \le C_{M,\alpha,\beta}(1+|\theta|+|\xi|)^{m-M-|\beta|}.
\end{align*}
Thus each integration by parts lowers the joint order by one. If $Q \ge 0$ is fixed and $M > m+N+Q$, then the exponent is low enough that integration over $\theta \in \mathbb{R}^N$ is finite and leaves the factor $(1+|\xi|)^{-Q}$. Since $x$ is restricted to the compact set $K$, there is a constant $C_Q>0$ such that
\begin{align*}
|\widehat{\chi u}_1(\xi)| \le C_Q(1+|\xi|)^{-Q}
\end{align*}
for all $\xi \in \Gamma$ with $|\xi|$ sufficiently large.
[/guided]
[/step]
[step:Apply the Fourier decay criterion for the wave front set]
The contribution from $a_{\chi,0}$ is rapidly decreasing in every cone, and the contribution from $a_{\chi,1}$ is rapidly decreasing in $\Gamma$. Hence for every integer $Q \ge 0$ there is a constant $C_Q > 0$ such that
\begin{align*}
|\widehat{\chi u}(\xi)| \le C_Q(1+|\xi|)^{-Q}
\end{align*}
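for all $\xi \in \Gamma$; the bounded range $|\xi| < R$ is covered by enlarging $C_Q$, since $\widehat{\chi u}$ is the Fourier transform of a compactly supported distribution and hence continuous.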
By the Fourier transform criterion for the wave front set, this rapid decay implies $(x_0,\xi_0) \notin \operatorname{WF}(u)$.
Since every point outside the displayed closed conic set is absent from $\operatorname{WF}(u)$, we conclude
\begin{align*}
\operatorname{WF}(u) \subset \overline{\{(x,\nabla_x\phi(x,\theta)) : (x,\theta) \in C_\phi \cap \operatorname{esssupp}(a), \ \nabla_x\phi(x,\theta) \ne 0\}}.
\end{align*}
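A standard example illustrates the inclusion. Assume, as the Fourier transform formula in the first step suggests, that $u(x) = (2\pi)^{-N}\int_{\mathbb{R}^N} e^{i\phi(x,\theta)}a(x,\theta)\, d\mathcal{L}^N(\theta)$, and take $N=n$, $U=\mathbb{R}^n$, $\phi(x,\theta) = x \cdot \theta$, and $a \equiv 1$. Then $u$ is the Dirac mass $\delta_0$ at the origin, and
\begin{align*}
\nabla_\theta\phi(x,\theta) = x, \qquad C_\phi = \{(0,\theta) : \theta \ne 0\}, \qquad \nabla_x\phi(x,\theta) = \theta, \qquad \{(x,\nabla_x\phi(x,\theta)) : (x,\theta) \in C_\phi\} = \{(0,\theta) : \theta \ne 0\}.
\end{align*}
Since $\operatorname{WF}(\delta_0) = \{0\} \times (\mathbb{R}^n \setminus 0)$, the inclusion is an equality in this case.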
[/step]
[step:Identify the Lagrangian in the nondegenerate case]
Assume now that $\phi$ is nondegenerate, so that the differentials
\begin{align*}
d(\partial_{\theta_1}\phi), \dots, d(\partial_{\theta_N}\phi)
\end{align*}
have rank $N$ on $C_\phi$. By the regular level set theorem, $C_\phi$ is a smooth conic submanifold of $U \times \mathbb{R}^N_0$ of dimension $n$.
Define
\begin{align*}
\iota_\phi: C_\phi &\to T^*U \setminus 0
\end{align*}
by
\begin{align*}
\iota_\phi(x,\theta) := (x,\nabla_x\phi(x,\theta)).
\end{align*}
The standard theorem on nondegenerate homogeneous phase functions states that $\iota_\phi$ is a conic Lagrangian immersion. Its hypotheses are exactly the rank condition above, the homogeneity of degree $1$ in $\theta$, and the exclusion of the zero covector by requiring $\nabla_x\phi(x,\theta) \ne 0$ on the image.
Therefore
\begin{align*}
\Lambda_\phi = \iota_\phi(C_\phi)
\end{align*}
is a conic immersed Lagrangian submanifold of $T^*U \setminus 0$. Combined with the inclusion already proved, $\operatorname{WF}(u)$ is contained in the closure of $\iota_\phi(C_\phi \cap \operatorname{esssupp}(a))$, the part of $\Lambda_\phi$ over which the amplitude is not smoothing. This proves the final assertion and completes the proof.
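As a supplement to the cited theorem, its isotropy half follows from a short computation, sketched here with $\alpha := \xi \cdot dx$ denoting the canonical one-form on $T^*U$ (notation introduced here). On $C_\phi$ we have $\nabla_\theta\phi = 0$, and Euler's identity for homogeneity of degree $1$ gives $\phi = \theta \cdot \nabla_\theta\phi = 0$ there, so
\begin{align*}
\iota_\phi^*\alpha = \nabla_x\phi \cdot dx\big|_{C_\phi} = (d\phi - \nabla_\theta\phi \cdot d\theta)\big|_{C_\phi} = d(\phi|_{C_\phi}) = 0, \qquad \iota_\phi^*(d\alpha) = d(\iota_\phi^*\alpha) = 0.
\end{align*}
Thus the symplectic form $d\alpha$ pulls back to zero under $\iota_\phi$. The immersion property, which uses the rank condition, is not reproduced here.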
[/step]