[proofplan]
We first prove the [Fourier transform](/page/Fourier%20Transform) identity for one ordinary partial derivative $\partial_{x_j}$ by integrating by parts on the cubes $(-R,R)^n$ and letting $R$ tend to infinity. The only boundary terms come from the two faces perpendicular to the $x_j$-axis, and Schwartz decay makes them vanish in the limit. Multiplying that identity by $-i$ gives $\widehat{D_jv}(\xi)=\xi_j\hat v(\xi)$ for the course convention $D_j=-i\partial_{x_j}$. Iterating this normalized one-derivative identity gives the formula for every multi-index derivative $D^\alpha u$. Finally, linearity of the Fourier transform lets us multiply by the polynomial coefficients and sum over the finitely many multi-indices, producing the multiplier $P(\xi)$.
[/proofplan]
[step:Compute the Fourier transform of one partial derivative by integration by parts]
Fix $j \in \{1,\dots,n\}$. Let
\begin{align*}
v:\mathbb{R}^n \to \mathbb{C}
\end{align*}
be an arbitrary function in $\mathcal{S}(\mathbb{R}^n)$. Since $\partial_{x_j}v \in \mathcal{S}(\mathbb{R}^n) \subset L^1(\mathbb{R}^n)$, its Fourier transform is defined pointwise by an absolutely convergent [Lebesgue integral](/page/Lebesgue%20Integral).
Fix $\xi \in \mathbb{R}^n$. For $R>0$, define the cube
\begin{align*}
Q_R := (-R,R)^n \subset \mathbb{R}^n.
\end{align*}
Also define the face variable
\begin{align*}
y := (x_1,\dots,x_{j-1},x_{j+1},\dots,x_n) \in \mathbb{R}^{n-1}.
\end{align*}
For $t \in \mathbb{R}$, define the insertion map
\begin{align*}
\iota_t:\mathbb{R}^{n-1}\to\mathbb{R}^n, \qquad \iota_t(y):=(x_1,\dots,x_{j-1},t,x_{j+1},\dots,x_n).
\end{align*}
Because $v$, $\partial_{x_j}v$, and $x\mapsto e^{-i\xi\cdot x}$ are continuous on $\mathbb{R}^n$, they are bounded on the compact closure of $Q_R$, so the products appearing below are integrable on $Q_R$. [Fubini's theorem](/theorems/2961) therefore permits us to write the $\mathcal{L}^n$ integral over $Q_R$ as an iterated integral, first in the $x_j$ variable and then in the remaining $n-1$ variables.
Applying one-dimensional [integration by parts](/theorems/210) in the $x_j$ variable on the cube $Q_R$ gives
\begin{align*}
\int_{Q_R}\partial_{x_j}v(x)e^{-i\xi\cdot x}\,d\mathcal{L}^n(x)=B_R+i\xi_j\int_{Q_R}v(x)e^{-i\xi\cdot x}\,d\mathcal{L}^n(x),
\end{align*}
where the boundary term is
\begin{align*}
B_R:=\int_{(-R,R)^{n-1}}v(\iota_R(y))e^{-i\xi\cdot \iota_R(y)}\,d\mathcal{L}^{n-1}(y)-\int_{(-R,R)^{n-1}}v(\iota_{-R}(y))e^{-i\xi\cdot \iota_{-R}(y)}\,d\mathcal{L}^{n-1}(y).
\end{align*}
The derivative of the exponential was computed using
\begin{align*}
\partial_{x_j}\bigl(e^{-i\xi\cdot x}\bigr)=-i\xi_j e^{-i\xi\cdot x}.
\end{align*}
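To make the one-dimensional step explicit, here is a sketch for a fixed $y\in(-R,R)^{n-1}$. The shorthand $g_y(t):=v(\iota_t(y))$ is introduced only for this display and is not part of the notation above. Since $g_y'(t)=(\partial_{x_j}v)(\iota_t(y))$ and $\partial_t\bigl(e^{-i\xi\cdot\iota_t(y)}\bigr)=-i\xi_j e^{-i\xi\cdot\iota_t(y)}$, integration by parts on $(-R,R)$ gives
\begin{align*}
\int_{-R}^{R} g_y'(t)e^{-i\xi\cdot\iota_t(y)}\,dt
&=\Bigl[g_y(t)e^{-i\xi\cdot\iota_t(y)}\Bigr]_{t=-R}^{t=R}-\int_{-R}^{R} g_y(t)\,\partial_t\bigl(e^{-i\xi\cdot\iota_t(y)}\bigr)\,dt\\
&=g_y(R)e^{-i\xi\cdot\iota_R(y)}-g_y(-R)e^{-i\xi\cdot\iota_{-R}(y)}+i\xi_j\int_{-R}^{R} g_y(t)e^{-i\xi\cdot\iota_t(y)}\,dt.
\end{align*}
Integrating this over $y\in(-R,R)^{n-1}$ with respect to $\mathcal{L}^{n-1}$ recovers the cube identity: the first two terms produce $B_R$ and the last term produces $i\xi_j$ times the integral over $Q_R$.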
We claim that $B_R \to 0$ as $R \to \infty$. Choose an integer $N>n$. Since $v \in \mathcal{S}(\mathbb{R}^n)$, there is a finite constant
\begin{align*}
C_N:=\sup_{x\in\mathbb{R}^n}(1+|x|)^N|v(x)|.
\end{align*}
For each $t \in \mathbb{R}$ and $y \in \mathbb{R}^{n-1}$, one has $|\iota_t(y)| = (|t|^2+|y|^2)^{1/2}$. Hence
\begin{align*}
|v(\iota_t(y))| \le C_N(1+(|t|^2+|y|^2)^{1/2})^{-N}.
\end{align*}
Therefore
\begin{align*}
\left|\int_{(-R,R)^{n-1}}v(\iota_R(y))e^{-i\xi\cdot \iota_R(y)}\,d\mathcal{L}^{n-1}(y)\right| \le C_N\int_{\mathbb{R}^{n-1}}(1+(R^2+|y|^2)^{1/2})^{-N}\,d\mathcal{L}^{n-1}(y).
\end{align*}
The same bound holds for the face integral at $x_j=-R$, since the estimate depends only on $|t|=R$. The integrand on the right-hand side is dominated by $(1+|y|)^{-N}$, which is integrable on $\mathbb{R}^{n-1}$ because $N>n-1$, and it converges pointwise to $0$ as $R\to\infty$. By dominated convergence, both face integrals tend to $0$, so $B_R\to 0$.
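As a complementary sketch, the decay of the face integrals can also be quantified directly, which shows one place where the margin $N>n$ (rather than merely $N>n-1$) can be used. Since $(R^2+|y|^2)^{1/2}\ge\max\{R,|y|\}$, one power of the decaying factor may be spent on $R$ and the remaining $N-1$ powers on $|y|$:
\begin{align*}
C_N\int_{\mathbb{R}^{n-1}}(1+(R^2+|y|^2)^{1/2})^{-N}\,d\mathcal{L}^{n-1}(y)
\le \frac{C_N}{1+R}\int_{\mathbb{R}^{n-1}}(1+|y|)^{-(N-1)}\,d\mathcal{L}^{n-1}(y).
\end{align*}
The integral on the right is finite because $N-1>n-1$, so each face integral is $O(R^{-1})$ as $R\to\infty$. This explicit rate is not needed for the proof, since dominated convergence already suffices.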
Since $v$ and $\partial_{x_j}v$ are in $L^1(\mathbb{R}^n)$ and $|e^{-i\xi\cdot x}|=1$, dominated convergence shows that the integrals over $Q_R$ converge to the corresponding integrals over $\mathbb{R}^n$ as $R\to\infty$. Passing to the limit in the integration-by-parts identity and using $B_R\to 0$ yields
\begin{align*}
\int_{\mathbb{R}^n}\partial_{x_j}v(x)e^{-i\xi\cdot x}\,d\mathcal{L}^n(x)=i\xi_j\int_{\mathbb{R}^n}v(x)e^{-i\xi\cdot x}\,d\mathcal{L}^n(x).
\end{align*}
Multiplying both sides by $(2\pi)^{-n/2}$ gives
\begin{align*}
\widehat{\partial_{x_j}v}(\xi)=i\xi_j\hat{v}(\xi).
\end{align*}
[guided]
We prove the one-derivative formula carefully because this is where both the sign convention and the absence of boundary terms enter. Fix $j \in \{1,\dots,n\}$, fix $\xi \in \mathbb{R}^n$, and let
\begin{align*}
v:\mathbb{R}^n \to \mathbb{C}
\end{align*}
be a Schwartz function. The goal is to prove
\begin{align*}
\widehat{\partial_{x_j}v}(\xi)=i\xi_j\hat{v}(\xi).
\end{align*}
Why does the factor have sign $+i\xi_j$? The Fourier transform uses the exponential $e^{-i\xi\cdot x}$. When [integration by parts](/theorems/2098) moves $\partial_{x_j}$ from $v$ onto this exponential, the derivative produces $-i\xi_j e^{-i\xi\cdot x}$, and the minus sign from integration by parts changes this to $+i\xi_j$.
To justify the integration by parts on the unbounded domain, we first work on a bounded cube. For $R>0$, set
\begin{align*}
Q_R:=(-R,R)^n.
\end{align*}
Let
\begin{align*}
y:=(x_1,\dots,x_{j-1},x_{j+1},\dots,x_n)\in\mathbb{R}^{n-1}
\end{align*}
denote all variables except $x_j$. For $t\in\mathbb{R}$, define
\begin{align*}
\iota_t:\mathbb{R}^{n-1}\to\mathbb{R}^n, \qquad \iota_t(y):=(x_1,\dots,x_{j-1},t,x_{j+1},\dots,x_n).
\end{align*}
This map inserts the value $t$ into the $j$-th coordinate.
Before integrating by parts, we justify treating the other variables as parameters. The functions $v$ and $\partial_{x_j}v$ are continuous, and the exponential factor $x\mapsto e^{-i\xi\cdot x}$ is continuous and bounded. Hence the products integrated below are integrable on the bounded cube $Q_R$. Fubini's theorem allows the $\mathcal{L}^n$ integral over $Q_R$ to be written as an iterated integral over $(-R,R)^{n-1}$ with respect to $\mathcal{L}^{n-1}$ and over $(-R,R)$ with respect to $\mathcal{L}^1$.
On $Q_R$, ordinary one-dimensional integration by parts in the variable $x_j$, with all other variables held fixed, gives
\begin{align*}
\int_{Q_R}\partial_{x_j}v(x)e^{-i\xi\cdot x}\,d\mathcal{L}^n(x)=B_R+i\xi_j\int_{Q_R}v(x)e^{-i\xi\cdot x}\,d\mathcal{L}^n(x),
\end{align*}
where
\begin{align*}
B_R:=\int_{(-R,R)^{n-1}}v(\iota_R(y))e^{-i\xi\cdot \iota_R(y)}\,d\mathcal{L}^{n-1}(y)-\int_{(-R,R)^{n-1}}v(\iota_{-R}(y))e^{-i\xi\cdot \iota_{-R}(y)}\,d\mathcal{L}^{n-1}(y).
\end{align*}
The term $B_R$ is precisely the contribution from the two faces of the cube where $x_j=R$ and $x_j=-R$.
Now we prove that this boundary contribution disappears as $R\to\infty$. Choose an integer $N>n$. Because $v$ is a Schwartz function, the quantity
\begin{align*}
C_N:=\sup_{x\in\mathbb{R}^n}(1+|x|)^N|v(x)|
\end{align*}
is finite. Thus, for every $t\in\mathbb{R}$ and $y\in\mathbb{R}^{n-1}$,
\begin{align*}
|v(\iota_t(y))| \le C_N(1+|\iota_t(y)|)^{-N}.
\end{align*}
Since $|\iota_t(y)|=(|t|^2+|y|^2)^{1/2}$, we obtain
\begin{align*}
|v(\iota_t(y))| \le C_N(1+(|t|^2+|y|^2)^{1/2})^{-N}.
\end{align*}
Therefore the absolute value of the face integral at $t=R$ is bounded by
\begin{align*}
C_N\int_{\mathbb{R}^{n-1}}(1+(R^2+|y|^2)^{1/2})^{-N}\,d\mathcal{L}^{n-1}(y).
\end{align*}
The same bound applies to the face at $t=-R$. For each fixed $y$, the integrand tends to $0$ as $R\to\infty$. It is also bounded by $(1+|y|)^{-N}$, which is integrable over $\mathbb{R}^{n-1}$ because $N>n-1$. Hence dominated convergence applies and both face integrals tend to $0$. Thus $B_R\to 0$.
Finally, since $v$ and $\partial_{x_j}v$ are Schwartz functions, they are integrable on $\mathbb{R}^n$. Hence
\begin{align*}
\int_{Q_R}\partial_{x_j}v(x)e^{-i\xi\cdot x}\,d\mathcal{L}^n(x)\to \int_{\mathbb{R}^n}\partial_{x_j}v(x)e^{-i\xi\cdot x}\,d\mathcal{L}^n(x)
\end{align*}
and
\begin{align*}
\int_{Q_R}v(x)e^{-i\xi\cdot x}\,d\mathcal{L}^n(x)\to \int_{\mathbb{R}^n}v(x)e^{-i\xi\cdot x}\,d\mathcal{L}^n(x).
\end{align*}
Passing to the limit in the cube identity gives
\begin{align*}
\int_{\mathbb{R}^n}\partial_{x_j}v(x)e^{-i\xi\cdot x}\,d\mathcal{L}^n(x)=i\xi_j\int_{\mathbb{R}^n}v(x)e^{-i\xi\cdot x}\,d\mathcal{L}^n(x).
\end{align*}
Multiplication by the normalization factor $(2\pi)^{-n/2}$ gives the desired Fourier identity:
\begin{align*}
\widehat{\partial_{x_j}v}(\xi)=i\xi_j\hat{v}(\xi).
\end{align*}
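As a sanity check of the sign and normalization, consider a one-dimensional Gaussian. This example is an added illustration. It assumes the normalization $\hat v(\xi)=(2\pi)^{-n/2}\int_{\mathbb{R}^n}v(x)e^{-i\xi\cdot x}\,d\mathcal{L}^n(x)$ used above, together with the standard fact that for $n=1$ the function $v(x)=e^{-x^2/2}$ satisfies $\hat v(\xi)=e^{-\xi^2/2}$. Differentiating under the integral sign in $\xi$ gives $\partial_\xi\hat v(\xi)=-i\,\widehat{xv}(\xi)$, where $xv$ denotes the function $x\mapsto xv(x)$, and $\partial_xv=-xv$. Hence
\begin{align*}
\widehat{\partial_xv}(\xi)=-\widehat{xv}(\xi)=-i\,\partial_\xi\hat v(\xi)=-i\bigl(-\xi e^{-\xi^2/2}\bigr)=i\xi e^{-\xi^2/2}=i\xi\hat v(\xi),
\end{align*}
in agreement with the identity just proved.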
[/guided]
[/step]
[step:Iterate the normalized derivative identity to handle every multi-index]
Let $\alpha=(\alpha_1,\dots,\alpha_n)\in\mathbb{N}_0^n$. Since $u\in\mathcal{S}(\mathbb{R}^n)$ and every partial derivative of a Schwartz function is again a Schwartz function, the identity from the previous step may be applied repeatedly. If $\alpha=0$, we use the empty-product convention $\xi^0=1$ and $D^0u=u$.
For each $j\in\{1,\dots,n\}$, the convention $D_j=-i\partial_{x_j}$ and the previous step give
\begin{align*}
\widehat{D_j v}(\xi)=-i\,\widehat{\partial_{x_j}v}(\xi)=-i(i\xi_j)\hat v(\xi)=\xi_j\hat v(\xi).
\end{align*}
Applying this normalized one-derivative formula $\alpha_j$ times in the $x_j$ direction contributes the factor $\xi_j^{\alpha_j}$. Since the operators $D_j$ commute on smooth functions, the order of these applications does not affect $D^\alpha u$. Therefore
\begin{align*}
\widehat{D^\alpha u}(\xi)=\prod_{j=1}^n\xi_j^{\alpha_j}\hat{u}(\xi)=\xi^\alpha\hat u(\xi).
\end{align*}
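For a concrete instance, take the illustrative choices $n=2$ and $\alpha=(2,1)$, and assume the usual convention $D^\alpha=D_1^{\alpha_1}\cdots D_n^{\alpha_n}$. The formula can then be checked against the unnormalized identity $\widehat{\partial_{x_j}v}(\xi)=i\xi_j\hat v(\xi)$ from the first step:
\begin{align*}
D^{(2,1)}u&=D_1^2D_2u=(-i)^3\,\partial_{x_1}^2\partial_{x_2}u=i\,\partial_{x_1}^2\partial_{x_2}u,\\
\widehat{D^{(2,1)}u}(\xi)&=i\,(i\xi_1)^2(i\xi_2)\hat u(\xi)=i^4\,\xi_1^2\xi_2\,\hat u(\xi)=\xi_1^2\xi_2\,\hat u(\xi).
\end{align*}
The powers of $i$ cancel here, as they do in general because $(-i)^{|\alpha|}\,i^{|\alpha|}=1$.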
[/step]
[step:Sum the multi-index identities with the polynomial coefficients]
The function $P(D)u$ is a finite linear combination of Schwartz functions, hence $P(D)u\in\mathcal{S}(\mathbb{R}^n)$ and its Fourier transform is defined pointwise. By linearity of the Lebesgue integral defining the Fourier transform,
\begin{align*}
\widehat{P(D)u}(\xi)=\sum_{\substack{\alpha\in\mathbb{N}_0^n, |\alpha|\le m}}a_\alpha \widehat{D^\alpha u}(\xi).
\end{align*}
Substituting the multi-index derivative formula gives
\begin{align*}
\widehat{P(D)u}(\xi)=\sum_{\substack{\alpha\in\mathbb{N}_0^n, |\alpha|\le m}}a_\alpha \xi^\alpha\hat{u}(\xi).
\end{align*}
Since the sum is finite, $\hat{u}(\xi)$ factors out:
\begin{align*}
\widehat{P(D)u}(\xi)=\left(\sum_{\substack{\alpha\in\mathbb{N}_0^n, |\alpha|\le m}}a_\alpha \xi^\alpha\right)\hat{u}(\xi).
\end{align*}
By the definition of the polynomial $P$, the parenthesized expression is $P(\xi)$. Hence
\begin{align*}
\widehat{P(D)u}(\xi)=P(\xi)\hat{u}(\xi).
\end{align*}
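A familiar special case can be sketched under the assumption that $P(D)=\sum_{|\alpha|\le m}a_\alpha D^\alpha$, as in the theorem statement. Take $P(\xi)=|\xi|^2=\sum_{j=1}^n\xi_j^2$, so that $m=2$ and the only nonzero coefficients are $a_{2e_j}=1$, where $e_j$ denotes the $j$-th standard basis vector (notation introduced only for this example). Since $D_j^2=(-i)^2\partial_{x_j}^2=-\partial_{x_j}^2$, we get
\begin{align*}
P(D)u=\sum_{j=1}^nD_j^2u=-\Delta u, \qquad \widehat{-\Delta u}(\xi)=|\xi|^2\,\hat u(\xi).
\end{align*}
Thus the negative Laplacian acts on $\mathcal{S}(\mathbb{R}^n)$ as multiplication by $|\xi|^2$ on the Fourier side.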
[/step]
[step:Identify the multiplier and conclude the theorem]
Define
\begin{align*}
M:\mathbb{R}^n\to\mathbb{C}, \qquad M(\xi):=P(\xi).
\end{align*}
The identity proved above says that for every $u\in\mathcal{S}(\mathbb{R}^n)$ and every $\xi\in\mathbb{R}^n$,
\begin{align*}
\widehat{P(D)u}(\xi)=M(\xi)\hat{u}(\xi).
\end{align*}
This is exactly the statement that $P(D)$ acts on $\mathcal{S}(\mathbb{R}^n)$ as the Fourier multiplier with multiplier $M\colon\xi\mapsto P(\xi)$. This completes the proof.
[/step]