[proofplan]
We expand the language by constants for the elements of $M$ and by one additional constant $c$. The finite satisfiability hypothesis lets us apply the [compactness theorem](/theorems/2748) to obtain a model of the full theory $\operatorname{Diag}_{\mathrm{el}}(M) \cup \Sigma(c)$. The constants naming elements of $M$ then define an elementary embedding of $M$ into the reduct of this model, because the elementary diagram records every first-order truth with parameters from $M$. Finally, the interpretation of $c$ realizes $\Sigma$, and we identify $M$ with its elementary image.
[/proofplan]
[step:Use compactness to realize the full expanded theory]
Let $\mathcal{L}^+ := \mathcal{L}(M) \cup \{c\}$, where $c$ is a constant symbol not belonging to $\mathcal{L}(M)$. Define the $\mathcal{L}^+$-theory
\begin{align*}
T := \operatorname{Diag}_{\mathrm{el}}(M) \cup \Sigma(c).
\end{align*}
By hypothesis, every finite subset of $T$ is satisfiable. Hence, by the [compactness theorem for first-order logic](/theorems/4290), there exists an $\mathcal{L}^+$-structure $N^+$ such that
\begin{align*}
N^+ \models T.
\end{align*}
[guided]
We first put all the relevant information into one first-order theory. The language $\mathcal{L}(M)$ already contains a constant symbol $\bar{m}$ for each element $m \in M$, and we add one more constant symbol $c$ whose eventual interpretation will be the desired realizing element. Thus the working language is
\begin{align*}
\mathcal{L}^+ := \mathcal{L}(M) \cup \{c\}.
\end{align*}
In this language, define
\begin{align*}
T := \operatorname{Diag}_{\mathrm{el}}(M) \cup \Sigma(c).
\end{align*}
The first part of $T$ forces a model to contain an elementary copy of $M$, while the second part forces the interpretation of $c$ to satisfy all formulas in $\Sigma$.
The hypothesis says precisely that every finite subset of this theory $T$ is satisfiable. The compactness theorem for first-order logic states that if every finite subset of a first-order theory is satisfiable, then the entire theory is satisfiable. Applying compactness to the $\mathcal{L}^+$-theory $T$, we obtain an $\mathcal{L}^+$-structure $N^+$ satisfying
\begin{align*}
N^+ \models T.
\end{align*}
This is the only place where compactness is used.
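For illustration only, here is how the finite satisfiability hypothesis is typically verified in a concrete case. The choice of $M$ and $\Sigma$ below is an assumption made for this example and is not part of the proof. Take $M = (\mathbb{N}, <)$ and $\Sigma(x) = \{\bar{n} < x : n \in \mathbb{N}\}$. A finite subset $T_0 \subseteq T$ contains only finitely many formulas of $\Sigma(c)$, say $\bar{n}_1 < c, \dots, \bar{n}_k < c$. Choose any $p \in \mathbb{N}$ with $p > n_i$ for all $i$, and let $M_0$ be the expansion of $M$ in which each $\bar{m}$ is interpreted as $m$ and $c$ is interpreted as $p$. Then
\begin{align*}
M_0 \models \operatorname{Diag}_{\mathrm{el}}(M) \cup \{\bar{n}_1 < c, \dots, \bar{n}_k < c\},
\end{align*}
and this set contains $T_0$. So every finite subset of $T$ is satisfiable, even though no single element of $\mathbb{N}$ satisfies all of $\Sigma(x)$ at once.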
[/guided]
[/step]
[step:Extract an elementary copy of $M$ from the elementary diagram]
Let $N$ be the reduct of $N^+$ to the original language $\mathcal{L}$. Define a map
\begin{align*}
j: M &\to N \\
m &\mapsto \bar{m}^{N^+}.
\end{align*}
The map $j$ is injective: if $m_1,m_2 \in M$ and $m_1 \neq m_2$, then the sentence $\bar{m}_1 \neq \bar{m}_2$ belongs to $\operatorname{Diag}_{\mathrm{el}}(M)$, so $N^+ \models \bar{m}_1 \neq \bar{m}_2$, and therefore $j(m_1) = \bar{m}_1^{N^+} \neq \bar{m}_2^{N^+} = j(m_2)$.
We show that $j$ is elementary. Let $\psi(y_1,\dots,y_n)$ be any $\mathcal{L}$-formula, and let $m_1,\dots,m_n \in M$. If
\begin{align*}
M \models \psi(m_1,\dots,m_n),
\end{align*}
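then the $\mathcal{L}(M)$-sentence $\psi(\bar{m}_1,\dots,\bar{m}_n)$ belongs to $\operatorname{Diag}_{\mathrm{el}}(M)$, so $N^+$ satisfies it. Since $\bar{m}_i^{N^+} = j(m_i)$ and $\psi$ is an $\mathcal{L}$-formula, this gives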
\begin{align*}
N \models \psi(j(m_1),\dots,j(m_n)).
\end{align*}
If instead
\begin{align*}
M \models \neg \psi(m_1,\dots,m_n),
\end{align*}
then $\neg \psi(\bar{m}_1,\dots,\bar{m}_n)$ belongs to $\operatorname{Diag}_{\mathrm{el}}(M)$, hence
\begin{align*}
N \models \neg \psi(j(m_1),\dots,j(m_n)).
\end{align*}
Therefore, for every $\mathcal{L}$-formula $\psi(y_1,\dots,y_n)$ and every tuple $(m_1,\dots,m_n) \in M^n$,
\begin{align*}
M \models \psi(m_1,\dots,m_n)
\iff
N \models \psi(j(m_1),\dots,j(m_n)).
\end{align*}
Thus $j: M \to N$ is an elementary embedding.
[guided]
The constants $\bar{m}$ in the language are meant to name the elements of $M$. In the model $N^+$, each such constant has an interpretation $\bar{m}^{N^+}$. We use these interpretations to define
\begin{align*}
j: M &\to N \\
m &\mapsto \bar{m}^{N^+},
\end{align*}
where $N$ is the $\mathcal{L}$-reduct of $N^+$.
First, $j$ is injective. Suppose $m_1,m_2 \in M$ and $m_1 \neq m_2$. Since the elementary diagram contains every $\mathcal{L}(M)$-sentence true in $M$, the sentence
\begin{align*}
\bar{m}_1 \neq \bar{m}_2
\end{align*}
belongs to $\operatorname{Diag}_{\mathrm{el}}(M)$. Because $N^+ \models \operatorname{Diag}_{\mathrm{el}}(M)$, we get
\begin{align*}
N^+ \models \bar{m}_1 \neq \bar{m}_2.
\end{align*}
Therefore $\bar{m}_1^{N^+} \neq \bar{m}_2^{N^+}$, so $j(m_1) \neq j(m_2)$.
Now we verify elementarity. Let $\psi(y_1,\dots,y_n)$ be an arbitrary $\mathcal{L}$-formula, and let $m_1,\dots,m_n \in M$. The point of using the elementary diagram, rather than only the atomic diagram, is that it records the truth of every formula with parameters from $M$, not only atomic formulas.
If
\begin{align*}
M \models \psi(m_1,\dots,m_n),
\end{align*}
then the sentence
\begin{align*}
\psi(\bar{m}_1,\dots,\bar{m}_n)
\end{align*}
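is true in the natural $\mathcal{L}(M)$-expansion of $M$. Hence it belongs to $\operatorname{Diag}_{\mathrm{el}}(M)$. Since $N^+ \models \operatorname{Diag}_{\mathrm{el}}(M)$, the structure $N^+$ satisfies this sentence. Each constant $\bar{m}_i$ is interpreted in $N^+$ as $j(m_i)$, and $\psi$ involves only symbols of $\mathcal{L}$, so the $\mathcal{L}$-reduct $N$ satisfies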
\begin{align*}
N \models \psi(j(m_1),\dots,j(m_n)).
\end{align*}
If instead
\begin{align*}
M \models \neg \psi(m_1,\dots,m_n),
\end{align*}
then
\begin{align*}
\neg \psi(\bar{m}_1,\dots,\bar{m}_n)
\end{align*}
belongs to $\operatorname{Diag}_{\mathrm{el}}(M)$, so
\begin{align*}
N \models \neg \psi(j(m_1),\dots,j(m_n)).
\end{align*}
The first case gives the forward implication and the second gives the contrapositive of the reverse implication, so
\begin{align*}
M \models \psi(m_1,\dots,m_n)
\iff
N \models \psi(j(m_1),\dots,j(m_n)).
\end{align*}
Since $\psi$ and the tuple from $M$ were arbitrary, $j$ is an elementary embedding.
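To see why the atomic diagram alone would not suffice, consider the following illustration (an example chosen here, not taken from the proof). The inclusion of $(\mathbb{N},<)$ into $(\mathbb{Z},<)$ preserves and reflects the order, so $(\mathbb{Z},<)$, with each $\bar{n}$ interpreted as $n$, satisfies every quantifier-free $\mathcal{L}(\mathbb{N})$-sentence true in $(\mathbb{N},<)$. However, for the formula $\psi(y) := \neg \exists z\,(z < y)$,
\begin{align*}
(\mathbb{N},<) \models \psi(0)
\quad\text{but}\quad
(\mathbb{Z},<) \models \neg \psi(0),
\end{align*}
since $-1 < 0$ in $\mathbb{Z}$. The sentence $\psi(\bar{0})$ belongs to the elementary diagram of $(\mathbb{N},<)$, so a model of the elementary diagram cannot fail elementarity in this way.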
[/guided]
[/step]
[step:Use the interpretation of $c$ to realize $\Sigma$]
Define
\begin{align*}
a := c^{N^+} \in N.
\end{align*}
Let $\varphi(x) \in \Sigma(x)$. Then $\varphi(c) \in \Sigma(c) \subseteq T$, and $N^+ \models T$, so
\begin{align*}
N^+ \models \varphi(c).
\end{align*}
Equivalently, in the $\mathcal{L}$-reduct $N$, with each parameter $\bar{m}$ interpreted as $j(m)$,
\begin{align*}
N \models \varphi(a).
\end{align*}
Thus $a$ realizes every formula in $\Sigma(x)$ over the elementary copy $j[M] \subseteq N$.
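The passage from $N^+$ to $N$ can be spelled out as follows, assuming (as the inclusion $\Sigma(c) \subseteq T$ suggests) that the formulas of $\Sigma$ are $\mathcal{L}(M)$-formulas. Write $\varphi(x)$ as $\theta(x,\bar{m}_1,\dots,\bar{m}_k)$, where $\theta(x,y_1,\dots,y_k)$ is an $\mathcal{L}$-formula and $m_1,\dots,m_k \in M$; the names $\theta$ and $k$ are introduced here only for this illustration. Then
\begin{align*}
N^+ \models \theta(c,\bar{m}_1,\dots,\bar{m}_k)
\iff
N \models \theta(a,j(m_1),\dots,j(m_k)),
\end{align*}
because $c^{N^+} = a$, $\bar{m}_i^{N^+} = j(m_i)$, and $\theta$ uses only symbols of $\mathcal{L}$.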
[/step]
[step:Identify $M$ with its elementary image]
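Since $j: M \to N$ is an elementary embedding, its image $j[M]$ is an elementary substructure of $N$, and $j$ is an isomorphism of $M$ onto $j[M]$. Identifying each $m \in M$ with $j(m)$ (equivalently, replacing $N$ by an isomorphic copy in which $j$ becomes the inclusion map), we may regard $M$ as an elementary substructure of $N$; that is,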
\begin{align*}
N \succeq M.
\end{align*}
Under this identification, the element $a = c^{N^+}$ satisfies
\begin{align*}
N \models \varphi(a)
\end{align*}
for every $\varphi(x) \in \Sigma(x)$. Therefore $N$ is an elementary extension of $M$ containing an element realizing $\Sigma(x)$, as required.
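The identification can be made explicit. The following is a sketch, assuming (after renaming elements of $N$ if necessary) that $M$ and $N \setminus j[M]$ are disjoint; the names $N'$ and $h$ are introduced here for illustration. Let $N' := M \cup (N \setminus j[M])$ as a set, and define the bijection
\begin{align*}
h: N' &\to N \\
u &\mapsto
\begin{cases}
j(u) & \text{if } u \in M, \\
u & \text{if } u \in N \setminus j[M].
\end{cases}
\end{align*}
Equip $N'$ with the unique $\mathcal{L}$-structure that makes $h$ an isomorphism. Then $M \subseteq N'$, the inclusion $M \to N'$ equals $h^{-1} \circ j$ and is therefore elementary, and $h^{-1}(a)$ realizes $\Sigma(x)$ in $N'$.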
[/step]