[proofplan]
We enlarge $A$ to a set closed under both the function symbols of $L$ and a chosen family of Skolem witness functions for all existential formulas. The closure is obtained by countably iterating these operations, and the cardinal bound follows because there are at most $|L|+\aleph_0$ formulas in finitely many variables and only finitely many arguments are used at each stage. The resulting set is the universe of an $L$-substructure of $\mathcal{N}$, and the chosen witnesses verify the Tarski-Vaught condition, which is then proved directly to avoid an external citation.
[/proofplan]
[step:Choose Skolem witnesses for existential formulas]
Let
\begin{align*}
\kappa := |A| + |L| + \aleph_0.
\end{align*}
Since $\kappa$ is infinite, adjoining finitely or countably many elements to a set of cardinality at most $\kappa$ preserves the bound $\kappa$.
Choose an element $b_0 \in N$, which exists because $\mathcal{N}$ is infinite. Let $\Phi$ denote the set of all $L$-formulas of the form $\varphi(y,x_1,\dots,x_n)$, where $n \in \mathbb{N} \cup \{0\}$ and the displayed variables include all free variables of $\varphi$. The usual recursive construction of first-order formulas gives
\begin{align*}
|\Phi| \leq |L| + \aleph_0 \leq \kappa.
\end{align*}
For each $\varphi(y,x_1,\dots,x_n) \in \Phi$, choose a function
\begin{align*}
F_\varphi: N^n &\to N
\end{align*}
with the following property: for every tuple $(a_1,\dots,a_n) \in N^n$, if
\begin{align*}
\mathcal{N} \models \exists y\,\varphi(y,a_1,\dots,a_n),
\end{align*}
then
\begin{align*}
\mathcal{N} \models \varphi(F_\varphi(a_1,\dots,a_n),a_1,\dots,a_n).
\end{align*}
If no such $y$ exists, define $F_\varphi(a_1,\dots,a_n) := b_0$. This choice is possible by the [axiom of choice](/page/Axiom%20of%20Choice), applied independently to the nonempty witness sets
\begin{align*}
\{c \in N : \mathcal{N} \models \varphi(c,a_1,\dots,a_n)\}.
\end{align*}
[guided]
The purpose of this step is to build, in advance, a witness-selection mechanism for every possible existential formula. Fix
\begin{align*}
\kappa := |A| + |L| + \aleph_0.
\end{align*}
This is the target size bound. Since $\kappa$ is infinite, finite powers, countable unions, and multiplication by at most $\kappa$ many choices will remain bounded by $\kappa$.
We first choose one element $b_0 \in N$. This is legitimate because the universe $N$ is nonempty; indeed $\mathcal{N}$ is assumed infinite. The element $b_0$ is used only as a default value when an existential formula has no witness.
Let $\Phi$ be the set of all formulas $\varphi(y,x_1,\dots,x_n)$, with $n \in \mathbb{N} \cup \{0\}$, in which the displayed variables contain all free variables. The set of first-order formulas is built from the symbols of $L$, variables, logical connectives, quantifiers, and parentheses by finite strings. Hence the number of such formulas is bounded by
\begin{align*}
|\Phi| \leq |L| + \aleph_0 \leq \kappa.
\end{align*}
For each formula $\varphi(y,x_1,\dots,x_n)$ we now define a witness function
\begin{align*}
F_\varphi: N^n &\to N.
\end{align*}
Given $(a_1,\dots,a_n) \in N^n$, if $\mathcal{N}$ satisfies $\exists y\,\varphi(y,a_1,\dots,a_n)$, then the set
\begin{align*}
\{c \in N : \mathcal{N} \models \varphi(c,a_1,\dots,a_n)\}
\end{align*}
is nonempty, so we choose one element of it and call it $F_\varphi(a_1,\dots,a_n)$. If the set is empty, we set $F_\varphi(a_1,\dots,a_n) := b_0$. Thus $F_\varphi$ is a total function on $N^n$, and whenever a witness exists in $\mathcal{N}$, $F_\varphi$ returns one.
[/guided]
[/step]
[step:Iterate closure under language functions and Skolem functions]
Define subsets $B_i \subset N$ recursively. Set
\begin{align*}
B_0 := A \cup \{b_0\}.
\end{align*}
Given $B_i$, define $B_{i+1}$ to be the union of $B_i$ with all elements of the following two forms:
\begin{align*}
f^{\mathcal{N}}(a_1,\dots,a_m)
\end{align*}
where $f$ is an $m$-ary function symbol of $L$, $m \in \mathbb{N} \cup \{0\}$, and $(a_1,\dots,a_m) \in B_i^m$; and
\begin{align*}
F_\varphi(a_1,\dots,a_n)
\end{align*}
where $\varphi(y,x_1,\dots,x_n) \in \Phi$ and $(a_1,\dots,a_n) \in B_i^n$.
Finally define
\begin{align*}
M := \bigcup_{i=0}^{\infty} B_i.
\end{align*}
Then $A \subset M$ and $b_0 \in M$, so $M$ is nonempty.
We prove by induction that $|B_i| \leq \kappa$ for every $i \in \mathbb{N} \cup \{0\}$. For $i=0$,
\begin{align*}
|B_0| \leq |A| + 1 \leq \kappa.
\end{align*}
Assume $|B_i| \leq \kappa$. For each finite $m$, the set $B_i^m$ has cardinality at most $\kappa$. There are at most $|L| \leq \kappa$ function symbols and at most $|\Phi| \leq \kappa$ Skolem functions. Therefore the set of all new values added at stage $i+1$ has cardinality at most $\kappa$, and hence $|B_{i+1}| \leq \kappa$. Thus
\begin{align*}
|M| = \left|\bigcup_{i=0}^{\infty} B_i\right| \leq \kappa \cdot \aleph_0 = \kappa.
\end{align*}
[guided]
We now close $A$ under all operations that are needed for elementarity. Start with
\begin{align*}
B_0 := A \cup \{b_0\}.
\end{align*}
The element $b_0$ ensures that the eventual universe is nonempty, even when $A=\varnothing$ and the language has no constant symbols.
Suppose $B_i$ has been constructed. We put into $B_{i+1}$ every value obtained from elements of $B_i$ by an actual function symbol of the language:
\begin{align*}
f^{\mathcal{N}}(a_1,\dots,a_m),
\end{align*}
where $f$ is an $m$-ary function symbol of $L$ and $(a_1,\dots,a_m) \in B_i^m$. This guarantees that the final set will be closed under the functions of the language.
We also put into $B_{i+1}$ every value obtained by a Skolem witness function:
\begin{align*}
F_\varphi(a_1,\dots,a_n),
\end{align*}
where $\varphi(y,x_1,\dots,x_n) \in \Phi$ and $(a_1,\dots,a_n) \in B_i^n$. This guarantees that if an existential statement with parameters already in the construction is true in $\mathcal{N}$, then a chosen witness is added at the next stage.
Define the final set by
\begin{align*}
M := \bigcup_{i=0}^{\infty} B_i.
\end{align*}
Then $A \subset M$ because $A \subset B_0$, and $M$ is nonempty because $b_0 \in B_0$.
It remains to check the size bound. We prove $|B_i| \leq \kappa$ by induction. The base case is
\begin{align*}
|B_0| \leq |A| + 1 \leq \kappa.
\end{align*}
For the induction step, assume $|B_i| \leq \kappa$. If $m$ is finite, then
\begin{align*}
|B_i^m| \leq \kappa^m = \kappa,
\end{align*}
because $\kappa$ is infinite. There are at most $\kappa$ function symbols of $L$ and at most $\kappa$ formulas in $\Phi$. Therefore all values added using language functions and all values added using Skolem functions together form a set of cardinality at most $\kappa$. Hence $|B_{i+1}| \leq \kappa$.
Finally, $M$ is a countable union of sets of cardinality at most $\kappa$, so
\begin{align*}
|M| = \left|\bigcup_{i=0}^{\infty} B_i\right| \leq \kappa \cdot \aleph_0 = \kappa.
\end{align*}
[/guided]
[/step]
[step:Make the closed set into an $L$-substructure of $\mathcal{N}$]
Let $\mathcal{M}$ be the $L$-structure with universe $M$ whose interpretations are inherited from $\mathcal{N}$: each relation symbol is restricted to $M$, each constant symbol is interpreted as its value in $\mathcal{N}$, and each function symbol is interpreted by restricting the corresponding function of $\mathcal{N}$ to tuples from $M$.
This is well-defined because $M$ is closed under all function symbols of $L$. Indeed, if $f$ is an $m$-ary function symbol and $(a_1,\dots,a_m) \in M^m$, then each $a_j$ belongs to some $B_{i_j}$. Let
\begin{align*}
r := \max\{i_1,\dots,i_m\}.
\end{align*}
Then $(a_1,\dots,a_m) \in B_r^m$, so by construction
\begin{align*}
f^{\mathcal{N}}(a_1,\dots,a_m) \in B_{r+1} \subset M.
\end{align*}
Thus $\mathcal{M}$ is an $L$-substructure of $\mathcal{N}$.
[/step]
[step:Verify the existential witness condition inside $M$]
Let $\varphi(y,x_1,\dots,x_n) \in \Phi$ and let $(a_1,\dots,a_n) \in M^n$. Suppose
\begin{align*}
\mathcal{N} \models \exists y\,\varphi(y,a_1,\dots,a_n).
\end{align*}
Choose $r \in \mathbb{N} \cup \{0\}$ such that $a_1,\dots,a_n \in B_r$. By the defining property of $F_\varphi$,
\begin{align*}
\mathcal{N} \models \varphi(F_\varphi(a_1,\dots,a_n),a_1,\dots,a_n).
\end{align*}
By construction,
\begin{align*}
F_\varphi(a_1,\dots,a_n) \in B_{r+1} \subset M.
\end{align*}
Therefore every existential formula with parameters in $M$ that is realized in $\mathcal{N}$ has a realization in $M$.
[/step]
[step:Derive elementarity from the existential witness condition]
We prove that $\mathcal{M} \preccurlyeq \mathcal{N}$. It suffices to prove, by induction on formulas, that for every $L$-formula $\theta(x_1,\dots,x_n)$ and every tuple $(a_1,\dots,a_n) \in M^n$,
\begin{align*}
\mathcal{M} \models \theta(a_1,\dots,a_n)
\quad\Longleftrightarrow\quad
\mathcal{N} \models \theta(a_1,\dots,a_n).
\end{align*}
For atomic formulas this follows from the definition of substructure: function terms have the same values in $\mathcal{M}$ and $\mathcal{N}$ on tuples from $M$, and relation symbols are interpreted in $\mathcal{M}$ by restriction from $\mathcal{N}$. The Boolean connectives follow immediately from the induction hypothesis.
It remains to handle existential quantifiers. Let $\theta(x_1,\dots,x_n)$ be the formula
\begin{align*}
\exists y\,\psi(y,x_1,\dots,x_n).
\end{align*}
Assume the induction hypothesis holds for $\psi$. If
\begin{align*}
\mathcal{M} \models \exists y\,\psi(y,a_1,\dots,a_n),
\end{align*}
then some $c \in M$ satisfies $\mathcal{M} \models \psi(c,a_1,\dots,a_n)$, so by the induction hypothesis
\begin{align*}
\mathcal{N} \models \psi(c,a_1,\dots,a_n),
\end{align*}
and hence $\mathcal{N} \models \exists y\,\psi(y,a_1,\dots,a_n)$.
Conversely, suppose
\begin{align*}
\mathcal{N} \models \exists y\,\psi(y,a_1,\dots,a_n).
\end{align*}
By the existential witness condition proved above, there exists $c \in M$ such that
\begin{align*}
\mathcal{N} \models \psi(c,a_1,\dots,a_n).
\end{align*}
By the induction hypothesis applied to $\psi$,
\begin{align*}
\mathcal{M} \models \psi(c,a_1,\dots,a_n).
\end{align*}
Therefore
\begin{align*}
\mathcal{M} \models \exists y\,\psi(y,a_1,\dots,a_n).
\end{align*}
Universal quantifiers are handled by rewriting $\forall y\,\psi$ as $\neg\exists y\,\neg\psi$.
Thus $\mathcal{M} \preccurlyeq \mathcal{N}$. Together with $A \subset M$ and $|M| \leq |A|+|L|+\aleph_0$, this proves the theorem.
[guided]
The final point is to show that the substructure is elementary, meaning that it satisfies exactly the same first-order formulas with parameters from $M$ as $\mathcal{N}$ does. We prove the full statement by induction on formulas: for every formula $\theta(x_1,\dots,x_n)$ and every tuple $(a_1,\dots,a_n) \in M^n$,
\begin{align*}
\mathcal{M} \models \theta(a_1,\dots,a_n)
\quad\Longleftrightarrow\quad
\mathcal{N} \models \theta(a_1,\dots,a_n).
\end{align*}
For atomic formulas, this is exactly what it means for $\mathcal{M}$ to be a substructure of $\mathcal{N}$. Terms are evaluated the same way in both structures because the functions of $\mathcal{M}$ are restrictions of the functions of $\mathcal{N}$. Relation symbols also agree because every relation in $\mathcal{M}$ is the restriction of the corresponding relation in $\mathcal{N}$ to tuples from $M$.
The induction steps for negation, conjunction, disjunction, and implication are direct consequences of the induction hypothesis. The only non-formal case is the existential quantifier. Suppose
\begin{align*}
\theta(x_1,\dots,x_n) \equiv \exists y\,\psi(y,x_1,\dots,x_n).
\end{align*}
If $\mathcal{M}$ satisfies this formula, then there is some $c \in M$ with
\begin{align*}
\mathcal{M} \models \psi(c,a_1,\dots,a_n).
\end{align*}
By the induction hypothesis for $\psi$, the same tuple satisfies $\psi$ in $\mathcal{N}$, so
\begin{align*}
\mathcal{N} \models \exists y\,\psi(y,a_1,\dots,a_n).
\end{align*}
The converse is where the Skolem closure is used. Suppose
\begin{align*}
\mathcal{N} \models \exists y\,\psi(y,a_1,\dots,a_n).
\end{align*}
The existential witness condition gives a witness $c \in M$ such that
\begin{align*}
\mathcal{N} \models \psi(c,a_1,\dots,a_n).
\end{align*}
Now apply the induction hypothesis for $\psi$ again, this time from $\mathcal{N}$ back to $\mathcal{M}$:
\begin{align*}
\mathcal{M} \models \psi(c,a_1,\dots,a_n).
\end{align*}
Hence
\begin{align*}
\mathcal{M} \models \exists y\,\psi(y,a_1,\dots,a_n).
\end{align*}
Universal quantifiers require no separate argument because $\forall y\,\psi$ is equivalent to $\neg\exists y\,\neg\psi$ in first-order logic. Therefore all formulas agree between $\mathcal{M}$ and $\mathcal{N}$ on parameters from $M$, which is precisely $\mathcal{M} \preccurlyeq \mathcal{N}$.
[/guided]
[/step]