[proofplan]
We prove the stronger Łoś equivalence for arbitrary functions $f_1,\dots,f_n:I\to M$: a formula holds of $[f_1]_{\mathcal U},\dots,[f_n]_{\mathcal U}$ in the ultrapower exactly when its coordinatewise truth set belongs to $\mathcal U$. The proof is by induction on formulas: the atomic cases follow from the definition of the ultrapower interpretation, the Boolean cases from the ultrafilter properties, and the quantifier step from choosing witnesses coordinatewise. Applying this equivalence to constant functions shows that the diagonal map preserves and reflects every formula, because the relevant truth set is then either all of $I$ or $\varnothing$.
[/proofplan]
[step:Define the ultrapower notation and the diagonal map]
For functions $f,g:I\to M$, define
\begin{align*}
f \sim_{\mathcal U} g
\quad \Longleftrightarrow \quad
\{i\in I: f(i)=g(i)\}\in \mathcal U.
\end{align*}
Because $\mathcal U$ is a filter, $\sim_{\mathcal U}$ is an equivalence relation on $M^I$. The universe of the ultrapower $M^I/\mathcal U$ is the quotient set of $M^I$ by $\sim_{\mathcal U}$, and the equivalence class of $f:I\to M$ is denoted $[f]_{\mathcal U}$.
For each $a\in M$, define the constant map
\begin{align*}
c_a:I&\to M\\
i&\mapsto a.
\end{align*}
The diagonal map is
\begin{align*}
d:M&\to M^I/\mathcal U\\
a&\mapsto [c_a]_{\mathcal U}.
\end{align*}
Suppose $a,b\in M$ and $d(a)=d(b)$. Then $c_a\sim_{\mathcal U}c_b$, so
\begin{align*}
\{i\in I:c_a(i)=c_b(i)\}
=
\{i\in I:a=b\}
\in\mathcal U.
\end{align*}
If $a\ne b$, this set is $\varnothing$, which is not an element of the proper ultrafilter $\mathcal U$. Hence $a=b$, so $d$ is injective.
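As an illustration of how $\sim_{\mathcal U}$ identifies maps, consider the following sketch. It assumes, beyond what is required above, that $I=\mathbb N$, that $\mathcal U$ is non-principal (so every cofinite subset of $\mathbb N$ belongs to $\mathcal U$), and that $a,b\in M$ are distinct; the map $g$ is introduced only for this example. Define
\begin{align*}
g:\mathbb N&\to M\\
i&\mapsto
\begin{cases}
b, & \text{if } i=0,\\
a, & \text{if } i\ge 1.
\end{cases}
\end{align*}
Then
\begin{align*}
\{i\in\mathbb N:g(i)=c_a(i)\}=\mathbb N\setminus\{0\}\in\mathcal U,
\end{align*}
so $[g]_{\mathcal U}=[c_a]_{\mathcal U}=d(a)$ even though $g\ne c_a$ as maps.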
[/step]
[step:Prove the Łoś equivalence for arbitrary representatives]
[claim:Łoś equivalence for the ultrapower]
For every $n\in\mathbb N$, every $L$-formula $\varphi(x_1,\dots,x_n)$, and every tuple of maps $f_1,\dots,f_n:I\to M$,
\begin{align*}
M^I/\mathcal U \models \varphi([f_1]_{\mathcal U},\dots,[f_n]_{\mathcal U})
\quad \Longleftrightarrow \quad
\{i\in I: M\models \varphi(f_1(i),\dots,f_n(i))\}\in\mathcal U.
\end{align*}
[/claim]
[proof]
We prove the statement by induction on the formation of the formula $\varphi$.
For an atomic equality formula $t_1=t_2$, where $t_1$ and $t_2$ are $L$-terms in variables $x_1,\dots,x_n$, define maps
\begin{align*}
t_k^M(f_1,\dots,f_n):I&\to M\\
i&\mapsto t_k^M(f_1(i),\dots,f_n(i))
\end{align*}
for $k\in\{1,2\}$. By induction on the construction of terms, using the coordinatewise interpretation of the constant and function symbols in the ultrapower,
\begin{align*}
t_k^{M^I/\mathcal U}([f_1]_{\mathcal U},\dots,[f_n]_{\mathcal U})
=
[t_k^M(f_1,\dots,f_n)]_{\mathcal U}.
\end{align*}
Therefore, by the definition of $\sim_{\mathcal U}$,
\begin{align*}
M^I/\mathcal U \models t_1=t_2
\end{align*}
at the tuple $([f_1]_{\mathcal U},\dots,[f_n]_{\mathcal U})$ if and only if
\begin{align*}
\{i\in I:t_1^M(f_1(i),\dots,f_n(i))=t_2^M(f_1(i),\dots,f_n(i))\}\in\mathcal U,
\end{align*}
which is exactly the required coordinatewise truth set.
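For an atomic relation formula $R(t_1,\dots,t_m)$, where $R$ is an $m$-ary relation symbol of $L$ and $t_1,\dots,t_m$ are $L$-terms in $x_1,\dots,x_n$, the interpretation of $R$ in the ultrapower, combined with the coordinatewise evaluation of terms above, gives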
\begin{align*}
M^I/\mathcal U \models R(t_1,\dots,t_m)([f_1]_{\mathcal U},\dots,[f_n]_{\mathcal U})
\end{align*}
if and only if
\begin{align*}
\{i\in I: M\models R(t_1^M(f_1(i),\dots,f_n(i)),\dots,t_m^M(f_1(i),\dots,f_n(i)))\}\in\mathcal U.
\end{align*}
This is again the required coordinatewise truth set.
Assume the equivalence has been proved for formulas $\psi$ and $\theta$. For $\neg\psi$, let
\begin{align*}
A_\psi=\{i\in I:M\models\psi(f_1(i),\dots,f_n(i))\}.
\end{align*}
By the induction hypothesis,
\begin{align*}
M^I/\mathcal U\models \psi([f_1]_{\mathcal U},\dots,[f_n]_{\mathcal U})
\quad \Longleftrightarrow \quad
A_\psi\in\mathcal U.
\end{align*}
Since $\mathcal U$ is an ultrafilter, $I\setminus A_\psi\in\mathcal U$ if and only if $A_\psi\notin\mathcal U$. But
\begin{align*}
I\setminus A_\psi
=
\{i\in I:M\models \neg\psi(f_1(i),\dots,f_n(i))\}.
\end{align*}
Thus the equivalence holds for $\neg\psi$.
For $\psi\wedge\theta$, define
\begin{align*}
A_\psi&=\{i\in I:M\models\psi(f_1(i),\dots,f_n(i))\},\\
A_\theta&=\{i\in I:M\models\theta(f_1(i),\dots,f_n(i))\}.
\end{align*}
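By the induction hypothesis, $M^I/\mathcal U\models(\psi\wedge\theta)([f_1]_{\mathcal U},\dots,[f_n]_{\mathcal U})$ if and only if $A_\psi\in\mathcal U$ and $A_\theta\in\mathcal U$. Since $\mathcal U$ is closed under finite intersections and under supersets, the latter holds if and only if $A_\psi\cap A_\theta\in\mathcal U$. Hence
\begin{align*}
M^I/\mathcal U\models (\psi\wedge\theta)([f_1]_{\mathcal U},\dots,[f_n]_{\mathcal U})
\quad \Longleftrightarrow \quad
A_\psi\cap A_\theta\in\mathcal U.
\end{align*}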
Since
\begin{align*}
A_\psi\cap A_\theta
=
\{i\in I:M\models(\psi\wedge\theta)(f_1(i),\dots,f_n(i))\},
\end{align*}
the equivalence holds for conjunction. The remaining Boolean connectives follow from negation and conjunction.
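For instance, the disjunction case can be spelled out as follows. This is a sketch of the routine reduction, assuming $\psi\vee\theta$ is treated as an abbreviation of $\neg(\neg\psi\wedge\neg\theta)$ and writing $A_\psi$, $A_\theta$ for the truth sets defined above. The negation and conjunction cases give
\begin{align*}
M^I/\mathcal U\models(\psi\vee\theta)([f_1]_{\mathcal U},\dots,[f_n]_{\mathcal U})
&\Longleftrightarrow (I\setminus A_\psi)\cap(I\setminus A_\theta)\notin\mathcal U\\
&\Longleftrightarrow I\setminus(A_\psi\cup A_\theta)\notin\mathcal U\\
&\Longleftrightarrow A_\psi\cup A_\theta\in\mathcal U,
\end{align*}
and $A_\psi\cup A_\theta$ is the coordinatewise truth set of $\psi\vee\theta$.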
Now let $\varphi$ be $\exists y\,\psi(y,x_1,\dots,x_n)$, and assume the equivalence has been proved for $\psi$. First suppose
\begin{align*}
M^I/\mathcal U\models \exists y\,\psi(y,[f_1]_{\mathcal U},\dots,[f_n]_{\mathcal U}).
\end{align*}
Then there is a map $g:I\to M$ such that
\begin{align*}
M^I/\mathcal U\models \psi([g]_{\mathcal U},[f_1]_{\mathcal U},\dots,[f_n]_{\mathcal U}).
\end{align*}
By the induction hypothesis,
\begin{align*}
A=\{i\in I:M\models\psi(g(i),f_1(i),\dots,f_n(i))\}\in\mathcal U.
\end{align*}
Since $A$ is contained in
\begin{align*}
B=\{i\in I:M\models \exists y\,\psi(y,f_1(i),\dots,f_n(i))\},
\end{align*}
and $\mathcal U$ is upward closed, $B\in\mathcal U$.
Conversely, suppose
\begin{align*}
B=\{i\in I:M\models \exists y\,\psi(y,f_1(i),\dots,f_n(i))\}\in\mathcal U.
\end{align*}
For each $i\in B$, use the axiom of choice to select an element $g(i)\in M$ such that
\begin{align*}
M\models\psi(g(i),f_1(i),\dots,f_n(i)).
\end{align*}
For each $i\in I\setminus B$, let $g(i)$ be a fixed element of $M$; this is possible because an $L$-structure has non-empty universe. This defines a map $g:I\to M$. Then
\begin{align*}
B\subseteq \{i\in I:M\models\psi(g(i),f_1(i),\dots,f_n(i))\},
\end{align*}
so the latter set belongs to $\mathcal U$. By the induction hypothesis,
\begin{align*}
M^I/\mathcal U\models \psi([g]_{\mathcal U},[f_1]_{\mathcal U},\dots,[f_n]_{\mathcal U}),
\end{align*}
and hence
\begin{align*}
M^I/\mathcal U\models \exists y\,\psi(y,[f_1]_{\mathcal U},\dots,[f_n]_{\mathcal U}).
\end{align*}
Thus the equivalence holds for existential quantification. Universal quantification follows from negation and existential quantification. The induction is complete.
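The reduction for the universal quantifier can be sketched as follows, assuming $\forall y\,\psi$ is treated as an abbreviation of $\neg\exists y\,\neg\psi$. The negation case gives the equivalence for $\neg\psi$, the existential case then gives it for $\exists y\,\neg\psi$, and a second use of the negation case yields
\begin{align*}
M^I/\mathcal U\models\forall y\,\psi(y,[f_1]_{\mathcal U},\dots,[f_n]_{\mathcal U})
&\Longleftrightarrow \{i\in I:M\models\exists y\,\neg\psi(y,f_1(i),\dots,f_n(i))\}\notin\mathcal U\\
&\Longleftrightarrow \{i\in I:M\models\forall y\,\psi(y,f_1(i),\dots,f_n(i))\}\in\mathcal U,
\end{align*}
where the last step uses that the two sets are complements of each other in $I$ and that $\mathcal U$ is an ultrafilter.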
[/proof]
[guided]
The point of this step is to prove the exact transfer principle needed for the diagonal embedding, rather than merely asserting it. For arbitrary maps $f_1,\dots,f_n:I\to M$, the statement says that a formula is true of the equivalence classes $[f_1]_{\mathcal U},\dots,[f_n]_{\mathcal U}$ in the ultrapower exactly when the set of indices where the formula is true coordinatewise in $M$ belongs to the ultrafilter.
For atomic equality, the definition of the quotient relation is exactly designed to make this work. If $t_1$ and $t_2$ are $L$-terms in variables $x_1,\dots,x_n$, define
\begin{align*}
t_k^M(f_1,\dots,f_n):I&\to M\\
i&\mapsto t_k^M(f_1(i),\dots,f_n(i))
\end{align*}
for $k\in\{1,2\}$. Then equality of the two term-values in the ultrapower means equality of these two coordinate functions modulo $\mathcal U$:
\begin{align*}
[t_1^M(f_1,\dots,f_n)]_{\mathcal U}
=
[t_2^M(f_1,\dots,f_n)]_{\mathcal U}.
\end{align*}
By definition of $\sim_{\mathcal U}$, this is equivalent to
\begin{align*}
\{i\in I:t_1^M(f_1(i),\dots,f_n(i))=t_2^M(f_1(i),\dots,f_n(i))\}\in\mathcal U.
\end{align*}
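To make the coordinatewise evaluation of terms concrete, here is a small illustrative case. The language and the term are assumptions chosen for the example and are not part of the argument: suppose $L$ contains a binary function symbol $+$, and let $t(x_1,x_2)$ be the term $x_1+x_2$. Then for maps $f_1,f_2:I\to M$,
\begin{align*}
t^M(f_1,f_2):I&\to M\\
i&\mapsto f_1(i)+^M f_2(i),
\end{align*}
and the ultrapower interprets $+$ so that
\begin{align*}
[f_1]_{\mathcal U}+^{M^I/\mathcal U}[f_2]_{\mathcal U}
=
[t^M(f_1,f_2)]_{\mathcal U}.
\end{align*}
Under these assumptions, the atomic formula $x_1+x_2=x_2+x_1$ holds at $([f_1]_{\mathcal U},[f_2]_{\mathcal U})$ exactly when
\begin{align*}
\{i\in I:f_1(i)+^M f_2(i)=f_2(i)+^M f_1(i)\}\in\mathcal U.
\end{align*}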
For atomic relation formulas, the relation symbol is interpreted coordinatewise modulo $\mathcal U$. Thus, if $R$ is an $m$-ary relation symbol, then
\begin{align*}
M^I/\mathcal U \models R(t_1,\dots,t_m)([f_1]_{\mathcal U},\dots,[f_n]_{\mathcal U})
\end{align*}
if and only if
\begin{align*}
\{i\in I: M\models R(t_1^M(f_1(i),\dots,f_n(i)),\dots,t_m^M(f_1(i),\dots,f_n(i)))\}\in\mathcal U.
\end{align*}
The Boolean connectives are where the ultrafilter properties enter. For negation, the coordinate truth set changes to the complement. If
\begin{align*}
A_\psi=\{i\in I:M\models\psi(f_1(i),\dots,f_n(i))\},
\end{align*}
then the truth set for $\neg\psi$ is $I\setminus A_\psi$. Since $\mathcal U$ is an ultrafilter, exactly one of $A_\psi$ and $I\setminus A_\psi$ belongs to $\mathcal U$. This gives the equivalence for negation. For conjunction, the coordinate truth set is the intersection:
\begin{align*}
\{i\in I:M\models(\psi\wedge\theta)(f_1(i),\dots,f_n(i))\}
=
A_\psi\cap A_\theta.
\end{align*}
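Here $A_\theta$ denotes the coordinate truth set of $\theta$, defined like $A_\psi$. Closure of $\mathcal U$ under finite intersections and under supersets shows that $A_\psi\cap A_\theta\in\mathcal U$ if and only if both $A_\psi\in\mathcal U$ and $A_\theta\in\mathcal U$, which gives the equivalence for $\psi\wedge\theta$.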
The existential quantifier is the only step requiring a genuine witness argument. Let $\varphi$ be $\exists y\,\psi(y,x_1,\dots,x_n)$. If the ultrapower satisfies
\begin{align*}
M^I/\mathcal U\models \exists y\,\psi(y,[f_1]_{\mathcal U},\dots,[f_n]_{\mathcal U}),
\end{align*}
then some element of the ultrapower, represented by a map $g:I\to M$, satisfies
\begin{align*}
M^I/\mathcal U\models \psi([g]_{\mathcal U},[f_1]_{\mathcal U},\dots,[f_n]_{\mathcal U}).
\end{align*}
By the induction hypothesis, the set
\begin{align*}
A=\{i\in I:M\models\psi(g(i),f_1(i),\dots,f_n(i))\}
\end{align*}
belongs to $\mathcal U$. Every index in $A$ is an index where the existential formula is true, so
\begin{align*}
A\subseteq B
\end{align*}
where
\begin{align*}
B=\{i\in I:M\models\exists y\,\psi(y,f_1(i),\dots,f_n(i))\}.
\end{align*}
Because $\mathcal U$ is upward closed, $B\in\mathcal U$.
Conversely, suppose this existential truth set $B$ belongs to $\mathcal U$. For every $i\in B$, choose a witness $g(i)\in M$ with
\begin{align*}
M\models\psi(g(i),f_1(i),\dots,f_n(i)).
\end{align*}
For indices outside $B$, choose any value of $g(i)$ in $M$; the universe of an $L$-structure is non-empty, so this is possible. This defines a map $g:I\to M$. Then
\begin{align*}
B\subseteq \{i\in I:M\models\psi(g(i),f_1(i),\dots,f_n(i))\},
\end{align*}
so the latter set belongs to $\mathcal U$. Applying the induction hypothesis to $\psi$ gives
\begin{align*}
M^I/\mathcal U\models \psi([g]_{\mathcal U},[f_1]_{\mathcal U},\dots,[f_n]_{\mathcal U}),
\end{align*}
hence the ultrapower satisfies the existential formula. This proves the Łoś equivalence.
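The witness construction can be seen in a concrete case. The following is only an illustration under extra assumptions not made above: $M=(\mathbb N,<)$, $I=\mathbb N$, and $\mathcal U$ non-principal, so that every cofinite set belongs to $\mathcal U$ and no finite set does. Let $f:\mathbb N\to\mathbb N$ be the identity map and let $\psi(y,x)$ be the formula $x<y$. For every $i\in\mathbb N$ we have $M\models\exists y\,(f(i)<y)$, so the existential truth set is $B=\mathbb N\in\mathcal U$. Choosing the witnesses $g(i)=i+1$ gives
\begin{align*}
\{i\in\mathbb N:M\models f(i)<g(i)\}=\mathbb N\in\mathcal U,
\end{align*}
hence $M^I/\mathcal U\models [f]_{\mathcal U}<[g]_{\mathcal U}$. In this example no constant map can serve as the witness: for each fixed $m\in\mathbb N$,
\begin{align*}
\{i\in\mathbb N:M\models f(i)<c_m(i)\}=\{i\in\mathbb N:i<m\}
\end{align*}
is finite and therefore not in $\mathcal U$. This is why the witness $g$ has to be chosen coordinate by coordinate.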
[/guided]
[/step]
[step:Apply the Łoś equivalence to constant representatives]
Let $n\in\mathbb N$, let $\varphi(x_1,\dots,x_n)$ be an $L$-formula, and let $a_1,\dots,a_n\in M$. For each $k\in\{1,\dots,n\}$, let
\begin{align*}
c_{a_k}:I&\to M\\
i&\mapsto a_k
\end{align*}
be the constant map.
Define the coordinate truth set
\begin{align*}
A_\varphi
=
\{i\in I:M\models\varphi(c_{a_1}(i),\dots,c_{a_n}(i))\}.
\end{align*}
Since each $c_{a_k}$ is constant,
\begin{align*}
A_\varphi
=
\begin{cases}
I, & \text{if } M\models\varphi(a_1,\dots,a_n),\\
\varnothing, & \text{if } M\not\models\varphi(a_1,\dots,a_n).
\end{cases}
\end{align*}
Because $\mathcal U$ is a proper ultrafilter on $I$, we have $I\in\mathcal U$ and $\varnothing\notin\mathcal U$, so $A_\varphi\in\mathcal U$ if and only if $M\models\varphi(a_1,\dots,a_n)$. Applying the Łoś equivalence to $c_{a_1},\dots,c_{a_n}$ gives
\begin{align*}
M^I/\mathcal U\models\varphi([c_{a_1}]_{\mathcal U},\dots,[c_{a_n}]_{\mathcal U})
\quad \Longleftrightarrow \quad
A_\varphi\in\mathcal U.
\end{align*}
Using $d(a_k)=[c_{a_k}]_{\mathcal U}$ and the description of $A_\varphi$ above, this becomes
\begin{align*}
M^I/\mathcal U\models\varphi(d(a_1),\dots,d(a_n))
\quad \Longleftrightarrow \quad
M\models\varphi(a_1,\dots,a_n).
\end{align*}
Thus $d$ preserves and reflects every first-order $L$-formula with parameters from $M$.
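A worked instance may help fix ideas. It is purely illustrative and rests on assumptions not made above: take $M=(\mathbb Z,+)$ and let $\varphi(x)$ be the formula $\exists y\,(y+y=x)$. For $a=4$ we have $M\models\varphi(4)$, so
\begin{align*}
A_\varphi=\{i\in I:M\models\varphi(c_4(i))\}=I\in\mathcal U,
\end{align*}
and therefore $M^I/\mathcal U\models\varphi(d(4))$. For $a=3$ we have $M\not\models\varphi(3)$, so
\begin{align*}
A_\varphi=\{i\in I:M\models\varphi(c_3(i))\}=\varnothing\notin\mathcal U,
\end{align*}
and therefore $M^I/\mathcal U\not\models\varphi(d(3))$. In both cases the truth value in the ultrapower agrees with the truth value in $M$, regardless of which ultrafilter $\mathcal U$ on $I$ is used.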
[/step]
[step:Conclude that the diagonal map is elementary]
The first step showed that $d:M\to M^I/\mathcal U$ is injective. The preceding step showed that for every $n\in\mathbb N$, every $L$-formula $\varphi(x_1,\dots,x_n)$, and every $a_1,\dots,a_n\in M$,
\begin{align*}
M\models\varphi(a_1,\dots,a_n)
\quad \Longleftrightarrow \quad
M^I/\mathcal U\models\varphi(d(a_1),\dots,d(a_n)).
\end{align*}
This is precisely the definition of an elementary embedding. Therefore the diagonal map $d$ is an elementary embedding.
[/step]