[guided]The first point is to translate the statistical information into graph-theoretic information. Define
\begin{align*}
\mathcal I(P):=\{(A,B,C): A,B,C\subset V \text{ are pairwise disjoint and } X_A \perp\!\!\!\perp X_B \mid X_C \text{ under } P\}.
\end{align*}
This is exactly the data the theorem allows us to use: the complete list of observational conditional independences.
Now define the corresponding graphical independence model of the DAG $G$:
\begin{align*}
\mathcal I(G):=\{(A,B,C): A,B,C\subset V \text{ are pairwise disjoint and } A \text{ is d-separated from } B \text{ by } C \text{ in } G\}.
\end{align*}
The Markov assumption gives one implication:
\begin{align*}
(A,B,C)\in \mathcal I(G) \implies (A,B,C)\in \mathcal I(P).
\end{align*}
Faithfulness gives the converse:
\begin{align*}
(A,B,C)\in \mathcal I(P) \implies (A,B,C)\in \mathcal I(G).
\end{align*}
Therefore, for every pairwise disjoint triple $A,B,C\subset V$,
\begin{align*}
(A,B,C)\in \mathcal I(P) \iff (A,B,C)\in \mathcal I(G).
\end{align*}
Thus the complete observational conditional independence model is not merely compatible with $G$; it is exactly the d-separation model of $G$:
\begin{align*}
\mathcal I(P)=\mathcal I(G).
\end{align*}
This is where Markovness and faithfulness are used. Without faithfulness, some graphical dependences could disappear accidentally in $P$, and without Markovness, some graphical separations might fail to appear as statistical conditional independences.[/guided]