[guided]The likelihood is a joint density evaluated at the observed data. We denote this joint density by
\begin{align*}
p_{1:n}: (\mathbb{R}^m)^n \to [0,\infty),
\end{align*}
where the reference measure is the $mn$-dimensional Lebesgue measure $\mathcal{L}^{mn}$ on $(\mathbb{R}^m)^n$. For each time $t$, we also introduce the one-step conditional density
\begin{align*}
p_t(\,\cdot \mid y_1,\dots,y_{t-1}): \mathbb{R}^m \to [0,\infty)
\end{align*}
with respect to $\mathcal{L}^m$. When $t=1$, there are no earlier observations, so $p_1(\,\cdot\,)$ is the marginal predictive density determined by the fixed initial quantities $a_1$ and $P_1$.
The conditional density factorization states that a joint density may be built by multiplying successive conditional densities:
\begin{align*}
p_{1:n}(y_1,\dots,y_n)
=
p_1(y_1)
p_2(y_2 \mid y_1)
\cdots
p_n(y_n \mid y_1,\dots,y_{n-1}).
\end{align*}
Equivalently,
\begin{align*}
p_{1:n}(y_1,\dots,y_n)
=
\prod_{t=1}^{n} p_t(y_t \mid y_1,\dots,y_{t-1}).
\end{align*}
Thus the conditional likelihood of the observed data, with $a_1$ and $P_1$ fixed, is exactly this product:
\begin{align*}
L(y_1,\dots,y_n \mid a_1,P_1)
=
\prod_{t=1}^{n} p_t(y_t \mid y_1,\dots,y_{t-1}).
\end{align*}
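In practice the product is usually evaluated on the logarithmic scale, where it becomes a sum. As a brief worked step, using the convention $\log 0 = -\infty$ (a convention adopted here, not stated above):
\begin{align*}
\log L(y_1,\dots,y_n \mid a_1,P_1)
=
\sum_{t=1}^{n} \log p_t(y_t \mid y_1,\dots,y_{t-1}).
\end{align*}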
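The factorization is the point at which the likelihood becomes a prediction-error likelihood: in the linear Gaussian case, the factor at time $t$ depends on $y_1,\dots,y_t$ only through the one-step-ahead prediction error at time $t$ and the covariance of that error.[/guided]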
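As a sketch of what the prediction-error form looks like in the linear Gaussian case, suppose (this notation is assumed here for illustration and is not fixed above) that $v_t = y_t - \mathbb{E}[\,y_t \mid y_1,\dots,y_{t-1}\,]$ denotes the one-step-ahead prediction error and $F_t$ its covariance matrix, taken to be positive definite. Each factor is then an $m$-variate normal density evaluated at $v_t$:
\begin{align*}
p_t(y_t \mid y_1,\dots,y_{t-1})
=
(2\pi)^{-m/2} (\det F_t)^{-1/2}
\exp\!\left( -\tfrac{1}{2}\, v_t^{\top} F_t^{-1} v_t \right).
\end{align*}
Taking logarithms and summing over $t$ gives, under the same assumptions,
\begin{align*}
\log L(y_1,\dots,y_n \mid a_1,P_1)
=
-\frac{nm}{2}\log(2\pi)
-\frac{1}{2}\sum_{t=1}^{n}
\left( \log\det F_t + v_t^{\top} F_t^{-1} v_t \right).
\end{align*}
For $t=1$ the conditioning set is empty, so $v_1$ and $F_1$ are determined by $a_1$ and $P_1$ alone.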