[proofplan]
Choose an ordered $k$-basis of $\mathfrak g$ and apply the Poincaré-Birkhoff-Witt theorem to the corresponding PBW monomials in $U(\mathfrak g)$. The degree-one PBW monomials are precisely the elements $i(x_j)$ attached to the chosen basis vectors. Since these degree-one monomials are part of a $k$-basis of $U(\mathfrak g)$, they are linearly independent, and this forces every element in the kernel of $i$ to have all basis coordinates equal to zero.
[/proofplan]
[step:Choose an ordered basis of the Lie algebra]
Since $\mathfrak g$ is a [vector space](/page/Vector%20Space) over the field $k$, choose a basis $(x_j)_{j\in J}$ of $\mathfrak g$. Equip the index set $J$ with a total order. Every element $x\in\mathfrak g$ is a finite linear combination of basis vectors: there are distinct indices $j_1,\dots,j_N\in J$ and scalars $a_{j_1},\dots,a_{j_N}\in k$, uniquely determined once the scalars are required to be nonzero, such that
\begin{align*}
x=\sum_{r=1}^{N} a_{j_r}x_{j_r}.
\end{align*}
[/step]
[step:Use PBW to make the degree-one canonical images linearly independent]
By the [citetheorem:8827] applied to the ordered basis $(x_j)_{j\in J}$, the ordered monomials
\begin{align*}
i(x_{j_1})i(x_{j_2})\cdots i(x_{j_m}), \qquad j_1\leq j_2\leq \cdots \leq j_m,
\end{align*}
together with the empty product $1$, form a $k$-basis of $U(\mathfrak g)$. Taking $m=1$, each element $i(x_j)$ is one of these basis elements, and distinct indices $j$ give distinct basis elements. Since any subfamily of a basis is linearly independent, the family $\{i(x_j):j\in J\}$ is $k$-linearly independent in $U(\mathfrak g)$.
[guided]
The point of applying PBW is not to describe every element of $U(\mathfrak g)$, but to isolate a specific part of its basis. The [citetheorem:8827] applies because $(x_j)_{j\in J}$ is an ordered basis of the [Lie algebra](/page/Lie%20Algebra) $\mathfrak g$ over the field $k$. Its conclusion says that all ordered products
\begin{align*}
i(x_{j_1})i(x_{j_2})\cdots i(x_{j_m}), \qquad j_1\leq j_2\leq \cdots \leq j_m,
\end{align*}
including the empty product $1$, form a $k$-basis of $U(\mathfrak g)$.
Now look only at the case $m=1$. Then the ordered monomial has the form $i(x_j)$ for a single index $j\in J$. Hence every $i(x_j)$ occurs as one element of the PBW basis, with distinct indices giving distinct basis elements. A subfamily of a basis is linearly independent, so the family $\{i(x_j):j\in J\}$ is $k$-linearly independent in $U(\mathfrak g)$. This is the exact place where PBW supplies the needed information: it prevents a nontrivial linear combination of the degree-one generators from becoming zero in the quotient algebra $U(\mathfrak g)$.
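For a concrete picture, here is a small worked example. It rests on an assumption that is not part of the statement being proved: take $\mathfrak g$ to be the two-dimensional Lie algebra with ordered basis $x_1<x_2$ and bracket $[x_1,x_2]=x_2$. In $U(\mathfrak g)$ the defining relation reads
\begin{align*}
i(x_1)i(x_2)-i(x_2)i(x_1)=i([x_1,x_2])=i(x_2),
\end{align*}
so a product in the wrong order can be rewritten as $i(x_2)i(x_1)=i(x_1)i(x_2)-i(x_2)$. In this example the PBW basis consists of the ordered monomials
\begin{align*}
i(x_1)^{a}\,i(x_2)^{b}, \qquad a,b\geq 0,
\end{align*}
where $a=b=0$ gives the empty product $1$. The monomials of degree $a+b=1$ are exactly $i(x_1)$ and $i(x_2)$. Because they are two distinct members of a basis, a relation
\begin{align*}
c_1\,i(x_1)+c_2\,i(x_2)=0, \qquad c_1,c_2\in k,
\end{align*}
can only hold when $c_1=c_2=0$. The general argument uses the same reasoning with an arbitrary index set $J$ in place of $\{1,2\}$.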
[/guided]
[/step]
[step:Conclude that the kernel of the canonical map is zero]
Let $x\in\ker i$. Write $x$ in the chosen basis as
\begin{align*}
x=\sum_{r=1}^{N} a_{j_r}x_{j_r},
\end{align*}
where $a_{j_r}\in k$ and the indices $j_1,\dots,j_N$ are distinct. Since $i:\mathfrak g\to U(\mathfrak g)$ is $k$-linear,
\begin{align*}
0=i(x)=\sum_{r=1}^{N} a_{j_r}i(x_{j_r}).
\end{align*}
The indices $j_1,\dots,j_N$ are distinct and, by the previous step, the family $\{i(x_j):j\in J\}$ is $k$-linearly independent, so $a_{j_r}=0$ for every $1\leq r\leq N$. Hence $x=0$. Therefore $\ker i=\{0\}$, and $i$ is injective.
[/step]