[proofplan]
We present a characteristic-free proof by rewriting words in the tensor algebra. The defining relations of $U(\mathfrak g)$ allow every adjacent inverted pair $e_j e_i$ with $j>i$ to be replaced by the ordered pair $e_i e_j$ plus a lower-degree Lie bracket term. Termination follows from a monomial order, and the only possible overlap ambiguities reduce to the Jacobi identity in $\mathfrak g$. Hence the irreducible words are exactly the ordered monomials, and their images form a basis of the quotient algebra $U(\mathfrak g)$.
[/proofplan]
[step:Present the universal enveloping algebra by generators and relations]
Let $T(\mathfrak g)$ denote the tensor algebra of the $k$-[vector space](/page/Vector%20Space) $\mathfrak g$. Thus
\begin{align*}
T(\mathfrak g)=\bigoplus_{m=0}^{\infty}\mathfrak g^{\otimes m},
\end{align*}
with multiplication given by tensor concatenation and with $\mathfrak g^{\otimes 0}=k$.
Let $I\trianglelefteq T(\mathfrak g)$ be the two-sided ideal generated by all elements
\begin{align*}
x\otimes y-y\otimes x-[x,y],
\end{align*}
where $x,y\in\mathfrak g$, and where $[x,y]\in\mathfrak g=\mathfrak g^{\otimes 1}\subset T(\mathfrak g)$. By definition,
\begin{align*}
U(\mathfrak g)=T(\mathfrak g)/I.
\end{align*}
Since $(e_1,\dots,e_n)$ is a basis of $\mathfrak g$, the tensor algebra $T(\mathfrak g)$ is the free associative algebra over $k$ on the ordered alphabet $\{e_1,\dots,e_n\}$. We write a word as
\begin{align*}
e_{i_1}e_{i_2}\cdots e_{i_m}
\end{align*}
for the tensor $e_{i_1}\otimes e_{i_2}\otimes\cdots\otimes e_{i_m}$.
For each pair $j>i$, write the Lie bracket in the chosen basis as
\begin{align*}
[e_j,e_i]=\sum_{r=1}^{n} c_{ji}^{r} e_r,
\end{align*}
where $c_{ji}^{r}\in k$. The defining relation in $U(\mathfrak g)$ gives
\begin{align*}
e_j e_i=e_i e_j+[e_j,e_i]
\end{align*}
for every $j>i$.
[/step]
[step:Define a terminating reduction system whose normal words are ordered monomials]
Let $\mathcal W$ be the set of all words in the alphabet $\{e_1,\dots,e_n\}$, including the empty word $1$. Define a reduction rule for every pair $j>i$ by
\begin{align*}
e_j e_i \longmapsto e_i e_j+\sum_{r=1}^{n}c_{ji}^{r}e_r.
\end{align*}
This rule is extended to arbitrary words by replacing any adjacent subword $e_j e_i$ with $j>i$ while leaving the remaining letters fixed.
Order $\mathcal W$ as follows. First compare words by length, with longer words larger. Among words of the same length, use lexicographic order with
\begin{align*}
e_n>e_{n-1}>\cdots>e_1.
\end{align*}
This is a well-order compatible with inserting fixed words on the left and right. For $j>i$, the word $e_j e_i$ is larger than $e_i e_j$ in same-length lexicographic order, and it is larger than each one-letter word $e_r$ because it has larger length. Therefore every reduction replaces a word by a $k$-linear combination of strictly smaller words. Hence no infinite reduction sequence exists.
A word is irreducible precisely when it contains no adjacent inverted pair $e_j e_i$ with $j>i$. This is equivalent to its indices being weakly increasing:
\begin{align*}
i_1\leq i_2\leq\cdots\leq i_m.
\end{align*}
Thus the irreducible words are exactly the words
\begin{align*}
e_1^{a_1}e_2^{a_2}\cdots e_n^{a_n}
\end{align*}
with $a_1,\dots,a_n\in\mathbb Z_{\geq 0}$.
[guided]
The purpose of the reduction rules is to turn an arbitrary word into ordered form. Whenever two adjacent basis elements appear in the wrong order, say $e_j e_i$ with $j>i$, the enveloping algebra relation says that this product may be rewritten as
\begin{align*}
e_j e_i=e_i e_j+[e_j,e_i].
\end{align*}
After expanding the bracket in the chosen basis,
\begin{align*}
[e_j,e_i]=\sum_{r=1}^{n} c_{ji}^{r}e_r,
\end{align*}
this becomes the explicit rule
\begin{align*}
e_j e_i \longmapsto e_i e_j+\sum_{r=1}^{n}c_{ji}^{r}e_r.
\end{align*}
We must justify that repeated rewriting cannot continue forever. Define an order on words by declaring longer words larger, and among words of the same length using lexicographic order with
\begin{align*}
e_n>e_{n-1}>\cdots>e_1.
\end{align*}
If $j>i$, then $e_j e_i$ is lexicographically larger than $e_i e_j$, because the first letter $e_j$ is larger than $e_i$. Also $e_j e_i$ is larger than every one-letter word $e_r$, because it has length $2$ and $e_r$ has length $1$. Therefore every application of a reduction rule strictly decreases the leading word with respect to this well-order. A strictly decreasing infinite sequence in a well-order cannot exist, so the reduction system terminates.
Now identify the words on which no reduction applies. A reduction applies exactly when some adjacent pair is inverted, meaning that a subword has the form $e_j e_i$ with $j>i$. Thus no reduction applies exactly when the indices in the word are weakly increasing:
\begin{align*}
i_1\leq i_2\leq\cdots\leq i_m.
\end{align*}
Such a word is uniquely expressible as
\begin{align*}
e_1^{a_1}e_2^{a_2}\cdots e_n^{a_n}
\end{align*}
for nonnegative integers $a_1,\dots,a_n$. These are precisely the candidate PBW monomials.
[/guided]
[/step]
[step:Check the only overlap ambiguities using the Jacobi identity]
Because all left sides of the reduction rules have length $2$, there are no inclusion ambiguities. The only possible overlap ambiguity occurs in a word
\begin{align*}
e_k e_j e_i
\end{align*}
with $k>j>i$, where one may first reduce the left pair $e_k e_j$ or the right pair $e_j e_i$.
Let
\begin{align*}
R_{ab}:=e_a e_b-e_b e_a-[e_a,e_b]\in T(\mathfrak g)
\end{align*}
for $a>b$. The ambiguity for $e_k e_j e_i$ is represented by
\begin{align*}
R_{kj}e_i-e_kR_{ji}.
\end{align*}
Expanding gives
\begin{align*}R_{kj}e_i-e_kR_{ji}=-e_j e_k e_i-[e_k,e_j]e_i+e_k e_i e_j+e_k[e_j,e_i].\end{align*}
Reduce the remaining inverted adjacent pairs in the degree-three terms. First, \begin{align*}-e_j e_k e_i\equiv -e_j e_i e_k-e_j[e_k,e_i].\end{align*} modulo the ideal generated by the relations, and then
\begin{align*}-e_j e_i e_k\equiv -e_i e_j e_k-[e_j,e_i]e_k.\end{align*}
Also, \begin{align*}e_k e_i e_j\equiv e_i e_k e_j+[e_k,e_i]e_j.\end{align*} and
\begin{align*}e_i e_k e_j\equiv e_i e_j e_k+e_i[e_k,e_j].\end{align*}
Substituting these reductions, the degree-three ordered terms cancel, and the ambiguity is congruent to
\begin{align*}
-e_j[e_k,e_i]-[e_j,e_i]e_k-[e_k,e_j]e_i+e_i[e_k,e_j]+[e_k,e_i]e_j+e_k[e_j,e_i].
\end{align*}
Using the relation $uv-vu=[u,v]$ for $u,v\in\mathfrak g$, this expression is congruent to
\begin{align*}
-[e_j,[e_k,e_i]]-[[e_j,e_i],e_k]-[[e_k,e_j],e_i].
\end{align*}
By antisymmetry of the Lie bracket, this equals
\begin{align*}
[e_j,[e_i,e_k]]+[e_k,[e_j,e_i]]+[e_i,[e_k,e_j]].
\end{align*}
This is $0$ by the Jacobi identity in $\mathfrak g$. Hence every overlap ambiguity is resolvable.
The standard reduction criterion for a terminating linear rewriting system now applies: if all overlap and inclusion ambiguities are resolvable, then every element of the quotient by the reduction relations has a unique normal form as a $k$-linear combination of irreducible words. In the present system, the quotient by the reduction relations is exactly $U(\mathfrak g)$, because the reductions are precisely the defining relations $e_j e_i-e_i e_j-[e_j,e_i]$ for $j>i$.
[/step]
[step:Conclude that ordered monomials form a basis]
From termination and confluence, every word in $T(\mathfrak g)$ reduces to a unique $k$-linear combination of irreducible words. Passing to the quotient $U(\mathfrak g)$, every element of $U(\mathfrak g)$ is therefore a $k$-linear combination of the images of the irreducible words
\begin{align*}
e_1^{a_1}e_2^{a_2}\cdots e_n^{a_n}.
\end{align*}
Thus these monomials span $U(\mathfrak g)$.
To prove [linear independence](/page/Linear%20Independence), suppose that a finite linear combination of ordered monomials vanishes in $U(\mathfrak g)$:
\begin{align*}
\sum_{a} \lambda_a e_1^{a_1}e_2^{a_2}\cdots e_n^{a_n}=0,
\end{align*}
where the sum is over finitely many multi-indices $a=(a_1,\dots,a_n)\in\mathbb Z_{\geq 0}^{n}$ and $\lambda_a\in k$. This means that the corresponding element of $T(\mathfrak g)$ lies in the ideal generated by the reduction relations. By uniqueness of normal form, its normal form must be $0$. But the displayed sum is already a linear combination of irreducible words, so it is its own normal form. Since distinct words form a $k$-basis of the tensor algebra, all coefficients $\lambda_a$ are $0$.
Therefore the ordered monomials span and are linearly independent over $k$. They form a $k$-basis of $U(\mathfrak g)$.
[/step]