[proofplan]
We construct $F(S)$ as reduced words in the formal alphabet $S\times\{1,-1\}$. The main technical point is to define a canonical reduction map from all finite words to reduced words and prove that deleting an adjacent inverse pair does not change this canonical reduced word. This normal form makes concatenation followed by reduction into an associative group law. Finally, any function $f:S\to G$ is extended by evaluating each reduced word in $G$, and the cancellation invariance proves that this evaluation is a homomorphism and is uniquely determined.
[/proofplan]
[step:Build the alphabet of formal inverses and define reduced words]
Let $S$ be a set. Define the alphabet
\begin{align*}
A := S\times\{1,-1\}.
\end{align*}
For $a=(s,\epsilon)\in A$, where $s\in S$ and $\epsilon\in\{1,-1\}$, define the formal inverse letter $a^{-1}\in A$ by
\begin{align*}
a^{-1}:=(s,-\epsilon).
\end{align*}
A word in $A$ is a finite sequence $w=(a_1,\dots,a_n)$ with $n\in\mathbb{N}\cup\{0\}$ and $a_i\in A$ for each $1\le i\le n$. The unique word of length $0$ is denoted by $e$ and called the empty word.
A word $w=(a_1,\dots,a_n)$ is reduced if either $n=0$, or
\begin{align*}
a_{i+1}\neq a_i^{-1}
\end{align*}
for every $1\le i<n$. Let $F(S)$ denote the set of all reduced words in $A$. Define the inclusion map
\begin{align*}
\iota:S&\to F(S)\\
s&\mapsto ((s,1)).
\end{align*}
[/step]
[step:Define a canonical reduction map for arbitrary words]
Let $W(A)$ be the set of all finite words in $A$. Define a map
\begin{align*}
R:W(A)&\to F(S)
\end{align*}
by the following recursive stack reduction procedure. Set $R(e)=e$. If $w=(a_1,\dots,a_n)$ with $n\ge 1$, process the letters from left to right, maintaining a reduced word $u$. Start with $u=e$. When the next letter is $a_i$, if $u=(b_1,\dots,b_m)$ with $m\ge 1$ and $b_m=a_i^{-1}$, replace $u$ by $(b_1,\dots,b_{m-1})$; otherwise replace $u$ by $(b_1,\dots,b_m,a_i)$. The final reduced word is $R(w)$.
This procedure is well-defined by finite recursion on the length of the word. At each stage the maintained word is reduced: appending $a_i$ is performed only when the previous last letter is not $a_i^{-1}$, and deleting the last letter cannot create a new adjacent inverse pair. Hence $R(w)\in F(S)$ for every $w\in W(A)$.
[claim:Deleting one adjacent inverse pair does not change the canonical reduction]
Let $x,y\in W(A)$ be words and let $a\in A$. If $xay$ denotes the concatenation of $x$, the one-letter word $(a)$, and $y$, then
\begin{align*}
R(xaa^{-1}y)=R(xy).
\end{align*}
[/claim]
[proof]
It is enough to compare the stack reduction after processing the prefix $x$. Let $u:=R(x)$ be the reduced word on the stack after processing $x$.
Now process the two consecutive letters $a$ and $a^{-1}$. If $u=(b_1,\dots,b_m)$ with $m\ge 1$ and $b_m=a^{-1}$, then processing $a$ deletes $b_m$, and processing $a^{-1}$ appends $a^{-1}$. The stack returns to $u$. If either $u=e$ or the last letter of $u$ is not $a^{-1}$, then processing $a$ appends $a$, and processing $a^{-1}$ immediately deletes that appended $a$. Again the stack returns to $u$.
Thus after the prefix $xaa^{-1}$ has been processed, the stack is exactly the same as after the prefix $x$ has been processed. Processing the remaining word $y$ therefore gives the same final stack in both cases. Hence $R(xaa^{-1}y)=R(xy)$.
[/proof]
[/step]
[step:Define multiplication by concatenation followed by reduction]
For reduced words $u,v\in F(S)$, define their product $u\cdot v\in F(S)$ by
\begin{align*}
u\cdot v:=R(uv),
\end{align*}
where $uv$ denotes concatenation of the two finite sequences.
We record the reduction-in-context identity needed for associativity. If $p,q\in W(A)$ and $p'$ is obtained from $p$ by deleting one adjacent inverse pair, then the preceding claim gives
\begin{align*}
R(pq)=R(p'q).
\end{align*}
Iterating this statement over finitely many deletions gives the same conclusion when $p'$ is obtained from $p$ by finitely many adjacent inverse-pair deletions.
We next justify that $R(w)$ is obtained from $w$ by finitely many adjacent inverse-pair deletions. We prove this by induction on the length $n$ of $w=(a_1,\dots,a_n)\in W(A)$. For $n=0$, the assertion is immediate because $R(e)=e$. Suppose it holds for the prefix $w'=(a_1,\dots,a_{n-1})$, so $R(w')$ is obtained from $w'$ by finitely many adjacent inverse-pair deletions. During the last stack step, either $R(w')a_n$ is already reduced, in which case $R(w)=R(w')a_n$, or the last letter of $R(w')$ is $a_n^{-1}$, in which case $R(w)$ is obtained from $R(w')a_n$ by deleting that final adjacent inverse pair. Hence $R(w)$ is obtained from $w=w'a_n$ by finitely many adjacent inverse-pair deletions.
Consequently, for all words $w,z\in W(A)$,
\begin{align*}
R(wz)=R(R(w)z),
\end{align*}
because one may replace $w$ by $R(w)$ inside the larger word $wz$ through finitely many adjacent inverse-pair deletions. More generally, if $z'$ is obtained from $z$ by finitely many adjacent inverse-pair deletions, then
\begin{align*}
R(wz)=R(wz'),
\end{align*}
because the same deletion-in-context claim applies with the fixed prefix $w$ included in the surrounding word.
Now let $u,v,t\in F(S)$. Using the prefix normalization identity for the first product and the suffix deletion-in-context identity for the second product, we compute
\begin{align*}
(u\cdot v)\cdot t
&=R(R(uv)t)\\
&=R((uv)t),\\
u\cdot(v\cdot t)
&=R(uR(vt))\\
&=R(u(vt)).
\end{align*}
Concatenation of finite sequences is associative, so $(uv)t=u(vt)$. Hence
\begin{align*}
(u\cdot v)\cdot t=u\cdot(v\cdot t).
\end{align*}
Thus the operation $\cdot$ is associative.
[/step]
[step:Verify the identity element and inverses]
The empty word $e\in F(S)$ is an identity element. For every reduced word $u\in F(S)$,
\begin{align*}
e\cdot u=R(eu)=R(u)=u,
\qquad
u\cdot e=R(ue)=R(u)=u,
\end{align*}
because $u$ is already reduced.
For a reduced word $u=(a_1,\dots,a_n)\in F(S)$, define
\begin{align*}
u^{-1}:=(a_n^{-1},a_{n-1}^{-1},\dots,a_1^{-1}).
\end{align*}
This word is reduced: if $a_{i+1}\neq a_i^{-1}$ in $u$, then $a_i^{-1}\neq (a_{i+1}^{-1})^{-1}$ in the reversed inverse word. The concatenation $uu^{-1}$ reduces by successive cancellation of
\begin{align*}
a_n a_n^{-1},\ a_{n-1}a_{n-1}^{-1},\ \dots,\ a_1a_1^{-1},
\end{align*}
so $R(uu^{-1})=e$. Similarly, $R(u^{-1}u)=e$. Therefore
\begin{align*}
u\cdot u^{-1}=e,
\qquad
u^{-1}\cdot u=e.
\end{align*}
Thus $F(S)$ is a group.
[/step]
[step:Extend an arbitrary map from $S$ to a group]
Let $G$ be a group with identity element $1_G$, and let $f:S\to G$ be a function. Define a letter-evaluation map
\begin{align*}
\tilde{f}:A&\to G\\
(s,1)&\mapsto f(s)\\
(s,-1)&\mapsto f(s)^{-1}.
\end{align*}
For a reduced word $u=(a_1,\dots,a_n)\in F(S)$, define
\begin{align*}
\bar{f}:F(S)&\to G\\
(a_1,\dots,a_n)&\mapsto \tilde{f}(a_1)\tilde{f}(a_2)\cdots \tilde{f}(a_n),
\end{align*}
with the convention that $\bar{f}(e)=1_G$.
If a word contains an adjacent pair $aa^{-1}$, evaluating that pair in $G$ gives
\begin{align*}
\tilde{f}(a)\tilde{f}(a^{-1})=1_G.
\end{align*}
Therefore deleting adjacent inverse pairs does not change the value of the product in $G$. Hence for every word $w\in W(A)$,
\begin{align*}
\bar{f}(R(w))=\tilde{f}(a_1)\cdots \tilde{f}(a_n),
\end{align*}
where $w=(a_1,\dots,a_n)$.
Now let $u,v\in F(S)$. Since $u\cdot v=R(uv)$ and reduction does not change evaluation in $G$,
\begin{align*}
\bar{f}(u\cdot v)
&=\bar{f}(R(uv))\\
&=\bar{f}(u)\bar{f}(v).
\end{align*}
Thus $\bar{f}:F(S)\to G$ is a group homomorphism. For each $s\in S$,
\begin{align*}
\bar{f}(\iota(s))=\bar{f}(((s,1)))=f(s),
\end{align*}
so $\bar{f}\circ\iota=f$.
[/step]
[step:Prove uniqueness of the extension]
Let $\varphi:F(S)\to G$ be a group homomorphism satisfying $\varphi\circ\iota=f$. We prove that $\varphi=\bar{f}$.
Let $u=(a_1,\dots,a_n)\in F(S)$ be a reduced word. For each $1\le i\le n$, write $a_i=(s_i,\epsilon_i)$ with $s_i\in S$ and $\epsilon_i\in\{1,-1\}$. Since $\iota(s_i)=((s_i,1))$, we have
\begin{align*}
((s_i,-1))=\iota(s_i)^{-1}
\end{align*}
inside the group $F(S)$. Therefore
\begin{align*}
u=\iota(s_1)^{\epsilon_1}\iota(s_2)^{\epsilon_2}\cdots \iota(s_n)^{\epsilon_n}
\end{align*}
in $F(S)$, where exponent $1$ means the element itself and exponent $-1$ means the inverse element.
Using that $\varphi$ is a group homomorphism and that $\varphi(\iota(s_i))=f(s_i)$, we get
\begin{align*}
\varphi(u)
&=\varphi(\iota(s_1))^{\epsilon_1}\varphi(\iota(s_2))^{\epsilon_2}\cdots \varphi(\iota(s_n))^{\epsilon_n}\\
&=f(s_1)^{\epsilon_1}f(s_2)^{\epsilon_2}\cdots f(s_n)^{\epsilon_n}\\
&=\bar{f}(u).
\end{align*}
Since $u\in F(S)$ was arbitrary, $\varphi=\bar{f}$. Hence the homomorphism extending $f$ is unique, and $(F(S),\iota)$ satisfies the [universal property of the free group](/theorems/1900) on $S$.
[/step]