[proofplan]
We realize the extended complex plane as the projective line of one-dimensional subspaces of $\mathbb{C}^2$. Each invertible complex matrix acts on this projective line, and under the standard affine chart this action is exactly the fractional-linear formula
\begin{align*}
z &\mapsto \frac{az+b}{cz+d}
\end{align*}
with the correct behavior at poles and at $\infty$. Matrix multiplication then gives composition, the nonzero scalar matrices are precisely the matrices acting as the identity, and every [Möbius transformation](/page/M%C3%B6bius%20Transformation) arises from an invertible matrix by definition.
[/proofplan]
[step:Realize Möbius transformations as projective matrix actions]
Define an [equivalence relation](/page/Equivalence%20Relation) on $\mathbb{C}^2\setminus\{0\}$ by declaring $v\sim w$ if there exists $\lambda\in\mathbb{C}^{\times}$ such that $v=\lambda w$. Let $\mathbb{P}^1(\mathbb{C})$ denote the quotient set, and write $[v]$ for the equivalence class of $v\in\mathbb{C}^2\setminus\{0\}$.
Define the bijection
\begin{align*}
\iota:\widehat{\mathbb{C}}&\to \mathbb{P}^1(\mathbb{C})
\end{align*}
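by setting $\iota(z)=[(z,1)]$ for $z\in\mathbb{C}$ and $\iota(\infty)=[(1,0)]$. This map is a bijection: every class $[(x,y)]$ with $y\ne 0$ contains exactly one vector of the form $(z,1)$, namely $(x/y,1)$, and every class with $y=0$ equals $[(1,0)]$.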
For each $A\in GL(2,\mathbb{C})$, define
\begin{align*}
L_A:\mathbb{P}^1(\mathbb{C})&\to \mathbb{P}^1(\mathbb{C})
\end{align*}
by
\begin{align*}
L_A([v])=[Av].
\end{align*}
This is well-defined because if $v=\lambda w$ with $\lambda\in\mathbb{C}^{\times}$, then $Av=\lambda Aw$, and $Aw\ne 0$ since $A$ is invertible.
Now define
\begin{align*}
T_A:\widehat{\mathbb{C}}&\to\widehat{\mathbb{C}}
\end{align*}
by
\begin{align*}
T_A=\iota^{-1}\circ L_A\circ \iota.
\end{align*}
If $A=\begin{pmatrix}a&b\\c&d\end{pmatrix}$, then for $z\in\mathbb{C}$,
\begin{align*}
A(z,1)=(az+b,cz+d).
\end{align*}
Hence, when $cz+d\ne 0$,
\begin{align*}
T_A(z)=\frac{az+b}{cz+d}.
\end{align*}
If $cz+d=0$, then $A(z,1)=(az+b,0)$; the first coordinate is nonzero because $A(z,1)\ne 0$, so $T_A(z)=\infty$. Finally,
\begin{align*}
A(1,0)=(a,c).
\end{align*}
Thus
\begin{align*}
T_A(\infty)=\frac{a}{c}
\end{align*}
when $c\ne 0$, and $T_A(\infty)=\infty$ when $c=0$.
[guided]
The point of introducing $\mathbb{P}^1(\mathbb{C})$ is that the exceptional values of the fractional formula become ordinary linear algebra. Define $v\sim w$ on $\mathbb{C}^2\setminus\{0\}$ when $v=\lambda w$ for some $\lambda\in\mathbb{C}^{\times}$, and let $\mathbb{P}^1(\mathbb{C})$ be the set of equivalence classes. The map
\begin{align*}
\iota:\widehat{\mathbb{C}}&\to \mathbb{P}^1(\mathbb{C})
\end{align*}
is defined by $\iota(z)=[(z,1)]$ for finite $z$ and $\iota(\infty)=[(1,0)]$.
For $A\in GL(2,\mathbb{C})$, define
\begin{align*}
L_A:\mathbb{P}^1(\mathbb{C})&\to \mathbb{P}^1(\mathbb{C})
\end{align*}
by $L_A([v])=[Av]$. We must check that this definition does not depend on the representative $v$. If $v=\lambda w$ for $\lambda\in\mathbb{C}^{\times}$, then $Av=\lambda Aw$, so $Av$ and $Aw$ determine the same projective point. Also $Aw\ne 0$ because $A$ is invertible and $w\ne 0$. Therefore $L_A$ is a well-defined map.
Define
\begin{align*}
T_A:\widehat{\mathbb{C}}&\to\widehat{\mathbb{C}}
\end{align*}
by $T_A=\iota^{-1}\circ L_A\circ \iota$. If $A=\begin{pmatrix}a&b\\c&d\end{pmatrix}$, then for $z\in\mathbb{C}$ we have
\begin{align*}
A(z,1)=(az+b,cz+d).
\end{align*}
When $cz+d\ne 0$, the projective class of $(az+b,cz+d)$ is the same as the class of $\left((az+b)/(cz+d),1\right)$, so
\begin{align*}
T_A(z)=\frac{az+b}{cz+d}.
\end{align*}
When $cz+d=0$, the vector $A(z,1)$ is nonzero and has second coordinate $0$, so it represents $[(1,0)]$, which corresponds to $\infty$. At the point $\infty$, we compute from $\iota(\infty)=[(1,0)]$ that
\begin{align*}
A(1,0)=(a,c).
\end{align*}
Thus $T_A(\infty)=a/c$ when $c\ne 0$, and $T_A(\infty)=\infty$ when $c=0$. This verifies exactly the usual extended-value convention for a Möbius transformation.
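As a concrete check of these conventions, consider the illustrative matrix (chosen here only as an example, not taken from the argument above)
\begin{align*}
A=\begin{pmatrix}2&1\\1&-1\end{pmatrix},\qquad \det A=-3\ne 0.
\end{align*}
For finite $z\ne 1$ the formula gives $T_A(z)=(2z+1)/(z-1)$; for instance $A(0,1)=(1,-1)\sim(-1,1)$, so $T_A(0)=-1$. At the pole $z=1$ and at $\infty$ the same matrix computation yields
\begin{align*}
A(1,1)&=(3,0)\sim(1,0), & T_A(1)&=\infty,\\
A(1,0)&=(2,1), & T_A(\infty)&=2.
\end{align*}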
[/guided]
[/step]
[step:Show composition corresponds to matrix multiplication]
Let $A,B\in GL(2,\mathbb{C})$. Since $AB\in GL(2,\mathbb{C})$, the map $T_{AB}$ is defined. For every $p\in\widehat{\mathbb{C}}$,
\begin{align*}
T_A(T_B(p))=(\iota^{-1}\circ L_A\circ \iota\circ \iota^{-1}\circ L_B\circ \iota)(p).
\end{align*}
Canceling $\iota\circ\iota^{-1}$ gives
\begin{align*}
T_A(T_B(p))=(\iota^{-1}\circ L_A\circ L_B\circ \iota)(p).
\end{align*}
For every $[v]\in\mathbb{P}^1(\mathbb{C})$,
\begin{align*}
L_A(L_B([v]))=L_A([Bv])=[ABv]=L_{AB}([v]).
\end{align*}
Hence
\begin{align*}
T_A\circ T_B=T_{AB}.
\end{align*}
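Therefore the assignment $A\mapsto T_A$ is multiplicative. Since each $T_A$ is a bijection of $\widehat{\mathbb{C}}$ (as shown in the next step), it is a [group homomorphism](/page/Group%20Homomorphism) from $GL(2,\mathbb{C})$ to the group of bijections of $\widehat{\mathbb{C}}$.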
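For instance, take the illustrative matrices (an example chosen here, not part of the proof)
\begin{align*}
A=\begin{pmatrix}1&1\\0&1\end{pmatrix},\qquad B=\begin{pmatrix}0&1\\1&0\end{pmatrix},
\end{align*}
so that $T_A(z)=z+1$ for finite $z$ and $T_B(z)=1/z$ for finite $z\ne 0$. Then
\begin{align*}
AB&=\begin{pmatrix}1&1\\1&0\end{pmatrix}, & T_{AB}(z)&=\frac{z+1}{z}=T_A(T_B(z)),\\
BA&=\begin{pmatrix}0&1\\1&1\end{pmatrix}, & T_{BA}(z)&=\frac{1}{z+1}=T_B(T_A(z)),
\end{align*}
wherever the denominators are nonzero. The order of the factors matters for the transformations exactly as it does for the matrices.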
[/step]
[step:Identify the identity and inverses]
Let $I$ denote the identity matrix in $GL(2,\mathbb{C})$. Since $L_I$ is the identity map on $\mathbb{P}^1(\mathbb{C})$, $T_I$ is the identity map on $\widehat{\mathbb{C}}$.
For $A\in GL(2,\mathbb{C})$, the inverse matrix $A^{-1}$ belongs to $GL(2,\mathbb{C})$. By the composition formula,
\begin{align*}
T_A\circ T_{A^{-1}}=T_I.
\end{align*}
Also,
\begin{align*}
T_{A^{-1}}\circ T_A=T_I.
\end{align*}
Thus $T_{A^{-1}}=T_A^{-1}$. Consequently every map $T_A$ is bijective, and the set of all such maps is closed under composition and inverses.
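In coordinates this can be made explicit, as a sketch using the standard adjugate formula for the inverse of a $2\times 2$ matrix, where $A$ has rows $(a,b)$ and $(c,d)$. Since $[\lambda v]=[v]$ for $\lambda\in\mathbb{C}^{\times}$, a matrix and any nonzero scalar multiple of it define the same map, so the factor $1/(ad-bc)$ does not affect the transformation:
\begin{align*}
A^{-1}=\frac{1}{ad-bc}\begin{pmatrix}d&-b\\-c&a\end{pmatrix},\qquad T_{A^{-1}}(w)=\frac{dw-b}{-cw+a}\quad\text{when } -cw+a\ne 0.
\end{align*}
This agrees with solving $w=(az+b)/(cz+d)$ for $z$.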
[/step]
[step:Compute the kernel of the projective action]
Let $\Phi:GL(2,\mathbb{C})\to \operatorname{Mob}(\widehat{\mathbb{C}})$ denote the homomorphism $\Phi(A)=T_A$. We prove that $\ker\Phi$ is exactly the subgroup of nonzero scalar matrices.
If $A=\lambda I$ with $\lambda\in\mathbb{C}^{\times}$, then for every $v\in\mathbb{C}^2\setminus\{0\}$,
\begin{align*}
Av=\lambda v.
\end{align*}
Hence $[Av]=[v]$, so $T_A$ is the identity map.
Conversely, suppose $T_A$ is the identity map on $\widehat{\mathbb{C}}$. Then $L_A=\iota\circ T_A\circ\iota^{-1}$ is the identity map on $\mathbb{P}^1(\mathbb{C})$. Let $e_1=(1,0)$ and $e_2=(0,1)$. In particular, $L_A$ fixes $[e_1]$, $[e_2]$, and $[e_1+e_2]$. Therefore there exist $\alpha,\delta,\mu\in\mathbb{C}^{\times}$ such that
\begin{align*}
Ae_1&=\alpha e_1, & Ae_2&=\delta e_2, & A(e_1+e_2)&=\mu(e_1+e_2).
\end{align*}
By linearity,
\begin{align*}
A(e_1+e_2)=Ae_1+Ae_2=\alpha e_1+\delta e_2.
\end{align*}
Comparing this with $\mu e_1+\mu e_2$ gives $\alpha=\mu$ and $\delta=\mu$. Hence $\alpha=\delta$, and $A=\alpha I$. Since $A$ is invertible, $\alpha\ne 0$. Thus $\ker\Phi=\mathbb{C}^{\times}I$.
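The third point $[e_1+e_2]$ is essential to this argument. As an illustration (an example chosen here, not part of the proof), the diagonal matrix
\begin{align*}
A=\begin{pmatrix}1&0\\0&2\end{pmatrix}
\end{align*}
satisfies $Ae_1=e_1$ and $Ae_2=2e_2$, so $L_A$ fixes $[e_1]$ and $[e_2]$, but $A(e_1+e_2)=(1,2)$ is not a scalar multiple of $(1,1)$. Correspondingly, $T_A(z)=z/2$ fixes $\infty$ and $0$ but sends $1$ to $1/2$, so $A\notin\ker\Phi$.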
[/step]
[step:Pass to the quotient and conclude the isomorphism]
By definition, every element of $\operatorname{Mob}(\widehat{\mathbb{C}})$ is $T_A$ for some $A\in GL(2,\mathbb{C})$, so $\Phi$ is surjective. The preceding step gives
\begin{align*}
\ker\Phi=\mathbb{C}^{\times}I.
\end{align*}
Since $\mathbb{C}^{\times}I$ is the kernel of $\Phi$, it is a normal subgroup of $GL(2,\mathbb{C})$, and by the first isomorphism theorem $\Phi$ induces a bijective homomorphism
\begin{align*}
\overline{\Phi}:GL(2,\mathbb{C})/\mathbb{C}^{\times}I\to \operatorname{Mob}(\widehat{\mathbb{C}}).
\end{align*}
The group operation on the quotient is induced by matrix multiplication, and the operation on $\operatorname{Mob}(\widehat{\mathbb{C}})$ is composition. Since $\overline{\Phi}$ is a bijective homomorphism, it is a group isomorphism. Writing $PGL(2,\mathbb{C})$ for the quotient $GL(2,\mathbb{C})/\mathbb{C}^{\times}I$, we conclude that
\begin{align*}
\operatorname{Mob}(\widehat{\mathbb{C}})\cong PGL(2,\mathbb{C}).
\end{align*}
In particular, $\operatorname{Mob}(\widehat{\mathbb{C}})$ is a group under composition.
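To illustrate the quotient (with matrices chosen here only as an example), the matrices
\begin{align*}
A=\begin{pmatrix}1&2\\3&4\end{pmatrix},\qquad 2A=\begin{pmatrix}2&4\\6&8\end{pmatrix}
\end{align*}
are distinct elements of $GL(2,\mathbb{C})$ lying in the same coset of $\mathbb{C}^{\times}I$, and they define the same transformation:
\begin{align*}
T_{2A}(z)=\frac{2z+4}{6z+8}=\frac{z+2}{3z+4}=T_A(z).
\end{align*}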
[/step]