[proofplan]
We prove termination and correctness by giving the Gaussian elimination algorithm explicitly on the whole augmented matrix, including the augmented column. At each active column, if a nonzero entry occurs at or below the active row, we move it into the pivot position and use row replacement operations to clear all entries below it. The active row then increases and the active column always moves to the right, so the process stops after finitely many columns. Finally, we verify directly that the three elementary row operations preserve the set of solutions of the associated linear system.
[/proofplan]
[step:Translate augmented matrices into systems of equations]
Let $M\in k^{m\times(n+1)}$ be an augmented matrix. For $1\leq i\leq m$ and $1\leq j\leq n+1$, write $M_{ij}\in k$ for the entry in row $i$ and column $j$.
For each row index $i$, define the [linear map](/page/Linear%20Map)
\begin{align*}L_i^M:k^n\to k,\qquad x=(x_1,\dots,x_n)\mapsto \sum_{j=1}^n M_{ij}x_j.\end{align*}
The solution set of $M$ is the subset $\mathcal{S}(M)\subseteq k^n$ defined by
\begin{align*}\mathcal{S}(M)=\{x\in k^n: L_i^M(x)=M_{i,n+1}\text{ for every }1\leq i\leq m\}.\end{align*}
For the original augmented matrix $M_0=[A\mid b]$, this is exactly the solution set of $Ax=b$, because the equation in row $i$ is
\begin{align*}\sum_{j=1}^n A_{ij}x_j=b_i.\end{align*}
[/step]
[step:Verify that each elementary row operation preserves the solution set]
We check the three elementary row operations.
First, interchange two rows with indices $p$ and $q$. The new system contains exactly the same equations as the old system, only in a different order. Therefore a vector $x\in k^n$ satisfies all old equations if and only if it satisfies all new equations.
Second, multiply row $p$ by a scalar $\lambda\in k$ with $\lambda\neq 0$. Since $k$ is a field, the inverse $\lambda^{-1}\in k$ exists. The old row equation is
\begin{align*}L_p^M(x)=M_{p,n+1}.\end{align*}
The new row equation is
\begin{align*}\lambda L_p^M(x)=\lambda M_{p,n+1}.\end{align*}
Multiplying the new equation by $\lambda^{-1}$ recovers the old equation, and multiplying the old equation by $\lambda$ gives the new equation. All other rows are unchanged, so the solution set is unchanged.
Third, replace row $q$ by row $q+\lambda$ row $p$, where $p\neq q$ and $\lambda\in k$. The new row $q$ equation is
\begin{align*}L_q^M(x)+\lambda L_p^M(x)=M_{q,n+1}+\lambda M_{p,n+1}.\end{align*}
If $x$ satisfies the old system, then it satisfies both row $p$ and row $q$, hence it satisfies the displayed new row $q$ equation. Conversely, if $x$ satisfies the new system, then row $p$ is unchanged, so
\begin{align*}L_p^M(x)=M_{p,n+1}.\end{align*}
Subtracting $\lambda$ times this equation from the new row $q$ equation gives
\begin{align*}L_q^M(x)=M_{q,n+1}.\end{align*}
Thus the old row $q$ equation is recovered, and every other old row equation is unchanged. Hence the solution set is unchanged.
[guided]
The reason we check these three cases separately is that an elementary row operation is defined to be one of exactly these moves: swapping two rows, multiplying one row by a nonzero scalar, or adding a scalar multiple of one row to another row.
For a row swap, no algebra changes. The system has the same equations, only listed in a different order. Since membership in $\mathcal{S}(M)$ means satisfying every row equation, the order of those equations has no effect.
For row scaling, suppose row $p$ is multiplied by a scalar $\lambda\in k$ with $\lambda\neq 0$. The old row equation is
\begin{align*}L_p^M(x)=M_{p,n+1}.\end{align*}
The new one is
\begin{align*}\lambda L_p^M(x)=\lambda M_{p,n+1}.\end{align*}
Here the field hypothesis is used: because $\lambda\neq 0$, the element $\lambda^{-1}$ exists in $k$. Therefore the new equation implies the old one after multiplying by $\lambda^{-1}$, and the old equation implies the new one after multiplying by $\lambda$. So scaling by a nonzero scalar is reversible.
For row replacement, suppose row $q$ is replaced by row $q+\lambda$ row $p$, with $p\neq q$. The equation in row $p$ is not changed. The new row $q$ equation is
\begin{align*}L_q^M(x)+\lambda L_p^M(x)=M_{q,n+1}+\lambda M_{p,n+1}.\end{align*}
If $x$ satisfies the old system, then both $L_q^M(x)=M_{q,n+1}$ and $L_p^M(x)=M_{p,n+1}$ hold, so the new row $q$ equation follows by adding $\lambda$ times the row $p$ equation to the row $q$ equation. Conversely, if $x$ satisfies the new system, then it still satisfies the unchanged row $p$ equation. Subtracting $\lambda L_p^M(x)=\lambda M_{p,n+1}$ from the new row $q$ equation gives
\begin{align*}L_q^M(x)=M_{q,n+1}.\end{align*}
Thus the row replacement is reversible and preserves the solution set.
[/guided]
[/step]
[step:Construct pivots column by column]
Start with $M_0=[A\mid b]$ and set the active row index to $r=1$. We process the columns $c=1,\dots,n+1$ of the augmented matrix from left to right.
At column $c$, if $r>m$, there are no remaining rows and the process stops. If $r\leq m$ and $M_{ic}=0$ for every $i$ with $r\leq i\leq m$, then no pivot is available in column $c$, and we move to column $c+1$ without changing the matrix.
If instead there is an index $p$ with $r\leq p\leq m$ and $M_{pc}\neq 0$, interchange rows $p$ and $r$. Then the entry in position $(r,c)$ is nonzero. Since $k$ is a field, the inverse $M_{rc}^{-1}\in k$ exists.
For each row index $i$ with $r<i\leq m$, define
\begin{align*}\lambda_i=-M_{ic}M_{rc}^{-1}\in k.\end{align*}
Replace row $i$ by row $i+\lambda_i$ row $r$. The new entry in position $(i,c)$ is
\begin{align*}M_{ic}+\lambda_iM_{rc}=M_{ic}-M_{ic}M_{rc}^{-1}M_{rc}=0.\end{align*}
Thus every entry below the pivot in column $c$ is zero. Then increase the active row index from $r$ to $r+1$ and continue with the next column.
[/step]
[step:Prove that the algorithm terminates in row echelon form]
The algorithm processes only the finitely many columns $1,\dots,n+1$ of the augmented matrix. At each column, either the active row index $r$ stays fixed and the column index increases, or a pivot is created and then $r$ increases by $1$. Since $1\leq r\leq m+1$ and there are only $n+1$ columns, only finitely many row operations are performed.
Let the pivot rows produced by the algorithm be indexed in the order in which they are created. Their pivot columns strictly increase because the algorithm processes columns from left to right and never returns to an earlier column. In each pivot column, all entries below the pivot are zero by the row-clearing computation in the previous step. If a row receives no pivot after all columns $1,\dots,n+1$ have been processed, then every entry of that row in the augmented matrix is zero; otherwise the first nonzero entry in that row at or to the right of the current column would have been chosen as a pivot when its column was processed. Since the active row index advances only after creating a pivot, every row without a pivot lies below every pivot row. Therefore all zero rows are below the nonzero pivot rows, pivot columns strictly increase as one moves downward, and all entries below each pivot are zero. Hence the final augmented matrix is in row echelon form.
[/step]
[step:Combine construction with solution preservation]
Every operation used in the construction is an elementary row operation: a row interchange, multiplication of a row by a nonzero scalar, or replacement of one row by itself plus a scalar multiple of another row. By the direct verification above, each such operation preserves the solution set of the associated linear system. Since the algorithm performs only finitely many such operations and ends in row echelon form, the original augmented matrix $[A\mid b]$ can be transformed into row echelon form by a finite sequence of elementary row operations, and after every operation in the sequence the solution set remains the solution set of $Ax=b$.
[/step]