[proofplan]
We compare the greedy set with every other basis by looking at all weight prefixes. The key invariant is that, after the first $i$ elements have been processed, the greedy set has maximum possible size inside the prefix $P_i=\{e_1,\dots,e_i\}$. This gives prefix-count domination $|B\cap P_i|\le |G\cap P_i|$ for every basis $B$. Finally, because the weights are ordered non-increasingly, summation by parts converts these prefix-count inequalities into the desired weight inequality.
[/proofplan]
[step:Define the prefix rank function and the greedy prefixes]
Let $m:=|E|$. If $m\ge 1$, let $e_1,\dots,e_m$ denote an ordering of $E$ chosen by the greedy algorithm, arranged in non-increasing weight order: $w(e_1)\ge \cdots \ge w(e_m)$.
Define the [matroid rank](/page/Matroid%20Rank) function $r:\mathcal P(E)\to \{0,1,\dots,m\}$ by declaring, for each subset $A\subset E$,
\begin{align*}
r(A):=\max\{|J|:J\subset A\text{ and }J\in \mathcal I\}.
\end{align*}
For each $0\le i\le m$, define the prefix
\begin{align*}
P_i:=\{e_1,\dots,e_i\},
\end{align*}
with $P_0=\varnothing$. By construction, $G_i\subset P_i$ and $G_i\in\mathcal I$ for every $0\le i\le m$.
[/step]
[step:Prove that the greedy prefix has maximum possible size in each prefix]
We prove by induction on $i$ that
\begin{align*}
|G_i|=r(P_i)
\end{align*}
for every $0\le i\le m$.
For $i=0$, both $G_0$ and $P_0$ are empty, so $|G_0|=0=r(P_0)$.
Assume $1\le i\le m$ and suppose $|G_{i-1}|=r(P_{i-1})$. If $G_{i-1}\cup\{e_i\}\in\mathcal I$, then $G_i=G_{i-1}\cup\{e_i\}$, hence $|G_i|=|G_{i-1}|+1$. Since $P_i=P_{i-1}\cup\{e_i\}$, any independent subset of $P_i$ contains at most one more element than an independent subset of $P_{i-1}$, so
\begin{align*}
r(P_i)\le r(P_{i-1})+1=|G_{i-1}|+1=|G_i|.
\end{align*}
The reverse inequality $|G_i|\le r(P_i)$ holds because $G_i\subset P_i$ and $G_i\in\mathcal I$. Therefore $|G_i|=r(P_i)$.
Now suppose $G_{i-1}\cup\{e_i\}\notin\mathcal I$. Then $G_i=G_{i-1}$. We claim that no independent subset of $P_i$ has cardinality greater than $|G_{i-1}|$. If such a set $J\subset P_i$ with $J\in\mathcal I$ and $|J|>|G_{i-1}|$ existed, then the independent-set exchange axiom in the [definition of a matroid](/page/Matroid) applied to the independent sets $G_{i-1}$ and $J$ would give an element $x\in J\setminus G_{i-1}$ such that $G_{i-1}\cup\{x\}\in\mathcal I$. If $x\in P_{i-1}$, this contradicts the induction hypothesis $|G_{i-1}|=r(P_{i-1})$, since $G_{i-1}$ would not be a largest independent subset of $P_{i-1}$. If $x=e_i$, this contradicts the assumption that $G_{i-1}\cup\{e_i\}\notin\mathcal I$. Hence no such $J$ exists, and therefore
\begin{align*}
r(P_i)=|G_{i-1}|=|G_i|.
\end{align*}
[guided]
We want to show that the greedy algorithm is not merely independent at each prefix, but optimal in cardinality inside that prefix. For a subset $A\subset E$, recall that
\begin{align*}
r(A)=\max\{|J|:J\subset A\text{ and }J\in\mathcal I\}.
\end{align*}
Thus the target statement at prefix $P_i=\{e_1,\dots,e_i\}$ is
\begin{align*}
|G_i|=r(P_i).
\end{align*}
The base case is $i=0$. Here $P_0=\varnothing$ and $G_0=\varnothing$, so the largest independent subset of $P_0$ has size $0$, giving $|G_0|=r(P_0)$.
Now assume $|G_{i-1}|=r(P_{i-1})$. There are two cases, according to the greedy decision at $e_i$.
First suppose $G_{i-1}\cup\{e_i\}\in\mathcal I$. Then the greedy algorithm accepts $e_i$, so
\begin{align*}
G_i=G_{i-1}\cup\{e_i\}
\end{align*}
and therefore
\begin{align*}
|G_i|=|G_{i-1}|+1.
\end{align*}
Since $P_i$ is obtained from $P_{i-1}$ by adding exactly one new element, an independent subset of $P_i$ can have size at most $r(P_{i-1})+1$. Using the induction hypothesis,
\begin{align*}
r(P_i)\le r(P_{i-1})+1=|G_{i-1}|+1=|G_i|.
\end{align*}
On the other hand, $G_i$ itself is an independent subset of $P_i$, so $|G_i|\le r(P_i)$. Combining the two inequalities gives $|G_i|=r(P_i)$.
Second suppose $G_{i-1}\cup\{e_i\}\notin\mathcal I$. Then the greedy algorithm rejects $e_i$, so $G_i=G_{i-1}$. We must prove that the rank of the prefix also does not increase. Assume toward a contradiction that there is an independent set $J\subset P_i$ with
\begin{align*}
|J|>|G_{i-1}|.
\end{align*}
The independent-set exchange axiom in the [definition of a matroid](/page/Matroid) applies to the independent sets $G_{i-1}$ and $J$: because $J$ is larger, there exists an element $x\in J\setminus G_{i-1}$ such that
\begin{align*}
G_{i-1}\cup\{x\}\in\mathcal I.
\end{align*}
Since $J\subset P_i=P_{i-1}\cup\{e_i\}$, the element $x$ lies either in $P_{i-1}$ or equals $e_i$. If $x\in P_{i-1}$, then $G_{i-1}\cup\{x\}$ is an independent subset of $P_{i-1}$ larger than $G_{i-1}$, contradicting $|G_{i-1}|=r(P_{i-1})$. If $x=e_i$, then $G_{i-1}\cup\{e_i\}\in\mathcal I$, contradicting the fact that the greedy algorithm rejected $e_i$. Both possibilities are impossible, so no independent subset of $P_i$ is larger than $G_{i-1}$. Hence
\begin{align*}
r(P_i)=|G_{i-1}|=|G_i|.
\end{align*}
[/guided]
[/step]
[step:Deduce that the greedy output is a basis]
Taking $i=m$ in the prefix rank identity gives
\begin{align*}
|G|=|G_m|=r(P_m)=r(E).
\end{align*}
Since $G\subset E$ and $G\in\mathcal I$, the equality $|G|=r(E)$ means that $G$ is an independent subset of $E$ of maximum possible cardinality. Therefore $G$ is a [basis](/page/Matroid) of $M$.
[/step]
[step:Compare every basis with the greedy set on each prefix]
Let $B$ be any basis of $M$. For each $0\le i\le m$, the set $B\cap P_i$ is independent because it is a subset of the independent set $B$. Since $B\cap P_i\subset P_i$, the definition of rank gives
\begin{align*}
|B\cap P_i|\le r(P_i).
\end{align*}
Using the prefix rank identity and the equality $G\cap P_i=G_i$, we obtain
\begin{align*}
|B\cap P_i|\le r(P_i)=|G_i|=|G\cap P_i|
\end{align*}
for every $0\le i\le m$.
[/step]
[step:Convert prefix-count domination into the weight inequality]
If $m=0$, then $E=\varnothing$. The only basis is $\varnothing$, the greedy output is $G=\varnothing$, and every basis has total weight $0$. Hence $G$ is maximum-weight in this case. For the rest of the step, assume $m\ge 1$.
For a subset $A\subset E$, define its prefix-count function $c_A:\{0,1,\dots,m\}\to \{0,1,\dots,m\}$ by
\begin{align*}
c_A(i):=|A\cap P_i|.
\end{align*}
Then $c_A(0)=0$. For $1\le i\le m$, the difference $c_A(i)-c_A(i-1)$ equals $1$ if $e_i\in A$ and equals $0$ if $e_i\notin A$. Thus
\begin{align*}
\sum_{e\in A}w(e)=\sum_{i=1}^m w(e_i)\bigl(c_A(i)-c_A(i-1)\bigr).
\end{align*}
We now derive the finite summation identity used below. Since $c_A(0)=0$, first expand the finite sum as
\begin{align*}
\sum_{i=1}^m w(e_i)\bigl(c_A(i)-c_A(i-1)\bigr)=\sum_{i=1}^m w(e_i)c_A(i)-\sum_{i=1}^m w(e_i)c_A(i-1).
\end{align*}
Reindexing the second sum and using $c_A(0)=0$ gives
\begin{align*}
\sum_{i=1}^m w(e_i)c_A(i)-\sum_{i=1}^m w(e_i)c_A(i-1)=\sum_{i=1}^m w(e_i)c_A(i)-\sum_{i=1}^{m-1} w(e_{i+1})c_A(i).
\end{align*}
Separating the final term of the first sum and collecting the coefficients of $c_A(i)$ for $1\le i<m$ yields
\begin{align*}
\sum_{i=1}^m w(e_i)c_A(i)-\sum_{i=1}^{m-1} w(e_{i+1})c_A(i)=w(e_m)c_A(m)+\sum_{i=1}^{m-1}\bigl(w(e_i)-w(e_{i+1})\bigr)c_A(i).
\end{align*}
Therefore
\begin{align*}
\sum_{e\in A}w(e)
=
w(e_m)c_A(m)+\sum_{i=1}^{m-1}\bigl(w(e_i)-w(e_{i+1})\bigr)c_A(i).
\end{align*}
Apply this formula first to $A=G$ and then to $A=B$. Since $G$ and $B$ are both bases, $c_G(m)=|G|=|B|=c_B(m)$. From the previous step, $c_B(i)\le c_G(i)$ for every $0\le i\le m$. Since the ordering satisfies $w(e_i)-w(e_{i+1})\ge 0$ for every $1\le i<m$, we obtain
\begin{align*}
\sum_{e\in B}w(e)=w(e_m)c_B(m)+\sum_{i=1}^{m-1}\bigl(w(e_i)-w(e_{i+1})\bigr)c_B(i).
\end{align*}
Using $c_B(m)=c_G(m)$ and $c_B(i)\le c_G(i)$ for $1\le i<m$, with each coefficient $w(e_i)-w(e_{i+1})$ non-negative, this gives
\begin{align*}
\sum_{e\in B}w(e)\le w(e_m)c_G(m)+\sum_{i=1}^{m-1}\bigl(w(e_i)-w(e_{i+1})\bigr)c_G(i).
\end{align*}
Applying the same summation-by-parts formula to $A=G$, the right-hand side is
\begin{align*}
w(e_m)c_G(m)+\sum_{i=1}^{m-1}\bigl(w(e_i)-w(e_{i+1})\bigr)c_G(i)=\sum_{e\in G}w(e).
\end{align*}
Therefore $G$ has weight at least that of every basis $B$ of $M$, so $G$ is a maximum-weight basis.
[/step]