[proofplan]
We prove both directions. If $(E,\mathcal I)$ is a matroid, we compare the greedy output with an optimal independent set by repeatedly exchanging one element at a time while preserving optimality; the nonincreasing scan order ensures that each inserted greedy element has weight at least as large as the element it replaces. Conversely, if the independence system is not a matroid, then some subset $S \subseteq E$ has maximal independent subsets of different cardinalities. Giving weight $1$ to $S$ and weight $0$ outside $S$, and choosing a tie-breaking order that forces greedy into a smaller maximal independent subset of $S$, produces a greedy output of strictly smaller weight than another independent set.
[/proofplan]
[step:Prove the exchange replacement needed to compare bases]
Assume first that $(E,\mathcal I)$ is a matroid. We use the following replacement fact.
[claim:Basis replacement]
Let $B,C \in \mathcal I$ be maximal independent subsets of $E$, and let $b \in B \setminus C$. Then there exists $c \in C \setminus B$ such that $(C \setminus \{c\}) \cup \{b\} \in \mathcal I$.
[/claim]
[proof]
Because $B$ and $C$ are maximal independent subsets of the same matroid, the basis-cardinality property gives $|B| = |C|$. The set $(B \cap C) \cup \{b\}$ is independent, since it is a subset of $B$.
Choose, among the independent subsets of $C \cup \{b\}$ that contain $(B \cap C) \cup \{b\}$, one that is maximal under inclusion; call it $D$. Such a set exists because $C \cup \{b\}$ is finite. The set $D$ is not equal to $C \cup \{b\}$, since $C$ is maximal independent and $b \notin C$, so $C \cup \{b\}$ is dependent.
We claim that $|D| = |C|$. If $|D| < |C|$, then the matroid exchange axiom applied to the independent sets $D$ and $C$ gives an element $x \in C \setminus D$ such that $D \cup \{x\} \in \mathcal I$. This contradicts the maximality of $D$ inside $C \cup \{b\}$. Hence $|D| = |C|$.
Since $D \subseteq C \cup \{b\}$, $b \in D$, and $|D| = |C|$, the set $D$ has the form $(C \setminus \{c\}) \cup \{b\}$ for a unique element $c \in C$. Because $D$ contains $B \cap C$, this element $c$ cannot lie in $B \cap C$. Thus $c \in C \setminus B$, and the definition of $D$ gives $(C \setminus \{c\}) \cup \{b\} \in \mathcal I$.
[/proof]
Because $E$ is finite, every independent set is contained in a maximal independent set. Since $w(e) \geq 0$ for every $e \in E$, extending an independent set to a maximal independent set cannot decrease its weight. Therefore it is enough to compare the greedy output with maximum-weight maximal independent sets.
[/step]
[step:Exchange an optimal basis until it contains each greedy choice]
Let the greedy algorithm accept the elements $g_1,\dots,g_r$ in the order in which they are accepted, and define
\begin{align*}
G_k := \{g_1,\dots,g_k\}
\end{align*}
for $0 \leq k \leq r$, with $G_0 := \varnothing$. The final greedy set $G_r$ is maximal independent, because every element not accepted was rejected exactly when its addition would violate independence, and later accepted elements can only enlarge the current set.
Let $O_0 \in \mathcal I$ be a maximum-weight maximal independent set. We construct maximal independent sets $O_k$ for $0 \leq k \leq r$ such that $G_k \subseteq O_k$ and
\begin{align*}
\sum_{e \in O_k} w(e) \geq \sum_{e \in O_{k-1}} w(e)
\end{align*}
for each $1 \leq k \leq r$.
Assume $O_{k-1}$ has been constructed. If $g_k \in O_{k-1}$, set $O_k := O_{k-1}$. If $g_k \notin O_{k-1}$, apply the basis replacement claim to the maximal independent sets $G_r$ and $O_{k-1}$ with $b := g_k$. Since $G_{k-1} \subseteq O_{k-1}$, the replacement element $c$ may be chosen in $O_{k-1} \setminus G_{k-1}$. Define
\begin{align*}
O_k := (O_{k-1} \setminus \{c\}) \cup \{g_k\}.
\end{align*}
Then $O_k \in \mathcal I$ and $G_k \subseteq O_k$.
It remains to compare weights. Since $c \in O_{k-1}$ and $G_{k-1} \subseteq O_{k-1}$, heredity gives $G_{k-1} \cup \{c\} \in \mathcal I$. Therefore, if $c$ had been scanned before $g_k$, the greedy algorithm would have accepted $c$ at that earlier moment, because the current greedy set was then a subset of $G_{k-1}$ and adding $c$ would preserve independence by heredity. Since $c \notin G_{k-1}$, the element $c$ was not scanned before $g_k$. The scan order is nonincreasing in $w$, so $w(g_k) \geq w(c)$. Hence
\begin{align*}
\sum_{e \in O_k} w(e) = \sum_{e \in O_{k-1}} w(e) - w(c) + w(g_k) \geq \sum_{e \in O_{k-1}} w(e).
\end{align*}
[guided]
We want to prove that each greedy choice can be inserted into an optimal solution without lowering the total weight. Let the accepted greedy elements be $g_1,\dots,g_r$, and write
\begin{align*}
G_k := \{g_1,\dots,g_k\}.
\end{align*}
The goal is to maintain an optimal maximal independent set that contains the current greedy prefix $G_k$.
Start with a maximum-weight maximal independent set $O_0$. Such a set exists because $E$ is finite. Since all weights are nonnegative, restricting attention to maximal independent sets loses nothing: if an independent set is not maximal, extending it to a maximal independent set adds only elements of nonnegative weight.
Suppose we have already constructed a maximal independent set $O_{k-1}$ containing $G_{k-1}$. If $g_k \in O_{k-1}$, there is nothing to change, and we set $O_k := O_{k-1}$. If $g_k \notin O_{k-1}$, we use the matroid exchange structure. The greedy output $G_r$ is maximal independent, and so is $O_{k-1}$. By basis replacement, there exists an element $c \in O_{k-1} \setminus G_r$ such that
\begin{align*}
O_k := (O_{k-1} \setminus \{c\}) \cup \{g_k\}
\end{align*}
is independent. In particular $c \notin G_{k-1}$, so replacing $c$ by $g_k$ does not destroy the already inserted greedy prefix.
Why does this replacement not reduce weight? We must justify that $c$ is not heavier than $g_k$. Since $c \in O_{k-1}$ and $G_{k-1} \subseteq O_{k-1}$, the hereditary property of the independence system gives
\begin{align*}
G_{k-1} \cup \{c\} \in \mathcal I.
\end{align*}
If $c$ had been scanned before $g_k$, then at the moment $c$ was scanned the current greedy set was a subset of $G_{k-1}$. Adding $c$ to that smaller current set would still have been independent, again by heredity. Therefore the greedy algorithm would have accepted $c$ before accepting $g_k$, contradicting $c \notin G_{k-1}$. Hence $c$ was scanned no earlier than $g_k$. Because the scan order is nonincreasing in $w$, we get $w(g_k) \geq w(c)$. Consequently
\begin{align*}
\sum_{e \in O_k} w(e) = \sum_{e \in O_{k-1}} w(e) - w(c) + w(g_k) \geq \sum_{e \in O_{k-1}} w(e).
\end{align*}
This is the central greedy argument: every greedy element can be forced into an optimal solution by exchanging out an element of no larger weight.
[/guided]
[/step]
[step:Conclude greedy optimality for matroids]
After the preceding construction reaches $k=r$, we have $G_r \subseteq O_r$. Both $G_r$ and $O_r$ are maximal independent subsets of the matroid, so they have the same cardinality. Hence $G_r = O_r$. The weight inequalities give
\begin{align*}
\sum_{e \in G_r} w(e) = \sum_{e \in O_r} w(e) \geq \sum_{e \in O_0} w(e).
\end{align*}
Since $O_0$ was chosen to have maximum weight among maximal independent sets, and every independent set extends to a maximal independent set without decreasing weight, $G_r$ has maximum weight among all members of $\mathcal I$. This proves the greedy correctness assertion for matroids.
[/step]
[step:Construct a weight function that exposes failure of the matroid axiom]
We prove the converse by contrapositive. Suppose $(E,\mathcal I)$ is not a matroid. For a finite independence system, failure of the matroid exchange axiom is equivalent to the existence of a subset $S \subseteq E$ whose maximal independent subsets do not all have the same cardinality. Choose a maximal independent subset $A \subseteq S$ and an independent subset $B \subseteq S$ such that
\begin{align*}
|B| > |A|.
\end{align*}
Define the weight function $w: E \to \mathbb R_{\geq 0}$ by setting $w(e) := 1$ for $e \in S$ and $w(e) := 0$ for $e \in E \setminus S$. Choose a tie-breaking order that scans first the elements of $A$, then the elements of $S \setminus A$, and finally the elements of $E \setminus S$.
The greedy algorithm accepts every element of $A$, because $A \in \mathcal I$ and independence is hereditary. Once all elements of $A$ have been accepted, every element of $S \setminus A$ is rejected: if some $s \in S \setminus A$ could be added, then $A \cup \{s\}$ would be an independent subset of $S$, contradicting the maximality of $A$ in $S$. Elements of $E \setminus S$ may be accepted later, but all such elements have weight $0$. Therefore the greedy output $G$ satisfies
\begin{align*}
\sum_{e \in G} w(e) = |A|.
\end{align*}
On the other hand, $B \in \mathcal I$ and every element of $B$ lies in $S$, so
\begin{align*}
\sum_{e \in B} w(e) = |B| > |A|.
\end{align*}
Thus, for this nonnegative weight function and this tie-breaking order, greedy does not return a maximum-weight member of $\mathcal I$.
[/step]
[step:Finish the equivalence]
The previous step shows that if the universal greedy correctness property holds for every nonnegative weight function and every tie-breaking order, then no subset $S \subseteq E$ can have maximal independent subsets of different cardinalities. For finite independence systems, this maximal-cardinality property is equivalent to the matroid exchange axiom. Hence $(E,\mathcal I)$ is a matroid. Combining this converse with the greedy correctness already proved for matroids establishes the equivalence.
[/step]