[proofplan]
We prove the equivalence in two directions. Submodularity directly gives diminishing returns by applying the defining inequality to the two sets $A \cup \{e\}$ and $B$. Conversely, assuming diminishing returns, we compare the gain of adding the elements of $T \setminus S$ to $S \cap T$ with the gain of adding the same elements to $S$; summing the resulting marginal inequalities telescopes into the submodular inequality.
[/proofplan]
[step:Derive diminishing returns from submodularity]
Assume $f$ is submodular. Let $A, B \subset E$ satisfy $A \subset B$, and let $e \in E \setminus B$. Apply submodularity to the sets $A \cup \{e\}$ and $B$. Since $A \subset B$ and $e \notin B$, we have
\begin{align*}
(A \cup \{e\}) \cap B = A.
\end{align*}
Also,
\begin{align*}
(A \cup \{e\}) \cup B = B \cup \{e\}.
\end{align*}
Therefore submodularity gives
\begin{align*}
f(A \cup \{e\}) + f(B) \geq f(B \cup \{e\}) + f(A).
\end{align*}
Rearranging this inequality yields
\begin{align*}
f(A \cup \{e\}) - f(A) \geq f(B \cup \{e\}) - f(B),
\end{align*}
which is exactly
\begin{align*}
\Delta_f(e \mid A) \geq \Delta_f(e \mid B).
\end{align*}
[/step]
[step:Compare the same added elements from two nested starting sets]
Assume that the diminishing returns inequality holds for all $A \subset B \subset E$ and all $e \in E \setminus B$. Let $S, T \subset E$ be arbitrary. Define
\begin{align*}
C := S \cap T.
\end{align*}
Since $E$ is finite, the finite set $T \setminus S$ can be written as
\begin{align*}
T \setminus S = \{e_1, \dots, e_m\}
\end{align*}
for some integer $m \geq 0$, where the elements $e_1, \dots, e_m$ are distinct. For each $i \in \{0, 1, \dots, m\}$, define
\begin{align*}
C_i := C \cup \{e_1, \dots, e_i\}.
\end{align*}
Also define
\begin{align*}
S_i := S \cup \{e_1, \dots, e_i\},
\end{align*}
with the convention that $\{e_1, \dots, e_0\} = \varnothing$.
For each $i \in \{1, \dots, m\}$, we have $C_{i-1} \subset S_{i-1}$ because $C \subset S$, and $e_i \notin S_{i-1}$ because $e_i \in T \setminus S$ and the elements $e_1, \dots, e_m$ are distinct. Hence the diminishing returns hypothesis gives
\begin{align*}
\Delta_f(e_i \mid C_{i-1}) \geq \Delta_f(e_i \mid S_{i-1}).
\end{align*}
Expanding the marginal gains,
\begin{align*}
f(C_i) - f(C_{i-1}) \geq f(S_i) - f(S_{i-1})
\end{align*}
for every $i \in \{1, \dots, m\}$.
[guided]
The goal is to prove submodularity for arbitrary $S, T \subset E$. The inequality we want is
\begin{align*}
f(S) + f(T) \geq f(S \cup T) + f(S \cap T).
\end{align*}
Equivalently, after moving terms, it is enough to prove
\begin{align*}
f(T) - f(S \cap T) \geq f(S \cup T) - f(S).
\end{align*}
This formulation suggests comparing two ways of adding the elements that are in $T$ but not already in $S$.
Define
\begin{align*}
C := S \cap T.
\end{align*}
Since $E$ is finite, the set $T \setminus S$ is finite. Choose an enumeration
\begin{align*}
T \setminus S = \{e_1, \dots, e_m\}
\end{align*}
for some integer $m \geq 0$, with $e_1, \dots, e_m$ distinct. For each $i \in \{0, 1, \dots, m\}$, define the first partial-build set by
\begin{align*}
C_i := C \cup \{e_1, \dots, e_i\}.
\end{align*}
Define the second partial-build set by
\begin{align*}
S_i := S \cup \{e_1, \dots, e_i\},
\end{align*}
where $\{e_1, \dots, e_0\}=\varnothing$. Thus $C_0=C$ and $S_0=S$.
For each $i \in \{1, \dots, m\}$, the set $C_{i-1}$ is contained in $S_{i-1}$ because $C=S \cap T \subset S$. Also, $e_i \notin S_{i-1}$: indeed $e_i \notin S$ because $e_i \in T \setminus S$, and $e_i$ is not one of $e_1, \dots, e_{i-1}$ because the enumeration has distinct elements. Therefore the diminishing returns hypothesis applies to the nested sets $C_{i-1} \subset S_{i-1}$ and the element $e_i \in E \setminus S_{i-1}$, giving
\begin{align*}
\Delta_f(e_i \mid C_{i-1}) \geq \Delta_f(e_i \mid S_{i-1}).
\end{align*}
By the definition of marginal gain, this is exactly
\begin{align*}
f(C_{i-1} \cup \{e_i\}) - f(C_{i-1})
\geq
f(S_{i-1} \cup \{e_i\}) - f(S_{i-1}).
\end{align*}
Because $C_i=C_{i-1}\cup\{e_i\}$ and $S_i=S_{i-1}\cup\{e_i\}$, the inequality becomes
\begin{align*}
f(C_i) - f(C_{i-1}) \geq f(S_i) - f(S_{i-1})
\end{align*}
for every $i \in \{1, \dots, m\}$.
[/guided]
[/step]
[step:Sum the marginal comparisons to recover submodularity]
Summing the inequalities from $i=1$ to $m$ gives
\begin{align*}
\sum_{i=1}^m \bigl(f(C_i)-f(C_{i-1})\bigr)
\geq
\sum_{i=1}^m \bigl(f(S_i)-f(S_{i-1})\bigr).
\end{align*}
Both sums telescope, so
\begin{align*}
f(C_m)-f(C_0) \geq f(S_m)-f(S_0).
\end{align*}
By construction,
\begin{align*}
C_0 &= S \cap T, &
C_m &= T, &
S_0 &= S, &
S_m &= S \cup T.
\end{align*}
Therefore
\begin{align*}
f(T)-f(S \cap T) \geq f(S \cup T)-f(S).
\end{align*}
Rearranging gives
\begin{align*}
f(S)+f(T) \geq f(S \cup T)+f(S \cap T).
\end{align*}
Since $S,T \subset E$ were arbitrary, $f$ is submodular. This proves the converse implication and completes the proof.
[/step]