[proofplan]
We introduce the hitting time $T = \inf\{k \geq 0 : X_k \geq \lambda\}$ and use the submartingale property of the stopped process to relate $\mathbb{E}[X_n]$ to $\mathbb{E}[X_{\min\{T,n\}}]$. Decomposing the latter according to whether $T \leq n$ (the running maximum has exceeded $\lambda$) or $T > n$ (it has not) produces the desired bound.
[/proofplan]
[step:Define the hitting time and apply optional stopping]
Let $T = \inf\{k \geq 0 : X_k \geq \lambda\}$, with $\inf \varnothing = \infty$. Then $T$ is a stopping time (since $\{T \leq k\} = \bigcup_{j=0}^k \{X_j \geq \lambda\} \in \mathcal{F}_k$), and $\{T \leq n\} = \{X_n^* \geq \lambda\}$.
The stopped time $\min\{T, n\}$ is a bounded stopping time (at most $n$). Since $X$ is a non-negative submartingale, the stopped process $X_{\min\{T,n\}}$ satisfies $\mathbb{E}[X_{\min\{T,n\}}] \leq \mathbb{E}[X_n]$ (the submartingale version of the [Optional Stopping Theorem](/theorems/1153): stopping a submartingale at a bounded time can only decrease the expectation, i.e., $\mathbb{E}[X_{\min\{T,n\}}] \leq \mathbb{E}[X_n]$).
[/step]
[step:Decompose the stopped expectation and extract the maximal bound]
Decompose $X_{\min\{T,n\}}$ according to whether the threshold has been reached:
\begin{align*}
\mathbb{E}[X_{\min\{T,n\}}] = \mathbb{E}[X_T \, \mathbb{1}_{\{T \leq n\}}] + \mathbb{E}[X_n \, \mathbb{1}_{\{T > n\}}].
\end{align*}
On $\{T \leq n\}$, we have $X_T \geq \lambda$ (by definition of $T$), so $\mathbb{E}[X_T \, \mathbb{1}_{\{T \leq n\}}] \geq \lambda \, \mathbb{P}(T \leq n) = \lambda \, \mathbb{P}(X_n^* \geq \lambda)$. Since $X_n \geq 0$, the second term is non-negative: $\mathbb{E}[X_n \, \mathbb{1}_{\{T > n\}}] \geq 0$. Therefore:
\begin{align*}
\mathbb{E}[X_n] \geq \mathbb{E}[X_{\min\{T,n\}}] \geq \lambda \, \mathbb{P}(X_n^* \geq \lambda) + \mathbb{E}[X_n \, \mathbb{1}_{\{T > n\}}].
\end{align*}
Rearranging and using $\{T > n\} = \{X_n^* < \lambda\}$:
\begin{align*}
\lambda \, \mathbb{P}(X_n^* \geq \lambda) \leq \mathbb{E}[X_n] - \mathbb{E}[X_n \, \mathbb{1}_{\{X_n^* < \lambda\}}] = \mathbb{E}[X_n \, \mathbb{1}_{\{X_n^* \geq \lambda\}}].
\end{align*}
The final inequality $\mathbb{E}[X_n \, \mathbb{1}_{\{X_n^* \geq \lambda\}}] \leq \mathbb{E}[X_n]$ is immediate since $X_n \geq 0$.
[guided]
The proof has an elegant structure: the stopping time $T$ captures the first time the process exceeds $\lambda$, and the submartingale property prevents the expectation from decreasing when we stop early. The decomposition into $\{T \leq n\}$ and $\{T > n\}$ separates the sample paths that have triggered the maximum from those that have not.
On $\{T \leq n\}$, we know $X_T \geq \lambda$ by construction of $T$, which gives a lower bound of $\lambda$ for the stopped value. On $\{T > n\}$, we only know $X_n \geq 0$, which contributes a non-negative term. The submartingale inequality $\mathbb{E}[X_{\min\{T,n\}}] \leq \mathbb{E}[X_n]$ then forces:
\begin{align*}
\lambda \, \mathbb{P}(X_n^* \geq \lambda) \leq \mathbb{E}[X_T \, \mathbb{1}_{\{T \leq n\}}] \leq \mathbb{E}[X_{\min\{T,n\}}] \leq \mathbb{E}[X_n].
\end{align*}
The sharper bound $\lambda \, \mathbb{P}(X_n^* \geq \lambda) \leq \mathbb{E}[X_n \, \mathbb{1}_{\{X_n^* \geq \lambda\}}]$ is obtained by keeping the term $\mathbb{E}[X_n \, \mathbb{1}_{\{T > n\}}]$ on the right side rather than discarding it.
[/guided]
[/step]