[proofplan]
We compare the mean of $f(X)$ to a median $m_f$ of $f(X)$ by integrating the median tail estimate. This shows that $|\mathbb{E}[f(X)]-m_f|$ is at most a universal multiple of the Lipschitz constant $L$. For deviations from the mean that are small on this scale, the desired estimate holds trivially once the universal constant is large enough, because its right-hand side is then at least $1$; for large deviations, the event above the mean is contained in an event above the median, where the assumed concentration inequality applies. The concave lower-tail estimate follows by applying the convex upper-tail estimate to $-f$.
[/proofplan]
[step:Bound the distance between the mean and a median by integrating the tail]
Let $L>0$, let $f:\mathbb{R}^n \to \mathbb{R}$ be convex and $L$-Lipschitz, and define the real-valued [random variable](/page/Random%20Variable)
\begin{align*}
Y:\Omega \to \mathbb{R}
\end{align*}
by $Y(\omega)=f(X(\omega))$. Let $m \in \mathbb{R}$ be a median of $Y$. Let $\mathcal{L}^1$ denote one-dimensional [Lebesgue measure](/page/Lebesgue%20Measure) on $\mathbb{R}$, restricted to the Borel subsets of the integration intervals appearing below.
The random variable $|Y-m|:\Omega \to [0,\infty)$ is non-negative and measurable. Therefore the [Layer-Cake Representation](/theorems/2956), applied to $|Y-m|$, gives
\begin{align*}
\mathbb{E}[|Y-m|]=\int_0^\infty \mathbb{P}(|Y-m|>s)\,d\mathcal{L}^1(s).
\end{align*}
By the assumed median concentration estimate, the integrand is at most $A\exp\left(-s^2/(BL^2)\right)$ for every $s\ge 0$, so
\begin{align*}
\mathbb{E}[|Y-m|]\le A\int_0^\infty \exp\left(-\frac{s^2}{B L^2}\right)\,d\mathcal{L}^1(s).
\end{align*}
Use the change of variables
\begin{align*}
u=\frac{s}{\sqrt{B}L}.
\end{align*}
Then $d\mathcal{L}^1(s)=\sqrt{B}L\,d\mathcal{L}^1(u)$, and $s \in (0,\infty)$ corresponds to $u \in (0,\infty)$. Hence
\begin{align*}
\mathbb{E}[|Y-m|]\le A\sqrt{B}L\int_0^\infty e^{-u^2}\,d\mathcal{L}^1(u).
\end{align*}
The Gaussian integral $\int_0^\infty e^{-u^2}\,d\mathcal{L}^1(u)$ is finite, so define the universal constant
\begin{align*}
C_0:=A\sqrt{B}\int_0^\infty e^{-u^2}\,d\mathcal{L}^1(u).
\end{align*}
In particular $Y$ is integrable, and since $\mathbb{E}[Y]-m=\mathbb{E}[Y-m]$,
\begin{align*}
|\mathbb{E}[Y]-m|\le \mathbb{E}[|Y-m|]\le C_0L.
\end{align*}
[guided]
The first task is to move from a median estimate to a mean estimate. Define the real-valued random variable
\begin{align*}
Y:\Omega \to \mathbb{R}
\end{align*}
by $Y(\omega)=f(X(\omega))$, and let $m \in \mathbb{R}$ be a median of $Y$. Let $\mathcal{L}^1$ denote one-dimensional Lebesgue measure on $\mathbb{R}$, restricted to the Borel subsets of the integration intervals appearing below. Since the concentration hypothesis controls deviations of $Y$ from $m$, we estimate the absolute first moment of $Y-m$.
For a non-negative random variable $Z:\Omega \to [0,\infty]$, the [Layer-Cake Representation](/theorems/2956) states that
\begin{align*}
\mathbb{E}[Z]=\int_0^\infty \mathbb{P}(Z>s)\,d\mathcal{L}^1(s).
\end{align*}
Here $Z=|Y-m|$ is non-negative and measurable, because $Y$ is a real-valued random variable and $m$ is a real constant. Applying the theorem to $Z=|Y-m|$ gives
\begin{align*}
\mathbb{E}[|Y-m|]=\int_0^\infty \mathbb{P}(|Y-m|>s)\,d\mathcal{L}^1(s).
\end{align*}
The median concentration hypothesis applies because $f$ is convex and $L$-Lipschitz. Hence, for every $s \ge 0$,
\begin{align*}
\mathbb{P}(|Y-m|>s)\le \mathbb{P}(|Y-m|\ge s)\le A\exp\left(-\frac{s^2}{B L^2}\right).
\end{align*}
Substitution into the tail integral yields
\begin{align*}
\mathbb{E}[|Y-m|]\le A\int_0^\infty \exp\left(-\frac{s^2}{B L^2}\right)\,d\mathcal{L}^1(s).
\end{align*}
Now perform the one-dimensional change of variables
\begin{align*}
u=\frac{s}{\sqrt{B}L}.
\end{align*}
Since $L>0$ and $B>0$, this is an increasing bijection from $(0,\infty)$ to $(0,\infty)$, and the measure transforms as $d\mathcal{L}^1(s)=\sqrt{B}L\,d\mathcal{L}^1(u)$. Therefore
\begin{align*}
\mathbb{E}[|Y-m|]\le A\sqrt{B}L\int_0^\infty e^{-u^2}\,d\mathcal{L}^1(u).
\end{align*}
The integral $\int_0^\infty e^{-u^2}\,d\mathcal{L}^1(u)$ is finite. Define
\begin{align*}
C_0:=A\sqrt{B}\int_0^\infty e^{-u^2}\,d\mathcal{L}^1(u).
\end{align*}
This number is finite and universal because it depends only on the universal constants $A$ and $B$. Finally, since $\mathbb{E}[|Y-m|]<\infty$, the random variable $Y$ is integrable, and $|\mathbb{E}[Y]-m|=|\mathbb{E}[Y-m]|\le \mathbb{E}[|Y-m|]$ by the triangle inequality for integrals, so
\begin{align*}
|\mathbb{E}[Y]-m|\le C_0L.
\end{align*}
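For orientation, the constant can be written in closed form. The following is a short worked evaluation, assuming only the classical value of the Gaussian integral; the rest of the proof uses nothing beyond the finiteness of $C_0$.
\begin{align*}
\int_0^\infty e^{-u^2}\,d\mathcal{L}^1(u)=\frac{\sqrt{\pi}}{2},
\qquad\text{so}\qquad
C_0=\frac{A\sqrt{\pi B}}{2}.
\end{align*}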
[/guided]
[/step]
[step:Handle deviations no larger than the mean-median error scale]
Choose a universal constant $C\ge 1$ so large that
\begin{align*}
C\ge e^{1/2}
\end{align*}
and
\begin{align*}
C\ge 8C_0^2.
\end{align*}
If $0\le t\le 2C_0L$, then
\begin{align*}
\frac{t^2}{CL^2}\le \frac{4C_0^2}{C}\le \frac{1}{2}.
\end{align*}
Therefore
\begin{align*}
C\exp\left(-\frac{t^2}{CL^2}\right)\ge Ce^{-1/2}\ge 1.
\end{align*}
Since every probability is at most $1$,
\begin{align*}
\mathbb{P}(Y\ge \mathbb{E}[Y]+t)\le C\exp\left(-\frac{t^2}{CL^2}\right).
\end{align*}
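As a quick check that both requirements on $C$ are used, consider the extreme case of this regime. This is only an illustrative sketch: the value $C=8C_0^2$ is a hypothetical choice (admissible only if also $8C_0^2\ge e^{1/2}$), and $t=2C_0L$ is the right endpoint of the range.
\begin{align*}
\frac{t^2}{CL^2}=\frac{4C_0^2L^2}{8C_0^2L^2}=\frac{1}{2},
\qquad
C\exp\left(-\frac{t^2}{CL^2}\right)=Ce^{-1/2}.
\end{align*}
So the condition $C\ge 8C_0^2$ caps the exponent at $1/2$, and the condition $C\ge e^{1/2}$ is exactly what makes the resulting bound at least $1$.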
[/step]
[step:Convert large deviations from the mean into deviations from the median]
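Increase $C$, if necessary, so that also $C\ge A$ and $C\ge 4B$; the conditions $C\ge e^{1/2}$ and $C\ge 8C_0^2$ from the previous step remain valid. Suppose $t>2C_0L$, so that $C_0L<t/2$. From $|\mathbb{E}[Y]-m|\le C_0L$, we have
\begin{align*}
\mathbb{E}[Y]+t\ge m+t-|\mathbb{E}[Y]-m|\ge m+t-C_0L>m+\frac{t}{2}.
\end{align*}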
Hence
\begin{align*}
\{Y\ge \mathbb{E}[Y]+t\}\subseteq \left\{Y\ge m+\frac{t}{2}\right\}.
\end{align*}
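Since $Y\ge m+t/2$ implies $|Y-m|\ge t/2$, applying the median concentration estimate with $s=t/2$ gives
\begin{align*}
\mathbb{P}(Y\ge \mathbb{E}[Y]+t)\le \mathbb{P}\left(|Y-m|\ge \frac{t}{2}\right)\le A\exp\left(-\frac{(t/2)^2}{BL^2}\right)=A\exp\left(-\frac{t^2}{4BL^2}\right).
\end{align*}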
Because $C\ge A$ and $C\ge 4B$, we have $A\le C$ and $\frac{t^2}{4BL^2}\ge \frac{t^2}{CL^2}$, so
\begin{align*}
A\exp\left(-\frac{t^2}{4BL^2}\right)\le C\exp\left(-\frac{t^2}{CL^2}\right).
\end{align*}
Combined with the previous step, which covers $0\le t\le 2C_0L$, this shows that the desired convex upper-tail estimate holds for all $t\ge 0$.
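The requirements placed on $C$ in the last two steps can be collected into one explicit choice. The display below is a sketch of one admissible value, not the smallest possible one; it uses only the constants $A$, $B$, and $C_0$ already introduced.
\begin{align*}
C:=\max\left\{e^{1/2},\,8C_0^2,\,A,\,4B\right\}.
\end{align*}
This $C$ satisfies $C\ge 1$ automatically because $e^{1/2}>1$, and it is universal because $C_0$ depends only on $A$ and $B$.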
[/step]
[step:Apply the convex upper-tail estimate to the negative function]
Let $f:\mathbb{R}^n\to\mathbb{R}$ be concave and $L$-Lipschitz. Define
\begin{align*}
g:\mathbb{R}^n \to \mathbb{R}
\end{align*}
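by $g(x)=-f(x)$. Then $g$ is convex, as the negative of a concave function, and $L$-Lipschitz, since $|g(x)-g(y)|=|f(x)-f(y)|$ for all $x,y\in\mathbb{R}^n$. Applying the convex upper-tail estimate to $g$ gives, for every $t\ge 0$,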
\begin{align*}
\mathbb{P}(g(X)\ge \mathbb{E}[g(X)]+t)\le C\exp\left(-\frac{t^2}{CL^2}\right).
\end{align*}
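Since $g(X)=-f(X)$ and $\mathbb{E}[g(X)]=-\mathbb{E}[f(X)]$, the inequality $g(X)\ge \mathbb{E}[g(X)]+t$ reads $-f(X)\ge -\mathbb{E}[f(X)]+t$. Multiplying by $-1$ shows that the event on the left is exactly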
\begin{align*}
\{f(X)\le \mathbb{E}[f(X)]-t\}.
\end{align*}
Therefore
\begin{align*}
\mathbb{P}(f(X)\le \mathbb{E}[f(X)]-t)\le C\exp\left(-\frac{t^2}{CL^2}\right).
\end{align*}
This proves the corresponding lower-tail bound for concave $f$ and completes the proof.
[/step]