[proofplan]
We prove the upper-tail estimate for convex $f$ by comparing the upper level set $\{f \geq m+t\}$ with the Talagrand convex distance from the median sublevel set $A=\{f \leq m\}$. Convexity and the Lipschitz bound give, at each upper-tail point $x$, a separating subgradient whose coordinates define Talagrand weights. The diameter assumption converts coordinate displacements into coordinate disagreement indicators, giving $d_T(x,A)\geq t/L$. Talagrand's product-space convex distance inequality then gives the stated probability bound, and the concave lower-tail estimate follows by applying the convex case to $-f$.
[/proofplan]
custom_env
admin
[step:Define the median sublevel set and the Talagrand convex distance]
Define the sublevel set
\begin{align*}
A := \{y \in K : f(y) \leq m\}.
\end{align*}
Since $m$ is a median of $f(X)$, we have
\begin{align*}
\mathbb{P}(X \in A)=\mathbb{P}(f(X)\leq m)\geq \frac{1}{2}.
\end{align*}
For a subset $E \subset K$, define the Talagrand convex distance from $x \in K$ to $E$ by
\begin{align*}
d_T(x,E) := \sup_{\alpha \in \mathbb{R}^n,\ |\alpha|\leq 1}\ \inf_{y \in E}\sum_{i=1}^{n} |\alpha_i|\,\mathbb{1}_{\{x_i\neq y_i\}}.
\end{align*}
Thus $d_T: K \times \mathcal{P}(K) \to [0,\infty]$ measures the largest weighted coordinate-disagreement distance from $x$ to $E$, where the weights have Euclidean norm at most $1$.
[/step]
custom_env
admin
[step:Separate an upper-tail point from the median sublevel set]Fix $t \geq 0$ and fix $x \in K$ satisfying $f(x)\geq m+t$. By the supporting subgradient theorem for Lipschitz convex functions on convex subsets of Euclidean space, applied to the convex $L$-Lipschitz map $f:K\to\mathbb{R}$ at the point $x$, there exists a vector $g=(g_1,\dots,g_n)\in\mathbb{R}^n$ with $|g|\leq L$ such that
\begin{align*}
f(y)\geq f(x)+g\cdot (y-x)
\end{align*}
for every $y\in K$. Here $g\cdot(y-x)=\sum_{i=1}^{n}g_i(y_i-x_i)$ is the Euclidean [inner product](/page/Inner%20Product). Equivalently,
\begin{align*}
g\cdot(x-y)\geq f(x)-f(y)
\end{align*}
for every $y\in K$. In particular, for every $y\in A$, the inequalities $f(x)\geq m+t$ and $f(y)\leq m$ give
\begin{align*}
g\cdot(x-y)\geq t.
\end{align*}
The cited supporting-subgradient fact is a standard consequence of finite-dimensional convex separation; citing a result not yet in the wiki: supporting subgradient theorem for Lipschitz convex functions on convex subsets of Euclidean space.[/step]
custom_env
admin
[guided]Fix $t\geq 0$ and take a point $x\in K$ in the upper tail, so $f(x)\geq m+t$. We want to show that $x$ is far from the median sublevel set $A$ in Talagrand's convex distance. The bridge between the value gap $f(x)-f(y)$ and coordinate disagreements is a separating linear functional.
Because $f:K\to\mathbb{R}$ is convex and $L$-Lipschitz, the supporting subgradient theorem for Lipschitz convex functions on convex subsets of Euclidean space applies at $x$. Its hypotheses are exactly the convexity of the domain $K$, the convexity of $f$, and the finite Lipschitz constant $L$. It gives a vector $g=(g_1,\dots,g_n)\in\mathbb{R}^n$ with $|g|\leq L$ such that
\begin{align*}
f(y)\geq f(x)+g\cdot(y-x)
\end{align*}
for every $y\in K$.
Rearranging this supporting inequality gives
\begin{align*}
g\cdot(x-y)\geq f(x)-f(y)
\end{align*}
for every $y\in K$. Now restrict to $y\in A$. By definition of $A$, such a point satisfies $f(y)\leq m$, while the chosen point $x$ satisfies $f(x)\geq m+t$. Hence
\begin{align*}
f(x)-f(y)\geq t.
\end{align*}
Combining the two inequalities yields
\begin{align*}
g\cdot(x-y)\geq t
\end{align*}
for every $y\in A$.
This is the separation step: one vector $g$ works simultaneously for all points $y$ in the median sublevel set. The bound $|g|\leq L$ is essential because it lets us normalize $g$ into Talagrand weights of Euclidean norm at most $1$. The cited supporting-subgradient fact is a standard consequence of finite-dimensional convex separation; citing a result not yet in the wiki: supporting subgradient theorem for Lipschitz convex functions on convex subsets of Euclidean space.[/guided]
custom_env
admin
[step:Convert the separating vector into Talagrand weights]
If $g=0$, then the inequality $g\cdot(x-y)\geq t$ for $y\in A$ forces $t=0$, and the desired final probability estimate follows from $\mathbb{P}(f(X)\geq m)\leq 1\leq 2$. We therefore assume $g\neq 0$ for the distance estimate.
Define the weight vector $\alpha=(\alpha_1,\dots,\alpha_n)\in\mathbb{R}^n$ by
\begin{align*}
\alpha_i := \frac{|g_i|}{L}
\end{align*}
for each $i\in\{1,\dots,n\}$. Since $|g|\leq L$, we have
\begin{align*}
|\alpha|=\frac{|g|}{L}\leq 1.
\end{align*}
For each $y\in A$ and each coordinate $i$, the diameter bound $\operatorname{diam}(K_i)\leq 1$ gives
\begin{align*}
|x_i-y_i|\leq \mathbb{1}_{\{x_i\neq y_i\}},
\end{align*}
because both $x_i$ and $y_i$ belong to $K_i$. Therefore
\begin{align*}
g\cdot(x-y)\leq \sum_{i=1}^{n}|g_i|\,|x_i-y_i|\leq \sum_{i=1}^{n}|g_i|\,\mathbb{1}_{\{x_i\neq y_i\}}.
\end{align*}
Combining this with $g\cdot(x-y)\geq t$ gives
\begin{align*}
\sum_{i=1}^{n}\alpha_i\,\mathbb{1}_{\{x_i\neq y_i\}}\geq \frac{t}{L}
\end{align*}
for every $y\in A$. Taking the infimum over $y\in A$ and then using this admissible weight vector in the supremum defining $d_T$ yields
\begin{align*}
d_T(x,A)\geq \frac{t}{L}.
\end{align*}
Thus
\begin{align*}
\{x\in K:f(x)\geq m+t\}\subseteq \{x\in K:d_T(x,A)\geq t/L\}.
\end{align*}
[/step]
custom_env
admin
[step:Apply Talagrand's product-space convex distance inequality]
Talagrand's product-space convex distance inequality states that for independent coordinates $X_1,\dots,X_n$ taking values in the product space $K_1\times\cdots\times K_n$ and every measurable set $E\subset K$,
\begin{align*}
\mathbb{P}(X\in E)\,\mathbb{P}(d_T(X,E)\geq r)\leq \exp\left(-\frac{r^2}{4}\right)
\end{align*}
for every $r\geq 0$. The hypotheses are satisfied here because $X_1,\dots,X_n$ are independent and $A\subset K$ is the measurable sublevel set of the real-valued [random variable](/page/Random%20Variable) $f(X)$. This is citing a result not yet in the wiki: Talagrand's product-space convex distance inequality.
Applying the inequality with $E=A$ and $r=t/L$ gives
\begin{align*}
\mathbb{P}(X\in A)\,\mathbb{P}(d_T(X,A)\geq t/L)\leq \exp\left(-\frac{t^2}{4L^2}\right).
\end{align*}
Since $\mathbb{P}(X\in A)\geq 1/2$, division by $\mathbb{P}(X\in A)$ gives
\begin{align*}
\mathbb{P}(d_T(X,A)\geq t/L)\leq 2\exp\left(-\frac{t^2}{4L^2}\right).
\end{align*}
Using the inclusion from the previous step,
\begin{align*}
\mathbb{P}(f(X)\geq m+t)\leq \mathbb{P}(d_T(X,A)\geq t/L)\leq 2\exp\left(-\frac{t^2}{4L^2}\right).
\end{align*}
This proves the convex upper-tail estimate.
[/step]
custom_env
admin
[step:Derive the concave lower-tail estimate by applying the convex result to $-f$]
Assume now that $f:K\to\mathbb{R}$ is concave and $L$-Lipschitz. Define
\begin{align*}
h:K\to\mathbb{R}, \qquad h(x):=-f(x).
\end{align*}
Then $h$ is convex and $L$-Lipschitz. If $m$ is a median of $f(X)$, then $-m$ is a median of $h(X)$, because
\begin{align*}
\mathbb{P}(h(X)\leq -m)=\mathbb{P}(f(X)\geq m)\geq \frac{1}{2}
\end{align*}
and
\begin{align*}
\mathbb{P}(h(X)\geq -m)=\mathbb{P}(f(X)\leq m)\geq \frac{1}{2}.
\end{align*}
Applying the convex upper-tail estimate to $h$ with median $-m$ gives, for every $t\geq 0$,
\begin{align*}
\mathbb{P}(h(X)\geq -m+t)\leq 2\exp\left(-\frac{t^2}{4L^2}\right).
\end{align*}
Since $h(X)\geq -m+t$ is equivalent to $f(X)\leq m-t$, we obtain
\begin{align*}
\mathbb{P}(f(X)\leq m-t)\leq 2\exp\left(-\frac{t^2}{4L^2}\right).
\end{align*}
This completes the proof.
[/step]