[proofplan]
We prove the minimax equality by reformulating both sides as linear programs and applying strong duality. The row player's maximin problem is cast as an LP in which the player maximises a guaranteed payoff level $v$ subject to mixed strategy constraints. We derive the dual of this LP and show it coincides with the column player's minimax problem. Strong duality for linear programs then yields the equality of the two values. Finally, we verify that the optimal strategies form a Nash equilibrium.
[/proofplan]
[step:Formulate the row player's maximin problem as a linear program]
Define the mixed strategy simplices:
\begin{align*}
X &= \bigl\{x \in \mathbb{R}^m : x_i \geq 0 \text{ for all } i, \; \textstyle\sum_{i=1}^{m} x_i = 1\bigr\}, \\
Y &= \bigl\{y \in \mathbb{R}^n : y_j \geq 0 \text{ for all } j, \; \textstyle\sum_{j=1}^{n} y_j = 1\bigr\}.
\end{align*}
For mixed strategies $x \in X$ and $y \in Y$, the expected payoff is $p(x, y) = x^\top P y = \sum_{i,j} x_i P_{ij} y_j$.
The row player's maximin value is
\begin{align*}
\underline{v} := \max_{x \in X} \min_{y \in Y} p(x, y) = \max_{x \in X} \min_{1 \leq j \leq n} \sum_{i=1}^{m} x_i P_{ij},
\end{align*}
where the inner minimisation over $y \in Y$ reduces to a minimisation over the column player's pure strategies $e_j$ (since $p(x, y) = x^\top P y$ is linear in $y$, and a linear function over a simplex attains its minimum at a vertex).
Introduce a scalar variable $v \in \mathbb{R}$ representing the guaranteed payoff level. The maximin problem becomes the LP:
\begin{align*}
(\text{Primal}) \qquad \max_{v, x} \quad & v \\
\text{subject to} \quad & \sum_{i=1}^{m} x_i P_{ij} \geq v \quad \text{for } j = 1, \ldots, n, \\
& \sum_{i=1}^{m} x_i = 1, \\
& x_i \geq 0 \quad \text{for } i = 1, \ldots, m.
\end{align*}
The optimal value of this LP equals $\underline{v}$.
[guided]
Why can we replace $\min_{y \in Y} p(x, y)$ with $\min_j \sum_i x_i P_{ij}$? The function $y \mapsto p(x, y) = x^\top P y$ is linear in $y$. A linear function on a compact convex set (the simplex $Y$) attains its minimum at an extreme point. The extreme points of $Y$ are the standard basis vectors $e_1, \ldots, e_n$, and $p(x, e_j) = \sum_i x_i P_{ij}$ is the expected payoff when the column player uses pure strategy $j$. Therefore
\begin{align*}
\min_{y \in Y} x^\top P y = \min_{1 \leq j \leq n} \sum_{i=1}^{m} x_i P_{ij}.
\end{align*}
We then introduce $v \leq \sum_i x_i P_{ij}$ for all $j$ as a lower bound on the worst-case payoff, and maximise $v$. At the optimum, $v$ equals the minimum over $j$, recovering the maximin value.
[/guided]
[/step]
[step:Derive the dual LP]
We rewrite the primal in standard form. Introduce slack variables $z_j \geq 0$ for $j = 1, \ldots, n$ to convert the inequality constraints:
\begin{align*}
\sum_{i=1}^{m} x_i P_{ij} - v = z_j, \quad z_j \geq 0, \quad j = 1, \ldots, n.
\end{align*}
The primal decision variables are $(v, x_1, \ldots, x_m, z_1, \ldots, z_n)$, the constraints are $n$ slack equalities plus the simplex constraint, and the objective is $\max\; v$.
To form the dual, assign dual variable $y_j \geq 0$ to the $j$-th inequality constraint $\sum_i x_i P_{ij} \geq v$, and an unrestricted dual variable $w$ to the simplex constraint $\sum_i x_i = 1$.
The Lagrangian of the primal (before slacking) is:
\begin{align*}
L(v, x; y, w) = v + \sum_{j=1}^{n} y_j \Bigl(\sum_{i=1}^{m} x_i P_{ij} - v\Bigr) - w \Bigl(\sum_{i=1}^{m} x_i - 1\Bigr).
\end{align*}
Rearranging by collecting terms in $v$ and $x_i$:
\begin{align*}
L = v\Bigl(1 - \sum_{j=1}^{n} y_j\Bigr) + \sum_{i=1}^{m} x_i \Bigl(\sum_{j=1}^{n} P_{ij} y_j - w\Bigr) + w.
\end{align*}
For the dual to be bounded (i.e., for $\inf_{v, x \geq 0} L$ to be finite), we require:
1. The coefficient of $v$ must be zero: $\sum_{j=1}^{n} y_j = 1$.
2. The coefficient of each $x_i$ must be non-positive (since $x_i \geq 0$ and we are taking the infimum): $\sum_{j=1}^{n} P_{ij} y_j \leq w$ for all $i = 1, \ldots, m$.
Under these conditions, $\inf_{v, x \geq 0} L = w$. The dual problem is therefore:
\begin{align*}
(\text{Dual}) \qquad \min_{w, y} \quad & w \\
\text{subject to} \quad & \sum_{j=1}^{n} P_{ij} y_j \leq w \quad \text{for } i = 1, \ldots, m, \\
& \sum_{j=1}^{n} y_j = 1, \\
& y_j \geq 0 \quad \text{for } j = 1, \ldots, n.
\end{align*}
[guided]
The dual derivation follows the standard Lagrangian recipe for LP duality. We assign one dual variable per primal constraint and collect terms.
The term $v(1 - \sum_j y_j)$ is linear in $v$, which is unrestricted in sign (we did not require $v \geq 0$ in the primal). For $\inf_v v(1 - \sum_j y_j)$ to be finite, the coefficient must vanish: $\sum_j y_j = 1$. This is a simplex constraint on $y$, meaning $y$ is a mixed strategy for the column player.
Each $x_i \geq 0$, so for $\inf_{x_i \geq 0} x_i(\sum_j P_{ij} y_j - w)$ to be finite, we need $\sum_j P_{ij} y_j - w \leq 0$, i.e., $\sum_j P_{ij} y_j \leq w$ for each pure strategy $i$ of the row player. The dual variable $w$ is an upper bound on the row player's expected payoff against $y$ for any pure strategy $i$.
So the dual asks: what is the smallest $w$ such that the column player has a mixed strategy $y$ guaranteeing the row player's payoff is at most $w$ regardless of which pure strategy the row player picks?
[/guided]
[/step]
[step:Identify the dual optimal value as the minimax value]
The dual LP minimises $w$ subject to $\sum_j P_{ij} y_j \leq w$ for all $i$, with $y \in Y$. At the dual optimum, $w$ equals the tightest such upper bound:
\begin{align*}
w^* = \min_{y \in Y} \max_{1 \leq i \leq m} \sum_{j=1}^{n} P_{ij} y_j.
\end{align*}
By the same extreme-point argument as before (now applied to the row player), $\max_{x \in X} p(x, y) = \max_i \sum_j P_{ij} y_j$ since $x \mapsto p(x, y) = x^\top P y$ is linear in $x$ and $X$ is a simplex. Therefore
\begin{align*}
w^* = \min_{y \in Y} \max_{x \in X} p(x, y) =: \overline{v}.
\end{align*}
The dual optimal value is the column player's minimax value $\overline{v}$.
[guided]
The dual constraint $\sum_j P_{ij} y_j \leq w$ for all $i$ says that for every pure strategy $i$ of the row player, the expected payoff against the column player's mix $y$ is at most $w$. Since $\max_{x \in X} p(x, y)$ is the maximum over pure strategies (by the linearity and extreme-point argument), minimising $w$ over all feasible $(w, y)$ gives
\begin{align*}
w^* = \min_{y \in Y} \max_{i} \sum_j P_{ij} y_j = \min_{y \in Y} \max_{x \in X} x^\top P y = \overline{v}.
\end{align*}
This confirms that the dual LP captures exactly the column player's minimax problem.
[/guided]
[/step]
[step:Apply strong duality for linear programs to equate the two values]
Both the primal and dual LPs are feasible: for the primal, any $x \in X$ with $v = \min_j \sum_i x_i P_{ij}$ is feasible; for the dual, any $y \in Y$ with $w = \max_i \sum_j P_{ij} y_j$ is feasible. Both are bounded: the primal objective $v$ is bounded above by $\max_{i,j} |P_{ij}|$ (since $x \in X$ and $y \in Y$ are probability vectors), and the dual objective $w$ is bounded below by $-\max_{i,j} |P_{ij}|$.
By [Strong Duality for Linear Programs](/theorems/2553), since both the primal and dual are feasible and bounded, their optimal values coincide:
\begin{align*}
\underline{v} = \overline{v}.
\end{align*}
That is,
\begin{align*}
\max_{x \in X} \min_{y \in Y} p(x, y) = \min_{y \in Y} \max_{x \in X} p(x, y).
\end{align*}
[guided]
Strong duality is the engine of the proof. The inequality $\underline{v} \leq \overline{v}$ (the "max-min $\leq$ min-max" inequality) holds for any function $p$ and any sets $X, Y$ — this is a general fact and does not require linearity or convexity. The content of the minimax theorem is the reverse inequality $\underline{v} \geq \overline{v}$, which we obtain from LP strong duality.
Why are the feasibility and boundedness hypotheses of strong duality satisfied?
**Primal feasibility**: pick any $x \in X$ (for example, the uniform distribution $x_i = 1/m$). Set $v = \min_j \sum_i x_i P_{ij}$. Then all constraints are satisfied.
**Dual feasibility**: pick any $y \in Y$ (for example, $y_j = 1/n$). Set $w = \max_i \sum_j P_{ij} y_j$. Then all constraints are satisfied.
**Boundedness**: since $x \in X$ and $y \in Y$ are probability vectors and $P$ has finite entries, the payoff $p(x, y) = x^\top P y$ lies in the interval $[\min_{i,j} P_{ij}, \max_{i,j} P_{ij}]$. Both the primal value (a max of lower bounds) and the dual value (a min of upper bounds) are bounded.
[/guided]
[/step]
[step:Verify that optimal maximin and minimax strategies form a Nash equilibrium]
Let $x^* \in X$ and $y^* \in Y$ be optimal solutions to the primal and dual LPs, respectively. Let $v^* = \underline{v} = \overline{v}$ be the common value. We verify the Nash equilibrium conditions: neither player can improve their payoff by unilateral deviation.
Since $x^*$ solves the maximin problem:
\begin{align*}
\min_{y \in Y} p(x^*, y) = v^*.
\end{align*}
This means $p(x^*, y) \geq v^*$ for all $y \in Y$. In particular, $p(x^*, y^*) \geq v^*$.
Since $y^*$ solves the minimax problem:
\begin{align*}
\max_{x \in X} p(x, y^*) = v^*.
\end{align*}
This means $p(x, y^*) \leq v^*$ for all $x \in X$. In particular, $p(x^*, y^*) \leq v^*$.
Combining: $p(x^*, y^*) = v^*$. Now for any $x \in X$:
\begin{align*}
p(x, y^*) \leq v^* = p(x^*, y^*),
\end{align*}
so the row player cannot improve by deviating from $x^*$. For any $y \in Y$:
\begin{align*}
p(x^*, y) \geq v^* = p(x^*, y^*),
\end{align*}
so the column player (who minimises the payoff) cannot improve by deviating from $y^*$. Therefore $(x^*, y^*)$ is a Nash equilibrium.
[guided]
A Nash equilibrium requires that neither player can profitably deviate. In a zero-sum game, the row player maximises $p$ and the column player minimises $p$, so the equilibrium conditions are:
\begin{align*}
p(x, y^*) &\leq p(x^*, y^*) \quad \text{for all } x \in X \quad (\text{row player cannot gain}), \\
p(x^*, y) &\geq p(x^*, y^*) \quad \text{for all } y \in Y \quad (\text{column player cannot gain}).
\end{align*}
The first condition says $x^*$ maximises $p(\cdot, y^*)$ over $X$, and the second says $y^*$ minimises $p(x^*, \cdot)$ over $Y$. Both follow directly from the optimality of $x^*$ and $y^*$ for the maximin and minimax problems, together with the minimax equality $v^* = \underline{v} = \overline{v}$.
The fact that $p(x^*, y^*) = v^*$ is a consequence of the "sandwich": $p(x^*, y^*) \geq \min_y p(x^*, y) = v^*$ (row player guarantees at least $v^*$) and $p(x^*, y^*) \leq \max_x p(x, y^*) = v^*$ (column player guarantees at most $v^*$).
[/guided]
[/step]