Fenchel-Rockafellar Duality Theorem — Statement & Proof

Theorem

Edit Issues Pull Requests Attributions Admin

Discussion

Proof

[proofplan] We introduce the perturbation value function $p(u)=\inf_x\{f(x)+g(Ax+u)\}$, so the primal value is $p(0)$. The relative-interior condition implies that $0$ lies in the relative interior of $\operatorname{dom} p$, which is the regularity needed to separate points below the epigraph of $p$ without an abnormal vertical multiplier. The conjugate of $p$ is computed explicitly as $p^*(y)=f^*(-A^\top y)+g^*(y)$, so the separating hyperplanes give $p(0)=\sup_y -p^*(y)$, which is precisely the dual equality. If a primal minimizer exists and the value is finite, a supporting hyperplane to $\operatorname{epi} p$ at $(0,p(0))$ gives a dual optimizer, and equality in the two Fenchel inequalities yields the stated subgradient certificates. [/proofplan] [step:Define the perturbation function and identify its domain] Define the perturbation value function $p: \mathbb{R}^m \to [-\infty,\infty]$ by \begin{align*} p(u)=\inf_{x \in \mathbb{R}^n}\{f(x)+g(Ax+u)\}. \end{align*} Then the primal value is $p(0)$. Since $f$ and $g$ are convex, the function $p$ is convex: if $u_1,u_2 \in \mathbb{R}^m$, $\lambda \in [0,1]$, and $x_1,x_2 \in \mathbb{R}^n$, then by linearity of $A$ and convexity of $f$ and $g$, \begin{align*} f(\lambda x_1+(1-\lambda)x_2)+g(A(\lambda x_1+(1-\lambda)x_2)+\lambda u_1+(1-\lambda)u_2) \leq \lambda[f(x_1)+g(Ax_1+u_1)]+(1-\lambda)[f(x_2)+g(Ax_2+u_2)]. \end{align*} Taking infima over $x_1$ and $x_2$ gives \begin{align*} p(\lambda u_1+(1-\lambda)u_2)\leq \lambda p(u_1)+(1-\lambda)p(u_2). \end{align*} The effective domain of $p$ is \begin{align*} \operatorname{dom} p=\operatorname{dom} g-A(\operatorname{dom} f). \end{align*} Indeed, $p(u)<\infty$ exactly when there exists $x \in \operatorname{dom} f$ such that $Ax+u \in \operatorname{dom} g$, which is equivalent to $u \in \operatorname{dom} g-A(\operatorname{dom} f)$. Because $A(\operatorname{dom} f)$ and $\operatorname{dom} g$ are convex subsets of $\mathbb{R}^m$ and \begin{align*} \operatorname{ri}(A(\operatorname{dom} f)) \cap \operatorname{ri}(\operatorname{dom} g) \ne \varnothing, \end{align*} choose $a\in\operatorname{ri}(A(\operatorname{dom} f)) \cap \operatorname{ri}(\operatorname{dom} g)$. The finite-dimensional relative-interior difference formula for nonempty convex sets gives \begin{align*} \operatorname{ri}(\operatorname{dom} g-A(\operatorname{dom} f))=\operatorname{ri}(\operatorname{dom} g)-\operatorname{ri}(A(\operatorname{dom} f)). \end{align*} Since $a-a=0$ belongs to the right-hand side, applying this formula with $C=A(\operatorname{dom} f)$ and $D=\operatorname{dom} g$ yields \begin{align*} 0 \in \operatorname{ri}(\operatorname{dom} g-A(\operatorname{dom} f))=\operatorname{ri}(\operatorname{dom} p). \end{align*} [/step] [step:Compute the conjugate of the perturbation function] Let $p^*: \mathbb{R}^m \to (-\infty,\infty]$ denote the convex conjugate of $p$, defined by \begin{align*} p^*(y)=\sup_{u \in \mathbb{R}^m}\{\langle y,u\rangle-p(u)\}. \end{align*} For each $y \in \mathbb{R}^m$, using the definition of $p$ and then the change of variable $z=Ax+u$, so that $u=z-Ax$, gives \begin{align*} p^*(y)=\sup_{u \in \mathbb{R}^m}\sup_{x \in \mathbb{R}^n}\{\langle y,u\rangle-f(x)-g(Ax+u)\}. \end{align*} Thus \begin{align*} p^*(y)=\sup_{x \in \mathbb{R}^n}\sup_{z \in \mathbb{R}^m}\{\langle y,z-Ax\rangle-f(x)-g(z)\}. \end{align*} Since $\langle y,Ax\rangle=\langle A^\top y,x\rangle$, this separates into \begin{align*} p^*(y)=\sup_{x \in \mathbb{R}^n}\{\langle -A^\top y,x\rangle-f(x)\}+\sup_{z \in \mathbb{R}^m}\{\langle y,z\rangle-g(z)\}. \end{align*} Therefore \begin{align*} p^*(y)=f^*(-A^\top y)+g^*(y). \end{align*} [guided] The goal of introducing $p$ is that the dual problem should appear as the negative conjugate of $p$ at the origin. We compute this carefully. By definition, \begin{align*} p^*(y)=\sup_{u \in \mathbb{R}^m}\{\langle y,u\rangle-p(u)\}. \end{align*} Since \begin{align*} p(u)=\inf_{x \in \mathbb{R}^n}\{f(x)+g(Ax+u)\}, \end{align*} subtracting $p(u)$ is the same as taking the supremum over all choices of $x$: \begin{align*} p^*(y)=\sup_{u \in \mathbb{R}^m}\sup_{x \in \mathbb{R}^n}\{\langle y,u\rangle-f(x)-g(Ax+u)\}. \end{align*} Now we want the two functions $f$ and $g$ to decouple. The correct substitution is $z=Ax+u$, where $z \in \mathbb{R}^m$ is the argument of $g$. For fixed $x$, as $u$ ranges over all of $\mathbb{R}^m$, the vector $z=Ax+u$ also ranges over all of $\mathbb{R}^m$, and $u=z-Ax$. Hence \begin{align*} p^*(y)=\sup_{x \in \mathbb{R}^n}\sup_{z \in \mathbb{R}^m}\{\langle y,z-Ax\rangle-f(x)-g(z)\}. \end{align*} The transpose map $A^\top$ is defined by the identity $\langle y,Ax\rangle=\langle A^\top y,x\rangle$ for all $x \in \mathbb{R}^n$ and $y \in \mathbb{R}^m$. Therefore \begin{align*} \langle y,z-Ax\rangle=\langle y,z\rangle-\langle A^\top y,x\rangle=\langle y,z\rangle+\langle -A^\top y,x\rangle. \end{align*} The supremum then splits into one supremum over $x$ and one supremum over $z$: \begin{align*} p^*(y)=\sup_{x \in \mathbb{R}^n}\{\langle -A^\top y,x\rangle-f(x)\}+\sup_{z \in \mathbb{R}^m}\{\langle y,z\rangle-g(z)\}. \end{align*} By the definitions of the convex conjugates $f^*$ and $g^*$, this is exactly \begin{align*} p^*(y)=f^*(-A^\top y)+g^*(y). \end{align*} This identity is the algebraic heart of the duality theorem: once $p(0)=\sup_y -p^*(y)$ is established, the displayed formula becomes precisely the stated Fenchel-Rockafellar dual problem. [/guided] [/step] [step:Derive weak duality from Fenchel inequalities] For any $x \in \mathbb{R}^n$ and $y \in \mathbb{R}^m$, the Fenchel inequalities for $f$ and $g$ give \begin{align*} f(x)+f^*(-A^\top y)\geq \langle -A^\top y,x\rangle \end{align*} and \begin{align*} g(Ax)+g^*(y)\geq \langle y,Ax\rangle. \end{align*} Adding the inequalities and using $\langle -A^\top y,x\rangle=-\langle y,Ax\rangle$ gives \begin{align*} f(x)+g(Ax)+f^*(-A^\top y)+g^*(y)\geq 0. \end{align*} Thus \begin{align*} -f^*(-A^\top y)-g^*(y)\leq f(x)+g(Ax). \end{align*} Taking the infimum over $x \in \mathbb{R}^n$ and the supremum over $y \in \mathbb{R}^m$ yields \begin{align*} \sup_{y \in \mathbb{R}^m}\{-f^*(-A^\top y)-g^*(y)\}\leq \inf_{x \in \mathbb{R}^n}\{f(x)+g(Ax)\}. \end{align*} [/step] [step:Use a perturbation subgradient to prove the reverse inequality] If $p(0)=-\infty$, then [weak duality](/theorems/2549) gives \begin{align*} \sup_{y \in \mathbb{R}^m}\{-f^*(-A^\top y)-g^*(y)\}\leq p(0)=-\infty, \end{align*} so equality follows. Hence assume $p(0)>-\infty$. Since $0\in\operatorname{ri}(\operatorname{dom}p)$ and $p(0)$ is finite, convexity forbids $p(u_1)=-\infty$ at any $u_1\in\operatorname{dom}p$: if such a point existed, then the definition of relative interior gives $u_2\in\operatorname{dom}p$ and $\lambda\in(0,1)$ with $0=\lambda u_1+(1-\lambda)u_2$, and convexity gives \begin{align*} p(0)\leq \lambda p(u_1)+(1-\lambda)p(u_2)=-\infty, \end{align*} contradicting $p(0)>-\infty$. Thus $p$ is a proper convex extended-real function that does not take the value $-\infty$. We use the standard finite-dimensional subgradient existence theorem: if $h$ is a proper convex extended-real function on a finite-dimensional Euclidean space and $z\in\operatorname{ri}(\operatorname{dom}h)$, then there exists a vector $s$ such that $h(w)\geq h(z)+\langle s,w-z\rangle$ for every $w$ in the space. This theorem does not require $h$ to be closed. Applying it to $h=p$ and $z=0$, there exists $y\in\mathbb{R}^m$ such that \begin{align*} p(u)\geq p(0)+\langle y,u\rangle \end{align*} for every $u\in\mathbb{R}^m$. Hence \begin{align*} p^*(y)=\sup_{u\in\mathbb{R}^m}\{\langle y,u\rangle-p(u)\}\leq -p(0). \end{align*} Evaluating the supremum at $u=0$ gives the reverse inequality $p^*(y)\geq -p(0)$, so \begin{align*} p^*(y)=-p(0). \end{align*} Using the conjugate computation from the previous step, \begin{align*} p(0)= -f^*(-A^\top y)-g^*(y)\leq \sup_{w\in\mathbb{R}^m}\{-f^*(-A^\top w)-g^*(w)\}. \end{align*} Combining this with weak duality proves \begin{align*} \inf_{x\in\mathbb{R}^n}\{f(x)+g(Ax)\}=\sup_{y\in\mathbb{R}^m}\{-f^*(-A^\top y)-g^*(y)\}. \end{align*} [guided] The purpose of the relative-interior condition is to make the perturbation value function have a genuine supporting affine minorant at $0$. First handle the degenerate case. If $p(0)=-\infty$, weak duality gives \begin{align*} \sup_{y \in \mathbb{R}^m}\{-f^*(-A^\top y)-g^*(y)\}\leq p(0)=-\infty, \end{align*} so equality holds. Thus assume $p(0)>-\infty$. We must verify that $p$ is a proper convex function before using the subgradient theorem. The previous step proved convexity and $0\in\operatorname{ri}(\operatorname{dom}p)$. Since $p(0)$ is finite, the domain is nonempty. If there were $u_1\in\operatorname{dom}p$ with $p(u_1)=-\infty$, then the relative-interior condition at $0$ lets us move through $0$ in the direction from $0$ to $u_1$: there are $u_2\in\operatorname{dom}p$ and $\lambda\in(0,1)$ such that \begin{align*} 0=\lambda u_1+(1-\lambda)u_2. \end{align*} Convexity of $p$ would imply \begin{align*} p(0)\leq \lambda p(u_1)+(1-\lambda)p(u_2)=-\infty, \end{align*} contradicting $p(0)>-\infty$. Therefore $p$ never takes the value $-\infty$, and it is a proper convex extended-real function. Now apply the standard finite-dimensional subgradient existence theorem: if $h$ is a proper convex extended-real function on a finite-dimensional Euclidean space and $z\in\operatorname{ri}(\operatorname{dom}h)$, then there exists a vector $s$ such that $h(w)\geq h(z)+\langle s,w-z\rangle$ for every $w$ in the space. Its hypotheses are satisfied with $h=p$ and $z=0$: $p$ is proper and convex, and $0\in\operatorname{ri}(\operatorname{dom}p)$. No closedness hypothesis on $p$ is needed for this theorem. The theorem gives a vector $y\in\mathbb{R}^m$ such that \begin{align*} p(u)\geq p(0)+\langle y,u\rangle \end{align*} for every $u\in\mathbb{R}^m$. This supporting inequality is exactly the non-abnormal conclusion that the epigraph separation argument was trying to obtain. Insert this inequality into the definition of the conjugate: \begin{align*} p^*(y)=\sup_{u\in\mathbb{R}^m}\{\langle y,u\rangle-p(u)\}\leq -p(0). \end{align*} On the other hand, the term $u=0$ is allowed in the supremum, so \begin{align*} p^*(y)\geq \langle y,0\rangle-p(0)=-p(0). \end{align*} Hence \begin{align*} p^*(y)=-p(0). \end{align*} The previous step computed $p^*(y)=f^*(-A^\top y)+g^*(y)$. Therefore \begin{align*} p(0)= -f^*(-A^\top y)-g^*(y)\leq \sup_{w\in\mathbb{R}^m}\{-f^*(-A^\top w)-g^*(w)\}. \end{align*} This is the reverse inequality. Combining it with weak duality gives the desired equality of primal and dual values. [/guided] [/step] [step:Extract a dual certificate from a supporting hyperplane at the optimum] Assume that the common value is finite and that $\bar{x}\in \mathbb{R}^n$ attains the primal infimum. Set \begin{align*} v=f(\bar{x})+g(A\bar{x})=p(0). \end{align*} By the properness argument in the separation step, the finiteness of $p(0)$ together with $0\in\operatorname{ri}(\operatorname{dom}p)$ implies that $p$ is a proper convex extended-real function and does not take the value $-\infty$. The standard finite-dimensional subgradient existence theorem says that if $h$ is a proper convex extended-real function on a finite-dimensional Euclidean space and $z\in\operatorname{ri}(\operatorname{dom}h)$, then there exists a vector $s$ such that $h(w)\geq h(z)+\langle s,w-z\rangle$ for every $w$ in the space. This theorem does not require $h$ to be closed. Applying it to $h=p$ and $z=0$, there is a vector $\bar{y}\in\mathbb{R}^m$ such that \begin{align*} p(u)\geq p(0)+\langle \bar{y},u\rangle \end{align*} for every $u\in\mathbb{R}^m$. The preceding inequality implies \begin{align*} p^*(\bar{y})=\sup_{u\in \mathbb{R}^m}\{\langle \bar{y},u\rangle-p(u)\}\leq -p(0). \end{align*} Taking $u=0$ gives equality, so \begin{align*} p^*(\bar{y})=-p(0). \end{align*} Using the formula for $p^*$, we get \begin{align*} -f^*(-A^\top \bar{y})-g^*(\bar{y})=p(0)=v. \end{align*} Thus $\bar{y}$ attains the dual supremum. Finally, because $\bar{x}$ is primal optimal and $\bar{y}$ is dual optimal, \begin{align*} f(\bar{x})+g(A\bar{x})+f^*(-A^\top \bar{y})+g^*(\bar{y})=0. \end{align*} The Fenchel inequalities give the two nonnegative quantities \begin{align*} f(\bar{x})+f^*(-A^\top \bar{y})-\langle -A^\top \bar{y},\bar{x}\rangle\geq 0 \end{align*} and \begin{align*} g(A\bar{x})+g^*(\bar{y})-\langle \bar{y},A\bar{x}\rangle\geq 0. \end{align*} Their sum is zero, since $\langle -A^\top\bar{y},\bar{x}\rangle+\langle\bar{y},A\bar{x}\rangle=0$. Hence each term is zero. Equality in the Fenchel inequality is equivalent to membership in the subdifferential, so \begin{align*} -A^\top \bar{y}\in \partial f(\bar{x}) \end{align*} and \begin{align*} \bar{y}\in \partial g(A\bar{x}). \end{align*} These are the desired dual certificate conditions. [guided] Because the primal optimum is finite and attained at $\bar{x}$, the perturbation value at the origin is finite: \begin{align*} p(0)=f(\bar{x})+g(A\bar{x})=v. \end{align*} We verify properness again inside the guided argument. The function $p$ is convex, $0\in\operatorname{ri}(\operatorname{dom}p)$, and $p(0)$ is finite. If there were $u_1\in\operatorname{dom}p$ with $p(u_1)=-\infty$, then the definition of relative interior would give $u_2\in\operatorname{dom}p$ and $\lambda\in(0,1)$ such that \begin{align*} 0=\lambda u_1+(1-\lambda)u_2. \end{align*} Convexity would imply \begin{align*} p(0)\leq \lambda p(u_1)+(1-\lambda)p(u_2)=-\infty, \end{align*} contradicting the finiteness of $p(0)$. Hence $p$ is a proper convex function and never takes the value $-\infty$. Since $0\in\operatorname{ri}(\operatorname{dom}p)$, the finite-dimensional subgradient existence theorem for proper convex functions applies at $0$, and this theorem does not require closedness of $p$. Thus there exists $\bar{y}\in\mathbb{R}^m$ such that \begin{align*} p(u)\geq p(0)+\langle \bar{y},u\rangle \end{align*} for every $u\in\mathbb{R}^m$. This inequality says exactly that $\bar{y}$ supports the perturbation value function at the unperturbed problem. Substituting it into the definition of the conjugate gives \begin{align*} p^*(\bar{y})=\sup_{u\in\mathbb{R}^m}\{\langle \bar{y},u\rangle-p(u)\}\leq -p(0). \end{align*} Taking $u=0$ in the supremum gives the reverse inequality, so \begin{align*} p^*(\bar{y})=-p(0). \end{align*} For completeness, recompute the conjugate identity at $\bar{y}$. By definition of $p$, \begin{align*} p^*(\bar{y})=\sup_{u\in\mathbb{R}^m}\sup_{x\in\mathbb{R}^n}\{\langle\bar{y},u\rangle-f(x)-g(Ax+u)\}. \end{align*} For fixed $x$, set $z=Ax+u$, so $z$ ranges over all of $\mathbb{R}^m$ and $u=z-Ax$. Using $\langle\bar{y},Ax\rangle=\langle A^\top\bar{y},x\rangle$, we obtain \begin{align*} p^*(\bar{y})=\sup_{x\in\mathbb{R}^n}\{\langle-A^\top\bar{y},x\rangle-f(x)\}+\sup_{z\in\mathbb{R}^m}\{\langle\bar{y},z\rangle-g(z)\}. \end{align*} Therefore \begin{align*} p^*(\bar{y})=f^*(-A^\top\bar{y})+g^*(\bar{y}). \end{align*} Combining this with $p^*(\bar{y})=-p(0)$ gives \begin{align*} -f^*(-A^\top\bar{y})-g^*(\bar{y})=p(0)=v. \end{align*} Thus $\bar{y}$ attains the dual supremum. It remains to translate dual attainment into the certificate conditions. Since $\bar{x}$ is primal optimal and $\bar{y}$ is dual optimal, \begin{align*} f(\bar{x})+g(A\bar{x})+f^*(-A^\top\bar{y})+g^*(\bar{y})=0. \end{align*} The Fenchel inequalities give two nonnegative terms: \begin{align*} f(\bar{x})+f^*(-A^\top\bar{y})-\langle -A^\top\bar{y},\bar{x}\rangle\geq0 \end{align*} and \begin{align*} g(A\bar{x})+g^*(\bar{y})-\langle\bar{y},A\bar{x}\rangle\geq0. \end{align*} Their sum is zero because $\langle -A^\top\bar{y},\bar{x}\rangle+\langle\bar{y},A\bar{x}\rangle=0$. A sum of two nonnegative [real numbers](/page/Real%20Numbers) is zero only when both are zero. Equality in Fenchel's inequality is equivalent to subdifferential membership, so \begin{align*} -A^\top\bar{y}\in\partial f(\bar{x}) \end{align*} and \begin{align*} \bar{y}\in\partial g(A\bar{x}). \end{align*} These are the dual certificate conditions. [/guided] [/step]

Prerequisites (0/2 completed)

Prerequisites Graph

Interactive dependency map showing how this theorem builds on foundational concepts

Loading dependency graph...

Theorems

What brings you to Androma?

Start with a route through the knowledge graph.

Fenchel-Rockafellar Duality Theorem (Theorem # 6694)

Discussion

Proof

Prerequisites (0/2 completed)

Prerequisites Graph

Explore Further

Sign in to Androma

Check your inbox

One last step

Fenchel-Rockafellar Duality Theorem (Theorem # 6694)

Discussion

Proof

Prerequisites (0/2 completed)

Prerequisites Graph

Explore Further