Hierarchy of Convexity Conditions for Integral Functionals

Hierarchy of Convexity Conditions for Integral Functionals (Theorem # 8766)




Proof

[proofplan] We prove the two forward implications directly from the definitions. Convexity implies polyconvexity because a [convex function](/page/Convex%20Function) of the matrix entries is already a convex function of the full list of minors, one that simply ignores the higher-order minors. [Polyconvexity implies quasiconvexity](/theorems/8753) by the standard minors argument: convexity in the minors, [Jensen's inequality](/theorems/9), and the null-Lagrangian identity saying that averages of minors of $F+\nabla\varphi$ agree with the minors of $F$ for zero-boundary perturbations. Finally, a determinant-type minor gives a polyconvex function that is not convex, so the first implication cannot be reversed in vectorial dimensions. Quasiconvex functions that are not polyconvex are also known in vectorial dimensions, but that counterexample is not reproduced in this proof. [/proofplan] [step:Represent a convex function as a convex function of all minors] Let $N=N(m,n)\in\mathbb N$ denote the total number of minors of an $m\times n$ matrix, including the first-order minors, and let \begin{align*} M:\mathbb R^{m\times n}\to\mathbb R^N \end{align*} denote the map that sends a matrix $F\in\mathbb R^{m\times n}$ to the ordered vector $M(F)$ of all minors of $F$. We choose the ordering so that the first $mn$ components of $M(F)$ are the entries of $F$. Assume first that $f$ is convex. Define \begin{align*} G:\mathbb R^N\to\mathbb R \end{align*} by $G(y)=f(A(y))$, where $A:\mathbb R^N\to\mathbb R^{m\times n}$ is the linear projection that sends $y\in\mathbb R^N$ to the $m\times n$ matrix whose entries are the first $mn$ components of $y$ in the fixed ordering. Since $A$ is linear and $f$ is convex, the composition $G=f\circ A$ is convex. For every $F\in\mathbb R^{m\times n}$, the definition of $A$ gives $A(M(F))=F$, and hence \begin{align*} G(M(F))=f(F). \end{align*} Thus $f$ is a convex function of the minors of $F$, which is precisely polyconvexity. 
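As an illustration of this construction, consider the case $m=n=2$; the dimension, the ordering of the minors, and the particular $f$ below are assumptions made only for this example. A $2\times 2$ matrix has four first-order minors and one second-order minor, so $N=5$, and one admissible ordering is \begin{align*} M(F)=(F_{11},F_{12},F_{21},F_{22},\det F). \end{align*} For the convex function $f(F)=F_{11}^2+F_{12}^2+F_{21}^2+F_{22}^2$, the projection $A$ discards the fifth component, and the recipe $G=f\circ A$ gives \begin{align*} G(y)=y_1^2+y_2^2+y_3^2+y_4^2,\qquad y\in\mathbb R^5. \end{align*} This $G$ is convex on $\mathbb R^5$, does not depend on $y_5$, and satisfies $G(M(F))=f(F)$ for every $F\in\mathbb R^{2\times 2}$. 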
[/step] [step:Apply the standard minors argument to pass from polyconvexity to quasiconvexity] Assume now that $f$ is polyconvex. By definition, there exists a convex function $G:\mathbb R^N\to\mathbb R$ such that \begin{align*} f(F)=G(M(F)) \end{align*} for every $F\in\mathbb R^{m\times n}$. By [citetheorem:8753], every finite polyconvex function on $\mathbb R^{m\times n}$ is quasiconvex. The hypotheses of that theorem apply here because $f:\mathbb R^{m\times n}\to\mathbb R$ is finite and polyconvex; hence $f$ is quasiconvex. [guided] We now explain the mechanism behind the implication, because it is the structural reason polyconvexity was introduced. Since $f$ is polyconvex, there is a convex function \begin{align*} G:\mathbb R^N\to\mathbb R \end{align*} and a minors map \begin{align*} M:\mathbb R^{m\times n}\to\mathbb R^N \end{align*} such that $f(F)=G(M(F))$ for every matrix $F\in\mathbb R^{m\times n}$. To prove quasiconvexity directly, one fixes a bounded [open set](/page/Open%20Set) $U\subset\mathbb R^n$ with $0<\mathcal L^n(U)<\infty$, where $\mathcal L^n$ denotes $n$-dimensional [Lebesgue measure](/page/Lebesgue%20Measure), a matrix $F\in\mathbb R^{m\times n}$, and a perturbation \begin{align*} \varphi\in W^{1,\infty}_0(U;\mathbb R^m), \end{align*} that is, a map $\varphi:U\to\mathbb R^m$ with zero boundary trace. Here $W^{1,\infty}_0(U;\mathbb R^m)$ denotes the closure of $C_c^\infty(U;\mathbb R^m)$ in the $W^{1,\infty}$ norm. Quasiconvexity asks for the inequality \begin{align*} f(F)\le \frac{1}{\mathcal L^n(U)}\int_U f(F+\nabla\varphi(x))\,d\mathcal L^n(x). \end{align*} One first proves this inequality for $\varphi\in C_c^\infty(U;\mathbb R^m)$ and then passes to general $\varphi$ by approximation. Using the representation $f=G\circ M$, the right-hand side becomes \begin{align*} \frac{1}{\mathcal L^n(U)}\int_U G(M(F+\nabla\varphi(x)))\,d\mathcal L^n(x). \end{align*} The useful point is that $G$ is convex. Since $\varphi\in C_c^\infty(U;\mathbb R^m)$, the map $x\mapsto F+\nabla\varphi(x)$ is bounded on $U$. 
The minors map $M$ is polynomial, so $x\mapsto M(F+\nabla\varphi(x))$ is bounded and measurable as a map from $U$ to $\mathbb R^N$. A finite convex function on $\mathbb R^N$ is continuous, hence $x\mapsto G(M(F+\nabla\varphi(x)))$ is bounded and measurable on the finite-measure set $U$. Therefore [Jensen's inequality](/theorems/1977) applies to the probability measure $\mathcal L^n(U)^{-1}\mathcal L^n\big|_U$ and gives \begin{align*} G\left(\frac{1}{\mathcal L^n(U)}\int_U M(F+\nabla\varphi(x))\,d\mathcal L^n(x)\right) \le \frac{1}{\mathcal L^n(U)}\int_U G(M(F+\nabla\varphi(x)))\,d\mathcal L^n(x). \end{align*} The null-Lagrangian property of minors says that each component of $M$ has unchanged average under addition of a compactly supported gradient. More precisely, for every $\varphi\in C_c^\infty(U;\mathbb R^m)$ and every $F\in\mathbb R^{m\times n}$, \begin{align*} \frac{1}{\mathcal L^n(U)}\int_U M(F+\nabla\varphi(x))\,d\mathcal L^n(x)=M(F). \end{align*} Substituting this identity into Jensen's inequality gives \begin{align*} G(M(F)) \le \frac{1}{\mathcal L^n(U)}\int_U G(M(F+\nabla\varphi(x)))\,d\mathcal L^n(x). \end{align*} Since $G(M(F))=f(F)$ and $G(M(F+\nabla\varphi(x)))=f(F+\nabla\varphi(x))$, this is exactly the quasiconvexity inequality for smooth compactly supported perturbations. For $\varphi\in W^{1,\infty}_0(U;\mathbb R^m)$, choose $\varphi_j\in C_c^\infty(U;\mathbb R^m)$ such that $\varphi_j\to\varphi$ in $W^{1,\infty}(U;\mathbb R^m)$. Then $\nabla\varphi_j\to\nabla\varphi$ in $L^\infty(U;\mathbb R^{m\times n})$, so the polynomial map $M$ gives $M(F+\nabla\varphi_j)\to M(F+\nabla\varphi)$ uniformly on $U$. Since $f$ is continuous and the matrices $F+\nabla\varphi_j(x)$ remain in one bounded subset of $\mathbb R^{m\times n}$, [uniform continuity](/page/Uniform%20Continuity) of $f$ on that [bounded set](/page/Bounded%20Set) gives $f(F+\nabla\varphi_j)\to f(F+\nabla\varphi)$ uniformly on $U$. 
Because $\mathcal L^n(U)<\infty$, [uniform convergence](/page/Uniform%20Convergence) implies convergence in $L^1(U;\mathcal L^n)$, and hence \begin{align*} \int_U f(F+\nabla\varphi_j(x))\,d\mathcal L^n(x)\to \int_U f(F+\nabla\varphi(x))\,d\mathcal L^n(x). \end{align*} Taking the limit in the smooth quasiconvexity inequality gives the inequality for $\varphi$. This is the approximation passage used in [citetheorem:8753]. [/guided] [/step] [step:Exhibit polyconvex functions that are not convex in vectorial dimensions] Assume $m,n\ge 2$. Define \begin{align*} h:\mathbb R^{m\times n}\to\mathbb R \end{align*} by $h(F)=F_{11}F_{22}-F_{12}F_{21}$. This is the determinant of the upper-left $2\times 2$ submatrix of $F$. Since $h$ is one of the second-order minors of $F$, it is a single coordinate of the minors vector $M(F)$ and therefore affine as a function of $M(F)$. Therefore $h$ is polyconvex. It remains to show that $h$ is not convex. Not every pair of matrices detects this. For instance, define matrices $A,B\in\mathbb R^{m\times n}$ by \begin{align*} A_{11}=1,\qquad A_{22}=1, \end{align*} with all other entries of $A$ equal to $0$, and \begin{align*} B_{11}=1,\qquad B_{22}=-1, \end{align*} with all other entries of $B$ equal to $0$. Then \begin{align*} h(A)=1,\qquad h(B)=-1. \end{align*} The midpoint matrix $(A+B)/2$ has upper-left $2\times 2$ block with entries \begin{align*} \left(\frac{A+B}{2}\right)_{11}=1,\qquad \left(\frac{A+B}{2}\right)_{22}=0,\qquad \left(\frac{A+B}{2}\right)_{12}=0,\qquad \left(\frac{A+B}{2}\right)_{21}=0, \end{align*} and hence \begin{align*} h\left(\frac{A+B}{2}\right)=0=\frac{h(A)+h(B)}{2}. \end{align*} The midpoint convexity inequality therefore holds with equality for this pair, and the pair is inconclusive. Passing to the opposite sign does not help: the function $k:\mathbb R^{m\times n}\to\mathbb R$ defined by $k=-h$ is also affine in the minors vector, hence polyconvex, and \begin{align*} k(A)=-1,\qquad k(B)=1,\qquad k\left(\frac{A+B}{2}\right)=0 \end{align*} again gives equality. To obtain a strict violation, take $C,D\in\mathbb R^{m\times n}$ with all entries zero except $C_{11}=1$ and $D_{22}=1$. Then \begin{align*} h(C)=0,\qquad h(D)=0. \end{align*} The midpoint $(C+D)/2$ has entries $\tfrac12$ in positions $(1,1)$ and $(2,2)$ and zeros elsewhere, so \begin{align*} h\left(\frac{C+D}{2}\right)=\frac12\cdot\frac12=\frac{1}{4}. \end{align*} Convexity would imply \begin{align*} \frac{1}{4}=h\left(\frac{C+D}{2}\right)\le \frac{h(C)+h(D)}{2}=0, \end{align*} a contradiction. Thus $h$ is polyconvex but not convex. [/step] [step:Conclude the hierarchy] The first step proves \begin{align*} \text{convexity}\implies\text{polyconvexity}, \end{align*} and the second step proves \begin{align*} \text{polyconvexity}\implies\text{quasiconvexity}. \end{align*} The determinant-minor example shows that polyconvexity does not imply convexity in vectorial dimensions. Therefore \begin{align*} \text{convexity}\implies\text{polyconvexity}\implies\text{quasiconvexity}, \end{align*} and the first implication cannot be reversed when $m,n\ge 2$. [/step]
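
The two computations that carry the proof can be checked numerically in the case $m=n=2$. The sketch below is only a sanity check under stated assumptions, not part of the argument: the helper names `minor_h`, `midpoint`, `grad_phi`, and `average_minor`, the domain $U=(0,1)^2$, the matrix `F`, and the test perturbation are all hypothetical choices made for illustration. The perturbation is not compactly supported, but it vanishes on the boundary of $U$ together with its gradient. The first part evaluates the midpoint inequality for $C$ and $D$; the second part approximates the average of the minor of $F+\nabla\varphi$ by a midpoint rule and compares it with the minor of $F$.

```python
import math


def minor_h(F):
    """Upper-left 2x2 minor h(F) = F11*F22 - F12*F21 of a nested-list matrix."""
    return F[0][0] * F[1][1] - F[0][1] * F[1][0]


def midpoint(P, Q):
    """Entrywise midpoint (P + Q) / 2 of two matrices of the same shape."""
    return [[(p + q) / 2 for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]


# Matrices C and D from the third step, written for m = n = 2.
C = [[1.0, 0.0], [0.0, 0.0]]
D = [[0.0, 0.0], [0.0, 1.0]]

lhs = minor_h(midpoint(C, D))          # h((C + D) / 2)
rhs = (minor_h(C) + minor_h(D)) / 2    # (h(C) + h(D)) / 2
convexity_violated = lhs > rhs         # convexity would need lhs <= rhs


def grad_phi(x, y):
    """Gradient of the assumed test perturbation phi = (phi1, phi2) on (0,1)^2,
    phi1 = sin^2(pi x) sin^2(pi y), phi2 = sin^2(pi x) sin^2(2 pi y).
    Row i holds the partial derivatives of phi_i with respect to x and y."""
    pi = math.pi
    sx2 = math.sin(pi * x) ** 2
    sy2 = math.sin(pi * y) ** 2
    s2y2 = math.sin(2 * pi * y) ** 2
    dsx2 = pi * math.sin(2 * pi * x)        # d/dx of sin^2(pi x)
    dsy2 = pi * math.sin(2 * pi * y)        # d/dy of sin^2(pi y)
    ds2y2 = 2 * pi * math.sin(4 * pi * y)   # d/dy of sin^2(2 pi y)
    return [[dsx2 * sy2, sx2 * dsy2],
            [dsx2 * s2y2, sx2 * ds2y2]]


def average_minor(F, n=32):
    """Midpoint-rule approximation of the average of h(F + grad phi) over (0,1)^2.
    The domain has measure 1, so the average equals the integral."""
    total = 0.0
    for i in range(n):
        for j in range(n):
            g = grad_phi((i + 0.5) / n, (j + 0.5) / n)
            total += minor_h([[F[0][0] + g[0][0], F[0][1] + g[0][1]],
                              [F[1][0] + g[1][0], F[1][1] + g[1][1]]])
    return total / n ** 2


F = [[1.0, 2.0], [-0.5, 3.0]]  # arbitrary base matrix chosen for the check

print("h((C+D)/2) =", lhs, " (h(C)+h(D))/2 =", rhs)
print("average of h(F + grad phi):", average_minor(F), " h(F):", minor_h(F))
```

The first printed line reflects the strict failure of the midpoint inequality for $h$, and the two values on the second line should agree up to rounding, as the null-Lagrangian identity predicts. A quadrature check of this kind only illustrates the identity for one perturbation; it does not replace the proof.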
