Partial Differential Equations II: Elliptic Theory and Variational Methods

This course develops the theory of elliptic partial differential equations from the classical viewpoint to the modern weak and variational framework. It begins with model elliptic equations and harmonic functions, then studies boundary value problems, maximum principles, Green functions, and Poisson kernels as the foundational tools for understanding existence, uniqueness, and representation of solutions. From there, the course moves into Sobolev spaces and weak formulations, showing how elliptic problems can be treated systematically even when classical derivatives are unavailable. The main themes are coercivity, variational structure, and regularity.

The middle chapters build the functional-analytic machinery needed for the [Lax-Milgram theorem](/theorems/91), energy minimization, and the derivation of a priori estimates. Once weak solutions are in place, the course proves interior and [boundary regularity](/theorems/99), then turns to spectral theory through eigenvalue problems and compact resolvents. Later chapters extend the theory to Fredholm alternatives, noncoercive problems, variational inequalities, and nonlinear elliptic equations, showing how the same core ideas adapt to constrained and nonlinear settings.

The chapters are arranged so that each layer of theory supports the next: classical theory motivates weak formulations, weak formulations lead to existence and uniqueness via variational methods, and regularity theory explains when weak solutions become smooth enough to recover classical intuition. By the end, the course provides a unified view of elliptic equations as analytic, geometric, and variational objects, with tools that apply to both linear and nonlinear problems.

# Introduction

This opening chapter fixes the viewpoint for the course. Elliptic equations are the stationary equations of analysis: they describe equilibria, minimisers of energies, and fields determined by boundary data rather than by time evolution.
The course begins from familiar Laplace-type equations and then moves toward weak formulations, Hilbert-space existence theorems, regularity, and variational methods. The central theme is that a second-order elliptic equation can be studied from several compatible perspectives. Classical solutions satisfy pointwise derivative identities, weak solutions satisfy integral identities against test functions, and variational solutions are minimisers or critical points of an energy functional. The course repeatedly translates among these languages.

We use $\mathcal L^n$ for [Lebesgue measure](/page/Lebesgue%20Measure) on $\mathbb R^n$, so $d\mathcal L^n$ is the measure in the integrals below. The space $C_c^\infty(U)$ consists of smooth functions with compact support in $U$, and $H^{-1}(U)$ denotes the [dual space](/page/Dual%20Space) $(H^1_0(U))^*$. Sobolev spaces, traces, and dual spaces are developed in Chapter 4; here they are fixed so that the model weak formulations have precise notation from the start.

## The Main Questions of Elliptic Theory

The first problem is to understand what should count as a solution. For smooth data on smooth domains, it is natural to ask for a twice continuously differentiable function satisfying the equation at every point. In applications and in limiting arguments, the data, domain, or coefficients may not be smooth enough for such a solution to exist, so the equation must be interpreted through [integration by parts](/theorems/210).

[definition: Classical Solution] Let $U \subset \mathbb R^n$ be open, let $f: U \to \mathbb R$, and let $L:C^2(U)\to C(U)$ be a second-order differential operator on $U$. A function $u \in C^2(U)$ is a classical solution of $Lu=f$ in $U$ if \begin{align*} Lu(x)=f(x) \quad \text{for every } x \in U. \end{align*} [/definition]

Classical solutions preserve the pointwise meaning of the equation, but they are not stable under the limiting arguments used later in the course.
This motivates the weak formulation, where the equation is tested against compactly supported smooth functions so that only first derivatives of the unknown are required.

[definition: Weak Solution of the Poisson Equation] Let $U \subset \mathbb R^n$ be open and let $f \in L^2(U)$. A function $u \in H^1(U)$ is a weak solution of $-\Delta u=f$ in $U$ if \begin{align*} \int_U \nabla u \cdot \nabla \phi\,d\mathcal L^n = \int_U f\phi\,d\mathcal L^n \end{align*} for every $\phi \in C_c^\infty(U)$. [/definition]

This definition is obtained by multiplying the equation by a [test function](/page/Test%20Function) and integrating by parts. The boundary term vanishes because the test function has compact support in $U$, so the formula only encodes the equation in the interior.

[example: Smooth Poisson Solutions Are Weak Solutions] Let $U \subset \mathbb R^n$ be open, let $f \in C(U)$, and suppose $u \in C^2(U)$ satisfies $-\Delta u=f$ pointwise in $U$. Fix $\phi \in C_c^\infty(U)$. For each $i=1,\dots,n$, the function $w_i=(\partial_{x_i}u)\,\phi$ belongs to $C^1(U)$ and vanishes outside the compact set $\operatorname{supp}\phi\subset U$, so its extension by zero is a $C^1$ function on $\mathbb R^n$ with compact support. Choose an open box $Q\subset\mathbb R^n$ with $\operatorname{supp}\phi\subset Q$. The box need not lie inside $U$, since only the zero extension of $w_i$ is integrated over it. Because $w_i=0$ near $\partial Q$, integrating first in the $x_i$ variable and then in the remaining variables (Fubini's theorem) gives \begin{align*} \int_U \partial_{x_i}w_i\,d\mathcal L^n=\int_Q \partial_{x_i}w_i\,d\mathcal L^n=0; \end{align*} this is the one-variable [integration by parts](/theorems/2098) formula in the $x_i$ direction with no boundary term. The product rule gives \begin{align*} \partial_{x_i}w_i=\partial_{x_i x_i}u\,\phi+\partial_{x_i}u\,\partial_{x_i}\phi \quad \text{in } U, \end{align*} and both terms on the right are continuous with compact support in $U$, hence integrable. Integrating the product rule over $U$ and using the vanishing of $\int_U \partial_{x_i}w_i\,d\mathcal L^n$, this becomes \begin{align*} \int_U \partial_{x_i}u\,\partial_{x_i}\phi\,d\mathcal L^n=-\int_U \partial_{x_i x_i}u\,\phi\,d\mathcal L^n.
\end{align*} Summing over $i$ gives \begin{align*} \int_U \nabla u\cdot \nabla \phi\,d\mathcal L^n=\sum_{i=1}^n \int_U \partial_{x_i}u\,\partial_{x_i}\phi\,d\mathcal L^n. \end{align*} The identity above for each coordinate direction gives \begin{align*} \sum_{i=1}^n \int_U \partial_{x_i}u\,\partial_{x_i}\phi\,d\mathcal L^n=-\sum_{i=1}^n\int_U \partial_{x_i x_i}u\,\phi\,d\mathcal L^n. \end{align*} Since $\Delta u=\sum_{i=1}^n\partial_{x_i x_i}u$, we obtain \begin{align*} -\sum_{i=1}^n\int_U \partial_{x_i x_i}u\,\phi\,d\mathcal L^n=-\int_U (\Delta u)\phi\,d\mathcal L^n. \end{align*} Finally, the classical equation $-\Delta u=f$ gives \begin{align*} -\int_U (\Delta u)\phi\,d\mathcal L^n=\int_U f\phi\,d\mathcal L^n. \end{align*} Thus $u$ satisfies the weak identity for every $\phi\in C_c^\infty(U)$, and the final substitution is exactly where the sign convention for $-\Delta$ enters. [/example] The converse question is one of the driving questions of regularity theory. If $u$ solves an elliptic equation weakly and the coefficients and data have additional structure, later chapters ask whether $u$ has derivatives in the classical sense. [quotetheorem:6416] [citeproof:6416] The extra hypotheses in the theorem are not cosmetic. Continuity gives a pointwise meaning to $-\Delta u-f$, while the assumptions $f\in L^2(U)$ and $u\in H^1(U)$ ensure that the weak formulation is even a statement in the function spaces already defined. On an unbounded domain, a [continuous function](/page/Continuous%20Function) need not be square-integrable, and a $C^2$ function need not have square-integrable first derivatives. The theorem therefore does not claim that every weak solution is smooth; it only says that when a weak solution is already smooth enough, the weak and classical languages agree. Regularity theory begins where this theorem stops: it asks which elliptic hypotheses force weak solutions to enter such smoother classes. 
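The integration-by-parts computation in the example above can be checked numerically in one dimension. The following is a minimal sketch under illustrative assumptions that are not part of the course text: $U=(0,1)$, $u(x)=\sin(\pi x)$, $f(x)=\pi^2\sin(\pi x)$, a bump test function supported in $[0.2,0.8]$, and midpoint quadrature. All function names are hypothetical.

```python
import math

def bump(x, a=0.2, b=0.8):
    """Smooth test function phi with compact support in [a, b] inside (0, 1)."""
    if x <= a or x >= b:
        return 0.0
    t = (2.0 * x - a - b) / (b - a)  # affine map of (a, b) onto (-1, 1)
    return math.exp(-1.0 / (1.0 - t * t))

def bump_prime(x, a=0.2, b=0.8):
    """Derivative phi' obtained from the chain rule."""
    if x <= a or x >= b:
        return 0.0
    t = (2.0 * x - a - b) / (b - a)
    s = 1.0 - t * t
    return math.exp(-1.0 / s) * (-2.0 * t / (s * s)) * (2.0 / (b - a))

def midpoint_integral(g, n=20000):
    """Midpoint-rule approximation of the integral of g over (0, 1)."""
    h = 1.0 / n
    return h * sum(g((k + 0.5) * h) for k in range(n))

def u_prime(x):
    """Derivative of the assumed classical solution u(x) = sin(pi x)."""
    return math.pi * math.cos(math.pi * x)

def source(x):
    """Source f = -u'' for the assumed solution."""
    return math.pi ** 2 * math.sin(math.pi * x)

# Left and right sides of the weak identity for this particular phi.
lhs = midpoint_integral(lambda x: u_prime(x) * bump_prime(x))
rhs = midpoint_integral(lambda x: source(x) * bump(x))
print(f"gradient side: {lhs:.10f}")
print(f"source side:   {rhs:.10f}")
```

The two printed numbers should agree up to quadrature error. This checks one test function only, whereas the weak formulation requires the identity for every $\phi\in C_c^\infty(U)$.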
## Ellipticity and Boundary Data The next problem is to isolate the structural condition that makes an equation behave like the Laplace equation. Elliptic equations have no distinguished time direction, and their solutions are controlled by boundary values and integral estimates rather than by initial conditions. [definition: Uniform Ellipticity in Divergence Form] Let $U \subset \mathbb R^n$ be open. A coefficient matrix $A:U\to \mathbb R^{n\times n}$ with entries $a_{ij}\in L^\infty(U)$ is uniformly elliptic on $U$ if there exists $\theta>0$ such that \begin{align*} \sum_{i,j=1}^n a_{ij}(x)\xi_i\xi_j \ge \theta |\xi|^2 \end{align*} for a.e. $x\in U$ and every $\xi\in \mathbb R^n$. [/definition] Uniform ellipticity is a coercivity condition: it says that the second-order part sees every direction in space with a positive amount of strength. This is why the same condition appears both in maximum principles and in the Hilbert-space estimates behind existence. [example: Constant Coefficient Elliptic Operators] Let $U\subset \mathbb R^n$ be open, and let $A=(a_{ij})\in \mathbb R^{n\times n}$ be symmetric with eigenvalues $0<\lambda_1\le \cdots \le \lambda_n$. Since the coefficients are constant, each $a_{ij}$ belongs to $L^\infty(U)$. We show that the operator \begin{align*} L:C^2(U)\to C(U), \qquad Lu=-\sum_{i,j=1}^n \partial_{x_i}(a_{ij}\partial_{x_j}u) \end{align*} is uniformly elliptic with ellipticity constant $\theta=\lambda_1$. By the spectral decomposition for real symmetric matrices, there is an orthogonal matrix $Q$ such that \begin{align*} A=Q\operatorname{diag}(\lambda_1,\ldots,\lambda_n)Q^\top. \end{align*} Fix $\xi\in \mathbb R^n$ and set $\eta=Q^\top \xi$. Then \begin{align*} \sum_{i,j=1}^n a_{ij}\xi_i\xi_j=\xi^\top A\xi. \end{align*} Substituting the diagonalisation of $A$ gives \begin{align*} \xi^\top A\xi=\xi^\top Q\operatorname{diag}(\lambda_1,\ldots,\lambda_n)Q^\top \xi. 
\end{align*} Since $\eta=Q^\top\xi$, this is \begin{align*} \xi^\top Q\operatorname{diag}(\lambda_1,\ldots,\lambda_n)Q^\top \xi=\eta^\top \operatorname{diag}(\lambda_1,\ldots,\lambda_n)\eta. \end{align*} Multiplying by the diagonal matrix gives \begin{align*} \eta^\top \operatorname{diag}(\lambda_1,\ldots,\lambda_n)\eta=\sum_{k=1}^n \lambda_k\eta_k^2. \end{align*} Because $\lambda_k\ge \lambda_1$ for every $k$, we have \begin{align*} \sum_{k=1}^n \lambda_k\eta_k^2\ge \sum_{k=1}^n \lambda_1\eta_k^2=\lambda_1|\eta|^2. \end{align*} Orthogonality of $Q$ gives \begin{align*} |\eta|^2=(Q^\top\xi)\cdot(Q^\top\xi)=\xi^\top QQ^\top\xi=\xi^\top\xi=|\xi|^2. \end{align*} Combining these identities yields \begin{align*} \sum_{i,j=1}^n a_{ij}\xi_i\xi_j\ge \lambda_1|\xi|^2. \end{align*} This inequality holds for every $\xi\in\mathbb R^n$ and is independent of $x\in U$, so the constant coefficient operator is uniformly elliptic with constant $\theta=\lambda_1$. The smallest eigenvalue is therefore the least stiffness of the quadratic form $\xi^\top A\xi$. [/example] Once ellipticity is fixed, the boundary condition determines the problem. The same differential expression may represent different analytic questions depending on whether the value, normal derivative, or a mixed boundary expression is prescribed. [definition: Dirichlet Problem for the Poisson Equation] Let $U \subset \mathbb R^n$ be a bounded [open set](/page/Open%20Set) and let $f\in L^2(U)$. In the classical formulation, let $g:\partial U\to \mathbb R$ be prescribed boundary data. In the weak formulation on a Lipschitz domain, let $g\in H^{1/2}(\partial U)$ and interpret $u=g$ through the trace operator from $H^1(U)$ to $H^{1/2}(\partial U)$. The Dirichlet problem for the Poisson equation is \begin{align*} -\Delta u=f \quad \text{in } U, \end{align*} with \begin{align*} u=g \quad \text{on } \partial U. 
\end{align*} [/definition] The boundary condition in this notation means that the trace of $u$ on $\partial U$ is prescribed. For the weak theory, this motivates building the boundary condition into the function space; in the homogeneous case the natural space of test functions and unknowns is $H^1_0(U)$. [definition: Homogeneous Dirichlet Weak Solution] Let $U \subset \mathbb R^n$ be bounded and open, and let $f\in L^2(U)$. A function $u\in H^1_0(U)$ is a homogeneous Dirichlet weak solution of $-\Delta u=f$ in $U$ if \begin{align*} \int_U \nabla u\cdot \nabla v\,d\mathcal L^n=\int_U fv\,d\mathcal L^n \end{align*} for every $v\in H^1_0(U)$. [/definition] Testing against all of $H^1_0(U)$ is stronger than testing only against smooth compactly supported functions, but the two formulations agree after density is established. This is the form that can be solved by Hilbert-space methods. ## Energy Methods and Variational Formulations The third problem is to explain why elliptic equations arise from minimisation. The Poisson equation with homogeneous Dirichlet boundary condition is the Euler--Lagrange equation of the Dirichlet energy with a forcing term, and this observation turns existence into a question about convex coercive functionals. [definition: Dirichlet Energy with Forcing] Let $U \subset \mathbb R^n$ be bounded and open, and let $f\in L^2(U)$. The Dirichlet energy with forcing is the functional $I:H^1_0(U)\to \mathbb R$ defined by \begin{align*} I[u]=\frac12\int_U |\nabla u|^2\,d\mathcal L^n-\int_U fu\,d\mathcal L^n. \end{align*} [/definition] The factor $1/2$ is chosen so that differentiating the quadratic part removes the factor $2$. To use the energy as more than intuition, one must check that every admissible first variation produces exactly the weak Poisson identity and that no boundary term survives. The issue is whether criticality in $H^1_0(U)$ is genuinely the same condition as the homogeneous Dirichlet weak formulation. 
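A finite-dimensional analogue gives a quick sanity check of this question. The sketch below assumes, purely for illustration, $U=(0,1)$, $f\equiv 1$, a uniform grid, and a piecewise-linear discretisation of $I$; the function names and the grid size are hypothetical. It solves the discrete stationarity equations and then compares the energy at the solution with the energy after an admissible perturbation.

```python
def discrete_energy(u, f, h):
    """Discrete Dirichlet energy with forcing on a uniform grid.

    u and f hold interior nodal values. The boundary values are fixed to
    zero, the discrete counterpart of working in H^1_0(0, 1).
    """
    full = [0.0] + list(u) + [0.0]
    gradient_part = sum(
        (full[i + 1] - full[i]) ** 2 for i in range(len(full) - 1)
    ) / (2.0 * h)
    forcing_part = h * sum(fi * ui for fi, ui in zip(f, u))
    return gradient_part - forcing_part

def solve_stationarity(f, h):
    """Solve 2u[i] - u[i-1] - u[i+1] = h^2 f[i] by the Thomas algorithm."""
    n = len(f)
    diag = [2.0] * n
    rhs = [h * h * fi for fi in f]
    # Forward elimination; every off-diagonal entry equals -1.
    for i in range(1, n):
        w = -1.0 / diag[i - 1]
        diag[i] -= w * (-1.0)
        rhs[i] -= w * rhs[i - 1]
    # Back substitution.
    u = [0.0] * n
    u[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        u[i] = (rhs[i] + u[i + 1]) / diag[i]
    return u

n = 49                       # number of interior nodes (illustrative)
h = 1.0 / (n + 1)
f = [1.0] * n
u = solve_stationarity(f, h)

# Hat-shaped admissible perturbation vanishing at both endpoints.
v = [min(i + 1, n - i) * h for i in range(n)]
energy_at_u = discrete_energy(u, f, h)
for eps in (-0.1, -0.01, 0.01, 0.1):
    perturbed = [ui + eps * vi for ui, vi in zip(u, v)]
    print(eps, discrete_energy(perturbed, f, h) - energy_at_u)
```

Each printed energy difference is positive and scales like $\varepsilon^2$, which is the discrete shadow of the vanishing first variation: the linear term in $\varepsilon$ is absent exactly when the stationarity equations hold.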
[quotetheorem:6417] [citeproof:6417] The theorem explains the dictionary between the PDE and the energy, and its hypotheses identify the limits of that dictionary. The space $H^1_0(U)$ is not an arbitrary choice: it is where the gradient term is square-integrable and where the homogeneous boundary condition is built into the admissible variations. If the admissible space were enlarged to $H^1(U)$, then the variation of $\int_U |\nabla u|^2\,d\mathcal L^n$ would produce a boundary contribution, so the critical point condition would correspond to a natural Neumann-type boundary condition rather than the homogeneous Dirichlet problem. The assumption $f\in L^2(U)$ makes the forcing term $u\mapsto \int_U fu\,d\mathcal L^n$ finite on $H^1_0(U)$, using the basic Sobolev and Poincare estimates available on bounded domains; for example, a non-integrable forcing term near an interior singularity may make $\int_U fu\,d\mathcal L^n$ undefined for admissible $u$. Boundedness of $U$ is also part of the mechanism, since Poincare's inequality controls $\|u\|_{L^2(U)}$ by $\|\nabla u\|_{L^2(U)}$ on $H^1_0(U)$, whereas on unbounded domains translations and slowly decaying functions obstruct this direct coercivity estimate. A one-dimensional example makes the dictionary concrete while also showing how the Sobolev formulation recovers a familiar boundary value problem. [example: One Dimensional Dirichlet Energy] On $U=(0,1)$, let $f\in L^2(0,1)$ and define \begin{align*} I[u]=\frac12\int_0^1 |u'(x)|^2\,dx-\int_0^1 f(x)u(x)\,dx \end{align*} for $u\in H^1_0(0,1)$. Suppose $u$ minimises $I$, and fix $v\in H^1_0(0,1)$. For every $\varepsilon\in\mathbb R$, the competitor $u+\varepsilon v$ also lies in $H^1_0(0,1)$, and \begin{align*} I[u+\varepsilon v]=\frac12\int_0^1 |u'(x)+\varepsilon v'(x)|^2\,dx-\int_0^1 f(x)(u(x)+\varepsilon v(x))\,dx. 
\end{align*} Expanding the square inside the first integral gives \begin{align*} |u'(x)+\varepsilon v'(x)|^2=|u'(x)|^2+2\varepsilon u'(x)v'(x)+\varepsilon^2|v'(x)|^2. \end{align*} Substituting this expansion into the energy yields \begin{align*} I[u+\varepsilon v]=\frac12\int_0^1 |u'(x)|^2\,dx+\varepsilon\int_0^1 u'(x)v'(x)\,dx+\frac{\varepsilon^2}{2}\int_0^1 |v'(x)|^2\,dx-\int_0^1 f(x)u(x)\,dx-\varepsilon\int_0^1 f(x)v(x)\,dx. \end{align*} The terms independent of $\varepsilon$ are exactly $I[u]$, so \begin{align*} I[u+\varepsilon v]=I[u]+\varepsilon\left(\int_0^1 u'(x)v'(x)\,dx-\int_0^1 f(x)v(x)\,dx\right)+\frac{\varepsilon^2}{2}\int_0^1 |v'(x)|^2\,dx. \end{align*} Since $\varepsilon=0$ minimises this quadratic polynomial in $\varepsilon$, its linear coefficient must vanish. Hence \begin{align*} \int_0^1 u'(x)v'(x)\,dx=\int_0^1 f(x)v(x)\,dx \end{align*} for every $v\in H^1_0(0,1)$. If $f$ is continuous and $u$ is represented by a $C^2$ function, then the same identity may be tested with every $v\in C_c^\infty(0,1)$. Integration by parts gives \begin{align*} \int_0^1 u'(x)v'(x)\,dx=\left[u'(x)v(x)\right]_{x=0}^{x=1}-\int_0^1 u''(x)v(x)\,dx. \end{align*} Because $v$ has compact support in $(0,1)$, it vanishes near $0$ and $1$, so $v(0)=v(1)=0$ and \begin{align*} \left[u'(x)v(x)\right]_{x=0}^{x=1}=u'(1)v(1)-u'(0)v(0)=0. \end{align*} Therefore \begin{align*} \int_0^1 u'(x)v'(x)\,dx=-\int_0^1 u''(x)v(x)\,dx. \end{align*} Combining this with the weak identity gives \begin{align*} -\int_0^1 u''(x)v(x)\,dx=\int_0^1 f(x)v(x)\,dx. \end{align*} Equivalently, \begin{align*} \int_0^1 \left(-u''(x)-f(x)\right)v(x)\,dx=0 \end{align*} for every $v\in C_c^\infty(0,1)$. Since $-u''-f$ is continuous, the fundamental lemma of the [calculus of variations](/page/Calculus%20of%20Variations) forces $-u''-f=0$ pointwise on $(0,1)$. 
The condition $u\in H^1_0(0,1)$ gives the homogeneous boundary values $u(0)=u(1)=0$, so the variational problem recovers \begin{align*} -u''=f \quad \text{in } (0,1), \qquad u(0)=u(1)=0. \end{align*} [/example] The variational language also prepares for equations whose coefficients or nonlinearities are determined by an energy density. Chapters 5 and 6 extend this picture from the quadratic Dirichlet energy to coercive bilinear forms and then to direct methods in the calculus of variations. ## Existence, Uniqueness, and Regularity as Separate Questions The final organisational problem is to separate three questions that are often conflated. Existence asks whether a solution in the chosen class is present. Uniqueness asks whether the data determine at most one solution. Regularity asks whether a weak solution is smoother than its definition requires. [quotetheorem:6418] [citeproof:6418] This theorem isolates uniqueness as a statement about the boundary condition, the domain, and the function space. The homogeneous Dirichlet condition removes the constant functions that would otherwise lie in the kernel; by contrast, on $U=(0,1)$ the Neumann problem $-u''=0$ with $u'(0)=u'(1)=0$ has every constant function as a solution. Boundedness of $U$ is also part of the functional-analytic setting: on an unbounded domain such as $\mathbb R^n$, compactly supported plateau functions $w_R$ that equal $1$ on $B(0,R)$ and vanish outside $B(0,2R)$ show that gradient control alone is not a coercive replacement for the $H^1_0$ norm without extra decay or normalisation assumptions. The shared datum $f\in L^2(U)$ fixes the problem being compared and keeps the right-hand side inside the standard dual framework. The next example marks a different limitation of the method: existence and uniqueness in a weak space do not automatically supply classical differentiability. [example: Why Regularity Is Not Built Into Weak Existence] Let $U=(0,1)$ and take $f=\mathbf 1_{(0,1/2)}\in L^2(0,1)$. 
Define $u$ by \begin{align*} u(x)=-\frac{x^2}{2}+\frac{3x}{8}\quad\text{for }0\le x\le \frac12 \end{align*} and \begin{align*} u(x)=-\frac{x}{8}+\frac18\quad\text{for }\frac12\le x\le 1. \end{align*} Then $u(0)=0$ and $u(1)=0$. At $x=\frac12$, the left formula gives \begin{align*} -\frac{(1/2)^2}{2}+\frac{3(1/2)}{8}=-\frac18+\frac{3}{16}=\frac{1}{16}, \end{align*} while the right formula gives \begin{align*} -\frac{1/2}{8}+\frac18=-\frac{1}{16}+\frac{2}{16}=\frac{1}{16}. \end{align*} Thus $u$ is continuous at the joining point. The derivatives on the two sides are \begin{align*} \frac{d}{dx}\left(-\frac{x^2}{2}+\frac{3x}{8}\right)=-x+\frac38 \end{align*} and \begin{align*} \frac{d}{dx}\left(-\frac{x}{8}+\frac18\right)=-\frac18. \end{align*} At $x=\frac12$, the left derivative is \begin{align*} -\frac12+\frac38=-\frac18, \end{align*} which equals the right derivative. Hence $u$ is continuously differentiable and piecewise $C^2$, so $u\in H^1_0(0,1)$. For every $v\in C_c^\infty(0,1)$, split the weak integral at the point where the formula for $u$ changes: \begin{align*} \int_0^1 u'(x)v'(x)\,dx=\int_0^{1/2}\left(-x+\frac38\right)v'(x)\,dx+\int_{1/2}^1\left(-\frac18\right)v'(x)\,dx. \end{align*} On the first interval, integration by parts gives \begin{align*} \int_0^{1/2}\left(-x+\frac38\right)v'(x)\,dx=\left[\left(-x+\frac38\right)v(x)\right]_{0}^{1/2}+\int_0^{1/2}v(x)\,dx. \end{align*} Since $v(0)=0$, the boundary term on this interval is \begin{align*} \left[\left(-x+\frac38\right)v(x)\right]_{0}^{1/2}=\left(-\frac12+\frac38\right)v\left(\frac12\right)-0=-\frac18v\left(\frac12\right). \end{align*} On the second interval, \begin{align*} \int_{1/2}^1\left(-\frac18\right)v'(x)\,dx=\left[-\frac18v(x)\right]_{1/2}^{1}. \end{align*} Since $v(1)=0$, this boundary term is \begin{align*} \left[-\frac18v(x)\right]_{1/2}^{1}=0-\left(-\frac18v\left(\frac12\right)\right)=\frac18v\left(\frac12\right). 
\end{align*} The two interface contributions cancel: \begin{align*} -\frac18v\left(\frac12\right)+\frac18v\left(\frac12\right)=0. \end{align*} Therefore \begin{align*} \int_0^1 u'(x)v'(x)\,dx=\int_0^{1/2}v(x)\,dx. \end{align*} Because $f=\mathbf 1_{(0,1/2)}$, the right-hand side is \begin{align*} \int_0^{1/2}v(x)\,dx=\int_0^1 f(x)v(x)\,dx. \end{align*} Thus $u$ satisfies the weak identity for $-u''=f$ with homogeneous Dirichlet boundary values. However, $u$ is not $C^2(0,1)$. On $(0,1/2)$, \begin{align*} u''(x)=-1, \end{align*} while on $(1/2,1)$, \begin{align*} u''(x)=0. \end{align*} If $u''$ were continuous at $x=\frac12$, its left and right limits there would agree, but these limits are $-1$ and $0$. Thus this weak solution lies in $H^1_0(0,1)$ and solves the equation variationally, yet it is not a classical $C^2$ solution; regularity requires additional hypotheses and estimates beyond weak existence alone. [/example] The separation of existence, uniqueness, and regularity determines the order of the course. It also explains why elliptic theory touches several neighbouring subjects. In geometry, Laplace-type operators encode harmonic coordinates and curvature equations; in probability, harmonic functions describe hitting probabilities for [Brownian motion](/page/Brownian%20Motion); in [numerical analysis](/page/Numerical%20Analysis), coercive bilinear forms become the stability mechanism behind finite element methods. The roadmap below records how the later chapters revisit the model problems from this introduction with increasingly precise tools. [remark: Course Roadmap] The course first revisits harmonic functions and model elliptic equations, including fundamental solutions, mean value properties, Harnack inequalities, and maximum principles. It then develops boundary value problems through weak formulations, Lax--Milgram, Fredholm alternatives, eigenvalue problems, and compactness. 
The last part studies regularity and variational methods, including interior estimates, Sobolev and Schauder-type phenomena, convex minimisation, Euler--Lagrange equations, and obstacle-type problems. [/remark]

These notes should therefore be read with two parallel questions in mind. Which formulation is appropriate for the data and boundary condition at hand? Once a solution exists in that formulation, what additional structure of the equation improves uniqueness, stability, or smoothness? Chapter 1 takes those questions into the simplest elliptic setting, where the Laplace and Poisson equations provide the model examples and harmonic functions reveal the core phenomena.

# 1. Model elliptic equations and classical harmonic theory

This chapter establishes the classical model for elliptic theory: the Laplace equation, the Poisson equation, and the harmonic functions that solve the homogeneous problem. The prerequisites are multivariable calculus, the [divergence theorem](/theorems/2754), basic open and closed sets in $\mathbb R^n$, and standard integration over balls and spheres. Chapters 4 through 6 replace pointwise second derivatives by weak derivatives and Hilbert-space arguments, but the guiding estimates already appear here in their classical form. The main theme is that elliptic equations have no preferred time direction: values inside a domain are controlled by sources and by boundary data surrounding the point.

## Laplace and Poisson Equations on Domains

The first question is what it means for a function on a domain to be in equilibrium. In electrostatics, gravitation, steady-state heat flow, and incompressible potential flow, the same operator appears: the sum of the unmixed second derivatives.
Its homogeneous equation describes absence of interior source, while its inhomogeneous version records how a prescribed source bends the solution. [definition: Laplacian] Let $U \subset \mathbb R^n$ be open. The Laplacian operator is the map \begin{align*} \Delta:C^2(U)\to C(U) \end{align*} which sends $u\in C^2(U)$ to the function $\Delta u: U \to \mathbb R$ defined by \begin{align*} \Delta u(x) = \sum_{i=1}^{n} \frac{\partial^2 u}{\partial x_i^2}(x). \end{align*} [/definition] The Laplacian is the prototype second-order [elliptic operator](/page/Elliptic%20Operator). The first class of solutions to isolate is the source-free case, because these functions provide the comparison objects for Poisson equations and the local corrections in representation formulas. This motivates the following definition of harmonicity as vanishing Laplacian throughout the domain. [definition: Harmonic Function] Let $U \subset \mathbb R^n$ be open. A function $u:U\to \mathbb R$ with $u \in C^2(U)$ is harmonic in $U$ if \begin{align*} \Delta u = 0 \quad \text{in } U. \end{align*} [/definition] The equation $\Delta u=0$ says that the function has no interior source. The next equation allows a source term and is the model for later weak formulations. [definition: Poisson Equation] Let $U \subset \mathbb R^n$ be open, let $f:U\to \mathbb R$ be a given function, and let $u:U\to \mathbb R$ be an unknown function with $u\in C^2(U)$. The Poisson equation with source $f$ is \begin{align*} -\Delta u = f \quad \text{in } U. \end{align*} [/definition] Boundary data are part of the problem, not a decorative addition. Without boundary conditions, adding a harmonic function to one solution of the Poisson equation gives another solution with the same source. [example: Harmonic Polynomials] In $\mathbb R^2$, let $u(x_1,x_2)=x_1^2-x_2^2$. Its first derivatives are \begin{align*} \partial_{x_1}u=2x_1,\qquad \partial_{x_2}u=-2x_2. 
\end{align*} Differentiating once more gives \begin{align*} \partial_{x_1x_1}u=2,\qquad \partial_{x_2x_2}u=-2. \end{align*} Therefore, by the definition of the Laplacian in two variables, \begin{align*} \Delta u=\partial_{x_1x_1}u+\partial_{x_2x_2}u=2+(-2)=0. \end{align*} Thus $u$ is harmonic on $\mathbb R^2$. More generally, write \begin{align*} (x_1+ix_2)^k=P_k(x_1,x_2)+iQ_k(x_1,x_2), \end{align*} where $P_k$ and $Q_k$ are real polynomials. For $k\ge 2$, differentiating with respect to $x_1$ gives \begin{align*} \partial_{x_1x_1}(P_k+iQ_k)=k(k-1)(x_1+ix_2)^{k-2}. \end{align*} Differentiating with respect to $x_2$ gives first \begin{align*} \partial_{x_2}(P_k+iQ_k)=ik(x_1+ix_2)^{k-1}. \end{align*} A second $x_2$-derivative then gives \begin{align*} \partial_{x_2x_2}(P_k+iQ_k)=ik(k-1)(x_1+ix_2)^{k-2}\cdot i. \end{align*} Since $i^2=-1$, this is \begin{align*} \partial_{x_2x_2}(P_k+iQ_k)=-k(k-1)(x_1+ix_2)^{k-2}. \end{align*} Adding the two second derivatives yields \begin{align*} \Delta(P_k+iQ_k)=k(k-1)(x_1+ix_2)^{k-2}-k(k-1)(x_1+ix_2)^{k-2}=0. \end{align*} Taking real and imaginary parts gives $\Delta P_k=0$ and $\Delta Q_k=0$. For $k=0$ the real and imaginary parts are constant, and for $k=1$ they are linear, so their second partial derivatives are also zero. Hence the real and imaginary parts of $(x_1+ix_2)^k$ are harmonic polynomials on $\mathbb R^2$, giving nonconstant harmonic functions and showing why constancy requires an additional global boundedness hypothesis. [/example] Exact harmonicity is too rigid for comparison arguments, since later maximum principles must also handle functions lying on one side of a harmonic function. The sign of $\Delta u$ records whether the graph bends upward or downward relative to harmonic averages, so we introduce the one-sided version now. [definition: Subharmonic Function] Let $U \subset \mathbb R^n$ be open. 
A function $u:U\to \mathbb R$ with $u \in C^2(U)$ is subharmonic in $U$ if \begin{align*} \Delta u \ge 0 \quad \text{in } U. \end{align*} [/definition] Subharmonic functions behave like functions lying below their harmonic replacements. They reappear in Chapter 2's maximum principles, where the sign of $\Delta u$ controls whether an interior maximum can occur. [remark: Boundary Data and Non-Uniqueness] If $u$ solves $-\Delta u=f$ in $U$ and $h$ is harmonic in $U$, then $u+h$ also solves $-\Delta(u+h)=f$. A boundary condition, such as $u=g$ on $\partial U$, selects the member of this affine family compatible with the prescribed boundary values. This is the classical shadow of the later variational statement: source terms determine an equation in the interior, while boundary data determine the admissible class. [/remark] The simplest non-polynomial harmonic functions arise when the function depends only on distance from the origin. These examples are also the local models for singularities. [example: Radial Harmonic Functions on Annuli] Let $A=\{x\in \mathbb R^n:r_0<|x|<r_1\}$ with $0<r_0<r_1$, and let $u(x)=\varphi(|x|)$, where $\varphi\in C^2((r_0,r_1))$. Write $r=|x|$. Since $r>0$ on $A$, \begin{align*} \frac{\partial r}{\partial x_i}=\frac{\partial}{\partial x_i}\left(\sum_{j=1}^n x_j^2\right)^{1/2}=\frac{x_i}{r}. \end{align*} Therefore \begin{align*} \frac{\partial u}{\partial x_i}=\varphi'(r)\frac{\partial r}{\partial x_i}=\varphi'(r)\frac{x_i}{r}. \end{align*} Differentiating once more in the same variable and using the product rule gives \begin{align*} \frac{\partial^2 u}{\partial x_i^2}=\varphi''(r)\frac{\partial r}{\partial x_i}\frac{x_i}{r}+\varphi'(r)\frac{\partial}{\partial x_i}\left(\frac{x_i}{r}\right). \end{align*} The remaining derivative is \begin{align*} \frac{\partial}{\partial x_i}\left(\frac{x_i}{r}\right)=\frac{1}{r}-\frac{x_i}{r^2}\frac{\partial r}{\partial x_i}=\frac{1}{r}-\frac{x_i^2}{r^3}. 
\end{align*} Hence \begin{align*} \frac{\partial^2 u}{\partial x_i^2}=\varphi''(r)\frac{x_i^2}{r^2}+\varphi'(r)\left(\frac{1}{r}-\frac{x_i^2}{r^3}\right). \end{align*} Summing over $i=1,\dots,n$ and using $\sum_{i=1}^n x_i^2=r^2$, the definition of the Laplacian gives \begin{align*} \Delta u(x)=\varphi''(r)\sum_{i=1}^n\frac{x_i^2}{r^2}+\varphi'(r)\sum_{i=1}^n\left(\frac{1}{r}-\frac{x_i^2}{r^3}\right). \end{align*} Thus \begin{align*} \Delta u(x)=\varphi''(r)\frac{r^2}{r^2}+\varphi'(r)\left(\frac{n}{r}-\frac{r^2}{r^3}\right)=\varphi''(r)+\frac{n-1}{r}\varphi'(r). \end{align*} So $u$ is harmonic in $A$ exactly when \begin{align*} 0=\varphi''(r)+\frac{n-1}{r}\varphi'(r). \end{align*} Multiplying by $r^{n-1}$ gives \begin{align*} 0=r^{n-1}\varphi''(r)+(n-1)r^{n-2}\varphi'(r)=\left(r^{n-1}\varphi'(r)\right)'. \end{align*} Hence $r^{n-1}\varphi'(r)=C$ for some constant $C$. If $n=2$, then $\varphi'(r)=C/r$, so \begin{align*} \varphi(r)=a+C\log r. \end{align*} Renaming $C$ as $b$ gives $\varphi(r)=a+b\log r$. If $n\ge 3$, then $\varphi'(r)=Cr^{1-n}$, so \begin{align*} \varphi(r)=a+\frac{C}{2-n}r^{2-n}. \end{align*} Renaming $C/(2-n)$ as $b$ gives $\varphi(r)=a+br^{2-n}$. Therefore the radial harmonic functions on annuli are exactly the logarithmic family in dimension $2$ and the inverse-power family in dimensions $n\ge 3$. [/example] ## Fundamental Solution and Newtonian Potentials The next problem is to solve Poisson's equation in all of $\mathbb R^n$ when the source is concentrated at one point. A function with a point source is not harmonic at that point, but it is harmonic away from it and has the correct singularity to reproduce test functions under integration by parts. [definition: Fundamental Solution of the Laplacian] For $n\ge 2$, the fundamental solution of $-\Delta$ in $\mathbb R^n$ is the function $\Phi:\mathbb R^n\setminus\{0\}\to \mathbb R$ given as follows. In dimension $n=2$, \begin{align*} \Phi(x)=\frac{1}{2\pi}\log\frac{1}{|x|}. 
\end{align*} In dimensions $n\ge 3$, \begin{align*} \Phi(x)=\frac{1}{n(n-2)\omega_n}|x|^{2-n}, \end{align*} where $\omega_n=\mathcal L^n(B(0,1))$. [/definition] The constants are chosen so that $-\Delta \Phi$ is the unit point mass at the origin in the distributional sense. This assertion is what makes the formula a fundamental solution rather than only a radial harmonic function away from $0$. The next theorem records both parts: smooth harmonicity off the pole and the exact normalisation of the singularity. [quotetheorem:566] [citeproof:566] This theorem is the point-source analogue of solving $-\Delta u=f$: the source has become a unit mass rather than an ordinary function. The dimension restriction is part of the statement. In one dimension the corresponding kernel for $-d^2/dx^2$ is a multiple of $|x|$, not a logarithmic or inverse-power radial kernel, so using the formula above would give the wrong singularity and the wrong flux normalisation. The distributional formulation is also necessary: there is no classical second derivative at the pole, and away from the pole the function is harmonic rather than equal to a point mass. The test-function hypothesis in the identity cannot be dropped casually. Compact support removes boundary terms at infinity when the punctured-domain integration by parts is performed; for a non-compact test function, such as a constant function, the displayed integral need not even be meaningful. Smoothness of $\phi$ near the origin is used to replace boundary averages over $\partial B(0,\varepsilon)$ by $\phi(0)$ as $\varepsilon\downarrow 0$. Thus the theorem does not solve a classical equation on all of $\mathbb R^n$, nor does it prescribe boundary data; it supplies a kernel with the exact local singularity needed for superposition. To solve Poisson equations with extended sources, the natural next step is to average these point-source responses with weights given by the source function. This construction is the Newtonian potential. 
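Before passing to potentials, the flux normalisation behind the constants in $\Phi$ can be sanity-checked numerically. The following is a minimal sketch and not part of the course's proofs. The function names `unit_ball_volume`, `phi_radial`, and `outward_flux` are hypothetical; the closed form $\omega_n=\pi^{n/2}/\Gamma(n/2+1)$ and the sphere area $n\omega_n r^{n-1}$ are used as known facts; and the radial derivative is only approximated by a central difference. Since $\Phi$ is radial, the integral of $-\partial\Phi/\partial\nu$ over $\partial B(0,r)$ is $-\Phi'(r)$ times the area of the sphere.

```python
import math


def unit_ball_volume(n: int) -> float:
    """Lebesgue measure of the unit ball in R^n (omega_n in the definition)."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)


def phi_radial(r: float, n: int) -> float:
    """Radial profile of the fundamental solution of -Laplace, for n >= 2."""
    if n == 2:
        return math.log(1.0 / r) / (2.0 * math.pi)
    return r ** (2 - n) / (n * (n - 2) * unit_ball_volume(n))


def outward_flux(r: float, n: int, rel_step: float = 1e-5) -> float:
    """Approximate the integral of -dPhi/dnu over the sphere of radius r."""
    h = rel_step * r
    # Central difference approximation of the radial derivative Phi'(r).
    dphi_dr = (phi_radial(r + h, n) - phi_radial(r - h, n)) / (2.0 * h)
    sphere_area = n * unit_ball_volume(n) * r ** (n - 1)
    return -dphi_dr * sphere_area


if __name__ == "__main__":
    for n in (2, 3, 4, 5):
        for r in (0.1, 1.0, 7.5):
            print(n, r, outward_flux(r, n))
```

For each dimension and radius tried, the printed flux should be close to $1$, independent of $r$. This is the numerical shadow of the statement that $-\Delta\Phi$ carries unit mass at the origin; a different constant in front of $|x|^{2-n}$ or $\log(1/|x|)$ would rescale this value.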
[definition: Newtonian Potential] Let $n\ge 2$. The Newtonian potential operator is the map \begin{align*} N:C_c(\mathbb R^n)\to C(\mathbb R^n) \end{align*} where $C_c(\mathbb R^n)$ denotes real-valued compactly supported continuous functions on $\mathbb R^n$, and $C(\mathbb R^n)$ denotes real-valued continuous functions on $\mathbb R^n$. It sends $f:\mathbb R^n\to \mathbb R$ to the function $Nf:\mathbb R^n\to \mathbb R$ defined by \begin{align*} Nf(x)=\int_{\mathbb R^n}\Phi(x-y)f(y)\,d\mathcal L^n(y). \end{align*} [/definition] The potential is the convolution $\Phi*f$, interpreted as an ordinary integral when the source is sufficiently regular and compactly supported. The definition is useful because the distributional identity for $\Phi$ should pass through the integral and recover the source $f$. The next theorem states the classical version of this inversion property. [quotetheorem:2] [citeproof:2] The compact support hypothesis keeps the integral under control at infinity and justifies the Fubini and localization steps in the proof. Without it, even simple non-decaying sources cause trouble: in dimensions $n\ge 3$, taking $f\equiv 1$ makes the integral against $|x-y|^{2-n}$ diverge at infinity, and in dimension $n=2$ the logarithmic kernel has its own growth obstruction. The $C^2$ hypothesis is what turns the distributional identity into the classical pointwise equation; for a merely continuous source, the potential still has a distributional interpretation, but second derivatives need not exist pointwise everywhere. The theorem should also not be read as a uniqueness statement. If $h$ is any harmonic function on $\mathbb R^n$, then $Nf+h$ has the same source, so additional growth or boundary conditions are needed to select a particular solution. On bounded domains, the same source term must be accompanied by boundary contributions. 
For this reason the whole-space formula is best viewed as the singular-kernel part of a [representation formula](/theorems/39); later local estimates will combine this part with a harmonic remainder. [example: Compactly Supported Source in Euclidean Space] Assume $n\ge 3$, let $f\in C_c^2(\mathbb R^n)$ be supported in $B(0,R)$, and set $u=Nf$. If $|x|>2R$ and $y\in \operatorname{supp} f$, then $|y|\le R<|x|/2$, so for every $0\le t\le 1$, \begin{align*} |x-ty|\ge |x|-t|y|\ge |x|-|y|\ge \frac{|x|}{2}. \end{align*} By the definition of the Newtonian potential and the formula for the fundamental solution in dimensions $n\ge 3$, \begin{align*} u(x)=\frac{1}{n(n-2)\omega_n}\int_{\mathbb R^n}|x-y|^{2-n}f(y)\,d\mathcal L^n(y). \end{align*} We compare $|x-y|^{2-n}$ with its leading far-field value $|x|^{2-n}$. For $z\ne 0$, \begin{align*} \nabla_z |z|^{2-n}=(2-n)|z|^{-n}z. \end{align*} Applying the one-variable [fundamental theorem of calculus](/theorems/632) to $t\mapsto |x-ty|^{2-n}$ gives \begin{align*} |x-y|^{2-n}-|x|^{2-n}=\int_0^1 -(2-n)|x-ty|^{-n}(x-ty)\cdot y\,dt. \end{align*} Taking absolute values and using $|(x-ty)\cdot y|\le |x-ty|\,|y|$ gives \begin{align*} \left||x-y|^{2-n}-|x|^{2-n}\right|\le \int_0^1 (n-2)|x-ty|^{1-n}|y|\,dt. \end{align*} Since $|x-ty|\ge |x|/2$ and $|y|\le R$ on the support of $f$, \begin{align*} \left||x-y|^{2-n}-|x|^{2-n}\right|\le (n-2)\left(\frac{|x|}{2}\right)^{1-n}R=C_{n,R}|x|^{1-n}. \end{align*} Therefore we may write \begin{align*} |x-y|^{2-n}=|x|^{2-n}+E_x(y) \end{align*} on $\operatorname{supp} f$, where $|E_x(y)|\le C_{n,R}|x|^{1-n}$. Substituting this into the integral for $u$ gives \begin{align*} u(x)=\frac{|x|^{2-n}}{n(n-2)\omega_n}\int_{\mathbb R^n}f(y)\,d\mathcal L^n(y)+\frac{1}{n(n-2)\omega_n}\int_{\mathbb R^n}E_x(y)f(y)\,d\mathcal L^n(y). 
\end{align*} The error term satisfies \begin{align*} \left|\frac{1}{n(n-2)\omega_n}\int_{\mathbb R^n}E_x(y)f(y)\,d\mathcal L^n(y)\right|\le \frac{C_{n,R}}{n(n-2)\omega_n}|x|^{1-n}\int_{\mathbb R^n}|f(y)|\,d\mathcal L^n(y). \end{align*} Hence \begin{align*} u(x)=\frac{|x|^{2-n}}{n(n-2)\omega_n}\int_{\mathbb R^n}f(y)\,d\mathcal L^n(y)+O(|x|^{1-n}) \end{align*} as $|x|\to\infty$, with the error constant depending on $n$, $R$, and $\int_{\mathbb R^n}|f|\,d\mathcal L^n$. Thus the total mass $\int_{\mathbb R^n}f$ determines the first far-field term of the solution, while the remaining contribution decays at least one power of $|x|$ faster. [/example] Representation formulas on domains combine the source term with boundary data. The previous example separates the far-field contribution of the source, while local regularity arguments need a different separation: source inside a ball plus a harmonic remainder. The following decomposition is the form used later when singular integrals and harmonic estimates are combined. [quotetheorem:6419] [citeproof:6419] The cutoff is essential because the source $f$ is only assumed to be known on $B(x_0,R)$, while the Newtonian potential is a whole-space construction. If one tried to insert $f$ directly into $Nf$, the expression would require values of $f$ outside the ball and might fail to be compactly supported. The condition $\zeta=1$ on the smaller ball is equally important: if $\zeta$ did not equal $1$ near the point being estimated, then $-\Delta N(\zeta f)=\zeta f$ rather than $f$, and the remainder $u-N(\zeta f)$ would not be harmonic there. The smaller-ball restriction cannot be removed from this cutoff proof. A smooth compactly supported cutoff inside $B(x_0,R)$ cannot equal $1$ all the way to the boundary of $B(x_0,R)$; near the outer boundary, derivative and support effects from the localization are unavoidable. 
The compact support of $\zeta f$ is also what permits use of the whole-space theorem without adding artificial data outside the domain. Thus this decomposition is not a Green function formula for a bounded domain and does not determine boundary values. Its purpose is local: shrink the ball, replace the source by a compactly supported source, and leave a harmonic error on the region where estimates are needed. ## Mean Value, Harnack Inequality, and Liouville Consequences The final problem in this chapter is to extract qualitative information from harmonicity without solving the equation explicitly. The central fact is that harmonic functions are equal to their averages over spheres and balls. This averaging principle is the source of maximum principles, Harnack inequalities, and rigidity theorems. For a sphere $\partial B(x_0,r)$, the notation $\mathcal H^{n-1}$ denotes $(n-1)$-dimensional [Hausdorff measure](/page/Hausdorff%20Measure), which is the standard surface measure on smooth hypersurfaces. Thus $|\partial B(x_0,r)|$ denotes the $\mathcal H^{n-1}$-measure of the sphere. [quotetheorem:31] [citeproof:31] The theorem says that harmonic functions have no hidden interior bias: their value at the centre is exactly the surrounding average. Harmonicity is indispensable here. For example, $u(x)=|x|^2$ on $\mathbb R^n$ satisfies $\Delta u=2n$, and its average over $\partial B(0,r)$ is $r^2$ rather than $u(0)=0$. The corresponding statement for subharmonic functions becomes an inequality, which is why averages detect the sign of $\Delta u$. The closed-ball containment prevents the averaging sphere or ball from leaving the domain, where $u$ may not be defined and where the divergence-theorem argument cannot be localized. On $U=B(0,1)$, the formula cannot be asserted at $x_0=0$ with $r=2$, since the sphere is outside the domain of a general harmonic function on $U$. 
The $C^2$ assumption is the classical regularity needed to differentiate spherical averages and apply the [divergence theorem](/theorems/3614); later weak versions replace it by Sobolev regularity and recover smoothness as part of the theory. With these hypotheses in place, the same statement gives a quick test for many examples. [example: Mean Value Check for a Harmonic Polynomial] For $u(x_1,x_2)=x_1^2-x_2^2$, first take the circle $\partial B(0,r)$ with $r>0$. With \begin{align*} \gamma(\theta)=(r\cos\theta,r\sin\theta),\qquad 0\le \theta\le 2\pi, \end{align*} we have $d\mathcal H^1=r\,d\theta$ and $|\partial B(0,r)|=2\pi r$. Substituting the parametrisation into $u$ gives \begin{align*} u(\gamma(\theta))=(r\cos\theta)^2-(r\sin\theta)^2=r^2(\cos^2\theta-\sin^2\theta)=r^2\cos(2\theta). \end{align*} Therefore the circle average is \begin{align*} \frac{1}{|\partial B(0,r)|}\int_{\partial B(0,r)}u\,d\mathcal H^1=\frac{1}{2\pi r}\int_0^{2\pi}r^2\cos(2\theta)\,r\,d\theta. \end{align*} Canceling the factor $r$ in numerator and denominator gives \begin{align*} \frac{1}{2\pi r}\int_0^{2\pi}r^2\cos(2\theta)\,r\,d\theta=\frac{r^2}{2\pi}\int_0^{2\pi}\cos(2\theta)\,d\theta. \end{align*} Since \begin{align*} \int_0^{2\pi}\cos(2\theta)\,d\theta=\left[\frac{1}{2}\sin(2\theta)\right]_0^{2\pi}=\frac{1}{2}\sin(4\pi)-\frac{1}{2}\sin(0)=0, \end{align*} the average over $\partial B(0,r)$ is $0$, which equals $u(0,0)=0^2-0^2=0$. For a circle centred at $a=(a_1,a_2)$, use \begin{align*} \gamma(\theta)=(a_1+r\cos\theta,a_2+r\sin\theta). \end{align*} Then \begin{align*} u(\gamma(\theta))=(a_1+r\cos\theta)^2-(a_2+r\sin\theta)^2. \end{align*} Expanding both squares gives \begin{align*} u(\gamma(\theta))=a_1^2+2a_1r\cos\theta+r^2\cos^2\theta-a_2^2-2a_2r\sin\theta-r^2\sin^2\theta. \end{align*} Grouping the constant, linear, and quadratic trigonometric terms gives \begin{align*} u(\gamma(\theta))=a_1^2-a_2^2+2r(a_1\cos\theta-a_2\sin\theta)+r^2\cos(2\theta). 
\end{align*} Thus \begin{align*} \frac{1}{2\pi r}\int_0^{2\pi}u(\gamma(\theta))\,r\,d\theta=\frac{1}{2\pi}\int_0^{2\pi}u(\gamma(\theta))\,d\theta. \end{align*} Substituting the expanded expression gives \begin{align*} \frac{1}{2\pi}\int_0^{2\pi}u(\gamma(\theta))\,d\theta=a_1^2-a_2^2+\frac{r}{\pi}\left(a_1\int_0^{2\pi}\cos\theta\,d\theta-a_2\int_0^{2\pi}\sin\theta\,d\theta\right)+\frac{r^2}{2\pi}\int_0^{2\pi}\cos(2\theta)\,d\theta. \end{align*} The three trigonometric integrals vanish: \begin{align*} \int_0^{2\pi}\cos\theta\,d\theta=\left[\sin\theta\right]_0^{2\pi}=0, \end{align*} \begin{align*} \int_0^{2\pi}\sin\theta\,d\theta=\left[-\cos\theta\right]_0^{2\pi}=-\cos(2\pi)+\cos(0)=0, \end{align*} and \begin{align*} \int_0^{2\pi}\cos(2\theta)\,d\theta=0. \end{align*} Therefore \begin{align*} \frac{1}{|\partial B(a,r)|}\int_{\partial B(a,r)}u\,d\mathcal H^1=a_1^2-a_2^2=u(a). \end{align*} So the average of this harmonic polynomial over every circle equals its value at the centre, exactly matching the mean value property. [/example] The [mean value theorem](/theorems/186) also explains why interior extrema are highly constrained. If a harmonic function reaches its largest value inside the domain, the average over every surrounding small sphere must equal that largest value. This observation leads to the first maximum principle in the course. Informally, the bounded-domain maximum principle says that a harmonic function cannot create a new strict extremum in the interior. The precise hypotheses matter, and they will reappear in the elliptic-operator version below; at this stage the point is only the mechanism supplied by the mean value property. Interior attainment makes every sufficiently small surrounding average equal to the same extreme value, so the averaging identity becomes a rigidity principle rather than just an integral formula. This informal maximum-principle picture has important limits. 
Connectedness matters: if $U$ is the union of two disjoint balls and $u$ is the constant $0$ on one component and the constant $1$ on the other, then $u$ is harmonic on $U$ and has an interior maximum on the second component without being constant on all of $U$. Harmonicity matters too: a non-harmonic $C^2$ function such as $u(x)=-|x|^2$ has a strict interior maximum at the origin. For positive harmonic functions, merely ruling out a new interior maximum is not enough for compactness or limiting arguments. If a sequence is normalised at one point, estimates at neighbouring points must not depend on the particular function. The needed replacement is a scale-invariant comparison estimate on compactly contained balls. The Poisson kernel for a ball supplies the two facts behind that comparison. First, for $x\in B(x_0,R')$ and $\xi\in \partial B(x_0,R')$, $P(x,\xi)$ is a positive density which represents harmonic functions by their boundary values, \begin{align*} u(x)=\int_{\partial B(x_0,R')}P(x,\xi)u(\xi)\,d\mathcal H^{n-1}(\xi). \end{align*} Second, when $x$ ranges over a smaller concentric ball $B(x_0,r)$ with $r<R'$, the values $P(x,\xi)$ are bounded above and below by positive constants depending only on $n$, $r$, and $R'$. These two Poisson-kernel facts are exactly what we need next: positivity lets us compare averages of a nonnegative harmonic function, while comparability on a smaller concentric ball turns boundary representation into interior control. The following theorem formulates that control as the Harnack inequality, which is the estimate used later for compactness and limiting arguments. [quotetheorem:689] [citeproof:689] Nonnegativity is indispensable because the proof compares positive Poisson averages. A sign-changing harmonic function can make the right-hand side useless: on $B(0,1)\subset\mathbb R^n$, $u(x)=x_1$ is harmonic, and on any smaller concentric ball $B(0,r)$ its infimum is negative while its supremum is positive, so no estimate of the displayed form can hold with a positive constant. 
Even replacing nonnegativity by a lower bound of unknown size would not give a scale-invariant ratio estimate; one must first translate the function by a known bound, and that changes the quantity being compared. Harmonicity and the ball geometry are also part of the mechanism. If $\Delta u\ne 0$, Poisson kernel representation is unavailable and interior sources may create peaks not controlled by boundary averages. The estimate is interior: the constant degenerates as $r/R\uparrow 1$, reflecting the fact that boundary behaviour may be much more singular than interior behaviour. A concrete obstruction is supplied by positive harmonic functions on annuli, where values can change rapidly near a missing inner boundary or an outer boundary while remaining well controlled away from both. The next example isolates that phenomenon in radial form, so the abstract dependence of the Harnack constant on relative distance becomes visible in an elementary family. [example: Harnack Constant for Radial Positive Harmonic Functions] In the annulus $1<|x|<4$, write $r=|x|$. For $n\ge 3$, every radial harmonic function has the form \begin{align*} u(x)=\varphi(r)=a+br^{2-n}. \end{align*} Set \begin{align*} \alpha=4^{2-n},\quad \beta=3^{2-n},\quad \gamma=2^{2-n}. \end{align*} Since $2-n<0$, the larger annulus corresponds to $\alpha<r^{2-n}<1$, while $2<r<3$ corresponds to $\beta<r^{2-n}<\gamma$. Thus we need to compare the values of the affine function $s\mapsto a+bs$ on $(\beta,\gamma)$, assuming $a+bs>0$ on $(\alpha,1)$. If $b=0$, the function is constant and the ratio is $1$. If $b>0$, then $a+bs$ is increasing, and positivity on $(\alpha,1)$ gives $a+b\alpha\ge 0$ by taking the limit as $s\downarrow \alpha$. For $\beta<s<t<\gamma$, \begin{align*} \frac{a+bt}{a+bs}=\frac{(a+b\alpha)+b(t-\alpha)}{(a+b\alpha)+b(s-\alpha)}. 
\end{align*} Since $a+b\alpha\ge 0$ and $t-\alpha>s-\alpha>0$, \begin{align*} \frac{(a+b\alpha)+b(t-\alpha)}{(a+b\alpha)+b(s-\alpha)}\le \frac{b(t-\alpha)}{b(s-\alpha)}. \end{align*} Also $t<\gamma$ and $s>\beta$, so \begin{align*} \frac{b(t-\alpha)}{b(s-\alpha)}\le \frac{\gamma-\alpha}{\beta-\alpha}. \end{align*} If $b<0$, write $c=-b>0$. Then $a+bs=a-cs=(a-c)+c(1-s)$, and positivity on $(\alpha,1)$ gives $a-c\ge 0$ by taking the limit as $s\uparrow 1$. For $\beta<s<t<\gamma$, \begin{align*} \frac{a+bs}{a+bt}=\frac{(a-c)+c(1-s)}{(a-c)+c(1-t)}. \end{align*} Since $a-c\ge 0$ and $1-s>1-t>0$, \begin{align*} \frac{(a-c)+c(1-s)}{(a-c)+c(1-t)}\le \frac{c(1-s)}{c(1-t)}. \end{align*} Also $s>\beta$ and $t<\gamma$, so \begin{align*} \frac{c(1-s)}{c(1-t)}\le \frac{1-\beta}{1-\gamma}. \end{align*} Therefore, for $n\ge 3$, \begin{align*} \sup_{2<|x|<3}u(x)\le \max\left\{\frac{\gamma-\alpha}{\beta-\alpha},\frac{1-\beta}{1-\gamma}\right\}\inf_{2<|x|<3}u(x). \end{align*} In dimension $n=2$, every positive radial harmonic function on the annulus has the form \begin{align*} u(x)=\varphi(r)=a+b\log r. \end{align*} Set $s=\log r$. Then $1<r<4$ corresponds to $0<s<\log 4$, and $2<r<3$ corresponds to $\log 2<s<\log 3$. Applying the same affine-function comparison on the intervals $(0,\log 4)$ and $(\log 2,\log 3)$ gives \begin{align*} \sup_{2<|x|<3}u(x)\le \max\left\{\frac{\log 3}{\log 2},\frac{\log 4-\log 2}{\log 4-\log 3}\right\}\inf_{2<|x|<3}u(x). \end{align*} Thus positivity on the larger annulus forces a uniform ratio bound on the compactly contained smaller annulus, and the displayed constants show explicitly that the bound depends on the smaller annulus staying away from the boundary. [/example] The final classical consequence is a rigidity theorem on all of Euclidean space. It is the first example of an elliptic estimate turning a qualitative global hypothesis into a classification. [quotetheorem:38] [citeproof:38] The one-sided boundedness assumption is essential. 
The coordinate function $u(x)=x_1$ is harmonic on $\mathbb R^n$ but is neither bounded above nor bounded below, and it is not constant. Harmonic polynomials of positive degree give further examples showing that global harmonicity alone does not impose rigidity. The proof also explains why the bounded-above case must be handled by $M-u$ rather than by adding a constant to $u$: an upper bound supplies a nonnegative harmonic function only after reflection across that bound. The whole-space hypothesis is equally important. On a proper domain, bounded nonconstant harmonic functions occur frequently; for instance, the coordinate function $u(x)=x_1$ is bounded and harmonic on $B(0,1)$. [Liouville's theorem](/theorems/38) shows how an interior estimate becomes a global classification only when the estimate can be applied on arbitrarily large balls. It also parallels the complex-analytic Liouville theorem for bounded entire holomorphic functions, since [real and imaginary parts of holomorphic functions are harmonic](/theorems/336) in the plane. Once the classical harmonic theory is in place, the next issue is not merely whether solutions exist, but how boundary data governs them. Chapter 2 makes that shift explicit by developing boundary value problems and the maximum principle, the tools that convert local elliptic structure into global control from the boundary. # 2. Boundary value problems and maximum principles After Chapter 1's interior harmonic theory, this chapter turns to boundary value problems for elliptic equations. The central question is how boundary information controls a solution inside the domain, and which signs in the operator make that control possible. The maximum principles are the main tool: they convert local ellipticity into global uniqueness, comparison, and boundary behaviour. 
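As a rough numerical picture of how boundary information controls a solution inside the domain, the sketch below runs Jacobi averaging on a small square grid. The five-point discretisation of Laplace's equation is an assumption of this sketch, not a construction used in the course, and the names `boundary_value`, `relax`, and `boundary_and_interior_ranges`, the grid size, and the boundary data are all hypothetical choices. Each sweep replaces an interior value by the average of its four neighbours, a discrete analogue of the mean value property from Chapter 1.

```python
def boundary_value(i: int, j: int, m: int) -> float:
    """Hypothetical boundary data on the edge of an (m+1) x (m+1) grid."""
    x, y = i / m, j / m
    return x * x - y * y  # restriction of a harmonic polynomial


def relax(m: int = 12, sweeps: int = 400) -> list:
    """Jacobi iteration for the five-point Laplacian with Dirichlet data."""
    grid = [[0.0] * (m + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        for j in range(m + 1):
            if i in (0, m) or j in (0, m):
                grid[i][j] = boundary_value(i, j, m)
    for _ in range(sweeps):
        new = [row[:] for row in grid]
        for i in range(1, m):
            for j in range(1, m):
                # Discrete mean value property: average of four neighbours.
                new[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                    + grid[i][j - 1] + grid[i][j + 1])
        grid = new
    return grid


def boundary_and_interior_ranges(grid: list) -> tuple:
    """Return ((min, max) on boundary nodes, (min, max) on interior nodes)."""
    m = len(grid) - 1
    edge = [grid[i][j] for i in range(m + 1) for j in range(m + 1)
            if i in (0, m) or j in (0, m)]
    inner = [grid[i][j] for i in range(1, m) for j in range(1, m)]
    return (min(edge), max(edge)), (min(inner), max(inner))


if __name__ == "__main__":
    print(boundary_and_interior_ranges(relax()))
```

Because every update is an average, the interior range stays inside the boundary range at every sweep, whether or not the iteration has converged. This is only a discrete caricature of the weak maximum principle proved in this chapter, but it shows the mechanism: interior values are controlled by the extreme boundary values.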
## Boundary Conditions for Elliptic Problems A second-order elliptic equation on a bounded domain $U \subset \mathbb R^n$ needs boundary data before it becomes a well-posed boundary value problem. Different physical and variational models prescribe different traces at $\partial U$: the value of the unknown, its normal flux, or a mixture of the two. We first separate these conditions because the maximum principle treats them in different ways. [definition: Classical Uniformly Elliptic Operator] Let $U \subset \mathbb R^n$ be open. A linear second-order operator in nondivergence form is a [linear map](/page/Linear%20Map) \begin{align*} L:C^2(U)\to C(U) \end{align*} of the form \begin{align*} Lu = -\sum_{i,j=1}^n a_{ij}(x)\partial_{x_i x_j}u + \sum_{i=1}^n b_i(x)\partial_{x_i}u + c(x)u, \end{align*} where $a_{ij}, b_i, c\in C(U)\cap L^\infty(U)$ and $A(x)=(a_{ij}(x))$ is symmetric. The operator is uniformly elliptic if there exists $\theta>0$ such that \begin{align*} \sum_{i,j=1}^n a_{ij}(x)\xi_i\xi_j \ge \theta |\xi|^2 \end{align*} for all $x \in U$ and all $\xi \in \mathbb R^n$. [/definition] The sign convention places a minus sign in front of the second-order part, so the model operator is $L=-\Delta$. With this convention, hypotheses such as $c\ge 0$ are the natural ones for upper bounds and uniqueness. [example: Laplacian With This Sign Convention] On a bounded domain $U\subset \mathbb R^n$, take \begin{align*} Lu=-\Delta u=-\sum_{i=1}^n \partial_{x_i x_i}u. \end{align*} Comparing this with the definition of a classical uniformly elliptic operator, \begin{align*} Lu=-\sum_{i,j=1}^n a_{ij}(x)\partial_{x_i x_j}u+\sum_{i=1}^n b_i(x)\partial_{x_i}u+c(x)u, \end{align*} gives $a_{ij}=\delta_{ij}$, $b_i=0$, and $c=0$. For every $\xi=(\xi_1,\dots,\xi_n)\in\mathbb R^n$, \begin{align*} \sum_{i,j=1}^n a_{ij}\xi_i\xi_j=\sum_{i,j=1}^n \delta_{ij}\xi_i\xi_j. 
\end{align*} Since $\delta_{ij}=0$ when $i\ne j$ and $\delta_{ii}=1$, this becomes \begin{align*} \sum_{i,j=1}^n \delta_{ij}\xi_i\xi_j=\sum_{i=1}^n \xi_i^2=|\xi|^2. \end{align*} Therefore the uniform ellipticity inequality \begin{align*} \sum_{i,j=1}^n a_{ij}\xi_i\xi_j\ge \theta |\xi|^2 \end{align*} holds with $\theta=1$, because $|\xi|^2\ge 1\cdot |\xi|^2$ for all $\xi$. This is the reference sign convention for the maximum principles: at an interior maximum of $u$ the Hessian is negative semidefinite, so $-\Delta u\ge0$ there, and the strict inequality $Lu<0$ cannot hold at such a point. [/example] The operator describes the equation in the interior, but a boundary value problem also has to specify what information is imposed on $\partial U$. Without such data, solutions of even the Laplace equation are far from unique: every constant function is harmonic, and so is every affine function. The most direct prescription is to fix the boundary values themselves, giving the Dirichlet condition. [definition: Dirichlet Boundary Condition] Let $U\subset \mathbb R^n$ be a domain, let $g\in C(\partial U)$, and let $u\in C(\overline U)$. The trace operator \begin{align*} \gamma_0:C(\overline U)\to C(\partial U), \qquad \gamma_0 u=u|_{\partial U}, \end{align*} defines the Dirichlet boundary condition with boundary data $g$ by \begin{align*} \gamma_0 u=g. \end{align*} [/definition] Dirichlet data fix the boundary height of the solution, which is exactly the quantity controlled by maximum principles. Some models instead prescribe how much flux crosses the boundary, so the next boundary condition records the normal derivative rather than the value. [definition: Neumann Boundary Condition] Let $U\subset \mathbb R^n$ have $C^1$ boundary, let $\nu:\partial U\to\mathbb R^n$ denote the outward unit normal, and let $g\in C(\partial U)$. 
The Neumann boundary operator is the map \begin{align*} N:C^1(\overline U)\to C(\partial U), \qquad Nu=\frac{\partial u}{\partial \nu}=\nabla u\cdot \nu. \end{align*} A function $u\in C^1(\overline U)$ satisfies the Neumann boundary condition with boundary data $g$ if \begin{align*} Nu=g. \end{align*} [/definition] Neumann data prescribe flux rather than height. For the pure Laplacian this leaves additive constants undetermined, so uniqueness requires either a normalisation such as \begin{align*} \int_U u\,d\mathcal L^n=0, \end{align*} where $\mathcal L^n$ denotes $n$-dimensional Lebesgue measure, or a zeroth-order term that fixes the additive constant. If a boundary model only controls flux, it cannot distinguish two temperature profiles that differ by a constant. Between fixed value and fixed flux lies a mixed boundary condition with boundary damping, where the sign of the damping coefficient determines whether boundary maxima are suppressed or reinforced. [definition: Robin Boundary Condition] Let $U\subset \mathbb R^n$ have $C^1$ boundary, let $\nu$ denote the outward unit normal on $\partial U$, let $\alpha,g\in C(\partial U)$, and let $u\in C^1(\overline U)$. The Robin boundary operator with coefficient $\alpha$ is the map \begin{align*} R_\alpha:C^1(\overline U)\to C(\partial U), \qquad R_\alpha u=\frac{\partial u}{\partial \nu}+\alpha u. \end{align*} The function $u$ satisfies the Robin boundary condition with coefficient $\alpha$ and boundary data $g$ if \begin{align*} R_\alpha u=g. \end{align*} [/definition] Robin data interpolate between fixed flux, recovered when $\alpha=0$, and fixed value, approached formally as $\alpha\to\infty$; the term $\alpha u$ acts as boundary damping. When $\alpha\ge 0$, the boundary term has the same sign as the interior zeroth-order term in the maximum principle. If $\alpha<0$, the boundary term rewards positive boundary values instead of penalising them, so comparison arguments for Robin problems require extra care. 
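A one-dimensional sketch makes the role of the sign of $\alpha$ concrete. On $U=(0,1)$ the harmonic functions are exactly the affine functions $u(x)=a+bx$, the outward normal derivative is $-u'(0)$ at $x=0$ and $u'(1)$ at $x=1$, and the homogeneous Robin condition becomes a $2\times 2$ linear system for $(a,b)$. The names `robin_residuals` and `robin_determinant` are hypothetical, and taking $\alpha$ to be the same constant at both endpoints is an assumption made only for this illustration.

```python
def robin_residuals(a: float, b: float, alpha: float) -> tuple:
    """Homogeneous Robin residuals of u(x) = a + b*x on the interval (0, 1).

    The outward normal derivative is -u'(0) at x = 0 and +u'(1) at x = 1.
    """
    left = -b + alpha * a          # du/dnu + alpha*u evaluated at x = 0
    right = b + alpha * (a + b)    # du/dnu + alpha*u evaluated at x = 1
    return left, right


def robin_determinant(alpha: float) -> float:
    """Determinant of the 2x2 system robin_residuals(a, b, alpha) = (0, 0)."""
    # Rows of the system in (a, b): [alpha, -1] and [alpha, 1 + alpha].
    return alpha * (1.0 + alpha) + alpha


if __name__ == "__main__":
    for alpha in (1.0, 0.0, -2.0):
        print(alpha, robin_determinant(alpha))
    print(robin_residuals(1.0, 0.0, 0.0))    # constant, pure Neumann case
    print(robin_residuals(1.0, -2.0, -2.0))  # u(x) = 1 - 2x with alpha = -2
```

The determinant equals $\alpha^2+2\alpha$. It is positive for $\alpha>0$, so only $u=0$ satisfies the homogeneous problem; it vanishes at $\alpha=0$, where the constants survive, and at $\alpha=-2$, where $u(x)=1-2x$ is a nonzero harmonic function with zero Robin data.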
[example: Nonuniqueness for the Pure Neumann Laplacian] Let $U\subset\mathbb R^n$ be connected and bounded with $C^1$ boundary, and consider the pure Neumann problem \begin{align*} -\Delta u=0 \quad \text{in } U, \qquad \frac{\partial u}{\partial\nu}=0 \quad \text{on } \partial U. \end{align*} For each constant $K\in\mathbb R$, define $u_K(x)=K$ on $\overline U$. Then each first derivative is zero, \begin{align*} \partial_{x_i}u_K(x)=0 \qquad \text{for } i=1,\dots,n, \end{align*} so each second derivative is also zero, \begin{align*} \partial_{x_i x_i}u_K(x)=\partial_{x_i}(0)=0. \end{align*} Hence \begin{align*} -\Delta u_K =-\sum_{i=1}^n \partial_{x_i x_i}u_K =-\sum_{i=1}^n 0 =0 \end{align*} in $U$. On the boundary, $\nabla u_K=(0,\dots,0)$, so for the outward unit normal $\nu$, \begin{align*} \frac{\partial u_K}{\partial\nu} =\nabla u_K\cdot \nu =(0,\dots,0)\cdot \nu =0. \end{align*} Thus every constant $u_K$ solves the same equation with the same homogeneous Neumann boundary data. If $K_1\ne K_2$, then $u_{K_1}\ne u_{K_2}$, so the pure Neumann Laplacian does not have uniqueness unless the additive constants are removed by a normalisation or fixed by an additional term. [/example] ## Weak Maximum Principle The first maximum principle asks whether an elliptic inequality prevents a positive maximum from being created in the interior. For the operator $Lu=-\sum a_{ij}\partial_{x_i x_j}u+\sum b_i\partial_{x_i}u+cu$, the answer depends on the sign of $c$ and on the direction of the inequality. [quotetheorem:100] [citeproof:100] The theorem is a global bound from boundary data, not a [regularity theorem](/theorems/2750): it assumes a classical solution and then uses the sign structure of the operator to rule out certain interior extrema. Boundedness of $U$ matters because the supremum must be controlled by an actual boundary; on the half-space $\{x\in\mathbb R^n:x_1>0\}$, the function $u(x)=x_1$ satisfies $-\Delta u=0$ and vanishes on the boundary, yet it is unbounded in the interior. 
Continuity up to $\partial U$ is also part of the statement, since otherwise boundary values do not determine the value of the supremum on $\overline U$. Uniform ellipticity prevents degeneracy of the second-order part, while the condition $c\ge0$ is essential, as the wrong-sign example later in this section shows by changing the sign of the zeroth-order term. The first application of the theorem is uniqueness. [example: Uniqueness for the Dirichlet Problem] Let $U\subset\mathbb R^n$ be bounded, let $L$ be a uniformly elliptic operator with $c\ge0$, and suppose $u,v\in C^2(U)\cap C(\overline U)$ satisfy $Lu=Lv=f$ in $U$ and $u=v=g$ on $\partial U$. Set $w=u-v$. Since $C^2(U)\cap C(\overline U)$ is closed under subtraction, $w\in C^2(U)\cap C(\overline U)$. By linearity of $L$, \begin{align*} Lw=L(u-v)=Lu-Lv=f-f=0 \end{align*} in $U$. On the boundary, \begin{align*} w|_{\partial U}=(u-v)|_{\partial U}=g-g=0. \end{align*} Thus $Lw=0\le0$ in $U$ and $w\le0$ on $\partial U$. By the [Weak Maximum Principle](/theorems/100), $w\le0$ in $U$. Apply the same reasoning to $-w=v-u$. Linearity gives \begin{align*} L(-w)=-Lw=-0=0 \end{align*} in $U$, and on the boundary, \begin{align*} (-w)|_{\partial U}=(v-u)|_{\partial U}=g-g=0. \end{align*} Again the [Weak Maximum Principle](/theorems/100) gives $-w\le0$ in $U$, which is equivalent to $w\ge0$ in $U$. Therefore $w\le0$ and $w\ge0$ throughout $U$, so \begin{align*} w=0 \end{align*} in $U$. Since $w=u-v$, this means $u=v$ throughout $U$. Hence the Dirichlet problem has at most one classical solution under the maximum-principle sign condition. [/example] The sign condition on $c$ is not cosmetic. If the zeroth-order term has the wrong sign, a positive interior maximum can be compatible with the differential inequality. [example: Wrong Sign Zeroth-Order Term] On $U=(0,\pi)$, define \begin{align*} Lu=-u''-u. 
\end{align*} This is the one-dimensional operator $L=-\frac{d^2}{dx^2}+c$ with $c=-1$, so its zeroth-order coefficient has the opposite sign from the hypothesis $c\ge0$ in the [Weak Maximum Principle](/theorems/100). Let $u(x)=\sin x$. At the boundary points, \begin{align*} u(0)=\sin 0=0 \end{align*} and \begin{align*} u(\pi)=\sin \pi=0. \end{align*} Thus $u=0$ on $\partial U=\{0,\pi\}$. Its derivatives are \begin{align*} u'(x)=\cos x \end{align*} and \begin{align*} u''(x)=-\sin x. \end{align*} Substituting these into $L$ gives \begin{align*} Lu=-u''-u. \end{align*} Since $u''=-\sin x$ and $u=\sin x$, \begin{align*} Lu=-(-\sin x)-\sin x. \end{align*} Therefore \begin{align*} Lu=\sin x-\sin x=0 \end{align*} for every $x\in(0,\pi)$. The function is positive in the interior because $0<x<\pi$ implies $\sin x>0$. In particular, \begin{align*} u\left(\frac{\pi}{2}\right)=\sin\left(\frac{\pi}{2}\right)=1. \end{align*} So $u$ satisfies $Lu=0$ in $U$ and $u=0$ on $\partial U$, but has a positive interior maximum at $x=\pi/2$. If the weak maximum principle conclusion applied to this operator, the boundary condition would force $u\le0$ in $U$, contradicting $u(\pi/2)=1$. This shows exactly how the wrong sign $c=-1$ allows an interior positive maximum. [/example] This example isolates the obstruction: comparison arguments are valid only under the same sign hypotheses that make the maximum principle true. Once those hypotheses are in place, uniqueness alone is still too weak for estimates; one needs a principle that prevents an ordered subsolution and supersolution from crossing inside the domain when their boundary values are already ordered. [quotetheorem:4870] [citeproof:4870] Comparison is the bridge from uniqueness to estimates. 
Its hypotheses are inherited from the weak maximum principle. If the boundary order is dropped, a harmonic function $v$ and its shift $u=v+1$ are both harmonic, yet $u\le v$ fails at every point. If the sign condition on $c$ is dropped, the sine example above gives ordered boundary values with an interior crossing: $\sin x$ and $0$ agree on $\partial U$, yet $\sin x>0$ inside. The result also does not construct either function; it only says that, once a subsolution and supersolution are available with compatible boundary order, their order cannot reverse inside the domain. This is why explicit barriers are useful: once an explicit function dominates the boundary data and satisfies the right differential inequality, every solution is trapped beneath it.

[example: Constant Barriers for a Bounded Source] Let $U\subset\mathbb R^n$ be bounded, let $L=-\Delta+c$ with $c\ge c_0>0$ in $U$, and let $u\in C^2(U)\cap C(\overline U)$ satisfy $Lu=f$ in $U$ and $u=0$ on $\partial U$, where $|f|\le M$ in $U$ for some constant $M\ge0$. Let \begin{align*} A=\frac{M}{c_0}. \end{align*} Since $|f|\le M$ in $U$, we have $-M\le f(x)\le M$ for every $x\in U$. The constant function $A$ satisfies $\Delta A=0$, and therefore \begin{align*} LA=-\Delta A+cA=0+c\frac{M}{c_0}=\frac{cM}{c_0}. \end{align*} Because $c\ge c_0>0$, we get \begin{align*} LA=\frac{cM}{c_0}\ge \frac{c_0M}{c_0}=M. \end{align*} Since $M\ge f=Lu$, this gives $Lu\le LA$ in $U$. On the boundary, \begin{align*} u=0\le \frac{M}{c_0}=A. \end{align*} By the *[Comparison Principle](/theorems/4870)*, $u\le A$ in $U$. For the lower barrier, the constant function $-A$ also satisfies $\Delta(-A)=0$, so \begin{align*} L(-A)=-\Delta(-A)+c(-A)=0-c\frac{M}{c_0}=-\frac{cM}{c_0}. \end{align*} Again using $c\ge c_0$, we have \begin{align*} L(-A)=-\frac{cM}{c_0}\le -\frac{c_0M}{c_0}=-M. \end{align*} Since $-M\le f=Lu$, this gives $L(-A)\le Lu$ in $U$. On the boundary, \begin{align*} -A=-\frac{M}{c_0}\le 0=u. \end{align*} Applying the *Comparison Principle* to $-A$ and $u$ gives $-A\le u$ in $U$. Combining the upper and lower estimates, \begin{align*} -\frac{M}{c_0}\le u\le \frac{M}{c_0} \end{align*} throughout $U$. 
Thus the positive zeroth-order coefficient converts the pointwise source bound $|f|\le M$ into the uniform solution bound $|u|\le M/c_0$. [/example]

## Strong Maximum Principle

The weak maximum principle locates the maximum on the boundary but does not exclude that the same value is also attained inside. The strong maximum principle answers the rigidity question: if an elliptic subsolution attains its maximum at an interior point, must the function be constant?

[quotetheorem:102] [citeproof:102]

The extra information is qualitative rather than quantitative. Connectedness is essential because the conclusion propagates through the domain; on a disconnected domain a function can equal its maximum on one component and take a smaller value on another. The regularity and uniform ellipticity hypotheses are the mechanism behind the barrier argument, and degenerate operators can fail to propagate strictness from one direction to another. The nonnegative-maximum condition is tied to the assumption $c\ge0$: if $M$ is the maximum of $u$ and $v=u-M$, then $Lv=Lu-cM$, which is guaranteed to stay nonpositive when $M\ge0$. Thus a nonconstant subsolution may approach its maximum near the boundary, but under these hypotheses it cannot attain that maximum at an interior point.

[example: Interior Rigidity for Harmonic Functions] Let $U\subset\mathbb R^n$ be open and connected and let $u\in C^2(U)$ satisfy $-\Delta u=0$ in $U$. Suppose $u$ attains a maximum at $x_0\in U$, and set $M=u(x_0)$ and $v=u-M$. Then $v\in C^2(U)$, and for each $x\in U$, \begin{align*} v(x)=u(x)-M\le u(x_0)-M=0. \end{align*} Also \begin{align*} v(x_0)=u(x_0)-M=M-M=0, \end{align*} so $v$ attains the nonnegative maximum $0$ at the interior point $x_0$. For the operator $L=-\Delta$, the zeroth-order coefficient is $c=0$, so the sign hypothesis $c\ge0$ holds. Since constants have zero second derivatives, \begin{align*} -\Delta v =-\Delta(u-M) =-\Delta u+\Delta M =-\Delta u+0 =0. 
\end{align*} Thus $Lv=0\le0$ in $U$. By the *Strong Maximum Principle*, $v$ is constant on $U$. Because $v(x_0)=0$, that constant is $0$, so $u-M=0$ throughout $U$, and hence \begin{align*} u(x)=M \end{align*} for every $x\in U$. If instead $u$ attains a minimum at $x_0$, then $-u$ attains a maximum at $x_0$. Moreover, \begin{align*} -\Delta(-u) =-\sum_{i=1}^n \partial_{x_i x_i}(-u) =-\sum_{i=1}^n \bigl(-\partial_{x_i x_i}u\bigr) =\sum_{i=1}^n \partial_{x_i x_i}u =\Delta u. \end{align*} Since $-\Delta u=0$, we have $\Delta u=0$, so $-\Delta(-u)=0$. Applying the same maximum argument to $-u$ shows that $-u$ is constant, and therefore $u$ is constant. Thus a nonconstant harmonic function on a connected domain can have neither an interior maximum nor an interior minimum. [/example]

The connectedness hypothesis cannot be dropped. Without it, a function can equal its maximum on one component and take smaller values on another.

[remark: Role of Connectedness] If $U=U_1\cup U_2$ is a disjoint union of two nonempty open connected sets, the function $u$ equal to $1$ on $U_1$ and $0$ on $U_2$ satisfies $-\Delta u=0$ on $U$. It attains its maximum at every point of $U_1$ but is not constant on all of $U$. [/remark]

## Hopf Boundary Point Lemma

The strong maximum principle says that a nonconstant solution cannot attain its maximum in the interior. The next question is what happens when the maximum is attained at a smooth boundary point: if the solution is strictly smaller inside the domain, its outward normal derivative at that point should be strictly positive.

[definition: Interior Ball Condition] Let $U\subset\mathbb R^n$ be open and let $x_0\in\partial U$. The domain $U$ satisfies an interior ball condition at $x_0$ if there exist $y\in U$ and $r>0$ such that \begin{align*} B(y,r):=\{x\in\mathbb R^n:|x-y|<r\}\subset U, \qquad x_0\in\partial B(y,r). 
\end{align*} [/definition]

The interior ball condition records the exact geometry used by boundary barriers: a ball inside the domain touches the boundary at the point of interest.

[illustration:interior-ball-condition]

The next result asks what this geometry forces for the normal derivative when a nonconstant subsolution reaches its maximum at that boundary point.

[quotetheorem:101] [citeproof:101]

[Hopf's lemma](/theorems/101) is a boundary version of strictness. Each hypothesis has a concrete role. The interior ball condition is the geometric input that allows the exponential barrier to fit inside the domain; for the cusp domain \begin{align*} U=\{(x_1,x_2)\in\mathbb R^2:0<x_2<x_1^2,\ 0<x_1<1\} \end{align*} no disk contained in $U$ touches the boundary at the origin, so the annular barrier cannot be placed there. The strict local maximum assumption excludes the constant case: if $u\equiv M$ and $c=0$, then $Lu=cM=0$ satisfies the differential inequality, but $\partial u/\partial\nu=0$ rather than a strictly positive number. Differentiability at $x_0$ is also a real hypothesis, since a function can satisfy a comparison estimate of the form $M-u\ge C\operatorname{dist}(x,\partial U)$ for some $C>0$ without possessing a classical normal derivative at the boundary point. The coefficient and regularity assumptions are what make the barrier computation legitimate; if ellipticity degenerates in the normal direction, a boundary maximum can fail to force linear decay into the domain.

The forward use of the lemma is to convert comparison into boundary sign information. In uniqueness proofs it separates a nonconstant solution touching a barrier at the boundary from one that merely shares its boundary value. In variational language, it explains why minimisers of elliptic energies with positive interior forcing cannot have completely flat contact with a smooth obstacle boundary unless the active set has already changed the equation. 
[example: Barrier Near a Smooth Boundary Point] Let $U=B(0,1)$, let $x_0=e_1=(1,0,\dots,0)$, and let $L=-\Delta$. For $\alpha>0$, define \begin{align*} w(x)=e^{-\alpha |x|^2}-e^{-\alpha}. \end{align*} If $|x|=1$, then \begin{align*} w(x)=e^{-\alpha\cdot 1}-e^{-\alpha}=0. \end{align*} If $|x|<1$, then $|x|^2<1$, so $-\alpha |x|^2>-\alpha$ and hence \begin{align*} e^{-\alpha |x|^2}>e^{-\alpha}. \end{align*} Therefore $w(x)>0$ in $B(0,1)$. We compute the differential sign in a boundary annulus. Since \begin{align*} \partial_{x_i}|x|^2=2x_i, \end{align*} the chain rule gives \begin{align*} \partial_{x_i}w(x)=-2\alpha x_i e^{-\alpha |x|^2}. \end{align*} Differentiating this expression with respect to $x_i$ and using the product rule, \begin{align*} \partial_{x_i x_i}w(x)=\partial_{x_i}\left(-2\alpha x_i e^{-\alpha |x|^2}\right). \end{align*} Thus \begin{align*} \partial_{x_i x_i}w(x)=-2\alpha e^{-\alpha |x|^2}-2\alpha x_i\partial_{x_i}\left(e^{-\alpha |x|^2}\right). \end{align*} Because \begin{align*} \partial_{x_i}\left(e^{-\alpha |x|^2}\right)=-2\alpha x_i e^{-\alpha |x|^2}, \end{align*} we get \begin{align*} \partial_{x_i x_i}w(x)=\left(-2\alpha+4\alpha^2x_i^2\right)e^{-\alpha |x|^2}. \end{align*} Summing over $i=1,\dots,n$ gives \begin{align*} \Delta w(x)=\left(-2\alpha n+4\alpha^2|x|^2\right)e^{-\alpha |x|^2}. \end{align*} Therefore \begin{align*} Lw(x)=-\Delta w(x)=2\alpha\left(n-2\alpha |x|^2\right)e^{-\alpha |x|^2}. \end{align*} Now take the annulus \begin{align*} A=\left\{x\in B(0,1):\frac12<|x|<1\right\}. \end{align*} For $x\in A$, we have $|x|^2>1/4$. If $\alpha>2n$, then \begin{align*} 2\alpha |x|^2>2\alpha\cdot\frac14=\frac{\alpha}{2}>n. \end{align*} Hence $n-2\alpha |x|^2<0$ on $A$. Since $2\alpha e^{-\alpha |x|^2}>0$, it follows that \begin{align*} Lw(x)<0 \end{align*} throughout $A$. At the touching point $x_0=e_1$, the outward unit normal to $B(0,1)$ is $\nu(x_0)=e_1$. Also \begin{align*} \nabla w(x_0)=-2\alpha e^{-\alpha}e_1. 
\end{align*} Therefore \begin{align*} \frac{\partial w}{\partial \nu}(x_0)=\nabla w(x_0)\cdot e_1=-2\alpha e^{-\alpha}<0. \end{align*} Thus $w$ is positive inside the ball, vanishes on the boundary, satisfies the strict barrier inequality $Lw<0$ in the boundary annulus $A$ once $\alpha>2n$, and decreases in the outward normal direction at $x_0$. This is the explicit model barrier used in the proof of the *Hopf Boundary Point Lemma*. [/example]

## Uniqueness by Comparison

The maximum principles become boundary value theory when they are applied to differences of solutions. The key idea is that an elliptic equation with admissible sign cannot hide a positive difference in the interior if that difference is nonpositive on the boundary.

[quotetheorem:1187] [citeproof:1187]

This theorem is the basic uniqueness statement for classical Dirichlet problems. The Dirichlet boundary condition is essential: for the Neumann Laplacian on a connected domain, adding a constant gives another solution with the same normal derivative. The sign condition on $c$ is also essential, as $u(x)=\sin x$ on $(0,\pi)$ solves $-u''-u=0$ with zero Dirichlet data but is not the zero solution. Boundedness gives a boundary on which the weak maximum principle can control the supremum; on an unbounded domain such as the half-space $\{x_n>0\}$, zero boundary data do not prevent growth at infinity, as the harmonic function $u(x)=x_n$ shows. Uniform ellipticity and bounded coefficients are the analytic hypotheses behind the comparison principle used in the proof, and the classical regularity assumption is what makes $Lu=f$ and the boundary trace meaningful in this chapter's setting. The theorem says nothing about existence or higher regularity; later variational methods will supply weak solutions under Sobolev hypotheses, after which maximum principles continue to give order and uniqueness information whenever the weak solution has enough regularity or an appropriate weak comparison theorem is available. 
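The role of the sign condition can also be seen numerically. The following is a minimal sketch, assuming a centred finite-difference discretisation of $-u''+cu$ on $(0,\pi)$ with zero Dirichlet data; the helper names `sine_residual` and `solve_dirichlet`, the grid size, and the sample coefficient and source are illustrative assumptions, not part of the course text. It checks that $\sin x$ nearly solves the discrete homogeneous problem when $c=-1$ but not when $c=+1$, and that for $c\ge c_0>0$ the computed solution respects the barrier bound $|u|\le M/c_0$ from the constant-barrier example.

```python
import math

def sine_residual(c, n):
    """Largest |-(second difference of sin) + c*sin| on an n-point interior grid of (0, pi)."""
    h = math.pi / (n + 1)
    worst = 0.0
    for i in range(1, n + 1):
        x = i * h
        second_diff = (math.sin(x - h) - 2.0 * math.sin(x) + math.sin(x + h)) / h ** 2
        worst = max(worst, abs(-second_diff + c * math.sin(x)))
    return worst

def solve_dirichlet(c, f, n):
    """Solve -u'' + c(x) u = f(x) on (0, pi), u(0) = u(pi) = 0, by centred differences."""
    h = math.pi / (n + 1)
    xs = [(i + 1) * h for i in range(n)]
    off = -1.0 / h ** 2                       # sub- and super-diagonal entry
    diag = [2.0 / h ** 2 + c(x) for x in xs]  # main diagonal
    rhs = [f(x) for x in xs]
    for i in range(1, n):                     # forward elimination (Thomas algorithm)
        m = off / diag[i - 1]
        diag[i] -= m * off
        rhs[i] -= m * rhs[i - 1]
    u = [0.0] * n
    u[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):            # back substitution
        u[i] = (rhs[i] - off * u[i + 1]) / diag[i]
    return xs, u

# Wrong sign: sin x is (up to discretisation error) a nonzero solution with zero data.
wrong_sign = sine_residual(-1.0, 200)
# Admissible sign: sin x is far from solving the homogeneous problem.
right_sign = sine_residual(1.0, 200)

# Barrier bound |u| <= M / c0 for an assumed sample problem with c(x) = c0 + x >= c0.
M, c0 = 3.0, 1.0
xs, u = solve_dirichlet(lambda x: c0 + x, lambda x: M * math.cos(5.0 * x), 200)
bound_holds = max(abs(v) for v in u) <= M / c0 + 1e-9
```

The discrete scheme has its own maximum principle when $c\ge0$, which is why the bound holds for the computed values and not only in the limit of fine grids.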
[example: Ordered Boundary Data Give Ordered Solutions] Let $U$ be bounded, let $L$ be uniformly elliptic with $c\ge0$, and suppose $u,v\in C^2(U)\cap C(\overline U)$ solve \begin{align*} Lu=f \quad \text{in } U \end{align*} and \begin{align*} Lv=f \quad \text{in } U, \end{align*} with $u\le v$ on $\partial U$. Set $w=u-v$. Since $C^2(U)\cap C(\overline U)$ is closed under subtraction, $w\in C^2(U)\cap C(\overline U)$. By linearity of $L$, \begin{align*} Lw=L(u-v)=Lu-Lv=f-f=0 \end{align*} in $U$, so in particular $Lw=0\le 0$ in $U$. On the boundary, the assumed order gives \begin{align*} w|_{\partial U}=(u-v)|_{\partial U}\le 0. \end{align*} Applying the *Comparison Principle* to $w$ and $0$ yields \begin{align*} w\le 0 \end{align*} in $U$. Substituting $w=u-v$, we obtain \begin{align*} u-v\le 0, \end{align*} hence \begin{align*} u\le v \end{align*} throughout $U$. Thus increasing the boundary values preserves the order of the corresponding solutions everywhere inside the domain. [/example]

The maximum principle gives strong qualitative information, but it does not yet provide a full representation of solutions. Chapter 3 builds the kernel formulas that complete that picture, showing how Green functions and Poisson kernels encode the same boundary-value problems in integral form.

# 3. Green functions and Poisson kernels

This chapter continues the study of classical boundary value problems for [Laplace's equation](/page/Laplace's%20Equation) and Poisson's equation. Chapter 2 established the maximum principle and uniqueness for Dirichlet problems; the goal here is to build the integral formulas that produce the corresponding solutions when the domain is sufficiently regular. Green functions record the response to an interior point source, while Poisson kernels describe how boundary data propagates harmonically into the domain. 
The chapter develops these objects first as representation tools, then as objects with their own structure: singularity, positivity, symmetry, and probabilistic interpretation through harmonic measure.

## Green Functions and Boundary Value Problems

The Dirichlet problem asks for a function whose Laplacian is prescribed in the interior and whose values are prescribed on the boundary. In all of space, the fundamental solution gives a convolution formula, but a bounded domain introduces boundary correction terms. The Green function is the device that subtracts the unwanted boundary trace of the fundamental solution.

Throughout this chapter the classical Laplacian is the operator \begin{align*} -\Delta:C^2(\Omega)\to C(\Omega), \end{align*} and, when distributions are used, it is the continuous linear operator \begin{align*} -\Delta:\mathcal D'(\Omega)\to\mathcal D'(\Omega) \end{align*} defined by $(-\Delta T)(\phi)=T(-\Delta\phi)$ for $T\in\mathcal D'(\Omega)$ and $\phi\in C_c^\infty(\Omega)$. Thus the identity $-\Delta_xG(x,y)=\delta_y$ means that, for every $\phi\in C_c^\infty(\Omega)$, \begin{align*} \int_\Omega G(x,y)(-\Delta\phi(x))\,d\mathcal L^n(x)=\phi(y). \end{align*} We write $\omega_n=\mathcal L^n(B(0,1))$, so that $\mathcal H^{n-1}(\partial B(0,1))=n\omega_n$.

[definition: Dirichlet Green Function] Let $\Omega \subset \mathbb R^n$ be a bounded domain with $n\ge2$. Let $\Phi:\mathbb R^n\setminus\{0\}\to\mathbb R$ denote the fundamental solution of $-\Delta$ in $\mathbb R^n$, given for $n\ge3$ and $n=2$ respectively by \begin{align*} \Phi(z)=\frac{1}{n(n-2)\omega_n}|z|^{2-n}\quad\text{if }n\ge3, \end{align*} and \begin{align*} \Phi(z)=-\frac{1}{2\pi}\log |z|\quad\text{if }n=2. 
\end{align*} A Dirichlet Green function for $-\Delta$ on $\Omega$ is a function $G: (\Omega \times \Omega) \setminus \{(x,x):x\in\Omega\} \to \mathbb R$ such that, for each fixed $y \in \Omega$, \begin{align*} -\Delta_x G(x,y) = \delta_y \text{ in } \Omega, \end{align*} $G(\cdot,y)$ has zero Dirichlet boundary trace on $\partial\Omega$, and $G(x,y)-\Phi(x-y)$ is harmonic in $x$ on $\Omega$. [/definition] In the classical setting, the zero boundary trace condition is often expressed by saying that $G(\cdot,y)$ extends continuously to $\overline{\Omega}\setminus\{y\}$ with boundary value $0$. Thus $G(\cdot,y)$ has exactly the same singularity as the free-space fundamental solution at $y$, but is adjusted to satisfy the homogeneous boundary condition. The correction is necessary because the free-space convolution generally does not vanish on $\partial\Omega$; for instance, a positive compactly supported source in an interval produces a free-space potential with nonzero endpoint values. If $H(x,y)$ denotes the harmonic correction, then $G(x,y)=\Phi(x-y)-H(x,y)$, with $H(\cdot,y)$ chosen so that $H(x,y)=\Phi(x-y)$ on $\partial\Omega$. [example: Green Function on an Interval] Let $\Omega=(0,L)$ and fix $y\in(0,L)$. Define \begin{align*} G(x,y)=\frac{x(L-y)}{L}\quad\text{for }0<x\le y, \end{align*} and \begin{align*} G(x,y)=\frac{y(L-x)}{L}\quad\text{for }y\le x<L. \end{align*} At $x=y$ the two pieces agree, because both give \begin{align*} G(y,y)=\frac{y(L-y)}{L}. \end{align*} At the endpoints, \begin{align*} G(0,y)=\frac{0\cdot (L-y)}{L}=0 \end{align*} and \begin{align*} G(L,y)=\frac{y(L-L)}{L}=0, \end{align*} so the Dirichlet boundary condition is satisfied. On $(0,y)$ and $(y,L)$ the function is affine in $x$, hence $G''(x,y)=0$ away from $y$. Its one-sided derivatives at the pole are \begin{align*} G_x(y^-,y)=\frac{L-y}{L} \end{align*} and \begin{align*} G_x(y^+,y)=-\frac{y}{L}. 
\end{align*} Therefore \begin{align*} G_x(y^+,y)-G_x(y^-,y)=-\frac{y}{L}-\frac{L-y}{L}=-\frac{L}{L}=-1. \end{align*} To verify the distributional equation, let $\phi\in C_c^\infty(0,L)$. Splitting the integral at $y$ gives \begin{align*} \int_0^L G(x,y)(-\phi''(x))\,dx=\int_0^y G(x,y)(-\phi''(x))\,dx+\int_y^L G(x,y)(-\phi''(x))\,dx. \end{align*} Integrating by parts twice on $(0,y)$ gives \begin{align*} \int_0^y G(x,y)(-\phi''(x))\,dx=-G(y,y)\phi'(y)+G(0,y)\phi'(0)+G_x(y^-,y)\phi(y)-G_x(0^+,y)\phi(0). \end{align*} Integrating by parts twice on $(y,L)$ gives \begin{align*} \int_y^L G(x,y)(-\phi''(x))\,dx=-G(L,y)\phi'(L)+G(y,y)\phi'(y)+G_x(L^-,y)\phi(L)-G_x(y^+,y)\phi(y). \end{align*} Since $\phi$ has compact support in $(0,L)$, the endpoint values $\phi(0)$, $\phi'(0)$, $\phi(L)$, and $\phi'(L)$ vanish. Since $G$ is continuous at $y$, the two terms involving $G(y,y)\phi'(y)$ cancel. Thus \begin{align*} \int_0^L G(x,y)(-\phi''(x))\,dx=\bigl(G_x(y^-,y)-G_x(y^+,y)\bigr)\phi(y). \end{align*} Substituting the derivative jump, \begin{align*} \int_0^L G(x,y)(-\phi''(x))\,dx=\left(\frac{L-y}{L}+\frac{y}{L}\right)\phi(y)=\phi(y). \end{align*} Hence $-G''(\cdot,y)=\delta_y$ in distributions, so this piecewise affine function is the Green function for the interval problem. [/example] The interval formula shows the main mechanism without geometric distractions: away from the pole the Green function solves the homogeneous equation, while the derivative jump carries the point source. To use this object for boundary value problems, the next problem is to recover a sufficiently regular Dirichlet solution from its forcing term and its boundary trace. Green's identity supplies exactly this conversion from differential data to integral data. 
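The interval Green function can be tested directly as a solution operator. The sketch below is a hedged illustration, assuming the representation $u(x)=\int_0^L G(x,y)f(y)\,dy$ for $-u''=f$ with zero Dirichlet data and a plain midpoint rule; the helper names `green_interval` and `solve_by_green`, the number of quadrature cells, and the two sample sources are assumptions made for this example. For $f\equiv1$ the exact solution is $x(L-x)/2$, and for $f(y)=\sin(\pi y)$ on $(0,1)$ it is $\sin(\pi x)/\pi^2$.

```python
import math

def green_interval(x, y, length):
    """Dirichlet Green function of -d^2/dx^2 on (0, length): min(x,y) * (length - max(x,y)) / length."""
    lo, hi = min(x, y), max(x, y)
    return lo * (length - hi) / length

def solve_by_green(f, x, length, n=2000):
    """Midpoint-rule approximation of u(x) = int_0^length G(x, y) f(y) dy."""
    h = length / n
    total = 0.0
    for j in range(n):
        y = (j + 0.5) * h
        total += green_interval(x, y, length) * f(y)
    return total * h

# Constant source on (0, 2): exact solution x (L - x) / 2.
u_const = solve_by_green(lambda y: 1.0, 0.5, 2.0)
exact_const = 0.5 * (2.0 - 0.5) / 2.0

# Sine source on (0, 1): exact solution sin(pi x) / pi^2.
u_sine = solve_by_green(lambda y: math.sin(math.pi * y), 0.3, 1.0)
exact_sine = math.sin(0.3 * math.pi) / math.pi ** 2
```

Writing the kernel as $\min(x,y)\,(L-\max(x,y))/L$ makes the symmetry $G(x,y)=G(y,x)$ of the two-piece formula visible at a glance.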
For a classical solution $u$ of the Dirichlet problem $-\Delta u=f$ in $\Omega$, $u=g$ on $\partial\Omega$, on a bounded $C^2$ domain, Green's representation has the form \begin{align*} u(x)=\int_\Omega G(y,x)f(y)\,d\mathcal L^n(y)+\int_{\partial\Omega} P(x,y)g(y)\,d\mathcal H^{n-1}(y), \end{align*} where $G$ is the Dirichlet Green function and \begin{align*} P(x,y)=-\frac{\partial G}{\partial\nu_y}(y,x) \end{align*} is the Poisson kernel. This is a representation formula in the classical regime, not an automatic existence theorem for rough domains or weak boundary data. The boundary regularity and classical regularity of $u$ are not cosmetic: without enough boundary smoothness, $\partial_{\nu_y}G(y,x)$ may not exist pointwise, and continuous boundary data need not have the naive pointwise trace behaviour expected in the formula. The statement also does not assert existence of $G$ or solvability for arbitrary domains; it is a representation theorem once the classical solution and kernel are available. Its main consequence is to identify the Poisson kernel as the boundary analogue of the Green function: it is minus the outward normal derivative, in the boundary variable, of the interior point-source response. The next section computes it in the model domains where the geometry is explicit.

## Poisson Kernels for the Ball and Half-Space

Explicit kernels answer a concrete problem: given boundary data on a highly symmetric domain, can we write the harmonic extension by a formula involving only the boundary values? For balls and half-spaces, inversion and reflection supply the harmonic correction that makes the Green function vanish on the boundary.

[definition: Poisson Kernel] Let $\Omega\subset\mathbb R^n$ be a domain with Dirichlet Green function $G$. Suppose that the outward normal derivative $\partial_{\nu_y}G(y,x)$ in the boundary variable exists for $x\in\Omega$ and $y\in\partial\Omega$. 
The Poisson kernel of $\Omega$ is the function \begin{align*} P: \Omega\times\partial\Omega \to \mathbb R \end{align*} defined by \begin{align*} P(x,y)=-\frac{\partial G}{\partial \nu_y}(y,x). \end{align*} [/definition] The definition packages the boundary term from Green's formula into a kernel, using the boundary variable as the differentiated variable of $G(\cdot,x)$. It also shows why smoothness matters: on a corner domain the normal derivative may exist only in a weak or nontangential sense, so the displayed pointwise function need not be available. The immediate problem is to compute this normal derivative in domains where the Green function is explicit. The ball is the first model because inversion in a sphere preserves harmonicity in a controlled way and converts the abstract boundary derivative into a concrete density. For the ball, symmetry suggests that boundary mass seen from $x$ should concentrate near the closest boundary points and spread uniformly when $x=0$. The formula below gives that density, proves its normalisation, and records the precise sense in which the Poisson integral recovers continuous boundary data. [illustration:kelvin-inversion-ball] The formula must do three jobs at once: be harmonic in the interior variable, integrate to $1$ against surface measure, and become an approximate identity as $x$ approaches a boundary point. These are exactly the properties needed to turn continuous boundary data into a harmonic solution with the prescribed trace, rather than merely producing a formal normal derivative of $G$. The ball is rigid enough that these requirements determine the scale and angular dependence of the kernel, so the next theorem converts the abstract definition of $P$ into a usable solution formula. 
[quotetheorem:576] [citeproof:576] The $C(\partial B(0,R))$ hypothesis is exactly what the approximate-identity argument uses at the boundary point; if $g$ is the sign of the first coordinate on the unit circle, then approaching a jump point does not recover a continuous boundary value. The restriction $n\ge2$ separates this surface-measure formula from the one-dimensional interval problem, where the boundary consists of two points and the harmonic extension is affine rather than an integral against $\mathcal H^{n-1}$ density on a sphere. The smooth ball geometry is also essential for this closed form: on a square or a domain with a corner, no spherical inversion symmetry is available, and the normal derivative of the Green function typically has corner singularities rather than the displayed expression. The theorem also does not solve Poisson's equation with an interior forcing term, since no Green-volume integral appears here. What the ball formula gives is the qualitative behaviour of harmonic extension: the factor $R^2-|x|^2$ measures distance from the boundary at the level of scale, while the denominator $|x-y|^n$ concentrates mass near boundary points close to $x$. [example: Solving the Dirichlet Problem on a Ball] Let $g(y)=y_1/R$ on $\partial B(0,R)$, and set $u(x)=x_1/R$. For $x=(x_1,\dots,x_n)$, the first derivatives are \begin{align*} \frac{\partial u}{\partial x_1}(x)=\frac{1}{R}\quad\text{and}\quad \frac{\partial u}{\partial x_j}(x)=0\text{ for }2\le j\le n. \end{align*} Differentiating once more gives \begin{align*} \frac{\partial^2 u}{\partial x_j^2}(x)=0\quad\text{for every }1\le j\le n. \end{align*} Therefore \begin{align*} \Delta u(x)=\sum_{j=1}^n\frac{\partial^2 u}{\partial x_j^2}(x)=0, \end{align*} so $u$ is harmonic in $B(0,R)$. If $y\in\partial B(0,R)$, then \begin{align*} u(y)=\frac{y_1}{R}=g(y), \end{align*} so $u$ has the prescribed boundary trace. Define \begin{align*} v(x)=\int_{\partial B(0,R)}P_R(x,y)\frac{y_1}{R}\,d\mathcal H^{n-1}(y). 
\end{align*} By the Poisson kernel theorem for the ball, $v$ is harmonic in $B(0,R)$ and extends continuously to $\overline{B(0,R)}$ with boundary value $g$. Hence $v-u$ is harmonic in $B(0,R)$, continuous on $\overline{B(0,R)}$, and for every $y\in\partial B(0,R)$ satisfies \begin{align*} (v-u)(y)=g(y)-g(y)=0. \end{align*} Applying [uniqueness for the Dirichlet problem](/theorems/33), or equivalently the maximum principle to $v-u$ and $u-v$, gives \begin{align*} v(x)-u(x)=0\quad\text{for every }x\in B(0,R). \end{align*} Thus, for every $x\in B(0,R)$, \begin{align*} \int_{\partial B(0,R)}P_R(x,y)\frac{y_1}{R}\,d\mathcal H^{n-1}(y)=\frac{x_1}{R}. \end{align*} The kernel therefore reproduces this degree-one harmonic polynomial from its boundary trace, which checks the normalisation of $P_R$ against a nonconstant datum. [/example]

The ball computation depends on spherical inversion, but many boundary estimates are local and use a flat model instead. The next problem is therefore to compute the kernel when the boundary is a hyperplane. The method of images gives a formula that becomes the local template for smooth boundaries after flattening.

[quotetheorem:6421] [citeproof:6421]

The boundedness of $g$ keeps the integral controlled over the noncompact boundary: for example, $g(y)=|y|^2$ makes the displayed integral diverge in every dimension, because the kernel decays only like $|y|^{-n}$ on $\mathbb R^{n-1}$, so the formula defines no harmonic extension at all. Continuity at $y_0$ is the hypothesis used for convergence at that point; for a bounded function such as $g(y)=\sin(1/|y-y_0|)$ near $y_0$, the kernel need not recover a pointwise boundary value because there is no boundary value to recover at that point. No non-tangential restriction is needed for bounded continuous data in this theorem, although non-tangential limits become important later for rougher boundary data and almost-everywhere trace theorems. The theorem also does not claim [uniform convergence](/page/Uniform%20Convergence) on all of $\mathbb R^{n-1}$. 
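The role of boundedness can be made concrete in the plane. The sketch below is a hedged illustration, assuming the $n=2$ form of the half-space kernel, $P(x,y)=\dfrac{x_2}{\pi\bigl((x_1-y)^2+x_2^2\bigr)}$ for $x=(x_1,x_2)$ with $x_2>0$ and $y\in\mathbb R$, and a midpoint rule on a truncated interval; the helper names `poisson_half_plane` and `truncated_integral`, the cutoffs, and the cell count are assumptions for this example. For $g\equiv1$ and $x=(0,1)$ the truncated mass has the closed form $\tfrac{2}{\pi}\arctan T$, which tends to $1$, whereas for $g(y)=y^2$ the truncated integral grows roughly linearly in the cutoff $T$.

```python
import math

def poisson_half_plane(x1, x2, y):
    """Upper half-plane Poisson kernel at interior point (x1, x2), x2 > 0, and boundary point y."""
    return x2 / (math.pi * ((x1 - y) ** 2 + x2 ** 2))

def truncated_integral(g, x1, x2, cutoff, n=200_000):
    """Midpoint rule for the integral of P((x1, x2), y) g(y) over -cutoff < y < cutoff."""
    h = 2.0 * cutoff / n
    total = 0.0
    for j in range(n):
        y = -cutoff + (j + 0.5) * h
        total += poisson_half_plane(x1, x2, y) * g(y)
    return total * h

# Bounded datum g = 1: the truncated mass matches (2/pi) * atan(T) and stays below 1.
mass_100 = truncated_integral(lambda y: 1.0, 0.0, 1.0, 100.0)
closed_form = (2.0 / math.pi) * math.atan(100.0)

# Unbounded datum g(y) = y^2: doubling the cutoff roughly doubles the integral.
quad_100 = truncated_integral(lambda y: y * y, 0.0, 1.0, 100.0)
quad_200 = truncated_integral(lambda y: y * y, 0.0, 1.0, 200.0)
```

The near-doubling of `quad_200` relative to `quad_100` is the numerical signature of a divergent integral, in contrast with the convergent total mass.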
The half-space formula is local in spirit: after flattening a smooth boundary near a point, the leading boundary behaviour of many elliptic kernels resembles this kernel. A concrete image construction shows why the sign and the positivity are geometrically natural. [illustration:half-space-method-of-images] [example: Method of Images in a Half-Space] For $n\ge3$ and $y=(y',y_n)\in\mathbb R^n_+$, set \begin{align*} y^*=(y',-y_n). \end{align*} Define, for $x\in\mathbb R^n_+$ with $x\ne y$, \begin{align*} G(x,y)=\frac{1}{n(n-2)\omega_n}\left(|x-y|^{2-n}-|x-y^*|^{2-n}\right). \end{align*} If $x=(x',0)\in\partial\mathbb R^n_+$, then \begin{align*} |x-y|^2=|x'-y'|^2+y_n^2. \end{align*} Also, \begin{align*} |x-y^*|^2=|x'-y'|^2+(-y_n)^2=|x'-y'|^2+y_n^2. \end{align*} Hence $|x-y|=|x-y^*|$, so \begin{align*} G(x,y)=\frac{1}{n(n-2)\omega_n}\left(|x-y|^{2-n}-|x-y|^{2-n}\right)=0. \end{align*} For $x=(x',x_n)\in\mathbb R^n_+$, the reflected pole is farther away than the original pole. Indeed, \begin{align*} |x-y^*|^2-|x-y|^2=\left(|x'-y'|^2+(x_n+y_n)^2\right)-\left(|x'-y'|^2+(x_n-y_n)^2\right). \end{align*} Canceling the common tangential term gives \begin{align*} |x-y^*|^2-|x-y|^2=(x_n+y_n)^2-(x_n-y_n)^2. \end{align*} Expanding both squares, \begin{align*} (x_n+y_n)^2-(x_n-y_n)^2=(x_n^2+2x_ny_n+y_n^2)-(x_n^2-2x_ny_n+y_n^2). \end{align*} Therefore \begin{align*} |x-y^*|^2-|x-y|^2=4x_ny_n>0. \end{align*} Thus $|x-y|<|x-y^*|$. Since $2-n<0$, this implies \begin{align*} |x-y|^{2-n}>|x-y^*|^{2-n}, \end{align*} and hence $G(x,y)>0$ in the half-space away from the pole. Now let $\xi=(\xi',0)\in\partial\mathbb R^n_+$. The outward unit normal is $\nu_\xi=-e_n$, so \begin{align*} -\frac{\partial G}{\partial \nu_\xi}(\xi,y)=\frac{\partial G}{\partial \xi_n}(\xi,y). \end{align*} Put $z=(\xi',t)$ with $t>0$. By the chain rule, \begin{align*} \frac{\partial}{\partial t}|z-y|^{2-n}=(2-n)(t-y_n)\left(|\xi'-y'|^2+(t-y_n)^2\right)^{-n/2}. 
\end{align*} Similarly, \begin{align*} \frac{\partial}{\partial t}|z-y^*|^{2-n}=(2-n)(t+y_n)\left(|\xi'-y'|^2+(t+y_n)^2\right)^{-n/2}. \end{align*} Setting $t=0$ in the first derivative gives \begin{align*} \left.\frac{\partial}{\partial t}|z-y|^{2-n}\right|_{t=0}=(2-n)(-y_n)\left(|\xi'-y'|^2+y_n^2\right)^{-n/2}. \end{align*} Since $(2-n)(-y_n)=(n-2)y_n$, this is \begin{align*} \left.\frac{\partial}{\partial t}|z-y|^{2-n}\right|_{t=0}=(n-2)y_n\left(|\xi'-y'|^2+y_n^2\right)^{-n/2}. \end{align*} For the reflected pole, \begin{align*} \left.\frac{\partial}{\partial t}|z-y^*|^{2-n}\right|_{t=0}=(2-n)y_n\left(|\xi'-y'|^2+y_n^2\right)^{-n/2}. \end{align*} Since $(2-n)y_n=-(n-2)y_n$, this is \begin{align*} \left.\frac{\partial}{\partial t}|z-y^*|^{2-n}\right|_{t=0}=-(n-2)y_n\left(|\xi'-y'|^2+y_n^2\right)^{-n/2}. \end{align*} Therefore \begin{align*} -\frac{\partial G}{\partial \nu_\xi}(\xi,y)=\left.\frac{\partial G}{\partial t}(z,y)\right|_{t=0}. \end{align*} Substituting the two derivative values, \begin{align*} -\frac{\partial G}{\partial \nu_\xi}(\xi,y)=\frac{1}{n(n-2)\omega_n}\left(2(n-2)y_n\left(|\xi'-y'|^2+y_n^2\right)^{-n/2}\right). \end{align*} Canceling the factor $n-2$ gives \begin{align*} -\frac{\partial G}{\partial \nu_\xi}(\xi,y)=\frac{2y_n}{n\omega_n\left(|\xi'-y'|^2+y_n^2\right)^{n/2}}. \end{align*} This is exactly the half-space Poisson kernel with interior point $y$ and boundary point $\xi'$, so the reflected pole both enforces the zero boundary value and produces the positive boundary density. [/example] The Poisson kernel can also be read as harmonic measure, but this requires a change of viewpoint. In smooth domains the boundary representation is integration against a density; in rougher domains a density may not exist, while the representing boundary measure remains meaningful. 
For instance, in planar simply connected domains with highly irregular boundary, such as domains bounded by certain snowflake-type curves, harmonic measure may be singular with respect to arclength measure, so there is no surface-density Poisson kernel comparable to the formulas above. The next definition keeps the representation and discards the assumption that it comes from a kernel.

[definition: Harmonic Measure] Let $\Omega\subset\mathbb R^n$ be a bounded domain, so that $\partial\Omega$ is compact, whose Dirichlet problem is regular for every continuous boundary datum. The harmonic measure is the assignment \begin{align*} \Omega &\to \mathcal P(\partial\Omega), & x&\mapsto \omega^x, \end{align*} where $\mathcal P(\partial\Omega)$ denotes the Borel probability measures on $\partial\Omega$, such that \begin{align*} u(x)=\int_{\partial\Omega} g(y)\,d\omega^x(y) \end{align*} for every $g\in C(\partial\Omega)$, where $u$ is the harmonic solution with boundary trace $g$. [/definition]

When a Poisson kernel exists, harmonic measure has density $P(x,y)$ with respect to surface measure. This language is useful because on rougher domains a density may fail to exist, while the representing measure still captures the boundary values of harmonic functions. It also gives the probabilistic reading of the formulas above: if Brownian motion starts at $x$, then $\omega^x(E)$ is the probability that the first boundary hit lies in the Borel set $E\subset\partial\Omega$. Under this interpretation, the Poisson integral is the expected boundary payoff, so positivity, total mass $1$, and concentration near the closest boundary point are the analytic shadows of a probability distribution on the exit location.

## Symmetry, Positivity, and Singular Structure

The explicit formulas suggest three robust features of Dirichlet Green functions: symmetry in the two variables, positivity inside the domain, and a universal singularity near the diagonal. 
These properties are not accidents of balls and half-spaces; they follow from Green's identities and the maximum principle. [quotetheorem:6422] [citeproof:6422] The $C^2$ and regularity hypotheses are used to make the boundary terms and small-sphere limits in this classical Green-identity proof legitimate. They should be read as proof hypotheses for the pointwise argument above, not as a claim that symmetry always fails below this regularity threshold. In rough domains, symmetry often survives for the weak Green function because the Dirichlet Laplacian is self-adjoint, but the proof has to be formulated through the energy pairing rather than pointwise normal derivatives. A specific failure of the classical argument occurs in a polygonal domain: at a corner there is no single outward unit normal, so the boundary integral produced by Green's identity cannot be interpreted as a pointwise normal-derivative term along the whole boundary. Reentrant corners can also produce Green functions whose gradients have singular boundary behaviour, so the integration-by-parts statement must be replaced by a weak formulation. The theorem does not say that the Poisson kernel is symmetric, since its variables live in different spaces: one variable is interior and the other is on the boundary. It does, however, justify rewriting the earlier representation formulas with $G(x,y)$ in place of $G(y,x)$ whenever the hypotheses of this symmetry theorem are in force. Symmetry means that the influence of a point source at $y$ observed at $x$ equals the influence of a point source at $x$ observed at $y$, which is the kernel expression of self-adjointness for the Dirichlet Laplacian. The next question is whether the kernel also preserves the order structure predicted by the maximum principle. [quotetheorem:6423] [citeproof:6423] Connectedness is needed for strict positivity: on a disconnected domain, the Green function for a pole in one component is identically zero on the other components. 
The maximum-principle hypothesis includes the boundary comparison principle on each punctured domain $\Omega\setminus\overline{B}(y,\varepsilon)$ and the strong maximum principle in its interior; in the classical setting these follow, for example, when the punctured domains are regular enough for harmonic functions continuous up to the boundary pieces under consideration. The theorem also does not assert boundary positivity, since $G(\cdot,y)$ has zero Dirichlet trace on $\partial\Omega$. Positivity is the analytic reason that positive sources and positive boundary data produce positive solutions. It also gives the nonnegativity of the Poisson kernel: $G(\cdot,y)$ is positive inside $\Omega$ and vanishes on $\partial\Omega$, so its outward normal derivative is nonpositive wherever it exists, and the Hopf boundary point lemma upgrades this to strict positivity where its hypotheses hold. [remark: Sign Convention] These notes use the Green function for $-\Delta$, so the fundamental solution is positive near its pole (everywhere when $n\ge 3$) and the Poisson kernel is $P(x,y)=-\partial_{\nu_y}G(y,x)$, equivalently $P(x,y)=-\partial_{\nu_y}G(x,y)$ after symmetry is available. With the opposite convention for $\Delta$, the signs in the representation formula change. Keeping the operator sign fixed avoids sign changes between maximum principles and Green representations. [/remark] The sign convention controls positivity, but it does not yet describe how large the Green function becomes near its pole. The final structural problem is to separate the universal local singularity from the domain-dependent harmonic correction. This separation is what lets local estimates use the fundamental solution while boundary effects remain encoded in a smoother term. [quotetheorem:6424] [citeproof:6424] The boundary regularity and existence assumptions are needed to interpret the harmonic correction with the stated boundary trace. 
For example, in a punctured disk the removed interior point is not a regular boundary point for the classical Dirichlet problem in dimension two: continuous boundary data assigned independently at the puncture cannot in general be recovered by a harmonic function continuous up to that point. In a cusp domain, the harmonic correction may still exist in a weak sense, but its boundary trace need not be the classical pointwise trace used above. The theorem also does not give uniform estimates up to the boundary, because the correction term depends on the global geometry of $\Omega$. Its value is the separation of local analysis from global geometry: estimates near the pole use the fundamental solution, while estimates near the boundary depend on the harmonic correction and on the geometry of $\partial\Omega$. [example: Green Function in the Unit Disk] For the unit disk $D\subset\mathbb R^2\cong\mathbb C$ and a fixed point $w\in D$, define \begin{align*} G(z,w)=-\frac{1}{2\pi}\log\left|\frac{z-w}{1-\overline{w}z}\right|, \qquad z\in D,\ z\ne w. \end{align*} Equivalently, \begin{align*} G(z,w)=-\frac{1}{2\pi}\log|z-w|+\frac{1}{2\pi}\log|1-\overline{w}z|. \end{align*} The factor $1-\overline{w}z$ never vanishes in $D$, because $1-\overline{w}z=0$ would imply \begin{align*} 1=|\overline{w}z|=|w|\,|z|<1, \end{align*} a contradiction. Hence $z\mapsto \log|1-\overline{w}z|$ is harmonic in $D$, while $z\mapsto -\frac{1}{2\pi}\log|z-w|$ is the two-dimensional fundamental solution of $-\Delta$ with pole at $w$. Therefore \begin{align*} -\Delta_zG(z,w)=\delta_w \end{align*} in distributions on $D$. It remains to check the boundary trace. If $|z|=1$, then $\overline z z=1$, and \begin{align*} 1-\overline{w}z=z\overline z-\overline{w}z=z(\overline z-\overline w)=z\,\overline{(z-w)}. \end{align*} Taking moduli gives \begin{align*} |1-\overline{w}z|=|z|\,|\overline{(z-w)}|=1\cdot |z-w|=|z-w|. 
\end{align*} Thus, for $|z|=1$, \begin{align*} \left|\frac{z-w}{1-\overline{w}z}\right|=\frac{|z-w|}{|1-\overline{w}z|}=1, \end{align*} and so \begin{align*} G(z,w)=-\frac{1}{2\pi}\log 1=0. \end{align*} Near the pole $z=w$, the denominator satisfies \begin{align*} 1-\overline{w}z\to 1-\overline{w}w=1-|w|^2>0, \end{align*} so the singular part of $G(z,w)$ is exactly \begin{align*} -\frac{1}{2\pi}\log|z-w|. \end{align*} The denominator is therefore the harmonic correction that cancels the boundary value, while the numerator carries the logarithmic pole at $w$. [/example] The disk formula is the two-dimensional analogue of the image construction for balls and half-spaces. It also shows how conformal geometry enters Green functions in dimension two, a theme that belongs more naturally to complex analysis but provides a useful check on the general theory. [explanation: From Kernels to Variational Methods] Green functions give pointwise formulas, while the variational theory developed later gives existence and estimates in Sobolev spaces. The connection is that both are representations of the inverse Dirichlet Laplacian, a viewpoint that Chapter 9 will recast spectrally through the compact resolvent. In smooth domains and for regular data, solving $-\Delta u=f$ with zero boundary trace can be written as $u(x)=\int_\Omega G(x,y)f(y)\,d\mathcal L^n(y)$; in the variational setting, the same solution is constructed as the element $u\in H^1_0(\Omega)$ satisfying \begin{align*} \int_\Omega \nabla u\cdot\nabla v\,d\mathcal L^n=\int_\Omega f v\,d\mathcal L^n \end{align*} for every $v\in H^1_0(\Omega)$. The symmetry and positivity proved here reappear as symmetry and coercivity of the Dirichlet energy [bilinear form](/page/Bilinear%20Form). 
Harmonic measure also gives an operational bridge: in a smooth domain it has density $P(x,y)$ with respect to $d\mathcal H^{n-1}(y)$, while in probabilistic language it is the exit distribution of Brownian motion, so the same boundary-value solution can be read as an integral kernel formula, a representing measure, or an expected boundary payoff. [/explanation] The classical kernel representation is powerful, but it still relies on smoothness that is too restrictive for many elliptic problems. Chapter 4 replaces pointwise derivatives and classical boundaries with Sobolev spaces, where weak derivatives and traces make the same ideas usable in a far broader setting. # 4. Sobolev framework for weak elliptic equations This chapter supplies the functional-analytic language used for the weak theory of elliptic equations. Chapters 1 through 3 treated Laplace-type equations through classical derivatives, boundary values, maximum principles, and kernels; here the same problems are recast in spaces where first derivatives may exist only after integration by parts. The main questions are how to define derivatives for rough functions, how to encode boundary data without pointwise traces, and which compactness estimates make variational methods possible. The prerequisites are multivariable integration, basic $L^p$ spaces, integration by parts for smooth functions, and the elementary language of normed and Hilbert spaces. ## Weak Derivatives and Distributions Classical derivatives are too restrictive for the energy method: minimizers of integral functionals often arise first as $L^p$ limits, and their pointwise differentiability is not known in advance. The replacement is to move derivatives onto smooth test functions by integration by parts, so that rough functions can still carry derivative information. [definition: Test Function Space] Let $U \subset \mathbb R^n$ be open. The test function space on $U$ is \begin{align*} \mathcal D(U) := C_c^\infty(U). 
\end{align*} [/definition] Test functions are local probes: because their support stays away from the boundary, integration by parts introduces no boundary term. The next object must be broad enough to include rough limits, point masses, and derivatives of nonsmooth functions even when no pointwise formula exists. The right replacement is therefore not a function with values at points, but a rule that assigns a number to every test function in a linear and continuous way. [definition: Distribution] Let $U \subset \mathbb R^n$ be open. A distribution on $U$ is a continuous linear functional $T: \mathcal D(U) \to \mathbb R$. The space of distributions on $U$ is denoted $\mathcal D'(U)$. [/definition] Distributions extend the class of functions, but weak PDEs still begin with ordinary integrable functions. To compare the new language with the old one, an ordinary locally integrable function must determine the same kind of object by measuring its average against each compactly supported smooth probe. [definition: Regular Distribution] Let $f \in L^1_{\mathrm{loc}}(U)$. The [regular distribution](/page/Regular%20Distribution) associated to $f$ is the distribution $T_f \in \mathcal D'(U)$ defined by \begin{align*} T_f(\phi) := \int_U f\phi\,d\mathcal L^n, \qquad \phi \in \mathcal D(U). \end{align*} [/definition] Regular distributions keep functions visible inside the larger distribution space. The remaining obstacle is differentiation: a rough object may have no classical derivative, but integration by parts suggests that the derivative can be defined indirectly by letting the smooth test function absorb the derivative and by recording the resulting sign. [definition: Distributional Derivative] Let $T \in \mathcal D'(U)$ and let $\alpha$ be a multi-index. The [distributional derivative](/page/Distributional%20Derivative) $D^\alpha T \in \mathcal D'(U)$ is defined by \begin{align*} D^\alpha T(\phi) := (-1)^{|\alpha|}T(D^\alpha \phi), \qquad \phi \in \mathcal D(U). 
\end{align*} [/definition] Distributional derivatives may be singular, but elliptic energy spaces require derivatives represented by functions. This motivates the [weak derivative](/page/Weak%20Derivative), where the distributional derivative of a locally integrable function is again represented by a locally integrable function. [definition: Weak Derivative] Let $u, v \in L^1_{\mathrm{loc}}(U)$ and let $\alpha$ be a multi-index. The function $v$ is the weak derivative $D^\alpha u$ if \begin{align*} \int_U u D^\alpha \phi\,d\mathcal L^n = (-1)^{|\alpha|}\int_U v\phi\,d\mathcal L^n \end{align*} for every $\phi \in C_c^\infty(U)$. [/definition] The definition records exactly the identity satisfied by classical derivatives, but it asks only for integrability. The first consistency check is that no information is lost when the classical derivative already exists. [quotetheorem:6425] [citeproof:6425] This compatibility lets us keep classical notation while working in a larger class. The hypotheses matter: the argument needs enough classical differentiability to integrate by parts and enough local integrability for both sides of the weak identity to make sense. The result does not say that every weak derivative is classical, nor does it recover pointwise differentiability at exceptional points. Its role is one-way consistency, ensuring that weak theory extends the classical theory rather than replacing it with a different notion. A first example shows why the enlargement is necessary even in one dimension. [example: Weak Derivative of an Absolute Value Cusp] Let $U=(-1,1)$ and $u(x)=|x|$. Define $v(x)=-1$ for $-1<x<0$ and $v(x)=1$ for $0<x<1$; the value of $v(0)$ is irrelevant because changing a function at one point does not change its integral. For $\phi\in C_c^\infty((-1,1))$, split the integral at the cusp: \begin{align*} \int_{-1}^1 |x|\phi'(x)\,dx=\int_{-1}^0 (-x)\phi'(x)\,dx+\int_0^1 x\phi'(x)\,dx. 
\end{align*} On $(-1,0)$, integration by parts gives \begin{align*} \int_{-1}^0 (-x)\phi'(x)\,dx=\left[-x\phi(x)\right]_{-1}^0+\int_{-1}^0 \phi(x)\,dx. \end{align*} Since $\phi$ has compact support in $(-1,1)$, $\phi$ vanishes near $-1$, and hence the boundary term equals $0\cdot\phi(0)-1\cdot\phi(-1)=0$. Therefore \begin{align*} \int_{-1}^0 (-x)\phi'(x)\,dx=\int_{-1}^0 \phi(x)\,dx. \end{align*} On $(0,1)$, integration by parts gives \begin{align*} \int_0^1 x\phi'(x)\,dx=\left[x\phi(x)\right]_0^1-\int_0^1 \phi(x)\,dx. \end{align*} Again $\phi$ vanishes near $1$, so the boundary term equals $1\cdot\phi(1)-0\cdot\phi(0)=0$. Thus \begin{align*} \int_0^1 x\phi'(x)\,dx=-\int_0^1 \phi(x)\,dx. \end{align*} Combining the two pieces, \begin{align*} \int_{-1}^1 |x|\phi'(x)\,dx=\int_{-1}^0 \phi(x)\,dx-\int_0^1 \phi(x)\,dx. \end{align*} Because $v=-1$ on $(-1,0)$ and $v=1$ on $(0,1)$, the right-hand side is \begin{align*} -\left(\int_{-1}^0 (-1)\phi(x)\,dx+\int_0^1 1\cdot\phi(x)\,dx\right)=-\int_{-1}^1 v(x)\phi(x)\,dx. \end{align*} This is exactly the weak derivative identity for $\alpha=1$, so $D u=v$ weakly even though $|x|$ has no classical derivative at $0$. [/example] The cusp example also warns that weak derivatives ignore changes on sets of measure zero. Sobolev spaces therefore treat functions as equivalence classes, with differentiability measured by the integrability of all weak derivatives up to a specified order. ## Sobolev Spaces and Energy Spaces Elliptic equations of order two naturally involve first derivatives in the energy and second derivatives only after integration by parts. The right spaces must control derivatives in $L^p$, and the Hilbert case $p=2$ is especially important because it supports projection, coercivity, and variational arguments. [definition: Sobolev Space] Let $U \subset \mathbb R^n$ be open, let $k \in \mathbb N$, and let $1 \le p \le \infty$. 
The Sobolev space $W^{k,p}(U)$ consists of all $u \in L^p(U)$ such that $D^\alpha u \in L^p(U)$ for every multi-index $\alpha$ with $|\alpha| \le k$. For $1 \le p < \infty$, its norm is \begin{align*} \|u\|_{W^{k,p}(U)} := \left(\sum_{|\alpha|\le k}\|D^\alpha u\|_{L^p(U)}^p\right)^{1/p}. \end{align*} For $p=\infty$, its norm is \begin{align*} \|u\|_{W^{k,\infty}(U)} := \max_{|\alpha|\le k}\|D^\alpha u\|_{L^\infty(U)}. \end{align*} [/definition] The Sobolev norm combines size and differentiability, and different choices of $k$ and $p$ emphasize different analytic features. In elliptic boundary-value problems, one does not usually need all orders of weak differentiability at once; the natural energy measures the function itself and its first weak derivatives in the same square-integrable scale. This special case deserves its own notation because it is the setting in which Dirichlet energy, [weak convergence](/page/Weak%20Convergence), and variational formulations become Hilbert-space arguments. The next definition isolates that energy space by naming $W^{1,2}(U)$ as $H^1(U)$ and writing its norm in terms of the $L^2$ size of $u$ and the $L^2$ size of its gradient. [definition: The Space H One] Let $U \subset \mathbb R^n$ be open. The space $H^1(U)$ is $W^{1,2}(U)$, equipped with \begin{align*} \|u\|_{H^1(U)}^2 := \|u\|_{L^2(U)}^2 + \|\nabla u\|_{L^2(U)}^2, \end{align*} where \begin{align*} \|\nabla u\|_{L^2(U)}^2 := \sum_{i=1}^n \|\partial_i u\|_{L^2(U)}^2. \end{align*} [/definition] Completeness is essential because weak solutions are often obtained as limits of approximating sequences. The question is whether a [Cauchy sequence](/page/Cauchy%20Sequence) whose functions and weak derivatives converge in the relevant $L^p$ norms has a limit that still possesses those weak derivatives, so that approximation procedures do not leave the Sobolev space. 
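A numerical illustration of this question, offered as a sketch: the smooth functions $u_\delta(x)=\sqrt{x^2+\delta^2}$ converge in $H^1((-1,1))$ to $|x|$, whose weak derivative was computed in the cusp example, so the limit leaves $C^1$ but stays inside the Sobolev space. The name `h1_distance_sq`, the midpoint rule, and the grid size are illustrative assumptions.

```python
import math

def h1_distance_sq(delta, cells=20000):
    """Midpoint-rule approximation of ||u_delta - |x|||^2 in H^1((-1, 1)),
    where u_delta(x) = sqrt(x^2 + delta^2)."""
    h = 2.0 / cells
    total = 0.0
    for i in range(cells):
        x = -1.0 + (i + 0.5) * h          # midpoints never hit x = 0
        root = math.sqrt(x * x + delta * delta)
        value_gap = root - abs(x)          # difference of the functions
        slope_gap = x / root - math.copysign(1.0, x)  # difference of derivatives
        total += (value_gap ** 2 + slope_gap ** 2) * h
    return total

deltas = [1.0 / 4 ** k for k in range(4)]   # 1, 1/4, 1/16, 1/64
distances = [h1_distance_sq(d) for d in deltas]

for d, dist_sq in zip(deltas, distances):
    print(f"delta = {d:.5f}   squared H^1 distance ~ {dist_sq:.5f}")
```

The squared distances shrink as $\delta$ decreases, so the sequence is Cauchy in $H^1$ and its limit is a function with a cusp, not a $C^1$ function.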
[quotetheorem:6426] [citeproof:6426] The Banach and Hilbert structures describe interior control and justify taking limits of approximate solutions inside the same function space. Completeness would fail if the candidate weak derivatives were not required to be represented by $L^p$ functions: a Cauchy sequence of distributional derivatives could converge only distributionally, outside the intended energy class. The theorem also does not provide compactness in the Sobolev norm; it gives a complete ambient space, while compactness requires additional domain hypotheses and weaker target norms. Dirichlet problems also require a way to impose vanishing boundary data. Since arbitrary $H^1$ functions do not come with pointwise boundary values, we define the homogeneous space through approximation by smooth functions supported away from the boundary. [definition: The Space H One Zero] Let $U \subset \mathbb R^n$ be open. The space $H^1_0(U)$ is the closure of $C_c^\infty(U)$ in $H^1(U)$. [/definition] This definition avoids pointwise boundary values, which may not exist for an arbitrary $H^1$ function. It says that functions in $H^1_0(U)$ can be approximated in energy by smooth functions that vanish near the boundary. [example: Zero Boundary Data Without Pointwise Values] Let $U=(0,1)$ and $u(x)=x(1-x)$. We construct smooth compactly supported functions converging to $u$ in $H^1((0,1))$, which proves $u\in H^1_0((0,1))$ by the definition of $H^1_0$. Choose $\theta\in C^\infty(\mathbb R)$ with $0\le \theta\le 1$, with $\theta(t)=0$ for $t\le 1$, and with $\theta(t)=1$ for $t\ge 2$. For $0<\varepsilon<1/4$, define \begin{align*} \chi_\varepsilon(x):=\theta(x/\varepsilon)\theta((1-x)/\varepsilon). \end{align*} Set \begin{align*} u_\varepsilon(x):=\chi_\varepsilon(x)u(x). \end{align*} Then $u_\varepsilon\in C_c^\infty((0,1))$, since $\chi_\varepsilon=0$ on $(0,\varepsilon]$ and on $[1-\varepsilon,1)$, while $\chi_\varepsilon=1$ on $[2\varepsilon,1-2\varepsilon]$. 
We show that $u_\varepsilon\to u$ in $H^1((0,1))$. Since \begin{align*} u_\varepsilon-u=(\chi_\varepsilon-1)u, \end{align*} the difference is supported in $(0,2\varepsilon)\cup(1-2\varepsilon,1)$. On $(0,2\varepsilon)$ one has $0\le u(x)=x(1-x)\le x$, and on $(1-2\varepsilon,1)$ one has $0\le u(x)=x(1-x)\le 1-x$. Therefore \begin{align*} \|u_\varepsilon-u\|_{L^2(0,1)}^2 \le \int_0^{2\varepsilon} x^2\,dx+\int_{1-2\varepsilon}^1 (1-x)^2\,dx. \end{align*} The two integrals are \begin{align*} \int_0^{2\varepsilon}x^2\,dx=\frac{(2\varepsilon)^3}{3}. \end{align*} With $y=1-x$, \begin{align*} \int_{1-2\varepsilon}^1(1-x)^2\,dx=\int_0^{2\varepsilon}y^2\,dy=\frac{(2\varepsilon)^3}{3}. \end{align*} Thus \begin{align*} \|u_\varepsilon-u\|_{L^2(0,1)}^2 \le \frac{16}{3}\varepsilon^3. \end{align*} For the derivative, \begin{align*} (u_\varepsilon-u)'=(\chi_\varepsilon-1)u'+\chi_\varepsilon'u. \end{align*} Since $u'(x)=1-2x$, we have $|u'(x)|\le 1$ on $(0,1)$. Hence \begin{align*} \|(\chi_\varepsilon-1)u'\|_{L^2(0,1)}^2 \le \int_0^{2\varepsilon}1\,dx+\int_{1-2\varepsilon}^1 1\,dx=4\varepsilon. \end{align*} Let $M=\|\theta'\|_{L^\infty(\mathbb R)}$. Differentiating $\chi_\varepsilon$ gives \begin{align*} \chi_\varepsilon'(x)=\frac{1}{\varepsilon}\theta'(x/\varepsilon)\theta((1-x)/\varepsilon)-\frac{1}{\varepsilon}\theta(x/\varepsilon)\theta'((1-x)/\varepsilon). \end{align*} Because $0\le \theta\le 1$ and $|\theta'|\le M$, this implies \begin{align*} |\chi_\varepsilon'(x)|\le \frac{2M}{\varepsilon}. \end{align*} Using the same support and endpoint bounds for $u$, \begin{align*} \|\chi_\varepsilon'u\|_{L^2(0,1)}^2 \le \frac{4M^2}{\varepsilon^2}\left(\int_0^{2\varepsilon}x^2\,dx+\int_{1-2\varepsilon}^1(1-x)^2\,dx\right). \end{align*} Substituting the computed integrals, \begin{align*} \|\chi_\varepsilon'u\|_{L^2(0,1)}^2 \le \frac{64M^2}{3}\varepsilon. 
\end{align*} By $(a+b)^2\le 2a^2+2b^2$, \begin{align*} \|(u_\varepsilon-u)'\|_{L^2(0,1)}^2 \le 2\|(\chi_\varepsilon-1)u'\|_{L^2(0,1)}^2+2\|\chi_\varepsilon'u\|_{L^2(0,1)}^2. \end{align*} Therefore \begin{align*} \|(u_\varepsilon-u)'\|_{L^2(0,1)}^2 \le 8\varepsilon+\frac{128M^2}{3}\varepsilon. \end{align*} Combining the $L^2$ estimate and the derivative estimate, \begin{align*} \|u_\varepsilon-u\|_{H^1(0,1)}^2 \le \frac{16}{3}\varepsilon^3+8\varepsilon+\frac{128M^2}{3}\varepsilon. \end{align*} The right-hand side tends to $0$ as $\varepsilon\to 0$, so $u_\varepsilon\to u$ in $H^1((0,1))$. Thus $u$ lies in the $H^1$-closure of $C_c^\infty((0,1))$, meaning $u\in H^1_0((0,1))$. The boundary condition is encoded by energy approximation; for this continuous representative it is also reflected by the endpoint equalities $u(0)=u(1)=0$. [/example] The closure definition is robust, but for nonzero boundary data we need a separate mechanism for taking boundary values. That mechanism is the trace operator. ## Trace and Boundary Data A weak formulation involves test functions in the interior, so boundary values cannot be read directly from the integration-by-parts identity. The problem is to assign a boundary value to an $H^1$ function in a way that is continuous under $H^1$ convergence and agrees with classical restriction for smooth functions. [definition: $C^1$ Domain] An open set $U \subset \mathbb R^n$ is a $C^1$ domain if for every $x_0 \in \partial U$ there are $r>0$, a rigid motion of $\mathbb R^n$, and a $C^1$ function $\gamma: \mathbb R^{n-1}\to \mathbb R$ such that, in the transformed coordinates, \begin{align*} U\cap B(x_0,r)=\{(x',x_n)\in B(x_0,r): x_n>\gamma(x')\}. \end{align*} [/definition] The $C^1$ condition gives enough geometric regularity to flatten the boundary without losing control of first derivatives. 
The obstruction is that an arbitrary $H^1(U)$ function is an equivalence class in the interior, so its boundary values cannot be obtained by pointwise restriction; they must be assigned by a continuous operator that agrees with classical restriction on smooth functions. [quotetheorem:60] [citeproof:60] The theorem explains the weak meaning of Dirichlet data, but it is also a geometric statement about the boundary. The $C^1$ regularity gives charts in which the boundary can be flattened and controlled uniformly; on rougher domains, a continuous trace map may fail to exist, or the equality between zero trace and closure of $C_c^\infty(U)$ may no longer hold in this form. The operator does not choose pointwise boundary values of an arbitrary representative; for $p=2$, it assigns an $L^2(\partial U,\mathcal H^{n-1})$ boundary class. If $g$ is prescribed on the boundary and there is $G\in H^1(U)$ with $\operatorname{Tr}G=g$, then a solution with boundary data $g$ is sought in the affine space $G+H^1_0(U)$. [example: Extension from a $C^1$ Domain] Let $U\subset \mathbb R^n$ be a bounded $C^1$ domain and let $g$ be a boundary class in the range of the trace map $\operatorname{Tr}:H^1(U)\to L^2(\partial U)$. Choose $G\in H^1(U)$ such that \begin{align*} \operatorname{Tr}G=g. \end{align*} A weak solution with boundary value $g$ is a function $u\in H^1(U)$ with $\operatorname{Tr}u=g$ satisfying \begin{align*} \int_U \nabla u\cdot\nabla v\,d\mathcal L^n=\int_U fv\,d\mathcal L^n \end{align*} for every $v\in H^1_0(U)$. Set \begin{align*} w:=u-G. \end{align*} Linearity of the trace gives \begin{align*} \operatorname{Tr}w=\operatorname{Tr}u-\operatorname{Tr}G=g-g=0. \end{align*} Hence $w\in H^1_0(U)$, using the zero-trace characterization in the *Trace Theorem*. Since $u=G+w$, linearity of the weak gradient gives \begin{align*} \nabla u=\nabla G+\nabla w. 
\end{align*} Therefore, for every $v\in H^1_0(U)$, \begin{align*} \int_U \nabla u\cdot\nabla v\,d\mathcal L^n=\int_U (\nabla G+\nabla w)\cdot\nabla v\,d\mathcal L^n. \end{align*} Distributing the dot product and using linearity of the integral, \begin{align*} \int_U (\nabla G+\nabla w)\cdot\nabla v\,d\mathcal L^n=\int_U \nabla G\cdot\nabla v\,d\mathcal L^n+\int_U \nabla w\cdot\nabla v\,d\mathcal L^n. \end{align*} Thus the original weak equation is equivalent to \begin{align*} \int_U \nabla G\cdot\nabla v\,d\mathcal L^n+\int_U \nabla w\cdot\nabla v\,d\mathcal L^n=\int_U fv\,d\mathcal L^n. \end{align*} Subtracting the known term involving $G$ gives the homogeneous-boundary problem for $w$: \begin{align*} \int_U \nabla w\cdot\nabla v\,d\mathcal L^n=\int_U fv\,d\mathcal L^n-\int_U \nabla G\cdot\nabla v\,d\mathcal L^n \end{align*} for every $v\in H^1_0(U)$. Conversely, if $w\in H^1_0(U)$ satisfies this last identity and $u:=G+w$, then \begin{align*} \operatorname{Tr}u=\operatorname{Tr}G+\operatorname{Tr}w=g+0=g. \end{align*} Adding $\int_U \nabla G\cdot\nabla v\,d\mathcal L^n$ to both sides recovers \begin{align*} \int_U \nabla u\cdot\nabla v\,d\mathcal L^n=\int_U fv\,d\mathcal L^n. \end{align*} So nonhomogeneous boundary data are absorbed into the chosen extension $G$, and the remaining unknown $w$ lies in the homogeneous space $H^1_0(U)$ with a modified forcing functional. [/example] Trace theory also clarifies why boundary regularity assumptions appear in elliptic estimates. Without enough control of the boundary geometry, restriction to $\partial U$ may fail to behave continuously with respect to the interior $H^1$ norm. [remark: Boundary Data Are Not Pointwise Data] For $u\in H^1(U)$, the expression $u|_{\partial U}$ means $\operatorname{Tr}u$, not pointwise restriction of an arbitrary representative. Two functions that agree a.e. in $U$ have the same trace. Boundary data in weak Dirichlet problems are therefore imposed in the trace sense. 
[/remark] With trace-zero boundary data now identified with $H^1_0(U)$, the next estimates explain why this homogeneous space is the natural test space. On this space, the gradient alone controls the whole $H^1$ norm on bounded domains. ## Poincare and Sobolev Inequalities Variational methods require coercive estimates: the energy should control the size of a function, not merely its derivatives. Constants obstruct this on $H^1(U)$, since a nonzero constant has zero gradient, so the boundary condition or a zero-average condition must remove that obstruction. [quotetheorem:75] [citeproof:75] Poincare's inequality says that on $H^1_0(U)$ the seminorm $\|\nabla u\|_{L^2(U)}$ is equivalent to the full $H^1$ norm. The boundary condition is essential: on $H^1(U)$ every nonzero constant has zero gradient, so no estimate of this form can hold. Boundedness is also essential, since on unbounded domains a function can spread out while keeping its gradient small relative to its $L^2$ norm. Connectedness rules out separate constant modes when one uses related zero-average variants of the inequality. This is the estimate that turns the Dirichlet energy into a coercive quadratic form. [example: Coercivity of the Dirichlet Energy] Let $U\subset \mathbb R^n$ be bounded, connected, and Lipschitz, and define \begin{align*} B[u,v]=\int_U \nabla u\cdot \nabla v\,d\mathcal L^n \end{align*} for $u,v\in H^1_0(U)$. Fix $u\in H^1_0(U)$, and let $C_U$ be a Poincare constant from the *Poincare Inequality*. Then \begin{align*} B[u,u]=\int_U \nabla u\cdot \nabla u\,d\mathcal L^n. \end{align*} Since $\nabla u\cdot \nabla u=|\nabla u|^2$ a.e. in $U$, this becomes \begin{align*} B[u,u]=\int_U |\nabla u|^2\,d\mathcal L^n. \end{align*} By the definition of the $L^2$ norm of the weak gradient, \begin{align*} B[u,u]=\|\nabla u\|_{L^2(U)}^2. \end{align*} Poincare's inequality gives \begin{align*} \|u\|_{L^2(U)}\le C_U\|\nabla u\|_{L^2(U)}. 
\end{align*} Both sides are nonnegative, so squaring preserves the inequality: \begin{align*} \|u\|_{L^2(U)}^2\le C_U^2\|\nabla u\|_{L^2(U)}^2. \end{align*} Using the definition of the $H^1$ norm, \begin{align*} \|u\|_{H^1(U)}^2=\|u\|_{L^2(U)}^2+\|\nabla u\|_{L^2(U)}^2. \end{align*} Substituting the Poincare bound for the first term gives \begin{align*} \|u\|_{H^1(U)}^2\le C_U^2\|\nabla u\|_{L^2(U)}^2+\|\nabla u\|_{L^2(U)}^2. \end{align*} Factoring the right-hand side yields \begin{align*} \|u\|_{H^1(U)}^2\le (1+C_U^2)\|\nabla u\|_{L^2(U)}^2. \end{align*} Since $\|\nabla u\|_{L^2(U)}^2=B[u,u]$, we have \begin{align*} \|u\|_{H^1(U)}^2\le (1+C_U^2)B[u,u]. \end{align*} Dividing by $1+C_U^2>0$ gives \begin{align*} B[u,u]\ge \frac{1}{1+C_U^2}\|u\|_{H^1(U)}^2. \end{align*} Thus the Dirichlet energy is coercive on $H^1_0(U)$: the zero-boundary condition removes the constant modes, so the gradient term controls the full $H^1$ norm. [/example] The next family of estimates controls higher integrability from derivative information. These estimates are indispensable when lower-order terms or nonlinearities require functions to belong to spaces beyond $L^2$. [quotetheorem:61] [citeproof:61] The whole-space inequality gives the model estimate, and the exponent $p^*$ is forced by scaling: it is the only exponent for which both sides of the inequality scale identically under the dilations $x\mapsto\lambda x$, and the restriction $p<n$ is what keeps it finite and positive. The theorem does not imply boundedness of $u$ when $p<n$, and the remaining ranges require separate statements: exponential integrability at the borderline $p=n$ (which includes $H^1$ in dimension two) and Morrey-type Hölder estimates for $p>n$. For elliptic boundary value problems, the estimate must also be transported from $\mathbb R^n$ to domains. To use the estimate there, we combine extension from Lipschitz domains with restriction back to $U$. [quotetheorem:903] [citeproof:903] The embeddings quantify how much integrability a first derivative buys. 
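For orientation, the gain can be tabulated from the critical exponent $p^*=np/(n-p)$. The formula is assumed here in its standard form, since the quoted statements are not reproduced in this passage; the helper name `sobolev_conjugate` is hypothetical.

```python
from fractions import Fraction

def sobolev_conjugate(n, p):
    """Critical exponent p* = n p / (n - p); assumed standard form, needs 1 <= p < n."""
    p = Fraction(p)
    if not (1 <= p < n):
        raise ValueError("the critical exponent requires 1 <= p < n")
    return n * p / (n - p)

# Exact rational values for a few dimensions and integrability exponents.
table = {(n, p): sobolev_conjugate(n, p) for n in (3, 4, 6) for p in (1, 2)}

for (n, p), p_star in sorted(table.items()):
    print(f"n = {n}, p = {p}:  p* = {p_star}")
```

For $p=2$ the exponent is $6$ in dimension three and $4$ in dimension four, and the formula has no finite value once $p$ reaches $n$, which is why the function rejects that case.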
The endpoint behavior is delicate: in dimension two there is no embedding into $L^\infty(U)$ from $H^1(U)$ alone. [example: A Two Dimensional Function in Every Finite L P But Not Bounded] Choose $\eta\in C_c^\infty(B(0,1/2))$ with $0\le \eta\le 1$ and $\eta(x)=1$ for $|x|<1/4$, and define \begin{align*} u(x):=\eta(x)\log\log(e/|x|) \end{align*} for $0<|x|<1/2$, with any value assigned at $x=0$. Since $\eta=1$ near $0$, if $x_m\to 0$ and $x_m\ne 0$, then $e/|x_m|\to\infty$, hence $\log(e/|x_m|)\to\infty$, and therefore \begin{align*} u(x_m)=\log\log(e/|x_m|)\to\infty. \end{align*} Thus $u$ is not bounded near $0$. We first check that $u\in L^q(B(0,1/2))$ for every $1\le q<\infty$. On $B(0,1/4)$, polar coordinates give \begin{align*} \int_{B(0,1/4)} |u(x)|^q\,d\mathcal L^2(x)=2\pi\int_0^{1/4} |\log\log(e/r)|^q r\,dr. \end{align*} Set $s=\log(e/r)$. Then $r=e^{1-s}$, $dr=-e^{1-s}\,ds=-r\,ds$, and therefore $r\,dr=-e^{2-2s}\,ds$. The lower endpoint $r\to 0^+$ corresponds to $s\to\infty$, while $r=1/4$ corresponds to $s=\log(4e)$. Reversing the limits gives \begin{align*} 2\pi\int_0^{1/4} |\log\log(e/r)|^q r\,dr=2\pi e^2\int_{\log(4e)}^\infty |\log s|^q e^{-2s}\,ds. \end{align*} For $s\ge \log(4e)$, the function $|\log s|^q e^{-s}$ is bounded on $[\log(4e),\infty)$, so there is a constant $C_q$ such that \begin{align*} |\log s|^q e^{-2s}\le C_q e^{-s}. \end{align*} Since \begin{align*} \int_{\log(4e)}^\infty e^{-s}\,ds=e^{-\log(4e)}<\infty, \end{align*} the integral over $B(0,1/4)$ is finite. On the annulus $1/4\le |x|<1/2$, the function $\eta(x)\log\log(e/|x|)$ is smooth and bounded, and the annulus has finite measure, so the remaining $L^q$ integral is finite. It remains to check the $H^1$ energy. On $0<|x|<1/4$, the cutoff equals $1$, so \begin{align*} \nabla u(x)=\nabla\left(\log\log(e/|x|)\right). \end{align*} By the chain rule, \begin{align*} \nabla\left(\log\log(e/|x|)\right)=\frac{1}{\log(e/|x|)}\nabla\left(\log(e/|x|)\right). 
\end{align*} Since $\log(e/|x|)=1-\log|x|$ and $\nabla\log|x|=x/|x|^2$ for $x\ne 0$, \begin{align*} \nabla\left(\log(e/|x|)\right)=-\frac{x}{|x|^2}. \end{align*} Therefore \begin{align*} \nabla u(x)=-\frac{x}{|x|^2\log(e/|x|)}. \end{align*} Taking the squared norm gives \begin{align*} |\nabla u(x)|^2=\frac{|x|^2}{|x|^4\log^2(e/|x|)}=\frac{1}{|x|^2\log^2(e/|x|)}. \end{align*} Using polar coordinates, \begin{align*} \int_{B(0,1/4)}|\nabla u(x)|^2\,d\mathcal L^2(x)=2\pi\int_0^{1/4}\frac{1}{r\log^2(e/r)}\,dr. \end{align*} With $s=\log(e/r)$, we have $ds=-dr/r$. Hence \begin{align*} 2\pi\int_0^{1/4}\frac{1}{r\log^2(e/r)}\,dr=2\pi\int_{\log(4e)}^\infty \frac{1}{s^2}\,ds. \end{align*} The last integral is \begin{align*} 2\pi\int_{\log(4e)}^\infty \frac{1}{s^2}\,ds=\frac{2\pi}{\log(4e)}<\infty. \end{align*} On the annulus $1/4\le |x|<1/2$, both $\eta$ and $\log\log(e/|x|)$ are smooth with bounded derivatives, so the remaining $L^2$ and gradient integrals are finite. Thus $u\in H^1(B(0,1/2))$ and $u\in L^q(B(0,1/2))$ for every finite $q$, but $u\notin L^\infty(B(0,1/2))$. This shows that in dimension two the finite-$q$ conclusion of Sobolev embedding cannot be strengthened to boundedness. [/example] Sobolev inequalities provide continuous embeddings, while compactness results provide convergent subsequences. The distinction matters in existence proofs, where weak convergence is usually available first and strong convergence is needed to pass to lower-order terms. ## Compactness and Density A bounded sequence in an infinite-dimensional Sobolev space need not have a strongly convergent subsequence in the same Sobolev norm. The useful [compactness theorem](/theorems/2748) says that on bounded domains, controlling one derivative gives compactness after dropping to a weaker norm. [quotetheorem:64] [citeproof:64] Rellich-Kondrachov is stronger than a continuous Sobolev embedding because it produces strongly convergent subsequences after passing to a lower integrability exponent. 
The strict condition $q<p^*$ is essential in the subcritical case: at the critical exponent, concentration phenomena can keep a bounded Sobolev sequence from having a strongly convergent subsequence. This compactness is the mechanism that later allows weakly convergent minimizing sequences to pass through lower-order terms or nonlinear expressions. Compactness also depends both on bounded geometry and on boundedness of the domain. If mass can drift to infinity, the translation control no longer forces a convergent subsequence. [example: Compactness Failure on Unbounded Domains] Let $\psi\in C_c^\infty((-1,1))$ be nonzero, and define \begin{align*} u_j(x):=\psi(x-j), \qquad x\in \mathbb R. \end{align*} Since $\operatorname{supp}\psi$ is compactly contained in $(-1,1)$, there is $0<a<1$ such that $\operatorname{supp}\psi\subset[-a,a]$. Therefore \begin{align*} \operatorname{supp}u_j\subset[j-a,j+a]. \end{align*} We first compute the $H^1(\mathbb R)$ norm. By the change of variables $y=x-j$, \begin{align*} \|u_j\|_{L^2(\mathbb R)}^2=\int_{\mathbb R}|\psi(x-j)|^2\,dx. \end{align*} With $y=x-j$, this becomes \begin{align*} \|u_j\|_{L^2(\mathbb R)}^2=\int_{\mathbb R}|\psi(y)|^2\,dy=\|\psi\|_{L^2(\mathbb R)}^2. \end{align*} Also $u_j'(x)=\psi'(x-j)$, so the same change of variables gives \begin{align*} \|u_j'\|_{L^2(\mathbb R)}^2=\int_{\mathbb R}|\psi'(x-j)|^2\,dx. \end{align*} Again setting $y=x-j$ gives \begin{align*} \|u_j'\|_{L^2(\mathbb R)}^2=\int_{\mathbb R}|\psi'(y)|^2\,dy=\|\psi'\|_{L^2(\mathbb R)}^2. \end{align*} Hence \begin{align*} \|u_j\|_{H^1(\mathbb R)}^2=\|u_j\|_{L^2(\mathbb R)}^2+\|u_j'\|_{L^2(\mathbb R)}^2. \end{align*} Substituting the two identities above, \begin{align*} \|u_j\|_{H^1(\mathbb R)}^2=\|\psi\|_{L^2(\mathbb R)}^2+\|\psi'\|_{L^2(\mathbb R)}^2, \end{align*} which is independent of $j$. Thus $(u_j)$ is bounded in $H^1(\mathbb R)$. Now take $j\ne k$ with $|j-k|>2a$. 
Then $[j-a,j+a]$ and $[k-a,k+a]$ are disjoint, so $u_j(x)u_k(x)=0$ for every $x\in\mathbb R$. Therefore \begin{align*} \|u_j-u_k\|_{L^2(\mathbb R)}^2=\int_{\mathbb R}|u_j-u_k|^2\,dx. \end{align*} Expanding the square gives \begin{align*} \int_{\mathbb R}|u_j-u_k|^2\,dx=\int_{\mathbb R}\left(|u_j|^2-2u_ju_k+|u_k|^2\right)\,dx. \end{align*} Since $u_ju_k=0$ everywhere, \begin{align*} \int_{\mathbb R}\left(|u_j|^2-2u_ju_k+|u_k|^2\right)\,dx=\int_{\mathbb R}|u_j|^2\,dx+\int_{\mathbb R}|u_k|^2\,dx. \end{align*} Using [translation invariance](/theorems/4911) of the $L^2$ norm for both terms, \begin{align*} \|u_j-u_k\|_{L^2(\mathbb R)}^2=\|\psi\|_{L^2(\mathbb R)}^2+\|\psi\|_{L^2(\mathbb R)}^2. \end{align*} Thus \begin{align*} \|u_j-u_k\|_{L^2(\mathbb R)}^2=2\|\psi\|_{L^2(\mathbb R)}^2. \end{align*} Because $\psi\ne 0$, this number is positive. Every subsequence contains pairs whose indices differ by more than $2a$, so no subsequence is Cauchy in $L^2(\mathbb R)$, and hence no subsequence converges strongly in $L^2(\mathbb R)$. This is the translation obstruction excluded by the bounded-domain hypothesis in the *Rellich Kondrachov Compactness Theorem*. [/example] Compactness supplies subsequences, but weak formulations are first derived for smooth functions and then extended by closure. To justify this passage, we need density of smooth functions in Sobolev spaces. [quotetheorem:6427] [citeproof:6427] For homogeneous Dirichlet data, the density statement is built into the definition of $H^1_0(U)$ but still has practical content: compactly supported smooth functions are legitimate approximations of zero-trace weak solutions. [remark: How These Results Enter Weak Elliptic Equations] The weak Dirichlet problem for $-\Delta u=f$ on a bounded Lipschitz domain asks for $u\in H^1_0(U)$ satisfying \begin{align*} \int_U \nabla u\cdot\nabla v\,d\mathcal L^n=\int_U fv\,d\mathcal L^n \end{align*} for every $v\in H^1_0(U)$. 
Poincare gives coercivity of the left-hand side, Sobolev embedding controls the right-hand side for suitable $f$, trace theory explains the boundary condition, density connects smooth testing with weak testing, and Rellich-Kondrachov supplies the compactness used in variational limits. This framework is the starting point for the existence and regularity results developed in the next chapters. [/remark] With weak derivatives and traces available, the next step is to turn the formal weak equation into a solvable functional-analytic problem. Chapter 5 does this through coercivity, the Lax-Milgram theorem, and the Poincaré inequality, which together supply existence and uniqueness in the Sobolev framework. # 5. Lax-Milgram and coercive elliptic problems Using the Sobolev framework and Poincare inequality from Chapter 4, this chapter turns the weak formulation from a formal integration-by-parts identity into an existence theorem. It uses the Sobolev spaces $H^1(U)$ and $H^1_0(U)$, the dual space $H^{-1}(U)$, Poincare's inequality, and the [Riesz representation theorem](/theorems/218) for Hilbert spaces. The guiding question is: when does the energy identity associated with an elliptic operator determine exactly one function in $H^1_0(U)$? The answer is the Lax-Milgram theorem, which converts bounded and coercive bilinear forms on Hilbert spaces into solvability statements for boundary value problems. ## Divergence-Form Operators and Their Bilinear Forms The first task is to attach a Hilbert-space object to a second-order elliptic operator. In divergence form, integration by parts moves derivatives from the unknown onto test functions, so the weak problem is naturally encoded by a bilinear form rather than by pointwise second derivatives. [definition: Divergence-Form Elliptic Operator] Let $U \subset \mathbb R^n$ be open. 
A second-order divergence-form operator on $U$ is an operator formally written as \begin{align*} Lu = -\sum_{i,j=1}^n \partial_{x_i}(a_{ij}\,\partial_{x_j}u) + \sum_{i=1}^n b_i\,\partial_{x_i}u + c\,u, \end{align*} where $a_{ij}, b_i, c \in L^\infty(U)$. When $U$ is bounded, its weak realisation is the map \begin{align*} L:H^1_0(U)\to H^{-1}(U) \end{align*} defined by \begin{align*} (Lu)(v)=\int_U \sum_{i,j=1}^n a_{ij}\,\partial_{x_j}u\,\partial_{x_i}v\,d\mathcal L^n +\int_U \sum_{i=1}^n b_i\,\partial_{x_i}u\,v\,d\mathcal L^n +\int_U c\,uv\,d\mathcal L^n \end{align*} for all $u,v\in H^1_0(U)$. [/definition] The operator has now been specified, but existence theory cannot depend only on the symbolic expression for $L$. We need a quantitative condition on the leading coefficients that will later turn the energy $B[u,u]$ into control of $\nabla u$, and this is the role of uniform ellipticity. [definition: Uniform Ellipticity] The coefficient matrix $A(x)=(a_{ij}(x))_{i,j=1}^n$ is uniformly elliptic on $U$ if there exists $\theta>0$ such that \begin{align*} \sum_{i,j=1}^n a_{ij}(x)\xi_i\xi_j \ge \theta |\xi|^2 \end{align*} for a.e. $x\in U$ and all $\xi\in \mathbb R^n$. [/definition] This condition is pointwise in $x$ but its main consequence is global: after integration it bounds the $L^2$ norm of the gradient. That is what makes it compatible with the Sobolev space $H^1_0(U)$, where weak first derivatives are the available data. [definition: Bilinear Form Associated to an Elliptic Operator] Let $U\subset \mathbb R^n$ be open and let $L$ be a divergence-form operator with coefficients $a_{ij}, b_i, c \in L^\infty(U)$. The associated bilinear form is the map $B:H^1_0(U)\times H^1_0(U)\to \mathbb R$ given by \begin{align*} B[u,v] = \int_U \sum_{i,j=1}^n a_{ij}\,\partial_{x_j}u\,\partial_{x_i}v\,d\mathcal L^n + \int_U \sum_{i=1}^n b_i\,\partial_{x_i}u\,v\,d\mathcal L^n + \int_U c\,u v\,d\mathcal L^n . 
\end{align*} [/definition] For smooth compactly supported functions this form is exactly what integration by parts produces. The point of the definition is that it still makes sense when $u$ and $v$ only have weak first derivatives. [example: Poisson Energy Form] Let $U\subset\mathbb R^n$ be bounded and take $Lu=-\Delta u$. In the divergence-form notation, this means $a_{ij}=\delta_{ij}$ and $b_i=c=0$, so the associated bilinear form satisfies \begin{align*} B[u,v]=\int_U \sum_{i,j=1}^n \delta_{ij}\,\partial_{x_j}u\,\partial_{x_i}v\,d\mathcal L^n . \end{align*} Since $\delta_{ij}=0$ when $i\ne j$ and $\delta_{ii}=1$, the double sum keeps only the terms with $i=j$: \begin{align*} \sum_{i,j=1}^n \delta_{ij}\,\partial_{x_j}u\,\partial_{x_i}v=\sum_{i=1}^n \partial_{x_i}u\,\partial_{x_i}v . \end{align*} Hence \begin{align*} B[u,v]=\int_U \sum_{i=1}^n \partial_{x_i}u\,\partial_{x_i}v\,d\mathcal L^n=\int_U \nabla u\cdot\nabla v\,d\mathcal L^n . \end{align*} The weak Dirichlet problem for $-\Delta u=f$ with zero boundary data asks for $u\in H^1_0(U)$ such that \begin{align*} \int_U \nabla u\cdot\nabla v\,d\mathcal L^n=\int_U f v\,d\mathcal L^n \end{align*} for every $v\in H^1_0(U)$. If $f\in L^2(U)$, define $F(v)=\int_U f v\,d\mathcal L^n$. The map $F$ is linear in $v$, and *Cauchy-Schwarz* gives \begin{align*} |F(v)|=\left|\int_U f v\,d\mathcal L^n\right|\le \|f\|_{L^2(U)}\|v\|_{L^2(U)} . \end{align*} If $C_U$ is a Poincare constant for $U$, then *Poincare's inequality* gives \begin{align*} \|v\|_{L^2(U)}\le C_U\|\nabla v\|_{L^2(U)} . \end{align*} Combining these two estimates gives \begin{align*} |F(v)|\le C_U\|f\|_{L^2(U)}\|\nabla v\|_{L^2(U)} . \end{align*} Thus $F$ is a bounded linear functional on $H^1_0(U)$ equipped with the norm $\|v\|_{H^1_0}:=\|\nabla v\|_{L^2(U)}$, and the Poisson equation has been converted into the variational identity $B[u,v]=F(v)$. 
[/example] The example shows the general pattern: the differential equation becomes a bounded linear functional tested against all $v\in H^1_0(U)$. We next isolate the two estimates on $B$ that make this problem solvable. ## Coercivity, Boundedness, and Weak Solvability in $H^1_0$ The central question is whether the identity $B[u,v]=F(v)$ for all test functions determines a unique $u$. Boundedness makes the left-hand side continuous in both variables, while coercivity prevents the energy from degenerating along nonzero directions. [definition: Bounded Bilinear Form] Let $H$ be a real [Hilbert space](/page/Hilbert%20Space) with norm $\|\cdot\|_H$. A bilinear form $B:H\times H\to \mathbb R$ is bounded if there exists $M>0$ such that \begin{align*} |B[u,v]|\le M\|u\|_H\|v\|_H \end{align*} for all $u,v\in H$. [/definition] Boundedness is the continuity condition needed to use Hilbert-space methods. It also guarantees that each fixed $u$ defines a bounded linear functional $v\mapsto B[u,v]$, but it does not by itself rule out flat directions in which the energy gives no control. [definition: Coercive Bilinear Form] Let $H$ be a real Hilbert space with norm $\|\cdot\|_H$. A bilinear form $B:H\times H\to \mathbb R$ is coercive if there exists $\alpha>0$ such that \begin{align*} B[u,u]\ge \alpha \|u\|_H^2 \end{align*} for all $u\in H$. [/definition] Coercivity is stronger than nonnegativity. We need an abstract theorem saying that boundedness plus this lower bound makes the variational map $u\mapsto (v\mapsto B[u,v])$ invertible; this is the Lax-Milgram theorem. [quotetheorem:4946] [citeproof:4946] The theorem is abstract, but its hypotheses are sharp enough to explain common failures. Without coercivity, both existence and uniqueness can fail: on $H=\mathbb R$ the form $B[u,v]=0$ is bounded, but $B[u,v]=F(v)$ has no solution when $F\ne 0$ and infinitely many solutions when $F=0$. 
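The Lax-Milgram statement can be checked by hand in finite dimensions. The following is a minimal sketch, assuming $H=\mathbb R^2$ with the Euclidean norm and a hypothetical nonsymmetric matrix `M` chosen only for illustration; the names `M`, `B`, and `solve` are not from the text. It shows that a bounded coercive form need not be symmetric, and that the solution obeys the bound $\|u\|\le \alpha^{-1}\|F\|$.

```python
import math

# Hypothetical illustration: H = R^2, B[u, v] = v . (M u).
# M is nonsymmetric, yet u . (M u) = 2|u|^2, so B is coercive with alpha = 2.
M = [[2.0, 1.0],
     [-1.0, 2.0]]
alpha = 2.0

def B(u, v):
    """Bilinear form B[u, v] = v . (M u)."""
    Mu = [M[0][0] * u[0] + M[0][1] * u[1],
          M[1][0] * u[0] + M[1][1] * u[1]]
    return v[0] * Mu[0] + v[1] * Mu[1]

def solve(F):
    """Solve B[u, v] = F . v for all v, i.e. M u = F, by Cramer's rule."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    u0 = (F[0] * M[1][1] - M[0][1] * F[1]) / det
    u1 = (M[0][0] * F[1] - M[1][0] * F[0]) / det
    return [u0, u1]

def norm(x):
    return math.sqrt(x[0] ** 2 + x[1] ** 2)

F = [3.0, -1.0]
u = solve(F)
print("u =", u)
print("|u| =", norm(u), " bound |F|/alpha =", norm(F) / alpha)
```

Testing against the two basis vectors is enough here because both sides are linear in $v$.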
Boundedness is the continuity input that turns $u\mapsto (v\mapsto B[u,v])$ into a bounded operator $H\to H^*$; without it, [Riesz representation](/theorems/67) and closed-range arguments do not apply in this form. The conclusion is also only weak solvability: it produces $u\in H$, not additional differentiability or pointwise satisfaction of a PDE. We now apply it with $H=H^1_0(U)$, usually equipped with the equivalent norm $\|u\|_{H^1_0}:=\|\nabla u\|_{L^2}$ on bounded domains where Poincare's inequality applies. [quotetheorem:4869] [citeproof:4869] This theorem is the basic existence result for homogeneous Dirichlet boundary conditions, and each hypothesis has a specific job. Uniform ellipticity supplies the lower bound on the gradient; if the coefficient matrix degenerates on a set of positive measure, the energy may fail to control all directions in $H^1_0(U)$. Boundedness of the coefficients makes the form continuous, while Poincare's inequality permits $\|\nabla u\|_{L^2}$ to serve as a norm on $H^1_0(U)$. The theorem does not imply $u\in H^2(U)$ or classical differentiability; such regularity requires extra assumptions on $U$, the coefficients, and $F$. It is deliberately phrased in terms of $F\in H^{-1}(U)$, because many right-hand sides enter only as bounded functionals on $H^1_0(U)$. [example: Poisson Equation with $L^2$ Forcing] Let $U\subset\mathbb R^n$ be bounded, suppose Poincare's inequality holds on $U$, and equip $H^1_0(U)$ with the norm $\|v\|_{H^1_0}:=\|\nabla v\|_{L^2(U)}$. For $f\in L^2(U)$, define \begin{align*} F(v)=\int_U f v\,d\mathcal L^n \end{align*} for $v\in H^1_0(U)$. If $\alpha,\beta\in\mathbb R$ and $v,w\in H^1_0(U)$, then linearity of the integral gives \begin{align*} F(\alpha v+\beta w)=\alpha\int_U f v\,d\mathcal L^n+\beta\int_U f w\,d\mathcal L^n=\alpha F(v)+\beta F(w). \end{align*} By *Cauchy-Schwarz*, \begin{align*} |F(v)|=\left|\int_U f v\,d\mathcal L^n\right|\le \|f\|_{L^2(U)}\|v\|_{L^2(U)}. 
\end{align*} If $C_U$ is a Poincare constant for $U$, then *Poincare's inequality* gives \begin{align*} \|v\|_{L^2(U)}\le C_U\|\nabla v\|_{L^2(U)}. \end{align*} Combining the two estimates yields \begin{align*} |F(v)|\le C_U\|f\|_{L^2(U)}\|\nabla v\|_{L^2(U)}=C_U\|f\|_{L^2(U)}\|v\|_{H^1_0}. \end{align*} Thus $F$ is a bounded linear functional on $H^1_0(U)$, so $F\in H^{-1}(U)$ and $\|F\|_{H^{-1}}\le C_U\|f\|_{L^2(U)}$. For the Poisson operator, the bilinear form is \begin{align*} B[u,v]=\int_U \nabla u\cdot\nabla v\,d\mathcal L^n. \end{align*} The same *Cauchy-Schwarz* estimate gives \begin{align*} |B[u,v]|\le \|\nabla u\|_{L^2(U)}\|\nabla v\|_{L^2(U)}=\|u\|_{H^1_0}\|v\|_{H^1_0}. \end{align*} Also, \begin{align*} B[u,u]=\int_U \nabla u\cdot\nabla u\,d\mathcal L^n=\int_U |\nabla u|^2\,d\mathcal L^n=\|u\|_{H^1_0}^2, \end{align*} so $B$ is coercive with constant $1$. Therefore *Lax-Milgram* gives a unique $u\in H^1_0(U)$ such that \begin{align*} \int_U \nabla u\cdot\nabla v\,d\mathcal L^n=\int_U f v\,d\mathcal L^n \end{align*} for every $v\in H^1_0(U)$. This $u$ is the weak solution of $-\Delta u=f$ with zero Dirichlet boundary data. [/example] The same argument allows the medium to have direction-dependent conductivity. The coefficient matrix need not be a scalar multiple of the identity; what matters is uniform positive control in every direction. [example: Anisotropic Conductivity Equation] Let $U\subset\mathbb R^n$ be a bounded domain on which Poincare's inequality holds, let $C_U$ be a Poincare constant, and let $A(x)=(a_{ij}(x))$ be a bounded measurable symmetric matrix field satisfying \begin{align*} \theta |\xi|^2\le \xi^\top A(x)\xi\le \Lambda |\xi|^2 \end{align*} for a.e. $x\in U$ and every $\xi\in\mathbb R^n$. For $u,v\in H^1_0(U)$ define \begin{align*} B[u,v]=\int_U A\nabla u\cdot\nabla v\,d\mathcal L^n=\int_U \sum_{i,j=1}^n a_{ij}\,\partial_{x_j}u\,\partial_{x_i}v\,d\mathcal L^n . 
\end{align*} Since $A(x)$ is symmetric and its quadratic form lies between $\theta|\xi|^2\ge 0$ and $\Lambda|\xi|^2$, the eigenvalues of $A(x)$ lie in $[\theta,\Lambda]$, and therefore $|A(x)\eta|\le \Lambda|\eta|$ for every $\eta\in\mathbb R^n$ and a.e. $x\in U$. Therefore \begin{align*} |A(x)\nabla u(x)\cdot\nabla v(x)|\le |A(x)\nabla u(x)|\,|\nabla v(x)|\le \Lambda|\nabla u(x)|\,|\nabla v(x)|. \end{align*} Integrating and applying *Cauchy-Schwarz* gives \begin{align*} |B[u,v]|\le \Lambda\|\nabla u\|_{L^2(U)}\|\nabla v\|_{L^2(U)}=\Lambda\|u\|_{H^1_0}\|v\|_{H^1_0}. \end{align*} Thus $B$ is bounded on $H^1_0(U)$. For coercivity, apply the lower ellipticity bound with $\xi=\nabla u(x)$: \begin{align*} B[u,u]=\int_U \nabla u^\top A\nabla u\,d\mathcal L^n\ge \int_U \theta|\nabla u|^2\,d\mathcal L^n=\theta\|u\|_{H^1_0}^2. \end{align*} Given $f\in L^2(U)$, define \begin{align*} F(v)=\int_U f v\,d\mathcal L^n . \end{align*} Linearity in $v$ follows from linearity of the integral. By *Cauchy-Schwarz* and then *Poincare's inequality*, \begin{align*} |F(v)|\le \|f\|_{L^2(U)}\|v\|_{L^2(U)}\le C_U\|f\|_{L^2(U)}\|\nabla v\|_{L^2(U)}=C_U\|f\|_{L^2(U)}\|v\|_{H^1_0}. \end{align*} So $F\in H^{-1}(U)$ and $\|F\|_{H^{-1}}\le C_U\|f\|_{L^2(U)}$. By the *Lax-Milgram Theorem*, there is a unique $u\in H^1_0(U)$ such that \begin{align*} \int_U A\nabla u\cdot\nabla v\,d\mathcal L^n=\int_U f v\,d\mathcal L^n \end{align*} for every $v\in H^1_0(U)$. The same theorem gives \begin{align*} \|u\|_{H^1_0}\le \theta^{-1}\|F\|_{H^{-1}}\le \theta^{-1}C_U\|f\|_{L^2(U)}. \end{align*} Thus $u$ is the weak solution of $-\operatorname{div}(A\nabla u)=f$ with zero Dirichlet boundary data, and the estimate shows exactly how the solution is controlled by $\theta$, $C_U$, and $\|f\|_{L^2(U)}$. [/example] These examples are symmetric, but Lax-Milgram does not require symmetry. Lower-order terms and nonsymmetric leading coefficients can be included if the final form remains coercive; when coercivity fails, Chapter 10 returns to the same forms through Fredholm theory. 
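The a priori estimate $\|u\|_{H^1_0}\le \theta^{-1}C_U\|f\|_{L^2(U)}$ can be observed numerically in one dimension. The sketch below is an illustration under stated assumptions, not a method from the text: it takes $U=(0,1)$, a hypothetical coefficient $a(x)=1+\tfrac12\sin(2\pi x)$ (so $\theta=\tfrac12$), forcing $f=1$, and a standard second-order finite-difference discretisation solved by the Thomas algorithm. The helper name `solve_dirichlet_1d` is invented here. The check uses the Poincare constant of the discrete problem, which tends to $1/\pi$ as the mesh is refined.

```python
import math

def solve_dirichlet_1d(a, f, n):
    """Finite-difference sketch for -(a u')' = f on (0,1), u(0) = u(1) = 0.

    a is sampled at cell midpoints and f at interior nodes; the tridiagonal
    system is solved with the Thomas algorithm. Returns nodal values and h.
    """
    h = 1.0 / n
    am = [a((i + 0.5) * h) for i in range(n)]            # a at midpoints
    # Row k (k = 0..n-2) is the equation at interior node i = k + 1.
    lower = [-am[k] / h ** 2 for k in range(n - 1)]       # coefficient of u_{i-1}
    diag = [(am[k] + am[k + 1]) / h ** 2 for k in range(n - 1)]
    upper = [-am[k + 1] / h ** 2 for k in range(n - 1)]   # coefficient of u_{i+1}
    rhs = [f((k + 1) * h) for k in range(n - 1)]
    for k in range(1, n - 1):                             # forward elimination
        w = lower[k] / diag[k - 1]
        diag[k] -= w * upper[k - 1]
        rhs[k] -= w * rhs[k - 1]
    u = [0.0] * (n + 1)                                   # includes boundary zeros
    u[n - 1] = rhs[n - 2] / diag[n - 2]
    for k in range(n - 3, -1, -1):                        # back substitution
        u[k + 1] = (rhs[k] - upper[k] * u[k + 2]) / diag[k]
    return u, h

theta = 0.5                                               # lower bound for a
a = lambda x: 1.0 + 0.5 * math.sin(2.0 * math.pi * x)     # hypothetical coefficient
f = lambda x: 1.0
n = 200
u, h = solve_dirichlet_1d(a, f, n)

# Discrete L^2 norms of the difference quotient of u and of f.
grad_norm = math.sqrt(sum(((u[i + 1] - u[i]) / h) ** 2 * h for i in range(n)))
f_norm = math.sqrt(sum(f(i * h) ** 2 * h for i in range(1, n)))
# Poincare constant of the discrete Dirichlet problem on (0,1).
C_h = h / (2.0 * math.sin(math.pi * h / 2.0))
bound = C_h * f_norm / theta
print("||u'|| =", grad_norm, " bound =", bound)
```

The discrete inequality follows from the same energy argument as in the text, with summation by parts replacing integration by parts.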
## Inhomogeneous Boundary Data and Lower-Order Terms The zero-boundary problem is the clean Hilbert-space model, but boundary value problems usually prescribe $u=g$ on $\partial U$. The question is how to impose such data in a Sobolev setting without treating boundary values pointwise. [definition: Weak Dirichlet Data by Lifting] Let $U\subset\mathbb R^n$ be a bounded domain for which there is a trace operator \begin{align*} \operatorname{Tr}:H^1(U)\to \mathcal T(U), \end{align*} where $\mathcal T(U):=\operatorname{Tr}(H^1(U))$ is the trace range. For $g\in \mathcal T(U)$, a function $u\in H^1(U)$ has Dirichlet data $g$ if there exists a lifting $G\in H^1(U)$ with $\operatorname{Tr}G=g$ such that \begin{align*} w:=u-G\in H^1_0(U). \end{align*} [/definition] The lifting converts the inhomogeneous problem into a zero-boundary problem for $w=u-G$. This transformation changes the right-hand side but leaves the coercive part of the operator acting on $H^1_0(U)$. The reduction only works if the boundary datum lies in the trace class of some $H^1$ function; arbitrary pointwise boundary prescriptions are not meaningful in this Sobolev formulation. [quotetheorem:6428] [citeproof:6428] This reduction is often more important than the notation suggests: boundary data is handled before applying the Hilbert-space theorem. The trace-lifting property is the bridge between boundary data and an affine Sobolev space; if no $G\in H^1(U)$ has the prescribed trace, then $G+H^1_0(U)$ is not available and the theorem says nothing. On a bounded Lipschitz domain this means $g$ must belong to the trace space $H^{1/2}(\partial U)$; a boundary prescription that is only a pointwise function on $\partial U$, or that has jumps incompatible with the trace space, cannot be imposed by this argument. The solution also attains the boundary condition only in the trace sense, not necessarily pointwise on $\partial U$. Once the unknown lies in $H^1_0(U)$, the same coercive machinery applies. 
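The lifting reduction can be sketched in one dimension, where every step is explicit. The code below is an illustration under assumptions not taken from the text: $U=(0,1)$, $f=1$, boundary values $g_0=1$ and $g_1=3$, and two hypothetical liftings, one linear and one quadratic. Each lifting changes the right-hand side of the zero-boundary problem for $w=u-G$ (here by the term $G''$), but the recovered $u=G+w$ is the same, because both liftings generate the same affine space $G+H^1_0(U)$.

```python
def solve_poisson_zero_bc(rhs, n):
    """Solve -w'' = rhs on (0,1) with w(0) = w(1) = 0 by finite differences."""
    h = 1.0 / n
    diag = [2.0] * (n - 1)
    b = [h * h * rhs((k + 1) * h) for k in range(n - 1)]
    for k in range(1, n - 1):            # Thomas elimination, off-diagonals are -1
        w = -1.0 / diag[k - 1]
        diag[k] -= w * (-1.0)
        b[k] -= w * b[k - 1]
    x = [0.0] * (n - 1)
    x[-1] = b[-1] / diag[-1]
    for k in range(n - 3, -1, -1):       # back substitution
        x[k] = (b[k] + x[k + 1]) / diag[k]
    return [0.0] + x + [0.0]

n = 100
g0, g1 = 1.0, 3.0                        # hypothetical boundary values
f = lambda x: 1.0

# Lifting 1: G1(x) = g0 + (g1 - g0) x, so G1'' = 0 and -w'' = f.
G1 = lambda x: g0 + (g1 - g0) * x
w1 = solve_poisson_zero_bc(f, n)

# Lifting 2: G2(x) = g0 + (g1 - g0) x^2, so G2'' = 2 (g1 - g0) and -w'' = f + G2''.
G2 = lambda x: g0 + (g1 - g0) * x * x
w2 = solve_poisson_zero_bc(lambda x: f(x) + 2.0 * (g1 - g0), n)

u1 = [G1(i / n) + w1[i] for i in range(n + 1)]
u2 = [G2(i / n) + w2[i] for i in range(n + 1)]
print("max |u1 - u2| =", max(abs(p - q) for p, q in zip(u1, u2)))
```

The two corrections `w1` and `w2` differ, which is the sense in which the lifting changes the right-hand side but not the solution.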
[example: Poisson Equation with Nonzero Boundary Values] Let $U$ be a bounded Lipschitz domain, let $f\in L^2(U)$, and let $g$ be a boundary trace with a lifting $G\in H^1(U)$, so that $\operatorname{Tr}G=g$. We look for $u\in G+H^1_0(U)$, meaning $u=G+w$ for some $w\in H^1_0(U)$, such that \begin{align*} \int_U \nabla u\cdot\nabla v\,d\mathcal L^n=\int_U f v\,d\mathcal L^n \end{align*} for every $v\in H^1_0(U)$. Substituting $u=G+w$ and using linearity of the weak gradient gives \begin{align*} \nabla u=\nabla(G+w)=\nabla G+\nabla w. \end{align*} Hence, for every $v\in H^1_0(U)$, \begin{align*} \int_U \nabla u\cdot\nabla v\,d\mathcal L^n=\int_U (\nabla G+\nabla w)\cdot\nabla v\,d\mathcal L^n. \end{align*} By distributivity of the dot product and linearity of the integral, \begin{align*} \int_U (\nabla G+\nabla w)\cdot\nabla v\,d\mathcal L^n=\int_U \nabla G\cdot\nabla v\,d\mathcal L^n+\int_U \nabla w\cdot\nabla v\,d\mathcal L^n. \end{align*} Therefore the equation for $u$ is equivalent to finding $w\in H^1_0(U)$ such that \begin{align*} \int_U \nabla w\cdot\nabla v\,d\mathcal L^n=\int_U f v\,d\mathcal L^n-\int_U \nabla G\cdot\nabla v\,d\mathcal L^n \end{align*} for every $v\in H^1_0(U)$. Define \begin{align*} B[w,v]=\int_U \nabla w\cdot\nabla v\,d\mathcal L^n \end{align*} and \begin{align*} \Phi(v)=\int_U f v\,d\mathcal L^n-\int_U \nabla G\cdot\nabla v\,d\mathcal L^n. \end{align*} By *Cauchy-Schwarz*, \begin{align*} |B[w,v]|\le \|\nabla w\|_{L^2(U)}\|\nabla v\|_{L^2(U)}=\|w\|_{H^1_0}\|v\|_{H^1_0}. \end{align*} Also, \begin{align*} B[w,w]=\int_U |\nabla w|^2\,d\mathcal L^n=\|w\|_{H^1_0}^2, \end{align*} so $B$ is coercive with constant $1$. The functional $\Phi$ is linear in $v$ because both integrals are linear in $v$. For its first term, *Cauchy-Schwarz* gives \begin{align*} \left|\int_U f v\,d\mathcal L^n\right|\le \|f\|_{L^2(U)}\|v\|_{L^2(U)}. 
\end{align*} If $C_U$ is a Poincare constant for $U$, then *Poincare's inequality* gives \begin{align*} \|v\|_{L^2(U)}\le C_U\|\nabla v\|_{L^2(U)}. \end{align*} Combining these two estimates, \begin{align*} \left|\int_U f v\,d\mathcal L^n\right|\le C_U\|f\|_{L^2(U)}\|\nabla v\|_{L^2(U)}. \end{align*} For the lifting term, *Cauchy-Schwarz* gives \begin{align*} \left|\int_U \nabla G\cdot\nabla v\,d\mathcal L^n\right|\le \|\nabla G\|_{L^2(U)}\|\nabla v\|_{L^2(U)}. \end{align*} Using the triangle inequality, \begin{align*} |\Phi(v)|\le \left(C_U\|f\|_{L^2(U)}+\|\nabla G\|_{L^2(U)}\right)\|\nabla v\|_{L^2(U)}. \end{align*} Since $\|v\|_{H^1_0}=\|\nabla v\|_{L^2(U)}$, this becomes \begin{align*} |\Phi(v)|\le \left(C_U\|f\|_{L^2(U)}+\|\nabla G\|_{L^2(U)}\right)\|v\|_{H^1_0}. \end{align*} Thus $\Phi\in H^{-1}(U)$. By the *Lax-Milgram Theorem*, there is a unique $w\in H^1_0(U)$ such that \begin{align*} B[w,v]=\Phi(v) \end{align*} for every $v\in H^1_0(U)$. Setting $u=G+w$ gives the unique weak solution in the affine space $G+H^1_0(U)$, and $u-G=w\in H^1_0(U)$ means that $u$ attains the boundary value $g$ in the trace sense. [/example] The lifting argument handles boundary data, but the operator itself may also contain terms that do not have a fixed sign. We therefore need a criterion that says when drift and potential terms are still dominated by the elliptic energy. [quotetheorem:6429] [citeproof:6429] The hypothesis is a compact way of stating that drift and potential terms are small relative to the elliptic energy. The strict inequality $\beta<\theta$ is essential: for example, on a bounded domain the form $B[u,u]=\int_U |\nabla u|^2\,d\mathcal L^n-\lambda\int_U u^2\,d\mathcal L^n$ loses coercivity when $\lambda$ reaches the first Dirichlet eigenvalue. At that threshold the homogeneous equation has a nonzero solution, and solvability of the inhomogeneous problem requires a compatibility condition with the first eigenfunction. 
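The loss of coercivity at the first eigenvalue can be seen concretely on $U=(0,1)$, where the first Dirichlet eigenvalue is $\pi^2$ with eigenfunction $\sin(\pi x)$. The following sketch (the interval, the test function, and the helper name `shifted_energy` are choices made for this illustration) evaluates $B[u,u]=\int_0^1 |u'|^2\,dx-\lambda\int_0^1 u^2\,dx$ by the midpoint rule for $u=\sin(\pi x)$, where the exact value is $(\pi^2-\lambda)/2$.

```python
import math

def shifted_energy(lmbda, n=2000):
    """Midpoint-rule value of int u'^2 - lmbda * int u^2 for u = sin(pi x) on (0,1)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        u = math.sin(math.pi * x)
        du = math.pi * math.cos(math.pi * x)
        total += (du * du - lmbda * u * u) * h
    return total

for lmbda in (0.0, 5.0, math.pi ** 2, math.pi ** 2 + 1.0):
    print("lambda =", lmbda, " B[u,u] =", shifted_energy(lmbda))
```

Below $\pi^2$ the energy of this direction is positive, at $\pi^2$ it vanishes, and above $\pi^2$ it is negative, so no constant $\alpha>0$ can work once $\lambda\ge\pi^2$.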
In concrete problems, the relative bound can be verified by Holder, [Young's inequality](/theorems/244), sign assumptions on $c$, or a smallness condition on the drift. [example: Bounded Drift and Potential Terms] Let $U\subset\mathbb R^n$ be bounded with Poincare constant $C_U$, and consider \begin{align*} Lu=-\Delta u+b\cdot\nabla u+cu \end{align*} with $b\in L^\infty(U;\mathbb R^n)$ and $c\in L^\infty(U)$. On $H^1_0(U)$ use the norm $\|u\|_{H^1_0}:=\|\nabla u\|_{L^2(U)}$. The associated bilinear form is \begin{align*} B[u,v]=\int_U \nabla u\cdot\nabla v\,d\mathcal L^n+\int_U b\cdot\nabla u\,v\,d\mathcal L^n+\int_U c u v\,d\mathcal L^n. \end{align*} We first verify boundedness. By *Cauchy-Schwarz*, \begin{align*} \left|\int_U \nabla u\cdot\nabla v\,d\mathcal L^n\right|\le \|\nabla u\|_{L^2(U)}\|\nabla v\|_{L^2(U)}. \end{align*} For the drift term, the pointwise estimate $|b\cdot\nabla u\,v|\le \|b\|_{L^\infty(U)}|\nabla u||v|$ gives \begin{align*} \left|\int_U b\cdot\nabla u\,v\,d\mathcal L^n\right|\le \|b\|_{L^\infty(U)}\|\nabla u\|_{L^2(U)}\|v\|_{L^2(U)}. \end{align*} By *Poincare's inequality*, $\|v\|_{L^2(U)}\le C_U\|\nabla v\|_{L^2(U)}$, so \begin{align*} \left|\int_U b\cdot\nabla u\,v\,d\mathcal L^n\right|\le C_U\|b\|_{L^\infty(U)}\|\nabla u\|_{L^2(U)}\|\nabla v\|_{L^2(U)}. \end{align*} For the potential term, \begin{align*} \left|\int_U c u v\,d\mathcal L^n\right|\le \|c\|_{L^\infty(U)}\|u\|_{L^2(U)}\|v\|_{L^2(U)}. \end{align*} Applying *Poincare's inequality* to both $u$ and $v$ yields \begin{align*} \left|\int_U c u v\,d\mathcal L^n\right|\le C_U^2\|c\|_{L^\infty(U)}\|\nabla u\|_{L^2(U)}\|\nabla v\|_{L^2(U)}. \end{align*} Adding the three bounds gives \begin{align*} |B[u,v]|\le \left(1+C_U\|b\|_{L^\infty(U)}+C_U^2\|c\|_{L^\infty(U)}\right)\|u\|_{H^1_0}\|v\|_{H^1_0}. \end{align*} Thus $B$ is bounded on $H^1_0(U)$. Now estimate the energy $B[u,u]$. The leading term is \begin{align*} \int_U \nabla u\cdot\nabla u\,d\mathcal L^n=\|\nabla u\|_{L^2(U)}^2. 
\end{align*} For the drift term, *Cauchy-Schwarz* gives \begin{align*} \left|\int_U b\cdot\nabla u\,u\,d\mathcal L^n\right|\le \|b\|_{L^\infty(U)}\|\nabla u\|_{L^2(U)}\|u\|_{L^2(U)}. \end{align*} Fix $\varepsilon>0$. Young's inequality $XY\le \varepsilon X^2+\frac{1}{4\varepsilon}Y^2$, with $X=\|\nabla u\|_{L^2(U)}$ and $Y=\|b\|_{L^\infty(U)}\|u\|_{L^2(U)}$, gives \begin{align*} \left|\int_U b\cdot\nabla u\,u\,d\mathcal L^n\right|\le \varepsilon\|\nabla u\|_{L^2(U)}^2+\frac{\|b\|_{L^\infty(U)}^2}{4\varepsilon}\|u\|_{L^2(U)}^2. \end{align*} Using *Poincare's inequality* on the last factor, \begin{align*} \left|\int_U b\cdot\nabla u\,u\,d\mathcal L^n\right|\le \left(\varepsilon+\frac{C_U^2\|b\|_{L^\infty(U)}^2}{4\varepsilon}\right)\|\nabla u\|_{L^2(U)}^2. \end{align*} Let $c_-(x)=\max\{-c(x),0\}$. Since $c(x)\ge -c_-(x)$ a.e., \begin{align*} \int_U c u^2\,d\mathcal L^n\ge -\int_U c_-u^2\,d\mathcal L^n. \end{align*} The $L^\infty$ bound on $c_-$ gives \begin{align*} -\int_U c_-u^2\,d\mathcal L^n\ge -\|c_-\|_{L^\infty(U)}\|u\|_{L^2(U)}^2. \end{align*} By *Poincare's inequality*, \begin{align*} \int_U c u^2\,d\mathcal L^n\ge -C_U^2\|c_-\|_{L^\infty(U)}\|\nabla u\|_{L^2(U)}^2. \end{align*} Combining the leading term with the lower bounds for the drift and potential terms gives \begin{align*} B[u,u]\ge \left(1-\varepsilon-\frac{C_U^2\|b\|_{L^\infty(U)}^2}{4\varepsilon}-C_U^2\|c_-\|_{L^\infty(U)}\right)\|\nabla u\|_{L^2(U)}^2. \end{align*} Therefore, if there exists $\varepsilon>0$ such that \begin{align*} \alpha:=1-\varepsilon-\frac{C_U^2\|b\|_{L^\infty(U)}^2}{4\varepsilon}-C_U^2\|c_-\|_{L^\infty(U)}>0, \end{align*} then \begin{align*} B[u,u]\ge \alpha\|u\|_{H^1_0}^2. \end{align*} Since $\varepsilon+\frac{K^2}{4\varepsilon}\ge K$ for $K=C_U\|b\|_{L^\infty(U)}$, with equality at $\varepsilon=K/2$ when $K>0$, such an $\varepsilon$ exists exactly when $C_U\|b\|_{L^\infty(U)}+C_U^2\|c_-\|_{L^\infty(U)}<1$. Under this smallness condition, the drift and the negative part of the potential are dominated by the Dirichlet energy. Hence $B$ is bounded and coercive, so for every $F\in H^{-1}(U)$ the *Lax-Milgram Theorem* gives a unique $u\in H^1_0(U)$ solving $B[u,v]=F(v)$ for every $v\in H^1_0(U)$. 
[/example] The example gives conditions under which Lax-Milgram still applies directly. When such a coercive estimate is lost, the next question is whether the failure is finite-dimensional rather than catastrophic; compact perturbations of coercive forms have exactly this behaviour. [quotetheorem:6430] [citeproof:6430] This result explains why coercivity is the cleanest case: it removes all compatibility conditions. Compactness matters because it preserves the Fredholm structure after passing from the coercive isomorphism $A_0$ to $I+A_0^{-1}K$; kernels remain finite-dimensional and ranges remain closed. For a noncompact perturbation, the range may fail to be closed and solvability need not reduce to finitely many compatibility conditions. The compatibility condition arises because the right-hand side must vanish on every solution of the homogeneous adjoint equation: testing the equation against such an adjoint solution makes the left-hand side zero, so the right-hand side must be zero there as well. Perturbative Fredholm theory is the next tool when lower-order terms or spectral parameters prevent a direct lower bound. Coercive linear theory gives existence, but many elliptic problems are more naturally understood as minimization problems. Chapter 6 recasts the weak formulation as an energy principle, showing how solutions arise as critical points or minimizers of a variational functional. # 6. Variational principles and energy minimization After Chapter 5's Lax-Milgram existence theorem for coercive bilinear forms, this chapter turns the weak theory of elliptic equations into a variational method. Instead of starting from a differential equation and then deriving an integral identity, we begin with an energy functional and ask whether it has a minimizer or a critical point. The main bridge is that, for quadratic energies, the Euler-Lagrange equation is the weak form of an elliptic boundary value problem. The chapter also introduces the direct method in the calculus of variations. 
Its logic is simple to state: take a minimizing sequence, use coercivity to get boundedness, extract a weakly convergent subsequence, and use weak lower semicontinuity to pass the minimum to the limit. The substance lies in verifying these hypotheses for Sobolev energies and for closed convex constraint sets such as obstacle classes. ## Dirichlet Energy and Weak Critical Points What should it mean for a function to solve Laplace's equation when the function is only in $H^1$ and may not have classical second derivatives? The variational answer is to measure the cost of a function by the size of its gradient and then identify solutions as functions that cannot decrease the energy under admissible perturbations. [definition: Dirichlet Energy] Let $U \subset \mathbb R^n$ be open. The Dirichlet energy on $U$ is the functional \begin{align*} E:H^1(U)\to \mathbb R,\qquad E[u] := \frac12 \int_U |\nabla u|^2\,d\mathcal L^n. \end{align*} [/definition] The factor $1/2$ has no effect on minimizers, but it removes a factor of $2$ from the first variation. To make minimization encode a boundary value problem, we need an admissible class that fixes the boundary data in the Sobolev trace sense. [definition: Sobolev Dirichlet Class] Let $U \subset \mathbb R^n$ be open and let $g \in H^1(U)$. The Sobolev Dirichlet class with boundary datum $g$ is \begin{align*} \mathcal A_g := g + H^1_0(U) = \{u \in H^1(U) : u-g \in H^1_0(U)\}. \end{align*} [/definition] This class records boundary data through the zero-trace space $H^1_0(U)$. We can now ask whether minimizing the gradient energy over this affine space is the same as satisfying the weak Laplace equation. [quotetheorem:577] [citeproof:577] The identity in the theorem is the weak equation $-\Delta u=0$ with homogeneous test functions. 
The affine constraint $u\in g+H^1_0(U)$ is essential: minimizing over all of $H^1(U)$ would ignore the boundary datum and constants would always have zero energy, even when $g$ is nonconstant on the boundary. The quadratic structure is also essential for the converse direction, because the expansion of $E[u+v]$ separates into a linear first variation plus a nonnegative remainder; for a nonconvex energy such as $\int_U (|\nabla u|^2-1)^2\,d\mathcal L^n$, a stationary point need not be a minimizer. The theorem does not say that a minimizer exists; it only identifies minimizers once they exist. Existence requires compactness and lower semicontinuity, which is why the chapter next turns from first variations to the direct method. A basic example checks that this variational formulation agrees with the classical notion when smooth harmonic functions are available. [example: Harmonic Functions as Energy Minimizers] Let $U=B(0,1)\subset\mathbb R^n$ and let $g(x)=x_1$. Then $g\in H^1(U)$ and $\nabla g=e_1$. If $\varphi\in C_c^\infty(U)$, extend $\varphi$ by $0$ outside $U$. Since the extension has compact support, [Fubini's theorem](/theorems/2961) gives \begin{align*} \int_U \nabla g\cdot\nabla\varphi\,d\mathcal L^n=\int_U \partial_1\varphi\,d\mathcal L^n=\int_{\mathbb R^{n-1}}\left(\int_{\mathbb R}\partial_1\varphi(x_1,x')\,dx_1\right)dx'=0. \end{align*} Now let $v\in H^1_0(U)$. By the definition of $H^1_0(U)$, there are $\varphi_k\in C_c^\infty(U)$ with $\varphi_k\to v$ in $H^1(U)$. The preceding computation gives $\int_U \nabla g\cdot\nabla\varphi_k\,d\mathcal L^n=0$, and \begin{align*} \left|\int_U \nabla g\cdot\nabla(v-\varphi_k)\,d\mathcal L^n\right|\le \|\nabla g\|_{L^2(U)}\|\nabla v-\nabla\varphi_k\|_{L^2(U)}\to 0 \end{align*} by Cauchy-Schwarz. Hence \begin{align*} \int_U \nabla g\cdot\nabla v\,d\mathcal L^n=0 \end{align*} for every $v\in H^1_0(U)$. 
To see the minimizing property directly, take any competitor $w\in g+H^1_0(U)$ and write $w=g+h$ with $h\in H^1_0(U)$. Using $\nabla w=\nabla g+\nabla h$, expand the energy: \begin{align*} E[w]=\frac12\int_U |\nabla g+\nabla h|^2\,d\mathcal L^n=\frac12\int_U |\nabla g|^2\,d\mathcal L^n+\int_U \nabla g\cdot\nabla h\,d\mathcal L^n+\frac12\int_U |\nabla h|^2\,d\mathcal L^n. \end{align*} The middle term is $0$ by the weak identity just proved, so \begin{align*} E[w]=E[g]+\frac12\int_U |\nabla h|^2\,d\mathcal L^n\ge E[g]. \end{align*} Thus the affine function $x_1$ is the Dirichlet-energy minimizer among all Sobolev functions with the same boundary datum, so the weak variational formulation recovers this classical harmonic solution. [/example] The example confirms the minimizer equation in a smooth case, but later energies are not always pure Dirichlet energy and may include forcing terms or coefficients. We therefore need a general language for first variations that applies before a particular PDE has been named. [definition: Weak Critical Point] Let $X$ be a real [Banach space](/page/Banach%20Space), let $I:X\to \mathbb R$ be a functional, and let $\mathcal A \subset X$ be an affine set. A point $u \in \mathcal A$ is a weak critical point of $I$ on $\mathcal A$ if, for every admissible direction $v$ with $u+t v \in \mathcal A$ for all sufficiently small $t \in \mathbb R$, the derivative \begin{align*} \frac{d}{dt}\Big|_{t=0} I[u+t v] \end{align*} exists and is equal to $0$. [/definition] For quadratic elliptic energies, the weak critical point condition is exactly a bilinear weak formulation. The next calculation is the model used to read a divergence-form equation from an energy. [quotetheorem:6431] [citeproof:6431] The corresponding PDE is $-\operatorname{div}(a\nabla u)=f$ in weak form, with zero boundary data. 
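To see why this weak identity encodes the stated PDE, here is a formal sketch under extra smoothness that the theorem itself does not assume: suppose, for illustration only, that $a\in C^1(U)$, $u\in C^2(U)$, and $v\in C_c^\infty(U)$. Integrating by parts, with no boundary term because $v$ has compact support in $U$, \begin{align*} \int_U a\nabla u\cdot\nabla v\,d\mathcal L^n=-\int_U \operatorname{div}(a\nabla u)\,v\,d\mathcal L^n. \end{align*} The weak identity $\int_U a\nabla u\cdot\nabla v\,d\mathcal L^n=\int_U fv\,d\mathcal L^n$ then reads \begin{align*} \int_U \bigl(-\operatorname{div}(a\nabla u)-f\bigr)v\,d\mathcal L^n=0 \end{align*} for every $v\in C_c^\infty(U)$, and the fundamental lemma of the calculus of variations gives $-\operatorname{div}(a\nabla u)=f$ almost everywhere in $U$. Without the extra smoothness the same identity is taken as the definition of the equation.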
The lower bound $a\ge \theta>0$ is not needed merely to compute the derivative, but it becomes decisive for existence: if $a=0$ on a subinterval in one dimension, changing $u$ inside that subinterval can cost no gradient energy, and coercivity can fail. The assumption $f\in L^2(U)$ ensures that the linear term is continuous on $H^1_0(U)$ on bounded domains, using Poincare's inequality and Cauchy-Schwarz; if $f$ is too singular to define a bounded functional on $H^1_0(U)$, the displayed right-hand side may not be meaningful for every test function. This theorem does not by itself prove that the critical point exists or is unique; it only translates the variational stationarity condition into the weak PDE. The coefficient $a$ represents an inhomogeneous conductivity, so minimization selects the potential whose weighted gradient cost balances the forcing. [example: Weighted Conductivity Energy] Let $U \subset \mathbb R^n$ be bounded, let $a \in L^\infty(U)$ satisfy $0<\theta\le a(x)\le \Theta$ a.e., and let $f\in L^2(U)$. For $u,v\in H^1_0(U)$ and $t\in\mathbb R$, the gradient term expands pointwise as \begin{align*} |\nabla u+t\nabla v|^2=|\nabla u|^2+2t\nabla u\cdot\nabla v+t^2|\nabla v|^2. \end{align*} Substituting this into \begin{align*} I[u+t v]=\frac12\int_U a(x)|\nabla u+t\nabla v|^2\,d\mathcal L^n-\int_U f(u+t v)\,d\mathcal L^n \end{align*} gives \begin{align*} I[u+t v]=I[u]+t\left(\int_U a(x)\nabla u\cdot\nabla v\,d\mathcal L^n-\int_U fv\,d\mathcal L^n\right)+\frac{t^2}{2}\int_U a(x)|\nabla v|^2\,d\mathcal L^n. \end{align*} Hence the derivative at $t=0$ is \begin{align*} \frac{d}{dt}\Big|_{t=0}I[u+t v]=\int_U a(x)\nabla u\cdot\nabla v\,d\mathcal L^n-\int_U fv\,d\mathcal L^n. \end{align*} Therefore $u$ is a weak critical point exactly when this expression is $0$ for every $v\in H^1_0(U)$, equivalently \begin{align*} \int_U a(x)\nabla u\cdot \nabla v\,d\mathcal L^n=\int_U f v\,d\mathcal L^n. 
\end{align*} This is the weak form of the conductivity equation $-\operatorname{div}(a\nabla u)=f$ with zero boundary data. The weight $a$ changes the cost of gradients pointwise: where $a$ is larger, the term $a(x)|\nabla u|^2$ contributes more to the energy. The lower ellipticity bound is the estimate used later for coercivity, since \begin{align*} \frac12\int_U a(x)|\nabla u|^2\,d\mathcal L^n\ge \frac{\theta}{2}\int_U |\nabla u|^2\,d\mathcal L^n. \end{align*} [/example] ## The Direct Method for Coercive Energies How do we prove that an energy minimizer exists when no explicit formula is available? The direct method avoids solving the Euler-Lagrange equation at first. It proves existence by compactness and lower semicontinuity, then derives the equation or inequality satisfied by the minimizer. Each hypothesis in the method prevents a specific failure. Without coercivity, a minimizing sequence may run off to infinity; for example $I[u]=e^{-\|u\|_X}$ has infimum $0$ but no minimizer. Without weak lower semicontinuity, a weak limit of nearly minimizing functions may have larger energy than the limiting infimum. Without a weakly closed admissible set, the compactness step may produce a limit outside the constraint. These failures motivate the two definitions below before the theorem is stated. [definition: Coercive Functional] Let $X$ be a Banach space and let $I:X\to \mathbb R\cup\{+\infty\}$. The functional $I$ is coercive on a set $\mathcal A\subset X$ if \begin{align*} \|u_k\|_X \to \infty, \quad u_k\in \mathcal A \end{align*} implies \begin{align*} I[u_k]\to +\infty. \end{align*} [/definition] Coercivity prevents minimizing sequences from escaping to infinity in the ambient norm. Bounded weakly convergent subsequences are useful only if the energy cannot jump downward at the weak limit, so we need a lower semicontinuity condition adapted to weak convergence. 
[definition: Weak Lower Semicontinuity] Let $X$ be a Banach space and let $I:X\to \mathbb R\cup\{+\infty\}$. The functional $I$ is weakly lower semicontinuous on $X$ if, whenever $u_k \rightharpoonup u$ in $X$, \begin{align*} I[u] \le \liminf_{k\to\infty} I[u_k]. \end{align*} [/definition] Weak lower semicontinuity lets the weak limit inherit the minimizing property. The direct method now has three separate failure modes to rule out at once: a minimizing sequence might escape to infinity, it might have no weakly convergent subsequence, or its weak limit might fall outside the admissible set or lose the infimum value. The following existence principle packages the coercivity, compactness, admissibility, and lower semicontinuity hypotheses needed to close this argument. [quotetheorem:3105] [citeproof:3105] This theorem is an existence engine: once the analytic hypotheses are verified, the minimizer follows without guessing its form. Reflexivity is the compactness input; in non-reflexive spaces a bounded minimizing sequence need not have any weakly convergent subsequence. Closed convexity is the admissibility input; closed convex sets are weakly closed, while a merely norm-closed nonconvex set need not be weakly closed. Weak lower semicontinuity is the energy input; without it, the weak limit may lose the minimizing property. The theorem does not assert uniqueness or an Euler-Lagrange equation, since both require additional structure. In Hilbert-space elliptic problems, these hypotheses are packaged into a standard statement for convex coercive energies. [quotetheorem:6432] [citeproof:6432] The quadratic conductivity energy from the previous section fits this theorem once the linear term is controlled. The Hilbert-space setting supplies reflexivity, convexity gives weak lower semicontinuity for the energy, and coercivity prevents escape to infinity. 
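The mechanism can be condensed into one chain of inequalities. This is a sketch of the standard argument, written in notation introduced here ($\mathcal A$ for the admissible set, $m$ for the infimum, assumed finite), and it is not a restatement of the cited proofs. Let $m=\inf_{\mathcal A} I$ and choose $u_k\in\mathcal A$ with $I[u_k]\to m$. Coercivity gives $\sup_k\|u_k\|_X<\infty$, reflexivity gives a subsequence $u_{k_j}\rightharpoonup u$ in $X$, weak closedness of $\mathcal A$ gives $u\in\mathcal A$, and weak lower semicontinuity gives \begin{align*} m\le I[u]\le \liminf_{j\to\infty} I[u_{k_j}]=m. \end{align*} Hence $I[u]=m$, so $u$ is a minimizer. Each inequality in the chain uses exactly one of the hypotheses, which is why dropping any of them breaks the argument at a specific step.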
Strict convexity is the extra ingredient for uniqueness; without it, a flat-bottomed convex functional such as $I[u]=0$ on the closed unit ball of $H^1_0(U)$ and $I[u]=\|u\|_{H^1_0}^2-1$ outside it can have many minimizers. The theorem still does not identify the PDE satisfied by the minimizer, so it must be paired with the Euler-Lagrange calculation from the first section. Poincare's inequality gives coercivity on $H^1_0(U)$ when $U$ is bounded. [example: Coercivity of the Weighted Dirichlet Energy] Assume $U\subset\mathbb R^n$ is bounded and open, $a\in L^\infty(U)$ satisfies $a(x)\ge \theta>0$ a.e., and $f\in L^2(U)$. For $u\in H^1_0(U)$, the weighted energy is \begin{align*} I[u]=\frac12\int_U a(x)|\nabla u|^2\,d\mathcal L^n-\int_U f u\,d\mathcal L^n. \end{align*} Using $a(x)\ge\theta$ a.e. and Cauchy-Schwarz, \begin{align*} I[u]\ge \frac{\theta}{2}\|\nabla u\|_{L^2(U)}^2-\|f\|_{L^2(U)}\|u\|_{L^2(U)}. \end{align*} By *Poincare's inequality*, there is a constant $C_P>0$, depending only on $U$, such that \begin{align*} \|u\|_{L^2(U)}\le C_P\|\nabla u\|_{L^2(U)} \end{align*} for every $u\in H^1_0(U)$. Substituting this estimate gives \begin{align*} I[u]\ge \frac{\theta}{2}\|\nabla u\|_{L^2(U)}^2-C_P\|f\|_{L^2(U)}\|\nabla u\|_{L^2(U)}. \end{align*} Writing $s=\|\nabla u\|_{L^2(U)}$, the right-hand side is \begin{align*} \frac{\theta}{2}s^2-C_P\|f\|_{L^2(U)}s. \end{align*} This tends to $+\infty$ as $s\to\infty$ because the coefficient $\theta/2$ of $s^2$ is positive. Also, \begin{align*} \|u\|_{H^1(U)}^2\le (C_P^2+1)\|\nabla u\|_{L^2(U)}^2, \end{align*} so $\|u\|_{H^1(U)}\to\infty$ forces $\|\nabla u\|_{L^2(U)}\to\infty$. Thus $I$ is coercive on $H^1_0(U)$. The existence and uniqueness conclusion uses the full direct-method package, not coercivity alone. The quadratic gradient term is convex and weakly lower semicontinuous, the linear map $u\mapsto \int_U fu\,d\mathcal L^n$ is weakly continuous on $H^1_0(U)$, and the lower bound $a\ge\theta>0$ makes the quadratic part strictly convex.
Therefore *Existence of Minimizers for Coercive Convex Energies* gives a unique minimizer in $H^1_0(U)$, and *Euler-Lagrange Equation for Quadratic Energies* identifies it as the weak solution of \begin{align*} \int_U a(x)\nabla u\cdot\nabla v\,d\mathcal L^n=\int_U fv\,d\mathcal L^n \end{align*} for every $v\in H^1_0(U)$. [/example] ## Convex Integral Functionals and Obstacle Constraints What changes when the admissible functions must satisfy an inequality constraint rather than only boundary data? The minimizer may no longer satisfy an equality for every variation, because variations that cross the constraint are not allowed. The Euler-Lagrange equation is then replaced by a variational inequality. [definition: Convex Integral Functional] Let $U\subset\mathbb R^n$ be open, let $1<q<\infty$, and let $F:U\times\mathbb R\times\mathbb R^n\to\mathbb R\cup\{+\infty\}$ be measurable in $x$ and convex in the gradient variable $\xi$. The associated integral functional is the map \begin{align*} I:W^{1,q}(U)\to \mathbb R\cup\{+\infty\},\qquad I[u] := \int_U F(x,u(x),\nabla u(x))\,d\mathcal L^n. \end{align*} [/definition] Convexity is the structural reason weak lower semicontinuity is available for many integral functionals, but the dependence on the value $u(x)$ also has to be controlled. The following version separates the convex gradient part from a lower-order term whose continuity is compatible with strong $L^q$ convergence. [quotetheorem:986] [citeproof:986] This theorem explains why convex energies behave well under the weak compactness supplied by Sobolev spaces. Convexity in the gradient variable is the structural hypothesis: nonconvex gradient energies can develop oscillating minimizing sequences whose weak limits have strictly larger energy than the relaxed limit. Strong convergence of $u_k$ in $L^q(U)$ controls the lower-order dependence on $u$, and the growth bound prevents that term from losing integrability. 
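A standard one-dimensional illustration of the failure without convexity may help; the integrand and the sequence below are chosen here for the example and are not part of the theorem. On $U=(0,1)$ with $q=4$, take $F(\xi)=(\xi^2-1)^2$, which is not convex in $\xi$. For $k\ge 1$ let $u_k$ be the sawtooth function with $u_k(0)=0$, slope $+1$ on each interval $\bigl(\tfrac{j}{k},\tfrac{j+1/2}{k}\bigr)$ and slope $-1$ on each interval $\bigl(\tfrac{j+1/2}{k},\tfrac{j+1}{k}\bigr)$, $j=0,\dots,k-1$. Then $|u_k'|=1$ a.e. and $0\le u_k\le \tfrac{1}{2k}$, so $u_k\to 0$ uniformly and $u_k\rightharpoonup 0$ in $W^{1,4}(0,1)$, while \begin{align*} I[u_k]=\int_0^1\bigl((u_k')^2-1\bigr)^2\,dx=0 \quad\text{for every } k,\qquad I[0]=\int_0^1 (0-1)^2\,dx=1. \end{align*} Thus $I[0]=1>0=\liminf_{k\to\infty} I[u_k]$: the oscillating gradients keep the energy at zero, but the weak limit averages them out and pays the full cost.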
The theorem does not provide coercivity or compactness, and more general integrands require the full normal-integrand lower-semicontinuity theory. The next model problem adds an inequality constraint, so the admissible set must be a closed convex subset rather than an affine space. [definition: Obstacle Class] Let $U\subset\mathbb R^n$ be bounded and open, let $g\in H^1(U)$, and let $\psi\in H^1(U)$. The obstacle class is \begin{align*} \mathcal K_{g,\psi}:=\{u\in g+H^1_0(U): u\ge \psi \text{ a.e. in } U\}. \end{align*} [/definition] The pointwise inequality is interpreted almost everywhere, which is the natural language for Sobolev functions. This motivates the following existence theorem for the membrane constrained above the obstacle. [quotetheorem:6433] [citeproof:6433] The minimizer describes an elastic membrane forced to lie above the obstacle $\psi$. Nonemptiness of $\mathcal K_{g,\psi}$ is a real compatibility condition: if the boundary datum lies below an obstacle that reaches the boundary in the trace sense, there may be no admissible function. Closed convexity is also essential, because the direct method must keep the weak limit above the obstacle. The theorem proves existence and uniqueness of the constrained minimizer, but it does not say where the membrane touches the obstacle or what equation holds on the non-contact region. This motivates the variational inequality studied systematically in Chapter 11, since downward variations may violate the obstacle and only one-sided comparisons remain admissible. [quotetheorem:6434] [citeproof:6434] This is the constrained replacement for the weak equation. Convexity of the admissible set is the reason the segment from $u$ to any competitor stays admissible; without it, the directional derivative along that segment would not be available. The minimizer assumption is also essential, since the inequality is a first-order necessary condition for the constrained minimum, not for an arbitrary admissible function. 
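The one-sided first variation behind this inequality can be sketched in a few lines, assuming the theorem concerns the Dirichlet energy $E$ minimized over $\mathcal K_{g,\psi}$. Let $u$ be the minimizer and $v\in\mathcal K_{g,\psi}$. For $0\le t\le 1$, convexity gives $u+t(v-u)=(1-t)u+tv\in\mathcal K_{g,\psi}$, so minimality of $u$ and the quadratic expansion of $E$ yield \begin{align*} 0\le E[u+t(v-u)]-E[u]=t\int_U \nabla u\cdot\nabla(v-u)\,d\mathcal L^n+\frac{t^2}{2}\int_U |\nabla(v-u)|^2\,d\mathcal L^n. \end{align*} Dividing by $t>0$ and letting $t\downarrow 0$ gives \begin{align*} \int_U \nabla u\cdot\nabla(v-u)\,d\mathcal L^n\ge 0. \end{align*} Only $t\ge 0$ is admissible in general, because $u-t(v-u)$ may fall below $\psi$; this is why the conclusion is an inequality rather than an equation.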
The theorem does not identify the contact set or imply smoothness of the free boundary. It gives an equality only for perturbations that move both upward and downward without violating the obstacle, and a one-sided inequality for general admissible comparisons. [example: Membrane Obstacle Problem] Let $U=(-1,1)$, let $g=0$ in the trace sense, and let $\psi\in C_c^\infty(U)$ have $\max_U\psi>0$. The admissible class is \begin{align*} \mathcal K_{0,\psi}=\{v\in H^1_0(U):v\ge \psi \text{ a.e. in }U\}. \end{align*} Since $\psi$ is continuous and positive at some point, it is positive on some subinterval of $U$, so the zero function fails the inequality $0\ge \psi$ there and is not admissible. By *Existence for the Membrane Obstacle Problem*, the Dirichlet energy \begin{align*} E[v]=\frac12\int_{-1}^1 |v'(x)|^2\,dx \end{align*} has a unique minimizer $u\in\mathcal K_{0,\psi}$. By *[Variational Inequality for the Obstacle Problem](/theorems/6434)*, this minimizer satisfies \begin{align*} \int_{-1}^1 u'(x)(v-u)'(x)\,dx\ge 0 \end{align*} for every $v\in\mathcal K_{0,\psi}$. Now suppose $J\subset U$ is an open interval on which the continuous representative of $u$ satisfies $u>\psi$. Let $\varphi\in C_c^\infty(J)$. Because $u-\psi$ is continuous and positive on the compact set $\operatorname{spt}\varphi\subset J$, it has a positive minimum there. Hence for $|t|$ sufficiently small the functions $u+t\varphi$ still satisfy $u+t\varphi\ge\psi$ on $J$, and they agree with $u$ outside $J$; hence $u+t\varphi\in\mathcal K_{0,\psi}$. Applying the variational inequality with $v=u+t\varphi$ gives \begin{align*} 0\le \int_{-1}^1 u'(x)(t\varphi)'(x)\,dx = t\int_J u'(x)\varphi'(x)\,dx. \end{align*} Using $t>0$ gives $\int_J u'\varphi'\,dx\ge0$, while using $t<0$ gives $\int_J u'\varphi'\,dx\le0$. Therefore \begin{align*} \int_J u'(x)\varphi'(x)\,dx=0 \end{align*} for every $\varphi\in C_c^\infty(J)$, which is the weak equation $-u''=0$ on $J$. Thus $u'$ is weakly constant on $J$, so $u$ is affine on $J$.
The contact set $\{u=\psi\}$ is nonempty: if $u>\psi$ everywhere in $U$, then the same argument would give $-u''=0$ on all of $U$, and the zero trace condition would force $u$ to be affine with $u(-1)=u(1)=0$, hence $u\equiv0$, contradicting $\max_U\psi>0$. The variational inequality therefore records both regimes at once: equality with the obstacle on the contact set, and affine behavior on every open interval where the obstacle is inactive. [/example] The obstacle problem is an instance of a general Hilbert-space phenomenon: minimizing a squared distance over a closed convex set gives a projection characterized by an inequality. This geometric theorem explains why variational inequalities are the natural language of constrained minimization. [quotetheorem:647] [citeproof:647] The hypotheses here are sharp in the same way as for the direct method. Closedness prevents the nearest point from lying only in the closure of $C$, convexity gives both uniqueness and the variational characterization, and Hilbert-space geometry supplies the [inner product](/page/Inner%20Product) expansion used in the proof. The theorem does not extend in this form to arbitrary Banach spaces, where nearest points may fail to be unique or may not satisfy an inner-product inequality. In elliptic variational problems, this theorem is often applied not with the original $H^1$ inner product but with the energy inner product associated to a coercive bilinear form. The projection inequality then becomes the weak variational inequality for the constrained PDE. The variational viewpoint produces weak solutions, yet those solutions still need more regularity to justify the classical intuition built in earlier chapters. Chapter 7 begins the regularity theory by proving that weak solutions of elliptic equations are smoother in the interior than their definition initially suggests. # 7. Interior regularity for weak solutions This chapter begins the regularity part of the course. 
The prerequisites are the weak formulation of elliptic boundary value problems from Chapters 4 and 5, the construction of weak solutions by Hilbert-space and variational methods from Chapters 5 and 6, and the basic Sobolev-space tools developed earlier in the notes. The goal is to understand how much smoothness an elliptic equation creates away from the boundary: existence in $H^1$ is only the starting point, and the estimates below explain when a weak solution has higher weak derivatives or even classical smoothness. The guiding principle is that boundary conditions should not affect estimates inside the domain. If $V \subset\subset U$, then a cutoff function supported in $U$ and equal to $1$ on $V$ lets us test the equation locally, at the cost of constants depending on $\operatorname{dist}(V, \partial U)$. Caccioppoli inequalities provide the first version of this principle; difference quotients and Calderon-Zygmund estimates then turn it into $H^2$ and $W^{2,p}$ regularity. ## Local Energy Estimates from Cutoffs The basic question is how to estimate derivatives of a weak solution on a smaller set using only the solution and the forcing on a larger set. Since weak formulations allow test functions with compact support, the natural device is to multiply the solution by a cutoff that vanishes before reaching the boundary. [definition: Interior Cutoff] Let $U \subset \mathbb R^n$ be open and let $V \subset\subset W \subset\subset U$ be open sets. An interior cutoff for $V \subset W$ is a map $\zeta:U\to\mathbb R$ such that $\zeta$ is the extension by $0$ to $U$ of a function in $C_c^\infty(W)$, $0 \le \zeta \le 1$ on $U$, and $\zeta = 1$ on $V$. [/definition] The cutoff creates a bridge between estimates on $W$ and estimates on $V$. The price is the appearance of $|\nabla \zeta|$, which measures how quickly the cutoff changes in the annular region $W \setminus V$. 
[example: Cutoff Between Concentric Balls] Let $x_0\in\mathbb R^n$ and $0<r<R$. Choose $\rho\in C^\infty(\mathbb R)$ such that $0\le \rho\le 1$, $\rho(t)=1$ for $t\le 0$, and $\rho(t)=0$ for $t\ge 1/2$. Define \begin{align*} \zeta(x)=\rho\left(\frac{|x-x_0|-r}{R-r}\right). \end{align*} If $|x-x_0|\le r$, then $(|x-x_0|-r)/(R-r)\le 0$, hence $\zeta(x)=1$. If $|x-x_0|\ge (R+r)/2$, then \begin{align*} \frac{|x-x_0|-r}{R-r}\ge \frac{(R+r)/2-r}{R-r}=\frac{1}{2}, \end{align*} so $\zeta(x)=0$. Therefore $\operatorname{spt}\zeta\subset \overline{B(x_0,(R+r)/2)}\subset B(x_0,R)$. Since $\zeta$ is constant on $B(x_0,r)$, the possible nonsmoothness of $|x-x_0|$ at $x_0$ is irrelevant, and $\zeta\in C_c^\infty(B(x_0,R))$. It remains to check the derivative bound. On the regions $|x-x_0|<r$ and $|x-x_0|>(R+r)/2$, the function $\zeta$ is constant, so $\nabla\zeta=0$. On the annulus $r<|x-x_0|<(R+r)/2$, the chain rule gives \begin{align*} \nabla\zeta(x)=\rho'\left(\frac{|x-x_0|-r}{R-r}\right)\frac{x-x_0}{(R-r)|x-x_0|}. \end{align*} Taking norms and using $|(x-x_0)/|x-x_0||=1$ on this annulus, \begin{align*} |\nabla\zeta(x)|\le \frac{\|\rho'\|_{L^\infty(\mathbb R)}}{R-r}. \end{align*} Thus $\zeta$ is an interior cutoff for $B(x_0,r)\subset B(x_0,R)$ satisfying $|\nabla\zeta|\le C/(R-r)$, with constant $C=\|\rho'\|_{L^\infty(\mathbb R)}$ depending only on the fixed transition function. [/example] [illustration:caccioppoli-cutoff-annulus] The cutoff construction is not only a technical example; it is the device that turns a weak equation into a local energy estimate. For a weakly harmonic function there is no pointwise identity available for $|\nabla u|^2$, so interior regularity must begin with an integral substitute. The next estimate answers the resulting question: how much of the gradient energy on the smaller ball can be controlled using only the $L^2$ size of $u$ on the larger ball and the separation scale $R-r$?
[quotetheorem:6435] [citeproof:6435] This estimate is local in two senses: the ball $B(x_0,r)$ stays away from the boundary, and the right-hand side only uses information from the larger ball $B(x_0,R)$. The weak harmonic equation is indispensable, because the proof replaces the missing pointwise identity $\Delta u=0$ by the identity obtained after testing with $\zeta^2u$. For instance, adding a compactly supported high-frequency bump \begin{align*} w_m(x)=\frac{1}{m}\eta(x)\sin(mx_1) \end{align*} in $B(0,1)$ gives $\|w_m\|_{L^2(B(0,1))}$ bounded by $C/m$ while $\|\nabla w_m\|_{L^2(B(0,1/2))}$ is bounded below for a suitable fixed cutoff $\eta$, so no estimate of this type can hold for arbitrary $H^1$ functions with a constant independent of $m$. The strict inequality $r<R$ is also structural: if $r=R$, a cutoff equal to $1$ on the same ball and compactly supported in it would have to drop to $0$ at the boundary, forcing derivatives to concentrate with no finite bound depending only on the dimension. The theorem does not control $u$ up to $\partial U$, nor does it give a converse from small energy to harmonicity. The next problem is to handle a nonzero right-hand side, where the same test function produces a forcing term that needs control. The harmonic argument still gives the coercive part of the estimate, but now the right-hand side contains $\int f\zeta^2u\,d\mathcal L^n$. To make that expression meaningful and quantitatively useful, the local data must be square-integrable on the larger ball, and the equation must hold weakly against compactly supported tests. [quotetheorem:6436] [citeproof:6436] The estimate does not require boundary data, so it applies equally to weak solutions obtained from Dirichlet, Neumann, or whole-space formulations. The scale factor on $f$ is not cosmetic: if the ball is rescaled, the forcing has second-derivative dimension and must be multiplied by a length squared in this energy estimate. 
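The scaling can be checked directly. The rescaled functions below are notation introduced for this sketch, and the unit-scale estimate is assumed to have the displayed form. If $-\Delta u=f$ weakly in $B(x_0,R)$, set $u_R(y)=u(x_0+Ry)$ and $f_R(y)=f(x_0+Ry)$ for $y\in B(0,1)$. Then $\nabla u_R(y)=R\,(\nabla u)(x_0+Ry)$ and \begin{align*} -\Delta u_R=R^2 f_R \quad\text{in } B(0,1). \end{align*} The change of variables $x=x_0+Ry$, $dx=R^n\,dy$, gives \begin{align*} \int_{B(0,1/2)}|\nabla u_R|^2\,dy=R^{2-n}\int_{B(x_0,R/2)}|\nabla u|^2\,dx,\quad \int_{B(0,1)}|u_R|^2\,dy=R^{-n}\int_{B(x_0,R)}|u|^2\,dx,\quad \int_{B(0,1)}|R^2f_R|^2\,dy=R^{4-n}\int_{B(x_0,R)}|f|^2\,dx. \end{align*} Inserting these into a unit-scale estimate $\int_{B(0,1/2)}|\nabla u_R|^2\,dy\le C\bigl(\int_{B(0,1)}|u_R|^2\,dy+\int_{B(0,1)}|R^2f_R|^2\,dy\bigr)$ and multiplying by $R^{n-2}$ yields \begin{align*} \int_{B(x_0,R/2)}|\nabla u|^2\,dx\le C\left(R^{-2}\int_{B(x_0,R)}|u|^2\,dx+R^{2}\int_{B(x_0,R)}|f|^2\,dx\right), \end{align*} so the powers $R^{-2}$ on $u$ and $R^{2}$ on $f$ are forced by dimensional consistency.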
Each hypothesis rules out a concrete failure. If $f\notin L^2_{\mathrm{loc}}$, the forcing term may not be bounded by Cauchy-Schwarz; in dimension $n\ge 3$, the model singularity $f(x)=|x|^{-n/2}$ near $0$ is locally integrable in weaker spaces but not in $L^2$, so the displayed right-hand side is infinite on every ball around $0$. If $u$ does not satisfy $-\Delta u=f$ weakly, the identity used to replace $\int \zeta^2|\nabla u|^2$ by a forcing term is unavailable; high-frequency compactly supported functions show that arbitrary $H^1$ functions cannot obey the estimate with $f=0$. If $B(x_0,R)$ is not compactly contained in $U$, the cutoff may leave the domain or touch the boundary, and the proof would require boundary information not present in the theorem. Thus the theorem still does not give global $H^1$ control unless boundary conditions or global coercivity are supplied. The boundary-free nature of the estimate is easiest to see on a simple domain where the global boundary condition is present but unused. The next example deliberately places the region of interest well inside the cube, then chooses a second intermediate cube to support the cutoff. This separates the mechanism of the interior estimate from any regularity or compatibility of the boundary values on $\partial U$. [example: Local Estimate Away from the Boundary] Let $U=(0,1)^n$, let $f\in L^2(U)$, and let $u\in H^1_0(U)$ be a weak solution of $-\Delta u=f$ in $U$. Let $V=(1/4,3/4)^n$, and choose $W=(1/8,7/8)^n$. Then \begin{align*} \operatorname{dist}(V,\partial W)=\frac18. \end{align*} Choose an interior cutoff $\zeta\in C_c^\infty(W)$ with $\zeta=1$ on $V$ and $|\nabla\zeta|\le C_\zeta\,\operatorname{dist}(V,\partial W)^{-1}$. Since $-\Delta u=f$ weakly in $U$ and $\zeta$ is supported in $W\subset\subset U$, the local Poisson Caccioppoli estimate, *Caccioppoli Inequality for Poisson Equations*, is available on the pair $V\subset W$. First, because $\zeta=1$ on $V$ and $\zeta^2|\nabla u|^2\ge 0$ on $W$, \begin{align*} \int_V |\nabla u|^2\,d\mathcal L^n\le \int_W \zeta^2|\nabla u|^2\,d\mathcal L^n.
\end{align*} The same estimate applied on the pair $V\subset W$ gives \begin{align*} \int_W \zeta^2|\nabla u|^2\,d\mathcal L^n\le C_0\left(\operatorname{dist}(V,\partial W)^{-2}\int_W |u|^2\,d\mathcal L^n+\operatorname{dist}(V,\partial W)^2\int_W |f|^2\,d\mathcal L^n\right). \end{align*} Combining these two inequalities and taking square roots, using $\sqrt{a+b}\le \sqrt a+\sqrt b$ for $a,b\ge 0$, yields \begin{align*} \|\nabla u\|_{L^2(V)}\le C_0^{1/2}\left(\operatorname{dist}(V,\partial W)^{-1}\|u\|_{L^2(W)}+\operatorname{dist}(V,\partial W)\|f\|_{L^2(W)}\right). \end{align*} Because $\operatorname{dist}(V,\partial W)=1/8$ is fixed, this becomes \begin{align*} \|\nabla u\|_{L^2(V)}\le C\left(\|u\|_{L^2(W)}+\|f\|_{L^2(W)}\right), \end{align*} where $C$ depends on $n$ and on the separation between $V$ and $\partial W$. The cutoff is supported strictly inside $U$, so the argument never uses the trace condition $u=0$ on $\partial U$; the estimate is controlled entirely by the local data on $W$. [/example] ## Difference Quotients and Interior $H^2$ Regularity The next problem is to turn energy estimates for $u$ into energy estimates for its derivatives. Differentiating a weak solution is not initially legitimate, so the course uses difference quotients as a substitute for derivatives that still live inside Sobolev spaces. [definition: Difference Quotient] Let $U \subset \mathbb R^n$ be open, let $h \in \mathbb R$ with $h\ne 0$, and let $e_k$ be the $k$-th coordinate vector. Define \begin{align*} U_{k,h}=\{x\in U:x+he_k\in U\}. \end{align*} For a function $u:U\to \mathbb R$, the $k$-th difference quotient with step $h$ is the map $D_{k,h}u:U_{k,h}\to\mathbb R$ given by \begin{align*} D_{k,h}u(x)=\frac{u(x+he_k)-u(x)}{h}. \end{align*} [/definition] Difference quotients obey discrete analogues of product rules and integration by parts. 
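Two such identities can be written out explicitly. They are stated here as a sketch for functions on $\mathbb R^n$, or on $U$ with one factor compactly supported in $U$ and $|h|$ small enough that all translates stay inside $U$; the notation $v^h(x)=v(x+he_k)$ is introduced only for this display. The discrete product rule is \begin{align*} D_{k,h}(vw)=v^h\,D_{k,h}w+w\,D_{k,h}v, \end{align*} which follows by adding and subtracting $v(x+he_k)w(x)$ in the numerator. The discrete integration by parts formula is \begin{align*} \int v\,D_{k,h}w\,d\mathcal L^n=-\int w\,D_{k,-h}v\,d\mathcal L^n, \end{align*} which follows from the change of variables $x\mapsto x-he_k$ in the translated term, since $\bigl(v(x-he_k)-v(x)\bigr)/h=-D_{k,-h}v(x)$.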
This matters because a weak solution in $H^1_{\mathrm{loc}}$ need not have classical second derivatives, so testing the equation with an ordinary derivative of $u$ is not justified at the start. The theorem below explains why uniform bounds for these quotients are enough to recover genuine weak derivatives. [quotetheorem:6437] [citeproof:6437] This characterisation reduces interior $H^2$ regularity to proving uniform $L^2$ bounds for difference quotients of $\nabla u$. The uniformity in $h$ is essential: a single bounded difference quotient at one scale says only finite-scale information and does not identify a weak derivative. The compact containment $W\subset\subset U$ is a separate hypothesis, not just a convenience, because it gives a fixed set on which all sufficiently small translations are legal; if $W$ reaches $\partial U$, the behaviour of difference quotients may reflect how the function terminates at the endpoint rather than interior differentiability. For example, $u(x)=x^{1/2}$ on $(0,1)$ has endpoint-dominated quotients near $0$, while on every interval $[\delta,1-\delta]$ the same function has stable interior quotients. The assumption $u\in L^2(U)$ is also not cosmetic: the theorem concludes that $u$ represents an $H^1(W)$ function, which already requires $u\in L^2(W)$. For $u(x)=|x|^{-n/2}$ near $0$, the function itself is not in $L^2$ on any neighbourhood of the singularity, so even a formal discussion of derivative quotients cannot produce the asserted Sobolev representative there. The result also does not assert pointwise differentiability; it is a Sobolev-space compactness statement whose conclusion is an $L^2$ weak derivative. The next theorem supplies those bounds for smooth coefficients, where coefficient differences remain controlled. The difference-quotient test will produce second derivatives only if the leading part of the operator is coercive and if all translated cutoff functions remain inside the domain. 
Smoothness of the coefficients controls the commutators created by taking finite differences. [quotetheorem:6438] [citeproof:6438] The theorem says that weak solutions of smooth uniformly elliptic divergence-form equations are twice weakly differentiable inside the domain. The $C^1$ coefficient hypothesis is used when bounding $D_{k,h}a_{ij}$; with merely bounded measurable coefficients, this difference-quotient proof breaks down and interior $H^2$ regularity can fail, as in divergence-form equations with jump coefficients across a smooth interface, where the flux is continuous but the full Hessian may contain an interface singularity. Uniform ellipticity is equally necessary: if the leading matrix degenerates, for example $a_{11}(x)=x_1^2$ and the other coefficients vanish near $\{x_1=0\}$, the equation no longer controls all first derivatives and cannot force second derivatives in directions where the operator has lost coercivity. The local containment $V\subset\subset U'\subset\subset U$ is what permits finite differences and cutoffs without boundary terms; on a half-ball, weak solutions with rough boundary data may fail to be $H^2$ up to the flat boundary even when the coefficients are smooth. The theorem therefore gives no boundary regularity and no estimate with constants independent of the distance from $V$ to $\partial U$. The following example uses the estimate as an iteration tool for Poisson equations with smooth forcing. [example: Bootstrapping Poisson Equation with Smooth Forcing] Let $U\subset\mathbb R^n$ be open, suppose $u\in H^1_{\mathrm{loc}}(U)$, and suppose \begin{align*} -\Delta u=f \end{align*} in the weak sense on $U$, with $f\in C^\infty(U)$. On every pair $V\subset\subset U'\subset\subset U$, the constant-coefficient case of *Interior $H^2$ Estimate for Smooth Coefficients* gives \begin{align*} \|u\|_{H^2(V)} \le C\left(\|f\|_{L^2(U')}+\|u\|_{L^2(U')}\right), \end{align*} so $u\in H^2_{\mathrm{loc}}(U)$. We now iterate this argument. 
Let $\alpha$ be a multi-index. Since distributional derivatives commute with the Laplacian, \begin{align*} -\Delta(D^\alpha u) = D^\alpha(-\Delta u) = D^\alpha f \end{align*} in $\mathcal D'(U)$. For $|\alpha|=1$, the first step gives $D^\alpha u\in H^1_{\mathrm{loc}}(U)$, while $D^\alpha f\in C^\infty(U)\subset L^2_{\mathrm{loc}}(U)$. Applying the same interior $H^2$ estimate to $D^\alpha u$ yields \begin{align*} D^\alpha u\in H^2_{\mathrm{loc}}(U), \end{align*} hence $u\in H^3_{\mathrm{loc}}(U)$. Inductively, if $u\in H^m_{\mathrm{loc}}(U)$ for some $m\ge 2$ and $|\alpha|=m-1$, then $D^\alpha u\in H^1_{\mathrm{loc}}(U)$ and satisfies \begin{align*} -\Delta(D^\alpha u)=D^\alpha f, \end{align*} with $D^\alpha f\in L^2_{\mathrm{loc}}(U)$. The estimate gives $D^\alpha u\in H^2_{\mathrm{loc}}(U)$ for every such $\alpha$, which is exactly the statement that all derivatives of $u$ of order $m+1$ belong to $L^2_{\mathrm{loc}}(U)$, so $u\in H^{m+1}_{\mathrm{loc}}(U)$. By induction, \begin{align*} u\in H^m_{\mathrm{loc}}(U) \end{align*} for every $m\in\mathbb N$. Finally, fix a compact set $K\subset U$ and an integer $\ell\ge 0$. Choose $m>\ell+n/2$. Since $u\in H^m_{\mathrm{loc}}(U)$, the [Sobolev embedding theorem](/theorems/903) on a compactly contained neighbourhood of $K$ gives $u\in C^\ell(K)$. Because $K$ and $\ell$ were arbitrary, $u\in C^\infty(U)$. Smooth forcing therefore propagates to smoothness of the weak solution in the interior, with no boundary regularity being used. [/example] ## Calderon-Zygmund Estimates and $W^{2,p}$ Regularity The $H^2$ estimate is tied to $L^2$ methods. Many applications require $L^p$ control of second derivatives, so the next question is whether the Poisson equation preserves the integrability scale of the forcing. [quotetheorem:6439] [citeproof:6439] The range $1<p<\infty$ is part of the theorem, not a technical decoration: the singular integral bounds used here fail at the endpoints in the same strong $L^p$ form. 
At $p=1$, Riesz transforms map $L^1$ only to weak $L^1$ in general, and at $p=\infty$ they map bounded functions to spaces such as BMO rather than back to $L^\infty$. The assumption $u\in W^{2,p}_{\mathrm{loc}}(U)$ means this is an a priori estimate, not yet a weak-to-strong theorem; it says that once second derivatives are already known to exist, their $L^p$ size is controlled by the equation and a lower-order norm. That is exactly the estimate needed before the next theorem, where mollification supplies smooth approximants, the a priori bound is applied uniformly, and the limiting argument proves that the original distributional solution has second derivatives after all. The compact containment $V\subset\subset U'\subset\subset U$ is again part of the mechanism: if a cutoff touches the boundary, commutator terms acquire boundary dependence and the interior singular-integral estimate no longer applies as stated. The cutoff term $\|u\|_{L^p(U')}$ records the fact that constants and low frequencies are not controlled by $\Delta u$ alone; for instance, $u\equiv c$ has $\Delta u=0$ but a nonzero local $L^p$ norm. The next result removes the temporary assumption that the solution already belongs to $W^{2,p}_{\mathrm{loc}}(U)$. Since a weak solution need not have any second derivatives at the outset, the proof must first regularise the equation and only then pass the Calderon-Zygmund estimate to the limit. The local $L^p$ assumption on $u$ is enough to interpret the distributional equation and to control the lower-order cutoff terms. [quotetheorem:6440] [citeproof:6440] This result is the main weak-to-strong regularity statement for the Laplacian in the chapter. Each assumption has a specific role. If $f\notin L^p_{\mathrm{loc}}$, the conclusion cannot hold because $D^2u\in L^p_{\mathrm{loc}}$ would force $\Delta u\in L^p_{\mathrm{loc}}$ as a sum of second derivatives; a Newtonian potential of a non-$L^p$ local source gives the model obstruction. 
If $u\notin L^p_{\mathrm{loc}}$, the lower-order term in the estimate is not finite and the distributional solution need not represent a Sobolev function on compact subsets. If the equation is not satisfied in distributions, the Calderon-Zygmund estimate has no reason to apply; for example, adding a compactly supported $L^p$ function with no weak second derivatives to a solution keeps $u\in L^p_{\mathrm{loc}}$ and leaves the prescribed $f$ unchanged, but the sum no longer solves the equation and need not lie in $W^{2,p}_{\mathrm{loc}}$. The assumptions cannot be read as boundary hypotheses: they only describe the equation and data inside $U$, so corners, incompatible boundary data, or rough boundary geometry may still destroy global $W^{2,p}$ regularity. The restriction $1<p<\infty$ is inherited from Calderon-Zygmund theory, and the conclusion is local because every estimate is obtained after inserting a compactly supported cutoff. The weak-to-strong theorem has no boundary hypotheses, and that omission is not accidental. Interior estimates can be applied after cutting off away from $\partial U$, but the same argument gives no control over singularities created by the shape of the boundary or by boundary data. The following example isolates this distinction: the solution is harmonic in the interior, yet the boundary corner produces second-derivative blow-up at the edge of the domain. [example: Weak Solution Not Twice Differentiable at the Boundary] Suppose $U\subset\mathbb R^2$ has a re-entrant corner with opening angle $\omega\in(\pi,2\pi)$, and write points of $U$ near the corner in polar coordinates $0<r<r_0$, $0<\theta<\omega$. Set \begin{align*} \alpha=\frac{\pi}{\omega}. \end{align*} Then $0<\alpha<1$. Consider a solution whose local form near the corner is the singular mode \begin{align*} u(r,\theta)=r^\alpha\sin(\alpha\theta). \end{align*} On the two sides of the corner this expression vanishes: $u(r,0)=r^\alpha\sin(0)=0$, and $u(r,\omega)=r^\alpha\sin(\alpha\omega)=r^\alpha\sin(\pi)=0$. For $0<r<r_0$ and $0<\theta<\omega$, the first radial derivative is \begin{align*} \partial_r u=\alpha r^{\alpha-1}\sin(\alpha\theta). 
\end{align*} The second radial derivative is \begin{align*} \partial_{rr}u=\alpha(\alpha-1)r^{\alpha-2}\sin(\alpha\theta). \end{align*} The angular derivatives are \begin{align*} \partial_\theta u=\alpha r^\alpha\cos(\alpha\theta) \end{align*} and \begin{align*} \partial_{\theta\theta}u=-\alpha^2 r^\alpha\sin(\alpha\theta). \end{align*} Using the polar-coordinate formula for the Laplacian, \begin{align*} \Delta u=\partial_{rr}u+\frac1r\partial_r u+\frac1{r^2}\partial_{\theta\theta}u. \end{align*} Substituting the three computed derivatives gives \begin{align*} \Delta u=\alpha(\alpha-1)r^{\alpha-2}\sin(\alpha\theta)+\alpha r^{\alpha-2}\sin(\alpha\theta)-\alpha^2r^{\alpha-2}\sin(\alpha\theta). \end{align*} Factoring the common term, \begin{align*} \Delta u=\bigl(\alpha(\alpha-1)+\alpha-\alpha^2\bigr)r^{\alpha-2}\sin(\alpha\theta). \end{align*} Since $\alpha(\alpha-1)+\alpha-\alpha^2=\alpha^2-\alpha+\alpha-\alpha^2=0$, we get \begin{align*} \Delta u=0. \end{align*} Thus $u$ is harmonic at every interior point of the sector. On any part of the sector with $r\ge \delta>0$, every derivative of $r^\alpha\sin(\alpha\theta)$ is bounded, so the second derivatives of $u$ are square-integrable on $\{\delta<r<r_0\}$ for every $\delta>0$. At the corner, the radial second derivative already prevents square-integrability. Choose a closed angular interval $I\subset(0,\omega)$ and a constant $c>0$ such that $|\sin(\alpha\theta)|\ge c$ on $I$. Then \begin{align*} \int_0^{r_0}\int_I |\partial_{rr}u|^2\,r\,d\theta\,dr\ge \alpha^2(1-\alpha)^2c^2|I|\int_0^{r_0} r^{2\alpha-3}\,dr. \end{align*} Because $0<\alpha<1$, we have $2\alpha-3<-1$, and therefore the integral $\int_0^{r_0} r^{2\alpha-3}\,dr$ diverges at $0$. Hence the second derivatives are not square-integrable near the corner, even though $u$ is harmonic in the interior. The failure comes from the boundary geometry, which is why the estimates in this chapter are interior estimates rather than global boundary regularity theorems. 
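For comparison, the same polar computation indicates why $u$ still has finite energy near the corner. As a sketch, using only the derivatives computed above and the polar expression $|\nabla u|^2=(\partial_r u)^2+r^{-2}(\partial_\theta u)^2$, \begin{align*} |\nabla u|^2=\alpha^2r^{2\alpha-2}\sin^2(\alpha\theta)+\alpha^2r^{2\alpha-2}\cos^2(\alpha\theta)=\alpha^2r^{2\alpha-2}, \end{align*} so \begin{align*} \int_0^{r_0}\int_0^\omega|\nabla u|^2\,r\,d\theta\,dr=\alpha^2\omega\int_0^{r_0}r^{2\alpha-1}\,dr=\frac{\alpha\omega}{2}\,r_0^{2\alpha}=\frac{\pi}{2}\,r_0^{2\alpha}<\infty. \end{align*} The exponent $2\alpha-1>-1$ is integrable for every $\alpha>0$, whereas the second-derivative exponent $2\alpha-3$ is not integrable once $\alpha<1$: the corner mode has square-integrable first derivatives but no $H^2$ bound. 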
[/example] [illustration:reentrant-corner-level-curves] ## Weak-to-Strong Regularity and the Role of the Boundary Combining Caccioppoli estimates, difference quotients, and Calderon-Zygmund estimates gives a practical regularity principle: if the coefficients and forcing are regular on an interior region, then a weak solution is regular there. Boundary singularities, rough coefficients, and endpoint $L^p$ spaces are separate issues requiring different tools. [quotetheorem:96] [citeproof:96] This result reflects a template used repeatedly in elliptic theory: first gain a small amount of differentiability, then commute derivatives with the equation, then iterate. Smooth coefficients are essential for this bootstrapping argument because differentiating the equation differentiates the coefficients as well; rough coefficients can block the iteration even when weak solutions exist, for example across an interface where leading coefficients jump and second derivatives acquire singular behaviour. Uniform ellipticity is not replaceable by smoothness: a smooth degenerate operator such as $x_1^2\partial_{x_1x_1}+\sum_{i=2}^n\partial_{x_ix_i}$ loses control in the $x_1$ direction along $\{x_1=0\}$, so solutions need not gain the full set of derivatives. The assumption that the equation holds in distributions is also essential, since an arbitrary $H^1_{\mathrm{loc}}$ function with a cusp or oscillatory singularity can fail to be smooth despite having smooth coefficients available in the background. The boundary is absent from the argument because every step is performed after inserting a cutoff supported inside the domain, so the theorem does not assert smoothness up to $\partial U$. [remark: Interior Estimates Do Not Imply Boundary Regularity] Interior estimates control $u$ on $V\subset\subset U$ with constants depending on the separation from $\partial U$. As $V$ approaches the boundary, cutoff derivatives grow and the constants can blow up. 
Global regularity therefore requires hypotheses on the boundary geometry and compatibility with the boundary condition, not only regularity of the equation inside $U$. [/remark] The central estimates of the chapter can be read as a hierarchy. Caccioppoli estimates control first derivatives locally from the equation and $L^2$ data. Difference quotients upgrade this control to $H^2$ regularity for smooth uniformly elliptic coefficients. Calderon-Zygmund theory extends the Laplacian estimate to the full $W^{2,p}$ scale for $1<p<\infty$, making interior elliptic regularity one of the main bridges from weak variational solutions to strong differential equations. These same estimates also explain why elliptic PDE appear throughout analysis: harmonic analysis supplies the singular-integral bounds, variational methods supply the weak solutions to which the bounds are applied, and geometric or probabilistic models often use the resulting smoothness to turn an energy or averaging principle into pointwise information. Interior regularity explains what happens away from the boundary, but elliptic problems are ultimately shaped by their boundary conditions as well. Chapter 8 extends the smoothness theory to the edge of the domain, asking how much of the interior estimates survive near the boundary and under what geometric assumptions. # 8. Boundary regularity and elliptic estimates Boundary regularity asks how much of the interior smoothness theory survives when a solution approaches the edge of the domain. Chapters 4 through 7 developed weak solutions, energy estimates, and interior regularity; those arguments do not automatically see the boundary condition. This chapter explains how local coordinate changes reduce smooth boundary pieces to a flat model, how $H^2$ estimates are obtained for the Dirichlet Laplacian, and how the Hölder-based Schauder scale differs from the Sobolev scale. 
## Flattening the Boundary and Local Coordinate Arguments The first problem is local: near a smooth boundary point, can estimates on a curved domain be reduced to estimates on a half-ball without changing the elliptic nature of the equation? The answer is yes, provided the boundary has enough differentiability and the coordinate map is controlled. This reduction is the basic mechanism behind boundary regularity. [definition: Local Boundary Flattening Map] Let $U \subset \mathbb R^n$ be open and let $x_0 \in \partial U$. A local boundary flattening map at $x_0$ is a $C^k$ diffeomorphism $\Phi: V \to W$, where $V$ and $W$ are neighbourhoods of $x_0$ and $0$ respectively, such that $\Phi(x_0)=0$, $\Phi(V \cap U)=W \cap \{y \in \mathbb R^n : y_n > 0\}$, and $\Phi(V \cap \partial U)=W \cap \{y \in \mathbb R^n : y_n = 0\}$. [/definition] The definition packages the geometric content of a smooth boundary into a coordinate chart. Before using such charts, one must check that they exist under the usual $C^k$ boundary hypothesis. [quotetheorem:6441] [citeproof:6441] Two points about this statement deserve emphasis. First, the $C^k$ hypothesis is what gives an actual $C^k$ chart; a cusp such as $U=\{(x,y)\in\mathbb R^2:y>|x|^{1/2}\}$ near the origin has no $C^1$ graph representation after rotation, so the preceding construction cannot supply a $C^1$ diffeomorphism with bounded derivative. Second, flattening only straightens the set $V\cap \partial U$; it does not preserve constant coefficients, distances, or the Laplacian in its original form. The theorem therefore does not assert that the original operator keeps the same coefficients, nor that a nonsmooth boundary can be regularised by changing coordinates. The next issue is analytic rather than geometric: after changing variables, the transformed equation must retain uniform ellipticity with constants strong enough for estimates on the half-ball. 
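To see where the transformed coefficients come from, the change of variables can be written out for a divergence-form bilinear form. This is a sketch under assumed notation that the definition above does not fix: $\Psi=\Phi^{-1}$, $\widetilde u=u\circ\Psi$, $\widetilde v=v\circ\Psi$, and $A$ denotes the original coefficient matrix on $V\cap U$. The chain rule gives $\nabla u(x)=D\Phi(x)^\top\nabla\widetilde u(\Phi(x))$, and substituting $x=\Psi(y)$ yields \begin{align*} \int_{V\cap U}A\nabla u\cdot\nabla v\,dx=\int_{W\cap\{y_n>0\}}\widetilde A\,\nabla\widetilde u\cdot\nabla\widetilde v\,dy,\qquad \widetilde A(y)=|\det D\Psi(y)|\,D\Phi(\Psi(y))\,A(\Psi(y))\,D\Phi(\Psi(y))^\top. \end{align*} Since $\xi\cdot\widetilde A\,\xi=|\det D\Psi|\,(D\Phi^\top\xi)\cdot A\,(D\Phi^\top\xi)$, a lower bound for $A$ transfers to $\widetilde A$ only through lower bounds on $|\det D\Psi|$ and on $|D\Phi^\top\xi|/|\xi|$, which is how the sizes of $D\Phi$ and $D\Phi^{-1}$ enter the ellipticity constants. 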
[illustration:boundary-flattening-map] Flattening is useful only if the transformed equation remains in the same elliptic class. A change of variables can stretch normal and tangential directions unevenly, so the key quantitative issue is whether bounded distortion of the chart preserves the lower and upper ellipticity bounds of the coefficient matrix in the flattened coordinates. [quotetheorem:6442] [citeproof:6442] The bounds on $D\Phi$ and $D\Phi^{-1}$ are part of the ellipticity statement, not a harmless technical add-on. If $\Phi(x_1,x_2)=(x_1,x_2^3)$ near $x_2=0$, then the derivative degenerates at the flat boundary and the inverse derivative is unbounded; a uniformly elliptic equation can be transformed into one whose normal coefficient vanishes or blows up. Thus a coordinate change may preserve the topological shape of the domain while destroying the quantitative lower bound needed in energy and Schauder estimates. Even under the theorem, the transformed estimate is local and its constants depend on the chosen chart, so a global boundary estimate also needs a finite atlas with uniformly controlled charts and partition functions. [example: Flattening the Unit Ball Near the North Pole] Let $U=B(0,1)\subset \mathbb R^n$ and $x_0=(0,\dots,0,1)$. Write $x=(x',x_n)$ with $x'\in\mathbb R^{n-1}$, and work in a neighbourhood of $x_0$ where $x_n>0$. On this neighbourhood, the boundary equation $|x'|^2+x_n^2=1$ is equivalent to $x_n^2=1-|x'|^2$, and the condition $x_n>0$ selects \begin{align*} x_n=(1-|x'|^2)^{1/2}. \end{align*} Define \begin{align*} \Phi(x',x_n)=\bigl(x',(1-|x'|^2)^{1/2}-x_n\bigr). \end{align*} If $x\in U$ is close enough to $x_0$, then $|x'|^2+x_n^2<1$ and $x_n>0$, so $x_n^2<1-|x'|^2$ and hence \begin{align*} 0<x_n<(1-|x'|^2)^{1/2}. \end{align*} Therefore the last coordinate of $\Phi(x)$ is positive. If $x\in\partial U$ in the same neighbourhood, then $x_n=(1-|x'|^2)^{1/2}$, so the last coordinate of $\Phi(x)$ is $0$. 
Thus $\Phi$ sends the ball locally to $\{y_n>0\}$ and sends the boundary to $\{y_n=0\}$. Set \begin{align*} \gamma(x')=(1-|x'|^2)^{1/2}. \end{align*} For $1\le i\le n-1$, \begin{align*} \frac{\partial \gamma}{\partial x_i}(x')=\frac{1}{2}(1-|x'|^2)^{-1/2}(-2x_i)=-\frac{x_i}{(1-|x'|^2)^{1/2}}. \end{align*} For a vector $z=(z',z_n)$, the differential of $\Phi$ is therefore \begin{align*} D\Phi(x)z=\bigl(z',D\gamma(x')\cdot z'-z_n\bigr). \end{align*} The inverse map is \begin{align*} \Psi(y',y_n)=(y',\gamma(y')-y_n), \end{align*} and its Jacobian determinant has absolute value $1$, because the first $n-1$ coordinates are unchanged and the derivative in the final coordinate is $-1$. For the Laplacian, the original coefficient matrix is $A=I$. In the transformed bilinear form, the coefficient matrix is $\widetilde A(y)=D\Phi(\Psi(y))D\Phi(\Psi(y))^\top$. Equivalently, for $\xi=(\xi',\xi_n)$, \begin{align*} \xi\cdot \widetilde A(y)\xi=|\xi'|^2+2\xi_n\,D\gamma(y')\cdot \xi'+(1+|D\gamma(y')|^2)\xi_n^2. \end{align*} Here \begin{align*} D\gamma(y')=\left(-\frac{y_1}{(1-|y'|^2)^{1/2}},\dots,-\frac{y_{n-1}}{(1-|y'|^2)^{1/2}}\right), \end{align*} so \begin{align*} 1+|D\gamma(y')|^2=1+\frac{|y'|^2}{1-|y'|^2}=\frac{1}{1-|y'|^2}. \end{align*} The transformed coefficients depend on $y'$, so flattening this smooth curved boundary preserves the half-space geometry but changes the constant-coefficient Laplacian into a divergence-form operator with nonconstant coefficients. [/example] The example isolates the geometric part of the reduction, but a boundary value problem also carries prescribed values on $\partial U$. Homogeneous Dirichlet conditions survive composition with the chart; nonzero boundary data must be converted to a homogeneous condition before the local estimates can be applied. Without a lifting of the right Sobolev order, subtracting the boundary values may create a new forcing term that is only a distribution rather than an $L^2$ function. 
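The correction term can be made explicit. As a sketch, suppose $u\in H^1(U)$ is a weak solution of $-\Delta u=f$ with $f\in L^2(U)$ and boundary values $g$, and suppose $G\in H^2(U)$ has the same boundary values; the symbols $G$ and $v=u-G$ are notation assumed here for illustration. For every $\varphi\in H^1_0(U)$, \begin{align*} \int_U\nabla v\cdot\nabla\varphi\,dx=\int_U f\varphi\,dx-\int_U\nabla G\cdot\nabla\varphi\,dx=\int_U(f+\Delta G)\varphi\,dx, \end{align*} where the last step integrates by parts using $G\in H^2(U)$ and the zero trace of $\varphi$. Thus $v$ has zero trace and solves $-\Delta v=f+\Delta G$ weakly, with \begin{align*} \|f+\Delta G\|_{L^2(U)}\le\|f\|_{L^2(U)}+\sqrt n\,\|D^2G\|_{L^2(U)}. \end{align*} If $G$ were only in $H^1(U)$, the functional $\varphi\mapsto\int_U\nabla G\cdot\nabla\varphi\,dx$ would still be bounded on $H^1_0(U)$ but would not in general be represented by an $L^2$ function, which is the distributional forcing described above. 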
[definition: Compatible Dirichlet Lifting] Let $U \subset \mathbb R^n$ be bounded with boundary regular enough for the trace operator $\operatorname{Tr}:H^2(U)\to H^{3/2}(\partial U)$ to be defined. A compatible Dirichlet lifting of boundary data $g \in H^{3/2}(\partial U)$ is a function $G\in H^2(U)$ such that $\operatorname{Tr}G=g$ on $\partial U$. [/definition] This reduction is simple but important: if $u=g$ on $\partial U$, then $v=u-G$ has zero trace and solves a modified equation. The boundary estimate can then be stated in the homogeneous setting and transferred back by bounding $G$. ## $H^2$ Boundary Regularity for Smooth Domains and Compatible Dirichlet Data The central analytic question is whether an energy solution of the Poisson equation has two weak derivatives up to the boundary. Interior estimates already give $H^2$ regularity away from $\partial U$ when the forcing term lies in $L^2$. The boundary estimate is stronger because it controls derivatives in neighbourhoods that meet $\partial U$. [quotetheorem:6443] [citeproof:6443] This theorem is a prototype for elliptic estimates: the PDE controls the missing normal derivative once tangential derivatives have been estimated. Each hypothesis has a role. The condition $f\in L^2(U)$ is matched to the desired second weak derivatives in $L^2$; if $f$ is only in $H^{-1}(U)$, the variational solution need only lie in $H^1_0(U)$. The zero trace condition gives admissible tangential difference quotients at the boundary and allows Poincare's inequality to control lower-order terms. The $C^2$ boundary assumption rules out corner singularities: on a planar sector with opening angle $\omega>\pi$, the harmonic function $r^{\pi/\omega}\sin(\pi\theta/\omega)$ has zero values on the sides but has second derivatives too singular to belong to $L^2$ near the corner. 
The theorem does not assert pointwise continuity of $D^2u$, nor does it cover rough coefficients or nonsmooth domains; it supplies the Sobolev-scale estimate that will be combined with liftings for nonzero Dirichlet data and later contrasted with Schauder estimates. [example: Poisson Equation on a Smooth Bounded Domain] Let $U=B(0,1)\subset\mathbb R^n$ and let $f(x)=1$. We verify explicitly that $u(x)=(1-|x|^2)/(2n)$ solves $-\Delta u=f$ with zero boundary values. Since \begin{align*} u(x)=\frac{1-\sum_{j=1}^n x_j^2}{2n}, \end{align*} for each $1\le i\le n$ we have \begin{align*} \frac{\partial u}{\partial x_i}(x)=\frac{1}{2n}(-2x_i)=-\frac{x_i}{n}. \end{align*} Differentiating once more gives \begin{align*} \frac{\partial^2 u}{\partial x_i^2}(x)=\frac{\partial}{\partial x_i}\left(-\frac{x_i}{n}\right)=-\frac{1}{n}. \end{align*} If $i\ne j$, then $x_i$ is independent of $x_j$, so \begin{align*} \frac{\partial^2 u}{\partial x_j\partial x_i}(x)=\frac{\partial}{\partial x_j}\left(-\frac{x_i}{n}\right)=0. \end{align*} Therefore \begin{align*} \Delta u(x)=\sum_{i=1}^n \frac{\partial^2 u}{\partial x_i^2}(x)=\sum_{i=1}^n\left(-\frac{1}{n}\right)=-1. \end{align*} Thus $-\Delta u=1=f$ in $B(0,1)$. If $x\in\partial B(0,1)$, then $|x|=1$, and hence \begin{align*} u(x)=\frac{1-1}{2n}=0. \end{align*} The function $u$ is a polynomial of degree $2$, so $u$, its first derivatives, and all of its second weak derivatives are bounded on the bounded domain $B(0,1)$. Hence $u\in H^2(B(0,1))$. More generally, for a bounded $C^2$ domain and $f\in L^2(U)$, the *Global Sobolev Estimate for the Dirichlet Laplacian* gives \begin{align*} \|u\|_{H^2(U)}\le C\|f\|_{L^2(U)}. \end{align*} In this explicit ball example, the controlled second derivatives are exactly the diagonal derivatives $-1/n$ and the mixed derivatives $0$, so the estimate is a quantitative bound on the full second-derivative array rather than only a qualitative regularity statement. 
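As a check on the size of the constant in this explicit case (a worked computation added for illustration, with $|B(0,1)|$ denoting the Lebesgue measure of the ball), the Hessian entries found above give \begin{align*} \|D^2u\|_{L^2(B(0,1))}^2=\int_{B(0,1)}\sum_{i,j=1}^n\left|\frac{\partial^2u}{\partial x_i\partial x_j}\right|^2dx=n\cdot\frac{1}{n^2}\,|B(0,1)|=\frac{|B(0,1)|}{n}, \end{align*} while $\|f\|_{L^2(B(0,1))}^2=|B(0,1)|$. Hence \begin{align*} \|D^2u\|_{L^2(B(0,1))}=n^{-1/2}\,\|f\|_{L^2(B(0,1))}, \end{align*} so for this particular pair $(u,f)$ the second-derivative part of the estimate holds with constant $n^{-1/2}$; the constant $C$ in the general theorem must also absorb the $L^2$ norms of $u$ and $\nabla u$. 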
[/example] Nonzero Dirichlet data is handled by subtracting a lifting. The forcing term changes by the Laplacian of the lifting, so the boundary data must have enough Sobolev regularity to make that correction lie in $L^2$. [quotetheorem:6444] [citeproof:6444] The compatibility condition is not a technical ornament. Boundary data with too little trace regularity can prevent $H^2$ regularity even when the equation has a smooth right-hand side. On the half-space model, prescribing boundary data $g\in H^{1/2}(\partial U)$ that does not belong to $H^{3/2}(\partial U)$ gives a harmonic extension with the correct $H^1$ trace but without two square-integrable derivatives in general; the missing derivative count is exactly the obstruction measured by the trace theorem. The estimate also depends on the choice of lifting through $\|G\|_{H^2(U)}$, so the theorem is not a boundary-only bound until a bounded right inverse for the trace map has been invoked. It does not treat incompatible data, Neumann compatibility conditions, or singular domains; those require separate boundary spaces and sometimes weaker regularity conclusions. [remark: Meaning of Compatibility] For the Dirichlet Laplacian, compatibility at the $H^2$ level means that the boundary value is the trace of an $H^2$ function. In a smoother classical setting, it also reflects the fact that the solution must simultaneously satisfy the PDE in the interior and the prescribed boundary values on $\partial U$. Higher-order estimates require corresponding higher trace regularity and stronger smoothness of the boundary. [/remark] The geometry of the domain can break the estimate. Corners and cusps create singular modes that are invisible in smooth local charts. [illustration:reentrant-corner-singularity] [example: Loss of Regularity at a Reentrant Corner] Let the corner be modeled by the sector $0<r<\varepsilon$ and $0<\theta<\omega$, where $\omega>\pi$, and set $\lambda=\pi/\omega$. Then $0<\lambda<1$. 
Consider the singular mode \begin{align*} u(r,\theta)=r^\lambda\sin(\lambda\theta). \end{align*} On the side $\theta=0$, \begin{align*} u(r,0)=r^\lambda\sin(0)=0. \end{align*} On the side $\theta=\omega$, \begin{align*} u(r,\omega)=r^\lambda\sin(\lambda\omega)=r^\lambda\sin(\pi)=0. \end{align*} Thus $u$ has zero boundary values on the two sides of the sector. We verify that $u$ is harmonic away from $r=0$. Its radial derivatives are \begin{align*} u_r(r,\theta)=\lambda r^{\lambda-1}\sin(\lambda\theta). \end{align*} \begin{align*} u_{rr}(r,\theta)=\lambda(\lambda-1)r^{\lambda-2}\sin(\lambda\theta). \end{align*} Its angular derivatives are \begin{align*} u_\theta(r,\theta)=\lambda r^\lambda\cos(\lambda\theta). \end{align*} \begin{align*} u_{\theta\theta}(r,\theta)=-\lambda^2 r^\lambda\sin(\lambda\theta). \end{align*} Using the polar-coordinate formula $\Delta u=u_{rr}+r^{-1}u_r+r^{-2}u_{\theta\theta}$, we get \begin{align*} \Delta u=\lambda(\lambda-1)r^{\lambda-2}\sin(\lambda\theta)+\lambda r^{\lambda-2}\sin(\lambda\theta)-\lambda^2 r^{\lambda-2}\sin(\lambda\theta). \end{align*} Factoring the common term gives \begin{align*} \Delta u=\bigl(\lambda(\lambda-1)+\lambda-\lambda^2\bigr)r^{\lambda-2}\sin(\lambda\theta). \end{align*} Since \begin{align*} \lambda(\lambda-1)+\lambda-\lambda^2=\lambda^2-\lambda+\lambda-\lambda^2=0, \end{align*} we have $\Delta u=0$ away from the corner. The second radial derivative already has the singular size \begin{align*} |u_{rr}(r,\theta)|=|\lambda(\lambda-1)|r^{\lambda-2}|\sin(\lambda\theta)|. \end{align*} Because $\lambda\omega/2=\pi/2$, the factor $|\sin(\lambda\theta)|$ is bounded below by a positive constant on some angular interval around $\theta=\omega/2$. Hence, on the corresponding smaller sub-sector, \begin{align*} |u_{rr}(r,\theta)|^2\ge c r^{2\lambda-4} \end{align*} for some $c>0$. 
The area element in polar coordinates is $r\,dr\,d\theta$, so the radial part of the $L^2$ integral contains \begin{align*} \int_0^\varepsilon r^{2\lambda-4}r\,dr=\int_0^\varepsilon r^{2\lambda-3}\,dr. \end{align*} Since $\lambda<1$, we have $2\lambda-3<-1$, and therefore this integral diverges at $r=0$. Thus this zero-boundary harmonic mode has second derivatives that are not square-integrable near the reentrant corner. This explicit singular mode shows why the smooth-domain hypothesis in the global $H^2$ estimate cannot be replaced by an arbitrary Lipschitz boundary. [/example] ## Schauder Estimates and Sobolev Versus Hölder Regularity Scales The final question is what kind of regularity is obtained when the data is Hölder continuous rather than merely square-integrable. Sobolev estimates measure derivatives in integral norms, while Schauder estimates measure pointwise Hölder continuity of derivatives. The stronger pointwise scale is needed because an $L^2$ function may oscillate or jump on small sets while still having finite integral norm, and such behaviour cannot force continuous second derivatives of a solution. The two theories answer related but distinct questions. [definition: Hölder Space] Let $U\subset \mathbb R^n$ be open, let $k\in \mathbb N\cup\{0\}$, and let $0<\alpha<1$. The space $C^{k,\alpha}(U)$ consists of functions $u:U\to \mathbb R$ with $u\in C^k(U)$ such that, for every multi-index $\beta$ with $|\beta|=k$, the seminorm \begin{align*} [D^\beta u]_{C^{0,\alpha}(U)} := \sup_{x,y\in U,\ x\ne y}\frac{|D^\beta u(x)-D^\beta u(y)|}{|x-y|^\alpha} \end{align*} is finite and, for every multi-index $\beta$ with $|\beta|\le k$, the norm $\|D^\beta u\|_{C^0(U)}$ is finite. The norm is \begin{align*} \|u\|_{C^{k,\alpha}(U)} :=\sum_{|\beta|\le k}\|D^\beta u\|_{C^0(U)} +\sum_{|\beta|=k}[D^\beta u]_{C^{0,\alpha}(U)}. \end{align*} [/definition] The definition focuses on pointwise oscillation rather than integrability. 
This raises the analogue of the Sobolev regularity question: if $f$ is Hölder continuous, does the equation force the second derivatives of $u$ to have the same Hölder modulus? [quotetheorem:4947] [citeproof:4947] The compact containment $V\subset\subset U$ is essential because the constant deteriorates as $V$ approaches $\partial U$; cutoff functions and singular-kernel estimates need a positive distance from the boundary. On $U=B(0,1)\subset\mathbb R^2$, the harmonic functions $u_m(r,\theta)=r^m\cos(m\theta)$ have $f=0$ and $\|u_m\|_{C^0(U)}\le 1$, while their second derivatives near $\partial U$ grow like $m^2$. Thus no interior estimate can be extended to boundary-touching sets with a constant independent of the distance to $\partial U$. The term $\|u\|_{C^0(U)}$ appears because the equation controls second derivatives but not the harmonic part or additive constants: harmonic functions have $f=0$ yet may have nonzero second derivatives on a smaller set, so a size bound for $u$ is needed to close the estimate. Near $\partial U$, an interior ball may no longer fit inside the domain, and the boundary values can create singular behaviour not detected by an interior estimate. To use Schauder theory for a Dirichlet problem on all of $\overline U$, the same boundary obstruction reappears: the boundary must be flattened without losing the Hölder regularity of the coefficients. [quotetheorem:6445] [citeproof:6445] This is the Hölder analogue of the global $H^2$ estimate proved earlier in the chapter, but it has a different set of hypotheses and conclusions. The $C^{2,\alpha}$ boundary condition is needed because flattening differentiates the boundary twice; a corner or cusp can produce singular harmonic functions even with zero boundary values. 
The extension hypothesis on $g$ is also substantive: if the boundary value is only continuous, or Hölder continuous but not the restriction of a $C^{2,\alpha}(\overline U)$ function, the solution may fail to have continuous second derivatives up to $\partial U$. The assumption $f\in C^{0,\alpha}(\overline U)$ cannot be weakened to merely bounded forcing in this conclusion; merely bounded right-hand sides can produce second derivatives that are unbounded or discontinuous, so the $C^{2,\alpha}$ norm of $u$ is not controlled. The theorem also does not provide existence by itself and does not handle variable coefficients unless their transformed coefficients have the corresponding Hölder regularity. These limitations are the reason finite element analysis expects corner singularities on polygonal meshes, while geometric analysis and classical potential theory often impose smooth boundary and Hölder data when pointwise curvature or layer-potential estimates are required. [example: Comparing Integral and Hölder Forcing] Let $U=B(0,1)\subset\mathbb R^n$ and consider the zero-Dirichlet problem \begin{align*} -\Delta u=f \quad\text{in }U,\qquad u=0\quad\text{on }\partial U. \end{align*} If $f\in L^2(U)$, the *Global Sobolev Estimate for the Dirichlet Laplacian* gives \begin{align*} u\in H^2(U) \end{align*} and \begin{align*} \|u\|_{H^2(U)}\le C\|f\|_{L^2(U)}. \end{align*} This conclusion means that every weak derivative $D^\beta u$ with $|\beta|\le 2$ belongs to $L^2(U)$; it does not say that the functions $D^\beta u$ with $|\beta|=2$ are continuous. For a concrete illustration, define \begin{align*} f(x)=\mathbf 1_{\{x_1>0\}}(x). \end{align*} Then \begin{align*} \|f\|_{L^2(U)}^2 =\int_U |f(x)|^2\,d\mathcal L^n(x) =\int_{U\cap\{x_1>0\}}1\,d\mathcal L^n(x) =\mathcal L^n(U\cap\{x_1>0\}) \le \mathcal L^n(U)<\infty, \end{align*} so $f\in L^2(U)$. But for $t>0$ small, \begin{align*} f(te_1)=1,\qquad f(-te_1)=0, \end{align*} and both points tend to $0$ as $t\to 0^+$. 
Thus the two one-sided limits at $0$ are different, so $f$ is not continuous and therefore is not in $C^{0,\alpha}(\overline U)$ for any $0<\alpha<1$. By contrast, if $f\in C^{0,\alpha}(\overline U)$, then the zero boundary value has the extension $G=0\in C^{2,\alpha}(\overline U)$, and the *Boundary Schauder Estimate for the Dirichlet Laplacian* gives \begin{align*} u\in C^{2,\alpha}(\overline U). \end{align*} In this case each second derivative $D^\beta u$ with $|\beta|=2$ is continuous and satisfies the Hölder bound \begin{align*} [D^\beta u]_{C^{0,\alpha}(\overline U)} =\sup_{x,y\in \overline U,\ x\ne y} \frac{|D^\beta u(x)-D^\beta u(y)|}{|x-y|^\alpha} <\infty. \end{align*} Thus the Sobolev hypothesis controls second derivatives in an integral sense, while the Hölder hypothesis controls their pointwise oscillation; the two assumptions measure different kinds of regularity. [/example] This comparison is a practical guide for choosing an estimate, but it also points to a broader methodological distinction. Existence arguments often begin in a weak space, whereas classical regularity statements require more pointwise structure from the coefficients, data, and boundary. [remark: Sobolev and Schauder Scales] Sobolev spaces are stable under weak compactness and energy methods, making them natural for existence and variational arguments. Hölder spaces are closer to classical differentiability and are natural when coefficients, domains, and boundary data have pointwise regularity. Elliptic theory often uses both scales: first obtain a weak solution by Hilbert-space methods, then bootstrap regularity using Sobolev or Schauder estimates according to the available hypotheses. [/remark] Once boundary regularity is understood, the same elliptic operators can be studied spectrally rather than as boundary-value solvers. 
Chapter 9 uses the compactness built into the variational theory to develop eigenvalue problems, where the operator’s modes and frequencies become the central objects. # 9. Eigenvalue problems and compact resolvents This chapter turns the variational theory of elliptic boundary value problems into a spectral theory. Chapter 5 used coercive bilinear forms to solve equations such as $-\Delta u=f$ with Dirichlet boundary conditions; now the forcing term is replaced by a scalar multiple of the unknown. The main questions are whether the allowed scalars form a discrete sequence, how the associated eigenfunctions sit inside $L^2(U)$, and how the first eigenvalue reflects the geometry of the domain. Throughout this chapter, $U\subset \mathbb R^n$ is a bounded open set. For such $U$ the embedding $H^1_0(U)\hookrightarrow L^2(U)$ is compact without any boundary regularity, since functions in $H^1_0(U)$ extend by zero to a ball containing $U$; boundary regularity is needed only for statements that involve the larger space $H^1(U)$ or regularity up to $\partial U$. The Hilbert space viewpoint is that the inverse Dirichlet Laplacian is compact and self-adjoint on $L^2(U)$, so spectral properties of $-\Delta$ follow from the [spectral theorem for compact self-adjoint operators](/theorems/538). ## The Weak Dirichlet Eigenvalue Problem What should it mean to solve $-\Delta u=\lambda u$ when $u$ is only known to have one weak derivative? The variational formulation replaces pointwise second derivatives by integration against test functions, exactly as in the weak theory of the Poisson equation. [definition: Weak Dirichlet Eigenpair] Let $U\subset \mathbb R^n$ be a bounded open set. A weak Dirichlet eigenpair for $-\Delta$ on $U$ is a pair $(\lambda,u)$ with $\lambda\in \mathbb R$ and $u\in H^1_0(U)$, $u\ne 0$, such that \begin{align*} \int_U \nabla u\cdot \nabla v\,d\mathcal L^n=\lambda\int_U uv\,d\mathcal L^n \end{align*} for every $v\in H^1_0(U)$. [/definition] The condition $u\in H^1_0(U)$ encodes the homogeneous Dirichlet boundary condition. 
The equation is meaningful because both sides are continuous linear functionals of $v$ once $u\in H^1_0(U)$ and $U$ is bounded. A first consistency check is that the weak formulation has the same signs and boundary conditions as the classical problem. [example: Smooth Eigenfunctions Satisfy The Weak Form] Let $u\in C^2(U)\cap H^1_0(U)$ satisfy $-\Delta u=\lambda u$ pointwise in $U$. We verify that this classical equation implies the weak identity. First take $v\in C_c^\infty(U)$. Since $v$ has compact support in $U$, integration by parts has no boundary term. For each coordinate $i$, \begin{align*} \int_U \partial_i u\,\partial_i v\,d\mathcal L^n=-\int_U v\,\partial_i^2u\,d\mathcal L^n. \end{align*} Summing over $i=1,\dots,n$ gives \begin{align*} \int_U \nabla u\cdot \nabla v\,d\mathcal L^n=-\int_U v\,\Delta u\,d\mathcal L^n. \end{align*} Using $-\Delta u=\lambda u$ pointwise, \begin{align*} -\int_U v\,\Delta u\,d\mathcal L^n=\lambda\int_U uv\,d\mathcal L^n. \end{align*} Hence \begin{align*} \int_U \nabla u\cdot \nabla v\,d\mathcal L^n=\lambda\int_U uv\,d\mathcal L^n \end{align*} for every $v\in C_c^\infty(U)$. Now let $v\in H^1_0(U)$. By the definition of $H^1_0(U)$, choose $v_j\in C_c^\infty(U)$ with $v_j\to v$ in $H^1(U)$. Since $\nabla u\in L^2(U;\mathbb R^n)$, Cauchy-Schwarz gives \begin{align*} \left|\int_U \nabla u\cdot \nabla(v_j-v)\,d\mathcal L^n\right|\le \|\nabla u\|_{L^2(U)}\|\nabla(v_j-v)\|_{L^2(U)}. \end{align*} The right side tends to $0$ because $v_j\to v$ in $H^1(U)$. Also, since $u\in L^2(U)$, \begin{align*} \left|\lambda\int_U u(v_j-v)\,d\mathcal L^n\right|\le |\lambda|\,\|u\|_{L^2(U)}\|v_j-v\|_{L^2(U)}. \end{align*} This right side also tends to $0$. Passing to the limit in the identity for $v_j$ gives \begin{align*} \int_U \nabla u\cdot \nabla v\,d\mathcal L^n=\lambda\int_U uv\,d\mathcal L^n. 
\end{align*} Thus every classical Dirichlet eigenfunction with the stated regularity and boundary condition is a weak Dirichlet eigenfunction with the same eigenvalue. [/example] The example confirms that the weak definition faithfully extends the classical Dirichlet problem. The next issue is whether the weak formulation permits spectral values that would be impossible for the positive operator $-\Delta$. Testing the equation against the unknown function itself compares the sign of $\lambda$ with the nonnegative Dirichlet energy, and the Poincare inequality uses the boundary condition to exclude the zero-energy case. [quotetheorem:6415] [citeproof:6415] The boundary condition is essential here: for Neumann boundary conditions, constant functions give the eigenvalue $0$, so positivity would fail. Boundedness enters through the Poincare inequality; without an inequality controlling the $L^2$ mass by the Dirichlet energy, the same argument gives only nonnegativity. This limitation explains why the Rayleigh quotient is the natural next object: it packages exactly the comparison that Poincare makes coercive. The proof of positivity leaves behind the identity \begin{align*} \lambda=\frac{\displaystyle\int_U |\nabla u|^2\,d\mathcal L^n}{\displaystyle\int_U u^2\,d\mathcal L^n} \end{align*} for every eigenfunction $u$. This means the spectral value can be read from a comparison between energy and mass, without differentiating $u$ twice. To use that comparison as a search principle, the same energy-to-mass ratio must be defined for every admissible nonzero trial function, not only for eigenfunctions already known to exist. [definition: Rayleigh Quotient] The Rayleigh quotient is the functional \begin{align*} R:H^1_0(U)\setminus\{0\}\to \mathbb R \end{align*} defined by \begin{align*} R[u]=\frac{\displaystyle\int_U |\nabla u|^2\,d\mathcal L^n}{\displaystyle\int_U u^2\,d\mathcal L^n}. 
\end{align*} [/definition] Eigenfunctions are stationary points of this quotient under variations constrained by nonzero $L^2$ mass. The first eigenvalue will be the minimum of $R[u]$, while higher eigenvalues arise from minimisation after excluding lower eigenspaces. ## Compact Resolvents And The Spectral Theorem How can an unbounded differential operator have a manageable spectrum? The key is to invert the elliptic problem first: instead of studying $-\Delta$ directly, solve $-\Delta u=f$ and study the solution operator $f\mapsto u$ as a [compact operator](/page/Compact%20Operator) on $L^2(U)$. [definition: Dirichlet Resolvent] Let $U\subset \mathbb R^n$ be bounded and suppose the variational Dirichlet problem is well posed. The Dirichlet resolvent is the operator $K:L^2(U)\to L^2(U)$ defined as follows: for $f\in L^2(U)$, $Kf=u$, where $u\in H^1_0(U)$ is the unique weak solution of \begin{align*} \int_U \nabla u\cdot \nabla v\,d\mathcal L^n=\int_U fv\,d\mathcal L^n \end{align*} for every $v\in H^1_0(U)$. [/definition] Lax-Milgram gives existence and uniqueness of $u$, but spectral theory needs more than solvability. It needs compactness, symmetry, and positivity, because these are the hypotheses that make an infinite-dimensional operator behave like a symmetric matrix. The compact embedding $H^1_0(U)\hookrightarrow L^2(U)$ is the bridge from elliptic estimates to compact operator theory. [quotetheorem:6446] [citeproof:6446] The compactness statement is the point at which bounded domains behave differently from the whole space. Boundedness and the compact embedding are not cosmetic assumptions: on $\mathbb R^n$, translations prevent compactness, and the spectrum of $-\Delta$ is continuous rather than a sequence of eigenvalues. 
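A finite-difference sketch can make the role of the resolvent concrete. The Python fragment below is an illustration under stated assumptions, not part of the theory above: it takes $U=(0,1)$, replaces $-u''=f$ by the standard second-order difference scheme on $n$ interior nodes, and applies power iteration to the resulting discrete solution operator. The names `solve_dirichlet` and `smallest_dirichlet_eigenvalue` are hypothetical. The reciprocal of the largest eigenvalue of the discrete resolvent approximates the smallest Dirichlet eigenvalue, which for the unit interval is $\pi^2$.

```python
import math

def solve_dirichlet(f, h):
    """Discrete resolvent: solve the finite-difference system for -u'' = f
    on the interior nodes, with u = 0 at both endpoints (Thomas algorithm)."""
    n = len(f)
    diag = 2.0 / h ** 2      # main diagonal of the difference matrix
    off = -1.0 / h ** 2      # sub- and super-diagonal entries
    c = [0.0] * n            # modified super-diagonal
    d = [0.0] * n            # modified right-hand side
    c[0] = off / diag
    d[0] = f[0] / diag
    for i in range(1, n):
        denom = diag - off * c[i - 1]
        c[i] = off / denom
        d[i] = (f[i] - off * d[i - 1]) / denom
    u = [0.0] * n
    u[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        u[i] = d[i] - c[i] * u[i + 1]
    return u

def smallest_dirichlet_eigenvalue(n, iterations=60):
    """Power iteration on the discrete resolvent on (0, 1) with n interior nodes.
    Returns 1/mu, where mu is the largest resolvent eigenvalue found."""
    h = 1.0 / (n + 1)
    f = [1.0] * n            # positive start vector, not orthogonal to the first mode
    mu = 0.0
    for _ in range(iterations):
        u = solve_dirichlet(f, h)
        # Rayleigh quotient (Kf, f) / (f, f) for the symmetric operator K.
        mu = sum(ui * fi for ui, fi in zip(u, f)) / sum(fi * fi for fi in f)
        norm = math.sqrt(sum(ui * ui for ui in u))
        f = [ui / norm for ui in u]
    return 1.0 / mu

lam1 = smallest_dirichlet_eigenvalue(99)
```

The design choice is to iterate with the solution operator rather than with the difference matrix itself: the largest eigenvalue of the inverse is isolated, so plain power iteration converges quickly, which mirrors the way the compact resolvent is easier to analyse than the unbounded operator.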
Self-adjointness comes from the symmetry of the Dirichlet energy, so nonsymmetric lower-order terms would require a different spectral theorem; positivity also does not say that every $f$ gives $(Kf,f)_{L^2}>0$, only that the quadratic form is nonnegative. These are exactly the hypotheses needed to pass from elliptic solvability to the compact self-adjoint spectral theorem. [quotetheorem:538] [citeproof:538] The quoted theorem is abstract, while the goal is a statement about the differential operator $-\Delta$. Its compactness hypothesis is essential: a bounded self-adjoint operator may have continuous spectrum when compactness is absent, as multiplication by the independent variable on $L^2(0,1)$ illustrates, and the Laplacian on $\mathbb R^n$ has continuous spectrum rather than a discrete eigenbasis. The previous resolvent construction supplies the compact self-adjoint operator to which the theorem applies. The next step is to translate each nonzero resolvent eigenvalue $\mu$ into a Laplacian eigenvalue $1/\mu$, producing the full discrete sequence of Dirichlet eigenvalues. [quotetheorem:6447] [citeproof:6447] This result justifies thinking of $-\Delta$ on such a bounded domain as an infinite-dimensional analogue of a positive symmetric matrix. The compact embedding hypothesis is what rules out continuous spectrum; without it, the resolvent need not have eigenvalues accumulating only at $0$. The theorem does not give explicit eigenfunctions or sharp estimates for the gaps $\lambda_{k+1}-\lambda_k$, so the next section extracts information from variational formulae corresponding to the matrix Rayleigh quotient. ## Rayleigh-Ritz And Courant-Fischer Formulae How can eigenvalues be found without solving the PDE explicitly? The variational answer is that the spectrum is encoded by the Dirichlet energy on finite-dimensional subspaces of $H^1_0(U)$. The next result is finite-dimensional on purpose. 
Once the compact resolvent gives an $L^2$-orthonormal eigenbasis, the Dirichlet Laplacian behaves on each finite span like a positive symmetric matrix: the vector inner product becomes the $L^2$ inner product, and the matrix quadratic form becomes the Dirichlet energy. The quoted Rayleigh-Ritz theorem is the algebraic model for the PDE formula used in the examples below. [quotetheorem:2063] [citeproof:2063] The Rayleigh-Ritz formula explains the first mode, but it does not yet separate the higher modes. The attainment statement depends on compactness through the existence of the eigenbasis; on unbounded domains an infimum of the same quotient may fail to be an eigenvalue. It also identifies only the first eigenspace, not the ordering of the remaining modes. To locate the $k$-th eigenvalue, the variational problem must force $k$ independent directions or remove the first $k-1$ eigenspaces, leading to a minimisation over $k$-dimensional trial spaces of the largest quotient value and, dually, a maximisation over constraint spaces of codimension $k-1$ of the smallest quotient value. [quotetheorem:553] [citeproof:553] The min-max principle is especially useful when exact eigenfunctions are unavailable. The finite-dimensional and finite-codimensional hypotheses are essential: without dimension control, the quotient can be forced toward the bottom of the spectrum and no longer identifies the $k$-th eigenvalue. The result gives variational comparisons, not pointwise information about eigenfunctions. It compares domains, coefficients, and boundary conditions by comparing the spaces over which the Rayleigh quotient is tested. [example: Domain Monotonicity For The First Eigenvalue] Let $U_1\subset U_2$ be bounded open sets, and fix $u\in H^1_0(U_1)$ with $u\ne 0$. Let $\widetilde u$ be the zero extension of $u$ to $U_2$, meaning $\widetilde u=u$ on $U_1$ and $\widetilde u=0$ on $U_2\setminus U_1$. 
Since $H^1_0(U_1)$ is the closure of $C_c^\infty(U_1)$ in $H^1(U_1)$, extending compactly supported approximants by zero shows that $\widetilde u\in H^1_0(U_2)$. Also $\widetilde u\ne 0$. The weak gradient of $\widetilde u$ is $\nabla u$ on $U_1$ and $0$ on $U_2\setminus U_1$, so the Dirichlet energy is unchanged: \begin{align*} \int_{U_2}|\nabla \widetilde u|^2\,d\mathcal L^n=\int_{U_1}|\nabla u|^2\,d\mathcal L^n+\int_{U_2\setminus U_1}0\,d\mathcal L^n. \end{align*} Hence \begin{align*} \int_{U_2}|\nabla \widetilde u|^2\,d\mathcal L^n=\int_{U_1}|\nabla u|^2\,d\mathcal L^n. \end{align*} The same zero-extension identity gives the same $L^2$ mass: \begin{align*} \int_{U_2}\widetilde u^2\,d\mathcal L^n=\int_{U_1}u^2\,d\mathcal L^n+\int_{U_2\setminus U_1}0\,d\mathcal L^n. \end{align*} Thus \begin{align*} \int_{U_2}\widetilde u^2\,d\mathcal L^n=\int_{U_1}u^2\,d\mathcal L^n. \end{align*} Therefore the Rayleigh quotient is preserved by zero extension: \begin{align*} R_{U_2}[\widetilde u]=R_{U_1}[u]. \end{align*} By the Rayleigh-Ritz characterisation on $U_2$, \begin{align*} \lambda_1(U_2)=\inf_{w\in H^1_0(U_2)\setminus\{0\}}R_{U_2}[w]. \end{align*} Since $\widetilde u\in H^1_0(U_2)\setminus\{0\}$, this infimum is bounded above by its value at $\widetilde u$: \begin{align*} \lambda_1(U_2)\le R_{U_2}[\widetilde u]. \end{align*} Using $R_{U_2}[\widetilde u]=R_{U_1}[u]$, we get \begin{align*} \lambda_1(U_2)\le R_{U_1}[u]. \end{align*} This holds for every nonzero $u\in H^1_0(U_1)$, so taking the infimum over $H^1_0(U_1)\setminus\{0\}$ gives \begin{align*} \lambda_1(U_2)\le \inf_{u\in H^1_0(U_1)\setminus\{0\}}R_{U_1}[u]. \end{align*} Applying the Rayleigh-Ritz characterisation on $U_1$ yields \begin{align*} \lambda_1(U_2)\le \lambda_1(U_1). \end{align*} Thus enlarging a Dirichlet domain cannot increase the first eigenvalue. [/example] This comparison matches the physical interpretation: a membrane with more room to move has a lower fundamental frequency. 
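The zero-extension argument can be checked on a grid. The sketch below is illustrative only and makes assumptions that are not in the text: it works in one dimension with $U_1=(0,1)\subset U_2=(0,2)$, uses the trial function $u(x)=\sin(\pi x)$, approximates the Rayleigh quotient by finite differences, and relies on the standard value $\lambda_1((0,L))=(\pi/L)^2$. The function name `grid_rayleigh_quotient` is hypothetical.

```python
import math

def grid_rayleigh_quotient(values, h):
    """Grid approximation of R[u] = (int |u'|^2) / (int u^2) for nodal values on a
    uniform grid of spacing h. The first and last nodal values must be zero."""
    energy = sum((values[i + 1] - values[i]) ** 2 for i in range(len(values) - 1)) / h
    mass = h * sum(v * v for v in values)
    return energy / mass

n = 200
h = 1.0 / n
# Trial function u(x) = sin(pi x) on U_1 = (0, 1), with exact zeros at the endpoints.
u1 = [0.0] + [math.sin(math.pi * i * h) for i in range(1, n)] + [0.0]
# Zero extension to U_2 = (0, 2): the same nodal values, followed by zeros on [1, 2].
u2 = u1 + [0.0] * n

r1 = grid_rayleigh_quotient(u1, h)    # quotient computed on U_1
r2 = grid_rayleigh_quotient(u2, h)    # quotient of the zero extension on U_2
lambda1_U2 = (math.pi / 2.0) ** 2     # first Dirichlet eigenvalue of (0, 2)
```

Here `r1` and `r2` agree because the added nodes contribute nothing to either sum, and both values lie above `lambda1_U2`, as the inequality $\lambda_1(U_2)\le R_{U_1}[u]$ predicts.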
Equality questions are more delicate and depend on connectedness and capacity, but the inequality itself is a direct variational consequence. ## Positivity And Simplicity Of The First Eigenvalue Why is the first eigenfunction special among all eigenfunctions? Variationally, the absolute value of a minimiser has the same energy, and analytically the maximum principle prevents a nonnegative eigenfunction from vanishing in the interior. [quotetheorem:6448] [citeproof:6448] The fixed sign property gives more than qualitative information about a minimiser: it constrains the dimension of the first eigenspace. Connectedness is essential, since on a disconnected domain with two congruent components the first eigenspace can have one independent ground state on each component. The positivity theorem also depends on the same maximum-principle hypotheses, so simplicity should be read under those assumptions. If two independent first eigenfunctions existed, a suitable linear combination would have zero integral while still solving the same first-eigenvalue equation, contradicting fixed sign. The next theorem turns that obstruction into the simplicity of the ground state. [quotetheorem:6449] [citeproof:6449] Higher eigenspaces need not be one-dimensional, and their eigenfunctions must change sign on connected domains. The conclusion is specific to the lowest eigenvalue: multiplicities can occur later through symmetry, as on a square or disk. This points toward nodal domain theory, where geometry of the zero set is linked to the ordering of the eigenvalue. [remark: Orthogonality Forces Sign Change] If $u$ is an eigenfunction with eigenvalue $\lambda_k>\lambda_1$, then $u$ is $L^2(U)$-orthogonal to every first eigenfunction. Since a first eigenfunction has a fixed sign, $u$ cannot also have a fixed sign unless $u=0$. Thus higher eigenfunctions on connected domains must change sign in a measure-theoretic sense. 
[/remark] ## Explicit Models And Weyl-Type Intuition What do the abstract eigenvalues look like in domains where separation of variables is available? The simplest examples show the spectrum as quantised frequencies, and they explain why the number of eigenvalues below a large threshold grows like a phase-space volume. [example: Eigenvalues On An Interval] Let $U=(0,L)\subset \mathbb R$. We compute the nonzero solutions of \begin{align*} -u''=\lambda u,\qquad u(0)=u(L)=0. \end{align*} If $\lambda=0$, then $u''=0$, so $u(x)=Ax+B$. The condition $u(0)=0$ gives $B=0$, and the condition $u(L)=0$ gives $AL+B=0$. Since $L>0$, this implies $A=0$, so $u=0$. If $\lambda<0$, write $\lambda=-\alpha^2$ with $\alpha>0$. Then \begin{align*} u(x)=Ae^{\alpha x}+Be^{-\alpha x}. \end{align*} The condition $u(0)=0$ gives $A+B=0$, so $B=-A$. Therefore \begin{align*} u(L)=A(e^{\alpha L}-e^{-\alpha L}). \end{align*} Since $\alpha L>0$, the factor $e^{\alpha L}-e^{-\alpha L}$ is nonzero, and $u(L)=0$ forces $A=0$. Hence $B=0$ as well, so again $u=0$. Thus a nonzero Dirichlet eigenfunction must have $\lambda>0$. Write $\beta=\sqrt{\lambda}$. The equation becomes $u''+\beta^2u=0$, so \begin{align*} u(x)=A\cos(\beta x)+B\sin(\beta x). \end{align*} The first boundary condition gives \begin{align*} 0=u(0)=A\cos 0+B\sin 0=A. \end{align*} Thus $u(x)=B\sin(\beta x)$. The second boundary condition gives \begin{align*} 0=u(L)=B\sin(\beta L). \end{align*} For a nonzero solution, $B\ne 0$, so $\sin(\beta L)=0$. Hence $\beta L=m\pi$ for some $m\in\mathbb N$, and therefore \begin{align*} \lambda_m=\beta^2=\left(\frac{m\pi}{L}\right)^2. \end{align*} For this value of $\lambda_m$, the eigenspace consists of the nonzero scalar multiples of \begin{align*} \sin\left(\frac{m\pi x}{L}\right). \end{align*} The normalising constant comes from the identity $\sin^2 t=(1-\cos(2t))/2$. Thus \begin{align*} \int_0^L \sin^2\left(\frac{m\pi x}{L}\right)\,dx=\int_0^L \frac{1-\cos\left(\frac{2m\pi x}{L}\right)}{2}\,dx. 
\end{align*} The first term contributes $L/2$, while the cosine term contributes \begin{align*} \frac{1}{2}\int_0^L \cos\left(\frac{2m\pi x}{L}\right)\,dx=\frac{1}{2}\left[\frac{L}{2m\pi}\sin\left(\frac{2m\pi x}{L}\right)\right]_{0}^{L}=0. \end{align*} Therefore \begin{align*} \int_0^L \sin^2\left(\frac{m\pi x}{L}\right)\,dx=\frac{L}{2}. \end{align*} So \begin{align*} \varphi_m(x)=\sqrt{\frac{2}{L}}\sin\left(\frac{m\pi x}{L}\right) \end{align*} has $L^2(0,L)$ norm $1$. If $m\ne l$, then the product-to-sum identity gives \begin{align*} \varphi_m(x)\varphi_l(x)=\frac{1}{L}\left[\cos\left(\frac{(m-l)\pi x}{L}\right)-\cos\left(\frac{(m+l)\pi x}{L}\right)\right]. \end{align*} Integrating the first cosine term gives \begin{align*} \int_0^L \cos\left(\frac{(m-l)\pi x}{L}\right)\,dx=\left[\frac{L}{(m-l)\pi}\sin\left(\frac{(m-l)\pi x}{L}\right)\right]_{0}^{L}=0, \end{align*} because $m-l$ is a nonzero integer. Similarly, \begin{align*} \int_0^L \cos\left(\frac{(m+l)\pi x}{L}\right)\,dx=\left[\frac{L}{(m+l)\pi}\sin\left(\frac{(m+l)\pi x}{L}\right)\right]_{0}^{L}=0. \end{align*} Hence \begin{align*} \int_0^L \varphi_m(x)\varphi_l(x)\,dx=0 \end{align*} whenever $m\ne l$. Thus $(\varphi_m)_{m=1}^\infty$ is the normalised Fourier sine eigenbasis of $L^2(0,L)$, and the interval spectrum is the sequence $\lambda_m=(m\pi/L)^2$. [/example] The interval already exhibits discreteness, divergence to infinity, and the relation between wavelength and eigenvalue. Rectangles add multiplicity and lattice-point counting. [example: Separation Of Variables On A Rectangle] Let $U=(0,a)\times(0,b)\subset \mathbb R^2$, with $a,b>0$. For a separated solution $u(x,y)=X(x)Y(y)$ of $-\Delta u=\lambda u$, the Laplacian is \begin{align*} -\Delta u=-X''(x)Y(y)-X(x)Y''(y). \end{align*} Thus \begin{align*} -X''(x)Y(y)-X(x)Y''(y)=\lambda X(x)Y(y). \end{align*} At points where $X(x)Y(y)\ne 0$, division by $X(x)Y(y)$ gives \begin{align*} -\frac{X''(x)}{X(x)}-\frac{Y''(y)}{Y(y)}=\lambda. 
\end{align*} Since the first term depends only on $x$ and the second only on $y$, each term is constant. Write \begin{align*} -\frac{X''(x)}{X(x)}=\alpha,\qquad -\frac{Y''(y)}{Y(y)}=\beta,\qquad \alpha+\beta=\lambda. \end{align*} The Dirichlet conditions on the vertical sides give $X(0)=X(a)=0$ for a nonzero separated solution, and the Dirichlet conditions on the horizontal sides give $Y(0)=Y(b)=0$. By the interval computation, the nonzero solutions of the $x$-problem are scalar multiples of \begin{align*} \sin\left(\frac{m\pi x}{a}\right),\qquad m\in\mathbb N, \end{align*} with \begin{align*} \alpha=\left(\frac{m\pi}{a}\right)^2. \end{align*} Similarly, the nonzero solutions of the $y$-problem are scalar multiples of \begin{align*} \sin\left(\frac{l\pi y}{b}\right),\qquad l\in\mathbb N, \end{align*} with \begin{align*} \beta=\left(\frac{l\pi}{b}\right)^2. \end{align*} Therefore the separated eigenfunctions are scalar multiples of \begin{align*} \sin\left(\frac{m\pi x}{a}\right)\sin\left(\frac{l\pi y}{b}\right), \end{align*} and the corresponding eigenvalue is \begin{align*} \lambda_{m,l}=\left(\frac{m\pi}{a}\right)^2+\left(\frac{l\pi}{b}\right)^2. \end{align*} Equivalently, \begin{align*} \lambda_{m,l}=\pi^2\left(\frac{m^2}{a^2}+\frac{l^2}{b^2}\right). \end{align*} The normalising constant separates because the integrand is a product of a function of $x$ and a function of $y$: \begin{align*} \int_0^a\int_0^b \sin^2\left(\frac{m\pi x}{a}\right)\sin^2\left(\frac{l\pi y}{b}\right)\,dy\,dx=\left(\int_0^a \sin^2\left(\frac{m\pi x}{a}\right)\,dx\right)\left(\int_0^b \sin^2\left(\frac{l\pi y}{b}\right)\,dy\right). \end{align*} Using the one-dimensional integral from the interval example on $(0,a)$ and on $(0,b)$, \begin{align*} \int_0^a \sin^2\left(\frac{m\pi x}{a}\right)\,dx=\frac{a}{2}. \end{align*} Also, \begin{align*} \int_0^b \sin^2\left(\frac{l\pi y}{b}\right)\,dy=\frac{b}{2}. 
\end{align*} Hence \begin{align*} \int_0^a\int_0^b \sin^2\left(\frac{m\pi x}{a}\right)\sin^2\left(\frac{l\pi y}{b}\right)\,dy\,dx=\frac{ab}{4}. \end{align*} Thus \begin{align*} \varphi_{m,l}(x,y)=\frac{2}{\sqrt{ab}}\sin\left(\frac{m\pi x}{a}\right)\sin\left(\frac{l\pi y}{b}\right),\qquad m,l\in\mathbb N, \end{align*} has $L^2(U)$ norm $1$. To verify the eigenvalue equation for the normalised functions, compute the second derivatives separately: \begin{align*} \partial_x^2\varphi_{m,l}=-\left(\frac{m\pi}{a}\right)^2\varphi_{m,l}. \end{align*} Likewise, \begin{align*} \partial_y^2\varphi_{m,l}=-\left(\frac{l\pi}{b}\right)^2\varphi_{m,l}. \end{align*} Therefore \begin{align*} -\Delta\varphi_{m,l}=\left(\frac{m\pi}{a}\right)^2\varphi_{m,l}+\left(\frac{l\pi}{b}\right)^2\varphi_{m,l}. \end{align*} Factoring out $\varphi_{m,l}$ gives \begin{align*} -\Delta\varphi_{m,l}=\pi^2\left(\frac{m^2}{a^2}+\frac{l^2}{b^2}\right)\varphi_{m,l}. \end{align*} Thus the rectangle has separated Dirichlet eigenfunctions $\varphi_{m,l}$ with eigenvalues $\lambda_{m,l}$. Repeated values occur exactly when distinct lattice points $(m,l)$ give the same number $\frac{m^2}{a^2}+\frac{l^2}{b^2}$, so multiplicity is a geometric arithmetic feature of the side lengths. [/example] The rectangle formula turns the qualitative theorem into a quantitative prediction. If eigenvalues correspond to lattice points in frequency space, then counting eigenvalues below $\Lambda$ should resemble counting lattice points inside a ball of radius $\sqrt{\Lambda}$. This heuristic is important because it separates the leading dependence on the volume of $U$ from lower-order boundary effects. [illustration:weyl-lattice-counting] The next observation records this heuristic as the standard leading asymptotic for sufficiently regular bounded domains. 
It should be read as an ordering statement about the compact-resolvent eigenvalue sequence: since $\lambda_k\to\infty$, the finite quantity $N(\Lambda)$ measures how densely that discrete sequence is packed below a large threshold. [remark: Weyl-Type Counting Intuition] Let $U\subset \mathbb R^n$ be a bounded domain with Lipschitz boundary. The eigenvalue counting function \begin{align*} N(\Lambda)=|\{k\in\mathbb N:\lambda_k\le \Lambda\}| \end{align*} has leading-order growth \begin{align*} N(\Lambda)\sim \frac{\omega_n}{(2\pi)^n}\mathcal L^n(U)\Lambda^{n/2} \end{align*} as $\Lambda\to\infty$, where $\omega_n=\mathcal L^n(B(0,1))$. The Lipschitz hypothesis is one standard way to exclude boundary pathologies while retaining the compact Sobolev embedding and the usual asymptotic law. For highly irregular domains, the compactness of the embedding may fail, the boundary may contribute more than a lower-order correction, or the Dirichlet problem may no longer have the same discrete eigenvalue ordering in the form used above. The leading term depends only on the volume of $U$; it does not determine multiplicities or the lower-order boundary corrections. Thus Weyl's law connects the compact-resolvent spectrum to phase-space geometry, while the earlier min-max principles remain the tools for individual eigenvalue comparisons. [/remark] The spectral theorem, Rayleigh quotient, and positivity theory together give a complete qualitative picture of the Dirichlet Laplacian on bounded domains: the spectrum is discrete, the first mode is variationally minimal and sign-definite, and higher modes are organised by orthogonality and increasing energy. These ideas also connect to quantum mechanics, vibrating membranes, heat flow, and numerical finite element methods, where the same compact-resolvent mechanism turns an elliptic operator into a countable family of modes. 
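The counting heuristic can be tested directly on a rectangle, where the eigenvalues are known in closed form. The Python sketch below is illustrative and rests on the separated eigenvalues $\lambda_{m,l}=\pi^2(m^2/a^2+l^2/b^2)$ computed in the rectangle example; the function names are hypothetical. It counts eigenvalues up to a threshold $\Lambda$ and compares the count with the leading term, which for $n=2$ and $\omega_2=\pi$ equals $ab\,\Lambda/(4\pi)$.

```python
import math

def count_rectangle_eigenvalues(a, b, lam):
    """Count Dirichlet eigenvalues pi^2 (m^2/a^2 + l^2/b^2) <= lam on (0,a) x (0,b),
    with multiplicity, by enumerating lattice points (m, l) with m, l >= 1."""
    count = 0
    m = 1
    while (m * math.pi / a) ** 2 <= lam:
        l = 1
        while (m * math.pi / a) ** 2 + (l * math.pi / b) ** 2 <= lam:
            count += 1
            l += 1
        m += 1
    return count

def weyl_leading_term(a, b, lam):
    """Leading Weyl term omega_2 / (2 pi)^2 * area * lam in dimension 2."""
    return a * b * lam / (4.0 * math.pi)

threshold = 1.0e4
counted = count_rectangle_eigenvalues(1.0, 1.0, threshold)
predicted = weyl_leading_term(1.0, 1.0, threshold)
ratio = counted / predicted
```

On a rectangle the count never exceeds the leading term: each lattice point $(m,l)$ with $m,l\ge 1$ is the upper-right corner of a unit square contained in the quarter ellipse $\pi^2(x^2/a^2+y^2/b^2)\le\Lambda$, whose area is $ab\,\Lambda/(4\pi)$. The shortfall reflects the lower-order boundary effects mentioned in the remark.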
The spectral picture is cleanest when the operator is coercive, but many important problems fall just short of that ideal. Chapter 10 shows how Fredholm theory recovers solvability in the noncoercive case by isolating the finite-dimensional obstructions and imposing the correct compatibility conditions. # 10. Fredholm theory and noncoercive elliptic equations This chapter studies what remains of the variational method when coercivity fails in a finite-dimensional way. Chapter 5 used Lax--Milgram to solve elliptic equations by proving a uniform lower bound for the energy. Many natural boundary value problems, especially Neumann problems and eigenvalue-shifted equations, do not satisfy such a bound on the full Hilbert space. Fredholm theory explains why the failure is still manageable: compactness turns noncoercive elliptic problems into finite-dimensional obstructions plus closed-range solvability. ## Compact Perturbations of Coercive Problems The problem is to solve weak equations whose leading part is coercive but whose lower-order part can destroy positivity. A typical form is: find $u \in H$ such that \begin{align*} B[u,v] = F(v) \quad \text{for all } v \in H, \end{align*} where $H$ is a Hilbert space, $F \in H^*$, and $B$ is bounded but not coercive. The guiding question is whether compactness can replace coercivity up to a finite-dimensional error. [definition: Compact Perturbation of a Coercive Form] Let $H$ be a Hilbert space. A bounded bilinear form $B:H \times H \to \mathbb R$ is a compact perturbation of a coercive form if there are bounded bilinear forms $B_0$ and $K$ such that $B[u,v] = B_0[u,v] + K[u,v]$, the estimate \begin{align*} B_0[u,u] \ge \alpha \|u\|_H^2 \quad \text{for all } u \in H \end{align*} holds for some $\alpha>0$, and the operator $T_K:H\to H$ defined by $K[u,v]=(T_Ku,v)_H$ is compact. [/definition] The definition isolates the part already handled by Lax--Milgram from the part controlled by compactness. 
In elliptic equations, $B_0$ usually contains the principal second-order terms and a large positive zeroth-order term, while $K$ contains a negative zeroth-order correction or a compact embedding contribution. [example: Shifted Dirichlet Laplacian as Compact Perturbation] Let $U\subset \mathbb R^n$ be bounded, let $H=H^1_0(U)$, and use the equivalent Hilbert norm $\|u\|_H=\|\nabla u\|_{L^2(U)}$, justified by the *Poincare inequality*. Fix $\lambda\in\mathbb R$ and let $B[u,v]=\int_U \nabla u\cdot \nabla v\,d\mathcal L^n-\lambda\int_U uv\,d\mathcal L^n$ be the bilinear form of the weak equation for $-\Delta u-\lambda u=f$. Decompose the form as $B=B_0+K$, where \begin{align*} B_0[u,v]=\int_U \nabla u\cdot \nabla v\,d\mathcal L^n \end{align*} and \begin{align*} K[u,v]=-\lambda\int_U uv\,d\mathcal L^n. \end{align*} For every $u\in H$, \begin{align*} B_0[u,u]=\int_U |\nabla u|^2\,d\mathcal L^n=\|u\|_H^2, \end{align*} so $B_0$ is coercive with coercivity constant $1$ in this norm. It remains to identify the compact part. Let $J:H^1_0(U)\to L^2(U)$ be the inclusion map. By the *Rellich compact embedding theorem*, $J$ is compact. Define $T_K:H\to H$ by the [Riesz identity](/theorems/3164) \begin{align*} (T_Ku,v)_H=K[u,v]=-\lambda\int_U uv\,d\mathcal L^n \quad \text{for all } v\in H. \end{align*} If $(u_j)$ is bounded in $H$, compactness of $J$ gives a subsequence, still written $(u_j)$, that converges, and in particular is Cauchy, in $L^2(U)$. For this subsequence, \begin{align*} \|T_Ku_j-T_Ku_k\|_H=\sup_{\|v\|_H=1}\left|(T_K(u_j-u_k),v)_H\right|. \end{align*} Using the definition of $T_K$, \begin{align*} \left|(T_K(u_j-u_k),v)_H\right|=\left|\lambda\int_U (u_j-u_k)v\,d\mathcal L^n\right|. \end{align*} By Cauchy--Schwarz in $L^2(U)$, \begin{align*} \left|\lambda\int_U (u_j-u_k)v\,d\mathcal L^n\right|\le |\lambda|\,\|u_j-u_k\|_{L^2(U)}\|v\|_{L^2(U)}. \end{align*} The *Poincare inequality* gives $\|v\|_{L^2(U)}\le C_P\|v\|_H$, and therefore, when $\|v\|_H=1$, \begin{align*} \left|(T_K(u_j-u_k),v)_H\right|\le |\lambda|C_P\|u_j-u_k\|_{L^2(U)}. \end{align*} Taking the supremum over all such $v$ gives \begin{align*} \|T_Ku_j-T_Ku_k\|_H\le |\lambda|C_P\|u_j-u_k\|_{L^2(U)}. 
\end{align*} Since $(u_j)$ is Cauchy in $L^2(U)$, the sequence $(T_Ku_j)$ is Cauchy in $H$, hence convergent because $H$ is complete. Thus every bounded sequence has a subsequence whose image under $T_K$ converges in $H$, so $T_K$ is compact. Hence the shifted Dirichlet form is a compact perturbation of a coercive form. The weak equation for $-\Delta u-\lambda u=f$ therefore has Fredholm structure: solvability can fail only through a finite-dimensional kernel, and that kernel is nontrivial exactly when $\lambda$ is a Dirichlet eigenvalue. [/example] This example suggests the finite-dimensional obstruction. A compact perturbation of an invertible operator need not remain invertible, but its failure of invertibility is controlled by a finite-dimensional kernel and a finite-codimensional range. [quotetheorem:219] [citeproof:219] This theorem is not a small perturbation result: the compact operator $K$ can have large norm, so invertibility is allowed to fail. What compactness rules out is infinite-dimensional failure at the eigenvalue $1$; without compactness, an operator such as $I-S$, with $S$ the unilateral shift on $\ell^2$, is injective with dense but non-closed range, and worse range behaviour is possible. For elliptic equations this theorem is applied after using a coercive isomorphism to rewrite the weak problem in the form $I-K$. The adjoint kernel is the space of weak solutions to the corresponding homogeneous adjoint equation, so the compatibility condition becomes orthogonality against adjoint homogeneous solutions. [quotetheorem:4877] [citeproof:4877] The point is not that lower-order terms are harmless; rather, under compact embedding they can obstruct solvability only through finite-dimensional eigenspaces. The compactness hypothesis is essential: on an unbounded domain the embedding into $L^2$ is generally not compact, and continuous spectrum can replace the finite-dimensional obstruction. 
The theorem also does not identify the adjoint kernel explicitly unless the adjoint weak problem is known; in nonsymmetric problems the compatibility conditions are against adjoint solutions, not necessarily against solutions of the original homogeneous equation. The next section gives the most important boundary-condition example, where the obstruction is already present before adding any potential term. ## Neumann Problems and Compatibility Conditions The Neumann problem asks for a function whose normal derivative, rather than its boundary value, is prescribed. Constants have zero gradient and zero normal derivative, so the energy cannot distinguish $u$ from $u+C$. The central question is how to formulate existence and uniqueness when constants lie in the kernel. [definition: Weak Neumann Problem] Let $U\subset\mathbb R^n$ be a bounded Lipschitz domain, let $f\in L^2(U)$, and let $g\in L^2(\partial U,\mathcal H^{n-1})$. A function $u\in H^1(U)$ is a weak solution of the Neumann problem with source $f$ and boundary flux $g$ if \begin{align*} \int_U \nabla u\cdot \nabla v\,d\mathcal L^n=\int_U fv\,d\mathcal L^n+\int_{\partial U}gv\,d\mathcal H^{n-1} \end{align*} for all $v\in H^1(U)$. [/definition] The weak formulation includes constant test functions, which is the feature absent from the Dirichlet problem. The immediate question is what the equation says when the test function measures only total mass, and this produces the compatibility condition required for solvability. [quotetheorem:677] [citeproof:677] This theorem is the prototype for Fredholm solvability: the kernel consists of constants, and the right-hand side must annihilate that kernel. The connectedness hypothesis is doing real work; if $U$ has two components, a function may be constant with different values on the two components, and one global integral condition no longer detects all kernel directions. 
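The compatibility condition has a transparent discrete analogue. The following Python sketch is an illustration only, under assumptions not made in the text: it takes $U=(0,1)$, zero boundary flux, and a cell-centred finite-volume discretisation of $-u''=f$; the function name `solve_neumann_mean_zero` and the tolerance are hypothetical choices. Summing the cell equations forces the total source to vanish, and when it does, the solution is recovered by accumulating the flux and then removing the free constant through a zero-mean normalisation.

```python
import math

def solve_neumann_mean_zero(f, h, tol=1e-9):
    """Solve the cell-centred scheme for -u'' = f on (0, 1) with zero flux at both
    ends and return the zero-mean solution. `f` holds the source value in each cell."""
    total_source = h * sum(f)
    if abs(total_source) > tol:
        # Summing all cell equations gives total_source = 0: the compatibility condition.
        raise ValueError("incompatible data: total source must vanish")
    u = [0.0]
    flux = 0.0  # q = -u' at the current cell interface; zero at the left boundary
    for i in range(len(f) - 1):
        flux += h * f[i]            # q_{i+1/2} = q_{i-1/2} + h * f_i
        u.append(u[-1] - h * flux)  # u_{i+1} = u_i - h * q_{i+1/2}
    mean = sum(u) / len(u)
    return [ui - mean for ui in u]  # fix the free constant by zero mean

n = 100
h = 1.0 / n
centres = [(i + 0.5) * h for i in range(n)]
f = [math.cos(math.pi * x) for x in centres]  # source with zero total mass
u = solve_neumann_mean_zero(f, h)
```

For this source the continuous zero-mean solution is $\cos(\pi x)/\pi^2$, and the discrete values stay close to it. Replacing `f` by a constant nonzero source makes the function raise, which is the discrete form of the failed compatibility condition.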
The theorem also does not choose a canonical solution until a normalisation such as zero mean is imposed. To use Lax--Milgram after removing constants, the course needs an estimate saying that the gradient controls the whole $H^1$ norm once the constant part has been fixed. [quotetheorem:75] [citeproof:75] The estimate lets us choose the mean-zero representative, but this choice is a convenience rather than part of the PDE. Boundedness and connectedness are both essential in the form stated: constants are the only zero-gradient functions on a connected domain, while disconnected domains have one independent constant on each component. The estimate does not control the mean of $u$; it controls only the oscillatory part $u-u_U$, which is exactly what the Neumann energy sees. When the solution is intrinsically determined only up to constants, it is often cleaner to make constants invisible from the start. [definition: Quotient by Constants] Let $U\subset\mathbb R^n$ be a bounded connected domain. The quotient space $H^1(U)/\mathbb R$ is the space of equivalence classes $[u]$ under the relation $u\sim v$ iff $u-v$ is a.e. constant on $U$. [/definition] On $H^1(U)/\mathbb R$, the seminorm $\|\nabla u\|_{L^2(U)}$ becomes a genuine norm. The Poincare inequality modulo constants says that this norm is equivalent to the quotient norm inherited from $H^1(U)$. [example: Poisson Neumann Equation with Zero Mean Forcing] Let $U\subset\mathbb R^n$ be bounded, connected, and Lipschitz, and consider the Poisson--Neumann problem with zero boundary flux: \begin{align*} \int_U \nabla u\cdot \nabla v\,d\mathcal L^n=\int_U fv\,d\mathcal L^n \quad \text{for all } v\in H^1(U). \end{align*} If $f\in L^2(U)$ satisfies \begin{align*} \int_U f\,d\mathcal L^n=0, \end{align*} then the compatibility condition from the *[Neumann Compatibility Condition](/theorems/677)* holds, because the boundary term is \begin{align*} \int_{\partial U}0\cdot v\,d\mathcal H^{n-1}=0. 
\end{align*} On the mean-zero space \begin{align*} H^1_\diamond(U)=\left\{w\in H^1(U):\int_U w\,d\mathcal L^n=0\right\}, \end{align*} the bilinear form \begin{align*} a(u,v)=\int_U \nabla u\cdot \nabla v\,d\mathcal L^n \end{align*} is coercive by the *Poincare Inequality Modulo Constants*. Thus the unique mean-zero solution $u\in H^1_\diamond(U)$ is characterized by \begin{align*} \int_U \nabla u\cdot \nabla v\,d\mathcal L^n=\int_U fv\,d\mathcal L^n \quad \text{for all } v\in H^1_\diamond(U). \end{align*} If instead $f$ has nonzero mean and $u\in H^1(U)$ were a weak solution, then choosing the admissible test function $v=1$ would give \begin{align*} \int_U \nabla u\cdot \nabla 1\,d\mathcal L^n=\int_U f\cdot 1\,d\mathcal L^n. \end{align*} Since $\nabla 1=0$, the left side is \begin{align*} \int_U \nabla u\cdot 0\,d\mathcal L^n=0, \end{align*} so the equation would force \begin{align*} 0=\int_U f\,d\mathcal L^n, \end{align*} contradicting the assumed nonzero mean. Thus the condition is not a technical artefact but the conservation law that total source must equal total outward flux. [/example] This example also clarifies the role of connectedness. If $U$ has several connected components, the kernel consists of functions that are constant on each component, and the compatibility condition must hold separately on each component. ## Weak Maximum Principles and Spectral Obstructions Maximum principles gave uniqueness and sign information for coercive elliptic problems, but Fredholm theory shows that uniqueness can fail at eigenvalues. The question in this section is how positivity arguments interact with spectral obstructions. The answer is that maximum principles are powerful below the first eigenvalue, while resonance at eigenvalues forces orthogonality conditions instead of pointwise comparison. [definition: First Dirichlet Eigenvalue] Let $U\subset\mathbb R^n$ be a bounded domain. 
The first Dirichlet eigenvalue of $-\Delta$ is \begin{align*} \lambda_1(U)=\inf_{u\in H^1_0(U)\setminus\{0\}}\frac{\int_U |\nabla u|^2\,d\mathcal L^n}{\int_U |u|^2\,d\mathcal L^n}. \end{align*} [/definition] This variational number is the threshold at which the bilinear form for $-\Delta-\lambda$ stops being coercive. Below it, the energy estimate behind the maximum principle remains available. [quotetheorem:6451] [citeproof:6451] Coercivity below $\lambda_1(U)$ restores uniqueness by Lax--Milgram. The strict inequality $\lambda<\lambda_1(U)$ is necessary for this conclusion: at $\lambda=\lambda_1(U)$, a first eigenfunction makes the shifted quadratic form vanish, so no positive coercivity constant can exist. The theorem is an energy estimate, not a pointwise statement; it does not by itself say that solutions are positive or negative. It supports the weak maximum principle for the operator $-\Delta-\lambda$ when the sign convention and boundary hypotheses are compatible. [quotetheorem:6452] [citeproof:6452] The sign of the variational inequality is essential: reversing it gives the corresponding lower bound statement after testing with the negative part. The boundary condition $u\in H^1_0(U)$ is also essential because it permits $u^+$ as an admissible zero-boundary test function and excludes positive constants on the boundary. At $\lambda=\lambda_1(U)$ the same test no longer forces $u^+=0$, because the quadratic form may vanish on a nonzero first eigenspace. The natural replacement question is no longer whether comparison gives uniqueness, but whether the forcing term misses the homogeneous modes that prevent inversion. [quotetheorem:6453] [citeproof:6453] The self-adjointness in this theorem makes the adjoint obstruction identical to the original eigenspace. 
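For a concrete instance, take $U=(0,\pi)$, where the first Dirichlet eigenvalue is $\lambda_1(U)=1$ with eigenfunction $\sin x$. The sketch below is a numerical illustration under that assumption, using a plain midpoint rule; the helper names `midpoint_integral` and `rayleigh_quotient` are hypothetical. It evaluates the Rayleigh quotient at the eigenfunction and at the trial function $x(\pi-x)$, and it computes the pairing of two forcings with the eigenfunction: $f=\sin x$, which points in the resonant direction, and $f=\sin 2x$, which is $L^2$-orthogonal to it.

```python
import math

def midpoint_integral(func, n=2000):
    """Approximate the integral of func over (0, pi) by the midpoint rule."""
    h = math.pi / n
    return h * sum(func((i + 0.5) * h) for i in range(n))

def rayleigh_quotient(u, du):
    """Ratio of the Dirichlet energy of u to its squared L^2 norm on (0, pi).
    The derivative du is supplied by the caller."""
    numerator = midpoint_integral(lambda x: du(x) ** 2)
    denominator = midpoint_integral(lambda x: u(x) ** 2)
    return numerator / denominator

# The eigenfunction sin attains the infimum lambda_1 = 1.
r_sine = rayleigh_quotient(math.sin, math.cos)

# Any other admissible trial function gives a value at least lambda_1;
# for x(pi - x) the exact quotient is 10 / pi^2.
r_trial = rayleigh_quotient(lambda x: x * (math.pi - x),
                            lambda x: math.pi - 2.0 * x)

# Pairings of two forcings with the first eigenfunction.
resonant_pairing = midpoint_integral(lambda x: math.sin(x) * math.sin(x))
compatible_pairing = midpoint_integral(lambda x: math.sin(2.0 * x) * math.sin(x))

print(r_sine, r_trial)
print(resonant_pairing, compatible_pairing)
```

The forcing $\sin 2x$ passes the orthogonality test, and one can check by hand that $u=\tfrac13\sin 2x$ solves $-u''=u+\sin 2x$ with zero boundary values, while the forcing $\sin x$ has pairing $\pi/2\ne0$.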
The hypothesis that $\lambda$ is an eigenvalue is precisely the resonant case; if $\lambda$ is not in the Dirichlet spectrum, [the Fredholm alternative](/page/The%20Fredholm%20Alternative) gives uniqueness and no orthogonality condition is needed. The theorem does not select one solution from the affine family until a complementary condition such as $u\perp E_\lambda$ is imposed. For nonsymmetric elliptic operators, the same statement holds with the kernel of the adjoint operator in place of $E_\lambda$. [example: Resonance for a Forced Eigenvalue Equation] Let $\phi_1\in H^1_0(U)$ be a first Dirichlet eigenfunction normalised by \begin{align*} \int_U \phi_1^2\,d\mathcal L^n=1, \end{align*} and consider the resonant equation \begin{align*} -\Delta u=\lambda_1(U)u+f, \qquad u\in H^1_0(U). \end{align*} Its weak form is: find $u\in H^1_0(U)$ such that, for every $v\in H^1_0(U)$, \begin{align*} \int_U \nabla u\cdot \nabla v\,d\mathcal L^n=\lambda_1(U)\int_U uv\,d\mathcal L^n+\int_U fv\,d\mathcal L^n. \end{align*} If $f=\phi_1$ and such a solution $u$ existed, then taking $v=\phi_1$ in the weak formulation would give \begin{align*} \int_U \nabla u\cdot \nabla \phi_1\,d\mathcal L^n=\lambda_1(U)\int_U u\phi_1\,d\mathcal L^n+\int_U \phi_1\phi_1\,d\mathcal L^n. \end{align*} Since $\phi_1\phi_1=\phi_1^2$ and $\phi_1$ is $L^2$-normalised, this becomes \begin{align*} \int_U \nabla u\cdot \nabla \phi_1\,d\mathcal L^n=\lambda_1(U)\int_U u\phi_1\,d\mathcal L^n+1. \end{align*} On the other hand, the weak eigenfunction identity for $\phi_1$ says that, for every $w\in H^1_0(U)$, \begin{align*} \int_U \nabla \phi_1\cdot \nabla w\,d\mathcal L^n=\lambda_1(U)\int_U \phi_1w\,d\mathcal L^n. \end{align*} Taking $w=u$ gives \begin{align*} \int_U \nabla \phi_1\cdot \nabla u\,d\mathcal L^n=\lambda_1(U)\int_U \phi_1u\,d\mathcal L^n. 
\end{align*} By symmetry of the dot product and ordinary multiplication, \begin{align*} \int_U \nabla u\cdot \nabla \phi_1\,d\mathcal L^n=\lambda_1(U)\int_U u\phi_1\,d\mathcal L^n. \end{align*} The two identities for $\int_U \nabla u\cdot \nabla \phi_1\,d\mathcal L^n$ therefore force \begin{align*} \lambda_1(U)\int_U u\phi_1\,d\mathcal L^n+1=\lambda_1(U)\int_U u\phi_1\,d\mathcal L^n. \end{align*} Subtracting the common term gives $1=0$, a contradiction. Thus forcing in the resonant eigendirection prevents solvability. If instead $f$ satisfies \begin{align*} \int_U f\phi\,d\mathcal L^n=0 \quad \text{for every first Dirichlet eigenfunction } \phi, \end{align*} then the compatibility condition in *Resonant Dirichlet Solvability* is satisfied at $\lambda=\lambda_1(U)$. Hence a solution exists, and any two solutions differ by an element of the first eigenspace. [/example] The comparison between the Neumann case and the resonant Dirichlet case is the main lesson of the chapter. Noncoercive elliptic problems are not beyond variational methods; they require identifying the kernel, imposing the corresponding compatibility conditions, and solving on a complementary subspace where a Poincare-type estimate restores control. After dealing with noncoercive linear problems, the course moves to situations where the admissible set itself is constrained. Chapter 11 replaces linear equations with variational inequalities, so the boundary value problem is governed by an obstacle-type condition rather than an unconstrained Euler-Lagrange equation. # 11. Variational inequalities and constrained elliptic problems Variational methods become more flexible when the admissible functions are not a linear space. In this chapter the unknown is required to stay inside a closed convex set, such as the set of Sobolev functions lying above a prescribed obstacle. 
The Euler--Lagrange equation is then replaced by a variational inequality, because not every perturbation of the minimizer remains admissible. This leads to the obstacle problem, complementarity conditions, and approximation by penalized equations. The prerequisites are the weak formulation of elliptic Dirichlet problems, coercive bilinear forms and Lax--Milgram from Chapter 5, Sobolev spaces $H^1_0(\Omega)$ and $H^{-1}(\Omega)$ from Chapter 4, and the direct method in the calculus of variations from Chapter 6. ## From Weak Equations to Variational Inequalities What replaces the equation $B[u,v]=F(v)$ when the allowed variations can only point into the admissible set? The answer is an inequality against all competitors in the constraint set. The geometry is the same as projecting a point onto a closed convex subset of a Hilbert space, but the analytic form is adapted to elliptic bilinear forms. [definition: Closed Convex Constraint Set] Let $H$ be a real Hilbert space. A subset $K \subset H$ is a closed convex constraint set if $K$ is nonempty, closed in the norm topology of $H$, and satisfies \begin{align*} (1-t)u+tv \in K \end{align*} for all $u,v\in K$ and all $t\in[0,1]$. [/definition] Closedness gives stability under strong limits (and, together with convexity, under weak limits), while convexity says that the line segment from one admissible state to another remains admissible. These two properties are exactly what is needed to formulate first variations along rays $u+t(v-u)$. [example: Projection Onto a Closed Convex Set] Let $H=\mathbb R^2$ with the Euclidean inner product, and let \begin{align*} K=\{(x_1,x_2)\in\mathbb R^2:x_2\ge 0\}. \end{align*} Fix $z=(a,b)$ with $b<0$. We show that the closest point in $K$ is $u=(a,0)$ and that this point satisfies the projection inequality against every feasible direction. Take an arbitrary $v=(x,y)\in K$. Then $y\ge 0$, so $y-b\ge -b>0$. Hence \begin{align*} (y-b)^2\ge (-b)^2=b^2. 
\end{align*} Also $(a-x)^2\ge 0$, and therefore \begin{align*} |z-v|^2=|(a,b)-(x,y)|^2=(a-x)^2+(b-y)^2=(a-x)^2+(y-b)^2\ge b^2. \end{align*} For $u=(a,0)$, we have \begin{align*} |z-u|^2=|(a,b)-(a,0)|^2=|(0,b)|^2=b^2. \end{align*} Thus every point of $K$ is at distance at least $|b|$ from $z$, and $u$ attains that distance, so $u$ is a closest point in $K$. Now let $v=(x,y)\in K$ again. Then \begin{align*} z-u=(a,b)-(a,0)=(0,b). \end{align*} Also \begin{align*} v-u=(x,y)-(a,0)=(x-a,y). \end{align*} Taking the Euclidean inner product gives \begin{align*} (z-u)\cdot(v-u)=(0,b)\cdot(x-a,y)=0\cdot(x-a)+by=by. \end{align*} Since $b<0$ and $y\ge0$, we have $by\le0$, and hence \begin{align*} (z-u)\cdot(v-u)\le0. \end{align*} The nearest feasible point is therefore detected by an inequality against all directions that remain inside $K$, not by requiring the derivative to vanish in every direction. [/example] The projection example supplies the template for the general weak formulation: compare the unknown $u$ only with points $v$ that remain feasible, and measure the sign of the first variation in the direction $v-u$. This motivates the abstract variational inequality, which replaces a family of two-sided tests by a family of one-sided competitor tests. [definition: Variational Inequality] Let $H$ be a real Hilbert space, let $K\subset H$ be a closed convex constraint set, let $B:H\times H\to\mathbb R$ be a bilinear form, and let $F\in H^*$. A solution of the variational inequality associated with $(B,F,K)$ is an element $u\in K$ such that \begin{align*} B[u,v-u] \ge F(v-u) \end{align*} for every $v\in K$. [/definition] The expression $v-u$ should be read as an admissible direction from $u$ toward another competitor $v$. If $K=H$, testing with $v=u+w$ and $v=u-w$ gives $B[u,w]=F(w)$ for every $w\in H$, so the usual weak formulation is a special case. 
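The half-plane projection above is the variational inequality with $B$ equal to the Euclidean inner product and $F(w)=z\cdot w$, since $B[u,v-u]\ge F(v-u)$ rearranges to $(z-u)\cdot(v-u)\le0$. The following minimal sketch checks that inequality on randomly sampled competitors; the function names `project_upper_half_plane` and `projection_pairing` and the sampling box are hypothetical choices made only for illustration.

```python
import random

def project_upper_half_plane(z):
    """Nearest point of K = {(x1, x2): x2 >= 0} to z in the Euclidean norm."""
    a, b = z
    return (a, max(b, 0.0))

def projection_pairing(z, v):
    """Return (z - u) . (v - u), where u is the projection of z onto K."""
    u = project_upper_half_plane(z)
    return (z[0] - u[0]) * (v[0] - u[0]) + (z[1] - u[1]) * (v[1] - u[1])

random.seed(0)
z = (1.5, -2.0)   # a point below the half-plane, so b < 0 as in the example

# Sample competitors v in K (second coordinate nonnegative) and record the
# largest value of the pairing; the projection inequality says it is <= 0.
worst = max(
    projection_pairing(z, (random.uniform(-5.0, 5.0), random.uniform(0.0, 5.0)))
    for _ in range(1000)
)

print(project_upper_half_plane(z))
print(worst <= 0.0)
```

Sampling cannot prove the inequality, but it makes the one-sided nature of the test visible: competitors are drawn only from $K$, and the pairing reduces to $by$ exactly as computed in the example.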
The central question is then whether coercivity, which solved unconstrained weak equations through Lax--Milgram, still gives a well-posed constrained problem. [quotetheorem:106] [citeproof:106] This theorem is the constrained counterpart of Lax--Milgram. The hypotheses are not cosmetic: if $K=\varnothing$ there is no admissible candidate, if $K$ is not closed a minimizing or fixed-point sequence may converge to a point outside $K$, and if coercivity fails uniqueness can be lost even for linear equations. Convexity is what makes the competitor ray $u+t(v-u)$ admissible and allows the projection argument to encode a first-order condition. A specific obstruction is already visible in $H=\mathbb R^2$ with $K=\{(-1,0),(1,0)\}$ and objective $J:K\to\mathbb R$ given by $J(x)=|x|^2$: both points are nearest feasible points to the origin, and the line segment between them leaves $K$, so no first-order inequality along feasible rays can select a unique projection. The theorem gives a weak solution and uniqueness in the Hilbert-space norm, but it does not by itself give pointwise regularity, a PDE on subregions, or any description of where a constraint is active. Those additional conclusions require the special structure of the obstacle problem. [remark: Symmetry Is Not Required] The bilinear form in Stampacchia's theorem need not be symmetric. Symmetry becomes important when the variational inequality is derived from minimizing an energy, but the inequality formulation itself only needs boundedness and coercivity. [/remark] For the standard Dirichlet problem on a bounded domain, the Hilbert space is $H^1_0(\Omega)$ and the bilinear form is the Dirichlet energy. The next section imposes a pointwise lower constraint and asks what equation remains true off the constrained region. ## The Obstacle Problem Suppose an elastic membrane is forced to lie above a fixed profile $\psi$. 
Where the membrane floats above the obstacle it should solve the usual Poisson equation, but where it touches the obstacle an additional reaction force appears. The obstacle problem encodes both behaviours without knowing the contact set in advance. Let $\Omega\subset\mathbb R^n$ be a bounded open set for which the Poincare inequality holds on $H^1_0(\Omega)$, for instance a bounded Lipschitz domain. We work with $H^1_0(\Omega)$ and the bilinear form \begin{align*} B[u,v]=\int_\Omega \nabla u\cdot \nabla v\,d\mathcal L^n, \end{align*} which is coercive on $H^1_0(\Omega)$ by Poincare's inequality. [definition: Obstacle Constraint Set] Let $\psi\in H^1(\Omega)$ be an obstacle. The zero-boundary obstacle constraint set is \begin{align*} K_\psi=\{v\in H^1_0(\Omega): v\ge \psi \text{ a.e. in }\Omega\}. \end{align*} [/definition] The set may be empty unless the boundary condition and obstacle are compatible. In applications one either assumes $K_\psi\ne\varnothing$ or works with a boundary datum $g$ and the affine space $g+H^1_0(\Omega)$. [example: Elastic Membrane Over an Obstacle] Let $\Omega\subset\mathbb R^2$ model a horizontal frame, let $f\in H^{-1}(\Omega)$ be a vertical load, and let $\psi$ describe a rigid support below the membrane. The admissible deformations are $v\in K_\psi$, and an equilibrium deformation $u\in K_\psi$ minimizes \begin{align*} I:K_\psi\to\mathbb R,\qquad I[v]=\frac12\int_\Omega |\nabla v|^2\,d\mathcal L^2 - f(v). \end{align*} Fix any competitor $v\in K_\psi$. Since $K_\psi$ is convex, the path \begin{align*} u_t=u+t(v-u)=(1-t)u+tv \end{align*} belongs to $K_\psi$ for every $t\in[0,1]$. Minimality of $u$ gives \begin{align*} 0\le I[u_t]-I[u]. \end{align*} Using $\nabla u_t=\nabla u+t\nabla(v-u)$ and the linearity of $f$, we expand the difference as \begin{align*} I[u_t]-I[u]=\frac12\int_\Omega |\nabla u+t\nabla(v-u)|^2\,d\mathcal L^2-f(u+t(v-u))-\frac12\int_\Omega |\nabla u|^2\,d\mathcal L^2+f(u). 
\end{align*} The pointwise identity $|A+tB|^2=|A|^2+2tA\cdot B+t^2|B|^2$ gives \begin{align*} |\nabla u+t\nabla(v-u)|^2=|\nabla u|^2+2t\,\nabla u\cdot\nabla(v-u)+t^2|\nabla(v-u)|^2. \end{align*} Also, \begin{align*} f(u+t(v-u))=f(u)+t f(v-u). \end{align*} Substituting these two identities into the energy difference and cancelling the terms $f(u)$ and $\frac12\int_\Omega|\nabla u|^2\,d\mathcal L^2$ yields \begin{align*} I[u_t]-I[u]=t\int_\Omega \nabla u\cdot\nabla(v-u)\,d\mathcal L^2+\frac{t^2}{2}\int_\Omega |\nabla(v-u)|^2\,d\mathcal L^2-t f(v-u). \end{align*} For $t>0$, divide the inequality $0\le I[u_t]-I[u]$ by $t$ to obtain \begin{align*} 0\le \int_\Omega \nabla u\cdot\nabla(v-u)\,d\mathcal L^2+\frac{t}{2}\int_\Omega |\nabla(v-u)|^2\,d\mathcal L^2-f(v-u). \end{align*} Letting $t\downarrow0$ gives \begin{align*} \int_\Omega \nabla u\cdot\nabla(v-u)\,d\mathcal L^2 \ge f(v-u) \end{align*} for every $v\in K_\psi$. Thus every feasible displacement from equilibrium has nonnegative first energy variation, which is the weak force-balance law with the unknown upward reaction supplied by the obstacle. [/example] When the bilinear form is symmetric, the variational inequality is exactly the first-order condition for constrained minimization. This connects the Hilbert-space theorem to the energy methods already used for weak solutions and sets up the existence theorem for the membrane above an obstacle. [quotetheorem:6454] [citeproof:6454] The assumptions in this theorem play separate roles. Nonemptiness is a compatibility condition between the obstacle and boundary values; without it the admissible class is void. Closed convexity of $K_\psi$ lets Stampacchia's theorem apply, coercivity of the Dirichlet form supplies uniqueness, and the condition $f\in H^{-1}(\Omega)$ is exactly what is needed for the load to act continuously on $H^1_0(\Omega)$. 
The Poincare hypothesis is the analytic input that turns the gradient energy into a coercive norm; if it is unavailable, constants or slowly varying modes can destroy coercivity of the Dirichlet form. The conclusion is still only a variational statement in $H^1_0(\Omega)$. It does not assert that $u$ is continuous, that the equality $u=\psi$ can be read pointwise, or that the interface between contact and non-contact regions is a regular surface. To ask where the membrane actually touches the obstacle, the solution and obstacle must be represented in a way that supports pointwise or quasi-pointwise comparison. This is the role of the contact set and free boundary terminology. [definition: Contact Set and Free Boundary] Let $u\in K_\psi$ solve the obstacle problem. The contact set is \begin{align*} \Lambda = \{x\in\Omega : u(x)=\psi(x)\} \end{align*} whenever the representatives are regular enough for pointwise evaluation. The non-contact set is \begin{align*} N = \{x\in\Omega : u(x)>\psi(x)\}, \end{align*} and the free boundary is \begin{align*} \Gamma = \partial N\cap \Omega. \end{align*} [/definition] The term free boundary reflects that $\Gamma$ is part of the solution, not part of the initial data. The next task is to translate this geometric split back into PDE language by identifying the sign and support of the reaction force. [quotetheorem:6455] [citeproof:6455] The formal notation $\min(-\Delta u-f,u-\psi)=0$ is reliable only after specifying the weak sense in which the two entries are compared; here that sense is the measure-valued complementarity condition above. In the natural Sobolev setting $-\Delta u-f$ is not usually a pointwise function, so the complementarity condition is a statement about a nonnegative measure acting on the gap $u-\psi$. 
The Radon measure hypothesis is essential: if $-\Delta u-f$ is only a general element of $H^{-1}(\Omega)$, it cannot be restricted to a contact set or integrated against $u-\psi$, and the phrase "supported where $u=\psi$" has no measure-theoretic content. A concrete failure is obtained on an interval by taking a function $g\in L^2$ that is not of bounded variation and considering the distribution $T=Dg\in H^{-1}$, defined by $T(\phi)=-\int g\phi'\,d\mathcal L^1$. This $T$ is not a Radon measure, so it cannot be localized by assigning mass to subsets of the contact set or interpreted as a nonnegative reaction force. The quasi-continuity and integrability assumptions are equally structural. Sobolev functions are equivalence classes modulo sets of $\mathcal L^n$-measure zero, while a Radon measure may charge sets that are invisible to Lebesgue measure; changing a representative on such a set can change $\int_\Omega (u-\psi)\,d\lambda$ unless the quasi-continuous representative is fixed. For example, in dimensions where a point can be charged by the reaction measure, two Lebesgue-a.e. identical representatives may assign different values at that point; then the statement $u\ge\psi$ a.e. does not determine whether the point lies in contact as seen by $\lambda$. More generally, if $\lambda$ concentrates on a Lebesgue-null contact set of positive capacity, the ordinary a.e. inequality $u\ge\psi$ does not determine the value seen by $\lambda$. This is why regularity matters: continuity of $u$ and $\psi$ gives a classical-looking contact set, while the basic variational theorem only gives Sobolev equivalence classes. The complementarity form is nevertheless useful because it identifies the reaction force and prepares the free-boundary questions: where is the measure supported, and how regular is that support? [example: One-Dimensional Obstacle With a Contact Interval] Let $\Omega=(-1,1)$, impose zero boundary values, take $f=0$, and choose constants $a>b>0$. Set \begin{align*} \psi(x)=a(1-x^2)-b. 
\end{align*} Since $\psi(0)=a-b>0$ and $\psi(\pm1)=-b<0$, the obstacle rises above the zero boundary level near the center but lies below it at the endpoints. We look for an even solution whose contact set is $[-r,r]$, with $0<r<1$, so that $u=\psi$ on $[-r,r]$ and $u$ is affine on each non-contact interval. On the right-hand interval $(r,1)$, the affine part joins $(r,\psi(r))$ to $(1,0)$. Its slope is \begin{align*} \frac{0-\psi(r)}{1-r}=-\frac{a(1-r^2)-b}{1-r}. \end{align*} Using $1-r^2=(1-r)(1+r)$, this is \begin{align*} -\frac{a(1-r^2)-b}{1-r}=-\frac{a(1-r)(1+r)-b}{1-r}. \end{align*} Splitting the quotient gives \begin{align*} -\frac{a(1-r)(1+r)-b}{1-r}=-a(1+r)+\frac{b}{1-r}. \end{align*} Smooth fit at the free-boundary point $x=r$ requires this slope to equal the obstacle slope. Since \begin{align*} \psi'(x)=-2ax, \end{align*} we have \begin{align*} \psi'(r)=-2ar. \end{align*} Therefore \begin{align*} -2ar=-a(1+r)+\frac{b}{1-r}. \end{align*} Multiplying by $1-r>0$ gives \begin{align*} -2ar(1-r)=-a(1+r)(1-r)+b. \end{align*} Using $(1+r)(1-r)=1-r^2$, this becomes \begin{align*} -2ar+2ar^2=-a+ar^2+b. \end{align*} Subtracting the right-hand side from both sides and collecting terms gives \begin{align*} 0=a-2ar+ar^2-b. \end{align*} Factoring the quadratic term gives \begin{align*} 0=a(1-2r+r^2)-b. \end{align*} Thus \begin{align*} 0=a(1-r)^2-b. \end{align*} Hence \begin{align*} (1-r)^2=\frac{b}{a}. \end{align*} Because $a>b>0$, we have $0<b/a<1$, so the root lying in $(0,1)$ is \begin{align*} r=1-\sqrt{\frac ba}. \end{align*} With this value of $r$, the right affine piece can be written explicitly. Since $b=a(1-r)^2$, \begin{align*} \psi(r)=a(1-r^2)-b. \end{align*} Substituting $1-r^2=(1-r)(1+r)$ and $b=a(1-r)^2$ gives \begin{align*} \psi(r)=a(1-r)(1+r)-a(1-r)^2. \end{align*} Factoring $a(1-r)$ gives \begin{align*} \psi(r)=a(1-r)((1+r)-(1-r)). \end{align*} Since $(1+r)-(1-r)=2r$, this is \begin{align*} \psi(r)=2ar(1-r). 
\end{align*} The affine function through $(r,\psi(r))$ and $(1,0)$ therefore has slope \begin{align*} \frac{0-2ar(1-r)}{1-r}=-2ar \end{align*} and value $0$ at $x=1$, so on the right interval it is \begin{align*} u(x)=2ar(1-x)\quad\text{for }r\le x\le 1. \end{align*} By even symmetry, the left affine piece is \begin{align*} u(x)=2ar(1+x)\quad\text{for }-1\le x\le -r. \end{align*} On the contact interval, \begin{align*} u(x)=a(1-x^2)-b\quad\text{for }-r\le x\le r. \end{align*} The one-sided derivatives match at $x=r$ because the right affine slope is $-2ar=\psi'(r)$, and they match at $x=-r$ because the left affine slope is $2ar=\psi'(-r)$. Moreover, on $[r,1]$ the identity $b=a(1-r)^2$ gives $u(x)-\psi(x)=a(x-r)^2\ge0$, and by symmetry $u(x)-\psi(x)=a(x+r)^2\ge0$ on $[-1,-r]$, while $-u''=2a>0$ on the contact interval; hence $u\in K_\psi$ and the reaction $-u''$ is nonnegative and vanishes off the contact set. Thus the contact set is $[-r,r]$, with \begin{align*} r=1-\sqrt{\frac ba}, \end{align*} and the free-boundary points are $x=-r$ and $x=r$. [/example] [illustration:obstacle-free-boundary] This example is the simplest model of a free boundary computation. In higher dimensions the same logic remains, but the regularity of $u$, the geometry of $\Gamma$, and the structure of singular contact points become major questions. ## Penalization and Approximation How can a constrained problem be solved using unconstrained elliptic equations? Direct constrained approaches require maintaining the inequality $u\ge\psi$ at every step, so a naive finite element or gradient method can leave the admissible set after a standard update and must then project back onto the convex set $K_\psi$. At the PDE level, the contact set is unknown, so solving $-\Delta u=f$ off contact and enforcing a reaction force on contact is not a closed equation until the free boundary is known. Penalization replaces the hard condition by a large energy cost for violating it. The resulting equations are easier to approximate numerically and form a useful route to existence. [definition: Penalized Obstacle Energy] Let $\Omega\subset\mathbb R^n$ be bounded, let $f\in L^2(\Omega)$, let $\psi\in H^1(\Omega)$, and let $\varepsilon>0$. 
Let $\beta_\varepsilon:\mathbb R\to\mathbb R$ be a nondecreasing function such that $\beta_\varepsilon(s)=0$ for $s\ge0$ and $\beta_\varepsilon(s)<0$ for $s<0$. Let $B_\varepsilon:\mathbb R\to\mathbb R$ be a function satisfying $B_\varepsilon'=\beta_\varepsilon$. The penalized obstacle energy is the functional \begin{align*} I_\varepsilon:H^1_0(\Omega)\to\mathbb R\cup\{+\infty\},\qquad v\mapsto \frac12\int_\Omega |\nabla v|^2\,d\mathcal L^n-\int_\Omega fv\,d\mathcal L^n + \int_\Omega B_\varepsilon(v-\psi)\,d\mathcal L^n. \end{align*} [/definition] The penalty term is chosen so that violations of the constraint create a restoring force in the Euler--Lagrange equation. This motivates differentiating the penalized energy: the result is an unconstrained semilinear elliptic equation whose nonlinear term records how much the approximate solution tries to pass below the obstacle. [quotetheorem:6456] [citeproof:6456] The equation has no explicit inequality constraint, but the nonlinear term becomes large when $u_\varepsilon$ drops below $\psi$. Differentiability is essential here: if $B_\varepsilon(s)=s^-$, the energy has a corner at $s=0$ and the Euler--Lagrange condition becomes a subdifferential inclusion rather than the displayed equation with a single function $\beta_\varepsilon$. The sign condition is also necessary: if $\beta_\varepsilon$ were positive on $s<0$, then the term would push the solution further below the obstacle in the weak equation. Monotonicity corresponds to convexity of the penalty; without it the penalized functional can have stationary points that do not minimize the energy. A concrete failure mode is a double-well penalty below the obstacle: the first variation may vanish at a lower well, so a solution of the penalized equation can remain separated below $\psi$ even though the intended constrained problem forbids that state. This theorem only identifies the equation satisfied by a minimizer after such a minimizer exists. 
It does not prove existence of $u_\varepsilon$, does not give uniform estimates as $\varepsilon\downarrow0$, and does not say that $u_\varepsilon$ approaches the obstacle solution. If coercivity of the full energy is absent, a minimizing sequence can escape to infinity in a nearly flat direction while the Euler--Lagrange equation still has formal critical points; for instance, on a domain without a Poincare inequality the Dirichlet term does not control additive constants. Those later conclusions require coercivity of the full energy, lower semicontinuity, compactness of the minimizing sequence, and a penalty strong enough to eliminate negative violations in a topology that survives passage to the limit. The next theorem records a convergence principle under explicit hypotheses of that kind. [quotetheorem:6457] [citeproof:6457] The convergence theorem explains why penalization is not only a computational trick: the unconstrained solutions recover the same constrained object selected by the variational inequality. The [quadratic lower bound](/theorems/2052) is stronger than pointwise blow-up on fixed negative values; it prevents violations from concentrating on smaller and smaller sets while remaining invisible to weak convergence. If the penalties stayed uniformly bounded below the obstacle, or only controlled fixed-size violations without an integrable estimate, a minimizing sequence could trade a lower Dirichlet or load energy against a thin negative dip and the weak limit need not be shown admissible. The compact embedding $H^1_0(\Omega)\hookrightarrow L^2(\Omega)$ is the mechanism that converts the uniform energy bound into strong $L^2$ convergence, which is needed both to pass to the load term $\int_\Omega fu_\varepsilon\,d\mathcal L^n$ and to conclude that the negative part disappears in the limit. Without this compactness, weak $H^1_0$ convergence alone would not control the obstacle inequality in the $L^2$ topology used by the penalty. 
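The effect of shrinking $\varepsilon$ can be observed numerically on the one-dimensional obstacle $\psi(x)=a(1-x^2)-b$ on $(-1,1)$ with $f=0$. The following is a rough sketch under stated assumptions, not the method of the course: it assumes the particular penalty $\beta_\varepsilon(s)=-\varepsilon^{-1}s^-$ (one admissible choice under the definition above), a standard three-point finite-difference grid, and an active-set Newton iteration. The names `solve_tridiagonal`, `penalized_obstacle_1d`, and `max_violation`, and the parameter values, are hypothetical.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system. lower[i] multiplies x[i-1]
    and upper[i] multiplies x[i+1] in row i. No pivoting, which is adequate
    for the diagonally dominant matrices used here."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def penalized_obstacle_1d(eps, a=2.0, b=0.5, n=99, max_iter=200):
    """Finite-difference sketch of -u'' = (1/eps) * max(psi - u, 0) on (-1, 1)
    with u(-1) = u(1) = 0 and psi(x) = a(1 - x^2) - b."""
    h = 2.0 / (n + 1)
    xs = [-1.0 + (i + 1) * h for i in range(n)]
    psi = [a * (1.0 - x * x) - b for x in xs]
    u = [0.0] * n
    off = -1.0 / h ** 2
    active = None
    for _ in range(max_iter):
        # Nodes where the current iterate lies below the obstacle.
        new_active = [u[i] < psi[i] for i in range(n)]
        if new_active == active:
            break  # active set unchanged: u solves the nonlinear system
        active = new_active
        # On the active set the penalty is linear in u, so each step is a
        # linear tridiagonal solve.
        diag = [2.0 / h ** 2 + (1.0 / eps if act else 0.0) for act in active]
        rhs = [psi[i] / eps if active[i] else 0.0 for i in range(n)]
        u = solve_tridiagonal([off] * n, diag, [off] * n, rhs)
    return xs, psi, u

def max_violation(psi, u):
    """Largest amount by which u lies below psi on the grid."""
    return max(max(p - v, 0.0) for p, v in zip(psi, u))

for eps in (1e-2, 1e-3, 1e-4):
    _, psi_grid, u_eps = penalized_obstacle_1d(eps)
    print(eps, max_violation(psi_grid, u_eps))
```

In this model the penalized solution always dips slightly below the obstacle, and a discrete maximum-principle argument bounds the dip by $2a\varepsilon$, so the printed violations shrink with $\varepsilon$. This illustrates the mechanism only; it is not a substitute for the hypotheses of the convergence theorem.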
Convexity and monotonicity keep the approximation compatible with the variational inequality; without them, different subsequential limits or non-minimizing critical points may appear. A concrete choice of penalty shows the sign convention and the approximating reaction force. [example: Penalized Approximation Converging to the Obstacle Solution] Take $s^-=\max\{-s,0\}$ and define \begin{align*} \beta_\varepsilon(s)=-\varepsilon^{-1}s^-. \end{align*} If $s<0$, then $s^-=-s$, so \begin{align*} \beta_\varepsilon(s)=-\varepsilon^{-1}(-s)=\frac{s}{\varepsilon}<0. \end{align*} If $s\ge0$, then $s^-=0$, so \begin{align*} \beta_\varepsilon(s)=0. \end{align*} Thus this penalty force is active exactly when the trial function lies below the obstacle. Substituting $s=u_\varepsilon-\psi$ into the weak penalized equation from *Penalized Equation* gives \begin{align*} \beta_\varepsilon(u_\varepsilon-\psi)=-\varepsilon^{-1}(u_\varepsilon-\psi)^-. \end{align*} Hence \begin{align*} \int_\Omega \nabla u_\varepsilon\cdot\nabla\phi\,d\mathcal L^n+\int_\Omega \beta_\varepsilon(u_\varepsilon-\psi)\phi\,d\mathcal L^n=\int_\Omega f\phi\,d\mathcal L^n \end{align*} becomes \begin{align*} \int_\Omega \nabla u_\varepsilon\cdot\nabla\phi\,d\mathcal L^n-\frac1\varepsilon\int_\Omega (u_\varepsilon-\psi)^-\phi\,d\mathcal L^n=\int_\Omega f\phi\,d\mathcal L^n \end{align*} for every $\phi\in H^1_0(\Omega)$. The corresponding penalty potential is \begin{align*} B_\varepsilon(s)=\frac{1}{2\varepsilon}(s^-)^2. \end{align*} For $s<0$, this reads \begin{align*} B_\varepsilon(s)=\frac{1}{2\varepsilon}s^2, \end{align*} so \begin{align*} B_\varepsilon'(s)=\frac{s}{\varepsilon}=-\frac{1}{\varepsilon}(-s)=-\varepsilon^{-1}s^-=\beta_\varepsilon(s). \end{align*} For $s>0$, we have \begin{align*} B_\varepsilon(s)=0, \end{align*} and therefore \begin{align*} B_\varepsilon'(s)=0=\beta_\varepsilon(s). 
\end{align*} At $s=0$, the left derivative is $0$ and the right derivative is $0$, so $B_\varepsilon'=\beta_\varepsilon$ everywhere. If $s<0$ and $0<\varepsilon_2<\varepsilon_1$, then $s^->0$ and \begin{align*} \frac{1}{2\varepsilon_2}(s^-)^2>\frac{1}{2\varepsilon_1}(s^-)^2. \end{align*} Thus the same negative violation of the obstacle costs more energy as $\varepsilon$ decreases. Under the hypotheses of *Convergence of Penalized Minimizers*, the corresponding minimizers satisfy \begin{align*} u_\varepsilon\to u\quad\text{strongly in }H^1_0(\Omega), \end{align*} where $u$ is the obstacle solution. The nonnegative quantity \begin{align*} -\beta_\varepsilon(u_\varepsilon-\psi)=\varepsilon^{-1}(u_\varepsilon-\psi)^- \end{align*} is the approximate reaction force, and in the limit it recovers the force enforcing the constraint $u\ge\psi$. [/example]

Penalization also clarifies the reaction force. The functions $-\beta_\varepsilon(u_\varepsilon-\psi)$ approximate the nonnegative distribution $-\Delta u-f$, which is concentrated where the limiting solution touches the obstacle. Thus the constrained problem can be viewed as a variational inequality, a complementarity system, or a limit of unconstrained elliptic equations.

The obstacle problem shows that weak formulations can handle constraints as well as equations, but the same analytic machinery also extends to genuine nonlinearities. Chapter 12 develops nonlinear elliptic variational equations, where compactness, monotonicity, and Sobolev methods combine to produce existence results beyond the linear theory.

# 12. Nonlinear elliptic variational equations

This chapter develops the variational theory for nonlinear elliptic equations, building on the weak formulation, Sobolev spaces, and compactness from Chapter 4, and the linear Dirichlet theory from Chapters 5 and 6.
The central question is how to prove existence of weak solutions when the operator is no longer linear, so that Lax--Milgram and superposition are unavailable. Two complementary mechanisms appear: monotonicity and convex minimization for equations such as the $p$-Laplacian, and min-max critical point methods for semilinear equations with nonconvex energies. These ideas also connect PDE with convex analysis, nonlinear functional analysis, phase-transition models, and finite-dimensional critical point theory, while retaining the same weak-solution viewpoint used throughout Chapters 4 through 11.

## The $p$-Laplace Equation as a Monotone Problem

The linear Dirichlet problem for $-\Delta u=f$ was solved by coercivity of the Dirichlet form on $H^1_0(\Omega)$. For nonlinear diffusion, the replacement for a coercive bilinear form is a monotone map between a Banach space and its dual. This replacement is necessary because the map $u\mapsto -\Delta_p u$ is not additive when $p\ne 2$: in general $-\Delta_p(u+w)$ is not $-\Delta_p u-\Delta_p w$, since $|\nabla(u+w)|^{p-2}\nabla(u+w)$ does not split into separate terms.

Let $\Omega \subset \mathbb R^n$ be a bounded open set and let $1<p<\infty$. The natural energy space for the $p$-Laplace equation is $W^{1,p}_0(\Omega)$, because the gradient enters with $p$-growth rather than quadratic growth.

[definition: p-Laplace Operator] Let $1<p<\infty$. The negative $p$-Laplace operator is the map \begin{align*} -\Delta_p:W^{1,p}_0(\Omega)\to W^{-1,p'}(\Omega) \end{align*} defined by \begin{align*} (-\Delta_p u)(v):=\int_\Omega |\nabla u|^{p-2}\nabla u\cdot \nabla v\,d\mathcal L^n \end{align*} for $u,v\in W^{1,p}_0(\Omega)$. [/definition]

In distributional notation this means $\Delta_p u=\operatorname{div}(|\nabla u|^{p-2}\nabla u)$, with the sign convention chosen so that $-\Delta_p$ is monotone.
The Dirichlet $p$-Laplace equation with datum $f\in W^{-1,p'}(\Omega)$ is therefore the dual-space equation $-\Delta_p u=f$, which already includes the zero boundary condition through the choice of $W^{1,p}_0(\Omega)$. Here $p'$ is the Hölder conjugate exponent, $1/p+1/p'=1$. The weak formulation is obtained by multiplying by a test function and integrating by parts, just as for the Laplacian, but the coefficient $|\nabla u|^{p-2}$ now depends on the solution. [definition: Weak Solution of the p-Laplace Dirichlet Problem] Let $f \in W^{-1,p'}(\Omega)$. A function $u \in W^{1,p}_0(\Omega)$ is a weak solution of $-\Delta_p u=f$ if \begin{align*} \int_\Omega |\nabla u|^{p-2}\nabla u \cdot \nabla v\,d\mathcal L^n = f(v) \end{align*} for every $v \in W^{1,p}_0(\Omega)$. [/definition] The weak formulation turns the PDE into an equation $A(u)=f$ in the dual of $W^{1,p}_0(\Omega)$. Linearity is no longer available, so positivity of a bilinear form cannot be used to control the difference of two candidate solutions; the replacement must measure whether the operator points in a nonnegative direction along every chord between two inputs. [definition: Monotone Operator] Let $X$ be a real Banach space. An operator $A:X\to X^*$ is monotone if \begin{align*} (A(u)-A(v))(u-v) \ge 0 \end{align*} for all $u,v\in X$. It is strictly monotone if the inequality is strict whenever $u\ne v$. [/definition] Monotonicity is the nonlinear analogue of positivity of a bilinear form. For the $p$-Laplacian, the difficulty is that the operator is not generated by a fixed matrix: the coefficient $|\nabla u|^{p-2}$ changes with the unknown gradient. To prove monotonicity one must first control, point by point in gradient space, how the nonlinear vector field $\xi\mapsto |\xi|^{p-2}\xi$ changes along chords. 
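The chord-wise behaviour of $\xi\mapsto|\xi|^{p-2}\xi$ can be probed numerically before it is proved. The following is a minimal sketch, not part of the course's argument: it samples random vector pairs and evaluates the pairing $(|a|^{p-2}a-|b|^{p-2}b)\cdot(a-b)$, which should never be negative. The helper names `p_flux` and `chord_pairing`, the sample size, and the chosen exponents are illustrative assumptions.

```python
import numpy as np

def p_flux(xi, p):
    """Row-wise vector field xi -> |xi|^{p-2} xi, set to 0 at xi = 0."""
    norm = np.linalg.norm(xi, axis=-1, keepdims=True)
    # For 1 < p < 2 the factor |xi|^{p-2} is singular at 0, but the product
    # |xi|^{p-2} xi tends to 0, so the value 0 is used there.
    safe = np.where(norm > 0, norm, 1.0)
    return np.where(norm > 0, safe ** (p - 2) * xi, 0.0)

def chord_pairing(a, b, p):
    """Row-wise value of (|a|^{p-2} a - |b|^{p-2} b) . (a - b)."""
    return np.sum((p_flux(a, p) - p_flux(b, p)) * (a - b), axis=-1)

# Hypothetical test data: random pairs of vectors in R^3.
rng = np.random.default_rng(0)
a = rng.normal(size=(10_000, 3))
b = rng.normal(size=(10_000, 3))

# Smallest sampled pairing for a few exponents in the range 1 < p < infinity.
min_pairing = {p: float(chord_pairing(a, b, p).min()) for p in (1.5, 2.0, 4.0)}
```

A sampled minimum that stays nonnegative is consistent with monotonicity but is of course not a proof; for $p=2$ the pairing reduces to $|a-b|^2$, which gives a quick sanity check on the code itself.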
[quotetheorem:6458] [citeproof:6458] Integrating this pointwise inequality gives monotonicity of the operator on $W^{1,p}_0(\Omega)$, but the pointwise theorem itself is only an algebraic inequality for vectors; it does not by itself prove existence, boundary regularity, or compactness. The restriction $1<p<\infty$ is doing real work: it gives strict convexity of $|\xi|^p$ and places the energy space in the reflexive range. At $p=1$, the total variation energy can have nonunique minimizers; for instance, least-gradient problems may admit several functions with the same trace and the same total variation. At $p=\infty$, the limiting equation is governed by the infinity Laplacian rather than by a standard reflexive Sobolev Euler--Lagrange operator. The remaining existence mechanism is that monotonicity, coercivity, and weak compactness replace the Lax--Milgram argument. [quotetheorem:6459] [citeproof:6459] The theorem gives the nonlinear counterpart of the linear Dirichlet existence theorem, but each hypothesis marks a boundary of the method. Boundedness of $\Omega$ is used through Poincare's inequality to make the gradient norm coercive on $W^{1,p}_0(\Omega)$; on unbounded domains, minimizing sequences may drift to infinity or fail to be controlled in $L^p$. The range $1<p<\infty$ gives reflexivity, while the assumption $f\in W^{-1,p'}(\Omega)$ is exactly what makes the forcing term continuous on $W^{1,p}_0(\Omega)$; rougher data require a different notion of solution. The equation also no longer has a superposition principle or a Green function representation, so explicit intuition must come from special families rather than linear kernels. To see what the operator does in concrete terms, radial solutions provide a useful model family. [example: Radial p-Harmonic Functions] Let $\Omega=A_{a,b}:=\{x\in\mathbb R^n:a<|x|<b\}$ and let $u(x)=U(r)$ with $r=|x|$. 
Since $a>0$, the radial variable is smooth on the annulus, and \begin{align*} \nabla u(x)=U'(r)\frac{x}{r}, \qquad |\nabla u(x)|=|U'(r)|. \end{align*} Thus \begin{align*} |\nabla u|^{p-2}\nabla u=|U'(r)|^{p-2}U'(r)\frac{x}{r}. \end{align*} Set $A(r):=|U'(r)|^{p-2}U'(r)$. For each coordinate, \begin{align*} \frac{\partial}{\partial x_i}\left(A(r)\frac{x_i}{r}\right)=A'(r)\frac{x_i^2}{r^2}+A(r)\left(\frac{1}{r}-\frac{x_i^2}{r^3}\right). \end{align*} Summing over $i=1,\ldots,n$ and using $\sum_i x_i^2=r^2$ gives \begin{align*} \operatorname{div}\left(A(r)\frac{x}{r}\right)=A'(r)+A(r)\left(\frac{n}{r}-\frac{r^2}{r^3}\right). \end{align*} Since $r^2/r^3=1/r$, this becomes \begin{align*} \Delta_p u=A'(r)+\frac{n-1}{r}A(r). \end{align*} Multiplying by $r^{n-1}$ gives \begin{align*} r^{n-1}\Delta_p u=r^{n-1}A'(r)+(n-1)r^{n-2}A(r). \end{align*} The right-hand side is exactly \begin{align*} \frac{d}{dr}\left(r^{n-1}A(r)\right). \end{align*} Therefore $\Delta_p u=0$ is equivalent to \begin{align*} \frac{d}{dr}\left(r^{n-1}|U'(r)|^{p-2}U'(r)\right)=0. \end{align*} Hence there is a constant $K$ such that \begin{align*} r^{n-1}|U'(r)|^{p-2}U'(r)=K. \end{align*} Equivalently, \begin{align*} |U'(r)|^{p-2}U'(r)=K r^{1-n}. \end{align*} The inverse of $t\mapsto |t|^{p-2}t$ is $s\mapsto \operatorname{sgn}(s)|s|^{1/(p-1)}$, so for a constant $C$ determined by $K$, \begin{align*} U'(r)=C r^{-(n-1)/(p-1)}. \end{align*} If $p\ne n$, then \begin{align*} 1-\frac{n-1}{p-1}=\frac{p-n}{p-1}. \end{align*} Thus \begin{align*} \int r^{-(n-1)/(p-1)}\,dr=\frac{p-1}{p-n}r^{(p-n)/(p-1)}. \end{align*} So, after renaming constants, the nonconstant radial $p$-harmonic profiles are \begin{align*} U(r)=B+D r^{(p-n)/(p-1)} \qquad (p\ne n). \end{align*} If $p=n$, then the exponent is \begin{align*} -\frac{n-1}{p-1}=-1, \end{align*} and therefore \begin{align*} U(r)=B+C\log r. 
\end{align*} Thus the radial scale is a power $r^{(p-n)/(p-1)}$ away from the borderline case, while at $p=n$ it becomes logarithmic; for $p=n=2$, this recovers the usual harmonic logarithm in two dimensions. [/example]

The radial computation also explains why nonlinear elliptic equations may have very different local behaviour from linear equations. The variational existence proof is robust, but regularity and explicit representation formulas require separate arguments.

## Semilinear Equations from Energy Functionals

The next problem is to keep the principal part linear while allowing nonlinear reaction terms. These equations arise when an energy contains the Dirichlet term plus a potential depending on $u$.

Let $\Omega\subset\mathbb R^n$ be bounded and open. A typical semilinear Dirichlet problem has the form \begin{align*} -\Delta u = g(x,u) \quad \text{in }\Omega, \qquad u=0 \quad \text{on }\partial\Omega. \end{align*} The weak formulation must place $g(x,u)$ in a space that can be paired with test functions in $H^1_0(\Omega)$.

[definition: Weak Solution of a Semilinear Dirichlet Problem] Let $g:\Omega\times\mathbb R\to\mathbb R$ be measurable in $x$ and continuous in its second variable. A function $u\in H^1_0(\Omega)$ is a weak solution of \begin{align*} -\Delta u=g(x,u) \end{align*} if the map \begin{align*} v\mapsto \int_\Omega g(x,u)v\,d\mathcal L^n \end{align*} defines an element of $H^{-1}(\Omega)$ and \begin{align*} \int_\Omega \nabla u\cdot\nabla v\,d\mathcal L^n = \int_\Omega g(x,u)v\,d\mathcal L^n \end{align*} for every $v\in H^1_0(\Omega)$. [/definition]

This definition describes what a solution is, but it does not yet explain where solutions come from. The obstruction is that the nonlinear term $g(x,u)$ is not a fixed element of the dual space; it changes with the unknown function. When $g$ is the derivative of a potential $G$, the weak equation can instead be encoded as stationarity of a scalar energy, making variational methods available.
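A finite-dimensional sketch can make the link between stationarity and the weak equation concrete. Assuming a one-dimensional grid on $(0,1)$ with zero boundary values and the illustrative potential $G(s)=s^4/4$ (so $g(s)=s^3$), the code below compares a finite-difference directional derivative of the discrete energy $\tfrac12\int|u'|^2-\int G(u)$ with the discrete weak residual $\int u'v'-\int g(u)v$. The grid size, the functions `u` and `v`, and all helper names are hypothetical choices made only for this check.

```python
import numpy as np

def G(s):
    """Hypothetical potential, chosen only for illustration."""
    return 0.25 * s**4

def g(s):
    """Reaction term g = dG/ds."""
    return s**3

def energy(u, h):
    """Discrete version of 1/2 int |u'|^2 - int G(u) with zero boundary values."""
    padded = np.concatenate(([0.0], u, [0.0]))
    du = np.diff(padded) / h
    return 0.5 * h * np.sum(du**2) - h * np.sum(G(u))

def weak_residual(u, h):
    """Entries h * ((-Laplace_h u)_i - g(u_i)), the residual tested on nodal directions."""
    padded = np.concatenate(([0.0], u, [0.0]))
    neg_lap = (2.0 * padded[1:-1] - padded[:-2] - padded[2:]) / h**2
    return h * (neg_lap - g(u))

def directional_derivative(u, v, h, t=1e-6):
    """Central-difference approximation of d/dt energy(u + t v) at t = 0."""
    return (energy(u + t * v, h) - energy(u - t * v, h)) / (2.0 * t)

n = 50                       # number of interior grid points (assumption)
h = 1.0 / (n + 1)
x = np.linspace(h, 1.0 - h, n)
u = np.sin(np.pi * x) + 0.3 * np.sin(3 * np.pi * x)   # sample state
v = x * (1.0 - x)                                      # sample test direction

lhs = directional_derivative(u, v, h)         # first variation of the energy
rhs = float(np.dot(weak_residual(u, h), v))   # discrete weak residual paired with v
```

The two numbers `lhs` and `rhs` should agree up to finite-difference error, so a grid function is stationary for the discrete energy exactly when its discrete weak residual vanishes in every direction. This is only a discrete analogue; the Sobolev-space statement needs the growth and differentiability hypotheses discussed in the text.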
[definition: Semilinear Energy Functional] Let $G:\Omega\times\mathbb R\to\mathbb R$ be measurable in $x$ and continuously differentiable in $s$, and set $g(x,s)=\partial_s G(x,s)$. Assume that $G(\cdot,u)\in L^1(\Omega)$ for every $u\in H^1_0(\Omega)$. The semilinear energy functional is the map \begin{align*} I:H^1_0(\Omega)\to\mathbb R \end{align*} defined by \begin{align*} I[u]=\frac{1}{2}\int_\Omega |\nabla u|^2\,d\mathcal L^n-\int_\Omega G(x,u)\,d\mathcal L^n. \end{align*} [/definition] The energy packages the elliptic operator and the nonlinear reaction into a single scalar functional. To use this functional, one still has to justify that its directional derivative is exactly the weak residual of the PDE. This is the bridge from a formal energy expression to a legitimate variational equation: critical points of $I$ should coincide with weak solutions only when the potential term differentiates correctly in the Sobolev space. [quotetheorem:6460] [citeproof:6460] The theorem gives a direct recipe: choose a potential, compute its first variation, and read off the stationary equation. The differentiability hypothesis is not cosmetic. If $G(x,s)=|s|$ then the potential term is not differentiable at functions that vanish on a set of positive measure, so the Euler--Lagrange equation must be replaced by a variational inequality or subdifferential inclusion. If $G(x,s)=|s|^{2^*}$ in dimension $n\ge 3$, the potential lies at the critical Sobolev growth threshold and compactness may fail even when the first variation is formally meaningful. The statement also explains why weak solutions are paired with all $v\in H^1_0(\Omega)$ rather than only smooth test functions, since the derivative of $I$ naturally lives in the dual of the full energy space. The Allen--Cahn energy is a standard example where the potential has two preferred phases. 
[example: Allen Cahn Stationary Equation] Assume $\Omega$ is bounded and either $n\le 4$, or work in a Sobolev subspace on which $u\mapsto u^4$ is integrable. For $\varepsilon>0$, define \begin{align*} I_\varepsilon[u]=\int_\Omega \left(\frac{\varepsilon^2}{2}|\nabla u|^2+\frac{1}{4}(u^2-1)^2\right)d\mathcal L^n \end{align*} on $H^1_0(\Omega)$ under this embedding hypothesis, or on an affine Sobolev class if nonzero boundary values are prescribed. For an admissible variation $v$, compute the energy along $u+tv$: \begin{align*} I_\varepsilon[u+tv]=\int_\Omega \left(\frac{\varepsilon^2}{2}|\nabla u+t\nabla v|^2+\frac{1}{4}\big((u+tv)^2-1\big)^2\right)d\mathcal L^n. \end{align*} For the gradient part, \begin{align*} |\nabla u+t\nabla v|^2=|\nabla u|^2+2t\nabla u\cdot\nabla v+t^2|\nabla v|^2. \end{align*} Therefore \begin{align*} \frac{d}{dt}\bigg|_{t=0}\frac{\varepsilon^2}{2}|\nabla u+t\nabla v|^2=\varepsilon^2\nabla u\cdot\nabla v. \end{align*} For the potential part, set $F(s)=\frac{1}{4}(s^2-1)^2$. By the chain rule, \begin{align*} F'(s)=\frac{1}{4}\cdot 2(s^2-1)\cdot 2s=s(s^2-1)=s^3-s. \end{align*} Applying the chain rule again to $s=u+tv$ gives \begin{align*} \frac{d}{dt}\bigg|_{t=0}\frac{1}{4}\big((u+tv)^2-1\big)^2=F'(u)v=(u^3-u)v. \end{align*} Hence the first variation is \begin{align*} \frac{d}{dt}\bigg|_{t=0}I_\varepsilon[u+tv]=\int_\Omega \left(\varepsilon^2\nabla u\cdot\nabla v+(u^3-u)v\right)d\mathcal L^n. \end{align*} Thus $u$ is a critical point exactly when \begin{align*} \int_\Omega \left(\varepsilon^2\nabla u\cdot\nabla v+(u^3-u)v\right)d\mathcal L^n=0 \end{align*} for every admissible test function $v$. Equivalently, \begin{align*} \varepsilon^2\int_\Omega \nabla u\cdot\nabla v\,d\mathcal L^n+\int_\Omega (u^3-u)v\,d\mathcal L^n=0. \end{align*} Using the weak form of the Dirichlet Laplacian, this is the stationary Allen--Cahn equation \begin{align*} -\varepsilon^2\Delta u+u^3-u=0. 
\end{align*} The potential term is minimized when $u^2-1=0$, namely at $u=\pm 1$, so the energy favours phases near those two values while the gradient term penalizes rapid transitions. [/example]

This example illustrates the central tension in semilinear variational problems. The energy can be bounded below and still have several critical points with different qualitative meanings, so minimization finds only part of the solution set.

## Critical Points Beyond Minimizers

The final question is how to detect solutions that are not global minimizers. In variational language, these solutions are saddle points of the energy, and their existence depends on a combination of geometry and compactness. A typical situation is that $I[0]=0$, the energy is positive on a small sphere around $0$, but along some direction it eventually becomes negative. Any path from $0$ to that negative region must pass over an energy barrier.

[definition: Palais Smale Condition] Let $X$ be a Banach space and let $I\in C^1(X;\mathbb R)$. The functional $I$ satisfies the Palais--Smale condition at level $c\in\mathbb R$ if every sequence $(u_k)$ in $X$ such that \begin{align*} I[u_k]\to c, \qquad \|I'[u_k]\|_{X^*}\to 0 \end{align*} has a strongly convergent subsequence in $X$. [/definition]

The Palais--Smale condition turns approximate critical points into genuine critical points. For saddle points, compactness alone is not enough: one also needs a reason that every admissible path must cross a positive energy barrier instead of sliding back to the trivial solution. The min-max construction uses this barrier to create an approximate critical sequence, and the Palais--Smale condition is what prevents that sequence from disappearing without a true critical point.

[quotetheorem:6461] [citeproof:6461]

For semilinear elliptic equations, the Palais--Smale condition often follows from Sobolev compact embeddings, provided the nonlinearity has subcritical growth.
The compactness hypothesis is essential: at critical Sobolev growth, bounded Palais--Smale sequences can concentrate instead of converging strongly, so the min-max level need not be achieved. The geometric assumptions are also sharp in spirit, because without a positive barrier around $0$ or a path descending to negative energy, the min-max value may collapse to the zero critical point. This is where the compactness theory from earlier chapters enters the nonlinear variational method, and it is also the point where variational PDE meets the same compactness-versus-concentration phenomenon that appears in geometric analysis and statistical mechanics models of phase transition. [example: Double Well Energy with Multiple Critical Points] Let $W\in C^1(\mathbb R)$ have two local wells, for instance near $s=-1$ and $s=1$, and choose an admissible Sobolev class in which functions close to both phases satisfy the boundary condition. For \begin{align*} I[u]=\frac{1}{2}\int_\Omega |\nabla u|^2\,d\mathcal L^n+\int_\Omega W(u)\,d\mathcal L^n, \end{align*} we compute the first variation in an admissible direction $v$: \begin{align*} I[u+tv]=\frac{1}{2}\int_\Omega |\nabla u+t\nabla v|^2\,d\mathcal L^n+\int_\Omega W(u+tv)\,d\mathcal L^n. \end{align*} The gradient term expands pointwise as \begin{align*} |\nabla u+t\nabla v|^2=|\nabla u|^2+2t\nabla u\cdot\nabla v+t^2|\nabla v|^2. \end{align*} Therefore \begin{align*} \frac{d}{dt}\bigg|_{t=0}\frac{1}{2}|\nabla u+t\nabla v|^2=\nabla u\cdot\nabla v. \end{align*} For the potential term, the chain rule gives \begin{align*} \frac{d}{dt}\bigg|_{t=0}W(u+tv)=W'(u)v. \end{align*} Thus \begin{align*} I'[u](v)=\int_\Omega \nabla u\cdot\nabla v\,d\mathcal L^n+\int_\Omega W'(u)v\,d\mathcal L^n. \end{align*} So $u$ is a critical point exactly when \begin{align*} \int_\Omega \nabla u\cdot\nabla v\,d\mathcal L^n+\int_\Omega W'(u)v\,d\mathcal L^n=0 \end{align*} for every admissible test function $v$. 
This is the weak form of \begin{align*} -\Delta u+W'(u)=0. \end{align*} If one low-energy state lies near the well at $-1$ and another lies near the well at $1$, minimization may select one of them. Under the barrier and compactness hypotheses of the *[Mountain Pass Theorem](/theorems/6461)*, any path joining the two low-energy regions must pass through a higher energy level, and the corresponding min-max value gives another critical point. This shows why the variational theory studies the full critical point set of $I$, not only its global minimizers. [/example]

The nonlinear theory in this chapter keeps the same weak-solution philosophy as the linear theory, but the mechanism for existence has split into two complementary methods. Monotonicity and convexity give robust existence and uniqueness for equations such as the $p$-Laplacian. Min-max methods, supported by Palais--Smale compactness, find saddle-type solutions for semilinear problems whose energy landscape contains barriers rather than a single convex basin.

## Connections and Further Reading

These notes connect elliptic theory to several nearby Androma topics. The weak formulation and Lax--Milgram arguments rely on Hilbert space methods, bounded linear functionals, and compactness principles in Sobolev spaces. The maximum principle chapters connect to harmonic functions, Green functions, and potential theory, while the spectral chapters connect the compact resolvent of the Dirichlet Laplacian to self-adjoint operator theory and variational eigenvalue principles. The final nonlinear chapters point toward monotone operator methods, calculus of variations, and min-max theory for critical points of functionals.
For further study, the natural next topics are parabolic equations, where elliptic estimates control spatial regularity for heat flow; geometric analysis, where concentration and critical Sobolev growth become central; and numerical finite element methods, where weak formulations and coercive bilinear forms are turned into computable linear systems. The same compactness-versus-coercivity distinction also reappears in Fredholm alternatives, eigenvalue bifurcation, and nonlinear phase-transition models such as Allen--Cahn type equations.

## References

- Androma, [Trace Theorem](/theorems/60).
- Androma, [Poincare Inequality](/theorems/75).
- Androma, [Weak Maximum Principle](/theorems/100).
- Androma, [Strong Maximum Principle for Elliptic Operators](/theorems/102).
- Androma, [Lax-Milgram Theorem](/theorems/4946).
- Androma, [Sobolev Space](/page/Sobolev%20Space).

Created by admin on 6/12/2026 | Last updated on 6/12/2026
