Hille-Yosida Theorem — Statement & Proof

Theorem

Proof

[proofplan] We prove the contraction case ($M = 1$, $\omega = 0$) first; the general case then follows by rescaling and renorming. The forward implication (semigroup $\Rightarrow$ resolvent estimate) is the easier direction: we exhibit the resolvent as the Laplace transform $R(\lambda, A)x = \int_0^\infty e^{-\lambda t}T(t)x\, d\mathcal{L}^1(t)$ and read off the bound from contractivity. The reverse direction is the celebrated construction: given the resolvent estimate, define the **Yosida approximations** $A_\lambda := \lambda A R(\lambda, A) = \lambda^2 R(\lambda, A) - \lambda I$, which are bounded operators that approximate $A$ on $D(A)$. The exponentials $e^{tA_\lambda}$ form contraction semigroups (verified via the bound on $R(\lambda, A)$), and we pass to the limit $\lambda \to \infty$ using a Cauchy estimate to obtain the desired semigroup $T(t)$. Finally we verify that the limit semigroup has $A$ as its generator. The general case $\|T(t)\| \le Me^{\omega t}$ reduces to the contraction case: rescaling removes $\omega$, and an equivalent renorming of $X$ removes $M$. The hypothesis involves the powers $R(\lambda, A)^n$ rather than $R(\lambda, A)$ alone so that the single constant $M$ (and not $M^n$) controls every power, which is what the renorming needs. [/proofplan]

[step:Express the resolvent as the Laplace transform of the semigroup to obtain the forward implication] Assume (1): $A$ generates a $C_0$-semigroup $\{T(t)\}_{t\ge 0}$ with $\|T(t)\|_{\mathcal{L}(X)} \le 1$ for all $t \ge 0$. Fix $\lambda > 0$ and define \begin{align*} S_\lambda: X &\to X, \\ x &\mapsto \int_0^\infty e^{-\lambda t} T(t) x \, d\mathcal{L}^1(t). \end{align*} The integrand $t \mapsto e^{-\lambda t} T(t) x$ is continuous (since the semigroup is strongly continuous) and satisfies $\|e^{-\lambda t} T(t) x\|_X \le e^{-\lambda t} \|x\|_X$, where the right-hand side is integrable on $[0,\infty)$ since $\lambda > 0$. Hence $S_\lambda x$ is well-defined as a Bochner integral, and \begin{align*} \|S_\lambda x\|_X \le \int_0^\infty e^{-\lambda t} \|x\|_X \, d\mathcal{L}^1(t) = \frac{1}{\lambda} \|x\|_X.
\end{align*}

[claim:$S_\lambda x \in D(A)$ and $(\lambda I - A)S_\lambda x = x$ for every $x \in X$] [/claim] [proof] Fix $h > 0$. Using the semigroup property $T(t+h) = T(h)T(t)$ together with linearity and boundedness of $T(h)$, \begin{align*} \frac{T(h) - I}{h} S_\lambda x &= \frac{1}{h}\int_0^\infty e^{-\lambda t}[T(t+h)x - T(t)x]\, d\mathcal{L}^1(t) \\ &= \frac{1}{h}\left[\int_h^\infty e^{-\lambda(s-h)} T(s)x \, d\mathcal{L}^1(s) - \int_0^\infty e^{-\lambda t} T(t)x \, d\mathcal{L}^1(t)\right] \\ &= \frac{e^{\lambda h} - 1}{h}\int_0^\infty e^{-\lambda t} T(t)x \, d\mathcal{L}^1(t) - \frac{e^{\lambda h}}{h}\int_0^h e^{-\lambda t} T(t) x \, d\mathcal{L}^1(t), \end{align*} where in the second line we substituted $s = t + h$ and in the third we split the integral $\int_h^\infty = \int_0^\infty - \int_0^h$ and combined exponentials. Letting $h \to 0^+$: the first term tends to $\lambda S_\lambda x$ since $\frac{e^{\lambda h}-1}{h} \to \lambda$, while the subtracted term tends to $T(0)x = x$ by continuity of the integrand at $0$. Therefore \begin{align*} \lim_{h \to 0^+} \frac{T(h) - I}{h} S_\lambda x = \lambda S_\lambda x - x. \end{align*} By the [Closure and Density of the $C_0$-Semigroup Generator](/theorems/3144), the existence of this limit means $S_\lambda x \in D(A)$ and $A S_\lambda x = \lambda S_\lambda x - x$, i.e., $(\lambda I - A) S_\lambda x = x$. [/proof]

Conversely, for $y \in D(A)$ we have $T(t)Ay = AT(t)y$, and since $A$ is closed it can be pulled out of the Bochner integral, so $S_\lambda A y = A S_\lambda y$. Hence $S_\lambda(\lambda I - A)y = (\lambda I - A) S_\lambda y = y$ for $y \in D(A)$. Thus $\lambda I - A$ is bijective from $D(A)$ to $X$ with inverse $S_\lambda$, so $\lambda \in \rho(A)$ and $R(\lambda, A) = S_\lambda$. The bound $\|R(\lambda, A)\|_{\mathcal{L}(X)} \le 1/\lambda$ follows from the estimate above.
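As a sanity check on the Laplace-transform formula, here is a minimal numerical sketch in the simplest possible setting. Everything in it is an illustrative assumption rather than part of the proof: $X = \mathbb{R}$, $A = -a$ with $a \ge 0$ (so $T(t) = e^{-at}$ is a contraction semigroup), the function name `laplace_resolvent`, the truncation point `t_max` and the trapezoid rule. In this case the resolvent is known in closed form, $R(\lambda, A) = 1/(\lambda + a)$, so the integral can be compared against it.

```python
import math


def laplace_resolvent(lam: float, a: float, t_max: float = 60.0, steps: int = 60_000) -> float:
    """Trapezoid approximation of S_lam(1) = int_0^inf e^{-lam t} T(t) 1 dt, with T(t) = e^{-a t}.

    The integral is truncated at t_max (an illustrative choice); the discarded
    tail is negligible as long as (lam + a) * t_max is large.
    """
    h = t_max / steps
    rate = lam + a
    total = 0.5 * (1.0 + math.exp(-rate * t_max))  # the two endpoint terms
    for k in range(1, steps):
        total += math.exp(-rate * k * h)
    return h * total


if __name__ == "__main__":
    lam, a = 2.0, 1.0
    print("Laplace integral:", laplace_resolvent(lam, a))
    print("1 / (lam + a)   :", 1.0 / (lam + a))
    print("bound 1 / lam   :", 1.0 / lam)
```

The approximation should agree with $1/(\lambda + a)$ up to quadrature error and stay below $1/\lambda$, mirroring the estimate $\|R(\lambda, A)\|_{\mathcal{L}(X)} \le 1/\lambda$.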
For the iterated bound, differentiating the Laplace transform identity in $\lambda$ (or iterating) yields $R(\lambda, A)^n x = \frac{1}{(n-1)!}\int_0^\infty t^{n-1} e^{-\lambda t} T(t) x \, d\mathcal{L}^1(t)$, hence $\|R(\lambda, A)^n\|_{\mathcal{L}(X)} \le \frac{1}{(n-1)!}\int_0^\infty t^{n-1} e^{-\lambda t}\, d\mathcal{L}^1(t) = \frac{1}{\lambda^n}$. [/step]

[step:Define the Yosida approximations $A_\lambda := \lambda A R(\lambda, A)$ and show they converge strongly to $A$ on $D(A)$] Assume (2): every $\lambda > 0$ belongs to $\rho(A)$ with $\|R(\lambda, A)\|_{\mathcal{L}(X)} \le 1/\lambda$. Define the **Yosida approximations** \begin{align*} A_\lambda: X &\to X, \\ x &\mapsto \lambda A R(\lambda, A) x \quad (\lambda > 0). \end{align*} We rewrite this in a more useful form. The identity $A R(\lambda, A) = \lambda R(\lambda, A) - I$ (which is a rearrangement of $(\lambda I - A) R(\lambda, A) = I$) gives \begin{align*} A_\lambda = \lambda^2 R(\lambda, A) - \lambda I. \end{align*} This makes manifest that $A_\lambda \in \mathcal{L}(X)$: it is a bounded operator with \begin{align*} \|A_\lambda\|_{\mathcal{L}(X)} \le \lambda^2 \cdot \frac{1}{\lambda} + \lambda = 2\lambda. \end{align*}

[claim:$\lambda R(\lambda, A) x \to x$ as $\lambda \to \infty$ for every $x \in X$] [/claim] [proof] First take $x \in D(A)$. Then \begin{align*} \lambda R(\lambda, A) x - x = \lambda R(\lambda, A) x - (\lambda I - A) R(\lambda, A) x = A R(\lambda, A) x = R(\lambda, A) A x, \end{align*} where we used that $R(\lambda, A)$ commutes with $A$ on $D(A)$ (since $R(\lambda, A) = (\lambda I - A)^{-1}$). Hence $\|\lambda R(\lambda, A) x - x\|_X = \|R(\lambda, A) Ax\|_X \le \frac{1}{\lambda} \|Ax\|_X \to 0$. For general $x \in X$: since $D(A)$ is dense in $X$ (by hypothesis), given $\varepsilon > 0$ there is $y \in D(A)$ with $\|x - y\|_X < \varepsilon$.
Using $\|\lambda R(\lambda, A)\|_{\mathcal{L}(X)} \le 1$: \begin{align*} \|\lambda R(\lambda, A) x - x\|_X &\le \|\lambda R(\lambda, A)(x - y)\|_X + \|\lambda R(\lambda, A) y - y\|_X + \|y - x\|_X \\ &\le 2\varepsilon + \|R(\lambda, A) Ay\|_X. \end{align*} Letting $\lambda \to \infty$ and then $\varepsilon \to 0$ yields the conclusion. [/proof]

Consequently, for $x \in D(A)$, \begin{align*} A_\lambda x = \lambda A R(\lambda, A) x = \lambda R(\lambda, A) A x \to A x \quad \text{as } \lambda \to \infty. \end{align*} [/step]

[step:Define $T_\lambda(t) := e^{tA_\lambda}$ and show each is a contraction semigroup using the resolvent bound] For each $\lambda > 0$, since $A_\lambda$ is bounded, the exponential $T_\lambda(t) := \exp(tA_\lambda) = \sum_{n=0}^\infty \frac{t^n A_\lambda^n}{n!}$ is well-defined for all $t \in \mathbb{R}$, and $\{T_\lambda(t)\}_{t \ge 0}$ is a uniformly continuous semigroup. We bound $\|T_\lambda(t)\|_{\mathcal{L}(X)}$ using the decomposition $A_\lambda = \lambda^2 R(\lambda, A) - \lambda I$: \begin{align*} T_\lambda(t) = e^{t(\lambda^2 R(\lambda, A) - \lambda I)} = e^{-\lambda t} e^{t \lambda^2 R(\lambda, A)}, \end{align*} where the splitting is valid because $\lambda I$ commutes with $\lambda^2 R(\lambda, A)$. We bound the second exponential using the power series and $\|R(\lambda, A)^n\|_{\mathcal{L}(X)} \le \|R(\lambda, A)\|_{\mathcal{L}(X)}^n \le 1/\lambda^n$ (submultiplicativity of the operator norm together with the hypothesis $\|R(\lambda, A)\|_{\mathcal{L}(X)} \le 1/\lambda$): \begin{align*} \|e^{t\lambda^2 R(\lambda, A)}\|_{\mathcal{L}(X)} \le \sum_{n=0}^\infty \frac{(t\lambda^2)^n}{n!}\|R(\lambda, A)^n\|_{\mathcal{L}(X)} \le \sum_{n=0}^\infty \frac{(t\lambda^2)^n}{n! \lambda^n} = e^{t\lambda}. \end{align*} Combining: \begin{align*} \|T_\lambda(t)\|_{\mathcal{L}(X)} \le e^{-\lambda t} \cdot e^{t \lambda} = 1, \quad t \ge 0. \end{align*} So each $\{T_\lambda(t)\}_{t\ge 0}$ is a contraction semigroup.
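The construction can be tried out numerically in a scalar toy model. The sketch below assumes $X = \mathbb{C}$ and $A = z$ with $\operatorname{Re} z \le 0$, so that $R(\lambda, A) = 1/(\lambda - z)$ and the hypothesis $|R(\lambda, A)| \le 1/\lambda$ holds; the value of `z` and the function names are hypothetical choices for illustration only. It evaluates $A_\lambda = \lambda^2 R(\lambda, A) - \lambda$ and $T_\lambda(t) = e^{tA_\lambda}$ directly from the formulas of this step.

```python
import cmath


def resolvent(lam: float, z: complex) -> complex:
    """R(lam, A) for the scalar operator A = z on X = C (requires lam != z)."""
    return 1.0 / (lam - z)


def yosida(lam: float, z: complex) -> complex:
    """Yosida approximation A_lam = lam^2 R(lam, A) - lam I."""
    return lam * lam * resolvent(lam, z) - lam


def yosida_semigroup(lam: float, z: complex, t: float) -> complex:
    """T_lam(t) = exp(t A_lam), the exponential of the bounded operator A_lam."""
    return cmath.exp(t * yosida(lam, z))


if __name__ == "__main__":
    z = complex(-1.0, 5.0)  # hypothetical generator with Re z <= 0
    for lam in (1.0, 10.0, 1e3, 1e6):
        # |T_lam(2)| should stay at most 1, and |A_lam - z| should shrink as lam grows
        print(lam, abs(yosida_semigroup(lam, z, 2.0)), abs(yosida(lam, z) - z))
```

In this model $|T_\lambda(t)| \le 1$ for every $\lambda > 0$ and $t \ge 0$, and $A_\lambda \to z$ as $\lambda \to \infty$, which is the scalar shadow of the two facts established so far.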
[guided] The bound $\|R(\lambda, A)\|_{\mathcal{L}(X)} \le 1/\lambda$ does not directly say anything about exponentials; we need a workaround. The trick is the algebraic identity \begin{align*} A_\lambda = \lambda^2 R(\lambda, A) - \lambda I, \end{align*} which expresses $A_\lambda$ as the bounded operator $\lambda^2 R(\lambda, A)$, of norm at most $\lambda$, shifted by $-\lambda I$. Why is this useful? Because $-\lambda I$ contributes a factor $e^{-\lambda t}$, which is exponentially small, while $e^{t\lambda^2 R(\lambda, A)}$ has norm at most $e^{\lambda t}$; these *exactly* cancel.

To make this rigorous: $\lambda I$ and $\lambda^2 R(\lambda, A)$ commute (the former is a scalar multiple of the identity), so we may split the operator exponential \begin{align*} T_\lambda(t) = e^{tA_\lambda} = e^{-\lambda t I} e^{t\lambda^2 R(\lambda, A)} = e^{-\lambda t}\, e^{t\lambda^2 R(\lambda, A)}. \end{align*} For the operator-valued exponential $e^{t\lambda^2 R(\lambda, A)}$, we estimate via the power series. We need bounds on $\|R(\lambda, A)^n\|$. Since $\lambda \in \rho(A)$ and $R(\lambda, A) \in \mathcal{L}(X)$ with $\|R(\lambda, A)\|_{\mathcal{L}(X)} \le 1/\lambda$, submultiplicativity of the operator norm gives $\|R(\lambda, A)^n\|_{\mathcal{L}(X)} \le \|R(\lambda, A)\|_{\mathcal{L}(X)}^n \le 1/\lambda^n$. Then \begin{align*} \|e^{t\lambda^2 R(\lambda, A)}\|_{\mathcal{L}(X)} \le \sum_{n=0}^\infty \frac{(t\lambda^2)^n}{n!} \cdot \frac{1}{\lambda^n} = \sum_{n=0}^\infty \frac{(t\lambda)^n}{n!} = e^{t\lambda}. \end{align*} Multiplying by $e^{-\lambda t}$ from the first exponential gives $\|T_\lambda(t)\|_{\mathcal{L}(X)} \le 1$, exactly as needed.

What would fail in the general case $\|R(\lambda, A)\|_{\mathcal{L}(X)} \le M/(\lambda - \omega)$? The submultiplicativity bound $\|R(\lambda, A)^n\|_{\mathcal{L}(X)} \le M^n/(\lambda - \omega)^n$ would introduce an $M^n$ factor that does *not* cancel: already for $\omega = 0$ the series only yields $\|T_\lambda(t)\|_{\mathcal{L}(X)} \le e^{-\lambda t} e^{M\lambda t} = e^{(M-1)\lambda t}$, which is not bounded uniformly in $\lambda$ when $M > 1$.
This is precisely why the general statement requires the *iterated* bound $\|R(\lambda, A)^n\|_{\mathcal{L}(X)} \le M/(\lambda - \omega)^n$ (with $M$, not $M^n$): a single factor $M$ in front of every power passes through the exponential series as a single factor $M$, so that for $\omega = 0$ the same computation gives $\|T_\lambda(t)\|_{\mathcal{L}(X)} \le M$ uniformly in $\lambda$. [/guided] [/step]

[step:Show $(T_\lambda(t)x)$ is Cauchy in $\lambda$ uniformly on bounded $t$, defining $T(t)x := \lim T_\lambda(t)x$] For $\mu, \lambda > 0$, the bounded operators $A_\mu$ and $A_\lambda$ commute (they are polynomials in $R(\mu, A)$ and $R(\lambda, A)$ respectively, and these resolvents commute by the resolvent equation $R(\lambda, A) - R(\mu, A) = (\mu - \lambda) R(\lambda, A) R(\mu, A)$). Therefore $T_\mu(t)$, $T_\lambda(s)$, $A_\mu$ and $A_\lambda$ all commute with one another, and for $x \in D(A)$ we may write \begin{align*} T_\lambda(t) x - T_\mu(t) x = \int_0^t \frac{d}{ds}\left[T_\mu(t-s) T_\lambda(s) x\right] d\mathcal{L}^1(s) = \int_0^t T_\mu(t-s) T_\lambda(s) (A_\lambda - A_\mu) x \, d\mathcal{L}^1(s), \end{align*} where the first equality evaluates $s \mapsto T_\mu(t-s)T_\lambda(s)x$ at $s = t$ and $s = 0$, and the derivative inside the integrand uses commutativity: $\frac{d}{ds}[T_\mu(t-s)T_\lambda(s)x] = -A_\mu T_\mu(t-s)T_\lambda(s)x + T_\mu(t-s)A_\lambda T_\lambda(s)x = T_\mu(t-s)T_\lambda(s)(A_\lambda - A_\mu)x$. Since $\|T_\mu(t-s)\|_{\mathcal{L}(X)} \le 1$ and $\|T_\lambda(s)\|_{\mathcal{L}(X)} \le 1$, taking norms: \begin{align*} \|T_\mu(t) x - T_\lambda(t) x\|_X \le t \cdot \|A_\lambda x - A_\mu x\|_X. \end{align*} Both $A_\lambda x \to Ax$ and $A_\mu x \to Ax$ as $\lambda, \mu \to \infty$ (Step 2), so $\|A_\lambda x - A_\mu x\|_X \to 0$. Hence $(T_\lambda(t) x)_{\lambda > 0}$ is Cauchy in $X$, uniformly for $t$ in any bounded interval $[0, t_0]$. Define \begin{align*} T(t) x := \lim_{\lambda \to \infty} T_\lambda(t) x \quad \text{for } x \in D(A). \end{align*} Since $\|T_\lambda(t)\|_{\mathcal{L}(X)} \le 1$ for all $\lambda$ and $t$ and $D(A)$ is dense in $X$, an $\varepsilon/3$ argument shows that $(T_\lambda(t)x)_{\lambda > 0}$ is Cauchy for every $x \in X$, again uniformly for $t \in [0, t_0]$. So the definition extends to all $x \in X$, with $\|T(t)\|_{\mathcal{L}(X)} \le 1$. The convergence is uniform in $t$ on bounded intervals, so $T(\cdot)x$ is continuous (limit of continuous functions).
The semigroup property $T(s+t) = T(s)T(t)$ passes to the limit from $T_\lambda(s+t) = T_\lambda(s) T_\lambda(t)$, and $T(0) = I$. Strong continuity at $t = 0$ holds because each $T_\lambda$ is uniformly continuous and the convergence is uniform on $[0, t_0]$. [/step]

[step:Verify that $A$ is the generator of $\{T(t)\}_{t \ge 0}$] Let $B$ denote the generator of the semigroup $\{T(t)\}_{t\ge 0}$ constructed in Step 4. We must show $B = A$. For $x \in D(A)$, integrating $\frac{d}{ds} T_\lambda(s)x = T_\lambda(s) A_\lambda x$ over $[0, t]$ gives \begin{align*} T_\lambda(t) x - x = \int_0^t T_\lambda(s) A_\lambda x \, d\mathcal{L}^1(s). \end{align*} For the integrand we have $\|T_\lambda(s) A_\lambda x - T(s) A x\|_X \le \|A_\lambda x - Ax\|_X + \|(T_\lambda(s) - T(s))Ax\|_X$, which tends to $0$ uniformly in $s \in [0, t]$ (using $\|T_\lambda(s)\|_{\mathcal{L}(X)} \le 1$, Step 2 and Step 4). Letting $\lambda \to \infty$: \begin{align*} T(t) x - x = \int_0^t T(s) A x \, d\mathcal{L}^1(s). \end{align*} Dividing by $t$ and letting $t \to 0^+$, the right-hand side tends to $T(0)Ax = Ax$ by continuity of $s \mapsto T(s)Ax$. Thus the limit $\lim_{t\to 0^+}\frac{T(t)x - x}{t} = Ax$ exists in $X$, so $x \in D(B)$ and $Bx = Ax$. This proves $A \subseteq B$ (i.e., $D(A) \subseteq D(B)$ and $B$ extends $A$).

To show $A = B$: pick any $\lambda > 0$. By the forward direction (already proved in Step 1 applied to the contraction semigroup $T(t)$), $\lambda \in \rho(B)$ with $\|R(\lambda, B)\|_{\mathcal{L}(X)} \le 1/\lambda$. By hypothesis, $\lambda \in \rho(A)$. Now $A \subseteq B$ implies $\lambda I - A \subseteq \lambda I - B$, and both $\lambda I - A: D(A) \to X$ and $\lambda I - B: D(B) \to X$ are bijective. Given $y \in D(B)$, set $x := R(\lambda, A)(\lambda I - B)y \in D(A)$. Then $(\lambda I - B)x = (\lambda I - A)x = (\lambda I - B)y$, and injectivity of $\lambda I - B$ gives $y = x \in D(A)$. Hence $D(B) \subseteq D(A)$, so $D(A) = D(B)$ and therefore $A = B$. [/step]

[step:Reduce the general case $\|T(t)\|_{\mathcal{L}(X)} \le M e^{\omega t}$ to the contraction case via rescaling and renorming] For the general case, we use two reductions: **Reduction 1 (rescaling):** If $A$ generates $\{T(t)\}_{t\ge 0}$, then $A - \omega I$ generates $\{e^{-\omega t} T(t)\}_{t\ge 0}$.
The bound $\|T(t)\|_{\mathcal{L}(X)} \le Me^{\omega t}$ becomes $\|e^{-\omega t} T(t)\|_{\mathcal{L}(X)} \le M$, and the resolvent shifts: $R(\lambda, A - \omega I) = R(\lambda + \omega, A)$. The resolvent estimate $\|R(\lambda, A)^n\|_{\mathcal{L}(X)} \le M/(\lambda - \omega)^n$ for $\lambda > \omega$ becomes $\|R(\lambda, A - \omega I)^n\|_{\mathcal{L}(X)} \le M/\lambda^n$ for $\lambda > 0$. So we reduce to: $A$ generates $\{T(t)\}_{t\ge 0}$ with $\|T(t)\|_{\mathcal{L}(X)} \le M$ if and only if $\rho(A) \supset (0, \infty)$ and $\|R(\lambda, A)^n\|_{\mathcal{L}(X)} \le M/\lambda^n$ for $\lambda > 0$, $n \ge 1$.

**Reduction 2 (renorming):** Define a new norm \begin{align*} |\!|\!|x|\!|\!| := \sup_{\lambda > 0, n \ge 0} \|\lambda^n R(\lambda, A)^n x\|_X. \end{align*} The term $n = 0$ and the bound $\|R(\lambda, A)^n\|_{\mathcal{L}(X)} \le M/\lambda^n$ guarantee $\|x\|_X \le |\!|\!|x|\!|\!| \le M\|x\|_X$, so $|\!|\!|\cdot|\!|\!|$ is equivalent to $\|\cdot\|_X$ and $X$ remains a Banach space. One then checks, using the resolvent equation to compare the quantities $\sup_{n \ge 0}\|\mu^n R(\mu, A)^n x\|_X$ for different $\mu$, that $|\!|\!|\lambda R(\lambda, A) x|\!|\!| \le |\!|\!|x|\!|\!|$ for all $\lambda > 0$, hence $|\!|\!|R(\lambda, A)|\!|\!| \le 1/\lambda$, which is the contraction-case hypothesis. Applying Steps 1-5 in the renormed space yields a semigroup $\{T(t)\}_{t \ge 0}$ generated by $A$ with $|\!|\!|T(t)|\!|\!| \le 1$. Translating back: $\|T(t) x\|_X \le |\!|\!|T(t) x|\!|\!| \le |\!|\!|x|\!|\!| \le M \|x\|_X$.

Combining both reductions: under hypothesis (2) of the general statement, $A$ generates a $C_0$-semigroup with $\|T(t)\|_{\mathcal{L}(X)} \le Me^{\omega t}$. The forward direction is again immediate from the Laplace transform: $\|T(t)\|_{\mathcal{L}(X)} \le Me^{\omega t}$ implies for $\lambda > \omega$, \begin{align*} R(\lambda, A)^n x = \frac{1}{(n-1)!}\int_0^\infty t^{n-1} e^{-\lambda t} T(t) x \, d\mathcal{L}^1(t), \end{align*} so $\|R(\lambda, A)^n\|_{\mathcal{L}(X)} \le \frac{M}{(n-1)!}\int_0^\infty t^{n-1} e^{-(\lambda-\omega)t}\, d\mathcal{L}^1(t) = \frac{M}{(\lambda - \omega)^n}$.
This completes the proof of the equivalence in both the contraction case and the general case. [/step]
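To see why the general hypothesis is stated for all powers $R(\lambda, A)^n$ with a single constant $M$, a small finite-dimensional sketch may help. It is an illustration under stated assumptions, not part of the proof: $X = \mathbb{R}^2$ with the Euclidean norm, $\omega = 0$, and the hypothetical matrix $A = \begin{pmatrix} -1 & c \\ 0 & -1 \end{pmatrix}$ with $c = 4$. For this $A$ one has $T(t) = e^{-t}\begin{pmatrix} 1 & ct \\ 0 & 1 \end{pmatrix}$ and $R(\lambda, A)^n = (\lambda+1)^{-n}\begin{pmatrix} 1 & nc/(\lambda+1) \\ 0 & 1 \end{pmatrix}$, and the spectral norm of $\begin{pmatrix} 1 & b \\ 0 & 1 \end{pmatrix}$ is $(|b| + \sqrt{b^2+4})/2$. The constant $M = \sup_t \|T(t)\|$ is only estimated on a grid.

```python
import math

C = 4.0  # hypothetical off-diagonal entry of A = [[-1, C], [0, -1]]


def shear_norm(b: float) -> float:
    """Spectral norm of the 2x2 matrix [[1, b], [0, 1]]."""
    return (abs(b) + math.sqrt(b * b + 4.0)) / 2.0


def semigroup_norm(t: float) -> float:
    """||T(t)|| for T(t) = e^{-t} [[1, C t], [0, 1]]."""
    return math.exp(-t) * shear_norm(C * t)


def iterated_resolvent_norm(lam: float, n: int) -> float:
    """||lam^n R(lam, A)^n||, using R(lam, A)^n = (lam+1)^{-n} [[1, n C/(lam+1)], [0, 1]]."""
    return (lam / (lam + 1.0)) ** n * shear_norm(n * C / (lam + 1.0))


# Grid estimate of M = sup_t ||T(t)|| on [0, 20]; the grid itself is an illustrative choice.
M_GRID = max(semigroup_norm(k * 1e-3) for k in range(20001))

if __name__ == "__main__":
    print("M (grid estimate)      :", M_GRID)
    print("||1 * R(1, A)||        :", iterated_resolvent_norm(1.0, 1))
    print("naive bound, n = 20    :", iterated_resolvent_norm(1.0, 1) ** 20)
    print("||1^20 * R(1, A)^20||  :", iterated_resolvent_norm(1.0, 20))
```

Here $\|\lambda R(\lambda, A)\| > 1$ at $\lambda = 1$, so the contraction-case hypothesis fails in the Euclidean norm and the submultiplicative estimate $\|\lambda R(\lambda, A)\|^n$ grows with $n$, while the actual quantities $\|\lambda^n R(\lambda, A)^n\|$ stay below $M$, as the forward direction predicts.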

Hille-Yosida Theorem (Theorem # 3139)
