[proofplan]
We apply the Li-Yau differential estimate to $f=\log u$ along an arbitrary piecewise $C^1$ path joining $(x,s)$ to $(y,t)$ in spacetime. The chain rule converts the differential estimate into a lower bound for the derivative of $f(\gamma(\tau),\tau)$, while the elementary inequality, valid for every $p \in M$ and every pair of tangent vectors $a,b \in T_pM$,
\begin{align*}
g_p(a,b) \geq -|a|_g^2-\frac{1}{4}|b|_g^2
\end{align*}
controls the spatial gradient term by the path energy. After integrating in time, we optimize over all paths and use the standard length-energy relation to replace the path energy by
\begin{align*}
\frac{d_g(x,y)^2}{t-s}.
\end{align*}
Exponentiating the resulting logarithmic inequality gives the stated Harnack estimate.
[/proofplan]
[step:Pass from the heat solution to the logarithmic differential inequality]
Define the map $f: M \times (0,T] \to \mathbb{R}$ by
\begin{align*}
f(p,\tau)=\log u(p,\tau).
\end{align*}
Since $u$ is assumed smooth and strictly positive, $f$ is smooth on $M \times (0,T]$. By the Li-Yau differential estimate assumed in the theorem statement, for every $(p,\tau) \in M \times (0,T]$,
\begin{align*}
|\nabla f(p,\tau)|_g^2 - \partial_t f(p,\tau) \leq \frac{n}{2\tau}.
\end{align*}
Equivalently,
\begin{align*}
\partial_t f(p,\tau) \geq |\nabla f(p,\tau)|_g^2 - \frac{n}{2\tau}.
\end{align*}
[/step]
[step:Estimate the derivative of $\log u$ along an arbitrary path]
Let $\mathcal{L}^1$ denote one-dimensional [Lebesgue measure](/page/Lebesgue%20Measure) on $\mathbb{R}$. Let
\begin{align*}
\gamma: [s,t] &\to M
\end{align*}
be a piecewise $C^1$ curve with $\gamma(s)=x$ and $\gamma(t)=y$. Define the map $F_\gamma: [s,t] \to \mathbb{R}$ by
\begin{align*}
F_\gamma(\tau)=f(\gamma(\tau),\tau).
\end{align*}
On each interval on which $\gamma$ is $C^1$, the chain rule gives
\begin{align*}
\frac{d}{d\tau}F_\gamma(\tau)
=
\partial_t f(\gamma(\tau),\tau)
+
g_{\gamma(\tau)}(\nabla f(\gamma(\tau),\tau),\dot{\gamma}(\tau)).
\end{align*}
First, substituting the Li-Yau lower bound into the chain-rule identity gives
\begin{align*}
\frac{d}{d\tau}F_\gamma(\tau) \geq |\nabla f(\gamma(\tau),\tau)|_g^2-\frac{n}{2\tau}+g_{\gamma(\tau)}(\nabla f(\gamma(\tau),\tau),\dot{\gamma}(\tau)).
\end{align*}
Then applying the elementary estimate gives
\begin{align*}
\frac{d}{d\tau}F_\gamma(\tau) \geq |\nabla f(\gamma(\tau),\tau)|_g^2-\frac{n}{2\tau}-|\nabla f(\gamma(\tau),\tau)|_g^2-\frac{1}{4}|\dot{\gamma}(\tau)|_g^2.
\end{align*}
Cancelling the equal gradient terms yields
\begin{align*}
\frac{d}{d\tau}F_\gamma(\tau) \geq -\frac{n}{2\tau}-\frac{1}{4}|\dot{\gamma}(\tau)|_g^2.
\end{align*}
[guided]
The goal is to compare $f(x,s)=\log u(x,s)$ with $f(y,t)=\log u(y,t)$. Since these values occur at different points and different times, we connect them by a spacetime path. Choose a piecewise $C^1$ curve
\begin{align*}
\gamma: [s,t] \to M
\end{align*}
with $\gamma(s)=x$ and $\gamma(t)=y$, and define the map $F_\gamma: [s,t] \to \mathbb{R}$ by
\begin{align*}
F_\gamma(\tau)=f(\gamma(\tau),\tau).
\end{align*}
On each interval where $\gamma$ is $C^1$, the chain rule gives
\begin{align*}
\frac{d}{d\tau}F_\gamma(\tau)
=
\partial_t f(\gamma(\tau),\tau)
+
g_{\gamma(\tau)}(\nabla f(\gamma(\tau),\tau),\dot{\gamma}(\tau)).
\end{align*}
The Li-Yau estimate controls the first term from below:
\begin{align*}
\partial_t f(\gamma(\tau),\tau)
\geq
|\nabla f(\gamma(\tau),\tau)|_g^2-\frac{n}{2\tau}.
\end{align*}
Substituting this into the chain-rule identity yields
\begin{align*}
\frac{d}{d\tau}F_\gamma(\tau)
\geq
|\nabla f(\gamma(\tau),\tau)|_g^2
-\frac{n}{2\tau}
+
g_{\gamma(\tau)}(\nabla f(\gamma(\tau),\tau),\dot{\gamma}(\tau)).
\end{align*}
The only term with no fixed sign is the [inner product](/page/Inner%20Product). We control it by the elementary inequality
\begin{align*}
g_{\gamma(\tau)}(a,b) \geq -|a|_g^2-\frac{1}{4}|b|_g^2,
\end{align*}
which follows from
\begin{align*}
0 \leq \left|a+\frac{1}{2}b\right|_g^2 = |a|_g^2+g_{\gamma(\tau)}(a,b)+\frac{1}{4}|b|_g^2.
\end{align*}
Applying this with $a=\nabla f(\gamma(\tau),\tau)$ and $b=\dot{\gamma}(\tau)$ gives
\begin{align*}
\frac{d}{d\tau}F_\gamma(\tau) \geq |\nabla f(\gamma(\tau),\tau)|_g^2-\frac{n}{2\tau}-|\nabla f(\gamma(\tau),\tau)|_g^2-\frac{1}{4}|\dot{\gamma}(\tau)|_g^2.
\end{align*}
Cancelling the two gradient-square terms gives
\begin{align*}
\frac{d}{d\tau}F_\gamma(\tau) \geq -\frac{n}{2\tau}-\frac{1}{4}|\dot{\gamma}(\tau)|_g^2.
\end{align*}
This is the key cancellation: the positive $|\nabla f|_g^2$ term supplied by Li-Yau exactly absorbs the possible negative contribution from the spatial motion of the path.
[/guided]
[/step]
[step:Integrate the pathwise differential inequality from $s$ to $t$]
Since $\gamma$ is piecewise $C^1$, there is a partition $s=\tau_0<\tau_1<\cdots<\tau_m=t$ such that $\gamma$ is $C^1$ on each open subinterval $(\tau_{j-1},\tau_j)$. Hence $F_\gamma$ is continuous on $[s,t]$, is $C^1$ on each such open subinterval, and the preceding differential inequality holds for $\mathcal{L}^1$-a.e. $\tau \in [s,t]$. Applying the [fundamental theorem of calculus](/theorems/632) on each subinterval and summing gives
\begin{align*}
f(y,t)-f(x,s)=\int_s^t \frac{d}{d\tau}F_\gamma(\tau)\,d\mathcal{L}^1(\tau).
\end{align*}
Using the differential inequality and the linearity of the [Lebesgue integral](/page/Lebesgue%20Integral) gives
\begin{align*}
f(y,t)-f(x,s) \geq -\frac{n}{2}\int_s^t \frac{1}{\tau}\,d\mathcal{L}^1(\tau)-\frac{1}{4}\int_s^t |\dot{\gamma}(\tau)|_g^2\,d\mathcal{L}^1(\tau).
\end{align*}
Evaluating the elementary logarithmic integral gives
\begin{align*}
f(y,t)-f(x,s) \geq -\frac{n}{2}\log\left(\frac{t}{s}\right)-\frac{1}{4}\int_s^t |\dot{\gamma}(\tau)|_g^2\,d\mathcal{L}^1(\tau).
\end{align*}
Equivalently,
\begin{align*}
f(x,s)-f(y,t) \leq \frac{n}{2}\log\left(\frac{t}{s}\right)+\frac{1}{4}\int_s^t |\dot{\gamma}(\tau)|_g^2\,d\mathcal{L}^1(\tau).
\end{align*}
[/step]
[step:Minimize the path energy by the distance between $x$ and $y$]
Because the theorem now assumes that $x$ and $y$ lie in the same connected component of the smooth manifold $M$, they can be joined by a piecewise $C^1$ curve. Let $\Gamma_{x,y}^{s,t}$ denote the nonempty set of all piecewise $C^1$ curves
\begin{align*}
\gamma: [s,t] &\to M
\end{align*}
with $\gamma(s)=x$ and $\gamma(t)=y$. For $\gamma \in \Gamma_{x,y}^{s,t}$, define its length $L_g(\gamma)$ and energy $E_g(\gamma)$ by
\begin{align*}
L_g(\gamma):=\int_s^t |\dot{\gamma}(\tau)|_g\,d\mathcal{L}^1(\tau).
\end{align*}
and
\begin{align*}
E_g(\gamma):=\int_s^t |\dot{\gamma}(\tau)|_g^2\,d\mathcal{L}^1(\tau).
\end{align*}
By the [Cauchy-Schwarz inequality](/theorems/432) applied to the functions $\tau \mapsto |\dot{\gamma}(\tau)|_g$ and $\tau \mapsto 1$ on $[s,t]$,
\begin{align*}
L_g(\gamma)^2
\leq
E_g(\gamma)(t-s).
\end{align*}
Since $d_g(x,y) \leq L_g(\gamma)$ by the definition of the Riemannian distance,
\begin{align*}
E_g(\gamma) \geq \frac{d_g(x,y)^2}{t-s}.
\end{align*}
Conversely, for every $\varepsilon>0$, the definition of $d_g(x,y)$ and the standard smoothing of piecewise $C^1$ curves give a regular piecewise $C^1$ curve $\sigma_\varepsilon:[0,1]\to M$ from $x$ to $y$ with
\begin{align*}
L_g(\sigma_\varepsilon) \leq d_g(x,y)+\varepsilon.
\end{align*}
Because $\sigma_\varepsilon$ is regular on each smooth segment, its arclength parameter is piecewise $C^1$ with piecewise $C^1$ inverse after subdividing at the break points. Reparametrize $\sigma_\varepsilon$ on $[s,t]$ proportional to arclength, obtaining a piecewise $C^1$ curve $\gamma_\varepsilon \in \Gamma_{x,y}^{s,t}$ with constant speed
\begin{align*}
|\dot{\gamma}_\varepsilon(\tau)|_g=\frac{L_g(\sigma_\varepsilon)}{t-s}
\end{align*}
for $\mathcal{L}^1$-a.e. $\tau \in [s,t]$. Therefore
\begin{align*}
E_g(\gamma_\varepsilon)=\int_s^t \frac{L_g(\sigma_\varepsilon)^2}{(t-s)^2}\,d\mathcal{L}^1(\tau)=\frac{L_g(\sigma_\varepsilon)^2}{t-s}\leq\frac{(d_g(x,y)+\varepsilon)^2}{t-s}.
\end{align*}
Letting $\varepsilon \downarrow 0$ yields
\begin{align*}
\inf_{\gamma \in \Gamma_{x,y}^{s,t}} E_g(\gamma)
=
\frac{d_g(x,y)^2}{t-s}.
\end{align*}
[/step]
[step:Exponentiate the optimized logarithmic inequality]
The estimate from the integration step holds for every $\gamma \in \Gamma_{x,y}^{s,t}$, so taking the infimum of the energy term gives
\begin{align*}
f(x,s)-f(y,t) \leq \frac{n}{2}\log\left(\frac{t}{s}\right)+\frac{d_g(x,y)^2}{4(t-s)}.
\end{align*}
Since $f=\log u$, this is
\begin{align*}
\log u(x,s)-\log u(y,t) \leq \frac{n}{2}\log\left(\frac{t}{s}\right)+\frac{d_g(x,y)^2}{4(t-s)}.
\end{align*}
Exponentiating both sides, using that the exponential function is increasing on $\mathbb{R}$ and that $u(y,t)>0$, gives
\begin{align*}
\frac{u(x,s)}{u(y,t)} \leq \left(\frac{t}{s}\right)^{n/2}\exp\left(\frac{d_g(x,y)^2}{4(t-s)}\right).
\end{align*}
Multiplying by $u(y,t)$ proves
\begin{align*}
u(x,s) \leq u(y,t)\left(\frac{t}{s}\right)^{n/2}
\exp\left(\frac{d_g(x,y)^2}{4(t-s)}\right),
\end{align*}
which is the desired integrated Li-Yau Harnack inequality.
[/step]