[proofplan]
We differentiate the three moving pieces of $\mathcal F$: the scalar curvature, the gradient norm of $f$, and the weighted measure. The first-order constraint on $e^{-f(s)}d\mu_{g(s)}$ removes all terms coming from variation of the density and identifies $v$ with $\frac{1}{2}\operatorname{tr}_g(h)$. The remaining divergence terms are integrated by parts against the fixed weight $e^{-f}$; the Hessian terms and gradient-square terms cancel until only the pairing with $\operatorname{Ric}_g+\operatorname{Hess}_g f$ remains.
[/proofplan]
[step:Differentiate the weighted measure and identify the constraint]
Let $I \subset \mathbb{R}$ be an open interval containing $0$ on which the variations $g(s)$ and $f(s)$ are defined. For each $s \in I$, let $\rho(s) \in C^\infty(M)$ be the positive function
\begin{align*}
\rho(s) := e^{-f(s)}.
\end{align*}
The [first variation formula](/theorems/2728) for the Riemannian volume form gives
\begin{align*}
\frac{d}{ds}\Big|_{s=0}d\mu_{g(s)}
=
\frac{1}{2}\operatorname{tr}_g(h)\,d\mu_g.
\end{align*}
Also,
\begin{align*}
\frac{d}{ds}\Big|_{s=0}e^{-f(s)}
=
-v e^{-f}.
\end{align*}
Therefore
\begin{align*}
\frac{d}{ds}\Big|_{s=0}\left(e^{-f(s)}\,d\mu_{g(s)}\right)
=
\left(\frac{1}{2}\operatorname{tr}_g(h)-v\right)e^{-f}\,d\mu_g.
\end{align*}
The imposed pointwise first-order constraint is exactly
\begin{align*}
v=\frac{1}{2}\operatorname{tr}_g(h).
\end{align*}
[guided]
We first isolate the effect of the constraint, because this is the mechanism that removes the density-variation terms later. Define
\begin{align*}
\rho(s):=e^{-f(s)}.
\end{align*}
Then $\rho(s)$ is a smooth positive function on $M$, and differentiating the exponential gives
\begin{align*}
\frac{d}{ds}\Big|_{s=0}\rho(s)
=
\frac{d}{ds}\Big|_{s=0}e^{-f(s)}
=
-v e^{-f}.
\end{align*}
The first variation formula for the Riemannian volume form gives
\begin{align*}
\frac{d}{ds}\Big|_{s=0}d\mu_{g(s)}
=
\frac{1}{2}\operatorname{tr}_g(h)\,d\mu_g.
\end{align*}
Here the hypothesis that $g(s)$ is a smooth variation ensures the coordinate matrix is differentiable in $s$. The formula follows from differentiating $\sqrt{\det(g_{ij}(s))}$ in local coordinates: if $Jg$ denotes the coordinate matrix of $g$ in a chart, then
\begin{align*}
\frac{d}{ds}\Big|_{s=0}\sqrt{\det(Jg(s))}
=
\frac{1}{2}\operatorname{tr}\left((Jg)^{-1}Jh\right)\sqrt{\det(Jg)}
=
\frac{1}{2}\operatorname{tr}_g(h)\sqrt{\det(Jg)}.
\end{align*}
Multiplying the two first variations by the product rule gives
\begin{align*}
\frac{d}{ds}\Big|_{s=0}\left(e^{-f(s)}\,d\mu_{g(s)}\right)
=
\left(-v e^{-f}\right)d\mu_g
+
e^{-f}\left(\frac{1}{2}\operatorname{tr}_g(h)\,d\mu_g\right).
\end{align*}
Combining the two terms gives
\begin{align*}
\frac{d}{ds}\Big|_{s=0}\left(e^{-f(s)}\,d\mu_{g(s)}\right)
=
\left(\frac{1}{2}\operatorname{tr}_g(h)-v\right)e^{-f}\,d\mu_g.
\end{align*}
The hypothesis says that this signed measure vanishes to first order. Since $e^{-f}d\mu_g$ is a positive smooth measure, the coefficient must vanish pointwise:
\begin{align*}
v=\frac{1}{2}\operatorname{tr}_g(h).
\end{align*}
[/guided]
[/step]
[step:Differentiate the scalar curvature and gradient terms]
Let $n := \dim M$, and let $S_g \in C^\infty(M)$ denote the scalar curvature of the metric $g$. For a smooth function $u \in C^\infty(M)$, define the Laplace-Beltrami operator with the positive-divergence sign convention by
\begin{align*}
\Delta_g u := \operatorname{div}_g(\nabla u).
\end{align*}
For a smooth function $u \in C^\infty(M)$, define its Riemannian Hessian $\operatorname{Hess}_g u \in \Gamma(T^*M\otimes T^*M)$ by
\begin{align*}
\operatorname{Hess}_g u(X,Y) := X(Yu) - (\nabla_XY)u
\end{align*}
for all vector fields $X,Y \in \mathfrak X(M)$, where $\nabla$ is the Levi-Civita connection of $g$.
In a local $g$-orthonormal frame, let $\operatorname{div}_g h \in \Omega^1(M)$ be the one-form with components
\begin{align*}
(\operatorname{div}_g h)_j := \sum_{i=1}^n \nabla_i h_{ij},
\end{align*}
and let $\operatorname{div}_g\operatorname{div}_g h \in C^\infty(M)$ be the smooth function
\begin{align*}
\operatorname{div}_g\operatorname{div}_g h
:=
\sum_{i,j=1}^n \nabla_j\nabla_i h_{ij}.
\end{align*}
Since $g(s)$ is a smooth variation of metrics, the scalar-curvature first variation is obtained as follows. In a local $g$-orthonormal frame at a point, write a dot for $\frac{d}{ds}\big|_{s=0}$. Differentiating the inverse metric gives $\dot g^{ij}=-h_{ij}$, and differentiating the Christoffel symbols gives
\begin{align*}
\dot\Gamma_{ij}^k
=
\frac{1}{2}\left(\nabla_i h_{jk}+\nabla_j h_{ik}-\nabla_k h_{ij}\right).
\end{align*}
Using the curvature convention $\operatorname{Ric}_{ij}=R_{kij}^{\phantom{kij}k}$, the induced Ricci variation is
\begin{align*}
\dot{\operatorname{Ric}}_{ij}
=
\nabla_k\dot\Gamma_{ij}^k-\nabla_j\dot\Gamma_{ik}^k
=
\frac{1}{2}\left(
\nabla_k\nabla_i h_{jk}
+
\nabla_k\nabla_j h_{ik}
-
\nabla_k\nabla_k h_{ij}
-
\nabla_i\nabla_j\operatorname{tr}_g h
\right).
\end{align*}
Contracting and commuting the contracted covariant derivatives in the standard Ricci-variation computation yields
\begin{align*}
g^{ij}\dot{\operatorname{Ric}}_{ij}
=
\operatorname{div}_g\operatorname{div}_g h
-
\Delta_g(\operatorname{tr}_g h).
\end{align*}
Therefore, differentiating $S_{g(s)}=g(s)^{ij}\operatorname{Ric}_{ij}(g(s))$ gives
\begin{align*}
\frac{d}{ds}\Big|_{s=0}S_{g(s)}
=
-\langle h,\operatorname{Ric}_g\rangle_g
+
\operatorname{div}_g\operatorname{div}_g h
-
\Delta_g(\operatorname{tr}_g h).
\end{align*}
The standard variation formula for the squared gradient norm, obtained by differentiating the inverse metric contribution and the function variation $v$, is
\begin{align*}
\frac{d}{ds}\Big|_{s=0}|\nabla^{g(s)} f(s)|_{g(s)}^2
=
-\langle h,\nabla f\otimes\nabla f\rangle_g
+
2\langle \nabla f,\nabla v\rangle_g.
\end{align*}
Using the product rule for the integral defining $\mathcal F$ and the density computation from the previous step, we obtain
\begin{align*}
\frac{d}{ds}\Big|_{s=0}\mathcal F[g(s),f(s)]
=
\int_M
\left[
-\langle h,\operatorname{Ric}_g\rangle_g
+
\operatorname{div}_g\operatorname{div}_g h
-
\Delta_g(\operatorname{tr}_g h)
\right]e^{-f}\,d\mu_g.
\end{align*}
Adding the differentiated gradient-norm contribution and the density-variation contribution gives
\begin{align*}
\frac{d}{ds}\Big|_{s=0}\mathcal F[g(s),f(s)]
=
\int_M
\left[
-\langle h,\operatorname{Ric}_g\rangle_g
+
\operatorname{div}_g\operatorname{div}_g h
-
\Delta_g(\operatorname{tr}_g h)
\right]e^{-f}\,d\mu_g
+
\int_M
\left[
-\langle h,\nabla f\otimes\nabla f\rangle_g
+
2\langle \nabla f,\nabla v\rangle_g
\right]e^{-f}\,d\mu_g
+
\int_M
\left(S_g+|\nabla f|_g^2\right)
\left(\frac{1}{2}\operatorname{tr}_g(h)-v\right)e^{-f}\,d\mu_g.
\end{align*}
Since $v=\frac{1}{2}\operatorname{tr}_g(h)$, the last integral is zero.
[guided]
The goal of this step is to separate the variation of the integrand from the variation of the weighted density. Let $n := \dim M$, and let $S_g \in C^\infty(M)$ denote the scalar curvature of the metric $g$. For any smooth function $u \in C^\infty(M)$, we use the Laplace-Beltrami sign convention
\begin{align*}
\Delta_g u := \operatorname{div}_g(\nabla u).
\end{align*}
For any smooth function $u \in C^\infty(M)$, define $\operatorname{Hess}_g u \in \Gamma(T^*M\otimes T^*M)$ by
\begin{align*}
\operatorname{Hess}_g u(X,Y) := X(Yu) - (\nabla_XY)u
\end{align*}
for all vector fields $X,Y \in \mathfrak X(M)$, where $\nabla$ is the Levi-Civita connection of $g$.
In a local $g$-orthonormal frame, define
\begin{align*}
(\operatorname{div}_g h)_j := \sum_{i=1}^n \nabla_i h_{ij}
\end{align*}
and
\begin{align*}
\operatorname{div}_g\operatorname{div}_g h := \sum_{i,j=1}^n \nabla_j\nabla_i h_{ij}.
\end{align*}
We now compute the scalar-curvature variation rather than treating it as a black box. In a local $g$-orthonormal frame at a point, write a dot for $\frac{d}{ds}\big|_{s=0}$. If $h^\sharp \in \Gamma(\operatorname{End}(TM))$ is defined by $g(h^\sharp X,Y)=h(X,Y)$ for vector fields $X,Y \in \mathfrak X(M)$, then differentiating $g(s)^{ik}g(s)_{kj}=\delta^i_j$ gives
\begin{align*}
\frac{d}{ds}\Big|_{s=0}g(s)^{-1}=-h^\sharp,
\end{align*}
or, in components, $\dot g^{ij}=-h_{ij}$. Next, differentiating the formula for the Levi-Civita connection gives
\begin{align*}
\dot\Gamma_{ij}^k
=
\frac{1}{2}\left(\nabla_i h_{jk}+\nabla_j h_{ik}-\nabla_k h_{ij}\right).
\end{align*}
The Ricci tensor is the contraction $\operatorname{Ric}_{ij}=R_{kij}^{\phantom{kij}k}$ with the curvature convention used here, so its variation is
\begin{align*}
\dot{\operatorname{Ric}}_{ij}
=
\nabla_k\dot\Gamma_{ij}^k-\nabla_j\dot\Gamma_{ik}^k.
\end{align*}
Substituting the displayed expression for $\dot\Gamma$ gives
\begin{align*}
\dot{\operatorname{Ric}}_{ij}
=
\frac{1}{2}\left(
\nabla_k\nabla_i h_{jk}
+
\nabla_k\nabla_j h_{ik}
-
\nabla_k\nabla_k h_{ij}
-
\nabla_i\nabla_j\operatorname{tr}_g h
\right).
\end{align*}
After contraction with $g^{ij}$, the two mixed-divergence terms combine to $\operatorname{div}_g\operatorname{div}_g h$, and the trace term gives $-\Delta_g(\operatorname{tr}_g h)$, so
\begin{align*}
g^{ij}\dot{\operatorname{Ric}}_{ij}
=
\operatorname{div}_g\operatorname{div}_g h
-
\Delta_g(\operatorname{tr}_g h).
\end{align*}
Finally, differentiating $S_{g(s)}=g(s)^{ij}\operatorname{Ric}_{ij}(g(s))$ gives
\begin{align*}
\frac{d}{ds}\Big|_{s=0}S_{g(s)}
=
-\langle h,\operatorname{Ric}_g\rangle_g
+
\operatorname{div}_g\operatorname{div}_g h
-
\Delta_g(\operatorname{tr}_g h).
\end{align*}
The other moving part of the integrand is $|\nabla^{g(s)} f(s)|_{g(s)}^2$. Differentiating the inverse metric gives the tensor term $-\langle h,\nabla f\otimes\nabla f\rangle_g$, while differentiating the function $f(s)$ gives $2\langle \nabla f,\nabla v\rangle_g$. Hence
\begin{align*}
\frac{d}{ds}\Big|_{s=0}|\nabla^{g(s)} f(s)|_{g(s)}^2
=
-\langle h,\nabla f\otimes\nabla f\rangle_g
+
2\langle \nabla f,\nabla v\rangle_g.
\end{align*}
The product rule for the functional $\mathcal F$ therefore gives three contributions: scalar curvature, gradient norm, and weighted-density variation. Using the previous step for the density term, we get
\begin{align*}
\frac{d}{ds}\Big|_{s=0}\mathcal F[g(s),f(s)]
=
\int_M
\left[
-\langle h,\operatorname{Ric}_g\rangle_g
+
\operatorname{div}_g\operatorname{div}_g h
-
\Delta_g(\operatorname{tr}_g h)
\right]e^{-f}\,d\mu_g
+
\int_M
\left[
-\langle h,\nabla f\otimes\nabla f\rangle_g
+
2\langle \nabla f,\nabla v\rangle_g
\right]e^{-f}\,d\mu_g
+
\int_M
\left(S_g+|\nabla f|_g^2\right)
\left(\frac{1}{2}\operatorname{tr}_g(h)-v\right)e^{-f}\,d\mu_g.
\end{align*}
The constraint from the first step says $v=\frac{1}{2}\operatorname{tr}_g(h)$, so the entire density-variation contribution is zero. This leaves only the curvature and gradient-norm variations to be simplified by [integration by parts](/theorems/2098).
[/guided]
[/step]
[step:Integrate the divergence terms against the weighted measure]
Define $\rho \in C^\infty(M)$ by
\begin{align*}
\rho := e^{-f}.
\end{align*}
Since $M$ is closed, the closed-manifold form of the [Gauss-Green Theorem](/theorems/28) gives [integration by parts](/theorems/210) with no boundary term. Applying it twice to $\operatorname{div}_g\operatorname{div}_g h$ gives
\begin{align*}
\int_M \operatorname{div}_g\operatorname{div}_g h\,\rho\,d\mu_g
=
\int_M \langle h,\operatorname{Hess}_g\rho\rangle_g\,d\mu_g.
\end{align*}
Because
\begin{align*}
\operatorname{Hess}_g\rho
=
\operatorname{Hess}_g(e^{-f})
=
e^{-f}\left(\nabla f\otimes\nabla f-\operatorname{Hess}_g f\right),
\end{align*}
we have
\begin{align*}
\int_M \operatorname{div}_g\operatorname{div}_g h\,e^{-f}\,d\mu_g
=
\int_M
\left\langle h,\nabla f\otimes\nabla f-\operatorname{Hess}_g f\right\rangle_g
e^{-f}\,d\mu_g.
\end{align*}
Similarly,
\begin{align*}
\int_M -\Delta_g(\operatorname{tr}_g h)\,e^{-f}\,d\mu_g
=
-\int_M
\operatorname{tr}_g(h)
\left(|\nabla f|_g^2-\Delta_g f\right)
e^{-f}\,d\mu_g.
\end{align*}
Finally, integration by parts applied to $\langle \nabla f,\nabla v\rangle_g$ gives
\begin{align*}
2\int_M \langle \nabla f,\nabla v\rangle_g e^{-f}\,d\mu_g
=
2\int_M v\left(|\nabla f|_g^2-\Delta_g f\right)e^{-f}\,d\mu_g.
\end{align*}
Using $v=\frac{1}{2}\operatorname{tr}_g(h)$, this becomes
\begin{align*}
2\int_M \langle \nabla f,\nabla v\rangle_g e^{-f}\,d\mu_g
=
\int_M
\operatorname{tr}_g(h)
\left(|\nabla f|_g^2-\Delta_g f\right)e^{-f}\,d\mu_g,
\end{align*}
where the last equality uses $v=\frac{1}{2}\operatorname{tr}_g(h)$.
[guided]
The divergence terms must be moved from $h$ and $\operatorname{tr}_g(h)$ onto the weight $e^{-f}$. Define the smooth positive weight $\rho \in C^\infty(M)$ by
\begin{align*}
\rho := e^{-f}.
\end{align*}
Because $M$ is closed, the closed-manifold form of the [Gauss-Green Theorem](/theorems/28) gives integration by parts without boundary terms. Applying integration by parts twice to $\operatorname{div}_g\operatorname{div}_g h$ transfers the two covariant derivatives from $h$ to $\rho$ and yields
\begin{align*}
\int_M \operatorname{div}_g\operatorname{div}_g h\,\rho\,d\mu_g
=
\int_M \langle h,\operatorname{Hess}_g\rho\rangle_g\,d\mu_g.
\end{align*}
Now compute the Hessian of the weight. Since $\rho=e^{-f}$, the product and chain rules give
\begin{align*}
\operatorname{Hess}_g\rho
=
\operatorname{Hess}_g(e^{-f})
=
e^{-f}\left(\nabla f\otimes\nabla f-\operatorname{Hess}_g f\right).
\end{align*}
Substituting this into the preceding identity gives
\begin{align*}
\int_M \operatorname{div}_g\operatorname{div}_g h\,e^{-f}\,d\mu_g
=
\int_M
\left\langle h,\nabla f\otimes\nabla f-\operatorname{Hess}_g f\right\rangle_g
e^{-f}\,d\mu_g.
\end{align*}
The same integration-by-parts principle applies to the Laplacian term. Since $\Delta_g$ is $\operatorname{div}_g\nabla$ with our sign convention,
\begin{align*}
\int_M -\Delta_g(\operatorname{tr}_g h)\,e^{-f}\,d\mu_g
=
-\int_M
\operatorname{tr}_g(h)
\left(|\nabla f|_g^2-\Delta_g f\right)
e^{-f}\,d\mu_g.
\end{align*}
Finally, integrate by parts in the gradient term involving $v$:
\begin{align*}
2\int_M \langle \nabla f,\nabla v\rangle_g e^{-f}\,d\mu_g
=
2\int_M v\left(|\nabla f|_g^2-\Delta_g f\right)e^{-f}\,d\mu_g.
\end{align*}
Using the fixed weighted-measure constraint $v=\frac{1}{2}\operatorname{tr}_g(h)$ changes this to
\begin{align*}
2\int_M \langle \nabla f,\nabla v\rangle_g e^{-f}\,d\mu_g
=
\int_M
\operatorname{tr}_g(h)
\left(|\nabla f|_g^2-\Delta_g f\right)e^{-f}\,d\mu_g.
\end{align*}
These are exactly the terms needed for the cancellation in the final step.
[/guided]
[/step]
[step:Cancel the weighted terms and obtain the first variation formula]
Substituting the three integration-by-parts identities into the differentiated formula gives
\begin{align*}
\frac{d}{ds}\Big|_{s=0}\mathcal F[g(s),f(s)]
=
\int_M
\left[
-\langle h,\operatorname{Ric}_g\rangle_g
-\langle h,\nabla f\otimes\nabla f\rangle_g
\right]e^{-f}\,d\mu_g
+
\int_M
\left\langle h,\nabla f\otimes\nabla f-\operatorname{Hess}_g f\right\rangle_g
e^{-f}\,d\mu_g
-
\int_M
\operatorname{tr}_g(h)
\left(|\nabla f|_g^2-\Delta_g f\right)e^{-f}\,d\mu_g
+
\int_M
\operatorname{tr}_g(h)
\left(|\nabla f|_g^2-\Delta_g f\right)e^{-f}\,d\mu_g.
\end{align*}
The last two integrals cancel. The two terms involving $\nabla f\otimes\nabla f$ also cancel. Therefore
\begin{align*}
\frac{d}{ds}\Big|_{s=0}\mathcal F[g(s),f(s)]
=
-\int_M
\left[
\langle h,\operatorname{Ric}_g\rangle_g
+
\langle h,\operatorname{Hess}_g f\rangle_g
\right]e^{-f}\,d\mu_g.
\end{align*}
By bilinearity of the metric pairing on symmetric two-tensors, this is
\begin{align*}
\frac{d}{ds}\Big|_{s=0}\mathcal F[g(s),f(s)]
=
-\int_M
\left\langle h,\operatorname{Ric}_g+\operatorname{Hess}_g f\right\rangle_g
e^{-f}\,d\mu_g.
\end{align*}
This is the desired first variation formula under the fixed weighted-measure constraint.
[guided]
We now substitute the integration-by-parts identities from the previous step into the variation formula. The expression becomes
\begin{align*}
\frac{d}{ds}\Big|_{s=0}\mathcal F[g(s),f(s)]
=
\int_M
\left[
-\langle h,\operatorname{Ric}_g\rangle_g
-\langle h,\nabla f\otimes\nabla f\rangle_g
\right]e^{-f}\,d\mu_g
+
\int_M
\left\langle h,\nabla f\otimes\nabla f-\operatorname{Hess}_g f\right\rangle_g
e^{-f}\,d\mu_g
-
\int_M
\operatorname{tr}_g(h)
\left(|\nabla f|_g^2-\Delta_g f\right)e^{-f}\,d\mu_g
+
\int_M
\operatorname{tr}_g(h)
\left(|\nabla f|_g^2-\Delta_g f\right)e^{-f}\,d\mu_g.
\end{align*}
The two trace terms are the same integral with opposite signs, so their sum is zero. The tensor terms involving $\nabla f\otimes\nabla f$ also cancel because
\begin{align*}
-\langle h,\nabla f\otimes\nabla f\rangle_g
+
\langle h,\nabla f\otimes\nabla f\rangle_g
=
0.
\end{align*}
After these cancellations, the only remaining terms are the Ricci term and the Hessian term:
\begin{align*}
\frac{d}{ds}\Big|_{s=0}\mathcal F[g(s),f(s)]
=
-\int_M
\left[
\langle h,\operatorname{Ric}_g\rangle_g
+
\langle h,\operatorname{Hess}_g f\rangle_g
\right]e^{-f}\,d\mu_g.
\end{align*}
Since the metric pairing is bilinear on symmetric two-tensors,
\begin{align*}
\langle h,\operatorname{Ric}_g\rangle_g
+
\langle h,\operatorname{Hess}_g f\rangle_g
=
\left\langle h,\operatorname{Ric}_g+\operatorname{Hess}_g f\right\rangle_g.
\end{align*}
Therefore
\begin{align*}
\frac{d}{ds}\Big|_{s=0}\mathcal F[g(s),f(s)]
=
-\int_M
\left\langle h,\operatorname{Ric}_g+\operatorname{Hess}_g f\right\rangle_g
e^{-f}\,d\mu_g.
\end{align*}
This is exactly the claimed first variation formula under the fixed weighted-measure constraint.
[/guided]
[/step]