[proofplan]
We reduce the multivariable statement to a one-variable statement along the line segment $t \mapsto x_0+th$. Openness gives a ball around $x_0$ contained in $U$, so these line segments are admissible for all sufficiently small $h$. Applying the chain rule twice expresses the second derivative of the one-variable restriction in terms of the Hessian of $f$. The remainder is then an integral average of $h^\top(Hf_{x_0+th}-Hf_{x_0})h$, and continuity of the Hessian at $x_0$ makes this average $o(|h|^2)$.
[/proofplan]
custom_env
admin
[step:Choose a ball on which all line segments stay inside $U$]
Since $U$ is open and $x_0 \in U$, there exists $\rho > 0$ such that
\begin{align*}
B(x_0,\rho) \subset U.
\end{align*}
Fix $h \in B(0,\rho)$. Define the line segment map
\begin{align*}
\gamma_h: [0,1] &\to U
\end{align*}
by
\begin{align*}
\gamma_h(t) = x_0+th.
\end{align*}
For every $t \in [0,1]$ we have $|\gamma_h(t)-x_0| = t|h| < \rho$, so $\gamma_h(t) \in B(x_0,\rho) \subset U$.
[/step]
custom_env
admin
[step:Restrict $f$ to the line segment and compute its derivatives]For fixed $h \in B(0,\rho)$, define
\begin{align*}
g_h: [0,1] &\to \mathbb{R}
\end{align*}
by
\begin{align*}
g_h(t) = f(x_0+th).
\end{align*}
Since $f \in C^2(U)$ and $\gamma_h$ is a smooth map from $[0,1]$ into $U$, the one-variable function $g_h$ is $C^2$ on $[0,1]$. By the chain rule,
\begin{align*}
g_h'(t) = \nabla f(x_0+th)\cdot h.
\end{align*}
Applying the chain rule once more to the map $t \mapsto \nabla f(x_0+th)\cdot h$ gives
\begin{align*}
g_h''(t) = h^\top Hf_{x_0+th}h.
\end{align*}
In particular,
\begin{align*}
g_h'(0) = \nabla f(x_0)\cdot h
\end{align*}
and
\begin{align*}
g_h''(0) = h^\top Hf_{x_0}h.
\end{align*}[/step]
custom_env
admin
[guided]The purpose of introducing $g_h$ is to turn the multivariable Taylor expansion into an ordinary one-variable expansion in the parameter $t$. For fixed $h \in B(0,\rho)$, the map
\begin{align*}
g_h: [0,1] &\to \mathbb{R}
\end{align*}
is defined by
\begin{align*}
g_h(t) = f(x_0+th).
\end{align*}
The previous step verified that $x_0+th \in U$ for every $t \in [0,1]$, so this definition is valid on the whole interval.
Because $f \in C^2(U)$, all first and second partial derivatives of $f$ exist and are continuous on $U$. The line map $t \mapsto x_0+th$ is smooth. Therefore the chain rule applies to the composition $g_h = f \circ \gamma_h$, and its first derivative is
\begin{align*}
g_h'(t) = \sum_{i=1}^n \partial_{x_i}f(x_0+th)h_i.
\end{align*}
This is exactly the Euclidean dot product
\begin{align*}
g_h'(t) = \nabla f(x_0+th)\cdot h.
\end{align*}
We differentiate once more. Since each component $\partial_{x_i}f$ is $C^1$ on $U$, the chain rule gives
\begin{align*}
g_h''(t) = \sum_{i=1}^n \sum_{j=1}^n \partial_{x_j}\partial_{x_i}f(x_0+th)h_jh_i.
\end{align*}
Since $f \in C^2(U)$, equality of mixed partial derivatives gives $\partial_{x_j}\partial_{x_i}f(x_0+th)=\partial_{x_i}\partial_{x_j}f(x_0+th)$. By the definition of the Hessian matrix $Hf_{x_0+th}$, this double sum is the quadratic form of the Hessian on $h$:
\begin{align*}
g_h''(t) = h^\top Hf_{x_0+th}h.
\end{align*}
Evaluating these formulas at $t=0$ gives
\begin{align*}
g_h'(0) = \nabla f(x_0)\cdot h
\end{align*}
and
\begin{align*}
g_h''(0) = h^\top Hf_{x_0}h.
\end{align*}
These are exactly the first-order and second-order terms appearing in the desired Taylor expansion.[/guided]
custom_env
admin
[step:Write the one-variable expansion with an integral remainder]
Let $\mathcal{L}^1$ denote one-dimensional [Lebesgue measure](/page/Lebesgue%20Measure) on $\mathbb{R}$. For a $C^2$ function $g: [0,1]\to \mathbb{R}$, the [fundamental theorem of calculus](/theorems/632) gives
\begin{align*}
g(1)-g(0) = \int_0^1 g'(s)\,d\mathcal{L}^1(s).
\end{align*}
For each $s \in [0,1]$, applying the same theorem to $g'$ on $[0,s]$ gives
\begin{align*}
g'(s) = g'(0)+\int_0^s g''(t)\,d\mathcal{L}^1(t).
\end{align*}
Substituting this identity into the previous integral and using the standard triangular-domain form of [Fubini's theorem](/theorems/2961) for the [continuous function](/page/Continuous%20Function) $(s,t)\mapsto g''(t)$ on $\{(s,t):0\le t\le s\le 1\}$ yields
\begin{align*}
g(1)=g(0)+g'(0)+\int_0^1 (1-t)g''(t)\,d\mathcal{L}^1(t).
\end{align*}
Applying this identity to $g=g_h$ gives
\begin{align*}
f(x_0+h)=f(x_0)+\nabla f(x_0)\cdot h+\int_0^1 (1-t)h^\top Hf_{x_0+th}h\,d\mathcal{L}^1(t).
\end{align*}
Since
\begin{align*}
\int_0^1 (1-t)\,d\mathcal{L}^1(t)=\frac{1}{2},
\end{align*}
we obtain
\begin{align*}
f(x_0+h)=f(x_0)+\nabla f(x_0)\cdot h+\frac{1}{2}h^\top Hf_{x_0}h+r(h),
\end{align*}
where
\begin{align*}
r: B(0,\rho) &\to \mathbb{R}
\end{align*}
is defined by
\begin{align*}
r(h)=\int_0^1 (1-t)h^\top\left(Hf_{x_0+th}-Hf_{x_0}\right)h\,d\mathcal{L}^1(t).
\end{align*}
[/step]
custom_env
admin
[step:Use continuity of the Hessian to prove the Peano estimate]
Let $\|\cdot\|_{\mathrm{op}}$ denote the operator norm on real $n\times n$ matrices induced by the Euclidean norm on $\mathbb{R}^n$. For $h \in B(0,\rho)$, the definition of $r(h)$ and the quadratic-form estimate $|h^\top Ah|\le \|A\|_{\mathrm{op}}|h|^2$ give
\begin{align*}
|r(h)| \le |h|^2\int_0^1 (1-t)\left\|Hf_{x_0+th}-Hf_{x_0}\right\|_{\mathrm{op}}\,d\mathcal{L}^1(t).
\end{align*}
Hence, for $h\ne 0$,
\begin{align*}
\frac{|r(h)|}{|h|^2} \le \frac{1}{2}\sup_{0\le t\le 1}\left\|Hf_{x_0+th}-Hf_{x_0}\right\|_{\mathrm{op}}.
\end{align*}
Because $f \in C^2(U)$, the map
\begin{align*}
U &\to \mathbb{R}^{n\times n}
\end{align*}
given by
\begin{align*}
x \mapsto Hf_x
\end{align*}
is continuous at $x_0$. If $h \to 0$ and $0\le t\le 1$, then $x_0+th \to x_0$ uniformly in $t$, so
\begin{align*}
\sup_{0\le t\le 1}\left\|Hf_{x_0+th}-Hf_{x_0}\right\|_{\mathrm{op}} \to 0.
\end{align*}
Therefore
\begin{align*}
\lim_{\substack{h\to 0, h\ne 0}}\frac{|r(h)|}{|h|^2}=0.
\end{align*}
The expansion and the Peano remainder estimate are proved for every $h \in B(0,\rho)$, so the theorem follows.
[/step]