[proofplan]
We first prove the identity up to the stopping time at which Brownian motion exits a compact interval. On that stopped interval, $f'$ and $f''$ are bounded and $f''$ is uniformly continuous, so [Taylor's theorem](/theorems/827) along a partition of $[0,t]$ has a remainder controlled by the modulus of continuity of $f''$, evaluated at the largest Brownian increment over the partition, times the quadratic variation sum. The first-order sums converge to the stochastic integral, the second-order sums converge to the time integral by weighted quadratic variation, and the remainder vanishes. Letting the compact interval expand gives the stated localized identity.
[/proofplan]
[step:Localize Brownian motion to a compact range]
Fix $t\geq 0$ and, for each integer $n\geq 1$, define the stopping time
\begin{align*}
\tau_n &= \inf\{u\geq 0: |W_u|\geq n\}.
\end{align*}
The stopped path $(W_{u\wedge\tau_n})_{0\leq u\leq t}$ takes values in the compact interval $[-n,n]$. Since $f\in C^2(\mathbb R)$, the functions $f'$ and $f''$ are bounded on $[-n,n]$, and $f''$ is uniformly continuous on $[-n,n]$. It is therefore enough to prove the formula with $t$ replaced by $t\wedge\tau_n$ and then let $n\to\infty$: Brownian paths are continuous, hence bounded on every compact time interval, so $\tau_n\uparrow\infty$ almost surely.
[/step]
[step:Apply Taylor's theorem on a partition before the stopping time]
Let $\pi=\{0=t_0<t_1<\cdots<t_m=t\}$ be a partition of $[0,t]$, and set
\begin{align*}
Y_u &= W_{u\wedge\tau_n}.
\end{align*}
For each $k$, [Taylor's theorem](/theorems/827) gives
\begin{align*}
f(Y_{t_{k+1}})-f(Y_{t_k})
&= f'(Y_{t_k})(Y_{t_{k+1}}-Y_{t_k})
+\frac{1}{2}f''(Y_{t_k})(Y_{t_{k+1}}-Y_{t_k})^2
+R_k,
\end{align*}
where
\begin{align*}
|R_k|
&\leq \frac{1}{2}\omega_n\left(|Y_{t_{k+1}}-Y_{t_k}|\right)(Y_{t_{k+1}}-Y_{t_k})^2.
\end{align*}
Here $\omega_n:[0,\infty)\to[0,\infty)$ is the modulus of continuity of $f''$ on $[-n,n]$, defined by
\begin{align*}
\omega_n(r)&=\sup\{|f''(x)-f''(y)|: x,y\in[-n,n], |x-y|\leq r\}.
\end{align*}
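The bound on $R_k$ can be derived as follows. This is a short sketch assuming the Lagrange form of the second-order Taylor remainder; the symbols $\Delta_k$ and $\xi_k$ are notation introduced only for this sketch. Write $\Delta_k=Y_{t_{k+1}}-Y_{t_k}$. There is a point $\xi_k$ between $Y_{t_k}$ and $Y_{t_{k+1}}$ with
\begin{align*}
f(Y_{t_{k+1}})-f(Y_{t_k})
&= f'(Y_{t_k})\Delta_k+\frac{1}{2}f''(\xi_k)\Delta_k^2,
\end{align*}
so that
\begin{align*}
R_k &= \frac{1}{2}\left(f''(\xi_k)-f''(Y_{t_k})\right)\Delta_k^2.
\end{align*}
Both $\xi_k$ and $Y_{t_k}$ lie in $[-n,n]$ and $|\xi_k-Y_{t_k}|\leq|\Delta_k|$, so the definition of $\omega_n$ gives $|f''(\xi_k)-f''(Y_{t_k})|\leq\omega_n(|\Delta_k|)$, which is the stated estimate.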
Summing over $k$ telescopes the left-hand side:
\begin{align*}
f(Y_t)-f(Y_0)
&= \sum_{k=0}^{m-1}f'(Y_{t_k})(Y_{t_{k+1}}-Y_{t_k}) \\
&\quad+\frac{1}{2}\sum_{k=0}^{m-1}f''(Y_{t_k})(Y_{t_{k+1}}-Y_{t_k})^2
+\sum_{k=0}^{m-1}R_k.
\end{align*}
[/step]
[step:Pass the three sums to their limits]
Let $|\pi|=\max_k(t_{k+1}-t_k)$ denote the mesh of the partition, and let $|\pi|\to0$ along deterministic partitions. Write $[W]_u$ for the [quadratic variation of Brownian motion](/theorems/3543) at time $u$. Since $u\mapsto f'(Y_u)$ is bounded, continuous, and adapted, the approximation of the Itô integral by left-endpoint Riemann sums gives
\begin{align*}
\sum_{k=0}^{m-1}f'(Y_{t_k})(Y_{t_{k+1}}-Y_{t_k})
&\to \int_0^t f'(W_{s\wedge\tau_n})\,dW_{s\wedge\tau_n}
\end{align*}
in probability. The weighted quadratic variation convergence for Brownian motion gives
\begin{align*}
\sum_{k=0}^{m-1}f''(Y_{t_k})(Y_{t_{k+1}}-Y_{t_k})^2
&\to \int_0^{t\wedge\tau_n} f''(W_s)\,d\mathcal L^1(s)
\end{align*}
in probability. This weighted convergence follows from the quadratic variation identity $[W]_{u}=u$ and [uniform continuity](/page/Uniform%20Continuity) of the continuous adapted weight $s\mapsto f''(Y_s)$ on $[0,t]$.
For the remainder, Brownian paths are uniformly continuous on $[0,t]$ almost surely, so
\begin{align*}
\max_{0\leq k<m}|Y_{t_{k+1}}-Y_{t_k}|&\to0
\end{align*}
almost surely. Also, the quadratic sums $\sum_k(Y_{t_{k+1}}-Y_{t_k})^2$ converge in probability to $t\wedge\tau_n$ and are therefore bounded in probability. Since $\omega_n(r)\to0$ as $r\downarrow0$, the estimate on $R_k$ gives
\begin{align*}
\sum_{k=0}^{m-1}R_k&\to0
\end{align*}
in probability.
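The estimate behind this convergence can be written out explicitly. The sketch below uses only the bound on $R_k$ and the fact that $\omega_n$ is nondecreasing, which holds because the supremum defining $\omega_n(r)$ is taken over a larger set as $r$ grows:
\begin{align*}
\left|\sum_{k=0}^{m-1}R_k\right|
&\leq \frac{1}{2}\sum_{k=0}^{m-1}\omega_n\left(|Y_{t_{k+1}}-Y_{t_k}|\right)(Y_{t_{k+1}}-Y_{t_k})^2 \\
&\leq \frac{1}{2}\,\omega_n\left(\max_{0\leq j<m}|Y_{t_{j+1}}-Y_{t_j}|\right)\sum_{k=0}^{m-1}(Y_{t_{k+1}}-Y_{t_k})^2.
\end{align*}
The first factor tends to $0$ almost surely and the second is bounded in probability, so the product tends to $0$ in probability.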
Passing to the limit in the telescoping identity yields
\begin{align*}
f(W_{t\wedge\tau_n})
&= f(W_0)
+ \int_0^t f'(W_{s\wedge\tau_n})\,dW_{s\wedge\tau_n}
+\frac{1}{2}\int_0^{t\wedge\tau_n} f''(W_s)\,d\mathcal L^1(s)
\end{align*}
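almost surely: the left-hand side of the telescoping identity does not depend on the partition, and limits in probability are almost surely unique.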
[/step]
[step:Remove the localization]
The stopped stochastic integral satisfies
\begin{align*}
\int_0^t f'(W_{s\wedge\tau_n})\,dW_{s\wedge\tau_n}
&= \int_0^{t\wedge\tau_n} f'(W_s)\,dW_s.
\end{align*}
Since $\tau_n\uparrow\infty$ almost surely, for almost every outcome there is an index $n_0$ such that $t<\tau_n$ for all $n\geq n_0$. For those $n$,
\begin{align*}
W_{t\wedge\tau_n} &= W_t, &
\int_0^{t\wedge\tau_n} f''(W_s)\,d\mathcal L^1(s)
&= \int_0^t f''(W_s)\,d\mathcal L^1(s),
\end{align*}
and the stopped stochastic integral agrees with the localized stochastic integral $\int_0^t f'(W_s)\,dW_s$. Letting $n\to\infty$ gives
\begin{align*}
f(W_t)
&= f(W_0)
+ \int_0^t f'(W_s)\,dW_s
+\frac{1}{2}\int_0^t f''(W_s)\,d\mathcal L^1(s)
\end{align*}
almost surely.
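As a sanity check of the identity, consider the illustrative choice $f(x)=x^2$ (an example chosen here, not part of the statement), for which $f'(x)=2x$ and $f''(x)=2$:
\begin{align*}
W_t^2
&= W_0^2 + 2\int_0^t W_s\,dW_s + \frac{1}{2}\int_0^t 2\,d\mathcal L^1(s) \\
&= W_0^2 + 2\int_0^t W_s\,dW_s + t.
\end{align*}
The extra term $t$ is the quadratic variation contribution, which has no counterpart in the classical chain rule.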
[/step]