[guided]We begin with an arbitrary smooth local frame and turn it into an orthonormal one without losing smoothness. Let $x_0\in M$. Since $E\to M$ is a smooth rank-$r$ vector bundle, there is an open neighbourhood $U\subset M$ of $x_0$ and smooth sections $e_i:U\to E$ for $1\le i\le r$ such that $(e_1(x),\dots,e_r(x))$ is a basis of $E_x$ for every $x\in U$.
The fibrewise Gram-Schmidt process is legitimate here because every denominator that appears is a positive smooth function. Define
\begin{align*}
u_1(x)=\frac{e_1(x)}{\sqrt{h_x(e_1(x),e_1(x))}}.
\end{align*}
The function $x\mapsto h_x(e_1(x),e_1(x))$ is smooth by smoothness of the bundle metric and of the section $e_1$. It is strictly positive because $e_1(x)\neq 0$ and $h_x$ is positive definite. Therefore its square root is smooth and positive, so $u_1:U\to E$ is a smooth section with $h_x(u_1(x),u_1(x))=1$.
Now suppose that smooth orthonormal sections $u_1,\dots,u_{k-1}:U\to E$ have been constructed, where $2\le k\le r$. Define
\begin{align*}
v_k(x)=e_k(x)-\sum_{j=1}^{k-1}h_x(e_k(x),u_j(x))u_j(x).
\end{align*}
This subtracts from $e_k(x)$ its projection onto the already constructed orthonormal span. For each $1\le i<k$, bilinearity of $h_x$ gives
\begin{align*}
h_x(v_k(x),u_i(x))=h_x(e_k(x),u_i(x))-\sum_{j=1}^{k-1}h_x(e_k(x),u_j(x))h_x(u_j(x),u_i(x)).
\end{align*}
Since the sections $u_1,\dots,u_{k-1}$ are orthonormal, $h_x(u_j(x),u_i(x))$ is $1$ when $j=i$ and $0$ otherwise, so this expression is $0$. Thus $v_k(x)$ is orthogonal to each earlier $u_i(x)$.
We also need $v_k(x)\neq 0$. If $v_k(x)=0$, then $e_k(x)$ lies in $\operatorname{span}(u_1(x),\dots,u_{k-1}(x))$. Each $u_j(x)$ lies in $\operatorname{span}(e_1(x),\dots,e_j(x))$, hence in $\operatorname{span}(e_1(x),\dots,e_{k-1}(x))$. Therefore $e_k(x)$ would lie in $\operatorname{span}(e_1(x),\dots,e_{k-1}(x))$, contradicting that $(e_1(x),\dots,e_r(x))$ is a basis of $E_x$.
Hence $h_x(v_k(x),v_k(x))>0$ for every $x\in U$. Since $v_k$ is built from smooth sections using the smooth bundle metric, $x\mapsto h_x(v_k(x),v_k(x))$ is a positive smooth function. Define
\begin{align*}
u_k(x)=\frac{v_k(x)}{\sqrt{h_x(v_k(x),v_k(x))}}.
\end{align*}
Then $u_k$ is smooth, has $h_x(u_k(x),u_k(x))=1$, and is orthogonal to $u_1(x),\dots,u_{k-1}(x)$. Induction gives a smooth local orthonormal frame $(u_1,\dots,u_r)$ on $U$.[/guided]