[proofplan]
The proof first shows that every implicit Euler step is well-defined by restricting the minimization to a compact sublevel set. Testing each minimization problem with the previous discrete point gives energy monotonicity and a discrete action estimate. These estimates put every discrete point in one fixed compact set and give a uniform square-root modulus of continuity, up to a vanishing time-step error, for the piecewise-constant interpolations. Compactness at rational times, together with that modulus, yields locally uniform convergence on every finite time interval by a diagonal argument.
[/proofplan]
[step:Show that each implicit Euler minimization has a solution]
Fix $\tau>0$ and suppose that $k\in\mathbb N\cup\{0\}$ is such that $x_k^\tau\in X$ and $E(x_k^\tau)<\infty$. Define the auxiliary functional $\Phi_{k,\tau}:X\to(-\infty,\infty]$ by
\begin{align*}
\Phi_{k,\tau}(y)=E(y)+\frac{1}{2\tau}d(y,x_k^\tau)^2.
\end{align*}
The map $y\mapsto d(y,x_k^\tau)^2$ is continuous on $X$, and $E$ is lower semicontinuous, so $\Phi_{k,\tau}$ is lower semicontinuous.
Let
\begin{align*}
K_k^\tau:=\{y\in X:E(y)\le E(x_k^\tau)\}.
\end{align*}
By hypothesis, $K_k^\tau$ is compact, and it is nonempty because it contains $x_k^\tau$. Since $\Phi_{k,\tau}(x_k^\tau)=E(x_k^\tau)<\infty$, the infimum of $\Phi_{k,\tau}$ is at most $E(x_k^\tau)$, and it is enough to minimize on $K_k^\tau$: if $y\notin K_k^\tau$, then $E(y)>E(x_k^\tau)$ and the distance term is nonnegative, hence
\begin{align*}
\Phi_{k,\tau}(y)>E(x_k^\tau)=\Phi_{k,\tau}(x_k^\tau).
\end{align*}
Thus no minimizer can lie outside $K_k^\tau$.
Because $\Phi_{k,\tau}$ is lower semicontinuous on the compact metric space $K_k^\tau$, it attains its minimum there, and by the strict inequality above this minimizer also minimizes $\Phi_{k,\tau}$ over all of $X$. Choose a minimizer and call it $x_{k+1}^\tau$. Then $E(x_{k+1}^\tau)\le E(x_k^\tau)<\infty$, so the construction continues by induction from $x_0^\tau=x_0$.
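As a sanity check of the construction, consider the following illustrative special case, which is an assumption made here and not part of the statement being proved: $X=\mathbb R$, $d(x,y)=|x-y|$, and $E(x)=\tfrac12x^2$. The sublevel sets $\{E\le c\}=[-\sqrt{2c},\sqrt{2c}]$ are compact for $c\ge0$, and the step functional is a strictly convex quadratic, so its unique minimizer is found from the first-order condition:
\begin{align*}
\Phi_{k,\tau}(y)&=\frac12y^2+\frac{1}{2\tau}(y-x_k^\tau)^2,\\
0=\Phi_{k,\tau}'(y)&=y+\frac{1}{\tau}(y-x_k^\tau)
\quad\Longleftrightarrow\quad
y=\frac{x_k^\tau}{1+\tau}.
\end{align*}
Thus $x_{k+1}^\tau=x_k^\tau/(1+\tau)$ in this example, which satisfies $|x_{k+1}^\tau|\le|x_k^\tau|$ and therefore lies in $K_k^\tau$, as the general argument predicts. Here the scheme coincides with the implicit Euler method for $x'=-x$.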
[guided]
We must first justify that the recursive definition is not empty. Fix a time step $\tau>0$ and assume that the old point $x_k^\tau$ has already been constructed with finite energy. The functional minimized at this step is
\begin{align*}
\Phi_{k,\tau}:X&\to(-\infty,\infty],\\
y&\mapsto E(y)+\frac{1}{2\tau}d(y,x_k^\tau)^2.
\end{align*}
The squared distance term is continuous as a function of $y$, because the metric $d$ is continuous in each variable. Since $E$ is lower semicontinuous, the sum $\Phi_{k,\tau}$ is lower semicontinuous.
The compactness hypothesis applies to sublevels of $E$, not directly to sublevels of $\Phi_{k,\tau}$. The key observation is that the previous point $x_k^\tau$ is an admissible competitor. Define
\begin{align*}
K_k^\tau:=\{y\in X:E(y)\le E(x_k^\tau)\}.
\end{align*}
This set is compact by the compact-sublevel hypothesis. If $y\notin K_k^\tau$, then $E(y)>E(x_k^\tau)$, and since the metric term is nonnegative,
\begin{align*}
\Phi_{k,\tau}(y)
=
E(y)+\frac{1}{2\tau}d(y,x_k^\tau)^2
>
E(x_k^\tau)
=
\Phi_{k,\tau}(x_k^\tau).
\end{align*}
Therefore a point outside $K_k^\tau$ cannot minimize $\Phi_{k,\tau}$, because the old point $x_k^\tau$ gives a strictly smaller value.
Thus the minimization problem over $X$ is equivalent to the minimization problem over the compact metric space $K_k^\tau$, which is nonempty because it contains $x_k^\tau$. A lower semicontinuous extended-real-valued function on a nonempty compact metric space attains its minimum: take a minimizing sequence, pass to a convergent subsequence by compactness, and use lower semicontinuity to pass the inequality to the limit. Applying this to $\Phi_{k,\tau}|_{K_k^\tau}$ gives a minimizer $x_{k+1}^\tau\in K_k^\tau$. Since $x_{k+1}^\tau\in K_k^\tau$, we also get $E(x_{k+1}^\tau)\le E(x_k^\tau)<\infty$, so the next step is well-defined. Induction constructs the whole sequence $(x_k^\tau)_{k\ge 0}$.
[/guided]
[/step]
[step:Derive the discrete energy monotonicity and action estimate]
Fix $\tau>0$. For every $k\ge 0$, the minimality of $x_{k+1}^\tau$ and the admissible competitor $y=x_k^\tau$ give
\begin{align*}
E(x_{k+1}^\tau)+\frac{1}{2\tau}d(x_{k+1}^\tau,x_k^\tau)^2
\le
E(x_k^\tau).
\end{align*}
Hence
\begin{align*}
E(x_{k+1}^\tau)\le E(x_k^\tau)
\end{align*}
for every $k\ge 0$. Therefore
\begin{align*}
E(x_k^\tau)\le E(x_0)
\end{align*}
for every $k\ge 0$.
Define the fixed compact sublevel set
\begin{align*}
K:=\{x\in X:E(x)\le E(x_0)\}.
\end{align*}
Every point $x_k^\tau$ belongs to $K$. Since $K$ is compact and nonempty (it contains $x_0$) and $E$ is lower semicontinuous, $E$ attains its minimum on $K$. Define
\begin{align*}
m:=\min_{x\in K}E(x).
\end{align*}
Then $m>-\infty$ because $E$ takes values in $(-\infty,\infty]$ and the minimum is attained at a point of $K$.
Summing the one-step inequality from $k=0$ to $k=N-1$ gives, for every $N\in\mathbb N$,
\begin{align*}
\frac{1}{2\tau}\sum_{k=0}^{N-1}d(x_{k+1}^\tau,x_k^\tau)^2
\le
E(x_0)-E(x_N^\tau)
\le
E(x_0)-m.
\end{align*}
Equivalently,
\begin{align*}
\sum_{k=0}^{N-1}d(x_{k+1}^\tau,x_k^\tau)^2
\le
2\tau(E(x_0)-m).
\end{align*}
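To see the estimate at work, one may test it on a hypothetical example that is not part of the proof: $X=\mathbb R$, $d(x,y)=|x-y|$, $E(x)=\tfrac12x^2$. In this case each step has the unique minimizer $x_{k+1}^\tau=x_k^\tau/(1+\tau)$, and $m=0$. Then $x_k^\tau-x_{k+1}^\tau=\tau x_{k+1}^\tau$ and $x_k^\tau=x_0(1+\tau)^{-k}$, so
\begin{align*}
\sum_{k=0}^{N-1}d(x_{k+1}^\tau,x_k^\tau)^2
=\tau^2x_0^2\sum_{j=1}^{N}(1+\tau)^{-2j}
\le\frac{\tau^2x_0^2}{(1+\tau)^2-1}
=\frac{\tau x_0^2}{2+\tau}
\le\tau x_0^2
=2\tau(E(x_0)-m).
\end{align*}
As $N\to\infty$ the sum tends to $\tau x_0^2/(2+\tau)$, which is of order $\tau$ and so matches the order of the general bound.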
[/step]
[step:Convert the discrete action bound into a uniform metric modulus]
Let $0\le s\le t<\infty$. For $\tau>0$, define integers $p_\tau,q_\tau\in\mathbb N\cup\{0\}$ by requiring
\begin{align*}
s\in[p_\tau\tau,(p_\tau+1)\tau)
\end{align*}
and
\begin{align*}
t\in[q_\tau\tau,(q_\tau+1)\tau).
\end{align*}
If $p_\tau=q_\tau$, then
\begin{align*}
d(\bar{x}_\tau(s),\bar{x}_\tau(t))=0.
\end{align*}
If $p_\tau<q_\tau$, the triangle inequality gives
\begin{align*}
d(\bar{x}_\tau(s),\bar{x}_\tau(t))
\le
\sum_{k=p_\tau}^{q_\tau-1}d(x_{k+1}^\tau,x_k^\tau).
\end{align*}
Applying the Cauchy--Schwarz inequality to the finite sum over the index set $\{p_\tau,\dots,q_\tau-1\}$ yields
\begin{align*}
d(\bar{x}_\tau(s),\bar{x}_\tau(t))^2
\le
(q_\tau-p_\tau)
\sum_{k=p_\tau}^{q_\tau-1}d(x_{k+1}^\tau,x_k^\tau)^2.
\end{align*}
The action estimate gives
\begin{align*}
\sum_{k=p_\tau}^{q_\tau-1}d(x_{k+1}^\tau,x_k^\tau)^2
\le
2\tau(E(x_0)-m).
\end{align*}
Moreover, from the definitions of $p_\tau$ and $q_\tau$,
\begin{align*}
(q_\tau-p_\tau)\tau\le t-s+\tau.
\end{align*}
Therefore
\begin{align*}
d(\bar{x}_\tau(s),\bar{x}_\tau(t))^2
\le
2(E(x_0)-m)(t-s+\tau).
\end{align*}
Thus, for all $0\le s\le t<\infty$,
\begin{align*}
d(\bar{x}_\tau(s),\bar{x}_\tau(t))
\le
\sqrt{2(E(x_0)-m)}\sqrt{t-s+\tau}.
\end{align*}
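The additive $\tau$ under the square root cannot be dropped. A short check, included here only as an illustration: fix an integer $q\ge1$ and $0<\eta<\tau$, and take $s=q\tau-\eta$ and $t=q\tau$, so that $p_\tau=q-1$ and $q_\tau=q$. Then
\begin{align*}
d(\bar{x}_\tau(s),\bar{x}_\tau(t))
=d(x_{q-1}^\tau,x_q^\tau)
\le\sqrt{2(E(x_0)-m)}\sqrt{\eta+\tau}.
\end{align*}
The left-hand side does not depend on $\eta$, whereas $\sqrt{t-s}=\sqrt{\eta}$ tends to $0$ as $\eta\to0$. A bound of the form $C\sqrt{t-s}$ alone would therefore force every jump of the discrete curve to vanish; the term $\tau$ is what absorbs a single jump.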
[guided]
For fixed $\tau$, the piecewise-constant curves are not even continuous, because they jump at the grid times, so the family cannot be equicontinuous in the usual sense. What we need is a modulus whose defect disappears as $\tau\to 0$.
Fix $0\le s\le t<\infty$. Define $p_\tau,q_\tau\in\mathbb N\cup\{0\}$ by
\begin{align*}
s\in[p_\tau\tau,(p_\tau+1)\tau)
\end{align*}
and
\begin{align*}
t\in[q_\tau\tau,(q_\tau+1)\tau).
\end{align*}
Then $\bar{x}_\tau(s)=x_{p_\tau}^\tau$ and $\bar{x}_\tau(t)=x_{q_\tau}^\tau$. If $p_\tau=q_\tau$, the two values are equal, so the distance is zero.
Assume now that $p_\tau<q_\tau$. The triangle inequality along the discrete chain gives
\begin{align*}
d(\bar{x}_\tau(s),\bar{x}_\tau(t))
=
d(x_{p_\tau}^\tau,x_{q_\tau}^\tau)
\le
\sum_{k=p_\tau}^{q_\tau-1}d(x_{k+1}^\tau,x_k^\tau).
\end{align*}
This estimate involves the sum of the jump lengths, but the energy inequality controls the sum of the squared jump lengths. To pass from one to the other, apply the Cauchy--Schwarz inequality to the finite family of nonnegative numbers
\begin{align*}
a_k:=d(x_{k+1}^\tau,x_k^\tau),
\qquad
k\in\{p_\tau,\dots,q_\tau-1\}.
\end{align*}
It gives
\begin{align*}
\left(\sum_{k=p_\tau}^{q_\tau-1}a_k\right)^2
\le
(q_\tau-p_\tau)\sum_{k=p_\tau}^{q_\tau-1}a_k^2.
\end{align*}
Substituting the definition of $a_k$,
\begin{align*}
d(\bar{x}_\tau(s),\bar{x}_\tau(t))^2
\le
(q_\tau-p_\tau)
\sum_{k=p_\tau}^{q_\tau-1}d(x_{k+1}^\tau,x_k^\tau)^2.
\end{align*}
The partial sum is bounded by the full action sum, so the previous step gives
\begin{align*}
\sum_{k=p_\tau}^{q_\tau-1}d(x_{k+1}^\tau,x_k^\tau)^2
\le
2\tau(E(x_0)-m).
\end{align*}
It remains to relate the number of jumps to the time interval length. Since $q_\tau\tau\le t$ and $s<(p_\tau+1)\tau$, that is, $p_\tau\tau>s-\tau$, we have
\begin{align*}
(q_\tau-p_\tau)\tau
=
q_\tau\tau-p_\tau\tau
\le
t-s+\tau.
\end{align*}
Combining these estimates gives
\begin{align*}
d(\bar{x}_\tau(s),\bar{x}_\tau(t))^2
\le
2(E(x_0)-m)(t-s+\tau).
\end{align*}
Taking square roots yields
\begin{align*}
d(\bar{x}_\tau(s),\bar{x}_\tau(t))
\le
\sqrt{2(E(x_0)-m)}\sqrt{t-s+\tau}.
\end{align*}
This is the exact replacement for equicontinuity: the additional $\tau$ accounts for the possible jump at a grid point, and it vanishes along every sequence of time steps tending to zero.
[/guided]
[/step]
[step:Extract pointwise limits at rational times]
Let
\begin{align*}
\mathbb Q_+:=\mathbb Q\cap[0,\infty).
\end{align*}
The set $\mathbb Q_+$ is countable, so choose an enumeration
\begin{align*}
\mathbb Q_+=\{r_i:i\in\mathbb N\}.
\end{align*}
For every $j\ge 1$ and every $t\ge 0$, the point $\bar{x}_{\tau_j}(t)$ belongs to $K$. Since $K$ is compact, the sequence $(\bar{x}_{\tau_j}(r_1))_{j\ge 1}$ has a convergent subsequence in $K$. From that subsequence, extract a further subsequence for which the values at $r_2$ converge. Continuing inductively and taking the diagonal subsequence, we obtain a subsequence $(\tau_{j_\ell})_{\ell\ge 1}$ such that, for every $r\in\mathbb Q_+$, the sequence $(\bar{x}_{\tau_{j_\ell}}(r))_{\ell\ge 1}$ converges in $K$.
Define $x_{\mathbb Q}:\mathbb Q_+\to K$ by
\begin{align*}
x_{\mathbb Q}(r)=\lim_{\ell\to\infty}\bar{x}_{\tau_{j_\ell}}(r).
\end{align*}
Fix $r,s\in\mathbb Q_+$ with $r\le s$. Applying the modulus estimate to $\bar{x}_{\tau_{j_\ell}}$ and letting $\ell\to\infty$, using the continuity of $d$ and $\tau_{j_\ell}\to0$, gives
\begin{align*}
d(x_{\mathbb Q}(r),x_{\mathbb Q}(s))
\le
\sqrt{2(E(x_0)-m)}\sqrt{s-r}.
\end{align*}
Thus $x_{\mathbb Q}$ is uniformly continuous on $\mathbb Q_+$, and in particular on every bounded subset of $\mathbb Q_+$.
[/step]
[step:Extend the rational-time limit to a continuous curve]
Fix $T>0$. Since $x_{\mathbb Q}$ is uniformly continuous on $\mathbb Q\cap[0,T]$ and takes values in the compact, hence complete, set $K$, it has a unique continuous extension to $[0,T]$, again with values in $K$. These extensions agree on overlapping intervals because they agree on the dense set of rational times. Hence there is a unique continuous map
\begin{align*}
x:[0,\infty)\to X
\end{align*}
such that $x(r)=x_{\mathbb Q}(r)$ for every $r\in\mathbb Q_+$.
Since $0\in\mathbb Q_+$ and $\bar{x}_{\tau_{j_\ell}}(0)=x_0$ for every $\ell\ge 1$, we have
\begin{align*}
x(0)=x_{\mathbb Q}(0)=x_0.
\end{align*}
Moreover, the estimate for $x_{\mathbb Q}$ extends by continuity to all $0\le s\le t<\infty$:
\begin{align*}
d(x(s),x(t))
\le
\sqrt{2(E(x_0)-m)}\sqrt{t-s}.
\end{align*}
In particular, $x$ is H\"older continuous with exponent $1/2$ on $[0,\infty)$.
[/step]
[step:Upgrade rational-time convergence to locally uniform convergence]
Fix $T>0$ and $\varepsilon>0$. Let
\begin{align*}
A:=\sqrt{2(E(x_0)-m)}.
\end{align*}
Choose $\delta>0$ such that
\begin{align*}
A\sqrt{2\delta}<\frac{\varepsilon}{3}.
\end{align*}
Choose $L\in\mathbb N$ and rational points $0=r_0<r_1<\dots<r_L\le T$ such that for every $t\in[0,T]$ there exists $i\in\{0,\dots,L\}$ with
\begin{align*}
|t-r_i|<\delta.
\end{align*}
Such a finite net exists because $\mathbb Q\cap[0,T]$ is dense in $[0,T]$; the endpoint $T$ need not belong to it.
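One admissible way to make these choices explicit, offered only as a sketch: if $A>0$, take
\begin{align*}
\delta:=\frac{\varepsilon^2}{36A^2},
\qquad\text{so that}\qquad
A\sqrt{2\delta}=\frac{\sqrt2}{6}\,\varepsilon<\frac{\varepsilon}{3},
\end{align*}
and if $A=0$, any $\delta>0$ works. For the net, pick a rational number $h\in(0,\min\{\delta,T\})$ (the symbol $h$ is introduced here only for illustration), and set $L:=\lfloor T/h\rfloor$ and $r_i:=ih$ for $0\le i\le L$. Every $t\in[0,T]$ then satisfies
\begin{align*}
0\le t-r_i<h<\delta
\qquad\text{with}\qquad
i=\lfloor t/h\rfloor.
\end{align*}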
For each fixed $i\in\{0,\dots,L\}$, the convergence
\begin{align*}
\bar{x}_{\tau_{j_\ell}}(r_i)\to x(r_i)
\end{align*}
holds as $\ell\to\infty$. Since there are finitely many indices, choose $\ell_0\in\mathbb N$ such that, for every $\ell\ge \ell_0$ and every $i\in\{0,\dots,L\}$,
\begin{align*}
d(\bar{x}_{\tau_{j_\ell}}(r_i),x(r_i))<\frac{\varepsilon}{3}.
\end{align*}
Increase $\ell_0$ if necessary so that $\tau_{j_\ell}<\delta$ for all $\ell\ge\ell_0$.
Now fix $\ell\ge\ell_0$ and $t\in[0,T]$. Choose $i$ with $|t-r_i|<\delta$. The modulus estimate for $\bar{x}_{\tau_{j_\ell}}$ gives
\begin{align*}
d(\bar{x}_{\tau_{j_\ell}}(t),\bar{x}_{\tau_{j_\ell}}(r_i))
\le
A\sqrt{|t-r_i|+\tau_{j_\ell}}
\le
A\sqrt{2\delta}
<
\frac{\varepsilon}{3}.
\end{align*}
The continuity estimate for $x$ gives
\begin{align*}
d(x(r_i),x(t))
\le
A\sqrt{|t-r_i|}
\le
A\sqrt{2\delta}
<
\frac{\varepsilon}{3}.
\end{align*}
By the triangle inequality,
\begin{align*}
d(\bar{x}_{\tau_{j_\ell}}(t),x(t))
\le
d(\bar{x}_{\tau_{j_\ell}}(t),\bar{x}_{\tau_{j_\ell}}(r_i))
+
d(\bar{x}_{\tau_{j_\ell}}(r_i),x(r_i))
+
d(x(r_i),x(t))
<
\varepsilon.
\end{align*}
Since $t\in[0,T]$ was arbitrary,
\begin{align*}
\sup_{0\le t\le T}d(\bar{x}_{\tau_{j_\ell}}(t),x(t))\le\varepsilon
\end{align*}
for all $\ell\ge\ell_0$. Therefore
\begin{align*}
\sup_{0\le t\le T}d(\bar{x}_{\tau_{j_\ell}}(t),x(t))\to0
\end{align*}
as $\ell\to\infty$.
[/step]
[step:Conclude the existence of a minimizing movement]
The subsequence $(\tau_{j_\ell})_{\ell\ge 1}$ and the continuous curve $x:[0,\infty)\to X$ constructed above satisfy $x(0)=x_0$ and, for every $T>0$,
\begin{align*}
\sup_{0\le t\le T}d(\bar{x}_{\tau_{j_\ell}}(t),x(t))\to0.
\end{align*}
This is precisely the locally uniform convergence required in the definition of a minimizing movement along the vanishing time steps $(\tau_{j_\ell})_{\ell\ge 1}$. Hence a minimizing movement for $E$ starting from $x_0$ exists along a subsequence of every sequence $\tau_j\to0$.
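For orientation, here is what the conclusion gives in a hypothetical example outside the proof: $X=\mathbb R$, $d(x,y)=|x-y|$, $E(x)=\tfrac12x^2$. Each step has the unique minimizer $x_{k+1}^\tau=x_k^\tau/(1+\tau)$, so with $\bar{x}_\tau(t)=x_k^\tau$ for $t\in[k\tau,(k+1)\tau)$,
\begin{align*}
\bar{x}_\tau(t)=x_0(1+\tau)^{-\lfloor t/\tau\rfloor}
\longrightarrow
x_0e^{-t}
\qquad\text{as }\tau\to0,
\end{align*}
uniformly on each $[0,T]$, because $\tau\lfloor t/\tau\rfloor\to t$ uniformly in $t$ and $\tau^{-1}\log(1+\tau)\to1$. The limit $x(t)=x_0e^{-t}$ solves $x'=-E'(x)$ with $x(0)=x_0$. In this example the whole family converges, so no subsequence is needed; the general argument above only provides convergence along a subsequence.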
[/step]