[proofplan]
The forward direction is immediate: an isomorphism has a two-sided inverse, hence is bijective. The content is the reverse direction: we must show that the set-theoretic inverse $\alpha^{-1}$ of a bijective linear map is automatically linear. This follows by applying $\alpha^{-1}$ to a linear combination in $V$, using linearity of $\alpha$ and injectivity to transfer the computation back to $U$.
[/proofplan]
[step:Deduce bijectivity from the existence of a linear inverse ($\Rightarrow$)]
Suppose $\alpha$ is an isomorphism, so there exists a linear map $\beta: V \to U$ with $\alpha \circ \beta = \operatorname{id}_V$ and $\beta \circ \alpha = \operatorname{id}_U$. Since $\alpha$ has a two-sided inverse as a function, $\alpha$ is bijective.
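To make the last implication explicit, here is a short derivation that uses only the two identities for $\beta$ stated above. Injectivity: if $\alpha(u) = \alpha(u')$ for some $u, u' \in U$, then
\begin{align*}
u = \beta(\alpha(u)) = \beta(\alpha(u')) = u'.
\end{align*}
Surjectivity: every $v \in V$ lies in the image of $\alpha$, since
\begin{align*}
v = \alpha(\beta(v)).
\end{align*}
Note that linearity of $\beta$ plays no role in this direction.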
[/step]
[step:Construct the set-theoretic inverse of a bijective linear map]
Suppose $\alpha: U \to V$ is a bijective linear map. Since $\alpha$ is bijective, the set-theoretic inverse $\alpha^{-1}: V \to U$ exists, satisfying $\alpha \circ \alpha^{-1} = \operatorname{id}_V$ and $\alpha^{-1} \circ \alpha = \operatorname{id}_U$.
[/step]
[step:Show $\alpha^{-1}$ is linear]
Let $v_1, v_2 \in V$ and $\lambda, \mu \in \mathbb{F}$. Set $u_1 = \alpha^{-1}(v_1)$ and $u_2 = \alpha^{-1}(v_2)$, so $\alpha(u_1) = v_1$ and $\alpha(u_2) = v_2$. By linearity of $\alpha$:
\begin{align*}
\alpha(\lambda u_1 + \mu u_2) = \lambda \alpha(u_1) + \mu \alpha(u_2) = \lambda v_1 + \mu v_2.
\end{align*}
Applying $\alpha^{-1}$ to both sides:
\begin{align*}
\lambda u_1 + \mu u_2 = \alpha^{-1}(\lambda v_1 + \mu v_2).
\end{align*}
Substituting back $u_i = \alpha^{-1}(v_i)$:
\begin{align*}
\alpha^{-1}(\lambda v_1 + \mu v_2) = \lambda \alpha^{-1}(v_1) + \mu \alpha^{-1}(v_2).
\end{align*}
Hence $\alpha^{-1}$ is linear, and $\alpha$ is an isomorphism with inverse $\beta = \alpha^{-1}$.
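For a concrete instance of this computation, consider the following example, which is an illustration chosen here and not part of the proof. Take $\mathbb{F} = \mathbb{R}$, $U = V = \mathbb{R}^2$ and $\alpha(x, y) = (x + y, y)$. This map is linear and bijective, with set-theoretic inverse $\alpha^{-1}(a, b) = (a - b, b)$, as one checks from $\alpha(a - b, b) = (a, b)$ and $\alpha^{-1}(x + y, y) = (x, y)$. Verifying linearity of the inverse directly:
\begin{align*}
\alpha^{-1}\bigl(\lambda (a_1, b_1) + \mu (a_2, b_2)\bigr) &= \alpha^{-1}(\lambda a_1 + \mu a_2, \; \lambda b_1 + \mu b_2) \\
&= (\lambda a_1 + \mu a_2 - \lambda b_1 - \mu b_2, \; \lambda b_1 + \mu b_2) \\
&= \lambda (a_1 - b_1, b_1) + \mu (a_2 - b_2, b_2) \\
&= \lambda \alpha^{-1}(a_1, b_1) + \mu \alpha^{-1}(a_2, b_2),
\end{align*}
in agreement with the general argument.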
[guided]
The key point is that linearity of $\alpha^{-1}$ needs no separate machinery: linearity of $\alpha$, combined with injectivity, does the work. Here is the logic in detail.
We want to show $\alpha^{-1}(\lambda v_1 + \mu v_2) = \lambda \alpha^{-1}(v_1) + \mu \alpha^{-1}(v_2)$. The right-hand side is a specific element of $U$; call it $w = \lambda u_1 + \mu u_2$ where $u_i = \alpha^{-1}(v_i)$. We compute $\alpha(w)$ using linearity:
\begin{align*}
\alpha(w) = \alpha(\lambda u_1 + \mu u_2) = \lambda \alpha(u_1) + \mu \alpha(u_2) = \lambda v_1 + \mu v_2.
\end{align*}
Since $\alpha$ is injective (it is bijective by hypothesis), the equation $\alpha(w) = \lambda v_1 + \mu v_2$ has a unique solution, which is $w = \alpha^{-1}(\lambda v_1 + \mu v_2)$. But we also know $w = \lambda u_1 + \mu u_2 = \lambda \alpha^{-1}(v_1) + \mu \alpha^{-1}(v_2)$. Equating the two expressions for $w$ gives the desired linearity.
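Where does this argument use linearity? Only in the computation of $\alpha(w)$: for a bijection $\alpha$ that is not linear, we cannot expand $\alpha(\lambda u_1 + \mu u_2)$, and the argument breaks down. In fact, the inverse of a non-linear bijection between vector spaces is never linear: if $\alpha^{-1}$ were linear, then applying the result just proved to $\alpha^{-1}$ would show that $\alpha = (\alpha^{-1})^{-1}$ is linear.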
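As a concrete illustration (the map below is an example chosen here and does not appear in the argument above), take $f: \mathbb{R} \to \mathbb{R}$ with $f(x) = x^3$, a bijection with inverse $f^{-1}(y) = y^{1/3}$. Neither map is additive:
\begin{align*}
f(1 + 1) = 8 &\neq 2 = f(1) + f(1), \\
f^{-1}(1 + 1) = 2^{1/3} &\neq 2 = f^{-1}(1) + f^{-1}(1).
\end{align*}
So bijectivity alone gives no control over the inverse; the linearity of $\alpha$ is what makes the proof go through.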
[/guided]
[/step]