[guided]The goal is to produce many points in $T$ that cannot be too close to each other. We use only supports, not signs. The delicate point is that a crude greedy argument over all $k$-subsets loses too much when $d/k$ is bounded, so we build the supports with a block structure.
Let $N:=\lfloor d/k\rfloor$. Since $k\leq d/2$, there are at least two available coordinates per block. Choose pairwise disjoint blocks
\begin{align*}
P_1,\dots,P_k\subset\{1,\dots,d\}
\end{align*}
with $|P_j|=N$ for every $j\in\{1,\dots,k\}$. Let $\mathcal{S}$ be the set of supports that choose exactly one coordinate from each block:
\begin{align*}
\mathcal{S}:=\{S\subset\bigcup_{j=1}^k P_j: |S\cap P_j|=1 \text{ for every } j\in\{1,\dots,k\}\}.
\end{align*}
Every element of $\mathcal{S}$ has cardinality $k$, and there are $N$ independent choices in each of the $k$ blocks, so
\begin{align*}
|\mathcal{S}|=N^k.
\end{align*}
Define the block Hamming distance $\rho:\mathcal{S}\times\mathcal{S}\to\{0,1,\dots,k\}$ by
\begin{align*}
\rho(S,R):=|\{j\in\{1,\dots,k\}:S\cap P_j\neq R\cap P_j\}|.
\end{align*}
We choose $\mathcal{F}\subset\mathcal{S}$ maximal under the requirement that distinct elements have $\rho$-distance greater than $k/2$. Because of maximality, every $R\in\mathcal{S}$ lies within $\rho$-distance at most $k/2$ of some chosen $S\in\mathcal{F}$; otherwise we could add $R$ to $\mathcal{F}$.
For a fixed $S\in\mathcal{S}$, count the supports $R\in\mathcal{S}$ with $\rho(S,R)\leq k/2$. If exactly $\ell$ blocks are changed, there are $\binom{k}{\ell}$ choices of those blocks and at most $(N-1)^\ell$ choices for the new coordinates. Hence the ball of radius $\lfloor k/2\rfloor$ has cardinality at most
\begin{align*}
\sum_{\ell=0}^{\lfloor k/2\rfloor}\binom{k}{\ell}(N-1)^\ell
\leq
2^kN^{k/2}.
\end{align*}
Since these balls cover $\mathcal{S}$, division gives
\begin{align*}
|\mathcal{F}|\geq \frac{N^k}{2^kN^{k/2}}=\left(\frac{N}{4}\right)^{k/2}.
\end{align*}
For $N\geq8$, this is at least $(d/(8k))^{k/2}$ because $N=\lfloor d/k\rfloor\geq d/(2k)$. Now define, for each $S\in\mathcal{F}$, the normalized indicator vector $u_S\in\mathbb{R}^d$ by
\begin{align*}
(u_S)_i=
\begin{cases}
k^{-1/2}, & i\in S,\cr
0, & i\notin S.
\end{cases}
\end{align*}
This vector is $k$-sparse and has norm one:
\begin{align*}
|u_S|^2=\sum_{i\in S}k^{-1}=1.
\end{align*}
Hence $u_S\in T$. Let
\begin{align*}
M:=\{u_S:S\in\mathcal{F}\}.
\end{align*}
If $S,R\in\mathcal{F}$ are distinct, then $\rho(S,R)>k/2$. In each block where the choices differ, one coordinate belongs to $S\setminus R$ and one coordinate belongs to $R\setminus S$, so $|S\cap R|=k-\rho(S,R)<k/2$. Therefore
\begin{align*}
|u_S-u_R|^2
=
\frac{|S\triangle R|}{k}
=
\frac{2k-2|S\cap R|}{k}
\geq 1.
\end{align*}
Since $d/k\geq8$ in this case, this gives $|M|\geq\exp(\gamma_1 k\log(d/k))$ for a universal constant $\gamma_1>0$.
For the remaining bounded ratios $2\leq N<8$, supports alone with separation greater than $k/2$ are not the right tool: a binary code of relative distance greater than $1/2$ has no positive exponential rate. Instead we keep one support fixed and use signs. Choose a fixed set $Q\subset\{1,\dots,d\}$ with $|Q|=k$. Define the Hamming distance $h:(\{-1,1\}^Q)\times(\{-1,1\}^Q)\to\{0,1,\dots,k\}$ by
\begin{align*}
h(\varepsilon,\delta):=|\{i\in Q:\varepsilon_i\neq\delta_i\}|.
\end{align*}
Choose $\mathcal{E}\subset\{-1,1\}^Q$ maximal under the condition that distinct sign vectors have Hamming distance greater than $k/4$. Maximality means that the Hamming balls of radius $\lfloor k/4\rfloor$ centred at $\mathcal{E}$ cover the whole sign cube. A ball of this radius has cardinality at most
\begin{align*}
\sum_{\ell=0}^{\lfloor k/4\rfloor}\binom{k}{\ell}
\leq
\exp(kH_0(1/4)),
\end{align*}
where $H_0(t):=-t\log t-(1-t)\log(1-t)$ is the binary entropy function with natural logarithms. Since $H_0(1/4)<\log 2$, division by the largest possible ball gives
\begin{align*}
|\mathcal{E}|
\geq
\exp(k(\log 2-H_0(1/4))).
\end{align*}
After decreasing the constant to handle the finitely many small values of $k$, this is at least $\exp(\gamma_2 k)$ for a universal constant $\gamma_2>0$. Because $2\leq d/k<8$, the quantities $k$ and $k\log(d/k)$ are comparable by universal constants, so $|\mathcal{E}|\geq\exp(\gamma_3 k\log(d/k))$ for a universal constant $\gamma_3>0$.
For each $\varepsilon\in\mathcal{E}$, define $v_\varepsilon\in\mathbb{R}^d$ by
\begin{align*}
(v_\varepsilon)_i=
\begin{cases}
k^{-1/2}\varepsilon_i, & i\in Q,\cr
0, & i\notin Q.
\end{cases}
\end{align*}
Then $v_\varepsilon$ is supported on $Q$, so it is $k$-sparse, and
\begin{align*}
|v_\varepsilon|^2=\sum_{i\in Q}k^{-1}=1.
\end{align*}
Thus $v_\varepsilon\in T$. Let $M:=\{v_\varepsilon:\varepsilon\in\mathcal{E}\}$. If $\varepsilon\neq\delta$, the vector $v_\varepsilon-v_\delta$ has magnitude $2k^{-1/2}$ exactly on the $h(\varepsilon,\delta)$ coordinates where the signs differ. Hence
\begin{align*}
|v_\varepsilon-v_\delta|^2
=
\frac{4h(\varepsilon,\delta)}{k}
\geq 1.
\end{align*}
Combining the two cases, there is a universal constant $\gamma>0$ such that $M\subset T$, $|M|\geq\exp(\gamma k\log(d/k))$, and $|x-z|\geq1$ for distinct $x,z\in M$.[/guided]