[guided]We now turn maximality into a first-order optimality condition. Let $\operatorname{Sym}_n$ be the vector space of real symmetric $n\times n$ matrices, with pairing $(M,N)\mapsto\operatorname{tr}(MN)$, and set
\begin{align*}
V:=\operatorname{Sym}_n\times\mathbb{R}^n.
\end{align*}
Define the constraint map
\begin{align*}
q:A&\to V,\\
u&\mapsto (u\otimes u,u).
\end{align*}
The matrix component $u\otimes u$ measures the first-order effect of a linear perturbation in the supporting direction $u$, while the vector component $u$ measures the first-order effect of translation. Let $\mathcal{L}^n$ denote $n$-dimensional Lebesgue measure on $\mathbb{R}^n$.
Assume, toward a contradiction, that $(I_n,0)$ is not in the cone generated by the vectors $q(u)$, $u\in A$. We want to apply the [finite-dimensional separating hyperplane theorem](/page/Separating%20Hyperplane%20Theorem), so we verify its hypotheses. The cone is convex by construction. For closedness, $A$ is compact and $q$ is continuous, so $q(A)$ is compact. Every $u\in A\subset S^{n-1}$ satisfies $\operatorname{tr}(u\otimes u)=|u|^2=1$, so the matrix component of a conic combination $\sum_i c_i q(u_i)$ has trace $\sum_i c_i$; in particular $0\notin q(A)$, and $0$ is not a convex combination of points of $q(A)$ either. By Carathéodory's theorem for cones, every element of the cone is a conic combination of at most $\dim V$ generators. If a sequence of such combinations converges in $V$, the traces of the matrix components converge, so the total coefficients $\sum_i c_i$ stay bounded. Compactness of $A$ and of the bounded coefficient vectors then gives a subsequence along which the generators and the coefficients converge, so the limit remains in the cone. Hence the cone generated by $q(A)$ is closed.
The point $(I_n,0)$ lies outside this closed convex cone by assumption. The separating hyperplane theorem gives a linear functional on $V$ which is non-positive on the cone and positive at $(I_n,0)$. Because $V=\operatorname{Sym}_n\times\mathbb{R}^n$ is finite-dimensional and the trace pairing identifies $\operatorname{Sym}_n^*$ with $\operatorname{Sym}_n$, this functional has the form $(M,z)\mapsto \operatorname{tr}(HM)+b\cdot z$ for some $H\in\operatorname{Sym}_n$ and $b\in\mathbb{R}^n$. Therefore
\begin{align*}
\operatorname{tr}(H)&>0,\\
u\cdot Hu+b\cdot u&\leq 0 \qquad \text{for every }u\in A.
\end{align*}
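As a brief check of where these two inequalities come from, one can evaluate the functional at the two kinds of points involved. The sketch below assumes that $u\otimes u$ denotes the symmetric matrix with entries $u_iu_j$, which is consistent with its role as an element of $\operatorname{Sym}_n$:
\begin{align*}
\operatorname{tr}(HI_n)+b\cdot 0&=\operatorname{tr}(H),\\
\operatorname{tr}\bigl(H(u\otimes u)\bigr)+b\cdot u&=\sum_{i,j=1}^n H_{ij}u_ju_i+b\cdot u=u\cdot Hu+b\cdot u.
\end{align*}
The first line is the value at $(I_n,0)$, which is positive; the second is the value at the generator $q(u)$, which is non-positive because $q(u)$ lies in the cone.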
The non-strict inequality is not enough to control the second-order error in the support function expansion, so we create strict first-order slack. Choose $\varepsilon>0$ such that $\operatorname{tr}(H)-n\varepsilon>0$, and define
\begin{align*}
G:=H-\varepsilon I_n\in\operatorname{Sym}_n.
\end{align*}
Then
\begin{align*}
\operatorname{tr}(G)&=\operatorname{tr}(H)-n\varepsilon>0,\\
u\cdot Gu+b\cdot u&=u\cdot Hu+b\cdot u-\varepsilon|u|^2\leq -\varepsilon \qquad \text{for every }u\in A\subset S^{n-1}.
\end{align*}
The strict negative margin on $A$ is the point of replacing $H$ by $G$: it will absorb the $O(t^2)$ term at active directions.
For $t>0$, define
\begin{align*}
E_t:=tb+(I_n+tG)B(0,1).
\end{align*}
For small $t>0$, the matrix $I_n+tG$ remains positive definite, so $E_t$ is an ellipsoid. Its volume is the determinant of the linear part times the volume of the unit ball:
\begin{align*}
\mathcal{L}^n(E_t)=\det(I_n+tG)\mathcal{L}^n(B(0,1)).
\end{align*}
The derivative of the determinant at the identity in the direction $G$ is $\operatorname{tr}(G)$, so
\begin{align*}
\det(I_n+tG)=1+t\operatorname{tr}(G)+O(t^2).
\end{align*}
Since $\operatorname{tr}(G)>0$, this volume is larger than $\mathcal{L}^n(B(0,1))$ for all sufficiently small positive $t$.
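One way to see the determinant expansion directly is through the eigenvalues of $G$. The symbols $\lambda_1,\dots,\lambda_n$ below are notation introduced only for this sketch; they denote the real eigenvalues of the symmetric matrix $G$, counted with multiplicity:
\begin{align*}
\det(I_n+tG)=\prod_{i=1}^n(1+t\lambda_i)=1+t\sum_{i=1}^n\lambda_i+O(t^2)=1+t\operatorname{tr}(G)+O(t^2).
\end{align*}
For instance, when $n=2$ the expansion is exact: $\det(I_2+tG)=1+t\operatorname{tr}(G)+t^2\det(G)$.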
We must check containment in $K$. For $v\in S^{n-1}$, the support function of $E_t$ is
\begin{align*}
h_{E_t}(v)=t b\cdot v+|(I_n+tG)v|.
\end{align*}
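This formula can be derived in one line, assuming the usual convention $h_C(v)=\sup_{x\in C}v\cdot x$ for the support function. Using the symmetry of $G$ and the identity $\sup_{|x|\leq 1}w\cdot x=|w|$,
\begin{align*}
h_{E_t}(v)=\sup_{|x|\leq 1}v\cdot\bigl(tb+(I_n+tG)x\bigr)=t\,b\cdot v+\sup_{|x|\leq 1}\bigl((I_n+tG)v\bigr)\cdot x=t\,b\cdot v+|(I_n+tG)v|.
\end{align*}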
The norm expansion is uniform in $v\in S^{n-1}$ because the sphere is compact:
\begin{align*}
|(I_n+tG)v|=1+t\,v\cdot Gv+O(t^2).
\end{align*}
Therefore
\begin{align*}
h_{E_t}(v)=1+t(b\cdot v+v\cdot Gv)+O(t^2).
\end{align*}
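For the containment argument only an upper bound on $h_{E_t}$ is needed, and one explicit form of the remainder can be sketched as follows. Here $\|G\|:=\sup_{|v|=1}|Gv|$ is the operator norm, a notation introduced only for this remark. Since $|v|=1$ and $\sqrt{1+s}\leq 1+s/2$ for every $s\geq -1$,
\begin{align*}
|(I_n+tG)v|^2&=1+2t\,v\cdot Gv+t^2|Gv|^2,\\
|(I_n+tG)v|&\leq 1+t\,v\cdot Gv+\tfrac{1}{2}t^2|Gv|^2\leq 1+t\,v\cdot Gv+\tfrac{1}{2}t^2\|G\|^2,
\end{align*}
so that $h_{E_t}(v)\leq 1+t(b\cdot v+v\cdot Gv)+\tfrac{1}{2}t^2\|G\|^2$ for every $v\in S^{n-1}$ and every $t>0$, with a constant independent of $v$.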
Define the continuous first-order coefficient
\begin{align*}
\psi:S^{n-1}&\to\mathbb{R},\\
v&\mapsto b\cdot v+v\cdot Gv.
\end{align*}
On the active set $A$, we have $\psi(v)\leq -\varepsilon$. By continuity of $\psi$, the set $N:=\{v\in S^{n-1}:\psi(v)<-\varepsilon/2\}$ is an open neighbourhood of $A$ in $S^{n-1}$. The remainder in the expansion is uniform, so for all sufficiently small $t>0$ the negative term $-t\varepsilon/2$ dominates the $O(t^2)$ error on $N$, which gives $h_{E_t}\leq 1$ there. Moreover, $h_K\geq h_{B(0,1)}=1$ on $S^{n-1}$ because $B(0,1)\subset K$. Hence
\begin{align*}
h_{E_t}(v)\leq 1\leq h_K(v)\qquad\text{for every }v\in N.
\end{align*}
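To make "sufficiently small" concrete, here is one admissible threshold. It is a sketch under the following assumptions: $\|G\|:=\sup_{|v|=1}|Gv|$ is the operator norm (notation introduced here, and positive because $\operatorname{tr}(G)>0$ forces $G\neq 0$), and the remainder is estimated with the elementary bound $\sqrt{1+s}\leq 1+s/2$, which yields $|(I_n+tG)v|\leq 1+t\,v\cdot Gv+\tfrac{1}{2}t^2\|G\|^2$ on $S^{n-1}$. For $v\in N$ and $t>0$ this gives
\begin{align*}
h_{E_t}(v)\leq 1-\tfrac{1}{2}t\varepsilon+\tfrac{1}{2}t^2\|G\|^2=1-\tfrac{1}{2}t\bigl(\varepsilon-t\|G\|^2\bigr)\leq 1\qquad\text{whenever }0<t\leq \varepsilon/\|G\|^2.
\end{align*}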
Away from $A$, compactness supplies a genuine support-function gap. Since $S^{n-1}\setminus N$ is compact and does not meet $A=\{v:h_K(v)=1\}$, the continuous function $h_K-1$ has a positive minimum $\delta>0$ on $S^{n-1}\setminus N$. The function $\psi$ is bounded on $S^{n-1}$, and the remainder is uniform, so after reducing $t>0$ if necessary we have
\begin{align*}
h_{E_t}(v)\leq 1+\delta\leq h_K(v)\qquad\text{for every }v\in S^{n-1}\setminus N.
\end{align*}
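Away from $A$ no expansion is needed, and a crude bound already gives an explicit admissible range of $t$. As before, $\|G\|:=\sup_{|v|=1}|Gv|$ is notation assumed only for this sketch. By the Cauchy-Schwarz and triangle inequalities, for every $v\in S^{n-1}$ and $t>0$,
\begin{align*}
h_{E_t}(v)=t\,b\cdot v+|(I_n+tG)v|\leq t|b|+1+t\|G\|\leq 1+\delta\qquad\text{whenever }0<t\leq \frac{\delta}{|b|+\|G\|}.
\end{align*}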
Combining the estimates on $N$ and on $S^{n-1}\setminus N$ gives
\begin{align*}
h_{E_t}(v)\leq h_K(v)\qquad\text{for every }v\in S^{n-1}.
\end{align*}
We now use the support-function containment criterion for compact convex sets: if $C,D\subset\mathbb{R}^n$ are compact and convex, then $C\subset D$ exactly when $h_C(v)\leq h_D(v)$ for every $v\in S^{n-1}$. The sets $E_t$ and $K$ satisfy these hypotheses, so the support-function inequality implies $E_t\subset K$ for all sufficiently small $t>0$. We have produced an ellipsoid inside $K$ with larger volume than $B(0,1)$, which contradicts the assumption that $K$ is in [John position](/page/John%20Position), where $B(0,1)$ has maximal volume among the ellipsoids contained in $K$. Thus
\begin{align*}
(I_n,0)\in \operatorname{cone}\{(u\otimes u,u):u\in A\}.
\end{align*}[/guided]