[proofplan]
We separate the active and inactive inequality constraints at the local minimizer. Inactive constraints remain inactive in a neighbourhood, so only the equality constraints and active inequalities affect first-order feasible directions. LICQ lets us realize every strictly feasible linearized direction as the initial velocity of a differentiable feasible curve, and it lets us approximate every linearized feasible direction by such strict directions. Local minimality then forces $\nabla F(z^*)\cdot d$ to be nonnegative for every $d$ in the linearized feasible cone. A finite-dimensional polar-cone/Farkas argument converts that first-order inequality into the existence of equality multipliers and nonnegative active inequality multipliers; extending the inactive multipliers by zero gives the stated KKT system.
[/proofplan]
[step:Discard inactive inequalities locally and define the linearized feasible cone]
Write $I:=I(z^*)$. Since $G_i(z^*)<0$ for every $i\notin I$ and each component $G_i:U\to\mathbb{R}$ is continuous at $z^*$, there is an open neighbourhood $V\subset U$ of $z^*$ such that
\begin{align*}
G_i(z)<0 \quad \text{for every } z\in V \text{ and every } i\notin I.
\end{align*}
Thus, near $z^*$, inactive inequalities impose no first-order restriction.
Define the linearized feasible cone $K\subset\mathbb{R}^q$ by
\begin{align*}
K:=\{d\in\mathbb{R}^q : DA_{z^*}d=0 \text{ and } DG_{i,z^*}d\leq 0 \text{ for every } i\in I\}.
\end{align*}
Here $DA_{z^*}:\mathbb{R}^q\to\mathbb{R}^a$ is the total derivative of $A$ at $z^*$, and $DG_{i,z^*}:\mathbb{R}^q\to\mathbb{R}$ is the total derivative of the scalar map $G_i$ at $z^*$.
[/step]
[step:Realize strict linearized directions by feasible curves]
Let $d\in\mathbb{R}^q$ satisfy
\begin{align*}
DA_{z^*}d=0
\end{align*}
and
\begin{align*}
DG_{i,z^*}d<0 \quad \text{for every } i\in I.
\end{align*}
We prove that there exist $\varepsilon>0$ and a differentiable curve $\gamma:[0,\varepsilon)\to U$ such that $\gamma(0)=z^*$, $\gamma'(0)=d$, and $\gamma(t)\in\mathcal{F}$ for every sufficiently small $t\geq 0$.
First treat the equality constraints. If $a=0$, define $\gamma(t):=z^*+td$ for all sufficiently small $t\geq 0$ with $z^*+td\in U$; then $A(\gamma(t))=0$ is vacuous. If $a>0$, the LICQ family contains $\{\nabla A_j(z^*) : j\in\{1,\dots,a\}\}$ as a linearly independent subfamily, so $DA_{z^*}:\mathbb{R}^q\to\mathbb{R}^a$ is surjective. Since $A$ is $C^1$ near $z^*$ and $A(z^*)=0$, the finite-dimensional [implicit function theorem](/theorems/52) gives a $C^1$ local parametrization of $A^{-1}(\{0\})$ near $z^*$ whose tangent space at $z^*$ is $\ker DA_{z^*}$. Because $d\in\ker DA_{z^*}$, there are $\varepsilon>0$ and a differentiable curve $\gamma:[0,\varepsilon)\to U$ such that
\begin{align*}
\gamma(0)=z^*, \quad \gamma'(0)=d, \quad A(\gamma(t))=0 \text{ for every } t\in[0,\varepsilon).
\end{align*}
For each active index $i\in I$, differentiability of $G_i$ at $z^*$ along the curve $\gamma$ gives
\begin{align*}
G_i(\gamma(t))=G_i(z^*)+tDG_{i,z^*}d+r_i(t),
\end{align*}
where $r_i:[0,\varepsilon)\to\mathbb{R}$ satisfies $r_i(t)/t\to 0$ as $t\downarrow 0$. Since $G_i(z^*)=0$ and $DG_{i,z^*}d<0$, we have $G_i(\gamma(t))<0$ for every sufficiently small $t>0$. For the inactive indices, continuity of $\gamma$ at $0$ gives $\gamma(t)\in V$ for all sufficiently small $t\geq 0$, hence $G_i(\gamma(t))<0$ for every $i\notin I$. Therefore, after reducing $\varepsilon$ if necessary, $\gamma(t)\in\mathcal{F}$ for every $t\in[0,\varepsilon)$.
[guided]
The purpose of this step is to justify that strict linearized feasible directions are actual first-order motions through feasible points. Fix $d\in\mathbb{R}^q$ with
\begin{align*}
DA_{z^*}d=0
\end{align*}
and
\begin{align*}
DG_{i,z^*}d<0 \quad \text{for every } i\in I.
\end{align*}
The equality condition says that $d$ is tangent to the equality constraints at first order, while the strict inequality condition says each active inequality immediately moves into the feasible side.
We first build a curve that satisfies the equality constraints exactly. If $a=0$, there are no equality constraints, so the straight curve $\gamma(t):=z^*+td$ works for all sufficiently small $t\geq 0$ because $U$ is open. If $a>0$, the gradients $\nabla A_j(z^*)$ are linearly independent because they form a subfamily of the LICQ family. Equivalently, the derivative $DA_{z^*}:\mathbb{R}^q\to\mathbb{R}^a$ has full row rank and is surjective. The finite-dimensional implicit function theorem applies to the $C^1$ map $A:U\to\mathbb{R}^a$ at the point $z^*$ with $A(z^*)=0$ and surjective derivative. It gives that $A^{-1}(\{0\})$ is locally a $C^1$ submanifold and that its tangent space at $z^*$ is $\ker DA_{z^*}$. Since $d\in\ker DA_{z^*}$, the local parametrization of this submanifold yields a differentiable curve $\gamma:[0,\varepsilon)\to U$ satisfying
\begin{align*}
\gamma(0)=z^*, \quad \gamma'(0)=d, \quad A(\gamma(t))=0 \text{ for every } t\in[0,\varepsilon).
\end{align*}
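One concrete way to obtain such a curve in the case $a>0$ is sketched below; the auxiliary map $\Phi$ and the correction term $u$ are notation introduced only for this sketch. Define, for $(t,u)$ near $(0,0)$ in $\mathbb{R}\times\mathbb{R}^a$,
\begin{align*}
\Phi(t,u):=A\bigl(z^*+td+DA_{z^*}^{\top}u\bigr).
\end{align*}
Then $\Phi$ is $C^1$, $\Phi(0,0)=A(z^*)=0$, and the partial derivative of $\Phi$ in $u$ at $(0,0)$ is $DA_{z^*}DA_{z^*}^{\top}$, which is invertible because $DA_{z^*}$ is surjective. The implicit function theorem therefore gives a $C^1$ map $u$ defined near $t=0$ with $u(0)=0$ and $\Phi(t,u(t))=0$. Differentiating this identity at $t=0$ gives
\begin{align*}
DA_{z^*}d+DA_{z^*}DA_{z^*}^{\top}u'(0)=0,
\end{align*}
so $u'(0)=0$ because $DA_{z^*}d=0$. Under these assumptions the curve $\gamma(t):=z^*+td+DA_{z^*}^{\top}u(t)$, restricted to small $t\geq 0$, satisfies $\gamma(0)=z^*$, $\gamma'(0)=d$, and $A(\gamma(t))=0$.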
Now check the active inequalities along this equality-feasible curve. For each $i\in I$, the component $G_i:U\to\mathbb{R}$ is differentiable at $z^*$. Therefore the one-variable expansion along $\gamma$ is
\begin{align*}
G_i(\gamma(t))=G_i(z^*)+tDG_{i,z^*}\gamma'(0)+r_i(t),
\end{align*}
where $r_i:[0,\varepsilon)\to\mathbb{R}$ is a remainder with $r_i(t)/t\to 0$ as $t\downarrow 0$. Since $i\in I$, $G_i(z^*)=0$, and since $\gamma'(0)=d$, this becomes
\begin{align*}
G_i(\gamma(t))=tDG_{i,z^*}d+r_i(t).
\end{align*}
Because $DG_{i,z^*}d<0$ and $r_i(t)/t\to 0$, the negative linear term dominates the remainder for all sufficiently small $t>0$. Hence $G_i(\gamma(t))<0$ for every active $i\in I$ and every sufficiently small $t>0$.
Finally, if $i\notin I$, then $G_i(z^*)<0$. Continuity of $G_i$ and continuity of $\gamma$ at $0$ imply $G_i(\gamma(t))<0$ for all sufficiently small $t\geq 0$. Thus the curve satisfies the equalities exactly and all inequalities for small positive time, so $\gamma(t)\in\mathcal{F}$ for all sufficiently small $t\geq 0$; at $t=0$ this is just the feasibility of $z^*$.
[/guided]
[/step]
[step:Use local minimality to obtain nonnegativity on the whole linearized cone]
First suppose $d\in K$ satisfies $DG_{i,z^*}d<0$ for every $i\in I$. By the preceding step, there is a differentiable feasible curve $\gamma:[0,\varepsilon)\to\mathcal{F}$ with $\gamma(0)=z^*$ and $\gamma'(0)=d$. Since $z^*$ is a local minimizer of $F$ on $\mathcal{F}$,
\begin{align*}
F(\gamma(t))-F(z^*)\geq 0
\end{align*}
for all sufficiently small $t\geq 0$. Dividing by $t>0$ and letting $t\downarrow 0$ gives
\begin{align*}
\nabla F(z^*)\cdot d\geq 0.
\end{align*}
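For completeness, the limit can be spelled out as follows; this sketch assumes only that $F$ is differentiable at $z^*$, which the use of $\nabla F(z^*)$ presupposes. Since $\gamma(t)-z^*=td+o(t)$ as $t\downarrow 0$,
\begin{align*}
\frac{F(\gamma(t))-F(z^*)}{t}=\nabla F(z^*)\cdot\frac{\gamma(t)-z^*}{t}+\frac{o(|\gamma(t)-z^*|)}{t}\longrightarrow\nabla F(z^*)\cdot d \quad \text{as } t\downarrow 0.
\end{align*}
The left-hand side is nonnegative for all sufficiently small $t>0$, so the limit is nonnegative as well.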
It remains to remove the strictness. LICQ implies the [linear map](/page/Linear%20Map)
\begin{align*}
L:\ker DA_{z^*}\to\mathbb{R}^{I},\qquad Ld:=(DG_{i,z^*}d)_{i\in I}
\end{align*}
is surjective. Indeed, if its range were not all of $\mathbb{R}^{I}$, there would be coefficients $\alpha_i$, not all zero, such that
\begin{align*}
\sum_{i\in I}\alpha_iDG_{i,z^*}d=0 \quad \text{for every } d\in\ker DA_{z^*}.
\end{align*}
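Such a relation says that the vector $\sum_{i\in I}\alpha_i\nabla G_i(z^*)$ is orthogonal to $\ker DA_{z^*}$. Since $(\ker DA_{z^*})^{\perp}$ is the range of $DA_{z^*}^{\top}$, which is the span of $\{\nabla A_j(z^*)\}_{j=1}^a$, this would give a nontrivial linear relation among the gradients $\nabla A_j(z^*)$, $j\in\{1,\dots,a\}$, and $\nabla G_i(z^*)$, $i\in I$, contradicting LICQ.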
Choose $w\in\ker DA_{z^*}$ such that
\begin{align*}
DG_{i,z^*}w=-1 \quad \text{for every } i\in I.
\end{align*}
For arbitrary $d\in K$ and every $\varepsilon>0$, the vector $d+\varepsilon w$ satisfies
\begin{align*}
DA_{z^*}(d+\varepsilon w)=0
\end{align*}
and
\begin{align*}
DG_{i,z^*}(d+\varepsilon w)=DG_{i,z^*}d-\varepsilon\leq-\varepsilon<0 \quad \text{for every } i\in I.
\end{align*}
Therefore
\begin{align*}
\nabla F(z^*)\cdot(d+\varepsilon w)\geq 0.
\end{align*}
Letting $\varepsilon\downarrow 0$ yields
\begin{align*}
\nabla F(z^*)\cdot d\geq 0 \quad \text{for every } d\in K.
\end{align*}
[/step]
[step:Represent the polar cone by equality and active inequality gradients]
Define the polar cone $K^\circ\subset\mathbb{R}^q$ by
\begin{align*}
K^\circ:=\{v\in\mathbb{R}^q : v\cdot d\leq 0 \text{ for every } d\in K\}.
\end{align*}
From the previous step,
\begin{align*}
-\nabla F(z^*)\in K^\circ.
\end{align*}
Let $DG_{I,z^*}:\mathbb{R}^q\to\mathbb{R}^{I}$ be the linear map defined by
\begin{align*}
DG_{I,z^*}d:=(DG_{i,z^*}d)_{i\in I}.
\end{align*}
We write vectors $\mu_I\in\mathbb{R}^{I}$ with coordinates indexed by $i\in I$, and $DG_{I,z^*}^{\top}:\mathbb{R}^{I}\to\mathbb{R}^q$ denotes the transpose map.
We use the following finite-dimensional Farkas polar form. If $E:X\to Y$ and $H:X\to\mathbb{R}^m$ are linear maps between finite-dimensional Euclidean spaces, and
\begin{align*}
C:=\{x\in X:E x=0 \text{ and } (Hx)_r\leq 0 \text{ for every } r\in\{1,\dots,m\}\},
\end{align*}
then
\begin{align*}
C^\circ=\{E^\top \lambda+H^\top \nu : \lambda\in Y,\ \nu\in[0,\infty)^m\}.
\end{align*}
This is the separating-hyperplane form of Farkas' lemma for a homogeneous system of linear equalities and inequalities. Applying it with $X=\mathbb{R}^q$, $Y=\mathbb{R}^a$, $E=DA_{z^*}$, $m=|I|$, and $H=DG_{I,z^*}$ gives
\begin{align*}
K^\circ=\{DA_{z^*}^{\top}\lambda+DG_{I,z^*}^{\top}\mu_I : \lambda\in\mathbb{R}^a,\ \mu_I\in\mathbb{R}^{I},\ \mu_i\geq 0 \text{ for every } i\in I\}.
\end{align*}
The sign convention matches the definition of $K^\circ$: if $v=DA_{z^*}^{\top}\lambda+DG_{I,z^*}^{\top}\mu_I$ with $\mu_i\geq 0$ for every $i\in I$ and $d\in K$, then
\begin{align*}
v\cdot d=\lambda\cdot DA_{z^*}d+\sum_{i\in I}\mu_iDG_{i,z^*}d\leq 0.
\end{align*}
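The inclusion just checked is the elementary half of the Farkas polar form. A sketch of the reverse inclusion, assuming the bipolar theorem for closed convex cones and the standard fact that finitely generated convex cones are closed, runs as follows; the symbol $M$ is introduced only for this sketch. In the general setting, put $M:=\{E^\top\lambda+H^\top\nu : \lambda\in Y,\ \nu\in[0,\infty)^m\}$. For $x\in X$,
\begin{align*}
(E^\top\lambda+H^\top\nu)\cdot x=\lambda\cdot Ex+\nu\cdot Hx,
\end{align*}
and this is $\leq 0$ for every $\lambda\in Y$ and every $\nu\in[0,\infty)^m$ exactly when $Ex=0$ and $(Hx)_r\leq 0$ for every $r\in\{1,\dots,m\}$. Hence $M^\circ=C$. Since $M$ is a finitely generated convex cone, it is closed, and the bipolar theorem gives
\begin{align*}
C^\circ=(M^\circ)^\circ=M.
\end{align*}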
Applying this representation to $-\nabla F(z^*)\in K^\circ$, there exist $\lambda\in\mathbb{R}^a$ and $\mu_I\in\mathbb{R}^{I}$ with $\mu_i\geq 0$ for every $i\in I$ such that
\begin{align*}
-\nabla F(z^*)=DA_{z^*}^{\top}\lambda+DG_{I,z^*}^{\top}\mu_I.
\end{align*}
Equivalently,
\begin{align*}
\nabla F(z^*)+DA_{z^*}^{\top}\lambda+DG_{I,z^*}^{\top}\mu_I=0.
\end{align*}
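As a small illustration, consider the following instance, which is not taken from the statement and is chosen only for concreteness: $q=2$, $a=0$, $b=1$, $F(z)=z_1+z_2$, and $G_1(z)=z_1^2+z_2^2-2$. The minimizer is $z^*=(-1,-1)$, the constraint is active there, and LICQ holds since $\nabla G_1(z^*)=(-2,-2)\neq 0$. In this instance
\begin{align*}
K=\{d\in\mathbb{R}^2 : -2d_1-2d_2\leq 0\}, \qquad \nabla F(z^*)\cdot d=d_1+d_2\geq 0 \text{ for every } d\in K,
\end{align*}
and the representation of $-\nabla F(z^*)$ reads
\begin{align*}
-(1,1)=\mu_1(-2,-2) \quad \text{with } \mu_1=\tfrac{1}{2}\geq 0.
\end{align*}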
[/step]
[step:Extend the active multipliers and verify the KKT conditions]
Define $\mu\in\mathbb{R}^b$ as follows: for $i\in I$, set $\mu_i:=(\mu_I)_i$, and for $i\notin I$, set $\mu_i:=0$. Then $\mu_i\geq 0$ for every $i\in\{1,\dots,b\}$, because active multipliers are nonnegative and inactive multipliers are zero.
Since $\mu_i=0$ for $i\notin I$, the active stationarity equation becomes the full stationarity equation
\begin{align*}
\nabla F(z^*)+DA_{z^*}^{\top}\lambda+DG_{z^*}^{\top}\mu=0.
\end{align*}
Primal feasibility, that is, equality and inequality feasibility, is part of the assumption that $z^*\in\mathcal{F}$ is a local minimizer over $\mathcal{F}$, so
\begin{align*}
A(z^*)=0
\end{align*}
and
\begin{align*}
G_i(z^*)\leq 0 \quad \text{for every } i\in\{1,\dots,b\}.
\end{align*}
Finally, if $i\in I$, then $G_i(z^*)=0$, so $\mu_iG_i(z^*)=0$. If $i\notin I$, then $\mu_i=0$, so again $\mu_iG_i(z^*)=0$. Therefore
\begin{align*}
\mu_iG_i(z^*)=0 \quad \text{for every } i\in\{1,\dots,b\}.
\end{align*}
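Together with $\mu_i\geq 0$ for every $i\in\{1,\dots,b\}$, this gives stationarity, primal feasibility, dual feasibility, and complementary slackness, which is the stated KKT system.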
[/step]