A directional derivative answers a simple local question: if a function is observed from a point and the input is nudged in one chosen direction, what is the instantaneous rate of change? In one-variable calculus there are only two directions along the real line, so the ordinary [derivative](/page/Derivative) already captures the local first-order behaviour. In several variables, a function may rise steeply in one direction, remain flat in another, and fail to have a coherent linear approximation even when every individual directional rate exists.
This makes the directional derivative a bridge between elementary partial derivatives and the full total derivative. It is also the pointwise ancestor of the Gateaux derivative in functional analysis and the geometric source of the gradient in Euclidean spaces. The concept is useful because it separates two questions that are often conflated: whether a function has rates of change along lines, and whether those rates assemble into a single linear approximation.
## Definition
For a real-valued function on an [open set](/page/Open%20Set) in Euclidean space, the most direct way to measure change in a direction is to restrict the function to the line through the point. The direction tells us which line to use; the ordinary one-variable limit then measures the slope of the resulting function at time $0$.
[definition: Directional Derivative]
Let $U \subset \mathbb{R}^n$ be open, let $f: U \to \mathbb{R}$ be a function, let $a \in U$, and let $v \in \mathbb{R}^n$. The directional derivative of $f$ at $a$ in the direction $v$ is the limit
\begin{align*}
D_v f(a) = \lim_{t \to 0} \frac{f(a + tv) - f(a)}{t},
\end{align*}
when this limit exists in $\mathbb{R}$.
[/definition]
The vector $v$ is not required to have length $1$. Scaling $v$ changes the speed at which the line is traversed, so the value of $D_v f(a)$ records both the geometric direction and the chosen parametrisation. When the intent is only to measure slope per unit distance, the direction vector is usually normalised by requiring $|v| = 1$.
[example: Linear Function in a Chosen Direction]
Let $f:\mathbb{R}^2\to\mathbb{R}$ be given by $f(x_1,x_2)=2x_1-x_2$. At $a=(0,0)$ in the direction $v=(3,1)$,
\begin{align*}
\frac{f(a+tv)-f(a)}{t}
=\frac{f(3t,t)-f(0,0)}{t}
=\frac{6t-t}{t}=5
\end{align*}
for $t\neq 0$. Hence $D_v f(0,0)=5$. If the unit direction $u=v/|v|=(3,1)/\sqrt{10}$ is used instead, the same computation gives $D_u f(0,0)=5/\sqrt{10}$: dividing the direction vector by $\sqrt{10}$ divides the rate by $\sqrt{10}$.
[/example]
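The limit in the definition can be explored numerically by evaluating the difference quotient for shrinking values of $t$. The following is a minimal sketch using only the Python standard library; the helper name `directional_quotient` is an illustrative choice, not notation from the text.

```python
import math

def directional_quotient(f, a, v, t):
    """Difference quotient (f(a + t v) - f(a)) / t for t != 0."""
    shifted = [ai + t * vi for ai, vi in zip(a, v)]
    return (f(shifted) - f(a)) / t

def f(x):
    # The linear function from the example: f(x1, x2) = 2*x1 - x2.
    return 2 * x[0] - x[1]

a = [0.0, 0.0]
v = [3.0, 1.0]
norm_v = math.hypot(*v)
u = [vi / norm_v for vi in v]  # unit vector in the same direction as v

for t in (1e-1, 1e-3, 1e-6):
    # For a linear f the quotient does not depend on t (up to rounding).
    print(t, directional_quotient(f, a, v, t), directional_quotient(f, a, u, t))
```

Up to floating-point rounding, the first column of quotients should match $5$ and the second $5/\sqrt{10}$, in line with the example.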
Many arguments require evaluating the same directional rate at several points, rather than at one fixed point. A PDE estimate, an optimization algorithm, or a comparison of slopes along a vector field needs to know where the derivative exists as the base point moves. This motivates a function whose input is the base point and whose value is the directional derivative in a fixed direction.
[definition: Directional Derivative Function]
Let $U \subset \mathbb{R}^n$ be open, let $f: U \to \mathbb{R}$ be a function, and let $v \in \mathbb{R}^n$. The directional derivative function in the direction $v$ is the function $D_v f: U_v \to \mathbb{R}$ defined by
\begin{align*}
D_v f(a)=\lim_{t \to 0}\frac{f(a+tv)-f(a)}{t}
\end{align*}
for every $a \in U_v$, where
\begin{align*}
U_v = \{a \in U : D_v f(a) \text{ exists}\}.
\end{align*}
[/definition]
Computations with arbitrary directions need a coordinate starting point, because formulas for functions of several variables are usually written in coordinates. We therefore need a named special case for changing only one coordinate at a time; this is the version that becomes the usual partial derivative.
[definition: Coordinate Directional Derivative]
Let $U \subset \mathbb{R}^n$ be open, let $f: U \to \mathbb{R}$ be a function, let $a \in U$, and let $e_i \in \mathbb{R}^n$ be the $i$th standard basis vector. The coordinate directional derivative of $f$ at $a$ in the $i$th coordinate direction is
\begin{align*}
D_{e_i}f(a) = \lim_{t \to 0} \frac{f(a + te_i) - f(a)}{t},
\end{align*}
when this limit exists.
[/definition]
The coordinate directional derivative is usually written as $\partial_{x_i}f(a)$. The notation is shorter, but it should not hide the fact that this is still a directional derivative: it measures variation only along the line parallel to the $x_i$-axis.
Two-sided line slopes are well suited to interior points of open domains. At boundary points of a feasible region, or at a corner of a nonsmooth function, the reverse direction may be unavailable or may describe a different physical question. The one-sided version keeps the initial forward rate while discarding the irrelevant reverse motion.
[definition: One-Sided Directional Derivative]
Let $U \subset \mathbb{R}^n$ be a set (not necessarily open), let $f: U \to \mathbb{R}$ be a function, let $a \in U$, and let $v \in \mathbb{R}^n$ be such that $a + tv \in U$ for all sufficiently small $t > 0$. The one-sided directional derivative of $f$ at $a$ in the direction $v$ is
\begin{align*}
D_v^+ f(a) = \lim_{t \downarrow 0} \frac{f(a + tv) - f(a)}{t},
\end{align*}
when this limit exists in $\mathbb{R}$.
[/definition]
The one-sided version is not a replacement for differentiability. It is a different local measurement, suited to boundaries, cones of feasible directions, and nonsmooth functions such as norms.
## Equivalent Characterisations
The definition uses a line because a direction through a point gives a one-variable slice of the original function. Naming this slice helps compare directional derivatives with ordinary derivatives.
[definition: Line Restriction]
Let $U \subset \mathbb{R}^n$ be open, let $f: U \to \mathbb{R}$ be a function, let $a \in U$, and let $v \in \mathbb{R}^n$. A line restriction of $f$ at $a$ in the direction $v$ is a function $g: I \to \mathbb{R}$ defined by
\begin{align*}
g(t)=f(a+tv),
\end{align*}
where $I \subset \mathbb{R}$ is an open interval containing $0$ and $a + tv \in U$ for every $t \in I$.
[/definition]
A directional derivative is defined by changing $f$ only along the line $a+tv$, so it should be recoverable from the ordinary derivative of the one-variable slice $g(t)=f(a+tv)$. The point that needs checking is that the two limiting processes use the same parameter $t$ and therefore produce the same number when the slice is differentiable at $0$.
[quotetheorem:7721]
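The slice viewpoint also suggests a way to experiment: build the one-variable function $g(t)=f(a+tv)$ and differentiate it with any one-dimensional routine. The sketch below does this with a central difference. The test function, the base point, the direction, and the helper names are all assumptions made for illustration.

```python
import math

def line_restriction(f, a, v):
    """Return g(t) = f(a + t v), the one-variable slice of f through a."""
    def g(t):
        return f([ai + t * vi for ai, vi in zip(a, v)])
    return g

def derivative_at_zero(g, h=1e-5):
    """Central-difference estimate of g'(0) for a one-variable function g."""
    return (g(h) - g(-h)) / (2 * h)

def f(x):
    # Hypothetical smooth test function, chosen only for illustration.
    return math.sin(x[0]) * math.exp(x[1])

a = [0.5, -0.2]
v = [1.0, 2.0]
g = line_restriction(f, a, v)

slice_slope = derivative_at_zero(g)
# g'(0) worked out by hand with the one-variable product and chain rules.
exact = math.exp(a[1]) * (math.cos(a[0]) * v[0] + math.sin(a[0]) * v[1])
print(slice_slope, exact)
```

The two printed numbers should agree to several digits. The central difference is only an approximation, so exact equality is not expected.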
The line-restriction viewpoint computes one direction at a time, but separate one-dimensional slopes can exist without cohering into a single linear approximation to $f$ near $a$.
To move from directional information to a usable multivariable first-order theory, we need a condition that tests every small displacement $h$ at once. Differentiability at a point supplies that condition by requiring one [linear map](/page/Linear%20Map) to govern the error uniformly to first order, rather than collecting unrelated directional limits.
[definition: Differentiability at a Point]
Let $U \subset \mathbb{R}^n$ be open, let $f: U \to \mathbb{R}$ be a function, and let $a \in U$. The function $f$ is differentiable at $a$ if there exists a linear map
\begin{align*}
Df_a: \mathbb{R}^n \to \mathbb{R}
\end{align*}
such that
\begin{align*}
f(a+h) = f(a) + Df_a(h) + o(|h|)
\end{align*}
as $h \to 0$ in $\mathbb{R}^n$. Here $o(|h|)$ denotes an error term whose size is negligible compared with $|h|$: if $r(h)=f(a+h)-f(a)-Df_a(h)$, then $r(h)/|h| \to 0$ as $h \to 0$ with $h\neq 0$.
[/definition]
The definition gives a linear approximation in all directions at once, whereas a directional derivative probes only one line through the point. The two are connected by substituting the special displacement $h=tv$ with $v\neq 0$: since $|tv|=|t||v|$, the definition gives $f(a+tv)-f(a)=t\,Df_a(v)+o(|t|)$, and dividing by $t$ and letting $t\to 0$ yields $D_v f(a)=Df_a(v)$. The case $v=0$ is trivial, since both sides are $0$. Differentiability therefore supplies a single source for all directional derivatives.
For scalar-valued functions on Euclidean space, the linear map $Df_a: \mathbb{R}^n \to \mathbb{R}$ can be written using the Euclidean [inner product](/page/Inner%20Product): there is a unique vector, denoted $\nabla f(a)$, such that
\begin{align*}
Df_a(h)=\nabla f(a)\cdot h
\end{align*}
for every $h\in \mathbb{R}^n$. Combining this representation with the compatibility between total and directional derivatives gives the computational rule
\begin{align*}
D_v f(a)=\nabla f(a)\cdot v.
\end{align*}
This formula explains why the gradient is called the direction of steepest increase. When $|v| = 1$, the Cauchy-Schwarz inequality gives $D_v f(a)=\nabla f(a)\cdot v\le|\nabla f(a)|$, and when $\nabla f(a) \neq 0$ equality holds exactly for $v=\nabla f(a)/|\nabla f(a)|$.
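The steepest-increase property can be checked by brute force: sample many unit directions, evaluate $\nabla f(a)\cdot u$ for each, and compare the largest value with $|\nabla f(a)|$. The sketch below assumes an illustrative gradient vector $(8,-1)$ and a uniform grid of angles. It is a sanity check, not a proof.

```python
import math

grad = (8.0, -1.0)  # illustrative gradient vector; any nonzero vector works
grad_norm = math.hypot(*grad)

def slope(theta):
    """Value of grad . u for the unit vector u = (cos theta, sin theta)."""
    return grad[0] * math.cos(theta) + grad[1] * math.sin(theta)

# Sample unit directions on a fine uniform grid of angles in [0, 2*pi).
n = 100_000
thetas = [2 * math.pi * k / n for k in range(n)]
best_theta = max(thetas, key=slope)

# Largest sampled slope versus the Cauchy-Schwarz bound |grad|.
print(slope(best_theta), grad_norm)
# Angle of the best sampled direction versus the angle of the gradient.
print(best_theta, math.atan2(grad[1], grad[0]) % (2 * math.pi))
```

No sampled slope should exceed $|\nabla f(a)|$, and the best sampled direction should point along the gradient up to the grid spacing.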
## Smooth Computations and Pathologies
### Gradient Computations
The first example shows the typical smooth situation: directional derivatives are computed by dotting the gradient with the direction vector. The computation also illustrates why the size of $v$ matters.
[example: Quadratic Function]
Let $f: \mathbb{R}^2 \to \mathbb{R}$ be given by
\begin{align*}
f(x_1,x_2)=x_1^2+3x_1x_2-x_2^2.
\end{align*}
The coordinate derivatives are
\begin{align*}
\partial_{x_1}f(x_1,x_2)=2x_1+3x_2
\end{align*}
and
\begin{align*}
\partial_{x_2}f(x_1,x_2)=3x_1-2x_2,
\end{align*}
so
\begin{align*}
\nabla f(x_1,x_2)=(2x_1+3x_2,3x_1-2x_2).
\end{align*}
At $a=(1,2)$ this gives
\begin{align*}
\nabla f(1,2)=(2\cdot 1+3\cdot 2,3\cdot 1-2\cdot 2)=(8,-1).
\end{align*}
For $v=(2,1)$, the point $a+tv$ is
\begin{align*}
(1,2)+t(2,1)=(1+2t,2+t).
\end{align*}
Substituting this into $f$ gives
\begin{align*}
f(1+2t,2+t)=(1+2t)^2+3(1+2t)(2+t)-(2+t)^2.
\end{align*}
Expanding each term,
\begin{align*}
(1+2t)^2=1+4t+4t^2,
\end{align*}
\begin{align*}
3(1+2t)(2+t)=3(2+5t+2t^2)=6+15t+6t^2,
\end{align*}
and
\begin{align*}
(2+t)^2=4+4t+t^2.
\end{align*}
Therefore
\begin{align*}
f(1+2t,2+t)=(1+4t+4t^2)+(6+15t+6t^2)-(4+4t+t^2)=3+15t+9t^2.
\end{align*}
Since
\begin{align*}
f(1,2)=1^2+3\cdot 1\cdot 2-2^2=1+6-4=3,
\end{align*}
the directional quotient is
\begin{align*}
\frac{f((1,2)+t(2,1))-f(1,2)}{t}=\frac{3+15t+9t^2-3}{t}=15+9t
\end{align*}
for $t \neq 0$. Taking $t \to 0$ gives
\begin{align*}
D_v f(1,2)=15.
\end{align*}
This agrees with the gradient dot product
\begin{align*}
\nabla f(1,2)\cdot v=(8,-1)\cdot(2,1)=8\cdot 2+(-1)\cdot 1=15.
\end{align*}
The length of $v$ is
\begin{align*}
|v|=\sqrt{2^2+1^2}=\sqrt{5},
\end{align*}
so the corresponding unit vector is
\begin{align*}
u=\frac{v}{|v|}=\frac{(2,1)}{\sqrt{5}}.
\end{align*}
Writing $s=t/\sqrt{5}$, we have $(1,2)+tu=(1,2)+s(2,1)$. From the computation above,
\begin{align*}
f((1,2)+s(2,1))=3+15s+9s^2.
\end{align*}
Substituting $s=t/\sqrt{5}$ gives
\begin{align*}
f((1,2)+tu)=3+\frac{15t}{\sqrt{5}}+\frac{9t^2}{5}.
\end{align*}
Thus
\begin{align*}
\frac{f((1,2)+tu)-f(1,2)}{t}=\frac{15}{\sqrt{5}}+\frac{9t}{5}
\end{align*}
for $t \neq 0$, and hence
\begin{align*}
D_u f(1,2)=\frac{15}{\sqrt{5}}.
\end{align*}
The same geometric direction gives rate $15$ when traversed with velocity $(2,1)$, but rate $15/\sqrt{5}$ when traversed at unit speed.
[/example]
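The arithmetic in this example is easy to confirm numerically. The following sketch evaluates the gradient formula and the difference quotients for the same $f$, $a$, $v$, and $u$; the helper names `grad_f` and `quotient` are illustrative.

```python
import math

def f(x1, x2):
    return x1**2 + 3 * x1 * x2 - x2**2

def grad_f(x1, x2):
    # Partial derivatives as computed in the example.
    return (2 * x1 + 3 * x2, 3 * x1 - 2 * x2)

def quotient(a, v, t):
    """Directional difference quotient of f at a along v, for t != 0."""
    return (f(a[0] + t * v[0], a[1] + t * v[1]) - f(*a)) / t

a = (1.0, 2.0)
v = (2.0, 1.0)
u = tuple(vi / math.hypot(*v) for vi in v)  # unit vector along v

g = grad_f(*a)
print(g, g[0] * v[0] + g[1] * v[1])  # gradient and gradient formula for D_v f(1,2)

for t in (1e-1, 1e-2, 1e-3):
    # The example predicts 15 + 9t along v and 15/sqrt(5) + 9t/5 along u.
    print(t, quotient(a, v, t), quotient(a, u, t))
```

As $t$ shrinks, the quotients along $v$ should approach $15$ and those along $u$ should approach $15/\sqrt{5}$.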
Smooth functions behave well because the total derivative organises all line slopes into one linear object. The next example shows why the existence of many directional derivatives by itself is a weak condition.
### Directional Data Without Differentiability
[example: Directional Derivatives Without Continuity]
Define $f: \mathbb{R}^2 \to \mathbb{R}$ by setting $f(0,0)=0$ and, for $(x_1,x_2) \neq (0,0)$,
\begin{align*}
f(x_1,x_2) = \frac{x_1^2x_2}{x_1^4+x_2^2}.
\end{align*}
For an arbitrary direction $v=(a,b)$ and $t\neq 0$,
\begin{align*}
f(tv)=f(ta,tb)=\frac{t^3a^2b}{t^4a^4+t^2b^2}=\frac{ta^2b}{t^2a^4+b^2}
\end{align*}
whenever $tv\neq (0,0)$. Since $f(0,0)=0$, the directional quotient is
\begin{align*}
\frac{f(tv)-f(0,0)}{t}=\frac{a^2b}{t^2a^4+b^2}.
\end{align*}
If $b\neq 0$, this quotient tends to $a^2/b$ as $t\to 0$. If $b=0$, then $f(ta,0)=0$ for all $t$, so the quotient is identically $0$. Thus every directional derivative at the origin exists. More explicitly, if $b\neq 0$, then
\begin{align*}
D_{(a,b)}f(0,0)=\frac{a^2}{b},
\end{align*}
while if $b=0$, then
\begin{align*}
D_{(a,b)}f(0,0)=0.
\end{align*}
This directional data is not linear in the direction vector. For example, with $v=(1,1)$ and $w=(1,2)$,
\begin{align*}
D_v f(0,0)=1,\qquad D_w f(0,0)=\frac12,\qquad D_{v+w}f(0,0)=D_{(2,3)}f(0,0)=\frac43.
\end{align*}
Since $1+1/2=3/2\neq 4/3$, additivity fails. Hence these directional derivatives cannot come from a total derivative at the origin.
The same example also shows the missing continuity. Along the curve $(x_1,x_2)=(s,s^2)$ with $s\neq 0$,
\begin{align*}
f(s,s^2)=\frac{s^2s^2}{s^4+s^4}=\frac12,
\end{align*}
so $f(s,s^2)$ does not tend to $0=f(0,0)$ as $s\to 0$. Therefore the function is not continuous at the origin, even though all directional derivatives there exist.
[/example]
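Both phenomena in this example can be observed numerically: the quotients along straight lines settle down, while the values along the parabola do not approach $f(0,0)$. The sketch below uses a small fixed $t$ as a stand-in for the limit; the helper name `quotient_at_origin` is an illustrative choice.

```python
def f(x1, x2):
    if x1 == 0 and x2 == 0:
        return 0.0
    return x1**2 * x2 / (x1**4 + x2**2)

def quotient_at_origin(a, b, t):
    """Difference quotient f(t*(a, b)) / t at the origin, for t != 0."""
    return f(t * a, t * b) / t

t = 1e-6
dv = quotient_at_origin(1, 1, t)   # the example predicts a limit of 1
dw = quotient_at_origin(1, 2, t)   # the example predicts a limit of 1/2
dvw = quotient_at_origin(2, 3, t)  # the example predicts a limit of 4/3
print(dv, dw, dvw, dv + dw)        # dv + dw is visibly different from dvw

# Along the parabola (s, s^2) the function stays near 1/2, however small s is.
for s in (1e-1, 1e-3, 1e-5):
    print(s, f(s, s**2))
```

The gap between `dv + dw` and `dvw` reflects the failure of additivity, and the parabola values reflect the discontinuity at the origin.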
This example is a standard warning: directional derivatives test straight-line approaches, while continuity and differentiability require control over all approaches. Curved paths can detect behaviour that every line misses.
Nonsmooth functions often retain one-sided directional information. The absolute value is the basic model.
### One-Sided and Coordinate Tests
[example: Absolute Value at a Corner]
Let $f: \mathbb{R} \to \mathbb{R}$ be $f(x)=|x|$. At $a=0$ and in direction $v \in \mathbb{R}$, the two-sided directional quotient is
\begin{align*}
\frac{f(0+tv)-f(0)}{t}=\frac{|tv|-|0|}{t}
\end{align*}
for $t\neq 0$. Since $|0|=0$, this is
\begin{align*}
\frac{f(tv)-f(0)}{t}=\frac{|tv|}{t}.
\end{align*}
Using $|tv|=|t||v|$, we get
\begin{align*}
\frac{|tv|}{t}=\frac{|t||v|}{t}.
\end{align*}
If $t>0$, then $|t|=t$, so
\begin{align*}
\frac{|t||v|}{t}=\frac{t|v|}{t}=|v|.
\end{align*}
If $t<0$, then $|t|=-t$, so
\begin{align*}
\frac{|t||v|}{t}=\frac{(-t)|v|}{t}=-|v|.
\end{align*}
Thus, when $v\neq 0$, the quotient approaches $|v|$ from the right and $-|v|$ from the left. Since $|v|\neq -|v|$ for $v\neq 0$, the two-sided directional derivative $D_v f(0)$ does not exist.
For the zero direction $v=0$, the quotient is
\begin{align*}
\frac{|t\cdot 0|}{t}=\frac{0}{t}=0
\end{align*}
for every $t\neq 0$, so $D_0f(0)=0$.
The one-sided quotient only uses $t>0$. For every $v\in\mathbb{R}$ and every $t>0$,
\begin{align*}
\frac{f(tv)-f(0)}{t}=\frac{|t||v|}{t}=\frac{t|v|}{t}=|v|.
\end{align*}
Therefore
\begin{align*}
D_v^+ f(0)=|v|.
\end{align*}
The corner at $0$ destroys the two-sided directional derivative in every nonzero direction, but the forward directional rate still exists in every direction.
[/example]
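The sign flip between the two approaches is visible in a direct evaluation of the quotient. This short sketch compares a small positive $t$ with a small negative $t$ for a few directions; the sample directions are arbitrary.

```python
def quotient(v, t):
    """Difference quotient of f(x) = |x| at 0 in direction v, for t != 0."""
    return (abs(t * v) - abs(0.0)) / t

for v in (2.0, -3.0, 0.0):
    forward = quotient(v, 1e-6)    # t > 0: the one-sided quotient
    backward = quotient(v, -1e-6)  # t < 0: the reverse approach
    print(v, forward, backward)
```

For nonzero $v$ the forward value should be close to $|v|$ and the backward value close to $-|v|$, so only the one-sided limit exists.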
A final example connects the concept back to coordinate computations. It also shows how partial derivatives can miss directional behaviour unless differentiability is already known.
[example: Coordinate Derivatives Versus Diagonal Direction]
Define $f: \mathbb{R}^2 \to \mathbb{R}$ by setting $f(0,0)=0$ and, for $(x_1,x_2)\neq (0,0)$,
\begin{align*}
f(x_1,x_2)=\frac{x_1x_2}{x_1^2+x_2^2}.
\end{align*}
We compute the two coordinate directional derivatives at the origin and then compare them with the derivative in the diagonal direction.
For the first coordinate direction $e_1=(1,0)$, we have
\begin{align*}
(0,0)+te_1=(t,0).
\end{align*}
If $t\neq 0$, then $(t,0)\neq (0,0)$, so
\begin{align*}
f(t,0)=\frac{t\cdot 0}{t^2+0^2}=\frac{0}{t^2}=0.
\end{align*}
Since $f(0,0)=0$, the directional quotient is
\begin{align*}
\frac{f((0,0)+te_1)-f(0,0)}{t}=\frac{f(t,0)-0}{t}=\frac{0}{t}=0
\end{align*}
for every $t\neq 0$. Therefore
\begin{align*}
D_{e_1}f(0,0)=0.
\end{align*}
For the second coordinate direction $e_2=(0,1)$, we have
\begin{align*}
(0,0)+te_2=(0,t).
\end{align*}
If $t\neq 0$, then $(0,t)\neq (0,0)$, so
\begin{align*}
f(0,t)=\frac{0\cdot t}{0^2+t^2}=\frac{0}{t^2}=0.
\end{align*}
Thus
\begin{align*}
\frac{f((0,0)+te_2)-f(0,0)}{t}=\frac{f(0,t)-0}{t}=\frac{0}{t}=0
\end{align*}
for every $t\neq 0$, and hence
\begin{align*}
D_{e_2}f(0,0)=0.
\end{align*}
Now take the diagonal direction $v=(1,1)$. For $t\neq 0$,
\begin{align*}
(0,0)+tv=(t,t).
\end{align*}
Since $(t,t)\neq (0,0)$, substituting into the formula for $f$ gives
\begin{align*}
f(t,t)=\frac{t\cdot t}{t^2+t^2}.
\end{align*}
The numerator is $t^2$ and the denominator is $2t^2$, so
\begin{align*}
f(t,t)=\frac{t^2}{2t^2}.
\end{align*}
Because $t\neq 0$, we may cancel $t^2$:
\begin{align*}
f(t,t)=\frac{1}{2}.
\end{align*}
Therefore the directional quotient in the diagonal direction is
\begin{align*}
\frac{f((0,0)+t(1,1))-f(0,0)}{t}=\frac{f(t,t)-0}{t}=\frac{1}{2t}.
\end{align*}
As $t\to 0^+$, the quotient $\frac{1}{2t}$ tends to $+\infty$, while as $t\to 0^-$ it tends to $-\infty$. Hence there is no finite two-sided limit, so
\begin{align*}
D_{(1,1)}f(0,0)
\end{align*}
does not exist. The two coordinate partial derivatives at the origin both exist and equal $0$, but they do not control the diagonal directional derivative.
[/example]
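A numerical look at the three quotients makes the contrast concrete. In the sketch below, the coordinate quotients are zero for every $t$, while the diagonal quotient grows like $1/(2t)$; the helper name `quotient_at_origin` is illustrative.

```python
def f(x1, x2):
    if x1 == 0 and x2 == 0:
        return 0.0
    return x1 * x2 / (x1**2 + x2**2)

def quotient_at_origin(v, t):
    """Difference quotient f(t*v) / t at the origin, for t != 0."""
    return f(t * v[0], t * v[1]) / t

for t in (1e-1, 1e-3, 1e-5, -1e-5):
    along_e1 = quotient_at_origin((1, 0), t)
    along_e2 = quotient_at_origin((0, 1), t)
    diagonal = quotient_at_origin((1, 1), t)  # the example predicts 1/(2t)
    print(t, along_e1, along_e2, diagonal)
```

The diagonal column changes sign with $t$ and grows without bound in absolute value, which is the numerical signature of a missing limit.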
## Algebraic Laws and Optimization
The algebraic properties of directional derivatives depend strongly on differentiability. If the total derivative exists, then the directional derivative is linear in the direction vector.
[quotetheorem:7722]
Without differentiability, this linearity may fail. The absolute value example already shows positive homogeneity for the one-sided derivative but not additivity: $D_1^+f(0)+D_{-1}^+f(0)=1+1=2$, whereas $D_{1+(-1)}^+f(0)=D_0^+f(0)=0$. The two-sided derivative may fail to exist at a corner altogether.
The coordinate derivatives are the pieces of the gradient, and computational work often starts from them. Once differentiability is known, these coordinate rates determine the derivative in every direction, so the next formula is the standard bridge from partial derivatives to arbitrary directional derivatives.
[quotetheorem:326]
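This formula is the coordinate form of the gradient identity $D_v f(a)=\nabla f(a)\cdot v$. Its hypotheses matter: the existence of the partial derivatives alone does not guarantee the formula in arbitrary directions. The function $x_1x_2/(x_1^2+x_2^2)$ considered earlier has both partial derivatives equal to $0$ at the origin, yet its diagonal directional derivative does not exist.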
Many applications do not move along straight lines; they move along curves. If a curve has initial velocity $v$, differentiability of $f$ says that only this initial velocity matters to first order, which leads to the directional form of the chain rule.
[quotetheorem:7723]
This theorem explains the geometric role of $v$: it is the velocity of the path through the point. The directional derivative measures the first-order change of $f$ along every differentiable path with that initial velocity, provided $f$ has a total derivative at the point.
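The claim that only the initial velocity matters can be tested by comparing a straight line with a curved path that leaves the point with the same velocity. The sketch below reuses the quadratic from the earlier example. The curved path `bent` and its coefficients are hypothetical, chosen only so that its velocity at $t=0$ equals $v$.

```python
def f(x1, x2):
    # The quadratic from the earlier example, with gradient (8, -1) at (1, 2).
    return x1**2 + 3 * x1 * x2 - x2**2

a = (1.0, 2.0)
v = (2.0, 1.0)

def line(t):
    """Straight path a + t v."""
    return (a[0] + t * v[0], a[1] + t * v[1])

def bent(t):
    # Hypothetical curved path with the same initial velocity v;
    # the t**2 terms change the shape but not the velocity at t = 0.
    return (a[0] + t * v[0] + 5 * t**2, a[1] + t * v[1] - 7 * t**2)

def rate_along(path, h=1e-6):
    """Central-difference estimate of d/dt f(path(t)) at t = 0."""
    return (f(*path(h)) - f(*path(-h))) / (2 * h)

print(rate_along(line), rate_along(bent))  # both should be close to 15
```

Both estimates should be close to $\nabla f(1,2)\cdot v=15$, even though the two paths separate immediately after $t=0$.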
Directional derivatives are often used to express optimality conditions. At an unconstrained local minimum of a differentiable function, every directional derivative vanishes because the gradient vanishes.
[quotetheorem:7724]
For constrained optimization, the corresponding statement is usually one-sided and restricted to feasible directions. This is why the one-sided definition is retained even though the two-sided version is cleaner in unconstrained calculus.
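A small numerical sketch illustrates both situations. The function `f` below is a hypothetical smooth function with an unconstrained minimum at $(1,-3)$. The function `g` is a one-variable function minimised over the feasible set $[0,\infty)$, where only forward directions are available. Both are assumptions made for illustration.

```python
def f(x1, x2):
    # Hypothetical smooth function with an unconstrained minimum at (1, -3).
    return (x1 - 1) ** 2 + 2 * (x2 + 3) ** 2

def quotient(a, v, t):
    """Directional difference quotient of f at a along v, for t != 0."""
    return (f(a[0] + t * v[0], a[1] + t * v[1]) - f(*a)) / t

a = (1.0, -3.0)
for v in ((1, 0), (0, 1), (2, -5), (-1, 4)):
    # Each quotient is proportional to t here, so it shrinks to 0 with t.
    print(v, [quotient(a, v, t) for t in (1e-1, 1e-3, 1e-5)])

def g(x):
    # g(x) = x on the feasible set [0, inf) is minimised at the boundary point 0.
    return x

def forward_rate(v, t=1e-6):
    """One-sided quotient of g at 0 in a feasible direction v >= 0."""
    return (g(0.0 + t * v) - g(0.0)) / t

print(forward_rate(2.0))  # nonnegative, but not zero
```

At the interior minimum every directional quotient tends to $0$. At the boundary minimum the forward rate in a feasible direction is only nonnegative, which is the one-sided form of the condition.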
A reader might hope that having $D_v f(a)$ for every $v$ already forces good local behaviour. The examples above show the opposite phenomenon: straight lines through a point are too small a family of tests to control all nearby points. The following theorem records that limitation as a reusable warning.
[quotetheorem:7725]
This theorem is not a pathology for its own sake. It marks the precise reason that multivariable differential calculus is built around linear approximation, not merely around the existence of line slopes.
## Beyond and Connected Topics
Directional derivatives are one-dimensional shadows of the total derivative: when $f$ is differentiable at $a$, $D_v f(a)=Df_a(v)$ for every $v$. The total derivative is stronger because it approximates the function uniformly over all sufficiently small displacement vectors, while a directional derivative fixes one vector and studies a single line.
The gradient packages all directional derivatives of a differentiable scalar-valued function into one vector. For unit vectors $v$, the value $\nabla f(a)\cdot v$ measures the signed slope in that direction; among all unit directions, the maximum value is $|\nabla f(a)|$, attained at $v=\nabla f(a)/|\nabla f(a)|$ when $\nabla f(a)\neq 0$.
Partial derivatives are coordinate directional derivatives. They are indispensable for computation, but they do not by themselves guarantee differentiability. The examples above show that coordinate information may miss diagonal or curved behaviour.
In functional analysis, the same idea becomes the Gateaux derivative, where the input space may be an infinite-dimensional [vector space](/page/Vector%20Space). In that setting the direction $v$ is often a perturbation of a function, and the directional derivative becomes the first variation of a functional.
In analysis and PDE, directional derivatives also lead toward [weak derivative](/page/Weak%20Derivative) and distributional viewpoints. When classical directional derivatives are not available pointwise, weak formulations recover derivative information through integration against test functions.
[remark: Directional Versus Coordinate Language]
The phrase "derivative in the direction $v$" refers to the vector $v$ and the line $a+tv$. The phrase "partial derivative with respect to $x_i$" refers to the special case $v=e_i$. Confusing these two can lead to incorrect conclusions about differentiability.
[/remark]