### Detecting minima, maxima, and saddle points

Taylor's theorem, introduced earlier, also applies to multivariate functions.
For a bivariate function, the second-order approximation is

$$ f(x+a, y+b) \approx f(a,b) + x\dox{f}\bigg\rvert_{a,b} + y\doy{f}\bigg\rvert_{a,b} + \frac{1}{2}\left( x^2 \doxx{f}\bigg\rvert_{a,b} + xy\doxy{f}\bigg\rvert_{a,b} + xy \doyx{f}\bigg\rvert_{a,b} + y^2 \doyy{f}\bigg\rvert_{a,b} \right) $$

Here, \( g\bigg\rvert_{a,b} \) means that the derivative is evaluated at the point \( (a,b) \).
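To make the approximation concrete, here is a small numerical check. The example function \( f(x,y) = \sin x \cos y \), the expansion point, and the increments are all arbitrary choices for illustration, not taken from the text; the partial derivatives are worked out by hand.

```python
import numpy as np

# Hypothetical example: f(x, y) = sin(x) * cos(y), expanded around (a, b)
f = lambda x, y: np.sin(x) * np.cos(y)

a, b = 0.5, 1.0          # expansion point (arbitrary)
x, y = 1e-3, -2e-3       # small increments (arbitrary)

# Partial derivatives of f, evaluated at (a, b), worked out by hand
fx  =  np.cos(a) * np.cos(b)
fy  = -np.sin(a) * np.sin(b)
fxx = -np.sin(a) * np.cos(b)
fxy = -np.cos(a) * np.sin(b)   # equals fyx, since f is smooth
fyy = -np.sin(a) * np.cos(b)

# Second-order Taylor approximation, term by term as in the formula above
taylor = (f(a, b) + x * fx + y * fy
          + 0.5 * (x**2 * fxx + x * y * fxy + x * y * fxy + y**2 * fyy))

# The discrepancy is third order in the increments, hence tiny
print(abs(f(a + x, b + y) - taylor))
```

Shrinking the increments by a factor of 10 shrinks the error by roughly a factor of 1000, as expected for a second-order approximation.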

Suppose the function has a minimum at \( (a,b) \).

This means the gradient is zero there.
So, \(\dox{f}\bigg\rvert_{a,b} = 0 \) and \( \doy{f}\bigg\rvert_{a,b} = 0 \).

For \( (a,b) \) to be a minimum, we need \( f(x+a,y+b) > f(a,b) \) for all sufficiently small nonzero \( (x,y) \). Since the first-order terms vanish, the Taylor expansion above then requires

$$ x^2 \doxx{f}\bigg\rvert_{a,b} + xy\doxy{f}\bigg\rvert_{a,b} + xy \doyx{f}\bigg\rvert_{a,b} + y^2 \doyy{f}\bigg\rvert_{a,b} > 0 $$

Similarly, if the function has a maximum at \( (a,b) \), then,
$$ x^2 \doxx{f}\bigg\rvert_{a,b} + xy\doxy{f}\bigg\rvert_{a,b} + xy \doyx{f}\bigg\rvert_{a,b} + y^2 \doyy{f}\bigg\rvert_{a,b} < 0 $$

And finally, if the expression above is positive along some directions \( (x,y) \) and negative along others, then the function has neither a minimum nor a maximum at \( (a,b) \).
So, \((a,b) \) must be a saddle point.
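As a sketch of the saddle case, consider the standard example \( f(x,y) = x^2 - y^2 \), whose critical point at the origin has \( \doxx{f} = 2 \), \( \doxy{f} = \doyx{f} = 0 \), and \( \doyy{f} = -2 \). The quadratic expression takes both signs depending on the direction:

```python
# Quadratic expression from the text, with the second partials of
# f(x, y) = x**2 - y**2 evaluated at the critical point (0, 0):
# fxx = 2, fxy = fyx = 0, fyy = -2
Q = lambda x, y: x**2 * 2 + x * y * 0 + x * y * 0 + y**2 * (-2)

print(Q(1, 0))  # positive: f increases along the x-direction
print(Q(0, 1))  # negative: f decreases along the y-direction
# Both signs occur, so (0, 0) is a saddle point
```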

In matrix notation, we can also write the quadratic expression above as

$$ \begin{bmatrix}x & y\end{bmatrix} \begin{bmatrix}\doxx{f}\bigg\rvert_{a,b} & \doxy{f}\bigg\rvert_{a,b} \\ \doyx{f}\bigg\rvert_{a,b} & \doyy{f}\bigg\rvert_{a,b} \end{bmatrix} \begin{bmatrix}x \\ y \end{bmatrix} $$

Let

$$ \vx = \begin{bmatrix} x \\ y \end{bmatrix} $$

and

$$ \mathbf{H}\bigg\rvert_{a,b} =\begin{bmatrix}\doxx{f}\bigg\rvert_{a,b} & \doxy{f}\bigg\rvert_{a,b} \\ \doyx{f}\bigg\rvert_{a,b} & \doyy{f}\bigg\rvert_{a,b} \end{bmatrix}$$
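A quick numerical sanity check that \( \vx^T \mathbf{H} \vx \) reproduces the expanded quadratic expression; the second partials below are arbitrary placeholder values, not computed from any particular function:

```python
import numpy as np

# Hypothetical second partials at some critical point (placeholder values)
fxx, fxy, fyy = 2.0, 1.0, 3.0
H = np.array([[fxx, fxy],
              [fxy, fyy]])   # symmetric, since fxy = fyx for smooth f

x, y = 0.7, -0.4
v = np.array([x, y])

quadratic_form = v @ H @ v   # x^T H x
expanded = x**2 * fxx + x * y * fxy + x * y * fxy + y**2 * fyy

print(quadratic_form, expanded)  # the two agree
```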

So, what we are effectively saying is this: if \( \dox{f}\bigg\rvert_{a,b} = 0 \) and \( \doy{f}\bigg\rvert_{a,b} = 0 \), then the function has a

- minimum at \( (a,b) \), if \( \vx^T \mathbf{H}\bigg\rvert_{a,b} \vx > 0 \) for all nonzero \( \vx \)
- maximum at \( (a,b) \), if \( \vx^T \mathbf{H}\bigg\rvert_{a,b} \vx < 0 \) for all nonzero \( \vx \)
- saddle point at \((a,b)\), otherwise.

But wait! Notice that the first two conditions are precisely the definitions of positive definite and negative definite matrices, respectively.

So, if \( \nabla f\bigg\rvert_{a,b} = \mathbf{0} \), then the function has a

- minimum at \( (a,b) \), if its Hessian \( \mathbf{H}\bigg\rvert_{a,b} \) at that point is positive definite.
- maximum at \( (a,b) \), if its Hessian \( \mathbf{H}\bigg\rvert_{a,b} \) at that point is negative definite.
- saddle point at \((a,b)\), if its Hessian \( \mathbf{H}\bigg\rvert_{a,b} \) at that point is indefinite.

(If the Hessian is only positive or negative *semi*-definite, this second-order test is inconclusive, and higher-order terms must be examined.)
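The whole recipe can be sketched in code. A real symmetric matrix is positive (negative) definite exactly when all its eigenvalues are positive (negative), so checking eigenvalue signs classifies the critical point. The three example Hessians below correspond to \( x^2 + y^2 \), \( -(x^2 + y^2) \), and \( x^2 - y^2 \) at the origin:

```python
import numpy as np

def classify(H, tol=1e-12):
    """Classify a critical point from its symmetric Hessian H."""
    eig = np.linalg.eigvalsh(H)          # eigenvalues in ascending order
    if np.all(eig > tol):
        return "minimum"                 # positive definite
    if np.all(eig < -tol):
        return "maximum"                 # negative definite
    if eig[0] < -tol and eig[-1] > tol:
        return "saddle point"            # indefinite: both signs present
    return "inconclusive"                # semi-definite: test says nothing

# Hessians at the origin for three example functions:
print(classify(np.array([[2., 0.], [0., 2.]])))    # x^2 + y^2
print(classify(np.array([[-2., 0.], [0., -2.]])))  # -(x^2 + y^2)
print(classify(np.array([[2., 0.], [0., -2.]])))   # x^2 - y^2
```

Using `eigvalsh` (rather than `eig`) exploits the symmetry of the Hessian and guarantees real eigenvalues.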

This is very important.
**To distinguish among maxima, minima, and saddle points, investigate the definiteness of the Hessian.**