**Keywords:** quadratic functions

## Quadratic Functions^{1}

Quadratic function, a type of performance index, is universal. One of its key properties is that it can be represented in a second-order Taylor series precisely.

F(\boldsymbol{x})=\frac{1}{2}\boldsymbol{x}^TA\boldsymbol{x}+\boldsymbol{d}\boldsymbol{x}+c\tag{1}

$$

where $A$ is a symmetric matrix(if it is not symmetric, it can be easily converted into symmetric).

And recall the property of gradient:

\nabla (\boldsymbol{h}^T\boldsymbol{x})=\nabla (\boldsymbol{x}^T\boldsymbol{h})=\boldsymbol{h}\tag{2}

$$

and

\nabla (\boldsymbol{x}^TQ\boldsymbol{x})=Q\boldsymbol{x}+Q^T\boldsymbol{x}=2Q\boldsymbol{x}\tag{3}

$$

then the first-order of quadratic functions is:

\nabla F(\boldsymbol{x})=A\boldsymbol{x}+\boldsymbol{d}\tag{4}

$$

and second-order of quadratic functions is:

\nabla^2 F(\boldsymbol{x})=A\tag{5}

$$

More than second-order derivatives are $0$.

The shape of quadratic functions can be described by eigenvalues and eigenvectors of its **Hessian matrix**. **Hessian matrix** is a symmetric matrix and its eigenvectors are mutually orthogonal, let:

\boldsymbol{z}_1,\boldsymbol{z}_2,\cdots ,\boldsymbol{z}_n\tag{6}

$$

denote the whole set of the eigenvectors of the Hessian matrix $A$ and we build a matrix:

B=\begin{bmatrix}

\boldsymbol{z}_1&\boldsymbol{z}_2&\cdots &\boldsymbol{z}_n

\end{bmatrix}\tag{7}

$$

and then we have $BB^T=I$ then $B^{-1}=B^T$. When Hessian matrix is positive definite, it would have $n$ eigenvectors(Hessian has a full rank $n$) and we can change Hessian matrix into a new matrix whose basises are the set of eigenvectors:

A’=B^TAB=\begin{bmatrix}

\lambda_1&&&\\

&\lambda_2&&\\

&&\ddots&\\

&&&\lambda_n

\end{bmatrix}=\Lambda\tag{8}

$$

after changing the basis, the new Hessian matrix is a diagonal matrix whose elements on the diagonal are the eigenvalues. This process can be inversed because of $BB^T=I$:

A=BA’B^T=B\Lambda B^T\tag{9}

$$

We can calculate the derivative towards any directions:

\frac{\boldsymbol{p}^T\nabla^2 F(\boldsymbol{x})\boldsymbol{p}}{||\boldsymbol{p}||^2}=\frac{\boldsymbol{p}^TA\boldsymbol{p}}{||\boldsymbol{p}||^2}\tag{10}

$$

For columns of $B$ can span the whole space, So we can find a vector $\boldsymbol{c}$ satisfy:

\boldsymbol{p}=B\boldsymbol{c}\tag{11}

$$

then take equation(8), equation(11) into equation(10) we get:

\frac{\boldsymbol{c}^TB^TAB\boldsymbol{c}}{\boldsymbol{c}^TB^TB\boldsymbol{c}}=\frac{\boldsymbol{c}^T\Lambda\boldsymbol{c}}{\boldsymbol{c}^TI\boldsymbol{c}}=\frac{\sum^n_{i=1}\lambda_i c_i^2}{\sum_{i=1}^n c^2_i}\tag{12}

$$

From equation(12), we could conclude:

\lambda_{\text{min}}\leq \frac{\boldsymbol{p}^TA\boldsymbol{p}}{||\boldsymbol{p}||^2}\leq \lambda_{\text{max}}\tag{13}

$$

- maximum $2^{\text{nd}}$ derivative occure along with $\boldsymbol{z}_{\text{max}}$(according to $\lambda_{\text{max}}$)
- eigenvalues is the $2^{\text{nd}}$ derivative along its eigenvector.
- eigenvectors could define a new coordinate system
- eigenvectors are principal axes of the function contour.
- going along with the $\boldsymbol{z}_{\text{max}}$ direction could have the largest change in function value $|\Delta F(x)|$
- eigenvalues here are all positive because of positive definite.

An example:

F(\boldsymbol{x})=x_1^2+x_1x_2+x_2^2

=\frac{1}{2}\boldsymbol{x}^T

\begin{bmatrix}

2&1\\1&2

\end{bmatrix}\boldsymbol{x}\tag{14}

$$

we can calculate the eigenvectors and eigenvalues:

\begin{aligned}

&\lambda_1=1&&\boldsymbol{z}_1=\begin{bmatrix}

1\\-1

\end{bmatrix}\\

&\lambda_2=3&&\boldsymbol{z}_2=\begin{bmatrix}

1\\1

\end{bmatrix}

\end{aligned}\tag{15}

$$

The contour plot and 3-D plots is:

Another example:

F(\boldsymbol{x})=-\frac{1}{4}x_1^2-\frac{3}{2}x_1x_2-\frac{1}{4}x_2^2

=\frac{1}{2}\boldsymbol{x}^T

\begin{bmatrix}

-0.5&-1.5\\-1.5&-0.5

\end{bmatrix}\boldsymbol{x}\tag{16}

$$

we can calculate the eigenvectors and eigenvalues:

\begin{aligned}

&\lambda_1=1&&\boldsymbol{z}_1=\begin{bmatrix}

-1\\1

\end{bmatrix}\\

&\lambda_2=-2&&\boldsymbol{z}_2=\begin{bmatrix}

-1\\-1

\end{bmatrix}

\end{aligned}\tag{17}

$$

The contour plot and 3-D plots is:

## Conclusion

- $\lambda_i>0$ or $i=1,2,\cdots$, $F(x)$ have a single strong minimum
- $\lambda_i<0$ or $i=1,2,\cdots$, $F(x)$ have a single strong maximum
- $\lambda_i$ have both negative and positive together. $F(x)$ has a saddle point
- $\lambda_i\geq 0$ and have a $\lambda_j=0$, $F(x)$ has a weak minimum or has no stationary point.
- $\lambda_i\leq 0$ and have a $\lambda_j=0$, $F(x)$ has a weak maximum or has no stationary point.