Chapter 6 Positive Definite Matrices
6.1 Minima, Maxima, and Saddle Points
$$F(x,y) = 7 + 2(x+y)^2 - y\sin y - x^3 \qquad\qquad f(x,y) = 2x^2 + 4xy + y^2$$
Does either $F(x,y)$ or $f(x,y)$ have a minimum at the point $x=y=0$?
Remark 1 The zero-order terms $F(0,0)=7$ and $f(0,0)=0$ have no effect on the answer.
Remark 2 The linear terms give a necessary condition: to have any chance of a minimum, the first derivatives must vanish at $x=y=0$:
$$\frac{\partial F}{\partial x} = 4(x+y) - 3x^2 = 0 \qquad \text{and} \qquad \frac{\partial F}{\partial y} = 4(x+y) - y\cos y - \sin y = 0$$
$$\frac{\partial f}{\partial x} = 4x + 4y = 0 \qquad \text{and} \qquad \frac{\partial f}{\partial y} = 4x + 2y = 0. \qquad \text{All zero.}$$
Remark 3 The second derivatives at $(0,0)$ are decisive:
$$\frac{\partial^2 F}{\partial x^2} = 4 - 6x = 4 \qquad\qquad \frac{\partial^2 f}{\partial x^2} = 4$$
$$\frac{\partial^2 F}{\partial x\,\partial y} = \frac{\partial^2 F}{\partial y\,\partial x} = 4 \qquad\qquad \frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial^2 f}{\partial y\,\partial x} = 4$$
$$\frac{\partial^2 F}{\partial y^2} = 4 + y\sin y - 2\cos y = 2 \qquad\qquad \frac{\partial^2 f}{\partial y^2} = 2$$
Remark 4 The higher-degree terms in $F$ have no effect on the question of a local minimum, but they can prevent it from being a global minimum.
$$\text{Express } f(x,y) \text{ using squares} \qquad f = ax^2 + 2bxy + cy^2 = a\left(x + \frac{b}{a}y\right)^2 + \left(c - \frac{b^2}{a}\right)y^2 \tag{2}$$
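Worked case (my addition, obtained directly from equation (2) with $a=2$, $b=2$, $c=1$ for the quadratic at the start of this section):
$$f = 2x^2 + 4xy + y^2 = 2(x+y)^2 - y^2.$$
Here $c - b^2/a = 1 - 2 = -1 < 0$, so along the line $x=-y$ the value $f = -y^2$ is negative: the origin is a saddle point rather than a minimum.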
6A The quadratic $ax^2+2bxy+cy^2$ is positive definite if and only if $a>0$ and $ac>b^2$. Any $F(x,y)$ has a minimum at a point where $\frac{\partial F}{\partial x} = \frac{\partial F}{\partial y} = 0$ with
$$\frac{\partial^2 F}{\partial x^2} > 0 \qquad \text{and} \qquad \left[\frac{\partial^2 F}{\partial x^2}\right]\left[\frac{\partial^2 F}{\partial y^2}\right] > \left[\frac{\partial^2 F}{\partial x\,\partial y}\right]^2 \tag{3}$$
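The derivatives above and test (3) can be checked symbolically. This is a minimal sketch of my own (not from the text), using the sympy library:

```python
import sympy as sp

x, y = sp.symbols('x y')
F = 7 + 2*(x + y)**2 - y*sp.sin(y) - x**3   # F(x, y) from the text
f = 2*x**2 + 4*x*y + y**2                   # f(x, y) from the text

for name, g in [('F', F), ('f', f)]:
    # second derivatives at the stationary point (0, 0)
    gxx = sp.diff(g, x, 2).subs({x: 0, y: 0})
    gyy = sp.diff(g, y, 2).subs({x: 0, y: 0})
    gxy = sp.diff(g, x, y).subs({x: 0, y: 0})
    # test (3): g_xx > 0 and g_xx * g_yy > g_xy^2
    is_min = (gxx > 0) and (gxx*gyy > gxy**2)
    print(name, gxx, gyy, gxy, 'local minimum' if is_min else 'not a minimum')
```

Both functions fail the second inequality ($4\cdot 2 < 4^2$), so neither has a minimum at the origin.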
Singular case: $ac = b^2$

Saddle point: $ac < b^2$
Higher Dimensions: Linear Algebra
Calculus would be enough to find our conditions $F_{xx}>0$ and $F_{xx}F_{yy}>F_{xy}^2$ for a minimum.
A quadratic $f(x,y)$ comes directly from a symmetric 2 by 2 matrix:
$$\text{$x^TAx$ in $R^2$} \qquad ax^2+2bxy+cy^2 = \begin{bmatrix} x & y \end{bmatrix} \begin{bmatrix} a & b \\ b & c \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \tag{4}$$
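For instance, with the entries $a=2$, $b=2$, $c=1$ from the quadratic $f = 2x^2 + 4xy + y^2$ above, a short numpy check (my own sketch) confirms that $x^TAx$ reproduces $f$:

```python
import numpy as np

# symmetric matrix for f = 2x^2 + 4xy + y^2  (a = 2, b = 2, c = 1)
A = np.array([[2.0, 2.0],
              [2.0, 1.0]])

rng = np.random.default_rng(0)
for _ in range(3):
    x_val, y_val = rng.standard_normal(2)
    v = np.array([x_val, y_val])
    quad_form = v @ A @ v                          # x^T A x
    direct = 2*x_val**2 + 4*x_val*y_val + y_val**2
    print(np.isclose(quad_form, direct))           # True each time
```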
For any symmetric matrix $A$, the product $x^TAx$ is a pure quadratic form $f(x_1, \dots, x_n)$:
$$\text{$x^TAx$ in $R^n$} \qquad \begin{bmatrix} x_1 & x_2 & \dots & x_n \end{bmatrix} \begin{bmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \dots & a_{nn} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \sum^n_{i=1} \sum^n_{j=1} a_{ij}x_ix_j \tag{5}$$
The result is the pure quadratic $f = a_{11}x_1^2 + 2a_{12}x_1x_2 + \cdots + a_{nn}x_n^2$.
Then $F$ has a minimum when the pure quadratic $x^TAx$ is positive definite.
$$\text{Taylor series} \qquad F(x) = F(0) + x^T(\text{grad } F) + \frac{1}{2}x^TAx + \text{higher order terms}$$
6.2 Tests for Positive Definiteness
6B Each of the following tests is a necessary and sufficient condition for the real symmetric matrix $A$ to be positive definite:

(I) $x^TAx > 0$ for all nonzero real vectors $x$.

(II) All the eigenvalues of $A$ satisfy $\lambda_i > 0$.

(III) All the upper left submatrices $A_k$ have positive determinants.

(IV) All the pivots (without row exchanges) satisfy $d_k > 0$.
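These tests are easy to try numerically. The sketch below is my own (the helper name `positive_definite_checks` is hypothetical); the Cholesky call stands in for the pivot test (IV), since for a symmetric matrix Cholesky succeeds exactly when all pivots are positive:

```python
import numpy as np

def positive_definite_checks(A):
    """Apply tests (II)-(IV) to a real symmetric matrix A."""
    n = A.shape[0]
    eig_test = np.all(np.linalg.eigvalsh(A) > 0)                            # (II)
    minor_test = all(np.linalg.det(A[:k, :k]) > 0 for k in range(1, n + 1)) # (III)
    try:
        np.linalg.cholesky(A)                                               # (IV) all pivots positive
        pivot_test = True
    except np.linalg.LinAlgError:
        pivot_test = False
    return eig_test, minor_test, pivot_test

A = np.array([[ 2.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  2.0]])       # a classic positive definite example
print(positive_definite_checks(A))       # (True, True, True)
```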
For a rectangular matrix $R$ with $m$ equations and $n$ unknowns, $m \geq n$, the least-squares problem $Rx=b$ generally has no exact solution. The least-squares choice $\hat x$ is the solution of $R^TR\hat x = R^Tb$. That matrix $A = R^TR$ is not only symmetric but positive definite, provided the columns of $R$ are independent.
6C The symmetric matrix $A$ is positive definite if and only if

(V) There is a matrix $R$ with independent columns such that $A = R^TR$.
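A small numerical illustration (my own sketch): any $R$ with independent columns gives $x^TAx = \|Rx\|^2 > 0$ for $x \neq 0$:

```python
import numpy as np

rng = np.random.default_rng(1)
R = rng.standard_normal((5, 3))    # 5 by 3; random columns are independent with probability 1
A = R.T @ R                        # symmetric, and positive definite by test (V)

x = rng.standard_normal(3)
print(np.isclose(x @ A @ x, np.linalg.norm(R @ x)**2))   # True: x^T A x = ||Rx||^2
print(np.all(np.linalg.eigvalsh(A) > 0))                 # True: test (II) agrees
```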
Semidefinite Matrices
6D Each of the following tests is a necessary and sufficient condition for a symmetric matrix $A$ to be positive semidefinite:

(I′) $x^TAx \geq 0$ for all vectors $x$ (this defines positive semidefinite).

(II′) All the eigenvalues of $A$ satisfy $\lambda_i \geq 0$.

(III′) No principal submatrices have negative determinants.

(IV′) No pivots are negative.

(V′) There is a matrix $R$, possibly with dependent columns, such that $A = R^TR$.
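For contrast with the definite case (again a sketch of my own): a rank-deficient $R$ gives a semidefinite $A = R^TR$ with a zero eigenvalue, matching test (V′):

```python
import numpy as np

R = np.array([[1.0, 2.0],
              [2.0, 4.0]])       # dependent columns (second column = 2 * first)
A = R.T @ R                      # positive semidefinite by test (V')
print(np.linalg.eigvalsh(A))     # one eigenvalue is 0 (up to round-off), the other is positive
```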
6.3 Singular Value Decomposition
Singular Value Decomposition: Any $m$ by $n$ matrix $A$ can be factored into
$$A = U\Sigma V^T = (\text{orthogonal})(\text{diagonal})(\text{orthogonal})$$
The columns of $U$ ($m$ by $m$) are eigenvectors of $AA^T$, and the columns of $V$ ($n$ by $n$) are eigenvectors of $A^TA$. The $r$ singular values on the diagonal of $\Sigma$ ($m$ by $n$) are the square roots of the nonzero eigenvalues of both $AA^T$ and $A^TA$.
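A quick numerical check of these statements (a sketch of my own, using numpy's SVD routine):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 3))             # a 4 by 3 example

U, sigma, Vt = np.linalg.svd(A)             # A = U Sigma V^T

# singular values squared match the eigenvalues of A^T A (and the nonzero ones of A A^T)
eigs = np.sort(np.linalg.eigvalsh(A.T @ A))[::-1]
print(np.allclose(sigma**2, eigs))          # True

# A v_j = sigma_j u_j, one column at a time (AV = U Sigma)
V = Vt.T
for j in range(3):
    print(np.allclose(A @ V[:, j], sigma[j] * U[:, j]))   # True
```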
Remark 1. For positive definite matrices, $\Sigma$ is $\Lambda$ and $U\Sigma V^T$ is identical to $Q\Lambda Q^T$. For other symmetric matrices, any negative eigenvalues in $\Lambda$ become positive in $\Sigma$. For complex matrices, $\Sigma$ remains real but $U$ and $V$ become unitary.
Remark 2. $U$ and $V$ give orthonormal bases for all four fundamental subspaces:
$$\begin{aligned}
&\text{first } r \text{ columns of } U: &&\text{column space of } A \\
&\text{last } m-r \text{ columns of } U: &&\text{left null space of } A \\
&\text{first } r \text{ columns of } V: &&\text{row space of } A \\
&\text{last } n-r \text{ columns of } V: &&\text{null space of } A
\end{aligned}$$
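In numpy terms (again my own sketch, not from the text), these bases are read off by slicing $U$ and $V$ once the rank $r$ is known:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 2)) @ rng.standard_normal((2, 3))   # 4 by 3 with rank r = 2

U, sigma, Vt = np.linalg.svd(A)
r = np.sum(sigma > 1e-10 * sigma[0])     # numerical rank

col_space  = U[:, :r]      # first r columns of U:      column space of A
left_null  = U[:, r:]      # last m - r columns of U:   left null space of A
row_space  = Vt[:r, :].T   # first r columns of V:      row space of A
null_space = Vt[r:, :].T   # last n - r columns of V:   null space of A

print(np.allclose(A.T @ left_null, 0))   # True: these columns are orthogonal to the column space
print(np.allclose(A @ null_space, 0))    # True: A sends the null space basis to zero
```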
Remark 3. When $A$ multiplies a column $v_j$ of $V$, it produces $\sigma_j$ times a column of $U$. That comes directly from $AV = U\Sigma$, looked at a column at a time.
Remark 4. Eigenvectors of $AA^T$ and $A^TA$ must go into the columns of $U$ and $V$:
$$AA^T = (U\Sigma V^T)(V\Sigma^TU^T) = U\Sigma\Sigma^TU^T \qquad \text{and, similarly,} \qquad A^TA = V\Sigma^T\Sigma V^T \tag{1}$$
Remark 5. Here is the reason that $Av_j = \sigma_ju_j$. Start with $A^TAv_j = \sigma_j^2v_j$.
$$\text{Multiply by } A \qquad AA^TAv_j = \sigma_j^2\,Av_j \tag{2}$$
Thus $Av_j$ is an eigenvector of $AA^T$ with eigenvalue $\sigma_j^2$. Its length is $\sigma_j$, since $\|Av_j\|^2 = v_j^TA^TAv_j = \sigma_j^2$, so the unit eigenvector is $u_j = Av_j/\sigma_j$, which is $Av_j = \sigma_ju_j$.
4. Least Squares
For a rectangular system $Ax = b$, the least-squares solution comes from the normal equations $A^TA\hat x = A^Tb$. If $A$ has dependent columns, then $A^TA$ is not invertible and $\hat x$ is not determined.
$$\text{The optimal solution of } Ax = b \text{ is the minimum length solution of } A^TA\hat x = A^Tb.$$
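In numpy, this minimum-length least-squares solution is what the pseudoinverse produces (a minimal sketch of my own; `np.linalg.pinv` and `np.linalg.lstsq` are standard numpy routines):

```python
import numpy as np

# A with dependent columns: the normal equations A^T A x = A^T b have many solutions
A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])          # second column = 2 * first column
b = np.array([1.0, 2.0, 2.0])

x_plus = np.linalg.pinv(A) @ b                        # minimum-length least-squares solution
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)       # lstsq returns the same minimum-norm choice
print(np.allclose(x_plus, x_lstsq))                   # True

# x_plus satisfies the normal equations; among all such solutions it has the smallest length
print(np.allclose(A.T @ A @ x_plus, A.T @ b))         # True
```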