Chapter 6 Positive Definite Matrices

琛彤麻麻 · 2022-02-01

6.1 Minima, Maxima, and Saddle Points

$$F(x,y) = 7 + 2(x+y)^2 - y\sin y - x^3 \qquad\qquad f(x,y) = 2x^2 + 4xy + y^2$$

Does either $F(x,y)$ or $f(x,y)$ have a minimum at the point $x=y=0$?

Remark 3 The zero-order terms $F(0,0)=7$ and $f(0,0)=0$ have no effect on the answer.

Remark 4 The linear terms give a necessary condition: to have any chance of a minimum, the first derivatives must vanish at $x=y=0$:
$$\frac{\partial F}{\partial x} = 4(x+y) - 3x^2 = 0 \quad\text{and}\quad \frac{\partial F}{\partial y} = 4(x+y) - y\cos y - \sin y = 0$$
$$\frac{\partial f}{\partial x} = 4x + 4y = 0 \quad\text{and}\quad \frac{\partial f}{\partial y} = 4x + 2y = 0. \qquad \text{All zero.}$$
Remark 5 The second derivatives at $(0,0)$ are decisive:
$$\frac{\partial^2 F}{\partial x^2} = 4 - 6x = 4 \qquad\qquad \frac{\partial^2 f}{\partial x^2} = 4$$
$$\frac{\partial^2 F}{\partial x\,\partial y} = \frac{\partial^2 F}{\partial y\,\partial x} = 4 \qquad\qquad \frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial^2 f}{\partial y\,\partial x} = 4$$
$$\frac{\partial^2 F}{\partial y^2} = 4 + y\sin y - 2\cos y = 2 \qquad\qquad \frac{\partial^2 f}{\partial y^2} = 2$$
Remark 6 The higher-degree terms in $F$ have no effect on the question of a local minimum, but they can prevent it from being a global minimum.
$$\text{Express } f(x,y) \text{ using squares} \qquad f = ax^2 + 2bxy + cy^2 = a\Big(x + \frac{b}{a}y\Big)^2 + \Big(c - \frac{b^2}{a}\Big)y^2 \tag{2}$$
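To see identity (2) concretely, here is a minimal symbolic check, a sketch assuming the sympy library (the variable names are mine, not from the text):

```python
# Verify equation (2): completing the square in f = ax^2 + 2bxy + cy^2.
from sympy import symbols, expand, simplify

x, y, a, b, c = symbols('x y a b c')

f = a*x**2 + 2*b*x*y + c*y**2
squares = a*(x + (b/a)*y)**2 + (c - b**2/a)*y**2

# The difference expands to zero, so the two forms agree whenever a != 0.
print(simplify(expand(squares - f)))   # prints 0
```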

6A $ax^2 + 2bxy + cy^2$ is positive definite if and only if $a>0$ and $ac>b^2$. Any $F(x,y)$ has a minimum at a point where $\frac{\partial F}{\partial x} = \frac{\partial F}{\partial y} = 0$ with
$$\frac{\partial^2 F}{\partial x^2} > 0 \quad\text{and}\quad \left[\frac{\partial^2 F}{\partial x^2}\right]\left[\frac{\partial^2 F}{\partial y^2}\right] > \left[\frac{\partial^2 F}{\partial x\,\partial y}\right]^2 \tag{3}$$
Singular case: $ac = b^2$

Saddle point: $ac < b^2$
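As a quick check of test (3) on the examples above, the sketch below plugs in the second derivatives from Remark 5 (plain Python; the helper `classify` is mine):

```python
# Apply the two-variable second-derivative test (3) at (0, 0).
def classify(fxx, fxy, fyy):
    """Return the type of critical point given the second derivatives."""
    det = fxx * fyy - fxy**2          # same sign as ac - b^2
    if fxx > 0 and det > 0:
        return "minimum"
    if det < 0:
        return "saddle point"
    return "singular case"

# Second derivatives of F and f at the origin (Remark 5): both are 4, 4, 2.
print(classify(4, 4, 2))   # saddle point, since 4*2 - 4^2 = -8 < 0
```

Both functions share the same second derivatives at the origin, so both have saddle points there: neither has a minimum.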

Higher Dimensions: Linear Algebra

Calculus would be enough to find our conditions $F_{xx} > 0$ and $F_{xx}F_{yy} > F_{xy}^2$ for a minimum.

A quadratic $f(x,y)$ comes directly from a symmetric 2 by 2 matrix:
$$x^TAx \text{ in } \mathbf{R}^2 \qquad ax^2 + 2bxy + cy^2 = \begin{bmatrix} x & y \end{bmatrix} \begin{bmatrix} a & b \\ b & c \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \tag{4}$$
For any symmetric matrix $A$, the product $x^TAx$ is a pure quadratic form $f(x_1, \dots, x_n)$:
$$x^TAx \text{ in } \mathbf{R}^n \qquad \begin{bmatrix} x_1 & x_2 & \dots & x_n \end{bmatrix} \begin{bmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \dots & a_{nn} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \sum_{i=1}^n \sum_{j=1}^n a_{ij}x_ix_j \tag{5}$$
The diagonal entries multiply the squares and the off-diagonal pairs combine: $f = a_{11}x_1^2 + 2a_{12}x_1x_2 + \dots + a_{nn}x_n^2$.
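A short numerical illustration of equation (5), a sketch assuming numpy (the symmetric matrix and the vector are arbitrary choices):

```python
import numpy as np

# Compare x^T A x with the double sum in equation (5).
A = np.array([[2., 1., 0.],
              [1., 2., 1.],
              [0., 1., 2.]])        # symmetric 3 by 3
x = np.array([1., -1., 2.])

matrix_form = x @ A @ x
double_sum = sum(A[i, j] * x[i] * x[j] for i in range(3) for j in range(3))
print(matrix_form, double_sum)      # both 6.0
```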

Then $F$ has a minimum when the pure quadratic $x^TAx$ is positive definite.
$$\text{Taylor series} \qquad F(x) = F(0) + x^T(\text{grad } F) + \frac{1}{2}x^TAx + \text{higher order terms}$$

6.2 Tests for Positive Definiteness

6B Each of the following tests is a necessary and sufficient condition for the real symmetric matrix $A$ to be positive definite:

(I) $x^TAx > 0$ for all nonzero real vectors $x$.

(II) All the eigenvalues of $A$ satisfy $\lambda_i > 0$.

(III) All the upper left submatrices $A_k$ have positive determinants.

(IV) All the pivots (without row exchanges) satisfy $d_k > 0$.
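Here is a numerical sketch of tests (II)-(IV) on one positive definite matrix (numpy assumed; the tridiagonal example is an illustration, not from the text):

```python
import numpy as np

A = np.array([[ 2., -1.,  0.],
              [-1.,  2., -1.],
              [ 0., -1.,  2.]])

# (II) all eigenvalues positive
print(np.linalg.eigvalsh(A))                 # 2 - sqrt(2), 2, 2 + sqrt(2)

# (III) all upper left determinants positive
print([np.linalg.det(A[:k, :k]) for k in (1, 2, 3)])   # 2, 3, 4

# (IV) all pivots positive: the Cholesky factorization A = L L^T succeeds
# exactly when A is positive definite, and the pivots are diag(L)^2.
L = np.linalg.cholesky(A)
print(L.diagonal() ** 2)                     # 2, 3/2, 4/3
```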

For a rectangular matrix $R$ ($m$ by $n$ with $m \geq n$, so there are more equations than unknowns), consider the least-squares problem $Rx = b$.

The least-squares choice $\hat x$ is the solution of $R^TR\hat x = R^Tb$. That matrix $A = R^TR$ is not only symmetric but positive definite, provided the columns of $R$ are independent.

6C The symmetric matrix $A$ is positive definite if and only if

(V) There is a matrix $R$ with independent columns such that $A = R^TR$.
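A sketch of test (V) with a random example (numpy assumed): any $R$ with independent columns gives a positive definite $A = R^TR$, because $x^TAx = \|Rx\|^2 > 0$.

```python
import numpy as np

rng = np.random.default_rng(0)
R = rng.standard_normal((5, 3))    # random 5 by 3: independent columns
A = R.T @ R                        # symmetric 3 by 3

x = rng.standard_normal(3)
print(x @ A @ x, np.linalg.norm(R @ x) ** 2)   # equal, and positive
```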

Semidefinite Matrices

6D Each of the following tests is a necessary and sufficient condition for a symmetric matrix $A$ to be positive semidefinite:

(I′) $x^TAx \geq 0$ for all vectors $x$ (this defines positive semidefinite).

(II′) All the eigenvalues of $A$ satisfy $\lambda_i \geq 0$.

(III′) No principal submatrices have negative determinants.

(IV′) No pivots are negative.

(V′) There is a matrix $R$, possibly with dependent columns, such that $A = R^TR$.
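A minimal semidefinite example, chosen here for illustration (numpy assumed): a rank-1 matrix passes the primed tests but fails the strict ones.

```python
import numpy as np

A = np.array([[1., 1.],
              [1., 1.]])            # rank 1

print(np.linalg.eigvalsh(A))        # 0 and 2: nonnegative, test (II')
x = np.array([1., -1.])
print(x @ A @ x)                    # 0 for this nonzero x: not definite

R = np.array([[1., 1.]])            # dependent columns, test (V')
print(R.T @ R)                      # reproduces A
```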

6.3 Singular Value Decomposition

Singular Value Decomposition: Any $m$ by $n$ matrix $A$ can be factored into
$$A = U\Sigma V^T = (\text{orthogonal})(\text{diagonal})(\text{orthogonal})$$
The columns of $U$ ($m$ by $m$) are eigenvectors of $AA^T$, and the columns of $V$ ($n$ by $n$) are eigenvectors of $A^TA$. The $r$ singular values on the diagonal of $\Sigma$ ($m$ by $n$) are the square roots of the nonzero eigenvalues of both $AA^T$ and $A^TA$.
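A numerical sketch of the factorization (numpy assumed; note that `np.linalg.svd` returns $V^T$ directly, and the example matrix is arbitrary):

```python
import numpy as np

A = np.array([[2., 0., 1.],
              [0., 1., 0.]])        # 2 by 3, rank 2

U, s, Vt = np.linalg.svd(A)         # s holds the singular values
Sigma = np.zeros_like(A)
Sigma[:2, :2] = np.diag(s)

print(np.allclose(A, U @ Sigma @ Vt))     # True: A = U Sigma V^T
print(np.allclose(A @ Vt.T, U @ Sigma))   # True: AV = U Sigma (Remark 3)
print(np.allclose(s**2, np.linalg.eigvalsh(A @ A.T)[::-1]))  # sigma_j^2
```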

Remark 1. For positive definite matrices, $\Sigma$ is $\Lambda$ and $U\Sigma V^T$ is identical to $Q\Lambda Q^T$. For other symmetric matrices, any negative eigenvalues in $\Lambda$ become positive in $\Sigma$. For complex matrices, $\Sigma$ remains real but $U$ and $V$ become unitary.

Remark 2. $U$ and $V$ give orthonormal bases for all four fundamental subspaces:
$$\begin{aligned} &\text{first } r \text{ columns of } U: && \text{column space of } A \\ &\text{last } m-r \text{ columns of } U: && \text{left nullspace of } A \\ &\text{first } r \text{ columns of } V: && \text{row space of } A \\ &\text{last } n-r \text{ columns of } V: && \text{nullspace of } A \end{aligned}$$
Remark 3. When $A$ multiplies a column $v_j$ of $V$, it produces $\sigma_j$ times a column of $U$. That comes directly from $AV = U\Sigma$, looked at a column at a time.

Remark 4. Eigenvectors of $AA^T$ and $A^TA$ must go into the columns of $U$ and $V$:
$$AA^T = (U\Sigma V^T)(V\Sigma^T U^T) = U\Sigma\Sigma^T U^T \quad\text{and, similarly,}\quad A^TA = V\Sigma^T\Sigma V^T \tag{1}$$
Remark 5. Here is the reason that $Av_j = \sigma_j u_j$. Start with $A^TAv_j = \sigma_j^2 v_j$:
$$\text{Multiply by } A \qquad AA^T(Av_j) = \sigma_j^2(Av_j) \tag{2}$$
So $Av_j$ is an eigenvector of $AA^T$. Its length is $\sigma_j$, since $\|Av_j\|^2 = v_j^TA^TAv_j = \sigma_j^2$, so the unit eigenvector is $u_j = Av_j/\sigma_j$.

Application 4: Least Squares

For a rectangular system $Ax = b$, the least-squares solution comes from the normal equations $A^TA\hat x = A^Tb$. If $A$ has dependent columns, then $A^TA$ is not invertible and $\hat x$ is not determined.
$$\text{The optimal solution of } Ax = b \text{ is the minimum length solution of } A^TA\hat x = A^Tb.$$
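A sketch of this statement with dependent columns (numpy assumed; `np.linalg.pinv` builds the pseudoinverse from the SVD):

```python
import numpy as np

A = np.array([[1., 1.],
              [1., 1.],
              [0., 0.]])            # dependent columns: A^T A is singular
b = np.array([1., 1., 1.])

x_plus = np.linalg.pinv(A) @ b      # minimum length least-squares solution
print(x_plus)                       # [0.5, 0.5]
print(A.T @ A @ x_plus, A.T @ b)    # both [2., 2.]: normal equations hold
```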
