2.8 Concavity and convexity

Definition 2.8.1 CRn is a convex if for any x,yC, for any 0t1, (1t)x+tyC.

Definition 2.8.2 f:CR is a convex function if

  • C is convex.
  • For any x,yC, for any 0t1, f((1t)x+ty)(1t)f(x)+tf(y)()

Definition 2.8.3 f is strictly convex if f is convex and () takes equality if and only if x=y or t=0 or t=1.

Definition 2.8.4 f:CR is concave if f:CR is convex.

Definition 2.8.5 Assuming fC2, mark

                  ∂2f
Hf (x) ==  ∖lef t(-------∖right)n×n
                 ∂xi∂xj

as the Hessian matrix of f at x.

Expand the Taylor series of degree 2 of fC2 at x0, we have f(x0+v)=f(x0)+i=1nfxi(x0)vi+12i,j=1n2fxixj(x0)vivjQuadratic form with respect to v+o(v2)=f(x0)+i=1nfxi(x0)vi+12(v1vn)(2fxixj)n×n(v1vn)=f(x0)+i=1nfxi(x0)vi+12vTHf(x)v

Let g(t)=f((1t)x+ty), fC2gC2. f is (strictly) convex g is (strictly) convex on [0,1]. Therefore, for any xy, g(t)=f((1t)x+ty)(yx)=i=1nfxi((1t)x+ty)(yixi)g(t)=i,j=1n2fxixj((1t)x+ty)(yixi)(yjxj)=(yx)THf((1t)x+ty)(yx)

Terminally, we have

Theorem 2.8.6 Assuming fC2, for any x, Hf(x) is positive definitef is strictly convexHf(x) is semi-positive definitef is convexHf(x) is negative definitef is strictly concaveHf(x) is semi-negative definitef is concave

Theorem 2.8.7 Assuming fC2 is convex, then for any x0,xC,

f(x) ≥ f (x  ) + ∂f (x )(x − x )
           0        0       0

Definition 2.8.8 x0 is a minimum point of f, if there exists a neighborhood u of x0 such that for any xU, f(x)f(x0). If additionally f(x)>f(x0) when xx0, then f is a strict minimum point. Similar for (strict) maximum point. Minimum points and maximum points are both extreme points.

Definition 2.8.9 x0 is a critical point of f if f(x0)=0, or for any v{vRn|v=1}, fv=0, or f(x0)=0.

Theorem 2.8.10 (Fermat) Assuming f is derivable at the extreme point x0, then x0 is the critical point of f.

Proof Assuming f(x0)0, there exists v0 such that

      -d-
∖lef t.dt f(x0 + tv)∖right|t=0 =  ∂f(x0)(v) ⁄= 0

Assuming it to be positive, then there exists δ>0 such that for any δ<t1<0<t2<δ,

f(x0 + t1v) < f (x0 ) < f(x0 + t2v)

contradicted with that x0 is the extreme point!

Theorem 2.8.11 Assuming fC2, x0 is a critical point of f. If Hf(x0) is positive (negative) definite, then x0 is the minimum (maximum) point of f. If Hf(x0) is not degenerate, yet it’s neither positive nor negative definite, then x0 is a saddle critical point, not extreme point.

Example 2.8.1 Seek all extreme points for

f (x,y) =

View it in polar coordinate. g(r,θ)=r2cosθsinθlnr2=r2lnrsin2θgr=r(2lnr+1)sin2θgθ=2r2lnrcos2θ2gr2=(2lnr+3)sin2θ2gθ2=4r2lnrsin2θ2grθ=2r(2lnr+1)cos2θ

Firstly,

⇔   r◟-=◝◜0 ◞  ∪∪
   Abadoned

Left as exercise. :)

Example 2.8.2 Does

F (x, y,z) = x(1 + yz) + exp(x + y + z) − 1

determine a implicit function z=f(x,y) near (x,y,z)=(0,0,0) ? Expand the Taylor series of degree 3 of F(x,y,z) at (0,0,0) to get 0=x+xyz1+(1+z+12z2+16z3+o(z3))(1+x+y+12(x+y)2+16(x+y)3+o((x+y)3))=2x+y+z+12(x+y+z)2+xyz+16(x+y+z)3+(x+y)2z24+o((x+y)3)+o(z3)

Assuming r=x2+y2, the linear part

2x + y + z = 0 ⇒  z = − 2x − y + o(z) + o(r)

so the linear equation has unique solution for any (x,y). According to IFT, there exists z=f(x,y) near (0,0,0). Then it’s left as exercise. :)

Conditional extreme value. Target function f(x1,...,xn), constraining conditions
∖lef t{ ∖right.

See Figure 2.6 for instance.

PIC
(a) Contour plot
PIC
(b) 3D plot
Figure 2.6: Conditional extreme value

Lagrange’s multiplier. Construct

F (x1,...,xn,λ1,...,λr) = f (x1, ...,xn ) − λ1g1 (x1,...,xn) − λrgr(x1,...,xn)

Theorem 2.8.12 Assuming f,g1,...,grC1, x is the (conditional) extreme point of f under the condition g1,...,gr=0, then there exists λ1,...,λrR such that (x,λ1,...,λr) is the critical point of F, i.e.

As for (1), we have

          ∑ r
∇f (x∗) =     λj∇gj (x∗)
          j=1

At x, the level set (equal value set) of f is tangent to the constrained surface Σ={x|gk(x)=0,k=1,...,r}.

Theorem 2.8.13 Assuming (x,λ) is the critical point of F, if the Hessian matrix H of F at (x,λ) is positive definite limited on the linear space TxΣ, i.e.

  T
v  Hv  > 0,     ∀v ∈ Tx ∗ Σ ∖{0}

then x is the strict minimum point of f under the given constrained conditions. If H is negative definite on TxΣ, then x is the strict maximum point of f under the given constrained conditions. If H has either positive or negative eigenvalues on TxΣ, then x is not the conditional extreme point of f.

Example 2.8.3 Seek (xy+yz+xz)min when x,y,z>0 and xyz=1. Construct

F(x, y,z,λ) = xy + yz + xz −  λ(xyz − 1)

then we have

Notice that

λ =  y-+-z-=  x +-z-= x-+-y-⇒  x = y =  z = 1,λ = 2
      yz       xz       xy

The Hessian matrix

H =

H is neither positive definite nor negative definite on R3, so we consider the constrained surface

Σ = {(x, y,z)|xyz  = 1},     x∗ = (1,1,1)

The tangent space

T  ∗ = ∖left{∖left.∖right|u + v + w = 1∖right} =  ∖lef t{∖left.∖right |u, v ∈ ℝ∖right}
  x

Then the quadratic form

=  2(u2 + uv + v2) > 0

meaning x=(1,1,1) is the conditional strictly minimum point of f.

Yet does f takes the minimum at x? (Counter-example showed in Example 2.8) See Figure 2.7 for hints.

If x>N, then yz=1x<1N, i.e. y<1N or z<1N. WLOG assuming y<1N, then

                  1   √ ---
f(x, y,z) ≥ xz =  -->   N  > 3 ⇒  N >  9
                  y

Assuming B={(x,y,z)|x,y,z[0,10]}, then for any (x,y,z)(R+)3B, f(x,y,z)>3=f(1,1,1)B. Since BΣ is bounded closed, f takes the minimum in BΣ, so f take the minimum in B. Terminally, f takes the minimum at the conditional extreme point (1,1,1), and the conditional minimum of f is 3.

PIC

Figure 2.7: Prove hint

Example 2.8.4 Assuming f:ER is derivable on ERm, x is the only critical point of f and it’s a strict maximum(minimum) value, yet f does not take the maximum/minimum at x (even f may be unbounded)! See Figure 2.8.

PIC
(a) View 1
PIC
(b) View 2
Figure 2.8: z=e3x+y33yex