2.1 Derivative of a multivariable function

For a function of a single variable, the derivative of $f$ at $x = x_0$ is defined as

\[ f'(x_0) = \lim_{x\to x_0}\frac{f(x)-f(x_0)}{x-x_0} = \lim_{h\to 0}\frac{f(x_0+h)-f(x_0)}{h}. \]

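For a function of several variables, we cannot divide by a vector, so the expression above must be rewritten:

\[ \frac{f(x_0+h)-f(x_0)}{h}-f'(x_0)\to 0 \iff \frac{f(x_0+h)-f(x_0)-f'(x_0)h}{h}\to 0 \iff \frac{f(x_0+h)-f(x_0)-f'(x_0)h}{|h|}\to 0 \quad (h\to 0). \]

The last form still makes sense when $h$ is a vector, because only the norm of $h$ appears in the denominator.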

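Definition 2.1.1 $x_0$ is an interior point of $E\subset\mathbb{R}^m$ if there exists $\delta_0>0$ such that $B(x_0,\delta_0)=\{x\in\mathbb{R}^m \mid \|x-x_0\|<\delta_0\}\subset E$. $B(x_0,\delta_0)$ is also called the open ball of radius $\delta_0$ centered at $x_0$.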

Definition 2.1.2 $f:E\to\mathbb{R}^p$ is differentiable (derivable) at $x_0$ if there exists a linear mapping $A:\mathbb{R}^m\to\mathbb{R}^p$ such that $f(x_0+h)=f(x_0)+Ah+o(\|h\|)$ as $h\to 0$. $A$ is called the derivative of $f$ at $x_0$, denoted $A=\partial f(x_0)$ or $Df(x_0)$.

Note In the one-dimensional case, the derivative $A=f'(x_0)$ is also a linear mapping. Since the domain and the codomain are both one-dimensional, $A:\mathbb{R}\to\mathbb{R}$ is the proportional function $h\mapsto f'(x_0)h$, which is determined by its coefficient, so the derivative reduces to a number. The geometric meaning of the derivative is shown in Figure 2.1.

Figure 2.1: The geometric meaning of the derivative

When $m=1$, $p\ge 1$, consider the derivative of $f:(a,b)\to\mathbb{R}^p$ at $t_0\in(a,b)$. Traditionally,

\[ f'(t_0)=\lim_{h\to 0}\frac{f(t_0+h)-f(t_0)}{h}. \]

If we regard $f$ as a motion, then $f'(t_0)$ is the instantaneous velocity at $t=t_0$. In the expansion

\[ f(t_0+h)=\underbrace{f(t_0)+f'(t_0)h}_{\text{uniform linear motion}}+o(h), \]

the underbraced term is the uniform linear motion that best approximates the original motion near $t_0$. Here $f'(t_0)h=\partial f(t_0)(h)$, which connects the derivative with the differential.
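As a rough numerical illustration of this approximation, the Python sketch below uses a hypothetical motion on the unit circle, $f(t)=(\cos t,\sin t)$. This curve is an assumption chosen for illustration and does not come from the notes. The sketch checks that the remainder $f(t_0+h)-f(t_0)-f'(t_0)h$, divided by $|h|$, shrinks as $h\to 0$.

```python
import math

def f(t):
    """Hypothetical motion (assumed for illustration): a point on the unit circle."""
    return (math.cos(t), math.sin(t))

def velocity(t):
    """Exact derivative f'(t) of the motion above."""
    return (-math.sin(t), math.cos(t))

def remainder_ratio(t0, h):
    """Return |f(t0+h) - f(t0) - f'(t0) h| / |h|."""
    p, p0, v = f(t0 + h), f(t0), velocity(t0)
    r = [p[i] - p0[i] - v[i] * h for i in range(2)]
    return math.hypot(r[0], r[1]) / abs(h)

# The ratio should decrease toward 0 as h decreases.
ratios = [remainder_ratio(1.0, h) for h in (1e-1, 1e-2, 1e-3)]
print(ratios)
```

For this particular curve the ratio behaves roughly like $|h|/2$, which is consistent with the remainder being $o(h)$.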

Definition 2.1.3 Let $E\subset\mathbb{R}^m$, $f:E\to\mathbb{R}^p$, and let $x_0$ be an interior point of $E$. For any $v\in\mathbb{R}^m$, if

\[ \lim_{t\to 0^+}\frac{f(x_0+tv)-f(x_0)}{t} \]

exists, then it is denoted $\frac{\partial f}{\partial v}(x_0)$ and called the derivative of $f$ along $v$ at $x_0$. In particular, if $\|v\|=1$, it is called the directional derivative.

Note

Figure 2.2: The geometric meaning of the directional derivative

Theorem 2.1.4

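1. If $f$ is differentiable at $x_0$, then for any $v\in\mathbb{R}^m$, $\frac{\partial f}{\partial v}(x_0)$ exists and

\[ \frac{\partial f}{\partial v}(x_0)=\partial f(x_0)(v). \]

2. When $m\ge 2$, there exists $f$ such that $\frac{\partial f}{\partial v}(x_0)$ exists for every $v\in\mathbb{R}^m$, yet $f$ is not continuous at $x_0$. This is mainly because, for $m\ge 2$, there are infinitely many directions (only two when $m=1$), and directional derivatives only describe the behavior of $f$ along rays starting at $x_0$.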
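To make the second statement concrete, here is a small Python sketch using a standard counterexample that is not part of these notes: $f(x,y)=\frac{x^2y}{x^4+y^2}$ with $f(0,0)=0$. Along any direction $v=(a,b)$ the one-sided quotient at the origin tends to $a^2/b$ (or $0$ when $b=0$), yet $f$ equals $1/2$ on the parabola $y=x^2$, so $f$ is not continuous at the origin. The function and helper names are assumptions made for this illustration.

```python
def f(x, y):
    """Standard counterexample (assumed here, not from the notes)."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * x * y / (x ** 4 + y * y)

def one_sided_quotient(v, t):
    """(f(0 + t v) - f(0)) / t for t > 0."""
    return (f(t * v[0], t * v[1]) - f(0.0, 0.0)) / t

def limit_along(v):
    """Closed-form value of the derivative along v = (a, b) at the origin."""
    a, b = v
    return 0.0 if b == 0.0 else a * a / b

# A few unit directions; the derivative along each one exists.
directions = [(1.0, 0.0), (0.0, 1.0), (0.6, 0.8), (-0.8, 0.6), (0.6, -0.8)]
quotient_errors = [abs(one_sided_quotient(v, 1e-6) - limit_along(v))
                   for v in directions]

# Along the parabola y = x^2 the values stay near 1/2, not f(0, 0) = 0.
parabola_values = [f(x, x * x) for x in (1e-1, 1e-2, 1e-3)]
print(quotient_errors, parabola_values)
```

The quotients agree with the closed-form limits, while the values along the parabola do not approach $f(0,0)$.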

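Example 2.1.1 Several examples of differentiable functions.

1. Constant mapping.

2. Linear mapping.

3. Inner product. Given a linear mapping $A:\mathbb{R}^n\to\mathbb{R}^n$, the function $f:\mathbb{R}^n\to\mathbb{R}$, $f(x)=\langle Ax,x\rangle$, is differentiable at any $x_0$. Notice that

\[ \begin{aligned} f(x_0+h)-f(x_0) &= \langle A(x_0+h),x_0+h\rangle-\langle Ax_0,x_0\rangle \\ &= \langle Ax_0,h\rangle+\langle Ah,x_0\rangle+\langle Ah,h\rangle \\ &= \underbrace{\langle Ax_0,h\rangle+\langle Ah,x_0\rangle}_{\text{linear with respect to } h}+O(\|h\|^2) \\ &= \langle (A+A^T)x_0,h\rangle+o(\|h\|). \end{aligned} \]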

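4. Let $A$ be an invertible square matrix. Then $f(A)=A^{-1}$ is differentiable at $A$.

Consider $A=I$ first. For any $\|B\|<1$, $I+B$ is invertible (proved in the homework). Notice that

\[ (I+B)(I-B)=I-B^2 \implies (I+B)\left[I-B+(I+B)^{-1}B^2\right]=I. \]

Therefore $f(I+B)=I-B+(I+B)^{-1}B^2$. Notice that

\[ \|(I+B)^{-1}B^2\| \le \|(I+B)^{-1}\|\,\|B\|^2 \le \left[\|(I+B)^{-1}-I\|+\|I\|\right]\|B\|^2 \le \left[\frac{\|B\|}{1-\|B\|}+1\right]\|B\|^2 = \frac{\|B\|^2}{1-\|B\|} \le 2\|B\|^2 = O(\|B\|^2)=o(\|B\|) \]

when $\|B\|<\frac12$. So $f(I+B)=f(I)-B+o(\|B\|)$, i.e. $\partial f(I)(B)=-B$.

Now consider a general invertible $A$. $A+B=A(I+A^{-1}B)$ is invertible when $\|A^{-1}B\|<1$, which holds in particular when $\|B\|<\frac{1}{\|A^{-1}\|}$, so $f$ is defined in a neighborhood of $A$. Notice that

\[ f(A+B)=(A+B)^{-1}=(I+A^{-1}B)^{-1}A^{-1}=f(I+A^{-1}B)A^{-1}=\left[f(I)-A^{-1}B+o(\|B\|)\right]A^{-1}=A^{-1}-A^{-1}BA^{-1}+o(\|B\|). \]

Therefore $f$ is differentiable at $A$ and $\partial f(A)(B)=-A^{-1}BA^{-1}$.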

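5. If $F$ and $G$ are both differentiable at $x_0$, then $H(x)=\langle F(x),G(x)\rangle$ is differentiable at $x_0$. Notice that

\[ \begin{aligned} H(x_0+h)-H(x_0) &= \langle F(x_0+h),G(x_0+h)\rangle-\langle F(x_0),G(x_0)\rangle \\ &= \langle F(x_0)+\partial F(x_0)(h)+o(\|h\|),\,G(x_0)+\partial G(x_0)(h)+o(\|h\|)\rangle-\langle F(x_0),G(x_0)\rangle \\ &= \underbrace{\langle F(x_0),\partial G(x_0)(h)\rangle+\langle \partial F(x_0)(h),G(x_0)\rangle}_{\text{linear with respect to } h}+o(\|h\|). \end{aligned} \]

Therefore

\[ \partial H(x_0)(h)=\langle F(x_0),\partial G(x_0)(h)\rangle+\langle \partial F(x_0)(h),G(x_0)\rangle. \]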
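6. Consider $\det:M_n\to\mathbb{R}$. We have

\[ \det(A)=\sum_{\sigma_1,\dots,\sigma_n}\epsilon_{\sigma_1,\dots,\sigma_n}A_{1\sigma_1}\cdots A_{n\sigma_n}, \]

where $\epsilon_{\sigma_1,\dots,\sigma_n}$ is the Levi-Civita symbol. Consider the mapping $e_{ij}:M_n\to\mathbb{R}$, $A\mapsto A_{ij}$. It is linear, so it is differentiable. Since $\det$ is obtained from the $e_{ij}$ by finitely many additions and multiplications, and every $e_{ij}$ is differentiable, $\det$ is differentiable.

Consider the matrix $E_{ij}$ whose entries are all $0$ except that the entry in row $i$, column $j$ is $1$. The matrices $\{E_{ij}\}$ form a basis of $M_n$. Notice that

\[ \frac{\partial\det}{\partial E_{ij}}(A)=\partial\det(A)(E_{ij}). \]

Expanding along column $j$, where $A^*_{ij}$ denotes the cofactor of $a_{ij}$, we have

\[ \det(A+tE_{ij})=a_{1j}A^*_{1j}+\cdots+(a_{ij}+t)A^*_{ij}+\cdots+a_{nj}A^*_{nj}=\det(A)+tA^*_{ij}, \]

i.e.

\[ \frac{\partial\det}{\partial E_{ij}}(A)=A^*_{ij}. \]

Therefore

\[ \partial\det(A)(B)=\partial\det(A)\Big(\sum_{i,j}^{n}b_{ij}E_{ij}\Big)=\sum_{i,j}^{n}b_{ij}\frac{\partial\det}{\partial E_{ij}}(A)=\sum_{i,j}^{n}b_{ij}A^*_{ij}=\sum_{j=1}^{n}\sum_{i=1}^{n}(A^{*T})_{ji}b_{ij}=\sum_{j=1}^{n}(A^{*T}B)_{jj}=\operatorname{tr}(A^{*T}B). \]

Finally,

\[ \det(A+B)=\det(A)+\operatorname{tr}(A^{*T}B)+o(\|B\|). \]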
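The formulas in items 3, 4 and 6 of Example 2.1.1 can be sanity-checked numerically. The following Python sketch does so for $2\times 2$ matrices, comparing a finite-difference quotient with step $t$ against each predicted derivative. The matrices `A`, `B`, the vectors `x0`, `h`, and all helper names are arbitrary assumptions for illustration, and agreement up to $O(t)$ is evidence, not a proof.

```python
def matmul(X, Y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def det2(X):
    return X[0][0] * X[1][1] - X[0][1] * X[1][0]

def inv2(X):
    d = det2(X)
    return [[X[1][1] / d, -X[0][1] / d], [-X[1][0] / d, X[0][0] / d]]

def cofactor2(X):
    """Cofactor matrix (A*_ij) of a 2x2 matrix."""
    return [[X[1][1], -X[1][0]], [-X[0][1], X[0][0]]]

def shifted(X, Y, t):
    """Return X + t*Y entrywise."""
    return [[X[i][j] + t * Y[i][j] for j in range(2)] for i in range(2)]

def quad(M, x):
    """f(x) = <Mx, x>."""
    return sum(M[i][j] * x[j] * x[i] for i in range(2) for j in range(2))

# Arbitrary test data (assumptions for illustration only).
A = [[2.0, 1.0], [0.5, 3.0]]
B = [[0.3, -0.7], [1.1, 0.4]]
x0, h = [1.0, -2.0], [0.4, 0.9]
t = 1e-6
idx = [(i, j) for i in range(2) for j in range(2)]

# Item 3: the derivative of <Ax, x> at x0 applied to h is <(A + A^T) x0, h>.
pred_quad = sum((A[i][j] + A[j][i]) * x0[j] * h[i] for i, j in idx)
xt = [x0[i] + t * h[i] for i in range(2)]
err_quad = abs((quad(A, xt) - quad(A, x0)) / t - pred_quad)

# Item 4: the derivative of A -> A^{-1} at A applied to B is -A^{-1} B A^{-1}.
Ainv = inv2(A)
pred_inv = [[-v for v in row] for row in matmul(matmul(Ainv, B), Ainv)]
Ainv_t = inv2(shifted(A, B, t))
err_inv = max(abs((Ainv_t[i][j] - Ainv[i][j]) / t - pred_inv[i][j])
              for i, j in idx)

# Item 6: the derivative of det at A applied to B is tr(A*^T B) = sum_ij A*_ij b_ij.
C = cofactor2(A)
pred_det = sum(C[i][j] * B[i][j] for i, j in idx)
err_det = abs((det2(shifted(A, B, t)) - det2(A)) / t - pred_det)

print(err_quad, err_inv, err_det)
```

All three errors should be of the order of the step `t`, which matches the $o(\|h\|)$ and $o(\|B\|)$ remainders in the expansions.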

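Theorem 2.1.5 (Chain rule). Given $F:E\to\mathbb{R}^p$ and $G:\mathbb{R}^p\to\mathbb{R}^q$, if $F$ is differentiable at $x_0$ and $G$ is differentiable at $y_0=F(x_0)$, then $G\circ F$ is differentiable at $x_0$, with

\[ \partial(G\circ F)(x_0)=\partial G(y_0)\circ\partial F(x_0), \qquad \partial(G\circ F)(x_0)(h)=\partial G(y_0)\big(\partial F(x_0)(h)\big). \]

In words: the differential of a composition is the composition of the differentials.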
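A minimal Python sketch of the chain rule follows, assuming two hypothetical maps $F(x_1,x_2)=(x_1x_2,\;x_1+x_2^2)$ and $G(y_1,y_2)=y_1^2+\sin y_2$, chosen only for illustration. Their differentials are written out by hand, composed, and compared with a finite-difference quotient of $G\circ F$.

```python
import math

def F(x):
    """Hypothetical map R^2 -> R^2 (assumed for illustration)."""
    return (x[0] * x[1], x[0] + x[1] ** 2)

def dF(x, h):
    """Differential of F at x applied to h (Jacobian times h)."""
    return (x[1] * h[0] + x[0] * h[1], h[0] + 2 * x[1] * h[1])

def G(y):
    """Hypothetical map R^2 -> R (assumed for illustration)."""
    return y[0] ** 2 + math.sin(y[1])

def dG(y, k):
    """Differential of G at y applied to k (gradient dotted with k)."""
    return 2 * y[0] * k[0] + math.cos(y[1]) * k[1]

x0, h, t = (1.0, 2.0), (0.3, -0.5), 1e-6

# Chain rule: apply dF at x0 first, then dG at y0 = F(x0).
composed = dG(F(x0), dF(x0, h))

# Finite-difference quotient of the composition G o F along h.
xt = (x0[0] + t * h[0], x0[1] + t * h[1])
quotient = (G(F(xt)) - G(F(x0))) / t

chain_error = abs(quotient - composed)
print(composed, quotient, chain_error)
```

The two numbers agree up to an error of order `t`, as the theorem predicts for this pair of maps.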