An Introduction to Differential Geometry through ... - USU Math/Stat

183 downloads 0 Views 847KB Size Report
Chapter 9 investigates the Lie bracket of vector-fields and Killing vec- tors for a metric. Chapter 10 generalizes chapter 8 and introduces the general notion of a ...
An Introduction to Differential Geometry through Computation Mark E. Fels c Draft date April 18, 2011

Contents Preface

iii

1 Preliminaries 1.1 Open sets . . . . . . . . . . . . . . . 1.2 Smooth functions . . . . . . . . . . . 1.3 Smooth Curves . . . . . . . . . . . . 1.4 Composition and the Chain-rule . . . 1.5 Vector Spaces, Basis, and Subspaces . 1.6 Algebras . . . . . . . . . . . . . . . . 1.7 Exercises . . . . . . . . . . . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

1 . 1 . 2 . 5 . 6 . 9 . 17 . 20

2 Linear Transformations 2.1 Matrix Representation . . . . . . . . . . . . . 2.2 Kernel, Rank, and the Rank-Nullity Theorem 2.3 Composition, Inverse and Isomorphism . . . . 2.4 Exercises . . . . . . . . . . . . . . . . . . . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

25 25 30 34 39

3 Tangent Vectors 3.1 Tangent Vectors 3.2 Derivations . . 3.3 Vector fields . . 3.4 Exercises . . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

43 43 44 51 53

4 The 4.1 4.2 4.3 4.4

and Curves . . . . . . . . . . . . . . . . . . . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

push-forward and the Jacobian The push-forward using curves . . . . . . . . . . . . . . . . . . The push-forward using derivations . . . . . . . . . . . . . . . The Chain rule, Immersions, Submersions, and Diffeomorphisms Change of Variables . . . . . . . . . . . . . . . . . . . . . . . . i

55 55 58 62 65

ii

CONTENTS 4.5

Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71

5 Differential One-forms and Metric Tensors 5.1 Differential One-Forms . . . . . . . . . . . . . . 5.2 Bilinear forms and Inner Products . . . . . . . . 5.3 Tensor product . . . . . . . . . . . . . . . . . . 5.4 Metric Tensors . . . . . . . . . . . . . . . . . . 5.4.1 Arc-length . . . . . . . . . . . . . . . . . 5.4.2 Orthonormal Frames . . . . . . . . . . . 5.5 Raising and Lowering Indices and the Gradient 5.6 A tale of two duals . . . . . . . . . . . . . . . . 5.7 Exercises . . . . . . . . . . . . . . . . . . . . . . 6 The 6.1 6.2 6.3 6.4

Pullback and Isometries The Pullback of a Differential One-form The Pullback of a Metric Tensor . . . . . Isometries . . . . . . . . . . . . . . . . . Exercises . . . . . . . . . . . . . . . . . .

9 The 9.1 9.2 9.3

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

111 . 111 . 117 . 122 . 130

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

187 187 189 192 193 193 200

and the Straightening Lemma . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

149 149 155 158 163 168

10 Hypersurfaces 10.1 Regular Level Hyper-Surfaces 10.2 Patches and Covers . . . . . . 10.3 Maps between surfaces . . . . 10.4 More General Surfaces . . . . 10.5 Metric Tensors on Surfaces . . 10.6 Exercises . . . . . . . . . . . . 8 Flows, Invariants 8.1 Flows . . . . 8.2 Invariants . . 8.3 Invariants I . 8.4 Invariants II . 8.5 Exercises . . .

. . . . . . . . .

75 75 81 86 89 93 95 97 105 107

Lie Bracket and Killing Lie Bracket . . . . . . . . Killing vectors . . . . . . . Exercises . . . . . . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

Vectors 171 . . . . . . . . . . . . . . . . . . . . 171 . . . . . . . . . . . . . . . . . . . . 179 . . . . . . . . . . . . . . . . . . . . 185

CONTENTS 10 Hypersurfaces 10.1 Regular Level Hyper-Surfaces 10.2 Patches and Covers . . . . . . 10.3 Maps between surfaces . . . . 10.4 More General Surfaces . . . . 10.5 Metric Tensors on Surfaces . . 10.6 Exercises . . . . . . . . . . . .

iii

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

187 . 187 . 189 . 192 . 193 . 193 . 200

11 Group actions and Multi-parameter Groups 11.1 Group Actions . . . . . . . . . . . . . . . . . . . . . 11.2 Infinitesimal Generators . . . . . . . . . . . . . . . 11.3 Right and Left Invariant Vector Fields . . . . . . . 11.4 Invariant Metric Tensors on Multi-parameter groups 11.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

203 203 207 209 211 214

12 Connections and Curvature 217 12.1 Connections . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217 12.2 Parallel Transport . . . . . . . . . . . . . . . . . . . . . . . . . 217 12.3 Curvature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217

iv

CONTENTS

Preface This book was conceived after numerous discussions with my colleague Ian Anderson about what to teach in an introductory one semester course in differential geometry. We found that after covering the classical differential geometry of curves and surfaces that it was difficult to make the transition to more advanced texts in differential geometry such as [?], or to texts which use differential geometry such as in differential equations [?] or general relativity [?], [?]. This book aims to make this transition more rapid, and to prepare upper level undergraduates and beginning level graduate students to be able to do some basic computational research on such topics as the isometries of metrics in general relativity or the symmetries of differential equations. This is not a book on classical differential geometry or tensor analysis, but rather a modern treatment of vector fields, push-forward by mappings, one-forms, metric tensor fields, isometries, and the infinitesimal generators of group actions, and some Lie group theory using only open sets in IR n . The definitions, notation and approach are taken from the corresponding concept on manifolds and developed in IR n . For example, tangent vectors are defined as derivations (on functions in IR n ) and metric tensors are a field of positive definite symmetric bilinear functions on the tangent vectors. This approach introduces the student to these concepts in a familiar setting so that in the more abstract setting of manifolds the role of the manifold can be emphasized. The book emphasizes liner algebra. The approach that I have taken is to provide a detailed review of a linear algebra concept and then translate the concept over to the field theoretic version in differential geometry. The level of preparation in linear algebra effects how many chapters can be covered in one semester. For example, there is quite a bit of detail on linear transformations and dual spaces which can be quickly reviewed for students with advanced training in linear algebra. v

CONTENTS

1

The outline of the book is as follows. Chapter 1 reviews some basic facts about smooth functions from IR n to IR m , as well as the basic facts about vector spaces, basis, and algebras. Chapter 2 introduces tangent vectors and vector fields in IR n using the standard two approaches with curves and derivations. Chapter 3 reviews linear transformations and their matrix representation so that in Chapter 4 the push-forward as an abstract linear transformation can be defined and its matrix representation as the Jacobian can be derived. As an application, the change of variable formula for vector fields is derived in Chapter 4. Chapter 5 develops the linear algebra of the dual space and the space of bi-linear functions and demonstrates how these concepts are used in defining differential one-forms and metric tensor fields. Chapter 6 introduces the pullback map on one-forms and metric tensors from which the important concept of isometries is then defined. Chapter 7 investigates hyper-surfaces in IR n , using patches and defines the induced metric tensor from Euclidean space. The change of coordinate formula on overlaps is then derived. Chapter 8 returns to IR n to define a flow and investigates the relationship between a flow and its infinitesimal generator. The theory of flow invariants is then investigated both infinitesimally and from the flow point of view with the goal of proving the rectification theorem for vector fields. Chapter 9 investigates the Lie bracket of vector-fields and Killing vectors for a metric. Chapter 10 generalizes chapter 8 and introduces the general notion of a group action with the goal of providing examples of metric tensors with a large number of Killing vectors. It also introduces a special family of Lie groups which I’ve called multi-parameter groups. These are Lie groups whose domain is an open set in IR n . The infinitesimal generators for these groups are used to construct the left and right invariant vector-fields on the group, as well as the Killing vectors for some special invariant metric tensors on the groups.

2

CONTENTS

Chapter 1 Preliminaries 1.1

Open sets

The components (or Cartesian coordinates ) of a point x ∈ IR n will be denoted by x = (x1 , x2 , . . . , xn ). Note that the labels are in the up position. That is x2 is not the square of x unless we are working in IR 1 , IR 2 , IR 3 where we will use the standard notation of x, y, z. The position of indices is important, and make many formulas easier to remember or derive. The Euclidean distance between the points x = (x1 , . . . , xn ) and y = (y 1 , . . . , y n ) is p d(x, y) = (x1 − y 1 )2 + . . . + (xn − y n )2 . The open ball of radius r ∈ IR + at the point p ∈ IR n is the set Br (p) ⊂ IR n , defined by Br (p) = { x ∈ IR n | d(x, p) < r}. A subset U ⊂ IR n is an open set if given any point p ∈ U there exists an r ∈ IR + (which depends on p) such that the open ball Br (p) satisfies Br (p) ⊂ U . The empty set is also taken to be open. Example 1.1.1. The set IR n is an open set. Example 1.1.2. Let p ∈ IR n and r ∈ IR + . Any open ball Br (p) is an open set. 1

2

CHAPTER 1. PRELIMINARIES

Example 1.1.3. The upper half plane is the set U = { (x, y) ∈ IR 2 | y > 0 } and is open. Example 1.1.4. The set V = { (x, y) ∈ IR 2 | y ≥ 0 } is not open. Any point (x, 0) ∈ V can not satisfy the open ball condition. Example 1.1.5. The unit n-sphere S n ⊂ IR n+1 is the subset S n = { x ∈ IR n+1 | d(x, 0) = 1 } and S n is not open. No point x ∈ S n satisfies the open ball condition. The set S n is the boundary of the open ball B1 (0) ⊂ IR n+1 . Roughly speaking, open sets contain no boundary point. This can be made precise using some elementary topology.

1.2

Smooth functions

In this section we recall some facts from multi-variable calculus. A real-valued function f : IR n → IR has the form, f (x) = f (x1 , x2 , . . . , xn ). We will only be interested in functions whose domain Dom(f ), is either all of IR n or an open subset U ⊂ IR n . For example f (x, y) = log xy is defined only on the set U = {(x, y) ∈ IR 2 | xy > 0}, which is an open set in IR 2 , and Dom(f ) = U . A function f : IR n → IR is continuous at p ∈ IR n if lim f (x) = f (p).

x→p

If U ⊂ IR n is an open set, then C 0 (U ) denotes the functions defined on U which are continuous at every point of U .

1.2. SMOOTH FUNCTIONS

3

Example 1.2.1. Let U = { (x, y) | (x, y) 6= (0, 0) }, the function f (x, y) =

1 x2 + y 2

is continuous on the open set U ⊂ IR 2 Note that if f ∈ C 0 (IR n ) then f ∈ C 0 (U ) for any open subset U ⊂ IR n . The partial derivatives of f at the point p = (x10 , . . . , xn0 ) in the xi direction is i+1 n 1 2 i n f (x10 , x20 , . . . , xi0 + h, xi+1 ∂f 0 , . . . , x0 ) − f (x0 , x0 , . . . , x0 , x0 , . . . , x0 ) = lim ∂xi p h→0 h which is also written (∂xi f )|p . Let U ⊂ IR n be an open set. A function f : U → IR is said to be C 1 (U ) if all the partial derivatives ∂xi f, 1 ≤ i ≤ n exists at every point in U and these n-functions are continuous at every point in U . The partial derivatives of order k are denoted by ∂kf ∂xi1 ∂xi2 . . . ∂xik where 1 ≤ i1 , i2 , . . . , ik ≤ n. We say for a function f : U → IR , U an open set in IR n , that f ∈ C k (U ) if all the partial derivatives up to order k exist at every point in the open set U and they are also continuous at every point in U . A function f : U → IR is said to be smooth or f ∈ C ∞ (U ) if f ∈ C k (U ) for all k ≥ 0. In other words a function is smooth if its partial derivatives exist to all orders at every point in U , and the resulting functions are continuous. Example 1.2.2. Let i ∈ {1, . . . , n}. The coordinate functions f i : IR n → IR , where f i (x) = xi (so the ith coordinate) satisfy f i ∈ C ∞ (IR n ), and are smooth functions. The coordinate functions f i will just be written as xi . Any polynomial in the coordinate functions X X P (x) = a0 + ai x i + ai1 i2 xi1 xi2 + . . . up to finite order 1≤i≤n

1≤i1 ,i2 ≤n

satisfies P (x) ∈ C ∞ (IR n ), and are smooth functions.

4

CHAPTER 1. PRELIMINARIES

Example 1.2.3. Let U ⊂ IR n be an open set and define the functions 1U , 0U : U → IR by 1U = { 1, f or all 0U = { 0, f or all

(1.1)

x ∈ U }, x ∈ U }.

The function 1U , 0U ∈ C ∞ (U ). The function 1U is the unit function on U , and 0U is the 0 function on U . All the partial derivatives are 0 for these functions. One reason we work almost exclusively with smooth functions is that if f ∈ C ∞ (U ) then ∂xi f ∈ C ∞ (U ), 1 ≤ i ≤ n, and so all the partial derivatives are again smooth functions. While working with this restricted class of functions is not always necessary, by doing so the exposition is often simpler. The set of functions C k (U ) have the following algebraic properties [?]. Proposition 1.2.4. Let f, g ∈ C k (U ) (k ≥ 0 including k = ∞), and let α ∈ IR . Then 1. (αf )(x) = αf (x) ∈ C k (U ), 2. (f + g)(x) = f (x) + g(x) ∈ C k (U ), 3. (f g)(x) = f (x)g(x) 4. ( fg )(x) =

f (x) g(x)

∈ C k (U ),

∈ C k (V ), where V = { x ∈ U | g(x) 6= 0 }.

Example 1.2.5. Let P (x) and Q(x) be polynomials on IR n . Then by 4 in Lemma 1.2.4 P (x) f (x) = Q(x) is a smooth function on the open set V = { x ∈ IR n | Q(x) 6= 0 }. A function Φ : IR n → IR m is written in components as Φ(x) = (Φ1 (x), Φ2 (x), . . . , Φm (x)),

x ∈ IR n .

The function Φ is smooth or Φ ∈ C ∞ (IR n , IR m ) if each component Φ1 , Φ2 , . . . Φn ∈ C ∞ (IR n ). If U ⊂ IR n is open, then C ∞ (U, IR m ) denotes the C ∞ functions defined on U .

1.3. SMOOTH CURVES

5

Example 1.2.6. The function Φ : IR 2 → IR 3 given by Φ(x, y) = (x + y, x − y, x2 + y 2 ) has components Φ1 (x, y) = x + y, Φ2 (x, y) = x − y, Φ3 (x, y) = x2 + y 2 . Therefore Φ ∈ C ∞ (IR 2 , IR 3 ).

1.3

Smooth Curves

Let a, b ∈ IR , a < b then I = (a, b) is the open interval I = {x ∈ IR | a < x < b}. A function σ ∈ C ∞ (I, IR ) is a mapping σ : I → IR n , and is called a smooth or C ∞ curve. If t denotes the coordinate on I the curve σ has components σ(t) = (σ 1 (t), σ 2 (t), . . . , σ n (t)). The derivative σ(t) ˙ of the curve σ(t) is  1  dσ dσ dσ 1 dσ n σ(t) ˙ = = , ,..., . dt dt dt dt If t0 ∈ I then σ(t ˙ 0 ) is the tangent vector to σ at the point σ(t0 ). The Euclidean arc-length of a curve σ (when it exists) is v 2 Z b Z bu n  uX dσ i t L(σ) = dt = ||σ||dt ˙ dt a a i=1 where ||σ|| ˙ =

pPn

˙ i=1 (σ

i )2 .

Example 1.3.1. Let σ : IR → IR 3 be the smooth curve σ(t) = (cos t, sin t, t) ,

t ∈ IR

which is known as the helix. The tangent vector at an arbitrary t value is σ(t) ˙ =

dσ = (− sin t, cos t, 1). dt

6

CHAPTER 1. PRELIMINARIES

When t =

π 4

we have the tangent vector (− √12 , √12 , 1), which looks like, diagram

The arc-length of σ doesn’t exists on IR . If we restrict the domain of σ to I = (0, 2π) we get Z 2π p √ L(σ) = sin2 t + cos2 t + 1dt = 2 2π 0

1.4

Composition and the Chain-rule

An easy way to construct smooth functions is through function composition. Let m, n, k ∈ Z + and let Φ : IR n → IR m , Ψ : IR m → IR l . The composition of the two functions Ψ and Φ is the function Ψ ◦ Φ : IR n → IR l defined by (Ψ ◦ Φ)(x) = Ψ(Φ(x)) f or all x ∈ IR n . Note that for unless l = n the composition of Φ with Ψ cannot be defined. Let (xi )1≤i≤n be coordinates on IR n , (y a )1≤a≤m be coordinates on IR m and (uα )1≤α≤l be coordinates on IR l . In terms of these coordinates the components of the functions Φ and Ψ can be written y a = Φa (x1 , . . . , xn ) 1 ≤ a ≤ m, uα = Ψα (y 1 , . . . , y m ) 1 ≤ α ≤ l. The components of the composition Ψ ◦ Φ are then uα = Ψα (Φ(x1 , . . . , xn )) 1 ≤ α ≤ l. Example 1.4.1. Let σ : IR → IR 3 be the helix from example 1.3.1, and let Φ : IR 3 → IR 2 be (1.2)

Φ(x, y, z) = (xy + 2yz, x + y).

The composition Φ ◦ σ : IR → IR 2 is the curve Φ ◦ σ(t) = (sin t cos t + 2t sin t, cos t + sin t). Now let Ψ : IR 2 → IR 2 be given by (1.3)

Ψ(u, v) = (u − v, uv).

The composition Ψ ◦ Φ : IR 3 → IR 2 is then (1.4)

Ψ ◦ Φ(x, y, z) = (xy + 2yz − x − y, x2 y + 2xyz + xy 2 + 2y 2 z).

1.4. COMPOSITION AND THE CHAIN-RULE

7

The formula for first partial derivatives of a composition of two functions is known as the chain-rule. Theorem 1.4.2. (The chain-rule). Let Φ ∈ C 1 (IR n , IR m ), and Ψ ∈ C 1 (IR m , IR l ). Then Ψ ◦ Φ ∈ C 1 (IR n , IR l ), and m ∂(Ψ ◦ Φ)α X ∂Ψα ∂Φa (1.5) = , 1 ≤ i ≤ n, 1 ≤ α ≤ l. ∂xi ∂y a ya =Φa (x) ∂xi a=1 Example 1.4.3. We verify the chain-rule for the functions Ψ and Φ in example 1.4.1. For the left side of equation 1.5, we have using equation 1.4, ∂x (Ψ ◦ Φ) = (y − 1, 2xy + 2yz + y 2 ).

(1.6)

While for the right side we need (1.7) ∂Ψ = (∂u Φ1 , ∂u Φ2 )|(u,v)=Φ(x,y,z) = (1, v)|(u,v)=Φ(x,y,z) = (1, x + y) ∂u (u,v)=Φ(x,y,z) ∂Ψ = (∂u Φ1 , ∂u Φ2 )|(u,v)=Φ(x,y,z) = (−1, u)|(u,v)=Φ(x,y,z) = (−1, xy + 2yz) ∂v (u,v)=Φ(x,y,z)

and (1.8)

∂x Φ = (y, 1).

Therefore the two terms on the right side of 1.5 for α = 1, 2 can be computed from equations 1.7 and 1.8 to be ∂Ψ1 ∂Φ1 ∂Ψ1 ∂Φ2 + =y−1 ∂u ∂x ∂v ∂x ∂Ψ2 ∂Φ1 ∂Ψ2 ∂Φ2 + = (x + y)y + (xy + 2yz) ∂u ∂x ∂v ∂x which agrees with 1.6. Theorem 1.4.2 generalizes to the composition of C k functions. Theorem 1.4.4. Let k ≥ 0 (including k = ∞), and let U ⊂ IR n , V ⊂ IR m be open sets. If Φ ∈ C k (U, V ) and Ψ ∈ C k (V, IR l ), then Ψ ◦ Φ ∈ C k (U, IR l ). Therefore the composition of two smooth functions is again a smooth function.

8

CHAPTER 1. PRELIMINARIES

Example 1.4.5. Let g(x, y, z) = x2 + y 2 + z 2 . Clearly g ∈ C ∞ (IR 3 ) because any polynomial is C ∞ . Let h(u) = eu , and h ∈ C ∞ (IR ). Therefore by 2 2 2 Theorem 1.4.4 above ex +y +z ∈ C ∞ (IR 3 ). Likewise all the compositions in example 1.4.1 are C ∞ . Example 1.4.6. The function f (x, y, z) = log(x + y + z) is smooth on U = {(x, y, z) | x + y + z > 0 }. Example 1.4.7. Let σ : I → IR n be a smooth curve in IR n defined on an open interval I ⊂ IR . Let Φ ∈ C ∞ (IR n , IR m ) then Ψ ◦ σ ∈ C ∞ (I, IR m ) and is a smooth curve in IR m . This composition produces a smooth curve in the range space of Φ. The chain-rule produces n

(1.9)

X ∂Φa d a dσ i Φ (σ(t)) = | σ(t) dt ∂xi dt i=1

The next theorem is technical but will be needed in Chapter 2. Theorem 1.4.8. Let f ∈ C ∞ (U ) where U ⊂ IR n is an open set, and let p = (x10 , . . . , xn0 ) ∈ U . There exists an open ball Br (p) ⊂ U and function gi ∈ C ∞ (Br (p)), 1 ≤ i ≤ n such that n X f (x) = f (p) + (xi − xi0 )gi (x) f or allx ∈ Br (p) i=1

and where

∂f . gi (p) = ∂xi p

Proof. Let Br (p) be an open ball about p contained in U . Let x ∈ Br (p) then the line l : [0, 1] → IR n l(t) = p + t(x − p) has the properties l(t) ⊂ Br (p), 0 ≤ t ≤ 1 and l(0) = p, l(1) = x. XXXXXXXXXXXXXXX Picture XXXXXXXXXXXXXXXXXXXX

1.5. VECTOR SPACES, BASIS, AND SUBSPACES

9

Therefore we can evaluate f (l(t)), and use the fundamental theorem of calculus to write, Z

1

d f (p + t(x − p))dt 0 dt = f (p) + f (l(1)) − f (l(0)) = f (x).

f (x) = f (p) + (1.10)

We expand out the derivative on the first line in equation 1.10 using the chain-rule 1.4.2 to get (1.11)

n X d d ∂f i i f (l(t)) = f (p + t(x − p)) = (x − x0 ) i , dt dt ∂x p+t(x−p) i=1

where p = (x10 , . . . , xn0 ). Substituting from equation 1.11 into the first line in equation 1.10 gives Z

1

d f (p + t(x − p))dt, 0 dt Z 1 n X ∂f i i = f (p) + (x − x0 ) ∂xi

f (x) = f (p) +

0

i=1

Therefore let Z gi (x) = 0

1

dt.

p+t(x−p)

∂f dt, ∂xi p+t(x−p)

which can be checked to satisfy the conditions in the theorem.

1.5

Vector Spaces, Basis, and Subspaces

We begin by reviewing the algebra of matrices. Let m, n ∈ Z + , then Mm×n (IR ) denotes the set of m × n matrices with real entries. A matrix A ∈ Mm×n (IR ) has m rows and n columns. The components of A are given by Aaj , 1 ≤ a ≤ m, 1 ≤ j ≤ n. If A, B ∈ Mm×n (IR ) and c ∈ IR then A + B, cA ∈ Mm×n (IR ), where in components (1.12)

(A + B)aj = Aaj + Bja , (cA)aj = cAaj ,

1 ≤ a ≤ m, 1 ≤ j ≤ n.

10

CHAPTER 1. PRELIMINARIES

If A ∈ Mm×n (IR ), B ∈ Mn×p (IR ) then the product of A and B is the matrix AB ∈ Mm×p (IR ) defined by (AB)as

=

n X

Aaj Bsj ,

1 ≤ a ≤ m, 1 ≤ s ≤ p.

j=1

Example 1.5.1. Let x ∈ IR n and let A ∈ Mm×n (IR ). If we view x as x ∈ Mn×1 (IR ) (so having n rows and one column) then Ax ∈ Mm×1 (IR ) is a vector having m-components. The vector Ax is given by just standard matrix vector multiplication. The transpose of A ∈ Mm×n (IR ) is the matrix AT ∈ Mn×m with the property (AT )ij = Aji . If A ∈ Mn×n (IR ) and AT = A then A is a symmetric matrix , if AT = −A then A is a skew-symmetric matrix . Finally if A ∈ Mn×n (IR ) then the trace of A is n X trace(A) = Aii i=1

which is the sum of the diagonal elements. Definition 1.5.2. A vector space V over IR is a non-empty set with a binary operation + : V × V → V , and a scalar multiplication · : IR × V → V which satisfy V1) (u + v) + w = (u + v) + w, V2) u + v = v + u, V3) there exists 0 ∈ V such that u + 0 = u, V4) for all u there exists v such that u + v = 0, V5) 1 · u = u, V6) (ab) · u = a · (b · u), V7) a · (u + v) = a · u + a · v V8) (a + b) · u = a · u + b · v. for all u, v, w ∈ V , a, b ∈ IR . For vector-spaces we will drop the symbol · in a · u and write au instead. For example rule V8 is then (a + b)u = au + bu.

1.5. VECTOR SPACES, BASIS, AND SUBSPACES

11

Example 1.5.3. Let V = IR n , and let + be ordinary component wise addition and let · be ordinary scalar multiplication. Example 1.5.4. Let V = Mm×n (IR ), and let + be ordinary matrix addition, and let · be ordinary scalar multiplication as defined in equation 1.12. With these operations Mm×n (IR ) is a vector-space. Example 1.5.5. Let f, g ∈ C k (U ) and c ∈ IR , and let (f + g)(x) = f (x) + g(x),

(c · f )(x) = cf (x) f or allx ∈ U,

Properties 1 and 2 in Lemma 1.2.4 show that f +g, c·f ∈ C k (U ). Let 0 = 0U , the zero function on U defined in equation 1.1. With these definitions C k (U ) is a vector-space over IR (for any k including k = ∞). Let S be a non-empty subset of V . A vector v ∈ V is a linear combination of elements of S is there exists ci ∈ IR (recall an up index never means to the power), and vi ∈ S such that v=

k X

ci vi .

i=1

Note that the zero-vector 0 will always satisfy this condition with c1 = 0, v1 ∈ S. The set of all vector which are a linear combination of S is called the span of S and denoted by span(S) . A subset S ⊂ V is linearly independent if for every choice {vi }1≤i≤k ⊂ S, the only combination (1.13)

k X

ci v i = 0

i=1

is ci = 0, 1 ≤ i ≤ k. The empty set is taken to be linear independent. Example 1.5.6. Let V = IR 3 , and let        1 1 0  (1.14) S = 2 ,  0  , 1 , and 1 −1 1

  2 v = 3 , 1

12

CHAPTER 1. PRELIMINARIES 1. Is v a linear combination of elements in S? 2. Is S a linearly independent set? To answer the first question, we try to solve the system of equations         1 1 0 1 1  2 3    0 + c 1 = 2 c 2 +c 1 −1 1 1

for c1 , c2 , c3 ∈ IR . This is the  1  2 (1.15) 1 The augmented matrix and  1 1 0 2 0 1 1 −1 1

matrix equation   1   2 1 0 c 2    c = 3 , 0 1 c3 1 −1 1

row reduced form are,    1 0 21 | 32 | 2 | 3 rref → 0 1 − 12 | 12  . 0 0 0 | 0 | 1

The system of equations in consistent. Therefore v is a linear combination of vectors in S. We can determine the values of c1 , c2 , c3 from the row reduced form of the coefficient matrix. This corresponding reduced form gives the equations 3 1 c1 + c3 = 2 2 1 1 c2 − c3 = 2 2 There are an infinite number of solutions, given in parametric form by  1  3   1 c −2 2 c2  =  1  + t  1  t ∈ IR . 2 2 0 t c3 In this solution we let c3 be the parameter t. If we choose for example t = 0, then c1 = 23 and c2 = 12 and we note that       1 1 2 3  1    2 + 0 = 3 . 2 2 1 −1 1

1.5. VECTOR SPACES, BASIS, AND SUBSPACES

13

To answer the second questions on whether S is linearly independent we check for solution to equation 1.13, by looking for solutions to the homogeneous form of the systems of equations in (1.15),    1   0 1 1 0 c 2     2 0 1 c = 0 , (1.16) c3 0 1 −1 1 If the only solution is c1 = c2 = c3 = 0, then the set S is a linearly independent set. The row reduced echelon form of the coefficient matrix for this system of equations is   1 0 21 0 1 − 1  . 2 0 0 0 Therefore, there are an infinite number of solutions to the system (1.16), given by  1  1 −2 c c2  = t  1  t ∈ IR . − 1 2 2 c3 t For example choosing t = 1 gives,         1 1 0 0 1  1      2 + 0 + 1 = 0 . − 2 2 1 −1 1 0 Therefore S is not a linearly independent set. A subset S ⊂ V is a spanning set if every v ∈ V is a linear combination of elements of S, or span(S) = V . Lastly, a subset β ⊂ V is a basis if β is linearly independent and a spanning set for V . We will always think of a basis β as an ordered set. Example 1.5.7. The set S in equation 1.14 of example 1.5.6 is not a basis for IR 3 . It is not linearly independent. It is also not a spanning set. Example 1.5.8. Let V = IR n , and let β = {e1 , e2 , . . . , en },

14

CHAPTER 1. PRELIMINARIES

where   0 0 . . .   0 ei =   1   0 .  ..  0

(1.17)

1 in the ith row, 0 otherwise.

The set β is the standard basis for IR n , and the dimension of IR n is n. Example 1.5.9. Let V = IR 3 , and let        1 1 0  S = 2 ,  0  , 1 1 −1 0 Is β a basis for IR 3 ? We first check if S is a linearly independent set. As in the example above we need to find the solutions to the homogeneous system c1 v1 +c2 v2 +c3 v3 = 0 (where v1 , v2 , v3 are the three vectors in S given above). We get     1 1 0 1 0 0 2 0 1  rref → 0 1 0 . 1 −1 0 0 0 1 Therefore the only solution to c1 v1 + c2 v2 + c3 v3 = 0 (the homogeneous system) is c1 = 0, c2 = 0, c3 = 0. The row reduced form of the coefficient matrix also shows that S is a spanning set. Also see the theorem below. Example 1.5.10. Let V = Mm×n (IR ), the vector-space of m × n matrices. Let Eji ∈ V, 1 ≤ i ≤ m, 1 ≤ j ≤ n be the matrices Eji

 =

1 in the ith row jth column 0 everywhere else

1 ≤ i ≤ m, 1 ≤ j ≤ n.

The collection {Eji }1≤i≤m,1≤j≤n forms a basis for Mm×n (IR ) called the standard basis. We order them as β = {E11 , E21 , . . . , E12 , . . . , Enm }

1.5. VECTOR SPACES, BASIS, AND SUBSPACES

15

A vector space is finite dimensional if there exists a basis β containing only a finite number of elements. It is a fundamental theorem that any two basis for a finite dimensional vector space have the same number of elements (cardinality) and so we define the dimension of V to be the cardinality of a basis β. A second fundamental theorem about finite dimensional vector spaces is the following Theorem 1.5.11. Let V be an n-dimensional vector space. 1. A spanning set S has at least n elements. 2. If S is a spanning set having n elements, then S is a basis. 3. A linearly independent set S has at most n elements. 4. If S is a linearly independent set with n elements, then S is a basis. Using part 4 of Theorem 1.5.11, we can concluded that S in example 1.5.9 is a basis, since we determined it was a linearly independent set. The set S in example 1.5.6 is not a basis for IR 3 by 4 of Theorem 1.5.11. A useful property of a basis is the following. Theorem 1.5.12. Let V be an n-dimensional vector-space with basis β = {vi }1≤i≤n . Then every vector v ∈ V can be written as a unique linear combination of elements of β. Proof. Since β is a spanning set, suppose that v ∈ V can be written as (1.18)

v=

n X

i

c vi

and v =

n X

i=1

di vi .

i

Taking the difference of these two expressions gives, 0=

n X

(ci − di )vi .

i=1

Since the set β is a linearly independent set, we conclude ci − di = 0, and the two expressions for v in (1.18) agree.

16

CHAPTER 1. PRELIMINARIES

The (unique) real numbers c1 , . . . , cn are the coefficients of the vector v in the basis β. Also note that this theorem is true (and not hard to prove) for vector-spaces which are not necessarily finite dimensional. A subset W ⊂ V is a subspace if the set W is a vector-space using the vector addition and scalar-multiplication from V . The notion of a subspace is often more useful than that of a vector-space on its own. Lemma 1.5.13. A subset W ⊂ V is a subspace if and only if 1. 0 ∈ W , 2. u + v ∈ W, f or allu, v ∈ W , 3. cu ∈ W, f or allu ∈ W, c ∈ IR . Another way to restate these conditions is that a non-empty subset W ⊂ V is a subspace if and only if it is closed under + and scalar multiplication. In order to prove this lemma we would need to show that W satisfies the axioms V 1) through V 8). This is not difficult because the set W inherits these properties from V . Example 1.5.14. Let S ⊂ V non-empty, and let W = span(S). Then W is a subspace of V . We show this when S is finite, the infinite case is similar. Let v1 ∈ S, then 0v1 = 0 so 1) in Lemma 1.5.13 is true. Let v, w ∈ span(S) then k k X X i v= c vi w = di vi i=1

i=1

where S = {v1 , . . . , vk }. Then v+w =

k X

(ci + di )vi

i=1

and so v +w ∈ span(S), and 2) in Lemma 1.5.13 hold. Property 3) in Lemma 1.5.13 is done similarly.

1.6. ALGEBRAS

1.6

17

Algebras

Definition 1.6.1. An IR -algebra (V, ∗) is a vector space V (over IR ) together with an operation ∗ : V × V → V satisfying 1. (av1 + bv2 ) ∗ w = a(v1 ∗ w) + b(v2 ∗ w), 2. v ∗ (aw1 + bw2 ) = av ∗ w1 + bv ∗ w2 . The operation ∗ in this definition is called vector-multiplication. Properties 1 and 2 are referred to as the bi-linearity of ∗. Example 1.6.2. Let V = IR 3 with its usual vector-space structure. Let the multiplication on V be the cross-product. Then (V, ×) is an algebra. Example 1.6.3. Let n ∈ Z + and let V = Mn×n (IR ) be the vector-space of n × n matrices with ordinary matrix addition and scalar multiplication defined in equation 1.12. Let ∗ be matrix multiplication. This is an algebra because of the following algebraic properties of matrix multiplication: (cA + B) ∗ C = cA ∗ C + B ∗ C A ∗ (cB + C) = cA ∗ B + A ∗ C for all c ∈ IR , A, B, C ∈ Mn×n (IR ). These are properties 1 and 2 in Definition 1.6.1. Example 1.6.4. Let V = C k (U ), where U is an open set in IR n . This is vector-space (see example 1.5.5). Define multiplication of vectors by f ∗ g = f · g by the usual multiplication of functions. Part 3) in Lemma 1.2.4 implies f ∗ g ∈ C k (U ). Therefore C k (U ) is an algebra for any k (including k = ∞). Let (V, ∗) be an algebra. An algebra is said to be associative if (1.19)

v1 ∗ (v2 ∗ v3 ) = (v1 ∗ v2 ) ∗ v3 ,

commutative if (1.20)

v1 ∗ v2 = v2 ∗ v1 ,

and anti-commutative if v1 ∗ v2 = −v2 ∗ v1 for all v1 , v2 , v3 ∈ V . If (V, ∗) is an algebra and W ⊂ V is a subset, then W is called a subalgebra if W is itself an algebra using the operation ∗ from V .

18

CHAPTER 1. PRELIMINARIES

Lemma 1.6.5. A subset W ⊂ V of the algebra (V, ∗) is a subalgebra if and only if 1. W is a subspace of V and 2. for all w1 , w2 ∈ W , w1 ∗ w2 ∈ W . Proof. If W ⊂ V is a subalgebra, then it is necessarily a vector-space, and hence a subspace of V . In order that ∗ be well-defined on W it is necessary that for all w1 , w2 ∈ W , that w1 ∗ w2 ∈ W . Therefore conditions 1 and 2 are clearly necessary. Suppose now that W ⊂ V and conditions 1 and 2 are satisfied. By Lemma 1.5.13 condition 1 implies that W is a vector-space. Condition 2 implies that ∗ is well-defined on W , while the bi-linearity of ∗ on W follows from that on V . Therefore conditions 1 and 2 are sufficient. Example 1.6.6. Let W ⊂ M2×2 (IR ) be the subset of upper-triangular 2 × 2 matrices. The set W is a subalgebra of M2×2 (IR ) with ordinary matrix multiplication (see example 1.6.3). Properties 1 and 2 in Lemma 1.6.5 are easily verified. Lemma 1.6.7. Let W ⊂ V be a subspace and β = {wi }1≤i≤m a basis for W . Then W is a subalgebra if and only if wi ∗ wj ∈ W, 1 ≤ i, j ≤ m. Proof. If W is a subalgebra then clearly wi ∗ wj ∈ W, 1 ≤ i, j ≤ m holds, and the condition is clearly necessary. We now prove that it is sufficient. By Lemma 1.6.5 we need to show that if u, v ∈ W , then u ∗ v ∈ W . Since β is a basis, there exist ai , bi ∈ IR , 1 ≤ i ≤ m such that u=

m X

i

a wi ,

i=1

v=

m X

bi w i .

i=1

Then using bi-linearity of ∗ (1.21)

u∗v =

n X

ai bi wi ∗ wj .

i,j=1

By hypothesis wi ∗ wj ∈ W , and since W is a subspace the right side of equation 1.21 is in W . Therefore by Lemma 1.6.5 W is a subalgebra.

1.6. ALGEBRAS

19

Let (V, ∗) be a finite-dimensional algebra, and let β = {ei }1≤i≤n be a basis of V . Since ei ∗ ej ∈ V , there exists ck ∈ IR such that ei ∗ ej =

(1.22)

n X

ck ek .

k=1

Now equation (1.22) holds for each choice 1 ≤ i, j ≤ n we can then write, for each 1 ≤ i, j ≤ n there exists ckij ∈ IR , such that ei ∗ ej =

(1.23)

n X

ckij ek .

k=1

The real numbers ckij are called the structure constants of the algebra in the basis β. Example 1.6.8. Let β = {E11 , E21 , E12 , E22 } be the standard basis for M2×2 (IR ). Then (1.24) E11 ∗ E11 = E11 , E11 ∗ E21 = E21 , E11 ∗ E12 = 0, E11 ∗ E22 = 0, E21 ∗ E11 = 0, E21 ∗ E21 = 0, E21 ∗ E12 = E11 , E21 ∗ E22 = E21 , E12 ∗ E11 = E12 , E12 ∗ E21 = E22 , E12 ∗ E12 = 0, E12 ∗ E22 = 0, E22 ∗ E11 = 0, E22 ∗ E21 = 0, E22 ∗ E12 = E12 , E22 ∗ E22 = E22 , which can also be written in table form * E11 E21 E12 E22

E11 E11 0 E12 0

E21 E21 0 E22 0

E12 0 E11 0 E12

E22 0 E22 0 E22

Therefore the non-zero structure constants (equation 1.23) are read off equation 1.24, c111 = 1, c212 = 1, c122 = 1, c123 = 1, c124 = 1, c231 = 1, c432 = 1, c343 = 1, c444 = 1.

20

CHAPTER 1. PRELIMINARIES

1.7

Exercises

1. Let Φ : IR 2 → IR 2 , and Ψ : U → IR 2 be x Φ(x, y) = (x + y, x2 + y 2 ), Ψ(x, y) = ( , x − y) y where U = {(x, y) | y 6= 0}. (a) Why are Φ and Ψ smooth functions? (b) Compute Ψ ◦ Φ and find its domain, and state why it is C ∞ . (c) Compute Φ ◦ Ψ and find its domain. (d) Verify the chain-rule (1.5) for Ψ ◦ Φ. 2. Find the functions g1 (x, y), g2 (x, y) at p = (1, 2) for f (x, y) = ex+y in Lemma 1.4.8. 3. Let σ : I → IR n be a smooth curve where I = (a, b), a, b ∈ IR . Define the function s : (a, b) → IR by Z t (1.25) s(t) = ||σ||dt ˙ t ∈ I. a

Note that the image of I is the open interval (0, L(σ)), that ds = ||σ||, ˙ dt and if ||σ|| ˙ = 6 0 then s(t) is an invertible function. Suppose that the inverse function of the function s(t) in 1.25 exists and is C ∞ . Call this function t(s), and let γ : (0, L(σ)) → IR n be the curve γ(s) = σ ◦ t(s). This parameterization of the curve σ is called the arc-length parameterization . The function κ : (0, L(σ)) → IR κ = ||

d2 γ || ds2

is called the curvature of γ. (a) Compute the arc-length parameterization for the helix on I = (0, 2π).

1.7. EXERCISES

21

(b) Compute ||γ 0 || for the helix. (c) Prove ||γ 0 || = 1 for any curve σ. (d) Compute κ (as a function of s) for the helix. (e) Show that κ ◦ s(t) for a curve σ can be computed by  (1.26)

κ(t) =

ds dt

−1 ||

dT || dt

where T = ||σ|| ˙ −1 σ˙ is the unit tangent vector of σ(t). (Hint: Apply the chain-rule to γ(s) = σ(t(s)).) (f) Compute κ(t) for the helix using the formula in 1.26. (g) Compute the curvature for the curve σ(t) = (et cos t, et sin t, t),

t ∈ (0, 2π).

4. Compute the matric products AB and BA if they are defined. (a) 

 1 −1 A= 2 1

  1 −1 B = 2 3  1 1

(b)   1 3 A = 2 2 3 1



1 −2 3 B= −1 1 1



5. Which of the following sets β of vector define a basis for IR 3 . In cases when β is not a basis, state which property fails and prove that it fails.        1 −1 2  (a) β = 1 ,  2  , 3 . 1 3 1        1 −2 −1       0 , 2 , 2 . (b) β = 1 3 4

22

CHAPTER 1. PRELIMINARIES

(c)

(d)

     2 1  β = 2 , 1 . 1 1          1 2 1 4         1 , 3 , 1 , 2 . β= 1 1 0 1

6. Show that algebra Mn×n (IR ) with matrix multiplication is an algebra and that it is also associative (see 1.19). Is it a commutative algebra (see 1.20)? 0 7. Let Mn×n (IR ) ⊂ Mn×n (IR ) be the subset of trace-free matrices,

(1.27)

0 Mn×n (n, IR ) = { A ∈ Mn×n (IR ) | trace(A) = 0 }.

Show that, (a) trace(cA + B) = c trace(A) + trace(B), c ∈ IR , A, B ∈ Mn×n (IR ), (b) trace(AB) = trace(BA), A, B ∈ Mn×n (IR ), 0 (n, IR ) ⊂ Mn×n (IR ) is a subspace, and (c) Mn×n 0 (d) that for n > 1, Mn×n (n, IR ) ⊂ Mn×n (IR ), is not a subalgebra.

8. Show that IR 3 with the cross-product is an algebra. Is it commutative or associative? 9. Consider the vector-space V = Mn×n (IR ) and define the function [ , ] : V × V → V by [A, B] = AB − BA A, B ∈ Mn×n (IR ). (a) Show that (Mn×n (IR ), [ , ]) is an algebra. This algebra is called gl(n, IR ), where gl stands for general linear. (b) Is gl(n, IR ) commutative or anti-commutative? (Consider n = 1, n > 1 separately.) (c) Is gl(n, IR ) associative for n > 1? (d) Show [A, [B, C]] + [C, [A, B]] + [B, [C, A]] = 0, f or allA, B, C ∈ Mn×n (IR ). Compare this with problem (b) above. (This is called the Jacobi identity for gl(n, IR ).)

1.7. EXERCISES

23

10. Compute the structure constants for gl(2, IR ) using the standard basis for M2×2 (IR ). 11. Let sl(n, IR ) ⊂ gl(n, IR ) be the subspace of trace-free matrices (see equation 1.27, and problem 7c), (a) Show that sl(n, IR ) is a subalgebra of gl(n, IR ). Compare with problem 7. (Hint Use part a and b of problem 7) (b) Find a basis for the subspace sl(2, IR ) ⊂ gl(2, IR ), and compute the corresponding structure constants. (Hint: sl(2, IR ) is 3 dimensional)

24

CHAPTER 1. PRELIMINARIES

Chapter 2 Linear Transformations 2.1

Matrix Representation

Let V and W be two vector spaces. A function T : V → W is a linear transformation if T (au + bv) = aT (u) + bT (v) f or all u, v ∈ V, a, b ∈ IR . The abstract algebra term for a linear transformation is a homomorphism (of vector-spaces). Example 2.1.1. The function T : IR 2 → IR 3 ,     2x + 3y x (2.1) T = x+y  y x−y is a linear transformation. This is easily check by computing,       0     0  2(ax + bx0 ) + 3(ay + by 0 ) x x x x 0 0  = aT +bT . T a +b 0 =  ax + bx + ay + by y y y y0 ax + bx0 − ay − by 0 Example 2.1.2. Let A ∈ Mm×n (IR be a m × n matrix and define LA : IR n → IR m by LA (x) = Ax where Ax is matrix vector multiplication (see example 1.5.1). It follows immediately from properties 1.12 of matrix multiplication that the function LA is a linear transformation. 25

26

CHAPTER 2. LINEAR TRANSFORMATIONS Note that in example 2.1.1     2 3   x x T = 1 1  y y 1 −1

Let V be an n dimensional vector-space with basis β = {v1 , . . . , vn }, and let W be an m dimensional vector space with basis γ = {w1 , . . . , wm }. Given a linear transformation T : V → W a linear transformation, we begin by applying T to the vector v1 , so that T (v1 ) ∈ W . Since γ is a basis for W , there exists real numbers A11 , A21 , . . . , Am 1 such that T (v1 ) =

m X

Aa1 wa .

a=1

The role of the “extra index” 1 the coefficients A will be clear in a moment. Repeating this argument with all of the basis vector vj ∈ β we find that for each j, 1 ≤ j ≤ n there A1j , A2j , . . . , Am j ∈ IR such that (2.2)

T (vj ) =

m X

Aaj wa .

a=1

The set of numbers Aaj , 1 ≤ j ≤ n, 1 ≤ a ≤ m form a matrix (Aai ) with m rows, and n columns (so an m × n matrix). This is the matrix representation of T in the basis’ β and γ and is denoted by (2.3)

[T ]γβ = (Aaj ).

Example 2.1.3. Continuing with T : IR 2 → IR 3 in example 2.1.1 above, we compute [T ]γβ where            1 0 0  1 0      0 , 1 , 0 . β= , , γ= 0 1 0 0 1 So β and γ are the standard basis for IR 2 and IR 3 respectively.           2 1 0 0 1        T = 1 = 2 0 + 1 1 + 1 0 0 1 0 0 1

2.1. MATRIX REPRESENTATION and

27

          3 1 0 0 0        T = 1 = 3 0 + 1 1 − 1 0 . 1 −1 0 0 1

Therefore

  2 3 [T ]γβ = 1 1  . 1 −1

Note that the coefficients of T (e1 ) in the basis γ are in the first column, and those of T (e2 ) are in the second column. We now compute [T ]γβ in the following basis’            1 1 0  1 1 (2.4) β= , , γ = 2 ,  0  , 1 −1 0 1 −1 0 We get,  T and



1 −1

        1 1 0 −1 1  3      2 − 0 − 1 = 0 = 2 2 2 1 −1 0

          1 1 0 2 3  1  1    2 + 0 − 2 1 . T = 1 = 0 2 2 1 −1 0 1

Therefore  (2.5)

1 2

3 2 1 2



 [T ]γβ = − 32 −1 −2

Again note that the coefficients of T (v1 ) in the basis γ are in the first column, and those of T (v2 ) are in the second column. Expanding on the remark at the end of the example, the columns of [T ]γβ are the coefficients of T (v1 ), T (v2 ), . . . , T (vn ) in the basis γ! Example 2.1.4. Let A ∈ Mm×n and let LA : IR n → IR m be linear transformation in example 2.1.2. Let β = {e1 , . . . , en } be the standard basis for IR n

28

CHAPTER 2. LINEAR TRANSFORMATIONS

and let γ = {f1 , . . . , fm } be the standard basis for IR m . Then  1 Aj m  A2  X  j (2.6) LA (ej ) =  ..  Aaj fa ,  .  a=1

Am j

and therefore [LA ]γβ = A. The following lemma shows that the above example essentially describes all linear transformations from IR n to IR m . Lemma 2.1.5. Let T : IR n → IR m be a linear transformation. There exists a matrix A ∈ Mm×n (IR ) such that T (x) = LA (x)

f or all x ∈ IR n .

Suppose now that T : V → W is a linear transformation between the finite dimensional vector spaces V and W . Let β = {vi }1≤i≤n be a basis for V and γ = {wa }1≤a≤n for W . Let v ∈ V which in the basis β is (2.7)

v=

n X

ξ i vi ,

i=1

where ξ i ∈ IR , 1 ≤ i ≤ n are the coefficients of v in the basis β. Now let w = T (v). Then w ∈ W and so can be written in terms of the basis γ as w=

m X

η a fa

a=1

where ηa ∈ IR , 1 ≤ a ≤ m. We then find Lemma 2.1.6. The coefficients of the vector w = T (v) are given by a

η =

n X i=1

where A is the m × n matrix A = [T ]γβ .

Aai ξ i

2.1. MATRIX REPRESENTATION

29

Proof. We simply expand out T (v) using equation (2.7), and using the linearity of T to get T (v) = T (

n X

n X

i

ξ vi ) =

i=1

T (v) =

n X

ξi

i=1

Pm

Aai wa , we get ! ! m m n X X X Aai wa = Aai ξ i wa .

Now substituting for T (vi ) = (2.8)

a=1

a=1

i=1

ξ i T (vi ).

a=1

i=1

Since {wa }1≤a≤m is a basis the coefficients of wa in equation 2.8 must be the same, and so a

(2.9)

η =

n X

Aai ξ i .

i=1

The m-coefficients η a of w can be thought of as an column vector, and the n coefficients ξ i of v can be thought of as a column vector. Equation (2.9) then reads     η1 ξ1  ..   ..   .  = A .  ηm

ξn

where the right side is standard matrix vector multiplication. This can also be written, (2.10)

[w]γ = A[v]β

where [w]γ is the column m-vector of coefficients of w in the basis γ, and [v]β is the column n-vector of the coefficients of v in the basis β. What we have just seen by Lemma 2.1.6 is that every linear transformation T ∈ L(V, W ) where V is dimension n and W is dimension m is completely determined by its value on a basis, or by its matrix representation. That is if P β = {vi }1≤i≤n is a basis then given T (vi ) we can compute T (v) where v = ni=1 ξ i vi , ξ i ∈ IR to be T(

n X i=1

i

ξ vi ) =

n X i=1

ξi T (v i ).

30

CHAPTER 2. LINEAR TRANSFORMATIONS

Conversely any function Tˆ : β → W extends to a unique linear transformation T : V → W defined by n n X X i T( ξ vi ) = ξi Tˆ(v i ) i=1

i=1

which agrees with Tˆ on the basis β. We have therefore proved the following lemma. Lemma 2.1.7. Let β be a basis for the vector space V . Every linear transformation T ∈ L(V, W ) uniquely determines a function Tˆ : β → W . Conversely every function Tˆ : β → W determines a unique linear transformation T ∈ L(V, W ) which agrees with Tˆ on the basis β. A simple corollary is then Corollary 2.1.8. Let T, U ∈ L(V, W ), with the dimension of V being n and the dimension of W being m. Then T = U if and only if [T ]γβ = [U ]γβ for any (and hence all) basis β for V and γ for W .

2.2

Kernel, Rank, and the Rank-Nullity Theorem

Let T : V → W be a linear transformation the kernel of T (denoted ker(T )) or the null space of T is the set ker(T ) = { v ∈ V | T (v) = 0W } where 0W is the zero-vector in W . The image or range of T (denoted R(T )) is R(T ) = { w ∈ W | w = T (v) for some v ∈ V }. Lemma 2.2.1. Let T : V → W be a linear transformation, then ker(T ) is a subspace of V , and R(T ) is a subspace of W . Proof. We need to show that ker(T ) satisfies conditions 1),2),3) from the subspace Lemma 1.5.13. We begin by computing T (0) = T (a0) = aT (0) f or all a ∈ IR .

2.2. KERNEL, RANK, AND THE RANK-NULLITY THEOREM

31

Therefore T (0) = 0W , and 0 ∈ ker(T ). Now suppose u, v ∈ ker(T ), then T (au + v) = aT (u) + T (v) = 0W + 0W = 0. This shows property 2) and 3) hold from Lemma 1.5.13, and so ker(T ) is a subspace of V . The proof that R(T ) is a subspace is left as an exercise. The rank of T denoted by rank(T ), is defined to be the dimension of R(T ), rank(T ) = dim R(T ). The nullity of T denoted by nullity(T ), is defined to be nullity(T ) = dim ker(T ). Example 2.2.2. Let LA : IR n → IR m be the linear transformation LA (x) = Ax from example 2.1.2. Let β = {ei }1≤i≤n is the standard basis for IR n defined in 1.17. Then x = x1 e1 + x2 e2 + . . . xn en and (2.11)

n n X X i LA (x) = A( x ei ) = xi A(ei ). i=1

i=1

The kernel of LA is also called the the kernel of A, that is (2.12)

ker(A) = { x ∈ IR n | Ax = 0}.

where 0 is the zero vector in IR m . The range space R(LA ) of LA is then found from equation 2.11 to be R(LA ) = span{A(e1 ), . . . , A(en )}. By equation 2.6 we have

(2.13)

     A11 A1n      A2   A2    1  n R(LA ) = span  ..  , . . . ,  ..    .   .       Am m  A 1 n

or R(LA ) is the span of the columns of A.

32

CHAPTER 2. LINEAR TRANSFORMATIONS

Example 2.2.3. Let Z : IR 5 → IR 4 be the linear transformation     v v + 2w + 3y − 2z w    w + 2x + 3y + 3z      (2.14) Z  x  = v + w − x + y − 3z  .  y  w + 3x + 4y + 5z z To find ker(Z) we therefore need to solve     v + 2w + 3y − 2z 0  w + 2x + 3y + 3z  0     v + w − x + y − 3z  = 0 . w + 3x + 4y + 5z 0 The parametric solution to these equation is       v 1 0 w 1 1        x  = t1  1  + t2 −2 .       y  −1 0 z 0 1 Note that ker(Z) is a two-dimensional subspace with basis     1 0             1   1      βker(T ) =   1  , −2 .    −1  0        0 1 Writing the linear transformation as    v 1 2 0 w   0 1 2    Z  x  = 1 1 −1  y  0 1 3 z

3 3 1 4

   v −2   w 3  x  −3  y  5 z

then from equation 2.13   1     0 R(Z)span  1 ,    0

  2 1  , 1 1



 0 2  , −1 3

  3 3  , 1 4

  −2    3    . −3   5

2.2. KERNEL, RANK, AND THE RANK-NULLITY THEOREM

33

We leave as an exercise the following lemma. Lemma 2.2.4. Let T : V → W be a linear transformation with β = {vi }1≤i≤n a basis for V and γ = {wa }1≤a≤m a basis for W . The function T is injective if and only if (see equation 2.12), ker(A) = 0 where 0 is the zero vector in IR n . The next proposition provides a good way to find a basis for R(T ). Proposition 2.2.5. Let T : V → W be a linear transformation, where dim V = n and dim W = m. Suppose β1 = {v1 , . . . , vk } form a basis for ker T , and that β = {v1 , . . . , vw , uk+1 , . . . , un } form a basis for V , then {T (uk+1 ), . . . , T (un )} form a basis for R(T ). In practice this means we first find a basis for ker(T ) and then extend it to a basis for V . Apply T to the vectors not in the kernel, and these are a basis for R(T ). Example 2.2.6. Let Z : IR 5 → IR 4 be the linear transformation in equation 2.14 in example 2.2.3. We extend ker Z to a basis as in Lemma 2.2.5           1 0 1 0 0                  1 0 1 1         0            β =  1  , −2 , 0 , 0 , 1 .    −1  0  0 0 0       0 1 0 0 0 (Exercise: Check this is a basis!). We then find             0 0 1 0 2 1 0 1 0    2    1   0             Z 0 = 1 , Z 0 = 1 , Z 1 = −1 , 0 0 0 3 1 0 0 0 0 which by Lemma 2.2.5 form a basis for R(Z). Note that rank(Z) = 3. By counting the basis elements for V in the Lemma 2.2.5 leads to a Theorem known as the rank-nullity theorem, or the dimension theorem. Theorem 2.2.7. If V is a finite dimensional vector-space and T : V → W is a linear transformation, then dim V = dim ker(T ) + dim R(T ) = nullity(T ) + rank(T ).

34

2.3

CHAPTER 2. LINEAR TRANSFORMATIONS

Composition, Inverse and Isomorphism

The composition of two linear transformations plays a significant role in linear algebra. We begin with a straight-forward proposition. Proposition 2.3.1. Let U, V, W be vector-spaces, and let T : U → V , and S : V → W be linear transformations. The composition S ◦ T : U → W is a linear transformation. Proof. Exercise 3. Example 2.3.2. Let S : IR 4 → IR 2 be the linear transformation   w    x  2w + 4y + 4z   (2.15) S .  y  = x − 2z z Let T be the linear transformation given in equation 2.1 in example 2.1.1. Compute T ◦ S or S ◦ T if they are defined. Only T ◦ S is defined and we have, (2.16)     w   2(2w + 4y + 4z) + 3(x − 2z)  x   2w + 4y + 4z   T ◦S =  2w + 4y + 4z + x − 2z   y   = T x − 2z 2w + 4y + 4z − (x − 2z) z   4w + 3x + 8y + 2z =  2w + x + 4y + 2z  2w − x + 4y + 6z Note that it is not possible to compute S ◦ T . Finally, suppose that T : U → V , S : V → W are linear transformations and that β = {u1 , . . . , un }, γ = {v1 , . . . , vm }, δ = {w1 , . . . , wp } are basis for U, V and W respectively. Let [T ]γβ = (Bja )

1 ≤ a ≤ m, 1 ≤ j ≤ n

2.3. COMPOSITION, INVERSE AND ISOMORPHISM

35

and [S]δγ = (Aαa )

1 ≤ α ≤ p, 1 ≤ a ≤ m,

and [S ◦ T ]δβ = (Cjα )

1 ≤ α ≤ p, 1 ≤ j ≤ n

be the matrix representations of T (m × n),S (p × n) and S ◦ T (p × n). Theorem 2.3.3. The coefficients of the p × n matrix C are Cjα

(2.17)

=

m X

Aαa Bja .

b=1

Proof. Let’s check this formula. We compute ! m X S ◦ T (uj ) = S Bja va a=1

= =

m X a=1 m X

Bja S(va ) by linearity of S Bja

a=1

=

p X

Aαa wα

α=1

p m X X α=1

!

! Aαa Bja wα

rearrange the summation.

a=1

This is formula (2.17). This theorem is the motivation on how to define matrix multiplication. If A ∈ Mp×m (IR ) and B ∈ Mm×n (IR ) then the product AB ∈ Mp×n is the p × n matrix C whose entries are given by (2.17). Example 2.3.4. Let T and S be from equations (2.1) and (2.15), we then check (2.17) using the standard basis for each space. That is we check [T ◦ S] = [T ][S] where these are the matrix representations in the standard basis. We have     2 3 2 0 4 4   [T ] = 1 1 , [S] = 0 1 4 6 1 −1

36

CHAPTER 2. LINEAR TRANSFORMATIONS

while from equation (2.16),   4 3 8 2 [T ◦ S] = 2 1 4 2 . 2 −1 4 6 Multiplying,      2 3  4 3 8 2 2 0 4 4 [T ][S] = 1 1  = 2 1 4 2 . 0 1 4 6 1 −1 2 −1 4 6 so we have checked (2.17) in the standard basis. We check formula (2.17) again but this time we will use the basis (2.4) for 2 IR , IR 3 , and δ = {e1 , e2 , e3 , e4 } is again the standard basis for IR 4 . We’ve got [T ]γβ is equation (2.5), so we need to compute [S]βδ . This is   0 −1 0 2 β [S]δ = 2 1 4 2 while using (2.16), we have     1 3   3 1 6 4 2 2 γ 3 1  0 −1 0 2    2 2 −2 = − 2 2 [T ◦ S]δ = 1 2 1 4 2 −4 −1 −8 −6 −1 −2 This verifies equation (2.16). A linear transformation T : V → W is an isomorphism if T is an invertible function. That is T is a bijective function, and so one-to-one and onto. Proposition 2.3.5. A linear transformation T : V → W between two ndimensional vector spaces is an isomorphism if and only if ker(T ) = {0}. Proof. By exercise 2 in this chapter T is injective if and only if ker(T ) = {0}. By the dimension Theorem 2.2.7 dim ker(T ) = 0 if and only if R(T ) = W . In other words T is injective if and only if T is surjective. An n × n matrix A is invertible if there exists an n × n matrix B such that AB = BA = I where I is the n × n identity matrix, in which case we write B = A−1 . The standard test for invertibility is the following.

2.3. COMPOSITION, INVERSE AND ISOMORPHISM

37

Proposition 2.3.6. A matrix A ∈ Mn×n is invertible if and only if det(A) 6= 0. Furthermore if A is invertible then A−1 is obtained by row reduction of the augmented system ( A | I ) → ( I | A−1 ) The invertibility of a matrix and an isomorphism are related by the next lemma. Proposition 2.3.7. Let T : V → W be an isomorphism from the n-dimensional vector-space V to the m dimensional vector-space W , with β a basis for V and γ a basis for W . Then 1. W is n-dimensional, and 2. T −1 : W → V is linear. −1 −1 3. [T −1 ]βγ = [T ]γβ , where [T ]γβ is the inverse matrix of [T ]γβ . Example 2.3.8. Let T : IR 2 → IR 2 be the linear transformation,     x 2x + y T = . y x+y We have according to Lemma 2.3.7,     2 1 1 −1 −1 [T ] = , and [T ] = 1 1 −1 2 Therefore, T

−1

      x x−y −1 x = [T ] = . y y 2y − x

Double checking this answer we compute         x x−y 2(x − y) + 2y − x x −1 T ◦T =T = = y 2y − x x − y + 2y − x y By applying the dimension theorem we have a simple corollary. Corollary 2.3.9. A function T ∈ L(V, W ) with dim V = dim W is an isomorphism if and only if ker T = {0}.

38

CHAPTER 2. LINEAR TRANSFORMATIONS Let L(V, W ) be the set of all linear transformations from V to W .

Lemma 2.3.10. Let S, T ∈ L(V, W ) and a ∈ IR then aS + T ∈ L(V, W ) where (aS + T )(v) = aS(v) + T (v). In particular this makes L(V, W ) is a vector-space. Proof. We need to show aS + T is indeed linear. Let v1 , v2 ∈ V , c ∈ IR , then (aS + T )(cv1 + v2 ) = aS(cv1 + v2 ) + T (cv1 + v2 ) = acS(v1 ) + aS(v2 ) + cT (v1 ) + T (v2 ) = c(aS + T )(v1 ) + (aS + T )(v2 ) Therefore aS + T ∈ L(V, W ). The fact that L(V, W ) is a vector-space can be shown in a number of ways. We now find Lemma 2.3.11. Let V be an n dimensional vector-space and W an m dimensional vector space. Then dim L(V, W ) = nm Proof. Let β = {vi }{1 ≤ i ≤ n} and γ = {wa }1≤a≤m be basis for V and W respectively. Define Φ : L(V, W ) → Mm×n (IR ) by (2.18)

Φ(T ) = [T ]γβ .

We claim Φ is an isomorphism. First Φ is a linear transformation, which is an exercise. It follows from Corollary 2.1.8, that Φ(T0 ) = 0m×n is the unique linear transformation with the 0 matrix for its representation, and so Φ is injective. If A ∈ Mm×n then Tˆ(vi ) =

m X

Aai wa ,

a=1

by Lemma 2.1.7 extends to a linear transformation T : V → W with [T ]γβ = A. Therefore Φ is onto, and so an isomorphism.

2.4. EXERCISES

2.4

39

Exercises

1. A transformation T : IR 3 → IR 2 is defined by

(2.19)

    x x+y−z   T( y ) = . 2y + z z

(a) Show that T is linear. (b) Find the matrix representing T        1 0 0  i. 0 , 1 , 0 and 0 0 1        1 0 2  ii. 0 , 1 , 1 and 0 1 3        1 0 0  iii. 0 , 1 , 0 and 0 0 1        0 0 1  iv. 0 , 1 , 0 and 1 0 0

for each of the following basis.     1 0 , . 0 1     1 0 , . 0 1     1 1 , . 2 3     0 1 , . 1 0

(c) Compute the kernel of T using the matrix representations in (i), (ii) and (iii). Show that you get the same answer in each case. 2. Prove that if T : V → W is a linear transformation, then T is injective if and only if kernel(T ) = 0V . 3. Prove Lemma 2.2.4. (Hint use exercise 2.) 4. Let T : V → W be a linear transformation. Prove that R(T ) ⊂ W is a subspace. 5. For the linear transformation S in equation 2.15 from example 2.3.2, find a basis for ker(S) and R(S) using lemma 2.2.5

40

CHAPTER 2. LINEAR TRANSFORMATIONS 6. (a) Find a linear transformation T : IR 4 → IR 3 with         −1   1  2 1  1  0     ker(T ) = span  and image(T ) = span 0 , 1 . 1 , −1 , 1 1 0 2 Express your answer as T (xe1 + ye2 + ze3 + we4 ). (b) Is your answer unique? (c) Can you repeat a) with           1 0   1  2 1  1  0  2       ker(T ) = span   ,   ,   and R(T ) = span 0 , 1 , ? 1 −1 3 1 1 0 2 1 7. Prove Lemma 2.1.5. 8. Prove Proposition 2.2.5.

9. If T : V → W and S : W → U are linear transformations, show that S ◦ T : V → U is a linear transformation.     x x−y 2 2 10. Let S : IR → IR be given by S( )= . y 2y − x (a) Show that S is a linear transformation.   x   y , where T is the linear transformation given in (b) Find S ◦ T z equation 2.19 of problem 1. (c) Find [T ], [S], and [S ◦ T ] the matrices of these linear transformations with respect to the standard basis for IR 3 and IR 2 (d) Check that [S ◦ T ] = [S][T ]. (e) Find the matrix for the linear transformation S 2 = S ◦ S. (f) Find the inverse function S −1 : IR 2 → IR 2 . Show that S −1 is linear and that [S −1 ] = [S]−1 .

2.4. EXERCISES

41

11. Prove that Φ : L(V, W ) → Mm×n (IR ) defined in equation 2.18 is a linear transformation, by showing Φ(aT + S) = aΦ(T ) + Φ(S) , a ∈ IR , T, S ∈ L(V, W ).

42

CHAPTER 2. LINEAR TRANSFORMATIONS

Chapter 3 Tangent Vectors In multi-variable calculus a vector field on IR 3 is written v = P (x, y, z)i + Q(x, y, z)j + R(x, y, z)k where P, Q, R ∈ C ∞ (IR 3 ). While in differential geometry the same vector field would be written as a differential operator v = P (x, y, z)∂x + Q(x, y, z)∂y + R(x, y, z)∂z . This chapter will show why vector fields are written as differential operators and then examine their behavior under a changes of variables.

3.1

Tangent Vectors and Curves

Let p ∈ IR n be a point. We begin with a preliminary definition that a tangent vector at p is an ordered pair (p , a) where a ∈ IR n which we will write as (a)p . The point p is always included - we can not think of tangent vectors as being at the origin of IR n unless p is the origin. Let Vp be the set of tangent vectors at the point p. This set has the structure of a vector space over IR where we define addition and scalar multiplication by c(a)p + (b)p = (ca + b)p , c ∈ IR , a, b ∈ IR n . This purely set theoretical discussion of tangent vectors does not reflect the geometric meaning of tangency. 43

44

CHAPTER 3. TANGENT VECTORS

Recall that a smooth curve is function σ : I → IR n , where I ⊂ IR is an open interval. Let t0 ∈ I, and let p = σ(t0 ). The tangent vector to σ at p = σ(t0 ) ∈ IR n is ! dσ (a)p = = (σ(t ˙ 0 ))σ(t0 ) . dt t=t0 σ(t0 )

Example 3.1.1. From the helix above, the tangent vector at t = π4 ,is !   dσ 1 1 . (3.1) = −√ , √ , 1 dt t= π 2 2 ( √1 , √1 , π ) 1 1 π 4

(√ ,√ , 4 ) 2

2

2

2 4

Consider the curve   1 1 1 1 π α(t) = − √ t + √ , √ t + √ , t + 4 2 2 2 2 The tangent vector to α(t) at t = 0 is exactly the same as that in (3.1). Two completely different curves can have the same tangent vector at a point. A representative curve of a given vector (a)p ∈ Vp is a smooth curve σ : I → IR n defined on a non-empty open interval I that satisfies σ(t0 ) = p and σ(t ˙ 0 ) = a for some t0 ∈ I. Example 3.1.2. Let (a)p ∈ Vp . The curve σ : IR → IR n given by (3.2)

σ(t) = p + ta

satisfies σ(0) = p and σ(0) ˙ = a, and is a representative of the tangent vector (a)p . The curve with components i

σ i (t) = xi0 + ea (t−1)

(3.3)

where p = (xi0 )1≤i≤n is also a representative of (a)p with t0 = 1

3.2

Derivations

The second geometric way to think about tangent vectors is by using the directional derivative. Let (a)p be a tangent vector at the point p and let

3.2. DERIVATIONS

45

σ : I → IR n be representative curve (so σ(t0 ) = p, σ(t ˙ 0 ) = a). Recall that the directional derivative at p of a smooth function f ∈ C ∞ (IR n ) along (a)p and denoted Da f (p) is   d f ◦ σ Da f (p) = dt t=t0 n i X ∂f dσ = (3.4) i ∂x dt σ(t0 ) t=t0 i=1 n X ∂f , = ai i ∂x p i=1 where the chain-rule 1.4.2 was used in this computation. Using this formula let’s make a few observations about the mathematical properties of the directional derivative. Lemma 3.2.1. Let (a)p ∈ Vp , then the directional derivative given in equation 3.4 has the following properties: 1. Da f (p) ∈ IR , 2. The directional derivative of f only depends only the tangent vector (a)p and not on the curve σ used to represent it. 3. The function f in equation 3.4 need not be C ∞ everywhere but only C ∞ on some open set U in IR n with p ∈ U . 4. If g is a smooth function defined on some open set V with p ∈ V we can compute the directional derivatives Da (cf + g)(p) = cDa f (p) + Da g(p) Da (f g)(p) = Da (f )(p)g(p) + f (p)Da g(p)

(3.5)

where c ∈ IR . Proof. These claims are easily verified. For example, to verify the second property in equation 3.5, d (f ◦ σ g ◦ σ) |t=t0 dt n X ∂f dσ i ∂g dσ i = g(σ(t0 )) + f (σ(t0 )) i i ∂x dt ∂x σ(t0 ) dt t=t0 σ(t ) t=t 0 0 i=1

Da (f g)(p) =

= Da f (p)g(p) + f (p)Da g(p) by equation 3.4.

46

CHAPTER 3. TANGENT VECTORS

The directional derivative leads to the idea that a tangent vector at a point p is something that differentiates smooth function defined in an open set about that point, and satisfies properties 1-4 above. This is exactly what is done in modern differential geometry and we now pursue this approach. The next definition, which is a bit abstract, is motivated by point 3 in Lemma 3.2.1 above. Let p ∈ IR n , and define [ C ∞ (p) = C ∞ (U ), where p ∈ U and U is open. U ⊂IR n

If f ∈ C ∞ (p) then there exists an open set U containing p and f ∈ C ∞ (U ). Therefore C ∞ (p) consists of all functions which are C ∞ on some open set containing p. The set C ∞ (p) has the similar algebraic properties as C ∞ (U ). For example if f, g ∈ C ∞ (p) with Dom(f ) = U and Dom(g) = V , then define (f + g)(x) = f (x) + g(x), (f g)(x) = f (x)g(x),

x ∈ U ∩ V,

Note that p ∈ U ∩V , U ∩V is an open set, and therefore f +g and f g ∈ C ∞ (p). We now come to the fundamental definition. Definition 3.2.2. A derivation of C ∞ (p) is a function Xp : C ∞ (p) → IR which satisfies for all f, g ∈ C ∞ (p) and a, b ∈ IR , (3.6)

linearity Leibniz Rule

Xp (af + bg) = aXp (f (x)) + Xp (g), Xp (f g) = Xp (f )g(p) + f (p)Xp (g) .

Let Der(C ∞ (p)) denote the set of all derivations of C ∞ (p). If Xp , Yp ∈ Der(C ∞ (p)) and a ∈ IR we define aXp + Yp ∈ Der(C ∞ (p)) by (3.7)

(aXp + Yp )(f ) = aXp (f ) + Yp (f ),

f or allf ∈ C ∞ (p).

Lemma 3.2.3. With the operations 3.7 the set Der(C ∞ (p)) is a vector-space. Proof. The zero vector 0p is the derivation 0p (f ) = 0. The vector-space properties are now easy to verify.

3.2. DERIVATIONS

47

Example 3.2.4. Let f ∈ C ∞ (p) and let Xp (f ) = ∂xi f |p , where i ∈ {1, . . . , n}. Then Xp ∈ Der(C ∞ (IR n )). More generally, if (ξ i )1≤i≤n ∈ IR n then Xp = (ξ 1 ∂1 + ξ 2 ∂2 + . . . + ξ n ∂n )|p satisfies Xp ∈ Der(C ∞ (p)). Example 3.2.5. Let p ∈ IR n and let (a)p ∈ Vp . Define the function T : Vp → Der(C ∞ (p)) by (3.8)

T ((a)p )(f ) = Da f (p).

The function T takes the vector (a)p to the corresponding directional derivative. We need to check that T ((a)p ) is in fact a derivation. This is clear though from property 4 in Lemma 3.2.1. Lemma 3.2.6. The function T : Vp → Der(C ∞ (p)) is an injective linear transformation. Proof. The fact that T is linear is left as an exercise. To check that it is injective we use exercise 2 in Chapter 2. Suppose T ((a)p ) = 0p is the zero derivation. Then T ((a)p )(xi ) = Da xi (p) = ai = 0. Therefore (a)p = (0)p and T is injective. We now turn towards proving the important property that Der(C ∞ (p)) is an n-dimensional vector-space. In order to prove this, some basic properties of derivations are needed. Lemma 3.2.7. Let Xp ∈ Der(C ∞ (p)) then 1. for any open set U containing p, Xp (1U ) = 0 where 1U is defined in equation 1.1, and 2. Xp (c) = 0, for all c ∈ IR Proof. We prove 1 by using the Leibniz property from 3.6, Xp (1U ) = Xp (1U 1U ) = Xp (1U )1 + Xp (1U )1 = 2Xp (1U ). Therefore Xp (1U ) = 0. To prove 2, use linearity from equation 3.6 and part 1, Xp (c) = Xp (c · 1) = cXp (1) = 0 f or all , c ∈ IR .

48

CHAPTER 3. TANGENT VECTORS

Corollary 3.2.8. If f ∈ C ∞ (p) and U ⊂ Dom(f ) then Xp (f ) = Xp (1U · f ). Corollary 3.2.9. If f, g ∈ C ∞ (p) and there exists an open set V ⊂ IR n with p ∈ V , and f (x) = g(x), f or all x ∈ V , then Xp (f ) = Xp (g). Proof. Note that 1V · f = 1V · g. The result is then immediate from the previous corollary. The main theorem is the following. Theorem 3.2.10. Let Xp ∈ Der(C ∞ (p)), then there exists ξ i ∈ IR , 1 ≤ i ≤ n such that X i ∂f (3.9) Xp (f ) = ξ . i ∂x p 1≤i≤n The real numbers ξ i are determined by evaluating the derivation Xp on the coordinate functions xi , ξ i = Xp (xi ), and (3.10)

Xp =

n X

ξ i ∂xi |p .

i=1

Proof. Let f ∈ C ∞ (p), and let U = Dom(f ) be open, and p = (x10 , . . . , xn0 ). By Lemma 1.4.8 there exists functions gi ∈ C ∞ (p), 1 ≤ i ≤ n, defined on an open ball Br (p) ⊂ U such that the function F : Br (p) → IR given by (3.11)

F (x) = f (p) +

n X

(xi − xi0 )gi (x),

x ∈ Br (p),

i=1

where

∂f , gi (p) = ∂xi p

agrees with f (x) on Br (p). Since f and F agree on Br (p), Corollary 3.2.9 implies Xp (f ) = Xp (F ).

3.2. DERIVATIONS

49

Finally by using the properties for Xp of linearity, Leibniz rule and Xp (f (p)) = 0 (Lemma 3.2.7) in equation 3.11 we have ! n X Xp (f ) = Xp (F ) = Xp (xi − xi0 )gi (x) i=1

(3.12)

= =

n X i=1 n X

Xp (xi − xi0 )gi (p) + (xi − xi0 )|x=p Xp (g i ) Xp (xi )gi (p).

i=1

Property 2 in Lemma 1.4.8 gives gi (p) = (∂xi f )|p which in equation 3.12 gives equation 3.9 and the theorem is proved. Corollary 3.2.11. The set β = {∂xi |p }1≤i≤n forms a basis for Der(C ∞ (p)). Corollary 3.2.11 leads to the modern definition of the tangent space. Definition 3.2.12. Let p ∈ IR n . The tangent space at p denoted by Tp IR n is the n-dimensional vector-space Der(C ∞ (p). The linear transformation T : Vp → Tp IR n given in equation 3.8 is then an isomorphism on account of Lemma 3.2.6 and the dimension theorem (2.2.7). If σ : I → IR n is a curve with tangent vector (σ(t ˙ 0 ))σ(t0 ) then the corresponding derivation Xp = T ((σ(t ˙ 0 ))σ(t0 ) is (3.13)

n X ∂ dσ i . Xp = dt t=t0 ∂xi p=σ(t0 ) i=1

Example 3.2.13. Let p = (1, 1, 1) ∈ IR 3 and let (1, −2, 3)p ∈ Vp . The curve σ(t) = (1 + t, e−2t , 1 + 3t) is a representative curve for (1, −2, 3)p . The corresponding tangent vector Xp ∈ Tp IR 3 is Xp = (∂x − 2∂y + 3∂z )p . Example 3.2.14. Let p = (1, 2) ∈ IR 2 . Find Xp ∈ Der(C ∞ (p)) = Tp IR 2 in the coordinate basis where Xp (x2 + y 2 ) = 3 and Xp (xy) = 1. We begin by

50

CHAPTER 3. TANGENT VECTORS

writing Xp = a∂x + b∂y , a, b ∈ IR , then note by Theorem 3.2.10 above, that a = Xp (x) and b = Xp (y). Applying the two rules of derivations in definition 3.2.2 we get Xp (x2 + y 2 ) = (2x)|(1,2) Xp (x) + (2y)|(1,2) Xp (y) = 2Xp (x) + 4Xp (y) and Xp (xy) = = (y)|(1,2) Xp (y) + (x)|(1,2) Xp (y) = 2Xp (x) + Xp (y) Using Xp (x2 + y 2 ) = 3, Xp (xy) = 1 this gives the system of equations for a = Xp (x) and b = Xp (y) to be      2 4 a 3 = 2 1 b 1 The solution to which is    −1    a 2 4 3 = = b 2 1 1 Therefore

 Xp =

2 1 ∂x + ∂y 6 3

1 6 2 3

 .

 p

Example 3.2.15. Generalizing the previous example, suppose we want to find Xp ∈ Tp IR n = Der(C ∞ (IR n )), where f 1 , f 2 , . . . , f n ∈ C ∞ (p) are given together with (3.14)

Xp (f 1 ) = c1 , Xp (f 2 ) = c2 , . . . , Xp (f n ) = cn .

Under what conditions on f i , 1 ≤ i ≤ n do these equations completely determine P Xp ? By Theorem 3.2.10 or Corollary 3.2.11 above we know that Xp = ni=1 ξ i ∂xi |p . Applying this to (3.14) we find n 1 X i ∂f = c1 ξ i ∂x p i=1 n 2 X i ∂f ξ = c2 i ∂x p (3.15) i=1 .. . n X ∂f n = cn ξi i ∂x p i=1

3.3. VECTOR FIELDS

51

This system of equations can be written as a matrix/vector equation Jξ = c where ξ = (ξ 1 , . . . , ξ n ), c = (c1 , . . . , cn ) and J is the n × n Jacobian matrix ∂f j j Ji = , ∂xi p and Jξ is standard matrix vector multiplication. Equation (3.2.15) has a unique solution if and only if J is invertible, in other words if and only if det J 6= 0. A set of n-functions f 1 , f 2 , . . . , f n ∈ C ∞ (p) which satisfy ! ∂f j 6= 0, det ∂xi p are said to be functionally independent at p. The term functionally independent will be discussed in more detail in Section 8.3.

3.3

Vector fields

A vector field X on IR n is a function that assigns to each point p ∈ IR n a tangent vector at that point. That is X(p) ∈ Tp IR n . Therefore there exists functions ξ i (x), 1 ≤ i ≤ n on IR n such that, X=

n X

ξ i (x)∂xi |x ,

x ∈ IR n .

i=1

Pn

We will write this as X = i=1 ξ i (x)∂xi (dropping the |x ) since the point at which ∂xi is evaluated can be inferred from the coefficients. The vector field X is smooth or a C ∞ vector field if the coefficients ξ i (x) are smooth functions. Vector fields will play a prominent role in the rest of the book. Example 3.3.1. The vector field X on IR 3 given by X = xy 2 ∂x + xz∂y + ey+z ∂z is smooth.

52

CHAPTER 3. TANGENT VECTORS

Vector fields have algebraic properties that are similar to tangent vectors. Let U ⊂ IR n be an open set. A function X : C ∞ (U ) → C ∞ (U ) is called a derivation of C ∞ (U ) if for all f, g ∈ C ∞ (U ) and a, b ∈ IR , (3.16)

linearity Leibniz rule

X(af + bg) = aX(f ) + X(g), X(f g) = X(f )g + f X(g).

We let Der(C ∞ (U )) be the set of all derivations of C ∞ (U ). Example 3.3.2. Consider ∂xi where i ∈ {1, . . . , n}, and C ∞ (IR n ). The partial derivatives ∂xi satisfy properties one and two in equation 3.16, and so ∂xi ∈ Der(C ∞ (IR n )). More generally, let X = ξ 1 (x)∂x1 + ξ 2 (x)∂x2 + . . . + ξ n (x)∂xn where ξ 1 (x), . . . , ξ n (x) ∈ C ∞ (U ). The first order differential operator X is a derivation of C ∞ (U ). In IR 3 the differential operator X = yz∂x + x(y + z)∂z is a derivation of C ∞ (IR 3 ), and if f = xeyz , then X(f ) = (yz∂x + x(y + z)∂z )(xeyz ) = yzeyz + xy(y + z)eyz . Example 3.3.3. Let X ∈ Der(C ∞ (IR )), and n ∈ Z + then X(xn ) = nxn−1 X(x) where xn = x · x · x . . . · x. This is immediately true for n = 1. While by the Leibniz rule, X(xxn−1 ) = X(x)xn−1 + xX(xn−1 ). It then follows by induction that X(xn ) = xn−1 X(x) + x(n − 1)xn−2 X(x) = nxn−1 X(x).

3.4. EXERCISES

3.4

53

Exercises

1. Let (σ(t ˙ 0 ))σ(t0 ) be the tangent vector to the curve at the indicated point as computed in section 1.3. Give two other representative curves for each resulting tangent vector (different than in equations 3.2 and 3.3), and give the corresponding derivation in the coordinate basis (see example 3.2.13). (a) x = t3 + 2 , y = t2 − 2 at t = 1, (b) r = et , θ = t at t = 0, (c) x = cos(t), y = sin(t) , z = 2t , at t = π/2. 2. Let Xp be a tangent vector at a point p in IR n . Use the properties of a X(f ) 1 , f ∈ C ∞ (p), f (p) 6= 0. Hint: derivation to prove that X( ) = − f f (p)2 1 Write 1 = f · ( ). f 3. Let p = (3, 2, 2) ∈ IR 3 and suppose Xp ∈ Tp IR 3 is a tangent vector satisfying Xp (x) = 1, Xp (y) = 1, Xp (z) = 2. (a) Find the formula for Xp as a derivation in the coordinate basis. (b) Calculate Xp (x2 + y 2 ) and Xp (z/y) using just the properties of a derivation. p (c) Calculate Xp (f ) where f (x, y, z) = x2 − zy − 1 using just the properties of a derivation. Hint: Find a formula for Xp (f 2 ). 2 4. With p = (3, −4) ∈ IR 2 find Xp p ∈ Tp IR (with (x, y) coordinates) such that Xp (x + xy) = 4 and Xp ( x2 + y 2 ) = 51 .

5. With p = (1, −1) ∈ IR 2 find all Xp ∈ Tp IR 2 (with (x, y) coordinates) such that Xp (x2 + y 2 ) = 2. 6. Prove Corollary 3.2.11. 7. Let f1 = x + y + z, f2 = x2 + y 2 + z 2 , f3 = x3 + y 3 + z 3 be functions on IR 3 .

54

CHAPTER 3. TANGENT VECTORS (a) Show f1 , f2 , f3 are functionally independent at all point (a, b, c) such that (a − b)(b − c)(c − a) 6= 0. (b) Find the derivation Xp at the point p = (0, −1, 1) such that Xp (f1 ) = 2, Xp (f2 ) = 3, Xp (f3 ) = −4. 8. Given the vector fields X, Y on IR 3 , X = x∂x − 3y∂y + z∂z ,

Y = ∂x + x∂y − ∂z

and the functions f = x2 yz,

g = y2 − x + z2,

compute (a) X(f ),X(g), Y (f ), and Y (g), (b) Y (X(f )), X(Y (f ), and (c) X(Y (g)) − Y (X(g)). 9. Let X, Y be any two vector fields on IR n and define X ◦Y : C ∞ (IR n ) → C ∞ (IR n ) by X ◦ Y (f ) = X(Y (f )). Does X ◦ Y define a derivation? 10. Another common way to define tangent vectors uses the notion of germs. On the set C ∞ (p) define two functions f, g ∈ C ∞ (p) to be equivalent if there exists an open set V with p ∈ V , and V ⊂ Dom(f ), V ⊂ Dom(g) such that f (x) = g(x), f or allx ∈ V . Show that this is an equivalence relation on C ∞ (p). The set of equivalence classes are called germs of C ∞ functions at p. The tangent space Tp IR n is then defined to be the vector-space of derivations of the germs of C ∞ functions at p.

Chapter 4 The push-forward and the Jacobian 4.1

The push-forward using curves

Let Φ : IR n → IR m be a C ∞ function and let p ∈ IR n . The goal of this section is to use the function Φ to define a function Φ∗,p : Tp IR n → TΦ(p) IR m . The map Φ∗,p is called the push-forward, the differential, or the Jacobian of Φ at the point p. In section 4.2 we give a second definition of Φ∗,p and show it agrees with the first. The first definition has a simple geometric interpretation, while the second definition is more convenient for studying the properties of Φ∗,p . In this section we return to the discuss the interpretation of Φ∗,p in terms of curves. Example 4.1.1. Let Φ : IR 2 → IR 3 be given as in 4.23 by (4.1)

Φ(x, y) = (u = x2 + y 2 , v = x2 − y 2 , w = xy).

Let σ(t) = (1 + 3t, 2 − 2t), which has the tangent vector at t = 0 given by (4.2)

(σ(0); σ(0)) ˙ = (3, −2)(1,2) . 55

56

CHAPTER 4. THE PUSH-FORWARD AND THE JACOBIAN

Using the map 3.8 the tangent vector in equation 4.11 is the derivation (4.3)

Xp = (∂x + 2∂y )|p , p = (1, 2).

The image curve Φ ◦ σ : IR → IR 3 is given by  Φ(σ(t)) = ((1 + 3t)2 + (2 − 2t)2 ), ((1 + 3t)2 − (2 − 2t)2 ), (1 + 3t)(2 − 2t) . The tangent vector to Φ ◦ σ at t = 0 is then     d Φ(σ(t) = (−2, 14, 4)(5,−3,2) (4.4) Φ(σ(0)); dt t=0 Again using the map 3.8 the tangent vector in equation 4.12 is the derivation (4.5)

Yq = (−2∂u + 14∂v + 4∂w )|q , q = (5, −3, 2).

Note that by example ?? (4.6)

Yq = Φ∗,p Xp = (−2∂u + 14∂v + 4∂w )|q .

Equations 4.11, 4.12, and 4.6 can be summarized as follows. If Xp is the tangent vector to the curve σ at p = σ(t0 ), then Φ∗,p Xp is the tangent vector to Φ ◦ σ at the image point Φ(p) = Φ(σ(t0 ))! Let Xp ∈ Tp IR n and let σ : I → IR n be a smooth curve which represents Xp by σ(t ˙ 0 ) = Xp . By composing Φ : IR n → IR m with σ we get the image curve Φ ◦ σ : I → IR m from which we prove Theorem 4.1.2. The image of the tangent vector to σ at p = σ(t0 ) is the tangent vector of the image curve Φ(σ) at q = Φ(σ(t0 )),   d Φ ◦ σ (4.7) Ψ∗,p σ(t ˙ 0) = dt t=0 XXXXXXXXXXXXXXXX Diagram XXXXXXXXXXXXXXXX Proof. We begin by writing equation the tangent vector Xp = σ(t ˙ 0 ) where p = σ(t0 ) as n X dσ i ∂ (4.8) Xp = . dt t=t0 ∂xi p i=1

4.1. THE PUSH-FORWARD USING CURVES

57

For the curve Φ(σ(t)), let Yq be the tangent vector at q = Φ(σ(t0 )) which is ! n n n X X X dΦi (σ(t)) ∂ ∂Φi dσ j ) ∂ (4.9) Yq = = i dt ∂xj p dt t=t0 ∂y i . t=t0 ∂y q i=1 i=1 j=1 where (y i )1≤i≤n denotes the coordinates on the image space of Φ. On the other hand using the tangent the tangent vectof Xp in equation 4.8, the push-foward Φ∗,p (Xp ) ∈ Tq IR m from equation 4.15 in Theorem 4.2.3 is precisely 4.9. Before giving the definition we consider an example. Example 4.1.3. Let Φ : IR 2 → IR 3 be given as in 4.23 by (4.10)

Φ(x, y) = (u = x2 + y 2 , v = x2 − y 2 , w = xy).

Let σ : IR → IR 2 be the curve σ(t) = (1 + 3t, 2 − 2t), which has the tangent vector at t = 0 given by (4.11)

(σ(0)) ˙ σ(0) = (3, −2)(1,2) .

The image curve Φ ◦ σ : IR → IR 3 is given by  Φ(σ(t)) = ((1 + 3t)2 + (2 − 2t)2 ), ((1 + 3t)2 − (2 − 2t)2 ), (1 + 3t)(2 − 2t) . The tangent vector to the image curve Φ ◦ σ when t = 0 is then   d (4.12) Φ(σ(t) = (−2, 14, 4)(5,−3,2) . dt t=0 Φ(σ(0)) The map Φ∗,p we define below has the property that if (σ(t ˙ 0 ))σ(t0 ) is the tangent vector to the curve σ at p = σ(t0 ) with corresponding derivation Xp as in equation 3.13, then Yq = Φ∗,p Xp is the derivation Yq corresponding to the tangent vector Φ ◦ σ at the image point Φ(p) = Φ(σ(t0 ))! In example 4.1.3 this means with p = (1, 2) and Xp = (3∂x − 2∂y )(1,2) from 4.11, and Yq = (−2∂u + 14∂v + 4∂w )(5,−3,2) from 4.12, that Yq = Φ∗,p Xp .

58

4.2

CHAPTER 4. THE PUSH-FORWARD AND THE JACOBIAN

The push-forward using derivations

Let Xp ∈ Tp IR n be a tangent vector which by definition is completely determined by its action on C ∞ (p). In order that Φ∗,p (Xp ) to be a tangent vector at q = Φ(p) ∈ IR m we need to define Φ∗,p (Xp )(g) f or all g ∈ C ∞ (q), and show that the result is a derivation (see 3.2.2). Before giving the definition we make the following simple observation. Let g ∈ C ∞ (q) which is a function on the image space of Φ. The composition g ◦ Φ : IR n → IR , is a smooth function on an open subset of IR n which contains p, and so Xp (g ◦ Φ) is well defined! Using this calculation, we are now ready to define Φ∗,p (Xp ). Theorem 4.2.1. Let Φ : IR n → IR m be a smooth function, let p ∈ IR n and q = Φ(p). Given Xp ∈ Tp IR n define Φ∗,p (Xp ) : C ∞ (q) → IR by (4.13)

Φ∗,p (Xp )(g) = Xp (g ◦ Φ) f or all g ∈ C ∞ (q).

Then Φ∗,p (Xp ) ∈ Tq IR n . Proof. The function g ∈ C ∞ (q) in 4.13 is arbitrary and so Φ∗,p (Xp ) : C ∞ (q) → IR . Denote this function by Yq = Φ∗,p (Xp ). If we check that Yq is a derivation of C ∞ (q), then by definition 3.2.2 Yq ∈ Tq IR m . This is easy and demonstrates the power of using derivations. Let g, h ∈ C ∞ (q) a, b ∈ IR , then we compute using the fact that Xq is a derivation, Yq (ag + bh) = Xq (ag ◦ Φ + bh ◦ Φ) = aXp (g ◦ Φ) + bXp (h ◦ Φ) = aYq (g) + bYq (h) and using · for multiplication of functions we have, Yq (g · h) = Xp (g ◦ Φ · h ◦ Φ) = Xp (g ◦ Φ) · h ◦ Φ(p) + g ◦ Φ(p) · Xp (h ◦ Φ) = Yq (g) · h(q) + g(q) · Yq (h). Therefore Yq in (4.13) is derivation of C ∞ (q) and so is an element of Tq IR m .

4.2. THE PUSH-FORWARD USING DERIVATIONS

59

We now check that Φ∗,p is a linear transformation. Theorem 4.2.2. The function Φ∗,p : Tp IR n → TΦ(p) IR m is a linear transformation. Proof. Let Xp , Yp ∈ Tp IR n and a ∈ IR . We use equation 4.13 to compute Φ∗,p (aXp + Yp )(g) = (aXp + Yp )(g ◦ Φ) = aXp (g ◦ Φ) + Yp (g ◦ Φ) = aΦ∗,p Xp (g) + Φ∗,p Yp (g). Therefore Φ∗,p (aXp + Yp ) = aΦ∗,p Xp + Φ∗,p Yp . To gain a better understanding of definition (4.13) we now write out this equation in coordinates using the coordinate basis. Proposition 4.2.3. Let Φ : IR n → IR m be a smooth function and let q = Φ(p). Let β = {∂xi |p }1≤i≤n be the coordinate basis for Tp IR n and let γ = {∂ya |q }1≤a≤m be the coordinate basis for Tq IR m . If Xp ∈ Tp IR n is given by (4.14)

Xp =

n X

ξ i ∂xi |p ,

ξ i ∈ IR ,

i=1

then (4.15)

Φ∗,p Xp =

m n X X a=1

i=1

! a ∂ ∂Φ ξi ∂xi p ∂y a q

Proof. By definition we have (4.16)

(Φ∗,p Xp )(g) =

n X

! ξ i ∂xi |p

(g ◦ Φ)

i=1

If we expand out the derivative term in 4.16 using the chain rule we get, m X ∂g ◦ Φ ∂g ∂Φa = . ∂xi p a=1 ∂y a q ∂xi p

60

CHAPTER 4. THE PUSH-FORWARD AND THE JACOBIAN

Using this in (4.16), we get ! m a X ∂g ∂Φ Xp (g ◦ Φ) = ξi ∂y a q ∂xi p a=1 i=1 ! m n a X X ∂g i ∂Φ = ξ ∂xi p ∂y a q a=1 i=1 n X

(4.17)

where in the second line we have switched the order of the summations. By taking the coefficients of ∂ya |q equation 4.15 now follows directly from 4.17. The following corollary gives us the important interpretation of the function Φ∗,p . Corollary 4.2.4. The matrix representation of the linear transformation Φ∗,p : Tp IR n → Tq IR m in the basis β = {∂xi |p }1≤i≤n for Tp IR n and the basis γ = {∂ya |q }1≤a≤m for TΦ(p) IR m is given by the Jacobian ∂Φa (4.18) [Φ∗,p ] = ∂xi p Proof. Suppose the matrix representation of Φ∗,p is given by the matrix (Jia ) ∈ Mm×n (IR ) (see equation 2.3) (4.19)

Φ∗,p (∂xi |p ) =

m X

Jia ∂ya |q .

a=1

This entries in the matrix Jia are easily determined using equation 4.15 which gives m X ∂Φa (4.20) Φ∗,p (∂xi |p ) = ∂ya |q , i ∂x p a=1 therefore (4.21)

∂Φa . [Φ∗,p ] = ∂xi p

4.2. THE PUSH-FORWARD USING DERIVATIONS

61

A particulary useful way to compute Φ∗,p is the next corollary. Corollary 4.2.5. The coefficients of the image vector Yq = Φ∗,p Xp in the coordinate basis γ = {∂ya |q }1≤a≤m are given by n X ∂Φa i a a (4.22) η = Φ∗,p Xp (y ) = ξ. i ∂x p i=1 Example 4.2.6. Let Φ : IR 2 → IR 3 be given by Φ(x, y) = (u = x2 + y 2 , v = x2 − y 2 , w = xy).

(4.23)

Letp = (1, 2) and Xp = (3∂x − 2∂y )|p . Compute Φ∗,p Xp first using derivation approach 4.13, and then the Jacobian (4.21. With q = Φ(p) = (5, −3, 2) we now find the coefficients of Yq = (a∂u + b∂v + c∂w )|q = Φ∗,p Xp by using the definition 4.13 in the form of equation 4.22, a= b= c=

Φ∗,p (Xp )(u) = Φ∗,p (Xp )(v) = Φ∗,p (Xp )(w) =

(3∂x − 2∂y )(1,2) (x2 + y 2 ) = X(1,2) (x2 − y 2 ) = X(1,2) (xy) =

6 − 8 = −2 6 + 8 = 14 6 − 2 = 4.

This gives (4.24)

Yq = (−2∂u + 14∂v + 4∂w )|q .

We now compute Φ∗,p Xp using the Jacobian matrix at (1, 2), which is      a 2x 2y 2 4 ∂Φ = 2x −2y  = 2 −4 , ∂xi (1,2) y x (1,2) 2 1 and the coefficients of Φ∗,p (Xp ) are     2 4   −2 2 −4 3 =  14  . −2 2 1 4 This gives the same coefficients for Φ∗,p Xp in the coordinate basis {∂u |q , ∂v |q , ∂w |q } as in equation (4.24). Remark 4.2.7. Note that the definition of Φ∗,p in 4.13 was given independent of coordinates! Then the coordinate dependent formulas 4.22 and 4.18 for Φ∗,p were derived from its definition. This is what we strive for when giving definitions in differential geometry.

62

CHAPTER 4. THE PUSH-FORWARD AND THE JACOBIAN

4.3

The Chain rule, Immersions, Submersions, and Diffeomorphisms

Let Φ : IR n → IR m and let (4.25)

T IR n =

[

Tp IR n .

p∈IR n

The set T IR n consists of all possible tangent vectors at every possible point and an element of T IR n is just a tangent vector Xp at a particular point p ∈ IR n . We now define the map Φ∗ : T IR n → T IR m by the point-wise formula (4.26)

Φ∗ (Xp ) = Φ∗,p Xp .

At a generic point p = (x1 , . . . , xn ) ∈ IR n with standard basis {∂xi |p }1≤i≤n for Tp IR n and basis {∂ya |Φ(p) }1≤a≤m for TΦ(p) IR m the matrix representation of Φ∗ is again computed by using 4.2.4 to give the m × n functions on IR n  a ∂Φ (4.27) [Φ∗ ] = . ∂xi Example 4.3.1. Let α : IR → IR n be a smooth curve. Then n X dαi ∂ f or all t ∈ IR . α∗ ∂t = dt ∂xi α(t) i=1 This formula agrees with the point-wise given formula in equation 3.13. We now prove the chain-rule using the derivation definition of Φ∗ . Theorem 4.3.2. (The chain rule) Let Φ : IR n → IR m and let Ψ : IR m → IR p . Then (4.28)

(Ψ ◦ Φ)∗ = (Ψ∗ ) ◦ (Φ∗ )

Proof. Note that the right hand side of (4.28) is a composition of linear transformations. It suffices to check this point-wise on account of 4.26. Let p ∈ IR n , r = Ψ ◦ Φ(p), and let g ∈ C ∞ (r). Let Xp ∈ Tp IR n then by definition 4.13 (Ψ ◦ Φ)∗,p Xp (g) = Xp (g ◦ Ψ ◦ Φ) .

4.3. THE CHAIN RULE, IMMERSIONS, SUBMERSIONS, AND DIFFEOMORPHISMS63 On the right side we get by definition 4.13 ((Ψ∗ ) ◦ (Φ∗ )Xp ) (g) = (Φ∗ Xp ) (g ◦ Φ) = Xp (g ◦ Φ ◦ Ψ). We have shown (Ψ ◦ Φ)∗ Xp (g) = ((Ψ∗ ) ◦ (Φ∗ )Xp ) (g) for all g ∈ C ∞ (r). Therefore (Ψ ◦ Φ)∗ Xp = ((Ψ∗ ) ◦ (Φ∗ )) Xp for all Xp ∈ T IR n and the theorem is proved. Recall in section 3.1 that the rank of a linear transformation T is the dimension of R(T ), the range space of T . Let Φ : U → V , where U ⊂ IR n and V ⊂ IR m be a C ∞ function. Definition 4.3.3. The function Φ is an immersion at p ∈ U if the rank of Φ∗,p : Tp U → TΦ(p) V is n. The function Φ is an immersion if it is an immersion at each point in p ∈ U . By fact that dim Tp U is n-dimensional, the dimension Theorem 2.2.7 shows that the map Φ∗ is an immersion if it is injective at each point p ∈ U . This also implies by exercise 2 in the section 3 that the kernel Φ∗,p at each point p ∈ U consists of only the zero tangent vector at p. Example 4.3.4. Let Φ : IR 2 IR 3 be given by (4.29)

Φ(x, y) = (u = x2 + y 2 , v = x2 − y 2 , w = xy).

At a generic point p = (x, y) ∈ IR 2 with standard basis {∂x |p , ∂y |p } for Tp IR 2 and basis {∂u |Φ(p) , ∂v |Φ(p) , ∂w |Φ(p) } for TΦ(p) IR 3 the matrix representation of Φ∗ is (see equation 4.27)   2x 2y [Φ∗ ] = 2x −2y  . y x According to Lemma 2.2.4 (see also exercise 2), Φ∗ is injective if and only if ker[Φ∗ ] is the zero vector in IR 2 . This only happens when x = y = 0. Therefore Φ is an immersion at every point except the origin in IR 2 . Definition 4.3.5. The function Φ is a submersion at p ∈ U if the rank of Φ∗,p : Tp U → TΦ(p) V is m. The function Φ is an submersion if it is a submersion at each point in p ∈ U .

64

CHAPTER 4. THE PUSH-FORWARD AND THE JACOBIAN

In other words, Φ is a submersion if Φ∗,p is surjective for each point p ∈ IR n . For the next definitions we consider the case m = n. Definition 4.3.6. A smooth function Φ : U → V , U, V ⊂ IR n which is invertible, with inverse function Φ−1 : V → U , which is also smooth, is a diffeomorphism A less restrictive notion is a local diffeomorphism. Definition 4.3.7. A smooth function Φ : U → V is a local diffeomorphism at p ∈ U if there exists an open set W with W ⊂ U and p ∈ W , such that Φ : W → Φ(W ) is a diffeomorphism. Theorem 4.3.8. Inverse Function Theorem. Let Φ : U → V be C ∞ . The function Φ is a local diffeomorphism at p ∈ U if and only if Φ∗,p : Tp U → TΦ(p) V is an isomorphism. Proof. The proof that this condition is necessary is simple. If W ⊂ U is an open set on which Φ restricts to a diffeomorphism, let Z = Φ(W ), and Ψ : Z → W be the inverse of Φ. By the chain-rule (Ψ ◦ Φ)∗,p = Ψ∗,Φ(p) Φ∗,p = I and so Φ∗,p is an isomorphism at p ∈ W . The proof of converse is given by the inverse function theorem and can be found in many texts [?]. Example 4.3.9. Let f : IR → IR be f (t) = t3 then 1

f −1 (t) = t 3 . We have for W = IR − 0, the function f : W → IR is a local diffeomorphism.

4.4. CHANGE OF VARIABLES

4.4

65

Change of Variables

Let U, V ⊂ IR n be open sets. A change of coordinates is a diffeomorphism Φ : U → V . If x1 , . . . , xn are coordinates on U and y 1 , . . . , y n are coordinates on V , then the map Φ is given by y i = Φi (x1 , . . . , xn ),

1 ≤ i ≤ n.

If p ∈ U and it has x-coordinates (x10 , . . . , xn0 ), then y0i = Φi (x10 , . . . , xn0 ) are the y-coordinates of Φ(p). Since Φ is a diffeomorphism, then Φ−1 : V → U is a diffeomorphism. Example 4.4.1. Let V = IR 2 − {(x, 0), then (4.30)

x ≥ 0} and let U = IR + × (0, 2π)

Φ(r, θ) = (x = r cos θ, y = r sin θ)

is a change of coordinates. Example 4.4.2. Let V = IR 3 − {{(x, y, z), and let U = IR + × (0, 2π) × (0, π) then

x ≥ 0} ∪ {(x, y, z),

z 6= 0}}

Φ(ρ, θ, φ) = (x = ρ cos θ sin φ, y = ρ sin θ sin φ, ρ = cos φ) is a change of coordinates. We now look at how vector-fields behave under a change of coordinates. Let U, V ⊂ IR n and Φ : U → V be a change of coordinates, and let X be a vector-field on U . Since for each point p ∈ U , Xp is a tangent vector, we can map this tangent vector to the image using Φ∗,p as in sections 4.1, 4.2 Φ∗,p X(p) ∈ TΦ(p) V. If we do this for every point p ∈ U , and use the fact that Φ is a diffeomorphism (so one-to-one and onto) we find that Φ∗,p applied to X defines a tangent vector at each point q ∈ V , and therefore a vector-field. To see how this is defined, let q ∈ V , and let p ∈ U be the unique point in U such that p = Φ−1 (q). We then define the vector-field Y point-wise on V by Yq = Φ∗,p (Xp ).

66

CHAPTER 4. THE PUSH-FORWARD AND THE JACOBIAN

Let’s give a coordinate version of this formula. Suppose (xi )1≤i≤n are coordinates on U , and (y i )1≤i≤n are coordinates on V . We have a vector-field X=

n X

ξ i (x)

∂ , ∂xi

η i (y)

∂ . ∂y i

i=1

and want to find the vector-field Y =

n X i=1

Let p ∈ U , and q = Φ(p) the formula we have is Yq = (Φ∗,p Xp )p=Φ−1 (q) and by equation 4.15 is ! n n X X ∂ ∂Φj i . (4.31) Yq = ξ (p) ∂xi p −1 ∂y j q j=1 i=1 p=Φ

The coefficients of the vector-field Y are then ! n j X ∂Φ (4.32) η j (q) = ξ i (p) ∂xi p i=1

(q)

, p=Φ−1 (q)

or by equation 4.22 (4.33)

η j (q) = Y (y i )||p=Φ−1 (q) = X(Φi )||p=Φ−1 (q) ,

The formulas 4.32 and 4.33 for Y is called the change of variables formula for a vector-field. The vector-field Y is also called the push-forward by Φ of the vector-field X and is written Y = Φ∗ X. Example 4.4.3. The change to polar coordinates is given in 4.30 in example 4.4.1 above. Let X = r∂r . We find this vector-field in rectangular coordinates, by computing Φ∗ X using equation 4.33.This means Φ∗ X = a∂x + b∂y

4.4. CHANGE OF VARIABLES

67

where a = Φ∗ X(x)|Φ−1 (x,y) and b = Φ∗ X(y)|Φ−1 (x,y) . We compute these to be a = Φ∗ X(x) = (r∂r (r cos θ)) |Φ−1 (x,y) = x b = Φ∗ X(y) = (r∂r (r sin θ)) |Φ−1 (x,y) = y This gives the vector-field, (4.34)

Φ∗ (r∂r ) = x∂x + y∂y .

Similarly we find (4.35)

Φ∗ (∂θ ) = −y∂x + x∂y .

−1 −1 −1 p We now compute Φ∗ ∂x and Φ∗ ∂y . Using the formula Φ (x, y) = (r = x2 + y 2 , θ = arctan(yx−1 ), gives

x r y −1 (Φ∗ ∂x )(θ) = − 2 . r (Φ−1 ∗ ∂x )(r) =

Therefore (4.36)

Φ−1 ∗ ∂x = cos θ∂r −

sin θ ∂θ . r

Φ−1 ∗ ∂y = sin θ∂r +

cos θ ∂θ . r

Similarly we have (4.37)

−1 There is a simple way to compute Φ−1 ∗ without having to compute Φ directly which we now describe. Suppose we are given Φ : U → V a diffeomorphism, we have for the push forward of the vector-fields ∂xj from equation 4.31, n X ∂Φi ∂ Φ∗ (∂xj ) = . j i ∂x −1 (y) ∂x Φ i=1

Similarly we have (4.38)

n −1 i X ∂(Φ ) ∂ Φ−1 . ∗ (∂y j ) = j i ∂y Φ(x) ∂x i=1

68

CHAPTER 4. THE PUSH-FORWARD AND THE JACOBIAN

However by the chain-rule Φ∗ ◦ Φ−1 ∗ = I∗ and therefore in a basis this yields −1 [Φ∗ ][Φ∗ ] = In which in the coordinate basis yields   i −1  ∂Φ ∂(Φ−1 )i = (4.39) . j ∂y ∂xj Φ(x) Therefore using equation 4.39 in equation 4.38 we have −1 n  X ∂(Φi ∂ −1 (4.40) Φ∗ ∂yj = . j i ∂y ∂x i=1 Example 4.4.4. Continuing from example 4.4.3 we have   a  ∂Φ cos θ −r sin θ = sin θ r cos θ ∂xi and 

∂Φa ∂xi

−1

 cos θ sin θ . = − sinr θ cosr θ 

Therefore equation 4.40 yields   sin θ cos θ sin θ −1 = cos θ∂r − Φ∗ ∂x = [∂r , ∂θ ] ∂θ , cos θ sin θ − r r r which agrees with 4.36. A similar computation reproduces Φ−1 ∗ ∂y (compare with in equation 4.37) Example 4.4.5. Let Φ be the change to spherical coordinates Φ(ρ, θ, φ) = (x = ρ cos θ sin φ, y = ρ sin θ sin φ, z = ρ cos φ), from example 4.4.2. We compute Φ∗ by using equation ??, (Φ∗ ∂ρ )(x) = cos θ sin φ (Φ∗ ∂ρ )(y) = sin θ sin φ (Φ∗ ∂ρ )(z) = cos φ. Therefore substituting Φ−1 (x, y, z) gives Φ∗ ∂ρ = (cos θ sin φ∂x + sin θ sin φ∂y + cos φ∂z )Φ−1 (ρ,θ,φ) 1 =p (x∂x + y∂y + z∂z ) . x2 + y 2 + z 2

4.4. CHANGE OF VARIABLES

69

Similarly we get Φ∗ ∂θ = −y∂x + x∂y , p and using ρ sin φ = x2 + y 2 we get p z Φ∗ ∂φ = p (x∂x + y∂y ) − x2 + y 2 ∂z . x2 + y 2 Finally suppose we have a vector-field X = Aρ ∂ρ + Aθ ∂θ + Aφ ∂φ where Aρ , Aθ , Aφ are functions of ρ, θ, φ. We then compute Φ∗ X = Ax ∂x + Ay ∂y + Az ∂z which is a vector-field in rectangular coordinates by Aρ Φ∗ (Aρ ∂ρ + Aθ ∂θ + Aφ ∂φ ) = p (x∂x + y∂y + z∂z ) + x2 + y 2 + z 2 p z (x∂x + y∂y ) − x2 + y 2 ∂z ) = + Aθ (−y∂x + x∂y ) + Aφ ( p x2 + y 2 ! xAρ xzAφ p − yAθ + p ∂x + x2 + y 2 + z 2 x2 + y 2 ! yAρ yzAφ p + xAθ + p ∂y + x2 + y 2 + z 2 x2 + y 2 ! p zAρ p − x2 + y 2 Aφ ∂z . x2 + y 2 + z 2 Therefore (see Appendix VII, [?]), xzAφ − yAθ + p x2 + y 2 + z 2 x2 + y 2 yAρ yzAφ Ay = p + xAθ + p 2 2 2 x +y +z x2 + y 2 p zAρ Az = p − x2 + y 2 Aφ , x2 + y 2 + z 2

Ax = p

xAρ

where Aρ , Aθ , Aφ are expressed in terms or x, y, z.

70

CHAPTER 4. THE PUSH-FORWARD AND THE JACOBIAN

Example 4.4.6. Let U = { (x, y, z) ∈ IR 3 | z 6= 0 } and let Φ : U → V be Φ(x, y, z) = (x −

y2 y , z, ). 2z z

where V = { (u, v, w) ∈ IR 3 | v 6= 0 }. We write the vector-field X = y∂x + z∂y . in (u, v, w) coordinates by computing y2 )=0 2z Φ∗ (X)(v) = (y∂x + z∂y )(z) = 0 y Φ∗ (X)(u) = (y∂x + z∂y )( ) = 1. z Φ∗ (X)(u) = (y∂x + z∂y )(x −

Therefore Φ∗ X = ∂w . Remark 4.4.7. Given a smooth function Φ : IR n → IR m , it is not possible to push-forward a generic vector-field X on IR n and get another vector-field on IR m unless m = n.

4.5. EXERCISES

4.5

71

Exercises

1. For each of the following maps and derivations compute Φ∗,p (X) using the definition of Φ∗,p . (a) Φ(x, y) = (u = xy, v = x2 y, w = y 2 ), p = (1, −1), Xp = 3∂x −2∂y . (b) Φ(x, y, z) = (r = x/y, s = y/z), p = (1, 2, −2), Xp = ∂x −∂y +2∂z . (c) Φ(t) = (x = exp(3t) sin(πt), y = exp(t) cos(2πt), z = t), p = (0), X = ∂t . 2. Repeat question 1 by choosing a curve σ with σ(t ˙ 0 ) = Xp and using Theorem 4.1.2. 3. Let Φ(x, y) = (u = x3 + 3xy 2 , v = 3x2 y + y 3 ), and let p = (1, −1). Compute the matrix representation of Φ∗,p : Tp IR 2 → TΦ(p) IR 2 with respect to the basis {X1 , X2 } for the tangent space for Tp R2 and {Y1 , Y2 } for TΦ(p) R2 . (a) X1 = ∂x |p , X2 = ∂y |p , Y1 = ∂u |q , Y2 = ∂v |q . (b) X1 = (∂x − 2∂y )|p , X2 = (∂x + ∂y )|p , Y1 = ∂u |q , Y2 = ∂v |q . (c) X1 = (∂x − 2∂y )|p , X2 = (∂x + ∂y )|p , Y1 = (2∂u + 3∂v )|q , Y2 = (∂u + 2∂v )|q . ∂ ∂ − p = (1, 1), in polar 4. (a) Write the tangent vector Xp = ∂x p ∂y p coordinates using the coordinate basis. ∂ ∂ (b) Write the tangent vector Xp = −2 +3 , p = (r = 1, θ = ∂r p ∂θ p π/3) given in polar coordinates, in Cartesian coordinates using the coordinate basis. (c) Let   1 2 2 Φ(u, v) = uv, (v − u ) 2 be the change coordinates. Write the tangent vector to parabolic ∂ ∂ Xp = − , p = (u = 1, v = 1) given in parabolic ∂u p ∂v p coordinates, in Cartesian coordinates using the coordinate basis.

72

CHAPTER 4. THE PUSH-FORWARD AND THE JACOBIAN (d) Write the tangent vector Xp in part c) in polar coordinates using the coordinate basis. 5. Check the chain rule (Ψ ◦ Φ)∗,p = Ψ∗,p ◦ Φ∗,p for each of the following maps at the point indicated. You may calculate in the coordinate basis. (a) Φ(x, y) = (x2 −y 2 , xy, x+y+2), (2, 1) (b) Φ(t) = (t, t2 , t3 ), 2

3

(c) Φ(t) = (t, t , t ),

Ψ = (u, v, w) = (1/u, 1/v, 1/w),

Ψ = (u, v, w) = (u/v, v/w, uw), 2

2

2

Ψ = (u, v, w) = (u + v + w ),

p = (1) p = (1)

6. Find the inverses of the following maps Φ and check that (Φ∗,p )(−1) = (Φ(−1) )∗,p at the indicated point. (a) Φ(x, y, z) = (x + 1, y + x2 , z − xy),

p = (1, 1, 2).

(b) Φ(x, y) = ( 21 (x2 − y 2 ), xy) p = (2, 1), on U = {(x, y) ∈ IR 2 | x > y > 0}. 7. Find the points, if any, in the domain of the following maps about which the map fails to be a local diffeomorphism. (a) Φ(x, y) = (x3 − 3xy 2 , −y 3 + 3x2 y). (b) Φ(x, y, z) = (x2 + y + z, y 2 − z + y, y − z). 8. Show that the following maps are immersions. (a) Φ(u, v) = (u + v, u − v, u2 + v 2 ) (b) Φ(u, v, w) = (u+v+w, u2 +v 2 +w2 , u3 +v 3 +w3 , u4 +v 4 +w4 ), v > u. 9. Show that the following maps are submersions. (a) Φ(x, y, z) = (x + y − z, x − y) (b) Φ(x, y, z) = (x2 + y 2 − z 2 , x2 − y 2 ) y > 0. 10. Let F : IR 3 → IR 3 be given by F (x, y, z) = (x2 + y 2 , x2 + 2z, x − y + z)

w>

p=

4.5. EXERCISES

73

(a) Find the kernel of F∗,p : T(0,0,0) IR 3 → T(0,0,0) IR 3 . (b) Does there exists a vector Xp ∈ T(1,0,1) IR 3 such that F∗,p Xp = Yq where Yq = (2∂x − 3∂y + ∂z )(1,3,2) (c) Compute F∗,p (∂x − ∂z )|(1,1,1) in two different ways. (d) Determine the set of points where F is an immersion, submersion, and local diffeomorphism. Are these three answers different? Why or why not.  11. Let Φ(u, v) = uv, 12 (v 2 − u2 ) be the change to parabolic coordinates. (a) Find u∂u + v∂v in rectangular coordinates using the coordinate basis ∂x , ∂y . (b) Find u∂v − v∂u in rectangular coordinates. (c) Find y∂x − x∂y in parabolic coordinates. 12. Let X = Ax ∂x + Ay ∂y + Az ∂z be a vector-field in rectangular. (a) Compute ∂x , ∂y and ∂z in spherical coordinates using the technique from example 4.4.4. (b) Suppose Y = Aρ ∂ρ + Aθ ∂θ + Aφ ∂φ is the formula for X in spherical coordinates. Write Aρ , Aθ , Aφ in terms of Ax , Ay , Az . (See appendix VII, [?])

74

CHAPTER 4. THE PUSH-FORWARD AND THE JACOBIAN

Chapter 5 Differential One-forms and Metric Tensors 5.1

Differential One-Forms

Recall that if V is an n dimensional vector space W is an m dimensional vector-space that L(V, W ) the set of linear transformations from V to W was shown in Lemma 2.3.11 to be an mn dimensional vector-space. If W = IR then L(V, IR ) is n dimensional and is called the dual space to V , denoted V ∗ . The set V ∗ is also called the space of linear functionals on V , the co-vectors on V or the one-forms on V . Example 5.1.1. Let V = IR 3 , then   x   y  = x + 2y + z (5.1) α z satisfies T ∈ V ∗ . That is α is a linear transformation from V to IR . In fact it is not difficult to show (see Exercise 1 that if α ∈ V ∗ , there exists a, b, c ∈ IR such that   x   y  = ax + by + cz. (5.2) α z Suppose that β = {vi }1≤i≤n is a basis for the n dimensional vector-space V is an n−dimensional vector-space. By Lemma 2.1.7 every linear transformation is uniquely determined by its values on a basis. In particular if 75

76CHAPTER 5. DIFFERENTIAL ONE-FORMS AND METRIC TENSORS α ∈ V ∗ , then α is determined by the real numbers α(vi ), and again by Lemma 2.1.7 every function α ˆ : β → IR extends to a unique element α ∈ V ∗ (which agrees with α ˆ on β). Define the function α ˆ 1 : β → IR by α ˆ 1 (v1 ) = 1 α ˆ 1 (vi ) = 0 2 ≤ i ≤ n, which extends to a unique linear transformation α1 : V → IR , where 1

α (

n X

ci v i ) = c1 .

i

We can then define a sequence of functions αi , 1 ≤ i ≤ n in a similar way. For each fixed i ∈ {1, . . . , n}, let αi be the unique element of V ∗ satisfying  1 if i = j, i i (5.3) α (vj ) = δj = 0 otherwise. Example 5.1.2. Let        1 1 1        (5.4) β = v1 = 0 , v2 = 1 , v3 = 1 .   −1 0 1 be a basis for IR 3 . We compute α1 , α2 , α3 and express them in the form of equation 5.2. One way to do this is to note e1 = v1 − v2 + v3 , e2 = 2v2 − v1 − v3 , e3 = v3 − v2 . Then using the properties in equation 5.3 we get, α1 (e1 ) = α1 (v1 − v2 + v3 ) = 1, α1 (e2 ) = −1, α1 (e3 ) = 0. Therefore (5.5)

  x α1 y  = α1 (xe1 + ye2 + ze3 ) = x − y. z

Similarly (5.6)

  x α2 y  = −x + 2y − z, z

   x α3 y  = x − y + z. z

5.1. DIFFERENTIAL ONE-FORMS

77

Theorem 5.1.3. The set of linear transformations α1 , . . . , αn ∈ V ∗ defined by equation 5.3 form a basis for V ∗ . If α ∈ V ∗ then (5.7)

α=

n X

α(vi )αi .

i=1

Proof. To show that these are linearly independent, suppose Z=

n X

ci α i

i=1

is the zero element of V ∗ . The zero element is the linear transformation which maps every vector in V to 0 ∈ IR . Therefore 0 = Z(v1 ) = (

n X

ci αi )(v1 ) =

i=1

n X

ci αi (v1 ) = c1 α1 (v1 ) = c1

i=1

and so c1 = 0. Likewise applying Z to the rest of the basis elements vj we get zero. That is, 0 = Z(vj ) = cj and so cj = 0, and {αi }1≤i≤n is a linearly independent set. To prove that {αi }1≤i≤n are a spanning set, let α ∈ V ∗ and consider τ ∈ V ∗, n X τ= α(vi )αi . i=1

Then τ (vj ) =

n X i=1

i

α(vi )α (vj ) =

n X

α(vi )δji = α(vj ).

i=1

Therefore the two linear transformation τ, α ∈ V ∗ agree on a basis and so by Lemma 2.1.7 are equal. This proves equation 5.7 holds and that {αi }1≤i≤n is a spanning set. Given the basis β the basis {αi }1≤i≤n for V ∗ in 5.3 is called the dual basis.

78CHAPTER 5. DIFFERENTIAL ONE-FORMS AND METRIC TENSORS Example 5.1.4. Let α : IR 3 → IR be the linear functional in equation 5.1 and let β ∗ = {α1 , α2 , α3 } be the dual basis given in equations 5.5, 5.6 to β in equation 5.4. Then by equation 5.7 we have       1 1 1 α = α  0  α1 + α 1 α2 + 1 α3 −1 0 1 = 3α2 + 4α3 . We now consider the case V = Tp IR n and the corresponding dual space Tp∗ IR n which is called the space of one-forms, co-vectors, or dual vectors at the point p ∈ IR n . In this case it turns out that elements in Tp∗ IR n appear in a natural way. Recall that if Xp ∈ Tp IR n , then Xp is a derivation of C ∞ (p). Let f ∈ C ∞ (p) and so there exists an open set U ⊂ IR n with p ∈ U and f ∈ C ∞ (U ), p ∈ U . Then by definition (5.8)

Xp (f ) ∈ IR .

If Yp ∈ Tp IR n then (5.9)

(aXp + Yp )(f ) = aXp (f ) + Yp (f ).

Now in the formula Xp (f ) ∈ IR from equation 5.8, we change the way we think of this formula - instead of this formula saying Xp : f → IR so that f is the argument of Xp , we instead think of Xp as being the argument of the function f ! That is “f 00 : Tp IR n → IR . We use a new notation to distinguish this way of thinking of f . Define the function dfp : Tp IR n → IR by (5.10)

dfp (Xp ) = Xp (f ).

Proposition 5.1.5. The function dfp satisfies dfp ∈ Tp∗ IR n . Proof. We need to prove that dfp is a linear function. That is we need to show (5.11)

dfp (aXp + Yp ) = a dfp (Xp ) + dfp (Yp ).

The left side of this on account of equation 5.10 is (5.12)

dfp (aXp + Yp ) = (aXp + Yp )(f )

5.1. DIFFERENTIAL ONE-FORMS

79

while again on account of 5.10 is (5.13)

a dfp (Xp ) + dfp (Yp ) = a Xp (f ) + Yp (f ).

Equation 5.12 and 5.13 agree by equation 5.9. Example 5.1.6. Let f ∈ C ∞ (IR ) be given by f (x, y) = xy 2 , and let p = (1, 2) ∈ IR 2 . We compute and dfp (Xp ) where Xp = (3∂x − 2∂y )p by equation 5.10, dfp (Xp ) = (3∂x − 2∂y )(xy 2 )|(1,2) = 4. We know that for each point p ∈ IR n the set β = {∂xi }1≤i≤n is a basis for Tp IR n . Let’s calculate the dual basis. We begin by considering the function f 1 (x1 , . . . , xn ) = x1 which we just call x1 . Then by equation 5.10 dx1p (∂x1 |p ) = ∂x1 (x1 ) = 1, dx1p (∂x2 |p ) = ∂x2 (x1 )0, . . . , dx1p (∂xn |p ) = 0. This leads to the general case (5.14)

dxip (∂xj |p ) = δji ,

where δji is the Kronecker delta. From equation 5.14 we conclude that for each p ∈ IR n the set of one-forms (5.15)

β ∗ = {dxip }1≤i≤n

form a basis for Tp∗ IR n which satisfies equation 5.3. Equation 5.15 is the dual basis to the coordinate basis β = {∂xi }1≤i≤n to the tangent space Tp IR n . We will call the basis {dxip } the coordinate basis for Tp∗ IR n . We now express dfp ∈ Tp∗ IR n in terms of our basis β ∗ in equation 5.15. Let p ∈ U then by Theorem 5.1.3 we have dfp =

n X

ci ∈ IR ,

i=1

where ci = dfp (∂xi |p ) = ∂xi |p (f ). Therefore n X ∂f (5.16) dfp = dxip i ∂x p i=1 holds at each point p ∈ U .

80CHAPTER 5. DIFFERENTIAL ONE-FORMS AND METRIC TENSORS Example 5.1.7. Let f : IR 3 → IR be f (x, y, z) = x2 ey−z , and p = (1, 1, 1). dfp = 2xey−z dx + x2 ey−z dy − x2 ey−z dz

 p

,

= 2dxp + dyp − dzp Let Xp = (2∂x − ∂y − 3∂z )|p , and we compute dfp (Xp ) in two ways. First by definition 5.10 we have dfp (Xp ) = Xp (x2 ey−z ) = 4 − 1 + 3 = 6. The second way we use the properties of the vector space Tp∗ IR 3 we have dfp (Xp ) = (2dxp + dyp − dzp )(2∂x |p − ∂y |p − 3∂z |p ) = 2dxp (2∂x |p − ∂y |p − 3∂z |p ) + dyp (2∂x |p − ∂y |p − 3∂z |p ) − dzp (2∂x |p − ∂y |p − 3∂z |p ) = 4 − 1 + 3 = 6. In this computation the fact that dxp (∂x |p ) = 1, dxp (∂y |p ) = 0, dxp (∂z |p ) = 0, . . . , dzp (∂z |p ) = 1 has been used. In direct analogy with a vector-field on IR n being a smoothly varying choice of a tangent vector at each point we define a differential one-form (or a one-form field ) on IR n as a smoothly varying choice of one-form at each point. Since {dxi |p }1≤i≤n form a basis for Tp∗ IR n at each point then every differential one-form α can be written n X α= αi (x)dxi |x i=1 n where αi (x) are n smooth functions on IR . As with vector-fields we will P n i drop the subscript on dx , and write α = i=1 αi (x)dxi .

Example 5.1.8. On IR 3 , α = ydx − xdy + zydz is a one-form field. At the point p = (1, 2, 3), αp = 2dxp − dyp + 6dzp

5.2. BILINEAR FORMS AND INNER PRODUCTS

81

If f ∈ C ∞ (IR n ) then for each p ∈ IR n , by equation 5.10, dfp ∈ Tp∗ IR n . This holds for each p ∈ IR n and so df is then a one-form field called the differential of f . The expansion of df in terms of the differential one-forms dxi which form the dual basis at each point is obtained from equation 5.16 to be (5.17)

n X ∂f i df = dx . i ∂x i=1

Example 5.1.9. Find the differential of f = x2 yexz ∈ C ∞ (IR 3 ). By equation 5.17 df = (2xy + x2 yz)exz dx + x2 exz dy + x3 yexz dz.

5.2

Bilinear forms and Inner Products

In the previous section we began by with the linear algebra of linear functions T : V → IR . A generalization of this is to consider a function B : V ×V → IR which satisfies the properties (5.18)

B(a1 v1 + a2 v2 , w) = a1 B(v1 , w) + a2 B(v2 , w), B(v, a1 w1 + a2 w2 ) = a1 B(v, w1 ) + a2 B(v, w2 ).

These equations imply that B is linear as a function of each of its two arguments. A function B which satisfies these conditions is called a bilinear form. Example 5.2.1. Let A ∈ Mn×n (IR ). Define B : IR n × IR n → IR by B(x, y) =

n X

xi Aij y j = xT Ay,

i,j=1

where xT is the transpose of the column vector x. The function B is easily checked to be bilinear form. Let B, C : V × V → IR be bilinear forms. Then aB + C is a bilinear form where (5.19)

(aB + C)(v, w) = aB(v, w) + C(v, w).

82CHAPTER 5. DIFFERENTIAL ONE-FORMS AND METRIC TENSORS Denote by B(V ) the space of bilinear forms, which is a vector-space (see exercise 5 in this chapter). For the rest of this section, let V be a finite dimensional vector-space with basis β = {vi }1≤i≤n . Let B ∈ B(V ) be a bilinear form. The n × n matrix (Bij ) whose entries are (5.20)

Bij = B(vi , vj ) ,

1 ≤ i, j ≤ n,

is the matrix representation of the bilinear form B in the basis β. Example 5.2.2. Let B : IR 3 × IR 3 → IR be the bilinear function  1   1  y x B x2  , y 2  = 2x1 y 1 − x1 y 2 + x1 y 3 − x2 y 1 − x2 y 2 + x3 y 1 + x3 y 3 . y3 x3 Using the basis (5.21)

     1 1 1  β = v1 =  0  , v2 = 1 , v3 = 1 .   −1 0 1  



we compute the entries of the matrix representation Bij = B(vi , vj ). We find

(5.22)

B(v1 , v1 ) = 1, B(v1 , v2 ) = 0, B(v3 , v1 ) = 0,

B(v1 , v2 ) = 0, B(v1 , v3 ) = 0, B(v2 , v2 ) = −1, B(v2 , v3 ) = 0, B(v3 , v2 ) = 0, B(v3 , v3 ) = 2

and so the matrix representation of B in the basis β = {v1 , v2 , v3 } is   1 0 0 (5.23) [Bij ] =  0 −1 0  . 0 0 2 Example 5.2.3. As in example 5.2.1 let B : IR n × IR n → IR be given by B(x, y) = xT Ay where A ∈ Mn×n (IR ). In the standard basis β = {ei }1≤i≤n , B(ei , ej ) = Aij the entry in the ith row and j th column of A.

5.2. BILINEAR FORMS AND INNER PRODUCTS

83

Theorem 5.2.4. Let v, w ∈ V with (5.24)

v=

n X

i

a vi ,

w=

i=1

n X

bi vi .

i=1

Then (5.25)

B(v, w) =

n X

Bij ai bj = [v]T (Bij )[w],

i,j=1

where [v] = [ai ], [w] = [bi ] are the column vectors of the components of v and w in (5.24). Proof. We simply expand out using (5.24), and the bi-linearity condition, ! n n X X B(v, w) = B ai vi , bj vj i=1

=

n X i=1

ai B

j=1

vi ,

n X

! bj vj

=

j=1

n X n X

ai bj B (vi , wj ) .

i=1 j=1

This is the formula in equation 5.25 in the theorem. An easy consequence of this theorem is the following. Corollary 5.2.5. Let B, C ∈ B(V ). The bilinear forms satisfy B = C if and only if for any basis β, Bij = Cij . A bilinear form B ∈ B(V ) is symmetric if B(w, v) = B(v, w) for all v, w ∈ V . The bilinear for B is skew-symmetric or alternating if B(w, v) = −B(v, w) for all v, w ∈ V . Example 5.2.6. Let V = IR n and let Qij be any symmetric matrix. Define B(x, y) =

n X

Qij xi y j = xT Qy.

i,j=1

Then the bilinear form B is symmetric. We have the following simple proposition.

84CHAPTER 5. DIFFERENTIAL ONE-FORMS AND METRIC TENSORS Proposition 5.2.7. A bilinear form B : V × V → IR is symmetric if and only if its matrix representation (in any basis) is a symmetric matrix. The bilinear form B is skew-symmetric if and only if its matrix representation in a basis is a skew-symmetric matrix. It is easy to show that if the matrix representation of B is symmetric or skew-symmetric in a basis, then it will have this property in every basis. The matrix representation of the bilinear form B in example 5.2.2 is 5.23 and is symmetric. Therefore Proposition implies B is a symmetric bilinear form. Definition 5.2.8. An inner product is a bilinear form γ : V × V → IR that satisfies 1. γ(v, w) = γ(w, v), and 2. γ(v, v) ≥ 0, 3. γ(v, v) = 0 if and only if v = 0V . If γ is an inner product on V , then the length of a vector v ∈ V is p ||v|| = γ(v, v). Theorem 5.2.9. (Cauchy-Schwarz) Let γ : V ×V → IR be an inner product. Then (5.26)

|γ(v, w)| ≤ ||v||||w||.

A proof of the Cauchy-Schwarz inequality 5.26 can be found in most books on linear algebra, see for example [?]. Given an inner product γ, the Cauchy-Schwartz inequality 5.26 allows us to define the angle θ between v, w (in the plane W = span{v, w}) by the usual formula (5.27)

γ(v, w) = cos θ ||v|| ||w||.

Example 5.2.10. Let V = IR n . The standard inner product is γ(x, y) =

n X i=1

xi y i = x T y

5.2. BILINEAR FORMS AND INNER PRODUCTS

85

P P where x = ni=1 xi ei , y = ni=1 y i ei . which is often called the dot product. It is easy to check that γ is an inner product. In the standard basis β = {ei }1≤i≤n , gij = γ(ei , ej ) = δij , 1 ≤ i, j ≤ n, P and is the identity matrix. If v = ni=1 ai ei then ||v|| =

n X

(ai )2 .

i=1

Let β = {vi }1≤i≤n be a basis for V and let γ is an inner product. The n × n matrix [gij ] with entries (5.28)

gij = γ(vi , vj ) ,

1 ≤ i, j ≤ n,

is the matrix representation 5.20 of the bilinear form γ in the basis β. Properties 1 in 5.2.8 implies the matrix [gij ] is symmetric. Property 2 and 3 for γ in definition 5.2.8 can be related to properties of its matrix representation [gij ]. First recall that A real symmetric matrix is always diagonalizable over the IR (Theorem 6.20 of [?]). This fact gives a test for when a symmetric bilinear form γ is positive definite in terms of the eigenvalues of a matrix representation of γ. Theorem 5.2.11. Let γ be a bilinear form on V and let gij = γ(vi , vj ) be the coefficients of the matrix representation of γ in the basis β = {vi }1≤i≤n . The bilinear form γ is an inner product if and only if its matrix representation [gij ] is a symmetric matrix with strictly positive eigenvalues. The property that [gij ] is symmetric with positive eigenvalues does not depend on the choice of basis. Example 5.2.12. Consider again the bilinear form β : IR 3 × IR 3 → IR from example 5.2.2,  1   1  x y 2   2    x , y B = 2x1 y 1 − x1 y 2 + x1 y 3 − x2 y 1 − x2 y 2 + x3 y 1 + x3 y 3 . x3 y3 In the basis in equation 5.21 the matrix representation is   1 0 0 [Bij ] =  0 −1 0  . 0 0 2

86CHAPTER 5. DIFFERENTIAL ONE-FORMS AND METRIC TENSORS Therefore by proposition 5.2 B is a symmetric bilinear form. However by Theorem 5.2.11 B is not positive definite, and hence B is not an inner product. Example 5.2.13. Let V = IR n and let Qij be any symmetric matrix with positive eigenvalues. Define γ(x, y) =

n X

Qij xi y j = xT Qy.

i,j=1

Then by Theorem 5.2.11 the bilinear form γ is an inner product.

5.3

Tensor product

There is a very important way to construct bilinear forms using linear ones called the tensor product. Let V be a vector-space and let α1 , α2 ∈ V ∗ . Now define α1 ⊗ α2 : V × V → IR by α1 ⊗ α2 (v, w) = α1 (v)α2 (w),

f or all v, w ∈ V.

Theorem 5.3.1. The function α1 ⊗ α2 : V × V → IR is bilinear. Proof. This is simple to check. Using the linearity of α1 we find, α1 ⊗ α2 (av1 + v2 , w) = α1 (av1 + v2 )α2 (w) = aα1 (v1 )α2 (w) + α1 (v2 )α2 (w) = aα1 ⊗ α2 (v1 , w) + α1 ⊗ α2 (v2 , w) and so α1 ⊗ α2 is linear in the first argument. The linearity of α1 ⊗ α2 in the second argument is shown using the linearity of α2 . Given α1 , α2 ∈ V ∗ , the bilinear form α1 ⊗ α2 and is called the tensor product of α1 and α2 . This construction is the beginning of the subject multi-linear algebra, and the theory of tensors. Let β = {vi }1≤i≤n be a basis for V and let β ∗ = {αj }1≤j≤n be the dual basis for V ∗ , and let (5.29)

∆ = {αi ⊗ αj }1≤i,j≤n .

The next theorem is similar to Theorem 5.1.3.

5.3. TENSOR PRODUCT

87

Theorem 5.3.2. The n2 elements of ∆ in equation 5.29 form a basis for B(V ). Moreover X (5.30) B= Bij αi ⊗ αj 1≤i,j≤n

where Bij = B(vi , vj ). Proof. Let B ∈ B(V ), and let Bij = B(vi , vj ),

vi , vj ∈ β

be the n × n matrix representation of B in the basis β. Now construct C ∈ B(V ), X C= Bij αi ⊗ αj . 1≤i,j≤n

We compute the matrix representation of C in the basis β by computing X C(vk , vl ) = Bij αi (vk )αj (vl ) 1≤i,j≤n

=

X

Bij δki δlj

1≤i,j≤n

= Bkl . Here we have used αi (vk ) = δki , αj (vl ) = δlj . Therefore by corollary 5.2.5 B = C, and so ∆ is a spanning set. The proof that ∆ is a linearly independent set is left for the exercises. Note that formula 5.30 for bilinear forms is the analogue to formula 5.7 for linear function. Example 5.3.3. Let B : IR 3 × IR 3 → IR be the bilinear form from example 5.2.2. Let {αi }1≤i≤3 be the dual basis of the basis in equation 5.21. Then by Theorem 5.3.2 and equation 5.22 we have (5.31)

B = α1 ⊗ α1 − α2 ⊗ α2 + 2α3 ⊗ α3 .

Let β = {vi }1≤i≤n be a basis for V and ∆ = {αi }1≤i≤n be a basis for the dual space V ∗ and let B be symmetric bilinear form on V . Theorem 5.3.2 allows us to write the inner product B as in equation 5.30 we have X (5.32) B= Bij αi ⊗ αj , 1≤i,j≤n

88CHAPTER 5. DIFFERENTIAL ONE-FORMS AND METRIC TENSORS where Bij = B(vi , vj ). In equation 5.32 we note that by using the symmetry Bij = Bji we can write Bij αi ⊗ αj + Bji αj ⊗ αi = Bij (αi ⊗ αj + αj ⊗ αi ). Therefore we define 1 αi αj = (αi ⊗ αj + αj ⊗ αi ), 2 so that equation 5.32 can be written X (5.33) B= Bij αi αj . 1≤i,j≤n

This notation will be used frequently in the next section. Equation 5.31 of example we can be rewritten as B = α1 ⊗ α1 − α2 ⊗ α2 + 2α3 ⊗ α3 = (α1 )2 − (α2 )2 + 2(α3 )3 . Example 5.3.4. Let V be a vector-space and β = {vi }1≤i≤n a basis and β ∗ = {αj }1≤j≤n the dual basis. Let γ=

n X

i

i

ci α ⊗ α =

n X

i=1

ci (αi )2

ci ∈ IR , ci > 0.

i=1

We claim γ is an inner-product on V . First γ is a sum of bilinear forms and therefore bilinear. Let’s compute the matrix representation of γ in the basis β. First note n X γ(v1 , v1 ) = ( ci αi ⊗ αi )(v1 , v1 ), =

i=1 n X

ci αi (v1 ) ⊗ αi (v1 )

i=1

= c1 i

where we’ve used α (v1 ) = 0, i 6= 1. Similarly then γ(v1 , v2 ) = 0, and in general γ(vi , vi ) = ci , and γ(vi , vj ) = 0, i 6= j. Therefore the matrix representation is the diagonal matrix, [gij ] = diag(c1 , c2 , . . . , cn ). By Theorem 5.2.11 and that ci > 0, γ is positive definite, and so an inner product.

5.4. METRIC TENSORS

5.4

89

Metric Tensors

Let p ∈ IR n and suppose we have an inner product γp : Tp IR n × Tp IR n → IR on the tangent space at p. The matrix representation 5.28 in the coordinate basis {∂xi }1≤i≤n , is the n × n matrix [gij ]1≤i,j≤n with entries [γp ]ij = gij = γp (∂xi |p , ∂xj |p ) ,

gij ∈ IR .

If we use the dual basis {dxi |p }1≤i≤n , then equation 5.30 in Theorem 5.3.2 says X γ(p) = gij dxi |p ⊗ dxj |p 1≤i,j≤n

(5.34) =

X

gij dxi |p dxj |p

by equation 5.33.

1≤i,j≤n

Definition 5.4.1. A metric tensor γ on IR n is a choice of inner product γp : Tp IR n × Tp IR n → IR for each point p ∈ IR n , which varies smoothly with p. A metric tensor γ is also called a Riemannian metric. It is a field of bilinear forms on the tangent space satisfying the conditions of an inner product at each point. We now say precisely what it means for γ to vary smoothly. The derivations ∂xi form a basis at every point, therefore given a metric tensor γ, we define the n2 functions gij : IR n → IR by (5.35)

gij (x) = γ(∂xi |x , ∂xj |x ) f or all x ∈ IR n .

A smooth metric tensor is one where the functions gij (x) are C ∞ functions on IR n . Using {dxi |p }1≤i≤n as the dual basis for ∂xi |p , applying the formula 5.34 pointwise for γ using equation 5.35 yields X (5.36) γ= gij (x) dxi dxj . 1≤i,j≤n

Note that at each fixed point p ∈ IR n , the matrix [gij (p)] will be symmetric and positive definite (by Theorem 5.2.11) because γp is an inner product. Conversely we have the following.

90CHAPTER 5. DIFFERENTIAL ONE-FORMS AND METRIC TENSORS Theorem 5.4.2. Let [gij (x)] be a matrix of smooth functions on IR n with the property that at each point p ∈ IR n that [gij (p)] is a positive definite symmetric matrix, then (5.37)

γ=

n X

gij (x)dxi dxj

i,j=1

is a metric tensor on IR n . This theorem then states that every metric tensor is of the form 5.37. The functions gij (x) are be called the components of γ in the coordinate basis, or the components of γ. Pn g (x)dxi dxj , a point p ∈ IR n and Given a metric tensor γ = Pn i Pn i,j=1i ij vectors Xp = i=1 ξ ∂xi |p , Yp = i=1 η ∂xi |p , Theorem 5.2.4 gives, (5.38)

T

γp (Xp , Yp ) = [Xp ] [gij (p)][Yp ] =

n X

gij (p)ξ i η j

i,j=1

where [Xp ] = [ξ i ], [Yp ] = [η i ] are the column vectors of the coefficients of Xp and Yp in the coordinate basis. P P Given two vector-fields X = ni=1 ξ i (x)∂xi , Y = ni=1 η i (x)∂xi on IR n we can also evaluate (5.39)

T

γ(X, Y ) = [X] [gij (x)][Y ] =

n X

gij (x)ξ i (x)η j (x) ∈ C ∞ (IR n ).

i,j=1

A metric tensor γ on an open set U ⊂ IR n is defined exactly as for IR n . Example 5.4.3. Let U ⊂ IR 3 be the open set U = {(x, y, z) ∈ IR 3 | xz 6= 0 }. Then  2  1 2 1 2 y y 1 + dz 2 , (5.40) γ = 2 dx + 2 dy − 2 2 dydz + x x zx x2 z 2 z 2 is a metric tensor on U . The components of γ are 1  0 0 x2 1 − zxy 2  [gij (x)] =  0 x2 2 0 − zxy 2 xy2 z2 + z12

5.4. METRIC TENSORS

91

Let p = (1, 2, −1) and Xp = (∂x + ∂y − ∂z )|p , Yp = (∂x + ∂z )|p . Using equation 5.38 we have    1 1 0 0    0  = −2. γp (Xp , Yp ) = [1, 1, −1] 0 1 2 0 2 5 1 If X = z∂x + y∂y + x∂z and Y = y∂x − x∂y then equation 5.39 gives γ(X, Y ) =

yz y y − + . x2 x z

Example 5.4.4. Let γ be a metric tensor on IR 2 . By equation 5.36 there exists functions E, F, G ∈ C ∞ (IR 2 ) such that (5.41)

γ = E dx ⊗ dx + F (dx ⊗ dy + dy ⊗ dx) + G dy ⊗ dy = E(dx)2 + 2F dxdy + G(dy)2 = Edx2 + 2F dxdy + Gdy 2 .

The matrix [gij (x) of components of γ are then   E F (5.42) [gij (x)] = F G which is positive definite at each point in IR 2 , since γ was assumed to be a metric tensor field. Example 5.4.5. On IR n let (5.43)

γE =

X

(dxi )2 .

The components of γ E at a point p in the coordinate basis {∂xi |p }1≤i≤n are γ E (∂xi |p , ∂xj |p ) = δij . The metric γ E is called the Euclidean metric tensor. If Xp , Yp ∈ Tp IR n then equation 5.38 gives γpE (Xp , Yp ) = [Xp ]T [Yp ]. Example 5.4.6. Let U = { (x, y) ∈ IR 2 | y > 0 }, and let 1  0 y2 (5.44) [gij ] = . 0 y12

92CHAPTER 5. DIFFERENTIAL ONE-FORMS AND METRIC TENSORS This 2 × 2 matrix defines a metric tensor on U given by formula 5.37 by γ=

1 dx2 + dy 2 (dx ⊗ dx + dy ⊗ dy) = . y2 y2

Let p = (1, 2) ∈ U and let (5.45)

Xp = (2∂x − 3∂y )|p , Yp = (−∂x + ∂y )|p .

The real number γp (Xp , Yp ) can be computed using formula 5.38 or by expanding, γ(Xp , Yp ) =

1 1 5 1 (dx(Xp )dx(Xp ) + dy(Xp )dy(Xp )) = (2)(−1)+ (−3)(1) = − 4 4 4 4

If the point were p = (2, 5) with Xp , Yp from equation 5.45 we would have γ(Xp , Yp ) =

1 1 1 (2)(−1) + (−3)(1) = − . 25 25 5

Notice how the computation depends on the point p ∈ U . Example 5.4.7. Let U ⊂ IR 2 an open set, and let   E F [gij ] = F G where E, F, G ∈ C ∞ (U ) and this 2 × 2 matrix is positive definite at each point in U . This 2 × 2 matrix defines the metric tensor using equation 5.37 producing the metric in equation 5.41 on the open set U ⊂ IR 2 . Remark 5.4.8. In general relativity one is interested in symmetric bilinear forms γ : V × V → IR which are non-degenerate (see exercise 9) but are not positive definite. A famous example of a metric tensor in general relativity is the Schwartzschild metric. This is the metric tensor whose coefficients in coordinates (t, r, θ, φ) are given by −1 +  0 [gij (t, r, θ, φ)] =   0 

0

2M r

0 1 1− 2M r

0 0

0 0

 0 0  , 0 

r2 sin2 φ 0 r2

5.4. METRIC TENSORS

93

which written in terms of differentials is   1 2M 2 dt2 + ds = −1 + dr2 + r2 sin2 φdθ2 + r2 dφ2 . r 1 − 2M r This metric tensor is not a Riemannian metric tensor. It does not satisfy the positive definite criteria as a symmetric bilinear form on the tangent space at each point. It is however non-degenerate. The Einstein equations in general relativity are second order differential equations for the coefficients of a metric [gij ], 1 ≤ i, j, ≤ 4. The differential equations couple the matter and energy in space and time together with the second derivatives of the coefficients of the metric. The idea is that the distribution of energy determines the metric tensor [gij ]. This then determines how things are measured in the universe. The Scwartzschild metric represents the geometry outside of a fixed spherically symmetric body. Remark 5.4.9. You need to be careful about the term metric as it is used here (as in metric tensor). It is not the same notion as the term metric in topology! Often in differential geometry the term metric is used, instead of the full name metric tensor, which further confuses the issue. There is a relationship between these two concepts - see remark 5.4.13 below.

5.4.1

Arc-length Pn

Let γ = i,j=1 gij (x)dxi dxj be a metric tensor on IR n , and σ : [a, b] → IR n a continuous curve on the interval [a, b], and smooth on (a, b). Denote by σ(t) = (x1 (t), x2 (t), . . . , xn (t)) the components of the curve. At a fixed value of t, σ˙ is the tangent vector (see equation 3.13), and it’s length with respect to γ is compute using 5.38 to be v uX p u n dxi dxj γ(σ, ˙ σ) ˙ =t gij (σ(t)) . dt dt i,j=1 Integrating this function with respect to t gives v Z bp Z b uX u n dxi dxj t γ(σ, ˙ σ)dt ˙ = gij (σ(t)) dt (5.46) L(σ) = dt dt a a i,j=1 which is the arc-length of σ with respect to the metric γ. Note that the components gij (x) of the metric tensor are evaluated along the curve σ.

94CHAPTER 5. DIFFERENTIAL ONE-FORMS AND METRIC TENSORS Example 5.4.10. For the Euclidean metric tensor γE v 2 Z bu n  uX dxi t L(σ) = dt , dt a i as expected. Example 5.4.11. Compute the arc-length of the line σ(t) = (t + 1, 2t + 1) 0 ≤ t ≤ 1 with respect to the metric in equation (5.44). We get, √ Z 1s 1 5 (1 + 4)) dt = log 3. (2t + 1)2 2 0 This is not the same as the arc-length using the Euclidean metric tensor which is found to be Z 1 √ √ 1 + 4dt = 5. 0

Remark 5.4.12. Another name for a metric tensor is a “line-element field” (written ds2 ), because the metric can be used to measure the length of a line. Remark 5.4.13. We now explain a relationship between metric tensors and the term metric in topology. Recall a metric on a set U is a function d : U × U → IR satisfying for all x, y, z ∈ U , 1. d(x, y) = d(y, x), 2. d(x, y) ≥ 0, 3. d(x, y) = 0 if and only if y = x, 4. d(x, y) ≤ d(x, y) + d(y, z) (the triangle inequality). Let γ be a metric tensor on IR n and let p, q ∈ IR n . Define d(p, q) by Z bp d(p, q) = inf σ γ(σ, ˙ σ)dt ˙ a

where σ : [a, b] → IR n is a curve satisfying σ(a) = p, σ(b) = q.

5.4. METRIC TENSORS

5.4.2

95

Orthonormal Frames

Recall that a basis β = {ui }1≤i≤n for an n-dimensional vector space V with inner product γ, is an orthonormal basis if γ(ui , uj ) = δij ,

1 ≤ i, j ≤ n.

The theory of orthonormal frames begins with an extension of the GramSchmidt process for constructing orthonormal basis on inner product spaces which we now recall. Starting with an basis β = {vi }1≤i≤n for the n-dimensional vector space V with inner product γ, the Gram-Schmidt process constructs by induction the set of vector β˜ = {wi }1≤i≤n by w 1 = v1 , w j = vj −

j−1 X γ(vj , wk ) wk , γ(w k , vw ) k=1

2 ≤ j ≤ n.

Theorem 5.4.14. The set of vector β˜ form a basis for V and are orthogonal to each other γ(wi , wj ) = 0 i 6= j. The proof of this is standard see [?]. From the set β 0 the final step in the Gram-Schmidt process is to let 1 ui = p wi γ(wi , wi )

1≤i≤n

so that β = {ui }1≤i≤n is an orthonormal basis for V . We now apply the Gram-Schmidt algorithm in the setting of a metric tensor. Let β 0 = {Xi }1≤i≤n be n vector-fields on IR n which are linearly independent at each point. A basis of Tp IR n is also called a frame, and the collection of vector-fields β 0 a frame field or moving frame. An orthonormal basis of T p IR n is called an orthonormal frame and if the vector-fields {Xi }1≤i≤n satisfy γ(Xi , Xj ) = δij and so are are orthonormal at each point in IR n , then the collection β 0 is called an orthonormal frame field.

96CHAPTER 5. DIFFERENTIAL ONE-FORMS AND METRIC TENSORS We now show how to construct orthonormal frame fields using the GramSchmidt process. Let γ be a Riemannian metric tensor on IR n . Construct the following vector fields from the frame field β 0 = {Xi }1≤i≤n , Y1 = X1 , (5.47)

Yj = Xj −

j−1 X γ(Xj , Yk ) k=1

γ(Yk , Yk )

Yk ,

2 ≤ j ≤ n.

As in Theorem 5.4.14 Theorem 5.4.15. The vector-fields {Yi }1≤i≤n are smooth and linearly independent at each point in IR n , and are mutually orthogonal at each point γ(Yi , Yj ) = 0,

i 6= j.

Again as in the final step of the Gram-Schmidt process let 1 Yi 1 ≤ i ≤ n, Zi = p γ(Yi , Yi ) and the set β = {Zi }1≤i≤n is a set of vector-fields that form an orthonormal basis with respect to γp for Tp IR n for each point p ∈ IR n , and so form an orthonormal frame field. Example 5.4.16. Let γ be the metric tensor on U = {(x, y, z) ∈ IR 3 | xz 6= 0} given by  2  1 2 1 2 y y 1 γ = 2 dx + 2 dy − 2 2 dydz + + dz 2 . x x zx x2 z 2 z 2 We find an orthornomal frame field starting with the coordinate frame ∂x , ∂y , ∂z and then using equation 5.47. The first vector-field is Y1 = ∂x , the second is Y2 = ∂y −

γ(∂y , Y1 ) Y1 = ∂ y , γ(Y1 , Y1 )

and the third is γ(∂z , Y1 ) γ(∂z , Y2 ) Y1 − Y2 γ(Y1 , Y1 ) γ(Y2 , Y2 ) y = ∂z − 0∂x + ∂y . z Finally the resulting orthonormal frame field is Y3 = ∂z −

(5.48)

Z1 = x∂x , Z2 = x∂y , Z3 = z∂z + y∂y .

5.5. RAISING AND LOWERING INDICES AND THE GRADIENT

5.5

97

Raising and Lowering Indices and the Gradient

Given a function f ∈ C ∞ (IR n ) its differential df (see equation 5.10) defines at each point p ∈ IR n an element of Tp∗ IR n . This is not the gradient of f . We will show that when given a metric tensor-field γ on IR n , it can be used to convert the differential one-form df into a vector-field X, which is called the gradient of f (with respect to γ). This highlights the fact that in order to define the gradient of a function a metric tensor is needed. Again we need some linear algebra. Suppose that B is a non-degenerate bilinear form on V (see exercise 9). In particular B could be an inner product. Now as a function B requires two vectors as input. Suppose we fix one vector in the input like B(−, v) and view this as a function of one vector. That is, let v ∈ V be a fixed vector and define the function αv : V → IR by (5.49)

αv (w) = B(w, v) f or all w ∈ V.

The notation αv is used to emphasize that the form αv depends on the initial choice of v ∈ V . Let’s check αv ∈ V ∗ , αv (aw1 + w2 ) = B(aw1 + w2 , v) = a B(w1 , v) + B(w2 , v) = aαv (w1 ) + αv (w2 ) a ∈ IR , w1 , w2 ∈ V. Therefore α ∈ V ∗ . From this we see that given v ∈ V we can construct from v an element αv ∈ V ∗ using the bilinear form B. That is we can use B to convert a vector to an element of the dual space αv called the dual of v with respect to B. From this point of view the bilinear form B allows us to define a function TB : V → V ∗ by (5.50)

TB (v)(w) = B(w, v),

f or all w ∈ V.

How does TB depend on V ? Let’s compute TB (av1 + v2 )(w) = B(w, av1 + v2 ) by 5.50 = a B(w, v1 ) + B(w, v2 ) = (aTB (v1 ) + TB (v2 )) (w). Therefore TB (av1 + v2 ) = aTB (v2 ) + TB (v2 ) and TB is a linear transformation. We now work out TB in a basis!

98CHAPTER 5. DIFFERENTIAL ONE-FORMS AND METRIC TENSORS Proposition 5.5.1. Let β = {vi }1≤i≤n for V and let β ∗ = {αi }1≤i≤n be the dual basis. The matrix representation of TB : V → V ∗ is ∗

[TB ]ββ = [gjk ] where gij = B(vi , vj ). Proof. We begin by writing (5.51)

TB (vi ) =

n X

gik αk

k=1

and determine gik . By equation 5.50, TB (vi )(vj ) = B(vi , vj ) and therefore evaluating equation 5.51 on vj gives ! n X B(vi , vj ) = gik αk (vj ) = gij . k=1

This proves the theorem. Now let v ∈ V and αv = TB (v) which we write in the basis β and dual basis β ∗ as v= (5.52)

n X

ai vi , and

i=1

αv = TB (v) =

n X

bi α i ,

ai , bi ∈ IR ,

i=1

where we assume ai are known since v is given, and we want to find bj in terms of ai . It follows immediately from Lemma 2.1.6 that the coefficients bj of the image form αv are (5.53)

bj =

n X

gjk ak .

k=1

We can then write (5.54)

TB (v) =

n n X X j=1

k=1

! gjk ak

αj .

5.5. RAISING AND LOWERING INDICES AND THE GRADIENT

99

The form αv = TB (v) in equation 5.54 is called the dual of v with respect to B, and this process is also sometimes called “lowering the index” of v with B. Another way to compute TB (v) is given by the following corollary (see Theorem 5.3.2). Corollary 5.5.2. Let B=

n X

gij αi ⊗ αj

i,j=1

where gij = B(vi , vj ). Then (5.55)

TB (v) =

n X

gij αj (v)αi .

i,j=1

Example 5.5.3. Let B be the symmetric non-degenerate bilinear form from example 5.2.2  1   1  x y (5.56) B x2  , y 2  = 2x1 y 1 − x1 y 2 + x1 y 3 − x2 y 1 − x2 y 2 + x3 y 1 + x3 y 3 . x3 y3 Let v = e1 − e2 + 2e3 , then we compute αv = TB (v) by noting αv (e1 ) = B(v, e1 ) = 5, αv (e2 ) = 0 αv (e3 ) = B(v, e3 ) = 3. Therefore αv (xe1 + ye2 + ze3 ) = 5x + 3z. Theorem 5.5.4. The linear transformation TB in equation 5.50 is an isomorphism. Proof. Suppose TB (v) = 0, where 0 ∈ V ∗ , then 0 = TB (v)(w) = B(v, w) f or all w ∈ V. Since B is non-degenerate (see exercise 8), this implies v = 0. Which proves the lemma. Let TB−1 : V ∗ → V be the inverse of TB , and let α ∈ V ∗ then by equation 5.50 TB ◦ TB−1 (α)(v) = B(v, TB−1 (α)), f or all v ∈ V.

100CHAPTER 5. DIFFERENTIAL ONE-FORMS AND METRIC TENSORS However TB ◦ TB−1 = I the identity, and therefore (5.57)

α(v) = B(v, TB−1 (α)),

f or all v ∈ V.

Let βP= {vi }1≤i≤n be a basis for V , β ∗ = {αi }1≤i≤n the dual basis, and let α = ni=1 bi αi . We now find v = TB−1 (α) in the basis β. By Proposition ∗ ∗ 5.5.1 [TB ]ββ = [gij ]. Using Proposition 2.3.7, [TB−1 ]ββ ∗ = ([TB ]ββ )−1 , and so let ∗ [g ij ] denote the inverse matrix of [gij ] = [TB ]ββ . Utilizing Lemma 2.1.6 the coefficients ai of the image vector TB (α) in terms of the coefficients bj of α are given by (5.58)

ai =

n X

g ij bj ,

j=1

and (5.59)

v = TB−1 (α) =

n n X X i=1

! g ij bj

vi .

j=1

The vector v in formula 5.58 is called the dual of α with respect to B. The process is also sometimes called “raising the index” with B. Example 5.5.5. We continue with the example 5.5.3 and we compute the dual of α(xe1 + ye2 + ze3 ) = 3x − y + 2z with respect to B (raise the index). From equation 5.56, we find the matrix [g ij ] = [gij ]−1 is   1 −1 −1 1 [g ij ] = −1 −3 1  . 4 1 −1 3 Therefore

and

1 [TB−1 (α)] = [1, 1, 5] 2 1 1 5 TB−1 (α) = e1 + e2 + e3 . 2 2 2

In summary using TB : V → V ∗ we can map vectors to dual vectors, and using the inverse Tγ−1 : V ∗ → V we can map dual vectors to vectors. This will be the essential part of the gradient.

5.5. RAISING AND LOWERING INDICES AND THE GRADIENT 101 Let γ be a Riemannian metric on IR n , and suppose X is a vector-field. We can convert the vector-field X to a differential one-form αX by using the formula 5.50 at each point (5.60)

αX (Yp ) = γp (Yp , X(p)),

f or all Tp ∈ Tp IR n .

We then define Tγ from vector-fields on IR n to differential one-forms on IR n by Tγ (X)(Yp ) = γ(Yp , X(p)) f or allYp ∈ Tp IR n . P If X = ni=1 ξ i (x)∂xi , and the metric components are gij (x) = γ(∂xi , ∂xj ) then formula 5.54 or equation 5.55 applied point-wise for lowering the index gives (5.61) ! ! n n n n X X X X αX = Tγ (X) = gij (x)dxj (X) dxi = gij (x)ξ j (x) dxi . i=1

j=1

i=1

j=1

The differential one-form αX is the called dual of the vector-field X with respect to the metric γ. P The function Tγ is invertible, and given a differential form α = ni=1 αi (x)dxi , then its dual with respect to the metric γ is given by the raising index formula 5.59 ! n n X X −1 ij (5.62) X = Tγ (α) = g (x)αj (x) ∂xi i=1

j=1

where again [g ij (x)] is the inverse of the matrix [gij (x)]. Example 5.5.6. Let 1 (dx2 + dy 2 ) x 1+e 2 be a Riemannian metric on IR , and let X = x∂x + y∂y . The dual of X with respect to γ is computed using 5.61 γ=

α=

1 1 1 1 dx(X)dx + dy(Y )dy = xdx + ydy. x x x 1+e 1+e 1+e 1 + ex

If α = ydx − xdy then by equation 5.62 the dual of α with respect to γ is X = y(1 + ex )∂x − x(1 + ex )∂y .

102CHAPTER 5. DIFFERENTIAL ONE-FORMS AND METRIC TENSORS More generally if γ = Edx2 + 2F dxdy + Gdy 2 , is a metric tensor on IR 2 and and X = a∂x + b∂y a vector-field on U then the dual of X is computed using 5.61 to be Tγ (X) = (aE + bF )dx + (aF + bG)dy. If α = adx + bdy is a differential one-form then by equation 5.62 its dual is Tγ−1 (α) =

1 ((aG − bF )∂x + (bE − aF )∂y ) EG − F 2

where we have used, ij



[g ] =

E F F G

−1

1 = EG − F 2



G −F −F E

 .

Finally we can define the gradient of a function. Let f ∈ C ∞ (U ), and γ a Riemannian metric tensor on U (or at least a non-degenerate bilinear form-field). Let (5.63)

grad(f ) = { the dual of df with respect to γ } = Tγ−1 (df ).

By equation 5.62 the formula for grad(f ) is (5.64)

grad(f ) =

n n X X i=1

∂f g ij (x) j ∂x j=1

! ∂xi

Example 5.5.7. Continuing from Example 5.5.6 with the metric tensor γ = (1 + ex )−1 (dx2 + dy 2 ) on IR 2 , if f ∈ C ∞ (IR 2 ) then df = fx dx + fy dy and its dual with respect to γ is grad(f ) = (1 + ex )fx ∂x + (1 + ex )fy ∂y . Example 5.5.8. Let (x, y, z) be coordinates on IR 3 , and γ = dx2 + dy 2 − 2xdydz + (1 + x2 )dz 2 .

5.5. RAISING AND LOWERING INDICES AND THE GRADIENT 103 The components of γ in the coordinate basis are   1 0 0 −x  , [gij (x)] = 0 1 0 −x 1 + x2 while

  1 0 0 [g ij (x)] = 0 1 + x2 x . 0 x 1

Given f (x, y, z) then its gradient with respect to this metric is  fx ∂x + (1 + x2 )fy + xfz ) ∂y + (xfy + fz )∂z . Example 5.5.9. If n X γE = (dxi )2 i=1

is the Euclidean metric tensor then n X ∂f ∂ i. grad(f ) = i x ∂x i=1

In this case the coefficients of the gradient of a function are just the partial derivatives of the function. This is what occurs in a standard multi-variable calculus course. Let Up ∈ Tp IR n be a unit vector with respect to a metric tensor γ. The rate of change of f ∈ C ∞ (IR n ) at the point p in the direction Up is Up (f ). As in ordinary calculus, the following is true. Theorem 5.5.10. Let f ∈ C ∞ (IR n ) with gradp (f ) 6= 0. Then gradp (f ) is the direction at p in which f increases the most rapidly, and the rate of change is ||gradp (f )||γ .

104CHAPTER 5. DIFFERENTIAL ONE-FORMS AND METRIC TENSORS Proof. We begin by using equation 5.57 with γp : Tp IR n × Tp IR n → IR and definition 5.63 to get Up (f ) = dfp (Up ) = γp (Up , Tγ−1 (dfp )) = γp (Up , gradp (f )). The Cauchy-Schwartz inequality 5.26 applied to this formula then gives, |Up (f )| = |γ(Up , gradp (f )| ≤ ||grad(f )||γ . The result follows by noting the maximum rate ||gradp (f )||γ of |Up (f )| is obtained when Up = ||gradp (f )||−1 γ gradp (f ).

This next bit is not necessary for doing the assignment but is another way to define the raising the index procedure. Since Tγ is an invertible linear transformation it can be used to define a non-degenerate bilinear form γ ∗ : V ∗ × V ∗ → IR , defined by γ ∗ (α, β) = γ(Tγ−1 (α), Tγ−1 (β)). Theorem 5.5.11. The function γ ∗ is bilinear, non-degenerate. If γ is symmetric, then so is γ ∗ . If γ is positive definite then so is γ ∗ . Furthermore if {vi }1≤i≤n is a basis for V , and {αi }1≤i≤n is the dual basis then, γ ∗ (αi , αj ) = g ij where [g ij ] is the inverse matrix of [gij ] = [γ(vi , vj )]. If γ is a metric tensor on IR n then γ ∗ is called the contravariant form of the metric γ. The raising the index procedure (and the gradient) can then be defined in terms of γ ∗ using the isomorphism Tγ ∗ : V ∗ → V Tγ ∗ (α)(τ ) = γ ∗ (α, τ ). In this formula we have identified (V ∗ )∗ = V . Remark 5.5.12. Applications of the gradient in signal processing can be found in [?], [?], [?].

5.6. A TALE OF TWO DUALS

5.6

105

A tale of two duals

Given a vector v ∈ V , where V is a finite dimensional vector-space, there is no notion of the dual of v unless there is an inner product γ on V . In this case the inner product γ can be used to define the function Tγ as in equation 5.50 giving Tγ (v) ∈ V ∗ which is the dual with respect to γ. The matter is quite different if we are given a basis β = {vi }1≤i≤n for V . We then have the dual basis β ∗ = {αi }1≤i≤n defined by (see equation 5.3), αi (vj ) = δji ,

1 ≤ i, j ≤ n.

Suppose we also have γ as an inner product on V , and let β˜∗ = {σ i }1≤i≤n be the forms dual with respect to γ, σ i = Tγ (vi ),

1 ≤ i ≤ n.

These two duals are related by the following theorem. Theorem 5.6.1. The set β˜∗ is a basis for V ∗ and αi = σ i if and only if vi is an orthonormal basis. Proof. The fact that β˜∗ is a basis will be left as an exercise. Suppose that β is an orthornormal basis, then σ i (vj ) = Tγ (vj ) = γ(vi , vj ) = δji . However by definition αi (vj ) = δji , and so for each i = 1, . . . , n, αi and σ i agree on a basis, and hence are the same elements of V ∗ . This proves the sufficiency part of the theorem. Finally assume that αi = σ i , 1 ≤ i ≤ n. Then δji = αi (vj ) = σ i (vj ) = γ(vi , vj ), and β is an orthonormal basis. Theorem 5.6.1 has an analogue for frames fields. Let {Xi }1≤i≤n be a frame field on IR n . The algebraic dual equations (5.65)

αi (Xj ) = δji

f or all p ∈ IR n ,

106CHAPTER 5. DIFFERENTIAL ONE-FORMS AND METRIC TENSORS define αi , 1 ≤ i ≤ n as a field of differential one-forms, which form a basis for Tp∗ IR n for each p ∈ IR n . Given a Riemmanian metric tensor γ define the differential one-forms as in equation 5.60, (5.66)

σ j = Tγ (Xj ) = γ(−, Xj ),

1 ≤ j ≤ n.

We then have the field version of Theorem 5.6.1. Corollary 5.6.2. The one-form fields {σ i }1≤i≤n from equation 5.66 define a basis for Tp∗ IR n for each point p ∈ IR n . The fields satisfy αi = σ i , 1 ≤ i ≤ n, if and only if Xi is an orthonormal frame field. Example 5.6.3. In equation 5.67 of example 5.4.16 we found the orthonormal frame field (5.67)

Z1 = x∂x , Z2 = x∂y , Z3 = z∂z + y∂y .

for the metric tensor 1 y 1 γ = 2 dx2 + 2 dy 2 − 2 2 dydz + x x zx



y2 1 + 2 2 2 xz z



dz 2

on U = {(x, y, z) ∈ IR 3 | xz 6= 0}. The algebraic dual defined in equation 5.65 of Z1 , Z2 , Z3 is easily computed by using Corollary 5.6.2 by taking the dual with respect to γ. We find α1 = Tγ (Z1 ) =

1 dx, x

α2 = Tγ (Z2 ) =

1 dy, x

1 α3 = Tγ (Z3 ) = dz. z

5.7. EXERCISES

5.7

107

Exercises

1. Let V = IR 3 and T ∈ V ∗ . Prove there exists a, b, c ∈ IR such that    x T y  = ax + by + cz. z 2. Let Xp , Yp ∈ Tp IR 2 , with p = (1 − 2) be Xp = (2∂x + 3∂y )|p , Yp = (3∂x + 4∂y )|p . Compute the dual basis to β = {Xp , Yp } 3. Let f = 2xyz, let p = (1, 1, 1), and Xp = (−3∂x + ∂y + ∂z )|p . (a) Compute dfp (Xp ). (b) Find df in the coordinate basis. (c) Let g = x2 +y 2 +z 2 . Are dg and df linear dependent at any point? 4. Show that B(V ) the space of bilinear functions on V with addition and scalar multiplication defined by equation 5.19 is a vector-space. (Do not assume that V is finite dimensional.) 5. Show that S(V ) ⊂ B(V ) the symmetric bilinear functions, form a subspace. (Do not assume that V is finite dimensional.) 6. Prove corollary 5.2.5. 7. Finish the proof of Theorem 5.3.2 by showing {αi ⊗ αj }1≤i,j≤n is a linearly independent set. 8. A bilinear function B : V × V → IR is non-degenerate if B(v, w) = 0 f or all w ∈ V then v = 0. (a) Prove that an inner product on V is non-degenerate.

108CHAPTER 5. DIFFERENTIAL ONE-FORMS AND METRIC TENSORS (b) Given a basis β = {vi }1≤i≤n for V , prove that B is non-degenerate if and only if the matrix [gi j] = [B(vi , vj )] is invertible. 9. Let B : IR 3 × IR 3 → IR be the bilinear function from example 5.2.2, (a) Show B is a symmetric non-degenerate bilinear function on IR 3 . (b) Is B positive definite? (c) Compute αv the dual of v = −2e1 + e2 + e3 with respect to B as defined in equation 5.49. (d) Compute the dual of the form α(xe1 + ye2 + ze3 ) = 4x − 3y + z with respect to B (raise the index). (Answer = (2, 1, −2)). 10. Let η : IR 2 × IR 2 → IR be the function η(x, y) = x1 y 2 + x2 y 1 , where x = (x1 , x2 ), y = (y 1 , y 2 ). (a) Is η a symmetric non-degenerate bilinear function on IR 2 ? If so, is η positive definite ? (b) Let β = {v1 = (1, 1), v2 = (1, −1)} write η as a linear combination of tensor products of the dual basis β ∗ = {α1 , α2 }. (c) Compute αv1 and αv2 as defined in equation 5.49 where v1 , v2 are from part (b) (lower the index of v1 and v2 ). Compare to part (b). (d) Compute the dual of the α(xe1 + ye2 ) = 4x − 3y with respect to η (raise the index). 11. Let γ be a symmetric bilinear forms and β = {vi }1≤i≤n a basis for V , and β ∗ = {αj }1≤j≤n a basis for V ∗ . (a) Show that γ=

X

γ(vi , vj )αi αj ,

1≤i,j≤n i

j

where α α is given in equation 5.33. (b) Show that ∆s = {αi αj }1≤i≤j≤n forms a basis for S(V ), the symmetric bilinear forms on V .

5.7. EXERCISES

109

12. For the metric tensor on IR 3 given by γ = dx2 + dy 2 − 2xdydz + (1 + x2 )dz 2 , (a) compute the dual of the vector-field z∂x + y∂y + x∂z with respect to γ (lower the index). (b) Compute the dual of the differential form ydx + zdy − (1 + x2 )dz with respect to the metric γ (raise the index). (c) Find an orthonormal frame field and its dual. (Hint: See Corollary 5.6.2) 13. Let U = {(x, y) | y > 0} with metric tensor γ=

1 (dx2 + dy 2 ) y2

(a) Compute the arc-length of a “straight line” between the points √ (0, 2) and (1, 1). (b) Compute the arc-length of a circle passing through the points √ (0, 2) and (1, 1) which has its center on the x-axis. Compare to part (a). Hint: You will need to find the circle. (c) Find an orthonormal frame field and its dual. (d) Find the gradient of f ∈ C ∞ (U ). 14. For the metric γ = dφ2 + sin2 φ dθ2 on the open set 0 < φ < π, 0 < θ < 2π, find (a) an orthonormal frame field and its dual. (b) Compute the gradient of f (θ, φ).

110CHAPTER 5. DIFFERENTIAL ONE-FORMS AND METRIC TENSORS

Chapter 6 The Pullback and Isometries 6.1

The Pullback of a Differential One-form

Recall that in Chapter 4 that a function Φ : IR n → IR m induces a linear transformation at each point p ∈ IR n , Φ∗,p : Tp IR n → Tq IR m , where q = Φ(p), defined on derivations by (Φ∗,p Xp )(g) = Xp (g ◦ Φ) g ∈ C ∞ (q). If Xp =

Pn

i=1

ξ i ∂xi |p , ξ i ∈ IR then Φ∗,p Xp =

n X n X j=1 i=1

∂Φa ∂ ξ . ∂xi p ∂y a q i

The map Φ∗,p : Tp IR n → Tq IR m induces a map Φ∗q going in the other direction, Φ∗q : Tq∗ IR m → Tp∗ IR n on the dual space. The definition of the function Φ∗q is easy once we examine the linear algebra. Let V, W be real vector-spaces and T : V → W a linear transformation. There exists a map T t : W ∗ → V ∗ defined as follows. If τ ∈ W ∗ then T ∗ (τ ) ∈ V ∗ is defined by its value on v ∈ V through (6.1)

T t (τ )(v) = τ (T (v)) f or allv ∈ V.

Lemma 6.1.1. If τ ∈ W ∗ then T t (τ ) ∈ V ∗ . Furthermore T t : W ∗ → V ∗ is a linear transformation. 111

112

CHAPTER 6. THE PULLBACK AND ISOMETRIES

Proof. The first part of this lemma is proved by showing that T t (τ ) is a linear function of v in equation 6.1. Suppose v1 , v2 ∈ V , and c ∈ IR , then T t (τ )(cv1 + v2 ) = τ (T (cv1 + v2 )) = τ (cT (v1 ) + T (v2 )) = cτ (T (v1 )) + τ (T (v2 )) = cT t (τ )(v1 ) + T t (τ )(v2 ).

(6.2)

Therefore T t (τ ) ∈ V ∗ . The proof that T t is linear is an exercise. To be more concrete about what T t it is useful to write it in a basis. Suppose that V and W are finite dimensional vector-spaces of dimension n and m respectively, and that β = {vi }1≤i≤n is a basis for V , and γ = {wa }1≤a≤n is a basis for W . Let A = [T ]γβ be the matrix representation of T then A is the m × n matrix determined by the equations (see equation 2.2) (6.3)

T (vi ) =

m X

Aai wa .

a=1

P Furthermore if v = ni=1 ci vi then the coefficients of the image vector T (v) are by Lemma 2.1.6 or equation 2.10, n X [T (v)]γ = [ Aai ci ] = A[v]β . i=1

Now let β ∗ = {αi }1≤i≤n be the basis of V ∗ which is the dual basis of β for V , and γ = {τ a }1≤a≤n the basis of W ∗ which is the dual basis of γ for W ∗ as defined in equation 5.3. The matrix representation of function the∗ T t : W ∗ → V ∗ will now be computed in the basis γ ∗ , β ∗ . Let B = [T t ]βγ ∗ which is an n × m matrix, which is determined by t

(6.4)

a

T (τ ) =

n X

Bia αi .

i=1

By evaluate the right side of equation 6.4 on vk ∈ β we get (

n X i=1

Bia αi )(vk )

=

n X i=1

Bia αi (vk )

=

n X i=1

Bia δki = Bka .

6.1. THE PULLBACK OF A DIFFERENTIAL ONE-FORM

113

Therefore equation 6.4 gives Bka = T t (τa )(vk ) = τ a (T (vk )) n X a =τ ( Abk wb )

by equation 6.1 by equation 6.3

i=1

(6.5)

=

n X

Abk τ a (wb )

i=1

= =

n X

Abk δba

i=1 Aak .

Therefore equation 6.5 gives T (vi ) =

m X

Aai wa ,

a=1

(6.6) T t (τ a ) =

n X

Aai αi .

i=1

Pm



a Suppose τ ∈ W and τ = a=1 ca τ . We then write out what are the coefficients of T t (τ ) in the basis β ∗ by computing, t

T (

m X

a

ca τ ) =

a=1

(6.7)

=

m X a=1 m X a=1

ca T t (τ a ) n X ca ( Aai αi )

by equation 6.6

i=1

n X m X = ( ca Aai )αi . i=1 a=1

In other words the coefficients of the image [T t (τ )]β ∗ are the row vector we get by multiplying A on the left by the row vector [τ ]γ ∗ = [c1 , . . . , cm ], [T t (τ )]β ∗ = [τ ]γ ∗ A. Now let’s put formula 6.6 to use in the case Φ∗,p : Tp IR n → Tq IR m , q = Φ(p). Denote by Φ∗q : Tq∗ IR m → Tp∗ IR n the map on the dual, and let

114

CHAPTER 6. THE PULLBACK AND ISOMETRIES

τq ∈ Tq∗ IR m (note a dual vector at a point in IR m in the image of Φ). Then the corresponding definition from 6.1 of the map Φ∗q (τ ) ∈ Tp∗ IR n is (Φ∗q τq )(Xp ) = τq (Φ∗,p Xp ). Let β = {∂xi |p }1≤i≤n be the coordinate basis for Tp IR n , and γ = {∂ya |q }1≤a≤m the coordinate basis for Tq IR n , and the corresponding dual basis are {dxi |p }1≤i≤n , and {dy a |q }1≤a≤m . The matrix representation of Φ∗,p in the coordinate basis is m X ∂Φa ∂ya |q , Φ∗,p (∂xi |p ) = ∂xi p a=1 and so equation 6.6 gives (6.8)

Φ∗q (dy a |q )

n X ∂Φa = dxi |p . i ∂x p i=1

Note the difference in the summation index in these last two equations. An important observation from equation 6.8 needs to be made. Equation 5.16 is n X ∂Φa a dxi |p , (dΦ )p = ∂xi i=1

p

and so equation 6.8 can then be written, (6.9)

Φ∗q (dy a |q ) = (dΦa )p .

This motivates the definition. Definition 6.1.2. Let Φ : IR n → IR n , and g ∈ C ∞ (IR m ). The pullback of g to IR n denoted by Φ∗ g is the function Φ∗ g = g ◦ Φ, and Φ∗ g ∈ C ∞ (IR n ). Using definition 6.1.2 we have for the coordinate functions Φ∗ y a = Φa , and equation 6.9 can be written (6.10)

Φ∗q (dy a )p = d(Φ∗ y a )p = (dΦa )p .

6.1. THE PULLBACK OF A DIFFERENTIAL ONE-FORM

115

P a Finally for a general element of τq ∈ Tq∗ IR m , τq = m a=1 ca dy |q , ca ∈ IR , we then find as in equation 6.7, by using equations 6.9, and 6.8 that Φ∗q ( (6.11)

m X

a

ca dy |q ) =

a=1

=

m X a=1 n X i=1

ca (dΦa )p , ! ∂Φa ca dxi |p . i ∂x p a=1

m X

Example 6.1.3. Let Φ : IR 2 → IR 3 be given by (6.12)

Φ(x, y, z) = (u = x + y, v = x2 − y 2 , w = xy),

and let τq = (2du − 3dv + dw)q where q = Φ(1, 2) = (3, −3, 2). We compute Φ∗q τq by first using 6.9, Φ∗q duq = d(x + y)(1,2) = (dx + dy)(1,2) (6.13)

Φ∗q dvq = d(x2 − y 2 )(1,2) = (2xdx − 2ydy)(1,2) = (2dx − 4dy)(1,2) Φ∗q dwq = d(xy)(1,2) = (2dx + dy)(1,2) .

Therefore by equation 6.11 and the equation 6.13, Φ∗q (2du − 3dv + dw)|q = 2Φ∗q (du|q ) − 3Φ∗q (dv|q ) + Φ∗q (dw|q ) = 2(dx + dy) − 3(2dx − 4dy) + 2dx + dy = (−2dx + 15dy)(1,2) . We now come to a fundamental observation. Recall if X is a vector-field on IR n and Φ : IR n → IR m , that we cannot use Φ∗ to define a vector-field on IR m . In the case m = n and Φ is a diffeomorphism then it is possible to push-forward a vector-field as in section 4.5, but in general this is not the case. Let’s compare this with what happens for one-form fields. Suppose that τ is now a one-form field on IR m . ( τ is specified on the image space of Φ). For any point p ∈ IR n (the domain of Φ) we can define αp = Φ∗q (τΦ(p) ). We call the differential one-form α the pullback of τ and we write α = Φ∗ τ for the map on the differential one-form τ . Therefore we can always pullback a differential one-form on the image to a differential one-form on the domain, something we can not do with the push-forward for vector-fields!

116

CHAPTER 6. THE PULLBACK AND ISOMETRIES

We now give a formula for Φ∗ τ in coordinates. If τ = then with y = Φ(x) we have m X τy=Φ(x) = fa (Φ(x))dy a |Φ(x) .

Pm

a=1

fa (y)dy a ,

a=1

From equation 6.11 ∗

(Φ τ )x =

Φ∗q (τΦ(x) )

=

m X

fa (Φ(x))dΦa

a=1

(6.14) =

n X i=1

! ∂Φa dxi |x fa (Φ(x)) i ∂x x a=1

m X

which holds at every point x in the domain of Φ. In particular note that equation 6.14 implies that using definition 6.1.2 the pullback of the coordinate differential one-forms are , n X ∂Φa i dx . (6.15) Φ∗ dy a = d (Φ∗ y a ) = dΦa = i ∂x i=1 We then rewrite equation 6.14 as (dropping the subscript x) m X fa (Φ(x))Φ∗ dy a Φ∗ τ = a=1

(6.16) =

n m X X i=1

∂Φa fa (Φ(x)) i ∂x a=1

! dxi

Example 6.1.4. Let Φ : IR 2 → IR 3 be as in equation 6.12 from example 6.1.3. We compute Φ∗ du, Φ∗ dv, Φ∗ dw using 6.15, (6.17)

Φ∗ du = d(Φ∗ u) = d(x + y) = dx + dy, Φ∗ dv = d(Φ∗ v) = d(x2 − y 2 ) = 2xdx − 2ydy, Φ∗ dw = d(Φ∗ w) = d(xy) = ydx + xdy .

Let’s calculate Φ∗ (vdu − udv + wdw). In this case we have by formula 6.16 and 6.17 Φ∗ (vdu − udv + wdw) = (x2 − y 2 )Φ∗ du − (x + y)Φ∗ dv + xyΦ∗ dw , = (x2 − y 2 )d(x + y) − (x + y)d(x2 − y 2 ) + xy d(xy) , = (x2 − y 2 )(dx + dy) − (x + y)(2xdx − 2ydy) + xy(ydx + xdy) , = (xy 2 − 3x2 − 2xy)dx + (x2 + y 2 + 2xy + x2 y)dy.

6.2. THE PULLBACK OF A METRIC TENSOR

6.2

117

The Pullback of a Metric Tensor

Generalizing what we did in the previous section, suppose that T : V → W is a linear transformation we define the function T t : B(W ) → B(V ) from the bilinear functions on W to those on V as (6.18)

T t (B)(v1 , v2 ) = B(T (v1 ), T (v2 )),

B ∈ B(W ).

We have a lemma analogous to Lemma 6.1.1 in the previous section checking that T t (B) is really bilinear. Lemma 6.2.1. Let T : V → W be a linear transformation, and B ∈ B(W ), then T t (B) ∈ B(V ). Furthermore if B is symmetric, then T t (B) is symmetric. If T is injective and B is positive definite (or non-degenerate) then T ∗ B is positive definite (or non-degenerate). Proof. The fact that T t (B) is bilinear is similar to Lemma 6.1.1 above and won’t be repeated. Suppose that B is symmetric then for all v1 , v2 ∈ V , T t (B)(v1 , v2 ) = B(T (v1 ), T (v2 )) = B(T (v2 ), T (v1 )) = T t (B)(v2 , v1 ). Therefore T t (B) is symmetric. Suppose that T is injective, and that B is positive definite. Then T t (B)(v, v) = B(T (v), T (v)) ≥ 0 because B is positive definite. If B(T (v), T (v)) = 0 then T (v) = 0, which by the injectivity of T implies v = 0. Therefore T t (B) is positive definite. Suppose now that V and W are finite dimensional vector-spaces of dimension n and m respectively, and that β = {vi }1≤i≤n is a basis for V , and γ = {wa }1≤a≤n is a basis for W . Denoting as usual A = [T ]γβ , then A is the m × n matrix determined by T (vi ) =

m X

Aai wa .

a=1

For B ∈ B(V ) the matrix representation of B is (6.19)

Bab = B(wa , wb ).

118

CHAPTER 6. THE PULLBACK AND ISOMETRIES

We now compute the matrix representation of the bilinear function T t (B), T t (B)ij = T t (B)(vi , vj ) = B(T (vi ), T (vj ) m m X X a = B( A i wa , Abi wb ) a=1

(6.20)

=

X

b=1

Aai Abj B(wa , wb )

1≤a,b≤m

=

X

Aai Abj Bab

1≤a,b≤m

In terms of matrix multiplication one can write equation 6.20 as (T t (B)) = AT (B)A. Now let β ∗ = {αi }1≤i≤n be the basis of V ∗ which is the dual basis of β for V , and γ = {τ a }1≤a≤n the basis of W ∗ which is the dual basis of γ for W ∗ . Using equation 6.19 and the tensor product basis as in equation 5.30 we have X Bab τ a ⊗ τ b . (6.21) B= 1≤a,b≤m

While using the coefficients in equation 6.20 and the tensor product basis as in equation 5.30 we have ! X X (6.22) T t (B) = Aai Abj Bab αi ⊗ αj . 1≤i,j≤n

1≤a,b≤m

By using the formula for T t (τ a ), T t (τ b ) from equation 6.6, this last formula can also be written as ! X T t (B) = T t Bab τ a ⊗ τ b 1≤a,b≤m (6.23) X = Bab T t (τ a ) ⊗ T t (τ b ) . 1≤a,b≤m

Let Φ : IR n → IR m be a smooth function and let p ∈ IR n , and let B : Tq IR m × Tq IR m → IR , with q = Φ(p), be a bilinear function. Then

6.2. THE PULLBACK OF A METRIC TENSOR

119

Φ∗q (B) : Tp IR n × Tp IR n → IR is a bilinear function defined exactly as in 6.18 by Φ∗q (B)(Xp , Yp ) = B(Φ∗,p Xp , Φ∗,p Yp ). Suppose that we have for Tp IR n the standard coordinate basis {∂xi |p }1≤i≤n , and for Tq IR m the basis {∂ya |q }, with the corresponding dual basis {dxi |p }1≤i≤n and {dy a |q }1≤a≤m . Recall equation 6.8, Φ∗q (dy a |q )

n X ∂Φa dxi |p . = i ∂x p i=1

Therefore writing B=

X

Bab dy a |q ⊗ dy b |q ,

Bab ∈ IR ,

1≤a,b≤m

formula 6.22 gives (6.24)

(Φ∗q B)p =

X

X

1≤i,j≤n

1≤a,b≤m

Bab

! ∂Φa ∂Φb dxi |p ⊗ dxj |p . ∂xi p ∂xj p

Equation 6.24 can also be written using equation 6.23 as X (6.25) (Φ∗q B)p = Bab (Φ∗q dy a |Φ(p) ) ⊗ (Φ∗q dy b |Φ(p) ) 1≤a,b≤m

Example 6.2.2. Let Φ : U → IR 3 , U = {(θ, φ) ∈ IR 2 | 0 < θ < 2π, 0 < φ < π}, be the function (6.26)

Φ(θ, φ) = (x = cos θ sin φ, sin θ sin φ, cos φ).    Let p = π4 , π4 , q = 12 , 21 , √12 and let B = (dx2 +dy 2 +dz 2 )|q . From equation 6.25 we find Φ∗q (B)p = (Φ∗q dx|q )2 + (Φ∗q dy|q )2 + (Φ∗q dz|q )2 . Then by equation 6.8, 1 1 Φ∗ dx|q = (− sin θ sin φdθ + cos θ cos φdφ)p = − dθ|p + dφ|p . 2 2

120

CHAPTER 6. THE PULLBACK AND ISOMETRIES

Similarly 1 1 Φ∗q dyq = dθ|p + dφ|p , 2 2

1 Φ∗q dz|q = − √ dφ|p , 2

and so 1 Φ∗q (dx2 + dy 2 + dz 2 )q = (dφ2 + dθ2 )|p . 2 Recall from the previous section the important property that a differential one-form (or one-form field) that is defined on the image of a smooth function Φ : IR n → IR m pulls-back to a differential one-form on the domain of Φ. A similar property holds for fields of bilinear functions such as Riemannian metric tensors. In particular, if γ is a field of bilinear functions (such as a metric tensor field), then the pullback of Φ∗ γ is a field of bilinear forms defined by, (6.27) (Φ∗ γ)p (Xp , Yp ) = γp (Φ∗,p Xp , Φ∗,p Yp ), f or all p ∈ IR n , Xp , Yp ∈ Tp IR n . For each p ∈ IR n (the domain of Φ), then Φ∗ γ is a bi-linear function on Tp IR n , and so a field of bi-linear function. Suppose that X γ= gab (y)dy a ⊗ dy b 1≤a,b≤m

is a a field of bilinear functions on IR m then Φ∗ γ is the field of bilinear function defined by equation 6.22 at every point in IR n . If y = Φ(x) then γΦ(x) =

X

gab (Φ(x))dy a |Φ(x) ⊗ dy b |Φ(x)

1≤a,b≤m

and by equation 6.24 (6.28)

Φ∗ γ =

X

gab (Φ(x))(Φ∗ dy a ) ⊗ (Φ∗ dy b ).

1≤a,b≤m

This can be further expanded to using equation 6.15 to (6.29)



Φγ=

n X m X i,j=1 a,b=1

gab (Φ(x))

∂Φa ∂Φb i dx ⊗ dxj ∂xi ∂xj

6.2. THE PULLBACK OF A METRIC TENSOR

121

Example 6.2.3. Continue with Φ in equation 6.26 in example 6.2.2 we compute Φ∗ γ E by using equation 6.28 Φ∗ (dx2 + dy 2 + dz 2 ) = (Φ∗ dx)2 + (Φ∗ dy)2 + (Φ∗ dz)2 = (− sin θ sin φ dθ + cos θ cos φ dφ)2 + (cos θ sin φ dθ + sin θ cos φ dφ)2 + sin2 φ dφ2 = dφ2 + sin2 φdθ2 Example 6.2.4. With Φ : IR 2 → IR 3 given by Φ(x, y) = (x, y, z = f (x, y)) we compute Φ∗ γ E using equation 6.28 by first computing Φ∗ dx = dx, Φ∗ dy = dy, Φ∗ dz = fx dx + fy dy. Therefore Φ∗ (dx2 + dy 2 + dz 2 ) = dx2 + dy 2 + (fx dx + fy dy)2 = (1 + fx2 )dx2 + 2fx fy dxdy + (1 + fy2 )dy 2 . Example 6.2.5. Let U ⊂ IR 3 and γ be the metric tensor in 5.4.3, and let Φ : IR 3 → U be Φ(u, v, w) = (x = eu , y = veu , z = e2 ) . By equation 6.28 the pullback Φ∗ γ is (6.30)   1 1 v v2 1 γ u 2 2 w Φ = 2u (d(e )) + 2u dv − 2 w 2u dvd(e ) + + 2w (d(ew ))2 2(u+w) e e e e e e 2 −2u 2 −2u 2 −2u 2 = du + e dv − 2ve dvdw + (1 + v e )dw . Theorem 6.2.6. Let Φ : IR m → IR n be an immersion, and γ a Riemannian metric tensor on IR n . Then the pullback Φ∗ γ is a Riemannian metric on IR m Proof. We only need to prove that at each point p ∈ IR m that Φ∗ γ is an inner product. XXXXXXX More examples will be given in the next section.

122

CHAPTER 6. THE PULLBACK AND ISOMETRIES

6.3

Isometries

Let γ be a fixed metric tensor on IR n . Definition 6.3.1. A diffeomorphism Φ : IR n → IR n is an isometry of the metric γ if Φ∗ γ = γ.

(6.31)

Let’s write this out more carefully using the definition of pullback in equation 6.27. We have Φ is an isometry of γ if and only if (6.32)

f or allp ∈ IR n , Xp , Yp ∈ Tp IR n .

γ(Xp , Yp ) = γ( Φ∗,p Xp , Φ∗,p Yp ),

The metric tensor γ on the right hand side is evaluated at Φ(p. Lemma 6.3.2. Let Φ : IR n → IR n be a diffeomorphism, then the following are equivalent: 1. Φ is an isometry. 2. For all p ∈ IR n , and 1 ≤ i, j, ≤ n, (6.33)

γ(∂xi |p , ∂xj |p ) = γ(Φ∗,p (∂xi |p ), Φ∗,p (∂xj |p )) .

3. For all p ∈ IR n and Xp ∈ Tp IR n , γ(Xp , Xp ) = γ(Φ∗,p Xp , Φ∗,p Xp ). Proof. Clearly 1 implies P 2 by equation 6.32. Suppose that 2 holds and Xp ∈ Tp IR n where Xp = ni=1 X i ∂xi |p , X i ∈ IR . Using bilinearity, γ(Φ∗,p Xp , Φ∗,p Xp ) = γ(Φ∗,p

n X i=1

= =

n X i,j=1 n X

i

X ∂x , Φ∗,p

n X

X j ∂ xj )

j=1

X i X j γ(Φ∗,p ∂xi |p , Φ∗,p ∂xj |p ) X i X j γ(∂xi |p , ∂xj |p ) by1.

i,j=1

= γ(X, X). Therefore 2 implies 3.

i

6.3. ISOMETRIES

123

For 3 to imply 1), let Xp , Yp ∈ Tp IR n , then by hypothesis (6.34)

γ(Xp + Yp , Xp + Yp ) = γ(Φ∗,p (Xp + Yp ), Φ∗,p (Xp + Yp )).

Expanding this equation using bilinearity the left hand side of this is (6.35)

γ(Xp + Yp , Xp + Yp ) = γ(Xp , Xp ) + 2γ(Xp , Yp ) + γ(Yp , Yp )

while the right hand side of equation 6.34 is (6.36) γ(Φ∗,p (Xp +Yp ), Φ∗,p (Xp +Yp )) = γ(Φ∗,p Xp , Φ∗,p Xp )+2γ(Φ∗,p Xp , Φ∗,p Yp )+γ(Φ∗,p Yp , Φ∗,p Yp ). Substituting equations 6.35 and 6.36 into equation 6.34 we have (6.37) γ(Xp , Xp )+2γ(Xp , Yp )+γ(Yp , Yp ) = γ(Φ∗,p Xp , Φ∗,p Xp )+2γ(Φ∗,p Xp , Φ∗,p Yp )+γ(Φ∗,p Yp , Φ∗,p Yp ). Again using the hypothesis 3, γ(Φ∗,p Xp , Φ∗,p Xp ) = γ(Xp , Xp ), and γ(Φ∗,p Y, Φ∗,p Y ) = γ(Y, Y ) in equation 6.37 we are left with 2γ(Φ∗,p Xp , Φ∗,p Yp ) = 2γ(Xp , Yp ), which shows that 3 implies 1 (by equation 6.32). The last condition says that for Φ to be an isometry it is necessary and sufficient that Φ∗,p preserves lengths of vectors. Example 6.3.3. Let Φ : IR 2 → IR 2 be the diffeomorphism,   1 1 (6.38) Φ(x, y) = √ (x − y), √ (x + y) . 2 2 This function is a counter clockwise rotation by π/4 about the origin in IR 2 . We compute now what Φ does to tangent vectors. Let Xp = X 1 ∂x + X 2 ∂y at the point p = (x0 , y0 ). We find the coefficient of Φ∗,p Xp in the coordinate basis are " #  " # 1 1 2 1 √1 √1 √ − (X − X ) X 2 . [Φ∗,p (Xp )] = √12 √1 2 2 = √1 X (X 1 + X 2 ) 2 2 2 The two by two matrix is the Jacobian matrix for Φ at the point p (in this case the point p doesn’t show up in evaluating the Jacobian). We see the

124

CHAPTER 6. THE PULLBACK AND ISOMETRIES

the coefficients of the image vector, are just the rotated form of the ones we started with in Xp . XXXXXXXXXXXXX DRAW PICTURE XXXXXXXXXXXXXX Therefore we can check condition 3 in Lemma 6.3.2 for an isometry by computing 

1 γ(Φ∗,p Xp , Φ∗,p Xp ) = √ (X 1 − X 2 ) 2 1 2 = (X ) + (X 2 )2 = γ(Xp , Xp )

2

 +

1 √ (X 1 + X 2 ) 2

2

Example 6.3.4. In this next example, consider the metric tensor in IR 2 given by (6.39)

γ=

1 (dx2 + dy 2 ) 2 2 1+x +y

We claim the diffeomorphism Φ in equation (6.38) is an isometry for this metric. We first compute, 1 1 Φ∗ dx = √ (dx − dy) , Φ∗ dy = √ (dx + dy). 2 2 Then computing, 1 Φγ= 1 1 + ( √2 (x − y))2 + ( √12 (x + y))2 1 = (dx2 + dy 2 ). 1 + x2 + y 2 ∗



1 1 (dx − dy)2 + (dx − dy)2 2 2



Therefore Φ satisfies equation 6.31 and is an isometry. Example 6.3.5. In this next example consider the diffeomorphisms of IR 2 , (6.40)

Ψt (x, y) = (x + t, y),

where t ∈ IR . We compute Ψ∗t dx = dx, Ψ∗t dy = dy

6.3. ISOMETRIES

125

since t is a constant. For the Euclidean metric γ E , we find Ψ∗t γ E = dx2 + dy 2 . Therefore Ψ in equation (6.40) is an isometry for all t ∈ IR . Are Ψt isometries for the metric in equation (6.39) in example 6.3.4? We have   1 1 ∗ 2 2 Ψt (dx + dy ) = (dx2 + dy 2 ). 2 2 1+x +y 1 + (x + t)2 + y 2 This will not equal γ unless t = 0. In which case Ψt is just the identity transformation. Example 6.3.6. In this next example consider the diffeomorphisms of IR 2 , (6.41)

Ψt (x, y) = (x cos t − y sin t, x sin t + y cos t),

where t ∈ [0, 2π). We compute from equation 6.15, Ψ∗t dx = cos t dx − sin t dy, Ψ∗t dy = sin t dx + cos t dy, since t is a constant. For the Euclidean metric γ E we find Ψ∗t γ E = (cos t dx − sin t dy)2 + (sin t dx + cos t dy)2 = dx2 + dy 2 . Therefore Ψt in equation (6.32) is an isometry for all t ∈ IR . What about the metric in equation (6.39) in example 6.3.4. We have  1 2 2 (dx + dy ) = 1 + x2 + y 2  1 (cos t dx − sin t dy)2 + ((sin t dx + cos t dy))2 2 2 1 + (x cos t − y sin t) + (x sin t + y cos t) 1 = (dx2 + dy 2 ) 1 + x2 + y 2

Ψ∗t



Therefore Φt is an isometry of this metric tensor as well.

126

CHAPTER 6. THE PULLBACK AND ISOMETRIES

Example 6.3.7. In example 5.4.3 we have U = {(x, y, z) ∈ IR 3 | xz 6= 0 } with the following metric tensor on U ,   2 1 2 y y 1 2 dz 2 . (6.42) γ = 2 dx + dy − 2 2 dydz + + x zx x2 z 2 z 2 For each a, c ∈ IR ∗ and let b ∈ IR , define the function Φ(a,b,c) : U → U by (6.43)

Φ(a,b,c) (x, y, z) = (u = ax, v = ay + bz, w = cz)

Therefore (noting that a, b, c are constants) we have (6.44)

Φ∗(a,b,c) du = a dx , Φ∗(a,b,c) dv = a dy + b dz , Φ∗(a,b,c) dw = c dz .

Using equation 6.44 we find  2    v 1 2 v 1 1 2 2 ∗ du + 2 dv − 2 2 dvdw + + dw , Φ(a,b,c) u2 u wu u2 w2 w2 1 ay + bz 1 2 (a dy + b dz) − 2 (a dy + b dz)(c dz) = 2 (adx)2 + ax (ax)2 cz(ax)2   1 (ay + bw)2 + + (c dz)2 2 2 2 (ax) (cz) (cz)  2  1 2 y y 1 1 2 + = 2 dx + 2 dy − 2 2 dydz + dz 2 . x x zx x2 z 2 z 2 Therefore for each a, b, c the diffeomorphism Φ(a,b,c) is an isometry of the metric γ. Given a metric tensor γ the the isometries have a simple algebraic structure. Theorem 6.3.8. Let γ be a metric tensor in IR n . The set of isometries of γ form a group with composition of functions as the group operations. This group is called the isometry group of the metric. Proof. Let Φ and Ψ be isometries of the metric γ and Xp , Yp ∈ Tp IR n , then γ ((Φ ◦ Ψ)∗ Xp , (Φ ◦ Ψ)∗ Yp ) = γ (Φ∗,p Ψ∗ Xp , Φ∗,p Ψ∗ Yp ) = γ (Ψ∗ Xp , Ψ∗ Yp ) Φ = γ (Xp , Yp ) Ψ

chain rule is an isometry 6.32 is an isometry 6.32

6.3. ISOMETRIES

127

Therefore the composition of two isometries is an isometry, and we have a well defined operation for the group. Composition of functions is associative, and so the group operation is associative. The identity element is the identity function. We leave as an exercise to prove that if Φ is an isometry, then Φ−1 is also an isometry. Lastly in coordinates the isometry condition. Suppose that Pn wei write out i j γ = i,j=1 g j(x)dx dx is a metric tensor in IR n and that Φ : IR n → IR n is a diffeomorphism. Let’s expand out the right hand side of condition 2, in Lemma 6.3.2 with q = Φ(p), ! n n X X ∂Φl ∂Φk ∂ k |q , ∂ l |q) γ(Φ∗,p ∂xi |p ), Φ∗,p ∂xj |p ) = γ ∂xi p y ∂xj p y l=1 k=1 n X ∂Φk ∂Φl = γ(∂yk |q , ∂yl |q ) (6.45) ∂xi p ∂xj p k,l=1 n X ∂Φk ∂Φl = [γq ]kl ∂xi p ∂xj p k,l=1 Since p was arbitrary we can summarize the computation in equation 6.45 with the following lemma which is the component form for a diffeomorphism to be an isometry. Lemma 6.3.9. A P diffeomorphism Φ : IR n → IR n is an isometry of the metric tensor γ = ni,j=1 g i j(x)dxi dxj on IR n if and only if (6.46)

n X ∂Φk ∂Φl g (Φ(x)) = gij (x). i ∂xj kl ∂x k,l=1

This lemma can be viewed in two ways. First given a diffeomorphism, we can check if it is an isometry of a given metric γ. The second and more interesting point of view is that equations (6.46) can be viewed as partial differential equations for the function Φ given γ. These partial differential equation are very non-linear for Φ, but have some very unusual properties and can be integrated in a number of special situations. Example 6.3.10. Equation 6.46 for the Euclidean metric tensor-field γ E on IR n is n X ∂Φk ∂Φl (6.47) δ = δij . i ∂xj kl ∂x k,l=1

128

CHAPTER 6. THE PULLBACK AND ISOMETRIES

If we differentiate this with respect to xm we get  n  X ∂ 2 Φk ∂Φl ∂Φk ∂ 2 Φl (6.48) δkl = 0. + ∂xi ∂xm ∂xj ∂xi ∂xj ∂xm k,l=1 Now using equation 6.47 but replacing i with m and then differentiating with respect to xi we get equation 6.48 with i and m switched,  n  X ∂ 2 Φk ∂Φl ∂Φk ∂ 2 Φl (6.49) + m j i δkl = 0. m ∂xi ∂xj ∂x ∂x ∂x ∂x k,l=1 Do this again with j and m to get,  n  X ∂ 2 Φk ∂Φl ∂Φk ∂ 2 Φl (6.50) + δkl = 0. ∂xi ∂xj ∂xm ∂xi ∂xm ∂xj k,l=1 Now take equation 6.48 plus 6.49 minus 6.50 to get 0=

n X k,l=1

(

∂ 2 Φk ∂Φl ∂Φk ∂ 2 Φl + ∂xi ∂xm ∂xj ∂xi ∂xj ∂xm ∂ 2 Φk ∂Φl ∂Φk ∂ 2 Φl + ∂xm ∂xi ∂xj ∂xm ∂xj ∂xi  2 k l ∂ Φ ∂Φ ∂Φk ∂ 2 Φl − i j m− δkl . ∂x ∂x ∂x ∂xi ∂xm ∂xj

+

The second and sixth term cancel, and so do the fourth and fifth, while the first and third term are the same. Therefore this simplifies to 0=2

n X ∂ 2 Φk ∂Φl δ . i ∂xm ∂xj kl ∂x k,l=1

Now the condition Φ is a diffeomorphism implies that Φ∗ is invertible (so the Jacobian matrix is invertible), and so 0=

∂ 2 Φk . ∂xi ∂xm

This implies Φ is linear and that Φ(x) = Ax + b

6.3. ISOMETRIES

129

where A is an invertible matrix, and b ∈ IR n . Finally using condition 6.47, we have AT A = I and A is an orthogonal matrix. The method we used to solve these equations is to take the original system of equations and differentiate them to make a larger system for which all possible second order partial derivatives are prescribed. This holds in general for the isometry equations for a diffeomorphism and the equations are what is known as a system of partial differential equations of finite type. There another way to find isometries without appealing to the equations (6.46) for the isometries. This involves finding what are known as “Killing vectors” and their corresponding flows, which we discuss in the next two chapters. The equations for a Killing vector are linear and often easier to solve than the non-linear equations for the isometries. “Killing vectors” are named after the mathematician Wilhelm Killing (1847-1923).

130

6.4

CHAPTER 6. THE PULLBACK AND ISOMETRIES

Exercises

1. If T : V → W is a linear transformation, show that T t : W ∗ → V ∗ is also a linear transformation. 2. Let Φ : IR 3 → IR 2 be Φ(x, y, z) = (u = x + y + z, v = xy + xz) and compute (a) Φt 2du|(4,3) − dv|(4,3)

 (1,2,1)

, and

(b) Φ∗ (vdu + dv). 3. Let Φ : U → IR 3 , U = {(ρ, θ, φ) ∈ IR 3 | 0 < ρ, 0 < θ < 2π, 0 < φ < π}, be Φ(ρ, θ, φ) = (x = ρ cos θ sin φ, y = ρ sin θ sin φ, z = ρ cos φ) and compute (a) Φ∗ (xdx + ydy + zdz) , (b) Φ∗ (ydx − xdy) , (c) Φ∗ (dx2 + dy 2 + dz 2 ) , (d) Φ∗ (df ), f = x2 + y 2 . 4. Let B ∈ B(W ) and T : V → W injective. Prove that if B is nondegenerate then T t (B) is non-degenerate. 5. Let U = {(x, y) | y > 0} with metric tensor γ=

1 (dx2 + dy 2 ) y2

(a) Show that for each t ∈ IR , the transformations ψt (x, y) → (et x, et y) are isometries.

6.4. EXERCISES

131

6. Show that for each a, b, c ∈ IR , that Φ(a,b,c) : IR 3 → IR 3 given by Φ(a,b,c) (x, y, z) = (x + a, y + az + b, z + c) is an isometry for the metric tensor γ on IR 3 , (6.51)

γ = dx2 + dy 2 − 2xdydz + (1 + x2 )dz 2 .

7. For which a, b, c ∈ IR is Φ(a,b,c) (x, y, z) = (x + a, y + cx + b, z + c) an isometry of the metric tensor 6.51 in the previous problem? 8. By using the pullback, show that every diffeomorphism Φ : IR n → IR n of the form (6.52)

Φ(x) = Ax + b,

where A is an n × n matrix satisfying AT A = I, and b ∈ IR n is an isometry of the Euclidean tensor on IR n . (Hint in components Pn metric i j i equation 6.52 is Φ = j=1 Aj x + bi .) 9. Complete the proof of Theorem 6.3.8 by showing that if Φ is an isometry of the metric tensor γ, then Φ−1 is also an isometry of γ.

132

CHAPTER 6. THE PULLBACK AND ISOMETRIES

Chapter 7 Hypersurfaces 7.1

Regular Level Hyper-Surfaces

Let F ∈ C ∞ (IR n+1 ) and let c ∈ IR . The set of points S ⊂ IR n+1 defined by S = { p ∈ IR n+1 | F (p) = c} is called a regular level surface (or hyper-surface) if the differential F∗ : Tp IR n+1 → TF (p) IR is surjective at each point p ∈ S. Let’s rewrite this condition in a basis. If ∂xi |p and ∂u |q are the coordinate basis at the points p and q = F (p) then the matrix representation of F∗ is computed using ?? to be (7.1)

[F∗ ] = (∂x1 F, ∂x2 F, . . . , ∂xn+1 F ) .

Therefore S is a regular level surface if at each point p ∈ S, at least one of the partial derivative of F with respect to xi does not vanish at that point. Example 7.1.1. Let F : IR n+1 → IR be the function F (x) = xT x =

n+1 X (xi )2 i=1

and let r ∈ IR + . The n sphere of radius r given by Srn = { p ∈ IR n+1 | F (p) = r2 }. The standard n-sphere denoted by S n has radius 1. 133

134

CHAPTER 7. HYPERSURFACES

The sphere Srn is a regular level surface. To check this we compute  [F∗ ] = 2x2 , 2x2 , . . . , 2xn+1 and note that at any point p ∈ Srn not all x1 , . . . , xn+1 can be zero at the same time (because r2 > 0). Let S ⊂ IR n+1 be a regular level surface F = c. The tangent space Tp S, p ∈ S is Tp S = { Xp ∈ Tp IR n+1 | Xx ∈ ker F∗ |p }. Note that since F∗ has rank 1, that dim Tp S = n by the dimension theorem 2.2.7. Lemma 7.1.2. If Xp ∈ Tp IR n+1 then Xp ∈ Tp S if and only if Xp (F ) = 0 Proof. Let ι ∈ C ∞ (IR ) be the identity function. We compute F∗ Xp (ι) = Xp (ι ◦ F ) = Xp (F ). This vanishes if and only if Xp (F ) = 0. Example 7.1.3. Let r > 0 and Sr2 = { (x, y, z) ∈ IR 3 | x2 + y2 + z 2 = r2 }, which is the 2-sphere in IR 3 of radius r. Let p = √16 , √16 , √13 ∈ S 2 (where r = 1), and let Xp = ∂x − ∂y then Xp (x2 + y 2 + z 2 ) = (2x − 2y)|p = 0. Therefore Xp ∈ Tp S 2 . Let’s compute the tangent space Tp Sr2 by finding a basis. In the coordinate basis we have by equation 10.1, [F∗ ] = (2x, 2y, 2z) where x, y, z satisfy x2 + y 2 + z 2 = r2 . In order to compute the kernel note the following, if z 6= ±r, then ker[F∗ ] = span{(−y, x, 0), (0, −z, y)}. Rewriting this in the standard basis we have Tp Sc2 = span{−y∂x + x∂y , −z∂y + y∂z } p 6= (0, 0, ±r). At the point p = (0, 0, ±r) we have Tp Sc2 = span{∂x , ∂y } p = (0, 0, ±r).

7.2. PATCHES AND COVERS

135

Example 7.1.4. Let z = f (x, y), (x, y) ⊂ U where U is an open set in IR 2 . As usual we let F (x, y, z) = z − f (x, y), so that the graph z = f (x, y) is written as the level surface F = 0. We compute in the coordinate basis [F∗ ] = (−fx , −fy , 1) and so the surface is a regular level surface. At a point p(x0 , y0 , f (x0 , y0 )) ∈ S we have Tp S = span{ ∂x + fx (x0 , y0 )∂z , ∂y + fy (x0 , y0 )∂z }. Let σ : I → IR be a smooth curve lying on the surface S, which means F ◦ σ = c. Applying the chain rule to the function F ◦ σ : IR → IR (or differentiating with respect to t) gives (F ◦ σ)∗ ∂t = F∗ σ∗ ∂t = c∗ ∂t = 0. Therefore σ∗ ∂t ∈ Tσ(t) S. In the next section we will answer the question of whether for every tangent vector Xp ∈ Tp S there exists a representative curve σ for Xp lying on S.

7.2 Patches and Covers

Let S ⊂ IR^{n+1} be a regular level surface.

Definition 7.2.1. A coordinate patch (or coordinate chart) on S is a pair (U, ψ) where

1. U ⊂ IR^n is open,

2. ψ : U → IR^{n+1} is a smooth injective function,

3. ψ(U) ⊂ S,

4. and ψ_* is injective at every point in U (so ψ is an immersion).

A coordinate patch about a point p ∈ S is a coordinate patch (U, ψ) with p ∈ ψ(U). The function ψ provides coordinates on the set ψ(U) ⊂ S.


Example 7.2.2. Recall the regular level surface S_r² = {(x, y, z) ∈ IR³ | x² + y² + z² = r²}, r ∈ IR^+. Let U = (0, 2π) × (0, π) ⊂ IR², which is clearly open. The function ψ : U → IR³ given by

(7.2)    ψ(u, v) = (r cos u sin v, r sin u sin v, r cos v)

is a surface patch on the 2-sphere S_r².

Example 7.2.3. Let S be the regular level surface defined by a graph z = f(x, y), (x, y) ∈ U open. Let ψ : U → IR³ be the function

ψ(u, v) = (u, v, f(u, v)),   (u, v) ∈ U.

The conditions for a patch are easily checked, and every point (x_0, y_0, z_0) ∈ S is contained in this one patch.

Suppose (U, ψ) is a patch on a regular level surface S. Then ψ : U → S is a one-to-one immersion. The differential ψ_* is injective by definition, and so ψ_*(T_x U) ⊂ T_{ψ(x)} IR^{n+1} is an n-dimensional subspace. However we find even more is true.

Lemma 7.2.4. The map ψ_* : T_p U → T_q S, where q = ψ(p), is an isomorphism.

Proof. Both ψ_*(T_p U) and T_q S are n-dimensional subspaces of T_q IR^{n+1}, so if ψ_*(T_p U) ⊂ T_q S then the two subspaces are equal. Let X_p ∈ T_p U; we only need to check that ψ_* X_p ∈ ker F_*. We compute F_* ψ_* X_p = X_p(F ◦ ψ). However, by the patch conditions F ◦ ψ = c, and so X_p(F ◦ ψ) = 0. Therefore Lemma 7.1.2 implies ψ_* X_p ∈ T_q S, and so ψ_*(T_p U) = T_q S.

Example 7.2.5. Continuing with example 7.2.2 (with r = 1), for (u, v) ∈ (0, 2π) × (0, π) we find ψ_* ∂_u|_{(u,v)} ∈ T_p S² is

ψ_* ∂_u|_{(u,v)} = (− sin u sin v ∂_x + cos u sin v ∂_y)|_{(cos u sin v, sin u sin v, cos v)}.

Note that one can check that ψ_* ∂_u ∈ T_p S² using Lemma 7.1.2.

Example 7.2.6. For the example of a graph in 7.2.3 we compute

ψ_* ∂_u|_{(u_0,v_0)} = ∂_x + f_x(u_0, v_0)∂_z,   ψ_* ∂_v|_{(u_0,v_0)} = ∂_y + f_y(u_0, v_0)∂_z.
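The push-forwards in Examples 7.2.5 and 7.2.6 are just the columns of the Jacobian matrix of ψ. A sketch for the sphere patch with r = 1 (Python with sympy, our illustration), which also confirms the tangency check via Lemma 7.1.2:

    import sympy as sp

    u, v = sp.symbols('u v', real=True)

    # The patch (7.2) with r = 1
    psi = sp.Matrix([sp.cos(u)*sp.sin(v), sp.sin(u)*sp.sin(v), sp.cos(v)])
    J = psi.jacobian([u, v])    # columns: psi_* d/du and psi_* d/dv

    # [F_*] = (2x, 2y, 2z) evaluated along the patch
    Fstar = 2 * psi.T

    print(sp.simplify(Fstar * J[:, 0]))   # Matrix([[0]]): psi_* d/du is tangent
    print(sp.simplify(Fstar * J[:, 1]))   # Matrix([[0]]): psi_* d/dv is tangent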


Let (U, ψ) be a patch on a regular level surface S, and let q ∈ S be a point contained in ψ(U). We now argue that given any X_q ∈ T_q S there exists a curve σ : I → S with σ̇(0) = X_q. By Lemma 7.2.4 there exists Y_p ∈ T_p U with ψ_* Y_p = X_q, and Y_p is unique since ψ_* is an isomorphism. Let σ be a representative curve for Y_p in U ⊂ IR^n. Then ψ ◦ σ is a representative curve for X_q. This follows from the chain rule,

d/dt (ψ ◦ σ)|_{t=0} = ψ_* σ_* (d/dt)|_{t=0} = ψ_* Y_p = X_q.

A covering of a regular level surface is a collection of surface patches C = {(U_α, ψ_α)}_{α∈A} where

S = ⋃_{α∈A} ψ_α(U_α).

In other words, every point p ∈ S is contained in the image of some surface patch.

Example 7.2.7. We continue with S_r² = {(x, y, z) ∈ IR³ | x² + y² + z² = r²}, r > 0. Let D = {(u, v) ∈ IR² | u² + v² < r²}, which is an open set in IR². The set D with the function ψ_z^+ : D → IR³ given by

ψ_z^+(u, v) = (x = u, y = v, z = √(r² − u² − v²))

is a surface patch on S_r² (the upper hemisphere). Likewise the pair (D, ψ_z^−) is a surface patch on S_r² (the lower hemisphere), where

ψ_z^−(u, v) = (x = u, y = v, z = −√(r² − u² − v²)).

Continuing in this way we construct four more patches, all using D, with the functions

ψ_x^±(u, v) = (x = ±√(r² − u² − v²), y = u, z = v),
ψ_y^±(u, v) = (x = u, y = ±√(r² − u² − v²), z = v).

The collection C = {(D, ψ_z^±), (D, ψ_x^±), (D, ψ_y^±)} is a cover of S_r² by coordinate patches.
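The covering property in Example 7.2.7 can be made algorithmic: every point of S_r² has a coordinate of largest absolute value, which is necessarily nonzero, and the corresponding hemispherical patch contains the point. A sketch (Python with sympy, our illustration):

    import sympy as sp

    # A point of S^2 (r = 1)
    p = sp.Matrix([2, -1, 2]) / 3
    assert p.dot(p) == 1

    # A coordinate of largest absolute value is nonzero, so p lies in the
    # image of the corresponding hemispherical patch
    i = max(range(3), key=lambda k: abs(p[k]))
    sign = '+' if p[i] > 0 else '-'
    print(f"p is in the image of psi_{'xyz'[i]}^{sign}")   # psi_x^+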


In fact, regular level surfaces always admit a cover. This follows from the next theorem, which we won't prove.

Theorem 7.2.8. Let S ⊂ IR^{n+1} be a regular level hyper-surface, and let p ∈ S. Then there exists a surface patch (U, ψ) with p ∈ ψ(U).

The proof of this theorem involves the implicit function theorem from advanced calculus; see [?] for the theorem.

Corollary 7.2.9. Let S ⊂ IR^{n+1} be a regular level hyper-surface. There exists a cover (U_α, ψ_α), α ∈ A, of S.

7.3 Maps between surfaces

Suppose S ⊂ IR^{n+1} and Σ ⊂ IR^{m+1} are two regular level surfaces and that Φ : IR^{n+1} → IR^{m+1}. We'll say that Φ restricts to a smooth map from S to Σ if Φ(p) ∈ Σ for all p ∈ S.

Example 7.3.1. Let S^n ⊂ IR^{n+1} be the standard n-sphere, and consider the function Φ : IR^{n+1} → IR^{n+1} given by Φ(p) = −p. The function Φ restricts to a smooth function from S^n ⊂ IR^{n+1} to S^n ⊂ IR^{n+1}. More generally, let A ∈ M_{n+1,n+1}(IR) where A^T A = I. Define the function Φ_A : IR^{n+1} → IR^{n+1} by Φ_A(x) = Ax. The function Φ_A is linear, and so smooth. If x ∈ S^n (so x^T x = 1), then

[Φ_A(x)]^T Φ_A(x) = x^T A^T A x = x^T x = 1.

Therefore Φ_A restricts to a smooth map Φ_A : S^n → S^n.

Example 7.3.2. Let S ⊂ IR³ be the regular level surface S = {(u, v, w) | 4u² + 9v² + w² = 1}. The function Φ : IR³ → IR³ given by Φ(x, y, z) = (2x, 3y, z) restricts to a smooth map from S to the unit sphere S², since for p = (u, v, w) ∈ S we have |Φ(p)|² = 4u² + 9v² + w² = 1.

Example 7.3.3. Let Φ : IR³ → IR⁵ be given by Φ(x, y, z) = (x, y, z, 0, 0); then Φ restricts to a smooth function from S² to S⁴.
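The orthogonal-matrix computation in Example 7.3.1 can be spot-checked symbolically. A sketch (Python with sympy, our illustration) with a one-parameter family of rotations:

    import sympy as sp

    t = sp.symbols('t', real=True)

    # A rotation about the z-axis, so A^T A = I
    A = sp.Matrix([[sp.cos(t), -sp.sin(t), 0],
                   [sp.sin(t),  sp.cos(t), 0],
                   [0,          0,         1]])
    x = sp.Matrix([sp.Rational(1, 3), sp.Rational(2, 3), sp.Rational(2, 3)])

    assert x.dot(x) == 1            # x lies on S^2
    y = A * x
    print(sp.simplify(y.dot(y)))    # 1, so Phi_A(x) lies on S^2 as well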


A smooth map Φ : S → Σ is said to be an immersion if Φ∗ : Tp S → TΦ(p) Σ is injective for each p ∈ S, and a submersion if Φ∗ is surjective for each p. A general notion of a smooth function Φ : S → Σ which does not necessarily come from a function on the ambient IR space is given in more advanced courses.

7.4 More General Surfaces

Another type of surface that is often encountered in multi-variable calculus is a parameterized surface. An example is S ⊂ IR³ given by

S = { (x = s cos t, y = s sin t, z = t) | (s, t) ∈ IR² },

which is known as the helicoid. With this description of S it is unclear whether S is actually a level surface or not. It is possible to define what is known as a parameterized surface, but let's look instead at the general definition of a regular surface, which includes both parameterized surfaces and regular level surfaces.

Definition 7.4.1. A regular surface S ⊂ IR^{n+1} is a subset with the following properties:

1. for each point p ∈ S there exists an open set U ⊂ IR^n and a smooth injective immersion ψ : U → IR^{n+1} such that p ∈ ψ(U) ⊂ S,

2. and furthermore there exists an open set V ⊂ IR^{n+1} such that ψ(U) = S ∩ V.

One important component of this definition is that S can be covered by surface patches; an immersion check for the helicoid is sketched below. The idea of a cover is fundamental in the definition of a manifold.
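For the helicoid, the immersion condition in 1 can be verified symbolically for ψ(s, t) = (s cos t, s sin t, t): the three 2 × 2 minors of the Jacobian cannot vanish simultaneously, so the Jacobian has rank 2 everywhere. A sketch (Python with sympy, our illustration):

    import sympy as sp

    s, t = sp.symbols('s t', real=True)
    psi = sp.Matrix([s*sp.cos(t), s*sp.sin(t), t])
    J = psi.jacobian([s, t])

    # The three 2x2 minors of J; cos(t) and sin(t) never vanish together,
    # so J has rank 2 at every point and psi is an immersion
    minors = [J[[0, 1], :].det(), J[[0, 2], :].det(), J[[1, 2], :].det()]
    print([sp.simplify(m) for m in minors])   # [s, cos(t), sin(t)]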

7.5 Metric Tensors on Surfaces

Consider the coordinate patch on the upper half of the unit sphere S² ⊂ IR³ in example 7.2.7 with r = 1, given by the function ψ : U → IR³, where U is the inside of the unit disk u² + v² < 1, and

(7.3)    ψ(u, v) = (x = u, y = v, z = √(1 − u² − v²)).


We can view the disk u² + v² < 1 as lying in the xy-plane and its image under ψ as the upper part of S². Let σ(t) = (x(t), y(t), z(t)) be a curve on the upper half of the sphere. The curve σ is the image of the curve τ(t) = (u(t), v(t)) ⊂ U, which is the projection of σ into the xy-plane. In particular, from equation 7.3 we have u(t) = x(t), v(t) = y(t), and therefore

(7.4)    σ(t) = ψ ◦ τ(t) = (u(t), v(t), √(1 − u(t)² − v(t)²)).

Given the curve (7.4) on the surface of the sphere, we can compute its arc-length as a curve in IR³ using the Euclidean metric γ_Eu (see equation ??):

L_Eu(σ) = ∫_a^b √( (dx/dt)² + (dy/dt)² + (dz/dt)² ) dt
        = ∫_a^b √( (du/dt)² + (dv/dt)² + (1/(1 − u(t)² − v(t)²)) (u du/dt + v dv/dt)² ) dt
        = ∫_a^b √( (1/(1 − u² − v²)) ( (1 − v²)(du/dt)² + 2uv (du/dt)(dv/dt) + (1 − u²)(dv/dt)² ) ) dt.

Note that this is the same arc-length we would have computed for the curve τ(t) = (u(t), v(t)) using the metric tensor

(7.5)    γ̂_U = (1/(1 − u² − v²)) ( (1 − v²) du² + 2uv du dv + (1 − u²) dv² )

defined on the set U! Let's look at this problem in general and see where the metric tensor (7.5) comes from. Suppose S ⊂ IR^{n+1} is a regular level surface, and let γ be a metric tensor on IR^{n+1}. The metric γ induces a metric tensor on S, which we denote by γ_S, as follows. Let q ∈ S and X, Y ∈ T_q S; then

(7.6)    γ_S(X, Y) = γ(X, Y).

This is well defined because T_q S ⊂ T_q IR^{n+1}. The function γ_S : T_q S × T_q S → IR is easily seen to be bilinear, symmetric, and positive definite, and so is an inner product. The algebraic properties required for γ_S to be a metric tensor are therefore satisfied, but what should smoothness mean?
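As the construction below will show, in a coordinate patch the induced metric has the concrete component matrix J^T [γ] J, where J is the Jacobian of ψ. A sketch (Python with sympy, our illustration) recovers the components of (7.5) from the patch (7.3) this way:

    import sympy as sp

    u, v = sp.symbols('u v', real=True)

    # The patch (7.3) on the upper unit hemisphere
    psi = sp.Matrix([u, v, sp.sqrt(1 - u**2 - v**2)])
    J = psi.jacobian([u, v])

    # For the Euclidean metric, [gamma] = I, so the induced components are J^T J
    g = sp.simplify(J.T * J)

    # Component matrix of (7.5); the off-diagonal entries u*v correspond
    # to the symmetric term 2uv du dv
    den = 1 - u**2 - v**2
    g75 = sp.Matrix([[1 - v**2, u*v], [u*v, 1 - u**2]]) / den

    print(sp.simplify(g - g75))   # zero matrix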


Suppose that ψ : U → IR^{n+1} is a smooth patch on S. We now construct γ̂_U, a metric tensor on U ⊂ IR^n which represents γ_S as above. We define the metric tensor of the patch (U, ψ) in a point-wise manner by

(7.7)    γ̂_U(X_p, Y_p) = γ_S(ψ_* X_p, ψ_* Y_p),   X_p, Y_p ∈ T_p U, p ∈ U.

We now claim that γ̂_U = ψ^* γ! Expanding equation 7.7 using 7.6 we have

γ̂_U(X_p, Y_p) = γ_S(ψ_* X_p, ψ_* Y_p) = γ(ψ_* X_p, ψ_* Y_p),   X_p, Y_p ∈ T_p U.

Therefore, by definition (see 6.27),

(7.8)    γ̂_U = ψ^* γ.

Finally, according to Theorem 6.2.6, γ̂_U is a Riemannian metric tensor on U. We then define γ_S to be smooth because γ̂_U is smooth on any chart on S.

Example 7.5.1. Using the chart on S² from equation (7.3), we find

ψ^* dx = du,   ψ^* dy = dv,   ψ^* dz = − (u/√(1 − u² − v²)) du − (v/√(1 − u² − v²)) dv.

Computing the induced metric using 7.8, ψ^*(dx² + dy² + dz²), we get γ̂_U in equation (7.5).

Example 7.5.2. Let z = f(x, y) be a surface in IR³, and let ψ : IR² → IR³ be the standard patch ψ(x, y) = (x, y, z = f(x, y)). We computed

ψ^* γ_Eu = (1 + f_x²) dx² + 2 f_x f_y dx dy + (1 + f_y²) dy²

in example 6.2.4 of section 6.2. This is the metric tensor on a surface in IR³ given by a graph.

Example 7.5.3. Let S = {(w, x, y, z) | x² + y² + z² − w² = 1}, and let (U, ψ) be the coordinate patch on S,

ψ(t, u, v) = (w = t, x = √(t² + 1) cos u sin v, y = √(t² + 1) sin u sin v, z = √(t² + 1) cos v),

where U = {(t, u, v) | t ∈ IR, u ∈ (0, 2π), v ∈ (0, π)}. With the Euclidean metric on IR⁴, the components of the surface metric γ_S in the patch (U, ψ)


using equation 7.8 are computed using

ψ^* dw = dt,
ψ^* dx = (t/√(t² + 1)) cos u sin v dt − √(t² + 1) sin u sin v du + √(t² + 1) cos u cos v dv,
ψ^* dy = (t/√(t² + 1)) sin u sin v dt + √(t² + 1) cos u sin v du + √(t² + 1) sin u cos v dv,
ψ^* dz = (t/√(t² + 1)) cos v dt − √(t² + 1) sin v dv.

Therefore

ψ^* γ_Eu = (ψ^* dx)² + (ψ^* dy)² + (ψ^* dz)² + (ψ^* dw)²
         = ((2t² + 1)/(t² + 1)) dt² + (t² + 1)(sin² v du² + dv²).

Remark 7.5.4. If γ is not positive definite then the signature of γ_S will depend on S.

We now come to a very important computation. Suppose that S is a regular level hypersurface, γ is a metric tensor on IR^{n+1}, and the corresponding metric tensor on S is γ_S. Let (U, ψ) and (V, φ) be two coordinate patches on S which satisfy ψ(U) ∩ φ(V) ≠ ∅. On the open sets U, V ⊂ IR^n let γ̂_U and γ̂_V be the induced metric tensors, as defined by equation (7.7), on the sets U and V respectively. The question is then: how are γ̂_U and γ̂_V related? In other words, how are the coordinate forms of the metric on S related at points of S lying in two different coordinate patches?

Let W = ψ(U) ∩ φ(V), which is a non-empty subset of S, and let U_0 = {u ∈ U | ψ(u) ∈ W} and V_0 = {v ∈ V | φ(v) ∈ W}. The functions ψ and φ are injective, and so ψ : U_0 → W and φ : V_0 → W are bijective. Consequently we have the bijection

(7.9)    φ⁻¹ ◦ ψ : U_0 → V_0,

where φ⁻¹ : W → V_0. While the function φ⁻¹ ◦ ψ exists, its explicit determination is not always easy. The functions ψ and φ provide two different coordinate systems for the points of S which lie in W, and 7.9 is the change of coordinates function for the points in W.


Example 7.5.5. Let (U, ψ) be the chart on S² from equation (7.3) (the upper half of the sphere), and let (V, φ) be the chart

φ(s, t) = (x = s, y = √(1 − s² − t²), z = t),

where V is the interior of the unit disk, V = {(s, t) | s² + t² < 1}. The set W consists of the points of S² with y > 0 and z > 0, and

U_0 = { (u, v) | u² + v² < 1, v > 0 },    V_0 = { (s, t) | s² + t² < 1, t > 0 }.

In order to compute φ⁻¹ ◦ ψ : U_0 → V_0 we use the projection map π(x, y, z) = (s = x, t = z), which maps the hemisphere y > 0 to V. Therefore

φ⁻¹ ◦ ψ(u, v) = π(u, v, √(1 − u² − v²)) = (s = u, t = √(1 − u² − v²)).

Example 7.5.6. We now let (U, ψ) be the coordinate patch on S² in example 7.2.2, given in equation 7.2 (with r = 1) by

ψ(θ, φ) = (cos θ sin φ, sin θ sin φ, cos φ),   0 < θ < 2π, 0 < φ < π,

and let (V, ζ) be the patch

ζ(u, v) = (u, v, √(1 − u² − v²)),   u² + v² < 1.

The overlap on S² consists of the points in the top half of S² minus those with y = 0, x ≥ 0, and

U_0 = {(θ, φ) | 0 < θ < 2π, 0 < φ < π/2}.
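Transition maps like the one in Example 7.5.5 can be checked by composition: applying φ to the claimed φ⁻¹ ◦ ψ should reproduce ψ on U_0. A sketch (Python with sympy, our illustration):

    import sympy as sp

    u, v = sp.symbols('u v', positive=True)   # on U_0 we may take v > 0

    psi = sp.Matrix([u, v, sp.sqrt(1 - u**2 - v**2)])              # chart (7.3)
    phi = lambda s, t: sp.Matrix([s, sp.sqrt(1 - s**2 - t**2), t]) # chart of Example 7.5.5

    # The claimed transition map: (s, t) = (u, sqrt(1 - u^2 - v^2))
    s, t = u, sp.sqrt(1 - u**2 - v**2)

    print(sp.simplify(phi(s, t) - psi))   # zero matrix, so phi o (phi^-1 o psi) = psi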
(c) Let G = SL(2) act on the upper half plane { z ∈ C | Im(z) > 0 } by

z → (az + b)/(cz + d),

where

SL(2) = { [ a b ; c d ] ∈ M_{2×2}(IR) | ad − bc = 1 }.

(d) Let G = O(3), O(3) = { A ∈ M_{3×3}(IR) | AA^T = I }, act on IR³ − 0 by µ(A, x) = Ax.

4. Find the isotropy subgroup for each of the following actions at the given point.

(a) Question 1 a: (x_0, y_0) = (2, 3) and (x_0, y_0) = (0, 0).

(b) Question 1 b: (x_0, y_0, z_0) = (1, 2, 3), (x_0, y_0, z_0) = (1, 1, 0), and (x_0, y_0, z_0) = (1, 0, 0).

(c) Question 3 c: z_0 = i and z_0 = i + 1.

5. Find the right invariant vector fields on the multi-parameter group in Example 13.6.

6. Let ds² = (u + v)⁻²(du² + dv²) be a metric tensor on U = {(u, v) ∈ IR² | u + v ≠ 0}. Let a ∈ IR*, b ∈ IR, and let Φ_(a,b) : IR² → IR² be

Φ_(a,b)(u, v) = (au + b, av − b).

(a) Show that Φ is a group action.

(b) Show that U ⊂ IR² is a Φ_(a,b) invariant set for any a ∈ IR*, b ∈ IR.

(c) Show that Φ_(a,b) is an isometry of the metric tensor for any a ∈ IR*, b ∈ IR.

(d) Compute the infinitesimal generators of Φ.

(e) Check that the infinitesimal generator vector-fields are Killing vectors.

7. Let GL(n, IR) be the group of invertible matrices (see Appendix A).

(a) Show that ρ : GL(n, IR) × IR^n → IR^n given by

ρ(A, x) = Ax,   A ∈ GL(n, IR), x ∈ IR^n,

is a group action.

(b) Let M = M_{n×n}(IR) be the set of all n × n real matrices. Is the function ρ : G × M → M given by

ρ(A, X) = AXA⁻¹,   A ∈ G, X ∈ M,

an action of GL(n, IR) on M = IR^{n²}?

8. Let G = {(a, b, c) | a ∈ IR*, b, c ∈ IR} be a group with multiplication

(a, b, c) ∗ (x, y, z) = (ax, ay + b, az + c).

(a) Compute the left and right invariant vector-fields.

(b) Compute the coframe dual to the left invariant vector fields.

(c) Construct the metric tensor as in 11.3 from the left invariant vector fields and check that the right invariant vector fields are Killing vectors.

(d) Find a basis for all Killing vector fields for the metric γ constructed in part (c).

Chapter 12

Connections and Curvature

12.1 Connections

12.2 Parallel Transport

12.3 Curvature